Look closely at the example: P1 and P2 do not access the same element of A. However, the two elements of A that they access are in the same cache block, so if they are in one cache, they are in the other cache.
False Sharing
–Different/same processors access different/same items in the different/same cache block
–Leads to ___________ misses
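A minimal sketch of how to check whether two per-processor items land in the same cache block. The 64-byte block size and the names `same_block`, `packed_counter`, and `padded_counter` are assumptions for illustration only; real block sizes vary by machine.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed cache block size in bytes (hypothetical; check your hardware). */
#define BLOCK_SIZE 64

/* Returns 1 if both addresses fall inside the same cache block. */
static int same_block(const void *a, const void *b)
{
    return (uintptr_t)a / BLOCK_SIZE == (uintptr_t)b / BLOCK_SIZE;
}

/* Counters packed side by side: neighbours end up in one block. */
struct packed_counter {
    long value;
};

/* Counters padded so that each one occupies a block of its own. */
struct padded_counter {
    _Alignas(BLOCK_SIZE) long value;
};

/* One counter per processor, for two processors. The packed array is
 * aligned to a block boundary so the layout is predictable. */
static _Alignas(BLOCK_SIZE) struct packed_counter packed[2];
static struct padded_counter padded[2];
```

With the packed layout, P1 writing `packed[0]` and P2 writing `packed[1]` touch different items in one block. With the padded layout, each write stays in its own block.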
Cache Performance
// Pn = my processor number (rank)
// NumProcs = total active processors
// N = total number of elements
// NElem = N / NumProcs
For(i=0;i
Which is worse?
–Both access the same number of elements
–No processors access the same elements as each other
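The loops being compared are cut off in these notes, so the following is only a sketch under an assumption: that one version gives each processor a contiguous chunk of `NELEM` elements and the other deals elements out round-robin. It counts how many cache blocks hold elements owned by more than one processor. The sizes and the helper names (`owner_blocked`, `owner_interleaved`, `shared_blocks`) are hypothetical, and the array is assumed to start on a block boundary.

```c
#include <assert.h>

#define N 64                    /* total number of elements (assumed)    */
#define NUM_PROCS 4             /* total active processors (assumed)     */
#define NELEM (N / NUM_PROCS)   /* elements per processor                */
#define ELEMS_PER_BLOCK 8       /* elements per cache block (assumed)    */

/* Owner of element i when each processor gets one contiguous chunk. */
static int owner_blocked(int i)
{
    return i / NELEM;
}

/* Owner of element i when elements are dealt out round-robin. */
static int owner_interleaved(int i)
{
    return i % NUM_PROCS;
}

/* Count cache blocks whose elements belong to more than one processor. */
static int shared_blocks(int (*owner)(int))
{
    int shared = 0;
    assert(N % ELEMS_PER_BLOCK == 0);
    for (int b = 0; b < N / ELEMS_PER_BLOCK; b++) {
        int first = owner(b * ELEMS_PER_BLOCK);
        for (int k = 1; k < ELEMS_PER_BLOCK; k++) {
            if (owner(b * ELEMS_PER_BLOCK + k) != first) {
                shared++;   /* this block is split between processors */
                break;
            }
        }
    }
    return shared;
}
```

Under these assumed sizes, the contiguous split leaves no block shared between processors, while the round-robin split makes every block shared, even though no element is ever touched by two processors.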
Synchronization
Sum += A[i];
Two processors, i = 0 and i = 50
Before the action:
–Sum = 5
–A[0] = 10
–A[50] = 33
What is the proper result?
Synchronization
Sum = Sum + A[i];
Assembly for this statement, assuming:
–A[i] is already in $t0
–&Sum is already in $s0
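The assembly itself is not reproduced in these notes. As a sketch, assume the statement compiles to a load of Sum, an add, and a store back to Sum. The C below models those three steps for two processors and replays one unlucky ordering without using real threads. The values 5, 10, and 33 follow the earlier example, on the assumption that the two array values belong to i = 0 and i = 50. The struct and function names are hypothetical.

```c
#include <assert.h>

/* One processor's private state: a register and its own A[i]. */
struct proc {
    int reg;
    int a_i;
};

/* The three steps assumed to make up "Sum = Sum + A[i]". */
static void load_sum(struct proc *p, const int *sum) { p->reg = *sum; }     /* like lw  */
static void add_elem(struct proc *p)                 { p->reg += p->a_i; }  /* like add */
static void store_sum(const struct proc *p, int *sum) { *sum = p->reg; }    /* like sw  */

/* Both processors load Sum before either one stores it back. */
static int run_interleaved(int sum, int a0, int a50)
{
    struct proc p1 = {0, a0};
    struct proc p2 = {0, a50};
    load_sum(&p1, &sum);
    load_sum(&p2, &sum);    /* P2 reads the same old value of Sum */
    add_elem(&p1);
    add_elem(&p2);
    store_sum(&p1, &sum);
    store_sum(&p2, &sum);   /* overwrites P1's update */
    return sum;
}

/* P1 finishes all three steps before P2 starts. */
static int run_serialized(int sum, int a0, int a50)
{
    struct proc p1 = {0, a0};
    struct proc p2 = {0, a50};
    load_sum(&p1, &sum);
    add_elem(&p1);
    store_sum(&p1, &sum);
    load_sum(&p2, &sum);
    add_elem(&p2);
    store_sum(&p2, &sum);
    return sum;
}
```

Calling `run_interleaved(5, 10, 33)` and `run_serialized(5, 10, 33)` shows the difference: the interleaved order loses one of the two additions.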
Does Cache Coherence solve it?
Did the load bring in an old value?
Sum += A[i] is ___________
–Atomic: the operation occurs as one unit, and nothing may interrupt it.
Synchronization Problem
Reading and writing memory is a non-atomic operation
–You cannot read and write a memory location in a single operation
We need __________________ that allow us to read and write without interruption
Solution
Software Solution
–“lock” –
–“unlock” –
Hardware
–Provide primitives that read & write in order to implement lock and unlock
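One way lock and unlock could be built on such a primitive, as a sketch rather than the course's own implementation: C11's `atomic_flag_test_and_set` reads the old flag value and writes "set" as a single uninterrupted step. The `struct spinlock` type, the `spin_*` functions, and `add_to_sum` are hypothetical names introduced here.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* A lock is a single flag: set means held, clear means free. */
struct spinlock {
    atomic_flag held;
};
#define SPINLOCK_INIT { ATOMIC_FLAG_INIT }

/* Try once: test-and-set returns the previous flag value,
 * so a previous value of false means we acquired the lock. */
static bool spin_trylock(struct spinlock *l)
{
    return !atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire);
}

/* "lock": keep trying until the flag was clear when we set it. */
static void spin_lock(struct spinlock *l)
{
    while (!spin_trylock(l)) {
        /* busy-wait until the holder clears the flag */
    }
}

/* "unlock": clear the flag so another processor can acquire it. */
static void spin_unlock(struct spinlock *l)
{
    atomic_flag_clear_explicit(&l->held, memory_order_release);
}

/* Shared sum from the earlier example, guarded by the lock. */
static int Sum = 5;
static struct spinlock sum_lock = SPINLOCK_INIT;

static void add_to_sum(int value)
{
    spin_lock(&sum_lock);
    Sum = Sum + value;      /* load, add, store now run as one unit */
    spin_unlock(&sum_lock);
}
```

Each processor would call `add_to_sum(A[i])` instead of updating `Sum` directly. A spinning lock is the simplest choice for a sketch; it wastes cycles while waiting, so real code often uses a library mutex instead.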