Published by Byron Mills. Modified over 9 years ago.
1
Lazy Release Consistency for Software Distributed Shared Memory
Pete Keleher, Alan L. Cox, Willy Z.
2
Overview
• Software DSM
• Release Consistency
• Eager Release Consistency
• Lazy Release Consistency
• Conclusion
3
Software DSM
• Provides a shared address space using software support
• Relies on (user-level) memory management techniques to detect accesses and updates to shared data
• Memory coherence protocol: maintains the illusion of shared memory
• High communication overheads and large, page-size coherence units
• Sending messages is expensive in software DSM
4
Release Consistency
• Extension of weak consistency
• Weak Consistency
  - Synchronization: globally update memory
  - Local changes propagated to all processors
• Release Consistency
  - Propagates only locked memory as needed
5
RC – Shared Memory Accesses
• Shared memory accesses
  - Ordinary
  - Special
    - Sync
      - Acquire
      - Release
    - Nsync
6
RC – Formal Definition
• A system is release consistent if:
  1. Before an ordinary access is allowed to perform with respect to any other processor, all previous acquires must be performed.
  2. Before a release is allowed to perform with respect to any other processor, all previous reads and writes must be performed.
  3. Special accesses are sequentially consistent with each other.
7
Eager Release Consistency (based on Munin’s write-shared protocol)
• Release
  - Modifications propagated at release
• Invalidate protocol
  - Sends invalidations
• Update protocol
  - Diffs limit the amount of data exchanged
8
Eager Release Consistency (contd.)
• Acquire
  - No consistency-related operations
  - Protocol locates the processor that last executed a release on the same variable
• Access miss
  - Message sent to the directory manager
  - Directory manager forwards the request to the current owner
9
Eager Release Consistency
[Figure: Repeated updates of cached copies in eager RC. Timeline of processors P1 to P4 with the events w(x), rel, acq, w(x), rel, acq, r(x).]
10
Lazy Release Consistency
• Rather than eagerly "syncing up" data at the release point, why not lazily wait until the subsequent acquire?
• Propagation of modifications is postponed until the time of an acquire.
• To decide what must be propagated at the acquire, the happened-before-1 partial order is used.
11
Lazy Release Consistency
[Figure: Message traffic in LRC. Timeline of processors P1 to P4 with the events w(x), rel, acq, w(x), rel, acq, r(x).]
12
happened-before-1 Partial Order
• Shared memory accesses are partially ordered by happened-before-1, denoted by $\xrightarrow{hb1}$, defined as follows:
  - If $a_1$ and $a_2$ are accesses on the same processor, and $a_1$ occurs before $a_2$ in program order, then $a_1 \xrightarrow{hb1} a_2$.
  - If $a_1$ is a release on processor $p_1$, and $a_2$ is an acquire on the same location on processor $p_2$, and $a_2$ returns the value written by $a_1$, then $a_1 \xrightarrow{hb1} a_2$.
  - If $a_1 \xrightarrow{hb1} a_2$ and $a_2 \xrightarrow{hb1} a_3$, then $a_1 \xrightarrow{hb1} a_3$.
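The three rules above can be tried out on a small trace. The sketch below is only an illustration: the function name `happened_before_1`, the string labels for accesses, and the three-processor trace are all assumptions made for this example, not part of the original slides. It builds the relation from program order and release/acquire pairs, then closes it transitively.

```python
from itertools import product


def happened_before_1(program_order, release_acquire):
    """Return the set of (a, b) pairs such that a happened-before-1 b.

    program_order: list of per-processor access lists, each in program order.
    release_acquire: set of (release, acquire) pairs in which the acquire
        returns the value written by the release.
    """
    # Rule 2: a release precedes the acquire that observes it.
    hb = set(release_acquire)
    # Rule 1: program order on each processor.
    for accesses in program_order:
        for i, a in enumerate(accesses):
            for b in accesses[i + 1:]:
                hb.add((a, b))
    # Rule 3: transitive closure, repeated until no new pair appears.
    changed = True
    while changed:
        changed = False
        for (a, b), (c, d) in product(list(hb), repeat=2):
            if b == c and (a, d) not in hb:
                hb.add((a, d))
                changed = True
    return hb


# Hypothetical trace: a lock handed from p1 to p2 to p3.
p1 = ["p1.w(x)", "p1.rel"]
p2 = ["p2.acq", "p2.w(x)", "p2.rel"]
p3 = ["p3.acq", "p3.r(x)"]
hb = happened_before_1(
    [p1, p2, p3],
    {("p1.rel", "p2.acq"), ("p2.rel", "p3.acq")},
)
print(("p1.w(x)", "p3.r(x)") in hb)  # the write on p1 precedes the read on p3
```

The closure loop is quadratic in the number of pairs per pass, which is fine for a toy trace; a real system would not materialize the relation but track it with vector timestamps.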
13
Write Notices
• RC requires that before a processor may continue past an acquire, all shared accesses that precede the acquire must be performed at the acquiring processor
• In LRC, this is guaranteed by write notices
• Write notice: an indication that a page has been modified
14
Write Notice Propagation
• The execution of each processor is divided into intervals
• A new interval begins each time the processor executes a special access
• An interval is performed at a processor when all modifications made during that interval have been performed at that processor
15
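Write Notice Propagation
[Figure: Timeline of processors P1 to P4 with the events w(x), rel, acq, w(x), rel, acq, r(x), divided into the intervals $i_{p1}$, $i_{p2}$, $i_{p3}$, $i_{p4}$.]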
16
Write Notice Propagation
• $V_p(i)$: vector timestamp for interval $i$ of processor $p$
• Number of elements in $V_p(i)$ = number of processors
• Entry for $p$ in $V_p(i)$ = $i$
• Entry for $q \neq p$ in $V_p(i)$ = most recent interval of $q$ performed at $p$
17
Write Notice Propagation
• $V_{p1}(i_{p1}) = \{i_{p1}, 0, 0, 0\}$
• $V_{p2}(i_{p2}) = \{i_{p1}, i_{p2}, 0, 0\}$
• $V_{p3}(i_{p3}) = \{0, i_{p2}, i_{p3}, 0\}$
• $V_{p4}(i_{p4}) = \{0, 0, i_{p3}, i_{p4}\}$
• On an acquire, the acquiring processor p3 sends its current vector timestamp to the previous releaser p2.
• Processor p2 uses this information to send p3 the write notices for all intervals of all processors that have been performed at p2 but not at p3.
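As a rough illustration of the acquire-time exchange, the sketch below compares two vector timestamps and picks out the write notices the acquirer is missing. Everything named here is an assumption for the example: the functions `notices_to_send` and `merge_timestamps`, the zero-based processor indices, the convention that intervals are numbered from 1 with 0 meaning "none performed yet", and the `notice_log` dictionary. Taking the element-wise maximum afterwards is the usual vector-timestamp merge and is also an assumption about how the acquirer updates its view.

```python
def notices_to_send(releaser_vt, acquirer_vt, notice_log):
    """Pick write notices for intervals performed at the releaser but not the acquirer.

    releaser_vt, acquirer_vt: lists of ints, one entry per processor.
    notice_log: hypothetical map (processor, interval) -> set of modified pages.
    """
    missing = {}
    for q, (at_releaser, at_acquirer) in enumerate(zip(releaser_vt, acquirer_vt)):
        # Intervals of processor q that the releaser has seen and the acquirer has not.
        for interval in range(at_acquirer + 1, at_releaser + 1):
            pages = notice_log.get((q, interval))
            if pages:
                missing[(q, interval)] = pages
    return missing


def merge_timestamps(acquirer_vt, releaser_vt):
    """Element-wise maximum: the acquirer now knows everything the releaser knew."""
    return [max(a, r) for a, r in zip(acquirer_vt, releaser_vt)]


# Hypothetical state with three processors (indices 0, 1, 2).
releaser_vt = [1, 2, 0]   # releaser is processor 1
acquirer_vt = [0, 1, 0]   # acquirer is processor 2, about to start a new interval
notice_log = {(0, 1): {7}, (1, 1): {3}, (1, 2): {7, 9}}

sent = notices_to_send(releaser_vt, acquirer_vt, notice_log)
print(sorted(sent))        # which (processor, interval) notices travel
print(merge_timestamps(acquirer_vt, releaser_vt))
```

Only the notices travel at this point. Whether the data itself moves depends on the invalidate or update choice.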
18
Data Movement Protocols
• Multiple-writer protocol
• False sharing
  - Occurs when two or more processors access different variables within a page, with at least one of the accesses being a write
  - Generates a large amount of message traffic
  - Handling false sharing is important for software DSM because of the large page size
• LRC allows a multiple-writer protocol:
  - Allows concurrent writes to different parts of the page
  - No message traffic while the writes are in progress
  - Modifications are merged using diffs
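A minimal sketch of how diffs let two concurrent writers of the same page be merged is shown below. It assumes each writer keeps an unmodified copy of the page to compare against (often called a "twin" in the DSM literature, a term that does not appear on these slides) and that diffs are taken byte by byte. The names `make_diff` and `apply_diff` and the 8-byte page are hypothetical choices for this example.

```python
def make_diff(twin, page):
    """Record (offset, new_byte) for every byte that differs from the twin."""
    return [(i, new) for i, (old, new) in enumerate(zip(twin, page)) if old != new]


def apply_diff(page, diff):
    """Apply a diff in place to a copy of the page."""
    for offset, value in diff:
        page[offset] = value


# A tiny 8-byte "page" shared by two writers (false sharing: different offsets).
original = bytearray(8)

copy_a = bytearray(original)
copy_a[0] = 0xAA            # writer A updates a variable at offset 0

copy_b = bytearray(original)
copy_b[5] = 0xBB            # writer B updates a different variable at offset 5

diff_a = make_diff(original, copy_a)
diff_b = make_diff(original, copy_b)

# A third processor merges both diffs into its own copy before accessing the page.
merged = bytearray(original)
apply_diff(merged, diff_a)
apply_diff(merged, diff_b)
print(merged.hex())
```

Because the two diffs touch disjoint offsets, the order of application does not matter here. Writes to the same location would need synchronization to be ordered, which is outside what this sketch covers.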
19
Invalidate vs. Update
• Invalidate
  - The acquiring processor invalidates all pages in its cache for which it receives write notices.
• Update
  - The acquiring processor updates those pages.
  - Diffs must be obtained for all concurrent modifiers.
  - For interval $i$, diffs must be obtained from all intervals $j$ such that $j \xrightarrow{hb1} i$ and there exists no $k$ such that $j \xrightarrow{hb1} k \xrightarrow{hb1} i$.
20
Access Misses
• A copy of the page as well as a number of diffs may have to be retrieved
• Modifications summarized by the diffs are merged before the access
• Access miss: at interval $i$, diffs must be obtained from all intervals $j$ such that $j \xrightarrow{hb1} i$ and there exists no $k$ such that $j \xrightarrow{hb1} k \xrightarrow{hb1} i$
• If the processor has an invalidated copy of the page:
  - The whole page is not sent
  - Write notices contain all the necessary information about the diffs
  - This reduces the amount of data sent
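The selection rule for diffs can be sketched with vector timestamps. The code below is an illustration under several assumptions that are not stated on the slides: intervals are represented as dictionaries with hypothetical fields `proc`, `index`, and `vt`; interval $j$ is taken to precede interval $k$ when $k$'s timestamp entry for $j$'s processor is at least $j$'s index; and the candidates for $j$ and $k$ are restricted to intervals that modified the page in question. It only implements the rule as written, not a full access-miss handler.

```python
def precedes(j, k):
    """True if interval j happened-before-1 interval k, judged by vector timestamps."""
    same = j["proc"] == k["proc"] and j["index"] == k["index"]
    return not same and k["vt"][j["proc"]] >= j["index"]


def diffs_needed(current, modifiers):
    """Intervals j with j -> current and no other modifier k with j -> k -> current."""
    before = [j for j in modifiers if precedes(j, current)]
    return [j for j in before if not any(precedes(j, k) for k in before)]


# Hypothetical history for one page, four processors (indices 0 to 3).
a = {"name": "A", "proc": 0, "index": 1, "vt": [1, 0, 0, 0]}
b = {"name": "B", "proc": 1, "index": 1, "vt": [1, 1, 0, 0]}  # acquired after A
c = {"name": "C", "proc": 2, "index": 1, "vt": [0, 0, 1, 0]}  # concurrent with A and B
current = {"name": "I", "proc": 3, "index": 1, "vt": [1, 1, 1, 1]}

needed = [j["name"] for j in diffs_needed(current, [a, b, c])]
print(needed)
```

In this trace A is ordered before B, so only B and C are selected as the concurrent last modifiers from which diffs are requested.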
21
Conclusion
• The performance of software DSM is sensitive to the number of messages and the amount of data exchanged to create the shared memory abstraction.
• LRC aims to reduce both the number of messages and the amount of data exchanged by allowing changes to propagate lazily, only when needed.