
1 Release Consistency Slides by Konstantin Shagin, 2002

2 The need for Relaxed Consistency Schemes Any implementation of Sequential Consistency requires some global control mechanism: either writes or reads must trigger memory synchronization operations. –In most implementations, writes require some kind of memory synchronization. [Timeline diagram: processes A and B; each of the writes w(x), w(y), w(x) triggers an immediate synchronization operation]

3 The Idea of Relaxed Consistency Schemes Relaxed consistency schemes are designed to require fewer memory synchronization operations. –Writes can be delayed, aggregated, or eliminated. –This results in less communication and therefore higher performance. [Timeline diagram: the same writes by A and B, now synchronized only at a barrier]

4 Software Distributed Shared Memory [Diagram: nodes 1..n, each with its own memory, connected by a network and jointly presenting a distributed shared memory] Page-based, enforced via page permissions. Provides a single system image with a shared virtual address space.

5 False Sharing False sharing is a situation in which two or more processes access different variables within a page and at least one of the accesses is a write. –If only one process is allowed to write to a page at a time, false sharing leads to unnecessary communication, called the “ping-pong” effect.
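To make the layout concrete, here is a minimal C sketch; the 4 KB page size and the struct layout are illustrative assumptions, not taken from the slides. Two unrelated variables that happen to fall on the same page are enough to trigger the ping-pong effect under a single-writer protocol.

    #include <stdio.h>

    #define PAGE_SIZE 4096   /* assumed page size */

    /* x and y are logically unrelated, but the padding forces both
     * onto the same page-sized block, the DSM's unit of coherence. */
    struct page {
        int  x;                                  /* written by process A   */
        char pad[PAGE_SIZE - 2 * sizeof(int)];
        int  y;                                  /* only read by process B */
    };

    int main(void) {
        struct page p = { .x = 0, .y = 0 };
        printf("one coherence unit: %zu bytes\n", sizeof p);
        /* Under a single-writer protocol, every w(x) by A invalidates
         * B's copy of the page, and every r(y) by B fetches it back:
         * communication with no true sharing. */
        p.x = 1;
        printf("x=%d y=%d\n", p.x, p.y);
        return 0;
    }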

6 Understanding False Sharing [Timeline diagrams: with x and y on the same page p, A's repeated w(x) and B's repeated r(y) bounce p between the two processes; with x on page p1 and y on page p2, the accesses no longer interfere]

7 False Sharing in Relaxed Consistency Schemes False sharing has a much smaller overhead in relaxed consistency models. The overhead induced by false sharing can be further reduced by the usage of multiple-writer protocols. Multiple-writer protocols allow multiple processes to simultaneously modify their local copy of a shared page. –The modifications are merged at certain points of execution.

8 Release Consistency [Gharachorloo et al. 1990, DASH]* Introduces a special type of variables, called synchronization variables or locks. Locks cannot be read or written to; they can only be acquired and released. For a lock L these operations are denoted acquire(L) and release(L), respectively. We say that a process that has acquired a lock L but has not released it holds the lock L. No more than one process can hold a lock L at a time: one process holds the lock while the others wait. (*) K. Gharachorloo, D. Lenoski, J. Laudon, P. Gibbons, A. Gupta, and J. L. Hennessy. Memory consistency and event ordering in scalable shared-memory multiprocessors. In Proceedings of the 17th Annual International Symposium on Computer Architecture, IEEE, May 1990.

9 Using Release and Acquire to define execution-flow synchronization primitives Let a set of processes release tokens by reaching the release operation in their program order. Let another set (possibly overlapping) acquire those tokens by performing an acquire operation, where the acquire can proceed only when the tokens from all releasing processes have arrived. 2-way synchronization = lock-unlock: 1 release, 1 acquire. n-way synchronization = barrier: n releases, n acquires. PARC's synch = k-way synchronization.

10 Model of Atomicity A read by Pi is considered performed with respect to process Pk at the point in time when the issuing of a write to the same address by Pk can no longer affect the value returned by the read. A write by Pi is considered performed with respect to process Pk at the point in time when an issued read to the same address by Pk returns the value defined by this write (or a later value). An access is performed when it is performed with respect to all processes. An acquire(L) by Pi is performed when Pi receives exclusive ownership of L (before any other requester). A release(L) by Pi is performed when Pi gives away its exclusive ownership of L.

11 Formal Definition of Release Consistency Conditions for Release Consistency: (A) Before a read or write access is allowed to perform with respect to any other process, all previous acquire accesses must be performed, and (B) before a release access is allowed to perform with respect to any other process, all previous read or write accesses must be performed, and (C) acquire and release accesses are sequentially consistent.

12 Understanding RC [Timeline diagram: A performs w(x)1 and then rel(L1); B performs r(x)0, then r(x)?, then acq(L1), then r(x)1] –Once the release has performed, all processes must see the value 1 in x. –The value returned by the unsynchronized r(x)? is undefined: it can be any value written by some process, here 0 or 1. By rule (B) the write must eventually become visible, but the programmer cannot rely on when. –The read after acq(L1) is guaranteed to return 1, by rules (C) and (A).

13 Acquire and Release A release serves as a memory-synch operation, a flush of the local modifications to the attention of all other processes. By this definition, the acquire and release operations are used not only for synchronization of execution, but also for synchronization of memory, i.e. for the propagation of writes from/to other processes. –This allows the two expensive kinds of synchronization to be overlapped. –It is also semantically simpler for the programmer.

14 Acquire and Release (cont.) A release followed by an acquire of the same lock guarantees to the programmer that all writes previous to the release will be seen by all reads following the acquire. The idea is to let the programmer decide which blocks of operations need to be synchronized, and put them between a matching pair of acquire-release operations. In the absence of release/acquire pairs, there is no assurance that modifications will ever propagate between processes.
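As a usage sketch, the pattern looks like the following C fragment; dsm_acquire and dsm_release are hypothetical names standing in for the slides' acquire(L)/release(L), stubbed out here so the fragment is self-contained.

    #include <stdio.h>

    /* Hypothetical stand-ins for the DSM primitives acquire(L)/release(L);
     * a real system would synchronize execution and memory here. */
    static void dsm_acquire(int lock) { (void)lock; /* get ownership of L, pull in prior writes */ }
    static void dsm_release(int lock) { (void)lock; /* give up L, publish local writes */ }

    static int shared_counter;   /* imagine this in the shared virtual address space */

    int main(void) {
        dsm_acquire(1);          /* reads below see writes before the last release of lock 1 */
        shared_counter += 1;     /* the block of operations the programmer chose to synchronize */
        dsm_release(1);          /* writes above become visible to later acquirers of lock 1 */
        printf("counter = %d\n", shared_counter);
        return 0;
    }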

15 Consistency of synchronization operations Note that the ordering of the release/acquire operations among themselves also defines an independent memory consistency scheme. –Rule (C) defined it to be Sequential Consistency. There are other flavors of RC in which the consistency of synchronization operations is defined to be some consistency model x (e.g., Coherence). Such a memory model is denoted RCx. RCx is weaker than RCy if x is weaker than y. For simplicity, we deal only with RCsc.

16 Happened-Before relation induced by acquire/release Redefine the happened-before relation using acquire and release instead of receive and send, respectively. We say that event e happened before event e' (denoted e → e' or e < e') if one of the following properties holds: Processor Order: e precedes e' in the same process. Release-Acquire: e is a release and e' is the following acquire of the same lock. Transitivity: there exists e'' such that e < e'' and e'' < e'.

17 Happened-Before relation induced by acquire/release (cont.) [Timeline diagram: processes A, B, and C interleave writes and reads of x and y with acquires and releases of L1 and L2, illustrating the happened-before edges induced by the release-acquire pairs]

18 Competing Accesses Two memory accesses are not synchronized if they are independent events according to the previously defined happened-before relationship. Two memory accesses are conflicting if they are accesses to the same memory location, and at least one of them is a write. Conflicting accesses are said to be competing if there exists an execution in which they are not synchronized. Competing accesses form a race condition as they may be executed concurrently.

19 Data Races in RC Release Consistency does not guarantee anything about the propagation of updates without synchronization. Example, initially grades = oldDatabase and updated = false:

Thread T.A.:
    grades = newDatabase;
    updated = true;

Thread Lecturer:
    while (updated == false);
    X := grades.gradeOf(lecturersSon);

If the modification of the variable updated is propagated to Lecturer while the modification of grades is not, then Lecturer looks at the old database! This is possible in Release Consistency, but not in Sequential Consistency.
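A hedged sketch of the properly-labeled fix, reusing the hypothetical dsm_acquire/dsm_release stubs from the earlier sketch: bracketing both the writes and the polling read with the same lock removes the competing accesses, so RCsc behaves like SC for this program (the theorem on the next slide).

    #include <stdbool.h>

    static void dsm_acquire(int lock) { (void)lock; }   /* hypothetical DSM primitives */
    static void dsm_release(int lock) { (void)lock; }

    static int  grades  = 1;       /* 1 stands for oldDatabase, 2 for newDatabase */
    static bool updated = false;

    void thread_TA(void) {
        dsm_acquire(1);
        grades  = 2;               /* grades = newDatabase           */
        updated = true;
        dsm_release(1);            /* both writes propagate together */
    }

    void thread_Lecturer(void) {
        bool done = false;
        while (!done) {
            dsm_acquire(1);        /* pulls in T.A.'s writes, if any */
            done = updated;
            dsm_release(1);
        }
        /* reading grades here is ordered after T.A.'s release, so it
         * cannot observe updated == true with the old grades */
    }

    int main(void) {               /* sequential stand-in for two threads */
        thread_TA();
        thread_Lecturer();
        return 0;
    }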

20 Expressiveness of Release Consistency [Gharachorloo et al. 1990] Define a properly-labeled (PL) program as one that has no competing accesses. Theorem: RCsc = SC for PL programs. One should therefore make sure there are no data races.

21 Implementing RC The first implementation was proposed by the inventors of RC and is called DASH. DASH combats memory latency by pipelining writes to shared memory. The processor stalls only when executing a release, at which time it must wait for all its previous writes to perform. [Timeline diagram: P1's writes w(x), w(y), w(z) are pipelined to P2; P1 stalls at rel(L) until all three writes have performed]

22 Implementing RC (cont.) It is important to reduce the number of message exchanges, because every message carries a fixed overhead, independent of its size. Another implementation of RC, called Munin, reduces the number of messages by buffering writes until a release. [Timeline diagram: P1 buffers w(x), w(y), w(z) and sends x, y, z to P2 in a single message at rel(L)]

23 Eager Release Consistency [Carter et al. 1991, Munin]* An implementation of Release Consistency (not a new memory model). Postpone sending modifications until the next release. Upon a release, send all accumulated modifications to all caching processes. No memory-synchronization operations on an acquire. Upon a miss (no local caching of the variable), get the latest modification from the latest modifier (this needs some more control to store its identity, no big deal). (*) John B. Carter, John K. Bennett, and Willy Zwaenepoel. Implementation and Performance of Munin. In Proceedings of the 13th ACM Symposium on Operating Systems Principles, October 1991.

24 Understanding ERC A release operation does not complete (is not performed) until acknowledgements from all the processes are received. [Timeline diagram: A, B, and C accumulate writes w(x)1, w(y)1, w(z)1; at each release the releaser sends its accumulated changes to the other processes, which apply them before acknowledging; later reads then return the released values]

25 Supporting Multiple Writers in ERC Modifications are detected by twinning. –When writing to an unmodified page, a twin of the page is created. –When releasing, the final copy of the page is compared to its twin. –The resulting difference is called a diff. Twinning and diffing not only allow multiple writers, but also reduce communication. –Sending a diff is cheaper than sending an entire page.
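A minimal single-process C sketch of the mechanism, assuming word-granularity comparison on a 4 KB page (real systems create the twin from the write fault handler rather than by an explicit call):

    #include <stdio.h>
    #include <string.h>

    #define PAGE_WORDS 1024   /* a 4 KB page as 4-byte words (assumption) */

    typedef struct { int offset; int value; } diff_entry;

    /* First write to a clean page: save an exact copy, the twin. */
    static void make_twin(const int *page, int *twin) {
        memcpy(twin, page, PAGE_WORDS * sizeof(int));
    }

    /* At release: compare the working copy to the twin; changed words form the diff. */
    static int make_diff(const int *page, const int *twin, diff_entry *out) {
        int n = 0;
        for (int i = 0; i < PAGE_WORDS; i++)
            if (page[i] != twin[i])
                out[n++] = (diff_entry){ i, page[i] };
        return n;   /* usually far smaller than a whole page */
    }

    /* At a receiver: merge the diff into its local copy. */
    static void apply_diff(int *page, const diff_entry *d, int n) {
        for (int i = 0; i < n; i++)
            page[d[i].offset] = d[i].value;
    }

    int main(void) {
        static int page[PAGE_WORDS], twin[PAGE_WORDS], remote[PAGE_WORDS];
        static diff_entry d[PAGE_WORDS];
        make_twin(page, twin);   /* triggered by the first write fault */
        page[7] = 42;            /* the local modification             */
        int n = make_diff(page, twin, d);
        apply_diff(remote, d, n);
        printf("diff entries: %d, remote[7] = %d\n", n, remote[7]);
        return 0;
    }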

26 Twinning and Diffing [Diagram: on the first write to page P, a twin is created and P becomes a writable working copy; at release, the working copy is compared against the twin to produce a diff]

27 Update-based vs. Invalidate-based In update-based protocols the modifications themselves are sent, whereas in invalidate-based protocols only notifications of modifications are sent. [Diagrams: update-based, P1 sends "x:=1, y:=2" to P2 at rel(L); invalidate-based, P1 sends only "I changed x and y"]

28 Update-Based vs. Invalidate-Based (cont.) Invalidations are smaller than updates. –The bigger the coherency unit, the bigger the difference. In invalidation-based schemes there can be significant overhead due to access misses. [Timeline diagram: P1 writes x and y and sends inv(x), inv(y) at rel(L); after acq(L), P2's r(x) and r(y) each miss and must fetch x=1 and y=2 separately]

29 Reducing the Number of Messages In the DASH and Munin systems all processes (or all processes that cache the page) see the updates of a process. Consider the following example of an execution in Munin: [Timeline diagram: P1, P2, and P3 in turn acquire L, write x, and release; every release updates all other processes, although P4 reads x only after the final release] There are many unneeded messages; in DASH even more. This problem exists in invalidation-based schemes as well.

30 Reducing the Number of Messages (cont.) Logically, however, it suffices to update each processor's copy only when it acquires L. [The same timeline, with updates sent only along the lock transfers] Therefore a new algorithm for implementing RC, called Lazy Release Consistency (LRC), was proposed. LRC aims to reduce both the number of messages and the amount of data exchanged.

31 Lazy Release Consistency [Keleher et al., TreadMarks 1992]* The idea is to postpone sending modifications until a remote processor actually needs them. An invalidate-based protocol. The BIG advantage: no need to get modifications that are irrelevant because they are already masked by newer ones. NOTE: implements a slightly more relaxed memory model than RC! (*) P. Keleher, A. L. Cox, S. Dwarkadas, and W. Zwaenepoel. TreadMarks: Distributed shared memory on standard workstations and operating systems. In Proceedings of the 1994 Winter Usenix Conference, January 1994.

32 Formal Definition of Lazy Release Consistency Conditions for Lazy Release Consistency: (A) Before a read or write access is allowed to perform with respect to any other process, all previous acquire accesses must be performed with respect to that other process, and (B) before a release access is allowed to perform with respect to any other process, all previous read or write accesses must be performed with respect to that other process, and (C) acquire and release accesses are sequentially consistent.

33 Understanding the LRC Memory Model [Timeline diagram: A performs w(x)1 and rel(L1); B later acquires L1 and reads x = 1, while C, which acquires a different lock L2, may still read 0 or an undefined value] It is guaranteed that the acquirer of the same lock sees the modifications that precede the release in program order.

34 Understanding the LRC Memory Model: Transitivity [Timeline diagram: A writes x = 1 and releases L1; B acquires L1, writes y = 1, and releases L2; C acquires L2 and reads x = 1 and y = 1] Process C sees the modification of x by A, transitively through B.

35 Implementation of LRC Satisfying the happened-before relationship between all operations is enough to satisfy LRC. –Maintenance and usage of such a detailed ordering would be expensive. Instead, the ordering is applied to process intervals. –Intervals are segments of time in the execution of a single process. –A new interval begins each time a process executes a synchronization operation.

36 Intervals [Timeline diagram: P1, P2, and P3; each acquire or release starts a new interval on its process]

37 Happened-before of Intervals A happened-before partial order is defined between intervals. An interval i1 precedes an interval i2 according to the happened-before of intervals if all accesses in i1 precede the accesses in i2 according to the happened-before of accesses.

38 Vector Timestamps An interval is said to be performed at a process if all of the interval's accesses have been performed at that process. Each process p has a vector timestamp Vp that tracks which intervals have been performed at that process. –A vector timestamp consists of a set of interval indices, one per process in the system.

39 Management of Vector Timestamps Vector timestamps are managed like vector clocks. –send and receive events are replaced by release and acquire (of the same lock), respectively. –A lock grant message (sent from the releaser to the acquirer to hand over exclusive ownership) contains the current timestamp of the releaser.
1. Just before executing a release or acquire, p increments its own entry: Vp[p] := Vp[p] + 1.
2. A lock grant message m is time-stamped with t(m) = Vp.
3. Upon acquire, for every q: Vp[q] := max{ Vp[q], t(m)[q] }.
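In C, the three rules amount to a few lines; NPROC and the message layout are illustrative assumptions:

    #define NPROC 4   /* illustrative system size */

    typedef struct { int v[NPROC]; } vtime;

    /* Rule 1: process p starts a new interval at each release/acquire. */
    void new_interval(vtime *Vp, int p) {
        Vp->v[p] += 1;
    }

    /* Rule 2: the lock grant message m carries t(m) = Vp of the releaser. */
    vtime stamp_grant(const vtime *Vp) {
        return *Vp;
    }

    /* Rule 3: the acquirer takes the entry-wise maximum with t(m). */
    void merge_on_acquire(vtime *Vp, const vtime *tm) {
        for (int q = 0; q < NPROC; q++)
            if (tm->v[q] > Vp->v[q])
                Vp->v[q] = tm->v[q];
    }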

40 Vector Timestamps (cont.) A process updates its vector timestamp at the end of an interval; therefore, during an interval the process' timestamp does not change. We denote the vector timestamp of process p at interval i by Vp^i. The entry for a process q ≠ p is denoted Vp^i[q]. –It specifies the most recent interval of process q that has been performed at process p. –Entry Vp^i[p] is always equal to i. An interval x of process q is said to be covered by Vp^i if Vp^i[q] ≥ x.
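With the vtime type from the previous sketch, the covering test is a single comparison:

    /* Interval x of process q is covered by V iff V[q] >= x. */
    int covered(const vtime *V, int q, int x) {
        return V->v[q] >= x;
    }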

41 Write Notices A write notice is an indication that a given page has been modified. Each process keeps a table of the intervals covered by it. –An entry in this table represents an interval; it contains a write notice for every page that was modified during the segment of time corresponding to the interval. Write notices are sent in the lock grant message, along with the vector timestamp of the releaser.

42 Write Notices (cont.) It is not necessary to send the acquirer write notices belonging to intervals covered by its vector timestamp. In order to let the releaser know which intervals are covered by the acquirer, the acquirer sends the releaser its timestamp inside a lock request message. When the releaser sends a lock grant message to the acquirer, it sends only the write notices belonging to intervals covered by itself but not covered by the acquirer. When the acquirer receives the lock grant message, it invalidates all the pages for which a write notice is included in the message.
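A sketch of the releaser-side filtering, reusing vtime and covered() from the sketches above; the interval table layout is an assumption, not TreadMarks' actual data structure:

    enum { MAX_PAGES = 8 };   /* illustrative bound */

    typedef struct {
        int proc, index;             /* interval (proc, index)   */
        int pages[MAX_PAGES];        /* pages with write notices */
        int npages;
    } interval_rec;

    /* Select the intervals in the releaser's table (everything it covers)
     * that are not covered by the acquirer's timestamp. */
    int select_notices(const interval_rec *tbl, int n,
                       const vtime *acquirer, const interval_rec **out) {
        int k = 0;
        for (int i = 0; i < n; i++)
            if (!covered(acquirer, tbl[i].proc, tbl[i].index))
                out[k++] = &tbl[i];   /* goes into the lock grant message */
        return k;
    }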

43 Write Notices (cont.) [Timeline diagram: B sends A a lock request carrying its vector timestamp VC_B; A generates write notices and replies with a lock grant containing the write notices for intervals not covered by VC_B; B invalidates the listed pages, and on a later access to x or y requests and applies the diffs from A]

44 Access Misses When accessing an invalidated page, all the modifications made to it in the intervals that happened before the current interval must be obtained. –Note that this is true even if the access is a write. A process can identify those intervals and the processes that performed the modifications by the write notices it has for the page. –A write notice is saved along with the id of the process from which it was received and its vector timestamp. How do we merge modifications performed by concurrent writers to a page?

45 Tracking Modifications with Multiple Writers It is possible that several processes modify different variables on the same page. If the intervals in which the modifications are performed are independent (according to happened-before), we cannot just bring the page from one of the processes. What should we do? Employ the twinning and diffing technique again! [Diagram: P1 modifies x and P2 modifies y on the same page]

46 Twinning and Diffing (reminder) [Diagram: on the first write to page P, a twin is created and P becomes a writable working copy; at release, the working copy is compared against the twin to produce a diff]

47 Tracking Modifications with Multiple Writers (cont.) Note that twinning and diffing not only allow multiple independent writers but also significantly reduce the amount of data sent. [Timeline diagram: P1 writes x and P3 writes y on the same page P; after the lock transfers and inv(P), P2 fetches the two diffs rather than whole pages]

48 Access Misses (cont.) Consider the following scenario, in which P3 takes a miss on a page containing the variables x, y, and z: [Timeline diagram: P1 writes x and releases; P2 acquires, receives inv(x), writes y, and releases; P3 acquires, receives inv(x,y), and then r(z) misses, fetching mod(x,y) from P2] When accessing z, P3 sees from the locally stored write notices that there have been two previous modifications. They are ordered by the happened-before relationship; therefore P3 can request both modifications from P2.

49 Access Misses (cont.) More generally, if processor q modified page P in its interval x, then q is guaranteed to have any diffs of P created in intervals that happened-before the interval x. Therefore, even if diffs from multiple writers need to be retrieved, it is usually only necessary to communicate with very few processors. How long should a process keep the diffs? How long should a process keep the write notices? Clearly, not forever! Garbage collection needs to be done…

50 Garbage Collection A diff needs to be retained until it is clear it will never be requested. –This happens when a diff has already been sent to every processor. When a process sees it is running out of memory it initiates garbage collection, which is invoked at the next barrier. Garbage collection piggybacks on the barrier to “stop the world”. Each process receives all write notices in the system and uses them to validate all of its cached pages. As a result, all write notices and diffs are discarded.

51 Lazy Diffing
1. Don't diff on every release; do it only if there is communication with another node.
2. Delay generation of a diff until somebody asks for the lock. –When passing a lock token, send write notices for modified pages, but leave the pages write-enabled. –If somebody asks for a diff, diff and mark the page clean. The diff may include updates from later intervals (e.g., made under the scope of other locks).
3. Must also generate a diff if a write notice arrives. Must invalidate the page but keep the modifications.

52 LRC with Lazy Diffing [Timeline diagram: P1 acquires L, makes a twin, writes x, and releases; only when P2 acquires L does P1 compute the diff, which P2 applies before its r(x)]

53 Benefits of Lazy Diffing The gain is considerable. –The eventual diff may include modifications that would have been split over several diffs. –Lock acquisitions are faster: no need to wait for diffs. –Reducing the number of diffs reduces the overall amount of transmitted data.

54 Drawbacks of Traditional LRC
1. At an access miss, a node may have to obtain diffs from several nodes. –This happens when there is substantial write-write false sharing.
2. The same diff may be applied many times. –Once at each node that fetches it.
3. The need to save all diffs seen by a node significantly increases memory consumption. –A node that creates a diff needs to store it locally. –A node also stores the diffs it fetched from other nodes.
4. Garbage collection is an expensive global operation.

55 Home-based Lazy Release Consistency* HLRC is a simple home-based multiple-writer protocol that implements LRC. Each page has a designated node, called the home node of the page, which holds its master copy. Diffs are computed at lock transfer time, sent to the home nodes of the corresponding pages, and then discarded. On an access miss (read or write), an entire page is fetched from home. HLRC solves the drawbacks of LRC mentioned above. (*) L. Iftode. Home-based Shared Virtual Memory. PhD thesis, Dept. of Computer Science, Princeton University, 1998.
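A sketch of the two sides of the protocol in C; send_diff_to_home and fetch_page_from_home are hypothetical placeholders for the messaging layer, not a real API:

    #define NPAGES 64   /* illustrative */

    typedef struct { int home; int valid; } page_meta;
    static page_meta pages[NPAGES];

    /* Placeholders for the messaging layer (assumed). */
    static void send_diff_to_home(int page, int home) { (void)page; (void)home; }
    static void fetch_page_from_home(int page, int home) { (void)page; (void)home; }

    /* At lock transfer time: push the diff of a dirty page to its home,
     * after which the diff can be discarded locally. */
    void hlrc_flush_page(int page) {
        send_diff_to_home(page, pages[page].home);
    }

    /* On an access miss, read or write: bring the whole master copy. */
    void hlrc_miss(int page) {
        fetch_page_from_home(page, pages[page].home);
        pages[page].valid = 1;
    }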

56 Understanding HLRC [Timeline diagram: P1 acquires L, makes a twin, writes x, and at rel(L) computes a diff and sends it to the home node; P2 then acquires L and its r(x) fetches the page p from the home] Assume x is a variable on a page p whose home is P3. What happens if P2 tries to fetch p before the diff arrives at P3?

57 Guaranteeing Update Completion Before a Fetch There are several techniques to ensure that the home’s copy of a page contains the required updates. –Write flushing –Page versions (scalar timestamps) –Vector timestamps All the techniques require that the network delivers the messages in the order that they are sent.

58 Write Flushing The simplest approach. Delay the completion of a release until all the updates propagated to the corresponding homes have completed. –Completion is ensured by having the home acknowledge the receipt of diffs.
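In sketch form, with a hypothetical acknowledgement counter and messaging hooks, the release simply blocks on outstanding diff acknowledgements:

    /* Diffs sent to homes but not yet acknowledged (illustrative). */
    static int pending_acks;

    void on_diff_ack(void) { pending_acks -= 1; }   /* called by the message layer */

    void flushing_release(void) {
        /* ... send diffs to the relevant homes, bumping pending_acks per send ... */
        while (pending_acks > 0)
            ;   /* a real system would block on message receipt, not spin */
        /* only now may the lock grant be sent */
    }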

59 Write Flushing (cont.) There are two drawbacks: 1. Latency increases due to the need to wait for the completion of the update operations. 2. Page prefetching: a page fetched from home may contain more recent updates than the ones required by LRC. [Timeline diagram: P1 twins and writes x, sends its diff to the home P3 at rel(L), and waits for the acknowledgement; P2's later r(x) fetches the page p from P3]

60 Page prefetching [Timeline diagram: diffs for writes to x and y on page p reach the home P3 from different writers; when P2 fetches p it already contains the later update, so after the next acquire's inv(p) there is no need to bring p again]

61 Page Versions A version number is attached to each page. The page version number is incremented at the home whenever the home receives the updates performed by a non-home writer within an interval. The home sends the page version either in reply to a diff message, or in reply to a fetch request, along with the page itself. The page version numbers are included in the write notices. A local page is not invalidated if the local page version is greater than or equal to the required page version included in the write notice.
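The invalidation check then becomes a single version comparison; the field names below are illustrative:

    typedef struct { int version; int valid; } pversion_state;

    /* On receiving a write notice carrying the required page version. */
    void on_versioned_notice(pversion_state *p, int required) {
        if (p->version >= required)
            return;        /* local copy already contains that update         */
        p->valid = 0;      /* otherwise invalidate; refetch from home on miss */
    }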

62 Page Versions (cont.) Using page versions avoids unnecessary invalidations, but still requires waiting for updates to complete at home. [Timeline diagram: the home's copy of p advances from version 1 to version 2 as diffs arrive; a node holding version 2 ignores the write notice inv(p,1), because the version of its local copy is already 2]

63 Vector Timestamps Vector timestamps represent page versions, but avoid the need to wait for the completion of updates. The lock can be transferred immediately, because the vector timestamp representing the new page version can be calculated without the cooperation of the home. Prefetching is detected in the same way as with scalar page versions.

64 Vector Timestamps (cont.) A vector timestamp attached to a valid page indicates the current version of the page. –Such a timestamp is called a flush timestamp. –At the home, flush timestamps are updated each time the updates corresponding to a remote interval are performed. –At a non-home node, the flush timestamp is updated either at the end of intervals during which the page was written, or when the page is fetched from home.

65 Vector Timestamps (cont.) A vector timestamp attached to an invalid page indicates the page version that the node has to fetch from home. –Such a timestamp is called a lock timestamp. –The lock timestamp is updated at acquire time, as a result of applying the write notices received from the last releaser. The lock timestamp is presented to the home as part of a fetch request. –The home delays answering a fetch request if the required version is not yet available (because the corresponding updates have not completed).
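Reusing the vtime type from the vector-timestamp sketch, the home's test is plain vector dominance:

    /* flush_ts dominates lock_ts iff every entry is at least as large. */
    int dominates(const vtime *flush_ts, const vtime *lock_ts) {
        for (int q = 0; q < NPROC; q++)
            if (flush_ts->v[q] < lock_ts->v[q])
                return 0;
        return 1;
    }

    /* The home answers a fetch request only when its copy is recent enough. */
    int can_answer_fetch(const vtime *flush_ts, const vtime *lock_ts) {
        return dominates(flush_ts, lock_ts);   /* else delay the reply */
    }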

66 Vector Timestamps (cont.) [Timeline diagram: as in the page-version example, but invalidations and pages carry vector timestamps; p is not invalidated when the local timestamp dominates the one in the write notice, and a read is delayed until the home node P3 applies the diff from P1]

67 Invalidation of a modified page There are situations that require a modified page to be invalidated. Example: [Timeline diagram: A acquires L, writes page P, and releases; B, which has also written P, acquires L and receives inv(P)] A and B write to two different locations of the page P (no data race). After invalidating P at the acquire, B cannot discard its local copy of P, because it contains B's recent modifications.

68 Invalidation of a modified page (2) What happens when B fetches P from its home? (Let C be the home of P.) [Timeline diagram: A writes P under L and its diff(P) reaches the home C; B, holding its own modified copy, acquires L, invalidates P, and its r(P) fetches the page from C] How do we merge B's local copy with the fetched one? B's local copy is combined with the new one by two-way diffing.

69 Two-way diffing Let P_old be the modified invalidated copy and let T_old be its twin. Let P_fetched be the fetched copy. The new local copy P_new is calculated by applying the local modifications in P_old to the fetched copy: P_new := P_fetched + (P_old − T_old). In addition, the twin of P is replaced by P_fetched: T_new := P_fetched. Therefore, the next time the diff of P is calculated, both the old and the new modifications are detected.
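A word-granularity C sketch of the rule (the array layout is illustrative): keep the words the local process changed, take everything else from the fetched copy, then reset the twin so future diffs capture both old and new modifications.

    /* P_new := P_fetched + (P_old - T_old), then T_new := P_fetched. */
    void two_way_diff(int *p_local, int *twin, const int *p_fetched, int nwords) {
        for (int i = 0; i < nwords; i++) {
            if (p_local[i] == twin[i])       /* word not modified locally */
                p_local[i] = p_fetched[i];   /* take the fetched value    */
            /* else keep the local modification */
            twin[i] = p_fetched[i];          /* next diff sees old + new mods */
        }
    }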