Presentation is loading. Please wait.

Presentation is loading. Please wait.

Consistency and Replication Chapter 7. n Most of the lecture notes are based on slides by u Prof. Jalal Y. Kawash at Univ. of Calgary u Prof. Kenneth.

Similar presentations


Presentation on theme: "Consistency and Replication Chapter 7. n Most of the lecture notes are based on slides by u Prof. Jalal Y. Kawash at Univ. of Calgary u Prof. Kenneth."— Presentation transcript:

1 Consistency and Replication Chapter 7

2 n Most of the lecture notes are based on slides by u Prof. Jalal Y. Kawash at Univ. of Calgary u Prof. Kenneth Chiu at SUNY Binghamton u Prof. Sanjeev Setia at George Mason University n I have modified them and added new slides Giving credit where credit is due: CSCE455/855 Distributed Operating Systems

3 Consistency and Replication Chapter 7 Part I Consistency Models

4 Reasons for Replication Reliability: –Mask failures –Mask corrupted data Performance: –Scalability (size and geographical) Examples: –Web caching –Horizontal server distribution

5 Cost of Replication Replicas must be kept consistent Dilemma: 1.Replicate data for better performance 2.Modification on one copy triggers modifications on all other replicas 3.Propagating each modification to each replica can degrade performance ?

6 Consistency Issues – Access/Update Ratio time … Updates to the Web page User accesses to the page

7 Consistency Model When and how the modifications are made = consistency model: –Weak versus strong consistency model

8 Consistency Models (cont.) The general organization of a logical data store, physically distributed and replicated across multiple processes.

9 Consistency Models (cont) A process performs a read operation on a data item, expects the operation to return a value that shows the result of the last write operation on that data No global clock  difficult to define the last write operation Consistency models provide other definitions Different consistency models have different restrictions on the values that a read operation can return read1 read2

10 Summary of Consistency Models a)Consistency models not using synchronization operations. b)Models with synchronization operations. ConsistencyDescription StrictAbsolute time ordering of all shared accesses matters. Sequential All processes see all shared accesses in the same order. Accesses are not ordered in time. CausalAll processes see causally-related shared accesses in the same order. FIFO All processes see writes from each other in the order they were used. Writes from different processes may not always be seen in that order. (a) ConsistencyDescription WeakShared data can be counted on to be consistent only after a synchronization is done ReleaseShared data are made consistent when a critical region is exited EntryShared data pertaining to a critical region are made consistent when a critical region is entered. (b)

11 Framework for Consistency Partial and Total Orders Let S be a set, and R  S  S R is anti-reflexive if  x  S, (x,x)  R R is transitive if  x, y, z  S, if (x,y)  R and (y,z)  R then (x,z)  R A PO is an anti-reflexive, transitive relation A PO is denoted by (S,R) xRy means (x,y)  R A TO is a PO (S,R) such that  x, y  S x  y, either xRy or yRx

12 Framework for Consistency Operations and Data Items Operations are either writes or reads A write is denoted w p (x)v A read is denoted r p (x)v A read-write data item is the set of all sequences such that 1.Each o i is either a read or a write 2.Each read returns the same value written by the most recent preceding write in the sequence

13 Framework for Consistency Operations and Processes Each operation can be decomposed into two components: –Invocation and response w p (x)v: invocation = w p (x)v; response = empty r p (x)v: invocation = r p (x)?; response = v A process is a sequence of operation invocations A process computation is a sequence of operations obtained by augmenting each invocation in the process by its response

14 Framework for Consistency Multiprocess Systems A (multiprocess) system (P,D) is a set of processes, P, and a set of data items, D, such that all operation invocations of processes in P are applied to items in D A (multiprocess) system (P,D) computation is a collection of process computations one for each process in P

15 Framework for Consistency Example Program p: x = y Process p: r(y)v? w(x)v? Program q: y = x Process q: r(x)v? w(y)v? Process p Comp: r(y)5 w(x)5 Process q Comp: r(x)0 w(y)0 System (P,D): P = {p,q} D = {x,y} System (P,D) Computation: p: r(y)5 w(x)5 q: r(x)0 w(y)0

16 Framework for Consistency Program Order Define program order, denoted (O, < po ), by o 1 < po o 2 iff o 2 follows o 1 in p’s computation Process p: r(y)v? w(x)v? Process q: r(x)v? w(y)v? Process p Comp: r(y)5 w(x)5 Process q Comp: r(x)0 w(y)0 r p (y)5 < po w p (x)5 r q (x)0 < po w q (y)0 All of program order for the example Program p: x = y Program q: y = x

17 Framework for Consistency Consistency Models A consistency model is a set of constraints on system computations A system computation of (P,D) satisfies a consistency model CM if the computation meets all the constraints in CM

18 Relation of Consistency Models Sequential All processes see all shared accesses in the same order Strict Absolute time ordering of all shared accesses matters. For two consistency models CM 1 and CM 2 CM 1 is stronger than CM 2 if the constraints of CM 1 imply those of CM 2 –CM 2 is weaker than CM 1 –Sequential consistency is weaker than strict consistency

19 Framework for Consistency – Validity Given a set of operations O O|w indicates all the write operations in O O|r indicates all the read operations in O O|p is the subset of O containing p’s operations, for some process p O|x is the subset of O containing operations on x, for some data item x Let (O,<) be a total order of O (O,<) is valid if for each data item x, the subsequence (O|x,<) is valid for x

20 Framework for Consistency Valid Total Orders Computation: p: w(x)5 r(y)5 q: r(x)0 w(y)5 r(x)5 Valid Total Order: r q (x)0 w q (y)5 w p (x)5 r q (x)5 r p (y)5 x and y are initially 0 Valid for x: r q (x)0 w q (y)5 w p (x)5 r q (x)5 r p (y)5Valid for y: r q (x)0 w q (y)5 w p (x)5 r q (x)5 r p (y)5 Invalid Total Order: w p (x)5 r q (x)0 w q (y)5 r q (x)5 r p (y)5

21 Sequential Consistency (SC) Two constraints: –the result of any execution is the same as if the operations of all the processes were executed in some sequential order, and –the operations of each individual process appear in this sequence in the order specified by its program Let O be the set of all the operations of a computation C of a system (P,D). Then, C satisfies SC if there is a valid total order (O,<) such that (O, < po )  (O,<)

22 SC – Intuition process … All Data Items (the set D) process Switch FIFO Channels

23 Sequential Consistency – Example C satisfies SC if there is a valid total order (O,<) such that (O, < po )  (O,<) C1 p: w(x)1 r(x)2 q: r(x)1 w(x)2 C1 satisfies SC (O, (O, < po ) = { (w p (x)1, r p (x)2), (r q (x)1, w q (x)2) }

24 Sequential Consistency – Examples C2 p: w(x)1 r(x)2 q: w(x)2 r(x)1 C2 does not satisfy SC (O, < po ) = { (w p (x)1, r p (x)2), (w q (x)2, r q (x)1) } (is not valid) (violates PO) C3 p: w(x)1 w(y)2 q: r(y)2 r(x)0 Exercise: Does C3 satisfy SC? (x and y are initially 0)

25 Coherence [Goodman] SC per data item Let O be the set of all the operations of a computation C of a system (P,D). Then, C satisfies Coherence if for each x  D there is a valid total order (O|x,< x ) such that (O|x,< po )  (O|x,< x )

26 Coherence – Intuition process … One Data Item process One Data Item One Data Item … FIFO Channels

27 Coherence – Examples C1 p: w(x)1 r(x)2 q: r(x)1 w(x)2 C1 satisfies Coherence (O|x, C2 p: w(x)1 r(x)2 q: w(x)2 r(x)1 C2 does not satisfy Coherence C3 p: w(x)1 w(y)2 q: r(y)2 r(x)0 C3 satisfies Coherence but not SC C4 p: w(x)3 w(x)2 r(y)3 q: w(y)3 w(y)1 r(x)3 Does C4 satisfy Coherence? SC?

28 SC versus Coherence If Computation C satisfies SC, then it satisfies Coherence + PO If a Computation C satisfies Coherence, then it does not necessarily satisfy SC –Proof: Computation C3 is an example All Computations satisfying consistency model CM = C(CM) C(Coherence) C(SC) C3

29 FIFO [Lipton & Sandberg] Writes done by a single process are seen by all other processes in the order in which they were issued, but writes from different processes may be seen in a different order by different processes When would we like to use FIFO consistency model

30 Review: SC – Intuition process … All Data Items (the set D) process Switch FIFO Channels

31 Review: Coherence – Intuition process … One Data Item process One Data Item One Data Item … FIFO Channels

32 FIFO – Intuition process All Data Items (D) process All Data Items (D) process All Data Items (D) process All Data Items (D) FIFO Channels

33 FIFO [Lipton & Sandberg] Writes done by a single process are seen by all other processes in the order in which they were issued, but writes from different processes may be seen in a different order by different processes Let O be the set of all the operations of a computation C of a system (P,D). Then, C satisfies FIFO if for each p  P there is a valid total order (O|p  O|w,< p ) such that (O|p  O|w,< po )  (O|p  O|w,< p )

34 FIFO – Examples C1 p: w(x)1 r(x)2 q: r(x)1 w(x)2 C1 satisfies FIFO (also SC and Coherence) (O|p  O|w, (O|q  O|w, C2 p: w(x)1 r(x)2 q: w(x)2 r(x)1 C2 satisfies FIFO but not Coherence C3 p: w(x)1 w(y)2 q: r(y)2 r(x)0 C3 satisfies Coherence but not SC nor FIFO

35 FIFO – Examples (cont) C5 p: w(x)3 w(x)1 w(y)2 q: r(y)2 r(x)3 Does C4 satisfy FIFO? Coherence? SC? Does C5 satisfy FIFO? Coherence? SC? C4 p: w(x)3 w(x)2 r(y)3 q: w(y)3 w(y)1 r(x)3

36 SC versus FIFO If Computation C satisfies SC, does it satisfy FIFO? If Computation C satisfies FIFO, does it satisfy SC?

37 SC versus FIFO If Computation C satisfies SC, then it satisfies FIFO If a Computation C satisfies FIFO, then it does not necessarily satisfy SC –Proof: Computation C4 is an example C(FIFO) C(SC) C4 p: w(x)3 w(x)2 r(y)3 q: w(y)3 w(y)1 r(x)3

38 Coherence versus FIFO If Computation C satisfies Coherence, does it satisfy FIFO?

39 Coherence versus FIFO If Computation C satisfies Coherence, then it does not necessarily satisfy FIFO –Proof: Computation C5 is an example C5 p: w(x)3 w(x)1 w(y)2 q: r(y)2 r(x)3

40 Coherence versus FIFO If a Computation C satisfies FIFO, does it satisfy Coherence?

41 Coherence versus FIFO If a Computation C satisfies FIFO, then it does not necessarily satisfy Coherence –Proof: Computation C2 is an example C2 p: w(x)1 r(x)2 q: w(x)2 r(x)1

42 Coherence versus FIFO There are computations that satisfy both Coherence and FIFO, but not SC –Proof: Computation C4 C(Coherence) C(FIFO) C(SC) C: satisfies FIFO and Coherence, but not SC C4 p: w(x)3 w(x)2 r(y)3 q: w(y)3 w(y)1 r(x)3

43 Weak Consistency Consider Critical Section –If a process is in a critical section, its intermediate results of operations are not necessarily propagated to others. Idea –Enforce consistency on a Group of Operations –Limit the time when consistency holds –Let programmer explicitly specify this

44 Synchronization Operations In addition to reads and writes, introduce synch p () operation, which –synchronizes all local copies of the data store Propagate local updates Bring in other ’ s updates

45 Weak Consistency (cont.) Three conditions 1.No operation on a synchronization variable is allowed to be performed until all previous writes have completed everywhere. 2.No read or write operation on data items are allowed to be performed until all previous operations to synchronization variables have been performed. 3.Accesses to synchronization variables associated with a data store, are sequentially consistent.

46 Weak Consistency (cont.) Weak Consistency Not Weak Consistency P1 P2 P3 P1 P2 P3 W(x, a) W(x, b) W(y, c) S2 S1 S3 b  R(x) c  R(y) b  R(x) W(x, a) W(x, b) W(y, c) S2 S1 S3 a  R(x) c  R(y) b  R(x) c  R(y) Or a  R(x) Nil  R(y) b  R(x) No operation on a synchronization variable is allowed to be performed until all previous writes have completed everywhere. No read or write operation on data items are allowed to be performed until all previous operations to synchronization variables have been performed.

47 WC – Example C6 p: w(x)3 s() q: r(x)0 s() w(y)1 s’() r(x)3 m: w(x)5 r(y)1 s() r(x)3 All of p, q, and m must agree on a total order of synch operations consistent with program order; for example: (O|p  O|w  O|s, < p ) = (O|q  O|w  O|s, < q ) = (O|m  O|w  O|s, < m ) = Accesses to synchronization variables associated with a data store, are sequentially consistent.

48 Summary of Consistency Models a)Consistency models not using synchronization operations. b)Models with synchronization operations. ConsistencyDescription StrictAbsolute time ordering of all shared accesses matters. Linearizability All processes must see all shared accesses in the same order. Accesses are furthermore ordered according to a (nonunique) global timestamp Sequential All processes see all shared accesses in the same order. Accesses are not ordered in time CausalAll processes see causally-related shared accesses in the same order. FIFO All processes see writes from each other in the order they were used. Writes from different processes may not always be seen in that order (a) ConsistencyDescription WeakShared data can be counted on to be consistent only after a synchronization is done ReleaseShared data are made consistent when a critical region is exited EntryShared data pertaining to a critical region are made consistent when a critical region is entered. (b)

49 Weaker Models Sometimes strong models are needed, if the result of race conditions are very bad. –Banks Sometimes the result of races are just inefficiency, or inconvenience, etc. How strong is Orbitz’s model? –If it shows that a flight ticket with a certain price is available, is it really? One kind of weaker model is eventual consistency –It eventually becomes consistent

50 Lazy Consistency Models When updates are scarce When updates are not conflicting –Examples: DNS and WWW Eventual Consistency (EC): Lazy propagation of updates to all replicas –If no updates take place for a long time, all replicas will become consistent –Cheap to implement –If a client always accesses the same replica, the same or newer data will be read as time passes. EC works.

51 Eventual Consistency How well does EC work for mobile clients? Client-centric is for this. Consistent for a single client.

52 Session Guarantees When client moves around and connects to different replicas, strange things can happen –Update you just made was missing –Database goes back in time Design choice: –Insist on stricter consistency –Use a client-centric consistency model to enforce some “session” guarantees

53 Read Your Writes Every read in a session should see all previous writes in that session Example error: deleted email messages reappear

54 Monotonic Reads Disallow reads to a DB less current than previous read Example error: –Get list of email messages (previous read) –When attempting to read one, get “message doesn’t exist” error (read the past)

55 Monotonic writes Writes must follow any previous writes that occurred within their session Example error: –Update to library made –Update to application using library made –Don’t want application depending on new library to show up where new library is unavailable

56 Writes Follow Reads After a read R at a server X, a write W occurs in the session. If W is in Y’s database then any writes relevant to R are also there. Ensures: a write operation following a previous read takes place on the same or a more recent value of x that was read Example error: newsgroup displays responses to articles before original article has propagated there

57 Supporting Session Guarantees Responsibility of “session manager”, not servers Two sets: –Read set: set of writes that are relevant to session reads –Write set: set of writes performed in session Update dependencies captured in read sets and write sets

58 Consistency and Replication Chapter 7 Part II Replica Management & Consistency Protocols

59 System Model Assume –replica manager applies operations to its replica –set of replica managers may be fixed or could change dynamically –requests are reads or writes (updates)

60 A basic architectural model for the management of replicated data

61 System Model Five phases in performing a request –Front end issues the request Either sends to a single (primary) replica or multicasts to all replica managers –Coordination RMs coordinate in preparation for the execution of the request, i.e., agree if request is to be performed and the ordering of the request relative to others –FIFO ordering, Causal ordering, Total (sequential) ordering

62 System Model Five phases in performing a request –Execution May be tentative –Agreement Reach consensus on effect of the request, e.g., agree to commit or abort in a transactional system –Response

63 The Passive (Primary-Backup) Model

64 Passive (Primary-Backup) Replication Request: –(FE issued a read request to a randomly-picked RM) –FE issues an update to the primary RM Coordination: primary RM takes updates in FIFO Execution: primary RM executes the update and stores the response Agreement: primary sends the update to all backups. The backups send acknowledgements Response: primary responds to the FE, which hands the response back to the client

65 Passive (Primary-Backup) Replication Implements SC, since it sequences all updates (all reads can then be inserted to form the total order) (Btw, this approach also implements read your writes, if updates are blocking)

66 Passive (Primary-Backup) Replication If primary fails, –It should be replaced with a unique backup –RMs that survive must agree upon which operations had been performed when the replacement primary takes over (will discuss this in next chapter on fault tolerance)

67 Active Replication Using Multicast Active replication –FE multicasts request to each replica

68 Implementing Ordered Multicast Incoming messages are held back in a queue until delivery guarantees can be met

69 Implementing Ordered Multicast Incoming messages are held back in a queue until delivery guarantees can be met Coordination between machines needed to determine delivery order –FIFO-ordering: easy, use a separate sequence number for each process –Total-ordering: use Lamport’s timestamps –Causal-ordering: use vector timestamps

70 Network Partitions Primary-based Replicated-write Neither works if network is partitioned since in both approaches coordination involves all RMs

71 Network Partitions Idea? –Use Majority Read –Retrieve number of replicas in read quorum –Select the one with the latest version –Perform a read on it Write –Retrieve number of replicas in write quorum –Find the latest version and increment it –Perform a write on the entire write quorum Well-known Solution: Quorum-based Protocols

72 Quorum-based Protocols N: Total #Replicas N R : #Replicas in Read Quorum N W : #Replicas in Write Quorum Constraints: 1.N R + N W > N 2.N W > N/2

73 Quorum-Based Protocols Three examples of the voting algorithm for N = 12 replicas (a) A correct choice of read and write set (b) A choice that may lead to write-write conflicts (c) A correct choice, known as ROWA (read one, write all)

74 Possible Policies ROWAL R=1, W=N –Fast reads, slow writes Majority: R=W=N/2+1 –Both moderately slow, but extremely high availability

75 Distribution Protocols How are updates propagated to replicas (independent of the consistency model)? –State versus operations Propagate only notification of update, i.e., invalidation Transfer data from one copy to another Propagate the update operation to other copies –Push versus pull protocols

76 Pull versus Push Protocols A comparison between push-based and pull-based protocols in the case of multiple backup replica, single primary replica. Leases: a hybrid form of update propagation that dynamically switches between pushing and pulling Primary maintains state for a backup for a TTL, i.e., while lease has not expired IssuePush-basedPull-based State of primaryList of backups and cachesNone Messages sentUpdate (and possibly fetch update later)Poll and update Response time at backup Immediate (or fetch-update time)Fetch-update time IssuePush-basedPull-based State of primary Messages sent Response time at backup

77 77 Leases Combined push and pull A primary replica promises –push updates for a certain time –a lease expires => the backup polls the primary or requests a new lease –length of a lease? –different types of leases age based: {time to last modification} renewal-frequency based: long-lasting leases to active users state-space overhead: increasing utilization of a primary => lower expiration times of new leases


Download ppt "Consistency and Replication Chapter 7. n Most of the lecture notes are based on slides by u Prof. Jalal Y. Kawash at Univ. of Calgary u Prof. Kenneth."

Similar presentations


Ads by Google