Ordering of events in Distributed Systems & Eventual Consistency Jinyang Li.


1 Ordering of events in Distributed Systems & Eventual Consistency Jinyang Li

2 What is consistency?
Consistency model:
–A constraint on the system state observable by application operations
Examples:
–X86 memory: write x=5, then (later in time) read x (should be 5)
–Database: x:=x+1; y:=y-1, then (later in time) assert(x+y==const)

3 Consistency
No right or wrong consistency models
–Tradeoff between ease of programmability and efficiency
Consistency is hard in (distributed) systems:
–Data replication (caching)
–Concurrency
–Failures

4 Consistency challenges: example
Each node has a local copy of state
Read from local state
Send writes to the other node, but do not wait

5 Consistency challenges: example
CPU0: x=1; if y==0, enter critical section (sends W(x)1 to CPU1)
CPU1: y=1; if x==0, enter critical section (sends W(y)1 to CPU0)

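6 Does this work?
CPU0: x=1; if y==0, enter critical section. Issues W(x)1, then R(y)0
CPU1: y=1; if x==0, enter critical section. Issues W(y)1, then R(x)0
Both reads return 0 because neither remote write has arrived yet, so both CPUs enter the critical section.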

7 What went wrong?
CPU0 sees: W(x)1 R(y)0 W(y)1
CPU1 sees: W(y)1 R(x)0 W(x)1
Different CPUs see different event orders!
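The failure on slides 4 to 7 can be reproduced with a small simulation. This is only a sketch under assumptions: the `Node` class, its `inbox` queue and the `run` function are hypothetical names, and "do not wait" is modelled by queueing each remote write until `deliver` is called.

```python
from collections import deque


class Node:
    """A CPU with a local copy of the state (hypothetical helper)."""

    def __init__(self, name):
        self.name = name
        self.state = {"x": 0, "y": 0}
        self.inbox = deque()  # writes sent by the peer, not yet applied
        self.seen = []        # order of events as observed by this node

    def write(self, peer, key, value):
        # Apply locally, then send to the peer without waiting.
        self.state[key] = value
        self.seen.append(f"W({key}){value}")
        peer.inbox.append((key, value))

    def read(self, key):
        # Reads are served from the local copy only.
        value = self.state[key]
        self.seen.append(f"R({key}){value}")
        return value

    def deliver(self):
        # Apply the remote writes that were in flight.
        while self.inbox:
            key, value = self.inbox.popleft()
            self.state[key] = value
            self.seen.append(f"W({key}){value}")


def run():
    cpu0, cpu1 = Node("CPU0"), Node("CPU1")
    cpu0.write(cpu1, "x", 1)
    cpu1.write(cpu0, "y", 1)
    in_cs0 = cpu0.read("y") == 0  # CPU0: if y==0, critical section
    in_cs1 = cpu1.read("x") == 0  # CPU1: if x==0, critical section
    cpu0.deliver()
    cpu1.deliver()
    return cpu0.seen, cpu1.seen, in_cs0, in_cs1
```

In this schedule both `in_cs0` and `in_cs1` come out true, and the two `seen` lists differ in exactly the way slide 7 shows.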

8 Strict consistency
Each operation is stamped with a global wall-clock time
Rules:
1. Each read gets the latest write value
2. All operations at one CPU have timestamps in execution order

9 No two CPUs in the critical section
Strict consistency gives “intuitive” results
Proof: suppose mutual exclusion is violated
CPU0: W(x)1 R(y)0
CPU1: W(y)1 R(x)0
Rule 1: read gets latest write, so W(y)1 must have a timestamp later than R(y)0
Rule 2 then gives the order W(x)1 < R(y)0 < W(y)1 < R(x)0
Contradicts rule 1: R(x) must see W(x)1

10 Sequential consistency
Strict consistency is not practical
–No global wall-clock available
Sequential consistency is the closest
Rules: There is a total order of ops s.t.
–All CPUs see results according to the total order (i.e. reads see most recent writes)
–Each CPU’s ops appear in its execution order

11 Lamport clock gives a total order
Each CPU keeps a logical clock
Each CPU updates its logical clock between successive events
A sender includes its clock value in the message
A receiver advances its clock to be greater than the message’s clock value
Lamport clocks define a total order
–Ties are broken based on CPU ids
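The rules above can be sketched in a few lines of Python. The class and method names (`LamportClock`, `send`, `receive`, `replay_example`) are assumptions made for illustration; timestamps are `(clock, cpu_id)` tuples so that Python's tuple comparison breaks ties by CPU id.

```python
class LamportClock:
    """Logical clock for one CPU (illustrative sketch)."""

    def __init__(self, cpu_id):
        self.cpu_id = cpu_id
        self.time = 0

    def tick(self):
        # Advance the clock between successive local events.
        self.time += 1
        return (self.time, self.cpu_id)

    def send(self):
        # A send is an event; its timestamp travels with the message.
        return self.tick()

    def receive(self, msg_time):
        # Advance past the clock value carried by the message.
        self.time = max(self.time, msg_time)
        return self.tick()


def replay_example():
    """Two CPUs exchange W(x)1, W(y)1 and then acks."""
    cpu0, cpu1 = LamportClock(0), LamportClock(1)
    events = []
    wx = cpu0.send()
    events.append((wx, "S W(x)1"))
    wy = cpu1.send()
    events.append((wy, "S W(y)1"))
    events.append((cpu0.receive(wy[0]), "R W(y)1"))
    events.append((cpu1.receive(wx[0]), "R W(x)1"))
    ack0 = cpu0.send()
    events.append((ack0, "S ack"))
    ack1 = cpu1.send()
    events.append((ack1, "S ack"))
    events.append((cpu0.receive(ack1[0]), "R ack"))
    events.append((cpu1.receive(ack0[0]), "R ack"))
    # Sorting by (clock, cpu_id) yields the total order.
    return sorted(events)
```

Sorting the events places the send of W(x)1 at timestamp 1,0 ahead of the send of W(y)1 at 1,1, which is the tie-break by CPU id in action.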

12 Fix the example
CPU0 should see order W(x)1 W(y)1
CPU1 should see order W(x)1 W(y)1
Diagram: W(x)1, W(y)1 and acks are exchanged; the reads return R(y)0 and R(x)1

13 Lamport clock: an example
Events, stamped as clock,CPU id (S = send, R = receive):
CPU0: 1,0 S: W(x)1; 2,0 R: W(y)1; 3,0 S: ack; 4,0 R: ack
CPU1: 1,1 S: W(y)1; 2,1 R: W(x)1; 3,1 S: ack; 4,1 R: ack
Sorted by timestamp:
1,0 S W(x)1
1,1 S W(y)1
2,0 R W(y)1
2,1 R W(x)1
3,0 S ack
3,1 S ack
4,0 R ack
4,1 R ack
Defines one possible total order: W(x)1 < W(y)1

14 Lamport clock: an example
Events, stamped as clock,CPU id (S = send, R = receive):
CPU0: 1,0 S: W(x)1; 2,0 R: W(y)1; 3,0 S: ack; 4,0 R: ack
CPU1: 1,1 S: W(y)1; 2,1 R: W(x)1; 3,1 S: ack; 4,1 R: ack
Partial views of the total order while messages are in flight (????? marks a gap in a view):
1,0 S W(x)1 ?????
1,0 S W(x)1; 1,1 S W(y)1; 2,0 R W(y)1; 3,0 S ack
1,0 S W(x)1; 1,1 S W(y)1 ??????
1,0 S W(x)1; 1,1 S W(y)1; 2,1 R W(x)1; 3,1 S ack

15 Beyond Lamport clock
Typical system obtains a total order differently
–Use a single node to order all reads/writes
E.g. the lock_server in Lab1
–Partition state over multiple nodes, each node orders reads/writes for its partition
Invariant: exactly one node is in charge of ordering
–The ordering node must be online
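A minimal sketch of the single-ordering-node approach, assuming every read and write from both CPUs is routed through one node. `OrderingNode`, `run` and `all_schedules` are hypothetical names, and this is not the Lab1 lock_server. The code enumerates every interleaving of the two CPUs' programs that preserves each CPU's own order and records who enters the critical section.

```python
from itertools import combinations


class OrderingNode:
    """Single node that applies all reads/writes in arrival order."""

    def __init__(self):
        self.state = {}
        self.log = []  # the total order chosen by this node

    def write(self, key, value):
        self.state[key] = value
        self.log.append(("W", key, value))

    def read(self, key):
        value = self.state.get(key, 0)
        self.log.append(("R", key, value))
        return value


# Program of each CPU from the earlier example:
# CPU0: x=1; if y==0 critical section   CPU1: y=1; if x==0 critical section
PROGRAMS = {0: [("W", "x"), ("R", "y")], 1: [("W", "y"), ("R", "x")]}


def run(schedule):
    """Run one interleaving; schedule lists which CPU issues the next op."""
    node = OrderingNode()
    pc = {0: 0, 1: 0}
    entered = {0: False, 1: False}
    for cpu in schedule:
        kind, key = PROGRAMS[cpu][pc[cpu]]
        pc[cpu] += 1
        if kind == "W":
            node.write(key, 1)
        else:
            entered[cpu] = node.read(key) == 0
    return entered


def all_schedules():
    # Choose which 2 of the 4 slots belong to CPU0; CPU1 gets the rest.
    for slots in combinations(range(4), 2):
        yield [0 if i in slots else 1 for i in range(4)]
```

Across all six interleavings at most one CPU enters the critical section, because whichever read comes second in the node's order always sees the other CPU's write.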

16 Weakly consistent systems
Sequential consistency
–All read/writes are applied in total order
–Reads must see most recent writes
Eventual consistency (Bayou)
–Writes are eventually applied in total order
–Reads might not see most recent writes in total order

17 Why (not) eventual consistency?
Support disconnected operations
–Better to read a stale value than nothing
–Better to save writes somewhere than nothing
Potentially anomalous application behavior
–Stale reads and conflicting writes…

18 Bayou Version Vector Write log 0:0 1:0 2:0 0:0 1:0 2:0 0:0 1:0 2:0 N0 N1 N2

19 Bayou propagation Version Vector Write log 0:3 1:0 2:0 N0 N1 N2 1:0 W(x) 2:0 W(y) 3:0 W(z) 0:0 1:1 2:0 0:0 1:0 2:0 1:1 W(x) 1:0 W(x) 2:0 W(y) 3:0 W(z) 0:3 1:0 2:0

20 Bayou propagation Version Vector Write log 0:3 1:0 2:0 N0 N1 N2 1:0 W(x) 2:0 W(y) 3:0 W(z) 0:3 1:4 2:0 0:0 1:0 2:0 1:0 W(x) 1:1 W(x) 2:0 W(y) 3:0 W(z) 1:1 W(x) 0:3 1:4 2:0

21 Bayou propagation Version Vector Write log N0 N1 N2 0:3 1:4 2:0 0:0 1:0 2:0 1:0 W(x) 1:1 W(x) 2:0 W(y) 3:0 W(z) 0:4 1:4 2:0 1:0 W(x) 1:1 W(x) 2:0 W(y) 3:0 W(z)
Which portion of the log is stable?

22 Bayou propagation Version Vector Write log N0 N1 N2 0:3 1:4 2:0 1:0 W(x) 1:1 W(x) 2:0 W(y) 3:0 W(z) 0:4 1:4 2:0 1:0 W(x) 1:1 W(x) 2:0 W(y) 3:0 W(z) 0:3 1:4 2:5 1:0 W(x) 1:1 W(x) 2:0 W(y) 3:0 W(z)

23 Bayou propagation Version Vector Write log N0 N1 N2 0:3 1:6 2:5 1:0 W(x) 1:1 W(x) 2:0 W(y) 3:0 W(z) 0:4 1:4 2:0 1:0 W(x) 1:1 W(x) 2:0 W(y) 3:0 W(z) 0:4 1:4 2:5 1:0 W(x) 1:1 W(x) 2:0 W(y) 3:0 W(z) 0:3 1:4 2:5
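The propagation steps in the diagrams can be approximated with the sketch below. It is a simplification under stated assumptions, not Bayou's actual protocol: `Replica` and its methods are hypothetical names, a write is a `(timestamp, node_id, op)` tuple, and each version vector entry records only the highest write timestamp seen from that node. The diagrams also fold clock advances into the version vector, which this sketch does not model, so its vectors will not match the slide values exactly.

```python
class Replica:
    """One node with a version vector and a timestamp-ordered write log."""

    def __init__(self, node_id, num_nodes):
        self.node_id = node_id
        self.clock = 0
        self.vv = [0] * num_nodes  # highest write timestamp seen per node
        self.log = []              # (timestamp, node_id, op), kept sorted

    def _add(self, write):
        ts, origin, _op = write
        self.log.append(write)
        self.log.sort()  # total order: timestamp first, then node id
        self.vv[origin] = max(self.vv[origin], ts)
        self.clock = max(self.clock, ts)

    def local_write(self, op):
        self._add((self.clock + 1, self.node_id, op))

    def missing_for(self, peer_vv):
        # Writes the peer has not seen, judged by the peer's version vector.
        return [w for w in self.log if w[0] > peer_vv[w[1]]]

    def sync_from(self, sender):
        for write in sender.missing_for(self.vv):
            self._add(write)

    def stable_prefix(self):
        # Conservative rule: no unseen write can carry a timestamp at or
        # below the smallest entry of the version vector.
        bound = min(self.vv)
        return [w for w in self.log if w[0] <= bound]


def demo():
    n0, n1, n2 = (Replica(i, 3) for i in range(3))
    for op in ("W(x)", "W(y)", "W(z)"):
        n0.local_write(op)
    n1.local_write("W(x)")
    n1.sync_from(n0)
    n0.sync_from(n1)
    n2.sync_from(n1)
    return n0, n1, n2
```

After `demo()` all three logs hold the same four writes in the same order, yet `stable_prefix()` is still empty at N0 because nothing has been heard from N2. Stability under this rule depends on hearing from every node.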

24 Bayou uses a primary to commit a total order
Why is it important to make log stable?
–Stable writes can be committed
–Stable portion of the log can be truncated
Problem: If any node is offline, the stable portion of all logs stops growing
Bayou’s solution:
–A designated primary defines a total commit order
–Primary assigns CSNs (commit-seq-no)
–Any write with a known CSN is stable
–All stable writes are ordered before tentative writes
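The commit rule can be sketched as a sort key. The names `Primary`, `ordered_log` and `TENTATIVE` are assumptions for illustration; a write is identified by `(timestamp, node_id)`, and a write without a known CSN is treated as having CSN infinity, so it sorts after every committed write.

```python
import math

TENTATIVE = math.inf  # CSN of a write that has not been committed yet


class Primary:
    """Designated node that hands out commit sequence numbers (CSNs)."""

    def __init__(self):
        self.next_csn = 1
        self.committed = {}  # (timestamp, node_id) -> CSN

    def commit(self, write_id):
        # Each write gets exactly one CSN, in the order the primary sees it.
        if write_id not in self.committed:
            self.committed[write_id] = self.next_csn
            self.next_csn += 1
        return self.committed[write_id]


def ordered_log(writes, known_csns):
    """Sort (timestamp, node_id, op) writes by (CSN, timestamp, node_id)."""

    def key(write):
        ts, node, _op = write
        return (known_csns.get((ts, node), TENTATIVE), ts, node)

    return sorted(writes, key=key)


def demo():
    primary = Primary()
    # Assumption: N0 acts as primary, so its own writes commit first.
    for write_id in [(1, 0), (2, 0), (3, 0)]:
        primary.commit(write_id)
    writes = [(1, 1, "W(x)"), (1, 0, "W(x)"), (2, 0, "W(y)"), (3, 0, "W(z)")]
    tentative_view = ordered_log(writes, dict(primary.committed))
    primary.commit((1, 1))  # N1's write reaches the primary later
    committed_view = ordered_log(writes, dict(primary.committed))
    return primary, tentative_view, committed_view
```

In this run N1's write receives CSN 4 and stays after W(z) in the committed order, although plain timestamp order would place it second. Commit order follows arrival at the primary, not the Lamport timestamps.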

25 Bayou propagation Version Vector Write log 0:3 1:0 2:0 N0 N1 N2 1:1:0 W(x) 2:2:0 W(y) 3:3:0 W(z) 0:0 1:1 2:0 0:0 1:0 2:0 ∞:1:1 W(x) 0:0 1:1 2:0

26 Bayou propagation Version Vector Write log 0:4 1:1 2:0 N0 N1 N2 1:1:0 W(x) 2:2:0 W(y) 3:3:0 W(z) 0:0 1:1 2:0 0:0 1:0 2:0 ∞:1:1 W(x) 4:1:1 W(x) 1:1:0 W(x) 2:2:0 W(y) 3:3:0 W(z) 4:1:1 W(x) 0:4 1:1 2:0

27 Bayou’s limitations
Primary cannot fail
Server creation & retirement makes nodeID grow arbitrarily long
Anomalous behaviors for apps?
–Calendar app

