
1 Eventual Consistency Jinyang

2 Sequential consistency
Sequential consistency properties:
–Latest read must see latest write (handles caching)
–All writes are applied in a single order (handles concurrent writes)
Realizing sequential consistency:
–Reads/writes from a single node execute one at a time
–All reads/writes to address X must be ordered by one memory/storage module responsible for X

3 Realizing sequential consistency
[Figure: clients, each with a cache or replica, issue W(A)1, W(A)2 and W(B)3; remaining labels: Invalidate, R(B)]

4 Disadvantages of sequential consistency
Requires highly available connections
–Lots of chatter between clients/servers
Not suitable for certain scenarios:
–Disconnected clients (e.g. your laptop)
–Apps might prefer potential inconsistency to loss of availability

5 Why (not) eventual consistency?
Support disconnected operations
–Better to read a stale value than nothing
–Better to save writes somewhere than nothing
Potentially anomalous application behavior
–Stale reads and conflicting writes…

6 Operating w/o total connectivity
Client writes to its local replica
Sync w/ server resolves non-conflicting changes, reports conflicting ones to user
No sync between clients
[Figure: clients with local replicas issuing W(A)1 and W(A)2]

7 Pair-wise synchronization
Pair-wise sync resolves non-conflicting changes, reports conflicting ones to users
[Figure: replicas issuing W(A)1, W(A)2 and W(B)3]

8 Example usages?
File synchronizers
–One user, many gadgets

9 File synchronizer
Goals:
1. All replica contents eventually become identical
2. No lost updates
–Do not replace a new version with an old one

10 Prevent lost updates
Detect if updates were sequential
–If so, replace old version with new one
–If not, detect conflict
“Optimistic” vs. “Pessimistic”
–Eventual consistency: let updates happen, worry about whether they can be serialized later
–Sequential consistency: updates cannot take effect unless they are serialized first

11 How to prevent lost updates?
Strawman: use mtime to decide which version should replace the other
Problem w/ wallclock: cannot detect disagreement on ordering
[Figure: H1 performs W(f)a (mtime: 15648) and W(f)b (16679); H2 performs W(f)c (23657); copies of f are labeled 12354 and 15648]
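The strawman's failure can be sketched in a few lines. This is only an illustration under one assumed reading of the figure: H2 picks up version a from H1, then H1 writes b and H2 writes c independently. The mtimes come from the slide; the `Version` type and `newer_by_mtime` helper are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Version:
    content: str
    mtime: int  # wall-clock modification time, as on the slide


def newer_by_mtime(x: Version, y: Version) -> Version:
    """Strawman rule: whichever version has the larger mtime wins."""
    return x if x.mtime >= y.mtime else y


# Assumed scenario: both b and c were written on top of a.
a = Version("a", 15648)  # written on H1, later copied to H2
b = Version("b", 16679)  # H1's next write
c = Version("c", 23657)  # H2's independent write

# The rule happily declares c the winner, so H1's update b is lost,
# and nothing signals that b and c were actually in conflict.
winner = newer_by_mtime(b, c)
```

The rule always returns some winner, so it has no way to say "these two versions disagree".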

12 Strawman fix
Carry the entire modification history
If history X is a prefix of Y, Y is newer
[Figure: H1 performs W(f)a and W(f)b, H2 performs W(f)c; history labels shown: H1:15648, H1:16679, H1:15648, H2:23657]
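A minimal sketch of the prefix rule, assuming a history is a list of (host, mtime) pairs and that the figure's histories are a = [H1:15648], b = [H1:15648, H1:16679], c = [H1:15648, H2:23657]. The function names are hypothetical.

```python
from typing import List, Tuple

Event = Tuple[str, int]  # (host, mtime) of one modification


def is_prefix(x: List[Event], y: List[Event]) -> bool:
    """True if history x is a (possibly equal) prefix of history y."""
    return len(x) <= len(y) and y[: len(x)] == x


def compare_histories(x: List[Event], y: List[Event]) -> str:
    if x == y:
        return "equal"
    if is_prefix(x, y):
        return "y_newer"  # y extends x, so y may replace x
    if is_prefix(y, x):
        return "x_newer"
    return "conflict"  # neither history extends the other


hist_a = [("H1", 15648)]
hist_b = [("H1", 15648), ("H1", 16679)]
hist_c = [("H1", 15648), ("H2", 23657)]
```

Unlike the mtime rule, comparing `hist_b` with `hist_c` yields a conflict instead of silently picking one side.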

13 Compress version history
H1:2 implies H1:1, so we only need one number per host
[Figure: H1 performs W(f)a and W(f)b, H2 performs W(f)c; full histories built from H1:1, H1:2 and H2:1 are shown next to their compressed forms]

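14 Compare vector timestamp
[Figure: H1:1 H2:3 H3:2 < H1:1 H2:5 H3:7; H1:1 H2:3 H3:2 is also compared against H1:2 H2:1 H3:7, a pair that is not ordered element-wise because H2:3 > H2:1 while H1:1 < H1:2]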
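The comparison can be sketched as follows, assuming a vector timestamp is a dict from host name to counter and that a missing host counts as 0. The names `leq` and `compare` are illustrative.

```python
from typing import Dict

VT = Dict[str, int]  # host -> number of writes seen from that host


def leq(x: VT, y: VT) -> bool:
    """x <= y when every entry of x is <= the matching entry of y."""
    return all(x.get(h, 0) <= y.get(h, 0) for h in set(x) | set(y))


def compare(x: VT, y: VT) -> str:
    x_le_y, y_le_x = leq(x, y), leq(y, x)
    if x_le_y and y_le_x:
        return "equal"
    if x_le_y:
        return "before"      # y is newer and may replace x
    if y_le_x:
        return "after"       # x is newer and may replace y
    return "concurrent"      # neither dominates: a conflict


v1 = {"H1": 1, "H2": 3, "H3": 2}
v2 = {"H1": 1, "H2": 5, "H3": 7}
v3 = {"H1": 2, "H2": 1, "H3": 7}
```

With the slide's values, `v1` precedes `v2`, while `v1` and `v3` are concurrent.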

15 Using vector timestamp
[Figure: H1 performs W(f)a and W(f)b, H2 performs W(f)c; timestamps shown: H1:1, H1:2, H1:1 H2:1, H1:2 H2:1]

16 Using vector timestamp
[Figure: H1 performs W(f)a and W(f)b, H2 performs W(f)c; timestamps shown: H1:1, H1:2, H1:1 H2:1, H1:1 H2:1]

17 How to deal w/ conflicts?
Easy: mailboxes w/ two different sets of messages
Medium: changes to different lines of a C source file
Hard: changes to same line of a C source file
After conflict resolution, what should the vector timestamp be?

18 What about file deletion?
Can we forget about the vector timestamp for deleted files?
Simple solution: treat deletion as a write
–Conflicts involving a deleted file are easy to handle
Downside:
–Need to remember vector timestamp for deleted files indefinitely

19 Tra [Cox, Josephson]
What are Tra’s novel properties?
–Easy to compress storage of vector timestamps
–No need to check every file’s version vector during sync
–Allows partial sync of subtrees
–No need to keep timestamp for deleted files forever

20 Tra’s key technique
Two vector timestamps:
1. One represents modification time
–Tracks what a host has
2. One represents synchronization time
–Tracks what a host knows
Sync time implies no modification has happened since mod time
[Figure: example pair of vectors, H1:1 H2:5 H3:7 and H1:10 H2:20 H3:25]
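One way the two timestamps could drive a sync decision is sketched below. It assumes both times are vectors stored as dicts with missing hosts counting as 0, and that the destination compares the source's mod time against its own sync time. `FileState` and `sync_from` are hypothetical names, not Tra's actual interface.

```python
from dataclasses import dataclass
from typing import Dict

VT = Dict[str, int]


def leq(x: VT, y: VT) -> bool:
    return all(x.get(h, 0) <= y.get(h, 0) for h in set(x) | set(y))


def vmax(x: VT, y: VT) -> VT:
    return {h: max(x.get(h, 0), y.get(h, 0)) for h in set(x) | set(y)}


@dataclass
class FileState:
    content: str
    mtime: VT     # what this host has
    synctime: VT  # what this host knows


def sync_from(src: FileState, dst: FileState) -> str:
    """Bring dst up to date with src, or report a conflict."""
    if leq(src.mtime, dst.synctime):
        outcome = "up_to_date"  # dst already knows about src's version
    elif leq(dst.mtime, src.synctime):
        dst.content = src.content  # src's version was derived from dst's
        dst.mtime = dict(src.mtime)
        outcome = "copied"
    else:
        return "conflict"  # independent modifications on both sides
    # dst now knows everything src knew.
    dst.synctime = vmax(dst.synctime, src.synctime)
    return outcome
```

For example, a destination with sync time H1:0 H2:0 and no modifications pulls a file whose mod time is H1:1; a second sync of the same file is then a no-op because the destination's sync time already covers H1:1.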

21 Using sync time
[Figure: H1 performs W(f1)a and W(f2)b; H2 holds f1 and f2 labeled H1:0 H2:0; other timestamp labels shown: H1:1 H2:0, H1:2 H2:0, H1:1, H1:2]

22 Compress mtime and synctime
dir synctime = element-wise min of child sync times
dir mtime = element-wise max of child mod times
Sync(d1 → d1’)
–Skip d1 if mtime of d1 is less than synctime of d1’
Can we achieve this with single mtime?
–Skip d1 if mtime of d1 is less than mtime of d1’
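The directory rules can be sketched like this, assuming vector times are dicts with missing hosts counting as 0 and reading "less than" as element-wise less-than-or-equal. All function names are illustrative.

```python
from typing import Callable, Dict, List

VT = Dict[str, int]


def leq(x: VT, y: VT) -> bool:
    return all(x.get(h, 0) <= y.get(h, 0) for h in set(x) | set(y))


def elementwise(op: Callable, vectors: List[VT]) -> VT:
    """Apply min or max per host across all vectors."""
    hosts = set().union(*vectors) if vectors else set()
    return {h: op(v.get(h, 0) for v in vectors) for h in hosts}


def dir_mtime(child_mtimes: List[VT]) -> VT:
    return elementwise(max, child_mtimes)


def dir_synctime(child_synctimes: List[VT]) -> VT:
    return elementwise(min, child_synctimes)


def can_skip(src_dir_mtime: VT, dst_dir_synctime: VT) -> bool:
    """The whole subtree can be skipped if dst already knows it all."""
    return leq(src_dir_mtime, dst_dir_synctime)


# Source directory with two files modified at H1:1 and H1:2.
src_mtime = dir_mtime([{"H1": 1}, {"H1": 2}])
# Destination that synced both files vs. only the first one.
full = dir_synctime([{"H1": 2, "H2": 0}, {"H1": 2, "H2": 0}])
partial = dir_synctime([{"H1": 2, "H2": 0}, {"H1": 0, "H2": 0}])
```

Because the directory sync time is a minimum, a destination that synced only one child keeps a low directory sync time, so `can_skip(src_mtime, partial)` is false and the unsynced child is still visited later.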

23 Synctime enables partial synchronization
Directory d1 contains f1 and f2; suppose a host syncs only a subtree (d1/f1)
–With synctime+mtime: synctime of d1 does not change, mtime of d1 increases
–With mtime only: mtime of d1 increases
Host later syncs subtree d1/f2
–With synctime+mtime: will pull in modifications to f2 because synctime of d1 is smaller
–With mtime only: skips d1 because mtime is high enough

24 Using sync time
[Figure: H1 performs W(f1)a and W(f2)b in directory d; H2 performs "Sync f1 only" and then "Sync f2 only"; timestamp labels shown include H1:0 H2:0, H1:1, H1:2 and H1:2 H2:0]

25 How to deal w/ deletion
Deletion notice for a deleted file contains its sync time
[Figure: H1 performs W(f1)a and D(f2) in directory d; H2 holds f1 and f2; timestamp labels shown include H1:0, H1:0 H2:0, H1:1, H1:2 and H1:2 H2:0]

26 How to deal w/ deletion
Deletion notice for a deleted file contains its sync time
[Figure: H1 performs W(f1)a and D(f2) in directory d; H2 holds f1 and f2; timestamp labels shown include H1:0 H2:1, H2:1, H1:1, H1:2, H1:2 H2:0 and H1:2 H2:1]
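A possible reading of the two deletion slides: the receiver compares its own copy's mod time against the sync time carried by the deletion notice. The sketch below assumes dict-based vector times with missing hosts counting as 0; `apply_deletion` is a hypothetical helper and the example values are an interpretation of the figures.

```python
from typing import Dict

VT = Dict[str, int]


def leq(x: VT, y: VT) -> bool:
    return all(x.get(h, 0) <= y.get(h, 0) for h in set(x) | set(y))


def apply_deletion(local_mtime: VT, notice_synctime: VT) -> str:
    """Decide what to do with a local file when a deletion notice arrives.

    If the deleter already knew about the local version, the deletion
    supersedes it. Otherwise the local copy holds a change the deleter
    never saw, so the deletion conflicts with that change.
    """
    return "delete" if leq(local_mtime, notice_synctime) else "conflict"


notice = {"H1": 2, "H2": 0}        # sync time carried by the notice
untouched = {"H1": 0, "H2": 0}     # local copy never modified on H2
modified = {"H1": 0, "H2": 1}      # local copy modified on H2
```

Since the notice carries only a sync time, the deleting host does not need to keep a per-file mod time for the deleted file.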

27 Another definition of eventual consistency
Eventual consistency (Tra)
–All replica contents are eventually identical
–Do not care about individual writes, just overwrite old replica w/ new one
Eventual consistency (Bayou)
–Writes are eventually applied in total order
–Reads might not see most recent writes in total order

28 Bayou
[Figure: nodes N0, N1 and N2, each with a version vector and a write log; every version vector is 0:0 1:0 2:0]

29 Bayou propagation
[Figure: nodes N0, N1, N2 with version vectors and write logs. Version vectors shown: 0:3 1:0 2:0, 0:0 1:1 2:0, 0:0 1:0 2:0. Write-log entries shown: 1:0 W(x), 2:0 W(y), 3:0 W(z), 1:1 W(x). A propagated log 1:0 W(x), 2:0 W(y), 3:0 W(z) is shown with version vector 0:3 1:0 2:0]

30 Bayou propagation
[Figure: nodes N0, N1, N2 with version vectors and write logs. Version vectors shown: 0:3 1:0 2:0, 0:3 1:4 2:0, 0:0 1:0 2:0. Merged write log shown: 1:0 W(x), 1:1 W(x), 2:0 W(y), 3:0 W(z). A propagated entry 1:1 W(x) is shown with version vector 0:3 1:4 2:0]

31 Bayou propagation
[Figure: nodes N0, N1, N2 with version vectors and write logs. Version vectors shown: 0:3 1:4 2:0, 0:0 1:0 2:0, 0:4 1:4 2:0. Write log shown on two nodes: 1:0 W(x), 1:1 W(x), 2:0 W(y), 3:0 W(z)]
Which portion of the log is stable?

32 Bayou propagation
[Figure: nodes N0, N1, N2 with version vectors and write logs. Version vectors shown: 0:3 1:4 2:0, 0:4 1:4 2:0, 0:3 1:4 2:5. Write log shown on all three nodes: 1:0 W(x), 1:1 W(x), 2:0 W(y), 3:0 W(z)]

33 Bayou propagation
[Figure: nodes N0, N1, N2 with version vectors and write logs. Version vectors shown: 0:3 1:6 2:5, 0:4 1:4 2:0, 0:4 1:4 2:5, 0:3 1:4 2:5. Write log shown on all three nodes: 1:0 W(x), 1:1 W(x), 2:0 W(y), 3:0 W(z)]

34 Bayou uses a primary to commit a total order
Why is it important to make log stable?
–Stable writes can be committed
–Stable portion of the log can be truncated
Problem: If any node is offline, the stable portion of all logs stops growing
Bayou’s solution:
–A designated primary defines a total commit order
–Primary assigns CSNs (commit-seq-no)
–Any write with a known CSN is stable
–All stable writes are ordered before tentative writes
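The ordering rule can be sketched as a sort key. This is an illustration, not Bayou's actual code: it assumes a write is identified by (timestamp, node), that a tentative write has no CSN yet (drawn as ∞ on the slides), and that tentative writes are ordered among themselves by (timestamp, node). `Write`, `Primary` and `ordered_log` are hypothetical names.

```python
import math
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Write:
    ts: int                    # logical timestamp at the accepting node
    node: int                  # ID of the node that accepted the write
    op: str
    csn: Optional[int] = None  # None means tentative


def order_key(w: Write):
    # Committed writes sort by CSN; tentative ones sort after all of
    # them, ordered by (timestamp, node).
    return (w.csn if w.csn is not None else math.inf, w.ts, w.node)


def ordered_log(writes: List[Write]) -> List[Write]:
    return sorted(writes, key=order_key)


class Primary:
    """Hands out commit sequence numbers in arrival order."""

    def __init__(self, next_csn: int = 1):
        self.next_csn = next_csn

    def commit(self, w: Write) -> Write:
        if w.csn is None:
            w.csn = self.next_csn
            self.next_csn += 1
        return w


# Three committed writes plus one tentative write from node 1.
log = [Write(1, 0, "W(x)", 1), Write(2, 0, "W(y)", 2), Write(3, 0, "W(z)", 3)]
tentative = Write(1, 1, "W(x)")
before = ordered_log(log + [tentative])
Primary(next_csn=4).commit(tentative)
after = ordered_log(log + [tentative])
```

The tentative write sorts last even though its timestamp (1) is lower than those of the committed writes, and it keeps that position once the primary assigns it CSN 4.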

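35 Bayou propagation
[Figure: nodes N0, N1, N2 with version vectors and write logs. Version vectors shown: 0:3 1:0 2:0, 0:0 1:1 2:0, 0:0 1:0 2:0. Write-log entries shown: 1:1:0 W(x), 2:2:0 W(y), 3:3:0 W(z), ∞:1:1 W(x). A message carrying version vector 0:0 1:1 2:0 is shown]

36 Bayou propagation
[Figure: nodes N0, N1, N2 with version vectors and write logs. Version vectors shown: 0:4 1:1 2:0, 0:0 1:1 2:0, 0:0 1:0 2:0. Write-log entries shown: 1:1:0 W(x), 2:2:0 W(y), 3:3:0 W(z), ∞:1:1 W(x), 4:1:1 W(x). A propagated log 1:1:0 W(x), 2:2:0 W(y), 3:3:0 W(z), 4:1:1 W(x) is shown with version vector 0:4 1:1 2:0]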

37 Bayou’s limitations
Primary cannot fail
Server creation & retirement makes nodeID grow arbitrarily long
Anomalous behaviors for apps?
–Calendar app

