Replication: optimistic approaches Marc Shapiro with Yasushi Saito (HP Labs) Cambridge Distributed Systems Group.


1 Replication: optimistic approaches
Marc Shapiro, with Yasushi Saito (HP Labs)
Cambridge Distributed Systems Group

2 Bases de Données Avancées -- 2002-10-23 Replication: optimistic approaches
Motivations for this work
  Peer-to-peer, decentralised write sharing
  Lessons and commonalities
  Understand limitations
  Different solutions: spectrum or discrete points?
  Simple formal model

3 Optimistic replication
Replicas of shared objects on sites
Without synchronisation: peer-to-peer read and update!
Consistency: a posteriori, offline
  Merge independent updates
Applications:
  high-latency networks
  disconnected operation
  cooperative work
Improves availability & performance

4 Example: cooperative engineering with CVS
CVS: developing shared code
Local, disconnected replica: no interference
Conflicts:
  Write same file = syntactic
  Overlap in file = violates edit semantics
  Doesn’t compile, test = violates application semantics
Both sides of a conflict are excluded
Manual repair

5 Example: Bayou
General-purpose database
Any replica can update, log actions
  action = { dependency check, operation, merge-procedure }
Optimistic replication:
  epidemic exchange of logs
  { roll-back, replay }*; commit
dep-check: semantic check for conflict
merge-proc: semantic repair

6 Basic vocabulary
While isolated: tentative updates
When connected, reconcile:
  Propagate & collect updates
  (Conceptually) Restart from initial state
  Replay updates (if possible)
Overriding goal: consistency

7 1. Consistency
Study component issues of consistency

8 What is consistency?
Consistent with user intents
  apply operations according to user scenario
Consistent with data invariants
  dependent actions
  pre- and post-conditions
  conflict resolution
Replicas consistent with each other
  converge towards same values

9 Consistency: problem taxonomy
1. Objects & updates
  Internal vs. external consistency
  Value / value log / operation log
  Single master / multi-master
2. Detecting dependence vs. concurrency
3. Concurrency control
4. Laziness of concurrency control
  Pessimistic / advanced concurrency / optimistic
5. Convergence

10 Operation-based reconciliation
Updates: concurrent, unsynchronised
Local log of actions = operation descriptions
  object identifier, method, arguments
Multi-log collects local + remote logs
Reconciliation schedule: merge multi-log & run sequentially
Scheduling issues:
  Include vs. exclude
  Execution order
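
The reconciliation steps on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Action` fields, the `deposit`/`withdraw` methods, and the ordering rule (log position, ties broken by site name) are all hypothetical and not taken from the talk. An action whose method fails during replay is excluded from the schedule.

```python
from typing import NamedTuple


class Action(NamedTuple):
    site: str     # replica that logged the action (hypothetical field)
    seq: int      # position in that replica's local log
    obj: str      # object identifier
    method: str   # method name
    args: tuple   # method arguments


# Hypothetical methods: each returns the new value or raises ValueError.
def deposit(value, amount):
    return value + amount


def withdraw(value, amount):
    if value < amount:
        raise ValueError("insufficient funds")
    return value - amount


METHODS = {"deposit": deposit, "withdraw": withdraw}


def reconcile(initial, logs):
    """Merge the logs into one multi-log and replay it from the initial state."""
    multilog = [action for log in logs for action in log]
    # Illustrative execution order only: log position, ties broken by site.
    schedule = sorted(multilog, key=lambda a: (a.seq, a.site))
    state, included, excluded = dict(initial), [], []
    for a in schedule:
        try:
            state[a.obj] = METHODS[a.method](state[a.obj], *a.args)
            included.append(a)
        except ValueError:
            excluded.append(a)   # include vs. exclude: drop the failing action
    return state, included, excluded
```

With an initial balance of 100, a log containing one withdrawal of 80 and a second log containing a withdrawal of 50 followed by a deposit of 40 reconcile to a balance of 60, with the withdrawal of 50 excluded.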

11 Operation-based model
[Figure: operation-based model]

12 Dependence vs. concurrency
Two actions either have a dependency or are commutative / concurrent
Dependent actions:
  do not conflict
  must be scheduled in dependence order
Concurrent actions potentially conflict
Dependence / concurrency detection is a fundamental mechanism

13 Concurrency control
Concurrent & no conflict $\Rightarrow$ commute: execute both, arbitrary order
Conflict detection options
Conflict resolution options

14 Convergence
Liveness: sites receive same/all actions
Safety: given same actions, sites compute the same value
Stability: actions eventually not undone

15 2. Dependency & Concurrency
Mechanisms to detect if actions are dependent or concurrent

16 Scalar clocks and timestamps
Wall clock: total order
Lamport clock: total order, consistent with causal dependence
Schedule in timestamp order
Can’t detect concurrency
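
A minimal Lamport clock sketch shows both properties listed on this slide. The class and method names are illustrative, and breaking ties with the site name to obtain a total order is a common convention assumed here, not something stated in the talk.

```python
class LamportClock:
    """Scalar logical clock; names are illustrative."""

    def __init__(self):
        self.time = 0

    def tick(self):
        """Advance the clock for a local event and return its timestamp."""
        self.time += 1
        return self.time

    def send(self):
        """Timestamp an outgoing message (a send is also a local event)."""
        return self.tick()

    def receive(self, msg_time):
        """Jump past the sender's timestamp, so a receive follows its send."""
        self.time = max(self.time, msg_time) + 1
        return self.time


p, q = LamportClock(), LamportClock()
a = (p.tick(), "p")      # independent event at p
b = (q.tick(), "q")      # independent event at q
# (time, site) pairs are totally ordered, so a and b are ordered even
# though neither caused the other: concurrency is not detectable.
assert a < b
m = p.send()
r = q.receive(m)
assert m < r             # the order is consistent with causal dependence
```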

17 Happens-before
$e_1$ precedes $e_2$ in process, or $e_1$ sends and $e_2$ receives $\Rightarrow e_1 \to e_2$
$\neg(e_1 \to e_2) \land \neg(e_2 \to e_1) \Leftrightarrow e_1 \parallel e_2$
$e_1 \parallel e_2$: $e_1$ does not cause $e_2$
$e_1 \to e_2$: $e_1$ might cause $e_2$
Partial order, consistent with causal dependence
Schedule consistent with $\to$

18 Syntactic vs. semantic mechanisms
Scalar timestamps
  no concurrency detection
  very conservative approx. of causality
Vector timestamps
  detect concurrency
  conservative approx. of causality
Alternative: explicit semantic constraints
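
To make the contrast concrete, here is a small sketch of how vector timestamps detect concurrency. Representing a timestamp as a dict from site name to counter, with missing entries read as zero, is an assumption made for the example.

```python
def compare(u, v):
    """Compare two vector timestamps (dicts: site -> counter).

    Returns "before" if u happens-before v, "after" if v happens-before u,
    "equal" if they are identical, and "concurrent" otherwise.
    """
    sites = u.keys() | v.keys()
    u_le_v = all(u.get(s, 0) <= v.get(s, 0) for s in sites)
    v_le_u = all(v.get(s, 0) <= u.get(s, 0) for s in sites)
    if u_le_v and v_le_u:
        return "equal"
    if u_le_v:
        return "before"
    if v_le_u:
        return "after"
    return "concurrent"
```

For example, `compare({"p": 1}, {"p": 1, "q": 1})` is `"before"`, while `compare({"p": 2}, {"p": 1, "q": 1})` is `"concurrent"`: each timestamp is ahead of the other in one component, which a scalar timestamp cannot express.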

19 Locks as semantic constraints
Read(x) depends on previous Write(x) in same process, or previously-received Write(x), whichever is later
Write(x) depends on previous Read(*) in same process
More semantic information than Happens-Before
Step in the right direction, but still too coarse

20 IceCube: Primitive constraints
Declarative (“static”):
  MustHave: $a \triangleright b$
    if $a \in s$ and $a \triangleright b$ then $b \in s$ (not necessarily contiguous nor in order)
  Order: $a \to b$
    if $a, b \in s$ and $a \to b$ then $a$ before $b$ in $s$ (not necessarily both nor contiguous)
  Within log, across logs
Imperative (dynamic):
  preCondition (State)

21 Log constraints
[Figure labels: parcel, predecessor-successor, alternatives]
Express user intents:
  Predecessor/successor: $a \to b \land b \triangleright a$
    b uses effect of a; “a causes b”
  Parcel: $a \triangleright b \land b \triangleright a$ ($\approx$ transaction)
  Alternatives: $a \to b \land b \to a$

22 3. Concurrency control & scheduling
Policies for dealing with concurrent actions

23 Optimistic concurrency control & scheduling
Two actions are either:
  dependent $\Rightarrow$ schedule in dependence order
  concurrent and non-conflicting or commutative $\Rightarrow$ schedule in any order
  concurrent and conflicting $\Rightarrow$ schedule in non-conflicting order, or exclude one, the other, or both

24 Concurrency control
Concurrent & no conflict $\Rightarrow$ commute: execute both, arbitrary order
Conflict detection options: 2 concurrent actions conflict
  only if operate on same object
  only if both write
  only if violate semantic invariant
Conflict resolution options:
  exclude both
  exclude 1st, include 2nd (or vice-versa)
  execute both in favorable order
  (rewrite and execute both)

25 What is a conflict?
1 site executes code + pre/post-conditions
Pre/post-conditions often unknown
Dependency between successive actions
Schedule execution must satisfy pre/post-conditions
Violation $\Rightarrow$ conflict
[Figure: three actions with their pre- and post-conditions]
  $x_1 := f(x_0)$, with $\mathrm{pre}(x_0)$, $\mathrm{post}(x_0, f(x_0))$
  $x'_1 := g(x_0)$, with $\mathrm{pre}(x_0)$, $\mathrm{post}(x'_1, g(x_0))$
  $x_2 := g(x_1)$, with $\mathrm{pre}(x_1)$, $\mathrm{post}(x_1, g(x_1))$

26 Thomas’ Write Rule
Pre- / post-conditions unknown
Scalar clocks
  no concurrency detection
  implicit concurrency control
  schedule in clock order
  a later action excludes earlier ones
Lost updates
Delete ambiguity: “tombstone” state
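
The rule and the tombstone point can be illustrated with a small last-writer-wins store. This is a sketch, not the talk's code: the class name, the `(clock, site)` timestamp shape, and the in-memory dict are assumptions made for the example.

```python
TOMBSTONE = object()   # marker kept in place of a deleted value


class LWWStore:
    """Replica applying Thomas' Write Rule per key (illustrative names)."""

    def __init__(self):
        self.data = {}   # key -> (timestamp, value)

    def apply(self, key, timestamp, value):
        """Accept the write only if it is newer than what is stored."""
        current = self.data.get(key)
        if current is not None and current[0] >= timestamp:
            return False             # the earlier write is excluded
        self.data[key] = (timestamp, value)
        return True

    def delete(self, key, timestamp):
        """A delete is a write of the tombstone, so it keeps its timestamp."""
        return self.apply(key, timestamp, TOMBSTONE)

    def read(self, key):
        entry = self.data.get(key)
        if entry is None or entry[1] is TOMBSTONE:
            return None
        return entry[1]
```

Two replicas that receive the writes `(1, "p")` and `(2, "q")` to the same key in opposite orders both end up with the later value, and the earlier update is lost. A replica that sees a delete stamped `(3, "p")` before the late-arriving write `(2, "q")` rejects that write, because the tombstone still carries the delete's timestamp; without it the deleted value would reappear.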

27 Value-based Version Vector concurrency control
Pre- / post-conditions unknown
Independent objects
  actions to different objects commute
VV = per-object vector timestamp
  any concurrent writes to object conflict
Resolution:
  Manual
  Values: “Resolver” per data type

28 Bayou scheduling
Disjoint databases; 1 primary / database
Transaction: single database
Action = { dependency check, operation, merge-procedure }
Optimistic replication:
  epidemic exchange of logs
  { roll-back, replay }*; commit
Conflict $\Rightarrow$ dependency check fails $\Rightarrow$ merge-procedure

29 Bayou dependency checks
Write-write conflicts:
  on replay, check that data unchanged
Read-write conflicts:
  check input data
  can detect concurrent updates
  semantic: only relevant changes
Application-specific checks
  bank account balance > £100
  fine grain
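
A Bayou-style action and its replay can be sketched as follows. This is not Bayou's API: the `Write` class, the `replay` function, and the `withdrawal` example are hypothetical, loosely modelled on the bank-balance check above, and the chosen merge procedure (recording the refused amount) is just one possible semantic repair.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Write:
    dep_check: Callable[[dict], bool]    # semantic check for conflict
    operation: Callable[[dict], None]    # the update itself
    merge_proc: Callable[[dict], None]   # semantic repair if the check fails


def replay(initial, log):
    """Roll back to the initial state and replay the log in order."""
    db = dict(initial)
    conflicts = 0
    for w in log:
        if w.dep_check(db):
            w.operation(db)
        else:
            conflicts += 1
            w.merge_proc(db)
    return db, conflicts


def withdrawal(amount):
    """Hypothetical write: withdraw only if the balance stays above 100."""
    def dep_check(db):
        return db["balance"] - amount > 100

    def operation(db):
        db["balance"] -= amount

    def merge_proc(db):
        # Repair chosen for illustration: record the refused amount.
        db["refused"] = db["refused"] + (amount,)

    return Write(dep_check, operation, merge_proc)
```

Starting from a balance of 300, replaying `[withdrawal(150), withdrawal(100)]` leaves 150 and refuses the second withdrawal, whereas replaying them in the opposite order leaves 200 and refuses the withdrawal of 150. The outcome of a tentative write therefore depends on where it lands in the replayed log.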

30 IceCube: Object constraints
Shared data type advertises static semantics
  mutually exclusive: $a \to b \land b \to a$
  best order (e.g. bank: credits before debits): $a \to b$
Only between concurrent actions
Also: dynamic constraints
[Figure labels: commute, best order, mutually exclusive]

31 IceCube scheduling
Insight:
  conflict: choice of which action to exclude
  maximise value

32 IceCube execution model
[Figure: IceCube execution model; labels: log constraints, object constraints, dynamic constraints]

33 Search vs. syntactic order

34 Performance of IceCube heuristics

35 4. Convergence
Can a peer-to-peer system converge?
  Hard in the general case
Formalise to understand limitations, trade-offs

36 Convergence
Liveness: sites receive same/all operations
  epidemic
  multicast quickly
Safety: sites compute the same value
  equivalent schedules
Stability: actions eventually not undone
  stable schedules
  Users, external world dependency
  Garbage collection

37 Schedule soundness & equivalence
$s$ sound:
  Closed for MustHave: $a \in s \land a \triangleright b \Rightarrow b \in s$
  Consistent with Order: $(a, b \in s \land a \to b) \Rightarrow a$ before $b$ in $s$
Equivalence: $s \equiv t$
  $s$, $t$ sound
  $a \in s \Leftrightarrow a \in t$
  ordering is irrelevant!
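
The two soundness conditions and the equivalence test translate directly into a short check. In this sketch a schedule is a list of distinct action names, and `must_have` and `order` are sets of `(a, b)` pairs; these representations and the function names are assumptions for illustration.

```python
def is_sound(schedule, must_have, order):
    """Check closure under MustHave and consistency with Order."""
    pos = {a: i for i, a in enumerate(schedule)}
    # (a, b) in must_have: if a is scheduled, b must be scheduled too.
    closed = all(b in pos for (a, b) in must_have if a in pos)
    # (a, b) in order: if both are scheduled, a must come before b.
    ordered = all(pos[a] < pos[b]
                  for (a, b) in order if a in pos and b in pos)
    return closed and ordered


def equivalent(s, t, must_have, order):
    """Two sound schedules are equivalent if they contain the same actions."""
    return (is_sound(s, must_have, order)
            and is_sound(t, must_have, order)
            and set(s) == set(t))
```

With `must_have = {("b", "a")}` and `order = {("a", "b")}`, the schedule `["a", "b"]` is sound, `["b"]` is not (it lacks `a`), and `["b", "a"]` is not (wrong order). A cyclic Order pair such as `{("a", "b"), ("b", "a")}` makes any schedule containing both actions unsound, so at most one of the two can be included.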

38 Stability
Peer-to-peer, indefinite tentative update + advisory reconciliation OK
But stability needed:
  Users, external world depend on it
  Garbage collect multilog
Stable: eventually decisions not changed
  committed: definitely included in all schedules
  aborted: definitely excluded

39 Correctness of stability
Actions known to be stable at site $i$: $\mathit{stable}_i = \mathit{committed}_i \cup \mathit{aborted}_i$
Live: $\forall$ action $a$, site $i$: eventually $a \in \mathit{stable}_i$
Safe:
  $\forall$ site $i$, schedule $s_i$: $s_i$ sound $\land\ \mathit{committed}_i \subseteq s_i$
  $\forall$ sites $i, k$: $\mathit{committed}_i \cap \mathit{aborted}_k = \emptyset$
Safety invariant: strong, global!

40 Maintaining disjointness
$\forall$ sites $i, k$: $\mathit{committed}_i \cap \mathit{aborted}_k = \emptyset$
Different possibilities:
  Unilateral abort: TWR, Holliday 2000
  Unilateral commit
  Deterministic abort / commit rule: TWR
  Primary (only one) site decides: Bayou, CVS
  Consensus before deciding: Deno, Holliday 2000-2002

41 Maintaining soundness
$\forall$ site $i$, schedule $s_i$: $s_i$ sound $\land\ \mathit{committed}_i \subseteq s_i$
When aborting a, also abort actions that MustHave a
When committing a, also abort uncommitted actions that are ‘Order’ed before a
Maintain both soundness and disjointness.
Peer-to-peer commitment is hard!
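
The two rules on this slide can be sketched for a single site as follows. This is a local illustration only, under assumed representations (`committed` and `aborted` as sets, constraints as sets of pairs). It does not address the hard part, which is getting all sites to agree on these decisions.

```python
def abort(a, committed, aborted, must_have):
    """Abort a and, transitively, every action that MustHave an aborted one.

    must_have is a set of (x, y) pairs meaning "x MustHave y".
    """
    pending = [a]
    while pending:
        x = pending.pop()
        if x in aborted:
            continue
        if x in committed:
            raise ValueError(f"{x!r} is committed and cannot be aborted")
        aborted.add(x)
        # Any action that requires x can no longer be in a sound schedule.
        pending.extend(p for (p, q) in must_have if q == x)


def commit(a, committed, aborted, must_have, order):
    """Commit a, aborting uncommitted actions that are Ordered before a.

    order is a set of (x, y) pairs meaning "x before y".
    """
    if a in aborted:
        raise ValueError(f"{a!r} is aborted and cannot be committed")
    committed.add(a)
    for (x, y) in sorted(order):
        if y == a and x not in committed:
            abort(x, committed, aborted, must_have)
```

For instance, with `must_have = {("b", "a"), ("c", "b")}`, aborting `a` also aborts `b` and `c`. With `order = {("x", "y")}` and nothing yet decided, committing `y` aborts `x`, and the two sets stay disjoint.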

42 Stability with TWR
Independent objects
Independent writes (no MustHave nor Order)
All sites take same decision:
  Given two writes to same object, abort the earlier
  Whether concurrent or not
Write stable when seen by all sites
Disjointness: $\mathit{committed}_i = \emptyset$
Soundness: no MustHave (no transactions)

43 Stability in Bayou
Databases:
  Disjoint
  Independent: no multi-DB transaction
  1 primary / database
Log constraints: transactions, time order
Disjointness:
  Only 1 site decides about a: the primary for the database that a updates
Soundness: whole transaction commits or aborts

44 Holliday’s pre-commit protocol
Log constraints:
  multi-object transactions
  happens-before order
Read transactions commit locally
Read-Write transactions: consensus to commit
  convert locks to intentions
  pre-commit, vote
  commit if quorum ‘yes’
  abort if anti-quorum ‘no’ or conflict with committed
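
The voting step can be illustrated with a small decision function. The slide does not define the quorum, so this sketch assumes a simple majority and takes an anti-quorum to be any number of ‘no’ votes that makes a ‘yes’ majority unreachable. The function name and vote representation are hypothetical, and the "conflict with committed" case is left out.

```python
def decide(votes, n_sites):
    """Decide a read-write transaction from the votes received so far.

    votes maps a site name to True ('yes') or False ('no').
    Assumes a majority quorum; this is an illustration, not the protocol.
    """
    quorum = n_sites // 2 + 1
    yes = sum(1 for v in votes.values() if v)
    no = len(votes) - yes
    if yes >= quorum:
        return "commit"
    if no > n_sites - quorum:
        # Anti-quorum: 'yes' can no longer reach a quorum.
        return "abort"
    return "pending"
```

With five sites, three ‘yes’ votes commit and three ‘no’ votes abort. With four sites, the majority is three, so two ‘no’ votes already form an anti-quorum.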

45 Trade-offs
Deterministic rule
  fast, inflexible
Partition + primary
  single point of failure
  no MustHave across partition boundaries
Consensus
  slow
  scalability
  impossibility of consensus in asynchronous systems with failure

46 5. Conclusions

47 Need for OR not going away
“Network technology improving: keep everything consistent pessimistically.” True, but:
  Constant latency; unavailable bandwidth
  Mobile access $\Rightarrow$ unbounded latency
  Increasing numbers of replicas
“Conflicts are rare.” True, but:
  Do occur
  Very high cost

48 OR pros & cons
Peer-to-peer read/write sharing
OR accepts more updates:
  Performance despite latency
  Availability despite failures
Increased complexity
  Semantic information
  Not transparent
Bottleneck moved to commit
  Hard to make peer-to-peer
  Unless (unacceptable?) restrictions
  Unavoidable

49 The end

