Replication predicates for dependent-failure algorithms Flavio Junqueira and Keith Marzullo University of California, San Diego Euro-Par Conference, Lisbon,

1 Replication predicates for dependent-failure algorithms
Flavio Junqueira and Keith Marzullo
University of California, San Diego
Euro-Par Conference, Lisbon, Portugal, August 2005

2 Euro-Par’05: Dealing with problems
Goal: derive bounds on process replication for problems
  - Typically of the form n > kt
  - n: number of processes
  - t: maximum number of process failures
State problem properties
  - E.g. consensus
  - Agreement: all correct processes decide upon the same value
  - Termination: every correct process eventually decides
Show lower bound on process replication
  - Constraint on amount of replication
  - E.g. ◇S consensus, crash failures: n > 2t
Provide an algorithm, perhaps showing bound is tight

3 Showing lower bound
Partition argument
Indistinguishable executions leading to violation
E.g. ◇S consensus, crash failures
  - Proof idea: n = 2t = 6, t = 3
[Figure: blocks A and B; Executions 1, 2 and 3; messages delayed arbitrarily long; Ex. 3 = Ex. 1; Ex. 3 = Ex. 2; violation of agreement: n > 2t]

4 Deriving an algorithm
From the previous argument…
  - Not every pair of groups of (n - t) processes intersects
  - Implies that n ≥ 2(n - t), and hence n ≤ 2t
◇S consensus, crash failures
  - Chandra and Toueg’s rotating coordinator protocol
  - In every asynchronous round, a process waits for messages from (n - t) processes
  - Every two groups of (n - t) processes intersect
  - No two groups of processes make progress independently
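The intersection claim on this slide can be checked by brute force for small systems. The sketch below is illustrative only: the function name `all_quorums_intersect` and the use of integers as process ids are assumptions, not part of the talk. It enumerates every group of (n - t) processes and tests whether every two groups share a process, which should hold exactly when n > 2t.

```python
from itertools import combinations

def all_quorums_intersect(n: int, t: int) -> bool:
    """Return True if every two groups of (n - t) processes share a process."""
    # Hypothetical encoding: processes are the integers 0 .. n-1.
    quorums = [set(q) for q in combinations(range(n), n - t)]
    return all(a & b for a, b in combinations(quorums, 2))

# Compare the brute-force answer with the threshold predicate n > 2t.
for n in range(1, 8):
    for t in range(0, n):
        assert all_quorums_intersect(n, t) == (n > 2 * t)
```

For n = 6 and t = 3, the case used in the proof idea, two disjoint groups of three processes exist, so the check fails as the partition argument predicts.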

5 Introduction
For several problems same partition/intersection argument
  - Conclusion from the argument: n > kt
Often consider a threshold t on the number of process failures
  - Threshold model
  - Not adequate when failures not independent, identically distributed
Previously…
  - Model of dependent failures
  - Cores: generalize subsets of (t+1) processes
  - Survivor sets: generalize subsets of (n-t) processes
  - Other forms of expressing requirements on replication
  - Replication predicates
  - Requirement on replication
  - Not necessarily of the form n > kt
  - For consensus
  - Partition and Intersection properties
  - Partition property: minimal requirement
  - Intersection property: meets this requirement

6 In this work…
Review model of dependent failures
Generalize findings for consensus
  - Derive replication predicates
  - Similar framework using our model
Integral factors
  - E.g. n > 2t
  - Two equivalent properties
  - Replace n > kt, k > 1
Fractional factors
  - E.g. n > ⌊3t/2⌋
  - Hard to understand when using inequalities as predicates
A fractional example
  - Weak leader election in the receive-omission failure model: n > ⌊3t/2⌋
  - Framework in action
  - Two equivalent properties
  - Replace n > ⌊kt/(k-1)⌋, k > 1

7 System model
A set Π of processes
Execution: a sequence of steps of processes
Core: minimal set of processes such that, in every execution, at least one of them is correct
Survivor set: minimal set of processes correct in some execution
Observations on cores and survivor sets
  - In each execution, at least one survivor set contains only correct processes
  - If a subset A does not contain a core, then Π \ A contains a survivor set
[Figure: a core and survivor sets]
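The two definitions can be made concrete with a small brute-force sketch. It assumes, as a modelling choice not stated on the slide, that the system is described by a list of failure sets, each a set of processes that may all be faulty in the same execution. The names `survivor_sets` and `cores` are hypothetical helpers. Survivor sets are taken as the minimal complements of failure sets, and cores as the minimal sets that intersect every survivor set.

```python
from itertools import combinations

def survivor_sets(procs, failure_sets):
    """Minimal complements of the given failure sets."""
    everyone = frozenset(procs)
    complements = {everyone - frozenset(f) for f in failure_sets}
    return {s for s in complements if not any(other < s for other in complements)}

def cores(procs, failure_sets):
    """Minimal sets that intersect every survivor set (brute force)."""
    surv = survivor_sets(procs, failure_sets)
    found = []
    # Candidates are visited by increasing size, so any smaller core is already in `found`.
    for size in range(1, len(procs) + 1):
        for cand in map(frozenset, combinations(sorted(procs), size)):
            hits_all = all(cand & s for s in surv)
            if hits_all and not any(c < cand for c in found):
                found.append(cand)
    return set(found)

# Threshold model with n = 4, t = 1: any single process may fail.
procs = {0, 1, 2, 3}
failure_sets = [{p} for p in procs]
print(sorted(map(sorted, survivor_sets(procs, failure_sets))))  # the subsets of size n - t = 3
print(sorted(map(sorted, cores(procs, failure_sets))))          # the subsets of size t + 1 = 2
```

In the threshold case the sketch recovers the familiar sets: survivor sets have n - t processes and cores have t + 1.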

8 Generalizing the Partition/Intersection argument: The k properties
Back to the partition argument
[Figure: partition of Π into 2 blocks, A and B; labels: "All processes faulty in some execution: does not contain a core", "Contains a survivor set"]
None of the blocks contains a core implies that two survivor sets do not intersect

9 Generalizing the Partition/Intersection argument: The k properties
Back to the partition argument
k-Partition: in every partition of the processes into k blocks, at least one block contains a core
k-Intersection: every k survivor sets intersect
[Figure: partition of Π into 3 blocks, A, B and C; labels: "All processes faulty in some execution: does not contain a core", "Contains a survivor set"]
None of the blocks contains a core implies that three survivor sets do not intersect
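Both k properties can be tested exhaustively on small systems. The following is a sketch under stated assumptions, not the authors' formulation: blocks of a partition are allowed to be empty, the k survivor sets may repeat, and a block is taken to contain a core exactly when it intersects every survivor set (so that no execution has all of its members faulty). The function names are hypothetical.

```python
from itertools import combinations, combinations_with_replacement, product

def contains_core(block, surv):
    """A set contains a core iff it intersects every survivor set."""
    return all(block & s for s in surv)

def k_intersection(surv, k):
    """Every k survivor sets (repetition allowed) have a common process."""
    return all(frozenset.intersection(*group)
               for group in combinations_with_replacement(list(surv), k))

def k_partition(procs, surv, k):
    """In every split of the processes into k blocks, some block contains a core."""
    procs = sorted(procs)
    for labels in product(range(k), repeat=len(procs)):
        blocks = [frozenset(p for p, b in zip(procs, labels) if b == i)
                  for i in range(k)]
        if not any(contains_core(b, surv) for b in blocks):
            return False
    return True

def threshold_survivor_sets(n, t):
    """Survivor sets of the threshold model: all subsets of size n - t."""
    return [frozenset(c) for c in combinations(range(n), n - t)]

# On threshold systems both properties should coincide with n > kt.
for n in range(2, 6):
    for t in range(1, n):
        surv = threshold_survivor_sets(n, t)
        for k in (2, 3):
            assert k_intersection(surv, k) == k_partition(range(n), surv, k) == (n > k * t)
```

The enumeration is exponential in the number of processes, so it is only meant for sanity checks on toy configurations.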

10 An example
Implement a replicated service
  - State machine replication
  - Leverage existing implementations
Five different versions: v1-v5
  - v2 reuses code from v1
  - v3 reuses code from v1 (different from v2) and v2
  - v4 and v5 have independent developments
Five processes: one for each version
One software fault exercised at a time
[Figure: code overlap among v1, v2, v3, v4, v5]

11 An example (cont.)
[Figure: code overlap among v1, v2, v3, v4, v5; cores; survivor sets]
Satisfies 3-Partition as either:
  - […] are not by themselves in a block, OR
  - […] form a block
Conclusion: in every partition into 3 blocks, one contains a core

12 A fractional example
Weak Leader Election
Motivation:
  - Primary-Backup approach for replication
  - At most one primary at any time
  - Receive-omission failure model
  - Faulty process: either crashes or fails to receive messages
Properties
  - At most one process is the leader at any time
  - Eventually some process is elected
  - In failure-free executions, a single process is ever elected
  - There is exactly one process elected infinitely often

13 Lower bound
Lower bound on process replication: n > ⌊3t/2⌋
Proof idea
  - Assume n = ⌊3t/2⌋ = 3, t = 2
[Figure: processes A, B and C in Executions 1, 2 and 3; labels: Crash, Faulty, Elected; Ex. 3 = Ex. 1; Ex. 3 = Ex. 2]

14 Generalizing n > ⌊kt/(k-1)⌋: The (k, k-1) properties
A similar derivation
(k, k-1)-Partition: in every partition of the processes into k blocks, some union of k-1 blocks contains a core
(k, k-1)-Intersection: for every k survivor sets, there is at least one pair that intersects
[Figure: partition of Π into 3 blocks, A, B and C; labels: "All processes faulty in some execution: does not contain a core", "Contains a survivor set"]
No union of two blocks contains a core implies that out of some three survivor sets no two intersect
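The (k, k-1) properties admit the same kind of exhaustive check. This is a sketch under assumptions of my own: blocks may be empty, the k survivor sets are taken to be distinct, and a union of blocks is taken to contain a core exactly when it intersects every survivor set. The helper names are hypothetical. On threshold systems, k pairwise disjoint survivor sets of size n - t exist only when k(n - t) ≤ n, so both properties should agree with the floor form of the bound, n > ⌊kt/(k-1)⌋.

```python
from itertools import combinations, product

def threshold_survivor_sets(n, t):
    """Survivor sets of the threshold model: all subsets of size n - t."""
    return [frozenset(c) for c in combinations(range(n), n - t)]

def kk1_intersection(surv, k):
    """Among every k distinct survivor sets, at least one pair intersects."""
    return all(any(a & b for a, b in combinations(group, 2))
               for group in combinations(list(surv), k))

def kk1_partition(procs, surv, k):
    """In every split into k blocks, some union of k-1 blocks contains a core."""
    procs = sorted(procs)
    everyone = frozenset(procs)
    for labels in product(range(k), repeat=len(procs)):
        blocks = [frozenset(p for p, b in zip(procs, labels) if b == i)
                  for i in range(k)]
        # The union of the other k-1 blocks is the complement of one block.
        unions = [everyone - blk for blk in blocks]
        if not any(all(u & s for s in surv) for u in unions):
            return False
    return True

# Threshold systems: compare with n > floor(kt / (k - 1)).
for n in range(2, 6):
    for t in range(1, n):
        surv = threshold_survivor_sets(n, t)
        for k in (2, 3):
            bound = n > (k * t) // (k - 1)
            assert kk1_intersection(surv, k) == kk1_partition(range(n), surv, k) == bound
```

With n = 3 and t = 2 the three singleton survivor sets are pairwise disjoint, so (3,2)-Intersection fails, which matches the configuration used in the lower-bound proof idea.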

15 An example: (3,2)-Intersection
A 2-cluster system
  - Cluster failures: all processes fail
  - One cluster failure + 1 process failure
[Figure: survivor sets]
Satisfies (3,2)-Intersection: out of any three survivor sets, two are majorities from a single cluster
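A toy instance of the 2-cluster system makes the claim easy to verify. The cluster size is not given on the slide; the sketch below assumes three processes per cluster, with hypothetical names a1..a3 and b1..b3, and reads the failure model as: at worst, one whole cluster fails together with one process of the other cluster. Under that reading, the survivor sets are the two-process subsets of each cluster.

```python
from itertools import combinations

# Assumed configuration: two clusters of three processes each.
cluster_a = {"a1", "a2", "a3"}
cluster_b = {"b1", "b2", "b3"}

def cluster_survivor_sets(a, b):
    """What remains when one cluster fails and the other loses one process."""
    return [frozenset(c)
            for cluster in (a, b)
            for c in combinations(sorted(cluster), len(cluster) - 1)]

def some_pair_intersects(group):
    """True if at least two of the given sets share a process."""
    return any(x & y for x, y in combinations(group, 2))

survivors = cluster_survivor_sets(cluster_a, cluster_b)

# (3,2)-Intersection: among any three survivor sets, some pair intersects.
holds_3_2 = all(some_pair_intersects(g) for g in combinations(survivors, 3))

# 2-Intersection: every two survivor sets intersect.
holds_2 = all(x & y for x, y in combinations(survivors, 2))

print(holds_3_2, holds_2)  # True False
```

Any three survivor sets include two from the same cluster, and those two overlap; a survivor set from one cluster and one from the other are disjoint, so the stronger 2-Intersection property does not hold here.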

16 Solving Weak Leader Election
Consensus-like synchronous algorithm
  - Initial value: own process id
  - In every round
      - A process sends messages to all other processes
      - If a process does not receive messages from a survivor set, it stops
  - Decision value: array of values, one for each process
Satisfies (3,2)-Intersection
[Figure: survivor sets S1, S2, S3; legend: Faulty, not crashed; Correct; Messages]
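The stopping rule in each round can be read as a simple set test. The sketch below is only one interpretation: the function name `should_stop` is hypothetical, and whether a process counts itself among the senders it heard from is left to the caller. A process keeps going only if the senders it heard from in the round include every member of at least one survivor set.

```python
def should_stop(heard_from, survivor_sets):
    """True if no survivor set is fully contained in the senders heard from."""
    heard = frozenset(heard_from)
    return not any(frozenset(s) <= heard for s in survivor_sets)

# Assumed example system: two clusters of three processes, survivor sets are
# the two-process subsets of each cluster.
example_survivor_sets = [
    {"a1", "a2"}, {"a1", "a3"}, {"a2", "a3"},
    {"b1", "b2"}, {"b1", "b3"}, {"b2", "b3"},
]

print(should_stop({"a1", "a2", "b1"}, example_survivor_sets))  # False: {a1, a2} was heard
print(should_stop({"a1", "b1"}, example_survivor_sets))        # True: no survivor set heard
```

A process that fails to receive messages (receive omission) ends up hearing from too few senders and halts, which is how the rule keeps faulty but not crashed processes from deciding independently.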

17 Conclusions
Model of dependent failures
  - Cores and survivor sets
  - Generalize common sets used in bound proofs and algorithms
  - Simple abstractions
  - More expressive: move away from IID
Partition and Intersection properties
  - New form of expressing bounds
  - k properties: generalize n > kt
  - (k, k-1) properties: generalize n > ⌊kt/(k-1)⌋
Take away…
  - Use cores and survivor sets to model failures
  - Use Partition and Intersection properties as replication predicates

18 END

19 Introduction
For several problems same partition/intersection argument
Often consider a threshold t on the number of process failures
  - Threshold model
  - Not adequate when failures not independent, identically distributed
Review a model of dependent failures
  - Cores: generalize subsets of (t+1) processes
  - Survivor sets: generalize subsets of (n-t) processes
Generalize the argument
  - Two equivalent properties
  - Replace n > kt for integer k, k > 1
A fractional example
  - Weak leader election in the receive-omission failure model: n > ⌊3t/2⌋
  - Two equivalent properties
  - Generalize n > ⌊kt/(k-1)⌋, k > 1
Replication predicate
  - Statement about the amount of replication
  - E.g. n > kt, intersection of survivor sets, etc.

20 Properties of RO Consensus
Initial value: p_i.a
Decision value: p_i.D
Termination
  - Correct processes eventually decide
Agreement
  - If p_i.D[j] ≠ ⊥, for some p_i, then p_i.D[j] = p_c.D[j] for every correct p_c
RO Uniformity
  - At most two decision values: D1 and D2
  - D1 ⊆ D2 ∨ D2 ⊆ D1
Validity
  - For every process p_i that does not crash, p_i.D[j] ∈ {⊥, p_j.a}
  - If initial values v1, v2 such that v1 ≠ v2, and v1 ∈ p_i.D, then for every p_j that decides, v1 ∈ p_j.D

21 Solving Weak Leader Election
Synchronous algorithm
  - Computation in rounds
RO Consensus
  - Initial value: v ∈ V
  - Decision value D: array of values, one for each process
  - Decision values allowed to differ
  - For every p_i, p_j that decide: p_i.D ⊆ p_j.D or p_j.D ⊆ p_i.D
  - A ⊆ B, A and B arrays ⇔ ∀i: if A[i] ≠ ⊥, then A[i] = B[i]
Two phases
  - Phase 1: RO Consensus
      - Initial value: for each process, its own process id
      - Decision value: array of values, one for each process
  - Phase 2: RO Consensus again
      - Initial value: decision array from previous phase
      - Decision value: array of arrays of values, one for each process
  - Leader: first process id of the array containing fewer values
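The ordering on decision arrays used by RO Consensus is easy to state in code. In this sketch the empty entry is represented by `None`, and the names `BOTTOM`, `contained` and `comparable` are illustrative assumptions. Array A is contained in array B when every non-empty entry of A equals the corresponding entry of B; two decisions are acceptable together when one is contained in the other.

```python
BOTTOM = None  # assumed stand-in for an empty array entry

def contained(a, b):
    """True if every non-empty entry of a equals the same entry of b."""
    assert len(a) == len(b), "decision arrays have one slot per process"
    return all(x is BOTTOM or x == y for x, y in zip(a, b))

def comparable(a, b):
    """True if one decision array is contained in the other."""
    return contained(a, b) or contained(b, a)

d1 = [1, BOTTOM, 3]   # a decision that misses the value of process 2
d2 = [1, 2, 3]        # a decision with every value present
d3 = [1, 5, BOTTOM]   # conflicts with d2 in the second slot

print(contained(d1, d2))   # True
print(contained(d2, d1))   # False
print(comparable(d1, d2))  # True
print(comparable(d2, d3))  # False
```

Arrays such as `d1` and `d2` can both be decided under this relation, whereas `d2` and `d3` disagree on a filled slot and so cannot.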

22 Solving Weak Leader Election
Consensus-like synchronous algorithm
  - Initial value: own process id
  - In every round
      - A process sends messages to all other processes
      - If a process does not receive messages from a survivor set, it stops
  - Decision value: array of values, one for each process
Satisfies (3,2)-Intersection
[Figure: survivor sets S1, S2, S3; legend: Faulty, not crashed; Correct; Msgs; Case 1, Case 2; Decision value D1, Decision value D2]

23 Correctness intuition
RO Consensus
  - In every round
      - A process sends messages to every other process
      - If a process does not receive messages from a survivor set, it stops
  - Satisfies (3,2)-Intersection
[Figure: survivor sets S1, S2, S3; legend: Faulty, not crashed; Correct; Msgs; Case 1, Case 2; Decision value D1, Decision value D2]

24 Conclusions
Reviewed a model of dependent failures
  - Cores: generalize subsets of (t+1) processes
  - Survivor sets: generalize subsets of (n-t) processes
k properties
  - Generalize n > kt
  - k-Partition: commonly used in lower bound proofs
  - k-Intersection: commonly used in the design of algorithms
  - Equivalent properties
Example of fractional k: n > ⌊3t/2⌋
  - Weak Leader Election
  - (k, k-1)-Partition/Intersection
Future work
  - Study more general properties further
  - Apply to different problems

25 Outline
  - Related work
  - K-properties
  - Fractional K
  - Conclusions

26 Equivalence between the properties

27 Equivalence between the properties

28 Deriving an algorithm

