Replication predicates for dependent-failure algorithms Flavio Junqueira and Keith Marzullo University of California, San Diego Euro-Par Conference, Lisbon,


Replication predicates for dependent-failure algorithms Flavio Junqueira and Keith Marzullo University of California, San Diego Euro-Par Conference, Lisbon, Portugal, August 2005

Dealing with problems
Goal: derive bounds on process replication for problems
  - Typically of the form n > kt
  - n: number of processes
  - t: maximum number of process failures
State problem properties
  - E.g. consensus
  - Agreement: all correct processes decide upon the same value
  - Termination: every correct process eventually decides
Show lower bound on process replication
  - Constraint on amount of replication
  - E.g. ◊S consensus, crash failures: n > 2t
Provide an algorithm, perhaps showing bound is tight

Showing lower bound
Partition argument
Indistinguishable executions leading to violation
E.g. ◊S consensus, crash failures
  - Proof idea: n = 2t = 6, t = 3
[Figure: blocks A and B in Executions 1, 2, and 3; messages delayed arbitrarily long; Ex. 3 = Ex. 1 and Ex. 3 = Ex. 2]
Violation of agreement: n > 2t

Deriving an algorithm
From the previous argument…
  - Not every pair of sets of (n - t) processes intersects
  - Implies that n ≥ 2(n - t) and n ≤ 2t
◊S consensus, crash failures
  - Chandra and Toueg's rotating coordinator protocol
  - In every asynchronous round, a process waits for messages from (n - t) processes
  - Every two groups of (n - t) processes intersect
  - No two groups of processes make progress independently
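The intersection requirement above can be checked by brute force for small systems. The sketch below is illustrative only: the function name `every_two_quorums_intersect` and the use of integers as process ids are assumptions, not part of the slides. It enumerates every group of (n - t) processes and tests whether each pair of groups shares a process, which should hold exactly when n > 2t.

```python
from itertools import combinations


def every_two_quorums_intersect(n: int, t: int) -> bool:
    """Check by enumeration that all pairs of (n - t)-subsets of n processes intersect.

    Assumes 0 <= t < n so that each group contains at least one process.
    """
    groups = [set(g) for g in combinations(range(n), n - t)]
    # Two groups can make progress independently only if they are disjoint.
    return all(a & b for a, b in combinations(groups, 2))


# Compare the enumeration with the threshold inequality for small systems.
for n in range(1, 8):
    for t in range(0, n):
        assert every_two_quorums_intersect(n, t) == (n > 2 * t)
```

With n = 6 and t = 3, the case used in the proof idea, the check fails because two disjoint groups of three processes exist.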

Introduction
For several problems, the same partition/intersection argument applies
  - Conclusion from the argument: n > kt
Often consider a threshold t on the number of process failures
  - Threshold model
  - Not adequate when failures are not independent and identically distributed
Previously…
  - Model of dependent failures
  - Cores: generalize subsets of (t+1) processes
  - Survivor sets: generalize subsets of (n-t) processes
  - Other forms of expressing requirements on replication
  - Replication predicates
  - Requirement on replication
  - Not necessarily of the form n > kt
  - For consensus
  - Partition and Intersection properties
  - Partition property: minimal requirement
  - Intersection property: meets this requirement

In this work…
Review model of dependent failures
Generalize findings for consensus
  - Derive replication predicates
  - Similar framework using our model
Integral factors
  - E.g. n > 2t
  - Two equivalent properties
  - Replace n > kt, k > 1
Fractional factors
  - E.g. n > ⌊3t/2⌋
  - Hard to understand when using inequalities as predicates
A fractional example
  - Weak leader election in the receive-omission failure model: n > ⌊3t/2⌋
  - Framework in action
  - Two equivalent properties
  - Replace n > ⌊kt/(k-1)⌋, k > 1

System model
A set Π of processes
Execution: a sequence of steps of processes
Core: a minimal set of processes such that, in every execution, at least one of them is correct
Survivor set: a minimal set of processes that are all correct in some execution
Observations on cores and survivor sets
  - In each execution, at least one survivor set contains only correct processes
  - If a subset A does not contain a core, then Π \ A contains a survivor set
[Figure: a core and survivor sets]
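One way to make the two abstractions concrete is to derive cores from a list of survivor sets. The sketch below assumes that a core is a minimal set of processes that intersects every survivor set, which is how the definitions above read when combined with the first observation. The function name `cores_from_survivor_sets` and the small threshold system (n = 4, t = 1) are illustrative assumptions.

```python
from itertools import combinations


def cores_from_survivor_sets(processes, survivor_sets):
    """Return the minimal sets of processes that intersect every survivor set."""
    cores = []
    ordered = sorted(processes)
    # Visit candidates by increasing size so that any superset of an
    # already-found core can be rejected as non-minimal.
    for size in range(1, len(ordered) + 1):
        for candidate in combinations(ordered, size):
            c = frozenset(candidate)
            hits_all = all(c & s for s in survivor_sets)
            if hits_all and not any(core <= c for core in cores):
                cores.append(c)
    return cores


# Assumed threshold system: n = 4, t = 1, so survivor sets are the 3-subsets.
processes = frozenset(range(4))
survivor_sets = [frozenset(s) for s in combinations(range(4), 3)]
cores = cores_from_survivor_sets(processes, survivor_sets)
print(sorted(sorted(c) for c in cores))
```

In this threshold system the computed cores are the subsets of t + 1 = 2 processes, in line with the statement that cores generalize subsets of (t+1) processes.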

Generalizing the Partition/Intersection argument: The k properties
Back to the partition argument
Partition of Π into 2 blocks
[Figure: blocks A and B, with the labels "All processes faulty in some execution: does not contain a core" and "Contains a survivor set"]
None of the blocks contains a core implies that two survivor sets do not intersect

Generalizing the Partition/Intersection argument: The k properties
Back to the partition argument
k-Partition: in every partition of the processes into k blocks, at least one block contains a core
k-Intersection: every k survivor sets intersect
Partition of Π into 3 blocks
[Figure: blocks A, B, and C, with the labels "All processes faulty in some execution: does not contain a core" and "Contains a survivor set"]
None of the blocks contains a core implies that three survivor sets do not intersect
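Both k properties can be tested by enumeration on small systems. The following sketch is a hedged illustration, not the authors' code: the helper names (`k_intersection`, `k_partition`, `threshold_system`) are hypothetical, blocks of a partition are allowed to be empty, and the threshold system is used only because its answer is known in advance (both properties should hold exactly when n > kt).

```python
from itertools import combinations, combinations_with_replacement, product


def k_intersection(survivor_sets, k):
    """Every choice of k survivor sets (repetition allowed) has a common process."""
    return all(
        frozenset.intersection(*group)
        for group in combinations_with_replacement(survivor_sets, k)
    )


def k_partition(processes, cores, k):
    """In every partition into k blocks, at least one block contains a core."""
    procs = sorted(processes)
    # Each labeling assigns every process to one of k blocks (blocks may be empty).
    for labels in product(range(k), repeat=len(procs)):
        blocks = [
            frozenset(p for p, b in zip(procs, labels) if b == i) for i in range(k)
        ]
        if not any(core <= block for core in cores for block in blocks):
            return False
    return True


def threshold_system(n, t):
    """Cores and survivor sets of the threshold model with n processes, t failures."""
    procs = frozenset(range(n))
    cores = [frozenset(c) for c in combinations(range(n), t + 1)]
    survivor_sets = [frozenset(s) for s in combinations(range(n), n - t)]
    return procs, cores, survivor_sets


procs, cores, survivor_sets = threshold_system(5, 2)
print(k_partition(procs, cores, 2), k_intersection(survivor_sets, 2))
print(k_partition(procs, cores, 3), k_intersection(survivor_sets, 3))
```

For n = 5 and t = 2 the two checks agree with each other: both hold for k = 2 (5 > 4) and both fail for k = 3 (5 ≤ 6).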

An example
Implement a replicated service
  - State machine replication
  - Leverage existing implementations
Five different versions: v1-v5
  - v2 reuses code from v1
  - v3 reuses code from v1 (different from v2) and v2
  - v4 and v5 have independent developments
Five processes: one for each version
One software fault exercised at a time
[Figure: code overlap among v1, v2, v3, v4, and v5]

An example (cont.)
[Figure: code overlap among v1, v2, v3, v4, and v5, with the cores and survivor sets listed as sets of process icons; the set members did not survive extraction]
Satisfies 3-Partition as either:
  - [processes shown as icons] are not by themselves in a block, OR
  - [processes shown as icons] form a block
Conclusion: in every partition into 3 blocks, one contains a core

A fractional example: Weak Leader Election
Motivation:
  - Primary-Backup approach for replication
  - At most one primary at any time
  - Receive-omission failure model
  - Faulty process: either crashes or fails to receive messages
Properties
  - At most one process is the leader at any time
  - Eventually some process is elected
  - In failure-free executions, only a single process is ever elected
  - There is exactly one process elected infinitely often

Lower bound
Lower bound on process replication: n > ⌊3t/2⌋
Proof idea
  - Assume n = ⌊3t/2⌋ = 3, t = 2
[Figure: processes A, B, and C in Executions 1, 2, and 3, with crashed, faulty, and elected processes marked; Ex. 3 = Ex. 1 and Ex. 3 = Ex. 2]

Generalizing n > ⌊kt/(k-1)⌋: The (k, k-1) properties
A similar derivation
(k, k-1)-Partition: in every partition of the processes into k blocks, some union of k-1 blocks contains a core
(k, k-1)-Intersection: for every k survivor sets, there is at least one pair that intersects
Partition of Π into 3 blocks
[Figure: blocks A, B, and C, with the labels "All processes faulty in some execution: does not contain a core" and "Contains a survivor set"]
No union of two blocks contains a core implies that out of some three survivor sets no two intersect
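The (k, k-1)-Intersection property is easy to test by enumeration. The sketch below is an illustration under stated assumptions: the function name `k_km1_intersection` is hypothetical, and the comparison with n > ⌊kt/(k-1)⌋ relies on the threshold model, where k pairwise disjoint sets of (n - t) processes exist exactly when k(n - t) ≤ n.

```python
from itertools import combinations


def k_km1_intersection(survivor_sets, k):
    """Among every k distinct survivor sets, at least one pair intersects."""
    for group in combinations(survivor_sets, k):
        if not any(a & b for a, b in combinations(group, 2)):
            return False  # found k pairwise disjoint survivor sets
    return True


def threshold_survivor_sets(n, t):
    """Survivor sets of the threshold model: all subsets of n - t processes."""
    return [frozenset(s) for s in combinations(range(n), n - t)]


# The case from the lower-bound proof idea: n = 3, t = 2, k = 3.
print(k_km1_intersection(threshold_survivor_sets(3, 2), 3))
# One more process is enough for t = 2: n = 4 > floor(3 * 2 / 2) = 3.
print(k_km1_intersection(threshold_survivor_sets(4, 2), 3))
```

For n = 3 and t = 2 the three singleton survivor sets are pairwise disjoint, so the property fails, matching the assumption n = ⌊3t/2⌋ used in the lower-bound argument.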

An example: (3,2)-Intersection
A 2-cluster system
  - Cluster failures: all processes fail
  - One cluster failure + 1 process failure
Survivor sets: [figure not recoverable]
Satisfies (3,2)-Intersection: out of any three survivor sets, two are majorities from a single cluster
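The two-cluster example can be reproduced on a toy instance. The slide does not give cluster sizes, so the sketch below assumes two clusters of three processes each; with one whole cluster and one further process failing, the survivor sets are then the two-process majorities of either cluster. The process names and helper names are hypothetical.

```python
from itertools import combinations

# Assumed system: two clusters with three processes each.
cluster_a = ["a1", "a2", "a3"]
cluster_b = ["b1", "b2", "b3"]

# One cluster fails entirely and one more process fails in the other cluster,
# so each survivor set is a 2-process majority of a single cluster.
survivor_sets = [
    frozenset(pair)
    for cluster in (cluster_a, cluster_b)
    for pair in combinations(cluster, 2)
]


def some_pair_intersects(group):
    """True if at least two survivor sets in the group share a process."""
    return any(a & b for a, b in combinations(group, 2))


# (3,2)-Intersection: among any three survivor sets, some pair intersects.
holds_3_2 = all(some_pair_intersects(g) for g in combinations(survivor_sets, 3))

# 2-Intersection would require every pair of survivor sets to intersect.
holds_2 = all(a & b for a, b in combinations(survivor_sets, 2))

print(holds_3_2, holds_2)
```

Under these assumptions (3,2)-Intersection holds because any three survivor sets include two majorities of the same cluster, while plain 2-Intersection fails because majorities from different clusters are disjoint.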

Solving Weak Leader Election
Consensus-like synchronous algorithm
  - Initial value: own process id
  - In every round
  - A process sends messages to all other processes
  - If a process does not receive messages from a survivor set, it stops
  - Decision value: array of values, one for each process
Satisfies (3,2)-Intersection
[Figure: survivor sets S1, S2, and S3 exchanging messages; legend: "Faulty, not crashed" and "Correct"]
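The stopping rule in each round can be expressed as a one-line test. This is only a sketch of that single rule, not of the full algorithm: the function name `continues_after_round` is hypothetical, and it assumes that a process counts itself among the senders it has heard from, which the slide does not state.

```python
from itertools import combinations


def continues_after_round(heard_from, survivor_sets):
    """A process keeps running only if it heard from every member of some survivor set."""
    return any(s <= heard_from for s in survivor_sets)


# Assumed toy system: three processes, survivor sets are all pairs.
survivor_sets = [frozenset(s) for s in combinations(["p1", "p2", "p3"], 2)]

# p1 received its own message and the one from p2: a whole survivor set.
print(continues_after_round(frozenset({"p1", "p2"}), survivor_sets))

# p1 suffered receive omissions and heard only from itself: it stops.
print(continues_after_round(frozenset({"p1"}), survivor_sets))
```

The rule makes a process with too many receive omissions remove itself from the computation instead of continuing with incomplete information.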

Conclusions
Model of dependent failures
  - Cores and survivor sets
  - Generalize common sets used in bound proofs and algorithms
  - Simple abstractions
  - More expressive: move away from IID
Partition and Intersection properties
  - New form of expressing bounds
  - k Properties: generalize n > kt
  - (k, k-1) Properties: generalize n > ⌊kt/(k-1)⌋
Take away…
  - Use cores and survivor sets to model failures
  - Use Partition and Intersection properties as replication predicates

END

Introduction
For several problems, the same partition/intersection argument applies
Often consider a threshold t on the number of process failures
  - Threshold model
  - Not adequate when failures are not independent and identically distributed
Review a model of dependent failures
  - Cores: generalize subsets of (t+1) processes
  - Survivor sets: generalize subsets of (n-t) processes
Generalize the argument
  - Two equivalent properties
  - Replace n > kt for integer k, k > 1
A fractional example
  - Weak leader election in the receive-omission failure model: n > ⌊3t/2⌋
  - Two equivalent properties
  - Generalize n > ⌊kt/(k-1)⌋, k > 1
Replication predicate
  - Statement about the amount of replication
  - E.g. n > kt, intersection of survivor sets, etc.

Properties of RO Consensus
Initial value: p_i.a
Decision value: p_i.D
Termination
  - Correct processes eventually decide
Agreement
  - If p_i.D[j] ≠ ⊥ for some p_i, then p_i.D[j] = p_c.D[j] for every correct p_c
RO Uniformity
  - At most two decision values: D1 and D2
  - D1 ⊆ D2 ∨ D2 ⊆ D1
Validity
  - For every process p_i that does not crash, p_i.D[j] ∈ {⊥, p_j.a}
  - If initial values v1, v2 such that v1 [?] v2, and v1 [?] p_i.D, then for every p_j that decides, v1 [?] p_j.D

Solving Weak Leader Election
Synchronous algorithm
  - Computation in rounds
RO Consensus
  - Initial value: v ∈ V
  - Decision value D: array of values, one for each process
  - Decision values allowed to differ
  - For every p_i, p_j that decide: p_i.D ⊆ p_j.D or p_j.D ⊆ p_i.D
  - A ⊆ B, for arrays A and B, iff for all i: if A[i] ≠ ⊥, then A[i] = B[i]
Two phases
  - Phase 1: RO Consensus
  - Initial value: for each process, its own process id
  - Decision value: array of values, one for each process
  - Phase 2: RO Consensus again
  - Initial value: decision array from previous phase
  - Decision value: array of arrays of values, one for each process
  - Leader: first process id of the array containing fewer values

Solving Weak Leader Election
Consensus-like synchronous algorithm
  - Initial value: own process id
  - In every round
  - A process sends messages to all other processes
  - If a process does not receive messages from a survivor set, it stops
  - Decision value: array of values, one for each process
Satisfies (3,2)-Intersection
[Figure: survivor sets S1, S2, and S3 exchanging messages in Case 1 and Case 2, with decision values D1 and D2; legend: "Faulty, not crashed" and "Correct"]

Correctness intuition
RO Consensus
  - In every round
  - A process sends messages to every other process
  - If a process does not receive messages from a survivor set, it stops
  - Satisfies (3,2)-Intersection
[Figure: survivor sets S1, S2, and S3 exchanging messages in Case 1 and Case 2, with decision values D1 and D2; legend: "Faulty, not crashed" and "Correct"]

Conclusions
Reviewed a model of dependent failures
  - Cores: generalize subsets of (t+1) processes
  - Survivor sets: generalize subsets of (n-t) processes
k properties
  - Generalize n > kt
  - k-Partition: commonly used in lower bound proofs
  - k-Intersection: commonly used in the design of algorithms
  - Equivalent properties
Example of fractional k: n > ⌊3t/2⌋
  - Weak Leader Election
  - (k, k-1)-Partition/Intersection
Future work
  - Study more general properties further
  - Apply to different problems

Outline
  - Related work
  - K-properties
  - Fractional K
  - Conclusions

Equivalence between the properties

Equivalence between the properties

Deriving an algorithm