Consensus with Partial Synchrony Kevin Schaffer Chapter 25 from “Distributed Algorithms” by Nancy A. Lynch.

Consensus
- System inputs: $\mathit{init}(v)_i$, $\mathit{stop}_i$
- System outputs: $\mathit{decide}(v)_i$

Constraints
- Well-formedness: for each i, the interactions between $U_i$ and $A$ are well-formed
- Agreement: all decision values are identical
- Validity: if all init actions specify the same value v, then v is the only possible decision value
- f-failure termination, $0 \le f \le n$: if init occurs on all ports and stop occurs on at most f ports, then a decide occurs on all non-failing ports
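The agreement, validity, and f-failure termination conditions above can be checked mechanically on a finished run. The following is a minimal sketch, assuming a run is summarized by three hypothetical structures that are not part of the slides: `inputs` (port to init value), `decisions` (port to decided value), and `stopped` (set of ports where stop occurred). Well-formedness is not checked.

```python
def check_run(n, f, inputs, decisions, stopped):
    """Check one finished run against the consensus constraints.

    n, f      -- number of ports and the failure bound (0 <= f <= n)
    inputs    -- dict: port -> value given by init(v)_i
    decisions -- dict: port -> value output by decide(v)_i
    stopped   -- set of ports on which stop_i occurred
    All three structures are illustrative assumptions.
    """
    decided_values = set(decisions.values())
    input_values = set(inputs.values())

    # Agreement: all decision values are identical.
    agreement = len(decided_values) <= 1

    # Validity: only constrains runs where every init carries the same value.
    validity = len(input_values) != 1 or decided_values <= input_values

    # f-failure termination: if init occurred on all ports and at most f
    # ports stopped, every non-failing port must have decided.
    premise = len(inputs) == n and len(stopped) <= f
    termination = (not premise) or all(
        i in decisions for i in range(n) if i not in stopped
    )

    return {
        "agreement": agreement,
        "validity": validity,
        "termination": termination,
    }
```

A run with three ports, one of which stops before deciding, passes all three checks when `f = 1` and the two surviving ports decide the common input value.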

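PSynchFD Failure Detector
- Each process continually sends messages to all other processes
- If $P_i$ performs m steps without receiving a message from $P_j$, then it outputs inform-stopped(j)
- m must be strictly greater than $(d + \ell_2)/\ell_1 + 1$, where $\ell_1$ and $\ell_2$ are the lower and upper bounds on process step time and d is the upper bound on message delay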
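The step-counting rule can be sketched as follows. This is an illustration only, assuming that $\ell_1$ and $\ell_2$ are the lower and upper bounds on process step time and d is the message-delay bound; the names `min_threshold` and `SilenceDetector` are hypothetical, and the code models only the counting logic, not the MMT automaton or the message sending.

```python
import math
from fractions import Fraction


def min_threshold(d, l1, l2):
    """Smallest integer m with m > (d + l2)/l1 + 1 (exact arithmetic)."""
    bound = (Fraction(d) + Fraction(l2)) / Fraction(l1) + 1
    return math.floor(bound) + 1


class SilenceDetector:
    """Counts local steps taken since the last message from each peer."""

    def __init__(self, peers, m):
        self.m = m
        self.silent_steps = {j: 0 for j in peers}
        self.suspected = set()

    def on_message(self, j):
        # Any message from j resets its silence counter.
        self.silent_steps[j] = 0

    def on_step(self):
        """Take one local step; return peers newly reported as stopped."""
        newly_stopped = []
        for j in self.silent_steps:
            if j in self.suspected:
                continue
            self.silent_steps[j] += 1
            if self.silent_steps[j] >= self.m:
                self.suspected.add(j)
                newly_stopped.append(j)  # corresponds to inform-stopped(j)
        return newly_stopped
```

With d = 10, $\ell_1$ = 1 and $\ell_2$ = 2, `min_threshold` returns 14, the first integer strictly above (10 + 2)/1 + 1 = 13.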

Correctness of PSynchFD
- All failures are eventually detected by all other non-failed processes
- After performing $(d + \ell_2)/\ell_1$ steps, at least $d + \ell_2$ time has passed; it takes at most $d + \ell_2$ for a message sent from $P_j$ to reach $P_i$; hence, $P_j$ must have stopped
- Once a $\mathit{stop}_j$ occurs, it takes at most $Ld + d + O(L\ell_2)$ for either an $\textit{inform-stopped}(j)_i$ or a $\mathit{stop}_i$ to occur
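A rough accounting of where the detection-latency bound comes from, assuming $\ell_1$ and $\ell_2$ are the lower and upper bounds on process step time, $L = \ell_2/\ell_1$, and m is the smallest admissible threshold. This is a sketch rather than the book's proof; bookkeeping steps are absorbed into the $O(L\ell_2)$ term.

```latex
\begin{align*}
m &\le \frac{d + \ell_2}{\ell_1} + 2
  && \text{smallest integer strictly above } \tfrac{d + \ell_2}{\ell_1} + 1 \\
t_{\text{last}} &\le d + O(\ell_2)
  && \text{last message from } P_j \text{ reaches } P_i \text{ after } \mathit{stop}_j \\
t_{\text{silent}} &\le m\,\ell_2
  \le \Bigl(\frac{d + \ell_2}{\ell_1} + 2\Bigr)\ell_2
  = Ld + L\ell_2 + 2\ell_2
  && m \text{ silent steps, each taking at most } \ell_2 \\
t_{\text{last}} + t_{\text{silent}} &\le Ld + d + O(L\ell_2)
  && \text{since } L \ge 1
\end{align*}
```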

Transforming Synchronous to Partially Synchronous
- In A, each process $P_i$ is the composition of two MMT automata, $Q_i$ and $R_i$
  - $Q_i$ is a PSynchFD process
  - $R_i$ simulates the synchronous algorithm
- Each $R_i$ keeps track of the current round r and moves to the next round once it has received all round r messages from all non-failed processes
- B is A with the input and output corrected
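The round-advance rule for $R_i$ can be sketched as below. This is a minimal illustration under stated assumptions: `RoundTracker` and its method names are hypothetical, rounds are numbered from 1, a process does not wait for a message from itself, and the synchronous algorithm being simulated is left out entirely.

```python
class RoundTracker:
    """Tracks the current round and decides when it is safe to advance."""

    def __init__(self, pid, n):
        self.peers = set(range(n)) - {pid}  # everyone except this process
        self.round = 1
        self.inbox = {}       # round number -> {sender: payload}
        self.stopped = set()  # peers reported by the failure detector

    def on_message(self, sender, rnd, payload):
        # Messages for later rounds are buffered until that round is current.
        self.inbox.setdefault(rnd, {})[sender] = payload

    def on_inform_stopped(self, j):
        self.stopped.add(j)

    def try_advance(self):
        """Return the current round's messages and advance, or None if the
        round is still waiting on some peer not known to have stopped."""
        got = self.inbox.get(self.round, {})
        waiting_on = self.peers - self.stopped - set(got)
        if waiting_on:
            return None
        self.round += 1
        return got
```

The failure detector is what keeps this from blocking forever: a peer that crashes before sending its round r message is eventually added to `stopped`, after which `try_advance` no longer waits for it.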

Process P in B

Upper Bound

Synchronous Algorithm
- Synchronous algorithm adapted to the partially synchronous model
- Uses PSynchFD to detect process failures
- Complexity: $T(0) + f(Ld + d) + d + O(fL\ell_2)$

PSynchAgreement
- Uses PSynchFD to detect process failures
- Proceeds in rounds
- A process can decide 0 only in even-numbered rounds and 1 only in odd-numbered rounds
- Processes start round 0 after receiving input
- Each process has a variable decided to keep track of which processes have decided

PSA Algorithm
- Round 0
  - If input is 1, then:
    - Send goto(1) to all processes
    - Go to round 1
  - If input is 0, then:
    - Send goto(2) to all processes
    - Output decide(0)
    - Send decided to all processes

PSA Algorithm (2)
- Round r > 0
  - If the process has received goto(r + 1) from anyone:
    - Send goto(r + 1) to all processes
    - Go to round r + 1
  - If the process has received goto(r) from everyone not in $\mathit{stopped} \cup \mathit{decided}$:
    - Send goto(r + 2) to all processes
    - Output decide(r mod 2)
    - Send decided to all processes
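The two slides of round rules can be exercised with a small simulation. This is a sketch under several assumptions that are not in the slides: the names `PSAProcess` and `run` are hypothetical, "all processes" includes the sender, the goto(r + 1) rule is checked before the decide rule, messages are delivered in one global FIFO order, and crashed processes never take a step and are reported to every live process immediately. It models no timing, so it illustrates the round logic only.

```python
from collections import defaultdict, deque


class PSAProcess:
    def __init__(self, pid, n):
        self.pid = pid
        self.peers = set(range(n))         # includes this process
        self.round = 0
        self.decision = None
        self.goto_from = defaultdict(set)  # r -> senders of goto(r)
        self.stopped = set()
        self.decided = set()
        self.outbox = []                   # messages to broadcast, in order

    def _decide(self, value):
        self.decision = value
        self.outbox.append(("decided", None))

    def start(self, value):
        # Round 0 rules.
        if value == 1:
            self.outbox.append(("goto", 1))
            self.round = 1
        else:
            self.outbox.append(("goto", 2))
            self._decide(0)

    def receive(self, sender, msg):
        kind, r = msg
        if kind == "goto":
            self.goto_from[r].add(sender)
        else:
            self.decided.add(sender)
        self._advance()

    def inform_stopped(self, j):
        self.stopped.add(j)
        self._advance()

    def _advance(self):
        # Round r > 0 rules, applied until neither one fires.
        while self.decision is None and self.round > 0:
            r = self.round
            if self.goto_from[r + 1]:
                self.outbox.append(("goto", r + 1))
                self.round = r + 1
            elif self.peers - self.stopped - self.decided <= self.goto_from[r]:
                self.outbox.append(("goto", r + 2))  # sent before "decided"
                self._decide(r % 2)
            else:
                break


def run(inputs, crashed=()):
    """Drive all live processes to quiescence; return pid -> decision."""
    n = len(inputs)
    live = [PSAProcess(i, n) for i in range(n) if i not in crashed]
    queue = deque()

    def flush(p):
        while p.outbox:
            queue.append((p.pid, p.outbox.pop(0)))

    for p in live:
        p.start(inputs[p.pid])
        flush(p)
    for p in live:
        for j in crashed:
            p.inform_stopped(j)
            flush(p)
    while queue:
        sender, msg = queue.popleft()
        for p in live:
            p.receive(sender, msg)
            flush(p)
    return {p.pid: p.decision for p in live}
```

In this driver, inputs `[0, 1, 1]` lead every process to decide 0: the process with input 0 decides in round 0, and the other two are pulled to round 2 by its goto(2) message and decide there.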

Lemma 25.7
- If any process sends goto(r + 2), then some process tries to decide at round r
- If any process reaches round r + 2, then some process tries to decide at round r

Lemma 25.8
- If a process i decides at round r, then:
  - $R_i$ sends no goto(r + 1) messages
  - $R_i$ sends a goto(r + 2) message to every process
  - No process tries to decide at round r + 1

Validity
- If all processes start with 0, then no process can ever leave round 0. No process can decide 1 in round 0.
- If all processes start with 1, then no process tries to decide in round 0. Lemma 25.7 then implies that no process reaches round 2.

Agreement
- Suppose that $R_i$ decides at round r and no process decides at any earlier round
- By Lemma 25.8, no process tries to decide at round r + 1. By Lemma 25.7, no process can reach round r + 3
- All processes must therefore decide in round r or r + 2, which have the same parity and so yield the same output value

Liveness
- Suppose $R_i$ gets stuck at round r > 0
- All non-stopped, non-decided processes must eventually reach r
- Since r > 0, each such process must send goto(r) to $R_i$, and $R_i$ must eventually receive it
- Therefore $R_i$'s condition for deciding is satisfied; a contradiction

Wait-Free Termination
- A quiet round r is one in which some process never receives goto(r + 1)
  - If no process tries to decide at round r, then round r + 1 is quiet
  - If some process decides at round r, then round r + 2 is quiet
- There must eventually be a quiet round, hence wait-free termination is guaranteed
- Complexity: $Ld + (2f + 2)d + O(f\ell_2 + L\ell_2)$

Lower Bound
- Suppose that $n \ge f + 1$
- There is no n-process agreement algorithm for the partially synchronous model that guarantees f-failure termination and in which all nonfaulty processes always decide strictly before time $Ld + (f - 1)d$
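To get a feel for the gap between the lower bound and the PSynchAgreement upper bound, the leading terms can be evaluated numerically. This is a small sketch, assuming L is the ratio $\ell_2/\ell_1$ of the step-time bounds and d is the message-delay bound; the function names are hypothetical, and the upper-bound helper deliberately omits the O(...) term, so it understates the true bound.

```python
def lower_bound(L, d, f):
    """Time before which not all nonfaulty processes can always decide:
    L*d + (f - 1)*d."""
    return L * d + (f - 1) * d


def upper_bound_leading(L, d, f):
    """Leading terms of the PSynchAgreement bound: L*d + (2f + 2)*d.
    The additive O(...) term from the slides is omitted."""
    return L * d + (2 * f + 2) * d


# Example with L = 2, d = 10, f = 3:
#   lower bound:    2*10 + 2*10 = 40
#   leading terms:  2*10 + 8*10 = 100
example = (lower_bound(2, 10, 3), upper_bound_leading(2, 10, 3))
```

Both expressions share the Ld term; they differ in the coefficient of d that multiplies f, which is 1 in the lower bound and 2 in the upper bound.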

Other Results
- Synchronous processes, asynchronous channels: agreement is not solvable with even 1 failure
- Asynchronous processes, synchronous channels: same
- Eventual time bounds: solvable, but only if n > 2f