Distributed Algorithms: Agreement Protocols

Problems of Agreement
- A set of processes need to agree on a value (decision) after one or more processes have proposed what that value (decision) should be.
- Examples: mutual exclusion, election, transactions.
- Processes may be correct, crashed, or they may exhibit arbitrary (Byzantine) failures.
- Messages are exchanged on a one-to-one basis, and they are not signed.

Consensus and related problems
- System model:
  - N processes {p_1, p_2, ..., p_N}.
  - Communication is reliable, but processes may fail.
  - At most f processes out of N may be faulty: crash failure or Byzantine (arbitrary) failure.
  - The system is logically fully connected.
  - A receiver process knows the identity of the sender process.
- Limiting faults solely to the processes simplifies the solution to the agreement problems.
  - More recently, agreement problems have also been studied under failures of communication channels only, and under failures of both processes and communication channels.

Authenticated & Non-authenticated messages
- To reach an agreement, processes have to exchange their values and relay the received values to other processes.
- Authenticated (signed) message system: a (faulty) process cannot forge a message or change the contents of a received message before relaying it to others.
  - A process can verify the authenticity of a received message.
- Non-authenticated (unsigned, "oral") message system: a (faulty) process can forge a message and claim to have received it from another process, or change the contents of a received message before relaying it.
  - A process has no way of verifying the authenticity of a received message.

Two Agreement Problems
- Consensus problem: N processes agree on a value (e.g. a synchronized action such as go / abort).
  - Consensus may have to be reached in the presence of failures: process failures (crash/fail-stop, arbitrary failure) or communication failures.
  - Each process p_i starts in an "undecided" state and proposes a value v_i, drawn from a set D, while in the undecided state.
  - Processes exchange messages until each process p_i makes its decision d_i and moves to the decided state.
  - Consensus is reached if all correct processes agree on the same value d_i.

Consensus Requirements
- Termination: eventually each correct process sets its decision value.
  - This may not be possible in the presence of process crashes in an asynchronous system.
- Agreement: the decision value is the same for all correct processes.
  - Arbitrary (Byzantine) failures may cause inconsistency and prevent agreement.
- Integrity: if all correct processes propose the same value, any correct process decides that value.
- Consensus may involve a proposal stage and an agreement stage.
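Agreement and integrity are properties of a finished run, so they can be stated as simple checks on the proposed and decided values of the correct processes. The sketch below is purely illustrative; the helper names and the dictionary representation are ours, not from the slides, and termination is omitted because it is a liveness property that cannot be checked on a single snapshot.

```python
def check_agreement(decisions):
    """Agreement: all correct processes that decided chose the same value."""
    return len(set(decisions.values())) <= 1

def check_integrity(proposals, decisions):
    """Integrity: if all correct processes proposed the same value v,
    then any correct process that decides must decide v."""
    proposed = set(proposals.values())
    if len(proposed) == 1:
        (v,) = proposed
        return all(d == v for d in decisions.values())
    return True  # otherwise integrity imposes no constraint

# proposals / decisions map correct-process ids to values:
proposals = {1: "go", 2: "go", 3: "go"}
decisions = {1: "go", 2: "go", 3: "go"}
assert check_agreement(decisions) and check_integrity(proposals, decisions)
```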

Byzantine Generals Problem
- Proposed and solved by Lamport.
- Consider a battlefield: a number of generals at different positions want to reach an agreement on their attack plan, i.e. "attack" or "retreat". The generals are separated geographically and communicate through messengers. Some of the generals are "loyal" and some are "traitors".
- Upper bound on the number of traitors: Pease et al. showed that it is impossible to reach consensus if f exceeds ⌊(N-1)/3⌋.

Byzantine Generals Problem
- "Byzantine generals" problem: a "commander" process p_i orders a value v; the "lieutenant" processes must agree on what the commander ordered.
- Processes may be faulty and provide wrong or contradictory messages.
- Integrity requirement: a distinguished process decides a value for the others to agree upon.
- A solution only exists if N > 3f, where f is the number of faulty processes.
- Differs from consensus in that a distinguished process supplies a value that the others are to agree upon, instead of each of them proposing a value.

Byzantine Generals Problem: Requirements
- Termination: eventually each process sets its decision variable.
- Agreement: the decision value of all correct processes is the same.
- Integrity: if the commander is correct, then all correct processes agree on the value the commander proposed.
- Note: integrity implies agreement when the commander is correct, but the commander need not be correct.

IC: A Variant of Consensus
- Interactive Consistency (IC) problem: every process proposes a single value. The goal of the algorithm is for the correct processes to agree on a vector of values, one for each process (the "decision vector").
- Example: for each of a set of processes to obtain the same information about their respective states.

IC: A Variant of Consensus (requirements)
- Termination: eventually each process sets its decision variable.
- Agreement: the decision vector of all correct processes is the same.
- Integrity: if p_i is correct, then all correct processes agree on v_i as the i-th component of their decision vector.

Relationship between C, BG & IC
- Although it is common to consider the BG problem with arbitrary process failures, each of the three problems (consensus C, Byzantine generals BG, and interactive consistency IC) is in fact meaningful in the context of either arbitrary or crash failures.
- Each can be framed assuming either a synchronous or an asynchronous system.
- It is sometimes possible to derive a solution to one problem using a solution to another.

Relationship between C, BG & IC
- Suppose that there exist solutions to C, BG and IC:
  - C_i(v_1, v_2, ..., v_N) returns the decision value of p_i in a run of the solution to the consensus problem, where v_1, v_2, ... are the values the processes proposed.
  - BG_i(j, v) returns the decision value of p_i in a run of the solution to the BG problem, where p_j, the commander, proposed the value v.
  - IC_i(v_1, v_2, ..., v_N)[j] returns the j-th value in the decision vector of p_i in a run of the solution to the IC problem, where v_1, v_2, ... are the values the processes proposed.
- It is possible to construct solutions to each problem out of the solutions to the others.

Relationship between C, BG & IC
- IC can be solved from BG by running BG N times, once with each process p_j (j = 1, 2, ..., N) acting as the commander and proposing its own value v_j:
  - IC_i(v_1, v_2, ..., v_N)[j] = BG_i(j, v_j) (i = 1, 2, ..., N)
- C can be solved from IC by running IC to produce a vector of values at each process, then applying an appropriate function to the vector's values to derive a single value:
  - C_i(v_1, v_2, ..., v_N) = majority(IC_i(v_1, v_2, ..., v_N)[1], ..., IC_i(v_1, v_2, ..., v_N)[N])
- BG can be solved from C as follows:
  - The commander p_j sends its proposed value v to itself and to each of the remaining processes.
  - All processes run C with the values v_1, v_2, ..., v_N that they receive (p_j may be faulty, so these values may differ).
  - They derive BG_i(j, v) = C_i(v_1, v_2, ..., v_N) (i = 1, 2, ..., N).
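A minimal sketch of these reductions, assuming hypothetical black-box solvers: `bg_solver(commander, value)` returns every correct process's BG decision, and `ic_solver(proposals)` returns every correct process's decision vector. Processes are numbered 0..N-1 here; none of these names come from the slides, and this sketches the reductions, not the underlying protocols.

```python
from collections import Counter

def majority(values):
    """Return the strict-majority value, or None if there is none."""
    value, count = Counter(values).most_common(1)[0]
    return value if count > len(values) / 2 else None

def ic_from_bg(bg_solver, proposals):
    """IC from BG: run BG N times, once with each process j acting as the
    commander and proposing its own value v_j; run j fills column j of every
    process's decision vector (IC_i[j] = BG_i(j, v_j))."""
    n = len(proposals)
    vectors = [[None] * n for _ in range(n)]
    for j in range(n):
        decisions = bg_solver(commander=j, value=proposals[j])  # decisions[i] = BG_i(j, v_j)
        for i in range(n):
            vectors[i][j] = decisions[i]
    return vectors

def c_from_ic(ic_solver, proposals):
    """C from IC: run IC once, then apply majority() to each decision vector."""
    vectors = ic_solver(proposals)
    return [majority(vector) for vector in vectors]

def bg_from_c(c_solver, received_values):
    """BG from C: the commander first sends its value to every process; each
    process then runs C with the values it received (a faulty commander may
    have sent different values to different processes)."""
    return c_solver(received_values)
```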

Consensus
- Solving consensus is equivalent to solving reliable and totally ordered (RTO) multicast: given a solution to one, we can solve the other.
- Implementing consensus with RTO-multicast:
  - Collect all processes into a group g.
  - Each process p_i performs RTO-multicast(g, v_i).
  - Each process p_i chooses d_i = m_i, where m_i is the first value that p_i RTO-delivers.
  - The termination property follows from the reliability of the multicast.
  - The agreement and integrity properties follow from the reliability and total ordering of multicast delivery.
- Chandra & Toueg [1996] demonstrate how RTO-multicast can be derived from consensus.
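A minimal sketch of this construction, assuming a hypothetical `rto_multicast(group, value)` primitive and a blocking `rto_deliver()` call provided by the multicast layer (these names are our own, not from the slides):

```python
def consensus_via_rto_multicast(group, my_value, rto_multicast, rto_deliver):
    """Propose my_value by RTO-multicasting it to the group, then decide on the
    first value RTO-delivered. Reliability of the multicast gives termination;
    reliability plus total order of delivery give agreement and integrity."""
    rto_multicast(group, my_value)   # every correct process multicasts its proposal
    m_i = rto_deliver()              # block until the first totally ordered delivery
    return m_i                       # decision d_i = m_i, identical at every correct process
```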

Consensus in a synchronous system
- We discuss an algorithm that uses only a basic multicast (B-multicast) protocol to solve consensus in a synchronous system.
- The algorithm assumes that up to f of the N processes exhibit crash failures.

Communication Model
- Complete graph (i.e. logically fully connected).
- Synchronous network.

Multicast
- B-multicast: a process sends a message to all processors in one round; at the end of the round, everybody has received it.
- Two or more processes can multicast in the same round.

Crash Failures
- A faulty processor may crash in the middle of its multicast: some of its messages are lost and are never received.

Consensus
- Start: everybody has an initial value.
- Finish: everybody must decide the same value.
- Validity condition: if everybody starts with the same value, they must decide that value.

A simple algorithm using B-multicast (only one round is needed). Each processor:
1. B-multicasts its value to all processors.
2. Decides on the minimum value received.
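A failure-free simulation sketch of this one-round algorithm; the dictionary-based harness is our own framing, not part of the slides. Since nobody crashes, every processor receives the same set of values and therefore picks the same minimum.

```python
def min_consensus_no_failures(initial_values):
    """One-round algorithm (no failures): every processor B-multicasts its value,
    then decides on the minimum of all the values it received."""
    # Round 1: with no failures, every value reaches every processor.
    received = set(initial_values.values())
    # Decision: each processor picks the minimum of what it received.
    return {p: min(received) for p in initial_values}

# Processors 1..5 start with values 0, 1, 2, 3, 4; every processor decides 0.
print(min_consensus_no_failures({1: 0, 2: 1, 3: 2, 4: 3, 5: 4}))
```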

Example: processors start with values 0, 1, 2, 3, 4. After B-multicasting, every processor has received {0, 1, 2, 3, 4} and decides on the minimum, 0. The algorithm satisfies the validity condition: if everybody starts with the same initial value, everybody decides on that value (the minimum).

Consensus with Crash Failures
The simple one-round algorithm (each processor B-multicasts its value and decides on the minimum) doesn't work in the presence of crash failures.

Example: the processor with value 0 crashes during the round and multicasts its value to only some of the processors. At the end of the round, some processors have received {0, 1, 2, 3, 4} and decide 0, while the others have received only {1, 2, 3, 4} and decide 1. No consensus!
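To make the failure scenario above concrete, here is a small simulation sketch; the harness and the parameter names (e.g. `reached`) are our own assumptions. The crashed processor's value reaches only part of the group, so the survivors decide different minima.

```python
def min_consensus_with_midround_crash(initial_values, crashed, reached):
    """Simulate the one-round algorithm when `crashed` fails mid-multicast:
    its value reaches only the processors in `reached`."""
    everyone = set(initial_values.values())
    partial = everyone - {initial_values[crashed]}
    decisions = {}
    for p in initial_values:
        if p == crashed:
            continue                          # the crashed processor never decides
        got = everyone if p in reached else partial
        decisions[p] = min(got)
    return decisions

# Processor 1 (value 0) crashes; only processors 2 and 3 receive its value.
# Processors 2, 3 decide 0 while processors 4, 5 decide 1: no consensus.
print(min_consensus_with_midround_crash({1: 0, 2: 1, 3: 2, 4: 3, 5: 4},
                                        crashed=1, reached={2, 3}))
```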

If an algorithm solves consensus for up to f failed processes, we say it is an f-resilient consensus algorithm.

Example: the input and output of a 3-resilient consensus algorithm. Up to 3 processes may fail, and all non-faulty processes still finish with the same decision value.
New validity condition: if all non-faulty processes start with the same value, then all non-faulty processes decide that value.

An f-resilient algorithm. Each processor:
- Round 1: B-multicast my value.
- Rounds 2 to f+1: B-multicast any newly received values.
- End of round f+1: decide on the minimum value received.
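A round-by-round simulation sketch of this flooding algorithm. The crash model is a simplification of our own: `crash_round` maps a faulty process to the last round in which it sends, and a crashing process completes that round's multicast rather than failing partway through it (mid-round partial sends are not modelled here).

```python
def f_resilient_consensus(initial_values, f, crash_round=None):
    """Run f+1 synchronous rounds; in each round every live process
    B-multicasts the values it newly learned, then all decide on the minimum."""
    crash_round = crash_round or {}
    known = {p: {v} for p, v in initial_values.items()}   # values known so far
    new = {p: {v} for p, v in initial_values.items()}     # values learned last round

    for rnd in range(1, f + 2):                           # rounds 1 .. f+1
        senders = {p for p in initial_values if crash_round.get(p, f + 2) >= rnd}
        incoming = {p: set() for p in initial_values}
        for s in senders:
            for q in initial_values:                      # B-multicast the new values
                incoming[q] |= new[s]
        for p in initial_values:
            new[p] = incoming[p] - known[p]               # keep only newly learned values
            known[p] |= incoming[p]

    correct = [p for p in initial_values if p not in crash_round]
    return {p: min(known[p]) for p in correct}

# f = 1: process 1 (value 0) stops sending after round 1; all correct
# processes end up with the same set of values and decide the minimum, 0.
print(f_resilient_consensus({1: 0, 2: 1, 3: 2, 4: 3, 5: 4}, f=1, crash_round={1: 1}))
```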

Example with f = 1 failure (f+1 = 2 rounds needed): in round 1 every processor B-multicasts its value; the processor with value 0 crashes and reaches only some of the others, so some processors know {0, 1, 2, 3, 4} and some only {1, 2, 3, 4}. In round 2 the surviving processors B-multicast the newly received values, so by the end of round 2 every correct processor knows {0, 1, 2, 3, 4} and decides on the minimum, 0.

Example with f = 2 failures (f+1 = 3 rounds needed), first execution: in round 1 the first faulty processor (value 0) crashes while multicasting, so only some processors learn 0. In round 2 the surviving processors relay the newly received values, but a second processor crashes while doing so. In round 3 the remaining processors multicast any new values, so by the end of round 3 every correct processor knows {0, 1, 2, 3, 4} and decides on the minimum, 0.

Example with f = 2 failures (f+1 = 3 rounds needed), second execution: in round 1 the first faulty processor crashes while multicasting its value 0. In round 2 the surviving processors relay the newly received values, and at the end of this round all processes already know all the values. In round 3 a second processor fails, but no new values are learned in this round, so every correct processor still decides on the minimum, 0.

If there are f failures and f+1 rounds, then there is at least one round with no failed process. Example: with 5 failures and 6 rounds, at least one of the 6 rounds is failure-free.

In the algorithm, at the end of the round with no failure, every non-faulty process knows all the values of all other participating processes, and this knowledge doesn't change until the end of the algorithm.

Therefore, at the end of the round with no failure, everybody holds the same set of values and would decide the same value. However, we don't know the exact position of this round in advance, so we have to let the algorithm execute for f+1 rounds.

Validity of the algorithm: when all processes start with the same input value, the consensus is that value. This holds since the value decided by each process is some process's input value (the minimum of the inputs it received).

A Lower Bound
Theorem: Any f-resilient consensus algorithm requires at least f+1 rounds.

Proof sketch: assume for contradiction that f or fewer rounds are enough. Worst-case scenario: there is a process that fails in each round.

Worst-case scenario: in each round r = 1, 2, ..., f, one process fails just before completing its multicast, so it forwards its value a to only one other process. After f such rounds, a single process may know a and decide a, while all other processes may decide another value b. Therefore f rounds are not enough; at least f+1 rounds are needed.

Consensus in synchronous systems
- Up to f faulty processes.
- Duration of a round: the maximum delay of a B-multicast.
- Dolev & Strong, 1983: any algorithm to reach consensus despite up to f failures requires f+1 rounds.

Byzantine agreement: synchronous systems
- Lamport et al., 1982: no solution for N = 3, f = 1.
- Scenario 1: the commander p_1 is correct and sends v to both lieutenants, but the faulty lieutenant p_3 tells p_2 "p_1 said u". Scenario 2: the commander p_1 is faulty and sends different values to p_2 and p_3, which the lieutenants then faithfully relay to each other. The two scenarios look identical to p_2.
- Nothing can be done to improve a correct process' knowledge beyond the first stage: it cannot tell which process is faulty.
- Pease et al., 1982: no solution for N <= 3f (assuming private communication channels).

Byzantine agreement for N > 3f
- Example with N = 4, f = 1:
  - 1st round: the commander sends a value to each lieutenant.
  - 2nd round: each of the lieutenants sends the value it has received to each of its peers.
  - A lieutenant receives a total of (N - 2) + 1 values, of which (N - 2) are correct.
  - By applying majority(), the correct lieutenants compute the same value.
- In general, O(N^(f+1)) messages are needed with unsigned (oral) messages, and O(N^2) with signed messages.
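A sketch of the decision step for the oral-message case with N = 4, f = 1: each lieutenant applies a majority function to the commander's value plus the two relayed values. The `majority` helper and its default result are our own assumptions; the two scenarios in the usage example mirror the four-generals slide that follows.

```python
from collections import Counter

def majority(values, default=None):
    """Return the strict-majority value among the received values, or `default`
    (e.g. a conventional bottom / RETREAT value) if there is no majority."""
    value, count = Counter(values).most_common(1)[0]
    return value if count > len(values) / 2 else default

# Scenario 1 (faulty lieutenant p3): p2 receives v from the commander,
# v relayed by p4, and a lie u relayed by p3.
assert majority(["v", "u", "v"]) == "v"

# Scenario 2 (faulty commander p1): the lieutenants receive three different
# values, so they all compute the same default decision (bottom on the slide).
assert majority(["u", "v", "w"]) is None
```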

Four Byzantine Generals: N = 4, f = 1 in a synchronous DS
- Scenario 1 (lieutenant p_3 faulty): the commander sends v to every lieutenant, but p_3 relays wrong values to its peers. p_2 decides on majority(v, u, v) = v and p_4 decides on majority(v, v, w) = v.
- Scenario 2 (commander p_1 faulty): the commander sends three different values u, v, w to the lieutenants, who relay them faithfully. p_2, p_3 and p_4 each decide on majority(u, v, w) = ⊥ (no majority), so the correct processes still agree.

Asynchronous systems
- Solutions to the consensus and BG problems (and to IC) exist in synchronous systems.
- No algorithm can guarantee to reach consensus in an asynchronous system, even with one process crash failure.
- In an asynchronous system, processes can respond to messages at arbitrary times, so a crashed process is indistinguishable from a slow one.
- There is always some continuation of the processes' execution that avoids consensus being reached.

Impossibility of (deterministic) consensus in asynchronous systems
- M. J. Fischer, N. Lynch, and M. Paterson: "Impossibility of distributed consensus with one faulty process", J. ACM, 32(2), pp. 374-382, 1985.
- A crashed process cannot be distinguished from a slow one, not even with a 100% reliable communication network.
- There is always a chance that some continuation of the processes' execution avoids consensus being reached.

Contd.
- Note the word "guarantee" in the statement of the impossibility result.
- The result does not mean that processes can never reach consensus in an asynchronous system if one is faulty; it allows that consensus can be reached with some probability greater than zero.
- For example, despite the fact that our systems are often effectively asynchronous, transaction systems have been reaching consensus regularly for many years.