CS 425/ECE 428/CSE424 Distributed Systems (Fall 2009) Lecture 9 Consensus I Section 12.5.1-12.5.3 Klara Nahrstedt.



Acknowledgement
The slides during this semester are based on ideas and material from the following sources:
–Slides prepared by Professors M. Harandi, J. Hou, I. Gupta, N. Vaidya, Y-Ch. Hu, S. Mitra.
–Slides from Professor S. Ghosh's course at University of Iowa.

Administrative
MP1 posted September 8, Tuesday
–Deadline: September 25 (Friday), 4-6pm demonstrations

Plan for Today
Failure Models
Three Problems
–Consensus
–Byzantine Generals
–Interactive Consistency
Synchronous Setting

Failure Models
Crash failure: process ceases to execute
–Permanent
–Cause: e.g., power loss
–Variant: dead for a finite period of time, then resumes
Omission failure: process or communication channel fails to perform actions that it is supposed to do
–Communication omission failure: sender sends a sequence of messages but receiver does not receive some subset of the messages
»Cause: e.g., interference in medium
–Process omission failure: crash failure
Timing failures
–Messages do not arrive in time, computation takes longer than expected
–Cause: e.g., congestion, over-loading, garbage-collection

Failure Models
Transient failure: process jumps to arbitrary state and resumes normal execution
–Cause: e.g., gamma rays
Byzantine failure: arbitrary messages and transitions
–Cause: e.g., software bugs, malicious attacks

Definition of Consensus (C) Problem
N processes {0, 1, 2, …, N-1} try to agree
p_i begins in undecided state and proposes value v_i ∈ D
The p_i communicate by exchanging values
p_i sets its decision value d_i and enters decided state
Requirements
–Termination: eventually all correct processes decide
»i.e., each correct process sets its decision variable
–Agreement: decision value of all correct processes is the same
»i.e., if p_i and p_j are correct and decided, then d_i = d_j
–Integrity: if all correct processes proposed v, then any correct decided process has d_i = v
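The three requirements can be read as predicates over the outcome of a run. The following is a minimal sketch, not part of the original slides: the function name `check_consensus`, the dictionary layout, and the use of `None` for "still undecided" are all assumptions made for illustration.

```python
from typing import Dict, Hashable, Optional, Set


def check_consensus(proposed: Dict[int, Hashable],
                    decided: Dict[int, Optional[Hashable]],
                    correct: Set[int]) -> Dict[str, bool]:
    """Check termination, agreement and integrity over the correct processes.

    Assumption: decided[i] is None while p_i is still undecided,
    so None must not be a legal proposal value.
    """
    decisions = [decided.get(i) for i in correct]
    # Termination: every correct process has set its decision variable.
    termination = all(d is not None for d in decisions)
    # Agreement: the correct processes that decided all hold the same value.
    made = {d for d in decisions if d is not None}
    agreement = len(made) <= 1
    # Integrity: only constrains runs where all correct processes proposed the same v.
    proposals = {proposed[i] for i in correct}
    integrity = len(proposals) != 1 or made <= proposals
    return {"termination": termination,
            "agreement": agreement,
            "integrity": integrity}
```

Note that only the processes in `correct` are inspected; what a crashed or byzantine process proposed or decided does not affect any of the three properties.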

[Figure: Consensus for three processes]

Byzantine Generals (BG) Problem
N > 2 generals {0, 1, 2, …, N-1}
One of the generals is the commander, who issues attack or retreat commands to all the other generals
All generals try to agree about whether to attack or retreat
Some generals (including the commander) may be traitors (byzantine)
Requirements:
–Termination: all correct generals decide
–Agreement: if p_i and p_j are correct and decided, then d_i = d_j
–Integrity: if commander is correct, then all correct processes decide value issued by commander
If commander is correct, then integrity implies agreement

Interactive Consistency (IC) Problem
N processes {0, 1, 2, …, N-1} try to agree on a vector of values
p_i begins in undecided state and proposes a value v_i ∈ D
p_i sets its decision vector d_i and enters decided state
Requirements:
–Termination: all correct processes decide
–Agreement: the decision vector for all correct processes is the same
–Integrity: if p_i is correct, then for any correct process p_j, d_j[i] = v_i

C to BG to IC
How to solve IC from an algorithm for solving BG?
–Run BG N times, once with each process as commander
How to solve C from an algorithm for IC?
–If a majority of processes are correct, then solve IC and then apply the majority function to the decision vector
How to solve BG using an algorithm for C?
–Commander sends its proposed value to itself and to the other processes, which then run C on the values they received
How to solve RTO (reliable totally ordered) multicast from C and vice-versa, under crash failures only?
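The first two reductions can be sketched as plain function composition. This is an illustrative sketch only: `ideal_bg` is a hypothetical stand-in that models a failure-free BG run (every process decides the commander's value), and `None` is used to represent ⊥; neither comes from the slides.

```python
from collections import Counter
from typing import Callable, List, Optional

Value = str
# Assumed interface: bg(commander, value, n) returns the decision of each of the n processes.
BG = Callable[[int, Value, int], List[Value]]


def ideal_bg(commander: int, value: Value, n: int) -> List[Value]:
    """Hypothetical stand-in for a real BG algorithm in a failure-free run."""
    return [value] * n


def ic_from_bg(proposals: List[Value], bg: BG) -> List[List[Value]]:
    """Solve IC by running BG N times, once with each process as commander."""
    n = len(proposals)
    runs = [bg(i, proposals[i], n) for i in range(n)]
    # Decision vector of process j: entry i is what j decided in the run led by i.
    return [[runs[i][j] for i in range(n)] for j in range(n)]


def majority(values: List[Value]) -> Optional[Value]:
    """Return the strict-majority value, or None (standing for ⊥) if there is none."""
    value, count = Counter(values).most_common(1)[0]
    return value if count > len(values) / 2 else None


def c_from_ic(proposals: List[Value], bg: BG) -> List[Optional[Value]]:
    """Solve C by solving IC and applying the majority function to each vector."""
    return [majority(vector) for vector in ic_from_bg(proposals, bg)]
```

With proposals `["a", "b", "a"]`, every process obtains the vector `["a", "b", "a"]` and decides `"a"`. Because IC guarantees identical vectors at all correct processes, applying any deterministic function to the vector preserves agreement.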

Solving C with RTO-multicast
All processes form a group g
p_i performs RTO-multicast(v_i, g)
p_i sets d_i = m_i, where m_i is the first message delivered by RTO-multicast
–Termination is guaranteed by reliable multicast
–Agreement and integrity (validity) follow by definition of TO
Next: solving consensus using basic multicast in the case where up to f processes may crash
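A toy model makes the argument concrete. In this sketch, which is an assumption-laden illustration and not a multicast implementation, RTO-multicast is modelled as a single global delivery sequence that every process observes; the function names and the seeded shuffle are hypothetical.

```python
import random
from typing import Dict, Hashable, List, Tuple


def rto_delivery_order(proposals: Dict[int, Hashable],
                       seed: int = 0) -> List[Tuple[int, Hashable]]:
    """Hypothetical stand-in for RTO-multicast.

    Assumption: total ordering is modelled as one global sequence of
    (sender, value) messages, delivered identically at every process.
    """
    messages = sorted(proposals.items())
    random.Random(seed).shuffle(messages)  # the order is arbitrary but shared
    return messages


def consensus_via_rto(proposals: Dict[int, Hashable],
                      seed: int = 0) -> Dict[int, Hashable]:
    """Each process decides the value of the first message it delivers."""
    order = rto_delivery_order(proposals, seed)
    first_value = order[0][1]
    # All processes deliver the same sequence, so they all see the same first message.
    return {pid: first_value for pid in proposals}
```

Whichever order the multicast layer picks, all processes decide the same value, and that value is one of the proposals, which is why integrity holds when every proposal is the same v.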

Consensus in Synchronous Systems

Consensus in a synchronous system Dolev & Strong (1983)
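The algorithm listing for this slide did not survive in the transcript. As a hedged sketch, the code below simulates the usual f+1 round scheme for crash failures: in each round every live process multicasts the values it has not forwarded yet, and after f+1 rounds each process decides the minimum of the values it knows. The function name, the `crashes` schedule format, and the choice of minimum as the decision rule are assumptions for illustration.

```python
from typing import Dict, List, Optional, Set, Tuple


def sync_consensus(proposals: List[int], f: int,
                   crashes: Optional[Dict[int, Tuple[int, Set[int]]]] = None
                   ) -> Dict[int, int]:
    """Simulate f+1 synchronous rounds with crash failures.

    crashes[p] = (round, reached): p crashes during that round after its
    multicast reached only the processes in `reached` (assumed model).
    Returns the decision of every process that did not crash.
    """
    crashes = crashes or {}
    n = len(proposals)
    known = [{v} for v in proposals]   # all values p has seen so far
    fresh = [{v} for v in proposals]   # values p has not yet forwarded
    alive = set(range(n))
    for rnd in range(1, f + 2):        # rounds 1 .. f+1
        inbox: List[Set[int]] = [set() for _ in range(n)]
        for p in alive:
            crash = crashes.get(p)
            # A process crashing in this round reaches only part of the group.
            targets = crash[1] if crash and crash[0] == rnd else set(range(n))
            for q in targets:
                inbox[q] |= fresh[p]
        alive -= {p for p, (r, _) in crashes.items() if r == rnd}
        for p in alive:
            fresh[p] = inbox[p] - known[p]
            known[p] |= inbox[p]
    return {p: min(known[p]) for p in sorted(alive)}
```

For example, with proposals `[5, 1, 7]` and process 1 crashing in round 1 after reaching only process 0, running with `f = 1` gives both survivors the decision 1, whereas running only one round (`f = 0`) leaves process 0 with 1 and process 2 with 5. That gap is what the extra round closes.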

Examples
Example execution with no failures (f = 0)
Example execution with f = 2

Correctness of Dolev & Strong Algorithm
Termination: finite number of rounds, finite duration of each round
Agreement and integrity
–We will prove by contradiction that V_i[f+1] = V_j[f+1] for any two correct processes p_i and p_j
–Suppose V_i[f+1] ≠ V_j[f+1] with at most f crashes
»Then there is v ∈ V_i[f+1] with v not in V_j[f+1]; hence there is p_k that delivered v to p_i in round f+1 but crashed before delivering v to p_j
»Then v ∈ V_k[f] but v is not in V_j[f]; hence there is p_l that delivered v to p_k in round f but crashed before delivering v to p_j
»… and so on, all the way back to round 1
»Proceeding in this way, we infer at least one crash in each of the f+1 rounds, which implies f+1 crashes
»But we have assumed that at most f crashes can occur: contradiction

Byzantine Generals in Synchronous Systems

BG in Synchronous System
Assumptions
–Up to f of N processes may be Byzantine
–Synchronous implies
»Correct processes can detect absence of messages with timeout, but cannot conclude that sender has crashed
Is BG solvable?
–For N = 3f?
–For N = 3, f = 1?

Impossibility (no solution) with N = 3, f = 1
Lamport et al. (1982) considered three processes, one of which is Byzantine
No solution to achieve agreement
Example
–1:v means “1 says v”, 2:1:v means “2 says 1 says v”
–Two different scenarios appear identical to p_2
[Figure: two scenarios with p_1 (Commander), p_2 and p_3. Messages in the first: 1:v, 2:1:v, 3:1:u. Messages in the second: 1:w, 1:x, 2:1:w, 3:1:x. Faulty processes are shown coloured.]
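The indistinguishability argument can be checked mechanically. The sketch below assumes the standard reading of the two scenarios (the colouring that marks the faulty process is lost in this transcript): in the first the commander is correct and p_3 lies when relaying, in the second the commander is faulty and sends different values. A third, symmetric scenario is added to show what p_3 is forced to do. The function name `views` and the renaming w = v, x = u are illustrative choices.

```python
from typing import Dict, Tuple


def views(to_p2: str, to_p3: str, p2_relay: str, p3_relay: str
          ) -> Dict[str, Tuple[str, str]]:
    """Messages seen by each lieutenant: (from commander, relayed by the peer)."""
    return {
        "p2": (f"1:{to_p2}", f"3:1:{p3_relay}"),
        "p3": (f"1:{to_p3}", f"2:1:{p2_relay}"),
    }


# Scenario A: commander correct (sends v to both), p3 faulty (relays u).
scenario_a = views("v", "v", p2_relay="v", p3_relay="u")
# Scenario B: commander faulty (v to p2, u to p3), both lieutenants relay honestly.
scenario_b = views("v", "u", p2_relay="v", p3_relay="u")
# Scenario C: commander correct (sends u to both), p2 faulty (relays v).
scenario_c = views("u", "u", p2_relay="v", p3_relay="u")

# p2 cannot tell A from B, and p3 cannot tell C from B.
p2_confused = scenario_a["p2"] == scenario_b["p2"]
p3_confused = scenario_c["p3"] == scenario_b["p3"]
```

Integrity forces p_2 to decide v in scenario A and p_3 to decide u in scenario C. Since each sees exactly the same messages in scenario B, they decide v and u there as well, which violates agreement between two correct processes.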

Impossibility with N ≤ 3f (outline)
Pease et al. generalized the basic impossibility result
Simulation-based argument
–Impossibility shown by contradiction
–Assume there exists an algorithm for N ≤ 3f (e.g., N = 12, f = 4)
–Use the algorithm to solve BG for N = 3 and f = 1, thus reaching a contradiction
»Assume three processes {0, 7, 1}, each simulating the behavior of 4 generals
»Assume process 0 is faulty; then generals {0, 11, 5, 6} will generate byzantine failures. All other processes are correct
»Correctness of the simulated algorithm tells us that it terminates and that 1 and 7 satisfy integrity
»The 2 correct processes {1, 7} solve consensus in spite of the failure of 0
–Contradiction (since the N = 3, f = 1 case is unsolvable)

BG Algorithm for N ≥ 3f + 1 (example with N = 4, f = 1)
Rounds
1. Commander sends value to lieutenants
2. Lieutenants send value to peers
[Figure: two scenarios with p_1 (Commander), p_2, p_3 and p_4. Faulty processes are shown coloured.]
First scenario (messages 1:v, 2:1:v, 3:1:u, 3:1:w, 4:1:v)
–p_2 decides Majority(v, v, u) = v
–p_4 decides Majority(v, v, w) = v
Second scenario (messages 1:u, 1:w, 1:v, 2:1:u, 3:1:w, 4:1:v)
–p_2 decides Majority(u, v, w) = ⊥
–p_3 decides Majority(u, v, w) = ⊥
–p_4 decides Majority(u, v, w) = ⊥
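The two-round exchange and the majority decisions can be replayed in a few lines. This is a sketch for the N = 4, f = 1 case only, with message tables written out by hand; the names `bg_two_rounds`, `round1` and `round2` are hypothetical, `None` stands for ⊥, and which process is faulty in each scenario is inferred from the decisions listed on the slide.

```python
from collections import Counter
from typing import Dict, List, Optional, Tuple


def majority(values: List[str]) -> Optional[str]:
    """Strict majority of the received values, or None (standing for ⊥)."""
    value, count = Counter(values).most_common(1)[0]
    return value if count > len(values) / 2 else None


def bg_two_rounds(round1: Dict[int, str],
                  round2: Dict[Tuple[int, int], str]) -> Dict[int, Optional[str]]:
    """round1[l] is what the commander sent to lieutenant l.
    round2[(s, r)] is what lieutenant s relayed to lieutenant r.
    """
    decisions = {}
    for l in round1:
        received = [round1[l]] + [round2[(s, l)] for s in round1 if s != l]
        decisions[l] = majority(received)
    return decisions


# First scenario: commander p1 is correct, p3 is faulty and relays u and w.
first = bg_two_rounds(
    {2: "v", 3: "v", 4: "v"},
    {(2, 3): "v", (2, 4): "v", (3, 2): "u", (3, 4): "w", (4, 2): "v", (4, 3): "v"},
)
# Second scenario: commander p1 is faulty and sends u, w, v; lieutenants relay honestly.
second = bg_two_rounds(
    {2: "u", 3: "w", 4: "v"},
    {(2, 3): "u", (2, 4): "u", (3, 2): "w", (3, 4): "w", (4, 2): "v", (4, 3): "v"},
)
```

In the first scenario the correct lieutenants p_2 and p_4 both decide v, matching the correct commander. In the second, all three lieutenants see one copy each of u, v and w, so all decide ⊥ and agreement still holds.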

Summary
BG algorithm for N ≥ 3f + 1 by Pease et al.
This algorithm can account for omissions
–Timeout (synchronous) and assume that the sent value was ⊥
We cannot solve BG (synchronous) if a third or more of the generals are byzantine (N ≤ 3f)
We can measure efficiency of agreement algorithms based on the
–Number of (synchronous) rounds of communication needed
–Number of messages
More impossibility results
–Read the FLP paper (Fischer, Lynch, Paterson), 1983