
Presentation transcript:

1 AGREEMENT PROTOCOLS

2 Introduction
Processes (sites) in a distributed system often compete as well as cooperate to achieve a common goal, so mutual trust and agreement are required.
In a distributed database, for example, data managers must decide whether to commit or abort a transaction.
When there are no failures, reaching agreement is easy.
When failures occur, processes must exchange their values with other processes and relay the values received from others several times to isolate the effect of faulty processors.
Agreement protocols help processes reach agreement in the presence of failures.

3 System Model
Agreement problems have been studied under the following system model:
1. There are n processors, of which at most m can be faulty.
2. Processors communicate directly with one another by message passing.
3. The receiver of a message knows the identity of the sender.
4. The communication medium is reliable.
5. Only processors are prone to failures.

4 Synchronous vs. Asynchronous Computation
Synchronous computation:
1. Processes run in lock step: in each step a process receives the messages sent to it earlier, performs a computation, and sends messages to other processes.
2. A step of a synchronous computation is called a round.
Asynchronous computation:
1. Computation does not proceed in lock step.
2. A process can send and receive messages and perform computation at any time.
A synchronous model of computation is assumed here.

5 Model of Processor Failures
A processor can fail in three modes:
1. Crash fault: the processor stops and never resumes operation.
2. Omission fault: the processor omits to send messages to some processors.
3. Malicious fault: also known as a Byzantine fault. The processor may send fictitious values or messages to other processors to confuse them. Such faults are hard to detect and correct.

6 Authenticated vs. Non-Authenticated Messages
Authenticated messages:
1. Also known as signed messages.
2. A processor cannot forge or change a received message without detection.
3. A processor can verify the authenticity of a received message.
4. Reaching agreement is easier in this case.
Non-authenticated messages:
1. Also known as oral messages.
2. A processor can forge or change a received message and claim to have received it from others.
3. A processor cannot verify the authenticity of a received message.

7 Performance Aspects of Agreement Protocols
The following metrics are used:
1. Time: the number of rounds needed to reach agreement.
2. Message traffic: the number of messages exchanged to reach agreement.
3. Storage overhead: the amount of information that must be stored at the processors during execution of the protocol.

8 Classification of Agreement Problems
There are three well-known agreement problems in distributed systems:
1. Byzantine agreement problem: a single value is to be agreed upon. The value is initialized by an arbitrary processor (the source), and all non-faulty processors must agree on that value.
2. Consensus problem: every processor has its own initial value, and all non-faulty processors must agree on a single common value.
3. Interactive consistency problem: every processor has its own initial value, and all non-faulty processors must agree on a set of common values.

9 Classification of Agreement Problems (contd.)
In all three problems, all non-faulty processors must reach agreement.
In the Byzantine agreement and consensus problems, agreement is on a single value.
In the interactive consistency problem, agreement is on a set of common values.
In the Byzantine agreement problem, only one processor initializes the value, whereas in the other two problems every processor has its own initial value.

10 Byzantine Agreement Problem
A source processor (any arbitrarily chosen processor) broadcasts its value to all other processors. A solution must meet the following objectives:
1. Agreement: all non-faulty processors agree on the same value.
2. Validity: if the source is non-faulty, the common agreed value must be the value supplied by the source.
If the source is faulty, the non-faulty processors may agree on any common value.
The value agreed upon by faulty processors is irrelevant.
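The two objectives can be read as a predicate over the outcome of a protocol run. The following is a minimal sketch of such a check, not part of the original slides; the function name `check_byzantine_agreement` and the dictionary-based representation of decisions are assumptions made for illustration.

```python
from typing import Dict, Hashable, Set


def check_byzantine_agreement(decisions: Dict[int, Hashable],
                              faulty: Set[int],
                              source: int,
                              source_value: Hashable) -> bool:
    """Check Agreement and Validity for one (hypothetical) protocol run.

    decisions maps each processor id to the value it decided on.
    Decisions of faulty processors are ignored, as they are irrelevant.
    """
    nonfaulty_values = [v for p, v in decisions.items() if p not in faulty]
    # Agreement: all non-faulty processors hold the same value.
    agreement = len(set(nonfaulty_values)) <= 1
    # Validity: only binding when the source itself is non-faulty.
    validity = source in faulty or all(v == source_value
                                       for v in nonfaulty_values)
    return agreement and validity


# Non-faulty source 0 sent "attack"; processor 3 is faulty.
ok_run = check_byzantine_agreement(
    {1: "attack", 2: "attack", 3: "retreat"}, {3}, 0, "attack")
# Faulty source: any common value among non-faulty processors is acceptable.
faulty_source_run = check_byzantine_agreement(
    {1: "retreat", 2: "retreat", 3: "retreat"}, {0}, 0, "attack")
# Two non-faulty processors disagree: Agreement is violated.
bad_run = check_byzantine_agreement(
    {1: "attack", 2: "retreat"}, set(), 0, "attack")
```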

11 Consensus Problem
Every processor broadcasts its initial value to all other processors. Initial values may differ between processors. A protocol must meet the following objectives:
1. Agreement: all non-faulty processors agree on the same single value.
2. Validity: if the initial value of every non-faulty processor is v, the value agreed upon by all non-faulty processors must be v.
If the initial values of the non-faulty processors differ, the non-faulty processors may agree on any common value.
The value agreed upon by faulty processors is irrelevant.
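As an illustration of how the consensus objectives differ from the Byzantine agreement ones, here is a small sketch of a checker. It is an assumption-laden example: the name `check_consensus` and the data layout (dictionaries keyed by processor id) are hypothetical and not from the slides.

```python
from typing import Dict, Hashable, Set


def check_consensus(initial: Dict[int, Hashable],
                    decisions: Dict[int, Hashable],
                    faulty: Set[int]) -> bool:
    """Check Agreement and Validity of the consensus problem for one run."""
    nonfaulty = [p for p in decisions if p not in faulty]
    decided = {decisions[p] for p in nonfaulty}
    initials = {initial[p] for p in nonfaulty}
    # Agreement: a single decided value among non-faulty processors.
    agreement = len(decided) <= 1
    # Validity: binding only if all non-faulty initial values are equal (v);
    # then the decided value must be that same v.
    validity = len(initials) != 1 or decided <= initials
    return agreement and validity


# Non-faulty processors 0 and 1 both start with 1 and decide 1.
valid_run = check_consensus({0: 1, 1: 1, 2: 0}, {0: 1, 1: 1, 2: 0}, {2})
# Same initial values, but the non-faulty processors decide 0: invalid.
invalid_run = check_consensus({0: 1, 1: 1, 2: 0}, {0: 0, 1: 0, 2: 0}, {2})
# Non-faulty initial values differ, so any common value is acceptable.
mixed_run = check_consensus({0: 1, 1: 0, 2: 0}, {0: 0, 1: 0, 2: 0}, {2})
```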

12 Interactive Consistency Problem
Every processor broadcasts its initial value to all other processors. Initial values may differ between processors. A protocol must meet the following objectives:
1. Agreement: all non-faulty processors agree on the same vector (v1, v2, ..., vn).
2. Validity: if the ith processor is non-faulty and its initial value is vi, the ith value agreed upon by all non-faulty processors must be vi.
If the jth processor is faulty, the non-faulty processors may agree on any common value vj.
The value agreed upon by faulty processors is irrelevant.
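The vector form of the objectives can be sketched the same way. In this hypothetical example, processors are numbered from 0 and `vectors[p]` is the vector processor p decided on; both the function name and this representation are assumptions for illustration only.

```python
from typing import Dict, Hashable, List, Set


def check_interactive_consistency(initial: List[Hashable],
                                  vectors: Dict[int, List[Hashable]],
                                  faulty: Set[int]) -> bool:
    """Check Agreement and Validity of interactive consistency for one run."""
    nonfaulty = [p for p in vectors if p not in faulty]
    # Agreement: every non-faulty processor holds the same vector.
    agreement = len({tuple(vectors[p]) for p in nonfaulty}) <= 1
    # Validity: entry i must equal processor i's initial value
    # whenever processor i is non-faulty. Entries for faulty processors
    # are unconstrained beyond Agreement.
    validity = all(vectors[p][i] == initial[i]
                   for p in nonfaulty for i in nonfaulty)
    return agreement and validity


initial_values = [5, 6, 7]  # processor 2 is faulty in the runs below
consistent = check_interactive_consistency(
    initial_values, {0: [5, 6, 0], 1: [5, 6, 0], 2: [9, 9, 9]}, {2})
disagree = check_interactive_consistency(
    initial_values, {0: [5, 6, 0], 1: [5, 6, 1]}, {2})
wrong_entry = check_interactive_consistency(
    initial_values, {0: [5, 8, 0], 1: [5, 8, 0]}, {2})
```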

13 Solution for the Byzantine Agreement Problem
The problem was first defined and solved by Lamport et al.
The source broadcasts its initial value to all other processors.
Processors send their values to other processors and also relay received values to others.
During execution, faulty processors may try to confuse the others by sending conflicting values.
If there are too many faulty processors, they can prevent the non-faulty processors from reaching agreement, so the number of faulty processors must not exceed a certain limit.
Pease et al. showed that in a fully connected network with n processors, agreement cannot be reached if the number of faulty processors m exceeds (n-1)/3.
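The bound can be turned into a quick feasibility check. The helper names below (`max_tolerable_faults`, `agreement_possible`) are hypothetical; the sketch simply evaluates the condition that m must not exceed (n-1)/3, using integer division because m is a whole number.

```python
def max_tolerable_faults(n: int) -> int:
    """Largest m that does not exceed (n - 1) / 3, i.e. n >= 3m + 1."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return (n - 1) // 3


def agreement_possible(n: int, m: int) -> bool:
    """True if m faulty processors out of n stay within the bound."""
    return 0 <= m <= max_tolerable_faults(n)


three = max_tolerable_faults(3)   # three processors tolerate no faults
four = max_tolerable_faults(4)    # four processors tolerate one fault
seven = max_tolerable_faults(7)   # seven processors tolerate two faults
```

Under this bound, three processors cannot tolerate even one faulty processor, while four can.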

14 Lamport-Shostak-Pease Algorithm
This algorithm is also known as the Oral Message algorithm OM(m), where m is the maximum number of faulty processors tolerated, n is the number of processors, and n >= 3m+1.
The algorithm is defined recursively as follows:
Algorithm OM(0)
1. The source processor sends its value to every processor.
2. Each processor uses the value it receives from the source. [If no value is received, the default value 0 is used.]
Algorithm OM(m), m > 0
1. The source processor sends its value to every processor.

15 Lamport-Shostak-Pease Algorithm (contd.)
2. For each i, let vi be the value processor i receives from the source. [The default value 0 is used if no value is received.]
3. Processor i acts as the new source and initiates Algorithm OM(m-1), in which it sends the value vi to each of the n-2 other processors.
4. For each i and each j (j not equal to i), let vj be the value processor i received from processor j in step 3. Processor i uses the value majority(v1, v2, ..., vn-1).
The function majority(v1, v2, ..., vn-1) returns the majority value if one exists; otherwise it returns the default value 0.
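The recursive definition of OM(m) can be sketched as a small simulation. This is an illustrative sketch under stated assumptions, not the authors' implementation: messages are always delivered (so the "no value received" default is not exercised), a faulty sender's behaviour is modelled by a caller-supplied function `lie(sender, receiver, true_value)`, and "majority" is taken to mean a strict majority. All names (`om`, `majority`, `lie`, `DEFAULT`) are hypothetical.

```python
from collections import Counter
from typing import Callable, Dict, List, Set

DEFAULT = 0  # default value named in the algorithm description

# lie(sender, receiver, true_value) -> value a faulty sender transmits
Lie = Callable[[int, int, int], int]


def majority(values: List[int]) -> int:
    """Return the strict-majority value, or DEFAULT if none exists."""
    value, count = Counter(values).most_common(1)[0]
    return value if 2 * count > len(values) else DEFAULT


def om(m: int, source: int, value: int, receivers: List[int],
       faulty: Set[int], lie: Lie) -> Dict[int, int]:
    """Simulate OM(m); return the value each receiver ends up using."""
    # Step 1: the source sends its value (a faulty source may send anything).
    received = {r: lie(source, r, value) if source in faulty else value
                for r in receivers}
    if m == 0:
        return received  # OM(0): use the value received from the source
    # Step 3: each receiver j relays its value to the others via OM(m-1).
    relayed = {}
    for j in receivers:
        others = [r for r in receivers if r != j]
        relayed[j] = om(m - 1, j, received[j], others, faulty, lie)
    # Step 4: each receiver i takes the majority of its own value
    # and the values obtained from every other receiver j.
    return {i: majority([received[i]] +
                        [relayed[j][i] for j in receivers if j != i])
            for i in receivers}


# n = 4, m = 1, non-faulty source 0 with value 1, faulty processor 2.
loyal_source = om(1, 0, 1, [1, 2, 3], faulty={2}, lie=lambda s, r, v: 7 + r)
# n = 4, m = 1, faulty source 0 sends a different value to each processor.
traitor_source = om(1, 0, 1, [1, 2, 3], faulty={0}, lie=lambda s, r, v: r)
```

In the first run the non-faulty processors 1 and 3 both settle on the source's value 1. In the second run no value has a majority, so every processor falls back to the default 0, which still satisfies agreement.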

16 Lamport-Shostak-Pease Algorithm (contd.)

17 Lamport-Shostak-Pease Algorithm (contd.)

18 Lamport-Shostak-Pease Algorithm (contd.)

19 Byzantine agreement cannot be reached among three processors if one processor is faulty
[Figure: two scenarios with processors P0, P1, and P2, where P0 is the source processor. First panel: P0 is non-faulty. Second panel: P0 is faulty.]
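One way to see why three processors are not enough is to compare what each non-source processor observes in different scenarios. The sketch below is an illustration under assumptions, not taken from the slides: values are 0 or 1, each of P1 and P2 sees only the value sent by P0 and the value relayed by the other, and the helper `views` is hypothetical.

```python
def views(p0_sends, p1_relays, p2_relays):
    """Return (P1's view, P2's view); a view is (from P0, relayed by peer)."""
    to_p1, to_p2 = p0_sends
    p1_view = (to_p1, p2_relays(to_p2))
    p2_view = (to_p2, p1_relays(to_p1))
    return p1_view, p2_view


def honest(v):
    """A non-faulty processor relays exactly what it received."""
    return v


# Scenario A: P0 non-faulty sends 1 to both; faulty P2 claims it got 0.
a_p1, _ = views((1, 1), honest, lambda v: 0)
# Scenario B: P0 faulty sends 1 to P1 and 0 to P2; P1 and P2 are honest.
b_p1, b_p2 = views((1, 0), honest, honest)
# Scenario C: P0 non-faulty sends 0 to both; faulty P1 claims it got 1.
_, c_p2 = views((0, 0), lambda v: 1, honest)
```

P1 cannot tell scenario A from B, and validity in A forces it to decide 1. P2 cannot tell scenario C from B, and validity in C forces it to decide 0. In scenario B both are non-faulty yet decide differently, so agreement fails.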

20 Byzantine Consensus: n > 3f
Oral Messages algorithm, OM(f)
Consists of f+1 “phases”
Algorithm OM(0) is the “base case” (no faults)
1) Commander sends value to every lieutenant
2) Each lieutenant uses the value received from the commander, or the default “retreat” if no value was received
Recursive algorithm handles up to f faults

21 OM(f): Recursive Algorithm
1) Commander sends value to every lieutenant
2) For each lieutenant i, let v_i be the value i received from the commander, or “retreat” if no value was received. Lieutenant i acts as commander in Alg. OM(f-1) to send v_i to each of the n-2 other lieutenants
3) For each i, and each j not equal to i, let v_j be the value lieutenant i received from lieutenant j in step (2) (using Alg. OM(f-1)), or else “retreat” if no such value was received. Lieutenant i uses the value majority(v_1, …, v_(n-1)).
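The recursion also determines the time and message-traffic cost of the algorithm. As a rough sketch (the function names are hypothetical, and the count assumes every message in every recursive call is actually sent): OM(f) with n-1 lieutenants sends n-1 messages and starts n-1 instances of OM(f-1), each among n-2 lieutenants, and so on down to OM(0).

```python
def om_rounds(f: int) -> int:
    """Number of phases (rounds) used by OM(f)."""
    return f + 1


def om_message_count(n: int, f: int) -> int:
    """Total messages sent by OM(f) among n generals.

    Sums (n-1) + (n-1)(n-2) + ... + (n-1)(n-2)...(n-1-f),
    one term per level of the recursion.
    """
    total, term = 0, 1
    for k in range(f + 1):
        term *= n - 1 - k   # messages per instance times number of instances
        total += term
    return total


small = om_message_count(4, 1)   # 3 + 3*2
larger = om_message_count(7, 2)  # 6 + 6*5 + 6*5*4
```

The message count grows roughly as n to the power f+1, which is why the oral-message algorithm becomes expensive as the number of tolerated faults increases.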

22 Example: f = 1, n = 4
Loyal general, 1 traitor lieutenant
[Figure: Commander 1 sends v to lieutenants L2, L3, and L4. L3 and L4 relay v to the others. Traitor L2 sends x to L3 and y to L4.]
Step 1: Commander sends the same value, v, to all
Step 2: Each of L2, L3, L4 executes OM(0) as commander, but L2 sends arbitrary values
Step 3: Decide. L3 has {v,v,x}, L4 has {v,v,y}. Both choose v.

23 Example: f = 1, n = 4
Traitor general, all lieutenants loyal
[Figure: Commander 1 sends a different value (x, y, z) to each of L2, L3, and L4; each lieutenant relays the value it received to the other two.]
Step 1: Commander sends a different value, x, y, z, to each
Step 2: Each of L2, L3, L4 executes OM(0) as commander, sending the value it received
Step 3: Decide. L2 has {x,y,z}, L3 has {x,y,z}, L4 has {x,y,z}.
All loyal lieutenants get the same result.

24 Applications of Agreement Algorithms
1. Fault-tolerant clock synchronization
Distributed systems require physical clocks to be synchronized.
Physical clocks are subject to drift.
Agreement protocols can help the sites reach a common clock value.
2. Atomic commit in distributed database systems (DDBS)
DDBS sites must agree on whether to commit or abort a transaction.
Agreement protocols can help the sites reach consensus.