Fault-Tolerant Broadcast Terminology: broadcast(m) a process broadcasts a message to the others deliver(m) a process delivers a message to itself 1.

Slides:

Advertisements

Similar presentations

Impossibility of Distributed Consensus with One Faulty Process

Advertisements

CS425/CSE424/ECE428 – Distributed Systems – Fall 2011 Material derived from slides by I. Gupta, M. Harandi, J. Hou, S. Mitra, K. Nahrstedt, N. Vaidya.

DISTRIBUTED SYSTEMS II FAULT-TOLERANT BROADCAST Prof Philippas Tsigas Distributed Computing and Systems Research Group.

A General Characterization of Indulgence R. Guerraoui EPFL joint work with N. Lynch (MIT)

CSE 486/586, Spring 2014 CSE 486/586 Distributed Systems Reliable Multicast Steve Ko Computer Sciences and Engineering University at Buffalo.

CS542 Topics in Distributed Systems Diganta Goswami.

Failure Detection The ping-ack failure detector in a synchronous system satisfies – A: completeness – B: accuracy – C: neither – D: both.

Distributed Computing 8. Impossibility of consensus Shmuel Zaks ©

Sliding window protocol The sender continues the send action without receiving the acknowledgements of at most w messages (w > 0), w is called the window.

Failure detector The story goes back to the FLP’85 impossibility result about consensus in presence of crash failures. If crash can be detected, then consensus.

Byzantine Generals Problem: Solution using signed messages.

Failure Detectors. Can we do anything in asynchronous systems? Reliable broadcast –Process j sends a message m to all processes in the system –Requirement:

LEADER ELECTION CS Election Algorithms Many distributed algorithms need one process to act as coordinator – Doesn’t matter which process does the.

1 Principles of Reliable Distributed Systems Lecture 3: Synchronous Uniform Consensus Spring 2006 Dr. Idit Keidar.

Distributed Algorithms: Asynch R/W SM Computability Eli Gafni, UCLA Summer Course, CRI, Haifa U, Israel.

 Idit Keidar, Principles of Reliable Distributed Systems, Technion EE, Spring Principles of Reliable Distributed Systems Lecture 7: Failure Detectors.

Asynchronous Consensus (Some Slides borrowed from ppt on Web.(by Ken Birman) )

1 Fault-Tolerant Consensus. 2 Failures in Distributed Systems Link failure: A link fails and remains inactive; the network may get partitioned Crash:

Outline Why distributed computing? Atomic Broadcast The atom system Relevance for e-textiles What’s next? Q&A.

Idit Keidar, Principles of Reliable Distributed Systems, Technion EE, Spring Principles of Reliable Distributed Systems Lecture 5: Synchronous Uniform.

1 Principles of Reliable Distributed Systems Lecture 5: Failure Models, Fault-Tolerant Broadcasts and State-Machine Replication Spring 2005 Dr. Idit Keidar.

Idit Keidar, Principles of Reliable Distributed Systems, Technion EE, Spring Principles of Reliable Distributed Systems Lecture 3: Fault-Tolerant.

Distributed Systems Fall 2009 Replication Fall 20095DV0203 Outline Group communication Fault-tolerant services –Passive and active replication Highly.

Distributed systems Module 2 -Distributed algorithms Teaching unit 1 – Basic techniques Ernesto Damiani University of Bozen Lesson 4 – Consensus and reliable.

Distributed systems Module 2 -Distributed algorithms Teaching unit 1 – Basic techniques Ernesto Damiani University of Bozen Lesson 2 – Distributed Systems.

Ordered Communication. Define guarantees about the order of deliveries inside group of processes Type of ordering: Deliveries respect the FIFO ordering.

Aran Bergman & Eddie Bortnikov & Alex Shraer, Principles of Reliable Distributed Systems, Spring Principles of Reliable Distributed Systems Recitation.

Aran Bergman, Principles of Reliable Distributed Systems, Technion EE, Spring Principles of Reliable Distributed Systems Recitation 5: Reliable.

Distributed Algorithms: Agreement Protocols. Problems of Agreement l A set of processes need to agree on a value (decision), after one or more processes.

Distributed Systems Tutorial 4 – Solving Consensus using Chandra-Toueg’s unreliable failure detector: A general Quorum-Based Approach.

On the Cost of Fault-Tolerant Consensus When There are no Faults Idit Keidar & Sergio Rajsbaum Appears in SIGACT News; MIT Tech. Report.

Systems of Distributed systems Module 2 - Distributed algorithms Teaching unit 2 – Properties of distributed algorithms Ernesto Damiani University of Bozen.

Distributed Systems Terminating Reliable Broadcast Prof R. Guerraoui Distributed Programming Laboratory.

 Idit Keidar, Principles of Reliable Distributed Systems, Technion EE, Spring Principles of Reliable Distributed Systems Lecture 7: Failure Detectors.

State Machines CS 614 Thursday, Feb 21, 2002 Bill McCloskey.

1 A Modular Approach to Fault-Tolerant Broadcasts and Related Problems Author: Vassos Hadzilacos and Sam Toueg Distributed Systems: 526 U1580 Professor:

Distributed Algorithms – 2g1513 Lecture 9 – by Ali Ghodsi Fault-Tolerance in Distributed Systems.

Fault-Tolerant Broadcast Terminology: broadcast(m) a process broadcasts a message to the others deliver(m) a process delivers a message to itself 1.

Chapter 14 Asynchronous Network Model by Mikhail Nesterenko “Distributed Algorithms” by Nancy A. Lynch.

Total Order Broadcast and Multicast Algorithms: Taxonomy and Survey (Paper by X. Défago, A. Schiper, and P. Urbán) ACM computing Surveys, Vol. 36,No 4,

Reliable Communication in the Presence of Failures Based on the paper by: Kenneth Birman and Thomas A. Joseph Cesar Talledo COEN 317 Fall 05.

Consensus and Its Impossibility in Asynchronous Systems.

Group Communication Group oriented activities are steadily increasing. There are many types of groups:  Open and Closed groups  Peer-to-peer and hierarchical.

Farnaz Moradi Based on slides by Andreas Larsson 2013.

Approximation of δ-Timeliness Carole Delporte-Gallet, LIAFA UMR 7089, Paris VII Stéphane Devismes, VERIMAG UMR 5104, Grenoble I Hugues Fauconnier, LIAFA.

CS 425/ECE 428/CSE424 Distributed Systems (Fall 2009) Lecture 9 Consensus I Section Klara Nahrstedt.

Distributed systems Consensus Prof R. Guerraoui Distributed Programming Laboratory.

Sliding window protocol The sender continues the send action without receiving the acknowledgements of at most w messages (w > 0), w is called the window.

Chap 15. Agreement. Problem Processes need to agree on a single bit No link failures A process can fail by crashing (no malicious behavior) Messages take.

SysRép / 2.5A. SchiperEté The consensus problem.

Replication and Group Communication. Management of Replicated Data FE Requests and replies C Replica C Service Clients Front ends managers RM FE RM Instructor’s.

DISTRIBUTED SYSTEMS II FAULT-TOLERANT BROADCAST Prof Philippas Tsigas Distributed Computing and Systems Research Group.

Failure Detectors n motivation n failure detector properties n failure detector classes u detector reduction u equivalence between classes n consensus.

Alternating Bit Protocol S R ABP is a link layer protocol. Works on FIFO channels only. Guarantees reliable message delivery with a 1-bit sequence number.

Fault tolerance and related issues in distributed computing Shmuel Zaks GSSI - Feb

1 Fault-Tolerant Consensus. 2 Communication Model Complete graph Synchronous, network.

Group Communication A group is a collection of users sharing some common interest.Group-based activities are steadily increasing. There are many types.

1 AGREEMENT PROTOCOLS. 2 Introduction Processes/Sites in distributed systems often compete as well as cooperate to achieve a common goal. Mutual Trust/agreement.

Reliable multicast Tolerates process crashes. The additional requirements are: Only correct processes will receive multicasts from all correct processes.

Distributed systems II Fault-Tolerant Broadcast

CSE 486/586 Distributed Systems Reliable Multicast --- 1

Coordination and Agreement

Computer Science 425 Distributed Systems CS 425 / ECE 428 Fall 2013

Alternating Bit Protocol

Distributed Consensus

Agreement Protocols CS60002: Distributed Systems

EEC 688/788 Secure and Dependable Computing

EEC 688/788 Secure and Dependable Computing

Distributed Systems Terminating Reliable Broadcast

CSE 486/586 Distributed Systems Reliable Multicast --- 1

Presentation transcript:

Fault-Tolerant Broadcast Terminology: broadcast(m) a process broadcasts a message to the others deliver(m) a process delivers a message to itself 1

Atomic Broadcast Models: Synchronous vs. asynchronous Types of process failures Types of communication failures Network topology Deterministic vs. randomized 2

Reliable Broadcast Three conditions Agreement: all correct processes receive same set of messages Validity: set of messages received by correct processes includes all messages sent by correct processes Integrity: each correct process P delivers a message from correct process Q at most once, and only if Q actually sent it 3

Reliable Broadcast What about faulty processes? Definition: A property is uniform if faulty processes satisfy it as well. Uniform agreement: If a process (correct or faulty) delivers m, then all corect processes eventually deliver m. Uniform integrity: For every message m, every process (correct or not) delivers m at most once, and only if some process broadcast m 4

Reliable Broadcast If a process fails in the middle of a broadcast, either every correct process gets the message or mone does. 5

Reliable Broadcast How can we implement Reliable Broadcast? Model Asynchronous Benign process and link failures only No network partitions 6

Reliable Broadcast Assume we have send(m) and receive(m) primitives Transmit and send messages across a link If P sends m to Q, and link correct, then Q eventually receives m For all m, Q receives m at most once from P, and only if P actually sent m 7

Reliable Broadcast R-broadcast(m) uniquely tag m with sender and sequence number send(m) to all neighbours (including self) end R-broadcast R-deliver(m) upon receive(m) do if i have not already delivered m then if I am not the sender of m then send m to all neighbours endif deliver(m) endif end R-deliver 8

Reliable Broadcast In an asynchronous system Where every two correct processes are connected via a path that never fails, Then the previous algorithm implements reliable broadcast with uniform integrity: For every message m, every process (correct or not) delivers m at most once, and only if some process broadcast m. 9

Reliable Broadcast In an asynchronous system where every two correct processes are connected via a path that never fails, and only receive omissions occur, then the algorithm satisfies uniform agreement: If a process (correct or faulty) delivers m, then all correct processes eventually deliver m. 10

FIFO Broadcast Same as reliable, plus All messages broadcast by same sender delivered in order sent 11

FIFO Broadcast msgBag=0 Next[Q]=1 for all processes Q F-broadcast(m) R-broadcast(m) F-deliver(m) upon R-deliver(m) do Q := sender(m) msgBag := msgBagU{m} while ( m´ in msgBag : sender(m´)=Q and seq ≠ (m´) = next[Q]) do F-deliver(m´) next[Q] := next[Q]+1 msgBag:=msqBag-{m´} endwhile 12

FIFO Broadcast Theorem 1: Given a reliable broadcast algorithm this algorithm is uniform FIFO. Theorem 2: if the reliable broadcast algorithm satisfies uniform agreement, so does this algorithm. 13

Limitations of FIFO Broadcast Scenario: User A broadcasts a message to a mailing list/Board B delivers that article B broadcasts reply C delivers B’s response without A´s original message and misinterprets the message 14

Causal Broadcast Same as FIFO, plus Messages must be delivered in causal precedence: A broadcasts article m B delivers article m B broadcasts m´ C does not deliver m´ until it has delivered m 15

Causal Broadcast prevDel is sequence of messages that P C-delivered since its last C-broadcast C-broadcast(m) F-broadcast(prevDel●m) prevDel:=Ø C-deliever(m) upon F-deliever(m1,m2,...,ml) do for i in 1..l do if P has not previously C-delivered mi then C-deliver(mi) prevDel:=prevDel●mi 16

Causal Broadcast Theorem 1: If the FIFO broadcast algorithm is Uniform FIFO, this is a uniform causal broadcast algorithm. Theorem 2: if the FIFO broadcast satisfies Uniform Agreement, so does this one. 17

Limitation of Causal Broadcast Causal broadcast does not impose any order on unrelated messages. Two replicas can deliver operations/request in different order. 18

Atomic Broadcast Requires that all correct processes deliver all messages in the same order. Implies that all correct processes see the same view of the world. 19

Atomic Broadcast Theorem: Atomic broadcast is impossible in asynchronous systems. Equivalent to consensus problem. 20

Review of Consensus

FLP Theorem: Consensus is impossible in any asynchronous system if one process can halt. [Fisher, Lynch, Peterson 1985] 22

Atomic Broadcast Theorem 1: Any atomic broadcast algorithm solves consensus. Everybody does an Atomic Broadcast Decides first value delivered Theorem 2: Atomic broadcast is impossible in any asynchronous system if one process can halt. 23

Atomic Broadcast Consensus is solvable in: Synchronous systems (we had discusses such an algorthm that works in f+1 rounds) Certain semi-synchronous systems Consensus is also solvable in Asynchronous systems with randomization Asynchronous systems with failure-detectors 24