Distributed Systems: Group Membership and View Synchronous Communication. Prof R. Guerraoui, Distributed Programming Laboratory

Group Membership (illustration): processes A, B, and C, where A asks "Who is there?"

Group Membership
In some distributed applications, processes need to know which processes are participating in the computation and which are not
Failure detectors provide such information; however, that information is not coordinated (see next slide), even if the failure detector is perfect

Perfect Failure Detector (example timeline): p2 and p3 crash; p1 and p4 each eventually suspect both, but at different times and in different orders (e.g., one process first suspects only p2, the other first suspects only p3)

Group Membership (example timeline): p2 and p3 crash; p1 and p4 both install the same view V1 = (p1,p4)

Group Membership
To illustrate the concept, we focus here on a group membership abstraction that coordinates the information about crashes
In general, a group membership abstraction can also be used to coordinate processes explicitly joining and leaving the set of processes (i.e., without crashes)

Group Membership
Like with a failure detector, the processes are informed about failures; we say that the processes install views
Like with a perfect failure detector, the processes have accurate knowledge about failures
Unlike with a perfect failure detector, the information about failures is coordinated: the processes install the same sequence of views

Group Membership
Memb1. Local Monotonicity: If a process installs view (j,M) after installing (k,N), then j > k and M ⊂ N
Memb2. Agreement: No two processes install views (j,M) and (j,M') such that M ≠ M'
Memb3. Completeness: If a process p crashes, then there is an integer j such that every correct process eventually installs view (j,M) such that p ∉ M
Memb4. Accuracy: If some process installs a view (i,M) with p ∉ M, then p has crashed
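The two safety properties above (Local Monotonicity and Agreement) can be checked mechanically against execution logs. Below is a minimal sketch, assuming each process records the views it installs as (id, membership) pairs; the function names and log format are illustrative assumptions, not part of the abstraction.

```python
# Minimal sketch: offline checks for Memb1 and Memb2 over logged view installations.
# The log format (per-process lists of (view_id, frozenset(members))) is an assumption.

def check_local_monotonicity(installs):
    """Memb1: later views have larger ids and strictly smaller memberships."""
    for (i, m), (j, n) in zip(installs, installs[1:]):
        if not (j > i and n < m):          # n must be a proper subset of m
            return False
    return True

def check_agreement(logs):
    """Memb2: no two processes install different memberships for the same view id."""
    seen = {}
    for installs in logs.values():
        for vid, members in installs:
            if seen.setdefault(vid, members) != members:
                return False
    return True

# Example: p2 and p3 crash, and both survivors install the same view sequence.
logs = {
    "p1": [(0, frozenset({"p1", "p2", "p3", "p4"})), (1, frozenset({"p1", "p4"}))],
    "p4": [(0, frozenset({"p1", "p2", "p3", "p4"})), (1, frozenset({"p1", "p4"}))],
}
assert all(check_local_monotonicity(v) for v in logs.values())
assert check_agreement(logs)
```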

Group Membership
Events
Indication: <membView, V>
Properties: Memb1, Memb2, Memb3, Memb4

Algorithm (gmp)
Implements: GroupMembership (gmp).
Uses:
PerfectFailureDetector (P).
UniformConsensus (uc).

upon event < Init > do
  view := (0,S);
  correct := S;
  wait := false;

Algorithm (gmp – cont'd)
upon event < crash, pi > do
  correct := correct \ {pi};

upon (correct ⊂ view.memb) and (wait = false) do
  wait := true;
  trigger <ucPropose, (view.id+1, correct)>;

Algorithm (gmp – cont'd)
upon event <ucDecided, (id, memb)> do
  view := (id, memb);
  wait := false;
  trigger <membView, view>;
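As a complement to the pseudocode above, here is a minimal Python sketch of the same handlers, assuming an event-driven runtime in which a perfect failure detector calls on_crash and a uniform consensus module exposes propose and reports decisions via on_uc_decided; all names are illustrative, not an API from the slides.

```python
# Sketch of the gmp handlers (illustrative wiring; not prescribed by the slides).

class GroupMembership:
    def __init__(self, processes, uc, deliver_view):
        self.view = (0, frozenset(processes))   # (view id, membership)
        self.correct = set(processes)
        self.wait = False
        self.uc = uc                            # assumed: uc.propose(view_id, membership)
        self.deliver_view = deliver_view        # callback for <membView, V>

    def on_crash(self, p):                      # from the perfect failure detector
        self.correct.discard(p)
        self._maybe_propose()

    def _maybe_propose(self):
        vid, memb = self.view
        if not self.wait and self.correct < memb:      # correct ⊂ view.memb
            self.wait = True
            self.uc.propose(vid + 1, frozenset(self.correct))

    def on_uc_decided(self, vid, memb):         # decision of uniform consensus
        self.view = (vid, memb)
        self.wait = False
        self.deliver_view(self.view)
        self._maybe_propose()                   # handle crashes detected meanwhile
```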

Algorithm (gmp) example: p3 crashes; p1, p2, and p4 each propose (p1,p2,p4) to uniform consensus, UCons((p1,p2,p4);(p1,p2,p4)), and install the resulting view

Group Membership and Broadcast (example): p2 broadcasts m and crashes; p1 and p3 both install membView(p1,p3), but p1 delivers m before installing the view while p3 does not, so message delivery is not coordinated with the view change

View Synchrony
View synchronous broadcast is an abstraction that results from the combination of group membership and reliable broadcast
View synchronous broadcast ensures that the delivery of messages is coordinated with the installation of views

View Synchrony
Besides the properties of group membership (Memb1-Memb4) and reliable broadcast (RB1-RB4), the following property is ensured:
VS: A message is vsDelivered in the view where it is vsBroadcast
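To make the VS property concrete, below is a minimal sketch of a per-process trace check, assuming each logged event records the id of the view installed when it occurred; the trace format is an assumption for illustration.

```python
# Sketch: check that every message a process vsDelivers is delivered in the view in
# which it was vsBroadcast (the VS property), over one process's event trace.
# Trace entries are ("broadcast" | "deliver", message, view_id) -- an assumed format.

def check_vs(trace):
    broadcast_view = {}
    for kind, m, vid in trace:
        if kind == "broadcast":
            broadcast_view[m] = vid
        elif kind == "deliver" and m in broadcast_view:
            if vid != broadcast_view[m]:
                return False      # m was vsDelivered in a different view: VS violated
    return True

# Example: m1 stays within view 0; m2 crosses a view change and violates VS.
ok = check_vs([("broadcast", "m1", 0), ("deliver", "m1", 0)])
bad = check_vs([("broadcast", "m2", 0), ("deliver", "m2", 1)])
assert ok and not bad
```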

View Synchrony
Events
Request: <vsBroadcast, m>
Indication: <vsDeliver, src, m>; <vsView, V>

View Synchrony
If the application keeps vsBroadcasting messages, the view synchrony abstraction might never be able to install a new view; the abstraction would be impossible to implement
We introduce a specific event with which the abstraction asks the application to stop vsBroadcasting messages; this is only needed when a view change is pending (i.e., after a process crashes)

View Synchrony
Events
Request: <vsBroadcast, m>; <vsBlockOk>
Indication: <vsDeliver, src, m>; <vsView, V>; <vsBlock>
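A minimal sketch of how an application might use this interface, assuming a view-synchrony module that exposes broadcast and block_ok calls and invokes the callbacks below; the names are illustrative, not prescribed by the slides.

```python
# Sketch of the vsBlock / vsBlockOk handshake seen from the application side.

class Application:
    def __init__(self, vs):
        self.vs = vs              # view-synchrony module (assumed API: broadcast, block_ok)
        self.blocked = False

    def send(self, m):
        if not self.blocked:      # vsBroadcast only while the abstraction is not blocking us
            self.vs.broadcast(m)

    def on_vs_block(self):        # indication <vsBlock>: a view change is pending
        self.blocked = True
        self.vs.block_ok()        # reply <vsBlockOk>: we promise not to vsBroadcast anymore

    def on_vs_view(self, view):   # indication <vsView, V>: the new view is installed
        self.blocked = False      # broadcasting may resume in the new view

    def on_vs_deliver(self, src, m):   # indication <vsDeliver, src, m>
        print(f"delivered {m} from {src}")
```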

Algorithm (vsc)
Implements: ViewSynchrony (vs).
Uses:
GroupMembership (gmp).
TerminatingReliableBroadcast (trb).
BestEffortBroadcast (beb).

Algorithm (vsc – cont'd)
upon event < Init > do
  view := (0,S);
  nextView := ⊥;
  pending := delivered := trbDone := ∅;
  flushing := blocked := false;

Algorithm (vsc – cont'd)
upon event <vsBroadcast, m> and (blocked = false) do
  delivered := delivered ∪ {m};
  trigger <vsDeliver, self, m>;
  trigger <bebBroadcast, [Data, view.id, m]>;

Algorithm (vsc – cont'd)
upon event <bebDeliver, src, [Data, vid, m]> do
  if (view.id = vid) and (m ∉ delivered) and (blocked = false) then
    delivered := delivered ∪ {m};
    trigger <vsDeliver, src, m>;

Algorithm (vsc – cont'd)
upon event <membView, V> do
  addToTail(pending, V);

upon (pending ≠ ∅) and (flushing = false) do
  nextView := removeFromHead(pending);
  flushing := true;
  trigger <vsBlock>;

Algorithm (vsc – cont'd)
upon <vsBlockOk> do
  blocked := true;
  trbDone := ∅;
  trigger <trbBroadcast, self, (view.id, delivered)>;

Algorithm (vsc – cont'd)
upon <trbDeliver, p, (vid, del)> do
  trbDone := trbDone ∪ {p};
  forall m ∈ del such that m ∉ delivered do
    delivered := delivered ∪ {m};
    trigger <vsDeliver, src(m), m>;

Algorithm (vsc – cont'd)
upon (trbDone = view.memb) and (blocked = true) do
  view := nextView;
  flushing := blocked := false;
  delivered := ∅;
  trigger <vsView, view>;
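Putting the flush-related handlers together, here is a condensed Python sketch of the view-change path, assuming trb, block_app, deliver, and deliver_view hooks with the shapes shown in the comments; these are assumptions for illustration, not an API from the slides.

```python
# Condensed sketch of Algorithm 1's view-change (flush) path.
# Assumed hooks: trb.broadcast(sender, payload), block_app() raising <vsBlock>,
# deliver(m) for <vsDeliver>, deliver_view(view) for <vsView>.

class VscFlush:
    def __init__(self, self_id, initial_view, trb, block_app, deliver, deliver_view):
        self.self_id = self_id
        self.view = initial_view                # (id, frozenset of members)
        self.pending = []                       # views received from group membership
        self.next_view = None
        self.delivered = set()                  # messages vsDelivered in the current view
        self.trb_done = set()
        self.flushing = self.blocked = False
        self.trb, self.block_app = trb, block_app
        self.deliver, self.deliver_view = deliver, deliver_view

    def on_memb_view(self, new_view):           # <membView, V>
        self.pending.append(new_view)
        self._maybe_flush()

    def _maybe_flush(self):
        if self.pending and not self.flushing:
            self.next_view = self.pending.pop(0)
            self.flushing = True
            self.block_app()                    # ask the application to stop vsBroadcasting

    def on_vs_block_ok(self):                   # <vsBlockOk>
        self.blocked = True
        self.trb_done = set()
        self.trb.broadcast(self.self_id, (self.view[0], frozenset(self.delivered)))

    def on_trb_deliver(self, p, payload):       # <trbDeliver, p, (vid, del)>
        _vid, delivered_by_p = payload
        self.trb_done.add(p)
        for m in delivered_by_p - self.delivered:
            self.delivered.add(m)
            self.deliver(m)                     # catch up on messages missed in this view
        if self.blocked and self.trb_done >= self.view[1]:
            self.view = self.next_view
            self.flushing = self.blocked = False
            self.delivered = set()
            self.deliver_view(self.view)        # <vsView, view>
            self._maybe_flush()
```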

Consensus-Based View Synchrony
Instead of launching parallel TRB instances plus a group membership abstraction, we use one consensus instance and parallel best-effort broadcasts for every view change
Roughly, the processes exchange the messages they have delivered when they detect a failure, and use consensus to agree on the next membership and on the set of messages to deliver before installing it

Algorithm 2 (vsc)
Implements: ViewSynchrony (vs).
Uses:
UniformConsensus (uc).
BestEffortBroadcast (beb).
PerfectFailureDetector (P).

Algorithm 2 (vsc – cont'd)
upon event < Init > do
  view := (0,S);
  correct := S;
  flushing := blocked := false;
  delivered := dset := ∅;

Algorithm 2 (vsc – cont'd)
upon event <vsBroadcast, m> and (blocked = false) do
  delivered := delivered ∪ {m};
  trigger <vsDeliver, self, m>;
  trigger <bebBroadcast, [Data, view.id, m]>;

Algorithm 2 (vsc – cont'd)
upon event <bebDeliver, src, [Data, vid, m]> do
  if (view.id = vid) and (m ∉ delivered) and (blocked = false) then
    delivered := delivered ∪ {m};
    trigger <vsDeliver, src, m>;

Algorithm 2 (vsc – cont'd)
upon event < crash, p > do
  correct := correct \ {p};
  if flushing = false then
    flushing := true;
    trigger <vsBlock>;

Algorithm 2 (vsc – cont'd)
upon <vsBlockOk> do
  blocked := true;
  trigger <bebBroadcast, [DSET, view.id, delivered]>;

upon <bebDeliver, src, [DSET, vid, del]> do
  dset := dset ∪ {(src, del)};
  if forall p ∈ correct, (p, mset) ∈ dset then
    trigger <ucPropose, (view.id+1, correct, dset)>;

Algorithm 2 (vsc – cont'd)
upon <ucDecided, (id, memb, vs-dset)> do
  forall (p, mset) ∈ vs-dset such that p ∈ memb do
    forall (src, m) ∈ mset such that m ∉ delivered do
      delivered := delivered ∪ {m};
      trigger <vsDeliver, src, m>;
  view := (id, memb);
  flushing := blocked := false;
  dset := delivered := ∅;
  trigger <vsView, view>;
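The following is a condensed, illustrative Python sketch of this consensus-based view change, assuming a best-effort broadcast beb and a uniform consensus uc with the call shapes shown in the comments; to keep the sketch simple, each process's delivered set is represented as a dict mapping messages to their original senders. None of these names come from the slides.

```python
# Condensed sketch of Algorithm 2's consensus-based view change.
# Assumed hooks: beb.broadcast(payload), uc.propose(view_id, membership, dset),
# block_app() raising <vsBlock>, deliver(src, m) and deliver_view(view) callbacks.

class ConsensusFlush:
    def __init__(self, processes, beb, uc, block_app, deliver, deliver_view):
        self.view = (0, frozenset(processes))
        self.correct = set(processes)
        self.delivered = {}                  # message -> original sender (this view)
        self.dset = {}                       # process -> its delivered dict
        self.flushing = self.blocked = False
        self.beb, self.uc = beb, uc
        self.block_app, self.deliver, self.deliver_view = block_app, deliver, deliver_view

    def on_crash(self, p):                   # from the perfect failure detector
        self.correct.discard(p)
        if not self.flushing:
            self.flushing = True
            self.block_app()

    def on_vs_block_ok(self):                # <vsBlockOk>
        self.blocked = True
        self.beb.broadcast(("DSET", self.view[0], dict(self.delivered)))

    def on_beb_dset(self, src, vid, delivered_by_src):
        self.dset[src] = delivered_by_src
        if all(p in self.dset for p in self.correct):
            self.uc.propose(self.view[0] + 1, frozenset(self.correct), dict(self.dset))

    def on_uc_decided(self, vid, memb, decided_dset):
        for p, msgs in decided_dset.items():
            if p in memb:
                for m, sender in msgs.items():
                    if m not in self.delivered:
                        self.delivered[m] = sender
                        self.deliver(sender, m)   # deliver stragglers before the new view
        self.view = (vid, memb)
        self.flushing = self.blocked = False
        self.dset, self.delivered = {}, {}
        self.deliver_view(self.view)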

Uniform View Synchrony
We now combine:
– the properties of group membership (Memb1-Memb4), which are already uniform
– reliable broadcast (RB1-RB4), which we now require to be uniform
– VS: a message is vsDelivered in the view where it is vsBroadcast, which is already uniform

Uniform View Synchrony Using uniform reliable broadcast instead of best effort broadcast in the previous algorithms does not ensure the uniformity of the message delivery

Uniformity? (example timeline with p1, p2, p3): m is vsDelivered by p1 and p2, p2 crashes, and p1 and p3 install vsView(p1,p3); the question is whether p3 also vsDelivers m

Algorithm 3 (uvsc)
upon event < Init > do
  view := (0,S);
  correct := S;
  flushing := blocked := false;
  udelivered := delivered := dset := ∅;
  forall m: ack(m) := ∅;

Algorithm 3 (uvsc – cont'd)
upon event <vsBroadcast, m> and (blocked = false) do
  delivered := delivered ∪ {m};
  trigger <bebBroadcast, [Data, view.id, m]>;

Algorithm 3 (uvsc – cont'd)
upon event <bebDeliver, src, [Data, vid, m]> do
  if (view.id = vid) then
    ack(m) := ack(m) ∪ {src};
    if m ∉ delivered then
      delivered := delivered ∪ {m};
      trigger <bebBroadcast, [Data, view.id, m]>;

Algorithm 3 (uvsc – cont'd)
upon (view.memb ⊆ ack(m)) and (m ∉ udelivered) do
  udelivered := udelivered ∪ {m};
  trigger <vsDeliver, src(m), m>;
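This ack-based delivery rule can be sketched as follows in Python, assuming a best-effort broadcast that also delivers to the sender itself and using illustrative hook names; the sketch relays each message once and vsDelivers it only after every member of the current view has relayed it.

```python
# Sketch of Algorithm 3's ack-based uniform delivery (illustrative names and hooks).
# Assumes beb.broadcast(payload) eventually triggers on_beb_data(src, m) at every
# process, including the sender itself.

class UniformDelivery:
    def __init__(self, view_members, beb, deliver):
        self.view_members = set(view_members)
        self.beb = beb
        self.deliver = deliver               # vsDeliver callback
        self.ack = {}                        # message -> processes that relayed it
        self.delivered = set()               # messages already relayed by this process
        self.udelivered = set()              # messages vsDelivered to the application

    def on_beb_data(self, src, m):           # <bebDeliver, src, [Data, view.id, m]>
        self.ack.setdefault(m, set()).add(src)
        if m not in self.delivered:
            self.delivered.add(m)
            self.beb.broadcast(("DATA", m))  # relay, so everyone eventually acks m
        self._maybe_udeliver(m)

    def _maybe_udeliver(self, m):
        # vsDeliver only once the whole view has acknowledged (relayed) the message
        if self.view_members <= self.ack.get(m, set()) and m not in self.udelivered:
            self.udelivered.add(m)
            self.deliver(m)
```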

Algorithm 3 (uvsc – cont'd)
upon event < crash, p > do
  correct := correct \ {p};
  if flushing = false then
    flushing := true;
    trigger <vsBlock>;

Algorithm 3 (uvsc – cont'd)
upon <vsBlockOk> do
  blocked := true;
  trigger <bebBroadcast, [DSET, view.id, delivered]>;

upon <bebDeliver, src, [DSET, vid, del]> do
  dset := dset ∪ {(src, del)};
  if forall p ∈ correct, (p, mset) ∈ dset then
    trigger <ucPropose, (view.id+1, correct, dset)>;

Algorithm 3 (uvsc – cont'd)
upon <ucDecided, (id, memb, vs-dset)> do
  forall (p, mset) ∈ vs-dset such that p ∈ memb do
    forall (src, m) ∈ mset such that m ∉ udelivered do
      udelivered := udelivered ∪ {m};
      trigger <vsDeliver, src, m>;
  view := (id, memb);
  flushing := blocked := false;
  dset := delivered := udelivered := ∅;
  trigger <vsView, view>;