Reliable Communication in the Presence of Failures Based on the paper by: Kenneth Birman and Thomas A. Joseph Cesar Talledo COEN 317 Fall 05.

1 Reliable Communication in the Presence of Failures
Based on the paper by: Kenneth Birman and Thomas A. Joseph
Cesar Talledo, COEN 317, Fall 05

2 Agenda
- Introduction
- Challenges in Fault-Tolerant Distributed Systems
- Consistent Event Ordering in a Distributed System
- Key Aspects of Proposed Approach
- Logical vs. Physical Failures
- Proposed Broadcast Primitives
  - GBCAST
  - ABCAST
  - CBCAST
- Advantages of the Proposed Approach
- Sample Application: Updating Replicated Data
- Final Thoughts

3 Introduction
- White paper written in 1987, funded by DoD
- Purpose:
  - Present a set of communication primitives that facilitate distributed processing in the presence of failures
- System Assumptions:
  - One computation, executed by multiple processes in a distributed system (DS)
  - Each process has a local state
  - Processes communicate with other processes via broadcasts
  - Processes may “halt” at any time
  - The paper does not address Byzantine failures

4 Challenges in Fault-Tolerant DS
- Challenges in fault-tolerant distributed systems
  - How does the system handle exit/re-entry of processes?
    - A process may leave the computation (due to failure)
    - A process may re-enter the computation (recovery)
  - How do processes communicate with each other when:
    - Messages may be lost by the communication subsystem
    - Messages may be re-ordered while in transit
    - Some receiver processes may halt
    - The sender process may halt
  - How to handle failures when the system is asynchronous
- Goal: continue the computation in the presence of failures

5 Consistent Event Ordering in DS
- Key aspects of distributed processing
  - Processes must have a consistent view of the ordering of events (i.e., messages) during the computation
  - A system must provide ordering, but also allow concurrency
- Failures can affect the consistent view of event ordering
- Example:
  - Process ‘A’ sends a broadcast message, then fails
  - Process ‘B’ receives the message, then notices the failure
  - Process ‘C’ notices the failure, then receives the message
  - Process ‘D’ never receives the message, but notices the failure
  - Process ‘F’ receives the message, but never notices the failure
  - Process ‘G’ neither receives the message nor notices the failure
- Key: All processes must agree on the events that occurred and on the order of those events
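
The example on this slide can be written down as per-process event logs. The following is only an illustrative sketch: the names `observations`, `is_consistent`, and `disagreeing_pairs` are hypothetical and do not come from the paper. It simply checks whether all processes recorded the same events in the same order.

```python
# Local event logs for the slide's example (hypothetical encoding):
# "msg"  = the process received A's broadcast
# "fail" = the process noticed A's failure
observations = {
    "B": ["msg", "fail"],
    "C": ["fail", "msg"],
    "D": ["fail"],
    "F": ["msg"],
    "G": [],
}


def is_consistent(logs):
    """True only if every process saw the same events in the same order."""
    sequences = list(logs.values())
    return all(seq == sequences[0] for seq in sequences)


def disagreeing_pairs(logs):
    """List every pair of processes whose logs differ."""
    names = sorted(logs)
    return [
        (p, q)
        for i, p in enumerate(names)
        for q in names[i + 1:]
        if logs[p] != logs[q]
    ]


if __name__ == "__main__":
    print(is_consistent(observations))
    print(disagreeing_pairs(observations))
```

In this encoding no two of the five observers agree, which is the inconsistency the proposed primitives are meant to rule out.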

6 ... Consistent Event Ordering in DS
- Approaches to keep consistent event ordering in the presence of failures
  - 1) Run an agreement protocol after a failure is detected
    - Problems: slow and requires synchronous communication
  - 2) Use this rule: A process should discard messages received from a process that is known to have failed
    - Problem: Processes learn of failures at different times, so the system may still be inconsistent
- Proposed Idea:
  - “Construct a broadcast protocol that orders messages relative to failure and recovery events”

7 Key Aspects of Proposed Approach
- Failure and recovery are treated as system events, just like local processing and messages
  - Thus, failure and recovery have an ordering with respect to messages & local processing
- The paper proposes communication primitives that maintain consistent ordering among processes
  - All processes experience the same sequence of events, including failures
- Advantages:
  - When a process notices a failure, it can assume that the rest of the system has noticed the order of the failure consistently
  - Therefore, the process can immediately react to the failure (no agreement protocol required)

8 Logical vs. Physical Failures
- Failures (e.g., lost messages, process halts) are physical events, occurring in real time
  - Processes cannot control when a failure occurs
- Recall that processes use logical clocks to track the order of events in a distributed computation
- In order to treat failures as ordered events, physical failures must be mapped to logical failures
- How?
  - Introduce a “Process-Group View”: a logical snapshot of the processes involved in the distributed computation
  - Changes in the properties of the group (e.g., failures, recoveries) are ordered with respect to other events
  - These changes are communicated among processes by using the proposed broadcast primitives
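
One way to picture a process-group view is as an immutable membership snapshot with a version number, where every failure or recovery produces the next view. This is a minimal sketch under that assumption; the class `GroupView`, its `view_id` field, and the method names are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GroupView:
    """Logical snapshot of the processes taking part in the computation."""
    view_id: int          # assumed version counter; orders successive views
    members: frozenset    # names of the processes currently in the group

    def without(self, process):
        """View that results from the logical failure of `process`."""
        return GroupView(self.view_id + 1, self.members - {process})

    def with_member(self, process):
        """View that results from the recovery of `process`."""
        return GroupView(self.view_id + 1, self.members | {process})


if __name__ == "__main__":
    v0 = GroupView(0, frozenset({"p1", "p2", "p3"}))
    v1 = v0.without("p2")        # p2 fails
    v2 = v1.with_member("p2")    # p2 recovers
    for view in (v0, v1, v2):
        print(view.view_id, sorted(view.members))
```

Because each view carries a position in a sequence, a physical halt becomes a logical event that can be ordered against messages.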

9 Proposed Broadcast Primitives
- 3 Broadcast Communication Primitives
  - Group-Broadcast (GBCAST)
  - Atomic-Broadcast (ABCAST)
  - Causal-Broadcast (CBCAST)
- All 3 are atomic: either all destination processes receive the message or none of them do
- Emphasis on lightweight primitives: quick processing is desired to improve performance

10 GBCAST
- GBCAST: Group Broadcast
  - Used to keep a consistent “process group view”
- Call: GBCAST(action, G)
  - action: type of event that has occurred
  - G: process group view
- GBCAST satisfies the following ordering constraints
  - Delivered in the same order with respect to all other broadcasts at each destination
  - Delivered after any messages sent by the failed process

11 … GBCAST
- GBCAST is used to inform group member processes that the process group view has changed
- Each process keeps a local copy of the “process group view”
  - Reception of a GBCAST updates the local copy
  - A process can assume that its local copy is consistent with the rest of the group
- Upon failure or recovery, a GBCAST is sent by:
  - The supervisory process executing on the same machine where the process failure or recovery occurred (if the machine is alive)
  - Failure detection software executing on another machine
- The usage of GBCAST avoids execution of an agreement protocol
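
The two GBCAST slides can be sketched from the receiver's side: a failure notice is delivered only after every message already received from the failed process, and it then updates the local view. The sketch below assumes a simple in-memory model; the class `Member` and its methods are hypothetical names, and the rule of discarding later messages from a removed sender is an assumption made for the illustration, not a statement of the paper's protocol.

```python
class Member:
    """One group member holding a local copy of the process group view."""

    def __init__(self, name, view):
        self.name = name
        self.view = set(view)   # local copy of the process group view
        self.pending = []       # (sender, msg) received but not yet delivered
        self.log = []           # events in the order they were delivered

    def receive(self, sender, msg):
        # Assumption: a message from a process already removed from the
        # view arrives "after" its failure and is discarded.
        if sender not in self.view:
            return
        self.pending.append((sender, msg))

    def deliver_pending(self, sender=None):
        """Deliver held messages, optionally only those from `sender`."""
        keep = []
        for s, m in self.pending:
            if sender is None or s == sender:
                self.log.append(("msg", s, m))
            else:
                keep.append((s, m))
        self.pending = keep

    def on_gbcast(self, action, process):
        """Apply a view change announced by a GBCAST."""
        if action == "fail":
            # Messages from the failed process are delivered first.
            self.deliver_pending(process)
            self.view.discard(process)
        elif action == "recover":
            self.view.add(process)
        self.log.append((action, process))


if __name__ == "__main__":
    b = Member("B", {"A", "B", "C"})
    b.receive("A", "m1")
    b.receive("C", "m2")
    b.on_gbcast("fail", "A")
    b.receive("A", "late")      # ignored: A is no longer in the view
    b.deliver_pending()
    print(b.log)
```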

12 ABCAST
- ABCAST: Atomic Broadcast
  - Provides sequential consistency on replicated data
  - Applications use ABCAST to enforce order in the way data is updated in the distributed system (e.g., a shared data structure)
- Call: ABCAST(msg, label, dests)
  - msg: message to be broadcast
  - label: identifies ABCASTs that are related to each other
  - dests: set of processes to which the broadcast is sent
- ABCASTs with the same label that have destinations in common are delivered in the same order (some order) to all such destinations
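
The ordering guarantee can be illustrated with a fixed sequencer that stamps each message with a per-label sequence number. This is a deliberately swapped-in technique for illustration, not the protocol described in the paper, and it assumes every receiver gets every message carrying a given label. `Sequencer` and `Receiver` are hypothetical names.

```python
from collections import defaultdict


class Sequencer:
    """Assigns consecutive sequence numbers per label (illustrative only)."""

    def __init__(self):
        self.next_seq = defaultdict(int)

    def stamp(self, msg, label):
        seq = self.next_seq[label]
        self.next_seq[label] += 1
        return (label, seq, msg)


class Receiver:
    """Holds back out-of-order messages and delivers them in stamp order."""

    def __init__(self):
        self.expected = defaultdict(int)  # next sequence number per label
        self.held = {}                    # (label, seq) -> msg
        self.delivered = []               # messages in delivery order

    def receive(self, stamped):
        label, seq, msg = stamped
        self.held[(label, seq)] = msg
        # Deliver as long as the next expected message is available.
        while (label, self.expected[label]) in self.held:
            self.delivered.append(self.held.pop((label, self.expected[label])))
            self.expected[label] += 1


if __name__ == "__main__":
    sequencer = Sequencer()
    a = sequencer.stamp("x=1", "account")
    b = sequencer.stamp("x=2", "account")
    r1, r2 = Receiver(), Receiver()
    r1.receive(a)
    r1.receive(b)
    r2.receive(b)   # arrives out of order at the second receiver
    r2.receive(a)
    print(r1.delivered, r2.delivered)
```

Both receivers end up delivering the two updates in the same order even though the network reordered them for one of the two.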

13 CBCAST
- CBCAST: Causal Broadcast
  - Provides causal consistency on replicated data
  - Applications use CBCAST to enforce causal order in the way data is updated in the distributed system
- Call: CBCAST(msg, clabel, dests)
  - msg: message to be broadcast
  - clabel: identifies related CBCASTs and the type of ordering
  - dests: set of processes to which the broadcast is sent
- CBCASTs with the same ‘clabel’ that have destinations in common are delivered in a predetermined order to all such destinations

14 … CBCAST
- Broadcast ‘A’ causally precedes broadcast ‘B’ if either:
  - A and B are sent by the same process, and A is sent before B
  - A and B are sent by different processes, and A was received by the process that sent B before B was sent
- Causal ordering is determined by the value of ‘clabels’
  - If broadcast A causally precedes broadcast B, then clabel(A) < clabel(B)
- Usage of ‘clabels’ gives applications the power to decide which events are causally related
  - Not all CBCASTs are causally related; ordering them would limit system concurrency
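
Causal delivery can be sketched with a hold-back queue: a broadcast is delivered only once everything that causally precedes it has been delivered, while unrelated broadcasts pass through immediately. In this sketch the causal relation is carried as an explicit set of predecessor ids, which stands in for the slide's clabels as an assumption made for illustration; `CausalReceiver` is a hypothetical name.

```python
class CausalReceiver:
    """Delivers a message only after all of its causal predecessors."""

    def __init__(self):
        self.delivered = []   # message ids in delivery order
        self.waiting = []     # (msg_id, deps) held back until deps are met

    def receive(self, msg_id, deps=()):
        self.waiting.append((msg_id, frozenset(deps)))
        # Keep scanning until no held-back message can be released.
        progress = True
        while progress:
            progress = False
            for item in list(self.waiting):
                mid, needed = item
                if needed <= set(self.delivered):
                    self.delivered.append(mid)
                    self.waiting.remove(item)
                    progress = True


if __name__ == "__main__":
    r = CausalReceiver()
    r.receive("B", deps={"A"})   # B depends on A, which has not arrived yet
    r.receive("C")               # C is unrelated and is delivered at once
    r.receive("A")               # A arrives; B can now follow it
    print(r.delivered)
```

The unrelated broadcast C is not delayed by the missing A, which is the concurrency benefit the last bullet on the slide refers to.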

15 Advantages of the Proposed Approach
- Simplify applications
  - Eliminate the need for ‘ordering protocols’ at the application level, which are otherwise needed to prevent inconsistencies due to potential failures
  - These protocols would be needed if communication were done via simple atomic broadcasts
- Improve system performance
  - Application ordering protocols restrict concurrency by imposing synchronization rules
- Note: The assumption is that GBCAST, ABCAST, and CBCAST are implemented at a level below the application (e.g., in the kernel)

16 Sample Application: Updating Replicated Data
- All copies of the replicated data must be updated in the same order
- Without the proposed broadcast primitives, a process would need to do explicit synchronization
  - Send a basic atomic broadcast to the remote copies
  - Wait for the remote copies to reply with confirmation of the update
  - Update the local copy, and perform the next update
  - Note: similar to 2-Phase-Commit
- Using CBCAST, a process can assume that all copies have been updated once CBCAST returns and the local copy is updated
  - CBCAST guarantees that all copies receive the update in the required order with respect to previous CBCASTs that update the same data
  - CBCASTs are ordered with respect to failures (notified via GBCASTs)
  - Note: Usage of CBCASTs improves performance
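
Why the copies must see updates in the same order can be shown with a toy key-value replica. This is only a sketch: `apply_updates` and the sample updates are hypothetical, and the code models delivery order directly rather than any broadcast primitive.

```python
def apply_updates(updates):
    """Apply (key, value) updates in the given order and return the state."""
    state = {}
    for key, value in updates:
        state[key] = value
    return state


if __name__ == "__main__":
    u1 = ("x", 1)
    u2 = ("x", 2)

    # Same delivery order at both copies: the copies agree.
    copy_a = apply_updates([u1, u2])
    copy_b = apply_updates([u1, u2])
    print(copy_a == copy_b)

    # Different delivery order: the copies diverge on the value of x.
    copy_c = apply_updates([u2, u1])
    print(copy_a == copy_c)
```

An ordered broadcast removes the second case without the sender having to wait for confirmations between updates.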

17 Final Thoughts
- The proposed broadcast primitives provide
  - Implicit ordering of messages
    - Applications need not do explicit synchronization to prevent ordering problems when failures are possible
  - Message ordering with respect to faults/recoveries
    - Faults and recoveries are treated as logical events, subject to ordering with respect to messages
    - This provides consistency among the processes in the distributed system (all processes experience the same set of events)
  - Improved performance
    - Elimination of explicit application ordering protocols allows higher concurrency in computation

