
Agenda
Fail Stop Processors
– Problem Definition
– Implementation with reliable stable storage
– Implementation without reliable stable storage
Failure Detection and Fault Diagnosis
– System-Level Fault Diagnosis
– Example
Reliable Message Delivery
– Problem Definition
– Implementation
– Example

Definitions
A Fail Stop Processor (FSP) is assumed, upon failure, to perform no incorrect actions and simply cease to function. The visible effects of a failure of an FSP are:
– It stops executing.
– The internal state and the contents of the volatile storage connected to the processor are lost; the state of the stable storage is unaffected.
– Any processor can detect the failure of an FSP.
A k-fail-stop processor is a computing system that behaves like an FSP unless k + 1 or more of its components fail.

Implementation
An FSP reads data from stable storage and writes data to it. Processors also interact with each other through the stable storage, since the internal state of a processor and its volatile memory are not visible to other processors. The functioning of an FSP therefore depends on the reliability of the available stable storage. Stable storage is typically a storage medium managed by an active device (i.e. a controller), which in turn is driven by the program running on the processor.

Fail Stop Processors
[Figure: processors p1, p2, p3, …, pn, each running its own copy of the code segment (Copy #1 … Copy #n), connected to stable storage managed by an s-process ps.]

Implementation with reliable stable storage: Assumptions
– The storage process (s-process) works correctly and does not fail.
– The system consists of k + 1 ordinary processes (p-processes) and one s-process.
– P-processes can fail in an arbitrary manner; no assumption is made about their behavior when they fail.
– All processes are connected via a reliable communication network.
– The origin of a message can be authenticated by its receiver.
– All clocks are non-faulty, and all processes are synchronized and run at the same rate.

Implementation with reliable stable storage
The non-failed p-processes make the same sequence of requests to the s-process. A failure is detected by the s-process if the copies of a request from the k + 1 p-processes are missing or disagree. Synchronized clocks are needed so that all copies of a particular request from the non-faulty p-processes carry the same time stamp and are guaranteed to arrive within some time interval. A failure is thus detected by the s-process when the p-processes try to access the stable storage.

Implementation with reliable stable storage
R ← bag of received requests with proper time stamp
if |R| = k + 1 ∧ all requests are identical ∧ all requests are from different processes ∧ ¬failed then
  if request is write, write to the stable storage
  else if request is read, send value to all processes
else /* k-fail-stop processor has failed */
  failed ← true
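A minimal sketch of this check in Python follows; the request representation, the state dictionary, and the function name are illustrative assumptions, not part of the original protocol description.

```python
def s_process_check(R, k, state):
    """Process one batch of requests at the s-process.

    R: list of (sender, request) pairs received with a proper time stamp,
    where request is ("read", None) or ("write", value).
    state: dict holding the stable-storage contents and the failed flag.
    """
    senders = {sender for sender, _ in R}
    requests = [request for _, request in R]
    ok = (len(R) == k + 1
          and len(set(requests)) == 1       # all requests identical
          and len(senders) == k + 1         # all from different processes
          and not state["failed"])
    if ok:
        op, value = requests[0]
        if op == "write":
            state["storage"] = value        # write the stable storage
        else:                               # read: send value to all processes
            return state["storage"]
    else:
        state["failed"] = True              # k-fail-stop processor has failed
    return None

# Example: k = 1, so k + 1 = 2 p-processes issuing identical write requests.
state = {"storage": None, "failed": False}
reqs = [(p, ("write", 7)) for p in range(2)]
s_process_check(reqs, k=1, state=state)
print(state)  # {'storage': 7, 'failed': False}
```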

Implementation without reliable stable storage: Assumptions
– k + 1 p-processes.
– 2k + 1 s-processes.
– Each s-process stores a copy of all the variables kept in the stable storage.
– At each variable update, all non-faulty s-processes update their copies.
– All p-processes are connected to the s-processes through a reliable communication network.
– A failure is detected when the p-processes write to the stable storage.

Implementation without reliable stable storage: Assumptions
[Figure: p-processes p1, p2, p3, …, p(k+1), each running a copy of the code segment, connected to a stable storage replicated across s-processes ps1, …, ps(2k+1).]

Implementation without reliable stable storage: Failure detection
As long as fewer than k + 1 p-processes fail, a failure shows up as disagreement among the requests of the different p-processes. And as long as fewer than k + 1 s-processes fail, there is still a majority of s-processes that have not failed. Hence, as long as at most k processes fail in total, we can be sure that at least one p-process is working correctly and that a majority of the s-processes are working correctly. Every request to update a variable in the stable storage is sent by a p-process to every s-process. Each s-process should receive the request from all non-failed p-processes.

Continue…
The agreement protocol used for writes guarantees:
– If P_j is non-faulty, then every non-faulty s-process receives the request of P_j.
– If s-processes S_k and S_l are non-faulty, then both agree on every request received from P_j.

Failure detection algorithm
1. For writing the stable storage, a p-process P_j initiates a Byzantine agreement with all s-processes.
2. For reading the stable storage, a p-process P_j:
 (a) Broadcasts the request to the s-processes.
 (b) Uses the majority value, which is obtained from at least k + 1 s-processes.
3. An s-process S_i, on receiving a request from all the p-processes:
 M ← bag of requests received
 if the request is read then: send the requested value to all p-processes whose request is in M

Failure detection algorithm, continue…
if the request is write then:
 if |M| = k + 1 ∧ all requests are identical ∧ all requests are from different processes ∧ ¬failed then
  write the value
 else {
  failed ← true
  send message "halt" to all p-processes
 }
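The majority read in step 2(b) can be sketched as follows; the function name and the error handling are illustrative assumptions.

```python
from collections import Counter

def p_process_read(replies, k):
    """Majority read over the 2k + 1 s-processes: accept a value only if
    at least k + 1 of them report it (at most k s-processes are faulty)."""
    value, count = Counter(replies).most_common(1)[0]
    if count >= k + 1:
        return value
    raise RuntimeError("no majority value: more than k s-processes failed?")

# Example: k = 1, so 3 s-processes; one faulty reply is outvoted.
print(p_process_read([42, 42, 7], k=1))  # -> 42
```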

Failure Detection and Fault Diagnosis: System-Level Fault Diagnosis
Basic goal: identify all the faulty units in a system. Example, the PMC model:
– A system S is decomposed into n units, not necessarily identical, denoted by the set U = {u1, u2, …, un}.
– Each unit is well-defined and cannot be decomposed further for the purpose of diagnosis (i.e. the whole unit either works correctly or is considered faulty).
– The status of a unit does not change during diagnosis.
– A fault-free unit always reports the status of the units it tests correctly.

PMC Model
Each unit in U is assigned a particular subset of U to test (no unit tests itself). The set of test results is called the “syndrome”. The complete set of tests is called the connection assignment and is represented as a graph G = (U, E), where nodes are units and edges are testing links.
[Figure: five units u1–u5 in a cycle, each testing its successor.] The test results are a12 = x, a23 = 0, a34 = 0, a45 = 0, a51 = 1, so the syndrome is (x, 0, 0, 0, 1), where 1 means “tested as failed” and 0 means “tested as fault-free”.

Continue…
Definition: A system S is t-fault diagnosable (or t-diagnosable) if, given a syndrome, all faulty units in S can be identified, provided that the number of faulty units does not exceed t. Our previous example is one-step one-fault diagnosable, since the faulty node can always be determined using the following method: if, in a syndrome, a string of 0s is followed by a 1, then the 1 correctly identifies the faulty unit.
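A sketch of this rule for a cycle of n units, in Python; the encoding of the arbitrary result “x” as None and the function name are assumptions for illustration.

```python
def diagnose_cycle(syndrome):
    """Identify the single faulty unit in a cycle of n units.

    syndrome[i] is unit i's test result for unit (i + 1) % n:
    0 = fault-free, 1 = faulty, None = arbitrary/unknown ("x").
    Assumes at most one unit is faulty (one-step 1-diagnosable cycle).
    """
    n = len(syndrome)
    for i in range(n):
        # A result of 1 is trustworthy only if the tester itself is
        # vouched for: with at most one fault, a 0 immediately before
        # the 1 means the tester of the 1 is fault-free.
        if syndrome[i] == 1 and syndrome[(i - 1) % n] == 0:
            return (i + 1) % n  # the unit that i tested is the faulty one
    return None  # no fault detected

# Example from the slides: u1 (index 0) is faulty, its own result is arbitrary.
# syndrome = (a12, a23, a34, a45, a51) = (x, 0, 0, 0, 1)
print(diagnose_cycle([None, 0, 0, 0, 1]))  # -> 0 (unit u1)
```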

Continue…
The graph is not one-step 2-diagnosable: if both u1 and u2 were faulty, and u2 returned 0 (a faulty unit can return any value), the syndrome of this system would be indistinguishable from the syndrome of the previous system. Generally, if no two units test each other, then the following two conditions are sufficient for a system S with n units to be t-diagnosable:
I) n ≥ 2t + 1, and
II) each unit is tested by at least t others.
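These two sufficient conditions can be checked mechanically; in this sketch the unit numbering (0 … n−1), the representation of tests as (tester, tested) pairs, and the function name are illustrative assumptions.

```python
def is_t_diagnosable_sufficient(n, tests, t):
    """Check the two sufficient conditions for one-step t-diagnosability,
    assuming no two units test each other.

    tests: set of (tester, tested) pairs over units 0 .. n-1.
    """
    # Condition I: n >= 2t + 1
    if n < 2 * t + 1:
        return False
    # The conditions apply only when no two units test each other.
    if any((b, a) in tests for (a, b) in tests):
        raise ValueError("conditions assume no two units test each other")
    # Condition II: every unit is tested by at least t others.
    indegree = {u: 0 for u in range(n)}
    for _, tested in tests:
        indegree[tested] += 1
    return all(deg >= t for deg in indegree.values())

# The 5-unit cycle: each unit is tested by exactly one other unit,
# so it satisfies the conditions for t = 1 but not for t = 2.
cycle = {(i, (i + 1) % 5) for i in range(5)}
print(is_t_diagnosable_sufficient(5, cycle, 1))  # True
print(is_t_diagnosable_sufficient(5, cycle, 2))  # False
```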

Fault Diagnosis in Distributed Systems
Adaptive Distributed System-level Diagnosis (Adaptive DSD) works as follows (run repeatedly at each node i):
1. t ← i
2. repeat
3.  t ← (t + 1) mod n
4.  request t to forward TESTED_UP_t to i
5. until (i tests t as “fault-free”)
6. TESTED_UP_i[i] ← t
7. for j ← 1 to (n − 1) do
8.  if (j ≠ i)
9.   TESTED_UP_i[j] ← TESTED_UP_t[j]
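A simulation-style sketch of one such round at node i; is_faulty stands in for the outcome of the actual tests, and all names are illustrative.

```python
def adaptive_dsd_round(i, n, is_faulty, tested_up):
    """One Adaptive DSD round at node i (sketch).

    is_faulty[x] simulates the outcome of i's test of node x; assumes
    at least one node other than i is fault-free.
    tested_up[x] is node x's TESTED_UP array: tested_up[x][j] is the
    node that j most recently tested as fault-free.
    """
    t = i
    while True:
        t = (t + 1) % n          # test successive nodes in the ordering
        if not is_faulty[t]:     # stop at the first fault-free node
            break
    tested_up[i][i] = t          # record i's own test result
    for j in range(n):           # adopt t's view of all other nodes
        if j != i:
            tested_up[i][j] = tested_up[t][j]
```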

Example
[Figure: Adaptive DSD example, with faulty and fault-free nodes marked and each fault-free node's test results shown.]

Reliable Message Delivery: Problem Definition
In a distributed system, it is frequently assumed that a message sent by one node to another arrives uncorrupted at the receiver, and that message order is preserved between two nodes. The following properties should hold for any network:
1. A message sent from i is received correctly by j.
2. Messages sent from i are delivered to j in the order in which i sent them.

Reliable Message Delivery: Implementation
– Error detection: using an error detection/correction code (e.g. a CRC).
– Message ordering and guaranteed delivery: using sliding window protocols.
– Failures: sliding window protocols guarantee message delivery and message ordering if no failures occur. But what happens when a failure occurs? We assume that the failure does not partition the network.
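As a reference point, here is a minimal go-back-N sliding-window sketch; the class names, window size, and direct-call delivery are illustrative assumptions, not the protocol from the slides.

```python
WINDOW = 4  # illustrative window size

class Receiver:
    def __init__(self):
        self.expected = 0     # next in-order sequence number
        self.delivered = []

    def on_message(self, seq, msg):
        if seq == self.expected:        # in order: deliver it
            self.delivered.append(msg)
            self.expected += 1
        # duplicates and out-of-order messages are dropped
        return self.expected            # cumulative acknowledgement

class Sender:
    def __init__(self, receiver):
        self.receiver = receiver
        self.base = 0          # oldest unacknowledged sequence number
        self.next_seq = 0
        self.buffer = {}       # seq -> message, kept until acknowledged

    def send(self, msg):
        if self.next_seq >= self.base + WINDOW:
            return False                # window full: caller must wait
        self.buffer[self.next_seq] = msg
        self.on_ack(self.receiver.on_message(self.next_seq, msg))
        self.next_seq += 1
        return True

    def on_ack(self, ack):
        for s in range(self.base, ack): # everything below ack is delivered
            self.buffer.pop(s, None)
        self.base = max(self.base, ack)

    def retransmit(self):
        # go-back-N: resend the whole unacknowledged window after a timeout
        for s in range(self.base, self.next_seq):
            self.on_ack(self.receiver.on_message(s, self.buffer[s]))

r = Receiver()
s = Sender(r)
for m in ["a", "b", "c"]:
    s.send(m)
print(r.delivered)  # ['a', 'b', 'c']
```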

Continue…
Each node sends each neighbor a list of its estimated delays to every destination (not just to that neighbor). Assume node i learns from its neighbor j that the estimated delay from j to a node k is d_j^k. Since i knows the delay to j (call it x_j), i can send a message to k via j with an estimated delay of d_j^k + x_j. If N_i is the set of neighbors of node i, then for destination k the message is routed through a neighbor j such that (d_j^k + x_j) ≤ (d_l^k + x_l), ∀ l ∈ N_i.
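A sketch of this next-hop selection; the nested-dictionary layout for link costs and advertised delay vectors, and the specific delays (chosen to match the quoted total of 9), are illustrative assumptions.

```python
def next_hop(i, k, neighbors, x, d):
    """Pick the neighbor j of node i that minimizes d[j][k] + x[i][j],
    where d[j][k] is j's advertised delay to destination k and
    x[i][j] is i's delay on the link to j."""
    return min(neighbors[i], key=lambda j: d[j][k] + x[i][j])

# Toy example: node E chooses a next hop toward destination A.
neighbors = {"E": ["C", "F"]}
x = {"E": {"C": 4, "F": 8}}                 # illustrative link delays
d = {"C": {"A": 5}, "F": {"A": 20}}         # illustrative advertised delays
print(next_hop("E", "A", neighbors, x, d))  # -> "C" (delay 4 + 5 = 9)
```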

Example
[Figure: six-node network with nodes A–F and link delays.] A packet from node E to node A will be routed through node C, with an expected delay of 9. Also, a packet from D to E will be routed through C, with an expected delay of 11. A packet from node F to A will be routed through E (and C), with an expected delay of 17.

Continue…
When node C fails, the links CD, CE, and CA fail with it. Nodes A, D, and E set their cost to C to infinity, and for destination A the routing entries are changed to go through B. In the next round, F detects that the route to A via E is no longer optimal, because its total cost is now 25; the new route is through D, with an estimated cost of 21.