CS294, YelickSelf Stabilizing, p1 CS 294-8 Self-Stabilizing Systems



CS294, YelickSelf Stabilizing, p2 Administrivia No seminar or class Thursday Sign up for project meetings tomorrow, Thursday, or Tuesday (?) Poster session Wednesday, 12/13, in the Woz, 2-5pm Final papers due Friday, 12/15

CS294, YelickSelf Stabilizing, p3 (Self) Stabilization
History: the idea was introduced by Dijkstra in 1973/74 and later popularized by Lamport.
Idea of stabilization: the ability of a system to converge, in a finite number of steps, from an arbitrary state to a desired state.

CS294, YelickSelf Stabilizing, p4 Overview
Concepts / Tools
–Asynchronous: Cut, Snapshot
–Asynchronous Fault-Tolerant: Byzantine Faults, Authorization, Xcast
–Asynchronous Fault-Tolerant Self-Stabilizing: Local Checking, Counter Flushing

CS294, YelickSelf Stabilizing, p5 Stabilization Motivation: fault tolerance –Especially transient faults –Also useful for others (crashes, Byzantine) Where: Stabilization ideas appear in physics, control theory, mathematical analysis, and systems science

CS294, YelickSelf Stabilizing, p6 Stabilization Definition: Let P be a state predicate of a system S. S is stabilizing to P iff it satisfies the following:
–Closure: P is closed in S: any computation that starts in a state in P visits only states that are in P
–Convergence: every computation of S has a finite prefix such that every state after that prefix is in P
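The two conditions can be checked mechanically on a small finite-state system. The following is a minimal sketch, assuming the system is given as an explicit successor map from each state to its possible next states. The names `is_closed`, `converges`, and the two toy systems are illustrative and not from the lecture, and fairness of the scheduler is not modeled.

```python
def is_closed(succ, P):
    """Closure: every transition out of a state in P stays in P."""
    return all(t in P for s in P for t in succ[s])


def converges(succ, P):
    """Convergence: no computation can stay outside P forever.

    `safe` grows to the set of states from which every computation is
    forced into P after finitely many steps (a least fixed point).
    """
    safe = set(P)
    changed = True
    while changed:
        changed = False
        for s in set(succ) - safe:
            # s is safe if it can move and all of its moves lead to safe states
            if succ[s] and all(t in safe for t in succ[s]):
                safe.add(s)
                changed = True
    return safe == set(succ)


# Toy system 1: a counter that counts down to 0 and then stays there.
countdown = {3: [2], 2: [1], 1: [0], 0: [0]}
# Toy system 2: states 1 and 2 can chase each other forever.
loop = {0: [0], 1: [2], 2: [1]}

print(is_closed(countdown, {0}), converges(countdown, {0}))
print(is_closed(loop, {0}), converges(loop, {0}))
```

Under these assumptions the countdown system stabilizes to P = {0}, while the second system is closed but does not converge, because a computation that starts in state 1 never enters P.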

CS294, YelickSelf Stabilizing, p7 Stabilization (Refined)

CS294, YelickSelf Stabilizing, p8 Practical Issues
Stabilizing protocols allow for
–Corrupted state
–Initialization errors
–But not corruption of the code itself
Applications
–Routing, Scheduling, Resource Allocation

CS294, YelickSelf Stabilizing, p9 Dijkstra’s Model
The concurrency model is unrealistic, but useful for illustration
–Processors are organized in a sparse, connected graph
–At each “step” a processor looks at its own state and its neighbors’ states, and changes its own state
–A “demon” selects the processor to execute (fairly)

CS294, YelickSelf Stabilizing, p10 Impossibility of Stabilization
If the processors are truly identical (symmetric), then stabilization is impossible
–Consider an N-processor system, with N non-prime, say N = 2m
–Consider an initial state that is cyclically symmetric, e.g., s, t, s, t, s, t: pi’s state is s for i even, and t for i odd
–Then the scheduling “demon” can schedule all even processors (which will all move to s’) and then all odd processors (which will all move to t’), so the symmetry is never broken and no progress is made
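The adversarial schedule can be replayed in a few lines. This is only a sketch: `rule(left, me, right)` stands for any deterministic transition function shared by all processors (the slide does not fix one), and the function name and the example rule are hypothetical.

```python
def run_symmetric_schedule(rule, s, t, n, rounds):
    """Replay the demon's schedule on a ring of n identical processors.

    Starts from s, t, s, t, ... and, in every round, lets all even
    processors move (one at a time) and then all odd processors.
    Returns True if the configuration still alternates between two
    values after every round. n is assumed to be even.
    """
    assert n % 2 == 0
    state = [s if i % 2 == 0 else t for i in range(n)]
    for _ in range(rounds):
        for parity in (0, 1):
            for i in range(parity, n, 2):
                # state[i - 1] wraps around to the last processor when i == 0
                state[i] = rule(state[i - 1], state[i], state[(i + 1) % n])
        evens = {state[i] for i in range(0, n, 2)}
        odds = {state[i] for i in range(1, n, 2)}
        if len(evens) != 1 or len(odds) != 1:
            return False
    return True


# An arbitrary deterministic rule over three local states (illustrative only).
example_rule = lambda left, me, right: (left + me + right + 1) % 3
print(run_symmetric_schedule(example_rule, 0, 1, 6, 10))
```

Every even processor sees the same neighborhood during the even phase, because its neighbors are odd and do not move, so all even processors compute the same new state. The same holds for the odd phase, which is why no single processor can ever be singled out as the token holder.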

CS294, YelickSelf Stabilizing, p11 Implications of Impossibility
How important is this result?
Burns and Pachl show that with a prime number of processors, self-stabilization with symmetric processors is possible
More importantly, how realistic is the symmetry assumption?

CS294, YelickSelf Stabilizing, p12 Mutual Exclusion on a Line The following simple example is a solution to the mutual exclusion problem: –n processors are connected in a line –Each talks to 2 neighbors (1 on the ends) State: –Each process has 2 variables up: token is above if true, below if false x: a bit used for token passing

CS294, YelickSelf Stabilizing, p13 Token Passing on a Line
Figure (line of processes, drawn top to bottom):
Top (Process n-1)
x = 0, up = false
x = 1, up = true
...
x = 1, up = true
Bottom (Process 0)
Logical token: the token is at one of the 2 processes where up differs
–If the x’s differ, the upper process holds it; if they are the same, the lower process holds it

CS294, YelickSelf Stabilizing, p14 Token Passing Program
Bottom-move-up[0]
–if x[0] = x[1] and up[1] = false then x[0] := ~x[0]
Top-move-down[n-1]
–if x[n-2] != x[n-1] then x[n-1] := x[n-2]
Middle-move-up[i]
–if x[i] != x[i-1] then {x[i] := x[i-1]; up[i] := true}
Middle-move-down[i]
–if x[i] = x[i+1] and up[i] = true and up[i+1] = false then up[i] := false
This is Dijkstra’s second, “4-state” algorithm
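The four rules can be run as a small simulation. This is a sketch under stated assumptions: the function names are illustrative, `up[0]` is held at true and `up[n-1]` at false as in the drawn state, and the scheduler simply takes the first enabled move. The Middle-move-down guard is coded with up[i+1] = false, which is what Dijkstra's published four-state algorithm uses and what the token needs in order to move down from the top process.

```python
def enabled_moves(x, up):
    """Return the (process, rule) pairs whose guard currently holds."""
    n = len(x)
    moves = []
    if x[0] == x[1] and not up[1]:
        moves.append((0, "bottom-move-up"))
    for i in range(1, n - 1):
        if x[i] != x[i - 1]:
            moves.append((i, "middle-move-up"))
        if x[i] == x[i + 1] and up[i] and not up[i + 1]:
            moves.append((i, "middle-move-down"))
    if x[n - 2] != x[n - 1]:
        moves.append((n - 1, "top-move-down"))
    return moves


def apply_move(x, up, move):
    """Execute one move in place."""
    i, rule = move
    if rule == "bottom-move-up":
        x[0] = 1 - x[0]
    elif rule == "top-move-down":
        x[i] = x[i - 1]
    elif rule == "middle-move-up":
        x[i] = x[i - 1]
        up[i] = True
    else:  # middle-move-down
        up[i] = False


def token_holders(x, up, steps):
    """Run `steps` moves, always taking the first enabled one; return who moved."""
    holders = []
    for _ in range(steps):
        move = enabled_moves(x, up)[0]
        apply_move(x, up, move)
        holders.append(move[0])
    return holders


# The state drawn on the "Token Passing on a Line" slide, for n = 4.
x = [1, 1, 1, 0]                  # index 0 is the bottom process
up = [True, True, True, False]    # up[0] fixed True, up[n-1] fixed False
print(token_holders(x, up, 12))
```

Starting from that state, exactly one move is enabled at every step, and the sequence of moving processes shows the token travelling from the top of the line to the bottom and back.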

CS294, YelickSelf Stabilizing, p15 Token Passing Up
Figure (line of processes, drawn top to bottom):
Top (Process n-1)
x = 0, up = false
x = 1, up = true
...
x = 1, up = true
Bottom (Process 0)
Rule shown: if x[i] != x[i-1] then {x[i] := x[i-1]; up[i] := true}
State shown next to the rule: x = 1, up = true

CS294, YelickSelf Stabilizing, p16 Token Passing Down
Figure (line of processes, drawn top to bottom):
Top (Process n-1)
x = 0, up = false
x = 1, up = true
...
x = 1, up = true
Bottom (Process 0)
Rule shown: if x[n-2] != x[n-1] then x[n-1] := x[n-2]
State shown next to the rule: x = 1

CS294, YelickSelf Stabilizing, p17 Proof Idea for Correct States
If the initialization of states is correct
–One can divide the processor line in two parts based on “up”, with two processors, i and i-1, in between
–All processors above i have the same x value as x[i]; all below i-1 have the same x value as x[i-1]
–An action in the program is enabled only when the token is held
–Only 1 action is enabled (and only 1 process holds the token at any given time)
The above can be checked by examining the predicates on the rules

CS294, YelickSelf Stabilizing, p18 Locally Checkable Properties In any good state, the following hold: –If up[i-1] = up[i], then x[i-1]=x[i] –If up[i] = true then up[i-1]=true These are enough to show that only 1 processor is enabled These are locally checkable –a local set (pair) of processors can detect an incorrect state
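The two predicates translate directly into a check that each neighbouring pair can run on its own. This is a minimal sketch; `pair_ok` and `bad_pairs` are illustrative names, and states are represented as Python lists with index 0 at the bottom of the line.

```python
def pair_ok(x, up, i):
    """Check both local predicates for the neighbouring pair (i-1, i)."""
    same_side = up[i - 1] != up[i] or x[i - 1] == x[i]  # up equal implies x equal
    ordered = (not up[i]) or up[i - 1]                  # up[i] implies up[i-1]
    return same_side and ordered


def bad_pairs(x, up):
    """Return every i whose pair (i-1, i) detects an incorrect state."""
    return [i for i in range(1, len(x)) if not pair_ok(x, up, i)]


good_x, good_up = [1, 1, 1, 0], [True, True, True, False]
print(bad_pairs(good_x, good_up))                     # no pair complains
print(bad_pairs([1, 0, 1, 0], good_up))               # x values corrupted
print(bad_pairs([1, 1, 1, 1], [True, False, True, False]))  # up flags corrupted
```

Each call to `pair_ok` reads only the state of two adjacent processes, which is the sense in which the property is locally checkable.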

CS294, YelickSelf Stabilizing, p19 General Stabilization Technique 1
Varghese proposed local checking and correction as a general technique
Turn local checks into local correction
–Consider the processors as a tree (a line is a special case)
–Consider i-1 to be i’s parent
–For each node i (i != 0), add a correction action: check the local predicate between i and its parent, and correct i’s state if necessary
–Correction affects only the child, not the parent
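One way to picture local correction on the line is sketched below. The slide does not say what the correction writes, so the choice made here (the child copies its parent's values) is an assumption, as are the names `correct` and `sweep` and the convention that `up[n-1]` is hard-wired to false. The local predicates are the two from the "Locally Checkable Properties" slide.

```python
def pair_ok(x, up, i):
    """Local predicate between node i and its parent i-1."""
    same_side = up[i - 1] != up[i] or x[i - 1] == x[i]  # up equal implies x equal
    ordered = (not up[i]) or up[i - 1]                  # up[i] implies up[i-1]
    return same_side and ordered


def correct(x, up, i):
    """Correction action for node i (i != 0); writes only node i's state."""
    if pair_ok(x, up, i):
        return False
    if i < len(x) - 1:
        up[i] = up[i - 1]   # assumed repair; the top keeps its fixed up = False
    x[i] = x[i - 1]
    return True


def sweep(x, up):
    """Run the correction once at every node, parent before child."""
    corrected = []
    for i in range(1, len(x)):
        if correct(x, up, i):
            corrected.append(i)
    return corrected


x, up = [1, 0, 1, 0], [True, False, True, False]
print(sweep(x, up), x, up)
```

Because a correction never touches the parent, a pair that has been repaired stays repaired when nodes further from the root are corrected, so in this sketch a single parent-to-child pass leaves every pair satisfying its local predicate.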

CS294, YelickSelf Stabilizing, p20 Practical Issues
Dijkstra’s algorithm works without the explicit correction step
For more complex protocols, correction is used
Although Dijkstra’s algorithm is self-stabilizing, it goes through states where mutual exclusion is not guaranteed

CS294, YelickSelf Stabilizing, p21 Token Passing on Ring
Processors 0 and n-1 are neighbors
Initially, count = 0, except for processor 0 where count = 1
Zero-move
–if count[0] = count[n-1] then count[0] := (count[0]+1) mod (n+1)
Other-move
–if count[i] != count[i-1] then count[i] := count[i-1]
Note: this is Dijkstra’s first, k-state algorithm
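The ring protocol is short enough to simulate directly. The sketch below assumes a central demon that always picks the highest-numbered enabled processor; that scheduling choice and the function names are illustrative, not part of the slide.

```python
def ring_enabled(count):
    """Processors whose guard holds (each one is holding a token)."""
    n = len(count)
    holders = [0] if count[0] == count[n - 1] else []
    holders += [i for i in range(1, n) if count[i] != count[i - 1]]
    return holders


def ring_step(count, i):
    """Let processor i make its move; it must currently be enabled."""
    n = len(count)
    if i == 0:
        count[0] = (count[0] + 1) % (n + 1)   # zero-move
    else:
        count[i] = count[i - 1]               # other-move


def run(count, steps):
    """Take `steps` moves, always choosing the highest-numbered enabled processor."""
    trace = []
    for _ in range(steps):
        i = ring_enabled(count)[-1]
        ring_step(count, i)
        trace.append(i)
    return trace


# Initial state from the slide, n = 4: the token circulates 1, 2, 3, 0, ...
print(run([1, 0, 0, 0], 8))

# A corrupted start with three tokens settles to a single token under this demon.
corrupted = [3, 1, 4, 1]
print(run(corrupted, 3), corrupted, ring_enabled(corrupted))
```

In the corrupted run the extra tokens disappear as processors copy their predecessor's counter, after which only one processor is enabled at any time.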

CS294, YelickSelf Stabilizing, p22 Token Ring Execution
Figure: ring of processors starting at p0, with counter values x-1, x, x, x, x and the token marked
Good States:
–For i = 1…n-1, either count[i-1] = count[i] or count[i-1] = count[i]+1
–Either count[0] = count[n-1] or count[0] = count[n-1]+1

CS294, YelickSelf Stabilizing, p23 Proof Idea
The following can be shown
–In any execution, P0 will eventually increment its counter (because the moves of all other processors reduce the number of distinct counter values)
–In any execution, P0 will eventually reach a “fresh” counter value
–Any state in which P0 has a fresh counter value m is eventually followed by a state in which all processes have m

CS294, YelickSelf Stabilizing, p24 General Stabilization Technique 2
Varghese proposes counter flushing as a general technique for stabilization
–Start with some sender (P0) that sends messages to the others in rounds
–Make it stabilizing by numbering messages with counters (max ctr > N)
–Sender must eventually get a “fresh” value

CS294, YelickSelf Stabilizing, p25 Compilers and Stabilization
Two useful properties for compilers (according to Schneider):
–Self-stabilizing source code should produce self-stabilizing object code
–Compiler should produce a self-stabilizing version of our program even if the source code is not

CS294, YelickSelf Stabilizing, p26 Compilers Con’t Fundamental difference between symmetric and asymmetric rings Self-stabilization is “unstable” across architectures There is a class of programs for which a compiler can be written to “force” stabilization

CS294, YelickSelf Stabilizing, p27 Summary Self-stabilizing algorithms –Overlooked for 10 years –Revived in distributed algorithms community –Algorithms for: MST, Communication, … Relevance to practice –Tolerating transient faults is important –Do these ideas appear in real systems? See