Self-Stabilization in Distributed Systems
Barath Raghavan, Vikas Motwani, Debashis Panigrahi

Outline
- Introduction
- Concept of Self-Stabilization
- Dijkstra’s Model of Self-Stabilization
- Super-Stabilization
- Application of Stabilization on Sensor Network
- Discussion

Distributed Systems
- A distributed system consists of loosely connected machines that do not share a global memory.
- Each machine has only a partial view of the global state.
- The system consists of components: processes and interconnections (message passing or shared memory).
- The topology is formed by the components as nodes and the interconnections as directed edges.
- Each component has a local state; the global system state is the union of the local states.
- The behavior of the system is represented by the global states and the transitions between them.
- Designing a robust distributed system requires the ability to recover from unforeseen perturbations.
- Previous approaches cater to permanent failures by introducing redundant components.
- A transient malfunction (message corruption, sensor malfunction, memory error) can put the system into an illegal state.

Self-stabilization
- A self-stabilizing system guarantees that:
  - Regardless of its current state, the system recovers to a legal configuration in a finite number of steps.
  - It remains in a legal configuration thereafter until a malfunction occurs.
- Examples of self-stabilization
  - Mathematics: the Newton-Raphson method for finding a square root
  - Control theory: systems with feedback
  - Distributed systems: pioneered by Dijkstra
- Questions
  - Is self-stabilization always possible? If not, under what conditions is it possible?
  - How do we design algorithms that are self-stabilizing?
  - What are the cost of self-stabilization and the convergence rate?
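The Newton-Raphson example can be made concrete with a short sketch. The function name, the step count, and the starting values are illustrative assumptions; the point is only that the iteration reaches the same answer from any positive starting value, much as a self-stabilizing system reaches a legal configuration from any state.

```python
def newton_sqrt(a, x0, steps=60):
    """Approximate sqrt(a) by iterating x <- (x + a / x) / 2 from any x0 > 0."""
    x = x0
    for _ in range(steps):
        x = 0.5 * (x + a / x)
    return x


# Very different "initial states" all converge to the same value.
for start in (1e-6, 1.0, 37.0, 1e9):
    print(start, newton_sqrt(2.0, start))
```

A bad starting value only costs extra iterations; it never changes the result, which is the informal analogue of convergence.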

Properties of self-stabilization
- Closure: once a predicate P (an evaluation function over the system state S) becomes true, it remains true thereafter.
- Convergence: regardless of the starting state, P will be satisfied within a finite number of state transitions.
  - Or: given a starting state in which a predicate Q is satisfied, P will be satisfied within a finite number of state transitions.
- These properties must hold even against an adversary.
  - The adversary may know the system’s design and may damage or remove state.
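For a small finite system, closure and convergence can be checked by brute force. The sketch below uses a hypothetical toy system that is not from the slides: three nodes on a line, each copying the largest value among its neighbors, with P meaning "all values are equal". All names are illustrative, and one node is assumed to move at a time.

```python
from itertools import product

# Hypothetical toy system: three nodes on a line, each holding a value in {0, 1, 2}.
NEIGHBORS = {0: [1], 1: [0, 2], 2: [1]}
VALUES = range(3)


def legal(state):
    """Predicate P: all nodes hold the same value."""
    return len(set(state)) == 1


def successors(state):
    """All states reachable in one move when a single node acts at a time."""
    result = []
    for node, nbrs in NEIGHBORS.items():
        target = max(state[k] for k in nbrs)
        if target > state[node]:  # the node is enabled
            nxt = list(state)
            nxt[node] = target
            result.append(tuple(nxt))
    return result


def closure_holds(states):
    """P, once true, stays true after every possible move."""
    return all(legal(t) for s in states if legal(s) for t in successors(s))


def worst_case_moves(state, memo, on_path=frozenset()):
    """Longest run of moves before P holds, or None if P is not guaranteed."""
    if legal(state):
        return 0
    if state in on_path:  # a cycle of illegal states
        return None
    if state not in memo:
        lengths = [worst_case_moves(t, memo, on_path | {state})
                   for t in successors(state)]
        # An illegal state with no possible move (deadlock) also breaks convergence.
        memo[state] = None if not lengths or None in lengths else 1 + max(lengths)
    return memo[state]


def convergence_holds(states):
    """From every state, every execution reaches P in finitely many moves."""
    memo = {}
    return all(worst_case_moves(s, memo) is not None for s in states)


ALL_STATES = list(product(VALUES, repeat=len(NEIGHBORS)))
print(closure_holds(ALL_STATES), convergence_holds(ALL_STATES))
```

Exhaustive checking only scales to tiny systems, but it makes the two definitions operational: closure is a one-step property of legal states, while convergence rules out both cycles and deadlocks among illegal states.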

Using Self-Stabilization for Fault Tolerance
- Why use self-stabilization for fault tolerance?
  - In some sense, because it was not designed for fault tolerance: fault tolerance is a side effect of the self-stabilizing behavior of a system.
- No case checking
  - Simply guaranteeing that a system is self-stabilizing with respect to some predicate assures that, for that predicate, the system will never “fail”.
  - Self-stabilization implies (finite-time) recovery after failure.
  - Existing approaches rely on avoiding failure at all costs and then handle only certain failure modes.
- No probabilistic guarantees necessary
  - Since the system must stabilize regardless of the initial state, one can assert that a self-stabilizing system is perfectly fault-tolerant.

Dijkstra’s Self-Stabilization Techniques
- His question: can a self-stabilizing system exist at all? Yes.
- Can the processes (finite state machines) be the same? No.
- Are some processes “privileged”? Yes.
- What do they do? No(thing).

Dijkstra’s Self-Stabilization Techniques (2)
- How does his K-state self-stabilizing system work?
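The slide does not spell out the K-state system, so the following is a sketch of the commonly presented form of Dijkstra's K-state token ring rather than a transcription of the talk. It assumes n machines in a ring, states in 0..K-1 with K >= n, one distinguished machine (machine 0), and a scheduler that lets one privileged machine move at a time. The function names and the random scheduler are illustrative.

```python
import random


def privileged(states, i):
    """Machine 0 is privileged when it equals its left neighbor (the last machine);
    every other machine is privileged when it differs from its left neighbor."""
    left = states[i - 1]  # for i == 0 the index -1 wraps around to the last machine
    return states[i] == left if i == 0 else states[i] != left


def move(states, i, k):
    """Fire the privileged machine i."""
    if i == 0:
        states[0] = (states[0] + 1) % k  # the distinguished machine increments
    else:
        states[i] = states[i - 1]  # the others copy their left neighbor


def run(states, k, steps, rng):
    """Let an arbitrary (here: random) privileged machine move, one per step."""
    for _ in range(steps):
        enabled = [i for i in range(len(states)) if privileged(states, i)]
        move(states, rng.choice(enabled), k)
    return states


rng = random.Random(1)
ring = [rng.randrange(5) for _ in range(5)]  # arbitrary, possibly illegal start
run(ring, 5, 200, rng)
print(ring, sum(privileged(ring, i) for i in range(len(ring))))
```

In a legal configuration exactly one machine holds the privilege, which then circulates around the ring like a token. The run length of 200 moves is a generous assumption for a ring of five machines, not a tight bound.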

Pursuer-Evader Problem in Sensor Networks
- Problem: how do you design the motion of the pursuer so that it eventually sees the unpredictable evader?
- Examples
  - Motion tracking
  - Sensing occurrences of unlikely events
- Past solutions are not applicable in sensor networks:
  - Limited computational resources (no centralized algorithm)
  - Communication burden
  - Fault-prone environment
  - On-site maintenance infeasible

The Problem
- Input
  - A system consisting of a large number of sensor nodes (motes)
  - Each node j has a neighbor set nbr.j that it can communicate with
  - Each node has a clock synchronized with the other nodes
  - Two distinguished processes: the pursuer and the evader
  - Both processes are mobile, changing location from node to node
  - The evader moves at a slower rate than the pursuer
  - The evader’s movement strategy is unknown to the network
- Goal: design a program for the motes and the pursuer so that the pursuer can “catch” the evader, i.e., the program guarantees that the pursuer and the evader eventually reach the same node.
- A program consists of a set of variables, mote actions, pursuer actions, and evader actions.

Assumptions
- Evader action
  - The evader picks a random neighbor and moves to that node.
- Fault model
  - Transient faults can corrupt the state at any node.
  - Transient faults can also restart motes.
  - The graph is assumed to remain connected despite faults.

Evader-centric Program
- Each node stores a variable ts: the latest timestamp the node knows of for a detection of the evader.
- If a node detects the evader, it sets its ts to the current clock time.
- Otherwise, a node compares its own ts with those of all its neighbors and stores the maximum, recording the neighbor that supplied it in p (the parent node).
- Once the tracking tree is formed, the pursuer simply follows the parent link from each node to reach the evader’s node.
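A minimal sketch of the evader-centric rules follows, under simplifying assumptions that are not in the slides: the evader stays put, all nodes update in synchronous rounds, and the graph, the corrupted initial values, and the function names are hypothetical. The corrupted state respects the two conditions used for stabilization (every ts is below the clock and every parent is a neighbor).

```python
def tracking_round(graph, ts, parent, evader, clock):
    """One synchronous round: every node reads its neighbors' old ts values."""
    new_ts, new_parent = dict(ts), dict(parent)
    for j, nbrs in graph.items():
        if j == evader:
            # Detection: ts := current clock; the evader's node is the tree root.
            new_ts[j], new_parent[j] = clock, j
        else:
            best = max(nbrs, key=lambda k: ts[k])
            if ts[best] > ts[j]:  # a neighbor knows a fresher detection
                new_ts[j], new_parent[j] = ts[best], best
    return new_ts, new_parent


def follow_parents(parent, start, evader):
    """Path taken by a pursuer that just follows parent links."""
    path = [start]
    while path[-1] != evader and len(path) <= len(parent):
        path.append(parent[path[-1]])
    return path


# Hypothetical network: a line 0-1-2-3 with an extra leaf 4 attached to node 2.
GRAPH = {0: [1], 1: [0, 2], 2: [1, 3, 4], 3: [2], 4: [2]}
EVADER = 3
ts = {0: 7, 1: 2, 2: 9, 3: 0, 4: 5}          # corrupted, but all below the clock
parent = {0: 1, 1: 0, 2: 4, 3: 2, 4: 2}      # corrupted, but all neighbors
for clock in range(10, 16):
    ts, parent = tracking_round(GRAPH, ts, parent, EVADER, clock)
print(follow_parents(parent, 0, EVADER))
```

Because the evader's timestamp keeps growing with the clock, fresh values eventually overwrite any stale ones, and the parent pointers settle into a tree rooted at the evader's node.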

Evader-centric Program: Observations
- The tracking tree is a spanning tree rooted at the mote where the evader resides.
- The distance between the pursuer and the evader does not increase once the tree includes the pursuer.
- The pursuer catches the evader in at most M + 2M*a/(1-a) steps.
- Stabilization: if we assume (ts < clock) and (parent is a neighbor), it can be proved that the above algorithm stabilizes.
- Performance
  - Energy = deg * N
  - Time = D + 2D*a/(1-a)
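To get a feel for the step bound, the sketch below simply evaluates M + 2M*a/(1-a), reading the slide's "(a/1-a)" as a/(1-a). The slides do not define M or a; the code assumes only that a is a ratio with 0 <= a < 1, which fits the earlier statement that the evader moves more slowly than the pursuer. The function name is illustrative.

```python
def catch_bound(m, a):
    """Evaluate m + 2*m*a/(1 - a); assumes 0 <= a < 1."""
    if not 0 <= a < 1:
        raise ValueError("a must satisfy 0 <= a < 1")
    return m + 2 * m * a / (1 - a)


for a in (0.0, 0.5, 0.9):
    print(a, catch_bound(10, a))
```

At a = 0.5 the bound equals 3M, and it grows without limit as a approaches 1, so the guarantee weakens sharply as the evader gets nearly as fast as the pursuer.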

Pursuer-Centric Program
- Each node updates its ts when it detects the evader.
- Motes communicate with their neighbors only at the request of the pursuer.
- When the pursuer visits a node, it sets that node’s ts to 0 and moves to the neighboring node with the maximum ts.
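The pursuer-centric rules can be sketched as follows. The graph, the timestamps, and the function names are hypothetical, the evader is assumed to have stopped at the end of its trail, and ties between neighbors are broken by list order, which is an arbitrary choice the slides do not specify.

```python
def evader_detected(ts, node, clock):
    """A node records the time at which it detects the evader."""
    ts[node] = clock


def pursuer_step(graph, ts, pursuer):
    """Reset ts at the visited node, then move to the neighbor with the largest ts."""
    ts[pursuer] = 0
    return max(graph[pursuer], key=lambda k: ts[k])


def chase(graph, ts, pursuer, evader, max_steps):
    """Follow the freshest timestamps until the evader's node is reached."""
    path = [pursuer]
    while pursuer != evader and len(path) <= max_steps:
        pursuer = pursuer_step(graph, ts, pursuer)
        path.append(pursuer)
    return path


# Hypothetical network: a line 0-1-2-3 with an extra leaf 4 attached to node 1.
GRAPH = {0: [1], 1: [0, 2, 4], 2: [1, 3], 3: [2], 4: [1]}
ts = {node: 0 for node in GRAPH}
for clock, node in enumerate([0, 1, 2, 3], start=1):  # the evader's walk
    evader_detected(ts, node, clock)
print(chase(GRAPH, ts, pursuer=0, evader=3, max_steps=10))
```

Resetting ts at each visited node keeps the pursuer from being drawn back along the part of the trail it has already covered.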

Pursuer-Centric Program: Observations
- If the pursuer reaches a mote j where ts > 0, it catches the evader in at most N*a/(1-a) steps.
- The pursuer catches the evader within O(N^2 * log(N)) steps.
- Stabilization: initialization of ts = 0
- Performance metrics
  - Communication: degree communications per step
  - Time: O(N^2 * log(N))

A Hybrid Pursuer-Evader Program
- Modify the evader-centric program to limit the tracking depth to save energy.
- Modify the pursuer-centric program to exploit the tracking tree structure.
- Remove cycles from the graph.
- Check for and correct fake tree roots.
- Pursuer: if a parent is defined, follow the parent; otherwise select the neighbor with the maximum ts value.
- Performance metrics
  - Communication: n + degree communications per step
  - Time: O((N-n)^2 * log(N-n))
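The hybrid pursuer rule can be sketched as a single decision function. Treating a parent that is not actually a neighbor as "undefined" is an assumption borrowed from the evader-centric stabilization condition; the graph, the values, and the names are hypothetical.

```python
def hybrid_pursuer_step(graph, ts, parent, pursuer):
    """Follow the tracking tree if this node is in it; otherwise chase timestamps."""
    p = parent.get(pursuer)
    if p is not None and p in graph[pursuer]:  # parent defined and plausible
        return p
    return max(graph[pursuer], key=lambda k: ts[k])


# Hypothetical network: a line 0-1-2-3 with an extra leaf 4 attached to node 1.
GRAPH = {0: [1], 1: [0, 2, 4], 2: [1, 3], 3: [2], 4: [1]}
ts = {0: 0, 1: 1, 2: 5, 3: 6, 4: 0}
parent = {2: 3}  # depth-limited tree: only node 2 knows its way to the evader at 3

position, path = 0, [0]
while position != 3 and len(path) < 10:
    position = hybrid_pursuer_step(GRAPH, ts, parent, position)
    path.append(position)
print(path)
```

Outside the depth-limited tree the pursuer behaves as in the pursuer-centric program, and once it reaches a node with a parent it switches to following the tree.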

Discussion
- We have seen a self-stabilizing program for tracking the presence of an evader in a sensor network.
- The algorithm is tunable and energy-efficient.
- Limitations
  - The fault model is limited: it does not cover communication-related failures.
  - Can the concepts be generalized to address generic sensor-related failures?
  - The numbers of evaders and pursuers are each limited to one.
  - The need for clock synchronization can possibly be relaxed.