Self-Stabilization in Distributed Systems
Barath Raghavan, Vikas Motwani, Debashis Panigrahi
Outline
- Introduction
- Concept of Self-Stabilization
- Dijkstra’s Model of Self-Stabilization
- Super-Stabilization
- Application of Stabilization on Sensor Network
- Discussion
Distributed Systems
- A distributed system consists of loosely connected machines that do not share a global memory
- Each machine has only a partial view of the global state
- The system consists of components: processes and interconnections (message passing/shared memory)
- The topology is formed by the components as nodes and the interconnections as directed edges
- Each component has a local state
- The global system state is the union of the local states
- The behavior of the system is represented by the global states and the transitions between them
- Designing a robust distributed system requires the ability to recover from unforeseen perturbations
- Previous approaches handle permanent failures by introducing redundant components
- A transient malfunction (message corruption, sensor malfunction, memory error) can put the system into an illegal state
Self-stabilization
- A self-stabilizing system guarantees that
  - Regardless of the current state, the system recovers to a legal configuration in a finite number of steps
  - It remains in a legal configuration thereafter until a malfunction occurs
- Examples of self-stabilization
  - Mathematics: Newton-Raphson method for finding a square root
  - Control theory: systems with feedback
  - Distributed systems: pioneered by Dijkstra
- Questions
  - Is self-stabilization always possible? If not, under what conditions?
  - How do we design algorithms that are self-stabilizing?
  - What are the cost of self-stabilization and the convergence rate?
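The Newton-Raphson example can be made concrete with a short sketch. It assumes a positive input n and a positive starting guess; the function name `newton_sqrt`, the tolerance and the step limit are illustrative choices, not part of the slides. Whatever positive guess the iteration starts from (or is knocked back to by a perturbation), it returns to the same fixed point, which is the behavior the slide calls self-stabilizing.

```python
def newton_sqrt(n, x0, tol=1e-12, max_steps=200):
    """Approximate sqrt(n) for n > 0 by Newton-Raphson from any guess x0 > 0.

    Returns (approximation, number of steps used). Names and defaults are
    illustrative assumptions.
    """
    x = x0
    for step in range(max_steps):
        nxt = 0.5 * (x + n / x)  # Newton step for f(x) = x*x - n
        if abs(nxt - x) <= tol * nxt:
            return nxt, step + 1
        x = nxt
    return x, max_steps


# Very different starting "states" all converge to the same legal value.
for start in (1e-6, 1.0, 3.0, 1e6):
    root, steps = newton_sqrt(2.0, start)
    print(f"start={start:g} -> {root:.12f} after {steps} steps")
```

A bad starting guess only costs extra steps; it never changes the result, which mirrors the trade-off between stabilization and convergence rate raised in the questions.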
Properties of self-stabilization
- Closure
  - Some evaluation function (a predicate) P on the system state S, once true, remains true thereafter
- Convergence
  - Regardless of the starting state, P will be satisfied within a finite number of state transitions
  - OR: given a starting state in which a predicate Q is satisfied, P will be satisfied within a finite number of state transitions
- These properties must remain true even against an adversary
  - The adversary may have knowledge of the system’s design and may damage or remove state
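For a small finite system, both properties can be checked by brute force. The sketch below uses a toy system invented purely for illustration: three nodes on a line, each holding one bit, where any node other than the leftmost may copy its left neighbor. The predicate P is "all nodes hold the same value". Closure is checked by inspecting every move out of a legal state; convergence is checked by confirming that no sequence of moves can avoid P forever, whichever enabled node the adversary picks.

```python
from itertools import product


def moves(state):
    """Toy system (assumed): node i > 0 may copy its left neighbor's value."""
    result = []
    for i in range(1, len(state)):
        if state[i] != state[i - 1]:
            nxt = list(state)
            nxt[i] = state[i - 1]
            result.append(tuple(nxt))
    return result


def legal(state):
    """Predicate P: every node holds the same value."""
    return len(set(state)) == 1


def closure_holds(states):
    """P is closed if no move leads from a legal state to an illegal one."""
    return all(legal(t) for s in states if legal(s) for t in moves(s))


def worst_case_steps(state, memo, on_path):
    """Largest number of moves before P holds, or None if P can be avoided."""
    if legal(state):
        return 0
    if state in on_path:
        return None  # a cycle among illegal states
    if state in memo:
        return memo[state]
    successors = moves(state)
    if not successors:
        return None  # stuck in an illegal state
    on_path.add(state)
    worst = 0
    for t in successors:
        d = worst_case_steps(t, memo, on_path)
        if d is None:
            worst = None
            break
        worst = max(worst, d + 1)
    on_path.discard(state)
    memo[state] = worst
    return worst


states = list(product(range(2), repeat=3))
memo = {}
bounds = [worst_case_steps(s, memo, set()) for s in states]
print("closure:", closure_holds(states))
print("convergence:", all(b is not None for b in bounds))
```

Exhaustive search only scales to tiny systems; for realistic ones the same two properties are usually argued by proof rather than enumeration.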
Using Self-Stabilization for Fault Tolerance
- Why use self-stabilization for fault tolerance?
  - In some sense, because it was not designed for fault tolerance
  - Fault tolerance is a side effect of the self-stabilizing behavior of a system
- No case checking
  - Simply guaranteeing that a system is self-stabilizing with respect to some predicate assures that, for that predicate, the system will never “fail”
  - Self-stabilization implies (finite-time) recovery after failure
  - Existing approaches rely upon avoiding failure at all costs, and then handling only certain failure modes
- No probabilistic guarantees necessary
  - Since the system must stabilize regardless of the initial state, one can assert that a self-stabilizing system is perfectly tolerant of transient faults
Dijkstra’s Self-Stabilization Techniques
- His question: can a self-stabilizing system exist at all? Yes.
- Can the processes (finite state machines) be the same? No.
- Are some processes “privileged”? Yes.
- What do they do? No(thing).
Dijkstra’s Self-Stabilization Techniques (2)
- How does his K-state self-stabilizing system work?
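A minimal sketch of the K-state ring as it is commonly presented: n machines in a ring, each holding a value in 0..K-1 with K >= n. Machine 0 is the exceptional one: it is privileged when its value equals that of machine n-1, and it then increments its value modulo K. Every other machine is privileged when its value differs from its left neighbor's, and it then copies that value. A legal state has exactly one privilege. The function names, the random scheduler and the step limit are assumptions made for this illustration.

```python
import random


def privileged(x):
    """Indices of the machines that hold a privilege in ring state x."""
    n = len(x)
    result = [0] if x[0] == x[n - 1] else []
    result += [i for i in range(1, n) if x[i] != x[i - 1]]
    return result


def move(x, i, k):
    """Fire machine i's rule in place (only meaningful if i is privileged)."""
    if i == 0:
        x[0] = (x[0] + 1) % k  # exceptional machine: increment modulo K
    else:
        x[i] = x[i - 1]        # ordinary machine: copy left neighbor


def stabilize(x, k, rng, max_steps=10_000):
    """Fire randomly chosen privileged machines until one privilege remains."""
    steps = 0
    while len(privileged(x)) != 1 and steps < max_steps:
        move(x, rng.choice(privileged(x)), k)
        steps += 1
    return steps


rng = random.Random(1)
x = [3, 0, 4, 4, 1]  # arbitrary, possibly corrupted starting state
steps = stabilize(x, k=5, rng=rng)
print(steps, x, privileged(x))
```

Once a single privilege remains, each further move hands it to the next machine around the ring, so the system stays legal until the next transient fault.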
Pursuer-Evader Problem in Sensor Networks
- Problem: how do you design the motion of the pursuer so that it eventually sees the unpredictable evader?
- Examples
  - Motion tracking
  - Sensing occurrences of unlikely events
- Past approaches are not applicable in sensor networks
  - Limited computational resources (no centralized algorithm)
  - Communication burden
  - Fault-prone environment
  - On-site maintenance infeasible
The Problem
- Input
  - A system consisting of a large number of sensor nodes
  - Each node j has a neighbor set nbr.j that it can communicate with
  - Each node has a clock synchronized with the other nodes
  - Two distinguished processes: pursuer and evader
  - Both processes are mobile, changing location from node to node
  - The evader moves at a slower rate than the pursuer
  - The strategy of evader movement is unknown to the network
- Goal: design a program for the motes and the pursuer so that the pursuer can “catch” the evader, i.e. guarantee that the pursuer and the evader eventually reside at the same node
- A program consists of a set of variables, mote actions, pursuer actions and evader actions
Assumptions
- Evader action
  - The evader picks a random neighbor and moves to that node
- Fault model
  - Transient faults can corrupt the state at any node
  - Transient faults can also restart motes
  - We assume the graph remains connected despite faults
Evader-centric Program
- Each node stores a variable ts: the latest timestamp that the node knows for a detection of the evader
- If a node detects the evader, it sets its ts to the current clock time
- Otherwise, a node compares its own ts value with those of all its neighbors and stores the maximum, with p recording the neighbor that supplied it (the parent node)
- Once the tracking tree is formed, the pursuer simply follows the parent link from each node to reach the evader node
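The timestamp rule can be sketched on a small network. The example assumes a 3 x 3 grid of motes, a stationary evader, synchronous rounds and a clock that ticks once per round; none of these details come from the slides, and all function names are hypothetical. The starting state is deliberately corrupted, with stale timestamps below the clock and arbitrary parent links, to show that the tracking tree still forms.

```python
def grid(rows, cols):
    """Neighbor sets nbr[j] for a rows x cols grid of motes (assumed topology)."""
    nbr = {}
    for r in range(rows):
        for c in range(cols):
            cand = [(r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)]
            nbr[(r, c)] = [(a, b) for a, b in cand
                           if 0 <= a < rows and 0 <= b < cols]
    return nbr


def round_update(nbr, ts, p, evader, clock):
    """One synchronous round of the timestamp rule over all motes."""
    new_ts, new_p = dict(ts), dict(p)
    for j in nbr:
        if j == evader:
            new_ts[j], new_p[j] = clock, j  # detecting mote is the tree root
        else:
            best = max(nbr[j], key=lambda k: ts[k])
            if ts[best] > ts[j]:
                new_ts[j], new_p[j] = ts[best], best
    return new_ts, new_p


def follow_parents(p, start, limit):
    """Path the pursuer takes by following parent links from start."""
    path = [start]
    while p[path[-1]] != path[-1] and len(path) <= limit:
        path.append(p[path[-1]])
    return path


nbr = grid(3, 3)
evader, pursuer = (0, 0), (2, 2)
# Corrupted start: stale timestamps (all below the clock), arbitrary parents.
ts = {j: (7 * i) % 50 for i, j in enumerate(nbr)}
p = {j: nbr[j][-1] for j in nbr}
clock = 100
for _ in range(len(nbr)):
    ts, p = round_update(nbr, ts, p, evader, clock)
    clock += 1
path = follow_parents(p, pursuer, len(nbr))
print(path)
```

Fresh timestamps are always larger than stale ones because ts < clock is assumed, so they spread outward from the evader one hop per round and overwrite the corrupted state.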
Evader-centric Program: Observations
- The tracking tree is a spanning tree rooted at the mote where the evader resides
- The distance between pursuer and evader does not increase once the tree includes the pursuer
- The pursuer catches the evader in at most M + 2M*a/(1-a) steps
- Stabilization: if we assume (ts < clock) and (parent is a neighbor), it can be proved that the above algorithm will stabilize
- Performance
  - Energy = deg * N
  - Time = D + 2D*a/(1-a)
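A small worked calculation of the catch bound M + 2M*a/(1-a) may help. The slides do not define a; the sketch assumes it is the ratio of evader speed to pursuer speed, so 0 <= a < 1 because the evader is slower. The function name `catch_bound` is hypothetical.

```python
def catch_bound(m, a):
    """Upper bound M + 2*M*a/(1 - a) on pursuer steps.

    Assumes a is the evader-to-pursuer speed ratio, with 0 <= a < 1.
    """
    if not 0 <= a < 1:
        raise ValueError("a must satisfy 0 <= a < 1")
    return m + 2 * m * a / (1 - a)


print(catch_bound(10, 0.0))   # stationary evader: the bound is just M
print(catch_bound(10, 0.5))   # evader at half the pursuer's speed
print(catch_bound(10, 0.9))   # bound grows quickly as a approaches 1
```

Under that reading, the bound stays finite only because the evader is strictly slower than the pursuer.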
Pursuer-Centric Program
- Each node updates its ts when it detects the evader
- Motes communicate with neighbors only at the request of the pursuer
- When the pursuer visits a node, it sets that node’s ts to 0 and moves to the neighboring node with the maximum ts
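The pursuer's move rule can be sketched as follows. The example assumes a small hand-built graph in which the evader has already walked from node 1 to node 4, leaving increasing timestamps behind, and then stays put; the graph, the timestamps and the function name `pursuer_step` are all illustrative.

```python
def pursuer_step(nbr, ts, here):
    """Reset the visited mote's ts to 0, then pick the neighbor with max ts."""
    ts[here] = 0
    return max(nbr[here], key=lambda k: ts[k])


# Assumed topology: a path 0-1-2-3-4 with a dead-end branch 0-5.
nbr = {0: [1, 5], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3], 5: [0]}
# Timestamps left by the evader's detections at nodes 1, 2, 3 and 4.
ts = {0: 0, 1: 1, 2: 2, 3: 3, 4: 4, 5: 0}
evader = 4

here, path = 0, [0]
while here != evader and len(path) <= len(nbr):
    here = pursuer_step(nbr, ts, here)
    path.append(here)
print(path)
```

Resetting ts on each visit keeps the pursuer from being drawn back along the trail it has already followed.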
Pursuer-Centric Program: Observations
- If the pursuer reaches a mote j where ts > 0, the pursuer catches the evader in at most N*a/(1-a) steps
- The pursuer catches the evader within O(N^2 * log(N)) steps
- Stabilization: initialization of ts = 0
- Performance metrics
  - Communication: degree communications per step
  - Time: O(N^2 * log(N))
A Hybrid Pursuer-Evader Program
- Modify the evader-centric program to limit the tracking depth to save energy
- Modify the pursuer-centric program to exploit the tracking tree structure
- Remove cycles from the graph
- Check for and correct fake tree roots
- Pursuer: if a parent is defined, follow the parent; otherwise select the neighbor with the maximum ts value
- Performance metrics
  - Communication: n + degree communications at each step
  - Time: O((N-n)^2 * log(N-n))
Discussion
- We have seen a self-stabilizing program to track the presence of an evader in a sensor network
- The algorithm is tunable and energy-efficient
- Limitations
  - The fault model is limited: no communication-related failures
  - Can the concepts be generalized to address generic sensor-related failures?
  - The number of evaders and the number of pursuers are each limited to one
  - The need for clock synchronization can possibly be relaxed