Termination Detection of Diffusing Computations Chapter 19 Distributed Algorithms by Nancy Lynch Presented by Jamie Payton Oct. 3, 2003.

Motivation
- We would like to simplify programming of asynchronous networks
  - Use a monitoring algorithm for:
    - Debugging
    - Backup
    - Termination detection
    - Deadlock detection
    - Global quantity computation

Approaches
- Stable property detection
  - A stable property is any property of the global state that, once established, remains true
  - E.g., deadlock or termination
- Consistent global snapshots
  - Capture a global state of the network that appears to all processes to have been taken at the same instant on every process

Asynchronous Networks
- Collection of processes
- Collection of channels
- Communication subsystem
  - Point-to-point
  - Broadcast
  - Multicast

Network Model
- n-node directed graph G = (V, E)
- The n processes are represented as nodes of the graph
- Channels are represented as directed edges of the graph
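
A minimal sketch of this network model in Python, assuming nodes are numbered 1..n and each channel is an ordered pair (i, j). The helper name `make_network` and the sample topology are illustrative and do not come from the slides.

```python
def make_network(n, edges):
    """Build out-neighbor and in-neighbor maps for a directed graph G = (V, E).

    V is assumed to be {1, ..., n}; each edge (i, j) stands for a channel
    carrying messages from process i to process j.
    """
    out_nbrs = {i: set() for i in range(1, n + 1)}
    in_nbrs = {i: set() for i in range(1, n + 1)}
    for i, j in edges:
        out_nbrs[i].add(j)
        in_nbrs[j].add(i)
    return out_nbrs, in_nbrs


# Hypothetical 4-process network with bidirectional links 1-2, 1-3, 2-4.
out_nbrs, in_nbrs = make_network(
    4, [(1, 2), (2, 1), (1, 3), (3, 1), (2, 4), (4, 2)]
)
```

A bidirectional link is simply two directed edges, which matches the slides' view of a channel as a directed edge.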

Send/Receive Model
- Processes
  - Input actions: Receive(m)_{i,j}
  - Output actions: Send(m)_{i,j}
- Channels
  - Input actions: Send(m)_{i,j}
  - Output actions: Receive(m)_{i,j}
- Each process (associated with a node of G) and each channel (associated with an edge of G) is modeled as an I/O automaton
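
The channel side of this model can be sketched as a small class. This is an illustration only: the class name `Channel` is hypothetical, and reliable FIFO delivery is an assumption that the slide does not state explicitly.

```python
from collections import deque


class Channel:
    """Channel from process i to process j, sketched as an I/O automaton.

    Assumption (not stated on the slide): delivery is reliable and FIFO.
    """

    def __init__(self, i, j):
        self.i, self.j = i, j
        self.queue = deque()  # messages in transit, oldest first

    def send(self, m):
        """Input action Send(m)_{i,j}: always enabled, appends m to the queue."""
        self.queue.append(m)

    def receive_enabled(self):
        """The output action Receive(m)_{i,j} is enabled iff a message is in transit."""
        return bool(self.queue)

    def receive(self):
        """Output action Receive(m)_{i,j}: deliver the oldest message to process j."""
        return self.queue.popleft()


c = Channel(1, 2)
c.send("a")
c.send("b")
first = c.receive()
```

The same action name appears on both sides: Send is an output of the process and an input of the channel, which is what lets the two automata be composed.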

Termination Detection for Diffusing Computations
- All processes are initially quiescent
- A designated “live” process diffuses the computation to other processes
- Processes awaken, perform work, and spread the computation
- Processes become quiescent again
- When all processes have become quiescent, termination should be signaled

Problem Statement (1)
- Initially
  - No locally controlled actions can be performed
  - All channels are empty
- A single process is awakened by an input event from the environment
  - It can perform an output action, send, to generate an event for other processes
- A process performs an input action, receive, which wakes it up
- Awakened processes perform tasks and spread the computation with output actions
- A process becomes quiescent again when
  - No locally controlled action can be performed
  - No messages exist in its channels

Problem Statement (2)
- Given the initial state of the network as described:
  - If a process A_i at node i receives an input event, and after some time a global quiescent state is reached, then a special done_i output should be performed at node i

Detecting Termination
- Detect termination of algorithm A with a monitoring algorithm, B(A)
  - B(A) is a send/receive algorithm
  - Based on the graph G
  - Issues a special done output when termination is detected

Dijkstra-Scholten Algorithm (1)
- Basic idea:
  - Augment the diffusing algorithm with construction and maintenance of a spanning tree
  - Tree is rooted at the source node of the diffusing computation
  - Tree includes all active nodes
  - Tree grows/shrinks as nodes become active/idle

Dijkstra-Scholten Algorithm (2)
- All nodes are initially idle
- A single node, the source of the diffusing computation, becomes active
- When an idle node receives a message of the diffusing computation algorithm A, it
  - Becomes active
  - Records the sender as its parent
  - Does not ack the message until it is ready to become idle again
- An active node must wait for all of its outgoing messages to be acknowledged before becoming idle again
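
The rules above can be sketched as a small single-threaded simulation. Several points are assumptions rather than content from the slides: the underlying algorithm A is a hypothetical flood that forwards a hop counter, local work is instantaneous, acks go straight into one global FIFO queue instead of per-neighbor send buffers, and a node that is already active acks an incoming message immediately (the usual Dijkstra-Scholten rule, which this slide leaves implicit). The names `Node`, `run`, `send_work`, and `try_cleanup` are illustrative.

```python
from collections import defaultdict, deque


class Node:
    def __init__(self):
        self.status = "idle"             # "idle", "root", or "non-root"
        self.parent = None               # sender of the message that woke this node
        self.deficit = defaultdict(int)  # unacknowledged messages, per neighbor


def run(nbrs, source, hops):
    """Simulate Dijkstra-Scholten on top of a hypothetical flooding algorithm A."""
    nodes = {v: Node() for v in nbrs}
    net = deque()                        # in-flight (src, dst, payload) triples
    stats = {"msgs": 0, "acks": 0, "done": False}

    def send_work(u, remaining):
        # Hypothetical A: forward to every neighbor while hops remain.
        if remaining > 0:
            for v in nbrs[u]:
                nodes[u].deficit[v] += 1
                net.append((u, v, remaining - 1))
                stats["msgs"] += 1

    def try_cleanup(u):
        # A node may leave the tree only once all its messages are acked.
        node = nodes[u]
        if node.status == "idle" or any(node.deficit.values()):
            return
        if node.status == "root":
            stats["done"] = True         # the root announces termination
        else:
            net.append((u, node.parent, "ack"))  # deferred ack to the parent
            stats["acks"] += 1
        node.status, node.parent = "idle", None

    nodes[source].status = "root"
    send_work(source, hops)
    try_cleanup(source)
    while net:
        src, dst, payload = net.popleft()
        node = nodes[dst]
        if payload == "ack":
            node.deficit[src] -= 1
        else:
            if node.status == "idle":
                # Join the tree; the ack is withheld until cleanup.
                node.status, node.parent = "non-root", src
            else:
                # Already in the tree: acknowledge right away.
                net.append((dst, src, "ack"))
                stats["acks"] += 1
            send_work(dst, payload)
        try_cleanup(dst)
    return stats, nodes


# Hypothetical 4-node network; node 1 is the source of the diffusing computation.
stats, nodes = run({1: [2, 3], 2: [1, 4], 3: [1], 4: [2]}, source=1, hops=2)
```

In this run the flood sends 5 messages, each is acknowledged exactly once, and the root reports termination only after every other node has returned to idle.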

Example
[Figure: diffusing computation among processes A1, A2, A3, A4]

Dijkstra-Scholten Automaton (1)
- Signature of DS(A_i)
  - Signature of A_i, plus:
  - Input: Receive(“ack”)_{j,i}, where j ∈ nbrs
  - Output: Send(“ack”)_{i,j}, where j ∈ nbrs; Done_i
  - Internal: Cleanup_i

Dijkstra-Scholten Automaton (2)
- State information for DS(A_i)
  - All state info for the diffusing computation A_i
  - s_i ∈ {idle, root, non-root}, initially idle
  - p_i ∈ nbrs ∪ {null}, initially null
  - ∀ j ∈ nbrs:
    - send-buffer(j), a FIFO queue of ack msgs, initially empty
    - deficit(j) ∈ ℕ, initially 0

Dijkstra-Scholten Automaton (3)
- Transitions: see handout

Lemma
In any state after an input at node i:
- s_i ∈ {root, idle} and p_i = null
- ∀ j ≠ i: s_j ∈ {idle, non-root}, and if s_j = non-root, then p_j ≠ null
- ∀ j: if s_j = idle, then the projected state of A_j is quiescent, p_j = null, and deficit(k)_j = 0 for every k
- ∀ j,k: deficit(k)_j = numMsgs_{j,k} + numBufAcks_{j,k} + numAcks_{j,k} + (1 if p_k = j, 0 otherwise)
- If s_i = root, then the parent pointers form a directed tree rooted at i that spans the set of nodes with s ≠ idle
- If s_i = idle, then s_j = idle for all j and all channels are empty
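
The tree clause of the Lemma can be checked mechanically on a recorded global state. The sketch below assumes the state is given as two dictionaries, `status` and `parent`, keyed by node. That representation and the function name `parents_form_tree` are illustrative, not taken from the slides or the handout.

```python
def parents_form_tree(root, status, parent):
    """Return True iff parent pointers form a tree rooted at `root`
    that spans exactly the nodes whose status is not "idle"."""
    if status[root] != "root" or parent[root] is not None:
        return False
    for start, s in status.items():
        if s == "idle":
            # Idle nodes must be outside the tree.
            if parent[start] is not None:
                return False
            continue
        # Follow parent pointers; every non-idle node must reach the root.
        v, seen = start, set()
        while v != root:
            if v in seen or parent[v] is None or status[v] == "idle":
                return False  # cycle, dangling pointer, or idle ancestor
            seen.add(v)
            v = parent[v]
    return True


# Hypothetical state: 1 is the root, 2 and 4 are active, 3 is idle.
status = {1: "root", 2: "non-root", 3: "idle", 4: "non-root"}
good = parents_form_tree(1, status, {1: None, 2: 1, 3: None, 4: 2})
bad = parents_form_tree(1, status, {1: None, 2: 4, 3: None, 4: 2})
```

Here `good` is true because both 2 and 4 reach node 1, while `bad` is false because 2 and 4 point at each other and never reach the root.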

Proof Outline (1)
- Claim: the Dijkstra-Scholten algorithm detects termination for a diffusing algorithm A
  - Safety: the Lemma implies that if the root announces termination, then A is quiescent
  - Liveness remains to be shown: if A becomes quiescent, then the root eventually announces termination

Proof Outline (2)
- Proof by contradiction:
  - Assume that A becomes quiescent and no done event occurs
  - After this point, no A messages are sent or received again
    - The tree cannot grow
    - The tree must eventually stop shrinking
    - Eventually, there are no further ack messages in the global state

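Proof Outline (3)
- With no A messages and no acks left, the deficit formula of the Lemma reduces to:
  - ∀ j,k: deficit(k)_j = 1 if p_k = j, 0 otherwise
- A leaf of the tree is no node's parent, so its deficit is 0 for every neighbor
  - This means that the cleanup action is enabled at that leaf
  - The tree can shrink further, which is a contradiction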

Complexity Analysis
- Messages: 2m, where m is the number of messages sent by the algorithm A
- Time: O(m(l+d)), where
  - l is an upper bound on the time for a process to perform a task
  - d is an upper bound on the time for a channel to perform a task
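
The message bound follows from a one-line count. This sketch assumes that every message of A is acknowledged exactly once, either immediately (receiver already active) or when the receiver leaves the tree:

```latex
\underbrace{m}_{\text{messages of } A}
\;+\;
\underbrace{m}_{\text{acks, one per message}}
\;=\; 2m
```

For instance, a run of A that sends m = 5 messages costs 10 messages in total once the monitoring acks are included.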

Consistent Global Snapshots

CSG Problem Statement

Chandy-Lamport Algorithm

Example

Proof Outline

CSG Applications

Conclusions
- Use a monitoring algorithm to detect termination of a computation
- Use a spanning tree to track the nodes participating in the computation
- Use acks and message counters to maintain the tree