Efficient Algorithms for Distributed Snapshots and Global Virtual Time Approximation. Author: Friedemann Mattern. Presented by: Shruthi Koundinya.

Overview
Introduction, Models and Definitions, Consistent Cut Algorithm, Distributed Monotonic Computation Scheme, Distributed Simulation, Global Virtual Time, Summary

Introduction
Efficient snapshot algorithms that obtain the global state of a system under non-FIFO conditions and without acknowledgments.
Assumptions:
Processes and channels form a strongly connected finite graph.
Messages are delivered correctly after an arbitrary delay, but not necessarily in the order in which they were sent.
There are three types of events: send, receive, and internal.
Applications:
Detecting stable properties, such as termination and deadlock.
Computing monotonic functions of the global state, such as a lower bound on simulation time (Global Virtual Time).

Models and Definitions
Causally consistent global state: the local states of the processes together with the states of the channels (the messages still in transit).
Monotonic function of the global state: f is monotonic on the partially ordered set of cuts if f(s(c)) <= f(s(c')) whenever c is contained in c', where s(c) is the global state along cut c.
Infimum: the greatest lower bound. The infimum of a subset of a partially ordered set is the greatest element that is less than or equal to every element of the subset.

Consistent Cut Algorithm
Each process is either white or red. Initially all processes are white.
A process becomes red when it takes its local snapshot.
Every basic message carries the color of its sender.
Every process takes its local snapshot at a convenient time, but no later than the receipt of its first red message.
A single process initiates the snapshot algorithm and uses a virtual broadcast of control messages to reach the rest of the processes.
Because channels are non-FIFO, a process may receive a red basic message before the red control message that asks it to take its local snapshot.

Consistent Cut Algorithm: example with process 1 as initiator (figures not reproduced)
1. Process 1 is the initiator.
2. Control messages and a red message are sent.
3. The red message is received at process 2.
4. Initiator sends a white message to 1; the control message from 1 reaches 3.
5. The white message is received at 1; the red control message is received.
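
The coloring rules of the consistent cut algorithm can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the class name `Process`, the integer `state`, and the method names are hypothetical and do not come from the presentation, and channels and control messages are left out.

```python
from dataclasses import dataclass
from typing import Optional

WHITE, RED = "white", "red"

@dataclass
class Process:
    pid: int
    state: int = 0                    # hypothetical application state
    color: str = WHITE                # every process starts white
    snapshot: Optional[int] = None    # local state recorded when turning red

    def take_snapshot(self):
        """Record the local state once and turn red."""
        if self.color == WHITE:
            self.snapshot = self.state
            self.color = RED

    def send(self, amount):
        """Basic messages piggyback the sender's current color."""
        self.state -= amount
        return (self.color, amount)

    def receive(self, message):
        color, amount = message
        # A red message forces the local snapshot BEFORE the message is
        # processed, so its effect never appears in a pre-snapshot state.
        if color == RED:
            self.take_snapshot()
        self.state += amount

# Usage: process 1 initiates, then its red message overtakes any control message.
p1, p2 = Process(pid=1, state=10), Process(pid=2, state=5)
p1.take_snapshot()
p2.receive(p1.send(3))
```

In this run process 2 records its state before applying the red message, which is exactly the property that keeps the cut consistent without FIFO channels.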

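Catching Messages in Transit: Counting Method
Each process keeps a counter of the white messages it has sent minus the white messages it has received.
When the local snapshots are taken, these counters are sent to the initiator.
The initiator thereby knows the total number of white messages still in transit and waits until all of them have been received.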
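
A small sketch of the bookkeeping at the initiator, assuming each process reports a counter equal to white messages sent minus white messages received, and assuming that a red process forwards a copy of every late white message to the initiator. The class `Initiator` and its method names are hypothetical.

```python
class Initiator:
    """Collects per-process counters and the white messages caught in transit."""

    def __init__(self, counters):
        # Summing (sent - received) over all processes leaves exactly the
        # number of white messages that were still in transit at the cut.
        self.expected = sum(counters)
        self.caught = []

    def on_forwarded_white(self, message):
        """Called when a red process forwards a late white message (assumption)."""
        self.caught.append(message)

    def channel_state_complete(self):
        """True once every in-transit white message has been caught."""
        return len(self.caught) == self.expected

# Usage: P0 sent 2 white messages, P1 received 1 of them, P2 did nothing.
initiator = Initiator([2, -1, 0])
```

The snapshot is complete once `channel_state_complete()` returns true, at which point `caught` holds the channel state.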

Catching Messages in Transit: Vector Counter Principle
Every process P_i counts the number of white messages it has sent to P_j in the j-th component of a local vector V_i.
A control message carrying a control vector C circulates on a ring of the processes.
Round 1: the control message travels once around the ring. At each P_i it adds the local vector to the control vector (C := C + V_i) and resets V_i to zero. Afterwards C[i] counts the white messages destined for P_i.
Round 2: every process P_i with C[i] > 0 is revisited, and the control message waits at that process until its count of outstanding white messages drops to 0.
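
The two rounds can be sketched as follows. This is an illustration only: the function names are hypothetical, the ring is modelled as a plain loop, and the sketch assumes that a process records each white message it receives by decrementing its own component, so that the outstanding count can reach zero.

```python
def round_one(vectors):
    """Visit every process once: C := C + V_i, then reset V_i to zero."""
    n = len(vectors)
    control = [0] * n
    for v in vectors:
        for j in range(n):
            control[j] += v[j]
            v[j] = 0
    return control

def on_white_receive(vectors, i):
    """Assumption: P_i records a received white message in its own component."""
    vectors[i][i] -= 1

def outstanding(control, vectors, i):
    """White messages destined for P_i that it has not yet received."""
    return control[i] + vectors[i][i]

# Usage: P0 sent 2 white messages to P1 (one already received), P2 sent 1 to P0.
vectors = [[0, 2, 0], [0, -1, 0], [1, 0, 0]]
control = round_one(vectors)
# Round 2: the control message waits at P_i while outstanding(...) > 0.
on_white_receive(vectors, 0)
```

After round 1 the control vector alone tells the control message which processes it must wait at, so no acknowledgments are needed.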

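Distributed Monotonic Computation Scheme
Each process P_i holds a local value x_i, and every message carries an X-stamp.
Internal event: x_i can only increase.
Send event: x_i does not decrease, and the X-stamp of the message is >= the current value of x_i.
Receive event: the new x_i is >= the value of x_i before the receive event or >= the X-stamp of the received message, that is, >= the smaller of the two.
The global infimum function tells how far the computation has progressed.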
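
The rules of the scheme can be written as runtime checks. This is a hedged sketch: `DmcProcess` and `global_infimum` are hypothetical names, and the receive rule is read here as "the new value is at least the minimum of the old value and the X-stamp", which is an interpretation of the slide.

```python
class DmcProcess:
    def __init__(self, x):
        self.x = x

    def internal(self, new_x):
        """Internal event: x may only increase."""
        if new_x < self.x:
            raise ValueError("internal event must not decrease x")
        self.x = new_x

    def send(self, stamp):
        """Send event: the X-stamp is at least the current x."""
        if stamp < self.x:
            raise ValueError("X-stamp must be >= current x")
        return stamp

    def receive(self, stamp, new_x):
        """Receive event: new x is at least min(old x, X-stamp)."""
        if new_x < min(self.x, stamp):
            raise ValueError("x dropped below min(old x, X-stamp)")
        self.x = new_x

def global_infimum(processes, in_transit_stamps):
    """Minimum over all local values and all X-stamps still in transit."""
    return min([p.x for p in processes] + list(in_transit_stamps))

# Usage: the infimum never decreases while the rules are obeyed.
p, q = DmcProcess(10), DmcProcess(7)
before = global_infimum([p, q], [])
stamp = p.send(12)
during = global_infimum([p, q], [stamp])
q.receive(stamp, 9)
after = global_infimum([p, q], [])
```

Because every rule keeps the new values above the old minimum, the global infimum can serve as a progress measure.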

How to Handle Inconsistent Messages: Closure C*
DMC allows messages FUTURE → PAST, that is, messages sent after the cut and received before it.
A FUTURE → PAST message does not affect the global infimum of the system.
A globally consistent cut is reached from a cut C through its closure: C* = C ∪ { e' ∈ E | ∃ e ∈ C : e' → e }
f(C) is a lower bound: f(C) <= f(C*) <= f(C')
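
The closure can be computed by walking the happened-before relation backwards from the events in the cut. The sketch below assumes the relation is given as a dictionary from each event to its direct predecessors; the function name and the event labels are hypothetical.

```python
def closure(cut, predecessors):
    """Return C* = C plus every event that happens before some event in C.

    predecessors maps an event to the set of its direct causal predecessors
    (the previous local event and, for a receive, the matching send).
    """
    closed = set(cut)
    stack = list(cut)
    while stack:
        event = stack.pop()
        for pred in predecessors.get(event, ()):
            if pred not in closed:
                closed.add(pred)
                stack.append(pred)
    return closed

# Usage: a2 sends a message that b1 receives, so {a1, b1} is inconsistent.
predecessors = {"a2": {"a1"}, "b1": {"a2"}, "b2": {"b1"}}
consistent = closure({"a1", "b1"}, predecessors)
```

Here the cut contains the receive event b1 without its send event a2; the closure adds a2 and the cut becomes consistent.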

Distributed Simulation
A distributed discrete-event simulation system consists of a set of sequential event-driven simulators, implemented by autonomous processes that interact by messages.
Chronology is essential in simulations.
Optimistic scheme: a simulator may directly execute its earliest pending event. If an event message with an earlier timestamp arrives later, the simulator rolls back.
Open question: how long must saved state be kept in memory for a possible rollback?
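
A minimal sketch of the optimistic scheme, assuming one checkpoint per executed event and an integer state. The class `OptimisticSimulator` is hypothetical, re-execution of rolled-back events is omitted, and `fossil_collect` assumes that no rollback ever goes below a known lower bound such as GVT.

```python
class OptimisticSimulator:
    def __init__(self):
        self.clock = 0
        self.state = 0
        self.saved = [(0, 0)]   # (timestamp, state) checkpoints, oldest first

    def execute(self, timestamp, delta):
        """Execute an event; a straggler triggers a rollback first."""
        if timestamp < self.clock:
            self.rollback(timestamp)
        self.clock = timestamp
        self.state += delta
        self.saved.append((self.clock, self.state))

    def rollback(self, timestamp):
        """Restore the latest checkpoint that is not later than timestamp."""
        while self.saved[-1][0] > timestamp:
            self.saved.pop()
        self.clock, self.state = self.saved[-1]

    def fossil_collect(self, bound):
        """Drop checkpoints that can no longer be the target of a rollback."""
        while len(self.saved) > 1 and self.saved[1][0] <= bound:
            self.saved.pop(0)

# Usage: the event at time 15 arrives after the event at time 20 was executed.
sim = OptimisticSimulator()
sim.execute(10, 5)
sim.execute(20, 3)
sim.execute(15, 1)
sim.fossil_collect(12)
```

The last call shows why a lower bound on simulation time matters: only checkpoints at or after the newest one below the bound need to stay in memory.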

GVT Approximation
GVT approximation is also called distributed infimum approximation.
GVT is calculated along the cut C'.
Messages sent between C and C' are used to determine the GVT.
The consistent cut algorithm and the vector counter implementation are used to devise the algorithm.

GVT Approximation: Rules
Each process has a local clock and a vector counter.
The rules cover four cases: sending a message, receiving a message, a control message reaching P_i, and a control message reaching P_init.
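
One way to picture the computation is sketched below. It is an assumption-laden illustration, not the presenter's rule set: `GvtProcess`, `on_control`, and `report` are hypothetical names, the first control round marks cut C, the reports are taken along C', and all white messages are assumed to have been received by then (for example by using the vector counters).

```python
import math

class GvtProcess:
    def __init__(self, clock):
        self.clock = clock
        self.red = False
        self.min_red_sent = math.inf   # smallest timestamp sent after cut C

    def on_control(self):
        """First control round: the process turns red (cut C)."""
        self.red = True

    def send(self, timestamp):
        """Track the minimum timestamp of messages sent between C and C'."""
        if self.red:
            self.min_red_sent = min(self.min_red_sent, timestamp)
        return ("red" if self.red else "white", timestamp)

    def report(self):
        """Second control round: values contributed along cut C'."""
        return self.clock, self.min_red_sent

def gvt_approximation(processes):
    """Minimum over local clocks and timestamps of messages sent after C."""
    clocks, sends = zip(*(p.report() for p in processes))
    return min(min(clocks), min(sends))

# Usage: q sends a message stamped 16 after C, then advances its clock to 30.
p, q = GvtProcess(20), GvtProcess(15)
p.on_control()
q.on_control()
message = q.send(16)
q.clock = 30
gvt = gvt_approximation([p, q])
```

The message stamped 16 may still be in transit at C', so the estimate has to account for it even though both local clocks are already larger.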

Summary
1. Consistent cut
2. Catching messages in transit
3. Monotonic functions
4. GVT

Q/A