# Slides for Chapter 11: Time and Global State

From Coulouris, Dollimore and Kindberg Distributed Systems: Concepts and Design Edition 4, © Pearson Education 2005

Figure 11.1 Skew between computer clocks in a distributed system
* Clock skew: the instantaneous difference between the readings of two clocks.
* Clock drift: clocks count time at different rates.
* Quartz crystals are used for clocks; they oscillate at slightly different rates, and their drift rate ranges from $10^{-8}$ to $10^{-6}$ seconds per second.
* Atomic oscillators drift far less.
* Coordinated Universal Time (UTC)

Instructor’s Guide for Coulouris, Dollimore and Kindberg Distributed Systems: Concepts and Design Edn 4 © Pearson Education 2005

Synchronizing Physical Clocks
* Authoritative source S of UTC time, S(t).
* System P of N processes; process i has clock Ci.
* Synchronization bound D > 0.
* External synchronization: $|S(t) - C_i(t)| < D$ for i = 1, 2, …, N and for all times t. The clocks Ci are accurate to within the bound D.
* Internal synchronization: $|C_i(t) - C_j(t)| < D$ for i, j = 1, 2, …, N. The clocks Ci agree within the bound D.
* If a system is externally synchronized with bound D, then it is internally synchronized with bound 2D.
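The 2D bound follows from the triangle inequality. A short derivation, using only the symbols defined on this slide:

```latex
\begin{align*}
|C_i(t) - C_j(t)| &= |(C_i(t) - S(t)) + (S(t) - C_j(t))| \\
                  &\le |C_i(t) - S(t)| + |S(t) - C_j(t)| \\
                  &< D + D = 2D
\end{align*}
```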

Synchronization in a synchronous system
* Process p1 sends its local time t to p2 in a message m.
* Let Ttrans be the time taken to transmit m from p1 to p2. Ttrans varies between a lower bound min and an upper bound max.
* The uncertainty in the message transmission time is u = max − min.
* Should p2 set its clock to t + max or to t + min? Either choice can be wrong by as much as u.
* If p2 sets its clock to t + (max + min)/2, the skew is at most u/2.
* In general, the optimum bound on the skew when synchronizing N clocks is u(1 − 1/N).
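As a rough illustration of the arithmetic on this slide, the sketch below computes the midpoint setting and the resulting skew bound. The function names and the example numbers are hypothetical, chosen only for this example.

```python
def synchronous_clock_setting(t, min_delay, max_delay):
    """Return (clock value p2 adopts, worst-case skew) for a sent time t.

    Assumes a synchronous system where the transmission time is known
    to lie between min_delay and max_delay.
    """
    if not 0 <= min_delay <= max_delay:
        raise ValueError("require 0 <= min_delay <= max_delay")
    u = max_delay - min_delay                    # uncertainty in transmission time
    setting = t + (min_delay + max_delay) / 2    # midpoint of the possible range
    return setting, u / 2


def optimum_skew_bound(u, n):
    """Optimum skew bound u(1 - 1/N) when synchronizing n clocks."""
    return u * (1 - 1 / n)


# Hypothetical numbers: t = 100, min = 2, max = 6 (any consistent time unit).
setting, skew = synchronous_clock_setting(100.0, 2.0, 6.0)
print(setting, skew)                 # 104.0 2.0
print(optimum_skew_bound(4.0, 2))    # 2.0
```

For N = 2 the general bound u(1 − 1/N) reduces to u/2, which matches the two-process case above.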

Figure 11.2 Clock synchronization using a time server – Cristian’s algorithm
[Figure: process p sends a request mr to the time server S; S replies with a message mt carrying the time t.]
* p requests the time from S in a message mr.
* S receives mr, places its time t in a reply mt and sends mt to p.
* When p receives mt, what is the time at S?
* The earliest time at which S could have placed t in mt is min after p sent mr.
* The latest time at which S could have placed t in mt is min before p received mt.
* With round-trip time Tround, the time at S when p receives mt therefore lies in the range [t + min, t + Tround − min].
* The width of this range is Tround − 2min, so the accuracy is ±(Tround/2 − min).
* A single time server is prone to faults.
* Berkeley algorithm: the master sends each slave the amount by which its clock requires adjustment.
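The range argument above can be turned into a small calculation. This is a minimal sketch, assuming p sets its clock to the midpoint of the range, t + Tround/2; the function name and the millisecond values are hypothetical.

```python
def cristian_estimate(t_server, t_round, min_delay=0.0):
    """Estimate the server's time at the moment the reply arrives.

    t_server  : time t that S placed in the reply mt
    t_round   : round-trip time measured by p on its own clock
    min_delay : assumed minimum one-way transmission time
    Returns (estimate, accuracy), i.e. the estimate is correct to
    within +/- accuracy.
    """
    if t_round < 2 * min_delay:
        raise ValueError("round trip cannot be shorter than 2 * min_delay")
    estimate = t_server + t_round / 2      # midpoint of [t + min, t + Tround - min]
    accuracy = t_round / 2 - min_delay     # half the width of that range
    return estimate, accuracy


# Hypothetical values in milliseconds: t = 1000, Tround = 20, min = 4.
print(cristian_estimate(1000, 20, 4))   # (1010.0, 6.0)
```

A shorter round trip gives a tighter accuracy, which is why a client may repeat the request and keep the reply with the smallest Tround.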

Figure 11.3 An example synchronization subnet in an NTP implementation
[Figure: a synchronization subnet of servers arranged in strata.]
Note: arrows denote synchronization control; numbers denote strata.
* Reliable service despite lengthy disconnected periods
* Clients can resynchronize sufficiently frequently to offset drift
* Protection against interference

Servers synchronize with each other from time to time. NTP uses UDP. Synchronization modes include multicast and procedure call (similar to Cristian’s algorithm).

Figure 11.4 Messages exchanged between a pair of NTP peers
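[Figure: Server A sends message m at $T_{i-3}$ and Server B receives it at $T_{i-2}$; B sends m' at $T_{i-1}$ and A receives it at $T_i$.]
* For each pair of messages exchanged between two servers, NTP computes an offset $o_i$, which is an estimate of the actual offset between the two clocks, and a delay $d_i$, the total transmission time of the two messages.
* Let $o$ be the true offset of the clock at B relative to A, and let $t$ and $t'$ be the actual transmission times of $m$ and $m'$.
* $T_{i-2} = T_{i-3} + t + o$ and $T_i = T_{i-1} + t' - o$.
* $d_i = t + t' = T_{i-2} - T_{i-3} + T_i - T_{i-1}$.
* $o = o_i + (t' - t)/2$, where $o_i = (T_{i-2} - T_{i-3} + T_{i-1} - T_i)/2$.
* Since $t, t' \ge 0$, it follows that $o_i - d_i/2 \le o \le o_i + d_i/2$.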
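The offset and delay formulas can be checked numerically. The sketch below uses a hypothetical function name and made-up timestamps built from a known true offset, so that the bound on o can be verified.

```python
def ntp_offset_and_delay(t_i3, t_i2, t_i1, t_i):
    """Compute (o_i, d_i) from the four timestamps of one message pair.

    t_i3: m sent by A (A's clock)      t_i2: m received by B (B's clock)
    t_i1: m' sent by B (B's clock)     t_i : m' received by A (A's clock)
    """
    offset = ((t_i2 - t_i3) + (t_i1 - t_i)) / 2   # o_i, estimate of the true offset
    delay = (t_i2 - t_i3) + (t_i - t_i1)          # d_i = t + t'
    return offset, delay


# Made-up scenario: true offset o = 5, transmission times t = 2 and t' = 4.
# A sends at 100 -> B receives at 100 + 2 + 5 = 107.
# B sends at 110 -> A receives at 110 + 4 - 5 = 109.
o_i, d_i = ntp_offset_and_delay(100, 107, 110, 109)
print(o_i, d_i)                              # 4.0 6
print(o_i - d_i / 2 <= 5 <= o_i + d_i / 2)   # True
```

The estimate o_i = 4 differs from the true offset 5 by (t' − t)/2 = 1, which is within the d_i/2 = 3 bound.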

Figure 11.5 Events occurring at three processes: logical time and clocks
* LC1: if, for some process $p_i$, $e \rightarrow_i e'$, then $e \rightarrow e'$. Here $\rightarrow_i$ denotes the happened-before order of events at process $p_i$.
* LC2: for any message $m$, $\mathit{send}(m) \rightarrow \mathit{receive}(m)$.
* LC3: if $e \rightarrow e'$ and $e' \rightarrow e''$, then $e \rightarrow e''$.

Figure 11.6 Lamport timestamps for the events shown in Figure 11.5
Totally ordered logical clocks:
* Event e occurs at pi with local timestamp Ti.
* Event e' occurs at pj with local timestamp Tj.
* The global logical timestamps of the two events are (Ti, i) and (Tj, j).
* (Ti, i) < (Tj, j) iff either Ti < Tj, or Ti = Tj and i < j.
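The total order on (Ti, i) pairs coincides with Python's lexicographic tuple comparison. The sketch below pairs it with the usual Lamport clock update rules (increment before each event, take the maximum on receipt); those rules are not spelled out on this slide and the class name is hypothetical.

```python
class LamportClock:
    """Minimal Lamport logical clock; timestamps are (time, process id) pairs."""

    def __init__(self, pid):
        self.pid = pid
        self.time = 0

    def tick(self):
        """Timestamp a local event: increment, then return (T, i)."""
        self.time += 1
        return (self.time, self.pid)

    def send(self):
        """Timestamp a send event; the result is piggybacked on the message."""
        return self.tick()

    def receive(self, msg_time):
        """Merge the piggybacked time, then timestamp the receive event."""
        self.time = max(self.time, msg_time)
        return self.tick()


p1, p2, p3 = LamportClock(1), LamportClock(2), LamportClock(3)
a = p1.tick()            # (1, 1)
b = p1.send()            # (2, 1)
c = p2.receive(b[0])     # (3, 2)
e = p3.tick()            # (1, 3)

# Tuple comparison is exactly: Ti < Tj, or Ti == Tj and i < j.
print(a < e)                      # True: equal times, tie broken by process id
print(sorted([c, e, b, a]))       # [(1, 1), (1, 3), (2, 1), (3, 2)]
```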

Figure 11.7 Vector timestamps for the events shown in Figure 11.5
* VC1: initially, $V_i[j] = 0$ for $i, j = 1, 2, \ldots, N$.
* VC2: just before $p_i$ timestamps an event, it sets $V_i[i] := V_i[i] + 1$.
* VC3: $p_i$ includes the value $t = V_i$ in every message it sends.
* VC4: when $p_i$ receives a timestamp $t$ in a message, it sets $V_i[j] := \max(V_i[j], t[j])$ for $j = 1, 2, \ldots, N$; that is, it takes the component-wise maximum of the two vector timestamps.

For a vector clock $V_i$:
* $V_i[i]$ is the number of events that $p_i$ has timestamped.
* $V_i[j]$ ($j \ne i$) is the number of events that have occurred at $p_j$ by which $p_i$ has potentially been affected.

Note: $p_j$ may have timestamped more events by this point, but no messages have yet arrived at $p_i$ to this effect.
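Rules VC1 to VC4 can be sketched as a small class. This is only an illustration: the class name is hypothetical and process ids are 0-based here, whereas the slides number processes from 1.

```python
class VectorClock:
    """Vector clock for one of n processes, following rules VC1 to VC4."""

    def __init__(self, pid, n):
        self.pid = pid          # 0-based index of this process
        self.v = [0] * n        # VC1: all entries start at zero

    def event(self):
        """VC2: increment own entry just before timestamping an event."""
        self.v[self.pid] += 1
        return tuple(self.v)

    def send(self):
        """VC3: the returned timestamp is piggybacked on the message."""
        return self.event()

    def receive(self, t):
        """VC4: component-wise max with the received timestamp, then VC2."""
        self.v = [max(a, b) for a, b in zip(self.v, t)]
        return self.event()


p1, p2, p3 = VectorClock(0, 3), VectorClock(1, 3), VectorClock(2, 3)
a = p1.event()        # (1, 0, 0)
b = p1.send()         # (2, 0, 0)   message m1 to p2
c = p2.receive(b)     # (2, 1, 0)
d = p2.send()         # (2, 2, 0)   message m2 to p3
e = p3.event()        # (0, 0, 1)
f = p3.receive(d)     # (2, 2, 2)
print(a, b, c, d, e, f)
```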

Comparing vector timestamps
* $V = V'$ iff $V[j] = V'[j]$ for $j = 1, 2, \ldots, N$.
* $V \le V'$ iff $V[j] \le V'[j]$ for $j = 1, 2, \ldots, N$.
* $V < V'$ iff $V \le V' \wedge V \ne V'$.
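The comparison rules translate directly into code. In this sketch the function names are hypothetical, and the `concurrent` helper goes slightly beyond the slide: it reports the case where neither timestamp is less than or equal to the other.

```python
def vc_leq(v, w):
    """V <= V' iff every component of V is <= the matching component of V'."""
    return all(a <= b for a, b in zip(v, w))


def vc_less(v, w):
    """V < V' iff V <= V' and V != V'."""
    return vc_leq(v, w) and tuple(v) != tuple(w)


def concurrent(v, w):
    """Neither V <= V' nor V' <= V: the two events are not causally ordered."""
    return not vc_leq(v, w) and not vc_leq(w, v)


print(vc_less((2, 0, 0), (2, 1, 0)))     # True
print(vc_less((2, 2, 0), (2, 2, 0)))     # False: equal timestamps
print(concurrent((2, 0, 0), (0, 0, 1)))  # True
```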

Figure 11.8 Detecting global properties
a. Garbage collection
* An object is garbage if there are no references to it.
* An object that is referenced by a message in transit is not garbage.

b. Deadlock
* A deadlock is a cycle in the ‘wait-for’ relation between processes; a subset of the processes is affected.
* If p1 and p2 wait for each other, there will be no progress.

c. Termination
* A passive process is not engaged in any activity; an active process is attempting to perform an activity.
* Detecting termination is difficult. It may appear that all processes have terminated, yet a message from p2 that activates p1 may be in transit when both are perceived to be ‘passive’.

Global states and Consistent Cuts
* The system consists of N processes $p_1, p_2, \ldots, p_N$.
* History of $p_i$: $\mathit{history}(p_i) = h_i = \langle e_i^0, e_i^1, \ldots \rangle$.
* History up to the kth event: $h_i^k = \langle e_i^0, e_i^1, \ldots, e_i^k \rangle$.
* Each event is either an internal action that transforms the process state or a communication action.
* $s_i^k$ is the state of process $p_i$ immediately before the kth event, so that $s_i^0$ is the initial state of $p_i$.
* The global history of the system is the union of the individual process histories.
* Any set of states of the individual processes forms a global state $S = (s_1, s_2, \ldots, s_N)$.
* Which process states occurred at the same time?
* A meaningful global state corresponds to initial prefixes of the individual process histories.

Global States and Consistent Cuts
A cut of the system’s execution is a subset of its global history that is a union of prefixes of process histories: $C = h_1^{c_1} \cup h_2^{c_2} \cup \ldots \cup h_N^{c_N}$.

Figure 11.9 Cuts
[Figure: events at processes p1 and p2 along a physical time axis, with messages m1 and m2, an inconsistent cut and a consistent cut.]
* $S = (s_1, s_2, \ldots, s_N)$ is the state corresponding to a cut C.
* The figure shows two cuts, C1 and C2.
* The events $e_1^0$ and $e_2^0$ are the last events before the cut C1, so $\langle e_1^0, e_2^0 \rangle$ is the frontier of C1. C1 is an inconsistent cut.
* $\langle e_1^2, e_2^2 \rangle$ is the frontier of C2. C2 is a consistent cut.

Consistent and inconsistent cuts
* C1: the receipt of message m1 at p2 is included, but the sending of m1 at p1 is not. C1 shows an effect without a cause, so it is inconsistent.
* C2: includes both the sending and the receipt of m1, and the sending of m2 but not its receipt. The receipt is an effect and the message takes time to arrive, so C2 is consistent.
* A cut C is consistent if, for each event it contains, it also contains all events that happened before that event: for all events $e \in C$, $f \rightarrow e \Rightarrow f \in C$.
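The consistency condition can be sketched as a check over messages. Because a cut is a union of prefixes, the order within each process is respected automatically, so it suffices to confirm that no included receive event lacks its send event. The function name and the message layout below are assumptions modelled on the two cuts described above, not taken verbatim from the figure.

```python
def is_consistent_cut(cut_sizes, messages):
    """Check that a cut contains no receive event without its send event.

    cut_sizes : dict mapping process name -> number of its events in the cut
    messages  : list of ((sender, send_index), (receiver, recv_index)),
                with 0-based event indices within each process
    """
    for (sender, s_idx), (receiver, r_idx) in messages:
        received = r_idx < cut_sizes[receiver]
        sent = s_idx < cut_sizes[sender]
        if received and not sent:
            return False          # an effect without its cause
    return True


# Assumed layout: m1 is sent at p1's event 1 and received at p2's event 0;
# m2 is sent at p2's event 2 and received at p1's event 3.
messages = [(("p1", 1), ("p2", 0)), (("p2", 2), ("p1", 3))]

c1 = {"p1": 1, "p2": 1}   # frontier <e1^0, e2^0>
c2 = {"p1": 3, "p2": 3}   # frontier <e1^2, e2^2>
print(is_consistent_cut(c1, messages))   # False
print(is_consistent_cut(c2, messages))   # True
```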

A consistent global state corresponds to a consistent cut
* The execution of a distributed system can be characterized as a series of transitions between global states: $S_0 \rightarrow S_1 \rightarrow S_2 \rightarrow S_3 \rightarrow \ldots$
* In each transition, either one event occurs at a single process or several concurrent events (not related by happened-before) occur.
* A run is a total ordering of all the events in a global history.
* A consistent run, or linearization, is an ordering of the events in a global history H that is consistent with the happened-before relation on H.
* If there is a linearization that passes through state S and then state S', then S' is said to be reachable from S.

Global State predicates
* Global state predicates are useful for evaluating a condition on the system. Examples: an object is garbage; a set of processes is deadlocked; the processes have terminated.
* A global state predicate is a function that maps from the set of global states of processes in the system to {True, False}.
* A predicate is stable if, once the system reaches a state in which the predicate is True, it remains True in all states reachable from that state.

Safety and liveness
* $S_0$ is the original state of the system.
* α is the property that the global system is deadlocked; α is an undesirable property.
* Safety with respect to α: α evaluates to False for all states S reachable from $S_0$.
* β is the property of reaching termination; β is a desirable property.
* Liveness with respect to β: β evaluates to True for some state $S_L$ reachable from $S_0$.

Figure 11.10 Chandy and Lamport’s ‘snapshot’ algorithm
The algorithm:
* determines global states of a distributed system by recording a consistent global state;
* records both process states and channel states for a set of processes pi (the snapshot).

The algorithm assumes that:
* neither channels nor processes fail;
* a message sent by one process is eventually received by another;
* channels are unidirectional and provide FIFO delivery;
* a path exists between any two processes;
* a global snapshot can be initiated by any process;
* the snapshot is non-interfering: processes continue to execute while it takes place.

Figure 11.10 Chandy and Lamport’s ‘snapshot’ algorithm
[Figure: process pi with its incoming and outgoing channels to process pj.]
* Each process records its own state and, for each incoming channel, the set of messages sent to it over that channel.
* The marker message has two roles: it prompts the receiver to save its state, and it indicates which messages to include in the channel state.

Figure 11.10 Chandy and Lamport’s ‘snapshot’ algorithm
Marker receiving rule for process pi
    On pi’s receipt of a marker message over channel c:
        if (pi has not yet recorded its state) it
            records its process state now;
            records the state of c as the empty set;
            turns on recording of messages arriving over other incoming channels;
        else
            pi records the state of c as the set of messages it has received over c since it saved its state.
        end if

Marker sending rule for process pi
    After pi has recorded its state, for each outgoing channel c:
        pi sends one marker message over c (before it sends any other message over c).
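The two marker rules can be exercised in a small single-threaded simulation. This is a sketch under simplifying assumptions, not the book's own code: process state is a single integer, application messages are integer transfers, channels are FIFO queues, and the class and method names are hypothetical.

```python
from collections import deque

MARKER = "MARKER"


class SnapshotSim:
    """Toy simulation of the marker rules over FIFO channels."""

    def __init__(self, initial):
        names = list(initial)
        self.state = dict(initial)       # current application state per process
        self.chan = {(a, b): deque() for a in names for b in names if a != b}
        self.recorded = {}               # process -> recorded process state
        self.chan_state = {}             # (src, dst) -> recorded channel messages
        self.recording = set()           # incoming channels still being recorded

    def send(self, src, dst, amount):
        """Application message: move `amount` from src towards dst."""
        self.state[src] -= amount
        self.chan[(src, dst)].append(amount)

    def _record(self, p):
        """Record p's state, start recording incoming channels, send markers."""
        self.recorded[p] = self.state[p]
        for (a, b), queue in self.chan.items():
            if a == p:
                queue.append(MARKER)             # marker sending rule
            elif b == p:
                self.recording.add((a, b))
                self.chan_state[(a, b)] = []

    def start(self, p):
        """Process p initiates the snapshot."""
        self._record(p)

    def deliver(self, src, dst):
        """Deliver the next message on channel src -> dst."""
        msg = self.chan[(src, dst)].popleft()
        if msg == MARKER:                        # marker receiving rule
            if dst not in self.recorded:
                self._record(dst)
            self.recording.discard((src, dst))   # state of this channel is final
        else:
            self.state[dst] += msg
            if (src, dst) in self.recording:
                self.chan_state[(src, dst)].append(msg)


sim = SnapshotSim({"p1": 100, "p2": 50})
sim.send("p1", "p2", 10)
sim.start("p1")               # p1 records 90 and sends a marker after the 10
sim.send("p2", "p1", 5)       # in transit when p2 later records its state
sim.deliver("p1", "p2")       # the 10 arrives at p2
sim.deliver("p1", "p2")       # marker: p2 records 55; channel p1->p2 is empty
sim.deliver("p2", "p1")       # the 5 arrives and is recorded as channel state
sim.deliver("p2", "p1")       # marker: channel p2->p1 is final
print(sim.recorded)           # {'p1': 90, 'p2': 55}
print(sim.chan_state[("p2", "p1")], sim.chan_state[("p1", "p2")])   # [5] []
```

In this run the recorded process states plus the recorded channel contents add up to 150, the same total as the initial states, which is what one would expect of a consistent snapshot.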

Figure 11.11 Two processes and their initial states

Figure 11.12 The execution of the processes in Figure 11.11

Termination
* A process that receives a marker message records its state within a finite time and sends marker messages over each of its outgoing channels within a finite time.
* If a path of communication channels exists from pi to pj, then pj records its state a finite time after pi does.
* The processes and channels form a strongly connected graph.
* Therefore all processes record their states and the states of their incoming channels a finite time after the snapshot is initiated.

Figure 11.13 Reachability between states in the snapshot algorithm
[Figure: the actual execution e0, e1, … leads from Sinit to Sfinal, with the points where recording begins and ends marked. The reordered execution passes through Ssnap, with pre-snap events e'0, …, e'R-1 before it and post-snap events e'R, e'R+1, … after it.]

Distributed Debugging
* In a distributed system it is not possible to observe the states of all processes simultaneously.
* Chandy and Lamport’s algorithm collects states in a distributed fashion.
* The challenge: trace information over time to establish whether a required safety condition has been met or violated.
* Assumptions: the algorithm is centralized. All processes send their states to a monitor process, which lies outside the system and whose main job is to observe the execution of the processes.
* Objective: determine whether a global state predicate φ was definitely True or whether it was possibly True (might have occurred).

Debugging: possibly φ and definitely φ
* Possibly: suppose one consistent global state S is extracted and φ(S) is found to be True. The actual execution may not have passed through S, so φ is only possibly True.
* possibly φ: there is a consistent global state S through which a linearization of H passes such that φ(S) is True.
* Definitely: applies to the actual execution, because all linearizations are considered.
* definitely φ: for all linearizations L of H, there is a consistent global state S through which L passes such that φ(S) is True.

Collecting the state
* Initially, each process pi (i = 1, 2, …, N) sends its initial state to the monitor.
* Subsequently, each process sends state messages to the monitor.
* The monitor maintains a queue Qi of state messages for each process pi.
* Monitoring may delay normal execution, but it does not otherwise interfere with it.
* Several optimizations can reduce the overhead, e.g. sending the state only when it changes and sending only the relevant portion of the state.

Observations
* The monitor assembles consistent global states in order to evaluate φ.
* Processes include their vector clock values with their state messages.
* The monitor can therefore distinguish between consistent and inconsistent global states.

Figure 11.14 Vector timestamps and variable values for the execution of Figure 11.9
[Figure: events at p1 with vector timestamps (1,0), (2,0), (3,0), (4,3) and events at p2 with vector timestamps (2,1), (2,2), (2,3) along a physical time axis; the variable values shown are 1, 100, 105, 95 and 90; cuts C1 and C2.]
* At time t = 0, x1 = x2 = 0. The requirement is $|x_1 - x_2| \le 50$.
* For the inconsistent cut C1, the states collected at the monitor show x1 = 1 and x2 = 100, so the monitor would find the constraint broken even though the system never passed through that state.
* For the consistent cut C2, the states collected at the monitor show x1 = 105 and x2 = 90.
* The vector timestamps of the state messages must therefore be examined.

Consistent Global States
* $S = (s_1, s_2, \ldots, s_N)$ is a global state as seen by the monitor.
* $V(s_i)$ is the vector timestamp of the state $s_i$ received from $p_i$.
* S is a consistent global state iff $V(s_i)[i] \ge V(s_j)[i]$ for $i, j = 1, 2, \ldots, N$.
* That is, the number of $p_i$’s events known to $p_j$ when it sent $s_j$ is no greater than the number of events that had occurred at $p_i$ when it sent $s_i$.
* The monitor process can therefore establish whether a global state is consistent by using vector timestamps.
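The consistency condition is a one-line check over the collected timestamps. In this sketch the function name is hypothetical, processes are indexed from 0, and the example timestamps are the ones that appear in Figure 11.14.

```python
def is_consistent_global_state(timestamps):
    """timestamps[i] is V(s_i), the vector timestamp of the state sent by p_i.

    The state is consistent iff V(s_i)[i] >= V(s_j)[i] for all i and j.
    """
    n = len(timestamps)
    return all(
        timestamps[i][i] >= timestamps[j][i]
        for i in range(n)
        for j in range(n)
    )


# p2's state (2,1) already knows of two events at p1, but p1's state (1,0)
# reflects only one of them, so this combination is inconsistent.
print(is_consistent_global_state([(1, 0), (2, 1)]))   # False
print(is_consistent_global_state([(3, 0), (2, 3)]))   # True
```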

Figure: The lattice of global states for the execution of Figure 11.14
* $S_{ij}$ is the global state after i events at process 1 and j events at process 2; $S_{ij}$ is in level i + j.
* Nodes are global states; edges are possible transitions between states.
* The lattice has levels 0 to 7 and contains the states $S_{00}$, $S_{10}$, $S_{20}$, $S_{21}$, $S_{30}$, $S_{31}$, $S_{32}$, $S_{22}$, $S_{23}$, $S_{33}$ and $S_{43}$.
* A linearization traverses the lattice from any global state to any global state reachable from it in the next level.

[Figure: events at three processes p1, p2 and p3 along a physical time axis, labelled with the vector timestamps (0,1,1), (0,1,0), (1,2,0), (1,3,0), (1,0,0), (0,1,2), (1,3,3), (2,1,2), (3,1,2), (3,4,3).]

Evaluating possibly φ and definitely φ
* In the lattice (levels 0 to 7), $S_{22}$ is reachable from $S_{20}$, but not from $S_{30}$.
* possibly φ: the monitor starts at the initial state and steps through all consistent states reachable from that point, evaluating φ at each stage; it stops when φ evaluates to True.
* definitely φ: the monitor tries to find a set of states through which all linearizations must pass and evaluates φ at each of them; if φ evaluates to True at each of these states, then definitely φ holds.
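A level-by-level traversal for possibly φ might look like the sketch below. It assumes the whole trace is already available to the monitor as per-process lists of (vector timestamp, value) pairs, beginning with the initial state; the function name is hypothetical, and the assignment of variable values to timestamps is an assumption modelled on Figure 11.14.

```python
def possibly(states, phi):
    """Return True if phi holds in some consistent global state of the lattice.

    states[i] : list of (vector_timestamp, value) for process i, starting
                with its initial state (all-zero timestamp)
    phi       : predicate over the tuple of per-process values
    """
    n = len(states)

    def consistent(idx):
        ts = [states[i][k][0] for i, k in enumerate(idx)]
        return all(ts[i][i] >= ts[j][i] for i in range(n) for j in range(n))

    level = {(0,) * n}                        # level 0: the initial global state
    while level:
        for idx in level:
            if phi(tuple(states[i][k][1] for i, k in enumerate(idx))):
                return True
        nxt = set()
        for idx in level:                     # advance one process by one event
            for i in range(n):
                if idx[i] + 1 < len(states[i]):
                    cand = idx[:i] + (idx[i] + 1,) + idx[i + 1:]
                    if consistent(cand):
                        nxt.add(cand)
        level = nxt
    return False


# Assumed trace: x1 takes values 1, 100, 105, 90 and x2 takes 100, 95, 90.
p1 = [((0, 0), 0), ((1, 0), 1), ((2, 0), 100), ((3, 0), 105), ((4, 3), 90)]
p2 = [((0, 0), 0), ((2, 1), 100), ((2, 2), 95), ((2, 3), 90)]

print(possibly([p1, p2], lambda v: abs(v[0] - v[1]) > 50))   # True
print(possibly([p1, p2], lambda v: v == (1, 100)))           # False
```

The second query returns False because x1 = 1 and x2 = 100 occur together only in an inconsistent state, which the traversal never visits.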

Figure 11.16 Algorithms to evaluate possibly φ and definitely φ

Figure 11.17 Evaluating definitely φ
[Figure: a lattice with levels 0 to 5 in which each state is marked F (φ(S) = False) or T (φ(S) = True).]
* The monitor processes states starting at the initial state $(s_1^0, s_2^0, \ldots, s_N^0)$.
* It maintains the set States: those states at the current level that may be reached on a linearization from the initial state by traversing only states for which φ evaluates to False.
* definitely φ cannot be asserted as long as such a linearization exists. If a level is reached at which no such linearization exists, then definitely φ holds.
* At level 4, the state to the right of F is not considered, since it can be reached only via a state for which φ evaluates to True.
* If φ evaluates to True at level 5, then definitely φ holds.
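The States-set procedure can be sketched for a finite, fully collected trace. One assumption goes beyond the slide: if the final global state is reached while φ is still False along some linearization, the sketch reports that definitely φ does not hold. The function name is hypothetical and the trace layout is modelled on Figure 11.14.

```python
def definitely(states, phi):
    """Return True if every linearization passes through a state where phi holds.

    states[i] : list of (vector_timestamp, value) for process i, starting
                with its initial state (all-zero timestamp)
    phi       : predicate over the tuple of per-process values
    """
    n = len(states)
    final = tuple(len(s) - 1 for s in states)

    def values(idx):
        return tuple(states[i][k][1] for i, k in enumerate(idx))

    def consistent(idx):
        ts = [states[i][k][0] for i, k in enumerate(idx)]
        return all(ts[i][i] >= ts[j][i] for i in range(n) for j in range(n))

    start = (0,) * n
    # States: reachable at this level through states where phi is False only.
    level = set() if phi(values(start)) else {start}
    while level:
        if final in level:
            return False          # a whole linearization avoided phi
        nxt = set()
        for idx in level:
            for i in range(n):
                if idx[i] + 1 < len(states[i]):
                    cand = idx[:i] + (idx[i] + 1,) + idx[i + 1:]
                    if consistent(cand) and not phi(values(cand)):
                        nxt.add(cand)
        level = nxt
    return True


# Assumed trace: x1 takes values 1, 100, 105, 90 and x2 takes 100, 95, 90.
p1 = [((0, 0), 0), ((1, 0), 1), ((2, 0), 100), ((3, 0), 105), ((4, 3), 90)]
p2 = [((0, 0), 0), ((2, 1), 100), ((2, 2), 95), ((2, 3), 90)]

print(definitely([p1, p2], lambda v: abs(v[0] - v[1]) > 50))   # True
print(definitely([p1, p2], lambda v: v == (105, 95)))          # False
```

The second predicate holds in one consistent state only, and some linearizations bypass that state, so it is possibly True but not definitely True.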

Cost
* With k events per process and N processes, the monitor makes $O(k^N)$ comparisons in theory.
* Not all events are significant.
* There are techniques to reduce the number of states that must be examined.

Exercises
* Exercise problems 11.9 to 11.16.
* Exercise problems 11.1 to 11.8 are also important, though trivial.

[Figure: events a and b at p1, c and d at p2, e and f at p3 along a physical time axis, with messages m1 and m2 and vector timestamps (1,0,0), (2,0,0), (2,1,0), (2,2,0), (0,0,1), (2,2,2).]
p2 holds resource r2 and p4 hosts resource r4.