Distributed Systems Fall 2010 Logical time, global states, and debugging.


2 Distributed Systems Fall 2010 Logical time, global states, and debugging

3 Fall 2010 5DV020 Outline: Logical time –Lamport timestamps –Vector clocks. Global states –Snapshot algorithm. Distributed debugging. Summary.

4 Logical time: No global time! What do we do? What can we trust? –Causal relationships still hold: an effect cannot be observed before its cause. Processes maintain an ordering of their own events: e →_i e'. Send events must come before the corresponding Receive events.

5 Happened-before relation “→”: HB1: If there exists a process p_i such that e →_i e', then e → e'. HB2: For any message m, send(m) → receive(m). HB3: If e, e', and e'' are events such that e → e' and e' → e'', then e → e''.

6 Example (HB-relation): HB1: a → b, c → d, e → f. HB2: b → c, d → f. HB3: a → b → c → d → f. There is no ordering for, e.g., b and e! They are concurrent, denoted b || e.
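The example above can be reproduced mechanically. The following is a minimal sketch, assuming the direct edges listed on the slide are the only ones in the lost figure; the function names `happened_before` and `concurrent` are illustrative and not from the course material.

```python
from itertools import product

# Direct edges taken from the slide's example:
# HB1 (order within a process) and HB2 (message send -> receive).
process_order = [("a", "b"), ("c", "d"), ("e", "f")]
messages = [("b", "c"), ("d", "f")]

def happened_before(edges):
    """Return the transitive closure (HB3) of the direct edges as a set of pairs."""
    closure = set(edges)
    events = sorted({x for edge in edges for x in edge})
    # Warshall-style closure: product() varies the first element slowest,
    # so the intermediate event k is the outermost loop, as required.
    for k, i, j in product(events, repeat=3):
        if (i, k) in closure and (k, j) in closure:
            closure.add((i, j))
    return closure

def concurrent(x, y, hb):
    """Two distinct events are concurrent when neither happened before the other."""
    return x != y and (x, y) not in hb and (y, x) not in hb

hb = happened_before(process_order + messages)
print(("a", "f") in hb)           # True: a -> b -> c -> d -> f
print(concurrent("b", "e", hb))   # True: b || e
```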

7 HB-relation pitfall: Messages are not always sent as the result of a causal relationship (e.g. scheduled messages), so e → e' only indicates potential causality.

8 Lamport logical clocks: Monotonically increasing. Each process has a counter which is increased when an event occurs. Counter is sent with messages –Recipient sets own clock to max(own, received) and then increases own counter.
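The update rules above fit in a few lines. This is a minimal sketch, not the course's reference code; the class name `LamportClock` and its method names are chosen here for illustration.

```python
class LamportClock:
    """One counter per process, following the rules on the slide."""

    def __init__(self):
        self.time = 0

    def tick(self):
        """Local event or send: increase the counter and return the timestamp."""
        self.time += 1
        return self.time

    def receive(self, received):
        """Receive: set own clock to max(own, received), then increase it."""
        self.time = max(self.time, received) + 1
        return self.time

p1, p2 = LamportClock(), LamportClock()
a = p1.tick()        # local event at p1, timestamp 1
b = p1.tick()        # p1 sends a message stamped 2
c = p2.receive(b)    # p2 receives it: max(0, 2) + 1 = 3
print(a, b, c)       # 1 2 3
```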

9 Example (Lamport timestamps): Evident that e → e' ⇒ L(e) < L(e'). But the converse does not hold! –E.g. L(b) > L(e), but b || e.

10 Vector clocks: Keep track of known events at all processes (a vector of timestamps). Send vector clock with message –Receiver merges clocks by setting own values to the maximum of own values and received ones.
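A minimal sketch of the merge rule follows. The slide only mentions the merge; incrementing the process's own entry before each event (including the receive itself) is the usual textbook rule and is an assumption here, as are the class and method names.

```python
class VectorClock:
    """Vector clock for process number i out of n processes."""

    def __init__(self, n, i):
        self.i = i
        self.v = [0] * n

    def tick(self):
        """Local event or send: increase own entry and return a copy to attach."""
        self.v[self.i] += 1
        return list(self.v)

    def receive(self, received):
        """Merge by component-wise maximum, then count the receive as an event."""
        self.v = [max(own, other) for own, other in zip(self.v, received)]
        return self.tick()

p0, p1 = VectorClock(2, 0), VectorClock(2, 1)
p1.tick()             # local event at p1: [0, 1]
m = p0.tick()         # p0 sends a message carrying [1, 0]
r = p1.receive(m)     # merge gives [1, 1], the receive event makes it [1, 2]
print(m, r)           # [1, 0] [1, 2]
```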

11 Example (Vector clocks): Vector clocks can be ordered –V = V' if all values are the same –V ≤ V' if all values in V are ≤ those in V' –V < V' if V ≤ V' and V and V' are not equal.

12 Vector clocks: e → e' ⇒ V(e) < V(e') and V(e) < V(e') ⇒ e → e'. Concurrent events (b || e): –Neither V(b) < V(e) nor V(e) < V(b). Nice properties, but expensive in terms of memory and bandwidth –O(N) in both cases.
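The ordering and the concurrency test can be written directly from these definitions. This is a small sketch; the helper names `vc_leq`, `vc_less`, and `vc_concurrent` and the sample timestamps are made up for illustration and are not taken from the lost figure.

```python
def vc_leq(v, w):
    """V <= V': every value in V is <= the corresponding value in V'."""
    return all(a <= b for a, b in zip(v, w))

def vc_less(v, w):
    """V < V': V <= V' and the two vectors are not equal."""
    return vc_leq(v, w) and v != w

def vc_concurrent(v, w):
    """Timestamps of two distinct events that are ordered in neither direction."""
    return not vc_less(v, w) and not vc_less(w, v)

print(vc_less([1, 0, 0], [2, 1, 0]))        # True: the first event happened before the second
print(vc_concurrent([2, 0, 0], [0, 0, 1]))  # True: neither vector is smaller
```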

13 Global states: We often need to know the state of the entire distributed system –Distributed garbage collection –Distributed deadlocks –Distributed termination detection. Process states alone are not enough! –Messages currently in the channels are also part of the global state.

14 Global states: Simple with global time! –Just issue “report state at time X” –…we do not have this luxury. Each process maintains its own history –We could create a global history by just taking the union of all local histories. We only want to consider global states S that may have occurred at some point in time.

15 Cuts: Denote the local prefix of the history of process p_i (its first k events) by h_i^k. A cut is a union of prefixes of process histories: C = h_1^{c_1} ∪ h_2^{c_2} ∪ … ∪ h_N^{c_N}. The state of p_i in the global state S corresponding to C is the one it had immediately after the last of its events that is in the cut.

16 Consistent cuts and global states: According to the definition, we can make any cut we want, including ones that make no sense! We want to consider only consistent cuts –For all events e ∈ C, f → e ⇒ f ∈ C. Consistent global states correspond to consistent cuts –We only ever move between consistent global states during execution.
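Because a cut is a union of history prefixes, it is already closed under each process's local order, so checking consistency reduces to checking that every received message in the cut was also sent in the cut. A minimal sketch, reusing the events a to f from the earlier happened-before example and assuming they are split over three processes named `p1`, `p2`, `p3` (the names and the dictionary layout are hypothetical):

```python
# Local histories and messages (send event, receive event) for the running example.
histories = {"p1": ["a", "b"], "p2": ["c", "d"], "p3": ["e", "f"]}
messages = [("b", "c"), ("d", "f")]

def events_in_cut(histories, cut):
    """cut maps each process to the length of its history prefix."""
    return {e for p, k in cut.items() for e in histories[p][:k]}

def is_consistent(histories, messages, cut):
    """Consistent iff every receive event in the cut has its send event in the cut."""
    included = events_in_cut(histories, cut)
    return all(send in included for send, recv in messages if recv in included)

print(is_consistent(histories, messages, {"p1": 2, "p2": 1, "p3": 0}))  # True: {a, b, c}
print(is_consistent(histories, messages, {"p1": 1, "p2": 1, "p3": 0}))  # False: c without b
```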

17 Linearizations and runs: Both are total orderings of all events in the global history. A run is only required to be consistent with the ordering of each process' own local history. A linearization is consistent with the (global) happened-before relation.

18 Linearizations and runs: Runs do not have to pass through consistent global states, but all linearizations do –S' is reachable from S if ∃ a linearization that passes through S and then S'.
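The difference between a run and a linearization can be checked on a concrete total order. A small sketch, again using the events a to f with local orders a → b, c → d, e → f and messages b → c, d → f; the helper names are illustrative only.

```python
process_order = [("a", "b"), ("c", "d"), ("e", "f")]
messages = [("b", "c"), ("d", "f")]

def respects(order, pairs):
    """True if x appears before y in the total order for every pair (x, y)."""
    position = {event: index for index, event in enumerate(order)}
    return all(position[x] < position[y] for x, y in pairs)

def is_run(order):
    """A run only has to respect each process's local order."""
    return respects(order, process_order)

def is_linearization(order):
    """A linearization must also respect send-before-receive, hence all of happened-before."""
    return respects(order, process_order + messages)

order = ["a", "c", "b", "d", "e", "f"]   # c (a receive) is placed before b (its send)
print(is_run(order), is_linearization(order))   # True False
```

Respecting the direct edges is enough here: in a total order, respecting every direct edge implies respecting their transitive closure.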

19 Snapshot algorithm: Chandy and Lamport. Distributed algorithm. Constructs a snapshot of the global state (both process and channel state) –Ensures that the global state is consistent –Makes no guarantee that the system was actually in the recorded state!

20 Snapshot algorithm – assumptions: Neither channels nor processes fail. Every sent message is eventually delivered intact. Strongly connected graph of processes. Channels are unidirectional and provide FIFO message delivery.

21 Snapshot algorithm – notes: Any process may initiate a global snapshot at any time. Processes may continue execution while the snapshot is in progress (important).

22 Snapshot algorithm: When a marker is received on channel c –If own process state is not yet recorded: record own process state; send marker messages on all outbound channels before sending anything else; start recording the messages coming in on all other incoming channels (the recorded state of c is the empty set).

23 Snapshot algorithm: –If own process state is already recorded: record the state of c as the messages that have been received on it since recording own state.
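The marker rules can be exercised in a small single-threaded simulation. This is a sketch under stated assumptions, not the algorithm as presented in the book: the two-process money-transfer scenario, the `Process` class, and the helpers `connect`, `transfer`, `record_state`, and `deliver` are all invented for illustration, and FIFO channels are modelled as one queue per sender.

```python
from collections import deque

MARKER = "MARKER"

class Process:
    def __init__(self, name, balance):
        self.name = name
        self.balance = balance         # application state
        self.inbox = {}                # sender name -> FIFO channel from that sender
        self.peers = []                # processes reachable over outbound channels
        self.recorded = False
        self.recorded_balance = None
        self.channel_state = {}        # sender name -> messages recorded as in transit
        self.recording = set()         # incoming channels still being recorded

def connect(p, q):
    """Create one unidirectional FIFO channel in each direction."""
    p.peers.append(q); q.inbox[p.name] = deque()
    q.peers.append(p); p.inbox[q.name] = deque()

def transfer(src, dst, amount):
    """Application message: move money from src to dst."""
    src.balance -= amount
    dst.inbox[src.name].append(amount)

def record_state(p):
    """Record own state, send markers on all outbound channels, start recording."""
    p.recorded = True
    p.recorded_balance = p.balance
    p.channel_state = {sender: [] for sender in p.inbox}
    p.recording = set(p.inbox)
    for q in p.peers:
        q.inbox[p.name].append(MARKER)

def deliver(p, sender):
    """Deliver the next message on the channel sender -> p."""
    msg = p.inbox[sender].popleft()
    if msg == MARKER:
        if not p.recorded:
            record_state(p)            # first marker: this channel's state stays empty
        p.recording.discard(sender)    # the state of this channel is now final
    else:
        p.balance += msg
        if sender in p.recording:
            p.channel_state[sender].append(msg)

p, q = Process("p", 10), Process("q", 20)
connect(p, q)
transfer(p, q, 5)      # in transit on p -> q
transfer(q, p, 3)      # in transit on q -> p
record_state(p)        # p initiates the snapshot
deliver(q, "p")        # q receives the 5
deliver(q, "p")        # q receives the marker and records its state
deliver(p, "q")        # p receives the 3 while recording channel q -> p
deliver(p, "q")        # p receives q's marker; the snapshot is complete
total = (p.recorded_balance + q.recorded_balance
         + sum(p.channel_state["q"]) + sum(q.channel_state["p"]))
print(p.recorded_balance, q.recorded_balance, p.channel_state["q"], total)  # 5 22 [3] 30
```

The recorded total of 30 equals the initial 10 + 20, which is what consistency of the snapshot means in this scenario: no money is lost or counted twice, even though the two process states were recorded at different moments.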

24 Snapshot example: [Figure: processes P1, P2, and P3] Note that […] is neither part of the state of P3 nor of P2 at this point!

25 Snapshot example contd.: P2 received marker on P1 -> P2 after […], so it is part of recorded state. Same for P3 and […]. However, P3 sent out […] before the marker, and P2’s state snapshot does not include it. Algorithm concludes that […] was in transit between P3 and P2.

26 Snapshot algorithm: Study the more complicated example in the book, too! The example takes a few minutes to walk through and understand –If you want guidance, we can do it after class.

27 Distributed debugging: Different from regular debugging! Let the system run, and establish post hoc (after the fact) whether the system worked as intended. Rather than the Snapshot algorithm, we shall study a centralized algorithm –Monitoring server is not considered part of the group of processes in the system.

28 Distributed debugging: Stable and non-stable predicates. Safety with respect to α –α evaluates to False for all states S reachable from the initial state. Liveness with respect to β –For any linearization L starting in the initial state, β evaluates to True for some state reachable from the initial state.

29 Possibly Φ: there is a consistent global state S through which a linearization of H passes such that Φ(S) is True –“In at least one possible execution, Φ is true at some point”

30 Definitely Φ: for all linearizations L of H, there is a consistent global state S through which L passes such that Φ(S) is True –“No matter which way the system is executed, Φ is true at some point”

31 Possibly and Definitely Φ: Processes send their state to the monitor as soon as it changes (including own vector clock). A global state is consistent if, for every pair of processes p_x and p_y: –the number of p_x's events known at p_y when it sent s_y is no more than the number of events that had occurred at p_x when it sent s_x, i.e. V(s_x)[x] ≥ V(s_y)[x]. “Events that happened-before are already known and part of the global state”
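The consistency condition translates into a pairwise comparison of the vector clocks the monitor has received. A minimal sketch, assuming `clocks[x]` is the vector timestamp attached to the state message from process x (the function name is illustrative):

```python
def is_consistent_global_state(clocks):
    """True if no process knows of more events at p_x than p_x itself reports."""
    n = len(clocks)
    return all(clocks[x][x] >= clocks[y][x] for x in range(n) for y in range(n))

print(is_consistent_global_state([[2, 0], [1, 1]]))  # True
print(is_consistent_global_state([[1, 0], [2, 1]]))  # False: p_1 has seen 2 events of p_0
```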

32 Possibly and Definitely Φ: The vector clocks allow us to construct a lattice (crystalline structure) of global states for an execution –Determining Possibly and Definitely Φ is simply a matter of traversing this structure. Figures 11.14 and 11.15 from the book, © Pearson Education 2005.
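One way to traverse the lattice is sketched below. It is an illustration under assumptions rather than the book's pseudocode: each process's history is a list of (vector clock, state value) pairs starting with its initial state, a global state is a tuple of indices into those lists, the recorded history is finite, and all function names are made up.

```python
def consistent(histories, idx):
    """No process has seen more of p_x's events than p_x itself reports."""
    clocks = [histories[p][k][0] for p, k in enumerate(idx)]
    n = len(clocks)
    return all(clocks[x][x] >= clocks[y][x] for x in range(n) for y in range(n))

def successors(histories, idx):
    """Consistent global states reached by advancing exactly one process by one state."""
    for p in range(len(idx)):
        if idx[p] + 1 < len(histories[p]):
            nxt = idx[:p] + (idx[p] + 1,) + idx[p + 1:]
            if consistent(histories, nxt):
                yield nxt

def values(histories, idx):
    return tuple(histories[p][k][1] for p, k in enumerate(idx))

def possibly(histories, phi):
    """Some consistent global state reachable from the initial one satisfies phi."""
    start = (0,) * len(histories)
    seen, frontier = {start}, [start]
    while frontier:
        s = frontier.pop()
        if phi(values(histories, s)):
            return True
        for t in successors(histories, s):
            if t not in seen:
                seen.add(t)
                frontier.append(t)
    return False

def definitely(histories, phi):
    """Every linearization passes through some state that satisfies phi."""
    final = tuple(len(h) - 1 for h in histories)
    states = {(0,) * len(histories)}
    while states:
        # Keep only states reached so far without phi ever having held.
        states = {s for s in states if not phi(values(histories, s))}
        if final in states:
            return False   # a linearization reached the end and phi never held
        states = {t for s in states for t in successors(histories, s)}
    return True

# Two processes, each changing its value from 0 to 100, with no messages exchanged.
no_msgs = [[([0, 0], 0), ([1, 0], 100)], [([0, 0], 0), ([0, 1], 100)]]
# Same, but the second process changes its value on receiving a message from the first.
with_msg = [[([0, 0], 0), ([1, 0], 100)], [([0, 0], 0), ([1, 1], 100)]]
phi = lambda s: s[0] == 100 and s[1] == 0

print(possibly(no_msgs, phi), definitely(no_msgs, phi))    # True False
print(possibly(with_msg, phi), definitely(with_msg, phi))  # True True
```

In the first history the two updates are concurrent, so one linearization avoids the state (100, 0); in the second the message forces every linearization through it.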

33 Summary: Logical clocks are based on events in processes and the inter-event relationships –Happened-before relation. Logical clocks do not capture everything –Out-of-band communication (phone calls, etc.).

34 Summary: Lamport's logical clocks are simple, but give limited reasoning power. Vector clocks are more powerful, but also more costly. Global state encompasses both process and channel state –Snapshot algorithm. Distributed debugging is quite different from regular debugging –Possibly Φ and Definitely Φ.

35 Next time: Hands-on exercise with Java RMI. Prepare by doing as much of it as possible before attending. Excellent time to ask questions! Note that the deadline for submitting the results is later that same day.

36 Next lecture: Coordination and Agreement –Distributed mutual exclusion –Election algorithms –Byzantine Generals problem –Paxos algorithm. Multicast and message ordering –Important for the assignment!

