Distributed Systems Lecture 6 Global states and snapshots 1.

1 Distributed Systems Lecture 6 Global states and snapshots 1

2 Previous lecture Time and synchronization – Motivation – Algorithms

3 Motivation Determine whether or not a particular property of a distributed system is true as it executes. Use logical time to construct a global view of the system state. The ability to obtain a global photograph of the system is important. Example: – Multiple servers (for a service/application) handling multiple concurrent events and interacting with each other. Note: http://www.cs.usfca.edu/~srollins/courses/cs682-s08/web/notes/timeandstates.html

4 Examples Distributed garbage collection – Are there references to an object anywhere in the system? References may exist at the local process, at another process, or in the communication channel.

5 Examples Distributed deadlock detection – Is there a cycle in the graph of the waits-for relationship between processes?
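The deadlock question reduces to cycle detection once a waits-for graph has been assembled. The sketch below is a minimal illustration, assuming the graph has already been collected from a consistent global state (a graph pieced together from inconsistent local views can report deadlocks that never existed). The function name `find_wait_cycle` and the dictionary representation are hypothetical choices for this example.

```python
from typing import Dict, List, Optional

def find_wait_cycle(waits_for: Dict[str, List[str]]) -> Optional[List[str]]:
    """Return one cycle in the waits-for graph as a list of processes, or None."""
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current DFS path, finished
    colour = {p: WHITE for p in waits_for}
    path: List[str] = []

    def visit(p: str) -> Optional[List[str]]:
        colour[p] = GREY
        path.append(p)
        for q in waits_for.get(p, []):
            state = colour.get(q, WHITE)
            if state == GREY:
                # q is already on the current path, so the path from q to p closes a cycle
                return path[path.index(q):] + [q]
            if state == WHITE:
                cycle = visit(q)
                if cycle:
                    return cycle
        path.pop()
        colour[p] = BLACK
        return None

    for p in list(waits_for):
        if colour[p] == WHITE:
            cycle = visit(p)
            if cycle:
                return cycle
    return None

# p1 waits for p2, p2 waits for p3, p3 waits for p1: a deadlock
print(find_wait_cycle({"p1": ["p2"], "p2": ["p3"], "p3": ["p1"]}))
```

A depth-first search is used here because it reports the processes on the cycle, which is usually what a deadlock resolver needs in order to pick a victim.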

6 Examples Distributed termination detection – Has a distributed algorithm terminated?

7 Examples Distributed debugging Example: – Given two processes $p_1$ and $p_2$ with variables $x_1$ and $x_2$ respectively, can we determine whether the condition $|x_1 - x_2| > \delta$ is ever true?

8 Global Predicate Evaluation A global state predicate is a function that maps the set of global states of the processes in the system ρ to {True, False}
– Safety: a given undesirable property (e.g., deadlock) never occurs, i.e., the predicate describing it evaluates to False in every state the system reaches.
– Liveness: a given desirable property (e.g., termination) eventually occurs, i.e., the predicate describing it eventually evaluates to True.
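As a rough illustration of these definitions, the sketch below treats a predicate as a plain function from a global state to a boolean and checks safety and liveness over one finite, observed sequence of global states. This is a simplification: the definitions quantify over the states the system can reach, not over a single recorded run. The type aliases, function names and the sample run are assumptions made for the example.

```python
from typing import Callable, Dict, Iterable

GlobalState = Dict[str, int]               # hypothetical: process name -> value of x_i
Predicate = Callable[[GlobalState], bool]  # maps a global state to True or False

def safety_holds(bad: Predicate, states: Iterable[GlobalState]) -> bool:
    """Safety: the undesirable predicate is False in every observed state."""
    return not any(bad(s) for s in states)

def liveness_holds(good: Predicate, states: Iterable[GlobalState]) -> bool:
    """Liveness (on a finite run): the desirable predicate is True in some state."""
    return any(good(s) for s in states)

DELTA = 5

def too_far_apart(s: GlobalState) -> bool:
    # The debugging condition |x_1 - x_2| > delta
    return abs(s["p1"] - s["p2"]) > DELTA

run = [{"p1": 0, "p2": 0}, {"p1": 3, "p2": 0}, {"p1": 3, "p2": 9}]
print(safety_holds(too_far_apart, run))             # False: the last state has |3 - 9| = 6 > 5
print(liveness_holds(lambda s: s["p1"] >= 3, run))  # True: reached in the second state
```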

9 Why?
– Distributed garbage collection. Example: multiple processes sharing and referencing objects
– Distributed deadlock detection and termination detection. Example: database transactions
– Global states are most useful for detecting stable predicates: once true, a stable predicate stays true. Example: once a deadlock, always a deadlock
What?
– Global state = states of all processes + states of all communication channels
– Capture the instantaneous state of each process
– Capture the instantaneous state of each communication channel, i.e., the messages in transit on the channels
How?
– Algorithms for finding the global state

10 Initial thought Synchronize the clocks of all processes and ask all processes to record their states at a known time t. Problems? – Time synchronization is possible only approximately, which is a problem for many sensitive applications (example: distributed banking applications) – It does not record the state of messages in the channels. However, synchronization is not required: causality is enough!

11 Cuts Physical time cannot be perfectly synchronized in a distributed system, so it is not possible to gather the global state of the system at a particular time. Cuts provide the ability to assemble a meaningful global state from local states recorded at different times.

12 Definitions
– ρ is a system of $N$ processes $p_i$ ($i = 1, 2, \ldots, N$)
– $\mathrm{history}(p_i) = h_i = \langle e_i^0, e_i^1, e_i^2, \ldots \rangle$
– $h_i^k = \langle e_i^0, e_i^1, \ldots, e_i^k \rangle$ is a finite prefix of the process's history
– $s_i^k$ is the state of the process $p_i$ immediately before the $k$th event occurs
– All processes record the sending and receiving of messages. If a process $p_i$ records the sending of message $m$ to process $p_j$ and $p_j$ has not recorded receipt of the message, then $m$ is part of the state of the channel between $p_i$ and $p_j$
– A global history of ρ is the union of the individual process histories: $H = h_1 \cup h_2 \cup \ldots \cup h_N$
– A global state can be formed by taking the set of states of the individual processes: $S = (s_1, s_2, \ldots, s_N)$
– A cut of the system's execution is a subset of its global history that is a union of prefixes of process histories
– The frontier of the cut is the last state of each process in the cut
– A cut $C$ is consistent if, for all events $e$ and $e'$: $(e \in C \text{ and } e' \rightarrow e) \Rightarrow e' \in C$
– A consistent global state is one that corresponds to a consistent cut
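The consistency condition can be checked mechanically. Because a cut is a union of prefixes, it is already closed under the ordering of events within each process, so it is enough to verify that every message whose receive event lies inside the cut also has its send event inside the cut. The sketch below assumes a hypothetical representation: a cut is a dictionary giving, per process, how many of its events are included, and events are numbered from 1 within each process.

```python
from typing import Dict, List, Tuple

Event = Tuple[str, int]  # (process name, position of the event in that process's history, from 1)

def is_consistent(cut: Dict[str, int], messages: List[Tuple[Event, Event]]) -> bool:
    """cut[p] = c means the cut contains the first c events of process p.

    messages is a list of (send event, receive event) pairs.
    """
    def in_cut(event: Event) -> bool:
        proc, k = event
        return k <= cut.get(proc, 0)

    # A receive inside the cut whose send is outside the cut breaks consistency.
    return all(in_cut(send) for send, recv in messages if in_cut(recv))

# p1's second event sends a message that p2 receives as its first event.
msgs = [(("p1", 2), ("p2", 1))]
print(is_consistent({"p1": 1, "p2": 1}, msgs))  # False: effect without its cause
print(is_consistent({"p1": 2, "p2": 1}, msgs))  # True: send and receive both included
print(is_consistent({"p1": 2, "p2": 0}, msgs))  # True: message is in the channel
```

The third case shows why channel state belongs in a global state: the cut is consistent, but the message has been sent and not yet received.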

13 Consistent vs. inconsistent cuts

14 More examples

15 Obtaining consistent cuts Working example – Distributed debugging
Scenario
– We have several processes, each with a variable $x_i$
– The safety condition required in this example is $|x_i - x_j| \le \delta$ ($i, j = 1, 2, \ldots, N$)
Algorithm
– Determine post hoc whether the safety condition was ever violated
– $p_1, p_2, \ldots, p_N$ send their states to a passive monitoring process $p_0$; $p_0$ is not part of the system
– Based on the states collected, $p_0$ can evaluate the safety condition

16 Collecting the state
– Processes send state messages to the monitoring process: their initial state, and an update whenever relevant state changes (in this case the variable $x_i$)
– Only the value of $x_i$ and a vector timestamp are sent
– The monitoring process maintains a queue V for each process, ordered by timestamp, containing that process's state messages
– $S$ is a consistent state iff, for every message $m_{ij}$ sent from $p_i$ to $p_j$:
– $\mathrm{send}(m_{ij}) \in s_i \Rightarrow (m_{ij} \in \text{channel state}) \oplus (\mathrm{rec}(m_{ij}) \in s_j)$
– $\mathrm{send}(m_{ij}) \notin s_i \Rightarrow (m_{ij} \notin \text{channel state}) \wedge (\mathrm{rec}(m_{ij}) \notin s_j)$
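One way a monitor could use the vector timestamps is sketched below. It relies on the standard vector-clock test for a consistent global state: writing $V(s_i)$ for the timestamp attached to state $s_i$, the state $S = (s_1, \ldots, s_N)$ is consistent iff $V(s_i)[i] \ge V(s_j)[i]$ for all $i, j$. The monitor then looks for consistent combinations of queued states in which the safety condition fails. The function names, the brute-force enumeration and the sample data are assumptions for illustration; the sample also records a state message for every event, which keeps the example simple but is more than the slide requires.

```python
from itertools import product
from typing import Iterator, List, Sequence, Tuple

StateMsg = Tuple[int, Tuple[int, ...]]  # (value of x_i, vector timestamp)

def is_consistent_state(clocks: Sequence[Sequence[int]]) -> bool:
    """True iff no process has seen more of p_i's events than p_i's own recorded state."""
    n = len(clocks)
    return all(clocks[i][i] >= clocks[j][i] for i in range(n) for j in range(n))

def violations(queues: List[List[StateMsg]], delta: int) -> Iterator[List[int]]:
    """Yield the x values of consistent global states where some |x_i - x_j| > delta."""
    for combo in product(*queues):  # one queued state per process
        values = [x for x, _ in combo]
        clocks = [v for _, v in combo]
        if is_consistent_state(clocks) and max(values) - min(values) > delta:
            yield values

# p1 sets x_1 = 10 (event 1), then sends a message (event 2).
# p2 receives that message and sets x_2 = 10 (its event 1, timestamp (2, 1)).
queues = [
    [(0, (0, 0)), (10, (1, 0)), (10, (2, 0))],
    [(0, (0, 0)), (10, (2, 1))],
]
print(list(violations(queues, delta=5)))
```

Enumerating every combination is exponential in the number of processes; it is used here only because it makes the consistency test easy to see.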

17 Collecting the state

18 Snapshot algorithm: Chandy-Lamport algorithm
Assumptions
– There are no failures, and all messages arrive intact and only once
– The communication channels are unidirectional and FIFO ordered
– There is a communication path between any two processes in the system
– Any process may initiate the snapshot algorithm
– The snapshot algorithm does not interfere with the normal execution of the processes
– Each process in the system records its local state and the state of its incoming channels

19 Algorithm
1. The observer process (the process taking a snapshot):
– Saves its own local state
– Sends a snapshot request message bearing a snapshot token to all other processes
2. A process receiving the snapshot token for the first time, on any message:
– Sends the observer process its own saved state
– Attaches the snapshot token to all subsequent messages (to help propagate the snapshot token)
3. Should a process that has already received the snapshot token receive a message that does not bear the snapshot token, this process will forward that message to the observer process.
– Such a message was sent before the snapshot cut-off: since it does not bear a snapshot token, it must have been sent before the snapshot token was sent out. It therefore needs to be included in the snapshot
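The sketch below simulates a snapshot in a small banking system. It uses the marker-based formulation of Chandy-Lamport, which differs in wording from the observer-and-token steps above: a process records its own state when it first sees a marker, sends a marker on every outgoing channel, and records each incoming channel until a marker arrives on it. Each process keeps its part of the snapshot locally instead of forwarding it to an observer. The class, the helper functions and the transfer scenario are all hypothetical, and delivery is simulated in a single thread over FIFO queues.

```python
from collections import deque

MARKER = "MARKER"  # hypothetical marker value; plays the role of the snapshot token

class Process:
    def __init__(self, name, balance):
        self.name, self.balance = name, balance
        self.out = {}           # peer name -> FIFO channel from this process to the peer
        self.inc = {}           # peer name -> FIFO channel from the peer to this process
        self.saved = None       # recorded local state (None until the snapshot reaches us)
        self.chan = {}          # peer name -> messages recorded as in transit
        self.recording = set()  # incoming channels whose marker has not arrived yet

    def send(self, peer, amount):
        self.balance -= amount
        self.out[peer].append(amount)

    def start_snapshot(self):
        self.saved = self.balance                    # record local state
        self.chan = {peer: [] for peer in self.inc}
        self.recording = set(self.inc)               # start recording every incoming channel
        for channel in self.out.values():
            channel.append(MARKER)                   # marker on every outgoing channel

    def receive(self, peer):
        msg = self.inc[peer].popleft()
        if msg == MARKER:
            if self.saved is None:
                self.start_snapshot()
            self.recording.discard(peer)             # this channel's state is now complete
        else:
            self.balance += msg
            if peer in self.recording:
                self.chan[peer].append(msg)          # sent before the cut, received after it

def connect(procs):
    for a in procs:
        for b in procs:
            if a is not b:
                a.out[b.name] = b.inc[a.name] = deque()

def drain(procs):
    progress = True
    while progress:  # deliver messages until every channel is empty
        progress = False
        for p in procs:
            for peer, channel in p.inc.items():
                if channel:
                    p.receive(peer)
                    progress = True

def snapshot_total(procs):
    return sum(p.saved + sum(sum(m) for m in p.chan.values()) for p in procs)

def demo():
    p1, p2, p3 = procs = [Process(n, 100) for n in ("p1", "p2", "p3")]
    connect(procs)
    p1.send("p2", 10)
    p2.send("p3", 20)
    p1.start_snapshot()  # p1 initiates while both transfers are still in transit
    p3.send("p1", 5)
    drain(procs)
    return procs

print(snapshot_total(demo()))  # recorded balances plus recorded in-transit money
```

In this scenario the recorded process states alone do not add up to the 300 units in the system; the recorded channel contents make up the difference, which is the point of capturing channel state.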

20 Next lecture Multicast communication

