Checkpointing 2.0 Compiler-Assisted Checkpointing Uncoordinated Checkpointing.

1 Checkpointing 2.0 Compiler-Assisted Checkpointing Uncoordinated Checkpointing

2 Compiler-Assisted Checkpointing
Compiler-Assisted Checkpointing (1994): Beck, Plank, Kingsley (UTK)
Compiler-Assisted Memory Exclusion for Fast Checkpointing (1995): Beck, Plank, Kingsley (UTK)

3 Motivation
We saw that memory exclusion can dramatically reduce the size (and overhead) of a checkpoint file.
Choosing what to exclude by hand can be time consuming, and wrong decisions can cause a program to be incorrect on recovery.
Use a compiler to determine what memory can be excluded, automating the process and ensuring its correctness.

4 Compiler Directives
The programmer adds the following directives to the code:
CHECKPOINT_HERE: translated directly to a checkpoint_here() call
EXCLUDE_HERE: the compiler inserts exclude_bytes() and include_bytes() calls at that location

5 Directive Placement
Poor placement of directives might lead to inefficient checkpointing.
However, the program will still checkpoint and recover properly.

6 Directive Placement
Why not place EXCLUDE_HERE directly before all CHECKPOINT_HERE directives?
for(…) { EXC… CHE… }
EXC… for(…) { CHE… }

7 Overview of Technique
Perform data flow analysis of the program to determine which variables are clean or dead at each EXCLUDE_HERE statement.
Insert the appropriate exclude_bytes() calls at each location.

8 Build a Control Flow Graph
A control flow graph G = (N, E) is a directed graph in which each node represents a program statement and each edge represents a possible flow of control from one statement to another.

9 Example Program
INTEGER I, X, Y, Z
S1: Z = 3
S2: X = 5
S3: FOR 100, I = 1,1000
S4: Y = X + Z
S5: X = X * Y
S6: EXCLUDE_HERE
S7: CHECKPOINT_HERE
S8: 100 CONTINUE
S9: END

10 Example CFG
[Figure: control flow graph of the example program, nodes S1 through S9]

11 Find Sub-graphs
Given the CFG, G, of our program, find all sub-graphs G’, where G’ is rooted at an EXCLUDE_HERE and contains all paths reachable from that EXCLUDE_HERE that do not pass through another EXCLUDE_HERE.

12 Example G’
[Figure: the sub-graph G’ within the example CFG, nodes S1 through S9]
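The sub-graph search can be sketched in a few lines of Python. This is only an illustration, not the authors' implementation: the edge list is an assumption reconstructed from the example program (in particular that S8 branches back to S4 and falls through to S9), and the names `SUCC`, `EXCLUDE_HERE` and `subgraph` are hypothetical.

```python
from collections import deque

# Assumed successor lists for the example program (S8 loops back to S4).
SUCC = {
    "S1": ["S2"], "S2": ["S3"], "S3": ["S4"], "S4": ["S5"], "S5": ["S6"],
    "S6": ["S7"], "S7": ["S8"], "S8": ["S4", "S9"], "S9": [],
}
EXCLUDE_HERE = {"S6"}  # statements that are EXCLUDE_HERE directives


def subgraph(root, succ, exclude_nodes):
    """Nodes reachable from root without passing through an EXCLUDE_HERE."""
    seen = {root}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for nxt in succ[node]:
            # Stop at any EXCLUDE_HERE (including the root itself, reached via the loop).
            if nxt in exclude_nodes or nxt in seen:
                continue
            seen.add(nxt)
            queue.append(nxt)
    return seen


g_prime = subgraph("S6", SUCC, EXCLUDE_HERE)
print(sorted(g_prime))
```

Under these assumed edges, G’ contains the loop body and the exit, but not the initialisation statements S1 to S3, since they are not reachable from S6.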

13 Strategy
For each G’, calculate two sets:
DE(G’): all variables that are dead at every CHECKPOINT_HERE in G’
RO(G’): all variables that are read-only throughout G’
At each EXCLUDE_HERE insert calls to:
exclude_bytes(v, CKPT_DEAD) for all v in DE(G’)
exclude_bytes(v, CKPT_READONLY) for all v in RO(G’)
include_bytes(v) for all v that are in neither DE(G’) nor RO(G’)

14 Determine Memory Accesses
For each statement S, determine the membership of three sets:
MAY_REF(S): every location that may be referenced by some execution of S
MAY_DEF(S): every location that may be defined by some execution of S
MUST_DEF(S): every location that will be defined by every execution of S

15 An Aside
Because our example has no arrays, pointers, etc., MUST_DEF(S) and MAY_DEF(S) will be the same set.

16 Example
[Figure: the example CFG with each node labelled {REF},{DEF}]
Statement   {REF}    {DEF}
S1          {}       {Z}
S2          {}       {X}
S3          {}       {I}
S4          {X,Z}    {Y}
S5          {X,Y}    {X}
S6          {}       {}
S7          {}       {}
S8          {I}      {I}
S9          {}       {}

17 Liveness/Deadness
v is ‘live’ at S if there is a path from S to some S’ such that v ∈ MAY_REF(S’) and, for all S’’ on the path, v ∉ MUST_DEF(S’’).
That is, v is live at S if it may be read at some (later) S’ without being redefined somewhere between the two.
If v is not live at S, we say it is dead at S.

18 DEAD(S)
The set DEAD(S) is the set of variables that are dead immediately before the execution of S.
v ∈ DEAD(S) if v is dead everywhere below S or is redefined at S, unless it is referenced at S.
We calculate DEAD(S) with an iterative algorithm:
1. For all S, set DEAD(S) = V, the set of all variables
2. For every statement S, DEAD(S) = ((∩_{S’} DEAD(S’)) ∪ MUST_DEF(S)) − MAY_REF(S), where S’ ranges over the successors of S
3. Repeat step 2 until all DEAD(S) converge

19 Data Flow Eqn. For DEAD
DEAD(S) = F_S(X), where
F_S(X) = V if S is END
F_S(X) = (X ∪ MUST_DEF(S)) − MAY_REF(S) otherwise
and X = ∩_{S’} DEAD(S’)
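The iteration can be sketched in Python for the example program. This is a minimal illustration under stated assumptions rather than the authors' code: the successor lists (S8 branching to S4 and S9) are reconstructed from the program text, MUST_DEF is taken equal to MAY_DEF as in the example, and `compute_dead` is a hypothetical name.

```python
V = {"I", "X", "Y", "Z"}  # all variables of the example program

# Assumed control flow: S8 loops back to S4 or exits to S9.
SUCC = {"S1": ["S2"], "S2": ["S3"], "S3": ["S4"], "S4": ["S5"], "S5": ["S6"],
        "S6": ["S7"], "S7": ["S8"], "S8": ["S4", "S9"], "S9": []}
MAY_REF = {"S1": set(), "S2": set(), "S3": set(), "S4": {"X", "Z"},
           "S5": {"X", "Y"}, "S6": set(), "S7": set(), "S8": {"I"}, "S9": set()}
MUST_DEF = {"S1": {"Z"}, "S2": {"X"}, "S3": {"I"}, "S4": {"Y"},
            "S5": {"X"}, "S6": set(), "S7": set(), "S8": {"I"}, "S9": set()}


def compute_dead(succ, may_ref, must_def, universe):
    dead = {s: set(universe) for s in succ}      # step 1: DEAD(S) = V
    changed = True
    while changed:                               # step 3: repeat until stable
        changed = False
        for s in succ:                           # step 2: apply F_S
            if not succ[s]:                      # END has no successors
                new = set(universe)
            else:
                x = set(universe)
                for nxt in succ[s]:              # X = intersection over successors
                    x &= dead[nxt]
                new = (x | must_def[s]) - may_ref[s]
            if new != dead[s]:
                dead[s] = new
                changed = True
    return dead


dead = compute_dead(SUCC, MAY_REF, MUST_DEF, V)
for s in sorted(dead):
    print(s, sorted(dead[s]))
```

With these inputs only Y is dead just before the checkpoint at S7: X, Z and I are all read on the next trip around the loop before being redefined.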

20 The Set DE(S)
The set DE(S) is the set of all variables that are dead at every CHECKPOINT_HERE below S, in the same sub-graph.
Calculate iteratively, as before:
F_S(X) = X ∩ DEAD(S) if S is CHECKPOINT_HERE
F_S(X) = V if S is EXCLUDE_HERE or END
F_S(X) = X otherwise
where X = ∩_{S’} DE(S’)

21 The Set RO(S)
v is read-only at S if v ∉ MAY_DEF(S).
The set RO(S) is the set of variables that are read-only along all paths from S in the same sub-graph.
F_S(X) = V if S is EXCLUDE_HERE or END
F_S(X) = X − MAY_DEF(S) otherwise
where X = ∩_{S’} RO(S’)

22 Solution to Example
DE(G’) and RO(G’) are defined to be DE(S) and RO(S), where S is the statement directly following the EXCLUDE_HERE.
For our example, DE(S7) = {Y} and RO(S7) = {Z}.
S6 would become:
exclude_bytes(Y, CKPT_DEAD)
exclude_bytes(Z, CKPT_READONLY)
include_bytes(everything else)
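The whole pipeline for the example (DEAD, then DE and RO, then the generated calls) can be reproduced with a short Python sketch. It is an illustration only: the control-flow edges are an assumption reconstructed from the program text, MAY_DEF and MUST_DEF are merged into one `DEF` table because they coincide in this example, and `solve`, `de_transfer` and `ro_transfer` are hypothetical names. The calls are printed in the simplified form used on the slides.

```python
V = {"I", "X", "Y", "Z"}
# Assumed control flow for the example program (S8 loops back to S4).
SUCC = {"S1": ["S2"], "S2": ["S3"], "S3": ["S4"], "S4": ["S5"], "S5": ["S6"],
        "S6": ["S7"], "S7": ["S8"], "S8": ["S4", "S9"], "S9": []}
REF = {"S1": set(), "S2": set(), "S3": set(), "S4": {"X", "Z"}, "S5": {"X", "Y"},
       "S6": set(), "S7": set(), "S8": {"I"}, "S9": set()}
DEF = {"S1": {"Z"}, "S2": {"X"}, "S3": {"I"}, "S4": {"Y"}, "S5": {"X"},
       "S6": set(), "S7": set(), "S8": {"I"}, "S9": set()}
EXCLUDE, CHECKPOINT, END = {"S6"}, {"S7"}, {"S9"}


def solve(transfer):
    """Iterate value[S] = transfer(S, intersection of successor values) to a fixed point."""
    value = {s: set(V) for s in SUCC}
    changed = True
    while changed:
        changed = False
        for s in SUCC:
            x = set(V)
            for nxt in SUCC[s]:
                x &= value[nxt]
            new = transfer(s, x)
            if new != value[s]:
                value[s], changed = new, True
    return value


dead = solve(lambda s, x: set(V) if s in END else (x | DEF[s]) - REF[s])


def de_transfer(s, x):
    if s in CHECKPOINT:
        return x & dead[s]          # must also be dead at this checkpoint
    if s in EXCLUDE or s in END:
        return set(V)               # sub-graph boundary
    return x


def ro_transfer(s, x):
    if s in EXCLUDE or s in END:
        return set(V)               # sub-graph boundary
    return x - DEF[s]               # anything written here is not read-only


de, ro = solve(de_transfer), solve(ro_transfer)
after = SUCC["S6"][0]               # statement directly following EXCLUDE_HERE
calls = [f"exclude_bytes({v}, CKPT_DEAD)" for v in sorted(de[after])]
calls += [f"exclude_bytes({v}, CKPT_READONLY)" for v in sorted(ro[after])]
calls += [f"include_bytes({v})" for v in sorted(V - de[after] - ro[after])]
print("\n".join(calls))
```

Here "everything else" works out to I and X, the two variables that are both live at the checkpoint and modified inside the loop.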

23 Uncoordinated Distributed Checkpoints Q: How can we extend our uniprocessor checkpointing to a distributed system? A: Each process in the distributed system takes an independent checkpoint

24 Global State
The global state is a collection of the states of each of the individual processes (and of the communication channels).
A consistent global state is one that may occur during a failure-free, correct run of the computation.

25 Consistent States are states that may have occurred
[Figure: two example timelines of processes p and q]

26 Inconsistent States are states that could not have occurred
Here processor p has received a message that has not been sent.
[Figure: timelines of processes p and q]

27 Inconsistent States
Inconsistent states can only occur where there have been failures and the processes have been restarted from their checkpoints.
A rollback-recovery system must ensure that the system is restarted in a consistent state, but not necessarily a state that has ever occurred.

28 Consistent Global Checkpoint
A consistent global checkpoint is a set of checkpoints, one from each process, that corresponds to a consistent global state.
If processes take their checkpoints independently, they must search for a consistent global state upon restart.
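The consistency condition (no message is recorded as received unless it is also recorded as sent) can be checked mechanically. The Python sketch below is an illustration under assumptions that are not in the slides: each process's local events are numbered from 1, a global state is given as the number of events each process has executed, and `is_consistent` is a hypothetical helper.

```python
def is_consistent(cut, messages):
    """cut[p] is the number of local events of process p included in the state.

    messages is a list of (sender, send_event, receiver, receive_event) tuples,
    where the event numbers are positions in each process's local history.
    """
    for sender, send_event, receiver, receive_event in messages:
        received = receive_event <= cut[receiver]
        sent = send_event <= cut[sender]
        if received and not sent:
            return False  # a receive without a matching send could not have occurred
    return True


# q's 2nd event sends a message that p receives as its 3rd event.
messages = [("q", 2, "p", 3)]
print(is_consistent({"p": 3, "q": 1}, messages))  # p received, q has not yet sent
print(is_consistent({"p": 3, "q": 2}, messages))  # both send and receive recorded
print(is_consistent({"p": 2, "q": 2}, messages))  # sent but still in the channel
```

The third case is still consistent: the message is simply in transit, which is why the global state also has to account for the communication channels.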

29 The Domino Effect In the event of a failure, ideally we would like to only roll back the failed process; however, doing so might leave the system in an inconsistent state, necessitating that others be rolled back as well

30 Example
If r fails and restarts at C, message 8 must be invalidated, forcing q to roll back to B. Message 7 is now invalidated, forcing p to roll back to A, and so on, all the way to the beginning.
[Figure: timelines of processes p, q and r with messages 1 through 8, checkpoints A, B and C, and a * marker]

31 Calculating the Recovery Line
The recovery line for an uncoordinated system is the most recent consistent set of checkpoints, one for each process in the system.
In order to calculate the recovery line after a failure, the processes record the dependencies among their checkpoints during failure-free operation.

32 Protocol
Let c_{i,x} be the x-th checkpoint of process P_i.
Let I_{i,x} denote the interval between checkpoints c_{i,x-1} and c_{i,x}.
If P_i sends a message m to P_j during interval I_{i,x}, P_i piggybacks (i,x) on m.
If P_j receives m during I_{j,y}, it records the dependence of I_{j,y} on I_{i,x}, and later saves it in checkpoint c_{j,y}.
If P_i fails, then on recovery all the other processes send their dependency information to P_i, which uses it to calculate the recovery line.
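The bookkeeping each process performs during failure-free operation can be sketched as a small Python class. This is a toy model under assumptions, not the protocol's actual implementation: messages are plain dictionaries, the initial checkpoint c_{i,0} is implicit so every process starts in interval 1, and the names `Process`, `pending` and `checkpoints` are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Process:
    pid: int
    interval: int = 1                                # index x of the current interval I_{i,x}
    pending: set = field(default_factory=set)        # dependencies recorded in this interval
    checkpoints: dict = field(default_factory=dict)  # x -> dependencies saved in c_{i,x}

    def send(self, payload):
        # Piggyback (i, x) on the outgoing message.
        return {"payload": payload, "tag": (self.pid, self.interval)}

    def receive(self, message):
        # Record that the current interval depends on the sender's interval.
        self.pending.add(message["tag"])

    def checkpoint(self):
        # Checkpoint c_{i,x} closes interval I_{i,x} and stores its dependencies.
        self.checkpoints[self.interval] = frozenset(self.pending)
        self.pending = set()
        self.interval += 1


p, q = Process(pid=0), Process(pid=1)
m = p.send("hello")      # sent during I_{0,1}
q.receive(m)             # received during I_{1,1}
q.checkpoint()           # c_{1,1} now records the dependence on I_{0,1}
print(q.checkpoints)
```

On recovery, the saved dependency sets are what the surviving processes would hand to the failed process so that it can build the dependency graph.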

33 Checkpoint Dependency Graph
P_i takes the dependency information and constructs a dependency graph.
The nodes of the graph are all of the checkpoints c_{a,b}, plus the current state of each un-failed process.
A directed edge is drawn from c_{i,x-1} to c_{j,y} if either:
i ≠ j and a message was sent from I_{i,x} to I_{j,y}, or
i = j and y = x
An edge from c_{i,x-1} to c_{j,y} with i ≠ j implies that c_{j,y} records the receipt of a message that c_{i,x-1} does not record as sent.

34 Example
[Figure: checkpoint dependency graph for processes p, q and r, with a * marker]

35 Algorithm
Include the last checkpoint of each failed process in RecoverySet (RS)
Include the current state of each un-failed process in RS
Mark all checkpoints reachable from any node in RS
While (at least one node in RS is marked)
    Replace each marked RS element with the latest unmarked checkpoint of the same process
    Mark all checkpoints reachable from any node in RS
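The marking loop can be sketched in Python on a small made-up scenario; the slides' own example graph did not survive extraction, so the two-process graph below is purely hypothetical. Nodes are (process, index) pairs, the current state of an un-failed process is modelled as one extra node after its last checkpoint, and the sketch assumes no edge ever points at an initial checkpoint (index 0). The names `reachable` and `recovery_line` are illustrative.

```python
def reachable(edges, starts):
    """All nodes reachable from any start node by following at least one edge."""
    seen, stack = set(), list(starts)
    while stack:
        node = stack.pop()
        for nxt in edges.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def recovery_line(edges, start):
    """start maps each process to its last checkpoint (failed) or current state (un-failed)."""
    line = dict(start)
    marked = reachable(edges, line.values())
    while any(node in marked for node in line.values()):
        for proc, (_, idx) in list(line.items()):
            while (proc, idx) in marked:   # fall back to the latest unmarked checkpoint
                idx -= 1
            line[proc] = (proc, idx)
        marked |= reachable(edges, line.values())
    return line


# Hypothetical scenario: process 0 fails after checkpoint c_{0,1};
# process 1 is alive, and its current state is modelled as node (1, 2).
EDGES = {
    (0, 0): [(0, 1)],          # same-process edge
    (1, 0): [(1, 1)],          # same-process edge
    (1, 1): [(1, 2), (0, 1)],  # same-process edge; message from I_{1,2} to I_{0,1}
    (0, 1): [(1, 2)],          # message from I_{0,2} to I_{1,2}
}
print(recovery_line(EDGES, {0: (0, 1), 1: (1, 2)}))
```

In this scenario the failure of process 0 forces process 1 back to c_{1,1}, which in turn invalidates a message process 0 had received before c_{0,1}, so process 0 ends up at c_{0,0}: a two-step instance of the domino effect.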

36 Finding the Recovery Set
[Figure: dependency graph with X marks on checkpoints]

37 [Figure: dependency graph with further X marks on checkpoints]

38 The Recovery Line
[Figure: timelines of processes p, q and r, with a * marker]

