Presentation is loading. Please wait.

Presentation is loading. Please wait.

Selected topics in distributed computing Shmuel Zaks

Similar presentations


Presentation on theme: "Selected topics in distributed computing Shmuel Zaks"— Presentation transcript:

1 Selected topics in distributed computing Shmuel Zaks zaks@cs.technion.ac.il

2 l'Aquila, Distributed Computing, 12-14/1/092 Part 1: Part 1: Self stabilization

3 l'Aquila, Distributed Computing, 12-14/1/093 A new approach to fault tolerance A unified approach to transient failures by formally incorporating them into the design model

4 l'Aquila, Distributed Computing, 12-14/1/094 shared memory, synchronous 6 6 6 6 6 x x x x x 7 7 7 7 7 8 8 8 8 8 x x x x x clock synchronization Part 3.1: Clock synchronization (of Gouda and Herman)

5 l'Aquila, Distributed Computing, 12-14/1/095 Program at each processor p : choose a neighbor x of p ; if t(p) = t(x) then t(p):= t(p)+1; Stabilizing if all start with the same time, however …

6 l'Aquila, Distributed Computing, 12-14/1/096 67 6 4 7 Is it stabilizing? no! (may even deadlock) if t(p) = t(x) then t(p):= t(p)+1;

7 l'Aquila, Distributed Computing, 12-14/1/097 Program at each processor p : choose a neighbor x of p ; if t(p) = t(x) then t(p):= t(p)+1; Is it stabilizing? if t(p) < t(x) then t(p):= t(x);

8 l'Aquila, Distributed Computing, 12-14/1/098 67 6 4 7 x 7 7 7 7 7 x x x x 8 8 x x 8 x 8 x x 8 Is it stabilizing?yes! if t(p) = t(x) then t(p):= t(p)+1; if t(p) < t(x) then t(p):= t(x);

9 l'Aquila, Distributed Computing, 12-14/1/099 67 6 4 7 x 8 8 7 7 7 x x x x 9 9 x x 8 x 8 x x 8 Is it stabilizing?no! if t(p) = t(x) then t(p):= t(p)+1; if t(p) < t(x) then t(p):= t(x);

10 l'Aquila, Distributed Computing, 12-14/1/0910 Program at each processor p : choose a neighbor x of p if t(p) = t(x) then t(p):= t(p)+1; if t(p) < t(x) then t(p):= t(x); fairly; Is it stabilizing? yes!

11 l'Aquila, Distributed Computing, 12-14/1/0911 proof

12 l'Aquila, Distributed Computing, 12-14/1/0912 67 6 4 7 n=5, range(S)=7-4=3, top(S)=2 S:

13 l'Aquila, Distributed Computing, 12-14/1/0913 67 6 4 7 n=5, range(S 1 )=7-7=0, top(S 1 )=5 S1:S1: 7 7 7 7 7 x x x x x

14 l'Aquila, Distributed Computing, 12-14/1/0914 67 6 4 7 n=5, range(S 2 )=8-7=1, top(S 2 )=2 S2:S2: 7 7 7 8 8 x x x x x

15 l'Aquila, Distributed Computing, 12-14/1/0915 67 6 4 7 n=5, range(S 3 )=9-8=1, top(S 3 )=2 S3:S3: 7 7 7 8 8 x x x x x 9 8 8 8 9 x x x x x

16 l'Aquila, Distributed Computing, 12-14/1/0916 67 6 4 7 hw: prove. Use for: correctness (especially termination) + complexity S :

17 l'Aquila, Distributed Computing, 12-14/1/0917 distributed control ring of (finite-state machine) processes for each machine define privileges (when a transition is enabled) nodes are influenced only by neighbors legitimate states in each legitimate state there is at least one privileged machine in each legitimate state each possible move will bring the system again to a legitimate state Part 3.2: Mutual exclusion (Dijkstra’s 1 st algorithm)

18 l'Aquila, Distributed Computing, 12-14/1/0918 A system is self-stabilizing if, regardless of the initial state and regardless of the privilege selected each time for the next move, at least one privilege will always be present and the system is guaranteed to find itself in a legitimate state after a finite number of moves.

19 l'Aquila, Distributed Computing, 12-14/1/0919 non-stable states stable states Finite number of transitions

20 l'Aquila, Distributed Computing, 12-14/1/0920 Token ring. N machines, denoted 0 through N-1. A machine is priviliged (“has token”) or not priviliged. daemon/scheduler determines which of the priviliged machines will make the next move Legitimate States: all states with exactly one privileged machine. In Dijkstra’s model:

21 l'Aquila, Distributed Computing, 12-14/1/0921 Notes to consider: scheduler (daemon) : points at each time to one of the privileged machines, or point at each time to one of the machines. centralized scheduler, distributed scheduler fairness: Do we need fairness, and of what kind? Unconditional Fairness: Each machine gets pointed to infinitely often. Type of fairness: - each machine is pointed at infinitely often, - round robin, or - other.

22 l'Aquila, Distributed Computing, 12-14/1/0922 N processors 0,1,…,N-1 Solution with k-state Machines machine P 0 if x[0] = x[N-1] then x[0] := x[0]+1 mod k machine P i ( i= 1,2,…,N-1) if x[i] ≠ x[i-1] then x[i] := x[i-1] Dijkstra’s 1 st algorithm

23 l'Aquila, Distributed Computing, 12-14/1/0923 2 0 1 0 p0p0 p3p3 p1p1 p2p2 N=4, K=3 State : 0,1,2

24 l'Aquila, Distributed Computing, 12-14/1/0924 2 0 1 0 p0p0 p3p3 p1p1 p2p2 machine P 0 : if x[0] = x[N-1] then x[0] := x[0]+1 mod k

25 l'Aquila, Distributed Computing, 12-14/1/0925 2 0 1 0 p0p0 p3p3 p1p1 p2p2 machine P i : if x[i] ≠ x[i-1] then x[i] := x[i-1]

26 l'Aquila, Distributed Computing, 12-14/1/0926 2 0 1 0 p0p0 p3p3 p1p1 p2p2 machine P i : if x[i] ≠ x[i-1] then x[i] := x[i-1] machine P 0 : if x[0] = x[N-1] then x[0] := x[0]+1 mod k So far:

27 l'Aquila, Distributed Computing, 12-14/1/0927 2 0 1 0 p0p0 p3p3 p1p1 p2p2 machine P i : if x[i] ≠ x[i-1] then x[i] := x[i-1] machine P 0 : if x[0] = x[N-1] then x[0] := x[0]+1 mod k 21 2 2 0 0 0 0 K=3 Etc… Example 1

28 l'Aquila, Distributed Computing, 12-14/1/0928 2 0 1 0 p0p0 p3p3 p1p1 p2p2 machine P i : if x[i] ≠ x[i-1] then x[i] := x[i-1] machine P 0 : if x[0] = x[N-1] then x[0] := x[0]+1 mod k 21 2 2 0 1 1 0 K=2 0110 0 0 1 Etc… Example 2

29 l'Aquila, Distributed Computing, 12-14/1/0929 2 0 1 0 p0p0 p3p3 p1p1 p2p2 machine P i : if x[i] ≠ x[i-1] then x[i] := x[i-1] machine P 0 : if x[0] = x[N-1] then x[0] := x[0]+1 mod k 21 2 2 0 1 1 0 K=2 011 1 0 0 0 1 1 0 01 0 0 1 Etc… Example 3

30 l'Aquila, Distributed Computing, 12-14/1/0930 Informally: “If machine is privileged, then make it non-privileged”. Note: machine P i : if x[i] ≠ x[i-1] then x[i] := x[i-1] machine P 0 : if x[0] = x[N-1] then x[0] := x[0]+1 mod k

31 l'Aquila, Distributed Computing, 12-14/1/0931 Legitimate States: exactly one privileged machine 2 0 1 0 p0p0 p3p3 p1p1 p2p2 21 2 2 1 1 1 0 K=2 1 1 machine P i : if x[i] ≠ x[i-1] then x[i] := x[i-1] machine P 0 : if x[0] = x[N-1] then x[0] := x[0]+1 mod k 0

32 l'Aquila, Distributed Computing, 12-14/1/0932 Legitimate States 0 0 0 0 0 …… 1 0 0 0 0 1 1 0 0 0 0 0 K–1 K–1 K–1 …… 1 1 1 1 1 0 K-1 K–1 K–1 K–1 2 1 1 1 1 … 2 2 2 2 2 … K–1 K–1 K–1 K–1 K–1

33 l'Aquila, Distributed Computing, 12-14/1/0933 Goal: Show that from an arbitrary state, the system stabilizes to a legitimate state in a finite number of steps. Intuition: 0 1 2 3 4 0 2 4 1 0 1 2 4 1 0 1 2 4 1 1 2 2 4 1 1 2 2 2 1 1  0 1 2 3 4 0 3 2 1 0 1 3 2 1 0 1 1 2 1 0 1 1 2 2 0 1 1 2 2 2  N=5, K = 5 N=5, K = 4

34 l'Aquila, Distributed Computing, 12-14/1/0934 0 1 2 3 4 0 1 2 1 0 0 0 2 1 0 1 0 2 1 0 1 0 2 1 1 1 0 2 2 1 1 0 0 2 1 0 1 2 3 4 1 1 0 2 1 2 1 0 2 1 2 1 0 2 2 2 1 0 0 2 2 1 1 0 2 2 2 1 0 2 0 2 1 0 2 0 2 1 0 0 0 2 1 1 0 0 2 2 1 0 0 0 2 1 0 N=5, K = 3 hw: can you get to this configuration? hw: which configurations have the same property? (Check also for all other examples.)

35 l'Aquila, Distributed Computing, 12-14/1/0935 Theorem: Assuming a centralized scheduler: starting with any initial configuration, Dijkstra’s 1st algorithm stabilizes iff K > N-2.

36 l'Aquila, Distributed Computing, 12-14/1/0936 Lemma 2: (liveness) At every configuration, at least one machine is privileged. Lemma 3: Starting from any configuration, P 0 will eventually make a move. Corollary: Starting at any configuration, P 0 will make an infinite number of steps. (this still does not guarantee stabilization.) machine P i : if x[i] ≠ x[i-1] then x[i] := x[i-1] machine P 0 : if x[0] = x[N-1] then x[0] := x[0]+1 mod k Lemma 1: If all states are equal then the system is stabilized. (Hw)

37 l'Aquila, Distributed Computing, 12-14/1/0937 Definition: P i is called special in a configuration if Lemma 4: If P 0 is special in a configuration, then the system will eventually stabilize. Lemma 5: If in a configuration one of the states is missing, then the system will eventually stabilize. machine P i : if x[i] ≠ x[i-1] then x[i] := x[i-1] machine P 0 : if x[0] = x[N-1] then x[0] := x[0]+1 mod k

38 l'Aquila, Distributed Computing, 12-14/1/0938 Theorem: Assuming a centralized scheduler: starting with any initial configuration, Dijkstra’s 1st algorithm stabilizes iff K > N-2. Proof: Case 1: K< N-1 Case 2: K = N-1 Case 3: K = N Case 4: K > N

39 l'Aquila, Distributed Computing, 12-14/1/0939 Theorem: Assuming a centralized scheduler: starting with any initial configuration, Dijkstra’s 1st algorithm stabilizes iff K > N-2. Proof: (e.g., N=4, k=2) Case 1. k < N-1

40 l'Aquila, Distributed Computing, 12-14/1/0940 Theorem: Assuming a centralized scheduler: starting with any initial configuration, Dijkstra’s 1st algorithm stabilizes iff K > N-2. Proof: Case 4. k > N … Lemma 5

41 l'Aquila, Distributed Computing, 12-14/1/0941 Theorem: Assuming a centralized scheduler: starting with any initial configuration, Dijkstra’s 1st algorithm stabilizes iff K > N-2. Proof: Case 3. k = N … Lemma 5 Either one state is missing, or all states are distinct. … Lemma 4

42 l'Aquila, Distributed Computing, 12-14/1/0942 Theorem: Assuming a centralized scheduler: starting with any initial configuration, Dijkstra’s 1st algorithm stabilizes iff K > N-2. Proof: Case 2. k = N - 1 … Lemma 5 Either one state is missing, or one state occurs twice.

43 l'Aquila, Distributed Computing, 12-14/1/0943 … Lemma 4Either P 0 is special, or x[0] = x[i] i < N-1 i = N-1 P 0 is not enabled, and after any other move there is a mising state, and … Lemma 5. If P i, i 0 makes a move, there is missing state, and done by Lemma 5. If P 0 makes a move, we get back to the previous case. So, one state occurs twice.

44 l'Aquila, Distributed Computing, 12-14/1/0944 Theorem: Assuming a distrubted scheduler: starting with any initial configuration, Dijkstra’s 1st algorithm stabilizes iff K > N-1. hw: prove

45 l'Aquila, Distributed Computing, 12-14/1/0945 S = a configuration (a system state) r(S) = number of maximal runs of equal states Example: S = 2222011000, f(S) = 4 r(S) + 1 if first(S) = last(S) f(S) = r(S) otherwise. Note: S is legitimate iff f(S) = 2. Use of a potential function

46 l'Aquila, Distributed Computing, 12-14/1/0946 Show: If K > N and f(S) > 2, then f(S) never increases and is eventually decreased. Recall: A common technique to prove termination and correctness Hw: 1. finish the proof. 2. use f to analyze complexity

47 l'Aquila, Distributed Computing, 12-14/1/0947 Applications Distributed data structures Digital and analog circuits Genetic algorithms Network protocols Sensor networks Some general comments on self stabilization

48 l'Aquila, Distributed Computing, 12-14/1/0948 is a special kind of fault tolerance guarantees an automatic recovery from a transient failure as a design goal is a too strong property and thus either too difficult to achieve or achieved at the expense of other goals Self stabilization

49 l'Aquila, Distributed Computing, 12-14/1/0949 Uniform protocol : same program at each processor Theorem: There is no uniform protocol that solves the mutual exclusion problem (under either a centralized or a distributed scheduler) Back to Dijkstra’s algorithm machine P i : if x[i] ≠ x[i-1] then x[i] := x[i-1] machine P 0 : if x[0] = x[N-1] then x[0] := x[0]+1 mod k

50 l'Aquila, Distributed Computing, 12-14/1/0950 0 0 0 0 p0p0 p3p3 p1p1 p2p2 0 0 p6p6 p5p5 3 3 3 7 7 7

51 l'Aquila, Distributed Computing, 12-14/1/0951 N processors 0,1,…,N-1 Solution with 3-state Machines P 0: if x[0] +1=x[1] then x[0]:=x[0]-1 P N-1 : if (x[N-2] =x[0]) (x[N-1]≠ x[0]+1) then x[N-1]:=x[0]+1; P i ( i= 1,2,…,N-2) : if (x[i]+1=x[i-1]) (x[i]+1=x[i+1]) then x[i] := x[i]+1; Part 3.3: Mutual exclusion (Dijkstra’s 3 rd algorithm)

52 l'Aquila, Distributed Computing, 12-14/1/0952 2 1 2 1 p0p0 p4p4 p1p1 p2p2 State : 0,1,2 p3p3 0

53 l'Aquila, Distributed Computing, 12-14/1/0953 P 0: if x[0] +1=x[1] then x[0]:=x[0]-1 2 1 2 1 p0p0 p4p4 p1p1 p2p2 p3p3 0

54 l'Aquila, Distributed Computing, 12-14/1/0954 P N-1 : if (x[N-2] =x[0]) (x[N-1]≠ x[0]+1) then x[N-1]:=x[0]+1; 2 1 2 1 p0p0 p4p4 p1p1 p2p2 p3p3 0

55 l'Aquila, Distributed Computing, 12-14/1/0955 P i ( i= 1,2,…,N-2) : if (x[i]+1=x[i-1]) (x[i]+1=x[i+1]) then x[i] := x[i]+1; 2 1 2 1 p0p0 p4p4 p1p1 p2p2 p3p3 0

56 l'Aquila, Distributed Computing, 12-14/1/0956 P i ( i= 1,2,…,N-2) : if (x[i]+1=x[i-1]) (x[i]+1=x[i+1]) then x[i] := x[i]+1; 2 1 2 1 p0p0 p4p4 p1p1 p2p2 p3p3 0

57 l'Aquila, Distributed Computing, 12-14/1/0957 P i ( i= 1,2,…,N-2) : if (x[i]+1=x[i-1]) (x[i]+1=x[i+1]) then x[i] := x[i]+1; 2 1 2 1 p0p0 p4p4 p1p1 p2p2 p3p3 0

58 l'Aquila, Distributed Computing, 12-14/1/0958 2 1 2 1 p0p0 p4p4 p1p1 p2p2 p3p3 0

59 l'Aquila, Distributed Computing, 12-14/1/0959 1 1 p0p0 p4p4 p1p1 p2p2 p3p3 1 If all are in the same state? if (x[N-2] =x[0]) & (x[N-1]≠ x[0]+1) then x[N-1]:=x[0]+1; 1 1 2 if (x[i]+1=x[i-1]) v (x[i]+1=x[i+1]) then x[i] := x[i]+1; 2 2 if x[0] +1=x[1] then x[0]:=x[0]-1 1 0 2

60 l'Aquila, Distributed Computing, 12-14/1/0960 0 1 2

61 l'Aquila, Distributed Computing, 12-14/1/0961 For two neighbors

62 l'Aquila, Distributed Computing, 12-14/1/0962 s: 2 1 2 0 1 1 2 2 f(s) = number of + 2 x number of f(s) = 4 + 2 x 1 = 6

63 l'Aquila, Distributed Computing, 12-14/1/0963

64 l'Aquila, Distributed Computing, 12-14/1/0964 Lemma 1: If there is a unique arrow in a configuration, then it moves back and forth infinitely. Corollary: It suffices to show that from any configuration we eventually reach a configuration with a unique arrow. Lemma 2: (liveness) Starting from any initial configuration, at least one process is priveleged.

65 l'Aquila, Distributed Computing, 12-14/1/0965 Lemma 3: Starting from any configuration, eventually. Proof: if during an execution -. else - f=0, but then there is no arrow, and in the next step there is a unique arrow, and then apply Lemma 1.

66 l'Aquila, Distributed Computing, 12-14/1/0966 Lemma 4: Between every two successive moves of P N-1 there is at least one move of P 0. Proof: recall: the program for P N-1 : if (x[N-2] =x[0]) (x[N-1]≠ x[0]+1) then x[N-1]:=x[0]+1; After P N-1 makes a step x[N-1]=x[0]+1, And this condition makes P N-1 non-privileged. This condition can become true only after P 0 makes a step [ P 0: if x[0] +1=x[1] then x[0]:=x[0]-1 ].

67 l'Aquila, Distributed Computing, 12-14/1/0967 Lemma 5: A sequence of moves in which P 0 does not move is finite. (therefore – impossible by Lemma 2) Proof: by Lemma 4 – in such a sequence P N-1 makes at most one step. so, from some point only P i makes a steps,. steps 1,2,3,4,5. 1,2 – no change in arrows 3,4,5 – decrease no. of arrows..

68 l'Aquila, Distributed Computing, 12-14/1/0968 Lemma 5: A sequence of moves in which P 0 does not move is finite. (therefore – impossible by Lemma 2) Proof (cont.): from some point only steps 1 and 2. Follows from topology.

69 l'Aquila, Distributed Computing, 12-14/1/0969 Theorem: Starting from any initial configuration, the system eventually stabilizes. Proof: by Lemma 5 P 0 makes infinite no. of moves. P 0 … P 0 … P 0 … P 0 … P 0 … …

70 l'Aquila, Distributed Computing, 12-14/1/0970 Proof (cont.) P 0 … P 0 … P 0 … P 0 … P 0 … … phases “there exists a rightmost arrow, and it points to the right” Falsified in each phase. P 0 … P 0 … P 0 … P 0 … P 0 … … F T F T F T F T F T

71 l'Aquila, Distributed Computing, 12-14/1/0971 This is done only in steps 3, 4, 6 6 - ok so, assume only 3 and 4. in each phase: they have Δf=-3. 0 and 7 Δf<=2. others Δf=0. so, each phase f is decreased by at least 1.

72 l'Aquila, Distributed Computing, 12-14/1/0972 Complexity? (following Chernoy, Shalom and Z.) hw:

73 l'Aquila, Distributed Computing, 12-14/1/0973 E. W. Dijkstra, Self-stabilizing systems in spite of distributed control, CACM, 17,11, 1974, pp. 643-644. E. W. Dijkstra, A belated proof of self-stabilization, Distributed Computing, 1, 1, 1986, pp. 5-6. M. G. Gouda and T. Herman, Stabilizing Unison, Department of Computer Science, 1990. V. Chernoy, M. Shalom and S. Zaks, A Self- stabilizing algorithm with tight bounds for mutual exclusion on a ring, DISC 2008. References


Download ppt "Selected topics in distributed computing Shmuel Zaks"

Similar presentations


Ads by Google