Fast Leader (Full) Recovery despite Dynamic Faults Ajoy K. Datta Stéphane Devismes Lawrence L. Larmore Sébastien Tixeuil.


1 Fast Leader (Full) Recovery despite Dynamic Faults Ajoy K. Datta Stéphane Devismes Lawrence L. Larmore Sébastien Tixeuil

2 Joint Work: Ajoy K. Datta & Lawrence L. Larmore, Sébastien Tixeuil. ICDCN, 04/01/2013, Mumbai

3 Self-Stabilization [Dijkstra,74]

4 Self-Stabilization [Dijkstra,74]
A fault = a process state corruption

10 Self-Stabilization [Dijkstra,74]
Recover after any number of transient faults

11 Price of the Versatility
1. Several impossibility results
– E.g., Leader Election and Token Circulation in anonymous networks
2. The stabilization time usually depends on global parameters (diameter, size of the network, …)

13 When only a few faults hit the system
Self-Stabilization: Ω(D) rounds

14 When only a few faults hit the system
Self-Stabilization: Ω(D) rounds
Stronger forms:
– Fault Containment [Ghosh et al, Dist Comp 2007]
– k-adaptive Self-Stabilization [Burman et al, OPODIS’05]
Weakened forms:
– k-stabilization [Beauquier et al, PODC’98]

16 Fault-Containment
Pros
– Self-stabilizing
– If f ≤ k faults, stabilization time in O(f) rounds
– Containment radius
– Fault gap is small
Cons (currently)
– k = 1, or
– Surrounded by a majority of correct processes, or
– Synchronous setting, or
– Probabilistic recovery

17 Fault gap
The minimum time between consecutive faulty transitions to have O(f) recovery time
[Figure: timeline of Legitimate and Illegitimate periods; labels: ≥ fault gap, O(f)]

18 Fault gap
The minimum time between consecutive faulty transitions to have O(f) recovery time
[Figure: timeline of Legitimate and Illegitimate periods; labels: < fault gap, Ω(D)]

19 Time-Adaptive Self-stabilization
Self-Stabilization
If the Hamming distance to a legitimate configuration is f ≤ k, i.e., f ≤ k faults occur simultaneously (static faults),
– “output” stabilization in O(f) rounds

20 Output vs. State Stabilization
[Figure: timeline with labels Illegitimate, f ≤ k faults, Correct Output, Legitimate; durations O(f) and Ω(D)]

21 Output vs. State Stabilization
[Figure: timeline with labels Illegitimate, f ≤ k faults, Correct Output, Legitimate; durations O(f) and Ω(D)]
The fault gap depends on global parameters

22 k-Stabilization (first definition)
If the Hamming distance to a legitimate configuration is f ≤ k, i.e., f ≤ k faults occur simultaneously, the system eventually recovers
Otherwise, no guarantee
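The precondition of this first definition can be illustrated with a short sketch. It assumes, purely for illustration, that a configuration is a list holding one local state per process; the names `hamming_distance` and `within_k_faults` and the state labels are hypothetical and do not come from the slides.

```python
from typing import Hashable, Iterable, Sequence


def hamming_distance(config: Sequence[Hashable], legitimate: Sequence[Hashable]) -> int:
    """Count the processes whose local state differs between two configurations."""
    if len(config) != len(legitimate):
        raise ValueError("configurations must cover the same set of processes")
    return sum(1 for a, b in zip(config, legitimate) if a != b)


def within_k_faults(config: Sequence[Hashable],
                    legitimate_configs: Iterable[Sequence[Hashable]],
                    k: int) -> bool:
    """True if some legitimate configuration is at Hamming distance f <= k."""
    return any(hamming_distance(config, leg) <= k for leg in legitimate_configs)


# Hypothetical 5-process ring: one corrupted process gives f = 1.
legit = ["leader", "follower", "follower", "follower", "follower"]
corrupted = ["leader", "leader", "follower", "follower", "follower"]
f = hamming_distance(corrupted, legit)
```

Under this definition, recovery is only promised when `within_k_faults` holds for the configuration reached after a single faulty transition.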

23 k-Stabilization (first definition)
Pros
– Can solve more problems than self-stabilization
– Usually, only-k-dependent stabilization time
– Usually, only-k-dependent fault gap
Cons
– Not self-stabilizing
– Static faults: f ≤ k faults should occur in a single transition

24 Our definition of k-stabilization
Faulty transition = one process state corruption
Dynamic faults:
– if f ≤ k faulty transitions occur in an arbitrary manner
The system eventually recovers

25 Our definition of k-stabilization
[Figure: Legitimate and Illegitimate configurations; labels: 1 fault, f ≤ k faults]

26 Our contribution
Leader recovery protocol
– On an anonymous (yet oriented) ring
– Asynchronous atomic read/write
– k-stabilizing if n ≥ 18k + 1
– Stabilization time O(k²) rounds
– O(log k) bits per process
– This problem is unsolvable in a self-stabilizing setting

27 Our contribution
The system starts in a legitimate configuration where one process is elected

28 Our contribution
Some faulty transitions occur in an arbitrary manner

29 Our contribution
Some faulty transitions occur in an arbitrary manner
Fault propagation

31 Our contribution
If n ≥ 18k + 1, the system recovers the same leader in O(k²) rounds

36 Fault gap
[Figure: timeline of Legitimate and Illegitimate periods; labels: f ≤ k faulty transitions (twice), 0, O(k²) rounds]

37 Main ideas of the algorithm

38 Vote = Relative Address ∈ {-3k..3k} ∪ {⊥}
[Figure: ring of processes with votes 0, ⊥, ⊥, 3, 2, 1, -1, -2, -3, ⊥; span of 3k marked on each side]
Interval of relevance: 6k+1 votes

39 After k faults
[Figure: ring with votes 0, ⊥, ⊥, 3, 2, 1, -1, -2, -3, ⊥]

40 After k faults
[Figure: ring with votes 0, ⊥, ⊥, 3, 0, 1, -1, -2, -3, ⊥]

41 After k faults
[Figure: ring with votes 1, ⊥, ⊥, 3, 0, 1, 0, -2, -3, ⊥]
At most 3k processes change their votes

42 After k faults
[Figure: ring with votes 1, ⊥, ⊥, 3, 0, 1, 0, -2, -3, ⊥]
At most 3k processes change their votes
Always a majority of votes for the previous leader
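The majority claim can be checked with a little arithmetic. The sketch below assumes the votes being counted are the 6k+1 votes of the previous leader's interval of relevance (the figure the slides give for that interval) and that at most 3k of them change; the function name is hypothetical.

```python
def votes_left_for_leader(k: int) -> int:
    """Worst-case number of votes still naming the previous leader."""
    interval = 6 * k + 1  # votes in the interval of relevance
    changed = 3 * k       # at most 3k processes change their votes
    return interval - changed


for k in range(1, 6):
    left = votes_left_for_leader(k)
    # 3k+1 votes remain, which is a strict majority of 6k+1.
    assert left == 3 * k + 1
    assert 2 * left > 6 * k + 1
```

The surviving count, 3k+1, is the same threshold the query uses to confirm a candidate.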

43 Rumors
[Figure: a process with Vote = 1 and Rumor = 1]
In a legitimate state, Vote = Rumor for every process
Main idea:
Vote: hard to change
Rumor: easy to change

44 Rumors
[Figure: a process with Vote = 1 and Rumor = 2]
If Rumor ≠ Vote
  If Rumor ≠ ⊥
    Candidate ← Rumor
  Else
    Candidate ← Vote
  Initiate Query(Candidate)
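The candidate rule can be written as a small function. This is a sketch only: ⊥ is modeled as `None`, and returning `None` to mean "no query needed" is an assumption of the sketch. It is unambiguous because the rule never selects ⊥ as a candidate.

```python
from typing import Optional

BOTTOM = None  # stands in for ⊥


def choose_candidate(vote: Optional[int], rumor: Optional[int]) -> Optional[int]:
    """Return the candidate to query, or None when Vote = Rumor (no query)."""
    if rumor == vote:
        return None  # vote and rumor agree: nothing to check
    # Rumor ≠ Vote: prefer the rumor unless it is ⊥.
    return rumor if rumor is not BOTTOM else vote
```

For instance, `choose_candidate(1, 2)` selects the rumor `2`, while `choose_candidate(1, BOTTOM)` falls back to the vote `1`.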

45 Rumors
[Figure: a process with Vote = 1 and Rumor = 2]
Query(Candidate) traverses the interval of relevance of the candidate (6k+1 processes) and counts the votes for the candidate
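A centralized simulation of the vote count may help. It is not the distributed query itself, which travels around the ring. It assumes a vote v held at position q names position (q + v) mod n; that sign convention, the array layout, and the name `count_votes` are all assumptions of this sketch. It also assumes n ≥ 6k+1 so the interval does not wrap onto itself.

```python
from typing import List, Optional

BOTTOM = None  # stands in for ⊥


def count_votes(votes: List[Optional[int]], p: int, candidate: int, k: int) -> int:
    """Count votes for the process that `candidate` names, seen from position p."""
    n = len(votes)
    target = (p + candidate) % n  # position named by the candidate (assumed convention)
    count = 0
    # Interval of relevance: the 6k+1 processes within distance 3k of the target.
    for offset in range(-3 * k, 3 * k + 1):
        q = (target + offset) % n
        v = votes[q]
        if v is not BOTTOM and (q + v) % n == target:
            count += 1
    return count


# Hypothetical ring: n = 19, k = 1, leader at position 5.
n, k, leader = 19, 1, 5
votes: List[Optional[int]] = [BOTTOM] * n
for q in range(leader - 3 * k, leader + 3 * k + 1):
    votes[q] = leader - q
before = count_votes(votes, p=4, candidate=1, k=k)

votes[3] = 0  # one faulty transition: process 3 now votes for itself
after = count_votes(votes, p=4, candidate=1, k=k)
rogue = count_votes(votes, p=3, candidate=0, k=k)
```

In this toy ring, the true leader keeps 6 of its 7 votes after the fault, while the corrupted self-vote collects only 1, well short of the 3k+1 = 4 needed.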

46 Query Return
If at least 3k+1 votes for the Candidate
– If Rumor ≠ ⊥ ≠ Candidate, initiate a Denial of Rumor in its interval of relevance
– Vote ← Candidate
– Rumor ← Candidate
Else
– If Rumor = Candidate, then Rumor ← ⊥
– Initiate a Denial of Candidate in its interval of relevance
– If Vote = Candidate, then Vote ← ⊥
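The return rule can be sketched as a state update on a single process. Two assumptions are made here: the condition "Rumor ≠ ⊥ ≠ Candidate" is read as "Rumor is neither ⊥ nor the Candidate", and denials are simply recorded in a list instead of being sent around the ring. `ProcessState` and `on_query_return` are hypothetical names.

```python
from dataclasses import dataclass, field
from typing import List, Optional

BOTTOM = None  # stands in for ⊥


@dataclass
class ProcessState:
    vote: Optional[int]
    rumor: Optional[int]
    denials: List[int] = field(default_factory=list)  # denials this process would initiate


def on_query_return(state: ProcessState, candidate: int,
                    votes_for_candidate: int, k: int) -> ProcessState:
    """Apply the Query Return rule for `candidate`, given the counted votes."""
    if votes_for_candidate >= 3 * k + 1:
        # Candidate confirmed: deny a competing rumor (assumed reading), then adopt.
        if state.rumor is not BOTTOM and state.rumor != candidate:
            state.denials.append(state.rumor)
        state.vote = candidate
        state.rumor = candidate
    else:
        # Candidate rejected: forget it and deny it in its interval of relevance.
        if state.rumor == candidate:
            state.rumor = BOTTOM
        state.denials.append(candidate)
        if state.vote == candidate:
            state.vote = BOTTOM
    return state


rejected = on_query_return(ProcessState(vote=1, rumor=2), candidate=2,
                           votes_for_candidate=3, k=1)
accepted = on_query_return(ProcessState(vote=1, rumor=2), candidate=2,
                           votes_for_candidate=4, k=1)
```

With k = 1, three votes fall below the 3k+1 threshold, so the rumor is dropped and denied while the vote is kept; four votes make the process adopt the candidate as both vote and rumor.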

47 Query Tracks

48 Other tracks
Denial (to kill a rumor)
To manage lost queries
– Probe wave
– Report
(see the paper)

49 Deadlock Prevention
Every two neighboring processes share a resource
– Think of a chopstick between 2 philosophers

50 Deadlock Prevention
Every two neighboring processes share a resource
– Think of a chopstick between 2 philosophers
Only a process that holds both its left and right resources can initiate a query

51 Deadlock Prevention
Every two neighboring processes share a resource
– Think of a chopstick between 2 philosophers
Only a process that holds both its left and right resources can initiate a query
So, at any time, at most n/2 pending initiated queries

52 Deadlock Prevention
Every two neighboring processes share a resource
– Think of a chopstick between 2 philosophers
Only a process that holds both its left and right resources can initiate a query
So, at any time, at most n/2 pending initiated queries
Now, we can have up to 9k rogue queries, i.e., non-initiated queries

53 Deadlock Prevention
Every two neighboring processes share a resource
– Think of a chopstick between 2 philosophers
Only a process that holds both its left and right resources can initiate a query
So, at any time, at most n/2 pending initiated queries
Now, we can have up to 9k rogue queries, i.e., non-initiated queries
So, n > n/2 + 9k, that is, n ≥ 18k + 1
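The last step is a one-line rearrangement: n > n/2 + 9k is equivalent to n/2 > 9k, i.e., n > 18k, and for integer n that means n ≥ 18k + 1. The brute-force check below confirms it for small k; `min_ring_size` is a hypothetical helper, not part of the protocol.

```python
def min_ring_size(k: int) -> int:
    """Smallest integer n satisfying n > n/2 + 9k."""
    n = 1
    # Both sides are doubled so the comparison stays in exact integers.
    while not (2 * n > n + 18 * k):
        n += 1
    return n


# The smallest admissible ring size matches the stated bound 18k + 1.
assert all(min_ring_size(k) == 18 * k + 1 for k in range(0, 20))
```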

54 Conclusion
Less restrictive definition of k-stabilization
Using this definition, we solve a problem having no self-stabilizing solution:
– Leader recovery protocol on an anonymous (yet oriented) ring
Only-k-dependent complexity:
– Stabilization time O(k²) rounds
– O(log k) bits per process

55 Thank You!

