Distributed Systems Tutorial 4 – Solving Consensus using Chandra-Toueg’s unreliable failure detector: A general Quorum-Based Approach.

2 Consensus problem
- In the Consensus problem, every correct process proposes a value, and the processes must reach an irrevocable decision on a value related to the proposed values [Fischer 1983].
- The Consensus problem is specified as follows:
  - Termination: Every correct process eventually decides some value.
  - Validity: If a process decides v, then v was proposed by some process.
  - Agreement: No two correct processes decide differently.
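The three properties can be checked mechanically on a finished run. The sketch below is illustrative only: the function name `check_consensus` and its argument layout are assumptions, not part of the tutorial. Termination is a liveness property, so on a finite trace it can only be approximated as "every correct process had decided by the end of the trace".

```python
def check_consensus(proposals, decisions, correct):
    """Check one recorded run against the consensus specification.

    proposals: dict mapping process id -> proposed value
    decisions: dict mapping process id -> decided value (deciders only)
    correct:   set of ids of processes that never crashed in this run
    (All three names are hypothetical, chosen for this sketch.)
    """
    # Termination (finite-trace approximation): every correct process decided.
    termination = all(p in decisions for p in correct)
    # Validity: every decided value was proposed by some process.
    proposed = list(proposals.values())
    validity = all(v in proposed for v in decisions.values())
    # Agreement: the correct processes that decided all chose the same value.
    decided_by_correct = [decisions[p] for p in correct if p in decisions]
    agreement = all(v == decided_by_correct[0] for v in decided_by_correct)
    return {"termination": termination,
            "validity": validity,
            "agreement": agreement}
```

For example, `check_consensus({1: "a", 2: "b"}, {1: "a", 2: "b"}, {1, 2})` reports a violation of Agreement, because the two correct processes decided differently.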

3 FLP (Fischer, Lynch, Paterson) - Main results
- Proves that no deterministic algorithm solves consensus in an asynchronous system if even one process may crash
- Every asynchronous fault-tolerant consensus algorithm has an infinite run in which no process decides
- It is still possible to design asynchronous consensus algorithms that are always safe but do not always terminate

4 The Failure Detectors abstraction (Chandra/Toueg 96)
- Showed that FLP applies to many problems, not just consensus
  - In particular, they show that FLP applies to group membership and reliable multicast
  - So these practical problems are impossible in asynchronous systems, in a formal sense
- Chandra/Toueg also look at the weakest condition under which consensus can be solved

5 Chandra/Toueg Idea
- Separate the problem into:
  - The consensus algorithm itself
  - A “failure detector”: a form of oracle that announces suspected failures, but can change its mind
- Each process has a local failure detector oracle, which typically outputs the list of processes suspected to have crashed at any given time

6 Example of a failure detector
- The detector they call ◇S (“diamond-S”), “eventually strong”
- Defined by two properties:
  - Strong completeness: Eventually (after some unknown but finite time t), every process that crashes is permanently suspected by every correct process
    - Completeness: detection of every crash
  - Eventual weak accuracy: There is a time after which some correct process is never suspected by any correct process
    - Accuracy: does it make mistakes?
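One common way to approximate such a detector in practice is with heartbeats and per-peer timeouts that grow after each false suspicion. The sketch below is a minimal illustration, not the construction from the tutorial: the class name `HeartbeatDetector`, its methods, and the doubling policy are all assumptions. In a purely asynchronous system no timeout scheme can guarantee the accuracy property. It only holds if message delays eventually become bounded.

```python
class HeartbeatDetector:
    """Local failure-detector module for one process (hypothetical sketch).

    Time is passed in explicitly as `now`, so the behaviour is deterministic.
    """

    def __init__(self, peers, initial_timeout=1.0):
        self.timeout = {p: initial_timeout for p in peers}
        self.last_seen = {p: 0.0 for p in peers}
        self.suspected = set()

    def on_heartbeat(self, peer, now):
        """Record a heartbeat received from `peer` at time `now`."""
        self.last_seen[peer] = now
        if peer in self.suspected:
            # The suspicion was a mistake: change our mind and be more
            # patient with this peer from now on.
            self.suspected.discard(peer)
            self.timeout[peer] *= 2

    def suspects(self, now):
        """Return the set of peers currently suspected to have crashed."""
        for peer, seen in self.last_seen.items():
            if now - seen > self.timeout[peer]:
                self.suspected.add(peer)
        return set(self.suspected)
```

A crashed peer stops sending heartbeats, so it is eventually suspected and stays suspected (completeness). A slow but correct peer is suspected only until its timeout has grown past its actual delay, provided that delay is eventually bounded.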

7 The Model
- Asynchronous distributed system: there is no bound on message delay, clock drift, or the time necessary to execute a step
- Every pair of processes is connected by a reliable communication channel
- The system consists of a set of n processes 1, …, n
- At most t < n/2 of them can crash
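The bound t < n/2 is what makes a quorum-based approach work: a majority of processes always stays alive, and any two majorities share at least one process. The following sketch checks both facts by brute force for small n. The helper names `majority`, `max_faults` and `quorums_intersect` are assumptions introduced for illustration.

```python
from itertools import combinations


def majority(n):
    """Smallest number of processes that is more than half of n."""
    return n // 2 + 1


def max_faults(n):
    """Largest t that satisfies t < n/2."""
    return (n - 1) // 2


def quorums_intersect(n):
    """Exhaustively check that any two majority quorums of 1..n overlap."""
    procs = range(1, n + 1)
    quorums = [set(q) for q in combinations(procs, majority(n))]
    return all(a & b for a in quorums for b in quorums)


for n in range(1, 8):
    # Even if t processes crash, a full majority can still answer.
    assert n - max_faults(n) >= majority(n)
    # Two majorities can never be disjoint.
    assert quorums_intersect(n)
```

With n = 5, for instance, up to 2 processes may crash, and the 3 remaining ones still form a majority quorum.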

8 Implementing Consensus – Mostefaoui-Raynal
- Mostefaoui and Raynal 99
- This is the most elegant protocol I am aware of
- Relies on ◇S and on a majority of the processes being correct
- Code for process i, where v_i is its current estimate (initially its proposal) and r is the round number:

1) r := r+1; c := (r mod n) + 1;
2) if c = i then send (v_i, r) to all endif;
3) wait until either got a msg from c or c is suspected;
4) if got a msg then e_i := v_c else e_i := ⊥ endif;
5) send (e_i, r) to all;
--------------------------------------------------------------
6) wait until got a msg from a majority of the nodes;
7) build the vector V_i such that V_i[j] := e_j if a msg from j was received, ⊥ otherwise;
8) if V_i includes a value v a majority of times then decide v and return
9) elseif V_i includes both v and ⊥ then v_i := v
10) endif; % otherwise, keep old v_i %
11) goto line number 1
12) Upon reception of decide(v), send decide(v) to all and return
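The second phase of a round (steps 6 to 10) is a purely local computation once the estimates have arrived, so it can be sketched as a plain function. This is an illustrative reading of the listing, not the authors' code: the names `phase_two` and `BOTTOM` are assumptions, with `BOTTOM` standing for the "no estimate" marker (often written ⊥). Only crashes are considered, so every non-empty estimate in a round is the coordinator's value.

```python
from collections import Counter

BOTTOM = None  # hypothetical stand-in for the "no estimate" marker


def phase_two(v_i, estimates, n):
    """Apply steps 6-10 of one round at process i.

    v_i:       current estimate of process i
    estimates: dict sender id -> estimate (a hashable value or BOTTOM),
               one entry per message received in this round
    n:         total number of processes
    Returns (decided, value): the decision, or the estimate for next round.
    """
    # Step 6: the caller must have waited for a majority of messages.
    if len(estimates) <= n // 2:
        raise ValueError("need estimates from a majority of processes")
    # Step 7: senders we did not hear from count as BOTTOM, so it is
    # enough to count the real values that were received.
    counts = Counter(e for e in estimates.values() if e is not BOTTOM)
    if not counts:
        return False, v_i          # step 10: only BOTTOM seen, keep old v_i
    v, k = counts.most_common(1)[0]
    if k > n // 2:
        return True, v             # step 8: v appears a majority of times
    return False, v                # step 9: adopt v for the next round
```

The quorum argument shows up here: if some process sees v a majority of times and decides, every other process that completes the round has heard from at least one of those senders, so it takes the step 9 branch and adopts v.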

9 Proof
- Validity
- Termination
- Uniform Agreement
