CS294, Yelick Consensus revisited, p1 CS 294-8 Consensus Revisited

http://www.cs.berkeley.edu/~yelick/294

Agenda
Consensus overview
Classic impossibility proof by FLP:
– Impossibility of consensus in shared memory with n-1 failures
– Impossibility of consensus in shared memory with 1 failure
– Impossibility of consensus with message passing
What does this mean in practice?
Administrivia

Models
Failures:
– Link failures
– Processor crash failures
– Byzantine processor failures
Timing:
– Synchronous: lock-step algorithms
– Asynchronous: unbounded delay
– Partially synchronous: bounds on message delay or processor speed differences

The Consensus Problem
In general, the consensus problem is to get all non-faulty processors to agree on something:
– To commit a transaction
– Which processors are “up”
– Which version of a file to use
Abstract problem: every processor has an input, and the algorithm must satisfy:
– Termination: Eventually every non-faulty processor must decide on a value.
– Agreement: All non-faulty decisions must be the same.
– Validity: If all inputs are the same, then the non-faulty decision must be that input.
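The three conditions can be checked mechanically on a finished run. The sketch below is only an illustration: the function name `check_consensus` and the dictionary encoding are assumptions, and "termination" is approximated as "every non-faulty processor has decided by the end of this finite trace", since a finite trace cannot capture "eventually".

```python
def check_consensus(inputs, decisions, faulty=frozenset()):
    """Check the three consensus conditions on one finished run.

    inputs:    dict processor -> input value
    decisions: dict processor -> decided value, or None if undecided
    faulty:    set of crashed processors (their decisions are ignored)
    """
    correct = [p for p in inputs if p not in faulty]
    decided = [decisions.get(p) for p in correct]

    # Termination (finite-trace approximation): every non-faulty processor decided.
    termination = all(d is not None for d in decided)

    # Agreement: all non-faulty decisions are the same value.
    values = {d for d in decided if d is not None}
    agreement = len(values) <= 1

    # Validity: if all inputs are equal, any decision must be that input.
    input_values = set(inputs.values())
    validity = len(input_values) != 1 or values <= input_values

    return {"termination": termination, "agreement": agreement, "validity": validity}
```

For example, `check_consensus({0: 0, 1: 1}, {0: 0, 1: 1})` reports an agreement violation, because the two non-faulty processors decided different values.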

Impossibility of Asynchronous Consensus
Proof outline:
1. Show impossible in shared memory with n-1 faults (wait-free consensus).
2. Step 1 implies there is no 2-processor algorithm resilient to 1 fault.
3. Show impossible in shared memory with 1 fault by reduction.
4. Show impossible in message passing systems by reduction.
Original result by Fischer/Lynch/Paterson. This proof presentation is due to Welch.

Step 1: Impossibility of Wait-Free Consensus
An algorithm for n processors is wait-free if it can tolerate n-1 crashed processors.
Theorem 1: There is no wait-free consensus algorithm in an asynchronous shared memory system.
Proof plan: By contradiction. Classify configurations C according to how many different decisions are reachable:
– Bivalent: both 0 and 1 are reachable
– Univalent: only one output is reachable (0-valent or 1-valent)
Three lemmas lead to the result.
[Figure: tree of executions from a configuration C, with leaves labeled by the decisions 0 and 1]
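Valency is a reachability property, which a small search makes concrete. The sketch below assumes a finite, explicitly given transition graph; the names `reachable_decisions`, `classify`, `TOY_SUCCESSORS` and `TOY_DECISION` are hypothetical, and the toy graph is invented for illustration rather than taken from the slide's figure.

```python
def reachable_decisions(config, successors, decision):
    """Return the set of decision values reachable from config (depth-first search)."""
    seen, stack, values = {config}, [config], set()
    while stack:
        c = stack.pop()
        if c in decision:
            values.add(decision[c])
        for nxt in successors.get(c, ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return values


def classify(config, successors, decision):
    """Label a configuration as bivalent, 0-valent, 1-valent, or undecided."""
    values = reachable_decisions(config, successors, decision)
    if values == {0, 1}:
        return "bivalent"
    if len(values) == 1:
        return f"{next(iter(values))}-valent"
    return "undecided"


# Toy transition graph (an assumption, for illustration only).
TOY_SUCCESSORS = {"C": ["L", "R"], "L": ["L0"], "R": ["R0", "R1"]}
TOY_DECISION = {"L0": 0, "R0": 0, "R1": 1}
```

In the toy graph, `C` is bivalent because both a 0-leaf and a 1-leaf are reachable from it, while `L` is 0-valent because every path from it ends in a 0 decision.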

Impossibility of Wait-Free Consensus (cont’d)
Lemma 1: There is an initial configuration that is bivalent.
Proof: Assume all initial configurations are univalent. Build a chain of initial configurations from inputs 00000 (0-valent, by validity) to inputs 11111 (1-valent, by validity), changing one input at a time. Somewhere along the chain there are adjacent configurations α (0-valent) and α′ (1-valent) that differ only in the input of one processor i, e.g. xx0xx and xx1xx. Consider executions in which i fails immediately: the other processors cannot distinguish α from α′, so since α produces 0, so does α′, a contradiction.
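The chain in this proof is easy to build explicitly. In the sketch below, `initial_chain` and `adjacent_flip` are hypothetical names, and the `valency` argument is a stand-in for the (assumed) univalent classification of each initial configuration; it is not something a real algorithm would have access to.

```python
def initial_chain(n):
    """Input vectors from all-0 to all-1, each differing from the previous in one position."""
    return [tuple([1] * k + [0] * (n - k)) for k in range(n + 1)]


def adjacent_flip(valency, chain):
    """Find neighbours in the chain where the assumed valency changes from 0 to 1.

    Returns (alpha, alpha_prime, i), where i is the one processor whose input differs,
    or None if no such pair exists.
    """
    for a, b in zip(chain, chain[1:]):
        if valency(a) == 0 and valency(b) == 1:
            i = next(k for k in range(len(a)) if a[k] != b[k])
            return a, b, i
    return None
```

With the purely illustrative valency function `lambda c: int(sum(c) > 2)` on five processors, the flip occurs between `(1, 1, 0, 0, 0)` and `(1, 1, 1, 0, 0)`, which differ only in the input of processor 2.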

Impossibility of Wait-Free Consensus (cont’d)
Lemma 2: If C1 and C2 are univalent and C1 and C2 are equivalent at p_i, then C1 and C2 have the same valency.
Proof: Suppose C1 is v-valent.
– Since the algorithm is wait-free (i.e., all other processors could stop), there is a schedule σ in which only p_i takes steps and which causes p_i to decide v.
– Since p_i cannot tell the difference between C1 and C2, if σ is applied to C2, p_i also decides v there.
– Thus C2 is also v-valent.

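Impossibility of Wait-Free Consensus (cont’d)
Lemma 3: If C is bivalent, then at least one processor is not critical, i.e., it can take a step and keep the system bivalent.
Proof: Suppose, for contradiction, that all processors are critical. Then there exist processors p_i and p_j whose steps from C lead to univalent configurations of opposite valency; say p_i’s step leads to a 0-valent configuration and p_j’s step to a 1-valent one. Proceed by cases on what these two steps do.
[Figure: bivalent (0/1) configuration C with one branch for a step by p_i and one for a step by p_j, leading to univalent configurations]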

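Impossibility of Wait-Free Consensus (cont’d)
Case 1: p_i and p_j access different registers or read the same register.
But these operations commute: executing p_i then p_j from C gives the same configuration as executing p_j then p_i, and that configuration would extend two univalent configurations of opposite valency => a contradiction.
[Figure: the two branches from C, by p_i and by p_j, rejoin after the other processor’s step]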

Impossibility of Wait-Free Consensus (cont’d)
Case 2: p_i writes to and p_j reads from the same register R.
Let C+i be the configuration after executing p_i and C+j+i be the configuration after executing p_j then p_i. The read by p_j changes only p_j’s local state, so C+i is equivalent to C+j+i from p_i’s perspective, yet the two have opposite valencies, contradicting Lemma 2.
[Figure: from C, one branch where p_i writes to R; another where p_j reads from R and then p_i writes to R]

Impossibility of Wait-Free Consensus (cont’d)
Case 3: p_i and p_j write to the same register R.
As in case 2, we can “run” the completion of the left-hand execution after p_j’s write. Since p_i overwrites R, p_i cannot distinguish C+i from C+j+i, so both executions result in 0, even though the second one passes through a configuration of the opposite valency => a contradiction.
[Figure: from C, one branch where p_i writes to R; another where p_j writes to R and then p_i writes to R]

Impossibility of Wait-Free Consensus (cont’d)
Theorem 1: There is no wait-free consensus algorithm in an asynchronous shared memory system.
Proof: Construct an execution in which all configurations are bivalent.
1. Start with the bivalent initial configuration from Lemma 1.
2. Use Lemma 3 to get the next bivalent configuration.
3. Repeat step 2 infinitely.
No processor can have decided in a bivalent configuration, so this infinite execution violates termination.

Impossibility of Single Failure Consensus
Even if the ratio of faulty processors is very low, consensus cannot be solved in asynchronous shared memory.
Proof outline:
1. Assume there exists an algorithm A for n processors and 1 failure.
2. Use A as a subroutine to design an algorithm A’ for 2 processors and 1 failure.
3. The previous result shows A’ cannot exist.
4. Thus A does not exist.

Impossibility of Single Failure Consensus (cont’d)
Proof assumptions, for processors q_0, …, q_{n-1}:
1. Each q_i has a single register R_i, which it writes and the others read.
2. The code of each q_i alternates reads and writes, beginning with a read.
3. Each write step of each q_i writes q_i’s entire current state into R_i.
All of these are without loss of generality.

Impossibility of Single Failure Consensus (cont’d)
Idea of algorithm A’ for p_0 and p_1:
1. Each p_i goes through the q_j’s in round-robin order, trying to simulate their steps. Steps are grouped into pairs: a read and the following write.
2. When p_i begins the simulation of q_j, it uses its own input as the input for q_j. If p_i ever simulates a decision step by q_j, it decides the same thing.
3. How do p_0 and p_1 keep their simulations consistent? They need to “agree” on the value of each q_j’s local state after each pair of steps by q_j.

Impossibility of Single Failure Consensus (cont’d)
For q_j’s k-th pair, p_0 and p_1 each have a flag variable:
1. Assume q_j’s (k-1)-st pair has been computed.
2. p_i calculates its suggestion for q_j’s state after the k-th pair (see later slides).
3. p_i checks whether p_{1-i} has made a suggestion for this state of q_j.
4. If not, then p_i sets its flag to 1.
5. If so, then p_i sets its flag to 0.

Impossibility of Single Failure Consensus (cont’d)
Note the order of operations:
p_0:
1. Write suggest0
2. Read suggest1
3. Write flag0 (1 if suggest1 is empty, 0 otherwise)
p_1:
1. Write suggest1
2. Read suggest0
3. Write flag1 (1 if suggest0 is empty, 0 otherwise)
So two 0 flags are possible, but not two 1’s.

Impossibility of Single Failure Consensus (cont’d)
Interpretation of flags:
1. If p_i’s flag is 1, then p_i is the winner.
2. If both are 0, then consider p_0 the winner.
3. If one is 0 and the other is not yet set, the winner is not yet determined.
4. If neither is set, the winner is not yet determined.
5. It is not possible for both to be 1.
In cases 1 and 2, the k-th pair is said to be computed; otherwise not.
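The claim that both flags can never be 1 can be checked by brute force over every interleaving of the two processors' three steps. The sketch below assumes each p_i writes its own suggestion, then reads the other processor's suggestion, then writes its flag; the names `run`, `winner` and `all_schedules` are hypothetical, and a schedule that stops early models a crash.

```python
from itertools import combinations


def run(schedule):
    """Execute the 3-step flag protocol under a schedule of processor ids (0 or 1).

    Step 0: write own suggestion. Step 1: read the other's suggestion.
    Step 2: write own flag (1 if the other's suggestion looked empty, else 0).
    Returns the final flags; None means "not yet set".
    """
    suggest = [None, None]
    seen = [None, None]  # what each processor read in step 1
    flag = [None, None]
    pc = [0, 0]          # per-processor program counter
    for i in schedule:
        if pc[i] == 0:
            suggest[i] = f"suggestion-{i}"
        elif pc[i] == 1:
            seen[i] = suggest[1 - i]
        elif pc[i] == 2:
            flag[i] = 1 if seen[i] is None else 0
        pc[i] += 1
    return flag


def winner(flag):
    """Apply the interpretation of flags; None means the winner is not yet determined."""
    if flag[0] == 1:
        return 0
    if flag[1] == 1:
        return 1
    if flag == [0, 0]:
        return 0
    return None


def all_schedules():
    """All 20 interleavings in which both processors complete their 3 steps."""
    for slots in combinations(range(6), 3):
        yield [0 if k in slots else 1 for k in range(6)]
```

Running `run` over `all_schedules()` never yields `[1, 1]`, while `[0, 0]` does occur. A truncated schedule such as `[0, 0, 1, 1, 1]`, where p_0 stops before writing its flag, leaves the flags at `[None, 0]`, which is the undetermined case 3.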

Impossibility of Single Failure Consensus (cont’d)
How does p_i calculate its suggestion for q_j’s state after q_j’s k-th pair?
p_i gets q_j’s state after its (k-1)-st pair:
– If k-1 = 0, then use q_j’s initial state with p_i’s input.
– Otherwise get the suggestion of the winner for q_j’s (k-1)-st pair.
Consult q_j’s state (just obtained) to determine which q_r’s register is to be read in its k-th pair.
Get the current value of q_r’s register by finding the largest m such that q_r’s m-th pair has been computed, and take the winning suggestion.
Apply q_j’s transition function to get the value of q_j’s state after its k-th pair.

Impossibility of Single Failure Consensus (cont’d)
Each execution of A’ (by the p’s) simulates an execution of A (by the q’s). If p_i observes a q_j making a decision, then it makes the same decision.
If the simulated execution is “admissible” (by the failure assumption on the q’s), then it satisfies:
– Termination: eventually all q’s decide
– Agreement: all q’s agree
– Validity: if all q’s have input v, then the decision is v
So A’ would be a correct consensus algorithm for 2 processors and 1 failure.

Impossibility of Single Failure Consensus (cont’d)
Why is the simulated execution admissible? We need to show that at least n-1 processors take an infinite number of steps in it.
How can a simulation of q_j be blocked? If p_0 or p_1 crashes during its simulation of q_j’s k-th pair, e.g.:
– p_0 writes a suggestion, then crashes
– p_1 sees p_0’s suggestion and writes 0 to its flag
– p_0’s flag remains unset forever
– So q_j’s k-th pair is never computed
But the crash of one p_i can only block the simulation of one q_j. In the example, p_1 would continue simulating all other q’s.

Impossibility of Consensus in Message Passing
Assume there exists an n-processor consensus algorithm A for message passing with 1 fault.
Use A as a subroutine to design A’ for shared memory.
Previous results show A’ cannot exist.
So A cannot exist.
Idea of A’: Simulate message channels with read/write registers. Then run A on top of these channels to get A’.
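One way to picture the channel simulation is a single-writer register per directed channel that holds the whole log of messages sent so far. This encoding is an assumption made for illustration (the slide does not fix one), and the class names `Register` and `Channel` are hypothetical.

```python
class Register:
    """A read/write register: only plain reads and writes, no read-modify-write."""

    def __init__(self, value):
        self._value = value

    def read(self):
        return self._value

    def write(self, value):
        self._value = value


class Channel:
    """A one-directional message channel simulated on top of a single register.

    The sender is the only writer, so it can append to the log without racing:
    it already knows everything it has written before.
    """

    def __init__(self):
        self._log = Register(())  # tuple of all messages sent so far
        self._delivered = 0       # receiver-local count of messages already consumed

    def send(self, msg):
        self._log.write(self._log.read() + (msg,))

    def receive(self):
        """Return the messages that arrived since the last call, in send order."""
        log = self._log.read()
        new = list(log[self._delivered:])
        self._delivered = len(log)
        return new
```

A receiver that polls `receive()` sees each message exactly once and in order, which is the behaviour a message-passing algorithm A would expect from a reliable channel.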

Implications and Limitations of the Result
FLP says consensus is impossible in an asynchronous environment.
– All of the proofs are about liveness, not safety (Castro/Liskov rely on this)
– Explains the “window of vulnerability” in practice: an interval of time in which a fault can cause the entire system to wait indefinitely
– Do you care about liveness or response time (soft real-time guarantees)?
From a theoretical perspective, one can also “get around” this result by:
– Using randomization (algorithm due to Ben-Or tolerates <= 1/3 faulty processors)
– Using a read-modify-write (RMW) register, rather than just read/write (R/W)
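To see why a read-modify-write register sidesteps the result, here is a minimal sketch of consensus with one compare-and-swap operation. `CASRegister` and `decide` are hypothetical names, the lock only stands in for the atomicity that hardware would provide, and inputs are assumed never to be `None`.

```python
import threading


class CASRegister:
    """A register with an atomic compare-and-swap (one kind of RMW operation)."""

    def __init__(self):
        self._value = None
        self._lock = threading.Lock()  # emulates hardware atomicity only

    def compare_and_swap(self, expected, new):
        """Atomically set the value to new if it equals expected; return the old value."""
        with self._lock:
            old = self._value
            if old == expected:
                self._value = new
            return old


def decide(register, my_input):
    """Each processor takes one RMW step: the first one to arrive fixes the decision."""
    old = register.compare_and_swap(None, my_input)
    return my_input if old is None else old
```

Every caller finishes after its own single step regardless of what the others do, and all callers return the value installed by the first successful swap, so agreement and validity hold in this sketch.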

Overview of Results on Consensus
Let f be the maximum number of faulty processors. The following are tight bounds for synchronous message passing:
                        Crash        Byzantine
Number of rounds        f+1          f+1
Number of processors    >= f+1       >= 3f+1
Message size            Polynomial   Polynomial
The partially synchronous case is not as well studied.
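The bounds are simple enough to turn into arithmetic. The helper below is a hypothetical illustration (the name `sync_bounds` is an assumption), and it reads the table as giving f+1 rounds for both failure models.

```python
def sync_bounds(f, failure_model):
    """Bounds for synchronous message-passing consensus with at most f faults.

    failure_model is "crash" or "byzantine".
    """
    if f < 0:
        raise ValueError("f must be non-negative")
    if failure_model not in ("crash", "byzantine"):
        raise ValueError("failure_model must be 'crash' or 'byzantine'")
    min_processors = f + 1 if failure_model == "crash" else 3 * f + 1
    return {"rounds": f + 1, "min_processors": min_processors}
```

For instance, tolerating 2 Byzantine faults under these bounds needs at least 7 processors and 3 rounds.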

Administrivia
If you’re doing a project and haven’t met with me in the last 3 weeks, let me know ASAP.
Final project deadlines:
– Poster session Dec 13 in the pm (with 262)
– Final papers due Dec 15
Papers online for next week by Thursday

Impossibility of Wait-Free Consensus (cont’d)
Definition: A decider execution α is:
– Failure free
– Bivalent
– Univalent for every extension (adding 1 step)
A decider execution goes from bivalent to univalent in a single step.

Impossibility of Wait-Free Consensus (cont’d)
Lemma 2: There is a decider execution for any wait-free consensus algorithm.
Proof:
– Suppose (for the purpose of contradiction) any failure-free bivalent execution has a bivalent failure-free extension.
– Given this and Lemma 1, we can construct an infinite bivalent failure-free execution α.
– Since α is infinite, some processor i takes an infinite number of steps.
– Modify α by inserting a stop event after the last event for each process j that takes only a finite number of steps. Call this α′.
– By the wait-free assumption, processor i must decide in α′.
– But α and α′ look identical to processor i, so i would also decide in α, which stays bivalent forever => a contradiction.
[Figure: an infinite schedule of steps by processes i, j, k, …]
