Eddie Bortnikov & Aran Bergman, Principles of Reliable Distributed Systems, Technion EE, Spring 2006

Principles of Reliable Distributed Systems
Recitation 11: State Machine Replication with Paxos & Sequential Consistency
Spring 2009, Alex Shraer

Replicated State Machines
- Data is replicated at n servers.
- Operations are initiated by clients.
- Operations need to be performed at all correct servers in the same order.
- Goal: ensure that all the copies are the same after the i-th operation.
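
The goal above can be illustrated with a minimal sketch. It assumes the agreed sequence of operations is already available as a list; the `Replica` class, the key-value state, and the `replay` helper are hypothetical names chosen for this example and are not part of the recitation.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """One copy of the replicated data (assumed here to be a key-value store)."""
    state: dict = field(default_factory=dict)
    applied: int = 0  # number of operations applied so far

    def apply(self, op):
        key, value = op
        self.state[key] = value
        self.applied += 1


def replay(log, n_replicas=3):
    """Apply the same operations, in the same order, at every replica."""
    replicas = [Replica() for _ in range(n_replicas)]
    for op in log:
        for replica in replicas:
            replica.apply(op)
        # After the i-th operation, all copies are identical.
        assert all(r.state == replicas[0].state for r in replicas)
    return replicas


log = [("x", 1), ("y", 2), ("x", 3)]
replicas = replay(log)
reordered = replay(list(reversed(log)))
```

Replaying the reversed log leaves `x` with a different final value, which is why the servers must agree on the order and not just on the set of operations.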

Client-Server Interaction
- Leader-based: each process (client or server) has an estimate of who the current leader is.
- A client sends a request to its current leader.
- The leader sends the response to the client.

Sequence of Paxos Instances
- A sequence of separate instances of Paxos.
  - The value chosen by instance i is the i-th operation.
- Clients send operations to the current leader.
- The leader decides where in the sequence each operation should appear.
  - If the leader decides that a certain operation should appear as the 135th operation, it tries to have that operation chosen as the value of the 135th instance of Paxos.

Safety and Liveness
- Reasons a leader's proposal can fail:
  - The leader fails.
  - A different node believes it is the leader.
- Safety is always preserved, even in the worst case.
- Performance can be optimized during non-faulty periods.

Replication with Fast Paxos
- Non-optimized version:
  - The new leader learns the entire history.
- Observations:
  - No value is chosen until phase 2 of Paxos.
  - At the end of phase 1, either the value to be proposed is determined, or the proposer is free to choose any value.

Normal Operation
- Normal operation: the previous leader has just failed and a new one has been selected.
- The new leader knows most of the operations that have already been chosen,
  - since it participated in the protocol before it became the leader.
  - Suppose it knows operations 1-134, 138 and 139.

Normal Operation (cont'd)
- The leader executes phase 1 of instances 135-137 and of all instances > 139.
  - Suppose the outcome of this phase determines the value to be proposed in instances 135 and 140, and leaves the other instances unconstrained.
  - The new leader now executes phase 2 for instances 135 and 140. (Why does it have to?)

Normal Operation (cont'd)
- Every server knows commands 1-135, so the leader can execute them.
- Commands 138-140 cannot be executed before 136 and 137.
- Two options:
  - Use the next two client requests as commands 136 and 137.
  - Fill the gap using "no-op" operations.
- Which one is better?
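
The recovery steps of the last few slides can be sketched as follows. This is only an illustration under stated assumptions: `recovery_plan`, `executable_prefix`, the `NOOP` marker, and the operation names are hypothetical, and the sketch takes the second option (no-ops) without claiming it is the better one.

```python
NOOP = "no-op"  # hypothetical marker for an operation that leaves the state unchanged


def recovery_plan(known, constrained):
    """Decide what a new leader proposes in phase 2 after finishing phase 1.

    known:       {instance: operation} for instances the leader knows are chosen.
    constrained: {instance: value} for instances whose value phase 1 determined.
    Returns {instance: value to propose} for every undecided instance up to the
    highest instance mentioned; unconstrained gaps are filled with no-ops.
    """
    highest = max(list(known) + list(constrained))
    plan = {}
    for instance in range(1, highest + 1):
        if instance in known:
            continue  # already chosen, nothing to propose
        plan[instance] = constrained.get(instance, NOOP)
    return plan


def executable_prefix(chosen):
    """Largest k such that instances 1..k are all chosen (and so can be executed)."""
    k = 0
    while k + 1 in chosen:
        k += 1
    return k


# The example from the slides: the leader knows 1-134, 138 and 139,
# and phase 1 determines the values of instances 135 and 140.
known = {i: f"op{i}" for i in list(range(1, 135)) + [138, 139]}
constrained = {135: "op-a", 140: "op-b"}
plan = recovery_plan(known, constrained)
chosen_after = {**known, **plan}  # assuming every phase 2 in the plan succeeds
```

Before recovery only the prefix 1-134 is executable; once the planned proposals are chosen, the prefix extends to 140.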

Normal Operation (cont'd)
- Operations 1-140 have now been chosen, and all servers can execute them.
- The leader has also completed phase 1 for all instances > 140.
- It can start working in express mode:
  - it can immediately propose any value in phase 2 of these instances.

How can gaps occur?
- The leader can propose operation 142 before it knows that its proposed operation 141 has been chosen.
- Bad scenario:
  - All the messages it sent proposing operation 141 are lost, and operation 142 is chosen before any server learns about operation 141.
  - The leader fails before 141 is chosen.

Phase 1 for infinity?
- A new leader executes phase 1 for infinitely many instances of Paxos (135-137 and all instances > 139).
  - It uses the same BallotNum for all of the instances.
- A response to a prepare message needs to include a value only for the instances for which the server has already accepted a value (in phase 2). In the example: 135 and 140.
  - The servers can therefore respond with a "reasonably short" message.
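
A rough sketch of why the reply stays short is given below. It assumes each server keeps one promised ballot number shared by all instances, and that the prepare message names the instances the leader already knows instead of listing the infinitely many it is asking about. The `Acceptor` class, its method names, and this message encoding are assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Acceptor:
    promised: int = 0  # highest ballot promised, shared by all instances (assumption)
    accepted: dict = field(default_factory=dict)  # instance -> (ballot, value)

    def on_prepare(self, ballot, known_chosen):
        """Handle one prepare that covers every instance not in known_chosen.

        Returns None to reject, otherwise a finite dict holding only the
        covered instances in which this acceptor has accepted a value.
        """
        if ballot <= self.promised:
            return None
        self.promised = ballot
        return {i: bv for i, bv in self.accepted.items() if i not in known_chosen}

    def on_accept(self, ballot, instance, value):
        """Phase 2: accept unless a higher ballot has been promised."""
        if ballot < self.promised:
            return False
        self.promised = ballot
        self.accepted[instance] = (ballot, value)
        return True


acceptor = Acceptor()
for instance, value in [(134, "op134"), (135, "op-a"), (140, "op-b")]:
    acceptor.on_accept(3, instance, value)

# The new leader already knows instances 1-134, 138 and 139.
known_chosen = set(range(1, 135)) | {138, 139}
reply = acceptor.on_prepare(5, known_chosen)
stale = acceptor.on_prepare(4, known_chosen)  # lower ballot, rejected
```

The reply mentions only instances 135 and 140, even though the prepare logically covers an unbounded range of instances.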

Abnormal Operation
- We assumed that there is a single leader.
  - Only phase 2 needs to be executed for each instance.
- What happens if that is not the case?
  - Safety is preserved (why?).
  - A single leader is needed for liveness.

Sequential-Linearization
A sequential-linearization π of a concurrent execution σ is:
1. A sequential execution
   - Each invocation is immediately followed by its response.
   - Satisfies the object's sequential specification.
2. Looks like σ
   - Responses to all invocations are the same as in σ.
   - Responses to pending invocations in σ may be added.
3. Preserves local real-time order
   - If the completion of operation o_1 at process p_i occurs in σ before the invocation of operation o_2 at process p_i, then o_1 appears before o_2 in π.
   - Can be written as: π|i = σ|i for all i.

Sequential Consistency
- A concurrent execution that has a sequential-linearization is sequentially consistent.
- What is the difference from linearizability?
- Both linearizability and sequential consistency are "strong" consistency conditions: all processes must agree on the order in which all operations occur.
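
For small executions, the two conditions can be compared by brute force. The sketch below is a hypothetical checker, not something from the recitation: it handles only completed operations on read/write registers, represents each operation with assumed invocation and response positions (`inv`, `resp`) in the execution, and tries every ordering. The only difference between the two checks is whether real-time order must be preserved across processes or only within each process.

```python
from collections import namedtuple
from itertools import permutations

# proc: process id, kind: "read" or "write", reg: register name,
# value: value written or returned, inv/resp: positions in the execution.
Op = namedtuple("Op", "proc kind reg value inv resp")


def legal(seq, initial=None):
    """Sequential specification: a read returns the latest value written."""
    regs = {}
    for op in seq:
        if op.kind == "write":
            regs[op.reg] = op.value
        elif regs.get(op.reg, initial) != op.value:
            return False
    return True


def respects_order(seq, same_process_only):
    """If a completes before b is invoked, a must precede b in seq."""
    pos = {op: i for i, op in enumerate(seq)}
    for a in seq:
        for b in seq:
            constrained = a.proc == b.proc or not same_process_only
            if a.resp < b.inv and constrained and pos[a] > pos[b]:
                return False
    return True


def sequentially_consistent(ops):
    return any(legal(p) and respects_order(p, True) for p in permutations(ops))


def linearizable(ops):
    return any(legal(p) and respects_order(p, False) for p in permutations(ops))


# p1 completes a write of 1 to x; afterwards p2 reads x and gets the
# initial value (None stands for the initial value here).
stale_read = [Op(1, "write", "x", 1, 0, 1), Op(2, "read", "x", None, 2, 3)]
# The same stale read, but issued by the writer itself.
own_stale_read = [Op(1, "write", "x", 1, 0, 1), Op(1, "read", "x", None, 2, 3)]
```

The first history can be reordered so that the read comes first, which satisfies sequential consistency but violates real-time order across processes. In the second history the reordering would break the local order at p1, so neither condition holds. The search is exponential in the number of operations and is meant only for hand-sized examples.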

Some notations
- x.write_i(v): invocation by process p_i of a write operation with value v to register x
- x.ack_i: completion of a write operation to register x by process p_i
- x.read_i: invocation by process p_i of a read operation from register x
- x.ret_i(v): completion of a read operation from register x by process p_i, with v being the returned value

Sequentially consistent local-writes algorithm
- The algorithm emulates a sequentially consistent shared register using message passing.
- abcast and adeliver: reliable atomic broadcast.
- x_i is the local copy of the shared register x at p_i.
- num counts the writes of p_i that were broadcast but not yet delivered at p_i (initially 0).

    upon x.read_i
        if num = 0 then invoke x.ret_i(x_i)

    upon x.write_i(v)
        num ← num + 1
        abcast(⟨"write", x, v⟩)
        invoke x.ack_i

    upon adeliver_i(j, ⟨"write", x, v⟩)
        x_i ← v
        if (i = j) then
            num ← num - 1
            if num = 0 and a read on x is pending then invoke x.ret_i(x_i)

Question 1:
- Show a linearizable execution of this algorithm (explain why it is linearizable).
- Show a sequentially consistent execution of this algorithm that is not linearizable (explain why it is sequentially consistent and not linearizable).

The algorithm is taken from the Attiya book (second edition), page 197.
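
To experiment with the algorithm, a toy simulation along the following lines may help. It is a sketch under several assumptions: atomic broadcast is modelled as a single shared list that each process delivers in order at its own pace, the `Abcast` and `Process` classes and the `deliver_next` method are hypothetical names, `None` stands for the initial value, and a pending read is completed whichever register it targets.

```python
class Abcast:
    """Toy atomic broadcast: one global sequence of messages."""

    def __init__(self):
        self.log = []

    def abcast(self, sender, msg):
        self.log.append((sender, msg))


class Process:
    def __init__(self, pid, net):
        self.pid, self.net = pid, net
        self.copy = {}            # local copies x_i (missing key = initial value)
        self.num = 0              # own writes broadcast but not yet delivered here
        self.delivered = 0        # how many entries of net.log were delivered here
        self.pending_read = None  # register of a read waiting for num = 0
        self.events = []          # responses produced at this process, in order

    def read(self, x):
        if self.num == 0:
            self.events.append(("ret", x, self.copy.get(x)))
        else:
            self.pending_read = x  # answered once own writes are delivered

    def write(self, x, v):
        self.num += 1
        self.net.abcast(self.pid, ("write", x, v))
        self.events.append(("ack", x))  # the write is acknowledged locally, at once

    def deliver_next(self):
        """adeliver the next message of the global sequence at this process."""
        sender, (_, x, v) = self.net.log[self.delivered]
        self.delivered += 1
        self.copy[x] = v
        if sender == self.pid:
            self.num -= 1
            if self.num == 0 and self.pending_read is not None:
                r, self.pending_read = self.pending_read, None
                self.events.append(("ret", r, self.copy.get(r)))


net = Abcast()
p1, p2 = Process(1, net), Process(2, net)
p1.write("x", 1)    # acknowledged immediately
p2.read("x")        # p2 has not delivered the write yet
p1.read("x")        # blocked: p1 has an undelivered write of its own
p1.deliver_next()   # unblocks p1's read
p2.deliver_next()
p2.read("x")
```

In this run p2's first read returns the initial value even though p1's write was already acknowledged, while p1's own read waits until the write is delivered locally and then returns 1.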

Question 2:
For each of the following executions, determine whether it is linearizable, sequentially consistent, or neither, and explain (assume that the initial value of every register is ⊥):
1. x.write_1(0), x.write_2(1), x.ack_1, x.read_1, x.ack_2, x.ret_1(0), x.read_2, x.ret_2(1)
2. x.write_1(1), x.ack_1, x.read_2, x.ret_2(⊥), x.read_2, x.ret_2(1)
3. x.write_1(0), x.write_2(1), x.ack_1, x.ack_2, x.read_1, x.read_2, x.ret_1(0), x.ret_2(1)
4. x.write_1(1), x.ack_1, x.write_3(2), x.ack_3, x.read_4, x.read_2, x.ret_2(1), x.ret_4(2), x.read_4, x.ret_4(1), x.read_2, x.ret_2(2)
5. x.write_1(1), y.write_2(1), y.ack_2, x.ack_1, y.read_1, x.read_2, y.ret_1(⊥), x.ret_2(⊥)
Hint: it always helps to draw the execution as in the lectures, and your explanation should use the requirements of the definition.
