Distributed Systems 2006 Virtual Synchrony* *With material adapted from Ken Birman.

Distributed Systems 2006 Virtual Synchrony* *With material adapted from Ken Birman

Distributed Systems 20062 Plan We skip Section 18.2 Tracking group membership: We’ll base it on 2PC and 3PC Fault-tolerant multicast: We’ll use membership Ordered multicast: We’ll base it on fault-tolerant multicast Tools for solving practical replication and availability problems: we’ll base them on ordered multicast Robust Web Services: We’ll build them with these tools 2PC and 3PC: Our first “tools” (lowest layer)

Distributed Systems 20063 Using group communication as a building block Communication to the group Using communication inside the group

Distributed Systems 20064 How to communicate to a group? We have primarily looked at communication within groups Often process groups provide a service that processes outside the groups need to use –How do we multicast to the group from outside the group? A number of solutions –Join the group... –Use a well-known member of the group –Cache a group view

Distributed Systems 20065 Using a well-known group member The non-member process issues an RPC to a member process –Assigns an ID; reissue if failure –The member processes multicasts on behalf of the non-member What about ordering? –Can requests submitted by different non-members be considered concurrent? –Complex in the event of failure c

Distributed Systems 20066 Caching a group view Non-member gets a view of the group, possibly outdated –Client communicate directly with processes in cached view View obtained, e.g., through –RPC to group member –By joining (larger) group to get view Three cases of multicast arrival –1) entirely in correct view – correct –2) entirely late – ignore/report, client iterates multicast –3) partially correct, partially late – flush at view installation will deliver message correctly Often cheaper than RPC-based solution –Still complex with ordering c ignore /report flush

Distributed Systems 20067 Using communication within groups What is a ”natural” abstraction for working with group communication? Reconsider the transactional model – ”virtual serial execution” –Transactions appear to run in isolation (cf ACID) –Database free to optimize by interleaving execution of transactions ”Virtual Synchrony” –All group members see the same events in the same order –Group communication system free to optimize Insight –Enables us to run the same algorithm on all processes that will stay in same state...

Distributed Systems 20068 A synchronous execution With true synchrony executions run in genuine lock-step – “state machine execution” possible p q r s t u

Distributed Systems 20069 Virtual Synchrony at a glance With virtual synchrony executions only look “lock step” to the application p q r s t u

Distributed Systems 200610 Virtual Synchrony at a glance We use the weakest (hence fastest) form of ordered communication possible p q r s t u

Distributed Systems 200611 Elements of Virtual Synchrony Support for process groups –Processes may join and leave dynamically –Excluded if (thought to) fail Various reliable, ordered multicast protocols –Strive to replace synchronous, totally-ordered, dynamically uniform protocols View-synchronous delivery –Two processes that are member of the same view receive same set of multicasts during that view Identical process groups views and rankings –Identical sequences of group membership lists Gap-freedom guarantees –If, after a failure, m1 has been delivered to a destination, then any message m0 that should be delivered prior to m1 is also delivered State transfer for joining processes –A new process may obtain group state from existing member –(Will develop this shortly)

Distributed Systems 200612 Why ”Virtual” Synchrony? The user sees what looks like a synchronous execution –Simplifies the developer’s task But the actual execution is rather concurrent and asynchronous –Maximizes performance –Reduces risk that lock-step execution will trigger correlated failures

Distributed Systems 200613 Correlated failures Why do we claim that virtual synchrony makes these less likely? Recall that many programs are buggy –Often these are Heisenbugs (order-sensitive) With lock-step execution each group member sees group events in identical order –So all die in unison With virtual synchrony orders differ –So an order-sensitive bug might only kill one group member!

Distributed Systems 200614 Why Virtual Synchrony again? So what can we build using Virtual Synchrony? Distributed Algorithms –Election –Consensus –Consistent snapshot Saw this last time –Replicated data and synchronization –State transfer –Load-balancing –Fault tolerance Primary-backup Coordinator-cohort

Distributed Systems 200615 Election We need to choose (in a distributed way) some process to take a special role in an algorithm? Processes that might participate join an appropriate group Now the group view gives a simple leader election rule –Everyone sees the same members, in the same order, ranked by when they joined –Leader can be, e.g., the “oldest” process We have already used this in a number of places –E.g., abcast

Distributed Systems 200616 Consensus A set of processes need to agree on some decision/value? A group can easily solve consensus –Leader multicasts: “what’s your input”? –All reply: “Mine is 0. Mine is 1” –Initiator picks the most common value and multicasts that: the “decision value” –If the leader fails, the new leader just restarts the algorithm Famous, theoretical result (”FLP”): –In the asynchronous model with process failure, distributed consensus is impossible Does this apply here?

Distributed Systems 200617 Replicated Data and Synchronization We want to to replicate data and synchronize on updates among a set of processes? Want to support READ, UPDATE, LOCK operations on replicated data First look at synchronous, non-uniform case

Distributed Systems 200618 Synchronous, Replicated Data Use groups... –Joining processes Transfer data from existing processes (more later) –Maintain replica of data at each process READ –Read local copy UPDATE –Asynchronously multicast update using abcast – i.e., no waiting LOCK –Nonexclusive read locks can be implemented locally –Acquire exclusive lock by abcasting to group members for them to grant the lock Recipient just grants requests for lock in order received – provided no non-exclusive read is pending locally Sender waits until all recepients have acquired lock Fault-tolerant, consistent –Replicas start in identical state –Subsequent update and lock operations behave identically at all copies – same events, same order

Distributed Systems 200619 Can we do better? How many abcasts can be replaced with asynchronous cbcasts? All, if we modify the protocol a bit... All UPDATEs guarded by appropriate locks –Want mutual exclusion on data that concurrent UPDATE could change LOCKs established by passing write rights

Distributed Systems 200620 cbcast and locking For possibly conflicting updates, there will be only one writer at a time –The initial writer is the first process to join the group To make a LOCK request, a process sends a cbcast and waits –Receiving processes queue requests When writer is done, it will send a cbcast that grants the lock –It takes oldest process on its LOCK queue and grant to that –Other processes dequeues request and grants (when local read locks done) Does this work? –Do group members perform same sequence of updates?

Distributed Systems 200621 cbcast and locking Sequence of conflicting updates identical at all replicas –There is only one writer at the time for a given lock –Updates and lock grants follow the same causal path –(The causal order is a total order along a causal path) Non-conflicting updates may be applied in any order

Distributed Systems 200622 Performance Mostly asynchronous –UPDATEs may be done when sending the UPDATE cbcast Causally prior updates will already have been delivered to sender –Waiting for LOCKs necessary unless we already have the lock... ”85.000 updates per second” –Cf. quorum read and update Forces into lockstep for reads and updates ”100 reads and updates (combined) per second” –Not dynamically uniform, though E.g., safe cbcast or cbcast followed by flush

Distributed Systems 200623 Summary We introduced ”Virtual Synchrony” as a powerful distributed programming abstraction –Processes appear to run in lock-step – middleware may optimize –Based on tools developed so far: membership, group communication Gave examples of what can be built on top of Virtual Synchrony –More next time...

Distributed Systems 2006 Virtual Synchrony* *With material adapted from Ken Birman.

Similar presentations

Presentation on theme: "Distributed Systems 2006 Virtual Synchrony* *With material adapted from Ken Birman."— Presentation transcript:

Similar presentations

About project

Feedback

Log in

Auth with social network:

Distributed Systems 2006 Virtual Synchrony* *With material adapted from Ken Birman.

Similar presentations

Presentation on theme: "Distributed Systems 2006 Virtual Synchrony* *With material adapted from Ken Birman."— Presentation transcript:

Similar presentations

About project

Feedback