Distributed Systems Fall 2009 Replication Fall 20095DV0203 Outline Group communication Fault-tolerant services –Passive and active replication Highly.

Distributed Systems Fall 2009 Replication

Fall 20095DV0203 Outline Group communication Fault-tolerant services –Passive and active replication Highly available services Summary

Fall 20095DV0204 Group communication Static vs. Dynamic groups Primary partition vs. partitionable groups Group management –Interface for membership changes –Failure detection –Notification upon membership changes –Provide group address expansion

Fall 20095DV0205 Group views Views contain a set of members at a given point in time –Failed identified processes are not in the view Events occur in views View-synchronous group communication –Allows based on view delivery, we can know which messages must have been delivered to other members

Fall 20095DV0206 View-synchronous group communication Correct processes deliver the same set of messages in any given view Messages are delivered at most once Correct processes always deliver messages they send –If delivering to q fails, the next view excludes q

Fall 20095DV0207 Why replication? Many algorithms require a working server node Performance (load balancing) Increased availability 1 – p(all replicas crashed) = 1 – p n Fault-tolerance –Correct servers in majority

Fall 20095DV0208 Replication Replication transparency –Client unaware of replication Problem with >1 client –Concurrent access, rather than exclusive –Operations are interleaved How do we ensure correctness?

Fall 20095DV0209 Correctness of interleavings Always –Interleaved sequence of operations must meet the specification of a single correct copy of the object(s) Sequential consistency property –Order of operations is consistent with the program order in which each individual process executed them Linearizability property –Order of operations is consistent with the real times at which the operations occurred during execution

Fall 20095DV02010 Example (interleaved operations) C1: A, B, C C2: d, e, f Order during execution: A, B, d, C, e, f An interleaving with sequential consistency: A, B, d, e, f, C Interleaving with linearizability: A, B, d, C, e, f

Fall 20095DV02011 Generalized replication 1. Request: client makes request 2. Coordination: replica managers decide upon order of request 3. Execution: request is executed 4. Agreement: replica managers agree on result of execution 5. Response: response is sent back to the client

Fall 20095DV02012 Passive replication One Primary replica manager, many backups If primary fails, backups can take its place (election!) Implements linearizability if: –A failing primary is replaced by a unique backup –Backups agree on which operations had been performed when primary crashed View-synchronous group communication!

Fall 20095DV02013 Passive replication 1. Request: front end issues request with unique ID 2. Coordination: primary checks if request has been carried out, if so, returns cached response 3. Execution: perform operation, cache results 4. Agreement: primary sends updated state to backups 5. Response: primary sends result to front end, which forwards to the client

Fall 20095DV02014 Active replication More distributed All replica managers carry out all operations Requests to RM are totally ordered Front ends issue one request at a time (FIFO) Implements sequential consistency

Fall 20095DV02015 Active replication 1. Request: front end adds unique identifier to request, mcasts it to RMs 2. Coordination: totally ordered request delivery to RMs 3. Execution: each RM executes request 4. Agreement: not needed 5. Response: all RMs respond to front end, front end interprets response and forwards interpretation to client

Fall 20095DV02016 Comparison (Active/Passive) Handling of crash failures? –Both: yes (but differently) Handling of arbitrary failures? –Active: yes, Passive: no Complexity? Optimizations? –Send “reads” to backups in passive Lose linearizability property! –Send “reads” to single backup in active Lose fault tolerance

Fall 20095DV02017 Highly available services Goal is to allow clients to use service for as long as possible –Even if network connections are lost –Even if results may be inconsistent Study this deeper on your own –Basic questions about this have been on exams before

Gossip Queries vs. updates Guarantees by Gossip – Each client gets a consistent service over time Replicas will provide data that is fresher than what the client has seen so far – Relaxed consistency between replicas Generally less than sequential consistency Eventually, all updates are applied (in order), but clients may observe stale data

Gossip Forced (total and causal) or immediate ordering Choice is up to application developer

Gossip steps 1.Request – blocking queries to a single RM, updates perhaps to many RMs for increased reliability 2.Update response – return immediately on updates 3.Coordination – wait until request can be processed according to ordering rules 4.Execution – execute request 5.Query response – reply to query 6.Agreement – update other RMs via gossip messages occasionally

Gossip architecture Consider Figure 15.8 in the book © Pearson Education 2005. Value timestamp: vector timestamp showing which updates have been applied Update log: Record all updates, even if they cannot be applied yet (not yet stable) Keep updates that have become stable so they can be propagated to others Replica timestamp: updates that have been accepted, but may not yet be stable Executed operation table: avoid repeated updates Timestamp table: timestamp for each RM, keep track of applied updates in other RMs

Fall 20095DV02022 Summary Group communication –Views –View-synchronous group communication Replication –Correctness Linearizability: time Sequential consistency: program order –Passive and active replication schemes

Fall 20095DV02023 Next lecture Transactions –Nested transactions Concurrency control –Locks –Optimistic concurrency control

Distributed Systems Fall 2009 Replication Fall 20095DV0203 Outline Group communication Fault-tolerant services –Passive and active replication Highly.

Similar presentations

Presentation on theme: "Distributed Systems Fall 2009 Replication Fall 20095DV0203 Outline Group communication Fault-tolerant services –Passive and active replication Highly."— Presentation transcript:

Similar presentations

About project

Feedback

Log in

Auth with social network:

Distributed Systems Fall 2009 Replication Fall 20095DV0203 Outline Group communication Fault-tolerant services –Passive and active replication Highly.

Similar presentations

Presentation on theme: "Distributed Systems Fall 2009 Replication Fall 20095DV0203 Outline Group communication Fault-tolerant services –Passive and active replication Highly."— Presentation transcript:

Similar presentations

About project

Feedback