CSE 486/586, Spring 2012 CSE 486/586 Distributed Systems Replication --- 3 Steve Ko Computer Sciences and Engineering University at Buffalo.


CSE 486/586, Spring 2012 CSE 486/586 Distributed Systems Replication Steve Ko Computer Sciences and Engineering University at Buffalo

CSE 486/586, Spring 2012 Recap
Consistency
–Linearizability?
–Sequential consistency?
Passive replication? Active replication?
One-copy serializability?
Primary copy replication?
Read-one/write-all replication?

CSE 486/586, Spring 2012 Available Copies Replication
A client's read request on an object can be performed by any RM, but a client's update request must be performed across all available (i.e., non-faulty) RMs in the group.
As long as the set of available RMs does not change, local concurrency control achieves one-copy serializability in the same way as in read-one/write-all replication.
–This may not be true if RMs fail and recover during conflicting transactions.
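The read-any/write-all-available behavior described above can be sketched as follows. This is a minimal illustration with made-up class and function names (ReplicaManager, read_one, write_all_available), not code from the lecture:

```python
# Hypothetical sketch of available-copies replication: reads go to any one
# available RM, updates go to every available (non-faulty) RM in the group.

class ReplicaManager:
    def __init__(self, name):
        self.name = name
        self.available = True   # non-faulty until marked otherwise
        self.store = {}

    def read(self, key):
        return self.store.get(key)

    def write(self, key, value):
        self.store[key] = value

def read_one(rms, key):
    """Read from the first available replica manager."""
    for rm in rms:
        if rm.available:
            return rm.read(key)
    raise RuntimeError("no available replica")

def write_all_available(rms, key, value):
    """Apply the update at every currently available replica manager."""
    for rm in rms:
        if rm.available:
            rm.write(key, value)

rms = [ReplicaManager(n) for n in ("X", "M", "P")]
write_all_available(rms, "A", 100)   # reaches X, M, and P
rms[0].available = False             # RM X fails
write_all_available(rms, "A", 103)   # update only reaches M and P
print(read_one(rms, "A"))            # reads from an available RM: 103
```

Note how the failed RM X is left holding the stale value 100; this is exactly the situation that makes serializing failures and recoveries with respect to transactions necessary, as the following slides show.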

CSE 486/586, Spring 2012 Available Copies Approach
[Figure: transaction T issues getBalance(A) then deposit(B,3); transaction U issues getBalance(B) then deposit(A,3). Each client goes through a front end. Object A is replicated at RMs X and Y; object B is replicated at RMs M, N, and P.]

CSE 486/586, Spring 2012 The Impact of RM Failure
Assume that:
–RM X fails just after T has performed getBalance; and
–RM N fails just after U has performed getBalance.
–Both failures occur before any of the deposit() operations.
Subsequently:
–T's deposit will be performed at RMs M and P.
–U's deposit will be performed at RM Y.
The concurrency control on A at RM X does not prevent transaction U from updating A at RM Y.
Solution: must also serialize RM crashes and recoveries with respect to entire transactions.

CSE 486/586, Spring 2012 Local Validation
From T's perspective:
–T has read from an object at X → X must have failed after T's operation.
–T observes the failure of N when it attempts to update the object B → N's failure must be before T.
–Thus: N fails → T reads object A at X; T writes object B at M and P → T commits → X fails.
From U's perspective:
–Thus: X fails → U reads object B at N; U writes object A at Y → U commits → N fails.
At the time T tries to commit:
–It first checks that N is still unavailable and that X, M, and P are still available. Only then can T commit.
–If T commits, U's validation will fail because N has already failed.
Can be combined with 2PC.
Caveat: local validation may not work if partitions occur in the network.
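The commit-time check above can be sketched as a small predicate. This is a hedged sketch with my own helper names (can_commit and its parameters are illustrative, not from the lecture): a transaction validates that every RM it read from or wrote to is still available, and every RM it observed as failed is still failed.

```python
# Local validation sketch: serialize RM crashes/recoveries w.r.t. transactions
# by re-checking the failure pattern the transaction observed, at commit time.

def can_commit(rms_used, rms_observed_failed, currently_available):
    """True iff every RM this transaction used is still available and
    every RM it observed as failed is still unavailable."""
    used_still_up = all(rm in currently_available for rm in rms_used)
    failed_still_down = all(rm not in currently_available
                            for rm in rms_observed_failed)
    return used_still_up and failed_still_down

# T read A at X and wrote B at M and P; it observed N as failed.
available = {"X", "M", "P", "Y"}  # N has failed, everything else is up
assert can_commit({"X", "M", "P"}, {"N"}, available)       # T may commit

# U read B at N (which has since failed) and wrote A at Y, and it observed
# X as failed -- but X is still available, so U's validation fails.
assert not can_commit({"N", "Y"}, {"X"}, available)
```

This matches the slide's conclusion: once T commits, U's validation cannot succeed, because N's failure breaks U's observed ordering.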

CSE 486/586, Spring 2012 Network Partition
How do you deal with this?
[Figure: a network partition separates the replica managers. On one side, transaction T issues withdraw(B,4) through its front end; on the other side, transaction U issues deposit(B,3). Replicas of B exist in both partitions.]

CSE 486/586, Spring 2012 Dealing with Network Partitions
During a partition, pairs of conflicting transactions may have been allowed to execute in different partitions.
The only choice is to take corrective action after the network has recovered.
–Assumption: partitions heal eventually.
–Abort one of the transactions after the partition has healed.
Basic idea: allow operations to continue in partitions, but finalize and commit transactions only after partitions have healed.
But to optimize performance, it is better to avoid executing operations that will eventually lead to aborts... how?

CSE 486/586, Spring 2012 Quorum Approaches
Quorum approaches are used to decide whether reads and writes are allowed.
There are two types: pessimistic quorums and optimistic quorums.
In the pessimistic quorum philosophy, updates are allowed only in a partition that has the majority of RMs.
–Updates are then propagated to the other RMs when the partition is repaired.

CSE 486/586, Spring 2012 Static Quorums
The decision about how many RMs should be involved in an operation on replicated data is called quorum selection.
Quorum rules state that:
–At least r replicas must be accessed for read.
–At least w replicas must be accessed for write.
–r + w > N, where N is the number of replicas.
–w > N/2.
–Each object has a version number or a consistent timestamp.
A static quorum predefines r and w, and is a pessimistic approach: if a partition occurs, an update will be possible in at most one partition.
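The two quorum constraints above can be checked mechanically. A minimal sketch (the function name is mine): r + w > N guarantees every read quorum overlaps every write quorum, and w > N/2 guarantees any two write quorums overlap.

```python
def valid_quorum(r, w, n):
    """Static quorum rules: read quorums intersect write quorums (r + w > n),
    and any two write quorums intersect (w > n/2, i.e., 2*w > n)."""
    return r + w > n and 2 * w > n

# With N = 7 replicas: r = 4, w = 4 satisfies both rules, while r = 2, w = 4
# does not, because a read quorum of 2 could miss the latest write entirely.
print(valid_quorum(4, 4, 7))  # True
print(valid_quorum(2, 4, 7))  # False
```

The overlap guarantee is what lets a reader pick the replica with the highest version number (or timestamp) from its quorum and be sure it has the latest committed write.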

CSE 486/586, Spring 2012 Voting with Static Quorums
Modified quorum:
–Give different replicas different numbers of votes.
–E.g., a cache replica may be given 0 votes.
With r = w = 2:
–Access time for write is 750 ms (parallel writes).
–Access time for read without cache is 750 ms.
–Access time for read with cache can be in the range 175 ms to 825 ms – why?

Replica  votes  access time  version check  P(failure)
Cache    0      100 ms       0 ms           0%
Rep1     1      750 ms       75 ms          1%
Rep2     1      750 ms       75 ms          1%
Rep3     1      750 ms       75 ms          1%
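With weighted voting, quorums are counted in votes rather than replicas. A small sketch using the vote counts from the table above (helper names are mine):

```python
# Weighted-voting quorum check: a zero-vote cache can serve data but can
# never form a quorum by itself, so reads must still contact voting replicas
# to check the version number.

votes = {"Cache": 0, "Rep1": 1, "Rep2": 1, "Rep3": 1}

def has_quorum(accessed, needed_votes):
    """True if the accessed replicas together hold at least needed_votes."""
    return sum(votes[r] for r in accessed) >= needed_votes

R = W = 2  # read and write quorum sizes, in votes

print(has_quorum({"Cache"}, R))                   # False: 0 votes
print(has_quorum({"Cache", "Rep1", "Rep2"}, R))   # True: 2 votes
print(has_quorum({"Rep1", "Rep2"}, W))            # True: writes need 2 votes
```

One plausible reading of the 175 ms to 825 ms range (my interpretation, not stated on the slide): in the best case a read fetches from the cache (100 ms) and, in parallel, runs version checks at two voting replicas (75 ms each), finishing after about 175 ms; if the checks reveal the cache is stale, the read must additionally fetch from a voting replica (750 ms), for roughly 825 ms in the worst case.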

CSE 486/586, Spring 2012 CSE 486/586 Administrivia
Project 1 deadline: 3/26 (Monday).
Project 2 will be released on Monday.
Project 0 scores are up on Facebook.
–Request regrading until today.
Great feedback so far online. Please participate!

CSE 486/586, Spring 2012 Optimistic Quorum Approaches
An optimistic quorum selection allows writes to proceed in any partition.
This might lead to write-write conflicts. Such conflicts will be detected when the partition heals.
–Any write that violates one-copy serializability will then cause the transaction that contained it to abort.
–Still improves performance, because partition repair is not needed until commit time (and it is likely the partition will have healed by then).
Optimistic quorums are practical when:
–Conflicting updates are rare.
–Conflicts are always detectable.
–Damage from conflicts can be easily confined.
–Repair of damaged data is possible, or an update can be discarded without consequences.
–Partitions are relatively short-lived.
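One simple way conflicts can be made "always detectable" is via the per-object version numbers mentioned earlier. A hedged sketch (detect_conflict and its tuple encoding are my own illustration, not from the lecture): each partition keeps writing and bumping its own copy's version; on repair, two writes to the same object that started from the same base version are a write-write conflict, and one of the transactions must abort.

```python
# After a partition heals, compare the writes performed on each side.
# A write is encoded as (object_name, base_version_it_updated_from).

def detect_conflict(write_a, write_b):
    """Same object updated from the same base version in different
    partitions => write-write conflict."""
    return write_a[0] == write_b[0] and write_a[1] == write_b[1]

print(detect_conflict(("B", 3), ("B", 3)))  # True: both updated B from v3
print(detect_conflict(("B", 3), ("A", 3)))  # False: different objects
print(detect_conflict(("B", 3), ("B", 4)))  # False: sequential, not conflicting
```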

CSE 486/586, Spring 2012 View-based Quorum
An optimistic approach.
Quorum is based on views at any time.
–Uses group communication as a building block (see previous lecture).
In a partition, inaccessible nodes are considered in the quorum as ghost participants that reply "Yes" to all requests.
–Allows operations to proceed if the partition is large enough (need not be a majority).
Once the partition is repaired, participants in the smaller partition know whom to contact for updates.

CSE 486/586, Spring 2012 View-based Quorum - details
Uses view-synchronous communication as a building block (see previous lecture).
Views are per object, numbered sequentially, and only updated if necessary.
We define thresholds for each of read and write:
–Aw: minimum nodes in a view for write, e.g., Aw > N/2.
–Ar: minimum nodes in a view for read.
–E.g., Aw + Ar > N.
If an ordinary quorum cannot be reached for an operation, then we take a straw poll, i.e., we update views.
–In a large enough partition for read, view size ≥ Ar.
–In a large enough partition for write, view size ≥ Aw.
–(Inaccessible nodes are considered as ghosts that reply Yes to all requests.)
The first update after partition repair forces restoration for nodes in the smaller partition.
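The straw-poll step can be sketched as follows. This is an illustrative sketch under stated assumptions (try_write and its return strings are mine): if a write cannot reach the ordinary quorum of w replicas, the reachable nodes shrink the view to themselves, and the write is allowed only if the new view still has at least Aw members.

```python
# View-based quorum write sketch: ghosts let an operation proceed whenever
# the *view* is large enough, even without a majority of all N replicas.

N, w = 5, 5     # ordinary write quorum requires all 5 replicas
A_w = 3         # minimum view size for a write

def try_write(reachable):
    if len(reachable) >= w:
        return "write with ordinary quorum"
    # Straw poll: update the view to contain only the reachable nodes.
    new_view = set(reachable)
    if len(new_view) >= A_w:
        return "view updated, write proceeds in view %s" % sorted(new_view)
    return "abort: view smaller than A_w"

print(try_write({1, 2, 3, 4, 5}))  # no partition: ordinary quorum
print(try_write({1, 2, 3, 4}))     # 4-node partition: view shrinks, write ok
print(try_write({5}))              # 1-node partition: write aborts
```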

CSE 486/586, Spring 2012 Example: View-based Quorum
Consider: N = 5, w = 5, r = 1, Aw = 3, Ar = 1.
[Figure: five nodes 1-5, each with a per-object view V1.0 through V5.0.
–Initially, all nodes are in one view.
–The network is partitioned into {1, 2, 3, 4} and {5}.
–A read is initiated; quorum is reached (r = 1).
–A write is initiated; quorum is not reached (w = 5).
–P1 changes its view, writes, and updates views: nodes 1-4 move to V1.1-V4.1, while node 5 stays at V5.0.]

CSE 486/586, Spring 2012 Example: View-based Quorum (cont'd)
[Figure: continuing with views V1.1-V4.1 and V5.0.
–The partition is repaired.
–P5 initiates a read; it has quorum (r = 1), but reads stale data.
–P5 initiates a write; there is no quorum and Aw is not met, so it aborts.
–P3 initiates a write and notices the repair.
–Views are updated to include P5 (all nodes move to V1.2-V5.2), and P5 is informed of the updates.]

CSE 486/586, Spring 2012 CAP Theorem
Consistency
Availability
–Respond with a reasonable delay.
Partition tolerance
–Even if the network gets partitioned.
Choose two!
Conjectured by Brewer in 2000, then proven by Gilbert and Lynch in 2002.

CSE 486/586, Spring 2012 Coping with CAP
The main issue is scale.
–As the system size grows, network partitioning becomes inevitable.
–You do not want to stop serving requests because of network partitioning.
–Giving up partition tolerance means giving up scale.
Then the choice is either giving up availability or giving up consistency.
Giving up availability and retaining consistency:
–E.g., use 2PC.
–Your system blocks until everything becomes consistent.
–Probably cannot satisfy customers well enough.
Giving up consistency and retaining availability:
–Eventual consistency.

CSE 486/586, Spring 2012 Summary
Distributed transactions with replication
–Available copies replication
Quorums
–Static
–Optimistic
–View-based
CAP Theorem

CSE 486/586, Spring 2012 Acknowledgements
These slides contain material developed and copyrighted by Indranil Gupta (UIUC).