Distributed and Replicated Data. Seif Haridi.

Presentation transcript:

1 Distributed and Replicated Data Seif Haridi

2 Distributed and Replicated Data Purpose –Increase performance (parallel processing) –Increase safety (redundancy) To some extent contradictory –Safer → more complex → slower Abstraction level of models –Tradeoff between conceptually simple high-level models, which hide a lot of the detail but are expensive and don’t scale too well, and low-level models, which are less expensive and scale better

3 Applications Fault tolerance is an issue The models considered –Transactions (from database theory) –Reliable broadcast –“Update propagation”

4 Transactions Transactions: complex state transitions which should seem atomic [Figure: transfer of z SEK from Grandma’s account to Grandpa’s account; Grandma’s balance goes from x SEK to x-z SEK, Grandpa’s from y SEK to y+z SEK]

5 Transactions Usually composed of more primitive read- and write operations: “read x; read y; read z; write x-z; write y+z” If the transaction is interrupted we don’t want to have as the resulting state the one before the last write (then Grandpa gets angry) Also not good if two concurrent transfers A, B to Grandpa’s account are mixed up, so both A and B read the old value of y of his account (then we’ll lose one of the contributions)
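The lost contribution described above can be reproduced in a few lines. This is only an illustrative sketch: the function names and the amounts are made up, and the "concurrency" is simulated by ordering the reads and writes by hand.

```python
def interleaved_deposits(y, a, b):
    """Transfers A and B both read y before either writes (the bad interleaving)."""
    read_a = y          # A reads the old balance
    read_b = y          # B also reads the old balance
    y = read_a + a      # A writes old y + a
    y = read_b + b      # B overwrites with old y + b, so A's deposit is lost
    return y


def serial_deposits(y, a, b):
    """A runs to completion before B starts."""
    y = y + a
    y = y + b
    return y


print(interleaved_deposits(100, 10, 20))  # 120: the deposit of 10 is gone
print(serial_deposits(100, 10, 20))       # 130
```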

6 ACID properties A=Atomicity: either all or none of the operations in a transaction are performed C=Consistency: the execution of interleaved transactions is equivalent to a serial execution of the transactions in some order I=Isolation: partial results of an incomplete transaction are not visible to others before the transaction is successfully committed D=Durability: results of a committed transaction will be permanent even if a failure occurs after the commitment [Figure: persistent state → TR → new persistent state]

7 ACID properties The “ACID” properties have some consequences for implementation: –A requires that intermediate results can be undone if a failure occurs –C requires that we lock the read/write entities involved in a transaction (e.g. representation of account in a database) while it goes on –D requires that final results are written to permanent storage, and that beyond a certain commit point “preliminary” results have been recorded as well, so the transaction can be successfully completed even if failures occur

8 Common way to implement transactions
1. Lock all entities read and written by the transaction
2. Compute the results of the transaction (but don’t write them yet)
3. Write a persistent record of the results to a log file (precommit)
4. Write a commit record for the transaction to the log file (commit)
5. Perform the actual writes (updating the database)
6. Unlock the entities involved
If a failure occurs before step 4, the transaction is aborted upon restart; if it occurs after step 4, the transaction is completed. (This locking discipline, 2PL, is susceptible to deadlock, see later.)
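The precommit/commit steps above can be sketched with an in-memory log. Everything here is an assumption made for illustration: `TinyTxStore`, `transfer`, and the `crash_after` switch are hypothetical names, the "log file" is a Python list rather than a file on disk, and locking (steps 1 and 6) is left out.

```python
class TinyTxStore:
    """In-memory stand-in for a database plus an append-only log (both hypothetical)."""

    def __init__(self, data):
        self.data = dict(data)   # the "database"
        self.log = []            # the "log file"; a real one would live on disk

    def run(self, tx_id, compute, crash_after=None):
        """Steps 2-5 of the recipe. crash_after simulates a failure after that step."""
        results = compute(self.data)                    # step 2: compute, do not write
        self.log.append(("precommit", tx_id, results))  # step 3: record the results
        if crash_after == 3:
            return
        self.log.append(("commit", tx_id))              # step 4: the commit point
        if crash_after == 4:
            return
        self.data.update(results)                       # step 5: the actual writes

    def recover(self):
        """On restart: redo transactions that have a commit record, drop the rest."""
        committed = {rec[1] for rec in self.log if rec[0] == "commit"}
        for rec in self.log:
            if rec[0] == "precommit" and rec[1] in committed:
                self.data.update(rec[2])


def transfer(z):
    """Results of moving z from account x to account y, as absolute new values."""
    return lambda d: {"x": d["x"] - z, "y": d["y"] + z}


store = TinyTxStore({"x": 100, "y": 0})
store.run("t1", transfer(30), crash_after=3)   # fails before the commit record
store.recover()
print(store.data)                              # {'x': 100, 'y': 0}: t1 was aborted
store.run("t2", transfer(30), crash_after=4)   # fails after the commit record
store.recover()
print(store.data)                              # {'x': 70, 'y': 30}: t2 was completed
```

Logging absolute new values (rather than deltas) is a design choice of this sketch; it makes the redo in `recover` safe to repeat.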

9 Centralized Recovery We need to recover from disk failures during transaction execution so as to ensure the all-or-nothing property. 3 approaches: –Shadow paging: 2 copies of the database. –Before images: store a log of before values on disk and update the database immediately. If a failure occurs and the transaction has not committed, restore the db based on the log. –After images: perform updates in a log of after images. If the transaction commits, install the values in the db from the log.

10 Transactions in a distributed system Data is spread out over different processors A number of processors may participate in a transaction (or in different ongoing transactions) Processors may fail (and recover) during a transaction We assume an asynchronous system How to ensure that the ACID properties are still fulfilled?

11 A possible solution Every processor has its own log file Make sure all involved processors agree on every step before proceeding (?) That is: if some processor aborts (or fails at a critical moment), then everybody should abort If all processors have written commit records to their log files, then the transaction will eventually succeed even if some processors fail The processors can vote whether to proceed for each step A coordinator collects the votes and decides whether to go ahead or not

12 Distributed Recovery Databases reside on sites in a distributed system. Communication between sites by messages only. Each transaction has a home site or coordinator, and a number of participants. Goal: Either all sites commit or all abort. When a transaction wants to commit, it must be sure that all sites agree to commit too.

13 Vote coordination [Figure: timeline with the coordinator and the participants; the participants send their votes, the coordinator checks if all are yes and replies COMMIT or ABORT]

14 Atomic Commitment At commit time, the coordinator requests votes from all participants. Atomic commitment requires: –All processes reach same decision –Commit only if all processes vote Yes. –If there are no failures and all processes vote Yes, decision will be commit.

15 Two Phase Commit (2PC) Coordinator –send vote-request –Collect votes. If all Yes, then Commit, else Abort. –Send decision Participant –receive vote-request –send Yes or No –Wait for decision

16 Failures and Blocking What does a process do if it does not receive a message it is expecting, i.e., on timeout? 3 cases:
–participant waiting for vote-request → abort
–coordinator waiting for vote → abort
–participant waiting for decision → uncertain
Note: the coordinator is never uncertain

17 The Two-phase Commit Algorithm (2PC) Code for coordinator (details regarding locking etc. are suppressed):
2PC_Coordinator()
    precommit the transaction
    for every participant p, send(p, VOTE_REQ)
    wait up to T seconds for VOTE messages
        Vote(sender; vote_response):
            if vote_response = YES then increment the number of YES votes
    if every participant responded with a YES vote then
        commit the transaction /* write YES vote and a commit record to log */
        for every participant p, send(p, COMMIT)
    else
        abort the transaction /* write ABORT record to log */
        for every participant p, send(p, ABORT)

18 2PC Participants
2PC_Participant()
    while (True)
        wait for a message from the coordinator
            VOTE_REQ(coordinator):
                if I can commit the transaction then
                    precommit the transaction
                    write a YES vote to the log
                    send(coordinator, YES)
                else
                    abort the transaction
                    send(coordinator, NO)
            COMMIT(coordinator): commit the transaction
            ABORT(coordinator): abort the transaction
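The coordinator and participant roles can be condensed into a single-process sketch of the decision rule. This is an illustration under stated assumptions, not the lecture's implementation: `two_phase_commit` is a hypothetical name, messages are replaced by a dictionary of votes, and a missing reply (a timeout after T seconds) is modelled as `None`.

```python
def two_phase_commit(votes):
    """Decide one transaction.

    votes maps a participant name to "YES", "NO", or None (no reply within T).
    Returns (decision, messages sent to participants, coordinator log).
    """
    log = ["precommit"]
    # Phase 1: VOTE_REQ has been sent; a missing reply is counted as a NO vote.
    all_yes = all(vote == "YES" for vote in votes.values())
    decision = "COMMIT" if all_yes else "ABORT"
    log.append(decision)                      # the decision is logged before it is sent
    # Phase 2: every participant receives the same decision.
    messages = {p: decision for p in votes}
    return decision, messages, log


print(two_phase_commit({"p1": "YES", "p2": "YES"})[0])   # COMMIT
print(two_phase_commit({"p1": "YES", "p2": "NO"})[0])    # ABORT
print(two_phase_commit({"p1": "YES", "p2": None})[0])    # ABORT (p2 timed out)
```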

19 If a processor goes down during the transaction Execute a recovery protocol when it comes up again: –Any processor before precommit: abort (and vote NO if participant) –Coordinator after precommit: can choose (typically continue). Any votes lost due to the failure will yield abort –Participant after precommit: wait to see how the others voted (through the COMMIT or ABORT message from the coordinator) (?) –Any processor after commit or abort: complete the respective operation

20 Implication of Asynchronous communication The asynchronous error model implies that in certain phases the transaction cannot be aborted on timeouts. It’s OK for the coordinator to count votes not received within T as NO votes (still safe; it only means we’ll abort in a few cases where a late YES message would have allowed the transaction to proceed). But if the coordinator goes down during the vote, then the participants must wait for it to come up

21 Termination Protocol Can a participant find help from other participants? Send to all participants: “Help, what is the decision?”
–If any participant has committed or aborted → send commit or abort decision.
–If a participant has not yet voted → abort and send abort decision.
–If all participants voted Yes → all live participants uncertain: transaction BLOCKED!
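The three rules of the termination protocol amount to a small decision function. A minimal sketch, assuming each reachable participant reports one of four state labels; the function name and the labels are invented for illustration.

```python
def cooperative_termination(peer_states):
    """Outcome for an uncertain 2PC participant that asks its peers for help.

    peer_states holds the states reported by the reachable participants:
    "COMMITTED", "ABORTED", "NOT_VOTED", or "UNCERTAIN" (voted Yes, no decision).
    """
    if "COMMITTED" in peer_states:
        return "COMMIT"        # someone already knows the decision was commit
    if "ABORTED" in peer_states or "NOT_VOTED" in peer_states:
        return "ABORT"         # a peer that has not voted can still abort safely
    return "BLOCKED"           # everyone voted Yes and nobody knows the decision


print(cooperative_termination(["UNCERTAIN", "COMMITTED"]))   # COMMIT
print(cooperative_termination(["UNCERTAIN", "NOT_VOTED"]))   # ABORT
print(cooperative_termination(["UNCERTAIN", "UNCERTAIN"]))   # BLOCKED
```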

22 The coordinator failure [Figure: timeline with the coordinator, a COMMIT message, the timeout T, and participant P3] P3 cannot decide to abort unilaterally on timeout: the coordinator can be alive and have sent COMMIT to others even though the message to P3 is arbitrarily delayed. The Three-phase Commit (3PC) is designed to cut this uncertainty period

23 Blocking of 2PC 2PC is a blocking protocol. Basic intuition: When a participant is in wait (uncertain) state, some other participants may be in commit and others in abort states. Solution: Introduce a buffer state so that if any operational site is uncertain, no process can have decided to Commit [Skeen 82]. 3 Phase commit protocol only assumes site failures.

24 Three Phase Commit (3PC) Coordinator –send vote-request –Collect votes. If all Yes, then send Pre-Commit, else send Abort. –Collect all Acks, and send Commit Participant –receive vote-request –send Yes or No –if receive abort, then Abort, else, send Ack –If receive commit, then Commit.

25 Failure handling in 3PC 5 cases:
(1) participant waiting for vote-request → abort
(2) coordinator waiting for vote → abort
(3) coordinator waiting for Ack → commit
(4) participant waiting for decision → elect new leader
(5) participant waiting for commit → elect new leader
Note: In (5) a participant may still be waiting for decision.

26 Termination for 3PC The leader sends a state request to all participants.
–If any participant has committed or aborted → send commit or abort decision.
–If a participant has not yet voted → abort and send abort decision.
–If all live participants voted Yes and are still uncertain → abort (safe in 3PC, because no process can have committed while an operational site is uncertain).
–If some participant has pre-committed → leader sends Pre-commit to all, waits for acks, then sends commit.
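The leader's choice in the 3PC termination protocol can be written as one function over the reported states. This is a sketch with invented names and labels. One point is an explicit assumption: when every live participant is uncertain, the function returns abort, relying on the buffer-state property that no process can have committed while an operational site is still uncertain.

```python
def three_pc_termination(states):
    """Decision of a newly elected leader after collecting participant states.

    states holds one label per live participant: "COMMITTED", "ABORTED",
    "NOT_VOTED", "UNCERTAIN" (voted Yes), or "PRECOMMITTED".
    """
    if "COMMITTED" in states:
        return "COMMIT"
    if "ABORTED" in states or "NOT_VOTED" in states:
        return "ABORT"
    if "PRECOMMITTED" in states:
        # Leader sends Pre-commit to all, waits for acks, then sends commit.
        return "PRECOMMIT_THEN_COMMIT"
    # Assumption: all live participants are uncertain, so nobody has committed.
    return "ABORT"


print(three_pc_termination(["UNCERTAIN", "PRECOMMITTED"]))  # PRECOMMIT_THEN_COMMIT
print(three_pc_termination(["UNCERTAIN", "UNCERTAIN"]))     # ABORT
```

Unlike the 2PC case, no input leads to a blocked outcome, which is the point of the extra phase when only site failures occur.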

27 Commit Protocols Summary –2PC blocks with failures. –3PC is non-blocking with site failures only. –3PC blocks with partitioning failures. [Figure: sites split into Partition 1 and Partition 2] Theorem [Skeen82]: There is no non-blocking atomic commitment protocol in the presence of partitioning failures.

28 Transaction with Replicated Data Fault tolerance requires some kind of replication. Note that the naïve approach can give problems: [Figure: transactions tr1 and tr2 applied to two copies of datum d, leaving inconsistent copies d1 and d2]

29 Replicated data In general, correct handling of replicated data requires some kind of consistency: accesses to data must seem to come in the same order for everyone. [Figure: the copies of d see write1 and write2 in the same order, either one or the other first]

30 Transaction data In particular in transaction systems, the transactions themselves must seem executed in the same order for everyone: Either … tr1; tr2 … or … tr2; tr1 …

31 Serializability A database consists of a set of objects: x, y, z. Each object has a value. The values of all the objects form the state of the database, and these states must satisfy the database integrity constraints. Database objects support 2 atomic operations: read[x], write[x].

32 Preliminaries A transaction is a set of operations executed in some order. We will assume total order. A transaction is assumed to be correct, i.e., if executed alone on a consistent database, it transforms it into another consistent state. Example: r1[x] r1[y] w1[x] w1[y] is an example of a transaction t1 that transfers some amount of money from account x to account y.

33 Serializability 2 operations conflict if the order of execution is important, i.e. if they access the same object and at least one of them is a write. [Figure: operations of t1 and t2; for example, t1 W(y) and t2 W(z) do not conflict, while t1 W(y) and t2 W(y) conflict]

34 A Quorum-based Protocol Idea: access to datum d requires a “vote” among P(d), the set of processors replicating d. All voters read and (for write access) update their copies. Require enough votes to ensure that accesses which must be serialized (conflicting RW, WR, WW) have some processors in common

35 Quorum based Protocol

36 Read & Write

37 Example [Figure: read (R) and write (W) quorums over the replicas]

38 Example Common case: “read one/write all” W(d)= V(d) R(d) = 1 Local reads, but writes go to all processors
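The overlap requirement behind quorums can be checked arithmetically. The sketch below uses the usual quorum conditions, which are stated here as an assumption because the slides that defined them did not survive in this transcript: a read quorum R and a write quorum W out of V votes must satisfy R + W > V (so every read meets the latest write) and 2W > V (so any two writes meet). The function name is hypothetical.

```python
def quorums_valid(total_votes, read_quorum, write_quorum):
    """True if conflicting accesses (RW, WR, WW) always share at least one voter."""
    read_write_overlap = read_quorum + write_quorum > total_votes
    write_write_overlap = 2 * write_quorum > total_votes
    return read_write_overlap and write_write_overlap


print(quorums_valid(5, 1, 5))   # True: "read one/write all"
print(quorums_valid(5, 3, 3))   # True: majority quorums
print(quorums_valid(5, 1, 3))   # False: a read of one copy may miss the last write
```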

39 Replicated Servers A server is really an interface to a service -- it can be implemented by several processors. A server can be replicated over a set of processors P. A client contacts some p in P, which acts as a coordinator for the transaction. The methods treated so far can be used to handle the transaction. Alternative: primary copy -- a single processor coordinating all transactions. + Simplifies things, e.g. no distributed locks necessary. - No performance gain from parallelism. If the primary copy fails, a new one can be elected

40 Preliminaries Given a set of transactions T, a history H over T is a partial order over all transaction operations and the order reflects the operation execution order (transaction order and conflicting operations order). A schedule is any linear order consistent with H’s partial order

41 Example of a history
T1: r1[x] w1[x] c1
T2: r2[x] w2[y] w2[x] c2
T3: r3[y] w3[x] w3[y] w3[z] c3
[Figure: the history drawn as a partial order, with one row per transaction in the order T1, T3, T2; the arrows between rows are missing from this transcript]

42 Correctness A history is serial if for every 2 transactions, either all operations of one appear before the other or vice versa. Since every transaction is correct, a serial history must be correct, and if executed on a consistent database, will result in a consistent database. But we want to allow concurrent transactions…

43 Example of concurrent execution: transfer 100 from account x to y
Serial execution      | Concurrent execution
r1[x] returns 200     | r1[x] returns 200
w1[x] writes 100      | w1[x] writes 100
r1[y] returns 200     | r2[x] returns 100
w1[y] writes 300      | r1[y] returns 200
commit t1             | w1[y] writes 300
r2[x] returns 100     | commit t1
r2[y] returns 300     | r2[y] returns 300
commit t2             | commit t2
BOTH TRANSACTIONS OBSERVE AND WRITE SAME VALUES!

44 Serializability A history is serializable if it is equivalent to a serial history over the same set of transactions. 2 histories are view equivalent if they have the same effects, i.e. the same values are written by all transactions. Since we do not know what transactions write, we require that transactions read from the same transactions and that the final written values are the same.

45 Conflict Serializability Recall: 2 operations conflict if they access the same object and at least one of them is a write operation. Two histories, H1 and H2, are conflict equivalent if the order of conflicting operations is the same in both histories, i.e., if o1 in t1 and o2 in t2 conflict, then –o1 < o2 in H1 iff o1 < o2 in H2. H is conflict serializable if it is conflict equivalent to a serial history.

46 Serialization Graphs How do we prove a history H is (conflict) serializable? Serialization Graph SG(H): –nodes are transactions, –t1 -> t2 if o1 in t1 and o2 in t2 conflict and o1 < o2 in H
H: w1[x] w1[y] c1 r2[x] r3[y] w2[x] c2 w3[y] c3
SG(H): t1 -> t2, t1 -> t3
Serializability Theorem: A history H is serializable iff SG(H) is acyclic. A concurrency control protocol ensures serializability.
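The serialization graph test can be sketched directly from the definition. The representation is an assumption made for this example: a history is a list of (transaction, operation, object) triples in execution order, commits are omitted, and the function names are hypothetical.

```python
def serialization_graph(history):
    """Edges t1 -> t2 for conflicting operations o1 < o2 of different transactions.

    history is a list of (transaction, op, item) with op "r" or "w", in order.
    """
    edges = set()
    for i, (t1, op1, x1) in enumerate(history):
        for t2, op2, x2 in history[i + 1:]:
            if t1 != t2 and x1 == x2 and "w" in (op1, op2):
                edges.add((t1, t2))
    return edges


def is_acyclic(edges):
    """Depth-first search for a cycle in the directed graph given by edges."""
    nodes = {n for edge in edges for n in edge}
    successors = {n: [b for a, b in edges if a == n] for n in nodes}
    state = {}                       # node -> "visiting" or "done"

    def visit(n):
        if state.get(n) == "visiting":
            return False             # back edge: a cycle exists
        if state.get(n) == "done":
            return True
        state[n] = "visiting"
        ok = all(visit(m) for m in successors[n])
        state[n] = "done"
        return ok

    return all(visit(n) for n in nodes)


# H from the slide: w1[x] w1[y] c1 r2[x] r3[y] w2[x] c2 w3[y] c3
h = [(1, "w", "x"), (1, "w", "y"), (2, "r", "x"),
     (3, "r", "y"), (2, "w", "x"), (3, "w", "y")]
print(sorted(serialization_graph(h)))        # [(1, 2), (1, 3)]
print(is_acyclic(serialization_graph(h)))    # True: H is conflict serializable

# A lost-update style history: r1[x] r2[x] w1[x] w2[x]
bad = [(1, "r", "x"), (2, "r", "x"), (1, "w", "x"), (2, "w", "x")]
print(is_acyclic(serialization_graph(bad)))  # False: t1 -> t2 and t2 -> t1
```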

47 Example
t0: w0[a := 100] w0[b := 20] c0
t1: r1[a] r1[b] w1[c := a+b] w1[d := a-b] c1
t2: r2[a] r2[b] w2[c := a-b] w2[d := a+b] c2
Assume t0 completed first, and t1 and t2 are executed simultaneously. –If t1 < t2, (c,d) is first (120,80) and finally (80,120) –If t2 < t1, (c,d) is first (80,120) and finally (120,80) Any other result is illegal
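The two legal outcomes can be confirmed by running t1 and t2 in both serial orders on the state left by t0. A small sketch; the dictionary representation of the database and the function names are choices made for this illustration.

```python
from itertools import permutations


def t1(db):
    a, b = db["a"], db["b"]
    db["c"], db["d"] = a + b, a - b


def t2(db):
    a, b = db["a"], db["b"]
    db["c"], db["d"] = a - b, a + b


def serial_results():
    """Final (c, d) for every serial order of t1 and t2."""
    results = {}
    for order in permutations([("t1", t1), ("t2", t2)]):
        db = {"a": 100, "b": 20}                 # state after t0
        for _, transaction in order:
            transaction(db)
        results[" < ".join(name for name, _ in order)] = (db["c"], db["d"])
    return results


print(serial_results())   # {'t1 < t2': (80, 120), 't2 < t1': (120, 80)}
```

An interleaving such as w1[c] w2[c] w2[d] w1[d] would leave (c,d) = (80,80), which matches neither serial order and is therefore illegal.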