Distributed Transactions


Distributed Transactions
What is a transaction? A sequence of server operations that must be carried out atomically.
What are the ACID properties? Atomicity, Consistency, Isolation, Durability.
What is a distributed transaction? One that involves objects managed by multiple servers communicating with one another.


Transactions
[Figure: a sequence of server operations acts on shared variables and ends in Commit / Abort; the outcome goes to the permanent record.]

Concurrency control
The goal of concurrency control is to guarantee that when multiple transactions execute concurrently, the net effect is equivalent to executing them in some serial order. This is the essence of the serializability property.

Example 1
T1 starts (20): W(x:=1) [OK], R(x) [OK], T1 commits
T2 starts (30): W(x:=2) [OK], T2 commits
T3 starts (40): W(x:=3) [OK], R(x), T3 commits
This is serializable. Think of other examples too.

Example 2
T1 starts (20): W(x:=1) [OK], R(x) [NO], T1 aborts
T2 starts (30): W(x:=2) [OK], R(x), T2 commits?
T3 starts (40): W(x:=3) [OK], T3 commits
This is not serializable.

Pitfalls in concurrency control
- Dirty read
- Lost update
- Premature write

Lost update
Initially, B = $1000.
Amy's transaction:
1. Load B into local
2. Add $250 to local
3. Store local to B
Bob's transaction:
4. Load B into local
5. Add $250 to local
6. Store local to B
What if the interleaving is 1 4 2 5 3 6? The final value of B is $1250, although it should have been $1500.
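The interleaving above can be replayed with a small simulation. This is a minimal sketch, not part of the original slides; the function name `run_schedule` and its parameters are made up for illustration.

```python
# Hypothetical replay of the lost-update example; all names are illustrative.

def run_schedule(schedule, initial=1000, deposit=250):
    """Execute steps 1-6 in the given order and return the final value of B."""
    b = initial
    local = {"amy": None, "bob": None}
    # step number -> (owner, action), following the numbering on the slide
    steps = {
        1: ("amy", "load"), 2: ("amy", "add"), 3: ("amy", "store"),
        4: ("bob", "load"), 5: ("bob", "add"), 6: ("bob", "store"),
    }
    for n in schedule:
        who, action = steps[n]
        if action == "load":
            local[who] = b
        elif action == "add":
            local[who] += deposit
        else:  # store
            b = local[who]
    return b

serial = run_schedule([1, 2, 3, 4, 5, 6])       # one transaction after the other
interleaved = run_schedule([1, 4, 2, 5, 3, 6])  # the interleaving from the slide
```

With the serial order both deposits survive, while in the interleaved order Bob's store overwrites Amy's because both loaded the same starting value.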

Dirty read
Initially, B = $1000.
Amy's transaction:
1. Load B into local
2. Add $250 to local
3. Store local to B
ABORT
Bob's transaction:
4. Load B into local
5. Add $250 to local
6. Store local to B
COMMIT
Execute the actions in the sequence 1 2 3 4 5 6. If Amy's transaction aborts after Bob executes step 4, then the final result is still $1500, although it should have been $1250.
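The same scenario can be traced in code. This is a rough sketch under the assumption that an abort simply restores the last committed value of B; the function `dirty_read_demo` is hypothetical.

```python
# Hypothetical trace of the dirty-read example; names are illustrative.

def dirty_read_demo(initial=1000, deposit=250):
    committed = initial   # last committed value of B
    b = initial           # current (possibly uncommitted) value of B

    # Amy, steps 1-3: her write is visible but not yet committed.
    amy_local = b
    amy_local += deposit
    b = amy_local         # B = 1250, written by an uncommitted transaction

    # Bob, step 4: reads Amy's uncommitted value (the dirty read).
    bob_local = b

    # Amy aborts: assumed here to restore the last committed value.
    b = committed

    # Bob, steps 5-6 and COMMIT: he still works from the dirty value.
    bob_local += deposit
    b = bob_local
    return b

final_b = dirty_read_demo()
```

Bob's result includes a deposit that officially never happened, which is why the final value is $1500 rather than $1250.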

Premature write
Initially, B = 0.
Amy's transaction:
1. B := $500
4. ABORT
Bob's transaction:
2. B := $1000
3. COMMIT
B changes to 0. This could have been avoided if the second transaction had postponed its commit until the first transaction committed or aborted.
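A short trace shows why B ends up at 0. This sketch assumes that an abort is implemented by restoring the before-image saved when the aborting transaction wrote; that mechanism is an assumption made for illustration, and `premature_write_demo` is a hypothetical name.

```python
# Hypothetical trace of the premature-write example; names are illustrative.

def premature_write_demo():
    b = 0
    # Step 1: Amy writes; the before-image is saved for a possible undo (assumed).
    amy_before_image = b
    b = 500
    # Step 2: Bob overwrites Amy's uncommitted value.
    b = 1000
    # Step 3: Bob commits, expecting B = 1000 to be permanent.
    bob_committed_value = b
    # Step 4: Amy aborts; the undo restores her before-image.
    b = amy_before_image
    return b, bob_committed_value

final_b, bob_expected = premature_write_demo()
```

Bob committed $1000, yet the stored value is 0 after Amy's abort, so a committed write has been lost.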

Locks
Locks are commonly used to implement serializability of concurrent transactions. Two operations on the same shared object are in conflict when at least one of them is a write operation. Each transaction must acquire the corresponding exclusive lock before executing an action. Locks can be fine grained. Note that there is no conflict between two reads.
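The conflict rule can be written as a small predicate. This is a sketch only: the tuple layout `(transaction, kind, object)` is an assumption, as is the extra condition that the two operations belong to different transactions.

```python
# Hypothetical conflict test; an operation is (transaction, kind, object),
# where kind is "R" for a read and "W" for a write.

def conflicts(op1, op2):
    """Return True if the two operations conflict."""
    t1, kind1, obj1 = op1
    t2, kind2, obj2 = op2
    # Same object, different transactions (assumed), and at least one write.
    return t1 != t2 and obj1 == obj2 and "W" in (kind1, kind2)

read_read = conflicts(("T1", "R", "x"), ("T2", "R", "x"))
read_write = conflicts(("T1", "R", "x"), ("T2", "W", "x"))
```

Two reads never conflict, so they could proceed concurrently, while any pair involving a write on the same object has to be ordered.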

Serializability
The serialization graph is a directed graph (V, E), where V is the set of transactions and E is the set of directed edges between transactions. A directed edge from a transaction Tj to a transaction Tk (Tj → Tk) means that Tk acquired a lock only after Tj released the corresponding lock.

Serializability theorem
For a set of concurrent transactions, the serializability property holds if and only if the corresponding serialization graph is acyclic. [Proved by Bernstein, Goodman, Hadzilacos in 1987]
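The theorem suggests a direct test: build the serialization graph and look for a cycle. The following is a minimal sketch using depth-first search; the edge-list representation and the function names are assumptions made for illustration.

```python
# Hypothetical serializability check: edges are (Tj, Tk) pairs meaning Tj -> Tk.

def has_cycle(edges):
    """Return True if the directed graph given by the edge list has a cycle."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set())

    WHITE, GRAY, BLACK = 0, 1, 2   # unvisited, on the current DFS path, finished
    color = {v: WHITE for v in graph}

    def visit(v):
        color[v] = GRAY
        for w in graph[v]:
            if color[w] == GRAY:               # back edge: a cycle exists
                return True
            if color[w] == WHITE and visit(w):
                return True
        color[v] = BLACK
        return False

    return any(color[v] == WHITE and visit(v) for v in graph)

def is_serializable(edges):
    """Serializable if and only if the serialization graph is acyclic."""
    return not has_cycle(edges)

acyclic = is_serializable([("T1", "T2"), ("T2", "T3")])
cyclic = is_serializable([("T1", "T2"), ("T2", "T3"), ("T3", "T1")])
```

For an acyclic graph, any topological order of the transactions is an equivalent serial order.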

Two-phase locking (2PL)
Phase 1. Acquire all locks needed to execute the transaction. The locks are acquired one after another, and this phase is called the growing phase or acquisition phase.
Phase 2. Release all locks acquired so far. This is called the shrinking phase or the release phase.
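The two-phase rule can be enforced by a simple guard on each transaction. This sketch only checks the rule (no acquire after the first release); it does not model a lock manager or blocking, and the class `TwoPhaseTransaction` is hypothetical.

```python
# Hypothetical guard that enforces the 2PL rule for one transaction.

class TwoPhaseTransaction:
    def __init__(self, name):
        self.name = name
        self.held = set()        # locks currently held
        self.shrinking = False   # becomes True at the first release

    def acquire(self, lock):
        """Growing phase: allowed only while nothing has been released."""
        if self.shrinking:
            raise RuntimeError(
                f"{self.name}: cannot acquire {lock!r} after a release (2PL violation)"
            )
        self.held.add(lock)

    def release(self, lock):
        """Shrinking phase: starts with the first release."""
        self.held.remove(lock)
        self.shrinking = True

t1 = TwoPhaseTransaction("T1")
t1.acquire("x")
t1.acquire("y")
t1.release("x")   # from here on, t1.acquire(...) raises RuntimeError
```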

Two-phase locking (2PL)
[Figure: locks are acquired during the growing phase and released during the shrinking phase.]

2PL
Theorem. 2PL guarantees serializability.
Proof. Suppose that the theorem is not correct. Then the serialization graph must contain a cycle … Tj → Tk → … → Tm → Tj … This implies that Tj must have released a lock (that was later acquired by Tk) and then acquired a lock (that was released by Tm). However, this violates the condition of two-phase locking, which rules out any locking once a lock has been released.

Atomic Commit Protocols
[Figure: a network of servers S1, S2, S3; servers may crash.]
The initiator of a transaction is called the coordinator, and the remaining servers are participants.

Requirements of Atomic Commit Protocols
Termination. All non-faulty servers must eventually reach an irrevocable decision.
Agreement. If any server decides to commit, then every server must have voted to commit (i.e. no one voted abort).
Validity. If all servers vote commit and there is no failure, then all servers must commit (as opposed to all deciding to abort).
Irreversibility. Each participant decides at most once (i.e. the decision is not reversible).

One-phase Commit
[Figure: a client contacts the coordinator server, which communicates with the participant servers.]
If a participant deadlocks or faces a local problem, then the coordinator may never find out about it. Too simplistic.

Two-phase commit (2PC)
Phase 1: The coordinator sends VOTE to the participants and receives yes / no from them.
Phase 2:
if ∀ server j: vote(j) = yes → multicast COMMIT to all servers
[] ∃ server j: vote(j) = no → multicast ABORT to all servers
fi
What if failures occur?
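The failure-free protocol fits in a few lines. This is a minimal sketch that models participants as local callables returning their vote; it ignores messaging, timeouts, and crashes, and all function names are hypothetical.

```python
# Hypothetical failure-free 2PC; participants are callables that return a vote.

def coordinator_decision(votes):
    """Phase 2 rule: COMMIT only if every vote is "yes", otherwise ABORT."""
    return "COMMIT" if all(v == "yes" for v in votes.values()) else "ABORT"

def two_phase_commit(participants):
    """participants maps a server name to a zero-argument vote function."""
    # Phase 1: send VOTE and collect yes / no from every participant.
    votes = {name: vote() for name, vote in participants.items()}
    # Phase 2: multicast the single decision to all servers.
    decision = coordinator_decision(votes)
    return {name: decision for name in participants}

all_yes = two_phase_commit({"S1": lambda: "yes", "S2": lambda: "yes"})
one_no = two_phase_commit({"S1": lambda: "yes", "S2": lambda: "no"})
```

A single "no" is enough to abort everywhere, which is what makes the outcome atomic across servers.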

Failure scenarios in 2PC (Phase 1)
Fault: the coordinator did not receive YES / NO, or a participant did not receive VOTE.
Solution: broadcast ABORT; abort local transactions.

Failure scenarios in 2PC (Phase 2)
Fault: a participant does not receive a COMMIT or an ABORT message from the coordinator (the coordinator may have crashed after sending COMMIT or ABORT to only some of the servers). The participant remains undecided until the coordinator is repaired and reinstalled into the system. This blocking is a known weakness of 2PC.

Coping with blocking in 2PC
A non-faulty participant can ask the other participants which message (COMMIT or ABORT) they received from the coordinator, and act accordingly. But what if no non-faulty participant* received anything? Who knows whether the coordinator committed or aborted the local transaction before crashing? The participant must continue to wait.
*Some participant may have received COMMIT/ABORT, but it crashed.
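The peer-query step can be sketched as follows. This is an illustration only: replies are modeled as a plain list, with `None` standing for a peer that received nothing or could not be reached, and `resolve_by_asking_peers` is a hypothetical name.

```python
# Hypothetical peer query for an undecided 2PC participant.

def resolve_by_asking_peers(peer_replies):
    """Return the coordinator's decision if any peer knows it, else None.

    peer_replies holds "COMMIT", "ABORT", or None for each peer asked.
    """
    for reply in peer_replies:
        if reply in ("COMMIT", "ABORT"):
            return reply   # the coordinator sent one decision, so adopt it
    return None            # nobody knows: the participant stays blocked

resolved = resolve_by_asking_peers([None, "COMMIT", None])
blocked = resolve_by_asking_peers([None, None])
```

The `None` result corresponds to the case on the slide where the participant has no choice but to keep waiting for the coordinator.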

Non-blocking Atomic Commit
A blocking protocol has the potential to prevent non-faulty participants from reaching a final decision. A solution to the atomic commitment problem is called non-blocking if, in spite of server crashes, every non-faulty participant eventually decides. One solution is to impose the requirement of uniform agreement.

Uniform agreement
If any participant (faulty or not) delivers a message m (commit or abort), then all correct processes eventually deliver m. To implement uniform agreement, no server should deliver a COMMIT or ABORT message until it has relayed it to all other servers. If a process times out in phase 2, then it decides abort.

Recovery: Stable storage
Stable storage creates the illusion of an incorruptible storage, even if a writer or a disk crashes at any time. The implementation uses at least two independent disks.
[Figure: two disks, A0 and A1, with update and inspect operations.]

Stable storage
To write, do the following:
1. Copy the data to disk A0.
2. Record timestamp T0.
3. Compute checksum S0.
4. Copy the data to disk A1.
5. Record timestamp T1.
6. Compute checksum S1.
Readers check four cases to decide which copy to accept:
- Both checksums OK and T1 > T0: accept either copy.
- Both checksums OK and T1 < T0: accept A0 (failure between steps 3 and 4).
- Checksum on A1 wrong: accept A0.
- Checksum on A0 wrong: accept A1.
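The write procedure and the four reader cases can be sketched with two in-memory "disks". Several details here are assumptions made for illustration: the disks are dictionary entries, timestamps come from a logical counter, the checksum is CRC-32 over the data only, and the `crash_after_a0` flag simulates a failure between steps 3 and 4.

```python
import itertools
import zlib

# Hypothetical two-disk stable storage; all names are illustrative.
_clock = itertools.count(1)   # logical timestamps (assumed monotonic)

def _record(data):
    return {"data": data, "ts": next(_clock), "sum": zlib.crc32(data)}

def stable_write(disks, data, crash_after_a0=False):
    disks["A0"] = _record(data)   # steps 1-3
    if crash_after_a0:
        return                    # simulated failure between steps 3 and 4
    disks["A1"] = _record(data)   # steps 4-6

def _checksum_ok(rec):
    return zlib.crc32(rec["data"]) == rec["sum"]

def stable_read(disks):
    a0, a1 = disks["A0"], disks["A1"]
    if _checksum_ok(a0) and _checksum_ok(a1):
        if a1["ts"] > a0["ts"]:
            return a1["data"]     # complete write: either copy is fine
        return a0["data"]         # A1 is stale: the write stopped after A0
    if _checksum_ok(a0):
        return a0["data"]         # checksum on A1 wrong
    if _checksum_ok(a1):
        return a1["data"]         # checksum on A0 wrong
    raise IOError("both copies are corrupted")

disks = {}
stable_write(disks, b"v1")
after_full_write = stable_read(disks)
stable_write(disks, b"v2", crash_after_a0=True)
after_partial_write = stable_read(disks)
```

Because A0 is always written first, a reader never sees a state in which both copies are damaged by a single crash.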

Checkpointing
A mechanism for (backward) error recovery. Transaction states are periodically stored on stable storage. Following a failure, the transaction rolls back to the nearest checkpoint. Checkpointing can be independent (unsynchronized) or coordinated (synchronized).
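Rolling back to the nearest checkpoint can be sketched for a single process. This is an illustration under simplifying assumptions: an in-memory list stands in for stable storage, checkpoint times are increasing, and "nearest" is read as the latest checkpoint at or before the failure. The class `CheckpointLog` is hypothetical.

```python
import bisect
import copy

# Hypothetical checkpoint log for one process; names are illustrative.

class CheckpointLog:
    def __init__(self):
        self._times = []    # checkpoint times, assumed to be increasing
        self._states = []   # saved state for each checkpoint

    def save(self, time, state):
        """Store a deep copy of the state (stands in for stable storage)."""
        self._times.append(time)
        self._states.append(copy.deepcopy(state))

    def rollback(self, failure_time):
        """Return (time, state) of the latest checkpoint at or before the failure."""
        i = bisect.bisect_right(self._times, failure_time)
        if i == 0:
            raise LookupError("no checkpoint before the failure")
        return self._times[i - 1], copy.deepcopy(self._states[i - 1])

log = CheckpointLog()
log.save(10, {"B": 1000})
log.save(20, {"B": 1250})
restored = log.rollback(failure_time=25)
```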

Classification of checkpointing
Coordinated checkpointing takes a consistent snapshot, at the cost of some overhead. Uncoordinated checkpointing appears to have no overhead, but it may have some efficiency problems.