Contents Transactions that Read Uncommitted Data View Serializability Resolving Deadlocks Distributed Databases Long-Duration Transactions
The Dirty-Data Problem Data is “dirty” if it has been written by a transaction that is not yet committed. Example 1: the value of A written by T1 is dirty data; T2’s reading of A leaves the database in an inconsistent state.
Example 1 (A = B = 25 initially):
T1: l1(A); r1(A); A := A+100; w1(A); l1(B); u1(A)   -- A is now 125 (dirty)
T2: l2(A); r2(A); A := A*2; w2(A); u2(A)            -- A is now 250
T2: l2(B) Denied
T1: r1(B); Abort; u1(B)
T2: l2(B); r2(B); B := B*2; w2(B); u2(B)            -- B is now 50
T1 writes dirty data and then aborts; T2 has read the dirty value of A, leaving A = 250 but B = 50, an inconsistent state.
Example 2 (timestamp scheduling; ts(T1) = 200, ts(T2) = 150, ts(T3) = 175; A, B, C start with RT = 0, WT = 0):
w2(B)  -- WT(B) = 150
r1(B)  -- T1 reads the uncommitted value of B
r2(A)  -- RT(A) = 150
r3(C)  -- RT(C) = 175
w2(C)  -- Denied: RT(C) = 175 > ts(T2); T2 aborts, and WT(B) is reset to 0
w3(A)  -- WT(A) = 175
T1 has read dirty data from T2 and must abort when T2 does.
Cascading Rollback When a transaction T aborts, we must find each transaction U that read dirty data from T, abort U, find any transaction V that read dirty data from U, abort V, and so on. Both a timestamp-based scheduler with a commit bit and a validation-based scheduler avoid cascading rollback.
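The cascading-rollback rule above amounts to a reachability computation over a "read dirty data from" relation. A minimal sketch in Python (the relation encoding is invented for illustration):

```python
def cascade_abort(aborted, dirty_reads):
    """dirty_reads maps a transaction to the set of transactions
    that read uncommitted data it wrote; return every transaction
    that must be rolled back when `aborted` aborts."""
    to_abort = {aborted}
    frontier = [aborted]
    while frontier:
        t = frontier.pop()
        for u in dirty_reads.get(t, ()):   # U read dirty data from t
            if u not in to_abort:
                to_abort.add(u)
                frontier.append(u)
    return to_abort

# T2 read dirty data from T1, and T3 from T2, so aborting T1 cascades:
print(cascade_abort("T1", {"T1": {"T2"}, "T2": {"T3"}}))
```

Each abort is added once and explored once, so the whole cascade is found in time linear in the relation's size.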
Managing Rollbacks Strict Locking: a transaction must not release any write locks (or other locks, such as increment locks, that allow values to be changed) until the transaction has either committed or aborted, and the commit or abort log record has been flushed to disk. A schedule of transactions that obey the strict locking rule is called recoverable.
Motivating example: schedule Q = r 1 (A) w 2 (A) w 1 (A) w 3 (A)
P(Q) has arcs T 1 → T 2 (r 1 (A) before w 2 (A)) and T 2 → T 1 (w 2 (A) before w 1 (A)), plus arcs into T 3 — a cycle, so Q is not conflict serializable!
But now compare Q to Ss, the serial schedule T 1, T 2, T 3:
Q:  r 1 (A) w 2 (A) w 1 (A) w 3 (A)
Ss: r 1 (A) w 1 (A) w 2 (A) w 3 (A)
T 1 reads the same thing in Q and Ss; T 2 and T 3 read the same thing (nothing). After Q or Ss, the DB is left in the same state. So what is wrong with Q?
Definition Schedules S 1, S 2 are View Equivalent if:
(1) if in S 1 w j (A) → r i (A), then in S 2 w j (A) → r i (A);
(2) if in S 1 r i (A) reads the initial DB value, then in S 2 r i (A) also reads the initial DB value;
(3) if in S 1 T i does the last write on A, then in S 2 T i also does the last write on A.
Here w j (A) → r i (A) means “r i (A) reads the value produced by w j (A)”.
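The three conditions can be checked mechanically. A small sketch, assuming each transaction reads or writes a given element at most once, with schedules encoded as (op, txn, elem) triples:

```python
def view_profile(schedule):
    """Reads-from map (None = initial DB value) and final writer
    per element, identifying a read by its (txn, elem) pair."""
    last_writer, rf, final = {}, {}, {}
    for op, txn, elem in schedule:
        if op == "r":
            rf[(txn, elem)] = last_writer.get(elem)  # conditions (1) and (2)
        else:
            last_writer[elem] = txn
            final[elem] = txn                        # condition (3)
    return rf, final

def view_equivalent(s1, s2):
    return view_profile(s1) == view_profile(s2)

# Q = r1(A) w2(A) w1(A) w3(A) vs. the serial schedule Ss = T1 T2 T3:
Q  = [("r", "T1", "A"), ("w", "T2", "A"), ("w", "T1", "A"), ("w", "T3", "A")]
Ss = [("r", "T1", "A"), ("w", "T1", "A"), ("w", "T2", "A"), ("w", "T3", "A")]
print(view_equivalent(Q, Ss))  # True
```

Conditions (1) and (2) collapse into one reads-from map, with None standing for the initial DB value; condition (3) is the final-writer map.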
Definition Schedule S 1 is View Serializable if it is view equivalent to some serial schedule
Lemma Conflict Serializable ⇒ View Serializable. Proof: swapping non-conflicting actions changes neither what transactions read nor the final DB state.
Venn Diagram: conflict serializable schedules ⊂ view serializable schedules ⊂ all schedules.
Note: all view serializable schedules that are not conflict serializable involve a useless write, e.g. S = w 2 (A) … w 3 (A) … with no reads of A in between.
How do we test for view-serializability? P(S) is not good enough … (see schedule Q).
One problem: some swaps involving conflicting actions are OK. E.g. in S = … w 2 (A) … r 1 (A) … w 3 (A) … w 4 (A) …, the write w 3 (A) can move, even though it conflicts with w 4 (A), because the later write w 4 (A) exists and no read intervenes.
Another problem: useless writes. S = … w 2 (A) … w 1 (A) … with no reads of A afterward.
To check if S is View Serializable:
(1) Add a final transaction T f that reads all DB elements (eliminates condition 3 of the V-S definition). E.g.: S = … w 1 (A) … w 2 (A) … becomes S = … w 1 (A) … w 2 (A) … r f (A), so the last write of A is now the write that r f (A) reads.
(2) Add an initial transaction T b that writes all DB elements (eliminates condition 2 of the V-S definition). E.g.: S = … r 1 (A) … w 2 (A) … becomes S = w b (A) … r 1 (A) … w 2 (A) …, so a read of the initial value is now a read of w b ’s value.
(3) Create the labeled precedence graph LP(S):
(3a) if w i (A) → r j (A) in S, add the arc T i →0 T j.
(3b) For each w i (A) → r j (A), consider every other w k (A) (T k ≠ T b):
– if T i ≠ T b and T j ≠ T f, insert the arc pair T k →p T i, T j →p T k, for some new label p;
– if T i = T b and T j ≠ T f, insert T j →0 T k;
– if T i ≠ T b and T j = T f, insert T k →0 T i.
(4) Check whether there is some selection from each labeled arc pair that makes LP(S) acyclic (if so, S is V-S).
Example: check if Q is V-S. Q = r 1 (A) w 2 (A) w 1 (A) w 3 (A); Q’ = w b (A) r 1 (A) w 2 (A) w 1 (A) w 3 (A) r f (A). Rule 3(a) adds the 0-arcs T b → T 1 and T 3 → T f; rule 3(b) adds the 0-arcs T 1 → T 2, T 1 → T 3, and T 2 → T 3. LP(S) is acyclic!! S is V-S.
Another example: Z = w b (A) r 1 (A) w 2 (A) r 3 (A) w 1 (A) w 3 (A) r f (A). Rule 3(a) adds the 0-arcs T b → T 1, T 2 → T 3, and T 3 → T f; rule 3(b) adds the 0-arcs T 1 → T 2, T 1 → T 3, T 2 → T 3 and the labeled pair T 1 →1 T 2 / T 3 →1 T 1. Do not pick T 3 → T 1 from the “1” pair (it would close a cycle); with T 1 →1 T 2 chosen, LP(Z) is acyclic, so Z is V-S (equivalent to T b T 1 T 2 T 3 T f).
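For small schedules, view serializability can also be tested by brute force: try every serial order and compare reads-from and final-write information. A sketch, exponential in the number of transactions (unlike the labeled precedence graph test), assuming each transaction reads or writes an element at most once:

```python
from itertools import permutations

def view_profile(schedule):
    """Reads-from map (None = initial DB value) and final writer per element."""
    last_writer, rf, final = {}, {}, {}
    for op, txn, elem in schedule:
        if op == "r":
            rf[(txn, elem)] = last_writer.get(elem)
        else:
            last_writer[elem] = txn
            final[elem] = txn
    return rf, final

def view_serializable(schedule):
    """Return a view-equivalent serial order of transactions, or None."""
    txns = list(dict.fromkeys(t for _, t, _ in schedule))
    for order in permutations(txns):
        serial = [a for t in order for a in schedule if a[1] == t]
        if view_profile(serial) == view_profile(schedule):
            return order
    return None

# Z from the example above, without the added Tb and Tf:
Z = [("r", "T1", "A"), ("w", "T2", "A"), ("r", "T3", "A"),
     ("w", "T1", "A"), ("w", "T3", "A")]
print(view_serializable(Z))  # ('T1', 'T2', 'T3')
```

The returned order matches the serial schedule T1 T2 T3 found by the graph test.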
Deadlock Detection Build the waits-for graph from the lock table structures, either incrementally or periodically. When a cycle is found, roll back a victim.
The Waits-For Graph In the waits-for graph there is an arc from node (transaction) T to node U if there is some database element A such that: U holds a lock on A; T is waiting for a lock on A; and T cannot get a lock on A in its desired mode unless U first releases its lock on A. If there are no cycles in the waits-for graph, then each transaction can eventually complete. If there is a cycle, then no transaction in the cycle can ever make progress, so there is a deadlock.
Beginning of a schedule with a deadlock:
1) l1(A); r1(A)
2) l2(C); r2(C)
3) l3(B); r3(B)
4) l4(D); r4(D)
5) l2(A); Denied
6) l3(C); Denied
7) l4(A); Denied
8) l1(B); Denied
Waits-for graph with a cycle caused by step (8): arcs T2 → T1, T3 → T2, T4 → T1, and T1 → T3, giving the cycle T1 → T3 → T2 → T1. After T1 is rolled back, its arcs disappear (T3 → T2 remains) and the graph is acyclic.
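Cycle detection in the waits-for graph is a plain depth-first search. A sketch using the arcs from the example schedule:

```python
def has_deadlock(waits_for):
    """DFS cycle detection; waits_for maps each transaction to the
    set of transactions it is waiting for."""
    visiting, done = set(), set()

    def dfs(t):
        visiting.add(t)
        for u in waits_for.get(t, ()):
            # Back edge into the current DFS path means a cycle (deadlock).
            if u in visiting or (u not in done and dfs(u)):
                return True
        visiting.discard(t)
        done.add(t)
        return False

    return any(t not in done and dfs(t) for t in waits_for)

# Arcs from the schedule above: T2 -> T1, T3 -> T2, T4 -> T1, T1 -> T3
g = {"T1": {"T3"}, "T2": {"T1"}, "T3": {"T2"}, "T4": {"T1"}}
print(has_deadlock(g))   # True: cycle T1 -> T3 -> T2 -> T1

# After the victim T1 is rolled back, its node and arcs disappear:
g2 = {"T2": set(), "T3": {"T2"}, "T4": set()}
print(has_deadlock(g2))  # False
```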
Deadlock Prevention By Resource Ordering Order all elements A 1, A 2, …, A n. Every transaction is required to request locks on elements in that order. Problem: ordered lock requests are not realistic in most cases.
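A sketch of resource ordering with Python's threading locks; the elements A, B, C are hypothetical, and the global order is simply the sorted element name:

```python
import threading

# One lock per database element, ordered by name.
locks = {name: threading.Lock() for name in ("A", "B", "C")}

def lock_in_order(needed):
    """Acquire the needed locks in the fixed global order. Deadlock is
    impossible: no transaction holds a later element while waiting for
    an earlier one, so the waits-for graph cannot have a cycle."""
    held = sorted(needed)
    for name in held:
        locks[name].acquire()
    return held

def unlock(held):
    for name in reversed(held):
        locks[name].release()

held = lock_in_order({"C", "A"})  # acquires A before C, whatever order was asked
print(held)  # ['A', 'C']
unlock(held)
```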
Timeout If a transaction waits more than L seconds, roll it back! A simple scheme, but hard to select L.
Wait-die Transactions are given a timestamp ts(T i ) when they arrive. T i can only wait for T j if ts(T i ) < ts(T j ); else T i dies (rolls back).
Example: T 1 (ts = 10) waits for T 2 (ts = 20), since 10 < 20. Can T 2 (ts = 20) wait for T 3 (ts = 25)? Yes (20 < 25); but T 3 could not wait for T 2 (25 > 20): T 3 would die.
Wound-wait Transactions are given a timestamp ts(T i ) when they arrive. T i wounds T j if ts(T i ) < ts(T j ); else T i waits. “Wound”: T j rolls back and gives its lock to T i.
Example: T 1 (ts = 25) requests a lock held by T 2 (ts = 20): T 1 is younger, so T 1 waits. T 2 (ts = 20) requests a lock held by T 3 (ts = 10): T 2 is younger, so T 2 waits. (An older requester would instead wound the holder.)
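Both policies above reduce to a comparison of timestamps. A minimal sketch of the two decision rules (smaller timestamp = older transaction):

```python
def wait_die(ts_requester, ts_holder):
    """Wait-die: the requester may wait only for a younger holder."""
    return "wait" if ts_requester < ts_holder else "die"

def wound_wait(ts_requester, ts_holder):
    """Wound-wait: an older requester wounds (rolls back) a younger holder."""
    return "wound" if ts_requester < ts_holder else "wait"

print(wait_die(10, 20))    # 'wait'  : older requester waits
print(wait_die(25, 20))    # 'die'   : younger requester rolls back itself
print(wound_wait(10, 20))  # 'wound' : older requester rolls back the holder
print(wound_wait(25, 20))  # 'wait'  : younger requester waits
```

In both schemes the rolled-back transaction keeps its original timestamp when restarted, so it eventually becomes the oldest and cannot starve.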
Comparison of Deadlock Management Both wound-wait and wait-die are easier to implement than the waits-for graph method. The waits-for graph method aborts transactions only when there is a deadlock. However, either wound-wait or wait-die will sometimes roll back a transaction when there was no deadlock.
Distributed Databases A distributed database system: several sites, each with its own data and DBMS, connected by a network.
Advantages of a DDBS Speedy queries by parallelism; fault tolerance by data replication. The price: increased complexity and communication cost.
Data Distribution A bank with many branches A chain store with many individual stores A digital library with a consortium of universities
Partitioning a relation among many sites Horizontal Decomposition Vertical Decomposition
Parallelism: Pipelining Example: T 1 = SELECT * FROM A WHERE cond; T 2 = JOIN T 1 and B (B with an index). The tuples selected from A are pipelined into the join with B.
Parallelism: Concurrent Operations Example: SELECT * FROM A WHERE cond, with A partitioned by ranges: select on the fragment with A.x < 10, on 10 ≤ A.x < 20, and on 20 ≤ A.x in parallel, then merge the results. Data location is important...
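The partition-select-merge plan above can be sketched in memory; the table contents, range bounds, and selection condition are invented for illustration:

```python
# Hypothetical relation A with a single attribute x.
table = [{"x": v} for v in (3, 12, 25, 7, 21)]

# Horizontal fragments by ranges of A.x, one per site (as in the slide):
sites = {
    "s1": [r for r in table if r["x"] < 10],
    "s2": [r for r in table if 10 <= r["x"] < 20],
    "s3": [r for r in table if 20 <= r["x"]],
}

def local_select(rows, cond):
    """Each site evaluates the selection on its own fragment only."""
    return [r for r in rows if cond(r)]

cond = lambda r: r["x"] % 2 == 1      # hypothetical WHERE condition
result = [r for frag in sites.values() for r in local_select(frag, cond)]
print(sorted(r["x"] for r in result))  # [3, 7, 21, 25]
```

The merge step is a simple union because the fragments are disjoint.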
Join Processing Example: JOIN A, B over attribute X, with A partitioned into A 1 (A.x < 10) and A 2 (A.x ≥ 10), and B into B 1 (B.x < 10) and B 2 (B.x ≥ 10). Join strategy: both relations are partitioned on the join attribute, so only A 1 ⋈ B 1 and A 2 ⋈ B 2 need to be computed.
Join Processing Example: JOIN A, B over attribute X, but now with A partitioned into A 1 (A.z < 10) and A 2 (A.z ≥ 10), and B into B 1 (B.z < 10) and B 2 (B.z ≥ 10). Join strategy: the partitioning attribute z differs from the join attribute, so every A i must be joined with every B j.
Data Replication Benefits: fault tolerance and query speedup. Some problems: 1. how to keep copies identical; 2. how to place copies properly; 3. how to handle communication failure.
Distributed Transactions The transaction has components at different sites, each with its own local scheduler and logger.
Distributed Commit -- Executing Atomically Component T0 at the office acts as coordinator for components T1, …, Ti, …, Tn at Store 1, …, Store i, …, Store n, exchanging messages and reports.
Two-Phase Commit Phase One: A coordinator component polls the components whether to commit or abort. Phase Two: The coordinator tells the components to commit if and only if all have expressed a willingness to commit.
Each site logs actions at that site, but there is no global log. One site, called the coordinator, plays a special role in deciding whether or not the distributed transaction can commit. The two-phase commit protocol involves sending certain messages between the coordinator and the other sites. As each message is sent, it is logged at the sending site, to aid in recovery should it be necessary.
2PC: ATM Withdrawal The mainframe is the coordinator. Phase 1: the ATM checks if money is available; the mainframe checks if the account has funds (money and funds are “reserved”). Phase 2: the ATM releases the money; the mainframe debits the account.
Messages in phase 1 of two-phase commit: the coordinator logs <Prepare T> and sends prepare to each site; each site replies ready or don’t commit, after logging <Ready T> or <Don’t commit T>.
Messages in phase 2 of two-phase commit: the coordinator logs <Commit T> or <Abort T> and sends commit or abort to each site; each site then logs <Commit T> or <Abort T>.
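The two phases can be sketched as follows; sites are modeled as vote functions, logs as lists of strings, and the record names (<Prepare T>, <Ready T>, …) are the conventional 2PC log records:

```python
def two_phase_commit(sites):
    """sites maps a site name to a vote function returning True (ready)
    or False (don't commit); returns the coordinator's decision."""
    coord_log, site_logs = ["<Prepare T>"], {name: [] for name in sites}
    # Phase 1: poll every component; each logs its vote before replying.
    votes = {}
    for name, vote in sites.items():
        votes[name] = vote()
        site_logs[name].append("<Ready T>" if votes[name] else "<Don't commit T>")
    # Phase 2: commit iff every component was willing to commit.
    decision = "<Commit T>" if all(votes.values()) else "<Abort T>"
    coord_log.append(decision)
    for name, ready in votes.items():
        if ready:  # a site that voted "don't commit" has already aborted
            site_logs[name].append(decision)
    return decision

print(two_phase_commit({"atm": lambda: True, "bank": lambda: True}))
# <Commit T>
print(two_phase_commit({"atm": lambda: True, "bank": lambda: False}))
# <Abort T>
```

A single "don't commit" vote forces the whole transaction to abort, which is exactly the if-and-only-if rule of phase two.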
Recovery of a Distributed Transaction at a site: if the last log record about T is <Commit T> or <Abort T>, the decision is known; if it is <Ready T>, the site must ask the coordinator or other sites for the decision; if there is no log record about T, the site can decide to abort T.
Coordinator Failure Wait for it to recover, or elect a new coordinator and poll all the sites:
(1) If some site has <Commit T>, commit T.
(2) If some site has <Abort T>, abort T.
(3) If no site has <Commit T> or <Abort T>, and at least one site does not have <Ready T>, it is safe to abort T.
(4) If there is no <Commit T> or <Abort T>, but every surviving site has <Ready T>, the sites must wait until the original coordinator recovers.
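Rules (1) through (4) can be sketched as a decision function over the log records found at the surviving sites, using the conventional record names <Commit T>, <Abort T>, <Ready T>:

```python
def recovery_decision(site_logs):
    """site_logs maps each surviving site to the set of log records
    it holds about T; apply rules (1)-(4) in order."""
    seen = set().union(*site_logs.values())
    if "<Commit T>" in seen:
        return "commit"                      # rule (1)
    if "<Abort T>" in seen:
        return "abort"                       # rule (2)
    if any("<Ready T>" not in recs for recs in site_logs.values()):
        return "abort"                       # rule (3): T cannot have committed
    return "wait for coordinator"            # rule (4): must block

print(recovery_decision({"s1": {"<Ready T>"}, "s2": set()}))
# abort
print(recovery_decision({"s1": {"<Ready T>"}, "s2": {"<Ready T>"}}))
# wait for coordinator
```

The blocking case in rule (4) is the well-known weakness of two-phase commit: the coordinator may have decided either way before failing.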
Distributed Locking --Executing Serializably Locking Replicated Elements Centralized Lock Systems Primary-Copy Locking Global Locks From Local Locks
A Cost Model for Distributed Locking Algorithms Assign one component of the transaction as the lock coordinator, which gathers all the locks the transaction needs. It locks data elements at its own site without messages, and data elements at another site with three messages: request, grant, and release.
Locking Replicated Elements Global locks on an element must be obtained through locks on one or more replicas.
Centralized Lock Systems Designate one site, the lock site, to maintain a lock table for logical elements, whether or not they have copies at that site. When a transaction wants a lock on logical element X, it sends a request to the lock site, which grants or denies the lock, as appropriate.
Primary-Copy Locking Each logical element X has one of its copies designated the “primary copy”. A transaction sends a request to the site of the primary copy of X to get a lock on X. The site of the primary copy maintains an entry for X in its lock table and grants or denies the request as appropriate.
Global Locks From Local Locks Let n be the number of copies of a database element A.
1. s is the number of copies of A that must be locked in shared mode in order for a transaction to have a global shared lock on A.
2. x is the number of copies of A that must be locked in exclusive mode in order for a transaction to have a global exclusive lock on A.
If 2x > n, there can be only one global exclusive lock on A at a time. If s + x > n, there cannot be both a global shared and a global exclusive lock on A.
Read-Locks-One; Write-Locks-All (s = 1, x = n): a global read lock is obtained with a read lock on any one copy, while a global write lock requires write locks on every copy. Majority Locking (s = x = ⌊n/2⌋ + 1): require a read- or write-lock on a majority of the replicas to obtain a global lock.
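Both schemes are instances of the quorum conditions above, which a few lines of code can check:

```python
def quorums_safe(n, s, x):
    """True when a global exclusive lock conflicts with every other
    global lock: 2x > n (no two writers) and s + x > n (no reader
    concurrent with a writer)."""
    return 2 * x > n and s + x > n

print(quorums_safe(5, 1, 5))  # True : read-locks-one / write-locks-all
print(quorums_safe(5, 3, 3))  # True : majority locking
print(quorums_safe(5, 2, 3))  # False: a reader and a writer could coexist
```

The failing case shows why both inequalities are needed: with s = 2 and x = 3 out of n = 5 copies, a shared quorum and an exclusive quorum can be disjoint.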
Long-Duration Transactions A long transaction is one that takes too long to be allowed to hold locks that another transaction needs.
Three Applications Involving Long Transactions Conventional DBMS Applications Design Systems Workflow Systems
Workflow diagram for a traveler requesting expense reimbursement, with actions A1–A6: create travel report, give to assistant, assistant approval, reserve money, department authorization, corporate approval, write check. The flow runs from Start to Complete when money is available and each step approves; “not enough” money or a deny at any approval step leads to Abort.
Sagas A saga is a collection of actions that together form a long-duration “transaction”.
Concurrency control for sagas is managed by two facilities: (1) each action may itself be considered a (short) transaction that, when executed, uses a conventional concurrency-control mechanism such as locking; (2) the overall transaction is managed through the mechanism of “compensating transactions”, which are inverses of the transactions at the nodes of the saga.
Compensating Transactions These undo the effects of committed actions on the database state. If a saga execution leads to the Abort node, we roll back the saga by executing the compensating transaction for each executed action, in the reverse order of those actions.
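Saga rollback can be sketched as follows; the reservation and charge steps and their compensations are invented for illustration:

```python
def run_saga(actions, compensations):
    """Run the named actions in order; on failure, run the compensating
    transactions for the completed actions in reverse order."""
    done = []
    for name, action in actions:
        try:
            action()
            done.append(name)
        except Exception:
            for finished in reversed(done):
                compensations[finished]()
            return "aborted", done
    return "committed", done

log = []

def charge():
    raise RuntimeError("card declined")  # hypothetical failing step

status = run_saga(
    [("reserve", lambda: log.append("reserve")), ("charge", charge)],
    {"reserve": lambda: log.append("cancel reservation")},
)
print(status, log)
# ('aborted', ['reserve']) ['reserve', 'cancel reservation']
```

Note that the compensation is a new forward action that cancels the reservation's effect, not an undo of log records: by the time it runs, the reserve step has already committed.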
Exercises for Storage Management Ex 2.2.1 Ex 2.2.2 Ex 2.6.7 Ex 3.2.2 Ex 3.3.4 Ex 4.1.2 Ex 4.3.1 Ex 4.4.6 Ex 5.2.7 Ex 5.4.2
Exercises for Query Processing Ex 6.1.6 (a)(d) Ex 6.5.3 Ex 6.6.2 Ex 6.7.2 Ex 6.8.1 Ex 7.1.3 Ex 7.4.1 (c), (d), (e), Ex 7.5.1 Ex 7.6.1 Ex 7.7.1 (b), (c)
Exercises for Transaction Management Ex 8.2.7 (a), (e) Ex 8.3.3 Ex 8.4.5 (c), (d) Ex 9.2.1 Ex 9.8.2 (b) Ex 9.9.1 (b), (c) Ex 10.1.2 (b), (c) Ex 10.2.1 (b), (c) Ex 10.3.1 (b), (c) Ex 10.6.2