Presentation is loading. Please wait.

Presentation is loading. Please wait.

Transactions and Concurrency Control Zachary G. Ives University of Pennsylvania CIS 550 – Database & Information Systems December 2, 2003 Slide content.

Similar presentations


Presentation on theme: "Transactions and Concurrency Control Zachary G. Ives University of Pennsylvania CIS 550 – Database & Information Systems December 2, 2003 Slide content."— Presentation transcript:

1 Transactions and Concurrency Control Zachary G. Ives University of Pennsylvania CIS 550 – Database & Information Systems December 2, 2003 Slide content courtesy of Susan Davidson, Raghu Ramakrishnan & Johannes Gehrke

2 2 From Queries to Updates  We’ve spent a lot of time talking about querying data  Yet updates are a really major part of many DBMS applications  Particularly important: ensuring ACID properties  Atomicity: each operation looks atomic to the user  Consistency: each operation in isolation keeps the database in a consistent state (this is the responsibility of the user)  Isolation: should be able to understand what’s going on by considering each separate transaction independently  Durability: updates stay in the DBMS!!!

3 3 What is a Transaction?  A transaction is a sequence of read and write operations on data items that logically functions as one unit of work:  should either be done entirely or not at all  if it succeeds, the effects of write operations persist (commit); if it fails, no effects of write operations persist (abort)  these guarantees are made despite concurrent activity in the system, and despite failures that may occur

4 4 How Things Can Go Awry  Suppose we have a table of bank accounts which contains the balance of the account  An ATM deposit of $50 to account # 1234 would be written as:  This reads and writes the account’s balance  What if two accountholders make deposits simultaneously from two ATMs? update Accounts set balance = balance + $50 where account#= ‘1234’;

5 5 Concurrent Deposits  This SQL update code is represented as a sequence of read and write operations on “data items” (which for now should be thought of as individual accounts): where X is the data item representing the account with account# 1234. Deposit 1 Deposit 2 read(X.bal) X.bal := X.bal + $50 X.bal:= X.bal + $10 write(X.bal)

6 6 A “Bad” Concurrent Execution Only one “action” (e.g. a read or a write) can actually happen at a time, and we can interleave deposit operations in many ways: Deposit 1 Deposit 2 read(X.bal) X.bal := X.bal + $50 X.bal:= X.bal + $10 write(X.bal) time BAD!

7 7 A “Good” Execution  Previous execution would have been fine if the accounts were different (i.e. one were X and one were Y), i.e., transactions were independent  The following execution is a serial execution, and executes one transaction after the other: Deposit 1 Deposit 2 read(X.bal) X.bal := X.bal + $50 write(X.bal) read(X.bal) X.bal:= X.bal + $10 write(X.bal) time GOOD!

8 8 Good Executions  An execution is “good” if it is serial (transactions are executed atomically and consecutively) or serializable (i.e. equivalent to some serial execution)  Equivalent to executing Deposit 1 then 3, or vice versa  Why would we want to do this instead? Deposit 1 Deposit 3 read(X.bal) read(Y.bal) X.bal := X.bal + $50 Y.bal:= Y.bal + $10 write(X.bal) write(Y.bal)

9 9 Atomicity  Problems can also occur if a crash occurs in the middle of executing a transaction:  Need to guarantee that the write to X does not persist (ABORT)  Default assumption if a transaction doesn’t commit Transfer read(X.bal) read(Y.bal) X.bal= X.bal-$100 Y.bal= Y.bal+$100 CRASH

10 10 Transactions in SQL  A transaction begins when any SQL statement that queries the db begins.  To end a transaction, the user issues a COMMIT or ROLLBACK statement. Transfer UPDATE Accounts SET balance = balance - $100 WHERE account#= ‘1234’; UPDATE Accounts SET balance = balance + $100 WHERE account#= ‘5678’; COMMIT;

11 11 Read-Only Transactions  When a transaction only reads information, we have more freedom to let the transaction execute in parallel with other transactions.  We signal this to the system by stating: SET TRANSACTION READ ONLY; SELECT * FROM Accounts WHERE account#=‘1234’;...

12 12 Read-Write Transactions  If we state “read-only”, then the transaction cannot perform any updates.  Instead, we must specify that the transaction may update (the default): SET TRANSACTION READ ONLY; UPDATE Accounts SET balance = balance - $100 WHERE account#= ‘1234’;... SET TRANSACTION READ WRITE; update Accounts set balance = balance - $100 where account#= ‘1234’;... ILLEGAL!

13 13 Dirty Reads  Dirty data is data written by an uncommitted transaction; a dirty read is a read of dirty data (WR conflict)  Sometimes we can tolerate dirty reads; other times we cannot: e.g., if we wished to ensure balances never went negative in the transfer example, we should test that there is enough money first!

14 14 “Bad” Dirty Read EXEC SQL select balance into :bal from Accounts where account#=‘1234’; if (bal > 100) { EXEC SQL update Accounts set balance = balance - $100 where account#= ‘1234’; EXEC SQL update Accounts set balance = balance + $100 where account#= ‘5678’; } EXEC SQL COMMIT; If the initial read (italics) were dirty, the balance could become negative!

15 15 Acceptable Dirty Read If we are just checking availability of an airline seat, a dirty read might be fine! (Why is that?) Reservation transaction: EXEC SQL select occupied into :occ from Flights where Num= ‘123’ and date=11-03-99 and seat=‘23f’; if (!occ) {EXEC SQL update Flights set occupied=true where Num= ‘123’ and date=11-03-99 and seat=‘23f’;} else {notify user that seat is unavailable}

16 16 Other Undesirable Phenomena  Unrepeatable read: a transaction reads the same data item twice and gets different values (RW conflict)  Phantom problem: a transaction retrieves a collection of tuples twice and sees different results

17 17 Phantom Problem Example  T1: “find the students with best grades who Take either cis550-f03 or cis570-f02”  T2: “insert new entries for student #1234 in the Takes relation, with grade A for cis570-f02 and cis550-f03”  Suppose that T1 consults all students in the Takes relation and finds the best grades for cis550-f03  Then T2 executes, inserting the new student at the end of the relation, perhaps on a page not seen by T1  T1 then completes, finding the students with best grades for cis570-f02 and now seeing student #1234

18 18 Isolation  The problems we’ve seen are all related to isolation  General rules of thumb w.r.t. isolation:  Fully serializable isolation is more expensive than “no isolation”  We can’t do as many things concurrently (or we have to undo them frequently)  For performance, we generally want to specify the most relaxed isolation level that’s acceptable  Note that we’re “slightly” violating a correctness constraint to get performance!

19 19 Specifying Acceptable Isolation Levels  The default isolation level is SERIALIZABLE (as for the transfer example).  To signal to the system that a dirty read is acceptable,  In addition, there are SET TRANSACTION READ WRITE ISOLATION LEVEL READ UNCOMMITTED; SET TRANSACTION ISOLATION LEVEL READ COMMITTED; SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;

20 20 READ COMMITTED  Forbids the reading of dirty (uncommitted) data, but allows a transaction T to issue the same query several times and get different answers  No value written by T can be modified until T completes  For example, the Reservation example could also be READ COMMITTED; the transaction could repeatably poll to see if the seat was available, hoping for a cancellation

21 21 REPEATABLE READ  What it is NOT: a guarantee that the same query will get the same answer!  However, if a tuple is retrieved once it will be retrieved again if the query is repeated  For example, suppose Reservation were modified to retrieve all available seats  If a tuple were retrieved once, it would be retrieved again (but additional seats may also become available)

22 22 Implementing Isolation Levels  One approach – use locking at some level (tuple, page, table, etc.):  each data item is either locked (in some mode, e.g. shared or exclusive) or is available (no lock)  an action on a data item can be executed if the transaction holds an appropriate lock  consider granularity of locks – how big of an item to lock  Larger granularity = fewer locking operations but more contention!  Appropriate locks:  Before a read, a shared lock must be acquired  Before a write, an exclusive lock must be acquired

23 23 Lock Compatibility Matrix Locks on a data item are granted based on a lock compatibility matrix: When a transaction requests a lock, it must wait (block) until the lock is granted Mode of Data Item None Shared Exclusive Shared Y Y N Exclusive Y N N Request mode {

24 24 Locks Prevent “Bad” Execution  If the system used locking, the first “bad” execution could have been avoided: Deposit 1 Deposit 2 xlock(X) read(X.bal) {xlock(X) is not granted} X.bal := X.bal + $50 write(X.bal) release(X) xlock(X) read(X.bal) X.bal:= X.bal + $10 write(X.bal) release(X)

25 25 Lock Types and Read/Write Modes  When we specify “read-only”, the system only uses shared-mode locks  Any transaction that attempts to update will be illegal  When we specify “read-write”, the system may also acquire locks in exclusive mode  Obviously, we can still query in this mode

26 26 Isolation Levels and Locking  READ UNCOMMITTED allows queries in the transaction to read data without acquiring any lock  For updates, exclusive locks must be obtained and held to end of transaction  READ COMMITTED requires a read-lock to be obtained for all tuples touched by queries, but it releases the locks immediately after the read  Exclusive locks must be obtained for updates and held to end of transaction

27 27 Isolation levels and locking, cont.  REPEATABLE READ places shared locks on tuples retrieved by queries, holds them until the end of the transaction  Exclusive locks must be obtained for updates and held to end of transaction  SERIALIZABLE places shared locks on tuples retrieved by queries as well as the index, holds them until the end of the transaction  Exclusive locks must be obtained for updates and held to end of transaction  Holding locks to the end of a transaction is called “strict”

28 28 Summary of Isolation Levels Level Dirty Read Unrepeatable Read Phantoms READ UN-MaybeMaybeMaybe COMMITTED READ NoMaybeMaybe COMMITTED REPEATABLE No NoMaybe READ SERIALIZABLENo No No

29 29 Theory of Serializability  A schedule of a set of transactions is a linear ordering of their actions  e.g. for the simultaneous deposits example: R1(X.bal) R2(X.bal) W1(X.bal) W2(X.bal)  A serial schedule is one in which all the steps of each transaction occur consecutively  A serializable schedule is one which is equivalent to some serial schedule (i.e. given any initial state, the final state is the same as one produced by some serial schedule)  The example above is neither serial nor serializable

30 30 Questions to Address  Given a schedule S, is it serializable?  How can we "restrict" transactions in progress to guarantee that only serializable schedules are produced?

31 31 Conflicting Actions  Consider a schedule S in which there are two consecutive actions I i and I j of transactions T i and T j respectively  If I i and I j refer to different data items, then swapping I i and I j does not matter  If I i and I j refer to the same data item Q, then swapping I i and I j matters if and only if one of the actions is a write  Ri(Q) Wj(Q) produces a different final value for Q than Wj(Q) Ri(Q)

32 32 Testing for Serializability  Given a schedule S, we can construct a di-graph G=(V,E) called a precedence graph  V : all transactions in S  E : T i  T j whenever an action of T i precedes and conflicts with an action of T j in S  Theorem: A schedule S is conflict serializable if and only if its precedence graph contains no cycles  Note that testing for a cycle in a digraph can be done in time O(|V|2)

33 33 An Example T1 T2 T3 R(X,Y,Z) R(X) W(X) R(Y) W(Y) R(Y) R(X) W(Z) T1 T2 T3 Cyclic: Not serializable.

34 34 Another Example T1 T2 T3 R(X) W(X) R(X) W(X) R(Y) W(Y) R(Y) W(Y) T1 T2 T3 Acyclic: serializable

35 35 Producing the Equivalent Serial Schedule  If the precedence graph for a schedule is acyclic, then an equivalent serial schedule can be found by a topological sort of the graph  For the second example, the equivalent serial schedule is:  R1(Y)W1(Y)R2(X)W2(X) R2(Y)W2(Y) R3(X)W3(X)

36 36 Locking and Serializability  We said that a transaction must hold all locks until it terminates (a condition called strict locking)  It turns out that this is crucial to guarantee serializability  Note that the first (bad) example could have been produced if transactions acquired and immediately released locks.

37 37 Well-Formed, Two-Phased Transactions  A transaction is well-formed if it acquires at least a shared lock on Q before reading Q or an exclusive lock on Q before writing Q and doesn’t release the lock until the action is performed  Locks are also released by the end of the transaction  A transaction is two-phased if it never acquires a lock after unlocking one  i.e., there are two phases: a growing phase in which the transaction acquires locks, and a shrinking phase in which locks are released

38 38 Two-Phased Locking Theorem  If all transactions are well-formed and two-phase, then any schedule in which conflicting locks are never granted ensures serializability  i.e., there is a very simple scheduler!  However, if some transaction is not well-formed or two-phase, then there is some schedule in which conflicting locks are never granted but which fails to be serializable  i.e., one bad apple spoils the bunch.

39 39 Summary  Transactions are all-or-nothing units of work guaranteed despite concurrency or failures in the system  Theoretically, the “correct” execution of transactions is serializable (i.e. equivalent to some serial execution)  Practically, this may adversely affect throughput  isolation levels  With isolation levels, users can specify the level of “incorrectness” they are willing to tolerate


Download ppt "Transactions and Concurrency Control Zachary G. Ives University of Pennsylvania CIS 550 – Database & Information Systems December 2, 2003 Slide content."

Similar presentations


Ads by Google