Presentation on theme: "Transaction Management Reading: CB, Ch. 22. Dept of Computing Science, University of Aberdeen2 In this lecture you will learn the problems of concurrency."— Presentation transcript:
Dept of Computing Science, University of Aberdeen2 In this lecture you will learn the problems of concurrency in multi-user DBMSs why the concept of a transaction is useful how locks can be used to serialise transactions the concept of deadlocks and how they can be resolved how DBMSs roll back transactions after software or hardware failures
Dept of Computing Science, University of Aberdeen3 We are here!
Dept of Computing Science, University of Aberdeen4 Transaction Management Overview Objectives: –schedule queries from multiple simultaneous users efficiently –keep the database in a consistent state Transaction management involves: –performing logical units of work (defn of transactions) –controlling concurrency - stop user tasks interfering –resolving conflicts - e.g. simultaneous update attempts –recovering from errors - restore DB to consistent state
Dept of Computing Science, University of Aberdeen5 The ACID Requirements For a Transaction A transaction should have certain well-defined properties: Atomicity - each unit of work is indivisible; all-or- nothing (transactions that don't complete must be undone or rolled-back) Consistency - a transaction transforms the database from one consistent state into another (intermediates may be inconsistent) Isolation - each transaction effectively executes independently - one transaction should not see the inconsistent/incomplete state of another transaction Durability - once a transaction is complete, its effects cannot be undone or lost (it can only be undone with a compensating transaction)
Dept of Computing Science, University of Aberdeen6 How Are Transactions Defined? In SQL, by default, each SQL statement is treated as a transaction (even if it affects multiple rows or tables): UPDATE Staff SET Salary = 1.1 * Salary; This is one atomic unit of work on the database. Multiple statements may be grouped together into a single transaction. Example: A new member of staff takes over a property for rent: BEGIN TRANSACTION UPDATE PropertyForRent SET StaffNo = 'SN99' WHERE StaffNo = 'SG37'; DELETE FROM Staff WHERE StaffNo = 'SG37'; COMMIT TRANSACTION
Dept of Computing Science, University of Aberdeen7 Concurrent Transactions The Lost Update Problem Suppose an account holds $100. If T1 deposits $100 and T2 withdraws $10, the new balance should be $190. With concurrent transactions, we could get: T1: UPDATE Account SET Balance = Balance + 100; T2: UPDATE Account SET Balance = Balance - 10; TimeT1T2Balance t1read(bal)$100 t2bal += 100read(bal)$100 t3write(bal)bal -=10$200 t4write(bal)$90 How did $100 vanish? This is the lost update problem
Dept of Computing Science, University of Aberdeen8 Serialising Transactions Clearly, one solution would be to serialise all transactions: –make first transaction finish before next one starts However, this would not be efficient on multi- user systems: – Only need to serialise transactions that refer to common tables – Not all transactions need serialisation (e.g. print a mailing list) – Some transactions might benefit from running simultaneously (e.g. if they both read the same tables)
Dept of Computing Science, University of Aberdeen9 Controlling Concurrency With Locks Locks may be used to serialise only those parts of a transaction that need it... Locks may have different levels of granularity: – Table locks - easy to implement, not so efficient – Row locks - more complicated, better performance – Page locks - used by the Memory Manager (not here) Transactions can cooperate by using locks to indicate their intention: – Read (shared) - want to read an object – Write (exclusive) - want to read and write an object
Dept of Computing Science, University of Aberdeen10 The Rules for Using Locks Suppose two transactions want to access a given row... T1 holdsNo lockReadWrite T2 wantsRead Then T2Acquires Waits T1 holdsNo lockReadWrite T2 wantsWrite Then T2AcquiresWaits
Dept of Computing Science, University of Aberdeen11 Preventing Lost Updates With Locks TimeT1T2Balance t1write_lock(bal)$100 t2read(bal)write_lock(bal)$100 t3bal += 100wait$100 t4write(bal)wait$200 t5unlock(bal)wait$200 t6read(bal)$200 t7bal -= 10$200 t8write(bal)$190 t9unlock(bal)$190 Note - if T2 started first, T1 would WAIT
Dept of Computing Science, University of Aberdeen12 Concurrent Transactions Inconsistent Analysis Problem Often, a single transaction may modify multiple rows... What if two transactions execute at the same time ? –User1 (T1): UPDATE Salary... –User2 (T2): SELECT SUM(Salary)... Depending on the timing, User2 might see: –Sum of all old salaries –Sum of all new salaries –Sum of some old + some new salaries (problem!!) This is called the inconsistent analysis problem Clearly, multiple locks are required...
Dept of Computing Science, University of Aberdeen13 Two-Phase Locking What if a transaction requires more than one lock? The transaction should: –acquire the locks as it needs them... –but only release the locks at the end of the transaction This is two-phase locking - So called because there is: –a growing phase - more & more locks are acquired –a shrinking phase - the locks are finally released Two-phase locking solves inconsistent analysis problem. Can you work out why?? Hint: consider read & write lock waiting rules...
Dept of Computing Science, University of Aberdeen14 Deadlocks Two-phase locking ensures serialisability, but it cannot prevent deadlocks Example: Suppose a pair of transactions are in the growing phase, and both need write-locks on objects A and B: TimeT1T2 t1Acquire AAcquire B t2Wait for BWait for A T3Deadlock As deadlocks are relatively rare, most DBMSs allow them to occur rather than attempt to prevent them...
Dept of Computing Science, University of Aberdeen15 Detecting Deadlocks Deadlocks can be detected if the Lock Manager maintains a transaction dependency graph, which is sometimes called a wait-for graph (WFG)... The dependency graph contains: –a node for each transaction T –an arc Ta Tb for each dependency T1 T2 T3 T4 T1 T2 means T1 is waiting for a resource held by T2 A cycle in the graph indicates deadlock...
Dept of Computing Science, University of Aberdeen16 Deadlock Resolution If the DBMS detects deadlock, it picks a victim transaction: –the victim is killed - rolled-back and re- scheduled –the other transaction proceeds... Strategies for picking the victim include: –pick youngest, oldest, or random transaction –consider amount done (or to do) by each transaction –pick deadlocked node in WFG with the most dependencies
Dept of Computing Science, University of Aberdeen17 The Journal File and Rolling Back Transactions Usually, a log of each step of transaction is written to a special journal file. Each record of the journal file contains (in order of time): – a transaction ID & timestamp – a before-image (if the operation is an update or delete) – an after-image (if the operation is an update or insert) Periodically, the DBMS flushes all memory to disc & writes a checkpoint record. The Journal file can then be used for: – rolling back transactions – system error recovery (roll back to last checkpoint)
Dept of Computing Science, University of Aberdeen18 Specifying Locking Modes in SQL ANSI SQL (and some storage engines in MySQL Server) supports several locking modes. These are specified according to the degree of concurrency/isolation required. Syntax: –SET TRANSACTION ISOLATION LEVEL level where level is one of: – SERIALIZABLE - essentially full table locking – REPEATABLE READ - like table locking (but another T may INSERT) – READ COMMITTED - like row locking (the default) – READ UNCOMMITTED - no locking (might see phantom rows)
Dept of Computing Science, University of Aberdeen19 Summary Transaction Management is largely concerned with implementing the ACID requirements: – Defining logical units of work - atomic transactions – Using rollback and transaction scheduling to maintain consistency – Using locks to stop transactions interfering - isolation – Using journaling to recover from system errors - durability In SQL, its up to the programmer to consider & specify: – how much concurrency/isolation is required...