
1 Transaction Support Chapter 9


3 Transaction  Transaction  An action or series of actions, carried out by a single user or application program, which accesses or changes the contents of the database.  A transaction is a logical unit of work on the database.  It may involve any number of operations on the database.  The execution of an application program can be seen as a series of transactions with non- database processing done in between.

4 Transaction states  A transaction transforms a database from one consistent state into another.  During the transaction inconsistent states are possible.  Outcomes:  Committed: database reaches a new consistent state.  Aborted: database must be restored to the consistent state it was in before the transaction started.  An aborted transaction is rolled-back or undone (message independence).  A committed transaction cannot be aborted.  Transactions cannot be nested.  The DBMS has no way of finding the transaction boundaries but most DML have appropriate keywords available:  Begin Transaction  Commit  Rollback
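The Begin Transaction / Commit / Rollback keywords can be illustrated with a minimal sketch using Python's standard `sqlite3` module. The `account` table, the `transfer` and `balances` helpers, and the insufficient-funds rule are assumptions made for illustration; they do not come from the slides.

```python
import sqlite3

# isolation_level=None puts sqlite3 in autocommit mode, so the explicit
# BEGIN / COMMIT / ROLLBACK statements below are the only transaction boundaries.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE account (id TEXT PRIMARY KEY, balance INTEGER)")
conn.execute("INSERT INTO account VALUES ('x', 100), ('z', 25)")


def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst as one logical unit of work (hypothetical helper)."""
    conn.execute("BEGIN")
    try:
        conn.execute(
            "UPDATE account SET balance = balance - ? WHERE id = ?", (amount, src)
        )
        # Between the two updates the database is temporarily inconsistent.
        (bal,) = conn.execute(
            "SELECT balance FROM account WHERE id = ?", (src,)
        ).fetchone()
        if bal < 0:
            raise ValueError("insufficient funds")
        conn.execute(
            "UPDATE account SET balance = balance + ? WHERE id = ?", (amount, dst)
        )
        conn.execute("COMMIT")    # database reaches a new consistent state
        return True
    except ValueError:
        conn.execute("ROLLBACK")  # restore the state from before BEGIN
        return False


def balances(conn):
    """Return all balances as a dict, e.g. {'x': 100, 'z': 25}."""
    return dict(conn.execute("SELECT id, balance FROM account ORDER BY id"))
```

A successful call commits both updates together; a call that would overdraw the source account leaves both rows exactly as they were.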

5 Properties of a transaction  Atomicity: The “all or nothing” property. A transaction is performed in its entirety or not performed at all.  Consistency: A transaction must transform a database from one consistent state into another consistent state.  Independence (also called isolation): Transactions execute independently of one another. In other words, the partial effects of incomplete transactions should not be visible to other transactions.  Durability: The effects of committed transactions should not be lost because of subsequent failure.

6 Database Architecture Access methods File manager System buffers Database and data dictionary Transaction manager Scheduler Buffer manager Recovery manager Transaction Manager: Coordinates transactions on behalf of the applications. Scheduler: Responsible for implementing a strategy for concurrency control. Recovery Manager: Ensures that the database remains in a consistent state. Buffer Manager: Responsible for the transfer of data between disk storage and main memory.

7 Concurrency Control  Concurrency Control  The process of managing simultaneous operations on the database without having them interfere with one another.  Many users work simultaneously.  Database records are first transferred to central memory where the modifications are performed and then rewritten to the disk.  Transactions can be interleaved due to program interrupts.  This leads to concurrency problems:  Lost update problem;  Uncommitted dependency problem;  Inconsistent analysis problem.

8 Lost Update Problem  A successfully completed update is overwritten by another user.  T1 is withdrawing €10 from an account with balance x, initially €100.  T2 is depositing €100 into the same account.  Serially, the final balance would be €190.

9 Lost Update Problem
Time | Transaction T1    | Transaction T2    | value x
t1   |                   | Begin transaction |
t2   | Begin transaction | Read R1(x)        | x=100
t3   | Read R1(x)        |                   |
t4   |                   | x=x+100           |
t5   | x=x-10            |                   |
t6   |                   | write R(x)        | x=200
t7   | write R(x)        |                   | x=90
t8   |                   | commit            |
t9   | commit            |                   |
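The interleaving can be replayed with a small simulation. This is only a sketch: the `run` helper and the tuple encoding of the schedule are assumptions for illustration. Each transaction reads x into a private copy, changes the copy, and writes it back.

```python
def run(schedule, x=100):
    """Replay (transaction, action) steps against a single data item x."""
    db = {"x": x}
    local = {}  # private working copy of x per transaction
    for txn, action in schedule:
        if action == "read":
            local[txn] = db["x"]
        elif action == "write":
            db["x"] = local[txn]
        else:  # an integer delta applied to the private copy
            local[txn] += action
    return db["x"]


# Both transactions read x=100 before either one writes.
interleaved = [
    ("T2", "read"), ("T1", "read"),
    ("T2", +100), ("T1", -10),
    ("T2", "write"), ("T1", "write"),
]

# T2 runs to completion before T1 starts.
serial = [
    ("T2", "read"), ("T2", +100), ("T2", "write"),
    ("T1", "read"), ("T1", -10), ("T1", "write"),
]
```

`run(interleaved)` gives 90, so the €100 deposit by T2 is lost, while `run(serial)` gives the expected 190.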

10 Uncommitted Dependency Problem  Occurs when one transaction can see intermediate results of another transaction before it has committed.  T4 updates x to €200 but then aborts, so the balance x should revert to its original value of €100.  T3 has read the new value of x (€200) and uses it as the basis of a €10 reduction, giving a new balance of €190 instead of €90.

11 Uncommitted Dependency Problem
Time | Transaction T3    | Transaction T4    | value x
t1   |                   | Begin transaction |
t2   |                   | Read R1(x)        | x=100
t3   |                   | x = x + 100       |
t4   |                   | write R1(x)       | x=200
t5   | Begin transaction |                   |
t6   | read R1(x)        |                   |
t7   |                   | Rollback          | x=100
t8   | x = x - 10        |                   | x=190
t9   | write R1(x)       |                   |
t10  | Commit            |                   | should be 90

12 Inconsistent Analysis Problem  Occurs when a transaction reads several values but a second transaction updates some of them during the execution of the first.  Sometimes referred to as a nonrepeatable (or unrepeatable) read; strictly speaking, a dirty read is the uncommitted dependency problem.  T6 is totaling the balances of account x (€100), account y (€50), and account z (€25).  In the meantime, T5 has transferred €10 from x to z, so T6 now has the wrong result (€10 too high).

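13 Inconsistent Analysis Problem
Time | Transaction T5    | Transaction T6    | x   | y  | z  | sum
t1   |                   | Begin transaction | 100 | 50 | 25 |
t2   | Begin transaction | sum = 0           | 100 | 50 | 25 | 0
t3   | read x            | read x            | 100 | 50 | 25 | 0
t4   | x = x - 10        | sum = sum + x     | 100 | 50 | 25 | 100
t5   | write x           | read y            | 90  | 50 | 25 | 100
t6   | read z            | sum = sum + y     | 90  | 50 | 25 | 150
t7   | z = z + 10        |                   | 90  | 50 | 25 | 150
t8   | write z           |                   | 90  | 50 | 35 | 150
t9   | commit            | read z            | 90  | 50 | 35 | 150
t10  |                   | sum = sum + z     | 90  | 50 | 35 | 185
t11  |                   | commit            | 90  | 50 | 35 | 185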
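The arithmetic of this schedule can be checked with a short sketch. The function name and its defaults are assumptions for illustration; the steps follow the order in which T6 reads and T5 writes.

```python
def total_with_interleaved_transfer(x=100, y=50, z=25, amount=10):
    """Return (sum seen by T6, true total) for the interleaved schedule."""
    db = {"x": x, "y": y, "z": z}
    total = 0
    total += db["x"]     # T6 reads x before T5 writes the reduced value
    db["x"] -= amount    # T5 writes x
    total += db["y"]     # T6 reads y
    db["z"] += amount    # T5 writes z and commits
    total += db["z"]     # T6 reads z after the transfer
    return total, sum(db.values())
```

With the default values the function returns `(185, 175)`: T6 counts the €10 twice, once in the old x and once in the new z.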

14 Serializability  Objective of a concurrency control protocol is to schedule transactions in such a way as to avoid any interference.  Could run transactions serially, but this limits degree of concurrency or parallelism in system.  Serializability identifies those executions of transactions guaranteed to ensure consistency.

15 Serializability  Schedule  A transaction consists of a sequence of reads and writes to the database. The sequence of reads and writes by a set of concurrent transactions taken together is the schedule.  Serial Schedule  A schedule where the operations of each transaction are executed consecutively without any interleaved operation from other transactions.  Nonserial Schedule  A schedule where the operations from a set of concurrent transactions are interleaved.

16 Nonserial Schedule  Schedule where operations from set of concurrent transactions are interleaved.  Objective of serializability is to find nonserial schedules that allow transactions to execute concurrently without interfering with one another.  In other words, want to find nonserial schedules that are equivalent to some serial schedule. Such a schedule is called serializable.

17 Serializable Schedule  Serializable Schedule  If a set of transactions executes concurrently, we say that the schedule is correct (serializable), if it produces the same result as some serial execution.  The ordering of reads and writes is important in serializability  if two transactions only read a data item, they do not conflict and order is not important;  if two transactions either read or write completely separate data items, they do not conflict and order is not important ;  if one transaction writes a data item and another either reads or writes the same data item, the order of execution is important.
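The three ordering rules can be turned into a test for conflict serializability. The sketch below uses a precedence graph, which is a standard technique but is not named in the slides, so treat it as an assumption; the tuple encoding `(transaction, "r" or "w", item)` and the function names are also illustrative.

```python
from itertools import combinations


def conflicts(a, b):
    """Two operations conflict if they come from different transactions,
    touch the same item, and at least one of them is a write."""
    (t1, op1, item1), (t2, op2, item2) = a, b
    return t1 != t2 and item1 == item2 and "w" in (op1, op2)


def precedence_edges(schedule):
    # combinations() preserves schedule order, so `a` always precedes `b`.
    return {(a[0], b[0]) for a, b in combinations(schedule, 2) if conflicts(a, b)}


def is_conflict_serializable(schedule):
    """True if the precedence graph of the schedule has no cycle."""
    edges = precedence_edges(schedule)
    nodes = {txn for txn, _, _ in schedule}
    # Repeatedly remove transactions with no incoming edge from the remaining
    # ones; if none can be removed, the leftovers form a cycle.
    while nodes:
        free = {
            n for n in nodes
            if not any(dst == n and src in nodes for src, dst in edges)
        }
        if not free:
            return False
        nodes -= free
    return True


lost_update = [("T2", "r", "x"), ("T1", "r", "x"), ("T2", "w", "x"), ("T1", "w", "x")]
serial_order = [("T2", "r", "x"), ("T2", "w", "x"), ("T1", "r", "x"), ("T1", "w", "x")]
```

The lost-update interleaving yields edges in both directions between T1 and T2, so it is rejected; the serial order yields a single edge and is accepted.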

18 Recoverability  Serializability identifies schedules that maintain database consistency, assuming no transaction fails.  Could also examine recoverability of transactions within schedule.  If transaction fails, atomicity requires effects of transaction to be undone.  Durability states that once transaction commits, its changes cannot be undone (without running another, compensating, transaction).

19 Recoverable Schedule  A schedule where, for each pair of transactions Ti and Tj, if Tj reads a data item previously written by Ti, then the commit operation of Ti precedes the commit operation of Tj.
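The definition can be checked mechanically. This is a minimal sketch under stated assumptions: a schedule is a list of `(transaction, op, item)` tuples with op `"r"`, `"w"` or `"c"` (commit, with item `None`), and aborts are not modelled.

```python
def is_recoverable(schedule):
    """True if every transaction commits only after all transactions
    whose writes it has read have committed."""
    last_writer = {}   # item -> transaction that wrote it most recently
    reads_from = {}    # reader -> set of transactions it has read from
    committed = set()
    for txn, op, item in schedule:
        if op == "w":
            last_writer[item] = txn
        elif op == "r":
            writer = last_writer.get(item)
            if writer is not None and writer != txn:
                reads_from.setdefault(txn, set()).add(writer)
        elif op == "c":
            # Every transaction this one read from must already be committed.
            if not reads_from.get(txn, set()) <= committed:
                return False
            committed.add(txn)
    return True


bad = [("T1", "w", "x"), ("T2", "r", "x"), ("T2", "c", None), ("T1", "c", None)]
good = [("T1", "w", "x"), ("T2", "r", "x"), ("T1", "c", None), ("T2", "c", None)]
```

In `bad`, T2 commits a value it read from the still-uncommitted T1, so the schedule is not recoverable; swapping the two commits, as in `good`, fixes it.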

20 Concurrency Control Techniques  Two basic concurrency control techniques:  Locking  Time-stamping  Both are conservative approaches: delay transactions in case they conflict with other transactions.  Optimistic methods assume conflict is rare and only check for conflicts at commit.

21 Concurrency Control Techniques  Locking  A procedure used to control concurrent access to data. When one transaction is accessing the database, a lock may deny access to other transactions to prevent incorrect updates.  Data items of various sizes, ranging from the entire database down to a field, may be locked.  The size of the item determines the granularity of the lock.  Implementation can be done by  setting a bit in the data item;  keeping a list of locked parts;  other techniques.

22 Lock Types  Read lock  If a transaction has a read lock on a data item, it can read the item but not update it.  Write lock  If a transaction has a write lock on the data item, it can both read and update the item.

23 Using Locks  Any transaction that needs to access the data item must first lock the item, requesting a read lock for read only access or a write lock for both read and write access.  If the item is not already locked by another transaction, the lock will be granted.  If the item is currently locked, the DBMS determines whether the request is compatible with the existing lock :  a read request on an item with a read lock will be granted;  for other requests the transaction must wait until the existing lock is released.  A transaction continues to hold a lock until it explicitly releases it, either during execution or when it terminates.  It is only when the write lock has been released that the effects of the write operation will be made visible to other transactions.
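The compatibility rule (only a read request on a read-locked item is granted) can be sketched as a tiny lock table. The `LockTable` class is hypothetical; it does not queue waiting transactions and does not handle lock upgrades.

```python
class LockTable:
    """Tracks, per data item, the lock mode and the transactions holding it."""

    def __init__(self):
        self.locks = {}  # item -> (mode, set of holding transactions)

    def request(self, txn, item, mode):
        """Return True if the lock is granted, False if txn must wait.
        `mode` is 'read' or 'write'."""
        held = self.locks.get(item)
        if held is None:
            self.locks[item] = (mode, {txn})
            return True
        held_mode, holders = held
        if mode == "read" and held_mode == "read":
            holders.add(txn)  # read locks are shared
            return True
        return False          # every other combination must wait

    def release(self, txn, item):
        """Release txn's lock on item; drop the entry when nobody holds it."""
        _, holders = self.locks[item]
        holders.discard(txn)
        if not holders:
            del self.locks[item]
```

A real lock manager would put refused requests on a wait queue and grant them on release; here the caller is simply told to wait.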

24 Two-phase Locking  2PL: A transaction follows the two-phase locking protocol if all locking operations precede the first unlock operation in the transaction.  With this protocol every transaction has two phases:  growing phase: where no locks can be released;  shrinking phase: where no locks can be acquired.  Some systems allow upgrades (in the growing phase) or downgrades (in the shrinking phase) of a lock.  With 2PL, serializability of schedules can be guaranteed.
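Whether a single transaction obeys the protocol can be checked by scanning its operations once. In this sketch the string encoding of operations (`"write_lock x"`, `"unlock x"`, and so on) is an assumption for illustration.

```python
def follows_2pl(ops):
    """True if no lock is acquired after the first unlock."""
    shrinking = False
    for op in ops:
        kind = op.split()[0]
        if kind == "unlock":
            shrinking = True  # the shrinking phase has started
        elif kind in ("read_lock", "write_lock") and shrinking:
            return False      # acquiring a lock in the shrinking phase
    return True


two_phase = ["write_lock x", "read x", "write x", "write_lock y", "unlock x", "unlock y"]
not_two_phase = ["write_lock x", "write x", "unlock x", "write_lock y", "unlock y"]
```

`follows_2pl(two_phase)` is true; `not_two_phase` releases x before locking y and is rejected.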

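25 2PL: Lost update solution (WL = write lock acquired, UL = lock released)
Time | Transaction T1    | Transaction T2    | value x
t1   |                   | Begin transaction |
t2   | Begin transaction | Read R1(x) (WL)   | x=100
t3   | request WL        |                   |
t4   | wait              | x=x+100           |
t5   | wait              | write R(x)        | x=200
t6   | wait              | unlock(x) (UL)    |
t7   | Read R1(x) (WL)   | commit            |
t8   | x=x-10            |                   |
t9   | write R(x)        |                   | x=190
t10  | unlock(x) (UL)    |                   |
t11  | commit            |                   |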

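26 2PL: Uncommitted Dependency (WL = write lock acquired, UL = lock released)
Time | Transaction T1    | Transaction T2    | value x
t1   |                   | Begin transaction |
t2   |                   | Read R1(x) (WL)   | x=100
t3   |                   | x = x + 100       |
t4   |                   | write R1(x)       | x=200
t5   | Begin transaction |                   |
t6   | request WL        |                   | x=200
t7   | wait              | Rollback (UL)     | x=100
t8   | read R1(x) (WL)   |                   | x=100
t9   | x = x - 10        |                   | x=90
t10  | write(x)          |                   | x=90
t11  | Commit            |                   |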

27 Deadlock  An impasse that may result when two or more transactions are each waiting for locks held by the other to be released.
Time | Transaction 1     | Transaction 2
t1   | begin transaction |
t2   | write-lock (x)    | begin transaction
t3   | read (x)          | write-lock (y)
t4   | x = x - 10        | read (y)
t5   | write (x)         | y = y + 100
t6   | write-lock (y)    | write (y)
t7   | wait              | write-lock (x)
t8   | wait              | wait
t9   | wait              | wait

28 Deadlock  Only one way to break deadlock: abort one or more of the transactions.  Deadlock should be transparent to user, so DBMS should restart transaction(s).  Two general techniques for handling deadlock:  Deadlock prevention.  Deadlock detection and recovery.
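Deadlock detection is commonly done by searching a wait-for graph for a cycle. The slides do not name this structure, so the sketch below is an assumption about one reasonable approach; `waits_for` maps each transaction to the set of transactions whose locks it is waiting for.

```python
def find_deadlock(waits_for):
    """Return a list of transactions forming a cycle, or None if there is none."""

    def visit(node, path):
        if node in path:
            return path[path.index(node):]  # the cycle closes at `node`
        for nxt in waits_for.get(node, ()):
            cycle = visit(nxt, path + [node])
            if cycle:
                return cycle
        return None

    for start in waits_for:
        cycle = visit(start, [])
        if cycle:
            return cycle
    return None


# T1 waits for the lock on y held by T2; T2 waits for the lock on x held by T1.
deadlocked = {"T1": {"T2"}, "T2": {"T1"}}
no_deadlock = {"T1": {"T2"}, "T2": set()}
```

Once a cycle is found, the DBMS would abort one transaction in it and restart that transaction later. The search is kept simple for readability and is not tuned for large graphs.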

29 Deadlock Prevention  DBMS looks ahead to see if transaction would cause deadlock and never allows deadlock to occur.  Could order transactions using transaction timestamps:  Wait-Die - only an older transaction can wait for younger one, otherwise transaction is aborted (dies) and restarted with same timestamp.

30 Deadlock Prevention  Wound-Wait - only a younger transaction can wait for an older one. If older transaction requests lock held by younger one, younger one is aborted (wounded).
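The two timestamp-based rules differ only in which side gives way. A minimal sketch, assuming a smaller timestamp means an older transaction; the function names and return strings are illustrative.

```python
def wait_die(requester_ts, holder_ts):
    """Wait-Die: an older requester waits; a younger requester dies."""
    if requester_ts < holder_ts:
        return "requester waits"
    return "requester aborts"  # restarted later with the same timestamp


def wound_wait(requester_ts, holder_ts):
    """Wound-Wait: an older requester wounds the holder; a younger one waits."""
    if requester_ts < holder_ts:
        return "holder aborts"
    return "requester waits"
```

In both schemes it is always the younger transaction that is aborted, so an older transaction is never rolled back on behalf of a younger one.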

31 Timestamping  Timestamp  A unique identifier created by the DBMS that indicates the relative starting time of a transaction.  Data items can get read-timestamp or a write-timestamp.  Timestamping  A concurrency control protocol in which the fundamental goal is to order transactions globally in such a way that older transactions (with smaller time stamps) get priority in the event of conflict.  If a transaction attempts to read or write a data item, it can only proceed if the last update on that data item was carried out by an older transaction; otherwise, the transaction is restarted and given a new timestamp.

32 Timestamping  Read/write proceeds only if last update on that data item was carried out by an older transaction.  Otherwise, transaction requesting read/write is restarted and given a new timestamp.  Also timestamps for data items:  read-timestamp - timestamp of last transaction to read item.  write-timestamp - timestamp of last transaction to write item.

33 Timestamping - Read(x)  Consider a transaction T with timestamp ts(T): ts(T) < write_timestamp(x)  x already updated by younger (later) transaction.  Transaction must be aborted and restarted with a new timestamp.

34 Timestamping - Write(x)  ts(T) < read_timestamp(x): x already read by a younger transaction.  Roll back transaction T and restart it using a later timestamp. (The younger transaction must not be aborted, because it cannot have read a wrong version: the write by T has not yet been done.)
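The timestamp checks can be sketched as follows, under the usual reading of basic timestamp ordering: a read is checked against the item's write-timestamp, and a write is checked against both the read-timestamp and the write-timestamp. The `Item` class, the `Restart` exception and the function names are assumptions for illustration.

```python
class Restart(Exception):
    """The transaction must be rolled back and restarted with a new timestamp."""


class Item:
    def __init__(self, value):
        self.value = value
        self.read_ts = 0   # timestamp of the youngest transaction to read the item
        self.write_ts = 0  # timestamp of the last transaction to write the item


def read(item, ts):
    """Read on behalf of a transaction with timestamp ts."""
    if ts < item.write_ts:
        raise Restart("item already updated by a younger transaction")
    item.read_ts = max(item.read_ts, ts)
    return item.value


def write(item, ts, value):
    """Write on behalf of a transaction with timestamp ts."""
    if ts < item.read_ts or ts < item.write_ts:
        raise Restart("item already read or updated by a younger transaction")
    item.value = value
    item.write_ts = ts
```

No transaction ever waits in this scheme, so deadlock cannot occur; the cost is paid in restarts instead.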

35 Granularity of Data Items  Size of data items chosen as unit of protection by concurrency control protocol.  Ranging from coarse to fine:  The entire database.  A file.  A page (or area or database space).  A record.  A field value of a record.

36 Granularity of Data Items  Tradeoff:  the coarser the item, the lower the degree of concurrency;  the finer the item, the more locking information needs to be stored.  Best item size depends on the types of transactions.

37 Levels of Locking

38 File Recovery  Two Complementary Techniques.  Backup.  Periodical copy of the Database on an archive file. full backup. incremental backup.  Recovery ( more than 10% of code of DBMS ).  after a failure bring the database back in a reliable state.  Redundancy is needed.  Time factor is crucial.

39 Restore Actions (simplified)  In case of failure:  DB is unreadable: last archive + logfile, REDO, giving a new DB.  DB is unreliable: logfile, UNDO, giving a new DB.

40 UNDO  UNDO of a started transaction can occur in case of: insert, delete, update.  Direct access device is needed: active log.  Continuous transition: archive log.  Before-image only on active log.

41 Database Recovery  Process of restoring database to a correct state in the event of a failure.  Need for Recovery Control  Two types of storage: volatile (main memory) and nonvolatile.  Volatile storage does not survive system crashes.  Stable storage represents information that has been replicated in several nonvolatile storage media with independent failure modes.

42 Recovery Techniques  If database has been damaged:  Need to restore last backup copy of database and reapply updates of committed transactions using log file.  If database is only inconsistent:  Need to undo changes that caused inconsistency. May also need to redo some transactions to ensure updates reach secondary storage.  Do not need backup, but can restore database using before- and after-images in the log file.

43 Types of failures  System crashes, resulting in loss of main memory.  Media failures, resulting in loss of parts of secondary storage.  Application software errors.  Natural physical disasters.  Carelessness or unintentional destruction of data or facilities.  Sabotage.

44 Transactions and Recovery  Transactions represent basic unit of recovery.  Recovery manager responsible for atomicity and durability.  If failure occurs between commit and database buffers being flushed to secondary storage then, to ensure durability, recovery manager has to redo (rollforward) transaction's updates.

45 Transactions and Recovery  If transaction had not committed at failure time, recovery manager has to undo (rollback) any effects of that transaction for atomicity.  Partial undo - only one transaction has to be undone.  Global undo - all transactions have to be undone.

46 System interruption  I/O buffers lost  DB intact  UNDO of current transactions is needed but those transactions are difficult to identify.  Concept of " Checkpoint "  periodically the following steps must be performed:  Step 1: Log-buffers emptied on the logfile.  Step 2: Checkpoint record written on the Logfile.  Step 3: Database buffers emptied on the database.  Step 4: Address of checkpoint record written on the "restart file".

47 Log File  Contains information about all updates to database:  Transaction records.  Checkpoint records.  Often used for other purposes (for example, auditing).

48 Log File  Transaction records contain:  Transaction identifier.  Type of log record, (transaction start, insert, update, delete, abort, commit).  Identifier of data item affected by database action (insert, delete, and update operations).  Before-image of data item.  After-image of data item.  Log management information.

49 Sample Log File

50 Log File  Log file may be duplexed or triplexed.  Log file sometimes split into two separate random-access files.  Potential bottleneck; critical in determining overall performance.

51 Checkpointing  Checkpoint  Point of synchronization between database and log file. All buffers are force-written to secondary storage.  Checkpoint record is created containing identifiers of all active transactions.  When failure occurs, redo all transactions that committed since the checkpoint and undo all transactions active at time of crash.

52 Algorithm to Define Transaction States
The algorithm starts with the creation of two lists:
- the UNDO list contains all transactions in the checkpoint record;
- the REDO list is empty.
Forward reading of the logfile starting at tc:
- on encountering "BEGIN", add the transaction to the UNDO list;
- on encountering "COMMIT", move the transaction from the UNDO list to the REDO list;
- at the end of the logfile both lists are correct.
[Figure: timeline from checkpoint time tc to failure time tf, showing transactions T1 to T5.]
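The list-building algorithm can be sketched in a few lines. The `classify` function and the log encoding are assumptions. The sample log records are hypothetical: the timeline figure did not survive, so they were chosen only to agree with the outcome stated for the example (redo T2 and T4, undo T3 and T5).

```python
def classify(checkpoint_active, log_after_checkpoint):
    """Return (undo, redo) lists from the checkpoint record and the
    log records written after the checkpoint."""
    undo = list(checkpoint_active)  # transactions active at the checkpoint
    redo = []
    for record, txn in log_after_checkpoint:
        if record == "BEGIN":
            undo.append(txn)        # started after the checkpoint
        elif record == "COMMIT":
            undo.remove(txn)        # committed: move from UNDO to REDO
            redo.append(txn)
    return undo, redo


# Hypothetical log: T2 and T3 were active at the checkpoint.
sample_log = [("BEGIN", "T4"), ("COMMIT", "T2"), ("BEGIN", "T5"), ("COMMIT", "T4")]
undo, redo = classify(["T2", "T3"], sample_log)
```

For this log, `undo` is `["T3", "T5"]` and `redo` is `["T2", "T4"]`. A transaction that committed before the checkpoint appears in neither list, because its changes were already forced to secondary storage.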

53 Checkpointing  In previous example, with checkpoint at time tc, changes made by T2 and T3 have been written to secondary storage.  Thus:  only redo T2 and T4,  undo transactions T3 and T5.

54 Steps in recovery process 1. Define states of the transactions with specific algorithm. 2. Forward processing of the logfile to REDO transactions. 3. Backward processing of the logfile to UNDO transactions. REDO and UNDO must be idempotent operations. This technique can also be used in a deadlock situation.
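Steps 2 and 3 can be sketched with before- and after-images. Assigning an image, rather than re-applying a change such as "add 100", is what makes REDO and UNDO idempotent. The `LogRecord` fields, the `recover` function and the sample values are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass
class LogRecord:
    txn: str     # transaction identifier
    item: str    # identifier of the data item affected
    before: int  # before-image
    after: int   # after-image


def recover(db, log, undo, redo):
    """Apply REDO in a forward pass, then UNDO in a backward pass."""
    for rec in log:            # forward processing of the logfile
        if rec.txn in redo:
            db[rec.item] = rec.after
    for rec in reversed(log):  # backward processing of the logfile
        if rec.txn in undo:
            db[rec.item] = rec.before
    return db


log = [
    LogRecord("T2", "x", 100, 200),
    LogRecord("T3", "y", 50, 60),
    LogRecord("T4", "z", 25, 35),
]
# State on disk after the crash: T3's uncommitted write reached the disk,
# while the committed writes of T2 and T4 did not.
db = {"x": 100, "y": 60, "z": 25}
recover(db, log, undo={"T3"}, redo={"T2", "T4"})
```

After recovery `db` is `{"x": 200, "y": 50, "z": 35}`, and running `recover` again, for instance after a crash during recovery, leaves it unchanged.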

