TRANSACTION RECOVERY
TRANSACTION
RECOVERY
TRANSACTION STATES
SERIALIZABILITY
CONFLICT SERIALIZABILITY
VIEW SERIALIZABILITY
RECOVERABLE SCHEDULE
CASCADELESS SCHEDULE
CONCURRENCY AND ITS PROBLEMS
TWO PHASE LOCKING PROTOCOL
LOG BASED RECOVERY
CHECKPOINTS
TRANSACTION
A transaction is a unit of program execution that accesses and possibly updates various data items of a database. It can be viewed as a series of reads and writes of database objects. To ensure the integrity of the data, transactions should possess the following properties, known as ACID:
ATOMICITY Either all the operations of a transaction are carried out or none are.
CONSISTENCY Execution of a transaction in isolation, that is, by itself with no concurrent execution of other transactions, maintains the consistency of the database.
ISOLATION Users should be able to understand a transaction without considering the effect of other concurrently executing transactions.
DURABILITY Once the DBMS notifies the user that the transaction has completed successfully, the changes it made to the database persist.
RECOVERY
Recovery in a database system means recovering the database itself. It implies restoring the database to a state that is known to be correct after some failure has rendered the current state incorrect, or at least suspect.
TYPES OF RECOVERY
There are three types of recovery:
1. Transaction Recovery
2. System Recovery
3. Media Recovery
TRANSACTION RECOVERY Recovering the database after an individual transaction has failed for some reason.
SYSTEM RECOVERY Recovering after some kind of system crash has caused all currently running transactions to fail simultaneously.
MEDIA RECOVERY Recovering after the database has been physically damaged in some way, for example by a head crash on the disk.
TRANSACTION STATES
[State diagram: Active, Partially Committed, Committed, Failed, Aborted; transitions labelled Success and Fail]
ACTIVE STATE A transaction enters the active state immediately after it starts execution; in this state it can issue read and write operations.
PARTIALLY COMMITTED When the transaction has executed its final operation, it enters the partially committed state. At this point all of its operations have been carried out, but the transaction has not yet committed.
COMMITTED A transaction that completes its execution successfully is said to be committed.
FAILED If a transaction, for example a partially committed one, encounters an error and cannot proceed, it moves to the failed state.
ABORTED A transaction is aborted once it has been rolled back and the database has been restored to its state prior to the start of the transaction.
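The state descriptions above can be sketched as a small transition table. This is only an illustration: the names `State`, `TRANSITIONS`, `can_move` and `follow` are hypothetical, and the edge from active directly to failed is the usual textbook assumption rather than something the slide states.

```python
from enum import Enum


class State(Enum):
    ACTIVE = "active"
    PARTIALLY_COMMITTED = "partially committed"
    COMMITTED = "committed"
    FAILED = "failed"
    ABORTED = "aborted"


# Allowed transitions. ACTIVE -> FAILED is an assumption (standard textbook
# diagram); the other edges follow the state descriptions in the text.
TRANSITIONS = {
    State.ACTIVE: {State.PARTIALLY_COMMITTED, State.FAILED},
    State.PARTIALLY_COMMITTED: {State.COMMITTED, State.FAILED},
    State.FAILED: {State.ABORTED},
    State.COMMITTED: set(),   # terminal state
    State.ABORTED: set(),     # terminal state
}


def can_move(src, dst):
    """Return True if a transaction may go directly from src to dst."""
    return dst in TRANSITIONS[src]


def follow(path):
    """Return True if every consecutive pair of states in path is legal."""
    return all(can_move(a, b) for a, b in zip(path, path[1:]))
```

For instance, `follow([State.ACTIVE, State.PARTIALLY_COMMITTED, State.COMMITTED])` describes a successful transaction, while a path that leaves `State.COMMITTED` is rejected.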
SERIALIZABILITY
When several transactions execute concurrently, the order in which their instructions are executed is known as a schedule.
Serializability is the generally accepted criterion of correctness for concurrency control mechanisms. A given interleaved execution of a set of transactions is considered correct if it is serializable, i.e., if it produces the same result as some serial execution of the same transactions, running them one at a time.
Two schedules are said to be equivalent if they are guaranteed to produce the same result, independent of the initial state of the database.
Two different serial schedules involving the same set of transactions might well produce different results. Hence two different interleaved schedules involving those transactions might also produce different results and yet both be considered correct.
CONFLICT SERIALIZABILITY
Consider a schedule S. Let Xi and Xj be consecutive operations belonging to two different transactions Ti and Tj respectively.
If Xi and Xj operate on different data items, then their order is irrelevant and they can be swapped.
If they operate on the same data item, then:
A) If both operations are reads, their order does not matter and they can be swapped.
B) If one of the operations is a write and the other is a read, the order does matter, because the reader reads either the initial value or the value that the other transaction wrote.
C) If both operations are writes, the order affects the final value left in the database.
In cases B and C the two operations are said to conflict.
If a schedule S can be transformed into a schedule S’ by a series of swaps of non-conflicting instructions, then S and S’ are conflict equivalent. The concept of conflict equivalence leads to the concept of conflict serializability: a schedule is conflict serializable if it is conflict equivalent to a serial schedule.
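Schedule S
Step  T1          T2
1     Read(A)
2     Write(A)
3                 Read(A)
4                 Write(A)
5     Read(B)
6     Write(B)
7                 Read(B)
8                 Write(B)

Exchange steps 3 and 5, and steps 4 and 6. This moves T1's Read(B) and Write(B) ahead of T2's Read(A) and Write(A). The operations that pass each other act on different data items, so they do not conflict, and each exchange can be carried out as a series of swaps of adjacent non-conflicting instructions. The result is S’.

Schedule S’
T1          T2
Read(A)
Write(A)
Read(B)
Write(B)
            Read(A)
            Write(A)
            Read(B)
            Write(B)

S and S’ are conflict equivalent and S’ is a serial schedule, hence S is conflict serializable.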
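One way to check the claim mechanically: two schedules over the same operations are conflict equivalent exactly when every pair of conflicting operations appears in the same order in both. The sketch below assumes a schedule is a list of `(transaction, "R" or "W", item)` tuples and that no transaction repeats the same operation on the same item; the function names are illustrative only.

```python
def conflicts(a, b):
    """Operations conflict if they come from different transactions,
    touch the same item, and at least one of them is a write."""
    return a[0] != b[0] and a[2] == b[2] and "W" in (a[1], b[1])


def conflict_pairs(schedule):
    """Set of (earlier, later) pairs of conflicting operations."""
    pairs = set()
    for i, a in enumerate(schedule):
        for b in schedule[i + 1:]:
            if conflicts(a, b):
                pairs.add((a, b))
    return pairs


def conflict_equivalent(s1, s2):
    """Same operations, and conflicting pairs ordered the same way.
    Assumes both inputs keep each transaction's own operation order."""
    return sorted(s1) == sorted(s2) and conflict_pairs(s1) == conflict_pairs(s2)


# The interleaved schedule S and the serial schedule S' from the example.
S = [("T1", "R", "A"), ("T1", "W", "A"), ("T2", "R", "A"), ("T2", "W", "A"),
     ("T1", "R", "B"), ("T1", "W", "B"), ("T2", "R", "B"), ("T2", "W", "B")]
S_SERIAL = [op for op in S if op[0] == "T1"] + [op for op in S if op[0] == "T2"]

print(conflict_equivalent(S, S_SERIAL))  # True
```

Running T2 entirely before T1 gives a serial schedule that is not conflict equivalent to S, because every conflicting pair is reversed.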
VIEW SERIALIZABILITY
Schedules S and U are view equivalent if:
a) For each data item x, if transaction Ti reads the initial value of x in schedule S, then Ti must also read the initial value of x in schedule U.
b) If Ti reads the value of x produced by transaction Tj in schedule S, then it must also read the value produced by Tj in schedule U.
c) Whichever transaction performs the final write of x in schedule S must also perform the final write of x in schedule U.
A schedule is view serializable if it is view equivalent to a serial schedule.
Schedules S and U are view equivalent. Since S is a serial schedule, U is also view serializable.

Schedule S
T1          T2
Read(A)
Write(A)
Read(B)
Write(B)
            Read(A)
            Write(A)
            Read(B)
            Write(B)

Schedule U
T1          T2
Read(A)
Write(A)
            Read(A)
            Write(A)
Read(B)
Write(B)
            Read(B)
            Write(B)
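The three view-equivalence conditions can be checked by recording, for each read, which transaction's write it sees (or the initial value), together with the final writer of each item. This is a minimal sketch under the assumption that a schedule is a list of `(transaction, "R" or "W", item)` tuples; `view_profile` and `view_equivalent` are hypothetical names.

```python
from collections import Counter


def view_profile(schedule):
    """Return (reads-from multiset, final writer per item).
    A read that sees the initial value is recorded with writer None."""
    last_writer = {}          # item -> transaction that wrote it most recently
    reads_from = Counter()    # (reader, item, writer) -> number of such reads
    for txn, op, item in schedule:
        if op == "R":
            reads_from[(txn, item, last_writer.get(item))] += 1
        else:
            last_writer[item] = txn
    return reads_from, last_writer


def view_equivalent(s1, s2):
    """Conditions a) and b) compare the reads-from information,
    condition c) compares the final writers."""
    return view_profile(s1) == view_profile(s2)


# Schedule S (serial) and schedule U (interleaved) from the example.
S_VIEW = [("T1", "R", "A"), ("T1", "W", "A"), ("T1", "R", "B"), ("T1", "W", "B"),
          ("T2", "R", "A"), ("T2", "W", "A"), ("T2", "R", "B"), ("T2", "W", "B")]
U_VIEW = [("T1", "R", "A"), ("T1", "W", "A"), ("T2", "R", "A"), ("T2", "W", "A"),
          ("T1", "R", "B"), ("T1", "W", "B"), ("T2", "R", "B"), ("T2", "W", "B")]

print(view_equivalent(S_VIEW, U_VIEW))  # True
```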
RECOVERABLE SCHEDULE
A recoverable schedule is one where, for each pair of transactions Ti and Tj such that Tj reads a data item previously written by Ti, the commit operation of Ti appears before the commit operation of Tj.

Schedule S
Ti          Tj
Write(A)
Write(B)
            Read(A)
            Read(B)

For S to be recoverable, Ti must commit before Tj commits.
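The definition can be turned into a small check. The sketch assumes a schedule is a list of `(transaction, op, item)` tuples, where `op` is `"R"`, `"W"` or `"C"` (commit, with item `None`), and that "previously written by" means the most recent write to the item. The commit operations in the example data are added for illustration; they are not part of the slide's table.

```python
def is_recoverable(schedule):
    """True if every committed reader commits after the transaction
    whose write it read."""
    commit_pos = {t: i for i, (t, op, _) in enumerate(schedule) if op == "C"}
    last_writer = {}  # item -> transaction that wrote it most recently
    for txn, op, item in schedule:
        if op == "W":
            last_writer[item] = txn
        elif op == "R":
            writer = last_writer.get(item)
            if writer is None or writer == txn or txn not in commit_pos:
                continue  # initial value, own write, or reader never commits
            if writer not in commit_pos or commit_pos[writer] > commit_pos[txn]:
                return False
    return True


# Ti writes A and B, Tj reads them; the commit order decides recoverability.
BODY = [("Ti", "W", "A"), ("Ti", "W", "B"), ("Tj", "R", "A"), ("Tj", "R", "B")]
GOOD = BODY + [("Ti", "C", None), ("Tj", "C", None)]
BAD = BODY + [("Tj", "C", None), ("Ti", "C", None)]

print(is_recoverable(GOOD), is_recoverable(BAD))  # True False
```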
CASCADELESS SCHEDULE
Transaction T10 writes a value of A that is read by transaction T11. Similarly, transaction T11 writes a value of A that is read by transaction T12. Suppose that at some point T10 fails. T10 must be rolled back; since T11 is dependent on T10, T11 must be rolled back; since T12 is dependent on T11, T12 must be rolled back. This phenomenon, in which a single transaction failure leads to a series of transaction rollbacks, is called cascading rollback.
A cascadeless schedule is one in which cascading rollbacks cannot occur: for each pair of transactions Ti and Tj such that Tj reads a data item previously written by Ti, the commit operation of Ti appears before the read operation of Tj. It is easy to verify that every cascadeless schedule is also a recoverable schedule.

T10         T11         T12
Read(A)
Read(B)
Write(A)
            Read(A)
            Write(A)
                        Read(A)
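A schedule avoids cascading rollback when no transaction reads a value written by another transaction that has not yet committed. The sketch below checks that condition, assuming a schedule is a list of `(transaction, op, item)` tuples with `op` in `"R"`, `"W"`, `"C"` (commit, item `None`). The commit placement in `FIXED` is a hypothetical variant added for illustration.

```python
def is_cascadeless(schedule):
    """True if every read of another transaction's write happens
    only after that transaction has committed."""
    last_writer = {}   # item -> transaction that wrote it most recently
    committed = set()
    for txn, op, item in schedule:
        if op == "C":
            committed.add(txn)
        elif op == "W":
            last_writer[item] = txn
        elif op == "R":
            writer = last_writer.get(item)
            if writer is not None and writer != txn and writer not in committed:
                return False  # read of uncommitted data
    return True


# The T10/T11/T12 schedule from the text: T11 and T12 read uncommitted data.
CASCADING = [("T10", "R", "A"), ("T10", "R", "B"), ("T10", "W", "A"),
             ("T11", "R", "A"), ("T11", "W", "A"), ("T12", "R", "A")]
# Hypothetical variant in which each writer commits before its value is read.
FIXED = [("T10", "R", "A"), ("T10", "R", "B"), ("T10", "W", "A"), ("T10", "C", None),
         ("T11", "R", "A"), ("T11", "W", "A"), ("T11", "C", None),
         ("T12", "R", "A")]

print(is_cascadeless(CASCADING), is_cascadeless(FIXED))  # False True
```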
TESTING OF SERIALIZABILITY
Consider a schedule S. We construct a directed graph, called a precedence graph, from S.
The set of vertices consists of all the transactions participating in the schedule.
The set of edges consists of all edges Ti -> Tj for which one of three conditions holds:
Ti executes Write(x) before Tj executes Read(x).
Ti executes Read(x) before Tj executes Write(x).
Ti executes Write(x) before Tj executes Write(x).
If the precedence graph contains a cycle, S is not conflict serializable; if it contains no cycle, S is conflict serializable.
[Figure: an edge Ti -> Tj, drawn once for each of the three conditions]
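The test can be sketched in a few lines: build the edge set from the three conditions, then look for a cycle with a depth-first search. The schedule format (a list of `(transaction, "R" or "W", item)` tuples) and the function names are assumptions made for this illustration.

```python
def precedence_graph(schedule):
    """Edges Ti -> Tj for every conflicting pair where Ti's operation comes first."""
    edges = set()
    for i, (ti, op_i, x) in enumerate(schedule):
        for tj, op_j, y in schedule[i + 1:]:
            if ti != tj and x == y and "W" in (op_i, op_j):
                edges.add((ti, tj))
    return edges


def has_cycle(edges):
    """Depth-first search for a cycle in the directed graph given by edges."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set())
    visiting, done = set(), set()

    def visit(node):
        if node in done:
            return False
        if node in visiting:
            return True           # back edge: cycle found
        visiting.add(node)
        if any(visit(n) for n in graph[node]):
            return True
        visiting.discard(node)
        done.add(node)
        return False

    return any(visit(n) for n in graph)


def is_conflict_serializable(schedule):
    return not has_cycle(precedence_graph(schedule))


# T1 reads A, T2 overwrites A, then T1 writes A: edges T1 -> T2 and T2 -> T1.
NOT_SERIALIZABLE = [("T1", "R", "A"), ("T2", "W", "A"), ("T1", "W", "A")]
print(is_conflict_serializable(NOT_SERIALIZABLE))  # False
```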
CONCURRENCY
Concurrency refers to the fact that a DBMS allows many transactions to access the same data at the same time. Such a system therefore requires a concurrency control mechanism to ensure that concurrent transactions do not interfere with each other's operations.
The three concurrency problems are:
1. The lost update problem
2. The uncommitted dependency problem
3. The inconsistent analysis problem
LOST UPDATE PROBLEM
Transaction ‘A’ retrieves some tuple ‘p’ at time T1.
Transaction ‘B’ retrieves the same tuple ‘p’ at time T2.
Transaction ‘A’ updates the tuple at time T3 on the basis of the value seen at T1.
Transaction ‘B’ updates the same tuple at time T4 on the basis of the value seen at T2.
Transaction A's update is lost at time T4, because transaction B overwrites it without even looking at it.

TRANSACTION A     TIME    TRANSACTION B
Retrieve ‘p’      T1
                  T2      Retrieve ‘p’
Update ‘p’        T3
                  T4      Update ‘p’
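The interleaving above can be replayed with plain variables. The starting value and the two increments are made-up numbers chosen for illustration; only the order of the four steps comes from the text.

```python
def lost_update_demo(initial=100, a_delta=10, b_delta=20):
    """Replay the T1..T4 interleaving and return the final value of p."""
    p = initial
    seen_by_a = p                # T1: transaction A retrieves p
    seen_by_b = p                # T2: transaction B retrieves p
    p = seen_by_a + a_delta      # T3: A updates p from the value seen at T1
    p = seen_by_b + b_delta      # T4: B updates p from the value seen at T2
    return p


print(lost_update_demo())  # 120, whereas a serial execution would give 130
```

A's increment of 10 has vanished: B computed its result from the value it read at T2, before A's update existed.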
UNCOMMITTED DEPENDENCY PROBLEM
This problem occurs when one transaction updates a database item and then fails for some reason, and the updated item is accessed by another transaction before it is changed back to its original value.
Here A updates item X and then fails before completion, so the system must roll X back to its original value. Before it can do so, transaction B reads the "inconsistent" value of X, which will never be stored permanently in the database because of the failure of transaction A. The value of X read by transaction B is called inconsistent data because it was created by a transaction that has not yet completed and committed. Hence this problem is called the uncommitted dependency problem.

TRANSACTION A     TIME    TRANSACTION B
Read(X)           T1
X = X - n
Write(X)
                  T2      Read(X)
                          X = X + m
                          Write(X)
INCONSISTENT ANALYSIS PROBLEM
Acc1 = 40, Acc2 = 50, Acc3 = 30
Transaction A -> sums the account balances.
Transaction B -> transfers an amount of 10 from Acc3 to Acc1.
The result produced by transaction A, 110, is incorrect: the balances total 120 both before and after the transfer. If transaction A were to go on to write that result back into the database, it would leave the database in an inconsistent state. Transaction A has therefore performed an inconsistent analysis, and the problem is called the inconsistent analysis problem.

TRANSACTION A                 TIME    TRANSACTION B
Retrieve ‘Acc1’ (sum 40)      T1
Retrieve ‘Acc2’ (sum 90)      T2
                              T3      Retrieve ‘Acc3’
                              T4      Update ‘Acc3’ (30 -> 20)
                              T5      Retrieve ‘Acc1’
                              T6      Update ‘Acc1’ (40 -> 50)
                              T7      Commit
Retrieve ‘Acc3’ (sum 110)     T8
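The arithmetic can be replayed step by step. The sketch uses the balances and the transfer amount from the example; the function name and the dictionary layout are illustrative only.

```python
def inconsistent_sum():
    """Replay steps T1..T8 and return (sum seen by A, true total)."""
    accounts = {"Acc1": 40, "Acc2": 50, "Acc3": 30}
    total = 0
    total += accounts["Acc1"]    # T1: A reads Acc1, running sum 40
    total += accounts["Acc2"]    # T2: A reads Acc2, running sum 90
    accounts["Acc3"] -= 10       # T3-T4: B reads and updates Acc3 (30 -> 20)
    accounts["Acc1"] += 10       # T5-T6: B reads and updates Acc1 (40 -> 50)
    # T7: B commits
    total += accounts["Acc3"]    # T8: A reads Acc3, running sum 110
    return total, sum(accounts.values())


print(inconsistent_sum())  # (110, 120)
```

A counted Acc1 before the transfer and Acc3 after it, so the 10 that moved between them is missing from its sum.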
TWO PHASE LOCKING PROTOCOL
The two phase locking protocol is as follows:
Before operating on any object, a transaction must acquire a lock on that object.
After releasing a lock, a transaction must never go on to acquire any more locks.
Note: a transaction that obeys this protocol has two phases: a lock acquisition phase and a lock releasing phase. In practice the second phase is often compressed into the single operation of commit or rollback at the end of the transaction.
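The two rules can be illustrated with a small bookkeeping class that refuses any lock request made after the first release. This is a sketch of the protocol rule only: it does not implement a lock manager, lock modes, or waiting, and the class and method names are hypothetical.

```python
class TwoPhaseTransaction:
    """Tracks the locks one transaction holds and enforces the 2PL rule."""

    def __init__(self, name):
        self.name = name
        self.held = set()
        self.shrinking = False   # becomes True after the first release

    def lock(self, obj):
        if self.shrinking:
            raise RuntimeError(
                f"{self.name}: cannot lock {obj!r} after releasing a lock")
        self.held.add(obj)

    def access(self, obj):
        # Rule 1: a lock must be held before operating on the object.
        if obj not in self.held:
            raise RuntimeError(f"{self.name}: no lock held on {obj!r}")

    def unlock(self, obj):
        self.held.remove(obj)
        self.shrinking = True    # the lock releasing phase has begun

    def commit(self):
        # Second phase compressed into one operation at end of transaction.
        self.held.clear()
        self.shrinking = True
```

A transaction that calls `lock("A")`, `lock("B")`, `unlock("A")` and then `lock("C")` gets a `RuntimeError` on the last call, because it has already entered its releasing phase.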
LOG BASED RECOVERY
To keep track of database transactions, the DBMS maintains special files called log files or journals that contain information about all updates. A log file contains information such as the transaction identifier, the type of the log record, and so on.
LOG
The most widely used structure for recording database modifications is the log. The log is a sequence of log records, recording all the update activities in the database. There are several types of log records. An update log record describes a single database write. It has these fields:
TRANSACTION IDENTIFIER is the unique identifier of the transaction that performed the write operation.
DATA ITEM IDENTIFIER is the unique identifier of the data item written. Typically, it is the location on disk of the data item.
OLD VALUE is the value of the data item prior to the write.
NEW VALUE is the value that the data item will have after the write.
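The four fields map naturally onto a small record type. In the sketch below the field names, the in-memory dictionary standing in for the database, and the `undo` and `redo` helpers are assumptions for illustration; they show one common use of the old and new values, not a specific DBMS format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UpdateLogRecord:
    transaction_id: str   # transaction that performed the write
    data_item_id: str     # item written (typically its location on disk)
    old_value: object     # value before the write
    new_value: object     # value after the write


def undo(db, record):
    """Restore the value the item had before the logged write."""
    db[record.data_item_id] = record.old_value


def redo(db, record):
    """Reapply the logged write."""
    db[record.data_item_id] = record.new_value


# Example: T1 changes item A from 100 to 50.
db = {"A": 100}
rec = UpdateLogRecord("T1", "A", old_value=100, new_value=50)
redo(db, rec)   # db["A"] is now 50
undo(db, rec)   # db["A"] is back to 100
```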
CHECKPOINTS
A checkpoint is a point of synchronization between the database and the transaction log file. All buffers are force-written to secondary storage at the checkpoint. Checkpoints are also called sync points or save points. Checkpoints are scheduled at predetermined intervals and involve operations such as writing all log records in main memory to secondary storage.
If transactions are executed serially, then when a failure occurs we check the log file to find the last transaction that started before the last checkpoint.
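The log search described above can be sketched as follows. The log format (a list of tuples such as `("start", "T1")`, `("commit", "T1")` and `("checkpoint",)`) and the function name are hypothetical. The sketch only locates the transactions that recovery has to look at under serial execution; whether each one is redone or undone depends on the recovery scheme and is not decided here.

```python
def transactions_to_examine(log):
    """Return the last transaction started before the last checkpoint,
    followed by every transaction started after that checkpoint."""
    checkpoints = [i for i, rec in enumerate(log) if rec[0] == "checkpoint"]
    if not checkpoints:
        # No checkpoint yet: every started transaction must be examined.
        return [rec[1] for rec in log if rec[0] == "start"]
    cp = checkpoints[-1]
    started_before = [rec[1] for rec in log[:cp] if rec[0] == "start"]
    started_after = [rec[1] for rec in log[cp + 1:] if rec[0] == "start"]
    return started_before[-1:] + started_after


LOG = [("start", "T1"), ("commit", "T1"),
       ("start", "T2"), ("checkpoint",), ("commit", "T2"),
       ("start", "T3")]

print(transactions_to_examine(LOG))  # ['T2', 'T3']
```

T1 committed before the checkpoint, so its changes were already forced to secondary storage and it does not need to be examined.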