Chapter 16 Recovery Yonsei University 1 st Semester, 2015 Sanghyun Park.

Chapter 16 Recovery Yonsei University 1 st Semester, 2015 Sanghyun Park

Outline  Failure Classification  Recovery and Atomicity  Log-Based Recovery  Shadow Paging  Recovery with Concurrent Transactions  Buffer Management

Failure Classification  Transaction failure  Logical errors: transaction cannot complete due to some logical error condition  System errors: the database system must terminate an active transaction due to an error condition (e.g., deadlock)  System crash: a power failure or other hardware or software failure causes the system to crash  Disk failure: a head crash or similar disk failure destroys all or part of disk storage

Recovery Algorithms  Recovery algorithms are techniques to ensure DB consistency, transaction atomicity, and durability despite failures  Recovery algorithms have two parts  Actions taken during normal transaction processing to ensure enough information exists to recover from failures  Actions taken after a failure to recover the database contents to a state that ensures atomicity, consistency, and durability

Data Access (1/2)  Physical blocks are those blocks residing on the disk  Buffer blocks are those blocks residing temporarily in main memory  Block movements between disk and main memory are initiated through input(B) and output(B) operations  Each transaction T i has its private work-area in which local copies of all data items accessed and updated by T i are kept  We assume, for simplicity, that each data item fits in a single block

Data Access (2/2)  Transaction transfers data items between buffer blocks and its private work-area using read(X) and write(X) operations  Transactions  Perform read(X) when accessing X for the first time  All subsequent accesses are to the local copy  After last access, transaction executes write(X)  Output(B X ) does not need to follow write(X) immediately

Example of Data Access X Y A B x1x1 y1y1 buffer Buffer Block A Buffer Block B input(A) output(B) read(X)write(Y) disk work area of T 1 work area of T 2 memory x2x2

Recovery and Atomicity (1/2)  Modifying the database without ensuring that the transaction will commit may leave the database in an inconsistent state  Consider transaction T i that transfers $50 from account A to account B; goal is either to perform all database modifications made by T i or none at all  Several output operations may be required for T i. A failure may occur after one of these modifications have been made but before all of them are made

Recovery and Atomicity (2/2)  To ensure atomicity despite failures, we first output information describing the modifications to stable storage without modifying the database itself  We study two approaches  Log-based recovery  Shadow-paging  We assume (initially) that transactions run serially, that is, one after the other

Log-Based Recovery (1/2)  The log is a sequence of log records, recording all the update activities in the database  When transaction T i starts, it registers itself by writing a log record  Before T i executes write(X), a log record is written, where V 1 is an old value and V 2 is a new value

Log-Based Recovery (2/2)  When T i finishes its last statement, the log record is written  We assume for now that log records are written directly to stable storage (that is, they are not buffered)  Two approaches using logs: deferred or immediate database modification

Deferred Database Modification (1/4)  The deferred database modification scheme records all database modifications in the log, but defers all the writes until the transaction partially commits  Transaction starts by writing record to log  A write(X) operation by T i results in the writing of a new record to the log  The write is not performed on X at this time, but deferred

Deferred Database Modification (2/4)  When T i partially commits, a record is written to the log  Finally, the log records are read and used to actually execute the previously deferred writes  During recovery after a crash, a transaction needs to be redone if both and are in the log  Redoing a transaction T i (redo T i ) sets the value of all data items updated by the transaction to the new values

Deferred Database Modification (3/4)  Crashes can occur while  The transaction is executing the original updates  While recovery action is being taken  Example transactions T 0 and T 1 (T 0 executes before T 1 ) T 0 : read(A) A := A – 50 write(A) read(B) B := B + 50 write(B) T 1 : read(C) C := C – 100 write(C)

Deferred Database Modification (4/4)  Below is the log as it appears at three time instances  If log on stable storage at time of crash is as in case: (a) No redo actions need to be taken (b) Redo(T 0 ) must be performed since is present (c) Redo(T 0 ) must be performed followed by redo(T 1 ) since both and are present

Immediate Database Modification (1/4)  The immediate modification scheme allows database modifications to be output to the database while the transaction is still in the active stage  Update log record must be written before database item is written  Output of updated buffer blocks can take place at any time before or after transaction commit  Order in which blocks are output can be different from the order in which they are written

Immediate Database Modification (2/4)  Example (B X denotes block containing X) Log Write Output A = 950 B = 2050 C = 600 B B, B C B A

Immediate Database Modification (3/4)  Recovery procedure has two operations instead of one:  Undo(T i ) restores the value of all data items updated by T i to their old values, going backward from the last log record for T i  Redo(T i ) sets the value of all data items updated by T i to the new values, going forward from the first log record for T i  Both operations must be idempotent  That is, even if the operation is executed multiple times, the effect is the same as if it is executed once

Immediate Database Modification (4/4)  When recovering after failure:  Transaction T i needs to be undone if the log contains the record, but does not contain the record  Transaction T i needs to be redone if the log contains both the record and the record  Undo operations are performed first, then redo operations

Immediate Database Modification Recovery Example  Below is the log as it appears at three time instances  Recovery actions in each case above are: (a) Undo(T 0 ): B is restored to 2000 and A to 1000 (b) Undo(T 1 ) and redo(T 0 ): C is restored to 700, and then A and B are set to 950 and 2050 respectively (c) Redo(T 0 ) and redo(T 1 ): A and B are set to 950 and 2050 respectively, then C is set to 600

Checkpoints (1/2)  Problems in recovery procedure discussed earlier:  Searching the entire log is time-consuming  Most of the transactions that need to be redone have already written their updates into the database  To reduce these types of overhead, the system periodically performs checkpoints, which require the following sequence of actions to take place  Output onto stable storage all log records currently residing in main memory  Output to the disk all modified buffer blocks  Output onto stable storage a log record

Checkpoints (2/2)  During recovery we need to consider only the most recent transaction T i that started before the checkpoint, and transactions that started after T i  Scan backward from end of log to find the most recent record  Continue scanning backwards till a record is found  Need only consider the part of log following above start record  For all transactions starting from T i or later with no, execute undo(T i )  Scanning forward in the log, for all transactions starting from T i or later with, execute redo(T i )

Example of Checkpoints  T 1 can be ignored (updates already output to disk due to checkpoint)  T 2 and T 3 redone  T 4 undone TcTc TfTf T1T1 T2T2 T3T3 T4T4 checkpoint system failure

Page And Page Table  The database is partitioned into some number of fixed- length blocks, which are referred to as pages  The pages do not need to be stored in any particular order on disk  We use a page table to find the i th page of the database for a given i

Sample Page Table

Shadow Paging (1/6)  Shadow paging is an alternative to log-based recovery; this scheme is useful if transactions execute serially  The key idea is to maintain two page tables during the lifetime of a transaction – the current page table and the shadow page table  Both page tables are identical when the transaction starts  The shadow table is never changed over the duration of the transaction; the current page table may be changed when a transaction performs a write operation  All input and output operations use the current page table to locate database pages on disk

Shadow Paging (2/6)  Suppose that the transaction T j performs a write(X), and that X resides on the i th page  The system executes the write operation as follows:  If the i th page is not already in main memory, then the system issues input(X)

Shadow Paging (3/6)  If this is the write first performed on the i th page by this transaction, then the system modifies the current page table as follows:  It finds an unused page on disk  It copies the contents of the i th page to the page found in above step  It modifies the current page table so that the i th entry points to the page found in above step  It assigns the value of x j to X in the buffer page

Shadow Paging (4/6)  Example: shadow and current page tables after write to page 4

Shadow Paging (5/6)  To commit a transaction:  Flush all modified pages in main memory to disk  Output current page table to disk  Make the current page table the new shadow page table  Once pointer to shadow page table has been written, transaction is committed  No recovery is needed after a crash – new transactions can start right away, using the shadow page table  Pages not pointed to from current/shadow page table should be freed (garbage collected)

Shadow Paging (6/6)  Advantages of shadow-paging over log-based schemes  No overhead of writing log records  Recovery is trivial  Disadvantages  Copying the entire page table is very expensive  Commit overhead is high  Data gets fragmented  After every transaction completion, the database pages containing old versions of modified data need to be garbage collected  Hard to extend algorithm to allow transactions to run concurrently

Recovery With Concurrent Transactions (1/3)  We modify the log-based recovery schemes to allow multiple transactions to execute concurrently; they share a single buffer and a single log  Logging is done as described as earlier  The checkpoint technique and actions taken on recovery have to be changed

Recovery With Concurrent Transactions (2/3)  Checkpoints are performed as before, except that checkpoint log record is now of the form: (L: list of transactions active at the time of checkpoint)  When the system recovers, it first does the following:  Initialize undo-list and redo-list to empty  Scan the log backward from the end, stopping when the first record is found  For each record found during the backward scan: if the record is, add T i to redo-list; If the record is and T i is not in redo-list, add T i to undo-list  For every T i in L, if T i is not in redo-list, add T i to undo-list

Recovery With Concurrent Transactions (3/3)  At this point, undo-list consists of incomplete transactions that must be undone, and redo-list consists of finished transactions that must be redone  Recovery now continues as follows:  Scan log backward from the end, and perform undo for each log record that belongs to transaction T i on undo-list, stopping when records have been found for every T i in undo-list  Locate the most recent record  Scan log forward from the record, and perform redo for each log record that belongs to transaction T i on redo- list  It is important to undo the transaction in undo-list before redoing transactions in redo-list (example in textbook)

Log Record Buffering (1/2)  Log record buffering: log records are buffered in main memory, instead of being output directly to stable storage  Log records are output to stable storage when a block of log records in the buffer is full, or a log force operation is executed  Log force is performed to commit a transaction by forcing all its log records to stable storage  Several log records can thus be output using a single output operation, reducing the I/O cost

Log Record Buffering (2/2)  The rules below must be followed if log records are buffered  Log records are output to stable storage in the order in which they are created  Transaction T i enters the commit state only when the log record has been output to stable storage  Before a block of data in main memory is output to the database, all log records pertaining to data in that block must have been output to stable storage  This rule is called the write-ahead logging or WAL rule

Chapter 16 Recovery Yonsei University 1 st Semester, 2015 Sanghyun Park.

Similar presentations

Presentation on theme: "Chapter 16 Recovery Yonsei University 1 st Semester, 2015 Sanghyun Park."— Presentation transcript:

Similar presentations

About project

Feedback

Log in

Auth with social network:

Chapter 16 Recovery Yonsei University 1 st Semester, 2015 Sanghyun Park.

Similar presentations

Presentation on theme: "Chapter 16 Recovery Yonsei University 1 st Semester, 2015 Sanghyun Park."— Presentation transcript:

Similar presentations

About project

Feedback