Chapter 81 Chapter 8 Coping With System Failures Spring 2001 Prof. Sang Ho Lee School of Computing, Soongsil Univ. shlee@computing.soongsil.ac.kr

Chapter 82 Overview
Two major issues
–Data must be protected in the face of a system failure; resilience (Chapter 8)
–Data must not be corrupted (Chapters 9, 10)
Terms
–Log (“undo”, “redo”, and “undo/redo”)
–Checkpointing
–Archiving

Chapter 83 Transaction
A sequence of database operations that has the ACID properties
Syntax in ESQL/C
–Start: most (but not all) SQL statements
–Statements that do not initiate a transaction: Connect, Disconnect, Set, Commit, Rollback, Declare, Get Diagnostics, …
–End: Commit work, Rollback work
Commit indicates the successful end of a transaction
Rollback indicates abnormal termination of a transaction

Chapter 84 ACID Properties (1/2)
Atomicity
–Either all actions in a transaction occur successfully or nothing has happened
–All-or-nothing property
Consistency
–Assumes that any successful transaction commits only legal results
–A transaction is a correct transformation of the state, i.e., from one valid state to another valid state

Chapter 85 ACID Properties (2/2)
Isolation
–Events within a transaction must be hidden from other transactions running concurrently
–The actions carried out by a transaction against a shared database cannot become visible to other transactions until the transaction commits
Durability
–Once a transaction has completed and committed, the system must guarantee that its results survive any subsequent failures

Chapter 86 Recovery
Objective: to guarantee the durability and atomicity of transactions
Metrics
–Degree of concurrency supported
–Complexity of logic
–I/O overhead during restart and normal processing
–Functionality
–Lock modes supported
–Storage management flexibility

Chapter 87 Failure Modes
Transaction failure
–When a transaction aborts
–Need transaction rollback
System failure
–Refers to the loss or corruption of volatile storage (main memory)
–Power out, OS failure, …
–Need system restart
Media (catastrophic) failure
–When any part of the stable storage (disk) is destroyed
–Head crash, disk controller error, …
–Need roll-forward

Chapter 88 The Log Manager and Transaction Manager
[Figure 8.1: The log manager and transaction manager. Components shown: query processor, transaction manager, log manager, buffer manager, recovery manager, log, data]

Chapter 89 The Primitive Operations of Transactions (1) How transactions interact with databases –The space of disk blocks holding the database elements –The virtual or main memory address space that is managed by the buffer manager –The local address space of the transaction * Transactions don’t access the disk holding the database elements directly

Chapter 810 The Primitive Operations of Transactions (2)
–INPUT(X): Copy the disk block containing database element X to a memory buffer
–READ(X,t): Copy the database element X to the transaction’s local variable t
  –If the block containing database element X is not in a memory buffer, then first execute INPUT(X)
–WRITE(X,t): Copy the value of local variable t to database element X in a memory buffer
–OUTPUT(X): Copy the buffer containing X to disk

Chapter 811 The Primitive Operations of Transactions (3) Note that –A database element is no larger than a single block –If database elements occupy several blocks, then we shall imagine that each block-sized portion of the element is an element by itself

Chapter 812 Example 8.2 (1/2)
–Consider a database that has two elements A and B
–The constraint is that the elements must be equal in all consistent states
–Transaction T
    A := A*2
    B := B*2
–We could express T as six steps
    READ(A,t); t := t*2; WRITE(A,t);
    READ(B,t); t := t*2; WRITE(B,t);

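Chapter 813 Example 8.2 (2/2)
Action       t    Mem A  Mem B  Disk A  Disk B
READ(A,t)    8    8      -      8       8
t := t*2     16   8      -      8       8
WRITE(A,t)   16   16     -      8       8
READ(B,t)    8    16     8      8       8
t := t*2     16   16     8      8       8
WRITE(B,t)   16   16     16     8       8
OUTPUT(A)    16   16     16     16      8
OUTPUT(B)    16   16     16     16      16
(“-” means the block is not yet in a memory buffer)
Failure could happen anytime, anywhere!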
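The trace in Example 8.2 can be reproduced with a small simulation. This is only a sketch: the function name `simulate`, the `crash_after` parameter, and the use of dictionaries for disk, buffers, and local variables are assumptions made for illustration, and the initial value 8 for A and B is taken from the later examples.

```python
def simulate(crash_after=8):
    """Run the eight actions of Example 8.2, stopping after `crash_after` of them."""
    disk = {"A": 8, "B": 8}      # database elements on disk
    buffers, local = {}, {}      # buffer pool and the transaction's local variables

    def INPUT(x):
        buffers[x] = disk[x]

    def READ(x, t):
        if x not in buffers:     # bring the block into memory first
            INPUT(x)
        local[t] = buffers[x]

    def WRITE(x, t):
        if x not in buffers:
            INPUT(x)
        buffers[x] = local[t]

    def OUTPUT(x):
        disk[x] = buffers[x]

    def double(t):
        local[t] = local[t] * 2

    steps = [
        lambda: READ("A", "t"), lambda: double("t"), lambda: WRITE("A", "t"),
        lambda: READ("B", "t"), lambda: double("t"), lambda: WRITE("B", "t"),
        lambda: OUTPUT("A"), lambda: OUTPUT("B"),
    ]
    for step in steps[:crash_after]:
        step()
    return disk                  # only the disk survives a system failure
```

Calling `simulate(7)` imitates a crash between OUTPUT(A) and OUTPUT(B): the disk then holds A = 16 and B = 8, which violates the constraint A = B. This is the situation the logging techniques in the rest of the chapter are meant to repair.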

Chapter 814 Recovery Techniques
A very complex area
No formal (mathematical) model of recovery
Implementations and techniques depend heavily on other features of a particular system (concurrency control, disk management, buffer management, index management, etc.)
Much of the work was not documented well

Chapter 815 Shadowing Approach
A logical page is read from a physical page P (shadow version) and, after modification, is written to another physical page P’ (current version)
During a checkpoint, shadow versions are discarded and current versions become shadow versions
On failure, recovery is performed with the log and the shadow versions
UNDO is very simple (+)
A lot of disk space is needed (-)
Hard to cluster pages on disk (-)
Hard to support record-level locking (-)
Not adopted in modern commercial systems

Chapter 816 Logging Approach
In-place update in buffer and on disk
All updates are logged in a “linear file” called the log
Outperforms shadowing in general
Widely used in various systems

Chapter 817 Log Concept A history of all changes to the state Log + old state gives new state Log + new state gives old state Log is a sequential file Complete log is the complete history

Chapter 818 LSN (log sequence number)
Each log record has a log sequence number
The LSN plays a key role in many algorithms
Key property: monotonicity
–If action A happens after action B, then LSN(A) > LSN(B)
Exercise: think about how to implement LSNs
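One common way to obtain monotonic LSNs is to use the byte offset at which a record starts in the append-only log. The sketch below assumes that approach; the class name `Log`, the 4-byte length prefix, and the in-memory `bytearray` standing in for the log file are all illustrative choices, not something the slides prescribe.

```python
class Log:
    """Append-only log in which a record's LSN is its starting byte offset."""

    def __init__(self):
        self.data = bytearray()          # stands in for the log file

    def append(self, payload: bytes) -> int:
        lsn = len(self.data)             # offset of the new record
        self.data += len(payload).to_bytes(4, "big") + payload
        return lsn

    def read(self, lsn: int) -> bytes:
        size = int.from_bytes(self.data[lsn:lsn + 4], "big")
        return bytes(self.data[lsn + 4:lsn + 4 + size])
```

Because the log only grows, a later record always gets a larger offset, so monotonicity holds without a separate counter, and the LSN also serves as the address for reading the record back.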

Chapter 819 DO-REDO-UNDO
Redo proceeds forward in the log (FIFO) while undo proceeds backward (LIFO)
DO: old state → new state + log record
REDO: old state + log record → new state
UNDO: new state + log record → old state

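Chapter 820 Buffer manager vs. Recovery
Buffer-management policies
–Steal: modified pages may be written to disk before EOT (end of transaction)
–Not Steal: modified pages are kept in the buffer until EOT
–Force: all modified pages are flushed to disk at EOT
–Not Force: modified pages need not be flushed at EOT
Recovery work required
–Steal/Not Force: REDO and UNDO
–Steal/Force: UNDO only
–Not Steal/Not Force: REDO only
–Not Steal/Force: neither REDO nor UNDO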
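The four policy combinations follow from two independent observations, which the sketch below encodes. The function name `recovery_needs` and the set-of-strings result are assumptions made for illustration.

```python
def recovery_needs(steal: bool, force: bool) -> set:
    """Return the recovery actions a steal/force policy combination requires."""
    needs = set()
    if steal:
        # Uncommitted changes may already be on disk, so they may need undoing.
        needs.add("UNDO")
    if not force:
        # Committed changes may not have reached disk, so they may need redoing.
        needs.add("REDO")
    return needs
```

Read this way, UNDO is the price of Steal and REDO is the price of Not Force.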

Chapter 821 Kinds of Logging
Physical (value) logging
–Keep old and new values of a container (page, file, …)
–Simple (+), generates lots of log records (-)
Logical logging
–Keep all parameters such that we can compute F(x) and F^-1(x)
–Example: <insert operation, table name, record value>
–Compact log (+), complex recovery logic (-)
Physio-logical logging

Chapter 822 Physio-logical logging
Physical to a page, logical within a page
Example
    struct physio_logical_log_record /* insert */ {
        int  opcode;     /* operation code, e.g. insert */
        long pageNo;     /* page the operation applies to */
        long record_id;  /* record within that page */
        long length;     /* number of bytes in record[] */
        char record[];   /* record image; C99 flexible array member */
    };
Generates one log record per affected page
–Logical logging generates only one record per action
When a record is inserted, page reorganization, page head/tail changes, etc. are all implicit in physiological logging

Chapter 823 Compensation Log Record
UNDO generates a log record that records the undo steps
Redundant? But widely used in practice
UNDO: new state + log record → old state + compensation log record

Chapter 824 Page LSN and Idempotence
Page LSN: each page contains an LSN (called the page LSN) that identifies the most recent update to that page
Compensation logging makes the page LSN monotonic
Monotonicity is essential for physiological idempotence and for WAL
If page.LSN ≥ log_record.LSN, then the effects of that log record are present in the page
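The page-LSN test is what makes redo safe to repeat. A minimal sketch, assuming a hypothetical `Page` class with a `slots` dictionary and a `redo` helper (none of these names come from the slides):

```python
from dataclasses import dataclass, field


@dataclass
class Page:
    lsn: int = -1                               # LSN of the latest update applied
    slots: dict = field(default_factory=dict)   # record id -> record value


def redo(page: Page, rec_lsn: int, slot: str, value: str) -> bool:
    """Apply a logged update unless the page already reflects it."""
    if page.lsn >= rec_lsn:
        return False                            # effect already present: skip
    page.slots[slot] = value
    page.lsn = rec_lsn
    return True
```

Replaying the same log record a second time, or replaying an older one, leaves the page unchanged, so recovery can be restarted after a crash during recovery.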

Chapter 825 Undo Logging
–When there is a crash, some transactions should be redone and some should be undone
–A log is a sequence of log records, each of which tells something about what some transaction has done
–The executions of transactions are interleaved, and so are their log records
–A “flush-log” operation forces log records to disk
–Do checkpointing to reduce the recovery time after a failure

Chapter 826 Undo Log Record Types
<START T>
–This record indicates that transaction T has begun
<COMMIT T>
–Transaction T has completed successfully and will make no more changes to database elements
<ABORT T>
–Transaction T could not complete successfully
<T, X, v>
–Transaction T has changed database element X, and its former value was v
–An undo log does not record the new value of a database element, only the old value

Chapter 827 The Undo-Logging Rules (1/2)
U1: If transaction T modifies database element X, then the log record of the form <T, X, v> must be written to disk before the new value of X is written to disk
U2: If a transaction commits, then its “COMMIT” log record must be written to disk only after all database elements changed by the transaction have been written to disk, but as soon thereafter as possible

Chapter 828 The Undo-Logging Rules (2/2)
To summarize U1 and U2, material must be written to disk in the following order
–The log records indicating changed database elements
–The changed database elements
–The “COMMIT” log record
A “flush-log” command is used to force log records to disk

Chapter 829 Example 8.3 (1/2) –Reconsider the transaction of Example 8.2 –Transaction T A := A*2 B := B*2 –We can express T as READ(A,t);t := t*2;WRITE(A,t); READ(B,t);t := t*2;WRITE(B,t);

Chapter 830 Example 8.3 (2/2)
Step  Action       t    M-A  M-B  D-A  D-B  Log
1)                                          <START T>
2)    READ(A,t)    8    8    -    8    8
3)    t := t*2     16   8    -    8    8
4)    WRITE(A,t)   16   16   -    8    8    <T, A, 8>
5)    READ(B,t)    8    16   8    8    8
6)    t := t*2     16   16   8    8    8
7)    WRITE(B,t)   16   16   16   8    8    <T, B, 8>
8)    FLUSH LOG
9)    OUTPUT(A)    16   16   16   16   8
10)   OUTPUT(B)    16   16   16   16   16
11)                                         <COMMIT T>
12)   FLUSH LOG
/* WHY */

Chapter 831 Recovery Using Undo Logging (1/2)
The recovery manager scans the log backwards and divides the transactions into committed and uncommitted ones
Committed transactions
–There is a <COMMIT T> log record
–Undo rule U2 assures that all changes made by T were previously written to disk, so do nothing
Uncommitted transactions
–There is a <START T> record, but no <COMMIT T> record
–T is an “incomplete transaction” and must be “undone”
–By rule U1, every change made by T that reached disk has a log record of the form <T, X, v> on disk, so use these records to undo the actions

Chapter 832 Recovery Using Undo Logging (2/2)
If the recovery manager sees a <T, X, v> record, then:
–If T is a committed transaction, then do nothing
–Otherwise, the recovery manager must change the value of X in the database to v
After making these changes, the recovery manager must write an <ABORT T> log record for each incomplete transaction T, and then flush the log
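The backward scan can be sketched in a few lines. The log representation is an assumption made for illustration: records are tuples such as `("START", "T")`, `("UPDATE", "T", "A", 8)` for <T, A, 8>, `("COMMIT", "T")`, and `("ABORT", "T")`, and `disk` is a dictionary. Flushing the log is not modelled.

```python
def undo_recover(log, disk):
    """Undo every incomplete transaction, scanning the undo log backwards."""
    finished = set()      # transactions whose COMMIT or ABORT has been seen
    incomplete = []       # transactions that still need an ABORT record
    for rec in reversed(log):
        kind, txn = rec[0], rec[1]
        if kind in ("COMMIT", "ABORT"):
            finished.add(txn)
        elif txn in finished:
            continue                      # committed or already undone
        elif kind == "UPDATE":
            _, _, x, old = rec
            disk[x] = old                 # restore the former value
        elif kind == "START":
            incomplete.append(txn)
    log.extend(("ABORT", txn) for txn in incomplete)
    return disk
```

Running the function a second time changes nothing, because restoring an old value twice has the same effect as restoring it once and the appended ABORT records mark the transactions as finished.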

Chapter 833 Example 8.4 (1/2)
Reconsider Example 8.3 and consider when the crash occurs
After step (12), there is no need to recover
Between steps (11) and (12)
–If the “COMMIT” record reached disk, do nothing
–If not, the recovery manager makes B and A have value 8 on disk
–Finally, the record <ABORT T> is written to the log, and the log is flushed

Chapter 834 Example 8.4 (2/2) Between steps (10) and (11) –The “COMMIT” record surely was not written, so T is incomplete and is undone Between steps (8) and (10) –Again as in the above case, T is undone Prior to step (8) –If there were changes to A and/or B made on disk by T, then the corresponding log record will cause the recovery manager to undo those changes –We could do nothing in this case, too.

Chapter 835 Crashes During Recovery The system can crash while we are recovering from a previous crash The recovery steps should be “idempotent” –Repeating them many times has exactly the same effect as performing them once –We can recover a second time without worrying about changes made the first time The log record we are talking about here supports being “idempotent” !!! (WHY)

Chapter 836 Checkpointing
To reduce the recovery time
A simple checkpoint
–Stop accepting new transactions
–Wait until all currently active transactions commit or abort and have written a “COMMIT” or “ABORT” record on the log
–Flush the log to disk
–Write a log record <CKPT>, and flush the log again
–Resume accepting transactions
There is no need to scan prior to the <CKPT> record

Chapter 837 Example 8.5 Checkpoint

Chapter 838 Nonquiescent Checkpointing (1) A problem with the previous checkpointing –We must shut down the system while the checkpoint is being made –The active transaction may take a long time to commit or abort

Chapter 839 Nonquiescent Checkpointing (2)
“Nonquiescent checkpointing” technique
–Write a log record <START CKPT(T1, …, Tk)> and flush the log
  –T1, …, Tk are the names or identifiers of all the active transactions
–Wait until all of T1, …, Tk commit or abort, but do not prohibit other transactions from starting
–When all of T1, …, Tk have completed, write a log record <END CKPT> and flush the log

Chapter 840 Nonquiescent Checkpointing (3)
Recovery (scanning the log backwards)
–If we first meet an <END CKPT> record
  –All incomplete transactions began after the previous <START CKPT(T1, …, Tk)> record
–If we first meet a <START CKPT(T1, …, Tk)> record
  –A crash occurred during the checkpoint
  –The first type of incomplete transactions are those we met scanning backwards before we reached the “START CKPT”
  –The second type of incomplete transactions are those of T1, …, Tk that did not complete before the crash
  –We need scan no further back than the start of the earliest of these incomplete transactions
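The two cases can be sketched as a function that reports how far back the undo scan has to go. Everything about the representation is assumed for illustration: records are tuples, `("START_CKPT", ("T1", "T2"))` stands for <START CKPT(T1, T2)>, `("END_CKPT",)` for <END CKPT>, and the sample log in the usage note is hypothetical.

```python
def undo_scan_limit(log):
    """Return the index of the earliest undo-log record recovery must examine."""
    finished, incomplete = set(), set()
    saw_end_ckpt = False
    for i in range(len(log) - 1, -1, -1):
        rec = log[i]
        kind = rec[0]
        if kind in ("COMMIT", "ABORT"):
            finished.add(rec[1])
        elif kind in ("START", "UPDATE"):
            if rec[1] not in finished:
                incomplete.add(rec[1])
        elif kind == "END_CKPT":
            saw_end_ckpt = True
        elif kind == "START_CKPT":
            if saw_end_ckpt:
                return i          # completed checkpoint: nothing older matters
            # Crash during the checkpoint: listed transactions that never
            # finished are incomplete too.
            incomplete |= set(rec[1]) - finished
            # A real system would follow back-pointers; a plain search keeps
            # the sketch short.
            starts = [j for j, r in enumerate(log)
                      if r[0] == "START" and r[1] in incomplete]
            return min(starts + [i])
    return 0                      # no checkpoint: scan the whole log
```

For a log whose records 0 to 9 are START T1, an update by T1, START T2, START CKPT(T1, T2), an update by T2, COMMIT T1, COMMIT T2, END CKPT, START T3, and an update by T3, the scan stops at index 3. If the crash had happened right after COMMIT T1, the scan would have to reach index 2, the start of T2.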

Chapter 841 Example 8.6 Checkpoint Suppose that a crash occurs

Chapter 842 Redo Logging A problem of undo logging –We cannot commit a transaction without first writing all its changed data to disk

Chapter 843 Redo Logging vs. Undo Logging 1.While undo logging cancels the effects of incomplete transactions and ignores committed ones during recovery, redo logging ignores incomplete transactions and repeats the changes made by committed transactions 2.Redo logging requires that the “COMMIT” record appear on disk before any changed values reach disk 3.To recover using redo logging, we need the new values of changed database elements

Chapter 844 The Redo-Logging Rule
The log record <T, X, v>
–Transaction T wrote new value v for database element X
A redo rule (“write-ahead logging, WAL”)
–R1: Before modifying any database element X on disk, it is necessary that all log records pertaining to this modification of X, including both the update record <T, X, v> and the <COMMIT T> record, must appear on disk

Chapter 845 The Redo-Logging Rule The order in which material associated with one transaction gets written to disk: –The log records indicating changed database elements –The “COMMIT” log record –The changed database elements themselves

Chapter 846 Example 8.7 (1/2)
–Consider the same transaction T as in Example 8.3
–Differences
  –First, the log records reflecting the changes have the new values of A and B, rather than the old values (see steps (4) and (7) on the next page)
  –Second, the <COMMIT T> record comes earlier (see step (8) on the next page)

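Chapter 847 Example 8.7 (2/2)
Step  Action       t    M-A  M-B  D-A  D-B  Log
1)                                          <START T>
2)    READ(A,t)    8    8    -    8    8
3)    t := t*2     16   8    -    8    8
4)    WRITE(A,t)   16   16   -    8    8    <T, A, 16>
5)    READ(B,t)    8    16   8    8    8
6)    t := t*2     16   16   8    8    8
7)    WRITE(B,t)   16   16   16   8    8    <T, B, 16>
8)                                          <COMMIT T>
9)    FLUSH LOG
10)   OUTPUT(A)    16   16   16   16   8
11)   OUTPUT(B)    16   16   16   16   16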

Chapter 848 Recovery With Redo Logging
Recovery steps
–Identify the committed transactions
–Scan the log forward from the beginning. For each log record <T, X, v> encountered:
  –If T is not a committed transaction, do nothing
  –If T is committed, write value v for database element X
–For each incomplete transaction T, write an <ABORT T> record to the log and flush the log
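The three recovery steps can be sketched as follows. The tuple representation of log records, with `("UPDATE", "T", "A", 16)` standing for <T, A, 16>, and the dictionary used as the disk are assumptions for illustration; flushing the log is not modelled.

```python
def redo_recover(log, disk):
    """Repeat the changes of committed transactions, scanning the redo log forward."""
    committed = {rec[1] for rec in log if rec[0] == "COMMIT"}
    ended = committed | {rec[1] for rec in log if rec[0] == "ABORT"}
    for rec in log:
        if rec[0] == "UPDATE" and rec[1] in committed:
            _, _, x, new = rec
            disk[x] = new                 # write the new value again
    started = [rec[1] for rec in log if rec[0] == "START"]
    log.extend(("ABORT", txn) for txn in started if txn not in ended)
    return disk
```

Updates of transactions without a COMMIT record are simply ignored, which is safe only because rule R1 guarantees that none of their changes reached disk.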

Chapter 849 Example 8.8 (1/2)
–Reconsider Example 8.7
–When the crash occurs any time after step (9)
  –The recovery system identifies T as a committed transaction
  –When scanning the log forward, the log records <T, A, 16> and <T, B, 16> cause the recovery manager to write value 16 for A and B
–Between steps (8) and (9)
  –The <COMMIT T> record may or may not have been written to disk
  –If it did get to disk, do as above
  –If it did not get to disk, do as below
–Prior to step (8)
  –<COMMIT T> surely has not reached disk
  –T is treated as an incomplete transaction
  –No changes to A or B on disk are made on behalf of T
  –An <ABORT T> record is written to the log

Chapter 850 Checkpointing a Redo Log We cannot limit our concern to transactions that are active at the time we decide to create a checkpoint (Why???) The key action between the start and end of the checkpoint is to write to disk all database elements that have been modified by committed transactions but not yet written to disk –The buffer manager should keep track of which buffers are dirty and which transaction modified which buffers

Chapter 851 Checkpointing a Redo Log
The steps for nonquiescent checkpointing
–Write a log record <START CKPT(T1, …, Tk)>, where T1, …, Tk are all the active (uncommitted) transactions, and flush the log
–Write to disk all database elements that were written to buffers but not yet to disk by transactions that had already committed when the START CKPT record was written to the log
–Write an <END CKPT> record to the log and flush the log

Chapter 852 Example 8.9 –Figure 8.8 shows a possible redo log –When we start the checkpoint, only T 2 is active, but the value of A written by T 1 may have reached disk. –If not, then we must copy A to disk before the checkpoint can end Figure 8.8 A redo log

Chapter 853 Recovery With a Checkpoint
Two cases
–The last checkpoint record on the log is <END CKPT>
  –A transaction that committed before the corresponding <START CKPT(T1, …, Tk)> has had its changes written to disk
  –Any transaction that is either among the Ti’s or that started after the beginning of the checkpoint can still have changes that have not yet migrated to disk
  –We do not have to look further back than the earliest of the <START Ti> records

Chapter 854 Recovery With a Checkpoint
–The last checkpoint record on the log is <START CKPT(T1, …, Tk)>
  –We must search back to the previous <END CKPT> record
  –Find its matching <START CKPT(S1, …, Sm)> record
  –Redo all those committed transactions that either started after that START CKPT or are among the Si’s

Chapter 855 Example 8.10 (1/2)
–Consider again the log of Fig. 8.8
–Case (1): a crash occurs at the end
  –We don’t need to redo T1
  –Because we can find the records <COMMIT T2> and <COMMIT T3>, we must redo T2 and T3
–Case (2): a crash occurs between <COMMIT T2> and <COMMIT T3>
  –We don’t need to redo T1
  –Because we can find the record <COMMIT T2>, we must redo T2
  –Because T3 is no longer a committed transaction, we don’t redo T3; we write an <ABORT T3> record to the log after recovery

Chapter 856 Example 8.10 (2/2)
–Case (3): a crash occurs prior to the <END CKPT> record
  –We must go all the way to the beginning of the log (in principle, we must search back to the next-to-last “START CKPT”)
  –We must redo T1
  –Because T2 and T3 are no longer committed transactions, we don’t redo them; we write records <ABORT T2> and <ABORT T3> to the log after recovery
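The choice of which transactions to redo can be sketched as below. Fig. 8.8 is not reproduced in this transcript, so the sample log in the usage note is an assumption that is merely consistent with Examples 8.9 and 8.10; the tuple record format and the function name `transactions_to_redo` are illustrative as well.

```python
def transactions_to_redo(log):
    """Return the committed transactions that redo recovery must replay."""
    committed = {r[1] for r in log if r[0] == "COMMIT"}
    ends = [i for i, r in enumerate(log) if r[0] == "END_CKPT"]
    if not ends:
        return committed          # no completed checkpoint: use the whole log
    # The matching START CKPT is the last one before the last END CKPT.
    start_i = max(i for i, r in enumerate(log[:ends[-1]])
                  if r[0] == "START_CKPT")
    candidates = set(log[start_i][1])                      # active at the checkpoint
    candidates |= {r[1] for r in log[start_i:] if r[0] == "START"}
    return committed & candidates
```

With an assumed log of START T1, <T1, A, 5>, START T2, COMMIT T1, <T2, B, 10>, START CKPT(T2), <T2, C, 15>, START T3, <T3, D, 20>, END CKPT, COMMIT T2, COMMIT T3, the function returns T2 and T3 for the full log, only T2 if the last record is missing, and only T1 if the log is cut just before END CKPT. These are the three cases of Example 8.10.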

Chapter 857 Undo/Redo Logging Drawbacks of undo logging and redo logging –Undo logging requires that data are written to disk immediately after a transaction finishes –Redo logging requires us to keep all modified blocks in buffers until the transaction commits and the log records have been flushed

Chapter 858 The Undo/Redo Rules
Log record <T, X, v, w>
–Transaction T changed the value of database element X; its former value was v, and its new value is w
The rule
–UR1: Before modifying any database element X on disk because of changes made by some transaction T, it is necessary that the update record <T, X, v, w> appear on disk
Note that the <COMMIT T> log record can precede or follow any of the changes to the database elements on disk

Chapter 859 Example 8.11 (fig 8.9)
Step  Action       t    M-A  M-B  D-A  D-B  Log
1)                                          <START T>
2)    READ(A,t)    8    8    -    8    8
3)    t := t*2     16   8    -    8    8
4)    WRITE(A,t)   16   16   -    8    8    <T, A, 8, 16>
5)    READ(B,t)    8    16   8    8    8
6)    t := t*2     16   16   8    8    8
7)    WRITE(B,t)   16   16   16   8    8    <T, B, 8, 16>
8)    FLUSH LOG
9)    OUTPUT(A)    16   16   16   16   8
10)                                         <COMMIT T>
11)   OUTPUT(B)    16   16   16   16   16
Step (10) could also have appeared before step (9)

Chapter 860 Recovery With Undo/Redo Logging The undo/redo recovery policy –Redo all the committed transactions in the order earliest-first –Undo all the incomplete transactions in the order latest-first –We could have either a committed transaction with some or all of changes not on disk, or an uncommitted transaction with some or all of changes on disk
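The policy can be sketched as two passes over the log. The record format is assumed for illustration, with `("UPDATE", "T", "A", 8, 16)` standing for <T, A, 8, 16>. Skipping transactions that already have an ABORT record is a simplifying assumption of this sketch, not something stated on the slide.

```python
def undo_redo_recover(log, disk):
    """Redo committed transactions earliest-first, then undo incomplete ones latest-first."""
    committed = {r[1] for r in log if r[0] == "COMMIT"}
    aborted = {r[1] for r in log if r[0] == "ABORT"}    # assumed already undone
    for r in log:                                       # redo pass, forward
        if r[0] == "UPDATE" and r[1] in committed:
            disk[r[2]] = r[4]                           # new value w
    for r in reversed(log):                             # undo pass, backward
        if r[0] == "UPDATE" and r[1] not in committed | aborted:
            disk[r[2]] = r[3]                           # old value v
    return disk
```

Because both old and new values are logged, recovery works whether or not a given change had reached disk before the crash.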

Chapter 861 Example 8.12
–Consider the sequence of actions in Fig. 8.9
–A crash occurs after the <COMMIT T> record is flushed to disk
  –T is treated as a committed transaction
  –We need to write the value 16 for both A and B to disk
–A crash occurs prior to the <COMMIT T> record reaching disk
  –T is treated as an incomplete transaction
  –The previous values of A and B, 8 in each case, need to be written to disk

Chapter 862 A Problem With Delayed Commitment
A possibility
–A system using undo/redo logging
–A transaction appears to the user to have been completed
–The <COMMIT T> record was not flushed to disk
–A subsequent crash causes the transaction to be undone rather than redone
Another rule
–UR2: A <COMMIT T> record must be flushed to disk as soon as it appears in the log (flush the log)

Chapter 863 Checkpointing an Undo/Redo Log (1/2)
A nonquiescent checkpoint
–Write a <START CKPT(T1, …, Tk)> record to the log, where T1, …, Tk are all the active transactions, and flush the log
–Write to disk all the buffers that are “dirty”
  –Unlike redo logging, we flush all buffers, not just those written by committed transactions
–Write an <END CKPT> record to the log, and flush the log
The only requirement is that a transaction must not write any values (even to memory buffers) until it is certain not to abort

Chapter 864 Example 8.13 (1/2) Example 8.13 –Figure 8.10 shows an undo/redo log –T 2 ’s new B-value 10 has been written to disk –A has been also written to disk Fig. 8.10

Chapter 865 Example 8.13 (2/2)
–The crash occurs at the end of the log
  –T1 is assumed to have both completed and had its changes written to disk
  –Redo both T2 and T3
  –When we redo a transaction such as T2, we do not need to look prior to the <START CKPT(T2)> record
–The crash occurs just before the <COMMIT T3> record is written to disk
  –We identify T2 as committed but T3 as incomplete
  –Redo T2 by setting C to 15 on disk; it is not necessary to set B to 10
  –Undo T3 by setting D to 19 on disk; if T3 had been active at the start of the checkpoint, we would have had to look prior to the START-CKPT record

Chapter 866 Protecting Against Media Failures We could, in principle, reconstruct the database from the log if: –The log were on a disk other than the disk(s) that hold the data –The log were never thrown away after a checkpoint –The log were of the redo or the undo/redo type, so new values are stored on the log

Chapter 867 The Archive
Archiving
–Maintaining a copy of the database separate from the database itself
–The backup preserves the database state as it existed at the time of the backup; after a media failure, the database can be restored to that state
–To advance to a more recent state, we can use the log
In order to protect against losing the log, we can transmit a copy of the log to the same remote site as the archive

Chapter 868 The Archive
Two levels of archiving
–A full dump
–An incremental dump
It is also possible to have several levels of dump, with a full dump thought of as a “level 0” dump, and a “level i” dump copying everything changed since the last dump at level i or less

Chapter 869 Nonquiescent Archiving A motivation and solutions –Most databases cannot shut down for the period of time that it takes to make a backup copy –During a nonquiescent archiving, database activity may change many database elements on disk –The archived data may or may not be the data that existed when the dump began These discrepancies can be corrected from the log

Chapter 870 Nonquiescent Archiving
[Figure: main memory → disk → archive]
–Checkpoint gets data from memory to disk; log allows recovery from system failure
–Dump gets data from disk to archive; archive plus log allows recovery from media failure

Chapter 871 Example 8.14
–Four elements, A, B, C, and D
–Consider the sequence of events shown in Fig. 8.12
  –Initial values: (1,2,3,4)
  –Final values: (5,7,6,4)
  –Archived values: (1,2,6,4), which existed at no time during the dump
Disk      Archive
          Copy A
A := 5
          Copy B
C := 6
          Copy C
B := 7
          Copy D
Fig. 8.12: Events during a nonquiescent dump
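A short simulation shows how such an archive arises. The event encoding and the function name `nonquiescent_dump` are assumptions; the interleaving used in the usage note is one ordering that is consistent with the initial, final, and archived values given in the example.

```python
def nonquiescent_dump(db, events):
    """Apply interleaved writes and archive copies; return the archive."""
    archive = {}
    for kind, name, *value in events:
        if kind == "write":
            db[name] = value[0]          # a transaction changes the database
        else:                            # kind == "copy"
            archive[name] = db[name]     # the dump copies the current value
    return archive


events = [
    ("copy", "A"), ("write", "A", 5), ("copy", "B"), ("write", "C", 6),
    ("copy", "C"), ("write", "B", 7), ("copy", "D"),
]
db = {"A": 1, "B": 2, "C": 3, "D": 4}
archive = nonquiescent_dump(db, events)
```

After the run, `db` holds (5, 7, 6, 4) while `archive` holds (1, 2, 6, 4), a combination of values that the database never had at any single moment.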

Chapter 872 Nonquiescent Archiving
Process
–Write a log record <START DUMP>
–Perform a checkpoint
–Perform a full or incremental dump of the data disk(s)
–Make sure that enough of the log has been copied to the secure and remote site
–Write a log record <END DUMP>

Chapter 873 Example 8.15
–Figure 8.13 shows a possible undo/redo log of Example 8.14
–Notice that T1 is not committed
[Fig. 8.13: an undo/redo log; the point where the dump completes is marked in the figure]

Chapter 874 Recovery Using an Archive and Log
Recovery steps
1. Restore the database from the archive
  (a) Find the most recent full dump and reconstruct the database from it (i.e., copy the archive into the database)
  (b) If there are later incremental dumps, modify the database according to each, earliest first
2. Modify the database using the surviving log. Use the method of recovery appropriate to the log method being used
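The two steps can be sketched for the undo/redo case. The dump and log representations are assumptions: dumps are dictionaries, incremental dumps contain only changed elements, and `("UPDATE", "T1", "A", 1, 5)` stands for <T1, A, 1, 5>. The log in the usage note is hypothetical but consistent with the values of Example 8.14.

```python
def restore(full_dump, incremental_dumps, log):
    """Rebuild the database from an archive and a surviving undo/redo log."""
    db = dict(full_dump)                    # step 1(a): most recent full dump
    for dump in incremental_dumps:          # step 1(b): earliest first
        db.update(dump)
    committed = {r[1] for r in log if r[0] == "COMMIT"}
    for r in log:                           # step 2: redo committed work
        if r[0] == "UPDATE" and r[1] in committed:
            db[r[2]] = r[4]
    for r in reversed(log):                 # step 2: undo incomplete work
        if r[0] == "UPDATE" and r[1] not in committed:
            db[r[2]] = r[3]
    return db


log = [
    ("START", "T1"), ("START", "T2"), ("UPDATE", "T1", "A", 1, 5),
    ("UPDATE", "T2", "C", 3, 6), ("COMMIT", "T2"), ("UPDATE", "T1", "B", 2, 7),
]
restored = restore({"A": 1, "B": 2, "C": 6, "D": 4}, [], log)
```

Here `restored` is (1, 2, 6, 4): the committed change to C is kept and the uncommitted changes to A and B are rolled back, so the result is a consistent state even though the archive itself was taken while the database was changing.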

Chapter 875 Example 8.16
–Suppose that
  –The log shown in Fig. 8.13 survives
  –A media failure occurs at the end of the log
–The database is first restored to the values in the archive (i.e., the four values 1,2,6,4)
–Since T2 has completed, we redo the step that sets C to 6
–Since T1 does not have a COMMIT record, we must undo T1
