Implementing Atomicity and Durability

Slides:



Advertisements
Similar presentations
1 CS411 Database Systems 12: Recovery obama and eric schmidt sysadmin song
Advertisements

Recovery Amol Deshpande CMSC424.
Crash Recovery John Ortiz. Lecture 22Crash Recovery2 Review: The ACID properties  Atomicity: All actions in the transaction happen, or none happens 
1 CSIS 7102 Spring 2004 Lecture 9: Recovery (approaches) Dr. King-Ip Lin.
1 CPS216: Data-intensive Computing Systems Failure Recovery Shivnath Babu.
Copyright © 2007 Ramez Elmasri and Shamkant B. Navathe Slide
1 CSIS 7102 Spring 2004 Lecture 8: Recovery (overview) Dr. King-Ip Lin.
Copyright © 2011 Pearson Education, Inc. Publishing as Pearson Addison-Wesley Chapter 23 Database Recovery Techniques.
Chapter 20: Recovery. 421B: Database Systems - Recovery 2 Failure Types q Transaction Failures: local recovery q System Failure: Global recovery I Main.
Recovery CPSC 356 Database Ellen Walker Hiram College (Includes figures from Database Systems by Connolly & Begg, © Addison Wesley 2002)
CSCI 3140 Module 8 – Database Recovery Theodore Chiasson Dalhousie University.
Chapter 19 Database Recovery Techniques
CMPT 401 Summer 2007 Dr. Alexandra Fedorova Lecture X: Transactions.
Jan. 2014Dr. Yangjun Chen ACS Database recovery techniques (Ch. 21, 3 rd ed. – Ch. 19, 4 th and 5 th ed. – Ch. 23, 6 th ed.)
Chapter 19 Database Recovery Techniques Copyright © 2004 Pearson Education, Inc.
CMPT Dr. Alexandra Fedorova Lecture X: Transactions.
Recovery 10/18/05. Implementing atomicity Note, when a transaction commits, the portion of the system implementing durability ensures the transaction’s.
ACID A transaction is atomic -- all or none property. If it executes partly, an invalid state is likely to result. A transaction, may change the DB from.
Quick Review of May 1 material Concurrent Execution and Serializability –inconsistent concurrent schedules –transaction conflicts serializable == conflict.
Copyright © 2011 Pearson Education, Inc. Publishing as Pearson Addison-Wesley Chapter 23 Database Recovery Techniques.
Chapter 19 Database Recovery Techniques. Slide Chapter 19 Outline Databases Recovery 1. Purpose of Database Recovery 2. Types of Failure 3. Transaction.
©Silberschatz, Korth and Sudarshan17.1Database System Concepts Chapter 17: Recovery System Failure Classification Storage Structure Recovery and Atomicity.
1 Implementing Atomicity and Durability Chapter 25.
©Silberschatz, Korth and Sudarshan17.1Database System Concepts 3 rd Edition Chapter 17: Recovery System Failure Classification Storage Structure Recovery.
TRANSACTIONS A sequence of SQL statements to be executed "together“ as a unit: A money transfer transaction: Reasons for Transactions : Concurrency control.
1 Recovery Control (Chapter 17) Redo Logging CS4432: Database Systems II.
Academic Year 2014 Spring. MODULE CC3005NI: Advanced Database Systems “DATABASE RECOVERY” (PART – 1) Academic Year 2014 Spring.
7202ICT – Database Administration
Switch off your Mobiles Phones or Change Profile to Silent Mode.
Lecture 12 Recoverability and failure. 2 Optimistic Techniques Based on assumption that conflict is rare and more efficient to let transactions proceed.
CMPT 454, Simon Fraser University, Fall 2009, Martin Ester 294 Database Systems II Coping With System Failures.
Recovery System By Dr.S.Sridhar, Ph.D.(JNUD), RACI(Paris, NICE), RMR(USA), RZFM(Germany) DIRECTOR ARUNAI ENGINEERING COLLEGE TIRUVANNAMALAI.
Recovery system By Kotoua Selira. Failure classification Transaction failure : Logical errors: transaction cannot complete due to some internal error.
Chapter 16 Recovery Yonsei University 1 st Semester, 2015 Sanghyun Park.
CSCI Recovery Control Techniques 1 RECOVERY CONTROL TECHNIQUES Dr. Awad Khalil Computer Science Department AUC.
Chapter 10 Recovery System. ACID Properties  Atomicity. Either all operations of the transaction are properly reflected in the database or none are.
Academic Year 2014 Spring. MODULE CC3005NI: Advanced Database Systems “DATABASE RECOVERY” (PART – 2) Academic Year 2014 Spring.
Chapter 17: Recovery System
1 Chapter 6 Database Recovery Techniques Adapted from the slides of “Fundamentals of Database Systems” (Elmasri et al., 2003)
Database System Concepts ©Silberschatz, Korth and Sudarshan See for conditions on re-usewww.db-book.com Chapter 17: Recovery System.
Recovery technique. Recovery concept Recovery from transactions failure mean data restored to the most recent consistent state just before the time of.
Transactional Recovery and Checkpoints Chap
Transactional Recovery and Checkpoints. Difference How is this different from schedule recovery? It is the details to implementing schedule recovery –It.
Database Recovery Zheng (Godric) Gu. Transaction Concept Storage Structure Failure Classification Log-Based Recovery Deferred Database Modification Immediate.
1 Database Systems ( 資料庫系統 ) January 3, 2005 Chapter 18 By Hao-hua Chu ( 朱浩華 )
CS422 Principles of Database Systems Failure Recovery Chengyu Sun California State University, Los Angeles.
Jun-Ki Min. Slide Purpose of Database Recovery ◦ To bring the database into the last consistent stat e, which existed prior to the failure. ◦

Database recovery techniques
Database Recovery Techniques
Database Recovery Techniques
DURABILITY OF TRANSACTIONS AND CRASH RECOVERY
CS422 Principles of Database Systems Failure Recovery
Recovery Control (Chapter 17)
Transactional Recovery and Checkpoints
Database Applications (15-415) DBMS Internals- Part XIII Lecture 22, November 15, 2016 Mohammad Hammoud.
File Processing : Recovery
Chapter 10 Recover System
Chapter 16: Recovery System
Database Systems (資料庫系統)
Database Recovery Techniques
Chapter 10 Transaction Management and Concurrency Control
Database Applications (15-415) DBMS Internals- Part XIII Lecture 25, April 15, 2018 Mohammad Hammoud.
Outline Introduction Background Distributed DBMS Architecture
Recovery System.
Chapter 19: Recovery System
Chapter 17: Recovery System
Database Recovery 1 Purpose of Database Recovery
Database Applications (15-415) DBMS Internals- Part XIII Lecture 24, April 14, 2016 Mohammad Hammoud.
Data-intensive Computing Systems Failure Recovery
Presentation transcript:

Implementing Atomicity and Durability Chapter 25

System Malfunctions Transaction processing systems have to maintain correctness in spite of malfunctions Crash Abort Media Failure

Failures: Crash Processor failure, software bug Program behaves unpredictably, destroying contents of main (volatile) memory Contents of mass store (non-volatile memory) generally unaffected Active transactions interrupted, database left in inconsistent state Server supports atomicity by providing a recovery procedure to restore database to consistent state Since rollforward is generally not feasible, recovery rolls active transactions back

Failures: Abort Causes for abort: User (e.g., cancel button) Transaction (e.g., deferred constraint check) System (e.g., deadlock, lack of resources) The technique used by the recovery procedure supports atomicity Roll transaction back

Failures: Media Durability requires that: database state produced by committed transactions must be preserved Possibility of failure of mass store implies that: database state must be stored redundantly (in some form) on independent non-volatile devices

Log Sequence of records (sequential file): Modified by appending (no updating) Contains information from which: database can be reconstructed read by routines that handle abort and crash recovery Log and database stored on: different mass storage devices often replicated to survive media failure Contains valuable historical data not in database: how did database reach current state?

Log Each modification of the database: Update record contains: causes an update record to be appended to log Update record contains: Identity of data item modified Identity of transaction (tid) that did the modification Before image (undo record) – copy of data item before update occurred Referred to as physical logging

Log Update records in a log x y z u y w z T1 T1 T2 T3 T1 T4 T2 17 A 2.4 18 ab 3 4.5 Update records in a log most recent database update

Transaction Abort Using Log Scan log backwards: using tid to identify transaction’s update records reverse each update using before image reversal done in last-in-first-out order In a strict system new values are unavailable to concurrent transactions (result of long term x-locks); hence rollback makes a transaction atomic Problem: terminating scan (log can be long) Solution: append a begin record for each transaction, containing tid, prior to its first update record

Transaction Abort Using Log B U U U U U U U x y z u y w z T1 T1 T1 T2 T3 T1 T4 T2 17 A 2.4 18 ab 3 4.5 Key: B – begin record U – update record abort T1 Abort Procedure: Scan back to begin record using update records to reverse changes

Logging Savepoints Savepoint record inserted in log: when savepoint created contains tid, savepoint identity Rollback Procedure: scan log backwards using tid to identify update records undo updates using before image terminate scan when appropriate savepoint record encountered

Crash Recovery Using Log Abort all transactions active at time of crash Problem: How do you identify them? Solution: abort record or commit record appended to log when transaction terminates Recovery Procedure: Scan log backwards - if T’s first record is an update record, T was active at time of crash. Roll it back A transaction is not committed until its commit record is in the log

Crash Recovery Using Log B U U U U U C U A U x y z u y w z T1 T1 T1 T2 T3 T1 T3 T4 T1 T2 17 A 2.4 18 ab 3 4.5 Key: B – begin record U – update record C – commit record A – abort record crash T1 and T3 were not active at time of crash

Crash Recovery Using Log Problem: Scan must retrace entire log Solution: Periodically append checkpoint rec. to log. Contains tid’s of all active trans. at time of append Backward scan goes at least as far as last checkpoint record appended Transactions active at time of crash determined from log suffix that includes last checkpoint record Scan continues until those transactions have been rolled back

Example Backward scan B2 B3 U2 B1 C2 B5 U3 U5 A5 CK U1 U4 B6 C4 U6 U1 crash T1 T4 T3 Key: U - update record B - begin record C - commit record A - abort record CK - checkpoint record T1, T3 and T6 active at time of crash

Write-Ahead Log When x is updated two writes must occur: update x in database, append of update log record which goes first? …………………..update x; append to log ……………. crash crash crash (no before image in log) ………………..append to log; update x …………………. crash crash crash (use before image; it has no effect)

Write-Ahead Log: Performance Problem: two I/O ops for each database update Solution: log buffer in main memory Extension of log on mass store Periodically flushed to mass store Flush cost pro-rated over multiple log appends This effectively reduces the cost to one I/O operation for each database update

Performance Problem: one I/O operation for each DB update Solution: database page cache in main memory Page is unit of transfer Page containing requested item is brought to cache; then a copy of the item is transferred to application Retain page in cache for future use Check cache for requested item before doing I/O (I/O can be avoided)

Page and Log Buffering database mass store log main memory cache

Cache Management Cache pages that have been updated are marked dirty; others are clean Cache ultimately fills Clean pages can simply be overwritten Dirty pages must be written to database before page frame can be reused

Atomicity, Durability and Buffering Problem: page and log buffers are volatile Their use affects the time data becomes non-volatile Complicates algorithms for atomicity and durability Requirements: Write-ahead feature (move update records to log on mass store before database is updated) necessary to preserve atomicity New values written by a transaction must be on mass store when its commit record is written to log (move new values to mass store before commit record) to preserve durability Transaction not committed until commit record in log on mass store Solution: requires new mechanisms

Forced vs. Unforced Writes: On database page: Unforced write updates cache page, marks it dirty and returns control immediately. Forced write updates cache page, marks it dirty, uses it to update database page on disk, and returns control when I/O completes. On log: Unforced append adds record to log buffer and returns control immediately. Forced append, adds record to log buffer, writes buffer to log, and returns control when I/O completes.

Log Sequence Number (LSN) Log records are numbered sequentially Each database page contains the LSN of the update record describing the most recent update of any item in the page 12 x y 8 9 x 17 10 11 12 y 17 13 Database page 17 log LSN

Preserving Atomicity (the Write-Ahead Property and Buffering) Problem 1: When the cache page replacement algorithm decides to write a dirty page p to mass store, an update record corresponding to p might still be in the log buffer. Solution: Force the log buffer if the LSN stored in p is greater than or equal to the LSN of the oldest record in the log buffer. Then write p. This preserves write-ahead policy.

Preserving Durability I Problem 2: Pages updated by T might still be in cache when T’s commit record is appended to log buffer. Once commit record is in log buffer, it may be flushed to log at any time, causing a violation of durability. Solution: Force the (dirty) pages in the cache that have been updated by T before appending T’s commit record to log buffer (force policy).

Force Policy for Commit Processing Force any update records of T in log buffer then Force any dirty pages updated by T in cache then (1) and (2) ensure atomicity (write-ahead policy) Append T’s commit record to log buffer then Force log buffer for immediate commit or Write log buffer when a group of transactions have committed (group commit) (2) and (3) ensure durability

Force Policy for Commit Processing database r s xold log 3 1 2 j xnew r+1 j ···· k xold log buffer cache LSN update record for T commit record for T

Force Policy Advantage: Disadvantages: Transaction’s updates are in database (on mass store) when it commits. Disadvantages: Commit must wait until dirty cache pages are forced Pages containing items that are updated by many transactions (hotspots) have to be forced with the commit of each such transaction … but an LRU page replacement algorithm would not write such a page out

Preserving Durability II Problem 2: Pages updated by T might still be in cache when T’s commit record is appended to log buffer Solution: Update record contains after image (called a redo record) as well as before image Write-ahead property still requires that update record be written to mass store before page But it is no longer necessary to force dirty pages when commit record is written to log on mass store since all after images precede commit record in log Referred to as a no-force policy

No-Force Commit Processing Append T’s commit record to log buffer Force buffer for immediate commit T’s update records precede its commit record in buffer ensuring updates are durable before (or at the same time as) it commits T’s dirty pages can be flushed from cache at any time after update records have been written Necessary for write-ahead policy T’s dirty pages can be written before or after commit record

No Force Policy for Commit Processing database s xold log r 1 2 j xnew r+1 j ···· k xold xnew log buffer cache update record for T commit record for T LSN

No-Force Policy Advantages: Disadvantages: Commit doesn’t wait until dirty pages are forced Pages with hotspots don't have to be written out Disadvantages: Crash recovery complicated: some updates of committed transactions (contained in redo records) might not be in database on restart after crash Update records are larger

Recovery With No-Force Policy Problem: When a crash occurs there might exist some pages in database (on mass store) containing updates of uncommitted transaction: they must be rolled back that do not (but should) contain the updates of committed transactions: they must be rolled forward Solution: Use a sharp checkpoint

Recovery With No-Force Policy U U C p1 p2 T1 T2 T1 xold xnew yold ynew p1 xold crash T1 committed T2 active p2 flushed p1 not flushed log p2 ynew database p1 must be rolled forward using xnew p2 must be rolled back using yold

Sharp Checkpoint Problem: How far back must log be scanned in order to find update records of committed transactions that must be rolled forward? Solution: Before appending a checkpoint record, CK, to log buffer, halt processing and force all dirty pages from cache Recovery process can assume that all updates in records prior to CK were written to database Only updates in records after CK might not be in database

Recovery with Sharp Checkpoint Pass 1: Log is scanned backward to most recent checkpoint record, CK, to identify transactions active at time of crash. Pass 2: Log is scanned forward from CK to most recent record. The after images in all update records are used to roll the database forward. Pass 3: Log is scanned backwards to begin record of oldest transaction active at time of crash. The before images in the update records of these transactions are used to roll these transactions back.

Recovery with Sharp Checkpoint Issue 1: Database pages containing items updated after CK was appended to log might have been flushed before crash No problem – with physical logging, roll forward using after images in pass 2 is idempotent. Rollforward in this case is unnecessary, but not harmful

Recovery with Sharp Checkpoint Issue 2: Some update records after CK might belong to an aborted transaction, T1. These updates will not be rolled back in pass 3 since T1 was not active at time of crash Treat rollback operations for aborting T1 as ordinary updates and append compensating log records to log CK U1 xold xnew CL1 xnew xold A1 before images crash

Recovery with Sharp Checkpoint Issue 3: What if system crashes during recovery? Recovery is restarted If physical logging is used, pass 2 and pass 3 operations are idempotent and hence can be redone

Fuzzy Checkpoints Problem: Cannot stop the system to take sharp checkpoint (write dirty pages). Use fuzzy checkpoint: Before writing CK, record the identity of all dirty pages (do not flush them) in memory All recorded pages must be flushed before next checkpoint record is appended to log buffer

Fuzzy Checkpoints U1 CK1 U2 CK2 crash Page corresponding to U1 is recorded at CK1 and will have been flushed by CK2 Page corresponding to U2 is recorded at CK2, but might not have been flushed at time of crash Pass 2 must start at CK1

Archiving the Log Problem: What to do when the log fills mass store? Initial portions of log are not generally discarded since they contain important data: Record of how database got to its current state Information for analyzing performance Solution: Archive the initial portion of the log on tertiary storage. Only the portion of the log containing records of active transactions needs to be maintained on secondary store

Logical Logging Problem with physical logging: simple database updates can result in multiple update records with large before and after images Example – “insert t in T” might cause reorganization of a data page and an index page for each index. Before and after images might be entire pages Solution: Log the operation and its inverse instead of before and after images Example - store “insert t in T ”, “delete t from T ” in update record

Logical Logging Problem 1: Logical operations might not be idempotent (e.g., “UPDATE T SET x = x+5”) Pass 2 roll forward does not work (it makes a difference whether the page on mass store was updated before the crash or after the crash) Solution: Do not apply operation in update record i to database item in page P during pass 2 if P.LSN  i

Logical Logging Problem 2: Operations are not atomic A crash during the execution of a non-atomic operation can leave the database in a physically inconsistent state Example - “insert t in T ” requires an update to both a data and an index page. A crash might occur after t has been inserted in T but before the index has been updated Applying a logical redo operation in pass 2 to a physically inconsistent state is not likely to work Example - There might be two copies of t in T after pass 2

Physiological Logging Solution: Use physical-to-a-page, logical-within-a-page logging (physiological logging) A logical operation involving multiple pages is broken into multiple logical mini-operations Each mini-operation is confined to a single page and hence is atomic Example - “insert t in T” becomes “insert t in a page of T” and “insert pointer to t in a page of index” Each mini-operation gets a separate log record Since mini-operations are not idempotent, use LSN check before applying operation in pass 2

Deferred-Update System Update - append new value to intentions list (in volatile memory); append update record (containing only after image) to log buffer; write-ahead property does not apply since there is no before image Abort - discard intentions list Commit - force commit record to log; initiate database update using intentions list Completion of intentions list processing - write completion record to log

Recovery in Deferred-Update System Checkpoint record - contains list of committed (not active) but incomplete transactions Recovery - Scan back to most recent checkpoint record to determine transactions that are committed but for which updates are incomplete at time of crash Scan forward to install after images for incomplete transactions No third pass required since transactions active (not committed) at time of crash have not affected database

Media Failure Durability requires that the database be stored redundantly on distinct mass storage devices Redundant copy on (mirrored) disk => high availability - Log still needed to achieve atomicity after an abort or crash Redundant data in log Problem: Using the log (as in 2 above) to reconstruct the database is impractical since it requires a scan starting at first record Solution: Use log together with a periodic dump

Simple Dump Simple dump System stops accepting new transactions Waits until all active transactions complete Dump: copy entire database to a file on mass storage Restart log and system

Restoring Database From Simple Dump Install most recent dump file Scan backward through log Determine transactions that committed since dump was taken Ignore aborted transactions and those that were active when media failed Scan forward through log Install after images of committed transactions

Fuzzy Dump Problem: The system cannot be shut down to take a simple dump Solution: Use a fuzzy dump Write begin dump record to log Copy database records to dump file while system active Even copying records of active transactions and records that are locked

Fuzzy Dump Dump file might: reflect incomplete execution of an active transaction that later commits reflect updates of an active transaction that later aborts wT(x) dump(x) dump(y) wT(y) commitT time wT(x) dump(x) abortT time

Naïve Restoration Using Fuzzy Dump Install dump on disk Scan log backwards to begin dump record to produce list, L, of all transactions that committed since start of dump Scan log forward and install after images in update records of all transactions in L

Naïve Restoration Using Fuzzy Dump - It does some things correctly wT(x) wT(y) commitT time start dump dump(x,y) end dump T in L; roll it forward beginT wT(x) abortT time T not in L; do not roll it forward start dump end dump

Naïve Restoration Using Fuzzy Dump Problem: Naïve algorithm does not handle two cases: T commits before dump starts but its dirty pages might not have been flushed until dump completed Dump does not read T’s updates and T is not in L . Dump reads T’s updates but T later aborts: wT(x) abortT time start dump dump(x) end dump

Taking a Fuzzy Dump Solution: Use fuzzy checkpointing and compensating log records Dump algorithm: Write checkpoint record Write begin dump record (BD) Dump Write end dump record (ED)

Restoration Using Fuzzy Dump Install dump on mass storage device Scan backward to CK3 to produce list, L, of all transactions active at time of media failure Scan forward from CK1; use redo records to roll the database forward to its state at time of media failure Scan backwards to begin record of oldest transaction in L, roll all transactions in L back all dirty pages in cache at time of CK1 have been written to database media failure CK1 CK2 BD ED CK3