Transactions Chapter 12 Transactions and Concurrency Control.

Transactions We study transactions here because they require a lot of synchronization and coordination. Transactions (think databases) have database tables and data items as shared resources. Transactions have the additional capability of coordinating the update of several of these “resources” at once. It is as if a process must have the CR (critical region) for several resources at the same time.

The Transaction Model A transaction is a unit of program execution that accesses and possibly updates various data items. A transaction must see a consistent database. During transaction execution the database may be inconsistent. When the transaction is committed, the database must be consistent. Two main issues to deal with:
- Failures of various kinds, such as hardware failures and system crashes
- Concurrent execution of multiple transactions

The Transaction Model Concurrent execution of user programs is essential for good DBMS performance. A user’s program may carry out many operations on the data retrieved from the DB; however, the DBMS is only concerned with reads and writes. Users submit transactions and the DBMS interleaves their operations to achieve concurrency. Managing this interleaving so that the result stays correct is called concurrency control.

The Transaction Model (ACID)
- Atomicity. Either all operations of the transaction are properly reflected in the database or none are.
- Consistency. Execution of a transaction in isolation preserves the consistency of the database.
- Isolation. Although multiple transactions may execute concurrently, each transaction must be unaware of other concurrently executing transactions. Intermediate transaction results must be hidden from other concurrently executed transactions.
- Durability. After a transaction completes successfully, the changes it has made to the database persist, even if there are system failures.

Example: Funds Transfer Transaction to transfer $50 from account A to account B:
1. read(A)
2. A := A - 50
3. write(A)
4. read(B)
5. B := B + 50
6. write(B)
Consistency requirement: the sum of A and B is unchanged by the execution of the transaction.
Atomicity requirement: if the transaction fails on any step (after step 3 and before step 6), the system ensures that its updates are not reflected in the database.

Example: Funds Transfer continued
Durability requirement: once the user has been notified that the transaction has completed (i.e., the transfer of the $50 has taken place), the updates to the DB must persist despite failures.
Isolation requirement: if, between steps 3 and 6, another transaction is allowed to access the partially updated database, it will see an inconsistent database (the sum A + B will be less than it should be), violating the isolation requirement. Isolation can be ensured trivially by running transactions serially.
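
The atomicity requirement of the funds transfer can be tried out with a small sketch. It uses Python's standard `sqlite3` module as a stand-in DBMS; the table name `account`, the starting balances, and the `fail_midway` switch are assumptions made for illustration and are not part of the slides.

```python
import sqlite3


def make_bank():
    """Create an in-memory database with two accounts (balances are assumed)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER)")
    conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 100), ("B", 200)])
    conn.commit()
    return conn


def transfer(conn, src, dst, amount, fail_midway=False):
    """Move `amount` from src to dst as one transaction; return True on commit."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
            if fail_midway:
                # Hypothetical failure between write(A) and write(B).
                raise RuntimeError("simulated crash")
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
        return True
    except RuntimeError:
        return False


def balances(conn):
    """Return the current balances as a dict such as {'A': 100, 'B': 200}."""
    return dict(conn.execute("SELECT name, balance FROM account"))
```

Calling `transfer(conn, "A", "B", 50, fail_midway=True)` leaves both balances untouched, so the sum A + B is preserved; the same call without the failure moves the $50.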

The Transaction Model Examples of primitives for transactions.
Primitive            Description
BEGIN_TRANSACTION    Mark the start of a transaction
END_TRANSACTION      Terminate the transaction and try to commit
ABORT_TRANSACTION    Kill the transaction and restore the old values
READ                 Read data from a file, a table, or otherwise
WRITE                Write data to a file, a table, or otherwise

The Transaction Model- Aborts
(a) Transaction to reserve three flights commits:
    BEGIN_TRANSACTION
        reserve SFO -> JFK;
        reserve JFK -> Nairobi;
        reserve Nairobi -> Malindi;
    END_TRANSACTION
(b) Transaction aborts when third flight is unavailable:
    BEGIN_TRANSACTION
        reserve SFO -> JFK;
        reserve JFK -> Nairobi;
        reserve Nairobi -> Malindi full => ABORT_TRANSACTION

Transaction States
- Active: the transaction is executing.
- Failed: after the discovery that normal execution can no longer proceed.
- Aborted: after the transaction has been rolled back and the database restored to its state prior to the start of the transaction. Two options after it has been aborted: restart the transaction (only if there was no internal logical error), or kill the transaction.
- Committed: after successful completion.

Reasons for Concurrency Multiple transactions are allowed to run concurrently in the system. Advantages are:
- increased processor and disk utilization, leading to better transaction throughput: one transaction can be using the CPU while another is reading from or writing to the disk
- reduced average response time for transactions: short transactions need not wait behind long ones.

Distributed Transactions Flat transactions versus nested transactions (which allow partial results to be committed). A nested transaction is a transaction that is logically decomposed into a hierarchy of subtransactions. A distributed transaction is a logically flat indivisible transaction that operates on distributed data.

Distributed Transactions a) A nested transaction b) A distributed transaction

Transaction Atomicity, Isolation, and Durability Conceptually, when a transaction starts, it is given a private workspace to make its changes to. When it commits, the private workspace replaces the corresponding data items in the permanent workspace. If the transaction aborts, the private workspace can simply be discarded. This type of implementation leads to many private workspaces and thus consumes a lot of space. Also, if a transaction only reads a data table or item, it doesn’t need a private copy.

Private Workspace a) The file index and disk blocks for a three-block file b) After a transaction has modified block 0 and appended block 3 c) After committing

More Efficient Implementation Two common methods of implementation are write-ahead logs and before images. With write-ahead logs, the transactions act on the permanent workspace, but before they can make a change, a log record is written to stable storage with the transaction ID, the data item ID, and the old and new values. This log can then be used if the transaction aborts and the changes need to be rolled back.

Writeahead Log (a) A transaction. (b) to (d) The log before each statement is executed.
(a) x = 0; y = 0;
    BEGIN_TRANSACTION;
        x = x + 1;
        y = y + 2;
        x = y * y;
    END_TRANSACTION;
(b) Log: [x = 0/1]
(c) Log: [x = 0/1] [y = 0/2]
(d) Log: [x = 0/1] [y = 0/2] [x = 1/4]
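
The log in the figure can be reproduced with a minimal sketch. The class name `WalStore` is hypothetical, an in-memory list stands in for stable storage, and only one transaction is active at a time, so the transaction ID mentioned in the slides is left out of the log records.

```python
class WalStore:
    """Toy store that logs (item, old, new) before applying each write."""

    def __init__(self, data):
        self.data = dict(data)
        self.log = []  # stands in for the log on stable storage

    def write(self, item, new_value):
        # Write-ahead: the log record is appended before the data changes.
        self.log.append((item, self.data[item], new_value))
        self.data[item] = new_value

    def abort(self):
        # Undo in reverse order, restoring each old value from the log.
        while self.log:
            item, old_value, _ = self.log.pop()
            self.data[item] = old_value

    def commit(self):
        # The changes are already in place; the undo records are not needed.
        self.log.clear()


store = WalStore({"x": 0, "y": 0})
store.write("x", store.data["x"] + 1)   # log: [x = 0/1]
store.write("y", store.data["y"] + 2)   # log: [x = 0/1] [y = 0/2]
store.write("x", store.data["y"] ** 2)  # log: [x = 0/1] [y = 0/2] [x = 1/4]
log_before_abort = list(store.log)
store.abort()                           # x and y are back to 0
```

Rolling back walks the log backwards, which is why the old value has to be recorded before the write is applied.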

Before- and After- Images A before- and after-image is kept for each data item. When a data item is changed, the old value is written to the before-image and the new value is the after-image. Other transactions are not allowed to “see” the new value until the current transaction commits. The after-image is made permanent and durable once the transaction which wrote it commits. If the transaction aborts, the before-image is restored.

DBMS Organization General organization of managers for handling transactions.

DBMS Organization General organization of managers for handling distributed transactions.

Concurrency Control Concurrency control schemes are mechanisms to achieve isolation, i.e., to control the interaction among the concurrent transactions in order to prevent them from destroying the consistency of the database. We need a definition of correct execution of transactions: serializability.

Transaction Schedules Schedules are sequences that indicate the chronological order in which instructions of concurrent transactions are executed.
- A schedule for a set of transactions must consist of all instructions of those transactions.
- A schedule must preserve the order in which the instructions appear in each individual transaction.

Example Schedule Let T1 transfer $50 from A to B, and T2 transfer 10% of the balance from A to B. Here is a serial schedule, in which T1 is followed by T2.

Example Continued Let T1 and T2 be the transactions defined previously. This schedule is not a serial schedule, but it is equivalent to Schedule 1. That is, the sum A+B is preserved. All the effects are the same as they would be if the schedule were serial.

Not Serializable This concurrent schedule does not preserve the value of the sum A + B and is not equivalent to the serial schedule.
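
The schedule figures did not survive in this transcript, so the following sketch replays T1 and T2 under three interleavings chosen for illustration. The starting balances (A = 1000, B = 2000) and the exact interleavings are assumptions; each transaction is a generator that performs one read or write per step.

```python
def t1(db):
    """Transfer $50 from A to B, one read or write per step."""
    a = db["A"]; yield          # read(A)
    db["A"] = a - 50; yield     # write(A)
    b = db["B"]; yield          # read(B)
    db["B"] = b + 50; yield     # write(B)


def t2(db):
    """Transfer 10% of A's balance from A to B."""
    a = db["A"]; yield          # read(A)
    temp = a // 10              # local computation (A assumed divisible by 10)
    db["A"] = a - temp; yield   # write(A)
    b = db["B"]; yield          # read(B)
    db["B"] = b + temp; yield   # write(B)


def run(schedule, a=1000, b=2000):
    """Execute the steps in the given order and return the final database."""
    db = {"A": a, "B": b}
    txns = {"T1": t1(db), "T2": t2(db)}
    for name in schedule:
        next(txns[name])  # advance that transaction by one read or write
    return db


serial = run(["T1"] * 4 + ["T2"] * 4)
equivalent = run(["T1", "T1", "T2", "T2", "T1", "T1", "T2", "T2"])
broken = run(["T1", "T2", "T2", "T2", "T1", "T1", "T1", "T2"])
```

Under these assumptions `serial` and `equivalent` both end with A = 855 and B = 2145 (sum 3000), while `broken` ends with A = 950 and B = 2100 (sum 3050), because T1 overwrites A using a value it read before T2's update.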

Serializability Assumption – Each transaction preserves database consistency. Thus a serial execution of a set of transactions preserves database consistency. A (possibly concurrent) schedule is serializable if it is equivalent to a serial schedule. We assume that transactions may perform arbitrary computations on data in local buffers in between reads and writes. Our simplified schedules consist of only read and write instructions.

Conflict Serializability Instructions l_i and l_j of transactions T_i and T_j respectively, conflict if and only if there exists some item Q accessed by both l_i and l_j, and at least one of these instructions wrote Q.
1. l_i = read(Q), l_j = read(Q). l_i and l_j don’t conflict.
2. l_i = read(Q), l_j = write(Q). They conflict.
3. l_i = write(Q), l_j = read(Q). They conflict.
4. l_i = write(Q), l_j = write(Q). They conflict.
Intuitively, a conflict between l_i and l_j forces a (logical) temporal order between them. If l_i and l_j are consecutive in a schedule and they do not conflict, their results would remain the same even if they were interchanged.

Conflict Serializable If a schedule S can be transformed into a schedule S´ by a series of swaps of non-conflicting instructions, we say that S and S´ are conflict equivalent. We say that a schedule S is conflict serializable if it is conflict equivalent to a serial schedule. Example of a schedule that is not conflict serializable:
T3: read(Q)
T4: write(Q)
T3: write(Q)
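
A brute-force check follows directly from the definition: a schedule is conflict equivalent to a serial order exactly when every pair of conflicting operations appears in that same transaction order. The sketch below is an illustration, not a method given in the slides; the tuple encoding `(transaction, "r" or "w", item)` and the function names are assumptions, and trying every permutation is only practical for a handful of transactions.

```python
from itertools import permutations


def conflicts(a, b):
    """Two operations conflict if they come from different transactions,
    touch the same item, and at least one of them is a write."""
    (txn_a, op_a, item_a), (txn_b, op_b, item_b) = a, b
    return txn_a != txn_b and item_a == item_b and "w" in (op_a, op_b)


def conflict_serial_order(schedule):
    """Return a conflict-equivalent serial order, or None if there is none."""
    txns = sorted({txn for txn, _, _ in schedule})
    # (Ti, Tj) means an operation of Ti precedes a conflicting one of Tj.
    must_precede = {
        (a[0], b[0])
        for i, a in enumerate(schedule)
        for b in schedule[i + 1:]
        if conflicts(a, b)
    }
    for order in permutations(txns):
        rank = {txn: k for k, txn in enumerate(order)}
        if all(rank[first] < rank[second] for first, second in must_precede):
            return order
    return None


# The T3/T4 example from the slide: read(Q) by T3, write(Q) by T4, write(Q) by T3.
not_serializable = [("T3", "r", "Q"), ("T4", "w", "Q"), ("T3", "w", "Q")]

# An interleaving in which every conflict orders T1 before T2.
serializable = [
    ("T1", "r", "A"), ("T1", "w", "A"), ("T2", "r", "A"), ("T2", "w", "A"),
    ("T1", "r", "B"), ("T1", "w", "B"), ("T2", "r", "B"), ("T2", "w", "B"),
]
```

For the T3/T4 schedule the function returns `None`, since T3 must come both before and after T4; for the second schedule it returns `("T1", "T2")`.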

Conflict Serializable This schedule is conflict serializable.

Remember Recovery? The DB must behave as if it contains all of the effects of committed transactions and none of the effects of uncommitted ones. So, when a transaction aborts, the DBMS must wipe out all its effects. If a transaction, t1, writes a value to data item x, and t2 reads that value, what happens when t1 subsequently aborts?

Recoverable Schedules If T8 should abort, T9 would have read (and possibly shown to the user) an inconsistent database state. Hence if T9 is allowed to commit before T8, the schedule is not recoverable. In order to be recoverable, a transaction is not allowed to commit until every transaction it reads from has committed.

Cascading Aborts A cascading abort occurs when a single transaction failure leads to a series of transaction rollbacks. Consider the following schedule, where none of the transactions has yet committed (so the schedule is recoverable). If T10 fails, T11 and T12 must also be rolled back.

How to Avoid Cascading Aborts If we ensure that every transaction reads only those items whose values were written by committed transactions, the schedule will avoid cascading aborts. This restricts the reads of a transaction. What about the writes? Will restricting writes give us a useful property?
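
The two conditions, recoverability and avoidance of cascading aborts, can be checked mechanically on a recorded schedule. The following is a minimal sketch under stated assumptions: a schedule is a list of `(transaction, op, item)` tuples with op `"r"`, `"w"`, or `"c"` (commit), aborts are not modelled, and the function names are hypothetical.

```python
def reads_from(schedule):
    """Yield (reader, writer, position) for each read of another txn's write."""
    last_writer = {}
    for pos, (txn, op, item) in enumerate(schedule):
        if op == "w":
            last_writer[item] = txn
        elif op == "r":
            writer = last_writer.get(item)
            if writer is not None and writer != txn:
                yield txn, writer, pos


def commit_positions(schedule):
    """Map each committed transaction to the position of its commit."""
    return {txn: pos for pos, (txn, op, _) in enumerate(schedule) if op == "c"}


def is_recoverable(schedule):
    """No transaction commits before a transaction it read from has committed."""
    commits = commit_positions(schedule)
    for reader, writer, _ in reads_from(schedule):
        if reader in commits:
            if writer not in commits or commits[writer] > commits[reader]:
                return False
    return True


def avoids_cascading_aborts(schedule):
    """Every read sees only values written by already committed transactions."""
    commits = commit_positions(schedule)
    for _, writer, read_pos in reads_from(schedule):
        if writer not in commits or commits[writer] > read_pos:
            return False
    return True


# T9 reads T8's write and commits first: not recoverable.
bad = [("T8", "w", "A"), ("T9", "r", "A"), ("T9", "c", None)]
# T9 waits to commit until T8 has committed: recoverable, but a dirty read.
ok = [("T8", "w", "A"), ("T9", "r", "A"), ("T8", "c", None), ("T9", "c", None)]
# T9 reads only after T8 has committed: no cascading aborts possible.
clean = [("T8", "w", "A"), ("T8", "c", None), ("T9", "r", "A"), ("T9", "c", None)]
```

The three sample schedules separate the cases: `bad` fails both checks, `ok` is recoverable but could still cascade, and `clean` passes both.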

Strict Executions If we ensure that every transaction writes to only those items whose values were written by committed transactions, the schedule is strict. This nice property ensures that we only have to keep one before-image. Figure: schedule for T1, T2, T3 (write(A), abort).

Levels of Consistency (SQL92)
- Serializable: the default.
- Repeatable read: only committed records may be read, and repeated reads of the same record must return the same value. However, a transaction may not be serializable. The scheduler must maintain the RR property: if t1 makes a second read request and X has been modified, then t1 will be aborted.
- Read committed: only committed records can be read, but successive reads of a record may return different (but committed) values.
- Read uncommitted: even uncommitted records may be read (browse).

Example: Serializability (a) to (c) Three transactions T1, T2, and T3. (d) Possible schedules. Schedule 2 might be considered legal because its result is the same as that of a serial schedule: this is view serializability; it is not conflict serializable.
(a) BEGIN_TRANSACTION x = 0; x = x + 1; END_TRANSACTION
(b) BEGIN_TRANSACTION x = 0; x = x + 2; END_TRANSACTION
(c) BEGIN_TRANSACTION x = 0; x = x + 3; END_TRANSACTION
(d)
Schedule 1   x = 0; x = x + 1; x = 0; x = x + 2; x = 0; x = x + 3   Legal
Schedule 2   x = 0; x = 0; x = x + 1; x = x + 2; x = 0; x = x + 3   Illegal
Schedule 3   x = 0; x = 0; x = x + 1; x = 0; x = x + 2; x = x + 3   Illegal
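
The remark about schedule 2 can be checked by simply running the three schedules and comparing the final value of x with what the serial orders produce. The sketch below compares final values only, which is weaker than conflict serializability; the helper names are assumptions.

```python
from itertools import permutations


def zero(x):
    """The statement x = 0."""
    return 0


def add(k):
    """The statement x = x + k."""
    return lambda x: x + k


def final_x(steps):
    """Apply the statements in order and return the final value of x."""
    x = None
    for step in steps:
        x = step(x)
    return x


TRANSACTIONS = {
    "T1": [zero, add(1)],
    "T2": [zero, add(2)],
    "T3": [zero, add(3)],
}

# Final values reachable by running the three transactions one after another.
serial_results = {
    final_x([step for name in order for step in TRANSACTIONS[name]])
    for order in permutations(TRANSACTIONS)
}

schedule1 = [zero, add(1), zero, add(2), zero, add(3)]
schedule2 = [zero, zero, add(1), add(2), zero, add(3)]
schedule3 = [zero, zero, add(1), zero, add(2), add(3)]
```

The serial orders can only end with x equal to 1, 2, or 3. Schedules 1 and 2 both end with x = 3, while schedule 3 ends with x = 5, a value no serial execution can produce.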

Example Is the schedule on the next slide conflict serializable? If so, find a conflict-equivalent serial order. Remember: two operations conflict if they are from different transactions, they access the same data item, and at least one of them is a write.

T 1 T 2 T 3 T 4 T 5 read(X) read(Y) read(Z) read(V) read(W) read(W) read(Y) write(Y) write(Z) read(U) read(Y) write(Y) read(Z) write(Z) read(U) write(U)

How to Enforce Serializability? Pessimistic approach: prevent transactions from accessing data that might lead to a conflict. Optimistic approach: allow transactions to access the data, but require them to “validate” before committing.

Two Phase Locking (1) Pessimistic approach Easiest and most widely used way. Scheduler maintains a lock for each data item. An item is locked on behalf of a transaction and then no other transaction can access it. Refinement: distinguish between read locks and write locks. Read locks can be shared with other readers.

Rules for Two Phase Locking(2) Transaction must get a read or write lock on data item d before reading d and must get a write lock on d before writing to d. After a transaction relinquishes a lock, it may not acquire any new locks.
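
A small sketch can make the two rules concrete. The class names `LockTable`, `Transaction`, and `TwoPhaseViolation` are hypothetical, and a lock request that cannot be granted returns `False` instead of blocking, which a real scheduler would not do.

```python
class TwoPhaseViolation(Exception):
    """Raised when a transaction asks for a lock after releasing one."""


class LockTable:
    def __init__(self):
        self.readers = {}  # item -> set of transaction names with a read lock
        self.writer = {}   # item -> name of the transaction with the write lock


class Transaction:
    def __init__(self, name, table):
        self.name = name
        self.table = table
        self.shrinking = False  # becomes True after the first unlock

    def _require_growing(self):
        if self.shrinking:
            raise TwoPhaseViolation(f"{self.name} already released a lock")

    def read_lock(self, item):
        self._require_growing()
        holder = self.table.writer.get(item)
        if holder not in (None, self.name):
            return False  # another transaction holds the write lock
        self.table.readers.setdefault(item, set()).add(self.name)
        return True

    def write_lock(self, item):
        self._require_growing()
        holder = self.table.writer.get(item)
        other_readers = self.table.readers.get(item, set()) - {self.name}
        if holder not in (None, self.name) or other_readers:
            return False  # write locks are exclusive
        self.table.writer[item] = self.name
        return True

    def unlock(self, item):
        self.shrinking = True  # from now on, no new locks may be acquired
        self.table.readers.get(item, set()).discard(self.name)
        if self.table.writer.get(item) == self.name:
            del self.table.writer[item]
```

With two transactions sharing one `LockTable`, both can take a read lock on Q, neither can upgrade to a write lock while the other still reads, and a transaction that has unlocked anything gets a `TwoPhaseViolation` on its next lock request.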

Two-Phase Locking (3) 2 Phase Locking

Two Phase Locking (4) Strict 2PL avoids cascading aborts by preventing transactions from seeing uncommitted values. Locks are acquired then held until the transaction is ready to commit or is aborted.

Two-Phase Locking (5) Strict two-phase locking.

Two Phase Locking T1 T2 RL(Q) read(Q) RL(Q) read(Q) UL(Q) WL(Q) commit write(Q) UL(Q) commit
RL(X) means acquire a read lock on X
WL(X) means acquire a write lock on X
UL(X) means unlock X

Pessimistic Timestamp Ordering Every transaction gets a (Lamport, totally ordered) timestamp when it starts. Every data item has a read ts and a write ts and a commit bit c. The read ts is the ts of the transaction that most recently read the data item. The write ts is the ts of the transaction that most recently wrote to the item. The commit bit c is true if and only if the most recent transaction to write to that item has committed. The scheduler maintains the item timestamps and checks to make sure the reads and writes are correct. Goal is to enforce serializability.

Read Too Late T1 tries to read X, but ts(T1) < write-ts(X), meaning X has been written to by a later transaction. T1 should not be allowed to read X because it was written by a transaction that occurs later in the serialization order (transactions are serialized by start time). Solution: T1 is aborted. Timeline: T1 starts, T2 starts, T2 writes X, T1 reads X?

Write Too Late T1 tries to write X, but the read-ts indicates that some other transaction should have read the value about to be written: write-ts(X) < ts(T1) < read-ts(X). Solution: T1 is aborted. Timeline: T1 starts, T2 starts, T2 reads X, T1 writes X?

Dirty Reads T1 reads X that was last written by T2. The timestamps are properly ordered, but the commit bit c=false, so if T2 later aborts then T1 must abort. Solution: We can avoid cascading aborts by delaying T1’s read until T2 has committed (though this is not necessary to ensure serializability). Timeline: T2 starts, T1 starts, T2 writes X, T1 reads X?, T2 aborts.

Thomas Write Rule T2 has written to X before T1. When T1 tries to write, the appropriate action is to do nothing. No other transaction T3 that should have read T1’s value of X got T2’s value instead, because it would have been aborted because of a too late read. Future reads of X want T2’s value or a later value, not T1’s value. Solution: T1’s write can be skipped if T2 commits. Timeline: T1 starts, T2 starts, T2 writes X, T1 writes X?

Commit Requests Transaction commit requests are also passed to the scheduler. To ensure strict executions, a commit request can be delayed until all transactions that wrote items that it overwrote have committed. The scheduler sets the commit bit c on data items in the write set when it services the commit request.

TS Ordering Rules When the scheduler receives a read request from transaction T,
- if ts(T) >= write-ts(X) and c(X) is true, grant the request and set read-ts(X) to MAX{ts(T), read-ts(X)}
- if ts(T) >= write-ts(X) and c(X) is false, delay T until c(X) becomes true or the writing transaction aborts.
- if ts(T) < write-ts(X), abort T and restart it with a new timestamp.

TS Ordering Rules, continued When the scheduler receives a write request from transaction T,
- if ts(T) >= read-ts(X) and ts(T) >= write-ts(X), grant the request, set write-ts(X) to ts(T) and c(X) = false
- if ts(T) >= read-ts(X) and ts(T) < write-ts(X), don’t do the operation but allow T to continue as if done (Thomas write rule).
- if ts(T) < read-ts(X), abort T and restart it with a new timestamp.
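
The read and write rules translate almost line by line into code. The sketch below is one possible reading of them: the names `Item`, `on_read`, `on_write`, and `on_commit` are hypothetical, the scheduler's decision is returned as a string instead of actually delaying or restarting anything, and timestamps are plain integers.

```python
from dataclasses import dataclass


@dataclass
class Item:
    read_ts: int = 0        # ts of the latest transaction that read the item
    write_ts: int = 0       # ts of the latest transaction that wrote the item
    committed: bool = True  # the commit bit c


def on_read(ts, x):
    """Decide a read request from a transaction with timestamp ts."""
    if ts < x.write_ts:
        return "abort"      # read too late: restart with a new timestamp
    if not x.committed:
        # Literal reading of the rule; a transaction reading its own
        # tentative write would need a special case that is omitted here.
        return "delay"
    x.read_ts = max(ts, x.read_ts)
    return "grant"


def on_write(ts, x):
    """Decide a write request from a transaction with timestamp ts."""
    if ts < x.read_ts:
        return "abort"      # write too late: restart with a new timestamp
    if ts < x.write_ts:
        return "skip"       # Thomas write rule: continue as if written
    x.write_ts = ts
    x.committed = False     # tentative until the writer commits
    return "grant"


def on_commit(x):
    """Set the commit bit when the writing transaction commits."""
    x.committed = True
```

As a usage example, on a fresh item `on_write(2, x)` is granted, `on_read(1, x)` is then aborted as a too-late read, and `on_read(3, x)` is delayed until `on_commit(x)` has been called.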

Pessimistic TS Ordering If the scheduler enforces these rules, transactions will be serializable. The serial order is the order of their timestamps. The next slide is an example of 3 transactions T1, T2, and T3. T1 runs first and completes and has used every item T2 and T3 want. In a, b, c and d, T2 requests a write(x) at the end of the given sequence. In e, f, g and h, T2 requests a read(x) at the end of the sequence. In (d), T2 could continue (Thomas write rule) if there are no intervening reads. In (f) timestamps of T2 and T3 are reversed.

Pessimistic Timestamp Ordering Concurrency control using timestamps. Tent means written but transaction has not yet committed

Optimistic Timestamp Ordering In any optimistic concurrency control, each transaction does its writes to a private workspace until completion of a validation phase. In the validation phase, the scheduler validates the transaction by comparing its read set and write set with those of other transactions. After validation, the write set values are written to the database and the transaction commits. Validation is frequently done with the help of timestamps.
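
One simple way to realize the validation phase is backward validation: a transaction fails if anything it read was written by a transaction that committed after it started. The sketch below assumes this variant, which the slides do not prescribe; the class names are hypothetical, and the position in the commit log plays the role of a timestamp.

```python
class Database:
    def __init__(self, data):
        self.data = dict(data)
        self.commit_log = []  # write sets of committed transactions, in order


class OptimisticTxn:
    def __init__(self, db):
        self.db = db
        self.start = len(db.commit_log)  # logical start time
        self.read_set = set()
        self.workspace = {}              # private copies of written items

    def read(self, key):
        self.read_set.add(key)
        return self.workspace.get(key, self.db.data[key])

    def write(self, key, value):
        self.workspace[key] = value      # not visible to others yet

    def commit(self):
        # Validation: compare our read set with the write sets of all
        # transactions that committed since we started.
        committed_since_start = self.db.commit_log[self.start:]
        if any(self.read_set & write_set for write_set in committed_since_start):
            self.workspace.clear()       # abort: discard the private workspace
            return False
        # Write phase: install the workspace and record our write set.
        self.db.data.update(self.workspace)
        self.db.commit_log.append(set(self.workspace))
        return True
```

If two transactions both read A and then write it, the first to commit succeeds and the second fails validation, so its private workspace is discarded and it would have to be restarted.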

Summary and Examples

Rules for Two Phase Locking Transaction must get a read or write lock on data item d before reading d and must get a write lock on d before writing to d. A read lock can be shared with other read locks. A write lock is an exclusive lock. After a transaction relinquishes a lock, it may not acquire any new locks.

Example One Is this schedule serializable and would it be permitted under the rules of two phase locking?

Example Two Is this schedule serializable and would it be permitted under the rules of two phase locking?

Pessimistic Timestamp Ordering Every transaction gets a (Lamport, totally ordered) timestamp when it starts. Every data item has a read ts and a write ts and a commit bit c. The read ts is the ts of the transaction that most recently read the data item. The write ts is the ts of the transaction that most recently wrote to the item. The commit bit c is true if and only if the most recent transaction to write to that item has committed.

TS Ordering Rules When the scheduler receives a read request from transaction T,
- if ts(T) >= write-ts(X) and c(X) is true, grant the request and set read-ts(X) to MAX{ts(T), read-ts(X)}
- if ts(T) >= write-ts(X) and c(X) is false, delay T until c(X) becomes true or the writing transaction aborts.
- if ts(T) < write-ts(X), abort T and restart it with a new timestamp.

TS Ordering Rules, continued When the scheduler receives a write request from transaction T,
- if ts(T) >= read-ts(X) and ts(T) >= write-ts(X), grant the request, set write-ts(X) to ts(T) and c(X) = false
- if ts(T) >= read-ts(X) and ts(T) < write-ts(X), don’t do the operation but allow T to continue as if done (Thomas write rule).
- if ts(T) < read-ts(X), abort T and restart it with a new timestamp.

Example One Is this schedule serializable and how would it be handled in timestamp ordering?

Example Two Is this schedule serializable and how would it be handled in timestamp ordering?
