CMPSC 274: Transaction Processing Lecture #2: Correctness Divy Agrawal Department of Computer Science UC Santa Barbara.



1/30/2016 Transactional Information Systems 3-2
Schedules and Histories
Definition 3.1 (Schedules and histories): Let T = {t_1, ..., t_n} be a set of transactions, where each t_i ∈ T has the form t_i = (op_i, <_i) with op_i denoting the operations of t_i and <_i their ordering.
(i) A history for T is a pair s = (op(s), <_s) s.t.
(a) op(s) ⊆ ∪_{i=1..n} op_i ∪ ∪_{i=1..n} {a_i, c_i}
(b) for all i, 1 ≤ i ≤ n: c_i ∈ op(s) ⇔ a_i ∉ op(s)
(c) ∪_{i=1..n} <_i ⊆ <_s
(d) for all i, 1 ≤ i ≤ n, and all p ∈ op_i: p <_s c_i or p <_s a_i
(e) for all p, q ∈ op(s) s.t. at least one of them is a write and both access the same data item: p <_s q or q <_s p
(ii) A schedule is a prefix of a history.
Definition 3.2 (Serial history): A history s is serial if for any two transactions t_i and t_j in s, where i ≠ j, all operations from t_i are ordered in s before all operations from t_j or vice versa.
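Definition 3.2 can be checked mechanically for a totally ordered history. The sketch below is only an illustration: the string encoding (`r1[x]`, `w1[x]`, `c1`, `a1`) and the helper names `parse` and `is_serial` are assumptions made here, not part of the lecture.

```python
import re

def parse(history):
    """Split a totally ordered history such as 'r1[x]w1[x]c1' into (action, txn, item) steps."""
    pattern = r"([rwca])(\d+)(?:\[(\w+)\])?"
    return [(a, int(t), x) for a, t, x in re.findall(pattern, history)]

def is_serial(history):
    """Definition 3.2: once another transaction has started, a transaction never resumes."""
    finished, current = set(), None
    for _, txn, _ in parse(history):
        if txn != current:
            if txn in finished:          # txn resumes after being interrupted
                return False
            if current is not None:
                finished.add(current)
            current = txn
    return True

print(is_serial("r1[x]w1[x]c1r2[x]w2[y]c2"))   # True
print(is_serial("r1[x]r2[x]w1[x]w2[y]c1c2"))   # False
```

The check runs in one pass because a serial history is simply a concatenation of complete transactions.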

History Example
[Figure: partial-order history over the steps r1[x], r2[x], w2[y], w1[x], w3[y], r1[z], r3[z], w3[z]]
One compatible total order: r1[x] r2[x] r1[z] w1[x] w2[y] r3[z] w3[y] w3[z] c1 c2 c3


Histories
Without loss of generality:
– Examples will be total orders
Notations:
– Trans(H): transactions in H
– Commit(H): committed in H
– Abort(H): aborted in H
– Active(H): not committed and not aborted.

Correctness
Function σ: S → {0,1} such that correct(S) = {s in S | σ(s) = 1}
Pragmatic considerations:
– correct(S) ≠ ∅
– correct(S) is efficiently decidable
– correct(S) is sufficiently large (WHY?)
Goal: develop several such criteria, given that the semantics of the transactions are not known.

Correctness
Syntactic semantics for schedules, based on an intuitive notion:
– Each transaction is a correct mapping, i.e., it takes a consistent database state DB to a consistent state DB’.
– Hence, a serial execution of transactions will be correct.
[Figure: transaction T maps a consistent DB to a consistent DB’]

General Idea
– Define a notion of equivalence of two schedules S1 and S2.
– Use this notion to accept as correct all schedules that are “equivalent” to some serial schedule.
– How do we establish this equivalence notion?

Semantics
Equivalence via a notion of semantics:
– We do not know the semantics of transaction programs.
– We therefore need a notion that is general and powerful enough to capture all possible semantics of transactions.

Herbrand Semantics
– A read operation ri[x] reads the value written by the last write on x that occurs before ri[x].
– A write operation wi[x] writes a value that potentially depends on the values of all data items that Ti has read prior to wi[x].

Herbrand Semantics
Abstract notion of semantics:
1. ri[x] reads the last wj[x] (j≠i) before ri[x].
2. wi[x] depends on:
   1. Data from the DB
   2. Transactions in ACTIVE ∪ COMMIT prior to wi[x].
Last write is well defined!!! Why?
– Assumption I: No transaction aborts.
– Assumption II: Initial transaction: w0[entire-database], or equivalently w0[x, y, z, …].
[Figure: wi[x], wj[x], rk[x]]

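Formal Definition: H-Semantics
– Hs(ri[x]) = Hs(wj[x]), where wj[x] is the last write on x before ri[x].
– Hs(wi[x]) = f_ix(Hs(ri[y1]), …, Hs(ri[ym])), where ri[y1], …, ri[ym] are the reads of Ti before wi[x] and f_ix is an uninterpreted function symbol (written f2x, f1y, … in the examples).
– HU (Herbrand Universe) for a transaction: what is conveyed to the transaction.
– HS for schedules: what is the permanent effect of the schedule of transactions.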

Example
S = w0[x] w0[y] c0 r1[x] r2[y] w2[x] w1[y] c1 c2
Hs[w0[x]] = f0x()
Hs[w0[y]] = f0y()
Hs[r1[x]] = f0x()
Hs[r2[y]] = f0y()
Hs[w2[x]] = f2x(Hs[r2[y]]) = f2x(f0y())
Hs[w1[y]] = f1y(Hs[r1[x]]) = f1y(f0x())

Herbrand Universe
Let D = {x, y, z, …} be a finite set of data items. For a transaction T, let op(T) denote all the steps of T. The HU of Ti is defined by:
– f0x() is in HU for each x in D.
– If wi[x] is in op(Ti), then f_ix(v1, …, vm) is in HU, where v1, …, vm are values in HU for the data items read by Ti before wi[x].

History Semantics
H[h]: D → HU
H[h](x) := Hs(wi[x]), where wi[x] is the last operation in h writing x.
In other words, the semantics of a history h is the set of values that are written last in h.

Why are we doing all this? General/abstract notion of semantics. Can work with any interpretation of the transaction program, i.e., we do not have to worry about the program semantics of the transaction as to how they manipulate the data.

Example
h = w0[x] w0[y] c0 r1[x] r2[y] w2[x] w1[y] c2 c1
H[h](x) = Hs[w2[x]] = f2x(f0y())
H[h](y) = Hs[w1[y]] = f1y(f0x())
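The Herbrand semantics of a history can be computed in one left-to-right pass. The following is a minimal sketch, assuming histories are written as strings like `r1[x]w2[x]` and terms are rendered as strings like `f2x(f0y())`; the function name `herbrand` is illustrative only.

```python
import re

def herbrand(history):
    """Return H[h] as a dict: data item -> Herbrand term of the last write on it."""
    last = {}    # item -> term of the most recent write
    reads = {}   # transaction -> terms it has read so far, in order
    for action, txn, item in re.findall(r"([rw])(\d+)\[(\w+)\]", history):
        # An item not written yet holds the value of the initial write w0.
        value = last.setdefault(item, f"f0{item}()")
        if action == "r":
            reads.setdefault(txn, []).append(value)
        else:
            args = ",".join(reads.get(txn, []))
            last[item] = f"f{txn}{item}({args})"
    return last

h = "w0[x]w0[y]c0r1[x]r2[y]w2[x]w1[y]c2c1"
print(herbrand(h))   # {'x': 'f2x(f0y())', 'y': 'f1y(f0x())'}
```

Commit steps carry no data and are skipped by the pattern, which matches the assumption that no transaction aborts.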

Final State Equivalence
Let S and S’ be schedules over the same set of transactions. S is final state equivalent to S’ if H[S] = H[S’].

Example
S = r1[x] r2[y] w1[y] r3[z] w3[z] r2[x] w2[z] w1[x]
S’ = r3[z] w3[z] r2[y] r2[x] w2[z] r1[x] w1[y] w1[x]
H[S](x) = f1x(f0x()) = H[S’](x)
H[S](y) = f1y(f0x()) = H[S’](y)
H[S](z) = f2z(f0x(), f0y()) = H[S’](z)

Another Example
S = r1[x] r2[y] w1[y] w2[y] c1 c2
S’ = r1[x] w1[y] r2[y] w2[y] c1 c2
H[S](y) = f2y(f0y())
H[S’](y) = f2y(H[S’](r2[y])) = f2y(f1y(f0x()))
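Both examples can be replayed by comparing final Herbrand states directly. This is a sketch under the same assumed string encoding as the slides (`r1[x]`, `w1[y]`); `final_state` and `final_state_equivalent` are hypothetical helper names, and arguments of a term are listed in the order the transaction read them.

```python
import re

STEP = r"([rw])(\d+)\[(\w+)\]"

def final_state(schedule):
    """Herbrand term of the last write on every data item touched by the schedule."""
    last, reads = {}, {}
    for action, txn, item in re.findall(STEP, schedule):
        value = last.setdefault(item, f"f0{item}()")   # implicit initial write w0
        if action == "r":
            reads.setdefault(txn, []).append(value)
        else:
            args = ",".join(reads.get(txn, []))
            last[item] = f"f{txn}{item}({args})"
    return last

def final_state_equivalent(s1, s2):
    """Same operations and the same final Herbrand state."""
    same_ops = sorted(re.findall(STEP, s1)) == sorted(re.findall(STEP, s2))
    return same_ops and final_state(s1) == final_state(s2)

print(final_state_equivalent("r1[x]r2[y]w1[y]r3[z]w3[z]r2[x]w2[z]w1[x]",
                             "r3[z]w3[z]r2[y]r2[x]w2[z]r1[x]w1[y]w1[x]"))  # True
print(final_state_equivalent("r1[x]r2[y]w1[y]w2[y]",
                             "r1[x]w1[y]r2[y]w2[y]"))                      # False
```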

Observations
The example shows that we cannot determine equivalence from the final write operations alone. What preceded them must also be taken into account.
– In S: the final value of y is based on the initial value of y.
– In S’: the final value of y is based on the value of y written by T1.
Our task: can we build a tool that determines equivalence efficiently?

Reads-from Relation, Useful, Alive, and Dead Steps
– rj[x] reads-x-from wi[x] if wi[x] is the last write on x such that wi[x] < rj[x].
– RF(S) = {(Ti, x, Tj) | rj[x] reads-x-from wi[x]}
– Step p is directly useful for step q, denoted p → q, if:
   – q reads-from p, or
   – p is a read step and q is a subsequent write in the same transaction.
– →* is the reflexive and transitive closure of →.

Reads-from Relation, Useful, Alive, and Dead Steps
– p is alive in S if it is useful for some step in T∞, i.e., there exists q in T∞ such that p →* q.
– Otherwise p is dead in S.
Live reads-from relation:
– LRF(S) = {(Ti, x, Tj) | rj[x] is alive and (Ti, x, Tj) is in RF(S)}

Example
S = w0[x,y] r1[x] r2[y] w1[y] w2[y] r∞[x,y]
S’ = w0[x,y] r1[x] w1[y] r2[y] w2[y] r∞[x,y]
RF(S) = {(T0,x,T1), (T0,y,T2), (T0,x,T∞), (T2,y,T∞)}
RF(S’) = {(T0,x,T1), (T1,y,T2), (T0,x,T∞), (T2,y,T∞)}
r2[y] is alive in S and S’ (verify).
r1[x] is dead in S but alive in S’ (verify).

Example (contd.)
LRF(S) = {(T0,y,T2), (T0,x,T∞), (T2,y,T∞)}
LRF(S’) = RF(S’)
Redefine FSE: S and S’ are final state equivalent if and only if LRF(S) = LRF(S’) (proof omitted).
Next: build a tool that “efficiently” identifies the LRF relations: the STEP GRAPH.
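RF and LRF for this example can be computed by recording, for every read, the write it reads from, and then walking backwards from the final reads of T∞ along the "directly useful" edges. The sketch below adds T0 and T∞ itself; the encoding and the names `extend`, `rf`, and `lrf` are assumptions for illustration.

```python
import re

def extend(schedule):
    """Parse steps and add the initial writes of T0 and the final reads of T∞."""
    steps = re.findall(r"([rw])(\d+)\[(\w+)\]", schedule)
    items = sorted({x for _, _, x in steps})
    return [("w", "0", x) for x in items] + steps + [("r", "∞", x) for x in items]

def sources(steps):
    """Map the index of each read step to the index of the write it reads from."""
    last_write, src = {}, {}
    for i, (action, _, item) in enumerate(steps):
        if action == "w":
            last_write[item] = i
        else:
            src[i] = last_write[item]
    return src

def rf(schedule):
    steps = extend(schedule)
    return {(steps[w][1], steps[r][2], steps[r][1]) for r, w in sources(steps).items()}

def lrf(schedule):
    steps = extend(schedule)
    src = sources(steps)
    alive, stack = set(), [i for i, st in enumerate(steps) if st[1] == "∞"]
    while stack:                       # backward search over "directly useful" edges
        q = stack.pop()
        if q in alive:
            continue
        alive.add(q)
        action, txn, _ = steps[q]
        if action == "r":
            stack.append(src[q])       # the write that q reads from
        else:                          # earlier reads of the same transaction
            stack.extend(i for i in range(q) if steps[i][0] == "r" and steps[i][1] == txn)
    return {(steps[w][1], steps[r][2], steps[r][1]) for r, w in src.items() if r in alive}

S, S2 = "r1[x]r2[y]w1[y]w2[y]", "r1[x]w1[y]r2[y]w2[y]"
print(lrf(S) == lrf(S2))   # False
```

Transaction identifiers are kept as strings so that "∞" can be used for the final transaction.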

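Step Graph Construction
Construct the step graph D(S) = (V, E) where:
– V = op(S)
– E = {(p,q) | p, q in V and p → q}
It can be shown that LRF(S) = LRF(S’) iff D(S) = D(S’) once the dead steps are removed from both graphs (the reduced step graph D1 of Theorem 3.1).
Hence S f.s.e. S’ iff op(S) = op(S’) and the reduced step graphs are equal.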

Examples to check FSE using Step Graph
S = r1[x] r2[y] w1[y] r3[z] w3[z] r2[x] w2[z] w1[x]
S’ = r3[z] w3[z] r2[y] r2[x] w2[z] r1[x] w1[y] w1[x]
Construct D(S) and D(S’) in class.

Another Example
S = r1[x] r2[y] w1[y] w2[y]
S’ = r1[x] w1[y] r2[y] w2[y]
Construct D(S) and D(S’) to check FSE.

FSR: Example 3.9
s = r1(x) r2(y) w1(y) w2(y)
s’ = r1(x) w1(y) r2(y) w2(y)
[Figure: step graphs D(s) and D(s’) over the steps w0(x), w0(y), r1(x), w1(y), r2(y), w2(y), r∞(x), r∞(y), with dead steps marked]

Testing for FSE
– FSE can be decided in time polynomial in the length of the two schedules.
– FSR: a history S is FSR if there exists a serial history S’ such that S is FSE to S’.
– Example: S = r1[x] r2[y] w1[y] r3[z] w3[z] r2[x] w2[z] w1[x] is FSE to the serial history T3-T2-T1 (verify).

Testing for FSR
How to test for FSR:
– Try all N! serializations of N transactions. Not efficient!!!
More importantly: let’s revisit our examples of Lost Update and Fund Transfer and see whether FSR works from the application point of view.

Lost Update
History corresponding to lost update:
– H = r1[x] r2[x] w1[x] w2[x]
– Possible serializations: H1 = r1[x] w1[x] r2[x] w2[x] or H2 = r2[x] w2[x] r1[x] w1[x]
Construct D(H), D(H1), and D(H2) and check that H is FSE to neither H1 nor H2.

Fund Transfer
Fund transfer history:
– H = r2[x] w2[x] r1[x] r1[y] r2[y] w2[y]
– H is FSE to both T1-T2 and T2-T1, although T1 reads x after and y before T2’s update.
Even if we can develop an efficient tool to enforce FSR executions, FSR is not good enough for our purpose.
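The "try all N! serializations" test can be written down directly for small examples. The sketch below compares final Herbrand states rather than step graphs, which is an equivalent but simpler check chosen here for brevity; the function names are hypothetical.

```python
import re
from itertools import permutations

def parse(schedule):
    return [(a, int(t), x) for a, t, x in re.findall(r"([rw])(\d+)\[(\w+)\]", schedule)]

def final_state(steps):
    """Herbrand term of the last write on every data item."""
    last, reads = {}, {}
    for action, txn, item in steps:
        value = last.setdefault(item, f"f0{item}()")   # implicit initial write w0
        if action == "r":
            reads.setdefault(txn, []).append(value)
        else:
            args = ",".join(reads.get(txn, []))
            last[item] = f"f{txn}{item}({args})"
    return last

def fsr_serial_orders(schedule):
    """All serial orders that are final state equivalent to the schedule (N! candidates)."""
    steps = parse(schedule)
    txns = sorted({t for _, t, _ in steps})
    target = final_state(steps)
    return [order for order in permutations(txns)
            if final_state([st for t in order for st in steps if st[1] == t]) == target]

print(fsr_serial_orders("r1[x]r2[x]w1[x]w2[x]"))             # [] : lost update is not FSR
print(fsr_serial_orders("r2[x]w2[x]r1[x]r1[y]r2[y]w2[y]"))   # [(1, 2), (2, 1)]
```

The second result reproduces the point of the slide: the fund transfer history passes the FSR test even though T1 observes an inconsistent state.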

Key Insight
We need to strengthen the notion of final state serializability:
– By not only focusing on the final state of the database
– But also requiring that the “database view” observed by each transaction in the equivalent schedules is identical.
NEXT LECTURE.

Transactional Information Systems: Theory, Algorithms, and the Practice of Concurrency Control and Recovery
Gerhard Weikum and Gottfried Vossen
“Teamwork is essential. It allows you to blame someone else.” (Anonymous)
© 2002 Morgan Kaufmann ISBN

Part II: Concurrency Control
3 Concurrency Control: Notions of Correctness for the Page Model
4 Concurrency Control Algorithms
5 Multiversion Concurrency Control
6 Concurrency Control on Objects: Notions of Correctness
7 Concurrency Control Algorithms on Objects
8 Concurrency Control on Relational Databases
9 Concurrency Control on Search Structures
10 Implementation and Pragmatic Issues

Chapter 3: Concurrency Control – Notions of Correctness for the Page Model
3.2 Canonical Synchronization Problems
3.3 Syntax of Histories and Schedules
3.4 Correctness of Histories and Schedules
3.5 Herbrand Semantics of Schedules
3.6 Final-State Serializability
3.7 View Serializability
3.8 Conflict Serializability
3.9 Commit Serializability
3.10 An Alternative Criterion: Interleaving Specifications
3.11 Lessons Learned
“Nothing is as practical as a good theory.” (Albert Einstein)

Lost Update Problem
/* initially x = 100 */
Time   P1             P2
1      r(x)
2                     r(x)
3      x := x+100
4                     x := x+200
5      w(x)                          /* x = 200 */
6                     w(x)           /* x = 300 */
The update by P1 is “lost”.
Observation: the problem is the interleaving r1(x) r2(x) w1(x) w2(x)

Inconsistent Read Problem
Time   P1                  P2
1                          r(x)
2                          x := x – 10
3                          w(x)
4      sum := 0
5      r(x)
6      r(y)
7      sum := sum + x
8      sum := sum + y
9                          r(y)
10                         y := y + 10
11                         w(y)
P1 “sees” the wrong sum.
Observations:
– The problem is the interleaving r2(x) w2(x) r1(x) r1(y) r2(y) w2(y)
– There is no problem with sequential execution

Dirty Read Problem
Time   P1                    P2
1      r(x)
2      x := x …
3      w(x)
4                            r(x)
5                            x := x …
6      failure & rollback
7                            w(x)
P2 cannot rely on the validity of previously read data.
Observation: transaction rollbacks could affect concurrent transactions



Example Schedules and Notation
Example 3.4: [Figure: partial-order history over the steps r1(x), r1(z), w1(x), c1 of t1; r2(x), w2(y), c2 of t2; and r3(z), w3(y), w3(z), c3 of t3]
Example 3.6: r1(x) r2(z) r3(x) w2(x) w1(x) r3(y) r1(y) w1(y) w2(z) w3(z) c1 a3
trans(s) := {t_i | s contains a step of t_i}
commit(s) := {t_i ∈ trans(s) | c_i ∈ s}
abort(s) := {t_i ∈ trans(s) | a_i ∈ s}
active(s) := trans(s) – (commit(s) ∪ abort(s))


Correctness of Schedules
1. Define an equivalence relation ≈ on the set S of all schedules.
2. “Good” schedules are those in the equivalence classes of serial schedules.
– Equivalence must be efficiently decidable.
– “Good” equivalence classes should be “sufficiently large”.
For the moment, disregard aborts: assume that all transactions are committed.

Activity
What is an equivalence relation? List the three defining conditions!


Herbrand Semantics of Schedules
Definition 3.3 (Herbrand Semantics of Steps): For schedule s the Herbrand semantics H_s of steps r_i(x), w_i(x) ∈ op(s) is:
(i) H_s[r_i(x)] := H_s[w_j(x)], where w_j(x) is the last write on x in s before r_i(x).
(ii) H_s[w_i(x)] := f_ix(H_s[r_i(y_1)], ..., H_s[r_i(y_m)]), where the r_i(y_j), 1 ≤ j ≤ m, are all read operations of t_i that occur in s before w_i(x) and f_ix is an uninterpreted m-ary function symbol.
Definition 3.4 (Herbrand Universe): For data items D = {x, y, z, ...} and transactions t_i, 1 ≤ i ≤ n, the Herbrand universe HU is the smallest set of symbols s.t.
(i) f_0x( ) ∈ HU for each x ∈ D, where f_0x is a constant, and
(ii) if w_i(x) ∈ op_i for some t_i, there are m read operations r_i(y_1), ..., r_i(y_m) that precede w_i(x) in t_i, and v_1, ..., v_m ∈ HU, then f_ix(v_1, ..., v_m) ∈ HU.
Definition 3.5 (Schedule Semantics): The Herbrand semantics of a schedule s is the mapping H[s]: D → HU defined by H[s](x) := H_s[w_i(x)], where w_i(x) is the last operation from s writing x, for each x ∈ D.

Herbrand Semantics: Example
s = w0(x) w0(y) c0 r1(x) r2(y) w2(x) w1(y) c2 c1
H_s[w0(x)] = f0x( )
H_s[w0(y)] = f0y( )
H_s[r1(x)] = H_s[w0(x)] = f0x( )
H_s[r2(y)] = H_s[w0(y)] = f0y( )
H_s[w2(x)] = f2x(H_s[r2(y)]) = f2x(f0y( ))
H_s[w1(y)] = f1y(H_s[r1(x)]) = f1y(f0x( ))
H[s](x) = H_s[w2(x)] = f2x(f0y( ))
H[s](y) = H_s[w1(y)] = f1y(f0x( ))


Final-State Equivalence
Definition 3.6 (Final State Equivalence): Schedules s and s’ are called final state equivalent, denoted s ≈f s’, if op(s) = op(s’) and H[s] = H[s’].
Example a:
s = r1(x) r2(y) w1(y) r3(z) w3(z) r2(x) w2(z) w1(x)
s’ = r3(z) w3(z) r2(y) r2(x) w2(z) r1(x) w1(y) w1(x)
H[s](x) = H_s[w1(x)] = f1x(f0x( )) = H_s’[w1(x)] = H[s’](x)
H[s](y) = H_s[w1(y)] = f1y(f0x( )) = H_s’[w1(y)] = H[s’](y)
H[s](z) = H_s[w2(z)] = f2z(f0x( ), f0y( )) = H_s’[w2(z)] = H[s’](z)
⇒ s ≈f s’
Example b:
s = r1(x) r2(y) w1(y) w2(y)
s’ = r1(x) w1(y) r2(y) w2(y)
H[s](y) = H_s[w2(y)] = f2y(f0y( ))
H[s’](y) = H_s’[w2(y)] = f2y(f1y(f0x( )))
⇒ ¬(s ≈f s’)

Reads-from Relation
Definition 3.7 (Reads-from Relation; Useful, Alive, and Dead Steps): Given a schedule s, extended with an initial and a final transaction, t_0 and t_∞.
(i) r_j(x) reads x in s from w_i(x) if w_i(x) is the last write on x s.t. w_i(x) <_s r_j(x).
(ii) The reads-from relation of s is RF(s) := {(t_i, x, t_j) | an r_j(x) reads x from a w_i(x)}.
(iii) Step p is directly useful for step q, denoted p → q, if q reads from p, or p is a read step and q is a subsequent write step of the same transaction. →*, the “useful” relation, denotes the reflexive and transitive closure of →.
(iv) Step p is alive in s if it is useful for some step from t_∞, and dead otherwise.
(v) The live-reads-from relation of s is LRF(s) := {(t_i, x, t_j) | an alive r_j(x) reads x from w_i(x)}.
Example 3.7:
s = r1(x) r2(y) w1(y) w2(y)
s’ = r1(x) w1(y) r2(y) w2(y)
RF(s) = {(t0,x,t1), (t0,y,t2), (t0,x,t∞), (t2,y,t∞)}
RF(s’) = {(t0,x,t1), (t1,y,t2), (t0,x,t∞), (t2,y,t∞)}
LRF(s) = {(t0,y,t2), (t0,x,t∞), (t2,y,t∞)}
LRF(s’) = {(t0,x,t1), (t1,y,t2), (t0,x,t∞), (t2,y,t∞)}

Final-State Serializability
Definition 3.8 (Final State Serializability): A schedule s is final state serializable if there is a serial schedule s’ s.t. s ≈f s’. FSR denotes the class of all final-state serializable histories.
Theorem 3.1: For schedules s and s’ the following statements hold.
(i) s ≈f s’ iff op(s) = op(s’) and LRF(s) = LRF(s’).
(ii) For s let the step graph D(s) = (V, E) be a directed graph with vertices V := op(s) and edges E := {(p,q) | p → q}, and the reduced step graph D1(s) be derived from D(s) by removing all vertices that correspond to dead steps. Then LRF(s) = LRF(s’) iff D1(s) = D1(s’).
Corollary 3.1: Final-state equivalence of two schedules s and s’ can be decided in time that is polynomial in the length of the two schedules.


Canonical Anomalies Reconsidered
Lost update anomaly: L = r1(x) r2(x) w1(x) w2(x) c1 c2 ⇒ history is not FSR
LRF(L) = {(t0,x,t2), (t2,x,t∞)}
LRF(t1 t2) = {(t0,x,t1), (t1,x,t2), (t2,x,t∞)}
LRF(t2 t1) = {(t0,x,t2), (t2,x,t1), (t1,x,t∞)}
Inconsistent read anomaly: I = r2(x) w2(x) r1(x) r1(y) r2(y) w2(y) c1 c2 ⇒ history is FSR!
LRF(I) = {(t0,x,t2), (t0,y,t2), (t2,x,t∞), (t2,y,t∞)}
LRF(t1 t2) = {(t0,x,t2), (t0,y,t2), (t2,x,t∞), (t2,y,t∞)}
LRF(t2 t1) = {(t0,x,t2), (t0,y,t2), (t2,x,t∞), (t2,y,t∞)}
Observation: (Herbrand) semantics of all read steps matters!

View Serializability
Definition 3.9 (View Equivalence): Schedules s and s’ are view equivalent, denoted s ≈v s’, if the following hold:
(i) op(s) = op(s’)
(ii) H[s] = H[s’]
(iii) H_s[p] = H_s’[p] for all (read or write) steps p
Definition 3.10 (View Serializability): A schedule s is view serializable if there exists a serial schedule s’ s.t. s ≈v s’. VSR denotes the class of all view-serializable histories.
Theorem 3.2: For schedules s and s’ the following statements hold.
(i) s ≈v s’ iff op(s) = op(s’) and RF(s) = RF(s’)
(ii) s ≈v s’ iff D(s) = D(s’)
Corollary 3.2: View equivalence of two schedules s and s’ can be decided in time that is polynomial in the length of the two schedules.

Inconsistent Read Reconsidered
Inconsistent read anomaly: I = r2(x) w2(x) r1(x) r1(y) r2(y) w2(y) c1 c2 ⇒ history is not VSR!
RF(I) = {(t0,x,t2), (t2,x,t1), (t0,y,t1), (t0,y,t2), (t2,x,t∞), (t2,y,t∞)}
RF(t1 t2) = {(t0,x,t1), (t0,y,t1), (t0,x,t2), (t0,y,t2), (t2,x,t∞), (t2,y,t∞)}
RF(t2 t1) = {(t0,x,t2), (t0,y,t2), (t2,x,t1), (t2,y,t1), (t2,x,t∞), (t2,y,t∞)}
Observation: VSR properly captures our intuition
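Theorem 3.2 (i) suggests a brute-force VSR test for small schedules: compare RF(s) with the RF of every serial order. The sketch below assumes that each transaction reads a given item at most once, so that RF as a set of triples identifies each read; the helper names are illustrative, and the search is exponential in the number of transactions.

```python
import re
from itertools import permutations

def parse(schedule):
    """Accepts both r1(x) and r1[x]; commit steps are ignored."""
    return re.findall(r"([rw])(\d+)[\[(](\w+)[\])]", schedule)

def reads_from(steps):
    """RF of the schedule extended with the initial t0 and the final t∞."""
    items = sorted({x for _, _, x in steps})
    last_writer = {x: "0" for x in items}          # t0 writes every item first
    rf = set()
    for action, txn, item in steps + [("r", "∞", x) for x in items]:
        if action == "w":
            last_writer[item] = txn
        else:
            rf.add((last_writer[item], item, txn))
    return rf

def vsr_serial_orders(schedule):
    """All serial orders that are view equivalent to the schedule."""
    steps = parse(schedule)
    txns = sorted({t for _, t, _ in steps})
    target = reads_from(steps)
    return [order for order in permutations(txns)
            if reads_from([st for t in order for st in steps if st[1] == t]) == target]

print(vsr_serial_orders("r2(x) w2(x) r1(x) r1(y) r2(y) w2(y) c1 c2"))   # []
```

An empty result means no serial order has the same reads-from relation, which is the slide's conclusion for the inconsistent read anomaly.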

Relationship Between VSR and FSR
Theorem 3.3: VSR ⊂ FSR.
Theorem 3.4: Let s be a history without dead steps. Then s ∈ VSR iff s ∈ FSR.

On the Complexity of Testing VSR
Theorem 3.5: The problem of deciding for a given schedule s whether s ∈ VSR holds is NP-complete.

Properties of VSR
Definition 3.11 (Monotone Classes of Histories): Let s be a schedule and T ⊆ trans(s). Π_T(s) denotes the projection of s onto T. A class E of histories is called monotone if the following holds: if s is in E, then Π_T(s) is in E for each T ⊆ trans(s).
VSR is not monotone.
Example:
s = w1(x) w2(x) w2(y) c2 w1(y) c1 w3(x) w3(y) c3 ∈ VSR
Π_{t1, t2}(s) = w1(x) w2(x) w2(y) c2 w1(y) c1 ∉ VSR


Conflict Serializability
Definition 3.12 (Conflicts and Conflict Relations): Let s be a schedule, t, t’ ∈ trans(s), t ≠ t’.
(i) Two data operations p ∈ t and q ∈ t’ are in conflict in s if they access the same data item and at least one of them is a write.
(ii) conf(s) := {(p, q) | p, q are in conflict and p <_s q} is the conflict relation of s.
Definition 3.13 (Conflict Equivalence): Schedules s and s’ are conflict equivalent, denoted s ≈c s’, if op(s) = op(s’) and conf(s) = conf(s’).
Definition 3.14 (Conflict Serializability): Schedule s is conflict serializable if there is a serial schedule s’ s.t. s ≈c s’. CSR denotes the class of all conflict serializable schedules.
Example a: r1(x) r2(x) r1(z) w1(x) w2(y) r3(z) w3(y) c1 c2 w3(z) c3 ∈ CSR
Example b: r2(x) w2(x) r1(x) r1(y) r2(y) w2(y) c1 c2 ∉ CSR

Properties of CSR
Theorem 3.8: CSR ⊂ VSR
Example: s = w1(x) w2(x) w2(y) c2 w1(y) c1 w3(x) w3(y) c3
s ∈ VSR, but s ∉ CSR.
Theorem 3.9:
(i) CSR is monotone.
(ii) s ∈ CSR ⇔ Π_T(s) ∈ VSR for all T ⊆ trans(s) (i.e., CSR is the largest monotone subset of VSR).

Activity
What is a directed graph? Think of ways to associate a graph with a schedule!

Conflict Graph
Definition 3.15 (Conflict Graph): Let s be a schedule. The conflict graph G(s) = (V, E) is a directed graph with vertices V := commit(s) and edges E := {(t, t’) | t ≠ t’ and there are steps p ∈ t, q ∈ t’ with (p, q) ∈ conf(s)}.
Theorem 3.10: Let s be a schedule. Then s ∈ CSR iff G(s) is acyclic.
Corollary 3.4: Testing if a schedule is in CSR can be done in time polynomial in the number of transactions of the schedule.
Example 3.12: s = r1(y) r3(w) r2(y) w1(y) w1(x) w2(x) w2(z) w3(x) c1 c3 c2
[Figure: conflict graph G(s) over t1, t2, t3]
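Theorem 3.10 turns the CSR test into graph construction plus a cycle check. The following sketch builds G(s) over the committed transactions and tests acyclicity by repeatedly removing nodes without incoming edges; the string encoding and the names `conflict_graph`, `is_acyclic`, and `is_csr` are assumptions for illustration.

```python
import re

def conflict_graph(schedule):
    """Return (nodes, edges) of G(s); nodes are the committed transactions."""
    steps = [(a, int(t), x) for a, t, x in
             re.findall(r"([rwc])(\d+)(?:\((\w+)\))?", schedule)]
    nodes = {t for a, t, _ in steps if a == "c"}
    data = [st for st in steps if st[0] != "c" and st[1] in nodes]
    edges = set()
    for i, (a1, t1, x1) in enumerate(data):
        for a2, t2, x2 in data[i + 1:]:
            # conflict: different transactions, same item, at least one write
            if t1 != t2 and x1 == x2 and "w" in (a1, a2):
                edges.add((t1, t2))
    return nodes, edges

def is_acyclic(nodes, edges):
    indegree = {n: 0 for n in nodes}
    for _, v in edges:
        indegree[v] += 1
    ready = [n for n in nodes if indegree[n] == 0]
    removed = 0
    while ready:
        n = ready.pop()
        removed += 1
        for u, v in edges:
            if u == n:
                indegree[v] -= 1
                if indegree[v] == 0:
                    ready.append(v)
    return removed == len(nodes)       # leftover nodes lie on a cycle

def is_csr(schedule):
    return is_acyclic(*conflict_graph(schedule))

print(is_csr("r1(y) r3(w) r2(y) w1(y) w1(x) w2(x) w2(z) w3(x) c1 c3 c2"))  # False
```

For Example 3.12 the computed edges include both (t2, t1) and (t1, t2), so the graph is cyclic and the schedule is not in CSR.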

Activity
What is a characterization (in a mathematical sense)? How do you prove a necessary and sufficient condition? What needs to be shown for the serializability theorem?

Proof of the Conflict-Graph Theorem
(i) Let s be a schedule in CSR. So there is a serial schedule s’ with conf(s) = conf(s’). Now assume that G(s) has a cycle t_1 → t_2 → ... → t_k → t_1. This implies that there are pairs (p_1, q_2), (p_2, q_3), ..., (p_k, q_1) with p_i ∈ t_i, q_i ∈ t_i, p_i <_s q_(i+1), and p_i in conflict with q_(i+1). Because s’ ≈c s, it also implies that p_i <_s’ q_(i+1). Because s’ is serial, we obtain t_i <_s’ t_(i+1) for i = 1, ..., k-1, and t_k <_s’ t_1. By transitivity we infer t_1 <_s’ t_2 and t_2 <_s’ t_1, which is impossible. This contradiction shows that the initial assumption is wrong. So G(s) is acyclic.
(ii) Let G(s) be acyclic. So it must have at least one source node. The following topological sort produces a total order < of transactions:
a) start with a source node (i.e., a node without incoming edges),
b) remove this node and all its outgoing edges,
c) iterate a) and b) until all nodes have been added to the sorted list.
The total transaction order < preserves the edges in G(s); therefore it yields a serial schedule s’ for which s’ ≈c s.
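Steps a) to c) of part (ii) can be sketched as follows. The graph is passed in as a set of nodes and a set of edges; the function name `serial_order` and the tie-breaking rule (always take the smallest source) are choices made here for a deterministic illustration.

```python
def serial_order(nodes, edges):
    """Topological sort by repeated source removal; returns None if the graph has a cycle."""
    remaining, edges, order = set(nodes), set(edges), []
    while remaining:
        # a) find the source nodes, i.e. nodes without incoming edges
        sources = sorted(n for n in remaining if not any(v == n for _, v in edges))
        if not sources:
            return None                 # every remaining node lies on or behind a cycle
        node = sources[0]
        order.append(node)
        # b) remove the node and all its outgoing edges
        remaining.remove(node)
        edges = {(u, v) for u, v in edges if u != node}
        # c) the loop repeats a) and b) until no node is left
    return order

# Conflict graph of Example a: t2 -> t1, t2 -> t3, t1 -> t3
print(serial_order({1, 2, 3}, {(2, 1), (2, 3), (1, 3)}))   # [2, 1, 3]
```

The returned list is the order of transactions in a conflict-equivalent serial schedule, here t2 t1 t3.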

Commutativity and Ordering Rules
Commutativity rules:
C1: r_i(x) r_j(y) ~ r_j(y) r_i(x) if i ≠ j
C2: r_i(x) w_j(y) ~ w_j(y) r_i(x) if i ≠ j and x ≠ y
C3: w_i(x) w_j(y) ~ w_j(y) w_i(x) if i ≠ j and x ≠ y
Ordering rule:
C4: o_i(x), p_j(y) unordered ~> o_i(x) p_j(y) if x ≠ y or both o and p are reads
Example for transformations of schedules:
s = w1(x) r2(x) w1(y) w1(z) r3(z) w2(y) w3(y) w3(z)
~>[C2] w1(x) w1(y) r2(x) w1(z) w2(y) r3(z) w3(y) w3(z)
~>[C2] w1(x) w1(y) w1(z) r2(x) w2(y) r3(z) w3(y) w3(z)
= t1 t2 t3

Commutativity-based Reducibility
Definition 3.16 (Commutativity Based Equivalence): Schedules s and s’ s.t. op(s) = op(s’) are commutativity based equivalent, denoted s ~* s’, if s can be transformed into s’ by applying rules C1, C2, C3, C4 finitely many times.
Theorem 3.11: Let s and s’ be schedules s.t. op(s) = op(s’). Then s ≈c s’ iff s ~* s’.
Definition 3.17 (Commutativity Based Reducibility): Schedule s is commutativity-based reducible if there is a serial schedule s’ s.t. s ~* s’.
Corollary 3.5: Schedule s is commutativity-based reducible iff s ∈ CSR.

Order Preserving Conflict Serializability
Definition 3.18 (Order Preservation): Schedule $s$ is order preserving conflict serializable if it is conflict equivalent to a serial schedule $s'$ and for all $t, t' \in trans(s)$: if $t$ completely precedes $t'$ in $s$, then the same holds in $s'$. OCSR denotes the class of all schedules with this property.
Theorem 3.12: OCSR $\subset$ CSR.
Example 3.13: $s = w_1(x)\, r_2(x)\, c_2\, w_3(y)\, c_3\, w_1(y)\, c_1$, with $s \in$ CSR and $s \notin$ OCSR.
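One way to test order preservation, sketched under the assumption that a schedule is in OCSR exactly when its conflict graph stays acyclic after adding an edge $t \to t'$ for every pair where $t$ completely precedes $t'$. The tuple encoding and all function names are hypothetical.

```python
import re

def parse(text):
    """Parse 'w1(x) r2(x) c2' into (action, txn, item); item is None for commits."""
    return [(a, int(t), x or None)
            for a, t, x in re.findall(r"([rwc])(\d+)(?:\((\w+)\))?", text)]

def conflict_edges(s):
    """Edges t -> u for conflicting steps of t and u, in schedule order."""
    edges = set()
    for i, (a, t, x) in enumerate(s):
        for b, u, y in s[i + 1:]:
            if t != u and x is not None and x == y and "w" in (a, b):
                edges.add((t, u))
    return edges

def precedence_edges(s):
    """Edges t -> u whenever the last step of t comes before the first step of u."""
    first, last = {}, {}
    for pos, (_, t, _) in enumerate(s):
        first.setdefault(t, pos)
        last[t] = pos
    return {(t, u) for t in last for u in first if t != u and last[t] < first[u]}

def acyclic(nodes, edges):
    """Repeatedly strip source nodes; failure to find one means a cycle."""
    remaining = set(nodes)
    while remaining:
        sources = {n for n in remaining
                   if not any(v == n and u in remaining for u, v in edges)}
        if not sources:
            return False
        remaining -= sources
    return True

def is_csr(s):
    return acyclic({t for _, t, _ in s}, conflict_edges(s))

def is_ocsr(s):
    return acyclic({t for _, t, _ in s}, conflict_edges(s) | precedence_edges(s))

s = parse("w1(x) r2(x) c2 w3(y) c3 w1(y) c1")   # Example 3.13
print(is_csr(s), is_ocsr(s))                     # True False
```

For Example 3.13 the conflict edges are $t_1 \to t_2$ and $t_3 \to t_1$, while $t_2$ completely precedes $t_3$; together these form a cycle, which is why the schedule is in CSR but not in OCSR.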

Commit-order Preserving Conflict Serializability
Definition 3.19 (Commit Order Preservation): Schedule $s$ is commit order preserving conflict serializable if for all $t_i, t_j \in trans(s)$: if there are $p \in t_i$, $q \in t_j$ with $(p, q) \in \mathit{conf}(s)$, then $c_i <_s c_j$. COCSR denotes the class of all schedules with this property.
Theorem 3.13: COCSR $\subset$ CSR.
Theorem 3.14: Schedule $s$ is in COCSR iff there is a serial schedule $s'$ s.t. $s \approx_c s'$ and for all $t_i, t_j \in trans(s)$: $t_i <_{s'} t_j \Leftrightarrow c_i <_s c_j$.
Theorem 3.15: COCSR $\subset$ OCSR.
Example: $s = w_3(y)\, c_3\, w_1(x)\, r_2(x)\, c_2\, w_1(y)\, c_1$, with $s \in$ OCSR and $s \notin$ COCSR.
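Definition 3.19 translates almost directly into a check over all conflicting pairs. The sketch below assumes a complete history in which every transaction commits; the tuple encoding and the name `is_cocsr` are hypothetical.

```python
import re

def parse(text):
    """Parse 'w1(x) r2(x) c2' into (action, txn, item); item is None for commits."""
    return [(a, int(t), x or None)
            for a, t, x in re.findall(r"([rwc])(\d+)(?:\((\w+)\))?", text)]

def is_cocsr(s):
    """Check that every conflict pair (p, q) with p in t_i and q in t_j
    satisfies c_i < c_j. Assumes every transaction in s commits."""
    commit_pos = {t: pos for pos, (a, t, _) in enumerate(s) if a == "c"}
    for i, (a, t, x) in enumerate(s):
        for b, u, y in s[i + 1:]:
            in_conflict = t != u and x is not None and x == y and "w" in (a, b)
            if in_conflict and not commit_pos[t] < commit_pos[u]:
                return False
    return True

s = parse("w3(y) c3 w1(x) r2(x) c2 w1(y) c1")   # example from the slide
print(is_cocsr(s))                               # False
print(is_cocsr(parse("w1(x) r2(x) c1 c2")))      # True
```

In the slide's example the conflict between $w_1(x)$ and $r_2(x)$ orders $t_1$ before $t_2$, yet $c_2$ precedes $c_1$, so the commit order contradicts the conflict order.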

Chapter 3: Concurrency Control – Notions of Correctness for the Page Model
– 3.2 Canonical Synchronization Problems
– 3.3 Syntax of Histories and Schedules
– 3.4 Correctness of Histories and Schedules
– 3.5 Herbrand Semantics of Schedules
– 3.6 Final-State Serializability
– 3.7 View Serializability
– 3.8 Conflict Serializability
– 3.9 Commit Serializability
– 3.10 An Alternative Criterion: Interleaving Specifications
– 3.11 Lessons Learned

Commit Serializability
Definition 3.20 (Closure Properties of Schedule Classes): Let $E$ be a class of schedules. For schedule $s$ let $CP(s)$ denote the projection $\pi_{\mathit{commit}(s)}(s)$.
– $E$ is prefix-closed if the following holds: $s \in E \Rightarrow p \in E$ for each prefix $p$ of $s$.
– $E$ is commit-closed if the following holds: $s \in E \Rightarrow CP(s) \in E$.
Theorem 3.16: CSR is prefix-commit-closed, i.e., prefix-closed and commit-closed.
Definition 3.21 (Commit Serializability): Schedule $s$ is commit-$X$-serializable if $CP(p)$ is $X$-serializable for each prefix $p$ of $s$, where $X$ can be FSR, VSR, or CSR. The resulting classes of commit-$X$-serializable schedules are denoted CMFSR, CMVSR, and CMCSR.
Theorem 3.17:
(i) CMFSR, CMVSR, CMCSR are prefix-commit-closed.
(ii) CMCSR $\subset$ CMVSR $\subset$ CMFSR
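As an illustration of Definition 3.21 for the CSR case, the following sketch computes the commit projection of every prefix and tests each one for conflict serializability. The tuple encoding and the names `commit_projection`, `is_csr`, and `is_cmcsr` are assumptions made for this example.

```python
import re

def parse(text):
    """Parse 'w1(x) r2(x) c2' into (action, txn, item); item is None for commits."""
    return [(a, int(t), x or None)
            for a, t, x in re.findall(r"([rwc])(\d+)(?:\((\w+)\))?", text)]

def commit_projection(s):
    """CP(s): keep only the steps of transactions that commit within s."""
    committed = {t for a, t, _ in s if a == "c"}
    return [op for op in s if op[1] in committed]

def is_csr(s):
    """Conflict serializability via acyclicity of the conflict graph."""
    edges = {(t, u)
             for i, (a, t, x) in enumerate(s)
             for b, u, y in s[i + 1:]
             if t != u and x is not None and x == y and "w" in (a, b)}
    remaining = {t for _, t, _ in s}
    while remaining:
        sources = {n for n in remaining
                   if not any(v == n and u in remaining for u, v in edges)}
        if not sources:
            return False
        remaining -= sources
    return True

def is_cmcsr(s):
    """s is commit-CSR-serializable if CP(p) is in CSR for every prefix p."""
    return all(is_csr(commit_projection(s[:n])) for n in range(len(s) + 1))

print(is_cmcsr(parse("r1(x) r2(x) w1(x) w2(x) c1")))      # True
print(is_cmcsr(parse("r1(x) r2(x) w1(x) w2(x) c1 c2")))   # False
```

The first schedule passes because $t_2$ has not committed, so its steps are dropped from every commit projection; once $c_2$ is appended, the full lost-update pattern enters the projection and the check fails.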

Landscape of History Classes [figure not reproduced]

Interleaving Specifications: Motivation
Example: all transactions known in advance
– transfer transactions on checking accounts a and b and savings account c:
  $t_1 = r_1(a)\, w_1(a)\, r_1(c)\, w_1(c)$
  $t_2 = r_2(b)\, w_2(b)\, r_2(c)\, w_2(c)$
– balance transaction: $t_3 = r_3(a)\, r_3(b)\, r_3(c)$
– audit transaction: $t_4 = r_4(a)\, r_4(b)\, r_4(c)\, w_4(z)$
Possible schedules:
Application-tolerable interleavings:
– $r_1(a)\, w_1(a)\, r_2(b)\, w_2(b)\, r_2(c)\, w_2(c)\, r_1(c)\, w_1(c)$ ($\in$ CSR)
– $r_1(a)\, w_1(a)\, r_3(a)\, r_3(b)\, r_3(c)\, r_1(c)\, w_1(c)$ ($\notin$ CSR)
Non-admissible interleavings:
– $r_1(a)\, w_1(a)\, r_2(b)\, w_2(b)\, r_1(c)\, r_2(c)\, w_2(c)\, w_1(c)$
– $r_1(a)\, w_1(a)\, r_4(a)\, r_4(b)\, r_4(c)\, w_4(z)\, r_1(c)\, w_1(c)$
Observations:
– the application may tolerate non-CSR schedules
– a priori knowledge of all transactions is impractical

Indivisible Units
Definition 3.22 (Indivisible Units): Let $T = \{t_1, \dots, t_n\}$ be a set of transactions. For $t_i, t_j \in T$, $t_i \neq t_j$, an indivisible unit of $t_i$ relative to $t_j$ is a sequence of consecutive steps of $t_i$ s.t. no operations of $t_j$ are allowed to interleave with this sequence. $IU(t_i, t_j)$ denotes the ordered sequence of indivisible units of $t_i$ relative to $t_j$. $IU_k(t_i, t_j)$ denotes the $k$-th element of $IU(t_i, t_j)$.
Example 3.14:
$t_1 = r_1(x)\, w_1(x)\, w_1(z)\, r_1(y)$
$t_2 = r_2(y)\, w_2(y)\, r_2(x)$
$t_3 = w_3(x)\, w_3(y)\, w_3(z)$
$IU(t_1, t_2)$, $IU(t_1, t_3)$, $IU(t_2, t_1)$, $IU(t_2, t_3)$, $IU(t_3, t_1)$, $IU(t_3, t_2)$: [unit boundaries are marked graphically on the slide and are not reproduced]
Example 3.15:
$s_1 = r_2(y)\, r_1(x)\, w_1(x)\, w_2(y)\, r_2(x)\, w_1(z)\, w_3(x)\, w_3(y)\, r_1(y)\, w_3(z)$ respects all IUs.
$s_2 = r_1(x)\, r_2(y)\, w_2(y)\, w_1(x)\, r_2(x)\, w_1(z)\, r_1(y)$ violates $IU_1(t_1, t_2)$ and $IU_2(t_2, t_1)$ but is conflict equivalent to an allowed schedule.

Relatively Serializable Schedules
Definition 3.23 (Dependence of Steps): Step $q$ directly depends on step $p$ in schedule $s$, denoted $p \rightsquigarrow q$, if $p <_s q$ and either $p, q$ belong to the same transaction $t$ and $p <_t q$, or $p$ and $q$ are in conflict. $\rightsquigarrow^*$ denotes the reflexive and transitive closure of $\rightsquigarrow$.
Example 3.16: $s_3 = r_1(x)\, r_2(y)\, w_1(x)\, w_2(y)\, w_3(x)\, w_1(z)\, w_3(y)\, r_2(x)\, r_1(y)\, w_3(z)$
Definition 3.24 (Relatively Serial Schedule): $s$ is relatively serial if for all transactions $t_i, t_j$: if $q \in t_j$ is interleaved with some $IU_k(t_i, t_j)$, then there is no operation $p \in IU_k(t_i, t_j)$ s.t. $p \rightsquigarrow^* q$ or $q \rightsquigarrow^* p$.
Example 3.17: $s_4 = r_1(x)\, r_2(y)\, w_2(y)\, w_1(x)\, w_3(x)\, r_2(x)\, w_1(z)\, w_3(y)\, r_1(y)\, w_3(z)$
Definition 3.25 (Relatively Serializable Schedule): $s$ is relatively serializable if it is conflict equivalent to a relatively serial schedule.
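The dependence relation of Definition 3.23 can be computed mechanically. The sketch below builds the direct dependences of a schedule and closes them reflexively and transitively with a Warshall-style loop; the tuple encoding and function names are hypothetical, and the schedule is assumed to list each transaction's steps in transaction order.

```python
import re

def parse(text):
    """Parse 'w1(x) r2(x)' into (action, txn, item) tuples."""
    return [(a, int(t), x) for a, t, x in re.findall(r"([rw])(\d+)\((\w+)\)", text)]

def directly_depends(p, q):
    """p ~> q, for steps where p occurs before q in the schedule."""
    same_txn = p[1] == q[1]
    conflict = p[1] != q[1] and p[2] == q[2] and "w" in (p[0], q[0])
    return same_txn or conflict

def dependence_closure(s):
    """Return the set of pairs (p, q) with p ~>* q."""
    n = len(s)
    reach = [[i == j or (i < j and directly_depends(s[i], s[j]))
              for j in range(n)] for i in range(n)]
    for k in range(n):                 # Warshall-style transitive closure
        for i in range(n):
            if reach[i][k]:
                for j in range(n):
                    if reach[k][j]:
                        reach[i][j] = True
    return {(s[i], s[j]) for i in range(n) for j in range(n) if reach[i][j]}

# Schedule s_3 from Example 3.16.
s3 = parse("r1(x) r2(y) w1(x) w2(y) w3(x) w1(z) w3(y) r2(x) r1(y) w3(z)")
dep = dependence_closure(s3)
print((("r", 1, "x"), ("r", 2, "x")) in dep)   # True: via w1(x)
print((("r", 2, "y"), ("w", 1, "x")) in dep)   # False
```

Identifying steps by their tuples works here because no step occurs twice in $s_3$; a schedule with repeated identical steps would need positions as identifiers instead.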

Relative Serialization Graph
Definition 3.26 (Push Forward and Pull Backward): Let $IU_k(t_i, t_j)$ be an IU of $t_i$ relative to $t_j$. For an operation $p_i \in IU_k(t_i, t_j)$ let
(i) $F(p_i, t_j)$ be the last operation in $IU_k(t_i, t_j)$ and
(ii) $B(p_i, t_j)$ be the first operation in $IU_k(t_i, t_j)$.
Definition 3.27 (Relative Serialization Graph): The relative serialization graph $RSG(s) = (V, E)$ of schedule $s$ is a graph with vertices $V := op(s)$ and edge set $E \subseteq V \times V$ containing four types of edges:
(i) for consecutive operations $p, q$ of the same transaction, $(p, q) \in E$ (I-edge)
(ii) if $p \rightsquigarrow^* q$ for $p \in t_i$, $q \in t_j$, $t_i \neq t_j$, then $(p, q) \in E$ (D-edge)
(iii) if $(p, q)$ is a D-edge with $p \in t_i$, $q \in t_j$, then $(F(p, t_j), q) \in E$ (F-edge)
(iv) if $(p, q)$ is a D-edge with $p \in t_i$, $q \in t_j$, then $(p, B(q, t_i)) \in E$ (B-edge)
Theorem 3.18: A schedule $s$ is relatively serializable iff $RSG(s)$ is acyclic.

RSG Example
Example 3.19:
$t_1 = w_1(x)\, r_1(z)$
$t_2 = r_2(x)\, w_2(y)$
$t_3 = r_3(z)\, r_3(y)$
$IU(t_1, t_2)$, $IU(t_1, t_3)$, $IU(t_2, t_1)$, $IU(t_2, t_3)$, $IU(t_3, t_1)$, $IU(t_3, t_2)$: [unit boundaries are marked graphically on the slide and are not reproduced]
$s_5 = w_1(x)\, r_2(x)\, r_3(z)\, w_2(y)\, r_3(y)\, r_1(z)$
$RSG(s_5)$: [figure not reproduced: graph over the vertices $w_1(x)$, $r_2(x)$, $r_3(z)$, $r_1(z)$, $w_2(y)$, $r_3(y)$ with edges labeled I, D, F, and B]

Lessons Learned
– Equivalence to a serial history is a natural correctness criterion
– CSR, albeit less general than VSR, is most appropriate for
  – complexity reasons
  – its monotonicity property
  – its generalizability to semantically rich operations
– OCSR and COCSR have additional beneficial properties

Histories & Schedules
How to represent a history H (informally):
– Capture the database operations of transactions
– Order of the execution of operations within transactions
– Order of operations across transactions
– How the transaction terminated:
  – Successfully: COMMIT
  – Unsuccessfully: ABORT
Histories: after the execution
Schedules: evolving execution