CS 582 / CMPE 481 Distributed Systems Concurrency Control.

1 CS 582 / CMPE 481 Distributed Systems Concurrency Control

2 Class Overview
– Transactions
– Why Concurrency Control
– Concurrency Control Protocols: pessimistic, optimistic, time-based
– Deadlocks

3 Transactions
Definition
– a sequence of one or more operations on one or more resources that is:
  atomic: all or nothing
  consistent: takes the system from one consistent state to another
  isolated: intermediate states are invisible to others (serializable)
  durable: once completed (committed), changes are permanent
Primitives
– BeginTransaction: start a transaction and get an ID
– EndTransaction: commit (make all writes durable) or abort (discard all changes made by writes)
– AbortTransaction
– Read, Write, ...
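A minimal sketch of how these primitives might look as a client API (all names are illustrative, not from the slides): atomicity comes from buffering writes privately and applying them only at commit, so an abort simply discards the buffer.

```python
class Store:
    """Toy key-value store offering the transaction primitives above."""
    def __init__(self):
        self.data = {}      # durable state
        self.next_id = 0

    def begin_transaction(self):
        # start a transaction and hand back an ID plus a private write buffer
        self.next_id += 1
        return {"id": self.next_id, "writes": {}}

    def read(self, tx, key):
        # a transaction sees its own tentative writes first
        return tx["writes"].get(key, self.data.get(key))

    def write(self, tx, key, value):
        tx["writes"][key] = value           # buffered, not yet durable

    def end_transaction(self, tx, commit=True):
        if commit:
            self.data.update(tx["writes"])  # all writes become visible at once
        tx["writes"].clear()                # abort: every change is discarded

s = Store()
t = s.begin_transaction()
s.write(t, "x", 1)
s.end_transaction(t, commit=True)    # commit: x = 1 is now durable
u = s.begin_transaction()
s.write(u, "x", 99)
s.end_transaction(u, commit=False)   # abort: x stays 1
```

Note this sketch captures atomicity and durability of the buffer only; isolation under concurrency is exactly what the protocols in the following slides add.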

4 Why Concurrency Control?
– Increase efficiency by allowing several transactions to execute at the same time
– Concurrent access to a shared resource may cause inconsistency of the resource
– inconsistency examples:
  lost updates: two transactions concurrently perform an update operation
  inconsistent retrievals: performing a retrieval operation before or during an update operation

5 Concurrency Control
Basic principle
– to avoid problems due to concurrent access, the effect of the operations of related transactions must be as if the transactions were executed in some serial (one-at-a-time) order
(figure: layered managers)

6 Serializability
– a schedule is serial if the steps of each transaction occur consecutively
– a schedule is serializable if its effect is "equivalent" to some serial schedule

T1: BEGIN_TRANSACTION  x = 0;  x = x + 1;  END_TRANSACTION
T2: BEGIN_TRANSACTION  x = 0;  x = x + 2;  END_TRANSACTION
T3: BEGIN_TRANSACTION  x = 0;  x = x + 3;  END_TRANSACTION

Schedule 1: x = 0; x = x + 1; x = 0; x = x + 2; x = 0; x = x + 3;  (legal: serial)
Schedule 2: x = 0; x = 0; x = x + 1; x = x + 2; x = 0; x = x + 3;  (legal: serializable)
Schedule 3: x = 0; x = 0; x = x + 1; x = 0; x = x + 2; x = x + 3;  (illegal)
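The three schedules can be checked mechanically. A sketch that models each transaction as its list of atomic steps and compares an interleaving's final value of x against all six serial orders:

```python
# each transaction is a list of atomic steps acting on a shared environment
T1 = [lambda e: e.update(x=0), lambda e: e.update(x=e["x"] + 1)]
T2 = [lambda e: e.update(x=0), lambda e: e.update(x=e["x"] + 2)]
T3 = [lambda e: e.update(x=0), lambda e: e.update(x=e["x"] + 3)]

def run(steps):
    env = {}
    for step in steps:
        step(env)
    return env["x"]

# every serial order ends with some Tk doing x = 0; x = x + k, so x ends as k
serial_results = {run(T1 + T2 + T3), run(T1 + T3 + T2), run(T2 + T1 + T3),
                  run(T2 + T3 + T1), run(T3 + T1 + T2), run(T3 + T2 + T1)}

# Schedule 2 interleaves T1 and T2 but its effect matches a serial run
schedule2 = [T1[0], T2[0], T1[1], T2[1], T3[0], T3[1]]
# Schedule 3 ends with x = 5, a value no serial order can produce
schedule3 = [T1[0], T2[0], T1[1], T3[0], T2[1], T3[1]]
print(run(schedule2) in serial_results)   # True
print(run(schedule3) in serial_results)   # False
```

Comparing final states against all serial orders only works for toy cases; the conflict-based test on the next slides scales to real schedules.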

7 Serializability (cont)
– a transaction can be modeled as a log of read and write operations; we are not concerned with the computations of each transaction
– two operations OPER(Ti, x) and OPER(Tj, x) on the same data item x, coming from different logs, may conflict at a data manager:
  read-write conflict (rw): one is a read operation while the other is a write operation on x
  write-write conflict (ww): both are write operations on x

8 Basic Scheduling Theorem
– Let T = {T1, ..., Tn} be a set of transactions and let E be an execution of these transactions modeled by logs {L1, ..., Ln}. E is serializable if there exists a total ordering of T such that for each pair of conflicting operations Oi and Oj from distinct transactions Ti and Tj (respectively), Oi precedes Oj in any log L1, ..., Ln if and only if Ti precedes Tj in the total ordering.
– For concurrency control: process conflicting reads and writes in certain relative orders.
– Read-write and write-write conflicts can be synchronized independently, as long as we stick to a total ordering of transactions that is consistent with both types of conflicts.
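The theorem suggests a concrete test, not given on the slide: build a precedence graph with an edge Ti → Tj whenever some operation of Ti conflicts with and precedes one of Tj, then check that the graph is acyclic (an acyclic graph admits the required total ordering). A sketch with illustrative names:

```python
# each schedule entry is (transaction, op, item); op is "r" or "w"
def conflicts(a, b):
    # rw or ww conflict: same item, different transactions, at least one write
    return a[2] == b[2] and a[0] != b[0] and "w" in (a[1], b[1])

def precedence_edges(schedule):
    # edge Ti -> Tj when some Ti operation conflicts with, and precedes, a Tj operation
    edges = set()
    for i, a in enumerate(schedule):
        for b in schedule[i + 1:]:
            if conflicts(a, b):
                edges.add((a[0], b[0]))
    return edges

def serializable(schedule):
    # the required total ordering exists iff the precedence graph is acyclic;
    # peel off nodes without incoming edges until none remain (or a cycle blocks us)
    edges = precedence_edges(schedule)
    remaining = {t for t, _, _ in schedule}
    while remaining:
        source = next((n for n in remaining
                       if not any(dst == n and src in remaining
                                  for src, dst in edges)), None)
        if source is None:
            return False          # cycle: no consistent serial order exists
        remaining.remove(source)
    return True

lost_update = [("T1", "r", "x"), ("T2", "r", "x"), ("T1", "w", "x"), ("T2", "w", "x")]
serial_like = [("T1", "r", "x"), ("T1", "w", "x"), ("T2", "r", "x"), ("T2", "w", "x")]
print(serializable(lost_update))   # False: the classic lost-update interleaving
print(serializable(serial_like))   # True
```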

9 Concurrency Control Protocols
– Two-phase locking: before reading or writing a data item, a lock must be obtained. After a lock is given up, the transaction is not allowed to acquire any more locks.
– Timestamp ordering: operations in a transaction are timestamped, and data managers are forced to handle operations in timestamp order.
– Optimistic control: don't prevent things from going wrong, but correct the situation if conflicts actually did happen.
  basic assumption: you can pull it off in most cases

10 Two-Phase Locking
– Clients do only READ and WRITE operations within transactions
– Locks are granted and released only by the scheduler
  read locks vs. write locks
– The locking policy avoids conflicts between operations, yielding serializable schedules
– Two phases: locks are acquired in a growing phase and given up in a shrinking phase
  data items are modified only after the lock point, the moment when all locks have been acquired

11 Two-Phase Locking (cont)
Rule 1
– When a client submits OPER(Ti, x), the scheduler tests whether it conflicts with an operation OPER(Tj, x) from some other client.
– If there is no conflict, grant LOCK(Ti, x); otherwise delay execution of OPER(Ti, x).
– Conflicting operations are executed in the same order as their locks are granted.
Rule 2
– If LOCK(Ti, x) has been granted, do not release the lock until OPER(Ti, x) has been executed by the data manager.
– Guarantees the order LOCK → OPER → RELEASE.
Rule 3
– Once RELEASE(Ti, x) has taken place, no more locks for Ti may be granted.
– Combined with rule 1, this guarantees that all pairs of conflicting operations of two transactions are done in the same order.
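A minimal sketch of a scheduler's lock table enforcing these rules (class and method names are illustrative, not from the slides); rule 2 is left to the caller, who must release only after the operation has been executed:

```python
class TwoPhaseLockScheduler:
    """Toy lock table enforcing 2PL rules 1 and 3 for exclusive locks."""
    def __init__(self):
        self.locks = {}        # item -> transaction currently holding it
        self.released = set()  # transactions past their shrinking phase

    def lock(self, tx, item):
        if tx in self.released:
            # rule 3: once a transaction has released any lock,
            # it may not acquire new ones
            raise RuntimeError("rule 3: no locks after a release")
        holder = self.locks.get(item)
        if holder is not None and holder != tx:
            return False       # rule 1: conflicting operation is delayed
        self.locks[item] = tx
        return True

    def release(self, tx, item):
        # rule 2 is the caller's duty: release only after OPER was executed
        assert self.locks.get(item) == tx
        del self.locks[item]
        self.released.add(tx)

sched = TwoPhaseLockScheduler()
assert sched.lock("T1", "x")        # granted
assert not sched.lock("T2", "x")    # conflict on x: T2 must wait
sched.release("T1", "x")
assert sched.lock("T2", "x")        # granted in the same order as lock grants
```

A real scheduler would queue the delayed request and add shared (read) locks; this sketch only returns False so the rules stay visible.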

12 Two-Phase Locking – Problems
– The system can enter a deadlock.
  practical solution: put a timeout on locks and abort the transaction on expiration
– When should the scheduler actually release a lock?
  when the operation has been executed, or when it knows that no more locks will be requested
  there is no good way of testing the latter unless the transaction has been committed or aborted
– Cascaded aborts: assume the execution sequence RELEASE(Ti, x) → LOCK(Tj, x) → ABORT(Ti) takes place. Consequence: the scheduler will have to abort Tj as well.
– Solution: release all locks only at commit/abort time (strict two-phase locking).

13 Timestamp Ordering
– The transaction manager assigns a unique timestamp TS(Ti) to each transaction Ti.
– Each operation OPER(Ti, x) submitted by the transaction manager to the scheduler is timestamped: TS(OPER(Ti, x)) ← TS(Ti).
– The scheduler adheres to the following rule: if OPER(Ti, x) and OPER(Tj, x) conflict, then the data manager processes OPER(Ti, x) before OPER(Tj, x) iff TS(OPER(Ti, x)) < TS(OPER(Tj, x)).
– The scheme is aggressive: if a single OPER(Ti, x) is rejected, Ti has to be aborted.
– If TS(OPER(Ti, x)) < TS(OPER(Tj, x)) but OPER(Tj, x) has already been processed by the data manager, the scheduler rejects OPER(Ti, x): it came in too late.
– If TS(OPER(Ti, x)) < TS(OPER(Tj, x)) and OPER(Ti, x) has already been processed by the data manager, the scheduler submits OPER(Tj, x) to the data manager.
  refinement: hold back OPER(Tj, x) until Ti commits or aborts

14 Timestamp Ordering (cont)
Every data item x carries two timestamps:
– TSRD(x) ← max(TS(Ti)) over transactions Ti that have read x
– TSWR(x) ← max(TS(Ti)) over transactions Ti that have written x
Read rule
– if TS(read(Ti, x)) < TSWR(x), x was last written by a transaction younger than Ti: the scheduler rejects read(Ti, x) and Ti is aborted
– otherwise TSRD(x) ← max(TS(Ti), TSRD(x))
Write rule
– if TS(write(Ti, x)) < TSRD(x), the current value of x has been read by a more recent Tj: the scheduler rejects write(Ti, x) and Ti is aborted
– otherwise TSWR(x) ← max(TS(Ti), TSWR(x))
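The read and write rules translate almost line by line into code. A sketch with illustrative names, implementing exactly the two checks on the slide (full timestamp ordering also rejects a write when TS(write(Ti, x)) < TSWR(x); that check is omitted here to mirror the slide):

```python
class TimestampScheduler:
    """Sketch of the per-item read/write timestamp rules above."""
    def __init__(self):
        self.ts_rd = {}   # item -> largest timestamp among readers (TSRD)
        self.ts_wr = {}   # item -> largest timestamp among writers (TSWR)

    def read(self, ts, item):
        if ts < self.ts_wr.get(item, -1):
            return False  # item was written by a younger transaction: reject, abort
        self.ts_rd[item] = max(ts, self.ts_rd.get(item, -1))
        return True

    def write(self, ts, item):
        if ts < self.ts_rd.get(item, -1):
            return False  # a younger transaction already read item: reject, abort
        self.ts_wr[item] = max(ts, self.ts_wr.get(item, -1))
        return True

s = TimestampScheduler()
assert s.write(1, "x")        # T1 (timestamp 1) writes x
assert s.read(2, "x")         # younger T2 (timestamp 2) may read it
assert not s.write(1, "x")    # T1's second write came in too late: T2 read x
```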

15 Optimistic Concurrency Control
Observation
– maintaining locks costs a lot, and in practice there are not many conflicts
Alternative
– go ahead immediately with all operations
– use tentative writes everywhere (shadow copies)
– solve conflicts later on
Phases
– allow operations tentatively → validate effects → make updates permanent
Validation: check for each pair of active transactions Ti and Tj
– Ti must not read or write data that has been written by Tj
– Tj must not read or write data that has been written by Ti
– if one of the rules does not hold: abort the transaction
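The validation step reduces to set intersections over read and write sets. A sketch of the two rules above, assuming each transaction's read/write sets were recorded during its tentative phase (all names are illustrative):

```python
def validate(tx, others):
    """Validate tx against each concurrently active transaction, per the
    two rules above: reject on any overlap with the other side's writes."""
    for other in others:
        # rule 1: tx must not have read or written data written by other
        overlap = (tx["reads"] | tx["writes"]) & other["writes"]
        # rule 2: other must not have read or written data written by tx
        mirror = (other["reads"] | other["writes"]) & tx["writes"]
        if overlap or mirror:
            return False   # conflict: abort instead of blocking
    return True

t1 = {"reads": {"a"}, "writes": {"b"}}
t2 = {"reads": {"b"}, "writes": {"c"}}   # t2 read b, which t1 writes
t3 = {"reads": {"x"}, "writes": {"y"}}   # touches disjoint items

print(validate(t1, [t3]))   # True: no overlap, t1 may commit
print(validate(t1, [t2]))   # False: t2 read data written by t1
```

On success the tentative (shadow) writes are made permanent; on failure the transaction restarts, which is cheap precisely when conflicts are rare.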

16 Comparison
Locking vs. timestamp ordering
– both are pessimistic
– dynamic vs. static ordering of transactions
– write-dominated vs. read-dominated workloads
Optimistic
– efficient when there are few conflicts
– not widely used

17 Deadlocks
Definition
– a state in which each member of a group of transactions waits for some other member to release a lock
Examples
– transactions T and U read a data item and both try to promote their read lock to a write lock
– transaction T waits for transaction U to release a lock on data item A, while U waits for V to release a lock on data item B, and V waits for T to release a lock on data item C
Wait-for graphs
– a graphical notation to represent wait-for relations among transactions

18 Deadlocks (cont)
Deadlock prevention
– lock all of the data items at the beginning: hard to predict all the required data items in advance
– request locks on data items in a predefined order: may result in premature locking and reduced concurrency
Deadlock detection
– the lock manager checks for deadlocks whenever a lock request arrives for a data item currently locked by another transaction, or less frequently to avoid server overhead
– to detect a deadlock, the lock manager finds a cycle in the wait-for graph and breaks it
Deadlock resolution
– once detected, one of the transactions in a cycle is selected and aborted, based on its age and the number of cycles it is involved in
– timeouts
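Finding a cycle in the wait-for graph is a plain depth-first search. A sketch (illustrative names) that returns the deadlocked cycle, if any, for a graph mapping each transaction to the set of transactions it waits for:

```python
def find_cycle(wait_for):
    """DFS for a cycle in a wait-for graph {tx: set of txs it waits for}."""
    def dfs(node, path):
        if node in path:
            return path[path.index(node):]   # the deadlocked cycle
        for nxt in wait_for.get(node, ()):
            cycle = dfs(nxt, path + [node])
            if cycle:
                return cycle
        return None

    for start in wait_for:
        cycle = dfs(start, [])
        if cycle:
            return cycle
    return None

# slide 17's second example: T waits for U, U waits for V, V waits for T
graph = {"T": {"U"}, "U": {"V"}, "V": {"T"}}
print(find_cycle(graph))   # a cycle through T, U, and V
```

Once a cycle is returned, the resolution step picks one member as the victim (e.g. by age) and aborts it, which removes that node's edges and breaks the cycle.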

19 Distributed Deadlocks
Centralized deadlock detection
– each server sends its local wait-for graph, and the central deadlock detector checks for cycles in the resulting global wait-for graph
– phantom deadlocks: a deadlock is "detected" even though one of the transactions holding a lock (and creating the cycle) has already aborted during the detection phase

20 Distributed Deadlocks (cont)
Distributed deadlock detection
– called edge chasing or path pushing; there is no global wait-for graph
– mechanism: the lock manager informs the coordinator when transactions start waiting and when they become active again
– three phases
  initiation: if transaction A starts waiting for transaction B, which is itself waiting to access a data item at another server, B's server sends a probe containing the wait-for relationship to the server of the data item where B is blocked, and to all servers in which transactions share a lock with B
  detection: if the data item is held by yet another transaction (determined by consulting the coordinator), add this relationship to the probe and forward the probe in the same manner as above
  resolution: when a cycle is detected, one transaction in the cycle is aborted to break the deadlock
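A much-simplified single-process sketch of edge chasing (all names are illustrative, and it assumes each transaction waits on at most one item and each item has one holder): the probe is a growing path of wait-for edges, forwarded hop by hop, and a deadlock is flagged when the probe comes back to its initiator.

```python
def edge_chase(holds, waits_for, start_tx):
    """Grow a probe along wait-for edges; deadlock iff it returns to start_tx.
    holds: item -> holding transaction; waits_for: tx -> item it is blocked on."""
    probe = [start_tx]
    tx = start_tx
    while tx in waits_for:
        item = waits_for[tx]         # the data item tx is blocked on
        tx = holds[item]             # forward the probe to that item's holder
        if tx == start_tx:
            return probe + [start_tx]  # probe returned to initiator: deadlock
        if tx in probe:
            return None                # cycle not involving the initiator;
                                       # its own member's probe would find it
        probe.append(tx)
    return None                        # probe reached an active transaction

# T holds A and waits on B; U holds B and waits on C; V holds C and waits on A
holds = {"A": "T", "B": "U", "C": "V"}
waits_for = {"T": "B", "U": "C", "V": "A"}
print(edge_chase(holds, waits_for, "T"))   # ['T', 'U', 'V', 'T']
```

In the real protocol each hop is a message between servers, which is why no server ever needs the global wait-for graph.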

