Slides for Chapter 10: Distributed transactions


Slides for Chapter 10: Distributed transactions From Coulouris, Dollimore and Kindberg Distributed Systems: Concepts and Design Edition 4, © Addison-Wesley 2005

Topics in Distributed Transactions

In the previous chapter, we discussed transactions that access objects at a single server. In the general case, a transaction accesses objects located on different computers: a distributed transaction accesses objects managed by multiple servers.

The atomicity property requires that either all of the servers involved in a transaction commit it or all of them abort it, so agreement among the servers is necessary.

Transaction recovery ensures that all objects are recoverable: the values of the objects reflect all changes made by committed transactions and none of those made by aborted ones.

Instructor's Guide for Coulouris, Dollimore and Kindberg Distributed Systems: Concepts and Design Edn.4 © Pearson Education 2005

Figure 14.1 Distributed transactions: (a) flat transaction, (b) nested transactions. [Diagram omitted: a client transaction T invoking servers X, Y and Z directly in the flat case, or through sub-transactions T1, T2, T11, T12, T21 and T22 at servers X, Y, Z, M, N and P in the nested case.]

A flat transaction sends its requests to the different servers sequentially: each request is completed before the client goes on to the next one. A nested transaction allows sub-transactions at the same level to execute concurrently.

Figure 14.2 Nested banking transaction. [Diagram omitted: client transaction T with sub-transactions T1–T4 performing a.withdraw(10), b.withdraw(20), c.deposit(10) and d.deposit(20) at servers X, Y and Z.]

T = openTransaction
    openSubTransaction a.withdraw(10);
    openSubTransaction b.withdraw(20);
    openSubTransaction c.deposit(10);
    openSubTransaction d.deposit(20);
closeTransaction

Coordinator of a distributed transaction

The servers of a distributed transaction need to coordinate their actions. A client starts a transaction by sending an openTransaction request to a coordinator, which returns a TID to the client. The TID must be globally unique; one scheme combines the coordinator's server IP with a number unique to that server. The coordinator is responsible for committing or aborting the transaction.

Every other server in the transaction is a participant. Participants cooperate with the coordinator in carrying out the commit protocol, and each keeps track of the recoverable objects it manages. The coordinator holds a set of references to the participants, and each participant records a reference to the coordinator.
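The bookkeeping described above can be sketched as follows. This is only an illustration of the roles, not the book's API: the class and method names (`Coordinator`, `open_transaction`, `join`) and the use of an in-process counter are assumptions.

```python
import itertools

class Coordinator:
    """Illustrative sketch of a transaction coordinator's bookkeeping."""
    _counter = itertools.count(1)  # number unique to this server

    def __init__(self, server_ip):
        self.server_ip = server_ip
        self.participants = {}  # TID -> set of participant references

    def open_transaction(self):
        # TID = (server IP, per-server number), hence globally unique
        tid = (self.server_ip, next(Coordinator._counter))
        self.participants[tid] = set()
        return tid

    def join(self, tid, participant):
        # called when a participant's server first handles a request
        # belonging to this transaction
        self.participants[tid].add(participant)

coord = Coordinator("10.0.0.1")
t = coord.open_transaction()
coord.join(t, "BranchX")
coord.join(t, "BranchY")
print(t, sorted(coord.participants[t]))
```

In a real system `join` would be a remote call made by the participant's server, and the coordinator's records would have to survive crashes; here they are only in-memory.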

Figure 14.3 A distributed banking transaction. [Diagram omitted: the client opens transaction T at the coordinator in BranchX; participants at BranchY and BranchZ join T. The client performs a.withdraw(4); c.deposit(4); b.withdraw(3); d.deposit(3), where A is at BranchX, B at BranchY, and C and D at BranchZ.]

Note: when the client invokes an operation such as b.withdraw(T, 3), server B informs the participant at BranchY to join the coordinator. The coordinator resides in one of the servers, e.g. BranchX.

One-phase atomic commit protocol

A transaction comes to an end when the client requests that it be committed or aborted. A simple approach is for the coordinator to communicate the commit or abort request to all of the participants in the transaction and to keep repeating the request until all of them have acknowledged that they have carried it out.

This is inadequate because, when the client requests a commit, it does not allow a server to make a unilateral decision to abort the transaction. For example, deadlock avoidance may force a transaction to abort at a server when locking is used. So a server may fail or abort without the client being aware of it.

Two-phase commit protocol

This protocol allows any participant to abort its part of a transaction; by atomicity, the whole transaction must then be aborted. In the first phase, each participant votes for the transaction to be committed or aborted. Once a participant has voted to commit, it is not allowed to abort. Therefore, before voting to commit, it must ensure that it will eventually be able to carry out its part of the commit protocol, even if it fails and is replaced in the interim. A participant is said to be in a prepared state if it will eventually be able to commit; so each participant must save the altered objects in permanent storage, together with its status (prepared).

Two-phase commit protocol

In the second phase, every participant in the transaction carries out the joint decision. If any one participant votes to abort, the decision must be to abort; if all the participants vote to commit, the decision is to commit the transaction. The problem is to ensure that all of the participants vote and that they all reach the same decision: an example of consensus. This is simple if no errors occur, but the protocol must also work when servers fail, messages are lost, or servers are temporarily unable to communicate with one another.

Two-phase commit protocol

If the client requests an abort, or if the transaction is aborted by one of the participants, the coordinator informs the participants immediately. It is when the client asks the coordinator to commit the transaction that the two-phase commit protocol comes into use. In the first phase, the coordinator asks all the participants if they are prepared to commit; in the second, it tells them to commit or abort the transaction.

Figure 14.4 Operations for two-phase commit protocol

canCommit?(trans) -> Yes / No: call from coordinator to participant to ask whether it can commit a transaction. The participant replies with its vote.
doCommit(trans): call from coordinator to participant to tell it to commit its part of a transaction.
doAbort(trans): call from coordinator to participant to tell it to abort its part of a transaction.
haveCommitted(trans, participant): call from participant to coordinator to confirm that it has committed the transaction.
getDecision(trans) -> Yes / No: call from participant to coordinator to ask for the decision on a transaction after it has voted Yes but has still had no reply after some delay. Used to recover from server crashes or delayed messages.

Figure 14.5 The two-phase commit protocol

Phase 1 (voting phase):
1. The coordinator sends a canCommit? request to each of the participants in the transaction.
2. When a participant receives a canCommit? request, it replies with its vote (Yes or No) to the coordinator. Before voting Yes, it prepares to commit by saving objects in permanent storage. If the vote is No, the participant aborts immediately.

Phase 2 (completion according to outcome of vote):
3. The coordinator collects the votes (including its own). (a) If there are no failures and all the votes are Yes, the coordinator decides to commit the transaction and sends a doCommit request to each of the participants. (b) Otherwise the coordinator decides to abort the transaction and sends doAbort requests to all participants that voted Yes.
4. Participants that voted Yes wait for a doCommit or doAbort request from the coordinator. When a participant receives one of these messages, it acts accordingly and, in the case of commit, makes a haveCommitted call as confirmation to the coordinator.
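The decision logic of the two phases can be sketched as a small function. This is a minimal illustration that abstracts away the messages, storage and failures; the function name and vote encoding are assumptions, not the book's notation.

```python
def two_phase_commit(coordinator_vote, participant_votes):
    """Sketch of the two-phase commit decision (message passing abstracted)."""
    # Phase 1 (voting): the coordinator collects the votes, including its own.
    votes = [coordinator_vote] + list(participant_votes)
    # Phase 2 (completion): commit only if every vote is Yes; otherwise abort.
    decision = "commit" if all(v == "Yes" for v in votes) else "abort"
    # doCommit goes to every participant; doAbort only to those that voted Yes
    # (a participant that voted No has already aborted unilaterally).
    notified = [i for i, v in enumerate(participant_votes)
                if decision == "commit" or v == "Yes"]
    return decision, notified

print(two_phase_commit("Yes", ["Yes", "Yes"]))
print(two_phase_commit("Yes", ["Yes", "No"]))
```

Note that a single No vote, from any participant or from the coordinator itself, forces the abort outcome, which is exactly the atomicity requirement.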

Figure 14.6 Communication in two-phase commit protocol. [Diagram omitted: in step 1 the coordinator sends canCommit? (status: prepared to commit, waiting for votes); in step 2 the participant replies Yes (status: prepared to commit, uncertain); in step 3 the coordinator sends doCommit (status: committed); in step 4 the participant replies haveCommitted, after which the coordinator's status is done.]

Two-phase commit protocol

Consider a participant that has voted Yes and is waiting for the coordinator to report the outcome of the vote by telling it to commit or abort. Such a participant is uncertain and cannot proceed any further: the objects used by its transaction cannot be released for use by other transactions. The participant makes a getDecision request to the coordinator to determine the outcome. If the coordinator has failed, the participant will not get the decision until the coordinator is replaced, which can mean extensive delays for participants in the uncertain state. Timeouts are used because the exchange of information can fail when one of the servers crashes or when messages are lost, so that a process does not block forever.

Performance of two-phase commit protocol

Provided that all servers and communication channels do not fail, with N participants there are N canCommit? messages and N replies, followed by N doCommit messages. The cost in messages is therefore proportional to 3N, and the cost in time is three rounds of messages. The haveCommitted messages are not counted; the protocol can function correctly without them, since their only role is to enable servers to delete stale coordinator information.

Failure of the coordinator

When a participant has voted Yes and is waiting for the coordinator to report the outcome of the vote, it is in the uncertain state. If the coordinator has failed, the participant will not be able to get the decision until the coordinator is replaced, which can result in extensive delays for participants in the uncertain state. One alternative strategy is to allow the participants to obtain the decision from other participants instead of contacting the coordinator. However, if all participants are in the uncertain state, they will not get a decision.

Concurrency Control in Distributed Transactions

Each server applies local concurrency control to its own objects, which ensures that transactions are serializable locally. However, the members of a collection of servers of distributed transactions are jointly responsible for ensuring that the transactions are performed in a serially equivalent manner: global serializability is required.

Locks

The lock manager at each server decides whether to grant a lock or make the requesting transaction wait. However, it cannot release any locks until it knows that the transaction has been committed or aborted at all the servers involved in the transaction. The lock managers in different servers set their locks independently of one another, so it is possible that different servers impose different orderings on transactions.
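The key constraint above — locks may only be released once the global outcome is known — can be sketched as a per-server lock manager. The class and method names are illustrative assumptions; a real lock manager would block the caller and wake waiting transactions rather than just recording them.

```python
class LocalLockManager:
    """Sketch: exclusive locks at one server, released only after the
    distributed transaction has committed or aborted everywhere."""
    def __init__(self):
        self.owner = {}    # object id -> transaction holding its lock
        self.waiting = []  # (transaction, object id) pairs that must wait

    def lock(self, tid, obj):
        holder = self.owner.get(obj)
        if holder is None or holder == tid:
            self.owner[obj] = tid
            return True
        self.waiting.append((tid, obj))  # a real server would block here
        return False

    def transaction_finished(self, tid):
        # only now -- after the global commit/abort decision -- may the
        # transaction's locks be released
        for obj in [o for o, t in self.owner.items() if t == tid]:
            del self.owner[obj]

lm = LocalLockManager()
assert lm.lock("T", "A")
assert not lm.lock("U", "A")   # U must wait: A is held by T
lm.transaction_finished("T")   # T committed at all servers
assert lm.lock("U", "A")       # now U can acquire A
```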

Timestamp ordering concurrency control

In a single-server transaction, the coordinator issues a unique timestamp to each transaction when it starts. Serial equivalence is enforced by committing the versions of objects in the order of the timestamps of the transactions that accessed them. In distributed transactions, each coordinator must issue globally unique timestamps, and the coordinators must agree on the ordering of their timestamps. With timestamps of the form <local timestamp, server-id>, the agreed ordering of pairs of timestamps is based on a comparison in which the server-id is less significant. The timestamp is passed to each server whose objects perform an operation in the transaction.
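The pair ordering described above falls out directly from lexicographic tuple comparison, as this small sketch shows (the function name is an assumption):

```python
def global_ts(local_ts, server_id):
    # agreed ordering: local timestamps are compared first; the server-id
    # is less significant and only breaks ties between servers
    return (local_ts, server_id)

# Python compares tuples lexicographically, which matches the agreed ordering
assert global_ts(5, "X") < global_ts(7, "Y")   # local timestamps decide
assert global_ts(5, "X") < global_ts(5, "Y")   # tie broken by server-id
```

Because every server-id is distinct, two timestamps can never compare equal, so the ordering is total and globally unique.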

Timestamp ordering concurrency control

To achieve the same ordering at all the servers, the servers of distributed transactions are jointly responsible for ensuring that the transactions are performed in a serially equivalent manner; e.g., if T commits after U at server X, T must commit after U at server Y. Conflicts are resolved as each operation is performed. If the resolution of a conflict requires a transaction to be aborted, the coordinator is informed and aborts the transaction at all the participants.

Locking

T: Write(A) at X — locks A
U: Write(B) at Y — locks B
T: Read(B) at Y — waits for U
U: Read(A) at X — waits for T

T is ordered before U at server X, and U before T at server Y. These different orderings can lead to cyclic dependencies between transactions, and a distributed deadlock arises.

Distributed Deadlock

Deadlocks can arise within a single server when locking is used for concurrency control. Servers must either prevent deadlocks or detect and resolve them. Using timeouts to resolve deadlocks is a clumsy approach: it is hard to choose an appropriate timeout interval, and transactions may be aborted unnecessarily when no deadlock exists. A better way is to detect deadlocks by finding cycles in a wait-for graph.

Figure 14.12 Interleavings of transactions U, V and W

U: d.deposit(10) — lock D at Z
V: b.deposit(10) — lock B at Y
U: a.deposit(20) — lock A at X
W: c.deposit(30) — lock C at Z
U: b.withdraw(30) — wait at Y
V: c.withdraw(20) — wait at Z
W: a.withdraw(20) — wait at X

Objects A and B are managed by servers X and Y; objects C and D by server Z.

Figure 14.13 Distributed deadlock. [Diagram omitted: a cycle in the wait-for graph spanning servers X, Y and Z: W waits for A, held by U at server X; U waits for B, held by V at server Y; V waits for C, held by W at server Z.]

Figure 14.14 Local and global wait-for graphs. [Diagram omitted: local wait-for graphs (e.g. W → U at server X, U → V at server Y, V → W at server Z) combine into a global cycle.]

The global wait-for graph is held in part by each of the several servers involved, and communication between these servers is required to find cycles in it. A simple solution is for one server to take on the role of global deadlock detector: from time to time, each server sends it the latest copy of its local wait-for graph. Disadvantages: poor availability, lack of fault tolerance, no ability to scale, and the high cost of frequently transmitting the local wait-for graphs.
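Once the global detector has merged the local wait-for graphs, deadlock detection reduces to finding a cycle in a directed graph. A minimal sketch (function names and graph encoding are illustrative assumptions):

```python
def find_cycle(wait_for):
    """Depth-first search for a cycle in a merged global wait-for graph,
    given as a dict: transaction -> list of transactions it waits for."""
    def visit(t, path):
        if t in path:
            return path[path.index(t):]        # cycle found: return it
        for u in wait_for.get(t, ()):
            cycle = visit(u, path + [t])
            if cycle:
                return cycle
        return None

    for start in wait_for:                     # try every transaction
        cycle = visit(start, [])
        if cycle:
            return cycle
    return None

# merged local graphs: U waits for V, V for W, W for U -- a distributed deadlock
print(find_cycle({"U": ["V"], "V": ["W"], "W": ["U"]}))
```

This centralized search is exactly what makes the single-detector scheme simple; the edge-chasing approach described later finds the same cycles without ever assembling the whole graph in one place.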

Phantom deadlock

A deadlock that is "detected" but is not really a deadlock is called a phantom deadlock. Since sending the local wait-for graphs to one place takes some time, there is a chance that one of the transactions that holds a lock will meanwhile have released it, in which case the deadlock no longer exists.

Phantom deadlock. [Diagram omitted: local wait-for graphs at servers X and Y, involving transactions T, U and V, and the graph seen by the global deadlock detector.]

Suppose U releases its object at X and then requests an object held by V, adding the edge U → V. The global detector will then see a deadlock, even though the edge from T to U no longer exists. However, if two-phase locking is used, transactions cannot release locks and then obtain more locks, so phantom deadlock cycles cannot occur in the way suggested above.

Edge Chasing / Path Pushing

A distributed approach to deadlock detection: no global wait-for graph is constructed, but each server has knowledge of some of its edges. The servers attempt to find cycles by forwarding messages called probes, which follow the edges of the graph through the distributed system. A probe message consists of transaction wait-for relationships representing a path in the global wait-for graph.

Figure 14.15 Probes transmitted to detect deadlock. [Diagram omitted: detection is initiated at server X, where W waits for A held by U; the probe travels to server Y, where U waits for B held by V, and on to server Z, where V waits for C held by W, at which point the deadlock is detected.]

When should the probe be sent at initiation? Suppose a server X detects a local wait-for relationship W → U. If U is not waiting, there is no chance that a cycle can be formed. However, if U is waiting for another transaction, say V, there is the potential for a cycle: W → U extends to W → U → V, and if V → ... eventually leads back, a cycle forms.

Three steps

Initiation: when a server notes that a transaction T starts waiting for another transaction U, where U is waiting to access an object at another server, it initiates detection by sending a probe containing the edge <T → U> to the server of the object at which U is blocked.

Detection: consists of receiving probes and deciding whether a deadlock has occurred and whether to forward the probes. The receiving server checks whether U is also waiting; if it is, the transaction it waits for (e.g. V) is added to the probe, making it <T → U → V>, and if the new transaction V is waiting for another object elsewhere, the probe is forwarded. In this way, paths through the global wait-for graph are built one edge at a time. After a new transaction is added to the probe, the server checks whether the just-added transaction has created a cycle.

Resolution: when a cycle is detected, a transaction in the cycle is aborted to break the deadlock.
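The three steps can be sketched as a single loop. This is only the logical path a probe traces: the `holds` and `waits_for` maps are assumed to be globally visible here, whereas in reality each server knows only its local edges and the probe is forwarded between servers.

```python
def edge_chase(initial_edge, holds, waits_for):
    """Sketch of edge chasing. `holds` maps object -> holding transaction;
    `waits_for` maps transaction -> object it is blocked on (if any)."""
    probe = list(initial_edge)                 # initiation, e.g. ["W", "U"]
    while True:
        blocked_on = waits_for.get(probe[-1])  # is the last transaction waiting?
        if blocked_on is None:
            return None                        # path cannot grow: no cycle found
        holder = holds[blocked_on]
        if holder in probe:
            return probe + [holder]            # cycle detected -> resolution
        probe.append(holder)                   # detection: extend, forward probe

# Figure 14.15: W waits for A held by U, U for B held by V, V for C held by W
holds = {"A": "U", "B": "V", "C": "W"}
waits_for = {"W": "A", "U": "B", "V": "C"}
print(edge_chase(["W", "U"], holds, waits_for))
```

Each iteration of the loop corresponds to one probe forwarding; the returned path is the cycle from which resolution picks a victim to abort.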

Three steps: example

Server X initiates detection by sending probe <W → U> to the server of B (server Y). Server Y receives probe <W → U>, notes that B is held by V, and appends V to the probe to produce <W → U → V>. It notes that V is waiting for C at server Z, so the probe is forwarded to server Z. Server Z receives probe <W → U → V>, notes that C is held by W, and appends W to the probe to produce <W → U → V → W>: a cycle. One of the transactions in the cycle must abort, and the choice can be made based on priorities.


Probe forwarding between servers is actually done through coordinators

The lock manager at a participant informs the coordinator when a transaction starts waiting for an object and when it acquires the object and becomes active again. The coordinator records whether each of its transactions is active or waiting for an object, and participants can get this information from the coordinator. A server usually sends its probe to the coordinator of the last transaction in the path to find out whether that transaction is waiting for another object elsewhere. E.g. for W → U → V: check whether V is waiting; if V is waiting for another object, V's coordinator forwards the probe to the server of the object on which V is waiting. Thus, each time a probe is forwarded, two messages are required.

Performance analysis

In the example above, two probe messages were sent to detect a cycle involving three transactions. When one probe is forwarded, two messages are required. In general, a probe that detects a cycle involving N transactions is forwarded by (N − 1) transaction coordinators via (N − 1) servers of objects, requiring a total of 2(N − 1) messages. Deadlock detection can be initiated by several transactions in a cycle at the same time.

Figure 14.16 Two probes initiated. [Diagram omitted: (a) the initial situation, with transactions T, U, V and W and their wait-for edges; (b) detection initiated at the object requested by T; (c) detection initiated at the object requested by W.]

Problems with multiple probes

At about the same time, T starts waiting for U (T → U) and W starts waiting for V (W → V). Two probes are initiated, and the same deadlock is detected by different servers. We want to ensure that only one transaction in a deadlock is aborted, but different servers may choose different transactions to abort, leading to unnecessary aborts. Using priorities to determine which transaction to abort means the same transaction is chosen even if the cycle is detected by different servers.

Using priorities can also reduce the number of probes: for example, initiate a probe only when a higher-priority transaction starts to wait for a lower-priority one. If the priority order from high to low is T, U, V, W, then only the probe for T → U is sent, not the one for W → V.

Transaction recovery

The atomicity property of transactions has two aspects:

Durability: objects are saved in permanent storage and will be available indefinitely thereafter. Acknowledgement of a client's commit request implies that all the effects of the transaction have been recorded in permanent storage as well as in the server's volatile objects.

Failure atomicity: the effects of transactions are atomic even when the server crashes.

Both can be realized by the recovery manager.

Tasks of a recovery manager:
to save objects in permanent storage (in a recovery file) for committed transactions;
to restore the server's objects after a crash;
to reorganize the recovery file to improve the performance of recovery;
to reclaim storage space in the recovery file.

Figure 14.18 Types of entry in a recovery file

Object: a value of an object.
Transaction status: transaction identifier, transaction status (prepared, committed or aborted) and other status values used for the two-phase commit protocol.
Intentions list: transaction identifier and a sequence of intentions, each of which consists of <identifier of object>, <position of value of object>.

The server records an intentions list for each of its currently active transactions: the list for a particular transaction contains the references and values of all the objects it has altered. When the transaction commits, the committed version of each object is replaced by the tentative version made by that transaction; when a transaction aborts, the server uses the intentions list to delete all the tentative versions of its objects. When a participant says it is prepared to commit, its recovery manager must have saved both its intentions list for that transaction and the objects in that list in its recovery file, so that it will be able to carry out the commit later, even if it crashes in the interim.

Figure 14.19 Log for banking service

[Figure omitted: a log beginning at a checkpoint and containing, in order, the values A = 100, B = 200, C = 300; the tentative values A = 80 and B = 220 written by transaction T, followed by T's prepared status (with its intentions list) and committed status; the tentative values C = 278 and B = 242 written by transaction U, followed by U's prepared status; then the end of the log.]

The log technique keeps a history of all transactions performed by a server. When a transaction prepares, commits or aborts, the recovery manager is called: it appends all the objects in the transaction's intentions list, followed by the current transaction status. After a crash, any transaction that does not have a committed status in the log is aborted. Each transaction status entry contains a pointer to the position in the recovery file of the previous transaction status entry, enabling the recovery manager to follow the transaction entries in reverse order; the last pointer points to the checkpoint.

Recovery of objects

When a server is replaced after a crash, it first sets default initial values for its objects and then hands over to its recovery manager, which is responsible for restoring the server's objects so that they include the effects of all committed transactions, performed in the correct order, and none of the effects of aborted transactions. There are two approaches:
- Starting from the most recent checkpoint, read forward through the log, taking in the values of each of the objects and replacing them with the values written by each committed transaction in turn.
- Read the recovery file backwards, using transactions with committed status to restore those objects that have not yet been restored, and continue until all of the server's objects have been restored. The advantage is that each object is restored only once. (In Figure 14.19, U is aborted, so its values for C and B are ignored; A and B are then restored as 80 and 220 from T's intentions list, and C as 300.)
Reorganizing the log file: a checkpoint writes the current committed values of all the objects to a new recovery file, since the committed values are all that is needed.
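The backwards pass on the Figure 14.19 example can be sketched as follows. The log layout here is a simplified assumption (checkpoint values held separately, positions as list indices); the logic is the second approach above: ignore uncommitted transactions, restore each object at most once from the nearest committed intentions list, and fall back to the checkpoint:

```python
# Committed values of all objects at the checkpoint.
checkpoint = {'A': 100, 'B': 200, 'C': 300}

# Log entries after the checkpoint (values from Figure 14.19).
log = [
    ('object', 'A', 80), ('object', 'B', 220),
    ('prepared', 'T', [('A', 0), ('B', 1)]),
    ('committed', 'T'),
    ('object', 'C', 278), ('object', 'B', 242),
    ('prepared', 'U', [('C', 4), ('B', 5)]),   # U never committed
]

committed = {e[1] for e in log if e[0] == 'committed'}

restored = {}
for entry in reversed(log):
    if entry[0] == 'prepared' and entry[1] in committed:
        for obj, pos in entry[2]:
            # setdefault: restore only objects not yet restored.
            restored.setdefault(obj, log[pos][2])

# Objects untouched by any committed transaction keep checkpoint values.
for obj, value in checkpoint.items():
    restored.setdefault(obj, value)
# restored is now {'A': 80, 'B': 220, 'C': 300}
```

U's tentative values 278 and 242 are skipped because U has no committed status entry, exactly as in the worked example in the text.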

Figure 14.21 Log with entries relating to two-phase commit protocol

Trans: T       prepared     intentions list
Coord'r: T     part'pant list: . . .
Trans: T       committed
Trans: U       prepared     intentions list
Part'pant: U   Coord'r: . .
Trans: U       uncertain
Trans: U       committed

The coordinator uses the committed/aborted status to record that the outcome of the vote is Yes/No, and done to record that the two-phase commit protocol is complete; prepared is recorded before the vote. A participant uses prepared to indicate that it has not yet voted and can still abort the transaction, uncertain to indicate that it has voted Yes but does not yet know the outcome, and committed to indicate that the protocol has finished. In the example above, this server plays the coordinator role for transaction T and the participant role for transaction U.

Log with entries relating to two-phase commit protocol

In phase 1, when the coordinator is prepared to commit (and has already added a prepared status entry), its recovery manager adds a coordinator entry. Before a participant can vote Yes, it must already have prepared to commit and have added a prepared status entry. When it votes Yes, its recovery manager records a participant entry and adds an uncertain status. When a participant votes No, it adds an aborted status to its recovery file. In phase 2, the recovery manager of the coordinator adds either a committed or an aborted status entry, according to the decision. The recovery managers of the participants add a committed or aborted status to their recovery files according to the message received from the coordinator. When the coordinator has received a confirmation from all its participants, its recovery manager adds a done status entry.
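The sequence of recovery-file writes described above can be sketched for a coordinator S1 and a participant S2 (the server names and the `record` helper are hypothetical; the entry kinds and statuses follow the text):

```python
# Sketch of the recovery-file writes made during two-phase commit.
recovery_file = []

def record(server, kind, *fields):
    """Append one recovery-file entry on behalf of the named server."""
    recovery_file.append((server, kind) + fields)

# Phase 1 at the coordinator: prepared status, then the coordinator entry.
record('S1', 'status', 'T', 'prepared')
record('S1', 'coordinator', 'T', ['S2'])       # participant list

# Phase 1 at a participant voting Yes: prepared, participant entry, uncertain.
record('S2', 'status', 'T', 'prepared')
record('S2', 'participant', 'T', 'S1')         # its coordinator
record('S2', 'status', 'T', 'uncertain')

# Phase 2: the decision at both servers, then done at the coordinator
# once confirmations have arrived from all participants.
record('S1', 'status', 'T', 'committed')
record('S2', 'status', 'T', 'committed')
record('S1', 'status', 'T', 'done')
```

If either server crashes, the entry kinds present in its file identify its role, and the most recent status entry gives its state at the time of failure, which is exactly what the recovery rules below key on.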

Log with entries relating to two-phase commit protocol

When a server is replaced after a crash, the recovery manager has to deal with the two-phase commit protocol in addition to restoring the objects. For any transaction in which the server played the coordinator role, it should find a coordinator entry and a set of transaction status entries. For any transaction in which the server played the participant role, it should find a participant entry and a set of transaction status entries. In both cases, the most recent transaction status entry, that is, the one nearest the end of the log, determines the transaction's status at the time of failure. The action of the recovery manager with respect to the two-phase commit protocol depends on whether the server was the coordinator or a participant, and on its status at the time of failure, as shown in the following table.

Figure 14.22 Recovery of the two-phase commit protocol

Role         Status      Action of recovery manager
Coordinator  prepared    No decision had been reached before the server failed. It sends abortTransaction to all the servers in the participant list and adds the transaction status aborted to its recovery file. The same action is taken for the waiting state. If there is no participant list, the participants will eventually time out and abort the transaction.
Coordinator  committed   A decision to commit had been reached before the server failed. It sends a doCommit to all the participants in its participant list (in case it had not done so before) and resumes the two-phase commit protocol at step 4 (Fig 13.5).
Participant  committed   The participant sends a haveCommitted message to the coordinator (in case this was not done before it failed). This allows the coordinator to discard information about this transaction at the next checkpoint.
Participant  uncertain   The participant failed before it knew the outcome of the transaction. It cannot determine the status of the transaction until the coordinator informs it of the decision. It sends a getDecision to the coordinator to determine the status of the transaction and, when it receives the reply, commits or aborts accordingly.
Participant  prepared    The participant has not yet voted and can abort the transaction.
Coordinator  done        No action is required.
Instructor's Guide for Coulouris, Dollimore and Kindberg Distributed Systems: Concepts and Design Edn.4 © Pearson Education 2005
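The table is effectively a lookup from (role, most recent status) to a recovery action, and can be sketched as such. The action strings are shorthand for the table rows, not an API from the book:

```python
# Sketch of Figure 14.22 as a lookup: (role, last status) -> recovery action.
RECOVERY_ACTIONS = {
    ('coordinator', 'prepared'):  'send abortTransaction to participants; log aborted',
    ('coordinator', 'committed'): 'resend doCommit; resume protocol at step 4',
    ('coordinator', 'done'):      'no action',
    ('participant', 'committed'): 'send haveCommitted to coordinator',
    ('participant', 'uncertain'): 'send getDecision; commit or abort on reply',
    ('participant', 'prepared'):  'abort the transaction',
}

def recover(role, status):
    """Return the recovery action for a server's role and its status at failure."""
    return RECOVERY_ACTIONS[(role, status)]
```

For example, a participant that crashed after voting Yes finds its most recent status is uncertain, so `recover('participant', 'uncertain')` tells it to ask the coordinator for the decision before doing anything else.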