Distributed Transaction Management – 2003, Lecture 4 (11.11.), Jyrki Nummenmaa



Blocking and non-blocking commit protocols
Based on the paper "Nonblocking Commit Protocols" by Dale Skeen.

Two-Phase Commit Blocks
(1) COMMIT sent, (2) coordinator C crashes, (3) P1 crashes, (4) P2 is blocked.

Failure model
- Our Blocking Theorem from last week states that if network partitioning is possible, then any distributed commit protocol may block.
- Let's now assume that the network cannot partition.
- Then we can consult other processes to make progress.
- However, if all processes fail, then we are again blocked.
- Let's further assume that total failure is not possible, i.e. not all processes are crashed at the same time.

Automata representation
- We model the participants with finite state automata (FSA).
- The participants move from one state to another as a result of receiving one or several messages, or as a result of a timeout event.
- Having received these messages, a participant may send some messages before executing the state transition.

Commit Protocol Automata
- Final states are divided into Abort states and Commit states (finally, either Abort or Commit takes place).
- Once an Abort state is reached, it is not possible to make a transition to a non-Abort state (Abort is irreversible). Similarly for Commit states (Commit is also irreversible).
- The state diagram is acyclic.
- We denote the initial state by q; the terminal states are a (an abort/rollback state) and c (a commit state). Often there is a wait state, which we denote by w.
- Assume the participants are P1,…,Pn; when the protocol starts, the possible coordinator is P0.

2PC Coordinator (states q, w, a, c)
- q → w: a commit-request from the application; send VoteReq to P1,…,Pn
- w → a: timeout, or No from one of P1,…,Pn; send Abort to P1,…,Pn
- w → c: Yes from all of P1,…,Pn; send Commit to P1,…,Pn

2PC Participant (states q, w, a, c)
- q → a: VoteReq from P0; send No to P0
- q → w: VoteReq from P0; send Yes to P0
- w → a: Abort from P0; send nothing
- w → c: Commit from P0; send nothing
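The two automata above can be encoded directly as transition tables. The following is a minimal sketch (not part of the lecture material; event and message names are illustrative) that replays a failure-free committing run:

```python
# 2PC automata as transition tables: (state, event) -> (messages sent, next state).

coordinator = {
    ("q", "commit-request"): (["VoteReq to P1..Pn"], "w"),
    ("w", "timeout or No"):  (["Abort to P1..Pn"],  "a"),
    ("w", "Yes from all"):   (["Commit to P1..Pn"], "c"),
}

participant = {
    ("q", "VoteReq; vote No"):  (["No to P0"],  "a"),
    ("q", "VoteReq; vote Yes"): (["Yes to P0"], "w"),
    ("w", "Abort from P0"):     ([], "a"),
    ("w", "Commit from P0"):    ([], "c"),
}

def run(fsa, events, state="q"):
    """Replay a sequence of events and return the trace of states visited."""
    trace = [state]
    for event in events:
        _msgs, state = fsa[(state, event)]
        trace.append(state)
    return trace

# Failure-free committing run: both sides go q -> w -> c.
print(run(coordinator, ["commit-request", "Yes from all"]))
print(run(participant, ["VoteReq; vote Yes", "Commit from P0"]))
```

Because the state diagram is acyclic, every replayed run terminates; an event not listed for the current state (a `KeyError` here) corresponds to a message that cannot arrive in that state.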

Commit Protocol State Transitions
- In a commit protocol, the idea is to inform the other participants of local progress.
- In fact, a state transition without a message exchange is uninteresting, unless the participant moves into a terminal state.
- Therefore, unless a participant moves into a terminal state, we may assume that it sends messages to the other participants about its change of state.
- To simplify our analysis, we may assume that the messages are sent to all other participants. This is not necessary, but it avoids unnecessary complication.

Concurrency set
- The concurrency set of a state s is the set of possible states among all participants when some participant is in state s.
- In other words, the concurrency set of state s is the set of all states that can co-exist with state s.

2PC Concurrency Sets
(Coordinator and participant automata as above.)
Concurrency_set(q) = {q,w,a}, Concurrency_set(a) = {q,w,a},
Concurrency_set(w) = {q,w,a,c}, Concurrency_set(c) = {w,c}

Committable states
- We say that a state is committable if the existence of a participant in this state means that everyone has voted Yes.
- If a state is not committable, we say that it is non-committable.
- In 2PC, w and c are committable states.

How can a site terminate when there is a timeout?
- Either (1) one of the operational sites knows the fate of the transaction, or (2) the operational sites can decide the fate of the transaction.
- Knowing the fate of the transaction means, in practice, that there is a participant in a terminal state.
- Start by considering a single participant s. The site must infer the possible states of the other participants from its own state. This can be done using concurrency sets.

When can't a single participant unilaterally abort?
- Suppose a participant is in a state that has a commit state in its concurrency set. Then it is possible that some other participant is in a commit state.
- A participant in a state that has a commit state in its concurrency set should not unilaterally abort.

When can't a single participant unilaterally commit?
- Suppose a participant is in a state that has an abort state in its concurrency set. Then some participant may be in an abort state.
- A participant in a state that has an abort state in its concurrency set should not unilaterally commit.
- Also, a participant that is not in a committable state should not commit.

The Fundamental Non-Blocking Theorem
- A protocol is non-blocking if and only if it satisfies the following conditions: (1) there exists no local state whose concurrency set contains both an abort and a commit state, and (2) there exists no non-committable state whose concurrency set contains a commit state.

Showing the Fundamental Non-Blocking Theorem
- From the discussion above it follows that conditions (1) and (2) are necessary.
- We discuss their sufficiency later by showing how to terminate a commit protocol fulfilling conditions (1) and (2).

Observations on 2PC
- Because the participants exchange messages as they progress, they progress in a synchronised fashion.
- In fact, there is always at most one step of difference between the states of any two live participants.
- We say that the participants keep a one-step synchronisation.
- It is easy to see by the Fundamental Non-Blocking Theorem that 2PC is blocking.

One-step synchronisation and the non-blocking property
- If a commit protocol keeps one-step synchronisation, then the concurrency set of a state s consists of s and the states adjacent to s.
- By applying this observation and the Fundamental Non-Blocking Theorem, we get a useful lemma:

Lemma
- A protocol that is synchronous within one state transition is non-blocking if and only if (1) it contains no state adjacent to both a Commit and an Abort state, and (2) it contains no non-committable state that is adjacent to a Commit state.
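Under one-step synchronisation, a concurrency set can thus be computed mechanically from the state diagram: cs(s) is s itself plus its neighbours. A small sketch (the transition list is read off the 2PC participant diagram above; nothing here is from Skeen's paper itself) reproduces the 2PC concurrency sets given earlier:

```python
# Transitions of the 2PC participant automaton, as (from, to) pairs.
transitions = [("q", "w"), ("q", "a"), ("w", "a"), ("w", "c")]

def concurrency_set(s):
    """Under one-step synchronisation: s plus the states adjacent to s."""
    adjacent = {b for a, b in transitions if a == s}
    adjacent |= {a for a, b in transitions if b == s}
    return adjacent | {s}

for s in "qwac":
    print(s, sorted(concurrency_set(s)))
# Matches the sets on the earlier slide: cs(q) = cs(a) = {q,w,a},
# cs(w) = {q,w,a,c}, cs(c) = {w,c}.
```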

How to improve 2PC to get a non-blocking protocol
- It is easy to see that the state w is the problematic state, in two ways:
  - it has both Abort and Commit in its concurrency set, and
  - it is a non-committable state, but it has Commit in its concurrency set.
- Solution: add an extra state between w and c (adding one between w and a would not do – why?).
- We are primarily interested in the centralised protocol, but a similar decentralised improvement is possible.

3PC Coordinator (states q, w, p, a, c)
- q → w: a commit-request from the application; send VoteReq to P1,…,Pn
- w → a: timeout, or No from one of P1,…,Pn; send Abort to P1,…,Pn
- w → p: Yes from all of P1,…,Pn; send Prepare to P1,…,Pn
- p → c: Ack from all of P1,…,Pn; send Commit to P1,…,Pn

3PC Participant (states q, w, p, a, c)
- q → a: VoteReq from P0; send No to P0
- q → w: VoteReq from P0; send Yes to P0
- w → a: Abort from P0; send nothing
- w → p: Prepare from P0; send Ack to P0
- p → c: Commit from P0; send nothing

3PC Concurrency Sets (cs)
(Coordinator and participant automata as above.)
cs(p) = {w,p,c}, cs(w) = {q,a,w,p}, etc.
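The two conditions of the Fundamental Non-Blocking Theorem can be checked mechanically against these concurrency sets. A sketch (the sets and committable states are taken from the slides; `a` and `c` denote the abort and commit states):

```python
def non_blocking(cs, committable):
    """Fundamental Non-Blocking Theorem conditions (1) and (2)."""
    # (1) No state's concurrency set contains both an abort and a commit state.
    no_mixed = all(not ({"a", "c"} <= states) for states in cs.values())
    # (2) No non-committable state has a commit state in its concurrency set.
    no_leap = all("c" not in cs[s] for s in cs if s not in committable)
    return no_mixed and no_leap

cs_2pc = {"q": {"q", "w", "a"}, "w": {"q", "w", "a", "c"},
          "a": {"q", "w", "a"}, "c": {"w", "c"}}
cs_3pc = {"q": {"q", "w", "a"}, "w": {"q", "a", "w", "p"},
          "p": {"w", "p", "c"}, "a": {"q", "w", "a"}, "c": {"p", "c"}}

print(non_blocking(cs_2pc, committable={"w", "c"}))  # False: cs(w) has both a and c
print(non_blocking(cs_3pc, committable={"p", "c"}))  # True
```

This confirms the earlier observation: 2PC is blocking because of state w, while 3PC satisfies both conditions.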

3PC and failures
- If there are no failures, then clearly 3PC is correct.
- In the presence of failures, the operational participants should be able to terminate their execution.
- In the centralised case, a need for a termination protocol implies that the coordinator is no longer operational.
- We discuss a general termination protocol. It assumes that at least one participant remains operational and that the participants obey the Fundamental Non-Blocking Theorem.

Termination
- Basic idea: choose a backup coordinator B – by a vote, or using some preassigned ids.
- Backup Coordinator Decision Rule: if B's state contains commit in its concurrency set, commit the transaction; else abort the transaction.
- Reasoning behind the rule: if B's state contains commit in its concurrency set, then it is possible that some site has performed commit – otherwise not.

Re-executing termination
- It is, of course, possible that the backup coordinator fails.
- For this reason, the termination protocol should be executed in such a way that it can be re-executed.
- In particular, the termination protocol must not break the one-step synchronisation.

Implementing termination
- To keep one-step synchronisation, the termination protocol should be executed in two steps:
  1. The backup coordinator B tells the others to make a transition to B's state. The others answer Ok. (This is not necessary if B is in a Commit or Abort state.)
  2. B tells the others to commit or abort according to the decision rule.
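The two steps and the decision rule can be sketched as follows (function and message names are illustrative, not from the paper; `cs_3pc` holds the 3PC concurrency sets from the earlier slide):

```python
def terminate(b_state, cs, send_to_all):
    """Two-step termination run by the backup coordinator B."""
    # Step 1: re-synchronise - everyone moves to B's state, so one-step
    # synchronisation still holds if B itself later fails and the
    # termination protocol has to be re-executed.
    if b_state not in ("a", "c"):
        send_to_all(f"move to {b_state}")  # the others answer Ok
    # Step 2: the decision rule - commit iff some site may have committed.
    decision = "commit" if "c" in cs[b_state] else "abort"
    send_to_all(decision)
    return decision

cs_3pc = {"q": {"q", "w", "a"}, "w": {"q", "a", "w", "p"},
          "p": {"w", "p", "c"}, "a": {"q", "w", "a"}, "c": {"p", "c"}}

sent = []
print(terminate("p", cs_3pc, sent.append))  # commit: c is in cs(p)
print(terminate("w", cs_3pc, sent.append))  # abort: c is not in cs(w)
```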

Fundamental Non-Blocking Theorem Proof – Sufficiency
- The basic termination procedure and decision rule are valid for any protocol that fulfils the conditions given in the Fundamental Non-Blocking Theorem.
- The existence of a termination protocol completes the proof.

Discussion
- Clearly, an Abort decision can be reached in 3PC just as in 2PC in the straightforward case.
- However, Commit requires extra messages.
- The bad thing here is that in practice nearly all transactions end with a Commit. This way, nearly all commits require extra messages.
- Unfortunately, we cannot create a non-blocking protocol by adding a "pre-abort" state instead of the "pre-commit" state.

Back to the real world
- For the creation of non-blocking protocols, we assumed that the network may not partition.
- However, in reality it is possible that the network partitions.
- In 2PC this simply means blocking.
- Let's see what 3PC does if the network partitions.

Assume the following situation…
Five participants: three in state p and two in state w. (In the figure, the circles are participants and the letters are their states.)

…then the network partitions…
The same five participants (three in p, two in w) are now split into two groups by the partition.

…3PC termination kicks in…
Termination starts in both groups, as each thinks the non-responsive sites have failed. Backup Coordinator 1 decides Commit; Backup Coordinator 2 decides Abort.

…final result.
The participants end in states c, c, a, a, a. Conclusion: we achieved non-blocking behaviour if the network does not partition; if it does partition, the protocol is no longer correct.

Usefulness of 3PC
- The previous example suggests that 3PC is in fact not useful at all in practice. Can we find a fix?
- Possibility 1: allow a group to elect a backup coordinator and terminate only if it contains a majority of the original sites.
- Possibility 2: allow a group to recover only if it contains two different states (and there can be at most two).
- Note that Possibilities 1 and 2 are not compatible.

Possibilities 1 and 2 used at the same time
Five participants, one in state p and four in state w, partitioned into a left group {p, w} and a right group {w, w, w}. The group on the left contains two different states; the group on the right contains a majority of the original sites. Backup Coordinator 1 decides Commit, Backup Coordinator 2 decides Abort.
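The incompatibility can be made concrete with a small sketch (an illustration of the example above, not from the lecture; a group is given as the list of states its members are in):

```python
TOTAL_SITES = 5

def rule1_majority(group):
    """Possibility 1: terminate only with a majority of the original sites."""
    return len(group) > TOTAL_SITES // 2

def rule2_two_states(group):
    """Possibility 2: recover only if the group contains two different states."""
    return len(set(group)) == 2

left, right = ["p", "w"], ["w", "w", "w"]

# Under rule 1 alone, only the right group may terminate;
# under rule 2 alone, only the left group may.
print(rule1_majority(left), rule1_majority(right))    # False True
print(rule2_two_states(left), rule2_two_states(right))  # True False
```

Used together, each rule admits a different group: the left group (which sees state p) terminates with Commit, the right group (all in w) terminates with Abort, so atomicity is violated.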

3PC Recovery
- As in 2PC, a recovering participant needs to consult the operational participants and ask about the fate of the transaction.
- Also, the participants need to write logs as in 2PC – the details are left as an exercise.

Conclusions on 3PC
- 3PC can in practice be used to resolve some situations where 2PC would block.
- The costs are increased communication and implementation complexity.
- Also, one needs to understand when termination is possible in practice.

PRACTICAL DISTRIBUTED COMMIT
Based on the paper "Practical distributed commit in modern environments" by Nummenmaa and Thanisch.

2PC for Distributed Commit
The coordinator sends a Vote-Request to the participants and collects Yes or No votes; it then multicasts the decision: Commit if all voted Yes, otherwise Abort.

2PC – a timeout occurs
The coordinator sends a Vote-Request and waits for Yes or No votes; a timeout occurs, and the coordinator multicasts ABORT. Q: Is this good? A (as we will see): Maybe.

Why would the timeout mechanism be good?
- Because it may be that some of the participating processes are holding resources that are needed for other transactions.
  - Holding these resources may reduce the throughput of transaction processing, which, of course, is a bad thing.
  - The timeout mechanism may help to find out that something is wrong.

Why would the timeout mechanism not be good? / 1
- Because, given the different types of failures, it may be extremely difficult to figure out a "good" timeout period, even with dynamically adjusted statistics.
  - This is assuming that the timeout is meant to be used to detect failures.

Why would the timeout mechanism not be good? / 2
- Because it may be that none of the participating processes are holding resources that are needed for other transactions.
  - In this case, we should allow the processes to keep holding their locks on the resources.
  - Rolling the transaction back will only lead to either unnecessarily repeating some processing or a lost transaction.

Why would the timeout mechanism not be good? / 3
- Because it may be that some of the participating processes are holding resources that are needed for other transactions, and the timeout comes too late to save the performance.

Why is this happening?
- The traditional problem definition for atomic distributed commit is not really related to overall system performance.
- An impractical problem definition gives impractical protocols.
- Currently, the protocols first try to reach a commit decision, and after a timeout they try to reach an abort decision, regardless of other factors.

Traditional problem definition for distributed atomic commit / 1
- (1) A participant can vote Yes or No and may not change its vote.
- (2) A participant can decide either Abort or Commit and may not change the decision.
- (3) If any participant votes No, then the global decision must be Abort.
- (4) It must never happen that one participant decides Abort and another decides Commit.

Traditional problem definition for distributed atomic commit / 2
- (5) All participants that execute sufficiently long must eventually decide, regardless of whether they have failed earlier or not.
- (6) If there are no failures or suspected failures and all participants vote Yes, then the decision must not be Abort. The participants are not allowed to create artificial failures or suspicions.

What kind of protocols does the traditional problem definition give?
- First, the protocols try to reach a commit decision, regardless of overall system performance.
- After a timeout, the protocols try to reach an abort decision, again regardless of overall system performance.

What should be changed in the problem definition?
- (1) A participant can vote Yes or No. Having voted, it can try to change its vote.
- (6) If the transaction can be committed and it is feasible to do so for overall efficiency, the decision must be Commit. If this is not the case and it is still possible to abort the transaction, the decision must be Abort.
  - The earlier version of (6) was about failures.

Interactive 2PC
The coordinator sends a Vote-Request and collects Yes or No votes. If the coordinator gets a Cancel message (initiated by a local resource manager) before multicasting a decision, it decides to abort and multicasts ABORT.

Interactive 2PC – Observations
- There is no need for timeouts for participant failure (only for coordinator failure).
- There is no need to estimate the transaction's duration.
- The mechanism works regardless of the duration of the transaction.
- It is possible to adjust the opinion about the feasibility of the transaction based on the changing situation with lock requests.
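The coordinator side of Interactive 2PC can be sketched as a simple event loop (illustrative only; message names are assumptions, and a real coordinator would also log its decision):

```python
def i2pc_coordinator(events, n_participants):
    """Process incoming messages in arrival order until a decision is made.

    A Cancel from any local resource manager aborts the transaction,
    but only as long as no decision has been multicast yet.
    """
    yes_votes = 0
    for msg in events:
        if msg in ("Cancel", "No"):
            return "Abort"            # no decision sent yet, so abort is allowed
        if msg == "Yes":
            yes_votes += 1
            if yes_votes == n_participants:
                return "Commit"       # decision made; later Cancels are ignored
    return None                       # still waiting, no decision yet

print(i2pc_coordinator(["Yes", "Cancel", "Yes"], 3))  # Abort
print(i2pc_coordinator(["Yes", "Yes", "Yes"], 3))     # Commit
```

Note that no timeout appears anywhere in the loop: the abort is driven by an explicit Cancel rather than by an estimate of how long the transaction should take.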

Interactive 2PC – Termination
- If a participant gets fed up waiting for the coordinator, it had better check that the coordinator is still alive – the coordinator may simply not have been able to decide and therefore has not sent any messages.

2PC with deadlines
The coordinator sends a Vote-Request; the participants reply Yes (with a deadline) or No. Along with their commit votes, the participants tell how long they are willing to wait, based on local resource manager estimation. A timeout occurs based on these deadlines, and the coordinator multicasts ABORT.
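A minimal sketch of how the coordinator might derive its timeout from the votes rather than guessing one in advance (the vote representation is an assumption for illustration):

```python
def coordinator_timeout(votes):
    """votes: list of ('Yes', deadline_seconds) or ('No', None) tuples.

    Returns how long the coordinator may wait before it must decide,
    or None if a No vote means it can abort immediately.
    """
    if any(v == "No" for v, _ in votes):
        return None                      # abort right away, no waiting needed
    # Each Yes vote carries how long that participant is willing to wait;
    # the coordinator must decide by the earliest of these deadlines.
    return min(deadline for _, deadline in votes)

print(coordinator_timeout([("Yes", 30), ("Yes", 5), ("Yes", 12)]))  # 5
print(coordinator_timeout([("Yes", 30), ("No", None)]))             # None
```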

Number of messages
- The deadline protocol does not imply extra messages.
- The interactive protocol implies extra messages only for Cancel.
  - The need for extra messages is low. If the information about the abort (due to a Cancel message) reaches some participants before they have voted, then the overall number of messages may even drop.

Overall performance
- It is easy to see that the new protocols provide more flexibility, which supports overall performance.
- The more often the example situations occur, the more the overall performance improves.

Conclusions
- The interactive protocol provides the most flexibility.
- There is no real advantage to using basic 2PC over Interactive 2PC (I2PC).
- To benefit from I2PC, the dialogue between the local resource manager and the local participant needs to be improved.
- If you want to use timeouts, it might be better to set them based on deadlines.