Non-blocking Atomic Commitment Aaron Kaminsky Presenting Chapter 6 of Distributed Systems, 2nd edition, 1993, ed. Mullender.



Agenda
- Atomic Commitment Problem
- Model and Terminology
- One-Phase Commit (1PC)
- Generic Atomic Commitment Protocol (ACP)
- Two-Phase Commit (2PC)
- Non-Blocking ACP
- Three-Phase Commit (3PC)

Atomic Commitment
- A distributed transaction involves different processes operating on local data
- Partial failure can result in an inconsistent state
- Atomic commitment: either all processes commit or all abort

System Model
- Distributed system using messages for communication
- Synchronous model:
  - Bounds exist and are known for process speeds
  - Bounds exist and are known for message delays

Communication
- Assume reliable communication
- Assume a known upper bound on processing and transmission delays
- δ = time units between send and receive (includes processing at sender and receiver)
- Timeouts can be used to detect process failure

Process
- Operational: executes its program
- Down: performs no action
- Crash: the transition from operational to down
- Correct: a process that has never crashed
- Faulty: a process that has crashed

Distributed Transactions
- Each participant updates local data
- The invoker begins the transaction by sending a message to all participants containing:
  - that participant's piece of the transaction
  - the list of participants
  - Δc: the time by which the transaction should be concluded

Distributed Transactions (cont.)
- At the end of its processing, each process sets a local variable, vote:
  - vote = YES: the local operation succeeded and its results can be made permanent
  - vote = NO: some failure prevents updating the local data with the results
- Finally, an Atomic Commitment Protocol is used to decide the outcome of the transaction

The Atomic Commitment Problem
- AC1: All participants that decide reach the same decision.
- AC2: If any participant decides COMMIT, then all participants must have voted YES.
- AC3: If all participants vote YES and no failures occur, then all participants decide COMMIT.
- AC4: Each participant decides at most once (that is, a decision is irreversible).
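The safety conditions AC1–AC3 can be checked mechanically over the outcome of a single run. The sketch below is illustrative only (the function name and encoding are hypothetical, not from the paper): each participant is summarized as a (vote, decision) pair, with decision None if it never decided.

```python
# Hypothetical checker for AC1-AC3 over one protocol run.
# outcomes: list of (vote, decision); decision is None if the participant
# never decided (e.g. it crashed before deciding).

def satisfies_ac(outcomes, failures_occurred):
    votes = [v for v, _ in outcomes]
    decisions = [d for _, d in outcomes if d is not None]
    # AC1: all participants that decide reach the same decision.
    ac1 = len(set(decisions)) <= 1
    # AC2: a COMMIT decision requires that every participant voted YES.
    ac2 = "COMMIT" not in decisions or all(v == "YES" for v in votes)
    # AC3: all-YES votes and no failures force everyone to decide COMMIT.
    ac3 = (failures_occurred or any(v != "YES" for v in votes)
           or all(d == "COMMIT" for _, d in outcomes))
    return ac1 and ac2 and ac3

# A run where one NO vote makes everyone abort is valid:
ok = satisfies_ac([("YES", "ABORT"), ("NO", "ABORT")], failures_occurred=False)
# A run that commits despite a NO vote violates AC2:
bad = satisfies_ac([("YES", "COMMIT"), ("NO", "COMMIT")], failures_occurred=False)
```

AC4 (irreversibility) is a property of the protocol's behavior over time, so it cannot be checked from a single outcome snapshot like this.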

One-Phase Commit Protocol
- Elect a coordinator
- The coordinator tells all participants whether or not to locally commit their results
- Cannot handle the failure of a participant: the coordinator decides without ever hearing whether the participants can commit

1PC In Action: the coordinator sends COMMIT to participants P1 and P2.

Generic Atomic Commitment Protocol (ACP)
- A modification of 2PC: the broadcast algorithm is left undefined
- C_know = local time at which a participant learns of the transaction
- Δc = upper bound on the time from C_know until the coordinator concludes the transaction
- Δb = upper bound on the time from broadcast of a message to its delivery

ACP Coordinator Algorithm

    send [VOTE_REQUEST] to all participants
    set timeout to local_clock + 2δ
    wait for [vote: vote] from all participants
        if all votes = YES then
            broadcast COMMIT to all participants
        else
            broadcast ABORT to all participants
    on timeout
        broadcast ABORT to all participants

ACP Participant Algorithm

    set timeout to (C_know + Δc + δ)
    wait for [VOTE_REQUEST] from the coordinator
        send [vote: vote] to the coordinator
        if (vote = NO) then
            decide(ABORT)
        else
            set timeout to (C_know + Δc + δ + Δb)
            wait for delivery of the decision message
                if (decision = ABORT) then decide(ABORT)
                else decide(COMMIT)
            on timeout
                decide according to a termination protocol
    on timeout
        decide(ABORT)

SB1: A Simple Broadcast Algorithm

    // broadcaster executes:
    send [DLV: m] to all processes in G
    deliver m

    // process p != broadcaster in G executes:
    upon (receipt of [DLV: m])
        deliver m
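A small model makes SB1's weakness concrete: if the broadcaster crashes partway through its sends, delivery is uneven. This sketch is an assumption-laden simplification (function and variable names are hypothetical; local delivery by the broadcaster is not modeled), meant only to show the failure mode.

```python
# Minimal model of SB1: the broadcaster sends [DLV: m] to every process in
# turn. If it crashes after reaching only some of them, only those deliver m,
# so SB1 does not provide Uniform Agreement (property B4, introduced later).

def sb1_broadcast(m, group, crash_after=None):
    """crash_after: number of sends completed before the broadcaster
    crashes (None = no crash). Returns the set of processes delivering m."""
    reached = group if crash_after is None else group[:crash_after]
    return set(reached)

everyone = ["p1", "p2", "p3"]
full = sb1_broadcast("commit", everyone)                    # all deliver
partial = sb1_broadcast("commit", everyone, crash_after=1)  # only p1 delivers
```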

Properties of SB1
- B1 (Validity): If a correct process broadcasts a message m, then all correct processes in G eventually deliver m.
- B2 (Integrity): For any message m, each process in G delivers m at most once, and only if some process actually broadcast m.
- B3 (Δb-Timeliness): There exists a known constant Δb such that if the broadcast of m is initiated at real time t, no process in G delivers m after real time t + Δb.

Combine to Get ACP-SB
- ACP with SB1 as the broadcast primitive is equivalent to 2PC as presented in the Tanenbaum text
- The paper proves that this protocol solves the Atomic Commitment Problem as defined earlier

ACP-SB In Action: the coordinator initiates the vote by sending VOTE_REQUEST to P1 and P2.

ACP-SB In Action: the coordinator receives the participants' votes (here YES from P1 and NO from P2).

ACP-SB In Action: the coordinator broadcasts the decision (ABORT) to the participants.

Blocking
- ACP-SB can result in blocking when the coordinator goes down
- Traditional solution: poll peers to determine the decision
- It can still happen that participants must block and wait for the coordinator to recover
- Meanwhile, resources are not released

Blocking Example: the coordinator receives all YES votes from P1 and P2.

Blocking Example: the coordinator and P2 go down before P1 receives COMMIT; P1 must block until the coordinator recovers.

The Non-Blocking Atomic Commitment Problem
- Now the goal is to prevent blocking
- Add a new requirement to the protocol:
- AC5: Every correct participant that executes the atomic commitment protocol eventually decides.

Uniform Timed Reliable Broadcast (UTRB)
- To B1–B3 (Validity, Integrity, and Δb-Timeliness), add another requirement:
- B4 (Uniform Agreement): If any process (correct or not) in G delivers a message m, then all correct processes in G eventually deliver m.
- No more blocking…

ACP-UTRB
- Changes to ACP-SB:
  - Use UTRB instead of SB to broadcast decisions
  - When a participant times out waiting for a decision message, it simply aborts instead of running a termination protocol
- The second change is what eliminates blocking from ACP

UTRB1: A Simple UTRB Algorithm

    // broadcaster executes:
    send [DLV: m] to all processes in G
    deliver m

    // process p != broadcaster in G executes:
    upon (first receipt of [DLV: m])
        send [DLV: m] to all processes in G
        deliver m
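The relay rule is what buys Uniform Agreement: a process re-sends m to the whole group before delivering it, so anyone who delivered must already have relayed. The sketch below models only that propagation (names and the crash encoding are hypothetical, and timing is elided).

```python
# Sketch of UTRB1 propagation. The broadcaster reaches only the first
# `broadcaster_crash_after` processes before crashing; every receiver
# relays [DLV: m] to the whole group on *first* receipt, then delivers.

def utrb1_broadcast(m, group, broadcaster_crash_after):
    """Returns the set of processes that eventually deliver m."""
    delivered = set()
    pending = list(group[:broadcaster_crash_after])  # initial sends
    while pending:
        p = pending.pop()
        if p in delivered:
            continue            # relay happens only on first receipt
        pending.extend(group)   # relay to everyone in G, then deliver
        delivered.add(p)
    return delivered

group = ["p1", "p2", "p3"]
# Even if the broadcaster crashed after reaching only p1, relaying
# spreads m to the whole group:
result = utrb1_broadcast("commit", group, broadcaster_crash_after=1)
```

Compare with the SB1 model: there, a crash after one send left a single process holding the message; here that single receipt is enough for everyone to deliver.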

ACP-UTRB1 In Action: the coordinator receives YES votes from P1 and P2, as before.

ACP-UTRB1 In Action: P2 relays COMMIT before it goes down; otherwise it could not have delivered the COMMIT message itself.

Performance
- Modular: cost = cost of ACP + cost of one instance of UTRB
- Time delay = 2δ + (F + 1)δ = (F + 3)δ
- Message complexity = 2n + n²
- n = number of participants; F = maximum number of participants that may crash during the execution
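Spelling the cost figures out for a concrete run can make them easier to sanity-check. This is just the slide's arithmetic wrapped in a function (the function name and chosen parameter values are illustrative, not from the paper).

```python
# Cost of ACP-UTRB1, per the figures above:
#   time delay      = 2*delta (voting) + (F + 1)*delta (UTRB1) = (F + 3)*delta
#   message count   = 2n (vote round trip) + n^2 (UTRB1 relaying)

def acp_utrb1_cost(n, F, delta):
    time_delay = (F + 3) * delta
    messages = 2 * n + n * n
    return time_delay, messages

# Example: 5 participants, at most 1 crash, delta = 10 time units.
t, m = acp_utrb1_cost(n=5, F=1, delta=10)
```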

Message-Efficient UTRB
- Use rotating coordinators: instead of every process broadcasting to all others, one process at a time takes over in case of failure
- Adds delay for determining that the current coordinator is down and for a process to notify the new coordinator
- Message complexity drops from n² + n to n + (f + 1)2n
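The trade-off is easy to quantify: the rotating-coordinator variant pays in latency but wins on messages whenever few crashes actually occur. The comparison below just evaluates the two formulas from the slide (function names are hypothetical; f is the number of crashes in the actual run, with f ≤ F).

```python
# Message counts for the two UTRB variants, per the formulas above.

def utrb1_messages(n):
    return n * n + n            # every process relays to everyone

def rotating_messages(n, f):
    return n + (f + 1) * 2 * n  # one coordinator at a time, f failovers

# With n = 10 participants and a single crash, 110 messages become 50:
saving = utrb1_messages(10) - rotating_messages(10, f=1)
```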

Other Modifications to UTRB
- More time-efficient: be pessimistic
  - Do not wait to be sure that the latest coordinator is down
  - Ask the next coordinator after a much shorter wait
- Terminate early: detect early that the coordinator is down and abort without waiting out the full timeout

Three-Phase Commit Protocol (3PC)
- The coordinator requests a vote
- If any process votes NO, the coordinator broadcasts ABORT
- If all processes vote YES, the coordinator broadcasts PRECOMMIT
- When all processes acknowledge the PRECOMMIT, the coordinator broadcasts COMMIT
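The coordinator's side of those steps can be sketched as a tiny message-flow function. This is a simplification under stated assumptions (failure handling and the participants' state machines are omitted; the function name is hypothetical); it only traces which broadcasts the coordinator makes on each path.

```python
# Sketch of the 3PC coordinator's broadcasts for a failure-free run.

def three_pc_coordinator(votes):
    """Returns the sequence of messages the coordinator broadcasts."""
    sent = ["VOTE_REQUEST"]
    if votes and all(v == "YES" for v in votes):
        sent.append("PRECOMMIT")
        # ...coordinator waits here for ACKs from all participants...
        sent.append("COMMIT")
    else:
        sent.append("ABORT")
    return sent

happy = three_pc_coordinator(["YES", "YES"])  # extra PRECOMMIT round
unhappy = three_pc_coordinator(["YES", "NO"])
```

The extra round is the point: no participant commits until everyone has acknowledged reaching the PRECOMMIT state, which is what the blocking analysis below relies on.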

3PC In Action: the coordinator sends VOTE_REQUEST to P1 and P2.

3PC In Action: the participants respond with YES or NO.

3PC In Action: if all participants respond YES, the coordinator broadcasts PRECOMMIT.

3PC In Action: the coordinator waits for acknowledgements (ACKs) of the PRECOMMIT.

3PC In Action: now the coordinator can broadcast the COMMIT message.

3PC (cont.)
- A crashed participant cannot recover and commit while other participants are still waiting for a decision
- If the coordinator fails, the participants determine the correct action from one another
- The extra PRECOMMIT state removes the uncertainty that causes blocking, so the survivors can always decide

Conclusion
- 2PC allows atomic commitment of transactions, but it is blocking
- Changing the properties of the broadcast primitive yields a non-blocking protocol (ACP-UTRB)
- Adding a phase can also prevent blocking (3PC)
- Is this really necessary? Rarely.