EEC-681/781 Distributed Computing Systems Lecture 12 Wenbing Zhao Cleveland State University.

Slides:



Advertisements
Similar presentations
TRANSACTION PROCESSING SYSTEM ROHIT KHOKHER. TRANSACTION RECOVERY TRANSACTION RECOVERY TRANSACTION STATES SERIALIZABILITY CONFLICT SERIALIZABILITY VIEW.
Advertisements

Principles of Transaction Management. Outline Transaction concepts & protocols Performance impact of concurrency control Performance tuning.
Topic 6.3: Transactions and Concurrency Control Hari Uday.
Lecture 11 Recoverability. 2 Serializability identifies schedules that maintain database consistency, assuming no transaction fails. Could also examine.
1 Chapter 3. Synchronization. STEMPusan National University STEM-PNU 2 Synchronization in Distributed Systems Synchronization in a single machine Same.
COS 461 Fall 1997 Transaction Processing u normal systems lose their state when they crash u many applications need better behavior u today’s topic: how.
Database Systems, 8 th Edition Concurrency Control with Time Stamping Methods Assigns global unique time stamp to each transaction Produces explicit.
Quick Review of Apr 29 material
Jan. 2014Dr. Yangjun Chen ACS Database recovery techniques (Ch. 21, 3 rd ed. – Ch. 19, 4 th and 5 th ed. – Ch. 23, 6 th ed.)
Transaction Processing Lecture ACID 2 phase commit.
Concurrency Control and Recovery In real life: users access the database concurrently, and systems crash. Concurrent access to the database also improves.
Computer Science Lecture 12, page 1 CS677: Distributed OS Last Class Distributed Snapshots –Termination detection Election algorithms –Bully –Ring.
CS 582 / CMPE 481 Distributed Systems Concurrency Control.
Transaction Management and Concurrency Control
Database management concepts Database Management Systems (DBMS) An example of a database (relational) Database schema (e.g. relational) Data independence.
Transaction Management and Concurrency Control
EEC 693/793 Special Topics in Electrical Engineering Secure and Dependable Computing Lecture 12 Wenbing Zhao Department of Electrical and Computer Engineering.
Database Systems: Design, Implementation, and Management Eighth Edition Chapter 10 Transaction Management and Concurrency Control.
Synchronization Part 2 REK’s adaptation of Claypool’s adaptation ofTanenbaum’s Distributed Systems Chapter 5 and Silberschatz Chapter 17.
Transaction Management
Database Systems: Design, Implementation, and Management Eighth Edition Chapter 10 Transaction Management and Concurrency Control.
Synchronization Chapter 5. Clock Synchronization When each machine has its own clock, an event that occurred after another event may nevertheless be assigned.
CMPT Dr. Alexandra Fedorova Lecture XI: Distributed Transactions.
CMPT Dr. Alexandra Fedorova Lecture XI: Distributed Transactions.
Transactions and concurrency control
Synchronization Chapter 6 Part III Transactions. –Most of the lecture notes are based on slides by Prof. Jalal Y. Kawash at Univ. of Calgary –Some slides.
TRANSACTIONS AND CONCURRENCY CONTROL Sadhna Kumari.
CIS 720 Concurrency Control. Locking Atomic statement –Can be used to perform two or more updates atomically Th1: …. ;……. Th2:…………. ;…….
BIS Database Systems School of Management, Business Information Systems, Assumption University A.Thanop Somprasong Chapter # 10 Transaction Management.
Transaction Communications Yi Sun. Outline Transaction ACID Property Distributed transaction Two phase commit protocol Nested transaction.
Distributed Transactions Chapter 13
Real-Time & MultiMedia Lab Synchronization Chapter 5.
1 Mutual Exclusion: A Centralized Algorithm a)Process 1 asks the coordinator for permission to enter a critical region. Permission is granted b)Process.
Concurrency Server accesses data on behalf of client – series of operations is a transaction – transactions are atomic Several clients may invoke transactions.
TRANSACTION MANAGEMENT R.SARAVANAKUAMR. S.NAVEEN..
1 Transactions Chapter Transactions A transaction is: a logical unit of work a sequence of steps to accomplish a single task Can have multiple.
Global State (1) a)A consistent cut b)An inconsistent cut.
Synchronization Chapter 5. Outline 1.Clock synchronization 2.Logical clocks 3.Global state 4.Election algorithms 5.Mutual exclusion 6.Distributed transactions.
Computer Science Lecture 13, page 1 CS677: Distributed OS Last Class: Canonical Problems Distributed synchronization and mutual exclusion Distributed Transactions.
Computer Science Lecture 13, page 1 CS677: Distributed OS Last Class: Canonical Problems Election algorithms –Bully algorithm –Ring algorithm Distributed.
Transactions and Concurrency Control. Concurrent Accesses to an Object Multiple threads Atomic operations Thread communication Fairness.
7c.1 Silberschatz, Galvin and Gagne ©2003 Operating System Concepts with Java Module 7c: Atomicity Atomic Transactions Log-based Recovery Checkpoints Concurrent.
Transaction Management Overview. Transactions Concurrent execution of user programs is essential for good DBMS performance. – Because disk accesses are.
Concurrency Control Introduction Lock-Based Protocols
Transaction Management Transparencies. ©Pearson Education 2009 Chapter 14 - Objectives Function and importance of transactions. Properties of transactions.
Clock Synchronization When each machine has its own clock, an event that occurred after another event may nevertheless be assigned an earlier time.
9 1 Chapter 9_B Concurrency Control Database Systems: Design, Implementation, and Management, Rob and Coronel.
Multidatabase Transaction Management COP5711. Multidatabase Transaction Management Outline Review - Transaction Processing Multidatabase Transaction Management.
Synchronization CSCI 4900/6900. Transactions Protects data and allows processes to access and modify multiple data items as a single atomic transaction.
Synchronization Chapter 5. Clock Synchronization When each machine has its own clock, an event that occurred after another event may nevertheless be assigned.
Lecture on Synchronization Submitted by
Transactions Chapter 12 Transactions and Concurrency Control.
Computer Science Lecture 13, page 1 CS677: Distributed OS Last Class: Canonical Problems Election algorithms –Bully algorithm –Ring algorithm Distributed.
Chapter 13 Managing Transactions and Concurrency Database Principles: Fundamentals of Design, Implementation, and Management Tenth Edition.
Atomic Tranactions. Sunmeet Sethi. Index  Meaning of Atomic transaction.  Transaction model Types of storage. Transaction primitives. Properties of.
1 Concurrency Control. 2 Why Have Concurrent Processes? v Better transaction throughput, response time v Done via better utilization of resources: –While.
Distributed Databases – Advanced Concepts Chapter 25 in Textbook.
Database recovery techniques
Transaction Management
Last Class: Canonical Problems
Transaction Management and Concurrency Control
Synchronization Chapter 5C
Outline Announcements Fault Tolerance.
Chapter 10 Transaction Management and Concurrency Control
Database management concepts
Introduction of Week 13 Return assignment 11-1 and 3-1-5
Distributed Transactions
UNIT -IV Transaction.
Transaction Communication
Transactions, Properties of Transactions
Presentation transcript:

EEC-681/781 Distributed Computing Systems Lecture 12 Wenbing Zhao Cleveland State University

2 Fall Semester 2006EEC-681: Distributed Computing SystemsWenbing Zhao Outline Project report requirement Transaction processing concepts Distributed transaction and two phase commit Midterm #2 –12/6 Wednesday

3 Fall Semester 2006EEC-681: Distributed Computing SystemsWenbing Zhao Project Report Requirement Theory track –Introduction: define the problem and provide motivation why we need a solution –Background: so that readers can understand the techniques used to solve the problem –Current state of the art: what are the fundamental techniques used to solve the problem. Ideally, provide a taxonomy of the techniques –Open issues and future research directions: what are the hard problems remaining to be solved?

4 Fall Semester 2006EEC-681: Distributed Computing SystemsWenbing Zhao Project Report Requirement Implementation track –Introduction: define the problem domain and your implementation. Provide motivation on your system –System model: assumption, restrictions, models –Design: component diagram, class diagram, pseudo code, algorithms, header explanation –Implementation: what language, tools, libraries did you use, a simple user guide on how to user your system –Performance and testing: throughput, latency, test cases –Related work –Conclusion and future work

5 Fall Semester 2006EEC-681: Distributed Computing SystemsWenbing Zhao Project Requirement What you should NOT do –Take an application from Internet or your friend => F grade –False claim of working prototype, fabricate performance data and test cases => F grade –Use other’s slides for presentation What you should do –If used any open source code, acknowledge it in both your source code and your report, and provide reference –Extensively comment your code –Follow good naming and coding conventions –Use a source version control system, such as cvs, svn –If your code does not work, acknowledge it in your report

6 Fall Semester 2006EEC-681: Distributed Computing SystemsWenbing Zhao Project Report Requirement Report format: IEEE Transactions format pages –MS Word Template –LaTex Template IEEEtran.zip (main text) IEEEtranBST.zip (bibliography) Report due: Dec 13 mid-night ( electronic copy of the report & source code is required )

7 Fall Semester 2006EEC-681: Distributed Computing SystemsWenbing Zhao Why Transaction Processing? To achieve a form of fault tolerance –If something bad happens in a middle of a set of operations, we abort and rollback to the original state

8 Fall Semester 2006EEC-681: Distributed Computing SystemsWenbing Zhao Transaction and ACID Properties Atomicity: All operations either succeed, or all of them fail. When the transaction fails, the state of the object will remain unaffected by the transaction. Consistency: A transaction establishes a valid state transition. Isolation: Concurrent transactions do not interfere with each other. It appears to each transaction T that other transactions occur either before T, or after T, but never both. Durability: After the execution of a transaction, its effects are made permanent: changes to the state survive failures. A transaction is a collection of operations on the state of an object (database, object composition, etc.) that satisfies the following properties:

9 Fall Semester 2006EEC-681: Distributed Computing SystemsWenbing Zhao Primitives for Transactions PrimitiveDescription BEGIN_TRANSACTIONMake the start of a transaction END_TRANSACTIONTerminate the transaction and try to commit ABORT_TRANSACTIONKill the transaction and restore the old values READRead data from a file, a table, or otherwise WRITEWrite data to a file, a table, or otherwise BEGIN_TRANSACTION reserve WP -> JFK; reserve JFK -> Nairobi; reserve Nairobi -> Malindi; END_TRANSACTION (a) BEGIN_TRANSACTION reserve WP -> JFK; reserve JFK -> Nairobi; reserve Nairobi -> Malindi full => ABORT_TRANSACTION (b) Example transactions

10 Fall Semester 2006EEC-681: Distributed Computing SystemsWenbing Zhao Transaction Classification Flat transactions: a sequence of operations that satisfies the ACID properties (the most common one) Nested transactions: A hierarchy of transactions that allows –Concurrent processing of subtransactions, and –Recovery per subtransaction Distributed transactions: A (flat) transaction that span multiple databases distributed across the network

11 Fall Semester 2006EEC-681: Distributed Computing SystemsWenbing Zhao Implementation of Transactions Private workspace Writeahead log

12 Fall Semester 2006EEC-681: Distributed Computing SystemsWenbing Zhao Private Workspace The file index and disk blocks for a three-block file The situation after a transaction has modified block 0 and appended block 3 After committing A transaction gets its own copy of the (part of the) database. When things go wrong delete copy, otherwise commit the changes to the original

13 Fall Semester 2006EEC-681: Distributed Computing SystemsWenbing Zhao Writeahead Log x = 0; y = 0; BEGIN_TRANSACTION; x = x + 1; y = y + 2 x = y * y; END_TRANSACTION; (a) Log [x = 0 / 1] (b) Log [x = 0 / 1] [y = 0 / 2] (c) Log [x = 0 / 1] [y = 0 / 2] [x = 1 / 4] (d) A transaction The log before & after each statement is executed Use a writeahead log in which changes are recorded allowing one to roll back when things go wrong

14 Fall Semester 2006EEC-681: Distributed Computing SystemsWenbing Zhao Concurrency Control Goal: Increase efficiency by allowing several transactions to execute at the same time Constraint: Effect should be the same as if the transactions were executed in some serial order General organization of managers for handling transactions

15 Fall Semester 2006EEC-681: Distributed Computing SystemsWenbing Zhao Concurrency Control General organization of managers for handling distributed transactions

16 Fall Semester 2006EEC-681: Distributed Computing SystemsWenbing Zhao Serializability Consider a collection E of transactions T1, … Tn Goal is to conduct a serializable execution of E: –Transactions in E are possibly concurrently executed according to some schedule S –Schedule S is equivalent to some totally ordered execution of T1, … Tn Two operations Op(Ti, x) and Op(Tj, x) on the same data item x, and from a set of logs may conflict at a data manager: –read-write conflict (rw): One is a read operation while the other is a write operation on x –write-write conflict (ww): Both are write operations on x

17 Fall Semester 2006EEC-681: Distributed Computing SystemsWenbing Zhao Basic Scheduling Theorem Concurrency control - process conflicting reads and writes in certain relative orders Read-write and write-write conflicts can be synchronized independently, as long as we stick to a total ordering of transactions that is consistent with both types of conflicts

18 Fall Semester 2006EEC-681: Distributed Computing SystemsWenbing Zhao Synchronization Techniques Two-phase locking: Before reading or writing a data item, a lock must be obtained. After a lock is released, the transaction is not allowed to acquire any more locks Timestamp ordering: Operations in a transaction are timestamped, and data managers are forced to handle operations in timestamp order Optimistic control: Don’t prevent things from going wrong, but correct the situation if conflicts actually did happen

19 Fall Semester 2006EEC-681: Distributed Computing SystemsWenbing Zhao Two-phase Locking There are only READ and WRITE operations within transactions Locks are granted and released only by scheduler Locking policy is to avoid conflicts between operations

20 Fall Semester 2006EEC-681: Distributed Computing SystemsWenbing Zhao Two-phase Locking Rule 1: When client submits Op(Ti,x), scheduler tests whether it conflicts with an operation Op(Tj,x) from some other client. If no conflict then grant Op(Ti,x), otherwise delay execution of Op(Ti,x) –Conflicting operations are executed in the same order as that locks are granted Rule 2: If Op(Ti,x) has been granted, do not release the lock until Op(Ti,x) has been executed by data manager –Guarantees LOCK => Op => RELEASE order Rule 3: If RELEASE(Ti,x) has taken place, no more locks for Ti may be granted –Combined with rule 1, guarantees that all pairs of conflicting operations of two transactions are done in the same order

21 Fall Semester 2006EEC-681: Distributed Computing SystemsWenbing Zhao Two-Phase Locking Centralized 2PL: A single site handles all locks Primary 2PL: Each data item is assigned a primary site to handle its locks. Data is not necessarily replicated Distributed 2PL: Assumes data can be replicated. Each primary is responsible for handling locks for its data, which may reside at remote data managers

22 Fall Semester 2006EEC-681: Distributed Computing SystemsWenbing Zhao Two-phase Locking: Problems Problem 1: System can come into a deadlock. How? –Practical solution: put a timeout on locks and abort transaction on expiration. Problem 2: When should the scheduler actually release a lock: –(1) when operation has been executed –(2) when it knows that no more locks will be requested No good way of testing condition (2) unless transaction has been committed or aborted Moreover: Assume the following execution sequence takes place: RELEASE(Ti,x) => LOCK(Tj,x) => ABORT(Ti). Consequence: scheduler will have to abort Tj as well (cascaded aborts) Solution: Release all locks only at commit/abort time (strict two-phase locking)

23 Fall Semester 2006EEC-681: Distributed Computing SystemsWenbing Zhao Strict Two-Phase Locking

24 Fall Semester 2006EEC-681: Distributed Computing SystemsWenbing Zhao Two-Phase Commit – Achieving Atomicity in Distributed Transactions Model: The client who initiated the computation acts as a coordinator; processes required to commit are the participants Phase 1a: Coordinator sends VOTE_REQUEST to participants (also called a pre-write) Phase 1b: When participant receives VOTE_REQUEST it returns either YES or NO to coordinator. If it sends NO, it aborts its local computation Phase 2a: Coordinator collects all votes; if all are YES, it sends COMMIT to all participants, otherwise it sends ABORT Phase 2b: Each participant waits for COMMIT or ABORT and handles accordingly

25 Fall Semester 2006EEC-681: Distributed Computing SystemsWenbing Zhao Two-Phase Commit The finite state machine for the coordinator in 2PC The finite state machine for a participant

26 Fall Semester 2006EEC-681: Distributed Computing SystemsWenbing Zhao 2PC – Failing Participant Initial state: No problem, as participant was unaware of the protocol Ready state: Participant is waiting to either commit or abort. After recovery, participant needs to know which state transition it should make => log the coordinator ’ s decision Abort state: Need to make entry into abort state idempotent Commit state: Also make entry into commit state idempotent Consider participant crash in one of its states, and the subsequent recovery to that state:

27 Fall Semester 2006EEC-681: Distributed Computing SystemsWenbing Zhao 2PC – Failing Coordinator If it fails, the final decision is not available until the coordinator recovers Alternative: Let a participant P in the ready state timeout when it hasn ’ t received the coordinator ’ s decision –P tries to find out what other participants know Question: Can P not succeed in getting the required information?

28 Fall Semester 2006EEC-681: Distributed Computing SystemsWenbing Zhao 2PC – Failing Coordinator Question: Can P not succeed in getting the required information? Observation: Essence of the problem is that a recovering participant cannot make a local decision: it is dependent on other (possibly failed) processes –There might exist one participant that has received a COMMIT decision from the coordinator and subsequently failed (more or less concurrently failed with the coordinator) –The rest of participants cannot unilaterally decide to abort the transaction

29 Fall Semester 2006EEC-681: Distributed Computing SystemsWenbing Zhao Serializability BEGIN_TRANSACTION x = 0 (read x value) x = x + 1; END_TRANSACTION (a) BEGIN_TRANSACTION x = 0; x = x + 2; END_TRANSACTION (b) BEGIN_TRANSACTION x = 0; x = x + 3; END_TRANSACTION (c) Schedule 1y = 0; x = x + 1; x = 0; x = x + 2; x = 0; x = x + 3Legal Schedule 2x = 0; x = 0; x = x + 1; x = x + 2; x = 0; x = x + 3;Illegal (Book is wrong!) Schedule 3x = 0; x = 0; x = x + 1; x = 0; x = x + 2; x = x + 3;Illegal

30 Fall Semester 2006EEC-681: Distributed Computing SystemsWenbing Zhao Serializability For the purpose of serializability analysis, a transaction is modeled as a log of read and write operations Two operations Op(Ti, x) and Op(Tj, x) on the same data item x, and from a set of logs may conflict at a data manager: –read-write conflict (rw): One is a read operation while the other is a write operation on x –write-write conflict (ww): Both are write operations on x

31 Fall Semester 2006EEC-681: Distributed Computing SystemsWenbing Zhao Basic Scheduling Theorem Let T = {T1, … Tn} be a set of transactions and let E be an execution of these transactions modeled by logs {L1, … Ln} E is serializable if there exists a total ordering of T such that for each pair of conflicting operations Oi and Oj from distinct transactions Ti and Tj (respectively), Oi precedes Oj in any log L1, … Ln, if and only if Ti precedes Tj in the total ordering

32 Fall Semester 2006EEC-681: Distributed Computing SystemsWenbing Zhao Basic Scheduling Theorem Concurrency control - process conflicting reads and writes in certain relative orders Read-write and write-write conflicts can be synchronized independently, as long as we stick to a total ordering of transactions that is consistent with both types of conflicts

33 Fall Semester 2006EEC-681: Distributed Computing SystemsWenbing Zhao Timestamp Ordering Basic idea: –Transaction manager assigns a unique timestamp TS(Ti) to each transaction Ti (at the time of creation of the transaction) –Each operation Op(Ti,x) submitted by the transaction manager to the scheduler is timestamped TS(Op(Ti,x)) = TS(Ti) Scheduler adheres to following rule: If Op(Ti,x) and Op(Tj,x) conflict then data manager processes Op(Ti,x) before Op(Tj,x) iff TS(Op(Ti,x)) < TS(Op(Tj,x)) Rather aggressive since if a single Op(Ti,x) is rejected, Ti will have to be aborted

34 Fall Semester 2006EEC-681: Distributed Computing SystemsWenbing Zhao Timestamp Ordering Suppose: TS(Op(Ti,x)) < TS(Op(Tj,x)), but that Op(Tj,x) has already been processed by the data manager. –Then: the scheduler rejects Op(Ti,x), as it came in too late Suppose: TS(Op(Ti,x)) < TS(Op(Tj,x)), and that Op(Ti,x) has already been processed by the data manager. –Then: the scheduler would submit Op(Tj,x) to data manager –Refinement: hold back Op(Tj,x) until Ti commits or aborts Question: Why would we do this?

35 Fall Semester 2006EEC-681: Distributed Computing SystemsWenbing Zhao Optimistic Concurrency Control Observation: (1) Maintaining locks costs a lot; (2) In practice not many conflicts Alternative: Go ahead immediately with all operations, use tentative writes everywhere (shadow copies), and solve conflicts later on Phases: allow operations tentatively validate effects make updates permanent Validation: Check two basic rules for each pair of active transactions Ti and Tj: –Rule 1: Ti must not read or write data that has been written by Tj –Rule 2: Tj must not read or write data that has been written by Ti If one of the rules doesn’t hold: abort one of the transactions