
Transactions: Structure for Distributed Applications Jim Gray Microsoft Format Revised and Some Typing Errors corrected 5-22-2002.


1 Transactions: Structure for Distributed Applications Jim Gray Microsoft Gray@Microsoft.com Format Revised and Some Typing Errors corrected 5-22-2002

2 Gray: Transactions. 2 Outline Why transactions? How transactions.

3 Thesis: Transactions are key to Structuring Distributed Applications
ACID properties ease exception handling:
- Atomic: all or nothing
- Consistent: state transformation
- Isolated: no concurrency anomalies
- Durable: committed transaction effects persist

4 What is a transaction?
Programmer’s view: bracket a collection of actions.
A SIMPLE failure model with only two outcomes:
- Success: Begin(), actions, Commit()
- Failure: Begin(), actions, Rollback() (requested by the program, or forced when an action fails)
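The bracket can be sketched with Python's standard sqlite3 module. This is only an illustration under stated assumptions: the `account` table, the `transfer` helper, and the overdraft check are hypothetical and do not come from the slides.

```python
import sqlite3

def transfer(conn, src, dst, amount):
    """Bracket two updates: either both take effect or neither does."""
    try:
        conn.execute("BEGIN")
        conn.execute("UPDATE account SET bal = bal - ? WHERE id = ?", (amount, src))
        (bal,) = conn.execute("SELECT bal FROM account WHERE id = ?", (src,)).fetchone()
        if bal < 0:
            raise ValueError("insufficient funds")   # an action fails
        conn.execute("UPDATE account SET bal = bal + ? WHERE id = ?", (amount, dst))
        conn.execute("COMMIT")                        # outcome 1: success
        return True
    except Exception:
        conn.execute("ROLLBACK")                      # outcome 2: failure, nothing happened
        return False

def demo():
    # isolation_level=None puts the connection in autocommit mode,
    # so the explicit BEGIN/COMMIT/ROLLBACK statements mark the bracket.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute("CREATE TABLE account (id TEXT PRIMARY KEY, bal INTEGER)")
    conn.executemany("INSERT INTO account VALUES (?, ?)", [("a", 100), ("b", 0)])
    ok1 = transfer(conn, "a", "b", 30)    # commits
    ok2 = transfer(conn, "a", "b", 500)   # rolls back, balances unchanged
    balances = dict(conn.execute("SELECT id, bal FROM account"))
    return ok1, ok2, balances
```

The caller sees only the two outcomes: the second transfer's debit is undone by the rollback, so no partial state is visible.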

5 Why Bother: Atomicity?
RPC semantics:
- at most once: try one time
- at least once: keep trying till acknowledged
- exactly once: keep trying till acknowledged, and the server discards duplicate requests

6 Why Bother: Atomicity?
Example: insert a record in a file.
- at most once: a timeout means "maybe"
- at least once: a retry may get a "duplicate" error or may do a second insert
- exactly once: you do not have to worry
But what if the operation involves inserting several records, or sending several messages?
Want ALL or NOTHING.
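A minimal sketch of exactly-once for a single insert: the client retries until acknowledged and the server discards duplicate requests. The `InsertServer` class, the request IDs, and the simulated lost replies are assumptions made for illustration.

```python
class InsertServer:
    """Hypothetical server that remembers request IDs so retries are harmless."""

    def __init__(self):
        self.records = []
        self.replies = {}   # request_id -> reply already sent for that request

    def insert(self, request_id, record):
        if request_id in self.replies:       # duplicate request: do not insert again
            return self.replies[request_id]
        self.records.append(record)
        reply = ("ok", len(self.records))
        self.replies[request_id] = reply
        return reply


def call_until_acked(server, request_id, record, lost_replies):
    """Client side: keep trying until a reply gets through.

    The first `lost_replies` replies are treated as lost, which simulates timeouts.
    """
    attempts = 0
    while True:
        attempts += 1
        reply = server.insert(request_id, record)
        if attempts > lost_replies:
            return reply, attempts
```

Without the `replies` table the same retry loop would insert the record three times. The sketch covers one operation only; several inserts or several messages still need a transaction.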

7 Why Bother: Consistency
Begin-Commit brackets a set of operations.
You can violate consistency inside the brackets:
- debit but not credit (creates money)
- delete old file before creating new file in a copy
- print document before deleting it from the spool queue
Begin and Commit are points of consistency.
[Figure: state transformations between Begin and Commit, with the new state under construction in between.]

8 Why Bother: Isolation
Running programs concurrently on the same data can create concurrency anomalies (the shared checking account example).
Programming is hard enough without having to worry about concurrency.
[Figure: two transactions interleave on a shared balance. One does Begin(), read BAL, add 10, write BAL, Commit(); the other does Begin(), read BAL, subtract 30, write BAL, Commit(). Both read Bal = 100; one writes Bal = 110 and the other writes Bal = 70, so one of the updates is lost.]

9 Isolation
It is as though programs run one at a time: no concurrency anomalies.
The system automatically protects applications:
- Locking (DB2, Informix, SQLserver, Sybase, ...)
- Versioned databases (Oracle, Interbase, ...)
[Figure: the same two transactions run one after the other. The first reads Bal = 100 and writes Bal = 110; the second reads Bal = 110 and writes Bal = 80.]
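The checking account example can be replayed deterministically. This sketch assumes two transactions labelled T1 (add 10) and T2 (subtract 30); the `run` helper and the schedule encoding are hypothetical.

```python
def run(schedule, balance=100):
    """Execute a list of (transaction, operation) steps against one shared balance."""
    local = {}                        # each transaction's private copy of BAL
    delta = {"T1": +10, "T2": -30}    # T1 adds 10, T2 subtracts 30
    for txn, op in schedule:
        if op == "read":
            local[txn] = balance
        elif op == "write":
            balance = local[txn] + delta[txn]
    return balance


# Both read before either writes: T1's update is overwritten.
interleaved = [("T1", "read"), ("T2", "read"), ("T1", "write"), ("T2", "write")]

# One at a time: both updates survive.
serial = [("T1", "read"), ("T1", "write"), ("T2", "read"), ("T2", "write")]
```

`run(interleaved)` gives 70, while `run(serial)` gives 80, the result the account owner expects.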

10 Why Bother: Durability
Once a transaction commits, want its effects to survive failures.
Fault tolerance: old-master-new-master won’t work.
- Can’t do daily dumps: would lose recent work.
- Want "continuous" dumps.
- Redo most transactions in case of failure.
- Resend unacknowledged messages.

11 ACID Generalizations
Operation taxonomy:
- Unprotected: not undone or redone (temp files)
- Transactional: can be undone before commit (database and message operations)
- Real: cannot be undone (drill a hole in a piece of metal, print a check)
Nested transactions: sub-transactions
Workflow: long-lived transactions

12 Why ACID for Client/Server & Distributed
ACID is important for centralized systems.
Failures in centralized systems are simpler.
In distributed systems:
- more failures, and more independent failures
- ACID is harder to implement
That makes it even MORE IMPORTANT:
- simple failure model
- simple repair model

13 Implementing Transactions
Atomicity:
- the DO/UNDO/REDO protocol
- idempotence
- two phase commit
Durability:
- durable logs
- force at commit
Isolation:
- locking or versioning

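14 DO/UNDO/REDO
Each action generates a log record, which has an UNDO action and a REDO action.
[Figure: DO takes the old state to the new state and writes a log record; UNDO uses the log record to take the new state back to the old state; REDO uses the log record to take the old state to the new state.]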

15 What does a Log Record Look Like
A log record has:
- a header (transaction ID, timestamp, ...)
- item ID
- old value
- new value
For messages: just the message text and sequence number.
For records: old and new value on update.
Keep records small.

16 Transaction is a sequence of actions
Each action changes state:
- changes the database
- sends messages
- operates a display/printer/drill press
Each action leaves a log trail.
[Figure: a chain of DO steps, each taking an old state to a new state and appending a log record.]

17 Transaction UNDO Is Easy
Read the log backwards.
UNDO one step at a time.
Can go halfway back to get nested transactions.
[Figure: a chain of UNDO steps, each using a log record to take a new state back to the old state.]
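The DO/UNDO/REDO idea and the backward read of the log can be sketched as follows. The `LogRecord` fields follow the log record slide (transaction ID, item ID, old value, new value); the in-memory dict used as "state" and the function names are assumptions, and an old value of None stands for "item did not exist".

```python
from dataclasses import dataclass


@dataclass
class LogRecord:
    txn_id: int
    item: str
    old: object   # value before the action (None means the item did not exist)
    new: object   # value after the action


def do(state, log, txn_id, item, new):
    """DO: change the state and leave a log record behind."""
    log.append(LogRecord(txn_id, item, state.get(item), new))
    state[item] = new


def undo(state, rec):
    """UNDO: restore the old value recorded in the log record."""
    if rec.old is None:
        state.pop(rec.item, None)
    else:
        state[rec.item] = rec.old


def redo(state, rec):
    """REDO: reinstall the new value recorded in the log record."""
    state[rec.item] = rec.new


def rollback(state, log, to_length=0):
    """Read the log backwards, undoing one step at a time.

    Stopping at an earlier log length (a savepoint) undoes only part of the work.
    """
    while len(log) > to_length:
        undo(state, log.pop())
```

Remembering `len(log)` at the start of a sub-transaction and rolling back to that length is one way to get the partial undo the slide mentions.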

18 Durability: Protecting the log
When a transaction commits, put its log in a durable place (duplexed disk).
The log is needed to redo the transaction in case of failure:
- system failure (lost in-memory updates)
- media failure (lost disk)
This makes the transaction durable.
The log is a sequential file: it converts random IO to a single sequential IO (see NTFS or newer UNIX file systems).
[Figure: log write.]
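A minimal sketch of force-at-commit on an ordinary file, assuming a JSON-lines log format invented for this example. It uses `flush` and `os.fsync` to push the records to the device before the commit is reported; duplexing the log on two disks is not shown.

```python
import json
import os
import tempfile


def commit_to_log(log_file, txn_id, updates):
    """Append a transaction's records plus a commit record, then force them out."""
    for item, old, new in updates:
        record = {"txn": txn_id, "item": item, "old": old, "new": new}
        log_file.write(json.dumps(record) + "\n")
    log_file.write(json.dumps({"txn": txn_id, "commit": True}) + "\n")
    log_file.flush()               # move Python's buffer to the operating system
    os.fsync(log_file.fileno())    # ask the operating system to write it to the device


def demo():
    path = os.path.join(tempfile.mkdtemp(), "txn.log")
    with open(path, "a", encoding="utf-8") as log_file:   # append-only, sequential writes
        commit_to_log(log_file, 1, [("bal", 100, 110)])
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f]
```

Only after `commit_to_log` returns would the system tell the application that the transaction committed.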

19 Recovery After a System Failure
In case of system failure:
- Reapply the log of all committed transactions (REDO). Force-at-commit ensures the log will survive restart.
- Then UNDO all uncommitted transactions.
[Figure: a chain of REDO steps, each using a log record to take an old state to the new state.]
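The two recovery passes can be sketched over a simple log. The tuple layout of the log entries and the `recover` function are assumptions for illustration; the sketch also assumes locking kept uncommitted and committed updates to the same item from interleaving.

```python
def recover(disk_state, log):
    """Rebuild a consistent state after a crash.

    Log entries are ("update", txn, item, old, new) or ("commit", txn).
    """
    committed = {entry[1] for entry in log if entry[0] == "commit"}
    state = dict(disk_state)

    # REDO pass: read the log forwards and reapply committed updates.
    for entry in log:
        if entry[0] == "update" and entry[1] in committed:
            _, _, item, _, new = entry
            state[item] = new

    # UNDO pass: read the log backwards and reverse uncommitted updates.
    for entry in reversed(log):
        if entry[0] == "update" and entry[1] not in committed:
            _, _, item, old, _ = entry
            state[item] = old

    return state


log = [
    ("update", "T1", "a", 100, 70),
    ("update", "T1", "b", 0, 30),
    ("commit", "T1"),
    ("update", "T2", "a", 70, 20),   # T2 never committed
]
# At the crash, T2's write had reached disk but T1's write to b had not.
crashed_disk = {"a": 20, "b": 0}
```

`recover(crashed_disk, log)` returns `{"a": 70, "b": 30}`: T1's effects are restored and T2's are removed.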

20 Idempotence: Dealing With Failure
What if there is a failure during restart? REDO may run many times.
What if the new state is not around at restart? UNDO would undo something not done.
Solution: make F(F(x)) = F(x) (idempotence).
- Discard duplicates: use message sequence numbers to discard duplicates, and sequence numbers on pages to detect state.
- (or) Make operations idempotent: move to position x, write value V to byte B, ...
[Figure: applying REDO twice still yields the new state; applying UNDO twice still yields the old state.]
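A sketch of the "sequence numbers on pages" option. An increment is not idempotent by itself, but stamping the page with the sequence number of the last applied log record makes a repeated REDO harmless. The dict layout and the field name `lsn` are assumptions for this example.

```python
def redo_once(page, rec):
    """Apply a log record only if the page has not already absorbed it."""
    if page["lsn"] < rec["lsn"]:
        page["value"] += rec["delta"]
        page["lsn"] = rec["lsn"]     # remember how far this page has been brought
    return page


def restart(page, log):
    """REDO the whole log in order; safe to call again if restart itself fails."""
    for rec in log:
        redo_once(page, rec)
    return page


page = {"value": 100, "lsn": 0}
log = [{"lsn": 1, "delta": +10}, {"lsn": 2, "delta": -30}]
restart(page, log)
restart(page, log)   # a second restart changes nothing
```

After both calls the page holds value 80 at sequence number 2, the same as after one call.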

21 Two Phase Commit: Dealing with Multiple Logs
If all use one log, then all or none commit.
If there are multiple logs, then something extra is needed.
Standard technique:
- Marriage: Do you? I do. I pronounce... Kiss.
- Theater: Ready on the set? Ready! Action! Act.
- Sailing: Ready about? Ready! Helm a-lee! Tack.
- Contract law: escrow agent.
Two phase commit:
1. Voting phase: can you do it?
2. If all vote yes, then commit phase: do it!

22 Two Phase Commit In Pictures
Transactions are managed by a Transaction Manager (TM).
A transaction gets a unique ID (TRID) from the TM at Begin().
Each subsystem the transaction uses JOINs the transaction.
Subsystems are called Resource Managers (RMs): SQLserver, DB2, OFS, ...
The Transaction Manager tracks joined RMs.
At Rollback, the TM calls each RM’s Rollback().
[Figure: App, TM, RM 1, and RM 2. The App sends Begin to the TM; RM 1 and RM 2 each send Join to the TM.]

23 When Application Requests Commit
TM calls each RM’s Prepared() callback.
If all vote yes, TM calls each RM’s Commit().
If any vote no, TM calls each RM’s Rollback().
Phase 2 starts when TM decides YES.
TM resends to each RM till acknowledged (allows an RM to fail).
[Figure: four steps among App, TM, RM 1, and RM 2: (1) the application requests Commit; (2) the TM broadcasts Prepared?; (3) the RMs all vote Yes; (4) the TM decides Yes and broadcasts Commit, and the RMs acknowledge.]
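The Begin, Join, Prepared, Commit/Rollback flow can be sketched in a single process. The class and method names mirror the slide vocabulary but are otherwise hypothetical, and the sketch leaves out what makes the protocol hard in practice: logging the decision, message loss, and resending until acknowledged.

```python
class ResourceManager:
    """Hypothetical RM that votes in phase 1 and records the final outcome."""

    def __init__(self, name, vote=True):
        self.name = name
        self.vote = vote
        self.outcome = None

    def prepared(self, trid):
        return self.vote              # phase 1: can you do it?

    def commit(self, trid):
        self.outcome = "committed"    # phase 2: do it

    def rollback(self, trid):
        self.outcome = "rolled back"


class TransactionManager:
    def __init__(self):
        self.next_trid = 1
        self.joined = {}              # TRID -> list of RMs that joined

    def begin(self):
        trid = self.next_trid
        self.next_trid += 1
        self.joined[trid] = []
        return trid

    def join(self, trid, rm):
        self.joined[trid].append(rm)

    def commit(self, trid):
        rms = self.joined.pop(trid)
        votes = [rm.prepared(trid) for rm in rms]   # voting phase: ask every RM
        if all(votes):
            for rm in rms:
                rm.commit(trid)                      # commit phase
            return True
        for rm in rms:
            rm.rollback(trid)                        # any "no" vote aborts everyone
        return False
```

A usage example: after `trid = tm.begin()` and two `tm.join(trid, rm)` calls, `tm.commit(trid)` returns True only when both RMs vote yes; a single no vote leaves every RM rolled back.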

24 X/Open Standardizes Two Phase Commit
Standardized APIs to applications and to RMs.
Points to OSI/TP for inter-operation.
[Figure: client and server sides, each with a TM, an RM, and a Comm Mgr. Interface labels: TX (begin, commit, rollback); SQL or OFS or ...; XA (Join, Prepare, Commit); XA+ (outgoing, incoming).]

25 Recap
ACID makes it easy to program distributed Apps.
DO/UNDO/REDO + log allows atomicity.
Multiple logs need two phase commit.
Persistent log gives durability:
- recover from system failure
- recover from media failure

26 How does this relate to Microsoft?
- NT file system directory is transactional.
- SQLserver is transactional.
- Object File System (OFS) is transactional.
- Transaction manager (two phase commit) will be standard in NT 4 and Win96.
- Other resource managers (SNA LU6.2, DB2, Informix, Oracle, Sybase, ...) will be able to participate in transactions.

27 Concurrency Control: Locking
How to automatically prevent concurrency bugs?
Locks:
- Mode: shared or exclusive or ...
- Granularity: objects or containers or server
- Set automatically (well formed)
- Released at commit/rollback (two phase locking)
Serialization theorem:
- if you lock all you touch and hold to commit: no bugs
- if you do not follow these rules, you may see bugs
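A small sketch of shared/exclusive lock modes with release at commit. The `LockTable` class and the "S"/"X" mode labels are assumptions for this example; a denied request simply returns False here, where a real lock manager would make the transaction wait and would handle deadlock.

```python
class LockTable:
    """Hypothetical lock table: item -> {transaction: mode}."""

    def __init__(self):
        self.held = {}

    def acquire(self, txn, item, mode):
        """Request a shared ("S") or exclusive ("X") lock; True if granted."""
        holders = self.held.setdefault(item, {})
        others = [m for t, m in holders.items() if t != txn]
        if mode == "S":
            compatible = all(m == "S" for m in others)   # readers share with readers
        else:
            compatible = not others                       # a writer needs the item alone
        if not compatible:
            return False
        if holders.get(txn) != "X":                       # never downgrade an X lock
            holders[txn] = mode
        return True

    def release_all(self, txn):
        """Called at commit or rollback: locks are held until then, not before."""
        for holders in self.held.values():
            holders.pop(txn, None)
```

Because every lock is held until `release_all`, a transaction never acquires a lock after releasing one, which is the two phase locking rule on the slide.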

28 Reduced Isolation Levels
It is possible to lock less and risk fuzzy data.
Example: want a statistical summary of the DB but do not want to lock the whole database.
Reduced levels:
- Repeatable Read: may see fuzzy inserts/deletes but will serialize all updates
- Read Committed: see only committed data
- Read Uncommitted: may see uncommitted updates

29 Multi-Version Concurrency Control
Run the transaction at some timestamp in the past.
No locking needed: reconstruct the "old" state from the log.
Add in your transaction’s updates.
At commit, assure that updates do not collide with other committed transactions.
Almost as good as serializable (only obscure bugs).
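A minimal sketch of the versioning idea, under stated assumptions: reads come from the snapshot at the transaction's start timestamp, writes are buffered, and commit is refused if another transaction committed the same item in the meantime. The `VersionedStore` class is hypothetical, and it keeps a list of versions per item instead of reconstructing old state from a log.

```python
class VersionedStore:
    def __init__(self):
        self.versions = {}   # item -> list of (commit_ts, value), oldest first
        self.clock = 0       # timestamp of the latest commit

    def begin(self):
        return {"start_ts": self.clock, "writes": {}}

    def read(self, txn, item):
        if item in txn["writes"]:                    # see your own updates
            return txn["writes"][item]
        for ts, value in reversed(self.versions.get(item, [])):
            if ts <= txn["start_ts"]:                # newest version in the snapshot
                return value
        return None

    def write(self, txn, item, value):
        txn["writes"][item] = value                  # buffered until commit

    def commit(self, txn):
        for item in txn["writes"]:
            versions = self.versions.get(item, [])
            if versions and versions[-1][0] > txn["start_ts"]:
                return False                         # collision with a later commit
        self.clock += 1
        for item, value in txn["writes"].items():
            self.versions.setdefault(item, []).append((self.clock, value))
        return True
```

In the checking account example, both transactions read 100 from the same snapshot; the first to commit wins and the second commit returns False instead of silently overwriting it.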

30 Summary
ACID eases error handling:
- Atomic: all or nothing
- Consistent: correct transformation
- Isolated: no concurrency bugs
- Durable: survives failures
Allows you to build robust distributed Apps.
ACID is becoming a standard part of systems.
It’s real.

31 References
- Transaction Processing: Concepts and Techniques, Gray & Reuter, Morgan Kaufmann, 1993.
- Concurrency Control and Recovery in Database Systems, Bernstein, Hadzilacos, Goodman, Addison-Wesley, 1987.
- The Theory of Database Concurrency Control, Papadimitriou, Computer Science Press, 1986.
- A Critique of ANSI SQL Isolation Levels, Berenson et al., ACM SIGMOD.

