CMU SCS
Carnegie Mellon Univ., Dept. of Computer Science
15-415/615 - DB Applications
C. Faloutsos – A. Pavlo
Lecture #23: Concurrency Control – Part 3 (R&G ch. 17)

Presentation transcript:

Carnegie Mellon Univ., Dept. of Computer Science
15-415/615 - DB Applications
C. Faloutsos – A. Pavlo
Lecture #23: Concurrency Control – Part 3 (R&G ch. 17)

Last Class
  Lock Granularities
  Locking in B+Trees
  The Phantom Problem
  Transaction Isolation Levels

Concurrency Control Approaches
  Two-Phase Locking (2PL)
    – Determine serializability order of conflicting operations at runtime while txns execute.
  Timestamp Ordering (T/O)
    – Determine serializability order of txns before they execute.

Today's Class
  Basic Timestamp Ordering
  Optimistic Concurrency Control
  Multi-Version Concurrency Control
  Multi-Version+2PL
  Partition-based T/O
  Performance Comparisons

Timestamp Allocation
  Each txn Ti is assigned a unique fixed timestamp that is monotonically increasing.
    – Let TS(Ti) be the timestamp allocated to txn Ti.
    – Different schemes assign timestamps at different times during the txn.
  Multiple implementation strategies:
    – System Clock.
    – Logical Counter.
    – Hybrid.
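The logical-counter strategy can be sketched as follows. This is a minimal illustration, not the lecture's implementation; the class name `LogicalCounterAllocator` and its `allocate` method are hypothetical.

```python
import threading


class LogicalCounterAllocator:
    """Hypothetical allocator: unique, monotonically increasing timestamps."""

    def __init__(self, start=1):
        self._next = start
        self._lock = threading.Lock()

    def allocate(self):
        # The lock makes the read-and-increment step atomic across threads,
        # so no two txns ever receive the same timestamp.
        with self._lock:
            ts = self._next
            self._next += 1
            return ts


allocator = LogicalCounterAllocator()
ts_t1 = allocator.allocate()  # timestamp for the first txn
ts_t2 = allocator.allocate()  # timestamp for the second txn
```

Every txn passes through the same lock here, which is one plausible way a shared counter becomes a point of contention when many txns start concurrently.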

T/O Concurrency Control
  Use these timestamps to determine the serializability order.
  If TS(Ti) < TS(Tj), then the DBMS must ensure that the execution schedule is equivalent to a serial schedule where Ti appears before Tj.

Basic T/O
  Txns read and write objects without locks.
  Every object X is tagged with the timestamps of the last txns that successfully read and wrote it:
    – W-TS(X) – Write timestamp on X
    – R-TS(X) – Read timestamp on X
  Check timestamps for every operation:
    – If a txn tries to access an object "from the future", it aborts and restarts.

Basic T/O – Reads
  If TS(Ti) < W-TS(X), this violates timestamp order of Ti w.r.t. the writer of X.
    – Abort Ti and restart it (with same TS? why?)
  Else:
    – Allow Ti to read X.
    – Update R-TS(X) to max(R-TS(X), TS(Ti)).
    – Have to make a local copy of X to ensure repeatable reads for Ti.

Basic T/O – Writes
  If TS(Ti) < R-TS(X) or TS(Ti) < W-TS(X):
    – Abort and restart Ti.
  Else:
    – Allow Ti to write X and update W-TS(X).
    – Also have to make a local copy of X to ensure repeatable reads for Ti.
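The read and write rules above can be sketched in a few lines. This is an illustrative sketch only: `Abort`, `TimestampedObject`, `to_read`, and `to_write` are hypothetical names, the initial timestamps of 0 are an assumption, and the local copy made for repeatable reads is reduced to returning the value.

```python
class Abort(Exception):
    """Signals that the txn must abort and restart."""


class TimestampedObject:
    """A database object tagged with R-TS and W-TS (both assumed to start at 0)."""

    def __init__(self, value):
        self.value = value
        self.r_ts = 0
        self.w_ts = 0


def to_read(ts, obj):
    # Reading a value written "in the future" violates timestamp order.
    if ts < obj.w_ts:
        raise Abort("read too late")
    obj.r_ts = max(obj.r_ts, ts)
    return obj.value  # stands in for the txn's local copy


def to_write(ts, obj, value):
    # A younger txn has already read or written this object.
    if ts < obj.r_ts or ts < obj.w_ts:
        raise Abort("write too late")
    obj.value = value
    obj.w_ts = ts


a = TimestampedObject("initial")
to_read(1, a)                # T1 (TS=1) reads A
to_write(2, a, "from T2")    # T2 (TS=2) writes A
try:
    to_write(1, a, "from T1")  # T1 writes A after T2 did
    t1_aborted = False
except Abort:
    t1_aborted = True
```

In the usage lines, T1's late write is rejected because TS(T1) < W-TS(A), so T1 would have to abort and restart.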

Basic T/O – Example #1
  Schedule (TS(T1)=1):
    T1: BEGIN, R(B), R(A), COMMIT
    T2: BEGIN, R(B), W(B), R(A), W(A), COMMIT
  Database (columns: Object, R-TS, W-TS): A starts with R-TS 0 and W-TS 0.
  No violations, so both txns are safe to commit.

Basic T/O – Example #2
  Schedule:
    T1: BEGIN, R(A), W(A), COMMIT
    T2: BEGIN, W(A), COMMIT
  Database (columns: Object, R-TS, W-TS): A
  Violation: TS(T1) < W-TS(A)
  T1 cannot overwrite the update by T2, so it has to abort and restart.

Basic T/O – Thomas Write Rule
  If TS(Ti) < R-TS(X):
    – Abort and restart Ti.
  If TS(Ti) < W-TS(X):
    – Thomas Write Rule: Ignore the write and allow the txn to continue.
    – This violates timestamp order of Ti.
  Else:
    – Allow Ti to write X and update W-TS(X).
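A sketch of the write path with the Thomas Write Rule applied, under the same simplifying assumptions as the rules above. The names `Abort`, `Record`, and `thomas_write` are hypothetical; the return value (whether the write was applied) is added only to make the behavior observable.

```python
from dataclasses import dataclass


class Abort(Exception):
    """Signals that the txn must abort and restart."""


@dataclass
class Record:
    value: object
    r_ts: int = 0
    w_ts: int = 0


def thomas_write(ts, rec, value):
    """Return True if the write was applied, False if it was ignored."""
    if ts < rec.r_ts:
        # A younger txn already read the object: still a real violation.
        raise Abort("write too late")
    if ts < rec.w_ts:
        # Thomas Write Rule: a younger txn already overwrote the object,
        # so this obsolete write is skipped and W-TS is left untouched.
        return False
    rec.value = value
    rec.w_ts = ts
    return True


# T1 (TS=1) read A, then T2 (TS=2) wrote A.
a = Record(value="from T2", r_ts=1, w_ts=2)
applied = thomas_write(1, a, "from T1")  # T1's late write is ignored
```

T1 continues and can commit, while A keeps T2's value and W-TS(A) stays at 2.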

Basic T/O – Thomas Write Rule (Example)
  Schedule:
    T1: BEGIN, R(A), W(A), COMMIT
    T2: BEGIN, W(A), COMMIT
  Database (columns: Object, R-TS, W-TS): A
  Ignore the write and allow T1 to commit. We do not update W-TS(A).

Basic T/O
  Ensures conflict serializability if you don't use the Thomas Write Rule.
  No deadlocks because no txn ever waits.
  Possibility of starvation for long txns if short txns keep causing conflicts.
  Permits schedules that are not recoverable.

Recoverable Schedules
  In a recoverable schedule, transactions commit only after all transactions whose changes they read have committed.
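The definition can be checked mechanically on a small schedule. The sketch below is illustrative: the tuple encoding `(txn, op, obj)` and the function name `is_recoverable` are assumptions, and for simplicity an abort does not undo the aborted txn's writes.

```python
def is_recoverable(schedule):
    """schedule: list of (txn, op, obj) with op in {"R", "W", "COMMIT", "ABORT"}.

    Returns True if every txn commits only after all txns whose writes it
    read have committed. Simplification: aborts do not restore old values.
    """
    last_writer = {}   # object -> txn that wrote it most recently
    reads_from = {}    # txn -> set of txns whose writes it read
    committed = set()

    for txn, op, obj in schedule:
        if op == "W":
            last_writer[obj] = txn
        elif op == "R":
            writer = last_writer.get(obj)
            if writer is not None and writer != txn:
                reads_from.setdefault(txn, set()).add(writer)
        elif op == "COMMIT":
            # Committing before a txn we read from has committed is unsafe.
            if any(w not in committed for w in reads_from.get(txn, ())):
                return False
            committed.add(txn)
    return True


# T2 reads T1's write and commits before T1 finishes; T1 then aborts.
unsafe = [
    ("T1", "W", "A"),
    ("T2", "R", "A"),
    ("T2", "W", "B"),
    ("T2", "COMMIT", None),
    ("T1", "ABORT", None),
]
safe = [
    ("T1", "W", "A"),
    ("T1", "COMMIT", None),
    ("T2", "R", "A"),
    ("T2", "COMMIT", None),
]
unsafe_ok = is_recoverable(unsafe)
safe_ok = is_recoverable(safe)
```

The first schedule is rejected at T2's commit, before T1's fate is even known, which is exactly why recoverability forces a reader to wait for its writers.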

Recoverability
  Schedule:
    T1: BEGIN, W(A), ⋮, ABORT
    T2: BEGIN, R(A), W(B), COMMIT
  T2 is allowed to read the writes of T1.
  T1 aborts after T2 has committed.
  This is not recoverable because we can't restart T2.

Basic T/O – Performance Issues
  High overhead from copying data to txn's workspace and from updating timestamps.
  Long running txns can get starved.
  Suffers from timestamp bottleneck.


Optimistic Concurrency Control
  Assumption: Conflicts are rare.
  Forcing txns to wait to acquire locks adds a lot of overhead.
  Optimize for the no-conflict case.

OCC Phases
  Read: Track the read/write sets of txns and store their writes in a private workspace.
  Validation: When a txn commits, check whether it conflicts with other txns.
  Write: If validation succeeds, apply private changes to database. Otherwise abort and restart the txn.

OCC – Example
  Schedule (TS(T2)=1, TS(T1)=2):
    T1: BEGIN, READ [R(A), W(A)], VALIDATE, WRITE, COMMIT
    T2: BEGIN, READ [R(A)], VALIDATE, WRITE, COMMIT
  Database, T1 workspace, and T2 workspace are each shown as a table with columns Object, Value, W-TS.
  Surviving table entries: A with W-TS 0; A with value 456 and W-TS ∞.

OCC – Validation Phase
  Need to guarantee only serializable schedules are permitted.
  At validation, Ti checks other txns for RW and WW conflicts and makes sure that all conflicts go one way (from older txns to younger txns).

OCC – Serial Validation
  Maintain global view of all active txns.
  Record the read set and write set while txns are running, and write into a private workspace.
  Execute the Validation and Write phases inside a protected critical section.

OCC – Validation Phase
  Each txn's timestamp is assigned at the beginning of the validation phase.
  Check the timestamp ordering of the committing txn with all other running txns.
  If TS(Ti) < TS(Tj), then one of the following three conditions must hold…

OCC – Validation #1
  Ti completes all three phases before Tj begins.

OCC – Validation #1 (Example)
  Schedule:
    T1: BEGIN, READ, VALIDATE, WRITE, COMMIT
    T2: BEGIN, READ, VALIDATE, WRITE, COMMIT

OCC – Validation #2
  Ti completes before Tj starts its Write phase, and Ti does not write to any object read by Tj.
    – WriteSet(Ti) ∩ ReadSet(Tj) = Ø

OCC – Validation #2 (Example 1)
  Schedule:
    T1: BEGIN, READ [R(A), W(A)], VALIDATE
    T2: BEGIN, READ [R(A)], VALIDATE, WRITE, COMMIT
  Database, T1 workspace, and T2 workspace are each shown as a table with columns Object, Value, W-TS.
  Surviving table entries: A with W-TS 0; A with value 456 and W-TS ∞.
  T1 has to abort even though T2 will never write to the database.

OCC – Validation #2 (Example 2)
  Schedule:
    T1: BEGIN, READ [R(A), W(A)], VALIDATE, WRITE, COMMIT
    T2: BEGIN, READ [R(A)], VALIDATE, WRITE, COMMIT
  Database, T1 workspace, and T2 workspace are each shown as a table with columns Object, Value, W-TS.
  Surviving table entries: A with W-TS 0; A with value 456 and W-TS ∞.
  Safe to commit T1 because we know that T2 will not write.

OCC – Validation #3
  Ti completes its Read phase before Tj completes its Read phase, and Ti does not write to any object that is either read or written by Tj:
    – WriteSet(Ti) ∩ ReadSet(Tj) = Ø
    – WriteSet(Ti) ∩ WriteSet(Tj) = Ø
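The three validation conditions can be written as one predicate over phase boundaries and read/write sets. This is a sketch under assumptions: the `TxnInfo` record, its field names, and the use of `math.inf` for a phase that has not happened yet are all hypothetical modeling choices, not part of the lecture.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class TxnInfo:
    start: float                   # when the txn began
    read_end: float                # end of Read phase (math.inf if still reading)
    write_start: float = math.inf  # start of Write phase, if reached
    finish: float = math.inf       # end of Write phase, if reached
    read_set: frozenset = frozenset()
    write_set: frozenset = frozenset()


def validates(ti, tj):
    """True if Ti (older, TS(Ti) < TS(Tj)) satisfies one of the three conditions."""
    no_write_read = not (ti.write_set & tj.read_set)
    no_write_write = not (ti.write_set & tj.write_set)

    # 1: Ti completes all three phases before Tj begins.
    cond1 = ti.finish <= tj.start
    # 2: Ti completes before Tj starts its Write phase; WriteSet(Ti) and
    #    ReadSet(Tj) are disjoint. Ti must actually have finished.
    cond2 = math.isfinite(ti.finish) and ti.finish <= tj.write_start and no_write_read
    # 3: Ti finishes reading before Tj does; WriteSet(Ti) is disjoint from
    #    both ReadSet(Tj) and WriteSet(Tj).
    cond3 = ti.read_end <= tj.read_end and no_write_read and no_write_write
    return cond1 or cond2 or cond3


t1 = TxnInfo(0, 1, 2, 3, frozenset({"A"}), frozenset({"A"}))
serial_t2 = TxnInfo(4, 5, 6, 7, frozenset({"A"}))
overlap_reads_a = TxnInfo(1, 5, 6, 7, frozenset({"A"}))
overlap_reads_b = TxnInfo(1, 5, 6, 7, frozenset({"B"}))

serial_ok = validates(t1, serial_t2)         # condition 1 holds
conflict_ok = validates(t1, overlap_reads_a)  # T1 wrote what T2 read
disjoint_ok = validates(t1, overlap_reads_b)  # condition 2 holds
```

Only the overlapping reader of A fails, because T1's write set intersects its read set and the txns are not serial.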

OCC – Validation #3 (Example)
  Schedule (TS(T1)=1):
    T1: BEGIN, READ [R(A), W(A)], VALIDATE, WRITE, COMMIT
    T2: BEGIN, READ [R(B), R(A)], VALIDATE, WRITE, COMMIT
  Database:
    Object  Value  W-TS
    A       123    0
    B       XYZ    0
  T1 workspace and T2 workspace are each shown as a table with columns Object, Value, W-TS.
  Safe to commit T1 because T2 sees the DB after T1 has executed.

OCC – Validation Phase
  Forward-Oriented:
    – If there is a running txn Tj that read data written by the validating txn Ti, then resolve the conflict by killing Tj; otherwise commit.
  Backward-Oriented:

OCC – Observations
  Q: When does OCC work well?
  A: When # of conflicts is low:
    – All txns are read-only (ideal).
    – Txns access disjoint subsets of data.
  If the database is large and the workload is not skewed, then there is a low probability of conflict, so again locking is wasteful.

OCC – Performance Issues
  High overhead for copying data locally.
  Validation/Write phase bottlenecks.
  Aborts are more wasteful because they only occur after a txn has already executed.
  Suffers from timestamp allocation bottleneck.


Multi-Version Concurrency Control
  Writes create new versions of objects instead of in-place updates:
    – Each successful write results in the creation of a new version of the data item written.
  Use write timestamps to label versions.
    – Let X_k denote the version of X where for a given txn Ti: W-TS(X_k) ≤ TS(Ti)

MVCC – Reads
  Any read operation sees the latest version of an object from right before that txn started.
  Every read request can be satisfied without blocking the txn.
  If TS(Ti) > R-TS(X_k):
    – Set R-TS(X_k) = TS(Ti)

MVCC – Writes
  If TS(Ti) < R-TS(X_k):
    – Abort and restart Ti.
  If TS(Ti) = W-TS(X_k):
    – Overwrite the contents of X_k.
  Else:
    – Create a new version X_k+1 and set its write timestamp to TS(Ti).
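The MVCC read and write rules can be sketched with a per-object version list. This is a minimal illustration, assuming X_k is the version with the largest W-TS not exceeding TS(Ti), that an initial version with timestamps 0 exists, and that txn timestamps start at 1. The names `Version`, `VersionedObject`, and `Abort` are hypothetical, and initializing a new version's R-TS to the writer's timestamp is a choice made here, not something the rules above specify.

```python
from dataclasses import dataclass


class Abort(Exception):
    """Signals that the txn must abort and restart."""


@dataclass
class Version:
    value: object
    w_ts: int
    r_ts: int


class VersionedObject:
    def __init__(self, initial_value):
        # Assumed initial version X_0 with W-TS = R-TS = 0.
        self.versions = [Version(initial_value, 0, 0)]

    def _visible(self, ts):
        # X_k: the version with the largest W-TS such that W-TS <= ts.
        candidates = [v for v in self.versions if v.w_ts <= ts]
        return max(candidates, key=lambda v: v.w_ts)

    def read(self, ts):
        v = self._visible(ts)
        if ts > v.r_ts:
            v.r_ts = ts
        return v.value

    def write(self, ts, value):
        v = self._visible(ts)
        if ts < v.r_ts:
            raise Abort("a younger txn already read this version")
        if ts == v.w_ts:
            v.value = value  # overwrite the txn's own version
        else:
            self.versions.append(Version(value, ts, ts))


# T1 (TS=1) and T2 (TS=2) interleave on object A.
a = VersionedObject("A0")
first_read = a.read(1)    # T1 sees A0
a.write(1, "A1")          # T1 creates A1
t2_read = a.read(2)       # T2 sees A1
a.write(2, "A2")          # T2 creates A2
t1_reread = a.read(1)     # T1 still sees the A1 it wrote earlier
```

The final read by T1 returns its own version A1 even though A2 already exists, since A2's write timestamp is larger than TS(T1).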

MVCC – Example #1
  Schedule (TS(T1)=1, TS(T2)=2):
    T1: BEGIN, R(A), W(A), R(A), COMMIT
    T2: BEGIN, R(A), W(A), COMMIT
  Database (columns: Object, Value, R-TS, W-TS): versions A_0, A_1, A_2
  T1 reads version A_1 that it wrote earlier.

MVCC – Example #2
  Schedule:
    T1: BEGIN, R(A), W(A)
    T2: BEGIN, R(A), COMMIT
  Database (columns: Object, Value, R-TS, W-TS): version A_0
  Violation: TS(T1) < R-TS(A_0)
  T1 is aborted because T2 "moved" time forward.

MVCC
  Can still incur cascading aborts because a txn sees uncommitted versions from txns that started before it did.
  Old versions of tuples accumulate.
  The DBMS needs a way to remove old versions to reclaim storage space.

MVCC Implementations
  Store versions directly in main tables:
    – Postgres, Firebird/Interbase
  Store versions in separate temp tables:
    – MSFT SQL Server
  Only store a single master version:
    – Oracle, MySQL

Garbage Collection – Postgres
  Never overwrites older versions.
  New tuples are appended to table.
  Deleted tuples are marked with a tombstone and then left in place.
  A separate background thread (VACUUM) has to scan tables to find tuples to remove.

Garbage Collection – MySQL
  Only one "master" version for each tuple.
  Information about older versions is put in a temp rollback segment and then pruned over time with a single thread (PURGE).
  Deleted tuples are left in place and the space is reused.

MVCC – Performance Issues
  High abort overhead cost.
  Suffers from timestamp allocation bottleneck.
  Garbage collection overhead.
  Requires stalls to ensure recoverability.


MVCC+2PL
  Combine the advantages of MVCC and 2PL in a single scheme.
  Use a different concurrency control scheme for read-only txns than for update txns.

MVCC+2PL – Reads
  Use MVCC for read-only txns so that they never block on a writer.
  Read-only txns are assigned a timestamp when they enter the system.
  Any read operations see the latest version of an object from right before that txn started.

MVCC+2PL – Writes
  Use strict 2PL to schedule the operations of update txns:
    – Read-only txns are essentially ignored.
  Txns never overwrite objects:
    – Create a new copy for each write and set its timestamp to ∞.
    – Set the correct timestamp when txn commits.
    – Only one txn can commit at a time.

MVCC+2PL – Performance Issues
  All the lock contention of 2PL.
  Suffers from timestamp allocation bottleneck.


Observation
  When a txn commits, all previous T/O schemes check to see whether there is a conflict with concurrent txns.
  This requires locks/latches/mutexes.
  If you have a lot of concurrent txns, then this is slow even if the conflict rate is low.

Partition-based T/O
  Split the database up into disjoint subsets called partitions (aka shards).
  Only check for conflicts between txns that are running in the same partition.

Database Partitioning
  Figure: Schema and Schema Tree.
  Schema tables: WAREHOUSE, DISTRICT, CUSTOMER, ORDERS, ORDER_ITEM, STOCK, ITEM.
  Schema Tree: WAREHOUSE, DISTRICT, CUSTOMER, ORDERS, ORDER_ITEM, STOCK; ITEM is replicated.

Database Partitioning
  Figure: Schema Tree and Partitions.
  Schema Tree: WAREHOUSE, DISTRICT, CUSTOMER, ORDERS, ORDER_ITEM, STOCK; ITEM is replicated.
  Partitions: P1, P2, P3, P4, P5, each with a replicated copy of ITEM.

Partition-based T/O
  Txns are assigned timestamps based on when they arrive at the DBMS.
  Partitions are protected by a single lock:
    – Each txn is queued at the partitions it needs.
    – The txn acquires a partition's lock if it has the lowest timestamp in that partition's queue.
    – The txn starts when it has all of the locks for all the partitions that it will read/write.
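The queueing rules above can be sketched with one min-heap of timestamps per partition. This is an illustrative model only: the class `PartitionScheduler` and its methods are hypothetical, and it assumes the partitions a txn needs are known when it arrives.

```python
import heapq


class PartitionScheduler:
    """Hypothetical model of per-partition timestamp queues."""

    def __init__(self, num_partitions):
        self.queues = [[] for _ in range(num_partitions)]

    def enqueue(self, ts, partitions):
        # Queue the txn at every partition it needs.
        for p in partitions:
            heapq.heappush(self.queues[p], ts)

    def can_start(self, ts, partitions):
        # The txn holds a partition's lock only if it has the lowest
        # timestamp in that partition's queue; it needs all of them.
        return all(self.queues[p] and self.queues[p][0] == ts for p in partitions)

    def finish(self, ts, partitions):
        # Release the locks by leaving the head of each queue.
        for p in partitions:
            if not self.queues[p] or self.queues[p][0] != ts:
                raise RuntimeError("txn does not hold this partition's lock")
            heapq.heappop(self.queues[p])


sched = PartitionScheduler(3)
sched.enqueue(1, [0, 1])  # T1 needs partitions 0 and 1
sched.enqueue(2, [1, 2])  # T2 needs partitions 1 and 2
sched.enqueue(3, [2])     # T3 needs partition 2 only

t1_runs = sched.can_start(1, [0, 1])
t2_waits = not sched.can_start(2, [1, 2])
t3_waits = not sched.can_start(3, [2])
sched.finish(1, [0, 1])
t2_runs_next = sched.can_start(2, [1, 2])
```

In the usage lines T3 touches only partition 2 yet still waits behind T2, which in turn waits for T1 on partition 1.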

Partition-based T/O – Reads
  Do not need to maintain multiple versions.
  Txns can read anything that they want at the partitions that they have locked.
  If a txn tries to access a partition for which it does not hold the lock, it is aborted and restarted.

Partition-based T/O – Writes
  All updates occur in place.
    – Maintain a separate in-memory buffer to undo changes if the txn aborts.
  If a txn tries to access a partition for which it does not hold the lock, it is aborted and restarted.

Partition-based T/O – Performance Issues
  Partition-based T/O protocol is very fast if:
    – The DBMS knows what partitions the txn needs before it starts.
    – Most (if not all) txns only need to access a single partition.
  Multi-partition txns cause partitions to be idle while the txn executes.


Performance Comparison
  Different schemes make different trade-offs.
  Measure how well each scheme scales on future many-core CPUs.
    – Ignore indexing and logging issues (for now).
  Joint work with Xiangyao Yu, George Bezerra, Mike Stonebraker, and Srini Devadas.

Graphite CPU Simulator
  Simulates a single CPU with 1024 cores.
    – Runs on a 22-node cluster.
    – Average slowdown: 10,000x
  Custom, lightweight DBMS that supports a pluggable concurrency control coordinator.

Tested CC Schemes
  2PL Schemes
    DL_DETECT   2PL with Deadlock Detection
    NO_WAIT     2PL with Non-waiting Deadlock Prevention
    WAIT_DIE    2PL with Wait-Die Deadlock Prevention
  T/O Schemes
    TIMESTAMP   Basic T/O
    OCC         Optimistic Concurrency Control
    MVCC        Multi-Version Concurrency Control
    H-STORE     Partition-based T/O

Benchmark #1
  YCSB Workload – Read-Only (~60GB)

Benchmark #2
  TPC-C Workload – 1024 Warehouses (~26GB)

Which CC Scheme is Best?
  Like many things in life, it depends…
    – How skewed is the workload?
    – Are the txns short or long?
    – Is the workload mostly read-only?

CC Schemes
  2PL Schemes
    DL_DETECT   Scales under low-contention. Suffers from lock thrashing and deadlocks.
    NO_WAIT     Has no centralized point of contention. Highly scalable. Very high abort rates.
    WAIT_DIE    Suffers from lock thrashing and timestamp allocation bottleneck. No deadlocks.
  T/O Schemes
    TIMESTAMP   High overhead from copying data and timestamp bottleneck. Non-blocking writes.
    OCC         Performs well for read-only workloads. Non-blocking reads and writes. Timestamp bottleneck.
    MVCC        High overhead for copying data locally. High abort cost. Suffers from timestamp bottleneck.
    H-STORE     The best algorithm for partitioned workloads. Suffers from timestamp bottleneck.

Real Systems
  System           Scheme               Released
  Ingres           Strict 2PL           1975
  Informix         Strict 2PL           1980
  IBM DB2          Strict 2PL           1983
  Oracle           MVCC                 1984 *
  Postgres         MVCC                 1985
  MS SQL Server    Strict 2PL or MVCC   1992 *
  MySQL (InnoDB)   MVCC+2PL             2001
  Aerospike        OCC                  2009
  SAP HANA         MVCC                 2010
  VoltDB           Partition T/O        2010
  MemSQL           MVCC                 2011
  MS Hekaton       MVCC+OCC             2013

Summary
  Concurrency control is hard.