Efficient Locking Techniques for Databases on Modern Hardware. Hideaki Kimura #*, Goetz Graefe +, Harumi Kuno +. # Brown University, * Microsoft Jim Gray Systems Lab, + Hewlett-Packard Laboratories.

Presentation transcript:

Efficient Locking Techniques for Databases on Modern Hardware. Hideaki Kimura #*, Goetz Graefe +, Harumi Kuno +. # Brown University, * Microsoft Jim Gray Systems Lab, + Hewlett-Packard Laboratories. Presented at ADMS'12. Slides/papers available on request.

2/26 Traditional DBMS on Modern Hardware. Optimized for the magnetic disk bottleneck. [Figure: Instructions and Cycles for New Order, from S. Harizopoulos et al., SIGMOD'08. Labels: Disk I/O Costs, Other Costs, Useful Work, Query Execution Overhead.] Then what's this?

3/26 Context of This Paper Achieved up to 6x overall speed-up Foster B-trees This Paper Work in progress Consolidation Array, Flush-Pipeline Shore-MT/Aether [Johnson et al'10]

4/26 Our Prior Work: Foster B-trees [TODS'12]. Foster Relationship. Fence Keys. Simple Prefix Compression. Poor-man's Normalized Keys. Efficient yet Exhaustive Verification. Implemented by modifying Shore-MT and compared with it on Sun Niagara; tested without locks, only latches. Low latch contention: 2-3x speed-up. High latch contention: 6x speed-up.

5/26 Talk Overview
1) Key Range Locks w/ Higher Concurrency: combines fence keys and Graefe lock modes
2) Lightweight Intent Lock: extremely scalable and fast
3) Scalable Deadlock Detection: Dreadlocks algorithm applied to databases
4) Serializable Early-Lock-Release: serializable ELR for all kinds of locks that allows read-only transactions to bypass logging

6/26 1. Key Range Lock. [Diagram labels: SELECT Key=10; UPDATE Key=30; X; S; SELECT Key=20 ~ 25; SELECT Key=15; Gap.] Mohan et al.: locks the neighboring key. Lomet et al.: adds a few new lock modes (e.g., RangeX-S), but still lacks a few lock modes, resulting in lower concurrency.

7/26 Our Key Range Locking. Graefe lock modes: all 3*3=9 modes. Create a ghost record (pseudo-deleted record) before insertion, as a separate Xct. Use fence keys to lock on the page boundary. [Diagram: a page holding keys EA, EB, …, EZ, bounded by its fence keys.]
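
The "3*3=9 modes" count can be illustrated with a small sketch. It assumes, as in Graefe's key-range locking design, that each lock mode is a pair of a key mode and a gap mode, each one of N (no lock), S or X, and that two modes are compatible when both components are compatible. The two-letter naming (key first, gap second, so that S≡SS and X≡XX) follows the backup slides; the function names are hypothetical and this is not the Shore-MT code.

```python
from itertools import product

# Basic modes for one component: N (no lock), S (shared), X (exclusive).
BASIC = ("N", "S", "X")


def basic_compatible(a, b):
    """N is compatible with everything, S only with N and S, X only with N."""
    if a == "N" or b == "N":
        return True
    return a == "S" and b == "S"


# A lock mode is a (key, gap) pair written as two letters,
# e.g. "NS" = nothing on the key, S on the gap next to it.
ALL_MODES = ["".join(pair) for pair in product(BASIC, repeat=2)]  # 3*3 = 9


def compatible(held, requested):
    """Two modes are compatible if the key parts and the gap parts both are."""
    return all(basic_compatible(h, r) for h, r in zip(held, requested))


# A reader of the gap only ("NS") does not block an update of the key ("XN"),
# whereas a reader of key and gap ("SS") does.
print(compatible("NS", "XN"), compatible("SS", "XN"))
```

Keeping the key and gap components independent is what lets a gap-only reader coexist with an update of the neighboring key.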

8/26 2. Intent Lock Coarse level locking (e.g., table, database) Intent Lock (IS/IX) and Absolute Lock (X/S/SIX) Saves overhead for large scan/write transactions (just one absolute lock) [Gray et al]
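
As a rough illustration of coarse-level locking, the sketch below encodes the standard multi-granularity compatibility matrix from Gray et al. and shows which locks a request takes along the hierarchy. The resource names DB-1, VOL-1, IND-1 and Key-A are borrowed from the following slides; `lock_plan` and `INTENT_FOR` are hypothetical names used only for this example.

```python
# Standard multi-granularity compatibility matrix (Gray et al.):
# COMPATIBLE[held] is the set of modes another transaction may still acquire.
COMPATIBLE = {
    "IS": {"IS", "IX", "S", "SIX"},
    "IX": {"IS", "IX"},
    "S": {"IS", "S"},
    "SIX": {"IS"},
    "X": set(),
}

# Intent mode required on every ancestor for a given absolute mode.
INTENT_FOR = {"S": "IS", "X": "IX"}


def lock_plan(path, mode):
    """Return (resource, mode) pairs needed to lock path[-1] in `mode`.

    Ancestors get the matching intent lock; only the target gets the
    absolute lock.
    """
    *ancestors, target = path
    intent = INTENT_FOR[mode]
    return [(res, intent) for res in ancestors] + [(target, mode)]


# Reading one key takes intent locks on the way down ...
print(lock_plan(["DB-1", "VOL-1", "IND-1", "Key-A"], "S"))
# ... while a large scan stops at the index with a single absolute lock.
print(lock_plan(["DB-1", "VOL-1", "IND-1"], "S"))
```

The second call shows the saving the slide mentions: a scan holds one absolute lock on the index instead of one lock per key.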

9/26 Intent Lock: Physical Contention. [Diagram: logical vs. physical view of the lock hierarchy DB-1, VOL-1, IND-1, Key-A, Key-B; every level has a lock queue holding IS, IX, S and X requests.]

10/26 Lightweight Intent Lock. [Diagram: logical vs. physical view. Key-A and Key-B keep lock queues for key locks (S, X); DB-1, VOL-1 and IND-1 use counters for coarse locks.] Counters (IS, IX, S, X): DB1 = 1, 1, 0, 0; VOL1 = 1, 1, 0, 0; IND1 = 1, 1, 0, 0. No lock queue, no mutex.

11/26 Intent Lock: Summary. Extremely lightweight for scalability. Just a set of counters, no queue. Only a spinlock; a mutex only when an absolute lock is requested. Timeout to avoid deadlock. Separate from the main lock table. Main lock table: low physical contention, high required functionality. Intent lock table: high physical contention, low required functionality.
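
A minimal sketch of the counter idea, under several assumptions: one object per coarse resource, four counters (IS, IX, S, X) as on the previous slide, a `threading.Lock` standing in for the spinlock, and a timeout in place of deadlock detection. The class and method names are hypothetical, and the sketch ignores the mutex used for absolute locks and does not track which transaction holds which count.

```python
import threading
import time


class LightweightIntentLock:
    """Counters instead of a lock queue for one coarse resource."""

    # Modes whose non-zero counter blocks a request for the given mode.
    CONFLICTS = {
        "IS": ("X",),
        "IX": ("S", "X"),
        "S": ("IX", "X"),
        "X": ("IS", "IX", "S", "X"),
    }

    def __init__(self):
        self._guard = threading.Lock()  # stand-in for a spinlock
        self.counts = {"IS": 0, "IX": 0, "S": 0, "X": 0}

    def try_acquire(self, mode):
        """Grant the lock only if no conflicting counter is non-zero."""
        with self._guard:
            if any(self.counts[m] for m in self.CONFLICTS[mode]):
                return False
            self.counts[mode] += 1
            return True

    def acquire(self, mode, timeout=0.1):
        """Retry until granted; give up after `timeout` seconds."""
        deadline = time.monotonic() + timeout
        while not self.try_acquire(mode):
            if time.monotonic() >= deadline:
                return False  # caller treats this like a deadlock victim
            time.sleep(0)  # yield to other threads instead of pure spinning
        return True

    def release(self, mode):
        with self._guard:
            assert self.counts[mode] > 0, "release without matching acquire"
            self.counts[mode] -= 1


index_lock = LightweightIntentLock()
index_lock.acquire("IS")
index_lock.acquire("IX")
print(index_lock.acquire("S", timeout=0.01))  # blocked by the IX holder
```

Because the common case (IS or IX on a table or database) only increments a counter, there is no queue to allocate or traverse; the price is reduced functionality, which is why the slide keeps this table separate from the main lock table.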

12/26 3. Deadlock Handling. Deadlock prevention (e.g., wound-wait/wait-die) can cause many false positives. Deadlock detection (cycle detection): an infrequent check causes delay; a frequent/immediate check is not scalable on many cores. Timeout: false positives, delays, hard to configure. Every traditional approach has some drawback.

13/26 Solution: Dreadlocks [Koskinen et al '08]. Immediate deadlock detection. Local spin: scalable and low-overhead. Almost no false positives (the few that occur are due to the Bloom filter). More details in the paper: issues specific to databases (lock modes, queues and upgrades); avoiding pure spinning to save CPU cycles; deadlock resolution for the flush pipeline.
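
The digest idea behind Dreadlocks can be sketched in a few lines. This is a single-threaded simulation under simplifying assumptions: each blocked thread waits for exactly one owner, digests are plain Python sets rather than Bloom filters (so there are no false positives), and spinning is modelled as repeated rounds. The function name `detect_deadlock` is hypothetical.

```python
def detect_deadlock(waits_for, max_rounds=100):
    """Return the first thread that finds itself in its owner's digest.

    `waits_for` maps each blocked thread to the thread it waits for.
    Returns None if the digests stop changing without a detection.
    """
    threads = set(waits_for) | set(waits_for.values())
    digest = {t: {t} for t in threads}  # every digest starts with its owner
    for _ in range(max_rounds):
        changed = False
        for waiter, owner in waits_for.items():
            # 1. Does the owner's digest contain me?
            if waiter in digest[owner]:
                return waiter
            # 2. Otherwise add the owner's digest to my own.
            merged = digest[waiter] | digest[owner]
            if merged != digest[waiter]:
                digest[waiter] = merged
                changed = True
        if not changed:
            return None
    return None


# A waits for B, which is running; C, D and E wait for each other in a cycle.
print(detect_deadlock({"A": "B", "C": "D", "D": "E", "E": "C"}))
# A simple chain without a cycle is never reported.
print(detect_deadlock({"A": "B", "B": "C"}))
```

Each waiter only reads the digest of the thread it is waiting for, which is what makes the check local and cheap on many cores.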

14/26 4. Early Lock Release. Commit protocol: Lock, Commit Request, Flush Wait, Unlock. [Diagram: resources A, B, C; transactions T1, T2, T3, T4, T5, …, T1000; locks T2:X, T1:S, T3:S, T3:X (S: read, X: write); label "10ms-".] Group-Commit [DeWitt et al'84], Flush-Pipeline [Johnson et al'10]. More and more locks, waits, deadlocks.

15/26 Prior Work: Aether [Johnson et al VLDB'10]. First implementation of ELR in a DBMS. Significant speed-up (10x) on many-core. Simply releases locks on commit-request. "… [must hold] until both their own and their predecessor’s log records have reached the disk. Serial log implementations preserve this property naturally,…" Problem: a read-only transaction bypasses logging. [Diagram labels: T1: Write; T1: Commit; T2: Commit; ELR; Serial Log; LSN; Dependent.]

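16/26 Anomaly of Prior ELR Technique
Initially D=10. Lock-queue "D": T2:X, T1:S.
Event | Latest LSN | Durable LSN
T2: D=20 | 1 | 0
(T1: Read D) | 2 | 0
T2: Commit-Req | 3 | 0
T1: Read D | 4 | 0
T1: Commit | 5 | 1
… | .. | 2
T2: Commit | .. | 3
Crash! T1 has already committed after reading D=20 ("D is 20!"), but T2's commit record was not yet durable, so T2 is rolled back.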

17/26 Naïve Solutions Flush wait for Read-Only Transaction Orders of magnitude higher latency. Short read-only query: microseconds Disk Flush: milliseconds Do not release X-locks in ELR (S-ELR) Concurrency as low as No-ELR After all, all lock-waits involve X-locks

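18/26 Safe SX-ELR: X-Release Tag
Initially D=10 (tag 0) and E=5 (tag 0). Lock-queue "D": T2:X, T1:S, tag 3 after T2 releases its X lock. Lock-queue "E": T3:S, tag 0.
Event | Latest LSN | Durable LSN
T2: D=20 | 1 | 0
(T1: Read D) | 2 | 0
T2: Commit-Req | 3 | 0
T1: Read D (max-tag=3) | 4 | 0
T1: Commit-Req | 5 | 1
T3: Read E (max-tag=0) & Commit | 6 | 2
T1, T2: Commit | 7 | 3
T3 reads E=5 and exits at once; T1, which read D=20, waits until the durable LSN reaches its max-tag.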

19/26 Safe SX-ELR: Summary. Serializable yet highly concurrent: safely releases all kinds of locks; most read-only transactions exit quickly; only the threads that need to wait do so. Low overhead: just an LSN comparison. Applicable to coarse locks via a self-tag and a descendant-tag. SIX/IX: update descendant-tag. X: update self-tag. IS/IX: check self-tag. S/X/SIX: check both.
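
The X-release tag mechanism from the previous two slides can be replayed in a short sketch. It assumes that releasing an X lock early stamps the lock queue with the releasing transaction's commit LSN, that every later lock acquisition folds the queue's tag into the transaction's max-tag, and that a read-only transaction may exit once the durable LSN has reached its max-tag. All class and function names here are hypothetical, and coarse-lock self-tags and descendant-tags are left out.

```python
class LockQueue:
    """Per-resource lock queue carrying an X-release tag (an LSN)."""

    def __init__(self):
        self.x_release_tag = 0


class Transaction:
    def __init__(self, name):
        self.name = name
        self.max_tag = 0  # highest X-release tag seen so far

    def acquire(self, queue):
        """Taking a lock folds the queue's tag into max_tag."""
        self.max_tag = max(self.max_tag, queue.x_release_tag)

    def can_commit(self, durable_lsn):
        """Just an LSN comparison: safe once dependencies are durable."""
        return self.max_tag <= durable_lsn


def release_x_early(queue, commit_lsn):
    """Early release of an X lock stamps the queue with the commit LSN."""
    queue.x_release_tag = max(queue.x_release_tag, commit_lsn)


# Replay of the slide-18 timeline.
d, e = LockQueue(), LockQueue()
t1, t3 = Transaction("T1"), Transaction("T3")

release_x_early(d, commit_lsn=3)     # T2: Commit-Req, releases X on D
t1.acquire(d)                        # T1: Read D (max-tag=3)
print(t1.can_commit(durable_lsn=1))  # T1 must wait
t3.acquire(e)                        # T3: Read E (max-tag=0)
print(t3.can_commit(durable_lsn=2))  # T3 exits immediately
print(t1.can_commit(durable_lsn=3))  # T1 commits together with T2
```

Only transactions that actually touched early-released data carry a non-zero max-tag, so unrelated read-only transactions such as T3 never wait for a log flush.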

20/26 Experiments. TPC-B: 250MB of data, fits in the buffer pool. Hardware: Sun Niagara (64 hardware contexts); HP Z600 (6 cores, SSD drive). Software: Foster B-trees (modified) in Shore-MT (original), with/without each technique. Fully ACID, serializable mode.

21/26 Key Range Locks Z600, 6-Threads, AVG & 95% on 20 Runs

22/26 Lightweight Intent Lock Sun Niagara, 60 threads, AVG & 95% on 20 Runs

23/26 Dreadlocks vs Traditional Sun Niagara, AVG on 20 Runs

24/26 Early Lock Release (ELR). SX-ELR performs 5x faster. S-only ELR isn't useful. All improvements combined: ~50x faster. [Plots: HDD Log, SSD Log.] Z600, 6-Threads, AVG & 95% on 20 Runs

25/26 Related Work ARIES/KVL, IM [Mohan et al] Key range locking [Lomet'93] Shore-MT at EPFL/CMU/UW-Madison Speculative Lock Inheritance [Johnson et al'09] Aether [Johnson et al'10] Dreadlocks [Koskinen and Herlihy'08] H-Store at Brown/MIT

26/26 Wrap up
Locking as a bottleneck on modern H/W. Revisited all aspects of database locking:
1. Graefe Lock Modes
2. Lightweight Intent Lock
3. Dreadlocks
4. Early Lock Release
All together, significant speed-up (~50x). Future work: buffer pool.

28/26 Reserved: Locking Details

29/26 Transactional Processing High Concurrency Very Short Latency Fully ACID-compliant Relatively Small Data # Digital Transactions Modern Hardware CPU Clock Speed

30/26 Many-Cores and Contentions Logical Contention Physical Contention Critical Section Critical Section Shared Resource Mutex or Spinlock Doesn't Help, even Worsens!

31/26 Background: Fence keys. Define key ranges in each page. [Diagram: B-tree pages labeled with their fence-key ranges A~ ~Z, A~ ~M, A~ ~B, B~ ~C, C~ ~E; keys A, C, E and A, M, V.]

32/26 Key-Range Lock Mode [Lomet '93] RangeX-S X S RangeI-N I Adds a few new lock modes Consists of 2 parts; Range and Key RangeS-SS (RangeN-S) But, still lacks a few lock modes * (*) Instant X lock

33/26 Example: Missing lock modes SELECT Key=15 UPDATE Key=20 RangeS-N? RangeS-S X RangeA-B

34/26 Graefe Lock Modes New lock modes (*) S≡SS X≡XX *

35/26 (**) Ours locks the key prior to the range while SQL Server uses next-key locking. RangeS-N ≈ NS Next-key locking Prior-key locking

36/26 LIL: Lock-Request Protocol

37/26 LIL: Lock-Release Protocol

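38/26 Dreadlocks [Koskinen et al '08]. [Diagram: threads A, B, C, D, E; A waits for B (live lock); C, D and E wait for each other (dead lock).] Each thread keeps a digest (actually a Bloom filter, i.e. a bit-vector), initially {A}, {B}, {C}, {D}, {E}. While waiting, a thread inspects the digest of the thread it waits for: 1. does it contain me? 2. if not, add it to myself. The digests grow to {A,B}, {C,D}, {D,E}, {E,C}, then {E,C,D}; D finds itself in that digest: deadlock!!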

39/26 0 Naïve Solution: Check Page-LSN? Read-only transaction can exit only after Commit Log of dependents becomes durable. LSN Page D=10 E=5 1: T2, D, 10 → 20 2: T2, Z, 20 → 10 3: T2, Commit Log-buffer 20 1 T2 T1 Page Z M immediately exits if durable-LSN ≥ 1?

40/26 Deadlock Victim & Flush Pipeline

41/26 Victim & Flush Pipeline (Cont'd)

42/26 Dreadlock + Backoff on Sleep TPC-B, Lazy commit, SSD, Xct-chain max 100k

43/26 Related Work: H-Store/VoltDB. Foster B-Trees/Shore-MT vs. VoltDB. Differences: disk-based DB ↔ pure main-memory DB; shared-everything ↔ shared-nothing in each node (note: both are shared-nothing across nodes). Latches: keep 'em, but improve 'em ↔ get rid of latches. Pros/Cons: accessible RAM per CPU; simplicity and best-case performance. [Diagram labels: Distributed Xct, RAM.] Both are interesting directions.

44/26 Reserved: Foster B-tree Slides

45/26 Latch Contention in B-trees 1. Root-leaf EX Latch 2. Next/Prev Pointers

46/26 Foster B-trees Architecture A~~B B~ ~C C~~E A~~Z ACE AMV 1. Fence-keys 2. Foster Relationship A~ ~M ~C cf. B-link tree [Lehman et al‘81]

47/26 More on Fence Keys. Efficient prefix compression. Powerful B-tree verification: efficient yet exhaustive verification. Simpler and more scalable B-tree: no tree-latch, B-tree code size halved. Key range locking. [Diagram: a page with high fence "AAP" and low fence "AAF"; key "AAI31" is stored as "I31"; the slot array holds poor man's normalized keys "I3", "J1"; the tuple holds "I31", xxx.]
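
The prefix compression in the diagram can be reproduced with a small sketch. It assumes that every key in a page shares the common prefix of the page's two fence keys, so only the suffix needs to be stored, and that the poor man's normalized key is simply the first few bytes of that suffix kept in the slot array (two characters here, matching "I3" on the slide). The function names and the width of 2 are assumptions for illustration.

```python
def common_prefix(low_fence, high_fence):
    """Longest prefix shared by both fence keys, hence by every key between."""
    length = 0
    for a, b in zip(low_fence, high_fence):
        if a != b:
            break
        length += 1
    return low_fence[:length]


def truncate(key, prefix):
    """Drop the page-wide prefix; only the suffix is stored in the tuple."""
    assert key.startswith(prefix), "key lies outside the fence-key range"
    return key[len(prefix):]


def poor_mans_key(suffix, width=2):
    """First `width` characters of the suffix, kept in the slot array."""
    return suffix[:width]


prefix = common_prefix("AAF", "AAP")
stored = truncate("AAI31", prefix)
print(prefix, stored, poor_mans_key(stored))
```

Most comparisons during a page search can then be decided on the short slot-array entry alone, touching the full tuple only when the poor man's keys tie.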

48/26 B-tree lookup speed-up No Locks. SELECT-only workload.

49/26 Insert-Intensive Case 6-7x Speed-up Latch Contention Bottleneck Log-Buffer Contention Bottleneck Will port "Consolidation Array" [Johnson et al]

50/26 Chain length: Mixed 1 Thread

51/26 Eager-Opportunistic

52/26 B-tree Verification