VLDB 2012
Hoang Tam Vo #1, Sheng Wang #2, Divyakant Agrawal †3, Gang Chen §4, Beng Chin Ooi #5
#National University of Singapore, †University of California, Santa Barbara, §Zhejiang University
王夏青
LogBase: A Scalable Log-structured Database System in the Cloud
Abstract
- Introduction
- Background & Related Work
- Design & Implementation
- Performance Evaluation
- Conclusion
Introduction: Requirements
- High write throughput
- Dynamic scalability
- Efficient multiversion data access
- Transactional semantics
- Fast recovery from machine failures
Introduction: Characteristics
- The log serves as the sole data repository in the system
- Adopts an architecture similar to HBase and BigTable, where each machine in the system is responsible for some tablets
- Builds an index per tablet for retrieving the data from the log
Introduction: Contributions
- Propose LogBase, a scalable log-structured database system that can be dynamically deployed in the cloud
- Design a multiversion index strategy in LogBase to provide efficient access to the multiversion data
- Enhance LogBase to support transactional semantics
- Conduct an extensive performance study on LogBase
Background & Related Work
- No-overwrite strategies: System R (shadow paging strategy); POSTGRES (delta records)
- WAL+Data: most storage systems
- Log-structured systems: LFS, BlueSky, Berkeley DB, PrimeBase, Hyder, RAMCloud
Design & Implementation: Data Model
- Model: relational data model
- Data partitioning:
  - Vertical: column groups
  - Horizontal: tablets
Design & Implementation: Architecture Overview
- Log Repository
- Data Access Manager
- Transaction Manager
Design & Implementation: Log Repository
- Guarantee: Stable storage: the log-only approach provides a capability for recovering data from machine failures similar to that of the WAL+Data approach
- Stores the log in HDFS
- Design choices for the implementation of the log
- Log record:
  - LogKey: LSN, table name, tablet information
  - Data:
Design & Implementation: In-memory Multiversion Index
- Index: provides efficient access to the data
- In-memory index
- Index structure: B-link trees
- Index entry: IdxKey: primary key + timestamp
- Consumption analysis
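The idea of an index keyed by primary key plus timestamp can be illustrated with a minimal sketch. It uses a sorted Python list with `bisect` in place of a B-link tree, and the names `MultiversionIndex`, `put`, `get`, and `log_ptr` are hypothetical, chosen only for this illustration. It shows how a composite key lets a lookup find the newest version at or before a given timestamp.

```python
import bisect


class MultiversionIndex:
    """Sorted in-memory index keyed by (primary_key, timestamp).

    Assumption: a sorted list stands in for the B-link tree, and the
    indexed value is an opaque pointer into the log.
    """

    def __init__(self):
        self._keys = []  # sorted list of (primary_key, timestamp)
        self._ptrs = []  # log pointer stored at the same position

    def put(self, primary_key, timestamp, log_ptr):
        key = (primary_key, timestamp)
        pos = bisect.bisect_left(self._keys, key)
        if pos < len(self._keys) and self._keys[pos] == key:
            self._ptrs[pos] = log_ptr  # same version written again
        else:
            self._keys.insert(pos, key)
            self._ptrs.insert(pos, log_ptr)

    def get(self, primary_key, as_of=float("inf")):
        """Return the pointer of the newest version with timestamp <= as_of."""
        pos = bisect.bisect_right(self._keys, (primary_key, as_of))
        if pos == 0:
            return None
        found_key, _ = self._keys[pos - 1]
        return self._ptrs[pos - 1] if found_key == primary_key else None
```

Because all versions of one primary key are adjacent in key order, both the latest version and any historical version are reachable with a single ordered search.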
Design & Implementation: Tablet Serving(1)
Design & Implementation: Tablet Serving(2)
- Write
- Read
- Delete
- Scan
- Compaction
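The five tablet operations can be sketched over an append-only log plus an in-memory index. This is a toy model under stated assumptions, not LogBase's implementation: the log is a Python list instead of files in HDFS, the record layout `(key, timestamp, value)` is hypothetical, writes for a key are assumed to arrive in timestamp order, and delete is modelled as appending an invalidation record and dropping the index entries.

```python
class Tablet:
    """Toy log-only tablet: every change is appended to the log,
    and an in-memory index points back into the log."""

    def __init__(self):
        self.log = []    # append-only list of (key, timestamp, value)
        self.index = {}  # key -> list of (timestamp, log offset), ascending

    def write(self, key, timestamp, value):
        self.log.append((key, timestamp, value))
        self.index.setdefault(key, []).append((timestamp, len(self.log) - 1))

    def read(self, key, as_of=None):
        """Return the newest value, or the newest one with timestamp <= as_of."""
        for timestamp, offset in reversed(self.index.get(key, [])):
            if as_of is None or timestamp <= as_of:
                return self.log[offset][2]
        return None

    def delete(self, key, timestamp):
        # Assumption: log an invalidation record, then forget the key.
        self.log.append((key, timestamp, None))
        self.index.pop(key, None)

    def scan(self, start_key, end_key):
        """Yield (key, newest value) for start_key <= key < end_key."""
        for key in sorted(self.index):
            if start_key <= key < end_key:
                yield key, self.read(key)

    def compact(self):
        """Rewrite the log, keeping only records the index still references,
        clustered by key and then by timestamp."""
        new_log, new_index = [], {}
        for key in sorted(self.index):
            for timestamp, offset in self.index[key]:
                new_log.append(self.log[offset])
                new_index.setdefault(key, []).append((timestamp, len(new_log) - 1))
        self.log, self.index = new_log, new_index
```

In this sketch, compaction is what reclaims the space of deleted records and restores key order in the log, which is why a scan after compaction touches contiguous log entries.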
Design & Implementation: Transaction Management(1)
- Concurrency Control and Isolation:
  - The Rationale of MVOCC
  - Validation with Write Locks
  - Snapshot Isolation in LogBase
- Guarantee: Isolation: the hybrid scheme of multiversion optimistic concurrency control (MVOCC) in LogBase guarantees snapshot isolation
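A rough sketch of optimistic validation with write locks under snapshot isolation may help here. It is a generic first-committer-wins scheme, not a transcription of LogBase's protocol: the classes `Store` and `Transaction`, the single logical clock, and the per-key `threading.Lock` objects are all assumptions made for this illustration. Reads see the snapshot taken at transaction start, writes are buffered, and at commit time the transaction locks its write set and aborts if any of those keys gained a newer committed version.

```python
import threading


class Store:
    def __init__(self):
        self.versions = {}  # key -> list of (commit_ts, value), ascending
        self.locks = {}     # key -> write lock
        self.clock = 0
        self._meta = threading.Lock()

    def next_ts(self):
        with self._meta:
            self.clock += 1
            return self.clock

    def lock_for(self, key):
        with self._meta:
            return self.locks.setdefault(key, threading.Lock())

    def read(self, key, snapshot_ts):
        for commit_ts, value in reversed(self.versions.get(key, [])):
            if commit_ts <= snapshot_ts:
                return value
        return None


class Transaction:
    def __init__(self, store):
        self.store = store
        self.start_ts = store.next_ts()  # snapshot timestamp
        self.writes = {}                 # buffered writes

    def read(self, key):
        if key in self.writes:
            return self.writes[key]
        return self.store.read(key, self.start_ts)

    def write(self, key, value):
        self.writes[key] = value

    def commit(self):
        keys = sorted(self.writes)  # fixed lock order avoids deadlock
        held = [self.store.lock_for(k) for k in keys]
        for lock in held:
            lock.acquire()
        try:
            # Validation: abort if another transaction committed a newer version.
            for k in keys:
                history = self.store.versions.get(k, [])
                if history and history[-1][0] > self.start_ts:
                    return False
            commit_ts = self.store.next_ts()
            for k in keys:
                self.store.versions.setdefault(k, []).append((commit_ts, self.writes[k]))
            return True
        finally:
            for lock in held:
                lock.release()
```

Read-only transactions never take locks in this sketch, which is the usual motivation for combining multiversion reads with optimistic validation on the write path.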
Design & Implementation: Transaction Management(2)
- Commit Protocol and Atomicity:
  - Guarantee: Atomicity: LogBase's commit protocol guarantees an atomicity property similar to that of the WAL+Data approach
  - Commit procedure
Design & Implementation: Failures and Recovery
- Guarantee: Durability: LogBase's recovery protocol guarantees a data durability property similar to that of the WAL+Data approach
- Checkpoint operation
- Recovery procedure
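The interplay between checkpointing and recovery can be sketched as follows. This is a simplified model under assumptions of my own: the log is a Python list whose position doubles as the LSN, a checkpoint is just a copy of the index plus the last LSN it covers, and every logged record is treated as committed. The function names `checkpoint` and `recover` are hypothetical. The point is that recovery reloads the checkpointed index and replays only the log tail written after it.

```python
import copy


def checkpoint(index, last_lsn):
    """Return a snapshot: a copy of the index plus the last LSN it covers."""
    return {"index": copy.deepcopy(index), "lsn": last_lsn}


def recover(log, ckpt=None):
    """Rebuild key -> (timestamp, lsn) of the newest version.

    With a checkpoint, only log records after ckpt["lsn"] are replayed;
    without one, the whole log is scanned.
    """
    index = copy.deepcopy(ckpt["index"]) if ckpt else {}
    start = ckpt["lsn"] + 1 if ckpt else 0
    for lsn in range(start, len(log)):
        key, timestamp, _value = log[lsn]
        current = index.get(key)
        if current is None or timestamp >= current[0]:
            index[key] = (timestamp, lsn)
    return index
```

The more frequently the index is checkpointed, the shorter the log tail that has to be replayed after a failure, at the cost of more checkpoint I/O during normal operation.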
Performance Evaluation: Experimental Setup
- An in-house cluster of 24 machines, each with a quad-core processor, 8 GB of physical memory, 500 GB of disk capacity, and 1 Gigabit Ethernet
- Implemented in Java; inherits basic infrastructure from the open-source HBase
- Compares the performance of LogBase with HBase
- Workload: 5000 operations for warming up the cache
Performance Evaluation: Micro-benchmarks(1)
- Basic data operations:
  - Write
  - Random read
  - Sequential scan
  - Range scan
Performance Evaluation: Micro-benchmarks(2)
Performance Evaluation: Micro-benchmarks(3)
Performance Evaluation: Micro-benchmarks(4)
Performance Evaluation: YCSB Benchmark(1)
- Mixed workloads: 95% and 75% update in the workload
- Varying system sizes: 3 to 24 nodes
Performance Evaluation: YCSB Benchmark(2)
Performance Evaluation: YCSB Benchmark(3)
Performance Evaluation: TPC-W Benchmark(1)
- Examines the performance when accessing multiple data records, possibly from different tables, within a transaction boundary
- Models a webshop application workload
- Browsing, shopping, and ordering mixes: 5%, 20%, and 50% update transactions, respectively
Performance Evaluation: TPC-W Benchmark(2)
Performance Evaluation: Checkpoint and Recovery
Performance Evaluation: Comparison with Log-structured Systems(1)
- RAMCloud: stores its data and indexes entirely in memory
- Hyder: scales its database in shared-flash environments without data partitioning
- LRS: has a distributed architecture and data partitioning strategy similar to RAMCloud and LogBase, but stores data on disks
Performance Evaluation: Comparison with Log-structured Systems(2)
Performance Evaluation: Comparison with Log-structured Systems(3)
Conclusion
- Introduced a scalable log-structured database system called LogBase
- Can be elastically deployed in the cloud
- Can provide sustained write throughput and effective recovery time
- The in-memory indexes support efficient data retrieval
- Provides the widely accepted snapshot isolation for transactions
- Extensive experiments
- Future work: the design and implementation of efficient secondary indexes and query processing for LogBase
Thanks