VLDB2012 Hoang Tam Vo #1, Sheng Wang #2, Divyakant Agrawal †3, Gang Chen §4, Beng Chin Ooi #5 #National University of Singapore, †University of California,


LogBase: A Scalable Log-structured Database System in the Cloud
VLDB 2012
Hoang Tam Vo (National University of Singapore), Sheng Wang (National University of Singapore), Divyakant Agrawal (University of California, Santa Barbara), Gang Chen (Zhejiang University), Beng Chin Ooi (National University of Singapore)
王夏青

Abstract
• Introduction
• Background & Related Work
• Design & Implementation
• Performance Evaluation
• Conclusion

Introduction: Requirements
• High write throughput
• Dynamic scalability
• Efficient multiversion data access
• Transactional semantics
• Fast recovery from machine failures

Introduction: Characteristics
• The log serves as the sole data repository in the system
• Adopts an architecture similar to HBase and BigTable, where each machine in the system is responsible for some tablets
• Builds an index per tablet for retrieving the data from the log

Introduction: Contributions
• Propose LogBase, a scalable log-structured database system that can be dynamically deployed in the cloud
• Design a multiversion index strategy in LogBase to provide efficient access to multiversion data
• Enhance LogBase to support transactional semantics
• Conduct an extensive performance study of LogBase

Background & Related Work
• No-overwrite strategies: System R (shadow paging strategy); POSTGRES (delta records)
• WAL+Data: most storage systems
• Log-structured systems: LFS, BlueSky, Berkeley DB, PrimeBase, Hyder, RAMCloud

Design & Implementation: Data Model
• Model: relational data model
• Data partitioning: vertical (column groups); horizontal (tablets)
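The two partitioning directions can be illustrated with a minimal sketch. The column-group names, the split points, and the use of key-range partitioning for tablets are all assumptions made for illustration; the slide only states that rows are split vertically into column groups and horizontally into tablets.

```python
from bisect import bisect_right

# Hypothetical schema: each column group clusters columns accessed together.
COLUMN_GROUPS = {
    "profile": ["name", "email"],
    "activity": ["last_login", "visits"],
}

# Hypothetical split points (assumes range partitioning on the primary key):
# tablet 0 holds keys < "g", tablet 1 holds "g" <= key < "p", tablet 2 the rest.
SPLITS = ["g", "p"]


def tablet_for(key: str) -> int:
    """Horizontal partitioning: map a primary key to a tablet number."""
    return bisect_right(SPLITS, key)


def split_row(row: dict) -> dict:
    """Vertical partitioning: break one row into per-column-group fragments."""
    return {
        group: {col: row[col] for col in cols if col in row}
        for group, cols in COLUMN_GROUPS.items()
    }
```

For example, `tablet_for("alice")` routes to tablet 0, and `split_row` returns one fragment per column group, each of which could be stored and indexed separately.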

Design & Implementation: Architecture Overview
• Log Repository
• Data Access Manager
• Transaction Manager

Design & Implementation: Log Repository
• Guarantee (stable storage): the log-only approach provides a capability for recovering data from machine failures similar to that of the WAL+Data approach
• Stores the log in HDFS
• Design choices for the implementation of the log
• Log record: LogKey (LSN, table name, tablet information); Data:
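A minimal sketch of an append-only log with the LogKey fields listed above. The in-memory list stands in for a log file in HDFS, the payload is treated as opaque bytes, and the class and method names are hypothetical; none of this is taken from the LogBase code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LogKey:
    lsn: int      # log sequence number
    table: str    # table name
    tablet: str   # tablet information


@dataclass(frozen=True)
class LogRecord:
    key: LogKey
    data: bytes   # payload layout is an assumption; treated as opaque here


class LogRepository:
    """Append-only log; a Python list stands in for a file in HDFS."""

    def __init__(self) -> None:
        self._records: list[LogRecord] = []

    def append(self, table: str, tablet: str, data: bytes) -> int:
        """Append a record and return its LSN (here simply its position)."""
        lsn = len(self._records)
        self._records.append(LogRecord(LogKey(lsn, table, tablet), data))
        return lsn

    def read(self, lsn: int) -> LogRecord:
        """Fetch a record by LSN; records are never updated in place."""
        return self._records[lsn]
```

Because the log is the only copy of the data, every write is a single sequential append, which is where the high write throughput of the log-only design comes from.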

Design & Implementation: In-memory Multiversion Index
• Index: provides efficient access to the data
• In-memory index
• Index structure: B-link trees
• Index entry: IdxKey (primary key + timestamp)
• Consumption analysis
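The composite index key (primary key + timestamp) can be sketched as follows. This example deliberately swaps the B-link tree for a sorted Python list searched with `bisect`, which keeps the same ordering but none of the concurrency properties; the class name and the integer log pointer are assumptions.

```python
import bisect


class MultiversionIndex:
    """Maps (primary_key, timestamp) to a position in the log."""

    def __init__(self) -> None:
        self._keys: list[tuple[str, int]] = []  # sorted (primary_key, timestamp)
        self._ptrs: list[int] = []              # log position for each key

    def insert(self, pk: str, ts: int, ptr: int) -> None:
        i = bisect.bisect_left(self._keys, (pk, ts))
        self._keys.insert(i, (pk, ts))
        self._ptrs.insert(i, ptr)

    def lookup(self, pk: str, as_of: int):
        """Return the log position of the newest version with timestamp <= as_of."""
        i = bisect.bisect_right(self._keys, (pk, as_of))
        if i > 0 and self._keys[i - 1][0] == pk:
            return self._ptrs[i - 1]
        return None
```

Sorting on the composite key places all versions of a record next to each other, so a versioned read is one search followed by a check of the immediately preceding entry.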

Design & Implementation: Tablet Serving(1)

Design & Implementation: Tablet Serving(2)
• Write
• Read
• Delete
• Scan
• Compaction
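The five operations can be sketched against a toy tablet that keeps an append-only log plus an in-memory index. Several choices here are assumptions rather than details from the slides: deletes are modelled as tombstone records, timestamps come from a local counter, and compaction keeps only the latest live version of each key.

```python
class Tablet:
    def __init__(self) -> None:
        self.log = []     # append-only list of (key, ts, value); value None = tombstone
        self.index = {}   # key -> list of (ts, log position), ascending by ts
        self.clock = 0    # local counter standing in for a timestamp source

    def _append(self, key, value):
        self.clock += 1
        self.log.append((key, self.clock, value))
        self.index.setdefault(key, []).append((self.clock, len(self.log) - 1))
        return self.clock

    def write(self, key, value):
        """Write: append to the log, then update the index."""
        return self._append(key, value)

    def delete(self, key):
        """Delete: append a tombstone (assumed representation)."""
        return self._append(key, None)

    def read(self, key, as_of=None):
        """Read: find the newest version visible at as_of via the index."""
        for ts, pos in reversed(self.index.get(key, [])):
            if as_of is None or ts <= as_of:
                return self.log[pos][2]
        return None

    def scan(self, start, end):
        """Scan: return live (key, value) pairs with start <= key < end."""
        out = []
        for key in sorted(self.index):
            if start <= key < end:
                value = self.read(key)
                if value is not None:
                    out.append((key, value))
        return out

    def compact(self):
        """Compaction: rewrite the log in key order, dropping old and deleted data."""
        new_log, new_index = [], {}
        for key in sorted(self.index):
            ts, pos = self.index[key][-1]
            value = self.log[pos][2]
            if value is None:
                continue
            new_log.append((key, ts, value))
            new_index[key] = [(ts, len(new_log) - 1)]
        self.log, self.index = new_log, new_index
```

After compaction the log is sorted by key, so a range scan touches contiguous log entries instead of following index pointers to scattered positions.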

Design & Implementation: Transaction Management(1)
Concurrency Control and Isolation:
• The rationale of MVOCC
• Validation with write locks
• Snapshot isolation in LogBase
• Guarantee (isolation): the hybrid scheme of multiversion optimistic concurrency control (MVOCC) in LogBase guarantees snapshot isolation
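A single-process sketch of optimistic validation with write locks under snapshot isolation. It assumes a first-committer-wins rule on write-write conflicts and uses `threading.Lock` as a stand-in for a distributed lock service; the `Store` and `Transaction` classes are hypothetical and do not reflect LogBase's actual interfaces.

```python
import threading


class Store:
    def __init__(self) -> None:
        self.versions = {}              # key -> list of (commit_ts, value), ascending
        self.locks = {}                 # key -> write lock
        self.ts = 0
        self._mutex = threading.Lock()  # protects ts and the locks dict

    def next_ts(self) -> int:
        with self._mutex:
            self.ts += 1
            return self.ts

    def lock_for(self, key):
        with self._mutex:
            return self.locks.setdefault(key, threading.Lock())

    def read(self, key, as_of):
        for ts, value in reversed(self.versions.get(key, [])):
            if ts <= as_of:
                return value
        return None


class Transaction:
    def __init__(self, store: Store) -> None:
        self.store = store
        self.start_ts = store.next_ts()  # snapshot timestamp
        self.writes = {}                 # buffered until commit

    def read(self, key):
        if key in self.writes:
            return self.writes[key]
        return self.store.read(key, self.start_ts)

    def write(self, key, value) -> None:
        self.writes[key] = value

    def commit(self) -> bool:
        keys = sorted(self.writes)       # fixed lock order avoids deadlock
        locks = [self.store.lock_for(k) for k in keys]
        for lock in locks:
            lock.acquire()
        try:
            # Validation: abort if any written key changed after our snapshot.
            for k in keys:
                vs = self.store.versions.get(k, [])
                if vs and vs[-1][0] > self.start_ts:
                    return False
            commit_ts = self.store.next_ts()
            for k in keys:
                self.store.versions.setdefault(k, []).append((commit_ts, self.writes[k]))
            return True
        finally:
            for lock in locks:
                lock.release()
```

Reads never take locks because they are served from the snapshot; only the short validate-and-install phase at commit is serialized per key.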

Design & Implementation: Transaction Management(2)
Commit Protocol and Atomicity:
• Guarantee (atomicity): LogBase's commit protocol guarantees an atomicity property similar to that of the WAL+Data approach
• Commit procedure

Design & Implementation: Failures and Recovery
• Guarantee (durability): LogBase's recovery protocol guarantees a data durability property similar to that of the WAL+Data approach
• Checkpoint operation
• Recovery procedure
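How a checkpoint shortens recovery can be sketched as follows, assuming the checkpoint stores a copy of the in-memory index together with the log position it covers, and that recovery replays only the log tail after that position. The function names and the dict-based index are illustrative assumptions.

```python
def checkpoint(index: dict, next_lsn: int) -> dict:
    """Persist a copy of the index and the first LSN it does not cover."""
    return {"index": dict(index), "next_lsn": next_lsn}


def recover(log: list, ckpt: dict = None) -> dict:
    """Rebuild the index: load the checkpoint (if any), then replay the log tail."""
    index = dict(ckpt["index"]) if ckpt else {}
    start = ckpt["next_lsn"] if ckpt else 0
    for lsn in range(start, len(log)):
        key, _value = log[lsn]
        index[key] = lsn  # the latest log position wins
    return index
```

Without a checkpoint the whole log is scanned; with one, the replay cost depends only on how much was written since the checkpoint was taken, and both paths produce the same index.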

Performance Evaluation: Experimental Setup
• An in-house cluster of 24 machines, each with a quad-core processor, 8 GB of physical memory, 500 GB of disk capacity, and 1 gigabit Ethernet
• Implemented in Java, inheriting basic infrastructure from the open-source HBase
• Compares the performance of LogBase with HBase
• Workload: 5000 operations
• operations for warming up the cache

Performance Evaluation: Micro-benchmarks(1)
Basic data operations:
• Write
• Random read
• Sequential scan
• Range scan

Performance Evaluation: Micro-benchmarks(2)

Performance Evaluation: Micro-benchmarks(3)

Performance Evaluation: Micro-benchmarks(4)

Performance Evaluation: YCSB Benchmark(1)
• Mixed workloads: 95% and 75% updates
• Varying system sizes: 3 to 24 nodes
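A mixed read/update workload of this kind can be generated with a few lines of Python. This is only an illustration of the operation mix: it is not the YCSB client, and the uniform key distribution, the key naming scheme, and the function name are assumptions.

```python
import random


def make_workload(n_ops: int, update_fraction: float, n_keys: int, seed: int = 0):
    """Return a list of (operation, key) pairs with the given update fraction."""
    rng = random.Random(seed)  # fixed seed keeps runs repeatable
    ops = []
    for _ in range(n_ops):
        op = "update" if rng.random() < update_fraction else "read"
        ops.append((op, f"user{rng.randrange(n_keys)}"))
    return ops
```

Calling `make_workload(1000, 0.95, 100)` and `make_workload(1000, 0.75, 100)` would give rough analogues of the two mixes named on the slide.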

Performance Evaluation: YCSB Benchmark(2)

Performance Evaluation: YCSB Benchmark(3)

Performance Evaluation: TPC-W Benchmark(1)
• Examines performance when a transaction accesses multiple data records, possibly from different tables
• Models a webshop application workload
• Browsing, shopping, and ordering mixes: 5%, 20%, and 50% update transactions, respectively

Performance Evaluation: TPC-W Benchmark(2)

Performance Evaluation: Checkpoint and Recovery

Performance Evaluation: Comparison with Log-structured Systems(1)
• RAMCloud: stores its data and indexes entirely in memory
• Hyder: scales its database in shared-flash environments without data partitioning
• LRS: has a distributed architecture and data partitioning strategy similar to RAMCloud and LogBase, but stores data on disks

Performance Evaluation: Comparison with Log-structured Systems(2)

Performance Evaluation: Comparison with Log-structured Systems(3)

Conclusion
• Introduced a scalable log-structured database system called LogBase
• Can be elastically deployed in the cloud
• Provides sustained write throughput and effective recovery
• The in-memory indexes support efficient data retrieval
• Provides the widely accepted snapshot isolation for transactions
• Extensive experiments
• Future work: the design and implementation of efficient secondary indexes and query processing for LogBase

Thanks