Google Bigtable: A Distributed Storage System for Structured Data
Hadi Salimi, Distributed Systems Laboratory, School of Computer Engineering, Iran University of Science and Technology

Introduction
BigTable is a distributed storage system for managing structured data.
It scales to petabytes of data across thousands of machines.
Developed at Google and in production use since 2005.
Used for more than 60 Google products.

Data Model
(row, column, time) => string
Row keys, column keys, and values are arbitrary strings.
Every read or write of data under a single row key is atomic (regardless of the number of different columns being read or written in the row).
Columns are added dynamically.
Timestamps distinguish different versions of the data.
– Assigned by the client application.
– Older versions are garbage-collected.
Example: a web map.
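The multidimensional sorted map can be pictured as nested dictionaries. The toy sketch below is plain Python invented for illustration (the ToyTable class and its methods are not Google's API); it only shows the shape of the model: cells addressed by (row, column, timestamp), with the newest version returned on read.

```python
import time

# Toy in-memory sketch of the data model: (row, column, timestamp) -> string,
# with multiple timestamped versions per cell. Illustrative only.
class ToyTable:
    def __init__(self):
        self.rows = {}  # row key -> {column key -> {timestamp -> value}}

    def put(self, row, column, value, ts=None):
        ts = ts if ts is not None else time.time_ns()   # client-assigned or "now"
        self.rows.setdefault(row, {}).setdefault(column, {})[ts] = value

    def get(self, row, column):
        """Return the most recent version of a cell, or None if absent."""
        versions = self.rows.get(row, {}).get(column, {})
        return versions[max(versions)] if versions else None

webtable = ToyTable()
webtable.put("com.cnn.www", "contents:", "<html>...</html>")
webtable.put("com.cnn.www", "anchor:cnnsi.com", "CNN")
print(webtable.get("com.cnn.www", "anchor:cnnsi.com"))   # -> CNN
```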

Tablets
Rows are sorted lexicographically.
Ranges of consecutive row keys are grouped together as "tablets".
– Allows for data locality.
– Example: the rows com.google.maps/index.html and com.google.maps/foo.html are likely to be in the same tablet.
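A small illustration of why lexicographic row ordering matters: when URLs are stored under reversed host names (as in the paper's web-table example), pages from the same site sort next to each other and therefore tend to fall into the same tablet. The row_key helper below is invented for this example.

```python
# Reversed-domain row keys cluster pages of one site under lexicographic order.
urls = [
    "maps.google.com/index.html",
    "www.cnn.com/world",
    "maps.google.com/foo.html",
]

def row_key(url):
    host, _, path = url.partition("/")
    return ".".join(reversed(host.split("."))) + "/" + path

for key in sorted(row_key(u) for u in urls):
    print(key)
# com.cnn.www/world
# com.google.maps/foo.html
# com.google.maps/index.html
```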

Column Families
Column keys are grouped into sets called "column families".
A column key is named using the syntax family:qualifier.
Access control and disk/memory accounting are done at the column-family level.
Example: "anchor:cnnsi.com"

API
Data design
– Creating and deleting tables and column families.
– Changing cluster, table, and column-family metadata, such as access control rights.
Client interactions
– Write/delete values.
– Read values.
– Scan row ranges.
– Single-row transactions (e.g., a read-modify-write sequence for data under a row key; see the sketch below).
MapReduce integration
– Read from Bigtable; write to Bigtable.
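The real client interface is a C++ library, so the snippet below is only a rough Python analogy for the single-row transaction guarantee; the TinyTable and Row classes and their methods are invented for this sketch. Because every mutation under one row key is applied atomically, a read-modify-write sequence on a single row needs no cross-row locking.

```python
import threading

# Sketch of single-row atomicity: all mutations to one row are applied
# under that row's lock, so read-modify-write on a single row is atomic.
class Row:
    def __init__(self):
        self.lock = threading.Lock()
        self.cells = {}

class TinyTable:
    def __init__(self):
        self.rows = {}
        self.table_lock = threading.Lock()

    def _row(self, key):
        with self.table_lock:
            return self.rows.setdefault(key, Row())

    def read_modify_write(self, key, column, fn):
        row = self._row(key)
        with row.lock:                                   # single-row transaction
            row.cells[column] = fn(row.cells.get(column))

t = TinyTable()
t.read_modify_write("com.cnn.www", "counter:views", lambda v: (v or 0) + 1)
print(t.rows["com.cnn.www"].cells)   # {'counter:views': 1}
```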

Building Blocks
SSTable file: the data structure used for storage.
– Maps keys to values.
– Ordered: enables data locality for efficient reads and writes.
– Immutable: no concurrency control is needed on reads; deleted data must be garbage-collected.
– Stored in the Google File System (GFS), which replicates the data for redundancy; an SSTable can optionally be mapped into memory.
Chubby: a distributed lock service.
– Stores the location of the root tablet, schema information, and access control lists.
– Used to synchronize with and detect tablet servers.
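A toy model of the SSTable idea, assuming the layout described in the paper (an immutable, sorted sequence of key/value pairs plus a small index of block start keys) but with invented names and an entry-count block size instead of 64 KB byte blocks. A point lookup binary-searches the index and then scans a single block.

```python
from bisect import bisect_right

class ToySSTable:
    BLOCK = 4  # entries per block; real SSTables use ~64 KB byte-sized blocks

    def __init__(self, sorted_items):
        self.items = list(sorted_items)                 # immutable once written
        self.index = [self.items[i][0]                  # first key of each block
                      for i in range(0, len(self.items), self.BLOCK)]

    def get(self, key):
        b = bisect_right(self.index, key) - 1           # candidate block
        if b < 0:
            return None
        start = b * self.BLOCK
        for k, v in self.items[start:start + self.BLOCK]:
            if k == key:
                return v
        return None

sst = ToySSTable([("a", 1), ("c", 2), ("k", 3), ("m", 4), ("z", 5)])
print(sst.get("k"), sst.get("q"))   # 3 None
```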

Implementation
Three components:
1. Client library.
2. Master server (exactly one).
– Assigns tablets to tablet servers.
– Detects the addition and expiration of tablet servers.
– Balances tablet-server load.
– Garbage-collects files in GFS.
– Handles schema changes such as table and column-family creation.
3. Tablet servers (many; dynamically added and removed).
– Each handles read and write requests for the tablets it has loaded.
– Splits tablets that have grown too large (each tablet is typically 100-200 MB).

Tablet Location
How does a client know which node to route a request to? A three-level hierarchy:
– One file in Chubby stores the location of the root tablet.
– The root tablet contains the locations of the METADATA tablets.
– The METADATA table contains the locations of user tablets.
  Row key: [tablet's table ID] + [end row]; value: the tablet's location (node ID).
The client library caches tablet locations.
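The lookup path can be sketched as follows. The split points, server names, and helper functions are made up for illustration; only the structure follows the slide: a Chubby file points at the root tablet, the root tablet points at METADATA tablets, METADATA rows point at user tablets, and the client library caches the result.

```python
ROOT_TABLET = [                          # METADATA row range -> METADATA tablet server
    ("webtable:m", "meta-server-1"),
    ("\uffff", "meta-server-2"),
]
METADATA = {                             # per server: METADATA row key -> tablet server
    "meta-server-1": [("webtable:g", "ts-7"), ("webtable:m", "ts-8")],
    "meta-server-2": [("webtable:\uffff", "ts-9")],
}
cache = {}

def first_covering(entries, key):
    """Return the first entry whose end key covers the lookup key (entries sorted)."""
    for end_key, server in entries:
        if key <= end_key:
            return server
    raise KeyError(key)

def locate(table, row):
    meta_key = f"{table}:{row}"          # METADATA row key: table ID + end row
    if meta_key not in cache:            # a cache miss costs up to three lookups
        meta_server = first_covering(ROOT_TABLET, meta_key)
        cache[meta_key] = first_covering(METADATA[meta_server], meta_key)
    return cache[meta_key]               # a cache hit costs none

print(locate("webtable", "com.cnn.www"))   # ts-7
```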

Tablet Assignment
The master keeps track of tablet assignments and of live tablet servers through Chubby:
– Each tablet server creates and locks a unique file in Chubby.
– A tablet server stops serving if it loses its lock.
– The master periodically checks each tablet server; if the server has failed, the master tries to acquire its lock and un-assigns its tablets.
– A master failure does not change existing tablet assignments.
– On a master restart, the new master rebuilds this state by scanning Chubby and asking each live tablet server which tablets it holds.
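A much-simplified stand-in for the lock-based liveness check; the chubby dict, file paths, and function names are invented (real Chubby is a replicated lock service). Each tablet server holds a lock on a unique file, and once the master determines the lock is no longer held, it deletes the file and marks that server's tablets as unassigned.

```python
chubby = {}   # lock file path -> current holder (toy stand-in for Chubby)

def server_starts(server_id):
    chubby[f"/locks/{server_id}"] = server_id            # acquire the unique lock

def master_check(server_id, unassigned_tablets, assignments):
    path = f"/locks/{server_id}"
    if chubby.get(path) != server_id:                    # server no longer holds its lock
        chubby.pop(path, None)                           # delete the server's file
        unassigned_tablets.extend(assignments.pop(server_id, []))

assignments = {"ts-7": ["webtable:g", "webtable:m"]}
unassigned = []
server_starts("ts-7")
chubby.pop("/locks/ts-7")          # simulate the tablet server losing its lock
master_check("ts-7", unassigned, assignments)
print(unassigned)                  # ['webtable:g', 'webtable:m']
```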

Tablet Serving
Write path:
1. Check that the request is well-formed.
2. Check authorization against a Chubby file.
3. Append the mutation to the "tablet log" (a transaction log used for redo in case of failure).
4. Apply the mutation to the memtable (in RAM).
5. Separately, "compaction" moves memtable data into SSTables and truncates the tablet log.
Read path:
1. Check that the request is well-formed.
2. Check authorization against a Chubby file.
3. Merge the memtable and the SSTables to find the data.
4. Return the data.
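The write and read paths for one tablet can be sketched like this; class and field names are invented, and the real log and SSTables live in GFS rather than in Python lists. A write is logged before it is applied to the memtable, and a read merges the memtable with the SSTables, newest data first.

```python
class ToyTablet:
    def __init__(self):
        self.log = []          # stand-in for the GFS-backed tablet (redo) log
        self.memtable = {}     # recent writes, kept in RAM
        self.sstables = []     # older data on disk, newest SSTable first

    def write(self, key, value):
        self.log.append((key, value))     # append redo record first, for recovery
        self.memtable[key] = value

    def read(self, key):
        if key in self.memtable:          # newest data wins
            return self.memtable[key]
        for sst in self.sstables:
            if key in sst:
                return sst[key]
        return None

t = ToyTablet()
t.sstables = [{"a": "old"}]
t.write("a", "new")
print(t.read("a"))   # new
```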

Compaction
Compaction is used to control the size of the memtable, the tablet log, and the SSTable files:
1. Minor compaction: move data from the memtable into a new SSTable and truncate the tablet log.
2. Merging compaction: merge a few SSTables and the memtable into a single SSTable.
3. Major compaction: merge everything into exactly one SSTable and remove deleted data.
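A toy rendering of the three compaction kinds, operating on the same memtable/SSTable shape as the previous sketch; the function names and the TOMBSTONE marker are illustrative assumptions, not Bigtable internals.

```python
TOMBSTONE = object()   # deletion marker kept until a major compaction removes it

def minor_compaction(tablet):
    """Freeze the memtable into a new SSTable and truncate the tablet log."""
    tablet["sstables"].insert(0, dict(tablet["memtable"]))
    tablet["memtable"].clear()
    tablet["log"].clear()

def merging_compaction(tablet, n=2):
    """Fold the memtable and a few SSTables into a single SSTable."""
    merged = {}
    for sst in reversed(tablet["sstables"][:n]):   # oldest first, newer overwrite
        merged.update(sst)
    merged.update(tablet["memtable"])
    tablet["memtable"].clear()
    tablet["sstables"][:n] = [merged]

def major_compaction(tablet):
    """Rewrite everything into one SSTable and drop deleted entries."""
    merging_compaction(tablet, n=len(tablet["sstables"]))
    only = tablet["sstables"][0]
    tablet["sstables"] = [{k: v for k, v in only.items() if v is not TOMBSTONE}]

tablet = {"log": [("a", 1)], "memtable": {"a": 1, "b": TOMBSTONE}, "sstables": [{"b": 7}]}
major_compaction(tablet)
print(tablet["sstables"])   # [{'a': 1}]
```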

Refinements
Locality groups
– A client can group multiple column families into a locality group. Reads become more efficient because each locality group is stored as a separate SSTable.
Compression
– A client can choose to compress data at the locality-group level.
Two-level caching in tablet servers
– Scan cache (key/value pairs).
– Block cache (SSTable blocks read from GFS).
Bloom filters
– Efficiently check whether an SSTable contains data for a given row/column pair.
Commit log implementation
– Each tablet server has a single commit log (not one per tablet).
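To make the Bloom filter refinement concrete, here is a minimal filter; the size, hash count, and hashing scheme are arbitrary choices for this sketch. Keeping such a filter per SSTable lets most reads for absent row/column pairs skip the disk access entirely; a false positive only costs an unnecessary SSTable lookup, never a wrong answer.

```python
import hashlib

class BloomFilter:
    def __init__(self, m=1024, k=3):
        self.m, self.k, self.bits = m, k, 0

    def _positions(self, key):
        for i in range(self.k):
            h = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(h[:8], "big") % self.m

    def add(self, key):
        for p in self._positions(key):
            self.bits |= 1 << p

    def might_contain(self, key):
        return all((self.bits >> p) & 1 for p in self._positions(key))

bf = BloomFilter()
bf.add("com.cnn.www/contents:")
print(bf.might_contain("com.cnn.www/contents:"))    # True
print(bf.might_contain("com.example.www/anchor:"))  # almost certainly False
```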

Performance Evaluation
Random reads are the slowest: each one must fetch an SSTable block from disk.
Writes are faster than reads: the commit log is append-only, while reads require merging SSTables with the memtable.
Scans reduce the number of read operations.

Performance Evaluation: Scaling
Scaling is not linear, but it is acceptable up to 250 tablet servers.
Random reads scale worst: the block transfers saturate the network.

Conclusions
Bigtable satisfies its goals of highly available, high-performance, massively scalable data storage behind a simple API.
It is successfully used by many Google products (more than 60).
Additional features in progress:
– Secondary indexes.
– Cross-data-center replication.
– Deployment as a hosted service.
Advantages of the custom development:
– Significant flexibility from controlling the data model.
– Bottlenecks and inefficiencies can be removed as they arise.

Bigtable Family Tree
Non-relational databases (HBase, Cassandra, MongoDB, etc.) share Bigtable's heritage:
– Column-oriented data model.
– Multi-level storage (commit log, RAM table, SSTables).
– Tablet management (assignment, splitting, recovery, garbage collection, Bloom filters).
Google technologies and their open-source equivalents:
– GFS => Hadoop Distributed File System (HDFS)
– Chubby => ZooKeeper
– MapReduce => Apache Hadoop MapReduce

Any Questions?