

Bigtable: A Distributed Storage System for Structured Data
Jing Zhang
EECS 584, Fall 2011 (7/2/2015)
Reference: Handling Large Datasets at Google: Current System and Future Directions, Jeff Dean

Outline
- Motivation
- Data Model
- APIs
- Building Blocks
- Implementation
- Refinement
- Evaluation


Google’s Motivation: Scale!
- Scale problem
  - Lots of data
  - Millions of machines
  - Different projects/applications
  - Hundreds of millions of users
- Storage for (semi-)structured data
- No commercial system big enough
  - Couldn’t afford it if there were one
- Low-level storage optimizations help performance significantly
  - Much harder to do when running on top of a database layer

Bigtable
- Distributed multi-level map
- Fault-tolerant, persistent
- Scalable
  - Thousands of servers
  - Terabytes of in-memory data
  - Petabytes of disk-based data
  - Millions of reads/writes per second, efficient scans
- Self-managing
  - Servers can be added/removed dynamically
  - Servers adjust to load imbalance

Real Applications (figure slide; image not preserved in the transcript)


Data Model
- A sparse, distributed, persistent, multidimensional sorted map
- (row, column, timestamp) -> cell contents

Data Model: Rows
- Arbitrary string
- Access to data in a row is atomic
- Ordered lexicographically

Data Model: Columns
- Two-level name structure: family:qualifier
- Column family is the unit of access control

Data Model: Timestamps
- Store different versions of data in a cell
- Lookup options
  - Return most recent K values
  - Return all values
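The map described on the Data Model slides can be sketched as nested dictionaries. This is only a toy in-memory model: the class name `SparseTable`, its methods, and the sample row and column names are illustrative assumptions, not Bigtable's actual interface, and nothing here is distributed or persistent.

```python
from collections import defaultdict


class SparseTable:
    """Toy model of the (row, column, timestamp) -> cell contents map."""

    def __init__(self):
        # row key -> "family:qualifier" -> {timestamp: value}
        self._rows = defaultdict(lambda: defaultdict(dict))

    def put(self, row, column, timestamp, value):
        """Store one version of a cell."""
        self._rows[row][column][timestamp] = value

    def get(self, row, column, k=1):
        """Return up to k (timestamp, value) pairs, newest first."""
        versions = self._rows.get(row, {}).get(column, {})
        return sorted(versions.items(), reverse=True)[:k]

    def get_all(self, row, column):
        """Return every stored version of a cell, newest first."""
        versions = self._rows.get(row, {}).get(column, {})
        return sorted(versions.items(), reverse=True)

    def row_keys(self):
        """Row keys in lexicographic order."""
        return sorted(self._rows)


# Hypothetical sample data.
table = SparseTable()
table.put("com.cnn.www", "contents:", 3, "<html>v3")
table.put("com.cnn.www", "contents:", 5, "<html>v5")
table.put("com.cnn.www", "anchor:cnnsi.com", 9, "CNN")
table.put("com.bbc.www", "contents:", 4, "<html>bbc")

latest = table.get("com.cnn.www", "contents:", k=1)
```

Most rows leave most columns empty, which is why the sketch stores only the cells that were actually written.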

Data Model: Tablets
- The row range for a table is dynamically partitioned
- Each row range is called a tablet
- Tablet is the unit for distribution and load balancing


APIs
- Metadata operations
  - Create/delete tables, column families, change metadata
- Writes
  - Set(): write cells in a row
  - DeleteCells(): delete cells in a row
  - DeleteRow(): delete all cells in a row
- Reads
  - Scanner: read arbitrary cells in a bigtable
    - Each row read is atomic
    - Can restrict returned rows to a particular range
    - Can ask for just data from 1 row, all rows, etc.
    - Can ask for all columns, just certain column families, or specific columns
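The write and read operations listed above can be imitated on a plain dictionary to show what each one does. The function names below are hypothetical Python analogues of Set(), DeleteCells(), DeleteRow() and Scanner; they are not the real client library, and per-row atomicity is not modeled.

```python
def set_cells(rows, row_key, cells):
    """Analogue of Set(): write cells in one row."""
    rows.setdefault(row_key, {}).update(cells)


def delete_cells(rows, row_key, columns):
    """Analogue of DeleteCells(): delete the named cells in one row."""
    row = rows.get(row_key, {})
    for column in columns:
        row.pop(column, None)


def delete_row(rows, row_key):
    """Analogue of DeleteRow(): delete all cells in one row."""
    rows.pop(row_key, None)


def scan(rows, start_row=None, end_row=None, families=None):
    """Analogue of Scanner: yield (row_key, cells) in sorted row order.

    start_row is inclusive, end_row is exclusive; families is a set of
    column-family names, or None to return every column.
    """
    for key in sorted(rows):
        if start_row is not None and key < start_row:
            continue
        if end_row is not None and key >= end_row:
            break
        cells = {
            column: value
            for column, value in rows[key].items()
            if families is None or column.split(":", 1)[0] in families
        }
        if cells:
            yield key, cells


# Hypothetical sample data.
rows = {}
set_cells(rows, "com.cnn.www", {"anchor:cnnsi.com": "CNN", "contents:": "<html>"})
set_cells(rows, "com.bbc.www", {"contents:": "<html>bbc"})
delete_cells(rows, "com.cnn.www", ["contents:"])

anchors_only = list(scan(rows, families={"anchor"}))
```

Restricting a scan to a row range or to a few column families is what keeps reads cheap when a table is very wide or very long.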


Typical Cluster
- Shared pool of machines that also run other distributed applications

Building Blocks
- Google File System (GFS)
  - Stores persistent data (SSTable file format)
- Scheduler
  - Schedules jobs onto machines
- Chubby
  - Lock service: distributed lock manager
  - Master election, location bootstrapping
- MapReduce (optional)
  - Data processing
  - Read/write Bigtable data

Chubby
- {lock/file/name} service
- Coarse-grained locks
- Each client has a session with Chubby.
  - The session expires if the client is unable to renew its session lease within the lease expiration time.
- 5 replicas; need a majority vote to be active
- Also an OSDI ’06 paper
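The lease and majority rules on the Chubby slide can be sketched as follows. The `Session` class, the `has_quorum` helper, and the 10-second lease are assumptions made for illustration; the clock is passed in explicitly so the behaviour is easy to follow. This is not Chubby's real protocol.

```python
class Session:
    """Toy client session kept alive by renewing a lease."""

    def __init__(self, lease_seconds, now):
        self.lease_seconds = lease_seconds
        self.expires_at = now + lease_seconds

    def is_live(self, now):
        """A session is live until its lease expiration time passes."""
        return now < self.expires_at

    def renew(self, now):
        """Extend the lease if it has not expired; report whether it is live."""
        if not self.is_live(now):
            return False
        self.expires_at = now + self.lease_seconds
        return True


def has_quorum(live_replicas, total_replicas=5):
    """The service is active only while a majority of replicas is up."""
    return live_replicas > total_replicas // 2


session = Session(lease_seconds=10, now=0)
renewed_in_time = session.renew(now=5)    # lease now runs until t=15
renewed_too_late = session.renew(now=20)  # lease expired at t=15
```

With 5 replicas the service tolerates two failures, since three live replicas still form a majority.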


Implementation
- Single-master distributed system
- Three major components
  - Library that is linked into every client
  - One master server
    - Assigning tablets to tablet servers
    - Detecting addition and expiration of tablet servers
    - Balancing tablet-server load
    - Garbage collection
    - Metadata operations
  - Many tablet servers
    - Tablet servers handle read and write requests to their tablets
    - Tablet servers split tablets that have grown too large

Implementation (figure slide; image not preserved in the transcript)

Tablets
- Each tablet is assigned to one tablet server.
  - Tablet holds a contiguous range of rows
    - Clients can often choose row keys to achieve locality
  - Aim for ~100MB to 200MB of data per tablet
- Tablet server is responsible for ~100 tablets
  - Fast recovery: 100 machines each pick up 1 tablet from the failed machine
  - Fine-grained load balancing
    - Migrate tablets away from overloaded machine
    - Master makes load-balancing decisions
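The sizing figures on this slide imply some simple arithmetic, worked through below. The constants come from the slide; the helper name `recovery_share` and the assumption that a failed server's tablets are spread evenly over the helpers are illustrative.

```python
import math

TABLETS_PER_SERVER = 100     # "~100 tablets" per tablet server
TABLET_SIZE_MB = (100, 200)  # "~100MB to 200MB" of data per tablet


def recovery_share(helpers):
    """Tablets and data (MB range) each helper takes over from one failed server.

    Assumes the failed server's tablets are spread evenly over the helpers.
    """
    tablets_each = math.ceil(TABLETS_PER_SERVER / helpers)
    low, high = TABLET_SIZE_MB
    return tablets_each, (tablets_each * low, tablets_each * high)


# Data held by one tablet server: roughly 10 GB to 20 GB.
per_server_mb = recovery_share(1)[1]

# With 100 helpers, each one picks up a single tablet of 100 MB to 200 MB.
one_tablet_each = recovery_share(100)
```

Keeping tablets small relative to a server's total load is what lets recovery proceed in parallel instead of one replacement machine reloading everything.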

How to Locate a Tablet?
- Given a row, how do clients find the location of the tablet whose row range covers the target row?
- METADATA: Key: table id + end row, Data: location
- Aggressive caching and prefetching at client side
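Because METADATA is keyed by table id plus end row, a lookup amounts to finding the first key that is not smaller than the target row. The sketch below shows this with a binary search. The `MetadataIndex` class, the server addresses, and the `"\uffff"` sentinel used as the end row of the last tablet are assumptions for illustration; client-side caching and prefetching are not modeled.

```python
import bisect


class MetadataIndex:
    """Toy METADATA index: (table id, end row) -> tablet location."""

    def __init__(self, entries):
        # entries: iterable of ((table_id, end_row), location)
        self._entries = sorted(entries)
        self._keys = [key for key, _ in self._entries]

    def locate(self, table_id, row):
        """Return the location of the tablet whose row range covers `row`.

        Each tablet covers the rows after the previous tablet's end row,
        up to and including its own end row.
        """
        i = bisect.bisect_left(self._keys, (table_id, row))
        if i < len(self._keys) and self._keys[i][0] == table_id:
            return self._entries[i][1]
        raise KeyError((table_id, row))


LAST = "\uffff"  # assumed sentinel that sorts after every real row key

index = MetadataIndex([
    (("webtable", "com.f"), "tabletserver-1"),
    (("webtable", "com.p"), "tabletserver-2"),
    (("webtable", LAST), "tabletserver-3"),
    (("users", LAST), "tabletserver-4"),
])

location = index.locate("webtable", "com.cnn.www")
```

Keying by end row rather than start row means a single ordered search lands directly on the covering tablet.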

Tablet Assignment
- Each tablet is assigned to one tablet server at a time.
- Master server keeps track of the set of live tablet servers and the current assignments of tablets to servers.
- When a tablet is unassigned, the master assigns it to a tablet server with sufficient room.
- The master uses Chubby to monitor the health of tablet servers and to restart/replace failed servers.

Tablet Assignment
- Chubby
  - Tablet server registers itself by getting a lock in a specific Chubby directory
    - Chubby gives a “lease” on the lock, which must be renewed periodically
    - Server loses the lock if it gets disconnected
  - Master monitors this directory to find which servers exist/are alive
    - If a server is not contactable or has lost its lock, the master grabs the lock and reassigns its tablets
- GFS replicates data. Prefer to start a tablet server on the same machine that the data is already on
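The reassignment step can be sketched as a function that moves tablets away from servers that no longer hold their lock. The function name `reassign`, the per-server capacity limit, and the least-loaded placement rule are assumptions made for illustration; the slides only say that the master picks a tablet server with sufficient room.

```python
def reassign(assignments, live_servers, capacity):
    """Return a new tablet -> server map after some servers have failed.

    assignments:  dict of tablet -> server currently responsible for it
    live_servers: servers that still hold their Chubby lock
    capacity:     assumed maximum number of tablets per server
    """
    load = {server: 0 for server in live_servers}
    result = {}
    orphans = []
    for tablet, server in sorted(assignments.items()):
        if server in load:
            result[tablet] = server  # server is alive, keep the assignment
            load[server] += 1
        else:
            orphans.append(tablet)   # server lost its lock

    for tablet in orphans:
        candidates = [s for s in sorted(load) if load[s] < capacity]
        if not candidates:
            raise RuntimeError("no tablet server has sufficient room")
        target = min(candidates, key=lambda s: load[s])  # least-loaded server
        result[tablet] = target
        load[target] += 1
    return result


# Hypothetical cluster: server A has lost its lock.
before = {"t1": "A", "t2": "A", "t3": "B", "t4": "C"}
after = reassign(before, live_servers={"B", "C"}, capacity=3)
```

Tablets on healthy servers are left where they are, so a failure only causes movement for the tablets that were actually orphaned.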


Refinement: Locality Groups & Compression
- Locality groups
  - Can group multiple column families into a locality group
    - A separate SSTable is created for each locality group in each tablet.
  - Segregating column families that are not typically accessed together enables more efficient reads.
    - In WebTable, page metadata can be in one group and the contents of the page in another group.
- Compression
  - Many opportunities for compression
    - Similar values in the cell at different timestamps
    - Similar values in different columns
    - Similar values across adjacent rows
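The effect of locality groups can be shown by partitioning one tablet's cells into a separate store per group. The function name, the group names, and the column-family names below are hypothetical, and plain dictionaries stand in for SSTables; the sketch also assumes every column family belongs to exactly one group.

```python
def split_by_locality_group(cells, groups):
    """Partition one tablet's cells into one store per locality group.

    cells:  dict of (row, "family:qualifier") -> value
    groups: dict of group name -> set of column-family names
    """
    family_to_group = {
        family: group for group, families in groups.items() for family in families
    }
    stores = {group: {} for group in groups}
    for (row, column), value in cells.items():
        family = column.split(":", 1)[0]
        stores[family_to_group[family]][(row, column)] = value
    return stores


# Hypothetical WebTable-style layout: small metadata apart from large contents.
groups = {
    "metadata": {"language", "checksum"},
    "content": {"contents"},
}
cells = {
    ("com.cnn.www", "language:"): "EN",
    ("com.cnn.www", "checksum:"): "9f2c",
    ("com.cnn.www", "contents:"): "<html>...</html>",
}
stores = split_by_locality_group(cells, groups)
```

A reader that only needs page metadata then touches the "metadata" store and never has to read past the much larger page contents.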


Performance - Scaling
As the number of tablet servers is increased by a factor of 500:
- Performance of random reads from memory increases by a factor of 300.
- Performance of scans increases by a factor of 260.
Not linear! Why?

Why Not Linear?
- Load imbalance
  - Competition with other processes
    - Network
    - CPU
  - Rebalancing algorithm does not work perfectly
    - Rebalancing is throttled to reduce the number of tablet movements
    - Load shifts around as the benchmark progresses
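The scaling figures quoted above can be turned into a scaling efficiency, meaning the fraction of ideal linear speedup that was achieved. The helper name `scaling_efficiency` is an illustrative assumption; the factors 500, 300 and 260 are the ones from the slide.

```python
def scaling_efficiency(speedup, server_factor):
    """Fraction of ideal linear speedup achieved (1.0 would be perfectly linear)."""
    return speedup / server_factor


SERVER_FACTOR = 500  # growth in the number of tablet servers

random_read_efficiency = scaling_efficiency(300, SERVER_FACTOR)
scan_efficiency = scaling_efficiency(260, SERVER_FACTOR)

print(f"random reads (mem): {random_read_efficiency:.0%} of linear")
print(f"scans:              {scan_efficiency:.0%} of linear")
```

Put differently, each server in the large configuration delivers about 60% (random reads from memory) and 52% (scans) of the throughput a server delivered in the small one.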