

1 BigTable CSE 490h, Autumn 2008

2 What is BigTable?
• “A BigTable is a sparse, distributed, persistent multidimensional sorted map. The map is indexed by a row key, a column key, and a timestamp; each value in the map is an uninterpreted array of bytes.”
• (row:string, column:string, time:int64) -> string
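The definition above can be sketched as a small in-memory structure. This is only a toy model of the (row, column, time) -> bytes mapping, not how BigTable is implemented; the class name `TinyTable` and its methods are made up for illustration, and the choice to return the newest version by default is an assumption.

```python
from bisect import bisect_left, insort


class TinyTable:
    """Toy model of a sparse, sorted (row, column, time) -> bytes map."""

    def __init__(self):
        # Keys are (row, column, -timestamp): rows and columns sort
        # lexicographically, and newer versions of a cell sort first.
        self._keys = []
        self._values = {}

    def put(self, row, column, timestamp, value):
        key = (row, column, -timestamp)
        if key not in self._values:
            insort(self._keys, key)
        self._values[key] = value

    def get(self, row, column, timestamp=None):
        """Return the newest value at or before `timestamp` (newest overall if None)."""
        probe = (row, column, float("-inf") if timestamp is None else -timestamp)
        i = bisect_left(self._keys, probe)
        if i < len(self._keys) and self._keys[i][:2] == (row, column):
            return self._values[self._keys[i]]
        return None  # sparse: absent cells simply have no entry

    def scan_row(self, row):
        """Yield (column, timestamp, value) for one row, in sorted key order."""
        i = bisect_left(self._keys, (row,))
        while i < len(self._keys) and self._keys[i][0] == row:
            _, column, neg_ts = self._keys[i]
            yield column, -neg_ts, self._values[self._keys[i]]
            i += 1
```

Because the keys are kept sorted, a lookup is a binary search and reading one row is a contiguous scan. Values are stored as opaque bytes, matching the "uninterpreted array of bytes" in the definition.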

3 Motivating example
• Lots of semi-structured data
• Enormous scale

4 Why?
• You often need more than a file system
  ◦ Cf. Windows Vista
• Do you need a full DBMS?
  ◦ Few DBMSs support the requisite scale
  ◦ Most DBMSs require very expensive infrastructure
  ◦ Google had highly optimized lower-level systems that could be exploited
    ▪ GFS, Chubby, MapReduce, job scheduling
  ◦ DBMSs provide more than Google needed
    ▪ E.g., full transactions, SQL

5 Relative to a DBMS, BigTable provides …
• (-) Simplified data retrieval mechanism
  ◦ (row, column, time) -> string only
  ◦ No relational operators
• (-) Atomic updates only possible at row level
• (+) Arbitrary number of columns per row
• (+) Arbitrary data type for each column
• Designed for Google’s application set
• Provides extremely large scale (data, throughput) at extremely small cost
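As an illustration of the trade-offs above, here is a minimal single-process sketch in which every row may carry any set of columns and a multi-column update is atomic only within one row. The names `RowAtomicStore`, `mutate_row`, and `read_row` are hypothetical, and the per-row lock is an assumption made for the sketch, not a description of BigTable's actual mechanism.

```python
import threading
from collections import defaultdict


class RowAtomicStore:
    """Toy store: arbitrary columns per row, atomic updates per row only."""

    def __init__(self):
        self._rows = {}                              # row key -> {column: bytes}
        self._locks = defaultdict(threading.Lock)    # one lock per row (assumption)
        self._locks_guard = threading.Lock()         # protects the lock table itself

    def _lock_for(self, row):
        with self._locks_guard:
            return self._locks[row]

    def mutate_row(self, row, updates, deletes=()):
        """Apply all column changes to one row as a single atomic step."""
        with self._lock_for(row):
            new_row = dict(self._rows.get(row, {}))
            new_row.update(updates)
            for column in deletes:
                new_row.pop(column, None)
            self._rows[row] = new_row

    def read_row(self, row):
        """Return a copy of the row; readers never see a half-applied mutation."""
        with self._lock_for(row):
            return dict(self._rows.get(row, {}))
```

There is deliberately no method that updates two rows together: a caller who needs that would have to build it on top, which is the cost of giving up full transactions.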

6 Physical representation of data
• A logical “table” is divided into multiple tablets
  ◦ Each tablet is one or more SSTable files in GFS
• Each tablet stores an interval of table rows
  ◦ Rows are lexicographically ordered (by key)
  ◦ If a tablet grows beyond a certain size, it is split into two new tablets
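The row-interval and splitting behavior described above might look like the following toy model. It keeps tablets in memory rather than as SSTable files, and it uses a row count as a stand-in for the unspecified size threshold; `Tablet`, `Table`, and `max_rows_per_tablet` are names invented for this sketch.

```python
from bisect import bisect_right


class Tablet:
    """Holds the rows whose keys fall in [start_key, next tablet's start_key)."""

    def __init__(self, start_key, rows=None):
        self.start_key = start_key
        self.rows = rows if rows is not None else {}


class Table:
    def __init__(self, max_rows_per_tablet=4):
        self.max_rows = max_rows_per_tablet   # stand-in for "a certain size"
        self.tablets = [Tablet("")]           # "" sorts before every row key

    def _find(self, row):
        # Index of the last tablet whose start_key is <= row.
        starts = [t.start_key for t in self.tablets]
        return bisect_right(starts, row) - 1

    def put(self, row, value):
        i = self._find(row)
        tablet = self.tablets[i]
        tablet.rows[row] = value
        if len(tablet.rows) > self.max_rows:
            self._split(i)

    def get(self, row):
        return self.tablets[self._find(row)].rows.get(row)

    def _split(self, i):
        # Split at the median key: the upper half moves to a new tablet.
        tablet = self.tablets[i]
        keys = sorted(tablet.rows)
        mid = keys[len(keys) // 2]
        upper_rows = {k: tablet.rows.pop(k) for k in keys if k >= mid}
        self.tablets.insert(i + 1, Tablet(mid, upper_rows))
```

Since tablets are ordered by start key, locating the tablet for a row is a binary search over the start keys, and a split only touches the one tablet that grew too large.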

7 Software structure of a BigTable cell
• One master server
  ◦ Communicates only with tablet servers
• Multiple tablet servers
  ◦ Perform actual client accesses
• Chubby lock service holds metadata (e.g., the location of the root metadata tablet for the table), handles master election
  ◦ How?
• GFS servers provide underlying storage
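One way to picture how a client could find the right tablet server, starting from the root metadata tablet location held in Chubby, is the sketch below. Everything concrete in it is an assumption for illustration: the Chubby path, the tablet and server names, the two-level depth below the root, and the choice to index each range by its last row key.

```python
from bisect import bisect_left

# Hypothetical contents, invented for this sketch.
CHUBBY = {"/bigtable/root-location": "root"}        # made-up path and value
ROOT_TABLET = [("m", "meta-0"), ("\uffff", "meta-1")]
METADATA_TABLETS = {
    "meta-0": [("f", "tabletserver-1"), ("m", "tabletserver-2")],
    "meta-1": [("t", "tabletserver-3"), ("\uffff", "tabletserver-1")],
}


def lookup(index, key):
    """index is sorted by end key; return the target of the first range covering key."""
    end_keys = [end for end, _ in index]
    i = bisect_left(end_keys, key)
    if i == len(index):
        raise KeyError(key)
    return index[i][1]


def locate(row):
    """Chubby -> root metadata tablet -> metadata tablet -> tablet server."""
    root_name = CHUBBY["/bigtable/root-location"]
    assert root_name == "root"   # the sketch has exactly one root tablet
    meta_name = lookup(ROOT_TABLET, row)
    return lookup(METADATA_TABLETS[meta_name], row)
```

For example, `locate("grape")` walks from the root entry ending at "m" to `meta-0`, then to the range ending at "m", and returns `"tabletserver-2"`. Note that the master is not involved in the lookup, consistent with the slide's point that the master communicates only with tablet servers.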

8 High-level structure
[Diagram: a BigTable cell and the systems it relies on]
• BigTable client, linked with the BigTable client library
• BigTable master: performs metadata ops and load balancing
• BigTable tablet server: serves data
• Cluster scheduling system: handles failover, monitoring
• GFS: holds tablet data, logs
• Chubby: holds metadata, handles master election

9 A few implementation notes
• Finding the tablet holding the row you want
  ◦ Hierarchical metadata tablets
• Writing
  ◦ Append-only log held in GFS
• A million and one details
  ◦ Locality groups
  ◦ Compression
  ◦ Caching
  ◦ Bloom filters
  ◦ Commit log implementation
  ◦ Tablet recovery
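The write path and tablet recovery notes can be illustrated with a minimal sketch: each mutation is appended to a log before it becomes visible, and a restarted server rebuilds its state by replaying that log. A Python list stands in for the log file in GFS, and the record format, the class name `LoggedTablet`, and the in-memory row dictionary are all assumptions of the sketch.

```python
import json


class LoggedTablet:
    """Toy tablet whose state can always be rebuilt from an append-only log."""

    def __init__(self, log):
        self.log = log     # append-only list of records (stand-in for a GFS file)
        self.rows = {}     # in-memory view: row -> {column: value}
        # Recovery: replay every logged mutation in order.
        for record in self.log:
            self._apply(json.loads(record))

    def _apply(self, mutation):
        row = self.rows.setdefault(mutation["row"], {})
        row[mutation["column"]] = mutation["value"]

    def write(self, row, column, value):
        mutation = {"row": row, "column": column, "value": value}
        self.log.append(json.dumps(mutation))   # 1. record the mutation first
        self._apply(mutation)                   # 2. then make it visible

    def read(self, row, column):
        return self.rows.get(row, {}).get(column)
```

Constructing a second `LoggedTablet` over the same log reproduces the first one's rows, which is the property recovery depends on. Logging before applying means a crash between the two steps loses nothing that was acknowledged.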

