Bigtable: A Distributed Storage System for Structured Data Fay Chang, Jeffrey Dean, Sanjay Ghemawat, Wilson C. Hsieh, Deborah A. Wallach, Mike Burrows,


1 Bigtable: A Distributed Storage System for Structured Data Fay Chang, Jeffrey Dean, Sanjay Ghemawat, Wilson C. Hsieh, Deborah A. Wallach, Mike Burrows, Tushar Chandra, Andrew Fikes, Robert E. Gruber Reporter: yu-wen chang

2 Abstract  Bigtable (Bt) is a distributed storage system for managing structured data that is designed to scale to a very large size.  Many projects at Google store data in Bigtable, including web indexing, Google Earth, and Google Finance.

3 Outline  Introduction  Data Model  APIs  Building Blocks  Implementation  Refinements  Performance  Real Applications  Conclusion

4 Introduction  Bigtable is designed to reliably scale to petabytes of data and thousands of machines.  Bigtable has achieved several goals:  Wide applicability.  Scalability.  High performance.  High availability.

5 Data Model  A Bigtable is a sparse, distributed, persistent multidimensional sorted map.  The map is indexed by a row key, column key, and a timestamp.

6 Data Model-Row  The row keys in a table are arbitrary strings.  Bigtable maintains data in lexicographic order by row key.  Each row range is called a tablet, which is the unit of distribution and load balancing.

7 Data Model-Column  Column keys are grouped into sets called column families.  A column key is named using the following syntax: family:qualifier.  In the Webtable example, a useful column family is anchor: each column key in this family names a referring site, and the cell holds the link text.

8 Data Model-Timestamp  Each cell in a Bigtable can contain multiple versions of the same data; these versions are indexed by timestamp.
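The data model on slides 5 to 8 can be sketched as a map from (row key, column key, timestamp) to a value. The Python below is a minimal illustration only: the class name `TinyTable`, its methods, and the sample values are hypothetical and are not the Bigtable API.

```python
from collections import defaultdict


class TinyTable:
    """Toy in-memory model of the (row, column, timestamp) -> value map."""

    def __init__(self):
        # row key -> column key ("family:qualifier") -> {timestamp: value}
        self._rows = defaultdict(lambda: defaultdict(dict))

    def put(self, row, column, timestamp, value):
        self._rows[row][column][timestamp] = value

    def get(self, row, column, timestamp=None):
        """Return the newest version at or before `timestamp` (newest overall if None)."""
        versions = self._rows.get(row, {}).get(column, {})
        candidates = [t for t in versions if timestamp is None or t <= timestamp]
        return versions[max(candidates)] if candidates else None

    def row_keys(self):
        """Return row keys in sorted order, mirroring Bigtable's ordering by row key."""
        return sorted(self._rows)


table = TinyTable()
table.put("com.yahoo.tw", "contents:", 3, "<html>v3")   # illustrative values
table.put("com.yahoo.tw", "contents:", 5, "<html>v5")
table.put("com.yahoo.bid.tw", "anchor:tw.yahoo.com", 9, "auctions")
```

Calling `table.get("com.yahoo.tw", "contents:")` returns the newest version, while passing `timestamp=4` returns the version written at timestamp 3. A real Bigtable is sparse and distributed; this sketch only shows the indexing scheme.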

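9 Data Model  [Figure: example table. Row keys are reversed hostnames, for example com.yahoo.tw for http://tw.yahoo.com/; the other URLs shown are http://tw.bid.yahoo.com and http://buy.yahoo.com.tw. The Anchor column family holds cells such as "Anchor: 拍賣" (auctions) and "Anchor: 購物中心" (shopping center), each tagged with a timestamp T.]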
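The row keys in the example above appear to be hostnames with their components reversed, which places pages from the same domain next to each other in sorted order. A small sketch of that transformation follows; the function name `row_key` is hypothetical, and only the hostname is handled here.

```python
from urllib.parse import urlsplit


def row_key(url):
    """Reverse the hostname components of a URL: tw.yahoo.com -> com.yahoo.tw."""
    host = urlsplit(url).hostname or ""
    return ".".join(reversed(host.split(".")))


urls = ["http://tw.yahoo.com/", "http://tw.bid.yahoo.com", "http://buy.yahoo.com.tw"]
keys = sorted(row_key(u) for u in urls)
```

After sorting, the two yahoo.com pages (`com.yahoo.bid.tw` and `com.yahoo.tw`) are adjacent, while the yahoo.com.tw page sorts separately under `tw.`.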

10 APIs  The Bigtable API provides functions for:  Creating and deleting tables and column families.  Changing cluster, table and column family metadata.

11 Building Blocks  Bigtable uses the distributed Google File System (GFS) to store log and data files.  The Google SSTable file format is used internally to store Bigtable data.  An SSTable provides a persistent, ordered immutable map from keys to values, where both keys and values are arbitrary byte strings.
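To make the SSTable description concrete, here is a minimal sketch of a persistent-style, ordered, immutable map from byte-string keys to byte-string values. It is an assumption-laden toy: the class `TinySSTable` is hypothetical, lives entirely in memory, and ignores the on-disk block layout of the real format.

```python
import bisect


class TinySSTable:
    """Toy immutable, ordered map from byte-string keys to byte-string values."""

    def __init__(self, items):
        pairs = sorted(dict(items).items())
        # Tuples cannot be modified after construction, so the table is immutable.
        self._keys = tuple(k for k, _ in pairs)
        self._values = tuple(v for _, v in pairs)

    def get(self, key):
        """Binary-search for a single key; return None if absent."""
        i = bisect.bisect_left(self._keys, key)
        if i < len(self._keys) and self._keys[i] == key:
            return self._values[i]
        return None

    def scan(self, start, end):
        """Return (key, value) pairs with start <= key < end, in key order."""
        lo = bisect.bisect_left(self._keys, start)
        hi = bisect.bisect_left(self._keys, end)
        return list(zip(self._keys[lo:hi], self._values[lo:hi]))


sst = TinySSTable({
    b"com.yahoo.tw": b"home",
    b"com.yahoo.bid.tw": b"auctions",
    b"tw.com.yahoo.buy": b"shop",
})
```

Because keys are kept sorted, both point lookups and range scans reduce to binary searches over the key sequence.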

12 Building Blocks(cont.)  Bigtable relies on a highly-available and persistent distributed lock service called Chubby.  Chubby provides a namespace that consists of directories and small files. Each directory or file can be used as a lock.

13 Implementation  The Bigtable implementation has three major components:  A library that is linked into every client.  One master server.  Many tablet servers.

14 Implementation

15  Each tablet server manages a set of tablets.  The tablet server handles read and write requests to the tablets that it has loaded, and also splits tablets that have grown too large.

16 Implementation-Tablet Location  We use a three-level hierarchy analogous to that of a B+-tree to store tablet location information.  Each METADATA row stores approximately 1 KB of data in memory.  With a modest limit of 128 MB per METADATA tablet, the three-level scheme is sufficient to address 2^34 tablets.
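The capacity of the three-level scheme follows from the two figures on this slide: roughly 1 KB per METADATA row and a 128 MB limit per METADATA tablet. The arithmetic below is a worked sketch; the variable names are illustrative, and the last line assumes user tablets are also 128 MB.

```python
METADATA_TABLET_BYTES = 128 * 2**20   # 128 MB limit per METADATA tablet
ROW_BYTES = 2**10                     # about 1 KB per METADATA row

# Each METADATA tablet (including the root tablet) holds this many tablet locations.
rows_per_tablet = METADATA_TABLET_BYTES // ROW_BYTES

# The root tablet points at METADATA tablets, each of which points at user tablets.
addressable_tablets = rows_per_tablet ** 2

# Total addressable data if every user tablet is 128 MB (an assumption).
addressable_bytes = addressable_tablets * METADATA_TABLET_BYTES
```

This gives 2^17 locations per METADATA tablet, 2^34 addressable tablets, and 2^61 bytes of addressable data under the 128 MB assumption.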

17 Implementation-Tablet Location

18  The client library caches tablet locations.  Tablet locations are stored in memory, so no GFS accesses are required.  The METADATA table also stores secondary information, including a log of all events pertaining to each tablet.

19 Implementation-Tablet Assignment  Each tablet is assigned to one tablet server at a time.  Bigtable uses Chubby to keep track of tablet servers.  When a tablet server starts, it creates, and acquires an exclusive lock on, a uniquely-named file in a specific Chubby directory.

20 Implementation-Tablet Assignment  When a master is started by the cluster management system, it needs to discover the current tablet assignments before it can change them.

21 Implementation-Tablet Assignment  The master executes the following steps at startup:  The master grabs a unique master lock in Chubby.  The master scans the servers directory in Chubby to find the live servers.  The master communicates with every live tablet server to discover what tablets are already assigned to each server.  The master scans the METADATA table to learn the set of tablets.
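The last two startup steps amount to a set difference: tablets listed in METADATA that no live server reports are the ones still waiting for assignment. The sketch below assumes the master lock is already held and the live-server list has already been read from Chubby; the function and parameter names are hypothetical.

```python
def find_unassigned_tablets(live_servers, tablets_on_server, metadata_tablets):
    """Return tablets listed in METADATA that no live tablet server reports serving.

    live_servers      -- server names found in the Chubby servers directory
    tablets_on_server -- callable: server name -> iterable of tablet names it serves
    metadata_tablets  -- every tablet name found by scanning the METADATA table
    """
    assigned = set()
    for server in live_servers:               # ask each live tablet server
        assigned |= set(tablets_on_server(server))
    return sorted(set(metadata_tablets) - assigned)   # compare with METADATA


# Illustrative data: two live servers and four known tablets.
reports = {"ts1": ["t1", "t2"], "ts2": ["t3"]}
unassigned = find_unassigned_tablets(reports.keys(), reports.get, ["t1", "t2", "t3", "t4"])
```

Here `unassigned` contains only `t4`, which the master would then hand to a tablet server with spare capacity.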

22 Tablet Serving

23 Implementation-Compactions  When the memtable size reaches a threshold, the memtable is frozen, a new memtable is created, and the frozen memtable is converted to an SSTable and written to GFS.  A merging compaction that rewrites all SSTables into exactly one SSTable is called a major compaction.
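The two compactions described above can be sketched with a toy tablet that keeps one mutable memtable and a list of immutable sorted runs. This is a simplified illustration: the class `TinyTablet` is hypothetical, the threshold counts entries rather than bytes, and the commit log and GFS are left out.

```python
class TinyTablet:
    """Toy tablet: one mutable memtable plus a list of immutable sorted runs."""

    def __init__(self, memtable_limit=2):
        self.memtable_limit = memtable_limit  # hypothetical threshold, in entries
        self.memtable = {}
        self.sstables = []                    # oldest first; each a sorted list of (key, value)

    def write(self, key, value):
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self.minor_compaction()

    def minor_compaction(self):
        """Freeze the memtable, write it out as a new sorted run, start a fresh memtable."""
        frozen, self.memtable = self.memtable, {}
        self.sstables.append(sorted(frozen.items()))

    def major_compaction(self):
        """Rewrite all sorted runs into exactly one; newer values win."""
        merged = {}
        for run in self.sstables:             # oldest to newest, so later runs overwrite
            merged.update(run)
        self.sstables = [sorted(merged.items())]

    def read(self, key):
        if key in self.memtable:
            return self.memtable[key]
        for run in reversed(self.sstables):   # newest run first
            for k, v in run:
                if k == key:
                    return v
        return None


tablet = TinyTablet(memtable_limit=2)
for k, v in [("a", 1), ("b", 2), ("a", 3), ("c", 4)]:
    tablet.write(k, v)
runs_before = len(tablet.sstables)
tablet.major_compaction()
```

With a limit of two entries, the four writes trigger two minor compactions, and the major compaction then merges both runs into a single one in which the newer value for `a` survives.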

24 Refinements  Locality groups:  Clients can group multiple column families together into a locality group.  Compression:  Many clients use a two-pass scheme: Bentley and McIlroy's scheme for long common strings, followed by a fast compression algorithm.  Caching for read performance:  Tablet servers use a Scan Cache and a Block Cache.  Bloom filters:  Reduce the number of disk accesses needed by read operations.
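A Bloom filter saves disk accesses because it can say "definitely not present" without reading the SSTable. The sketch below is a generic Bloom filter, not Bigtable's implementation; the class name, the sizes, and the use of SHA-256 to derive bit positions are all assumptions made for illustration.

```python
import hashlib


class BloomFilter:
    """Generic Bloom filter: no false negatives, occasional false positives."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits // 8 + 1)

    def _positions(self, key):
        # Derive num_hashes bit positions by salting SHA-256 with the hash index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(bytes([i]) + key).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for p in self._positions(key):
            self.bits[p // 8] |= 1 << (p % 8)

    def might_contain(self, key):
        """False means the key was never added; True means it may have been."""
        return all(self.bits[p // 8] & (1 << (p % 8)) for p in self._positions(key))


bloom = BloomFilter()
bloom.add(b"com.yahoo.tw/anchor:tw.bid.yahoo.com")
```

A reader would consult `might_contain` before opening an SSTable and skip the SSTable entirely when the answer is `False`.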

25 Performance Evaluation

26 Real Applications  Google Analytics  http://analytics.google.com  Google Earth  http://earth.google.com  Personalized search  www.google.com/psearch

27 Conclusion

