Bigtable: A Distributed Storage System for Structured Data Google Inc. OSDI 2006.


1 Bigtable: A Distributed Storage System for Structured Data Google Inc. OSDI 2006

2 Content
– Introduction
– Data Model
– API and Demo
– Building Blocks
– Implementation
– Conclusion

3 Introduction
Bigtable is a distributed storage system for managing structured data that is designed to scale to a very large size: petabytes of data across thousands of commodity servers.
Goals:
– wide applicability
– scalability
– high performance
– high availability

4 Introduction

5 Bigtable resembles a database, but isn’t one
– Bigtable does not support a full relational data model.
– Bigtable treats data as uninterpreted strings.
– Bigtable schema parameters let clients dynamically control whether to serve data out of memory or from disk.

6 Data Model
A Bigtable is a sparse, distributed, persistent multidimensional sorted map. The map is indexed by a row key, a column key, and a timestamp; each value in the map is an uninterpreted array of bytes.
(row:string, column:string, time:int64) -> string
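The map on this slide can be pictured with a small in-memory sketch. This is only an illustration of the (row, column, time) -> value indexing, not Bigtable's implementation or client API; the class name `ToyTable` and its methods are hypothetical.

```python
from typing import Dict, Iterator, List, Optional, Tuple


class ToyTable:
    """Hypothetical in-memory stand-in for the (row, column, time) -> value map."""

    def __init__(self) -> None:
        # Sparse: only cells that were written take up space.
        # (row, column) -> list of (timestamp, value), newest first.
        self._cells: Dict[Tuple[str, str], List[Tuple[int, bytes]]] = {}

    def put(self, row: str, column: str, timestamp: int, value: bytes) -> None:
        versions = self._cells.setdefault((row, column), [])
        versions.append((timestamp, value))
        versions.sort(key=lambda tv: tv[0], reverse=True)

    def get(self, row: str, column: str,
            timestamp: Optional[int] = None) -> Optional[bytes]:
        """Return the newest value, or the newest one at or before `timestamp`."""
        for ts, value in self._cells.get((row, column), []):
            if timestamp is None or ts <= timestamp:
                return value
        return None

    def scan(self, start_row: str, end_row: str) -> Iterator[Tuple[str, str, int, bytes]]:
        """Yield cells in sorted key order for start_row <= row < end_row."""
        for row, column in sorted(self._cells):
            if start_row <= row < end_row:
                for ts, value in self._cells[(row, column)]:
                    yield row, column, ts, value
```

Values are `bytes` here to mirror the "uninterpreted array of bytes" wording; a real deployment keeps the map sorted on disk rather than sorting on every scan.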

7 Data Model

8 Rows
– a tablet is a row range: the unit of distribution and load balancing
Column Families
– column keys are grouped into sets called column families
– syntax: family:qualifier. Column family names must be printable, but qualifiers may be arbitrary strings
Timestamps
– 64-bit integers
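The two ideas on this slide, column keys of the form family:qualifier and tablets as row ranges, can be sketched as follows. Both helpers are hypothetical; in particular, the convention that each tablet boundary is an exclusive end key is an assumption made for the example.

```python
import bisect
from typing import List, Tuple


def parse_column_key(column_key: str) -> Tuple[str, str]:
    """Split 'family:qualifier' at the first colon (hypothetical helper)."""
    family, sep, qualifier = column_key.partition(":")
    if not sep:
        raise ValueError("column key must look like family:qualifier")
    if not family or not family.isprintable():
        raise ValueError("column family name must be printable")
    # The qualifier is left as-is: it may be empty or contain any characters.
    return family, qualifier


def find_tablet(boundaries: List[str], row: str) -> int:
    """Return the index of the tablet whose row range contains `row`.

    Assumption: `boundaries` is sorted, and tablet i holds rows in
    [boundaries[i-1], boundaries[i]); the last tablet is open-ended.
    """
    return bisect.bisect_right(boundaries, row)
```

Because rows are kept in sorted order, finding the tablet for a row is a binary search over the boundaries, and rows with nearby keys land in the same tablet.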

9 API and DEMO

10

11 Building Blocks
Google File System (GFS)
– stores log and data files
– run in a shared pool of machines
Google SSTable: file format
– maps keys to values
– index for locating data
Chubby: a highly available and persistent distributed lock service
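The SSTable bullet (a key-to-value map plus an index) can be illustrated with a toy in-memory version. This is a sketch under stated assumptions, not the real file format: the class `ToySSTable`, the tiny `block_size`, and the choice to index each block by its first key are all made up for the example.

```python
import bisect
from typing import Dict, List, Optional, Tuple


class ToySSTable:
    """Hypothetical immutable, sorted key -> value map with a block index."""

    def __init__(self, items: Dict[bytes, bytes], block_size: int = 2) -> None:
        if block_size < 1:
            raise ValueError("block_size must be at least 1")
        ordered: List[Tuple[bytes, bytes]] = sorted(items.items())
        # Data is cut into fixed-size blocks of sorted entries.
        self._blocks = [ordered[i:i + block_size]
                        for i in range(0, len(ordered), block_size)]
        # The index holds the first key of each block.
        self._index = [block[0][0] for block in self._blocks]

    def get(self, key: bytes) -> Optional[bytes]:
        # Binary-search the index, then look inside a single block.
        pos = bisect.bisect_right(self._index, key) - 1
        if pos < 0:
            return None
        for k, v in self._blocks[pos]:
            if k == key:
                return v
        return None
```

The point of the index is that a lookup touches only one block; with the index held in memory, that corresponds to a single read from the underlying file.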

12 Implementation
– Client
– Master Server
– Tablet Server

13 Implementation
Master Server:
– assigns tablets to tablet servers
– detects the addition and expiration of tablet servers
– balances tablet-server load
– garbage-collects files in GFS
Tablet Server:
– manages a set of tablets
– handles read and write requests to the tablets
– splits tablets that have grown too large
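The division of labor above can be sketched in a few functions. This is a deliberately simplified illustration: the least-loaded placement rule, the row-count split threshold, and all function names are assumptions for the example, and the sketch leaves out the Chubby-based coordination a real cluster relies on.

```python
from typing import Dict, List


def assign_tablets(tablets: List[str], servers: List[str]) -> Dict[str, List[str]]:
    """Master side: give each tablet to the server currently holding the fewest."""
    if not servers:
        raise ValueError("no live tablet servers")
    assignment: Dict[str, List[str]] = {server: [] for server in servers}
    for tablet in tablets:
        target = min(servers, key=lambda s: len(assignment[s]))
        assignment[target].append(tablet)
    return assignment


def reassign_after_expiration(assignment: Dict[str, List[str]],
                              dead_server: str) -> Dict[str, List[str]]:
    """Master side: move an expired server's tablets onto the survivors."""
    survivors = {s: list(t) for s, t in assignment.items() if s != dead_server}
    if not survivors:
        raise ValueError("no surviving tablet servers")
    for tablet in assignment.get(dead_server, []):
        target = min(survivors, key=lambda s: len(survivors[s]))
        survivors[target].append(tablet)
    return survivors


def split_tablet(rows: List[str], max_rows: int) -> List[List[str]]:
    """Tablet-server side: split a sorted row range once it grows too large."""
    if len(rows) <= max_rows:
        return [rows]
    middle = len(rows) // 2
    return [rows[:middle], rows[middle:]]
```

A real system would split on data size rather than row count; row count is used here only to keep the example short.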

14 Implementation

15

16 Others
Refinements
– Locality groups
– Compression
– Caching for read performance
– Bloom filters
– Commit-log implementation
Performance Evaluation
Lessons
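Of the refinements listed, Bloom filters are the easiest to show in a few lines: a filter can answer "definitely not present" without reading an SSTable from disk. The sketch below is a generic Bloom filter, not Bigtable's code; the class name, the bit-array size, and the use of SHA-256 to derive bit positions are assumptions chosen for the example.

```python
import hashlib
from typing import Iterator


class ToyBloomFilter:
    """Hypothetical Bloom filter: no false negatives, occasional false positives."""

    def __init__(self, num_bits: int = 1024, num_hashes: int = 3) -> None:
        if num_bits < 1 or not 1 <= num_hashes <= 256:
            raise ValueError("need num_bits >= 1 and 1 <= num_hashes <= 256")
        self._num_bits = num_bits
        self._num_hashes = num_hashes
        self._bits = bytearray((num_bits + 7) // 8)

    def _positions(self, key: bytes) -> Iterator[int]:
        # Derive several bit positions by salting one hash function.
        for i in range(self._num_hashes):
            digest = hashlib.sha256(bytes([i]) + key).digest()
            yield int.from_bytes(digest[:8], "big") % self._num_bits

    def add(self, key: bytes) -> None:
        for pos in self._positions(key):
            self._bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, key: bytes) -> bool:
        """False means the key was never added; True means it may have been."""
        return all(self._bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(key))
```

A reader would consult the filter first and skip the disk lookup whenever `might_contain` returns False, which helps most for rows or columns that do not exist.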

17

18 THANK YOU

