Bigtable: A Distributed Storage System for Structured Data


1 Bigtable: A Distributed Storage System for Structured Data

2 Before we begin…
- BigTable
- Sawzall
- MapReduce
- Bloom Filters

3 Outline
- Introduction
- Data Model
- API
- Building Blocks
- Implementation
- Refinements
- Performance Evaluation
- Real Applications
- Lessons
- Related Work
- Conclusions

4 What is Bigtable?
- A distributed storage system for managing structured data at Google
- Used by more than 60 Google products, including:
  - Google Analytics
  - Google Reader
  - Personalized Search
  - Orkut

5 Goals
- Wide applicability
- Scalability
- High performance
- High availability
- Bigtable vs. databases: Bigtable does not support a full relational data model

6 Data Model
- A Bigtable is a sparse, distributed, persistent, multi-dimensional sorted map
- The map is indexed by row key, column key, and timestamp: (row:string, column:string, timestamp:int64) → cell contents (string)
- The paper's running example is the Webtable, which stores web pages keyed by reversed URL (a toy sketch of the map follows below)
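To make the map concrete, here is a minimal, purely illustrative Python sketch; the class and method names are invented for this example, and the real system distributes the map across many tablet servers rather than holding it in one process.

```python
class MiniBigtable:
    """Toy in-memory model of the Bigtable map:
    (row, column, timestamp) -> uninterpreted string."""

    def __init__(self):
        # {row: {column: [(timestamp, value), ...]}}, versions newest-first
        self.data = {}

    def write(self, row, column, timestamp, value):
        versions = self.data.setdefault(row, {}).setdefault(column, [])
        versions.append((timestamp, value))
        versions.sort(reverse=True)  # keep the newest version first

    def read(self, row, column, timestamp=None):
        """Return the newest version at or before `timestamp` (default: latest)."""
        for ts, value in self.data.get(row, {}).get(column, []):
            if timestamp is None or ts <= timestamp:
                return ts, value
        return None

# Row keys like the paper's Webtable example: URLs with reversed hostnames.
t = MiniBigtable()
t.write("com.cnn.www", "contents:", 3, "<html>...</html>")
t.write("com.cnn.www", "anchor:cnnsi.com", 9, "CNN")
print(t.read("com.cnn.www", "contents:"))  # -> (3, '<html>...</html>')
```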

7 Rows
- Row keys are arbitrary strings up to 64 KB in size
- Every read or write of data under a single row key is atomic, regardless of the number of columns involved
- Row ranges are dynamically partitioned into tablets, the unit of distribution and load balancing (a lookup sketch follows below)
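As an illustration of row-range partitioning, the owner of a row can be found by binary search over the sorted end keys of the tablets. Everything below (keys, server names) is hypothetical:

```python
import bisect

# Hypothetical sketch: each tablet covers a contiguous row range and is
# identified here by its end (last) row key. The first tablet whose end
# key is >= the lookup key owns the row.
tablet_end_keys = ["com.aaa", "com.cnn.www", "com.zzz", "\xff"]  # sorted
tablet_servers = ["ts1", "ts2", "ts3", "ts4"]  # one owner per tablet

def tablet_for_row(row_key):
    i = bisect.bisect_left(tablet_end_keys, row_key)
    return tablet_servers[i]

print(tablet_for_row("com.cnn.www/index.html"))  # -> "ts3"
```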

8 Column Families
- Column keys are grouped into sets called column families
- Data stored in a column family is usually of the same type
- The number of column families should be small (hundreds at most)
- The number of columns is unbounded
- Access control is applied at the column-family level

9 Timestamps
- Each cell in a Bigtable can contain multiple versions of the same data
- Versions are indexed by 64-bit integer timestamps
- Garbage-collection settings are specified per column family; either:
  - only the last n versions of a cell are kept, or
  - only new-enough versions are kept (both policies are sketched below)
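Both garbage-collection policies are easy to state in code. A minimal sketch, assuming `versions` is a newest-first list of `(timestamp, value)` pairs and with invented function names:

```python
# Sketch of the two per-column-family GC policies named above.
# These names are illustrative, not Bigtable's actual API.

def gc_last_n(versions, n):
    """Keep only the n most recent versions of a cell."""
    return versions[:n]

def gc_new_enough(versions, max_age, now):
    """Keep only versions written within the last `max_age` time units."""
    return [(ts, v) for ts, v in versions if ts >= now - max_age]

versions = [(300, "v3"), (200, "v2"), (100, "v1")]  # newest first
print(gc_last_n(versions, 2))                 # [(300, 'v3'), (200, 'v2')]
print(gc_new_enough(versions, 150, now=300))  # [(300, 'v3'), (200, 'v2')]
```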

10 Data model recap
- Rows
- Columns
- Timestamps

11 API
- Metadata operations: create/delete tables and column families, change metadata such as access-control rights
- Writes to a single row are atomic; Bigtable does not support general transactions across row keys (a write sketch follows below)
- Client-supplied scripts written in Sawzall can perform filtering, summarization, and transformation of data, but are not allowed to write back into Bigtable
- Bigtable can be used with MapReduce, both as an input source and as an output target
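The paper's C++ API example builds a `RowMutation` and applies it atomically; below is a hedged Python transliteration of that flow. The class names echo the paper's example, but this toy is not a real client library:

```python
class RowMutation:
    """Collects mutations to one row; Table.apply() performs them atomically
    (a toy transliteration of the paper's C++ RowMutation/Apply example)."""

    def __init__(self, row_key):
        self.row_key = row_key
        self.ops = []

    def set(self, column, value):
        self.ops.append(("set", column, value))

    def delete(self, column):
        self.ops.append(("delete", column, None))

class Table:
    def __init__(self):
        self.rows = {}

    def apply(self, mutation):
        # All mutations under a single row key are applied atomically.
        row = self.rows.setdefault(mutation.row_key, {})
        for op, column, value in mutation.ops:
            if op == "set":
                row[column] = value
            else:
                row.pop(column, None)

t = Table()
m = RowMutation("com.cnn.www")
m.set("anchor:www.c-span.org", "CNN")
m.delete("anchor:www.abc.com")
t.apply(m)  # one atomic write to the row
```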

12 Building Blocks
- Google File System (GFS): used to store log and data files
- Cluster management system (scheduler): used to manage jobs and machine resources
- SSTable file format: used internally to store Bigtable data (sketched below)
- Chubby distributed lock service: highly available, with five active replicas
  - average Chubby unavailability of 0.0047% over 14 Bigtable clusters
  - 0.0326% unavailability in the most affected cluster
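An SSTable provides a persistent, ordered, immutable map from keys to values, with lookups and range scans. A minimal in-memory sketch of the access pattern (blocks, the block index, and GFS storage are omitted):

```python
import bisect

class SSTableSketch:
    """Immutable, sorted key/value map supporting lookups and range scans.
    In-memory toy; real SSTables are block-structured files stored in GFS
    with a block index loaded into memory."""

    def __init__(self, items):
        self._entries = sorted(items)               # frozen at build time
        self._keys = [k for k, _ in self._entries]  # search index

    def get(self, key):
        i = bisect.bisect_left(self._keys, key)
        if i < len(self._keys) and self._keys[i] == key:
            return self._entries[i][1]
        return None

    def scan(self, start, end):
        """Yield (key, value) pairs with start <= key < end."""
        i = bisect.bisect_left(self._keys, start)
        while i < len(self._keys) and self._keys[i] < end:
            yield self._entries[i]
            i += 1

sst = SSTableSketch([("b", "2"), ("a", "1"), ("c", "3")])
print(sst.get("b"))              # -> "2"
print(list(sst.scan("a", "c")))  # -> [('a', '1'), ('b', '2')]
```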

13 What is a tablet?
- A Bigtable cluster stores a number of tables
- Each table consists of a set of tablets
- Each tablet is managed by a single tablet server
- As a table grows, it is automatically split into multiple tablets, each 100-200 MB in size by default
- Tablet servers handle read/write requests for their tablets

14 Bigtable: Servers
- The master manages the assignment of tablets to tablet servers
[Diagram: a Bigtable master assigning tablets across tablet server 1 and tablet server 2]

15 Tablet Location
- A three-level hierarchy is used to store tablet location information: a file in Chubby points to the root tablet, the root tablet indexes all METADATA tablets, and each METADATA tablet indexes a set of user tablets (a lookup sketch follows below)
- The root tablet is never split, so the hierarchy never grows beyond three levels
- With 128 MB METADATA tablets and roughly 1 KB per location row, the scheme can address 2^34 tablets
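A hedged sketch of the three-level walk; every structure and name below is hypothetical, and real clients cache (and prefetch) tablet locations so they rarely perform the full walk:

```python
import bisect

def find_owner(index, row_key):
    """index: sorted list of (end_row, location); return the location of
    the first entry whose end_row >= row_key."""
    ends = [end for end, _ in index]
    return index[bisect.bisect_left(ends, row_key)][1]

# Level 1: the root tablet (pointed to by a Chubby file) indexes the
# METADATA tablets. Level 2: METADATA tablets index the user tablets.
root_tablet = [("metadata:m", "meta-tablet-1"), ("metadata:\xff", "meta-tablet-2")]
metadata_tablets = {
    "meta-tablet-1": [("apple", "ts3"), ("mango", "ts8")],
    "meta-tablet-2": [("peach", "ts2"), ("\xff", "ts5")],
}

def locate(row_key):
    meta = find_owner(root_tablet, "metadata:" + row_key)  # root -> METADATA
    return find_owner(metadata_tablets[meta], row_key)     # METADATA -> user tablet

print(locate("cherry"))  # -> "ts8" (the tablet ending at "mango" owns it)
```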

16 Tablet Assignment
- A master server is responsible for assigning tablets to tablet servers
- The master server also:
  - detects the addition and expiration of tablet servers
  - balances tablet-server load
  - initiates garbage collection of files in GFS
  - reassigns tablets when a tablet server is lost
- If the master dies, the cluster management system starts a new master

17 Tablet Serving
- The persistent state of a tablet is stored in GFS: incoming writes go to a commit log, recently committed updates are held in an in-memory memtable, and older updates are held in a sequence of SSTables
[Diagram: a write op appends to the log and updates the memtable in memory; a read op sees a merged view of the memtable and the SSTable files in GFS]
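A minimal sketch of the merged read path the diagram implies, with plain dicts standing in for the memtable and SSTables (deletion markers and merging iterators are omitted):

```python
# Reads see a merged view: the memtable first, then SSTables newest-first,
# so newer sources shadow older ones.

def read(key, memtable, sstables):
    for source in (memtable, *sstables):  # newest source wins
        if key in source:
            return source[key]
    return None

memtable = {"row1": "new-value"}
sstables = [{"row2": "v2"}, {"row1": "old-value", "row3": "v3"}]
print(read("row1", memtable, sstables))  # -> "new-value" (memtable shadows)
print(read("row3", memtable, sstables))  # -> "v3"
```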

18 Compactions
- Minor compaction: when the memtable reaches a threshold size, it is frozen, a new memtable is created, and the frozen memtable is converted into a new SSTable written to GFS
- Merging compaction: periodically merges a few SSTables and the memtable into a single new SSTable, bounding the number of files a read must consult; a merging compaction that rewrites everything into exactly one SSTable is called a major compaction
- Both kinds are sketched below
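Both compaction kinds can be sketched under toy assumptions: an entry-count threshold instead of a byte threshold, and sorted dicts standing in for SSTable files:

```python
MEMTABLE_THRESHOLD = 4  # toy threshold: entry count instead of bytes

class TabletSketch:
    def __init__(self):
        self.memtable = {}
        self.sstables = []  # newest first

    def write(self, key, value):
        self.memtable[key] = value
        if len(self.memtable) >= MEMTABLE_THRESHOLD:
            self.minor_compaction()

    def minor_compaction(self):
        frozen = self.memtable            # freeze the current memtable
        self.memtable = {}                # a new memtable takes new writes
        self.sstables.insert(0, dict(sorted(frozen.items())))  # new SSTable

    def merging_compaction(self):
        # Merge the memtable and all SSTables into one SSTable; when the
        # output is a single SSTable, this is a major compaction.
        merged = {}
        for sstable in reversed(self.sstables):  # oldest first
            merged.update(sstable)
        merged.update(self.memtable)             # newest data wins
        self.memtable, self.sstables = {}, [dict(sorted(merged.items()))]
```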

19 Refinements
- A number of refinements were required for the Bigtable implementation to achieve high:
  - performance
  - availability
  - reliability

20 Refinements: Locality Groups
- Clients can group multiple column families together into a locality group
- A separate SSTable is generated for each locality group
- Segregating column families that are not typically accessed together enables more efficient reads (a sketch follows below)
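A small sketch of why segregation helps, borrowing the paper's Webtable example of keeping page metadata and page contents in separate locality groups; the grouping dict below is illustrative:

```python
# Column families grouped into locality groups; each group gets its own
# SSTables, so a scan declaring the families it needs touches only the
# groups that contain them.
locality_groups = {
    "metadata": ["language", "checksum"],  # small, frequently scanned
    "contents": ["contents"],              # bulky page bodies
}

def groups_to_read(families_needed):
    return {
        group
        for group, families in locality_groups.items()
        if any(f in families for f in families_needed)
    }

# Scanning only page metadata never opens the bulky contents SSTables:
print(groups_to_read(["language"]))  # -> {'metadata'}
```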

21 Refinements: Compression
- Clients can control whether the SSTables for a locality group are compressed, and if so which format is used
- Many clients use a custom two-pass compression scheme:
  - the first pass uses Bentley and McIlroy's scheme, which compresses long common strings across a large window
  - the second pass uses a fast algorithm that looks for repetitions in a small 16 KB window

22 Refinements: Caching & Bloom Filters
- Tablet servers use two levels of caching to improve read performance:
  - the scan cache holds key-value pairs and helps when data tends to be read repeatedly
  - the block cache holds SSTable blocks read from GFS and helps when reads tend to be close to recently read data
- Bloom filters reduce disk seeks by letting the server ask whether an SSTable might contain any data for a given row/column pair (a sketch follows below)
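A minimal Bloom filter sketch (the hash construction and sizes are arbitrary choices for this example): a negative answer is definitive and saves the disk seek, while a positive answer may be a false positive:

```python
import hashlib

class BloomFilter:
    """Toy Bloom filter: k hashed bit positions per key in a fixed array.
    A 'no' answer is definitive; a 'yes' may be a false positive."""

    def __init__(self, num_bits=1024, num_hashes=4):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)

    def _positions(self, key):
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos] = 1

    def might_contain(self, key):
        return all(self.bits[pos] for pos in self._positions(key))

bf = BloomFilter()
bf.add("com.cnn.www:anchor:cnnsi.com")  # a row/column pair as the key
print(bf.might_contain("com.cnn.www:anchor:cnnsi.com"))  # -> True
print(bf.might_contain("com.foo.bar:contents:"))         # -> False (w.h.p.)
```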

23 Refinements: Speeding Up Tablet Recovery
- When the master moves a tablet to another tablet server:
  - the source server performs a minor compaction on the tablet
  - the source server then stops serving the tablet
  - a second, usually very fast, minor compaction eliminates any log entries that arrived during the first compaction
  - the tablet can then be loaded on the target server without requiring any commit-log recovery

24 Refinements: Exploiting Immutability
- Because SSTables are immutable, various parts of the Bigtable system are simplified:
  - no synchronization of file-system accesses is needed when reading SSTables
  - permanently removing deleted data is handled entirely through garbage collection of obsolete SSTables
  - splitting tablets is fast because the child tablets can share the SSTables of the parent tablet

25 Performance Evaluation
- Google set up a Bigtable cluster with N tablet servers to measure performance and scalability as N varied
- Each tablet server was configured to use 1 GB of memory
- Each machine had two dual-core 2 GHz chips, two 400 GB IDE hard drives, and a single gigabit Ethernet link
- N client machines generated the Bigtable load used for the tests
- Every machine also ran a GFS server

26 Performance Evaluation: Single Tablet-Server Performance

Values read/written per second per tablet server:

Experiment           |     1 |    50 |   250 |   500  (# of tablet servers)
---------------------|-------|-------|-------|------
Random Reads         |  1212 |   593 |   479 |   241
Random Reads (mem)   | 10811 |  8511 |  8000 |  6250
Random Writes        |  8850 |  3745 |  3425 |  2000
Sequential Reads     |  4425 |  2463 |  2625 |  2469
Sequential Writes    |  8547 |  3623 |  2451 |  1905
Scans                | 15385 | 10526 |  9524 |  7843

27 Performance Evaluation: Scaling
- Aggregate throughput increases by over a factor of 100 as the number of tablet servers is increased from 1 to 500
- Scaling is not linear: per-server throughput drops as servers are added, most dramatically for random reads

28 Real Applications
- As of August 2006, there were 388 non-test Bigtable clusters with a combined total of about 24,500 tablet servers

Distribution of tablet servers per cluster:

# of Tablet Servers | # of Clusters
0 .. 19             | 259
20 .. 49            | 47
50 .. 99            | 20
100 .. 499          | 50
> 500               | 12

29 Real Applications
[Table from the paper describing a few of the tables then in use; Table Size (measured before compression) and # Cells give approximate sizes]

30 Real Applications: Google Analytics
- Google Analytics is supported by two Bigtable tables:
  - a ~200 TB raw click table, with one row per end-user session
  - a ~20 TB summary table

31 Real Applications: Google Earth
- Google Earth is supported by two Bigtable tables:
  - a ~70 TB imagery table, with compression turned off because the imagery is already compressed efficiently
  - a ~500 GB index table

32 Real Applications: Personalized Search
- Personalized Search is supported by one Bigtable:
  - one row per user id
  - a separate column family for each type of user action (e.g. web queries)

33 Lessons Learned
- Large distributed systems are vulnerable to many types of failures, including:
  - memory and network corruption
  - hung machines
  - extended and asymmetric network partitions
  - bugs in other systems (e.g. Chubby)
  - overflow of GFS quotas
  - planned and unplanned hardware maintenance
- To address the problems experienced:
  - some protocols were changed
  - some assumptions were modified

34 Lessons Learned
- It is important to delay adding new features until it is clear how they will be used
- It is important to support proper system-level monitoring:
  - monitoring enabled the detection and fixing of many problems
  - it also enables tracking clusters to answer common operational questions

35 Related Work
- The Boxwood project's goal is to provide infrastructure for building higher-level services such as file systems or databases, while the goal of Bigtable is to directly support client applications that wish to store data

36 Related Work
- C-Store and Bigtable share many characteristics:
  - a shared-nothing architecture
  - two different data structures, one for recent writes and one for long-lived data, with a mechanism for moving data between them
- However, the two systems differ significantly in their APIs and performance optimizations

37 Conclusions
- Bigtable is a distributed system for storing structured data at Google
  - in production since April 2005
  - about seven person-years of design and implementation effort
  - more than 60 projects were using it as of August 2006
  - users like its performance and high availability
- Users can scale the capacity of their applications simply by adding more machines to the system

38 Conclusions
- Google has begun deploying Bigtable as a service to internal product groups
- Google has gained significant advantages by building its own storage solution:
  - full control over the implementation and infrastructure
  - the ability to remove bottlenecks and inefficiencies as they arise

39 Strengths
- Implemented and usable in practice
- Careful optimization
- Thorough performance evaluation
- Used by more than 60 Google products

40 Weaknesses
- Complexity
- Chubby
- Master
- Network

41 Bigtable: A Distributed Storage System for Structured Data

