C-Store: An Introduction to Berkeley DB Jianlin Feng School of Software SUN YAT-SEN UNIVERSITY Mar. 13, 2009.

1 C-Store: An Introduction to Berkeley DB Jianlin Feng School of Software SUN YAT-SEN UNIVERSITY Mar. 13, 2009

2 Overview of Berkeley DB
- Means the Berkeley Database
  - An open-source, embedded transactional data management system
  - A key/value store
- Embedded?
  - As a library that is linked with an application
  - Hides data management from the end user
- Scales from bytes to petabytes
- Runs on everything from cell phones to large servers

3 Berkeley DB: Examples of Applications
- Google Accounts
  - Stores all user and service account information and preferences
- Amazon's user customization
- Berkeley DB has high reliability and high performance.

4 Berkeley DB: A Brief History (1)
- Began life in 1991 as a dynamic linear hashing implementation.
  - Historic UNIX database libraries: dbm, ndbm and hsearch
- Released as a library in 4.4BSD in 1992.
  - db-1.85 == Hash + B-Tree
- The package LIBTP
  - Transactional implementation of db-1.85
  - A research prototype that was never released.

5 Berkeley DB: A Brief History (2)
- In 1996, Seltzer and Bostic started Sleepycat Software.
  - For use in the Netscape browser
- Berkeley DB 2.0, released in 1997
  - Transactional implementation
  - The first commercial release
- Berkeley DB 3.0, released in 1999
  - Transformed into an object-oriented, handle-and-method style API.

6 Berkeley DB: A Brief History (3)
- Berkeley DB 4.0, released in 2001
  - Single-master, multiple-reader replication
  - High availability: replicas can take over for a failed master
  - High scalability: read-only replicas can reduce master load
  - Similar ideas are adopted in C-Store.
- In Feb. 2006, Oracle acquired Sleepycat.

7 Sleepycat Public License: a Dual License
- The code
  - Is open source
  - And may be downloaded and used freely
- However, redistribution requires
  - Either that the package using Berkeley DB be released as open source
  - Or that the distributors obtain a commercial license from Sleepycat (now Oracle, which acquired Sleepycat in Feb. 2006).

8 Berkeley DB: Product Family Today
- The original Berkeley DB library
- Berkeley DB XML
  - Built atop the library
- Berkeley DB Java Edition
  - 100% pure Java implementation

9 Berkeley DB: Product Family Architecture

10 Berkeley DB: The Design Philosophy
- Provide mechanisms without specifying policies
- For example, Berkeley DB is abstracted as a store of <key, value> pairs.
  - Both keys and values are opaque byte-strings.
  - That is, Berkeley DB has no schema,
  - And the application that embeds Berkeley DB is responsible for imposing its own schema on the data.
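
As a rough illustration of an application imposing its own schema on opaque byte-strings, the sketch below packs a record into bytes before storing it and unpacks it after reading it back. The record layout, the function names, and the plain dict standing in for the store are all assumptions made for this example; none of it is Berkeley DB's API.

```python
import struct

# Hypothetical record layout chosen by the application, not by the store:
# a 4-byte unsigned user id and a 2-byte unsigned age, followed by a UTF-8 name.
HEADER = struct.Struct("<IH")


def encode_user(user_id, age, name):
    """Serialize a user record into an opaque byte-string value."""
    return HEADER.pack(user_id, age) + name.encode("utf-8")


def decode_user(value):
    """Interpret a stored byte-string using the application's own schema."""
    user_id, age = HEADER.unpack_from(value)
    name = value[HEADER.size:].decode("utf-8")
    return user_id, age, name


# A plain dict stands in for the key/value store; it only ever sees bytes.
store = {}
store[b"user:42"] = encode_user(42, 30, "Alice")
print(decode_user(store[b"user:42"]))
```

The store never learns what a "user" is; only `encode_user` and `decode_user` know the layout, which is the sense in which the schema belongs to the application.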

11 Advantages of <key, value> pairs
- An application is free to store data in whatever form is most natural to it.
  - Objects (like structures in the C language)
  - Rows in Oracle, SQL Server
  - Columns in C-Store
- Different data formats can be stored in the same database.
  - As long as the application understands how to interpret the data items.

12 Indexing Key Values
- Indexing methods
  - B-Tree
  - Hash
  - Queue
  - A record-number-based index (Recno), implemented atop B-Tree
- Data manipulation
  - Put: store key/value pairs
  - Get: retrieve key/value pairs
  - Delete: remove key/value pairs
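
The three data manipulation operations can be tried out with Python's standard-library `dbm.dumb` module, a simple on-disk key/value store in the dbm tradition. This is only a stand-in for illustration: it is not Berkeley DB, and the function name and file name below are made up for the example.

```python
import dbm.dumb
import os
import tempfile


def demo_put_get_delete():
    """Run put, get, and delete against a small on-disk key/value store."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "example")
        db = dbm.dumb.open(path, "c")  # "c": create the store if it is missing
        try:
            db[b"fruit"] = b"apple"         # put: store a key/value pair
            fetched = db[b"fruit"]          # get: retrieve the value by key
            del db[b"fruit"]                # delete: remove the pair
            still_there = b"fruit" in db    # the key should be gone now
        finally:
            db.close()
    return fetched, still_there


print(demo_put_get_delete())
```

Keys and values are byte-strings here as well, so the same "application owns the schema" rule applies.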

13 How Do Applications Access Key/Value Pairs?
- Through handles on databases
  - Similar to relational tables
- Or through cursor handles
  - Representing a specific place within a database
  - Used for iteration, i.e., fetching one key/value pair at a time.
- Databases are implemented atop the OS file system.
  - A file may contain one or more databases.
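
To make the idea of a cursor as "a specific place within a database" concrete, here is a toy in-memory model. `ToyDatabase`, `ToyCursor`, `seek`, and `next` are hypothetical names invented for this sketch; the real library's handles and methods differ.

```python
import bisect


class ToyDatabase:
    """In-memory stand-in for a database handle; keys are kept in sorted order."""

    def __init__(self):
        self._keys = []
        self._values = {}

    def put(self, key, value):
        if key not in self._values:
            bisect.insort(self._keys, key)
        self._values[key] = value

    def cursor(self):
        return ToyCursor(self)


class ToyCursor:
    """Remembers a position in the database and returns one pair per call."""

    def __init__(self, db):
        self._db = db
        self._pos = 0

    def seek(self, key):
        """Position at the first pair whose key is >= the given key."""
        self._pos = bisect.bisect_left(self._db._keys, key)

    def next(self):
        """Return the pair at the current position and advance; None at the end."""
        if self._pos >= len(self._db._keys):
            return None
        key = self._db._keys[self._pos]
        self._pos += 1
        return key, self._db._values[key]


db = ToyDatabase()
for k, v in [(b"b", b"2"), (b"a", b"1"), (b"c", b"3")]:
    db.put(k, v)

cur = db.cursor()
pair = cur.next()
while pair is not None:
    print(pair)          # pairs come back one at a time, in key order
    pair = cur.next()
```

The database handle answers "which collection of pairs", while the cursor adds "where in that collection", which is what makes step-by-step iteration possible.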

14 Berkeley DB Replication: A Log-Shipping System
- A replication group
  - A single master
  - One or more read-only replicas
- All write operations must be processed transactionally by the master.
- The master sends log records to each of the replicas.
- The replicas apply log records only when they receive a transaction commit record.
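
The log-shipping behaviour described on this slide can be modelled in a few lines. The sketch below is a simplified toy, assuming tuple-shaped log records and in-process "shipping" by method call; the `Master` and `Replica` classes and their methods are invented for illustration and say nothing about how Berkeley DB implements replication internally.

```python
class Replica:
    """Read-only replica: buffers shipped writes and applies them on commit."""

    def __init__(self):
        self.data = {}
        self._pending = {}  # transaction id -> buffered (key, value) writes

    def receive(self, record):
        kind, txn = record[0], record[1]
        if kind == "write":
            _, _, key, value = record
            self._pending.setdefault(txn, []).append((key, value))
        elif kind == "commit":
            # Only now do the transaction's writes become visible.
            for key, value in self._pending.pop(txn, []):
                self.data[key] = value


class Master:
    """Single master: processes all writes and ships log records to replicas."""

    def __init__(self, replicas):
        self.replicas = list(replicas)
        self.data = {}
        self._uncommitted = {}
        self._next_txn = 1

    def _ship(self, record):
        for replica in self.replicas:
            replica.receive(record)

    def begin(self):
        txn = self._next_txn
        self._next_txn += 1
        self._uncommitted[txn] = []
        return txn

    def write(self, txn, key, value):
        self._uncommitted[txn].append((key, value))
        self._ship(("write", txn, key, value))

    def commit(self, txn):
        for key, value in self._uncommitted.pop(txn):
            self.data[key] = value
        self._ship(("commit", txn))


replica = Replica()
master = Master([replica])
txn = master.begin()
master.write(txn, "k", "v")
print(replica.data)   # the write has been shipped but not yet applied
master.commit(txn)
print(replica.data)   # the commit record triggers the apply
```

Buffering until the commit record arrives means a replica never exposes the effects of a transaction that the master has not committed.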

15 Berkeley DB: Configuration Flexibility
- Configuration flexibility is critical
  - Due to a wide range of applications
- Three ways
  - Compile time configuration
  - Feature set selection
  - Runtime configuration

16 Compile Time Configuration
- Option 1: small footprint build
  - --enable-smallbuild
  - For use in a cell phone
  - The compiled library contains only the B-Tree index
  - Omits replication, cryptography, statistics collection, etc.
  - The library is about 0.5 MB.
- Option 2: higher concurrency locking
  - --enable-fine-grained-lock-manager
  - For use in a data center
  - Lock-based concurrency control

17 Feature Set Selection
1. The Data Store (DS) feature set
  - Most similar to the original db-1.85 library
  - Good for temporary data storage
2. The Concurrent Data Store (CDS) feature set
  - Acquires a single lock per API invocation
  - Good for read-mostly applications
3. The Transactional Data Store (TDS) feature set
  - Currently the most widely used feature set
  - Acquires a single lock per page
4. The High Availability (HA) feature set
  - Can continue running even after a site fails.

18 Runtime Configuration
- Index selection and tuning
  - Applications can select the page size in an index
- Trading off durability and performance
  - No-force log write
  - Extreme case: applications can run completely in memory
- Trading off two-phase locking and multiversion concurrency control
- Note: C-Store adopts similar ideas for high performance.

19 Challenges of Berkeley DB's Flexibility
- Requires flexibility from Berkeley DB designers
- Requires flexibility from application developers

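20 Any Dream? Any Idea?
- iGoogle China University Student Innovation Design Contest
- The 4th Software Innovation Design Contest, School of Software, Sun Yat-sen University
- Some Research with Me?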

21 References
- M. Seltzer. Berkeley DB: A Retrospective. IEEE Data Engineering Bulletin, Volume 30, Number 3, pp. 21-28, September 2007.
- M. A. Olson, K. Bostic, M. Seltzer. Berkeley DB. USENIX Annual Technical Conference, pp. 183-192, June 6-11, 1999, Monterey, California, USA.
- Oracle Berkeley DB Site. http://www.oracle.com/technology/products/berkeley-db

