Spanner: Google’s Globally-Distributed Database By - James C. Corbett, Jeffrey Dean, Michael Epstein, Andrew Fikes, Christopher Frost, JJ Furman Sanjay.


1 Spanner: Google’s Globally-Distributed Database. By James C. Corbett, Jeffrey Dean, Michael Epstein, Andrew Fikes, Christopher Frost, JJ Furman, Sanjay Ghemawat, Andrey Gubarev, Christopher Heiser, Peter Hochschild, Wilson Hsieh, Sebastian Kanthak, Eugene Kogan, Hongyi Li, Alexander Lloyd, Sergey Melnik, David Mwaura, David Nagle, Sean Quinlan, Rajesh Rao, Lindsay Rolig, Yasushi Saito, Michal Szymaniak, Christopher Taylor, Ruth Wang, Dale Woodford. Published in Proceedings of OSDI 2012. Speaker: Mugdha Goel

2 What is Spanner? It is a system that distributes data at global scale and supports externally consistent distributed transactions. With Spanner, Google can offer a web service to a worldwide audience and still ensure that something happening on the service in one part of the world does not contradict what is happening in another. It automatically migrates data across machines and data centers to balance load and in response to failures. It is a scalable, multi-version, synchronously replicated database.

3 Need and Evolution Need: Spanner was built for high availability. Data must be consistent across the globe. Reads and writes must not be crushed by huge latencies. Data should be located according to the client's needs. Evolution: Spanner has evolved from a Bigtable-like versioned key-value store into a temporal multi-version database. Spanner is the successor to Google's Megastore system. Data is stored in semi-relational tables, and reads and writes are faster than in Megastore. Google's F1 advertising backend uses Spanner. Gmail, Picasa, Google Calendar, the Android Market and Google's App Engine cloud all use Megastore, making them potential candidates for a Spanner upgrade.

4 Features Applications can dynamically control the replication configuration of their data at a fine grain, and the system applies it transparently. Applications can control the location of data. Provides externally consistent reads and writes. Provides globally consistent reads across the database at a timestamp. Provides an implementation of the TrueTime API.

5 Implementation

6 Spanserver Software Stack The implementation is Bigtable-like: each tablet holds a bag of mappings of the form (key:string, timestamp:int64) -> string
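The (key, timestamp) -> string mapping can be illustrated with a small in-memory sketch. This is only an illustration, not Spanner's storage code: the class name MultiVersionStore and its methods are hypothetical, and the read rule (return the newest version at or before the requested timestamp) is an assumption about how a multi-version store is typically read.

```python
import bisect


class MultiVersionStore:
    """Toy multi-version map: (key, timestamp) -> value. Hypothetical names."""

    def __init__(self):
        self._timestamps = {}  # key -> sorted list of version timestamps
        self._values = {}      # key -> values, parallel to the timestamp list

    def write(self, key, timestamp, value):
        # Keep each key's versions ordered by timestamp.
        ts = self._timestamps.setdefault(key, [])
        vals = self._values.setdefault(key, [])
        i = bisect.bisect_right(ts, timestamp)
        ts.insert(i, timestamp)
        vals.insert(i, value)

    def read(self, key, timestamp):
        # Assumed rule: newest version whose timestamp is <= the read timestamp.
        ts = self._timestamps.get(key, [])
        i = bisect.bisect_right(ts, timestamp)
        return self._values[key][i - 1] if i else None


store = MultiVersionStore()
store.write("user:1", 10, "alice")
store.write("user:1", 20, "alice-renamed")
print(store.read("user:1", 15))
```

Keeping older versions alongside newer ones is what lets a read at a past timestamp proceed without disturbing later writes.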

7 Spanner’s Data Model Need for schematized semi-relational tables and synchronous replication: shown by the popularity of Megastore, which offers a semi-relational data model with support for synchronous replication and is widely used despite its relatively poor write throughput. Need for an SQL-like query language: shown by Dremel (an interactive data analysis tool). Need for general-purpose transactions: Bigtable lacked cross-row transactions. Two-phase commit is often said to cause availability problems; running it over Paxos mitigates them. Underneath, Spanner uses a distributed file system known as Colossus. Structure: An application creates one or more databases in a universe. Each database can contain an unlimited number of schematized tables. Every table has an ordered set of one or more primary-key columns. The primary keys form the name of a row. Each table defines a mapping from the primary-key columns to the non-primary-key columns.

8 Example
CREATE TABLE Users {
  uid INT64 NOT NULL, email STRING
} PRIMARY KEY (uid), DIRECTORY;
CREATE TABLE Albums {
  uid INT64 NOT NULL, aid INT64 NOT NULL,
  name STRING
} PRIMARY KEY (uid, aid),
  INTERLEAVE IN PARENT Users ON DELETE CASCADE;
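What INTERLEAVE IN PARENT and ON DELETE CASCADE imply for row layout can be sketched with plain tuples. This is a rough model under stated assumptions, not Spanner's implementation: rows are keyed by their primary-key tuple, so a Users row has key (uid,) and an Albums row has key (uid, aid), and the helper names interleaved_order and delete_cascade are hypothetical.

```python
def interleaved_order(rows):
    """Return primary keys in storage order.

    Tuple comparison places a parent key such as (1,) directly before
    its children (1, 1), (1, 2), which models interleaving.
    """
    return sorted(rows)


def delete_cascade(rows, parent_key):
    """Drop the parent row and every row whose key starts with parent_key."""
    n = len(parent_key)
    return {k: v for k, v in rows.items() if k[:n] != parent_key}


rows = {
    (2, 1): "Albums(2,1)",
    (1,): "Users(1)",
    (1, 2): "Albums(1,2)",
    (2,): "Users(2)",
    (1, 1): "Albums(1,1)",
}

for key in interleaved_order(rows):
    print(key, rows[key])

remaining = delete_cascade(rows, (1,))
print(sorted(remaining))
```

Because each user's albums sort immediately after the user row, a user and all of that user's albums form one contiguous key range, which is what makes the group easy to move or delete together.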

9 TrueTime
Method         Returns
TT.now()       TTinterval: [earliest, latest]
TT.after(t)    True if t has definitely passed
TT.before(t)   True if t has definitely not arrived
TTinterval: an interval with bounded time uncertainty (its endpoints are of type TTstamp).
t_abs(e) denotes the absolute time of an event e.
For an invocation tt = TT.now(), tt.earliest <= t_abs(e_now) <= tt.latest, where e_now is the invocation event.
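The three methods can be mimicked with a small sketch. It is a simulation under assumptions, not Google's TrueTime: the uncertainty is modeled as a fixed epsilon around a local clock, and the parameters epsilon and clock are hypothetical knobs added for illustration.

```python
import time
from dataclasses import dataclass


@dataclass(frozen=True)
class TTInterval:
    earliest: float
    latest: float


class TrueTime:
    """Toy TrueTime: a local clock plus a fixed uncertainty bound (assumed)."""

    def __init__(self, epsilon=0.004, clock=time.time):
        self.epsilon = epsilon  # assumed constant uncertainty, in seconds
        self.clock = clock      # injectable so the sketch can be tested

    def now(self):
        # The true absolute time is assumed to lie inside this interval.
        t = self.clock()
        return TTInterval(t - self.epsilon, t + self.epsilon)

    def after(self, t):
        # True only if t is earlier than every time that "now" could be.
        return t < self.now().earliest

    def before(self, t):
        # True only if t is later than every time that "now" could be.
        return t > self.now().latest


tt = TrueTime(epsilon=1.0, clock=lambda: 100.0)
print(tt.now())
print(tt.after(98.5), tt.before(101.5))
```

Note that after(t) and before(t) can both be False for the same t: when t falls inside the current interval, neither claim can be made with certainty.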

10 Concurrency Control Spanner supports: Read-only transactions - Predeclared as not having any writes. - Not simply a read-write transaction without any writes. - Reads execute at a system-chosen timestamp without locking, so that incoming writes are not blocked. Read-write transactions - Writes that occur in a transaction are buffered at the client until commit. Snapshot reads - The client chooses a timestamp for the read, or - the client specifies an upper bound on the timestamp’s staleness.
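Client-side write buffering in a read-write transaction can be sketched as follows. This is a deliberately minimal model with a hypothetical class name: the store is a plain dict, and locking, timestamps, Paxos and two-phase commit are all left out. Reads here go to the store and do not see the transaction's own buffered writes, which is an assumption that follows from the writes staying at the client until commit.

```python
class ReadWriteTransaction:
    """Toy transaction that buffers writes at the client until commit."""

    def __init__(self, store):
        self._store = store   # shared dict standing in for the database
        self._buffer = {}     # writes held at the client

    def read(self, key):
        # Buffered writes are not yet visible, even to this transaction.
        return self._store.get(key)

    def write(self, key, value):
        self._buffer[key] = value

    def commit(self):
        # Apply all buffered writes together (no locking or 2PC in this sketch).
        self._store.update(self._buffer)
        self._buffer.clear()


db = {"balance": 100}
txn = ReadWriteTransaction(db)
txn.write("balance", 80)
print(txn.read("balance"), db["balance"])
txn.commit()
print(db["balance"])
```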

11 Evaluations Microbenchmarks. Two-phase commit scalability (mean and standard deviations over 10 runs). Availability: effect of killing servers on throughput.

12 TrueTime: distribution of TrueTime values. F1: F1-perceived operation latencies.

13 F1’s transition F1’s backend was originally based on MySQL. Disadvantages of using MySQL: - Data was sharded manually. - The MySQL sharding scheme assigned each customer and all related data to a fixed shard. - Resharding became extremely costly as the number of customers grew. - Resharding was a very complex process which took about 2 years. Advantages of Spanner: - Spanner removes the need to manually reshard. - Spanner provides synchronous replication and automatic failover. With MySQL master-slave replication, failover was difficult and risked data loss and downtime. - F1 requires strong transactional semantics, which Spanner provides.

14 Future Work Google’s advertising backend was transitioned from MySQL to Spanner. Work is currently under way on Spanner’s schema language, automatic maintenance of secondary indices, and automatic load-based resharding. Future plans: Doing reads optimistically in parallel. Supporting direct changes to Paxos configurations. Improving single-node performance through better algorithms and data structures. Moving data automatically between datacenters in response to changes in client load.

