Spanner: Becoming a SQL System

1 Spanner: Becoming a SQL System
Xiyang Feng CS848

2 Outline
- Introduction
- Spanner Recap
- Query Distribution
- Query Range Extraction
- Query Restarts
- Comments
- Summary

3 Introduction
- Globally distributed database
- Worldwide replication
- OLTP DBMS for structured data
- Tens of millions of QPS, hundreds of petabytes of data
- Features:
  - Distributed query processor
  - Standard optimization: parallel processing, partition pruning
  - Distributed query execution; range extraction; query restarts

4 Spanner Recap
- Database is horizontally sharded by row range
- Shards are distributed across servers within a datacenter
- Shards are replicated to separate datacenters
- Parent-child table relation
  - Child table co-located with parent table
  - Interleaved data layout

5 Spanner Recap – Relation Data Layout
Parent table
SingerId | SingerName
1        | Beatles
2        | U2
3        | Pink Floyd

Child table
SingerId | AlbumId | AlbumName
1        | 1       | Help!
1        | 2       | Abbey Road
3        | 1       | The Wall

6 Spanner Recap – Interleaved Data Layout
Key  | Value
1    | Beatles
1, 1 |   Help!
1, 2 |   Abbey Road
2    | U2
3    | Pink Floyd
3, 1 |   The Wall
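The interleaved layout can be illustrated with a minimal Python sketch. It assumes that a child row's key is its parent's key extended by one more column, so that sorting all rows by key places each child directly after its parent. The album IDs and the `interleave` helper are illustrative assumptions, not Spanner's storage format.

```python
# Hypothetical rows. Keys are tuples, so a child key (SingerId, AlbumId)
# sorts immediately after its parent key (SingerId,).
singers = {(1,): "Beatles", (2,): "U2", (3,): "Pink Floyd"}
albums = {(1, 1): "Help!", (1, 2): "Abbey Road", (3, 1): "The Wall"}  # AlbumIds assumed


def interleave(parent_rows, child_rows):
    """Merge parent and child rows into one key-ordered sequence."""
    merged = {**parent_rows, **child_rows}
    return [(key, merged[key]) for key in sorted(merged)]


layout = interleave(singers, albums)
for key, value in layout:
    indent = "  " * (len(key) - 1)  # indent child rows under their parent
    print(f"{indent}{key} {value}")
```

Because a parent and its children are adjacent in key order, a row-range shard that contains the parent will normally contain its children too, which is what makes the co-location described in the recap possible.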

7 Query Distribution – DistributedUnion
- Operator tree -> optimization by equivalent rewrite
- DistributedUnion: a fundamental distribution operator
  - Scan(T) -> DistributedUnion[shard ⊆ T](Scan(shard))
- Push operations down to the shards
  - When Op is an algebraic aggregation, Op = OpFinal ∘ OpLocal
  - Op(DistributedUnion[shard ⊆ T](Scan(shard))) = OpFinal(DistributedUnion[shard ⊆ T](OpLocal(Scan(shard))))
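The push-down identity on this slide can be checked on a toy example. The sketch below is only an in-memory model: the shard contents and the function names (`distributed_union`, `count_local`, `count_final`) are assumptions made for illustration, and COUNT stands in for any algebraic aggregation.

```python
# Hypothetical table T split into three shards.
shards = [[3, 5], [7], [2, 8, 1]]


def scan(shard):
    """Scan(shard): return all rows of one shard."""
    return list(shard)


def distributed_union(per_shard_results):
    """DistributedUnion: concatenate the result streams of all shards."""
    return [row for result in per_shard_results for row in result]


def count_local(rows):
    """OpLocal for COUNT: one partial count per shard."""
    return [len(rows)]


def count_final(partials):
    """OpFinal for COUNT: sum the partial counts."""
    return sum(partials)


# Op above the DistributedUnion: every row is shipped to the root.
naive = len(distributed_union(scan(s) for s in shards))

# OpLocal below the DistributedUnion: only one number per shard is shipped.
pushed = count_final(distributed_union(count_local(scan(s)) for s in shards))

print(naive, pushed)
```

Both plans return the same count; the difference is that the pushed-down plan moves one partial result per shard across the network instead of every row.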

8 Query Distribution - Example

9 Query Distribution
- Distributed execution
  - Extract a set of shards, typically minimal
  - Detect local shards
- Distributed join
  - Naïve approach: 1 cross-machine call per row
  - Batched approach
    - Send batches of input rows to a remote subquery
    - Apply the rows of each batch to the original join's subquery locally on a shard
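The difference between the naïve and the batched join can be sketched as follows. Everything here is a simulated stand-in: `REMOTE_SHARDS`, `shard_of`, and `remote_lookup` are hypothetical names, and a "cross-machine call" is just a counter increment.

```python
from collections import defaultdict

# Hypothetical remote side: shard id -> rows keyed by join key.
REMOTE_SHARDS = {
    0: {1: "Help!", 2: "Abbey Road"},
    1: {3: "The Wall"},
}
stats = {"calls": 0}  # counts simulated cross-machine calls


def shard_of(key):
    # Assumed routing rule for this sketch: keys 1-2 on shard 0, others on shard 1.
    return 0 if key <= 2 else 1


def remote_lookup(shard_id, keys):
    """One simulated cross-machine call that joins a whole batch on the shard."""
    stats["calls"] += 1
    rows = REMOTE_SHARDS[shard_id]
    return [(k, rows[k]) for k in keys if k in rows]


def batched_join(input_keys, batch_size=100):
    """Group each input batch by target shard and send one call per shard."""
    results = []
    for start in range(0, len(input_keys), batch_size):
        batch = input_keys[start:start + batch_size]
        by_shard = defaultdict(list)
        for key in batch:
            by_shard[shard_of(key)].append(key)
        for shard_id, keys in by_shard.items():
            results.extend(remote_lookup(shard_id, keys))
    return results


joined = batched_join([1, 3, 2, 3])
print(joined, stats["calls"])
```

With four input rows the naïve approach would issue four calls; the batched version issues one call per shard touched by the batch, two in this example.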

10 Query Range Extraction
- Determines which portions of the tables a query references
- Expressed as intervals of primary key values
- Kinds of range extraction:
  - Distribution range extraction
  - Seek range extraction
  - Lock range extraction

11 Query Range Extraction - Rewrite
- Rewrite to a tree of correlated self-joins
- Example query:
    SELECT d.*
    FROM Document d
    WHERE d.ProjectId = @param1
      AND STARTS_WITH(d.DocumentPath, '/proposal')
      AND d.Version = @param2
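To make seek range extraction concrete, here is a simplified Python sketch for a query of this shape. It assumes the primary key of Document is (ProjectId, DocumentPath, Version) and that rows are kept sorted by that key; the sample rows and the `seek_range` and `run_query` helpers are hypothetical. It does not model the correlated self-join rewrite itself, only the idea of turning predicates on key columns into a key interval.

```python
import bisect

# Hypothetical Document rows, sorted by the assumed primary key
# (ProjectId, DocumentPath, Version).
rows = sorted([
    (1, "/design/a", 1),
    (1, "/proposal/x", 1),
    (1, "/proposal/x", 2),
    (1, "/proposal/y", 2),
    (1, "/report", 2),
    (2, "/proposal/z", 2),
])


def seek_range(project_id, path_prefix):
    """Half-open key interval covering one project and one path prefix."""
    low = (project_id, path_prefix)
    # Smallest string greater than every string with this prefix: bump the
    # last character (assumes it is not the maximum code point).
    bumped = path_prefix[:-1] + chr(ord(path_prefix[-1]) + 1)
    high = (project_id, bumped)
    return low, high


def run_query(project_id, path_prefix, version):
    low, high = seek_range(project_id, path_prefix)
    start = bisect.bisect_left(rows, low)
    end = bisect.bisect_left(rows, high)
    # In this simplified sketch, Version stays a residual filter on the range.
    return [r for r in rows[start:end] if r[2] == version]


result = run_query(1, "/proposal", 2)
print(result)
```

Only the rows inside the extracted interval are read, so the scan touches three of the six rows instead of the whole table.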

12 Query Restarts
- Compensate for failure, resharding …
- Benefits
  - Hiding transient failure
  - No retry loop
  - Streaming pagination through query results
  - Forward progress for long-running queries
  - Recurrent rolling upgrades
  - Simpler Spanner internal error handling
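One way to picture a query restart is a result stream that hands out a restart token with every row, so that an interrupted stream can be resumed where it stopped. The sketch below is a toy model under that assumption; `server_stream`, `run_with_restarts`, and the integer token are illustrative and do not reflect Spanner's actual token format.

```python
class TransientError(Exception):
    """Stands in for a transient failure such as a shard moving."""


DATA = list(range(10))  # hypothetical query result in a deterministic order


def server_stream(restart_token, fail_at=None):
    """Yield (row, token) pairs, resuming at the position encoded in the token."""
    for pos in range(restart_token, len(DATA)):
        if fail_at is not None and pos == fail_at:
            raise TransientError(f"lost connection at row {pos}")
        yield DATA[pos], pos + 1  # token = position of the next row to return


def run_with_restarts(fail_positions):
    """Internal restart loop; the caller of this function never sees the failures."""
    pending_failures = list(fail_positions)
    token, rows = 0, []
    while True:
        fail_at = pending_failures.pop(0) if pending_failures else None
        try:
            for row, token in server_stream(token, fail_at):
                rows.append(row)
            return rows
        except TransientError:
            continue  # resume from the last token: no duplicate or missing rows


rows = run_with_restarts(fail_positions=[3, 7])
print(rows)
```

The stream fails twice, yet the caller receives each row exactly once and writes no retry loop of its own. Resuming correctly relies on the result order being deterministic, which is why non-determinism appears among the challenges on the next slide.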

13 Query Restarts - Challenges
- Dynamic resharding
- Non-determinism
- Restarts across server versions
  - Restart token wire format
  - Query plan
  - Operator behaviour

14 Comments/Questions
- Inadequate experiment
- “The cost of range extraction may however outweigh the benefit of converting full scan to seeks …”
- Hotspotting and load balancing
- CA or CP?

15 Summary
- Globally distributed database with SQL functionality
- Standard implementation with unique capabilities
  - Query Distribution
  - Range Extraction
  - Query Restart
- From NoSQL to SQL

