Transaction chains: achieving serializability with low-latency in geo-distributed storage systems Yang Zhang Russell Power Siyuan Zhou Yair Sovran *Marcos.

Transaction chains: achieving serializability with low-latency in geo-distributed storage systems
Yang Zhang, Russell Power, Siyuan Zhou, Yair Sovran, *Marcos K. Aguilera, Jinyang Li
New York University, *Microsoft Research Silicon Valley

Why geo-distributed storage?
[Diagram labels: large-scale Web applications; geo-distributed storage; replication.]

Geo-distribution is hard
– Low latency: O(intra-datacenter RTT)
– Strong semantics: relational tables w/ transactions

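Prior work
[Figure: design space of geo-distributed storage systems. Transaction support: key/value only, limited forms of transaction, general transaction. Consistency: eventual, various non-serializable, serializable, strict serializable. Latency annotations: low latency; high latency; "Provably high latency according to CAP". Systems shown: Dynamo [SOSP’07], COPS [SOSP’11], Walter [SOSP’11], Eiger [NSDI’13], Spanner [OSDI’12], and "Our work" (marked "?").]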

Our contributions
1. A new primitive: transaction chain
 – Allows for low-latency, serializable transactions
2. Lynx geo-storage system: built with chains
 – Relational tables
 – Secondary indices, materialized join views

Talk outline
– Motivation
– Transaction chains
– Lynx
– Evaluation

Why transaction chains?
Example: an auction service with users Alice and Bob, with data spread across Datacenter-1 and Datacenter-2.

Bids
Bidder  Item    Price
Alice   Book    $100
Bob     Book    $20
Alice   iPhone  $20

Items
Seller  Item    Highest bid
Bob     Camera  $100

Why transaction chains?
Operation: Alice bids on Bob’s camera
1. Insert bid to Alice’s Bids
2. Update highest bid on Bob’s Items
[Diagram: Alice’s Bids (Alice, Book, $100) and Bob’s Items (Bob, Camera, $100), spread across Datacenter-1 and Datacenter-2.]

Low latency with first-hop return
[Diagram: Alice issues "bid on Bob’s camera". The new row (Alice, Camera, $500) is shown alongside Alice’s Bids (Alice, Book, $100) and Bob’s Items (Bob, Camera, $100), spread across Datacenter-1 and Datacenter-2.]
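The first-hop return pattern can be sketched in a few lines. This is a minimal single-process illustration, not Lynx's implementation: the `ChainRunner` class, the in-memory tables, and the explicit `run_pending` call (standing in for asynchronous execution in a remote datacenter) are all assumptions made for the example.

```python
from collections import deque

class ChainRunner:
    """Hypothetical runner: executes the first hop synchronously, defers the rest."""

    def __init__(self):
        self.pending = deque()  # remaining hops of chains whose first hop is done

    def submit(self, hops):
        """Run the first hop, then return to the caller immediately."""
        first, *rest = hops
        first()                        # local hop: intra-datacenter latency only
        if rest:
            self.pending.append(rest)  # later hops complete in the background
        return "first-hop-done"

    def run_pending(self):
        """Stand-in for the background work that finishes every chain."""
        while self.pending:
            for hop in self.pending.popleft():
                hop()

alice_bids = []               # assumed to live in Alice's datacenter
bob_items = {"Camera": 100}   # assumed to live in Bob's datacenter

def insert_bid():
    alice_bids.append(("Alice", "Camera", 500))

def update_highest_bid():
    bob_items["Camera"] = max(bob_items["Camera"], 500)

runner = ChainRunner()
status = runner.submit([insert_bid, update_highest_bid])
after_first_hop = (list(alice_bids), dict(bob_items))
runner.run_pending()
after_completion = (list(alice_bids), dict(bob_items))
```

In this sketch the caller regains control as soon as the bid is recorded locally; the highest-bid update on Bob's item only becomes visible once the deferred hop runs.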

Problem: what if chains fail?
1. What if servers fail after executing the first hop?
2. What if a chain is aborted in the middle?

Solution: provide all-or-nothing atomicity
1. Chains are durably logged at the first hop
 – Logs are replicated to another nearby datacenter
 – Chains are re-executed upon recovery
2. Chains allow user aborts only at the first hop
Guarantee: first hop commits => all hops eventually commit
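The log-then-re-execute idea can be illustrated with a small sketch. Everything here is hypothetical: the `FirstHopLog` class, the JSON record layout, and the `recover` helper are invented for illustration, replication of the log to a second datacenter is not modeled, and the remaining hops are assumed to be safe to run after a crash (they either never ran or are idempotent).

```python
import json
import os
import tempfile

class FirstHopLog:
    """Hypothetical append-only log of chains accepted at the first hop."""

    def __init__(self, path):
        self.path = path

    def append(self, record):
        with open(self.path, "a") as f:
            f.write(json.dumps(record) + "\n")
            f.flush()
            os.fsync(f.fileno())  # make the record durable before acknowledging

    def records(self):
        if not os.path.exists(self.path):
            return []
        with open(self.path) as f:
            return [json.loads(line) for line in f]

def recover(log, hop_table):
    """Re-execute the remaining hops of every logged chain that never finished."""
    started, done = {}, set()
    for rec in log.records():
        if rec["type"] == "start":
            started[rec["chain_id"]] = rec["remaining_hops"]
        elif rec["type"] == "done":
            done.add(rec["chain_id"])
    resumed = []
    for chain_id, hop_names in started.items():
        if chain_id not in done:
            for name in hop_names:
                hop_table[name]()
            log.append({"type": "done", "chain_id": chain_id})
            resumed.append(chain_id)
    return resumed

items = {"Camera": 100}
hops = {"update_highest_bid": lambda: items.update(Camera=max(items["Camera"], 500))}

path = os.path.join(tempfile.mkdtemp(), "chains.log")
FirstHopLog(path).append(
    {"type": "start", "chain_id": "c1", "remaining_hops": ["update_highest_bid"]}
)
# Simulated crash: the first hop committed, the second hop never ran.
resumed = recover(FirstHopLog(path), hops)
resumed_again = recover(FirstHopLog(path), hops)
```

Because the chain is marked done after re-execution, a second recovery pass finds nothing left to resume.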

Problem: non-serializable interleaving
Concurrent chains can be ordered inconsistently at different hops.
– T1: X=1, Y=1
– T2: X=2, Y=2
– Server-X: T1 < T2
– Server-Y: T2 < T1
Not serializable!
Traditional 2PL+2PC prevents non-serializable interleaving at the cost of high latency.
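One way to see why this interleaving is not serializable is to look for a single serial order that agrees with every server. The sketch below does that with a topological sort; the function name `serial_order` and the input format are assumptions, and it further assumes that all transactions listed at a server conflict with one another, so the order observed there matters.

```python
def serial_order(per_server_orders):
    """Return one serial order consistent with every server's observed order,
    or None if the per-server orders contradict each other."""
    succ = {}
    indegree = {}
    for order in per_server_orders.values():
        for t in order:
            succ.setdefault(t, set())
            indegree.setdefault(t, 0)
        # Consecutive pairs are enough: each server's order is total.
        for earlier, later in zip(order, order[1:]):
            if later not in succ[earlier]:
                succ[earlier].add(later)
                indegree[later] += 1
    ready = sorted(t for t, d in indegree.items() if d == 0)
    result = []
    while ready:
        t = ready.pop(0)
        result.append(t)
        for nxt in sorted(succ[t]):
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                ready.append(nxt)
    # Leftover transactions mean the precedence graph has a cycle.
    return result if len(result) == len(indegree) else None

inconsistent = serial_order({"Server-X": ["T1", "T2"], "Server-Y": ["T2", "T1"]})
consistent = serial_order({"Server-X": ["T1", "T2"], "Server-Y": ["T1", "T2"]})
```

With the orders from the slide, no serial order exists, so the first call returns `None`; when both servers agree, the second call returns that common order.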

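Solution: detect non-serializable interleaving via static analysis
Statically analyze all chains to be executed
– Web applications invoke a fixed set of operations
Serializable if no SC-cycle [Shasha et al., TODS’95]
An SC-cycle contains both kinds of edge, drawn in red and blue on the slide: S-edges between hops of the same chain and C-edges between conflicting hops of different chains.
[Diagram: T1 (X=1, Y=1) and T2 (X=2, Y=2), with a "Conflict?" label.]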
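The SC-cycle test can be sketched as a brute-force search over simple cycles. This is an illustrative sketch only, suitable for the handful of hops in an example: the function name `has_sc_cycle`, the hop labels, and the edge-list representation are assumptions, and it does not reproduce how Lynx's analyzer works.

```python
def has_sc_cycle(s_edges, c_edges):
    """Return True if some simple cycle uses at least one S-edge and one C-edge.

    s_edges link consecutive hops of the same chain; c_edges link conflicting
    hops of different chains. Both are lists of (hop, hop) pairs.
    """
    adj = {}
    for kind, edges in (("S", s_edges), ("C", c_edges)):
        for u, v in edges:
            adj.setdefault(u, []).append((v, kind))
            adj.setdefault(v, []).append((u, kind))

    def dfs(start, node, visited, kinds, length):
        for nxt, kind in adj[node]:
            if nxt == start and length >= 2:
                # Closing the walk here yields a simple cycle of >= 3 edges.
                if kinds | {kind} == {"S", "C"}:
                    return True
            elif nxt not in visited:
                if dfs(start, nxt, visited | {nxt}, kinds | {kind}, length + 1):
                    return True
        return False

    return any(dfs(n, n, {n}, set(), 0) for n in adj)

# Two chains, each writing X at its first hop and Y at its second hop.
s_edges = [("T1.X", "T1.Y"), ("T2.X", "T2.Y")]
both_conflict = has_sc_cycle(s_edges, [("T1.X", "T2.X"), ("T1.Y", "T2.Y")])
one_conflict = has_sc_cycle(s_edges, [("T1.X", "T2.X")])
```

When the chains conflict at both hops, the four edges close a cycle that mixes S-edges and C-edges. Dropping one conflict leaves only a path, so no SC-cycle remains.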

Outline
– Motivation
– Transaction chains
– Lynx’s design
– Evaluation

How Lynx uses chains
User chains: used by programmers to implement application logic
System chains: used internally to maintain
– Secondary indexes
– Materialized join views
– Geo-replicas

Example: secondary index
Bids (base table), columns: Bidder, Item, Price
Bids (secondary index), columns: Bidder, Item, Price
Rows shown on the slide: (Bob, Car, $20), (Alice, Book, $20), (Bob, Camera, $100), (Alice, iPhone, $100), (Alice, Camera, $100), (Bob, iPhone, $20)
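A secondary index kept up to date by a two-hop chain might look like the sketch below. The partitioning (base table keyed by Bidder, index keyed by Item) and the `put_bid_chain` helper are assumptions made for illustration; the slide does not state how Lynx lays out either table.

```python
bids_by_bidder = {}  # base table, assumed to be partitioned by Bidder
bids_by_item = {}    # secondary index, assumed to be partitioned by Item

def put_bid_chain(bidder, item, price):
    """Return the hops of a hypothetical chain: base-table insert, then index insert."""
    row = (bidder, item, price)

    def insert_base():
        bids_by_bidder.setdefault(bidder, []).append(row)

    def insert_index():
        bids_by_item.setdefault(item, []).append(row)

    return [insert_base, insert_index]

# Run each chain to completion, hop by hop.
for hop in put_bid_chain("Alice", "Camera", 100):
    hop()
for hop in put_bid_chain("Bob", "Car", 20):
    hop()

camera_bids = bids_by_item["Camera"]
bob_bids = bids_by_bidder["Bob"]
```

Once both hops have run, the same row can be found either by bidder or by item.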

Example user and system chain
[Diagram: Alice issues "bid on Bob’s camera"; rows shown are (Alice, Book, $100), (Bob, Camera, $100), and the new bid (Alice, Camera, $100), spread across Datacenter-1 and Datacenter-2.]

Lynx statically analyzes all chains beforehand
– Put-bid: Insert to Bids table, then Update Items table
– Read-bids: Read Bids table
[Diagram: two instances each of Put-bid and Read-bids, with an SC-cycle marked.]
One solution: execute the chain as a distributed transaction

SC-cycle source #1: false conflicts in user chains
[Diagram: two Put-bid instances, each with hops Insert to Bids table and Update Items table.]
False conflict because max(bid, current_price) commutes

Solution: users annotate commutativity
[Diagram: two Put-bid instances, each with hops Insert to Bids table and Update Items table; the conflicting hops are annotated "commutes".]
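The claim that a max-style update commutes can be checked directly by applying the same bids in every possible order. The helper names below are hypothetical, and `overwrite_price` is included only as a contrasting operation that does not commute.

```python
from itertools import permutations

def update_highest_bid(current_price, bid):
    # The operation from the slide: keep the larger of the two values.
    return max(bid, current_price)

def overwrite_price(current_price, bid):
    # Contrast case: last writer wins, so the order of application matters.
    return bid

def final_prices(op, start, bids):
    """Apply the bids in every possible order and collect the final prices."""
    results = set()
    for order in permutations(bids):
        price = start
        for bid in order:
            price = op(price, bid)
        results.add(price)
    return results

max_results = final_prices(update_highest_bid, 100, [500, 300, 120])
overwrite_results = final_prices(overwrite_price, 100, [500, 300, 120])
```

A single final price for every ordering is what makes it safe to annotate the hop as commutative; the overwrite variant ends at a different price depending on which bid arrives last.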

SC-cycle source #2: system chains
[Diagram: two Put-bid instances, each with hops Insert to Bids table, …, Insert to Bids-secondary; an SC-cycle is marked.]

Solution: chains provide origin-ordering
Observation: conflicting system chains originate at the same first-hop server.
– Both write the same row of the Bids table
Origin-ordering: if chains T1 < T2 at the same first hop, then T1 < T2 at all subsequent overlapping hops.
– Can be implemented cheaply with sequence number vectors
[Diagram: T1 and T2, each with hops Insert to Bids table and Insert to Bids-secondary.]
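Origin-ordering with sequence numbers can be sketched as follows. The slide only says "sequence number vectors", so the details here are assumptions: the `Origin` and `Destination` classes are hypothetical, the origin keeps one counter per destination server, and a destination buffers any hop that arrives ahead of its predecessor from the same origin.

```python
from collections import defaultdict

class Origin:
    """First-hop server: stamps each outgoing hop with a per-destination counter."""

    def __init__(self, name):
        self.name = name
        self.next_seq = defaultdict(int)  # destination -> next sequence number

    def stamp(self, destination, payload):
        seq = self.next_seq[destination]
        self.next_seq[destination] += 1
        return (self.name, seq, payload)

class Destination:
    """Later-hop server: applies hops from each origin strictly in stamp order."""

    def __init__(self):
        self.expected = defaultdict(int)  # origin -> next sequence number to apply
        self.buffered = {}                # (origin, seq) -> payload that arrived early
        self.applied = []

    def receive(self, message):
        origin, seq, payload = message
        self.buffered[(origin, seq)] = payload
        # Drain every buffered hop that is now next in line for this origin.
        while (origin, self.expected[origin]) in self.buffered:
            key = (origin, self.expected[origin])
            self.applied.append(self.buffered.pop(key))
            self.expected[origin] += 1

bids_server = Origin("bids-server")
index_server = Destination()

m1 = bids_server.stamp("bids-secondary", "T1: insert into Bids-secondary")
m2 = bids_server.stamp("bids-secondary", "T2: insert into Bids-secondary")

index_server.receive(m2)  # arrives first over the network, gets buffered
applied_early = list(index_server.applied)
index_server.receive(m1)  # unblocks both, in origin order
```

Even though T2's hop reaches the index server first, it is held until T1's hop has been applied, so both servers see T1 before T2.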

Limitations of Lynx/chains
1. Chains are not strictly serializable, only serializable.
2. Programmers can abort only at the first hop.
Our application experience: the limitations are manageable.

Outline
– Motivation
– Transaction chains
– Lynx’s design
– Evaluation

Simple Twitter Clone on Lynx

Tweets
Author  Tweet
Alice   New York rocks
Bob     Time to sleep
Eve     Hi there

Follow-Graph
From   To
Alice  Bob
Alice  Eve

Follow-Graph (secondary)
To   From
Bob  Alice
Bob  Clark

Tweets JOIN Follow-Graph (Timeline)
Author (=to)  From   Tweet
Bob           Alice  Time to sleep
Eve           Alice  Hi there

[Slide label: Geo-replicated]

Experimental setup
[Diagram: three datacenters (us-west, europe, us-east) connected by links labeled 82ms, 153ms, and 102ms.]
Lynx prototype:
– In-memory database
– Local disk logging only

Returning on first-hop allows low latency
[Figure: latency of first-hop return compared with chain completion.]

Applications achieve good throughput

Related work
– Transaction decomposition: SAGAS [SIGMOD’87], step-decomposed transactions
– Incremental view maintenance: Views for PNUTS [SIGMOD’09]
– Various geo-distributed/replicated storage: Spanner [OSDI’12], MDCC [EuroSys’13], Megastore [CIDR’11], COPS [SOSP’11], Eiger [NSDI’13], RedBlue [OSDI’12]

Conclusion
Chains support serializability at low latency
– With static analysis of SC-cycles
Key techniques to reduce SC-cycles
– Origin ordering
– Commutative annotation
Chains are useful
– Performing application logic
– Maintaining indices/join views/geo-replicas

Limitations of Lynx/chains
1. Chains are not strictly serializable
 Remedies:
 – Programmers can wait for chain completion
 – Lynx provides read-your-own-writes
2. Programmers can only abort at the first hop
Our application experience shows the limitations are manageable.
[Diagram: a time axis contrasting serializable and strictly serializable orderings.]

2PC and chains: the easy way
[Diagram labels: T1, T2; W(A), W(B), R(A); 2PC-W(AB).]

2PC and chains: the hard way
[Diagram labels: T1, T2; W(A), W(B), R(A)R(B); 2PC-W(AB).]

2PC and chains: the hard way
[Diagram labels: Chain; DC1, DC2, DC3, DC4; A, B, C, D; 2PC retry; parallel unlock.]

Lynx is scalable

Challenge of static analysis: false conflict
T1 and T2 each run:
1. Insert bid into bid history
2. Update max price on item
Conflict on bid history + conflict on item => SC-cycle => not serializable

Solution: commutativity annotations
T1 and T2 each run:
1. Insert bid into bid history
2. Update max price on item
– Conflict on bid history: no real conflict because bid ids are unique (commutative operation)
– Conflict on item: updating max commutes (commutative operation)
No SC-cycle => serializable

ACID: all-or-nothing atomicity
Chain’s failure guarantee:
– If the first hop of a chain commits, then all hops eventually commit
– Users are only allowed to abort a chain in the first hop
Achievable with low latency:
– Log chains durably at the first hop; logs are replicated to a nearby datacenter
– Re-execute stalled chains upon failure recovery

ACID: serializability
Serializability
– The execution result appears as if all transactions obeyed a serial order
– No restrictions on the serial order
[Diagram: the same transactions shown under Ordering 1 and Ordering 2.]

Problem #2: unsafe interleaving
Serializability
– The execution result appears as if all transactions obeyed a serial order
– No restrictions on the serial order
[Diagram: the same transactions shown under Ordering 1 and Ordering 2.]

Chains are not linearizable
Linearizable => a total ordering of chains, and the total order obeys the issue order
[Diagram: transactions on a time axis under Ordering 1 and Ordering 2, contrasting serializability with linearizability.]
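The extra requirement that the total order obey the issue order can be written as a small check. This is a sketch under assumptions: the function name, the (start, end) interval format, and the timestamps are invented for illustration, and the check presumes the given order is already a valid serial order.

```python
def respects_issue_order(serial_order, intervals):
    """True if every transaction that finished before another one started
    also comes before it in the given serial order."""
    position = {t: i for i, t in enumerate(serial_order)}
    for a, (_, a_end) in intervals.items():
        for b, (b_start, _) in intervals.items():
            if a != b and a_end < b_start and position[a] > position[b]:
                return False
    return True

# T1 completes at time 10, well before T2 is issued at time 20.
intervals = {"T1": (0, 10), "T2": (20, 30)}

in_issue_order = respects_issue_order(["T1", "T2"], intervals)
against_issue_order = respects_issue_order(["T2", "T1"], intervals)
```

An execution equivalent only to the order T2, T1 would still be serializable, since any serial order is allowed, but it fails this check because T1 finished before T2 was issued.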

Transaction chains: recap
– Chains provide all-or-nothing atomicity
– Chains ensure serializability via static analysis
Practical challenges:
– How to use chains?
– How to avoid SC-cycles?

Example user chain
1. Insert bid into Alice’s bid history (Bids table, columns Bidder, Item, Price; row: Alice, Camera)
2. Update max price on Bob’s camera (Items table, columns Seller, Item, Highest; row: Bob, Camera)

Lynx implementation
– 5000 lines of C++ and 3500 lines of RPC library
– Uses an in-memory key/value store
– Supports user chains in JavaScript (via V8)

Geo-distributed storage is hard
Applications demand simplicity & performance
– Friendly programming model: relational tables, transactions
– Fast response: ideally, operation latency = O(intra-datacenter RTT)
Geo-distribution leads to high latency
– Coordinating data access across datacenters: operation latency = O(inter-datacenter RTT) = O(100ms)