Declarative Programming over Eventually Consistent Data Stores
KC Sivaramakrishnan, Gowtham Kaki, Suresh Jagannathan

Presentation transcript:

Stateless tier: HTTP, AppServer, Cache.
Consistency, integrity, durability, availability, etc.

Application invariants
– Account balances should be non-negative.
– Usernames should be unique.
– Only bona fide bids are accepted in an auction.
Strong consistency: linearizability and serializability

INTERNET
☐ Strongly consistent, but not “always on”
☐ Be “always on”, but no strong consistency
Eventual consistency: ∞ (convergence)

INTERNET
Eventually Consistent Data Stores
Store consistency levels: basic eventual, read-my-writes, causal, monotonic writes, bounded staleness, read committed, parallel snapshot isolation.
Read-my-writes (Session 1):
//init balance = 0
deposit(100)
? <- get_balance()

Eventual consistency (Session 1):
//init balance = 0
deposit(100)
0 <- get_balance()
Replica 1: bal=100; Replica 2: bal=0
Read-my-writes consistency (Session 1):
//init balance = 0
deposit(100)
??? <- get_balance()
Replica 1: bal=100; Replica 2: bal=0

Eventual consistency (Session 1):
//init balance = 0
deposit(100)
0 <- get_balance()
Replica 1: bal=100; Replica 2: bal=0
Read-my-writes consistency (Session 1):
//init balance = 0
deposit(100)
100 <- get_balance()
Replica 1: bal=100; Replica 2: bal=100
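
The difference between the two reads on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Replica` and `Session` classes, the write id `"w1"`, and the rule "answer only from a replica that has applied all of this session's writes" are hypothetical names and one possible way to obtain read-my-writes, not how any particular store implements it.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """Hypothetical replica: the write ids it has applied, plus a balance."""
    applied: set = field(default_factory=set)
    balance: int = 0

    def apply(self, write_id, amount):
        # Applying the same write twice must not double-count it.
        if write_id not in self.applied:
            self.applied.add(write_id)
            self.balance += amount


@dataclass
class Session:
    """Hypothetical client session that remembers the ids of its own writes."""
    my_writes: set = field(default_factory=set)

    def deposit(self, replica, write_id, amount):
        replica.apply(write_id, amount)
        self.my_writes.add(write_id)

    def eventual_read(self, replica):
        # Basic eventual consistency: answer from whichever replica we reach.
        return replica.balance

    def rmw_read(self, replicas):
        # Read-my-writes: only accept a replica that has applied every
        # write this session issued.
        for r in replicas:
            if self.my_writes <= r.applied:
                return r.balance
        raise RuntimeError("no replica has caught up with this session yet")


r1, r2 = Replica(), Replica()
s = Session()
s.deposit(r1, "w1", 100)       # the deposit lands on replica 1 only

stale = s.eventual_read(r2)    # replica 2 has not seen the deposit yet
fresh = s.rmw_read([r2, r1])   # skips replica 2, answers from replica 1
```

Here `stale` corresponds to the `0 <- get_balance()` outcome and `fresh` to the `100 <- get_balance()` outcome on the slide.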

Operations: deposit(), withdraw(), bid(), tweet()
Application invariants
Store consistency levels

Can we automate the process of mapping application requirements to store consistency levels?

Our solution …
Application requirements: unique usernames, non-negative balance, bona fide bids.
Store consistency guarantees: read-my-writes consistency, causal consistency, read committed isolation level, repeatable read isolation level.
Map: application requirements -> store consistency guarantees
– Specification language: a common medium to express both.
– Classification scheme: an algorithm to … Sound. Optimal.

Prelims - System Model
Replicated data store
– Replica 1: Deposit(200), Withdraw(20), Withdraw(10), ……
– ……
– Replica n: Deposit(200), Withdraw(10), ……
Sessions
– Session 1: v1 = getBalance(); …… v2 = getBalance();
– ……
– Session n
Relations shown on getBalance: session order (SO), visibility (Vis)

Specification Language
Axiomatically capture the set of valid executions.
Associate with each operation a single abstract effect.
– Express relationships between effects
– Primitive relations: visibility (vis), session order (so), same object (sameobj)
– Happens-before
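
The slide names happens-before without spelling out its definition. A minimal sketch, assuming the usual reading that happens-before is the transitive closure of session order together with visibility, is given below. The effect names `a`, `b`, `c` and the helper `transitive_closure` are illustrative only.

```python
def transitive_closure(pairs):
    """Smallest transitive relation containing the given set of pairs."""
    closure = set(pairs)
    while True:
        # Join the relation with itself: (a, b) and (b, d) give (a, d).
        new = {(a, d) for (a, b) in closure for (c, d) in closure if b == c}
        if new <= closure:
            return closure
        closure |= new


# Hypothetical execution: a precedes b in the same session, and b is
# visible to c, which runs in another session.
so = {("a", "b")}
vis = {("b", "c")}

# Assumed definition: happens-before = (so UNION vis)+
hb = transitive_closure(so | vis)
```

With these inputs `hb` also relates `a` to `c`, even though neither `so` nor `vis` does so directly.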

Replicated Bank Account (1)
Session 1 (effect a): //init balance = 100; withdraw(70);
Session 2 (effect b): //init balance = 100; withdraw(70);
Relation shown between a and b: vis
If neither withdrawal sees the other, both succeed and the invariant balance >= 0 is violated.
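
The anomaly can be reproduced with a small simulation. This is a sketch under assumptions: `try_withdraw` is a hypothetical helper that decides using only the effects visible to the caller, and effects are modelled as signed integers.

```python
def try_withdraw(visible_effects, amount, initial=100):
    """Return a withdraw effect (a negative number) if the balance computed
    from the locally visible effects covers the amount, otherwise None."""
    balance = initial + sum(visible_effects)
    return -amount if balance >= amount else None


# Each session talks to a replica that has not seen the other's effect.
a = try_withdraw(visible_effects=[], amount=70)   # Session 1
b = try_withdraw(visible_effects=[], amount=70)   # Session 2

# Once the replicas exchange effects, both withdrawals count.
merged_balance = 100 + a + b

# If a were visible to b, the second withdrawal would be rejected.
b_seeing_a = try_withdraw(visible_effects=[a], amount=70)
```

The rejected case illustrates why a visibility requirement between the two withdrawals is enough to preserve the invariant in this example.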

Bank Account Contracts (2)
Session 1: deposit(100) (effect a)
Session 2: withdraw(50) (effect b); a is visible (vis) to b
Another session: A.getbalance() (effect c); b is visible (vis) to c
– -50 <- A.getbalance() when c sees b but not a
– 50 <- getbalance() when c also sees a
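
The two outcomes follow from which effects the balance query can see. The sketch below assumes the balance is the sum of the visible effects and that causal visibility means "if you see an effect, you also see everything that was visible to it". The table `AMOUNT` and the helper `causally_closed` are hypothetical names.

```python
AMOUNT = {"a": +100, "b": -50}   # a = deposit(100), b = withdraw(50)
VIS = {("a", "b")}               # the withdraw was issued after seeing the deposit


def balance(visible):
    """Balance as observed by a query that sees exactly these effects."""
    return sum(AMOUNT[e] for e in visible)


def causally_closed(visible, vis):
    """Add every effect that is visible to something already visible."""
    closed = set(visible)
    changed = True
    while changed:
        changed = False
        for (x, y) in vis:
            if y in closed and x not in closed:
                closed.add(x)
                changed = True
    return closed


seen_by_c = {"b"}                                   # c sees only the withdraw
anomalous = balance(seen_by_c)                      # deposit is missing
causal = balance(causally_closed(seen_by_c, VIS))   # deposit pulled in via vis
```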

Capturing Store Consistency Levels
– Strong consistency
– Causal consistency
– Eventual consistency

Classification Scheme
Decidable: automatically discharged with the help of the Z3 SMT solver.
– deposit -> EC (eventual consistency)
– withdraw -> SC (strong consistency)
– getBalance -> CC (causal consistency)

Classification Scheme (2)

Transactions
Generalizing to transactions is easy.
– Add a single primitive relation: sametxn(a,b)
– Derived relation:
Isolation guarantees of stores can now be specified.
Read Committed
– Txn 1 (current): oper1(…) (effect a)
– Txn 2 (committed): oper2(…) (effect b), oper3(…) (effect c)
– vis edges are shown for b and c
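
One way to read the Read Committed diagram is as an atomic-visibility rule: if an effect sees one effect of another transaction, it must see that transaction's other effects too. The checker below is a sketch of that reading only; the exact formula on the slide did not survive in the transcript, and `violates_atomic_visibility` and `txn_of` are hypothetical names.

```python
from itertools import product


def violates_atomic_visibility(effects, txn_of, vis):
    """Return the (a, b, c) triples where a sees b but not b's sibling c.

    txn_of maps each effect to its transaction id, which plays the role
    of sametxn; vis is a set of (seen_effect, seeing_effect) pairs."""
    bad = []
    for a, b, c in product(effects, repeat=3):
        siblings = txn_of[b] == txn_of[c] and b != c
        other_txn = txn_of[a] != txn_of[b]
        if siblings and other_txn and (b, a) in vis and (c, a) not in vis:
            bad.append((a, b, c))
    return bad


effects = ["a", "b", "c"]
txn_of = {"a": "txn1", "b": "txn2", "c": "txn2"}

# a sees only half of Txn 2: flagged.
partial = violates_atomic_visibility(effects, txn_of, vis={("b", "a")})

# a sees both effects of Txn 2: accepted.
atomic = violates_atomic_visibility(effects, txn_of, vis={("b", "a"), ("c", "a")})
```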

Monotonic atomic view, repeatable read, read committed

Quelea
Haskell library for Eventually Consistent Data Stores (ECDS)
– Definition language: define operations and transactions on replicated data.
– Specification language: specify consistency and isolation requirements.
Diagram labels: DEFS + Quelea, GHC, Data Store

Case Studies

Application       RUBiS                                     Twitter-lite
#Tables           6                                         5
#Operations       17                                        20
#Transactions     6                                         10
Invariants, e.g.  See all bids placed in current session    Unique username

Results of classification
#EC Ops           14                                        13
#CC Ops           2                                         6
#SC Ops           1                                         1
#RC Txns          4                                         6
#MAV Txns         2                                         3
#RR Txns          0                                         1

Evaluation
Correctness with classification vs without classification
– How do they compare in terms of availability?
Experimental setup:
– Amazon EC2; 5 replicas (StrongRep & Quelea); 1 replica (NoRep)
– Gradually increased the number of concurrent clients, starting from 128
NoRep: no replication
StrongRep: strong replication

Conclusion
Quelea: Haskell library for programming ECDS
– Automatic classification of operation and transaction contracts through an SMT solver
Leveraging off-the-shelf ECDS
– Avoid re-engineering complex systems
– Makes it practical!

Thank you!

State Summarization
Summarization is essential to check the unbounded growth of the log.
How is summarization done?
– Ask the developer for summarization semantics.
– Replace (many) original effects with (few) summary effects.
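
For a bank account, a developer-supplied summarization could collapse the whole log into one net effect. The sketch below assumes effects are `(kind, amount)` pairs and that a balance query only depends on their net sum; `summarize` is a hypothetical function, not part of any library API.

```python
def balance(log):
    """Net balance implied by a log of (kind, amount) effects."""
    return sum(amt if kind == "deposit" else -amt for kind, amt in log)


def summarize(log):
    """Replace the whole log with a single effect that has the same net result."""
    net = balance(log)
    return [("deposit", net)] if net >= 0 else [("withdraw", -net)]


# Effects as they appear on the system model slide.
log = [("deposit", 200), ("withdraw", 20), ("withdraw", 10)]
summary = summarize(log)
```

The summary is shorter than the original log but yields the same result for `balance`, which is the property a summarization must preserve for the queries the application runs.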

Monotonic Reads
… like “monotonic reads” (roughly requiring that time doesn’t appear to go backward). … So, if we want to build an available system providing the monotonic reads session guarantee, we can ensure that read operations only return writes when the writes are present on all servers.
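
The rule quoted above, returning only writes that are present on all servers, amounts to reading the intersection of the servers' write sets. A minimal sketch follows, assuming each server is modelled as a set of write ids and that replication only ever adds writes; `stable_writes` is a hypothetical helper.

```python
def stable_writes(servers):
    """Writes that every server has already applied."""
    return set.intersection(*servers) if servers else set()


# Hypothetical state of three servers.
servers = [{"w1", "w2", "w3"}, {"w1", "w2"}, {"w1", "w2", "w4"}]
first_read = stable_writes(servers)

# Replication delivers the missing writes; nothing is ever removed.
servers[0] |= {"w4"}
servers[1] |= {"w3", "w4"}
servers[2] |= {"w3"}
second_read = stable_writes(servers)
```

Because servers only gain writes, the stable set can only grow, so a later read never observes less than an earlier one, whichever server answers.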

Read Committed (Atomicity)
– Txn 1 (current): X.oper1(…) (effect a)
– Txn 2 (committed): X.oper2(…) (effect b), X.oper3(…) (effect c)
– vis edges are shown for b and c

Repeatable Read
Atomicity + repeatable read should return the same value -> snapshot
– Txn 1 (current): X.oper1(…) (effect a), Y.oper3(…) (effect d)
– Txn 2 (committed): X.oper2(…) (effect b), Y.oper4(…) (effect c)
– vis edges are shown for b and c
Diagram labels: a = d (Atomicity); RR

In the paper
– Stronger eventual consistency
– Highly available transaction support
– Summarization

System Model
Quelea data store: replicas R1, R2, …, Rn, each holding A -> {d10, w2}, B -> {d9}
Session 1: B.deposit($5); B.withdraw($6)
Session 2, ……, Session n
Effects: d5, w6 (related by session order)
Accounts: Alice, Bob

System Model
Quelea data store: replicas R1, R2, …, Rn, each holding A -> {d10, w2}, B -> {d9}
Session 1: B.deposit($5); B.withdraw($6)
Session 2, ……, Session n
d5, w6 (related by session order)
Accounts: Alice, Bob

System Model
Quelea data store: replicas R1, R2, …, Rn; the new effects have reached different replicas:
– A -> {d10, w2}, B -> {d9, d5}
– A -> {d10, w2}, B -> {d9}
– A -> {d10, w2}, B -> {d9, w6}
Session 1: B.deposit($5); B.withdraw($6)
Session 2, ……, Session n
d5, w6 (related by session order)
Accounts: Alice, Bob

System Model
Quelea data store: replicas R1, R2, …, Rn, each holding A -> {d10, w2}, B -> {d3, d5, w6}
Session 1: B.deposit($5); B.withdraw($6)
Session 2, ……, Session n
d5, w6 (related by session order)
Accounts: Alice, Bob

System Model
Quelea data store, R1: A -> {d10, w2}, B -> {d3, d5, w6}
Session 1: B.deposit($5); B.withdraw($6); v1 = B.getBalance()
Effects: d5, w6, gb; relation shown: vis
Accounts: Alice, Bob

System Model
Quelea data store, R1: A -> {d10, w2}, B -> {d3, d5, w6}
Session 1: B.deposit($5); B.withdraw($6); v1 = $3 + $5 - $6 = $2
Effects: d5, w6, gb; relation shown: vis
Accounts: Alice, Bob
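
The last steps of this walkthrough, where replicas end up with the same effect set and a read folds over it, can be sketched with plain set union. The two partial replica states below are hypothetical. Only the converged set {d3, d5, w6} and the result $2 come from the slides.

```python
# Effect ids from the slide, with their signed amounts.
EFFECTS = {"d3": +3, "d5": +5, "w6": -6}


def merge(x, y):
    """Exchange effects between two replicas: plain set union."""
    return x | y


def balance(effect_ids):
    """Fold the visible effects into a balance."""
    return sum(EFFECTS[e] for e in effect_ids)


# Hypothetical intermediate states: each replica has seen one new effect.
r1 = {"d3", "d5"}
rn = {"d3", "w6"}

# Union is commutative, so the merge order does not matter.
converged_1 = merge(r1, rn)
converged_n = merge(rn, r1)

v1 = balance(converged_1)
```

Set union is also associative and idempotent, which is why replicas that receive the same effects in different orders, or more than once, still converge.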

System Model
Replicated data store
– Replica 1: Deposit(200), Withdraw(20), Withdraw(10), ……
– ……
– Replica n: Deposit(200), Withdraw(10), Deposit(10), ……
Sessions
– Session 1: getBalance; …… withdraw(6);
– ……
– Session n
Session order (SO)

Replicated Bank Account (2)
Session 1: deposit(100) (effect a)
Session 2: withdraw(50) (effect b); a is visible (vis) to b
Another session: getbalance() (effect c); b is visible (vis) to c
– -50 <- getbalance() when c sees b but not a
– 50 <- getbalance() when c also sees a

Evaluation

Replicated Bank Account (1)
Session 1: bal = A.getBalance(); if (bal ≥ 70) A.withdraw(70);
Session 2: A.withdraw(70);

Replicated Bank Account (1)
Session 1: 100 = A.getBalance(); if (100 ≥ 70) A.withdraw(70);
Session 2: A.withdraw(70);
Diagram: effects a and b in Session 1, ordered by session order (SO); effect c in Session 2

Replicated Bank Account (1)
Session 1: 100 = A.getBalance(); if (100 ≥ 70) A.withdraw(70);
Session 2: A.withdraw(70);
Diagram: effects a and b in Session 1, ordered by session order (SO); effect c in Session 2; a VIS edge is added

Replicated Bank Account (1)
Required invariant: balance >= 0
Diagram: effects a and b in Session 1, ordered by session order (SO); effect c in Session 2; VIS

Replicated Bank Account (1)
Session 1: //init balance = 100; withdraw(70);
Session 2: //init balance = 100; withdraw(70);

Transactions
Allow composition of operations.
Serializable transactions are unavailable.
Highly available transactions (HAT)
– Atomic, but relaxed isolation.
– Isolation levels: read committed, repeatable read, monotonic atomic view, etc.
– Express foreign key constraints, secondary indexes, etc.
Choosing the correct isolation guarantee is an error-prone process
– Automate it through specifications and classification!
– sametxn(a,b)

Case Study - RUBiS
An “e-Bay”-like auction site
Operations (17): StockItem, RemoveItemFromStock, AddBid, ……
Transactions (6): newItem, OpenAuction, ConcludeAuction, ……
Application invariants:
– Canceling a bid must not violate data integrity
– A bidder must see all bids placed in the current session
– …

INTERNET
☐ Strongly consistent, but not “always on”
☐ Be “always on”, but no strong consistency
Eventual consistency: ∞ (convergence)
Replicated table (four copies shown): VIDEO_ID, COUNT, ……