Logic and Lattices for Distributed Programming Neil Conway, William R. Marczak, Peter Alvaro, Joseph M. Hellerstein UC Berkeley David Maier Portland State University

Distributed Programming: Key Challenges Asynchrony Partial Failure

Dealing with Disorder Enforce global order –Paxos, Two-Phase Commit, GCS, … –“Strong Consistency” Tolerate disorder –Programmer must ensure correct behavior for many possible network orders –“Eventual Consistency” Typical goal: replicas converge to same final state

Goal: Make it easier to write programs on top of eventual consistency

This Talk 1. Prior Work – Convergent Modules (CRDTs) – Monotonic Logic (CALM) 2. Bloom^L 3. Case Studies

[Diagram: replicated Students set. Both clients read Students = {Alice, Bob}; Client 0 writes {Alice, Bob, Dave} while Client 1 writes {Alice, Bob, Carol}. The replicas now disagree: how to resolve?]

Problem: Replicas perceive different event orders. Goal: Same final state at all replicas. Solution: Use commutative operations (“merge functions”)

[Diagram: after merging, both clients converge to Students = {Alice, Bob, Carol, Dave}. Merge = Set Union]

Commutative Operations Common design pattern Formalized as CRDTs: Convergent and Commutative Replicated Data Types – Shapiro et al., INRIA (2011) – Based on join semilattices
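To make the “merge function” pattern concrete, here is a minimal plain-Ruby sketch (class and method names are illustrative, not from any library) of a grow-only set whose merge is union; it resolves the Students conflict above deterministically:

require 'set'

# Illustrative grow-only replicated set (G-Set style).
class GSet
  attr_reader :elems

  def initialize(elems = Set.new)
    @elems = elems.to_set
  end

  # Merge = set union: associative, commutative, and idempotent,
  # so replicas converge regardless of delivery order.
  def merge(other)
    GSet.new(@elems | other.elems)
  end

  def add(x)
    GSet.new(@elems | Set[x])
  end
end

# The Students example: concurrent writes merge to the same state.
r0 = GSet.new(Set["Alice", "Bob"]).add("Dave")
r1 = GSet.new(Set["Alice", "Bob"]).add("Carol")
r0.merge(r1).elems == r1.merge(r0).elems  # => true
r0.merge(r1).elems.to_a.sort              # => ["Alice", "Bob", "Carol", "Dave"]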

Lattices ⟨S, ⊔, ⊥⟩ is a bounded join semilattice iff: – S is a set – ⊔ is a binary operator (“least upper bound”): associative, commutative, and idempotent; induces a partial order on S: x ≤_S y iff x ⊔ y = y (informally, the “merge function” for elements of S) – ⊥ is the “least” element in S: ∀x ∈ S, ⊥ ⊔ x = x
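As a small illustration of the definition (plain Ruby, illustrative names), the increasing-integer lattice: ⊔ is max, ⊥ is negative infinity, and the partial order falls out of the merge:

# Bounded join semilattice over integers: join = max, bottom = -infinity.
class MaxLattice
  BOTTOM = -Float::INFINITY

  attr_reader :value

  def initialize(value = BOTTOM)
    @value = value
  end

  # Least upper bound: associative, commutative, idempotent.
  def join(other)
    MaxLattice.new([value, other.value].max)
  end

  # Induced partial order: x ≤ y iff x ⊔ y = y.
  def leq(other)
    join(other).value == other.value
  end
end

a = MaxLattice.new(3)
a.join(MaxLattice.new(7)).value  # => 7
a.join(a).value                  # => 3 (idempotent)
MaxLattice.new.leq(a)            # => true (⊥ is below everything)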

[Diagram: lattice values growing over time: Set (LUB = Union), Increasing Integer (LUB = Max), Boolean (LUB = Or)]

[Diagram: Client 0 reads Students = {Alice, Bob, Carol, Dave} and writes team pairings drawn from all four students to Teams; concurrently, Client 1 removes Dave from Students. After replica synchronization, both replicas have Students = {Alice, Bob, Carol} plus the Teams that Client 0 wrote.]

[Diagram: same workload, different event order: Client 0 reads Students = {Alice, Bob, Carol} only after Dave’s removal and writes team pairings drawn from the three remaining students. Final Students is the same, but Teams differs from the previous run. Nondeterministic Outcome!]

Problem: Composition of CRDTs can result in non-determinism

Possible Solution: Encapsulate all distributed state in a single CRDT Hard to design, verify, and test Doesn’t scale with application size

Goal: Design a language that allows safe composition of CRDTs

Solution: … Datalog? Concurrent work: distributed programming using Datalog – P2 – Bloom Monotonic logic: building block for convergent distributed programs

Monotonic Logic: as the input set grows, the output set does not shrink (“retraction-free”); order independent; e.g., map, filter, join, union, intersection. Non-Monotonic Logic: new inputs might retract previous outputs; order sensitive; e.g., aggregation, negation.
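A small plain-Ruby sketch of the distinction (illustrative code, not Bloom): growing the input of a monotone operator only grows its output, while a conclusion drawn from an aggregate over partial input can later be retracted:

# Monotone: filter. A larger input never shrinks the output.
evens = ->(xs) { xs.select(&:even?) }
evens.call([2, 3])     # => [2]
evens.call([2, 3, 4])  # => [2, 4] (output grew; nothing retracted)

# Non-monotone: acting on an aggregate. The early conclusion
# "fewer than 3 votes, so give up" is overturned by later input.
give_up = ->(votes) { votes.size < 3 }
give_up.call([1, 2])     # => true  (premature)
give_up.call([1, 2, 3])  # => false (the earlier output is retracted)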

Monotonicity and Determinism Agents learn strictly more knowledge over time Different learning order, same final outcome Result: Program is deterministic!

Consistency As Logical Monotonicity CALM Analysis 1.All monotone programs are deterministic 2.Simple syntactic test for monotonicity Result: Whole-program static analysis for eventual consistency

Problem: CALM only applies to programs over growing sets. What about version numbers, timestamps, and threshold tests?

Quorum Vote A coordinator accepts votes from agents Count # of votes –When Count(Votes) > k, send “success” message Aggregation is non-monotonic!

CRDTs: limited scope (single object), flexible types (any lattice). CALM: whole-program analysis, limited types (only sets). Bloom^L: whole-program analysis, flexible types (any lattice).

Bloom^L Constructs Organization: collection of agents. Communication: message passing. State: lattices. Computation: functions over lattices.

Monotone Functions f : S → T is a monotone function iff ∀a,b ∈ S: a ≤_S b ⇒ f(a) ≤_T f(b)
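A plain-Ruby sketch of the definition (illustrative names), composing the two monotone maps used in the quorum example, set → increase-int → bool:

require 'set'

# set -> increase-int: a ⊆ b implies a.size <= b.size
size = ->(s) { s.size }

# increase-int -> bool: with false < true, m <= n implies
# (m >= 5) <= (n >= 5); the test can flip false -> true, never back.
at_least_5 = ->(n) { n >= 5 }

votes = Set[1, 2, 3]
at_least_5.call(size.call(votes))             # => false
at_least_5.call(size.call(votes | Set[4, 5])) # => true (and stays true)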

[Diagram: composing monotone functions across lattices: size() is a monotone function from set to increase-int; >= 5 is a monotone function from increase-int to boolean. Set (LUB = Union), Increasing Integer (LUB = Max), Boolean (LUB = Or)]

Quorum Vote in Bloom^L

QUORUM_SIZE = 5
RESULT_ADDR = "example.org"

class QuorumVote          # annotated Ruby class
  include Bud

  state do                # program state: lattice declarations
    channel :vote_chn, [:@addr, :voter_id]   # communication interfaces
    channel :result_chn, [:@addr]
    lset  :votes          # merge function for set lattice = union
    lmax  :vote_cnt
    lbool :got_quorum
  end

  bloom do                # program logic
    votes      <= vote_chn {|v| v.voter_id}        # accumulate votes into set
    vote_cnt   <= votes.size                       # monotone function: set -> max
    got_quorum <= vote_cnt.gt_eq(QUORUM_SIZE)      # monotone function: max -> bool
    result_chn <~ got_quorum.when_true { [RESULT_ADDR] }  # threshold test on bool (monotone)
  end
end

Monotonic ⇒ CALM

Bloom^L Features Generalizes logic programming to lattices – Integration of relational-style queries and functions over lattices – Efficient incremental evaluation scheme Library of built-in lattices – Booleans, increasing/decreasing integers, sets, multisets, maps, … API for defining custom lattices

Case Studies Key-Value Store –Object versioning via vector clocks –Quorum replication Replicated Shopping Cart –Using custom lattice types to encode domain-specific knowledge

Case Study: Shopping Carts [diagram slides]

Perspectives on Shopping CRDTs – individual server replicas converge. Bloom – checkout is non-monotonic ⇒ requires distributed coordination. Built-in Bloom^L lattice types – checkout is not a monotone function of any of the built-in lattices.

Observation: Once a checkout occurs, no more shopping actions can be performed

Observation: Each client knows when a checkout can be processed “safely”

Monotone Checkout [Diagram: lattice of observed operation sets: OPS = [1], OPS = [2], OPS = [3], OPS = [1,2], OPS = [2,3] are Incomplete; OPS = [1,2,3] is Complete]
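A hedged plain-Ruby sketch of such a custom lattice (our names, not the paper’s code): the checkout message announces the total number of operations, so “complete” is a monotone property of the accumulated state; merging only adds observed operations, so the state can flip from incomplete to complete but never back:

require 'set'

# Hypothetical cart-checkout lattice. State: set of observed operation
# ids, plus the total announced by the checkout message (nil until seen).
class CheckoutLattice
  attr_reader :ops, :total

  def initialize(ops = Set.new, total = nil)
    @ops = ops
    @total = total
  end

  # Merge: union the observed ops; adopt whichever total is known.
  # (The client sends one consistent total, so totals never conflict.)
  def merge(other)
    CheckoutLattice.new(@ops | other.ops, @total || other.total)
  end

  # Monotone threshold: the "unsafe" partial state stays behind this interface.
  def complete?
    !@total.nil? && @ops.size == @total
  end
end

a = CheckoutLattice.new(Set[1, 2])  # replica saw ops 1 and 2
b = CheckoutLattice.new(Set[3], 3)  # replica saw op 3 and checkout(total = 3)
a.complete?           # => false (not safe to act yet)
a.merge(b).complete?  # => true  (all three operations observed)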

Shopping Takeaways Checkout summary is a monotone function of the client’s activities Custom lattice type captures an application-specific notion of “forward progress” – “Unsafe” state hidden behind the ADT interface

Recap 1. How to build eventually consistent systems – Write disorderly programs 2. Disorderly state – Lattices 3. Disorderly computation – Monotone functions over lattices 4. Bloom^L – Type system for deterministic behavior – Support for custom lattice types

Thank You!

Backup Slides

Strong Consistency in Industry “… there was a single overarching theme within the keynote talks… strong synchronization of the sort provided by a locking service must be avoided like the plague… [the key] challenge is to find ways of transforming services that might seem to need locking into versions that … can operate correctly without locking.” -- Birman et al., “Toward a Cloud Computing Research Agenda” (LADIS, 2009)

Bloom Operational Model [diagram slide]

Quorum Vote in Bloom

QUORUM_SIZE = 5
RESULT_ADDR = "example.org"

class QuorumVote          # annotated Ruby class
  include Bud

  state do                # program state
    channel :vote_chn, [:@addr, :voter_id]   # communication
    channel :result_chn, [:@addr]
    table   :votes, [:voter_id]              # persistent storage
    scratch :cnt, [] => [:cnt]               # transient storage
  end

  bloom do                # program logic
    votes <= vote_chn {|v| [v.voter_id]}         # accumulate votes
    cnt   <= votes.group(nil, count(:voter_id))  # count votes
    result_chn <~ cnt {|c| [RESULT_ADDR] if c.cnt >= QUORUM_SIZE}  # send message when quorum reached: not (set) monotonic!
  end
end

Built-in Lattices

Name   Description                      ⊥          a ⊔ b      Sample Monotone Functions
lbool  Threshold test                   false      a ∨ b      when_true() → v
lmax   Increasing number                −∞         max(a,b)   gt(n) → lbool; +(n) → lmax; -(n) → lmax
lmin   Decreasing number                +∞         min(a,b)   lt(n) → lbool
lset   Set of values                    ∅          a ∪ b      intersect(lset) → lset; product(lset) → lset; contains?(v) → lbool; size() → lmax
lpset  Non-negative set                 ∅          a ∪ b      sum() → lmax
lbag   Multiset of values               ∅          a ∪ b      mult(v) → lmax; +(lbag) → lbag
lmap   Map from keys to lattice values  empty map  –          at(v) → any-lat; intersect(lmap) → lmap

Failure Handling Great question! 1. Monotone programs handle transient faults very well – Deterministic ⇒ simple logging – Commutative, idempotent ⇒ simple recovery 2. Future work: “controlled non-determinism” – Timeout code is fundamentally non-deterministic – But we still want mostly deterministic programs

Handling Non-Monotonicity … is not the focus of this talk. Basic alternatives: 1. Nodes agree on an event order using distributed coordination (e.g., Paxos) 2. Allow non-deterministic outcomes; if needed, compensate and apologize