PCAP Project: Probabilistic CAP and Adaptive Key-value Stores


1 PCAP Project: Probabilistic CAP and Adaptive Key-value Stores
Indranil Gupta, Associate Professor, Dept. of Computer Science, University of Illinois at Urbana-Champaign
Joint work with Muntasir Raihan Rahman, Lewis Tseng, Son Nguyen, Nitin Vaidya
Distributed Protocols Research Group (DPRG)

2 Key-value/NoSQL Storage Systems
Key-value/NoSQL stores: $3.4B sector by 2018
Distributed storage in the cloud
  Netflix: video position (Cassandra)
  Amazon: shopping cart (DynamoDB)
  And many others
NoSQL = “Not Only SQL”

3 Key-value/NoSQL Storage Systems (2)
Necessary API operations: get(key) and put(key, value)
  And some extended operations, e.g., “CQL” in Cassandra key-value store
Lots of open-source systems (startups)
  Cassandra (Facebook)
  Riak (Basho)
  Voldemort (LinkedIn)
Closed-source systems with papers
  Dynamo
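As a minimal sketch of the two necessary operations named on this slide, the snippet below models get(key) and put(key, value) on a single in-memory node. The class name TinyKVStore and the example key are hypothetical; nothing here reflects how Cassandra, Riak, Voldemort, or Dynamo implement replication.

```python
from typing import Dict, Optional


class TinyKVStore:
    """Hypothetical single-node store exposing only get and put."""

    def __init__(self) -> None:
        self._data: Dict[str, str] = {}

    def put(self, key: str, value: str) -> None:
        # Overwrite any previous value stored under this key.
        self._data[key] = value

    def get(self, key: str) -> Optional[str]:
        # Return None when the key has never been written.
        return self._data.get(key)


store = TinyKVStore()
store.put("cart:42", "book")
```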

4 Key-value/NoSQL Storage: Fast and Fresh
Cloud clients expect both
Availability: Low latency for all operations (reads/writes)
  500ms latency increase at Google.com costs 20% drop in revenue
  each extra ms -> $4 M revenue loss
Consistency: read returns value of one of latest writes
  Freshness of data means accurate tracking and higher user satisfaction
Most KV stores only offer weak consistency (Eventual consistency)
  Eventual consistency = if writes stop, all replicas converge, eventually
Why eventual? Why so weak?

5 CAP Theorem -> NoSQL Revolution
Conjectured: [Brewer 00]
Proved: [Gilbert Lynch 02]
When network partitioned, system must choose either strong consistency or availability.
Kicked off NoSQL revolution
Abadi PACELC
  If P, choose A or C
  Else, choose L (latency) or C
[Figure: CAP triangle with corners Consistency, Partition-tolerance, and Availability/Latency; systems labeled on it: HBase, HyperTable, BigTable, Spanner; RDBMSs; Cassandra, RIAK, Dynamo, Voldemort]

6 Hard vs. Soft Partitions
CAP Theorem looks at hard partitions
However, soft partitions may happen inside a data-center
  Periods of elevated message delays
  Periods of elevated loss rates
[Figure: hard partition between Data-center 1 (America) and Data-center 2 (Europe); inside a data-center, congestion at switches (CoreSw, ToR) => soft partition]

7 Our work: From Impossibility to Possibility
C -> Probabilistic C (Consistency)
A -> Probabilistic A (Latency)
P -> Probabilistic P (Partition Model)
Probabilistic CAP Theorem
PCAP System to support SLAs (service level agreements)

8 tc + ta < tp and pua + pic < α
Probabilistic CAP
[Figure: timeline (time axis) showing writes W(1), W(2), a read R(1), and the interval tc]
Probabilistic Consistency (pic, tc)
  A read is tc-fresh if it returns the value of a write that starts at most tc time before the read
  pic is the likelihood that a read is NOT tc-fresh
Probabilistic Latency (pua, ta)
  pua is the likelihood that a read DOES NOT return an answer within ta time units
Probabilistic Partition (α, tp)
  α is the likelihood that a random host pair has message delay exceeding tp time units
PCAP Theorem: Impossible to achieve both Probabilistic Consistency and Latency under Probabilistic Partitions if: tc + ta < tp and pua + pic < α
Bad network -> High (α, tp)
To get better consistency -> lower (pic, tc)
To get better latency -> lower (pua, ta)
Speaker notes:
  PCAP theorem: i.e., there exists an execution such that the three properties cannot be satisfied simultaneously, but this
  Define start and end times (issue time and return time)
  Say ic stands for inconsistency, ua stands for unavailability
  High tp and high α imply bad network conditions
  First condition: when the network is bad (high tp), we cannot achieve both low staleness (low tc) and low latency (low ta) at the same time
  Second condition: if we allow a fraction pic of reads to be tc-stale, and a fraction pua of reads to be later than ta, then when the network is bad (high α), we cannot reduce pic and pua arbitrarily close to zero
  Original CAP theorem is a special case of our PCAP theorem
  The formulation and proof of the PCAP theorem were obtained independently by my collaborators
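The two conditions of the theorem can be checked mechanically. The sketch below is only an illustration of the inequality as stated on the slide; the function name pcap_impossible and the numeric values are made up for the example. Note that a False result only means the theorem does not rule the combination out; it does not show that the combination is achievable.

```python
def pcap_impossible(tc: float, ta: float, tp: float,
                    pic: float, pua: float, alpha: float) -> bool:
    """Return True when both PCAP conditions hold:
    tc + ta < tp  and  pua + pic < alpha."""
    timing_condition = tc + ta < tp
    probability_condition = pua + pic < alpha
    return timing_condition and probability_condition


# Hypothetical numbers (times in seconds): a bad network, where 20% of
# host pairs see delays above 0.5 s.
bad_network = pcap_impossible(tc=0.1, ta=0.2, tp=0.5,
                              pic=0.05, pua=0.05, alpha=0.2)

# Same targets on a better network, where only 5% of pairs are that slow.
better_network = pcap_impossible(tc=0.1, ta=0.2, tp=0.5,
                                 pic=0.05, pua=0.05, alpha=0.05)
```

With these illustrative values, the first call falls inside the impossible region and the second does not, because pua + pic is no longer below α.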

9 Towards Probabilistic SLAs
Consistency SLA: Goal is to
  Meet a desired freshness probability (given freshness interval)
  Maximize probability that client receives operation’s result within the timeout
Example: Google search application/Twitter search
  Wants users to receive “recent” data as search results
  Only 10% of results can be more than 5 min stale
    SLA: (pic, tc) = (0.1, 5 min)
  Minimize response time (fast response to query)
    Minimize: pua (Given: ta)
We define the SLA using the probabilistic models of C and A; this is independent of the PCAP theorem

10 Towards Probabilistic SLAs (2)
Latency SLA: Goal is to
  Meet a desired probability that client receives operation’s result within the timeout
  Maximize freshness probability within given freshness interval
Example: Amazon shopping cart
  Doesn’t want to lose customers due to high latency
  Only 10% of operations can take longer than 300ms
    SLA: (pua, ta) = (0.1, 300ms)
  Minimize staleness (don’t want customers to lose items)
    Minimize: pic (Given: tc)
We define the SLA using the probabilistic models of C and A; this is independent of the PCAP theorem
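The latency SLA is a constrained choice: among settings that keep pua within the target, prefer the one with the lowest pic. The sketch below illustrates that selection over a handful of measurements. The Measurement type, the function best_for_latency_sla, and every number in the table are hypothetical and are not results from the talk.

```python
from typing import List, NamedTuple, Optional


class Measurement(NamedTuple):
    read_delay_ms: float  # hypothetical knob setting
    pua: float            # measured fraction of reads slower than ta
    pic: float            # measured fraction of reads that are tc-stale


def best_for_latency_sla(measurements: List[Measurement],
                         target_pua: float) -> Optional[Measurement]:
    """Pick the setting with the lowest pic among those meeting pua <= target."""
    feasible = [m for m in measurements if m.pua <= target_pua]
    if not feasible:
        return None  # no setting satisfies the latency SLA
    return min(feasible, key=lambda m: m.pic)


# Invented measurements: a larger read delay hurts latency, helps freshness.
observed = [
    Measurement(read_delay_ms=0.0, pua=0.02, pic=0.30),
    Measurement(read_delay_ms=1.0, pua=0.05, pic=0.18),
    Measurement(read_delay_ms=2.0, pua=0.10, pic=0.09),
    Measurement(read_delay_ms=3.0, pua=0.22, pic=0.04),
]

chosen = best_for_latency_sla(observed, target_pua=0.1)
```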

11 Meeting these SLAs: PCAP Systems
PCAP System: ADAPTIVE CONTROL KNOBS on top of a KV-store (Cassandra, Riak); satisfies PCAP SLA
System assumptions:
  Client sends query to coordinator server, which then forwards to replicas (answers take the reverse path)
  There exist background mechanisms to bring stale replicas up to date

Increased Knob      Latency      Consistency
Read Delay          Degrades     Improves
Read Repair Rate    Unaffected
Consistency Level

Continuously adapt control knobs to always satisfy PCAP SLA
Speaker notes:
  This is the existing structure and protocol in Cassandra, Riak, Dynamo
  In most experiments we use the read delay knob: two-way control, fine-grained, not intrusive of CL expectations, non-blocking
  Assumptions: a coordinator node that forwards requests to multiple replicas; background synchronization

12 PCAP System Control Loop Architecture
(1) Active approach: inject test operations (read and write)
    Passive approach: sample ongoing client operations
      Non-intrusive to client workloads
(2) Get estimate of current consistency or latency by analyzing operation log
(3) Update control knobs to reach target SLA
[Figure: control loop between the PCAP Coordinator, the PCAP system knobs, and the original k-v store]
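The loop in steps (1) to (3) can be sketched as a simple feedback controller on the read delay knob. Everything below is an assumption made for illustration: measure_pic is a toy stand-in for injecting operations and analyzing the log, its exponential shape is invented, and the fixed-step update rule is not taken from the talk, which does not state the actual rule.

```python
import math


def measure_pic(read_delay_ms: float) -> float:
    """Toy stand-in for steps (1)-(2): pretend staleness decays as the
    read delay grows. The real system estimates this from an operation log."""
    return 0.5 * math.exp(-read_delay_ms / 2.0)


def adapt_read_delay(target_pic: float, steps: int = 50,
                     step_ms: float = 0.5) -> float:
    """Step (3): nudge the read delay up when consistency misses the target,
    and back down (to recover latency) when there is slack."""
    delay_ms = 0.0
    for _ in range(steps):
        pic = measure_pic(delay_ms)
        if pic > target_pic:
            delay_ms += step_ms                      # improve consistency
        else:
            delay_ms = max(0.0, delay_ms - step_ms)  # improve latency
    return delay_ms


final_delay_ms = adapt_read_delay(target_pic=0.135)
```

In this toy model the controller climbs until the measured pic drops below the target and then hovers within one step of the threshold. A symmetric step like this briefly crosses the target on every other iteration, so a real controller would need a more careful rule to stay within the SLA.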

13 Meeting Consistency SLA for PCAP Cassandra (pic=0.135)
Setup
  9 server Emulab cluster: each server has 4 Xeon + 12 GB RAM
  100 Mbps Ethernet
  YCSB workload (144 client threads)
  Network delay: Log-normal distribution
[Figure: lognormal delay variation, panels for mean latency = 3 ms | 4 ms | 5 ms, with optimal envelopes under different network conditions]
“Change” means the mean and sd of the delay distribution are changed
Consistency always below target SLA
PCAP system satisfies SLA and is close to optimal
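To connect the log-normal delay setup to the partition model (α, tp), the sketch below estimates α, the chance that a message delay exceeds tp, by sampling a log-normal distribution with a given mean, and compares it with the closed-form tail. The spread sigma = 0.5 and the threshold tp = 5 ms are assumptions picked for the example; only the 3 ms and 5 ms mean latencies come from the slide.

```python
import math
import random


def lognormal_mu(mean_ms: float, sigma: float) -> float:
    # For a log-normal variable, mean = exp(mu + sigma^2 / 2).
    return math.log(mean_ms) - sigma ** 2 / 2.0


def estimate_alpha(mean_ms: float, sigma: float, tp_ms: float,
                   samples: int = 50_000, seed: int = 0) -> float:
    """Monte Carlo estimate of P(delay > tp) under log-normal delays."""
    mu = lognormal_mu(mean_ms, sigma)
    rng = random.Random(seed)
    slow = sum(1 for _ in range(samples)
               if rng.lognormvariate(mu, sigma) > tp_ms)
    return slow / samples


def exact_alpha(mean_ms: float, sigma: float, tp_ms: float) -> float:
    """Closed-form tail of the same log-normal distribution."""
    z = (math.log(tp_ms) - lognormal_mu(mean_ms, sigma)) / sigma
    return 0.5 * math.erfc(z / math.sqrt(2.0))


# Assumed spread (sigma) and threshold (tp); 3 ms is one of the slide's means.
alpha_sampled = estimate_alpha(mean_ms=3.0, sigma=0.5, tp_ms=5.0)
alpha_exact = exact_alpha(mean_ms=3.0, sigma=0.5, tp_ms=5.0)
```

Raising the mean delay from 3 ms to 5 ms with the same sigma and tp increases α, which matches the slide's reading of a worse network as a higher (α, tp).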

14 Summary
CAP Theorem motivated NoSQL Revolution
But apps need freshness + fast responses
  Under soft partition
We proposed
  Probabilistic models for C, A, P
  Probabilistic CAP theorem – generalizes classical CAP
  PCAP system satisfies Latency/Consistency SLAs
  Integrated into Apache Cassandra and Riak KV stores

15 MOOC on “Cloud Computing Concepts”
On Coursera
Ran Feb-Apr 2015 (just wrapping up)
120K+ students
Covered distributed systems and algorithms used in cloud computing
Free and Open to everyone
Or do a search on Google for “Coursera Cloud Computing” (click on first link)

16 Our Posters at GCASR 15
Consistency-Availability Tradeoffs and SLAs
  Muntasir Rahman [POSTER HERE]
Online Reconfiguration operations: Morphus project [IEEE ICAC 2015]
  Mainak Ghosh [POSTER HERE]


