Dynamo: Amazon’s Highly Available Key-value Store

Presentation transcript:

Dynamo: Amazon’s Highly Available Key-value Store Giuseppe DeCandia, Deniz Hastorun, Madan Jampani, Gunavardhan Kakulapati, Avinash Lakshman, Alex Pilchin, Swaminathan Sivasubramanian, Peter Vosshall and Werner Vogels SOSP 2007 (ACM SIGOPS)

Index
- Introduction
- NoSQL
- Dynamo
- CAP
- Genealogy
- Partition
- Replication
- Versioning
- Handoff
- Evaluation

Why Non-Relational DB?
- Relational DBs are hard to scale
  - Replication: scaling by duplication
  - Partitioning (sharding): scaling by division
- Some relational features are not needed
  - Update / Delete
  - Join
  - Transactions (ACID)
  - Fixed schema

Desired Characteristics
- High scalability: ability to add nodes incrementally to support more users and data
- High availability: data should remain available even when some nodes go down
- High performance: operations should return quickly
- Consistency: strong consistency is not required
- Durability: data should be persisted to disk, not just kept in volatile memory
- Deployment flexibility
- Modeling flexibility: key-value pairs, graphs
- Query flexibility: multi-gets, range queries

CAP Theorem (1/2) You can’t have it all

CAP Theorem (2/2)

NoSQL Genealogy

Dynamo Motivation
- Build a distributed storage system that
  - scales
  - is simple: key-value
  - is highly available
  - guarantees Service Level Agreements (SLAs)
- ACID vs BASE
  - Basically Available
  - Soft state
  - Eventual consistency

Dynamo Motivation
- An application must deliver its functionality within a bounded time
- Every dependency in the platform needs to deliver its functionality within even tighter bounds
- Example: a service guaranteeing a response within 300 ms for 99.9% of its requests at a peak client load of 500 requests per second
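The SLA example above is stated as a fraction of requests, not as an average. A minimal sketch of checking such a bound is given here; the function name `meets_sla` and its parameters are hypothetical and not taken from the slides.

```python
def meets_sla(latencies_ms, bound_ms=300.0, required_fraction=0.999):
    """Return True if at least `required_fraction` of requests finished within `bound_ms`.

    Illustrative helper only: the slides state the SLA (300 ms for 99.9% of
    requests) but do not describe how it is measured.
    """
    if not latencies_ms:
        raise ValueError("no latency samples")
    within_bound = sum(1 for t in latencies_ms if t <= bound_ms)
    return within_bound >= required_fraction * len(latencies_ms)


# 5 slow requests out of 10,000 (0.05%) still satisfy a 99.9% SLA.
mostly_fast = [10.0] * 9995 + [500.0] * 5
# 10 slow requests out of 1,000 (1%) violate it, although the mean is only 14.9 ms.
too_many_slow = [10.0] * 990 + [500.0] * 10
```

The second sample shows why the bound is phrased per request: a low average can hide a tail of slow responses.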

Consideration
- Sacrifice strong consistency for availability
- Conflict resolution is executed during reads instead of writes, i.e. “always writeable”
- Other principles (principle: effect)
  - Incremental scalability: storage hosts can be added one at a time without undue impact on the system
  - Symmetry: all nodes are the same
  - Decentralization: focus on peer-to-peer techniques
  - Heterogeneity: work must be distributed according to the capabilities of the nodes

Summary of techniques used in Dynamo (problem: technique; advantage)
- Partitioning: consistent hashing; incremental scalability
- High availability for writes: vector clocks; version size is decoupled from update rates
- Handling temporary failures: sloppy quorum and hinted handoff; provides high availability and durability guarantee when some of the replicas are not available
- Recovering from permanent failures: Merkle trees; synchronizes divergent replicas in the background
- Membership and failure detection: gossip; preserves symmetry and avoids having a centralized registry for storing membership and node liveness information

Partition Algorithm
- Consistent hashing: the output range of a hash function is treated as a fixed circular space or “ring”
- Virtual nodes: each physical node can be responsible for more than one virtual node (position on the ring)
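The ring and virtual-node idea can be sketched as follows. This is an illustration under stated assumptions, not Dynamo's code: the class name `HashRing`, the `#vnode` label scheme, the number of virtual nodes, and the use of MD5 to place labels on the ring are all choices made for this example.

```python
import bisect
import hashlib


def ring_position(label: str) -> int:
    """Map a string to a point on the ring (assumption: MD5, 128-bit space)."""
    return int.from_bytes(hashlib.md5(label.encode("utf-8")).digest(), "big")


class HashRing:
    """Consistent-hash ring where each physical node owns several virtual nodes."""

    def __init__(self, vnodes_per_node: int = 8):
        self.vnodes_per_node = vnodes_per_node
        self._positions = []  # sorted virtual-node positions
        self._owner = {}      # position -> physical node name

    def add_node(self, node: str) -> None:
        for i in range(self.vnodes_per_node):
            pos = ring_position(f"{node}#vnode{i}")
            bisect.insort(self._positions, pos)
            self._owner[pos] = node

    def remove_node(self, node: str) -> None:
        self._positions = [p for p in self._positions if self._owner[p] != node]
        self._owner = {p: n for p, n in self._owner.items() if n != node}

    def lookup(self, key: str) -> str:
        """Return the node owning the first virtual node clockwise from the key."""
        if not self._positions:
            raise LookupError("ring is empty")
        idx = bisect.bisect_right(self._positions, ring_position(key))
        return self._owner[self._positions[idx % len(self._positions)]]


ring = HashRing()
for name in ("A", "B", "C"):
    ring.add_node(name)
owner = ring.lookup("cart:42")  # one of "A", "B", "C"
```

Removing a node only reassigns the keys that node owned; every other key keeps its owner, which is the incremental-scalability property listed in the summary of techniques.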

Replication
- Optimistic replication: changes are allowed to propagate to replicas in the background, and concurrent, disconnected work is tolerated
- Each data item is replicated at N hosts
- Preference list: the list of nodes responsible for storing a particular key
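One way to derive a preference list is to walk the ring clockwise from the key's position and collect the first N distinct physical hosts. The sketch below assumes that approach; `build_ring`, `preference_list`, the MD5 placement and the `#vnode` labels are hypothetical names introduced for illustration.

```python
import bisect
import hashlib


def ring_position(label):
    """Map a string to a point on the ring (assumption: MD5)."""
    return int.from_bytes(hashlib.md5(label.encode("utf-8")).digest(), "big")


def build_ring(nodes, vnodes_per_node=4):
    """Return a sorted list of (position, physical_node) virtual nodes."""
    return sorted(
        (ring_position(f"{node}#vnode{i}"), node)
        for node in nodes
        for i in range(vnodes_per_node)
    )


def preference_list(ring, key, n_replicas):
    """Collect the first n_replicas distinct hosts clockwise from the key."""
    positions = [pos for pos, _ in ring]
    start = bisect.bisect_right(positions, ring_position(key))
    chosen = []
    for step in range(len(ring)):
        node = ring[(start + step) % len(ring)][1]
        if node not in chosen:  # skip extra virtual nodes of a host already chosen
            chosen.append(node)
            if len(chosen) == n_replicas:
                break
    return chosen


ring = build_ring(["A", "B", "C", "D"])
replicas = preference_list(ring, "cart:42", n_replicas=3)  # three distinct hosts
```

Skipping repeated hosts matters once virtual nodes are used, since adjacent ring positions may belong to the same physical machine.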

Data Versioning
- A put() call may return to its caller before the update has been applied at all the replicas
- A get() call may return many versions of the same object
- Challenge: an object can have distinct version sub-histories, which the system will need to reconcile in the future
- Solution: use vector clocks to capture causality between different versions of the same object

Vector Clock
- A vector clock is a list of (node, counter) pairs
- Every version of every object is associated with one vector clock
- If every counter in the first object’s clock is less than or equal to the corresponding node’s counter in the second clock, then the first is an ancestor of the second and can be forgotten
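The ancestor rule can be sketched in a few lines. In this illustration a clock is stored as a dict from node name to counter instead of a list of pairs, and the function names and the node names `Sx`, `Sy`, `Sz` are assumptions made for the example.

```python
def increment(clock, node):
    """Return a copy of the clock with the given node's counter advanced by one."""
    updated = dict(clock)
    updated[node] = updated.get(node, 0) + 1
    return updated


def descends(a, b):
    """True if every counter in b is <= the matching counter in a (b is an ancestor of a, or equal)."""
    return all(a.get(node, 0) >= count for node, count in b.items())


def merge(a, b):
    """Element-wise maximum, used when a client reconciles two concurrent versions."""
    return {node: max(a.get(node, 0), b.get(node, 0)) for node in a.keys() | b.keys()}


def reconcile(versions):
    """Drop versions that are strict ancestors of another; what remains are concurrent siblings."""
    survivors = []
    for i, (clock, value) in enumerate(versions):
        dominated = any(
            j != i and other != clock and descends(other, clock)
            for j, (other, _) in enumerate(versions)
        )
        if not dominated and (clock, value) not in survivors:
            survivors.append((clock, value))
    return survivors


d1 = increment({}, "Sx")   # {"Sx": 1}
d2 = increment(d1, "Sx")   # {"Sx": 2}
d3 = increment(d2, "Sy")   # written via Sy
d4 = increment(d2, "Sz")   # written via Sz, concurrent with d3
siblings = reconcile([(d1, "v1"), (d3, "v3"), (d4, "v4")])
d5 = increment(merge(d3, d4), "Sx")  # reconciled version written via Sx
```

Here `d1` is forgotten because `d3` descends from it, while `d3` and `d4` are both kept since neither clock dominates the other.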

Vector clock example

Sloppy Quorum
- NRW configuration (N replicas, R read responses, W write acknowledgements)
  - W = N, R = 1: read-optimized strong consistency
  - W = 1, R = N: write-optimized strong consistency
  - W + R <= N: weak (eventual) consistency; a read might not see the latest update
  - W + R > N: strong consistency through quorum assembly; a read will see at least one copy of the most recent update
- In this model, the latency of a get (or put) operation is dictated by the slowest of the R (or W) replicas. For this reason, R and W are usually configured to be less than N, to provide better latency
- Typical values of (N, R, W) for Amazon applications are (3, 2, 2)
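The W + R > N rule is a pigeonhole argument: any set of R replicas must share at least one member with any set of W replicas. The sketch below checks the rule by brute force for a fixed set of N replicas; the function names are hypothetical, and the check models a strict quorum only. With a sloppy quorum, where a write may land on a stand-in node outside the first N, this overlap is presumably not guaranteed.

```python
from itertools import combinations


def quorums_overlap(n, r, w):
    """The rule from the slide: read and write quorums intersect when R + W > N."""
    if not (1 <= r <= n and 1 <= w <= n):
        raise ValueError("R and W must be between 1 and N")
    return r + w > n


def every_read_meets_every_write(n, r, w):
    """Brute-force check: does every R-subset of replicas share a member with every W-subset?"""
    replicas = range(n)
    return all(
        set(read_set) & set(write_set)
        for read_set in combinations(replicas, r)
        for write_set in combinations(replicas, w)
    )


typical = quorums_overlap(3, 2, 2)   # the (3, 2, 2) setting: True
weak = quorums_overlap(3, 1, 1)      # W + R <= N: False
```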

Hinted Handoff
- Assume N = 3. When A is temporarily down or unreachable during a write, the replica is sent to D instead
- D is given a hint that the replica belongs to A, and delivers it to A once A has recovered
- Again: “always writeable”
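The A/D scenario can be simulated in memory. This is a toy sketch: the `Node` class, the `write` and `deliver_hints` functions, and the in-process "network" are all assumptions for illustration, not Dynamo's interfaces.

```python
class Node:
    """Toy storage node with a regular store and a separate list of hinted replicas."""

    def __init__(self, name):
        self.name = name
        self.up = True
        self.store = {}  # key -> value held as a regular replica
        self.hints = []  # (intended_owner_name, key, value) held for other nodes


def write(key, value, preference_list, fallbacks):
    """Write to each node in the preference list; divert to a fallback with a hint if one is down."""
    spare = iter(fallbacks)
    acks = 0
    for owner in preference_list:
        if owner.up:
            owner.store[key] = value
            acks += 1
            continue
        stand_in = next((f for f in spare if f.up), None)
        if stand_in is not None:
            stand_in.hints.append((owner.name, key, value))
            acks += 1
    return acks


def deliver_hints(stand_in, nodes_by_name):
    """Hand hinted replicas back to owners that have recovered; keep the rest."""
    remaining = []
    for owner_name, key, value in stand_in.hints:
        owner = nodes_by_name[owner_name]
        if owner.up:
            owner.store[key] = value
        else:
            remaining.append((owner_name, key, value))
    stand_in.hints = remaining


a, b, c, d = (Node(n) for n in "ABCD")
a.up = False
acks = write("cart:42", "v1", preference_list=[a, b, c], fallbacks=[d])
hints_while_down = list(d.hints)
a.up = True
deliver_hints(d, {"A": a, "B": b, "C": c, "D": d})
```

The write still collects three acknowledgements while A is down, which is the sense in which the store stays "always writeable".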

Other techniques
- Replica synchronization: Merkle hash tree
- Membership and failure detection: gossip
- Physical storage: Berkeley DB transactional store, MySQL, in-memory buffer
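A Merkle hash tree lets two replicas find out where they differ by comparing hashes from the root down, descending only into subtrees whose hashes disagree. The sketch below is a simplified illustration: grouping keys into a fixed number of buckets, the use of SHA-256, and the function names are assumptions, not details given in the slides.

```python
import hashlib


def digest(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def bucket_of(key: str, buckets: int) -> int:
    """Assign a key to a leaf bucket (assumption: first hash byte modulo bucket count)."""
    return digest(key.encode("utf-8"))[0] % buckets


def build_tree(items: dict, buckets: int = 8):
    """Return tree levels from the root (index 0) down to the leaves; buckets must be a power of two."""
    contents = [[] for _ in range(buckets)]
    for key in sorted(items):
        contents[bucket_of(key, buckets)].append(f"{key}={items[key]}")
    level = [digest("\n".join(c).encode("utf-8")) for c in contents]
    levels = [level]
    while len(level) > 1:
        level = [digest(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels[::-1]


def differing_buckets(a, b, depth=0, index=0):
    """Walk both trees from the root, descending only where the hashes differ."""
    if a[depth][index] == b[depth][index]:
        return []  # identical subtree: nothing below needs to be compared
    if depth == len(a) - 1:
        return [index]
    return (differing_buckets(a, b, depth + 1, 2 * index)
            + differing_buckets(a, b, depth + 1, 2 * index + 1))


replica_1 = {"a": "1", "b": "2", "c": "3"}
replica_2 = dict(replica_1, b="changed")
out_of_sync = differing_buckets(build_tree(replica_1), build_tree(replica_2))
```

Only the bucket containing key `b` is reported, so only that key range would need to be exchanged between the replicas.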

Implementation
- Written in Java
- The local persistence component allows different storage engines to be plugged in:
  - Berkeley Database (BDB) Transactional Data Store: objects of tens of kilobytes
  - MySQL: objects larger than tens of kilobytes
  - BDB Java Edition, etc.

Evaluation

Evaluation

3/18 Cassandra: Bigtable + Dynamo; Partitioning, Replication, Membership
3/25 Consensus on Transaction Commit: Paxos Commit / 2Phase Commit
3/25 Using Paxos to Build a Scalable, Consistent, and Highly Available Datastore: Data Replication
Other: comparison of NoSQL features, pros and cons, query languages, etc.