Big Data Yuan Xue CS 292 Special topics on.

Slides:



Advertisements
Similar presentations
Dynamo: Amazon’s Highly Available Key-value Store
Advertisements

Dynamo: Amazon’s Highly Available Key-value Store Slides taken from created by paper authors Giuseppe DeCandia, Deniz Hastorun,
Ion Stoica, Robert Morris, David Karger, M. Frans Kaashoek, Hari Balakrishnan MIT and Berkeley presented by Daniel Figueiredo Chord: A Scalable Peer-to-peer.
Dynamo: Amazon’s Highly Available Key-value Store ID2210-VT13 Slides by Tallat M. Shafaat.
Case Study - Amazon. Amazon r Amazon has many Data Centers r Hundreds of services r Thousands of commodity machines r Millions of customers at peak times.
AMAZON’S KEY-VALUE STORE: DYNAMO DeCandia,Hastorun,Jampani, Kakulapati, Lakshman, Pilchin, Sivasubramanian, Vosshall, Vogels: Dynamo: Amazon's highly available.
D YNAMO : A MAZON ’ S H IGHLY A VAILABLE K EY - V ALUE S TORE Presented By Roni Hyam Ami Desai.
Distributed Hash Tables Chord and Dynamo Costin Raiciu, Advanced Topics in Distributed Systems 18/12/2012.
Amazon’s Dynamo Simple Cloud Storage. Foundations 1970 – E.F. Codd “A Relational Model of Data for Large Shared Data Banks”E.F. Codd –Idea of tabular.
Dynamo: Amazon's Highly Available Key-value Store Distributed Storage Systems CS presented by: Hussam Abu-Libdeh.
Dynamo: Amazon's Highly Available Key-value Store Guiseppe DeCandia, Deniz Hastorun, Madan Jampani, Gunavardhan Kakulapati, Avinash Lakshman, Alex Pilchin,
Amazon Fay Chang, Jeffrey Dean, Sanjay Ghemawat, Wilson C. Hsieh, Deborah A. Wallach, Mike Burrows, Tushar Chandra, Andrew Fikes, Robert E. Gruber Google,
Dynamo: Amazon’s Highly Available Key-value Store Adopted from slides and/or materials by paper authors (Giuseppe DeCandia, Deniz Hastorun, Madan Jampani,
1 Dynamo Amazon’s Highly Available Key-value Store Scott Dougan.
Dynamo Highly Available Key-Value Store 1Dennis Kafura – CS5204 – Operating Systems.
CS 582 / CMPE 481 Distributed Systems
Dynamo Kay Ousterhout. Goals Small files Always writeable Low latency – Measured at 99.9 th percentile.
P2P: Advanced Topics Filesystems over DHTs and P2P research Vyas Sekar.
Dynamo: Amazon’s Highly Available Key- value Store (SOSP’07) Giuseppe DeCandia, Deniz Hastorun, Madan Jampani, Gunavardhan Kakulapati, Avinash Lakshman,
Versioning and Eventual Consistency COS 461: Computer Networks Spring 2011 Mike Freedman 1.
Wide-area cooperative storage with CFS
Dynamo A presentation that look’s at Amazon’s Dynamo service (based on a research paper published by Amazon.com) as well as related cloud storage implementations.
Lecture 10 Naming services for flat namespaces. EECE 411: Design of Distributed Software Applications Logistics / reminders Project Send Samer and me.
Amazon’s Dynamo System The material is taken from “Dynamo: Amazon’s Highly Available Key-value Store,” by G. DeCandia, D. Hastorun, M. Jampani, G. Kakulapati,
Dynamo: Amazon's Highly Available Key-value Store
Dynamo: Amazon’s Highly Available Key-value Store Giuseppe DeCandia, et.al., SOSP ‘07.
Cloud Storage – A look at Amazon’s Dyanmo A presentation that look’s at Amazon’s Dynamo service (based on a research paper published by Amazon.com) as.
Dynamo: Amazon’s Highly Available Key-value Store Presented By: Devarsh Patel 1CS5204 – Operating Systems.
EECS 262a Advanced Topics in Computer Systems Lecture 22 P2P Storage: Dynamo November 14 th, 2012 John Kubiatowicz and Anthony D. Joseph Electrical Engineering.
CSE 486/586, Spring 2012 CSE 486/586 Distributed Systems Case Study: Amazon Dynamo Steve Ko Computer Sciences and Engineering University at Buffalo.
Peer-to-Peer in the Datacenter: Amazon Dynamo Aaron Blankstein COS 461: Computer Networks Lectures: MW 10-10:50am in Architecture N101
Dynamo: Amazon’s Highly Available Key-value Store Giuseppe DeCandia et al. [Amazon.com] Jagrut Sharma CSCI-572 (Prof. Chris Mattmann)
Dynamo: Amazon’s Highly Available Key-value Store COSC7388 – Advanced Distributed Computing Presented By: Eshwar Rohit
EECS 262a Advanced Topics in Computer Systems Lecture 22 P2P Storage: Dynamo November 17 th, 2014 John Kubiatowicz Electrical Engineering and Computer.
Depot: Cloud Storage with minimal Trust COSC 7388 – Advanced Distributed Computing Presentation By Sushil Joshi.
Cloud Computing Cloud Data Serving Systems Keke Chen.
Dynamo: Amazon's Highly Available Key-value Store Dr. Yingwu Zhu.
Apache Cassandra - Distributed Database Management System Presented by Jayesh Kawli.
Dynamo: Amazon’s Highly Available Key-value Store DeCandia, Hastorun, Jampani, Kakulapati, Lakshman, Pilchin, Sivasubramanian, Vosshall, Vogels PRESENTED.
VICTORIA UNIVERSITY OF WELLINGTON Te Whare Wananga o te Upoko o te Ika a Maui SWEN 432 Advanced Database Design and Implementation Amazon’s Dynamo Lecturer.
D YNAMO : A MAZON ’ S H IGHLY A VAILABLE K EY - VALUE S TORE Presenters: Pourya Aliabadi Boshra Ardallani Paria Rakhshani 1 Professor : Dr Sheykh Esmaili.
Dynamo: Amazon’s Highly Available Key-value Store
CSE 486/586 CSE 486/586 Distributed Systems Case Study: Amazon Dynamo Steve Ko Computer Sciences and Engineering University at Buffalo.
VICTORIA UNIVERSITY OF WELLINGTON Te Whare Wananga o te Upoko o te Ika a Maui SWEN 432 Advanced Database Design and Implementation Data Versioning Lecturer.
CAP + Clocks Time keeps on slipping, slipping…. Logistics Last week’s slides online Sign up on Piazza now – No really, do it now Papers are loaded in.
Peer to Peer Networks Distributed Hash Tables Chord, Kelips, Dynamo Galen Marchetti, Cornell University.
Databases Illuminated
EECS 262a Advanced Topics in Computer Systems Lecture 22 P2P Storage: Dynamo November 20 th, 2013 John Kubiatowicz and Anthony D. Joseph Electrical Engineering.
Paper Survey of DHT Distributed Hash Table. Usages Directory service  Very little amount of information, such as URI, metadata, … Storage  Data, such.
Dynamo: Amazon’s Highly Available Key-value Store Giuseppe DeCandia, Deniz Hastorun, Madan Jampani, Gunavardhan Kakulapati, Avinash Lakshman, Alex Pilchin,
DYNAMO: AMAZON’S HIGHLY AVAILABLE KEY-VALUE STORE GIUSEPPE DECANDIA, DENIZ HASTORUN, MADAN JAMPANI, GUNAVARDHAN KAKULAPATI, AVINASH LAKSHMAN, ALEX PILCHIN,
Dynamo: Amazon’s Highly Available Key-value Store DAAS – Database as a service.
Department of Computer Science, Johns Hopkins University EN Instructor: Randal Burns 24 September 2013 NoSQL Data Models and Systems.
Kitsuregawa Laboratory Confidential. © 2007 Kitsuregawa Laboratory, IIS, University of Tokyo. [ hoshino] paper summary: dynamo 1 Dynamo: Amazon.
VICTORIA UNIVERSITY OF WELLINGTON Te Whare Wananga o te Upoko o te Ika a Maui SWEN 432 Advanced Database Design and Implementation Amazon’s Dynamo Lecturer.
CPT-S Advanced Databases 11 Yinghui Wu EME 49.
CSCI5570 Large Scale Data Processing Systems NoSQL Slide Ack.: modified based on the slides from Peter Vosshall James Cheng CSE, CUHK.
Amazon Simple Storage Service (S3)
CSE 486/586 Distributed Systems Case Study: Amazon Dynamo
P2P: Storage.
Dynamo: Amazon’s Highly Available Key-value Store
Lecturer : Dr. Pavle Mogin
Lecture 9: Dynamo Instructor: Weidong Shi (Larry), PhD
John Kubiatowicz Electrical Engineering and Computer Sciences
Scaling Out Key-Value Storage
EECS 498 Introduction to Distributed Systems Fall 2017
EECS 498 Introduction to Distributed Systems Fall 2017
Key-Value Tables: Chord and DynamoDB (Lecture 16, cs262a)
Apollo Facebook is trying to address problems with latencies by switching to a NoSQL database called Apollo. Facebook created Apollo internally, and it.
CSE 486/586 Distributed Systems Case Study: Amazon Dynamo
Presentation transcript:

Big Data Yuan Xue CS 292 Special topics on

Part II NoSQL Database (Amazon Dynamo) Yuan Xue

Introduction  Dynamo Background  Developed in Amazon (published 2007)  Shopping cart management for Amazon Dynamo BigTable VoldemortRiak Cassandra

Overview  Distributed storage system  Key-value store/Database  Simple  Query Model: simple read and write operations to a data item that is uniquely identified by a key  Highly available  Sacrifice strong consistency for availability  Conflict resolution is executed during read instead of write, i.e. “always writeable”.  Scalable  Incremental scalability.  Symmetry.  Decentralization.  Heterogeneity.  Guarantee SLA

Road Map  Database User/Application Developer: How to use?  (Logic) data model and CRUD operations  Database System Designer: How to design?  Under the hood: (Physical) data model and distribution algorithm  Database Designer: How to link application needs with database design  Schema design

Data Model and Operation  Two operations: get() and put().  get(key) operation  locates the object replicas associated with the key in the storage system and returns a single object or a list of objects with conflicting versions along with a context.  put(key, context, object) operation determines where the replicas of the object should be placed based on the associated key, and writes the replicas to disk.  The context encodes system metadata about the object that is opaque to the caller and includes information such as the version of the object. The context information is stored along with the object so that the system can verify the validity of the context object supplied in the put request.

Road Map  Database User/Application Developer: How to use?  (Logic) data model and CRUD operations  Database System Designer: How to design?  Under the hood: (Physical) data model and distribution algorithm  Database Designer: How to link application needs with database design  Schema design

Solution Summary ProblemTechniqueAdvantage PartitioningConsistent HashingIncremental Scalability High Availability for writes Vector clocks with reconciliation during reads Version size is decoupled from update rates. Handling temporary failuresSloppy Quorum and hinted handoff Provides high availability and durability guarantee when some of the replicas are not available. Recovering from permanent failures Anti-entropy using Merkle trees Synchronizes divergent replicas in the background. Membership and failure detection Gossip-based membership protocol and failure detection. Preserves symmetry and avoids having a centralized registry for storing membership and node liveness information.

Partition Algorithm  Consistent hashing: the output range of a hash function is treated as a fixed circular space or “ring”.  ” Virtual Nodes”: Each node can be responsible for more than one virtual node.

Advantages of using virtual nodes  If a node becomes unavailable the load handled by this node is evenly dispersed across the remaining available nodes.  When a node becomes available again, the newly available node accepts a roughly equivalent amount of load from each of the other available nodes.  The number of virtual nodes that a node is responsible can decided based on its capacity, accounting for heterogeneity in the physical infrastructure.

Replication  Each data item is replicated at N hosts.  “preference list”: The list of nodes that is responsible for storing a particular key.

Data Versioning  A put() call may return to its caller before the update has been applied at all the replicas  A get() call may return many versions of the same object.  Challenge: an object having distinct version sub-histories, which the system will need to reconcile in the future.  Solution: uses vector clocks in order to capture causality between different versions of the same object.

Vector Clock  A vector clock is a list of (node, counter) pairs.  Every version of every object is associated with one vector clock.  If the counters on the first object’s clock are less-than-or-equal to all of the nodes in the second clock, then the first is an ancestor of the second and can be forgotten.

get () and put () operations under the hood 1. Route its request through a generic load balancer that will select a node based on load information. 2. Use a partition-aware client library that routes requests directly to the appropriate coordinator nodes.

Sloppy Quorum  R/W is the minimum number of nodes that must participate in a successful read/write operation.  Setting R + W > N yields a quorum-like system.  In this model, the latency of a get (or put) operation is dictated by the slowest of the R (or W) replicas. For this reason, R and W are usually configured to be less than N, to provide better latency.

Hinted handoff  Assume N = 3. When A is temporarily down or unreachable during a write, send replica to D.  D is hinted that the replica is belong to A and it will deliver to A when A is recovered.  Again: “always writeable”

Replica synchronization  Merkle hash tree  a hash tree where leaves are hashes of the values of individual keys.  Parent nodes higher in the tree are hashes of their respective children.  Advantage of Merkle tree:  Each branch of the tree can be checked independently without requiring nodes to download the entire tree.  Help in reducing the amount of data that needs to be transferred while checking for inconsistencies among replicas.

Acknowledgment and Additional Reading  Slides taken from created by paper authors Giuseppe DeCandia, Deniz Hastorun, Madan Jampani, Gunavardhan Kakulapati, Avinash Lakshman, Alex Pilchin, Swaminathan Sivasubramanian, Peter Vosshall and Werner Vogelswww.slideworld.com  Comparison Hbase and Cassandra 