
Slide 1: The Cloud Trade Off. IBM Haifa Research, Storage Systems

Slide 2: Fundamental Requirements from Cloud Storage Systems
 The Google File System, first design consideration: "component failures are the norm rather than the exception… We have seen problems caused by application bugs, operating system bugs, human errors, and the failures of disks, memory, connectors, networking, and power supplies. Therefore, constant monitoring, error detection, fault tolerance, and automatic recovery must be integral to the system."
 Amazon Dynamo key-value store: "In this environment there is a particular need for storage technologies that are always available. For example, customers should be able to view and add items to their shopping cart even if disks are failing, network routes are flapping, or data centers are being destroyed by tornados. Therefore, the service responsible for managing shopping carts requires that it can always write to and read from its data store, and that its data needs to be available across multiple data centers."
 Facebook photo storage system (Haystack): "In large scale systems, failures happen every day. Our users rely on their photos being available and should not experience errors despite the inevitable server crashes and hard drive failures. It may happen that an entire datacenter loses power or a cross-country link is severed. Haystack replicates each photo in geographically distinct locations."

Slide 3: The CAP Theorem
 Let's concentrate on being:
–Always available
–Fault tolerant (to many types of failure)
-This requires replication across nodes/data centers
 Is that asking too much? It turns out that it is.
 The CAP Theorem (Eric Brewer):
–One can achieve at most two of the following:
-Data Consistency
-System Availability
-Tolerance to network Partitions
–It was first stated as a conjecture by Eric Brewer at PODC 2000.
–The conjecture was formalized and proved by MIT researchers Seth Gilbert and Nancy Lynch in 2002.
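
The trade-off behind the theorem can be made concrete with a minimal Python sketch (not from the slides; the Replica class and prefer_consistency flag are illustrative names only): during a partition, a replica that insists on consistency must refuse writes it cannot replicate, while a replica that stays available accepts them and lets copies diverge.

# A minimal sketch of the choice a replica faces during a network partition:
# refuse writes (stay consistent, give up availability) or accept them locally
# (stay available, allow replicas to diverge).

class Replica:
    def __init__(self, name, prefer_consistency):
        self.name = name
        self.value = None
        self.prefer_consistency = prefer_consistency
        self.partitioned = False   # True when the WAN link to peers is down

    def write(self, value):
        if self.partitioned and self.prefer_consistency:
            # CP behaviour: cannot reach the other replicas, so reject the update.
            raise RuntimeError("unavailable: cannot replicate during partition")
        # AP behaviour (or no partition): accept the update locally.
        self.value = value
        return "ok"

a = Replica("A", prefer_consistency=False)   # AP-style replica
a.partitioned = True
print(a.write("cart += book"))               # accepted, but peers no longer agree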

Slide 4: Consistency (Simplified). [Diagram: Replica A and Replica B connected over a WAN; an Update arrives at one replica and a Retrieve is served by the other.]

Slide 5: Tolerance to Network Partitions / Availability. [Diagram: Replica A and Replica B with the WAN link between them severed; an Update arrives at one side of the partition.]

Slide 6: Problems with CAP (Daniel Abadi, Yale University)
 Asymmetry of the CAP properties:
–C is a property of the system in general
–A is a property of the system only if there is a partition
 Effectively, there are two choices:
–Be consistent (C):
-Make sure you have a good transaction protocol. But in the presence of a network partition the protocol would fail, resulting in unavailability.
–Be available in the presence of partitions (PA):
-This is where cloud storage typically lives (almost).
-Always accept an update, no matter which replica received it.
-But what about the consistency issue?

Slide 7: Adding Latency to the Game (Daniel Abadi, Yale University)
 Consistency has its price even in the absence of network partitions.
 Synchronization between replicas has its overhead:
–There is no way to get around it without at least one round-trip message.
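
To make the latency point concrete, here is a hedged Python sketch (the 50 ms WAN_ROUND_TRIP figure is an assumption chosen purely for illustration): a synchronous write pays at least one replica round trip before returning, while an asynchronous write returns immediately at the cost of a temporarily stale replica.

# Why consistency costs latency even without partitions: a synchronous write
# waits for the replica's acknowledgement (one simulated WAN round trip); an
# asynchronous write returns immediately and lets the replica lag.

import time

WAN_ROUND_TRIP = 0.05   # assumed 50 ms replica round trip, for illustration

def write_sync(value, store):
    store["primary"] = value
    time.sleep(WAN_ROUND_TRIP)      # wait for the replica to acknowledge
    store["replica"] = value        # consistent, but the caller paid the RTT
    return value

def write_async(value, store):
    store["primary"] = value        # low latency; replica is updated later
    return value                    # a read of "replica" may see a stale value

store = {"primary": None, "replica": None}
t0 = time.time(); write_sync("x=1", store);  print("sync  %.0f ms" % ((time.time() - t0) * 1000))
t0 = time.time(); write_async("x=2", store); print("async %.0f ms" % ((time.time() - t0) * 1000))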

Slide 8: CAP Revised (Daniel Abadi, Yale University)
 PACELC:
–In the case of a partition (P), does the system choose availability (A) or consistency (C)?
–Else (E), does the system choose latency (L) or consistency (C)?
 PA/EL:
–Dynamo, SimpleDB, Cassandra, Riptano, CouchDB, Cloudant
 PC/EC:
–Traditional database systems
 PA/EC:
–GenieDB
 PC/EL:
–Existence is debatable

Slide 9: Strong vs. Weak Consistency
 Strong consistency:
–After an update is committed, every subsequent access will return the updated value.
 Weak consistency:
–The system does not guarantee that subsequent accesses will return the updated value.
–A number of conditions might need to be met before the updated value is returned.
–Inconsistency window: the period between the update and the point in time when every access is guaranteed to return the updated value.
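
The inconsistency window can be illustrated with a small, purely illustrative Python sketch (LaggyReplicaStore is a made-up class, not any real system): a write lands on the primary immediately but is applied to the replica only after a configured lag, so replica reads are stale until the window closes.

import time

class LaggyReplicaStore:
    def __init__(self, lag):
        self.lag = lag
        self.primary = {}
        self.replica = {}
        self.pending = []             # (apply_at, key, value) awaiting replication

    def write(self, key, value):
        self.primary[key] = value
        self.pending.append((time.time() + self.lag, key, value))

    def read_replica(self, key):
        now = time.time()
        still_pending = []
        for apply_at, k, v in self.pending:
            if apply_at <= now:
                self.replica[k] = v   # replication catches up
            else:
                still_pending.append((apply_at, k, v))
        self.pending = still_pending
        return self.replica.get(key)

store = LaggyReplicaStore(lag=0.1)
store.write("cart", ["book"])
print(store.read_replica("cart"))     # None: inside the inconsistency window
time.sleep(0.2)
print(store.read_replica("cart"))     # ['book']: the window has closed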

Slide 10: Eventual Consistency
 A specific form of weak consistency.
 The literature definition:
–The storage system guarantees that, if no new updates are made to the data, eventually (after the inconsistency window closes) all accesses will return the last updated value.
 In the absence of failures, the maximum size of the inconsistency window can be determined from factors such as:
–communication delays
–system load
–number of replicas
–…
 Example:
–The most popular system that implements eventual consistency is DNS, the Domain Name System. Updates to a name are distributed according to a configured pattern and, in combination with time-controlled caches, eventually all clients will see the update.

Slide 11: Models of Eventual Consistency
 Read-your-writes consistency:
–After updating a value, a process will always read the updated value and will never see an older value.
 Monotonic read consistency:
–If a process has seen a particular value, any subsequent access will never return any previous value.
 Monotonic write consistency:
–The system guarantees to serialize the writes of one process.
 Many more…
 Properties can be combined:
–E.g., monotonic reads plus read-your-own-writes.
 Applications should be written with the consistency model in mind.
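
As one example of enforcing such a guarantee in client code, here is a hedged sketch with invented class names (Replica, MonotonicReadSession): a session remembers the highest version it has observed and ignores replicas that are still behind, which yields monotonic reads on top of an eventually consistent store.

# Monotonic reads via a client-side session: never return data from a replica
# whose version is older than something this session has already seen.

class Replica:
    def __init__(self, version, value):
        self.version = version
        self.value = value

class MonotonicReadSession:
    def __init__(self, replicas):
        self.replicas = replicas
        self.last_seen_version = -1

    def read(self):
        for r in self.replicas:
            if r.version >= self.last_seen_version:   # never go back in time
                self.last_seen_version = r.version
                return r.value
        raise RuntimeError("no replica is caught up to this session yet")

replicas = [Replica(version=1, value="old"), Replica(version=2, value="new")]
session = MonotonicReadSession(replicas)
print(session.read())                 # "old" (version 1) is acceptable at first
session.last_seen_version = 2         # pretend this session already saw version 2
print(session.read())                 # now only the version-2 replica qualifies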

Slide 12: The Consistency-Availability Trade-Off
 Some definitions:
–N: the number of replicas
–W: the number of replicas that must acknowledge receipt of the update before the update completes
–R: the number of replicas that are contacted when a data object is accessed through a read operation
 If W+R > N, the write set and the read set always overlap, so consistency can be guaranteed.
–If, however, not all of the required replicas can be contacted, the system is not available.
–Examples:
-Systems that focus solely on fault tolerance often use N=3 (with W=2 and R=2).
-R=1, W=N: optimized for read access.
-W=1, R=N: optimized for write access.
 If W+R <= N, consistency cannot be guaranteed.
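
The W/R arithmetic can be shown with a short, self-contained sketch (a toy last-write-wins store, not any particular product): with N=3, W=2, R=2 the read quorum always intersects the write quorum, so a read returns the newest committed version even when one replica missed the write.

# A minimal quorum sketch: a write succeeds once W replicas acknowledge it,
# a read queries R replicas and returns the highest version it sees.
# With W + R > N the read set always intersects the latest write set.

N, W, R = 3, 2, 2
assert W + R > N, "this configuration would not guarantee reading the latest write"

replicas = [{} for _ in range(N)]       # each replica maps key -> (version, value)

def write(key, version, value, reachable):
    acks = 0
    for i in reachable:                 # indices of replicas we can contact
        replicas[i][key] = (version, value)
        acks += 1
        if acks == W:
            return True                 # update completes after W acknowledgements
    return False                        # fewer than W replicas reachable: unavailable

def read(key, reachable):
    answers = [replicas[i][key] for i in reachable[:R] if key in replicas[i]]
    return max(answers)[1] if answers else None   # newest version wins

write("cart", 1, ["book"], reachable=[0, 1])      # replica 2 missed the write
print(read("cart", reachable=[1, 2]))             # still sees ['book']: the sets overlap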

Slide 13: The Trade-Off Granularity
 Which of the following operations are natural to the eventual consistency model?
–Writing an object as a whole
–Append
–Rename
–ACL setting
 Conclusion:
–The storage system can and should define different R, W settings for different operations.
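
A hedged sketch of that conclusion, with entirely hypothetical operation names and quorum values: a per-operation table lets whole-object writes run with relaxed, eventually consistent settings while operations such as rename and ACL changes use strict quorums.

# Hypothetical per-operation quorum policy (illustration only).

N = 3

QUORUMS = {
    "put_object": {"W": 1, "R": 1},   # eventual consistency is acceptable
    "append":     {"W": 2, "R": 2},   # needs ordering, so use a majority
    "rename":     {"W": N, "R": 1},   # must never be observed half-applied
    "set_acl":    {"W": N, "R": 1},   # security metadata: write to all replicas
}

def quorum_for(operation):
    q = QUORUMS[operation]
    strong = q["W"] + q["R"] > N      # W + R > N implies the strong-consistency case
    return q["W"], q["R"], strong

for op in QUORUMS:
    w, r, strong = quorum_for(op)
    print(f"{op:11s} W={w} R={r} strongly consistent: {strong}")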

Slide 14: References
 S. Gilbert and N. Lynch: Brewer's Conjecture and the Feasibility of Consistent, Available, Partition-Tolerant Web Services. ACM SIGACT News 33(2), 2002.
 W. Vogels: Eventually Consistent. ACM Queue 6(6), 2008.
 D. Abadi (Yale University): Problems with CAP. Blog post.