CS 600.419 Storage Systems, Lecture 14: Consistency and Availability Tradeoffs

Overview
Bayou – always available replicated storage
–always disconnected operation, even when connected
–application-specific conflict resolution
–replication
Porcupine – self-adapting, self-tuning mail system
–lock free, eventual consistency
–manageability, scalability, and performance tradeoffs

Bayou: System Goals
Always available system
–read and write regardless of network/system state
Automatic conflict resolution
Eventual consistency
–no instantaneous consistency guarantees, but always merges to a consistent state
–one-copy serializable equivalence
Based on pair-wise communication
–no central services to fail or limit availability

Bayou: Example Applications
Non-real-time, collaborative applications
–shared calendars, mail, document editing, program development
Applications implemented
–Meeting room scheduler: a degenerate calendar
  form-based reservation
  tentative (gray) and committed (black) reservations
–Bibliography database
  keyed entries
  automatic merging of the same item under different keys
Applications have well-defined conflict and resolution semantics
–application-specific, but automatic, resolution
–Bayou does not generalize to block storage

Bayou: System Architecture (figure)

Bayou: System Architecture
Servers may be
–distinguished
–collocated
RPC interface
–read/write only
–sessions
Data collections replicated in full
–weak consistency
–update any copy, read any copy

Bayou: System Architecture
Server state
–log of writes
Each write has a global ID
–assigned by the accepting server
Anti-entropy sessions
–pair-wise conflict resolution
–reduce disorder
–apply locally accepted writes to other replicas
Epidemic algorithms
–pair-wise exchanges between many sites
–converge to a consistent state

Bayou: Conflict Resolution
Application-specific conflict resolution
Fine-grained
–record level, e.g. individual meeting-room entries
Automatic resolution
–merging of bibliographic entries
Two constructs implement conflict detection and resolution
–dependency checks (application defined)
–merge procedures

Bayou: Write Operation
Dependency check is a DB query
–passes if the query returns the expected result
Failed dependency checks invoke a merge procedure
–results in a resolved update (sketch below)
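A minimal Python sketch of this write structure (all names here are illustrative; real Bayou expresses dependency checks as queries over the replicated database and merge procedures in a restricted interpreted language):

```python
# Hypothetical sketch of a Bayou-style write: the dependency check is a
# query plus its expected result; a mismatch signals a conflict, and the
# application-supplied merge procedure produces a resolved update.
def bayou_write(db, update, dependency_check, merge_proc):
    query, expected = dependency_check
    if db.query(query) == expected:
        db.apply(update)             # no conflict: apply the write as-is
    else:
        db.apply(merge_proc(db))     # conflict: apply the resolved update

# Toy in-memory "database" of room bookings, just to make the sketch run.
class ToyDB:
    def __init__(self):
        self.bookings = []
    def query(self, pred):
        return [b for b in self.bookings if pred(b)]
    def apply(self, update):
        self.bookings.append(update)

# Example: reserve room A at 10:00; the merge procedure falls back to 11:00.
db = ToyDB()
bayou_write(db,
            update={"room": "A", "slot": "10:00", "owner": "jen"},
            dependency_check=(lambda b: b["room"] == "A" and b["slot"] == "10:00", []),
            merge_proc=lambda db: {"room": "A", "slot": "11:00", "owner": "jen"})
```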

Bayou: Write Example (figure)

Bayou: Anti-Entropy Merging
To merge a set of tentative writes with another site (sketch below)
–perform the tentative writes at the new site
–for writes that conflict, use the resolution procedure defined as part of the write
–roll back the log as necessary to undo tentative writes
Update ordering
–each server defines its own update order
–when merging two sites, define an update order over both servers
–transitivity gives a global ordering over all sites
Vector clocks
–for k replicas, each server maintains a k-element vector clock
–a list of the applied, forgotten, and tentative updates at each server
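A hedged sketch of one anti-entropy session, assuming each write carries a global ID ordered as (timestamp, server id) and that tentative writes can be undone and re-applied; the Replica class and its fields are assumptions for illustration, not Bayou's actual interfaces:

```python
# Illustrative replica: a log of writes kept sorted by global ID.
class Replica:
    def __init__(self, server_id):
        self.server_id = server_id
        self.log = []                 # writes, each a dict with an "id" key
        self.seen = set()             # global IDs already applied here

    def apply(self, write):
        self.log.append(write)
        self.seen.add(write["id"])
        self.log.sort(key=lambda w: w["id"])

def anti_entropy(receiver, sender):
    """One pair-wise session: the receiver learns the sender's writes."""
    missing = [w for w in sender.log if w["id"] not in receiver.seen]
    if not missing:
        return
    # Roll back tentative writes that must be reordered behind the new ones.
    cutoff = min(w["id"] for w in missing)
    keep = [w for w in receiver.log if w["id"] < cutoff]
    rolled_back = [w for w in receiver.log if w["id"] >= cutoff]
    receiver.log = keep
    # Re-apply everything in the global order; in the real system,
    # conflicting writes would invoke their merge procedures here.
    for w in sorted(missing + rolled_back, key=lambda w: w["id"]):
        receiver.apply(w)
```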

Bayou: Database Structure (figure)

Bayou: Timestamp Vectors
O vector – omitted and committed writes, no longer in the log
C vector – committed writes, known to be stable
F vector – full state, tentative writes (sketch below)
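A rough sketch of how these three vectors might sit in per-server state (field names are illustrative); componentwise, O ≤ C ≤ F always holds:

```python
# Per-server timestamp vectors for a system of n replicas.
class ServerVectors:
    def __init__(self, n):
        self.O = [0] * n  # omitted: committed writes truncated from the log
        self.C = [0] * n  # committed writes, still present in the log
        self.F = [0] * n  # full state: committed plus tentative writes

    def tentative_range(self, i):
        """Writes from server i that are tentative here: (C[i], F[i]]."""
        return self.C[i], self.F[i]
```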

Bayou: DB Views
In-memory – full view, including all tentative writes
–tentative writes are stable in the log
On disk – only committed writes

Bayou: In Conclusion
Non-transparency
Application-specific resolvers achieve automation
Tentative and stable resolutions
Partial and multi-object updates
–sessions, which we did not talk about
Impressively rich and available storage for applications that can tolerate tentative updates
–writes may change long after they have been performed

Porcupine: Goals
Scalable mail server
–“dynamic load balancing, automatic configuration, and graceful degradation in the presence of failures.”
–“Key to the system’s manageability, availability, and performance is that sessions, data, and underlying services are distributed homogeneously and dynamically across nodes in a cluster.”
Tradeoffs between manageability, scalability, and performance

Porcupine: Requirements
Management
–self-configuring, self-healing: no runtime interaction
–the management task is to add/remove resources (disks, computers)
–resources serve in different roles over time, transparently
Availability
–service to all users at all times
Performance
–single-node performance competitive with other single-node systems
–scale linearly to thousands of machines

Porcupine: Requirements (figure: table relating each central goal to a system requirement and a method of achievement)

Porcupine: What’s What
Functional homogeneity: any node can perform any function
–increases availability: a single node can run the whole system, with no independent failure of different functions
–manageability: all nodes are identical in software and configuration

Porcupine: What’s What
Automatic reconfiguration
–no management tasks beyond installing software

Porcupine: What’s What
Replication
–availability: failed sites do not make data unavailable
–performance: updates can go to the closest replica, the least-loaded replica, or several replicas in parallel
–replication performance is predicated on weak consistency

Porcupine: What’s What
Dynamic transaction scheduling: dynamic distribution of load to less busy machines (sketch below)
–no configuration needed for load balance
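A hedged sketch of the idea: bound how widely a user's mail spreads, then pick the least-loaded candidate. The spread parameter, function names, and exact policy are illustrative assumptions, not Porcupine's precise algorithm:

```python
import random

def pick_delivery_node(fragment_nodes, all_nodes, load, spread=4):
    """Choose a node to store a new message for a user: prefer nodes that
    already hold fragments of the user's mailbox (at most 'spread' of them),
    otherwise sample fresh candidates; take the least loaded either way."""
    if fragment_nodes:
        candidates = fragment_nodes[:spread]
    else:
        candidates = random.sample(all_nodes, min(spread, len(all_nodes)))
    return min(candidates, key=lambda n: load[n])

# Example: the user's fragments live on nodes 2 and 5; prints 2, the
# less-loaded of the two fragment nodes.
print(pick_delivery_node([2, 5], list(range(8)), load={n: n for n in range(8)}))
```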

Porcupine: Uses
Why mail? (Porcupine can also be configured as a Web or Usenet server)
–need: single corporations handle more than 10^8 messages per day; the goal is to scale to 10^9 messages per day
–write-intensive: Web services have been shown to be highly scalable, so pick a more interesting workload
–consistency: requirements for consistency are weak enough to justify extensive replication

Porcupine: Data Structures (figure)

Porcupine: Data Structures
Mailbox fragment: a portion of some user’s mail
–a mailbox consists of the union of all replicas of all fragments for a user
Fragment list: list of all nodes that contain fragments for a user
–soft state, not persistent or recoverable

Porcupine: Data Structures
User profile database
–client population: user names, passwords, profiles, etc.
–hard (persistent) state; changes infrequently
User profile
–soft-state version of the database, used for updates to the user’s profile
–kept at one node in the system

Porcupine: Data Structures
User map (sketch below)
–maps a user to the node managing that user’s soft state and fragment list
–replicated at each node
–implemented as a hash index
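A minimal sketch of such a user map (bucket count, hash function, and node naming are illustrative assumptions): a small table, identical on every node, indexed by a hash of the user name:

```python
import hashlib

USER_MAP_BUCKETS = 256
# Each bucket names the node managing that bucket's users; here an
# illustrative 8-node cluster with buckets assigned round-robin.
user_map = ["node-%d" % (i % 8) for i in range(USER_MAP_BUCKETS)]

def manager_for(user):
    """Return the node managing this user's soft state and fragment list."""
    bucket = int(hashlib.md5(user.encode()).hexdigest(), 16) % USER_MAP_BUCKETS
    return user_map[bucket]
```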

Porcupine: Replication Tradeoff
Pluses: replication allows for
–dynamic load balancing
–availability when nodes fail
Minuses: replication detracts from
–delivery and retrieval: more complex, longer paths
–performance: lower than a statically load-balanced system
Replication ethos
–as wide as necessary, no wider

Porcupine: Control Flow (write/send) (figure)

Porcupine: Control Flow (read/IMAP/POP) (figure)

Porcupine: Replication Approach
Eventual consistency
Update anywhere
Total update
–changes to an object modify the entire object, invalidating the previous copy
–reasonable for mail; simplifies the system
Lock free
–a side effect of update anywhere
Ordering by loosely synchronized clocks
–not vector-based clocks
The system is less sophisticated and flexible than Bayou (sketch below)
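A sketch of total update ordered by loosely synchronized clocks (names are illustrative; the real protocol also pushes each update to the other replicas and retries until all acknowledge): every update replaces the whole object and carries a (clock, node id) stamp, and a replica simply keeps the version with the larger stamp:

```python
import time

class ReplicatedObject:
    def __init__(self, node_id):
        self.node_id = node_id
        self.stamp = (0.0, node_id)   # (loosely synchronized clock, node id)
        self.value = None

    def local_update(self, value):
        """Total update: replace the whole object, stamped with our clock."""
        self.stamp = (time.time(), self.node_id)
        self.value = value
        return self.stamp, value       # to be pushed to the other replicas

    def remote_update(self, stamp, value):
        """Apply a peer's update only if it is newer; no locks needed."""
        if stamp > self.stamp:
            self.stamp, self.value = stamp, value
```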

Porcupine: Scaling (figure)
Replication trades performance for availability

Porcupine: Handling Skew (figure)
Dynamic load balancing helps deal with workload skew
–SX: static distribution on X nodes
–DX: dynamic distribution on X nodes
–SM: sendmail and POP
–R: random placement (unrealistic)

Porcupine: Handling Skew (figure)
Replication eases recovery from failures