RAMCloud Overview John Ousterhout Stanford University.

Slides:



Advertisements
Similar presentations
Fast Crash Recovery in RAMCloud
Advertisements

Ivan Pleština Amazon Simple Storage Service (S3) Amazon Elastic Block Storage (EBS) Amazon Elastic Compute Cloud (EC2)
John Ousterhout Stanford University RAMCloud Overview and Update SEDCL Retreat June, 2014.
RAMCloud: Scalable High-Performance Storage Entirely in DRAM John Ousterhout Stanford University (with Nandu Jayakumar, Diego Ongaro, Mendel Rosenblum,
RAMCloud Scalable High-Performance Storage Entirely in DRAM John Ousterhout, Parag Agrawal, David Erickson, Christos Kozyrakis, Jacob Leverich, David Mazières,
RAMCloud 1.0 John Ousterhout Stanford University (with Arjun Gopalan, Ashish Gupta, Ankita Kejriwal, Collin Lee, Behnam Montazeri, Diego Ongaro, Seo Jin.
Log-Structured Memory for DRAM-Based Storage Stephen Rumble, Ankita Kejriwal, and John Ousterhout Stanford University.
SEDCL: Stanford Experimental Data Center Laboratory.
CS 140 Lecture Notes: Technology and Operating SystemsSlide 1 Technology Changes Mid-1980’s2009Change CPU speed15 MHz2 GHz133x Memory size8 MB4 GB500x.
Google Bigtable A Distributed Storage System for Structured Data Hadi Salimi, Distributed Systems Laboratory, School of Computer Engineering, Iran University.
NoSQL and NewSQL Justin DeBrabant CIS Advanced Systems - Fall 2013.
Nikolay Tomitov Technical Trainer SoftAcad.bg.  What are Amazon Web services (AWS) ?  What’s cool when developing with AWS ?  Architecture of AWS 
CS 142 Lecture Notes: Large-Scale Web ApplicationsSlide 1 RAMCloud Overview ● Storage for datacenters ● commodity servers ● GB DRAM/server.
CMU SCS Carnegie Mellon Univ. Dept. of Computer Science /615 - DB Applications C. Faloutsos – A. Pavlo How to Scale a Database System.
Gordon: Using Flash Memory to Build Fast, Power-efficient Clusters for Data-intensive Applications A. Caulfield, L. Grupp, S. Swanson, UCSD, ASPLOS’09.
Google AppEngine. Google App Engine enables you to build and host web apps on the same systems that power Google applications. App Engine offers fast.
CSE 486/586, Spring 2012 CSE 486/586 Distributed Systems Distributed File Systems Steve Ko Computer Sciences and Engineering University at Buffalo.
RAMCloud Design Review Recovery Ryan Stutsman April 1,
A Compilation of RAMCloud Slides (through Oct. 2012) John Ousterhout Stanford University.
Cloud Computing for the Enterprise November 18th, This work is licensed under a Creative Commons.
1 The Google File System Reporter: You-Wei Zhang.
RAMCloud: a Low-Latency Datacenter Storage System John Ousterhout Stanford University
RAMCloud: Concept and Challenges John Ousterhout Stanford University.
RAMCloud: A Low-Latency Datacenter Storage System Ankita Kejriwal Stanford University (Joint work with Diego Ongaro, Ryan Stutsman, Steve Rumble, Mendel.
Cool ideas from RAMCloud Diego Ongaro Stanford University Joint work with Asaf Cidon, Ankita Kejriwal, John Ousterhout, Mendel Rosenblum, Stephen Rumble,
John Ousterhout Stanford University RAMCloud Overview and Update SEDCL Forum January, 2015.
Distributed Indexing of Web Scale Datasets for the Cloud {ikons, eangelou, Computing Systems Laboratory School of Electrical.
Amazon Web Services BY, RAJESH KANDEPU. Introduction  Amazon Web Services is a collection of remote computing services that together make up a cloud.
Windows Azure Conference 2014 Deploy your Java workloads on Windows Azure.
Webscale Computing Mike Culver Amazon Web Services.
1 Moshe Shadmon ScaleDB Scaling MySQL in the Cloud.
Low-Latency Datacenters John Ousterhout Platform Lab Retreat May 29, 2015.
IMDGs An essential part of your architecture. About me
© Copyright 2012 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice. LogKV: Exploiting Key-Value.
Bigtable: A Distributed Storage System for Structured Data 1.
Durability and Crash Recovery for Distributed In-Memory Storage Ryan Stutsman, Asaf Cidon, Ankita Kejriwal, Ali Mashtizadeh, Aravind Narayanan, Diego Ongaro,
RAMCloud: Low-latency DRAM-based storage Jonathan Ellithorpe, Arjun Gopalan, Ashish Gupta, Ankita Kejriwal, Collin Lee, Behnam Montazeri, Diego Ongaro,
RAMCloud: Scalable High-Performance Storage Entirely in DRAM John Ousterhout Stanford University (with Christos Kozyrakis, David Mazières, Aravind Narayanan,
Log-structured Memory for DRAM-based Storage Stephen Rumble, John Ousterhout Center for Future Architectures Research Storage3.2: Architectures.
CS 140 Lecture Notes: Technology and Operating Systems Slide 1 Technology Changes Mid-1980’s2012Change CPU speed15 MHz2.5 GHz167x Memory size8 MB4 GB500x.
Fast Crash Recovery in RAMCloud. Motivation The role of DRAM has been increasing – Facebook used 150TB of DRAM For 200TB of disk storage However, there.
Distributed Information Systems. Motivation ● To understand the problems that Web services try to solve it is helpful to understand how distributed information.
11.1Database System Concepts. 11.2Database System Concepts Now Something Different 1st part of the course: Application Oriented 2nd part of the course:
Experiences With RAMCloud Applications John Ousterhout, Jonathan Ellithorpe, Bob Brown.
RAMCloud Overview and Status John Ousterhout Stanford University.
Implementing Linearizability at Large Scale and Low Latency Collin Lee, Seo Jin Park, Ankita Kejriwal, † Satoshi Matsushita, John Ousterhout Platform Lab.
20409A 7: Installing and Configuring System Center 2012 R2 Virtual Machine Manager Module 7 Installing and Configuring System Center 2012 R2 Virtual.
Dynamo: Amazon’s Highly Available Key-value Store DAAS – Database as a service.
Web Technologies Lecture 13 Introduction to cloud computing.
CS 540 Database Management Systems
Bigtable : A Distributed Storage System for Structured Data Fay Chang, Jeffrey Dean, Sanjay Ghemawat, Wilson C. Hsieh, Deborah A. Wallach Mike Burrows,
Bigtable: A Distributed Storage System for Structured Data
CSci8211: Distributed Systems: RAMCloud 1 Distributed Shared Memory/Storage Case Study: RAMCloud Developed by Stanford Platform Lab  Key Idea: Scalable.
John Ousterhout Stanford University RAMCloud Overview and Update SEDCL Retreat June, 2013.
Technology Drill Down: Windows Azure Platform Eric Nelson | ISV Application Architect | Microsoft UK |
CS 140 Lecture Notes: Technology and Operating Systems Slide 1 Technology Changes Mid-1980’s2012Change CPU speed15 MHz2.5 GHz167x Memory size8 MB4 GB500x.
CS422 Principles of Database Systems Introduction to NoSQL Chengyu Sun California State University, Los Angeles.
IT-DSS Alberto Pace2 ? Detecting particles (experiments) Accelerating particle beams Large-scale computing (Analysis) Discovery We are here The mission.
RAMCloud and the Low-Latency Datacenter John Ousterhout Stanford Platform Laboratory.
Cofax Scalability Document Version Scaling Cofax in General The scalability of Cofax is directly related to the system software, hardware and network.
St. Petersburg, 2016 Openstack Disk Storage vs Amazon Disk Storage Computing Clusters, Grids and Cloud Erasmus Mundus Master Program in PERCCOM Author:
Log-Structured Memory for DRAM-Based Storage Stephen Rumble and John Ousterhout Stanford University.
Noah Treuhaft UC Berkeley ROC Group ROC Retreat, January 2002
Operating Systems ECE344 Lecture 11: SSD Ding Yuan
20409A 7: Installing and Configuring System Center 2012 R2 Virtual Machine Manager Module 7 Installing and Configuring System Center 2012 R2 Virtual.
CS 140 Lecture Notes: Technology and Operating Systems
RAMCloud Architecture
RAMCloud Architecture
CS 140 Lecture Notes: Technology and Operating Systems
AWS Cloud Computing Masaki.
Presentation transcript:

RAMCloud Overview John Ousterhout Stanford University

April 1, 2010RAMCloud OverviewSlide 2 Introduction ● Large-scale storage system entirely in DRAM ● Interesting combination: scale, low latency ● Enable new applications? ● The future of datacenter storage?

April 1, 2010RAMCloud OverviewSlide 3 Outline ● Overview of RAMCloud ● Motivation ● Research challenges ● Basic cluster structure and data model

April 1, 2010RAMCloud OverviewSlide 4 The Basic Idea ● Storage for datacenters ● commodity servers ● GB DRAM/server ● All data always in RAM ● Durable and available ● Performance goals:  High throughput: 1M ops/sec/server  Low-latency access: 5-10µs RPC Application Servers Storage Servers Datacenter

April 1, 2010RAMCloud OverviewSlide 5 Example Configurations Today5-10 years # servers GB/server24GB256GB Total capacity48TB1PB Total server cost$3.1M$6M $/GB$65$6

April 1, 2010RAMCloud OverviewSlide 6 RAMCloud Motivation: Latency ● Large-scale apps struggle with high latency  Facebook: can only make internal requests per page UI App. Logic Data Structures Traditional Application UI Bus. Logic Application Servers Storage Servers Web Application << 1µs latency0.5-10ms latency Single machine Datacenter

April 1, 2010RAMCloud OverviewSlide 7 Computation Power Dimensions of Scalability Storage Capacity Data Access Rate

April 1, 2010RAMCloud OverviewSlide 8 Computation Power Dimensions of Scalability Storage Capacity Data Access Rate Not provided by today's infrastructure

April 1, 2010RAMCloud OverviewSlide 9 MapReduce Sequential data access → high data access rate  Not all applications fit this model Computation Data

April 1, 2010RAMCloud OverviewSlide 10 RAMCloud Motivation: Latency ● RAMCloud goal: large scale and low latency ● Enable a new breed of information-intensive applications UI App. Logic Data Structures Traditional Application UI Bus. Logic Application Servers Storage Servers Web Application << 1µs latency ms latency Single machine Datacenter 5-10µs

April 1, 2010RAMCloud OverviewSlide 11 RAMCloud Motivation: Scalability ● Relational databases don’t scale ● Every large-scale Web application has problems:  Facebook: 4000 MySQL instances memcached servers ● Major system redesign for every 10x increase in scale ● New forms of storage appearing:  Bigtable  Dynamo  PNUTS  Sinfonia  H-store  memcached

April 1, 2010RAMCloud OverviewSlide 12 RAMCloud Motivation: Technology Disk access rate not keeping up with capacity: ● Disks must become more archival ● More information must move to memory Mid-1980’s2009Change Disk capacity30 MB500 GB16667x Max. transfer rate2 MB/s100 MB/s50x Latency (seek & rotate)20 ms10 ms2x Capacity/bandwidth (large blocks) 15 s5000 s333x Capacity/bandwidth (1KB blocks) 600 s58 days8333x Jim Gray's rule5 min30 hrs360x

April 1, 2010RAMCloud OverviewSlide 13 Why Not a Caching Approach? ● Lost performance:  1% misses → 10x performance degradation  Hard to approach 1% misses (Facebook ~ 5-7% misses) ● Won’t save much money:  Already have to keep information in memory  Example: Facebook caches ~75% of data size ● Changes disk management issues:  Optimize for reads, vs. writes & recovery

April 1, 2010RAMCloud OverviewSlide 14 Why not Flash Memory? ● Many candidate technologies besides DRAM  Flash (NAND, NOR)  PC RAM  … ● DRAM enables lowest latency today:  5-10x faster than flash ● Most RAMCloud techniques will apply to other technologies

April 1, 2010RAMCloud OverviewSlide 15 Is RAMCloud Capacity Sufficient? ● Facebook: 200 TB of (non-image) data in 2009 ● Amazon: Revenues/year:$16B Orders/year:400M? ($40/order?) Bytes/order: ? Order data/year: TB? RAMCloud cost:$26-260K? ● United Airlines: Total flights/day:4000? (30,000 for all airlines in U.S.) Passenger flights/year:200M? Bytes/passenger-flight: ? Order data/year: TB? RAMCloud cost:$13-130K? ● Ready today for almost all online data; media soon

April 1, 2010RAMCloud OverviewSlide 16 RAMCloud Research Issues ● Data durability/availability ● Fast RPCs ● Data model, concurrency/consistency model ● Data distribution, scaling ● Automated management ● Multi-tenancy ● Client-server functional distribution ● Node architecture

April 1, 2010RAMCloud OverviewSlide 17 RAMCloud Cluster Structure Modified Linux MasterBackup Servers Clients (App Servers) OS/VMM RAMCloud Library Application UntrustedTrusted Coordinator

April 1, 2010RAMCloud OverviewSlide 18 Client Library vs. Server ● Move functionality to library?  Flexibility: enable different implementations  Throughput: offload servers  May improve performance (e.g., aggregation) ● Concentrate functionality in servers?  May improve performance (e.g., faster synchronization)  Can't depend on proper client behavior: ● Security/access control ● Consistency/crash recovery Application RAMCloud Library RAMCloud Server Application Custom Library

April 1, 2010RAMCloud OverviewSlide 19 Data Model Rationale How to get best application-level performance? Lower-level APIs Less server functionality Higher-level APIs More server functionality Key-value store Distributed shared memory : Server implementation easy Low-level performance good  APIs not convenient for applications  Lose performance in application-level synchronization Distributed shared memory : Server implementation easy Low-level performance good  APIs not convenient for applications  Lose performance in application-level synchronization Relational database : Powerful facilities for apps Best RDBMS performance  Simple cases pay RDBMS performance  More complexity in servers Relational database : Powerful facilities for apps Best RDBMS performance  Simple cases pay RDBMS performance  More complexity in servers

April 1, 2010RAMCloud OverviewSlide 20 Data Model Basics ● Workspace:  All data for one or more apps  Unit of access control ● Table:  Related collection of objects ● Object:  Variable-length up to 1MB  Contents opaque to servers ● Id:  64 bits, unique within table  Chosen explicitly by client or implicitly by server (0,1,2,...) ● Version:  64 bits  Guaranteed increasing, even across deletes Workspace Table idobject idobject idobject idobject idobject idobject idobject idobject idobject Table vers

April 1, 2010RAMCloud OverviewSlide 21 Basic Operations get(tableId, objId)→ (blob, version) put(tableId, blob)→ (objId, version) put(tableId, objId, blob)→ (version) delete(tableId, objId) Other facilities (discussed in later talks) ● Conditional updates ● Mini-transactions ● Indexes

April 1, 2010RAMCloud OverviewSlide 22 Other Design Goals ● Data distributed automatically by RAMCloud:  Tables can be split across multiple servers  Indexes can be split across multiple servers  Distribution transparent to applications ● Multi-tenancy for cloud computing:  Support multiple (potentially hostile) applications  Cost proportional to application size

April 1, 2010RAMCloud OverviewSlide 23 Conclusion ● Interesting combination of scale and latency ● Enable more powerful uses of information at scale:  clients  100TB - 1PB  5-10 µs latency

April 1, 2010RAMCloud OverviewSlide 24 Questions/Comments