Introduction to cloud computing Jiaheng Lu Department of Computer Science Renmin University of China www.jiahenglu.net.


Advanced MapReduce Application Reference: Jimmy Lin ud-2008-Fall/schedule.html

Managing Dependencies Remember: Mappers run in isolation You have no idea in what order the mappers run You have no idea on what node the mappers run You have no idea when each mapper finishes Tools for synchronization: Ability to hold state in reducer across multiple key-value pairs Sorting function for keys Partitioner Cleverly-constructed data structures

Motivating Example Term co-occurrence matrix for a text collection M = N x N matrix (N = vocabulary size) M_ij: number of times i and j co-occur in some context (for concreteness, let’s say context = sentence) Why? Distributional profiles as a way of measuring semantic distance Semantic distance useful for many language processing tasks, e.g., Mohammad and Hirst (EMNLP, 2006)

MapReduce: Large Counting Problems Term co-occurrence matrix for a text collection = specific instance of a large counting problem A large event space (number of terms) A large number of events (the collection itself) Goal: keep track of interesting statistics about the events Basic approach Mappers generate partial counts Reducers aggregate partial counts

First Try: “Pairs” Each mapper takes a sentence: Generate all co-occurring term pairs For all pairs, emit (a, b) → count Reducers sum up counts associated with these pairs Use combiners!
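The "pairs" approach can be sketched in plain Python, without Hadoop. This is a minimal single-process simulation, assuming whitespace tokenization and counting each distinct term once per sentence; the function names `pairs_map`, `pairs_reduce`, and `run_pairs` are illustrative and not part of any Hadoop API.

```python
from collections import Counter
from itertools import permutations


def pairs_map(sentence):
    """Mapper: emit ((a, b), 1) for every ordered pair of distinct terms."""
    # Assumption: context = sentence, and each distinct term counts once.
    terms = sorted(set(sentence.lower().split()))
    for a, b in permutations(terms, 2):
        yield (a, b), 1


def pairs_reduce(emitted):
    """Reducer: sum the partial counts that share the same (a, b) key."""
    counts = Counter()
    for key, value in emitted:
        counts[key] += value
    return counts


def run_pairs(sentences):
    """Driver: feed every mapper output into the reducer."""
    emitted = (kv for s in sentences for kv in pairs_map(s))
    return pairs_reduce(emitted)


if __name__ == "__main__":
    for pair, count in sorted(run_pairs(["a b c", "a b"]).items()):
        print(pair, count)
```

Because addition is associative and commutative, the same summing logic could also serve as a combiner on the mapper side.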

“Pairs” Analysis Advantages Easy to implement, easy to understand Disadvantages Lots of pairs to sort and shuffle around (upper bound?)

Another Try: “Stripes” Idea: group together pairs into an associative array. The pairs (a, b) → 1, (a, c) → 2, (a, d) → 5, (a, e) → 3, (a, f) → 2 become a → { b: 1, c: 2, d: 5, e: 3, f: 2 }. Each mapper takes a sentence: Generate all co-occurring term pairs and emit one associative array per term.

Another Try: “Stripes” Reducers perform element-wise sum of associative arrays: a → { b: 1, d: 5, e: 3 } + a → { b: 1, c: 2, d: 2, f: 2 } = a → { b: 2, c: 2, d: 7, e: 3, f: 2 }
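The "stripes" approach can be sketched the same way. This is a minimal single-process simulation, assuming whitespace tokenization and one count per distinct term per sentence; `stripes_map`, `stripes_reduce`, and `run_stripes` are illustrative names, and `collections.Counter` stands in for the associative array.

```python
from collections import Counter, defaultdict


def stripes_map(sentence):
    """Mapper: for each term a, emit one stripe a -> {b: count, ...}."""
    terms = sorted(set(sentence.lower().split()))
    for a in terms:
        stripe = Counter(b for b in terms if b != a)
        if stripe:  # skip terms that co-occur with nothing
            yield a, stripe


def stripes_reduce(emitted):
    """Reducer: element-wise sum of all stripes that share a key."""
    totals = defaultdict(Counter)
    for term, stripe in emitted:
        totals[term].update(stripe)  # Counter.update adds counts
    return totals


def run_stripes(sentences):
    """Driver: feed every mapper output into the reducer."""
    emitted = (kv for s in sentences for kv in stripes_map(s))
    return stripes_reduce(emitted)


if __name__ == "__main__":
    # The two stripes from the slide, summed element-wise.
    summed = stripes_reduce([
        ("a", Counter(b=1, d=5, e=3)),
        ("a", Counter(b=1, c=2, d=2, f=2)),
    ])
    print(dict(summed["a"]))
```

Each mapper emits one key-value pair per term instead of one per pair, which is why far fewer records go through sort and shuffle.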

“Stripes” Analysis Advantages Far less sorting and shuffling of key-value pairs Can make better use of combiners Disadvantages More difficult to implement Underlying object is more heavyweight Fundamental limitation in terms of size of event space

Cluster size: 38 cores Data Source: Associated Press Worldstream (APW) of the English Gigaword Corpus (v3), which contains 2.27 million documents (1.8 GB compressed, 5.7 GB uncompressed)

Conditional Probabilities How do we compute conditional probabilities from counts? Why do we want to do this? How do we do this with MapReduce?

P(B|A): “Pairs”
(a, *) → 32 (Reducer holds this value in memory)
(a, b_1) → 3 becomes (a, b_1) → 3 / 32
(a, b_2) → 12 becomes (a, b_2) → 12 / 32
(a, b_3) → 7 becomes (a, b_3) → 7 / 32
(a, b_4) → 1 becomes (a, b_4) → 1 / 32
…
For this to work: Must emit extra (a, *) for every b_n in mapper. Must make sure all a’s get sent to same reducer (use Partitioner). Must make sure (a, *) comes first (define sort order).
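The three requirements above (extra (a, *) emissions, a partitioner on the left term, and a sort order that puts (a, *) first) can be sketched as a small simulation. This is an illustration only, not Hadoop code: the partitioner uses `zlib.crc32` as a stand-in hash, the marker `"*"` is assumed never to appear as a real term, and all function names are hypothetical.

```python
import zlib
from itertools import groupby

MARGINAL = "*"  # assumption: never occurs as a real term


def map_with_marginals(pair_counts):
    """Mapper side: for every (a, b) count also emit a matching (a, *)."""
    for (a, b), n in pair_counts.items():
        yield (a, b), n
        yield (a, MARGINAL), n


def partition(key, num_reducers):
    """Partitioner: route on the left term only, so all a's meet."""
    a, _ = key
    return zlib.crc32(a.encode("utf-8")) % num_reducers


def sort_key(item):
    """Sort order: group by a, with (a, *) ahead of every (a, b)."""
    (a, b), _ = item
    return (a, 0 if b == MARGINAL else 1, b)


def reduce_partition(sorted_items):
    """Reducer: hold the (a, *) total in memory, then divide each count."""
    probs = {}
    marginal = None
    for (a, b), group in groupby(sorted_items, key=lambda kv: kv[0]):
        total = sum(n for _, n in group)
        if b == MARGINAL:
            marginal = total  # arrives first thanks to sort_key
        else:
            probs[(a, b)] = total / marginal
    return probs


def conditional_probabilities(pair_counts, num_reducers=2):
    """Driver: partition, sort each partition, reduce each partition."""
    buckets = [[] for _ in range(num_reducers)]
    for key, n in map_with_marginals(pair_counts):
        buckets[partition(key, num_reducers)].append((key, n))
    probs = {}
    for bucket in buckets:
        probs.update(reduce_partition(sorted(bucket, key=sort_key)))
    return probs
```

With counts 3, 12, 7, and 1 for b_1 through b_4 plus a made-up fifth count of 9 standing in for the elided entries, the marginal is 32 and (a, b_1) maps to 3 / 32, matching the slide.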

P(B|A): “Stripes” Easy! One pass to compute (a, *). Another pass to directly compute P(B|A). a → { b_1: 3, b_2: 12, b_3: 7, b_4: 1, … }
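With stripes, the two passes happen inside a single reducer call because the whole stripe for a is already in hand. A minimal sketch, assuming the stripe is an ordinary dictionary; the function name is illustrative:

```python
def stripe_to_probabilities(stripe):
    """Turn one stripe a -> {b: count} into a -> {b: P(b | a)}."""
    total = sum(stripe.values())  # first pass: the marginal (a, *)
    return {b: n / total for b, n in stripe.items()}  # second pass


if __name__ == "__main__":
    # b5 is a made-up entry standing in for the elided counts.
    print(stripe_to_probabilities({"b1": 3, "b2": 12, "b3": 7, "b4": 1, "b5": 9}))
```

No special sort order or partitioner is needed here, at the cost of holding the full stripe in memory.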

Synchronization in Hadoop Approach 1: turn synchronization into an ordering problem Sort keys into correct order of computation Partition key space so that each reducer gets the appropriate set of partial results Hold state in reducer across multiple key-value pairs to perform computation Approach 2: construct data structures that “bring the pieces together” Each reducer receives all the data it needs to complete the computation

Issues and Tradeoffs Number of key-value pairs Object creation overhead Time for sorting and shuffling pairs across the network Size of each key-value pair De/serialization overhead Combiners make a big difference! RAM vs. disk and network Arrange data to maximize opportunities to aggregate partial results

Data Types in Hadoop
Writable: Defines a de/serialization protocol. Every data type in Hadoop is a Writable.
WritableComparable: Defines a sort order. All keys must be of this type (but not values).
IntWritable, LongWritable, Text, …: Concrete classes for different data types.

Complex Data Types in Hadoop How do you implement complex data types? The easiest way: Encode it as Text, e.g., (a, b) = “a:b” Use regular expressions to parse and extract data The hard way: Define a custom implementation of WritableComparable Must implement: readFields, write, compareTo Computationally efficient, but slow for rapid prototyping
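The two options can be contrasted in a short Python analogy. This is not the Hadoop Java API: `TermPair` is a hypothetical class whose `write`, `read_fields`, and ordering methods only mirror the roles of `write`, `readFields`, and `compareTo`, and the length-prefixed binary layout is an assumption chosen for the sketch.

```python
import io
import re
import struct
from functools import total_ordering

PAIR_PATTERN = re.compile(r"^([^:]*):(.*)$")


def encode_text(a, b):
    """Easy way: pack the pair into one string, e.g. (a, b) -> "a:b"."""
    return f"{a}:{b}"


def decode_text(text):
    """Parse "a:b" back into a pair (breaks if a itself contains ':')."""
    match = PAIR_PATTERN.match(text)
    if match is None:
        raise ValueError(f"not a pair: {text!r}")
    return match.group(1), match.group(2)


@total_ordering
class TermPair:
    """Hard way: a custom type with its own binary form and sort order."""

    def __init__(self, left="", right=""):
        self.left, self.right = left, right

    def write(self, out):
        """Serialize as two length-prefixed UTF-8 strings."""
        for field in (self.left, self.right):
            data = field.encode("utf-8")
            out.write(struct.pack(">I", len(data)))
            out.write(data)

    def read_fields(self, inp):
        """Deserialize in the same order that write() used."""
        fields = []
        for _ in range(2):
            (size,) = struct.unpack(">I", inp.read(4))
            fields.append(inp.read(size).decode("utf-8"))
        self.left, self.right = fields

    def __eq__(self, other):
        return (self.left, self.right) == (other.left, other.right)

    def __lt__(self, other):  # plays the role of compareTo
        return (self.left, self.right) < (other.left, other.right)


if __name__ == "__main__":
    buffer = io.BytesIO()
    TermPair("new", "york").write(buffer)
    buffer.seek(0)
    copy = TermPair()
    copy.read_fields(buffer)
    print(copy.left, copy.right, decode_text(encode_text("new", "york")))
```

The text form is quick to write but needs parsing on every read; the custom type avoids parsing and gives explicit control over key ordering.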

Yahoo! PNUTS and Hadoop

Search Results of the Future [Figure: babycenter, epicurious, yelp.com, answers.com, LinkedIn, webmd, Gawker, New York Times]

What’s in the Horizontal Cloud? Simple Web Service APIs. Horizontal Cloud Services: Edge Content Services (e.g., YCS, YCPI); Provisioning & Virtualization (e.g., EC2); Batch Storage & Processing (e.g., Hadoop & Pig); Operational Storage (e.g., S3, MObStor, Sherpa); Other Services (Messaging, Workflow, virtual DBs & Webserving). Shared Infrastructure; ID & Account Management; Monitoring & QoS; Metering, Billing, Accounting; Security. Common Approaches to QA, Production Engineering, Performance Engineering, Datacenter Management, and Optimization.

Yahoo! Cloud Stack
EDGE: Horizontal Cloud Services: YCS, YCPI, Brooklyn, …
WEB: Horizontal Cloud Services: VM/OS, yApache
APP: Horizontal Cloud Services: VM/OS, …
STORAGE: Horizontal Cloud Services: Sherpa, MObStor, …
BATCH: Horizontal Cloud Services: Hadoop, …
Provisioning (Self-serve); Monitoring/Metering/Security
Also shown: Data Highway, Serving Grid, PHP, App Engine

Yahoo! CCDI Thrust Areas
Fast Provisioning and Machine Virtualization: On demand, deliver a set of hosts imaged with desired software and configured against standard services. Multiple hosts may be multiplexed onto the same physical machine.
Batch Storage and Processing: Scalable data storage optimized for batch processing, together with computational capabilities
Operational Storage: Persistent storage that supports low-latency updates and flexible retrieval
Edge Content Services: Support for dealing with network topology, communication protocols, caching, and BCP
(Rest of today’s talk)

Web Data Management
Large data analysis (Hadoop): Scan oriented workloads; Focus on sequential disk I/O; $ per cpu cycle
Structured record storage (PNUTS/Sherpa): CRUD; Point lookups and short scans; Index organized table and random I/Os; $ per latency
Blob storage (SAN/NAS): Object retrieval and streaming; Scalable file storage; $ per GB

The World Has Changed Web serving applications need: Scalability! Preferably elastic Flexible schemas Geographic distribution High availability Reliable storage Web serving applications can do without: Complicated queries Strong transactions

PNUTS / SHERPA To Help You Scale Your Mountains of Data

Yahoo! Serving Storage Problem Small records: 100KB or less. Structured records: lots of fields, evolving. Extreme data scale: tens of TB. Extreme request scale: tens of thousands of requests/sec. Low latency globally: datacenters worldwide. High availability: outages cost $millions. Variable usage patterns: as applications and users change.

The PNUTS/Sherpa Solution The next generation global-scale record store Record-orientation: Routing, data storage optimized for low-latency record access Scale out: Add machines to scale throughput (while keeping latency low) Asynchrony: Pub-sub replication to far-flung datacenters to mask propagation delay Consistency model: Reduce complexity of asynchrony for the application programmer Cloud deployment model: Hosted, managed service to reduce app time-to-market and enable on demand scale and elasticity

What is PNUTS/Sherpa? CREATE TABLE Parts ( ID VARCHAR, StockNumber INT, Status VARCHAR … ) Parallel database; Geographic replication; Structured, flexible schema; Hosted, managed infrastructure [Figure: copies of a table holding records A to F, each labeled E, W, or C]

What Will It Become? CREATE TABLE Parts ( ID VARCHAR, StockNumber INT, Status VARCHAR … ) Parallel database; Geographic replication; Indexes and views; Structured, flexible schema; Hosted, managed infrastructure [Figure: copies of a table holding records A to F, each labeled E, W, or C]

Design Goals
Scalability: Thousands of machines; Easy to add capacity; Restrict query language to avoid costly queries
Geographic replication: Asynchronous replication around the globe; Low-latency local access
High availability and fault tolerance: Automatically recover from failures; Serve reads and writes despite failures
Consistency: Per-record guarantees; Timeline model; Option to relax if needed
Multiple access paths: Hash table, ordered table; Primary, secondary access
Hosted service: Applications plug and play; Share operational cost

Technology Elements
Applications; PNUTS API; Tabular API
PNUTS: Query planning and execution; Index maintenance
Distributed infrastructure for tabular data: Data partitioning; Update consistency; Replication
YDOT FS: Ordered tables
YDHT FS: Hash tables
Tribble: Pub/sub messaging
Zookeeper: Consistency service
YCA: Authorization

Data Manipulation Per-record operations: Get, Set, Delete. Multi-record operations: Multiget, Scan, Getrange. Web service (RESTful) API

Tablets: Hash Table
[Table with columns Name, Description, Price, divided into tablets at hash values 0x0000, 0x2AF3, 0x911F, 0xFFFF]
Name: Apple, Lemon, Grape, Orange, Lime, Strawberry, Kiwi, Avocado, Tomato, Banana
Description: Grapes are good to eat; Limes are green; Apple is wisdom; Strawberry shortcake; Arrgh! Don’t get scurvy!; But at what price?; How much did you pay for this lemon?; Is this a vegetable?; New Zealand; The perfect fruit
Price: $12, $9, $1, $900, $2, $3, $1, $14, $2, $8
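Locating a record's tablet in a hash-organized table can be sketched as a binary search over the tablet boundaries. The boundary values below are the ones on the slide; the 16-bit hash built from `zlib.crc32` and the convention that a boundary value belongs to the tablet starting there are assumptions for illustration, since the slides do not state them.

```python
import bisect
import zlib

# Start of each tablet's hash interval; the hash space ends at 0xFFFF.
TABLET_STARTS = [0x0000, 0x2AF3, 0x911F]


def key_hash(key):
    """Map a record key into the 16-bit hash space (stand-in hash)."""
    return zlib.crc32(key.encode("utf-8")) & 0xFFFF


def tablet_for_hash(hash_value):
    """Index of the tablet whose interval contains hash_value."""
    return bisect.bisect_right(TABLET_STARTS, hash_value) - 1


def tablet_for_key(key):
    """Index of the tablet that stores the record with this key."""
    return tablet_for_hash(key_hash(key))


if __name__ == "__main__":
    for name in ["Apple", "Lemon", "Grape", "Orange"]:
        print(name, hex(key_hash(name)), tablet_for_key(name))
```

Hashing spreads neighboring keys across tablets, which balances load but means a range of names cannot be read from a single tablet.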

Tablets: Ordered Table
[Table with columns Name, Description, Price, divided into tablets at key boundaries A, H, Q, Z]
Name: Apple, Banana, Grape, Orange, Lime, Strawberry, Kiwi, Avocado, Tomato, Lemon
Description: Grapes are good to eat; Limes are green; Apple is wisdom; Strawberry shortcake; Arrgh! Don’t get scurvy!; But at what price?; The perfect fruit; Is this a vegetable?; How much did you pay for this lemon?; New Zealand
Price: $1, $3, $2, $12, $8, $1, $9, $2, $900, $14

Flexible Schema
Posted date | Listing id | Item | Price
6/1/ | | Couch | $570
6/1/ | | Bike | $86
6/3/ | | Car | $1123
6/5/ | | Lamp | $15
Additional fields: Color: Red; Condition: Good, Fair

Detailed Architecture [Figure: Clients, REST API, Routers, Tablet Controller, and Storage units in the local region; Tribble; Remote regions]

Tablet Splitting and Balancing Each storage unit has many tablets (horizontal partitions of the table). Tablets may grow over time. Overfull tablets split. Storage unit may become a hotspot. Shed load by moving tablets to other servers. [Figure: Storage unit, Tablet]

QUERY PROCESSING

Accessing Data [Figure: four numbered steps ending at a storage unit (SU) and back: steps 1 and 2, Get key k; steps 3 and 4, Record for key k]

Bulk Read [Figure: step 1, the key set {k1, k2, … kn} goes to the scatter/gather server; step 2, the server issues Get k1, Get k2, Get k3 to the storage units (SU)]
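The scatter/gather step can be sketched as grouping the requested keys by storage unit, fetching each group in parallel, and merging the answers. This is a simplified model, not PNUTS code: storage units are plain dictionaries, and `locate` is a hypothetical function mapping a key to its storage unit.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def multiget(keys, locate, storage_units):
    """Fetch many keys: scatter by storage unit, then gather results."""
    # Scatter: group the keys by the storage unit that holds them.
    by_unit = defaultdict(list)
    for key in keys:
        by_unit[locate(key)].append(key)

    def fetch(unit_id, unit_keys):
        store = storage_units[unit_id]
        return {key: store.get(key) for key in unit_keys}

    # Gather: run one fetch per storage unit and merge the answers.
    results = {}
    with ThreadPoolExecutor() as pool:
        futures = [pool.submit(fetch, u, ks) for u, ks in by_unit.items()]
        for future in futures:
            results.update(future.result())
    return results


if __name__ == "__main__":
    units = {1: {"Apple": "Apple is wisdom"}, 2: {"Lime": "Limes are green"}}
    print(multiget(["Apple", "Lime"], lambda k: 1 if k < "L" else 2, units))
```

Missing keys come back as `None` in this sketch; a real service would have to define its own not-found behavior.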

Range Queries in YDOT
Clustered, ordered retrieval of records
Router interval map: (start) → Storage unit 1; Canteloupe → Storage unit 3; Lime → Storage unit 2; Strawberry → Storage unit 1
Ordered keys: Apple, Avocado, Banana, Blueberry, Canteloupe, Grape, Kiwi, Lemon, Lime, Mango, Orange, Strawberry, Tomato, Watermelon
Query Grapefruit…Pear? is split into Grapefruit…Lime? and Lime…Pear?
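The router's job for an ordered table can be sketched with a sorted list of split keys. The split keys and storage unit numbers below come from the slide (with its spelling of "Canteloupe"); treating ranges as half-open and using Python string comparison for key order are assumptions, and the function names are illustrative.

```python
import bisect

# Interval map: keys below "Canteloupe" live on unit 1, keys from
# "Canteloupe" up to "Lime" on unit 3, and so on.
SPLIT_KEYS = ["Canteloupe", "Lime", "Strawberry"]
UNITS = [1, 3, 2, 1]  # always one more entry than SPLIT_KEYS


def unit_for_key(key):
    """Storage unit holding the tablet that contains key."""
    return UNITS[bisect.bisect_right(SPLIT_KEYS, key)]


def split_range(start, end):
    """Split [start, end) into (start, end, unit) pieces, one per tablet."""
    pieces = []
    low = start
    i = bisect.bisect_right(SPLIT_KEYS, start)
    while i < len(SPLIT_KEYS) and SPLIT_KEYS[i] < end:
        pieces.append((low, SPLIT_KEYS[i], UNITS[i]))
        low = SPLIT_KEYS[i]
        i += 1
    pieces.append((low, end, UNITS[i]))
    return pieces


if __name__ == "__main__":
    print(split_range("Grapefruit", "Pear"))
```

For the slide's query this yields the pieces Grapefruit to Lime on storage unit 3 and Lime to Pear on storage unit 2.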

Updates [Figure: eight numbered steps among the client, Routers, a storage unit (SU), and Message brokers: steps 1, 2, and 3, Write key k; step 5, SUCCESS; step 6, Write key k; steps 7 and 8, Sequence # for key k]

ASYNCHRONOUS REPLICATION AND CONSISTENCY

Asynchronous Replication

Consistency Model Goal: Make it easier for applications to reason about updates and cope with asynchrony. What happens to a record with primary key “Alice”? [Timeline: events Record inserted, Update, Delete; versions v. 1 through v. 8 of Generation 1] As the record is updated, copies may get out of sync.

Example: Social Alice [Figure: the (User, Status) record for Alice replicated in West and East; copies show Busy, Free, or ???; Record Timeline: ___, Busy, Free]

Consistency Model [Timeline: v. 1 through v. 8 of Generation 1; Read; Stale version and Current version marked] In general, reads are served using a local copy

Consistency Model [Timeline: v. 1 through v. 8 of Generation 1; Read up-to-date; Stale version and Current version marked] But application can request and get current version

Consistency Model [Timeline: v. 1 through v. 8 of Generation 1; Read ≥ v.6; Stale version and Current version marked] Or variations such as “read forward”: while copies may lag the master record, every copy goes through the same sequence of changes
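The three read flavors on these slides (plain read, read up-to-date, read at or above a version) can be sketched against a toy model with one local copy and one master copy. The function names are illustrative rather than the service's API, and falling back to the master when the local copy is too old is an assumption made to keep the sketch small.

```python
from dataclasses import dataclass


@dataclass
class Copy:
    """One replica's view of a record on its timeline."""
    version: int
    value: str


def read_local(local):
    """Plain read: serve whatever the local copy holds, possibly stale."""
    return local


def read_up_to_date(master):
    """Read up-to-date: always served from the current (master) version."""
    return master


def read_at_least(local, master, min_version):
    """Read >= v: the local copy is fine unless it is older than asked."""
    return local if local.version >= min_version else master


if __name__ == "__main__":
    local, master = Copy(5, "Busy"), Copy(8, "Free")
    print(read_local(local), read_at_least(local, master, 6))
```

Because every copy moves through the same sequence of versions, a reader that remembers the last version it saw can use the third form to avoid going backwards in time.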

Consistency Model [Timeline: v. 1 through v. 8 of Generation 1; Write; Stale version and Current version marked] Achieved via per-record primary copy protocol (To maximize availability, record masterships automatically transferred if site fails) Can be selectively weakened to eventual consistency (local writes that are reconciled using version vectors)

Consistency Model [Timeline: v. 1 through v. 8 of Generation 1; Write if = v.7 results in ERROR; Stale version and Current version marked] Test-and-set writes facilitate per-record transactions
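A test-and-set write can be sketched as a version check at the record's master copy. The class and method names below are illustrative, not the service's API; the example mirrors the slide, where a write conditioned on v.7 fails because the current version is v.8.

```python
class VersionMismatch(Exception):
    """Raised when the record changed since the caller last read it."""


class MasterRecord:
    """Master copy of one record: a value plus its version number."""

    def __init__(self, value, version=1):
        self.value = value
        self.version = version

    def write(self, value):
        """Unconditional write: always advances the timeline by one."""
        self.version += 1
        self.value = value
        return self.version

    def test_and_set_write(self, required_version, value):
        """Write only if the record is still at required_version."""
        if self.version != required_version:
            raise VersionMismatch(
                f"expected v.{required_version}, found v.{self.version}")
        return self.write(value)


if __name__ == "__main__":
    record = MasterRecord("Busy", version=8)
    try:
        record.test_and_set_write(7, "Free")  # stale read: rejected
    except VersionMismatch as error:
        print("ERROR:", error)
    print(record.test_and_set_write(8, "Free"))  # succeeds
```

A caller can build a per-record transaction from this by reading the record, computing the new value, and retrying the whole cycle whenever the write is rejected.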

Consistency Techniques Per-record mastering Each record is assigned a “master region” May differ between records Updates to the record forwarded to the master region Ensures consistent ordering of updates Tablet-level mastering Each tablet is assigned a “master region” Inserts and deletes of records forwarded to the master region Master region decides tablet splits These details are hidden from the application Except for the latency impact!
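Per-record mastering can be sketched with a toy multi-region model. Everything here is a simplification for illustration: the first region to write a key becomes its master, sequence numbers are per record, and `replicate` stands in for the asynchronous pub/sub delivery; none of the names come from PNUTS itself.

```python
class Region:
    """One datacenter's copy of the table: key -> (sequence, value)."""

    def __init__(self, name):
        self.name = name
        self.records = {}


class Cluster:
    """Toy model of per-record mastering across several regions."""

    def __init__(self, region_names):
        self.regions = {name: Region(name) for name in region_names}
        self.master_of = {}  # key -> name of the record's master region
        self.pending = []    # updates not yet delivered to other regions

    def write(self, at_region, key, value):
        """Apply a write at the record's master, forwarding if needed."""
        # Assumption: the first writer's region becomes the master.
        master = self.master_of.setdefault(key, at_region)
        forwarded = master != at_region
        records = self.regions[master].records
        sequence = records.get(key, (0, None))[0] + 1
        records[key] = (sequence, value)  # the master orders all updates
        self.pending.append((key, sequence, value, master))
        return sequence, forwarded

    def replicate(self):
        """Deliver pending updates, in order, to non-master regions."""
        for key, sequence, value, master in self.pending:
            for name, region in self.regions.items():
                if name != master:
                    region.records[key] = (sequence, value)
        self.pending.clear()


if __name__ == "__main__":
    cluster = Cluster(["West", "East"])
    print(cluster.write("West", "Alice", "Busy"))  # local to the master
    print(cluster.write("East", "Alice", "Free"))  # forwarded to West
    cluster.replicate()
    print(cluster.regions["East"].records["Alice"])
```

The forwarded write is where the latency impact shows up: a client far from the record's master region pays a cross-region round trip.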

Mastering [Figure: three copies of a table holding records A to F, each record labeled E, W, or C; Tablet master]

Bulk Insert/Update/Replace [Figure: Client, Source Data, Bulk manager] 1. Client feeds records to bulk manager 2. Bulk loader transfers records to SUs in batches: Bypass routers and message brokers; Efficient import into storage unit

Bulk Load in YDOT YDOT bulk inserts can cause performance hotspots Solution: preallocate tablets

Index Maintenance How to have lots of interesting indexes and views, without killing performance? Solution: Asynchrony! Indexes/views updated asynchronously when base table updated
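Asynchronous index maintenance can be sketched with a queue between the base table and a secondary index. This is a single-process illustration with hypothetical names; in a real deployment the queue would be a durable message channel and the index would live on other machines.

```python
from collections import defaultdict, deque


class TableWithAsyncIndex:
    """Base table whose secondary index is updated after the fact."""

    def __init__(self, indexed_field):
        self.field = indexed_field
        self.base = {}                 # primary key -> record
        self.index = defaultdict(set)  # field value -> primary keys
        self.queue = deque()           # index updates not yet applied

    def put(self, key, record):
        """Write the base table now; only enqueue the index change."""
        old = self.base.get(key)
        self.base[key] = record
        self.queue.append((key, old, record))

    def apply_index_updates(self):
        """Drain the queue (would run in the background in practice)."""
        while self.queue:
            key, old, new = self.queue.popleft()
            if old is not None:
                self.index[old.get(self.field)].discard(key)
            self.index[new.get(self.field)].add(key)

    def lookup(self, value):
        """Index read: may lag behind the base table."""
        return set(self.index.get(value, ()))


if __name__ == "__main__":
    listings = TableWithAsyncIndex("color")
    listings.put("1", {"item": "Couch", "color": "Red"})
    print(listings.lookup("Red"))  # still empty: index not caught up
    listings.apply_index_updates()
    print(listings.lookup("Red"))
```

Writes stay fast because they never wait on the index, and the price is that an index lookup can briefly miss a record that the base table already holds.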

SHERPA IN CONTEXT

Types of Record Stores Query expressiveness, from Simple to Feature rich: S3 (Object retrieval); PNUTS (Retrieval from single table of objects/records); Oracle (SQL)

Types of Record Stores Consistency model, from Best effort to Strong guarantees: S3 (Eventual consistency); PNUTS (Timeline consistency); Oracle (ACID). Also marked on the axis: Object-centric consistency; Program centric consistency

Types of Record Stores Data model, from Flexibility, Schema evolution to Optimized for Fixed schemas: CouchDB; PNUTS; Oracle. Also marked on the axis: Object-centric consistency; Consistency spans objects

Types of Record Stores Elasticity (ability to add resources on demand), from Inelastic to Elastic: Oracle; PNUTS; S3. Also marked on the axis: Limited (via data distribution); VLSD (Very Large Scale Distribution/Replication)

Data Stores Comparison (Versus PNUTS)
User-partitioned SQL stores (Microsoft Azure SDS, Amazon SimpleDB): More expressive queries; Users must control partitioning; Limited elasticity
Multi-tenant application databases (Salesforce.com, Oracle on Demand): Highly optimized for complex workloads; Limited flexibility to evolving applications; Inherit limitations of underlying data management system
Mutable object stores (Amazon S3): Object storage versus record management

Application Design Space [Figure: systems placed on two axes, Records versus Files and Get a few things versus Scan everything: Sherpa, MObStor, Everest, Hadoop, YMDB, MySQL, Filer, Oracle, BigTable]

Alternatives Matrix [Table: rows Sherpa, Y! UDB, MySQL, Oracle, HDFS, BigTable, Dynamo, Cassandra; columns Elastic, Operability, Global low latency, Availability, Structured access, Updates, Consistency model, SQL/ACID]

Further Reading
Adam Silberstein, Brian Cooper, Utkarsh Srivastava, Erik Vee, Ramana Yerneni, Raghu Ramakrishnan. Efficient Bulk Insertion into a Distributed Ordered Table. SIGMOD 2008.
Brian Cooper, Raghu Ramakrishnan, Utkarsh Srivastava, Adam Silberstein, Phil Bohannon, Hans-Arno Jacobsen, Nick Puz, Daniel Weaver, Ramana Yerneni. PNUTS: Yahoo!'s Hosted Data Serving Platform. VLDB 2008.
Parag Agrawal, Adam Silberstein, Brian F. Cooper, Utkarsh Srivastava, Raghu Ramakrishnan. Asynchronous View Maintenance for VLSD Databases. SIGMOD 2009 (to appear).
Brian F. Cooper, Raghu Ramakrishnan, Utkarsh Srivastava. Cloud Storage Design in a PNUTShell. In Beautiful Data, O’Reilly Media, 2009 (to appear).

QUESTIONS?