Presentation is loading. Please wait.

Presentation is loading. Please wait.

1 An Overview of Cloud Yahoo! Raghu Ramakrishnan Chief Scientist, Audience and Cloud Computing Research Fellow, Yahoo! Research Reflects many.

Similar presentations


Presentation on theme: "1 An Overview of Cloud Yahoo! Raghu Ramakrishnan Chief Scientist, Audience and Cloud Computing Research Fellow, Yahoo! Research Reflects many."— Presentation transcript:

1 1 An Overview of Cloud Computing @ Yahoo! Raghu Ramakrishnan Chief Scientist, Audience and Cloud Computing Research Fellow, Yahoo! Research Reflects many discussions with: Eric Baldeschwieler, Jay Kistler, Chuck Neerdaels, Shelton Shugar, and Raymie Stata and joint work with the Sherpa team, in particular: Brian Cooper, Utkarsh Srivastava, Adam Silberstein, Rodrigo Fonseca and Nick Puz in Y! Research Chuck Neerdaels, P.P. Suryanarayanan and many others in CCDI

2 2 Questions What is cloud computing? –Horizontal and functional services What’s it going to change? –Software business models, science, life How many clouds will there be? –1, 2, 3, infinity What’s new in cloud computing? –HPC grids, ASPs, hosted services, Multics (!) –Emerging “cloud stack” to support a broad class of programs, including data intensive applications

3 3 SCENARIOS Pie-in-the-sky

4 4 Living in the Clouds We want to start a new website, FredsList.com Our site will provide listings of items for sale, jobs, etc. As time goes on, we’ll add more features –And illustrate how more cloud capabilities (and corresponding infrastructure components) are used as needed List of capabilities/components is illustrative, not exhaustive Our cloud provides a “dataset” abstraction –FredsList doesn’t worry about the underlying components

5 5 Step 1: Listings Scenario Simple Web Service API’s Database PNUTS FredsList.com application 1234323, transportation, For sale: one bicycle, barely used FredsList wants to store listings as (key, category, description) 5523442, childcare, Nanny available in San Jose 215534, wanted, Looking for issue 1 of Superman comic book DECLARE DATASET Listings AS ( ID String PRIMARY KEY, Category String, Description Text ) DECLARE DATASET Listings AS ( ID String PRIMARY KEY, Category String, Description Text )

6 6 Step 2: System Evolution Simple Web Service API’s Database PNUTS FredsList.com application 1234323, transportation, For sale: one bicycle, barely used Fred belatedly realizes prices are useful information! 5523442, childcare, Nanny available in San Jose 215534, wanted, Looking for issue 1 of Superman comic book ALTER DATASET Listings ADD (Price Float) ALTER DATASET Listings ADD (Price Float) Schemas are flexible, and evolve 32138, camera, Nikon D40, USD 300 Not every record in a dataset has values defined for all fields declared for the dataset vs.

7 7 Step 3: Search Simple Web Service API’s Database PNUTS “bicycle” FredsList’s customers quickly ask for keyword search Search Vespa “dvd’s” “nanny” FredsList.com application ALTER Listings SET Description SEARCHABLE ALTER Listings SET Description SEARCHABLE Messaging Tribble Federation of systems offering different capabilities

8 8 Step 4: Photos Simple Web Service API’s Database PNUTS FredsList decides to add photos/videos to listings Search Vespa Storage MObStor Foreign key photo → listing FredsList.com application ALTER Listings ADD Photo BLOB ALTER Listings ADD Photo BLOB Messaging Tribble Federation of systems offering different performance points

9 9 Step 5: Data Analysis Simple Web Service API’s Database PNUTS FredsList wants to analyze its listings to get statistics about category, do geocoding, etc. Search Vespa Storage MObStor Foreign key photo → listing FredsList.com application ALTER Listings MAKE ANALYZABLE ALTER Listings MAKE ANALYZABLE Compute Grid Batch export Pig query to analyze categories Hadoop program to geocode data Hadoop program to generate fancy pages for listings Messaging Tribble

10 10 Step 6: Performance Simple Web Service API’s Database PNUTS FredsList wants to reduce its data access latency Search Vespa Messaging Tribble Storage MObStor Foreign key photo → listing FredsList.com application ALTER Listings MAKE CACHEABLE ALTER Listings MAKE CACHEABLE Compute Grid Batch export Caching memcached And by now, Fred is global, and wants geo- replication!

11 11 Data Serving vs. Analysis Very different workloads, requirements Data from serving system is one of many kinds of data (click streams are another common kind, as are syndicated feeds) to be analyzed and integrated The result of analysis often goes right back into serving system

12 12 EYES TO THE SKIES Motherhood-and-Apple-Pie

13 13 Why Clouds? On-demand infrastructure to create a fundamental shift in the OE curve: –Do things we can’t do –Build more robustly, more efficiently, more globally, more completely, more quickly, for a given budget Cloud services should do heavy lifting of heavy-lifting of scaling & high-availability –Today, this is done at the app- level, which is not productive

14 14 Requirements for Cloud Services Multitenant. A cloud service must support multiple, organizationally distant customers. Elasticity. Tenants should be able to negotiate and receive resources/QoS on-demand. Resource Sharing. Ideally, spare cloud resources should be transparently applied when a tenant’s negotiated QoS is insufficient, e.g., due to spikes. Horizontal scaling. It should be possible to add cloud capacity in small increments; this should be transparent to the tenants of the service. Metering. A cloud service must support accounting that reasonably ascribes operational and capital expenditures to each of the tenants of the service. Security. A cloud service should be secure in that tenants are not made vulnerable because of loopholes in the cloud. Availability. A cloud service should be highly available. Operability. A cloud service should be easy to operate, with few operators. Operating costs should scale linearly or better with the capacity of the service.

15 15 Types of Cloud Services Two kinds of cloud services: –Horizontal (“Platform”) Cloud Services Functionality enabling tenants to build applications or new services on top of the cloud –Functional Cloud Services Functionality that is useful in and of itself to tenants. E.g., various SaaS instances, such as Saleforce.com; Google Analytics and Yahoo!’s IndexTools; Yahoo! properties aimed at end-users and small businesses, e.g., flickr, Groups, Mail, News, Shopping Could be built on top of horizontal cloud services or from scratch Yahoo! has been offering these for a long while (e.g., Mail for SMB, Groups, Flickr, BOSS, Ad exchanges)

16 16 Opening Up Yahoo! Search Phase 1 Phase 2 Giving site owners and developers control over the appearance of Yahoo! Search results. BOSS takes Yahoo!’s open strategy to the next level by providing Yahoo! Search infrastructure and technology to developers and companies to help them build their own search experiences.

17 17 babycenter epicurious Search Results of the Future yelp.com answers.com LinkedIn webmd Gawker New York Times

18 18 BOSS Offerings API A self-service, web services model for developers and start-ups to quickly build and deploy new search experiences. BOSS offers two options for companies and developers and has partnered with top technology universities to drive search experimentation, innovation and research into next generation search. University of Illinois Urbana Champaign Carnegie Mellon University Stanford University Purdue University MIT Indian Institute of Technology Bombay University of Massachusetts CUSTOM Working with 3rd parties to build a more relevant, brand/site specific web search experience. This option is jointly built by Yahoo! and select partners. ACADEMIC Working with the following universities to allow for wide-scale research in the search field: (Slide courtesy Prabhakar Raghavan)

19 19 Partner Examples

20 20 Horizontal Cloud Services Horizontal cloud services are foundations on which tenants build applications or new services. They should be: –Semantics-free. Must be "generic infrastructure,” and not tied to specific app-logic. May provide the ability to inject application logic through well-defined APIs –Broadly applicable. Must be broadly applicable (i.e., it can't be intended for just one or two properties). –Fault-tolerant over commodity hardware. Must be built using inexpensive commodity hardware, and should mask component failures. While each cloud service provides value, the power of the cloud paradigm will depend on a collection of well-chosen, loosely coupled services that collectively make it easy to quickly develop and operate innovative web applications.

21 21 What’s in the Horizontal Cloud? Common Approaches to QA, Production Engineering, Performance Engineering, Datacenter Management, and Optimization Common Approaches to QA, Production Engineering, Performance Engineering, Datacenter Management, and Optimization ID & Account Management Monitoring & QoS Shared Infrastructure Metering, Billing, Accounting Horizontal Cloud Services Edge Content Services e.g., YCS, YCPI Provisioning & Virtualization e.g., EC2 Batch Storage & Processing e.g., Hadoop & Pig Operational Storage e.g., S3, MObStor, Sherpa Other Services Messaging, Workflow, virtual DBs & Webserving Security Simple Web Service API’s

22 22 Yahoo! Cloud Stack Provisioning (Self-serve) Horizontal Cloud Services …YCSYCPI Brooklyn EDGE Monitoring/Metering/Security Horizontal Cloud Services …Hadoop BATCH Horizontal Cloud Services …SherpaMOBStor STORAGE Horizontal Cloud Services VM/OS… APP Horizontal Cloud Services VM/OSyApache WEB Data Highway Serving Grid PHPApp Engine

23 23 Yahoo! CCDI Thrust Areas Fast Provisioning and Machine Virtualization: On demand, deliver a set of hosts imaged with desired software and configured against standard services –Multiple hosts may be multiplexed onto the same physical machine. Batch Storage and Processing: Scalable data storage optimized for batch processing, together with computational capabilities Operational Storage: Persistent storage that supports low-latency updates and flexible retrieval Edge Content Services: Support for dealing with network topology, communication protocols, caching, and BCP Rest of today’s talk

24 24 Web Data Management Large data analysis (Hadoop) Structured record storage (PNUTS/Sherpa) Blob storage (SAN/NAS) Scan oriented workloads Focus on sequential disk I/O $ per cpu cycle CRUD Point lookups and short scans Index organized table and random I/Os $ per latency Object retrieval and streaming Scalable file storage $ per GB

25 25 [Workflow] Hadoop: Batch Storage/Analysis Why is batch processing important? Whether it’s –response-prediction for advertising –machine-learned relevance for Search, or –content optimization for audience, –data-intensive computing is increasingly central to everything Yahoo! does –Hadoop is central to addressing this need Hadoop is a case-study in our cloud vision –Processes enormous amounts of data –Provides horizontal scaling and fault- tolerance for our users –Allows those users to focus on their app logic HDFS Map-Reduce High-level query layer (Pig)

26 26 The World Has Changed Web serving applications need: –Scalability! Preferably elastic –Flexible schemas –Geographic distribution –High availability –Reliable storage Web serving applications can do without: –Complicated queries –Strong transactions

27 27 MObStor Yahoo!’s next-generation globally replicated, virtualized media object storage service Better provisioning, easy migration, replication, better BCP, and performance New features (Evergreen URLs, CDN integration, REST API, …) The object metadata problem addressed using Sherpa, though MObStor is focused on blob storage.

28 28 Storage & Delivery Stack

29 29 PNUTS / SHERPA To Help You Scale Your Mountains of Data

30 30 CCDI—Research Collaboration Yahoo! Research Raghu Ramakrishnan Brian Cooper Utkarsh Srivastava Adam Silberstein Rodrigo Fonseca CCDI Chuck Neerdaels P.P.S. Narayan Kevin Athey Toby Negrin Plus Dev/QA teams

31 31 Yahoo! Serving Storage Problem –Small records – 100KB or less –Structured records – lots of fields, evolving –Extreme data scale - Tens of TB –Extreme request scale - Tens of thousands of requests/sec –Low latency globally - 20+ datacenters worldwide –High Availability - outages cost $millions –Variable usage patterns - as applications and users change 31

32 32 The PNUTS/Sherpa Solution The next generation global-scale record store –Record-orientation: Routing, data storage optimized for low-latency record access –Scale out: Add machines to scale throughput (while keeping latency low) –Asynchrony: Pub-sub replication to far-flung datacenters to mask propagation delay –Consistency model: Reduce complexity of asynchrony for the application programmer –Cloud deployment model: Hosted, managed service to reduce app time-to-market and enable on demand scale and elasticity 32

33 33 E 75656 C A 42342 E B 42521 W C 66354 W D 12352 E F 15677 E What is PNUTS/Sherpa? E 75656 C A 42342 E B 42521 W C 66354 W D 12352 E F 15677 E CREATE TABLE Parts ( ID VARCHAR, StockNumber INT, Status VARCHAR … ) CREATE TABLE Parts ( ID VARCHAR, StockNumber INT, Status VARCHAR … ) Parallel database Geographic replication Structured, flexible schema Hosted, managed infrastructure A 42342 E B 42521 W C 66354 W D 12352 E E 75656 C F 15677 E 33

34 34 What Will It Become? E 75656 C A 42342 E B 42521 W C 66354 W D 12352 E F 15677 E E 75656 C A 42342 E B 42521 W C 66354 W D 12352 E F 15677 E E 75656 C A 42342 E B 42521 W C 66354 W D 12352 E F 15677 E CREATE TABLE Parts ( ID VARCHAR, StockNumber INT, Status VARCHAR … ) CREATE TABLE Parts ( ID VARCHAR, StockNumber INT, Status VARCHAR … ) Parallel database Geographic replication Indexes and views Structured, flexible schema Hosted, managed infrastructure

35 35 What Will It Become? E 75656 C A 42342 E B 42521 W C 66354 W D 12352 E F 15677 E E 75656 C A 42342 E B 42521 W C 66354 W D 12352 E F 15677 E E 75656 C A 42342 E B 42521 W C 66354 W D 12352 E F 15677 E Indexes and views

36 36 Scalability Thousands of machines Easy to add capacity Restrict query language to avoid costly queries Geographic replication Asynchronous replication around the globe Low-latency local access High availability and fault tolerance Automatically recover from failures Serve reads and writes despite failures Design Goals 36 Consistency Per-record guarantees Timeline model Option to relax if needed Multiple access paths Hash table, ordered table Primary, secondary access Hosted service Applications plug and play Share operational cost

37 37 Technology Elements PNUTS Query planning and execution Index maintenance Distributed infrastructure for tabular data Data partitioning Update consistency Replication YDOT FS Ordered tables Applications Tribble Pub/sub messaging YDHT FS Hash tables Zookeeper Consistency service YCA: Authorization PNUTS API Tabular API 37

38 38 Data Manipulation Per-record operations –Get –Set –Delete Multi-record operations –Multiget –Scan –Getrange Web service (RESTful) API 38

39 39 Tablets—Hash Table Apple Lemon Grape Orange Lime Strawberry Kiwi Avocado Tomato Banana Grapes are good to eat Limes are green Apple is wisdom Strawberry shortcake Arrgh! Don’t get scurvy! But at what price? How much did you pay for this lemon? Is this a vegetable? New Zealand The perfect fruit NameDescriptionPrice $12 $9 $1 $900 $2 $3 $1 $14 $2 $8 0x0000 0xFFFF 0x911F 0x2AF3 39

40 40 Tablets—Ordered Table 40 Apple Banana Grape Orange Lime Strawberry Kiwi Avocado Tomato Lemon Grapes are good to eat Limes are green Apple is wisdom Strawberry shortcake Arrgh! Don’t get scurvy! But at what price? The perfect fruit Is this a vegetable? How much did you pay for this lemon? New Zealand $1 $3 $2 $12 $8 $1 $9 $2 $900 $14 NameDescriptionPrice A Z Q H

41 41 Flexible Schema Posted dateListing idItemPrice 6/1/07424252Couch$570 6/1/07763245Bike$86 6/3/07211242Car$1123 6/5/07421133Lamp$15 Color Red Condition Good Fair

42 42 Storage units Routers Tablet Controller REST API Clients Local region Remote regions Tribble Detailed Architecture 42

43 43 Tablet Splitting and Balancing 43 Each storage unit has many tablets (horizontal partitions of the table) Tablets may grow over time Overfull tablets split Storage unit may become a hotspot Shed load by moving tablets to other servers Storage unit Tablet

44 44 QUERY PROCESSING 44

45 45 Accessing Data 45 SU 1 Get key k 2 3 Record for key k 4

46 46 Bulk Read 46 SU Scatter/ gather server SU 1 {k1, k2, … kn} 2 Get k 1 Get k 2 Get k 3

47 47 Storage unit 1Storage unit 2Storage unit 3 Range Queries in YDOT Clustered, ordered retrieval of records Storage unit 1 Canteloupe Storage unit 3 Lime Storage unit 2 Strawberry Storage unit 1 Router Apple Avocado Banana Blueberry Canteloupe Grape Kiwi Lemon Lime Mango Orange Strawberry Tomato Watermelon Apple Avocado Banana Blueberry Canteloupe Grape Kiwi Lemon Lime Mango Orange Strawberry Tomato Watermelon Grapefruit…Pear? Grapefruit…Lime? Lime…Pear? Storage unit 1 Canteloupe Storage unit 3 Lime Storage unit 2 Strawberry Storage unit 1

48 48 Updates 1 Write key k 2 7 Sequence # for key k 8 SU 3 Write key k 4 5 SUCCESS 6 Write key k Routers Message brokers 48

49 49 ASYNCHRONOUS REPLICATION AND CONSISTENCY 49

50 50 Asynchronous Replication 50

51 51 Goal: Make it easier for applications to reason about updates and cope with asynchrony What happens to a record with primary key “Alice”? Consistency Model 51 Time Record inserted Update Delete Time v. 1 v. 2 v. 3v. 4 v. 5 v. 7 Generation 1 v. 6 v. 8 Update As the record is updated, copies may get out of sync.

52 52 Example: Social Alice UserStatus AliceBusy West East UserStatus AliceFree UserStatus Alice??? UserStatus Alice??? UserStatus AliceBusy UserStatus Alice___ Busy Free Record Timeline

53 53 Time v. 1 v. 2 v. 3v. 4 v. 5 v. 7 Generation 1 v. 6 v. 8 Current version Stale version Read Consistency Model 53 In general, reads are served using a local copy

54 54 Time v. 1 v. 2 v. 3v. 4 v. 5 v. 7 Generation 1 v. 6 v. 8 Read up-to-date Current version Stale version Consistency Model 54 But application can request and get current version

55 55 Time v. 1 v. 2 v. 3v. 4 v. 5 v. 7 Generation 1 v. 6 v. 8 Read ≥ v.6 Current version Stale version Consistency Model 55 Or variations such as “read forward”—while copies may lag the master record, every copy goes through the same sequence of changes

56 56 Time v. 1 v. 2 v. 3v. 4 v. 5 v. 7 Generation 1 v. 6 v. 8 Write Current version Stale version Consistency Model 56 Achieved via per-record primary copy protocol (To maximize availability, record masterships automaticlly transferred if site fails) Can be selectively weakened to eventual consistency (local writes that are reconciled using version vectors)

57 57 Time v. 1 v. 2 v. 3v. 4 v. 5 v. 7 Generation 1 v. 6 v. 8 Write if = v.7 ERROR Current version Stale version Consistency Model 57 Test-and-set writes facilitate per-record transactions

58 58 Consistency Techniques Per-record mastering –Each record is assigned a “master region” May differ between records –Updates to the record forwarded to the master region –Ensures consistent ordering of updates Tablet-level mastering –Each tablet is assigned a “master region” –Inserts and deletes of records forwarded to the master region –Master region decides tablet splits These details are hidden from the application –Except for the latency impact!

59 59 Mastering A 42342 E B 42521 W C 66354 W D 12352 E E 75656 C F 15677 E A 42342 E B 42521 W C 66354 W D 12352 E E 75656 C F 15677 E A 42342 E B 42521 W C 66354 W D 12352 E E 75656 C F 15677 E Tablet master

60 60 Bulk Insert/Update/Replace Client Source Data Bulk manager 1.Client feeds records to bulk manager 2.Bulk loader transfers records to SU’s in batches Bypass routers and message brokers Efficient import into storage unit

61 61 Bulk Load in YDOT YDOT bulk inserts can cause performance hotspots Solution: preallocate tablets

62 62 Index Maintenance How to have lots of interesting indexes and views, without killing performance? Solution: Asynchrony! –Indexes/views updated asynchronously when base table updated

63 63 SHERPA IN CONTEXT 63

64 64 Types of Record Stores Query expressiveness Simple Feature rich Object retrieval Retrieval from single table of objects/records SQL S3 PNUTS Oracle

65 65 Types of Record Stores Consistency model Best effort Strong guarantees Eventual consistency Timeline consistency ACID S3 PNUTS Oracle Program centric consistency Object-centric consistency

66 66 Types of Record Stores Data model Flexibility, Schema evolution Optimized for Fixed schemas CouchDB PNUTS Oracle Consistency spans objects Object-centric consistency

67 67 Types of Record Stores Elasticity (ability to add resources on demand) Inelastic Elastic Limited (via data distribution) VLSD (Very Large Scale Distribution /Replication) Oracle PNUTS S3

68 68 Data Stores Comparison User-partitioned SQL stores –Microsoft Azure SDS –Amazon SimpleDB Multi-tenant application databases –Salesforce.com –Oracle on Demand Mutable object stores –Amazon S3 Versus PNUTS More expressive queries Users must control partitioning Limited elasticity Highly optimized for complex workloads Limited flexibility to evolving applications Inherit limitations of underlying data management system Object storage versus record management

69 69 Application Design Space Records Files Get a few things Scan everything Sherpa MObStor Everest Hadoop YMDB MySQL Filer Oracle BigTable 69

70 70 Alternatives Matrix Elastic Operability Global low latency Availability Structured access Sherpa Y! UDB MySQL Oracle HDFS BigTable Dynamo Updates Cassandra Consistency model SQL/ACID 70

71 71 Further Reading Efficient Bulk Insertion into a Distributed Ordered Table (SIGMOD 2008) Adam Silberstein, Brian Cooper, Utkarsh Srivastava, Erik Vee, Ramana Yerneni, Raghu Ramakrishnan PNUTS: Yahoo!'s Hosted Data Serving Platform (VLDB 2008) Brian Cooper, Raghu Ramakrishnan, Utkarsh Srivastava, Adam Silberstein, Phil Bohannon, Hans-Arno Jacobsen, Nick Puz, Daniel Weaver, Ramana Yerneni Asynchronous View Maintenance for VLSD Databases, Parag Agrawal, Adam Silberstein, Brian F. Cooper, Utkarsh Srivastava and Raghu Ramakrishnan SIGMOD 2009 (to appear) Cloud Storage Design in a PNUTShell Brian F. Cooper, Raghu Ramakrishnan, and Utkarsh Srivastava Beautiful Data, O’Reilly Media, 2009 (to appear)

72 72 QUESTIONS? 72


Download ppt "1 An Overview of Cloud Yahoo! Raghu Ramakrishnan Chief Scientist, Audience and Cloud Computing Research Fellow, Yahoo! Research Reflects many."

Similar presentations


Ads by Google