Dr.S.Sridhar, Director, RVCET, RVCE, Bangalore

Slides:



Advertisements
Similar presentations
An Overview of Cloud Computing Raghu Ramakrishnan Chief Scientist, Audience and Cloud Computing Research Fellow, Yahoo! Research Reflects many discussions.
Advertisements

Introduction to cloud computing Jiaheng Lu Department of Computer Science Renmin University of China
Introduction to cloud computing Jiaheng Lu Department of Computer Science Renmin University of China
Introduction to cloud computing Jiaheng Lu Department of Computer Science Renmin University of China
Case Study - Amazon. Amazon r Amazon has many Data Centers r Hundreds of services r Thousands of commodity machines r Millions of customers at peak times.
PNUTS: Yahoo!’s Hosted Data Serving Platform Brian F. Cooper, Raghu Ramakrishnan, Utkarsh Srivastava, Adam Silberstein, Philip Bohannon, HansArno Jacobsen,
Data Management in the Cloud Paul Szerlip. The rise of data Think about this o For the past two decades, the largest generator of data was humans -- now.
1 Web-Scale Data Serving with PNUTS Adam Silberstein Yahoo! Research.
PNUTS: Yahoo’s Hosted Data Serving Platform Jonathan Danaparamita jdanap at umich dot edu University of Michigan EECS 584, Fall Some slides/illustrations.
PNUTS: Yahoo!’s Hosted Data Serving Platform Brian F. Cooper, Raghu Ramakrishnan, Utkarsh Srivastava, Adam Silberstein, Philip Bohannon, Hans-Arno Jacobsen,
PNUTS: Yahoo!’s Hosted Data Serving Platform Yahoo! Research present by Liyan & Fang.
Web Data Management Raghu Ramakrishnan Research QUIQ Lessons Structured data management powers scalable collaboration environments ASP Multi-tenancy.
 Need for a new processing platform (BigData)  Origin of Hadoop  What is Hadoop & what it is not ?  Hadoop architecture  Hadoop components (Common/HDFS/MapReduce)
1 An Overview of Cloud Yahoo! Raghu Ramakrishnan Chief Scientist, Audience and Cloud Computing Research Fellow, Yahoo! Research Reflects many.
1 Cloud Data Serving: From Key-Value Stores to DBMSs Raghu Ramakrishnan Chief Scientist, Audience and Cloud Computing Brian Cooper Adam Silberstein Utkarsh.
Apache Hadoop and Hive Dhruba Borthakur Apache Hadoop Developer
Inexpensive Scalable Information Access Many Internet applications need to access data for millions of concurrent users Relational DBMS technology cannot.
PNUTS: YAHOO!’S HOSTED DATA SERVING PLATFORM FENGLI ZHANG.
Take An Internal Look at Hadoop Hairong Kuang Grid Team, Yahoo! Inc
1 Yasin N. Silva Arizona State University This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
Distributed Data Stores – Facebook Presented by Ben Gooding University of Arkansas – April 21, 2015.
Distributed Systems Tutorial 11 – Yahoo! PNUTS written by Alex Libov Based on OSCON 2011 presentation winter semester,
PNUTS: Y AHOO !’ S H OSTED D ATA S ERVING P LATFORM B RIAN F. C OOPER, R AGHU R AMAKRISHNAN, U TKARSH S RIVASTAVA, A DAM S ILBERSTEIN, P HILIP B OHANNON,
CS525: Special Topics in DBs Large-Scale Data Management Hadoop/MapReduce Computing Paradigm Spring 2013 WPI, Mohamed Eltabakh 1.
HBase A column-centered database 1. Overview An Apache project Influenced by Google’s BigTable Built on Hadoop ▫A distributed file system ▫Supports Map-Reduce.
Presented by CH.Anusha.  Apache Hadoop framework  HDFS and MapReduce  Hadoop distributed file system  JobTracker and TaskTracker  Apache Hadoop NextGen.
Ahmad Al-Shishtawy 1,2,Tareq Jamal Khan 1, and Vladimir Vlassov KTH Royal Institute of Technology, Stockholm, Sweden {ahmadas, tareqjk,
Distributed Indexing of Web Scale Datasets for the Cloud {ikons, eangelou, Computing Systems Laboratory School of Electrical.
Introduction to Hadoop and HDFS
Hadoop & Condor Dhruba Borthakur Project Lead, Hadoop Distributed File System Presented at the The Israeli Association of Grid Technologies.
Alireza Angabini Advanced DB class Dr. M.Rahgozar Fall 88.
Cassandra - A Decentralized Structured Storage System
Introduction to Hadoop Owen O’Malley Yahoo!, Grid Team
PNUTS PNUTS: Yahoo!’s Hosted Data Serving Platform Brian F. Cooper, Raghu Ramakrishnan, Utkarsh Srivastava, Adam Silberstein, Philip Bohannon, HansArno.
Fast Crash Recovery in RAMCloud. Motivation The role of DRAM has been increasing – Facebook used 150TB of DRAM For 200TB of disk storage However, there.
Authors Brian F. Cooper, Raghu Ramakrishnan, Utkarsh Srivastava, Adam Silberstein, Philip Bohannon, Hans-Arno Jacobsen, Nick Puz, Daniel Weaver, Ramana.
Introduction to cloud computing Jiaheng Lu Department of Computer Science Renmin University of China
CS525: Big Data Analytics MapReduce Computing Paradigm & Apache Hadoop Open Source Fall 2013 Elke A. Rundensteiner 1.
Hadoop/MapReduce Computing Paradigm 1 CS525: Special Topics in DBs Large-Scale Data Management Presented By Kelly Technologies
{ Tanya Chaturvedi MBA(ISM) Hadoop is a software framework for distributed processing of large datasets across large clusters of computers.
Robustness in the Salus scalable block store Yang Wang, Manos Kapritsos, Zuocheng Ren, Prince Mahajan, Jeevitha Kirubanandam, Lorenzo Alvisi, and Mike.
BIG DATA/ Hadoop Interview Questions.
Google Cloud computing techniques (Lecture 03) 18th Jan 20161Dr.S.Sridhar, Director, RVCT, RVCE, Bangalore
School of Computer Science and Mathematics BIG DATA WITH HADOOP EXPERIMENTATION APARICIO CARRANZA NYC College of Technology – CUNY ECC Conference 2016.
Web-Scale Data Serving with PNUTS
Image taken from: slideshare
Sanjay Ghemawat, Howard Gobioff, Shun-Tak Leung
CS 405G: Introduction to Database Systems
and Big Data Storage Systems
Cassandra - A Decentralized Structured Storage System
Hadoop.
INTRODUCTION TO PIG, HIVE, HBASE and ZOOKEEPER
Dhruba Borthakur Apache Hadoop Developer Facebook Data Infrastructure
NOSQL.
PNUTS: Yahoo!’s Hosted Data Serving Platform
PNUTS: Yahoo!’s Hosted Data Serving Platform
NOSQL databases and Big Data Storage Systems
Gregory Kesden, CSE-291 (Storage Systems) Fall 2017
Gregory Kesden, CSE-291 (Cloud Computing) Fall 2016
Massively Parallel Cloud Data Storage Systems
CS6604 Digital Libraries IDEAL Webpages Presented by
Introduction to PIG, HIVE, HBASE & ZOOKEEPER
آزمايشگاه سيستمهای هوشمند علی کمالی زمستان 95
Introduction to Apache
AWS Cloud Computing Masaki.
Benchmarking Cloud Serving Systems with YCSB
April 13th – Semi-structured data
Cloud Computing for Data Analysis Pig|Hive|Hbase|Zookeeper
THE GOOGLE FILE SYSTEM.
Chapter 21: Parallel and Distributed Storage
Presentation transcript:

Dr.S.Sridhar, Director, RVCET, RVCE, Bangalore-560059

Structured record storage Web Data Management CRUD Point lookups and short scans Index organized table and random I/Os $ per latency Scan oriented workloads Focus on sequential disk I/O $ per cpu cycle Structured record storage (PNUTS/Sherpa) Large data analysis (Hadoop) Object retrieval and streaming Scalable file storage $ per GB Blob storage (SAN/NAS)

The World Has Changed Web serving applications need: Scalability! Preferably elastic Flexible schemas Geographic distribution High availability Reliable storage Web serving applications can do without: Complicated queries Strong transactions

Dr.S.Sridhar, Director, RVCET, RVCE, Bangalore-560059 PNUTS / SHERPA To Help You Scale Your Mountains of Data A project in Y!R focused on a long-range problem, origins in earlier work at Wisconsin. Basis for the Goldrush hack, which won the recent Local hack competition, and could contribute to creation/refinement of Y! Local content and Next Gen Search. Dr.S.Sridhar, Director, RVCET, RVCE, Bangalore-560059

Yahoo! Serving Storage Problem Small records – 100KB or less Structured records – lots of fields, evolving Extreme data scale - Tens of TB Extreme request scale - Tens of thousands of requests/sec Low latency globally - 20+ datacenters worldwide High Availability - outages cost $millions Variable usage patterns - as applications and users change Dr.S.Sridhar, Director, RVCET, RVCE, Bangalore-560059 5

Dr.S.Sridhar, Director, RVCET, RVCE, Bangalore-560059 What is PNUTS/Sherpa? CREATE TABLE Parts ( ID VARCHAR, StockNumber INT, Status VARCHAR … ) E 75656 C A 42342 E B 42521 W C 66354 W D 12352 E F 15677 E A 42342 E B 42521 W C 66354 W D 12352 E E 75656 C F 15677 E Structured, flexible schema E 75656 C A 42342 E B 42521 W C 66354 W D 12352 E F 15677 E Geographic replication Parallel database Hosted, managed infrastructure Dr.S.Sridhar, Director, RVCET, RVCE, Bangalore-560059 7

What Will It Become? Indexes and views A 42342 E B 42521 W A 42342 E C 66354 W D 12352 E F 15677 E E 75656 C A 42342 E B 42521 W C 66354 W D 12352 E F 15677 E Indexes and views E 75656 C A 42342 E B 42521 W C 66354 W D 12352 E F 15677 E

Dr.S.Sridhar, Director, RVCET, RVCE, Bangalore-560059 Design Goals Scalability Thousands of machines Easy to add capacity Restrict query language to avoid costly queries Geographic replication Asynchronous replication around the globe Low-latency local access High availability and fault tolerance Automatically recover from failures Serve reads and writes despite failures Consistency Per-record guarantees Timeline model Option to relax if needed Multiple access paths Hash table, ordered table Primary, secondary access Hosted service Applications plug and play Share operational cost Dr.S.Sridhar, Director, RVCET, RVCE, Bangalore-560059 10 10

Dr.S.Sridhar, Director, RVCET, RVCE, Bangalore-560059 Technology Elements Applications PNUTS API Tabular API PNUTS Query planning and execution Index maintenance Distributed infrastructure for tabular data Data partitioning Update consistency Replication YCA: Authorization YDOT FS Ordered tables YDHT FS Hash tables Tribble Pub/sub messaging Zookeeper Consistency service Dr.S.Sridhar, Director, RVCET, RVCE, Bangalore-560059 11 11

Dr.S.Sridhar, Director, RVCET, RVCE, Bangalore-560059 Data Manipulation Per-record operations Get Set Delete Multi-record operations Multiget Scan Getrange Dr.S.Sridhar, Director, RVCET, RVCE, Bangalore-560059 12

Dr.S.Sridhar, Director, RVCET, RVCE, Bangalore-560059 Tablets—Hash Table Name Description Price 0x0000 Grape Grapes are good to eat $12 Lime Limes are green $9 Apple Apple is wisdom $1 Strawberry Strawberry shortcake $900 0x2AF3 Orange Arrgh! Don’t get scurvy! $2 Avocado But at what price? $3 Lemon How much did you pay for this lemon? $1 Tomato Is this a vegetable? $14 0x911F Banana The perfect fruit $2 Kiwi New Zealand $8 0xFFFF Dr.S.Sridhar, Director, RVCET, RVCE, Bangalore-560059 13

Tablets—Ordered Table Name Description Price A Apple Apple is wisdom $1 Avocado But at what price? $3 Banana The perfect fruit $2 Grape Grapes are good to eat $12 H Kiwi New Zealand $8 Lemon How much did you pay for this lemon? $1 Lime Limes are green $9 Orange Arrgh! Don’t get scurvy! $2 Q Strawberry Strawberry shortcake $900 Tomato Is this a vegetable? $14 Z Dr.S.Sridhar, Director, RVCET, RVCE, Bangalore-560059 14

Dr.S.Sridhar, Director, RVCET, RVCE, Bangalore-560059 Flexible Schema Posted date Listing id Item Price 6/1/07 424252 Couch $570 763245 Bike $86 6/3/07 211242 Car $1123 6/5/07 421133 Lamp $15 Condition Good Fair Color Red Dr.S.Sridhar, Director, RVCET, RVCE, Bangalore-560059

Detailed Architecture Local region Remote regions Clients REST API Routers Tribble Tablet Controller Storage units Dr.S.Sridhar, Director, RVCET, RVCE, Bangalore-560059 16

Tablet Splitting and Balancing Each storage unit has many tablets (horizontal partitions of the table) Storage unit may become a hotspot Storage unit Tablet Overfull tablets split Tablets may grow over time Shed load by moving tablets to other servers Dr.S.Sridhar, Director, RVCET, RVCE, Bangalore-560059 17

Dr.S.Sridhar, Director, RVCET, RVCE, Bangalore-560059 QUERY PROCESSING Dr.S.Sridhar, Director, RVCET, RVCE, Bangalore-560059 18

Dr.S.Sridhar, Director, RVCET, RVCE, Bangalore-560059 Accessing Data 3 Record for key k 4 1 Get key k 2 Get key k SU SU SU Dr.S.Sridhar, Director, RVCET, RVCE, Bangalore-560059 19

Dr.S.Sridhar, Director, RVCET, RVCE, Bangalore-560059 Bulk Read 1 {k1, k2, … kn} 2 Get k1 Get k2 Get k3 Scatter/ gather server SU SU SU Dr.S.Sridhar, Director, RVCET, RVCE, Bangalore-560059 20

Range Queries Clustered, ordered retrieval of records Grapefruit…Pear? Storage unit 1 Canteloupe Storage unit 3 Lime Storage unit 2 Strawberry Router Apple Avocado Banana Blueberry Storage unit 1 Canteloupe Storage unit 3 Lime Storage unit 2 Strawberry Grapefruit…Lime? Grapefruit…Pear? Lime…Pear? Canteloupe Grape Kiwi Lemon Lime Mango Orange Storage unit 1 Storage unit 2 Storage unit 3 Strawberry Tomato Watermelon Apple Avocado Banana Blueberry Strawberry Tomato Watermelon Lime Mango Orange Canteloupe Grape Kiwi Lemon

Dr.S.Sridhar, Director, RVCET, RVCE, Bangalore-560059 Updates 8 1 Sequence # for key k Write key k Routers Message brokers 2 Write key k 3 Write key k 7 4 Sequence # for key k 5 SUCCESS SU SU SU 6 Write key k Dr.S.Sridhar, Director, RVCET, RVCE, Bangalore-560059 22

REPLICATION AND CONSISTENCY Dr.S.Sridhar, Director, RVCET, RVCE, Bangalore-560059 23

Dr.S.Sridhar, Director, RVCET, RVCE, Bangalore-560059 Replication Dr.S.Sridhar, Director, RVCET, RVCE, Bangalore-560059 24

Dr.S.Sridhar, Director, RVCET, RVCE, Bangalore-560059 Consistency Model Goal: Make it easier for applications to reason about updates and cope with asynchrony What happens to a record with primary key “Alice”? Record inserted Update Update Update Update Update Update Update Delete v. 1 v. 2 v. 3 v. 4 v. 5 v. 6 v. 7 v. 8 Time Time Generation 1 As the record is updated, copies may get out of sync. Dr.S.Sridhar, Director, RVCET, RVCE, Bangalore-560059 25

Dr.S.Sridhar, Director, RVCET, RVCE, Bangalore-560059 Example: Social Alice West East Record Timeline User Status Alice ___ ___ User Status Alice Busy Busy User Status Alice Busy User Status Alice Free Free User Status Alice ??? User Status Alice ??? Free Dr.S.Sridhar, Director, RVCET, RVCE, Bangalore-560059

Dr.S.Sridhar, Director, RVCET, RVCE, Bangalore-560059 Consistency Model Read Stale version Stale version Current version v. 1 v. 2 v. 3 v. 4 v. 5 v. 6 v. 7 v. 8 Time Generation 1 In general, reads are served using a local copy Dr.S.Sridhar, Director, RVCET, RVCE, Bangalore-560059 27

Dr.S.Sridhar, Director, RVCET, RVCE, Bangalore-560059 Consistency Model Read up-to-date Stale version Stale version Current version v. 1 v. 2 v. 3 v. 4 v. 5 v. 6 v. 7 v. 8 Time Generation 1 But application can request and get current version Dr.S.Sridhar, Director, RVCET, RVCE, Bangalore-560059 28

Dr.S.Sridhar, Director, RVCET, RVCE, Bangalore-560059 Consistency Model Read ≥ v.6 Stale version Stale version Current version v. 1 v. 2 v. 3 v. 4 v. 5 v. 6 v. 7 v. 8 Time Generation 1 Or variations such as “read forward”—while copies may lag the master record, every copy goes through the same sequence of changes Dr.S.Sridhar, Director, RVCET, RVCE, Bangalore-560059 29

Dr.S.Sridhar, Director, RVCET, RVCE, Bangalore-560059 Consistency Model Write Stale version Stale version Current version v. 1 v. 2 v. 3 v. 4 v. 5 v. 6 v. 7 v. 8 Time Generation 1 Achieved via per-record primary copy protocol (To maximize availability, record masterships automaticlly transferred if site fails) Can be selectively weakened to eventual consistency (local writes that are reconciled using version vectors) Dr.S.Sridhar, Director, RVCET, RVCE, Bangalore-560059 30

Dr.S.Sridhar, Director, RVCET, RVCE, Bangalore-560059 Consistency Model Write if = v.7 ERROR Stale version Stale version Current version v. 1 v. 2 v. 3 v. 4 v. 5 v. 6 v. 7 v. 8 Time Generation 1 Test-and-set writes facilitate per-record transactions Dr.S.Sridhar, Director, RVCET, RVCE, Bangalore-560059 31

Consistency Techniques Per-record mastering Each record is assigned a “master region” May differ between records Updates to the record forwarded to the master region Ensures consistent ordering of updates Tablet-level mastering Each tablet is assigned a “master region” Inserts and deletes of records forwarded to the master region Master region decides tablet splits These details are hidden from the application Except for the latency impact!

Dr.S.Sridhar, Director, RVCET, RVCE, Bangalore-560059 Mastering A 42342 E B 42521 W C 66354 W D 12352 E E 75656 C F 15677 E A 42342 E B 42521 W Tablet master C 66354 W D 12352 E E 75656 C F 15677 E A 42342 E B 42521 W C 66354 W D 12352 E E 75656 C F 15677 E Dr.S.Sridhar, Director, RVCET, RVCE, Bangalore-560059 33

Bulk Insert/Update/Replace Client feeds records to bulk manager Bulk loader transfers records to SU’s in batches Bypass routers and message brokers Efficient import into storage unit Client Bulk manager Source Data Dr.S.Sridhar, Director, RVCET, RVCE, Bangalore-560059

Dr.S.Sridhar, Director, RVCET, RVCE, Bangalore-560059 SHERPA IN CONTEXT Dr.S.Sridhar, Director, RVCET, RVCE, Bangalore-560059 35

Retrieval from single table of objects/records Types of Record Stores Query expressiveness S3 PNUTS Oracle Simple Feature rich Object retrieval Retrieval from single table of objects/records SQL

Types of Record Stores Consistency model S3 PNUTS Oracle Best effort Strong guarantees Eventual consistency Timeline consistency ACID Program centric consistency Object-centric consistency

Types of Record Stores Data model PNUTS CouchDB Oracle Flexibility, Schema evolution Optimized for Fixed schemas Object-centric consistency Consistency spans objects

Elasticity (ability to add resources on demand) Types of Record Stores Elasticity (ability to add resources on demand) PNUTS S3 Oracle Inelastic Elastic Limited (via data distribution) VLSD (Very Large Scale Distribution /Replication)

Data Stores Comparison Versus PNUTS More expressive queries Users must control partitioning Limited elasticity Highly optimized for complex workloads Limited flexibility to evolving applications Inherit limitations of underlying data management system Object storage versus record management User-partitioned SQL stores Microsoft Azure SDS Amazon SimpleDB Multi-tenant application databases Salesforce.com Oracle on Demand Mutable object stores Amazon S3 Dr.S.Sridhar, Director, RVCET, RVCE, Bangalore-560059

Application Design Space Get a few things Sherpa MObStor YMDB MySQL Oracle Filer BigTable Scan everything Everest Hadoop Records Files Dr.S.Sridhar, Director, RVCET, RVCE, Bangalore-560059 41

Dr.S.Sridhar, Director, RVCET, RVCE, Bangalore-560059 Alternatives Matrix Global low latency Structured access Consistency model SQL/ACID Operability Availability Updates Elastic Sherpa Y! UDB MySQL Oracle HDFS BigTable Dynamo Cassandra Dr.S.Sridhar, Director, RVCET, RVCE, Bangalore-560059 42

Dr.S.Sridhar, Director, RVCET, RVCE, Bangalore-560059 Hadoop Dr.S.Sridhar, Director, RVCET, RVCE, Bangalore-560059

Distributed File System Single namespace for entire cluster Managed by a single namenode. Files are single-writer and append-only. Optimized for streaming reads of large files. Files are broken in to large blocks. Typically 128 MB Replicated to several datanodes, for reliability Access from Java, C, or command line. Dr.S.Sridhar, Director, RVCET, RVCE, Bangalore-560059 44

Dr.S.Sridhar, Director, RVCET, RVCE, Bangalore-560059 Hadoop clusters We have ~20,000 machines running Hadoop Our largest clusters are currently 2000 nodes Several petabytes of user data (compressed, unreplicated) We run hundreds of thousands of jobs every month Dr.S.Sridhar, Director, RVCET, RVCE, Bangalore-560059 45

Research Cluster Usage Dr.S.Sridhar, Director, RVCET, RVCE, Bangalore-560059

Dr.S.Sridhar, Director, RVCET, RVCE, Bangalore-560059 Who Uses Hadoop? Amazon/A9 AOL Facebook Fox interactive media Google / IBM New York Times PowerSet (now Microsoft) Quantcast Rackspace/Mailtrust Veoh Yahoo! Dr.S.Sridhar, Director, RVCET, RVCE, Bangalore-560059 47