Operating Systems and The Cloud, Part II: Search => Cluster Apps => Scalable Machine Learning David E. Culler CS162 – Operating Systems and Systems Programming.

Slides:



Advertisements
Similar presentations
MAP REDUCE PROGRAMMING Dr G Sudha Sadasivam. Map - reduce sort/merge based distributed processing Best for batch- oriented processing Sort/merge is primitive.
Advertisements

The Datacenter Needs an Operating System Matei Zaharia, Benjamin Hindman, Andy Konwinski, Ali Ghodsi, Anthony Joseph, Randy Katz, Scott Shenker, Ion Stoica.
 Open source software framework designed for storage and processing of large scale data on clusters of commodity hardware  Created by Doug Cutting and.
Mapreduce and Hadoop Introduce Mapreduce and Hadoop
MapReduce Online Created by: Rajesh Gadipuuri Modified by: Ying Lu.
Parallel Computing MapReduce Examples Parallel Efficiency Assignment
HadoopDB Inneke Ponet.  Introduction  Technologies for data analysis  HadoopDB  Desired properties  Layers of HadoopDB  HadoopDB Components.
Problem-solving on large-scale clusters: theory and applications Lecture 3: Bringing it all together.
Matei Zaharia University of California, Berkeley Spark in Action Fast Big Data Analytics using Scala UC BERKELEY.
UC Berkeley Spark Cluster Computing with Working Sets Matei Zaharia, Mosharaf Chowdhury, Michael Franklin, Scott Shenker, Ion Stoica.
Berkeley Data Analytics Stack (BDAS) Overview Ion Stoica UC Berkeley UC BERKELEY.
Spark: Cluster Computing with Working Sets
Matei Zaharia, Mosharaf Chowdhury, Tathagata Das, Ankur Dave, Justin Ma, Murphy McCauley, Michael Franklin, Scott Shenker, Ion Stoica Spark Fast, Interactive,
Discretized Streams: Fault-Tolerant Streaming Computation at Scale Wenting Wang 1.
Data-Intensive Computing with MapReduce/Pig Pramod Bhatotia MPI-SWS Distributed Systems – Winter Semester 2014.
Shark Cliff Engle, Antonio Lupher, Reynold Xin, Matei Zaharia, Michael Franklin, Ion Stoica, Scott Shenker Hive on Spark.
Matei Zaharia, Mosharaf Chowdhury, Tathagata Das, Ankur Dave, Justin Ma, Murphy McCauley, Michael Franklin, Scott Shenker, Ion Stoica Spark Fast, Interactive,
Distributed Computations
Mesos A Platform for Fine-Grained Resource Sharing in Data Centers Benjamin Hindman, Andy Konwinski, Matei Zaharia, Ali Ghodsi, Anthony D. Joseph, Randy.
Distributed Computations MapReduce
L22: SC Report, Map Reduce November 23, Map Reduce What is MapReduce? Example computing environment How it works Fault Tolerance Debugging Performance.
Lecture 2 – MapReduce CPE 458 – Parallel Programming, Spring 2009 Except as otherwise noted, the content of this presentation is licensed under the Creative.
AMPCamp Introduction to Berkeley Data Analytics Systems (BDAS)
Take An Internal Look at Hadoop Hairong Kuang Grid Team, Yahoo! Inc
Advanced Topics: MapReduce ECE 454 Computer Systems Programming Topics: Reductions Implemented in Distributed Frameworks Distributed Key-Value Stores Hadoop.
SIDDHARTH MEHTA PURSUING MASTERS IN COMPUTER SCIENCE (FALL 2008) INTERESTS: SYSTEMS, WEB.
By: Jeffrey Dean & Sanjay Ghemawat Presented by: Warunika Ranaweera Supervised by: Dr. Nalin Ranasinghe.
MapReduce. Web data sets can be very large – Tens to hundreds of terabytes Cannot mine on a single server Standard architecture emerging: – Cluster of.
Map Reduce for data-intensive computing (Some of the content is adapted from the original authors’ talk at OSDI 04)
CS525: Special Topics in DBs Large-Scale Data Management Hadoop/MapReduce Computing Paradigm Spring 2013 WPI, Mohamed Eltabakh 1.
MapReduce: Simplified Data Processing on Large Clusters Jeffrey Dean and Sanjay Ghemawat.
MapReduce and Hadoop 1 Wu-Jun Li Department of Computer Science and Engineering Shanghai Jiao Tong University Lecture 2: MapReduce and Hadoop Mining Massive.
Storage in Big Data Systems
MapReduce: Hadoop Implementation. Outline MapReduce overview Applications of MapReduce Hadoop overview.
Hadoop/MapReduce Computing Paradigm 1 Shirish Agale.
Introduction to Hadoop and HDFS
Resilient Distributed Datasets A Fault-Tolerant Abstraction for In-Memory Cluster Computing Matei Zaharia, Mosharaf Chowdhury, Tathagata Das, Ankur Dave,
CSE 486/586 CSE 486/586 Distributed Systems Graph Processing Steve Ko Computer Sciences and Engineering University at Buffalo.
MapReduce M/R slides adapted from those of Jeff Dean’s.
MapReduce Kristof Bamps Wouter Deroey. Outline Problem overview MapReduce o overview o implementation o refinements o conclusion.
Grid Computing at Yahoo! Sameer Paranjpye Mahadev Konar Yahoo!
Presented by: Katie Woods and Jordan Howell. * Hadoop is a distributed computing platform written in Java. It incorporates features similar to those of.
By Jeff Dean & Sanjay Ghemawat Google Inc. OSDI 2004 Presented by : Mohit Deopujari.
Resilient Distributed Datasets: A Fault- Tolerant Abstraction for In-Memory Cluster Computing Matei Zaharia, Mosharaf Chowdhury, Tathagata Das, Ankur Dave,
CS525: Big Data Analytics MapReduce Computing Paradigm & Apache Hadoop Open Source Fall 2013 Elke A. Rundensteiner 1.
Hadoop/MapReduce Computing Paradigm 1 CS525: Special Topics in DBs Large-Scale Data Management Presented By Kelly Technologies
MapReduce: Simplified Data Processing on Large Clusters By Dinesh Dharme.
Spark System Background Matei Zaharia  [June HotCloud ]  Spark: Cluster Computing with Working Sets  [April NSDI.
1 Student Date Time Wei Li Nov 30, 2015 Monday 9:00-9:25am Shubbhi Taneja Nov 30, 2015 Monday9:25-9:50am Rodrigo Sanandan Dec 2, 2015 Wednesday9:00-9:25am.
BIG DATA/ Hadoop Interview Questions.
Resilient Distributed Datasets A Fault-Tolerant Abstraction for In-Memory Cluster Computing Matei Zaharia, Mosharaf Chowdhury, Tathagata Das, Ankur Dave,
CSCI5570 Large Scale Data Processing Systems Distributed Data Analytics Systems Slide Ack.: modified based on the slides from Matei Zaharia James Cheng.
Big Data is a Big Deal!.
Distributed Programming in “Big Data” Systems Pramod Bhatotia wp
Spark Presentation.
Lecture 3: Bringing it all together
Apache Spark Resilient Distributed Datasets: A Fault-Tolerant Abstraction for In-Memory Cluster Computing Aditya Waghaye October 3, 2016 CS848 – University.
Introduction to Spark.
MapReduce Simplied Data Processing on Large Clusters
湖南大学-信息科学与工程学院-计算机与科学系
CS110: Discussion about Spark
Apache Spark Lecture by: Faria Kalim (lead TA) CS425, UIUC
Overview of big data tools
Discretized Streams: A Fault-Tolerant Model for Scalable Stream Processing Zaharia, et al (2012)
Apache Spark Lecture by: Faria Kalim (lead TA) CS425 Fall 2018 UIUC
Introduction to MapReduce
5/7/2019 Map Reduce Map reduce.
Fast, Interactive, Language-Integrated Cluster Computing
MapReduce: Simplified Data Processing on Large Clusters
Lecture 29: Distributed Systems
Presentation transcript:

Operating Systems and The Cloud, Part II: Search => Cluster Apps => Scalable Machine Learning David E. Culler CS162 – Operating Systems and Systems Programming Lecture 40 December 3, 2014 Proj: CP 2 today Reviews in R&R Mid3 12/15

The Data Center as a System Clusters became THE architecture for large scale internet services –Distribute disks, files, I/O, net, and compute over everything –Massive AND Incremental scalability Search Engines the initial “Killer App” Multiple components as Cluster Apps –Web crawl, Index, Search & Rank, Network, … Global Layer as a Master/Worker pattern –GFS, HDFS Map Reduce framework address core of search on massive scale – and much more –Indexing, log analysis, data querying –Collating, inverted indexes : map(k,v) => f(k,v),(k,v) –Filtering, Parsing, Validation –Sorting 12/1/14UCB CS162 Fa14 L39 2 Lessons from Giant-Scale Services, Eric Brewer, IEEE Computer, Jul 2001

Research … 12/1/14UCB CS162 Fa14 L39 3

The Data Center as a System Clusters became THE architecture for large scale internet services –Distribute disks, files, I/O, net, and compute over everything –Massive AND Incremental scalability Search Engines the initial “Killer App” Multiple components as Cluster Apps –Web crawl, Index, Search & Rank, Network, … Global Layer as a Master/Worker pattern –GFS, HDFS Map Reduce framework address core of search on massive scale – and much more –Indexing, log analysis, data querying –Collating, inverted indexes : map(k,v) => f(k,v),(k,v) –Filtering, Parsing, Validation –Sorting, –Graph Processing (???) – page rank, –Cross-correlation (???) –Machine Learning (???) 12/1/14UCB CS162 Fa14 L39 4

Time Travel It’s not just storing it, it’s what you do with the data 12/1/14UCB CS162 Fa14 L39 5

Velox Model Serving Tachyon Spark Streamin g SparkSQ L BlinkDB GraphX MLlib MLBa se Spark R Cancer Genomics, Energy Debugging, Smart Buildings Sample Clean Apache Spark Berkeley Data Analytics Stack HDFS, S3, … Apache MesosYarn Resource Virtualization Storage Processing Engine Access and Interfaces In-house Apps Tachyon 12/1/14UCB CS162 Fa14 L39 6

The Data Deluge Billions of users connected through the net –WWW, Facebook, twitter, cell phones, … –80% of the data on FB was produced last year Clock Rates stalled Storage getting cheaper –Store more data! 12/1/14UCB CS162 Fa14 L39 7

Data Grows Faster than Moore’s Law Projected Growth Increase over /1/14UCB CS162 Fa14 L39 8

Complex Questions Hard questions –What is the impact on traffic and home prices of building a new ramp? Detect real-time events –Is there a cyber attack going on? Open-ended questions –How many supernovae happened last year? 12/1/14UCB CS162 Fa14 L39 9

MapReduce Pros Distribution is completely transparent –Not a single line of distributed programming (ease, correctness) Automatic fault-tolerance –Determinism enables running failed tasks somewhere else again –Saved intermediate data enables just re-running failed reducers Automatic scaling –As operations as side-effect free, they can be distributed to any number of machines dynamically Automatic load-balancing –Move tasks and speculatively execute duplicate copies of slow tasks (stragglers) 12/1/14UCB CS162 Fa14 L39 10

HDFS – distributed file system Blocks are distributed, with replicas, across nodes Name-node provides the index structure Client locates blocks via RPC to metadata Data nodes inform Namenode of failures through heartbeats Block locations made visible to MapReduce Framework 12/1/14UCB CS162 Fa14 L39 11 Master server manages file systems namespace Regulates client access Mapping to data nodes

MapReduce both a programming model and a clustered computing system –A specific pattern of computing over large distributed data sets –A system which takes a MapReduce-formulated problem and executes it on a large cluster –Hides implementation details, such as hardware failures, grouping and sorting, scheduling …

Word-Count using MapReduce Problem: determine the frequency of each word in a large document collection

General MapReduce Formulation Map: –Preprocesses a set of files to generate intermediate key-value pairs, in parallel Group: –Partitions intermediate key-value pairs by unique key, generating a list of all associated values »Shuffle so each key list is all on a node Reduce: –For each key, iterate over value list –Performs computation that requires context between iterations –Parallelizable amongst different keys, but not within one key

MapReduce Logical Execution 12/1/14UCB CS162 Fa14 L39 15

MapReduce Parallel Execution Shamelessly stolen from Jeff Dean’s OSDI ‘04 presentation

MapReduce Parallelization: Pipelining Fine grain tasks: many more map tasks than machines –Better dynamic load balancing –Minimizes time for fault recovery –Can pipeline the shuffling/grouping while maps are still running Example: 2000 machines -> 200,000 map reduce tasks Shamelessly stolen from Jeff Dean’s OSDI ‘04 presentation

MR Runtime Execution Example The following slides illustrate an example run of MapReduce on a Google cluster A sample job from the indexing pipeline, processes ~900 GB of crawled pages

MR Runtime (1 of 9) Shamelessly stolen from Jeff Dean’s OSDI ‘04 presentation

MR Runtime (2 of 9) Shamelessly stolen from Jeff Dean’s OSDI ‘04 presentation

MR Runtime (3 of 9) Shamelessly stolen from Jeff Dean’s OSDI ‘04 presentation

MR Runtime (4 of 9) Shamelessly stolen from Jeff Dean’s OSDI ‘04 presentation

MR Runtime (5 of 9) Shamelessly stolen from Jeff Dean’s OSDI ‘04 presentation

MR Runtime (6 of 9) Shamelessly stolen from Jeff Dean’s OSDI ‘04 presentation

MR Runtime (7 of 9) Shamelessly stolen from Jeff Dean’s OSDI ‘04 presentation

MR Runtime (8 of 9) Shamelessly stolen from Jeff Dean’s OSDI ‘04 presentation

MR Runtime (9 of 9) Shamelessly stolen from Jeff Dean’s OSDI ‘04 presentation

Fault Tolerance vis re-execution On Worker Failure: –Detect via periodic heartbeats –Re-execute completed and in-progress map tasks –Re-execute in progress reduce tasks –Task completion committed through master Master ??? 12/1/14UCB CS162 Fa14 L39 28

Admin Project 3 Reviews during R&R Mid 3 in Final Exam Group 1 (12/ ) –10 Evans (currently) 12/1/14UCB CS162 Fa14 L39 29

MapReduce Cons Restricted programming model –Not always natural to express problems in this model –Low-level coding necessary –Little support for iterative jobs (lots of disk access) –High-latency (batch processing) Addressed by follow-up research and Apache projects –Pig and Hive for high-level coding –Spark for iterative and low-latency jobs 12/1/14UCB CS162 Fa14 L39 30

Big Data Ecosystem Evolution MapReduce Pregel Dremel GraphLab Storm Giraph Drill Tez Impala S4 … Specialized systems (iterative, interactive and streaming apps) General batch processing

AMPLab Unification Philosophy Don’t specialize MapReduce – Generalize it! Two additions to Hadoop MR can enable all the models shown earlier! 1. General Task DAGs 2. Data Sharing For Users: Fewer Systems to Use Less Data Movement Spark Streaming GraphX … SparkSQL MLbase

UCB / Apache Spark Motivation Complex jobs, interactive queries and online processing all need one thing that MR lacks: Efficient primitives for data sharing Stage 1 Stage 2 Stage 3 Iterative job Query 1 Query 2 Query 3 Interactive mining Job 1 Job 2 … Stream processing 12/1/14UCB CS162 Fa14 L39 33

Spark Motivation Complex jobs, interactive queries and online processing all need one thing that MR lacks: Efficient primitives for data sharing Stage 1 Stage 2 Stage 3 Iterative job Query 1 Query 2 Query 3 Interactive mining Job 1 Job 2 … Stream processing Problem: in MR, the only way to share data across jobs is using stable storage (e.g. file system)  slow! 12/1/14UCB CS162 Fa14 L39 34

Examples iter. 1 iter Input HDFS read HDFS write HDFS read HDFS write Input query 1 query 2 query 3 result 1 result 2 result 3... HDFS read Opportunity: DRAM is getting cheaper  use main memory for intermediate results instead of disks 12/1/14UCB CS162 Fa14 L39 35

Velox Model Serving Tachyon Spark Streamin g SparkSQ L BlinkDB GraphX MLlib MLBa se Spark R Cancer Genomics, Energy Debugging, Smart Buildings Sample Clean Apache Spark Berkeley Data Analytics Stack (open source software) HDFS, S3, … Apache MesosYarn Resource Virtualization Storage Processing Engine Access and Interfaces In-house Apps Tachyon

In-Memory Dataflow System M. Zaharia, M. Choudhury, M. Franklin, I. Stoica, S. Shenker, “Spark: Cluster Computing with Working Sets, USENIX HotCloud, “It’s only September but it’s already clear that 2014 will be the year of Apache Spark” -- Datanami, 9/15/14

Iteration in Map-Reduce Training Data MapReduce Learned Model w (1) w (2) w (3) w (0) Initial Model

Cost of Iteration in Map-Reduce MapReduce Learned Model w (1) w (2) w (3) w (0) Initial Model Training Data Read 1 Read 2 Read 3 Repeatedly load same data

Cost of Iteration in Map-Reduce MapReduce Learned Model w (1) w (2) w (3) w (0) Initial Model Training Data Redundantly save output between stages

Dataflow View Training Data (HDFS) Training Data (HDFS) Map Reduc e Map Reduc e Map Reduc e

Memory Opt. Dataflow Training Data (HDFS) Training Data (HDFS) Map Reduc e Map Reduc e Map Reduc e Cached Load

Memory Opt. Dataflow View Training Data (HDFS) Training Data (HDFS) Map Reduc e Map Reduc e Map Reduc e Efficiently move data between stages Spark: × faster than Hadoop MapReduce

Resilient Distributed Datasets (RDDs) API: coarse-grained transformations (map, group-by, join, sort, filter, sample,…) on immutable collections Resilient Distributed Datasets (RDDs) –Collections of objects that can be stored in memory or disk across a cluster –Built via parallel transformations (map, filter, …) –Automatically rebuilt on failure Rich enough to capture many models: –Data flow models: MapReduce, Dryad, SQL, … –Specialized models: Pregel, Hama, … M. Zaharia, et al, Resilient Distributed Datasets: A fault-tolerant abstraction for in-memory cluster computing, NSDI 2012.

Fault Tolerance with RDDs RDDs track the series of transformations used to build them (their lineage) –Log one operation to apply to many elements –No cost if nothing fails Enables per-node recomputation of lost data messages = textFile(...).filter(_.contains(“error”)).map(_.split(‘\t’)(2)) HadoopRDD path = hdfs://… HadoopRDD path = hdfs://… FilteredRDD func = _.contains(...) FilteredRDD func = _.contains(...) MappedRDD func = _.split(…) MappedRDD func = _.split(…)

Velox Model Serving Tachyon Spark Streamin g SparkSQ L BlinkDB GraphX MLlib MLBa se Spark R Cancer Genomics, Energy Debugging, Smart Buildings Sample Clean Apache Spark Systems Research as Time Travel … HDFS, S3, … Apache MesosYarn Resource Virtualization Storage Processing Engine Access and Interfaces In-house Apps Tachyon 12/1/14UCB CS162 Fa14 L39 46