
MapReduce Programming Model

HPC Cluster Computing Challenges
- Programmability: algorithms must be parallelized manually
  - Must look at problems from a parallel standpoint
  - Tightly coupled problems require frequent communication (more of the slow part!)
  - We want to decouple the problem: increase data locality, balance the workload, etc.
- Parallel efficiency: communication is the fundamental difficulty
  - Distributing data, updating shared resources, and communicating results
  - Machines have separate memories, so the usual inter-process communication is unavailable; everything must cross the network
  - This introduces inefficiencies: overhead, waiting, etc.

Programming Models: What is MPI?
- MPI: Message Passing Interface
  - World's most popular high-performance distributed API
  - The "de facto standard" in scientific computing
  - Bindings for C and FORTRAN; version 2 released in 1997
- What is MPI good for?
  - Abstracts away common network communication
  - Allows lots of control without bookkeeping
  - Freedom and flexibility come with complexity: some 300 subroutines, though serious programs can be written with fewer than 10
- Basics:
  - One executable runs on every node (SPMD: single program, multiple data)
  - Each node's process is assigned a rank ID number
  - Call API functions to send messages, e.g. send/receive of a block of data (in an array):
      MPI_Send(start, count, datatype, dest, tag, comm_context)
      MPI_Recv(start, count, datatype, source, tag, comm_context, status)

Challenges with MPI
- Poor programmability:
  - Essentially socket programming, with support for data structures
  - Programmers must take care of everything: data distribution, inter-process communication, orchestration
  - Blocking communication can cause deadlock; below, each process blocks in its receive, waiting for a send the other never reaches:
      Proc1: MPI_Receive(Proc2, A); MPI_Send(Proc2, B);
      Proc2: MPI_Receive(Proc1, B); MPI_Send(Proc1, A);
- Potential for high parallel efficiency, but:
  - Large overhead from communication mismanagement
    - Time spent blocking is wasted cycles
    - Computation can overlap with non-blocking communication, but the programmer must arrange it
  - Load imbalance is possible! And what about dead machines?

Google's MapReduce
- Large-scale data processing
  - Want to use hundreds or thousands of CPUs... but this needs to be easy!
- MapReduce provides:
  - Automatic parallelization and distribution
  - Fault tolerance
  - I/O scheduling
  - Monitoring and status updates

Programming Concept
- Map
  - Apply a function to each value in a data set, creating a new list of values
  - Example:
      square x = x * x
      map square [1,2,3,4,5] returns [1,4,9,16,25]
- Reduce
  - Combine the values in a data set into a single new value
  - Example:
      sum = (for each elem in arr: total += elem)
      reduce sum [1,2,3,4,5] returns 15 (the sum of the elements)
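
These two list operations map directly onto Java's stream API. A minimal sketch of the same square and sum examples (the class name is just for illustration):

  import java.util.List;
  import java.util.stream.Collectors;

  public class MapReduceConcept {
      public static void main(String[] args) {
          List<Integer> nums = List.of(1, 2, 3, 4, 5);

          // map: apply a function to every element, producing a new list
          List<Integer> squares = nums.stream()
                                      .map(x -> x * x)
                                      .collect(Collectors.toList());

          // reduce: fold all elements into a single value
          int sum = nums.stream().reduce(0, Integer::sum);

          System.out.println(squares); // [1, 4, 9, 16, 25]
          System.out.println(sum);     // 15
      }
  }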

Map/Reduce
- map(key, val) is run on each item in the input set
  - Processes one input key/value pair
  - Emits a set of intermediate new-key/new-value pairs
- reduce(key, vals) is run once for each unique key emitted by map()
  - Combines all intermediate values for that particular key
  - Emits a set of merged output values (usually just one)
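
In type terms, map takes one (k1, v1) pair to a list of intermediate (k2, v2) pairs, and reduce takes one k2 together with all of its v2 values to a list of merged outputs. A sketch of those shapes in Java (these interfaces are illustrative only, not Hadoop's or Google's actual API):

  import java.util.List;
  import java.util.Map;

  // Illustrative signatures for the two user-supplied functions.
  interface MapFn<K1, V1, K2, V2> {
      // One input pair produces many intermediate pairs.
      List<Map.Entry<K2, V2>> map(K1 key, V1 value);
  }

  interface ReduceFn<K2, V2, V3> {
      // All intermediate values for one key produce merged outputs (usually one).
      List<V3> reduce(K2 key, List<V2> values);
  }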

Count Words in Docs: An Example
- Input consists of (url, contents) pairs
- map(key=url, val=contents):
  - For each word w in contents, emit (w, "1")
- reduce(key=word, values=uniq_counts):
  - Sum all the "1"s in the values list
  - Emit the result (word, sum)

Count, Illustrated
- Input documents: "see bob throw" and "see spot run"
- map emits: (see, 1), (bob, 1), (throw, 1), (see, 1), (spot, 1), (run, 1)
- reduce sums the counts for each word: (bob, 1), (run, 1), (see, 2), (spot, 1), (throw, 1)

MapReduce WordCount Example
The data lives on a distributed file system (DFS). "Mapper" nodes are responsible for the map function:

  map(String input_key, String input_value):
      // input_key : document name (or line of text)
      // input_value: document contents
      for each word w in input_value:
          EmitIntermediate(w, "1");

"Reducer" nodes are responsible for the reduce function:

  reduce(String output_key, Iterator intermediate_values):
      // output_key : a word
      // output_values: a list of counts
      int result = 0;
      for each v in intermediate_values:
          result += ParseInt(v);
      Emit(AsString(result));

MapReduce WordCount Java Code (Garcia, UCB)
(This slide showed the Hadoop Java implementation as an image; the full listing appears in the "MapReduce Example 1/4" through "4/4" slides at the end of this deck.)

Execution Overview
How is this distributed?
1. Partition the input key/value pairs into chunks and run map() tasks in parallel
2. After all map()s are complete, consolidate all emitted values for each unique intermediate key
3. Partition the space of intermediate keys and run reduce() tasks in parallel
If a map() or reduce() task fails, re-execute it!
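
Step 2 hinges on a deterministic partition function: every map worker must route a given intermediate key to the same one of the R reduce tasks. A minimal sketch of such a function (this is also what Hadoop's default HashPartitioner computes):

  public class PartitionSketch {
      // Every worker sends intermediate key k to reduce task partitionFor(k, R).
      static int partitionFor(String key, int numReduceTasks) {
          // Mask off the sign bit so the result is always non-negative.
          return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
      }

      public static void main(String[] args) {
          System.out.println(partitionFor("see", 5)); // same answer on every worker
      }
  }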

Execution Overview (Cont’d)

MapReduce WordCount Diagram (Garcia, UCB)
(Figure: seven input files holding the words ah, er, if, or, and uh pass through map tasks that emit (word, "1") pairs; the shuffle routes all counts for a word, e.g. ah:1,1,1,1 and or:1,1,1, to one reducer, which emits the final per-word totals for ah, er, if, or, and uh.)

Map
- Reads the contents of its assigned portion of the input file
- Parses and prepares the data for input to the map function (e.g. extracts text from HTML)
- Passes the data into the map function and buffers the results in memory
- Periodically writes completed work to local disk
- Notifies the master of this partially completed work (intermediate data)

Reduce
- Receives notification from the master of partially completed work
- Retrieves intermediate data from the map machine via remote read
- Sorts the intermediate data by key (e.g. by target page)
- Iterates over the sorted data; for each unique key, sends the corresponding set of values through the reduce function
- Appends the result of the reduce function to the final output file (on GFS)
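
Because the intermediate data has been sorted by key, that iteration is a single linear scan: gather the contiguous run of values sharing a key, call reduce once, and move on. A small self-contained sketch of the loop (names and types are illustrative):

  import java.util.ArrayList;
  import java.util.List;
  import java.util.function.BiConsumer;

  public class ReduceScanSketch {
      // sortedPairs holds intermediate (key, value) pairs, already sorted by key.
      static void runReduce(List<String[]> sortedPairs,
                            BiConsumer<String, List<String>> reduceFn) {
          int i = 0;
          while (i < sortedPairs.size()) {
              String key = sortedPairs.get(i)[0];
              List<String> values = new ArrayList<>();
              // Gather the contiguous run of values that share this key.
              while (i < sortedPairs.size() && sortedPairs.get(i)[0].equals(key)) {
                  values.add(sortedPairs.get(i)[1]);
                  i++;
              }
              reduceFn.accept(key, values); // one reduce call per unique key
          }
      }

      public static void main(String[] args) {
          List<String[]> pairs = List.of(
                  new String[]{"bob", "1"},
                  new String[]{"see", "1"}, new String[]{"see", "1"},
                  new String[]{"spot", "1"});
          runReduce(pairs, (k, vs) -> System.out.println(k + " -> " + vs.size()));
      }
  }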

Parallel Execution

Task Granularity & Pipelining
- Fine-granularity tasks: many more map tasks than machines
  - Minimizes time for fault recovery
  - Shuffling can be pipelined with map execution
  - Better dynamic load balancing
- Often around 200,000 map tasks and 5,000 reduce tasks, running on 2,000 machines
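
(To see why fine granularity helps recovery: 200,000 map tasks on 2,000 machines is roughly 100 map tasks per machine, so when a machine dies its share of the work is re-executed as about 100 small tasks spread across the cluster rather than as one machine-sized chunk.)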

Fault Tolerance / Workers
Handled via re-execution:
- Detect failure: the master periodically pings workers
  - Any machine that does not respond is considered "dead"
- Re-execute a dead worker's completed and in-progress map tasks
  - The intermediate data stored on its local disk becomes unreachable
- Re-execute only its in-progress reduce tasks (completed reduce output already sits in the global file system)
- Task completion is committed through the master
Robust: once lost 1,600 of 1,800 machines and the job still finished OK

Master Failure
- Could be handled by having the master write periodic checkpoints of its data structures
- But this isn't done yet (master failure is unlikely)

Refinement: Redundant Execution
- Slow workers significantly delay completion time
  - Other jobs consuming resources on the machine
  - Bad disks with soft errors transfer data slowly
  - Weird things: processor caches disabled (!!)
- Solution: near the end of a phase, schedule redundant executions of the in-progress tasks
  - Whichever copy finishes first "wins"
- Dramatically shortens job completion time

Refinement: Locality Optimization
- Master scheduling policy:
  - Asks GFS for the locations of the replicas of the input file blocks
  - Input is typically split into 64 MB chunks (the GFS block size)
  - Map tasks are scheduled so that a replica of their input block is on the same machine or the same rack
- Effect:
  - Thousands of machines read input at local disk speed
  - Without this, rack switches would limit the read rate

Refinement: Skipping Bad Records
- Map/reduce functions sometimes fail on particular inputs
- The best solution is to debug and fix the bug
  - Not always possible: the fault may be inside a third-party source library
- On a segmentation fault:
  - A signal handler sends a UDP packet to the master
  - The packet includes the sequence number of the record being processed
- If the master sees two failures for the same record:
  - The next worker is told to skip that record
  - (Ignoring a few records is acceptable when doing statistical analysis on a large data set)

Other Refinements
- Sorting guarantees within each reduce partition
- Compression of intermediate data
- Combiner: useful for saving network bandwidth
- Local execution for debugging/testing
- User-defined counters
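
Of these, the combiner deserves a concrete picture: it pre-aggregates each map task's output locally, so each distinct key crosses the network once (the Hadoop example at the end of this deck wires this up with conf.setCombinerClass(Reduce.class)). A sketch of the idea, with illustrative names:

  import java.util.HashMap;
  import java.util.Map;

  public class CombinerSketch {
      // Locally pre-aggregate a map task's emitted words before the shuffle,
      // so ("the", 1) emitted N times ships as a single ("the", N) pair.
      static Map<String, Integer> combine(String[] emittedWords) {
          Map<String, Integer> partial = new HashMap<>();
          for (String w : emittedWords) {
              partial.merge(w, 1, Integer::sum);
          }
          return partial;
      }

      public static void main(String[] args) {
          String[] emitted = {"the", "quick", "the", "the"};
          System.out.println(combine(emitted)); // e.g. {the=3, quick=1}
      }
  }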

MapReduce: In Summary
- Now it's easy to program for many CPUs
  - Communication management is effectively gone
    - I/O scheduling is done for us
  - Fault tolerance and monitoring: machine failures, suddenly-slow machines, etc. are handled
  - Can be much easier to design and program!
  - Several (many?) MapReduce tasks can be cascaded
- But... it restricts the class of solvable problems
  - It might be hard to express a problem in MapReduce terms
  - Data parallelism is key: you must be able to break the problem up by data chunks
  - Google's MapReduce implementation is closed-source C++
    - Hadoop is an open-source, Java-based rewrite

MapReduce Example 1/4

  package org.myorg;

  import java.io.IOException;
  import java.util.*;

  import org.apache.hadoop.fs.Path;
  import org.apache.hadoop.conf.*;
  import org.apache.hadoop.io.*;
  import org.apache.hadoop.mapred.*;
  import org.apache.hadoop.util.*;

  public class WordCount {

      // Mapper: (file offset, line of text) -> (word, 1) pairs
      public static class Map extends MapReduceBase
              implements Mapper<LongWritable, Text, Text, IntWritable> {
          private final static IntWritable one = new IntWritable(1);
          private Text word = new Text();

MapReduce Example 2/4

          public void map(LongWritable key, Text value,
                          OutputCollector<Text, IntWritable> output,
                          Reporter reporter) throws IOException {
              String line = value.toString();
              StringTokenizer tokenizer = new StringTokenizer(line);
              while (tokenizer.hasMoreTokens()) {
                  word.set(tokenizer.nextToken());
                  output.collect(word, one); // emit (word, 1)
              }
          }
      }

MapReduce Example 3/4

      // Reducer (also used as the combiner): sums the counts for each word
      public static class Reduce extends MapReduceBase
              implements Reducer<Text, IntWritable, Text, IntWritable> {
          public void reduce(Text key, Iterator<IntWritable> values,
                             OutputCollector<Text, IntWritable> output,
                             Reporter reporter) throws IOException {
              int sum = 0;
              while (values.hasNext()) {
                  sum += values.next().get();
              }
              output.collect(key, new IntWritable(sum));
          }
      }

MapReduce Example 4/4

      public static void main(String[] args) throws Exception {
          JobConf conf = new JobConf(WordCount.class);
          conf.setJobName("wordcount");

          conf.setOutputKeyClass(Text.class);
          conf.setOutputValueClass(IntWritable.class);

          conf.setMapperClass(Map.class);
          conf.setCombinerClass(Reduce.class); // local pre-aggregation
          conf.setReducerClass(Reduce.class);

          conf.setInputFormat(TextInputFormat.class);
          conf.setOutputFormat(TextOutputFormat.class);

          // Input and output paths come from the command line.
          FileInputFormat.setInputPaths(conf, new Path(args[0]));
          FileOutputFormat.setOutputPath(conf, new Path(args[1]));

          JobClient.runJob(conf);
      }
  }
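
To run the compiled example, package it into a jar and submit it with the standard hadoop launcher (the jar name and HDFS paths below are illustrative, not from the original slides):

  hadoop jar wordcount.jar org.myorg.WordCount /user/me/input /user/me/output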