
1 MapReduce: Simplified Data Processing on Large Clusters. Jeffrey Dean and Sanjay Ghemawat, Operating Systems Design and Implementation (OSDI) 2004. Presented by P76001027 謝光昱 and P76011284 陳志豪.

2 Outline: Introduction, Programming Model, Implementation, Refinements, Performance, Experience, Conclusions

3 Introduction
Motivation: Most computations are conceptually straightforward, but the input data is usually large and the computation has to finish in a reasonable amount of time.
Problem: The issues of how to parallelize the computation, distribute the data, and handle failures obscure the original simple computation with large amounts of complex code.
Solution: Design a new abstraction, MapReduce, that hides these details.

4 Introduction, Programming Model, Implementation, Refinements, Performance, Experience, Conclusions

5 Programming Model
The computation takes a set of input key/value pairs and produces a set of output key/value pairs.
Map: Takes an input pair and produces a set of intermediate key/value pairs. The library groups together all intermediate values associated with the same intermediate key I and passes them to the Reduce function.
Reduce: Accepts an intermediate key I and a set of values for that key, and merges these values together to form a possibly smaller set of values.
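In the paper's notation the types are map: (k1, v1) -> list(k2, v2) and reduce: (k2, list(v2)) -> list(v2). A small Python typing sketch of those shapes (the alias names are ours, for illustration only):

  from typing import Callable, Iterable, Tuple, TypeVar

  K1 = TypeVar("K1"); V1 = TypeVar("V1")
  K2 = TypeVar("K2"); V2 = TypeVar("V2")

  # map: (k1, v1) -> list((k2, v2))
  MapFn = Callable[[K1, V1], Iterable[Tuple[K2, V2]]]
  # reduce: (k2, list(v2)) -> list(v2)
  ReduceFn = Callable[[K2, Iterable[V2]], Iterable[V2]]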

6 Programming Model Example: WordCount

  map(String key, String value):
    // key: document name
    // value: document contents
    for each word w in value:
      EmitIntermediate(w, "1");

  reduce(String key, Iterator values):
    // key: a word
    // values: a list of counts
    int result = 0;
    for each v in values:
      result += ParseInt(v);
    Emit(AsString(result));
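As a runnable companion to the paper's pseudocode, here is a minimal Python sketch of the same WordCount logic; the function names and the tiny sequential driver are ours, standing in for the MapReduce library:

  from collections import defaultdict

  def word_count_map(doc_name, contents):
      # Emit an intermediate (word, "1") pair for every word in the document.
      for word in contents.split():
          yield (word, "1")

  def word_count_reduce(word, counts):
      # Sum all the counts emitted for this word.
      yield str(sum(int(c) for c in counts))

  # Stand-in for the library: group intermediate pairs by key, then reduce.
  intermediate = defaultdict(list)
  for name, text in [("doc1", "a b b"), ("doc2", "b c")]:
      for k, v in word_count_map(name, text):
          intermediate[k].append(v)
  for word in sorted(intermediate):
      print(word, next(word_count_reduce(word, intermediate[word])))
      # prints: a 1 / b 3 / c 1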

7 7 Introduction Programming Model Implementation Refinements Performance Experience Conclusions

8 Implementation: Execution Overview
[Figure: the user program forks a master and many workers. The master assigns map tasks and reduce tasks. Map workers read the input splits (split 0 through split 4) and write intermediate files to their local disks; reduce workers remotely read the intermediate data and write the output files (output file 0, output file 1).]
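To make this data flow concrete, the following single-process Python sketch (an illustration only, not the distributed implementation; the M and R values are arbitrary) has each of M map tasks bucket its output into R partitions, and each of R reduce tasks consume its partition from every map task:

  from collections import defaultdict

  M, R = 3, 2  # number of map tasks and reduce tasks

  def mapper(split):
      for word in split.split():
          yield word, 1

  splits = ["a b b", "b c", "c c a"]  # stand-ins for the 64MB input splits

  # Map phase: each map task buckets its output into R partitions.
  partitions = [[defaultdict(list) for _ in range(R)] for _ in range(M)]
  for m, split in enumerate(splits):
      for key, value in mapper(split):
          partitions[m][hash(key) % R][key].append(value)

  # Reduce phase: reduce task r reads partition r of every map task,
  # groups values by key, and applies the reduce function (here: sum).
  for r in range(R):
      merged = defaultdict(list)
      for m in range(M):
          for key, values in partitions[m][r].items():
              merged[key].extend(values)
      for key in sorted(merged):
          print(f"output-{r}:", key, sum(merged[key]))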

9 Implementation: Fault Tolerance, Worker Failure
The master pings every worker periodically; a worker that does not respond in time is marked as failed. Any map tasks completed by the failed worker are reset back to the idle state and rescheduled on other workers.
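A sketch of the bookkeeping this implies (the field names are ours; the paper also notes that completed reduce tasks need no re-execution, since their output is stored in the global file system, while completed map output lives on the failed machine's local disk):

  IDLE, IN_PROGRESS, COMPLETED = "idle", "in_progress", "completed"

  def handle_worker_failure(tasks, failed_worker):
      # Reset tasks so the master can reschedule them on healthy workers.
      for task in tasks:
          if task["worker"] != failed_worker:
              continue
          # Completed map tasks must be redone: their output is unreachable.
          if task["state"] == IN_PROGRESS or (
              task["state"] == COMPLETED and task["kind"] == "map"
          ):
              task["state"], task["worker"] = IDLE, None

  tasks = [
      {"kind": "map", "state": COMPLETED, "worker": "w1"},
      {"kind": "reduce", "state": COMPLETED, "worker": "w1"},
      {"kind": "map", "state": IN_PROGRESS, "worker": "w2"},
  ]
  handle_worker_failure(tasks, "w1")
  print([t["state"] for t in tasks])  # ['idle', 'completed', 'in_progress']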

10 Implementation: Master Failure
The master writes periodic checkpoints of the master data structures, so a new copy could be restarted from the last checkpointed state. Given that there is only a single master, its failure is unlikely; the current implementation simply aborts the computation if the master fails.

11 Implementation: Locality
To conserve network bandwidth, the input data is stored (as 64MB blocks, with replicas) on the local disks of the cluster machines, and the master tries to schedule each map task on or near a machine that holds a copy of the corresponding input data.
[Figure: input files in 64MB blocks with copies; the master assigns map-phase workers splits whose copies are stored locally.]

12 Implementation: Backup Tasks
The total time is lengthened by stragglers. A straggler is a machine that takes an unusually long time to complete one of the last tasks in the computation, e.g. a machine with a bad disk whose read performance has slowed. When a MapReduce operation is close to completion, the master schedules backup executions of the remaining in-progress tasks. The task is marked as completed whenever either the primary or the backup execution completes.
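A minimal sketch of this policy (the 95% threshold and all names are illustrative assumptions, not from the paper):

  def maybe_schedule_backups(tasks, idle_workers, threshold=0.95):
      # Only act when the job is close to completion.
      done = sum(t["state"] == "completed" for t in tasks)
      if done < threshold * len(tasks):
          return
      for task in tasks:
          # Launch at most one duplicate of each in-progress task; the task
          # counts as done when either the primary or the backup finishes.
          if task["state"] == "in_progress" and "backup" not in task and idle_workers:
              task["backup"] = idle_workers.pop()

  tasks = [{"state": "completed"}] * 19 + [{"state": "in_progress"}]
  workers = ["spare-worker"]
  maybe_schedule_backups(tasks, workers)
  print(tasks[-1].get("backup"))  # spare-worker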

13 Introduction, Programming Model, Implementation, Refinements, Performance, Experience, Conclusions

14 Refinements: Partitioning Function
Intermediate data gets partitioned across the R reduce tasks using a partitioning function on the intermediate key. The default partitioning function uses hashing, e.g. hash(key) mod R.
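A sketch of the default, plus the paper's example of a user-defined partitioner that sends all URLs from the same host to the same output file (the Python function names are ours):

  from urllib.parse import urlparse

  R = 4  # number of reduce tasks, and hence output files

  def default_partition(key, R):
      # The library default: hash(key) mod R.
      return hash(key) % R

  def host_partition(url_key, R):
      # User-defined: URL keys with the same hostname end up in the
      # same partition, and therefore in the same output file.
      return hash(urlparse(url_key).hostname) % R

  print(default_partition("apple", R))
  print(host_partition("http://example.com/a", R) ==
        host_partition("http://example.com/b", R))  # True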

15 Refinements: Combiner Function
There is significant repetition in the intermediate keys; word frequencies tend to follow a Zipf distribution, so e.g. a word-counting map task will emit the pair (the, 1) many thousands of times. A combiner function partially merges this data on the map worker before it is sent over the network. The only difference between a reduce function and a combiner function is how the library handles their output: combiner output is written to an intermediate file, while reduce output is written to the final output file.
[Figure: a map worker's repeated intermediate keys (a b b a c c b c) are merged by the combiner before being shipped to the reduce worker.]
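A sketch of a word-count combiner, using the keys from the figure above (the function name and list-based calling convention are ours):

  from collections import defaultdict

  def combine(intermediate_pairs):
      # Partially merge map output locally, before it is written to the
      # intermediate file and fetched over the network by reducers.
      partial = defaultdict(int)
      for word, count in intermediate_pairs:
          partial[word] += int(count)
      return [(w, str(c)) for w, c in sorted(partial.items())]

  pairs = [("a", "1"), ("b", "1"), ("b", "1"), ("a", "1"),
           ("c", "1"), ("c", "1"), ("b", "1"), ("c", "1")]
  print(combine(pairs))  # [('a', '2'), ('b', '3'), ('c', '3')]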

16 Refinements: Skipping Bad Records
Some bugs cause the Map or Reduce function to crash deterministically and prevent a MapReduce operation from completing. Sometimes fixing the bug is not feasible, e.g. when it is in a third-party library whose source code is unavailable, and sometimes it is acceptable to ignore a few records, e.g. when doing statistical analysis.
Method: Each worker process installs a signal handler that catches segmentation violations and bus errors and reports the offending record's sequence number to the master. If the master has seen more than one failure on a particular record, it indicates that the record should be skipped when it issues the next re-execution.
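A sketch of the master-side bookkeeping (the names are ours; the threshold follows the paper's "more than one failure" rule):

  from collections import Counter

  failure_counts = Counter()  # record sequence number -> observed crashes

  def on_last_gasp(record_seqno):
      # Called when a dying worker's signal handler reports which
      # record it was processing when it crashed.
      failure_counts[record_seqno] += 1

  def should_skip(record_seqno):
      # Skip on re-execution once more than one failure has been seen.
      return failure_counts[record_seqno] > 1

  on_last_gasp(41); on_last_gasp(41)
  print(should_skip(41), should_skip(7))  # True False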

17 Refinements: Status Information
The master runs an internal HTTP server and exports a set of status pages for human consumption.

18 Refinements [Figure: example of the master's status pages.]

19 Refinements [Figure: example of the master's status pages.]

20 Refinements: Counters
The MapReduce library provides a counter facility to count occurrences of various events, e.g. user code may want to count the total number of words processed.
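A single-process sketch of such a facility (in the real library each worker's counts are propagated to the master, which aggregates them; the class and function names here are ours, and the example counts uppercase words):

  class Counter:
      def __init__(self, name):
          self.name, self.value = name, 0
      def increment(self, n=1):
          self.value += n

  uppercase = Counter("uppercase")

  def word_count_map(doc_name, contents):
      for word in contents.split():
          if word.isupper():
              uppercase.increment()  # counted alongside normal map output
          yield word, "1"

  list(word_count_map("doc", "FOO bar BAZ"))  # drain the generator
  print(uppercase.name, uppercase.value)      # uppercase 2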

21 Introduction, Programming Model, Implementation, Refinements, Performance, Experience, Conclusions

22 Performance: Cluster Configuration
Approximately 1,800 machines. Each machine had: two 2GHz Intel Xeon processors with Hyper-Threading enabled, 4GB of memory, two 160GB IDE disks, and a gigabit Ethernet link. The machines were arranged in a two-level tree-shaped switched network.

23 Performance: Grep
Scans through 10^10 100-byte records, searching for a relatively rare three-character pattern. The input is split into approximately 64MB pieces (M = 15,000), and the entire output is placed in one file (R = 1).
[Figure: data transfer rate over time with 1764 workers; roughly the first 60 seconds of the ~150-second run are overhead: propagating the program to the workers and interacting with GFS.]

24 Performance: Sort
The program sorts 10^10 100-byte records. The input data is split into 64MB pieces (M = 15,000), and the sorted output is partitioned into 4,000 files (R = 4,000).

25 Performance
[Figure: sort data transfer rates over time. The input rate is less than for grep. Shuffling starts as soon as the first map task completes; the first hump is the first batch of reduce tasks, and after a delay a second hump covers the remaining reduce tasks (roughly the 300-600 second range). Input rate > shuffle rate > output rate: the output phase writes two copies of each file for reliability and availability, while the input rate benefits from the locality optimization. The run completes around 850 seconds.]

26 Performance
[Figure: sort with backup tasks disabled. After 960 seconds all but a few straggler reduce tasks have completed; the stragglers stretch the total run to 1283 seconds, an increase of 44%!!]

27 Performance
[Figure: sort with worker processes deliberately killed mid-run. The killed work is re-executed, and the run completes around 890 seconds, an increase of only 5%.]

28 Introduction, Programming Model, Implementation, Refinements, Performance, Experience, Conclusions

29 Experience
Broad applications at Google: large-scale machine learning problems, clustering problems for the Google News and Froogle products, extraction of data used to produce reports of popular queries, extraction of properties of web pages for new experiments and products, and large-scale graph computations.

30 Experience: Large-Scale Indexing
MapReduce was used to rewrite the production indexing system that produces the data structures used for the Google web search service. Benefits of using MapReduce:
The indexing code is simpler, smaller, and easier to understand, e.g. one phase dropped from about 3,800 lines of C++ to about 700 lines using MapReduce.
The performance of the MapReduce library is good enough that it is easy to change the indexing process.
It is easier to add new machines to the indexing cluster.

31 Introduction, Programming Model, Implementation, Refinements, Performance, Experience, Conclusions

32 Conclusions
The MapReduce programming model has been used successfully at Google for many different purposes: the model is easy to use, and a large variety of problems are easily expressible as MapReduce computations.
Google developed an implementation of MapReduce that scales to large clusters comprising thousands of machines.
Redundant execution can be used to reduce the impact of slow machines, and to handle machine failures and data loss.

33 Thank you!!

