
MapReduce : Simplified Data Processing on Large Clusters Hongwei Wang & Sihuizi Jin & Yajing Zhang 2014.10.6.


1 MapReduce : Simplified Data Processing on Large Clusters Hongwei Wang & Sihuizi Jin & Yajing Zhang 2014.10.6

2 Outline • Introduction • Programming model • Implementation • Refinements • Performance • Conclusion

3 1. Introduction

4 What is MapReduce • Originated at Google [OSDI'04] • A simple programming model • Functional in style • For large-scale data processing • Exploits large sets of commodity computers • Executes processing in a distributed manner • Offers high availability

5 Motivation • Great demand for very large-scale data processing: • the computations are conceptually straightforward • the input data is large, distributed across thousands of machines • The issues of how to parallelize the computation, distribute the data, and handle failures obscure the simple original computation with complex code that deals with these issues

6 Distributed Grep (figure): very big data → split data → grep → matches → cat → all matches

7 Distributed Word Count (figure): very big data → split data → count → merge → merged counts

8 Goal Design a new abstraction that lets us: • express the simple computation we are trying to perform • hide the messy details of parallelization, fault-tolerance, data distribution and load balancing in a library

9 2. Programming Model

10 Map + Reduce • Map: • accepts an input key/value pair • emits intermediate key/value pairs • Reduce: • accepts an intermediate key and a list of values for that key • emits output key/value pairs (Figure: very big data → MAP → partitioning function → REDUCE → result)

11 A Simple Example • Counting words in a large set of documents: map(String key, String value) // key: document name // value: document contents for each word w in value: EmitIntermediate(w, "1"); reduce(String key, Iterator values) // key: a word // values: a list of counts int result = 0; for each v in values: result += ParseInt(v); Emit(AsString(result));
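The slide's pseudocode can be sketched as a runnable single-machine version — a toy in-memory model, not Google's distributed implementation. Counts are kept as strings to mirror the slide's EmitIntermediate(w, "1") convention:

```python
from collections import defaultdict

def map_fn(doc_name, contents):
    # key: document name, value: document contents
    for word in contents.split():
        yield word, "1"

def reduce_fn(word, counts):
    # key: a word, values: a list of counts (as strings, as in the slide)
    return str(sum(int(c) for c in counts))

def run_mapreduce(documents):
    # "Shuffle": group intermediate pairs by key, then reduce each group
    intermediate = defaultdict(list)
    for name, contents in documents.items():
        for k, v in map_fn(name, contents):
            intermediate[k].append(v)
    return {k: reduce_fn(k, vs) for k, vs in intermediate.items()}

docs = {"a.txt": "to be or not to be", "b.txt": "to do"}
word_counts = run_mapreduce(docs)
```

Here `word_counts` maps each word to its total count across both documents, e.g. "to" appears three times.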

12 More Examples • Distributed Grep • Map: emits a line if it matches the pattern • Reduce: identity function • Count of URL Access Frequency • Map: processes logs of page requests and emits ⟨URL, 1⟩ • Reduce: adds together all values for the same URL and emits ⟨URL, total count⟩ • Distributed Sort • Map: extracts the key from each record and emits ⟨key, record⟩ • Reduce: identity function
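The URL-access-frequency example can be sketched the same way. The log format used here ("timestamp method url") is a made-up illustration, not a format from the paper:

```python
from collections import defaultdict

def map_url_access(log_line):
    # hypothetical log format: "<timestamp> <method> <url>"
    url = log_line.split()[2]
    yield url, 1  # emit <URL, 1> for each request

def reduce_url_access(url, counts):
    # emit <URL, total count>
    return url, sum(counts)

logs = [
    "t1 GET /index.html",
    "t2 GET /about.html",
    "t3 GET /index.html",
]

# group the intermediate <URL, 1> pairs by URL, then reduce each group
grouped = defaultdict(list)
for line in logs:
    for url, one in map_url_access(line):
        grouped[url].append(one)
url_freq = dict(reduce_url_access(u, cs) for u, cs in grouped.items())
```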

13 3. Implementation

14 Environment The implementation depends on the environment: • machines with dual x86 CPUs and 2-4 GB of memory • commodity networking hardware, 100 Mb/s or 1 Gb/s at the machine level • a cluster consisting of hundreds or thousands of machines • inexpensive IDE disks attached directly to the machines provide storage

15 Execution Overview

16 Execution Overview 1. Input data partitioning (M splits, each 16-64 MB); start up copies of the program on the cluster 2. Task assignment: the master assigns map or reduce tasks to workers 3. Map task: parse key/value pairs from the input; produce intermediate key/value pairs with the Map function

17 Execution Overview 4. Pair partitioning (hash function, typically hash(key) mod R); the master forwards the locations of intermediate data to the reduce workers 5. Reduce task: read data from the map workers; sort it by intermediate key; group values by key 6. Reduce function: process each group passed in by the reduce task 7. When all tasks have completed, the MapReduce call returns
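Steps 4 and 5 can be sketched in miniature: bucket intermediate pairs with the default hash(key) mod R partitioning, then have each (simulated) reduce worker sort and group its bucket:

```python
from itertools import groupby

R = 3  # number of reduce tasks

def partition(key):
    # default partitioning function: hash(key) mod R
    return hash(key) % R

# intermediate pairs produced by the map tasks
pairs = [("apple", "1"), ("pear", "1"), ("apple", "1"), ("plum", "1")]

# step 4: bucket each pair by its partition; all pairs with the same
# key land in the same bucket, so one reduce worker sees all of them
buckets = {r: [] for r in range(R)}
for k, v in pairs:
    buckets[partition(k)].append((k, v))

# step 5: each reduce worker sorts its bucket by key, then groups
groups = {}
for bucket in buckets.values():
    bucket.sort(key=lambda kv: kv[0])
    for key, kvs in groupby(bucket, key=lambda kv: kv[0]):
        groups[key] = [v for _, v in kvs]
```

Sorting before grouping matters: `groupby` only merges adjacent equal keys, which mirrors why the real reduce worker sorts its input by intermediate key first.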

18 Details of Map/Reduce Task

19 Master Data Structures • The master keeps several data structures: • the state (idle, in-progress, or completed) of each map and reduce task • the identity of each worker machine • The master is the conduit: • through the master, the locations of intermediate files are propagated from map tasks to reduce tasks

20 Fault Tolerance • Worker failure • the master pings every worker periodically • any machine that does not respond is considered "dead" • for both map and reduce workers, any task in progress is re-executed • for map workers, completed tasks are also reset, because their results are stored on local disk • Master failure • abort the entire computation

21 Locality Issue • Master scheduling policy • asks GFS for the locations of the replicas of the input file blocks • input is typically split into 64 MB pieces (== the GFS block size) • map tasks are scheduled so that a GFS replica of their input block is on the same or a nearby machine • Effect • most input data is read locally and consumes no network bandwidth

22 Task Granularity Choice of M and R: • ideally, M and R should be much larger than the number of worker machines • there are practical bounds on M and R: • O(M + R) scheduling decisions • O(M * R) state in memory • typical: M = 200,000 and R = 5,000, using 2,000 worker machines
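Plugging the slide's numbers into the stated bounds shows why the O(M * R) term dominates: with M = 200,000 and R = 5,000 the master tracks on the order of a billion map-task/reduce-task pairings (the paper notes this state is only about one byte per pairing, which keeps it feasible):

```python
M, R, workers = 200_000, 5_000, 2_000

scheduling_decisions = M + R   # master makes O(M + R) scheduling decisions
state_entries = M * R          # master keeps O(M * R) state in memory
map_tasks_per_worker = M // workers  # fine granularity aids load balancing
```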

23 Backup Tasks • Some "stragglers" do not perform optimally • Near the end of the computation, the master schedules redundant executions of the remaining in-progress tasks • The first copy to complete "wins"

24 4. Refinements

25 Refinements • An input reader • supports reading input data in different formats • supports reading records from a database or from memory • An output writer • supports producing data in different formats

26 Refinements • A partitioning function • data is partitioned by applying the function to the intermediate key • default: hash(key) mod R • A combiner function • does partial merging of the data before it is sent over the network • typically the same code is used for the combiner and the reduce function
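A combiner for word count can be sketched as follows; it is the same merging logic as the reduce, run on the map worker's output before it crosses the network (the savings shown are illustrative):

```python
from collections import Counter

def combiner(pairs):
    # Partial merge on the map worker before data crosses the network;
    # for word count this is the same logic as the reduce function.
    merged = Counter()
    for word, count in pairs:
        merged[word] += int(count)
    return [(word, str(total)) for word, total in merged.items()]

# a map task over repetitive text emits many duplicate <word, "1"> pairs
raw_pairs = [("the", "1")] * 1000 + [("cat", "1")]
combined = combiner(raw_pairs)
# 1001 intermediate pairs shrink to 2 before reaching the reduce workers
```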

27 Refinements • Ordering guarantees • within a partition, the intermediate key/value pairs are processed in increasing key order • this generates a sorted output file per partition • Side-effects • tasks may produce auxiliary files as additional outputs • write to a temporary file and atomically rename it once complete
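The write-then-atomically-rename idiom can be sketched like this (a minimal sketch assuming POSIX semantics, where the temporary file is created in the same directory so the rename stays within one filesystem):

```python
import os
import tempfile

def write_output_atomically(path, data):
    # Write to a temporary file in the same directory, then rename it
    # into place: on POSIX filesystems rename is atomic, so a reader
    # never observes a partially written output file.
    directory = os.path.dirname(path) or "."
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as f:
        f.write(data)
    os.rename(tmp_path, path)
```

If the task fails mid-write, only the unreferenced temporary file is left behind; the final output name appears all at once or not at all.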

28 Refinements • Skipping bad records • the map/reduce functions might deterministically fail on particular inputs • fixing the bug might not be possible: third-party libraries • on an error, the worker sends a signal to the master • if multiple errors occur on the same record, the master tells workers to skip that record
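The effect of skipping bad records can be sketched with a wrapper around a hypothetical map function; the real system routes error reports through the master and retries the task, which this toy omits:

```python
def map_skipping_bad_records(map_fn, records):
    # Sketch: apply map_fn to each record; on an exception, report the
    # record as bad (here: just collect it) and move on, so one bad
    # input cannot repeatedly crash the whole job.
    output, skipped = [], []
    for record in records:
        try:
            output.extend(map_fn(record))
        except Exception:
            skipped.append(record)
    return output, skipped

def tokenize(line):
    # hypothetical map function that fails on malformed records
    if line is None:
        raise ValueError("bad record")
    return [(word, "1") for word in line.split()]

out, skipped = map_skipping_bad_records(tokenize, ["a b", None, "c"])
```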

29 Refinements • Local execution • debugging can be tricky in a distributed system • an alternative implementation executes a MapReduce job sequentially on the local machine • the computation can be limited to particular map tasks

30 Refinements • Status information • the master exports a set of status pages for human consumption • useful for diagnosing bugs • Counters • count occurrences of various events • the counter values are periodically propagated to the master • displayed on the status pages

31 Status monitor

32 5. Performance

33 Performance Boasts • Distributed grep • 10^10 100-byte records (~1 TB of data) • a 3-character pattern found in ~100k records • ~1800 workers • 150 seconds start to finish, including ~60 seconds of startup overhead

34 Performance Boasts • Distributed sort • same records/workers as above • about 50 lines of MapReduce user code • 891 seconds, including overhead • compares well with the best reported result of 1057 seconds for the TeraSort benchmark

35 Performance Boasts

36 6. Conclusion

37 Conclusion • MapReduce is easy to use • A large variety of problems are easily expressible as MapReduce computations • Google developed a scalable implementation of MapReduce for large clusters of commodity machines

38 Thank you!
