Apache Hadoop MapReduce: What is it? Why use it? How does it work? Some examples. Big users.


1 Apache Hadoop MapReduce
- What is it?
- Why use it?
- How does it work?
- Some examples
- Big users

2 MapReduce – What is it?
- Processing engine of Hadoop
- Developers create Map and Reduce jobs
- Used for big data batch processing
- Parallel processing of huge data volumes
- Fault tolerant
- Scalable

3 MapReduce – Why use it?
- Your data is in the terabyte / petabyte range
- You have huge I/O
- The Hadoop framework takes care of
  - Job and task management
  - Failures
  - Storage
  - Replication
- You just write Map and Reduce jobs

4 MapReduce – How does it work?
Take word counting as an example, something that Google does all of the time.

5 MapReduce – How does it work?
- Input data is split into shards
- Split data is mapped to key,value pairs, e.g. Bear,1
- Mapped data is shuffled/sorted by key, e.g. Bear
- Sorted data is reduced, e.g. Bear,2
- Final data is stored on HDFS
- There might be an extra map layer before the shuffle
- JobTracker controls all tasks in a job
- TaskTracker controls the map and reduce tasks
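The word-count steps above can be sketched as a single-process simulation in plain Python. This is not Hadoop code: the function names `map_words`, `shuffle` and `reduce_counts` and the three input splits are assumptions chosen so that "Bear" ends up with a count of 2, as on the slide. HDFS storage and the JobTracker/TaskTracker roles are left out.

```python
from collections import defaultdict


def map_words(split):
    """Map: emit a (word, 1) pair for every word in one input split."""
    return [(word, 1) for word in split.split()]


def shuffle(mapped_pairs):
    """Shuffle/sort: group all values by key, with keys in sorted order."""
    groups = defaultdict(list)
    for key, value in mapped_pairs:
        groups[key].append(value)
    return dict(sorted(groups.items()))


def reduce_counts(key, values):
    """Reduce: collapse one key's values into a single total."""
    return key, sum(values)


# Assumed input, already split into three shards.
splits = ["Deer Bear River", "Car Car River", "Deer Car Bear"]

mapped = [pair for split in splits for pair in map_words(split)]
shuffled = shuffle(mapped)
reduced = dict(reduce_counts(k, v) for k, v in shuffled.items())

print(shuffled["Bear"])  # [1, 1]
print(reduced)           # {'Bear': 2, 'Car': 3, 'Deer': 2, 'River': 2}
```

In a real cluster each call to `map_words` and `reduce_counts` would run as a separate task, possibly on a different machine; the list and dict used here stand in for the data moved between those tasks.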

6 MapReduce – Some examples
A visual example with colours to show you the cycle: Split -> Map -> Shuffle -> Reduce

7 MapReduce – Some examples
A visual example of MapReduce with job and task trackers added to individual map and reduce jobs.

8 Hadoop MapReduce – Big users
Users:
- Facebook
- Yahoo
- Amazon
- eBay

