By Shivaraman Janakiraman, Magesh Khanna Vadivelu.

1 By Shivaraman Janakiraman, Magesh Khanna Vadivelu

2 Introduction Mining frequent itemsets from large databases is an important problem in data mining. This project proposes to implement the Apriori algorithm in Hadoop MapReduce. MapReduce is a programming model for processing large data sets. Programs written in this functional style are automatically parallelized and executed on a large cluster of machines, so programmers without any experience with parallel and distributed systems can easily utilize the resources of a large distributed system.

3

4 The Apriori Algorithm

5 Generating 1-itemset Frequent Pattern

6 Generating 2-itemset Frequent Pattern

7 Generating 3-itemset Frequent Pattern
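
The three slides above step through Apriori's level-wise passes (1-itemsets, then 2-itemsets, then 3-itemsets). The slide figures did not survive in this transcript, so what follows is only a minimal sequential sketch of that idea in Python, not the authors' implementation. The function name `frequent_itemsets`, the `min_support` parameter, and the toy transactions are all assumptions made for illustration.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Level-wise Apriori sketch: returns {frozenset(items): support count}."""
    transactions = [frozenset(t) for t in transactions]

    # Pass 1: count every single item and keep those meeting min_support.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    current = {s: c for s, c in counts.items() if c >= min_support}
    result = dict(current)

    k = 2
    while current:
        # Join step: merge pairs of frequent (k-1)-itemsets into k-itemsets.
        candidates = set()
        for a, b in combinations(list(current), 2):
            union = a | b
            if len(union) != k:
                continue
            # Prune step: every (k-1)-subset of a candidate must be frequent.
            if all(frozenset(sub) in current
                   for sub in combinations(union, k - 1)):
                candidates.add(union)

        # Count step: scan the transactions once per level.
        counts = {c: sum(1 for t in transactions if c <= t)
                  for c in candidates}
        current = {s: c for s, c in counts.items() if c >= min_support}
        result.update(current)
        k += 1
    return result


# Hypothetical toy data, not taken from the slides.
toy_transactions = [
    ["bread", "milk", "eggs"],
    ["bread", "milk", "eggs"],
    ["bread", "milk", "jam"],
    ["bread", "eggs"],
    ["milk", "eggs"],
]
toy_result = frequent_itemsets(toy_transactions, min_support=2)
```

With this toy input, "jam" appears only once and is dropped in the first pass, so no candidate containing it is ever generated. That early pruning is the reason for generating candidates level by level.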

8 MapReduce Isolated processes: Hadoop limits communication between tasks, so each individual record is processed by a task in isolation from the others. Records are first processed in isolation by tasks called Mappers. The output from the Mappers is then brought together into a second set of tasks called Reducers, where results from different Mappers can be merged.
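
To make the Mapper and Reducer roles concrete, here is a small in-process Python simulation of one support-counting pass. It does not use the Hadoop or Twister APIs. The names `mapper`, `reducer`, and `run_job`, and the sample records, are hypothetical, and the grouping done inside `run_job` only stands in for the shuffle that a real framework performs between the two phases.

```python
from collections import defaultdict


def mapper(record, candidates):
    """Process one record in isolation: emit (candidate, 1) for each
    candidate itemset that the record contains."""
    items = frozenset(record)
    for cand in candidates:
        if cand <= items:
            yield cand, 1


def reducer(key, values):
    """Merge the partial counts that all mappers emitted for one candidate."""
    return key, sum(values)


def run_job(records, candidates):
    """Simulated job: map every record, group by key, then reduce."""
    grouped = defaultdict(list)
    for record in records:
        for key, value in mapper(record, candidates):
            grouped[key].append(value)
    return dict(reducer(k, v) for k, v in grouped.items())


# Hypothetical input, not taken from the slides.
sample_records = [
    ["bread", "milk", "eggs"],
    ["bread", "milk", "eggs"],
    ["bread", "milk", "jam"],
    ["bread", "eggs"],
    ["milk", "eggs"],
]
sample_candidates = [
    frozenset({"bread", "milk"}),
    frozenset({"milk", "jam"}),
    frozenset({"eggs", "jam"}),
]
support_counts = run_job(sample_records, sample_candidates)
```

Because `mapper` only ever sees a single record, the map calls can run on different machines without coordinating. Counts for the same candidate meet only in `reducer`. A candidate that no record contains, such as the eggs and jam pair here, never reaches a reducer at all.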

9

10 Implementation Timeline
Time: Task
Week 11/07/2011: Discuss the algorithm and design the coding methodology sequentially
Week 11/14/2011: Complete coding the algorithm sequentially
Week 11/21/2011: Complete coding the algorithm sequentially
Week 11/28/2011: Discuss the design and implementation in Twister
Project Review: Discuss the design with partial implementation in Twister
Week 12/05/2011: Complete the implementation in Twister
Week 12/12/2011: Do validation
Project Review: Presentation

11 Thank you Questions?

