Alva Erwin Department of Computing Raj P. Gopalan, and N.R. Achuthan Department of Mathematics and Statistics Curtin University of Technology Kent St. Bentley.

1 Alva Erwin Department of Computing Raj P. Gopalan, and N.R. Achuthan Department of Mathematics and Statistics Curtin University of Technology Kent St. Bentley Western Australia PAKDD08 Efficient Mining of High Utility Itemsets from Large Datasets

2 Outline: Introduction; Preliminaries; Method: Compressed Transaction Utility-Pro (CTU-Pro); Experiments; Conclusions

3 Introduction The goal of frequent itemset mining is to find items that co-occur in a transaction database above a user-given frequency threshold, without considering the quantity or weight (such as profit) of the items. Quantity and weight are significant for real-world decision problems that require maximizing utility in an organization. TwoPhase, based on Apriori, is suitable for sparse datasets with short patterns; CTU-Mine, based on pattern growth, is suitable for dense data.

4 Definition Utility of an itemset: u({3, 4}, t1) = $60, u({3, 4}, t3) = $60, so u({3, 4}) = $60 + $60 = $120

5 Definition Transaction utility: tu(1) = 80. Transaction-weighted utility: twu({3, 4}) = $190
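The three quantities on the Definition slides can be sketched in a few lines of Python. The slides' transaction database is not part of this transcript, so the `DB` below is hypothetical: it was chosen only so that the slide figures (u = $60 in t1 and t3, u = $120 overall, tu = 80 for t1, twu = $190) come out. The profit values are the ones shown in the item table on a later slide; the function names are illustrative.

```python
# Profit per original item id, as shown in the item table on a later slide.
PROFIT = {1: 10, 2: 150, 3: 25, 4: 35, 5: 5}

# HYPOTHETICAL transactions (item id -> quantity). These are invented to
# reproduce the numbers on the Definition slides; they are NOT the
# database used in the presentation.
DB = {
    "t1": {1: 2, 3: 1, 4: 1},
    "t2": {2: 1, 5: 4},
    "t3": {1: 5, 3: 1, 4: 1},
}


def contains(transaction, itemset):
    """True if every item of the itemset occurs in the transaction."""
    return all(item in transaction for item in itemset)


def u(itemset, tid):
    """Utility of an itemset in one transaction: sum of profit * quantity."""
    t = DB[tid]
    if not contains(t, itemset):
        return 0
    return sum(PROFIT[item] * t[item] for item in itemset)


def u_total(itemset):
    """Utility of an itemset over the whole database."""
    return sum(u(itemset, tid) for tid in DB)


def tu(tid):
    """Transaction utility: utility of all items in the transaction."""
    return sum(PROFIT[item] * qty for item, qty in DB[tid].items())


def twu(itemset):
    """Transaction-weighted utility: sum of tu over supporting transactions."""
    return sum(tu(tid) for tid, t in DB.items() if contains(t, itemset))
```

With this toy database, `u({3, 4}, "t1")` and `u({3, 4}, "t3")` each give 60, `u_total({3, 4})` gives 120, `tu("t1")` gives 80, and `twu({3, 4})` gives 190. Note that twu is never smaller than the itemset's own utility, which is why it can serve as an upper bound for pruning.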

6 Compressed Transaction Utility-Pro
99 < min_Utility (129.9)

GlobalItem index    1    2    3    4    5    -
Original item id    5    1    2    4    3    6
Profit              5    10   150  35   25   2
Quantity            60   12   4    5    4    2
TWU                 987  964  810  595  422  99
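The item table on this slide is consistent with a simple rule: drop items whose TWU is below min_Utility, then number the rest in descending TWU order. The sketch below illustrates that reading; it is an interpretation of the table, not the authors' code. The item values are as read from the slide (item 6's TWU is taken to be 99, from the slide's "99 < min_Utility" note), and the function name is hypothetical.

```python
MIN_UTILITY = 129.9  # threshold shown on the slide

# (original item id, profit, quantity, TWU), as read from the slide's table.
ITEMS = [
    (5, 5, 60, 987),
    (1, 10, 12, 964),
    (2, 150, 4, 810),
    (4, 35, 5, 595),
    (3, 25, 4, 422),
    (6, 2, 2, 99),
]


def assign_global_index(items, min_utility):
    """Map original item id -> global index (1-based, descending TWU).

    Items whose TWU is below the threshold get no index. The ordering
    rule is an assumption inferred from the table.
    """
    kept = [it for it in items if it[3] >= min_utility]
    kept.sort(key=lambda it: it[3], reverse=True)
    return {it[0]: idx for idx, it in enumerate(kept, start=1)}


global_index = assign_global_index(ITEMS, MIN_UTILITY)

# Total utility of the database: sum of profit * quantity over all items.
total_utility = sum(profit * qty for _, profit, qty, _ in ITEMS)
```

Here `global_index` maps items 5, 1, 2, 4, 3 to indexes 1 through 5 and leaves item 6 out, matching the table. `total_utility` works out to 1299, so the threshold of 129.9 would correspond to 10% of the total utility; that percentage is an inference from the numbers, not something stated on the slide.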

7 Compressed Utility Pattern-Tree Parallel projection of transaction database

8 CUP-tree Traverse: index 1 (110) from 5, 2 (310) from (2,3,4), 3 (195) from 2, and 4 (190) from (3,5)

9 ProCUP-tree Index 1 (110) from 5 is excluded because 110 < min_Utility (129.9); the remaining entries are 2 (310) from (2,3,4), 3 (195) from 2, and 4 (190) from (3,5)

10 ProCUP-tree
oriUtility * itemQuantity + proUtility * proQuantity = Utility
35*2 + 25*2 = 120
150*1 + 25*1 = 175
10*5 + 25*3 = 125
High_Utility_Itemset = (3,2) (3,2,1)

GlobalItem index    1    2    3    4    5
Original item id    5    1    2    4    3
ProItem index       -    -    1    2    3
Profit              5    10   150  35   25
Quantity            60   12   4    5    4
TWU                 987  964  810  595  422
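The utility formula on this slide can be checked directly. The sketch below recomputes the three sums and compares each against min_Utility (129.9). The "greater than or equal" comparison is the usual definition of a high utility itemset and is assumed here; the function and variable names are illustrative.

```python
MIN_UTILITY = 129.9  # threshold shown on the earlier slides

# (oriUtility, itemQuantity, proUtility, proQuantity), one tuple per
# line of arithmetic on the slide.
ROWS = [
    (35, 2, 25, 2),
    (150, 1, 25, 1),
    (10, 5, 25, 3),
]


def combined_utility(ori_utility, item_quantity, pro_utility, pro_quantity):
    """oriUtility * itemQuantity + proUtility * proQuantity."""
    return ori_utility * item_quantity + pro_utility * pro_quantity


utilities = [combined_utility(*row) for row in ROWS]

# Assumption: an itemset is "high utility" when its utility >= MIN_UTILITY.
high = [value for value in utilities if value >= MIN_UTILITY]
```

`utilities` comes out as 120, 175, and 125, matching the slide. Only 175 (profit 150 combined with profit 25, i.e. original items 2 and 3) reaches the threshold, which agrees with (3,2) appearing among the high utility itemsets.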

11 Experiments

12 Conclusion The CTU-Pro algorithm mines the complete set of high utility itemsets from both sparse and relatively dense datasets, with short or longer high utility patterns. The algorithm adapts to large data by constructing parallel subdivisions on disk that can be mined independently.

