KDD’09, June 28-July 1, 2009, Paris, France. Copyright 2009 ACM. Frequent Pattern Mining with Uncertain Data.

1 KDD’09, June 28-July 1, 2009, Paris, France. Copyright 2009 ACM. Frequent Pattern Mining with Uncertain Data

2 Outline: Introduction, Definition, Algorithm, Experiment Results, Conclusion

3 Introduction: This paper studies the problem of frequent pattern mining with uncertain data by examining the relative behavior of extensions of well-known classes of deterministic algorithms.

4 Definition

5

6 Algorithm
Step 1. Extending the H-mine Algorithm
Step 2. Extending the FP-growth Algorithm
Step 3. Computation of Support Upper Bounds
Step 4. Mining Frequent Patterns with UFP-tree
Step 5. Determining Support with a Trie Tree

7 H-Mine (Example)
min_sup_count = 2
TDB:
  ID   Items
  100  c, d, e, f, g, i
  200  a, c, d, e, m
  300  a, b, d, e, g, k
  400  a, c, d, h
Scan TDB: the complete set of frequent items can be found and output: { a:3, c:3, d:4, e:3, g:2 }
F-list (frequent items in alphabetical order): a-c-d-e-g
Scan TDB: build the H-struct in main memory from the frequent-item projections:
  ID   Frequent-item projection
  100  c, d, e, g
  200  a, c, d, e
  300  a, d, e, g
  400  a, c, d
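The first scan and the frequent-item projection on this slide can be reproduced with a few lines of Python. This is only a minimal sketch of the preprocessing step, not the H-struct itself; the function names `frequent_items` and `frequent_projections` are hypothetical and chosen for illustration.

```python
from collections import Counter

# Transaction database copied from the slide (ID -> items).
TDB = {
    100: ["c", "d", "e", "f", "g", "i"],
    200: ["a", "c", "d", "e", "m"],
    300: ["a", "b", "d", "e", "g", "k"],
    400: ["a", "c", "d", "h"],
}
MIN_SUP_COUNT = 2


def frequent_items(tdb, min_sup_count):
    """Scan the database once and keep items whose count meets the threshold."""
    counts = Counter(item for items in tdb.values() for item in set(items))
    return {item: n for item, n in counts.items() if n >= min_sup_count}


def frequent_projections(tdb, f_list):
    """Drop infrequent items from each transaction, keeping F-list order."""
    rank = {item: i for i, item in enumerate(f_list)}
    return {
        tid: sorted((i for i in set(items) if i in rank), key=rank.get)
        for tid, items in tdb.items()
    }


freq = frequent_items(TDB, MIN_SUP_COUNT)
f_list = sorted(freq)  # alphabetical F-list, as on the slide
projections = frequent_projections(TDB, f_list)

print(freq)
print("-".join(f_list))
for tid, items in projections.items():
    print(tid, ", ".join(items))
```

The projections are what H-mine would then link together through the header table; that linking is not modeled here.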

10 H-Mine (Example) (Cont.)
H-struct
Header table H:
  item   a  c  d  e  g
  count  3  3  4  3  2
Frequent projections:
  100  c, d, e, g
  200  a, c, d, e
  300  a, d, e, g
  400  a, c, d

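11 H-Mine (Example) (Cont.)
Header table H:
  item   a  c  d  e  g
  count  3  3  4  3  2
Frequent projections:
  100  c, d, e, g
  200  a, c, d, e
  300  a, d, e, g
  400  a, c, d
Header table for the projections that contain a:
  item   c  d  e  g
  count  2  3  2  1
Frequent patterns found with prefix a: ac: 2, ad: 3, ae: 2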

12 H-Mine (Example) (Cont.)
min_sup_count = 2
TDB:
  ID   Items
  100  c, d, e, f, g, i
  200  a, c, d, e, m
  300  a, b, d, e, g, k
  400  a, c, d, h
Output: a:3, c:3, d:4, e:3, g:2, ac:2, ad:3, ae:2, acd:2, ade:2, cd:3, ce:2, cde:2, de:3, dg:2, deg:2, eg:2
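The output list can be checked with a small pattern-growth sketch. It follows the same divide-and-conquer order as the example (all patterns starting with a, then c, and so on) but, as a simplifying assumption, it copies projected transactions into new lists instead of threading hyperlinks through an H-struct. The function name `mine` is hypothetical.

```python
from collections import Counter

# Transaction database copied from the slide.
TDB = {
    100: ["c", "d", "e", "f", "g", "i"],
    200: ["a", "c", "d", "e", "m"],
    300: ["a", "b", "d", "e", "g", "k"],
    400: ["a", "c", "d", "h"],
}
MIN_SUP_COUNT = 2


def mine(transactions, min_sup_count, prefix=()):
    """Return {itemset (as a tuple): support count} for every frequent
    itemset that extends `prefix`, using alphabetical item order."""
    counts = Counter(item for t in transactions for item in t)
    result = {}
    for item in sorted(counts):
        if counts[item] < min_sup_count:
            continue
        pattern = prefix + (item,)
        result[pattern] = counts[item]
        # Projected database: transactions containing `item`, restricted
        # to the items that come after it in the ordering.
        projected = [[i for i in t if i > item] for t in transactions if item in t]
        result.update(mine(projected, min_sup_count, pattern))
    return result


patterns = mine([sorted(set(t)) for t in TDB.values()], MIN_SUP_COUNT)
for itemset, support in patterns.items():
    print("".join(itemset), support)
```

Copying projections keeps the sketch short; the point of H-mine's hyperlinked structure is to avoid exactly this copying when the database fits in main memory.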

13 FP-growth (Example)
min_support = 3
  TID  Items bought              (Ordered) frequent items
  100  {f, a, c, d, g, i, m, p}  {f, c, a, m, p}
  200  {a, b, c, f, l, m, o}     {f, c, a, b, m}
  300  {b, f, h, j, o, w}        {f, b}
  400  {b, c, k, s, p}           {c, b, p}
  500  {a, f, c, e, l, p, m, n}  {f, c, a, m, p}
Header table (item: frequency, each entry heading a chain of node-links): f:4, c:4, a:3, b:3, m:3, p:3
FP-tree (indentation shows parent-child links):
  {}
    f:4
      c:3
        a:3
          m:2
            p:2
          b:1
            m:1
      b:1
    c:1
      b:1
        p:1
f-c-a-m-p
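The tree on this slide can be rebuilt by inserting each ordered transaction as a path from the root and incrementing counts along shared prefixes. The sketch below covers tree construction only (no node-links and no mining); the class `Node` and the functions `build_fp_tree` and `dump` are hypothetical names, and the item order is taken directly from the slide's header table because f and c tie at frequency 4.

```python
from collections import Counter

# Transactions copied from the slide.
TRANSACTIONS = [
    ["f", "a", "c", "d", "g", "i", "m", "p"],
    ["a", "b", "c", "f", "l", "m", "o"],
    ["b", "f", "h", "j", "o", "w"],
    ["b", "c", "k", "s", "p"],
    ["a", "f", "c", "e", "l", "p", "m", "n"],
]
MIN_SUPPORT = 3
F_LIST = ["f", "c", "a", "b", "m", "p"]  # order of the slide's header table


class Node:
    def __init__(self, item, parent=None):
        self.item = item
        self.count = 0
        self.parent = parent
        self.children = {}


def build_fp_tree(transactions, f_list):
    """Insert each transaction, filtered and ordered by f_list, as a path."""
    rank = {item: i for i, item in enumerate(f_list)}
    root = Node(None)
    for t in transactions:
        ordered = sorted((i for i in set(t) if i in rank), key=rank.get)
        node = root
        for item in ordered:
            if item not in node.children:
                node.children[item] = Node(item, node)
            node = node.children[item]
            node.count += 1
    return root


def dump(node, depth=0):
    """Return the tree as indented 'item:count' lines."""
    lines = []
    for child in node.children.values():
        lines.append("  " * depth + f"{child.item}:{child.count}")
        lines.extend(dump(child, depth + 1))
    return lines


# Sanity check: the items in F_LIST are exactly those meeting MIN_SUPPORT.
counts = Counter(i for t in TRANSACTIONS for i in set(t))
frequent = {i for i, n in counts.items() if n >= MIN_SUPPORT}

root = build_fp_tree(TRANSACTIONS, F_LIST)
print("\n".join(dump(root)))
```

FP-growth would then mine this tree by building conditional FP-trees for each header-table item, which is the recursive step the UFP-tree slide later says it avoids.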

14 Computation of Support Upper Bounds corollary

15 Mining Frequent Patterns with UFP-tree Goal: avoid recursively constructing conditional FP-trees.

16 Trie Tree

17 Experiment Results

18

19

20

21 Conclusion: In these tests, we found that UApriori and UH-mine are both efficient in mining frequent itemsets.

