Carson Kai-Sang Leung, Mark Anthony F. Mateo, and Dale A. Brajczuk PAKDD 2008 A Tree-based Approach for Frequent Pattern Mining from Uncertain Data.


1 Carson Kai-Sang Leung, Mark Anthony F. Mateo, and Dale A. Brajczuk PAKDD 2008 A Tree-based Approach for Frequent Pattern Mining from Uncertain Data

2 Outline: Motivation; UF-Growth algorithm (construction of the UF-tree, mining of frequent patterns from the UF-tree); Improvements to the UF-Growth algorithm; Experimental results; Conclusion

3 Motivation: Over the past decade, there have been numerous studies on mining frequent patterns from precise data. However, there are situations in which users are uncertain about the presence or absence of some items and can state only a suspicion that an item is present.

4 UF-Growth Algorithm: The algorithm consists of two operations: (1) the construction of the UF-tree and (2) the mining of frequent patterns from the UF-tree.

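5 Construction of the UF-Tree: With minsup = 1, the first scan of the DB yields the expected support of each item: a: 2.7, b: 2.625, c: 2.52429, d: 2.20875, e: 2.1575. A second scan of the DB builds the UF-tree. [Figure: UF-tree construction]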
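The two scans can be sketched in Python as follows. This is a minimal sketch, not the authors' implementation: the names UFNode and build_uf_tree and the toy transactions are hypothetical, and it assumes that a node is shared only when both the item and its existential probability match, and that items are inserted in descending order of expected support (as the item list on the slide suggests).

```python
from collections import defaultdict


class UFNode:
    """Hypothetical UF-tree node: an item, its probability, and a count."""

    def __init__(self, item=None, prob=None):
        self.item = item
        self.prob = prob
        self.count = 0
        self.children = {}  # (item, prob) -> UFNode


def build_uf_tree(transactions, minsup):
    # First scan: accumulate the expected support of every item.
    exp_sup = defaultdict(float)
    for transaction in transactions:
        for item, prob in transaction.items():
            exp_sup[item] += prob
    frequent = {i: s for i, s in exp_sup.items() if s >= minsup}

    # Second scan: insert each transaction, keeping only frequent items,
    # ordered by descending expected support (ties broken by item name).
    root = UFNode()
    for transaction in transactions:
        items = sorted(
            (i for i in transaction if i in frequent),
            key=lambda i: (-frequent[i], i),
        )
        node = root
        for item in items:
            key = (item, transaction[item])
            child = node.children.get(key)
            if child is None:
                child = node.children[key] = UFNode(item, transaction[item])
            child.count += 1
            node = child
    return root, frequent


# Hypothetical toy data (NOT the database used on the slides).
transactions = [
    {"a": 0.9, "b": 0.5},
    {"a": 0.9, "b": 0.5},
    {"a": 0.9, "b": 0.6},
    {"c": 0.2},
]
root, frequent = build_uf_tree(transactions, minsup=1.0)
a_node = root.children[("a", 0.9)]
print(a_node.count, len(a_node.children))  # 3 2
```

In the toy run, the three a:0.9 occurrences share one node, while b:0.5 and b:0.6 need separate nodes because their probabilities differ; item c is dropped because its expected support is below minsup.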

6 Mining of Frequent Patterns from the UF-Tree: In the {e}-projected DB, expSup({a,e}) = (1*0.72*0.9) + (2*0.71875*0.9) = 1.94175 and expSup({d,e}) = (1*0.72*0.71875) + (2*0.71875*0.72) = 1.5525. Both values are at least minsup = 1, so {a,e} and {d,e} are frequent.
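The arithmetic on this slide can be reproduced with a short Python sketch. The layout of E_PROJECTED_DB and the function name exp_sup_with_suffix are assumptions made for illustration; the counts and probabilities are read off the slide's own formulas (one occurrence with e:0.72 and d:0.71875, two occurrences with e:0.71875 and d:0.72, and a:0.9 throughout).

```python
from math import prod

# Each entry: (occurrence count, probability of e, {prefix item: probability}).
# Reconstructed from the slide's arithmetic, not from the original figure.
E_PROJECTED_DB = [
    (1, 0.72, {"a": 0.9, "d": 0.71875}),
    (2, 0.71875, {"a": 0.9, "d": 0.72}),
]


def exp_sup_with_suffix(projected_db, items):
    """Expected support of `items` plus the suffix item of the projected DB."""
    total = 0.0
    for count, suffix_prob, prefix in projected_db:
        if all(item in prefix for item in items):
            total += count * suffix_prob * prod(prefix[i] for i in items)
    return total


print(round(exp_sup_with_suffix(E_PROJECTED_DB, ["a"]), 5))  # 1.94175
print(round(exp_sup_with_suffix(E_PROJECTED_DB, ["d"]), 5))  # 1.5525
```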

7 (Cont.) In the {d,e}-projected DB, the expected support of {d,e} in each occurrence is 0.71875*0.72 = 0.5175, so expSup({a,d,e}) = 3*0.5175*0.9 = 1.39725. The frequent patterns found are {a}, {a,d}, {a,d,e}, {a,e}, {b}, {b,c}, {c}, {d}, {d,e}, and {e}.

8 Improvements to UF-Growth Algorithm: The UF-tree above may appear to require a large amount of memory. Improvement 1. To increase the chance of path sharing, we discretize and round the expected support of each tree node to k decimal places.
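The effect of rounding on sharing can be illustrated with a small Python sketch. It is a simplification: it counts distinct (item, probability) keys only and ignores the position of a node in the tree. The function name distinct_node_keys is hypothetical, and the probabilities are the d and e values that appear on the mining slides.

```python
def distinct_node_keys(entries, k=None):
    """Count distinct (item, probability) keys, optionally rounding to k places."""
    keys = set()
    for item, prob in entries:
        keys.add((item, prob if k is None else round(prob, k)))
    return len(keys)


# (item, existential probability) pairs taken from the mining example.
entries = [("d", 0.72), ("d", 0.71875), ("e", 0.71875), ("e", 0.72)]

print(distinct_node_keys(entries))       # 4
print(distinct_node_keys(entries, k=2))  # 2
```

With k = 2, 0.71875 becomes 0.72, so the two d entries (and the two e entries) collapse into one key each. The stored value is then slightly larger than the true probability, which is how rounding can overestimate an expected support.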

9 (Cont.) Improvement 2. The improved UF-growth does not need to build subsequent UF-trees for any non-singleton patterns. Instead, it traverses each path of the {e}-projected DB and enumerates all its subsets. After one path has been traversed, the subsets {a,e}, {a,d,e}, and {d,e} have expected supports of 0.648, 0.46575, and 0.5175 so far. Once all paths have been traversed, their accumulated expected supports are 1.94175, 1.39725, and 1.5525.
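A minimal Python sketch of this enumeration follows. The PATHS layout and the function name accumulate_subsets are assumptions for illustration; the counts and probabilities are reconstructed from the arithmetic on the mining slides, not taken from the original figure.

```python
from collections import defaultdict
from itertools import combinations
from math import prod

# Each path: (occurrence count, {item: probability}), reconstructed from
# the slide's arithmetic for the {e}-projected DB.
PATHS = [
    (1, {"a": 0.9, "d": 0.71875, "e": 0.72}),
    (2, {"a": 0.9, "d": 0.72, "e": 0.71875}),
]


def accumulate_subsets(paths, suffix):
    """Enumerate every subset of each path that contains `suffix` plus at
    least one other item, and accumulate its expected support."""
    totals = defaultdict(float)
    for count, probs in paths:
        others = sorted(item for item in probs if item != suffix)
        for size in range(1, len(others) + 1):
            for combo in combinations(others, size):
                pattern = combo + (suffix,)
                totals[pattern] += count * prod(probs[i] for i in pattern)
    return dict(totals)


for pattern, support in accumulate_subsets(PATHS, "e").items():
    print(pattern, round(support, 5))
```

Running it on the first path alone gives the intermediate values 0.648, 0.46575, and 0.5175 quoted on the slide; running it on both paths gives the accumulated values.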

10 Experimental Results

11 (Cont.)

12 Conclusion: The rounding in Improvement 1 may cause false positives.

