Carson Kai-Sang Leung, Mark Anthony F. Mateo, and Dale A. Brajczuk PAKDD 2008 A Tree-based Approach for Frequent Pattern Mining from Uncertain Data.

Outline Motivation; UF-Growth algorithm (Construction of the UF-Tree, Mining of Frequent Patterns from the UF-Tree); Improvements to the UF-Growth algorithm; Experimental Results; Conclusion

Motivation Over the past decade, there have been numerous studies on mining frequent patterns from precise data. However, there are situations in which users are uncertain about the presence or absence of some items and can express only a suspicion that an item is present.

UF-Growth Algorithm The algorithm consists of two operations: (1) the construction of the UF-tree; (2) the mining of frequent patterns from the UF-tree.

Construction of the UF-Tree [Figure: the UF-tree is built with two scans of the DB, with minsup = 1. Expected supports of the items: a: 2.7, b: 2.625, c: 2.52429, d: 2.20875, e: 2.1575.]
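The two scans can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the class `UFNode`, the function `build_uf_tree`, and the three-transaction `toy_db` are all hypothetical, and the sketch assumes that a node is shared only when both the item and its probability match.

```python
from collections import defaultdict

class UFNode:
    """Hypothetical tree node: an item, its existential probability, and a count."""
    def __init__(self, item=None, prob=None):
        self.item = item
        self.prob = prob
        self.count = 0
        self.children = {}  # (item, prob) -> UFNode

def build_uf_tree(db, minsup):
    # Scan 1: the expected support of an item is the sum of its probabilities.
    exp_sup = defaultdict(float)
    for transaction in db:
        for item, prob in transaction.items():
            exp_sup[item] += prob
    frequent = {i: s for i, s in exp_sup.items() if s >= minsup}
    # Order items by descending expected support (ties broken by name).
    order = sorted(frequent, key=lambda i: (-frequent[i], i))

    # Scan 2: insert each transaction; a node is reused only if the
    # (item, probability) pair is identical (assumption of this sketch).
    root = UFNode()
    for transaction in db:
        node = root
        for item in order:
            if item not in transaction:
                continue
            key = (item, transaction[item])
            child = node.children.get(key)
            if child is None:
                child = UFNode(item, transaction[item])
                node.children[key] = child
            child.count += 1
            node = child
    return root, frequent, order

# Made-up data for illustration only; this is not the DB from the slides.
toy_db = [
    {"a": 0.9, "b": 0.5},
    {"a": 0.9, "c": 0.8},
    {"a": 0.6, "b": 0.5},
]
root, frequent, order = build_uf_tree(toy_db, minsup=1.0)
```

In this toy run, item c is dropped after the first scan because its expected support (0.8) is below minsup, and the two transactions with a: 0.9 share one node while a: 0.6 gets a node of its own.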

Mining of Frequent Patterns from the UF-Tree From the {e}-projected DB: expSup({a,e}) = (1 × 0.72 × 0.9) + (2 × 0.71875 × 0.9) = 1.94175 and expSup({d,e}) = (1 × 0.72 × 0.71875) + (2 × 0.71875 × 0.72) = 1.5525. Both are at least minsup = 1, so {a,e} and {d,e} are frequent.
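The arithmetic above can be checked with a short sketch. The two weighted paths below are read off the slide's formulas (one path with count 1 and one with count 2); the representation as `(count, {item: probability})` pairs and the helper name `exp_sup` are assumptions made for illustration.

```python
from math import prod

# Paths of the {e}-projected DB, as implied by the arithmetic on the slide.
e_projected = [
    (1, {"e": 0.72, "a": 0.9, "d": 0.71875}),
    (2, {"e": 0.71875, "a": 0.9, "d": 0.72}),
]

def exp_sup(pattern, paths):
    """Sum over paths of count * product of the probabilities of the pattern's items."""
    return sum(
        count * prod(probs[item] for item in pattern)
        for count, probs in paths
        if all(item in probs for item in pattern)
    )

minsup = 1.0
sup_ae = exp_sup(("a", "e"), e_projected)  # approximately 1.94175
sup_de = exp_sup(("d", "e"), e_projected)  # approximately 1.5525
```

Results are floating-point values, so they should be compared with a small tolerance rather than for exact equality.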

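(Cont.) In the {d,e}-projected DB, the expected support of {d,e} on each path is 0.5175 = 0.71875 × 0.72, and the path counts total 3, so expSup({a,d,e}) = 3 × 0.5175 × 0.9 = 1.39725. The frequent patterns found are {a}, {a,d}, {a,d,e}, {a,e}, {b}, {b,c}, {c}, {d}, {d,e}, and {e}. [Figure: the {e}-projected DB and the {d,e}-projected DB.]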

Improvements to UF-Growth Algorithm The UF-tree above may appear to require a large amount of memory, since paths are shared only when both the item and its expected support match. Improvement 1: to increase the chance of path sharing, we discretize the expected support of each tree node by rounding it to k decimal places.
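A small sketch of why rounding helps, under the assumption that nodes are keyed by the pair (item, probability). The function `count_nodes` is hypothetical, the probability values are the ones that appear in the slides' arithmetic, and the item order within each path is chosen only for illustration.

```python
def count_nodes(transactions, k=None):
    """Insert transactions into a prefix tree keyed by (item, prob); return the node count.

    If k is given, each probability is rounded to k decimal places first.
    """
    root = {}
    nodes = 0
    for transaction in transactions:
        level = root
        for item, prob in transaction:
            key = (item, prob if k is None else round(prob, k))
            if key not in level:
                level[key] = {}
                nodes += 1
            level = level[key]
    return nodes

paths = [
    [("e", 0.72), ("a", 0.9), ("d", 0.71875)],
    [("e", 0.71875), ("a", 0.9), ("d", 0.72)],
]

nodes_exact = count_nodes(paths)         # no sharing: the e-probabilities differ
nodes_rounded = count_nodes(paths, k=2)  # 0.71875 rounds to 0.72, so the paths merge
```

With exact values the two paths need six nodes; with k = 2 they collapse into a single three-node path. The stored probabilities are then approximations, so expected supports computed from the rounded tree are approximate as well.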

(Cont.) Improvement 2: the improved UF-growth does not need to build subsequent UF-trees for any non-singleton patterns. Instead, when it visits a tree path in the {e}-projected DB, it enumerates all subsets of that path and accumulates their expected supports. At an intermediate step, the subsets {a,e}, {a,d,e}, and {d,e} have expected supports of 0.648, 0.46575, and 0.5175 so far; once all paths are processed, their accumulated expected supports are 1.94175, 1.39725, and 1.5525.
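The subset enumeration can be sketched as below. This is an illustrative reading of the slide, not the authors' code: the function `accumulate_subsets` is hypothetical, and the two weighted paths are the ones implied by the slides' arithmetic.

```python
from collections import defaultdict
from itertools import combinations
from math import prod

# Paths of the {e}-projected DB as (count, {item: probability}).
e_projected = [
    (1, {"e": 0.72, "a": 0.9, "d": 0.71875}),
    (2, {"e": 0.71875, "a": 0.9, "d": 0.72}),
]

def accumulate_subsets(paths, suffix):
    """For every path, enumerate each non-empty subset of the other items,
    extend it with the suffix item, and add its contribution to a running total."""
    totals = defaultdict(float)
    for count, probs in paths:
        others = sorted(item for item in probs if item != suffix)
        for size in range(1, len(others) + 1):
            for combo in combinations(others, size):
                pattern = frozenset(combo) | {suffix}
                totals[pattern] += count * probs[suffix] * prod(probs[i] for i in combo)
    return totals

after_first_path = accumulate_subsets(e_projected[:1], "e")
after_all_paths = accumulate_subsets(e_projected, "e")
```

Processing only the count-1 path gives the "so far" values 0.648, 0.46575, and 0.5175; processing both paths gives the accumulated values 1.94175, 1.39725, and 1.5525, without building a {d,e}-projected tree.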

Experimental Results

Conclusion Improvement 1 (rounding the expected supports) may cause false positives.
