Finding Frequent Itemsets by Transaction Mapping


1 Finding Frequent Itemsets by Transaction Mapping
Mingjun Song, Sanguthevar Rajasekaran. Proceedings of the 2005 ACM Symposium on Applied Computing. Presenter: 林靜怡, 2006/01/13

2 Introduction
The Apriori algorithm needs many database scans. For each scan, frequent itemsets are searched by pattern matching, which is time-consuming for large frequent itemsets with long patterns.

3 TM Algorithm
Vertical database representation.
Transaction mapping: the transaction ids of each itemset are mapped and compressed to continuous transaction intervals in a different space, reducing the number of intersections.
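
As a point of reference for the vertical representation, here is a minimal sketch of the uncompressed form: each item maps to the set of transaction ids that contain it, and the support of an itemset is the size of the intersection. The database and the name `to_vertical` are hypothetical; the transactions are chosen only to agree with the supports shown later in the slides. TM replaces these id sets with interval lists.

```python
from collections import defaultdict

# Hypothetical horizontal database: transaction id -> set of items.
transactions = {
    1: {1, 2, 3},
    2: {1, 2, 3},
    3: {1, 3},
    4: {1},
    5: {1},
    6: {2, 3},
    7: {2, 4},
    8: {2, 4},
}


def to_vertical(db):
    """Invert the database: item -> set of transaction ids containing it."""
    tidsets = defaultdict(set)
    for tid, items in db.items():
        for item in items:
            tidsets[item].add(tid)
    return dict(tidsets)


vertical = to_vertical(transactions)

# The support of {1, 3} is the size of the intersection of the two id sets.
support_13 = len(vertical[1] & vertical[3])
```

Every id in both sets takes part in the intersection, which is the cost that compressing ids into intervals aims to reduce.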

4 Lexicographic Prefix Tree

5 Lexicographic Prefix Tree (cont.)
The tree is used to generate candidate itemsets and test their frequency. Each node in the tree stores a collection of frequent itemsets.

6 Lexicographic Prefix Tree (cont.)
Depth-first search: if the expansion of a node cannot possibly lead to the discovery of itemsets that have minimum support, the node is not expanded and the search backtracks. When an itemset that meets the minimum support requirement is found, it is output.
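
The depth-first search with pruning can be sketched as follows. This is an illustration using plain transaction-id sets, not the interval lists of the TM algorithm; the function name `mine` and the sample data are assumptions made for the example.

```python
def mine(prefix, candidates, min_sup, out):
    """Depth-first expansion of one prefix-tree node.

    `candidates` is a list of (item, tidset) pairs in lexicographic order;
    each tidset holds the transactions that contain prefix + (item,).
    """
    for i, (item, tids) in enumerate(candidates):
        itemset = prefix + (item,)
        out[itemset] = len(tids)  # already known to be frequent: output it
        # Extend only with later items, keeping those that meet min_sup.
        extensions = []
        for other, other_tids in candidates[i + 1:]:
            common = tids & other_tids
            if len(common) >= min_sup:
                extensions.append((other, common))
        # With no frequent extension the node is not expanded (backtrack).
        if extensions:
            mine(itemset, extensions, min_sup, out)


# Hypothetical vertical database: item -> set of transaction ids.
vertical = {
    1: {1, 2, 3, 4, 5},
    2: {1, 2, 6, 7, 8},
    3: {1, 2, 3, 6},
    4: {7, 8},
}
min_sup = 2
frequent = {}
roots = [(item, tids) for item, tids in sorted(vertical.items())
         if len(tids) >= min_sup]
mine((), roots, min_sup, frequent)
```

After the call, `frequent` maps each frequent itemset (as a tuple) to its support. Itemsets such as `(1, 2, 4)` never appear because their parent node was not expanded in that direction.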

7 Transaction Mapping
Scan through the database once and identify all frequent 1-itemsets. Sort them in descending order of frequency.

8 Transaction Mapping
min_sup = 2
sup{1} = 5, sup{2} = 5, sup{3} = 4, sup{4} = 2, sup{5} = 1, sup{6} = 1, ..., sup{20} = 1
Identify all frequent 1-itemsets.
Frequent 1-itemsets: 1, 2, 3, 4
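
The first scan can be sketched as a counting pass followed by a sort. The transactions below are hypothetical, chosen so that items 1 to 4 get the supports listed on the slide and every other item occurs once. The helper name `frequent_items` and the tie-break by item id are assumptions.

```python
from collections import Counter


def frequent_items(transactions, min_sup):
    """Return (item, support) pairs with support >= min_sup, most frequent first."""
    counts = Counter(item for t in transactions for item in set(t))
    frequent = [(item, n) for item, n in counts.items() if n >= min_sup]
    # Descending support; ties are broken by item id (an assumption).
    frequent.sort(key=lambda pair: (-pair[1], pair[0]))
    return frequent


# Hypothetical database; items 5 to 12 each occur once and are infrequent.
transactions = [
    [1, 2, 3, 5], [1, 2, 3, 6], [1, 3, 7], [1, 8], [1, 9],
    [2, 3, 10], [2, 4, 11], [2, 4, 12],
]
order = frequent_items(transactions, min_sup=2)
```

Here `order` lists items 1, 2, 3, 4 with supports 5, 5, 4, 2, in that order.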

9 Transaction Mapping (cont.)
Scan through the database again. For each transaction, select the items that are in the frequent 1-itemsets, sort them according to the order of the frequent 1-itemsets, and insert them into the transaction tree.

10 Transaction Tree
At the beginning, the root is the current node. For each item of the sorted transaction: if the current node has a child node whose id is equal to this item, increment the count of this child by 1; otherwise create a new child node and set its counter to 1. The child then becomes the current node.
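
A minimal sketch of the insertion procedure, assuming a dictionary of children per node. The names `Node` and `build_tree` and the sample database are hypothetical.

```python
class Node:
    """One transaction-tree node: an item id, a counter, and child nodes."""

    def __init__(self, item=None):
        self.item = item
        self.count = 0
        self.children = {}  # item id -> Node, kept in insertion order


def build_tree(transactions, order):
    """Insert every transaction, restricted to frequent items, into a tree.

    `order` lists the frequent items in descending order of support.
    """
    rank = {item: position for position, item in enumerate(order)}
    root = Node()
    for transaction in transactions:
        items = sorted((i for i in set(transaction) if i in rank), key=rank.get)
        current = root  # the root is the current node at the beginning
        for item in items:
            child = current.children.get(item)
            if child is None:
                # New child; the increment below sets its counter to 1.
                child = Node(item)
                current.children[item] = child
            child.count += 1
            current = child
    return root


# Hypothetical database; items 5 to 12 are infrequent and get dropped.
transactions = [
    [1, 2, 3, 5], [1, 2, 3, 6], [1, 3, 7], [1, 8], [1, 9],
    [2, 3, 10], [2, 4, 11], [2, 4, 12],
]
root = build_tree(transactions, order=[1, 2, 3, 4])
```

With this data the root has two children, item 1 with count 5 and item 2 with count 3.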

11 Transaction Tree
[Figure: transaction tree under construction; node labels (item:count) are root, 1:1, 2:1, 2:1, 3:1, 3:1, 4:1, 3:1]

12 Node Interval
Each node u has an associated interval [s, e], where s is the relabeled start id and e is the relabeled end id.
If u is the first child of its parent, s = the start id of u's parent.
Otherwise, s = the end id of u's previous sibling + 1.
e = start id of u + counter - 1

13 Node Interval
[Figure: transaction tree with node intervals [1,5], [6,8], [1,2], [3,3], [6,6], [7,8], [1,2]]
first child: s = 1, e = 1 + 5 - 1 = 5
first child: s = 1, e = 1 + 2 - 1 = 2
first child: s = 1, e = 1 + 2 - 1 = 2
not first child: s = 2 + 1 = 3, e = 3 + 1 - 1 = 3
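
The interval rule can be sketched as a recursive pass over the tree. The tree below is a hypothetical reconstruction chosen to match the counts and intervals on the slide. The class and function names are assumptions, and the root is treated as starting at id 1.

```python
class Node:
    """Tree node with an item id, a counter, children, and an interval."""

    def __init__(self, item, count, children=()):
        self.item = item
        self.count = count
        self.children = list(children)
        self.start = self.end = None


def assign_intervals(node, start=1):
    """Give every descendant of `node` an interval [start, end]."""
    next_start = start  # a first child starts where its parent starts
    for child in node.children:
        child.start = next_start
        child.end = child.start + child.count - 1
        assign_intervals(child, child.start)
        next_start = child.end + 1  # next sibling: previous end id + 1


def collect(node, path=()):
    """Map each root-to-node item path to that node's (start, end)."""
    result = {}
    for child in node.children:
        child_path = path + (child.item,)
        result[child_path] = (child.start, child.end)
        result.update(collect(child, child_path))
    return result


# Hypothetical tree: item 1 (count 5) and item 2 (count 3) under the root.
tree = Node(None, 8, [
    Node(1, 5, [Node(2, 2, [Node(3, 2)]), Node(3, 1)]),
    Node(2, 3, [Node(3, 1), Node(4, 2)]),
])
assign_intervals(tree)
intervals = collect(tree)
```

For example, `intervals[(1, 3)]` is `(3, 3)`: the node is not a first child, so it starts at the end id of its previous sibling plus 1.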

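14 Output
min_sup = 2
[Figure: depth-first search of the lexicographic prefix tree over items 1 to 4, with interval-list intersections at each node; itemsets shown include {1,2}, {1,3}, {2,3}, {2,4}, {1,2,3}, {1,2,4}, and {1,2,3,4}. Candidates whose intersected support is >= 2 are kept; those with support < 2 are pruned.]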
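
Support counting over interval lists can be sketched with a generic two-pointer intersection of sorted, disjoint closed intervals. This is an illustrative routine, not necessarily the exact procedure of the paper. The interval lists are hypothetical values consistent with the node intervals shown earlier, with adjacent intervals of the same item merged (an assumption).

```python
def intersect(a, b):
    """Intersect two sorted lists of disjoint closed intervals (s, e)."""
    result = []
    i = j = 0
    while i < len(a) and j < len(b):
        lo = max(a[i][0], b[j][0])
        hi = min(a[i][1], b[j][1])
        if lo <= hi:
            result.append((lo, hi))
        # Advance the list whose current interval ends first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return result


def support(intervals):
    """Number of relabeled transaction ids covered by the interval list."""
    return sum(e - s + 1 for s, e in intervals)


# Hypothetical interval lists per frequent item.
interval_lists = {
    1: [(1, 5)],
    2: [(1, 2), (6, 8)],
    3: [(1, 3), (6, 6)],
    4: [(7, 8)],
}

list_12 = intersect(interval_lists[1], interval_lists[2])
list_123 = intersect(list_12, interval_lists[3])
list_124 = intersect(list_12, interval_lists[4])
```

With min_sup = 2, `list_12` and `list_123` both cover two ids and are kept, while `list_124` is empty and {1,2,4} is pruned.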

15 Experiments
OS: Windows 2000
CPU: DELL 2.4GHz Pentium PC
RAM: 1GB
Compiler: Visual C++

16 Experiments
Synthetic data; real data

17 Experiments

18 Experiments

19 Experiments

20 Experiments

