CFI-Stream: Mining Closed Frequent Itemsets in Data Streams

1 CFI-Stream: Mining Closed Frequent Itemsets in Data Streams
Nan Jiang, Le Gruenwald, SIGKDD’06. Presenter: 林靜怡, 2006/10/04

2 Introduction
CFI-Stream mines closed frequent itemsets in data streams. It computes and maintains closed itemsets online and incrementally, performs the closure checking, and outputs the current closed frequent itemsets in real time based on users’ specified thresholds.

3 Definition
D: a data stream.
$I = \{i_1, i_2, \ldots, i_n\}$: a set of n elements, called items.
T: a subset of the transactions in D.
X: a subset of the items appearing in the data stream.

4 Definition
C(X): the smallest closed set containing X.
Definition 1: An itemset X is said to be closed if and only if $C(X) = f(g(X)) = (f \circ g)(X) = X$.
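The closure test in Definition 1 can be sketched in a few lines. The slides do not define f and g, so this sketch assumes the usual closure-operator reading: g(X) returns the transactions that contain X, and f returns the items common to a set of transactions. The function names and the small transaction list are illustrative choices, not taken from the paper.

```python
from typing import FrozenSet, List, Optional

Itemset = FrozenSet[str]


def g(X: Itemset, transactions: List[Itemset]) -> List[Itemset]:
    """Assumed meaning of g: the transactions that contain every item of X."""
    return [t for t in transactions if X <= t]


def f(covering: List[Itemset]) -> Itemset:
    """Assumed meaning of f: the items common to all given transactions.

    The list must be non-empty.
    """
    return frozenset.intersection(*covering)


def closure(X: Itemset, transactions: List[Itemset]) -> Optional[Itemset]:
    """C(X) = f(g(X)); returns None when X occurs in no transaction."""
    covering = g(X, transactions)
    if not covering:
        return None  # convention chosen for this sketch only
    return f(covering)


def is_closed(X: Itemset, transactions: List[Itemset]) -> bool:
    """X is closed if and only if C(X) = X."""
    return closure(X, transactions) == X


# Illustrative data: four small transactions over the items A, B, C, D.
transactions = [frozenset("CD"), frozenset("AB"), frozenset("ABC"), frozenset("ABC")]

# {A} always occurs together with B, so its closure is {A, B} and it is not closed.
closure_of_a = closure(frozenset("A"), transactions)
# {C} occurs with D in one transaction and with A, B in others, so it is closed.
c_is_closed = is_closed(frozenset("C"), transactions)
```

Under this reading, an itemset fails the test exactly when some extra item appears in every transaction that contains it.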

5 Algorithm
The CFI-Stream algorithm uses a DIrect Update (DIU) tree to perform the closure checking online over a sliding window of the data stream. The conditions for closed itemsets are checked when performing addition and deletion operations on the DIU tree.

6 DIU tree
The DIU tree maintains the current closed itemsets. It has k levels, and each level i stores the closed i-itemsets.

7 DIU tree
Each node in the DIU tree stores a closed itemset, its current support information, and links to its parent and children nodes.
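A minimal sketch of such a structure follows, as one possible layout rather than the authors' implementation. The class names `DIUNode` and `DIUTree`, the `insert` method, and the rule that links a node to every stored proper subset (parent) and proper superset (child) are all assumptions made for illustration; the slides only say that nodes hold an itemset, a support, and parent/child links, and that level i holds the closed i-itemsets.

```python
from dataclasses import dataclass, field
from typing import Dict, FrozenSet, Iterable, List

Itemset = FrozenSet[str]


@dataclass(eq=False)  # identity comparison avoids recursing through the links
class DIUNode:
    itemset: Itemset
    support: int = 0
    parents: List["DIUNode"] = field(default_factory=list, repr=False)
    children: List["DIUNode"] = field(default_factory=list, repr=False)


class DIUTree:
    """Hypothetical container: level i maps each closed i-itemset to its node."""

    def __init__(self) -> None:
        self.levels: Dict[int, Dict[Itemset, DIUNode]] = {}

    def nodes(self) -> List[DIUNode]:
        return [n for level in self.levels.values() for n in level.values()]

    def insert(self, items: Iterable[str], support: int) -> DIUNode:
        node = DIUNode(frozenset(items), support)
        # Assumed linking rule: stored proper subsets become parents,
        # stored proper supersets become children.
        for other in self.nodes():
            if other.itemset < node.itemset:
                node.parents.append(other)
                other.children.append(node)
            elif node.itemset < other.itemset:
                node.children.append(other)
                other.parents.append(node)
        self.levels.setdefault(len(node.itemset), {})[node.itemset] = node
        return node


# Illustrative contents: four closed itemsets with example supports.
tree = DIUTree()
tree.insert("C", 3)
tree.insert("CD", 1)
tree.insert("AB", 3)
abc = tree.insert("ABC", 2)
```

Keying each level by itemset gives a direct lookup of a node by its items, which is the kind of access the addition and deletion checks need; a real implementation may organize the links differently.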

8 Add a Transaction to the DIU Tree
T1: the original transaction set. t: the newly arrived transaction.
Conditions to Check for Closed Itemsets
(1) When t is in T1: if the largest itemset X it contains is not currently in the DIU tree, check all of X’s subsets Y that are in T1.

9 (2) When t is not in T1: each of its subsets Y that is in T1 needs to be checked.

10 Closure Checking for Addition

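11 [Figure: example of adding transactions to the DIU tree. Transactions: 1: C,D; 2: A,B; 3: A,B,C; 4: A,B,C. Tree nodes shown: C, CD, AB, ABC, each with its support count.]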
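The addition example can be cross-checked by brute force. The sketch below is not the incremental DIU-tree update; it simply recomputes every closed itemset from scratch before and after a new transaction arrives, assuming the example transactions are {C,D}, {A,B}, {A,B,C} and then a second {A,B,C}. The function name `closed_itemsets` is an illustrative choice.

```python
from itertools import combinations
from typing import Dict, FrozenSet, Iterable, List

Itemset = FrozenSet[str]


def closed_itemsets(transactions: Iterable[Itemset]) -> Dict[Itemset, int]:
    """Brute force: map every closed itemset to its support."""
    txs: List[Itemset] = list(transactions)
    result: Dict[Itemset, int] = {}
    for t in txs:
        # Only subsets of some transaction can have non-zero support.
        for size in range(1, len(t) + 1):
            for combo in combinations(sorted(t), size):
                X = frozenset(combo)
                covering = [u for u in txs if X <= u]
                # X is closed when the items shared by all covering
                # transactions are exactly X.
                if frozenset.intersection(*covering) == X:
                    result[X] = len(covering)
    return result


window = [frozenset("CD"), frozenset("AB"), frozenset("ABC")]
before = closed_itemsets(window)

window.append(frozenset("ABC"))  # the newly arrived transaction t
after = closed_itemsets(window)
```

In this run the set of closed itemsets stays {C, CD, AB, ABC}; only the supports of C, AB, and ABC increase by one, which is the case where the arriving itemset is already stored in the tree.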

12 Delete a Transaction in DIU Tree
Conditions to Check for Closed Itemsets
When the number of transactions with the same itemset as X drops to zero, each subset Y of X that is a closed itemset in the original transaction set needs to be checked.

13 Closure Checking for Deletion

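14 [Figure: example of deleting a transaction from the DIU tree. Transactions: 1: C,D; 2: A,B; 3: A,B,C; 4: A,B,C. Tree nodes shown: C, AB, CD, ABC, each with its support count.]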
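The effect of a deletion can be illustrated the same way. This is a brute-force recomputation over a sliding window, not the DIU-tree deletion procedure, and it assumes for illustration that the window holds {C,D}, {A,B}, {A,B,C}, {A,B,C} and that the oldest transaction, {C,D}, is the one that expires. The helper `closed_itemsets` is a name introduced here.

```python
from collections import deque
from itertools import combinations
from typing import Dict, FrozenSet, Iterable, List

Itemset = FrozenSet[str]


def closed_itemsets(transactions: Iterable[Itemset]) -> Dict[Itemset, int]:
    """Brute force: map every closed itemset to its support."""
    txs: List[Itemset] = list(transactions)
    result: Dict[Itemset, int] = {}
    for t in txs:
        for size in range(1, len(t) + 1):
            for combo in combinations(sorted(t), size):
                X = frozenset(combo)
                covering = [u for u in txs if X <= u]
                if frozenset.intersection(*covering) == X:
                    result[X] = len(covering)
    return result


window = deque(
    [frozenset("CD"), frozenset("AB"), frozenset("ABC"), frozenset("ABC")]
)
before = closed_itemsets(window)

expired = window.popleft()  # the oldest transaction leaves the window
after = closed_itemsets(window)

# Closed itemsets that stop being closed once the transaction is removed.
dropped = set(before) - set(after)
```

Here X = {C,D} no longer occurs in any transaction, and its subset Y = {C}, which was closed before, now always appears together with A and B, so both leave the result. This is the situation the deletion condition is meant to catch.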

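15 Experiment
Synthetic datasets T10.I6.D100K and T5.I4.D100K. In the usual naming for such synthetic data, T is the average transaction size, I is the average size of the maximal potentially frequent itemsets, and D is the number of transactions.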

16 Experiment

17 Experiment

