# STAGGER: Periodicity Mining of Data Streams using Expanding Sliding Windows (Mohamed G. Elfeky, Walid G. Aref, Ahmed K. Elmagarmid, ICDM 2006)

## Presentation transcript:

STAGGER: Periodicity Mining of Data Streams using Expanding Sliding Windows. Mohamed G. Elfeky, Walid G. Aref, Ahmed K. Elmagarmid. ICDM 2006. 2007/10/02, Chen Yi-Chun.

Outline

- Motivation
- Previous Approach
  - SPD algorithm
  - Max-Subpattern Tree
- Approximate Incremental Technique
- Conclusion

Motivation

- Example stream: abcabcabcabcabc…, with period p=3.
- Single sliding window of size w:
  - Smaller w: real-time output is supported.
  - Larger w: long periods can be found.
- To get both real-time output and long periods, multiple sliding windows are proposed.
- Figure: three windows with outputs p=3 (abc, *b*, a**, …); p=3 (abc, *b*, a**, …); p=3,5 (abc, *b*, a**, …).
- Period detection: the SPD algorithm is used.
- Pattern mining: the max-subpattern tree is used.

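Periodicity Detection

- $\pi_{p,l}(S) = e_l, e_{l+p}, e_{l+2p}, \ldots, e_{l+(m-1)p}$, with $0 \le l < p$ and $m = \lceil (n-l)/p \rceil$: the projection of a data stream S according to a period p starting from position l, where n is the length of S and $e_i$ is the symbol at position i.
- Ex. If S = abcabbabdb, then $\pi_{3,0}(S)$ = aaab. The slide marks an outlier in this example.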
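The projection can be illustrated with a short Python sketch. The function name `projection` is hypothetical, and positions are assumed to be 0-based, as the example with position 0 suggests.

```python
def projection(stream: str, p: int, l: int) -> str:
    """Return the symbols of `stream` at positions l, l+p, l+2p, ..."""
    if not 0 <= l < p:
        raise ValueError("expected 0 <= l < p")
    # Extended slicing picks every p-th symbol starting at position l.
    return stream[l::p]


S = "abcabbabdb"
print(projection(S, 3, 0))  # symbols at positions 0, 3, 6, 9
print(projection(S, 3, 1))  # symbols at positions 1, 4, 7
```

With n = 10, p = 3 and l = 0 the projection has ⌈(10-0)/3⌉ = 4 symbols, which matches the length of the slice.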

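Cont.

- $F_2(s, S)$: the number of times the symbol s occurs in two consecutive positions in the data stream S.
- Ex. If S = abbaaabaa, then $F_2(a, S) = 3$ and $F_2(b, S) = 1$.
- $\frac{F_2(s, \pi_{p,l}(S))}{\lceil (n-l)/p \rceil - 1}$, where $\pi_{p,l}(S)$ is the projection of S for period p starting from position l, indicates how often the symbol s occurs every p timestamps in a data stream S.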

Cont.

- If a data stream S of length n contains a symbol s and $\frac{F_2(s, \pi_{p,l}(S))}{\lceil (n-l)/p \rceil - 1} \ge \tau$, then s is said to be periodic in S with a period of length p at position l with respect to the periodicity threshold $\tau$.
- Ex. S = abcabbabdb:
  - The symbol a is periodic with a period of length 3 at position 0 with respect to a periodicity threshold $\tau \le 2/3$.
  - The pattern a** is a frequent single periodic pattern of length 3.
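The threshold test can be checked numerically. The sketch below is illustrative only: the names `consecutive_count`, `periodicity` and `is_periodic` are hypothetical, and it assumes the ratio is the count of consecutive occurrences in the projection divided by ⌈(n-l)/p⌉ - 1.

```python
import math


def consecutive_count(symbol: str, seq: str) -> int:
    """Count adjacent position pairs in `seq` that both hold `symbol`."""
    return sum(1 for x, y in zip(seq, seq[1:]) if x == y == symbol)


def periodicity(stream: str, symbol: str, p: int, l: int) -> float:
    """Fraction of consecutive projected positions where `symbol` repeats."""
    projected = stream[l::p]                          # symbols at l, l+p, l+2p, ...
    pairs = math.ceil((len(stream) - l) / p) - 1      # number of adjacent pairs
    if pairs <= 0:
        return 0.0
    return consecutive_count(symbol, projected) / pairs


def is_periodic(stream: str, symbol: str, p: int, l: int, threshold: float) -> bool:
    return periodicity(stream, symbol, p, l) >= threshold


print(consecutive_count("a", "abbaaabaa"))            # adjacent "aa" pairs
print(periodicity("abcabbabdb", "a", 3, 0))           # ratio for symbol a, p=3, l=0
print(is_periodic("abcabbabdb", "a", 3, 0, 0.6))
```

For S = abcabbabdb the projection for p = 3, l = 0 is aaab, so a repeats in 2 of the 3 adjacent pairs and passes any threshold up to 2/3.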

SPD algorithm

- Goal: detect the symbols that are periodic with period length p within S.
- Shift S by p positions, denoted as $S^{(p)}$.
- Ex. If S = a b c a b b a b c b, then $S^{(3)}$ = * * * a b c a b b a.
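The shift-and-compare idea can be sketched with a direct loop. This is a naive illustration, not the encoding-based computation of the cited SPD paper; the function name `shift_matches` is hypothetical.

```python
from collections import Counter


def shift_matches(stream: str, p: int) -> Counter:
    """Compare `stream` with itself shifted by p positions.

    Returns a counter keyed by (symbol, l), where l = position mod p,
    counting the positions at which the symbol equals the one p steps earlier.
    """
    hits = Counter()
    for i in range(p, len(stream)):
        if stream[i] == stream[i - p]:
            hits[(stream[i], i % p)] += 1
    return hits


S = "abcabbabcb"
for (symbol, l), count in sorted(shift_matches(S, 3).items()):
    print(symbol, l, count)
```

Each count equals the number of consecutive occurrences of the symbol in the projection for that p and l, so dividing by the number of adjacent pairs gives the ratio used in the periodicity test.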

SPD algorithm in Time-Series

- Symbols are mapped to binary codes: a: 001, b: 010, c: 100.
- Example series: (a c c c a b b).
- Figure: computations on the encoded series for P=1 … P=4.
- Reference: "Periodicity Detection in Time Series Databases" [TKDE05]

Single Window with SPD

- Figure: the bit string 0 0 1 1 0 0 0 0 1 0 1 0 0 0 1, with the labels "Shift 1" and "slide 2", for the series (a c c c a b b).

Multi Windows with SPD

- Figure: multiple windows with their output.
- Smaller w: real-time output is supported.
- Larger w: long periods can be found.
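The trade-off between window sizes can be illustrated with a rough sketch. It is not the STAGGER algorithm itself: it assumes each window looks at the most recent w symbols and only reports periods up to w/2, and it uses a naive detector. The names `periods_in_window` and `multi_window_periods` are hypothetical.

```python
def periods_in_window(window: str, threshold: float) -> list[int]:
    """Periods p <= len(window) // 2 for which some symbol repeats every p
    positions in at least `threshold` of the adjacent projected pairs."""
    found = []
    for p in range(1, len(window) // 2 + 1):
        for l in range(p):
            projected = window[l::p]
            pairs = len(projected) - 1
            if pairs < 1:
                continue
            best = max(
                sum(1 for x, y in zip(projected, projected[1:]) if x == y == s)
                for s in set(projected)
            )
            if best / pairs >= threshold:
                found.append(p)
                break  # one periodic symbol is enough to report p
    return found


def multi_window_periods(stream: str, window_sizes: list[int],
                         threshold: float) -> dict[int, list[int]]:
    """Run the detector on the most recent w symbols for each window size w."""
    return {
        w: periods_in_window(stream[-w:], threshold)
        for w in window_sizes
        if len(stream) >= w  # a window reports only once it is full
    }


print(multi_window_periods("abcabcabcabc", [6, 12, 16], 1.0))
```

In this example the window of 6 symbols can already report the period 3, the window of 12 symbols also reaches the longer period 6, and the window of 16 symbols stays silent until enough of the stream has arrived.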

Max-Subpattern Tree

- Reference: "Incremental, Online, and Merge Mining of Partial Periodic Patterns in Time-Series Databases" [TKDE04]
- Reference: "Efficient Mining of Partial Periodic Patterns in Time Series Database" [ICDE99]
- Example series: abdeacdfabdjacdsabdxakdy, for p=4.
- Figure: max-subpattern tree built from this series.
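The counts that a max-subpattern tree stores can be approximated with a flat counter. The sketch below is a simplification, not the tree structure from the cited papers: it assumes the candidate max-pattern is a{b,c}d* and simply counts, for each period segment, its maximal subpattern of that candidate. The names `segments` and `hit` are hypothetical.

```python
from collections import Counter


def segments(series: str, p: int) -> list[str]:
    """Split `series` into consecutive segments of length p (drop any remainder)."""
    full = len(series) - len(series) % p
    return [series[i:i + p] for i in range(0, full, p)]


def hit(segment: str, c_max: list[set[str]]) -> str:
    """Maximal subpattern of c_max matched by `segment`; '*' means don't care."""
    return "".join(ch if ch in allowed else "*"
                   for ch, allowed in zip(segment, c_max))


series = "abdeacdfabdjacdsabdxakdy"
c_max = [{"a"}, {"b", "c"}, {"d"}, set()]  # assumed candidate a{b,c}d*

hits = Counter(hit(seg, c_max) for seg in segments(series, 4))
for pattern, count in hits.most_common():
    print(pattern, count)
```

In the actual tree, each of these patterns would be a node reached from the root by removing letters from the candidate max-pattern, with the count stored at the node.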

Approximate Incremental Tech.

- Streaming data => maintain the max-subpattern tree over the new data.
- Q = a{b,c}d*, Q' = a{b,e}df.
- The intersection of Q and Q' is abd* (equal to Q without c).
- The difference between Q' and abd* is e and f (Q' equals abd* plus e and f).
- The approximation happens at the insertion step.
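The intersection and difference of the two patterns can be reproduced with sets. In this sketch a pattern is assumed to be a list with one set of symbols per position, where an empty set stands for `*`; the helper names are hypothetical.

```python
Pattern = list[set[str]]


def intersect(q: Pattern, r: Pattern) -> Pattern:
    """Position-wise intersection of two patterns of equal length."""
    return [a & b for a, b in zip(q, r)]


def difference(q: Pattern, r: Pattern) -> Pattern:
    """Symbols of q, position by position, that are not in r."""
    return [a - b for a, b in zip(q, r)]


def show(pattern: Pattern) -> str:
    """Render a pattern in the slide's notation, e.g. a{b,c}d*."""
    parts = []
    for symbols in pattern:
        if not symbols:
            parts.append("*")
        elif len(symbols) == 1:
            parts.append(next(iter(symbols)))
        else:
            parts.append("{" + ",".join(sorted(symbols)) + "}")
    return "".join(parts)


Q = [{"a"}, {"b", "c"}, {"d"}, set()]        # a{b,c}d*
Q_new = [{"a"}, {"b", "e"}, {"d"}, {"f"}]    # a{b,e}df

common = intersect(Q, Q_new)
print(show(common))                     # shared part of Q and Q'
print(show(difference(Q, common)))      # what Q has beyond the shared part
print(show(difference(Q_new, common)))  # what Q' has beyond the shared part
```

The shared part is abd*, Q contributes only c at the second position, and Q' contributes e at the second position and f at the fourth.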

Hysteresis Threshold

- With a single threshold, a pattern q loses all its history information as soon as it becomes infrequent. When q becomes frequent again, it is treated as a newly appeared frequent pattern.
- With two thresholds, a pattern is
  - frequent if its frequency is above the higher threshold;
  - infrequent if its frequency is below the lower threshold.
- Patterns whose frequencies are above the lower threshold are kept in the tree.
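A minimal sketch of the two-threshold rule follows. The function name `track`, the example frequencies and the treatment of values exactly on a threshold are assumptions made for illustration.

```python
def track(frequencies: list[float], low: float, high: float) -> list[tuple[str, int]]:
    """For each observed frequency, return (status, steps of history retained).

    A pattern is 'frequent' at or above `high`, 'kept' between the two
    thresholds, and 'infrequent' below `low`, where its history is discarded.
    """
    history = 0
    result = []
    for f in frequencies:
        if f < low:
            history = 0              # dropped from the tree: history is lost
            status = "infrequent"
        else:
            history += 1             # still in the tree: history accumulates
            status = "frequent" if f >= high else "kept"
        result.append((status, history))
    return result


for status, history in track([0.9, 0.6, 0.9, 0.3, 0.9], low=0.5, high=0.8):
    print(status, history)
```

The dip to 0.6 stays above the lower threshold, so the history survives it; only the drop to 0.3 resets the history, after which the pattern is treated as new.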

Conclusion

- Discover potential periodicity rates in data streams.
- Use an incremental tree structure to mine periodic patterns.
- Use two thresholds to preserve the history of candidate frequent patterns.

