Published by Josef Dandridge. Modified about 1 year ago.

1
STAGGER: Periodicity Mining of Data Streams using Expanding Sliding Windows
Mohamed G. Elfeky, Walid G. Aref, Ahmed K. Elmagarmid
ICDM
Chen Yi-Chun, 2007/10/02

2
Outline
Motivation
Previous Approach
– SPD algorithm
– Max-Subpattern Tree
Approximate Incremental Technique
Conclusion

3
Motivation
Example stream: abcabcabcabcabc… (p=3)
Single sliding window of width w
– Smaller w: real-time output is supported
– Larger w: long periods can be found
Goal: real-time output and long periods together, so multiple sliding windows are proposed
[Figure: sliding windows reporting p=3; p=3; and p=3,5, each with patterns abc, *b*, a**, …]
Period detection: the SPD algorithm is used
Pattern mining: the max-subpattern tree is used

4
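Periodicity Detection
$\pi_{p,l}(S) = e_l, e_{l+p}, e_{l+2p}, \ldots, e_{l+(m-1)p}$: the projection of a data stream S according to a period p starting from position l, where $0 \le l < p$, $m = \lceil (n-l)/p \rceil$, and n is the length of S.
Ex. If S = abcabbabdb, then $\pi_{3,0}(S)$ = aaab; the final b is the outlier.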
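The projection can be sketched in a few lines of Python. This is only an illustration under the assumption that the stream is held as a string of single-character symbols; the function name `projection` is hypothetical and does not come from the slides.

```python
from math import ceil


def projection(stream: str, p: int, l: int) -> str:
    """Keep every p-th symbol of `stream`, starting at position l (0 <= l < p)."""
    if not 0 <= l < p:
        raise ValueError("need 0 <= l < p")
    # Extended slicing picks positions l, l+p, l+2p, ...
    return stream[l::p]


S = "abcabbabdb"
proj = projection(S, 3, 0)
print(proj)                                   # aaab
print(len(proj) == ceil((len(S) - 0) / 3))    # True: m = ceil((n - l) / p)
```

The trailing b in the output is the symbol that breaks the otherwise regular run of a's.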

5
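Cont.
$F_2(s, S)$: the number of times the symbol s occurs in two consecutive positions in the data stream S.
Ex. If S = abbaaabaa, then $F_2(a, S) = 3$ and $F_2(b, S) = 1$.
The ratio $F_2(s, \pi_{p,l}(S)) / (\lceil (n-l)/p \rceil - 1)$, taken over the projection $\pi_{p,l}(S)$ of S with period p from position l, indicates how often the symbol s occurs every p timestamps in a data stream S.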
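Counting consecutive occurrences is easy to check by hand or in code. The following is a minimal sketch, assuming the stream is a Python string; the name `f2` is a hypothetical helper chosen for this example.

```python
def f2(symbol: str, stream: str) -> int:
    """Count positions i where stream[i] and stream[i+1] both equal `symbol`."""
    return sum(1 for a, b in zip(stream, stream[1:]) if a == symbol and b == symbol)


S = "abbaaabaa"
print(f2("a", S))  # 3
print(f2("b", S))  # 1
```

Overlapping pairs are counted separately, so the run aaa contributes two pairs.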

6
Cont.
If a data stream S of length n contains a symbol s, a period p, and a position l such that $F_2(s, \pi_{p,l}(S)) / (\lceil (n-l)/p \rceil - 1) \ge \tau$, where $\pi_{p,l}(S)$ is the projection of S with period p from position l, then s is said to be periodic in S with a period of length p at position l with respect to the periodicity threshold $\tau$.
Ex. S = abcabbabdb
– The symbol a is periodic with a period of length 3 at position 0 with respect to a periodicity threshold $\tau \le 2/3$
– The pattern a** is a frequent single periodic pattern of length 3
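The periodicity test for the example stream can be worked through in code. This sketch assumes the ratio is the number of consecutive occurrences of the symbol in the projection, divided by the number of adjacent pairs in that projection; `periodicity` is a hypothetical name, and exact fractions are used so that the result can be compared against a threshold without rounding.

```python
from fractions import Fraction
from math import ceil


def periodicity(symbol: str, stream: str, p: int, l: int) -> Fraction:
    """Fraction of adjacent pairs in the projection that are both `symbol`."""
    proj = stream[l::p]                              # projection with period p from l
    pairs = sum(1 for a, b in zip(proj, proj[1:]) if a == b == symbol)
    denom = ceil((len(stream) - l) / p) - 1          # number of adjacent pairs
    return Fraction(pairs, denom) if denom > 0 else Fraction(0)


S = "abcabbabdb"
print(periodicity("a", S, 3, 0))  # 2/3
print(periodicity("b", S, 3, 1))  # 1
```

With a threshold of at most 2/3, the symbol a at position 0 passes the test for period 3.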

7
SPD algorithm
To detect the symbols that are periodic with period length p within S:
Shift S by p positions, denoted as $S^{(p)}$, and compare it with S
Ex. If S = a b c a b b a b c b…
$S^{(3)}$ = * * * a b c a b b a
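The shift-and-compare idea can be illustrated with a naive position-by-position comparison. This is a sketch only: it is not the implementation from the referenced paper, the function names are hypothetical, and it assumes the filler character `*` never appears in the stream itself.

```python
def shift(stream: str, p: int, filler: str = "*") -> str:
    """Shift `stream` right by p positions, padding the front with `filler`."""
    if not 0 < p <= len(stream):
        raise ValueError("need 0 < p <= len(stream)")
    return filler * p + stream[: len(stream) - p]


def matches(stream: str, p: int) -> list[tuple[int, str]]:
    """Positions where the stream agrees with its p-shifted copy."""
    shifted = shift(stream, p)
    return [(i, s) for i, (s, t) in enumerate(zip(stream, shifted)) if s == t]


S = "abcabbabcb"
print(shift(S, 3))    # ***abcabba
print(matches(S, 3))  # [(3, 'a'), (4, 'b'), (6, 'a'), (7, 'b')]
```

Grouping the matching positions by their remainder modulo p shows which symbol recurs at which offset: here a at offset 0 and b at offset 1.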

8
SPD algorithm in Time-Series
Symbol encoding: a:001, b:010, c:100
Example series: (a c c c a b b)
[Figure: worked computation on the encoded series for P=1 … P=4]
Reference: “Periodicity Detection in Time Series Databases” [TKDE05]

9
Single Window with SPD
[Figure: a single window over the series (a c c c a b b), with shift 1 and slide 2]

10
Multi Windows with SPD
[Figure: output of multiple windows]
– Smaller w: real-time output is supported
– Larger w: long periods can be found
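The trade-off between window widths can be seen in a small experiment. The sketch below is a simplification made for illustration, not the STAGGER procedure itself: it uses exact matching instead of SPD, it only considers periods up to half the window width, and the names `detect_periods` and `multi_window` are hypothetical.

```python
def detect_periods(window: str, max_p: int) -> list[int]:
    """Periods p <= max_p for which every symbol equals the one p positions earlier."""
    return [
        p
        for p in range(1, max_p + 1)
        if all(window[i] == window[i - p] for i in range(p, len(window)))
    ]


def multi_window(stream: str, widths: list[int]) -> dict[int, list[int]]:
    """Run the detector over the most recent w symbols, for each window width w."""
    result = {}
    for w in widths:
        if len(stream) >= w:  # a window reports only once it has filled up
            result[w] = detect_periods(stream[-w:], w // 2)
    return result


stream = "abcab" * 4  # a stream with period 5
print(multi_window(stream, [4, 12]))  # {4: [], 12: [5]}
```

The narrow window fills after four symbols and can report early, but it cannot see a period of 5; the wide window finds it, at the cost of waiting for twelve symbols.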

11
Max-Subpattern Tree
Reference: “Incremental, Online, and Merge Mining of Partial Periodic Patterns in Time-Series Databases” [TKDE04]
Reference: “Efficient Mining of Partial Periodic Patterns in Time Series Database” [ICDE99]
Example series: abdeacdfabdjacdsabdxakdy
[Figure: max-subpattern tree built from this series for a given period p]

12
Approximate Incremental Tech.
Streaming data => maintain the max-subpattern tree over the new data
Q = a{b,c}d*
Q' = a{b,e}df
– The intersection of Q and Q' is abd* (equal to Q without c)
– The difference between Q' and abd* is e and f (Q' equals abd* with e and f added)
The approximation happens in the insertion step
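The intersection and difference of the two patterns can be reproduced with position-wise set operations. This sketch assumes a pattern is stored as a list of symbol sets, one per position, with the empty set standing for `*`; that representation and the helper names are choices made for this example, not taken from the slides.

```python
Pattern = list[set[str]]


def intersect(q1: Pattern, q2: Pattern) -> Pattern:
    """Symbols that both patterns allow at each position."""
    return [a & b for a, b in zip(q1, q2)]


def difference(q: Pattern, base: Pattern) -> Pattern:
    """Symbols of q that are missing from base at each position."""
    return [a - b for a, b in zip(q, base)]


def show(pattern: Pattern) -> str:
    """Render a pattern in the a{b,c}d* notation."""
    parts = []
    for symbols in pattern:
        if not symbols:
            parts.append("*")
        elif len(symbols) == 1:
            parts.append(next(iter(symbols)))
        else:
            parts.append("{" + ",".join(sorted(symbols)) + "}")
    return "".join(parts)


Q = [{"a"}, {"b", "c"}, {"d"}, set()]        # a{b,c}d*
Q_new = [{"a"}, {"b", "e"}, {"d"}, {"f"}]    # a{b,e}df
common = intersect(Q, Q_new)
print(show(common))                     # abd*
print(show(difference(Q_new, common)))  # *e*f
```

The second line of output shows the symbols e and f that Q' has beyond the shared pattern abd*.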

13
Hysteresis Threshold
With a single frequency threshold, a pattern q will lose all its history information as soon as it becomes infrequent. When q becomes frequent again, it will be treated as a newly appeared frequent pattern.
Two thresholds are used instead. A pattern is
– Frequent if its frequency is above the higher threshold
– Infrequent if its frequency is below the lower threshold
– Patterns whose frequencies are above the lower threshold are kept in the tree.
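The two-threshold rule can be expressed as a small classification function. This is a sketch under stated assumptions: the threshold values and pattern frequencies are made up for illustration, the tree is reduced to a plain dictionary, and a frequency exactly equal to a threshold is treated as "kept", which the slides do not specify.

```python
def classify(freq: float, low: float, high: float) -> str:
    """Label a pattern by its frequency under a lower and a higher threshold."""
    if low > high:
        raise ValueError("low must not exceed high")
    if freq > high:
        return "frequent"
    if freq < low:
        return "infrequent"
    return "kept"  # not reported as frequent, but its history stays in the tree


def prune(tree: dict[str, float], low: float, high: float) -> dict[str, float]:
    """Drop only the patterns that fell below the lower threshold."""
    return {q: f for q, f in tree.items() if classify(f, low, high) != "infrequent"}


tree = {"abd*": 0.8, "a**": 0.5, "*b*": 0.2}  # hypothetical frequencies
print(prune(tree, low=0.3, high=0.6))  # {'abd*': 0.8, 'a**': 0.5}
```

The pattern a** sits between the two thresholds, so it stays in the structure and keeps its count if it later becomes frequent again.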

14
Conclusion
– Discover potential periodicity rates in data streams
– Use an incremental tree structure to mine periodic patterns
– Use two thresholds to preserve the history of candidate frequent patterns
