Download presentation

Presentation is loading. Please wait.

Published byPerla Eld Modified about 1 year ago

1
CMU SCS : Multimedia Databases and Data Mining Lecture #25: Time series mining and forecasting Christos Faloutsos

2
CMU SCS (c) C. Faloutsos, Must-Read Material Byong-Kee Yi, Nikolaos D. Sidiropoulos, Theodore Johnson, H.V. Jagadish, Christos Faloutsos and Alex Biliris, Online Data Mining for Co-Evolving Time Sequences, ICDE, Feb Chungmin Melvin Chen and Nick Roussopoulos, Adaptive Selectivity Estimation Using Query Feedbacks, SIGMOD 1994

3
CMU SCS (c) C. Faloutsos, Thanks Deepay Chakrabarti (CMU) Spiros Papadimitriou (CMU) Prof. Byoung-Kee Yi (Pohang U.)

4
CMU SCS (c) C. Faloutsos, Outline Motivation Similarity search – distance functions Linear Forecasting Bursty traffic - fractals and multifractals Non-linear forecasting Conclusions

5
CMU SCS (c) C. Faloutsos, Problem definition Given: one or more sequences x 1, x 2, …, x t, … (y 1, y 2, …, y t, … … ) Find –similar sequences; forecasts –patterns; clusters; outliers

6
CMU SCS (c) C. Faloutsos, Motivation - Applications Financial, sales, economic series Medical –ECGs +; blood pressure etc monitoring –reactions to new drugs –elderly care

7
CMU SCS (c) C. Faloutsos, Motivation - Applications (cont’d) ‘Smart house’ –sensors monitor temperature, humidity, air quality video surveillance

8
CMU SCS (c) C. Faloutsos, Motivation - Applications (cont’d) civil/automobile infrastructure –bridge vibrations [Oppenheim+02] – road conditions / traffic monitoring

9
CMU SCS (c) C. Faloutsos, Motivation - Applications (cont’d) Weather, environment/anti-pollution –volcano monitoring –air/water pollutant monitoring

10
CMU SCS (c) C. Faloutsos, Motivation - Applications (cont’d) Computer systems –‘Active Disks’ (buffering, prefetching) –web servers (ditto) –network traffic monitoring –...

11
CMU SCS (c) C. Faloutsos, Stream Data: Disk accesses time #bytes

12
CMU SCS (c) C. Faloutsos, Problem #1: Goal: given a signal (e.g.., #packets over time) Find: patterns, periodicities, and/or compress year count lynx caught per year (packets per day; temperature per day)

13
CMU SCS (c) C. Faloutsos, Problem#2: Forecast Given x t, x t-1, …, forecast x t Time Tick Number of packets sent ??

14
CMU SCS (c) C. Faloutsos, Problem#2’: Similarity search E.g.., Find a 3-tick pattern, similar to the last one Time Tick Number of packets sent ??

15
CMU SCS (c) C. Faloutsos, Problem #3: Given: A set of correlated time sequences Forecast ‘Sent(t)’

16
CMU SCS (c) C. Faloutsos, Important observations Patterns, rules, forecasting and similarity indexing are closely related: To do forecasting, we need –to find patterns/rules –to find similar settings in the past to find outliers, we need to have forecasts –(outlier = too far away from our forecast)

17
CMU SCS (c) C. Faloutsos, Outline Motivation Similarity Search and Indexing Linear Forecasting Bursty traffic - fractals and multifractals Non-linear forecasting Conclusions

18
CMU SCS (c) C. Faloutsos, Outline Motivation Similarity search and distance functions –Euclidean –Time-warping...

19
CMU SCS (c) C. Faloutsos, Importance of distance functions Subtle, but absolutely necessary: A ‘must’ for similarity indexing (-> forecasting) A ‘must’ for clustering Two major families –Euclidean and Lp norms –Time warping and variations

20
CMU SCS (c) C. Faloutsos, Euclidean and Lp... L 1 : city-block = Manhattan L 2 = Euclidean L

21
CMU SCS (c) C. Faloutsos, Observation #1 Time sequence -> n-d vector... Day-1 Day-2 Day-n

22
CMU SCS (c) C. Faloutsos, Observation #2 Euclidean distance is closely related to –cosine similarity –dot product –‘cross-correlation’ function... Day-1 Day-2 Day-n

23
CMU SCS (c) C. Faloutsos, Time Warping allow accelerations - decelerations –(with or w/o penalty) THEN compute the (Euclidean) distance (+ penalty) related to the string-editing distance

24
CMU SCS (c) C. Faloutsos, Time Warping ‘stutters’:

25
CMU SCS (c) C. Faloutsos, Time warping Q: how to compute it? A: dynamic programming D( i, j ) = cost to match prefix of length i of first sequence x with prefix of length j of second sequence y

26
CMU SCS (c) C. Faloutsos, Thus, with no penalty for stutter, for sequences x 1, x 2, …, x i,; y 1, y 2, …, y j x-stutter y-stutter no stutter Time warping

27
CMU SCS (c) C. Faloutsos, VERY SIMILAR to the string-editing distance x-stutter y-stutter no stutter Time warping

28
CMU SCS (c) C. Faloutsos, Time warping Complexity: O(M*N) - quadratic on the length of the strings Many variations (penalty for stutters; limit on the number/percentage of stutters; …) popular in voice processing [Rabiner + Juang]

29
CMU SCS (c) C. Faloutsos, Other Distance functions piece-wise linear/flat approx.; compare pieces [Keogh+01] [Faloutsos+97] ‘cepstrum’ (for voice [Rabiner+Juang]) –do DFT; take log of amplitude; do DFT again! Allow for small gaps [Agrawal+95] See tutorial by [Gunopulos + Das, SIGMOD01]

30
CMU SCS (c) C. Faloutsos, Other Distance functions In [Keogh+, KDD’04]: parameter-free, MDL based

31
CMU SCS (c) C. Faloutsos, Conclusions Prevailing distances: –Euclidean and –time-warping

32
CMU SCS (c) C. Faloutsos, Outline Motivation Similarity search and distance functions Linear Forecasting Bursty traffic - fractals and multifractals Non-linear forecasting Conclusions

33
CMU SCS (c) C. Faloutsos, Linear Forecasting

34
CMU SCS (c) C. Faloutsos, Forecasting "Prediction is very difficult, especially about the future." - Nils Bohr houghts.html

35
CMU SCS (c) C. Faloutsos, Outline Motivation... Linear Forecasting –Auto-regression: Least Squares; RLS –Co-evolving time sequences –Examples –Conclusions

36
CMU SCS (c) C. Faloutsos, Reference [Yi+00] Byoung-Kee Yi et al.: Online Data Mining for Co-Evolving Time Sequences, ICDE (Describes MUSCLES and Recursive Least Squares)

37
CMU SCS (c) C. Faloutsos, Problem#2: Forecast Example: give x t-1, x t-2, …, forecast x t Time Tick Number of packets sent ??

38
CMU SCS (c) C. Faloutsos, Forecasting: Preprocessing MANUALLY: remove trends spot periodicities time 7 days

39
CMU SCS (c) C. Faloutsos, Problem#2: Forecast Solution: try to express x t as a linear function of the past: x t-2, x t-2, …, (up to a window of w) Formally: Time Tick ??

40
CMU SCS (c) C. Faloutsos, (Problem: Back-cast; interpolate) Solution - interpolate: try to express x t as a linear function of the past AND the future: x t+1, x t+2, … x t+wfuture; x t-1, … x t-wpast (up to windows of w past, w future ) EXACTLY the same algo’s Time Tick ??

41
CMU SCS (c) C. Faloutsos, Linear Regression: idea Body weight express what we don’t know (= ‘dependent variable’) as a linear function of what we know (= ‘indep. variable(s)’) Body height

42
CMU SCS (c) C. Faloutsos, Linear Auto Regression:

43
CMU SCS (c) C. Faloutsos, Linear Auto Regression: Number of packets sent (t-1) Number of packets sent (t) lag w=1 Dependent variable = # of packets sent (S [t]) Independent variable = # of packets sent (S[t-1]) ‘lag-plot’

44
CMU SCS (c) C. Faloutsos, Outline Motivation... Linear Forecasting –Auto-regression: Least Squares; RLS –Co-evolving time sequences –Examples –Conclusions

45
CMU SCS (c) C. Faloutsos, More details: Q1: Can it work with window w>1? A1: YES! x t-2 xtxt x t-1

46
CMU SCS (c) C. Faloutsos, More details: Q1: Can it work with window w>1? A1: YES! (we’ll fit a hyper-plane, then!) x t-2 xtxt x t-1

47
CMU SCS (c) C. Faloutsos, More details: Q1: Can it work with window w>1? A1: YES! (we’ll fit a hyper-plane, then!) x t-2 x t-1 xtxt

48
CMU SCS (c) C. Faloutsos, More details: Q1: Can it work with window w>1? A1: YES! The problem becomes: X [N w] a [w 1] = y [N 1] OVER-CONSTRAINED –a is the vector of the regression coefficients –X has the N values of the w indep. variables –y has the N values of the dependent variable

49
CMU SCS (c) C. Faloutsos, More details: X [N w] a [w 1] = y [N 1] Ind-var1 Ind-var-w time

50
CMU SCS (c) C. Faloutsos, More details: X [N w] a [w 1] = y [N 1] Ind-var1 Ind-var-w time

51
CMU SCS (c) C. Faloutsos, More details Q2: How to estimate a 1, a 2, … a w = a? A2: with Least Squares fit (Moore-Penrose pseudo-inverse) a is the vector that minimizes the RMSE from y a = ( X T X ) -1 (X T y)

52
CMU SCS (c) C. Faloutsos, More details Q2: How to estimate a 1, a 2, … a w = a? A2: with Least Squares fit Identical to earlier formula (proof?) Where a = ( X T X ) -1 (X T y) a = V x x U T y X = U x x V T

53
CMU SCS (c) C. Faloutsos, More details Straightforward solution: Observations: –Sample matrix X grows over time –needs matrix inversion –O(N w 2 ) computation –O(N w) storage a = ( X T X ) -1 (X T y) a : Regression Coeff. Vector X : Sample Matrix XN:XN: w N

54
CMU SCS (c) C. Faloutsos, Even more details Q3: Can we estimate a incrementally? A3: Yes, with the brilliant, classic method of ‘Recursive Least Squares’ (RLS) (see, e.g., [Yi+00], for details). We can do the matrix inversion, WITHOUT inversion! (How is that possible?!)

55
CMU SCS (c) C. Faloutsos, Even more details Q3: Can we estimate a incrementally? A3: Yes, with the brilliant, classic method of ‘Recursive Least Squares’ (RLS) (see, e.g., [Yi+00], for details). We can do the matrix inversion, WITHOUT inversion! (How is that possible?!) A: our matrix has special form: (X T X)

56
CMU SCS (c) C. Faloutsos, More details XN:XN: w N X N+1 At the N+1 time tick: x N+1

57
CMU SCS (c) C. Faloutsos, More details Let G N = ( X N T X N ) -1 (``gain matrix’’) G N+1 can be computed recursively from G N GNGN w w

58
CMU SCS (c) C. Faloutsos, EVEN more details: 1 x w row vector Let’s elaborate (VERY IMPORTANT, VERY VALUABLE!)

59
CMU SCS (c) C. Faloutsos, EVEN more details:

60
CMU SCS (c) C. Faloutsos, EVEN more details: [w x 1] [w x (N+1)] [(N+1) x w] [w x (N+1)] [(N+1) x 1]

61
CMU SCS (c) C. Faloutsos, EVEN more details: [w x (N+1)] [(N+1) x w]

62
CMU SCS (c) C. Faloutsos, EVEN more details: 1 x w row vector ‘gain matrix’

63
CMU SCS (c) C. Faloutsos, EVEN more details:

64
CMU SCS (c) C. Faloutsos, EVEN more details: wxw wx1 1xw wxw 1x1 SCALAR!

65
CMU SCS (c) C. Faloutsos, Altogether:

66
CMU SCS (c) C. Faloutsos, Altogether: where I: w x w identity matrix : a large positive number (say, 10 4 ) IMPORTANT!

67
CMU SCS (c) C. Faloutsos, Comparison: Straightforward Least Squares –Needs huge matrix (growing in size) O(N×w) –Costly matrix operation O(N×w 2 ) Recursive LS –Need much smaller, fixed size matrix O(w×w) –Fast, incremental computation O(1×w 2 ) –no matrix inversion N = 10 6, w = 1-100

68
CMU SCS (c) C. Faloutsos, Pictorially: Given: Independent Variable Dependent Variable

69
CMU SCS (c) C. Faloutsos, Pictorially: Independent Variable Dependent Variable. new point

70
CMU SCS (c) C. Faloutsos, Pictorially: Independent Variable Dependent Variable RLS: quickly compute new best fit new point

71
CMU SCS (c) C. Faloutsos, Even more details Q4: can we ‘forget’ the older samples? A4: Yes - RLS can easily handle that [Yi+00]:

72
CMU SCS (c) C. Faloutsos, Adaptability - ‘forgetting’ Independent Variable eg., #packets sent Dependent Variable eg., #bytes sent

73
CMU SCS (c) C. Faloutsos, Adaptability - ‘forgetting’ Independent Variable eg. #packets sent Dependent Variable eg., #bytes sent Trend change (R)LS with no forgetting

74
CMU SCS (c) C. Faloutsos, Adaptability - ‘forgetting’ Independent Variable Dependent Variable Trend change (R)LS with no forgetting (R)LS with forgetting RLS: can *trivially* handle ‘forgetting’

75
CMU SCS (c) C. Faloutsos, How to choose ‘w’? goal: capture arbitrary periodicities with NO human intervention on a semi-infinite stream Details

76
CMU SCS (c) C. Faloutsos, Reference [Papadimitriou+ vldb2003] Spiros Papadimitriou, Anthony Brockwell and Christos Faloutsos Adaptive, Hands-Off Stream Mining VLDB 2003, Berlin, Germany, Sept Details

77
CMU SCS (c) C. Faloutsos, Answer: ‘AWSOM’ (Arbitrary Window Stream fOrecasting Method) [Papadimitriou+, vldb2003] idea: do AR on each wavelet level in detail: Details

78
CMU SCS (c) C. Faloutsos, AWSOM xtxt t t W 1,1 t W 1,2 t W 1,3 t W 1,4 t W 2,1 t W 2,2 t W 3,1 t V 4,1 time frequency = Details

79
CMU SCS (c) C. Faloutsos, AWSOM xtxt t t W 1,1 t W 1,2 t W 1,3 t W 1,4 t W 2,1 t W 2,2 t W 3,1 t V 4,1 time frequency Details

80
CMU SCS (c) C. Faloutsos, AWSOM - idea W l,t W l,t-1 W l,t-2 W l,t l,1 W l,t-1 l,2 W l,t-2 … W l’,t’-1 W l’,t’-2 W l’,t’ W l’,t’ l’,1 W l’,t’-1 l’,2 W l’,t’-2 … Details

81
CMU SCS (c) C. Faloutsos, More details… Update of wavelet coefficients Update of linear models Feature selection –Not all correlations are significant –Throw away the insignificant ones (“noise”) (incremental) (incremental; RLS) (single-pass) Details

82
CMU SCS (c) C. Faloutsos, Results - Synthetic data Triangle pulse Mix (sine + square) AR captures wrong trend (or none) Seasonal AR estimation fails AWSOMARSeasonal AR Details

83
CMU SCS (c) C. Faloutsos, Results - Real data Automobile traffic –Daily periodicity –Bursty “noise” at smaller scales AR fails to capture any trend Seasonal AR estimation fails Details

84
CMU SCS (c) C. Faloutsos, Results - real data Sunspot intensity –Slightly time-varying “period” AR captures wrong trend Seasonal ARIMA –wrong downward trend, despite help by human! Details

85
CMU SCS (c) C. Faloutsos, Complexity Model update Space: O lgN + mk 2 O lgN Time: O k 2 O 1 Where –N: number of points (so far) –k:number of regression coefficients; fixed –m:number of linear models; O lgN Details

86
CMU SCS (c) C. Faloutsos, Outline Motivation... Linear Forecasting –Auto-regression: Least Squares; RLS –Co-evolving time sequences –Examples –Conclusions

87
CMU SCS (c) C. Faloutsos, Co-Evolving Time Sequences Given: A set of correlated time sequences Forecast ‘Repeated(t)’ ??

88
CMU SCS (c) C. Faloutsos, Solution: Q: what should we do?

89
CMU SCS (c) C. Faloutsos, Solution: Least Squares, with Dep. Variable: Repeated(t) Indep. Variables: Sent(t-1) … Sent(t-w); Lost(t-1) …Lost(t-w); Repeated(t-1),... (named: ‘MUSCLES’ [Yi+00])

90
CMU SCS (c) C. Faloutsos, Forecasting - Outline Auto-regression Least Squares; recursive least squares Co-evolving time sequences Examples Conclusions

91
CMU SCS (c) C. Faloutsos, Examples - Experiments Datasets –Modem pool traffic (14 modems, 1500 time- ticks; #packets per time unit) –AT&T WorldNet internet usage (several data streams; 980 time-ticks) Measures of success –Accuracy : Root Mean Square Error (RMSE)

92
CMU SCS (c) C. Faloutsos, Accuracy - “Modem” MUSCLES outperforms AR & “yesterday”

93
CMU SCS (c) C. Faloutsos, Accuracy - “Internet” MUSCLES consistently outperforms AR & “yesterday”

94
CMU SCS (c) C. Faloutsos, Linear forecasting - Outline Auto-regression Least Squares; recursive least squares Co-evolving time sequences Examples Conclusions

95
CMU SCS (c) C. Faloutsos, Conclusions - Practitioner’s guide AR(IMA) methodology: prevailing method for linear forecasting Brilliant method of Recursive Least Squares for fast, incremental estimation. See [Box-Jenkins] (AWSOM: no human intervention)

96
CMU SCS (c) C. Faloutsos, Resources: software and urls free-ware: ‘R’ for stat. analysis (clone of Splus) python script for RLS

97
CMU SCS (c) C. Faloutsos, Books George E.P. Box and Gwilym M. Jenkins and Gregory C. Reinsel, Time Series Analysis: Forecasting and Control, Prentice Hall, 1994 (the classic book on ARIMA, 3rd ed.) Brockwell, P. J. and R. A. Davis (1987). Time Series: Theory and Methods. New York, Springer Verlag.

98
CMU SCS (c) C. Faloutsos, Additional Reading [Papadimitriou+ vldb2003] Spiros Papadimitriou, Anthony Brockwell and Christos Faloutsos Adaptive, Hands-Off Stream Mining VLDB 2003, Berlin, Germany, Sept [Yi+00] Byoung-Kee Yi et al.: Online Data Mining for Co-Evolving Time Sequences, ICDE (Describes MUSCLES and Recursive Least Squares)

99
CMU SCS (c) C. Faloutsos, Outline Motivation Similarity search and distance functions Linear Forecasting Bursty traffic - fractals and multifractals Non-linear forecasting Conclusions

100
CMU SCS (c) C. Faloutsos, Bursty Traffic & Multifractals

101
CMU SCS (c) C. Faloutsos, SKIP, if you have done HW2, Q3: Foils use D1 (‘information fractal dimension’), While HW2-Q3 uses D2 (‘correlation’ f.d.)

102
CMU SCS (c) C. Faloutsos, Outline Motivation... Linear Forecasting Bursty traffic - fractals and multifractals –Problem –Main idea (80/20, Hurst exponent) –Results HW2-Q3

103
CMU SCS (c) C. Faloutsos, Reference: [Wang+02] Mengzhi Wang, Tara Madhyastha, Ngai Hang Chang, Spiros Papadimitriou and Christos Faloutsos, Data Mining Meets Performance Evaluation: Fast Algorithms for Modeling Bursty Traffic, ICDE 2002, San Jose, CA, 2/26/ /1/2002. Full thesis: CMU-CS Performance Modeling of Storage Devices using Machine Learning Mengzhi Wang, Ph.D. Thesis Abstract,.ps.gz,.pdf Abstract.ps.gz.pdf HW2-Q3

104
CMU SCS (c) C. Faloutsos, Recall: Problem #1: Goal: given a signal (eg., #bytes over time) Find: patterns, periodicities, and/or compress time #bytes Bytes per 30’ (packets per day; earthquakes per year) HW2-Q3

105
CMU SCS (c) C. Faloutsos, Problem #1 model bursty traffic generate realistic traces (Poisson does not work) time # bytes Poisson HW2-Q3

106
CMU SCS (c) C. Faloutsos, Motivation predict queue length distributions (e.g., to give probabilistic guarantees) “learn” traffic, for buffering, prefetching, ‘active disks’, web servers HW2-Q3

107
CMU SCS (c) C. Faloutsos, Q: any ‘pattern’? time # bytes Not Poisson spike; silence; more spikes; more silence… any rules? HW2-Q3

108
CMU SCS (c) C. Faloutsos, Solution: self-similarity # bytes time # bytes HW2-Q3

109
CMU SCS (c) C. Faloutsos, But: Q1: How to generate realistic traces; extrapolate; give guarantees? Q2: How to estimate the model parameters? HW2-Q3

110
CMU SCS (c) C. Faloutsos, Outline Motivation... Linear Forecasting Bursty traffic - fractals and multifractals –Problem –Main idea (80/20, Hurst exponent) –Results HW2-Q3

111
CMU SCS (c) C. Faloutsos, Approach Q1: How to generate a sequence, that is –bursty –self-similar –and has similar queue length distributions HW2-Q3

112
CMU SCS (c) C. Faloutsos, Approach A: ‘binomial multifractal’ [Wang+02] ~ ‘law’: –80% of bytes/queries etc on first half –repeat recursively b: bias factor (eg., 80%) HW2-Q3

113
CMU SCS (c) C. Faloutsos, Binary multifractals HW2-Q3

114
CMU SCS (c) C. Faloutsos, Binary multifractals HW2-Q3

115
CMU SCS (c) C. Faloutsos, Parameter estimation Q2: How to estimate the bias factor b? HW2-Q3

116
CMU SCS (c) C. Faloutsos, Parameter estimation Q2: How to estimate the bias factor b? A: MANY ways [Crovella+96] –Hurst exponent –variance plot –even DFT amplitude spectrum! (‘periodogram’) –More robust: ‘entropy plot’ [Wang+02] HW2-Q3

117
CMU SCS (c) C. Faloutsos, Entropy plot Rationale: – burstiness: inverse of uniformity –entropy measures uniformity of a distribution –find entropy at several granularities, to see whether/how our distribution is close to uniform. HW2-Q3

118
CMU SCS (c) C. Faloutsos, Entropy plot Entropy E(n) after n levels of splits n=1: E(1)= - p1 log 2 (p1)- p2 log 2 (p2) p1p2 % of bytes here HW2-Q3

119
CMU SCS (c) C. Faloutsos, Entropy plot Entropy E(n) after n levels of splits n=1: E(1)= - p1 log(p1)- p2 log(p2) n=2: E(2) = - p 2,i * log 2 (p 2,i ) p 2,1 p 2,2 p 2,3 p 2,4 HW2-Q3

120
CMU SCS (c) C. Faloutsos, Real traffic Has linear entropy plot (-> self-similar) # of levels (n) Entropy E(n) 0.73 HW2-Q3

121
CMU SCS (c) C. Faloutsos, Observation - intuition: intuition: slope = intrinsic dimensionality = info-bits per coordinate-bit –unif. Dataset: slope =? –multi-point: slope = ? # of levels (n) Entropy E(n) HW2-Q3

122
CMU SCS (c) C. Faloutsos, Observation - intuition: intuition: slope = intrinsic dimensionality = info-bits per coordinate-bit –unif. Dataset: slope =1 –multi-point: slope = 0 # of levels (n) Entropy E(n) HW2-Q3

123
CMU SCS (c) C. Faloutsos, Entropy plot - Intuition Slope ~ intrinsic dimensionality (in fact, ‘Information fractal dimension’) = info bit per coordinate bit - eg Dim = 1 Pick a point; reveal its coordinate bit-by-bit - how much info is each bit worth to me? HW2-Q3

124
CMU SCS (c) C. Faloutsos, Entropy plot Slope ~ intrinsic dimensionality (in fact, ‘Information fractal dimension’) = info bit per coordinate bit - eg Dim = 1 Is MSB 0? ‘info’ value = E(1): 1 bit HW2-Q3

125
CMU SCS (c) C. Faloutsos, Entropy plot Slope ~ intrinsic dimensionality (in fact, ‘Information fractal dimension’) = info bit per coordinate bit - eg Dim = 1 Is MSB 0? Is next MSB =0? HW2-Q3

126
CMU SCS (c) C. Faloutsos, Entropy plot Slope ~ intrinsic dimensionality (in fact, ‘Information fractal dimension’) = info bit per coordinate bit - eg Dim = 1 Is MSB 0? Is next MSB =0? Info value =1 bit = E(2) - E(1) = slope! HW2-Q3

127
CMU SCS (c) C. Faloutsos, Entropy plot Repeat, for all points at same position: Dim=0 HW2-Q3

128
CMU SCS (c) C. Faloutsos, Entropy plot Repeat, for all points at same position: we need 0 bits of info, to determine position -> slope = 0 = intrinsic dimensionality Dim=0 HW2-Q3

129
CMU SCS (c) C. Faloutsos, Entropy plot Real (and 80-20) datasets can be in- between: bursts, gaps, smaller bursts, smaller gaps, at every scale Dim = 1 Dim=0 0

130
CMU SCS (c) C. Faloutsos, (Fractals, again) What set of points could have behavior between point and line? HW2-Q3

131
CMU SCS (c) C. Faloutsos, Cantor dust Eliminate the middle third Recursively! HW2-Q3

132
CMU SCS (c) C. Faloutsos, Cantor dust HW2-Q3

133
CMU SCS (c) C. Faloutsos, Cantor dust HW2-Q3

134
CMU SCS (c) C. Faloutsos, Cantor dust HW2-Q3

135
CMU SCS (c) C. Faloutsos, Cantor dust HW2-Q3

136
CMU SCS (c) C. Faloutsos, Dimensionality? (no length; infinite # points!) Answer: log2 / log3 = 0.6 Cantor dust HW2-Q3

137
CMU SCS (c) C. Faloutsos, Some more entropy plots: Poisson vs real Poisson: slope = ~1 -> uniformly distributed HW2-Q3

138
CMU SCS (c) C. Faloutsos, b-model b-model traffic gives perfectly linear plot Lemma: its slope is slope = -b log 2 b - (1-b) log 2 (1-b) Fitting: do entropy plot; get slope; solve for b E(n) n HW2-Q3

139
CMU SCS (c) C. Faloutsos, Outline Motivation... Linear Forecasting Bursty traffic - fractals and multifractals –Problem –Main idea (80/20, Hurst exponent) –Experiments - Results HW2-Q3

140
CMU SCS (c) C. Faloutsos, Experimental setup Disk traces (from HP [Wilkes 93]) web traces from LBL lbl-conn-7.tar.Z HW2-Q3

141
CMU SCS (c) C. Faloutsos, Model validation Linear entropy plots Bias factors b: smallest b / smoothest: nntp traffic HW2-Q3

142
CMU SCS (c) C. Faloutsos, Web traffic - results LBL, NCDF of queue lengths (log-log scales) (queue length l) Prob( >l) How to give guarantees? HW2-Q3

143
CMU SCS (c) C. Faloutsos, Web traffic - results LBL, NCDF of queue lengths (log-log scales) (queue length l) Prob( >l) 20% of the requests will see queue lengths <100 HW2-Q3

144
CMU SCS (c) C. Faloutsos, Conclusions Multifractals (80/20, ‘b-model’, Multiplicative Wavelet Model (MWM)) for analysis and synthesis of bursty traffic

145
CMU SCS (c) C. Faloutsos, Books Fractals: Manfred Schroeder: Fractals, Chaos, Power Laws: Minutes from an Infinite Paradise W.H. Freeman and Company, 1991 (Probably the BEST book on fractals!)

146
CMU SCS (c) C. Faloutsos, Further reading: Crovella, M. and A. Bestavros (1996). Self- Similarity in World Wide Web Traffic, Evidence and Possible Causes. Sigmetrics. [ieeeTN94] W. E. Leland, M.S. Taqqu, W. Willinger, D.V. Wilson, On the Self-Similar Nature of Ethernet Traffic, IEEE Transactions on Networking, 2, 1, pp 1-15, Feb

147
CMU SCS (c) C. Faloutsos, Further reading [Riedi+99] R. H. Riedi, M. S. Crouse, V. J. Ribeiro, and R. G. Baraniuk, A Multifractal Wavelet Model with Application to Network Traffic, IEEE Special Issue on Information Theory, 45. (April 1999), [Wang+02] Mengzhi Wang, Tara Madhyastha, Ngai Hang Chang, Spiros Papadimitriou and Christos Faloutsos, Data Mining Meets Performance Evaluation: Fast Algorithms for Modeling Bursty Traffic, ICDE 2002, San Jose, CA, 2/26/ /1/2002. Entropy plots

148
CMU SCS (c) C. Faloutsos, Outline Motivation... Linear Forecasting Bursty traffic - fractals and multifractals Non-linear forecasting Conclusions

149
CMU SCS (c) C. Faloutsos, Chaos and non-linear forecasting

150
CMU SCS (c) C. Faloutsos, Reference: [ Deepay Chakrabarti and Christos Faloutsos F4: Large-Scale Automated Forecasting using Fractals CIKM 2002, Washington DC, Nov ]

151
CMU SCS (c) C. Faloutsos, Detailed Outline Non-linear forecasting –Problem –Idea –How-to –Experiments –Conclusions

152
CMU SCS (c) C. Faloutsos, Recall: Problem #1 Given a time series {x t }, predict its future course, that is, x t+1, x t+2,... Time Value

153
CMU SCS (c) C. Faloutsos, Datasets Logistic Parabola: x t = ax t-1 (1-x t-1 ) + noise Models population of flies [R. May/1976] time x(t) Lag-plot ARIMA: fails

154
CMU SCS (c) C. Faloutsos, How to forecast? ARIMA - but: linearity assumption Lag-plot ARIMA: fails

155
CMU SCS (c) C. Faloutsos, How to forecast? ARIMA - but: linearity assumption ANSWER: ‘Delayed Coordinate Embedding’ = Lag Plots [Sauer92] ~ nearest-neighbor search, for past incidents

156
CMU SCS (c) C. Faloutsos, General Intuition (Lag Plot) x t-1 xtxtxtxt 4-NN New Point Interpolate these… To get the final prediction Lag = 1, k = 4 NN

157
CMU SCS (c) C. Faloutsos, Questions: Q1: How to choose lag L? Q2: How to choose k (the # of NN)? Q3: How to interpolate? Q4: why should this work at all?

158
CMU SCS (c) C. Faloutsos, Q1: Choosing lag L Manually (16, in award winning system by [Sauer94])

159
CMU SCS (c) C. Faloutsos, Q2: Choosing number of neighbors k Manually (typically ~ 1-10)

160
CMU SCS (c) C. Faloutsos, Q3: How to interpolate? How do we interpolate between the k nearest neighbors? A3.1: Average A3.2: Weighted average (weights drop with distance - how?)

161
CMU SCS (c) C. Faloutsos, Q3: How to interpolate? A3.3: Using SVD - seems to perform best ([Sauer94] - first place in the Santa Fe forecasting competition) X t-1 xtxt

162
CMU SCS (c) C. Faloutsos, Q4: Any theory behind it? A4: YES!

163
CMU SCS (c) C. Faloutsos, Theoretical foundation Based on the ‘Takens theorem’ [Takens81] which says that long enough delay vectors can do prediction, even if there are unobserved variables in the dynamical system (= diff. equations)

164
CMU SCS (c) C. Faloutsos, Theoretical foundation Example: Lotka-Volterra equations dH/dt = r H – a H*P dP/dt = b H*P – m P H is count of prey (e.g., hare) P is count of predators (e.g., lynx) Suppose only P(t) is observed (t=1, 2, …). H P Skip

165
CMU SCS (c) C. Faloutsos, Theoretical foundation But the delay vector space is a faithful reconstruction of the internal system state So prediction in delay vector space is as good as prediction in state space Skip H P P(t-1) P(t)

166
CMU SCS (c) C. Faloutsos, Detailed Outline Non-linear forecasting –Problem –Idea –How-to –Experiments –Conclusions

167
CMU SCS (c) C. Faloutsos, Datasets Logistic Parabola: x t = ax t-1 (1-x t-1 ) + noise Models population of flies [R. May/1976] time x(t) Lag-plot

168
CMU SCS (c) C. Faloutsos, Datasets Logistic Parabola: x t = ax t-1 (1-x t-1 ) + noise Models population of flies [R. May/1976] time x(t) Lag-plot ARIMA: fails

169
CMU SCS (c) C. Faloutsos, Logistic Parabola Timesteps Value Our Prediction from here

170
CMU SCS (c) C. Faloutsos, Logistic Parabola Timesteps Value Comparison of prediction to correct values

171
CMU SCS (c) C. Faloutsos, Datasets LORENZ: Models convection currents in the air dx / dt = a (y - x) dy / dt = x (b - z) - y dz / dt = xy - c z Value

172
CMU SCS (c) C. Faloutsos, LORENZ Timesteps Value Comparison of prediction to correct values

173
CMU SCS (c) C. Faloutsos, Datasets Time Value LASER: fluctuations in a Laser over time (used in Santa Fe competition)

174
CMU SCS (c) C. Faloutsos, Laser Timesteps Value Comparison of prediction to correct values

175
CMU SCS (c) C. Faloutsos, Conclusions Lag plots for non-linear forecasting (Takens’ theorem) suitable for ‘chaotic’ signals

176
CMU SCS (c) C. Faloutsos, References Deepay Chakrabarti and Christos Faloutsos F4: Large- Scale Automated Forecasting using Fractals CIKM 2002, Washington DC, Nov Sauer, T. (1994). Time series prediction using delay coordinate embedding. (in book by Weigend and Gershenfeld, below) Addison-Wesley. Takens, F. (1981). Detecting strange attractors in fluid turbulence. Dynamical Systems and Turbulence. Berlin: Springer-Verlag.

177
CMU SCS (c) C. Faloutsos, References Weigend, A. S. and N. A. Gerschenfeld (1994). Time Series Prediction: Forecasting the Future and Understanding the Past, Addison Wesley. (Excellent collection of papers on chaotic/non-linear forecasting, describing the algorithms behind the winners of the Santa Fe competition.)

178
CMU SCS (c) C. Faloutsos, Overall conclusions Similarity search: Euclidean/time-warping; feature extraction and SAMs

179
CMU SCS (c) C. Faloutsos, Overall conclusions Similarity search: Euclidean/time-warping; feature extraction and SAMs Signal processing: DWT is a powerful tool

180
CMU SCS (c) C. Faloutsos, Overall conclusions Similarity search: Euclidean/time-warping; feature extraction and SAMs Signal processing: DWT is a powerful tool Linear Forecasting: AR (Box-Jenkins) methodology; AWSOM

181
CMU SCS (c) C. Faloutsos, Overall conclusions Similarity search: Euclidean/time-warping; feature extraction and SAMs Signal processing: DWT is a powerful tool Linear Forecasting: AR (Box-Jenkins) methodology; AWSOM Bursty traffic: multifractals (80-20 ‘law’)

182
CMU SCS (c) C. Faloutsos, Overall conclusions Similarity search: Euclidean/time-warping; feature extraction and SAMs Signal processing: DWT is a powerful tool Linear Forecasting: AR (Box-Jenkins) methodology; AWSOM Bursty traffic: multifractals (80-20 ‘law’) Non-linear forecasting: lag-plots (Takens)

Similar presentations

© 2017 SlidePlayer.com Inc.

All rights reserved.

Ads by Google