1
Deepayan Chakrabarti, CIKM
F4: Large Scale Automated Forecasting Using Fractals
- Deepayan Chakrabarti
- Christos Faloutsos

2
Outline
- Introduction/Motivation
- Survey and Lag Plots
- Exact Problem Formulation
- Proposed Method
  - Fractal Dimensions Background
  - Our method
- Results
- Conclusions

3
General Problem Definition
Given a time series {x_t}, predict its future course, that is, x_{t+1}, x_{t+2}, ...
[Figure: Value vs. Time, with the future values marked "?"]

4
Motivation
- Traditional fields
  - Financial data analysis
  - Physiological data, elderly care
  - Weather, environmental studies
- Sensor Networks (MEMS, “SmartDust”)
  - Long / “infinite” series
  - No human intervention: “black box”

6
How to forecast?
- ARIMA, but linearity assumption
- Neural Networks, but large number of parameters and long training times [Wan/1993, Mozer/1993]
- Hidden Markov Models: O(N^2) in the number of nodes N; fixing N is also a problem [Ge+/2000]
- Lag Plots

7
Lag Plots
[Figure: lag plot of x_t vs. x_{t-1}, showing a new point and its 4 nearest neighbors (4-NN)]
- Interpolate the nearest neighbors to get the final prediction
- Q0: Interpolation method?
- Q1: Lag = ?
- Q2: K = ?
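
The lag-plot procedure on this slide can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names `delay_vectors` and `knn_forecast` are hypothetical, and plain averaging of the neighbors' successors is swapped in for the SVD-based interpolation named under Q0.

```python
import math

def delay_vectors(series, lag):
    """Return (vector, next_value) pairs; vector holds the `lag` values before next_value."""
    return [(series[t - lag:t], series[t]) for t in range(lag, len(series))]

def knn_forecast(series, lag, k):
    """Predict the value that follows `series` from its k nearest delay vectors."""
    history = delay_vectors(series, lag)
    query = series[-lag:]  # the latest delay vector (the "new point" in the lag plot)
    # Rank the stored delay vectors by Euclidean distance to the query.
    nearest = sorted(history, key=lambda pair: math.dist(pair[0], query))[:k]
    # Assumption: a plain average stands in for the SVD-based interpolation.
    return sum(next_value for _, next_value in nearest) / len(nearest)

# Usage: a period-4 series; after (..., 2, 3) the next value is 0.
series = [0, 1, 2, 3] * 5
prediction = knn_forecast(series, lag=2, k=3)
```

The lag and k arguments correspond to the open questions Q1 and Q2; here they are simply passed in by hand.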

8
Q0: Interpolation
- Using SVD (state of the art) [Sauer/1993]
[Figure: lag plot, x_t vs. x_{t-1}]

9
Why Lag Plots?
- Based on Takens' Theorem [Takens/1981], which says that delay vectors can be used for predictive purposes

10
Inside Theory (extra slide)
Example: Lotka-Volterra equations
  ΔH/Δt = rH - aH*P
  ΔP/Δt = bH*P - mP
- H is the density of prey
- P is the density of predators
- Suppose only H(t) is observed; the internal state is (H, P)
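
To make the example concrete, the two difference equations can be stepped forward directly. The sketch below is illustrative only: the function name, the parameter values, and the step size are all assumptions chosen for the demonstration, not values from the slides.

```python
def simulate_lotka_volterra(h0, p0, r, a, b, m, dt, steps):
    """Step the two difference equations forward; return the full (H, P) trajectory."""
    h, p = h0, p0
    states = [(h, p)]
    for _ in range(steps):
        dh = (r * h - a * h * p) * dt  # ΔH = (rH - aH*P) Δt
        dp = (b * h * p - m * p) * dt  # ΔP = (bH*P - mP) Δt
        h, p = h + dh, p + dp
        states.append((h, p))
    return states

# Hypothetical parameter values, for illustration only.
states = simulate_lotka_volterra(h0=10.0, p0=5.0, r=1.0, a=0.1, b=0.02, m=0.5,
                                 dt=0.01, steps=2000)
observed = [h for h, _ in states]  # only the prey density H(t) is observed
```

The forecaster would see only `observed`; the predator density P stays hidden inside the internal state.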

12
Problem at hand
Given {x_1, x_2, ..., x_N}, automatically set the parameters
- L(opt) (from Q1)
- k(opt) (from Q2)
in time linear in N, to minimise the Normalized Mean Squared Error (NMSE) of forecasting

13
Previous work/Alternatives
- Manual setting: BUT infeasible [Sauer/1992]
- Cross-validation: BUT slow; leave-one-out cross-validation ~ O(N^2 log N) or more
- “False Nearest Neighbors”: BUT unstable [Abarbanel/1996]

15
Intuition
The Logistic Parabola: x_t = a x_{t-1} (1 - x_{t-1}) + noise
[Figures: x(t) vs. time; lag plot of X(t) vs. X(t-1)]
Intrinsic Dimensionality ≈ Degrees of Freedom ≈ Information about X_t given X_{t-1}
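
A small generator for this series helps show why one lag already carries the information: without noise, every (x_{t-1}, x_t) pair falls on a single parabola. The function name, the value a = 3.8, the starting point, and the Gaussian noise model are assumptions made for this sketch.

```python
import random

def logistic_parabola(a, x0, n, noise_sd=0.0, seed=0):
    """Generate n values of x_t = a * x_{t-1} * (1 - x_{t-1}) + noise."""
    rng = random.Random(seed)
    xs = [x0]
    for _ in range(n - 1):
        prev = xs[-1]
        # Assumption: the noise term is zero-mean Gaussian with std dev noise_sd.
        noise = rng.gauss(0.0, noise_sd) if noise_sd > 0 else 0.0
        xs.append(a * prev * (1.0 - prev) + noise)
    return xs

series = logistic_parabola(a=3.8, x0=0.3, n=500)
lag_plot_points = list(zip(series[:-1], series[1:]))  # (x_{t-1}, x_t) pairs
```

Plotting `lag_plot_points` gives the lag plot; the points trace a one-dimensional curve inside the two-dimensional plot.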

16
Intuition
[Figures: lag plots of x(t) vs. x(t-1), x(t) vs. x(t-2), and x(t) vs. x(t-1) and x(t-2)]

17
Intuition
To find L(opt):
- Go further back in time (i.e., consider X_{t-2}, X_{t-3}, and so on)
- Until no more information is gained about X_t

19
Fractal Dimensions
- FD = intrinsic dimensionality
[Figure: “embedding” dimensionality = 3, intrinsic dimensionality = 1]

20
Fractal Dimensions
- FD = intrinsic dimensionality [Belussi/1995]
[Figure: log(# pairs) vs. log(r)]
Points to note:
- FD can be a non-integer
- There are fast methods to compute it
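
The log(# pairs) vs. log(r) plot suggests estimating FD as the slope of that line. The sketch below does this with a naive quadratic pair count and a least-squares fit; it is not the fast method the slide refers to, and the function names and the chosen radii are assumptions for illustration.

```python
import math

def pair_count(points, r):
    """Number of point pairs whose Euclidean distance is at most r."""
    count = 0
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            if math.dist(points[i], points[j]) <= r:
                count += 1
    return count

def fractal_dimension(points, radii):
    """Least-squares slope of log(pair count) against log(r).

    Every radius must capture at least one pair, otherwise log(0) is undefined.
    """
    xs = [math.log(r) for r in radii]
    ys = [math.log(pair_count(points, r)) for r in radii]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den

# A straight line of points embedded in 3-D: embedding dimension 3, intrinsic dimension 1.
line_in_3d = [(float(i), 0.0, 0.0) for i in range(200)]
fd = fractal_dimension(line_in_3d, radii=[5.5, 10.5, 20.5, 40.5])
```

For the line, the estimate comes out close to 1 even though the points live in three dimensions, which is the distinction between embedding and intrinsic dimensionality drawn above.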

22
Q1: Finding L(opt)
- Use Fractal Dimensions to find the optimal lag length L(opt)
[Figure: Fractal Dimension vs. Lag (L), annotated with epsilon, L(opt), and f]

23
Q2: Finding k(opt)
- Conjecture: k(opt) ~ O(f)
- We choose k(opt) = 2*f + 1
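
The two selection rules can be sketched as below, under assumptions the slides do not spell out: the curve is treated as flat at the first lag whose FD lies within epsilon of the final measured value f, and a non-integer 2*f + 1 is rounded up. The function names and the sample FD values are hypothetical.

```python
import math

def choose_lag(fd_by_lag, epsilon):
    """Return (L_opt, f), where fd_by_lag[i] is the FD measured with lag i + 1.

    Assumption: f is the last measured value (the plateau), and L_opt is the
    first lag whose FD is within epsilon of f. Requires epsilon >= 0.
    """
    f = fd_by_lag[-1]
    lag = next(i for i, fd in enumerate(fd_by_lag, start=1) if abs(f - fd) <= epsilon)
    return lag, f

def choose_k(f):
    """k(opt) = 2*f + 1; rounding up to an integer is an assumption."""
    return math.ceil(2 * f + 1)

# Made-up FD values for lags 1..5, for illustration only.
lag_opt, f = choose_lag([1.0, 1.6, 1.95, 2.0, 2.02], epsilon=0.1)
k_opt = choose_k(f)
```

Both rules need only the FD-vs-lag curve, so no forecasting runs are required to set the parameters.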

25
Datasets
- Logistic Parabola: x_t = a x_{t-1} (1 - x_{t-1}) + noise. Models population of flies [R. May/1976]
[Figure: Value vs. Time]

26
Datasets
- Logistic Parabola: x_t = a x_{t-1} (1 - x_{t-1}) + noise. Models population of flies [R. May/1976]
- LORENZ: Models convection currents in the air
[Figure: Value vs. Time]

27
Datasets
- Logistic Parabola: x_t = a x_{t-1} (1 - x_{t-1}) + noise. Models population of flies [R. May/1976]
- LORENZ: Models convection currents in the air
- LASER: fluctuations in a laser over time (from the Santa Fe Time Series Competition, 1992)
- Error: NMSE = Σ (predicted - true)^2 / σ^2
[Figure: Value vs. Time]
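
The error measure can be computed as in the sketch below. Two details are assumptions, since the slide gives only the compact formula: the squared errors are averaged over the number of points (the "mean" in NMSE), and σ² is taken as the variance of the true values.

```python
def nmse(predicted, true):
    """Normalized mean squared error: mean squared error divided by the variance of `true`."""
    n = len(true)
    mean_true = sum(true) / n
    # Assumption: sigma^2 is the (population) variance of the true values.
    variance = sum((x - mean_true) ** 2 for x in true) / n
    mse = sum((p - x) ** 2 for p, x in zip(predicted, true)) / n
    return mse / variance

true_values = [1.0, 2.0, 3.0, 4.0]
perfect = nmse(true_values, true_values)      # a perfect forecast
baseline = nmse([2.5] * 4, true_values)       # always predicting the mean
```

With this normalization a perfect forecast scores 0 and a forecast that always predicts the mean scores 1, so values below 1 indicate that the forecaster beats the trivial baseline.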

28
Logistic Parabola
- The FD vs. L plot flattens out
- L(opt) = 1
[Figures: Value vs. Timesteps; FD vs. Lag]

29
Logistic Parabola
[Figure: Value vs. Timesteps, marked “Our Prediction from here”]

30
Logistic Parabola
[Figure: Value vs. Timesteps, comparison of prediction to correct values]

31
Logistic Parabola
- Our L(opt) = 1, which exactly minimizes NMSE
[Figure: NMSE and FD vs. Lag]

32
LORENZ
- L(opt) = 5
[Figures: Value vs. Timesteps; FD vs. Lag]

33
LORENZ
[Figure: Value vs. Timesteps, marked “Our Prediction from here”]

34
LORENZ
[Figure: Value vs. Timesteps, comparison of prediction to correct values]

35
LORENZ
- L(opt) = 5
- NMSE is also optimal at Lag = 5
[Figure: NMSE and FD vs. Lag]

36
Laser
- L(opt) = 7
[Figures: Value vs. Timesteps; FD vs. Lag]

37
Laser
[Figure: Value vs. Timesteps, marked “Our Prediction starts here”]

38
Laser
[Figure: Value vs. Timesteps, comparison of prediction to correct values]

39
Laser
- L(opt) = 7
- The corresponding NMSE is close to optimal
[Figure: NMSE and FD vs. Lag]

40
Speed and Scalability
- Preprocessing is linear in N
- Proportional to the time taken to calculate FD

42
Conclusions
Our Method:
- Automatically sets the parameters
  - L(opt) (answers Q1)
  - k(opt) (answers Q2)
- In time linear in N

43
Conclusions
- Black-box non-linear time series forecasting
- Fractal Dimensions give a fast, automated method to set all parameters
- So, given any time series, we can automatically build a prediction system
- Useful in a sensor network setting

44
Snapshot (extra slide)

45
Future Work (extra slide)
- Feature Selection
- Multi-sequence prediction

46
Discussion: Some other problems (extra slide)
Given: x_1, x_2, ..., x_N, L(opt), k(opt)
- How to forecast?
- How to find the k(opt) nearest neighbors quickly?

47
Motivation (extra slide)
Forecasting also allows us to
- Find outliers: anything that doesn’t match our prediction!
- Find patterns: if different circumstances lead to similar predictions, they may be related.

48
Motivation (Examples) (extra slide)
- Traditional
  - EEGs: patterns of electromagnetic impulses in the brain
  - Intensity variations of white dwarf stars
  - Highway usage over time
- Sensors
  - “Active Disks” for forecasting / prefetching / buffering
  - “Smart House” sensors monitor the situation in a house
  - Volcano monitoring

49
General Method (extra slide)
- Store all the delay vectors {x_{t-1}, ..., x_{t-L(opt)}} and the corresponding prediction x_t
- Find the latest delay vector (L(opt) = ?)
- Find nearest neighbors (K(opt) = ?)
- Interpolate
[Figure: lag plot, x_t vs. x_{t-1}]

50
Intuition (extra slide)
- The FD vs. L plot does flatten out
- L(opt) = 1
[Figure: Fractal dimension vs. Lag]

51
Inside Theory (extra slide)
- The internal state may be unobserved
- But the delay vector space is a faithful reconstruction of the internal system state
- So prediction in delay vector space is as good as prediction in state space

52
Fractal Dimensions (extra slide)
- Many real-world datasets have fractional intrinsic dimension
- There exist fast (O(N)) methods to calculate the fractal dimension of a cloud of points [Belussi/1995]

53
Speed and Scalability (extra slide)
- Preprocessing varies as L(opt)^2
