F4: Large Scale Automated Forecasting Using Fractals - Deepayan Chakrabarti, Christos Faloutsos (CIKM 2002)



Outline
- Introduction / Motivation
- Survey and Lag Plots
- Exact Problem Formulation
- Proposed Method
  - Fractal Dimensions Background
  - Our Method
- Results
- Conclusions

General Problem Definition
Given a time series {x_t}, predict its future course, that is, x_{t+1}, x_{t+2}, ...
[Figure: a time series (value vs. time), with its future values marked "?"]

Motivation
- Traditional fields: financial data analysis; physiological data, elderly care; weather, environmental studies
- Sensor networks (MEMS, "SmartDust"): long / "infinite" series; no human intervention, i.e. a "black box"

Outline (recap) - Survey and Lag Plots

How to forecast?
- ARIMA: but linearity assumption
- Neural networks: but large number of parameters and long training times [Wan/1993, Mozer/1993]
- Hidden Markov Models: O(N^2) in the number of nodes N; also, fixing N is a problem [Ge+/2000]
=> Lag plots

Lag Plots
[Figure: lag plot of x_t vs. x_{t-1} - for a new point, find its 4 nearest neighbors and interpolate them to get the final prediction]
- Q0: Interpolation method?
- Q1: Lag L = ?
- Q2: Number of neighbors k = ?

Q0: Interpolation using SVD (state of the art) [Sauer/1993].
[Figure: lag plot of x_t vs. x_{t-1} with the fitted interpolation]
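To make this concrete, here is a minimal Python sketch of the whole lag-plot loop - build the delay vectors, find the k nearest neighbors of the latest one, and interpolate them with a local linear fit solved by least squares (NumPy's lstsq, which works via SVD). The function names and the exact fitting details are our assumptions, not the paper's code.

```python
import numpy as np

def delay_vectors(x, L):
    """Delay vectors [x_{t-1}, ..., x_{t-L}] paired with their targets x_t."""
    x = np.asarray(x, dtype=float)
    X = np.array([x[t - L:t][::-1] for t in range(L, len(x))])
    return X, x[L:]

def knn_svd_forecast(x, L, k):
    """Predict the next value of the series from the k nearest delay vectors."""
    x = np.asarray(x, dtype=float)
    X, y = delay_vectors(x, L)
    query = x[-L:][::-1]                              # the latest delay vector
    nn = np.argsort(np.linalg.norm(X - query, axis=1))[:k]
    # Local linear model y ~ [v, 1] @ coef, fitted on the k neighbors by
    # least squares (np.linalg.lstsq solves this via SVD).
    A = np.hstack([X[nn], np.ones((len(nn), 1))])
    coef, *_ = np.linalg.lstsq(A, y[nn], rcond=None)
    return float(np.append(query, 1.0) @ coef)
```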

Why lag plots? They are based on Takens' theorem [Takens/1981], which says that delay vectors can be used for predictive purposes.

Inside Theory (Extra)
Example: the Lotka-Volterra equations
  dH/dt = rH - aHP
  dP/dt = bHP - mP
where H is the density of prey and P is the density of predators. Suppose only H(t) is observed; the internal state is (H, P).
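A minimal simulation sketch of this example, with assumed parameter values: the internal state is the pair (H, P), but the observer sees only the one-dimensional series H(t).

```python
import numpy as np

def lotka_volterra_observed(r=1.0, a=0.5, b=0.2, m=0.8, H0=2.0, P0=1.0,
                            dt=0.01, steps=5000):
    """Forward-Euler integration; returns only the observed prey series H(t)."""
    H, P = H0, P0
    observed = []
    for _ in range(steps):
        dH = r * H - a * H * P        # prey: growth minus predation
        dP = b * H * P - m * P        # predators: growth minus mortality
        H, P = H + dt * dH, P + dt * dP
        observed.append(H)            # only H(t) is observed, not (H, P)
    return np.array(observed)
```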

Outline (recap) - Exact Problem Formulation

Problem at hand
- Given {x_1, x_2, ..., x_N}
- Automatically set the parameters L(opt) (from Q1) and k(opt) (from Q2)
- in time linear in N
- to minimize the Normalized Mean Squared Error (NMSE) of forecasting

Previous work / Alternatives
- Manual setting: but infeasible [Sauer/1992]
- Cross-validation: but slow; leave-one-out cross-validation is ~O(N^2 log N) or more
- "False nearest neighbors": but unstable [Abarbanel/1996]

Outline (recap) - Proposed Method

Intuition
The logistic parabola: x_t = a x_{t-1} (1 - x_{t-1}) + noise
[Figures: x(t) vs. time, and the lag plot x(t) vs. x(t-1)]
Intrinsic dimensionality ≈ degrees of freedom ≈ information about x_t given x_{t-1}

Intuition
[Figures: lag plots in increasing dimension - x(t) vs. x(t-1), x(t) vs. x(t-2), and x(t) vs. (x(t-1), x(t-2))]

Intuition
To find L(opt): go further back in time (i.e., consider x_{t-2}, x_{t-3} and so on) until there is no more information gained about x_t.

Outline (recap) - Proposed Method: Fractal Dimensions Background

Fractal Dimensions
FD = intrinsic dimensionality
[Figure: a shape with "embedding" dimensionality = 3 but intrinsic dimensionality = 1]

Fractal Dimensions
FD = intrinsic dimensionality [Belussi/1995]
[Figure: log(# pairs within distance r) vs. log(r); the slope of the line is the fractal dimension]
Points to note:
- FD can be a non-integer
- There are fast methods to compute it
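As an illustration of this log-log relationship, here is a brute-force sketch of the correlation-integral estimate: count the pairs of points within distance r, and take the slope of log(#pairs) versus log(r). Note the paper relies on fast O(N) methods [Belussi/1995]; the O(N^2) pair count below is for clarity only, and the choice of radius grid is an assumption.

```python
import numpy as np
from scipy.spatial.distance import pdist

def correlation_dimension(points, radii):
    """Slope of log(#pairs within r) vs. log(r): the correlation fractal dimension."""
    radii = np.asarray(radii, dtype=float)
    d = pdist(points)                                # all pairwise distances, O(N^2)
    counts = np.array([(d < r).sum() for r in radii])
    mask = counts > 0                                # avoid log(0) at tiny radii
    slope, _intercept = np.polyfit(np.log(radii[mask]), np.log(counts[mask]), 1)
    return slope
```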

Outline (recap) - Proposed Method: Our Method

Q1: Finding L(opt)
Use fractal dimensions to find the optimal lag length L(opt).
[Figure: fractal dimension vs. lag L - the curve flattens out to a value f, and L(opt) is where successive FD values differ by less than a threshold epsilon]

Q2: Finding k(opt)
Conjecture: k(opt) ~ O(f)
We choose k(opt) = 2f + 1.
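Putting Q1 and Q2 together, a minimal sketch of the selection loop, reusing delay_vectors and correlation_dimension from the sketches above: grow the lag until the FD-vs-lag curve flattens out, take the plateau value f, and set k(opt) = 2f + 1. The stopping threshold eps and the radius grid are assumptions for illustration.

```python
import numpy as np

def choose_parameters(x, max_lag=20, eps=0.1):
    x = np.asarray(x, dtype=float)
    radii = np.logspace(-3, 0, 20) * np.ptp(x)         # assumed radius grid
    fds = []
    for L in range(1, max_lag + 1):
        X, _ = delay_vectors(x, L)
        fds.append(correlation_dimension(X, radii))
        if L > 1 and fds[-1] - fds[-2] < eps:          # FD curve has flattened
            f = fds[-2]                                # plateau value f
            return L - 1, int(np.ceil(2 * f + 1))      # L(opt), k(opt) = 2f+1
    return max_lag, int(np.ceil(2 * fds[-1] + 1))
```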

Outline (recap) - Results

Datasets
- Logistic parabola: x_t = a x_{t-1} (1 - x_{t-1}) + noise; models the population of flies [R. May/1976]
- LORENZ: models convection currents in the air
- LASER: fluctuations in a laser over time (from the Santa Fe Time Series Competition, 1992)
Error measure: NMSE = Σ(predicted - true)^2 / σ^2
[Figure: each series, value vs. time]
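For reference, a small sketch that generates the logistic-parabola data and computes the NMSE as defined above; the value a = 3.8, the noise level, and the seed are assumptions for illustration.

```python
import numpy as np

def logistic_series(n, a=3.8, noise=1e-3, x0=0.5, seed=0):
    """x_t = a * x_{t-1} * (1 - x_{t-1}) + noise, clipped to the map's range."""
    rng = np.random.default_rng(seed)
    x = np.empty(n)
    x[0] = x0
    for t in range(1, n):
        x[t] = a * x[t - 1] * (1 - x[t - 1]) + noise * rng.standard_normal()
        x[t] = min(max(x[t], 0.0), 1.0)   # keep the noisy map inside [0, 1]
    return x

def nmse(predicted, true):
    """NMSE = sum((predicted - true)^2) / sigma^2, with sigma^2 = var(true)."""
    true = np.asarray(true, dtype=float)
    return np.sum((np.asarray(predicted) - true) ** 2) / np.var(true)
```

Combined with the earlier sketches, an end-to-end run would generate a series with logistic_series, pick L and k with choose_parameters on a training prefix, forecast each held-out point with knn_svd_forecast, and score the forecasts with nmse.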

Logistic Parabola
The FD vs. L plot flattens out, so L(opt) = 1.
[Figures: the series (value vs. timesteps) and FD vs. lag]

Logistic Parabola
[Figure: the series, with our prediction starting from the marked point]

Logistic Parabola
[Figure: comparison of the prediction to the correct values]

Logistic Parabola
Our L(opt) = 1, which exactly minimizes the NMSE.
[Figure: NMSE and FD vs. lag]

LORENZ
L(opt) = 5.
[Figures: the series (value vs. timesteps) and FD vs. lag]

LORENZ
[Figure: the series, with our prediction starting from the marked point]

LORENZ
[Figure: comparison of the prediction to the correct values]

LORENZ
L(opt) = 5; the NMSE is also optimal at lag 5.
[Figure: NMSE and FD vs. lag]

Laser
L(opt) = 7.
[Figures: the series (value vs. timesteps) and FD vs. lag]

Laser
[Figure: the series, with our prediction starting at the marked point]

Laser
[Figure: comparison of the prediction to the correct values]

Laser
L(opt) = 7; the corresponding NMSE is close to optimal.
[Figure: NMSE and FD vs. lag]

Speed and Scalability
- Preprocessing is linear in N
- Proportional to the time taken to calculate the FD

Outline (recap) - Conclusions

Conclusions
Our method automatically sets the parameters:
- L(opt) (answers Q1)
- k(opt) (answers Q2)
in time linear in N.

Conclusions
- Black-box non-linear time-series forecasting
- Fractal dimensions give a fast, automated method to set all parameters
- So, given any time series, we can automatically build a prediction system
- Useful in a sensor-network setting

Snapshot (Extra)
[Figure-only slide]

Future Work (Extra)
- Feature selection
- Multi-sequence prediction

Discussion - Some Other Problems (Extra)
Given x_1, x_2, ..., x_N, L(opt), and k(opt): how do we find the k(opt) nearest neighbors quickly?
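The slides leave this as an open discussion point; one standard answer (our suggestion, not from the paper) is to index the delay vectors with a spatial data structure such as a k-d tree, so each nearest-neighbor query costs roughly O(log N) instead of a linear scan. A minimal sketch, assuming SciPy's cKDTree:

```python
import numpy as np
from scipy.spatial import cKDTree

def knn_indices(X, query, k):
    """Indices of the k nearest delay vectors, via a k-d tree instead of a scan."""
    tree = cKDTree(np.asarray(X))           # built once, reused for every forecast
    _dist, nn = tree.query(np.asarray(query), k=k)
    return np.atleast_1d(nn)
```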

Motivation (Extra)
Forecasting also allows us to:
- Find outliers: anything that doesn't match our prediction!
- Find patterns: if different circumstances lead to similar predictions, they may be related.

Motivation (Examples) (Extra)
- EEGs: patterns of electromagnetic impulses in the brain
- Intensity variations of white dwarf stars
- Highway usage over time
- Traditional sensors
- "Active Disks" for forecasting / prefetching / buffering
- "Smart House": sensors monitor the situation in a house
- Volcano monitoring

General Method (Extra)
- Store all the delay vectors {x_{t-1}, ..., x_{t-L(opt)}} and the corresponding prediction x_t
- Find the latest delay vector (L(opt) = ?)
- Find its nearest neighbors (k(opt) = ?)
- Interpolate
[Figure: lag plot of x_t vs. x_{t-1}]

Intuition (Extra)
The FD vs. L plot does flatten out: L(opt) = 1.
[Figure: fractal dimension vs. lag]

Inside Theory (Extra)
- The internal state may be unobserved
- But the delay-vector space is a faithful reconstruction of the internal system state
- So prediction in delay-vector space is as good as prediction in state space

Fractal Dimensions (Extra)
- Many real-world datasets have a fractional intrinsic dimension
- There exist fast (O(N)) methods to calculate the fractal dimension of a cloud of points [Belussi/1995]

Speed and Scalability (Extra)
Preprocessing varies as L(opt)^2.