A Time Series Representation Framework Based on Learned Patterns


1 A Time Series Representation Framework Based on Learned Patterns
Mustafa Gokce Baydogan (Boğaziçi University), George Runger (Arizona State University), Didem Yamak (DeVry University)
10/5/2013, 8th INFORMS Workshop on Data Mining and Health Informatics (DM-HI 2013)

2 Outline
Time series data mining
Motivation
Representing time series
Measuring similarity
Learning a pattern-based representation
Pattern (relationship) discovery
Learned pattern similarity (LPS)
Computational experiments and results
Conclusions and future work

3 Time Series Data Mining: Motivations
People measure things, and things (with rare exceptions) change over time; time series are everywhere.
Consider a patient's medical record: test values, observations, actions and related responses.
Other examples: ECG heartbeat signals, stock prices.

4 Time Series Data Mining: Motivations
Other types of data can be converted to time series; everything is about the representation.
Example: recognizing words. An example word, "Alexandria", from the dataset of word profiles for George Washington's manuscripts. A word can be represented by two time series created by moving over and under the word.
Images from E. Keogh, "A quick tour of the datasets", VLDB 2008.

5 Challenges
Local patterns are important: translations and dilations (warping).
The four observed peaks are related to a certain event in the manufacturing process, an indication of a problem.
The timing of the peaks may change (two peaks are observed earlier for the blue series): the problem occurred over a shorter time interval.

6 Challenges
Time series are usually noisy.
Multivariate time series (MTS): relations of patterns within a series and interactions between series may be important.
High dimensionality.

7 Motivations
Time series representation:
To reduce high dimensionality and noise
To capture trends, shapes and patterns, as they provide more information than the exact value of each time series data point
Time series similarity:
Accurate
Handles warping
Fast

8 Time series representation
* Allows lower bounding for similarity computations

9 Time series similarity
Euclidean distance: popular (no parameter), intuitive, fast computation, but performs poorly.
Dynamic time warping (DTW): very popular (no parameter), handles warping (accurate), hard to beat, but may perform poorly (long series with noise).
Other elastic measures: handle warping (accurate), but have too many parameters to tune and are computationally inefficient.
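To make the comparison above concrete, here is a minimal, illustrative Python sketch of the two parameter-free measures (plain Euclidean distance and an unconstrained DTW recursion); it is not part of the talk's method, and the toy series are my own.

```python
import numpy as np

def euclidean(x, y):
    """Euclidean distance: fast and intuitive, but ignores warping."""
    return float(np.sqrt(np.sum((np.asarray(x) - np.asarray(y)) ** 2)))

def dtw(x, y):
    """Unconstrained dynamic time warping via dynamic programming.

    Aligns shifted/dilated local patterns, at O(len(x) * len(y)) cost.
    """
    n, m = len(x), len(y)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = (x[i - 1] - y[j - 1]) ** 2
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return float(np.sqrt(D[n, m]))

x = np.sin(np.linspace(0, 2 * np.pi, 50))
y = np.sin(np.linspace(0, 2 * np.pi, 50) + 0.5)   # same shape, shifted in phase
print(euclidean(x, y), dtw(x, y))                  # DTW is typically much smaller here
```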

10 Learning a pattern-based representation
A regression tree-based approach is used to learn a representation (a related earlier tree-based approach: Geurts, 2001).
(Figure: the data matrix.)

11 A new learning approach: predicting (forecasting) a segment
(Figure: the data matrix.) Forecast a segment ∆ (gap) time units forward.

12 Representation: learned patterns
The time series is 128 units long. Predictor segment: observations 1-60. Response segment: a later window of the series, ∆ time units ahead.
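A minimal sketch (not the authors' code) of how a predictor segment and a response segment might be cut from one series. The predictor range follows the slide (observations 1-60 of a 128-long series); the gap ∆ and the response length are illustrative assumptions, since the slide leaves the response range unspecified.

```python
import numpy as np

def extract_segments(series, pred_start=0, pred_len=60, gap=8, resp_len=60):
    """Cut one predictor segment and one response segment out of a series.

    pred_start/pred_len follow the slide; gap (the forward shift, Delta)
    and resp_len are illustrative assumptions, not values from the talk.
    """
    predictor = series[pred_start:pred_start + pred_len]
    resp_start = pred_start + pred_len + gap            # forecast Delta units ahead
    response = series[resp_start:resp_start + resp_len]
    return predictor, response

# Toy example: a 128-observation series.
x = np.sin(np.linspace(0, 6 * np.pi, 128))
p, r = extract_segments(x)
print(p.shape, r.shape)   # (60,) and (60,)
```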

13 Multiple segments
Concatenate the segments of all time series to create the segment matrix, then:
1. Randomly select a response segment (column) of length L.
2. Build a regression tree: at each split decision, select a random predictor column (one segment at a time).*
Multiple random ∆ levels are used, and J trees of depth D are built; a sketch of this loop follows below.
*This randomization is known to work well for regression: P. Geurts, D. Ernst, and L. Wehenkel. Extremely randomized trees. Machine Learning, 63(1):3-42, 2006.
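A rough, self-contained Python sketch of this training loop. Treating each length-L subsequence as a row of the segment matrix, picking one random target column per tree, and using scikit-learn's ExtraTreeRegressor as the randomized regression tree are my simplifying assumptions, not the released LPS implementation.

```python
import numpy as np
from sklearn.tree import ExtraTreeRegressor

def segment_matrix(X, L):
    """Stack every length-L subsequence of every series as a row.

    X: (N, T) array of N series of length T. Returns (N*M, L) with
    M = T - L + 1; row k belongs to series k // M, so leaf memberships
    can later be tallied per series.
    """
    N, T = X.shape
    M = T - L + 1
    return np.stack([X[n, s:s + L] for n in range(N) for s in range(M)])

def train_trees(X, L, J=150, D=8, seed=0):
    """Train J randomized regression trees on the segment matrix."""
    rng = np.random.default_rng(seed)
    S = segment_matrix(X, L)
    trees = []
    for _ in range(J):
        target = int(rng.integers(L))                  # random response column
        predictors = np.delete(np.arange(L), target)   # remaining columns as candidates
        t = ExtraTreeRegressor(max_depth=D,
                               max_features=1,         # one random column per split
                               random_state=int(rng.integers(2**31 - 1)))
        t.fit(S[:, predictors], S[:, target])
        trees.append((t, predictors))
    return trees
```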

14 Multiple segments (cont.)
(Figure: trees #1 through #J.)
Two uses of the ensemble: (1) aggregate the information over all trees for prediction (i.e., denoising); (2) each terminal node defines a basis, yielding the pattern-based representation (a vector of size RxJ), as in the sketch below.
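Continuing the previous sketch (it reuses segment_matrix() and train_trees() defined there), the pattern-based representation might be computed as leaf-occupancy counts: for each tree, count how many of a series' segments fall in each terminal node. This is an illustrative reading of the RxJ vector on the slide; here the per-tree block size is taken from the leaf node indices rather than a fixed R.

```python
import numpy as np
# Builds on segment_matrix() and train_trees() from the previous sketch.

def representation(X, trees, L):
    """Map each series to its leaf-occupancy counts, concatenated over trees.

    Returns an (N, sum of per-tree sizes) matrix: for tree j, entry (n, r)
    counts how many of series n's segments end up in leaf node r.
    """
    N, T = X.shape
    M = T - L + 1
    S = segment_matrix(X, L)
    blocks = []
    for tree, predictors in trees:
        leaf_of_segment = tree.apply(S[:, predictors])   # leaf node id per segment
        R = int(leaf_of_segment.max()) + 1               # upper bound on node ids for this tree
        counts = np.zeros((N, R))
        for n in range(N):
            rows = slice(n * M, (n + 1) * M)             # the segments of series n
            np.add.at(counts[n], leaf_of_segment[rows], 1)  # tally leaf memberships
        blocks.append(counts)
    return np.hstack(blocks)
```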

15 Similarity measure: Learned Pattern Similarity (LPS)
A time series is represented by its pattern-based vector of size RxJ (see slide 14); the similarity compares the entries of these vectors (formula on the slide*).
Penalizes the number of mismatches: series whose observations fall into mismatching patterns are different.
Robust to noise: implicitly works on the discretized values.
Robust to warping: the representation learning handles the problem of warping.
*Assuming each tree has R terminal nodes.
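One plausible way to realize the mismatch-penalizing distance described on this slide, under the representation sketched earlier, is an L1-style comparison of the leaf-count vectors. The exact formula was shown as an image on the slide and is not captured in this transcript, so treat this as an assumption rather than the authors' definition.

```python
import numpy as np
# h_x and h_y are rows of the count matrix produced by representation()
# in the previous sketch.

def lps_distance(h_x, h_y):
    """Distance between two leaf-count vectors: total count mismatch over all
    terminal nodes and trees (an L1 / Manhattan-style comparison).
    Smaller values mean more segments share the same learned patterns."""
    return float(np.sum(np.abs(np.asarray(h_x) - np.asarray(h_y))))
```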

16 Similarity measure (cont.)
The computations are similar to Euclidean distance: fast, and they allow bounding schemes such as early abandon.
Similarity search: find the reference time series that is most similar to the query series.
Keep a record of the best distance found so far, and stop computing the distance for a reference series once the current partial distance exceeds the best-so-far.
This is known to improve testing (query) time significantly; a sketch follows below.
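A minimal sketch of the early-abandon nearest-neighbour search described above, written here for an L1-style accumulating distance (any term-by-term distance works the same way); the example vectors are my own.

```python
import numpy as np

def nn_search_early_abandon(query, references):
    """Find the reference vector closest to `query`, abandoning a candidate
    as soon as its partial distance exceeds the best distance found so far."""
    best_idx, best_dist = -1, np.inf
    for idx, ref in enumerate(references):
        running = 0.0
        for q, r in zip(query, ref):
            running += abs(q - r)          # accumulate the distance term by term
            if running >= best_dist:       # early abandon: cannot beat best-so-far
                break
        else:                              # inner loop finished: new best candidate
            best_idx, best_dist = idx, running
    return best_idx, best_dist

refs = [np.array([0., 1, 2, 3]), np.array([5., 5, 5, 5])]
print(nn_search_early_abandon(np.array([0., 1, 2, 4]), refs))   # -> (0, 1.0)
```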

17 S-MTS Experiments
45 univariate time series datasets from the UCR database*
Compared to popular NN classifiers with different distance measures: Euclidean, DTW (constrained and unconstrained versions), SpADe, Sparse Spatial Sample Kernels (SSSK).
Addition of difference series: takes trend information into consideration.
A multivariate time series extension (if time permits).
Parameters: cross-validation to set parameters for each dataset.
Segment length (L): 0.25, 0.5 or 0.75 times the time series length.
Depth of trees: 4, 6 or 8.
Number of trees = 150 (not important if set large enough).

18 Univariate datasets
Application domains include health, energy, robotics, astronomy, chemistry and gesture recognition.

19 Parameters
Illustration over 6 datasets (L = 0.5×T).

20 Average error rates over 10 replications

21 Multivariate time series
While training, randomly select one univariate component series and a target segment; find splits over randomly selected predictor segments of randomly selected component series (see the sketch below).
The complexity does not change, although more trees with larger depth may be required.
Example: uWaveGestureLibrary, accelerometer readings in three dimensions (i.e. x, y and z); the same parameters result in an error rate of 0.022.
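Extending the earlier training sketch to multivariate series might look like the following: segments from all dimensions are placed side by side, the target column is drawn from one randomly chosen dimension, and each split may pick a predictor column from any dimension. This is my paraphrase of the bullet points above, not the authors' code.

```python
import numpy as np
from sklearn.tree import ExtraTreeRegressor

def train_mts_trees(X, L, J=150, D=8, seed=0):
    """X: (N, V, T) array of N series with V dimensions, each of length T."""
    rng = np.random.default_rng(seed)
    N, V, T = X.shape
    M = T - L + 1
    # Rows = segments (aligned start positions across dimensions); columns = V*L positions.
    S = np.stack([np.concatenate([X[n, v, s:s + L] for v in range(V)])
                  for n in range(N) for s in range(M)])
    trees = []
    for _ in range(J):
        v = int(rng.integers(V))                     # random target dimension
        target = v * L + int(rng.integers(L))        # random target column within it
        predictors = np.delete(np.arange(V * L), target)
        t = ExtraTreeRegressor(max_depth=D,
                               max_features=1,       # random column (any dimension) per split
                               random_state=int(rng.integers(2**31 - 1)))
        t.fit(S[:, predictors], S[:, target])
        trees.append((t, predictors))
    return trees
```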

22 LPS Conclusions and future work
A new approach for time series representation:
Captures relations between and within the series
Features are learned within the algorithm (not pre-specified)
Handles nominal and missing values
Handles warping by representation learning
Scalable (also allows for parallel implementation): training complexity is O(JNTD), linear in the time series length and in the number of training series
Training took at most 6 minutes over the 45 datasets (single thread, J=150, D=8, N=1800, T=750), whereas SpADe did not return a result after a week of running
Similarity search takes less than a millisecond
Fast and accurate results with few parameters

23 LPS Conclusions and future work (cont.)
This approach can be extended to many data mining tasks (for both univariate and multivariate time series, and images), such as:
Denoising (in progress)
Forecasting (in progress)
Anomaly detection (in progress)
Clustering (in progress)
Indexing
The LPS package is provided on

24 Questions and Comments?
Thanks! The LPS package is provided on

