
Slide 1: A Bag-of-Features Framework for Time Series Classification
Mustafa Gokce Baydogan*, George Runger*, Eugene Tuv†
11/13/2011
* Arizona State University   † Intel Corporation

Slide 2: Outline
- Time series classification
  - Problem definition
  - Literature review
  - Motivation
- A Bag-of-Features Framework for Time Series Classification
  - Approach
  - Algorithm
- Computational experiments and results
- Conclusions and future work

Slide 3: Time Series Classification
A supervised learning problem aimed at labeling temporally structured univariate (or multivariate) sequences of fixed (or variable) length.

Slide 4: Time Series Classification
Example problems (figures): Gun Point, OSU Leaf.

Slide 5: Time Series Classification (figure)

Slide 6: Literature review
The algorithms proposed for time series classification can be divided into:
- Instance-based methods
  - Predict a test instance based on its similarity to the training instances.
  - Require a similarity (distance) measure.
- Feature-based methods
  - Predict a test instance with a model built on feature vectors extracted from a set of instances.
  - Require feature extraction methods and a prediction model built on the extracted features.

Slide 7: Literature review (instance-based methods)
- Nearest neighbor classifiers
  - Euclidean distance: fast, but cannot handle shifts in time.
  - Dynamic time warping (DTW): known as a strong solution for time series problems in a variety of domains [1] (see the sketch below).
- Shapelets [2,3]: find local patterns in a time series that are highly predictive of a class, based on a distance measure.
(Figure: heraldic shields.)
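To make the nearest-neighbor baseline concrete, here is a minimal sketch of 1-NN classification under the classic dynamic-programming DTW distance. This is not the authors' implementation; all names are illustrative.

```python
import numpy as np

def dtw_distance(x, y):
    """Exact DTW distance between two univariate series via the
    classic O(len(x) * len(y)) dynamic program, squared local cost."""
    n, m = len(x), len(y)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = (x[i - 1] - y[j - 1]) ** 2
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return float(np.sqrt(D[n, m]))

def nn_classify(test_x, train_X, train_y):
    """1-NN: label the test series with the class of the closest
    training series under DTW -- the instance-based baseline."""
    dists = [dtw_distance(test_x, x) for x in train_X]
    return train_y[int(np.argmin(dists))]
```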

Slide 8: Literature review (feature-based methods)
- Global features are compact representations of the series (such as its mean and variance), but they are often too rigid to represent time series.
- Local patterns define the class:
  - They may shift over time.
  - Relations between certain patterns may be important.
  - The cardinality of the local feature set may vary.

Slide 9: Literature review (feature-based methods)
Most approaches extract features over intervals and build a prediction model:
- Knots of a piecewise linear approximation of the series as features [4]
- Feature extraction through a genetic algorithm (finding representative features), with an SVM classifying on the extracted features [5]
- Neural networks [6]
- Decision trees [7]

Slide 10: Motivation
- Although dynamic time warping (DTW) provides strong results, it is not suitable for real-time applications [3].
  - DTW has a time complexity of O(n^2), where n is the length of the time series; the complexity reduces to O(n) using a lower bound (LB_Keogh [8], sketched below).
  - This complexity is for comparing only two series (one training to one test). The DTW distance of a test instance must be computed against every training instance (or at least some of them), which increases computation time significantly.
- This is the main disadvantage of instance-based classifiers.
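For reference, a sketch of the LB_Keogh idea, assuming equal-length series and a warping window of radius r (names are illustrative): an upper/lower envelope is built around the candidate, and the cheap O(n) bound lets the exact DTW computation be skipped whenever the bound already exceeds the best 1-NN distance found so far.

```python
import numpy as np

def lb_keogh(query, candidate, r):
    """O(n) lower bound on DTW(query, candidate) for equal-length
    series and warping-window radius r: sum the squared amount by
    which the query leaves the candidate's upper/lower envelope."""
    q = np.asarray(query, dtype=float)
    c = np.asarray(candidate, dtype=float)
    total = 0.0
    for i in range(len(q)):
        lo, hi = max(0, i - r), min(len(c), i + r + 1)
        upper, lower = c[lo:hi].max(), c[lo:hi].min()
        if q[i] > upper:
            total += (q[i] - upper) ** 2
        elif q[i] < lower:
            total += (q[i] - lower) ** 2
    return float(np.sqrt(total))
```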

Slide 11: Motivation
Feature-based approaches assume that patterns exist in the same time interval across instances, but a pattern that defines a certain class may occur anywhere in time.
(Figure; each time series is standardized.)

Slide 12: Motivation
- DTW attempts to compensate for possible time translations between features, but with long time series, relatively short features of interest, and moderate noise, its capability degrades.
- The discontinuity of certain patterns in time is another problem that affects the performance of DTW.

Slide 13: A Bag-of-Features Framework for Time Series Classification (section title)

Slide 14: A Bag-of-Features Framework for Time Series Classification
- "Bag of features" is also referred to as:
  - Bag of words in document analysis
  - Bag of instances in multiple instance learning (MIL)
  - Bag of frames in audio and speech recognition
- Three main steps:
  - Local feature extraction
  - Codebook generation (similarity-based, unsupervised, or supervised)
  - Classification from the codebook

Slide 15: Local feature extraction
- A subsequence s is generated randomly.
  - A lower bound on the subsequence length is set as a factor of the length of the time series (M).
- After partitioning a subsequence into intervals, interval features are extracted (mean, variance, slope).
  - The number of intervals (d) must be the same for all subsequences (the cardinality of the feature set must be the same), although the subsequence lengths are random.
  - Random-length intervals are used.
  - A minimum interval length is set so that the extracted features are valid (e.g., the mean of a single time point is not meaningful).
- Parameters: the number of intervals used to partition a subsequence, the number of subsequences generated for each series, and the maximum number of intervals. (See the sketch below.)
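The following is a simplified sketch of this extraction step, assuming univariate series stored as NumPy arrays. Parameter names (n_sub, d, min_int_len) and the exact randomization scheme are illustrative, not the paper's specification, and the paper's full feature set may differ.

```python
import numpy as np

def interval_features(seg):
    """Mean, variance and least-squares slope of one interval."""
    t = np.arange(len(seg))
    slope = np.polyfit(t, seg, 1)[0] if len(seg) > 1 else 0.0
    return [seg.mean(), seg.var(), slope]

def random_partition(length, d, min_int_len, rng):
    """Cut `length` points into d contiguous intervals, each at least
    min_int_len long, spreading the surplus length at random."""
    surplus = length - d * min_int_len
    lens = min_int_len + rng.multinomial(surplus, np.ones(d) / d)
    return np.cumsum(lens)[:-1]  # interior cut points

def extract_subsequence_features(series, n_sub=20, d=4, min_int_len=5, seed=0):
    """Draw n_sub random subsequences from one series of length M
    (each at least d * min_int_len long), split each into d random-
    length intervals, and return one fixed-length feature row
    (3 features x d intervals) per subsequence."""
    rng = np.random.default_rng(seed)
    M = len(series)  # must satisfy M >= d * min_int_len
    rows = []
    for _ in range(n_sub):
        length = int(rng.integers(d * min_int_len, M + 1))
        start = int(rng.integers(0, M - length + 1))
        sub = series[start:start + length]
        row = []
        for seg in np.split(sub, random_partition(length, d, min_int_len, rng)):
            row.extend(interval_features(seg))
        rows.append(row)
    return np.array(rows)
```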

Slide 16: Local feature extraction (figure)

Slide 17: Codebook generation
- After local feature extraction for each time series, a new dataset is generated in which every subsequence of every time series becomes an instance.
  - The class label of each instance is the class of the corresponding time series.
- We use a classifier that generates a class probability estimate for each instance (subsequence).
  - The estimate provides information on the strength of an assignment.
- The codebook consists of, for each time series:
  - The frequency of predicted subsequence classes.
  - A histogram of the class probability estimates for each class; the number of equal-probability bins (b) is a parameter. (See the sketch below.)
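A minimal sketch of turning the per-subsequence class-probability estimates into one fixed-length codebook row per series (b equal-width probability bins per class, plus predicted-class frequencies); names and array shapes are illustrative assumptions.

```python
import numpy as np

def codebook_row(sub_probs, n_classes, b=10):
    """Summarize one series' subsequences -- rows of class-probability
    estimates with shape (n_sub, n_classes) -- as one fixed-length
    vector: a b-bin histogram of the probability estimates for each
    class, plus the relative frequency of each predicted class."""
    edges = np.linspace(0.0, 1.0, b + 1)
    parts = [np.histogram(sub_probs[:, k], bins=edges)[0] / len(sub_probs)
             for k in range(n_classes)]
    preds = sub_probs.argmax(axis=1)
    parts.append(np.bincount(preds, minlength=n_classes) / len(preds))
    return np.concatenate(parts)
```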

Slide 18: Codebook generation (figure)

Slide 19: Classification
- Global features can supplement the codebook.
- A classifier is built on the codebook and the global features. (Sketched after the next slide.)

Slide 20: TSBF
- Random Forest (RF) [9] is an ensemble of N decision trees:
  - Each tree is constructed using a different bootstrap sample of the original data.
  - About one-third of the cases are left out of the bootstrap sample and are not used to construct that tree (the out-of-bag (OOB) samples).
  - At each node of each tree, an RF considers the best split based on only a random sample of the features. The random selection reduces the variance of the classifier and also reduces computational complexity.
- RF is used as the classifier in our study because (see the sketch below):
  - It is fast (a parallel implementation is even faster).
  - It is inherently multiclass (unlike, e.g., SVM).
  - It can handle missing values.
  - It provides class probability estimates based on OOB samples; estimates computed from OOB predictions are good estimates of the generalization error.
  - Other classifiers that generate similar information may have problems with overfitting.
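As a sketch of the final stage, the classifier built on the codebook plus global features, here is how it might look with scikit-learn's random forest and its OOB error estimate. The authors' actual implementation is in R and C; all names here are illustrative.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def train_final_classifier(X_codebook, X_global, y, n_trees=500):
    """Fit the final model on codebook rows concatenated with global
    features (e.g., mean/variance of the whole series). The OOB score
    gives a generalization estimate without touching the test set."""
    X = np.hstack([X_codebook, X_global])
    rf = RandomForestClassifier(n_estimators=n_trees,
                                oob_score=True,
                                n_jobs=-1,        # trees train independently
                                random_state=0)
    rf.fit(X, y)
    print(f"OOB error estimate: {1.0 - rf.oob_score_:.3f}")
    return rf
```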

Slide 21: Computational experiments and results
- Parameters (table)
- 20 datasets from the UCR time series database

Slide 22: Computational experiments and results
Some of the parameters (number of bins, number of trees) are fixed based on OOB error rates, i.e., based on the training data alone, not after seeing results on the test data.

Slide 23: Computational experiments and results
- TSBF is implemented in both R and C.
- We compare our algorithm to nearest-neighbor (NN) classifiers with DTW. Two versions of DTW are considered:
  - NNBestDTW searches for the best warping window based on the training data, then uses the learned window on the test data. It is known to be a strong solution for time series classification.
  - NNDTWNoWin uses no warping window.
- Because of its random nature, we replicate TSBF ten times and report the average error rate on the test data over the ten replications.

Slide 24: Computational experiments and results (figure)

Slide 25: Computational experiments and results (figure)

Slide 26: Computational experiments and results (figure)

Slide 27: Computational experiments and results (figure)
* Windows 7 system with 8 GB RAM, dual-core CPU (i7-3620M, 2.7 GHz)

Slide 28: Conclusion and Future Work
We present an approach that:
- Can handle local information better
- Is faster than instance-based classifiers
- Allows for integration of local information
- Performs better than competitive methods
- Provides insight about the problem

Slide 29: Conclusion and Future Work
- Knowledge extraction from random forests:
  - Find patterns that are unique to each class.
  - Variable importance measures from RF.
- The current approach works on series of the same length:
  - Modify local feature extraction to handle series of variable length.
- Extend to multivariate time series classification problems (straightforward).

Slide 30: Thanks, Questions?
This research was partially supported by ONR grant N00014-09-1-0656.
The code of TSBF and the datasets are provided at http://www.mustafabaydogan.com/research/time-series-classification.html
The paper will be available soon after submission.

Slide 31: References

Slide 32: References (continued)
