A Framework for Discovering Anomalous Regimes in Multivariate Time-Series Data with Local Models
Stephen Bay
Stanford University, and Institute for the Study of Learning and Expertise
Joint work with Kazumi Saito, Naonori Ueda, and Pat Langley
Discovering Anomalous Regimes
Problem: Discover when a section of an observed time series has been generated by an anomalous regime.
Anomalous: extremely rare or unusual
Regime: the hypothetical true model generating the observed data
Motivation
[Figure: charge, voltage, temp., current. Image source: nasa.gov]
Variables causally related
Several different modes
Other Categories of Irregularities
Outliers
Unusual patterns
DARTS Framework
DARTS: Discovering Anomalous Regimes in Time Series
1. Reference (R) and Test (T) data
2. Local Models (estimate on windows)
3. Parameter space (map local models into parameter space)
4. Anomaly score (estimate density of T according to R; compute threshold)
Local Models
Vector Autoregressive models
Regression format
Ridge Regression
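The equations for this slide did not survive, so the following is only a minimal sketch of one way to fit local autoregressive models in regression format with ridge regression. Everything named here is an assumption for illustration: the functions `solve_linear`, `fit_local_ar`, and `local_models`, the `window`, `step`, and `ridge` parameters, and the choice to penalize the intercept along with the lag coefficients. It regresses one target variable on the previous `order` rows of all variables, and uses a small pure-Python solver only to stay self-contained.

```python
def solve_linear(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def fit_local_ar(window, target, order, ridge):
    """Ridge-regress window[t][target] on the previous `order` rows.

    Returns [intercept, lag-1 coefficients..., lag-`order` coefficients...].
    Assumption: the intercept is penalized like every other coefficient.
    """
    rows, ys = [], []
    for t in range(order, len(window)):
        feats = [1.0]
        for lag in range(1, order + 1):
            feats.extend(window[t - lag])
        rows.append(feats)
        ys.append(window[t][target])
    k = len(rows[0])
    # Normal equations with ridge penalty: (X^T X + ridge * I) w = X^T y
    xtx = [[sum(r[i] * r[j] for r in rows) + (ridge if i == j else 0.0)
            for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(k)]
    return solve_linear(xtx, xty)


def local_models(series, target, order=1, window=20, step=5, ridge=1e-3):
    """Fit one local model per sliding window; each model is a point in parameter space."""
    return [fit_local_ar(series[s:s + window], target, order, ridge)
            for s in range(0, len(series) - window + 1, step)]
```

Each returned coefficient vector is one point in the parameter space. In practice a numerical library would normally replace the hand-written solver.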
Scoring and Density Estimation
Estimate the density of local models from T relative to R in the parameter space
Kernels
Nearest-neighbor (NN) style
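The slide does not give the exact score function, so the sketch below shows only one plausible nearest-neighbor-style score: the distance from a test model to its k-th nearest reference model in parameter space. A larger distance means a sparser region of R and therefore a more anomalous model. The names `knn_score` and `anomaly_scores` and the parameter `k` are assumptions, not part of the original.

```python
import math


def knn_score(point, reference, k=3):
    """Distance from `point` to its k-th nearest reference model.

    Assumption: low reference density around a test model is read as
    'anomalous', so the k-th neighbor distance is used directly as the score.
    """
    dists = sorted(math.dist(point, r) for r in reference)
    return dists[k - 1]


def anomaly_scores(test_models, reference_models, k=3):
    """Score every test model against the full reference set."""
    return [knn_score(p, reference_models, k) for p in test_models]
```

This brute-force version costs one pass over R per test model; a KD-tree would replace the sort when R is large.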
Determining a Null Distribution
The score function provides a continuous estimate, but some tasks require a hard cutoff
Null Distribution:
  – the distribution of anomaly scores we would expect to see if the data were completely normal
Resample R and generate an empirical distribution from block cross-validation
Provides a hypothesis-testing framework for sounding alarms
[Figure: empirical distribution of the anomaly score]
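A minimal sketch of how a block cross-validation over R might produce the empirical null distribution and a hard cutoff. The details are assumptions: the block count, the k-th nearest-neighbor score used as the anomaly score, the quantile rule, and the names `kth_nn_distance`, `null_scores`, and `threshold`. Holding out contiguous blocks, rather than random points, keeps temporally adjacent windows out of the set they are scored against.

```python
import math


def kth_nn_distance(point, reference, k):
    """Stand-in anomaly score: distance to the k-th nearest reference model."""
    return sorted(math.dist(point, r) for r in reference)[k - 1]


def null_scores(reference_models, n_blocks=5, k=3):
    """Score each contiguous block of R against the remaining blocks of R."""
    n = len(reference_models)
    bounds = [round(i * n / n_blocks) for i in range(n_blocks + 1)]
    scores = []
    for lo, hi in zip(bounds, bounds[1:]):
        held_out = reference_models[lo:hi]
        rest = reference_models[:lo] + reference_models[hi:]
        scores.extend(kth_nn_distance(p, rest, k) for p in held_out)
    return scores


def threshold(scores, level=0.99):
    """Empirical quantile: at most a (1 - level) share of null scores exceed it."""
    ordered = sorted(scores)
    idx = min(len(ordered) - 1, math.ceil(level * len(ordered)) - 1)
    return ordered[max(idx, 0)]
```

A test model whose score exceeds `threshold(null_scores(R_models), 0.99)` would then raise an alarm at the 99% level.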
Computation Time
Local Models
  – Linear in N (reference and test)
  – Cubic in number of variables (for AR)
  – Linear in window size (for AR)
Density Estimation
  – Implemented with KD-trees
  – Potentially N_T log N_R
  – Can be worse in higher dimensions
Experiments
Why evaluation is difficult
Data sets
  – CD Player
  – Random Walk
  – ECG Arrhythmia
  – Financial Time-Series
Comparison Algorithms
  – Hotelling’s T² statistic
Hotelling’s T² Statistic
Commonly used in statistical process control for monitoring multivariate processes
Basically the same as Mahalanobis distance
Applied with time lags for patient monitoring in multivariate data (Gather et al., 2001)
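To make the link to Mahalanobis distance concrete, here is a minimal sketch of the single-observation form of the statistic, T² = (x − μ)ᵀ S⁻¹ (x − μ), with μ and S estimated from reference data. The function names are assumptions, and the time-lagged variant mentioned above is not shown.

```python
def mean_and_cov(rows):
    """Sample mean vector and unbiased sample covariance matrix."""
    n, d = len(rows), len(rows[0])
    mu = [sum(r[j] for r in rows) / n for j in range(d)]
    cov = [[sum((r[i] - mu[i]) * (r[j] - mu[j]) for r in rows) / (n - 1)
            for j in range(d)] for i in range(d)]
    return mu, cov


def solve_linear(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def hotelling_t2(x, mu, cov):
    """Squared Mahalanobis distance of x from mu under covariance cov."""
    diff = [xi - mi for xi, mi in zip(x, mu)]
    z = solve_linear(cov, diff)  # z = cov^{-1} (x - mu)
    return sum(di * zi for di, zi in zip(diff, z))
```

Solving the linear system avoids forming the inverse covariance explicitly; the covariance must be non-singular for this to work.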
CD Player
Data from mechanical CD player arm
  – Two inputs relating to actuators (u1, u2)
  – Two outputs relating to position accuracy (y1, y2)
Output variable y1: artificial anomaly
Output variable y2: unchanged
Hotelling’s T²
Random Walk
No anomalies in random walk data
DARTS
Hotelling’s T²
Cardiac Arrhythmia Data
Electrocardiogram traces from MIT-BIH
Collected to study cardiac dynamics and arrhythmias
Every beat annotated by two cardiologists
30-minute recording sampled at 360 Hz
Roughly 650,000 points, 2000 beats
Points reference set; remainder is test data
Cardiac Reference Data
DARTS Vaa
Hotelling’s T² Vaa
DARTS jjj
TP/FP Statistics
Threshold   TP   TN   FP   FN   Sensitivity   Selectivity
97%                             %             71.1%
98%                             %             80.1%
99%                             %             89.9%
Sensitivity = TP / (TP + FN)
Selectivity = TP / (TP + FP)
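The two definitions can be written out as a short sketch. The counts in the usage note are made up for illustration and are not the values from the table. Selectivity as defined here is the quantity more often called precision.

```python
def sensitivity(tp, fn):
    """Fraction of true anomalies that were flagged: TP / (TP + FN)."""
    return tp / (tp + fn)


def selectivity(tp, fp):
    """Fraction of flagged points that were true anomalies: TP / (TP + FP)."""
    return tp / (tp + fp)
```

For example, with hypothetical counts TP = 8, FP = 2, and FN = 2, both `sensitivity(8, 2)` and `selectivity(8, 2)` evaluate to 0.8.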
Japanese Financial Data
Monthly data from
Variables:
  – Monetary base
  – National bond interest rate
  – Wholesale price index
  – Index of industrial produce
  – Machinery orders
  – Exchange rate yen/dollar
True anomalies unknown
  – subjective evaluation by expert
DARTS: Bond Rate
DARTS: Monetary Base
DARTS: Wholesale Price Index
DARTS: Index Industrial Produce
DARTS: Machinery Orders
Hotelling’s T²
Hotelling’s T² vs. DARTS
T² can detect multivariate changes, but:
  – Has little selectivity
  – Does not distinguish between variables
  – Does not handle drifts
  – F-statistical test often grossly underestimates proper threshold
Limitations of DARTS
Suitability of local models
Window-size and sensitivity
Number of parameters
Overlapping data
Efficiency of KD-tree
Explanation
Related Work
Limit checking
Discrepancy checking
Autoregressive models
Unusual patterns
HMMs
Conclusions
DARTS framework
  – Data -> local models -> parameter space -> density estimate
Provides hypothesis testing framework for flagging anomalies
Promising results on a variety of real and synthetic problems
DARTS Framework
1. Preprocess R and T
2. Select target variable and create local models from R
3. Create local models from T
4. Compare models of T to R in space P
5. Compute Null Distribution
6. Repeat steps 2-5 for each variable