A Framework for Discovering Anomalous Regimes in Multivariate Time-Series Data with Local Models
Stephen Bay
Stanford University, and Institute for the Study of Learning and Expertise
Joint work with Kazumi Saito, Naonori Ueda, and Pat Langley

Discovering Anomalous Regimes
Problem: Discover when a section of an observed time series has been generated by an anomalous regime.
Anomalous: extremely rare or unusual.
Regime: the hypothetical true model generating the observed data.

Motivation
[Figure: time-series plots of charge, voltage, temperature, and current (image: nasa.gov). The variables are causally related, and the system moves through several different modes.]

Other Categories of Irregularities
Outliers
Unusual patterns

DARTS Framework
Discovering Anomalous Regimes in Time Series
[Pipeline diagram:]
1. Reference and test data
2. Local models: estimate on windows
3. Parameter space: map the models into the parameter space
4. Anomaly score: estimate the density of T according to R; compute a threshold

Local Models
Vector autoregressive models
Regression format
Ridge regression
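
A minimal sketch of this step, assuming the local model for a chosen target variable is an autoregressive model over lagged values of all variables, fit by ridge regression on each sliding window. The window length, lag order, and ridge penalty below are illustrative placeholders, not values from the paper.

```python
import numpy as np

def window_var_params(X, target, p=2, window=100, step=50, lam=1.0):
    """Fit a ridge-regularized AR model of the target variable on each
    sliding window, using lagged values of all variables as predictors.
    Returns one coefficient vector (a point in parameter space) per window.
    X: array of shape (n_samples, n_vars)."""
    n, d = X.shape
    params = []
    for start in range(0, n - window + 1, step):
        W = X[start:start + window]
        rows, targets = [], []
        for t in range(p, window):
            # Predictors: the p previous rows of all variables, plus an intercept.
            rows.append(np.concatenate([W[t - k] for k in range(1, p + 1)] + [[1.0]]))
            targets.append(W[t, target])
        A = np.asarray(rows)
        y = np.asarray(targets)
        # Ridge regression: beta = (A'A + lam*I)^-1 A'y
        beta = np.linalg.solve(A.T @ A + lam * np.eye(A.shape[1]), A.T @ y)
        params.append(beta)
    return np.asarray(params)
```

Each coefficient vector becomes one point in the parameter space used by the scoring step below.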

Scoring and Density Estimation
Estimate the density of local models from T relative to R in the parameter space
Kernels
NN style
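
A sketch of the NN-style scorer, assuming the score of a test-window model is its average distance to the k nearest reference-window models (a kernel density estimate over the same points would be the other option above). The choice of k and the use of SciPy's cKDTree are assumptions.

```python
import numpy as np
from scipy.spatial import cKDTree

def knn_anomaly_scores(ref_params, test_params, k=5):
    """Score each test-window model by its average distance to the k
    nearest reference-window models in parameter space; large distances
    mean low density, i.e. more anomalous."""
    tree = cKDTree(ref_params)          # index the reference models
    dists, _ = tree.query(test_params, k=k)
    return dists.mean(axis=1)           # higher score = more anomalous
```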

Determining a Null Distribution
The score function provides a continuous estimate, but some tasks require a hard cutoff.
Null distribution:
–the distribution of anomaly scores we would expect to see if the data were completely normal
Resample R and generate an empirical distribution using block cross-validation.
Provides a hypothesis-testing framework for sounding alarms.
[Figure: empirical distribution of anomaly scores.]
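
A hedged sketch of how the empirical null might be built: hold out contiguous blocks of reference-window models, score each block against the remaining reference models, and pool the scores. The number of blocks, k, and the alarm percentile are illustrative.

```python
import numpy as np
from scipy.spatial import cKDTree

def empirical_null(ref_params, n_blocks=10, k=5):
    """Hold out each contiguous block of reference models in turn, score it
    against the rest, and pool the scores into an empirical null."""
    idx = np.arange(len(ref_params))
    null_scores = []
    for held_out in np.array_split(idx, n_blocks):
        rest = np.setdiff1d(idx, held_out)
        tree = cKDTree(ref_params[rest])
        d, _ = tree.query(ref_params[held_out], k=k)
        null_scores.extend(d.mean(axis=1))
    return np.asarray(null_scores)

# Alarm threshold at, e.g., the 99th percentile of the null:
# threshold = np.percentile(empirical_null(ref_params), 99)
```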

Computation Time
Local models
–Linear in N (reference and test)
–Cubic in the number of variables (for AR)
–Linear in window size (for AR)
Density estimation
–Implemented with KD-trees
–Potentially O(N_T log N_R)
–Can be worse in higher dimensions

Experiments
Why evaluation is difficult
Data sets
–CD player
–Random walk
–ECG arrhythmia
–Financial time series
Comparison algorithm
–Hotelling's T² statistic

Hotelling's T² Statistic
Commonly used in statistical process control for monitoring multivariate processes
Basically the same as the Mahalanobis distance
Applied with time lags for patient monitoring in multivariate data (Gather et al., 2001)
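
For reference, a straightforward implementation of the statistic as the squared Mahalanobis distance from the reference mean and covariance; appending lagged copies of the variables as extra columns is one way to apply it with time lags.

```python
import numpy as np

def hotelling_t2(reference, test):
    """T^2 score of each test observation relative to the reference data.
    reference, test: arrays of shape (n_samples, n_vars)."""
    mu = reference.mean(axis=0)
    cov_inv = np.linalg.inv(np.cov(reference, rowvar=False))
    centered = test - mu
    # Squared Mahalanobis distance for every row of `test`.
    return np.einsum('ij,jk,ik->i', centered, cov_inv, centered)
```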

CD Player
Data from a mechanical CD player arm
–Two inputs relating to the actuators (u1, u2)
–Two outputs relating to position accuracy (y1, y2)

Output variable y1: artificial anomaly

Output variable y2: unchanged

Hotelling's T²

Random Walk
No anomalies in the random walk data

DARTS

Hotelling's T²

Cardiac Arrhythmia Data
Electrocardiogram traces from the MIT-BIH arrhythmia database
Collected to study cardiac dynamics and arrhythmias
Every beat annotated by two cardiologists
30-minute recordings sampled at 360 Hz
Roughly 650,000 points, 2,000 beats
A subset of the points forms the reference set; the remainder is test data

Cardiac Reference Data

DARTS

Hotelling's T²

DARTS

TP/FP Statistics
Threshold | TP | TN | FP | FN | Sensitivity | Selectivity
97%       |    |    |    |    |             | 71.1%
98%       |    |    |    |    |             | 80.1%
99%       |    |    |    |    |             | 89.9%
Sensitivity = TP / (TP + FN)
Selectivity = TP / (TP + FP)
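
The two measures in code form; the counts passed in here would come from the confusion matrix at a chosen threshold, not from the table above.

```python
def sensitivity(tp, fn):
    """Fraction of true anomalies that are detected."""
    return tp / (tp + fn)

def selectivity(tp, fp):
    """Fraction of raised alarms that are true anomalies."""
    return tp / (tp + fp)
```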

Japanese Financial Data
Monthly data
Variables:
–Monetary base
–National bond interest rate
–Wholesale price index
–Index of industrial production
–Machinery orders
–Exchange rate (yen/dollar)
True anomalies unknown
–Subjective evaluation by an expert

DARTS: Bond Rate

DARTS: Monetary Base

DARTS: Wholesale Price Index

DARTS: Index of Industrial Production

DARTS: Machinery Orders

Hotelling's T²

Hotelling's T² vs. DARTS
T² can detect multivariate changes, but:
–It has little selectivity
–It does not distinguish between variables
–It does not handle drifts
–The F-statistic test often grossly underestimates the proper threshold

Limitations of DARTS
Suitability of local models
Window size and sensitivity
Number of parameters
Overlapping data
Efficiency of the KD-tree
Explanation

Related Work
Limit checking
Discrepancy checking
Autoregressive models
Unusual patterns
HMMs

Conclusions
DARTS framework: data -> local models -> parameter space -> density estimate
Provides a hypothesis-testing framework for flagging anomalies
Promising results on a variety of real and synthetic problems

DARTS Framework
1. Preprocess R and T
2. Select a target variable and create local models from R
3. Create local models from T
4. Compare the models of T to those of R in the parameter space P
5. Compute the null distribution
6. Repeat steps 2-5 for each variable
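
A compact end-to-end sketch of steps 1-6 on synthetic data, tying together the pieces sketched earlier. The injected regime change, lag order, window size, k, and the 99th-percentile threshold are all illustrative choices, not the paper's settings.

```python
import numpy as np
from scipy.spatial import cKDTree

rng = np.random.default_rng(0)

def local_models(X, target, p=2, window=100, step=50, lam=1.0):
    """Ridge-regression AR model of the target variable on each window."""
    n, d = X.shape
    out = []
    for s in range(0, n - window + 1, step):
        W = X[s:s + window]
        A = np.asarray([np.concatenate([W[t - k] for k in range(1, p + 1)] + [[1.0]])
                        for t in range(p, window)])
        y = W[p:window, target]
        out.append(np.linalg.solve(A.T @ A + lam * np.eye(A.shape[1]), A.T @ y))
    return np.asarray(out)

# 1. Reference and test data: a 2-variable random walk; the dynamics of
#    variable 0 change halfway through the test series (injected anomaly).
R = np.cumsum(rng.normal(size=(2000, 2)), axis=0)
T = np.cumsum(rng.normal(size=(2000, 2)), axis=0)
for t in range(1000, 2000):
    T[t, 0] = 0.5 * T[t - 1, 0] + rng.normal()

for target in range(R.shape[1]):                          # 6. repeat per variable
    PR = local_models(R, target)                          # 2. local models from R
    PT = local_models(T, target)                          # 3. local models from T
    scores = cKDTree(PR).query(PT, k=5)[0].mean(axis=1)   # 4. compare in space P
    # 5. Null distribution: reference models scored against each other
    #    (the zero self-distance in the first column is discarded).
    null = cKDTree(PR).query(PR, k=6)[0][:, 1:].mean(axis=1)
    threshold = np.percentile(null, 99)
    print(f"variable {target}: {(scores > threshold).sum()} windows flagged")
```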