Published by Emily Leatherbury. Modified about 1 year ago.

1
Mining Recent Temporal Patterns for Event Detection in Multivariate Time Series Data. Iyad Batal, Dept. of Computer Science, University of Pittsburgh, iyad@cs.pitt.edu. Dmitriy Fradkin, Siemens Corporate Research, dmitriy.fradkin@siemens.com. James Harrison, Dept. of Public Health Sciences, University of Virginia, james.harrison@virginia.edu. Fabian Moerchen, Siemens Corporate Research, fabian.moerchen@siemens.com. Milos Hauskrecht, Dept. of Computer Science, University of Pittsburgh, milos@cs.pitt.edu. 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2012.

2
Introduction Supervised temporal detection: given a labeled dataset of temporal instances, each observed up to a time t_i, find frequently occurring “temporal patterns” for each label; then, given a new instance, predict its label. Contributions of this paper: abstractions that define “recent temporal patterns” (motivated by medical EHR records), and algorithms that find frequent patterns in a given database.

3
Example Database: EHR records of patients. Each record: multiple temporal variables, each with multiple readings up to time t_i (e.g. glucose, creatinine, cholesterol). Label: disease/symptom detected at time t_i. Supervised learning: given a database, learn the patterns associated with different diseases. Prediction: given a new patient, find the “recent temporal patterns” in the record and predict the label associated with them.

4
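Temporal Abstraction Patterns Issues with the raw time series: they are irregularly sampled, contain sampling errors, and are multivariate. Temporal abstractions: map numeric values to a finite abstraction alphabet, e.g. very low, low, normal, high, very high. All contiguous values with the same abstraction form an interval, so a time series can be represented as a sequence of intervals, each given by an abstraction value with a start time and an end time.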
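
A minimal sketch of this abstraction step, assuming fixed cut points between the five abstraction values. The cut points, the names `CUTS`, `LABELS`, `abstract_value`, and `to_intervals`, and the sample readings are all hypothetical and chosen only for illustration; the slides do not say how the thresholds are selected.

```python
from bisect import bisect_right

# Hypothetical cut points and labels; real ones would come from domain knowledge.
CUTS = [60, 80, 120, 180]
LABELS = ["very low", "low", "normal", "high", "very high"]


def abstract_value(x, cuts=CUTS, labels=LABELS):
    """Map one numeric reading to its abstraction label."""
    return labels[bisect_right(cuts, x)]


def to_intervals(samples, cuts=CUTS, labels=LABELS):
    """Merge contiguous samples with the same label into (label, start, end).

    `samples` is a list of (time, value) pairs sorted by time.
    """
    intervals = []
    for t, x in samples:
        label = abstract_value(x, cuts, labels)
        if intervals and intervals[-1][0] == label:
            # Same abstraction as the previous sample: extend the open interval.
            intervals[-1] = (label, intervals[-1][1], t)
        else:
            intervals.append((label, t, t))
    return intervals


# Irregularly sampled toy readings: (time, value).
samples = [(0, 95), (3, 110), (7, 150), (8, 170), (15, 100)]
intervals = to_intervals(samples)
```

Because only label changes open a new interval, irregular sampling does not matter here: five readings collapse into three intervals.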

5
Temporal Abstraction Patterns Multivariate State Sequences. Temporal variable: F. State: V, the abstraction value taken by F. State interval: E = (F, V, s, e), with start time s and end time e, so a single-variable time series is an ordered set of E_i. Multivariate State Sequence (MSS) Z_i (basically a patient record): an ordered combination of the state intervals of all the variables it contains, ordered by start time. If start times collide, sort by end time; if both collide, sort lexically.
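
The ordering rule can be sketched as a single sort key. The `StateInterval` type and `make_mss` function are illustrative names, and the lexical tie-break on (variable, value) is an assumption about what the slide means by lexical ordering.

```python
from typing import NamedTuple


class StateInterval(NamedTuple):
    F: str    # temporal variable
    V: str    # abstraction value
    s: float  # start time
    e: float  # end time


def make_mss(intervals):
    """Order by start time, then end time, then (variable, value) lexically."""
    return sorted(intervals, key=lambda E: (E.s, E.e, E.F, E.V))


# Toy patient record with two variables.
mss = make_mss([
    StateInterval("glucose", "high", 0, 5),
    StateInterval("glucose", "normal", 6, 10),
    StateInterval("creatinine", "high", 2, 4),
    StateInterval("creatinine", "normal", 0, 5),
])
```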

6
Temporal Abstraction Patterns What is a pattern? A sequence of “temporal relations” between state intervals Ei and Ej. What kinds of temporal relations? Ei occurs BEFORE Ej (or vice versa), or Ei CO-OCCURS with Ej. Finer-grained relations exist, such as starts together, ends together, equals, contains, overlaps, and meets, but the authors consider only the two relations above, which generalize all of these.
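
A small sketch of the two relations, under the assumption that "before" means Ei ends strictly before Ej starts and that every other case (overlaps, contains, meets, and so on) counts as "co-occurs". The slide does not spell out these boundary conditions, so the treatment of touching intervals in particular is an assumption, and `relation` is a hypothetical helper name.

```python
def relation(Ei, Ej):
    """Return 'b' (before) or 'c' (co-occurs) for two (start, end) intervals.

    Assumes Ei starts no later than Ej, as in a sorted state sequence.
    """
    (si, ei), (sj, ej) = Ei, Ej
    if si > sj:
        raise ValueError("expected Ei to start no later than Ej")
    # Assumption: 'before' requires a strict gap between Ei's end and Ej's start.
    return "b" if ei < sj else "c"


r_gap = relation((0, 3), (5, 8))      # Ei ends before Ej starts
r_overlap = relation((0, 6), (5, 8))  # Ej starts while Ei is still active
r_meets = relation((0, 5), (5, 8))    # touching intervals
```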

7
Temporal Abstraction Patterns Temporal pattern: P = (<S1, ..., Sk>, R), where each Si is a state and R is the relation matrix, which assigns either b (before) or c (co-occurs) to each pair of a state and a subsequent state. Hence R is an upper triangular matrix. P is called a k-pattern, where k = |<S1, ..., Sk>| is the number of states.

8
Temporal Abstraction Patterns Pattern containment: given a pattern P = (<S1, ..., Sk>, R) and an MSS Z, Z contains P iff every Si is matched by a state interval Ei in Z, and for every pair i < j (i = 1..k-1, j = i+1..k) the relation R i,j holds between Ei and Ej (denoted by R i,j (Ei, Ej)).
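
A brute-force sketch of the containment test. It assumes states are (F, V) pairs, that the matched intervals must appear in Z in the same order as the states in P, and that "before" means a strict gap; `contains`, `relation`, and the toy record are hypothetical and not taken from the slides.

```python
def relation(Ei, Ej):
    """'b' if Ei ends strictly before Ej starts, otherwise 'c' (assumption)."""
    return "b" if Ei[3] < Ej[2] else "c"


def contains(Z, states, R):
    """Check whether the MSS Z contains the pattern (states, R).

    Z:      list of (F, V, s, e) tuples sorted by start time.
    states: list of (F, V) pairs.
    R:      dict mapping (i, j) with i < j to 'b' or 'c'.
    """
    def extend(matched, start):
        i = len(matched)
        if i == len(states):
            return True  # every state has been matched
        for pos in range(start, len(Z)):
            E = Z[pos]
            if (E[0], E[1]) != states[i]:
                continue
            # The new interval must satisfy R against all earlier matches.
            if all(relation(Z[m], E) == R[(j, i)] for j, m in enumerate(matched)):
                if extend(matched + [pos], pos + 1):
                    return True
        return False

    return extend([], 0)


Z = [("G", "H", 0, 5), ("C", "H", 3, 9), ("G", "N", 6, 12)]
states = [("G", "H"), ("C", "H"), ("G", "N")]
R_ok = {(0, 1): "c", (0, 2): "b", (1, 2): "c"}
R_bad = {(0, 1): "b", (0, 2): "b", (1, 2): "c"}
```

This backtracking search is exponential in the worst case, which is part of why the later slides focus on pruning candidates before matching.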

9
Mining Recent Temporal Patterns “Recent state”: given an MSS Z, a state interval Ei is “recent” for a maximum gap g if either of the following conditions is true. (1) Ei is the last state interval of its temporal variable, i.e. Ei.F is not equal to Ek.F for all k > i. (2) Z.end - Ei.e <= g.
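
The two conditions translate directly into code. In this sketch, `is_recent` is a hypothetical name, and Z.end is assumed to be the latest end time among the intervals of Z, which the slide does not state.

```python
def is_recent(Z, i, g):
    """Is the state interval Z[i] recent in Z for maximum gap g?

    Z is a list of (F, V, s, e) tuples sorted by start time.
    """
    F, _, _, e = Z[i]
    z_end = max(E[3] for E in Z)  # assumption: Z.end is the latest end time
    # Condition 1: no later interval belongs to the same variable.
    last_for_variable = all(Z[k][0] != F for k in range(i + 1, len(Z)))
    # Condition 2: the interval ends within g of the end of the record.
    within_gap = z_end - e <= g
    return last_for_variable or within_gap


Z = [("G", "H", 0, 5), ("C", "N", 1, 4), ("G", "N", 6, 20)]
```

With g = 3, the first glucose interval is not recent (a later glucose interval exists and it ended 15 time units before the end of the record), while the creatinine interval is recent because it is the last one for its variable.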

10
Mining Recent Temporal Patterns “Recent temporal pattern (RTP)”: given an MSS Z, a pattern P = (<S1, ..., Sk>, R) is a “recent” pattern in Z for a maximum gap g if ALL of the following conditions are true. (1) Z contains P. (2) Sk is matched to a recent state interval in Z. (3) No two consecutive matched state intervals in Z are more than g apart, i.e. E(i+1).s - E(i).e <= g for i = 1..k-1. Suffix sub-pattern: P is a suffix sub-pattern of P’ if the states of P form a suffix of the states of P’ and all the relations among those states are the same in P as in P’. For example, the last state of P’ alone, or its last two states, form suffix sub-patterns; any selection that leaves out the last state of P’ does not.
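
Conditions (2) and (3) can be checked on a given match as sketched below. The function assumes the containment check (condition 1) has already produced `matched`, the positions in Z of the intervals matched to S1..Sk; `is_recent_match` and `is_recent` are hypothetical names, and Z.end is again assumed to be the latest end time in Z.

```python
def is_recent(Z, i, g):
    """Recent-state test: last interval of its variable, or ends within g of Z.end."""
    F, _, _, e = Z[i]
    z_end = max(E[3] for E in Z)
    return all(Z[k][0] != F for k in range(i + 1, len(Z))) or z_end - e <= g


def is_recent_match(Z, matched, g):
    """Check RTP conditions (2) and (3) for an already verified match.

    matched: increasing positions in Z of the intervals matched to S1..Sk.
    """
    # Condition 2: the last matched interval must be a recent state interval.
    if not is_recent(Z, matched[-1], g):
        return False
    # Condition 3: consecutive matched intervals are at most g apart.
    return all(Z[b][2] - Z[a][3] <= g for a, b in zip(matched, matched[1:]))


Z = [("G", "H", 0, 5), ("C", "H", 3, 9), ("G", "N", 16, 20)]
```

Here the full match `[0, 1, 2]` fails for g = 6 because the last two intervals are 7 apart, but passes for g = 7.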

11
Mining Recent Temporal Patterns Frequent recent pattern: given a database D of MSS, a gap parameter g, and a support threshold sigma, a pattern P is called “frequent” if its “support”, the number of MSS in D in which P is an RTP, denoted RTP-sup-g(P, D), is greater than sigma.

12
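Mining Algorithm Goal: for a given database and each of the given labels, find the frequent recent patterns associated with that label. In other words, for each class y, given the database Dy of instances labeled y, output the set of patterns that satisfy the frequency condition, i.e. whose support RTP-sup-g(P, Dy) is greater than sigma.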

13
Mining Algorithm Approach: build patterns of incrementally larger size. Start with patterns of size 1 and build on top of them. At the (k+1)th stage, i.e. to find the (k+1)-RTPs given the k-RTPs, the algorithm consists of two steps: candidate generation, and counting (removing candidates that do not qualify).
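
The generate-then-count loop can be sketched as follows. This is a skeleton under strong simplifications, not the authors' algorithm: patterns are plain tuples of states with no relation matrix, candidates are formed by prepending a state to a frequent pattern (an assumption about the extension direction), and the toy support function simply counts sequences that end with the pattern as a stand-in for recency. All names are hypothetical.

```python
def mine_levelwise(states, support, sigma, max_size):
    """Return all patterns whose support exceeds sigma, smallest first."""
    level = [(s,) for s in states if support((s,)) > sigma]
    frequent = list(level)
    k = 1
    while level and k < max_size:
        # Candidate generation: prepend every state to every frequent k-pattern.
        candidates = [(s,) + P for P in level for s in states]
        # Counting: keep only the candidates that are frequent.
        level = [P for P in candidates if support(P) > sigma]
        frequent.extend(level)
        k += 1
    return frequent


# Toy database: each string is a sequence of one-letter states.
db = ["abc", "bbc", "xbc", "ac"]
states = sorted(set("".join(db)))


def suffix_support(P):
    """Toy stand-in for RTP support: sequences that end with the pattern."""
    return sum(1 for seq in db if seq.endswith("".join(P)))


result = mine_levelwise(states, suffix_support, sigma=1, max_size=3)
```

The loop stops as soon as a level produces no frequent pattern, since larger patterns are built only from frequent smaller ones.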

14
Mining Algorithm Naïve Candidate Generation [Figure: each k-RTP (shown with states a, b, c) is paired with each of the states 1, 2, 3, ..., L to form the candidates.]

15
Mining Algorithm Improving Efficiency Remove “incoherent” patterns, i.e. patterns that are not allowed. If S1.F = Si.F and R1,i = c, then the pattern cannot be valid, since these states would be combined into one. If R1,j = c then for any R1,i such that i

16
Mining Algorithm Naïve counting algorithm: for each label y, for each candidate P, for each MSS Z in the database Dy, verify whether P is an RTP in Z and, if so, increment the count of P for label y.

17
Mining Algorithm Improving the efficiency of the counting algorithm. First, filter D based only on states (instead of matching entire patterns): an MSS can contain P only if it contains the states of P. Second, a proposition based on the suffix sub-pattern definition: the list of Z in D containing P is a subset of the list of Z containing P’ if P’ is a suffix sub-pattern of P. Intersect the two lists to get the actual candidate Zs to search and match. Note: due to the second property, the list of candidate Zs keeps shrinking as the algorithm proceeds. At the end of the algorithm, for each label y, we have a list of frequent m-RTPs associated with that label.
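
A sketch of the list intersection, assuming the miner keeps, for each state and for each frequent RTP, the set of ids of the MSS in which it occurs. The dictionaries `state_ids` and `rtp_ids`, the function `candidate_ids`, and the id values are hypothetical.

```python
def candidate_ids(new_state, suffix, state_ids, rtp_ids):
    """Ids of the MSS that could still contain the candidate pattern.

    The candidate is `suffix` extended with `new_state`. Any MSS holding it
    must contain the new state and must already hold the suffix sub-pattern,
    so only the intersection needs full pattern matching.
    """
    return state_ids[new_state] & rtp_ids[suffix]


# Toy id lists: which MSS contain each state / each frequent 1-RTP.
state_ids = {"glucose=high": {1, 2, 4, 7}, "creatinine=high": {2, 3, 7}}
rtp_ids = {("creatinine=high",): {2, 3, 7}}

ids = candidate_ids("glucose=high", ("creatinine=high",), state_ids, rtp_ids)
```

If the intersection is already no larger than the support threshold, the candidate can be discarded without any matching at all.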

18
Learning the classifier for prediction For each instance in D, get the temporal abstraction (MSS) Zi. Mine the frequent m-RTPs for each label. Combine all the RTPs into a set Omega. For each MSS Zi, create a binary feature vector f of size |Omega|: put 1 if the pattern occurs in Zi, 0 otherwise. Use any existing classifier (ANN, SVM, etc.) to learn from the training set.
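
The feature construction can be sketched as below. Patterns are represented by opaque names, and `build_omega`, `featurize`, and the toy pattern lists are hypothetical; deciding which patterns occur in a given Zi is assumed to be done elsewhere.

```python
def build_omega(rtps_by_label):
    """Combine the per-label pattern lists into one ordered, duplicate-free list."""
    omega = []
    for label in sorted(rtps_by_label):
        for P in rtps_by_label[label]:
            if P not in omega:
                omega.append(P)
    return omega


def featurize(found, omega):
    """Binary vector of size |omega|: 1 where the pattern occurs in the MSS."""
    return [1 if P in found else 0 for P in omega]


# Toy mined patterns per label, and the patterns found in one MSS.
rtps_by_label = {"case": ["A", "B"], "control": ["B", "C"]}
omega = build_omega(rtps_by_label)
f = featurize({"B", "C"}, omega)
```

Keeping Omega as an ordered list fixes the column order, so training and test instances share the same feature layout.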

19
Experimental Evaluation Dataset: 13,558 records of diabetic patients; 19 time series variables per patient (glucose, creatinine, hemoglobin, cholesterol, etc.); 602 ICD-9 diagnosis codes divided into 8 disease categories (8 class labels). Setup: separate experiments for each of the 8 labels. Each category is divided into cases (positives) and controls (negatives). Cases: patients with the target disease, with all time series variables recorded up to the time the disease was FIRST diagnosed. Controls: all other patients, with all time series variables recorded up to a randomly selected point in time.

20
Experimental Evaluation Classification Performance Methods compared: 1. Last Values: consider only the most recent value of each variable. 2. TP: consider all temporal patterns for each variable. 3. TP_Sparse: consider all temporal patterns, but select the top 50 for each variable. 4. RTP: consider all recent temporal patterns for each variable. 5. RTP_Sparse: the top 50 RTPs for each variable.

21
Experimental Evaluation Classification Performance Sigma (support) for methods 2-5 is set to 15%. Gap (for methods 4 and 5) is set to 6 months. Test: 10-fold cross-validation. Quality measurement: create features using each of the above methods, build an SVM on these features, and evaluate the SVM using classification accuracy and AUC, i.e. the area under the ROC (receiver operating characteristic) curve. AUC is equal to the probability that a classifier will rank a randomly chosen positive instance higher than a randomly chosen negative one. Accuracy = (TP + TN) / (P + N), where TP and TN are here the numbers of true positives and true negatives, and P and N are the numbers of positive and negative instances.
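
Both measures can be computed directly from their definitions, as in this sketch. The pairwise AUC below follows the probabilistic definition quoted on the slide, counting ties as one half (a common convention, assumed here); the function names, scores, and the 0.5 decision threshold are hypothetical.

```python
def accuracy(y_true, y_pred):
    """(TP + TN) / (P + N): the fraction of instances labeled correctly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def auc(y_true, scores):
    """Probability that a random positive is scored above a random negative."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    # Ties count as half a win (assumed convention).
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


y_true = [1, 1, 0, 0]
scores = [0.9, 0.4, 0.5, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # hypothetical threshold

acc = accuracy(y_true, y_pred)
area = auc(y_true, scores)
```

The example shows why the two measures can disagree: thresholding at 0.5 gets only half the labels right, yet three of the four positive-negative pairs are ranked correctly.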

22
Experimental Evaluation

23
Knowledge Discovery

24
Experimental Evaluation

25
Conclusions “Recent temporal patterns” are of special interest in the medical domain, and should behave similarly in other domains. Time series abstractions provide a simple approximation as well as a compression of the data. The gap parameter used in detecting patterns is critical for scaling up the mining process (but is domain dependent). RTPs provide more efficient mining as well as better prediction accuracy than patterns detected over the entire series (validated here in the medical domain). How can we leverage or extend this? Work towards defining high-level abstractions for time series kernels. Extend from an “independent” multivariate model to an interdependent one, where vertices represent variables and edges define the dependencies between them.
