Iyad Batal Dmitriy Fradkin James Harrison Fabian Moerchen


1 Mining Recent Temporal Patterns for Event Detection in Multivariate Time Series Data
Iyad Batal, Dmitriy Fradkin, James Harrison, Fabian Moerchen, Milos Hauskrecht
Dept. of Computer Science, University of Pittsburgh; Siemens Corporate Research; Dept. of Public Health Sciences, University of Virginia
18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2012

2 Introduction
Supervised temporal detection:
- Given a labeled dataset of temporal instances, each observed up to time ti.
- Find frequently occurring “temporal patterns” for each label.
- Given a new instance, predict its label.
Contributions of this paper:
- Abstractions to define “recent temporal patterns” (motivated by medical EHR data).
- Algorithms to find “frequent patterns” in a given database.

3 Example
Database: EHR records of patients.
Each record:
- Multiple temporal variables, each with multiple readings up to time ti (e.g. glucose, creatinine, cholesterol).
- Label: disease/symptom detected at time ti.
Supervised learning: given a database, learn the patterns associated with different diseases.
Prediction: given a new patient, find the “recent temporal patterns” in the record and the label associated with them.

4 Temporal Abstraction Patterns
Issues with the raw time series:
- Irregularly sampled
- Sampling errors
- Multivariate
Temporal abstractions:
- Map numeric values to a finite abstraction alphabet, e.g. very low, low, normal, high, very high.
- All contiguous values with the same abstraction form an interval.
- A time series can then be represented as {<v1,s1,e1>, <v2,s2,e2>, …}, where vi is the abstract value and si, ei are the start and end times of the interval.
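The abstraction step can be sketched in a few lines of Python. This is only an illustration: the function names, the cut points and the label alphabet are assumptions, and ending each interval at the last reading that shares its label is one possible convention rather than something the slides specify.

```python
from typing import List, Tuple

def abstract_value(x: float, cuts: List[float], labels: List[str]) -> str:
    """Map a numeric reading to an abstract label.

    `cuts` are ascending upper bounds; `labels` has one more entry than `cuts`.
    """
    for cut, label in zip(cuts, labels):
        if x < cut:
            return label
    return labels[-1]

def to_intervals(times: List[float], values: List[float],
                 cuts: List[float], labels: List[str]) -> List[Tuple[str, float, float]]:
    """Merge contiguous readings with the same label into <v, s, e> triples."""
    intervals: List[Tuple[str, float, float]] = []
    for t, x in zip(times, values):
        v = abstract_value(x, cuts, labels)
        if intervals and intervals[-1][0] == v:
            # Same abstraction as the previous reading: extend the interval.
            prev_v, prev_s, _ = intervals[-1]
            intervals[-1] = (prev_v, prev_s, t)
        else:
            intervals.append((v, t, t))
    return intervals

# Hypothetical glucose thresholds, purely for illustration.
CUTS = [70.0, 100.0, 140.0, 200.0]
LABELS = ["VL", "L", "N", "H", "VH"]
series = to_intervals([0, 1, 2, 5, 6], [90, 95, 150, 160, 85], CUTS, LABELS)
```

Irregular sampling is handled naturally here because intervals are defined by reading times, not by a fixed grid.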

5 Temporal Abstraction Patterns
Multivariate state sequences:
- Temporal variable: F
- State (abstract value): V
- State interval: E = (F, V, s, e), so a single-variable time series is an ordered set of Ei
- Multivariate state sequence (MSS) Zi (basically a patient record): an ordered combination of the state intervals of all the variables it contains
Ordering of the state intervals:
- Order by start time.
- If start times collide, order by end time.
- If both collide, fall back to a lexical ordering.
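The ordering rule maps directly onto a tuple sort key. In this sketch the `StateInterval` type and `make_mss` function are illustrative names, and breaking the final tie lexically on (F, V) is an assumption about what "lexical ordering" refers to.

```python
from typing import List, NamedTuple

class StateInterval(NamedTuple):
    F: str    # temporal variable
    V: str    # abstract value
    s: float  # start time
    e: float  # end time

def make_mss(intervals: List[StateInterval]) -> List[StateInterval]:
    """Order state intervals by start time, then end time, then lexically."""
    return sorted(intervals, key=lambda E: (E.s, E.e, E.F, E.V))

record = [
    StateInterval("glucose", "H", 4, 9),
    StateInterval("creatinine", "N", 0, 9),
    StateInterval("glucose", "N", 0, 3),
    StateInterval("cholesterol", "H", 0, 9),
]
Z = make_mss(record)
```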

6 Temporal Abstraction Patterns
What is a pattern? A sequence of “temporal relations” between state intervals Ei, Ej.
What kind of “temporal relations”?
- Ei occurs BEFORE Ej (or vice versa)
- Ei CO-OCCURS with Ej
There are finer-grained relations such as starts together, ends together, equals, contains, overlaps, meets, etc., but the authors consider only the two relations above, which generalize all of them.
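The two relations can be written as simple predicates on interval endpoints. The exact boundary handling below (strict inequality for BEFORE, Ej starting inside Ei for CO-OCCURS) is an assumption for illustration; the slides do not spell it out.

```python
from typing import NamedTuple

class StateInterval(NamedTuple):
    F: str    # temporal variable
    V: str    # abstract value
    s: float  # start time
    e: float  # end time

def before(Ei: StateInterval, Ej: StateInterval) -> bool:
    """Ei ends strictly before Ej starts."""
    return Ei.e < Ej.s

def co_occurs(Ei: StateInterval, Ej: StateInterval) -> bool:
    """Ej starts while Ei is still ongoing (Ei starts no later than Ej)."""
    return Ei.s <= Ej.s <= Ei.e

E1 = StateInterval("glucose", "H", 0, 5)
E2 = StateInterval("creatinine", "N", 3, 8)
E3 = StateInterval("cholesterol", "H", 6, 9)
```

With these definitions, for any two intervals ordered by start time exactly one of the two relations holds, which is what lets the two coarse relations stand in for the finer-grained ones.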

7 Temporal Abstraction Patterns
Temporal pattern: P = (<S1,S2,…,Sk>, R), where each Si is a state and R is the relation matrix, which assigns either b (before) or c (co-occurs) to each pair of a state and a subsequent state. Hence R is an upper triangular matrix. P is called a k-pattern, where k = |<S1,…,Sk>|.

8 Temporal Abstraction Patterns
Pattern containment: given a pattern P = (<S1,S2,…,Sk>, R) and an MSS Z = <E1,…,El>, Z contains P iff:
- every Si is matched by a state interval in Z, and
- for all i, j with 1 <= i < j <= k, the relation Ri,j holds between the intervals matched to Si and Sj (denoted Ri,j(Ei, Ej)).
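A containment check can be sketched as a small backtracking search. Everything here is an assumption made for illustration: states are (F, V) pairs, R is a k-by-k nested list whose upper triangle holds "b" or "c", matches are taken in increasing MSS order, and the relation predicates use the same endpoint conventions as a simple BEFORE / CO-OCCURS test.

```python
from typing import List, NamedTuple, Optional, Tuple

class StateInterval(NamedTuple):
    F: str
    V: str
    s: float
    e: float

def rel_holds(r: str, Ei: StateInterval, Ej: StateInterval) -> bool:
    """'b': Ei ends before Ej starts; 'c': Ej starts while Ei is ongoing."""
    if r == "b":
        return Ei.e < Ej.s
    return Ei.s <= Ej.s <= Ei.e

def contains(Z: List[StateInterval],
             states: List[Tuple[str, str]],
             R: List[List[Optional[str]]]) -> bool:
    """Return True if some assignment of intervals in Z satisfies the pattern."""
    def extend(matched: List[int], start: int) -> bool:
        i = len(matched)
        if i == len(states):
            return True
        for idx in range(start, len(Z)):
            E = Z[idx]
            if (E.F, E.V) != states[i]:
                continue
            # Check R[h][i] against every previously matched interval.
            if all(rel_holds(R[h][i], Z[m], E) for h, m in enumerate(matched)):
                if extend(matched + [idx], idx + 1):
                    return True
        return False
    return extend([], 0)

Z = [StateInterval("A", "H", 0, 5),
     StateInterval("B", "L", 3, 8),
     StateInterval("C", "N", 10, 12)]
states = [("A", "H"), ("B", "L"), ("C", "N")]
R_ok = [[None, "c", "b"], [None, None, "b"], [None, None, None]]
R_bad = [[None, "b", "b"], [None, None, "b"], [None, None, None]]
```

The search is exponential in the worst case, which is one reason the efficiency improvements on the later slides matter.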

9 Mining Recent Temporal Patterns
“Recent state interval”: given an MSS Z = <E1,…,El>, a state interval Ei is “recent” for a maximum gap g if either of the following conditions holds:
- Ei is the last state interval for its temporal variable, i.e. Ei.F is not equal to Ej.F for all j > i
- Z.end - Ei.end <= g
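The two conditions translate into a short predicate. This is a sketch under assumptions: Z is a list already in MSS order, Z.end is taken to be the latest end time in the record, and the names `is_recent` and `StateInterval` are illustrative.

```python
from typing import List, NamedTuple

class StateInterval(NamedTuple):
    F: str
    V: str
    s: float
    e: float

def is_recent(Z: List[StateInterval], i: int, g: float) -> bool:
    """Check whether Z[i] is a recent state interval for maximum gap g."""
    Ei = Z[i]
    z_end = max(E.e for E in Z)  # assumed meaning of Z.end
    last_for_variable = all(E.F != Ei.F for E in Z[i + 1:])
    return last_for_variable or (z_end - Ei.e <= g)

Z = [StateInterval("glucose", "H", 0, 3),
     StateInterval("creatinine", "N", 1, 10),
     StateInterval("glucose", "N", 4, 10)]
```

In this toy record the first glucose interval ends 7 time units before the end of the record, so it counts as recent only when g is at least 7.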

10 Mining Recent Temporal Patterns
“Recent temporal pattern (RTP)”: given an MSS Z = <E1,…,El>, a pattern P = (<S1,…,Sk>, R) is a “recent” pattern in Z for a maximum gap g if ALL of the following conditions hold:
- Z contains P.
- Sk is matched to a recent state interval in Z.
- No two consecutive matched state intervals in Z are more than g apart, i.e. for the intervals matched to Si and Si+1, E(i+1).s - E(i).e <= g.
Suffix sub-pattern: P is a suffix sub-pattern of P’ if P consists of a suffix of the states of P’, with all relations among those states the same as in P’. E.g. if P’ = <S1,S2,S3>, P can be <S3>, <S2,S3> or <S1,S2,S3>, but NOT <S1,S3>.

11 Mining Recent Temporal Patterns
Frequent recent pattern: given a database D of MSSs, a gap parameter g and a support threshold sigma, a pattern P is called “frequent” if its “support”, the number of MSSs in D in which it occurs as an RTP, denoted RTP-sup-g(P, D), is greater than sigma.

12 Mining Algorithm
Goal: for a given database and each given label, find the frequent recent patterns associated with that label. In other words, for each class y with database Dy, output the set of patterns P whose support RTP-sup-g(P, Dy) is greater than sigma.

13 Mining Algorithm
Approach:
- Build patterns of incrementally larger size: start with patterns of size 1 and build on top of those.
- At stage k+1, i.e. to find the (k+1)-RTPs given the k-RTPs, the algorithm consists of two steps:
  1. Candidate generation
  2. Counting (removing candidates that do not qualify)

14 Mining Algorithm
Naïve candidate generation
[Figure: naïve candidate generation from the k-RTPs]

15 Mining Algorithm
Improving efficiency: remove “incoherent” patterns, i.e. patterns that are not allowed.
- If S1.F = Si.F and R1,i = c, the pattern cannot be valid, since these states would be combined into one.
- If R1,j = c, then R1,i CANNOT be “b” for any i < j; otherwise the pattern is incoherent.
This leads to a nice corollary: the relations of S1 can only be a run of consecutive “c”s followed by a run of consecutive “b”s. Hence the number of combinations of R to try is not 2^k but just k+1, which drastically reduces the candidate space.
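The corollary is easy to check by brute force. The sketch below enumerates the relation row of the newly added state S1 against k existing states, keeps only rows where no "c" follows a "b", and compares that with direct generation of the k+1 coherent rows. Function names are illustrative, and the same-variable rule (S1.F = Si.F with relation c) is deliberately left out.

```python
from itertools import product
from typing import List, Tuple

def is_coherent_row(row: Tuple[str, ...]) -> bool:
    """A row is coherent if no 'c' appears after a 'b'."""
    seen_b = False
    for r in row:
        if r == "b":
            seen_b = True
        elif seen_b:
            return False
    return True

def coherent_rows(k: int) -> List[Tuple[str, ...]]:
    """Directly generate the k+1 rows of the form c...c b...b."""
    return [("c",) * m + ("b",) * (k - m) for m in range(k + 1)]

k = 4
brute_force = [row for row in product("bc", repeat=k) if is_coherent_row(row)]
direct = coherent_rows(k)
```

For k = 4 the naive space has 16 rows, of which only 5 survive the coherence filter.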

16 Mining Algorithm
Naïve counting algorithm:
For each label y
  For each candidate P
    For each MSS Z in database Dy
      Verify whether P is an RTP in Z and, if so, increment the count for P under label y.

17 Mining Algorithm
Improving the efficiency of the counting algorithm:
- Filter D based only on states (instead of matching the entire pattern).
- Proposition (based on the suffix sub-pattern definition): if P’ is a suffix sub-pattern of P, the list of Zs in D containing P is a subset of the list of Zs containing P’.
- Intersect the two lists above to get the actual candidate Zs to search and match.
Note: due to the second property, the list of candidate Zs keeps shrinking as the algorithm proceeds.
At the end of the algorithm, for each label y we have a list of frequent m-RTPs associated with that label.
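The filtering idea amounts to a set intersection over MSS identifiers. In this sketch the id sets and the function name are hypothetical: one set holds the records that contain the newly added state, the other the records in which the suffix sub-pattern was already found.

```python
from typing import Set

def candidate_ids(state_ids: Set[int], suffix_ids: Set[int]) -> Set[int]:
    """MSS ids that can possibly contain the extended pattern.

    A record must contain the new state AND the suffix sub-pattern,
    so only the intersection needs full pattern matching.
    """
    return state_ids & suffix_ids

state_ids = {1, 2, 4, 7, 9}   # hypothetical: records containing the new state
suffix_ids = {2, 3, 4, 9}     # hypothetical: records containing the suffix sub-pattern
to_check = candidate_ids(state_ids, suffix_ids)
print(sorted(to_check))  # [2, 4, 9]
```

Because the result is never larger than `suffix_ids`, the set of records to verify can only shrink as patterns are extended.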

18 Learning the classifier for prediction
1. For each instance in D, get the temporal abstraction (MSS) Zi.
2. Mine the frequent m-RTPs for each label.
3. Combine all the RTPs into a set Omega.
4. For each MSS Zi, create a binary feature vector f of size |Omega|: put 1 if the corresponding pattern is in Zi, 0 otherwise.
5. Use any existing classifier (ANN, SVM, etc.) to learn from the training set.
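The feature construction can be sketched as follows. To keep the example short, patterns and records are reduced to sets of state names and "pattern is in Zi" is replaced by a subset test; that stand-in `matches` function is an assumption, and a real pipeline would plug in an RTP check instead.

```python
from typing import Callable, FrozenSet, List, Sequence

Pattern = FrozenSet[str]
Record = FrozenSet[str]

def build_features(records: Sequence[Record],
                   omega: Sequence[Pattern],
                   matches: Callable[[Pattern, Record], bool]) -> List[List[int]]:
    """One binary row per record, one column per pattern in Omega."""
    return [[1 if matches(p, Z) else 0 for p in omega] for Z in records]

def subset_match(p: Pattern, Z: Record) -> bool:
    """Toy stand-in for an RTP check: all states of p appear in Z."""
    return p <= Z

omega = [frozenset({"glucose=H"}),
         frozenset({"glucose=H", "creatinine=H"}),
         frozenset({"cholesterol=L"})]
records = [frozenset({"glucose=H", "creatinine=H"}),
           frozenset({"cholesterol=L", "glucose=N"})]
X = build_features(records, omega, subset_match)
```

The resulting matrix `X` can be passed, together with the labels, to any standard classifier.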

19 Experimental Evaluation
Dataset:
- 13,558 records of diabetic patients
- 19 time series variables per patient (glucose, creatinine, hemoglobin, cholesterol, etc.)
- 602 ICD-9 diagnosis codes divided into 8 disease categories (8 class labels)
Setup:
- Separate experiments for each of the 8 labels.
- Each category is divided into cases (positives) and controls (negatives).
- Cases: patients with the target disease; all time series variables recorded up to the time the disease was FIRST diagnosed.
- Controls: all other patients; all time series variables recorded up to a randomly selected point in time.

20 Experimental Evaluation
Classification performance: methods compared
1. Last Values: only consider the most recent value of each variable.
2. TP: consider all temporal patterns for each variable.
3. TP_Sparse: consider all temporal patterns, but select the top 50 for each variable.
4. RTP: consider all recent temporal patterns for each variable.
5. RTP_Sparse: select the top 50 RTPs for each variable.

21 Experimental Evaluation
Classification performance:
- Sigma (support) is set to 15% for the pattern-based methods (2-5: TP, TP_Sparse, RTP, RTP_Sparse).
- Gap is set to 6 months for the RTP methods (4, 5: RTP, RTP_Sparse).
- Test: 10-fold cross-validation.
Quality measurement:
- Create features using each of the above methods.
- Build an SVM using these features.
- Evaluate the SVM using “classification accuracy” and “AUC”, i.e. the area under the ROC (receiver operating characteristic) curve.
- AUC is equal to the probability that a classifier will rank a randomly chosen positive instance higher than a randomly chosen negative one.
- accuracy = (TP + TN) / (P + N), where TP and TN here are the numbers of true positives and true negatives, and P and N the numbers of positive and negative instances.
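Both quality measures can be computed directly from their definitions. The sketch below uses the pairwise-ranking reading of AUC stated above, counting ties as half a win (a common convention, assumed here); the function names and toy scores are illustrative only.

```python
from typing import Sequence

def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """(TP + TN) / (P + N): fraction of instances labeled correctly."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)

def auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Probability that a random positive is scored above a random negative."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half, by assumption
    return wins / (len(pos) * len(neg))

y_true = [1, 1, 0, 0]
scores = [0.9, 0.4, 0.5, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]
```

On this toy data three of the four positive/negative pairs are ranked correctly, while thresholding at 0.5 labels only half the instances correctly, which shows how the two measures can diverge.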

22 Experimental Evaluation

23 Experimental Evaluation
Knowledge Discovery

24 Experimental Evaluation

25 Conclusions
- “Recent temporal patterns” are of special interest, especially in the medical domain, but should behave similarly in other domains.
- Time series abstractions provide a simple approximation as well as a compression of the data.
- The gap parameter in pattern detection is critical for scaling up the mining process (but is domain dependent).
- RTPs provide efficient mining as well as better prediction accuracy compared to detecting patterns over the entire series (validated here in the medical domain).
How can we leverage/extend this?
- Towards defining high-level abstractions for time series kernels.
- Extend from an “independent” multivariate model to an interdependent multivariate model, where vertices represent the variables and edges define the dependencies.
