The 3-class ECG problem: (left) the best clustering was our approach, the second best (right) was Euclidean distance.


1 The 3-class ECG problem: (left) the best clustering was our approach, the second best (right) was Euclidean distance.

2 Created as in [1], except that the length is 1000 and the transition matrices are hmm1 = [.9 .1; .9 .1]; hmm2 = [.1 .9; .1 .9]; To convert to discrete data, our approach simply treats datapoints above the mean as 1's and all others as 0's.

[1] P. Smyth. Clustering sequences with hidden Markov models. In Advances in Neural Information Processing Systems, volume 9, Cambridge, MA, 1997. MIT Press.
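The generation and discretization steps described on this slide can be sketched as follows. This is only an illustration under stated assumptions: the slide gives the transition matrices and the length, but not the emission model, so the Gaussian emissions (means 0 and 3, unit variance), the uniform initial state, and the function names `simulate_hmm` and `discretize` are placeholders chosen here, not the authors' code.

```python
import random

def simulate_hmm(trans, length, rng, means=(0.0, 3.0), sigma=1.0):
    """Sample a real-valued sequence from a 2-state HMM.

    Assumption: Gaussian emissions with the given per-state means;
    the slide does not specify the emission parameters.
    """
    state = rng.randrange(2)  # assumption: uniform initial state
    samples = []
    for _ in range(length):
        samples.append(rng.gauss(means[state], sigma))
        # Move to state 0 with probability trans[state][0], else state 1.
        state = 0 if rng.random() < trans[state][0] else 1
    return samples

def discretize(series):
    """Map datapoints above the series mean to 1, all others to 0."""
    mean = sum(series) / len(series)
    return [1 if x > mean else 0 for x in series]

# Transition matrices as given on the slide.
hmm1 = [[0.9, 0.1], [0.9, 0.1]]
hmm2 = [[0.1, 0.9], [0.1, 0.9]]

rng = random.Random(0)  # fixed seed so the sketch is repeatable
bits1 = discretize(simulate_hmm(hmm1, 1000, rng))
bits2 = discretize(simulate_hmm(hmm2, 1000, rng))
```

Thresholding at the mean needs no parameters, which is why it is the only discretization step shown here.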

3 DNA: Single Linkage. Dendrogram leaf labels: Bluewhale, FinbackWhale, Cow, Cat, GreySeal, HarborSeal, Horse, WhiteRhino, HouseMouse, Rat, Chimpanzee, Pygmychimpanzee, Human, Gorilla, Orangutan, SumatranOrangutan, Gibbon, Opossum, Wallaroo, Platypus.
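For readers unfamiliar with the linkage named on this slide, here is a minimal sketch of single-linkage agglomerative clustering over a precomputed distance matrix. The function name `single_linkage`, the labels A to D, and the toy distances are all hypothetical; they are not the DNA data or the distance measure used in the presentation.

```python
def single_linkage(dist, labels):
    """Agglomerative clustering with single linkage.

    dist is a symmetric matrix of pairwise distances; labels names each row.
    Returns the merges in order as (cluster_a, cluster_b, distance) tuples.
    """
    index = {lab: i for i, lab in enumerate(labels)}
    clusters = [(lab,) for lab in labels]  # each cluster is a tuple of labels
    merges = []
    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Single linkage: distance between the closest pair of members.
                d = min(dist[index[x]][index[y]]
                        for x in clusters[a] for y in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        merges.append((clusters[a], clusters[b], d))
        merged = clusters[a] + clusters[b]
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)]
        clusters.append(merged)
    return merges

# Hypothetical toy distances, for illustration only.
labels = ["A", "B", "C", "D"]
dist = [
    [0, 1, 5, 6],
    [1, 0, 4, 7],
    [5, 4, 0, 2],
    [6, 7, 2, 0],
]
merges = single_linkage(dist, labels)
```

The order of the merges is what a dendrogram draws: the earliest merges sit lowest in the tree.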

4 Dendrogram leaf order, our approach: 1 4 3 5 2 11 14 12 13 15 16 19 18 20 17 6 9 7 8 10
Dendrogram leaf order, Euclidean distance: 1 16 17 18 19 20 4 7 9 6 2 5 8 11 12 14 15 13 10 3
Cluster 1 (datasets 1 ~ 5): BIDMC Congestive Heart Failure Database (chfdb): record chf02. Start times at 0, 82, 150, 200, 250, respectively.
Cluster 2 (datasets 6 ~ 10): BIDMC Congestive Heart Failure Database (chfdb): record chf15. Start times at 0, 82, 150, 200, 250, respectively.
Cluster 3 (datasets 11 ~ 15): Long Term ST Database (ltstdb): record 20021. Start times at 0, 50, 100, 150, 200, respectively.
Cluster 4 (datasets 16 ~ 20): MIT-BIH Noise Stress Test Database (nstdb): record 118e6. Start times at 0, 50, 100, 150, 200, respectively.

5 Panels: L-1u, L-1n, L-1g, L-1t, L-1v. The results of using our algorithm on various datasets from the Aerospace Corp collection. The bolder the line, the stronger the anomaly. Note that because of the way we plotted these, there is a tendency to locate the beginning of the anomaly as opposed to the most anomalous part. These datasets all have their anomalies in the middle. To make sure that our code does not have a bias toward finding things anomalous in the center of a time series, we ran experiments where we grafted the anomalies to different places. In all these cases we had equal success.
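The grafting check described on this slide can be sketched roughly as below. The helper `graft_anomaly`, the sine-wave background, the flat 40-sample anomaly, and the three positions are all assumptions made for illustration; the slide does not say how the grafting was implemented or which positions were tried.

```python
import math

def graft_anomaly(background, anomaly, position):
    """Return a copy of background with anomaly overwriting samples from position."""
    end = position + len(anomaly)
    if position < 0 or end > len(background):
        raise ValueError("anomaly does not fit at this position")
    return background[:position] + anomaly + background[end:]

# Hypothetical data: a periodic background and a flat segment as the anomaly.
background = [math.sin(2 * math.pi * t / 50) for t in range(600)]
anomaly = [0.0] * 40

# Place the same anomaly near the start, the middle, and the end.
positions = [50, 280, 500]
variants = {p: graft_anomaly(background, anomaly, p) for p in positions}
```

Running the same detector on each variant and comparing where it flags the anomaly would show whether it favors the center of the series.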

6 Stephen Bay asked how our approach would handle random data and random walk data. Here we use ten of each, plus some structured data as a sanity check.
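The two kinds of test data mentioned here differ only in whether the noise is accumulated. A minimal sketch, assuming Gaussian steps and a length of 500 (neither is stated on the slide; the function names are likewise placeholders):

```python
import random
from itertools import accumulate

def random_series(n, rng):
    """Independent Gaussian samples: no structure from one point to the next."""
    return [rng.gauss(0.0, 1.0) for _ in range(n)]

def random_walk(n, rng):
    """Running sum of independent Gaussian steps."""
    return list(accumulate(rng.gauss(0.0, 1.0) for _ in range(n)))

rng = random.Random(42)  # fixed seed so the sketch is repeatable
random_sets = [random_series(500, rng) for _ in range(10)]  # ten random series
walk_sets = [random_walk(500, rng) for _ in range(10)]      # ten random walks
```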

7 Similar to the previous experiment, but with just random and random walk data

8 Figure annotations: Hand resting at side; Hand above holster; Aiming at target; Actor misses holster; Briefly swings gun at target, but does not aim; Laughing and flailing hand. This image is in the paper; this is just a higher resolution version.
