General problem: Retrieval of time-series similar to a given pattern.

2 2 General problem: Retrieval of time-series similar to a given pattern.

6 6 Example: Stock charts. Database of time-series; pattern; retrieval results: 0.92, 0.87, 0.86, 0.84

9 9 Example: Electrocardiogram. Database of time-series; pattern; retrieval results: 0.91, 0.87, 0.98, 1.0

10 10 Outline: Previous work; Important points; Indexing and retrieval; Empirical results; Conclusions

11 11 Outline: Previous work; Important points; Indexing and retrieval; Empirical results; Conclusions (a brace on the slide labels a group of these items "Contributions")

12 12 Criteria for retrieval methods (Gunopulos [2000]): work for erratic time-series; accept any pattern; find inexact matches; work when some points are missing; work on streaming data.

14 14 Previous work: feature choice; similarity metrics; indexing and retrieval.

15 15 Previous work: Feature choice. Discrete Fourier transforms; alphabets; statistical features; subsets of points.

16 16 Previous work: Similarity metrics. Euclidean distance; bounding rectangles; envelope count; aggregate similarity.

17 17 Previous work: Indexing and retrieval. Advanced techniques: B-trees, R-trees, KD-trees, VP-trees, grids. Applied techniques: linear search with compression.

22 22 Important points: Choose “important” maxima and minima, and discard the other points. Example: original series and compressed series (figures).

26 26 Definition of important points. Important minimum: $a_m$ is the minimum among $a_i, \ldots, a_j$; $a_i / a_m \ge R$ and $a_j / a_m \ge R$. $R$ is a knob that determines the compression rate.

27 27 Definition of important points. Important maximum: $a_m$ is the maximum among $a_i, \ldots, a_j$; $a_m / a_i \ge R$ and $a_m / a_j \ge R$. $R$ is a knob that determines the compression rate.

31 31 Compression example: original series and compressed series (figures).

32 32 Compression algorithm: linear time; constant memory; accepts streaming data. For a series with $n$ values, compression time is $0.0133 \cdot n$ milliseconds (300 MHz PC, Visual Basic 6.0).
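
The definition of important points and the stated properties of the compression algorithm (one pass, constant memory, streaming input) can be illustrated with a short Python sketch. This is not the presenters' Visual Basic procedure; it is one possible single-pass implementation, assuming all values are positive and $R > 1$. The function name `important_points` and the `(index, value, kind)` output format are hypothetical. The sketch reports only points that satisfy the ratio condition on both sides, so a leading or trailing extremum that is bounded on one side only is not reported.

```python
from typing import Iterable, Iterator, Tuple

Point = Tuple[int, float, str]  # (index, value, "min" or "max")


def important_points(series: Iterable[float], R: float) -> Iterator[Point]:
    """Yield important minima and maxima of a positive-valued series in one pass."""
    if R <= 1:
        raise ValueError("R must be greater than 1")
    stream = enumerate(series)
    try:
        first_i, first = next(stream)
    except StopIteration:
        return  # empty series: nothing to report
    lo_i, lo = first_i, first  # lowest value since the last turning point
    hi_i, hi = first_i, first  # highest value since the last turning point
    mode = None  # None: direction unknown; "max"/"min": kind of point being tracked
    for i, a in stream:
        if mode is None:
            # Before the first change by a factor of R, only learn the direction.
            if a < lo:
                lo_i, lo = i, a
            if a > hi:
                hi_i, hi = i, a
            if a / lo >= R:      # series rose by a factor of R: a maximum comes next
                mode, hi_i, hi = "max", i, a
            elif hi / a >= R:    # series fell by a factor of R: a minimum comes next
                mode, lo_i, lo = "min", i, a
        elif mode == "max":
            if a > hi:
                hi_i, hi = i, a
            elif hi / a >= R:    # fell by a factor of R: the tracked maximum is important
                yield (hi_i, hi, "max")
                mode, lo_i, lo = "min", i, a
        else:
            if a < lo:
                lo_i, lo = i, a
            elif a / lo >= R:    # rose by a factor of R: the tracked minimum is important
                yield (lo_i, lo, "min")
                mode, hi_i, hi = "max", i, a


series = [1.0, 1.1, 2.0, 1.9, 1.0, 1.2, 3.0, 2.9, 1.4]
print(list(important_points(series, R=1.5)))
# prints [(2, 2.0, 'max'), (4, 1.0, 'min'), (6, 3.0, 'max')]
```

Because the function is a generator that keeps only a few scalars between values, it can consume a stream of unknown length; a larger `R` discards more points and so gives a higher compression rate.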

34 34 Retrieval: Retrieval of time-series similar to a given pattern. Intuition: find a prominent feature in the pattern; find candidate segments with a similar feature; compare similarity of candidates to the pattern.

40 40 Example: Stock charts. Database of time-series; pattern; retrieval results: 0.92, 0.87, 0.86, 0.84

41 41 Algorithm: (1) identify the prominent leg in the pattern; (2) retrieve similar legs from the database; (3) identify the corresponding candidate segments; (4) for each candidate segment, compute its similarity to the pattern; (5) output the candidates whose similarity is above the threshold.

42 42 Important details: Use the compressed pattern and compressed sequences in the retrieval process. The prominent feature is the leg having the greatest ratio of right end to left end. All legs in the database are indexed by their prominence, using a binary search tree.
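
The retrieval steps and details above can be sketched in Python as follows. Several parts are assumptions made for illustration and are not given on the slides: a sorted list with `bisect` stands in for the binary search tree, "similar legs" are taken to be legs whose prominence lies within a hypothetical factor `tol` of the pattern's prominent leg, and the `similarity` function is a hypothetical metric applied to leg ratios. Inputs are already compressed, that is, each sequence is a list of important-point values.

```python
from bisect import bisect_left, bisect_right
from typing import List, Sequence, Tuple


def leg_ratios(points: Sequence[float]) -> List[float]:
    """Ratio of right end to left end for each leg between consecutive points."""
    return [right / left for left, right in zip(points, points[1:])]


def build_index(database: Sequence[Sequence[float]]) -> List[Tuple[float, int, int]]:
    """Sorted (prominence, series id, leg position) triples; stands in for the tree."""
    return sorted(
        (ratio, sid, pos)
        for sid, points in enumerate(database)
        for pos, ratio in enumerate(leg_ratios(points))
    )


def similarity(a: float, b: float) -> float:
    """Hypothetical similarity of two positive numbers: 1.0 when they are equal."""
    return 1.0 - abs(a - b) / (a + b)


def retrieve(pattern, database, index, tol=1.2, threshold=0.9):
    """Return (score, series id, start) for candidates scoring at least threshold."""
    p_ratios = leg_ratios(pattern)
    # Step 1: the prominent leg has the greatest right-to-left ratio.
    p_pos = max(range(len(p_ratios)), key=p_ratios.__getitem__)
    prominence = p_ratios[p_pos]
    # Step 2: range query for legs with prominence within a factor of tol.
    first = bisect_left(index, (prominence / tol,))
    last = bisect_right(index, (prominence * tol, float("inf"), float("inf")))
    results = []
    for _, sid, pos in index[first:last]:
        # Step 3: align the candidate so its leg coincides with the prominent leg.
        start = pos - p_pos
        if start < 0 or start + len(pattern) > len(database[sid]):
            continue  # the pattern would extend past the end of the sequence
        segment = database[sid][start:start + len(pattern)]
        # Step 4: average the leg-by-leg similarity of candidate and pattern.
        c_ratios = leg_ratios(segment)
        score = sum(similarity(p, c) for p, c in zip(p_ratios, c_ratios)) / len(p_ratios)
        # Step 5: keep candidates at or above the threshold.
        if score >= threshold:
            results.append((score, sid, start))
    return sorted(results, reverse=True)


database = [
    [1.0, 2.0, 1.0, 3.0, 1.5],
    [2.0, 4.1, 2.0, 6.2, 3.0],
    [5.0, 4.0, 5.0, 4.0, 5.0],
]
pattern = [10.0, 20.0, 10.0, 30.0]
for score, sid, start in retrieve(pattern, database, build_index(database)):
    print(sid, start, round(score, 3))
```

Comparing leg ratios rather than raw values makes this sketch insensitive to the overall scale of a series. Widening `tol` admits more candidates, which costs more comparisons but can recover matches whose prominent leg differs more from the pattern's.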

43 43 Alternative versions: different prominence definitions; different similarity metrics. The end-point ratio prominence usually gives the best empirical results.

44 44 Extended legs (figure; label: Similar sequence)

45 45 Indexing on extended legs. Advantage: more accurate retrieval. Disadvantage: larger index, more memory. If a compressed sequence has $n$ legs: worst case, $n^2 / 2$ extended legs; average case, on the order of $n \cdot \lg n$ extended legs.

48 48 Data sets: Stock charts (60,000 points); Air and sea temperatures (445,000 points); Wind speeds (79,000 points); Electroencephalograms (17,000 points); Electrocardiograms (2,000 points).

49 49 Patterns: compressed patterns with 4 to 27 legs. Examples: (figures)

50 50 Retrieval time: $0.07 \cdot m \cdot k$ milliseconds, where $m$ is the number of legs in the pattern and $k$ is the number of candidates.

51 51 Retrieval accuracy: Stock charts. 20% candidates, C = 3; 10% candidates, C = 2; 5% candidates, C = 1.5; 1% candidates, C = 1.1

52 52 Retrieval accuracy: Wind speeds. 20% candidates, C = 1.5; 10% candidates, C = 1.2; 5% candidates, C = 1.1

53 53 Retrieval candidate quality. Found matches among ten best, for 5%, 10%, and 20% candidates respectively: Stock charts (5,400 legs): 4, 4, 7; Air and sea temperatures (5,500 legs): 4, 5, 6; Wind speeds (10,500 legs): 3, 7, 9.

60 60 Criteria for retrieval methods (Gunopulos [2000]): work for erratic time-series; accept any pattern; find inexact matches; work when some points are missing; work on streaming data. ~ ~

61 61 Main results. Compression: fast compression procedure; preserves similarity. Retrieval: works with compressed data; controlled trade-off between speed and accuracy.

