1 Themis Palpanas, VLDB - Aug 2004. Fair Use Agreement: This agreement covers the use of all slides on this CD-ROM; please read carefully. You may freely use these slides for teaching if you send me an email telling me the class number/university in advance, and my name and email address appear on the first slide (if you are using all or most of the slides) or on each slide (if you are just taking a few slides). You may freely use these slides for a conference presentation if you send me an email telling me the conference name in advance, and my name appears on each slide you use. You may not use these slides for tutorials or in a published work (tech report/conference paper/thesis/journal etc.). If you wish to do this, email me first; it is highly likely I will grant you permission. (c) Eamonn Keogh, eamonn@cs.ucr.edu

2 Indexing Large Human-Motion Databases. Eamonn Keogh, Themis Palpanas, Victor B. Zordan, Dimitrios Gunopulos (University of California, Riverside); Marc Cardle (University of Cambridge)

3 Motion Capture: records motion data from live actors

4 Motion Capture: records motion data from live actors; used for data-driven animation

5 Motion Capture in Games Industry: Street NBA Madden

6 Motion Capture in Movie Industry: Troy, Lord of the Rings

7 Motivation: motion capture data is segmented into short sequences, stored in motion libraries, and composed to create long, realistic motion sequences. It is important to find similar sequences: form a pool of similar sequences, then choose the most promising one to continue the motion.

8 Motivation: Dynamic Time Warping (DTW) considers only local adjustments in time to match two time series; however, sometimes global adjustments are required. DTW is used extensively, and uniform scaling is complementary: the combination of both techniques offers a rich, high-quality result set. (figure panels: DTW, Uniform Scaling)

9 Uniform Scaling: time series: query Q of length n; candidate C of length m (m > n)

10 Uniform Scaling: time series: query Q of length n; candidate C of length m (m > n). Stretch Q to length p (n ≤ p ≤ m), giving $Q^p$, where $Q^p_j = Q_{\lceil j \cdot n/p \rceil}$ for $1 \le j \le p$. Scaling factor: $sf = p/n$; maximum scaling factor: $sf_{max} = m/n$.
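The stretching rule on this slide can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation; the function name `stretch` is hypothetical, and the only thing taken from the slide is the index rule $Q^p_j = Q_{\lceil j \cdot n/p \rceil}$.

```python
def stretch(q, p):
    """Stretch series q (length n) to length p via Q^p_j = Q_ceil(j*n/p), 1 <= j <= p."""
    n = len(q)
    if p < n:
        raise ValueError("p must be at least len(q)")
    # (j*n + p - 1) // p is an integer ceiling of j*n/p, which avoids
    # floating-point rounding; the trailing -1 converts the slide's
    # 1-based index into Python's 0-based indexing.
    return [q[(j * n + p - 1) // p - 1] for j in range(1, p + 1)]


if __name__ == "__main__":
    print(stretch([1, 2, 3, 4], 6))
```

With p equal to n the series is returned unchanged, which corresponds to a scaling factor of 1.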

11 Problem Statement: given a time series Q and a database of candidate time series {D}, find $\arg\min_p \{ dist(Q^p, \{D\}) \}$, where $dist(Q^p, \{D\})$ is the Euclidean distance between time series.

12 Problem Statement: given a time series Q and a database of candidate time series {D}, find $\arg\min_p \{ dist(Q^p, \{D\}) \}$, where $dist(Q^p, \{D\})$ is the Euclidean distance between time series. Challenges: quickly solve the problem for two time series; extend the solution to scale up to large time series databases.

13 Outline: Speeding Up Search; Scaling Up To Large Databases; Experimental Evaluation; Related Work; Conclusions

14 Best Uniform Scaling Match: brute force algorithm: for each time series in {D}, for each sf with 1 ≤ sf ≤ sf_max, compute the distance between the two time series; then find the best overall match. Time complexity: O(|D|(m-n)), which is extremely expensive!
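The brute force loop can be sketched as follows. The slides do not spell out which part of the candidate a stretched query is compared against, so this sketch assumes that $Q^p$ is compared with the first p points of C, and that the raw (unnormalized) Euclidean distance is used. All function names are hypothetical.

```python
import math


def stretch(q, p):
    """Q^p_j = Q_ceil(j*n/p), using integer arithmetic and 0-based indexing."""
    n = len(q)
    return [q[(j * n + p - 1) // p - 1] for j in range(1, p + 1)]


def euclidean(a, b):
    """Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def best_uniform_scaling_match(q, database):
    """Try every candidate and every stretched length p; return (distance, index, p)."""
    n = len(q)
    best = (math.inf, None, None)
    for idx, c in enumerate(database):
        # p ranges over n..m, i.e. scaling factors 1..sf_max for this candidate.
        for p in range(n, len(c) + 1):
            # Assumption: compare the stretched query with the length-p prefix of c.
            d = euclidean(stretch(q, p), c[:p])
            if d < best[0]:
                best = (d, idx, p)
    return best


if __name__ == "__main__":
    db = [[0, 0, 0, 0, 0, 0], [1, 2, 2, 3, 4, 4]]
    print(best_uniform_scaling_match([1, 2, 3, 4], db))
```

Every candidate is visited once per admissible length p, which is where the |D|(m-n) factor in the stated complexity comes from.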

15 Lower Bounding Uniform Scaling: lower bound the distance between two time series for any sf, 1 ≤ sf ≤ sf_max. Desiderata: fast to compute; tight. The bound results in fast pruning of candidates that are guaranteed not to belong to the solution: compute the distance only for time series not pruned by the lower bound.

16 Lower Bounding Uniform Scaling: assume candidate C of length 100 and query Q of length 80; we wish to find the best match for any scaling of Q between 80 and 100. (figure: candidate C, m = 100)

17 Lower Bounding Uniform Scaling: assume candidate C of length 100 and query Q of length 80; we wish to find the best match for any scaling of Q between 80 and 100. Build envelopes of length 80: $U_i = \max(C_{\lfloor (i-1) \cdot m/n \rfloor + 1}, \dots, C_{\lceil i \cdot m/n \rceil})$, $L_i = \min(C_{\lfloor (i-1) \cdot m/n \rfloor + 1}, \dots, C_{\lceil i \cdot m/n \rceil})$. (figure: envelopes U and L, n = 80)

18 Lower Bounding Uniform Scaling: assume candidate C of length 100 and query Q of length 80; we wish to find the best match for any scaling of Q between 80 and 100. Build envelopes of length 80: $U_i = \max(C_{\lfloor (i-1) \cdot m/n \rfloor + 1}, \dots, C_{\lceil i \cdot m/n \rceil})$, $L_i = \min(C_{\lfloor (i-1) \cdot m/n \rfloor + 1}, \dots, C_{\lceil i \cdot m/n \rceil})$. (figure: query Q)

19 Lower Bounding Uniform Scaling: assume candidate C of length 100 and query Q of length 80; we wish to find the best match for any scaling of Q between 80 and 100. Build envelopes of length 80: $U_i = \max(C_{\lfloor (i-1) \cdot m/n \rfloor + 1}, \dots, C_{\lceil i \cdot m/n \rceil})$, $L_i = \min(C_{\lfloor (i-1) \cdot m/n \rfloor + 1}, \dots, C_{\lceil i \cdot m/n \rceil})$.
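The envelope definition can be transcribed into Python as a rough sketch. It follows the index ranges exactly as they appear on the slide (floor on the left end, ceiling on the right end, 1-based indices); the function name `build_envelopes` is hypothetical and not taken from the presentation.

```python
def build_envelopes(c, n):
    """Build upper/lower envelopes of length n over candidate c (length m >= n).

    U_i = max(C[floor((i-1)*m/n) + 1 .. ceil(i*m/n)]) and L_i is the matching
    min, with 1-based indices as written on the slide.
    """
    m = len(c)
    if m < n:
        raise ValueError("candidate must be at least as long as the envelope")
    upper, lower = [], []
    for i in range(1, n + 1):
        start = ((i - 1) * m) // n + 1   # floor((i-1)*m/n) + 1, 1-based
        end = (i * m + n - 1) // n       # ceil(i*m/n), 1-based
        window = c[start - 1:end]        # 1-based inclusive range as a Python slice
        upper.append(max(window))
        lower.append(min(window))
    return upper, lower


if __name__ == "__main__":
    u, l = build_envelopes([1, 2, 3, 4, 5, 6], 4)
    print(u, l)
```

For the slide's example (m = 100, n = 80), the first window covers points 1 to 2 of C and the last window covers points 99 to 100.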

20 Lower Bounding Uniform Scaling: assume candidate C of length 100 and query Q of length 80; we wish to find the best match for any scaling of Q between 80 and 100. Compute lower bound. (figure: lower bound between the query and the envelopes)
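The slide shows the lower bound only as a figure. The sketch below assumes the usual envelope-based form, in which only the parts of the query that fall outside the envelope contribute, each by its squared distance to the nearest envelope edge. The function name is hypothetical, and the formula is an assumption rather than something stated in the transcript.

```python
import math


def envelope_lower_bound(q, upper, lower):
    """Distance from query q to the band [lower, upper]; points inside contribute 0."""
    total = 0.0
    for q_i, u_i, l_i in zip(q, upper, lower):
        if q_i > u_i:
            total += (q_i - u_i) ** 2   # query is above the upper envelope
        elif q_i < l_i:
            total += (l_i - q_i) ** 2   # query is below the lower envelope
    return math.sqrt(total)


if __name__ == "__main__":
    print(envelope_lower_bound([0, 5, 10], [2, 6, 7], [1, 4, 3]))
```

For any series X that stays between the two envelopes at every position, this value cannot exceed the Euclidean distance between Q and X, which is what makes it usable for pruning.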

21 Envelope Indexing: dimensionality of envelopes is high. (figure: envelopes, 80 points)

22 Envelope Indexing: dimensionality of envelopes is high; reduce dimensionality by approximating them with a Piecewise Constant Approximation. (figure: approximated envelopes, 8 points)

23 Envelope Indexing: dimensionality of envelopes is high; reduce dimensionality by approximating them with a Piecewise Constant Approximation. Assume query Q of length 80. (figure: query Q)

24 Envelope Indexing: dimensionality of envelopes is high; reduce dimensionality by approximating them with a Piecewise Constant Approximation. Assume query Q of length 80; we approximate it with 8 points.

25 Envelope Indexing: dimensionality of envelopes is high; reduce dimensionality by approximating them with a Piecewise Constant Approximation. Assume query Q of length 80, approximated with 8 points. Compute approximation of lower bound. (figure: approximated lower bound)
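One way to realize the reduced-dimensionality bound is sketched below. The transcript does not say how each segment is summarized, so this sketch assumes segment means for the query, the segment maximum of U and the segment minimum of L for the envelopes, and equal-length segments. All function names are hypothetical.

```python
import math


def segment_means(series, segments):
    """Piecewise constant approximation: the mean of each equal-length segment."""
    if len(series) % segments != 0:
        raise ValueError("series length must be divisible by the number of segments")
    size = len(series) // segments
    return [sum(series[k * size:(k + 1) * size]) / size for k in range(segments)]


def approximate_envelope(upper, lower, segments):
    """Assumed scheme: per-segment max of U and min of L, so the band only widens."""
    size = len(upper) // segments
    u_hat = [max(upper[k * size:(k + 1) * size]) for k in range(segments)]
    l_hat = [min(lower[k * size:(k + 1) * size]) for k in range(segments)]
    return u_hat, l_hat


def approximate_lower_bound(q, upper, lower, segments):
    """Lower bound computed on the reduced representations only."""
    size = len(q) // segments
    q_bar = segment_means(q, segments)
    u_hat, l_hat = approximate_envelope(upper, lower, segments)
    total = 0.0
    for q_k, u_k, l_k in zip(q_bar, u_hat, l_hat):
        if q_k > u_k:
            total += (q_k - u_k) ** 2
        elif q_k < l_k:
            total += (l_k - q_k) ** 2
    # Each reduced point stands for `size` original points.
    return math.sqrt(size * total)


if __name__ == "__main__":
    q = [3 * math.sin(i / 5) for i in range(80)]
    u = [math.cos(i / 7) + 0.5 for i in range(80)]
    l = [v - 1.0 for v in u]
    print(approximate_lower_bound(q, u, l, 8))
```

Under these assumptions the reduced bound never exceeds the Euclidean distance between Q and any series lying inside the original envelope, so pruning with it cannot discard a true answer; the price is a looser bound than the 80-point version.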

26 Algorithms for Secondary Storage: use a multidimensional index: VA-file -> FastScan algorithm; R-tree -> RtreeProbe algorithm. 2-pass algorithms: (1) scan approximated envelopes, prune search space; (2) find exact answer using original series.
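The two-pass idea can be illustrated with a generic filter-and-refine loop. This is not FastScan or RtreeProbe, whose details the transcript does not give; it is an in-memory sketch in which an ordinary list stands in for the index, and `lower_bound` and `exact_distance` are hypothetical callables supplied by the caller.

```python
import math


def two_pass_search(query, candidates, lower_bound, exact_distance):
    """Pass 1: cheap lower bound for every candidate. Pass 2: exact distance on survivors.

    Returns (best_index, best_distance, exact_calls).
    """
    # Pass 1: compute all lower bounds and visit candidates in increasing order.
    bounds = sorted((lower_bound(query, c), idx) for idx, c in enumerate(candidates))

    best_dist, best_idx, exact_calls = math.inf, None, 0
    # Pass 2: refine with the exact distance until the bounds rule out the rest.
    for lb, idx in bounds:
        if lb >= best_dist:
            break  # every remaining candidate has an equal or larger lower bound
        d = exact_distance(query, candidates[idx])
        exact_calls += 1
        if d < best_dist:
            best_dist, best_idx = d, idx
    return best_idx, best_dist, exact_calls


if __name__ == "__main__":
    def euclid(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    def first_coordinate_bound(a, b):
        return abs(a[0] - b[0])  # never exceeds the Euclidean distance

    data = [[0, 0], [3, 4], [1, 1], [10, 10]]
    print(two_pass_search([1, 1], data, first_coordinate_bound, euclid))
```

As long as `lower_bound` never exceeds `exact_distance`, the result matches an exhaustive scan, and `exact_calls / len(candidates)` gives the fraction of candidates that needed the expensive computation.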

27 Outline: Speeding Up Search; Scaling Up To Large Databases; Experimental Evaluation; Related Work; Conclusions

28 Datasets Used: motion capture: data from 124 sensors placed on human actors. Mixed bag: time series coming from medicine, manufacturing, environmental monitoring, economics, and sensor data. Experimented with time series databases of size 5,000 – 80,000 and time series length 64 – 1,024 points.

29 Main Memory Experiments: assume the database fits in memory. Measure pruning power: the fraction of times each approach calls the distance function. Our technique: 1 order of magnitude faster than CD-criterion.

30 Main Memory Experiments: assume the database fits in memory. Measure pruning power: the fraction of times each approach calls the distance function. Our technique: 1 order of magnitude faster than CD-criterion; 3 orders of magnitude faster than brute force.

31 Disk-Based Experiments: comparison of brute force, FastScan, and RtreeProbe

32 Disk-Based Experiments: comparison of FastScan and RtreeProbe

33 Disk-Based Experiments: comparison of FastScan and RtreeProbe

34 Case Study: video

35 Outline: Speeding Up Search; Scaling Up To Large Databases; Experimental Evaluation; Related Work; Conclusions

36 Related Work: Dynamic Time Warping (DTW): [Yi & Faloutsos’00][Keogh’02][Zhu & Shasha’03][Fung & Wong’03]; Longest Common SubSequence (LCSS): [Das et al.’97][Vlachos et al.’03]; uniform scaling: [Argyros & Ermopoulos’03]

37 Outline: Speeding Up Search; Scaling Up To Large Databases; Experimental Evaluation; Related Work; Conclusions

38 Conclusions: studied the utility of uniform scaling similarity matching, with applications in motion capture libraries, music retrieval, and historical handwritten archives; introduced the first lower bounding technique; proposed an indexing method for bounding envelopes, suitable for very large time series databases; experimentally evaluated the efficiency of the technique; demonstrated the quality of results with real motion capture data.

39 Outline

40 Lower Bounding Uniform Scaling: assume candidate C of length 100 and query Q of length 80; we wish to find the best match for any scaling of Q between 80 and 100. Build envelopes of length 80: $U_i = \max(C_{\lfloor (i-1) \cdot m/n \rfloor + 1}, \dots, C_{\lceil i \cdot m/n \rceil})$, $L_i = \min(C_{\lfloor (i-1) \cdot m/n \rfloor + 1}, \dots, C_{\lceil i \cdot m/n \rceil})$.

