University of Macau, Macau


1 University of Macau, Macau
Quick-Motif: An Efficient and Scalable Framework for Exact Motif Discovery
Yuhong Li
Department of Computer and Information Science
University of Macau, Macau

2 Quick-Motif: What is a Motif?
A motif is the most similar pair of subsequences in a time series.
Applications:
A core subroutine for activity discovery, e.g., elder care, surveillance, and sports training.
Clustering enumerated motifs is more meaningful than clustering all the subsequences of a long time series.

3 Quick-Motif: Formal Definition
[Figure: a time series s on a timeline ending at position m−1, with the subsequence s_i covering positions i to i+ℓ−1.]
Exact Motif Discovery
Input: a time series s and a target motif length ℓ.
Output: the most similar subsequence pair in terms of normalized Euclidean distance.
Avoid trivial matches → the two subsequences must be non-overlapping, because adjacent subsequences are naturally expected to be similar to each other.
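The definition above can be made concrete with a small sketch. It assumes that "normalized" means z-normalization (zero mean, unit standard deviation) of each subsequence, which is the usual convention in motif discovery; the function names are illustrative and do not come from the slides.

```python
import math

def z_normalize(x):
    """Rescale a sequence to zero mean and unit (population) standard deviation."""
    n = len(x)
    mean = sum(x) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in x) / n)
    if std == 0:
        # Assumption: a constant subsequence has no shape, so map it to zeros.
        return [0.0] * n
    return [(v - mean) / std for v in x]

def normalized_distance(s, i, j, ell):
    """Euclidean distance between the z-normalized subsequences s_i and s_j."""
    a = z_normalize(s[i:i + ell])
    b = z_normalize(s[j:j + ell])
    return math.sqrt(sum((u - v) ** 2 for u, v in zip(a, b)))

def is_trivial_match(i, j, ell):
    """Two length-ell subsequences overlap when their start positions differ by less than ell."""
    return abs(i - j) < ell
```

Because of the normalization, two subsequences with the same shape but different offset or scale (for example 1, 2, 3, 4 and 2, 4, 6, 8) are at distance zero.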

4 Quick-Motif: Naïve Solution
Slide a window of size ℓ with step size 1 to extract all subsequences of length ℓ.
Normalize each subsequence.
Test all subsequence pairs.
Motif → the most similar subsequence pair.
Time complexity is O(m²ℓ).
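A minimal sketch of this brute-force baseline, assuming z-normalization and the non-overlap constraint from the formal definition. The function names are hypothetical; the two nested loops over roughly m start positions, each with an O(ℓ) distance, give the O(m²ℓ) cost.

```python
import math

def z_normalize(x):
    """Rescale a sequence to zero mean and unit (population) standard deviation."""
    n = len(x)
    mean = sum(x) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in x) / n)
    if std == 0:
        return [0.0] * n  # assumption: constant subsequences map to zeros
    return [(v - mean) / std for v in x]

def brute_force_motif(s, ell):
    """Return (distance, i, j) for the closest non-overlapping subsequence pair."""
    # Sliding window of size ell, step size 1.
    subs = [z_normalize(s[i:i + ell]) for i in range(len(s) - ell + 1)]
    best = (math.inf, None, None)
    for i in range(len(subs)):
        # Starting j at i + ell skips overlapping (trivial) matches.
        for j in range(i + ell, len(subs)):
            d = math.sqrt(sum((u - v) ** 2 for u, v in zip(subs[i], subs[j])))
            if d < best[0]:
                best = (d, i, j)
    return best

if __name__ == "__main__":
    series = [0, 1, 0, 5, 2, 7, 0, 1, 0, 3, 9, 4]
    print(brute_force_motif(series, 3))
```

In the example series the pattern 0, 1, 0 occurs at positions 0 and 6, so that pair is the motif for ℓ = 3.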

5 Quick-Motif: Existing Solutions
Reference-based Index (MK) [Mueen & Keogh, SDM 2009]
Good: prunes unpromising pairs in batches.
Bad: each distance computation takes O(ℓ) time.
Smart Brute Force (SBF) [Mueen, ICDM 2013]
Good: each distance computation takes O(1) time.
Bad: examines all subsequence pairs.

6 Quick-Motif: Fast Distance Computation
Incremental distance computation.
[Figure: two groups of five consecutive subsequences, s_0 to s_4 and s_20 to s_24. 9 subsequence pairs → O(ℓ) each; 16 subsequence pairs → O(1) each.]
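The slide does not spell out the update rule, so the following is only a sketch of one standard way to obtain O(1) distances: keep the dot product of the two raw windows and slide it along a diagonal, from pair (i, j) to pair (i+1, j+1). With window means μ and population standard deviations σ, the z-normalized distance satisfies d² = 2ℓ(1 − (q − ℓ·μ_i·μ_j) / (ℓ·σ_i·σ_j)), where q is the dot product. All function names are illustrative, and windows with zero variance are assumed not to occur.

```python
import math

def sliding_stats(s, ell):
    """Mean and population std of every length-ell window, using prefix sums."""
    prefix, prefix_sq = [0.0], [0.0]
    for v in s:
        prefix.append(prefix[-1] + v)
        prefix_sq.append(prefix_sq[-1] + v * v)
    means, stds = [], []
    for i in range(len(s) - ell + 1):
        mean = (prefix[i + ell] - prefix[i]) / ell
        var = (prefix_sq[i + ell] - prefix_sq[i]) / ell - mean * mean
        means.append(mean)
        stds.append(math.sqrt(max(var, 0.0)))
    return means, stds

def dist_from_dot(q, ell, mu_i, sd_i, mu_j, sd_j):
    """z-normalized Euclidean distance recovered from the raw dot product q."""
    corr = (q - ell * mu_i * mu_j) / (ell * sd_i * sd_j)
    return math.sqrt(max(2.0 * ell * (1.0 - corr), 0.0))

def diagonal_distances(s, ell, i, j, count, means, stds):
    """Distances of the pairs (i, j), (i+1, j+1), ... along one diagonal."""
    q = sum(s[i + k] * s[j + k] for k in range(ell))  # O(ell), first pair only
    out = [dist_from_dot(q, ell, means[i], stds[i], means[j], stds[j])]
    for step in range(1, count):
        a, b = i + step, j + step
        # O(1) update: add the entering product, drop the leaving one.
        q += s[a + ell - 1] * s[b + ell - 1] - s[a - 1] * s[b - 1]
        out.append(dist_from_dot(q, ell, means[a], stds[a], means[b], stds[b]))
    return out
```

Under this reading, the 9 pairs in the figure would be those that start a diagonal (the first row and first column of the 5 × 5 grid of pairs) and need a full O(ℓ) dot product, while the other 16 follow from a diagonal predecessor in O(1).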

7 Quick-Motif: Pruning of Subsequence Pairs
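Group every w consecutive subsequences as a PAA MBR (w = 5 in the figure).
[Figure: PAA feature space with axes f_1 and f_2, showing the MBRs M_1^5, M_2^5, and M_3^5 and the minDist between two MBRs.]
Minimum distance between two PAA MBRs → distance lower bound (LB).
If the distance LB is smaller than bsf (the best-so-far distance) → further refinement.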
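A rough sketch of the pruning test described above. It assumes the standard PAA lower bound, in which the distance in PAA space is scaled by sqrt(ℓ/φ) for φ equal-length segments; the slide does not state this factor, and the helper names are hypothetical.

```python
import math

def paa(x, phi):
    """Piecewise Aggregate Approximation: the mean of each of phi equal segments.

    Assumes len(x) is divisible by phi.
    """
    seg = len(x) // phi
    return [sum(x[k * seg:(k + 1) * seg]) / seg for k in range(phi)]

def mbr_of(points):
    """Minimum bounding rectangle (lows, highs) of a group of PAA points."""
    lows = [min(col) for col in zip(*points)]
    highs = [max(col) for col in zip(*points)]
    return lows, highs

def min_dist(mbr_a, mbr_b, ell):
    """Lower bound on the distance between any subsequence in mbr_a and any in mbr_b."""
    (lo_a, hi_a), (lo_b, hi_b) = mbr_a, mbr_b
    phi = len(lo_a)
    total = 0.0
    for la, ha, lb, hb in zip(lo_a, hi_a, lo_b, hi_b):
        # Per-dimension gap between the two intervals; zero if they overlap.
        gap = max(0.0, la - hb, lb - ha)
        total += gap * gap
    return math.sqrt(ell / phi * total)

def needs_refinement(mbr_a, mbr_b, ell, bsf):
    """Only MBR pairs whose lower bound is below the best-so-far distance survive."""
    return min_dist(mbr_a, mbr_b, ell) < bsf
```

Here each MBR would be built from the PAA points of w consecutive z-normalized subsequences, so a single min_dist call can discard all w × w subsequence pairs at once.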

8 Quick-Motif: Filter-and-Refinement
Naïve solution: check the distance LBs for all w-MBR pairs. The time complexity is O((m/w)² φ), where φ is the PAA dimensionality.
How to efficiently find the surviving w-MBR pairs?
Enable batch pruning.
Discover the true motif as soon as possible to improve the pruning ability.

9 Quick-Motif: Filter-and-Refinement
Enable batch pruning → hierarchical structure.
Offers reasonable grouping quality, thus good pruning ability.
Can be constructed very efficiently.
[Figure: the MBRs M_0^w to M_8^w in the PAA feature space (axes f_1 and f_2), with the minDist between two MBRs marked. The MBRs are ordered into a Hilbert curve sort list: M_4^w, M_6^w, M_0^w, M_2^w, M_7^w, M_5^w, M_3^w, M_1^w, M_8^w. Level 1 holds the nodes M_a, M_b, M_c; Level 2 holds the root M_root.]
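The bottom-up construction suggested by the figure can be sketched as follows. This is an assumption-laden illustration rather than the authors' implementation: the leaf MBRs are taken to be already sorted along the Hilbert curve, and the fanout parameter and function names are hypothetical.

```python
def union(mbrs):
    """Smallest MBR (lows, highs) that encloses all the given MBRs."""
    lows = [min(col) for col in zip(*(m[0] for m in mbrs))]
    highs = [max(col) for col in zip(*(m[1] for m in mbrs))]
    return lows, highs

def build_hierarchy(sorted_leaves, fanout):
    """Pack consecutive MBRs level by level.

    Returns a list of levels: levels[0] holds the leaves in their given
    (assumed Hilbert curve) order and levels[-1] holds the single root.
    """
    if fanout < 2:
        raise ValueError("fanout must be at least 2")
    levels = [list(sorted_leaves)]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        # Each parent encloses `fanout` neighbours from the level below.
        levels.append([union(prev[k:k + fanout])
                       for k in range(0, len(prev), fanout)])
    return levels

if __name__ == "__main__":
    leaves = [([i, i], [i + 1, i + 1]) for i in range(9)]
    for depth, level in enumerate(build_hierarchy(leaves, 3)):
        print(depth, len(level))
```

With nine leaves and a fanout of 3 this yields three Level 1 nodes and one root, matching the shape in the figure. A single pass over a sorted list is what makes the construction cheap, and a lower bound computed between two upper-level nodes can rule out every leaf pair beneath them in one test.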

10 Quick-Motif: Filter-and-Refinement
Discover the true motif as soon as possible → locality-based search strategy.
[Figure: the hierarchy over the Hilbert curve sort list, with the root M_root at Level 2, the nodes M_a, M_b, M_c at Level 1, and the leaf nodes M_4^w, M_6^w, M_0^w, M_2^w, M_7^w, M_5^w, M_3^w, M_1^w, M_8^w. Labels in the figure: "Bad locality", "Good locality".]
Locality-based search vs best-first search:

                   Locality-based     Best-first
Surviving pairs    0.1256M            0.1249M
Heap size          N/A                2.78M
# pushes           11.73M (queue)     6.75M (heap)
Resp. time         1.56 s             6.32 s

11 Quick-Motif: Experimental Evaluation
Programming language: C++
Machine: Ubuntu 12.04, 4 GB RAM
Datasets:
RW: randomly generated.
EEG: reflects the activity of neurons; length: [missing].
ECG: the Koski ECG; length: [missing].
EPG: a sequence that traces insect behaviour; length: [missing].
TAO: sea surface temperatures; length: [missing].

12 Quick-Motif: Performance Evaluation
[Figure panels: (a) Effect of ℓ on ECG; (b) Effect of ℓ on EEG; (c) Effect of ℓ on EPG; (d) Effect of ℓ on TAO.]

13 Thanks
Q & A

