Spatial and Temporal Databases: Efficient Time Series Matching by Wavelets (ICDE 98), Kin-pong Chan and Ada Wai-chee Fu.


2 Table of Contents: Introduction; Related Works; The Proposed Approach; Overall Strategy; Performance Evaluation; Conclusion

3 Introduction. A time series is a sequence of real numbers, each representing a value at a point in time (for example, financial data or scientific observations). Time-series databases that support fast retrieval and similarity queries are desirable.

4 Introduction (cont.) Similarity search finds data sequences that differ only slightly from a given query sequence. Example: one may want to find all companies whose stock price fluctuations during a year behave similarly to IBM's. Similarity matching process: given … compute …
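The formulas on this slide did not survive in the transcript, so the following is only a minimal sketch of a similarity search under stated assumptions: Euclidean distance as the measure and a user-chosen threshold `epsilon`. The names `euclidean`, `range_query`, `epsilon` and the toy data are hypothetical and are not taken from the slides.

```python
import math


def euclidean(x, y):
    """Euclidean distance between two equal-length sequences."""
    if len(x) != len(y):
        raise ValueError("sequences must have the same length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def range_query(database, query, epsilon):
    """Sequential scan: names of all sequences within epsilon of the query."""
    return [name for name, seq in database.items()
            if euclidean(seq, query) <= epsilon]


# Hypothetical toy data, not from the slides.
database = {
    "A": [9.0, 7.0, 3.0, 5.0],
    "B": [9.5, 7.0, 3.0, 4.5],
    "C": [1.0, 2.0, 8.0, 9.0],
}
print(range_query(database, [9.0, 7.0, 3.0, 5.0], epsilon=1.0))
```

A scan like this compares the query against every stored sequence in full, which is the cost that indexing on a reduced representation tries to avoid.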

5 Introduction (cont.) Indexing considerations: (1) Dimensionality reduction: a transformation is applied to reduce the dimension. (2) Completeness. (3) Nature of data: how effectively a particular transformation concentrates power depends on the nature of the time series.

6 Related Works. Discrete Fourier Transform (Agrawal et al.): based on Parseval's theorem. The F-index may raise false alarms but guarantees no false dismissals. Disadvantage: it misses the important feature of time localization.
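Why Parseval's theorem rules out false dismissals can be written out in a few lines. This is a sketch of the standard argument, assuming a unitary (1/sqrt(n)-normalized) DFT and an index that keeps the first k coefficients; the symbols n, k, X_f, Y_f and epsilon are notation introduced here, not taken from the slide.

```latex
\begin{align*}
D^2(x, y) &= \sum_{t=0}^{n-1} |x_t - y_t|^2
           = \sum_{f=0}^{n-1} |X_f - Y_f|^2
           && \text{(Parseval)} \\
          &\ge \sum_{f=0}^{k-1} |X_f - Y_f|^2
           = D_k^2(X, Y)
           && \text{(dropping non-negative terms)}
\end{align*}
```

So D(x, y) <= epsilon implies D_k(X, Y) <= epsilon: every true answer passes the index test, while some sequences that pass may still fail the full comparison (false alarms).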

7 Related Works (cont.) Singular Value Decomposition: decompose a matrix X of size N*M into $X = U \Sigma V^T$. Restriction: it assumes that X is not updated. In practice X may be updated daily or monthly, and in that case the SVD has to be recomputed over the whole matrix.

8 The Proposed Approach: Similarity Model. Defines a new similarity model for use in sequence matching.

9 Proposed Approach: Haar Wavelet. The Haar wavelet allows a good approximation with a subset of the coefficients, is fast to compute, requires little storage, and preserves Euclidean distance.

10 Proposed Approach: Haar Wavelet (cont.) Example of wavelet computation. Assume the original time sequence is f(x) = (9 7 3 5).

Resolution    Averages     Coefficients
4             (9 7 3 5)
2             (8 4)        (1 -1)
1             (6)          (2)

Each level is recovered from the one below it: 8 = 6+2, 4 = 6-2; 9 = 8+1, 7 = 8-1, 3 = 4+(-1), 5 = 4-(-1).
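The averaging and differencing in this example can be sketched in a few lines of Python. This is an illustrative, unnormalized version that matches the numbers on the slide; the function name `haar_decompose` is hypothetical, and the power-of-two length restriction is an assumption made to keep the sketch short.

```python
def haar_decompose(seq):
    """Unnormalized Haar transform by pairwise averaging and differencing.

    Returns [overall average, coarsest coefficient, ..., finest coefficients].
    """
    n = len(seq)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    averages = list(seq)
    details = []
    while len(averages) > 1:
        pairs = list(zip(averages[0::2], averages[1::2]))
        # Coefficient = half the difference of each pair.
        level_details = [(a - b) / 2 for a, b in pairs]
        # Next coarser resolution = average of each pair.
        averages = [(a + b) / 2 for a, b in pairs]
        # Coarser coefficients go in front of finer ones.
        details = level_details + details
    return averages + details


print(haar_decompose([9, 7, 3, 5]))  # [6.0, 2.0, 1.0, -1.0]
```

Each pass touches half as many values as the previous one, so the total work is proportional to the sequence length.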

11 Proposed Approach: Haar Wavelet (cont.) Instead of storing 6, 2, 1 and -1, assume we store only the first two coefficients, 6 and 2, and treat the rest as zero.

Reconstruction process:

Resolution    Averages     Coefficients
1             (6)          (2)
2             (8 4)        (0 0)
4             (8 8 4 4)

Original: (9 7 3 5). Reconstructed: (8 8 4 4). We can reduce the dimension of the data at the cost of some accuracy.
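The reconstruction step can be sketched as follows. This is a minimal illustration for the unnormalized coefficients used on the slide; `haar_reconstruct` and `truncate` are hypothetical helper names, and dropped coefficients are simply replaced by zeros.

```python
def haar_reconstruct(coeffs):
    """Invert the unnormalized Haar transform.

    coeffs = [overall average, coarsest coefficient, ..., finest coefficients].
    """
    n = len(coeffs)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    averages = [coeffs[0]]
    pos = 1
    while pos < n:
        width = len(averages)
        details = coeffs[pos:pos + width]
        finer = []
        for avg, d in zip(averages, details):
            # Each average splits into (avg + d, avg - d).
            finer.extend([avg + d, avg - d])
        averages = finer
        pos += width
    return averages


def truncate(coeffs, k):
    """Keep the first k coefficients and zero out the rest."""
    return list(coeffs[:k]) + [0] * (len(coeffs) - k)


full = [6, 2, 1, -1]
print(haar_reconstruct(full))               # [9, 7, 3, 5]
print(haar_reconstruct(truncate(full, 2)))  # [8, 8, 4, 4]
```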

12 Proposed Approach: DFT versus Haar. Motivation for replacing the DFT with the DWT. Pruning power: fewer false alarms appear with the DWT than with the DFT. Complexity: the Haar transform is O(n), while the Fast Fourier Transform is O(n log n). Note: the DWT does not require massive index reorganization on update, which is a major drawback of SVD.

13 Proposed Approach: Guarantee of No False Dismissal. No qualified time sequence is rejected, hence there is no false dismissal. The authors show that this property holds for the Haar wavelet where …
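The condition on this slide is missing from the transcript, so the sketch below only illustrates the general idea under an explicit assumption: the Haar transform is scaled by 1/sqrt(2) at every level, which makes it orthonormal, so the distance over the first k coefficients can never exceed the distance between the original sequences. The names `haar_orthonormal` and `prefix_distance` and the sample data are hypothetical.

```python
import math


def haar_orthonormal(seq):
    """Haar transform with 1/sqrt(2) scaling at each level (assumed here)."""
    n = len(seq)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    approx = [float(v) for v in seq]
    details = []
    s = math.sqrt(2.0)
    while len(approx) > 1:
        pairs = list(zip(approx[0::2], approx[1::2]))
        details = [(a - b) / s for a, b in pairs] + details
        approx = [(a + b) / s for a, b in pairs]
    return approx + details


def euclidean(x, y):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def prefix_distance(x, y, k):
    """Distance computed from only the first k Haar coefficients."""
    return euclidean(haar_orthonormal(x)[:k], haar_orthonormal(y)[:k])


x = [9, 7, 3, 5]
y = [6, 8, 2, 7]
for k in range(1, 5):
    # The prefix distance never exceeds the true distance.
    print(k, prefix_distance(x, y, k) <= euclidean(x, y) + 1e-9)
```

Because the reduced distance is a lower bound, any sequence within the query threshold in the original space is also within the threshold in the reduced space, so the filter cannot discard a true answer.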

14 The Overall Strategy. Pre-processing: (1) similarity model selection, where the user selects Euclidean distance or v-shift similarity; (2) the Haar wavelet transform is applied to each time series. Index construction: an index structure such as an R-tree is built from the first few coefficients. Queries: range query and nearest-neighbor query.
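The range-query part of this strategy can be sketched as a filter-and-refine loop. This is a rough illustration, not the authors' implementation: a plain dictionary scanned linearly stands in for the R-tree, only Euclidean distance is covered (v-shift similarity is omitted), the transform uses an assumed 1/sqrt(2) scaling, and all names and data are hypothetical.

```python
import math


def haar_orthonormal(seq):
    """Haar transform with 1/sqrt(2) scaling per level (an assumption)."""
    n = len(seq)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    approx = [float(v) for v in seq]
    details = []
    s = math.sqrt(2.0)
    while len(approx) > 1:
        pairs = list(zip(approx[0::2], approx[1::2]))
        details = [(a - b) / s for a, b in pairs] + details
        approx = [(a + b) / s for a, b in pairs]
    return approx + details


def euclidean(x, y):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def build_index(database, k):
    """Keep the first k coefficients per sequence (dict instead of R-tree)."""
    return {name: haar_orthonormal(seq)[:k] for name, seq in database.items()}


def range_query(database, index, query, epsilon, k):
    """Filter on the reduced features, then refine on the full sequences."""
    q_feat = haar_orthonormal(query)[:k]
    candidates = [name for name, feat in index.items()
                  if euclidean(feat, q_feat) <= epsilon]
    answers = [name for name in candidates
               if euclidean(database[name], query) <= epsilon]
    return candidates, answers


# Hypothetical toy data; "D" matches the query only at coarse resolution.
database = {
    "A": [9.0, 7.0, 3.0, 5.0],
    "B": [9.5, 7.0, 3.0, 4.5],
    "C": [1.0, 2.0, 8.0, 9.0],
    "D": [7.0, 9.0, 5.0, 3.0],
}
index = build_index(database, k=2)
candidates, answers = range_query(database, index, [9.0, 7.0, 3.0, 5.0], 1.0, 2)
print(candidates)  # "D" passes the filter (a false alarm)
print(answers)     # and is removed by the refinement step
```

The refinement step is what removes false alarms; the filter step only needs to guarantee that no true answer is lost.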

15 Experimental Results

16 Experimental Results (cont.) Scalability Test

17 Conclusion. Efficient time series matching is achieved through dimension reduction by the Haar wavelet transform, which outperforms the DFT in terms of pruning power, scalability, and complexity.