Distributed Spatio-Temporal Similarity Search Demetrios Zeinalipour-Yazti University of Cyprus Song Lin

Presentation transcript:

Distributed Spatio-Temporal Similarity Search Demetrios Zeinalipour-Yazti University of Cyprus Song Lin University of California - Riverside Dimitrios Gunopulos University of California - Riverside ICDE 2006 Song Lin University of California, Riverside

Trajectories are everywhere

Trajectory Similarity Search
Habitat monitoring
–Animal migration patterns
Sign language detection
–Movement of fingers
Store surveillance video
–Customer movement patterns
Camera sensor network
–Each sensor can monitor the movement of objects within a small area

Distributed Similarity Search
The setting
–Monitoring area G with m objects moving inside
–G is segmented into n non-overlapping cells, each having a camera sensor
–Each record of the trajectory is stored locally at the closest sensor
Problem
–Given a query trajectory Q, retrieve the top K trajectories that are most similar to Q.

An example
Distributed top-K problem
–The trajectories of objects are distributed across different cells
–It is expensive to collect all the trajectories centrally.

Finding K most similar trajectories
We have to define what is similar
–We use well-known similarity measures for trajectories: Euclidean, Dynamic Time Warping (DTW), and Longest Common SubSequence (LCSS)
–DTW: Berndt D., Clifford J., “Using Dynamic Time Warping to Find Patterns in Time Series”, In KDD’94, Menlo Park, CA.
–LCSS: Das G., Gunopulos D., Mannila H., “Finding Similar Time Series”, In PKDD’97, Trondheim, Norway, LNCS 1263.
We have to find the most similar trajectories
–We focus on LCSS, but the techniques work for DTW as well.

Similarity Measures
[Figure, courtesy of Dr. Eamonn Keogh: A) Euclidean Matching, B) Dynamic Time Warping Matching, C) Longest Common SubSequence Matching]

Longest Common SubSequence (LCSS)
[Figure, courtesy of Dr. Eamonn Keogh: an out-of-phase LCSS match between two sequences, axis from 1 to n]
Used in string matching problems
Captures out-of-phase matches
Handles outliers (points that have no match are ignored)

Longest Common SubSequence (LCSS)
LCSS can be computed in O(δ(l_1 + l_2)) time by a dynamic programming algorithm, where l_1 and l_2 are the lengths of the two trajectories and δ is the matching window in time.
In general, it is expensive to compute this similarity exactly, so we can also compute bounds on it.
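
The dynamic program can be sketched as follows. This is a minimal illustration, not the authors' code. It assumes one-dimensional trajectories sampled at integer timestamps (so the list index is the timestamp) and the common LCSS matching rule in which two points match when their timestamps differ by at most δ and their values by at most ε. The names `lcss`, `delta` and `eps` are hypothetical. For readability the sketch fills the whole table. Only cells with |i − j| ≤ δ can produce a match, and restricting the computation to that band is what gives the O(δ(l_1 + l_2)) cost.

```python
from typing import Sequence


def lcss(a: Sequence[float], b: Sequence[float], delta: int, eps: float) -> int:
    """LCSS length of a and b (assumed rule: |i - j| <= delta and |a_i - b_j| <= eps)."""
    n, m = len(a), len(b)
    # dp[i][j] holds the LCSS length of the prefixes a[:i] and b[:j].
    dp = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Option 1: skip the last point of one of the two prefixes.
            best = max(dp[i - 1][j], dp[i][j - 1])
            # Option 2: match the two last points if they are close
            # enough in time (delta) and in space (eps).
            if abs(i - j) <= delta and abs(a[i - 1] - b[j - 1]) <= eps:
                best = max(best, dp[i - 1][j - 1] + 1)
            dp[i][j] = best
    return dp[n][m]
```

For example, `lcss([1, 2, 3, 4], [0, 1, 2, 3], delta=1, eps=0.1)` matches the shifted values 1, 2, 3 and returns 3, while `delta=0` forbids the time shift and returns 0.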

Centralized LCSS UpperBound

Problem with distributed computation of LCSS
In a distributed setting, computing LCSS is difficult because
–It is a sequential matching problem
–Matching may occur across cells
[Figure: Cell 1, Cell 2, Cell 3, Cell 4]

Our Solution
We compute a lower bound and an upper bound of the LCSS similarity in a distributed fashion.
We develop new distributed top-K algorithms (UB-K, UBLB-K) that use these bounds to find the most similar trajectories.

Distributed LCSS UpperBound
Each cell uses LCSS_{δ,ε}(MBE(Q), A_{ij}) to calculate the similarity of each local sub-trajectory A_{ij} to MBE(Q)
The upper bound DUB_LCSS(Q, A_i) is computed by adding the n local results
Theorem 1
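
A rough sketch of the upper-bound computation follows. Several assumptions are made here that the slide does not spell out: MBE(Q) is taken to be a bounding envelope that, at each timestamp t, spans the minimum and maximum of Q over [t − δ, t + δ], widened by ε. Trajectories are one-dimensional. The local LCSS against the envelope is taken to be the number of local points that fall inside it. The names `in_envelope`, `local_upper_bound` and `dub_lcss` are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple

Point = Tuple[int, float]  # (timestamp, value)


def in_envelope(q: Sequence[float], t: int, v: float, delta: int, eps: float) -> bool:
    """True if value v at time t lies inside the assumed envelope of q."""
    # Points of q that are within delta time steps of t.
    window = q[max(0, t - delta): max(0, t + delta + 1)]
    if not window:
        return False  # t is too far from q's time range to match anything
    return min(window) - eps <= v <= max(window) + eps


def local_upper_bound(q: Sequence[float], points: List[Point], delta: int, eps: float) -> int:
    """Score computed by one cell: local points that fall inside the envelope."""
    return sum(1 for t, v in points if in_envelope(q, t, v, delta, eps))


def dub_lcss(q: Sequence[float], cells: Dict[int, List[Point]], delta: int, eps: float) -> int:
    """Add up the n local results, one per cell."""
    return sum(local_upper_bound(q, pts, delta, eps) for pts in cells.values())
```

With the made-up query `q = [0, 1, 2, 3, 4, 5]` and a trajectory that stays at value 1, split over three cells as `{1: [(0, 1), (1, 1)], 2: [(2, 1), (3, 1)], 3: [(4, 1), (5, 1)]}`, calling `dub_lcss(q, cells, 1, 0.5)` gives 2 + 1 + 0 = 3.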

Distributed LCSS LowerBound
For each trajectory A_i, cell c_j finds the time region T_{ij} = {ts(p) | p in A_{ij}} during which A_i stays in cell c_j.
Q is filtered into Q′_{ij} so that Q′_{ij} covers the same time intervals as A_{ij}: Q′_{ij} = {p | p in Q and ts(p) in T_{ij}}.
Each cell performs a local computation of LCSS_{δ,ε}(Q′_{ij}, A_{ij})
The lower bound DLB_LCSS(Q, A_i) is computed by adding the n local results
Theorem 2
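
The four steps above can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: trajectories are one-dimensional lists of (timestamp, value) points sorted by time, and two points match when their timestamps differ by at most δ and their values by at most ε. The names `lcss_points` and `dlb_lcss` are hypothetical.

```python
from typing import Dict, List, Tuple

Point = Tuple[int, float]  # (timestamp, value)


def lcss_points(a: List[Point], b: List[Point], delta: int, eps: float) -> int:
    """LCSS of two time-sorted point lists (assumed delta/eps matching rule)."""
    n, m = len(a), len(b)
    dp = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            (ta, va), (tb, vb) = a[i - 1], b[j - 1]
            match = abs(ta - tb) <= delta and abs(va - vb) <= eps
            dp[i][j] = max(dp[i - 1][j], dp[i][j - 1],
                           dp[i - 1][j - 1] + (1 if match else 0))
    return dp[n][m]


def dlb_lcss(q: List[Point], cells: Dict[int, List[Point]], delta: int, eps: float) -> int:
    """Sum of the local LCSS scores, one per cell."""
    total = 0
    for pts in cells.values():
        # T_ij: the timestamps at which the trajectory is inside this cell.
        times = {t for t, _ in pts}
        # Q'_ij: the part of the query that falls in the same time region.
        q_local = [(t, v) for t, v in q if t in times]
        total += lcss_points(q_local, sorted(pts), delta, eps)
    return total
```

With the made-up query `q = [(t, t) for t in range(6)]` and the cells `{1: [(0, 1), (1, 1)], 2: [(2, 1), (3, 1)], 3: [(4, 1), (5, 1)]}`, only the query point at time 1 can match inside cell 1, so `dlb_lcss(q, cells, 1, 0.5)` is 1.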

Distributed top-K algorithms
Threshold Algorithm (TA)
–Fagin R., Lotem A. and Naor M., “Optimal Aggregation Algorithms For Middleware”, In PODS’01, Santa Barbara, CA.
Three-Phase Uniform Threshold (TPUT)
–P. Cao and Z. Wang. Efficient Top-K Query Calculation in Distributed Networks. In PODC, Newfoundland, Canada.
Threshold Join Algorithm (TJA)
–D. Zeinalipour-Yazti, Z. Vagena, D. Gunopulos, V. Kalogeraki, V. Tsotras, M. Vlachos, N. Koudas, D. Srivastava. The Threshold Join Algorithm for Top-k Queries in Distributed Sensor Networks. In DMSN, Trondheim, Norway.

Problem with existing approaches
They assume that the exact partial scores are available
The exact scores at each cell cannot be computed efficiently (recall that matches may occur across cell boundaries)
We use upper and lower bounds to perform the distributed top-K computation (based on Theorem 1 and Theorem 2)

Distributed top-K computation with bounds
Now we have lower and upper bounds rather than exact scores, e.g. instead of sim(A0,Q)=20 we get [A0,15,25]
We propose the UB-K and UBLB-K algorithms to compute the top-K results.

UB-K Algorithm
Query: Find the K=2 highest ranked answers
Why not stop at 25? Because we might have another object X [UB:24, Real:23]
[Figure: rounds of TJA retrieving λ+1 and then 2λ+1 objects, each followed by a ≥? check]
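
The stopping rule behind UB-K can be illustrated with a small centralized simulation. This is a simplified sketch under assumptions, not the algorithm as published: the TJA step is replaced by a plain sort on the upper bounds, and each round scores the first αλ objects exactly and uses the next upper bound as the threshold. The function `ub_k`, the callback `exact_score` and all numbers in the usage note are made up for illustration.

```python
from typing import Callable, Dict, List, Tuple


def ub_k(upper_bounds: Dict[str, float],
         exact_score: Callable[[str], float],
         k: int, lam: int) -> Tuple[List[Tuple[str, float]], int]:
    """Return (top-k by exact score, number of exact scores computed)."""
    # Stand-in for TJA: objects ordered by decreasing upper bound.
    ranked = sorted(upper_bounds, key=lambda o: (-upper_bounds[o], o))
    exact: Dict[str, float] = {}
    alpha = 1
    while True:
        cut = alpha * lam
        # Compute the exact similarity of the objects not yet scored.
        for obj in ranked[:cut]:
            if obj not in exact:
                exact[obj] = exact_score(obj)
        top = sorted(exact.items(), key=lambda kv: -kv[1])[:k]
        if cut >= len(ranked):
            return top, len(exact)  # everything has been scored
        kth = top[-1][1] if len(top) == k else float("-inf")
        # Stop once no unseen object can beat the current k-th exact score.
        if upper_bounds[ranked[cut]] <= kth:
            return top, len(exact)
        alpha += 1
```

With upper bounds `{"A4": 30, "A2": 27, "A0": 25, "A3": 20, "A9": 18}`, exact scores `{"A4": 23, "A2": 22, "A0": 16, "A3": 18, "A9": 15}`, K=2 and λ=2, the first round cannot stop because the next upper bound (25) exceeds the second-best exact score (22). The second round stops because the next upper bound (18) does not.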

UBLB-K Algorithm
Note: the Kth highest LB is 21
Therefore A3 (UB:20) and below are not necessary
[Figure: rounds of TJA retrieving λ+1 and then 2λ+1 objects, each followed by a ≥? check]
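
The pruning idea can be sketched in the same centralized style. Again this is a simplified illustration under assumptions rather than the published algorithm: a sort stands in for TJA, bounds are grown λ objects at a time until the next upper bound no longer exceeds the Kth highest lower bound, and only the surviving candidates are scored exactly in one final step. The function `ublb_k`, the callback `exact_score` and the numbers in the usage note are hypothetical.

```python
from typing import Callable, Dict, List, Tuple


def ublb_k(bounds: Dict[str, Tuple[float, float]],
           exact_score: Callable[[str], float],
           k: int, lam: int) -> Tuple[List[Tuple[str, float]], int]:
    """bounds maps object -> (LB, UB). Returns (top-k, candidates scored)."""
    # Stand-in for TJA: objects ordered by decreasing upper bound.
    ranked = sorted(bounds, key=lambda o: (-bounds[o][1], o))
    alpha = 1
    while True:
        cut = min(alpha * lam, len(ranked))
        seen = ranked[:cut]
        lbs = sorted((bounds[o][0] for o in seen), reverse=True)
        kth_lb = lbs[k - 1] if len(lbs) >= k else float("-inf")
        # Stop when no unseen object can exceed the k-th highest lower bound.
        if cut == len(ranked) or bounds[ranked[cut]][1] <= kth_lb:
            break
        alpha += 1
    # Objects whose UB is below the k-th LB cannot be in the top-k.
    candidates = [o for o in seen if bounds[o][1] >= kth_lb]
    # Single bulk step: score only the surviving candidates exactly.
    scores = {o: exact_score(o) for o in candidates}
    top = sorted(scores.items(), key=lambda kv: -kv[1])[:k]
    return top, len(candidates)
```

With bounds `{"A4": (22, 30), "A2": (21, 27), "A0": (15, 25), "A3": (14, 20), "A9": (10, 18)}`, K=2 and λ=2, the Kth highest lower bound is 21, so A3 (UB 20) is pruned and only A4, A2 and A0 are scored exactly.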

UB-K vs. UBLB-K
Both fetch METADATA objects incrementally (αλ+1, where α is the step increment).
UB-K uses upper bounds, while UBLB-K uses both upper bounds and lower bounds.
UB-K always fetches αλ+1 DATA objects, while UBLB-K may fetch fewer DATA objects.
UB-K fetches DATA incrementally, while UBLB-K uses a final bulk DATA transfer.

Experimental Evaluation
Comparison system
–Centralized
–UB-K
–UBLB-K
Dataset
–25,000 trajectories generated over the Oldenburg street map, using the Network Based Generator of Moving Objects*.
* Brinkhoff T., “A Framework for Generating Network-Based Moving Objects”. In GeoInformatica, 6(2), 2002.

Performance Evaluation

Scalability Evaluation

Varying K and λ

Summary
We described and analyzed
–Well-known similarity measures for trajectories
–DUB_LCSS and DLB_LCSS for bounding the similarity of two trajectories in a distributed fashion
–UB-K and UBLB-K to find the K most similar trajectories
The approach is easily extended to DTW and other similarity measures
