1 Reverse Nearest Neighbor Queries for Dynamic Databases SHOU Yu Tao Jan. 10 th, 2003 SIGMOD 2000.


1 Reverse Nearest Neighbor Queries for Dynamic Databases SHOU Yu Tao Jan. 10 th, 2003 SIGMOD 2000

2 Outline of the Presentation
- Background
- Nearest neighbor (NN) search algorithm [RKV95]
- Reverse nearest neighbor (RNN) search algorithm [SAA00]
- Other NN related problems: CNN, RNNa, etc.
- Conclusions
- References
- Q & A

3 Background
- RNN(q): returns the set of data points that have the query point q as their nearest neighbor.
- Has received much interest in recent years due to its increased importance in advanced database applications:
  -- fixed wireless telephone access application, "load" detection problem: count how many users are currently using a specific base station q
  -- if q's load is too heavy, activate an inactive base station to lighten the load of the overloaded base station

4 Nonsymmetrical property of RNN queries
- NN(q) = p does not imply NN(p) = q.
- If p is the nearest neighbor of q, then q need not be the nearest neighbor of p (in the figure, the nearest neighbor of p is r).
- Efficient NN algorithms therefore cannot be directly applied to solve RNN problems. Dedicated algorithms for RNN problems are needed.
- A straightforward solution:
  -- check for each point whether it has q as its nearest neighbor
  -- not suitable for large data sets!
[Figure: points q, p, r]
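
The straightforward solution on this slide can be sketched as a linear scan. This is only an illustration, not code from the paper: the function names `nearest` and `rnn_brute_force` are hypothetical, points are assumed to be 2-D tuples, and ties are counted in favor of q.

```python
import math

def nearest(p, candidates):
    """Return the candidate closest to p (Euclidean distance)."""
    return min(candidates, key=lambda c: math.dist(p, c))

def rnn_brute_force(q, points):
    """Return every data point p for which no other data point is closer to p than q."""
    result = []
    for p in points:
        d_q = math.dist(p, q)
        # p is a reverse nearest neighbor of q if q is at least as close as any other point
        if all(d_q <= math.dist(p, o) for o in points if o != p):
            result.append(p)
    return result

# The situation in the figure: p is the NN of q, but the NN of p is r.
q, p, r = (0.0, 0.0), (1.0, 0.0), (1.5, 0.0)
print(nearest(q, [p, r]))        # NN(q) is p
print(nearest(p, [q, r]))        # NN(p) is r, not q
print(rnn_brute_force(q, [p, r]))
```

Each query compares every pair of points, which is why the slide rules this approach out for large data sets.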

5 Two versions of RNN problem
- Bichromatic version:
  -- the data points are of two categories, say red and blue. The RNN query point q is in one of the categories, say blue. So RNN(q) must determine the red points which have the query point q as the closest blue point.
  -- e.g. fixed wireless telephone access application: clients/red (e.g. call initiation or termination), servers/blue (e.g. fixed wireless base stations)
- Monochromatic version:
  -- all points are of the same color.
- Static vs. dynamic:
  -- whether insertions or deletions of the data points are allowed.

6 RNN problem this paper concerns
- Monochromatic case
- Dynamic case
- The whole algorithm is based on:
  (1) Geometric observations, which enable a reduction of the RNN problem to the NN problem.
  (2) The NN search algorithm [RKV95].
* Both RNN(q) and NN(q) are sets of points in the database, while the query point q may or may not correspond to an actual data point in the database.

7 Geometric Observations
- Let the space around a query point q be divided into six equal regions Si (1 <= i <= 6) by three straight lines (L1, L2, L3) intersecting at q. Each Si is therefore the space between two space dividing lines.
- Proposition 1: For a given 2-dimensional dataset, RNN(q) will return at most six data points. When it returns exactly six, they must all be on the same circle centered at q.
[Figure: query point q, dividing lines L1, L2, L3, regions s1 to s6]

8 Geometric Observations
- Proposition 2: In each region Si:
  (1) There exist at most two RNN points.
  (2) If there exist exactly two RNN points in a region Si, then each point must be on one of the space dividing lines through q delimiting Si.
- Proposition 3: In each region Si, let p = NN(q) in Si. If p is not on a space dividing line, then either NN(p) = q (and then RNN(q) = p) or RNN(q) = null.
[Figure: query point q, dividing lines L1, L2, L3, regions s1 to s6, point p]

9 Important result from Observations
- Implications: in a region Si, if the number of points returned by NN(q) is:
  (1) exactly one point: if it is not on a space dividing line, then either this nearest neighbor is also a reverse nearest neighbor, or there is no RNN(q) in Si.
  (2) more than one point (NN(q) returns at most two points per region): these two points must be on the two dividing lines and on the same circle centered at q.
- This gives a criterion for limiting the candidates for RNN(q) to one or two points in each of the six regions Si.
- The RNN query has been reduced to the NN query.

10 Basic NN Search Algorithm
- Based on the MINDIST metric only
- Returns a single NN(q) result only

11 Algorithms in [RKV95]
- Two metrics introduced, for effectively directing and pruning the NN search:
  -- MINDIST (optimistic)
  -- MINMAXDIST (pessimistic)
- DFS search

12 MINDIST (Optimistic)
- MINDIST(RECT, q): the shortest distance from RECT to the query point q.
- This provides a lower bound on the distance from q to the objects in RECT.
- MINDIST guarantees that every point in RECT lies at distance at least MINDIST from the query point q.
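
A minimal sketch of MINDIST for an axis-aligned rectangle, assuming the rectangle is given by its lower corner `lo` and upper corner `hi` (these parameter names are illustrative, not from the slides). It returns the plain Euclidean distance rather than a squared distance.

```python
import math

def mindist(q, lo, hi):
    """Shortest Euclidean distance from point q to the rectangle [lo, hi]."""
    total = 0.0
    for x, l, h in zip(q, lo, hi):
        if x < l:
            d = l - x      # q lies below the rectangle in this dimension
        elif x > h:
            d = x - h      # q lies above the rectangle in this dimension
        else:
            d = 0.0        # q's coordinate falls within the rectangle's extent
        total += d * d
    return math.sqrt(total)

print(mindist((0.0, 0.0), (1.0, 1.0), (2.0, 2.0)))   # distance to corner (1, 1)
print(mindist((1.5, 1.5), (1.0, 1.0), (2.0, 2.0)))   # q inside the rectangle
```

When q is inside the rectangle the result is 0, so the bound is still valid but gives no pruning power.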

13 MINMAXDIST (Pessimistic)
- MBR property: every face (edge in 2D, rectangle in 3D, hyper-face in higher dimensions) of any MBR contains at least one point of some spatial object in the DB.
- MINMAXDIST: for each face, compute the maximum distance from q to that face, then take the minimum over all faces.
- It is an upper bound on the distance from q to its nearest object in the MBR.
- MINMAXDIST guarantees that the MBR contains at least one object at distance less than or equal to MINMAXDIST from q.

14 Illustration of MINMAXDIST
[Figure: a 2-D MBR and query point (q1,q2) on x-y axes, with labeled points (s1,s2), (t1,t2), (t1,p2), (t1,s2); MINDIST and MINMAXDIST are marked]
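
A hedged sketch of MINMAXDIST for an axis-aligned MBR, following the "furthest point on the closest face" idea above. The function name and the `lo`/`hi` corner parameters are assumptions for illustration. For each dimension k it takes the nearer bounding value in dimension k and the farther bounding value in every other dimension, then keeps the smallest resulting distance.

```python
import math

def minmaxdist(q, lo, hi):
    """Smallest, over all dimensions k, of the distance from q to the
    furthest point on the nearer face of [lo, hi] perpendicular to axis k."""
    near, far = [], []
    for x, l, h in zip(q, lo, hi):
        mid = (l + h) / 2
        nearer_edge = l if x <= mid else h
        farther_edge = l if x >= mid else h
        near.append((x - nearer_edge) ** 2)
        far.append((x - farther_edge) ** 2)
    total_far = sum(far)
    # Swap the far term for the near term in one dimension at a time.
    return math.sqrt(min(total_far - f + n for n, f in zip(near, far)))

# For q = (0, 0) and the square [1, 2] x [1, 2], the closest faces are x = 1
# and y = 1, and the furthest vertex on either is at distance sqrt(5).
print(minmaxdist((0.0, 0.0), (1.0, 1.0), (2.0, 2.0)))
```

The guarantee relies on the MBR being tight, that is, on every face touching at least one object.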

15 Pruning
- Downward pruning (during the descending phase):
  -- MINDIST(q, M) > MINMAXDIST(q, M'): M can be pruned
  -- Distance(q, O) > MINMAXDIST(q, M'): O can be discarded
- Upward pruning (when returning from the recursion):
  -- MINDIST(q, M) > Distance(q, O): M can be pruned
Here M and M' are MBRs and O is an object.

16 DFS Search on R-Tree
- Traversal: DFS
- Expanding a non-leaf node during the descending phase: order all its children by the metric (MINDIST or MINMAXDIST) and sort them into an Active Branch List (ABL). Apply downward pruning techniques to the ABL to remove unnecessary branches.
- Expanding a leaf node: compare its objects to the nearest neighbor found so far. Replace it if the new object is closer.
- On return from the recursion: apply the upward pruning technique.
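
The traversal above can be sketched on a toy tree. This is an illustration under stated assumptions, not the [RKV95] implementation: `Node` is a hypothetical stand-in for an R-tree node with a tight MBR, the ABL is ordered by MINDIST, and only the MBR-versus-MBR downward rule and the upward rule are applied.

```python
import math

def mindist(q, lo, hi):
    return math.sqrt(sum(max(l - x, 0.0, x - h) ** 2 for x, l, h in zip(q, lo, hi)))

def minmaxdist(q, lo, hi):
    near, far = [], []
    for x, l, h in zip(q, lo, hi):
        mid = (l + h) / 2
        near.append((x - (l if x <= mid else h)) ** 2)
        far.append((x - (l if x >= mid else h)) ** 2)
    total = sum(far)
    return math.sqrt(min(total - f + n for n, f in zip(near, far)))

class Node:
    """Hypothetical tree node: a leaf holds points, an inner node holds children."""
    def __init__(self, points=None, children=None):
        self.points = points or []
        self.children = children or []
        corners = self.points or [c for ch in self.children for c in (ch.lo, ch.hi)]
        dims = range(len(corners[0]))
        self.lo = tuple(min(p[d] for p in corners) for d in dims)  # tight MBR
        self.hi = tuple(max(p[d] for p in corners) for d in dims)

def nn_search(node, q, best=(math.inf, None)):
    """Depth-first NN search; returns (distance, point)."""
    if node.points:                       # leaf: compare objects to best so far
        for p in node.points:
            d = math.dist(q, p)
            if d < best[0]:
                best = (d, p)
        return best
    # Active Branch List, ordered by MINDIST
    abl = sorted(node.children, key=lambda c: mindist(q, c.lo, c.hi))
    # Downward pruning: drop M if MINDIST(q, M) > MINMAXDIST(q, M') for some M'
    bound = min(minmaxdist(q, c.lo, c.hi) for c in abl)
    abl = [c for c in abl if mindist(q, c.lo, c.hi) <= bound]
    for child in abl:
        # Upward pruning: the ABL is sorted, so all remaining branches are too far
        if mindist(q, child.lo, child.hi) > best[0]:
            break
        best = nn_search(child, q, best)
    return best

root = Node(children=[Node(points=[(0, 0), (1, 1)]),
                      Node(points=[(5, 5), (6, 6)]),
                      Node(points=[(10, 0), (11, 1)])])
print(nn_search(root, (4.6, 4.6)))
```

Sorting the ABL by MINDIST is one of the two orderings the slide mentions; sorting by MINMAXDIST would change only the `key` function.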

17 RNN Algorithm
- Algorithm outline for an RNN(q) query:
  1. Construct the space dividing lines so that the space is divided into 6 regions based on the query point q.
  2. (a) Traverse the R-tree and find one or two points in each region Si that satisfy the nearest neighbor condition NN(q). This part is also called "conditional NN queries".
     (b) Test each candidate point for whether its nearest neighbor is q, and add it to the answer list if the condition is fulfilled.
  3. Eliminate duplicates in RNN(q).
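
The outline above can be sketched without an R-tree by replacing both the conditional NN query and the verification step with linear scans. This is a simplification for illustration, not the paper's algorithm: the names `sector` and `rnn_six_regions` are hypothetical, each region is treated as half-open so every point falls in exactly one region (which makes step 3 unnecessary), and ties are counted in favor of q.

```python
import math

def sector(q, p, sections=6):
    """Index (0..sections-1) of the equal-angle region around q that contains p."""
    angle = math.atan2(p[1] - q[1], p[0] - q[0]) % (2 * math.pi)
    # min() guards against angle rounding up to exactly 2*pi
    return min(int(angle / (2 * math.pi / sections)), sections - 1)

def rnn_six_regions(q, points):
    # Step 2(a): nearest point to q within each region (the candidates).
    candidates = {}
    for p in points:
        if p == q:
            continue
        s = sector(q, p)
        if s not in candidates or math.dist(q, p) < math.dist(q, candidates[s]):
            candidates[s] = p
    # Step 2(b): keep a candidate only if no other data point is closer to it than q.
    answer = []
    for p in candidates.values():
        d_q = math.dist(p, q)
        if all(math.dist(p, o) >= d_q for o in points if o != p and o != q):
            answer.append(p)
    return answer

points = [(1.0, 0.0), (-2.0, 0.0), (0.0, 3.0), (0.5, 2.5),
          (4.0, 4.0), (-0.5, -1.0), (2.0, -2.0)]
print(rnn_six_regions((0.0, 0.0), points))
```

At most six candidates reach the verification step, however many points there are, which is the point of the reduction to NN queries.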

18 How to find NN(q) in Si
- Brute-force algorithm: retrieve the nearest neighbors of q one after another until one of them falls in the queried region Si.
- Inefficient! (as shown in the figure)
[Figure: query point q, region Si, data points p1 to p7]

19 How to find NN(q) in Si
- The only difference between the NN algorithm proposed in [RKV95] and the conditional NN algorithm is the metric used to sort and prune the list of candidate nodes.

20 New MINMAXDIST definition
- MINMAXDIST(q, M, Si) = distance to the furthest vertex on the closest face in Si
- MINDIST(q, M, Si) = MINDIST(q, M)
[Figure: MBR M and queried region S, with Mindist(q, M) and Minmaxdist(q, M) marked]

21 New metric definition

| Number of vertices of M in Si | Mindist(q, M, Si) | Minmaxdist(q, M, Si) |
|---|---|---|
| 0 (no intersection of M with Si) | Infinite | Infinite |
| 0 (M intersects Si, case E); 1 (case D) | Mindist(q, M) | Infinite, because it cannot be guaranteed that there are data points in both M and Si |
| 2 (case C), 3 (case B), 4 (case A) | Mindist(q, M) | Distance to the furthest vertex on the closest face in Si |

Mindist(q, M, Si) = Mindist(q, M), because Mindist(q, M) is valid in all cases: it provides a definite lower bound on the location of data points inside an MBR, although a slightly looser one.

22 Conditional NN (CNN) / NN algorithm difference
- When expanding a non-leaf node during the descending phase:
- NN search: order all its children by the metric (MINDIST or MINMAXDIST) and sort them into an Active Branch List (ABL). Apply downward pruning techniques to the ABL to remove unnecessary branches.
- CNN search: build a set of lists branchList[i][nodecard]:
  -- for 0 <= i <= num_section-1: list i holds pointers to the children of the node that overlap region (i+1)
  -- for i = num_section: the list holds, for each child, a counter of the total number of sections that overlap this child
  -- the child with the higher counter is visited first, for I/O optimization.

23 Other NN-related research
1. NN and RNN for moving objects [BJKS02]
2. Continuous NN (CNN) [PTS02]
3. RNN aggregates (RNNa) over data streams [KMS02]

24 Conclusions
- The proposed RNN algorithm uses the underlying indexing data structure (R-tree), which is also needed to answer NN queries.
- By integrating RNN queries into the framework of already existing access structures, the approach developed in this paper is algorithmic and independent of data structures constructed especially for such queries.
- No additional data structures are necessary, so the space requirement does not increase.

25 References
[RKV95] N. Roussopoulos, S. Kelley, and F. Vincent. Nearest neighbor queries. In SIGMOD.
[SAA00] I. Stanoi, D. Agrawal, and A. El Abbadi. Reverse nearest neighbor queries for dynamic databases. In Proceedings of the ACM SIGMOD Workshop on Data Mining and Knowledge Discovery (DMKD).
[KM00] Korn, F. and Muthukrishnan, S. Influence Set Based on Reverse Nearest Neighbor Queries. SIGMOD.
[BJKS02] Benetis, R., Jensen, C., Karciauskas, G., Saltenis, S. Nearest Neighbor and Reverse Nearest Neighbor Queries for Moving Objects. IDEAS, 2002.
[PTS02] Papadias, Tao, Y. and Shen, D. Continuous Nearest Neighbor Search. VLDB.
[KMS02] Korn, F., Muthukrishnan, S. and Srivastava, D. Reverse nearest neighbor aggregates over data streams. VLDB.

26 Questions and Answers Any Questions?

27 Thank you for attending this presentation!