Published by Harry Alborn. Modified over 2 years ago.

1
On Spatial-Range Closest Pair Query
Jing Shan, Donghui Zhang and Betty Salzberg
College of Computer and Information Science, Northeastern University

2
SSTD03 --- Santorini, Greece
Outline: Problem Definition; Straightforward Approach; Existing Technique; Our Method; Performance

3
Problem Definition
Given a spatial data set S, the Range Closest Pair query regarding a spatial range R finds a pair of objects (s1, s2) with s1 ∈ R and s2 ∈ R such that the distance between s1 and s2 is the smallest distance between any two objects inside range R.
(Figure: example data set with query range R. The query result is (e, f).)
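The definition can be made concrete with a brute-force reference. This is a minimal sketch, not the authors' method: it assumes 2-D points stored as (x, y) tuples, an axis-aligned rectangular range given as (xmin, ymin, xmax, ymax), and Euclidean distance. The names in_range and range_closest_pair are hypothetical.

```python
from itertools import combinations
from math import dist, inf


def in_range(p, r):
    """True if point p lies inside range r = (xmin, ymin, xmax, ymax).

    Assumption: points on the boundary count as inside.
    """
    xmin, ymin, xmax, ymax = r
    return xmin <= p[0] <= xmax and ymin <= p[1] <= ymax


def range_closest_pair(points, r):
    """Return (distance, (s1, s2)) for the closest pair inside r.

    Returns (inf, None) when fewer than two points fall inside r.
    """
    inside = [p for p in points if in_range(p, r)]
    best = (inf, None)
    for a, b in combinations(inside, 2):   # every unordered pair once
        d = dist(a, b)
        if d < best[0]:
            best = (d, (a, b))
    return best
```

Note that the globally closest pair may lie outside R, so the answer depends on the range, not only on the data set.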

4
Outline: Problem Definition; Straightforward Approach; Existing Technique; Our Method; Performance

5
Straightforward Approach
1. Use an R-tree to select the objects in the query range.
2. Find the closest pair by checking the objects in the selection result. A nested loop works; better approaches exist, e.g. plane sweep with the Voronoi diagram method, which is O(n log n).
Problems:
The approach has to access all data pages of the R-tree that intersect the query range.
The data inside the query range may not fit in memory.
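The two steps can be sketched as follows. This is an illustration under stated assumptions, not the implementation from the talk: a linear scan stands in for the R-tree selection, and the second step is a plain sweep-line closest pair. The sorted Python list used for the active set costs linear time per insertion and deletion, so this sketch does not reach the O(n log n) bound; a balanced tree would be needed for that. Function names are hypothetical.

```python
from bisect import bisect_left, insort
from math import dist, inf


def closest_pair_sweep(points):
    """Sweep-line closest pair over (x, y) tuples; returns (distance, pair)."""
    pts = sorted(points)                 # sweep left to right in x
    best, pair = inf, None
    active = []                          # (y, x) tuples, kept sorted by y
    left = 0                             # index of the oldest point still active
    for x, y in pts:
        # Drop points too far to the left to beat the current best distance.
        while pts[left][0] < x - best:
            lx, ly = pts[left]
            del active[bisect_left(active, (ly, lx))]
            left += 1
        # Only points whose y is within `best` of the new point can improve it.
        i = bisect_left(active, (y - best, -inf))
        while i < len(active) and active[i][0] <= y + best:
            qy, qx = active[i]
            d = dist((x, y), (qx, qy))
            if d < best:
                best, pair = d, ((qx, qy), (x, y))
            i += 1
        insort(active, (y, x))
    return best, pair


def range_closest_pair_sweep(points, r):
    """Step 1: select objects in r (linear scan here). Step 2: sweep."""
    xmin, ymin, xmax, ymax = r
    selected = [(x, y) for x, y in points
                if xmin <= x <= xmax and ymin <= y <= ymax]
    return closest_pair_sweep(selected)
```

Both problems listed on the slide are visible here: every object in the range is read, and the whole selection is held in memory before the sweep starts.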

6
Note on Existing Techniques
[Hjaltason and Samet 98]: incremental join.
[Corral, Manolopoulos, Theodoridis and Vassilakopoulos 00]: an improved version, using pruning.
They addressed a slightly different problem: there is no query range, and two different R-trees are joined.
Existing techniques do not perform well if there is overlap between the two R-trees. When the two R-trees are identical, the overlap is extensive.

7
MinDist
Given two MBRs A, B of R-tree nodes, MinDist(A, B) is the smallest distance between the boundaries of A and B.
∀ object o1 ∈ A and o2 ∈ B, distance(o1, o2) ≥ MinDist(A, B).
(Figure: MinDist between MBRs A and B.)
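For axis-aligned rectangles, MinDist reduces to the per-axis gap between the two boxes. The following is a minimal sketch assuming 2-D MBRs given as (xmin, ymin, xmax, ymax) and Euclidean distance; the function name min_dist is hypothetical.

```python
from math import hypot


def min_dist(a, b):
    """Smallest distance between axis-aligned rectangles a and b.

    Each rectangle is (xmin, ymin, xmax, ymax). The gap along an axis is
    zero when the two extents overlap or touch on that axis.
    """
    dx = max(a[0] - b[2], b[0] - a[2], 0.0)   # horizontal gap
    dy = max(a[1] - b[3], b[1] - a[3], 0.0)   # vertical gap
    return hypot(dx, dy)
```

Under these assumptions, min_dist(A, A) is 0 for any MBR A, and so is the value for any two overlapping MBRs. This is the property the later slides set out to improve on.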

8
Existing Technique
1. T = ∞; closestpair = NULL.
2. Push the pair of root entries into priority queue Q.
3. While Q is not empty:
   1. Pop the pair (e1, e2) from Q whose MinDist is the smallest.
   2. If e1 points to an index node: for every child entry se1 in Node(e1) and child entry se2 in Node(e2), if MinDist(se1, se2)
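The slide text breaks off partway through step 3. The sketch below shows one common way such a best-first search can be completed; it is an assumption-based illustration, not the algorithm exactly as published. It assumes a balanced in-memory tree (so both nodes of a pair are at the same level), joins the tree with itself, pushes a child pair only when its MinDist is below T, and stops once the smallest MinDist in Q is no smaller than T. The class Node and all function names are hypothetical.

```python
import heapq
from itertools import count
from math import dist, hypot, inf


class Node:
    """Toy R-tree node: a leaf holds points, an index node holds children."""
    def __init__(self, children=(), points=()):
        self.children, self.points = list(children), list(points)
        boxes = [c.mbr for c in self.children] + [(x, y, x, y) for x, y in self.points]
        self.mbr = (min(b[0] for b in boxes), min(b[1] for b in boxes),
                    max(b[2] for b in boxes), max(b[3] for b in boxes))


def min_dist(a, b):
    return hypot(max(a[0] - b[2], b[0] - a[2], 0.0), max(a[1] - b[3], b[1] - a[3], 0.0))


def intersects(a, r):
    return a[0] <= r[2] and r[0] <= a[2] and a[1] <= r[3] and r[1] <= a[3]


def inside(p, r):
    return r[0] <= p[0] <= r[2] and r[1] <= p[1] <= r[3]


def best_first_range_closest_pair(root, r):
    T, closest = inf, None                    # T: best distance found so far
    tie = count()                             # tie-breaker: the heap never compares nodes
    Q = [(0.0, next(tie), root, root)]        # the pair of root entries
    while Q:
        d, _, n1, n2 = heapq.heappop(Q)       # pair with the smallest MinDist
        if d >= T:
            break                             # no remaining pair can beat T
        same = n1 is n2
        if n1.points:                         # leaf pair: compare objects inside r
            p1 = [p for p in n1.points if inside(p, r)]
            p2 = [p for p in n2.points if inside(p, r)]
            for i, a in enumerate(p1):
                for b in (p1[i + 1:] if same else p2):
                    if dist(a, b) < T:
                        T, closest = dist(a, b), (a, b)
        else:                                 # index pair: expand into child pairs
            c1 = [c for c in n1.children if intersects(c.mbr, r)]
            c2 = [c for c in n2.children if intersects(c.mbr, r)]
            for i, a in enumerate(c1):
                for b in (c1[i:] if same else c2):
                    m = min_dist(a.mbr, b.mbr)
                    if m < T:
                        heapq.heappush(Q, (m, next(tie), a, b))
    return T, closest
```

In this sketch every self pair such as (A, A) enters Q with key 0, so it is always popped and expanded before any pair with a positive MinDist.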

9
Example
(Figure: R-tree with root R and nodes A, B, C, D; leaf contents a,b | f,i | c,e,g | d,h.)
T = ∞; closestpair = NULL
Pairs: (R,R); (A,A) (B,B) (C,C) (D,D) (A,C) (B,C) (A,B) (C,D) (A,D) (B,D)

10
Example
(Figure: R-tree with root R and nodes A, B, C, D; leaf contents a,b | f,i | c,e,g | d,h.)
T = distance(a, b); closestpair = (a, b)
Pairs: (R,R); (A,A) (B,B) (C,C) (D,D) (A,C) (B,C) (A,B) (C,D) (A,D) (B,D)

11
Example
(Figure: R-tree with root R and nodes A, B, C, D; leaf contents a,b | f,i | c,e,g | d,h.)
T = distance(f, e); closestpair = (f, e)
Pairs: (R,R); (A,A) (B,B) (C,C) (D,D) (A,C) (B,C) (A,B) (C,D) (A,D) (B,D)

12
MinExistDist
Given two MBRs A, B of R-tree nodes, MinExistDist(A, B) is the smallest distance for which a pair of objects, one in A and the other in B, is guaranteed to exist at or below that distance.
∃ object o1 ∈ A and o2 ∈ B, distance(o1, o2) ≤ MinExistDist(A, B).
Usage [CMT+00]: if MinExistDist(A, B) is smaller than T, update T. This increases the chance of eliminating pairs from Q early.
(Figure: MinDist and MinExistDist between MBRs A and B.)

13
Involving a Query Range
(Figure: two cases with a query range, one labeled MinDist with MinExistDist = ∞, the other labeled MinDist and MinExistDist.)
We extend the MinExistDist…

14
Outline: Problem Definition; Straightforward Approach; Existing Technique; Our Method; Performance

15
Motivation for Our Method
The existing technique joins all self pairs, e.g. (A,A), (B,B), …
Reason: the MinDist of any self pair is 0.
Challenge: is it possible to make it non-zero? If MinDist(A,A) ≥ T, there is no need to process (A,A).
We propose two ways to augment the R-tree with additional information. We call the augmented structure the Self-Range Closest-Pair Tree, or SRCP-tree for short.

16
SRCP-tree (version 1)
Along with each index entry, store the closest pair of objects in the subtree.
Check the closest pair stored with the root entry. If both objects are inside the query range R, return that pair.
For each self pair to be pushed into Q, use the distance of the local closest pair (rather than 0) as the MinDist.
If we encounter an index entry whose stored closest pair has both objects inside R, compare their distance with T. This may decrease T.
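The version 1 augmentation can be sketched on a toy in-memory tree. This is a hedged illustration, not the authors' implementation: the stored pair is computed by brute force at construction time, and the names Node, cp, cp_dist, self_pair_key and try_tighten are all hypothetical.

```python
from itertools import combinations
from math import dist, inf


class Node:
    """Toy node: a leaf holds points, an index node holds children."""
    def __init__(self, children=(), points=()):
        self.children, self.points = list(children), list(points)
        objs = self.objects()
        # Augmented information: closest pair among all objects in the subtree.
        self.cp_dist, self.cp = min(
            ((dist(a, b), (a, b)) for a, b in combinations(objs, 2)),
            default=(inf, None))

    def objects(self):
        return self.points + [p for c in self.children for p in c.objects()]


def inside(p, r):
    return r[0] <= p[0] <= r[2] and r[1] <= p[1] <= r[3]


def stored_pair_in_range(node, r):
    """True if both objects of the node's stored closest pair lie inside r."""
    return node.cp is not None and inside(node.cp[0], r) and inside(node.cp[1], r)


def self_pair_key(node):
    """Queue key for the self pair (node, node): local closest-pair distance, not 0."""
    return node.cp_dist


def try_tighten(node, r, T, closest):
    """Lower T when the node's stored pair is fully inside r and beats T."""
    if stored_pair_in_range(node, r) and node.cp_dist < T:
        return node.cp_dist, node.cp
    return T, closest
```

The key is a valid lower bound because no pair inside the subtree, whether inside R or not, can be closer than the stored pair. A self pair whose key is at least T can therefore be skipped.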

17
Insertion
When a new object o is inserted, only the augmented information along the insertion path needs to be updated. (But subtrees need to be visited.)
At each entry on the path, let the original local closest pair be (a,b). It needs to be updated only if distance(o, o') < distance(a,b) for some object o' in the subtree.
(Figure: new object o next to the stored pair (a,b) and distance(a,b).)
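The update rule for a single entry on the insertion path can be sketched as below. This is an assumption-laden illustration: the subtree's objects are passed in as a plain list and scanned in full, whereas a real structure would search the subtree and could prune by the current stored distance. The function name updated_closest_pair is hypothetical.

```python
from math import dist


def updated_closest_pair(cp_dist, cp, subtree_objects, o):
    """Return the stored (distance, pair) of one entry after inserting o.

    cp_dist and cp are the entry's current closest-pair distance and pair.
    The pair changes only if some existing object o2 in the subtree is
    closer to o than the stored distance.
    """
    for o2 in subtree_objects:
        d = dist(o, o2)
        if d < cp_dist:
            cp_dist, cp = d, (o, o2)
    return cp_dist, cp
```

For example, with stored pair ((0, 0), (3, 0)) at distance 3.0, inserting (1, 0) replaces the pair, while inserting (10, 10) leaves it unchanged.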

18
SRCP-tree (version 2)
Idea: while version 1 tries to avoid processing self pairs, version 2 of the structure also tries to avoid processing sibling pairs.
E.g. if R has children A, B, C, D, version 1 cannot avoid pair (A,B) unless MinDist(A,B) ≥ T. Similarly, it has to process (A,C), (A,D), (B,C), (B,D), (C,D).
In version 2, every index entry e stores the “local-parent closest pair”: the closest pair between an object in the subtree pointed to by e and an object in the subtree pointed to by Parent(e).
E.g. along with A, we store the closest pair of objects (o1, o2), where o1 is in subtree(A) and o2 is in subtree(R).
Now, if the distance of the object pair stored at A is no smaller than T, there is no need to process any pair involving A, namely (A,A), (A,B), (A,C), (A,D).
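The version 2 pruning test can be sketched as follows. This is a hypothetical illustration, not the authors' code: the children of one parent are given as a dict from entry name to its list of objects, all objects are assumed to have distinct coordinates, and the local-parent closest pair is recomputed by brute force here, whereas the real structure would store it with the entry.

```python
from math import dist, inf


def local_parent_closest(child_objs, parent_objs):
    """Closest pair (o1, o2) with o1 under the child and o2 anywhere under the parent.

    Assumption: coordinates are distinct, so o1 != o2 excludes only o1 itself.
    """
    best = (inf, None)
    for o1 in child_objs:
        for o2 in parent_objs:
            if o1 != o2 and dist(o1, o2) < best[0]:
                best = (dist(o1, o2), (o1, o2))
    return best


def entries_worth_expanding(children, T):
    """Names of child entries whose local-parent distance is below T.

    Every pair involving a dropped entry (self pair or sibling pair) can be
    skipped, because no object under it is within T of any object under the parent.
    """
    parent_objs = [p for objs in children.values() for p in objs]
    keep = []
    for name, objs in children.items():
        d, _ = local_parent_closest(objs, parent_objs)
        if d < T:
            keep.append(name)
    return keep
```

Because the parent's subtree includes the entry's own subtree, one stored distance covers the self pair and all sibling pairs at once.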

19
Performance
Platform: Dell Pentium 4, 2.66 GHz CPU.
Implementation: XXL library, Java.
Both synthetic and real data:
uniform data (80,000 objects);
US National Mapping Information (26,700 Massachusetts sites), URL = http://mappings.usgs.gov/www/gnis/
Focus on query time.

20
Small Query Range

21
Large Query Range

22
Conclusions
We have addressed the spatial closest pair query with a query range.
We have proposed two versions of an index structure called the SRCP-tree.
Our approaches have much better query performance than the existing techniques, especially when the query range is large.
In particular, version 2 of the SRCP-tree is universally the best.
