Presentation is loading. Please wait.

Presentation is loading. Please wait.

1 Spatial Query Processing using the R-tree Donghui Zhang CCIS, Northeastern University Feb 8, 2005.

Similar presentations


Presentation on theme: "1 Spatial Query Processing using the R-tree Donghui Zhang CCIS, Northeastern University Feb 8, 2005."— Presentation transcript:

1 1 Spatial Query Processing using the R-tree Donghui Zhang CCIS, Northeastern University Feb 8, 2005

2 2 Content The R-tree –Range Query –Aggregation Query NN Query RNN Query Closest Pair Query Close Pair Query Skyline Query

3 3 R-Tree Motivation 2 0 46 8 10 2 4 6 8 x axis y axis b c a d e f g h i j k l m Range query: find the objects in a given range. E.g. find all hotels in Boston. No index: scan through all objects. NOT EFFICIENT!

4 4 R-Tree: Clustering by Proximity

5 5 R-Tree

6 6

7 7 Range Query 2 0 46 8 10 2 4 6 8 x axis y axis b c a E 1 d e f g h i j k l m E 2 a b cd e E 1 E 2 E 3 E 4 E 5 Root E 1 E 2 E 3 E 4 f g h E 5 l m E 7 i j k E 6 E 6 E 7

8 8 Range Query 2 0 46 8 10 2 4 6 8 x axis y axis b c a E 1 d e f g h i j k l m E 2 a b cd e E 1 E 2 E 3 E 4 E 5 Root E 1 E 2 E 3 E 4 f g h E 5 l m E 7 i j k E 6 E 6 E 7

9 9 Aggregation Query Given a range, find some aggregate value of objects in this range. COUNT, SUM, AVG, MIN, MAX E.g. find the total number of hotels in Massachusetts. Straightforward approach: reduce to a range query. Better approach: along with each index entry, store aggregate of the sub-tree.

10 10 Aggregation Query 2 0 46 8 10 2 4 6 8 x axis y axis b c a E 1 d e f g h i j k l m E 2 a b cd e E :8 1 E :5 2 E :3 3 E :2 4 E :3 5 Root E 1 E 2 E 3 E 4 f g h E 5 l m E 7 i j k E 6 E :3 6 E :2 7

11 11 Aggregation Query 2 0 46 8 10 2 4 6 8 x axis y axis b c a E 1 d e f g h i j k l m E 2 a b cd e E :8 1 E :5 2 E :3 3 E :2 4 E :3 5 Root E 1 E 2 E 3 E 4 f g h E 5 l m E 7 i j k E 6 E :3 6 E :2 7 Subtree pruned!

12 12 Content The R-tree –Range Query –Aggregation Query NN Query RNN Query Closest Pair Query Close Pair Query Skyline Query

13 13 Nearest Neighbor (NN) Query Given a query location q, find the nearest object. E.g.: given a hotel, find its nearest bar. q a

14 14 A Useful Metric: MINDIST Minimum distance between q and an MBR. It is an lower bound of d(o, q) for every object o in E1. MINDIST(o, q) = d(o, q). E1E1 q

15 15 NN Basic Algorithm Keep a heap H of index entries and objects, ordered by MINDIST. Initially, H contains the root. While H   –Extract the element with minimum MINDIST –If it is an index entry, insert its children into H. –If it is an object, return it as NN. End while E1E1 q

16 16 NN Query Example E 1 1 E 2 2 Visit Root ActionHeap 2 0 46 8 10 2 4 6 8 x axis y axis b c a E 3 d e f g h i j k l m query E 4 E 5 E 1 E 2 E 6 E 7 12 5 95 2 13 a b c d e E 1 E 2 E 3 E 4 E 5 Root E 1 E 2 E 3 E 4 f g h E 5 l m E 7 i j k E 6 E 6 E 7 2 10 13

17 17 NN Query Example E 1 1 E 2 2 Visit Root follow E 1 E 2 2 E 5 3 E 5 5 E 9 4 ActionHeap 2 0 46 8 10 2 4 6 8 x axis y axis b c a E 3 d e f g h i j k l m query E 4 E 5 E 1 E 2 E 6 E 7 12 5 95 2 13 a b c d e E 1 E 2 E 3 E 4 E 5 Root E 1 E 2 E 3 E 4 f g h E 5 l m E 7 i j k E 6 E 6 E 7 2 10 13

18 18 NN Query Example E 1 1 E 2 2 Visit Root follow E 1 E 2 2 E 5 3 E 5 5 E 9 4 ActionHeap follow E 2 E 2 6 E 5 3 E 5 5 E 9 4 E 13 7 2 0 46 8 10 2 4 6 8 x axis y axis b c a E 3 d e f g h i j k l m query E 4 E 5 E 1 E 2 E 6 E 7 12 5 95 2 13 a b c d e E 1 E 2 E 3 E 4 E 5 Root E 1 E 2 E 3 E 4 f g h E 5 l m E 7 i j k E 6 E 6 E 7 2 10 13

19 19 NN Query Example E 1 1 E 2 2 Visit Root follow E 1 E 2 2 E 5 3 E 5 5 E 9 4 ActionHeap follow E 2 E 2 6 E 5 3 E 5 5 E 9 4 E 13 7 follow E 6 j 10 i 2 E 5 3 E 5 5 E 9 4 E 13 7 k 2 0 46 8 10 2 4 6 8 x axis y axis b c a E 3 d e f g h i j k l m query E 4 E 5 E 1 E 2 E 6 E 7 12 5 95 2 13 a b c d e E 1 E 2 E 3 E 4 E 5 Root E 1 E 2 E 3 E 4 f g h E 5 l m E 7 i j k E 6 E 6 E 7 2 10 13

20 20 NN Query Example E 1 1 E 2 2 Visit Root follow E 1 E 2 2 E 5 3 E 5 5 E 9 4 ActionHeap follow E 2 E 2 6 E 5 3 E 5 5 E 9 4 E 13 7 follow E 6 Report i and terminate j 10 i 2 E 5 3 E 5 5 E 9 4 E 13 7 k 2 0 46 8 10 2 4 6 8 x axis y axis b c a E 3 d e f g h i j k l m query E 4 E 5 E 1 E 2 E 6 E 7 12 5 95 2 13 a b c d e E 1 E 2 E 3 E 4 E 5 Root E 1 E 2 E 3 E 4 f g h E 5 l m E 7 i j k E 6 E 6 E 7 2 10 13

21 21 Pruning 1 in NN Query If we see an object o, prune every MBR whose MINDIST > d(o, q). Side notice: at most one object in H! E1E1 q E2E2 E4E4 E3E3 o

22 22 Pruning 2 using MINMAXDIST Prune even before we see an object! Prune E 1 if exists E 2 s.t. MINDIST(q, E 1 ) > MINMAXDIST(q, E 2 ). MINMAXDIST: compute max dist between q and each edge of E 2, then take min. E1E1 q E2E2  Object o in sub-tree of E 2 s.t. d(o, q)  MINMAXDIST(q, E 2 )

23 23 NN Full-Blown Algorithm Keep a heap H of index entries and objects, ordered by MINDIST. Initially, H contains the root. Setδ= + . While H   –Extract the element e with minimum MINDIST. –If it is an object, return it as NN. –For every entry se in PAGE(e) whose MINDIST  δ Insert se into H. Decreaseδto MINMAXDIST(q, se) if possible. End while

24 24 Content The R-tree –Range Query –Aggregation Query NN Query RNN Query Closest Pair Query Close Pair Query Skyline Query

25 25 Reverse NN: Definition Given a set of points, and a query location q. Find the points whose NN is q. RNN(q)={p 1, p 2 }, NN(q)= p 3. p 1 p 2 p 3 p 4 q

26 26 Half-plane pruning First NN q p'  (p, q) N 1 q p 1 p 2 N 2  ( q, p 1 )  ( q, p 2 ) p 1 Second NN First NN

27 27 (TPL) Algorithm Two logical steps: –Filter step: Find the set S cnd of candidate points Find NN; Prune space; Find NN in unpruned space; … Till no more object left. –Refinement step: eliminate false positives For each point p in S cnd, check whether its NN is not q. The two steps are combined in a single tree traversal, by keeping all pruned MBRs/objects in S rfn.

28 28 Example

29 29 Example

30 30 Example

31 31 Example

32 32 Example

33 33 Example

34 34 Example (end of filter step)

35 35 Example (end of filter step)

36 36 Example (refinement)

37 37 Example (refinement)

38 38 Content The R-tree –Range Query –Aggregation Query NN Query RNN Query Closest Pair Query Close Pair Query Skyline Query

39 39 Closest Pair (CP) Query Given two sets of objects R and S, Find the pair of objects (r  R, s  S) with minimum distance. CP = (r 2, s 1 ) E.g. find the closest pair of hotel-bar. r 1 r 2 r 3 r 4 r 5 s 1 s 2 s 3

40 40 CP Solution Idea Assume R and S are indexed by R-trees with same height. Similar to the NN query algorithm. MINDIST, MINMAXDIST for a pair of MBRs: Ri Sj MINMAXDIST MINDIST

41 41 CP Basic Algorithm Keep a heap H of pairs of index entries and pairs of objects, ordered by MINDIST. Initially, H contains the pair of roots. While H   –Extract the pair (e R, e S ) with minimum MINDIST. –If it is a pair of objects, return it as CP. –For every entry se R in PAGE(e R ) and every entry se S in PAGE(e S ) Insert(e R, e S ) into H. End while

42 42 CP Full-Blown Algorithm Keep a priority queue H of pairs of index entries and pairs of objects, ordered by MINDIST. Initially, H contains the pair of roots. Setδ= + . While H   –Extract the pair (e R, e S ) with minimum MINDIST. –If it is a pair of objects, return it as CP. –For every entry se R in PAGE(e R ) and every entry se S in PAGE(e S ) whose MINDIST  δ Insert( se R, se S ) into H. Decreaseδto MINMAXDIST( se R, se S ) if possible. End while

43 43 Content The R-tree –Range Query –Aggregation Query NN Query RNN Query Closest Pair Query Close Pair Query Skyline Query

44 44 Close Pair Query Given two sets of objects R and S, plus a threshold α, Find every pair of objects (r  R, s  S) with distance <α. Close pairs = (r 1, s 1 ), (r 2, s 1 ), and (r 3, s 3 ). r 1 r 2 r 3 r 4 r 5 s 1 s 2 s 3 α

45 45 Close Pair Solution Idea Observation: if d(r, s) <α,  mbr R, mbr S that contain r and s, respectively, we have: MINDIST(mbr R, mbr S ) <α. Solution idea: –start with the pair of root nodes, –Join pairs of index entries whose MINDIST <α, –Till we reach leaf level.

46 46 Close Pair Algorithm Push the pair of root nodes into stack. While stack   –Pop a pair (e R, e S ) from stack. –For every entry se R in PAGE(e R ) and se S in PAGE(e S ) where MINDIST(se R, se S ) <α Push (se R, se S ) into stack if se R is an index entry; Otherwise report (se R, se S ) as one close pair. End while

47 47 Content The R-tree –Range Query –Aggregation Query NN Query RNN Query Closest Pair Query Close Pair Query Skyline Query

48 48 Skyline of Manhattan Which buildings can we see? –Higher or nearer

49 49 Finding A Hotel Close to the Beach Which one is better? –i or h? (i, because its price and distance dominate those of h) –i or k?

50 50 Skyline Queries Retrieve points not dominated by any other point.

51 51 Branched and Bound Skyline (BBS) Assume all points are indexed in an R-tree. mindist(MBR) = the L 1 distance between its lower-left corner and the origin.

52 52 Each heap entry keeps the mindist of the MBR. Branched and Bound Skyline (BBS)

53 53 Example of BBS Process entries in ascending order of their mindists.

54 54 Example of BBS

55 55 Example of BBS

56 56 Example of BBS

57 57 Example of BBS

58 58 Example of BBS

59 59 Summary & Reference NN Query: N. Roussopoulos, S. Kelley and F. Vincent, “Nearest Neighbor Queries”, SIGMOD’95. (It didn’t talk about best-first search). RNN Query: Y. Tao, D. Papadias and X. Lian, “Reverse kNN Search in Arbitrary Dimensionality”, VLDB’04. Skyline Query: D. Papadias, Y. Tao, G. Fu, and B. Seeger, “An Optimal and Progressive Algorithm for Skyline Queries”, SIGMOD’03. Also talked about Closest/Close Pair Queries.


Download ppt "1 Spatial Query Processing using the R-tree Donghui Zhang CCIS, Northeastern University Feb 8, 2005."

Similar presentations


Ads by Google