The Optimal-Location Query Donghui Zhang Northeastern University Coauthors: Yang Du, Tian Xia.


1 The Optimal-Location Query Donghui Zhang Northeastern University Coauthors: Yang Du, Tian Xia

2 Motivation What is the optimal location in the Boston area to build a new McDonald's store? Optimality: maximize the number of customers for whom the new store would be closer than any existing store.

3 Formal Definition Given a set S of sites, a set O of weighted objects, and a query range Q, find a location $l \in Q$ that maximizes $\sum_{o \in O} o.weight$, summed over the objects $o$ such that $\forall s \in S,\ d(o, l) \le d(o, s)$. We consider the L1 distance: $|x_1 - x_2| + |y_1 - y_2|$.

4 Formal Definition Given a set S of sites, a set O of weighted objects, and a query range Q, find a location $l \in Q$ that maximizes $\sum_{o \in O} o.weight$, summed over the objects $o$ such that $\forall s \in S,\ d(o, l) \le d(o, s)$. We consider the L1 distance: $|x_1 - x_2| + |y_1 - y_2|$.
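The definition above can be read as a brute-force check for a single candidate location. The following is a minimal sketch, not the authors' code: the function names and all coordinates are made up for illustration (only the weights 4, 3, 5, 6 appear in the slides), and counting ties in favor of the new location is an assumption.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def l1(a: Point, b: Point) -> float:
    """L1 (Manhattan) distance |x1 - x2| + |y1 - y2|."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def influence(loc: Point, sites: List[Point],
              objects: List[Tuple[Point, float]]) -> float:
    """Total weight of the objects that are at least as close to loc
    as to every existing site (ties count for loc: an assumption)."""
    total = 0.0
    for pos, weight in objects:
        if all(l1(pos, loc) <= l1(pos, s) for s in sites):
            total += weight
    return total


# Hypothetical coordinates; only the weights come from the slides.
sites = [(0.0, 0.0), (10.0, 0.0)]
objects = [((1.0, 1.0), 4.0), ((9.0, 1.0), 3.0),
           ((4.0, 6.0), 5.0), ((6.0, 6.0), 6.0)]
print(influence((5.0, 6.0), sites, objects))
```

With these made-up positions, the two objects of weight 5 and 6 are the only ones nearer to the candidate than to both sites. Evaluating one location is cheap; the difficulty is that Q contains infinitely many candidates.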

5 Example [Figure: objects o1:4, o2:3, o3:5, o4:6 (name:weight), sites s1 and s2, and the query range Q.]

6 Example [Figure: objects o1:4, o2:3, o3:5, o4:6, sites s1 and s2, the query range Q, and a candidate location l1.] The influence of l1 is 5+6=11.

7 Example [Figure: objects o1:4, o2:3, o3:5, o4:6, sites s1 and s2, the query range Q, and candidate locations l1 and l2.] The influence of l1 is 5+6=11. The influence of l2 is 5.

8 Content Problem Definition Straightforward Solution Problem Transformation The R-tree-based Solution The OL-tree The VOL-tree Performance

9 Using the RNN Algorithm… [Figure: objects o1:4, o2:3, o3:5, o4:6, sites s1 and s2, and the location l1.] The RNNs of l1 are o3 and o4.

10 Straightforward Solution [Figure: objects o1:4, o2:3, o3:5, o4:6 and sites s1 and s2.] Compute the influence for every location in Q. Problematic: infinite number of candidates!

11 Content Problem Definition Straightforward Solution Problem Transformation The R-tree-based Solution The OL-tree The VOL-tree Performance

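12 nn_buffer of an Object A new site built at any location within an object's nn_buffer would be closer to the object than its current nearest site. Under the L1 distance, the nn_buffer is a diamond. [Figure: objects O1:4, O2:3, O3:5, O4:6, sites S1 and S2, and the nn_buffer of O4.]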

13 Problem Transformation Find a location with maximum overlap among the objects' nn_buffers. [Figure: the nn_buffers of O1:4, O2:3, O3:5, O4:6, sites S1 and S2, and Q, with the region of maximum overlap marked.] Any location here is an optimal location!

14 The Rotated Coordinate Rotate the coordinate system by 45°. All nn_buffers become axis-parallel squares. Focus on the rotated coordinate system. [Figure: original axes X, Y and rotated axes X', Y'; a point (x, y) has rotated coordinates (x', y').]
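To illustrate why the rotation helps, here is a small sketch under stated assumptions: the helper names are hypothetical, and the rotated coordinates are additionally scaled by sqrt(2) so that the arithmetic stays exact. Under this mapping the L1 distance in (x, y) equals the larger of the two coordinate differences in (x', y'), so a diamond of L1 radius r becomes a square with half-side r.

```python
from typing import List, Tuple

Point = Tuple[float, float]
Rect = Tuple[float, float, float, float]  # (x_lo, y_lo, x_hi, y_hi), rotated coords


def l1(a: Point, b: Point) -> float:
    """L1 distance in the original coordinates."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def rotate(p: Point) -> Point:
    """Coordinates in axes rotated by 45 degrees, scaled by sqrt(2)."""
    return (p[0] + p[1], p[1] - p[0])


def nn_buffer(obj: Point, sites: List[Point]) -> Rect:
    """nn_buffer of obj as an axis-parallel square in rotated coordinates."""
    r = min(l1(obj, s) for s in sites)  # L1 distance to the nearest site
    u, v = rotate(obj)
    return (u - r, v - r, u + r, v + r)


def contains(rect: Rect, p: Point) -> bool:
    """True if the rotated point p lies in the closed square rect."""
    return rect[0] <= p[0] <= rect[2] and rect[1] <= p[1] <= rect[3]


# Hypothetical data: one object at the origin, nearest site at L1 distance 4.
buf = nn_buffer((0.0, 0.0), [(3.0, 1.0), (9.0, 9.0)])
print(buf)
print(contains(buf, rotate((2.0, 2.0))), contains(buf, rotate((3.0, 2.0))))
```

A location lies in the square exactly when its L1 distance to the object is at most r, which the two sample points (L1 distances 4 and 5) exercise.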

15 Content Problem Definition Straightforward Solution Problem Transformation The R-tree-based Solution The OL-tree The VOL-tree Performance

16 The R-tree-based Solution Store the objects in an R-tree. Retrieve the objects whose nn_buffers intersect Q. Plane sweep to find a region which has maximum overlap.

17 Two Contributions 1. Object retrieval: – Store point objects, – but retrieve nn_buffers in increasing order of lower X. 2. Plane sweep: – Straightforwardly: O(n^2). – Our method: O(n log n).

18 Best-first Retrieval Keep a heap of index entries and objects, sorted in increasing order of their nn_buffers' lower X. While the heap is not empty, pop an entry. If it is an object, send it to the plane sweep. If it is an index entry, push its children that intersect Q.

19 Naïve Plane Sweep [Figure: sweep over the X-Y plane with the nn_buffers of O1:4, O2:3, O3, O4.]

20 Not Efficient! O(n^2). Suppose the next insertion adds 2 to the Y-range [2,11].

21 The aSB-tree Extended from the SB-tree [YW01]: keeps max overlap information at index entries; handles a query range Q.

22 The aSB-tree Suppose the next insertion adds 2 to the Y-range [2,11].

23 The aSB-tree Suppose the next insertion adds 2 to the Y-range [2,11].

24 The aSB-tree Suppose the next insertion adds 2 to the Y-range [2,11].
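The O(n log n) sweep can be illustrated with an in-memory stand-in. The sketch below is not the aSB-tree: it swaps in an ordinary segment tree with range-add and global-max, which plays the same role of keeping the maximum overlap at internal nodes so that an insertion such as "add 2 to the Y-range [2,11]" costs O(log n). All names are hypothetical, rectangles are treated as closed, and clipping to the query range Q is omitted.

```python
import bisect
from typing import List, Tuple

# (x_lo, y_lo, x_hi, y_hi, weight) in rotated coordinates
WRect = Tuple[float, float, float, float, float]


class MaxAddTree:
    """Segment tree over n points: add a value to a range, read the global max."""

    def __init__(self, n: int) -> None:
        self.n = n
        self.add = [0.0] * (4 * n)   # value added to every point of the node
        self.best = [0.0] * (4 * n)  # max within the node, including self.add

    def update(self, lo: int, hi: int, delta: float,
               node: int = 1, l: int = 0, r: int = -1) -> None:
        if r < 0:
            r = self.n - 1
        if hi < l or r < lo:
            return
        if lo <= l and r <= hi:      # node fully covered: no need to descend
            self.add[node] += delta
            self.best[node] += delta
            return
        mid = (l + r) // 2
        self.update(lo, hi, delta, 2 * node, l, mid)
        self.update(lo, hi, delta, 2 * node + 1, mid + 1, r)
        self.best[node] = self.add[node] + max(self.best[2 * node],
                                               self.best[2 * node + 1])

    def max(self) -> float:
        return self.best[1]


def max_overlap(rects: List[WRect]) -> float:
    """Sweep along X; the tree holds the current total weight per Y endpoint."""
    ys = sorted({y for _, y_lo, _, y_hi, _ in rects for y in (y_lo, y_hi)})
    tree = MaxAddTree(len(ys))
    events = []
    for x_lo, y_lo, x_hi, y_hi, w in rects:
        lo, hi = bisect.bisect_left(ys, y_lo), bisect.bisect_left(ys, y_hi)
        events.append((x_lo, 0, lo, hi, w))    # kind 0 sorts first: open before close
        events.append((x_hi, 1, lo, hi, -w))
    best = 0.0
    for _, _, lo, hi, delta in sorted(events):
        tree.update(lo, hi, delta)
        best = max(best, tree.max())
    return best


# Hypothetical squares carrying the slide weights 4, 3, 5, 6.
squares = [(0, 0, 4, 4, 4), (2, 2, 6, 6, 3), (3, 3, 5, 5, 5), (10, 10, 12, 12, 6)]
print(max_overlap(squares))
```

Tracking only the Y endpoints is enough because the maximum overlap of closed intervals is always attained at some endpoint.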

25 Content Problem Definition Straightforward Solution Problem Transformation The R-tree-based Solution The OL-tree The VOL-tree Performance

26 The OL-tree Idea: partition the space, and keep the max overlapped region for each partition! Like a k-d-B-tree. An nn_buffer may have multiple copies. Stores nn_buffers: an nn_buffer that fully covers a partition is added to its fullcover; the others (2, 3, 4 in the figure) are recursively inserted.

27 Stored Information Index entry has, besides range: – fullcover: total weight of nn_buffers fully covering the whole area; – localmax: among the nn_buffers inserted into the sub-tree, max overlap; – maxrange: the region where localmax occurred. Leaf entry: – A rectangle and its weight.

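28 [Figure: OL-tree with entries (range, fullcover, localmax): root (r_root, 0, 9); its children (r1, 0, 4), (r2, 1, 4), (r3, 2, 7); children of r3: (r31, 4, 3), (r32, 2, 3), (r33, 1, 2); other sub-trees omitted.]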

29 [Figure: OL-tree with entries (range, fullcover, localmax): root (r_root, 0, 9); its children (r1, 0, 4), (r2, 1, 4), (r3, 2, 7); children of r3: (r31, 4, 3), (r32, 2, 3), (r33, 1, 2); other sub-trees omitted.] fullcover: 2 nn_buffers fully cover r3. localmax: among those inserted, max overlap is 7. maxrange: where localmax occurred.

30 Query Processing Start with root, insert index entries into heap. Sorting key: upper bound of real max overlap in the sub-tree. –localmax + fullcovers of ancestor entries. –Accurate if Q intersects with maxrange.

31 [Figure: OL-tree with entries (range, fullcover, localmax): root (r_root, 0, 9); its children (r1, 0, 4), (r2, 1, 4), (r3, 2, 7); children of r3: (r31, 4, 3), (r32, 2, 3), (r33, 1, 2); other sub-trees omitted.] Real max overlap = localmax = 5

32 Query Processing Start with root, insert index entries into heap. Sorting key: upper bound of real max overlap in the sub-tree. –localmax + fullcovers of ancestor entries. –Accurate if Q intersects with maxrange. Keep a running value: max overlap M. Pruning 1: Q intersects with maxrange. Pruning 2: upper bound of max overlap < M.

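33 [Figure: OL-tree with entries (range, fullcover, localmax): root (r_root, 0, 9); its children (r1, 0, 4), (r2, 1, 4), (r3, 2, 7); children of r3: (r31, 4, 3), (r32, 2, 3), (r33, 1, 2); other sub-trees omitted; the query range Q is shown.] r2 is pruned since Q intersects r2.maxrange. M = 0+1+4=5. r1 is pruned since the upper bound of overlap = 4 < M.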
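The pruning on this slide can be traced with a small best-first search. This is a sketch under assumptions rather than the authors' implementation: the geometry is replaced by two precomputed boolean flags per entry (hypothetical fields stating whether Q intersects the entry's range and its maxrange), and the leaf-level plane sweep is left out. The (fullcover, localmax) values are the ones from the example tree.

```python
import heapq
from dataclasses import dataclass, field
from typing import List


@dataclass
class Entry:
    name: str
    fullcover: float
    localmax: float
    intersects_q: bool = False      # hypothetical flag: Q overlaps the entry's range
    q_hits_maxrange: bool = False   # hypothetical flag: Q overlaps the entry's maxrange
    children: List["Entry"] = field(default_factory=list)


def query(root: Entry) -> float:
    """Best-first search keyed on localmax + fullcovers of the entry and its ancestors."""
    m = 0.0  # running max overlap M
    # heap items: (-upper_bound, tie_breaker, entry, sum of ancestors' fullcover)
    heap = [(-(root.fullcover + root.localmax), 0, root, 0.0)]
    counter = 1
    while heap:
        neg_ub, _, entry, ancestors = heapq.heappop(heap)
        ub = -neg_ub
        if ub <= m:
            break  # pruning 2: this and all remaining bounds cannot improve M
        if entry.q_hits_maxrange:
            m = ub  # pruning 1: the bound is exact, no need to descend
            continue
        if not entry.children:
            raise NotImplementedError("leaf node: a plane sweep is needed here")
        covered = ancestors + entry.fullcover
        for child in entry.children:
            if child.intersects_q:
                bound = covered + child.fullcover + child.localmax
                heapq.heappush(heap, (-bound, counter, child, covered))
                counter += 1
    return m


# Example tree from the slide; Q is assumed to touch r1 and r2 only.
r1 = Entry("r1", 0, 4, intersects_q=True)
r2 = Entry("r2", 1, 4, intersects_q=True, q_hits_maxrange=True)
r3 = Entry("r3", 2, 7, children=[Entry("r31", 4, 3), Entry("r32", 2, 3),
                                 Entry("r33", 1, 2)])
root = Entry("r_root", 0, 9, intersects_q=True, children=[r1, r2, r3])
print(query(root))
```

The slide prunes when the upper bound is below M; the sketch also stops on equality, since an equal bound cannot raise M either. Here r2 is popped first with bound 0+1+4, and r1 with bound 4 is then discarded.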

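34 [Figure: OL-tree with entries (range, fullcover, localmax): root (r_root, 0, 9); its children (r1, 0, 4), (r2, 1, 4), (r3, 2, 7); children of r3: (r31, 4, 3), (r32, 2, 3), (r33, 1, 2); other sub-trees omitted.] Sometimes, we need to examine a leaf node. Plane sweep it!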

35 OL-tree → VOL-tree The OL-tree is not practical: – worst-case space complexity O(n^2), – complex re-organization. How to improve? – Only keep a few top levels of the OL-tree. ==> Virtual OL-tree!

36 VOL-tree

37 Example If Q is here, perform range search on the R-tree.

38 Comparison with R-tree Approach The R-tree approach examines all nn_buffers intersecting with Q. By using a small, in-memory VOL-tree, the new approach can prune the search space.

39 Challenge With dynamic updates, keeping localmax and maxrange is expensive. To insert an nn_buffer here, recompute!

40 Solution Index entry: (range, fullcover, maxrange, lowermax, uppermax), with localmax replaced by lowermax and uppermax, where $lowermax \le localmax \le uppermax$.

41 Solution Index entry: (range, fullcover, maxrange, lowermax, uppermax), with localmax replaced by lowermax and uppermax, where $lowermax \le localmax \le uppermax$. Any location in maxrange has overlap = lowermax. At a location outside maxrange, the overlap can be more than lowermax, but < uppermax.

42 Update Case 1: the new nn_buffer does not intersect maxrange: increase uppermax. Case 2: the new nn_buffer intersects maxrange: increase uppermax and lowermax.
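The two update cases can be sketched as follows. The class and field names are hypothetical, and one step goes beyond what the slide states: in case 2 the sketch shrinks maxrange to its intersection with the new nn_buffer, an assumption made so that every location in maxrange still has overlap exactly lowermax. Handling of fullcover is omitted.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

Rect = Tuple[float, float, float, float]  # (x_lo, y_lo, x_hi, y_hi)


def intersection(a: Rect, b: Rect) -> Optional[Rect]:
    """Intersection of two closed rectangles, or None if they are disjoint."""
    x_lo, y_lo = max(a[0], b[0]), max(a[1], b[1])
    x_hi, y_hi = min(a[2], b[2]), min(a[3], b[3])
    if x_lo > x_hi or y_lo > y_hi:
        return None
    return (x_lo, y_lo, x_hi, y_hi)


@dataclass
class VolEntry:
    maxrange: Rect          # region whose overlap is exactly lowermax
    lowermax: float = 0.0   # lower bound on the true max overlap
    uppermax: float = 0.0   # upper bound on the true max overlap

    def insert(self, buf: Rect, weight: float) -> None:
        self.uppermax += weight              # both cases raise the upper bound
        common = intersection(self.maxrange, buf)
        if common is not None:               # case 2: buf intersects maxrange
            self.lowermax += weight
            self.maxrange = common           # assumption: keep the invariant exact


# Hypothetical entry covering [0,10] x [0,10], initially empty.
e = VolEntry(maxrange=(0, 0, 10, 10))
e.insert((0, 0, 5, 5), 4)   # case 2
e.insert((6, 6, 9, 9), 3)   # case 1: misses the current maxrange
e.insert((4, 4, 8, 8), 5)   # case 2
print(e)
```

In this run the true maximum overlap is 4+5=9 inside the final maxrange, so the bounds bracket it without any recomputation.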

43 Query Similar to the OL-tree. To compute the upper bound of max overlap, use uppermax. When Q intersects maxrange, the entry may or may not be pruned.

44 Content Problem Definition Straightforward Solution Problem Transformation The R-tree-based Solution The OL-tree The VOL-tree Performance

45 Setup Digital Chart from the R-tree Portal. – O: 24,493 populated places. – S: 9,203 cultural landmarks. Page size: 1 KB. Buffer size: 256 pages. Object R-tree: 753 pages. Pentium IV Dell PC, 3.2 GHz. Java. Measure total I/O of 100 random queries.

46 Size of the VOL-tree

47 Small Query Area

48 Large Query Area

49 Varying Buffer Size

50 Effect of Update

51 Conclusions Introduced the optimal-location query. Proposed three solutions. The VOL-tree approach is the best. More improvement with larger query area. (5% query area = 6 times improvement.) More updates decrease the improvement. (50% updates = no improvement.) But can bulk-load.

