The Optimal-Location Query

1 The Optimal-Location Query
Donghui Zhang Northeastern University Coauthors: Yang Du, Tian Xia

2 Motivation
“What is the optimal location in the Boston area to build a new McDonald’s store?” Optimality: maximize the number of customers who think the new store is closer to them than any existing store.

3 Formal Definition
Given a set S of sites, a set O of weighted objects, and a query range Q, find a location l ∈ Q which maximizes the total weight Σ o.weight over the objects o ∈ O s.t. ∀s ∈ S, d(o, l) ≤ d(o, s). We consider the L1 distance: |x1 - x2| + |y1 - y2|.
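
The definition can be checked by brute force for a single candidate location. The following is a minimal sketch, not the authors' code: the coordinates are made up, the names `l1` and `influence` are hypothetical, and the comparison is assumed to be non-strict.

```python
from typing import List, Tuple

Point = Tuple[float, float]

def l1(a: Point, b: Point) -> float:
    """L1 (Manhattan) distance between two points."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

def influence(loc: Point, sites: List[Point],
              objects: List[Tuple[Point, float]]) -> float:
    """Total weight of the objects that are at least as close to loc
    as to every existing site (non-strict comparison is an assumption)."""
    total = 0.0
    for pos, weight in objects:
        nearest_site = min(l1(pos, s) for s in sites)
        if l1(pos, loc) <= nearest_site:
            total += weight
    return total

# Hypothetical data: two sites and three weighted objects.
sites = [(0.0, 0.0), (10.0, 0.0)]
objects = [((4.0, 0.0), 4.0), ((5.0, 5.0), 3.0), ((6.0, 1.0), 5.0)]

print(influence((5.0, 1.0), sites, objects))  # all three objects are won
print(influence((0.0, 9.0), sites, objects))  # only the object at (5, 5)
```

The optimal-location query asks for the location in Q that maximizes this value, which is why evaluating it point by point is not enough.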

5 Example
[Figure: query range Q, four weighted objects (o1:4, o2:3, o3:5, o4:6), and two sites (s1, s2).]

6 Example
[Figure: the same layout with a candidate location l1 in Q.] The “Influence” of l1 is 5+6=11.

7 Example
[Figure: the same layout with two candidate locations, l1 and l2, in Q.] The “Influence” of l1 is 5+6=11. The “Influence” of l2 is 5.

8 Content
Problem Definition; Straightforward Solution; Problem Transformation; The R-tree-based Solution; The OL-tree; The VOL-tree; Performance

9 Using the RNN Algorithm…
[Figure: the example layout with candidate location l1.] The RNNs (reverse nearest neighbors) of l1 are O3 and O4.

10 Straightforward Solution
[Figure: the example layout.] Compute the influence for every location in Q. Problematic: infinite number of candidates!

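12 nn_buffer of an Object
[Figure: the nn_buffer of O4, drawn among objects O1:4, O2:3, O3:5, O4:6 and sites S1, S2.] A new site built at any location within an object’s nn_buffer would be closer to that object than its current nearest site. Under the L1 distance, the nn_buffer is a diamond.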
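
A small sketch of how an nn_buffer could be represented: a center (the object) and a radius (the L1 distance to its nearest site). The function names and the sample coordinates are assumptions made for illustration.

```python
from typing import List, Tuple

Point = Tuple[float, float]

def l1(a: Point, b: Point) -> float:
    """L1 (Manhattan) distance between two points."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

def nn_buffer(obj: Point, sites: List[Point]) -> Tuple[Point, float]:
    """Diamond around obj whose radius is the distance to the nearest site."""
    return obj, min(l1(obj, s) for s in sites)

def in_nn_buffer(loc: Point, buf: Tuple[Point, float]) -> bool:
    """True if a site built at loc would be at least as close as the nearest site."""
    center, radius = buf
    return l1(loc, center) <= radius

def corners(buf: Tuple[Point, float]) -> List[Point]:
    """The four corners of the diamond: right, top, left, bottom."""
    (cx, cy), r = buf
    return [(cx + r, cy), (cx, cy + r), (cx - r, cy), (cx, cy - r)]

# Hypothetical data: the nearest site to (6, 1) is (10, 0), at distance 5.
buf = nn_buffer((6.0, 1.0), [(0.0, 0.0), (10.0, 0.0)])
print(buf, corners(buf))
```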

13 Problem Transformation
[Figure: Q, objects O1:4, O3:5, O4:6, sites S1, S2, and a highlighted region labeled “Any location here is an optimal location!”] Find a location with maximum overlap among objects’ nn_buffers.

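14 The Rotated Coordinate
[Figure: axes X, Y and the rotated axes X', Y' at 45°; an object o has coordinates (x, y) in the original system and (x', y') in the rotated one.] Rotate the coordinate system by 45°. All nn_buffers become axis-parallel squares. Focus on the rotated coordinates.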
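
The effect of the rotation can be sketched as follows. For simplicity this sketch uses u = x + y, v = y - x, which is a 45° rotation followed by a uniform scaling by √2; the scaling is a simplification introduced here and does not change which regions overlap. The function names are hypothetical.

```python
from typing import Tuple

Point = Tuple[float, float]
Rect = Tuple[float, float, float, float]  # (u_lo, v_lo, u_hi, v_hi)

def rotate(p: Point) -> Point:
    """Map (x, y) to (x + y, y - x): a 45-degree rotation scaled by sqrt(2)."""
    x, y = p
    return (x + y, y - x)

def buffer_to_square(center: Point, radius: float) -> Rect:
    """An L1 diamond of the given radius becomes an axis-parallel square
    of half-side `radius` in the (u, v) coordinates."""
    u, v = rotate(center)
    return (u - radius, v - radius, u + radius, v + radius)

def in_square(p: Point, sq: Rect) -> bool:
    """Test a point given in the original (x, y) coordinates."""
    u, v = rotate(p)
    return sq[0] <= u <= sq[2] and sq[1] <= v <= sq[3]

sq = buffer_to_square((6, 1), 5)
print(sq)                     # the square for a diamond centered at (6, 1)
print(in_square((6, 6), sq))  # a corner of the diamond lies on the square's edge
```

The equivalence rests on the identity |a| + |b| = max(|a + b|, |a - b|), so membership in the diamond and membership in the square agree for every point.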

16 The R-tree-based Solution
Store the objects in an R-tree. Retrieve the objects whose nn_buffers intersect Q. Plane sweep to find a region which has maximum overlap.

17 Two Contributions
Object retrieval: store point objects, but retrieve nn_buffers in increasing order of lower X.
Plane sweep: straightforwardly, O(n²); our method, O(n log n).

18 Best-first Retrieval
Keep a heap of index entries + objects, sorted in increasing order of nn_buffer’s lower X. While the heap is not empty, pop an entry. If it is an object, send it to plane sweep. If it is an index entry, push its children (those intersecting Q).
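
The loop above can be sketched with a binary heap. This is an illustration under assumptions, not the R-tree code from the talk: `Node` is a hypothetical in-memory stand-in for R-tree entries, and each index entry is keyed by the lowest X of the nn_buffers beneath it.

```python
import heapq
from itertools import count

def intersects(a, b):
    """Axis-parallel rectangles given as (x_lo, y_lo, x_hi, y_hi)."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]

class Node:
    """Hypothetical entry: an object (buffer set) or an index entry (children set)."""
    def __init__(self, children=None, buffer=None, weight=0):
        self.children = children or []
        self.buffer = buffer
        self.weight = weight
        rects = [buffer] if buffer is not None else [c.bbox for c in self.children]
        # Bounding box of all nn_buffers below this entry.
        self.bbox = (min(r[0] for r in rects), min(r[1] for r in rects),
                     max(r[2] for r in rects), max(r[3] for r in rects))

def best_first(root, q):
    """Yield (nn_buffer, weight) pairs intersecting q in increasing lower X."""
    tie = count()  # tie-breaker so the heap never compares Node objects
    heap = [(root.bbox[0], next(tie), root)]
    while heap:
        _, _, node = heapq.heappop(heap)
        if node.buffer is not None:
            yield node.buffer, node.weight      # hand over to the plane sweep
        else:
            for child in node.children:
                if intersects(child.bbox, q):
                    heapq.heappush(heap, (child.bbox[0], next(tie), child))

a = Node(buffer=(5, 0, 7, 2), weight=4)
b = Node(buffer=(1, 0, 3, 2), weight=3)
c = Node(buffer=(2, 5, 4, 7), weight=5)
d = Node(buffer=(20, 20, 22, 22), weight=6)   # lies outside the query range
root = Node(children=[Node(children=[a, d]), Node(children=[b, c])])
order = [buf[0] for buf, _ in best_first(root, (0, 0, 10, 10))]
print(order)
```

Because an index entry's key never exceeds the key of anything below it, objects leave the heap in sorted order without a separate sorting pass.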

19 Naïve Plane Sweep
[Figure: the nn_buffers of O1:4, O2:3, O3:5, and O4:6 in the X-Y plane, with the Y axis split at 2, 5, 8, 9, and 12 between -∞ and +∞.]

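20 Not Efficient! O(n²)
Suppose next insertion: add 2 to the Y-range [2,11]. Every Y interval inside the range has to be updated, so a single insertion can cost O(n).
[Figure: the Y intervals (boundaries -∞, 2, 5, 8, 9, 12, +∞) before and after the +2 update.]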
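
A minimal sketch of the quadratic sweep, assuming the nn_buffers are already axis-parallel rectangles with weights. The function name and the test rectangles are made up, and rectangles that only touch along an edge are treated as not overlapping.

```python
def max_overlap_naive(rects):
    """rects: list of (x_lo, y_lo, x_hi, y_hi, weight).
    Returns the largest total weight covering any area."""
    ys = sorted({y for r in rects for y in (r[1], r[3])})
    index = {y: i for i, y in enumerate(ys)}
    cover = [0] * (len(ys) - 1)        # weight on each elementary Y interval
    events = []
    for x_lo, y_lo, x_hi, y_hi, w in rects:
        events.append((x_lo, 1, index[y_lo], index[y_hi], w))    # rectangle opens
        events.append((x_hi, 0, index[y_lo], index[y_hi], -w))   # rectangle closes
    events.sort()                      # at equal X, closes come before opens
    best = 0
    for _, _, lo, hi, delta in events:
        for i in range(lo, hi):        # O(n) interval updates per event
            cover[i] += delta
        best = max(best, max(cover, default=0))
    return best

# Hypothetical rectangles: the first three share the area x in [3,4], y in [2,3].
rects = [(0, 0, 4, 4, 4), (2, 2, 6, 6, 3), (3, 1, 5, 3, 5), (10, 10, 12, 12, 6)]
print(max_overlap_naive(rects))
```

With n rectangles there are 2n events and up to 2n - 1 elementary intervals, which gives the O(n²) bound.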

21 The aSB-tree
Extended from the SB-tree [YW01]: keeps max overlap information at index entries; handles a query range Q.
[Figure: a two-level tree over the Y axis; the root splits it at 5 and 9, the leaves at 2, 5, 8, 9, and 12, between -∞ and +∞.]

22 The aSB-tree
Suppose next insertion: add 2 to the Y range [2,11].
[Figure: the tree before the insertion.]

23 The aSB-tree
Suppose next insertion: add 2 to the Y range [2,11].
[Figure: the tree during the insertion, with +2 marked on the affected entries.]

24 The aSB-tree
Suppose next insertion: add 2 to the Y range [2,11].
[Figure: the tree after the insertion; the leaf level has a new boundary at 11 (-∞, 2, 5, 8, 9, 11, 12, +∞).]
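
The idea of keeping the maximum at internal entries can be illustrated with a plain in-memory segment tree that supports "add to a range" and "read the overall maximum" in O(log n). This is a swapped-in stand-in, not the aSB-tree itself: it is not disk-based and it ignores the query range Q. All names are hypothetical.

```python
class MaxAddTree:
    """Segment tree over n elementary intervals: range add, global max."""
    def __init__(self, n):
        self.n = max(n, 1)
        self.best = [0] * (4 * self.n)   # max within the node's span
        self.add = [0] * (4 * self.n)    # amount added to the node's whole span

    def update(self, lo, hi, delta, node=1, left=0, right=None):
        """Add delta to the elementary intervals lo .. hi-1."""
        if right is None:
            right = self.n
        if lo >= right or hi <= left:
            return
        if lo <= left and right <= hi:   # span fully covered: stop here
            self.add[node] += delta
            self.best[node] += delta
            return
        mid = (left + right) // 2
        self.update(lo, hi, delta, 2 * node, left, mid)
        self.update(lo, hi, delta, 2 * node + 1, mid, right)
        self.best[node] = self.add[node] + max(self.best[2 * node],
                                               self.best[2 * node + 1])

    def maximum(self):
        return self.best[1]

def max_overlap(rects):
    """rects: list of (x_lo, y_lo, x_hi, y_hi, weight); O(n log n) sweep."""
    ys = sorted({y for r in rects for y in (r[1], r[3])})
    index = {y: i for i, y in enumerate(ys)}
    tree = MaxAddTree(len(ys) - 1)
    events = []
    for x_lo, y_lo, x_hi, y_hi, w in rects:
        events.append((x_lo, 1, index[y_lo], index[y_hi], w))
        events.append((x_hi, 0, index[y_lo], index[y_hi], -w))
    events.sort()                        # at equal X, closes come before opens
    best = 0
    for _, _, lo, hi, delta in events:
        tree.update(lo, hi, delta)
        best = max(best, tree.maximum())
    return best

rects = [(0, 0, 4, 4, 4), (2, 2, 6, 6, 3), (3, 1, 5, 3, 5), (10, 10, 12, 12, 6)]
print(max_overlap(rects))
```

An update that fully covers a node's span stops at that node, which is what keeps each insertion logarithmic.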

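26 The OL-tree
Idea: partition the space, and keep max overlapped region for each partition! Like a k-d-B-tree. Stores nn_buffers. An nn_buffer may have multiple copies.
[Figure: an nn_buffer overlapping partitions 1, 2, 3, and 4.] Partition 1 (fully covered): add to fullcover. Partitions 2, 3, 4 (partially covered): recursively insert.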

27 Stored Information
Index entry has, besides range:
fullcover: total weight of nn_buffers fully covering the whole area;
localmax: among the nn_buffers inserted into the sub-tree, max overlap;
maxrange: the region where localmax occurred.
Leaf entry: a rectangle and its weight.

28 [Figure: example OL-tree; each index entry is shown as (range, fullcover, localmax).]
root: (r, 0, 9)
  (r1, 0, 4)
  (r2, 1, 4)
  (r3, 2, 7)
    (r31, 4, 3)
    (r32, 2, 3)
    (r33, 1, 2)
Sub-trees omitted.

29 [Figure: the example tree of slide 28, with the entry (r3, 2, 7) annotated.]
fullcover: 2 nn_buffers fully cover r3.
localmax: among those inserted, max overlap is 7.
maxrange: where localmax occurred.

30 Query Processing
Start with root, insert index entries into heap. Sorting key: upper bound of real max overlap in the sub-tree: localmax + Σ fullcovers of ancestor entries. Accurate if Q intersects with maxrange.

31 [Figure: the example tree of slide 28, with the entry (r33, 1, 2) annotated.]
For r33: real max overlap = 0+2+1+localmax = 5 (the fullcovers of root, r3, and r33, plus r33’s localmax of 2).
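
The sorting key can be recomputed for every entry of the example tree. The sketch below reads each entry as (range, fullcover, localmax) with r31, r32, and r33 under r3; that reading of the figure, the dictionary layout, and the name `upper_bound` are assumptions.

```python
# entry name -> (parent, fullcover, localmax), as read from the example figure
tree = {
    "root": (None, 0, 9),
    "r1": ("root", 0, 4),
    "r2": ("root", 1, 4),
    "r3": ("root", 2, 7),
    "r31": ("r3", 4, 3),
    "r32": ("r3", 2, 3),
    "r33": ("r3", 1, 2),
}

def upper_bound(name):
    """localmax of the entry plus the fullcovers on the path from the
    entry itself up to the root."""
    total = tree[name][2]
    node = name
    while node is not None:
        parent, fullcover, _ = tree[node]
        total += fullcover
        node = parent
    return total

for name in tree:
    print(name, upper_bound(name))
```

Under this reading the numbers are self-consistent: each parent's localmax equals the largest fullcover + localmax among its children (9 for root via r3, 7 for r3 via r31).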

32 Query Processing
Start with root, insert index entries into heap. Sorting key: upper bound of real max overlap in the sub-tree: localmax + Σ fullcovers of ancestor entries. Accurate if Q intersects with maxrange.
Keep a running value: max overlap M.
Pruning 1: Q intersects with maxrange.
Pruning 2: upper bound of max overlap < M.

33 [Figure: the example tree of slide 28 with the query range Q.]
r2 is pruned since Q intersects r2.maxrange. M = 0+1+4=5.
r1 is pruned since the upper bound of overlap = 4 < M.

34 [Figure: the example tree of slide 28.]
Sometimes, we need to examine a leaf node. Plane sweep it!

35 OL-tree → VOL-tree
OL-tree is not practical: worst-case space complexity O(n²); complex re-organization.
How to improve? Only keep a few top levels of the OL-tree. ==> Virtual OL-tree!

36 VOL-tree

37 Example
If Q is here, perform range search on the R-tree.

38 Comparison with R-tree Approach
The R-tree approach examines all nn_buffers intersecting with Q. By using a small, in-memory VOL-tree, the new approach can prune the search space.

39 Challenge
To insert an nn_buffer here, recompute! With dynamic updates, keeping localmax and maxrange up to date is expensive.

40 Solution
Index entry: (range, fullcover, maxrange, lowermax, uppermax). The exact localmax is replaced by two bounds: lowermax ≤ localmax ≤ uppermax.

41 Solution
Index entry: (range, fullcover, maxrange, lowermax, uppermax). The exact localmax is replaced by two bounds: lowermax ≤ localmax ≤ uppermax.
Any location in maxrange has overlap = lowermax. At a location outside maxrange, the overlap can be more than lowermax, but < uppermax.

42 Update
Case 1: the new nn_buffer does not intersect with maxrange. Increase uppermax.
Case 2: the new nn_buffer intersects with maxrange. Increase uppermax and lowermax.
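
The two cases can be sketched on a single index entry. `Entry`, `apply_insert`, and `intersection` are hypothetical names, and only a buffer that partially covers the entry's range is handled (a fully covering one would go to fullcover). Shrinking maxrange to the intersection in case 2 is an assumption made here so that every location in maxrange keeps overlap = lowermax; the slides do not spell this step out.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

Rect = Tuple[float, float, float, float]  # (x_lo, y_lo, x_hi, y_hi)

def intersection(a: Rect, b: Rect) -> Optional[Rect]:
    """Overlap of two rectangles, or None if they share no area."""
    x_lo, y_lo = max(a[0], b[0]), max(a[1], b[1])
    x_hi, y_hi = min(a[2], b[2]), min(a[3], b[3])
    return (x_lo, y_lo, x_hi, y_hi) if x_lo < x_hi and y_lo < y_hi else None

@dataclass
class Entry:
    maxrange: Rect
    lowermax: float
    uppermax: float

def apply_insert(entry: Entry, buffer: Rect, weight: float) -> None:
    """Update the bounds for an nn_buffer inserted below this entry."""
    entry.uppermax += weight                 # both cases: the bound may rise
    common = intersection(entry.maxrange, buffer)
    if common is not None:                   # case 2: buffer hits maxrange
        entry.lowermax += weight
        entry.maxrange = common              # assumption: keep maxrange exact
    # case 1: lowermax and maxrange stay as they are

e = Entry(maxrange=(0, 0, 4, 4), lowermax=5, uppermax=5)
apply_insert(e, (10, 10, 12, 12), 3)   # case 1
apply_insert(e, (2, 2, 6, 6), 2)       # case 2
print(e)
```

Each update touches only a constant amount of state per entry, at the price of a looser upper bound after case-1 insertions.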

43 Query
Similar to the OL-tree. To compute upper bound of max overlap, use uppermax. When Q intersects maxrange, may or may not prune.

45 Setup
Digital Chart from the R-tree Portal. O: 24,493 populated places. S: 9,203 cultural landmarks. Pagesize: 1KB. Buffersize: 256 pages. Object R-tree: 753 pages. Pentium IV Dell PC, 3.2GHz. Java. Measure total I/O of 100 random queries.

46 Size of the VOL-tree

47 Small Query Area

48 Large Query Area

49 Varying Buffer Size

50 Effect of Update

51 Conclusions
Introduced the optimal-location query. Proposed three solutions. The VOL-tree approach is the best. More improvement with larger query area. (5% query area = 6 times improvement.) More updates decrease the improvement. (50% updates = no improvement.) But can bulk-load. Q & A...

