1
**The Optimal-Location Query**

Donghui Zhang, Northeastern University. Coauthors: Yang Du, Tian Xia.

2
**Motivation**

“What is the optimal location in the Boston area to build a new McDonald’s store?”

Optimality: maximize the number of customers for whom the new store would be closer than any existing store.

3
**Formal Definition**

Given a set $S$ of sites, a set $O$ of weighted objects, and a query range $Q$, find a location $l \in Q$ which maximizes

$$\sum_{o \in O} \{\, o.\mathit{weight} \ \text{ s.t. } \ \forall s \in S,\ d(o, l) \le d(o, s) \,\}.$$

We consider the $L_1$ distance: $|x_1 - x_2| + |y_1 - y_2|$.
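The quantity being maximized can be sketched in a few lines of Python. This is only an illustration of the definition: the names `l1` and `influence` and the coordinates are hypothetical, and counting ties in favor of the new location is an assumption.

```python
from typing import List, Tuple

Point = Tuple[float, float]

def l1(a: Point, b: Point) -> float:
    """L1 (Manhattan) distance between two points."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

def influence(loc: Point,
              objects: List[Tuple[Point, float]],
              sites: List[Point]) -> float:
    """Total weight of the objects that are at least as close to loc as to every site."""
    return sum(w for pos, w in objects
               if all(l1(pos, loc) <= l1(pos, s) for s in sites))

# Hypothetical coordinates, chosen only for illustration (not the slide's figure).
objects = [((0, 0), 4), ((10, 0), 3), ((5, 5), 5), ((6, 6), 6)]
sites = [(0, 1), (10, 1)]
print(influence((5, 6), objects, sites))  # prints 11
```

Evaluating `influence` for one location is cheap; the difficulty is that Q contains infinitely many candidate locations.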

5
**Example**

[Figure: query range Q with four weighted objects, o1:4, o2:3, o3:5 and o4:6 (name:weight), and two existing sites, s1 and s2.]

6
**Example**

[Figure: the same scene with a candidate location l1 and the L1 distances marked between points.]

The “influence” of l1 is 5+6=11.

7
**Example**

[Figure: the same scene with two candidate locations, l1 and l2.]

The influence of l1 is 5+6=11. The influence of l2 is 5.

8
**Content**

- Problem Definition
- Straightforward Solution
- Problem Transformation
- The R-tree-based Solution
- The OL-tree
- The VOL-tree
- Performance

9
**Using the RNN Algorithm…**

[Figure: candidate location l1 among the objects and sites of the example.]

The RNNs (reverse nearest neighbors) of l1 are O3 and O4.

10
**Straightforward Solution**

[Figure: the objects and sites of the example.]

Compute the influence of every location in Q. Problematic: there are infinitely many candidate locations!

12
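**nn_buffer of an Object**

[Figure: the nn_buffer of O4, drawn among the objects O1:4, O2:3, O3:5, O4:6 and the sites S1, S2.]

If a new site were built at any location within an object's nn_buffer, it would be closer to that object than every existing site. Under the L1 distance, the nn_buffer is a diamond.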
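A minimal sketch of the nn_buffer, assuming it is the L1 ball centered at the object whose radius is the distance to the object's nearest existing site. The function names and coordinates below are hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, float]

def l1(a: Point, b: Point) -> float:
    """L1 (Manhattan) distance between two points."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

def nn_buffer(obj: Point, sites: List[Point]) -> Tuple[Point, float]:
    """Return (center, radius) of the diamond around obj.

    The radius is the L1 distance from obj to its nearest existing site.
    """
    return obj, min(l1(obj, s) for s in sites)

def in_nn_buffer(loc: Point, obj: Point, sites: List[Point]) -> bool:
    """True if a site built at loc would be at least as close to obj as any existing site."""
    center, radius = nn_buffer(obj, sites)
    return l1(loc, center) <= radius

# Hypothetical data: one object and two existing sites.
sites = [(0, 1), (10, 1)]
print(nn_buffer((5, 5), sites))             # ((5, 5), 9)
print(in_nn_buffer((5, 6), (5, 5), sites))  # True
```

Under this reading, the influence of a location is the total weight of the objects whose nn_buffers contain it.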

13
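**Problem Transformation**

[Figure: query range Q with the nn_buffers of O1:4, O3:5 and O4:6 and the sites S1, S2. The region of maximum overlap is labeled “Any location here is an optimal location!”]

Find a location with maximum overlap among the objects’ nn_buffers.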

14
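**The Rotated Coordinate System**

[Figure: axes X, Y and the rotated axes X', Y' at 45°; an object o has coordinates (x, y) in the original system and (x', y') in the rotated one.]

Rotate the coordinate system by 45°. All nn_buffers become axis-parallel squares. From here on, we work in the rotated coordinate system.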
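The rotation can be checked numerically. The sketch below assumes the standard axis rotation x' = (x + y)/√2, y' = (y − x)/√2, under which an L1 diamond of radius r becomes an axis-parallel square of half-side r/√2, because |dx| + |dy| = max(|dx + dy|, |dx − dy|). The function names are hypothetical.

```python
import math
from typing import Tuple

Point = Tuple[float, float]
Square = Tuple[float, float, float, float]  # (xlo, ylo, xhi, yhi) in rotated coordinates

def rotate45(p: Point) -> Point:
    """Coordinates of p in a system whose axes are rotated by 45 degrees."""
    x, y = p
    s = math.sqrt(2.0)
    return ((x + y) / s, (y - x) / s)

def buffer_to_square(center: Point, radius: float) -> Square:
    """Axis-parallel square (rotated coordinates) equal to the L1 diamond."""
    u, v = rotate45(center)
    h = radius / math.sqrt(2.0)
    return (u - h, v - h, u + h, v + h)

def in_square(p: Point, sq: Square) -> bool:
    """Point-in-square test; p is given in the original coordinates."""
    u, v = rotate45(p)
    return sq[0] <= u <= sq[2] and sq[1] <= v <= sq[3]

# Hypothetical diamond: center (5, 5), L1 radius 9.
sq = buffer_to_square((5, 5), 9)
print(in_square((12, 5), sq))  # True: L1 distance 7
print(in_square((11, 9), sq))  # False: L1 distance 10
```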

16
**The R-tree-based Solution**

Store the objects in an R-tree. Retrieve the objects whose nn_buffers intersect Q. Plane sweep to find a region which has maximum overlap.

17
**Two Contributions**

Object retrieval: store point objects, but retrieve nn_buffers in increasing order of lower X.

Plane sweep: straightforwardly O(n²); our method O(n log n).

18
**Best-first Retrieval**

Keep a heap of index entries and objects, sorted in increasing order of the nn_buffer’s lower X.

While the heap is not empty, pop an entry. If it is an object, send it to the plane sweep. If it is an index entry, push its children that intersect Q.
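The loop above can be sketched with Python's `heapq`. The `Node` class is a hypothetical stand-in for an R-tree node, and two simplifications are assumed: each node stores the bounding box of the nn_buffers below it (the slides store point objects instead), and an index entry is keyed by the lower X of that box, which is a lower bound for everything in its sub-tree.

```python
import heapq
from dataclasses import dataclass, field
from typing import Iterator, List, Tuple

Rect = Tuple[float, float, float, float]  # (xlo, ylo, xhi, yhi)

def intersects(a: Rect, b: Rect) -> bool:
    """True if two closed axis-parallel rectangles share at least one point."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]

@dataclass
class Node:
    """Toy tree node: a leaf object if it has no children, else an index entry."""
    name: str
    box: Rect  # an object's nn_buffer, or the bounding box of the buffers below
    children: List["Node"] = field(default_factory=list)

def best_first(root: Node, q: Rect) -> Iterator[Node]:
    """Yield the objects whose nn_buffers intersect q, in increasing lower X."""
    heap = []
    counter = 0  # tie-breaker, so Node objects are never compared
    if intersects(root.box, q):
        heap.append((root.box[0], counter, root))
    while heap:
        _, _, node = heapq.heappop(heap)
        if not node.children:
            yield node  # an object: hand it to the plane sweep
            continue
        for child in node.children:
            if intersects(child.box, q):
                counter += 1
                heapq.heappush(heap, (child.box[0], counter, child))

# Hypothetical tree with four objects under two index entries.
a, b = Node("a", (5, 0, 7, 2)), Node("b", (1, 0, 3, 2))
c, d = Node("c", (3, 5, 4, 6)), Node("d", (20, 20, 21, 21))
root = Node("root", (1, 0, 21, 21),
            [Node("n1", (1, 0, 7, 2), [a, b]), Node("n2", (3, 5, 21, 21), [c, d])])
print([n.name for n in best_first(root, (0, 0, 10, 10))])  # ['b', 'c', 'a']
```

Because an index entry's key never exceeds the keys of its descendants, objects leave the heap in sorted order without a separate sorting pass.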

19
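**Naïve Plane Sweep**

[Figure: the nn_buffers of O1:4, O2:3, O3:5 and O4:6 as squares, with axis ticks at 2, 5, 8, 9 and 12. Below, the Y axis from -∞ to +∞ is cut at 2, 5, 8, 9 and 12, and each interval carries its current overlap.]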

20
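**Not Efficient! O(n²)**

Suppose the next insertion adds 2 to the Y-range [2, 11].

[Figure: the interval list -∞, 2, 5, 8, 9, 12, +∞ before and after the insertion; a new boundary appears at 11 and the value of every interval inside [2, 11] is raised by 2.]

A single insertion may update O(n) intervals, so n insertions cost O(n²).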
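A minimal sketch of the quadratic sweep, for reference. It assumes the squares are given as `(xlo, ylo, xhi, yhi, weight)` in the rotated coordinates, measures overlap over regions of positive area, and ignores clipping to Q. The function name and sample data are hypothetical.

```python
from typing import List, Tuple

Rect = Tuple[float, float, float, float, float]  # (xlo, ylo, xhi, yhi, weight)

def max_overlap_naive(rects: List[Rect]) -> float:
    """Maximum total weight of rectangles sharing a region of positive area."""
    # Elementary Y intervals are the gaps between consecutive distinct Y bounds.
    ys = sorted({y for r in rects for y in (r[1], r[3])})
    acc = [0.0] * max(len(ys) - 1, 0)  # current weight on each elementary interval
    events = []
    for xlo, ylo, xhi, yhi, w in rects:
        events.append((xhi, 0, ylo, yhi, -w))  # close; sorts before an open at the same x
        events.append((xlo, 1, ylo, yhi, w))   # open
    events.sort()
    best = 0.0
    for _, _, ylo, yhi, w in events:
        # O(n) work per event: touch every elementary interval inside [ylo, yhi].
        for i in range(len(acc)):
            if ylo <= ys[i] and ys[i + 1] <= yhi:
                acc[i] += w
        if acc:
            best = max(best, max(acc))
    return best

# Hypothetical squares; the first three share the region [3, 4] x [3, 4].
rects = [(0, 0, 4, 4, 4), (2, 2, 6, 6, 3), (3, 3, 5, 5, 5), (10, 10, 12, 12, 6)]
print(max_overlap_naive(rects))  # prints 12.0
```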

21
**The aSB-tree**

Extended from the SB-tree [YW01]:

- keeps max-overlap information at index entries;
- handles a query range Q.

[Figure: a two-level tree over the Y axis; the root is cut at 5 and 9, the leaf level at 2, 5, 8, 9 and 12.]

22
**The aSB-tree**

Suppose the next insertion adds 2 to the Y range [2, 11].

[Figure: the +2 update arrives at the root, which is cut at 5 and 9 above a leaf level cut at 2, 5, 8, 9 and 12.]

23
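**The aSB-tree**

Suppose the next insertion adds 2 to the Y range [2, 11].

[Figure: the root entry for [5, 9] lies entirely inside [2, 11] and records the 2 itself; the +2 is passed down only to the first and last child, which [2, 11] covers partially.]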

24
**The aSB-tree**

Suppose the next insertion adds 2 to the Y range [2, 11].

[Figure: the tree after the insertion; the leaf level is now cut at 2, 5, 8, 9, 11 and 12.]
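The idea of keeping max-overlap information at index entries can be illustrated with a static segment tree over coordinate-compressed Y bounds. This is a stand-in technique, not the aSB-tree itself: the aSB-tree is a dynamic, disk-based structure, while the sketch below fixes all Y bounds in advance. Each node keeps the weight added to its whole range and the maximum below it, so a range update touches O(log n) nodes and the global maximum is read at the root. Class and function names are hypothetical.

```python
import bisect
from typing import List, Tuple

Rect = Tuple[float, float, float, float, float]  # (xlo, ylo, xhi, yhi, weight)

class MaxAddTree:
    """Segment tree over n slots supporting range-add and global-max."""

    def __init__(self, n: int) -> None:
        self.n = n
        self.add = [0.0] * (4 * n)  # weight covering the node's whole range
        self.mx = [0.0] * (4 * n)   # max in the node's range, including self.add

    def update(self, lo: int, hi: int, w: float,
               node: int = 1, l: int = 0, r: int = -1) -> None:
        """Add w to slots lo .. hi-1."""
        if r < 0:
            r = self.n
        if hi <= l or r <= lo:
            return
        if lo <= l and r <= hi:  # fully covered: record here, do not descend
            self.add[node] += w
            self.mx[node] += w
            return
        m = (l + r) // 2
        self.update(lo, hi, w, 2 * node, l, m)
        self.update(lo, hi, w, 2 * node + 1, m, r)
        self.mx[node] = self.add[node] + max(self.mx[2 * node], self.mx[2 * node + 1])

    def max(self) -> float:
        return self.mx[1]

def max_overlap(rects: List[Rect]) -> float:
    """Plane sweep in O(n log n): max total weight over regions of positive area."""
    ys = sorted({y for r in rects for y in (r[1], r[3])})
    if len(ys) < 2:
        return 0.0
    tree = MaxAddTree(len(ys) - 1)
    events = []
    for xlo, ylo, xhi, yhi, w in rects:
        a, b = bisect.bisect_left(ys, ylo), bisect.bisect_left(ys, yhi)
        events.append((xhi, 0, a, b, -w))  # close before open at the same x
        events.append((xlo, 1, a, b, w))
    events.sort()
    best = 0.0
    for _, _, a, b, w in events:
        tree.update(a, b, w)
        best = max(best, tree.max())
    return best

# Hypothetical squares; the first three share the region [3, 4] x [3, 4].
rects = [(0, 0, 4, 4, 4), (2, 2, 6, 6, 3), (3, 3, 5, 5, 5), (10, 10, 12, 12, 6)]
print(max_overlap(rects))  # prints 12.0
```

The `add` and `mx` fields play roles similar to a value recorded at a fully covered entry and the maximum kept for its sub-tree.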

26
**The OL-tree**

Idea: partition the space, and keep the max-overlapped region for each partition!

Like a k-d-B-tree, but it stores nn_buffers. An nn_buffer may have multiple copies.

[Figure: an nn_buffer spanning partitions 1, 2, 3 and 4.] Partition 1: add to fullcover. Partitions 2, 3, 4: recursively insert.

27
**Stored Information**

Index entry has, besides range:

- fullcover: total weight of nn_buffers fully covering the whole area;
- localmax: among the nn_buffers inserted into the sub-tree, the max overlap;
- maxrange: the region where localmax occurred.

Leaf entry: a rectangle and its weight.

28
[Figure: example OL-tree. Each entry is written (range, fullcover, localmax); sub-trees are omitted.]

- root: (r, 0, 9)
  - (r1, 0, 4)
  - (r2, 1, 4)
  - (r3, 2, 7)
    - (r31, 4, 3)
    - (r32, 2, 3)
    - (r33, 1, 2)

29
[Figure: the same example tree, annotating the entry (r3, 2, 7).]

- fullcover: 2 nn_buffers fully cover r3.
- localmax: among those inserted, the max overlap is 7.
- maxrange: where localmax occurred.

30
**Query Processing**

Start with the root; insert index entries into a heap.

Sorting key: an upper bound of the real max overlap in the sub-tree, computed as localmax plus the fullcovers of the entry and its ancestor entries. The bound is exact if Q intersects maxrange.

31
[Figure: the same example tree, highlighting (r3, 2, 7) and (r33, 1, 2).]

For r33: real max overlap = 0 + 2 + 1 + localmax = 5.

32
**Query Processing**

Start with the root; insert index entries into a heap.

Sorting key: an upper bound of the real max overlap in the sub-tree, computed as localmax plus the fullcovers of the entry and its ancestor entries. The bound is exact if Q intersects maxrange.

Keep a running value: the max overlap M found so far.

- Pruning 1: Q intersects maxrange, so the bound is exact and the sub-tree need not be opened.
- Pruning 2: the upper bound of max overlap < M.
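The sorting key can be sketched on the example tree from the slides. The `Entry` class and `upper_bounds` function are hypothetical; the (fullcover, localmax) values are read from the example figure, and the key is taken to include the entry's own fullcover, which matches the figure's arithmetic 0 + 2 + 1 + localmax = 5 for r33.

```python
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class Entry:
    """Toy OL-tree index entry; range and maxrange are omitted in this sketch."""
    name: str
    fullcover: float
    localmax: float
    children: List["Entry"] = field(default_factory=list)

def upper_bounds(root: Entry) -> Dict[str, float]:
    """Map each entry to localmax plus the fullcovers on its root-to-entry path."""
    out: Dict[str, float] = {}
    stack = [(root, 0.0)]
    while stack:
        entry, inherited = stack.pop()
        covered = inherited + entry.fullcover
        out[entry.name] = covered + entry.localmax
        for child in entry.children:
            stack.append((child, covered))
    return out

# The example tree: (fullcover, localmax) per entry.
r3 = Entry("r3", 2, 7, [Entry("r31", 4, 3), Entry("r32", 2, 3), Entry("r33", 1, 2)])
root = Entry("root", 0, 9, [Entry("r1", 0, 4), Entry("r2", 1, 4), r3])

bounds = upper_bounds(root)
print(bounds["r33"], bounds["r2"], bounds["r1"])  # 5.0 5.0 4.0

# Pruning 2 with a running maximum M = 5: only r1 falls below it.
M = 5
print(sorted(name for name, b in bounds.items() if b < M))  # ['r1']
```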

33
[Figure: query range Q over the same example tree.]

r2 is pruned since Q intersects r2.maxrange. M = 0+1+4 = 5.

r1 is pruned since its upper bound of overlap = 4 < M.

34
[Figure: the same example tree.]

Sometimes, we need to examine a leaf node. Plane sweep it!

35
**OL-tree → VOL-tree**

The OL-tree is not practical:

- worst-case space complexity O(n²);
- complex re-organization.

How to improve? Only keep a few top levels of the OL-tree. ==> Virtual OL-tree!

36
VOL-tree

37
**Example**

[Figure: a query range Q placed in the partitioned space.] If Q is here, perform a range search on the R-tree.

38
**Comparison with R-tree Approach**

The R-tree approach examines all nn_buffers intersecting Q. By using a small, in-memory VOL-tree, the new approach can prune the search space.

39
**Challenge**

[Figure: a new nn_buffer overlapping a partition.] To insert an nn_buffer here, recompute!

With dynamic updates, keeping localmax and maxrange is expensive.

40
**Solution**

Index entry: (range, fullcover, maxrange, lowermax, uppermax). The exact localmax is replaced by two bounds:

lowermax ≤ localmax ≤ uppermax

41
**Solution**

Index entry: (range, fullcover, maxrange, lowermax, uppermax). The exact localmax is replaced by two bounds:

lowermax ≤ localmax ≤ uppermax

Any location in maxrange has overlap = lowermax. At a location outside maxrange, the overlap can be more than lowermax, but < uppermax.

42
**Update**

- Case 1: the new nn_buffer does not intersect maxrange. Increase uppermax.
- Case 2: the new nn_buffer intersects maxrange. Increase uppermax and lowermax.
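The two update cases can be sketched as follows. The `VolEntry` class is hypothetical and models only the two bounds and maxrange. Shrinking maxrange to its intersection with the new nn_buffer in Case 2 is an assumption, made so that every location in maxrange still has overlap equal to lowermax; the slides do not spell this step out.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

Rect = Tuple[float, float, float, float]  # (xlo, ylo, xhi, yhi)

def intersection(a: Rect, b: Rect) -> Optional[Rect]:
    """Common region of two rectangles, or None if it has no positive area."""
    xlo, ylo = max(a[0], b[0]), max(a[1], b[1])
    xhi, yhi = min(a[2], b[2]), min(a[3], b[3])
    return (xlo, ylo, xhi, yhi) if xlo < xhi and ylo < yhi else None

@dataclass
class VolEntry:
    """Toy VOL-tree index entry holding only maxrange and the two bounds."""
    maxrange: Rect
    lowermax: float
    uppermax: float

    def insert(self, buf: Rect, weight: float) -> None:
        # Both cases: no location can gain more than `weight`.
        self.uppermax += weight
        common = intersection(self.maxrange, buf)
        if common is not None:
            # Case 2: part of maxrange gains `weight`.
            self.lowermax += weight
            self.maxrange = common  # assumption: keep only the part that gained it

# Hypothetical entry and insertions.
e = VolEntry(maxrange=(0, 0, 4, 4), lowermax=5, uppermax=5)
e.insert((10, 10, 12, 12), 3)  # Case 1
print(e.lowermax, e.uppermax)  # 5 8
e.insert((2, 2, 6, 6), 2)      # Case 2
print(e.lowermax, e.uppermax, e.maxrange)  # 7 10 (2, 2, 4, 4)
```

Neither case requires recomputing the overlap inside the sub-tree, which is the cost the bounds are meant to avoid.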

43
**Query**

Similar to the OL-tree.

To compute the upper bound of max overlap, use uppermax. When Q intersects maxrange, the entry may or may not be pruned.

45
**Setup**

- Data: Digital Chart from the R-tree Portal.
- O: 24,493 populated places. S: 9,203 cultural landmarks.
- Page size: 1 KB. Buffer size: 256 pages. Object R-tree: 753 pages.
- Pentium IV Dell PC, 3.2 GHz. Java.
- Measure total I/O of 100 random queries.

46
Size of the VOL-tree

47
Small Query Area

48
Large Query Area

49
Varying Buffer Size

50
Effect of Update

51
**Conclusions**

- Introduced the optimal-location query.
- Proposed three solutions. The VOL-tree approach is the best.
- More improvement with a larger query area. (5% query area = 6 times improvement.)
- More updates decrease the improvement. (50% updates = no improvement.) But can bulk-load.

Q & A...
