1
**The Optimal-Location Query**

Donghui Zhang, Northeastern University. Coauthors: Yang Du, Tian Xia.

2
**Motivation**

“What is the optimal location in the Boston area to build a new McDonald’s store?” Optimality: maximize the number of customers for whom the new store would be closer than any existing store.

3
**Formal Definition**

Given a set S of sites, a set O of weighted objects, and a query range Q, find a location l ∈ Q that maximizes Σ o.weight over all o ∈ O such that ∀ s ∈ S, d(o, l) < d(o, s). We consider the L1 distance: |x1 - x2| + |y1 - y2|.
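
The definition can be read as a function that scores one candidate location. The following is a minimal sketch, assuming that "closer" means strictly closer; the names `l1` and `influence` and the coordinates are illustrative and are not taken from the slides.

```python
from typing import List, Tuple

Point = Tuple[float, float]

def l1(a: Point, b: Point) -> float:
    """L1 (Manhattan) distance between two points."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

def influence(l: Point, sites: List[Point], objects: List[Tuple[Point, float]]) -> float:
    """Total weight of the objects that are strictly closer to l than to every site."""
    total = 0
    for pos, weight in objects:
        if all(l1(pos, l) < l1(pos, s) for s in sites):
            total += weight
    return total

# Hypothetical layout (not the coordinates used in the slides).
sites = [(0, 0), (10, 0)]
objects = [((2, 3), 4), ((5, 8), 3), ((6, 4), 5), ((8, 5), 6)]
```

With this layout, `influence((7, 5), sites, objects)` adds up the weights 3, 5 and 6, because only the object at (2, 3) stays closer to an existing site.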

5
**Example**

[Figure: query range Q with weighted objects o1:4, o2:3, o3:5, o4:6 and existing sites s1, s2.]

6
**Example**

[Figure: Q, the objects o1:4, o2:3, o3:5, o4:6, the sites s1, s2, and a candidate location l1.] The “influence” of l1 is 5 + 6 = 11.

7
**Example**

[Figure: Q, the objects o1:4, o2:3, o3:5, o4:6, the sites s1, s2, and two candidate locations l1 and l2.] The “influence” of l1 is 5 + 6 = 11. The influence of l2 is 5.

8
**Content**

- Problem Definition
- Straightforward Solution
- Problem Transformation
- The R-tree-based Solution
- The OL-tree
- The VOL-tree
- Performance

9
**Using the RNN Algorithm…**

[Figure: candidate location l1 with the objects o1:4, o2:3, o3:5, o4:6 and the sites s1, s2.] The RNNs (reverse nearest neighbors) of l1 are o3 and o4.

10
**Straightforward Solution**

[Figure: the objects o1:4, o2:3, o3:5, o4:6 and the sites s1, s2.] Compute the influence of every location in Q. Problem: there are infinitely many candidate locations!

12
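**nn_buffer of an Object**

[Figure: the nn_buffer of O4, drawn among the objects O1:4, O2:3, O3:5, O4:6 and the sites S1, S2.] A site built at any location within an object's nn_buffer would be closer to that object than its current nearest site. Under the L1 distance, the nn_buffer is a diamond.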
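
As a rough illustration, an nn_buffer can be represented by its center (the object) and a radius (the L1 distance to the object's nearest site). The helper names `nn_buffer`, `in_buffer` and `corners` are assumptions made for this sketch, as is the choice to exclude the boundary.

```python
def l1(a, b):
    """L1 (Manhattan) distance between two points."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

def nn_buffer(obj, sites):
    """Return (center, radius): the L1 ball around obj that reaches its nearest site."""
    radius = min(l1(obj, s) for s in sites)
    return obj, radius

def in_buffer(l, buffer):
    """True if a site built at l would be strictly closer to the object."""
    center, radius = buffer
    return l1(center, l) < radius

def corners(buffer):
    """The four corners of the diamond: left, top, right, bottom."""
    (cx, cy), r = buffer
    return [(cx - r, cy), (cx, cy + r), (cx + r, cy), (cx, cy - r)]

# Hypothetical coordinates, for illustration only.
buf = nn_buffer((6, 4), [(0, 0), (10, 0)])
```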

13
**Problem Transformation**

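[Figure: Q with the nn_buffers of O1:4, O3:5, O4:6 and the sites S1, S2; the region where the nn_buffers overlap most is marked “Any location here is an optimal location!”] Find a location in Q with maximum overlap among the objects’ nn_buffers, where each nn_buffer counts with its object's weight.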

14
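**The Rotated Coordinate**

[Figure: the original axes X, Y and the axes X', Y' rotated by 45°; an object o has coordinates (x, y) in the original system and (x', y') in the rotated one.] Rotate the coordinate system by 45°. All nn_buffers become axis-parallel squares. From here on, focus on the rotated coordinate system.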
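
One way to see why the diamonds become squares is the map (x, y) → (x + y, y − x), which is a 45° rotation scaled by √2. Under it, the L1 distance in the original coordinates equals the larger of the two coordinate differences in the new ones. The sketch below assumes this scaled form (the slides do not say whether they scale), and the function names are illustrative.

```python
def rotate(p):
    """Rotate by 45 degrees and scale by sqrt(2): (x, y) -> (x + y, y - x)."""
    x, y = p
    return (x + y, y - x)

def to_square(center, radius):
    """Map an L1 diamond (center, radius) to an axis-parallel square (x1, y1, x2, y2)."""
    cx, cy = rotate(center)
    return (cx - radius, cy - radius, cx + radius, cy + radius)

def in_square(p, sq):
    """True if p lies in the closed square sq."""
    return sq[0] <= p[0] <= sq[2] and sq[1] <= p[1] <= sq[3]

def l1(a, b):
    """L1 (Manhattan) distance between two points."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

# Hypothetical diamond: center (6, 4), radius 8.
square = to_square((6, 4), 8)
```

A point lies in the diamond exactly when its rotated image lies in the square, so the max-overlap search can be run on squares and the answer mapped back with x = (x' − y') / 2, y = (x' + y') / 2.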

16
**The R-tree-based Solution**

Store the objects in an R-tree. Retrieve the objects whose nn_buffers intersect Q. Plane sweep to find a region which has maximum overlap.

17
**Two Contributions**

Object retrieval: store point objects, but retrieve nn_buffers in increasing order of lower X. Plane sweep: the straightforward method takes O(n²); our method takes O(n log n).

18
**Best-first Retrieval**

Keep a heap of index entries and objects, sorted in increasing order of the nn_buffer’s lower X. While the heap is not empty, pop an entry. If it is an object, send it to the plane sweep. If it is an index entry, push those of its children that intersect Q.
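
The loop above can be sketched with Python's `heapq`. This is only an illustration on a toy in-memory tree: it assumes each index entry carries a bounding box of the nn_buffers below it (the slides do not say how the lower-X key of an index entry is obtained), and `leaf`, `node` and `best_first` are hypothetical names.

```python
import heapq
from itertools import count

def intersects(a, b):
    """True if rectangles a and b, given as (x1, y1, x2, y2), overlap or touch."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]

def leaf(x1, y1, x2, y2, weight):
    """Hypothetical object entry: one nn_buffer square with its weight."""
    return {"mbr": (x1, y1, x2, y2), "weight": weight}

def node(children):
    """Hypothetical index entry whose box bounds all nn_buffers below it."""
    boxes = [c["mbr"] for c in children]
    mbr = (min(b[0] for b in boxes), min(b[1] for b in boxes),
           max(b[2] for b in boxes), max(b[3] for b in boxes))
    return {"mbr": mbr, "children": children}

def best_first(root, q):
    """Yield (square, weight) for nn_buffers intersecting q, by increasing lower X."""
    tie = count()  # tie-breaker so the heap never has to compare dicts
    heap = [(root["mbr"][0], next(tie), root)]
    while heap:
        _, _, entry = heapq.heappop(heap)
        if "children" not in entry:
            yield entry["mbr"], entry["weight"]  # would be handed to the plane sweep
            continue
        for child in entry["children"]:
            if intersects(child["mbr"], q):
                heapq.heappush(heap, (child["mbr"][0], next(tie), child))

# Hypothetical data: four squares in two index entries, query range (0, 0, 10, 10).
root = node([node([leaf(2, 2, 6, 6, 4), leaf(20, 20, 22, 22, 6)]),
             node([leaf(5, 1, 9, 5, 3), leaf(0, 4, 3, 8, 5)])])
out = list(best_first(root, (0, 0, 10, 10)))
```

Because a child's lower X is never smaller than its parent's, objects leave the heap in non-decreasing order of lower X, which is the order a left-to-right plane sweep needs.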

19
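**Naïve Plane Sweep**

[Figure: the squares of O1:4, O2:3, O3:5, O4:6 swept along X; the Y axis is cut at 2, 5, 8, 9, 12 into intervals running from -∞ to +∞, and each interval stores its current overlap.]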

20
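**Not Efficient!**

O(n²). [Figure: Y intervals with boundaries -∞, 2, 5, 8, 9, 12, +∞.] Suppose the next insertion adds 2 to the Y-range [2, 11]. [Figure: a new boundary is inserted at 11, and the +2 has to be applied to every interval inside [2, 11].]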
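
A flat list of intervals makes the cost visible: one insertion may touch every interval it spans, so n insertions cost O(n²) in the worst case. The class below is a minimal sketch of such a flat structure; `FlatProfile` and the sample ranges are illustrative, not the values in the figure.

```python
import bisect

class FlatProfile:
    """Step function over Y kept as sorted boundaries plus one value per interval."""

    def __init__(self):
        self.bounds = [float("-inf"), float("inf")]
        self.values = [0]  # values[i] applies on [bounds[i], bounds[i + 1])

    def _split(self, y):
        """Make y a boundary (if it is not one already) and return its index."""
        i = bisect.bisect_left(self.bounds, y)
        if self.bounds[i] != y:
            self.bounds.insert(i, y)
            self.values.insert(i, self.values[i - 1])  # inherit the enclosing value
        return i

    def add(self, lo, hi, w):
        """Add w on [lo, hi); touches every interval in the range, hence O(n)."""
        i = self._split(lo)
        j = self._split(hi)
        for k in range(i, j):
            self.values[k] += w

    def max_overlap(self):
        return max(self.values)

# Hypothetical insertions.
p = FlatProfile()
p.add(2, 9, 4)
p.add(5, 12, 3)
p.add(8, 11, 2)
```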

21
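**The aSB-tree**

Extended from the SB-tree [YW01]: keeps max-overlap information at index entries, and handles a query range Q. [Figure: a two-level tree; the index level has boundaries -∞, 5, 9, +∞ and the leaf level has boundaries -∞, 2, 5, 8, 9, 12, +∞.]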

22
**The aSB-tree**

Suppose the next insertion adds 2 to the Y-range [2, 11]. [Figure, built up over slides 22 to 24: the +2 is applied through the index level (boundaries -∞, 5, 9, +∞) down to the leaf level, which gains a new boundary at 11 and ends with boundaries -∞, 2, 5, 8, 9, 11, 12, +∞.]
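
The sketch below is not the aSB-tree. It swaps in a plain in-memory segment tree that follows the same idea of keeping an added value and a max at internal nodes, so that each insertion costs O(log n) and the whole sweep O(n log n). It ignores the query range Q, treats squares as half-open, and all names are illustrative.

```python
class MaxAddTree:
    """Segment tree over m elementary Y slots: range add, global max."""

    def __init__(self, m):
        self.m = m
        self.add = [0] * (4 * m)   # weight added to the whole range of a node
        self.best = [0] * (4 * m)  # max within the node, including its own add

    def update(self, lo, hi, w, node=1, left=0, right=None):
        """Add w to slots lo .. hi - 1."""
        if right is None:
            right = self.m
        if hi <= left or right <= lo:
            return
        if lo <= left and right <= hi:
            self.add[node] += w  # recorded here, not pushed further down
            self.best[node] += w
            return
        mid = (left + right) // 2
        self.update(lo, hi, w, 2 * node, left, mid)
        self.update(lo, hi, w, 2 * node + 1, mid, right)
        self.best[node] = self.add[node] + max(self.best[2 * node],
                                               self.best[2 * node + 1])

    def top(self):
        return self.best[1]

def max_overlap(squares):
    """squares: (x1, y1, x2, y2, weight), half-open. Returns the max total weight."""
    ys = sorted({y for s in squares for y in (s[1], s[3])})
    slot = {y: i for i, y in enumerate(ys)}
    events = []
    for x1, y1, x2, y2, w in squares:
        events.append((x1, 1, slot[y1], slot[y2], w))   # left edge: insert
        events.append((x2, 0, slot[y1], slot[y2], -w))  # right edge: remove first
    events.sort()
    tree = MaxAddTree(max(1, len(ys) - 1))
    best = 0
    for _, _, lo, hi, w in events:
        tree.update(lo, hi, w)
        best = max(best, tree.top())
    return best

# Hypothetical squares.
result = max_overlap([(2, 2, 6, 6, 4), (5, 1, 9, 5, 3), (0, 4, 3, 8, 5)])
```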

26
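**The OL-tree**

Idea: partition the space, and keep the max-overlap region for each partition! Like a k-d-B-tree. Stores nn_buffers. [Figure: an nn_buffer laid over four partitions numbered 1, 2, 3, 4.] An nn_buffer may have multiple copies. Partition 1, which the nn_buffer fully covers: add its weight to fullcover. Partitions 2, 3, 4, which it covers only partly: recursively insert.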

27
**Stored Information**

An index entry has, besides its range:

- fullcover: the total weight of the nn_buffers fully covering the whole area;
- localmax: the max overlap among the nn_buffers inserted into the sub-tree;
- maxrange: the region where localmax occurs.

A leaf entry has a rectangle and its weight.
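
Since the children partition their parent's range, one reading of these fields is that a parent's localmax equals the largest value of fullcover + localmax over its children. The sketch below checks that reading against the (range, fullcover, localmax) values that appear in the example tree on the slides; the `Entry` class and `derived_localmax` are illustrative names, and the relation is an inference, not a statement from the slides.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Entry:
    """Hypothetical OL-tree index entry; range and maxrange are omitted here."""
    name: str
    fullcover: int
    localmax: int
    children: List["Entry"] = field(default_factory=list)

def derived_localmax(e: Entry) -> int:
    """Largest overlap contributed by buffers inserted below e, from its children."""
    return max(c.fullcover + c.localmax for c in e.children)

# Values as read from the example tree in the slides.
r3 = Entry("r3", 2, 7, [Entry("r31", 4, 3), Entry("r32", 2, 3), Entry("r33", 1, 2)])
root = Entry("r", 0, 9, [Entry("r1", 0, 4), Entry("r2", 1, 4), r3])
```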

28
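[Figure: an example OL-tree whose entries are written as (range, fullcover, localmax). Root: (r, 0, 9). Children of the root: (r1, 0, 4), (r2, 1, 4), (r3, 2, 7). Children of r3: (r31, 4, 3), (r32, 2, 3), (r33, 1, 2). Sub-trees omitted.]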

29
[Figure: the example OL-tree with the entry (r3, 2, 7) annotated.] fullcover: 2 nn_buffers fully cover r3. localmax: among those inserted into the sub-tree, the max overlap is 7. maxrange: where localmax occurred.

30
**Query Processing**

Start with the root; insert index entries into a heap. Sorting key: an upper bound of the real max overlap in the sub-tree, computed as localmax + the fullcovers of the entry and its ancestor entries. The bound is accurate if Q intersects maxrange.

31
[Figure: the example OL-tree with the path from the root (r, 0, 9) through (r3, 2, 7) to (r33, 1, 2) marked.] For r33, real max overlap = 0 + 2 + 1 + localmax = 0 + 2 + 1 + 2 = 5.

32
**Query Processing**

Start with the root; insert index entries into a heap. Sorting key: an upper bound of the real max overlap in the sub-tree, computed as localmax + the fullcovers of the entry and its ancestor entries. The bound is accurate if Q intersects maxrange. Keep a running value: the max overlap M found so far. Pruning 1: Q intersects maxrange, so the bound is the real value and the sub-tree need not be opened. Pruning 2: the upper bound of max overlap < M.
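
The search can be sketched as a best-first loop over a max-heap. This is a simplified illustration: the two boolean flags stand in for the rectangle tests against Q and are assumptions, leaf nodes that would need a plane sweep are not handled, and the tree values are those of the example figure.

```python
import heapq
from itertools import count

def entry(fullcover, localmax, children=(), in_q=True, hits_maxrange=False):
    """Hypothetical index entry; the flags replace geometric tests against Q."""
    return {"fullcover": fullcover, "localmax": localmax, "children": list(children),
            "in_q": in_q, "hits_maxrange": hits_maxrange}

def max_overlap_in_q(root):
    tie = count()  # tie-breaker so the heap never compares dicts
    best = 0       # running max overlap M
    covers = root["fullcover"]
    heap = [(-(covers + root["localmax"]), next(tie), covers, root)]
    while heap:
        neg_bound, _, covers, e = heapq.heappop(heap)
        bound = -neg_bound
        if bound < best:
            break  # pruning 2: all remaining bounds are at most this one
        if e["hits_maxrange"]:
            best = bound  # pruning 1: the bound is the real value
            continue
        if not e["children"]:
            raise NotImplementedError("leaf node: plane sweep omitted in this sketch")
        for c in e["children"]:
            if c["in_q"]:
                c_covers = covers + c["fullcover"]  # fullcovers along the path
                heapq.heappush(heap, (-(c_covers + c["localmax"]), next(tie), c_covers, c))
    return best

# Example tree; Q is assumed to meet r1 and r2 only, and to hit r2.maxrange.
r3 = entry(2, 7, [entry(4, 3, in_q=False), entry(2, 3, in_q=False),
                  entry(1, 2, in_q=False)], in_q=False)
root = entry(0, 9, [entry(0, 4), entry(1, 4, hits_maxrange=True), r3])
```

Under these assumptions the loop sets M = 0 + 1 + 4 = 5 at r2 and then stops at r1, whose bound is 4.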

33
[Figure: the example OL-tree with a query range Q.] r2 is pruned since Q intersects r2.maxrange; M = 0 + 1 + 4 = 5. r1 is pruned since its upper bound of overlap = 4 < M.

34
[Figure: the example OL-tree.] Sometimes, we need to examine a leaf node. Plane sweep it!

35
**OL-tree → VOL-tree**

The OL-tree is not practical: worst-case space complexity O(n²), and complex re-organization. How to improve? Only keep a few top levels of the OL-tree. ==> Virtual OL-tree!

36
VOL-tree

37
**Example**

[Figure: a VOL-tree partition with a query range Q.] If Q is here, perform a range search on the R-tree.

38
**Comparison with R-tree Approach**

The R-tree approach examines all nn_buffers intersecting Q. By using a small, in-memory VOL-tree, the new approach can prune the search space.

39
**Challenge**

[Figure: a partition with its maxrange and a new nn_buffer.] To insert an nn_buffer here, recompute! With dynamic updates, keeping localmax and maxrange exact is expensive.

40
**Solution**

Index entry: (range, fullcover, maxrange, localmax), with localmax replaced by the two bounds lowermax and uppermax, where lowermax ≤ localmax ≤ uppermax.

41
**Solution**

Index entry: (range, fullcover, maxrange, localmax), with localmax replaced by the two bounds lowermax and uppermax, where lowermax ≤ localmax ≤ uppermax. Any location in maxrange has overlap = lowermax. At a location outside maxrange, the overlap can be more than lowermax, but < uppermax.

42
**Update**

Case 1: the new nn_buffer does not intersect maxrange. Increase uppermax. Case 2: the new nn_buffer intersects maxrange. Increase uppermax and lowermax.
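
The two cases can be sketched as follows. The dictionary layout and function names are illustrative. Shrinking maxrange to the overlapped part in case 2 is an assumption added here so that every location in maxrange still has overlap equal to lowermax; the slides do not state it.

```python
def intersect(a, b):
    """Intersection of rectangles (x1, y1, x2, y2), or None if they do not overlap."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    return (x1, y1, x2, y2) if x1 < x2 and y1 < y2 else None

def update(entry, buffer, weight):
    """Apply one new nn_buffer (a square with a weight) to a VOL-tree index entry."""
    entry["uppermax"] += weight  # both cases: the overlap may grow somewhere
    common = intersect(entry["maxrange"], buffer)
    if common is not None:       # case 2: the buffer intersects maxrange
        entry["lowermax"] += weight
        entry["maxrange"] = common  # assumption: keep only the part now covered

# Hypothetical entry and insertions.
e = {"maxrange": (0, 0, 4, 4), "lowermax": 3, "uppermax": 3}
update(e, (10, 10, 12, 12), 2)  # case 1
after_case1 = dict(e)
update(e, (2, 2, 6, 6), 4)      # case 2
```

After case 1 the two bounds differ, which is why a query that meets maxrange can no longer always prune the entry.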

43
**Query**

Similar to the OL-tree. To compute the upper bound of max overlap, use uppermax. When Q intersects maxrange, the entry may or may not be pruned.

45
**Setup**

Data: Digital Chart from the R-tree Portal. O: 24,493 populated places. S: 9,203 cultural landmarks. Page size: 1 KB. Buffer size: 256 pages. Object R-tree: 753 pages. Pentium IV Dell PC, 3.2 GHz. Java. Measure: total I/O of 100 random queries.

46
Size of the VOL-tree

47
Small Query Area

48
Large Query Area

49
Varying Buffer Size

50
Effect of Update

51
**Conclusions**

Introduced the optimal-location query. Proposed three solutions. The VOL-tree approach is the best. More improvement with a larger query area (5% query area = 6 times improvement). More updates decrease the improvement (50% updates = no improvement), but the VOL-tree can be bulk-loaded. Q & A...
