Progressive Computation of The Min-Dist Optimal-Location Query Donghui Zhang, Yang Du, Tian Xia, Yufei Tao* Northeastern University * Chinese University.


1 Progressive Computation of The Min-Dist Optimal-Location Query Donghui Zhang, Yang Du, Tian Xia, Yufei Tao* Northeastern University *Chinese University of Hong Kong VLDB ’06, Seoul, Korea

2 Donghui Zhang et al. Optimal Location Query 2 Motivation “What is the optimal location in the Boston area to build a new McDonald’s store?” Suppose a customer drives to the closest McDonald’s. Optimality: minimize the AVG driving distance.

3 min-dist OL Without any new site: AD = (200+200+600+600)/4 = 400.

4 min-dist OL Without any new site: AD = (200+200+600+600)/4 = 400. With new site l1: AD(l1) = (30+30+600+600)/4 = 315.

5 min-dist OL Without any new site: AD = (200+200+600+600)/4 = 400. With new site l1: AD(l1) = (30+30+600+600)/4 = 315. With new site l2: AD(l2) = (200+200+30+30)/4 = 115.
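The arithmetic on these three slides can be checked with a few lines of Python. This is only a sketch of the averaging step; the function and variable names are illustrative and do not come from the slides.

```python
def average_distance(distances):
    """Mean of the per-customer driving distances."""
    return sum(distances) / len(distances)


# Distances read off the slides (one entry per customer).
ad_without_new_site = average_distance([200, 200, 600, 600])  # 400.0
ad_with_l1 = average_distance([30, 30, 600, 600])             # 315.0
ad_with_l2 = average_distance([200, 200, 30, 30])             # 115.0

# Of the two candidate locations, l2 gives the smaller average distance.
best = min(("l1", ad_with_l1), ("l2", ad_with_l2), key=lambda t: t[1])
```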

6 Formal Definition Given a set S of sites, a set O of objects, and a query range Q, the min-dist OL is a location l ∈ Q that minimizes AD(l), the average, over all objects o ∈ O, of the distance between o and its nearest site in S ∪ {l}.

7 L1 Distance d(o, s) = |o.x - s.x| + |o.y - s.y|
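A direct, brute-force reading of the definition and of the L1 distance might look like the following. It is a minimal sketch, assuming points are (x, y) tuples; the names `l1` and `average_distance` are hypothetical, and no index structure is used.

```python
def l1(a, b):
    """L1 (Manhattan) distance between two (x, y) points."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def average_distance(objects, sites, new_site=None):
    """Average distance from each object to its nearest site.

    With new_site=None this is AD; otherwise it is AD(l) for l = new_site.
    """
    all_sites = list(sites) + ([new_site] if new_site is not None else [])
    return sum(min(l1(o, s) for s in all_sites) for o in objects) / len(objects)


# Tiny made-up instance: two objects, one existing site.
objects = [(0, 0), (4, 0)]
sites = [(0, 2)]
ad = average_distance(objects, sites)            # (2 + 6) / 2
ad_l = average_distance(objects, sites, (4, 1))  # (2 + 1) / 2
```

Evaluating this for every location in Q is impossible, since Q contains infinitely many points, which is the first challenge the next slide raises.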

8 Challenges 1. There are infinitely many locations in Q. How can we produce a finite set of candidates while preserving optimality? 2. How can we avoid computing AD(l) for all candidates?

9 Solution Highlights 1. An algorithm to compute AD(l). 2. Theorems to limit the number of candidates. 3. A lower bound of AD(l) for all locations l in a cell C. 4. A progressive algorithm.

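10 1. Compute AD(l) Remember: AD(l) is the average distance from each object to its nearest site once l is added. Define: AD is the same average without any new site. Let RNN(l) be the objects “attracted” by l. AD(l) = AD if RNN(l) = ∅. Example: RNN(l) = ∅, so AD(l) = AD.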

11 1. Compute AD(l) Let RNN(l) be the objects “attracted” by l. AD(l) = AD if RNN(l) = ∅. Example: RNN(l) = {o7, o8}, so AD(l) < AD.

12 1. Compute AD(l) Let RNN(l) be the objects “attracted” by l. AD(l) = AD if RNN(l) = ∅. Otherwise AD(l) = AD - ?, where the unknown term is the average savings for the customers in RNN(l).

13 1. Compute AD(l) Theorem: AD(l) = AD - (1/|O|) Σ_{o ∈ RNN(l)} (dNN(o, S) - d(o, l)). S and O are “static” versus l: AD can be pre-computed, and so can dNN(o, S). To compute AD(l): find RNN(l); then, for each o ∈ RNN(l), compute d(o, l).
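The two steps on this slide can be sketched as follows. This is an illustration under stated assumptions rather than the authors' implementation: RNN(l) is found by a linear scan instead of an index, and the helper names (`l1`, `precompute`, `ad_incremental`) are hypothetical.

```python
def l1(a, b):
    """L1 (Manhattan) distance between two (x, y) points."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def precompute(objects, sites):
    """Pre-compute dNN(o, S) for every object, and AD. Independent of l."""
    d_nn = [min(l1(o, s) for s in sites) for o in objects]
    return d_nn, sum(d_nn) / len(objects)


def ad_incremental(objects, d_nn, ad, l):
    """AD(l) = AD minus the savings of the objects in RNN(l), averaged over O."""
    # RNN(l): objects strictly closer to l than to their current nearest site.
    rnn = [(o, d) for o, d in zip(objects, d_nn) if l1(o, l) < d]
    savings = sum(d - l1(o, l) for o, d in rnn)
    return ad - savings / len(objects), [o for o, _ in rnn]


objects = [(0, 0), (4, 0)]
sites = [(0, 2)]
d_nn, ad = precompute(objects, sites)
ad_l, rnn = ad_incremental(objects, d_nn, ad, (4, 1))
```

Only the objects in RNN(l) contribute to the correction term, which is why the pre-computed AD and dNN(o, S) values can be reused for every candidate l.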

14 2. Limit #candidates Theorem: within the X/Y range of Q, draw grid lines crossing objects. Only the intersections need to be considered!

15 2. Limit #candidates Theorem: within the X/Y range of Q, draw grid lines crossing objects. Only the intersections need to be considered! 5×6 = 30 candidates.

16 2. Limit #candidates Proof idea: suppose the OL l is not at an intersection; then moving it produces a better (or equal) result. Consider RNN(l): e.g., moving l to the right by δ saves total distance.
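A small sketch of how the candidate set could be generated from the theorem is shown below. It assumes Q is an axis-aligned rectangle given as (x_lo, y_lo, x_hi, y_hi). Adding the borders of Q as extra grid lines is an assumption of this sketch (it keeps the corners of Q among the candidates); the slide text itself mentions only lines through objects.

```python
from itertools import product


def candidate_locations(objects, q):
    """Intersections of grid lines through objects within Q's X/Y range."""
    x_lo, y_lo, x_hi, y_hi = q
    # Vertical lines: x-coordinates of objects inside the X range of Q.
    xs = sorted({x for x, _ in objects if x_lo <= x <= x_hi} | {x_lo, x_hi})
    # Horizontal lines: y-coordinates of objects inside the Y range of Q.
    ys = sorted({y for _, y in objects if y_lo <= y <= y_hi} | {y_lo, y_hi})
    return list(product(xs, ys))


# Made-up data: (2, 9) lies outside Q but still contributes the line x = 2.
candidates = candidate_locations([(1, 5), (2, 9), (7, 3)], (0, 0, 4, 4))
```

Here the vertical lines are x = 0, 1, 2, 4 and the horizontal lines are y = 0, 3, 4, giving 4×3 = 12 candidates.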

17 2. VCU(Q) A spatial region enclosing the objects closer to Q than to sites in S. It’s the Voronoi cell of Q versus the sites in S.
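Read literally, the definition says an object matters only if it is closer to Q than to its nearest existing site. A minimal sketch of that filter, assuming L1 distance, a rectangular Q given as (x_lo, y_lo, x_hi, y_hi), and hypothetical helper names, could be:

```python
def l1(a, b):
    """L1 (Manhattan) distance between two (x, y) points."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def min_l1_to_rect(o, q):
    """Smallest L1 distance from point o to any point of rectangle q."""
    x_lo, y_lo, x_hi, y_hi = q
    dx = max(x_lo - o[0], 0, o[0] - x_hi)
    dy = max(y_lo - o[1], 0, o[1] - y_hi)
    return dx + dy


def objects_in_vcu(objects, sites, q):
    """Objects strictly closer to Q than to every site in S."""
    return [o for o in objects
            if min_l1_to_rect(o, q) < min(l1(o, s) for s in sites)]


# Made-up data: (1, 1) is closer to the site at (0, 0) than to Q.
kept = objects_in_vcu([(1, 1), (5, 5), (3, 4)], [(0, 0)], (4, 4, 6, 6))
```

Objects outside VCU(Q) can never be attracted by a location in Q, so they cannot change AD(l) and need not contribute grid lines. This sketch tests membership object by object rather than constructing the region itself.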

18 2. Further Limit #candidates Only consider objects in VCU(Q). 5×6 = 30 candidates.

19 2. Further Limit #candidates Only consider objects in VCU(Q). 5×6 = 30 candidates.

20 2. Further Limit #candidates Only consider objects in VCU(Q). 4×4 = 16 candidates.

21 Naïve Algorithm Derive the candidates. Compute AD(l) for each. Pick the smallest. Not efficient: there are too many candidates, and computing AD(l) for each one requires computing RNN(l) and retrieving all of those objects.

22 Progressive Idea Treat Q as a cell and consider its corners.

23 Progressive Idea Divide the cell.

24 Progressive Idea Divide the cell.

25 Progressive Idea Recursively divide a sub-cell.

26 Progressive Idea Recursively divide a sub-cell. Able to check all candidates.

27 Progressive Idea Q: What do you save? A: Cell pruning: a cell C is pruned if its lower bound ≥ AD(l0) of some candidate l0. E.g., suppose AD(l0) = 50 and 60 is a lower bound for AD(l), l ∈ C; then C can be pruned.

28 3. LB(C): lower bound for AD(l), l ∈ C Corners of C: AD(c1) = 1000, AD(c2) = 3000, AD(c3) = 4000, AD(c4) = 2500.

29 3. LB(C): lower bound for AD(l), l ∈ C Corners of C: AD(c1) = 1000, AD(c2) = 3000, AD(c3) = 4000, AD(c4) = 2500. Theorem: LB(C), a function of the four corner values and p, is a lower bound, where p is the perimeter of C. E.g., LB(C) = 3500 - p/4.
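The formula itself did not survive in this transcript, so the following is a reconstruction that is consistent with the example, not a quotation of the theorem. It assumes c1 and c4 are diagonally opposite corners, as are c2 and c3 (the slide lists the values in two rows). Moving l by δ in L1 distance changes AD(l) by at most δ, and for any l in C the L1 distances to two opposite corners sum to p/2. Hence AD(l) ≥ (AD(c) + AD(c'))/2 - p/4 for each diagonal pair, and the larger of the two diagonal averages gives the tighter bound: (3000 + 4000)/2 = 3500.

```python
def lower_bound(ad_c1, ad_c2, ad_c3, ad_c4, width, height):
    """Diagonal-pair lower bound on AD(l) for every l in a cell.

    Assumes (c1, c4) and (c2, c3) are the two pairs of opposite corners.
    """
    perimeter = 2 * (width + height)
    best_diagonal = max((ad_c1 + ad_c4) / 2, (ad_c2 + ad_c3) / 2)
    return best_diagonal - perimeter / 4


# Corner values from the slide; the 10 x 10 cell size is made up.
lb = lower_bound(1000, 3000, 4000, 2500, width=10, height=10)  # 3500 - 40/4
```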

30 3. LB(C): lower bound for AD(l), l ∈ C A better lower bound (theorem). Compared with the previous lower bound: higher quality, since the lower bound is larger, but more computation.

31 4. The Progressive Algorithm 1. Maintain a heap of cells ordered by LB(). Initially one cell: Q. 2. Maintain the best candidate l_opt. 3. Pick the cell with minimum LB() and partition it. 4. Compute AD() for the corners of the sub-cells. 5. Compute LB() for the sub-cells. 6. Insert sub-cell c_i into the heap if LB(c_i) < AD(l_opt). 7. Go to 3.
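The seven steps above can be sketched in Python as follows. Several simplifications are assumptions of this sketch and not part of the slides: AD() is computed by a linear scan rather than through an R*-tree, each cell is split into at most four sub-cells along a middle grid line, Q's borders are added as grid lines, all objects are used without the VCU(Q) filter, and the bound is the diagonal-pair bound reconstructed from the example on the lower-bound slide. All function names are hypothetical.

```python
import heapq
from functools import lru_cache


def l1(a, b):
    """L1 (Manhattan) distance between two (x, y) points."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def progressive_ol(objects, sites, q):
    """Best-first search over grid cells; returns (location, AD(location))."""
    x_lo, y_lo, x_hi, y_hi = q
    xs = sorted({x for x, _ in objects if x_lo <= x <= x_hi} | {x_lo, x_hi})
    ys = sorted({y for _, y in objects if y_lo <= y <= y_hi} | {y_lo, y_hi})
    d_nn = [min(l1(o, s) for s in sites) for o in objects]  # independent of l

    @lru_cache(maxsize=None)
    def ad(i, j):
        """AD of the candidate at grid indices (i, j), cached."""
        loc = (xs[i], ys[j])
        return sum(min(d, l1(o, loc)) for o, d in zip(objects, d_nn)) / len(objects)

    def corners(cell):
        i0, j0, i1, j1 = cell
        return [(i0, j0), (i1, j0), (i0, j1), (i1, j1)]

    def lb(cell):
        """Assumed bound: larger diagonal average of corner ADs, minus p/4."""
        i0, j0, i1, j1 = cell
        p = 2 * ((xs[i1] - xs[i0]) + (ys[j1] - ys[j0]))
        return max(ad(i0, j0) + ad(i1, j1), ad(i1, j0) + ad(i0, j1)) / 2 - p / 4

    root = (0, 0, len(xs) - 1, len(ys) - 1)
    best = min(corners(root), key=lambda ij: ad(*ij))  # step 2
    heap = [(lb(root), root)]                          # step 1
    while heap:
        bound, (i0, j0, i1, j1) = heapq.heappop(heap)  # step 3
        if bound >= ad(*best):
            break  # no remaining cell can contain a better location
        if i1 - i0 <= 1 and j1 - j0 <= 1:
            continue  # no interior grid line: all its candidates are corners
        im, jm = (i0 + i1) // 2, (j0 + j1) // 2
        i_parts = [(i0, im), (im, i1)] if i1 - i0 > 1 else [(i0, i1)]
        j_parts = [(j0, jm), (jm, j1)] if j1 - j0 > 1 else [(j0, j1)]
        for ia, ib in i_parts:
            for ja, jb in j_parts:
                sub = (ia, ja, ib, jb)
                for ij in corners(sub):                # step 4
                    if ad(*ij) < ad(*best):
                        best = ij
                sub_lb = lb(sub)                       # step 5
                if sub_lb < ad(*best):                 # step 6
                    heapq.heappush(heap, (sub_lb, sub))
    return (xs[best[0]], ys[best[1]]), ad(*best)
```

Because cells are indexed by grid-line positions, every split lands on a candidate, and a cell with no interior grid line has nothing left to check beyond its corners, so the loop terminates.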

32 Progressiveness The algorithm quickly reports a candidate OL with a confidence interval, and keeps refining. Initially, AD(real OL) is inside the interval [LB(Q), AD(best corner of Q)].

33 Progressiveness The algorithm quickly reports a candidate OL with a confidence interval, and keeps refining. Over time, AD(real OL) is inside the interval [LB(Q), AD(best candidate)].

34 Progressiveness The algorithm quickly reports a candidate OL with a confidence interval, and keeps refining. Over time, AD(real OL) is inside the interval [min{LB(C) | C in heap}, AD(best candidate)]. The user may choose to terminate at any time.
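The interval reported at any moment can be computed from the current best candidate and the bounds of the cells still in the heap. The sketch below is illustrative only; the function names and the relative-gap stopping rule are assumptions, not something the slides specify.

```python
def confidence_interval(best_ad, heap_bounds):
    """Interval that contains AD(real OL) at the current step."""
    # Cells already pruned cannot beat best_ad, so best_ad caps the lower end.
    lower = min(min(heap_bounds, default=best_ad), best_ad)
    return lower, best_ad


def relative_gap(best_ad, heap_bounds):
    """Width of the interval relative to the current best AD."""
    lower, upper = confidence_interval(best_ad, heap_bounds)
    return (upper - lower) / upper if upper > 0 else 0.0


# Made-up numbers: best candidate has AD 50, three cells remain in the heap.
interval = confidence_interval(50, [40, 60, 45])
gap = relative_gap(50, [40, 60, 45])
```

A caller could stop refining once `relative_gap` falls below a tolerance of its choosing, which is one way to act on "terminate at any time".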

35 Batch Partitioning A cell should be partitioned into multiple sub-cells at once. Reason: computing AD(l) requires accessing the R*-tree of objects, and each access should be used to compute multiple AD(l) values. Tradeoff: partitioning too finely is wasteful, since some candidates could have been pruned.

36 Performance Setup O: 123,593 postal addresses in the northeastern part of the US, stored in an R*-tree. S: 100 sites randomly selected from O. Buffer: 128 pages. Dell Pentium IV 3.2 GHz. Query size: 1% in each dimension.

37 2. Further Limit #candidates Only consider objects in VCU(Q). 4×4 = 16 candidates.

38 Effect of VCU Computation

39 3. LB(C): lower bound for AD(l), l ∈ C Corners of C: AD(c1) = 1000, AD(c2) = 3000, AD(c3) = 4000, AD(c4) = 2500. Theorem: LB(C), a function of the four corner values and p, is a lower bound, where p is the perimeter of C. E.g., LB(C) = 3500 - p/4.

40 3. LB(C): lower bound for AD(l), l ∈ C A better lower bound (theorem). Compared with the previous lower bound: higher quality, since the lower bound is larger, but more computation.

41 Comparison of Lower Bounds

42 Effect of Batch Partitioning

43 Progressiveness The algorithm quickly reports a candidate OL with a confidence interval, and keeps refining. Over time, AD(real OL) is inside the interval [min{LB(C) | C in heap}, AD(best candidate)]. The user may choose to terminate at any time.

44 Progressiveness Each step partitions a cell into 40 sub-cells. After 200 steps, the answer is accurate. After 20 steps, the answer is within 1% of optimal.

45 Conclusions Introduced the min-dist optimal-location query. Proved theorems to limit the number of candidates. Presented lower-bound estimators. Proposed a progressive algorithm.

