# Finding the Sites with Best Accessibilities to Amenities

Qianlu Lin, Chuan Xiao, Muhammad Aamir Cheema and Wei Wang, University of New South Wales, Australia



Application

- Find an apartment that is closest to a restaurant, a bus stop and a zoo
- 'Closeness' is measured by a monotonic scoring function

[Figure: map showing apartments, restaurants, bus stops and a zoo]

Problem Definition

- Given a set of query points $S = \{s_1, s_2, \dots, s_m\}$
- Given $n$ sets of data points $T_1, T_2, \dots, T_n$
- Find the $k$ query points in $S$ whose aggregated distances to $T_1, T_2, \dots, T_n$ are smallest:

$$\mathrm{Distance}(s_j, \{T_1, T_2, \dots, T_n\}) = f\bigl(d(s_j, \mathrm{NN}(s_j, T_1)),\ d(s_j, \mathrm{NN}(s_j, T_2)),\ \dots,\ d(s_j, \mathrm{NN}(s_j, T_n))\bigr)$$

where $\mathrm{NN}(s_j, T_i)$ is the nearest neighbour of $s_j$ in $T_i$, and $d(s_j, \mathrm{NN}(s_j, T_i))$ is the distance from $s_j$ to its nearest neighbour in $T_i$.

For simplicity, we use:

- $d(x, y)$ is the Euclidean distance
- $f(x_1, x_2, \dots, x_n) = \sum_{i=1}^{n} x_i$
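The definition above can be illustrated with a brute-force sketch. It is not one of the algorithms from the slides; it only evaluates the scoring formula directly. The names `nn_distance`, `aggregated_distance` and `top_k_sites` are hypothetical, and the pluggable `f` parameter is an assumption used to show that any monotonic aggregate (here `sum` by default) fits the same shape.

```python
import math
from typing import Callable, List, Sequence, Tuple

Point = Tuple[float, float]


def nn_distance(s: Point, points: Sequence[Point]) -> float:
    """Euclidean distance from s to its nearest neighbour in one data set T_i."""
    return min(math.dist(s, t) for t in points)


def aggregated_distance(
    s: Point,
    type_sets: Sequence[Sequence[Point]],
    f: Callable[[List[float]], float] = sum,
) -> float:
    """Distance(s, {T_1..T_n}) = f(d(s, NN(s, T_1)), ..., d(s, NN(s, T_n)))."""
    return f([nn_distance(s, T) for T in type_sets])


def top_k_sites(
    sites: Sequence[Point],
    type_sets: Sequence[Sequence[Point]],
    k: int,
) -> List[Point]:
    """Return the k sites with the smallest aggregated distance (brute force)."""
    return sorted(sites, key=lambda s: aggregated_distance(s, type_sets))[:k]
```

Passing `f=max` instead of the default would rank sites by their farthest required amenity, which is also monotonic in each argument.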

Related Literature

- KNN (K Nearest Neighbour): given a query point q and a set of data points I, find the k data points in I that are nearest to q
- RNN (Reverse Nearest Neighbour): given a query point q and a set of data points I, find the data points in I whose nearest neighbour is q
- ANN (All Nearest Neighbour): given a set of query points Q and a set of data points I, find the nearest neighbour in I for each query point in Q (Y. Chen, ICDE 2007, "Efficient evaluation of all-nearest-neighbor queries")

To solve our problem, we can run an ANN query for each type and then select the top-k query points.

Our Contribution

- We introduced the problem of finding the sites with best accessibilities to amenities
- We proposed two algorithms to find the top-k accessible sites among a set of possible locations
- We performed experiments on several real datasets

Baseline

ANN is used to retrieve the nearest neighbour of each query point for each type.

[Figure: map showing apartments, restaurants, bus stops and a zoo]

Baseline - Disadvantage

- I/O time: the query data is accessed n times, where n is the number of types of index objects
- Memory usage: the NN must be found for all the query points, and a list of nearest neighbours must be maintained for each type and each query point
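The two costs listed above can be made visible in a small sketch of the baseline's control flow. This is an assumption-laden illustration: the inner nearest-neighbour search is a plain linear scan standing in for the ANN algorithm cited in the slides, and `baseline_top_k` is a hypothetical name. The point is the shape of the loop: one pass over the whole query set per type, and an m x n table kept in memory.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def baseline_top_k(
    sites: Sequence[Point],
    type_sets: Sequence[Sequence[Point]],
    k: int,
) -> Tuple[List[Point], int]:
    """Type-by-type baseline; returns the top-k sites and the number of
    passes made over the query set."""
    # One NN distance per (query point, type): the m x n table the
    # baseline has to keep around until every type has been processed.
    nn_table = [[math.inf] * len(type_sets) for _ in sites]
    passes = 0
    for i, T in enumerate(type_sets):
        passes += 1  # the full query set is read again for each type
        for j, s in enumerate(sites):
            # Linear scan used here in place of a real ANN join.
            nn_table[j][i] = min(math.dist(s, t) for t in T)
    scores = [sum(row) for row in nn_table]
    order = sorted(range(len(sites)), key=scores.__getitem__)
    return [sites[j] for j in order[:k]], passes
```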

Separate Tree (Index Construction)

[Figure: a query tree over the apartments (Q1 to Q4) and a separate index tree for each amenity type, with nodes Z1 (zoo), R1 to R4 (restaurants) and B1 to B4 (bus stops)]

Separate Tree (Query Processing)

Example: query node Q1 with candidate index nodes Z1, R1 and B1.

- MAXD = {30, 305, 309}
- MIND = {30, 0, 0}
- LBD = 30
- UBD = 644
- current_k_best = 644

Definitions:

- MAXD: maximum distance from Q1 to all the nodes in the list
- MIND: minimum distance from Q1 to all the nodes in the list
- UBD: upper bound of the summed distance
- LBD: lower bound of the summed distance

[Figure: map and trees showing Q1 to Q4 with nodes Z1, R1 and B1]
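The bounds in this example follow from summing the per-type vectors: LBD is the sum of MIND and UBD is the sum of MAXD. The sketch below checks that arithmetic against the Q1 numbers. The helpers `mindist` and `maxdist` are hypothetical and assume a point-to-rectangle setting with rectangles given as `(xmin, ymin, xmax, ymax)`; the slides do not spell out how the per-node distances are obtained.

```python
import math
from typing import Sequence, Tuple

Rect = Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax), assumed layout


def mindist(p: Tuple[float, float], r: Rect) -> float:
    """Smallest possible distance from point p to any point inside r."""
    dx = max(r[0] - p[0], 0.0, p[0] - r[2])
    dy = max(r[1] - p[1], 0.0, p[1] - r[3])
    return math.hypot(dx, dy)


def maxdist(p: Tuple[float, float], r: Rect) -> float:
    """Largest possible distance from point p to any point inside r."""
    dx = max(abs(p[0] - r[0]), abs(p[0] - r[2]))
    dy = max(abs(p[1] - r[1]), abs(p[1] - r[3]))
    return math.hypot(dx, dy)


def bounds(mind: Sequence[float], maxd: Sequence[float]) -> Tuple[float, float]:
    """(LBD, UBD) of the summed distance from the per-type MIND and MAXD."""
    return sum(mind), sum(maxd)


# Q1 from the slide: MIND = {30, 0, 0}, MAXD = {30, 305, 309}
lbd, ubd = bounds([30, 0, 0], [30, 305, 309])
```

With the slide's values this gives LBD = 30 and UBD = 644, matching the figures shown for Q1.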

Separate Tree (cont’d)

- Q3: MAXD = {30, 100, 60}, MIND = {30, 0, 0}, LBD = 30, UBD = 190
- Q4: MAXD = {300, 150, 60}, MIND = {300, 60, 30}, LBD = 360, UBD = 510
- current_k_best = 190

[Figure: query tree (Q1 to Q4) and index trees (Z1, R1 to R4, B1 to B4) with the candidate lists of Q3 and Q4]
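One way to read the role of current_k_best in this example: it tracks the k-th smallest UBD seen so far, and any query node whose LBD exceeds it cannot be among the top k. The sketch below is an interpretation under that assumption, with k = 1 to match the slide values; `KBestTracker` and its methods are hypothetical names, and refinement of a node's bounds as the search descends is deliberately left out.

```python
import heapq
from typing import List


class KBestTracker:
    """Keeps the k smallest upper bounds (UBD) seen so far."""

    def __init__(self, k: int) -> None:
        self.k = k
        self._heap: List[float] = []  # negated UBDs, so heap[0] is the largest kept

    @property
    def current_k_best(self) -> float:
        """k-th smallest UBD so far, or infinity until k bounds are known."""
        if len(self._heap) < self.k:
            return float("inf")
        return -self._heap[0]

    def offer(self, ubd: float) -> None:
        """Record the UBD of a newly examined query node."""
        if len(self._heap) < self.k:
            heapq.heappush(self._heap, -ubd)
        elif ubd < -self._heap[0]:
            heapq.heapreplace(self._heap, -ubd)

    def can_prune(self, lbd: float) -> bool:
        """A node whose lower bound exceeds current_k_best cannot be in the top k."""
        return lbd > self.current_k_best


tracker = KBestTracker(k=1)
tracker.offer(644)                 # Q1: current_k_best becomes 644
tracker.offer(190)                 # Q3: current_k_best drops to 190
q4_pruned = tracker.can_prune(360)  # Q4: LBD 360 > 190
```

Under this reading Q4 is discarded without further expansion, while Q3 (LBD = 30) stays a candidate.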

More Improvement?

- Data points of different types can be put into one bounding box to reduce I/O cost

One Tree (Index Construction)

Each node has a bitmap that indicates which types are contained in the node.

[Figure: a query tree over the apartments (Q1 to Q4) and a single index tree over all amenity types, with nodes I1 to I18]
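The per-node bitmap can be sketched with one bit per amenity type, where a parent's bitmap is the bitwise OR of its children's. The type names and bit assignments below are assumptions chosen to match the running example; the slides do not specify the encoding.

```python
import operator
from enum import IntFlag
from functools import reduce
from typing import Iterable


class AmenityType(IntFlag):
    """One bit per type of index object (assumed encoding)."""
    RESTAURANT = 1
    BUS_STOP = 2
    ZOO = 4


def node_bitmap(child_bitmaps: Iterable[AmenityType]) -> AmenityType:
    """Bitmap of an internal node: the union of its children's bitmaps."""
    return reduce(operator.or_, child_bitmaps, AmenityType(0))


def contains_type(bitmap: AmenityType, t: AmenityType) -> bool:
    """True if the subtree under this node holds at least one point of type t."""
    return bool(bitmap & t)


# A node whose children hold only restaurants and the zoo
mixed = node_bitmap([AmenityType.RESTAURANT, AmenityType.ZOO])
```

A query that still needs a bus stop could skip `mixed` entirely, since `contains_type(mixed, AmenityType.BUS_STOP)` is false.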

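One Tree (Query Processing)

Example: query node Q1 with candidate index node I1.

- MAXD = {309, 309, 309}
- MIND = {0, 0, 0}
- LBD = 0
- UBD = 309 × 3 = 927
- current_k_best = 927

[Figure: map of apartments, restaurants, bus stops and a zoo with Q1 and I1]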

One Tree (cont’d)

- Q3 (candidate nodes I4, I5): MIND = {0, 0, 30}, MAXD = {50, 50, 30}, LBD = 30, UBD = 130
- Q4 (candidate nodes I6, I5): MIND = {30, 30, 140}, MAXD = {50, 50, 140}, LBD = 100, UBD = 240
- current_k_best = 130

[Figure: query tree (Q1 to Q4) and index tree (I1 to I6) over the map of apartments, restaurants, bus stops and a zoo]

Experiments

- Datasets: San Francisco Road Network (SF) and Road Network of North America (NA)
- Spatial query dataset, 2 dimensions
- Index: ~174k points in total
- Query: ~17k points
- Algorithms: Baseline, Separate Tree, One Tree
- Measurements: CPU time, number of leaf node accesses (I/O time)

Results (CPU time vs. k)

Results (CPU time vs. |T|)

Results (number of leaf nodes vs. k)

Results (number of leaf nodes vs. |T|)

Conclusion

- We proposed two algorithms:
  - Separate tree: creates indexes for the different types of points in separate R-trees
  - One tree: indexes all the points in a single R-tree
- Both algorithms outperform the baseline algorithm, with a speed-up of up to 5.7 times
- Both algorithms also need to access the query tree only once, which reduces the I/O cost of accessing the query tree

Thank you! Questions?
