Published by Frank Pafford. Modified about 1 year ago.

1
Nearest Neighbor Queries using R-trees
Based on notes by Yufei Tao

2
CS4482 CityU of HK
Nearest Neighbor Search
Find the object nearest to a query point q. E.g., find the gas station nearest to the red point.
k nearest neighbors: find the k objects nearest to q. E.g., 1NN = {h}, 2NN = {h, a}, 3NN = {h, a, i}.

3
Nearest Neighbor Processing
The R-tree can accelerate NN search, too. Key concept: mindist(q, E), the minimum distance between a point q and a rectangle E.
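As a minimal sketch of mindist, assuming Euclidean distance and an axis-aligned rectangle given by its lower and upper corners (the tuple layout is an assumption for illustration, not taken from the slides):

```python
import math

def mindist(q, rect):
    """Minimum Euclidean distance from point q to an axis-aligned rectangle.

    q    -- tuple of coordinates, e.g. (x, y)
    rect -- (lo, hi): the lower and upper corner of the rectangle
    """
    lo, hi = rect
    total = 0.0
    for x, l, h in zip(q, lo, hi):
        # Gap along this axis; it is zero when x lies inside [l, h].
        gap = max(l - x, 0.0, x - h)
        total += gap * gap
    return math.sqrt(total)

print(mindist((0, 0), ((1, 2), (3, 4))))  # sqrt(1 + 4), q is below and left of the rectangle
print(mindist((2, 3), ((1, 2), (3, 4))))  # 0.0, q lies inside the rectangle
```

Because every point stored under an entry lies inside the entry's rectangle, this value is a lower bound on the distance from q to any of those points.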

4
Depth-first NN Algorithm
First load the root and compute the mindist from each entry to the query. Visit the child of the entry with the smallest mindist; in this case, E6.

5
Depth-first NN Algorithm (cont.)
Do this recursively at the next level. In the child node of E6, compute the mindist from every entry to the query, and visit the child node of the entry with the smallest mindist. In this case, E1 and E2 have the same mindist, so the choice is arbitrary; say, E1 first. Among all the points in the child node of E1, find the closest point a (our current result).

6
Depth-first NN Algorithm (cont.)
Then backtrack to the child node of E6, where the entry with the next smallest mindist is E2. Its mindist √5, however, is the same as the distance from q to a. So we know that no point in E2 can possibly be closer to q than a. There is no result in E3 either, by the same reasoning.

7
Depth-first NN Algorithm (cont.)
We now backtrack to the root, where the entry with the next smallest mindist is E7. Its mindist √2 is smaller than the distance √5 from q to a. Thus, its subtree may contain some point whose distance to q is smaller than the distance between q and a, so we have to visit it. At the child node of E7, compute the mindist of all entries to q. E4 will be descended next.

8
Depth-first NN Algorithm (cont.)
In the child node of E4, we find a point h that is closer to q than a. So h becomes our new nearest neighbor. We backtrack to the child node of E7, where the entry with the next smallest mindist is E5. E5's mindist √13 is larger than the distance √2 from q to h, so we prune its subtree. The algorithm backtracks to the root and terminates.
Visited (in this order): the root, and the child nodes of E6, E1, E7, E4.
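The depth-first walkthrough above can be sketched roughly as follows. The `Node` class, the `build_tree` helper, and the coordinates are hypothetical stand-ins for a real R-tree (they are not the tree from the slides' figure); only the visiting and pruning logic follows the slides.

```python
import math

def mindist(q, rect):
    """Minimum distance from point q to axis-aligned rectangle rect = (lo, hi)."""
    lo, hi = rect
    return math.sqrt(sum(max(l - x, 0.0, x - h) ** 2 for x, l, h in zip(q, lo, hi)))

class Node:
    """Hypothetical 2-D R-tree node: a leaf holds points, an inner node holds children."""
    def __init__(self, name, points=(), children=()):
        self.name, self.points, self.children = name, list(points), list(children)
        boxes = [(p, p) for p in self.points] + [c.mbr for c in self.children]
        lo = tuple(min(b[0][i] for b in boxes) for i in range(2))
        hi = tuple(max(b[1][i] for b in boxes) for i in range(2))
        self.mbr = (lo, hi)  # bounding rectangle of everything below this node

def build_tree():
    """Made-up example data, assumed only for this sketch."""
    a = Node("A", points=[(1, 1), (2.5, 3)])
    b = Node("B", points=[(4, 2), (6, 1)])
    c = Node("C", points=[(8, 8), (9, 9)])
    return Node("root", children=[a, b, c])

def df_nn(node, q, best=(math.inf, None), visited=None):
    """Return (distance, point) of the nearest neighbor of q found below node."""
    if visited is not None:
        visited.append(node.name)
    if not node.children:                       # leaf: scan its points
        for p in node.points:
            d = math.dist(q, p)
            if d < best[0]:
                best = (d, p)
        return best
    # Inner node: try the entries in ascending order of mindist.
    for child in sorted(node.children, key=lambda c: mindist(q, c.mbr)):
        if mindist(q, child.mbr) >= best[0]:
            break                               # this entry and all later ones are pruned
        best = df_nn(child, q, best, visited)
    return best

visited = []
result = df_nn(build_tree(), (3, 2), visited=visited)
print(result)   # (1.0, (4, 2))
print(visited)  # ['root', 'A', 'B']
```

In this toy run, leaf A is visited first because its mindist is smallest, yet the final answer comes from leaf B; leaf C is pruned because its mindist exceeds the distance to the current result.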

9
Another Depth-first Example: 2NN
Difference: entries must be pruned based on the distance from q to our current 2nd NN.
Root => child node of E6 => child node of E1 => find {a, b} here.
Backtrack to child node of E6 => child node of E2 (its mindist is smaller than the distance from q to our current 2nd NN) => update our result to {a, f}.
Backtrack to child node of E6 => child node of E3 => backtrack to the root => child node of E7 => child node of E4 => update our result to {a, h}.
Backtrack to child node of E7 => prune E5 => backtrack to the root => end.
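A hedged sketch of the k-NN variant: the only change from the 1-NN case is that the pruning bound is the distance to the current k-th NN, kept here in a max-heap of negated distances. The `Node` class, `build_tree`, and the data are hypothetical and do not reproduce the slides' figure.

```python
import heapq
import math

def mindist(q, rect):
    """Minimum distance from point q to axis-aligned rectangle rect = (lo, hi)."""
    lo, hi = rect
    return math.sqrt(sum(max(l - x, 0.0, x - h) ** 2 for x, l, h in zip(q, lo, hi)))

class Node:
    """Hypothetical 2-D R-tree node: a leaf holds points, an inner node holds children."""
    def __init__(self, name, points=(), children=()):
        self.name, self.points, self.children = name, list(points), list(children)
        boxes = [(p, p) for p in self.points] + [c.mbr for c in self.children]
        lo = tuple(min(b[0][i] for b in boxes) for i in range(2))
        hi = tuple(max(b[1][i] for b in boxes) for i in range(2))
        self.mbr = (lo, hi)

def build_tree():
    """Made-up example data, assumed only for this sketch."""
    a = Node("A", points=[(1, 1), (2.5, 3)])
    b = Node("B", points=[(4, 2), (6, 1)])
    c = Node("C", points=[(8, 8), (9, 9)])
    return Node("root", children=[a, b, c])

def df_knn(node, q, k, best=None):
    """Depth-first k-NN; best is a max-heap of (-distance, point) with at most k items."""
    if best is None:
        best = []
    if not node.children:                            # leaf: try to improve the k best
        for p in node.points:
            d = math.dist(q, p)
            if len(best) < k:
                heapq.heappush(best, (-d, p))
            elif d < -best[0][0]:
                heapq.heapreplace(best, (-d, p))     # evict the current k-th NN
        return best
    for child in sorted(node.children, key=lambda c: mindist(q, c.mbr)):
        # Bound = distance to the current k-th NN (infinite until k points are known).
        bound = -best[0][0] if len(best) == k else math.inf
        if mindist(q, child.mbr) >= bound:
            break
        df_knn(child, q, k, best)
    return best

heap = df_knn(build_tree(), (3, 2), k=2)
neighbors = sorted((-neg_d, p) for neg_d, p in heap)
print([p for _, p in neighbors])  # [(4, 2), (2.5, 3)]
```

Until k candidates have been collected the bound is infinite, so nothing is pruned; afterwards the bound only shrinks, which is why breaking out of the sorted loop is safe.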

10
Optimal Performance of kNN Search
What is the best performance that any kNN algorithm can achieve?
Vicinity circle: the circle centered at the query q whose radius equals the distance from q to its k-th NN. Every node that intersects the vicinity circle must be visited.
For example, the child node of E6 must be accessed by any algorithm: although there is no result in its subtree, this cannot be verified unless we visit it.
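To make the lower bound concrete, the sketch below computes the vicinity-circle radius by brute force and lists the nodes whose rectangle intersects the circle, taken here to mean mindist <= radius. The `Node` class, `build_tree`, `vicinity_radius`, and `must_visit` are hypothetical names, and the data is made up rather than taken from the slides.

```python
import math

def mindist(q, rect):
    """Minimum distance from point q to axis-aligned rectangle rect = (lo, hi)."""
    lo, hi = rect
    return math.sqrt(sum(max(l - x, 0.0, x - h) ** 2 for x, l, h in zip(q, lo, hi)))

class Node:
    """Hypothetical 2-D R-tree node: a leaf holds points, an inner node holds children."""
    def __init__(self, name, points=(), children=()):
        self.name, self.points, self.children = name, list(points), list(children)
        boxes = [(p, p) for p in self.points] + [c.mbr for c in self.children]
        lo = tuple(min(b[0][i] for b in boxes) for i in range(2))
        hi = tuple(max(b[1][i] for b in boxes) for i in range(2))
        self.mbr = (lo, hi)

def build_tree():
    """Made-up example data, assumed only for this sketch."""
    a = Node("A", points=[(1, 1), (2.5, 3)])
    b = Node("B", points=[(4, 2), (6, 1)])
    c = Node("C", points=[(8, 8), (9, 9)])
    return Node("root", children=[a, b, c])

def all_points(node):
    """Yield every point stored below node."""
    yield from node.points
    for child in node.children:
        yield from all_points(child)

def vicinity_radius(root, q, k):
    """Distance from q to its k-th NN, found by brute force (for illustration only)."""
    return sorted(math.dist(q, p) for p in all_points(root))[k - 1]

def must_visit(node, q, radius):
    """Names of the nodes whose rectangle intersects the vicinity circle."""
    if mindist(q, node.mbr) > radius:
        return []            # children lie inside this rectangle, so they miss the circle too
    names = [node.name]
    for child in node.children:
        names.extend(must_visit(child, q, radius))
    return names

root, q = build_tree(), (3, 2)
radius = vicinity_radius(root, q, k=1)
print(radius)                       # 1.0
print(must_visit(root, q, radius))  # ['root', 'A', 'B']
```

Leaf A plays the role of the child node of E6 here: it holds no result for k = 1, but its rectangle reaches into the circle, so it cannot be skipped.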

11
Best-first Algorithm (optimal algorithm)
BF keeps all the (leaf and non-leaf) entries seen so far in memory, sorted in ascending order of their mindist. Each step processes the entry in memory with the smallest mindist.

12
Best-first Algorithm (cont.)
E6 has the smallest mindist among the root entries, so it is processed first: insert all the entries in the child node of E6 into the sorted list. E7 is the next one to be processed.

13
Best-first Algorithm (cont.)
Insert all the entries in the child node of E7 into the sorted list. The next entry to be processed is E4.

14
Best-first Algorithm (cont.)
Insert all the entries in the child node of E4 into the sorted list. The next entry to be processed is h, which is a leaf entry. This is the first NN of q.

15
Best-first Algorithm: 2NN
Assume we want 2 NNs; then the algorithm continues. Report h as the 1st NN, and remove it from the sorted list (in practice, a heap). The next entry to be processed is E1.

16
Best-first Algorithm: 2NN (cont.)
Visit the child node of E1 and insert all its entries into the sorted list. The next entry is a, which is a leaf entry. It is the 2nd NN, and the algorithm terminates.
Whenever we process a leaf entry in memory, it is the next NN for sure.
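A minimal best-first sketch, assuming Python's `heapq` as the sorted list and a generator interface so the caller decides how many neighbors to take. The `Node` class, `build_tree`, and the data are hypothetical; the sketch seeds the heap with the root node itself rather than with the root's entries, which leads to the same processing order.

```python
import heapq
import itertools
import math

def mindist(q, rect):
    """Minimum distance from point q to axis-aligned rectangle rect = (lo, hi)."""
    lo, hi = rect
    return math.sqrt(sum(max(l - x, 0.0, x - h) ** 2 for x, l, h in zip(q, lo, hi)))

class Node:
    """Hypothetical 2-D R-tree node: a leaf holds points, an inner node holds children."""
    def __init__(self, name, points=(), children=()):
        self.name, self.points, self.children = name, list(points), list(children)
        boxes = [(p, p) for p in self.points] + [c.mbr for c in self.children]
        lo = tuple(min(b[0][i] for b in boxes) for i in range(2))
        hi = tuple(max(b[1][i] for b in boxes) for i in range(2))
        self.mbr = (lo, hi)

def build_tree():
    """Made-up example data, assumed only for this sketch."""
    a = Node("A", points=[(1, 1), (2.5, 3)])
    b = Node("B", points=[(4, 2), (6, 1)])
    c = Node("C", points=[(8, 8), (9, 9)])
    return Node("root", children=[a, b, c])

def best_first(root, q, visited=None):
    """Yield (distance, point) pairs in ascending order of distance from q."""
    tie = itertools.count()   # tie-breaker so the heap never has to compare Node objects
    heap = [(mindist(q, root.mbr), next(tie), root)]
    while heap:
        d, _, item = heapq.heappop(heap)
        if not isinstance(item, Node):
            yield d, item     # a point at the head of the heap is the next NN
            continue
        if visited is not None:
            visited.append(item.name)
        for p in item.points:
            heapq.heappush(heap, (math.dist(q, p), next(tie), p))
        for child in item.children:
            heapq.heappush(heap, (mindist(q, child.mbr), next(tie), child))

visited = []
two_nn = list(itertools.islice(best_first(build_tree(), (3, 2), visited), 2))
print([p for _, p in two_nn])  # [(4, 2), (2.5, 3)]
print(visited)                 # ['root', 'A', 'B']
```

Because the generator stops as soon as the caller stops asking, k does not need to be known in advance; requesting one more neighbor simply resumes from the current heap.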

17
Best-first = Best Performance
To find the 1st NN, we visited the root and the child nodes of E6, E7, E4. To find the 2nd, in addition to the nodes above, we also visited the child node of E1. Both cases are optimal.
It can be proved that BF visits the nodes of the tree in ascending order of their mindist to the query point.

18
Retrospect: The Rationale Behind
What is the main reasoning behind the depth-first and best-first algorithms? Use mindist to bound the quality of the best point in a subtree: no point in the subtree can be closer to q than the node's mindist. If a node's mindist is already no smaller than the distance from q to our current result, prune it.
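The pruning rule can be written as a short derivation. The symbol p_k, denoting the current k-th nearest neighbor, is notation assumed here and does not appear in the slides.

```latex
% Lower-bound property: every point p stored under entry E lies inside E's rectangle.
\[
  \mathrm{mindist}(q, E) \le \mathrm{dist}(q, p)
  \quad \text{for every point } p \text{ in the subtree of } E .
\]
% Pruning: if the bound already reaches the distance of the current k-th NN p_k,
% no point under E can improve the result.
\[
  \mathrm{mindist}(q, E) \ge \mathrm{dist}(q, p_k)
  \;\Longrightarrow\;
  \mathrm{dist}(q, p) \ge \mathrm{dist}(q, p_k)
  \quad \text{for every point } p \text{ in the subtree of } E .
\]
```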
