3 Map overlays incur high execution cost. Retrieve the k objects. Processing such queries is expensive. [Figure: data sets A and B]
4 Top-k Spatial Joins: (1) apply a conventional spatial join algorithm on the two data sets A and B; (2) count the number of output pairs in which each object participates; (3) return the k objects with the maximum intersection counts.
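The three steps above can be sketched as follows. This is only an illustration: a nested loop stands in for the "conventional spatial join algorithm", objects are assumed to be axis-aligned rectangles `(x1, y1, x2, y2)`, and the names `intersects` and `naive_top_k_join` are hypothetical. It also assumes that objects of both A and B compete for the top-k positions.

```python
from collections import Counter
import heapq


def intersects(r, s):
    """True if two closed rectangles (x1, y1, x2, y2) share at least one point."""
    return r[0] <= s[2] and s[0] <= r[2] and r[1] <= s[3] and s[1] <= r[3]


def naive_top_k_join(a, b, k):
    """a, b: dicts mapping object id -> rectangle. Returns k ((set, id), count) items."""
    counts = Counter()
    # Step 1: spatial join (assumption: plain nested loop instead of an index-based join).
    for ia, ra in a.items():
        for ib, rb in b.items():
            if intersects(ra, rb):
                # Step 2: count the output pairs each object participates in.
                counts[("A", ia)] += 1
                counts[("B", ib)] += 1
    # Step 3: return the k objects with the largest intersection counts.
    return heapq.nlargest(k, counts.items(), key=lambda item: item[1])


A = {"a1": (0, 0, 4, 4), "a2": (5, 5, 6, 6), "a3": (10, 10, 12, 12)}
B = {"b1": (1, 1, 2, 2), "b2": (3, 3, 5, 5), "b3": (3.5, 0, 4.5, 1),
     "b4": (11, 11, 13, 13), "b5": (10, 10, 11, 11)}
print(naive_top_k_join(A, B, 1))
```

The sketch materializes the whole join result before ranking, which is the cost the slide refers to.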
11 Pseudocode
TS (Rtree R_a, Rtree R_b, int k)
  Join R_a and R_b to get the intersecting pairs (e_a, e_b)
  For each entry e that appears in a pair
    Build e.IL, compute e.count, and insert e into a heap H (sorted by e.count)
  While number of reported objects < k
    e = de-heap(H)
    If e is a leaf entry // actual object
      Report e
    Else // e is an intermediate entry pointing to node n
      For each entry e_i in e.IL
        Join n and n_i // n_i is pointed to by e_i
        For each intersecting entry pair (e', e'_i)
          Add e'_i to e'.IL
      For each entry e' of n
        Compute e'.count
        If e'.count > pruning condition // i.e., the count of the k-th best object found so far
          Insert e' into H
          If e' is a leaf entry // object
            Update pruning condition
  Return
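A minimal runnable sketch of the best-first idea in the TS pseudocode, restricted to the semijoin case (only objects of A are ranked). Everything here is an assumption made for illustration: the `Node` class is a hand-built hierarchy rather than a real R-tree, `e.count` is taken to be the total number of objects under the entries in `e.IL` (an upper bound for intermediate entries), and entries whose bound is below the k-th best exact count are pruned with `<` rather than the slide's `>` test on insertion.

```python
import heapq
from itertools import count as tie_counter


class Node:
    """Hypothetical tree entry: an object (no children) or an intermediate entry."""

    def __init__(self, rect=None, children=(), name=None):
        self.children = list(children)
        self.name = name
        if self.children:
            x1, y1, x2, y2 = zip(*(c.rect for c in self.children))
            self.rect = (min(x1), min(y1), max(x2), max(y2))  # bounding rectangle
            self.size = sum(c.size for c in self.children)    # objects underneath
        else:
            self.rect, self.size = rect, 1

    @property
    def is_leaf(self):
        return not self.children


def intersects(r, s):
    return r[0] <= s[2] and s[0] <= r[2] and r[1] <= s[3] and s[1] <= r[3]


def top_k_semijoin(root_a, root_b, k):
    tie = tie_counter()   # tie-breaker so heap tuples never compare Node objects
    heap = []             # max-heap on count (stored negated)
    best = []             # min-heap holding the k largest exact counts seen so far

    def push(e, il):
        if not il:
            return
        cnt = sum(x.size for x in il)                      # e.count
        exact = e.is_leaf and all(x.is_leaf for x in il)
        if len(best) == k and cnt < best[0]:
            return                                         # pruning condition
        heapq.heappush(heap, (-cnt, not exact, next(tie), e, il))
        if exact:                                          # update pruning condition
            if len(best) < k:
                heapq.heappush(best, cnt)
            elif cnt > best[0]:
                heapq.heapreplace(best, cnt)

    if intersects(root_a.rect, root_b.rect):
        push(root_a, [root_b])
    result = []
    while heap and len(result) < k:
        neg_cnt, inexact, _, e, il = heapq.heappop(heap)
        if not inexact:
            result.append((e.name, -neg_cnt))              # report the object
            continue
        parts = e.children or [e]                          # descend in A if possible
        cands = [c for x in il for c in (x.children or [x])]  # descend in B
        for p in parts:
            push(p, [c for c in cands if intersects(p.rect, c.rect)])
    return result


def obj(name, rect):
    return Node(rect=rect, name=name)


root_a = Node(children=[
    Node(children=[obj("a1", (0, 0, 4, 4)), obj("a2", (5, 5, 6, 6))]),
    Node(children=[obj("a3", (10, 10, 12, 12))]),
])
root_b = Node(children=[
    Node(children=[obj("b1", (1, 1, 2, 2)), obj("b2", (3, 3, 5, 5)),
                   obj("b3", (3.5, 0, 4.5, 1))]),
    Node(children=[obj("b4", (11, 11, 13, 13)), obj("b5", (10, 10, 11, 11))]),
])
print(top_k_semijoin(root_a, root_b, 2))
```

Because every count on the heap is an upper bound for the objects beneath that entry, an object popped with an exact count cannot be beaten by anything still on the heap, so the search can stop after k reports without finishing the full join.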
12 Algorithm: visiting order, pruning condition, count
22 Total cost versus k (semijoin, 10 percent cache).
23 Conclusions: Bottom-k queries. Top-k distance (semi)join. Top-k nearest neighbor (semi)joins: (1) compute the NN (in A) of every object of B; (2) sort the resulting pairs (o_b, o_a), where o_a is the NN of o_b, with respect to o_a; (3) report the top-k objects of A.
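The three steps listed for the top-k nearest neighbor semijoin could look roughly like this. It is a brute-force sketch under stated assumptions: objects are 2-D points, A is non-empty, distance is Euclidean, and `top_k_nn_semijoin` is a hypothetical name; an index-based NN search would replace the linear scan in practice.

```python
import heapq
from itertools import groupby
from math import dist


def top_k_nn_semijoin(a, b, k):
    """a, b: dicts mapping object id -> (x, y). Returns up to k (id in A, count) pairs."""
    # Step 1: compute the NN (in A) of every object of B, as pairs (o_a, o_b).
    pairs = [(min(a, key=lambda ia: dist(a[ia], pb)), ib) for ib, pb in b.items()]
    # Step 2: sort the pairs with respect to o_a so equal o_a values are adjacent.
    pairs.sort()
    counts = [(oa, sum(1 for _ in group))
              for oa, group in groupby(pairs, key=lambda p: p[0])]
    # Step 3: report the k objects of A that are the NN of the most objects of B.
    return heapq.nlargest(k, counts, key=lambda c: c[1])


A_pts = {"a1": (0, 0), "a2": (10, 10)}
B_pts = {"b1": (1, 1), "b2": (2, 0), "b3": (9, 9)}
print(top_k_nn_semijoin(A_pts, B_pts, 2))
```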