Presentation transcript: "Top-k Spatial Joins"
1 Top-k Spatial Joins
2 Survey: What is a top-k spatial join?
3 Map overlays incur high execution cost. Retrieve the k objects. The processing of such queries is expensive. [Figure: data sets A and B]
4 Top-k Spatial Joins: (1) Apply a conventional spatial join algorithm on the two data sets A and B. (2) Count the number of output pairs in which each object participates. (3) Return the k objects with the maximum intersection counts.
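The three steps on this slide can be sketched in a few lines of Python. This is only an illustration, not the presenters' implementation: a nested-loop join stands in for the "conventional spatial join algorithm", objects are assumed to be axis-aligned rectangles given as (x1, y1, x2, y2) tuples, and the names `intersects` and `naive_top_k` are hypothetical.

```python
from collections import Counter

def intersects(r, s):
    """True if two axis-aligned rectangles (x1, y1, x2, y2) overlap or touch."""
    return r[0] <= s[2] and s[0] <= r[2] and r[1] <= s[3] and s[1] <= r[3]

def naive_top_k(a, b, k):
    """a, b: dicts mapping an object name to its rectangle (assumed layout)."""
    counts = Counter()
    # Step 1: join A and B (a nested loop stands in for a real spatial join).
    for name_a, rect_a in a.items():
        for name_b, rect_b in b.items():
            if intersects(rect_a, rect_b):
                # Step 2: count the output pairs each object participates in.
                counts[("A", name_a)] += 1
                counts[("B", name_b)] += 1
    # Step 3: return the k objects with the largest intersection counts.
    return counts.most_common(k)
```

The sketch materializes every intersecting pair before ranking, which is the cost the rest of the deck tries to avoid.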
11 Pseudocode
TS (Rtree Ra, Rtree Rb, int k)
  Join Ra and Rb to get the intersecting pairs (ea, eb)
  For each entry e that appears in a pair
    Build e.IL, compute e.count, and insert e into a heap H (sorted by e.count)
  While number of reported objects < k
    e = de-heap(H)
    If e is a leaf entry // actual object
      Report ( )
    Else // e is an intermediate entry pointing to node n
      For each ei
        Join n and ni // ni is the node pointed to by ei
        For each intersecting entry pair (e', e'i)
          Add e'i to e'.IL
      Compute e'.count
      If e'.count > pruning condition // i.e., count of the k-th best object found so far
        Insert e' into H
        If e' is a leaf entry // object
          Update pruning condition
  return
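The best-first idea behind this pseudocode can be tried out with a small Python sketch. It is a simplified variant under stated assumptions, not the slide's algorithm verbatim: it ranks only the objects of A (a semijoin), it uses a toy `Node` class instead of a real R-tree, every entry is assumed to store the number of objects beneath it (`size`) so that the sum over its intersection list gives an upper bound on the count, and `Node`, `group`, and `top_k_semijoin` are hypothetical names.

```python
import heapq
from itertools import count as tie_counter

class Node:
    """Toy R-tree entry: an MBR plus child entries (no children = an object)."""
    def __init__(self, mbr, children=(), name=None):
        self.mbr, self.children, self.name = mbr, list(children), name
        # Number of objects below this entry (1 for an object itself).
        self.size = sum(c.size for c in self.children) if self.children else 1

def intersects(r, s):
    return r[0] <= s[2] and s[0] <= r[2] and r[1] <= s[3] and s[1] <= r[3]

def group(entries):
    """Hypothetical helper: wrap entries in a parent with the enclosing MBR."""
    x1, y1, x2, y2 = zip(*(e.mbr for e in entries))
    return Node((min(x1), min(y1), max(x2), max(y2)), entries)

def top_k_semijoin(root_a, root_b, k):
    tie = tie_counter()  # tie-breaker so the heap never compares Node objects
    heap = []            # max-heap on the count upper bound (negated)
    best = []            # min-heap of the k largest exact counts seen so far

    def is_exact(e, il):
        return not e.children and all(not x.children for x in il)

    def push(e, il):
        cnt = sum(x.size for x in il)  # upper bound on any count below e
        threshold = best[0] if len(best) == k else -1
        if cnt <= threshold:
            return  # pruned: cannot beat the k-th best object found so far
        if is_exact(e, il):  # an object whose count is final
            heapq.heappush(best, cnt)
            if len(best) > k:
                heapq.heappop(best)
        heapq.heappush(heap, (-cnt, next(tie), e, il))

    push(root_a, [root_b] if intersects(root_a.mbr, root_b.mbr) else [])
    result = []
    while heap and len(result) < k:
        neg_cnt, _, e, il = heapq.heappop(heap)
        if is_exact(e, il):
            result.append((e.name, -neg_cnt))
            continue
        # Expand one level on both sides and rebuild the intersection lists.
        for child in (e.children or [e]):
            new_il = [y for x in il for y in (x.children or [x])
                      if intersects(child.mbr, y.mbr)]
            push(child, new_il)
    return result
```

Because entries leave the heap in decreasing order of their upper bound, an object whose count is already exact can be reported as soon as it is popped; nothing still in the heap can exceed it.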
12 Algorithm: visiting order, pruning condition, count
13 Multiple Expansions Method (ME)
14 Two binary search trees
15 Full join vs. semijoin
16 Comparison environment: (1) MCB x LA returns 16,477,244 intersection pairs; (2) SKEW x LA returns 19,657,973 intersection pairs.
17 Node accesses versus k (full join).
18 CPU time versus k (full join).
19 Total cost versus k (full join, 10 percent cache).
20 Node accesses versus k (semijoin).
21 CPU time versus k (semijoin).
22 Total cost versus k (semijoin, 10 percent cache).
23 Conclusions: Bottom-k queries. Top-k distance (semi) joins. Top-k nearest neighbor (semi) joins: (1) compute the NN (in A) of every object of B; (2) sort the resulting pairs (ob, oa), where oa is the NN of ob, with respect to oa; (3) report the top-k objects of A.
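The top-k nearest neighbor semijoin listed here can be sketched as follows. This is a brute-force illustration under assumptions: objects are 2D points, "top-k" is read as the k objects of A that are the nearest neighbor of the most objects of B, counting stands in for the sort-by-oa step, and the function name `top_k_nn_semijoin` is hypothetical.

```python
import math
from collections import Counter

def top_k_nn_semijoin(a, b, k):
    """a: dict mapping a name to an (x, y) point; b: list of (x, y) points."""
    counts = Counter()
    for pb in b:
        # Step 1: find the nearest neighbor (in A) of this object of B.
        nearest = min(a, key=lambda name: math.dist(a[name], pb))
        # Step 2: group the pairs (ob, oa) by oa; here we just count them.
        counts[nearest] += 1
    # Step 3: report the k objects of A with the most pairs.
    return counts.most_common(k)
```

A linear scan per object of B is used only to keep the example short; a spatial index would normally answer the nearest neighbor queries.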