

1 Location-based Spatial Queries. ACM SIGMOD 2003. Jun Zhang §, Manli Zhu §, Dimitris Papadias §, Yufei Tao †, Dik Lun Lee §. Department of Computer Science, § Hong Kong University of Science and Technology and † Carnegie Mellon University. Presented by Sreepraveen Veeramachaneni

2 11/18/2002 Outline: Introduction; General techniques in spatial query processing; Motivation; Background; Location-based nearest neighbor search; Location-based window queries; Experiments; Summary; References

3 Introduction: The paper proposes an approach that enables mobile clients to determine the validity of previous queries based on their current location. Two types of spatial queries are discussed: (1) window queries and (2) nearest neighbor queries.

4 Techniques in Use: Spatial databases have been studied extensively during the last two decades, and several spatial access methods have been proposed. The most popular is the R-tree and its variations, such as the R*-tree. R-trees can be viewed as multi-dimensional extensions of B-trees.

5 R-tree: The example assumes a capacity of 3 entries per node. Points that are close in space are clustered in the same leaf node, which is represented by a minimum bounding rectangle (MBR). Nodes are then recursively grouped together following the same principle up to the top level, which consists of a single root.
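
As a minimal sketch of the grouping idea on this slide, the following Python computes the MBR of the points in a leaf and then the MBR that encloses a group of child nodes. The function names and the (xmin, ymin, xmax, ymax) tuple layout are illustrative assumptions, not something taken from the paper.

```python
from typing import List, Tuple

Point = Tuple[float, float]
Rect = Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax), an assumed layout


def mbr_of_points(points: List[Point]) -> Rect:
    """Smallest axis-aligned rectangle enclosing the points of one leaf node."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return (min(xs), min(ys), max(xs), max(ys))


def mbr_of_rects(rects: List[Rect]) -> Rect:
    """Smallest rectangle enclosing the MBRs of several child nodes."""
    return (
        min(r[0] for r in rects),
        min(r[1] for r in rects),
        max(r[2] for r in rects),
        max(r[3] for r in rects),
    )


# One leaf holding three nearby points (capacity 3, as on the slide).
leaf_mbr = mbr_of_points([(1, 2), (3, 1), (2, 5)])
# The parent entry encloses this leaf and a second, hypothetical leaf.
parent_mbr = mbr_of_rects([leaf_mbr, (4, 0, 6, 2)])
```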

6 R-tree (contd.): The R-tree is used to answer window queries. Another important type of spatial query is the nearest neighbor query, which retrieves the data point that is closest to a query point.

7 Branch and Bound Algorithm: Roussopoulos et al. proposed a branch-and-bound algorithm that searches the R-tree in a depth-first manner. Starting from the root, all entries are sorted according to their minimum distance (mindist) from the query point, and the entry with the smallest value is visited first. The process is repeated recursively down to the leaf level, where the first potential nearest neighbor is found. While backtracking to upper levels, the algorithm visits only entries whose mindist is smaller than the distance of the nearest neighbor found so far.
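
The depth-first search described on this slide can be sketched as follows. This is a simplified illustration, not the original implementation: the tree is a hand-built nested dictionary (an assumption made for brevity), and the names `mindist`, `nn_search`, and `root` are hypothetical.

```python
import math


def mindist(q, rect):
    """Minimum distance from point q to rectangle (xmin, ymin, xmax, ymax)."""
    dx = max(rect[0] - q[0], 0.0, q[0] - rect[2])
    dy = max(rect[1] - q[1], 0.0, q[1] - rect[3])
    return math.hypot(dx, dy)


def nn_search(node, q, best=(math.inf, None)):
    """Depth-first branch-and-bound; returns (distance, point) of the nearest neighbor."""
    if "points" in node:  # leaf: compare against every data point
        for p in node["points"]:
            d = math.dist(q, p)
            if d < best[0]:
                best = (d, p)
        return best
    # Internal node: visit children in ascending mindist order.
    for child in sorted(node["children"], key=lambda c: mindist(q, c["mbr"])):
        # Prune entries that cannot contain anything closer than the current best.
        if mindist(q, child["mbr"]) < best[0]:
            best = nn_search(child, q, best)
    return best


# A tiny two-leaf tree, built by hand for illustration.
root = {
    "mbr": (1, 1, 9, 8),
    "children": [
        {"mbr": (1, 1, 2, 2), "points": [(1, 1), (2, 2)]},
        {"mbr": (8, 7, 9, 8), "points": [(8, 8), (9, 7)]},
    ],
}
```

With the query point (7, 7), the second leaf is visited first (mindist 1), and the first leaf is then pruned because its mindist exceeds the distance to (8, 8).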



10 Traditional Scenario: The traditional scenario in spatial databases assumes that (i) queries are static and (ii) each query returns a single output and terminates.

11 Where Is Your Nearest Restaurant? Traditional nearest neighbor search in spatial databases considers static query points.

12 What if You Move? Getting only the nearest neighbor is inadequate: when will it expire?

13 The conventional approach to obtaining up-to-date information is to pose a new query to the server after every position update, which can lead to high network overhead and extra processing effort. Moreover, because the user is highly mobile, the result may be invalidated almost immediately as the user's position changes.


15 First Spatial Query Processing Technique for Mobile Computing [Zheng, Lee, SSTD 2001]: The first technique pre-computes the Voronoi diagram of the dataset and stores it in an R-tree. Voronoi diagram: the Voronoi diagram of a collection of geometric objects is a partition of space into cells, each of which consists of the points closer to one particular object than to any other.

16 When a nearest neighbor query arrives at the server, the Voronoi diagram is used to compute the nearest neighbor efficiently.

17 In addition to the result, the server sends back to the client the validity time T of the result, which is a conservative approximation that assumes the query's speed stays below a maximum value. Problem: the maximum query speed is difficult to estimate. A high value results in a very short T, and a low value results in false misses. The method also handles only single nearest neighbor queries; retrieving k neighbors would require order-k Voronoi diagrams, which are complicated and incur a large space overhead.

18 k-Nearest Neighbor Query [Song, Roussopoulos, SSTD 2001]: Song and Roussopoulos proposed a technique that does not assume Voronoi diagrams and can be used for any number of neighbors. When a k nearest neighbor query q arrives, the server computes and returns to the client a number m > k of neighbors.

19 Implementation: Let dist(k) and dist(m) be the distances of the k-th and m-th nearest neighbors from the query point q. If the client re-issues the query at a new location q', the new k nearest neighbors will be among the m objects of the first query, provided that 2 · dist(q, q') ≤ dist(m) − dist(k).

20 Example: The figure shows a 2-nearest neighbor query at location q, where the server returns four results o, a, b and c (the two nearest neighbors are o and a). When the client moves to location q', the two nearest neighbors are o and b. If 2 · dist(q, q') ≤ dist(4) − dist(2), the client can determine this by computing the new distances (with respect to q') of the four objects, without having to issue a new query to the server.
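
The client-side test from the two slides above can be sketched in Python. The sketch assumes the cached list holds the m objects sorted by their distance from the original query point q; the function names `can_reuse` and `knn_from_cache` and the sample coordinates are made up for illustration.

```python
import math


def can_reuse(q, q_new, cached, k):
    """True if the k NNs at q_new are guaranteed to be among the cached m objects.

    `cached` must be sorted by distance from the original query point q.
    """
    dist_k = math.dist(q, cached[k - 1])  # distance of the k-th neighbor from q
    dist_m = math.dist(q, cached[-1])     # distance of the m-th neighbor from q
    return 2 * math.dist(q, q_new) <= dist_m - dist_k


def knn_from_cache(q_new, cached, k):
    """Re-rank the cached objects by distance from the new location."""
    return sorted(cached, key=lambda p: math.dist(q_new, p))[:k]


q = (0, 0)
cached = [(1, 0), (0, 2), (3, 0), (0, 6)]  # m = 4 objects returned for a 2-NN query

if can_reuse(q, (1, 0), cached, k=2):
    new_result = knn_from_cache((1, 0), cached, k=2)
```

Here dist(2) = 2 and dist(4) = 6, so any move of length at most 2 can be answered from the cache; a move to (3, 0) fails the test and would require a new server query.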

21 Problems: An obvious problem of this approach lies in obtaining a proper value of m. A high value increases the network overhead and the storage requirements at the client, while a low value may be useless (if it does not reduce the number of queries). The right m depends on factors such as data distribution and query frequency, which are difficult to estimate.

22 Time Parameterized Nearest Neighbor (TP NN) [Tao and Papadias, SIGMOD 2002]: Given a query moving with steady velocity, return all nearest neighbor results up to a future timestamp; that is, the output is a set of tuples <R_i, T_i>, where R_i is the set of nearest neighbors during the future interval T_i. The concept of time parameterized queries applies to both window queries and nearest neighbor queries. When a server receives a request from a client, it executes a TP query and returns <R, T, C>, where R is the set of objects satisfying the corresponding spatial query (the current result), T is the validity time of R, and C is the change to the result at T. From the objects in R and the objects in C that cause the change, the client can incrementally compute the next result.

23 TP Window Query: Consider a client that is moving east with speed 1 and issues a window query. The server returns a triple <R, T, C> (result, validity time, change) indicating that object b currently intersects the query window but will stop doing so after 1 time unit, and should therefore be removed from the result.

24 Influence of an Object: The result of a spatial query changes in the future because some objects "influence" its correctness. If an object (e.g., b) satisfies the query at the current time, it influences the result when it no longer satisfies the query in the future (at time 1). An object not currently in the result (e.g., d) influences the query when it becomes part of the result (at time 2). Some objects, such as a and c, may never change the result, so their influence time is set to ∞.
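
The influence time of a static point with respect to a moving window can be computed in closed form. The following sketch assumes an axis-aligned window that moves with constant velocity and a data point that does not move; the function name `influence_time` and the coordinates are hypothetical, chosen only to reproduce the times quoted on the slide (1 for b, 2 for d, ∞ for a).

```python
import math


def influence_time(window, velocity, p):
    """Earliest t >= 0 at which p's membership in the moving window changes.

    window is (xmin, ymin, xmax, ymax) at time 0, velocity is (vx, vy).
    Returns math.inf if p never enters or leaves the window in the future.
    """
    t_in, t_out = -math.inf, math.inf  # interval during which p is covered
    axes = (
        (window[0], window[2], velocity[0], p[0]),
        (window[1], window[3], velocity[1], p[1]),
    )
    for lo, hi, v, c in axes:
        if v == 0:
            if not (lo <= c <= hi):
                return math.inf  # never covered on this axis
            continue
        t1, t2 = (c - hi) / v, (c - lo) / v
        t_in = max(t_in, min(t1, t2))
        t_out = min(t_out, max(t1, t2))
    if t_in > t_out or t_out < 0:
        return math.inf  # never covered from now on
    # Currently inside: influence when it leaves; otherwise when it enters.
    return t_out if t_in <= 0 else t_in


window = (0, 0, 4, 4)
east = (1, 0)  # moving east with speed 1
t_b = influence_time(window, east, (1, 2))  # inside now, leaves later
t_d = influence_time(window, east, (6, 1))  # outside now, enters later
t_a = influence_time(window, east, (2, 6))  # never covered
```

The TP query then reports the minimum influence time over all objects as T and the objects attaining it as C.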

25 Time Parameterized Nearest Neighbor (TP NN) [Tao and Papadias, SIGMOD 2002]: Returns (i) the nearest neighbor R of the current query location, (ii) the expiry time T of R (given the query's movement), and (iii) the change C of the result at T. Result: R = {i}, T = 2, C = {j}.

26 Problem with the techniques discussed so far: All of the techniques discussed for mobile computing presuppose that the future locations of clients can be calculated from their current movement (i.e., the velocity of the client is known and constant during the lifespan of the query). In many applications, however, query velocities are continuously updated as users change their speed or direction of movement. Motivated by this, the authors introduce a technique in which the validity of the result is determined by the user's location in space instead of by time.


28 Location-Based Nearest Neighbors: Assumptions: we assume that there exists a spatial index (e.g., an R-tree) for the data objects, but no specialized structure (e.g., a Voronoi diagram) for nearest neighbor search.

29 Getting You Covered by the Nearest Restaurant: Some users (say, a tourist walking casually) cannot specify their heading direction clearly.

30 Validity Region of the Result: In addition to the nearest restaurant, we also return the validity region of this restaurant. Another query is issued to retrieve the new nearest restaurant only if the user moves out of this region.

31 Influence Points: The points that determine the validity region.

32 Influence Points: Keeping the influence points avoids the "in-polygon" check. The user only needs to check whether her/his location is closer to any influence point (the yellow points in the figure) than to a.
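
A minimal sketch of this client-side check, assuming the client keeps the returned nearest neighbor and its influence points as plain coordinate tuples. The function name `nn_still_valid` and the sample points are illustrative only.

```python
import math


def nn_still_valid(location, nn, influence_points):
    """True while no influence point is closer to the user than the cached NN."""
    d_nn = math.dist(location, nn)
    return all(d_nn <= math.dist(location, p) for p in influence_points)


nn = (0, 0)
influence_points = [(4, 0), (0, 4), (-4, 0), (0, -4)]

inside = nn_still_valid((1, 1), nn, influence_points)   # still inside the validity region
outside = nn_still_valid((3, 0), nn, influence_points)  # (4, 0) is now closer
```

Each comparison corresponds to one perpendicular bisector of the Voronoi cell, so the check costs one distance computation per influence point and needs no polygon geometry.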

33 Validity Region: A Closer Look. The validity region of q is the Voronoi cell (VC) of o.

34 Pre-Compute the Voronoi Diagram? Bad idea! To answer kNN queries for a specific value of k, an order-k Voronoi diagram is necessary. If we want to answer NN, 2NN, …, 20NN queries, then 20 sets of Voronoi diagrams are necessary: huge space, and poor support for data updates. Our solution: compute the cell on the fly, using a single R-tree and supporting all values of k.

35 Relationship with Time Parameterized NN: If q moves towards l, then its nearest restaurant will change to point a at position q'. The corresponding TP query at q returns (i) o, (ii) a, and (iii) q'.
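
The position q' at which the nearest neighbor changes can be found by intersecting the movement ray with perpendicular bisectors. The sketch below is a brute-force illustration over an explicit list of candidate points, not the R-tree based TP NN algorithm of Tao and Papadias; the function name `tp_nn_along_ray` and the coordinates are assumptions.

```python
import math


def tp_nn_along_ray(q, direction, nn, others):
    """Return (t, p): travelling t along `direction` from q, p becomes as close as nn.

    `direction` is a unit vector. Returns (math.inf, None) if nn never changes.
    """
    best_t, best_p = math.inf, None
    for p in others:
        ox, oy = nn[0] - p[0], nn[1] - p[1]  # vector from p to nn
        denom = 2 * (direction[0] * ox + direction[1] * oy)
        if denom >= 0:
            continue  # not moving towards p's side of the bisector
        # Solve |q + t*d - nn|^2 = |q + t*d - p|^2 for t.
        num = (nn[0] ** 2 + nn[1] ** 2) - (p[0] ** 2 + p[1] ** 2) \
            - 2 * (q[0] * ox + q[1] * oy)
        t = num / denom
        if t < best_t:
            best_t, best_p = t, p
    return best_t, best_p


q, o = (0, 0), (1, 0)
t, a = tp_nn_along_ray(q, (1, 0), o, [(5, 0), (0, 10)])
q_prime = (q[0] + t * 1, q[1] + t * 0)
```

In this example the client reaches q' = (3, 0) after travelling 3 units, where o = (1, 0) and a = (5, 0) are equally distant.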

36 Algorithm: Step 1: find the current NN. Step 2: use TP NN queries to tighten the validity region progressively.

37 Algorithm: Step 2: use TP NN queries to tighten the validity region progressively. The algorithm issues a total of 2 · S_inf TP NN queries, where S_inf is the number of influence points. The algorithm generalizes to computing order-k Voronoi cells for arbitrary values of k (see the paper for details).

38 Extension to kNN Queries: The above method can easily be applied to k nearest neighbor queries, where the validity region is the maximal area around the query in which every point has the same set of k nearest neighbors.


40 Location-based Window Queries: Find All Close Restaurants. Some users would consider more restaurants in their vicinity. The validity region here is the region within which the query result does not change as long as the user stays inside it.

41 Location-based Window Queries: The focus f of the window query q is the centroid of the query window. The validity region V(q) of a query q is the maximal area around the query focus (i.e., f ∈ V(q)) in which the query result R(q) does not change. The points that satisfy q are called inner objects, and those outside the query window are called outer objects.

42 Location-based Window Queries: The Minkowski region of each point (e.g., a) is a rectangle (r_a) identical to the query window whose centroid lies on the corresponding point (a). While the query focus moves inside r_a, the query result always contains object a. The intersection of the inner Minkowski regions corresponds to the inner validity region.
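
The inner validity region described on this slide is an intersection of axis-aligned rectangles, which the following sketch computes directly. It assumes a window of width w and height h and covers only the inner objects; carving out the Minkowski regions of outer objects is omitted. The function names are hypothetical.

```python
def minkowski_rect(p, w, h):
    """Rectangle with the query window's size, centred on data point p."""
    return (p[0] - w / 2, p[1] - h / 2, p[0] + w / 2, p[1] + h / 2)


def inner_validity_region(inner_points, w, h):
    """Intersection of the Minkowski rectangles of all inner objects.

    Returns (xmin, ymin, xmax, ymax); while the focus stays inside it,
    every inner object remains covered by the query window.
    """
    rects = [minkowski_rect(p, w, h) for p in inner_points]
    return (
        max(r[0] for r in rects),
        max(r[1] for r in rects),
        min(r[2] for r in rects),
        min(r[3] for r in rects),
    )


region = inner_validity_region([(1, 1), (2, 3)], w=4, h=4)
```

For the two inner points above, the focus may move anywhere within (0, 1) to (3, 3) without losing either point from the result.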

43 The Validity Region of Window Queries: If the user location is at the boundary of the validity region, the corresponding query window's boundary will cross some data point.


47 The Influence Points: In addition to the query result {a, b, c}, the user also receives 2 inner influence points {a, b} and 2 outer influence points {d, e}. The original result is invalidated if the query window (i) no longer covers one of the inner influence points or (ii) covers any outer influence point. The user does not need to store the actual boundary of the validity region.
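
The invalidation rule on this slide can be sketched as a client-side check. The sketch assumes a window of width w and height h centred on the user's current location (the focus); the function names and coordinates are illustrative, not from the paper.

```python
def covers(focus, w, h, p):
    """True if the window of size w x h centred on `focus` contains point p."""
    return abs(p[0] - focus[0]) <= w / 2 and abs(p[1] - focus[1]) <= h / 2


def window_result_valid(focus, w, h, inner_influence, outer_influence):
    """The cached result holds while every inner influence point is covered
    and no outer influence point is covered."""
    keeps_inner = all(covers(focus, w, h, p) for p in inner_influence)
    avoids_outer = not any(covers(focus, w, h, p) for p in outer_influence)
    return keeps_inner and avoids_outer


inner = [(1, 1), (2, 3)]  # hypothetical inner influence points
outer = [(5, 2)]          # hypothetical outer influence point

ok = window_result_valid((1.5, 2), 4, 4, inner, outer)
gained = window_result_valid((3, 2), 4, 4, inner, outer)   # window now reaches (5, 2)
lost = window_result_valid((1.5, 4), 4, 4, inner, outer)   # window no longer covers (1, 1)
```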

48 Retrieving the Influence Points: First get the query result {a, b, c} (a traditional window query). Then retrieve the influence points using time parameterized window queries (see the paper).


50 Experiments: Datasets: GR (23K, data space 800 km × 800 km) and NA (569K, data space 7000 km × 7000 km). Disk page size: 4K bytes. Index: R*-tree. Queries: LB kNN (parameter k) and LB WQ (parameter query length). Each workload consists of 200 queries with the same parameters, distributed uniformly in the data space.

51 Experiments: The area of the validity region drops linearly with cardinality, since the number of Voronoi cells increases while the area of the data space remains constant. Under all settings, the average number of edges in a Voronoi cell is 6 for uniform datasets, which equals the number of influence objects.

52 Experiment 1: Number of Influence Points for LB kNN. The number of influence objects decreases to 4 for k > 10. This is because, for k > 1, an influence object may contribute more than one edge (since it can form a perpendicular bisector with any of the k nearest neighbors of the query), while the total number of edges remains around 6.

53 Experiment 2: Query Cost for LB kNN. The figure shows the number of node accesses as a function of cardinality for k = 1. The number of node accesses for TP NN queries is about 12 times that of the regular nearest neighbor query because, on average, 6 TP NN queries are needed to retrieve the influence objects and another 6 to confirm the vertices of the validity region.

54 Experiment 2: Query Cost for LB kNN with a Buffer. With an LRU buffer equal to 10% of the R-tree size, the actual cost of the TP NN queries is reduced significantly, since all the queries access similar parts of the data space. Thus, given a relatively small buffer, the overhead imposed by location-based NN queries is not significant.

55 Experiment 3: Number of Influence Points for LB WQ

56 Experiment 4: Query Cost for LB WQ

57 Conclusion: Location-based queries retrieve the validity regions for the query results; we considered kNN and window queries. Future work: (i) apply the concept of validity region to other types of queries (e.g., range queries); (ii) study the incremental computation of the query result (i.e., what happens after the user exits the validity region?).

58 References: 1. Song, Z., Roussopoulos, N. K-Nearest Neighbor Search for Moving Query Point. SSTD 2001. 2. Tao, Y., Papadias, D. Time Parameterized Queries in Spatio-Temporal Databases. SIGMOD 2002. 3. Zheng, B., Lee, D. Semantic Caching in Location-Dependent Query Processing. SSTD 2001. 4. Roussopoulos, N., Kelly, S., Vincent, F. Nearest Neighbor Queries. SIGMOD 1995.

