
1
**Efficient k Nearest Neighbor Queries on Remote Spatial Databases Using Range Estimation**

Danzhou Liu, Ee-Peng Lim, Wee-Keong Ng

Center for Advanced Information Systems, School of Computer Engineering, Nanyang Technological University, Nanyang Ave, Singapore

2
**Outline**

- Introduction
- Related work
- k-NN query algorithm based on range estimation
- Range estimation methods
- Experiments
- Conclusions

SSDBM2002

3
**Introduction**

- A spatial database provides persistent storage for spatial objects (e.g., points, polylines, polygons)
- A spatial database supports:
  - Representation of spatial attributes
  - Storage and indexing of spatial data values using spatial indices (e.g., R-tree and Quadtree)
  - Queries involving spatial attributes

4
**k-Nearest Neighbor Queries**

- Definition: a k-Nearest Neighbor (k-NN) query locates the k spatial objects nearest to a given query point
- Wide range of applications:
  - Geographic Information Systems (GIS), e.g., finding the nearest two hospitals
  - Computer Aided Design (CAD), e.g., finding the nearest three resistors on a circuit board
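As a point of reference for the definition above, here is a minimal brute-force sketch in Python. It assumes 2-D points, Euclidean distance, and full local access to the data, which is exactly what a remote database does not offer; the function name `knn` and the hospital coordinates are made up for illustration.

```python
import heapq
import math

def knn(points, query, k):
    """Return the k points closest to `query` by Euclidean distance."""
    return heapq.nsmallest(k, points, key=lambda p: math.dist(p, query))

# Hypothetical hospital locations; find the nearest two to the origin.
hospitals = [(0.0, 5.0), (1.0, 1.0), (4.0, 4.0), (-2.0, 0.5)]
nearest_two = knn(hospitals, (0.0, 0.0), 2)
print(nearest_two)
```

This scans every object, so it only serves as a correctness baseline for the window-query approach discussed in the rest of the talk.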

5
**Motivation**

- Large volume of spatial data on the WWW
  - Geospatial Data Clearinghouse (a collection of over 250 spatial database servers)
  - Yahoo, TIGER and other map services
- Limited Web-based query interfaces
  - Support only simple spatial queries (e.g., window queries)
  - No support for remote index access

6
**The Geospatial Data Clearinghouse**

Large amount of useful geospatial information on the WWW

7
**The Geospatial Data Clearinghouse**

Limited Web-based query interface; supports only window queries

8
**Objective**

Develop efficient algorithms to evaluate k-NN queries on remote spatial databases using window queries:

- Propose a generic k-NN query processing algorithm that accommodates different range estimation methods
- Develop efficient range estimation methods
- Conduct experiments to evaluate the performance of the proposed range estimation methods
- Develop sampling methods to obtain the statistical knowledge of remote databases needed by the range estimation methods

9
**Related Work**

Algorithms for simple k-NN queries may be divided into three major groups:

- Partition-based algorithms
- Graph-based algorithms
- Range-based algorithms

10
**Partition-based Algorithms**

- Retrieve the k nearest neighbors from spatial indices by pruning away nodes that cannot lead to the k nearest neighbors
- Examples:
  - Branch-and-bound R-tree traversal algorithm
  - Pipelined fashion algorithm
- Not applicable to the Web environment:
  - Spatial indices are usually not available to non-local applications
  - Creating local indices is infeasible due to the large amount of data

11
**Graph-based Algorithms**

- Pre-compute the nearest neighbors of spatial objects; create new index structures over the pre-computed nearest neighbor information to support search
- Example: Voronoi-based algorithm
- Not applicable to the Web environment:
  - Retrieving all spatial objects from remote database servers is sometimes impractical
  - Creating local indices is infeasible due to the large amount of data

12
**Range-based Algorithms**

- Use range queries to retrieve the k nearest neighbors
- Examples:
  - Use sampling for range estimation
  - Use distance distributions for range estimation
  - Use reference points for range estimation
- Not applicable to the Web environment:
  - Determining the sample size and selecting samples of spatial objects properly remain a challenge
  - Creating local indices is infeasible due to the large amount of data

13
**Proposed k-NN Algorithm**

- Based on range estimation
- New strategies for k-NN query evaluation in the Web environment are required
- Use window queries to probe the spatial database
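The probing loop behind this slide can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the signatures of `window_query`, `esti_range1` and `esti_range2`, the `max_iter` cap, the toy grid database, and the range-doubling re-estimate are all hypothetical. Only the overall idea (estimate a range, probe with a window query, re-estimate if fewer than k objects are found) comes from the talk.

```python
import math

def knn_by_window_queries(window_query, q, k, esti_range1, esti_range2, max_iter=50):
    """Answer a k-NN query using only window queries (a sketch).

    window_query(xmin, ymin, xmax, ymax) -> list of (x, y) points in the window.
    esti_range1(q, k) -> initial search radius.
    esti_range2(q, k, prev_range, found) -> larger radius after a failed probe.
    All three signatures are assumptions made for this example.
    """
    r = esti_range1(q, k)
    for iteration in range(1, max_iter + 1):
        # Probe with the square window that encloses the circle of radius r.
        candidates = window_query(q[0] - r, q[1] - r, q[0] + r, q[1] + r)
        # Only points inside the circle count: a point in a window corner may be
        # farther away than an unseen point just outside the window.
        inside = sorted(
            (p for p in candidates if math.dist(p, q) <= r),
            key=lambda p: math.dist(p, q),
        )
        if len(inside) >= k:
            return inside[:k], iteration
        r = esti_range2(q, k, r, len(inside))
    raise RuntimeError("no answer within max_iter window queries")


# Toy stand-in for a remote database: a 10 x 10 grid of points.
_data = [(float(x), float(y)) for x in range(10) for y in range(10)]

def toy_window_query(xmin, ymin, xmax, ymax):
    return [p for p in _data if xmin <= p[0] <= xmax and ymin <= p[1] <= ymax]

neighbors, iterations = knn_by_window_queries(
    toy_window_query, (4.2, 4.3), 3,
    esti_range1=lambda q, k: 0.5,                   # deliberately too small
    esti_range2=lambda q, k, prev, found: 2 * prev  # hypothetical: double the range
)
print(neighbors, iterations)
```

The number of passes through the loop corresponds to the number of window queries sent to the remote server, which is why a good initial estimate matters.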

14
**Density-based Range Estimation Method**

- Based on the assumption that spatial objects are uniformly distributed
- Range estimated by the EstiRange1 function: (formula missing from the transcript)
- Ranges estimated by the EstiRange2 function: (formulas missing from the transcript)
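The slide's formulas did not survive extraction, so the following is only a sketch of what a uniform-density estimate can look like, not the authors' EstiRange1 or EstiRange2. It assumes the server publishes the total object count and the covered area. Under uniformity, a circle of radius r is expected to hold density × πr² objects, so solving for k objects gives the initial radius. The re-estimate (rescale by the locally observed density, or double when nothing was found) is a hypothetical choice.

```python
import math

def esti_range1_density(k, n_objects, area):
    """Radius of the circle expected to contain k objects under uniformity."""
    density = n_objects / area
    return math.sqrt(k / (math.pi * density))

def esti_range2_density(k, prev_range, found):
    """Hypothetical re-estimate after a probe that found fewer than k objects."""
    if found == 0:
        return 2.0 * prev_range  # no local density information: just widen
    # Local density is found / (pi * prev_range**2); solve for k objects again.
    return prev_range * math.sqrt(k / found)

r1 = esti_range1_density(k=5, n_objects=1000, area=100.0)
r2 = esti_range2_density(k=4, prev_range=1.0, found=1)
print(r1, r2)
```

Because `found < k` whenever the re-estimate is needed, the rescaled radius is always strictly larger than the previous one, so the search cannot stall.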

15
**Bucket-based Range Estimation Method**

- Use summary information about partitions (buckets) of spatial objects for range estimation
- Summary information: bucket MBB (minimum bounding box) and number of spatial objects in the bucket
- Buckets are created using different strategies [1]
- Sort the buckets by the maximum distance between each bucket and the query point
- The estimated range is the smallest of these maximum distances within which at least k objects are guaranteed to lie
- Uses one window query
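The steps on this slide can be sketched in a few lines. The bucket representation (`(xmin, ymin, xmax, ymax)` MBB plus an object count), the function names, and the sample buckets are assumptions made for the example; the sort-and-accumulate logic follows the slide.

```python
import math

def max_dist(q, mbb):
    """Largest distance from point q to any point of the rectangle mbb."""
    xmin, ymin, xmax, ymax = mbb
    dx = max(abs(q[0] - xmin), abs(q[0] - xmax))
    dy = max(abs(q[1] - ymin), abs(q[1] - ymax))
    return math.hypot(dx, dy)

def esti_range_bucket(q, k, buckets):
    """Smallest bucket max-distance that is guaranteed to cover k objects.

    buckets: iterable of (mbb, count) pairs, mbb = (xmin, ymin, xmax, ymax).
    """
    covered = 0
    # Every object of a bucket lies within that bucket's max distance, so
    # accumulating counts in increasing max-distance order gives a safe bound.
    for d, count in sorted((max_dist(q, mbb), count) for mbb, count in buckets):
        covered += count
        if covered >= k:
            return d
    raise ValueError("the buckets hold fewer than k objects in total")

# Hypothetical bucket summaries.
buckets = [
    ((0.0, 0.0, 1.0, 1.0), 2),
    ((1.0, 0.0, 2.0, 1.0), 4),
    ((0.0, 1.0, 1.0, 3.0), 1),
    ((3.0, 3.0, 4.0, 4.0), 10),
]
r = esti_range_bucket((0.5, 0.5), 5, buckets)
print(r)
```

Since the returned range is guaranteed to enclose at least k objects, a single window query of that size suffices, at the cost of possibly retrieving more objects than a tighter estimate would.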

16
Example: k = 5

17
**Experiments**

New Jersey road dataset from TIGER [30]

18
**Performance measures:**

- Number of iterations h
- Accuracy A

19
**Experimental Results**

Minimum, maximum and upper bounds on the number of iterations of the density-based range estimation method

20
**Iteration and accuracy of the density-based range estimation method**


21
**Experimental Results**

Efficiency of the density-based and bucket-based range estimation methods

22
**Conclusions**

- A window query approach to evaluating k-NN queries on remote spatial databases, motivated by:
  - Large amount of spatial information on the Web
  - Limited query interfaces
- Proposed range estimation methods:
  - Performance increases with k
  - No clear winner

24
**Types of Range Estimation Methods**

- Tight estimation methods
  - The estimated range may not be large enough; i.e., both the EstiRange1 and EstiRange2 functions may be invoked
  - e.g., density-based method
- Loose estimation methods
  - The estimated range is large enough; i.e., only the EstiRange1 function is invoked
  - e.g., bucket-based method

25
**Future Work**

- Extend range estimation methods with sampling techniques to determine the data distribution
  - Current range estimation methods depend on statistical knowledge provided by database owners
  - Investigate how this statistical knowledge can be approximated through sampling
- Develop strategies for selecting the appropriate range estimation method for evaluating k-NN queries
- Develop Web applications of k-NN queries

26
**Four Strategies to Create Buckets**

- Equi-Count, Equi-Area, Min-Skew, and Min-Overlap partitioning strategies [1]
- Figure panels: Charminar dataset; spatial densities in Charminar; Equi-Area partitioning; Equi-Count partitioning; Min-Skew partitioning; Min-Overlap partitioning
