CSIS 7101: Spatial Data (Part 3) Distance Browsing in Spatial Databases, GÍSLI R. HJALTASON and HANAN SAMET. Rollo Chan, Chu Chung Man, Mak Wai Yip.


1 CSIS 7101: Spatial Data (Part 3)
Distance Browsing in Spatial Databases, GÍSLI R. HJALTASON and HANAN SAMET
Rollo Chan, Chu Chung Man, Mak Wai Yip, Vivian Lee, Eric Lo, Sindy Shou, Hugh Wang

2 What is Distance Browsing?
Browsing through the database on the basis of distance from an arbitrary spatial query object
- Ranking data objects in order of their distance from a given query object
- E.g. find the nearest person to me who is sleeping
Two techniques:
- k-nearest neighbor algorithm (k-NN)
- Incremental nearest neighbor algorithm (INN)
Setting: a collection of spatial objects stored in an R-tree spatial data structure

3 Before All of Them
Requirement: Consistency
Definition: Let d be the combination of the functions d_o (distance to objects) and d_n (distance to nodes), and let e ⊏ N denote the fact that item e is contained in exactly the set of nodes N. The functions d_o and d_n are consistent iff, for any query object q and any object or node e in the hierarchical data structure, there exists n in N, where e ⊏ N, such that d(q, n) ≤ d(q, e).
[Figure: query object q and object o. The circle around q depicts the search region after reporting o as the next nearest object.]

4 Example
Find the THREE nearest neighbors to query point q in the given R-tree, using:
- k-Nearest Neighbor Search
- Incremental Nearest Neighbor Search
[Figure: line segments a to i around query point q, indexed by an R-tree with nodes R0 to R6. R0: R1, R2. R1: R3, R4. R2: R5, R6. R3: a, b. R4: d, g, h. R5: c, i. R6: e, f.]

5 k-Nearest Neighbor Search
- Applicable only when k is fixed in advance
- Maintains a global list (NearestList) of the k candidate nearest neighbors while traversing the tree depth-first
- Makes only local decisions: the next node to visit must be a child of the current node
- Uses the nearest list for pruning, by comparing against the maximum distance in the list

6 Pruning Strategies
Strategy 1: prune an entry whose bounding rectangle r1 is such that MINDIST(q, r1) > MINMAXDIST(q, r2), where r2 is some other bounding rectangle.
Strategy 2: prune an object o when DIST(q, o) > MINMAXDIST(q, r), where r is some bounding rectangle.
[Figure: query point q and a bounding rectangle r containing objects a, b and o, illustrating MINDIST (optimistic) and MINMAXDIST (pessimistic).]

7 Pruning Strategies (cont’d)
Strategies 1 and 2 are useful only when k = 1.
Strategy 3: prune any node whose bounding rectangle r is such that MINDIST(q, r) > NearestList.MaxDist.
MINDIST alone is sufficient for pruning.
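The two bounds used by the pruning strategies can be sketched for the simplest case, a query point and an axis-aligned minimum bounding rectangle. This is a minimal sketch under stated assumptions: the function names, the `((xmin, ymin), (xmax, ymax))` rectangle layout and the Euclidean metric are illustrative choices, and the MINMAXDIST formula is the usual point-to-rectangle definition, which relies on every face of the rectangle touching at least one object.

```python
import math

def mindist(q, rect):
    """Optimistic bound: smallest possible distance from point q to rect."""
    lo, hi = rect
    # Per axis, the gap is zero when q lies between the two faces.
    return math.sqrt(sum(max(l - x, 0.0, x - h) ** 2
                         for x, l, h in zip(q, lo, hi)))

def minmaxdist(q, rect):
    """Pessimistic bound: some object in rect is guaranteed within this distance."""
    lo, hi = rect
    # near[k]: coordinate of the closer face along axis k.
    near = [l if x <= (l + h) / 2 else h for x, l, h in zip(q, lo, hi)]
    # far[k]: coordinate of the farther face along axis k.
    far = [l if x >= (l + h) / 2 else h for x, l, h in zip(q, lo, hi)]
    all_far = sum((x - f) ** 2 for x, f in zip(q, far))
    # For each axis k, take the closer face on k and the farthest corner on the rest.
    best = min(all_far - (x - f) ** 2 + (x - n) ** 2
               for x, n, f in zip(q, near, far))
    return math.sqrt(best)

q = (0, 0)
r = ((1, 1), (3, 2))
print(mindist(q, r), minmaxdist(q, r))
```

For any rectangle, MINDIST never exceeds MINMAXDIST, which is why Strategy 1 can discard r1 as soon as its optimistic bound is worse than the pessimistic bound of some r2.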

8 Example – k-NN (k = 3)
Nearest List: empty, Max Dist = ∞
[Figure: the example R-tree (nodes R0 to R6, segments a to i, query point q) with tables of segment distances and bounding-rectangle distances; the table values are not recoverable. The slide highlights R4 (g, h, d) and R3 (a, b).]

9 Example – k-NN (k = 3), continued
Nearest List updates and the resulting Max Dist:
- g, h, d visited: d(59), g(81), h(17); Max Dist = 81
- a(17) enters; Max Dist = 59
- b(48) enters; Max Dist = 48
- i(21) enters; Max Dist = 21
[Figure: the example R-tree with segment and bounding-rectangle distance tables.]
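The depth-first search with a nearest list and Strategy 3 pruning can be sketched as follows. This is a simplified illustration, not the algorithm exactly as presented: it assumes point objects instead of line segments, and the `Node` class, the `knn` function and the small example tree are hypothetical.

```python
import heapq
import math
from dataclasses import dataclass, field

@dataclass
class Node:
    rect: tuple                                     # ((xmin, ymin), (xmax, ymax))
    children: list = field(default_factory=list)    # child nodes (internal node)
    points: dict = field(default_factory=dict)      # name -> (x, y) (leaf node)

def mindist(q, rect):
    lo, hi = rect
    return math.sqrt(sum(max(l - x, 0.0, x - h) ** 2
                         for x, l, h in zip(q, lo, hi)))

def knn(q, root, k):
    nearest = []  # max-heap of (-distance, name): the nearest list

    def max_dist():
        # Infinite until the list holds k candidates, as on the example slide.
        return -nearest[0][0] if len(nearest) == k else math.inf

    def visit(node):
        for name, p in node.points.items():
            d = math.dist(q, p)
            if d < max_dist():
                if len(nearest) == k:
                    heapq.heapreplace(nearest, (-d, name))  # evict the farthest
                else:
                    heapq.heappush(nearest, (-d, name))
        # Local decision: only this node's children are ordered and considered.
        for child in sorted(node.children, key=lambda c: mindist(q, c.rect)):
            if mindist(q, child.rect) > max_dist():  # Strategy 3
                continue
            visit(child)

    visit(root)
    return sorted((-neg, name) for neg, name in nearest)

# Hypothetical three-leaf tree used only to exercise the sketch.
leaf_a = Node(((0, 0), (2, 2)), points={"a": (1, 1), "b": (2, 2)})
leaf_b = Node(((5, 5), (6, 6)), points={"c": (5, 5), "d": (6, 6)})
leaf_c = Node(((10, 10), (12, 12)), points={"e": (10, 10), "f": (12, 12)})
root = Node(((0, 0), (12, 12)), children=[leaf_a, leaf_b, leaf_c])
print(knn((0, 0), root, 3))
```

In this tree the third leaf is never visited for q = (0, 0): by the time it is considered, its MINDIST already exceeds the maximum distance in the full nearest list.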

10 Problems with k-NN
- Nodes and objects are not visited in order of distance.
- It may access non-optimal objects, which then have to be pruned.
- k must be known in advance, which makes it difficult to combine with other predicates.

11 Incremental Nearest Neighbor Search
Top-down tree traversal
- Depth-first traversal
- Breadth-first traversal

12 Incremental Nearest Neighbor Search
INN uses best-first traversal
- Picks the node with the least distance among all nodes that have yet to be visited
Uses a priority queue
- The distance from the query object is the key
Makes global decisions (k-NN makes local decisions)
- Based on the priority queue
- Chooses among the child nodes of all visited nodes
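The best-first traversal can be sketched as a Python generator built on `heapq`. This is a minimal sketch under assumptions: objects are points rather than line segments, so each object is enqueued directly with its exact distance instead of first being enqueued through its bounding rectangle, and the `Node` class, the `inn` function and the example tree are hypothetical.

```python
import heapq
import itertools
import math
from dataclasses import dataclass, field

@dataclass
class Node:
    rect: tuple                                     # ((xmin, ymin), (xmax, ymax))
    children: list = field(default_factory=list)    # child nodes (internal node)
    points: dict = field(default_factory=dict)      # name -> (x, y) (leaf node)

def mindist(q, rect):
    lo, hi = rect
    return math.sqrt(sum(max(l - x, 0.0, x - h) ** 2
                         for x, l, h in zip(q, lo, hi)))

def inn(q, root):
    """Yield (name, distance) pairs in nondecreasing order of distance from q."""
    tie = itertools.count()  # tie-breaker so the heap never compares Node objects
    queue = [(mindist(q, root.rect), next(tie), root)]
    while queue:
        dist, _, item = heapq.heappop(queue)  # global decision: closest pending item
        if isinstance(item, Node):
            for name, p in item.points.items():
                heapq.heappush(queue, (math.dist(q, p), next(tie), name))
            for child in item.children:
                heapq.heappush(queue, (mindist(q, child.rect), next(tie), child))
        else:
            yield item, dist  # an object at the head of the queue is the next nearest

# Hypothetical three-leaf tree used only to exercise the sketch.
leaf_a = Node(((0, 0), (2, 2)), points={"a": (1, 1), "b": (2, 2)})
leaf_b = Node(((5, 5), (6, 6)), points={"c": (5, 5), "d": (6, 6)})
leaf_c = Node(((10, 10), (12, 12)), points={"e": (10, 10), "f": (12, 12)})
root = Node(((0, 0), (12, 12)), children=[leaf_a, leaf_b, leaf_c])

for name, dist in itertools.islice(inn((0, 0), root), 3):
    print(name, round(dist, 2))
```

Because the generator is lazy, the caller decides how many neighbors to take; asking for one more neighbor resumes the traversal from the saved queue instead of restarting the search.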

13 Example – INN
Priority Queue: R0 (0)
[Figure: the example R-tree (nodes R0 to R6, segments a to i, query point q) with segment and bounding-rectangle distance tables.]

14 Example – INN
Dequeue R0 (0); enqueue its children R1 (0) and R2 (0).
Priority Queue: R1 (0), R2 (0)

15 Example – INN
Dequeue R1 (0); enqueue its children R4 (11) and R3 (13).
Priority Queue: R2 (0), R4 (11), R3 (13)

16 Example – INN
Dequeue R2 (0); enqueue its children R5 (0) and R6 (44).
Priority Queue: R5 (0), R4 (11), R3 (13), R6 (44)

17 Example – INN
Dequeue R5 (0); enqueue the object bounding rectangles [i] (0) and [c] (53).
Priority Queue: [i] (0), R4 (11), R3 (13), R6 (44), [c] (53)

18 Example – INN
Dequeue the bounding rectangle [i] (0); enqueue the object itself with its actual distance, i (21).
Priority Queue: R4 (11), R3 (13), i (21), R6 (44), [c] (53)

19 Example – INN
Dequeue R4 (11); enqueue the object bounding rectangles [h] (17), [d] (30) and [g] (74).
Priority Queue: R3 (13), [h] (17), i (21), [d] (30), R6 (44), [c] (53), [g] (74)

20 Example – INN
Dequeue R3 (13); enqueue the object bounding rectangles [a] (13) and [b] (27).
Priority Queue: [a] (13), [h] (17), i (21), [b] (27), [d] (30), R6 (44), [c] (53), [g] (74)

21 Example – INN
Dequeue the bounding rectangle [a] (13); enqueue the object itself with its actual distance, a (17).
Priority Queue: a (17), [h] (17), i (21), [b] (27), [d] (30), R6 (44), [c] (53), [g] (74)

22 Example – INN
Report a (17) as the first nearest neighbor. Dequeue the bounding rectangle [h] (17); enqueue the object itself, h (17).
Priority Queue: h (17), i (21), [b] (27), [d] (30), R6 (44), [c] (53), [g] (74)

23 Example – INN
Report h (17) as the second and i (21) as the third nearest neighbor.
Priority Queue: [b] (27), [d] (30), R6 (44), [c] (53), [g] (74)
Result: a (17), h (17), i (21)

24 Variants
Find Farthest Object:
- Queue sorted in descending order of distance
- Replace [expression lost in the transcript]
Min and Max Distance:
- E.g. find all cities between 100 and 200 miles from Hong Kong
- Prune unqualified nodes
Solve the Traditional k-NN Problem
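Two of these variants can be sketched as thin wrappers around any iterator that yields (object, distance) pairs in increasing order of distance, which is what INN produces. This is a sketch under assumptions: the helper names are hypothetical, and the `ranked` list is a stand-in that reuses the six object distances shown in the k-NN example (objects c, e and f are omitted because their distances are not given). Unlike the variant on the slide, it filters only the output stream and does not prune nodes inside the traversal.

```python
from itertools import dropwhile, islice, takewhile

def within_range(ranked, d_min, d_max):
    """Keep objects whose distance lies in [d_min, d_max]."""
    # Skip everything closer than d_min ...
    far_enough = dropwhile(lambda pair: pair[1] < d_min, ranked)
    # ... and stop consuming the stream at the first object beyond d_max.
    return takewhile(lambda pair: pair[1] <= d_max, far_enough)

def k_nearest(ranked, k):
    """Traditional k-NN: take the first k results of the incremental search."""
    return list(islice(ranked, k))

# Stand-in for an INN result stream (distance-ordered).
ranked = [("a", 17), ("h", 17), ("i", 21), ("b", 48), ("d", 59), ("g", 81)]

print(k_nearest(iter(ranked), 3))
print(list(within_range(iter(ranked), 20, 50)))
```

Since `takewhile` stops at the first result past the upper bound, a lazy incremental search behind `ranked` would not be driven any further than the range requires.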

25 Priority Queue
Plays a key role in performance
In 2 dimensions:
- The worst case is unlikely to arise in practice
- Expected number of points in queue = O( )
- The queue usually fits in memory
In higher dimensions:
- The higher the dimension, the larger the queue size

26 Priority Queue (cont’d)
Idea:
- The priority queue is split into three tiers
- The first tier is kept in memory; the 2nd and 3rd tiers are kept in a disk file
- Each tier covers a range of distances: the first tier stores the nearest range, the 3rd tier the farthest
- When the 1st tier is exhausted, move elements in from the 2nd tier
- When the 2nd tier is exhausted, scan the remaining elements and rebuild the 1st and 2nd tiers with new ranges
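The tiering idea can be sketched with a small class. This is a simplified illustration under stated assumptions: plain Python lists stand in for the disk file, the 2nd tier covers a single distance range of fixed `width`, and the class and method names are hypothetical.

```python
import heapq

class TieredQueue:
    """Priority queue split into three tiers by distance range."""

    def __init__(self, width):
        self.width = width
        self.d1, self.d2 = width, 2 * width
        self.tier1 = []  # heap, keys < d1 (in memory)
        self.tier2 = []  # unsorted, d1 <= key < d2 (stand-in for the disk file)
        self.tier3 = []  # unsorted, key >= d2 (stand-in for the disk file)

    def push(self, key, item):
        if key < self.d1:
            heapq.heappush(self.tier1, (key, item))
        elif key < self.d2:
            self.tier2.append((key, item))
        else:
            self.tier3.append((key, item))

    def pop(self):
        if not self.tier1:
            self._refill()
        return heapq.heappop(self.tier1)  # raises IndexError when all tiers are empty

    def _refill(self):
        if self.tier2:
            # 1st tier exhausted: move the 2nd tier into memory.
            self.tier1, self.tier2 = self.tier2, []
            heapq.heapify(self.tier1)
            self.d1 = self.d2  # the 2nd-tier range is now empty
        elif self.tier3:
            # 2nd tier exhausted: scan the 3rd tier and rebuild with new ranges.
            pending, self.tier3 = self.tier3, []
            low = min(key for key, _ in pending)
            self.d1, self.d2 = low + self.width, low + 2 * self.width
            for key, item in pending:
                self.push(key, item)

    def __len__(self):
        return len(self.tier1) + len(self.tier2) + len(self.tier3)

pq = TieredQueue(width=5)
for key in [5, 1, 12, 7, 30, 3, 18, 25, 9]:
    pq.push(key, f"item{key}")
print([pq.pop()[0] for _ in range(len(pq))])
```

Only the first tier pays the cost of heap maintenance; elements in the far tiers are appended unsorted and are touched again only if the search actually gets that far.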

27 Comparison of k-NN and INN
k-NN
- Depth-first recursion
- Makes local decisions
- k is fixed
- If used when k is unknown:
  - Pick a fixed K’ and run k-NN
  - If the required k grows beyond K’, pick an m >= k and re-apply k-NN
  - Drawback: wastes computation if the chosen m is too large
INN
- Priority queue
- Makes global decisions
- The number of neighbors need not be known in advance

28 Experiment
Dataset
- Real-world data: TIGER/Line File
  - Howard: 17,421 line segments
  - Water: 37,495 line segments
  - PG: 59,551 line segments
  - Roads: 200,482 line segments
- Synthetic data
Hierarchical data structure: R*-tree
Utilizing buffered I/O
Three measures: execution time, R-tree node I/O, object distance calculations

29 Cumulative Cost of Distance Browsing

30 Incremental Cost of Distance Browsing

31 k-Nearest Neighbor Queries

32 Experimental Result
- INN outperforms k-NN in distance browsing
- For k-NN queries, the INN algorithm also performs better than the k-NN algorithm
- For a large number of neighbors, the priority queue of INN is smaller than the NearestList maintained by k-NN

33 References
Gísli R. Hjaltason, Hanan Samet, “Distance Browsing in Spatial Databases”, ACM TODS, Volume 24, Number 1, pp , March 1999
~ THE END ~

