COSC 6114 Prof. Andy Mirzaian. References: [M. de Berg et al.] chapter 5. Applications: Data Base, GIS, Graphics (crop-&-zoom, windowing).


1 COSC 6114 Prof. Andy Mirzaian

2 References: [M. de Berg et al.] chapter 5. Applications: Data Base, GIS, Graphics (crop-&-zoom, windowing).

3 Orthogonal Range Search: Data Base Query
2D query rectangle (axes: date of birth, salary): $[1980,00,00 : 1989,99,99] \times [13,000 : 14,000]$
3D query orthogonal range (axes: date of birth, salary, rank): $[1980,00,00 : 1989,99,99] \times [13,000 : 14,000] \times [2 : 4]$
Example record: Mr. G. O. Meter, born Nov. 6, 1988, salary $13,600.
[Figure: the query rectangle in the (date of birth, salary) plane and the query box in (date of birth, salary, rank) space, with the example record inside the range.]

4 1D-Tree: 1-Dimensional Range Searching
Static: binary search in a sorted array.
Dynamic: store the data points in some balanced Binary Search Tree $T$.
Let the data points be $P = \{p_1, p_2, \ldots, p_n\} \subset \mathbb{R}$. $T$ is a balanced BST where the data appear at its leaves, sorted left to right. The internal nodes are used to split left and right subtrees. Assume $x(v) = \max x(L)$, where $L$ is any leaf in the left subtree of internal node $v$.
[Figure: a 1D-tree rooted at root[T] over points on the x-axis, with the query range [7 : 49] marked by x and x'.]

5 Query range $[x : x']$: call 1DRangeQuery(root[T], x, x')
ALGORITHM 1DRangeQuery(v, x, x')
  if v is a leaf
    then if x ≤ x(v) ≤ x' then report data stored at v
    else do
      if x ≤ x(v) then 1DRangeQuery(leftchild(v), x, x')
      if x(v) < x' then 1DRangeQuery(rightchild(v), x, x')
    od
end
Complexities [these are optimal]:
• Query time: $O(K + \log n)$, where $K$ is the number of leaves reported ($T, [x, x'] \to$ output)
• Construction time: $O(n \log n)$ ($P \to T$)
• Space: $O(n)$ (to store $T$)
[Figure: the search paths from root[T] to x and x', diverging at node v_split.]
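The 1DRangeQuery recursion above can be sketched in Python as follows. This is a minimal illustration rather than the course's reference code: the names `Node`, `build`, and `range_query_1d` and the sample data are made up here, and the points are assumed to be distinct.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    # Leaf: key is the data point. Internal node: key is x(v), the
    # largest value stored in the left subtree.
    key: float
    left: Optional["Node"] = None
    right: Optional["Node"] = None

    def is_leaf(self) -> bool:
        return self.left is None and self.right is None


def build(sorted_pts: List[float]) -> Node:
    """Build a balanced leaf-oriented BST from a non-empty sorted list."""
    if len(sorted_pts) == 1:
        return Node(sorted_pts[0])
    mid = (len(sorted_pts) + 1) // 2          # left half gets the extra point
    return Node(sorted_pts[mid - 1], build(sorted_pts[:mid]), build(sorted_pts[mid:]))


def range_query_1d(v: Node, lo: float, hi: float, out: List[float]) -> None:
    """Append to `out` every stored point p with lo <= p <= hi."""
    if v.is_leaf():
        if lo <= v.key <= hi:
            out.append(v.key)
        return
    if lo <= v.key:                            # range reaches into the left subtree
        range_query_1d(v.left, lo, hi, out)
    if v.key < hi:                             # range reaches into the right subtree
        range_query_1d(v.right, lo, hi, out)


tree = build([3, 10, 19, 23, 30, 37, 49, 59, 62, 70, 80, 89, 100, 105])
result: List[float] = []
range_query_1d(tree, 7, 49, result)
print(result)
```

Leaves are visited left to right, so the reported points come out in sorted order.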

6 2D-Tree
Consider dimension $d = 2$: point $p = (x(p), y(p))$, range $R = [x_1 : x_2] \times [y_1 : y_2]$.
$p \in R \iff x(p) \in [x_1 : x_2]$ and $y(p) \in [y_1 : y_2]$.
$L$ = vertical/horizontal median split line, dividing the points into $P_{left}$ and $P_{right}$. Alternate between vertical and horizontal splitting at even and odd depths. (Assume: no 2 points have equal x- or y-coordinates.)
[Figure: a query rectangle R containing point p; a point set split by a vertical or a horizontal line L into P_left and P_right; the resulting 2D-tree.]

7 Constructing 2D-Tree
Input: $P = \{p_1, p_2, \ldots, p_n\} \subset \mathbb{R}^2$, given off-line.
Output: 2D-tree storing P.
Step 1: Pre-sort P on x and on y, i.e., 2 sorted lists Û = (Xsorted(P), Ysorted(P)).
Step 2: root[T] ← Build2DTree(Û, 0)
end
Procedure Build2DTree(Û, depth)
  if Û contains one point
    then return a leaf storing this point
    else do
      if depth is even
        then x-median split Û, i.e., split the data points in half by a vertical line L through the x-median of Û, and reconfigure Û_left and Û_right
        else y-median split Û, i.e., split by a horizontal line L through the y-median of Û, and reconfigure Û_left and Û_right
      v ← a newly created node storing line L
      leftchild(v) ← Build2DTree(Û_left, 1 + depth)
      rightchild(v) ← Build2DTree(Û_right, 1 + depth)
      return v
    od
end
$T(n) = 2T(n/2) + O(n) = O(n \log n)$ time.
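A possible Python rendering of Build2DTree is sketched below. It assumes, as the slides do, that no two points share an x- or y-coordinate; the class and function names (`Leaf`, `Split`, `build_2d_tree`, `make_tree`, `leaves`) are illustrative choices, not taken from the course material. Filtering each pre-sorted list keeps it sorted, which is what makes the split linear-time.

```python
from typing import List, Tuple

Point = Tuple[float, float]


class Leaf:
    def __init__(self, point: Point):
        self.point = point


class Split:
    def __init__(self, axis: int, value: float, left, right):
        # axis 0: vertical line x = value; axis 1: horizontal line y = value
        self.axis, self.value = axis, value
        self.left, self.right = left, right


def build_2d_tree(by_x: List[Point], by_y: List[Point], depth: int = 0):
    """by_x and by_y hold the same points, sorted on x and on y respectively."""
    if len(by_x) == 1:
        return Leaf(by_x[0])
    axis = depth % 2                              # even depth: x-split, odd: y-split
    primary = by_x if axis == 0 else by_y
    mid = (len(primary) + 1) // 2
    value = primary[mid - 1][axis]                # median coordinate, goes left
    # One linear pass per list; the relative order inside each list is kept.
    left_x = [p for p in by_x if p[axis] <= value]
    right_x = [p for p in by_x if p[axis] > value]
    left_y = [p for p in by_y if p[axis] <= value]
    right_y = [p for p in by_y if p[axis] > value]
    return Split(axis, value,
                 build_2d_tree(left_x, left_y, depth + 1),
                 build_2d_tree(right_x, right_y, depth + 1))


def make_tree(points: List[Point]):
    """Step 1 (pre-sort) followed by Step 2 (recursive build)."""
    return build_2d_tree(sorted(points), sorted(points, key=lambda p: p[1]))


def leaves(node) -> List[Point]:
    """Collect the stored points left to right (handy for checking the build)."""
    if isinstance(node, Leaf):
        return [node.point]
    return leaves(node.left) + leaves(node.right)


pts = [(1, 5), (2, 3), (4, 7), (5, 1), (7, 6), (8, 2), (9, 4), (10, 8)]
root = make_tree(pts)
print(root.axis, root.value, leaves(root))
```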

8 2D-Tree Example
[Figure: ten points p1, …, p10 in the plane subdivided by the splitting lines L1, …, L9, next to the corresponding 2D-tree with internal nodes L1, …, L9 and the points at the leaves.]

9 Query Point Search in 2D-Tree
[Figure: the same ten points, splitting lines L1, …, L9, and 2D-tree as on the previous slide, with a query point q located by following a root-to-leaf path.]

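10 2D-Tree node regions
region(v) = rectangular region (possibly unbounded) covered by the subtree rooted at v.
region(root[T]) = $(-\infty : +\infty) \times (-\infty : +\infty)$
Suppose region(v) = $\langle x_1 : x_2 \rangle \times \langle y_1 : y_2 \rangle$, where an angle bracket stands for an end that may be open or closed. What are region(leftchild(v)) and region(rightchild(v))?
With x-split:
  region(lc(v)) = $\langle x_1 : x(L)] \times \langle y_1 : y_2 \rangle$
  region(rc(v)) = $(x(L) : x_2 \rangle \times \langle y_1 : y_2 \rangle$
With y-split:
  region(lc(v)) = $\langle x_1 : x_2 \rangle \times \langle y_1 : y(L)]$
  region(rc(v)) = $\langle x_1 : x_2 \rangle \times (y(L) : y_2 \rangle$
[Figure: region(v) cut by a vertical line L into lc(v) and rc(v), and by a horizontal line L into lc(v) and rc(v).]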

11 2D-Tree Range Search
For range $R = [x_1 : x_2] \times [y_1 : y_2]$, call Search2DTree(root[T], R).
ALGORITHM Search2DTree(v, R)
1. if v is a leaf then if p(v) ∈ R then report p(v)
2. else if region(lc(v)) ⊆ R
3.        then ReportSubtree(lc(v))
4.        else if region(lc(v)) ∩ R ≠ ∅
5.               then Search2DTree(lc(v), R)
6.      if region(rc(v)) ⊆ R
7.        then ReportSubtree(rc(v))
8.        else if region(rc(v)) ∩ R ≠ ∅
9.               then Search2DTree(rc(v), R)
end
• region(v) can either be passed as an input parameter or explicitly stored at node v, ∀ v ∈ T.
• ReportSubtree(v) is a simple linear-time in-order traversal that reports every leaf descendant of node v.
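The search can be sketched in Python along the following lines, with region(v) passed as a parameter. The names are illustrative, the construction re-sorts at every level for brevity (so it is slower than the pre-sorted build), and distinct coordinates are assumed. One simplification is labelled in the comments: the right child's region is treated as closed at the split line, which can only cause a few extra visits, never a wrong answer, because every leaf is still tested against R.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]
Rect = Tuple[float, float, float, float]       # (x1, x2, y1, y2), all ends closed


class Node:
    def __init__(self, point=None, axis=None, value=None, left=None, right=None):
        self.point, self.axis, self.value = point, axis, value
        self.left, self.right = left, right


def build(points: List[Point], depth: int = 0) -> Node:
    """Simple construction that re-sorts at each level (not the O(n log n) one)."""
    if len(points) == 1:
        return Node(point=points[0])
    axis = depth % 2
    pts = sorted(points, key=lambda p: p[axis])
    mid = (len(pts) + 1) // 2
    return Node(axis=axis, value=pts[mid - 1][axis],
                left=build(pts[:mid], depth + 1), right=build(pts[mid:], depth + 1))


def contains(R: Rect, reg: Rect) -> bool:
    """True if reg is a subset of R."""
    return R[0] <= reg[0] and reg[1] <= R[1] and R[2] <= reg[2] and reg[3] <= R[3]


def intersects(R: Rect, reg: Rect) -> bool:
    return reg[0] <= R[1] and R[0] <= reg[1] and reg[2] <= R[3] and R[2] <= reg[3]


def child_regions(v: Node, reg: Rect) -> Tuple[Rect, Rect]:
    x1, x2, y1, y2 = reg
    if v.axis == 0:                               # x-split
        return (x1, v.value, y1, y2), (v.value, x2, y1, y2)
    return (x1, x2, y1, v.value), (x1, x2, v.value, y2)


def report_subtree(v: Node, out: List[Point]) -> None:
    if v.point is not None:
        out.append(v.point)
    else:
        report_subtree(v.left, out)
        report_subtree(v.right, out)


def search(v: Node, R: Rect, out: List[Point],
           reg: Rect = (-math.inf, math.inf, -math.inf, math.inf)) -> None:
    if v.point is not None:                       # line 1: leaf
        x, y = v.point
        if R[0] <= x <= R[1] and R[2] <= y <= R[3]:
            out.append(v.point)
        return
    for child, creg in zip((v.left, v.right), child_regions(v, reg)):
        if contains(R, creg):                     # lines 2-3 and 6-7
            report_subtree(child, out)
        elif intersects(R, creg):                 # lines 4-5 and 8-9
            search(child, R, out, creg)


pts = [(1, 5), (2, 3), (4, 7), (5, 1), (7, 6), (8, 2), (9, 4), (10, 8)]
found: List[Point] = []
search(build(pts), (2, 8, 2, 6), found)
print(sorted(found))
```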

12 Running Time of Search2DTree
• $K$ = number of points reported.
• Lines 3 & 7 take $O(K)$ time over all recursive calls.
• Total number of nodes visited (reported or not) is proportional to the number of times the conditions of lines 4 & 8 are true.
• $\mathrm{region}(v) \not\subseteq R$ and $\mathrm{region}(v) \cap R \neq \emptyset$ $\Rightarrow$ a bounding edge $e$ of $R$ intersects region(v).
• $R$ has $\le 4$ bounding edges. Let $e$ (assume vertical) be one of them.
• Define $H(n)$ (resp. $V(n)$) = worst-case number of nodes $v$ whose region intersects $e$ in a 2D-tree of $n$ leaves, assuming the root corresponds to an x-split (resp. y-split).
[Figure: edge e crossing a region split by a vertical line L, for H(n), and by a horizontal line L, for V(n).]
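The definitions of H(n) and V(n) lead to a pair of recurrences. The derivation below is the standard textbook argument, written out here as a sketch and not copied from the slide: a vertical edge e lies on one side of a vertical split line, so only one child region can meet it, while a horizontal split line leaves e crossing both child regions.

```latex
\begin{align*}
H(n) &= 1 + V(n/2), \qquad V(n) = 1 + 2\,H(n/2), \qquad H(1) = V(1) = 1 \\
H(n) &= 1 + \bigl(1 + 2\,H(n/4)\bigr) = 2 + 2\,H(n/4) \\
     &= 2\,\bigl(1 + 2 + 4 + \cdots + 2^{\log_4 n - 1}\bigr) + 2^{\log_4 n}\,H(1)
      = O\bigl(2^{\log_4 n}\bigr) = O(\sqrt{n}) \\
V(n) &= 1 + 2\,H(n/2) = O(\sqrt{n})
\end{align*}
```

Summing over the at most 4 bounding edges of R and adding the O(K) reporting cost gives a worst-case query time of O(K + √n).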

13 dD-Tree Complexities
2D-Tree:
• Query time: $O(K + \sqrt{n})$ worst-case, $O(K + \log n)$ average
• Construction time: $O(n \log n)$
• Storage space: $O(n)$
dD-Tree (d dimensions): use round-robin splitting at successive levels on the d dimensions $x_1, x_2, \ldots, x_d$.
• Query time: $O(dK + d\,n^{1-1/d})$
• Construction time: $O(d\,n \log n)$
• Space: $O(dn)$
How can we improve the query time?

14 Range Trees
2D Range Tree:
• Query time: $O(K + \log^2 n)$, improved to $O(K + \log n)$ by Fractional Cascading
• Construction time: $O(n \log n)$
• Space: $O(n \log n)$
Range $R = [x : x'] \times [y : y']$
1D Range Tree on x-coordinates: each x-range $[x : x']$ can be expressed as the disjoint union of $O(\log n)$ canonical x-ranges.
[Figure: the search paths to x and x' in the 1D range tree, each of length O(log n), with the O(log n) canonical sub-trees hanging between them.]

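15 Range Trees
2-level data structure:
• Primary level: BST $T$ on x-coordinates, rooted at root[T]. Each node v records min(v) and max(v) and points to its associated structure $T_{assoc}(v)$.
• Secondary level: $T_{assoc}(v)$, a BST on the y-coordinates of $P(v)$, the points stored in the subtree of v.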

16 Range Tree Construction
ALGORITHM Build 2D Range Tree (P)
Input: $P = \{p_1, p_2, \ldots, p_n\} \subset \mathbb{R}^2$, $P = (P_x, P_y)$ represented by pre-sorted lists on x (named $P_x$) and on y (named $P_y$).
Output: pointer to the root of the 2D range tree for P.
  Construct T_assoc, bottom up, based on P_y, but store in each leaf the points, not just their y-coordinates.
  v ← a newly created node
  if |P| > 1 then do
    P_left ← { p ∈ P | p_x ≤ x_med of P }   (* both lists P_x and P_y should split *)
    P_right ← { p ∈ P | p_x > x_med of P }
    lc(v) ← Build 2D Range Tree (P_left)
    rc(v) ← Build 2D Range Tree (P_right)
  od
  min(v) ← min(P_x); max(v) ← max(P_x)
  T_assoc(v) ← T_assoc
  return v
end
$T(n) = 2T(n/2) + O(n) = O(n \log n)$ time. This includes time for pre-sorting.

17 2D Range Query
ALGORITHM 2DRangeQuery(v, [x : x'] × [y : y'])
1. if x ≤ min(v) & max(v) ≤ x'
2.   then 1DRangeQuery(T_assoc(v), [y : y'])
3.   else if v is not a leaf do
4.          if x ≤ max(lc(v))
5.            then 2DRangeQuery(lc(v), [x : x'] × [y : y'])
6.          if min(rc(v)) ≤ x'
7.            then 2DRangeQuery(rc(v), [x : x'] × [y : y'])
8.        od
end
Line 2 is called at the roots of the canonical sub-trees (red in the figure), a total of $O(\log n)$ times. Each call takes $O(K_v + \log |T_{assoc}(v)|) = O(K_v + \log n)$ time.
Lines 5 & 7 are called along the shoulder paths (blue in the figure). Total cost $O(\log n)$.
Total query time $= O(\log n + \sum_v (K_v + \log n)) = O(\sum_v K_v + \log^2 n) = O(K + \log^2 n)$.
• Query time: $O(K + \log^2 n)$, will be improved to $O(K + \log n)$ by Fractional Cascading
• Construction time: $O(n \log n)$
• Space: $O(n \log n)$
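The construction and query of a 2D range tree can be sketched together in Python as follows. Two liberties are taken and should be read as assumptions: each associated structure is a y-sorted array searched with `bisect` (the static form of 1D range searching) in place of a BST, and the names `RTNode`, `build_range_tree`, and `range_query_2d` are invented for this example. Distinct x-coordinates are assumed.

```python
from bisect import bisect_left, bisect_right
from typing import List, Tuple

Point = Tuple[float, float]


class RTNode:
    def __init__(self, pts_by_x: List[Point], pts_by_y: List[Point]):
        self.min_x, self.max_x = pts_by_x[0][0], pts_by_x[-1][0]
        self.assoc = pts_by_y                          # stands in for T_assoc(v)
        self.assoc_keys = [p[1] for p in pts_by_y]     # y-values, for bisect
        self.left = self.right = None
        if len(pts_by_x) > 1:
            mid = (len(pts_by_x) + 1) // 2
            x_med = pts_by_x[mid - 1][0]
            # Both sorted lists are split; filtering keeps the y-order intact.
            self.left = RTNode(pts_by_x[:mid], [p for p in pts_by_y if p[0] <= x_med])
            self.right = RTNode(pts_by_x[mid:], [p for p in pts_by_y if p[0] > x_med])


def build_range_tree(points: List[Point]) -> RTNode:
    return RTNode(sorted(points), sorted(points, key=lambda p: p[1]))


def range_query_2d(v: RTNode, x1: float, x2: float, y1: float, y2: float,
                   out: List[Point]) -> None:
    if x1 <= v.min_x and v.max_x <= x2:                # line 1: canonical subtree
        lo = bisect_left(v.assoc_keys, y1)             # line 2: 1D query on y
        hi = bisect_right(v.assoc_keys, y2)
        out.extend(v.assoc[lo:hi])
    elif v.left is not None:                           # line 3: not a leaf
        if x1 <= v.left.max_x:                         # lines 4-5
            range_query_2d(v.left, x1, x2, y1, y2, out)
        if v.right.min_x <= x2:                        # lines 6-7
            range_query_2d(v.right, x1, x2, y1, y2, out)


pts = [(1, 5), (2, 3), (4, 7), (5, 1), (7, 6), (8, 2), (9, 4), (10, 8)]
hits: List[Point] = []
range_query_2d(build_range_tree(pts), 2, 8, 2, 6, hits)
print(sorted(hits))
```

Every point is copied into one associated array per level of the primary tree, which is where the O(n log n) space bound comes from.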

18 Higher Dimensional Range Trees
$P = \{p_1, p_2, \ldots, p_n\} \subset \mathbb{R}^d$, $p_i = (x_{i1}, x_{i2}, \ldots, x_{id})$, $i = 1..n$.
• Primary level: BST $T$ on the 1st coordinate, rooted at root[T].
• $T_{assoc}(v)$: a (d-1)-dimensional Range Tree on coordinates 2..d of the points $P(v)$.

19 Higher Dimensional Range Trees d-level data structure

20 Higher Dimensional Range Trees
• Query time: $Q_d(n) = O(K + \log^d n)$, improved to $O(K + \log^{d-1} n)$ by Fractional Cascading
• Construction time: $T_d(n) = O(n \log^{d-1} n)$
• Space: $S_d(n) = O(n \log^{d-1} n)$

21 General Sets of Points
What if 2 points have the same coordinate value at some coordinate axis?
• Composite Number Space (lexicographic order): $(a, b) \to (a \mid b)$, where $(a \mid b) < (a' \mid b') \iff a < a'$ or ($a = a'$ and $b < b'$).
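Python tuples already compare lexicographically, so composite numbers can be illustrated directly. The point and range transformations below follow the usual textbook treatment and are an assumption here, since the slide only states the ordering: a point (x, y) becomes ((x | y), (y | x)), and a range [x1 : x2] × [y1 : y2] becomes [(x1 | -∞) : (x2 | +∞)] × [(y1 | -∞) : (y2 | +∞)]. The function names are hypothetical.

```python
import math
from typing import Tuple

Composite = Tuple[float, float]


def to_composite(p: Tuple[float, float]) -> Tuple[Composite, Composite]:
    """Map (x, y) to ((x | y), (y | x)); distinct points get distinct coordinates."""
    x, y = p
    return (x, y), (y, x)


def composite_range(x1: float, x2: float, y1: float, y2: float):
    """Widen each end with -inf / +inf so ties on the first component stay inside."""
    return ((x1, -math.inf), (x2, math.inf)), ((y1, -math.inf), (y2, math.inf))


def in_range(cp, crange) -> bool:
    (xlo, xhi), (ylo, yhi) = crange
    return xlo <= cp[0] <= xhi and ylo <= cp[1] <= yhi


# Lexicographic order on composite numbers is plain tuple comparison.
print((3, 1) < (3, 2), (1, 9) < (3, 0))

pts = [(3, 1), (3, 2), (1, 3), (4, 2)]          # two points share x = 3
rng = composite_range(3, 3, 1, 2)
print([p for p in pts if in_range(to_composite(p), rng)])
```

A structure built on the composite coordinates never sees two equal keys, yet it reports exactly the points of the original range.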

22 Fractional Cascading
IDEA: Save the repeated cost of binary search in many sorted lists for the same range $[y : y']$ if the contents of one list are a subset of the other.
• $A_2 \subseteq A_1$
• Binary search for y in $A_1$ to get to $A_1[i]$.
• Follow the pointer to $A_2$ to get to $A_2[j]$.
• Now walk to the right in each list.
[Figure: sorted arrays A1 and A2, with pointers from the entries of A1 into A2 (nil where there is no target).]

23 Fractional Cascading
• $A_2 \subseteq A_1$, $A_3 \subseteq A_1$.
• No binary search in $A_2$ and $A_3$ is needed.
• Do binary search in $A_1$.
• Follow the blue and red pointers from there to $A_2$ and $A_3$.
• Now we have the starting point in each sorted list. Walk to the right & report.
[Figure: sorted array A1 with blue pointers into A2 and red pointers into A3 (nil where there is no target).]
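The two-list case can be sketched in Python as below. The pointer convention is an assumption made for this example: entry i of A1 points to the first entry of A2 that is at least A1[i], with index len(A2) playing the role of nil. The function names and the sample arrays are illustrative only.

```python
from bisect import bisect_left
from typing import List, Tuple


def build_pointers(a1: List[int], a2: List[int]) -> List[int]:
    """For each a1[i], the index of the smallest a2 entry >= a1[i] (len(a2) = nil)."""
    ptr, j = [], 0
    for key in a1:
        while j < len(a2) and a2[j] < key:
            j += 1
        ptr.append(j)
    return ptr


def walk(a: List[int], start: int, hi: int) -> List[int]:
    """Walk to the right from `start`, reporting entries while they are <= hi."""
    out = []
    while start < len(a) and a[start] <= hi:
        out.append(a[start])
        start += 1
    return out


def cascade_query(a1: List[int], a2: List[int], ptr: List[int],
                  lo: int, hi: int) -> Tuple[List[int], List[int]]:
    i = bisect_left(a1, lo)                  # the only binary search
    j = ptr[i] if i < len(a1) else len(a2)   # follow the pointer into a2
    return walk(a1, i, hi), walk(a2, j, hi)


a1 = [3, 10, 19, 23, 30, 37, 59, 62, 70, 80, 90, 95]
a2 = [10, 19, 30, 62, 70, 80, 90]            # a2 is a subset of a1
ptr = build_pointers(a1, a2)
print(cascade_query(a1, a2, ptr, 20, 65))
```

Because A2 ⊆ A1, the first entry of A2 that is at least y is also the first one that is at least A1[i], so the pointer lands on the correct starting position without a second search.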

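24 Layered 2D Range Tree
$P(lc(v)) \subseteq P(v)$ and $P(rc(v)) \subseteq P(v)$.
[Figure: node v of T with children lc(v) and rc(v); T_assoc(v), storing P(v), is linked to T_assoc(lc(v)) and T_assoc(rc(v)), which store P(lc(v)) and P(rc(v)).]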

25 Layered 2D Range Tree T Associated Structures at the secondary level by Fractional Cascading

26 Layered 2D Range Tree (by Fractional Cascading)
Query time: $Q_2(n) = O(\log n + \sum_v (K_v + \log n)) = O(\sum_v K_v + \log^2 n) = O(K + \log^2 n)$
improves to: $Q_2(n) = O(\log n + \sum_v (K_v + 1)) = O(\sum_v K_v + \log n) = O(K + \log n)$.
For d-dimensional range tree query time improves to: $Q_d(n) = O(K + \log^{d-1} n)$, the bound stated on slide 20.

27 Exercises

28 1. Show the following implication on the worst-case query time on 2D-Tree:
2. Describe algorithms to insert and delete points from a 2D-Tree. You don't need to take care of rebalancing the structure.
3. dD-Trees can also be used for partial match queries. A 2D partial match query specifies one of the coordinates and asks for all points that have the specified coordinate value. In higher dimensions we specify values for a subset of the coordinates. Here we allow multiple points to have equal values for coordinates.
  (a) Show that 2D-Trees can answer partial match queries in $O(K + \sqrt{n})$ time, where K is the number of reported answers.
  (b) Describe a data structure that uses $O(n)$ storage and answers 2D partial match queries in $O(K + \log n)$ time.
  (c) Show that a dD-Tree can solve a partial match query in $O(K + n^{1-s/d})$ time, where s is the number of specified coordinates.
  (d) Show that, when we allow for $O(d\,2^d n)$ storage, dD partial match queries can be answered in $O(K + d \log n)$ time.

29 4. Describe algorithms to insert and delete points from a Range Tree. You don't need to take care of rebalancing the structure.
5. One can use 2D-Trees and Range Trees to search for a particular point (a, b) by performing a range query with the range $[a : a] \times [b : b]$.
  (a) Prove this takes $O(\log n)$ time on 2D-Trees.
  (b) Derive the time bound on Range Trees.
6. In many applications one wants to do range searching among objects other than points.
  (a) Let P be a set of n axis-parallel rectangles in the plane. We want to be able to report all rectangles in P that are completely contained in a query rectangle $[x : x'] \times [y : y']$. Describe a data structure for this problem that uses $O(n \log^3 n)$ storage and has $O(K + \log^4 n)$ query time, where K is the number of reported answers. [Hint: Transform the problem to an orthogonal range searching problem in some higher dimensional space.]
  (b) Let P now consist of a set of n simple polygons in the plane. Describe a data structure that uses $O(n \log^3 n)$ space (excluding space needed to externally store the polygons) and has $O(K + \log^4 n)$ query time, where K is the number of polygons completely contained in the query rectangle that are reported.
  (c) Improve the query time to $O(K + \log^3 n)$.

30 7. For this problem, assume for simplicity that n is a power of 2. Consider a 3D-Tree for a set of n points in 3D. Consider a line that is parallel to the x-axis. What is the maximum number of leaf cells that can be intersected by such a line?
8. We showed that a 2D-tree could answer orthogonal range queries for a set of n points in the plane in $O(n^{1/2} + K)$ time, where K was the number of points reported. Generalize this to show that in dimension 3, the query time is $O(n^{2/3} + K)$.

31 END

