# COSC 6114 Prof. Andy Mirzaian. References: [M. de Berge et al] chapter 5 Applications: Data Base GIS, Graphics: crop-&-zoom, windowing.

## Presentation on theme: "COSC 6114 Prof. Andy Mirzaian. References: [M. de Berge et al] chapter 5 Applications: Data Base GIS, Graphics: crop-&-zoom, windowing."— Presentation transcript:

COSC 6114 Prof. Andy Mirzaian

References: [M. de Berge et al] chapter 5 Applications: Data Base GIS, Graphics: crop-&-zoom, windowing

Orthogonal Range Search: Data Base Query salary date of birth 14,000 13,000 1980,00,001989,99,99 2D Query Rectangle [1980,00,00 : 1989,99,99]  [13,000 : 14,000] salary date of birth 14,000 13,000 1980,00,001989,99,99 3D Query Orthogonal Range [1980,00,00 : 1989,99,99]  [13,000 : 14,000]  [2 : 4] 4 2 Mr. G. O. Meter born: Nov. 6, 1988 Salary: \$13,600. rank

1D-Tree: 1-Dimensional Range Searching x axis xx’ Static: Binary Search in a sorted array. Dynamic: Store data points in some balanced Binary Search Tree T. Let the data points be P = { p 1, p 2, …, p n }  . T is a balanced BST where the data appear at its leaves sorted left to right. The internal nodes are used to split left & right subtrees. Assume x(v) = max x(L), where L is any leaf in the left subtree of internal node v. 2 5 6 91317 136 4131 85 2 23 313741 49 73 8591 62 49 23 51737 73 62 9 Query Range: [7 : 49] root[ T ]

v split Query Range [x : x’]: Call 1DRangeQuery(root[ T ],x,x’) ALGORITHM 1DRangeQuery (v, x, x’) if v is a leaf then if x  x(v)  x’ then report data stored at v else do if x  x(v) then 1DRangeQuery ( leftchild(v), x, x’ ) if x(v) < x’ then 1DRangeQuery ( rightchild(v), x, x’ ) od end root[ T ] Complexities: Query Time O( K + log n) T,[x,x’]  output Construction Time O(n log n) P  T Space O(n) store T [These are optimal] K leaves reported

2D-Tree Consider dimension d=2 : point p=( x(p), y(p) ), range R = [x 1 : x 2 ]  [y 1 : y 2 ] p  R  x(p)  [x 1 : x 2 ] and y(p)  [y 1 : y 2 ]. x 1 x(p) x 2 x y p y2y2 y1y1 y(p) L P left P right L P left P right OR P left P right L L = vertical/horizontal median split. Alternate between vertical & horizontal splitting at even and odd depths. (Assume: no 2 points have equal x or y coordinates.) 2D-tree R

Constructing 2D-Tree Input: P = { p 1, p 2, …, p n }   2 off-line. Output: 2D-tree storing P. Step 1: Pre-sort P on x & on y, i.e., 2 sorted lists Û = (Xsorted(P), Ysorted(P)). Step 2: root[ T ]  Build2DTree ( Û, 0) end Procedure Build2DTree ( Û, depth ) if Û contains one point then return a leaf storing this point else do if depth is even then x-median split Û, i.e., split data points in half by a vertical line L through x-median of Û and reconfigure Û left and Û right. else y-median split Û, … by a horizontal line L, and reconfigure Û left and Û right. v  a newly created node storing line L leftchild(v)  Build2DTree ( Û left, 1+depth) rightchild(v)  Build2DTree ( Û right, 1+depth) return v end T(n) = 2 T(n/2) + O(n) = O(n log n) time.

2D-Tree Example p1p1 p2p2 p3p3 p4p4 p5p5 p6p6 p7p7 p8p8 p9p9 p 10 L1L1 L2L2 L6L6 L3L3 L5L5 L7L7 L9L9 p1p1 p3p3 p4p4 p2p2 p5p5 p7p7 p6p6 p9p9 p8p8 L1L1 L4L4 L3L3 L5L5 L8L8 L7L7 L9L9 L2L2 L6L6 L4L4 L8L8

Query Point Search in 2D-Tree p1p1 p2p2 p3p3 p4p4 p5p5 p6p6 p7p7 p8p8 p9p9 p 10 L1L1 L2L2 L6L6 L3L3 L5L5 L7L7 L9L9 p1p1 p3p3 p4p4 p2p2 p5p5 p7p7 p6p6 p9p9 p8p8 L1L1 L4L4 L3L3 L5L5 L8L8 L7L7 L9L9 L2L2 L6L6 L4L4 L8L8 q

2D-Tree node regions region(v) = rectangular region (possibly unbounded) covered by the subtree rooted at v. region (root[ T ]) = (-  : +  )  (-  : +  ) Suppose region(v) =  x 1 : x 2    y 1 : y 2  what are region(leftchild(v)) and region(rightchild(v))? With x-split: region(lc(v)) =  x 1 : x( L ) ]   y 1 : y 2  region(rc(v)) = ( x( L ) : x 2    y 1 : y 2  With y-split: region(lc(v)) =  x 1 : x 2    y 1 : y( L ) ] region(rc(v)) =  x 1 : x 2   ( y( L ) : y 2  lc(v)rc(v) L lc(v) rc(v) L

2D-Tree Range Search For range R = [x 1 : x 2 ]  [y 1 : y 2 ] call Search2DTree (root[ T ], R ) ALGORITHM Search2DTree ( v, R ) 1.if v is a leaf then if p(v)  R then report p(v) 2.else if region(lc(v))  R 3.then ReportSubtree (lc(v)) 4.else if region(lc(v))  R   5. then Search2DTree ( lc(v), R ) 6. if region(rc(v))  R 7.then ReportSubtree (rc(v)) 8.else if region(rc(v))  R   9. then Search2DTree ( rc(v), R ) end  region(v) can either be passed as input parameter, or explicitly stored at node v,  v  T.  ReportSubtree(v) is a simple linear-time in-order traversal that reports every leaf descendent of node v.

Running Time of Search2DTree  K = # of points reported.  Lines 3 & 7 take O(K) time over all recursive calls.  Total # nodes visited (reported or not) is proportional to # times conditions of lines 4 & 8 are true.  region(v)  R  & region(v)  R  a bounding edge e of R intersects region(v).  R has  4 bounding edges. Let e (assume vertical) be one of them.  Define H(n) (resp. V(n)) = worst-case number of nodes v that intersect e for a 2D-tree of n leaves, assuming root corresponds to an x-split (resp. y-split). e e LL H(n) V(n)

dD-Tree Complexities 2D-Tree  Query Time : O( K +  n ) worst-case, O( K + log n) average  Construction Time : O(n log n)  Storage Space: O(n) dD-Tree d-dimensions Use round-robin splitting at successive levels on the d dimensions x 1, x 2, …, x d.  Query Time: O(dK + d n 1–1/d )  Construction Time: O(d n log n)  Space: O(dn) How can we improve the query time?

Range Trees 2D Range Tree  Query Time: O( K + log 2 n ) O(K + log n) by Fractional Cascading  Construction Time: O(n log n)  Space: O(n log n) Range R = [x : x’]  [y : y’] 1D Range Tree on x-coordinates: x x’ O(log n) canonical sub-trees O(log n) xx’ Each x-range [x : x’] can be expressed as the disjoint union of O(log n) canonical x-ranges. y

Range Trees 2-level data structure: P(v) v T assoc (v) root[ T ] Primary Level: BST on x-coordinates Secondary level: BST on y-coord. min(v) max(v)

Range Tree Construction ALGORITHM Build 2D Range Tree (P) Input: P = { p 1, p 2, …, p n }   2, P = (P x, P y ) represented by pre-sorted list on x (named P x ) and on y (named P y ). Output: pointer to the root of 2D range tree for P. Construct T assoc, bottom up, based on P y, but store in each leaf the points, not just their y-coordinates. if |P| > 1 then do P left  { p  P | p x  x med of P } (* both lists P x and P y should split *) P right  { p  P | p x > x med of P } lc(v)  Build 2D Range Tree (P left ) rc(v)  Build 2D Range Tree (P right ) od min(v)  min (P x ); max(v)  max(P x ) T assoc (v)  T assoc return v end T(n) = 2 T(n/2) + O(n) = O(n log n) time. This includes time for pre-sorting.

2D Range Query ALGORITHM 2DRangeQuery ( v, [x : x’]  [y : y’] ) 1.if x  min(v) & max(v)  x’ 2. then 1DRangeQuery ( T assoc (v), [y : y’] ) 3. else if v is not a leaf do 4. if x  max(lc(v)) 5. then 2DRangeQuery ( lc(v), [x : x’]  [y : y’] ) 6. if min(rc(v))  x’ 7. then 2DRangeQuery ( rc(v), [x : x’]  [y : y’] ) 8. od end x x’ T Line 2 called at roots of red canonical sub-trees, a total of O(log n) times. Each call takes O(K v + log | T assoc (v) | ) = O(K v + log n) time. Lines 5 & 7 called at blue shoulder paths. Total cost O(log n). Total Query Time = O(log n +  v (K v + log n)) = O(  v K v + log 2 n) = O(K + log 2 n). Query Time: O( K + log 2 n ) will be improved to O(K + log n) by Fractional Cascading Construction Time: O(n log n) Space: O(n log n)

Higher Dimensional Range Trees P(v) v T assoc (v) root[ T ] Primary Level: BST on the 1 st coordinate (d-1)-dimensional Range Tree on coord’s 2..d. P = { p 1, p 2, …, p n }   d, p i = (x i1, x i2, …, x id ), i=1..n.

Higher Dimensional Range Trees d-level data structure

Higher Dimensional Range Trees Query Time: Q d (n) = O( K + log d n) improved to O(K + log d-1 n ) by Frac. Casc. Construction Time: T d (n) = O(n log d-1 n) Space: S d (n) = O(n log d-1 n)

General Sets of Points What if 2 points have the same coordinate value at some coordinate axis?  Composite Numer Space: (lexicographic order) (a,b)  (a | b) (a | b) < (a’ | b’)  a { "@context": "http://schema.org", "@type": "ImageObject", "contentUrl": "http://images.slideplayer.com/13/3602957/slides/slide_21.jpg", "name": "General Sets of Points What if 2 points have the same coordinate value at some coordinate axis.", "description": " Composite Numer Space: (lexicographic order) (a,b)  (a | b) (a | b) < (a’ | b’)  a

Fractional Cascading IDEA: Save repeated cost of binary search in many sorted lists for the same range [y : y’] if the list contents for one are a subset of the other.  A 2  A 1  Binary search for y in A 1 to get to A 1 [i].  Follow pointer to A 2 to get to A 2 [j].  Now walk to the right in each list. 357913151192326313645 513263645 A1A1 A2A2 6392 nil

Fractional Cascading 3579131511923263136 A1A1 A2A2 nil 35791315119232631 36 A3A3  A 2  A 1, A 3  A 1.  No binary search in A 2 and A 3 is needed.  Do binary search in A 1.  Follow blue and red pointers from there to A 2 and A 3.  Now we have the starting point in each sorted list. Walk to the right & report.

Layered 2D Range Tree v lc(v) rc(v) P(lc(v))P(rc(v)) P(v) T T assoc (v) T assoc (lc(v)) T assoc (rc(v)) P(lc(v))  P(v) P(rc(v))  P(v)

Layered 2D Range Tree T Associated Structures at the secondary level by Fractional Cascading

Layered 2D Range Tree (by Fractional Cascading) Query Time: Q 2 (n) = O(log n +  v (K v + log n)) = O(  v K v + log 2 n) = O(K + log 2 n) improves to: Q 2 (n) = O(log n +  v (K v + 1)) = O(  v K v + log n) = O(K + log n). For d-dimensional range tree query time improves to:

Exercises

1.Show the following implication on the worst-case query time on 2D-Tree: 2.Describe algorithms to insert and delete points from a 2D-Tree. You don’t need to take care of rebalancing the structure. 3.dD-Trees can also be used for partial match queries. A 2D partial match query specifies one of the coordinates and asks for all points that have the specified coordinate value. In higher dimensions we specify values for a subset of the coordinates. Here we allow multiple points to have equal values for coordinates. (a) Show that 2D-Trees can answer partial match queries in O(K+  n) time, where K is the number of reported answers. (b) Describe a data structure that uses O(n) storage and answers 2D partial match queries in O(K + log n) time. (c) Show that a dD-Tree can solve a partial match query in O(K + n 1-s/d ) time, where s is the number of specified coordinates. (d) Show that, when we allow for O(d 2 d n) storage, dD partial match queries can be answered in O(K + d log n) time.

4.Describe algorithms to insert and delete points from a Range Tree. You don’t need to take care of rebalancing the structure. 5.One can use 2D-Trees and Range Trees to search for a particular point (a,b) by performing a range query with the range [a:a]  [b:b]. (a) Prove this takes O(log n) time on 2D-Trees. (b) Derive the time bound on Range Trees. 6.In many applications one wants to do range searching among objects other than points. (a) Let P be a set of n axis-parallel rectangles in the plane. We want to be able to report all rectangles in P that are completely contained in a query rectangle [x : x’]  [y: y’]. Describe a data structure for this problem that uses O(n log 3 n) storage and has O(K + log 4 n) query time, where K is the number of reported answers. [Hint: Transform the problem to an orthogonal range searching problem in some higher dimensional space.] (b) Let P now consist of a set of n simple polygons in the plane. Describe a data structure that uses O(n log 3 n) space (excluding space needed to externally store the polygons) and has O(K + log 4 n) query time, where K is the number of polygons completely contained in the query rectangle that are reported. (c) Improve the query time to O(K + log 3 n).

7.For this problem, assume for simplicity that n is a power of 2. Consider a 3D-Tree for a set of n points in 3D. Consider a line that is parallel to the x-axis. What is the maximum number of leaf cells that can intersect by such a line. 8.We showed that a 2D-tree could answer orthogonal range queries for a set of n points in the plane in O( n 1/2 + K) time, where K was the number of points reported. Generalize this to show that in dimension 3, the query time is O(n 2/3 + K).

END

Download ppt "COSC 6114 Prof. Andy Mirzaian. References: [M. de Berge et al] chapter 5 Applications: Data Base GIS, Graphics: crop-&-zoom, windowing."

Similar presentations