1
Orthogonal Range Searching and Kd-Trees
Computational Geometry (EECS 396/496) – October 4th, 2017
2
Orthogonal Range Searching – Motivation
Given a database of people, we want to report everyone who is both between 30 and 60 years old and earns between $50,000 and $150,000 a year. This is an orthogonal range query: [30, 60] × [50000, 150000].
3
The Orthogonal Range Searching Problem
Input: A collection 𝑃 of 𝑛 points 𝑝₁, …, 𝑝ₙ ∈ ℝ² with 𝑝ᵢ = (𝑥ᵢ, 𝑦ᵢ).
Goal: Preprocess the points so that orthogonal range queries can be answered efficiently. Preprocess = “find a data structure”.
Orthogonal range query: For a query rectangle [𝑥, 𝑥′] × [𝑦, 𝑦′], report all points 𝑝ᵢ ∈ 𝑃 with 𝑥 ≤ 𝑥ᵢ ≤ 𝑥′ and 𝑦 ≤ 𝑦ᵢ ≤ 𝑦′. That is, report all points in an axis-aligned query rectangle.
The problem generalizes naturally to 𝑑 ≥ 1 dimensions.
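As a reference point for the query semantics only, here is a minimal brute-force sketch that scans every point in O(𝑛) time with no preprocessing. The function name `range_query_bruteforce` and the sample (age, salary) data are illustrative assumptions, not part of the lecture.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def range_query_bruteforce(points: List[Point],
                           x_lo: float, x_hi: float,
                           y_lo: float, y_hi: float) -> List[Point]:
    """Report every point inside the closed rectangle [x_lo, x_hi] x [y_lo, y_hi]."""
    return [(px, py) for (px, py) in points
            if x_lo <= px <= x_hi and y_lo <= py <= y_hi]


# Hypothetical (age, salary) records, used only to exercise the function.
people = [(25, 40000), (35, 60000), (45, 200000), (60, 150000)]
result = range_query_bruteforce(people, 30, 60, 50000, 150000)
```

Both boundaries are inclusive, matching the closed query rectangle in the problem statement.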
4
Simplification: 1-Dimensional Range Searching
Input: A collection 𝑃 of 𝑛 values 𝑝₁, …, 𝑝ₙ ∈ ℝ.
Goal: Preprocess the points so that range queries can be answered efficiently. Preprocess = “find a data structure”.
Range query: For a query range [𝑥, 𝑥′], report all points 𝑝ᵢ ∈ 𝑃 with 𝑥 ≤ 𝑝ᵢ ≤ 𝑥′. That is, report all points in a closed interval.
Q: What data structure should we use?
A: Balanced binary search trees.
5
1D Range Searching in BBSTs
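Example query: [19, 69]
Search algorithm on query [𝑥, 𝑥′] in tree 𝑇:
[Figure: balanced binary search tree with root 49 and internal nodes 19, 70, 3, 30, 62, 89, 59; the leaves store the values 3, 19, 30, 49, 59, 62, 70, 89, 99. At each internal node, values ≤ the node's value go left and values > go right.]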
10
1D Range Searching in BBSTs
Example query: [19, 69]
Search algorithm on query [𝑥, 𝑥′] in tree 𝑇:
1. Search for 𝑥 and 𝑥′ in 𝑇, recording the paths to the resulting leaves 𝑢 and 𝑢′.
2. Check whether the values at 𝑢 and 𝑢′ are in [𝑥, 𝑥′]. If so, report them.
3. Note the split node, where the two paths diverge.
4. Report all values at leaves of right subtrees hanging off the path from the split node to 𝑢.
5. Report all values at leaves of left subtrees hanging off the path from the split node to 𝑢′.
[Figure: the example tree with the split node 49 and the search leaves 𝑢 and 𝑢′ marked.]
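The steps above can be sketched in Python as follows. This is a minimal sketch under stated assumptions: points live only at leaves, each internal node stores the largest value of its left subtree as its splitting value, and the input is a non-empty sorted list of distinct values. The names `Node`, `build`, `report_subtree` and `range_query` are hypothetical, and the tree this code builds need not have the same shape as the one drawn on the slide.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    value: float                      # leaf: stored point; internal: splitting value
    left: Optional["Node"] = None
    right: Optional["Node"] = None

    def is_leaf(self) -> bool:
        return self.left is None and self.right is None


def build(sorted_vals: List[float]) -> Node:
    """Build a balanced leaf-oriented BST from a non-empty sorted list."""
    if len(sorted_vals) == 1:
        return Node(sorted_vals[0])
    mid = (len(sorted_vals) - 1) // 2  # split value = largest value going left
    return Node(sorted_vals[mid],
                build(sorted_vals[:mid + 1]),
                build(sorted_vals[mid + 1:]))


def report_subtree(node: Node, out: List[float]) -> None:
    """Append every leaf value below node to out."""
    if node.is_leaf():
        out.append(node.value)
    else:
        report_subtree(node.left, out)
        report_subtree(node.right, out)


def range_query(root: Node, lo: float, hi: float) -> List[float]:
    """Report all stored values in [lo, hi]; output order is not sorted."""
    out: List[float] = []
    # Descend while both searches agree; v ends at the split node or a leaf.
    v = root
    while not v.is_leaf() and (hi <= v.value or lo > v.value):
        v = v.left if hi <= v.value else v.right
    if v.is_leaf():
        if lo <= v.value <= hi:
            out.append(v.value)
        return out
    # Path from the split node to u: report right subtrees where we turn left.
    w = v.left
    while not w.is_leaf():
        if lo <= w.value:
            report_subtree(w.right, out)
            w = w.left
        else:
            w = w.right
    if lo <= w.value <= hi:           # explicit check at leaf u
        out.append(w.value)
    # Path from the split node to u': report left subtrees where we turn right.
    w = v.right
    while not w.is_leaf():
        if hi > w.value:
            report_subtree(w.left, out)
            w = w.right
        else:
            w = w.left
    if lo <= w.value <= hi:           # explicit check at leaf u'
        out.append(w.value)
    return out


tree = build([3, 19, 30, 49, 59, 62, 70, 89, 99])
found = sorted(range_query(tree, 19, 69))
```

On the slide's leaf values and the example query [19, 69], the sorted result is 19, 30, 49, 59, 62.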
11
Analysis of 1D Range Searching – Correctness
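Claim: Our 1D range searching algorithm reports exactly the values in the query range [𝑥, 𝑥′].
Proof:
Claim part I: Any reported value 𝑝 lies in the query range.
If 𝑝 was at 𝑢 or 𝑢′, it was checked explicitly.
Otherwise, assume it was reported as a leaf of the right subtree of a node 𝑣 on the path from the split node 𝑣ₛₚₗᵢₜ to 𝑢. Since the search for 𝑥 turned left at 𝑣, we have 𝑥 ≤ 𝑣.𝑥. Since 𝑝 is in the right subtree of 𝑣, which is inside the left subtree of 𝑣ₛₚₗᵢₜ, we have 𝑣.𝑥 < 𝑝 ≤ 𝑣ₛₚₗᵢₜ.𝑥. Since the search for 𝑥′ turned right at 𝑣ₛₚₗᵢₜ, we have 𝑣ₛₚₗᵢₜ.𝑥 < 𝑥′.
So 𝑥 ≤ 𝑣.𝑥 < 𝑝 ≤ 𝑣ₛₚₗᵢₜ.𝑥 < 𝑥′.
The analysis is symmetric if 𝑝 was reported in a left subtree.
[Figure: the example tree with 𝑣ₛₚₗᵢₜ = 49, a potential node 𝑣, and the search leaves 𝑢 and 𝑢′ marked.]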
12
Analysis of 1D Range Searching via BBSTs – Correctness
Claim: Our 1D range searching algorithm reports exactly the values in the query range [𝑥, 𝑥′].
Proof:
Claim part II: Every value 𝑝 that lies in the range was reported.
Let 𝑤 be the node storing 𝑝, and let 𝑣 be the lowest visited ancestor of 𝑤. Assume for contradiction that 𝑣 ≠ 𝑤.
If 𝑣 is on the path to 𝑢 but not to 𝑢′, then the search for 𝑥 turned right at 𝑣 (had it turned left, the right subtree of 𝑣 would have been reported and both children visited), so 𝑤 must lie in the left subtree of 𝑣. This means that 𝑤.𝑥 ≤ 𝑣.𝑥 < 𝑥, a contradiction.
The argument is symmetric if 𝑣 is on the path to 𝑢′, or on the paths to both 𝑢 and 𝑢′.
13
Analysis of 1D Range Searching via BBSTs – Resource Usage
Space usage: 𝑂(𝑛)
Preprocessing time: 𝑂(𝑛 log 𝑛)
Query time: 𝑂(log 𝑛 + 𝑘), where 𝑘 is the number of reported values.
  Searches for 𝑢 and 𝑢′ take 𝑂(log 𝑛) time.
  Reporting all values in a subtree takes time linear in the number of values.
  Query time is 𝑂(𝑛) in the worst case, when 𝑘 = Θ(𝑛).
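For comparison, the same 𝑂(𝑛) space, 𝑂(𝑛 log 𝑛) preprocessing and 𝑂(log 𝑛 + 𝑘) query bounds can be reached in the static 1D case with a sorted array and two binary searches. Note that this swaps in a different data structure from the balanced BST used in the lecture; the function name `range_query_sorted` is an assumption for illustration.

```python
from bisect import bisect_left, bisect_right
from typing import List


def range_query_sorted(sorted_vals: List[float], lo: float, hi: float) -> List[float]:
    """Report all values in [lo, hi] from a list that is already sorted."""
    i = bisect_left(sorted_vals, lo)    # first index with value >= lo: O(log n)
    j = bisect_right(sorted_vals, hi)   # first index with value > hi: O(log n)
    return sorted_vals[i:j]             # copying the k answers: O(k)


values = sorted([49, 19, 70, 3, 30, 62, 89, 99, 59])  # preprocessing: O(n log n)
answer = range_query_sorted(values, 19, 69)
```

Unlike the sorted array, the tree-based approach is the one that the lecture goes on to generalize to two dimensions.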
14
Generalizing to Higher Dimensions
Two generalizations of the 1-dimensional BBST: Kd-trees – This lecture. Range trees – Next lecture. Focus on the 2-dimensional case. The 𝑑-dimensional case for 𝑑≥3 is similar. For now, assume that all points have distinct 𝑥- and 𝑦-coordinates.
15
Kd-Trees
Observation: 1-dimensional BBSTs recursively partition points into two sets of roughly equal size based on their 𝑥-coordinate.
How do we generalize this for 2-dimensional points?
Idea: Alternate between partitioning based on the 𝑥-coordinate and the 𝑦-coordinate! This is the key idea behind Kd-trees.
[Figure: the example balanced binary search tree with root 49.]
16
Kd-Trees
Kd-Tree construction on 𝑃 = {𝑝₁, …, 𝑝ₙ}:
At even depth: partition based on the median 𝑥-coordinate. This partitions space using a vertical line.
At odd depth: partition based on the median 𝑦-coordinate. This partitions space using a horizontal line.
[Figure: splitting lines 𝑙₁, …, 𝑙₇ partitioning the points 𝑝₁, 𝑝₂, 𝑝₃, 𝑝₄, … in the plane, next to the corresponding tree with internal nodes 𝑙₁, …, 𝑙₇ and the points at its leaves.]
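The construction rule above can be sketched as follows. This is a simplified sketch under assumptions: the input is non-empty, all 𝑥- and 𝑦-coordinates are distinct, points are stored only at leaves, and points with coordinate ≤ the median go left. For brevity it re-sorts at every level, which costs 𝑂(𝑛 log² 𝑛) overall rather than the 𝑂(𝑛 log 𝑛) obtainable by presorting. The names `KdNode` and `build_kdtree` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Point = Tuple[float, float]


@dataclass
class KdNode:
    point: Optional[Point] = None     # set only at leaves
    axis: int = 0                     # 0: vertical split line (x), 1: horizontal (y)
    split: float = 0.0                # coordinate of the splitting line
    left: Optional["KdNode"] = None   # points with coordinate <= split
    right: Optional["KdNode"] = None  # points with coordinate > split


def build_kdtree(points: List[Point], depth: int = 0) -> KdNode:
    """Build a kd-tree on a non-empty list of points with distinct coordinates."""
    if len(points) == 1:
        return KdNode(point=points[0])
    axis = depth % 2                  # even depth: split on x; odd depth: split on y
    pts = sorted(points, key=lambda p: p[axis])
    mid = (len(pts) - 1) // 2         # index of the median point along this axis
    return KdNode(axis=axis,
                  split=pts[mid][axis],
                  left=build_kdtree(pts[:mid + 1], depth + 1),
                  right=build_kdtree(pts[mid + 1:], depth + 1))


root = build_kdtree([(1, 5), (2, 1), (3, 8), (4, 3)])
```

In this small example the root splits on the vertical line 𝑥 = 2, and its left child splits the points (1, 5) and (2, 1) on the horizontal line 𝑦 = 1.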
17
Kd-Trees – Resource Usage
Space: 𝑂(𝑛)
  Kd-trees are binary trees with 𝑛 leaves.
Construction time: 𝑂(𝑛 log 𝑛)
  Sort the input points by 𝑥-coordinate and by 𝑦-coordinate in 𝑂(𝑛 log 𝑛) time.
  Partition each sorted list into sublists of size ⌈𝑛/2⌉ and ⌊𝑛/2⌋.
  Amount of work: 𝑇(𝑛) = 2𝑇(𝑛/2) + 𝑂(𝑛).
  Solves to 𝑇(𝑛) = 𝑂(𝑛 log 𝑛).
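One way to see why the construction recurrence solves to 𝑂(𝑛 log 𝑛) is to unroll it. The derivation below is a sketch that assumes 𝑛 is a power of two and writes the 𝑂(𝑛) term as 𝑐𝑛 for some constant 𝑐 (both assumptions are for illustration only).

```latex
\begin{align*}
T(n) &= 2\,T(n/2) + cn \\
     &= 4\,T(n/4) + 2cn \\
     &= 2^{i}\,T\!\left(n/2^{i}\right) + i\,cn
        && \text{after } i \text{ levels of unrolling} \\
     &= n\,T(1) + cn\log_2 n
        && \text{taking } i = \log_2 n \\
     &= O(n \log n).
\end{align*}
```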
18
Kd-Tree Query Algorithm
Each internal node 𝑣 in the tree corresponds to a region region(𝑣) in the plane.
Query algorithm idea: Report all points in the subtree rooted at 𝑣 if region(𝑣) is contained in the query region.
[Figure: the plane partitioned by the splitting lines 𝑙₁, …, 𝑙₇, next to the corresponding tree; at each internal node, points ≤ the splitting line go left and points > go right.]
19
Kd-Tree Query Algorithm
SearchKdTree(𝑣, 𝑅)
Input: The root 𝑣 of (a subtree of) a Kd-tree, and an axis-orthogonal query rectangle 𝑅.
Output: All points at leaves in the tree rooted at 𝑣 that lie in 𝑅.
  If 𝑣 is a leaf:
    Then: Report the point stored in 𝑣 if it lies in 𝑅. Return.
  If region(lc(𝑣)) is contained in 𝑅:
    Then: ReportPoints(lc(𝑣)).
    Else, if region(lc(𝑣)) intersects 𝑅: SearchKdTree(lc(𝑣), 𝑅)
  If region(rc(𝑣)) is contained in 𝑅:
    Then: ReportPoints(rc(𝑣)).
    Else, if region(rc(𝑣)) intersects 𝑅: SearchKdTree(rc(𝑣), 𝑅)
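A runnable sketch of this query procedure is given below. It makes several assumptions that are not in the slides: rectangles are tuples (x_lo, x_hi, y_lo, y_hi), regions are computed on the fly during the descent rather than stored, each region is treated as a closed box (which is conservative but still correct), subtrees whose region misses 𝑅 are skipped, and the tree is built with a simple re-sorting construction that assumes distinct coordinates. All names (`KdNode`, `build_kdtree`, `search_kdtree`, and the helpers) are hypothetical.

```python
import math
from dataclasses import dataclass
from typing import List, Optional, Tuple

Point = Tuple[float, float]
Rect = Tuple[float, float, float, float]   # (x_lo, x_hi, y_lo, y_hi), closed


@dataclass
class KdNode:
    point: Optional[Point] = None           # set only at leaves
    axis: int = 0                           # 0: split on x, 1: split on y
    split: float = 0.0
    left: Optional["KdNode"] = None         # coordinate <= split
    right: Optional["KdNode"] = None        # coordinate > split


def build_kdtree(points: List[Point], depth: int = 0) -> KdNode:
    if len(points) == 1:
        return KdNode(point=points[0])
    axis = depth % 2
    pts = sorted(points, key=lambda p: p[axis])
    mid = (len(pts) - 1) // 2
    return KdNode(axis=axis, split=pts[mid][axis],
                  left=build_kdtree(pts[:mid + 1], depth + 1),
                  right=build_kdtree(pts[mid + 1:], depth + 1))


def contains(outer: Rect, inner: Rect) -> bool:
    return (outer[0] <= inner[0] and inner[1] <= outer[1]
            and outer[2] <= inner[2] and inner[3] <= outer[3])


def intersects(a: Rect, b: Rect) -> bool:
    return a[0] <= b[1] and b[0] <= a[1] and a[2] <= b[3] and b[2] <= a[3]


def child_regions(v: KdNode, region: Rect) -> Tuple[Rect, Rect]:
    """Split region(v) along v's splitting line into (left, right) child regions."""
    x_lo, x_hi, y_lo, y_hi = region
    if v.axis == 0:
        return (x_lo, v.split, y_lo, y_hi), (v.split, x_hi, y_lo, y_hi)
    return (x_lo, x_hi, y_lo, v.split), (x_lo, x_hi, v.split, y_hi)


def report_points(v: KdNode, out: List[Point]) -> None:
    """Append every point stored in the subtree rooted at v."""
    if v.point is not None:
        out.append(v.point)
    else:
        report_points(v.left, out)
        report_points(v.right, out)


def search_kdtree(v: KdNode, query: Rect, out: List[Point],
                  region: Rect = (-math.inf, math.inf, -math.inf, math.inf)) -> None:
    if v.point is not None:                 # leaf: test the single stored point
        px, py = v.point
        if query[0] <= px <= query[1] and query[2] <= py <= query[3]:
            out.append(v.point)
        return
    for child, reg in zip((v.left, v.right), child_regions(v, region)):
        if contains(query, reg):            # region fully inside R: report subtree
            report_points(child, out)
        elif intersects(query, reg):        # partial overlap: recurse
            search_kdtree(child, query, out, reg)


pts = [(25, 40000), (35, 60000), (45, 200000), (60, 150000), (50, 90000)]
hits: List[Point] = []
search_kdtree(build_kdtree(pts), (30, 60, 50000, 150000), hits)
```

Computing regions during the descent avoids storing a rectangle in every node; storing region(𝑣) explicitly at construction time would be an equally reasonable design.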
20
Kd-Tree Query Time
Want to analyze how many tree nodes 𝑣 we visit in a query. Three types of nodes:
1. Region(𝑣) does not intersect 𝑅. None of these are visited.
2. Region(𝑣) is completely contained in 𝑅. 𝑂(𝑘) of these are visited, where 𝑘 is the total number of reported points.
3. Region(𝑣) intersects 𝑅 but is not contained in 𝑅. The interesting case!
21
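Kd-Tree Query Time
For how many nodes 𝑣 does region(𝑣) intersect the boundary of 𝑅?
For how many nodes 𝑣 does region(𝑣) intersect the left boundary line of 𝑅?
A vertical line meets the region of only one child at a node that splits on 𝑥, and the regions of both children at a node that splits on 𝑦. So, two levels down, it meets at most 2 of the 4 grandchild regions, each holding about 𝑛/4 points.
Number of intersected regions is at most 𝑄(𝑛) = 2𝑄(𝑛/4) + 2.
Solves to 𝑄(𝑛) = 𝑂(√𝑛).
The other three boundary lines are similar.
Overall query time: 𝑂(√𝑛 + 𝑘).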
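The solution of the query recurrence can be checked by unrolling it. The derivation below is a sketch that assumes 𝑛 is a power of four and 𝑄(1) = 𝑂(1); both are simplifying assumptions for illustration.

```latex
\begin{align*}
Q(n) &= 2\,Q(n/4) + 2 \\
     &= 4\,Q(n/16) + 4 + 2 \\
     &= 2^{i}\,Q\!\left(n/4^{i}\right) + 2\left(2^{i} - 1\right)
        && \text{after } i \text{ levels of unrolling} \\
     &= \sqrt{n}\,Q(1) + 2\left(\sqrt{n} - 1\right)
        && \text{taking } i = \log_4 n,\ \text{so } 2^{i} = \sqrt{n} \\
     &= O\!\left(\sqrt{n}\right).
\end{align*}
```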
22
Kd-Tree Summary
Theorem: A kd-tree for a set of 𝑛 points in the plane has the following resource requirements:
Space: 𝑂(𝑛)
Construction time: 𝑂(𝑛 log 𝑛)
Query time: 𝑂(√𝑛 + 𝑘) for axis-orthogonal rectangle queries which report 𝑘 points.
23
Kd-Trees: Lingering Questions
How do kd-trees generalize to higher dimensions? What is the query time in higher dimensions? Is there a data structure which allows for faster queries?