Orthogonal Range Searching and Kd-Trees

Orthogonal Range Searching and Kd-Trees Computational Geometry (EECS 396/496) – October 4th, 2017

Orthogonal Range Searching – Motivation
Given a database of people, we want to report everyone who is between 30 and 60 years old and earns between $50,000 and $150,000 a year. This is an orthogonal range query: [30, 60] × [50000, 150000].

The Orthogonal Range Searching Problem
Input: A collection 𝑃 of 𝑛 points 𝑝₁, …, 𝑝ₙ ∈ ℝ² with 𝑝ᵢ = (𝑥ᵢ, 𝑦ᵢ).
Goal: Preprocess the points (that is, build a data structure) so that orthogonal range queries can be answered efficiently.
Orthogonal range query: For a query rectangle [𝑥, 𝑥′] × [𝑦, 𝑦′], report all points 𝑝ᵢ ∈ 𝑃 with 𝑥 ≤ 𝑥ᵢ ≤ 𝑥′ and 𝑦 ≤ 𝑦ᵢ ≤ 𝑦′. In other words, report all points in an axis-aligned query rectangle.
The problem generalizes naturally to 𝑑 ≥ 1 dimensions.
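To make the query semantics concrete, here is a minimal brute-force sketch that answers a query with no preprocessing by scanning all 𝑛 points. The function name and the sample (age, salary) records are hypothetical and chosen only to echo the motivating example; the data structures in the rest of the lecture aim to beat this linear scan.

```python
def range_query_bruteforce(points, x_lo, x_hi, y_lo, y_hi):
    """Report every point in the closed rectangle [x_lo, x_hi] x [y_lo, y_hi]."""
    return [(px, py) for (px, py) in points
            if x_lo <= px <= x_hi and y_lo <= py <= y_hi]


# Hypothetical (age, salary) records, not taken from the slides.
people = [(25, 40000), (35, 60000), (45, 200000), (58, 150000), (61, 90000)]
print(range_query_bruteforce(people, 30, 60, 50000, 150000))
# prints [(35, 60000), (58, 150000)]
```

Both boundaries are inclusive, matching the closed intervals in the problem statement.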

Simplification: 1-Dimensional Range Searching
Input: A collection 𝑃 of 𝑛 values 𝑝₁, …, 𝑝ₙ ∈ ℝ.
Goal: Preprocess the values (that is, build a data structure) so that range queries can be answered efficiently.
Range query: For a query range [𝑥, 𝑥′], report all values 𝑝ᵢ ∈ 𝑃 with 𝑥 ≤ 𝑝ᵢ ≤ 𝑥′. In other words, report all values in a closed interval.
Q: What data structure should we use?
A: Balanced binary search trees (BBSTs).

1D Range Searching in BBSTs
Example query: [19, 69]
Search algorithm on query [𝑥, 𝑥′] in tree 𝑇:
1. Search for 𝑥 and 𝑥′ in 𝑇, recording the paths to the resulting leaves 𝑢 and 𝑢′.
2. Check whether the values at 𝑢 and 𝑢′ are in [𝑥, 𝑥′]. If so, report them.
3. Note the split node, where the two paths diverge.
4. Report all values at leaves of right subtrees hanging off the path from the split node to 𝑢.
5. Report all values at leaves of left subtrees hanging off the path from the split node to 𝑢′.
[Figure: a balanced binary search tree with root 49 (the split node for this query) storing the values 3, 19, 30, 49, 59, 62, 70, 89, 99 at its leaves; edges are labeled ≤ (left) and > (right), and the search paths end at leaves 𝑢 and 𝑢′.]
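The search steps can be sketched in Python as follows. This is an illustrative sketch, not the lecture's code: the class and function names are hypothetical, values are assumed to be distinct and stored at the leaves, each internal key is the largest value in its left subtree, and the tree built here need not have the same shape as the one in the slide figure.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key = key        # split value (internal node) or stored value (leaf)
        self.left = left
        self.right = right

    def is_leaf(self):
        return self.left is None and self.right is None


def build(sorted_vals):
    """Build a balanced leaf-based BST from a non-empty sorted list."""
    if len(sorted_vals) == 1:
        return Node(sorted_vals[0])
    mid = (len(sorted_vals) + 1) // 2
    return Node(sorted_vals[mid - 1], build(sorted_vals[:mid]), build(sorted_vals[mid:]))


def report_subtree(node, out):
    """Append every value stored at a leaf below node."""
    if node.is_leaf():
        out.append(node.key)
    else:
        report_subtree(node.left, out)
        report_subtree(node.right, out)


def range_query_1d(root, lo, hi):
    """Return all stored values in [lo, hi], in no particular order."""
    out = []
    # Descend while the searches for lo and hi follow the same path.
    v = root
    while not v.is_leaf() and (hi <= v.key or lo > v.key):
        v = v.left if hi <= v.key else v.right
    if v.is_leaf():
        if lo <= v.key <= hi:
            out.append(v.key)
        return out
    # v is the split node. Follow the path to lo: each left turn means
    # the whole right subtree lies inside the range.
    u = v.left
    while not u.is_leaf():
        if lo <= u.key:
            report_subtree(u.right, out)
            u = u.left
        else:
            u = u.right
    if lo <= u.key <= hi:
        out.append(u.key)
    # Follow the path to hi: each right turn means the whole left
    # subtree lies inside the range.
    u = v.right
    while not u.is_leaf():
        if hi > u.key:
            report_subtree(u.left, out)
            u = u.right
        else:
            u = u.left
    if lo <= u.key <= hi:
        out.append(u.key)
    return out


root = build([3, 19, 30, 49, 59, 62, 70, 89, 99])
print(sorted(range_query_1d(root, 19, 69)))  # prints [19, 30, 49, 59, 62]
```

The result is sorted only for display; the query itself does no sorting, so it stays within the search-plus-reporting cost.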

Analysis of 1D Range Searching – Correctness
Claim: Our 1D range searching algorithm reports exactly the values in the query range [𝑥, 𝑥′].
Proof, part I: Any reported value 𝑝 lies in the query range.
If 𝑝 is stored at 𝑢 or 𝑢′, it was checked explicitly. Otherwise, assume it was reported as a leaf of the right subtree of a node 𝑣 on the path from 𝑣_split to 𝑢. Then 𝑣.𝑥 < 𝑝 ≤ 𝑣_split.𝑥. The search for 𝑥 turned left at 𝑣, so 𝑥 ≤ 𝑣.𝑥, and the search for 𝑥′ turned right at 𝑣_split, so 𝑣_split.𝑥 < 𝑥′. Together, 𝑥 ≤ 𝑣.𝑥 < 𝑝 ≤ 𝑣_split.𝑥 < 𝑥′. The analysis is symmetric if 𝑝 was reported in a left subtree.
[Figure: the example tree with the split node 𝑣_split and a candidate node 𝑣 on the path to 𝑢 marked.]

Analysis of 1D Range Searching via BBSTs – Correctness
Claim: Our 1D range searching algorithm reports exactly the values in the query range [𝑥, 𝑥′].
Proof, part II: Every value 𝑝 that lies in the range is reported.
Let 𝑤 be the leaf storing 𝑝, and let 𝑣 be the lowest visited ancestor of 𝑤 (possibly 𝑤 itself). Assume for contradiction that 𝑣 ≠ 𝑤. If 𝑣 is on the path to 𝑢 but not on the path to 𝑢′, then the search for 𝑥 must have turned right at 𝑣: had it turned left, the right subtree of 𝑣 would have been reported in full and the left child of 𝑣 would be on the search path, so 𝑤 would have a visited ancestor below 𝑣. Hence 𝑤 lies in the unvisited left subtree of 𝑣, which means 𝑝 ≤ 𝑣.𝑥 < 𝑥, contradicting 𝑝 ∈ [𝑥, 𝑥′]. The argument is analogous if 𝑣 is on the path to 𝑢′, or on both paths.

Analysis of 1D Range Searching via BBSTs – Resource Usage
Space usage: 𝑂(𝑛)
Preprocessing time: 𝑂(𝑛 log 𝑛)
Query time: 𝑂(log 𝑛 + 𝑘), where 𝑘 is the number of reported values.
The searches for 𝑢 and 𝑢′ take 𝑂(log 𝑛) time. Reporting all values in a subtree takes time linear in the number of values it stores.
The query time is 𝑂(𝑛) in the worst case, which occurs when 𝑘 = Θ(𝑛).

Generalizing to Higher Dimensions
Two generalizations of the 1-dimensional BBST:
Kd-trees – this lecture.
Range trees – next lecture.
Focus on the 2-dimensional case. The 𝑑-dimensional case for 𝑑 ≥ 3 is similar.
For now, assume that all points have distinct 𝑥- and 𝑦-coordinates.

Kd-Trees
Observation: 1-dimensional BBSTs recursively partition the points into two sets of roughly equal size based on their 𝑥-coordinate.
How do we generalize this to 2-dimensional points?
Idea: Alternate between partitioning on the 𝑥-coordinate and on the 𝑦-coordinate. This is the key idea behind kd-trees.

Kd-Trees
Kd-tree construction on 𝑃 = {𝑝₁, …, 𝑝ₙ}:
At even depth: partition based on the median 𝑥-coordinate. This partitions space using a vertical line.
At odd depth: partition based on the median 𝑦-coordinate. This partitions space using a horizontal line.
[Figure: a planar point set partitioned by splitting lines ℓ₁, …, ℓ₇, next to the corresponding kd-tree with internal nodes ℓ₁, …, ℓ₇ and leaves 𝑝₁, 𝑝₂, 𝑝₃, 𝑝₄, ….]
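The alternating median split can be sketched as below. The names `KdNode`, `build_kdtree`, and `leaves` are hypothetical, points are assumed to be (x, y) tuples with distinct coordinates, and the input is assumed non-empty. For brevity this sketch re-sorts the points at every level, which costs 𝑂(𝑛 log² 𝑛) overall rather than the 𝑂(𝑛 log 𝑛) obtained by presorting.

```python
class KdNode:
    def __init__(self, point=None, axis=None, split=None, left=None, right=None):
        self.point = point    # set only at leaves
        self.axis = axis      # 0 = split on x (vertical line), 1 = split on y
        self.split = split    # coordinate of the splitting line
        self.left = left      # points with coordinate <= split
        self.right = right    # points with coordinate > split


def build_kdtree(points, depth=0):
    """Build a kd-tree with one point per leaf from a non-empty list."""
    if len(points) == 1:
        return KdNode(point=points[0])
    axis = depth % 2                       # even depth: x, odd depth: y
    pts = sorted(points, key=lambda p: p[axis])
    mid = (len(pts) + 1) // 2              # the median goes to the left half
    return KdNode(axis=axis, split=pts[mid - 1][axis],
                  left=build_kdtree(pts[:mid], depth + 1),
                  right=build_kdtree(pts[mid:], depth + 1))


def leaves(node):
    """Collect the points stored at the leaves below node."""
    if node.point is not None:
        return [node.point]
    return leaves(node.left) + leaves(node.right)


points = [(2, 3), (5, 4), (9, 6), (4, 7), (8, 1), (7, 2)]
root = build_kdtree(points)
print(root.axis, root.split)  # prints 0 5
```

In this example the root splits on 𝑥 at the median value 5, and its left child splits the three points with 𝑥 ≤ 5 on 𝑦.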

Kd-Trees – Resource Usage
Space: 𝑂(𝑛). Kd-trees are binary trees with 𝑛 leaves.
Construction time: 𝑂(𝑛 log 𝑛).
Sort the input points by 𝑥-coordinate and by 𝑦-coordinate in 𝑂(𝑛 log 𝑛) time.
Partition each sorted list into sublists of sizes ⌈𝑛/2⌉ and ⌊𝑛/2⌋.
Amount of work: 𝑇(𝑛) = 2𝑇(⌈𝑛/2⌉) + 𝑂(𝑛), which solves to 𝑇(𝑛) = 𝑂(𝑛 log 𝑛).
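For readers who want to see why the construction recurrence solves to 𝑂(𝑛 log 𝑛), here is a short unrolling. It is a sketch under simplifying assumptions that are not on the slide: 𝑛 is a power of 2, and the 𝑂(𝑛) term is written as 𝑐𝑛 for some constant 𝑐.

```latex
\begin{align*}
T(n) &= 2\,T(n/2) + cn \\
     &= 4\,T(n/4) + 2cn \\
     &= 2^{i}\,T\!\left(n/2^{i}\right) + i\,cn
        && \text{after } i \text{ levels} \\
     &= n\,T(1) + cn\log_2 n
        && \text{at } i = \log_2 n \\
     &= O(n \log n).
\end{align*}
```

Each of the log₂ 𝑛 levels of the recursion does 𝑐𝑛 work in total, which is where the logarithmic factor comes from.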

Kd-Tree Query Algorithm
Each internal node 𝑣 in the tree corresponds to a region region(𝑣) in the plane.
Query algorithm idea: Report all points in the subtree rooted at 𝑣 if region(𝑣) is contained in the query region.
[Figure: the kd-tree with splitting lines ℓ₁, …, ℓ₇; edges to left and right children are labeled ≤ and >.]

Kd-Tree Query Algorithm
SearchKdTree(𝑣, 𝑅)
Input: The root 𝑣 of (a subtree of) a kd-tree, and an axis-orthogonal query rectangle 𝑅.
Output: All points at leaves in the tree rooted at 𝑣 that lie in 𝑅.
If 𝑣 is a leaf:
    Report the point stored in 𝑣 if it lies in 𝑅.
    Return.
If region(lc(𝑣)) is contained in 𝑅:
    ReportPoints(lc(𝑣)).
Else if region(lc(𝑣)) intersects 𝑅:
    SearchKdTree(lc(𝑣), 𝑅).
If region(rc(𝑣)) is contained in 𝑅:
    ReportPoints(rc(𝑣)).
Else if region(rc(𝑣)) intersects 𝑅:
    SearchKdTree(rc(𝑣), 𝑅).
Here lc(𝑣) and rc(𝑣) denote the left and right child of 𝑣, and ReportPoints reports every point stored in a subtree. A child whose region does not intersect 𝑅 is skipped entirely.
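A runnable sketch of SearchKdTree follows. All Python names are hypothetical. Rectangles and regions are represented as tuples (x_lo, x_hi, y_lo, y_hi), the region of each node is derived on the way down instead of being stored, and a compact median-split construction is included only so the block runs on its own. Points are assumed to have distinct coordinates.

```python
import math


class KdNode:
    def __init__(self, point=None, axis=None, split=None, left=None, right=None):
        self.point, self.axis, self.split = point, axis, split
        self.left, self.right = left, right


def build(points, depth=0):
    """Compact median-split construction (re-sorts at every level)."""
    if len(points) == 1:
        return KdNode(point=points[0])
    axis = depth % 2
    pts = sorted(points, key=lambda p: p[axis])
    mid = (len(pts) + 1) // 2
    return KdNode(axis=axis, split=pts[mid - 1][axis],
                  left=build(pts[:mid], depth + 1),
                  right=build(pts[mid:], depth + 1))


def contains(R, reg):
    """True if region reg lies entirely inside rectangle R."""
    return R[0] <= reg[0] and reg[1] <= R[1] and R[2] <= reg[2] and reg[3] <= R[3]


def intersects(R, reg):
    return reg[0] <= R[1] and R[0] <= reg[1] and reg[2] <= R[3] and R[2] <= reg[3]


def child_regions(reg, axis, split):
    left, right = list(reg), list(reg)
    left[2 * axis + 1] = split    # left child: coordinate <= split
    right[2 * axis] = split       # right child: coordinate > split; the closed
    return tuple(left), tuple(right)  # bound is a safe over-approximation


def report_points(v, out):
    if v.point is not None:
        out.append(v.point)
    else:
        report_points(v.left, out)
        report_points(v.right, out)


def search_kdtree(v, R, reg=(-math.inf, math.inf, -math.inf, math.inf), out=None):
    if out is None:
        out = []
    if v.point is not None:                       # leaf: test the point itself
        x, y = v.point
        if R[0] <= x <= R[1] and R[2] <= y <= R[3]:
            out.append(v.point)
        return out
    for child, creg in zip((v.left, v.right), child_regions(reg, v.axis, v.split)):
        if contains(R, creg):
            report_points(child, out)             # whole subtree is inside R
        elif intersects(R, creg):
            search_kdtree(child, R, creg, out)    # partial overlap: recurse
    return out


points = [(2, 3), (5, 4), (9, 6), (4, 7), (8, 1), (7, 2)]
print(sorted(search_kdtree(build(points), (4, 8, 1, 5))))
# prints [(5, 4), (7, 2), (8, 1)]
```

Treating the right child's region as closed at the splitting line can only cause an extra visit, never a missed or wrongly reported point, because leaves are always tested directly and a region that passes the containment test contains only points inside 𝑅.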

Kd-Tree Query Time
We want to analyze how many tree nodes 𝑣 we visit in a query. There are three types of nodes:
1. Region(𝑣) does not intersect 𝑅. None of these are visited.
2. Region(𝑣) is completely contained in 𝑅. 𝑂(𝑘) of these are visited, where 𝑘 is the total number of reported points.
3. Region(𝑣) intersects 𝑅 but is not contained in 𝑅. The interesting case!

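Kd-Tree Query Time
For how many nodes 𝑣 does region(𝑣) intersect the boundary of 𝑅? It suffices to bound the number of nodes 𝑣 for which region(𝑣) intersects the line through the left boundary of 𝑅.
A vertical line lies on one side of each vertical splitting line but can cross both sides of a horizontal one, so over two levels of the tree it intersects the regions of at most two of the four grandchildren, each holding about 𝑛/4 points. The number of intersected regions is therefore at most 𝑄(𝑛) = 2𝑄(𝑛/4) + 2, which solves to 𝑄(𝑛) = 𝑂(√𝑛).
The other three boundary lines are similar.
Overall query time: 𝑂(√𝑛 + 𝑘).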
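The square-root bound can be checked by unrolling the recurrence. This is a sketch under assumptions not stated on the slide: 𝑛 is a power of 4 and the base case is 𝑄(1) = 1.

```latex
\begin{align*}
Q(n) &= 2\,Q(n/4) + 2 \\
     &= 4\,Q(n/16) + 2\cdot 2 + 2 \\
     &= 2^{i}\,Q\!\left(n/4^{i}\right) + 2\left(2^{i} - 1\right)
        && \text{after } i \text{ steps} \\
     &= \sqrt{n}\,Q(1) + 2\left(\sqrt{n} - 1\right)
        && \text{at } i = \log_4 n,\ \text{since } 2^{\log_4 n} = \sqrt{n} \\
     &= O\!\left(\sqrt{n}\right).
\end{align*}
```

The problem size shrinks by a factor of 4 per step while the number of subproblems only doubles, which is why the count grows like √𝑛 and not like 𝑛.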

Kd-Tree Summary
Theorem: A kd-tree for a set of 𝑛 points in the plane has the following resource requirements:
Space: 𝑂(𝑛)
Construction time: 𝑂(𝑛 log 𝑛)
Query time: 𝑂(√𝑛 + 𝑘) for axis-orthogonal rectangle queries which report 𝑘 points.

Kd-Trees: Lingering Questions
How do kd-trees generalize to higher dimensions?
What is the query time in higher dimensions?
Is there a data structure which allows for faster queries?