KD Tree A binary search tree where every node is a

Slides:



Advertisements
Similar presentations
Introduction to Algorithms
Advertisements

Nearest Neighbor Finding Using Kd-tree Ref: Andrew Moore’s PhD thesis (1991)Andrew Moore’s PhD thesis.
Nearest Neighbor Queries using R-trees
Augmenting Data Structures Advanced Algorithms & Data Structures Lecture Theme 07 – Part I Prof. Dr. Th. Ottmann Summer Semester 2006.
Multidimensional Indexing
Searching on Multi-Dimensional Data
Binary Search Trees Azhar Maqsood School of Electrical Engineering and Computer Sciences (SEECS-NUST)
AA Trees another alternative to AVL trees. Balanced Binary Search Trees A Binary Search Tree (BST) of N nodes is balanced if height is in O(log N) A balanced.
Multidimensional Data. Many applications of databases are "geographic" = 2­dimensional data. Others involve large numbers of dimensions. Example: data.
2-dimensional indexing structure
Multiple-key indexes Index on one attribute provides pointer to an index on the other. If V is a value of the first attribute, then the index we reach.
I/O-Algorithms Lars Arge University of Aarhus March 1, 2005.
B + -Trees (Part 1). Motivation AVL tree with N nodes is an excellent data structure for searching, indexing, etc. –The Big-Oh analysis shows most operations.
B + -Trees (Part 2) Lecture 21 COMP171 Fall 2006.
R-Trees 2-dimensional indexing structure. R-trees 2-dimensional version of the B-tree: B-tree of maximum degree 8; degree between 3 and 8 Internal nodes.
AALG, lecture 11, © Simonas Šaltenis, Range Searching in 2D Main goals of the lecture: to understand and to be able to analyze the kd-trees and.
Multimedia Databases Chapter 4.
Binary Trees Chapter 6.
CPSC 335 BTrees Dr. Marina Gavrilova Computer Science University of Calgary Canada.
Tree.
UNC Chapel Hill M. C. Lin Orthogonal Range Searching Reading: Chapter 5 of the Textbook Driving Applications –Querying a Database Related Application –Crystal.
Lecture 10 Trees –Definiton of trees –Uses of trees –Operations on a tree.
1 2-D Trees You are given a set of points on the plane –Each point is defined by two coordinates (x, y) (5,45) (25,35) (35,40) (50,10) (90,5) (85,15) (80,65)
B-trees and kd-trees Piotr Indyk (slides partially by Lars Arge from Duke U)
Tree (new ADT) Terminology:  A tree is a collection of elements (nodes)  Each node may have 0 or more successors (called children)  How many does a.
Multi-dimensional Search Trees
B + -Trees. Motivation An AVL tree with N nodes is an excellent data structure for searching, indexing, etc. The Big-Oh analysis shows that most operations.
2IL50 Data Structures Fall 2015 Lecture 9: Range Searching.
Nearest Neighbor Queries Chris Buzzerd, Dave Boerner, and Kevin Stewart.
Segment Trees Basic data structure in computational geometry. Computational geometry.  Computations with geometric objects.  Points in 1-, 2-, 3-, d-space.
CMSC 341 K-D Trees. 8/3/2007 UMBC CSMC 341 KDTrees 2 K-D Tree Introduction  Multiple dimensional data Range queries in databases of multiple keys: Ex.
Multi-dimensional Search Trees CS302 Data Structures Modified from Dr George Bebis.
UNC Chapel Hill M. C. Lin Computing Voronoi Diagram For each site p i, compute the common inter- section of the half-planes h(p i, p j ) for i  j, using.
Copyright © 2012 Pearson Education, Inc. Chapter 20: Binary Trees.
Copyright © 2015, 2012, 2009 Pearson Education, Inc., Publishing as Addison-Wesley All rights reserved. Chapter 20: Binary Trees.
Copyright © 2009 Pearson Education, Inc. Publishing as Pearson Addison-Wesley Chapter 20: Binary Trees.
1 Spatial Query Processing using the R-tree Donghui Zhang CCIS, Northeastern University Feb 8, 2005.
CSE 373 Data Structures Lecture 7
School of Computing Clemson University Fall, 2012
CC 215 Data Structures Trees
AA Trees.
Search trees, k-d tree p ?/ x+y Hi! x--y To read 2<1
CMPS 3130/6130 Computational Geometry Spring 2017
Binary search tree. Removing a node
Extra: B+ Trees CS1: Java Programming Colorado State University
Spatial data structures -kdtrees
Spatial Indexing I Point Access Methods.
CSE 373 Data Structures Lecture 7
Tonga Institute of Higher Education
CMPS 3130/6130 Computational Geometry Spring 2017
Chapter 20: Binary Trees.
Advanced Topics in Data Management
Orthogonal Range Searching and Kd-Trees
Chapter 21: Binary Trees.
Trees CSE 373 Data Structures.
CMSC 341 K-D Trees.
Quadtrees 1.
Search trees, k-d tree p ?/ x+y Hi! x--y To read 2<1
Shmuel Wimer Bar Ilan Univ., School of Engineering
CMSC 341 K-D Trees.
CMSC 341 K-D Trees.
Chapter 20: Binary Trees.
Shape-based Registration
CMSC 341 K-D Trees.
CMPS 3130/6130 Computational Geometry Spring 2017
CMSC 341 K-D Trees.
Trees CSE 373 Data Structures.
Trees CSE 373 Data Structures.
CMSC 341 K-D Trees.
Data Mining CSCI 307, Spring 2019 Lecture 23
Presentation transcript:

KD Tree A binary search tree where every node is a k-dimensional point. Example: k=2 53, 14 27, 28 65, 51 31, 85 30, 11 70, 3 99, 90 29, 16 40, 26 7, 39 32, 29 82, 64 73, 75 15, 61 38, 23 55,62

KD Tree (cont’d) Example: data stored at the leaves

KD Tree (cont’d) Every node (except leaves) represents a hyperplane that divides the space into two parts. Points to the left (right) of this hyperplane represent the left (right) sub-tree of that node. Pleft Pright

KD Tree (cont’d) As we move down the tree, we divide the space along alternating (but not always) axis-aligned hyperplanes: Split by x-coordinate: split by a vertical line that has (ideally) half the points left or on, and half right. Split by y-coordinate: split by a horizontal line that has (ideally) half the points below or on and half above.

KD Tree - Example Split by x-coordinate: split by a vertical line that has approximately half the points left or on, and half right. x

KD Tree - Example Split by y-coordinate: split by a horizontal line that has half the points below or on and half above. x y y

KD Tree - Example Split by x-coordinate: split by a vertical line that has half the points left or on, and half right. x y y x x x x

KD Tree - Example Split by y-coordinate: split by a horizontal line that has half the points below or on and half above. x y y x x x x y y

Node Structure A KD-tree node has 5 fields Splitting axis Splitting value Data Left pointer Right pointer

Splitting Strategies Divide based on order of point insertion Assumes that points are given one at a time. Divide by finding median Assumes all the points are available ahead of time. Divide perpendicular to the axis with widest spread Split axes might not alternate … and more!

Example – using order of point insertion (data stored at nodes

Example – using median (data stored at the leaves)

Example – using median (data stored at the leaves)

Example – using median (data stored at the leaves)

Example – using median (data stored at the leaves)

Example – using median (data stored at the leaves)

Example – using median (data stored at the leaves)

Example – using median (data stored at the leaves)

Example – using median (data stored at the leaves)

Example – using median (data stored at the leaves)

Another Example – using median

Another Example - using median

Another Example - using median

Another Example - using median

Another Example - using median

Another Example - using median

Another Example - using median

Example – split perpendicular to the axis with widest spread

KD Tree (cont’d) Let’s discuss Insert Delete Search

Insert new data Insert (55, 62) x y x y 53, 14 65, 51 27, 28 70, 3 55 > 53, move right Insert (55, 62) 53, 14 x 62 > 51, move right 65, 51 y 27, 28 x 70, 3 99, 90 30, 11 31, 85 55 < 99, move left 40, 26 7, 39 32, 29 82, 64 29, 16 y 38, 23 55,62 73, 75 15, 61 62 < 64, move left Null pointer, attach 30

Delete data Finding the replacement r = (c, d) Suppose we need to remove p = (a, b) Find node t which contains p If t is a leaf node, replace it by null Otherwise, find a replacement node r = (c, d) – see below! Replace (a, b) by (c, d) Remove r Finding the replacement r = (c, d) If t has a right child, use the successor* Otherwise, use node with minimum value* in the left subtree *(depending on what axis the node discriminates)

Delete data (cont’d)

Delete data (cont’d)

KD Tree – Exact Search

KD Tree – Exact Search

KD Tree – Exact Search

KD Tree – Exact Search

KD Tree – Exact Search

KD Tree – Exact Search

KD Tree – Exact Search

KD Tree – Exact Search

KD Tree – Exact Search

KD Tree – Exact Search

KD Tree – Exact Search

KD Tree - Range Search low[0] = 35, high[0] = 40; x Range:[l,r] [35, 40] x [23, 30] In range? If so, print cell low[level]<=data[level]  search t.left high[level] >= data[level]  search t.right x 53, 14 65, 51 70, 3 99, 90 82, 64 73, 75 y 27, 28 x 30, 11 31, 85 7, 39 15, 61 y 29, 16 40, 26 32, 29 x 38, 23 low[0] = 35, high[0] = 40; This sub-tree is never searched. low[1] = 23, high[1] = 30; Searching is “preorder”. Efficiency is obtained by “pruning” subtrees from the search.

KD Tree - Range Search Consider a KD Tree where the data is stored at the leaves, how do we perform range search?

KD Tree – Region of a node The region region(v) corresponding to a node v is a rectangle, which is bounded by splitting lines stored at ancestors of v.

KD Tree - Region of a node (cont’d) A point is stored in the subtree rooted at node v if and only if it lies in region(v).

KD Trees – Range Search Need only search nodes whose region intersects query region. Report all points in subtrees whose regions are entirely contained in query range. If a region is partially contained in the query range check points. query range

Example – Range Search Query region: gray rectangle Gray nodes are the nodes visited in this example.

Example – Range Search * Node marked with * corresponds to a region that is entirely inside the query rectangle Report all leaves in this subtree.

Example – Range Search All other nodes visited (i.e., gray) correspond to regions that are only partially inside the query rectangle. - Report points p6 and p11 only - Do not report points p3, p12 and p13

Nearest Neighbor with KD Trees Traverse the tree, looking for the rectangle that contains the query.

Nearest Neighbor with KD Trees Explore the branch of the tree that is closest to the query point first.

Nearest Neighbor with KD Trees Explore the branch of the tree that is closest to the query point first.

Nearest Neighbor with KD Trees When we reach a leaf, compute the distance to each point in the node.

Nearest Neighbor with KD Trees When we reach a leaf, compute the distance to each point in the node.

Nearest Neighbor with KD Trees Then, backtrack and try the other branch at each node visited.

Nearest Neighbor with KD Trees Each time a new closest node is found, we can update the distance bounds.

Nearest Neighbor with KD Trees Each time a new closest node is found, we can update the distance bounds.

Nearest Neighbor with KD Trees Using the distance bounds and the bounds of the data below each node, we can prune parts of the tree that could NOT include the nearest neighbor.

Nearest Neighbor with KD Trees Using the distance bounds and the bounds of the data below each node, we can prune parts of the tree that could NOT include the nearest neighbor.

Nearest Neighbor with KD Trees Using the distance bounds and the bounds of the data below each node, we can prune parts of the tree that could NOT include the nearest neighbor.

K-Nearest Neighbor Search Can find the k-nearest neighbors to a query by maintaining the k current bests instead of just one. Branches are only eliminated when they can't have points closer than any of the k current bests.

NN example using kD trees d=1 (binary search tree) 5 20 7 8 10 12 13 15 18 7,8,10,12 13,15,18 13,15 18 7,8 10,12 7, 8 10, 12 13, 15 18

NN example using kD trees (cont’d) d=1 (binary search tree) 5 20 7 8 10 12 13 15 18 query 17 7,8,10,12 13,15,18 13,15 18 7,8 10,12 min dist = 1 7, 8 10, 12 13, 15 18

NN example using kD trees (cont’d) d=1 (binary search tree) 5 20 7 8 10 12 13 15 18 query 16 7,8,10,12 13,15,18 13,15 18 7,8 10,12 min dist = 2 min dist = 1 7, 8 10, 12 13, 15 18