Trees for spatial data representation and searching

Presentation transcript:

CPSC 335: Trees for spatial data representation and searching

Overview. Spatial data structures: interval trees, K-d trees, grids and grid files using B-trees, R-trees.

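Trees. BST (binary search trees): search is O(n) in the worst case. AVL and IPR trees: balanced, so search is O(log n). B-trees: used for indexing and searching in databases; they grow upward from the leaf level and are more compact, which gives faster search. B+ and B* trees: used for indexing; B+ trees store data in the leaves, and B* trees keep nodes more full.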

Spatial data applications: GIS (Geographic Information Systems), CAD (Computer-Aided Design), VLSI (Very Large Scale Integration, IBM), robotics, image processing.

Spatial objects: point, segment, line, circle, sphere, polygon, polyhedron. Objects may be convex or concave, simple or non-simple, with holes or without holes.

Operations on spatial objects: they can be stored, displayed, manipulated, and queried.

Examples of applications

Data collection: from GPS (Global Positioning Systems): BMP, GIF, JPEG, etc.; from existing maps: geometric (vector) representation; from experiments (physical, biological, mechanical): attributes; generated for experiments: data files (text, images).

Operations on spatial data: spatial queries, point location, and stabbing queries (report all intervals or polygons that contain a given point).

Spatial queries (2D). Point query: find an object containing a point (for example, find the Voronoi region containing a point). Window query: find an object overlapping a rectangle. Spatial join: join parts of objects satisfying some relationship (intersection, adjacency, containment).
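
The three query types above can be sketched on axis-aligned rectangles. This is a brute-force illustration only, with no index behind it; the names `Rect`, `point_query`, `window_query`, and `spatial_join` are made up for this example, and the join relationship is assumed to be intersection.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    """Axis-aligned rectangle; assumes xmin <= xmax and ymin <= ymax."""
    xmin: float
    ymin: float
    xmax: float
    ymax: float

    def contains_point(self, x, y):
        return self.xmin <= x <= self.xmax and self.ymin <= y <= self.ymax

    def overlaps(self, other):
        # Rectangles that only touch along an edge count as overlapping here.
        return (self.xmin <= other.xmax and other.xmin <= self.xmax and
                self.ymin <= other.ymax and other.ymin <= self.ymax)


def point_query(objects, x, y):
    """Names of all objects whose rectangle contains the point (x, y)."""
    return [name for name, r in objects.items() if r.contains_point(x, y)]


def window_query(objects, window):
    """Names of all objects whose rectangle overlaps the query window."""
    return [name for name, r in objects.items() if r.overlaps(window)]


def spatial_join(left, right):
    """Pairs (a, b) with a from `left`, b from `right`, that intersect."""
    return [(a, b) for a, ra in left.items()
            for b, rb in right.items() if ra.overlaps(rb)]
```

Every call here scans all objects, which is the cost the tree and grid structures in the following slides are meant to avoid.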

Interval trees: a geometric, 1-dimensional tree. An interval is defined by (x1, x2). The range is split at the middle (5), then each half is split at its middle (3, 7), and again (2, 8). All intervals intersecting a middle point are stored, sorted, at the corresponding node. Example (figure): the intervals (4,6), (4,8), (6,9), (2,4), and (7.5,8.5) on the axis from 1 to 9.

Interval trees: intervals are located by comparing their endpoints x1, x2 against the nodes. To find the intervals containing a specific value, search from the root. Intervals within each node of the tree are sorted according to their coordinates. The cost of the "stabbing query" (finding all intervals containing the specified value) is O(log n + k), where k is the number of reported intervals.

Construction: we start by taking the entire range of all the intervals and dividing it in half at x_center (in practice, x_center could be picked as the median to keep the tree relatively balanced). This gives three sets of intervals: those completely to the left of x_center, which we'll call S_left; those completely to the right of x_center, which we'll call S_right; and those overlapping x_center, which we'll call S_center. The intervals in S_left and S_right are recursively divided in the same manner until there are no intervals left. The intervals in S_center, which overlap the center point, are stored in a separate data structure linked to the node in the interval tree.

Resulting tree data structure: the result is a binary tree with each node storing (1) a center point; (2) a pointer to another node containing all intervals completely to the left of the center point; (3) a pointer to another node containing all intervals completely to the right of the center point; (4) all intervals overlapping the center point, sorted by their beginning point; (5) all intervals overlapping the center point, sorted by their ending point.
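
A minimal sketch of the construction and of the stabbing query follows, assuming closed intervals given as `(start, end)` tuples and taking the center as the median of the endpoints. The names `IntervalNode`, `build`, and `stab` are illustrative, not from the slides.

```python
class IntervalNode:
    def __init__(self, center, by_start, by_end, left, right):
        self.center = center
        self.by_start = by_start  # intervals overlapping center, ascending start
        self.by_end = by_end      # the same intervals, descending end
        self.left = left          # subtree of intervals entirely left of center
        self.right = right        # subtree of intervals entirely right of center


def build(intervals):
    """Build an interval tree from closed intervals (start, end), start <= end."""
    if not intervals:
        return None
    endpoints = sorted(p for iv in intervals for p in iv)
    center = endpoints[len(endpoints) // 2]  # median of the 2n endpoints
    s_left = [iv for iv in intervals if iv[1] < center]
    s_right = [iv for iv in intervals if iv[0] > center]
    s_center = [iv for iv in intervals if iv[0] <= center <= iv[1]]
    return IntervalNode(
        center,
        sorted(s_center, key=lambda iv: iv[0]),
        sorted(s_center, key=lambda iv: iv[1], reverse=True),
        build(s_left),
        build(s_right),
    )


def stab(node, q):
    """Report every stored interval that contains the value q."""
    result = []
    while node is not None:
        if q < node.center:
            # Center intervals all end at or after center > q,
            # so only their start points need checking.
            for iv in node.by_start:
                if iv[0] > q:
                    break
                result.append(iv)
            node = node.left
        elif q > node.center:
            # Symmetric case: only the end points need checking.
            for iv in node.by_end:
                if iv[1] < q:
                    break
                result.append(iv)
            node = node.right
        else:
            result.extend(node.by_start)  # q == center: all of them match
            break
    return result
```

Because the center is always an endpoint of some interval, S_center is never empty and the recursion terminates. The two sorted lists are what let each visited node stop scanning at the first interval that does not match.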

Interval tree using the median: let I := {[x1:x1’], [x2:x2’], …, [xn:xn’]} be a set of closed intervals, and let xmid be the median of the 2n interval endpoints. At most half of the interval endpoints lie to the left of xmid and at most half to the right, so the resulting interval tree is more balanced than the standard interval tree.

Example of interval tree using Median

Properties: an interval tree for a set I of n intervals uses O(n) storage and can be built in O(n log n) time. Using the interval tree, we can report all intervals that contain a query point in O(log n + k) time, where k is the number of reported intervals.

K-d tree: used for point location and multi-attribute database queries, where k is the number of attributes used in the search. Geometric interpretation: a search in 2D space uses a 2-d tree. The search components (x, y) interchange from level to level!

K-d tree: a space-partitioning data structure for organizing points in a k-dimensional space. The kd-tree is a binary tree in which every node is a k-dimensional point. Every non-leaf node generates a splitting hyperplane that divides the space into two subspaces. Points to the left of the hyperplane form the left subtree of that node, and points to the right of the hyperplane form the right subtree. The hyperplane direction is chosen in the following way: every node that splits into subtrees is associated with one of the k dimensions, such that the hyperplane is perpendicular to that dimension's axis. So, for example, if the "x" axis is chosen for a particular split, all points in the subtree with a smaller "x" value than the node appear in the left subtree and all points with a larger "x" value appear in the right subtree. The hyperplane direction ROTATES over all k dimensions!

K-d tree example (figure: points a, b, c, d, e, f and the corresponding tree)

K-d tree construction: the canonical method of kd-tree construction is as follows. As one moves down the tree, one cycles (rotates) through the axes used to select the splitting planes. (For example, the root would have an x-aligned plane, the root's children would both have y-aligned planes, the root's grandchildren would all have z-aligned planes, the next level would have an x-aligned plane, and so on.) Points are inserted by selecting the median of the points being put into the subtree, with respect to their coordinates in the axis being used to create the splitting plane. This method leads to a balanced kd-tree, in which each leaf node is about the same distance from the root. Note that it is not required to select the median point; without it, there is simply no guarantee that the tree will be balanced. A simple heuristic that avoids both coding a complex linear-time median-finding algorithm and running an O(n log n) sort over all points is to sort a fixed number of randomly selected points and use their median as the cut line. In practice, this technique often results in nicely balanced trees.
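
The median-based construction can be sketched as below, together with an orthogonal range query that uses the splitting planes to prune subtrees. For brevity this sketch sorts the points at every level rather than using a linear-time median or the sampling heuristic; the names `KdNode`, `build_kdtree`, and `range_search` are illustrative.

```python
class KdNode:
    def __init__(self, point, axis, left, right):
        self.point = point  # the k-dimensional point stored at this node
        self.axis = axis    # index of the coordinate used for the split
        self.left = left
        self.right = right


def build_kdtree(points, depth=0):
    """Build a balanced kd-tree from a list of equal-length tuples."""
    if not points:
        return None
    k = len(points[0])
    axis = depth % k  # rotate through the k axes level by level
    pts = sorted(points, key=lambda p: p[axis])
    mid = len(pts) // 2  # median along the current axis
    return KdNode(
        pts[mid],
        axis,
        build_kdtree(pts[:mid], depth + 1),
        build_kdtree(pts[mid + 1:], depth + 1),
    )


def range_search(node, lo, hi, out=None):
    """Collect all points p with lo[i] <= p[i] <= hi[i] for every axis i."""
    if out is None:
        out = []
    if node is None:
        return out
    p, a = node.point, node.axis
    if all(l <= c <= h for l, c, h in zip(lo, p, hi)):
        out.append(p)
    if lo[a] <= p[a]:   # the query box reaches into the left half-space
        range_search(node.left, lo, hi, out)
    if p[a] <= hi[a]:   # the query box reaches into the right half-space
        range_search(node.right, lo, hi, out)
    return out
```

For example, `build_kdtree([(2, 3), (5, 4), (9, 6), (4, 7), (8, 1), (7, 2)])` places `(7, 2)` at the root, split on x, with `(5, 4)` and `(9, 6)` as its children, split on y.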

Progressive meshes: developed by Hugues Hoppe, Microsoft Research Inc. First published in SIGGRAPH 1996.

Terrain visualization applications

Geometric subdivision: problems with geometric subdivisions

ROAM principle: the basic operating principle of ROAM

Quad-tree and bin-tree for ROAM (Real-time Optimally Adapting Meshes)

The grid. Fixed grid: stored as a 2D array; each entry contains a link to the list of points (objects) that fall in that grid cell.
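
A fixed grid of this kind can be sketched as a 2D array of lists. The class name `FixedGrid` and its parameters are assumptions made for this example; points are assumed to lie inside the given extent, and coordinates outside it are clamped to the border cells.

```python
class FixedGrid:
    def __init__(self, xmin, ymin, xmax, ymax, nx, ny):
        self.xmin, self.ymin = xmin, ymin
        self.nx, self.ny = nx, ny
        self.cell_w = (xmax - xmin) / nx
        self.cell_h = (ymax - ymin) / ny
        # 2D array; each entry is the list of points stored in that cell.
        self.cells = [[[] for _ in range(ny)] for _ in range(nx)]

    def _index(self, x, y):
        """Map a coordinate to its cell, clamping to the border cells."""
        i = int((x - self.xmin) / self.cell_w)
        j = int((y - self.ymin) / self.cell_h)
        return (max(0, min(i, self.nx - 1)), max(0, min(j, self.ny - 1)))

    def insert(self, x, y, label):
        i, j = self._index(x, y)
        self.cells[i][j].append((x, y, label))

    def window_query(self, x1, y1, x2, y2):
        """Labels of all points inside the window [x1, x2] x [y1, y2]."""
        i1, j1 = self._index(x1, y1)
        i2, j2 = self._index(x2, y2)
        return [label
                for i in range(i1, i2 + 1)
                for j in range(j1, j2 + 1)
                for (x, y, label) in self.cells[i][j]
                if x1 <= x <= x2 and y1 <= y <= y2]
```

Only the cells touched by the window are scanned, but nothing limits how many points a single cell can hold, which is the overflow problem raised on the next slide.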

Page overflow: too many points in one grid cell. Split the cell!

Grid file Example of a grid file

Grid file vs. grid: in a grid file, the index grows dynamically when overflow happens. The space is split by a vertical or a horizontal line, and then further subdivided when overflow happens again. The boundaries of cells of different sizes are stored, so point and stabbing queries are easy.
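
The reason stored boundaries make point queries easy can be shown with a small sketch of the lookup step alone. It assumes the split positions are kept in one sorted list per axis; `locate_cell` and `add_boundary` are hypothetical helper names, and the bucket directory and overflow handling of a full grid file are left out.

```python
from bisect import bisect_right, insort


def locate_cell(x_scale, y_scale, x, y):
    """Return (column, row) of the cell containing (x, y).

    x_scale and y_scale are sorted lists of split positions. A point lying
    exactly on a split line is assigned to the cell on its upper side.
    """
    return bisect_right(x_scale, x), bisect_right(y_scale, y)


def add_boundary(scale, value):
    """Record a new split line unless it is already present."""
    if value not in scale:
        insort(scale, value)


x_scale, y_scale = [50.0], []   # one vertical split line so far
before = locate_cell(x_scale, y_scale, 70.0, 40.0)   # (1, 0)
add_boundary(y_scale, 30.0)     # overflow: split horizontally at y = 30
after = locate_cell(x_scale, y_scale, 70.0, 40.0)    # (1, 1)
```

Each lookup is a binary search per axis, no matter how unevenly the cells are sized.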

Rectangle indexing with grids: a rectangle may span several grid cells, so duplicates are stored. Grid cells are of fixed size.

The quadtree: instead of using an array as an index, use a tree! Quadtree decomposition: cells are indexed using a quaternary tree, in which each internal node has four children. All cells are squares, not polygons. Search in a tree is faster!
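
A quadtree that splits a square cell into four equal squares on overflow can be sketched as follows. The class name `QuadTree`, the `capacity` and `max_depth` parameters, and the child ordering (0 = lower-left, 1 = lower-right, 2 = upper-left, 3 = upper-right) are assumptions for this example; inserted points are assumed to lie inside the root square.

```python
class QuadTree:
    def __init__(self, x, y, size, capacity=1, max_depth=16):
        self.x, self.y, self.size = x, y, size  # lower-left corner, side length
        self.capacity = capacity    # points a leaf may hold before splitting
        self.max_depth = max_depth  # stops endless splitting on duplicates
        self.points = []
        self.children = None        # four sub-squares once the cell is split

    def _child_for(self, px, py):
        half = self.size / 2
        ix = 1 if px >= self.x + half else 0
        iy = 1 if py >= self.y + half else 0
        return self.children[2 * iy + ix]

    def insert(self, px, py, depth=0):
        if self.children is not None:
            self._child_for(px, py).insert(px, py, depth + 1)
            return
        self.points.append((px, py))
        if len(self.points) > self.capacity and depth < self.max_depth:
            self._split(depth)

    def _split(self, depth):
        half = self.size / 2
        self.children = [
            QuadTree(self.x + dx * half, self.y + dy * half, half,
                     self.capacity, self.max_depth)
            for dy in (0, 1) for dx in (0, 1)
        ]
        pts, self.points = self.points, []
        for px, py in pts:  # push the stored points down one level
            self._child_for(px, py).insert(px, py, depth + 1)

    def find_leaf(self, px, py):
        """Descend from this node to the leaf cell containing (px, py)."""
        node = self
        while node.children is not None:
            node = node._child_for(px, py)
        return node
```

Dense regions end up with small cells and sparse regions with large ones, unlike the fixed grid.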

Linear quadtree: uses a B+ index. The actual references to rectangles are stored in the leaves, which saves space and access time. Nodes are labeled according to Z or "pi" order.

Linear quadtree: the level of detail increases as the number of quadtree decompositions increases! Decompositions have indexes of the form 00, 01, 02, 03, 10, 11, 12, 13, 2, 300, 301, 302, 303, 31, 32, 33. They are stored as a B+ tree.
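
Labels of this base-4 form can be generated by recording which quadrant is chosen at each decomposition level. The sketch below assumes one particular digit convention (0 = lower-left, 1 = lower-right, 2 = upper-left, 3 = upper-right), which the slides do not specify; `quadrant_path` and `morton_key` are illustrative names.

```python
def quadrant_path(x, y, levels, size):
    """Base-4 label of the cell containing (x, y) after `levels`
    decompositions of the square [0, size) x [0, size)."""
    digits = []
    x0 = y0 = 0.0
    for _ in range(levels):
        size /= 2
        ix = 1 if x >= x0 + size else 0
        iy = 1 if y >= y0 + size else 0
        digits.append(str(2 * iy + ix))
        x0 += ix * size  # move the origin to the chosen quadrant
        y0 += iy * size
    return "".join(digits)


def morton_key(ix, iy, bits):
    """Z-order key for integer cell coordinates: interleave the bits of
    ix (even bit positions) and iy (odd bit positions)."""
    key = 0
    for b in range(bits):
        key |= ((ix >> b) & 1) << (2 * b)
        key |= ((iy >> b) & 1) << (2 * b + 1)
    return key
```

Under this convention the two agree: the cell at integer coordinates (1, 3) in a 4 x 4 grid has the label "23", and `morton_key(1, 3, 2)` is 11, which is 23 in base 4. Because all cells sharing a label prefix are contiguous in sorted key order, a one-dimensional index such as a B+ tree can store them.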

Finer grid, R-tree: each object is decomposed and stored as a set of rectangles. Object decomposition: larger areas of a grid are treated as one element. Raster decomposition: each smaller element is stored separately.

R-trees. R-tree: objects are grouped together according to topological properties, not a grid, which gives more flexibility. R*-tree: optimizes node overlap and the area covered by each node. R+-tree: follows the B+ tree; bounding rectangles do not intersect.
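
The quantities these variants work with (covered area and node overlap) can be sketched on bounding rectangles stored as `(xmin, ymin, xmax, ymax)` tuples. The function names are illustrative, and `choose_subtree` shows only the least-enlargement rule commonly used when picking a node for insertion, not a complete R-tree.

```python
def area(r):
    return (r[2] - r[0]) * (r[3] - r[1])


def union(a, b):
    """Smallest rectangle enclosing both a and b."""
    return (min(a[0], b[0]), min(a[1], b[1]),
            max(a[2], b[2]), max(a[3], b[3]))


def enlargement(node_mbr, r):
    """Extra area the node's bounding rectangle needs in order to cover r."""
    return area(union(node_mbr, r)) - area(node_mbr)


def overlap_area(a, b):
    """Area shared by a and b (0 if they are disjoint)."""
    w = min(a[2], b[2]) - max(a[0], b[0])
    h = min(a[3], b[3]) - max(a[1], b[1])
    return w * h if w > 0 and h > 0 else 0


def choose_subtree(child_mbrs, r):
    """Index of the child needing the least enlargement to include r;
    ties are broken by the smaller current area."""
    return min(range(len(child_mbrs)),
               key=lambda i: (enlargement(child_mbrs[i], r),
                              area(child_mbrs[i])))
```

With children `(0, 0, 2, 2)` and `(5, 5, 8, 8)`, inserting `(1, 1, 3, 3)` enlarges the first by 5 and the second by 40, so the first child is chosen.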

Conclusions: spatial data structures such as interval trees, K-d trees, grids and grid files using B-trees, and R-trees are used in a variety of applications. They are often balanced, good for searching, database querying, and spatial querying, and they build on index and B-tree concepts.