Spatial Indexing SAMs. Spatial Indexing Point Access Methods can index only points. What about regions? Z-ordering and quadtrees Use the transformation.

Slides:



Advertisements
Similar presentations
Spatial and Temporal Data Mining V. Megalooikonomou Spatial Access Methods (SAMs) II (some slides are based on notes by C. Faloutsos)
Advertisements

Multimedia Database Systems
Spatial Join Queries. Spatial Queries Given a collection of geometric objects (points, lines, polygons,...) organize them on disk, to answer point queries.
I/O-Algorithms Lars Arge Fall 2014 September 25, 2014.
Nearest Neighbor Queries using R-trees Based on notes from G. Kollios.
CMU SCS : Multimedia Databases and Data Mining Lecture #6: Spatial Access Methods Part III: R-trees C. Faloutsos.
Search Trees.
2-dimensional indexing structure
Spatial Indexing SAMs. Spatial Indexing Point Access Methods can index only points. What about regions? Z-ordering and quadtrees Use the transformation.
Spatial Access Methods Chapter 26 of book Read only 26.1, 26.2, 26.6 Dr Eamonn Keogh Computer Science & Engineering Department University of California.
Spatial Indexing for NN retrieval
Spatial Indexing SAMs. Spatial Access Methods PAMs Grid File kd-tree based (LSD-, hB- trees) Z-ordering + B+-tree R-tree Variations: R*-tree, Hilbert.
Accessing Spatial Data
Project Proposals Simonas Šaltenis Aalborg University Nykredit Center for Database Research Department of Computer Science, Aalborg University.
Spatial Indexing SAMs.
Spatial Queries Nearest Neighbor and Join Queries.
Temporal Indexing MVBT. Temporal Indexing Transaction time databases : update the last version, query all versions Queries: “Find all employees that worked.
Multi-dimensional Indexes
Spatial Information Systems (SIS) COMP Spatial access methods: Indexing.
Spatial Indexing I Point Access Methods. Spatial Indexing Point Access Methods (PAMs) vs Spatial Access Methods (SAMs) PAM: index only point data Hierarchical.
1 R-Trees for Spatial Indexing Yanlei Diao UMass Amherst Feb 27, 2007 Some Slide Content Courtesy of J.M. Hellerstein.
Chapter 3: Data Storage and Access Methods
Spatial Queries Nearest Neighbor Queries.
I/O-Algorithms Lars Arge Aarhus University March 6, 2007.
R-tree Analysis. R-trees - performance analysis How many disk (=node) accesses we’ll need for range nn spatial joins why does it matter?
R-Trees 2-dimensional indexing structure. R-trees 2-dimensional version of the B-tree: B-tree of maximum degree 8; degree between 3 and 8 Internal nodes.
Spatial Indexing SAMs. Spatial Access Methods PAMs Grid File kd-tree based (LSD-, hB- trees) Z-ordering + B+-tree R-tree Variations: R*-tree, Hilbert.
Spatial Indexing I Point Access Methods. Spatial Indexing Point Access Methods (PAMs) vs Spatial Access Methods (SAMs) PAM: index only point data Hierarchical.
Spatial Indexing I Point Access Methods. Spatial Indexing Point Access Methods (PAMs) vs Spatial Access Methods (SAMs) PAM: index only point data Hierarchical.
Spatial Indexing. Spatial Queries Given a collection of geometric objects (points, lines, polygons,...) organize them on disk, to answer point queries.
R-TREES: A Dynamic Index Structure for Spatial Searching by A. Guttman, SIGMOD Shahram Ghandeharizadeh Computer Science Department University of.
CMU SCS : Multimedia Databases and Data Mining Lecture #6: Spatial Access Methods Part III: R-trees C. Faloutsos.
R-Trees: A Dynamic Index Structure for Spatial Data Antonin Guttman.
INDEXING SPATIAL DATABASES Atinder Singh Department of Computer Science University of California Riverside, CA
R-Trees Extension of B+-trees.  Collection of d-dimensional rectangles.  A point in d-dimensions is a trivial rectangle.
Indexing. Goals: Store large files Support multiple search keys Support efficient insert, delete, and range queries.
Storage CMSC 461 Michael Wilson. Database storage  At some point, database information must be stored in some format  It’d be impossible to store hundreds.
Spatial Data Management Chapter 28. Types of Spatial Data Point Data –Points in a multidimensional space E.g., Raster data such as satellite imagery,
1 B Trees - Motivation Recall our discussion on AVL-trees –The maximum height of an AVL-tree with n-nodes is log 2 (n) since the branching factor (degree,
R-Tree. 2 Spatial Database (Ia) Consider: Given a city map, ‘index’ all university buildings in an efficient structure for quick topological search.
Antonin Guttman In Proceedings of the 1984 ACM SIGMOD international conference on Management of data (SIGMOD '84). ACM, New York, NY, USA.
Bin Yao (Slides made available by Feifei Li) R-tree: Indexing Structure for Data in Multi- dimensional Space.
Lecture 3: External Memory Indexing Structures (Contd) CS6931 Database Seminar.
Spatial Indexing Techniques Introduction to Spatial Computing CSE 5ISC Some slides adapted from Spatial Databases: A Tour by Shashi Shekhar Prentice Hall.
R-Trees: A Dynamic Index Structure For Spatial Searching Antonin Guttman.
R-trees: An Average Case Analysis. R-trees - performance analysis How many disk (=node) accesses we ’ ll need for range nn spatial joins why does it matter?
1 CSIS 7101: CSIS 7101: Spatial Data (Part 1) The R*-tree : An Efficient and Robust Access Method for Points and Rectangles Rollo Chan Chu Chung Man Mak.
Spatio-Temporal Databases
File Processing : Multi-dimensional Index 2015, Spring Pusan National University Ki-Joune Li.
R* Tree By Rohan Sadale Akshay Kulkarni.  Motivation  Optimization criteria for R* Tree  High level Algorithm  Example  Performance Agenda.
Temporal Indexing MVBT. Temporal Indexing Transaction time databases : update the last version, query all versions Queries: “Find all employees that worked.
Jeremy Iverson & Zhang Yun 1.  Chapter 6 Key Concepts ◦ Structures and access methods ◦ R-Tree  R*-Tree  Mobile Object Indexing  Questions 2.
1 R-Trees Guttman. 2 Introduction Range queries in multiple dimensions: Computer Aided Design (CAD) Geo-data applications Support special data objects.
Spatial Data Management
Mehdi Kargar Department of Computer Science and Engineering
Strategies for Spatial Joins
Many slides taken from George Kollios, Boston University
Spatial Queries Nearest Neighbor and Join Queries.
Multiway Search Trees Data may not fit into main memory
Spatial Indexing.
Spatial Indexing I Point Access Methods.
Temporal Indexing MVBT.
R-tree: Indexing Structure for Data in Multi-dimensional Space
15-826: Multimedia Databases and Data Mining
B-Tree.
Spatial Indexing I R-trees
File Processing : Multi-dimensional Index
R-trees: An Average Case Analysis
Multidimensional Search Structures
C. Faloutsos Spatial Access Methods - R-trees
Presentation transcript:

Spatial Indexing SAMs

Spatial Indexing Point Access Methods can index only points. What about regions? Z-ordering and quadtrees Use the transformation technique and a PAM New methods: Spatial Access Methods SAMs R-tree and variations

Problem Given a collection of geometric objects (points, lines, polygons,...) organize them on disk, to answer spatial queries (range, nn, etc)

Transformation Technique Map an d-dim MBR into a point: ex. [(x min, x max ) (y min, y max )] => (x min, x max, y min, y max ) Use a PAM to index the 2d points Given a range query, map the query into the 2d space and use the PAM to answer it

R-trees [Guttman 84] Main idea: allow parents to overlap! => guaranteed 50% utilization => easier insertion/split algorithms. (only deal with Minimum Bounding Rectangles - MBRs)

R-trees A multi-way external memory tree Index nodes and data (leaf) nodes All leaf nodes appear on the same level Every node contains between m and M entries The root node has at least 2 entries (children)

Example eg., w/ fanout 4: group nearby rectangles to parent MBRs; each group -> disk page A B C D E F G H J I

Example F=4 A B C D E F G H I J P1 P2 P3 P4 FGDEHIJABC

Example F=4 A B C D E F G H I J P1 P2 P3 P4 P1P2P3P4 FGDEHIJABC

R-trees - format of nodes {(MBR; obj_ptr)} for leaf nodes P1P2P3P4 ABC x-low; x-high y-low; y-high... obj ptr...

R-trees - format of nodes {(MBR; node_ptr)} for non-leaf nodes P1P2P3P4 ABC x-low; x-high y-low; y-high... node ptr...

R-trees:Search A B C D E F G H I J P1 P2 P3 P4 P1P2P3P4 FGDEHIJABC

R-trees:Search A B C D E F G H I J P1 P2 P3 P4 P1P2P3P4 FGDEHIJABC

R-trees:Search Main points: every parent node completely covers its ‘children’ a child MBR may be covered by more than one parent - it is stored under ONLY ONE of them. (ie., no need for dup. elim.) a point query may follow multiple branches. everything works for any(?) dimensionality

R-trees:Insertion A B C D E F G H I J P1 P2 P3 P4 P1P2P3P4 FGDEHIJABC X X Insert X

R-trees:Insertion A B C D E F G H I J P1 P2 P3 P4 P1P2P3P4 FGDEHIJABC Y Insert Y

R-trees:Insertion Extend the parent MBR A B C D E F G H I J P1 P2 P3 P4 P1P2P3P4 FGDEHIJABC Y Y

R-trees:Insertion How to find the next node to insert the new object? Using ChooseLeaf: Find the entry that needs the least enlargement to include Y. Resolve ties using the area (smallest) Other methods (later)

R-trees:Insertion If node is full then Split : ex. Insert w A B C D E F G H I J P1 P2 P3 P4 P1P2P3P4 FGDEHIJABC W K K

R-trees:Insertion If node is full then Split : ex. Insert w A B C D E F G H I J P1 P2 P3 P4 Q1Q2FGDEHIJAB W K CKW P5 P1P5P2P3 P4 Q1 Q2

R-trees:Split Split node P1: partition the MBRs into two groups. A B C W K P1 (A1: plane sweep, until 50% of rectangles) A2: ‘linear’ split A3: quadratic split A4: exponential split: 2 M-1 choices

R-trees:Split pick two rectangles as ‘seeds’; assign each rectangle ‘R’ to the ‘closest’ ‘seed’ seed1 seed2 R

R-trees:Split pick two rectangles as ‘seeds’; assign each rectangle ‘R’ to the ‘closest’ ‘seed’: ‘closest’: the smallest increase in area seed1 seed2 R

R-trees:Split How to pick Seeds: Linear:Find the highest and lowest side in each dimension, normalize the separations, choose the pair with the greatest normalized separation Quadratic: For each pair E1 and E2, calculate the rectangle J=MBR(E1, E2) and d= J-E1-E2. Choose the pair with the largest d

R-trees:Insertion Use the ChooseLeaf to find the leaf node to insert an entry E If leaf node is full, then Split, otherwise insert there Propagate the split upwards, if necessary Adjust parent nodes

R-Trees:Deletion Find the leaf node that contains the entry E Remove E from this node If underflow: Eliminate the node by removing the node entries and the parent entry Reinsert the orphaned (other entries) into the tree using Insert Other method (later)

R-trees: Variations R+-tree: DO not allow overlapping, so split the objects (similar to z-values) R*-tree: change the insertion, deletion algorithms (minimize not only area but also perimeter, forced re-insertion ) Hilbert R-tree: use the Hilbert values to insert objects into the tree