Multidimensional Indexes

Slides:



Advertisements
Similar presentations
Multidimensional Index Structures One dimensional index structures assume a single search key, and retrieve records that match a given search-key value.
Advertisements

Materialization and Cubing Algorithms. Cube Materialization Each cell of the data cube is a view consisting of an aggregation of interest. The values.
Query Execution, Concluded Zachary G. Ives University of Pennsylvania CIS 550 – Database & Information Systems November 18, 2003 Some slide content may.
©Silberschatz, Korth and Sudarshan12.1Database System Concepts Chapter 12: Part C Part A:  Index Definition in SQL  Ordered Indices  Index Sequential.
Quick Review of Apr 10 material B+-Tree File Organization –similar to B+-tree index –leaf nodes store records, not pointers to records stored in an original.
Multidimensional Indexing
Chapter 11 Indexing and Hashing (2) Yonsei University 2 nd Semester, 2013 Sanghyun Park.
File Processing : Hash 2015, Spring Pusan National University Ki-Joune Li.
CS144: Spatial Index. Example Dataset Grid File (2 points per bucket)
Searching on Multi-Dimensional Data
Multidimensional Data. Many applications of databases are "geographic" = 2­dimensional data. Others involve large numbers of dimensions. Example: data.
Multidimensional Data Rtrees Bitmap indexes. R-Trees For “regions” (typically rectangles) but can represent points. Supports NN, “where­am­I” queries.
Multidimensional Data
Multidimensional Data. Many applications of databases are "geographic" = 2­dimensional data. Others involve large numbers of dimensions. Example: data.
COMP 451/651 B-Trees Size and Lookup Chapter 1.
Spatial Indexing I Point Access Methods. PAMs Point Access Methods Multidimensional Hashing: Grid File Exponential growth of the directory Hierarchical.
BTrees & Bitmap Indexes
Multiple-key indexes Index on one attribute provides pointer to an index on the other. If V is a value of the first attribute, then the index we reach.
Data Indexing Herbert A. Evans. Purposes of Data Indexing What is Data Indexing? Why is it important?
Spatial Indexing I Point Access Methods. Spatial Indexing Point Access Methods (PAMs) vs Spatial Access Methods (SAMs) PAM: index only point data Hierarchical.
Spatial Indexing I Point Access Methods.
1 Geometric index structures April 15, 2004 Based on GUW Chapter , [Arge01] Sections 1, 2.1 (persistent B- trees), 3-4 (static versions.
CS 245Notes 51 CS 245: Database System Principles Hector Garcia-Molina Notes 5: Hashing and More.
COMP 451/651 Multiple-key indexes
CS 277 – Spring 2002Notes 51 CS 277: Database System Implementation Arthur Keller Notes 5: Hashing and More.
Spatial Indexing I Point Access Methods. Spatial Indexing Point Access Methods (PAMs) vs Spatial Access Methods (SAMs) PAM: index only point data Hierarchical.
Spatial Indexing I Point Access Methods. Spatial Indexing Point Access Methods (PAMs) vs Spatial Access Methods (SAMs) PAM: index only point data Hierarchical.
Multidimensional Data Many applications of databases are ``geographic'' = 2­dimensional data. Others involve large numbers of dimensions. Example: data.
Spatial Data Management Chapter 28. Types of Spatial Data Point Data –Points in a multidimensional space E.g., Raster data such as satellite imagery,
Mutlidimensional Indices Instructor: Randal Burns Lecture for 29 November 2005 Computer Science Johns Hopkins University.
Indexing for Multidimensional Data An Introduction.
Multidimensional Indexes Applications: geographical databases, data cubes. Types of queries: –partial match (give only a subset of the dimensions) –range.
CS 245Notes 51 CS 245: Database System Principles Hector Garcia-Molina Notes 5: Hashing and More.
Layers of a DBMS Query optimization Execution engine Files and access methods Buffer management Disk space management Query Processor Query execution plan.
Chapter 5 Multidimensional Indexes. One dimensional index can be used to support multidimensional query. F1=‘abcd’ F2= 123‘abcd#123’
CPSC 8620Notes 61 CPSC 8620: Database Management System Design Notes 6: Hashing and More.
Multidimensional Access Structures COMP3017 Advanced Databases Dr Nicholas Gibbins –
Em Spatiotemporal Database Laboratory Pusan National University File Processing : Hash 2004, Spring Pusan National University Ki-Joune Li.
Chapter 5. Multidimensional Indexes
Spatial Data Management
CS 245: Database System Principles
Data Indexing Herbert A. Evans.
Indexing Structures for Files and Physical Database Design
CS 540 Database Management Systems
Indexing Goals: Store large files Support multiple search keys
Indexing and hashing.
Multidimensional Access Structures
Spatial Indexing I Point Access Methods.
Lecture 21: Hash Tables Monday, February 28, 2005.
COMP 430 Intro. to Database Systems
Database Management Systems (CS 564)
The Quad tree The index is represented as a quaternary tree
Chapter Trees and B-Trees
Chapter Trees and B-Trees
CS 245: Database System Principles
External Memory Hashing
Chapter 11: Indexing and Hashing
Session #, Speaker Name Indexing Chapter 8 11/19/2018.
15-826: Multimedia Databases and Data Mining
Yan Huang - CSCI5330 Database Implementation – Access Methods
Indexing and Hashing Basic Concepts Ordered Indices
CS 245: Database System Principles
2018, Spring Pusan National University Ki-Joune Li
File Processing : Multi-dimensional Index
CPSC-608 Database Systems
15-826: Multimedia Databases and Data Mining
Lecture 20: Indexes Monday, February 27, 2006.
CS4433 Database Systems Indexing.
Chapter 11: Indexing and Hashing
Index Structures Chapter 13 of GUW September 16, 2019
Presentation transcript:

Multidimensional Indexes Applications: geographical databases, data cubes. Types of queries: partial match (give only a subset of the dimensions) range queries nearest neighbor Where am I? (DB or not DB?) Conventional indexes don’t work well here.

Indexing Techniques Hash like structures: Tree like structures: Grid files Partitioned indexing functions Tree like structures: Multiple key indexes kd-trees Quad trees R-trees

Grid Files Each region in the corresponds to a bucket. Works well even if we only have partial matches Some buckets may be empty. Reorganization requires moving grid lines. Number of buckets grows exponentially with the dimensions. 500K * * * * * * 250K * * * 200K * 90K * Salary * * * * 10K * 15 20 35 102 Age

Partitioned Hash Functions A hash function produces k bits identifying the bucket. The bits are partitioned among the different attributes. Example: Age produces the first 3 bits of the bucket number. Salary produces the last 3 bits. Supports partial matches, but is useless for range queries.

Tree Based Indexing Techniques Salary, 150 Age, 60 Age, 47 70, 110 Salary, 300 85, 140 * * * * * * * * * * * * *

Multiple Key Indexes Each level as an index for one of the attributes. Works well for partial matches if the match includes the first attributes. Index on first attribute Index on second attribute

KD Trees Adaptation to secondary storage: Allow multiway branches at the nodes, or Group interior nodes into blocks. Salary, 150 Age, 60 Age, 47 50, 275 70, 110 Salary, 80 Salary, 300 60, 260 85, 140 50, 100 Age, 38 50, 120 30, 260 25, 400 25, 60 45, 60 45, 350 50, 75

Quad Trees Each interior node corresponds 400K to a square region (or k-dimen) When there are too many points in the region to fit into a block, split it in 4. Access algorithms similar to those of KD-trees. 400K * * * * * * * * * * * Salary * * 100 Age

R-Trees Interior nodes contain sets of regions. Regions can overlap and not cover all parent’s region. Typical query: Where am I? Can be used to store regions as well as data points. Inserting a new region may involve extending one of the existing regions (minimally). Splitting leaves is also tricky.