1
**Multidimensional Indexing**

2
**More Dimensions**

Some applications require us to view data in two or more dimensions, e.g. a Geographic Information System. Roughly, every attribute of the data can be seen as a dimension.

3
**Example Queries**

- Partial match: find the data items with specified values for one or more dimensions.
- Range: find the data items that fall within a specified range for each dimension.
- Nearest-neighbor: find the closest point to a given point, e.g. the closest city to a given city among cities above some population.
- Where-am-I: find where a specific point is located, e.g. locating the mouse pointer on the screen.

4
**Using Conventional Indexes**

Suppose points are distributed randomly in a 2D space with x and y ranging from 0 to 1000, and we are looking for points with 450 < x < 550 and 450 < y < 550, i.e. a 100 x 100 square.

- Using a B-tree on x, we find pointers to all points with x in the range.
- One option is to retrieve all those points and check their y values to find the points in the intersection.
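The cost of this approach can be illustrated with a minimal Python sketch. It is not a real B-tree: a sorted list searched with `bisect` stands in for the index on x, and the function name `range_query_via_x_index`, the random seed, and the 10,000-point data set are assumptions made only for illustration.

```python
import bisect
import random


def range_query_via_x_index(points, x_lo, x_hi, y_lo, y_hi):
    """Find points with x_lo < x < x_hi and y_lo < y < y_hi.

    A sorted list stands in for a B-tree on x (an assumption of this
    sketch); the y condition is checked only after retrieval.
    """
    index = sorted(points)                    # ordered by x (then y)
    xs = [p[0] for p in index]
    start = bisect.bisect_right(xs, x_lo)     # first position with x > x_lo
    stop = bisect.bisect_left(xs, x_hi)       # first position with x >= x_hi
    candidates = index[start:stop]            # everything the x index returns
    hits = [p for p in candidates if y_lo < p[1] < y_hi]
    return candidates, hits


random.seed(0)  # hypothetical data: 10,000 uniformly random points
points = [(random.uniform(0, 1000), random.uniform(0, 1000))
          for _ in range(10_000)]

candidates, hits = range_query_via_x_index(points, 450, 550, 450, 550)
print(f"retrieved {len(candidates)} points, kept {len(hits)}")
```

With uniformly distributed data, the x strip covers about 10% of the space while the target square covers about 1%, so roughly nine out of ten retrieved points are discarded after the y check.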

5
**Using Conventional Indexes**

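Almost any data structure allows us to answer a nearest-neighbor query by specifying a range in each dimension and searching inside it. But the range may contain no point at all, and when it contains several we still have to compare distances to decide which point is closest.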

6
**Multidimensional Indexes**

- Hash-table-like structures
- Tree-like structures

7
**Multidimensional Indexes**

Hash-table-like structures:

- Grid file: does not hash; it partitions each dimension by sorting the values along that dimension.
- Partitioned hashing: does hash the various dimensions; each dimension contributes to the bucket number.

8
**Grid File (Hash Table)**

- Each region can be thought of as a bucket of a hash table.
- Each point in a region has its record placed in a block belonging to that bucket.
- For example, the central rectangle represents data items with 40 ≤ age < 55 and 90 ≤ salary < 225.

9
**Grid File**

- Instead of a one-dimensional array of buckets, a grid file uses an array with the same number of dimensions as the data file.
- "Hashing" here does not mean applying a hash function.
- The positions of a data item along each dimension together determine its bucket.
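As a rough illustration of how a position along each dimension selects a bucket, here is a minimal Python sketch. The grid lines at age 40 and 55 and salary 90 and 225 come from the central-rectangle example above; the `GridFile` class, its method names, and the sample points are assumptions of this sketch, and blocks and overflow handling are left out.

```python
import bisect
from collections import defaultdict


class GridFile:
    """Toy grid file: one sorted list of grid lines per dimension."""

    def __init__(self, grid_lines):
        self.grid_lines = [sorted(lines) for lines in grid_lines]
        self.buckets = defaultdict(list)   # cell coordinates -> records

    def cell(self, point):
        # In each dimension, count the grid lines <= the value; a value
        # equal to a grid line therefore falls in the upper stripe,
        # matching ranges of the form 40 <= age < 55.
        return tuple(bisect.bisect_right(lines, value)
                     for lines, value in zip(self.grid_lines, point))

    def insert(self, point):
        self.buckets[self.cell(point)].append(point)

    def lookup(self, point):
        # Only the single bucket for the point's cell is examined.
        return point in self.buckets.get(self.cell(point), [])


# Grid lines from the example: age at 40 and 55, salary at 90 and 225.
grid = GridFile([[40, 55], [90, 225]])
for record in [(25, 60), (45, 100), (52, 200), (70, 110)]:  # hypothetical data
    grid.insert(record)

print(grid.cell((45, 100)))    # (1, 1): the central rectangle
print(grid.lookup((52, 200)))  # True
```

No hash function is involved: the bucket is found by locating the value among the sorted grid lines of each dimension, which is why nearby points tend to share or neighbor a bucket.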

10
Grid File

11
**Grid File: Inserting**

- If there is room in the block of the proper bucket, insert the record there.
- If there is no room, either:
  - add overflow blocks to the bucket, or
  - reorganize the structure by adding or moving grid lines.

12
**Grid File: Reorganizing the Structure**

- Adding a grid line splits all the buckets along that line.
- It may not be possible to choose a new line that works well for all buckets.
- The new line may, for example, create too many empty buckets or leave several buckets very full.

13
**Grid File Example: Inserting Point (52, 200)**

A vertical line at age = 51 does not help: since it does not split any other bucket, it only creates 3 empty buckets.

14
**Partitioned Hashing**

Example: three bits are used for the bucket number.

- The leftmost bit is determined by the first attribute.
- The two rightmost bits are determined by the second attribute.

$h(25) = 25 \bmod 2 = 1_{10} = 1_2$ and $h(60) = 60 \bmod 4 = 0_{10} = 00_2$, therefore $h(25, 60) = 100_2$.

$h(45) = 45 \bmod 2 = 1_{10} = 1_2$ and $h(350) = 350 \bmod 4 = 2_{10} = 10_2$, therefore $h(45, 350) = 110_2$.
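The three-bit example can be reproduced with a short Python sketch. The function names `partitioned_hash` and `buckets_for_first` are hypothetical; the hash functions (mod 2 for the first attribute, mod 4 for the second) are the ones used in the example.

```python
def partitioned_hash(first, second):
    """Three-bit bucket number: 1 bit from `first`, 2 bits from `second`."""
    high = first % 2     # leftmost bit
    low = second % 4     # two rightmost bits
    return (high << 2) | low


def buckets_for_first(first):
    """Buckets to search when only the first attribute is known."""
    high = first % 2
    return [(high << 2) | low for low in range(4)]


print(format(partitioned_hash(25, 60), "03b"))    # 100
print(format(partitioned_hash(45, 350), "03b"))   # 110
print([format(b, "03b") for b in buckets_for_first(25)])
```

Because each attribute owns its own bits, a partial match query that fixes only the first attribute pins down the leftmost bit and needs to visit four of the eight buckets rather than all of them.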

15
**Grid File <-> Partitioned Hash**

- Partial match query -> partitioned hash
- Nearest neighbor -> grid file
- Range query -> grid file

With these methods we no longer have the advantage that the answer lies in exactly one bucket, but they still limit the search to a subset of the buckets.

16
**Tree Like Structures**

- Multiple-key indexes: a tree in which the nodes at each level are indexes for one attribute.
- kd-trees (k-dimensional search trees): a binary tree.

Note: with these structures we lose the advantage of having balanced trees.

17
**Multiple-key Indexes**

- Very efficient for partial match queries.
- Work quite well for range queries.
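A two-level multiple-key index might be sketched as follows. This is an in-memory toy under stated assumptions: sorted Python lists stand in for the per-level indexes, the `MultiKeyIndex` class and its method names are hypothetical, range bounds are treated as inclusive, and the (age, salary) records are made-up sample data.

```python
import bisect
from collections import defaultdict


class MultiKeyIndex:
    """Toy two-level index: first attribute -> sorted second attributes."""

    def __init__(self, records):
        by_first = defaultdict(list)
        for first, second in records:
            by_first[first].append(second)
        self.first_keys = sorted(by_first)                  # level-1 index
        self.second_index = {k: sorted(v)                   # level-2 indexes
                             for k, v in by_first.items()}

    def partial_match_first(self, first):
        """Partial match that fixes only the first attribute."""
        return [(first, s) for s in self.second_index.get(first, [])]

    def range_query(self, f_lo, f_hi, s_lo, s_hi):
        """Records with f_lo <= first <= f_hi and s_lo <= second <= s_hi."""
        out = []
        i = bisect.bisect_left(self.first_keys, f_lo)
        j = bisect.bisect_right(self.first_keys, f_hi)
        for key in self.first_keys[i:j]:        # range scan on level 1
            seconds = self.second_index[key]
            a = bisect.bisect_left(seconds, s_lo)
            b = bisect.bisect_right(seconds, s_hi)
            out.extend((key, s) for s in seconds[a:b])      # range on level 2
        return out


# Hypothetical (age, salary) records.
records = [(25, 60), (45, 60), (45, 350), (50, 75), (50, 100), (70, 110)]
index = MultiKeyIndex(records)
print(index.partial_match_first(45))       # [(45, 60), (45, 350)]
print(index.range_query(40, 55, 90, 225))  # [(50, 100)]
```

In this sketch a query that fixes only the second attribute would have to visit every second-level index, so the order in which attributes are nested matters.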

18
**kd-tree Index**

- A binary tree.
- Interior nodes have an attribute, a dividing value for that attribute, and pointers to left and right children.
- Leaves are blocks, with space for as many records as a block can hold.

19
kd-tree Index

20
**kd-tree Index: Inserting Data Item (35, 500)**

If there is no room in the proper block, we split the leaf node and create a new interior node.
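The leaf split can be sketched in Python as below. Several choices here are assumptions, not taken from the slides: a block holds two records (`BLOCK_CAPACITY`), the splitting attribute alternates with depth, the dividing value is the median of the overfull leaf, and the two records already in the tree are made up. Only the inserted point (35, 500) comes from the example.

```python
from dataclasses import dataclass, field

BLOCK_CAPACITY = 2   # assumption: a block holds two records


@dataclass
class Leaf:
    records: list = field(default_factory=list)


@dataclass
class Interior:
    attr: int        # index of the attribute this node divides on
    value: float     # dividing value: left subtree holds point[attr] < value
    left: object
    right: object


def insert(node, point, depth=0):
    """Insert `point` and return the (possibly new) root of this subtree."""
    if isinstance(node, Interior):
        if point[node.attr] < node.value:
            node.left = insert(node.left, point, depth + 1)
        else:
            node.right = insert(node.right, point, depth + 1)
        return node

    if len(node.records) < BLOCK_CAPACITY:
        node.records.append(point)            # room in the block
        return node

    # No room: split the leaf and create a new interior node.
    # Assumption: the attribute alternates with depth and the dividing
    # value is the median. Duplicate values can leave one side overfull;
    # a fuller implementation would try another attribute in that case.
    attr = depth % len(point)
    pts = sorted(node.records + [point], key=lambda p: p[attr])
    value = pts[len(pts) // 2][attr]
    left = Leaf([p for p in pts if p[attr] < value])
    right = Leaf([p for p in pts if p[attr] >= value])
    return Interior(attr, value, left, right)


root = Leaf()
for record in [(25, 60), (45, 60)]:   # hypothetical existing records
    root = insert(root, record)
root = insert(root, (35, 500))        # the block is full, so the leaf splits
print(root)
```

Nothing in this procedure rebalances the tree, which is consistent with the earlier note that these structures give up the guarantee of balance.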
