Secondary Indexes
By Jignesh Borisa (111)

1 Secondary Indexes
By Jignesh Borisa (111)

2 Agenda
Introduction
Distinction: Primary & Secondary Indexes
Design of Secondary Indexes
Applications
Indirection in Secondary Indexes
Document Retrieval
Inverted Index

3 Introduction
Primary indexes determine the location of the indexed records.
Secondary indexes are added to facilitate a variety of queries.
A secondary index serves the purpose of any index: it is a data structure that facilitates finding records given a value for one or more fields.
(continued)

4 Introduction (continued)
A secondary index is distinguished from a primary index in that it does not determine the placement of records in the data file.
Instead, the secondary index tells us the current location of records.
Example: CREATE INDEX BDIndex ON MovieStar(birthdate);
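To see the CREATE INDEX statement from this slide in action, here is a minimal sketch using Python's built-in sqlite3 module. The column list of MovieStar (name, address, gender, birthdate) is an assumption made for illustration; the slide only names the birthdate column. SQLite is just one convenient engine to try this on, and other systems may report their query plans differently.

```python
import sqlite3

# In-memory database; nothing is written to disk.
conn = sqlite3.connect(":memory:")

# Hypothetical schema: only `birthdate` is mentioned on the slide.
conn.execute(
    "CREATE TABLE MovieStar(name TEXT, address TEXT, gender TEXT, birthdate TEXT)"
)

# The statement from the slide: a secondary index on birthdate.
conn.execute("CREATE INDEX BDIndex ON MovieStar(birthdate)")

# Ask SQLite how it would answer a lookup by birthdate.
plan_rows = conn.execute(
    "EXPLAIN QUERY PLAN SELECT name FROM MovieStar WHERE birthdate = ?",
    ("1950-01-01",),
).fetchall()

# The last column of each plan row is a human-readable description.
plan_text = " ".join(row[3] for row in plan_rows)
print(plan_text)
```

The printed plan should mention BDIndex, which suggests the engine locates matching rows through the index rather than by scanning the whole table.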

5 Distinction: Primary & Secondary Indexes
A secondary index does not influence the location of records.
We therefore cannot use it to predict the location of any record.
As a result, secondary indexes are always dense.

6 Design of Secondary Indexes
A secondary index is a dense index, usually with duplicates.
The key is a search key and need not be unique.
Pairs in the index file are sorted by key value.
A second-level index placed on top of this structure would be sparse.
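The design described on this slide can be sketched in a few lines of Python. This is an illustrative sketch only: the data file is modelled as a list whose positions stand in for record pointers, the key values are made up, and the function names build_secondary_index and lookup are hypothetical.

```python
from bisect import bisect_left, bisect_right

# Data file kept in no particular order; a record's list position
# plays the role of its pointer. Key values are illustrative only.
data_file = [20, 40, 10, 20, 50, 30, 10, 50, 60, 20]

def build_secondary_index(records):
    """Dense index: one (key, pointer) pair per record, sorted by key."""
    return sorted((key, pos) for pos, key in enumerate(records))

def lookup(index, key):
    """Return pointers of all records whose search key equals `key`."""
    # Extracting the keys on every call is O(n); a real index would
    # keep them in a searchable structure. Kept simple for clarity.
    keys = [k for k, _ in index]
    lo = bisect_left(keys, key)
    hi = bisect_right(keys, key)
    return [ptr for _, ptr in index[lo:hi]]

index = build_secondary_index(data_file)
print(lookup(index, 20))  # pointers to every record with key 20
```

Because the index is dense, it has exactly one entry per record, so duplicate keys such as 20 appear several times in the index file.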

7 [Figure: Secondary Index. A dense index file of sorted key values, with duplicates, pointing to records in an unsorted data file.]

8 Applications
Heap structure, where the records of the relation are kept in no particular order.
Clustered file.

9 Example
Consider the relation:
    Movie(title, year, length, inColor, studioName, producerC#)
A common form of query:
    SELECT title, year
    FROM Movie, Studio
    WHERE presC# = zzz AND Movie.studioName = Studio.Name

10 Indirection in Secondary Indexes
The structure described so far can waste a significant amount of space.
If a search key value appears n times in the data file, then the value is written n times in the index file.
A convenient way to avoid this is to use buckets, a level of indirection between the index file and the data file.
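A minimal sketch of the bucket idea, assuming the same list-as-data-file model (positions act as record pointers, key values are made up, and build_bucketed_index is a hypothetical name). Each distinct key is stored once in the index file and points to a bucket holding all the record pointers for that key.

```python
from collections import defaultdict

# Illustrative data file; list positions stand in for record pointers.
heap_file = [20, 40, 10, 20, 50, 30, 10, 50, 60, 20]

def build_bucketed_index(records):
    """Index file with one entry per distinct key; each entry refers
    to a bucket (list) of pointers to the records with that key."""
    buckets = defaultdict(list)
    for pos, key in enumerate(records):
        buckets[key].append(pos)
    # Sort by key so the index file can still be searched in order.
    return sorted(buckets.items())

bucket_index = build_bucketed_index(heap_file)
for key, bucket in bucket_index:
    print(key, bucket)
```

With ten records but only six distinct keys, the index file shrinks from ten entries to six. The saving grows with the number of duplicates and with the size of the key relative to a pointer.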

11 [Figure: Indirection with buckets. The index file holds each key value once and points to a bucket; each bucket holds pointers to the matching records in the data file.]

12 Document Retrieval
The retrieval of documents given keywords has become one of the largest database problems.
A document may be thought of as a tuple in a relation Doc.
Each attribute is boolean: either the word is present in the document or it is not.
    Doc(hasCat, hasDog, …)
Here hasCat is true if and only if the document contains the word "cat" at least once.

13 [Figure: Intersecting buckets in main memory. The studio index entry for Disney and the year index entry for 1995 each point to a bucket of pointers to Movie tuples; the two buckets are intersected in main memory.]
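The bucket intersection shown on this slide can be sketched as follows. The Movie tuples here are placeholders invented for illustration (not real release data), positions in the list stand in for tuple pointers, and build_buckets is a hypothetical helper.

```python
# Placeholder Movie tuples: (title, year, studioName). Invented data.
movies = [
    ("Movie A", 1995, "Disney"),
    ("Movie B", 1994, "Disney"),
    ("Movie C", 1995, "Fox"),
    ("Movie D", 1995, "Disney"),
]

def build_buckets(rows, column):
    """Map each value in `column` to a bucket (set) of tuple pointers."""
    buckets = {}
    for pos, row in enumerate(rows):
        buckets.setdefault(row[column], set()).add(pos)
    return buckets

studio_index = build_buckets(movies, 2)  # secondary index on studioName
year_index = build_buckets(movies, 1)    # secondary index on year

# Intersect the two buckets of pointers before touching any tuple.
matching = studio_index.get("Disney", set()) & year_index.get(1995, set())

# Only now follow the surviving pointers into the data file.
titles = [movies[pos][0] for pos in sorted(matching)]
print(titles)
```

The point of the design is that the intersection works on pointers only, so tuples that satisfy just one of the two conditions are never fetched.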

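14 Inverted Index
[Figure: Inverted index. Index entries for the words "cat" and "dog" point to buckets, and the buckets point to the documents containing each word: "… the cat is fat …", "… was raining cats and dogs …", "… Fido the Dog …".]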
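A small sketch of an inverted index over the three document fragments shown in the figure. The tokenizer (lowercased alphabetic words, no stemming) and the names build_inverted_index and search are assumptions made for illustration; in particular, without stemming "cats" and "cat" are treated as different words.

```python
import re
from collections import defaultdict

# Document fragments taken from the figure; list positions act as document ids.
documents = [
    "the cat is fat",
    "was raining cats and dogs",
    "Fido the Dog",
]

def build_inverted_index(docs):
    """Map each word to the bucket (set) of ids of documents containing it."""
    inverted = defaultdict(set)
    for doc_id, text in enumerate(docs):
        for word in re.findall(r"[a-z]+", text.lower()):
            inverted[word].add(doc_id)
    return inverted

def search(inverted, *words):
    """Ids of documents containing every one of `words` (bucket intersection)."""
    buckets = [inverted.get(w.lower(), set()) for w in words]
    return sorted(set.intersection(*buckets)) if buckets else []

inverted = build_inverted_index(documents)
print(search(inverted, "cat"))
print(search(inverted, "the", "dog"))
```

A multi-keyword query is answered the same way as on the previous slide: fetch one bucket per keyword and intersect the buckets before retrieving any document.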

15 THANK YOU

