Introduction to Database Systems

Indexes
• An index on a collection of records speeds up selections on the search key fields.
  – Any subset of the fields of a record can be the search key for an index on the collection.
• An index is a collection of index entries k* that supports the following retrievals:
  – Retrieve all entries k* with key value k
  – Retrieve all entries k* between two key values
  – Retrieve entries in search key order
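
The three retrievals listed on this slide can be sketched with a sorted list standing in for the index. This is only an illustration: the (key, record id) layout, the sample ages, and the function names are assumptions made for the sketch, not part of the course material.

```python
from bisect import bisect_left, bisect_right

# Hypothetical index entries k*: (search key, record id) pairs kept in key order.
entries = sorted([(32, "r1"), (28, "r2"), (32, "r3"), (39, "r4"), (21, "r5")])
keys = [k for k, _ in entries]

def lookup(k):
    """All entries whose key equals k."""
    return entries[bisect_left(keys, k):bisect_right(keys, k)]

def range_lookup(low, high):
    """All entries with low <= key <= high."""
    return entries[bisect_left(keys, low):bisect_right(keys, high)]

def in_key_order():
    """All entries in search key order (the list is already sorted)."""
    return list(entries)
```

For instance, `lookup(32)` returns the two entries for `r1` and `r3`, and `range_lookup(25, 35)` returns those plus the entry for `r2`.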

Alternatives for Data Entry k* in Index
• Three alternatives:
  1. Data record with search key value k
     – Issue: how much data repetition?
     – Issue: is this simply a fancy file format?
  2. (k, rid of data record with search key value k)
  3. (k, list of rids of data records with search key value k)
• Our focus: alternative 2.
  – Examples of indexing techniques: B+ trees, hash-based structures

Index Classification: Clustering
• Clustered vs. unclustered: if the order of the data records is the same as, or "close to", the order of the data entries, the index is called clustered.
  – A file can have at most one independent clustered index, since its records are stored in only one order.
  – The cost of retrieving data through an index varies greatly depending on whether the index is clustered. Why?
  – Clustering is usually desired for sorted access.
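
One way to see why the retrieval cost differs is to count page reads for the same range query under two record layouts. The sketch below is a toy model under stated assumptions: 1000 records, 10 records per page, no buffering beyond the most recently read page, and a fixed permutation (`key * 37 mod 1000`) standing in for an arbitrary unclustered order. None of these numbers come from the slides.

```python
RECORDS_PER_PAGE = 10
NUM_RECORDS = 1000

def pages_read(layout, low, high):
    """layout[slot] is the key stored in that slot; slot // RECORDS_PER_PAGE is its page.
    Follow the matching data entries in key order and count a read
    each time the next record lives on a different page than the last one."""
    matches = sorted((key, slot) for slot, key in enumerate(layout) if low <= key <= high)
    reads, last_page = 0, None
    for _, slot in matches:
        page = slot // RECORDS_PER_PAGE
        if page != last_page:
            reads += 1
            last_page = page
    return reads

# Clustered: records are stored in search key order.
clustered = list(range(NUM_RECORDS))

# Unclustered: the same keys scattered over the file by a fixed permutation.
unclustered = [0] * NUM_RECORDS
for key in range(NUM_RECORDS):
    unclustered[(key * 37) % NUM_RECORDS] = key

clustered_reads = pages_read(clustered, 100, 199)      # 10 page reads
unclustered_reads = pages_read(unclustered, 100, 199)  # 100 page reads
```

In this model the clustered layout touches 10 contiguous pages for 100 matching records, while the scattered layout pays one page read per record.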

Clustered vs. Unclustered Index
[Figure: es_f52.fig]

Sparse Clustering
• Dense vs. sparse: an index is dense if it has at least one data entry per search key value that appears in some data record; otherwise it is sparse.
  – Every sparse index is clustered!
[Figure: l3_f1.fig]
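
The difference between the two kinds of index can be sketched on a small sorted file. The names, the page layout, and the helper `sparse_lookup` below are hypothetical; the sparse index here keeps one entry per data page (the first key on the page), which is one common way to build it.

```python
from bisect import bisect_right

# Hypothetical file sorted by name; each inner list is one data page.
pages = [["Ashby", "Basu"], ["Bristow", "Cass"], ["Daniels", "Jones"], ["Smith", "Tracy"]]

# Dense: one entry per search key value that appears in the file.
dense = [(name, page_no) for page_no, page in enumerate(pages) for name in page]

# Sparse: one entry per page, holding the first key on that page.
sparse = [(page[0], page_no) for page_no, page in enumerate(pages)]

def sparse_lookup(name):
    """Return the page number holding name, or None if it is absent."""
    first_keys = [k for k, _ in sparse]
    i = bisect_right(first_keys, name) - 1   # last page whose first key <= name
    if i < 0:
        return None
    page_no = sparse[i][1]
    return page_no if name in pages[page_no] else None
```

`sparse_lookup` works only because the file itself is sorted on the search key: the entry for "Bristow" is enough to locate "Cass" on the same page. That reliance on record order is why a sparse index has to be clustered.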

Primary/Secondary Indexes
• Definition 1: Primary == Clustered
• Definition 2: Primary == search key contains the primary key of the relation
• We will use Definition 2.

Tree-Structured Indexing
• Tree-structured indexing techniques support both range searches and equality searches.
• "Find all students with gpa > 3.0"
  – If the data is in a sorted file, use binary search to find the first such student, then scan forward.
• Simple idea: create an "index" file.
Note: binary search can then run on the (smaller) index file!
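
A minimal sketch of the "index file" idea for the gpa query, assuming the index file holds the first key of each data page. The gpa values and the function name are made up for illustration.

```python
from bisect import bisect_right

# Hypothetical file sorted by gpa; each inner list is one data page.
pages = [[1.8, 2.0], [2.4, 2.9], [3.0, 3.1], [3.4, 3.6], [3.8, 4.0]]

# The index file: the first key on each page, far smaller than the data file.
index_file = [page[0] for page in pages]

def gpa_greater_than(threshold):
    """Binary-search the index file for the starting page, then scan forward."""
    start = max(bisect_right(index_file, threshold) - 1, 0)
    result = []
    for page in pages[start:]:
        result.extend(g for g in page if g > threshold)
    return result
```

With this data, `gpa_greater_than(3.0)` skips the first two pages entirely and starts scanning at the page that begins with 3.0.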

ISAM
• The index file may still be quite large. But we can apply the idea repeatedly, building an index on the index file!
Note: leaf pages contain data entries.

Comments on ISAM
• File creation: leaf pages are allocated sequentially, sorted by search key; then index pages are allocated, then space for overflow pages.
• Index entries: (search key value, page id); they "direct" the search for data entries, which are in leaf pages.
• Search: start at the root; use key comparisons to go to a leaf. Cost is $\log_F N$, where F = # entries per index page and N = # leaf pages.
• Insert: find the leaf the data entry belongs to, and put it there (in an overflow page if the leaf is full).
• Delete: find and remove the entry from its leaf; if this empties an overflow page, de-allocate the page.
Note: static tree structure: inserts/deletes affect only leaf pages.
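
To get a feel for the search cost, the number of index levels can be computed directly as the smallest h with F^h >= N. The function name and the sample values of F and N below are illustrative assumptions, not figures from the slides.

```python
def isam_levels(num_leaf_pages, fanout):
    """Smallest number of index levels h such that fanout**h >= num_leaf_pages,
    i.e. the ceiling of log_F(N), computed with integers to avoid rounding."""
    levels, reach = 0, 1
    while reach < num_leaf_pages:
        reach *= fanout
        levels += 1
    return levels

# With 100 entries per index page, one million leaf pages need 3 index levels.
levels = isam_levels(1_000_000, 100)
```

Because the fan-out is large, even a very large file is reached in a handful of page reads, not counting any overflow chain that has to be followed at the leaf.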

Example ISAM Tree
• Each node can hold 2 entries; no need for "next-leaf-page" pointers. (Why?)

After Inserting 23*, 48*, 41*, 42* ...

... Then Deleting 42*, 51*, 97*
Note that 51 still appears in the index levels, but 51* is no longer in a leaf!
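
The insert and delete sequence above can be replayed on a toy ISAM file. The tree figure is not reproduced in this text, so the initial keys below are an assumption chosen to be consistent with the slide titles. The class `Isam` is a hypothetical sketch that flattens the index levels into a single list of separator keys.

```python
from bisect import bisect_right

LEAF_CAPACITY = 2  # from the slide: each node can hold 2 entries

class Isam:
    def __init__(self, sorted_keys):
        # Primary leaf pages are allocated once, in search key order.
        self.leaves = [sorted_keys[i:i + LEAF_CAPACITY]
                       for i in range(0, len(sorted_keys), LEAF_CAPACITY)]
        # Static index: first key of every leaf except the first. Never updated.
        self.separators = [leaf[0] for leaf in self.leaves[1:]]
        # One overflow area per primary leaf page.
        self.overflow = [[] for _ in self.leaves]

    def _leaf_no(self, key):
        return bisect_right(self.separators, key)

    def insert(self, key):
        n = self._leaf_no(key)
        if len(self.leaves[n]) < LEAF_CAPACITY:
            self.leaves[n].append(key)
            self.leaves[n].sort()
        else:
            self.overflow[n].append(key)  # leaf is full: use overflow space

    def delete(self, key):
        n = self._leaf_no(key)
        for page in (self.overflow[n], self.leaves[n]):
            if key in page:
                page.remove(key)
                return True
        return False

    def contains(self, key):
        n = self._leaf_no(key)
        return key in self.leaves[n] or key in self.overflow[n]

# Assumed initial data entries (the original figure is not available here).
tree = Isam([10, 15, 20, 27, 33, 37, 40, 46, 51, 55, 63, 97])
for k in (23, 48, 41, 42):
    tree.insert(k)
for k in (42, 51, 97):
    tree.delete(k)
```

After the run, `tree.separators` still contains 51 even though `tree.contains(51)` is false, which mirrors the note on this slide: the index levels are static, and only leaf and overflow pages change.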