2 Level Indexes Indexed Files - Part Two


1 2 Level Indexes Indexed Files - Part Two
Portions of this lecture stolen from Foster's 325 Lecture Notes

2 Where we left off last class
The primary purpose of using indexes is to speed searching. In a single-level indexed file, the index-to-data relationship is 1:1 (one index entry per data record). The index resides in main memory for fast access. What if that 1:1 index is too big for memory?

3 2 Levels of Indexes
First Level Index
  resides in memory
  entries point to the Second Level
  entries contain a key and an IRRN - Index Relative Record Number (pointer into the Second Level)
  size is TBD
Second Level Index
  entirety stays in a file
  1:1 ratio of entries to records in data file
  entries contain a key and a DRRN - Data RRN (pointer into the data file)
Both levels
  entries are ordered for fast searching

4 An Overly Simple Example
[Figure: three structures side by side. Level one index (columns Key, IRRN; values shown: 20, 40, 60, 80). Second level index (columns Product ID Key, DRRN; entries shown: 1 through 99). Data file (columns Product ID, Field 1, Field 2, yadda yadda; records shown: 1 through 99).]

5 Search Algorithm
Data Size = N records
Level One Size = K1 entries
[Figure: level one array with columns Array index, Key, IRRN; values shown: 1 20, 2 40, 3 60, 4 80]
Preconditions:
  K2 = N / K1
  level one index is already in an array in memory (array1)
Algorithm:
  i = 0;
  while (i < K1) && (Target >= array1[i].Key)
      i++;
  i = i-1;    (if i is now -1, Target is smaller than every key: not found)
  SeekG (secondaryfile, array1[i].IRRN*sizeof(index record))
  Read (secondaryfile, K2 records, into array2)
  binary search array2 for Target
  SeekG (datafile, array2[location].DRRN*sizeof(data record))
  read record from datafile
Can this (the while loop over array1) be a binary search?
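The search steps above can be sketched in Python. This is only an illustration under stated assumptions, not the lecture's implementation: the two files are simulated with `io.BytesIO`, keys are integers, the fixed-size record layouts (`INDEX_REC`, `DATA_REC`) and the helper names `build` and `search` are all made up for the example. It also uses `bisect` on the level one array, which is one possible answer to the question on the slide.

```python
import io
import struct
from bisect import bisect_left, bisect_right

# Hypothetical fixed-size record layouts (assumptions for this sketch).
INDEX_REC = struct.Struct("<ii")    # second level entry: (key, DRRN)
DATA_REC = struct.Struct("<i16s")   # data record: (key, 16-byte payload)

def build(keys_in_arrival_order, k2):
    """Create the data file, the sorted second level file, and level one."""
    data = io.BytesIO()
    for key in keys_in_arrival_order:
        data.write(DATA_REC.pack(key, f"rec-{key}".encode()))
    # Second level: one (key, DRRN) entry per data record, sorted by key.
    pairs = sorted((key, drrn) for drrn, key in enumerate(keys_in_arrival_order))
    level2 = io.BytesIO()
    for key, drrn in pairs:
        level2.write(INDEX_REC.pack(key, drrn))
    # Level one: first key of every block of k2 entries, plus its IRRN.
    level1 = [(pairs[i][0], i) for i in range(0, len(pairs), k2)]
    return level1, level2, data

def search(target, level1, level2, data, k2):
    # Binary search level one for the last entry whose key is <= target.
    keys1 = [key for key, _ in level1]
    i = bisect_right(keys1, target) - 1
    if i < 0:
        return None                      # target is below every key
    # Seek to the block in the second level file and read k2 entries.
    level2.seek(level1[i][1] * INDEX_REC.size)
    raw = level2.read(k2 * INDEX_REC.size)
    block = [INDEX_REC.unpack_from(raw, off)
             for off in range(0, len(raw), INDEX_REC.size)]
    # Binary search the block (array2) for the target key.
    keys2 = [key for key, _ in block]
    j = bisect_left(keys2, target)
    if j == len(block) or keys2[j] != target:
        return None
    # Follow the DRRN into the data file.
    data.seek(block[j][1] * DATA_REC.size)
    key, payload = DATA_REC.unpack(data.read(DATA_REC.size))
    return key, payload.rstrip(b"\0").decode()

level1, level2, data = build([50, 10, 40, 20, 30, 60, 5, 70], k2=3)
print(search(30, level1, level2, data, k2=3))
```

Each lookup costs one seek and read in the second level file and one in the data file; the level one search happens entirely in memory.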

6 A Better Example
Level one index (in memory):
  Key       IRRN
  Adams     0
  Foster    20
  Lambert   40
  Randall   60
  West      80
Second level index (file, sorted by key):
  IRRN  Key       DRRN
  0     Adams     6
  1     Barnes    2
  2     Bell      18
  3     Bishop    8
  4     Camp      80
  5     Carey     0
  6     Conner    19
  7     Critter   4
  8     Crook     99
  9     Dannelly  21
  10    Davis     20
  11    Dinkins   11
  12    Duncan    10
  ...   Farrow    9
  ...   Faulk     5
  20    Foster    1
  ...   Fuller    98
  ...
  80    West      12
  ...   Wilks     7
  ...   Young     81
  ...   Zinn      3
Data file (columns RRN, Name, Acct Num, Address, yadda yadda; records in arrival order):
  RRN   Name
  0     Carey
  1     Foster
  2     Barnes
  3     Zinn
  4     Critter
  5     Faulk
  6     Adams
  7     Wilks
  8     Bishop
  9     Farrow
  10    Duncan
  11    Dinkins
  12    West
  ...
  18    Bell
  19    Conner
  20    Davis
  21    Dannelly
  ...
  80    Camp
  81    Young
  ...
  98    Fuller
  99    Crook

7 Add Algorithm
[Figure: the level one index and second level index from "A Better Example"]
Append new record to end of datafile
add entry (Key and DRRN) to end of secondary file
sort secondary file by key
K2 = N / K1
for (i=0; i<K1; i++)
    array1[i].key = secondarykey(i * K2)
    array1[i].IRRN = i * K2
YIKES! Sorting a file takes a long time!
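A minimal Python sketch of the add steps above, with both files replaced by in-memory lists so it can be run directly. The function names `add_record` and `rebuild_level_one` are hypothetical, and rounding K2 up (rather than plain N / K1) is an assumption made here so that a block count that does not divide N evenly still covers every entry.

```python
def rebuild_level_one(level2, k1):
    """Rebuild level one: the first key of each block of k2 entries and its IRRN."""
    n = len(level2)
    k2 = -(-n // k1)                      # ceiling of n / k1 (assumption)
    level1 = [(level2[i][0], i) for i in range(0, n, k2)]
    return level1, k2

def add_record(data, level2, k1, key, record):
    data.append(record)                   # append new record to end of data file
    level2.append((key, len(data) - 1))   # add (key, DRRN) to end of second level
    level2.sort()                         # the expensive step: re-sort the whole index
    return rebuild_level_one(level2, k1)

data = ["Carey", "Foster", "Barnes", "Zinn"]
level2 = sorted((name, drrn) for drrn, name in enumerate(data))
level1, k2 = add_record(data, level2, k1=2, key="Adams", record="Adams")
print(level1, k2)
```

In a real file the sort touches every index record on disk, which is the cost the slide is complaining about.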

8 Better Structure when Additions are Frequent
[Figure: the same level one index (Adams, Foster 20, Lambert 40, Randall 60, West 80) next to a second level index in which the block starting at Adams holds only 15 names; the entries from IRRN 15 up to Foster are shown as "blank" with DRRN 999.]
Instead of filling the secondary index, leave room for expansion.
Example:
  between Adams and Foster, put 15 names instead of 20
  that leaves 5 growth spots before an adjustment is needed
  when adding "Baker", only need to sort (move) Adams to Foster-1
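The idea of leaving growth spots can be sketched as follows. This is an illustration only: the second level index is a Python list, blank slots are represented by `None` (the slide shows them as "blank" with DRRN 999), and `insert_into_block` is a hypothetical helper name.

```python
BLANK = None  # stand-in for the slide's "blank / 999" entries (assumption)

def insert_into_block(level2, start, block_size, key, drrn):
    """Insert (key, drrn) into the block that begins at IRRN `start`.

    Only the entries inside this one block are re-sorted; every other
    block of the second level index is left untouched.
    """
    block = level2[start:start + block_size]
    used = [entry for entry in block if entry is not BLANK]
    if len(used) == block_size:
        raise OverflowError("block is full; the index needs an adjustment")
    used.append((key, drrn))
    used.sort()
    level2[start:start + block_size] = used + [BLANK] * (block_size - len(used))

level2 = [("Adams", 6), ("Barnes", 2), ("Bell", 18), BLANK, BLANK,
          ("Foster", 1), ("Fuller", 98), BLANK, BLANK, BLANK]
insert_into_block(level2, start=0, block_size=5, key="Baker", drrn=100)
print(level2[:5])
```

Because "Baker" sorts after the block's first key, the level one entry for this block does not change; only a key smaller than the block's first key would require updating level one.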

9 Theoretical Best Size of Index 1
Remember:
  Level 1 index stays in memory
  only a portion of Level 2 goes into memory
To minimize search times of those two arrays, optimal size of Index 1 is sqrt(N)
Example:
  Assume N = 100
  Size of Level 1 = sqrt(100) = 10
  each of those level 1 entries points to 10 level 2 entries
  so we end up searching two arrays of 10 elements each
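One way to see where sqrt(N) comes from is to count elements examined. The sketch below assumes the worst-case cost is simply the size of the two arrays added together, K1 + N/K1 (a linear-scan model chosen for illustration; the slides do not state the cost model), and brute-forces the K1 that minimizes it. The function names are hypothetical.

```python
import math

def comparisons(n, k1):
    """Worst-case elements examined: all of level one plus one level two block."""
    return k1 + math.ceil(n / k1)

def best_k1(n):
    """Try every possible level one size and keep the cheapest."""
    return min(range(1, n + 1), key=lambda k1: comparisons(n, k1))

print(best_k1(100), comparisons(100, 10))
```

Under this model the sum K1 + N/K1 is smallest when the two terms are equal, which is K1 = sqrt(N).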

10 Real Best Size of Index 1
"To minimize search times of those two arrays, optimal size of Index 1 is sqrt(N)"
But array2 must be read from a file over and over and over.
So, the smaller array2, the better!
Hence, optimal size of Index 1 = as big as main memory allows
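As a rough illustration of "as big as main memory allows", the sketch below derives K1 from a memory budget and then the block size K2 that must be read from the file on each search. The budget, the entry size, and the function name `level_one_size` are all assumptions for the example, not values from the lecture.

```python
import math

def level_one_size(n_records, memory_budget_bytes, entry_bytes):
    """Pick the largest K1 that fits in memory, then the resulting K2."""
    k1 = max(1, min(n_records, memory_budget_bytes // entry_bytes))
    k2 = math.ceil(n_records / k1)   # entries read from the file per search
    return k1, k2

# Hypothetical numbers: one million records, 64 KiB for level one, 16-byte entries.
print(level_one_size(1_000_000, 64 * 1024, 16))
```

The larger the budget, the smaller K2 becomes; if everything fits, K1 reaches N and the structure collapses back to the single-level 1:1 index.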

11 Next Class
Multiple Indexes
  multiple keys
  maybe you and I should not see the same items
3 Levels of Indexes

