Presentation is loading. Please wait.

Presentation is loading. Please wait.

CS 44321 CS4432: Database Systems II. CS 44322 Index definition in SQL Create index name on rel (attr) (Check online for index definitions in SQL) Drop.

Similar presentations


Presentation on theme: "CS 44321 CS4432: Database Systems II. CS 44322 Index definition in SQL Create index name on rel (attr) (Check online for index definitions in SQL) Drop."— Presentation transcript:

1 CS 44321 CS4432: Database Systems II

2 CS 44322 Index definition in SQL Create index name on rel (attr) (Check online for index definitions in SQL) Drop INDEX name

3 CS 44323 ATTRIBUTE LIST  MULTIKEY INDEX e.g., CREATE INDEX foo ON R(A,B,C) Note

4 CS 44324 Motivation: Find records where DEPT = “Toy” AND SAL > 50k Multi-key Index

5 CS 44325 Strategy I: Use one index, say Dept. Get all Dept = “Toy” records and check their salary I1I1

6 CS 44326 Use 2 Indexes; Manipulate Pointers ToySal > 50k Strategy II:

7 CS 44327 Multiple Key Index One idea: Strategy III: I1I1 I2I2 I3I3

8 CS 44328 Example Record Dept Index Salary Index Name=Joe DEPT=Sales SAL=15k Art Sales Toy 10k 15k 17k 21k 12k 15k 19k

9 CS 44329 For which queries is this index good? Find RECs Dept = “Sales” SAL=20k Find RECs Dept = “Sales” SAL > 20k Find RECs Dept = “Sales” Find RECs SAL = 20k

10 CS 443210 Many alternate methods for indexing

11 CS 443211 key  h(key) Hashing...... Buckets (typically 1 disk block)

12 CS 443212 One example hash function Key = ‘x 1 x 2 … x n ’ n-byte character string Have b buckets Hash function : –h: add (x 1 + x 2 + ….. X n) modulo b

13 CS 443213  This may not be best function …  Read Knuth Vol. 3 if you really need to select a good function. Good hash  Expected number of function:keys/bucket is the same for all buckets

14 CS 443214 Within a bucket: Do we keep keys sorted? Yes, if CPU time critical & Inserts/Deletes not too frequent

15 CS 443215 Next: example to illustrate inserts, overflows, deletes h(K)

16 CS 443216 EXAMPLE 2 records/bucket INSERT: h(a) = 1 h(b) = 2 h(c) = 1 h(d) = 0 01230123 d a c b h(e) = 1 e

17 CS 443217 01230123 a b c e d EXAMPLE: deletion Delete: e f f g maybe move “g” up c d

18 CS 443218 Rule of thumb: Try to keep space utilization between 50% and 80% Utilization = # keys used total # keys that fit If < 50%, wasting space If > 80%, overflows significant depends on how good hash function is & on # keys/bucket

19 CS 443219 How do we cope with growth? Overflows and reorganizations Dynamic hashing Extensible hashing Others …

20 CS 443220 Extensible hashing : idea 1 (a) Use i of b bits output by hash function b h(K)  use i  grows over time…. Note: enables future doubling of space ! 00110101

21 CS 443221 (b) Hash to directory of pointers to buckets (instead of buckets directly) h(K)[i ] to bucket Note : Double space by doubling the directory !............ Extensible hashing : idea 2

22 CS 443222 Example: h(k) is 4 bits; 2 keys/bucket i = 1 1 1 0001 1001 1100 Insert 1010 1 1100 1010 New directory 2 00 01 10 11 i = 2 2 0 1

23 CS 443223 1 0001 2 1001 1010 2 1100 Insert: 0111 0000 00 01 10 11 2 i = Example continued 0111 0000 0111 0001 2 2

24 CS 443224 00 01 10 11 2 i = 2 1001 1010 2 1100 2 0111 2 0000 0001 Insert: 1001 Example continued 1001 1010 000 001 010 011 100 101 110 111 3 i = 3 3

25 CS 443225 Extensible hashing: deletion Merge blocks and cut directory if possible (Reverse insert procedure)

26 CS 443226 Extensible hashing If directory fits into main memory, then access cost is 1 IO, otherwise 2 IOs Can handle growing files - with less wasted space - with no full reorganizations Summary + Indirection (Not bad if directory in memory) Directory doubles in size (Now it fits, now it does not) - - +

27 CS 443227 Use what when : Indexing : Tree-Structures vs Hashing

28 CS 443228 Hashing good for probes given key e.g., SELECT … FROM R WHERE R.A = 5 Indexing vs Hashing

29 CS 443229 INDEXING (Including B Trees) good for Range Searches: e.g., SELECT FROM R WHERE R.A > 5 Indexing vs Hashing

30 CS 443230 Reading Chapter 14 Read –14.3.1 and 14.3.2

31 CS 443231 The BIG picture…. Chapters 11 & 12: Storage, records, blocks... Chapter 13 & 14: Access Mechanisms - Indexes - B trees - Hashing - Multi key Chapter 15 & 16: Query Processing NEXT


Download ppt "CS 44321 CS4432: Database Systems II. CS 44322 Index definition in SQL Create index name on rel (attr) (Check online for index definitions in SQL) Drop."

Similar presentations


Ads by Google