CPSC-608 Database Systems


1 CPSC-608 Database Systems
Fall 2017 Instructor: Jianer Chen Office: HRBB 315C Phone: Notes #18 Notes #7

2 DBMS
[Figure: DBMS architecture for a graduate database stored in tables (relations). Users: administrator (DDL language, DDL compiler) and database programmer (DML (query) language, DML compiler). Components: query execution engine, transaction manager, concurrency control, lock table, logging & recovery, index/file manager, file manager, buffer manager, main memory buffers, secondary storage (disks).]

3 Another Index Structure: Hash Tables
A hash function h maps a search key k to a bucket h(k).
A bucket is typically a disk block (possibly with overflow blocks).
The value h(k), 0 ≤ h(k) ≤ b-1, gives an easy way to compute the bucket address:
- direct: the address is computed from h(k);
- indirect: h(k) is the index in a directory.
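The two addressing schemes can be sketched as follows. This is only an illustration: the block size, the number of buckets, the modulo hash function, and the directory contents are all made-up assumptions, not values from the slides.

```python
BLOCK_SIZE = 4096   # hypothetical block size in bytes
NUM_BUCKETS = 8     # buckets are numbered 0 .. NUM_BUCKETS - 1

def h(key: int) -> int:
    """Toy hash function with 0 <= h(key) <= NUM_BUCKETS - 1."""
    return key % NUM_BUCKETS

def direct_address(key: int, base: int = 0) -> int:
    """Direct scheme: the block address is computed from h(key)."""
    return base + h(key) * BLOCK_SIZE

# Indirect scheme: h(key) is an index into a directory of block
# addresses, so the buckets may sit anywhere on disk.
DIRECTORY = [40960, 0, 8192, 4096, 20480, 12288, 36864, 16384]

def indirect_address(key: int) -> int:
    """Indirect scheme: look the block address up in the directory."""
    return DIRECTORY[h(key)]
```

The indirect scheme costs one extra lookup, but it is what lets the dynamic schemes later in these notes remap buckets without moving every block.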

4 How do we cope with growth?
- Overflows and reorganizations
- Dynamic hashing
  - Extensible
  - Linear

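5 Extensible Hashing: General framework
[Figure: the hash function h maps a key x to h(x). The first i bits of h(x) index a directory with entries 00…00, 00…01, …, 11…11; i is the number of bits used by the directory (the hash index). Each directory entry points to a bucket (block); each bucket carries its own block index (j1, j2, … in the figure), the number of bits used by that bucket.]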

6 Extensible hashing: Searching
input: a search key x
\\ h is the hash function, H is the directory, i is the current hash index.
m = the first i bits of h(x);
read in the disk block B with the address H[m].
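A minimal sketch of this lookup, assuming 4-bit hash values and an in-memory directory. The names `first_bits`, `blocks`, and the block labels "A" to "E" are hypothetical, and the blocks store hash values directly instead of full tuples.

```python
B_BITS = 4  # assumed width of h(x) in bits

def first_bits(hv: int, i: int, b: int = B_BITS) -> int:
    """Return the first (high-order) i bits of the b-bit hash value hv."""
    return hv >> (b - i)

def search(hv: int, H: list, i: int, blocks: dict) -> list:
    """Follow the directory entry H[m] and return the block it points to."""
    m = first_bits(hv, i)
    return blocks[H[m]]

# Hypothetical state with hash index i = 3: eight directory entries,
# several of which share a block.
blocks = {
    "A": [0b0000, 0b0001],
    "B": [0b0111],
    "C": [0b1000, 0b1001],
    "D": [0b1010],
    "E": [0b1100],
}
H = ["A", "A", "B", "B", "C", "D", "E", "E"]

block = search(0b1010, H, 3, blocks)
found = 0b1010 in block
```

Only one block is read per lookup (plus the directory access), which is the main attraction of the scheme.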

8 Extensible hashing: Insertion
input: a tuple t with search key x
\\ h is the hash function, H is the directory, i is the current hash index.
m = the first i bits of h(x);
let the block with the address H[m] be B;
IF B has room THEN add t in B
ELSE \\ B has no room for t
  let j be the block index of B;
  IF i = j THEN
    { double the size of H to 2^(i+1) entries;
      let the pointers in the new H[2k] and H[2k+1] both equal that in the old H[k], 0 ≤ k < 2^i;
      i = i + 1; }
  \\ now i > j
  let m_j be the first j bits of h(x);
  split B ∪ {t} into B0 and B1 (with block index j+1) using the (j+1)st bit;
  let H[m_j 0**] point to B0 and H[m_j 1**] point to B1.
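The insertion procedure can be sketched in Python as below. This is an illustration under stated assumptions, not the course's implementation: the class name `ExtensibleHash`, the dictionary layout of a block, the 4-bit hash width, and the capacity of 2 tuples per block are all hypothetical. The sketch splits B first and then retries the insertion (so it may split more than once), and when a block already uses all b bits it simply lets the block grow, standing in for an overflow chain.

```python
class ExtensibleHash:
    def __init__(self, b: int = 4, capacity: int = 2):
        self.b = b                # width of h(x) in bits (assumed)
        self.capacity = capacity  # tuples per block (assumed)
        self.i = 1                # hash index: bits used by the directory
        # Each block records its block index j and its tuples.
        self.H = [{"j": 1, "items": []}, {"j": 1, "items": []}]

    def _first_bits(self, hv: int, bits: int) -> int:
        return hv >> (self.b - bits)

    def insert(self, hv: int, name: str) -> None:
        B = self.H[self._first_bits(hv, self.i)]
        if len(B["items"]) < self.capacity:
            B["items"].append((hv, name))
            return
        j = B["j"]
        if j == self.b:
            # No bits left to split on: stand-in for an overflow block.
            B["items"].append((hv, name))
            return
        if self.i == j:
            # Double the directory: new H[2k] and H[2k+1] = old H[k].
            self.H = [blk for blk in self.H for _ in (0, 1)]
            self.i += 1
        # Split B on its (j+1)st bit into B0 and B1.
        B0 = {"j": j + 1, "items": []}
        B1 = {"j": j + 1, "items": []}
        for item in B["items"]:
            bit = (item[0] >> (self.b - j - 1)) & 1
            (B1 if bit else B0)["items"].append(item)
        # Redirect every directory entry that pointed to B.
        for m in range(len(self.H)):
            if self.H[m] is B:
                bit = (m >> (self.i - j - 1)) & 1
                self.H[m] = B1 if bit else B0
        self.insert(hv, name)  # retry; may trigger another split


eh = ExtensibleHash()
for hv, name in [(0b0001, "b"), (0b1001, "c"), (0b1100, "k"),
                 (0b1010, "g"), (0b0111, "d"), (0b0000, "e"),
                 (0b1000, "a")]:
    eh.insert(hv, name)
```

The usage lines replay the keys b, c, k, g, d, e, a with the hash values given in the slides; the directory doubles twice, ending with hash index 3.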

18 Insertion in Extensible Hashing
Example (each block holds at most 2 tuples):
h(b) = 0001, h(c) = 1001, h(d) = 0111, h(e) = 0000, h(g) = 1010, h(k) = 1100

Initial state, i = 1:
  0 → [b0001] (block index 1)
  1 → [c1001, k1100] (block index 1)

Insert g1010: the block for 1 is full. Split the block, and increase the block index; since the block index is equal to the hash index, first double the directory size. Now i = 2:
  00, 01 → [b0001] (block index 1)
  10 → [c1001, g1010] (block index 2)
  11 → [k1100] (block index 2)

Insert d0111: the block for 01 has room.
  00, 01 → [b0001, d0111] (block index 1)
  10 → [c1001, g1010] (block index 2)
  11 → [k1100] (block index 2)

Insert e0000: the block for 00 is full. Split the block, and increase the block index; the block index (1) is smaller than the hash index (2), so the directory is not doubled.
  00 → [e0000, b0001] (block index 2)
  01 → [d0111] (block index 2)
  10 → [c1001, g1010] (block index 2)
  11 → [k1100] (block index 2)

Insert a1000: the block for 10 is full. Split the block, and increase the block index; since the block index is equal to the hash index, first double the directory size. Now i = 3:
  000, 001 → [e0000, b0001] (block index 2)
  010, 011 → [d0111] (block index 2)
  100 → [a1000, c1001] (block index 3)
  101 → [g1010] (block index 3)
  110, 111 → [k1100] (block index 2)

44 Extensible hashing: deletion
- No merging of blocks
- Merge blocks and cut directory if possible (reverse of the insert procedure)

45 Note: Still need overflow chains
Example: many records with duplicate keys.
A block with block index 1 holds 1101 and 1100; insert 1100.
If we split (block index 2): the block for 10** stays empty, and the block for 11** would have to hold 1101, 1100, 1100, which does not fit.
Even further split (block index 3): the block for 111* stays empty, and the block for 110* would again have to hold 1101, 1100, 1100.

48 Solution: overflow chains
insert 1100: do not split; add an overflow block to the block (block index 1) and put the new record there.

49 Extensible hashing Summary
+ Can handle growing files
  - with less wasted space
  - with no full reorganizations
‒ Indirection (not bad if the directory is in memory)
‒ Directory doubles in size (now it fits, now it does not)

51 How do we cope with growth?
- Overflows and reorganizations
- Dynamic hashing
  - Extensible
  - Linear

52 Linear hashing
Another dynamic hashing scheme.
Ideas:
(a) Use the same hash function h;
(b) Use only part of h when the hash table is smaller (use the i low order bits of the b-bit value h(x); i grows with the table). Points (a) and (b) are similar to extensible hashing.
(c) Hash table size n grows linearly, with |n| = i. This is the main difference.
(d) Use overflow blocks.

57 Linear hashing
Hash table size n grows linearly (n is a parameter for the hash structure; bucket n is the first unused bucket).
i = |n| is the number of bits of n, and h(x)_i denotes the last i bits of the b-bit value h(x).
If h(x)_i < n, x goes to bucket h(x)_i.
Where does x go if h(x)_i ≥ n?
Put x in bucket h(x)_i − 2^(i−1) (< n)!!
(h(x)_i − 2^(i−1) is h(x)_i with the leading bit 1 replaced with 0.)
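The addressing rule fits in a few lines of Python. The function name `bucket_address` is hypothetical; the sketch assumes hash values are non-negative integers and n ≥ 1, and it uses `int.bit_length()` for |n|.

```python
def bucket_address(hv: int, n: int) -> int:
    """Bucket for hash value hv when buckets 0 .. n-1 are in use."""
    i = n.bit_length()           # i = |n|, the number of bits of n
    m = hv & ((1 << i) - 1)      # the last i bits of h(x)
    if m >= n:
        m -= 1 << (i - 1)        # replace the leading bit 1 with 0
    return m
```

For instance, with n = 10 (binary) the hash value 1010 has last two bits 10 ≥ n, so it lands in bucket 00, while with n = 11 the same value goes to bucket 10.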

61 Linear Hashing: Searching
How do we search x?
input: a search key x
\\ h is the hash function, n is the current upper bound, i = |n|
m = the last i bits of h(x);
IF m ≥ n THEN m = m − 2^(i−1);
read in the disk block(s) with the address m;
\\ you should check overflow blocks in the address m.

62 Linear Hashing: Insertion
How do we insert x?
input: a tuple t with search key x
\\ h is the hash function, n is the current upper bound, i = |n|
m = the last i bits of h(x);
IF m ≥ n THEN m = m − 2^(i−1);
insert t in the disk block B with the address m;
\\ If B is full, you need to use an overflow block.

63 Linear Hashing: Deletion
How do we delete x?
input: a tuple t with search key x
\\ h is the hash function, n is the current upper bound, i = |n|
m = the last i bits of h(x);
IF m ≥ n THEN m = m − 2^(i−1);
delete t from the disk block B with the address m;
\\ you may need to check overflow blocks.
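A small sketch of the three operations, assuming each bucket is a chain of blocks (the primary block followed by its overflow blocks) and that tuples are represented by their hash values. `CAPACITY`, `address`, and the list-of-lists layout are illustrative assumptions. Emptied overflow blocks are left in place for simplicity.

```python
CAPACITY = 2  # assumed number of tuples per block

def address(hv: int, n: int) -> int:
    """Last i bits of hv, with the leading 1 cleared if the result is >= n."""
    i = n.bit_length()
    m = hv & ((1 << i) - 1)
    return m - (1 << (i - 1)) if m >= n else m

def search(buckets: list, hv: int) -> list:
    """Scan the primary block and all overflow blocks at the address."""
    chain = buckets[address(hv, len(buckets))]
    return [t for block in chain for t in block if t == hv]

def insert(buckets: list, hv: int) -> None:
    chain = buckets[address(hv, len(buckets))]
    for block in chain:
        if len(block) < CAPACITY:
            block.append(hv)
            return
    chain.append([hv])  # every block is full: add an overflow block

def delete(buckets: list, hv: int) -> bool:
    chain = buckets[address(hv, len(buckets))]
    for block in chain:
        if hv in block:
            block.remove(hv)
            return True
    return False

# n = 2 buckets (binary 10), each starting with one full primary block.
buckets = [[[0b0000, 0b1010]], [[0b0101, 0b1111]]]
insert(buckets, 0b1101)  # bucket 01 is full, so this creates an overflow block
```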

64 How Do We Expand the Hash Table?
When do we expand the hash table?
Keep track of: R = (needed space) / (available space)
If R > threshold (e.g., 80%) then increase n.

67 Linear Hashing: Increasing Hash Table Size
input: the current upper bound n
\\ h is the hash function, i = |n|
read in the disk block(s) B of address n − 2^(i−1);
split (properly) the tuples in B and put them in the block B and the block B' with address n;
\\ a tuple moves to B' exactly when the last i bits of its hash value equal n
n = n + 1;
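The expansion step might look like this in Python. The sketch makes simplifying assumptions: each bucket is a flat list of hash values (a list longer than `CAPACITY` stands for a primary block plus overflow), R is measured in tuples against primary-block capacity, and the names `load_ratio`, `expand`, and `THRESHOLD` are hypothetical.

```python
CAPACITY = 2      # assumed tuples per primary block
THRESHOLD = 0.8   # the 80% example threshold

def load_ratio(buckets: list) -> float:
    """R = needed space / available space, counted in tuples."""
    needed = sum(len(bucket) for bucket in buckets)
    available = CAPACITY * len(buckets)
    return needed / available

def expand(buckets: list) -> None:
    """Split bucket n - 2^(i-1) and create bucket n; n grows by one."""
    n = len(buckets)
    i = n.bit_length()             # i = |n|
    old = n - (1 << (i - 1))       # address of the block B to split
    mask = (1 << i) - 1
    stay = [hv for hv in buckets[old] if hv & mask != n]
    move = [hv for hv in buckets[old] if hv & mask == n]
    buckets[old] = stay
    buckets.append(move)           # the new block B' at address n

buckets = [[0b0000, 0b1010], [0b0101, 0b1111]]   # n = 10 (binary)
if load_ratio(buckets) > THRESHOLD:
    expand(buckets)                               # n becomes 11
```

Note that the bucket being split is chosen by n alone, not by which bucket is fullest; that is what keeps the growth linear and the addressing rule simple.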

74 How Do We Shrink the Hash Table?
When? When R is smaller than a threshold (e.g., 50%).

76 Linear Hashing: Decreasing Hash Table Size
input: the current upper bound n
\\ h is the hash function, i = |n|
n = n − 1;
move the tuples in the block(s) of address n to the block(s) of address n − 2^(i−1)
(here i is the length of the new n).
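Shrinking reverses the expansion step. A minimal sketch, assuming each bucket is a flat list of hash values and that at least two buckets exist; the function name `shrink` is hypothetical.

```python
def shrink(buckets: list) -> None:
    """Remove the last bucket and merge its tuples back into its partner."""
    if len(buckets) < 2:
        raise ValueError("cannot shrink below one bucket")
    n = len(buckets) - 1           # n = n - 1
    i = n.bit_length()             # i is the length of the new n
    target = n - (1 << (i - 1))    # address n - 2^(i-1)
    moved = buckets.pop()          # the tuples of bucket n
    buckets[target].extend(moved)

buckets = [[0b0000], [0b0101, 0b1101], [0b1010], [0b1111]]   # n = 100 (binary)
shrink(buckets)   # bucket 11 is merged into bucket 01; n becomes 11
```

The merged bucket may exceed one block, in which case the extra tuples would live in an overflow block.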

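82 Linear Hashing: General framework
[Figure: h maps x to a b-bit value h(x); its last i bits, h(x)_i, select a bucket. The buckets 00…00, 00…01, … grow linearly up to n, where n has the binary form 1*…** with i bits; the buckets from address n on hold no tuples.]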

83 Example: b = 4 bits, 2 keys/bucket
n = 10 (1 + the largest index of the used buckets)
i = |n| = 2 (# used bits)
  bucket 00: 0000, 1010
  bucket 01: 0101, 1111
  buckets 10, 11: future growth buckets
Rules:
  If h(x)_i < n, then look at bucket h(x)_i.
  If h(x)_i ≥ n, then look at bucket h(x)_i − 2^(i−1) (i.e., replacing the leading bit 1 of h(x)_i by 0).

88 Insertion: b = 4 bits, 2 keys/bucket, n = 10, i = 2
For example, 1101 has h(x)_i = 01 < n, so it belongs in bucket 01. That bucket is already full, so the tuple would go into an overflow block (can have overflow chains!).

89 Increase size n: b = 4 bits, n = 10, i = 2
Starting from the state on slide 83, split bucket 00 (address n − 2^(i−1) = 00): 0000 stays in bucket 00 and 1010 moves to the new bucket 10. Then n = 11, i = 2.
  bucket 00: 0000
  bucket 01: 0101, 1111
  bucket 10: 1010
  bucket 11: future growth bucket

94 Insert: b = 4 bits, n = 11, i = 2
insert 1101: h(x)_i = 01 < n, and bucket 01 is full, so 1101 goes into an overflow block of bucket 01.
  bucket 00: 0000
  bucket 01: 0101, 1111, overflow: 1101
  bucket 10: 1010

95 Increase size n: b = 4 bits, n = 11, i = 2
Split bucket 01 (address n − 2^(i−1) = 01) together with its overflow block: 0101 and 1101 stay in bucket 01 and 1111 moves to the new bucket 11. Then n = 100, i = 3.
  bucket 00: 0000
  bucket 01: 0101, 1101
  bucket 10: 1010
  bucket 11: 1111

101 Linear Hashing Summary
+ Can handle growing files
  - with less wasted space
  - with no full reorganizations
+ No indirection like extensible hashing
‒ Overflow chains

103 Example: BAD CASE
[Figure: some buckets are very full while others are very empty.]
Need to move m here… Would waste space…

104 Summary
Hashing
- How it works
- Dynamic hashing
  - Extensible
  - Linear

106 Next
Algorithms implementing the relational algebraic operations:
- Projection and selection: π, σ
- Set and bag operations: ∪S, ∩S, −S, ∪B, ∩B, −B
- Join operations: ×, ⋈C, ⋈
- Extended operations (grouping, duplicate elimination, sorting): γ, δ, τ, table-scan
