# Extendible Hashing For Use as a File Structure


Slide 2: External Hashing

- What if the hash table is a file in which each bucket is a record in that file?
- Observations:
  - A bucket may contain more than one key value.
  - The number of buckets may expand or contract dynamically.

Slide 3: Extendible Hashing

- Handling multiple key values per bucket is not a problem.
- Collisions are resolved with overflow buckets rather than by spilling into the next bucket.
- Keep track of the number of times all buckets have been split (the “level”) and the next bucket to split.

Slide 4: The Hash Function

- The standard hash function would now be something like: $H(x, L) = x \bmod (n \cdot 2^L)$
- “n” is the initial number of buckets (3 in the example that follows).
- “L” is the level, initially zero.
- If H(x, L) < b, then calculate H(x, L+1).
- “b” is the next bucket to split.
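The address calculation on this slide can be sketched in a few lines of Python. The function name `bucket_address` and its parameter names are illustrative assumptions, not taken from the slides; the default `n = 3` matches the worked example later in the deck.

```python
def bucket_address(x: int, level: int, next_split: int, n: int = 3) -> int:
    """Return the bucket for key x, given level L and split pointer b."""
    bucket = x % (n * 2 ** level)            # H(x, L)
    if bucket < next_split:
        # Buckets below b were already split in this round,
        # so their keys are addressed with H(x, L+1).
        bucket = x % (n * 2 ** (level + 1))
    return bucket


# With L = 0 and b = 2 (the state when 61 and 41 are inserted in the example):
print(bucket_address(61, level=0, next_split=2))  # 1
print(bucket_address(33, level=0, next_split=2))  # 3
print(bucket_address(41, level=0, next_split=2))  # 2
```

Keys that land in a bucket numbered b or higher keep their level-L address because that bucket has not been split yet.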

Slide 5: The “Split”

- Questions:
  - When do I split the next bucket?
  - What does a split entail?
- We split when the load factor exceeds a certain threshold. The load factor is the number of key values / number of slots.
- A split entails creating a new bucket and rehashing all keys in bucket b at level L+1.
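The two answers on this slide can be illustrated with a small sketch. It assumes, as the worked example later does, that overflow buckets add their slots to the denominator of the load factor; the helper names `load_factor` and `split_bucket` are hypothetical.

```python
def load_factor(num_keys: int, num_buckets: int, slots_per_bucket: int = 2) -> float:
    """Key values divided by slots; num_buckets includes overflow buckets."""
    return num_keys / (num_buckets * slots_per_bucket)


def split_bucket(keys, level: int, n: int = 3) -> dict:
    """Rehash the keys of the bucket being split at level L+1, grouped by new address."""
    placement = {}
    for key in keys:
        placement.setdefault(key % (n * 2 ** (level + 1)), []).append(key)
    return placement


# 6 keys in 3 primary buckets plus 1 overflow bucket, 2 slots each:
print(load_factor(6, 4))                          # 0.75
# Splitting bucket 0 at level 0 when it holds 24, 15, 33 and 60:
print(split_bucket([24, 15, 33, 60], level=0))    # {0: [24, 60], 3: [15, 33]}
```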

Slide 6: The Insert Algorithm

- Initialize L = 0 and b = 0.
- Calculate bucket = H(x, L).
  - If bucket < b, then bucket = H(x, L+1).
- If the bucket has an empty slot, fill it with x.
  - Else, create an overflow bucket for x.
- If the new load factor >= the threshold:
  - Add a new bucket at the end.
  - Rehash all key values in bucket b at level L+1.
  - Add one to b.

Slide 7: The Insert Algorithm II

- If $b = n \cdot 2^L$, we have split all the buckets at the current level, so:
  - L = L + 1
  - b = 0
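The two insert slides can be combined into one minimal sketch. It rests on several assumptions that are not spelled out on the slides: each Python list stands for a primary bucket together with its overflow chain, empty overflow buckets are dropped as soon as they empty (which the worked example does), and splitting repeats while the load factor stays at or above the threshold (also as in the worked example). The class name `HashFile` and its attribute names are illustrative.

```python
import math


class HashFile:
    def __init__(self, n=3, capacity=2, threshold=0.75):
        self.n, self.capacity, self.threshold = n, capacity, threshold
        self.level = 0                           # L
        self.next_split = 0                      # b
        self.buckets = [[] for _ in range(n)]    # one list per bucket chain

    def _hash(self, x, level):
        return x % (self.n * 2 ** level)

    def address(self, x):
        bucket = self._hash(x, self.level)
        if bucket < self.next_split:             # already split this round
            bucket = self._hash(x, self.level + 1)
        return bucket

    def slots(self):
        # Every chain uses at least one bucket; extra keys need overflow buckets.
        return sum(self.capacity * max(1, math.ceil(len(chain) / self.capacity))
                   for chain in self.buckets)

    def load_factor(self):
        return sum(len(chain) for chain in self.buckets) / self.slots()

    def _split(self):
        b = self.next_split
        self.buckets.append([])                  # add new bucket at the end
        old, self.buckets[b] = self.buckets[b], []
        for key in old:                          # rehash bucket b at level L+1
            self.buckets[self._hash(key, self.level + 1)].append(key)
        self.next_split += 1
        if self.next_split == self.n * 2 ** self.level:
            self.level += 1                      # every bucket at this level is split
            self.next_split = 0

    def insert(self, x):
        self.buckets[self.address(x)].append(x)
        while self.load_factor() >= self.threshold:
            self._split()


table = HashFile()
for key in [24, 10, 15, 33, 60, 11, 61, 41]:
    table.insert(key)
print(table.buckets)                  # [[24, 60], [61], [], [15, 33], [10], [11, 41]]
print(table.level, table.next_split)  # 1 0
```

Deriving the slot count from the chain lengths keeps the sketch short; a file-based implementation would track overflow buckets explicitly.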

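Slide 8: Insert Example

- Insert 24:
- bucket = H(24, 0) = 0
- bucket >= b, so bucket 0 it is:

Table state: bucket 0: (empty); bucket 1: (empty); bucket 2: (empty); b = 0, L = 0; $H(x) = x \bmod (3 \cdot 2^L)$; 2 key values per bucket; load factor = 0/6 = 0; threshold = 0.75. Remaining inserts: 24, 10, 15, 33, 60, 11, 61, 41.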

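Slide 9: Insert Example

- Insert 10:
- bucket = H(10, 0) = 1
- bucket >= b, so bucket 1 it is:

Table state: bucket 0: 24; bucket 1: (empty); bucket 2: (empty); b = 0, L = 0; $H(x) = x \bmod (3 \cdot 2^L)$; 2 key values per bucket; load factor = 1/6 = 0.17; threshold = 0.75. Remaining inserts: 10, 15, 33, 60, 11, 61, 41.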

Slide 10: Insert Example

- Insert 15:
- bucket = H(15, 0) = 0
- bucket >= b, so bucket 0 it is:

Table state: bucket 0: 24; bucket 1: 10; bucket 2: (empty); b = 0, L = 0; $H(x) = x \bmod (3 \cdot 2^L)$; 2 key values per bucket; load factor = 2/6 = 0.33; threshold = 0.75. Remaining inserts: 15, 33, 60, 11, 61, 41.

Slide 11: Insert Example

- Insert 33:
- bucket = H(33, 0) = 0
- bucket >= b, so bucket 0 it is:

Table state: bucket 0: 24, 15; bucket 1: 10; bucket 2: (empty); b = 0, L = 0; $H(x) = x \bmod (3 \cdot 2^L)$; 2 key values per bucket; load factor = 3/6 = 0.5; threshold = 0.75. Remaining inserts: 33, 60, 11, 61, 41.

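Slide 12: Insert Example

- Bucket 0 is full, so inserting 33 requires an overflow bucket.
- Let’s assume overflow buckets also can hold 2 key values.
- Now, update the load factor:

Table state: bucket 0: 24, 15 (overflow: 33); bucket 1: 10; bucket 2: (empty); b = 0, L = 0; $H(x) = x \bmod (3 \cdot 2^L)$; 2 key values per bucket; load factor = 4/6 = 0.67 before the overflow bucket's two slots are counted; threshold = 0.75. Remaining inserts: 60, 11, 61, 41.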

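Slide 13: Insert Example

- Insert 60:
- bucket = H(60, 0) = 0
- bucket >= b, so bucket 0 it is:

Table state: bucket 0: 24, 15 (overflow: 33); bucket 1: 10; bucket 2: (empty); b = 0, L = 0; $H(x) = x \bmod (3 \cdot 2^L)$; 2 key values per bucket; load factor = 4/8 = 0.5; threshold = 0.75. Remaining inserts: 60, 11, 61, 41.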

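Slide 14: Insert Example

- Insert 11:
- bucket = H(11, 0) = 2
- bucket >= b, so bucket 2 it is:

Table state: bucket 0: 24, 15 (overflow: 33, 60); bucket 1: 10; bucket 2: (empty); b = 0, L = 0; $H(x) = x \bmod (3 \cdot 2^L)$; 2 key values per bucket; load factor = 5/8 = 0.63; threshold = 0.75. Remaining inserts: 11, 61, 41.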

Slide 15: Insert Example

- Load factor >= threshold, so it is time to rehash all keys in bucket b = 0.
- First, create a new bucket:

Table state: bucket 0: 24, 15 (overflow: 33, 60); bucket 1: 10; bucket 2: 11; b = 0, L = 0; $H(x) = x \bmod (3 \cdot 2^L)$; 2 key values per bucket; load factor = 6/8 = 0.75; threshold = 0.75. Remaining inserts: 61, 41.

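Slide 16: Insert Example

- Rehash 24 at level L+1:
- H(24, 1) = 24 mod 6 = 0
- 24 stays at bucket 0.

Table state: bucket 0: 24, 15 (overflow: 33, 60); bucket 1: 10; bucket 2: 11; bucket 3: (empty); b = 0, L = 0; $H(x) = x \bmod (3 \cdot 2^L)$; 2 key values per bucket; load factor = 6/10 = 0.6; threshold = 0.75. Remaining inserts: 61, 41.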

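Slide 17: Insert Example

- Rehash 15 at level L+1:
- H(15, 1) = 15 mod 6 = 3
- 15 moves to bucket 3.

Table state (before the move): bucket 0: 24, 15 (overflow: 33, 60); bucket 1: 10; bucket 2: 11; bucket 3: (empty); b = 0, L = 0; $H(x) = x \bmod (3 \cdot 2^L)$; 2 key values per bucket; load factor = 6/10 = 0.6; threshold = 0.75. Remaining inserts: 61, 41.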

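Slide 18: Insert Example

- Rehash 33 at level L+1:
- H(33, 1) = 33 mod 6 = 3
- 33 moves to bucket 3.

Table state (before the move): bucket 0: 24 (overflow: 33, 60); bucket 1: 10; bucket 2: 11; bucket 3: 15; b = 0, L = 0; $H(x) = x \bmod (3 \cdot 2^L)$; 2 key values per bucket; load factor = 6/10 = 0.6; threshold = 0.75. Remaining inserts: 61, 41.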

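Slide 19: Insert Example

- Rehash 60 at level L+1:
- H(60, 1) = 60 mod 6 = 0
- 60 stays at bucket 0.

Table state: bucket 0: 24 (overflow: 60); bucket 1: 10; bucket 2: 11; bucket 3: 15, 33; b = 0, L = 0; $H(x) = x \bmod (3 \cdot 2^L)$; 2 key values per bucket; load factor = 6/10 = 0.6; threshold = 0.75. Remaining inserts: 61, 41.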
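The rehash of bucket 0 only ever produced bucket 0 or bucket 3, and that follows from the hash function itself. The short derivation below is a sketch; the symbols $m$ and $q$ are introduced here for illustration and do not appear on the slides.

$$
x \bmod m = b,\quad m = n \cdot 2^L
\;\Longrightarrow\; x = qm + b
\;\Longrightarrow\;
x \bmod 2m =
\begin{cases}
b & \text{if } q \text{ is even} \\
b + m & \text{if } q \text{ is odd}
\end{cases}
$$

With $n = 3$, $L = 0$ and $b = 0$ we have $m = 3$: $24 = 8 \cdot 3$ and $60 = 20 \cdot 3$ have even $q$ and stay in bucket 0, while $15 = 5 \cdot 3$ and $33 = 11 \cdot 3$ have odd $q$ and move to bucket $b + m = 3$, which is exactly the bucket that was just added at the end.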

Slide 20: Insert Example

- Add 1 to b; it is less than 3, so we are done with the first split.
- I now have an empty overflow bucket; remove it and recalculate the load factor:

Table state (before these updates): bucket 0: 24, 60; bucket 1: 10; bucket 2: 11; bucket 3: 15, 33; b = 0, L = 0; $H(x) = x \bmod (3 \cdot 2^L)$; 2 key values per bucket; load factor = 6/10 = 0.6; threshold = 0.75. Remaining inserts: 61, 41.

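Slide 21: Insert Example

- The load factor is now 0.75, so I need to split again, this time with b = 1.

Table state: bucket 0: 24, 60; bucket 1: 10; bucket 2: 11; bucket 3: 15, 33; b = 1, L = 0; $H(x) = x \bmod (3 \cdot 2^L)$; 2 key values per bucket; load factor = 6/8 = 0.75; threshold = 0.75. Remaining inserts: 61, 41.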

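Slide 22: Insert Example

- Add bucket 4 and rehash all key values at bucket 1.
- 10 mod 6 = 4, so it should move:

Table state (before the move): bucket 0: 24, 60; bucket 1: 10; bucket 2: 11; bucket 3: 15, 33; bucket 4: (empty); b = 1, L = 0; $H(x) = x \bmod (3 \cdot 2^L)$; 2 key values per bucket; load factor = 6/10 = 0.6; threshold = 0.75. Remaining inserts: 61, 41.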

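Slide 23: Insert Example

- Note the update of b to 2; the load factor is OK, so continue with the insert of 61.

Table state: bucket 0: 24, 60; bucket 1: (empty); bucket 2: 11; bucket 3: 15, 33; bucket 4: 10; b = 2, L = 0; $H(x) = x \bmod (3 \cdot 2^L)$; 2 key values per bucket; load factor = 6/10 = 0.6; threshold = 0.75. Remaining inserts: 61, 41.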

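Slide 24: Insert Example

- bucket = H(61, 0) = 1
- Since bucket < b, recalculate at L+1:
- bucket = H(61, 1) = 1

Table state: bucket 0: 24, 60; bucket 1: (empty); bucket 2: 11; bucket 3: 15, 33; bucket 4: 10; b = 2, L = 0; $H(x) = x \bmod (3 \cdot 2^L)$; 2 key values per bucket; load factor = 6/10 = 0.6; threshold = 0.75. Remaining inserts: 61, 41.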

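Slide 25: Insert Example

- Finally, insert 41:
- bucket = H(41, 0) = 2
- bucket >= b, so bucket 2 it is:

Table state: bucket 0: 24, 60; bucket 1: 61; bucket 2: 11; bucket 3: 15, 33; bucket 4: 10; b = 2, L = 0; $H(x) = x \bmod (3 \cdot 2^L)$; 2 key values per bucket; load factor = 7/10 = 0.7; threshold = 0.75. Remaining inserts: 41.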

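Slide 26: Insert Example

- Load factor >= threshold, so split bucket 2:

Table state: bucket 0: 24, 60; bucket 1: 61; bucket 2: 11, 41; bucket 3: 15, 33; bucket 4: 10; b = 2, L = 0; $H(x) = x \bmod (3 \cdot 2^L)$; 2 key values per bucket; load factor = 8/10 = 0.8; threshold = 0.75. Remaining inserts: none.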

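Slide 27: Insert Example

- Both 11 and 41 are 5 mod 6, so both go to bucket 5.
- Update b...

Table state: bucket 0: 24, 60; bucket 1: 61; bucket 2: (empty); bucket 3: 15, 33; bucket 4: 10; bucket 5: 11, 41; b = 2, L = 0; $H(x) = x \bmod (3 \cdot 2^L)$; 2 key values per bucket; load factor = 8/12 = 0.67; threshold = 0.75. Remaining inserts: none.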

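Slide 28: Insert Example

- $b = 3 \cdot 2^L$, so set b = 0 and L = L + 1:

Table state (before the reset): bucket 0: 24, 60; bucket 1: 61; bucket 2: (empty); bucket 3: 15, 33; bucket 4: 10; bucket 5: 11, 41; b = 3, L = 0; $H(x) = x \bmod (3 \cdot 2^L)$; 2 key values per bucket; load factor = 8/12 = 0.67; threshold = 0.75. Remaining inserts: none.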

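Slide 29: Insert Example

- Done.

Table state: bucket 0: 24, 60; bucket 1: 61; bucket 2: (empty); bucket 3: 15, 33; bucket 4: 10; bucket 5: 11, 41; b = 0, L = 1; $H(x) = x \bmod (3 \cdot 2^L)$; 2 key values per bucket; load factor = 8/12 = 0.67; threshold = 0.75. Remaining inserts: none.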

Slide 30: Insert Example

Final table: bucket 0: 24, 60; bucket 1: 61; bucket 2: (empty); bucket 3: 15, 33; bucket 4: 10; bucket 5: 11, 41.
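A quick way to sanity-check the final table is to look every key up again. The slides do not show a search routine, so the sketch below assumes that a lookup uses the same address calculation as an insert; `find` and the constant names are hypothetical.

```python
N, LEVEL, NEXT_SPLIT = 3, 1, 0   # n, L and b at the end of the insert example
FINAL_TABLE = [[24, 60], [61], [], [15, 33], [10], [11, 41]]  # index = bucket number


def find(key, table=FINAL_TABLE, level=LEVEL, b=NEXT_SPLIT, n=N):
    """Return the bucket number holding key, or None if the key is absent."""
    bucket = key % (n * 2 ** level)
    if bucket < b:
        bucket = key % (n * 2 ** (level + 1))
    return bucket if key in table[bucket] else None


for key in (24, 10, 15, 33, 60, 11, 61, 41):
    print(key, "->", find(key))
print(find(7))   # None: 7 hashes to bucket 1, which holds only 61
```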

Slide 31: Deleting with Extendible Hashing

- Delete works the opposite of insert:
  - When the load factor falls to or below a lower threshold, combine buckets.
  - Note: if b = 0, it is necessary to decrement the level (unless L is already 0).

Slide 32: Delete Algorithm

- Hash the key value to delete in the standard way, hashing at level L+1 if necessary.
  - If the key value is not found, report failure and stop.
  - Else continue.
- Update the load factor.

Slide 33: Delete Algorithm II

- If the load factor <= the lower threshold:
  - Decrement b.
  - If b == -1:
    - If L == 0, set b = 0 and stop.
    - Otherwise, set L = L - 1 and $b = n \cdot 2^L - 1$.
  - Combine the last bucket with bucket b.
  - Repeat if necessary.
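The delete steps can be sketched in the same style. The code below starts from the final table of the insert example and removes 60, 10 and 33, the keys that the delete example actually removes. The class name `ShrinkingHashFile`, its constructor arguments, and the choice to derive the slot count from the chain lengths are assumptions made for illustration.

```python
import math


class ShrinkingHashFile:
    def __init__(self, buckets, level, next_split, n=3, capacity=2, lower=0.5):
        self.buckets = [list(chain) for chain in buckets]   # one list per bucket chain
        self.level, self.next_split = level, next_split     # L and b
        self.n, self.capacity, self.lower = n, capacity, lower

    def address(self, x):
        bucket = x % (self.n * 2 ** self.level)
        if bucket < self.next_split:
            bucket = x % (self.n * 2 ** (self.level + 1))
        return bucket

    def load_factor(self):
        slots = sum(self.capacity * max(1, math.ceil(len(c) / self.capacity))
                    for c in self.buckets)
        return sum(len(c) for c in self.buckets) / slots

    def _combine(self):
        """Undo the most recent split; return False if the table cannot shrink."""
        b = self.next_split - 1
        if b == -1:
            if self.level == 0:
                return False                     # b stays 0: already at n buckets
            self.level -= 1
            b = self.n * 2 ** self.level - 1
        self.next_split = b
        self.buckets[b].extend(self.buckets.pop())   # merge last bucket into bucket b
        return True

    def delete(self, x):
        chain = self.buckets[self.address(x)]
        if x not in chain:
            return False                         # key not found: report failure
        chain.remove(x)
        while self.load_factor() <= self.lower:
            if not self._combine():
                break
        return True


table = ShrinkingHashFile([[24, 60], [61], [], [15, 33], [10], [11, 41]],
                          level=1, next_split=0)
for key in (60, 10, 33):
    table.delete(key)
print(table.buckets)                  # [[24], [61], [11, 41], [15]]
print(table.level, table.next_split)  # 0 1
```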

Slide 34: Delete Example

- Let’s start with the final table from our insert example.
- We’ll use 0.5 as our lower threshold.

Table state: bucket 0: 24, 60; bucket 1: 61; bucket 2: (empty); bucket 3: 15, 33; bucket 4: 10; bucket 5: 11, 41; b = 0, L = 1; $H(x) = x \bmod (3 \cdot 2^L)$; 2 key values per bucket; load factor = 8/12 = 0.67; lower threshold = 0.5. Keys to delete: 60, 10, 33.

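Slide 35: Delete Example

- Delete 60:
- H(60, 1) = 0, which is >= b.
- Remove 60 from bucket 0:

Table state (before the removal): bucket 0: 24, 60; bucket 1: 61; bucket 2: (empty); bucket 3: 15, 33; bucket 4: 10; bucket 5: 11, 41; b = 0, L = 1; $H(x) = x \bmod (3 \cdot 2^L)$; 2 key values per bucket; load factor = 8/12 = 0.67; lower threshold = 0.5. Keys to delete: 60, 10, 33.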

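Slide 36: Delete Example

- Delete 10:
- H(10, 1) = 4, which is >= b.
- Remove 10 from bucket 4:

Table state (before the removal): bucket 0: 24; bucket 1: 61; bucket 2: (empty); bucket 3: 15, 33; bucket 4: 10; bucket 5: 11, 41; b = 0, L = 1; $H(x) = x \bmod (3 \cdot 2^L)$; 2 key values per bucket; load factor = 7/12 = 0.58; lower threshold = 0.5. Keys to delete: 10, 33.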

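Slide 37: Delete Example

- Time to combine buckets.
- Decrementing b results in b = -1, so
- set L = 0 and $b = 3 \cdot 2^0 - 1 = 2$.

Table state (before the update): bucket 0: 24; bucket 1: 61; bucket 2: (empty); bucket 3: 15, 33; bucket 4: (empty); bucket 5: 11, 41; b = 0, L = 1; $H(x) = x \bmod (3 \cdot 2^L)$; 2 key values per bucket; load factor = 6/12 = 0.5; lower threshold = 0.5. Keys to delete: 33.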

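Slide 38: Delete Example

- Next, combine the last bucket (5) with bucket 2:

Table state (before the merge): bucket 0: 24; bucket 1: 61; bucket 2: (empty); bucket 3: 15, 33; bucket 4: (empty); bucket 5: 11, 41; b = 2, L = 0; $H(x) = x \bmod (3 \cdot 2^L)$; 2 key values per bucket; load factor = 6/12 = 0.5; lower threshold = 0.5. Keys to delete: 33.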

Slide 39: Delete Example

- Bucket 5 is deleted and the load factor is updated.
- Load factor > lower threshold, so done.

Table state: bucket 0: 24; bucket 1: 61; bucket 2: 11, 41; bucket 3: 15, 33; bucket 4: (empty); b = 2, L = 0; $H(x) = x \bmod (3 \cdot 2^L)$; 2 key values per bucket; load factor = 6/10 = 0.6; lower threshold = 0.5. Keys to delete: 33.

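Slide 40: Delete Example

- Delete 33:
- H(33, 0) = 0 < b, so rehash at L+1:
- H(33, 1) = 3; remove 33 from bucket 3:

Table state (before the removal): bucket 0: 24; bucket 1: 61; bucket 2: 11, 41; bucket 3: 15, 33; bucket 4: (empty); b = 2, L = 0; $H(x) = x \bmod (3 \cdot 2^L)$; 2 key values per bucket; load factor = 6/10 = 0.6; lower threshold = 0.5. Keys to delete: 33.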

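Slide 41: Delete Example

- Load factor <= lower threshold, so time to combine...
- First, decrement b:

Table state (before the decrement): bucket 0: 24; bucket 1: 61; bucket 2: 11, 41; bucket 3: 15; bucket 4: (empty); b = 2, L = 0; $H(x) = x \bmod (3 \cdot 2^L)$; 2 key values per bucket; load factor = 5/10 = 0.5; lower threshold = 0.5. Keys to delete: none.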

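Slide 42: Delete Example

- Now, combine the last bucket (4) with bucket b = 1, and remove bucket 4.
- Update the load factor too:

Table state (before the merge): bucket 0: 24; bucket 1: 61; bucket 2: 11, 41; bucket 3: 15; bucket 4: (empty); b = 1, L = 0; $H(x) = x \bmod (3 \cdot 2^L)$; 2 key values per bucket; load factor = 5/10 = 0.5; lower threshold = 0.5. Keys to delete: none.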

Slide 43: Delete Example

- Load factor > lower threshold, so done.

Table state: bucket 0: 24; bucket 1: 61; bucket 2: 11, 41; bucket 3: 15; b = 1, L = 0; $H(x) = x \bmod (3 \cdot 2^L)$; 2 key values per bucket; load factor = 5/8 = 0.625; lower threshold = 0.5. Keys to delete: none.
