Download presentation

Presentation is loading. Please wait.

Published byKelly Tibbitts Modified over 2 years ago

1
1 Linear Hashing Appendix for Chapter 1

2
2 Linear Hashing Allow a hash file to expand and shrink dynamically without needing a directory. Suppose the file starts with M buckets numbered 0,1,…,M -1 and used h(K) = K mod M. This hash function is called initial hash function h i. Overflow can be handled by maintaining individual overflow chains for each bucket. When a collision leads to an overflow record in any file bucket, bucket 0 is split into 2 buckets: the original bucket 0 and a new bucket M at the end of the file. The records originally in bucket 0 are distributed between the two buckets based on h i+1 (K)=K mod 2M. Any records that hashed to bucket 0 based on h i will hash to either bucket 0 or bucket M based on h i+1.

3
3 Linear Hashing (cont.) With further overflow records, additional buckets are slit in the linear order 1,2, 3, 4,… If enough overflows occur, all the original buckets 0,1,…,M-1 will have been split, so the file now has 2M instead of M buckets and all buckets use the hash function h i+1. Hence, the records in overflow are redistributed into regular buckets, using h i+1. There is a value n – which is initially set to 0 and is incremented by 1 whenever a split occurs – is needed to determine which buckets have been split. To retrieve a record with hash key value K, first apply h i to K; if h i (K) < n, then apply the function h i+1 on K because the bucket is already split. Initially, n = 0, indicating h i applies to all buckets. When n = M, this means that all the original buckets have been split and h i+1 applies to all records in the file. At this point, n is reset to 0, and any collisions lead to overflow lead to the use of a new hash function h i+2 (K) = K mod 4M.

4
4 Linear Hashing (cont.) In general, h i+j (K) = K mod(2 j M), where j =0,1,2…. A hash function is needed whenever all the buckets 0,1,2,…,(2 j M)-1 have been split and n is reset to 0. Splitting can be controlled by monitoring the file load factor: l = r/(bfr*N) where r is the current number of records, bfr is the max number of records that can fit in a bucket and N is the current number of file buckets. Split can be triggered when the load of the file exceeds a certain threshold (e.g. 0.9) Buckets that have been split can also be combined if the load of the file falls below a given threshold (e.g. 0.7)

5
5

6
6

7
7

8
8

9
9

10
10

11
11

12
12 The search procedure for linear hashing if n = 0 then m h j (k) else begin m h j (k); if m < n then m h j+1 (k) end; search the bucket whose hash value is m (and its overflow, if any);

Similar presentations

OK

Copyright © 2011 Pearson Education, Inc. Publishing as Pearson Addison-Wesley Chapter 18 Indexing Structures for Files.

Copyright © 2011 Pearson Education, Inc. Publishing as Pearson Addison-Wesley Chapter 18 Indexing Structures for Files.

© 2017 SlidePlayer.com Inc.

All rights reserved.

Ads by Google

Ppt on formation of joint stock company Ppt on microsoft excel 2003 Ppt on group 17 elements 360 degree customer view ppt on iphone Ppt on pre ignition catalytic converter Ppt on science fiction in films Ppt on 21st century skills for education Animated ppt on magnetism for kids Download ppt on non conventional energy resources Ppt on pollution of air and water