# 11. Hash Tables. Heejin Park, College of Information and Communications, Hanyang University


1 11. Hash Tables Heejin Park College of Information and Communications Hanyang University

2 Contents: Direct-address tables, Hash tables, Hash functions

3 Direct address tables Generate a table T with |U| slots. Store the element with key k in slot k. Assume that no two elements have the same key. [Figure: universe of keys U = {0, 1, ..., 9} and actual keys K = {2, 3, 5, 8}. Table T has slots 0 through 9; slots 2, 3, 5, and 8 each hold a key and its satellite data, and the remaining slots are empty (/).]
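A minimal Python sketch of the direct-address table described on this slide, assuming integer keys drawn from U = {0, 1, ..., 9}. The class name `DirectAddressTable` and its method names are illustrative choices, not taken from the slides; `None` stands in for the empty-slot marker "/".

```python
class DirectAddressTable:
    """Direct-address table over the universe U = {0, 1, ..., universe_size - 1}."""

    def __init__(self, universe_size):
        # One slot per possible key, so space is proportional to |U|.
        self.slots = [None] * universe_size

    def insert(self, key, data):
        # The element with key k goes straight into slot k.
        self.slots[key] = (key, data)

    def search(self, key):
        # Returns the stored (key, data) pair, or None if the slot is empty.
        return self.slots[key]

    def delete(self, key):
        self.slots[key] = None


table = DirectAddressTable(10)
for key in (2, 3, 5, 8):  # the actual keys K from the figure
    table.insert(key, f"data-{key}")
```

Each operation is a single list index, which is why all three run in O(1) time, at the cost of allocating a slot for every key in U.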

5 Hash tables Space consumption of direct addressing: Θ(|U|). If the ratio |K|/|U| is small, most of the space allocated for T is wasted. Is it possible to reduce the space requirement to Θ(|K|) while keeping the running time O(1)?

6 Hash tables Hashing: the element with key k is stored in slot h(k) instead of slot k. h is called a hash function; it computes the slot from the key k. h : U → {0, 1, ..., m - 1}. We say that an element with key k hashes to slot h(k). We also say that h(k) is the hash value of key k.

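7 Hash tables [Figure: a hash function h maps the actual keys K = {k1, ..., k5}, a subset of the universe of keys U, to slots 0 through m-1 of table T. The occupied slots are h(k1), h(k3), h(k4), and h(k2) = h(k5); keys k2 and k5 hash to the same slot.]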

8 Hash tables Collision: two keys may hash to the same slot. Avoiding collisions: make h appear to be random, so that collisions are avoided or at least minimized. Avoiding collisions completely is impossible because |U| > m, so at least two keys in U must hash to the same slot.
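To make the collision argument concrete, here is a small Python sketch. It assumes the division method h(k) = k mod m as the hash function, which is a common textbook choice but is not specified on these slides; the helper name `make_division_hash` and the sample keys are also illustrative.

```python
def make_division_hash(m):
    """Return h : U -> {0, 1, ..., m - 1} using the division method (an assumption)."""
    def h(k):
        return k % m
    return h


m = 7
h = make_division_hash(m)

# Hash a few sample keys and group them by slot.
keys = [10, 22, 31, 4, 15]
by_slot = {}
for k in keys:
    by_slot.setdefault(h(k), []).append(k)

# Slots that received more than one key are collisions.
collisions = {slot: ks for slot, ks in by_slot.items() if len(ks) > 1}

# Pigeonhole: any m + 1 distinct keys must produce at least one collision.
slots_used = {h(k) for k in range(100, 100 + m + 1)}
```

Here keys 10 and 31 both land in slot 3, and keys 22 and 15 both land in slot 1. The last line hashes m + 1 distinct keys into at most m distinct slots, which is the |U| > m argument in miniature.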

9 Hash tables Collision resolution by chaining: put the elements hashed to the same slot in a linked list. [Figure: the actual keys K = {k1, ..., k8}, a subset of the universe of keys U, are hashed into table T. Keys that hash to the same slot are linked into one list; empty slots are marked /.]

10 Hash tables CHAINED-HASH-INSERT(T, x): insert x at the head of list T[h(key[x])]. CHAINED-HASH-DELETE(T, x): delete x from the list T[h(key[x])]. CHAINED-HASH-SEARCH(T, k): search for an element with key k in list T[h(k)].
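The three operations above can be sketched in Python as follows. This is one possible rendering, not the slides' own code: the class names `Node` and `ChainedHashTable` are made up for illustration, the hash function is assumed to be the division method k mod m, and chains are doubly linked so that a node can be unlinked without walking the list.

```python
class Node:
    """List element holding a key, satellite data, and prev/next links."""

    def __init__(self, key, data=None):
        self.key = key
        self.data = data
        self.prev = None
        self.next = None


class ChainedHashTable:
    def __init__(self, m):
        self.m = m
        self.heads = [None] * m  # heads[j] is the first node of chain j, or None

    def _h(self, key):
        # Assumed hash function (division method); not specified in the slides.
        return key % self.m

    def insert(self, x):
        """CHAINED-HASH-INSERT: put node x at the head of its chain."""
        j = self._h(x.key)
        x.prev = None
        x.next = self.heads[j]
        if x.next is not None:
            x.next.prev = x
        self.heads[j] = x

    def search(self, key):
        """CHAINED-HASH-SEARCH: walk the chain for h(key); None if absent."""
        node = self.heads[self._h(key)]
        while node is not None and node.key != key:
            node = node.next
        return node

    def delete(self, x):
        """CHAINED-HASH-DELETE: unlink node x using its prev/next pointers."""
        if x.prev is not None:
            x.prev.next = x.next
        else:
            self.heads[self._h(x.key)] = x.next
        if x.next is not None:
            x.next.prev = x.prev
        x.prev = x.next = None


table = ChainedHashTable(7)
nodes = {k: Node(k, f"data-{k}") for k in (10, 22, 31, 4, 15)}
for node in nodes.values():
    table.insert(node)
```

Note that `delete` takes the node itself rather than a key, matching CHAINED-HASH-DELETE(T, x); deleting by key would first require a search.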

11 Hash tables The worst-case running time for insertion is O(1), assuming that the element x being inserted is not already present in the table. Deletion of an element x takes O(1) time, assuming that the lists are doubly linked. For searching, the worst-case running time is proportional to the length of the list.

12 Hash tables Searching time for hashing with chaining: given a hash table T with m slots that stores n elements, we define the load factor α for T as α = n/m, the average number of elements stored in a chain.

13 Hash tables Worst case: all n keys hash to the same slot, creating a list of length n. Searching time: Θ(n) plus the time to compute the hash function. This is no better than using a single linked list for all the elements.
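The worst case is easy to provoke once the hash function is known. The sketch below assumes the division method k mod m (not specified on the slides) and feeds it keys that are all multiples of m; the helper names are illustrative, and plain Python lists stand in for the chains.

```python
def build_chains(keys, m):
    """Distribute keys into m chains using the assumed hash k mod m."""
    chains = [[] for _ in range(m)]
    for k in keys:
        chains[k % m].append(k)
    return chains


def comparisons_for_search(chains, key, m):
    """Count key comparisons made while scanning the chain for `key`."""
    count = 0
    for stored in chains[key % m]:
        count += 1
        if stored == key:
            break
    return count


m = 8
bad_keys = [m * i for i in range(100)]  # 0, 8, 16, ...: every key hashes to slot 0
chains = build_chains(bad_keys, m)

longest_chain = max(len(chain) for chain in chains)
worst_search = comparisons_for_search(chains, bad_keys[-1], m)
```

With these inputs every key lands in slot 0, so `longest_chain` equals n = 100 and searching for the key at the far end of the chain scans all 100 elements.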

14 Hash tables Average case: depends on how well the hash function h distributes the set of keys to be stored among the m slots. Simple uniform hashing: any given element is equally likely to hash into any of the m slots, independently of where any other element has hashed to.

15 Hash tables Let n_j denote the length of the list T[j] for j = 0, 1, ..., m - 1. Then n = n_0 + n_1 + … + n_{m-1}, and the average value of n_j is E[n_j] = α = n/m. Assuming that the hash value h(k) can be computed in O(1) time, the time required to search for an element with key k depends linearly on the length n_{h(k)} of the list T[h(k)].

16 Hash tables Consider two cases of search. When search is unsuccessful. No element in the table has key k. When search is successful. An element with key k is found.

17 Hash tables An unsuccessful search takes Θ(1 + α) expected time. A successful search takes Θ(1 + α) expected time.
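Both bounds can be justified with a short calculation. What follows is a sketch of one standard argument, assuming simple uniform hashing, O(1) hash computation, and insertion at the head of each list. The indicator variables X_{ij} and the insertion-order labels k_1, ..., k_n are notation introduced here, not taken from the slides.

```latex
% Unsuccessful search: the whole list T[h(k)] is scanned.
% Under simple uniform hashing its expected length is alpha.
\mathbb{E}\left[n_{h(k)}\right] = \alpha
\quad\Longrightarrow\quad
\text{expected time} = \Theta(1 + \alpha).

% Successful search: let k_i be the key of the i-th inserted element and
% X_{ij} = 1 if h(k_i) = h(k_j), else 0, so that E[X_{ij}] = 1/m.
% Elements inserted after k_i into the same list sit in front of it,
% so finding k_i examines 1 + \sum_{j > i} X_{ij} elements.
\mathbb{E}\left[\frac{1}{n}\sum_{i=1}^{n}\Bigl(1 + \sum_{j=i+1}^{n} X_{ij}\Bigr)\right]
= 1 + \frac{1}{nm}\sum_{i=1}^{n}(n - i)
= 1 + \frac{n-1}{2m}
= 1 + \frac{\alpha}{2} - \frac{\alpha}{2n}.

% Adding the O(1) time to compute the hash value:
\text{expected time} = \Theta\!\left(2 + \frac{\alpha}{2} - \frac{\alpha}{2n}\right) = \Theta(1 + \alpha).
```

In particular, if the number of slots is at least proportional to the number of elements, so that n = O(m), then α = O(1) and searching takes O(1) time on average.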
