Log Files. O(n) Data Structure Exercises 16.1.

Log Files

Data Structure Exercises 16.1

Hash Tables

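In general, the Summing Components approach can be extended to keys with m components. Let the key be k = (x_0, x_1, …, x_{m-1}), where each component x_i is an integer. We may use the sum of the components as its hash code:

hash code = x_0 + x_1 + … + x_{m-1}

For example, for a key made up of the characters t, e, m, p, 0, 1, with v(c) denoting the integer value of the character c, the hash code is

v(t) + v(e) + v(m) + v(p) + v(0) + v(1)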
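As a minimal sketch of the summing approach, the snippet below treats a string key as its sequence of characters and sums their integer values. Taking v(c) to be Python's `ord(c)`, and the key strings `"temp01"` and `"temp10"`, are assumptions made only for illustration.

```python
def sum_hash_code(key: str) -> int:
    """Hash code formed by summing the integer values of the key's components."""
    # Assumption: v(c) is the Unicode code point of c, i.e. ord(c).
    return sum(ord(c) for c in key)


print(sum_hash_code("temp01"))  # 535
print(sum_hash_code("temp10"))  # 535 as well: reordering components does not change the sum
```

The second call shows a weakness of plain summation: keys that are permutations of each other always receive the same hash code.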

Polynomial Hash Codes

In this approach, we choose a constant a ≠ 1 and calculate the integer value

x_0 a^{m-1} + x_1 a^{m-2} + … + x_{m-2} a + x_{m-1}

as the hash code, where the key is k = (x_0, x_1, …, x_{m-2}, x_{m-1}).

A carefully chosen value of the constant a can reduce the number of collisions significantly. According to some experimental studies, good values include 33, 37, 39, and 41.
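A polynomial hash code of the form x_0 a^{m-1} + x_1 a^{m-2} + … + x_{m-1} can be evaluated with Horner's rule, as in the sketch below. The function name, the use of `ord` for character values, and the default a = 33 are illustrative assumptions.

```python
def polynomial_hash_code(key: str, a: int = 33) -> int:
    """Evaluate x_0*a^(m-1) + x_1*a^(m-2) + ... + x_(m-1) using Horner's rule."""
    h = 0
    for c in key:
        # Multiply the running value by a, then add the next component.
        h = h * a + ord(c)  # assumption: component value is ord(c)
    return h


print(polynomial_hash_code("ab"))  # 97*33 + 98 = 3299
# Unlike plain summation, the position of each component now matters.
print(polynomial_hash_code("temp01") != polynomial_hash_code("temp10"))  # True
```

Python integers do not overflow, so the value grows with the key length; a compression step is still needed to map it into the bucket array.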

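Once a hash code has been computed, a compression function maps it to a bucket index in the range [0, N-1]. One such function is

h(k) = |ak + b| mod N

where N is a prime number, and a and b are nonnegative integers chosen at random when the compression function is determined, such that a mod N ≠ 0. This method is more sophisticated than simply taking |k| mod N, and it generally works better.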
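The sketch below builds one such compression function, assuming it has the form |a·k + b| mod N. The factory name `make_mad_compression` and the use of `random.Random` as the source of randomness are assumptions for illustration; a and b are fixed once, when the function is created.

```python
import random


def make_mad_compression(N: int, rng: random.Random):
    """Return a compression function k -> |a*k + b| mod N for a prime N."""
    a = rng.randrange(1, N)  # 1 <= a <= N-1, so a mod N != 0
    b = rng.randrange(0, N)  # 0 <= b <= N-1

    def compress(k: int) -> int:
        return abs(a * k + b) % N

    return compress


compress = make_mad_compression(13, random.Random(0))
print([compress(k) for k in (18, 41, 22, 44)])  # four bucket indices in 0..12
```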

Collision-Handling Schemes

Recall that if there is no collision, we can store the item (k, e) in the bucket array cell A[h(k)]. However, collisions do occur from time to time: two different keys, k_1 and k_2, cause the hash function to return the same value, h(k_1) = h(k_2). In that case we cannot simply store the item directly in A[h(k)]. There are two schemes for handling collisions:

1. Separate Chaining
2. Open Addressing

Separate Chaining

In this approach, what is stored in A[h(k)] is a reference to a sequence S_k rather than the item itself. All items whose keys have the same hash value h(k) are stored in S_k. The sequence S_k can be implemented as a log file.
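A minimal sketch of separate chaining follows, with each bucket holding an unsorted Python list that plays the role of the log-file sequence. The class name, the default table size 13, the use of `hash(k) % N` as the hash function, and the choice to overwrite an existing key are all assumptions for illustration.

```python
class ChainedHashTable:
    """Hash table in which each bucket refers to a sequence of (key, element) items."""

    def __init__(self, N: int = 13):
        self.N = N
        self.A = [[] for _ in range(N)]  # one unsorted sequence per bucket

    def _h(self, k) -> int:
        return hash(k) % self.N  # assumption: built-in hash, compressed by mod N

    def insert(self, k, e) -> None:
        bucket = self.A[self._h(k)]
        for idx, (key, _) in enumerate(bucket):
            if key == k:  # key already present: replace its element
                bucket[idx] = (k, e)
                return
        bucket.append((k, e))

    def find(self, k):
        # Linear scan of the one sequence that can contain k.
        for key, e in self.A[self._h(k)]:
            if key == k:
                return e
        return None


table = ChainedHashTable()
table.insert(5, "a")
table.insert(18, "b")  # 18 mod 13 == 5, so it shares a bucket with key 5
print(table.find(5), table.find(18), len(table.A[5]))  # a b 2
```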

Linear Probing - A simple open addressing scheme

Assume that we want to insert an item (k, e) and that i = h(k). The operation goes like this: if the bucket A[i] is not empty, we try the next bucket A[(i+1) mod N]. If this is not empty either, we try A[(i+2) mod N], and so on, until we find an empty bucket.
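The insertion procedure for linear probing can be sketched as follows. The function name, the use of `None` to mark an empty bucket, and the example hash function k mod 7 are assumptions for illustration.

```python
def linear_probe_insert(A, k, e, h):
    """Insert (k, e) into bucket array A, probing A[(i + j) mod N] for j = 0, 1, 2, ..."""
    N = len(A)
    i = h(k)
    for j in range(N):
        slot = (i + j) % N
        if A[slot] is None:  # None marks an empty bucket (assumption)
            A[slot] = (k, e)
            return slot
    raise RuntimeError("table is full")


A = [None] * 7
h = lambda k: k % 7
print([linear_probe_insert(A, k, str(k), h) for k in (3, 10, 17, 6, 13)])
# [3, 4, 5, 6, 0]: keys 10 and 17 collide with 3, and 13 wraps around past bucket 6
```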

One of the disadvantages of linear probing is that it tends to cluster the items of the dictionary into contiguous runs, which slows searches down considerably. To avoid this, we can use quadratic probing.

Quadratic Probing

Rather than searching the buckets A[(i + j) mod N] for j = 0, 1, 2, …, we search the buckets A[(i + j^2) mod N].
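A sketch of insertion with quadratic probing, where the j-th probe looks at bucket (i + j^2) mod N, might look like this. The function name, the `None` marker for empty buckets, and the limit of N probes are assumptions for illustration.

```python
def quadratic_probe_insert(A, k, e, h):
    """Insert (k, e) into A, probing A[(i + j*j) mod N] for j = 0, 1, 2, ..."""
    N = len(A)
    i = h(k)
    for j in range(N):  # assumption: give up after N probes
        slot = (i + j * j) % N
        if A[slot] is None:  # None marks an empty bucket (assumption)
            A[slot] = (k, e)
            return slot
    raise RuntimeError("no empty bucket found along the probe sequence")


A = [None] * 7
h = lambda k: k % 7
print([quadratic_probe_insert(A, k, str(k), h) for k in (3, 10, 17, 24)])
# [3, 4, 0, 5]: colliding keys are spread out instead of forming one contiguous run
```

Note that the quadratic probe sequence does not necessarily visit every bucket, so an insertion can fail even when the table still has empty buckets.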

Double Hashing

In this approach, we search the buckets A[(i + f(j)) mod N] for j = 0, 1, 2, …, where f(j) = j · h’(k) and h’(k) is a secondary hash function. The secondary hash function is not allowed to evaluate to zero, since a zero step would probe the same bucket forever. A common choice is h’(k) = q – (k mod q), where q < N is some prime number, that is, a number divisible only by 1 and itself.
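The following sketch inserts integer keys using double hashing with the secondary hash function q - (k mod q). The primary hash function k mod N, the values N = 13 and q = 7, the function name, and the `None` marker for empty buckets are assumptions chosen only for illustration.

```python
def double_hash_insert(A, k, e, q):
    """Insert (k, e) into A, probing A[(i + j*step) mod N] for j = 0, 1, 2, ..."""
    N = len(A)
    i = k % N           # primary hash (assumption: h(k) = k mod N)
    step = q - (k % q)  # secondary hash, always in 1..q, so never zero
    for j in range(N):
        slot = (i + j * step) % N
        if A[slot] is None:  # None marks an empty bucket (assumption)
            A[slot] = (k, e)
            return slot
    raise RuntimeError("no empty bucket found along the probe sequence")


A = [None] * 13
print([double_hash_insert(A, k, str(k), q=7) for k in (18, 44, 31)])
# [5, 10, 9]: all three keys hash to bucket 5, but their step sizes (3, 5, 4) differ
```

Because keys that collide on the primary hash usually get different step sizes, their probe sequences diverge instead of piling up in the same run of buckets.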


The Ordered Dictionary ADT
