Log Files. O(n) Data Structure Exercises 16.1.


Presentation on theme: "Log Files. O(n) Data Structure Exercises 16.1."— Presentation transcript:

1 Log Files

2

3

4

5 O(n)

6 Data Structure Exercises 16.1

7 Hash Tables

8

9

10

11

12

13

14

15

16

17

18

19

20

21 In general, the approach of Summing Components can be extended to keys with m components. Let the key be k = (x_0, x_1, …, x_{m-1}). We compute an integer from the components and use it as the hash code: hash code = x_0 + x_1 + … + x_{m-1}

22 v(t) + v(e) + v(m) + v(p) + v(0) + v(1)
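The sum above can be sketched in a few lines of Python. The slides do not say what v(c) is, so this sketch assumes v(c) is the character's Unicode code point (`ord`); the function name `sum_hash` is made up for illustration.

```python
def sum_hash(key: str) -> int:
    """Hash code obtained by summing the integer value of every component.

    Assumption: v(c) is taken to be ord(c); the slides leave v unspecified.
    """
    return sum(ord(ch) for ch in key)


# "temp01" and "temp10" contain the same characters, so their sums are equal.
print(sum_hash("temp01"), sum_hash("temp10"))
```

Because addition ignores the order of the components, any two keys that are permutations of each other receive the same hash code, which is the weakness that motivates a position-sensitive scheme.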

23 Polynomial Hash Codes In this approach, we choose a constant a ≠ 1 and calculate the integer value x_0 a^{m-1} + x_1 a^{m-2} + … + x_{m-2} a + x_{m-1}, where the key is k = (x_0, x_1, …, x_{m-2}, x_{m-1}).

24 Example with a = 9. A carefully chosen value of the constant a can significantly reduce the number of collisions. According to some experimental studies, good values include 33, 37, 39, and 41.
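A minimal sketch of a polynomial hash code, assuming the usual form x_0 a^(m-1) + x_1 a^(m-2) + … + x_(m-1) and evaluating it with Horner's rule. As before, treating each character as `ord(ch)` is an assumption, and `polynomial_hash` is an illustrative name.

```python
def polynomial_hash(key: str, a: int = 33) -> int:
    """Evaluate x_0*a^(m-1) + ... + x_(m-2)*a + x_(m-1) with Horner's rule.

    Assumption: each component x_i is ord() of the i-th character.
    """
    code = 0
    for ch in key:
        # Multiply the running value by a, then add the next component.
        code = code * a + ord(ch)
    return code


# Unlike a plain sum, the position of each character now matters.
print(polynomial_hash("stop"), polynomial_hash("pots"))
```

Python integers do not overflow, so the result can grow large for long keys. In a language with fixed-width integers the value would wrap around, which is one reason a compression step follows the hash code.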

25

26

27

28

29

30 where N is a prime number, and a and b are nonnegative integers chosen at random when the compression function is determined, such that a mod N ≠ 0. This method is more sophisticated and works better.
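The formula this slide refers to did not survive in the transcript. The sketch below assumes it has the common multiply-add-and-divide shape, |a·k + b| mod N, which matches the stated constraints on N, a, and b. The factory name `make_compression` is hypothetical.

```python
import random


def make_compression(N: int, rng: random.Random):
    """Build a compression function mapping integer hash codes into [0, N-1].

    Assumption: the function has the form |a*code + b| mod N with N prime.
    """
    a = rng.randrange(1, N)  # drawn from 1..N-1, so a mod N != 0
    b = rng.randrange(0, N)  # any nonnegative offset

    def compress(code: int) -> int:
        return abs(a * code + b) % N

    return compress


compress = make_compression(13, random.Random(0))
print([compress(code) for code in range(13)])
```

Since N is prime and a mod N ≠ 0, thirteen consecutive hash codes land in thirteen different buckets here, whatever values of a and b were drawn.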

31 Collision-Handling Schemes Recall that if there is no collision, we can store the item (k, e) in the bucket array cell A[h(k)]. However, collisions do occur from time to time. In a collision, two different keys k_1 and k_2 cause the hash function to return the same value: h(k_1) = h(k_2). Therefore we cannot simply store the item directly in A[h(k)]. There are two schemes for handling collisions: (1) Separate Chaining and (2) Open Addressing.

32 Separate Chaining In this approach, what is stored in A[h(k)] is a reference to a sequence S_k rather than the item itself. In turn, all items that have the same hash function value k are stored in S_k. The sequence S_k can be implemented as a log file.
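A minimal sketch of separate chaining, in which an unsorted Python list stands in for each log-file sequence. The class name, the default capacity, the use of Python's built-in `hash`, and the decision to overwrite an existing key are all assumptions made for illustration.

```python
class ChainedHashTable:
    """Hash table whose buckets are unsorted lists of (key, element) pairs."""

    def __init__(self, capacity: int = 11):
        # One independent, initially empty sequence per bucket.
        self._buckets = [[] for _ in range(capacity)]

    def _index(self, key) -> int:
        # Assumed hash function: built-in hash compressed by mod N.
        return hash(key) % len(self._buckets)

    def insert(self, key, element) -> None:
        bucket = self._buckets[self._index(key)]
        for pos, (k, _) in enumerate(bucket):
            if k == key:
                bucket[pos] = (key, element)  # overwrite an existing key
                return
        bucket.append((key, element))  # colliding keys share the sequence

    def find(self, key):
        # Linear scan of the one sequence the key can live in.
        for k, e in self._buckets[self._index(key)]:
            if k == key:
                return e
        return None


table = ChainedHashTable(5)
table.insert(3, "a")
table.insert(8, "b")  # 8 mod 5 == 3, so it joins the same sequence as key 3
print(table.find(3), table.find(8), table.find(13))
```

A search only scans the sequence for one bucket, so performance depends on how evenly the hash function spreads the keys.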

33

34

35

36

37

38 Data Structure Exercises 16.2

39

40 Linear Probing - A simple open addressing scheme Assume that we want to insert an item (k, e) and i = h(k). The operation goes like this: if the bucket A[i] is not empty, we try the next bucket A[(i+1) mod N]. If this is not empty either, we try A[(i+2) mod N], and so on, until we find an empty bucket.
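The insertion procedure can be sketched as follows, assuming integer keys, the hash function h(k) = k mod N, and `None` as the marker for an empty bucket. None of these choices come from the slides, and the function name is illustrative.

```python
def linear_probing_insert(table: list, key: int, element) -> int:
    """Insert (key, element) by linear probing and return the bucket used."""
    N = len(table)
    i = key % N  # assumed hash function h(k) = k mod N
    for j in range(N):
        slot = (i + j) % N  # A[i], A[(i+1) mod N], A[(i+2) mod N], ...
        if table[slot] is None:
            table[slot] = (key, element)
            return slot
    raise OverflowError("hash table is full")


table = [None] * 7
for key in (10, 17, 24):  # all three keys hash to bucket 3
    print(key, "->", linear_probing_insert(table, key, str(key)))
```

The three colliding keys end up in adjacent buckets, which is the kind of contiguous run that linear probing tends to build.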

41

42

43

44

45

46

47 One disadvantage of linear probing is that it tends to cluster the items of the dictionary into contiguous runs, which slows searches down considerably. To avoid this, we can use quadratic probing. Quadratic Probing Rather than searching the buckets A[(i + j) mod N] for j = 0, 1, 2, …, we search the buckets A[(i + j^2) mod N].
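A small sketch of quadratic probing, again assuming integer keys, h(k) = k mod N, and `None` for an empty bucket. The function names are illustrative.

```python
def quadratic_probe_sequence(i: int, N: int) -> list:
    """Buckets examined for j = 0, 1, ..., N-1: (i + j^2) mod N."""
    return [(i + j * j) % N for j in range(N)]


def quadratic_probing_insert(table: list, key: int, element) -> int:
    """Insert (key, element) by quadratic probing and return the bucket used."""
    N = len(table)
    i = key % N  # assumed hash function h(k) = k mod N
    for slot in quadratic_probe_sequence(i, N):
        if table[slot] is None:
            table[slot] = (key, element)
            return slot
    raise OverflowError("no empty bucket found along the probe sequence")


print(quadratic_probe_sequence(3, 7))
```

With N = 7 and i = 3 the sequence only ever reaches four distinct buckets (3, 4, 0, and 5). Quadratic probing can therefore fail to find an empty bucket even when one exists, which is why the sketch gives up after N probes.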

48

49 Double Hashing In this approach, we search the buckets A[(i + f(j)) mod N] for j = 0, 1, 2, …, where f(j) = j·h’(k) and h’(k) is the secondary hash function. The secondary hash function is not allowed to evaluate to zero. A common choice is h’(k) = q - (k mod q), where q < N is some prime number, that is, a number divisible only by 1 and itself.
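A sketch of insertion with double hashing, assuming integer keys, a primary hash function h(k) = k mod N, and `None` for an empty bucket. The table size 13 and q = 7 in the usage lines are example values chosen here, not taken from the slides.

```python
def double_hashing_insert(table: list, key: int, element, q: int) -> int:
    """Insert (key, element) by double hashing and return the bucket used."""
    N = len(table)
    i = key % N            # assumed primary hash function h(k) = k mod N
    step = q - (key % q)   # secondary hash h'(k); lies in 1..q, never zero
    for j in range(N):
        slot = (i + j * step) % N  # f(j) = j * h'(k)
        if table[slot] is None:
            table[slot] = (key, element)
            return slot
    raise OverflowError("hash table is full")


table = [None] * 13
for key in (18, 22, 44, 31):
    print(key, "->", double_hashing_insert(table, key, str(key), q=7))
```

Keys 18, 44, and 31 all hash to bucket 5, but 44 and 31 use different step sizes (5 and 4), so they follow different probe paths instead of piling up in one run.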

50 h’(k) = 6

51 Data Structure Exercises 17.1

52 The Ordered Dictionary ADT

53

54

