Presentation is loading. Please wait.

Presentation is loading. Please wait.

Handling Collisions Open Addressing SNSCT-CSE/16IT201-DS.

Similar presentations


Presentation on theme: "Handling Collisions Open Addressing SNSCT-CSE/16IT201-DS."— Presentation transcript:

1 Handling Collisions Open Addressing SNSCT-CSE/16IT201-DS

2 Handling Collisions Linear Probing
An alternative to resolving collisions with linked lists is to try alternative cells until an empty cell is found. Alternative Cells are h0(x), h1(x), h2(x),... where hi(x)=(hash(x)+f(i)) % TableSize with f(0)=0, f is the collision resolution strategy. Because all the data go inside the table For Linear probing, λ < 0.5 SNSCT-CSE/16IT201-DS

3 Linear Probing In linear probing f(i)=i (Popular Choice)
Let key x be stored in element hash(x)=t of the array 65(?) What do you do in case of a collision? If the hash table is not full, attempt to store key in the next array element (in this case (t+1)%N, (t+2)%N, (t+3)%N …) until you find an empty slot. SNSCT-CSE/16IT201-DS

4 Linear Probing Where do you store 65 ?
   attempts SNSCT-CSE/16IT201-DS

5 Linear Probing Performance (1)
If the table is relatively empty, blocks of occupied cells start forming (primary clustering) Expected # of probes for insertions and unsuccessful searches is ½(1+1/(1-λ)2) for successful searches is ½(1+1/(1-λ)) SNSCT-CSE/16IT201-DS

6 (earlier insertion are cheaper)
Linear Probing Performance (2) Assumptions: If clustering is not a problem, large table, probes are independent of each other Expected # of probes for unsuccessful searches (=expected # of probes until an empty cell is found) Expected # of probes for successful searches (=expected # of probes when an element was inserted) (=expected # of probes for an unsuccessful search) Average cost of an insertion (fraction of empty cells = 1 -λ) (earlier insertion are cheaper) SNSCT-CSE/16IT201-DS

7 Linear Probing Eliminates need for separate data structures (chains), and the cost of constructing nodes. Leads to problem of clustering. Elements tend to cluster in dense intervals in the array. Search efficiency problem remains. Deletion becomes trickier….      SNSCT-CSE/16IT201-DS

8 Deletion Problem -- SOLUTION
Standard deletion cannot be performed in a probing hash table, because the cell might have caused a collison to go past it. “Lazy” deletion Each cell is in one of 3 possible states: active empty deleted For Find or Delete only stop search when EMPTY state detected (not DELETED) SNSCT-CSE/16IT201-DS

9 Deletion-Aware Algorithms
Insert call Find, if cell = active already there if Cell = deleted or empty insert, cell = active Find cell empty NOT found cell deleted if key == key -> NOT FOUND else H = (H + 1) mod TS cell active if key == key -> FOUND Delete call Find, cell active DELETE; cell=deleted cell deleted NOT found SNSCT-CSE/16IT201-DS

10 Handling Collisions Quadratic Probing SNSCT-CSE/16IT201-DS

11 Quadratic Probing In quadratic probing f(i)=i2 (Popular Choice)
Let key x be stored in element hash(x)=t of the array 65(?) What do you do in case of a collision? If the hash table is not full, attempt to store key in array elements (t+12)%N, (t+22)%N, (t+32)%N … until you find an empty slot. SNSCT-CSE/16IT201-DS

12 Quadratic Probing Where do you store 65 ? hash(65)=t=5
    t t t t+9 attempts SNSCT-CSE/16IT201-DS

13 Quadratic Probing Theorem: If quadratic probing is used, and the table size is prime, then a new element can always be inserted if the table is at least half empty. Proof: Let M=TableSize be an odd prime > 3, -prove first alternative locations are all distinct. -Assume two of these locations are the same and Then, Since TableSize is prime and i and j are distinct (also ≤ ), this is not possible. It follows that the first M/2 alternative are all distinct, and an insertion must succeed if the table is at least half full. SNSCT-CSE/16IT201-DS

14 Quadratic Probing Tends to distribute keys better than linear probing, alleviates the problem of clustering (primary clustering. There remains the problem of secondary clustering in which elements that hash to the same position will probe the same alternative cells Runs the risk of an infinite loop on insertion, unless precautions are taken. E.g., consider inserting the key 16 into a table of size 16, with positions 0, 1, 4 and 9 already occupied. Therefore, table size should be prime. SNSCT-CSE/16IT201-DS

15 Handling Collisions Double Hashing SNSCT-CSE/16IT201-DS

16 Double Hashing f(i)=i*hash2(x) is a popular choice
hash2(x)should never evaluate to zero Now the increment is a function of the key The slots visited by the hash function will vary even if the initial slot was the same Avoids clustering Theoretically interesting, but in practice slower than quadratic probing, because of the need to evaluate a second hash function. SNSCT-CSE/16IT201-DS

17 Double Hashing Typical second hash function hash2(x)=R − ( x % R )
where R is a prime number, R < N SNSCT-CSE/16IT201-DS

18 Double Hashing Where do you store 99 ? hash(99)=t=9
Let hash2(x) = 11 − (x % 11), hash2(99)=d=11 Note: R=11, N=15 Attempt to store key in array elements (t+d)%N, (t+2d)%N, (t+3d)%N … Array:     t t t t+33 attempts Where would you store: 127? SNSCT-CSE/16IT201-DS

19 Double Hashing Let f2(x)= 11 − (x % 11) hash2(127)=d=5 Array:
   t t t+5 attempts Infinite loop! SNSCT-CSE/16IT201-DS


Download ppt "Handling Collisions Open Addressing SNSCT-CSE/16IT201-DS."

Similar presentations


Ads by Google