Download presentation

Presentation is loading. Please wait.

Published byDonavan Summersett Modified over 2 years ago

1
Advanced Algorithms for Massive Datasets Basics of Hashing

2
The Dictionary Problem Definition. Let us given a dictionary S of n keys drawn from a universe U. We wish to design a (dynamic) data structure that supports the following operations: membership(k) checks whether k є S insert(k) S = S U {k} delete(k) S = S – {k}

3
Collision resolution: chaining

4
Key issue: a good hash function Basic assumption: Uniform hashing Avg #keys per slot = n * (1/m) = n/m = (load factor)

5
Search cost m = (n)

6
In summary... Hashing with chaining: O(1) search/update time in expectation O(n) optimal space Simple to implement...but: Space = m log 2 n + n (log 2 n + log 2 |U|) bits Bounds in expectation Uniform hashing is difficult to guarantee Open addressing array chains

7
In practice Typically we use simple hash functions: prime

8
Enforce “goodness” As in Quicksort for the selection of its pivot, select the h() at random From which set we should draw h ?

9
An example of Universal hash Each a i is selected at random in [0,m) k0k0 k1k1 k2k2 krkr ≈log 2 m r ≈ log 2 U / log 2 m a0a0 a1a1 a2a2 arar K a prime U = universe of keys m = Table size not necessarily: (...mod p) mod m

10
Simple and efficient universal hash h a (x) = ( a*x mod 2 r ) div 2 r-t 0 ≤ x < |U| = 2 r a is odd Few key issues: Consists of t bits, so m = 2 t Probability of collision is ≤ 1/2 t-1 (= 2/m)

11
Minimal Ordered Perfect Hashing 11 m = 1.25 n n=12 m=15 The h 1 and h 2 are not perfect Minimal, not minimum = lexicographic rank

12
h(t) = [ g( h 1 (t) ) + g ( h 2 (t) ) ] mod n 12 h is perfect and ordered, no strings need to be stored space is negligible for h 1 and h 2 and it is = m log n, for g

13
How to construct it 13 Term = edge, its vertices are given by h1 and h2 All g(v)=0; then assign g() by difference with known h() Acyclic ok No-Acycl regenerate hashes

Similar presentations

OK

© 2006 Pearson Addison-Wesley. All rights reserved13 B-1 Chapter 13 (excerpts) Advanced Implementation of Tables CS102 Sections 51 and 52 Marc Smith and.

© 2006 Pearson Addison-Wesley. All rights reserved13 B-1 Chapter 13 (excerpts) Advanced Implementation of Tables CS102 Sections 51 and 52 Marc Smith and.

© 2017 SlidePlayer.com Inc.

All rights reserved.

Ads by Google

Ppt on bluetooth architecture overview Ppt on sets theory Ppt on cse related topics about information Ppt on shell scripting and logic Ppt on charge coupled device ccd Ppt on porter's five forces analysis of amazon Ppt on ready to serve beverages images Ppt on content addressable memory design Training ppt on msds Ppt on business model canvas