Hash Function comparison for PSAMP purposes: results and suggestions Maurizio Molina,


1 Hash Function comparison for PSAMP purposes: results and suggestions Maurizio Molina,

2 Motivation, Background
© NEC Europe Ltd., 2002 Network Laboratories, Heidelberg
In PSAMP, hash functions operating on a portion of the packet header and/or payload are useful for two reasons:
–Emulate random sampling: Sampling ID
–Generate a “compact” packet identifier: Digest ID
It is necessary that PSAMP indicates which hash function to use, for consistent packet sampling and identification
But the requirements are different in the two cases, so the criteria leading to the choice of the “best” hash function are different
–And the choice of the “best” hash function can be different as well!
We compared 4 hash functions:
–IPSX
–BOB
–MMH
–CRC32

3 Hash functions for random sampling emulation
Requirements:
–Good uniformity of distribution: the Sampling ID must be uniformly distributed over the Hash Range (the space of the possible hash results)
Ideally, also when the hash input is not uniform at all!
–Computation speed: it must operate at line rate
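A minimal sketch of how a Sampling ID can emulate random sampling: a packet is selected when its hash falls in a chosen sub-range of the Hash Range. The slides do not specify the selection rule, so the threshold rule, the function name `is_sampled`, and the use of Python's `zlib.crc32` (standing in for any of the four compared functions) are assumptions made for illustration only.

```python
import zlib

HASH_RANGE = 2**32  # zlib.crc32 returns an unsigned 32-bit value


def is_sampled(packet_bytes: bytes, sampling_fraction: float) -> bool:
    """Select a packet when its hash lies in the lowest part of the hash range.

    Assumption: the selection range is [0, sampling_fraction * HASH_RANGE).
    """
    threshold = int(sampling_fraction * HASH_RANGE)
    return zlib.crc32(packet_bytes) < threshold


# Hypothetical packets: a 4-byte counter followed by a fixed payload.
packets = [i.to_bytes(4, "big") + b"payload" for i in range(10000)]
selected = [p for p in packets if is_sampled(p, 0.1)]
print(f"selected {len(selected)} of {len(packets)} packets")
```

Because the decision depends only on the packet content, two observation points that apply the same hash and the same range select the same packets, which is the consistency the slides ask for. How close the selected share is to the target fraction depends on the uniformity of the hash.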

4 Testing for uniformity of distribution: method (1/2)
Subdivide the hash range into N bins and evaluate the fraction of hash results falling in each bin
–Ideally: 1/N, but in reality...
Repeat the experiment 60 times, and calculate confidence intervals
[Figure: reference level 1/N]

5 Testing for uniformity of distribution: method (2/2)
Metrics:
–Std deviation of averages
–Average of conf. interval size
The lower the metric, the better!
[Figure: two panels showing “better” and “worse” cases relative to the 1/N level]
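The two metrics can be sketched as follows. This is only an illustration under stated assumptions: the confidence level (95%, normal approximation with z = 1.96), the number of bins, the synthetic random input, and the helper names `bin_fractions` and `uniformity_metrics` are not taken from the slides, and `zlib.crc32` stands in for the hash under test.

```python
import random
import statistics
import zlib


def bin_fractions(hash_values, n_bins, hash_range=2**32):
    """Fraction of hash results falling in each of n_bins equal-width bins."""
    counts = [0] * n_bins
    for h in hash_values:
        counts[h * n_bins // hash_range] += 1
    total = len(hash_values)
    return [c / total for c in counts]


def uniformity_metrics(runs, n_bins, z=1.96):
    """Return (std deviation of per-bin averages, average confidence interval size).

    runs: one list of hash values per repetition of the experiment.
    Assumption: confidence intervals use a normal approximation with factor z.
    """
    per_run = [bin_fractions(r, n_bins) for r in runs]
    n_runs = len(per_run)
    averages, ci_sizes = [], []
    for b in range(n_bins):
        samples = [fractions[b] for fractions in per_run]
        averages.append(statistics.mean(samples))
        half_width = z * statistics.stdev(samples) / n_runs ** 0.5
        ci_sizes.append(2 * half_width)
    return statistics.pstdev(averages), statistics.mean(ci_sizes)


# 60 repetitions, as on the slide; 500 synthetic 20-byte inputs per repetition.
rng = random.Random(0)
runs = [
    [zlib.crc32(rng.getrandbits(160).to_bytes(20, "big")) for _ in range(500)]
    for _ in range(60)
]
std_of_averages, avg_ci_size = uniformity_metrics(runs, n_bins=16)
print(std_of_averages, avg_ci_size)
```

For a perfectly uniform hash every per-bin average would sit at 1/N, so the first metric would be zero; both values shrink as the hash becomes more uniform and as more inputs are used per repetition.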

6 Testing for uniformity of distribution: results
The performance of the 4 hash functions is very close
–Both with real and synthetic input packet traces

7 Testing for speed: results
IPSX is much faster (6.69 times faster than BOB)!

8 Hash functions for random sampling emulation: Conclusion
IPSX, which is the simplest and fastest, has uniformity of distribution comparable to the others
–IPSX is the preferred one!
–MUST for IPSX, MAY for BOB (second in rank)

9 Hash functions for compact pkt identifier generation
Requirements:
–Low collision probability of the Digest ID
Ideally, the collision probability should be low also when the hash inputs are very similar (or “slowly variant”)
–Computation speed, but more relaxed with respect to the random sampling emulation case, as this hash will likely operate only on sampled packets
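One way to estimate the collision behaviour on “slowly variant” inputs is to hash a set of nearly identical keys and count how many distinct keys share a digest. The sketch below is an assumption-laden illustration, not the test procedure used for the slides: the key layout (fixed 12-byte prefix plus a 4-byte counter), the truncation of the digest to `digest_bits`, and the name `collision_fraction` are all hypothetical, and `zlib.crc32` stands in for the hash under test.

```python
import zlib


def collision_fraction(inputs, digest_bits=32):
    """Fraction of distinct inputs whose digest duplicates another input's digest.

    Assumption: the Digest ID is the low digest_bits bits of CRC32.
    """
    mask = (1 << digest_bits) - 1
    distinct = set(inputs)
    digests = {zlib.crc32(data) & mask for data in distinct}
    return (len(distinct) - len(digests)) / len(distinct)


# Slowly variant keys: identical except for an incrementing 4-byte counter.
prefix = bytes(12)
keys = [prefix + i.to_bytes(4, "big") for i in range(1000)]

for bits in (8, 16, 32):
    print(bits, collision_fraction(keys, digest_bits=bits))
```

Shorter digests force collisions by simple counting (1000 keys cannot fit into 256 values without repeats), so the interesting comparison between hash functions is at a fixed digest size and over realistic key lengths.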

10 Testing for collision probability: results
We excluded IPSX because its fixed input key size (16 bytes) is a limitation for achieving small collision probabilities
Results:
–BOB and CRC32 exploit longer keys better than MMH
–BOB and CRC32 have similar performance

11 Testing for speed: results
BOB is the best, but the difference from CRC32 is small!

12 Hash functions for compact pkt identifier generation: Conclusion
Based on these results alone, we would indicate BOB as the preferred one
But the differences with CRC32 seem small, while CRC32 is more established
–In draft-ietf-psamp-sample-tech-04.txt we indicated CRC32 as the preferred one!
–MUST for CRC32, MAY for BOB (first in rank, but “new”)
Discussion: does this “close” the issue too early (tests were limited...)?
–Alternative: indicate two MUSTs (CRC32 and BOB)?

