Segmented Hash: An Efficient Hash Table Implementation for High Performance Networking Subsystems
Sailesh Kumar, Patrick Crowley


1 Segmented Hash: An Efficient Hash Table Implementation for High Performance Networking Subsystems Sailesh Kumar Patrick Crowley

2 Problem Statement
Sailesh Kumar - 4/30/2015
- How to implement deterministic hash tables
- Near worst-case O(1), deterministic performance
- We are given a small amount of on-chip memory
- On-chip memory is limited to 1-2 bytes per table entry
- In this paper we tackle the above problem

3 Hash Tables
- A hash table uses a hash function to index the table entries
  » hash("apple") = 5, hash("watermelon") = 3, hash("grapes") = 9, hash("cantaloupe") = 7, hash("kiwi") = 0, hash("mango") = 6, hash("banana") = 2
  » hash("honeydew") = 2
- This is called a collision
  » Now what?
[Figure: hash table slots labeled 0-9 holding kiwi, banana, watermelon, apple, mango, cantaloupe and grapes; three ways to place honeydew are labeled Linear Probing, Double Hashing (Hash2(honeydew) = 3) and Linear Chaining.]
- The number of keys mapped to a bucket is called the collision chain length
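
The three collision-handling options named on this slide can be sketched in a few lines of Python. This is only an illustration: the hash values and Hash2(honeydew) = 3 are copied from the slide, while the 10-slot table size and the helper names (`build_table`, `probe`, `chain_lengths`) are assumptions made for the example.

```python
# Slot assignments copied from the slide; the 10-slot table size is an assumption.
SLIDE_HASH = {
    "apple": 5, "watermelon": 3, "grapes": 9, "cantaloupe": 7,
    "kiwi": 0, "mango": 6, "banana": 2, "honeydew": 2,
}
TABLE_SIZE = 10


def build_table(keys):
    """Place non-colliding keys directly at their hashed slot."""
    table = [None] * TABLE_SIZE
    for key in keys:
        table[SLIDE_HASH[key]] = key
    return table


def probe(table, start, step):
    """Open addressing: visit start, start+step, ... (mod size) until a slot is free."""
    for i in range(len(table)):
        slot = (start + i * step) % len(table)
        if table[slot] is None:
            return slot
    return None  # no free slot reachable with this step


def chain_lengths(keys):
    """Chaining: count how many keys share each bucket."""
    lengths = [0] * TABLE_SIZE
    for key in keys:
        lengths[SLIDE_HASH[key]] += 1
    return lengths


first_seven = [k for k in SLIDE_HASH if k != "honeydew"]
table = build_table(first_seven)

linear_slot = probe(table, SLIDE_HASH["honeydew"], step=1)  # linear probing
double_slot = probe(table, SLIDE_HASH["honeydew"], step=3)  # step = Hash2(honeydew)
max_chain = max(chain_lengths(SLIDE_HASH))                  # chaining: both keys stay in bucket 2
```

Under these assumptions honeydew ends up in slot 4 with linear probing and in slot 8 with double hashing, while chaining leaves bucket 2 with a collision chain length of 2.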

4 Performance Analysis
- Average performance is O(1)
- However, worst-case performance is O(n)
- In fact, the probability of a collision chain longer than 1 is quite high
[Figure annotations: "These keys will take twice as long to be probed"; "These will take three times as long to be probed"; there is a fairly high probability that performance drops to one half or one third.]

5 Segmented Hashing
- Uses the power of multiple choices
  » Proposed and used earlier by several authors
- An N-way segmented hash
  » Logically divides the hash table array into N equal segments
  » Maps each incoming key onto one bucket in each segment
  » Picks the bucket that is either empty or holds the fewest keys
[Figure: a 4-way segmented hash table. Keys k_i and k_(i+1) are hashed by h( ), and each is mapped to one of its candidate buckets.]
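
A minimal sketch of the N-way insertion rule described on this slide follows. It is not the authors' implementation: the class name `SegmentedHash`, the salted BLAKE2 hash used to pick one bucket per segment, and the tie-breaking rule (lowest segment number wins) are all assumptions for illustration.

```python
import hashlib


def bucket_index(key, segment, buckets_per_segment):
    """Hypothetical per-segment hash: salt the key with the segment number."""
    digest = hashlib.blake2b(f"{segment}:{key}".encode(), digest_size=8).digest()
    return int.from_bytes(digest, "big") % buckets_per_segment


class SegmentedHash:
    def __init__(self, num_segments, buckets_per_segment):
        self.num_segments = num_segments
        self.buckets_per_segment = buckets_per_segment
        # segments[s][b] is the collision chain (a list of keys) of bucket b in segment s
        self.segments = [
            [[] for _ in range(buckets_per_segment)] for _ in range(num_segments)
        ]

    def _candidates(self, key):
        """One candidate bucket per segment for this key."""
        return [
            self.segments[s][bucket_index(key, s, self.buckets_per_segment)]
            for s in range(self.num_segments)
        ]

    def insert(self, key):
        """Place the key in the candidate bucket that holds the fewest keys."""
        candidates = self._candidates(key)
        segment = min(range(self.num_segments), key=lambda s: len(candidates[s]))
        candidates[segment].append(key)
        return segment

    def lookup(self, key):
        """Probe segments in order; return (segment or None, number of probes)."""
        probes = 0
        for segment, bucket in enumerate(self._candidates(key)):
            probes += 1
            if key in bucket:
                return segment, probes
        return None, probes

    def max_chain_length(self):
        return max(len(bucket) for seg in self.segments for bucket in seg)


table = SegmentedHash(num_segments=4, buckets_per_segment=16)
placed = {f"key{i}": table.insert(f"key{i}") for i in range(64)}  # 100% load
```

Calling `table.max_chain_length()` after filling the table shows the longest collision chain this policy produced. Note that `lookup` for an absent key always probes all 4 segments.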

6 Segmented Hash Performance
- More segments improve the probabilistic performance
  » With 64 segments, the probability of a collision chain longer than 2 is nearly zero even at 100% load
  » More deterministic hash table performance

7 An Obvious Deficiency
- O(N) memory probes per query
  » Requires N times higher memory bandwidth
- How can we ensure O(1) memory probes per query?
- Use Bloom filters implemented in a small on-chip memory (they filter out unnecessary memory accesses)
- Before going further, a brief introduction to Bloom filters
[Figure: h(k_i) selects one bucket in each of the 4 segments. Every query requires 4 probes.]

8 Bloom Filter
[Figure: key X is hashed by H1, H2, H3, H4, ..., Hk; the selected bits of the m-bit array are 1.]

9 Bloom Filter
[Figure: key Y is hashed by H1, H2, H3, H4, ..., Hk; further bits of the m-bit array are 1.]

10 Bloom Filter
[Figure: key X is hashed by H1, H2, H3, H4, ..., Hk; all selected bits of the m-bit array are 1: match.]

11 Bloom Filter
[Figure: key W is hashed by H1, H2, H3, H4, ..., Hk; all selected bits of the m-bit array are 1: match (false positive).]
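
The Bloom filter behaviour shown on slides 8 to 11 can be sketched as follows. This is a generic textbook Bloom filter, not code from the presentation; the class name `BloomFilter`, the method names, and the use of salted BLAKE2 digests as the k hash functions H1 ... Hk are assumptions.

```python
import hashlib


class BloomFilter:
    def __init__(self, m_bits, k_hashes):
        self.m = m_bits
        self.k = k_hashes
        self.bits = [0] * m_bits  # the m-bit array

    def _positions(self, key):
        """Bit positions selected by the k hash functions (salted BLAKE2, an assumption)."""
        for i in range(self.k):
            digest = hashlib.blake2b(f"{i}:{key}".encode(), digest_size=8).digest()
            yield int.from_bytes(digest, "big") % self.m

    def add(self, key):
        """Set every bit selected by the key."""
        for position in self._positions(key):
            self.bits[position] = 1

    def might_contain(self, key):
        """True if all selected bits are 1. May be a false positive, never a false negative."""
        return all(self.bits[position] for position in self._positions(key))


bloom = BloomFilter(m_bits=64, k_hashes=4)
bloom.add("X")
bloom.add("Y")
```

Here `bloom.might_contain("X")` is guaranteed to be true. A query for a key that was never added, such as "W", usually returns false, but it returns true whenever all of its bits happen to have been set by other keys, which is the false positive of slide 11.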

12 Adding per Segment Filters
[Figure: h(k_i) selects one bucket per segment; k_i can go to any of the 3 buckets. Each segment has a Bloom filter of m_b bits, indexed by h_1(k_i), h_2(k_i), ..., h_k(k_i).]
- We can select any of the above three segments and insert the key into the corresponding filter
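
One way to picture per-segment filters is the sketch below: each segment gets its own small Bloom filter, and a lookup probes only those segments whose filter reports a match. The class `FilteredSegmentedHash`, the salted hash helper `h`, and the choice to share the same k bit positions across all segment filters are assumptions for illustration; deletions are not handled.

```python
import hashlib


def h(key, salt, modulus):
    """Hypothetical hash family: a salted BLAKE2 digest reduced modulo `modulus`."""
    digest = hashlib.blake2b(f"{salt}:{key}".encode(), digest_size=8).digest()
    return int.from_bytes(digest, "big") % modulus


class FilteredSegmentedHash:
    def __init__(self, num_segments, buckets_per_segment, filter_bits, k):
        self.n = num_segments
        self.b = buckets_per_segment
        self.m = filter_bits
        self.k = k
        # Off-chip part: the buckets of every segment.
        self.buckets = [[[] for _ in range(self.b)] for _ in range(self.n)]
        # On-chip part: one Bloom filter (bit list) per segment.
        self.filters = [[0] * self.m for _ in range(self.n)]

    def _bits(self, key):
        return [h(key, f"bloom{i}", self.m) for i in range(self.k)]

    def insert(self, key):
        """Insert into the least-loaded candidate bucket and mark that segment's filter."""
        index = [h(key, f"seg{s}", self.b) for s in range(self.n)]
        segment = min(range(self.n), key=lambda s: len(self.buckets[s][index[s]]))
        self.buckets[segment][index[segment]].append(key)
        for position in self._bits(key):
            self.filters[segment][position] = 1
        return segment

    def lookup(self, key):
        """Probe only segments whose filter matches; return (segment or None, probes)."""
        bits = self._bits(key)
        probes = 0
        for s in range(self.n):
            if all(self.filters[s][position] for position in bits):
                probes += 1  # a memory access, possibly caused by a false positive
                if key in self.buckets[s][h(key, f"seg{s}", self.b)]:
                    return s, probes
        return None, probes


table = FilteredSegmentedHash(num_segments=4, buckets_per_segment=16, filter_bits=256, k=3)
placed = {f"key{i}": table.insert(f"key{i}") for i in range(32)}
```

In this sketch, any probe beyond the first for a stored key, and any probe at all for an absent key, corresponds to a filter false positive.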

13 False Positive Rates
- With Bloom filters, there is a likelihood of false positives
  » A false positive means an unnecessary memory access
- With N segments, the false positive rate will clearly be at least N times higher
  » In fact, it will be even higher, because we also have to consider several permutations of false positives
- We use the Selective Filter Insertion algorithm, which reduces the false positive rate by several orders of magnitude

14 Selective Filter Insertion Algorithm
[Figure: h(k_i) selects one bucket per segment; k_i can go to any of the 3 buckets. The bits at h_1(k_i), h_2(k_i), ..., h_k(k_i) are shown for each segment's m_b-bit filter.]
- Insert the key into segment 4, since fewer bits are set
- Fewer bits are set => lower false positive rate
- With more segments (or more choices), our algorithm sets far fewer bits in the Bloom filter

15 Selective Filter Insertion Details
- Greedy policy
- For every arriving key
- We choose the segment where the fewest bits are set in the Bloom filter
- We show that this leads to unbalanced segments
  » Reduced performance
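
The greedy policy can be sketched as below, reading "fewest bits are set" as "fewest new bits would have to be set for this key", which is an interpretation of the slides rather than a quotation. The function names, the 3-bit filter rows, and the candidate list `[1, 2, 3]` (0-based segment numbers) are illustrative assumptions loosely modelled on slide 14.

```python
def new_bits_needed(segment_filter, bit_positions):
    """Number of the key's bit positions that are still 0 in this segment's filter."""
    return sum(1 for position in set(bit_positions) if segment_filter[position] == 0)


def selective_insert(filters, candidate_segments, bit_positions):
    """Greedy choice: pick the candidate segment that needs the fewest new bits."""
    chosen = min(
        candidate_segments,
        key=lambda s: new_bits_needed(filters[s], bit_positions),
    )
    for position in bit_positions:
        filters[chosen][position] = 1
    return chosen


# Filter bits at the key's positions h_1, h_2, h_k for four segments (illustrative values).
filters = [
    [0, 1, 0],
    [1, 0, 0],
    [0, 0, 1],
    [1, 0, 1],
]
chosen = selective_insert(filters, candidate_segments=[1, 2, 3], bit_positions=[0, 1, 2])
```

With these values the last segment (index 3, "segment 4" when counting from 1) is chosen, because only one additional bit has to be set there.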

16 Selective Filter Insertion Algorithm
[Figure: key k_1 is hashed by h( ); the filter bits selected by h_1 and h_2 are shown.]

17 Selective Filter Insertion Algorithm
[Figure: key k_2 is hashed by h( ); the filter bits selected by h_1 and h_2 are shown.]

18 Selective Filter Insertion Algorithm
[Figure: key k_3 is hashed by h( ); the filter bits selected by h_1 and h_2 are shown.]

19 Selective Filter Insertion Algorithm
[Figure: key k_4 is hashed by h( ); the filter bits selected by h_1 and h_2 are shown.]

20 Selective Filter Insertion Algorithm
[Figure: key k_5 is hashed by h( ); the filter bits selected by h_1 and h_2 are shown. Annotation: reduced number of choices.]

21 Selective Filter Insertion Enhancement
- The objective is to keep segments balanced
- Might need to make sub-optimal choices at times
- One way is to avoid the most loaded segment
  » Reduces the number of choices by 1
- However, this leads to situations where two segments alternately lead
- Things get complicated
  » A more detailed version of the algorithm can be found in the paper
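
The simple balancing idea on this slide (skip the most loaded segment, then apply the greedy rule) might look like the sketch below. It is not the detailed algorithm from the paper. The function `balanced_insert`, the `loads` list of per-segment key counts, and the fallback used when the most loaded segment is the only candidate are assumptions.

```python
def new_bits_needed(segment_filter, bit_positions):
    """Number of the key's bit positions that are still 0 in this segment's filter."""
    return sum(1 for position in set(bit_positions) if segment_filter[position] == 0)


def balanced_insert(filters, loads, candidate_segments, bit_positions):
    """Greedy filter insertion that avoids the currently most loaded segment."""
    most_loaded = max(range(len(loads)), key=lambda s: loads[s])
    pool = [s for s in candidate_segments if s != most_loaded]
    if not pool:  # assumption: fall back if the most loaded segment is the only choice
        pool = list(candidate_segments)
    chosen = min(pool, key=lambda s: new_bits_needed(filters[s], bit_positions))
    for position in bit_positions:
        filters[chosen][position] = 1
    loads[chosen] += 1
    return chosen


def example_filters():
    # Illustrative filter bits at the key's three positions, one row per segment.
    return [[0, 1, 0], [1, 0, 0], [0, 0, 1], [1, 0, 1]]


# Segment 3 needs the fewest new bits but is the most loaded, so it is skipped.
loads_a = [5, 4, 4, 9]
chosen_a = balanced_insert(example_filters(), loads_a, [1, 2, 3], [0, 1, 2])

# Here the most loaded segment (0) is not a candidate, so the greedy choice stands.
loads_b = [9, 4, 4, 5]
chosen_b = balanced_insert(example_filters(), loads_b, [1, 2, 3], [0, 1, 2])
```

In the first call the key goes to segment 1 at the cost of setting two bits instead of one, which is the kind of sub-optimal choice the slide refers to. In the second call the greedy choice, segment 3, is kept.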

22 Selective Filter Insertion Results

23 Simulation Results
- 64K buckets, 32 bits/entry Bloom filter
- The simulation runs for 500 phases
  » During every phase, 100,000 random searches are performed
  » Between two phases, 10,000 random keys are deleted and inserted

24 Conclusion
- We presented a way to implement
  » Hash tables with deterministic performance
  » We utilize a small on-chip memory to achieve it
  » We also show that on-chip memory requirements are modest
  » Well within Moore's law
  » A 1M-entry hash table, for example, needs 1-2 MB of on-chip memory
- Questions?

