Packet Classification Using Coarse-Grained Tuple Spaces

Packet Classification Using Coarse-Grained Tuple Spaces
Haoyu Song, Jon Turner and Sarang Dharmapurikar

Overview Two-dimensional packet classification problem
in list of 2d filters, find first match for given address pair (1011,0111): [<101*,10*>, <10*,011*>, <1*,01*>] Limitations of current solutions fast algorithmic methods require excessive space (≥50x) TCAM has high cost per bit, significant power usage Combining cross-product and tuple-space search hybrid strategy with range of time-space tradeoff options Improving 1d lookups combining tree bitmap and Bloom filters Possible extensions

Cross-Product Method Procedure
do 1d lookup on all fields combine results into lookup key in cross-product table direct lookup table or hash table Fast, but space grows as nk for n filters, k fields filter set F0: <1010*, 01*> F1: < 101*,0111*> cross product table key filter S0D F S0D F S1D none S1D F1 10100, 01110 S0D1

2D Tuple Space Search Group by prefix length Rectangle search
32 Source IP Prefix Length Destination IP Prefix Length Group by prefix length hash table per group up to 33 x 33= 1,089 groups in practice occupied tuples Rectangle search markers to guide search at most 33 probes, often less hard to update Pruned tuple space search 1d search on src/dest fields find prefix lengths that match src/dest fields of packet search intersecting tuples if ≤k matching prefixes, at most k2 probes

Coarse-Grained Tuple Space
Select coarse-grained partition of tuple space Build cross-product table per sub-space Search procedure 1d lookups for LPM probe each subspace terminate early if possible Pruning identify candidate sub-spaces during 1d lookup probe selected sub-spaces Space/time tradeoff 32 Source IP Prefix Length Destination IP Prefix Length

Performance of Basic Algorithm
Equal size divisions of 2d tuple space Ratio of cross-products to filter set size 2x2 partition brings space usage to 2x minimum maximum of four probes required compared to for simple tuple space search Pruning of limited use for filter sets of size <104

Performance of Best Configurations
2x 3x 4x

Alternate Partitioning Approaches
32 Source IP Prefix Length Destination IP Prefix Length Arbitrary sub-spaces are possible potential for fewer regions with good space efficiency Preliminary results mixed may be useful for smaller filter sets More evaluation needed Note: filters of form <prefix,*> and <*,prefix> stored in 1d data structures

Fast 1d Lookups Tree Bitmap Hashing + Bloom Filters Multibit trie
110 1011 1 1 0 1 Bloom Filters 0,1 110,111, 000,001 10100,10101, 10110,10111 1 3 5 off-chip hash tables 1 Multibit trie Co-located children Bitmaps for prefix nodes subtree presence 4 bit stride implies 8 memory accesses Expand prefixes to “standard” lengths Off-chip hash table per length On-chip Bloom filters to avoid unproductive probes Large space requirements for good worst-case performance

Fast and Compact 1d Lookups
Bloom filters 2 4 subtree hash tables 1 1 Insert tree bitmap subtree roots into off-chip hash tables and on-chip Bloom filters Lookup prefix of subtree roots in Bloom filters if match on length k and all shorter lengths, probe off-chip table for length k Reduction in on-chip memory for Bloom filters shape-shifting trie yields further space reduction

1d Lookup Performance 200K IPv4 prefixes 5 bit stride for tree bitmap
8 bit on-chip “root table” 4 Bloom filters 1 BF entry for every 2 prefixes 1 off-chip probes (4 incl. FP) 2 Bloom filters 1 BF entry for every 6 prefixes 2 off-chip probes (4 incl. FP)

Practical Configuration
Configure 1d lookups for 1 off-chip probe each (excluding false positives) about 5 bits per prefix for Bloom filters with low FP rate Record <prefix,*> and <*,prefix> filters in 1d lookup data structures also proposed in recent paper by Kounavis, et. al. Divide remaining filters among four subspaces approximately 2 off-chip hash table entries per filter at most four probes With single QDR SRAM at 200 MHz, 32 bit word size can do 200 million probes per second about 33 million packets/second 40 byte packets at 10 Gb/s

Possible Extensions More extensive evaluation
scaling to larger filter sets – K filters integrated evaluation of 1d and 2d lookups systematic evaluation of alternate partitioning strategies Alternate representations of filter sub-spaces any filter set data structure is candidate using decision trees, can skip 1d lookups Generalization to more dimensions handling fields with ranges (for port numbers) coarse-grained grouping of tuple-spaces defined on “nesting level” can we beat TCAM?

Packet Classification Using Coarse-Grained Tuple Spaces

Similar presentations

Presentation on theme: "Packet Classification Using Coarse-Grained Tuple Spaces"— Presentation transcript:

Similar presentations

About project

Feedback

Log in

Auth with social network:

Packet Classification Using Coarse-Grained Tuple Spaces

Similar presentations

Presentation on theme: "Packet Classification Using Coarse-Grained Tuple Spaces"— Presentation transcript:

Similar presentations

About project

Feedback