1 SSA: A Power and Memory Efficient Scheme to Multi-Match Packet Classification Fang Yu 1 T. V. Lakshman 2 Marti Austin Motoyama 1 Randy H. Katz 1 1 EECS.


1 SSA: A Power and Memory Efficient Scheme to Multi-Match Packet Classification
Fang Yu (1), T. V. Lakshman (2), Marti Austin Motoyama (1), Randy H. Katz (1)
(1) EECS Department, UC Berkeley; (2) Bell Laboratories, Lucent Technologies

2 Outline
Introduction to multi-match classification
Multi-match classification using TCAM
– May consume a large amount of TCAM memory
– May consume high power
Set Splitting Algorithm (SSA)
– A memory and power efficient scheme for multi-match classification
Simulation results
Conclusions

3 Packet Classification
[Figure: a packet, made up of a packet header and a packet payload]
Single-match classification
– Assumption: all the filters are associated with priorities
– Only the highest-priority match matters
– E.g., longest prefix match
Multi-match classification
– Report all matching results
– No priority among filters
– Intrusion detection systems: identify all the related rules
– Also required by accounting applications

4 Ternary CAM (TCAM)
Fully associative memory: compares the input string with all the entries in parallel
– For multiple matches, reports the index of the first match
Each cell takes one of three logic states
– 0, 1, and ? (don't care)
[Figure labels: entry, cell, width]
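The first-match behavior described on this slide can be illustrated with a small software model. This is only a sketch: the function names and the string encoding of entries are assumptions made for illustration, and a real TCAM compares all entries in parallel in hardware rather than in a loop.

```python
# Hypothetical software model of a TCAM. Each entry is a string over
# '0', '1', and '?' (don't care); the search key is a string of '0'/'1'.

def entry_matches(entry: str, key: str) -> bool:
    """A cell matches if it is '?' or equals the key bit at that position."""
    return len(entry) == len(key) and all(
        cell in ("?", bit) for cell, bit in zip(entry, key)
    )

def first_match(table, key):
    """What the TCAM reports: the index of the first matching entry, or None."""
    for index, entry in enumerate(table):
        if entry_matches(entry, key):
            return index
    return None

def all_matches(table, key):
    """What multi-match classification needs: every matching index."""
    return [i for i, entry in enumerate(table) if entry_matches(entry, key)]

table = ["10??", "1?0?", "????"]
print(first_match(table, "1001"))  # the TCAM's single answer
print(all_matches(table, "1001"))  # the full multi-match answer
```

The gap between `first_match` and `all_matches` is the problem the rest of the talk addresses.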

5 Challenges of Multi-Match Classification Using TCAM
Memory efficient
– 9 Mbit to 18 Mbit TCAMs are priced at $200-$300
Power efficient
Easy update
High speed
– TCAM is fast (e.g., 4 ns per lookup); however, a TCAM returns only the first match result
– We want all the matching results within a few cycles
What about returning a bit vector of the matching results?
– Processing the bit vector can take time if the bit vector is long
– Not efficient: the vector is sparse in most cases

6 Previous Solutions: Geometric Intersection-based Solution [Hot Interconnects 04]
Add additional intersection filters
– High speed
   Returns all the matching results within one cycle
– Memory efficient
   Creates ~10N intersection filters for the Snort rule set
   May create O(N^F) intersection filters in the worst case
– Energy efficient
– Easily updatable

7 Previous Solution: MUD [SIGCOMM 05]
Encode the index of each entry and include the encoded value in that TCAM entry as a discriminator field
– Search the TCAM with the initial MUD field set to all don't cares
– After finding a matching result at index j, search again with discriminator field value > j

8 Previous Solution: MUD (Cont.)
High speed
– 1 + d + (k-2)*(d-1) = O(dk) TCAM lookups to get k matching results
   d is the logarithm of the number of entries in the TCAM (d = log2 N)
   Decreased to 1 + d*(k-1)/r with DIRPE, where r is smaller than d
Memory efficient
Energy efficient
– All the entries in the TCAM are accessed on every lookup, which means high power consumption
Easily updatable
Our goal: find a memory and power efficient solution
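To get a feel for the lookup counts quoted on this slide, the two expressions can be evaluated directly. This is a sketch only: the function names are made up, rounding d up to an integer and restricting the first expression to k >= 2 are assumptions, and the value of r used below is hypothetical since the slide does not give one.

```python
import math

def mud_lookups(num_entries: int, k: int) -> int:
    """Evaluate 1 + d + (k-2)*(d-1) with d = ceil(log2(num_entries)).

    Assumption: the expression is meant for k >= 2 matching results.
    """
    if k < 2:
        raise ValueError("expression assumed to apply for k >= 2")
    d = math.ceil(math.log2(num_entries))
    return 1 + d + (k - 2) * (d - 1)

def mud_lookups_dirpe(num_entries: int, k: int, r: int) -> float:
    """Evaluate 1 + d*(k-1)/r, the DIRPE variant quoted on the slide."""
    d = math.ceil(math.log2(num_entries))
    return 1 + d * (k - 1) / r

# 256 entries gives d = 8; r = 4 is a hypothetical choice.
print(mud_lookups(256, 4))           # 1 + 8 + 2*7
print(mud_lookups_dirpe(256, 4, 4))  # 1 + 8*3/4
```

Both counts grow with k, the number of matches, which is why the lookup rate of MUD depends on the packet.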

9 Observation
Split filters into two sets to reduce intersections
– Report the union of the results from all sets
– No need to include the intersections of filters from different sets
– Decreases the number of filters in the TCAM, which decreases power consumption
– Increases the number of TCAM accesses
[Figure: filters F1 ... FN, with regions matching F1, matching FN, and matching both F1 and FN]
Original: N filters + O(N^2) intersections, 1 TCAM lookup
Two sets: N filters + 1 intersection, 2 TCAM lookups
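A toy example makes the observation concrete. The sketch below assumes two-dimensional filters modeled as axis-aligned rectangles and counts only pairwise overlaps; the rectangle layout and function names are illustrative and do not come from the talk.

```python
from itertools import combinations

def overlaps(a, b):
    """Rectangles are (x_lo, x_hi, y_lo, y_hi) with inclusive bounds."""
    return a[0] <= b[1] and b[0] <= a[1] and a[2] <= b[3] and b[2] <= a[3]

def count_pairwise_intersections(filters):
    """Number of filter pairs that overlap, i.e. would need an intersection entry."""
    return sum(overlaps(a, b) for a, b in combinations(filters, 2))

n = 4
# Horizontal strips span all x; vertical strips span all y (hypothetical layout).
horizontal = [(0, 99, 10 * i, 10 * i + 5) for i in range(n)]
vertical = [(10 * i, 10 * i + 5, 0, 99) for i in range(n)]

one_set = count_pairwise_intersections(horizontal + vertical)
two_sets = (count_pairwise_intersections(horizontal)
            + count_pairwise_intersections(vertical))
print(one_set, two_sets)
```

With everything in one set, every horizontal strip crosses every vertical strip, so the intersection count grows quadratically. With the strips split into two sets, no intersection entries are needed, at the price of one lookup per set.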

10 Problem Definition
Given a set of filters F (F1, F2, ..., FN)
The filters create a set of intersections I (I1, I2, ..., IM)
– e.g., I1 = intersection of (F1, F5, F6)
How to divide the filters into several sets
– Residual intersection set I: intersections of filters in the same set
– N + |I| < TCAM size
– The number of sets (TCAM accesses) is minimum
– NP-hard problem!

11 Split Rules into Two Sets
Still an NP-hard problem (known as maximum set splitting or maximum hypergraph cut)
Best known approximation algorithms
– Yield a performance ratio of 0.72 relative to the optimum solution
– Require quadratic programming, which is slow when the number of filters is large
Our SSA algorithm
– Removes at least half of the intersections
– O(NM) complexity, where N is the total number of filters and M is the total number of intersections

12 Maximum Satisfiability Problem
– A set of literals {F1, ¬F1, F2, ¬F2, ..., FN, ¬FN}
– A set of clauses, where each clause is a subset of the literals
   E.g., C1 = {F1, F5, F6}
– Goal: find an assignment of F that satisfies the maximum number of clauses

13 Johnson's Algorithm for the Maximum Satisfiability Problem
Assign each clause c a weight of 2^(-|c|)
   E.g., the weight of C1 = {F1, F5, F6} is 2^(-3)
Let Fi be any variable that has not been assigned a value yet
– If the total weight of the clauses containing Fi is higher than that of the clauses containing ¬Fi
   Assign Fi the value true and remove all clauses containing Fi
   Multiply the weight of all the clauses containing ¬Fi by 2
– Otherwise
   Assign Fi the value false and remove all clauses containing ¬Fi
   Multiply the weight of all the clauses containing Fi by 2
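The greedy procedure on this slide can be sketched in a few lines. The encoding is an assumption made for illustration: variables are numbered 1..n, the literal Fi is the integer +i and its negation is -i, and variables are processed in index order (the slide allows any order).

```python
def johnson_max_sat(num_vars, clauses):
    """Greedy MAX-SAT assignment following the weighting rule on the slide.

    clauses: list of lists of nonzero ints; +i means Fi, -i means not Fi.
    Returns a dict mapping each variable to True or False.
    """
    # Clauses not yet satisfied, each with its current weight 2^(-|c|).
    live = {c: set(lits) for c, lits in enumerate(clauses)}
    weight = {c: 2.0 ** -len(lits) for c, lits in live.items()}
    assignment = {}
    for var in range(1, num_vars + 1):
        pos = [c for c, lits in live.items() if var in lits]
        neg = [c for c, lits in live.items() if -var in lits]
        w_pos = sum(weight[c] for c in pos)
        w_neg = sum(weight[c] for c in neg)
        assignment[var] = w_pos > w_neg
        satisfied, shrunk = (pos, neg) if assignment[var] else (neg, pos)
        for c in satisfied:           # these clauses are now satisfied
            live.pop(c, None)
        for c in shrunk:              # these clauses lost one usable literal
            if c in live:
                weight[c] *= 2
    return assignment

def count_satisfied(clauses, assignment):
    """Number of clauses with at least one true literal."""
    return sum(
        any(assignment[abs(lit)] == (lit > 0) for lit in clause)
        for clause in clauses
    )

clauses = [[1, 2], [-1, 3], [-2, -3], [1, -3]]
assignment = johnson_max_sat(3, clauses)
print(assignment, count_satisfied(clauses, assignment))
```

Each variable scans the clause list once, so the running time is proportional to the number of variables times the number of clauses.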

14 Johnson's Theorem
If every clause has at least k literals
– Johnson's algorithm satisfies at least a fraction (2^k - 1)/2^k of the total clauses
– e.g., for k = 2, it satisfies at least 3/4 of the clauses
– It has been proved that (2^k - 1)/2^k is the best approximation bound for k > 2

15 Filter Set Split Algorithm (SSA)
Convert the set splitting problem into a maximum satisfiability problem
– Each filter Fi corresponds to a variable, with literals Fi and ¬Fi
– For each intersection (e.g., I1 = intersection of F1, F5, and F6), add two clauses: C = {F1, F5, F6} and C' = {¬F1, ¬F5, ¬F6}
   The total number of clauses is 2M, where M is the number of intersections
Run Johnson's algorithm and assign each filter Fi either true (put it in set one) or false (put it in set two)

16 Filter Set Split Algorithm (SSA) (cont.)
According to Johnson's theorem (every intersection involves at least two filters, so k = 2)
– At least 3/4 of the clauses are satisfied: 2M * 3/4 = 1.5M
   At most 0.5M clauses are unsatisfied, and the two clauses of one intersection can never both be unsatisfied
   So at least 0.5M of the intersections have both clauses satisfied
Suppose that for the intersection of F1, F5, and F6, both C = {F1, F5, F6} and C' = {¬F1, ¬F5, ¬F6} are satisfied
   At least one of F1, F5, F6 is true and at least one is false
   F1, F5, F6 are split into different sets, so this intersection does not need to be present in the TCAM
At least 50% of the intersections are removed!
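Putting the conversion and the greedy assignment together gives the following sketch of one split. It is an illustration under assumptions, not the authors' implementation: filters are numbered 1..N, each intersection is given as a tuple of filter numbers, the function names are invented, and the example intersections are made up.

```python
def johnson(num_vars, clauses):
    """Compact greedy MAX-SAT; literal +i is Fi and -i is its negation."""
    live = {c: set(lits) for c, lits in enumerate(clauses)}
    weight = {c: 2.0 ** -len(lits) for c, lits in live.items()}
    value = {}
    for v in range(1, num_vars + 1):
        pos = [c for c, lits in live.items() if v in lits]
        neg = [c for c, lits in live.items() if -v in lits]
        value[v] = sum(weight[c] for c in pos) > sum(weight[c] for c in neg)
        satisfied, shrunk = (pos, neg) if value[v] else (neg, pos)
        for c in satisfied:
            live.pop(c, None)
        for c in shrunk:
            if c in live:
                weight[c] *= 2
    return value

def ssa_split(num_filters, intersections):
    """Split filters 1..num_filters into two sets; return the residual intersections."""
    clauses = []
    for members in intersections:
        clauses.append([f for f in members])    # C: at least one filter is true
        clauses.append([-f for f in members])   # C': at least one filter is false
    value = johnson(num_filters, clauses)
    set_one = {f for f, is_true in value.items() if is_true}
    set_two = set(value) - set_one
    # An intersection survives only if all of its filters landed in the same set.
    residual = [m for m in intersections
                if set(m) <= set_one or set(m) <= set_two]
    return set_one, set_two, residual

# Hypothetical intersections among six filters.
intersections = [(1, 5, 6), (1, 2), (2, 3), (3, 4, 5), (4, 6)]
set_one, set_two, residual = ssa_split(6, intersections)
print(sorted(set_one), sorted(set_two), residual)
```

Running `ssa_split` again on each resulting set, restricted to its residual intersections, would give the four-set variant evaluated later in the talk.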

17 Review of the SSA Scheme
High speed
– Deterministic lookup rate, e.g., if the filters are split into two sets, only 2 TCAM lookups per packet are needed
– Sets are logically independent, so lookups can be parallelized
Memory efficient
– Guarantees the removal of at least 50% of the intersections each time the filter set is split into two sets
Energy efficient
– Low memory requirement
– Each filter is accessed only once per packet
Easily updatable
– A new filter can be inserted into the set in which it creates the fewest intersections
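The update rule in the last bullet can be sketched as follows. The example assumes one-dimensional filters modeled as closed intervals and uses the number of pairwise overlaps as a stand-in for the number of intersections created; both are simplifications, and the function names are hypothetical.

```python
def overlaps(a, b):
    """Closed intervals (lo, hi) overlap if neither ends before the other starts."""
    return a[0] <= b[1] and b[0] <= a[1]

def insert_filter(sets, new_filter):
    """Add new_filter to the set where it overlaps the fewest existing filters.

    Returns (index of the chosen set, number of overlaps created there).
    """
    costs = [sum(overlaps(new_filter, f) for f in s) for s in sets]
    best = costs.index(min(costs))
    sets[best].append(new_filter)
    return best, costs[best]

sets = [[(0, 10), (5, 20)], [(30, 40)]]
chosen, cost = insert_filter(sets, (8, 12))
print(chosen, cost, sets)
```

Here the new filter overlaps both filters in the first set and none in the second, so it goes into the second set without creating any intersection entry.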

18 Simulation Setup
Tests on the Snort rule header sets
– Compare SSA with two TCAM-based solutions
   MUD
   Geometric Intersection-based solution
– Compare SSA with two representative software-based solutions
   HiCuts
   EGT-PC
– Evaluation metrics
   Memory consumption
   Lookup rate
   Power consumption
   Update cost

19 Memory Usage
Total number of extra intersection filters in TCAMs

Version   Geometric Intersection-based   SSA-2 extra intersections   SSA-2 saving   SSA-4 extra intersections   SSA-4 saving
2.0.0     3453                           46                          98.67%         1                           99.97%
2.0.1     3754                           47                          98.75%         1                           99.97%
2.1.0     3758                           47                          98.75%         0                           100%
2.1.1     4067                           55                          98.65%         0                           100%

Total number of TCAM entries used

Version   MUD   Geometric Intersection-based   SSA-2   SSA-4
2.0.0     240   3693                           286     241
2.0.1     255   4009                           302     256
2.1.0     257   4015                           304     257
2.1.1     263   4330                           318     263

20 Classification Speed
MUD
– One packet may match up to 12 unique filters and require up to 20 TCAM lookups
– Common packets such as HTTP packets match 4 unique filters and may require 5 to 9 TCAM lookups; a Napster packet requires 9 to 15 TCAM lookups
Geometric Intersection-based solution
– 1 TCAM lookup per packet
SSA-2
– 2 TCAM lookups per packet
SSA-4
– 4 TCAM lookups per packet
– If the average packet size is 402.7 bytes, SSA-4 operates at a 201.35 Gbps classification rate
– In the worst case, if every packet is 40 bytes, SSA-4 achieves a 20 Gbps rate
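The SSA-4 rates above are consistent with a simple calculation: packet size in bits divided by the time for four sequential TCAM lookups. The sketch below assumes the 4 ns lookup time given as an example earlier in the talk and assumes the lookups are not overlapped; the function name is made up.

```python
def classification_rate_gbps(packet_bytes: float, lookups_per_packet: int,
                             lookup_ns: float = 4.0) -> float:
    """Bits classified per nanosecond, which equals Gbps.

    Assumes sequential lookups of lookup_ns nanoseconds each.
    """
    return packet_bytes * 8 / (lookups_per_packet * lookup_ns)

print(classification_rate_gbps(402.7, 4))  # average-size packets, SSA-4
print(classification_rate_gbps(40, 4))     # minimum-size packets, SSA-4
print(classification_rate_gbps(40, 2))     # minimum-size packets, SSA-2
```

Under the same assumptions, the two-set variant doubles the worst-case rate because it halves the number of lookups per packet.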

21 Update Cost
Update cost in terms of newly inserted filters

Version   MUD   Geometric Avg   Geometric Max   SSA-2 Avg   SSA-2 Max   SSA-4 Avg   SSA-4 Max
2.0.0     1     31.73           157             1.33        17          1.002       2
2.0.1     1     35.24           135             1.34        19          1           1
2.1.0     1     34.71           135             1.36        20          1.002       2
2.1.1     1     36.00           172             1.41        26          1.006       2

22 Power Consumption
The energy used by a TCAM is linear in
– The number of entries searched in parallel
– The number of TCAM accesses per packet
Metric: total TCAM entries accessed per packet
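The metric can be worked through with numbers from the earlier slides for Snort 2.0.0. This is an illustrative calculation, not the authors' evaluation: it assumes MUD searches all 240 entries on each of its 5 to 20 lookups, that the geometric scheme searches its 3693 entries once, and that SSA-2 searches each of its two sets once for a total of 286 entries. The even 143/143 split between the two sets is hypothetical; only the total comes from the table.

```python
def entries_accessed_per_packet(entries_per_lookup):
    """Sum the entries searched over every TCAM lookup made for one packet."""
    return sum(entries_per_lookup)

def reduction(new: float, old: float) -> float:
    """Fractional reduction of new relative to old."""
    return 1 - new / old

mud_low = entries_accessed_per_packet([240] * 5)    # 5 lookups over all entries
mud_high = entries_accessed_per_packet([240] * 20)  # 20 lookups over all entries
geometric = entries_accessed_per_packet([3693])     # 1 lookup
ssa2 = entries_accessed_per_packet([143, 143])      # 2 lookups, one per set

print(mud_low, mud_high, geometric, ssa2)
print(round(reduction(ssa2, mud_low), 3), round(reduction(ssa2, mud_high), 3))
print(round(reduction(ssa2, geometric), 3))
```

Under these assumptions the reductions fall in the same range as the figures quoted in the conclusions.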

23 Software Solutions
HiCuts
– A high percentage of wildcards generates a high degree of filter duplication (on average, one filter is duplicated 3108 times)
EGT-PC
– Many Snort rules apply to the same source and destination addresses
– A packet may match 153 filters if only the source and destination addresses are considered; comparing the input with these filters one by one is not affordable

Version   Tree Height   Number of Filters in Leaf Nodes   SRAM Used (KB)
2.0.0     18            745,019                           41,000
2.0.1     19            803,645                           46,297
2.1.0     19            820,415                           47,160
2.1.1     18            827,651                           49,378

24 Conclusions
SSA is a memory and power efficient solution to the multi-match classification problem
– O(NM) complexity
– Guaranteed to remove at least 50% of the intersections each time the filter set is split
– Compared to MUD
   Uses a similar amount of TCAM memory
   Yields a 75% to 95% reduction in power consumption
– Compared to the Geometric Intersection-based solution
   Uses 90% less TCAM memory and power
   Requires one additional TCAM lookup per packet

