Presentation is loading. Please wait.

Presentation is loading. Please wait.

Data plane algorithms in routers From prefix lookup to deep packet inspection Cristian Estan, University of Wisconsin-Madison.

Similar presentations


Presentation on theme: "Data plane algorithms in routers From prefix lookup to deep packet inspection Cristian Estan, University of Wisconsin-Madison."— Presentation transcript:

1 Data plane algorithms in routers From prefix lookup to deep packet inspection Cristian Estan, University of Wisconsin-Madison

2 DIMACS Tutorial on Algorithms for Next Generation Networks August What is the data plane? The part of the router handling the traffic  Data plane algorithms applied to every packet  Successive packets typically treated independently  Example: deciding on which link to send a packet Throughput defined as number of packets or bytes handled per second is very important  “Line speed” – keeping up with the rate at which traffic can be transmitted over the wire or fiber  Example: 10Gbps router has 32 ns to handle 40 byte packet Memory usage limited by technology and costs  Can afford at most tens of megabits of fast on-chip memory

3 DIMACS Tutorial on Algorithms for Next Generation Networks August A generic data plane problem Router has many directives composed of a guard, and an associated action (all guards distinct) There is a simple procedure for testing how well a guard matches a packet For each packet, find the guard that matches “best” and take the associated action Example – routing table lookup:  Each guard is an IP prefix (between 0 and 32 bits)  Matching procedure: is the guard a prefix of the 32 bit destination IP address  “Best” defined as longest matching prefix

4 DIMACS Tutorial on Algorithms for Next Generation Networks August The rules of the game Matching against all guards in sequence is too slow We build a data structure that captures the semantics of all guards and use it for matching Primary metrics  How fast the matching algorithm is  How much memory the data structure needs  Time to build data structure also has some importance We can cheat (but we won’t today) by:  Using binary or ternary content-addressable memories  Using other forms of hardware support

5 DIMACS Tutorial on Algorithms for Next Generation Networks August Measuring “algorithm complexity” Execution cost measured in number of memory accesses to read data structure  Actual data manipulation operations typically very simple  On some platforms we can read wide words Worst case performance most important  Worst case defined with respect to input, not guards  Caching has been proven ineffective for many settings  Using algorithms with good amortized complexity, but bad worst case requires large buffers

6 DIMACS Tutorial on Algorithms for Next Generation Networks August Overview Longest matching prefix  Trie-based algorithms Uni-bit and multi-bit tries (fixed stride and variable stride) Leaf pushing Bitmap compression of multi-bit trie nodes Tree bitmap representation for multi-bit trie nodes  Binary search on ranges  Binary search on prefix lengths Classification on multiple fields Signature matching

7 DIMACS Tutorial on Algorithms for Next Generation Networks August Longest matching prefix Used in routing table lookup (a.k.a. forwarding) for finding the link on which to send a packet Guard: a bit string of 0 to w bits called IP prefix Action: a single byte interface identifier Input: a w-bit string representing the destination IP address of the packet (w is 32 for IPv4,128 for IPv6) Output: the interface associated with the longest guard matching the input Size of problem: hundreds of thousands of prefixes

8 DIMACS Tutorial on Algorithms for Next Generation Networks August P10* P21* P3100* P41000* P * P6101* P7110* P811001* P9111* Routing table Uni-bit trie 0P1 1P P7 1P9 0P3 1P6 0P P P8 P1000* P1001* P1010* P1011* P2100* P2101* P2110* P2111* P3100* P * P * P * P * P * P6101* P7110* P * P * P9111* Controlled prefix expansion with stride 3 P1000* P1001* P1010* P1011* P3100* P * P * P * P * P6101* P7110* P * P * P9111* 000P1 001P1 010P1 011P1 100P3 101P5 110P7 111P9 000P5 001P4 010P4 011P P8 011P P1 001P1 010P1 011P P P9 000P5 001P4 010P4 011P4 100P3 101P3 110P3 111P3 Leaf pushing Multi-bit trie with fixed stride Leaf pushing reduces memory usage but increases update time 000P1 001P1 010P1 011P1 100P33 101P5 110P72 111P9 000P5 001P4 010P4 011P P Multi-bit trie with variable stride 000P7 001P7 010P8 011P8 100P7 101P7 110P7 111P7 Given a maximum trie height h and a routing table of size n dynamic programming algorithm computes optimal variable stride trie in O(nw 2 h)

9 DIMACS Tutorial on Algorithms for Next Generation Networks August P10* P21* P3100* P41000* P * P6101* P7110* P811001* P9111* Routing table Uni-bit trie 0P1 1P P7 1P9 0P3 1P6 0P P P8 P1000* P1001* P1010* P1011* P2100* P2101* P2110* P2111* P3100* P * P * P * P * P * P6101* P7110* P * P * P9111* Controlled prefix expansion with stride 3 P1000* P1001* P1010* P1011* P3100* P * P * P * P * P6101* P7110* P * P * P9111* 000P1 001P1 010P1 011P1 100P3 101P5 110P7 111P9 000P5 001P4 010P4 011P P8 011P Input 000P1 001P1 010P1 011P P P9 000P5 001P4 010P4 011P4 100P3 101P3 110P3 111P3 Leaf pushing Multi-bit trie with fixed stride 000P1 001P1 010P1 011P1 100P33 101P5 110P72 111P9 000P5 001P4 010P4 011P P Multi-bit trie with variable stride Longest matching prefix P2P7 000P7 001P7 010P8 011P8 100P7 101P7 110P7 111P7 Leaf pushing reduces memory usage but increases update time Given a maximum trie height h and a routing table of size n dynamic programming algorithm computes optimal variable stride trie in O(nw 2 h)

10 DIMACS Tutorial on Algorithms for Next Generation Networks August P10* P21* P3100* P41000* P * P6101* P7110* P811001* P9111* Routing table Uni-bit trie 0P1 1P P7 1P9 0P3 1P6 0P P P8 P1000* P1001* P1010* P1011* P2100* P2101* P2110* P2111* P3100* P * P * P * P * P * P6101* P7110* P * P * P9111* Controlled prefix expansion with stride 3 P1000* P1001* P1010* P1011* P3100* P * P * P * P * P6101* P7110* P * P * P9111* 000P1 001P1 010P1 011P1 100P3 101P5 110P7 111P9 000P5 001P4 010P4 011P P8 011P Input 000P1 001P1 010P1 011P P P9 000P5 001P4 010P4 011P4 100P3 101P3 110P3 111P3 000P7 001P7 010P8 011P8 100P7 101P7 110P7 111P7 Leaf pushing Multi-bit trie with fixed stride 000P1 001P1 010P1 011P1 100P33 101P5 110P72 111P9 000P5 001P4 010P4 011P P Multi-bit trie with variable stride Longest matching prefix P7 Leaf pushing reduces memory usage but increases update time Given a maximum trie height h and a routing table of size n dynamic programming algorithm computes optimal variable stride trie in O(nw 2 h)

11 DIMACS Tutorial on Algorithms for Next Generation Networks August P1000* P1001* P1010* P1011* P3100* P * P * P * P * P6101* P7110* P * P * P9111* Input P10* P21* P3100* P41000* P * P6101* P7110* P811001* P9111* P1000* P1001* P1010* P1011* P3100* P * P * P * P * P6101* P7110* P * P * P9111* 000P1 001P1 010P1 011P P P9 P1 P5 P Lulea bitmap compression Longest matching prefix P *1 1*1 00*0 01*0 10*0 11*0 000*0 001*0 010*0 011*0 100*1 101*1 110*1 111*1 P1 P2 P3 P6 P7 P9 Pointers to children and prefixes are stored in separate structures. Prefixes of all lengths are stored, thus leaf pushing is not needed and update is fast. Bitmaps have 1s corresponding to entries that are not empty. Repeating entries are stored only once in the compressed array. An auxiliary bitmap is needed to find the right entry in the compressed node. It stores a 0 for positions that do not differ from the previous one. When the compression bitmaps are large it is expensive to count bits during lookup. The bitmap is divided into chunks and a pre- computed auxiliary array stores the number of bits set before each chunk. The lookup algorithm needs to count only bits set within one chunk. Compressed node Representing node as tree bitmap Bitmap supporting fast counting 13+0=13 P2

12 DIMACS Tutorial on Algorithms for Next Generation Networks August Binary search on ranges Divide w-bit address space into maximal continuous ranges covered by same prefix Build array or balanced (binary) search tree with boundaries of ranges At lookup time perform O(log(n)) search Not better than multi-bit tries with compression, but it is not covered by patents

13 DIMACS Tutorial on Algorithms for Next Generation Networks August Binary search on prefix lengths Core idea: for each prefix length represented in the routing table, have a hash table with the prefixes  Can find longest matching prefix after looking up in each hash table the prefix of the address with corresponding length  Binary search on prefix lengths is faster Simple but wrong algorithm: if you find prefix at length x store it as best match and look for longer matching prefixes, otherwise look for shorter prefixes  Problem: what if there is both a shorter and a longer prefix, but no prefix at length x? Solution: insert marker at length x when there are longer prefixes. Must store with marker longest matching shorter prefix. Markers lead to moderate increase in memory usage. Promising algorithm for IPv6 (w=128)

14 DIMACS Tutorial on Algorithms for Next Generation Networks August Papers on longest matching prefix G. Varghese “Network algorithmics an interdisciplinary approach to designing fast networked devices”, chapter 11, Morgan Kaufmann 2005 V. Srinivasan, G. Varghese “Faster IP lookups using controlled prefix expansion”, ACM Trans. on Comp. Sys., Feb M. Degermark, A. Brodnik, S. Carlsson, S. Pink “Small forwarding tables for fast routing lookups”, ACM SIGCOMM, 1997 W. Eatherton, Z. Dittia, G. Varghese “Tree Bitmap : Hardware / Software IP Lookups with Incremental Updates”, cse.ucsd.edu/~varghese/PAPERS/willpaper.pdfhttp://www- cse.ucsd.edu/~varghese/PAPERS/willpaper.pdf B. Lampson, V. Srinivasan, G. Varghese “IP lookups using multiway and multicolumn search”, IEEE Infocom, 1998 M. Waldvogel, G. Varghese, J. Turner, B. Plattner, “Scalable high- speed IP lookups”, ACM Trans. on Comp. Sys., Nov. 2001

15 DIMACS Tutorial on Algorithms for Next Generation Networks August Overview Longest matching prefix Classification on multiple fields  Solution for two-dimensional case: grid of tries  Bit vector linear search  Cross-producting  Decision tree approaches Signature matching

16 DIMACS Tutorial on Algorithms for Next Generation Networks August Packet classification problem Required for security, recognizing packets with quality of service requirements Guard: prefixes or ranges for k header fields  Typically source and destination prefix, source and destination port range, and exact value or * for protocol  All fields must match for rule to apply Action: drop, forward, map to a certain traffic class Input: a tuple with the values of the k header fields Output: the action associated with the first rule that matches the packet (rules are strictly ordered) Size of problem: thousands of classification rules

17 DIMACS Tutorial on Algorithms for Next Generation Networks August Example of classification rule set Destination IPSource IPDest PortSrc PortProtocolAction M*25**R1 M*53*UDPR2 MS53**R3 M*23**R4 TITO123 UDPR5 *Net***R6 Net***TCP/ACKR7 *****R8 Net Internet Mail gateway M Internal time server TI External time server TO Secondary name server S Router that filters traffic

18 DIMACS Tutorial on Algorithms for Next Generation Networks August A geometric view of packet classification In theory number of regions defined can be much larger than number of rules Any algorithm that guarantees O(n) space for all rule sets of size n needs O(log(n) k-1 ) time for classification R1 R2 R3 R1 R2 R3 Source address space Destination address space

19 DIMACS Tutorial on Algorithms for Next Generation Networks August The two dimensional case: source and destination IP addresses For each destination prefix in rule set, link to corresponding node in destination IP trie a trie with source prefixes of rules using this destination prefix Matching algorithm must use backtracking to visit all source tries Grid of tries: by pre-computing “switch pointers” in destination tries and propagating some information about more general rules, matching may proceed without backtracking  Memory used proportional to number of rules  Matching time O(w) with constant depending on stride Extended grid of tries handles 5 fields and has good run time and memory in practice

20 DIMACS Tutorial on Algorithms for Next Generation Networks August Dest IPSrc IPDest PortSrc PortProtoAction M*25**R1 M*53*UDPR2 MS53**R3 M*23**R4 TITO123 UDPR5 *Net***R6 Net***TCPR7 *****R8 Dest IP M TI Net * Source IP S TO Net * Proto UDP TCP * Dest Port * Src Port * R8 Bit vector approaches do linear search through rule set For each field we pre-compute a structure (e.g. trie) to find most specific prefix or range distinguished by rule set For each rule, a single bit represents whether a given most specific prefix matches rule or not  We associate with each range a bitmap of size n encoding which of the rules may match a packet in that prefix or range Classification algorithm first computes for each field of the packet the most specific prefix/range it belongs to By then AND-ing together the k bitmaps of size n we find matching rules Works well for hardware solutions that allow wide memory reads Scales poorly to large rule sets

21 DIMACS Tutorial on Algorithms for Next Generation Networks August Dest IPSrc IPDest PortSrc PortProtoAction M*25**R1 M*53*UDPR2 MS53**R3 M*23**R4 TITO123 UDPR5 *Net***R6 Net***TCPR7 *****R8 Dest IP M TI Net * Src IP S TO Net * Proto UDP TCP * Dest Port * Src Port 123 * Cross ProductAction 0M,S,25,123,UDPR1 1M,S,25,123TCPR1 2M,S,25,123,*R1 3M,S,25,*,UDPR1 4M,S,25,*,TCPR1 5M,S,25,*,*R1 478*,*,*,*,TCPR8 479*,*,*,*,*R8 … 4*4*5*2*3=480 3*120+3*30+4*6+1*3+1=478 Dest IP - Src IP Rule bitmap Class 0M,S C1 1M,TO C2 2M,Net C3 3M,* C2 4TI,S C4 5TI,T C5 6TI,Net C6 7TI,* C4 8Net,S C4 9Net,TO C4 10Net,Net C6 11Net,* C4 12*,S C7 13*,TO C7 14*,Net C8 15*,* C7 16 entries, 8 distinct classes Src IP Dest IP Src Port Dest Port Proto Final result Cross-producting performs longest prefix matching separately for all fields and combines the results in a single step by looking up the matching rule in a pre-computed table explicitly listing the first matching rule for each element of the cross-product. The size of this table is the product of the numbers of recognized prefixes/ranges for the individual fields. Due to its memory requirements this method is not feasible. Equivalenced cross-producting (a.k.a. recursive flow classification or RFC) combines the results of the per-field longest matching prefix operations two by two. The pairs of values are grouped in equivalence classes and in general there are much fewer equivalence classes than pairs of values. This leads to significant memory savings as compared to simple cross-producting. This algorithm provides fast packet classification, but compared to other algorithms, the memory requirements are relatively large (but feasible in some settings).

22 DIMACS Tutorial on Algorithms for Next Generation Networks August Decision tree approaches At each node of the tree test a bit in a field or perform a range test  Large fan-out leads to shallow trees and fast classification Leaves contain a few rules traversed linearly Interior nodes may contain rules that match also Tests may look at bits from multiple fields A rule may appear in multiple nodes of the decision tree – this can lead to increased memory usage Tree built using heuristics that pick fields to compare on that divide remaining rules relatively evenly among descendants Fast and compact on rule sets used today

23 DIMACS Tutorial on Algorithms for Next Generation Networks August Papers on packet classification G. Varghese “Network algorithmics …”, chapter 12 V. Srinivasan, G. Varghese, S. Suri, M. Waldvogel, “Fast and Scalable Layer Four Switching”, ACM SIGCOMM, Sep F. Baboescu, S. Singh, G. Varghese, “Packet classification for core routers: Is there an alternative to CAMs?”, IEEE Infocom, 2003 P. Gupta, N. McKeown, “Packet classification on multiple fields”, ACM SIGCOMM 1999 T. Woo, “A modular approach to packet classification: Algorithms and results”, IEEE Infocom, 2000 S. Singh, F. Baboescu, G. Varghese, “Packet classification using multidimensional cutting”, SIGCOMM, 2003

24 DIMACS Tutorial on Algorithms for Next Generation Networks August Overview Longest matching prefix Classification on multiple fields Signature matching  String matching  Regular expression matching w/ DFAs and D 2 FAs

25 DIMACS Tutorial on Algorithms for Next Generation Networks August Signature matching Used in intrusion prevention/detection, application classification, load balancing Guard: a byte string or a regular expression Action: drop packet, log alert, set priority, direct to specific server Input: byte string from the payload of packet(s)  Hence the name “deep packet inspection” Output: the positions at which various signatures match or the identifier of the “highest priority” signature that matches Size of problem: hundreds of signatures per protocol

26 DIMACS Tutorial on Algorithms for Next Generation Networks August String matching Most widely used early form of deep packet inspection, but the more expressive regular expressions have superceded strings by now  Still used as pre-filter to more expensive matching operations by popular open source IDS/IPS Snort Matching multiple strings a well-studied problem  A. Aho, M. Corasick. “Efficient string matching: An aid to bib- liographic search”, Communications of the ACM, June 1975  Many hardware-based solutions published in last decade  Matching time independent of number of strings, memory requirements proportional to sum of their sizes

27 DIMACS Tutorial on Algorithms for Next Generation Networks August Regular expression matching Deterministic and non-deterministic finite automata (DFAs and NFAs) can match regular expressions  NFAs more compact but require backtracking or keeping track of sets of states during matching  Both representations used in hardware and software solutions, but only DFA based solutions can guarantee throughput in software DFAs have a state space explosion problem  From DFAs recognizing individual signatures we can build a DFA that recognizes entire signature set in a single pass  Size of combined DFA much larger than sum of sizes for DFAs recognizing individual signatures  Multiple combined DFAs are used to match signature set

28 DIMACS Tutorial on Algorithms for Next Generation Networks August State 0State State State 3 … State 0State 1State State Default transitions … Deterministic finite automaton (DFA) Delayed Input DFA (D 2 FA) S. Kumar, S. Dharmapurikar, F. Yu, P. Crowley, J. Turner, “Algorithms to Accelerate Multiple Regular Expressions Matching for Deep Packet Inspection”, ACM SIGCOMM, September 2006 Input …410052… Crt. state 112 Set of regular expr. D 2 FAs with no bound on default path length D 2 FAs d.p.l.≤4 Avg. d.p.l.Max d.p.l.Memory Cisco %1.56% Cisco %1.54% Cisco %3.31% Linux %1.87% Linux %9.08% Snort %1.66% Bro %0.51% The memory columns report the ratio between the number of transitions used by the D 2 FA and the corresponding DFA. D 2 FAs build on the observation that for many pairs of states, the transition tables are very similar and it is enough to store the differences. The lookup algorithm may need to follow multiple default transitions until it finds a state that explicitly stores a pointer to the next state it needs to transition to. Since this is a throughput concern, the algorithm for constructing D 2 FAs allows the user to set a limit on the length of the maximum default path. If the “current state” variable meets an acceptance condition (e.g. whether the state identifier is larger than a given threshold), the automaton raises an alert.

29 DIMACS Tutorial on Algorithms for Next Generation Networks August Conclusions Networking devices implement more and more complex data plane processing to better control traffic The algorithms and data structures used have big performance impact Often set of rules to be matched against has specific structure  Algorithms exploiting this structure may give good performance even if it is impossible to find an algorithm that gives good performance on all possible rule sets

30 DIMACS Tutorial on Algorithms for Next Generation Networks August That’s all folks!


Download ppt "Data plane algorithms in routers From prefix lookup to deep packet inspection Cristian Estan, University of Wisconsin-Madison."

Similar presentations


Ads by Google