1 EE384Y: Packet Switch Architectures Part II Address Lookup and Classification Nick McKeown Professor of Electrical Engineering and Computer Science, Stanford University nickm@stanford.edu http://www.stanford.edu/~nickm

2 Generic Router Architecture (Review from EE384x)
[Figure: header processing (lookup IP address, update header) consults an address table (IP address → next hop; ~1M prefixes, off-chip DRAM); packets are then queued in buffer memory (~1M packets, off-chip DRAM).]

3 Lookups Must be Fast
Year   Line       40B packets (Mpkt/s)
1997   622 Mb/s   1.94
1999   2.5 Gb/s   7.81
2001   10 Gb/s    31.25
2003   40 Gb/s    125
1. Lookup mechanism must be simple and easy to implement
2. (Surprise?) Memory access time is the long-term bottleneck
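The packet rates in the table above follow from dividing the line rate by the size of a 40-byte packet in bits. A minimal sketch of that arithmetic, assuming back-to-back packets and no framing overhead (the function name `packets_per_second` is illustrative, not from the slides):

```python
def packets_per_second(line_rate_bps: float, packet_bytes: int = 40) -> float:
    """Arrival rate of back-to-back packets of the given size."""
    return line_rate_bps / (packet_bytes * 8)

# (year, line rate in bits per second) as listed on the slide
LINES = [(1997, 622e6), (1999, 2.5e9), (2001, 10e9), (2003, 40e9)]

for year, rate in LINES:
    pps = packets_per_second(rate)
    # Time available per packet if every packet needs one lookup.
    print(f"{year}: {pps / 1e6:7.2f} Mpkt/s, {1e9 / pps:6.1f} ns per packet")
```

At 40 Gb/s this leaves 8 ns per packet, which is why memory access time, rather than logic speed, ends up as the bottleneck.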

4 Memory Technology (2003-04)
Technology        Single-chip density   $/chip ($/MByte)         Access speed   Watts/chip
Networking DRAM   64 MB                 $30-$50 ($0.50-$0.75)    40-80 ns       0.5-2 W
SRAM              4 MB                  $20-$30 ($5-$8)          4-8 ns         1-3 W
TCAM              1 MB                  $200-$250 ($200-$250)    4-8 ns         15-30 W
Note: Price, speed and power are manufacturer and market dependent.

5 Lookup Mechanism is Protocol Dependent
Networking protocol   Lookup mechanism              Techniques
MPLS, ATM, Ethernet   Exact match search            Direct lookup; associative lookup; hashing; binary/multi-way search trie/tree
IPv4, IPv6            Longest-prefix match search   Radix trie and variants; compressed trie; binary search on prefix intervals

6 Outline
I. Routing Lookups
   Overview
   Exact matching
   – Direct lookup
   – Associative lookup
   – Hashing
   – Trees and tries
   Longest prefix matching
   – Why LPM?
   – Tries and compressed tries
   – Binary search on prefix intervals
   References
II. Packet Classification

7 Exact Matches in ATM/MPLS: Direct Memory Lookup
[Figure: the VCI/MPLS label is used directly as the memory address; the data read out is (outgoing port, new VCI/label).]
VCI/label space is 24 bits: maximum 16M addresses. With 64b data, this is 1Gb of memory.
VCI/label space is private to one link; therefore, table size can be negotiated.
Alternatively, use a level of indirection.
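The sizing argument and the lookup itself can be sketched in a few lines. This is a hedged illustration only: the class name `DirectTable` and the 10-bit negotiated label space in the usage lines are assumptions made for the example, not values from the slides.

```python
LABEL_BITS = 24   # VCI/label width from the slide
ENTRY_BITS = 64   # data width per entry from the slide

# 2^24 entries * 64 bits = 2^30 bits, i.e. 1 Gb for a full table.
full_table_bits = (1 << LABEL_BITS) * ENTRY_BITS

class DirectTable:
    """Direct lookup: the label is the array index, one memory access."""

    def __init__(self, label_bits: int):
        self.entries = [None] * (1 << label_bits)

    def insert(self, label: int, out_port: int, new_label: int) -> None:
        self.entries[label] = (out_port, new_label)

    def lookup(self, label: int):
        return self.entries[label]

# Hypothetical link that negotiated a 10-bit label space.
table = DirectTable(label_bits=10)
table.insert(label=37, out_port=3, new_label=512)
print(full_table_bits, table.lookup(37))
```

Negotiating a smaller label space shrinks the array without changing the single-access lookup.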

8 Exact Matches in Ethernet Switches
Layer-2 addresses are usually 48 bits long.
The address is global, not just local to the link.
The range/size of the address is not negotiable (unlike with ATM/MPLS).
$2^{48} > 10^{12}$, therefore we cannot hold all addresses in a table and use direct lookup.

9 Exact Matches in Ethernet Switches (Associative Lookup)
Associative memory (aka Content Addressable Memory, CAM) compares all entries in parallel against incoming data.
[Figure: the 48-bit network address is presented to the associative memory (CAM); the match location is used as the address into a normal memory, whose data is the port.]

10 Exact Matches in Ethernet Switches: Hashing
Use a pseudo-random hash function (performance is relatively insensitive to the actual function).
Each bucket is searched linearly (or by binary search, etc.).
Leads to an unpredictable number of memory references.
[Figure: a hashing function maps the 48-bit network address to a (say) 16-bit memory address; that entry holds a pointer to a list/bucket of the network addresses hashing to it, each stored with its data.]
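A minimal sketch of the hashed table with linearly searched buckets, counting memory references per lookup. The choice of CRC-32 as the hash and the name `HashedMacTable` are assumptions made for illustration; the slide only asks for some pseudo-random function.

```python
import zlib

class HashedMacTable:
    """48-bit address -> port, hashed into 2^index_bits chained buckets."""

    def __init__(self, index_bits: int = 16):
        self.mask = (1 << index_bits) - 1
        self.buckets = [[] for _ in range(1 << index_bits)]

    def _index(self, mac: int) -> int:
        # Assumed hash: low bits of CRC-32 over the 6 address bytes.
        return zlib.crc32(mac.to_bytes(6, "big")) & self.mask

    def insert(self, mac: int, port: int) -> None:
        bucket = self.buckets[self._index(mac)]
        for i, (stored, _) in enumerate(bucket):
            if stored == mac:
                bucket[i] = (mac, port)   # overwrite an existing entry
                return
        bucket.append((mac, port))

    def lookup(self, mac: int):
        """Return (port or None, number of memory references)."""
        refs = 1                          # read the bucket pointer
        for stored, port in self.buckets[self._index(mac)]:
            refs += 1                     # read one list element
            if stored == mac:
                return port, refs
        return None, refs
```

The reference count depends on how many addresses collide in the bucket, which is the non-determinism the slide points out.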

11 Exact Matches Using Hashing
[Figure: number of memory references]

12 Exact Matches in Ethernet Switches: Perfect Hashing
[Figure: a hashing function maps the 48-bit network address to a (say) 16-bit memory address, whose data is the port.]
There always exists a perfect hash function.
Goal: with a perfect hash function, a lookup always takes O(1) memory references.
Problems:
– Finding perfect hash functions (particularly minimal perfect hash functions) is very complex.
– Updates?

13 Exact Matches in Ethernet Switches Hashing Advantages: –Simple –Expected lookup time is small Disadvantages –Inefficient use of memory –Non-deterministic lookup time Attractive for software-based switches, but decreasing use in hardware platforms

14 Exact Matches in Ethernet Switches: Trees and Tries
Binary search tree (N entries, depth $\log_2 N$): lookup time dependent on table size, but independent of address length; storage is O(N).
Binary search trie (branches on one address bit per level; W = address length in bits): lookup time bounded and independent of table size; storage is O(NW).

15 Exact Matches in Ethernet Switches: Multiway tries
[Figure: 16-ary search trie; each node holds 16 entries indexed 0000 to 1111, each with a pointer field.]
Ptr=0 means no children.
Q: Why can't we just make it a $2^{48}$-ary trie?

16 Exact Matches in Ethernet Switches: Multiway tries
Table produced from $2^{15}$ randomly generated 48-bit addresses.
As degree increases, more and more pointers are 0.

17 Exact Matches in Ethernet Switches Trees and Tries Advantages: –Fixed lookup time –Simple to implement and update Disadvantages –Inefficient use of memory and/or requires large number of memory references

18 Outline
I. Routing Lookups
   Overview
   Exact matching
   – Direct lookup
   – Associative lookup
   – Hashing
   – Trees and tries
   Longest prefix matching
   – Why LPM?
   – Tries and compressed tries
   – Binary search on prefix intervals
   References
II. Packet Classification

19 Longest Prefix Matching: IPv4 Addresses
32-bit addresses
Dotted quad notation: e.g. 12.33.32.1
Can be represented as integers on the IP number line $[0, 2^{32}-1]$: a.b.c.d denotes the integer $a \cdot 2^{24} + b \cdot 2^{16} + c \cdot 2^{8} + d$
[Figure: IP number line from 0.0.0.0 to 255.255.255.255]
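The mapping between dotted-quad notation and the IP number line is a plain base-256 conversion. A short sketch (the helper names `ip_to_int` and `int_to_ip` are illustrative):

```python
def ip_to_int(dotted: str) -> int:
    """a.b.c.d -> a*2^24 + b*2^16 + c*2^8 + d"""
    a, b, c, d = (int(x) for x in dotted.split("."))
    return (a << 24) | (b << 16) | (c << 8) | d

def int_to_ip(value: int) -> str:
    """Inverse mapping: peel off one octet at a time, most significant first."""
    return ".".join(str((value >> shift) & 0xFF) for shift in (24, 16, 8, 0))

print(ip_to_int("12.33.32.1"), int_to_ip(ip_to_int("12.33.32.1")))
```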

20 Class-based Addressing
Class           Range                         MS bits   netid       hostid
A               0.0.0.0 - 127.255.255.255     0         bits 1-7    bits 8-31
B               128.0.0.0 - 191.255.255.255   10        bits 2-15   bits 16-31
C               192.0.0.0 - 223.255.255.255   110       bits 3-23   bits 24-31
D (multicast)   224.0.0.0 - 239.255.255.255   1110      -           -
E (reserved)    240.0.0.0 - 255.255.255.255   1111      -           -
[Figure: IP number line divided into classes A to E, with boundaries marked at 0.0.0.0, 128.0.0.0 and 192.0.0.0.]

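21 Lookups with Class-based Addresses
Exact-match tables (netid → port#):
Class A: 23 → Port 1
Class B: 186.21 → Port 2
Class C: 192.33.32 → Port 3
[Figure: the address 192.33.32.1 is classified by its leading bits as class C, and its netid 192.33.32 is found by exact match in the class C table.]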
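With classful addresses the leading bits fix the netid length, so the lookup reduces to one exact match. A sketch under the assumption that each class table is keyed by the leading 1, 2 or 3 octets (leading class bits included); the table contents mirror the slide's example, and all function names are illustrative:

```python
def ip_to_int(dotted: str) -> int:
    a, b, c, d = (int(x) for x in dotted.split("."))
    return (a << 24) | (b << 16) | (c << 8) | d

def classify(addr: int):
    """Return (class letter, table key); the key is None for classes D and E."""
    if addr >> 31 == 0b0:
        return "A", addr >> 24       # first octet
    if addr >> 30 == 0b10:
        return "B", addr >> 16       # first two octets
    if addr >> 29 == 0b110:
        return "C", addr >> 8        # first three octets
    if addr >> 28 == 0b1110:
        return "D", None
    return "E", None

# One exact-match table per class, populated from the slide's example.
TABLES = {
    "A": {23: "Port 1"},
    "B": {(186 << 8) | 21: "Port 2"},
    "C": {(192 << 16) | (33 << 8) | 32: "Port 3"},
}

def classful_lookup(dotted: str):
    cls, key = classify(ip_to_int(dotted))
    return TABLES.get(cls, {}).get(key)

print(classful_lookup("192.33.32.1"))
```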

22 Problems with Class-based Addressing Fixed netid-hostid boundaries too inflexible –Caused rapid depletion of address space Exponential growth in size of routing tables

23 Early Exponential Growth in Routing Table Sizes Number of BGP routes advertised

24 Classless Addressing (and CIDR)
Eliminated class boundaries
Introduced the notion of a variable-length prefix between 0 and 32 bits long
Prefixes represented by P/l: e.g., 122/8, 212.128/13, 34.43.32/22, 10.32.32.2/32, etc.
An l-bit prefix represents an aggregation of $2^{32-l}$ IP addresses
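Each P/l prefix corresponds to a contiguous range on the IP number line. The sketch below converts the slide's shorthand (trailing zero octets omitted) into a first address, last address and address count; `prefix_range` is an illustrative name.

```python
def prefix_range(prefix: str):
    """'212.128/13' -> (first address, last address, number of addresses)."""
    net, length_text = prefix.split("/")
    length = int(length_text)
    octets = [int(x) for x in net.split(".")]
    octets += [0] * (4 - len(octets))          # restore omitted zero octets
    base = (octets[0] << 24) | (octets[1] << 16) | (octets[2] << 8) | octets[3]
    size = 1 << (32 - length)                  # 2^(32-l) addresses
    first = base & ~(size - 1) & 0xFFFFFFFF    # clear the host bits
    return first, first + size - 1, size

for p in ["122/8", "212.128/13", "34.43.32/22", "10.32.32.2/32"]:
    print(p, prefix_range(p))
```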

25 CIDR: Hierarchical Route Aggregation
[Figure: a backbone router connects to routers R1, R2, R3 and R4. ISP P holds 192.2.0/22 and ISP Q holds 200.11.0/22; site S (192.2.1/24) and site T (192.2.2/24) fall inside 192.2.0/22 on the IP number line.]
Backbone routing table: 192.2.0/22, R2

26 Post-CIDR Routing Table sizes Source: http://www.cidr-report.org/

27 Routing Lookups with CIDR
Routing table: 192.2.0/22, R2; 192.2.2/24, R3; 200.11.0/22, R4
Example destinations on the IP number line: 200.11.0.33, 192.2.0.1, 192.2.2.100
LPM: Find the most specific route, or the longest matching prefix among all the prefixes matching the destination address of an incoming packet
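The definition of LPM can be checked against the slide's three routes with a deliberately naive linear scan. This is a reference sketch for the semantics only, not a practical lookup scheme; `to_int` and `lpm_linear` are illustrative names.

```python
ROUTES = [("192.2.0", 22, "R2"), ("192.2.2", 24, "R3"), ("200.11.0", 22, "R4")]

def to_int(dotted: str) -> int:
    octets = [int(x) for x in dotted.split(".")]
    octets += [0] * (4 - len(octets))          # allow the shortened P/l form
    value = 0
    for octet in octets:
        value = (value << 8) | octet
    return value

def lpm_linear(dest: str):
    """Scan every route and keep the longest prefix that matches dest."""
    addr = to_int(dest)
    best_len, best_hop = -1, None
    for net, length, hop in ROUTES:
        mask = (0xFFFFFFFF << (32 - length)) & 0xFFFFFFFF
        if (addr & mask) == (to_int(net) & mask) and length > best_len:
            best_len, best_hop = length, hop
    return best_hop

for dest in ["200.11.0.33", "192.2.0.1", "192.2.2.100"]:
    print(dest, lpm_linear(dest))
```

192.2.2.100 matches both 192.2.0/22 and 192.2.2/24; the /24 wins because it is more specific.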

28 Longest Prefix Match is Harder than Exact Match The destination address of an arriving packet does not carry with it the information to determine the length of the longest matching prefix Hence, one needs to search among the space of all prefix lengths; as well as the space of all prefixes of a given length

29 LPM in IPv4
Use 32 exact match algorithms for LPM!
[Figure: the network address is fed in parallel to exact-match units for prefixes of length 1, length 2, ..., length 32; a priority encoder picks the longest match and outputs the port.]
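One way to sketch this idea in software is a separate exact-match table per prefix length. Hardware would probe all lengths in parallel and priority-encode the result; the sketch below instead probes sequentially from the longest length down, which is an assumption made to keep the example simple. The routes reuse the CIDR example from earlier in the deck, and the names are illustrative.

```python
def ip(dotted: str) -> int:
    a, b, c, d = (int(x) for x in dotted.split("."))
    return (a << 24) | (b << 16) | (c << 8) | d

class PerLengthLPM:
    WIDTH = 32

    def __init__(self):
        # One exact-match table (a dict) per prefix length 1..32.
        self.tables = {n: {} for n in range(1, self.WIDTH + 1)}

    def insert(self, network: int, length: int, next_hop: str) -> None:
        self.tables[length][network >> (self.WIDTH - length)] = next_hop

    def lookup(self, addr: int):
        # Longest length first, so the first hit is the longest match.
        for length in range(self.WIDTH, 0, -1):
            key = addr >> (self.WIDTH - length)
            if key in self.tables[length]:
                return self.tables[length][key]
        return None

lpm = PerLengthLPM()
lpm.insert(ip("192.2.0.0"), 22, "R2")
lpm.insert(ip("192.2.2.0"), 24, "R3")
lpm.insert(ip("200.11.0.0"), 22, "R4")
print(lpm.lookup(ip("192.2.2.100")))
```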

30 Metrics for Lookup Algorithms
Speed (= number of memory accesses)
Storage requirements (= amount of memory)
Low update time (support ~5K updates/s)
Scalability
– With length of prefix: IPv4 unicast (32b), Ethernet (48b), IPv4 multicast (64b), IPv6 unicast (128b)
– With size of routing table (sweet spot for today's designs = 1 million)
Flexibility in implementation
Low preprocessing time

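31 Radix Trie
Prefixes: P1 = 111* (H1), P2 = 10* (H2), P3 = 1010* (H3), P4 = 10101 (H4)
Trie node: next-hop-ptr (if prefix), left-ptr, right-ptr
[Figure: binary trie with nodes A to H (0 branches left, 1 branches right) holding P1 to P4. The lookup of 10111 is traced, and adding P5 = 1110* creates a new node I.]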
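A minimal sketch of a one-bit-at-a-time radix trie using the slide's prefixes, written over bit strings for readability. The next hop "H5" for the added prefix P5 is an assumption (the slide names the prefix but not its next hop), and the class names are illustrative.

```python
class Node:
    __slots__ = ("next_hop", "child")

    def __init__(self):
        self.next_hop = None         # set only if a prefix ends at this node
        self.child = [None, None]    # child[0] = left-ptr, child[1] = right-ptr

class RadixTrie:
    def __init__(self):
        self.root = Node()

    def insert(self, prefix: str, next_hop: str) -> None:
        node = self.root
        for bit in prefix:
            i = int(bit)
            if node.child[i] is None:
                node.child[i] = Node()
            node = node.child[i]
        node.next_hop = next_hop

    def lookup(self, addr: str):
        """Walk the address bits, remembering the last prefix seen."""
        node, best = self.root, self.root.next_hop
        for bit in addr:
            node = node.child[int(bit)]
            if node is None:
                break
            if node.next_hop is not None:
                best = node.next_hop
        return best

trie = RadixTrie()
for prefix, hop in [("111", "H1"), ("10", "H2"), ("1010", "H3"), ("10101", "H4")]:
    trie.insert(prefix, hop)
print(trie.lookup("10111"))      # falls off the trie after 101; last prefix seen is P2
trie.insert("1110", "H5")        # P5 from the slide; next hop name assumed
print(trie.lookup("11101"))
```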

32 Radix Trie W-bit prefixes: O(W) lookup, O(NW) storage and O(W) update complexity Advantages Simplicity Extensible to wider fields Disadvantages Worst case lookup slow Wastage of storage space in chains

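33 Leaf-pushed Binary Trie
Prefixes: P1 = 111* (H1), P2 = 10* (H2), P3 = 1010* (H3), P4 = 10101 (H4)
Trie node: left-ptr or next-hop, right-ptr or next-hop
[Figure: the binary trie with next hops pushed down to the leaves; nodes A, B, C, D, E and G remain, and P2 appears at two leaves.]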

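34 PATRICIA
Prefixes: P1 = 111* (H1), P2 = 10* (H2), P3 = 1010* (H3), P4 = 10101 (H4); bit positions are numbered 1 to 5
Patricia tree internal node: bit-position, left-ptr, right-ptr
[Figure: Patricia tree with nodes A to G; internal nodes test bit positions 2, 3 and 5, and the leaves hold P1 to P4. The lookup of 10111 is traced.]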

35 PATRICIA
W-bit prefixes: $O(W^2)$ lookup, O(N) storage and O(W) update complexity
Advantages: decreased storage; extensible to wider fields
Disadvantages: worst case lookup slow; backtracking makes implementation complex

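36 Path-compressed Tree
Prefixes: P1 = 111* (H1), P2 = 10* (H2), P3 = 1010* (H3), P4 = 10101 (H4)
Path-compressed tree node structure: variable-length bitstring, next-hop (if prefix present), bit-position, left-ptr, right-ptr
[Figure: path-compressed tree with nodes A to E; node contents shown as (bitstring, next-hop, bit-position) include (1, -, 2), (10, P2, 4) and (1010, P3, 5), with P1 and P4 at leaves. The lookup of 10111 is traced.]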

37 Path-compressed Tree
W-bit prefixes: O(W) lookup, O(N) storage and O(W) update complexity
Advantages: decreased storage
Disadvantages: worst case lookup slow

38 Multi-bit Tries
Binary trie: depth = W, degree = 2, stride = 1 bit
Multi-ary trie: depth = W/k, degree = $2^k$, stride = k bits

39 Prefix Expansion with Multi-bit Tries
If stride = k bits, prefix lengths that are not a multiple of k need to be expanded
E.g., k = 2:
Prefix   Expanded prefixes
0*       00*, 01*
11*      11*
Maximum number of expanded prefixes corresponding to one non-expanded prefix = $2^{k-1}$
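Prefix expansion pads a prefix with every bit combination needed to reach the next multiple of the stride. A sketch over bit strings follows; the helper names are illustrative, and the rule that a longer original prefix overrides a shorter one on collision is the usual convention, stated here as an assumption since the slide does not spell it out.

```python
def expand(prefix: str, k: int):
    """Expand a bit-string prefix to the next length that is a multiple of k."""
    pad = (-len(prefix)) % k             # at most k-1 bits are missing
    if pad == 0:
        return [prefix]
    return [prefix + format(i, f"0{pad}b") for i in range(1 << pad)]

def expand_table(routes: dict, k: int) -> dict:
    expanded = {}
    # Shortest first, so expansions of longer prefixes overwrite shorter ones.
    for prefix, hop in sorted(routes.items(), key=lambda kv: len(kv[0])):
        for p in expand(prefix, k):
            expanded[p] = hop
    return expanded

print(expand("0", 2), expand("11", 2))
print(expand_table({"111": "H1", "10": "H2", "1010": "H3", "10101": "H4"}, 2))
```

Since at most k-1 bits are appended, one prefix yields at most $2^{k-1}$ expanded prefixes, matching the bound on the slide.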

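40 Four-ary Trie (k=2)
Prefixes: P1 = 111* (H1), P2 = 10* (H2), P3 = 1010* (H3), P4 = 10101 (H4)
A four-ary trie node: next-hop-ptr (if prefix), ptr00, ptr01, ptr10, ptr11
[Figure: four-ary trie with nodes A to H and branches labelled by 2-bit strides (10, 11); P1 and P4 each appear twice because of prefix expansion. The lookup of 10111 is traced.]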

41 Prefix Expansion Increases Storage Consumption
Replication of next-hop ptr
Greater number of unused (null) pointers in a node
Time ~ $W/k$
Storage ~ $(NW/k) \cdot 2^{k-1}$

42 Generalization: Different Strides at Each Trie Level 16-8-8 split 4-10-10-8 split 24-8 split 21-3-8 split Optional Exercise: Why does this not work well for IPv6?

43 Choice of Strides: Controlled Prefix Expansion [Sri98]
Given a forwarding table and a desired number of memory accesses in the worst case (i.e., maximum tree depth, D), a dynamic programming algorithm computes the optimal sequence of strides that minimizes the storage requirements; it runs in $O(W^2 D)$ time.
Advantages: optimal storage under these constraints
Disadvantages: updates lead to sub-optimality anyway; hardware implementation difficult

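44 Binary Search on Prefix Intervals [Lampson98]
Prefix        Interval
P1   /0       0000…1111
P2   00/2     0000…0011
P3   1/1      1000…1111
P4   1101/4   1101…1101
P5   001/3    0010…0011
[Figure: number line from 0000 to 1111 with ticks at 0010, 0100, 0110, 1000, 1010, 1100 and 1110; prefixes P1 to P5 are drawn as ranges that partition the line into intervals I1 to I6; the point 1001 is marked.]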
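The interval view can be sketched as follows: collect every prefix start and every address just past a prefix end, sort them into interval boundaries, precompute the longest matching prefix for each interval, and answer lookups with a binary search. This is a simplified illustration on the slide's 4-bit example, not the multiway, multicolumn scheme of [Lampson98]; all names are illustrative.

```python
from bisect import bisect_right

W = 4  # address width in the slide's example
PREFIXES = {"P1": "", "P2": "00", "P3": "1", "P4": "1101", "P5": "001"}

def span(bits: str):
    """Lowest and highest W-bit address covered by a prefix bit string."""
    lo = int(bits, 2) << (W - len(bits)) if bits else 0
    return lo, lo + (1 << (W - len(bits))) - 1

def build_intervals(prefixes: dict):
    starts = {0}
    for bits in prefixes.values():
        lo, hi = span(bits)
        starts.add(lo)
        if hi + 1 < (1 << W):
            starts.add(hi + 1)           # a new interval begins after a prefix ends
    starts = sorted(starts)
    owners = []
    for s in starts:
        # Precompute the longest prefix covering the interval's first address.
        covering = [n for n, b in prefixes.items() if span(b)[0] <= s <= span(b)[1]]
        owners.append(max(covering, key=lambda n: len(prefixes[n]), default=None))
    return starts, owners

STARTS, OWNERS = build_intervals(PREFIXES)

def lookup(addr: int):
    """O(log N) binary search for the interval containing addr."""
    return OWNERS[bisect_right(STARTS, addr) - 1]

print(STARTS, OWNERS, lookup(0b1001))
```

The six interval starts correspond to I1 through I6, and every address inside one interval shares the same longest matching prefix, so the per-interval answer can be stored once.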

45 Alphabetic Tree
[Figure: binary search tree over intervals I1 to I6 with comparison keys 0001, 0011, 0111, 1100 and 1101 (a ">" test at each node); branches are labelled with probabilities 1/2, 1/4, 1/8, 1/16, 1/32. The number line with P1 to P5 and I1 to I6 is repeated below the tree.]

46 Another Alphabetic Tree
[Figure: a differently shaped search tree over intervals I1 to I6 with comparison keys 0001, 0011, 0111, 1100 and 1101; probabilities 1/2, 1/4, 1/8, 1/16, 1/32.]

47 Multiway Search on Intervals
W-bit N prefixes: O(log N) lookup, O(N) storage
Advantages: storage is linear; can be balanced; lookup time independent of W
Disadvantages: lookup time is dependent on N; incremental updates complex; each node is big in size and so requires higher memory bandwidth

48 Routing Lookups: References
[lulea98] A. Brodnik, S. Carlsson, M. Degermark, S. Pink. Small Forwarding Tables for Fast Routing Lookups, Sigcomm 1997, pp 3-14. [Example of techniques for decreasing storage consumption]
[gupta98] P. Gupta, S. Lin, N. McKeown. Routing lookups in hardware at memory access speeds, Infocom 1998, pp 1241-1248, vol. 3. [Example of hardware-optimized trie with increased storage consumption]
P. Gupta, B. Prabhakar, S. Boyd. Near-optimal routing lookups with bounded worst case performance, Proc. Infocom, March 2000. [Example of deliberately skewing alphabetic trees]
P. Gupta. Algorithms for routing lookups and packet classification, PhD Thesis, Ch 1 and 2, Dec 2000, available at http://yuba.stanford.edu/~pankaj/phd.html [Background and introduction to LPM]

49 Routing Lookups: References (contd)
[lampson98] B. Lampson, V. Srinivasan, G. Varghese. IP lookups using multiway and multicolumn search, Infocom 1998, pp 1248-56, vol. 3.
[LC-trie] S. Nilsson, G. Karlsson. Fast address lookup for Internet routers, IFIP Intl Conf on Broadband Communications, Stuttgart, Germany, April 1-3, 1998.
[sri98] V. Srinivasan, G. Varghese. Fast IP lookups using controlled prefix expansion, Sigmetrics, June 1998.
[wald98] M. Waldvogel, G. Varghese, J. Turner, B. Plattner. Scalable high speed IP routing lookups, Sigcomm 1997, pp 25-36.
