EE384Y: Packet Switch Architectures

EE384Y: Packet Switch Architectures Part II Address Lookup and Classification (2) Nick McKeown Professor of Electrical Engineering and Computer Science, Stanford University nickm@stanford.edu http://www.stanford.edu/~nickm

Outline
- Routing Lookups
- Packet Classification
  - Motivation and problem definition
  - Classification algorithms
    - Linear search
    - Associative search (TCAM)
    - Trie-based techniques
    - Crossproducting
  - Tradeoffs in classification
  - Heuristic algorithms
- References

Motivation: Desire for Additional Services
[Figure: ISP1, ISP2, and ISP3 connected through a NAP; interfaces labeled X, Y, and Z]
Service | Example
Differentiated Service | Ensure that traffic from ISP2 is given higher priority over traffic from ISP3.
Packet Filtering | Deny all web traffic from ISP3 at interface X.
Policy-based routing | Ensure that all web traffic from ISP2 is sent via interface Z.
Other examples: accounting and billing, rate-limiting, etc.

Special Processing Requires Identification of Flows
- All packets of a flow obey a pre-defined rule and are processed similarly by the router.
- E.g., a flow = (src-IP-address, dst-IP-address), or a flow = (dst-IP-prefix, protocol), etc.
- The router needs to identify the flow of every incoming packet (classification) and then perform appropriate special processing based on negotiated service agreements.
- The pre-defined rules or policies are also known as ACL entries or filters.

Flow-aware Router: Basic Architectural Components
- Control: routing, resource reservation, admission control, SLAs
- Datapath (per-packet processing): routing lookup, packet classification, special processing, switching, scheduling

Multi-field Packet Classification
Rule | Field 1 (L3-DA) | Field 2 (L3-SA) | … | Field k (L4-PROT) | Action
Rule 1 | 5.3.40.0/21 | 2.13.8.11/32 | … | UDP | A1
Rule 2 | 5.168.3.0/24 | 152.133.0.0/16 | … | TCP | A2
… | … | … | … | … | …
Rule N | 5.168.0.0/16 | 152.0.0.0/8 | … | ANY | AN
Example: packet (5.168.3.32, 152.133.171.71, …, TCP)
Packet Classification: Find the action associated with the highest priority rule matching an incoming packet header.
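
A minimal Python sketch of this definition, using the three rules shown on the slide. Two things are assumptions, not stated in the slides: rules nearer the top of the table have higher priority, and only the three visible fields (L3-DA, L3-SA, L4-PROT) are compared, since the fields elided by "…" are unknown. The function name `classify` is illustrative.

```python
import ipaddress

# Rules copied from the slide, in assumed priority order (first match wins).
RULES = [
    ("Rule 1", "5.3.40.0/21", "2.13.8.11/32", "UDP", "A1"),
    ("Rule 2", "5.168.3.0/24", "152.133.0.0/16", "TCP", "A2"),
    ("Rule N", "5.168.0.0/16", "152.0.0.0/8", "ANY", "AN"),
]

def classify(dst, src, proto):
    """Return (rule name, action) of the first matching rule, or None."""
    d = ipaddress.ip_address(dst)
    s = ipaddress.ip_address(src)
    for name, dst_prefix, src_prefix, rule_proto, action in RULES:
        if (d in ipaddress.ip_network(dst_prefix)
                and s in ipaddress.ip_network(src_prefix)
                and rule_proto in ("ANY", proto)):
            return name, action
    return None

# The slide's example packet matches Rule 2 and Rule N; Rule 2 is listed first.
result = classify("5.168.3.32", "152.133.171.71", "TCP")
```

Under the ordering assumption, the example packet resolves to Rule 2 with action A2, even though Rule N also matches it.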

Formal Problem Definition
Given a classifier C with N rules, Rj, 1 ≤ j ≤ N, where Rj consists of three entities:
- A regular expression Rj[i], 1 ≤ i ≤ d, on each of the d header fields,
- A number, pri(Rj), indicating the priority of the rule in the classifier, and
- An action, referred to as action(Rj).
For an incoming packet P with the header considered as a d-tuple of points (P1, P2, …, Pd), the d-dimensional packet classification problem is to find the rule Rm with the highest priority among all the rules Rj matching the d-tuple; i.e., pri(Rm) > pri(Rj), ∀ j ≠ m, 1 ≤ j ≤ N, such that Pi matches Rj[i], 1 ≤ i ≤ d. We call rule Rm the best matching rule for packet P.

Routing Lookup: Instance of 1D Classification
- One dimension (destination address)
- Forwarding table ↔ classifier
- Routing table entry ↔ rule
- Outgoing interface ↔ action
- Prefix length ↔ priority
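
The correspondence can be sketched in Python as a one-field classifier whose priority is the prefix length. The forwarding table below is hypothetical (the prefixes and interface names are not from the slides), and the linear scan is only for clarity, not a realistic lookup structure.

```python
import ipaddress

# Hypothetical forwarding table: (prefix, outgoing interface).
FORWARDING_TABLE = [
    ("10.0.0.0/8", "if0"),
    ("10.1.0.0/16", "if1"),
    ("10.1.2.0/24", "if2"),
]

def route_lookup(dst):
    """1D classification: rule = table entry, priority = prefix length,
    action = outgoing interface."""
    addr = ipaddress.ip_address(dst)
    best = None  # (prefix length, interface) of the best match so far
    for prefix, interface in FORWARDING_TABLE:
        net = ipaddress.ip_network(prefix)
        if addr in net and (best is None or net.prefixlen > best[0]):
            best = (net.prefixlen, interface)
    return best[1] if best else None
```

For instance, `route_lookup("10.1.2.3")` matches all three entries and picks the /24, which is exactly longest-prefix matching.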

Example 4D Classifier
Columns: Rule, L3-DA, L3-SA, L4-DP, L4-PROT, Action
R1: 152.163.190.69/255.255.255.255, 152.163.80.11/255.255.255.255, *, Deny
R2: 152.168.3/255.255.255, 152.163.200.157/255.255.255.255, eq www, udp
R3: range 20-21, Permit
R4: tcp
R5:

Example Classification Results
Pkt Hdr | L3-DA | L3-SA | L4-DP | L4-PROT | Rule, Action
P1 | 152.163.190.69 | 152.163.80.11 | www | tcp | R1, Deny
P2 | 152.168.3.21 | 152.163.200.157 | www | udp | R2, Deny

Geometric Interpretation
Packet classification problem: Find the highest priority rectangle containing an incoming point.
[Figure: rules R1-R7 drawn as rectangles over axes Dimension 1 and Dimension 2, with packets P1 and P2 as points; example regions are (128.16.46.23, *) and (144.24/24, 64/16)]

Outline
- Routing Lookups
- Packet Classification
  - Motivation and problem definition
  - Classification algorithms
    - Linear search
    - Associative search (TCAM)
    - Trie-based techniques
    - Crossproducting
  - Tradeoffs in classification
  - Heuristic algorithms
- References

Metrics for Classification Algorithms
- Speed
- Storage requirements
- Ability to handle large classifiers
- Low preprocessing time
- Update time
- Scalability in the number of header fields
- Flexibility in rule specification

Size/Update-rate of Classifier?
Depends heavily on the type of router:
- Micro-flow recognition
  - 128K-1M flows in a metro/edge router
  - Also requires high update rate (but have few wildcards)
- Firewall applications
  - <2K rules per interface
  - Requires low update rate (usually configured at start-up/boot-up time)

Linear Search
- Keep rules in a linked list
- O(N) storage, O(N) lookup time, O(1) update complexity
Q. Why is update complexity O(1)?
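
One way to read these complexities, sketched in Python: if the list is kept unsorted, a new rule can simply be appended, and the lookup pays for it by scanning every rule and remembering the best match. The class name, the range-per-field rule format, and the use of a Python list (amortized O(1) append) in place of a linked list are all assumptions made for illustration.

```python
class LinearClassifier:
    """Unsorted rule list: O(N) storage, O(N) lookup, O(1) insertion."""

    def __init__(self):
        self.rules = []  # each entry: (priority, ranges, action)

    def add_rule(self, priority, ranges, action):
        # O(1): append without keeping the list ordered by priority.
        self.rules.append((priority, ranges, action))

    def classify(self, packet):
        # O(N): test every rule, keep the highest-priority match.
        best = None
        for priority, ranges, action in self.rules:
            if all(lo <= p <= hi for p, (lo, hi) in zip(packet, ranges)):
                if best is None or priority > best[0]:
                    best = (priority, action)
        return best[1] if best else None

clf = LinearClassifier()
clf.add_rule(1, [(0, 255), (0, 255)], "permit")   # low-priority catch-all
clf.add_rule(5, [(10, 20), (80, 80)], "deny")     # more specific, higher priority
```

Here a larger number means higher priority, matching pri(Rm) > pri(Rj) in the formal definition.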

Ternary Match Operation
Each TCAM entry stores a value, V, and mask, M. Hence, two bits (Vi and Mi) for each bit position i (i = 1..W).
For an incoming packet header, H = {Hi}, the TCAM entry outputs a match if Hi matches Vi in each bit position for which Mi equals ‘1’.
Vi | Mi | Match in bit position i?
X | 0 | Yes
0 | 1 | Iff (Hi == 0)
1 | 1 | Iff (Hi == 1)
A simple diagram of a row entry would be useful here.
Optional Exercise: What is the logic equation for Z (boolean variable denoting whether a TCAM entry matched)?
Z = AND_{i=1..W} (~Mi | (Vi == Hi))
Optional Exercise: What is the logic equation for Z (boolean variable denoting whether a TCAM entry matched), if instead of (Vi, Mi) we store (Ai, Bi), where (0,0) = always match, (1,1) = always mismatch, (0,1) = match0, and (1,0) = match1?
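
The match equation for Z can be evaluated on whole words with two bitwise operations. This is a minimal Python sketch; the function name and the `width` parameter are illustrative, and integers stand in for the W-bit hardware registers.

```python
def tcam_entry_matches(header, value, mask, width=32):
    """Z = AND over all bits of (~Mi | (Vi == Hi)).

    header ^ value has a 1 wherever Hi != Vi; ANDing with the mask keeps
    only the positions the entry cares about (Mi == 1). The entry matches
    when no cared-about position differs.
    """
    all_ones = (1 << width) - 1
    return ((header ^ value) & mask & all_ones) == 0

# 4-bit example: the entry cares about the top two bits only (pattern 10xx).
entry_value, entry_mask = 0b1000, 0b1100
```

With this entry, header 1011 matches (top bits are 10) and header 0011 does not.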

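Lookups/Classification with Ternary CAM
[Figure: a packet header (e.g., 1.23.11.3, tcp) is presented to the TCAM memory array, rows 1 to M, which holds entries such as (1.23.x.x, x); for LPM the entries are ordered P32, P31, …, P8; the match lines feed a priority encoder, whose output indexes the action memory (RAM) to produce the action]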
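
A software emulation of this pipeline, as a hedged Python sketch: entries are stored longest prefix first, every row is compared against the header, and the priority encoder returns the lowest matching row index, which addresses the action memory. The class name `Tcam`, the routes, and the action labels are hypothetical; real hardware compares all rows in parallel rather than in a loop.

```python
import ipaddress

class Tcam:
    def __init__(self, routes):
        entries = []
        for prefix, action in routes:
            net = ipaddress.ip_network(prefix)
            entries.append((net.prefixlen, int(net.network_address),
                            int(net.netmask), action))
        # For LPM: longest prefixes occupy the lowest (highest-priority) rows.
        entries.sort(key=lambda e: -e[0])
        self.memory_array = [(value, mask) for _, value, mask, _ in entries]
        self.action_memory = [action for _, _, _, action in entries]

    def lookup(self, dst):
        header = int(ipaddress.ip_address(dst))
        # One match line per row (computed sequentially in this emulation).
        match_lines = [((header ^ v) & m) == 0 for v, m in self.memory_array]
        # Priority encoder: index of the first asserted match line.
        for index, hit in enumerate(match_lines):
            if hit:
                return self.action_memory[index]
        return None

tcam = Tcam([("0.0.0.0/0", "default"), ("1.23.0.0/16", "A"), ("1.23.11.0/24", "B")])
```

Looking up 1.23.11.3 asserts all three match lines; the encoder picks the /24 row, so the action is "B".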

Range-to-prefix Blowup
Maximum memory blowup = factor of (2W-2)^d
Rule | Range | Maximal Prefixes
R1 | [3,11] | 0011, 01**, 10**
R2 | [2,7] | 001*, 01**
R3 | [4,11] | 01**, 10**
R4 | [4,7] | 01**
R5 | [1,14] | 0001, 001*, 01**, 10**, 110*, 1110
Luckily, real life does not see too many arbitrary ranges.
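
The prefix lists on this slide can be reproduced with a short greedy split: repeatedly take the largest aligned power-of-two block that starts at the low end of the range and still fits. This Python sketch is one standard way to do it, not necessarily the procedure the author had in mind; the function name is illustrative.

```python
def range_to_prefixes(lo, hi, width):
    """Split the integer range [lo, hi] into maximal prefixes of `width` bits."""
    prefixes = []
    while lo <= hi:
        # Largest power-of-two block aligned at lo ...
        size = lo & -lo if lo > 0 else 1 << width
        # ... shrunk until it fits inside what remains of the range.
        while size > hi - lo + 1:
            size >>= 1
        wildcard_bits = size.bit_length() - 1
        fixed_bits = width - wildcard_bits
        head = format(lo >> wildcard_bits, f"0{fixed_bits}b") if fixed_bits else ""
        prefixes.append(head + "*" * wildcard_bits)
        lo += size
    return prefixes

r5 = range_to_prefixes(1, 14, 4)
```

For W = 4, the range [1,14] of rule R5 expands to six prefixes, which is the worst case 2W-2 for a single field; with d such fields the factors multiply, giving the (2W-2)^d bound.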

TCAMs
Advantages
- Extensible to multiple fields
- Fast: 10-16 ns today (66-100 M searches per second), going to 250 Msps
- Simple to understand and use
Disadvantages
- Inflexible: range-to-prefix blowup
- Power: ~15-20W @ 100 Msps
- Cost: $200-$250 for ~2 MByte
- Density: largest available in 2003-4 is ~2MB, i.e., 128K x 128 (can be cascaded)
- Tough memory soft-error problem

Example Classifier Rule Destination Address Source Address R1 0* 10* 01* R3 1* R4 00* R5 11* R6 R7 *

Hierarchical Tries
O(NW) memory, O(W^2) lookup
Search (000,010)
Example classifier (Rule DA SA): R1 0* 10* R2 01* R3 1* R4 00* R5 11* R6 R7 *
[Figure: a trie on Dimension DA whose nodes point to tries on Dimension SA; rules R1-R7 are stored at SA-trie nodes]
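
A small Python sketch of a two-level hierarchical trie and its O(W^2) search. The three rules used here (A, B, C) are hypothetical and are not the slide's R1-R7; earlier rules are assumed to have higher priority. Prefixes are written as bit strings ending in `*`.

```python
class Node:
    def __init__(self):
        self.children = {}     # '0' or '1' -> Node
        self.next_trie = None  # root of the SA trie (DA-trie nodes only)
        self.rules = []        # (priority index, name), SA-trie nodes only

def insert_path(root, bits):
    node = root
    for bit in bits:
        node = node.children.setdefault(bit, Node())
    return node

def build(rules):
    """rules: list of (name, da_prefix, sa_prefix); earlier = higher priority."""
    root = Node()
    for priority, (name, da, sa) in enumerate(rules):
        da_node = insert_path(root, da.rstrip("*"))
        if da_node.next_trie is None:
            da_node.next_trie = Node()
        insert_path(da_node.next_trie, sa.rstrip("*")).rules.append((priority, name))
    return root

def walk(root, bits):
    """Yield each node on the path that follows `bits`, starting at the root."""
    node = root
    yield node
    for bit in bits:
        node = node.children.get(bit)
        if node is None:
            return
        yield node

def search(root, da_bits, sa_bits):
    best = None
    for da_node in walk(root, da_bits):            # up to W+1 DA nodes
        if da_node.next_trie is None:
            continue
        for sa_node in walk(da_node.next_trie, sa_bits):  # up to W+1 SA nodes each
            for candidate in sa_node.rules:
                if best is None or candidate < best:
                    best = candidate
    return best[1] if best else None

trie = build([("A", "0*", "10*"), ("B", "00*", "0*"), ("C", "*", "00*")])
```

The nested walk is where the O(W^2) lookup comes from: every matching DA prefix triggers its own traversal of an SA trie. Each rule is stored once, which keeps memory at O(NW).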

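Set-pruning Tries [Tsuchiya, Sri98]
O(N^2) memory, O(2W) lookup
Search (000,010)
Example classifier (Rule DA SA): R1 0* 10* R2 01* R3 1* R4 00* R5 11* R6 R7 *
[Figure: a trie on Dimension DA with a trie on Dimension SA under each DA prefix; rules are replicated across SA tries (R7, R2, and R1 appear more than once)]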

Grid-of-Tries [Sri98]
O(NW) memory, O(2W) lookup
Search (000,010)
Example classifier (Rule DA SA): R1 0* 10* R2 01* R3 1* R4 00* R5 11* R6 R7 *
[Figure: a trie on Dimension DA linked to tries on Dimension SA; each of the rules R1-R7 appears once]

Grid-of-Tries
20K 2D rules: 2MB, 9 memory accesses (with prefix-expansion)
Advantages
- Good solution for two dimensions
Disadvantages
- Difficult to carry out updates
- Not easily extensible to more than two dimensions

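Crossproducting [Sri98]
[Figure: rules R1-R4 and packet P1 on a two-dimensional grid, with intervals numbered 1-9 on one axis and 1-6 on the other; labeled grid cells include (8,4) and (1,3)]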

Crossproducting
Need: d 1-D lookups + 1 memory access, O(N^d) space
50 rules: 1.5MB; need caching (on-demand crossproducting) for bigger classifiers
Advantages
- Fast accesses
- Suitable for multiple fields
Disadvantages
- Large amount of memory
- Need caching for bigger classifiers (> 50 rules)
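
The "d 1-D lookups + 1 memory access" structure can be sketched in Python for prefix fields. The table is precomputed over every combination of the distinct per-field prefixes, which is where the O(N^d) space goes. The three rules are hypothetical, earlier rules are assumed to have higher priority, and the per-field lookup is a plain scan standing in for a real 1-D lookup structure.

```python
from itertools import product

def covers(rule_prefix, prefix):
    """True if rule_prefix matches every address that prefix matches."""
    return prefix.rstrip("*").startswith(rule_prefix.rstrip("*"))

def build_crossproduct_table(rules):
    """rules: list of (name, (prefix_1, ..., prefix_d)); earlier = higher priority."""
    dims = len(rules[0][1])
    field_prefixes = [sorted({r[1][i] for r in rules}) for i in range(dims)]
    table = {}
    for combo in product(*field_prefixes):
        for name, prefixes in rules:   # first hit is the highest-priority rule
            if all(covers(rp, cp) for rp, cp in zip(prefixes, combo)):
                table[combo] = name
                break
    return field_prefixes, table

def longest_match(prefixes, bits):
    """1-D lookup: the longest prefix in this field's set that matches bits."""
    hits = [p for p in prefixes if bits.startswith(p.rstrip("*"))]
    return max(hits, key=lambda p: len(p.rstrip("*")), default=None)

def classify(field_prefixes, table, packet):
    key = tuple(longest_match(ps, bits) for ps, bits in zip(field_prefixes, packet))
    return table.get(key)              # the single extra memory access

fields, xp_table = build_crossproduct_table(
    [("A", ("0*", "10*")), ("B", ("00*", "0*")), ("C", ("*", "00*"))])
```

Three rules over two fields already give up to 3 x 3 = 9 table slots; the same growth is what limits the scheme to small classifiers without caching.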

Outline
- Routing Lookups
- Packet Classification
  - Motivation and problem definition
  - Classification algorithms
    - Linear search
    - Associative search (TCAM)
    - Trie-based techniques
    - Crossproducting
  - Tradeoffs in classification
  - Heuristic algorithms
- References

Classification Algorithms: Speed vs. Storage Tradeoff
Lower bounds for point location in N regions with d dimensions, from computational geometry:
- O(log N) time with O(N^d) storage, or
- O(log^(d-1) N) time with O(N) storage.
These bounds illustrate the difficulty of designing good packet classification (PC) algorithms. Even for small values of N and d, the storage or classification time becomes infeasible.
Example: N = 100, d = 4 gives N^d = 100 MBytes and log^(d-1) N = 350 memory accesses.
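
The slide's two figures can be checked with a few lines of Python. The interpretation is an assumption: "100 MBytes" is read as 100^4 table entries at one byte each, and "350" as roughly 7^3, with log2(100) rounded up to 7.

```python
import math

N, d = 100, 4

# Storage-heavy option: O(N^d) entries.
storage_entries = N ** d

# Storage-light option: O(log^(d-1) N) memory accesses.
accesses_exact = math.log2(N) ** (d - 1)
accesses_rounded = math.ceil(math.log2(N)) ** (d - 1)
```

With these inputs, `storage_entries` is 100,000,000 and `accesses_rounded` is 343; the unrounded value is a little under 300. Either way, both options are impractical even for a 100-rule, 4-field classifier.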

Classification Tradeoff in Hardware Switches/Routers
- Power consumption of classification subsystem
- Cost
- Speed
- Density (storage)

Algorithms so far: Summary
- Good for two fields, but do not scale to more than two fields, OR
- Good for very small classifiers (< 50 rules) only, OR
- Have non-deterministic classification time, OR
- Either too slow or consume too much storage

One Solution: Heuristics that “seem to work well in real-life”
- Recursive Flow Classification [Gupta, McKeown 1999]: generalization of crossproducting to conserve storage
- Hierarchical Intelligent Cuttings [Gupta, McKeown 1999]
- Aggregated Bit-vector [Baboescu, Varghese 2001]
Properties of real-life classifiers:
- Structure
- Hierarchy (to at least some level)
Good heuristics do better than worst-case bounds for real-life datasets.

Lookup: What’s Used Out There?
- Overwhelming majority of routers: modifications of multi-bit tries (h/w optimized trie algorithms)
  - DRAM (sometimes SRAM) based, large number of routes (>0.25M)
  - Parallelism required for speed/storage becomes an issue
- Others: mostly TCAM based
  - For smaller number of routes (256K)
  - Used more frequently in L2/L3 switches
  - Power and cost main bottlenecks

Classification: What’s Used Out There?
- Majority of hardware platforms: TCAMs
  - High performance, cost, power; deterministic worst-case
- Some others: modifications of RFC
  - Low speed, low cost
  - DRAM-based, heuristic
  - Works well in software platforms
- Some others: nothing/linear search/simulated-parallel-search, etc.

Packet Classification: References
- F. Baboescu and G. Varghese, “Scalable packet classification,” Proc. Sigcomm 2001
- [Lak98] T.V. Lakshman and D. Stiliadis, “High speed policy based packet forwarding using efficient multi-dimensional range matching,” Sigcomm 1998, pp 191-202
- [Sri98] V. Srinivasan, S. Suri, G. Varghese and M. Waldvogel, “Fast and scalable layer 4 switching,” Sigcomm 1998, pp 203-214 [Grid-of-tries, crossproducting]
- V. Srinivasan, G. Varghese, S. Suri, “Fast packet classification using tuple space search,” Sigcomm 1999, pp 135-146
- P. Gupta, N. McKeown, “Packet classification using hierarchical intelligent cuttings,” Hot Interconnects VII, 1999
- [Gupta99] P. Gupta, N. McKeown, “Packet classification on multiple fields,” Sigcomm 1999, pp 147-160 [RFC]

Packet Classification: References (contd.)
- P. Gupta, “Algorithms for routing lookups and packet classification,” PhD Thesis, Ch 1 and 4, Dec 2000, available at http://yuba.stanford.edu/~pankaj/phd.html [Background and introduction to Classification]
- P. Gupta and N. McKeown, “Algorithms for packet classification,” IEEE Network, March/April 2001, vol. 15, no. 2, pp 24-32
- S. Iyer, R.R. Kompella, and A. Shelat, “ClassiPI: An architecture for fast and flexible packet classification,” IEEE Network, March/April 2001, vol. 15, no. 2, pp 33-41
- TCAM vendors: netlogicmicro.com, sibercore.com, idt.com, cypress.com