Outline Introduction Related work on packet classification Grouper Performance Analysis Empirical Evaluation Conclusions 2/42

Introducing Grouper
A packet classification algorithm
Parameterized by the amount of memory available to it
Trades classification speed for memory efficiency
Obtains good performance under real-world memory constraints
3/42

Quick (Over|Re)view of Packet Classifiers
Takes in a list of rules, each specifying a class of packets matched by that rule
The rules are usually arranged by priority
Class   Source IP   Source Port
* [4-8][ ]
2       *           >=1024
3       *           *
4/42

Packet Classifier’s Job
The classifier’s job is to take packets as input and, for every packet, output the corresponding class number
[Figure: classifier diagram; surviving label: RULES]
5/42

Outline Introduction Related work on packet classification Grouper Performance Analysis Empirical Evaluation Conclusions & Future Work 6/42

Related Work: Range Rule Patterns
Existing software solutions (e.g., GEM) focus heavily on range and prefix pattern rules
– Range rule: dest_port = [1024 – 65535]
– Prefix rule: src_ip = *
For many applications, these types of rules are not efficiently expressive
E.g., matching all odd-numbered 16-bit ports requires 65,535 range/prefix rules
7/42

Bitmask Patterns: More Efficiently Expressive than Range Patterns
Bitmask pattern to match all odd 16-bit ports:
– Ternary mask, consisting of 0, 1, or ? (don’t care)
– ???????????????1
A b-bit bitmask rule may require $2^b - 1$ range rules to express
On the other hand, Rottenstreich et al. recently showed that every b-bit range rule can be converted into b bitmask rules
8/42
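
As a minimal sketch of what a ternary bitmask rule means, the snippet below checks a value against a mask written with the characters 0, 1, and ?. The function name `matches` and the constant `ODD_PORTS` are hypothetical names chosen for this illustration; they do not come from the slides.

```python
def matches(mask: str, value: int) -> bool:
    """Return True if `value` matches the ternary `mask` ('0', '1', '?').

    The mask is read most-significant bit first, as on the slide.
    """
    bits = format(value, f"0{len(mask)}b")
    if len(bits) != len(mask):
        return False  # value needs more bits than the mask provides
    # A '?' matches either bit; '0' and '1' must match exactly.
    return all(m == "?" or m == b for m, b in zip(mask, bits))


# The slide's mask for all odd 16-bit ports: 15 don't-cares, then a 1.
ODD_PORTS = "?" * 15 + "1"

print(matches(ODD_PORTS, 443))  # odd port
print(matches(ODD_PORTS, 80))   # even port
```

A single mask covers every odd port because only the least-significant bit is constrained.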

Who Uses Bitmasks?
Some existing packet-classification solutions handle bitmask patterns
RFC (a software solution) handles them, but uses prohibitively large amounts of memory for large rule sets (> 6000 rules)
TCAMs (a hardware solution) are the de facto industry standard and use bitmask rules, but are expensive, special-purpose hardware with limited capacity for rules
9/42

Related Work: Regular Expression Patterns
Some software algorithms, such as ESAs, XFAs, and BDDs, can handle regular expression rules, which are even more efficiently expressive than bitmasks
Unfortunately, all of these algorithms suffer from worst-case exponential memory requirements and/or classification times
10/42

Outline Introduction Related work on packet classification Grouper Performance Analysis Empirical Evaluation Conclusions & Future Work 11/42

How Grouper Works: Grouping
Grouper is a software algorithm that handles bitmask rules
It works by partitioning the b packet bits our classifier cares about into approximately equal-sized groups
12/42
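
The partitioning step can be sketched as follows. This assumes "approximately equal-sized" means group sizes differ by at most one bit and that the larger groups come first; the slides do not state the ordering, and `group_sizes` is a hypothetical helper name.

```python
def group_sizes(b: int, t: int) -> list[int]:
    """Split b bits into t groups whose sizes differ by at most one."""
    if not 1 <= t <= b:
        raise ValueError("need 1 <= t <= b")
    small, extra = divmod(b, t)
    # `extra` groups get one additional bit (assumption: they come first).
    return [small + 1] * extra + [small] * (t - extra)


print(group_sizes(8, 3))    # [3, 3, 2]
print(group_sizes(104, 5))  # [21, 21, 21, 21, 20]
```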

How Grouper Works: Lookup
Grouper uses the value of each of these groups to look up the set (expressed as a bitmap) of classes that match that group of bits
[Figure, built up over several slides: the packet bits split into Group 0, Group 1, Group 2, and Group 3, each indexing its own lookup table (Table for Group 0 through Table for Group 3)]
13-17/42

How Grouper Works: Intersection
Then it takes the intersection (bitwise AND) of all matching sets of rules to obtain the final matching class
[Figure: the looked-up bitmaps combined with &]
18/42

How Grouper Works: Results
The final result is an n-length bitmap representing the set of all classes the input packet belongs to
We can either return the highest-priority class that matches, or all matching classes (our implementation does the former)
[Figure: result bitmap indexed by class number; Class 1 matches]
19/42
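
The grouping, lookup, intersection, and result steps can be put together in a short sketch. This is an illustration under stated assumptions, not the authors' C prototype: rules are ternary strings listed in priority order, bitmaps are Python integers with bit i standing for the rule at index i (0-based, unlike the 1-based class numbers on the slides), and the names `build_tables` and `classify` are hypothetical.

```python
def build_tables(rules: list[str], t: int):
    """Precompute one lookup table per bit group.

    Each table entry is a bitmap (int) of the rules whose bits in that
    group are compatible with the entry's index.
    """
    b = len(rules[0])
    if not 1 <= t <= b:
        raise ValueError("need 1 <= t <= b")
    small, extra = divmod(b, t)
    sizes = [small + 1] * extra + [small] * (t - extra)
    tables, start = [], 0
    for size in sizes:
        table = []
        for value in range(1 << size):
            bits = format(value, f"0{size}b")
            bitmap = 0
            for cls, rule in enumerate(rules):
                chunk = rule[start:start + size]
                if all(m == "?" or m == v for m, v in zip(chunk, bits)):
                    bitmap |= 1 << cls
            table.append(bitmap)
        tables.append((start, size, table))
        start += size
    return tables


def classify(tables, packet_bits: str):
    """AND the per-group bitmaps; return the highest-priority match."""
    result = -1  # all ones, the identity for bitwise AND
    for start, size, table in tables:
        result &= table[int(packet_bits[start:start + size], 2)]
    if result == 0:
        return None  # no rule matches
    # The lowest set bit is the earliest (highest-priority) rule.
    return (result & -result).bit_length() - 1


rules = ["1100????", "???????1", "????????"]  # index 0 has top priority
tables = build_tables(rules, t=4)
print(classify(tables, "11000011"))
print(classify(tables, "00000001"))
print(classify(tables, "00000010"))
```

Changing `t` changes only the number and size of the tables, not the answers, which is the time-space knob the later slides analyze.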

Observation 1: Dimension Independence
Note that Grouper is “blind” to packet fields/dimensions
As far as Grouper is concerned, every packet is simply an array of bits
Groups do not necessarily correspond to packet fields
Grouper doesn’t suffer from problems of other classification algorithms (e.g., geometric algorithms) whose performance is exponential in the number of dimensions
20/42

Observation 2: Efficiency via Uniformity
Grouper guarantees that all groups will be roughly equal in size
This uniformity prevents memory inefficiency from disproportionately large tables or time inefficiency from small tables
[Figure panels: Space Inefficient, Time Inefficient, Best Balance]
21/42

Outline Introduction Related work on packet classification Grouper Performance Analysis Empirical Evaluation Conclusions & Future Work 22/42

Performance at the Extremes of Group Sizes
By controlling the size of the bit groupings, Grouper can trade memory for classification speed
[Figure: two example groupings: Tables = 3, Mem = 40 bits; Tables = 4, Mem = 32 bits]
23-24/42

Performance With All Bits in a Single Group
Having more bits per group implies larger lookup tables but fewer table lookups and fewer intersections
This is one extreme of the classification algorithm, using a single lookup table: large memory requirements but fast lookup time
[Figure: a single table with 256 entries]
25/42

Performance with Each Bit in its Own Group
A single bit per group corresponds to the other extreme of the classification algorithm: linear search (analogous to walking through every combination of packet bits and rule/class numbers)
26/42

Grouper’s Performance in General (Running Time)
Grouper uses t lookup tables to classify b bits according to n rules/classes
Each lookup table maps either $\lfloor b/t \rfloor$ or $\lceil b/t \rceil$ of the b packet bits to an n-length bitmap representing the set of all classes those bits could possibly match
Classification time is $O(t \cdot n/W)$ [1 < t ≤ b]
27/42

Grouper’s Performance in General (Memory Usage)
Grouper uses t tables, each with $O(2^{b/t})$ entries
Each entry is an n-length bitmap consuming O(n/W) machine words
– (W is the word size in bits)
Total memory is therefore $O(2^{b/t} \cdot t \cdot n)$ [1 < t ≤ b]
28/42
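
A small calculation makes the trade-off concrete. The sketch below counts only bitmap bits (table entries times n) and ignores word padding and any bookkeeping overhead, so it is an approximation rather than the deck's exact-usage formula. The values b = 8 and n = 2 are inferred because they reproduce the figures quoted on the earlier slides (3 tables with 40 bits, 4 tables with 32 bits, one table with 256 entries); the slides do not state them. `memory_bits` is a hypothetical name.

```python
def memory_bits(b: int, t: int, n: int) -> int:
    """Bitmap storage, in bits, for b packet bits, t tables, n rules."""
    if not 1 <= t <= b:
        raise ValueError("need 1 <= t <= b")
    small, extra = divmod(b, t)
    # `extra` tables index small+1 bits; the remaining tables index small bits.
    entries = extra * 2 ** (small + 1) + (t - extra) * 2 ** small
    return entries * n


for t in (1, 3, 4, 8):
    # t is also the number of lookups (and ANDs) per packet.
    print(t, memory_bits(8, t, 2))
```

With these assumed values the single-table extreme needs 512 bits for one lookup, while four tables need 32 bits for four lookups.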

Outline Introduction Related work on packet classification Grouper Performance Analysis Empirical Evaluation Conclusions & Future Work 29/42

Implementation & Setup
Prototype in about 1,000 lines of C
Implemented for x86_64 processor
Experiments run on commodity Dell laptop, 2 GHz Core 2 Duo, 4 GB RAM
Tested on minimal install of Arch Linux
30/42

Values Tested
Tested relevant bit values (b):
– 32, 104, 320, and 12,000
Tested number of rules (n):
– 100, 1K, 10K, 100K, 1 million
Didn’t test the combination of b = 12K and n = 1M because it would require too much memory (minimum of 3 GB and quickly increasing from there)
31/42
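
The 3 GB minimum quoted above can be checked by hand. The arithmetic below assumes the smallest configuration is one bit per group (t = b), so each of the b tables has 2 entries of n bits, and it takes 1 GB as $10^9$ bytes; word padding and other overhead are ignored.

```latex
\begin{aligned}
M_{\min} &= 2 \cdot b \cdot n \ \text{bits} \\
         &= 2 \times 12{,}000 \times 1{,}000{,}000 \ \text{bits} \\
         &= 2.4 \times 10^{10} \ \text{bits} \\
         &= 3 \times 10^{9} \ \text{bytes} \approx 3\ \text{GB}
\end{aligned}
```

Any grouping with more than two bits per group has more entries in total, which is why the requirement grows from this floor.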

Max and Min Classifier Throughputs 32/42

Max and Min Pre-Processing Time 33/42

Throughputs for 1K Rules 34/42

Throughputs for 10K Rules 35/42

Throughputs for 100K Rules 36/42

Throughputs for 320 bits Classified, 100K Rules 37/42

Throughputs for 12K Bits Classified, 10K Rules 38/42

Outline Introduction Related work on packet classification Grouper Performance Analysis Empirical Evaluation Conclusions & Future Work 39/42

Summary
Grouper classifies packets according to arbitrary bitmask rules
Grouper can trade time for space efficiency as needed [1 < t ≤ b]
– Classification time: $O(t \cdot n/W)$
– Memory use: $O(2^{b/t} \cdot t \cdot n)$
Grouper gets good performance even on commodity hardware and large rule sets
40/42

Future Work
We are extending Grouper to handle range patterns directly
This can be done either through expansion of range patterns to bitmask patterns, or through grouping all bits of the range into the same table
We are also extending Grouper to handle rule-set updates while it is running
This is an interesting challenge for an algorithm that relies heavily on precomputation
41/42
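
One way to expand a range into bitmask patterns is the classic prefix expansion, sketched below. This is a standard textbook technique shown for illustration only; it is not necessarily the expansion the authors have in mind, and it is not the Rottenstreich et al. conversion mentioned earlier. The name `range_to_masks` is hypothetical.

```python
def range_to_masks(lo: int, hi: int, bits: int) -> list[str]:
    """Cover the integer range [lo, hi] with ternary prefix masks.

    Each mask is a fixed prefix followed by '?' characters, so it
    matches one aligned power-of-two block inside the range.
    """
    if not 0 <= lo <= hi < (1 << bits):
        raise ValueError("need 0 <= lo <= hi < 2**bits")
    masks = []
    while lo <= hi:
        # Largest aligned block starting at lo ...
        size = (lo & -lo) if lo else (1 << bits)
        # ... shrunk until it no longer overshoots hi.
        while size > hi - lo + 1:
            size >>= 1
        k = size.bit_length() - 1  # number of don't-care bits
        prefix = format(lo >> k, f"0{bits - k}b") if bits > k else ""
        masks.append(prefix + "?" * k)
        lo += size
    return masks


# The range rule from the related-work slide: dest_port = [1024 – 65535].
for mask in range_to_masks(1024, 65535, 16):
    print(mask)
```

Each resulting mask could then be loaded as an ordinary bitmask rule that maps to the same class as the original range.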

Thanks/Questions? 42/42

Extra Slides 43/42

Exact Memory Usage Grouper’s exact memory usage is given by 44/42