Multi-dimensional Packet Classification on FPGA: 100Gbps and Beyond

Slides:



Advertisements
Similar presentations
VCRIB: Virtual Cloud Rule Information Base Masoud Moshref, Minlan Yu, Abhishek Sharma, Ramesh Govindan HotCloud 2012.
Advertisements

Balajee Vamanan, Gwendolyn Voskuilen, and T. N. Vijaykumar School of Electrical & Computer Engineering SIGCOMM 2010.
A HIGH-PERFORMANCE IPV6 LOOKUP ENGINE ON FPGA Author : Thilan Ganegedara, Viktor Prasanna Publisher : FPL 2013.
Fast Firewall Implementation for Software and Hardware-based Routers Lili Qiu, Microsoft Research George Varghese, UCSD Subhash Suri, UCSB 9 th International.
HybridCuts: A Scheme Combining Decomposition and Cutting for Packet Classification Author: Wenjun Li, Xianfeng Li Publisher: 2013 IEEE 21 st Annual Symposium.
Hybrid Data Structure for IP Lookup in Virtual Routers Using FPGAs Authors: Oĝuzhan Erdem, Hoang Le, Viktor K. Prasanna, Cüneyt F. Bazlamaçcı Publisher:
A Ternary Unification Framework for Optimizing TCAM-Based Packet Classification Systems Author: Eric Norige, Alex X. Liu, and Eric Torng Publisher: ANCS.
Multithreaded FPGA Acceleration of DNA Sequence Mapping Edward Fernandez, Walid Najjar, Stefano Lonardi, Jason Villarreal UC Riverside, Department of Computer.
A Memory-Efficient Reconfigurable Aho-Corasick FSM Implementation for Intrusion Detection Systems Authors: Seongwook Youn and Dennis McLeod Presenter:
Pipelined Parallel AC-based Approach for Multi-String Matching Department of Computer Science and Information Engineering National Cheng Kung University,
1 Packet Classification Algorithms: From Theory to Practice Author: Yaxuan Qi, Lianghong Xu, Baohua Yang, Yibo Xue, and Jun Li Publisher: IEEE INFOCOM.
Beyond TCAMs: An SRAM-based Parallel Multi-Pipeline Architecture for Terabit IP Lookup Author: Weirong Jiang ViktorK.Prasanna Publisher: Infocom 08 Present:
1 On Constructing Efficient Shared Decision Trees for Multiple Packet Filters Author: Bo Zhang T. S. Eugene Ng Publisher: IEEE INFOCOM 2010 Presenter:
1 Towards Green Routers: Depth- Bounded Multi-Pipeline Architecture for Power-Efficient IP Lookup Author: Weirong Jiang Viktor K. Prasanna Publisher: Performance,
1 Scalable high-throughput SRAM-based architecture for IP-lookup using FPGA Author: Hoang Le; Weirong Jiang; Prasanna, V.K.; Publisher: FPL Field.
Efficient IP-Address Lookup with a Shared Forwarding Table for Multiple Virtual Routers Author: Jing Fu, Jennifer Rexford Publisher: ACM CoNEXT 2008 Presenter:
1 Multi-Terabit IP Lookup Using Parallel Bidirectional Pipelines Author: Weirong Jiang Viktor K. Prasanna Publisher: ACM 2008 Presenter: Po Ting Huang.
1 Energy Efficient Packet Classification Hardware Accelerator Alan Kennedy, Xiaojun Wang HDL Lab, School of Electronic Engineering, Dublin City University.
1 OC-3072 Packet Classification Using BDDs and Pipelined SRAMs Author: Amit Prakash, Adnan Aziz Publisher: Hot Interconnects 9, Presenter: Hsin-Mao.
HybridCuts: A Scheme Combining Decomposition and Cutting for Packet Classification Wenjun Li Xianfeng Li School of Electronic and Computer Engineering.
Networking Virtualization Using FPGAs Russell Tessier, Deepak Unnikrishnan, Dong Yin, and Lixin Gao Reconfigurable Computing Group Department of Electrical.
Block Permutations in Boolean Space to Minimize TCAM for Packet Classification Authors: Rihua Wei, Yang Xu, H. Jonathan Chao Publisher: IEEE INFOCOM,2012.
Xinming Chen, Zhen Chen, Beipeng Mu, Lingyun Ruan, Jinli Meng Towards High-performance IPsec on Cavium OCTEON Platform Research Institute of Information.
(TPDS) A Scalable and Modular Architecture for High-Performance Packet Classification Authors: Thilan Ganegedara, Weirong Jiang, and Viktor K. Prasanna.
LayeredTrees: Most Specific Prefix based Pipelined Design for On-Chip IP Address Lookups Author: Yeim-Kuau Chang, Fang-Chen Kuo, Han-Jhen Guo and Cheng-Chien.
Multi-dimensional Packet Classification on FPGA 100 Gbps and Beyond Author: Yaxuan Qi, Jeffrey Fong, Weirong Jiang, Bo Xu, Jun Li, Viktor Prasanna Publisher:
Timothy Whelan Supervisor: Mr Barry Irwin Security and Networks Research Group Department of Computer Science Rhodes University Hardware based packet filtering.
GLOBECOM (Global Communications Conference), 2012
Wire Speed Packet Classification Without TCAMs ACM SIGMETRICS 2007 Qunfeng Dong (University of Wisconsin-Madison) Suman Banerjee (University of Wisconsin-Madison)
1 Towards Practical Architectures for SRAM-based Pipelined Lookup Engines Author: Weirong Jiang, Viktor K. Prasanna Publisher: INFOCOM 2010 Presenter:
1. Outline Introduction Related work on packet classification Grouper Performance Analysis Empirical Evaluation Conclusions 2/42.
1 Memory-Efficient and Scalable Virtual Routers Using FPGA Author: Hoang Le, Thilan Ganegedara and Viktor K. Prasanna Publisher: ACM/SIGDA FPGA '11 Presenter:
StrideBV: Single chip 400G+ packet classification Author: Thilan Ganegedara, Viktor K. Prasanna Publisher: HPSR 2012 Presenter: Chun-Sheng Hsueh Date:
A Smart Pre-Classifier to Reduce Power Consumption of TCAMs for Multi-dimensional Packet Classification Yadi Ma, Suman Banerjee University of Wisconsin-Madison.
Performance Analysis of Packet Classification Algorithms on Network Processors Deepa Srinivasan, IBM Corporation Wu-chang Feng, Portland State University.
Memory-Efficient IPv4/v6 Lookup on FPGAs Using Distance-Bounded Path Compression Author: Hoang Le, Weirong Jiang and Viktor K. Prasanna Publisher: IEEE.
CS 740: Advanced Computer Networks IP Lookup and classification Supplemental material 02/05/2007.
Author : Weirong Jiang, Yi-Hua E. Yang, and Viktor K. Prasanna Publisher : IPDPS 2010 Presenter : Jo-Ning Yu Date : 2012/04/11.
Memory-Efficient and Scalable Virtual Routers Using FPGA Department of Computer Science and Information Engineering, National Cheng Kung University, Tainan,
Updating Designed for Fast IP Lookup Author : Natasa Maksic, Zoran Chicha and Aleksandra Smiljani´c Conference: IEEE High Performance Switching and Routing.
Author: Weirong Jiang and Viktor K. Prasanna Publisher: ACM Symposium on Parallel Algorithms and Architectures, SPAA 2009 Presenter: Chin-Chung Pan Date:
Parallel tree search: An algorithmic approach for multi- field packet classification Authors: Derek Pao and Cutson Liu. Publisher: Computer communications.
Packet Classification Using Multidimensional Cutting Sumeet Singh (UCSD) Florin Baboescu (UCSD) George Varghese (UCSD) Jia Wang (AT&T Labs-Research) Reviewed.
Author: Weirong Jiang and Viktor K. Prasanna Publisher: The 18th International Conference on Computer Communications and Networks (ICCCN 2009) Presenter:
Author: Weirong Jiang, Viktor K. Prasanna Publisher: th IEEE International Conference on Application-specific Systems, Architectures and Processors.
Authors : Baohua Yang, Jeffrey Fong, Weirong Jiang, Yibo Xue, and Jun Li. Publisher : IEEE TRANSACTIONS ON COMPUTERS Presenter : Chai-Yi Chu Date.
Optimizing Packet Lookup in Time and Space on FPGA Author: Thilan Ganegedara, Viktor Prasanna Publisher: FPL 2012 Presenter: Chun-Sheng Hsueh Date: 2012/11/28.
Introduction to Intrusion Detection Systems. All incoming packets are filtered for specific characteristics or content Databases have thousands of patterns.
Practical Multituple Packet Classification Using Dynamic Discrete Bit Selection Author: Baohua Yang, Fong J., Weirong Jiang, Yibo Xue, Jun Li Publisher:
Hierarchical Hybrid Search Structure for High Performance Packet Classification Authors : O˜guzhan Erdem, Hoang Le, Viktor K. Prasanna Publisher : INFOCOM,
Optimizing Interconnection Complexity for Realizing Fixed Permutation in Data and Signal Processing Algorithms Ren Chen, Viktor K. Prasanna Ming Hsieh.
FPGA IMPLEMENTATION OF TCAM Under the guidance of Dr. Hraziia Presented By Malvika ( ) Department of Electronics and Communication Engineering.
Author: Yun R. Qu, Shijie Zhou, and Viktor K. Prasanna Publisher:
IP Routers – internal view
High-throughput Online Hash Table on FPGA
Toward Advocacy-Free Evaluation of Packet Classification Algorithms
Scalable Memory-Less Architecture for String Matching With FPGAs
A SRAM-based Architecture for Trie-based IP Lookup Using FPGA
Resource Allocation for Distributed Streaming Applications
Using decision trees to improve signature-based intrusion detection
High-performance router/switch architecture 高效能路由器/交換器的 架構與設計
Hash Functions for Network Applications (II)
Power-efficient range-match-based packet classification on FPGA
Large-scale Packet Classification on FPGA
A Trie Merging Approach with Incremental Updates for Virtual Routers
IXP C Programming Language
Clustered Hierarchical Search Structure for Large-Scale Packet Classification on FPGA Publisher : Field Programmable Logic and Applications, 2011 Author.
A SRAM-based Architecture for Trie-based IP Lookup Using FPGA
Authors: Ding-Yuan Lee, Ching-Che Wang, An-Yeu Wu Publisher: 2019 VLSI
Hazem Hamed, Adel El-Atawy, Ehab Al-Shaer
Presentation transcript:

Multi-dimensional Packet Classification on FPGA: 100Gbps and Beyond Yaxuan Qi, Jeffrey Fong, Weirong Jiang, Bo Xu, Jun Li, Viktor Prasanna

Outline Background and Motivation The packet classification problem Existing solutions & Challenges Algorithm and Architecture Design HyperSplit Mapping into hardware & Optimizations Performance Evaluation Test Setup Experimental Results Conclusion

Outline Background and Motivation The packet classification problem Existing solutions & Challenges Algorithm and Architecture Design HyperSplit Mapping into hardware & Optimizations Performance Evaluation Test Setup Experimental Results Conclusion

Packet Classification Problem To identify and associate each packet to a specific rule May match multiple rules Used for: Routing Firewall/ Intrusion Detection System Quality of Service

Existing Solutions SRAM Based Software running on general hardware Different algorithms gives different search speed and/or number of rules Advantage: Price (generally) # of Rules Disadvantage Speed TCAM Based Dedicated packet matching hardware Different hardware architecture gives different speed Advantage Speed Disadvantage Price Energy consumption Chip size No support for Range Range to Prefix Conversion

Existing Solutions Search Method Algorithms RFC Decomposition HSM SRAM based Methods Decision Tree HiCut HyperSplit

Existing Solutions Search Method Algorithms RFC Decomposition HSM SRAM based Methods Decision Tree HiCut HyperSplit

Challenges & Goals Memory Usage High Performance On-the-fly update Needs to be memory efficient that can support large rulesets High Performance Requires high throughput and deterministic performance On-the-fly update To allow rules to be changed and updated without downtime

Outline Background and Motivation The packet classification problem Existing solutions & Challenges Algorithm and Architecture Design HyperSplit Mapping into hardware & Optimizations Performance Evaluation Test Setup Experimental Results Conclusion

HyperSplit Memory-efficient packet classification algorithm Uses 1/10 (10%) of the memory that other comparable algorithms requires Optimized k-d tree data structure Combines the advantages of both parallel search and tree search algorithms Uses heuristics to select the most efficient splitting point on a specific field

Example 11 R4 10 R2 R3 01 R5 00 R1(R2) 00 01 10 11

Example Lv-1 11 R4 10 R2 R3 01 R5 00 R1 00 01 10 11 X,01 L R X<=01

Example Lv-1 11 R4 10 R2 R3 01 R5 Lv-2 00 R1 00 01 10 11 X,01 Y,00 R

Example Lv-1 Lv-2 11 R4 10 R2 R3 01 R5 Lv-2 00 R1 00 01 10 11 X,01 Y,00 X,10 01 R5 Lv-2 Y<=00 Y>00 X<=10 X>10 00 R1 00 01 10 11 R1 R2 R3 RR

Example Lv-1 Lv-2 11 R4 Lv-3 10 R2 R3 01 R5 Lv-2 00 R1 00 01 10 11 Y,00 X,10 01 R5 Lv-2 Y<=00 Y>00 X<=10 X>10 00 R1 00 01 10 11 R1 R2 R3 Y,10 Y<=10 Y>10 R5 R4

Mapping Decision into Hardware X,01 Y,00 X,10 R1 R2 R3 Y,10 R5 R4

Mapping Decision into Hardware X,01 Y,00 X,10 R1 R2 R3 Y,10 R5 R4

Mapping Decision into Hardware INPUT PACKET STAGE 1 X,01 STAGE 2 Y,00 X,10 STAGE 3 R1 R2 R3 Y,10 STAGE 4 R5 R4 MATCHED RULE

Hardware Implementation STAGE n

Architecture Optimization (1) Node Merging – Pipeline Depth Reduction @addr0 d1,v1 addr1 @addr0 d1,d2,d3 v1,v2,v3 addr1 @addr1 d1,v1 addr2 @addr1+1 d1,v1 addr3 @addr2 child1 @addr2+1 child2 @addr3 child1 @addr3+1 child2 @addr1 child1 @addr1+1 child2 @addr1+2 child3 @addr1+3 child4

Architecture Optimization (2) Controlled Block RAM Allocation Different rulesets will result in different memory usage per stage Limits the size of a certain stage by pushing leafs to lower levels of the pipeline

Architecture Optimization (3) Dual-search pipeline take advantage of dual-port BRAM

Outline Background and Motivation The packet classification problem Existing solutions & Challenges Algorithm and Architecture Design HyperSplit Mapping into hardware & Optimizations Performance Evaluation Test Setup Experimental Results Conclusion

Test Setup Tested with a publicly available ruleset from Washington University Used the ACL 100, 1K, 5K, 10K rulesets Design is implemented on a Xilinx Virtex-6 Model: VC6VSX475T Containing 7,640Kb Distributed RAM and 38,304Kb Block RAM Using Xilinx ISE 11.5 tool

Algorithm Evaluation Node-merging Optimization Reduce tree height (pipeline depth) by almost 50% with minimal memory overhead!

Algorithm Evaluation Leaf-pushing Optimization

FPGA Performance

FPGA Performance

Outline Background and Motivation The packet classification problem Existing solutions & Challenges Algorithm and Architecture Design HyperSplit Mapping into hardware & Optimizations Performance Evaluation Test Setup Experimental Results Conclusion

Conclusion FPGA provides a flexible and excellent solution to the packet classification problem HyperSplit algorithm is suited to and provides an efficient mapping to hardware 3 optimizations used to reduce tree length, constraint the memory usage of each stage and improve performance Consume less resource than other FPGA-based solutions and much faster than multicore based solutions