(TPDS) A Scalable and Modular Architecture for High-Performance Packet Classification Authors: Thilan Ganegedara, Weirong Jiang, and Viktor K. Prasanna.

Slides:



Advertisements
Similar presentations
Enhanced matrix multiplication algorithm for FPGA Tamás Herendi, S. Roland Major UDT2012.
Advertisements

Multi-dimensional Packet Classification on FPGA: 100Gbps and Beyond
A Scalable and Reconfigurable Search Memory Substrate for High Throughput Packet Processing Sangyeun Cho and Rami Melhem Dept. of Computer Science University.
A HIGH-PERFORMANCE IPV6 LOOKUP ENGINE ON FPGA Author : Thilan Ganegedara, Viktor Prasanna Publisher : FPL 2013.
HybridCuts: A Scheme Combining Decomposition and Cutting for Packet Classification Author: Wenjun Li, Xianfeng Li Publisher: 2013 IEEE 21 st Annual Symposium.
Hybrid Data Structure for IP Lookup in Virtual Routers Using FPGAs Authors: Oĝuzhan Erdem, Hoang Le, Viktor K. Prasanna, Cüneyt F. Bazlamaçcı Publisher:
Authors: Raphael Polig, Kubilay Atasu, and Christoph Hagleitner Publisher: FPL, 2013 Presenter: Chia-Yi, Chu Date: 2013/10/30 1.
A Memory-Efficient Reconfigurable Aho-Corasick FSM Implementation for Intrusion Detection Systems Authors: Seongwook Youn and Dennis McLeod Presenter:
Pipelined Parallel AC-based Approach for Multi-String Matching Department of Computer Science and Information Engineering National Cheng Kung University,
400 Gb/s Programmable Packet Parsing on a Single FPGA Authors : Michael Attig 、 Gordon Brebner Publisher: 2011 Seventh ACM/IEEE Symposium on Architectures.
1 MIPS Extension for a TCAM Based Parallel Architecture for Fast IP Lookup Author: Oğuzhan ERDEM Cüneyt F. BAZLAMAÇCI Publisher: ISCIS 2009 Presenter:
Fast Filter Updates for Packet Classification using TCAM Authors: Haoyu Song, Jonathan Turner. Publisher: GLOBECOM 2006, IEEE Present: Chen-Yu Lin Date:
1 Scalable high-throughput SRAM-based architecture for IP-lookup using FPGA Author: Hoang Le; Weirong Jiang; Prasanna, V.K.; Publisher: FPL Field.
1 Memory-Efficient 5D Packet Classification At 40 Gbps Authors: Ioannis Papaefstathiou, and Vassilis Papaefstathiou Publisher: IEEE INFOCOM 2007 Presenter:
Parallel IP Lookup using Multiple SRAM-based Pipelines Authors: Weirong Jiang and Viktor K. Prasanna Presenter: Yi-Sheng, Lin ( 林意勝 ) Date:
Performance Evaluation of IPv6 Packet Classification with Caching Author: Kai-Yuan Ho, Yaw-Chung Chen Publisher: ChinaCom 2008 Presenter: Chen-Yu Chaug.
The Design of Improved Dynamic AES and Hardware Implementation Using FPGA 游精允.
An Efficient IP Lookup Architecture with Fast Update Using Single-Match TCAMs Author: Jinsoo Kim, Junghwan Kim Publisher: WWIC 2008 Presenter: Chen-Yu.
GallagherP188/MAPLD20041 Accelerating DSP Algorithms Using FPGAs Sean Gallagher DSP Specialist Xilinx Inc.
Juanjo Noguera Xilinx Research Labs Dublin, Ireland Ahmed Al-Wattar Irwin O. Irwin O. Kennedy Alcatel-Lucent Dublin, Ireland.
 Author: Tsern-Huei Lee  Publisher: 2009 IEEE Transation on Computers  Presenter: Yuen-Shuo Li  Date: 2013/09/18 1.
Block Permutations in Boolean Space to Minimize TCAM for Packet Classification Authors: Rihua Wei, Yang Xu, H. Jonathan Chao Publisher: IEEE INFOCOM,2012.
High-Performance Packet Classification on GPU Author: Shijie Zhou, Shreyas G. Singapura and Viktor K. Prasanna Publisher: HPEC 2014 Presenter: Gang Chi.
Efficient FPGA Implementation of QR
High-Level Interconnect Architectures for FPGAs An investigation into network-based interconnect systems for existing and future FPGA architectures Nick.
LayeredTrees: Most Specific Prefix based Pipelined Design for On-Chip IP Address Lookups Author: Yeim-Kuau Chang, Fang-Chen Kuo, Han-Jhen Guo and Cheng-Chien.
Multi-dimensional Packet Classification on FPGA 100 Gbps and Beyond Author: Yaxuan Qi, Jeffrey Fong, Weirong Jiang, Bo Xu, Jun Li, Viktor Prasanna Publisher:
Timothy Whelan Supervisor: Mr Barry Irwin Security and Networks Research Group Department of Computer Science Rhodes University Hardware based packet filtering.
Decimal Multiplier on FPGA using Embedded Binary Multipliers Authors: H. Neto and M. Vestias Conference: Field Programmable Logic and Applications (FPL),
Author: Haoyu Song, Fang Hao, Murali Kodialam, T.V. Lakshman Publisher: IEEE INFOCOM 2009 Presenter: Chin-Chung Pan Date: 2009/12/09.
GLOBECOM (Global Communications Conference), 2012
Multi-Field Range Encoding for Packet Classification in TCAM Author: Yeim-Kuan Chang, Chun-I Lee and Cheng-Chien Su Publisher: INFOCOM 2011 Presenter:
Reconfigurable Computing Using Content Addressable Memory (CAM) for Improved Performance and Resource Usage Group Members: Anderson Raid Marie Beltrao.
Swankoski MAPLD 2005 / B103 1 Dynamic High-Performance Multi-Mode Architectures for AES Encryption Eric Swankoski Naval Research Lab Vijay Narayanan Penn.
1 Towards Practical Architectures for SRAM-based Pipelined Lookup Engines Author: Weirong Jiang, Viktor K. Prasanna Publisher: INFOCOM 2010 Presenter:
An Efficient FPGA Implementation of IEEE e LDPC Encoder Speaker: Chau-Yuan-Yu Advisor: Mong-Kai Ku.
1 Memory-Efficient and Scalable Virtual Routers Using FPGA Author: Hoang Le, Thilan Ganegedara and Viktor K. Prasanna Publisher: ACM/SIGDA FPGA '11 Presenter:
Author : Ioannis Sourdis, Vasilis Dimopoulos, Dionisios Pnevmatikatos and Stamatis Vassiliadis Publisher : ANCS’06 Presenter : Zong-Lin Sie Date : 2011/01/05.
Task Graph Scheduling for RTR Paper Review By Gregor Scott.
StrideBV: Single chip 400G+ packet classification Author: Thilan Ganegedara, Viktor K. Prasanna Publisher: HPSR 2012 Presenter: Chun-Sheng Hsueh Date:
DBS A Bit-level Heuristic Packet Classification Algorithm for High Speed Network Author : Baohua Yang, Xiang Wang, Yibo Xue, Jun Li Publisher : th.
TCAM –BASED REGULAR EXPRESSION MATCHING SOLUTION IN NETWORK Phase-I Review Supervised By, Presented By, MRS. SHARMILA,M.E., M.ARULMOZHI, AP/CSE.
Performance Analysis of Packet Classification Algorithms on Network Processors Deepa Srinivasan, IBM Corporation Wu-chang Feng, Portland State University.
Author : Weirong Jiang, Yi-Hua E. Yang, and Viktor K. Prasanna Publisher : IPDPS 2010 Presenter : Jo-Ning Yu Date : 2012/04/11.
Memory-Efficient and Scalable Virtual Routers Using FPGA Department of Computer Science and Information Engineering, National Cheng Kung University, Tainan,
Updating Designed for Fast IP Lookup Author : Natasa Maksic, Zoran Chicha and Aleksandra Smiljani´c Conference: IEEE High Performance Switching and Routing.
Author: Weirong Jiang and Viktor K. Prasanna Publisher: ACM Symposium on Parallel Algorithms and Architectures, SPAA 2009 Presenter: Chin-Chung Pan Date:
A New Class of High Performance FFTs Dr. J. Greg Nash Centar ( High Performance Embedded Computing (HPEC) Workshop.
Range Enhanced Packet Classification Design on FPGA Author: Yeim-Kuan Chang, Chun-sheng Hsueh Publisher: IEEE Transactions on Emerging Topics in Computing.
Author: Weirong Jiang and Viktor K. Prasanna Publisher: IEEE TRANSACTIONS ON COMPUTERS, 2012 Presenter: Li-Hsien, Hsu Data: 10/03/
Author: Weirong Jiang and Viktor K. Prasanna Publisher: The 18th International Conference on Computer Communications and Networks (ICCCN 2009) Presenter:
1 Space-Efficient TCAM-based Classification Using Gray Coding Authors: Anat Bremler-Barr and Danny Hendler Publisher: IEEE INFOCOM 2007 Present: Chen-Yu.
Author: Weirong Jiang, Viktor K. Prasanna Publisher: th IEEE International Conference on Application-specific Systems, Architectures and Processors.
Authors : Baohua Yang, Jeffrey Fong, Weirong Jiang, Yibo Xue, and Jun Li. Publisher : IEEE TRANSACTIONS ON COMPUTERS Presenter : Chai-Yi Chu Date.
Optimizing Packet Lookup in Time and Space on FPGA Author: Thilan Ganegedara, Viktor Prasanna Publisher: FPL 2012 Presenter: Chun-Sheng Hsueh Date: 2012/11/28.
Haiyang Jiang, Gaogang Xie, Kave Salamatian and Laurent Mathy
Introduction to Intrusion Detection Systems. All incoming packets are filtered for specific characteristics or content Databases have thousands of patterns.
Practical Multituple Packet Classification Using Dynamic Discrete Bit Selection Author: Baohua Yang, Fong J., Weirong Jiang, Yibo Xue, Jun Li Publisher:
Hierarchical Hybrid Search Structure for High Performance Packet Classification Authors : O˜guzhan Erdem, Hoang Le, Viktor K. Prasanna Publisher : INFOCOM,
400 Gb/s Programmable Packet Parsing on a Single FPGA Author: Michael Attig 、 Gordon Brebner Publisher: ANCS 2011 Presenter: Chun-Sheng Hsueh Date: 2013/03/27.
Author: Yun R. Qu, Shijie Zhou, and Viktor K. Prasanna Publisher:
2018/4/27 PiDFA : A Practical Multi-stride Regular Expression Matching Engine Based On FPGA Author: Jiajia Yang, Lei Jiang, Qiu Tang, Qiong Dai, Jianlong.
High-throughput Online Hash Table on FPGA
Toward Advocacy-Free Evaluation of Packet Classification Algorithms
Scalable Memory-Less Architecture for String Matching With FPGAs
The performance requirements for DSP applications continue to grow and the traditional solutions do not adequately address this new challenge Paradigm.
Power-efficient range-match-based packet classification on FPGA
Large-scale Packet Classification on FPGA
Clustered Hierarchical Search Structure for Large-Scale Packet Classification on FPGA Publisher : Field Programmable Logic and Applications, 2011 Author.
A SRAM-based Architecture for Trie-based IP Lookup Using FPGA
Presentation transcript:

(TPDS) A Scalable and Modular Architecture for High-Performance Packet Classification Authors: Thilan Ganegedara, Weirong Jiang, and Viktor K. Prasanna Publisher: IEEE Presenter: 楊皓中 Date: 2013/10/24

Introduction The performance of today’s packet classification solutions depends on the characteristics of rulesets. ◦ the solution generated for a particular ruleset may not necessarily yield the same performance for a different ruleset with different features we propose a novel modular Bit-Vector (BV) based architecture to perform high-speed packet classification on Field Programmable Gate Array (FPGA). We introduce an algorithm named StrideBV and modularize the BV architecture to achieve better scalability than traditional BV methods.

contributions in this work as follows A memory-efficient bit vector -based packet classification algorithm suitable for hardware implementation; A modular architecture to perform large-scale highspeed (100+ Gbps) packet classification on a single chip; Inclusion of explicit range-search in the modular architecture to eliminate the expensive range-to-prefix conversion; Support for up to 28K rule classifiers using on chip distributed and block Random Access Memory (RAM) on a Xilinx Virtex 7 FPGA; Deterministic performance for any given ruleset due to ruleset-feature independence.

RULESET-FEATURE INDEPENDENT PACKET CLASSIFICATION We develop our complete solution in two phases, which are 1) StrideBV algorithm In this work, we generalize the Field Split Bit-Vector(FSBV) algorithm and extend the use of it to build a ruleset-feature independent packet classification solution, named StrideBV. 2) integration of range search and modularization of StrideBV architecture

StrideBV algorithm

Permutations function ◦ accepts k bits of the rule and generates array of size 2k (height)×k (width) stride size of k = 2 rule value for stride is 0 ∗. Output {11, 11, 01, 01}, match result of {00, 01, 10, 11}

StrideBV algorithm

Multi-match to Highest-Priority Match

Modular BV Approach each individual bit-level operation is completely independent of the rest of the bits in the BV. ◦ This allows us to partition the BV without affecting the operation of the BV approach benefit of partitioning ◦ we no longer need to load a N bit BV at each pipeline stage, but rather, a N/P bit BV ◦ where P is the number of partitions ◦ Partitioning is done based on the rule priority To better explain the severeness of the unpartitioned BV approach ◦ 2000 rule, stride k=3, pipeline length = 104/3 =35,operating at 200 MHZ ◦ each pipeline stage will be requesting a 2000 bit wide word at a 200 million requests per second rate. ◦ This translates to an on-chip memory bandwidth of 14 Tbps. On current FPGAs, having such high bandwidth is not possible.

Multi-level Priority Encoding

Scalable Range Matching on Hardware

StrideBV Architecture The output bitvector of stage memory (BVM) the bit-vector generated from the previous stage (BVP) the resultant bit-vector (BVR) (BVM) AND (BVP) = BVR

Range Search Integration

Modular Architecture This causes the packet classification latency to increase linearly proportional to the header length. This can be an undesirable feature especially for applications that demand low latency operation, such as multimedia, gaming and data center environments. If stride k = 4, latency = 26 cycle

Modular Architecture low latency variant of the proposed architecture since the order of search does not affect the final result If stride k = 4, latency = 11 cycle

PERFORMANCE EVALUATION evaluate the performance of both serial and parallel versions of StrideBV Xilinx Virtex T FPGA memory consumption throughput packet latency power consumption

Memory Requirement

Throughput packet size which is 40 Bytes for IPv4 packets.(In order to report the worst case) strides of size 4 can be perfectly aligned with header boundaries The advantage of modularization is that instead of storing and processing massively wide bit-vectors, the bit-vector length inside a module is restricted to a size such that the performance is not significantly deteriorated due to bit-vector length exploit the dual-ported feature of FPGA ◦ accept two packet headers per cycle.

Packet Latency

Power Efficiency Since the dynamic power is proportional to the amount of resources consumed, we evaluated the number of logic slices consumed by the logic and memory components of the architecture.