Large-scale Packet Classification on FPGA

Presentation transcript:

Large-scale Packet Classification on FPGA
2019/5/29
Authors: Shijie Zhou, Yun R. Qu, Viktor K. Prasanna
Publisher: Application-specific Systems, Architectures and Processors (ASAP), 2015 IEEE 26th International Conference on
Presenter: Kai-Hsun Li
Date: 2015/9/23
Department of Computer Science and Information Engineering, National Cheng Kung University, Taiwan R.O.C. CSIE CIAL Lab

Introduction
Packet classification faces two challenges: (1) the data rate of network traffic keeps increasing; (2) rule sets are becoming very large.
In this paper, we propose an FPGA-based packet classification engine for large rule sets. Experimental results show that our design can achieve a throughput of 147 Million Packets Per Second (MPPS), while supporting up to 256K rules on a state-of-the-art FPGA.
National Cheng Kung University CSIE Computer & Internet Architecture Lab (CIAL)

Proposed Scheme
This is the overall architecture; it is explained in detail in the following slides.

Proposed Scheme(1/7) - Range Tree and Rule ID Set
We exploit a range-tree to search each packet header field. For a field requiring range match, the major steps are:
(1) Flatten all the ranges into non-overlapping subranges.
(2) Construct a balanced binary search tree using the subrange boundaries.
In our approach, each leaf node of the range-tree stores a Rule ID Set (RIDS) with the following properties:
(1) The rule IDs are all distinct.
(2) The rule IDs are in ascending order.
A Rule ID Set stored in each leaf is additionally defined here; these 2 properties make the later merge step easier. "The rule IDs are all distinct" means that no rule ID appears more than once within a single RIDS.
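The two construction steps and the two RIDS properties can be sketched in software as follows. This is only an illustrative sketch, not the paper's hardware design: the function names, the (rule_id, low, high) tuple layout, and the three example rules are assumptions made here, and a binary search over the sorted boundaries stands in for the balanced binary search tree.

```python
from bisect import bisect_right

def build_range_index(rules):
    """rules: list of (rule_id, low, high) with inclusive integer bounds.

    Returns (starts, rids): starts[j] is the first value of subrange j,
    and rids[j] is the Rule ID Set (RIDS) stored for that subrange.
    """
    # Step 1: flatten the ranges into non-overlapping subranges. Every range
    # start, and the value just past every range end, opens a new subrange.
    starts = sorted({low for _, low, _ in rules} |
                    {high + 1 for _, _, high in rules})
    rids = []
    for start in starts:
        # A rule covers the whole subrange iff it covers its first value.
        # Sorting a set keeps the IDs distinct and in ascending order.
        rids.append(sorted({rid for rid, low, high in rules
                            if low <= start <= high}))
    return starts, rids

def lookup(starts, rids, value):
    # Step 2: binary search over the sorted boundaries; this performs the
    # same comparisons as walking a balanced binary search tree built on them.
    j = bisect_right(starts, value) - 1
    return rids[j] if j >= 0 else []

# Hypothetical rules on one field: (rule_id, low, high)
rules = [(0, 2, 9), (1, 5, 12), (2, 0, 3)]
starts, rids = build_range_index(rules)
print(starts)                    # [0, 2, 4, 5, 10, 13]
print(lookup(starts, rids, 7))   # [0, 1]
print(lookup(starts, rids, 3))   # [0, 2]
```

Each field gets its own index, so a packet with M header fields yields M RIDSs, one per field.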

Proposed Scheme(2/7) - Range Tree and Rule ID Set
This is the rule table given in the paper.

Proposed Scheme(3/7) - Range Tree and Rule ID Set
This part is the same as the dynamic Segment Tree covered in training.

Proposed Scheme
This part consists of two components: the bitonic merge and the neighborhood checker.

Proposed Scheme(4/7) - Bitonic Merge
To pipeline the merging phase and improve the overall throughput, we:
(1) Use a bitonic merge to merge any 2 RIDSs.
(2) Merge all the M RIDSs iteratively in pairs.
We denote a bitonic merge network as BM(k), where k is the size of the network. BM(k) has log2(k) stages, and dist(i) is the distance between the two IDs compared in stage i:
if i = 0: dist(i) = k/2
else: dist(i) = dist(i-1)/2
First, the bitonic merge is introduced. It is used here to improve throughput.

Proposed Scheme(5/7) - Bitonic Merge
k = 8, dist(i) = 4
As mentioned earlier, the RIDSs stored in the range tree are in ascending order, but a bitonic merge requires one of the two sequences to be in descending order. One sequence is therefore reversed into descending order before merging. The example here uses {1,2,4,9} and {9,5,3,2}, with k = 8.

Proposed Scheme(5/7) - Bitonic Merge
k = 8: dist(i) = 4, then dist(i) = 2
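The bitonic merge described on these slides can be sketched in software as below. This is a sequential sketch under stated assumptions, not the pipelined hardware network: the function name is hypothetical, both inputs are assumed to have the same power-of-two length, and the comparators of one stage run in a loop rather than in parallel. The inputs are the two sequences from the slide example, with the second one given in its original ascending form.

```python
def bitonic_merge(asc_a, asc_b):
    """Merge two ascending lists of equal power-of-two length, as BM(k) would."""
    assert len(asc_a) == len(asc_b)
    # One input is reversed so that the concatenation is bitonic
    # (ascending, then descending).
    seq = list(asc_a) + list(reversed(asc_b))
    k = len(seq)
    assert k >= 2 and k & (k - 1) == 0, "k must be a power of two"
    dist = k // 2                       # first stage: dist = k/2
    while dist >= 1:                    # log2(k) stages in total
        for i in range(k):
            # Positions whose index has the `dist` bit clear are compared with
            # the partner `dist` positions away; the smaller value stays in front.
            if i & dist == 0 and seq[i] > seq[i + dist]:
                seq[i], seq[i + dist] = seq[i + dist], seq[i]
        dist //= 2                      # each later stage halves the distance
    return seq

# {1,2,4,9} and {9,5,3,2} (the reverse of {2,3,5,9}), k = 8
print(bitonic_merge([1, 2, 4, 9], [2, 3, 5, 9]))
# [1, 2, 2, 3, 4, 5, 9, 9]
```

For k = 8 the loop runs three stages with distances 4, 2, and 1, and IDs present in both inputs end up next to each other in the output.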

Proposed Scheme(6/7) - Neighborhood Checker
The Neighborhood Checker (NC) checks whether its 2 input numbers are equal. The NC finally reports the common numbers.
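A minimal software sketch of the neighborhood check, assuming its input is the sorted output of merging two RIDSs (the function name is hypothetical). Because the rule IDs within one RIDS are all distinct, a value can occur at most twice in the merged sequence, so each common ID shows up as exactly one pair of equal neighbors.

```python
def neighborhood_check(merged):
    """Return the values that sit in two adjacent positions of `merged`."""
    # Compare every element with its right-hand neighbor.
    return [a for a, b in zip(merged, merged[1:]) if a == b]

# Merged output of {1,2,4,9} and {2,3,5,9}
print(neighborhood_check([1, 2, 2, 3, 4, 5, 9, 9]))   # [2, 9]
```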

Proposed Scheme(7/7) - Bitonic Tree
A bitonic merging network along with an NC is only capable of collecting the common numbers of 2 RIDSs.
For a total of M RIDSs (one per field, M fields), we exploit a tree-like merging network, denoted as a bitonic-tree.
Each node in a bitonic-tree consists of a BM(k) and an NC; we denote the node as BMNC(k).
The BM and NC mainly serve to collect the common numbers across the different dimensions. Exploiting the properties of the BM and NC, a tree-like architecture called the bitonic tree is proposed.
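The tree-like reduction over M RIDSs can be sketched as follows. Note the substitution: each BMNC(k) node is stood in for by a plain two-pointer intersection of two ascending lists, not by a hardware bitonic merge network plus NC, so the sketch only illustrates the pairing structure of the tree. The function names, the pass-through of an unpaired RIDS, and the example inputs are assumptions made for illustration.

```python
def intersect_sorted(a, b):
    """Software stand-in for one BMNC node: common IDs of two ascending lists."""
    out, i, j = [], 0, 0
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            out.append(a[i])
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    return out                          # still distinct and ascending

def bitonic_tree(rids_list):
    """Reduce M RIDSs level by level, pairing neighbors at each level."""
    level = [list(r) for r in rids_list]
    if not level:
        return []
    while len(level) > 1:
        nxt = [intersect_sorted(level[i], level[i + 1])
               for i in range(0, len(level) - 1, 2)]
        if len(level) % 2:              # assumption: an unpaired RIDS passes through
            nxt.append(level[-1])
        level = nxt
    return level[0]

# Four hypothetical per-field RIDSs (M = 4)
print(bitonic_tree([[1, 2, 4, 9], [2, 3, 5, 9], [2, 7, 9], [0, 2, 9, 11]]))
# [2, 9]
```

Since every node's output keeps the two RIDS properties (distinct, ascending), it can be fed directly into the next level of the tree.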

Overall Architecture
Each PE stores all the subranges at one tree height i, while the last PE (the one at the leaf level of the range tree) stores the rule IDs of the RIDSs.
A BMNC consists of multiple BM stages and one NC.

EXPERIMENTAL RESULTS(1/7)
Environment
FPGA: Xilinx Virtex 7
Logic Slice: 433,200
I/O pins: 850
BRAM: 51.6MB (on-chip)
Development Tool: Xilinx Vivado 2014.3
This is the experimental environment. The following slides examine, in turn, how the rule set size, the number of elementary intervals, the number of rule IDs per RIDS, and the number of dimensions affect throughput and resource usage.

EXPERIMENTAL RESULTS(2/7)
Throughput and resource utilization for rule sets of different sizes.
Dim = 4, Um = 8K, Sm = 64

EXPERIMENTAL RESULTS(3/7)
Um is the number of elementary intervals. This slide shows the throughput and resource utilization for various numbers of elementary intervals.
N = 256K, Dim = 4, Sm = 64

EXPERIMENTAL RESULTS(4/7)
Sm is the number of rule IDs in an RIDS. This slide shows the throughput and resource utilization for different values of Sm.
N = 256K, Dim = 4, Um = 8K

EXPERIMENTAL RESULTS(5/7)
Effect of the number of dimensions on throughput and resource utilization.
N = 256K, Um = 8K, Sm = 64

EXPERIMENTAL RESULTS(6/7)
Table II uses N = 256K, Um = 8K, Sm = 64.
Table III is a comparison with other approaches.

EXPERIMENTAL RESULTS(7/7)
CPU: AMD Opteron 6278
Core: 16
Frequency: 2.4GHz
L1 cache: 16KB
L2 cache: 2MB
L3 cache: 60MB
This compares the FPGA implementation with the authors' other work, which is implemented on a multi-core processor.