TFA: A Tunable Finite Automaton for Regular Expression Matching Author: Yang Xu, Junchen Jiang, Rihua Wei, Yang Song and H. Jonathan Chao Publisher: ACM/IEEE.

Presentation transcript:

TFA: A Tunable Finite Automaton for Regular Expression Matching
Author: Yang Xu, Junchen Jiang, Rihua Wei, Yang Song and H. Jonathan Chao
Publisher: ACM/IEEE ANCS, 2007
Presenter: Ching-Hsuan Shih
Date: 2014/05/28
Department of Computer Science and Information Engineering, National Cheng Kung University, Taiwan R.O.C.

Outline
Introduction
Motivation
Tunable Finite Automaton (TFA)
Splitting NFA Active State Combinations
State Encoding
Performance Evaluation
National Cheng Kung University CSIE Computer & Internet Architecture Lab

Introduction (1/3)
Network Intrusion Detection System (NIDS): a device or software application that monitors a network for malicious activity. Most intrusion detection systems inspect network packets, system logs, or network flows.
Regular expressions: current rule sets such as Snort, Bro, and many others are replacing fixed strings with the more powerful and expressive regular expressions.

Introduction (2/3)
Deterministic Finite Automatons (DFAs) and Non-deterministic Finite Automatons (NFAs) are two typical representations of regular expressions.
The main problem with DFAs is prohibitive memory usage: the number of states in a DFA scales poorly with the size and number of wildcards in the regular expressions it represents.
An NFA represents regular expressions with much less memory. However, this memory reduction comes at the price of a high and unpredictable memory bandwidth requirement.

Introduction (3/3)
In this paper, we propose the Tunable Finite Automaton (TFA), which has a small (larger than one) but bounded number of active states.
The main idea of TFA is to use a few TFA states to remember the matching status traditionally tracked by a single DFA state.

Motivation (1/4)
Regex:
1. .*a.*b[^a]*c
2. .*d.*e[^d]*f
3. .*g.*h[^g]*i
Alphabet Σ = {a, b, ..., i}
Number of states in DFA: 54
Number of states in NFA: 10
Although the NFA requires much less memory, its memory bandwidth requirement is four times that of the DFA.
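The state counts on this slide can be checked with a short script. The following Python sketch builds an NFA for the three rules and enumerates the reachable NFA active state combinations by subset construction. The state names (O for the shared start state, A to I for progress through the rules) and the helper names are assumptions made for illustration and need not match the figures of the original slides.

```python
ALPHABET = "abcdefghi"


def build_nfa():
    """Return the NFA transition map {(state, char): set_of_next_states}.

    Illustrative naming: O is the shared start state; A/B/C, D/E/F and
    G/H/I track progress through rules 1, 2 and 3 respectively.
    """
    delta = {}

    def add(src, ch, dst):
        delta.setdefault((src, ch), set()).add(dst)

    for ch in ALPHABET:
        add("O", ch, "O")                 # leading .* of every rule
    for first, second, third in ("abc", "def", "ghi"):
        s1, s2, s3 = first.upper(), second.upper(), third.upper()
        add("O", first, s1)               # first literal seen
        add(s1, second, s2)               # second literal seen
        add(s2, third, s3)                # third literal seen: accept
        for ch in ALPHABET:
            add(s1, ch, s1)               # middle .*
            if ch != first:
                add(s2, ch, s2)           # [^x]* excludes the first literal
    return delta


def subset_construction(delta, start="O"):
    """Return every reachable NFA active state combination (one per DFA state)."""
    initial = frozenset([start])
    seen = {initial}
    frontier = [initial]
    while frontier:
        combo = frontier.pop()
        for ch in ALPHABET:
            nxt = frozenset(t for s in combo for t in delta.get((s, ch), ()))
            if nxt not in seen:
                seen.add(nxt)
                frontier.append(nxt)
    return seen


delta = build_nfa()
nfa_states = {s for (s, _) in delta} | {t for ts in delta.values() for t in ts}
dfa_states = subset_construction(delta)
print(len(nfa_states), len(dfa_states))   # 10 54
```

Each rule contributes one of three non-accepting sub-combinations, giving 27 combinations, and at most one rule can be in its accept state at a time, which adds 27 more.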

Motivation (2/4)

Motivation (3/4)

Motivation (4/4)
We have seen that the main reason the DFA has far more states than the corresponding NFA is that the DFA needs one state for each NFA active state combination.
One possible solution is to allow multiple automaton states (bounded by a given bound factor b) to represent each combination of NFA active states. We name it Tunable Finite Automaton (TFA).

Tunable Finite Automaton (1/5)
A. Constructing a TFA
The implementation of a TFA logically consists of two components:
A TFA structure.
A Set Split Table (SST): each entry of the SST corresponds to one combination of NFA active states (i.e., a DFA state) and records how to split the combination into multiple TFA states.

Tunable Finite Automaton (2/5)
1. Generate the DFA states using the subset construction scheme [13]. The obtained DFA states provide us with all valid NFA active state combinations.
2. Split each NFA active state combination into up to b subsets, with the objective of minimizing the number of distinct subsets, and generate one TFA state for each distinct subset. After this step, we obtain the TFA state set Q_T and the set split table SST.
3. Decide the transition function δ_T. Unlike in traditional automatons, outgoing transitions of TFA states do not point to other TFA states. Instead, they point to a data structure called a state label, which contains a set of NFA state IDs. Given a TFA state s, its state label associated with character “c” includes all NFA states that can be reached via character “c” from the NFA states associated with TFA state s.
4. Decide the set of initial states (I) and the set of accept states (F_T).
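Step 3 can be illustrated with a few lines of Python. This is a minimal sketch, assuming a hypothetical toy NFA for the single regex .*ab and an assumed TFA state set; the names state_label, nfa_delta and delta_T are illustrative and not taken from the paper.

```python
def state_label(tfa_state, ch, nfa_delta):
    """Return the state label of tfa_state for character ch.

    The label is the set of NFA states reachable via ch from any NFA state
    associated with the TFA state.
    """
    return frozenset(t for s in tfa_state for t in nfa_delta.get((s, ch), ()))


# Hypothetical toy NFA for .*ab over the alphabet {a, b}:
# O is the start state, A means "just saw a", B is the accept state.
nfa_delta = {
    ("O", "a"): {"O", "A"},
    ("O", "b"): {"O"},
    ("A", "b"): {"B"},
}

# Assumed TFA state set Q_T (each TFA state is a set of NFA states).
tfa_states = [frozenset({"O"}), frozenset({"O", "A"}), frozenset({"O", "B"})]

# delta_T maps (TFA state, character) to a state label, not to a TFA state.
delta_T = {
    (s, ch): state_label(s, ch, nfa_delta)
    for s in tfa_states
    for ch in "ab"
}

print(sorted(delta_T[(frozenset({"O", "A"}), "b")]))  # ['B', 'O']
```

Because a transition yields a label rather than a TFA state, a second lookup (in the SST) is needed at run time to turn the label back into active TFA states.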

Tunable Finite Automaton (3/5)

Tunable Finite Automaton (4/5)

Tunable Finite Automaton (5/5)
B. Operating a TFA
Assume the input string is “adegf”. Initial active state: O.
1. a: return label {A,O}, next active states: OA
2. d: return label {A,D,O}, next active states: O, AD
3. e: return label {A,E,O}, next active states: O, AE
4. g: return label {A,E,G,O}, next active states: OG, AE
5. f: return label {A,F,G,O}, next active states: OG, AF
6. AF is an accept state => match!
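The per-character loop (merge the labels of the active TFA states, then look the merged set up in the SST) can be sketched in Python. This is an illustration under stated assumptions, not the automaton of the trace above: the NFA is a hypothetical one for the two rules .*a.*b and .*c.*d, labels are computed on the fly instead of being stored per transition, and the SST is replaced by a function sst_lookup that splits a combination into a rule 1 part and a rule 2 part.

```python
ALPHABET = "abcd"

# Hypothetical NFA: O is the start state; A, B track .*a.*b; C, D track .*c.*d.
NFA = {("O", ch): {"O"} for ch in ALPHABET}
NFA[("O", "a")] = {"O", "A"}
NFA[("O", "c")] = {"O", "C"}
for ch in ALPHABET:
    NFA[("A", ch)] = {"A"}
    NFA[("C", ch)] = {"C"}
NFA[("A", "b")] = {"A", "B"}
NFA[("C", "d")] = {"C", "D"}
ACCEPT = {"B", "D"}
BOUND = 2  # bound factor b


def state_label(tfa_state, ch):
    """NFA states reachable via ch from the NFA states of one TFA state."""
    return frozenset(t for s in tfa_state for t in NFA.get((s, ch), ()))


def sst_lookup(combo):
    """Stand-in for the SST: split a combination into at most two TFA states."""
    rule2 = combo & {"C", "D"}
    return [part for part in (combo - rule2, rule2) if part]


def run_tfa(text):
    """Return True as soon as an active TFA state contains an accept state."""
    active = [frozenset({"O"})]
    for ch in text:
        # Union of the labels returned by all active TFA states.
        merged = frozenset().union(*(state_label(s, ch) for s in active))
        active = sst_lookup(merged)
        assert len(active) <= BOUND
        if any(s & ACCEPT for s in active):
            return True
    return False


print(run_tfa("acd"), run_tfa("ca"))  # True False
```

In a real implementation both the labels and the SST would be precomputed tables, so each input character costs at most b transition lookups plus one SST lookup.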

Splitting NFA Active State Combinations (1/3)
A. Set Split Problem (SSP)
Find a minimal number of subsets of the NFA state set such that any valid NFA active state combination can be exactly covered by up to b of these subsets.
The b-SSP problem is NP-hard for any b > 1.
We present here a heuristic algorithm to solve the b-SSP problem.
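The covering condition can be made concrete with a brute-force checker. The Python sketch below assumes that "exactly cover" means the union of the chosen subsets equals the combination; the function name and the example sets are hypothetical. It only verifies a candidate solution and does not search for a minimal one.

```python
from itertools import combinations


def is_valid_pool(pool, combos, b):
    """Check that every combination is the union of at most b subsets from pool."""
    pool = [frozenset(s) for s in pool]
    for combo in map(frozenset, combos):
        # Only subsets contained in the combination can take part in an exact cover.
        candidates = [s for s in pool if s <= combo]
        covered = any(
            frozenset().union(*chosen) == combo
            for k in range(1, b + 1)
            for chosen in combinations(candidates, k)
        )
        if not covered:
            return False
    return True


# Four valid combinations, but only three subsets (TFA states) in the pool.
combos = [{"O"}, {"O", "A"}, {"O", "C"}, {"O", "A", "C"}]
pool = [{"O"}, {"O", "A"}, {"C"}]

print(is_valid_pool(pool, combos, 2), is_valid_pool(pool, combos, 1))  # True False
```

With b = 1 the pool must contain every combination itself, which is the DFA case; raising b to 2 lets three subsets stand in for four combinations here.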

Splitting NFA Active State Combinations (2/3)
B. A Heuristic Algorithm for the 2-SSP Problem
Given an NFA active state combination with v states, we consider only two kinds of special splits:
1. No split at all (i.e., one subset is empty).
2. Splits that divide the combination into two subsets whose sizes are 1 and v-1, respectively.
The reason to use the second special split is that, after analyzing the NFA active state combinations of many rule sets, we find that many combinations of NFA active states differ from each other in only one NFA state.
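To show how restricting the search to these two special splits keeps the problem tractable, here is a simplified greedy sketch in Python. It is not the paper's procedure: the popularity score (how many combinations could reuse a subset as their larger part), the tie-breaking rule, and all names are assumptions chosen for illustration.

```python
from collections import Counter


def candidate_splits(combo):
    """Yield the special splits: no split, then every (v-1, 1) split."""
    yield (combo,)
    if len(combo) >= 2:
        for x in sorted(combo):
            yield (combo - {x}, frozenset({x}))


def heuristic_2ssp(combos):
    """Return (pool, sst): the chosen subsets and the split of each combination."""
    combos = [frozenset(c) for c in combos]
    # Popularity of a subset = number of candidate splits using it as larger part.
    popularity = Counter(
        split[0] for c in combos for split in candidate_splits(c)
    )
    sst = {}
    for combo in combos:
        # Prefer the most reusable larger part; on ties prefer fewer parts.
        sst[combo] = max(
            candidate_splits(combo),
            key=lambda split: (popularity[split[0]], -len(split)),
        )
    pool = {part for split in sst.values() for part in split}
    return pool, sst


combos = [
    {"O"}, {"O", "A"}, {"O", "C"},
    {"O", "A", "C"}, {"O", "A", "B"}, {"O", "A", "B", "C"},
]
pool, sst = heuristic_2ssp(combos)
print(len(pool), len(combos))  # 5 6
```

In this small example the subset {O, A} is shared by three combinations that differ in a single NFA state, which is exactly the pattern the second special split is meant to exploit.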

Splitting NFA Active State Combinations (3/3)

State Encoding
A simple scheme is to implement each state label as an array that includes all associated NFA state IDs. Its drawbacks are high storage cost and TFA operation overhead.
Bit vector: assign each NFA state a bit vector so that the bit vector associated with each valid combination of NFA active states (i.e., each DFA state) is unique and the number of bits used in the bit vector is minimized.
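The uniqueness requirement can be illustrated in Python. The sketch below assumes that the vector of a combination is the bitwise OR of the vectors of its NFA states, so that merging labels at run time is a single OR operation. The three code assignments are hypothetical examples, and no attempt is made to find a minimal encoding.

```python
def encode(combo, code):
    """Bit vector of a combination: bitwise OR of its NFA states' vectors."""
    vec = 0
    for state in combo:
        vec |= code[state]
    return vec


def all_unique(combos, code):
    """True if every valid combination gets a distinct bit vector."""
    vectors = [encode(c, code) for c in combos]
    return len(set(vectors)) == len(vectors)


combos = [{"O"}, {"O", "A"}, {"O", "C"}, {"O", "A", "C"}]

one_hot = {"O": 0b001, "A": 0b010, "C": 0b100}  # 3 bits, always unique
compact = {"O": 0b00, "A": 0b01, "C": 0b10}     # 2 bits: O is in every combination
clashing = {"O": 0b01, "A": 0b10, "C": 0b11}    # {O,A} and {O,C} both give 0b11

print(all_unique(combos, one_hot),
      all_unique(combos, compact),
      all_unique(combos, clashing))  # True True False
```

One bit per NFA state is always valid but uses the most bits; the compact assignment shows that states appearing in every valid combination need not consume a bit of their own.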

Performance Evaluation (1/4)

Performance Evaluation (2/4)

Performance Evaluation (3/4)

Performance Evaluation (4/4)