A hybrid finite automaton for practical deep packet inspection Department of Computer Science and Information Engineering National Cheng Kung University,

Slides:



Advertisements
Similar presentations
Deep Packet Inspection: Where are We? CCW08 Michela Becchi.
Advertisements

Deep packet inspection – an algorithmic view Cristian Estan (U of Wisconsin-Madison) at IEEE CCW 2008.
Automata Theory Part 1: Introduction & NFA November 2002.
Lexical Analysis IV : NFA to DFA DFA Minimization
Deep Packet Inspection with DFA-trees and Parametrized Language Overapproximation Author: Daniel Luchaup, Lorenzo De Carli, Somesh Jha, Eric Bach Publisher:
Optimizing Regular Expression Matching with SR-NFA on Multi-Core Systems Authors : Yang, Y.E., Prasanna, V.K. Yang, Y.E. Prasanna, V.K. Publisher : Parallel.
An Efficient Regular Expressions Compression Algorithm From A New Perspective Authors : Tingwen Liu,Yifu Yang,Yanbing Liu,Yong Sun,Li Guo Tingwen LiuYifu.
1 TCAM Razor: A Systematic Approach Towards Minimizing Packet Classifiers in TCAMs Department of Computer Science and Information Engineering National.
XFA : Faster Signature Matching With Extended Automata Author: Randy Smith, Cristian Estan and Somesh Jha Publisher: IEEE Symposium on Security and Privacy.
1 An adaptable FPGA-based System for Regular Expression Matching Department of Computer Science and Information Engineering National Cheng Kung University,
Improved TCAM-based Pre-Filtering for Network Intrusion Detection Systems Department of Computer Science and Information Engineering National Cheng Kung.
Compact State Machines for High Performance Pattern Matching Department of Computer Science and Information Engineering National Cheng Kung University,
1 Regular expression matching with input compression : a hardware design for use within network intrusion detection systems Department of Computer Science.
Two stage packet classification using most specific filter matching and transport level sharing Authors: M.E. Kounavis *,A. Kumar,R. Yavatkar,H. Vin Presenter:
1 Fast and Memory-Efficient Regular Expression Matching for Deep Packet Inspection Department of Computer Science and Information Engineering National.
1 Performing packet content inspection by longest prefix matching technology Authors: Nen-Fu Huang, Yen-Ming Chu, Yen-Min Wu and Chia- Wen Ho Publisher:
CS5371 Theory of Computation Lecture 4: Automata Theory II (DFA = NFA, Regular Language)
Memory-Efficient Regular Expression Search Using State Merging Department of Computer Science and Information Engineering National Cheng Kung University,
Improving Signature Matching using Binary Decision Diagrams Liu Yang, Rezwana Karim, Vinod Ganapathy Rutgers University Randy Smith Sandia National Labs.
1 Efficient packet classification using TCAMs Authors: Derek Pao, Yiu Keung Li and Peng Zhou Publisher: Computer Networks 2006 Present: Chen-Yu Lin Date:
Thopson NFA Presenter: Yuen-Shuo Li Date: 2014/5/7 Department of Computer Science and Information Engineering National Cheng Kung University, Taiwan R.O.C.
Sampling Techniques to Accelerate Pattern Matching in Network Intrusion Detection Systems Author: Domenico Ficara, Gianni Antichi, Andrea Di Pietro, Stefano.
An Improved Algorithm to Accelerate Regular Expression Evaluation Author : Michela Becchi 、 Patrick Crowley Publisher : ANCS’07 Presenter : Wen-Tse Liang.
1 Fast and Memory-Efficient Regular Expression Matching for Deep Packet Inspection Fang Yu Microsoft Research, Silicon Valley Work was done in UC Berkeley,
An Improved Algorithm to Accelerate Regular Expression Evaluation Author: Michela Becchi, Patrick Crowley Publisher: 3rd ACM/IEEE Symposium on Architecture.
SI-DFA: Sub-expression Integrated Deterministic Finite Automata for Deep Packet Inspection Authors: Ayesha Khalid, Rajat Sen†, Anupam Chattopadhyay Publisher:
Fast and Memory-Efficient Regular Expression Matching for Deep Packet Inspection Authors: Fang Yu, Zhifeng Chen, Yanlei Diao, T. V. Lakshman, Randy H.
TFA : A Tunable Finite Automaton for Regular Expression Matching Author: Yang Xu, Junchen Jiang, Rihua Wei, Tang Song and H. Jonathan Chao Publisher: Technical.
An Efficient Regular Expressions Compression Algorithm From A New Perspective  Author: Tingwen Liu, Yifu Yang, Yanbing Liu, Yong Sun, Li Guo  Publisher:
Parallelization and Characterization of Pattern Matching using GPUs Author: Giorgos Vasiliadis 、 Michalis Polychronakis 、 Sotiris Ioannidis Publisher:
StriD 2 FA: Scalable Regular Expression Matching for Deep Packet Inspection Author: Xiaofei Wang, Junchen Jiang, Yi Tang, Bin Liu, and Xiaojun Wang Publisher:
Sampling Techniques to Accelerate Pattern Matching in Network Intrusion Detection Systems Author : Domenico Ficara, Gianni Antichi, Andrea Di Pietro, Stefano.
1 Optimization of Regular Expression Pattern Matching Circuits on FPGA Department of Computer Science and Information Engineering National Cheng Kung University,
Deterministic Finite Automaton for Scalable Traffic Identification: the Power of Compressing by Range Authors: Rafael Antonello, Stenio Fernandes, Djamel.
Regular Expression Matching for Reconfigurable Packet Inspection Authors: Jo˜ao Bispo, Ioannis Sourdis, Jo˜ao M.P. Cardoso and Stamatis Vassiliadis Publisher:
1 Fast packet classification for two-dimensional conflict-free filters Department of Computer Science and Information Engineering National Cheng Kung University,
Extending Finite Automata to Efficiently Match Perl-Compatible Regular Expressions Publisher : Conference on emerging Networking EXperiments and Technologies.
TCAM –BASED REGULAR EXPRESSION MATCHING SOLUTION IN NETWORK Phase-I Review Supervised By, Presented By, MRS. SHARMILA,M.E., M.ARULMOZHI, AP/CSE.
Author : Sarang Dharmapurikar, John Lockwood Publisher : IEEE Journal on Selected Areas in Communications, 2006 Presenter : Jo-Ning Yu Date : 2010/12/29.
Memory-Efficient Regular Expression Search Using State Merging Author: Michela Becchi, Srihari Cadambi Publisher: INFOCOM th IEEE International.
A Scalable Architecture For High-Throughput Regular-Expression Pattern Matching Yao Song 11/05/2015.
Author : Yang Xu, Lei Ma, Zhaobo Liu, H. Jonathan Chao Publisher : ANCS 2011 Presenter : Jo-Ning Yu Date : 2011/12/28.
Author : Randy Smith & Cristian Estan & Somesh Jha Publisher : IEEE Symposium on Security & privacy,2008 Presenter : Wen-Tse Liang Date : 2010/10/27.
TFA: A Tunable Finite Automaton for Regular Expression Matching Author: Yang Xu, Junchen Jiang, Rihua Wei, Yang Song and H. Jonathan Chao Publisher: ACM/IEEE.
A Fast Regular Expression Matching Engine for NIDS Applying Prediction Scheme Author: Lei Jiang, Qiong Dai, Qiu Tang, Jianlong Tan and Binxing Fang Publisher:
Range Enhanced Packet Classification Design on FPGA Author: Yeim-Kuan Chang, Chun-sheng Hsueh Publisher: IEEE Transactions on Emerging Topics in Computing.
LaFA Lookahead Finite Automata Scalable Regular Expression Detection Authors : Masanori Bando, N. Sertac Artan, H. Jonathan Chao Masanori Bando N. Sertac.
Fast and Memory-Efficient Regular Expression Matching for Deep Packet Inspection Publisher : ANCS’ 06 Author : Fang Yu, Zhifeng Chen, Yanlei Diao, T.V.
An Improved DFA for Fast Regular Expression Matching Author : Domenico Ficara 、 Stefano Giordano 、 Gregorio Procissi Fabio Vitucci 、 Gianni Antichi 、 Andrea.
Packet Classification Using Dynamically Generated Decision Trees
Author : S. Kumar, B. Chandrasekaran, J. Turner, and G. Varghese Publisher : ANCS ‘07 Presenter : Jo-Ning Yu Date : 2011/04/20.
Hierarchical packet classification using a Bloom filter and rule-priority tries Source : Computer Communications Authors : A. G. Alagu Priya 、 Hyesook.
SRD-DFA Achieving Sub-Rule Distinguishing with Extended DFA Structure Author: Gao Xia, Xiaofei Wang, Bin Liu Publisher: IEEE DASC (International Conference.
Series DFA for Memory- Efficient Regular Expression Matching Author: Tingwen Liu, Yong Sun, Li Guo, and Binxing Fang Publisher: CIAA 2012( International.
Packet Classification Using Multi- Iteration RFC Author: Chun-Hui Tsai, Hung-Mao Chu, Pi-Chung Wang Publisher: 2013 IEEE 37th Annual Computer Software.
Deflating the Big Bang: Fast and Scalable Deep Packet Inspection with Extended Finite Automata Date:101/3/21 Publisher:SIGCOMM 08 Author:Randy Smith Cristian.
CS412/413 Introduction to Compilers Radu Rugina Lecture 3: Finite Automata 25 Jan 02.
Hierarchical Hybrid Search Structure for High Performance Packet Classification Authors : O˜guzhan Erdem, Hoang Le, Viktor K. Prasanna Publisher : INFOCOM,
Efficient Signature Matching with Multiple Alphabet Compression Tables Publisher : SecureComm, 2008 Author : Shijin Kong,Randy Smith,and Cristian Estan.
Reorganized and Compact DFA for Efficient Regular Expression Matching
A DFA with Extended Character-Set for Fast Deep Packet Inspection
Regular Expression Matching in Reconfigurable Hardware
Memory-Efficient Regular Expression Search Using State Merging
Scalable Multi-Match Packet Classification Using TCAM and SRAM
Compact DFA Structure for Multiple Regular Expressions Matching
2019/5/3 A De-compositional Approach to Regular Expression Matching for Network Security Applications Author: Eric Norige Alex Liu Presenter: Yi-Hsien.
2019/5/5 A Flexible Wildcard-Pattern Matching Accelerator via Simultaneous Discrete Finite Automata Author: Hsiang-Jen Tsai, Chien-Chih Chen, Yin-Chi Peng,
A Hybrid Finite Automaton for Practical Deep Packet Inspection
2019/10/9 Regular Expression Matching for Reconfigurable Constraint Repetition Inspection Authors : Miad Faezipour and Mehrdad Nourani Publisher : IEEE.
Presentation transcript:

A hybrid finite automaton for practical deep packet inspection Department of Computer Science and Information Engineering National Cheng Kung University, Taiwan R.O.C. Authors: Michela Becchi; Patrick Crowley Publisher: ACM Conference on Emerging Network Experiment and Technology, CoNEXT 2007, New York, NY, USA, December , ACM 2007, ISBN Present: Chia-Ming,Chang Date: 4, 22,

Outline 1. Introduction 2. State Blow-up 3. Hybrid-FA 4. Experimental results 5. concluding remarks 2

Introduction (1/5) 3 For example: The regular expression “prefix.{100}suffix”, which matches if and only if “prefix” is separated from “suffix” by 100 characters, “require well over 1 million states to be represented in a DFA.” The regular expression above would require just 101+ (# chars in prefix) + (# chars in suffix) NFA states.

Introduction (2/5) 4 Problem :DFAs corresponding to large sets of regular expressions, each one representing a different rule, can be prohibitively large First method: an explosion in states can occur when many rules are grouped together into a single DFA, Yu et al. have proposed segregating rules into multiple groups and evaluating the corresponding DFAs concurrently. pro and con : This solution decreases memory space requirements, sometimes dramatically, but increases memory bandwidth linearly with the number of active DFAs. PS: The memory bandwidth requirement can be expressed in terms of the number of memory operations to be performed for each input character processed.

Introduction (3/5) 5 Problem :DFAs corresponding to large sets of regular expressions, each one representing a different rule, can be prohibitively large Second method: aims at reducing the memory space requirement of any given DFA and is based on two observations. (A) the memory space required to store a DFA strictly depends on the number of transitions between states. (B) many states in DFAs have identical sets of outgoing transitions. (remove redundancy) pro and con : This solution compression technique proposed trades off memory storage requirements with processing time.

Introduction (4/5) 6 we propose a hybrid DFA-NFA finite automaton (Hybrid-FA), a solution bringing together the strengths of both DFAs and NFAs. we are interested in two distinct situations: a)State blow-up happens when compiling a single regular expression in isolation. b) Given two regular expressions RE1 and RE2 the corresponding DFA1 and DFA2 can be built without incurring state explosion. However, when compiling RE1 and RE2 into a unique DFA12, either the number of states in DFA12 is significantly greater than the sum of DFA1 and DFA2, or DFA12 cannot be built at all, due to exponential state explosion.

Introduction (5/5) 3 we are interested in two distinct situations: a)=>DFAs are not a feasible representation of the given regular expression b) =>keeping the two DFAs separated and operating them in parallel However, given a set of regular expressions, it would be beneficial to be able to predict this situation without testing all possible combinations. The goals of this work are the following: Explore two distinct conditions which lead to the state blow up during subset construction. Propose a hybrid automaton which deals with the above problems in a unified way. Refine the proposal in order to provide an acceptable worse case bound on the memory bandwidth requirement.

Outline 1. Introduction 2. State Blow-up 3. Hybrid-FA 4. Experimental results 5. concluding remarks 7

State Blow-up (1/4) 8 In practical rule-sets, dot-star conditions do not cause state blow-up when individual regular expressions are compiled in isolation. Figure1, which compares DFAs accepting regular expressions abcd and ab.*cd, It can be observed that the number of states is the same in the two cases; the transitions are “moved toward” the tail of the DFA in the second one.

State Blow-up (2/4) 9 Figure 2 illustrates this situation in the composite DFA for regular expressions “ab.*cd” and “efgh”. Notice that the sub-DFA matching “efgh” is replicated: first in states 2, 4, 7 and 10, and second in states 6, 9, 11 and 12. The second replica originates from state 3, which derives from expanding the dot-star condition.

State Blow-up (3/4) 10 Figure 3 illustrates this fact on the simple regular expression “ab.{3}cd”. Clearly, the size of the DFA grows rapidly with the cardinality of the counting constraints

State Blow-up (4/4) 11 In contrast to dot-star terms, counting constraints on wildcards and large character ranges cause exponential state blow-up when creating DFAs even for single regular expressions. This can be explained as follows: when expanding the counting constraint, all possible occurrences of the regular expression prefix must be considered at each instance of the wildcard or character range.

Outline 1. Introduction 2. State Blow-up 3. Hybrid-FA 4. Experimental results 5. concluding remarks 12

Hybrid-FA (1/5) 13 processing symbol a in NFA state 0 leads to NFA states 0 and 1, processing the same character in (DFA) state 0 of the hybrid automaton leads to a (DFA) state tagged 0-1. It can be noted that the border state has two distinct outgoing transitions on character c: Border state NFA-Part DFA-Part

Hybrid-FA (2/5) 14 resulting hybrid-FA will exhibit some useful properties. Specifically: i)the starting state will be a DFA-state; ii) the NFA part of the automaton will remain inactive till a border state is reached; iii) There will be no backwards activation of the DFA coming from the NFA.

Hybrid-FA (3/5) 15

Hybrid-FA (4/5) 16 As long as the border state 2 is not traversed. In other words, as long as the prefix of the first regular expression “ab” is not matched, processing is restricted to the DFA portion.

Hybrid-FA (5/5) 17

Outline 1. Introduction 2. State Blow-up 3. Hybrid-FA 4. Experimental results 5. concluding remarks 18

Experimental results (1/3) distinct regular expressions: 25% contain long counting constraints, generally located at the end of the regular expressions, 11.4% contain.* conditions and 54.89% [^c1c2...ck]* conditions. The Snort IDS performs packet payload inspection only after header filtering (i.e. packet classification). we clustered rules with common header and performed experiments on some of the largest groups.

Experimental results (2/3) 20 Rule-sets group1 and group2 do not contain counting constraints, Rule-sets group3-group6 of counting constraints encountered consist of 20 to 1024 repetitions of large character ranges; Over 10 million states

Experimental results (3/3) 21 The memory bandwidth requirement can be expressed in terms of the number of memory operations to be performed for each input character processed.

Outline 1. Introduction 2. State Blow-up 3. Hybrid-FA 4. Experimental results 5. concluding remarks 22

concluding remarks (1/1) 23 The primary contribution of this work is the hybrid-FA, which is, to our knowledge, the first automaton that is capable of evaluating all the regular-expression types found in common NIDS systems such as Snort and can be implemented efficiently in practical high-speed systems. The key characteristics of a hybrid-FA are: a modest memory storage requirement comparable to those of an NFA solution, an average case memory bandwidth requirement similar to that of a single DFA solution