Author : S. Kumar, B. Chandrasekaran, J. Turner, and G. Varghese Publisher : ANCS ‘07 Presenter : Jo-Ning Yu Date : 2011/04/20.

Slides:



Advertisements
Similar presentations
Jing-Shin Chang1 Regular Expression: Syntax for Specifying String Patterns Basic Alphabet empty-string: any symbol a in input symbol set Basic Operators.
Advertisements

Author : Xinming Chen,Kailin Ge,Zhen Chen and Jun Li Publisher : ANCS, 2011 Presenter : Tsung-Lin Hsieh Date : 2011/12/14 1.
Complexity and Computability Theory I Lecture #4 Rina Zviel-Girshin Leah Epstein Winter
Nondeterministic Finite Automata CS 130: Theory of Computation HMU textbook, Chapter 2 (Sec 2.3 & 2.5)
Regular Expressions and DFAs COP 3402 (Summer 2014)
1 1 CDT314 FABER Formal Languages, Automata and Models of Computation Lecture 3 School of Innovation, Design and Engineering Mälardalen University 2012.
296.3: Algorithms in the Real World
Introduction to Computability Theory
1 Introduction to Computability Theory Lecture3: Regular Expressions Prof. Amos Israeli.
1 Introduction to Computability Theory Lecture4: Regular Expressions Prof. Amos Israeli.
A hybrid finite automaton for practical deep packet inspection Department of Computer Science and Information Engineering National Cheng Kung University,
Pattern Matching II COMP171 Fall Pattern matching 2 A Finite Automaton Approach * A directed graph that allows self-loop. * Each vertex denotes.
CS5371 Theory of Computation Lecture 6: Automata Theory IV (Regular Expression = NFA = DFA)
Lecture 3 Goals: Formal definition of NFA, acceptance of a string by an NFA, computation tree associated with a string. Algorithm to convert an NFA to.
1 Fast and Memory-Efficient Regular Expression Matching for Deep Packet Inspection Department of Computer Science and Information Engineering National.
Automating Construction of Lexers. Example in javacc TOKEN: { ( | | "_")* > | ( )* > | } SKIP: { " " | "\n" | "\t" } --> get automatically generated code.
Improving Signature Matching using Binary Decision Diagrams Liu Yang, Rezwana Karim, Vinod Ganapathy Rutgers University Randy Smith Sandia National Labs.
Sampling Techniques to Accelerate Pattern Matching in Network Intrusion Detection Systems Author: Domenico Ficara, Gianni Antichi, Andrea Di Pietro, Stefano.
An Improved Algorithm to Accelerate Regular Expression Evaluation Author : Michela Becchi 、 Patrick Crowley Publisher : ANCS’07 Presenter : Wen-Tse Liang.
CMPS 3223 Theory of Computation
Nondeterministic Finite Automata CS 130: Theory of Computation HMU textbook, Chapter 2 (Sec 2.3 & 2.5)
An Improved Algorithm to Accelerate Regular Expression Evaluation Author: Michela Becchi, Patrick Crowley Publisher: 3rd ACM/IEEE Symposium on Architecture.
SI-DFA: Sub-expression Integrated Deterministic Finite Automata for Deep Packet Inspection Authors: Ayesha Khalid, Rajat Sen†, Anupam Chattopadhyay Publisher:
Fast and Memory-Efficient Regular Expression Matching for Deep Packet Inspection Authors: Fang Yu, Zhifeng Chen, Yanlei Diao, T. V. Lakshman, Randy H.
Overview of Previous Lesson(s) Over View  An NFA accepts a string if the symbols of the string specify a path from the start to an accepting state.
TRANSITION DIAGRAM BASED LEXICAL ANALYZER and FINITE AUTOMATA Class date : 12 August, 2013 Prepared by : Karimgailiu R Panmei Roll no. : 11CS10020 GROUP.
Converting NFAs to DFAs How a Syntax Analyser is constructed.
Regular Expressions Hopcroft, Motawi, Ullman, Chap 3.
1 CD5560 FABER Formal Languages, Automata and Models of Computation Lecture 3 Mälardalen University 2010.
StriD 2 FA: Scalable Regular Expression Matching for Deep Packet Inspection Author: Xiaofei Wang, Junchen Jiang, Yi Tang, Bin Liu, and Xiaojun Wang Publisher:
Sampling Techniques to Accelerate Pattern Matching in Network Intrusion Detection Systems Author : Domenico Ficara, Gianni Antichi, Andrea Di Pietro, Stefano.
Doctoral Dissertation Proposal: Acceleration of Network Processing Algorithms Sailesh Kumar Advisors: Jon Turner, Patrick Crowley Committee: Roger Chamberlain,
Memory Compression Algorithms for Networking Features Sailesh Kumar.
INFAnt: NFA Pattern Matching on GPGPU Devices Author: Niccolo’ Cascarano, Pierluigi Rolando, Fulvio Risso, Riccardo Sisto Publisher: ACM SIGCOMM 2010 Presenter:
Algorithms to Accelerate Multiple Regular Expressions Matching for Deep Packet Inspection Sailesh Kumar Sarang Dharmapurikar Fang Yu Patrick Crowley Jonathan.
CMSC 330: Organization of Programming Languages Finite Automata NFAs  DFAs.
Overview of Previous Lesson(s) Over View  Symbol tables are data structures that are used by compilers to hold information about source-program constructs.
Extending Finite Automata to Efficiently Match Perl-Compatible Regular Expressions Publisher : Conference on emerging Networking EXperiments and Technologies.
CS 3813: Introduction to Formal Languages and Automata Chapter 2 Deterministic finite automata These class notes are based on material from our textbook,
Author : Sarang Dharmapurikar, John Lockwood Publisher : IEEE Journal on Selected Areas in Communications, 2006 Presenter : Jo-Ning Yu Date : 2010/12/29.
Advanced Regular Expression Matching for Line-Rate Deep Packet Inspection Sailesh Kumar, Jon Turner Michela Becchi, Patrick Crowley, George Varghese.
CS 203: Introduction to Formal Languages and Automata
Author : Randy Smith & Cristian Estan & Somesh Jha Publisher : IEEE Symposium on Security & privacy,2008 Presenter : Wen-Tse Liang Date : 2010/10/27.
Donghyun (David) Kim Department of Mathematics and Physics North Carolina Central University 1 Chapter 1 Regular Languages Some slides are in courtesy.
TFA: A Tunable Finite Automaton for Regular Expression Matching Author: Yang Xu, Junchen Jiang, Rihua Wei, Yang Song and H. Jonathan Chao Publisher: ACM/IEEE.
A Fast Regular Expression Matching Engine for NIDS Applying Prediction Scheme Author: Lei Jiang, Qiong Dai, Qiu Tang, Jianlong Tan and Binxing Fang Publisher:
LaFA Lookahead Finite Automata Scalable Regular Expression Detection Authors : Masanori Bando, N. Sertac Artan, H. Jonathan Chao Masanori Bando N. Sertac.
Fast and Memory-Efficient Regular Expression Matching for Deep Packet Inspection Publisher : ANCS’ 06 Author : Fang Yu, Zhifeng Chen, Yanlei Diao, T.V.
An Improved DFA for Fast Regular Expression Matching Author : Domenico Ficara 、 Stefano Giordano 、 Gregorio Procissi Fabio Vitucci 、 Gianni Antichi 、 Andrea.
Finding Regular Simple Paths Sept. 2013Yangjun Chen ACS Finding Regular Simple Paths in Graph Databases Basic definitions Regular paths Regular simple.
using Deterministic Finite Automata & Nondeterministic Finite Automata
Overview of Previous Lesson(s) Over View  A token is a pair consisting of a token name and an optional attribute value.  A pattern is a description.
Chapter 5 Finite Automata Finite State Automata n Capable of recognizing numerous symbol patterns, the class of regular languages n Suitable for.
Finite Automata Great Theoretical Ideas In Computer Science Victor Adamchik Danny Sleator CS Spring 2010 Lecture 20Mar 30, 2010Carnegie Mellon.
Advanced Algorithms for Fast and Scalable Deep Packet Inspection Author : Sailesh Kumar 、 Jonathan Turner 、 John Williams Publisher : ANCS’06 Presenter.
LECTURE 5 Scanning. SYNTAX ANALYSIS We know from our previous lectures that the process of verifying the syntax of the program is performed in two stages:
Theory of Computation Automata Theory Dr. Ayman Srour.
Deterministic Finite Automata Nondeterministic Finite Automata.
CS412/413 Introduction to Compilers Radu Rugina Lecture 3: Finite Automata 25 Jan 02.
COMP3190: Principle of Programming Languages DFA and its equivalent, scanner.
CSC-305 Design and Analysis of AlgorithmsBS(CS) -6 Fall-2014CSC-305 Design and Analysis of AlgorithmsBS(CS) -6 Fall-2014 Design and Analysis of Algorithms.
Lecture 2 Compiler Design Lexical Analysis By lecturer Noor Dhia
Pushdown Automata.
HEXA: Compact Data Structures for Faster Packet Processing
Chapter 2 FINITE AUTOMATA.
Advanced Algorithms for Fast and Scalable Deep Packet Inspection
Regular Expression Matching in Reconfigurable Hardware
Converting NFAs to DFAs How Lex is constructed
2019/5/3 A De-compositional Approach to Regular Expression Matching for Network Security Applications Author: Eric Norige Alex Liu Presenter: Yi-Hsien.
A Hybrid Finite Automaton for Practical Deep Packet Inspection
Presentation transcript:

Author : S. Kumar, B. Chandrasekaran, J. Turner, and G. Varghese Publisher : ANCS ‘07 Presenter : Jo-Ning Yu Date : 2011/04/20

  Curing DFA from Insomnia  H-FA: Curing DFAs from Amnesia  H-cFA: Curing DFAs from Acalculia  Experimental Evaluation 2 Outline

  Traditional approach of pattern matching constructs an automaton for the entire regular expression (reg-ex) signature.  However, in NIDS applications, normal flows rarely match more than first few symbols of any signature.  The traditional approach appears wasteful.  The traditional approach is unable to perform such selective sleeping and keeps the automaton awake for the entire signature, we call this deficiency insomnia. Curing DFA from Insomnia 3

  An obvious cure to insomnia will essentially require an isolation of the frequently visited portions of the signatures from the infrequent ones.  Fortunately, practical signatures can be cleanly split into simple prefixes and suffixes, such that the prefixes comprise of the entire frequently visited portions of the signature.  Therefore, only the automaton representing the prefixes need to remain active at all times; thereby, curing the traditional approach from insomnia by keeping the suffix automaton in a sleep state most of the times. 4 Curing DFA from Insomnia

  We refer to the automaton which represents the prefixes as the fast path(construct by DFA ) and the remaining as the slow path(construct by NFA ). 5 Curing DFA from Insomnia

  Let p s : Q → [0, 1] denote the probability with which the NFA states are active.  Initially, we keep all states in the slow region; thus the slow path probability p is Σ p s.  Afterwards, we begin moving states from the slow region to the fast region. Splitting the reg-ex 6

  The movements are performed in a breadth first order beginning at the start state of the NFA, and those states are moved first, whose probabilities of being active are higher.  After a state s is moved to the fast region, p s [s] is reduced from the slow path probability p.  We continue these movements, until the slow path probability, p becomes smaller than ε, the slow processing capacity threshold. 7 Splitting the reg-ex

  The fast path parses every byte of each flow and matches them against the prefixes of all reg-exes. → construct a single composite DFA of all prefixes.  The slow path parses only those flows which have found a match in the fast path, and matches them only against the corresponding suffixes. → NFA may suffice to represent the slow path. The bifurcated pattern matching 8

 9  store the “per flow parse state” C : flow number state f : the bits needed to represent a DFA state

  Problem  Slow path can be triggered multiple times for the same flow, thus there can be multiple instances of per flow active parse states.  Ex: an expression “ababcda” prefix : ab ; suffix : abcda packet payload : xyababcdpq  The slow path will be triggered twice by this packet, and there will be two instances of active parse states in the slow path. The bifurcated pattern matching 10

  Upon first trigger, it will parse the payload ”xyababcdpq” and stop after it sees p.  Upon second trigger, it will parse the payload ”xyababcdpq”, thus repeating the previous parse.  Due to these complications, we propose a packetized version of bifurcated packet processing architecture. 11 The bifurcated pattern matching

  The objective of the packetized bifurcated packet processing is to ensure that the fast path triggers the slow path at most once for every packet.  This objective can be satisfied by slightly modifying the slow path automaton, so that it parses the packets against the entire signature, and not just the suffixes. Packetized bifurcated pattern matching 12

  Ex:  r1 =.*[gh]d[^g]*ge  r2 =.*fag[^i]*i[^j]*j  r3 =.*a[gh]i[^l]*[ae]c  ε = 0.05  p1 = [gh]d[^g]*; s1 = ge;  p2 = f; s2 = ag[^i]*i[^j]*j;  p3 = a[gh]; s3 = i[^l]*[ae]c; 13 Packetized bifurcated pattern matching ruleprefixsuffix r1=.*[gh]d[^g]*ge p1 = [gh]d[^g]*g s1 = e r2 =.*fag[^i]*i[^j]*j p2 = fa s2 = g[^i]*i[^j]*j r3 =.*a[gh]i[^l]*[ae]c p3 = a[gh]i s3 = [^l]*[ae]c fast path : Construct a composite DFA slow path : Construct 3 separateDFAs

 14  we begin its parsing at the state (0,1,3), which indicates that the prefix p1([gh]d[^g]*g) has just been detected. Packetized bifurcated pattern matching

  If this final DFA state comprises any of the states of the slow path NFA, then the implication is that the slow path processing will continue; else the slow path will be put to sleep.  Ex: the parsing of a packet payload ”gdgdgh”  The fast path state traversal is illustrated below :  Stop at the DFA state (0, 1). → Slow path automaton will be put to sleep. 15 Packetized bifurcated pattern matching

  The byte based pattern matcher will halt the slow path as soon as the next input character leads to a suffix mismatch.  The packetized pattern matcher will retain the slow path active till the last byte of the packet is parsed.  The packetized architecture maintains the triggering probability at a much lower value. 16 byte based vs. packetized

 17 byte based vs. packetized

  DFA state explosion occurs primarily due the incompetence of the DFA to follow multiple partial matches with a single state of execution.  One way to enable single execution state and yet avoid state explosion is to equip the machine with a small and fast cache, to register key events during the parse, such as encountering a closure.  the parsing get stuck at a single or multiple closures; thus if the history buffer will register these events then one may avoid several states. →History based Finite Automaton (H-FA) H-FA: Curing DFAs from Amnesia 18

  Ex : r1 =.*ab[^a]*c; r2 =.*def; 19 H-FA: Curing DFAs from Amnesia NFA DFA H-FA (0) cdabc ↑ reset flag ↑ set flag  Payload = “cdabc” → (0)→ (0,4) → (0,1) → (0)→ (0,3)

  |s : The transition is taken when flag for the state s is set.  +s : When this transition is taken, the flag for s is set.  −s : When this transition is taken, the flag for s is reset. 20 Formal Description of H-FA original DFA

  Rule1 : outgoing transitions of the fading state (0,2) now originates from the state (0) and has the associated condition that the flag is set (|2).  Rule2 : those transitions which led to a non-fading state resets the flag (-2).  Rule3 : incoming transitions into state (0,2) originating from a non-fading state now has an action to set the flag (+2). 21 Formal Description of H-FA original DFA

  The blowup in the number of states in the presence of the length restriction occurs due to acalculia or the inability of the DFA to keep track of the length restriction.  History based counting finite Automata (H-cFA) H-cFA: Curing DFAs from Acalculia 22

  Construct  The first step replaces the length restriction with a closure, and constructs the H-FA.  Second, every flag in the history buffer, a counter is appended.  The counter is set to the length restriction value by those conditional transitions which set the flag, while it is reset by those transitions which reset the flag. 23 H-cFA: Curing DFAs from Acalculia

 → (0,6)→ (0,5)→ (0,4)→ (0)→ (0,1)  Ex : r1 =.*ab[^a] 4 c; r2 =.*def; 24 H-cFA (0) abdef ↑ ctr <= 3 ↑ ctr <= 2  Payload = “abdefdc” → (0,4)→ (0,3) cd ↑ flag <= 1 ctr <= 4 ↑ ctr <= 1 ↑ ctr <= 0 ↑ flag <= 0 ↑ initialize H-cFA: Curing DFAs from Acalculia * A DFA will require 20 states.

  sets the slow path packet diversion probability at 1% Experimental Evaluation 25

 Experimental Evaluation 26