A DFA with Extended Character-Set for Fast Deep Packet Inspection

Presentation on theme: "A DFA with Extended Character-Set for Fast Deep Packet Inspection"— Presentation transcript:

1 A DFA with Extended Character-Set for Fast Deep Packet Inspection
2018/6/2 Authors: Cong Liu, Yan Pan, Ai Chen, and Jie Wu. Presenter: Xiao-Min Zheng. Date: 2016/11/09. Department of Computer Science and Information Engineering, National Cheng Kung University, Taiwan R.O.C. CSIE CIAL Lab

2 Introduction
DPI is becoming increasingly important in classifying and controlling network traffic. We focus on a general-purpose processor approach. NFA implementations of regular expressions cause a nondeterministic number of main memory accesses per byte. DFA implementations of regular expressions require a very large memory space to store their transition tables.
Notes: Point 2: newer DPI systems use regular expressions on the software side, as in Snort; on the hardware side, they propose parallel mechanisms with fast on-chip memory, small on-chip lookup engines, or special-purpose processor approaches. Points 3 and 4: the lookup speed of special-purpose devices is limited by the processor's memory bandwidth, so speeding it up requires reducing the number of main memory accesses per byte in the traffic payload. NFA and DFA each have their own drawbacks and neither is ideal; a DFA takes only one main memory access per byte, but ...
National Cheng Kung University CSIE Computer & Internet Architecture Lab CSIE CIAL Lab

3 Introduction
DFA/EC, as a general model of DFA, removes part of each DFA state and incorporates it with the next input character. This novel solution focuses on state reduction by doubling the size of the character-set. Our solution requires only a single main memory access for each byte in the traffic payload. DFA/EC is simple, easy to implement, and easy to update due to its fast construction speed. DFA/EC can be combined with other compression approaches to provide a better level of compression.
Notes: Point 3: this is what differs from other state reduction approaches. Point 2: see number 13.

4 The Conceptual DFA/EC
A DFA can be constructed from a set of NFAs. The number of DFA states is the number of possible combinations of NFA states that can be simultaneously active, which can be exponential in the number of NFA states. Let N be the set of NFA states, and let D be the set of DFA states. 1) for every d ∈ D, d ⊆ N; 2) D ⊂ 2^N; 3) |N| ≪ |D| ≪ |2^N| = 2^|N|.
Notes: |N| ≪ |D| is due to the state explosion problem.
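To illustrate how |D| can approach 2^|N|, here is a minimal Python sketch of the standard subset construction. The example NFA (strings over {a, b} whose k-th character from the end is a) and the names subset_construction and kth_from_end_nfa are assumptions chosen for illustration; they do not come from the slides.

```python
from collections import deque

def subset_construction(start, delta, alphabet):
    """Build the reachable DFA states from an NFA without epsilon moves.
    delta maps (nfa_state, char) -> set of next NFA states."""
    start_set = frozenset(start)
    seen = {start_set}
    queue = deque([start_set])
    table = {}
    while queue:
        d = queue.popleft()
        for c in alphabet:
            # A DFA state is the set of NFA states active at the same time.
            nxt = frozenset(t for s in d for t in delta.get((s, c), ()))
            table[(d, c)] = nxt
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen, table

def kth_from_end_nfa(k):
    """Hypothetical example NFA with k + 1 states (0..k)."""
    delta = {(0, "a"): {0, 1}, (0, "b"): {0}}
    for i in range(1, k):
        for c in "ab":
            delta[(i, c)] = {i + 1}
    return delta

states, _ = subset_construction({0}, kth_from_end_nfa(3), "ab")
print(len(states))  # 8 DFA states from 4 NFA states
```

Each extra NFA state in this family doubles the number of reachable DFA states, which is the state explosion the slide refers to.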

5 The Conceptual DFA/EC
The reasons why NFAs cause state explosion: 1) The states 2, 6, and 7 are more likely to be active; 2) A frequently active NFA state is more likely to be active simultaneously with other sets of states.
Notes: This state explosion problem makes the DFA too large; DFA/EC is proposed to solve this problem.

6 The Conceptual DFA/EC
In DFA/EC, we select some of the most frequently active NFA states and incorporate them into the character-set (or alphabet) of the DFA to form a slightly larger extended character-set. Three terms in DFA/EC: 1) main DFA (D1): the DFA within a DFA/EC that implements the remaining, infrequently active NFA states; 2) complementary states (N2): the NFA states that are selected and incorporated into the character-set; 3) main states (N1): the remaining NFA states.
Notes: Briefly explain how to tell these three apart.
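The split into main and complementary states can be pictured with a small Python sketch. It assumes that a DFA state is represented as a set of NFA states and that the complementary set N2 has already been chosen; the example states and the name split_state are hypothetical.

```python
def split_state(d, n2):
    """Split a DFA state d (a set of NFA states) into (d1, d2):
    d1 keeps the main states, d2 keeps the complementary states."""
    d = frozenset(d)
    n2 = frozenset(n2)
    return d - n2, d & n2

# Hypothetical DFA states over NFA states 0..2, where state 1 is frequently active.
dfa_states = [{0}, {0, 1}, {0, 2}, {0, 1, 2}]
n2 = {1}

# Distinct main parts: these are the only states the main DFA has to store.
main_parts = {split_state(d, n2)[0] for d in dfa_states}
print(len(dfa_states), len(main_parts))  # 4 2
```

In this toy case, moving one frequently active state out of the DFA states halves the number of states left for the main DFA.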

7 The Conceptual DFA/EC
1) 0, 4

8 The Formal Model of DFA/EC
Theorem 1: For any DFA, there exists a DFA/EC. A DFA/EC can be defined by (D1, D2, C, Ce, He, Te, M), where:
D1: set of simultaneously active sets of main states
D2: set of simultaneously active sets of complementary states
C: original character-set (or alphabet)
Ce: extended character-set
D: set of conventional DFA states

9 The Formal Model of DFA/EC
He: generates an extended character ce ∈ Ce. Te: generates a pair of partial states. M: generates another partial state. d1 is the next state of the main DFA; d2 is the next state of the complementary program.
Notes: H and M are applied in the complementary program; T is applied in the transition table.
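A per-byte inspection loop built from these three functions might look like the following Python sketch. The signatures He(c, d2), a table TE keyed by (d1, ce), and M(d2, c) are assumptions that match the slide's descriptions, not the authors' exact definitions. The toy instance matches the pattern "ab" anywhere in the payload, with NFA states 0 (start), 1 (seen "a"), and 2 (match), and with state 1 chosen as the only complementary state; the "+" suffix used to encode extended characters is likewise hypothetical.

```python
EMPTY = frozenset()
S0 = frozenset({0})       # main DFA state: only NFA state 0 is active
S02 = frozenset({0, 2})   # main DFA state: NFA states 0 and 2 are active (match)
C1 = frozenset({1})       # complementary state 1 is active

def He(c, d2):
    """Extended character: the input character, marked when state 1 is active."""
    return c + "+" if d2 else c

# Transition table of the main DFA: (d1, ce) -> (next d1, part of next d2).
TE = {
    (S0, "a"): (S0, C1),   (S0, "b"): (S0, EMPTY),
    (S0, "a+"): (S0, C1),  (S0, "b+"): (S02, EMPTY),
    (S02, "a"): (S0, C1),  (S02, "b"): (S0, EMPTY),
    (S02, "a+"): (S0, C1), (S02, "b+"): (S02, EMPTY),
}

def M(d2, c):
    """Transitions that stay inside the complementary states (none in this toy)."""
    return EMPTY

def inspect(payload):
    d1, d2 = S0, EMPTY
    matched = False
    for c in payload:
        ce = He(c, d2)                   # computed by the complementary program
        d1, from_table = TE[(d1, ce)]    # the single table lookup for this byte
        d2 = from_table | M(d2, c)       # right-hand side still uses the old d2
        matched = matched or 2 in d1
    return matched

print(inspect("bbab"), inspect("ba"))  # True False
```

The extended character-set here has four symbols instead of two, while the table holds only two main DFA states.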

10 The Formal Model of DFA/EC
A DFA defined by (D, C, T), with T: D × C → D, can be equivalent to another form of DFA, (D1, D2, C, T11, T12, T21, T22). T11 is a transition function that returns the set of newly active NFA states in D1 that are activated through transitions from the set of previously active states d1 ∈ D1 on character c. d1 and d2 are sets of simultaneously active NFA states.
Notes: D1 ∩ D2 is empty and D1 ∪ D2 is D; d1 ∪ d2 is d.
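The four-function form can be checked on a small example. The Python sketch below assumes that T12, T21, and T22 are defined by analogy with T11 (transitions from d1 into complementary states, from d2 into main states, and from d2 into complementary states); the slide only spells out T11, so the other three definitions, the toy NFA, and all function names are assumptions.

```python
from itertools import combinations

def step(delta, states, c):
    """One step of ordinary NFA simulation on character c."""
    return frozenset(t for s in states for t in delta.get((s, c), ()))

def split_step(delta, n1, d1, d2, c):
    """One step of the split form; returns (next d1, next d2)."""
    from_d1 = step(delta, d1, c)
    from_d2 = step(delta, d2, c)
    t11, t12 = from_d1 & n1, from_d1 - n1   # targets reached from main states
    t21, t22 = from_d2 & n1, from_d2 - n1   # targets reached from complementary states
    return t11 | t21, t12 | t22

def check_equivalence(delta, states, n1, alphabet):
    """Compare the split step with the ordinary step on every subset of states."""
    n1 = frozenset(n1)
    for size in range(len(states) + 1):
        for combo in combinations(sorted(states), size):
            d = frozenset(combo)
            for c in alphabet:
                d1, d2 = split_step(delta, n1, d & n1, d - n1, c)
                if d1 | d2 != step(delta, d, c):
                    return False
    return True

# Toy NFA for the pattern "ab": main states {0, 2}, complementary state {1}.
delta = {(0, "a"): {0, 1}, (0, "b"): {0}, (1, "b"): {2}}
print(check_equivalence(delta, {0, 1, 2}, {0, 2}, "ab"))  # True
```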

11 An Efficient Implementation
Two constraints on the complementary states: 1) the non-conflicting constraint; 2) the binary constraint. The purpose of these constraints is to reduce the range of function He. A large extended character-set would undermine the advantage of reducing the number of states D1 in the main DFA.
Notes: After proving the equivalence, we analyze the constraints on complementary states and how to select the complementary states.

12 An Efficient Implementation
Non-conflicting constraint: under the non-conflicting constraint, for a given c, there can be at most one complementary state n2 in d2 that has a transition to one or several main states {n1}.
Notes: The complementary states must not share the same next state.
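The non-conflicting constraint, as worded on this slide, can be tested for one set of active complementary states and one character. The Python sketch below assumes the NFA is given as a dictionary from (state, char) to a set of next states; the example transitions and the name is_non_conflicting are hypothetical.

```python
def is_non_conflicting(delta, n1, d2, c):
    """True if at most one complementary state in d2 has a transition
    on character c into one or more main states (members of n1)."""
    n1 = frozenset(n1)
    sources = [n2 for n2 in d2 if n1 & frozenset(delta.get((n2, c), ()))]
    return len(sources) <= 1

# Hypothetical NFA fragment: main states {0, 2, 4}, complementary states {1, 3}.
delta = {(1, "b"): {2}, (3, "b"): {4}, (3, "a"): {3}}
main_states = {0, 2, 4}

print(is_non_conflicting(delta, main_states, {1, 3}, "b"))  # False
print(is_non_conflicting(delta, main_states, {1, 3}, "a"))  # True
```

On "b", both complementary states lead into main states, so the pair violates the constraint; on "a", the only transition stays inside the complementary states.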

13 An Efficient Implementation
Binary constraint: the binary constraint is in terms of the transitions within N2, while the non-conflicting constraint concerns transitions from states in N2 to states in N1.
Notes: Something here seems wrong.

14 An Efficient Implementation
Determine the complementary states by the independent-state method. 1) The first step is to estimate the extent to which each NFA state causes state explosion; 2) The second step is to determine the complementary states based on the results of the first step and the two constraints.
Notes: Without these two constraints, an overly large extended character-set would undermine the advantage of reducing D1.

15 An Efficient Implementation
The number of states of a DFA constructed from an NFA depends on the level of independence among the states in the NFA: 1) if no pair of states in the NFA is independent, the size of the DFA equals that of the NFA; 2) if every pair of states in the NFA is independent, the size of the DFA is 2^|N|, where |N| is the size of the NFA. We measure the level of independence of an NFA state among other NFA states by the number of times it appears in a pair of independent states, which we call the independent number of the state.
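Counting independent numbers is straightforward once the independent pairs are known. The Python sketch below takes the list of independent pairs as given input, because this transcript does not say how independence between two NFA states is tested; the example pairs and the name independent_numbers are hypothetical.

```python
from collections import Counter

def independent_numbers(independent_pairs):
    """Map each NFA state to the number of independent pairs it appears in."""
    counts = Counter()
    for a, b in independent_pairs:
        counts[a] += 1
        counts[b] += 1
    return counts

# Hypothetical independent pairs among NFA states 1..4.
pairs = [(1, 2), (1, 3), (2, 3), (1, 4)]
numbers = independent_numbers(pairs)
print(numbers[1], numbers[2], numbers[3], numbers[4])  # 3 2 2 1
```

State 1 has the largest independent number in this example, which would make it the first candidate to examine when choosing complementary states.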

16 An Efficient Implementation
Notes: The priority serves as a penalty value used to rule out a large number of states.

17 An Efficient Implementation
Notes: Construction of the main DFA states.

18 Evaluation
We developed several compilers, which read files of rules and create the corresponding inspection programs and transition tables. We extracted rule-sets from the Snort rules. We developed a synthetic payload generator. We generate the inspection programs for the rule-sets, measure their storage, and run them on the synthetic payloads to measure their performance. Results: 1) on storage size; 2) on memory bandwidth and speed.

19 Evaluation The total number of states (percentage to DFA)
The significant reduction is due to the removal of the frequently active complementary states in DFA/EC.
Notes: The odd-looking names on the left side of the table are the Snort rule-sets.

20 Evaluation The total number of transitions (percentage to DFA)
The total number of transitions is the sum of the number of transitions of every state.

21 Evaluation The transition storage (Bits/Percentage to DFA)
The total minimum memory (storage) requirement of the transition tables in terms of bits.

22 Evaluation The size of the per-flow state (Bits)
We measure the sizes of the per-flow state of the inspection programs in terms of bits and words. The per-flow states for DFA, MDFA, and DFA/EC are [...], and N2

23 Evaluation Memory bandwidth (bits) with different rule-sets
The memory bandwidths of DFA, MDFA, and DFA/EC are [...]. The figure shows that the memory bandwidth of DFA/EC is very close to that of DFA and is much smaller than that of MDFA.
Notes: Memory bandwidth is the amount of memory accesses per byte in the payload.

24 Evaluation Memory accesses (times/KB)
The number of main memory accesses per KB of payload. DFA/EC and DFA have the minimum number of main memory accesses; that of MDFA increases in proportion to M.

25 Evaluation Inspection speed (Java)
We measure the speed of the inspection programs with both Java and C++ implementations on a Unix machine with 16 GB of 1333 MHz DDR3 memory and a 2.66 GHz Intel Core i5 CPU. The speeds of the inspection programs depend on the hardware and software on which they are implemented.

26 Evaluation
Figure panels: Java, C++. MDFA is fast because of its compact transition table size and the relatively large amount of cache memory in our platform.

27 Conclusion
DFA/EC is a regular-expression-based deep packet inspection algorithm for general-purpose processors. This solution requires only a single main memory access for each byte in the traffic payload. DFA/ECs are very compact, have a smaller memory bandwidth, and run faster than DFAs. We will combine DFA/EC with existing transition compression and character-set compression techniques, and perform experiments with more rule-sets.

