Published by Cheyanne Fogg. Modified over 2 years ago.

1
1 Exact pattern matching on resource-limited network devices
Chien-Chung Su, 2002/12/10

2
2 Outline
Problem definition
Resource-limited network devices
Introduction of SEBMH
Disadvantages of SEBMH
Adaptive bucket management
Conclusion

3
3 Problem definition
Given
–P : pattern(s)
–T : text
General action
–Find all occurrences of P in T
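The problem statement above can be made concrete with a naive baseline. This is only a sketch for illustration: the function names `find_all` and `find_all_multi` are hypothetical, and the code is not SEBMH, just the simplest correct way to report every occurrence of each pattern in a text.

```python
def find_all(pattern: str, text: str) -> list[int]:
    """Return every start index where pattern occurs in text (overlaps included)."""
    positions = []
    start = text.find(pattern)
    while start != -1:
        positions.append(start)
        # Resume one character later so overlapping occurrences are found too.
        start = text.find(pattern, start + 1)
    return positions


def find_all_multi(patterns: list[str], text: str) -> dict[str, list[int]]:
    """Run the single-pattern search once per pattern in the set P."""
    return {p: find_all(p, text) for p in patterns}


if __name__ == "__main__":
    print(find_all_multi(["ana", "xyz"], "bananas"))
```

The cost of this baseline grows with the number of patterns, which is the motivation for the shift tables and bucket structures discussed in the later slides.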

4
4 Research for exact pattern matching
The exact matching problem is solved for typical word-processing applications.
The story changes radically for other, more specific applications.
–DNA and protein search
–Relation between search performance and database size
–Network intrusion detection

5
5 Resource-limited network devices
Special issues
–Security issues
  Check whether P occurs in T
–Resource-limited
  Try to break the tradeoff between speed and space
Characteristics
–Network-related pattern matching
  Patterns change occasionally
  Texts change frequently
–Solutions
  Dynamic hash function
  Adaptive bucket management

6
6 SEBMH
Global Shift Table
Hash-Link-List Structure of ASCII Patterns
Hash-Link-List Structure of non-ASCII Patterns
Input Mask

7
7 Set-Exclusive table

8
8 Disadvantages of SEBMH
Because the hash function is static, performance still depends on the pattern set.
–Dynamic hash function
As in the general pattern matching problem, the global shift values approach 1 as the number of patterns grows.
–Classify the patterns to reduce this effect
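The shrinking-shift effect can be illustrated with a set-wise Horspool-style shift table. The slides do not give the exact construction of the SEBMH Global Shift Table, so the sketch below is an assumption: it uses the shortest pattern length as the window and takes, for each byte, the minimum safe shift over all patterns. The names `global_shift_table` and `average_shift` are hypothetical.

```python
def global_shift_table(patterns: list[bytes]) -> list[int]:
    """Build one shift table shared by all patterns (set-wise Horspool style).

    Assumes a non-empty pattern set. The window length m is the length of the
    shortest pattern; only the first m bytes of each pattern are considered.
    """
    m = min(len(p) for p in patterns)
    table = [m] * 256  # bytes that never occur allow a full-window shift
    for p in patterns:
        for j in range(m - 1):
            # A byte at position j permits a shift of at most m - 1 - j.
            table[p[j]] = min(table[p[j]], m - 1 - j)
    return table


def average_shift(table: list[int]) -> float:
    """Mean shift over all 256 byte values (assumes uniformly random input)."""
    return sum(table) / len(table)


if __name__ == "__main__":
    few = global_shift_table([b"attack"])
    many = global_shift_table([b"attack", b"passwd", b"select"])
    print(average_shift(few), average_shift(many))
```

Each added pattern can only lower entries in the shared table, so the average shift never increases as the set grows. Splitting the patterns into classes gives each class its own, less crowded table.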

9
9 How to improve
Pattern classifier
Approximate perfect hash function
Adaptive bucket management

10
10 Step 1. Sort the class target patterns by KEY
Step 2. Distribute the class target patterns equally into the buckets
n = BUCKET_NUM; i = 0; while (pattern is not the last one) { for (i=0 ; i
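The pseudocode on this slide is cut off in the transcript, so the following is a sketch of the two stated steps only, not a reconstruction of the original loop. It assumes patterns are plain strings and that KEY is whatever sort key the caller supplies; the function name `distribute` is hypothetical.

```python
def distribute(patterns: list[str], bucket_num: int, key=None) -> list[list[str]]:
    """Sort patterns by key, then split them into bucket_num nearly equal buckets."""
    ordered = sorted(patterns, key=key)          # Step 1: sort by KEY
    base, extra = divmod(len(ordered), bucket_num)
    buckets = []
    start = 0
    for i in range(bucket_num):                  # Step 2: equal distribution
        # The first `extra` buckets take one more pattern, so sizes differ by at most 1.
        size = base + (1 if i < extra else 0)
        buckets.append(ordered[start:start + size])
        start += size
    return buckets


if __name__ == "__main__":
    print(distribute(["GET", "POST", "HEAD", "PUT", "DELETE"], 2))
```

Because the buckets are contiguous ranges of the sorted order, a lookup only needs the bucket boundaries to find the right bucket, which is one plausible way to approximate a perfect hash.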

11
11 Approximate hash function (2)

12
12 Adaptive bucket management
Assumption
–Resource is limited
–Total number of buckets is fixed
Step 1 : classify the patterns
–For example (feature is a factor)
  Class A
  Class B
  Class C

13
13 Adaptive bucket management
Step 2 : allocate buckets
–For example
  Traffic distribution
  –Class A : 50%
  –Class B : 30%
  –Class C : 20%
  Policy
  –SEBMH(Class A) could get more buckets at this time
  –Set-Exclusive table will be more effective
    »bucket ↑, pattern per bucket ↓, efficacy of set-exclusive table ↑
    »bucket ↓, set-exclusive utilization ↑

14
14 How to allocate buckets
Communism
Fair
Greedy

15
15 Basic assumption
Assumption
–Φ : matching time for one pattern
–B : total number of buckets
–P : total number of patterns
–C : number of classes
–Bi : number of buckets for class i
–Pi : number of patterns for class i
–Di : traffic distribution of class i
Known
–P1 + P2 + … + Pc = P
–D1 + D2 + … + Dc = 1
Problem
–Find a sequence (B1, B2, …, Bc) with B1 + B2 + … + Bc = B such that the average matching time is small enough

16
16 Communism Method
ABM is not applied
Without ABM
–No classifier is needed
–Average matching time :
–Other overheads
  Overheads of approximate perfect hashing
  Efficacy of Global-Shift table is not obvious
  Efficacy of Set-Exclusive table is not obvious

17
17 Fair Method
At least one solution
For example
–Traffic distribution
  Class A : 50%
  Class B : 30%
  Class C : 20%
With ABM in Fair Method
–Average matching time :
–Example:
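The formula for the Fair Method did not survive in the transcript. One reading, assumed here and not confirmed by the slides, is that each class receives buckets in proportion to its traffic share Di. The sketch below uses integer traffic weights (for example percentages) and a largest-remainder rule to hand out leftover buckets; the function name `fair_allocation` is hypothetical.

```python
def fair_allocation(total_buckets: int, traffic: dict[str, int]) -> dict[str, int]:
    """Split total_buckets across classes in proportion to integer traffic weights."""
    total = sum(traffic.values())
    # Integer floor of B * Di for every class.
    alloc = {c: total_buckets * w // total for c, w in traffic.items()}
    leftover = total_buckets - sum(alloc.values())
    # Hand the remaining buckets to the classes with the largest remainders.
    by_remainder = sorted(
        traffic, key=lambda c: (total_buckets * traffic[c]) % total, reverse=True
    )
    for c in by_remainder[:leftover]:
        alloc[c] += 1
    return alloc


if __name__ == "__main__":
    print(fair_allocation(10, {"A": 50, "B": 30, "C": 20}))
```

Integer weights avoid floating-point rounding when the shares do not divide the bucket count evenly. Note that this rule can give a class zero buckets when its share is very small, which a real allocator would need to guard against.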

18
18 Greedy Method
We can find better solutions
For example
  Class   Traffic distribution   Pattern distribution
  A       50%                    5
  B       30%                    5
  C       20%                    20
With ABM in Greedy Method
–Average matching time :
–Example
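To see that allocations better than a traffic-proportional split can exist for this example, the sketch below searches all allocations exhaustively. It is not the greedy procedure itself, and the cost model is an assumption because the slide's formula is missing: the average matching time is taken as proportional to the sum over classes of Di · Pi / Bi, with Φ treated as a constant factor. The names `average_cost` and `best_allocation` are hypothetical.

```python
from fractions import Fraction
from itertools import product


def average_cost(buckets, traffic, patterns) -> Fraction:
    """Assumed cost model: sum of Di * Pi / Bi (traffic given as integer percentages)."""
    return sum(Fraction(d * p, b) for b, d, p in zip(buckets, traffic, patterns))


def best_allocation(total_buckets: int, traffic, patterns):
    """Try every allocation that gives each class at least one bucket."""
    best, best_cost = None, None
    for head in product(range(1, total_buckets + 1), repeat=len(traffic) - 1):
        last = total_buckets - sum(head)
        if last < 1:
            continue  # the final class must also receive a bucket
        candidate = head + (last,)
        cost = average_cost(candidate, traffic, patterns)
        if best_cost is None or cost < best_cost:
            best, best_cost = candidate, cost
    return best


if __name__ == "__main__":
    traffic, patterns = (50, 30, 20), (5, 5, 20)
    print(best_allocation(10, traffic, patterns))
    print(average_cost((5, 3, 2), traffic, patterns))
```

Under this assumed model, with 10 buckets the traffic-proportional split (5, 3, 2) is beaten by allocations that give the pattern-heavy Class C more buckets. Exhaustive search is only practical for a handful of classes, which is why a cheaper heuristic is of interest.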

19
19 20021112 Experiment report

20
20 Objective
Observe how the optimal solutions are distributed
Try to derive an algorithm for finding them from these observations

21
21 Traffic dist. directly proportional to pattern dist. (figure panels: Bucket = 10, Bucket = 30)

22
22 Traffic dist. inversely proportional to pattern dist. (figure panels: Bucket = 10, Bucket = 30)

23
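23 Conclusion
An improvement appears only when the pattern and traffic distributions are inversely proportional; this can serve as a reference for training the classifier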

24
24 Greedy Algorithm (temp)
Step 1 : get the Bi from the fair method
Step 2 : borrow 1 bucket from each class
–bonus_bucket = # of classes
Step 3 : dispatch the bonus buckets
–Bonus i = floor (bonus_bucket * (Pi / P))
Step 4 : dispatch the remainder buckets
–Add a bucket to each class in turn and keep the best solution, one bucket at a time
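The four steps above can be sketched as follows. Two things are assumptions rather than content from the slides: the cost model (sum of Di · Pi / Bi, with Φ constant), which is needed to decide what "best" means in Step 4, and the requirement that every class starts with at least 2 buckets so that borrowing one never leaves a class empty. The fair allocation from Step 1 is passed in as input; the function names are hypothetical.

```python
from fractions import Fraction


def cost(buckets, traffic, patterns) -> Fraction:
    """Assumed cost model: sum of Di * Pi / Bi (traffic as integer percentages)."""
    return sum(Fraction(d * p, b) for b, d, p in zip(buckets, traffic, patterns))


def greedy_allocation(fair, traffic, patterns):
    """Refine a fair allocation by re-dispatching one borrowed bucket per class."""
    if min(fair) < 2:
        raise ValueError("assumption: every class needs at least 2 buckets to borrow from")
    total_patterns = sum(patterns)
    buckets = [b - 1 for b in fair]              # Step 2: borrow 1 bucket per class
    bonus = len(fair)                            # bonus_bucket = number of classes
    for i, p in enumerate(patterns):             # Step 3: Bonus_i = floor(bonus * Pi / P)
        buckets[i] += bonus * p // total_patterns
    remainder = sum(fair) - sum(buckets)
    for _ in range(remainder):                   # Step 4: place leftovers one by one
        def cost_if_added(i):
            trial = list(buckets)
            trial[i] += 1
            return cost(trial, traffic, patterns)
        buckets[min(range(len(buckets)), key=cost_if_added)] += 1
    return tuple(buckets)


if __name__ == "__main__":
    print(greedy_allocation((5, 3, 2), (50, 30, 20), (5, 5, 20)))
```

The total bucket count is preserved because exactly as many buckets are handed back in Steps 3 and 4 as were borrowed in Step 2.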

25
25 How to classify patterns (1)
The goals the classifier should achieve
–High priority
  Reduce how often ABM is performed
–Low priority
  Enhance the efficacy of ABM

26
26 How to classify patterns (2)
Reduce how often ABM is performed
–When ABM should not be performed for specific classes
  …….(1)
  …….(2)

27
27 How to classify patterns (3) Expected effect of and – ↑ – ↓ – ↑ – ↓

28
28 How to classify patterns (4)
Enhance the efficacy of ABM
–Try to arrange the classes so that
  Pi is increasing
  Di is decreasing

29
29 How to classify patterns (5)
Operators
–Combination
  Directly combine two classes in the same domain
–Sibling aggregation
  Combine two classes in different domains
(Figure: classification tree rooted at "patterns", with nodes TCP, UDP, Other, HTTP, FTP, …., TFTP, ICMP)
Objective
–Make the tree with the stable traffic tree ….
Constraint
–Many patterns sharing the same prefix within one class should form an independent class

30
30 How to classify patterns (6)
Mathematical model for training classifier
–Merge two classes when
  Conditions of means hold
  Conditions of variances hold
– are the same as previous meanings
–k (>=1) is a coefficient that could balance
  Resource [ k↑]
  Performance [ k ↓]

31
31 How to classify patterns (7) Conditions of means

32
32 How to classify patterns (7) Conditions of variances

33
33 Classifier
Advantages
–Reduce the impact of the complex approximate perfect hash function
–Eliminate unnecessary pattern matching

34
34 Classifier behavior
Does the input packet belong to any class?
–NO : bypass
–YES : dispatch the input packet to the corresponding handler
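The decision above can be sketched as a small dispatcher. Everything concrete here is hypothetical: packets are modelled as dictionaries with a `proto` field, classes are protocol names, and each handler stands in for the per-class matcher (for example one SEBMH instance per class).

```python
from typing import Callable, Optional

Packet = dict  # hypothetical packet representation for this sketch


def dispatch(
    packet: Packet,
    classify: Callable[[Packet], Optional[str]],
    handlers: dict[str, Callable[[Packet], str]],
) -> Optional[str]:
    """Send the packet to its class handler, or bypass it if no class applies."""
    cls = classify(packet)
    handler = handlers.get(cls) if cls is not None else None
    if handler is None:
        return None  # NO: bypass, no pattern matching is performed
    return handler(packet)  # YES: hand over to the corresponding handler


def classify_by_protocol(packet: Packet) -> Optional[str]:
    """Toy classifier: the class is the packet's protocol field, if present."""
    return packet.get("proto")


if __name__ == "__main__":
    handlers = {"HTTP": lambda p: "matched by HTTP handler"}
    print(dispatch({"proto": "HTTP"}, classify_by_protocol, handlers))
    print(dispatch({"proto": "NTP"}, classify_by_protocol, handlers))
```

Packets that fall outside every class skip pattern matching entirely, which is the second advantage listed for the classifier.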

35
35 Next Experiments

36
36 Conclusion
