
Intrusion Detection Using Neural Networks and Support Vector Machine


1 Intrusion Detection Using Neural Networks and Support Vector Machine
IEEE WCCI 2002 (World Congress on Computational Intelligence), IJCNN (International Joint Conference on Neural Networks)
Srinivas Mukkamala, Guadalupe Janoski, Andrew Sung
Dept. of Computer Science, New Mexico Institute of Mining and Technology

2 Outline
Approaches to intrusion detection using neural networks and support vector machines
DARPA dataset
Neural networks
Support vector machines
Experiments
Conclusion and comments

3 Approaches
The key ideas are to:
discover useful patterns or features that describe user behavior on a system, and
use the set of relevant features to build classifiers that can recognize anomalies and known intrusions.
Neural networks and support vector machines are trained with normal user activity and attack patterns.
Significant deviations from normal behavior are flagged as attacks.

4 DARPA Data for Intrusion Detection
DARPA (Defense Advanced Research Projects Agency): an agency of the US Department of Defense responsible for developing new technology for use by the military.
The benchmark comes from a KDD (Knowledge Discovery and Data Mining) competition dataset designed by DARPA.
Attacks fall into four main categories:
DOS: denial of service
R2L: unauthorized access from a remote machine
U2R: unauthorized access to local superuser (root) privileges
Probing: surveillance and other probing

5 Features

6 Neural Networks
A biological neuron:
Dendrites: gather incoming signals.
Soma (cell body): combines the signals and decides whether to trigger.
Axon: carries the output signal.

7 Divide and Conquer
A single neuron multiplies its inputs X1, X2 by weights w1, w2 (INPUT and WEIGHT stages), sums them (Σ), and applies an activation against a threshold θ (ACTIVATION and OUTPUT stages). Its decision boundary is the line in the plane w1·X1 + w2·X2 - θ = 0.
(Figure: neurons N1 and N2 each draw one separating line through the data points A, B, C, D; a third neuron N3 combines their outputs out1 and out2 into out3, classifying regions that no single line can separate.)
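As a concrete illustration of this single-neuron rule, here is a minimal Python sketch; the weights, threshold, and sample point are made up for illustration and are not taken from the figure.

    import numpy as np

    def neuron(x, w, theta):
        # Weighted sum of the inputs (the INPUT, WEIGHT and summation stages),
        # then an activation that outputs +1 if the sum exceeds the threshold theta.
        s = np.dot(w, x)
        return 1 if s - theta > 0 else -1

    # The decision boundary is the line w1*X1 + w2*X2 - theta = 0.
    x = np.array([0.5, 2.0])                          # illustrative sample point
    print(neuron(x, w=np.array([1.0, -1.0]), theta=0.5))

Several such neurons, each drawing one line, can be combined by a further neuron, exactly as N1, N2 and N3 do in the figure.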

8 Feed Forward Neural Network (FFNN)
Two steps: (1) decide the architecture (layers 1, 2, 3, 4, …); (2) determine the weights automatically.
Each neuron uses the hyperbolic tangent activation: tanh(S) = (e^S - e^(-S)) / (e^S + e^(-S)).
In general, neuron Nj in layer l computes the signal Sj(l) = Σi wij(l)·xi(l-1) and outputs xj(l) = tanh(Sj(l)).
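A minimal sketch of one layer's computation as just described; the layer sizes and random weights are illustrative only.

    import numpy as np

    def layer_forward(x_prev, W):
        # Cumulated signal S_j^(l) = sum_i w_ij^(l) * x_i^(l-1) for every neuron j,
        # followed by the tanh activation x_j^(l) = tanh(S_j^(l)).
        S = W.T @ x_prev
        return np.tanh(S)

    x0 = np.array([1.0, 0.3, -0.7])          # outputs of layer 0 (the inputs)
    W1 = 0.1 * np.random.randn(3, 4)         # w_ij^(1): 3 inputs feeding 4 neurons
    x1 = layer_forward(x0, W1)               # outputs of layer 1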

9 How to minimize E(w)? → Stochastic Gradient Descent (SGD)
The network g(x) is a classifier composed of the weights w.
Given the training data, an error function E(w) measures how well g(x) fits the data.
SGD: w starts as small random values; for T iterations, update wnew ← wold - η·∇w(En), where η is the learning rate and En is the error on one training example.
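A minimal sketch of the SGD loop described above; grad_En stands for whatever per-example gradient the chosen network and error function define, so it is an assumption of this sketch rather than something specified on the slide.

    import numpy as np

    def sgd(grad_En, data, dim, eta=0.01, T=1000, seed=0):
        # w starts as small random values.
        rng = np.random.default_rng(seed)
        w = 0.01 * rng.standard_normal(dim)
        for _ in range(T):
            # Pick one training example at random and step against the gradient
            # of its error: w_new <- w_old - eta * grad_w(E_n).
            x_n, y_n = data[rng.integers(len(data))]
            w = w - eta * grad_En(w, x_n, y_n)
        return w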

10 Back Propagation Algorithm
Forward pass: for l = 1, 2, …, L, compute Sj(l) and xj(l) layer by layer.
Backward pass: for l = L, L-1, …, 1, compute the error terms δi(l) layer by layer.
The gradient of the error with respect to each weight wij(l) is then xi(l-1)·δj(l), which feeds the SGD update.
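A minimal sketch of one forward/backward pass for a two-layer tanh network with squared error; the layer count, loss, and shapes are illustrative, while the slide states the general L-layer recipe.

    import numpy as np

    def backprop_step(x, y, W1, W2, eta=0.01):
        # forward: for l = 1, 2 compute S_j^(l) and x_j^(l)
        S1 = W1.T @ x;  x1 = np.tanh(S1)
        S2 = W2.T @ x1; x2 = np.tanh(S2)
        # backward: for l = 2, 1 compute the delta terms (squared error (x2 - y)^2)
        d2 = 2 * (x2 - y) * (1 - x2**2)
        d1 = (W2 @ d2) * (1 - x1**2)
        # gradient of E w.r.t. w_ij^(l) is x_i^(l-1) * delta_j^(l); SGD update
        W2 -= eta * np.outer(x1, d2)
        W1 -= eta * np.outer(x, d1)
        return W1, W2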

11 Feed Forward NNet (summary)
Consists of layers 1, 2, …, L; weight wij(l) connects neuron i in layer l-1 to neuron j in layer l.
Each neuron forms the cumulated signal Sj(l) = Σi wij(l)·xi(l-1) and the activated output xj(l), often via tanh.
Training minimizes E(w) and determines the weights automatically with SGD (Stochastic Gradient Descent):
w starts as small random values.
For T iterations: forward, compute Sj(l) and xj(l); backward, compute δi(l); then update wnew ← wold - η·∇w(En).
Stop when the desired error rate is met.

12 Support Vector Machine
A supervised learning method.
Known as the maximum margin classifier: it finds the max-margin separating hyperplane.

13 SVM – hard margin
The separating hyperplane is <w, x> - θ = 0.
max over w, θ of the margin 2/∥w∥, subject to yn(<w, xn> - θ) ≧ 1 for all n,
which is equivalent to:
argmin over w, θ of (1/2)<w, w>, subject to yn(<w, xn> - θ) ≧ 1 for all n.

14 Quadratic programming
A quadratic programming (QP) solver computes
V* ← quadprog(A, b, R, q) = argmin over v of (1/2) Σi Σj aij·vi·vj + Σi bi·vi, subject to Σi rki·vi ≧ qk for every constraint k.
Adapt the SVM problem for quadratic programming: let V = [θ, w1, w2, …, wD], find A, b, R, q, and put them into the QP solver.
The objective (1/2)<w, w> becomes (1/2) Σd=1..D wd², and each constraint yn(<w, xn> - θ) ≧ 1 becomes (-yn)·θ + Σd=1..D yn·(xn)d·wd ≧ 1.

15 Adaptation (hard margin)
With V = [θ, w1, w2, …, wD] = [v0, v1, v2, …, vD]:
A, of size (1+D)×(1+D): a00 = 0, a0j = 0, ai0 = 0, and for i, j ≠ 0, aij = 1 if i = j and 0 if i ≠ j.
b, of size (1+D)×1: b0 = 0 and bi = 0 for i ≠ 0.
R, of size N×(1+D), one row per training example n: rn0 = -yn and rnd = yn·(xn)d for d > 0.
q, of size N×1: qn = 1.
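A minimal NumPy sketch of this construction, following the slide's quadprog(A, b, R, q) convention of minimizing (1/2)·vᵀAv + bᵀv subject to R·v ≧ q; the actual solver call depends on which QP library is used, so none is shown here.

    import numpy as np

    def hard_margin_qp(X, y):
        # X: (N, D) inputs, y: (N,) labels in {-1, +1}.
        # Variable vector V = [theta, w1, ..., wD].
        N, D = X.shape
        A = np.eye(1 + D)
        A[0, 0] = 0.0                    # theta is not penalized in (1/2)<w, w>
        b = np.zeros(1 + D)              # no linear term in the objective
        # One constraint row per example: (-y_n)*theta + sum_d y_n*(x_n)_d*w_d >= 1
        R = np.hstack([-y[:, None], y[:, None] * X])
        q = np.ones(N)
        return A, b, R, q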

16 SVM – soft margin
Allow possible training errors, with a tradeoff parameter c:
Large c: thinner margin, cares more about errors.
Small c: thicker margin, cares less about errors.
argmin over w, θ of (1/2)<w, w> + c·Σn ξn, where the slacks ξn measure the errors,
subject to yn(<w, xn> - θ) ≧ 1 - ξn and ξn ≧ 0 for all n.
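To see how the tradeoff c enters, here is a minimal sketch that evaluates the soft-margin objective for a candidate (w, θ); it uses the fact that, for a fixed hyperplane, the smallest feasible slack for each example is ξn = max(0, 1 - yn(<w, xn> - θ)). This only illustrates the objective, it is not the solver the authors used.

    import numpy as np

    def soft_margin_objective(w, theta, X, y, c):
        # Smallest feasible slack per example for this (w, theta).
        xi = np.maximum(0.0, 1.0 - y * (X @ w - theta))
        # (1/2)<w, w> + c * sum_n xi_n : a large c punishes errors more heavily.
        return 0.5 * w @ w + c * xi.sum()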

17 Adaptation (soft margin)
Same QP template, argmin of (1/2) Σi Σj aij·vi·vj + Σi bi·vi subject to Σi rki·vi ≧ qk, now with V = [θ, w1, w2, …, wD, ξ1, ξ2, …, ξN]:
A is (1+D+N)×(1+D+N), b is (1+D+N)×1, R is (2N)×(1+D+N), and q is (2N)×1 (N margin constraints plus N constraints ξn ≧ 0).

18 Primal form and Dual form
Primal form:
argmin over w, θ of (1/2)<w, w> + c·Σn ξn, subject to yn(<w, xn> - θ) ≧ 1 - ξn and ξn ≧ 0.
Variables: 1+D+N. Constraints: 2N.
Dual form:
argmin over α of (1/2) Σn Σm αn·yn·αm·ym·<xn, xm> - Σn αn, subject to 0 ≦ αn ≦ C and Σn yn·αn = 0.
Variables: N. Constraints: 2N+1.
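A minimal sketch of the dual's QP ingredients in NumPy: the quadratic term is built from the labels and the Gram matrix of inner products, the linear term is -Σn αn, and the box bounds 0 ≦ αn ≦ C come along separately (the equality constraint Σn yn·αn = 0 would also have to be passed to whichever QP solver is used).

    import numpy as np

    def dual_qp_ingredients(X, y, C):
        # Quadratic term: a_nm = y_n * y_m * <x_n, x_m>
        K = X @ X.T                                # Gram matrix of inner products
        A = (y[:, None] * y[None, :]) * K
        b = -np.ones(len(y))                       # linear term: - sum_n alpha_n
        lower, upper = np.zeros(len(y)), np.full(len(y), C)  # 0 <= alpha_n <= C
        return A, b, lower, upper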

19 Dual form SVM
Find the optimal α*, then use α* to solve for w* and θ.
αn = 0 → the example is correct or on the margin (not a support vector).
0 < αn < C → the example is on the margin (a free support vector, "free SV").
αn = C → the example is wrong or on the margin (a support vector at the bound).
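A minimal sketch of this recovery for the linear kernel used so far: w* is the α-weighted sum of the training points, and θ comes from any free support vector, for which yn(<w, xn> - θ) = 1 holds exactly.

    import numpy as np

    def recover_w_theta(alpha, X, y, C, tol=1e-8):
        w = (alpha * y) @ X                        # w* = sum_n alpha_n * y_n * x_n
        free = np.where((alpha > tol) & (alpha < C - tol))[0]
        n = free[0]                                # any free support vector (0 < alpha_n < C)
        theta = X[n] @ w - y[n]                    # from y_n * (<w, x_n> - theta) = 1
        return w, theta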

20 Nonlinear SVM
Nonlinear mapping X → Φ(X), e.g. {(x)1, (x)2} in R² → {1, (x)1, (x)2, (x)1², (x)2², (x)1·(x)2} in R⁶.
The dual only involves inner products, so we need the kernel trick:
argmin over α of (1/2) Σn Σm αn·yn·αm·ym·<Φ(xn), Φ(xm)> - Σn αn, subject to 0 ≦ αn ≦ C and Σn yn·αn = 0.
For the mapping above, the inner product <Φ(xn), Φ(xm)> corresponds (up to scaling of the features) to the polynomial kernel (1 + <xn, xm>)², which can be computed without ever forming Φ(x).
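A minimal sketch of the kernel trick for this degree-2 polynomial kernel: the kernel matrix is computed directly from the original inputs and simply replaces the Gram matrix in the dual QP of slide 18.

    import numpy as np

    def poly_kernel(X1, X2):
        # (1 + <x_n, x_m>)^2, computed without ever forming Phi(x) explicitly.
        return (1.0 + X1 @ X2.T) ** 2

    # In the dual, use poly_kernel(X, X) in place of X @ X.T; nothing else changes.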

21 Experiments
Pre-processing: automated parsers process the raw TCP/IP dump data into machine-readable form.
Training: 7312 training records (different types of attacks and normal data), each with 41 features.
Testing: 6980 testing records to evaluate the classifiers.
Support Vector Machine details: RBF kernel, C = 1000, 204 support vectors (29 free). Accuracy: 99.5%. Time spent: 17.77 sec.
Neural Network details: 3-layer FFNNets, scaled conjugate gradient descent, desired error rate = 0.001. Accuracy: 99.25%. Time spent: 18 min.
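A minimal sketch of how a comparable SVM run could be set up today with scikit-learn; the library choice, file names, and column layout are assumptions of this sketch, not details from the paper (the authors used their own SVM and neural-network software).

    import numpy as np
    from sklearn.svm import SVC

    # Assumed layout: each row holds the 41 features followed by a +1/-1 label.
    train = np.loadtxt("train_7312.csv", delimiter=",")
    test = np.loadtxt("test_6980.csv", delimiter=",")
    X_tr, y_tr = train[:, :41], train[:, 41]
    X_te, y_te = test[:, :41], test[:, 41]

    # RBF kernel with C = 1000, the settings reported on the slide.
    clf = SVC(kernel="rbf", C=1000)
    clf.fit(X_tr, y_tr)
    print("support vectors:", clf.n_support_.sum())
    print("test accuracy:", clf.score(X_te, y_te))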

22 Conclusion and Comments
Speed: SVM training is significantly shorter; the max-margin formulation helps avoid the "curse of dimensionality".
Accuracy: both approaches achieve high accuracy.
However, SVMs can only make binary classifications, while IDS requires multi-class identification.
Open question: how should the features be determined?

