Implementation of Machine Learning and Chaos Combination for Improving Attack Detection Accuracy on Intrusion Detection System (IDS) Bisyron Wahyudi Kalamullah.


1 Implementation of Machine Learning and Chaos Combination for Improving Attack Detection Accuracy on Intrusion Detection System (IDS)
Bisyron Wahyudi, Kalamullah Ramli
Department of Electrical Engineering, Universitas Indonesia

2 Network Security

3 Background: IDS
• The most important element in network security: IDS
• Intrusion detection principles:
  • Misuse detection (signature-based)
  • Anomaly detection (statistical)
  • Classification with machine learning (research)

4 Background: Problem
• Intrusion detection produces too many false alarms.
• New types of attack arise more and more often.
• An effective and adaptable detection method is required.
• Classification with machine learning gives the best results, but these depend on the kernel function and its parameters and on the network data attributes/features.
• There is no systematic theory for choosing the appropriate kernel and parameters.

5 Data Mining Approach for IDS
1. Capturing packets transferred on the network.
2. Extracting an extensive set of attributes/features of the network packet data that can describe a network connection or a host session.
3. Learning a model that can accurately describe the behavior of abnormal and normal activities by applying data mining techniques.
4. Detecting the intrusions by using the learned models.

6 Data Mining Approach
Classification (Supervised)   | Clustering (Unsupervised)
K Nearest Neighbor (K-NN)     | K-Means
Naïve Bayes                   | Hierarchical Clustering
Artificial Neural Network     | DBSCAN
Support Vector Machine        | Fuzzy C-Means
Fuzzy K-NN                    | Self Organizing Map

7 Machine Learning
• Input Training Data (x, y) → Learning Algorithm (Model Development)
• Input Test Data (x, ?) → Model Implementation → Output Test Data (x, y)
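The train-then-predict flow on this slide can be sketched with a 1-nearest-neighbor classifier (K-NN with K = 1, one of the supervised methods listed on slide 6). This is only an illustration under stated assumptions: the toy feature vectors, the labels and the function names `fit` and `predict` are hypothetical and do not come from the presentation.

```python
import math


def fit(training_data):
    """Model development: 1-NN simply memorizes the labeled pairs (x, y)."""
    return list(training_data)


def predict(model, x):
    """Model implementation: give x the label of its closest training point."""
    _, nearest_label = min(model, key=lambda pair: math.dist(pair[0], x))
    return nearest_label


# Hypothetical toy training data: (feature vector x, label y).
training_data = [
    ((0.1, 0.2), "normal"),
    ((0.2, 0.1), "normal"),
    ((0.9, 0.8), "attack"),
    ((0.8, 0.9), "attack"),
]

model = fit(training_data)

# Test data (x, ?) goes in, labeled pairs (x, y) come out.
test_inputs = [(0.15, 0.15), (0.85, 0.95)]
output = [(x, predict(model, x)) for x in test_inputs]
```

Any other learning algorithm, such as the SVM used later in the deck, would fit the same two-step shape: one call that builds a model from labeled data, one call that labels unseen data.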

8 SVM Classification

9 Kernel Function
Kernel Name                   | Definition of Function
Linear                        | $K(x,y) = x \cdot y$
Polynomial                    | $K(x,y) = (x \cdot y + c)^d$
Gaussian RBF                  | $K(x,y) = \exp\left(-\|x-y\|^2 / (2\sigma^2)\right)$
Sigmoid (Hyperbolic Tangent)  | $K(x,y) = \tanh(\sigma (x \cdot y) + c)$
Inverse Multiquadric          | $K(x,y) = 1 / \sqrt{\|x-y\|^2 + c}$
x and y: a pair of data points from the training dataset
$\sigma, c, d > 0$: constant parameters
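As a rough illustration, the five kernels in the table above can be written as plain Python functions. The default parameter values are arbitrary assumptions, and the inverse multiquadric is read here with c inside the square root, which is an interpretation of the slide rather than something it states explicitly.

```python
import math


def dot(x, y):
    """Inner product x . y of two equal-length vectors."""
    return sum(a * b for a, b in zip(x, y))


def sq_dist(x, y):
    """Squared Euclidean distance ||x - y||^2."""
    return sum((a - b) ** 2 for a, b in zip(x, y))


def linear_kernel(x, y):
    return dot(x, y)


def polynomial_kernel(x, y, c=1.0, d=2):
    return (dot(x, y) + c) ** d


def rbf_kernel(x, y, sigma=1.0):
    return math.exp(-sq_dist(x, y) / (2.0 * sigma ** 2))


def sigmoid_kernel(x, y, sigma=1.0, c=1.0):
    return math.tanh(sigma * dot(x, y) + c)


def inverse_multiquadric_kernel(x, y, c=1.0):
    # Assumption: c sits under the square root together with ||x - y||^2.
    return 1.0 / math.sqrt(sq_dist(x, y) + c)
```

For example, `linear_kernel([1, 2], [3, 4])` evaluates to 11, and `rbf_kernel(x, x)` is 1.0 for any `x`, since the distance of a point to itself is zero.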

10 SVM Performance
• How to choose the optimal/significant input dataset features.
• How to set the best kernel function and parameters: σ, ε and C.

11 Chaos
• Three important dynamic properties: intrinsic stochasticity, ergodicity and regularity
• Advantage of chaos: it can escape from local minima
• Its powerful global search ability makes it more efficient for obtaining optimized parameters
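The slide does not name a specific chaotic system. A common choice in chaos optimization is the logistic map x_{k+1} = r·x_k·(1 − x_k) with r = 4, and the sketch below assumes it purely for illustration. It shows two of the listed properties: the sequence is fully deterministic (regularity), yet it wanders over the whole interval (0, 1) (ergodicity) and tiny changes in the start value grow quickly (stochastic-looking behavior).

```python
def logistic_map(x0, n, r=4.0):
    """Return n successive iterates of x_{k+1} = r * x_k * (1 - x_k)."""
    xs = []
    x = x0
    for _ in range(n):
        x = r * x * (1.0 - x)
        xs.append(x)
    return xs


# Assumed start value; 0, 0.25, 0.5, 0.75 and 1 are avoided because
# they lead to fixed points of the map.
sequence = logistic_map(0.3, 1000)

# Ergodicity: which of ten equal-width bins of [0, 1] does the orbit visit?
bins_visited = {min(int(x * 10), 9) for x in sequence}

# Sensitivity: an orbit started 1e-9 away drifts far from the first one.
nearby = logistic_map(0.3 + 1e-9, 1000)
largest_gap = max(abs(a - b) for a, b in zip(sequence, nearby))
```

Because the orbit keeps visiting all regions of the interval instead of settling down, a search driven by it is not trapped near one candidate, which is the intuition behind the "escape from local minima" bullet.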

12 System Design

13 Methodology
• Stages: Data Collection → Data Preprocessing → Model Development → Data Classification
• Datasets: KDDCUP ’99 DARPA Dataset, Training Dataset, Test Dataset, Predicted Intrusion Data

14 Data Preprocessing
• Input: KDDCUP ’99 DARPA Dataset
• Steps: Dataset Transformation, Dataset Normalization, Range Discretization, Format Conversion, Dataset Division (Training & Test)
• Output: Training Dataset, Test Dataset
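The slide names the preprocessing steps without giving details, so the following is only a sketch of what each step might look like. The specific choices (integer codes for symbolic values, min-max scaling, equal-width bins, a shuffled split) and all function names are assumptions, not the authors' documented procedure.

```python
import random


def encode_symbols(values):
    """Transformation: map symbolic values to integer codes in order of first appearance."""
    codes = {}
    return [codes.setdefault(v, len(codes)) for v in values]


def min_max_normalize(values):
    """Normalization: rescale numeric values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def discretize(values, bins):
    """Range discretization: equal-width bin index for values already in [0, 1]."""
    return [min(int(v * bins), bins - 1) for v in values]


def split(rows, train_fraction=0.7, seed=0):
    """Dataset division: shuffle a copy of rows and cut it into training and test parts."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Hypothetical column values, loosely modeled on protocol_type and src_bytes.
protocol_codes = encode_symbols(["tcp", "udp", "tcp", "icmp"])
scaled_bytes = min_max_normalize([0, 5, 10])
binned_bytes = discretize(scaled_bytes, bins=4)
train_rows, test_rows = split(list(range(10)), train_fraction=0.5)
```

Format conversion is left out because it depends on the input format expected by the chosen SVM tool, which the slide does not specify.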

15 Model Development
• Input Training Data (x, y) → Kernel Function Selection, Parameter Selection with Chaos Optimization → Learning Algorithm (SVM)
• Input Test Data (x, ?) → Model Implementation → Output Test Data (x, y)
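The slide does not describe how the chaos-based parameter selection is carried out. One simple way such a search could work is sketched below, under clearly stated assumptions: each parameter is driven by its own logistic-map sequence, the chaotic value in (0, 1) is scaled into the parameter's range, and the best-scoring candidate is kept. The function `placeholder_score` is a hypothetical stand-in with a known optimum; in a real setup it would be replaced by something like the validation accuracy of an SVM trained with the candidate parameters.

```python
def chaos_search(objective, bounds, iterations=2000, seeds=(0.3, 0.41)):
    """Maximize objective over a box, using one logistic-map sequence per parameter."""
    state = list(seeds)
    best_params, best_score = None, float("-inf")
    for _ in range(iterations):
        # Advance every chaotic variable one step: s <- 4 s (1 - s).
        state = [4.0 * s * (1.0 - s) for s in state]
        # Scale each chaotic value from (0, 1) into its parameter range.
        params = [lo + s * (hi - lo) for s, (lo, hi) in zip(state, bounds)]
        score = objective(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


def placeholder_score(params):
    """Hypothetical objective: peaks at log10(C) = 1 and log10(sigma) = -1."""
    log_c, log_sigma = params
    return -((log_c - 1.0) ** 2 + (log_sigma + 1.0) ** 2)


# Assumed search ranges for log10(C) and log10(sigma).
bounds = [(-2.0, 3.0), (-3.0, 2.0)]
best_params, best_score = chaos_search(placeholder_score, bounds)
```

The seeds must differ and must not sum to 1, otherwise the two sequences coincide after one step and the search only explores a line instead of the whole box.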

16 Features in KDD Cup
• Features 1-9: intrinsic features extracted from the packet header
• Features 10-22: content attributes obtained from expert knowledge of the packet
• Features 23-31: content attributes from connections in the previous 2 seconds
• Features 32-41: machine traffic attributes obtained from the previous 100 connections
• Payload feature: payload based on time (week)

17 Intrinsic Attributes: These attributes are extracted from the header area of the network packets.

18 Content Attributes: These attributes are extracted from the content area of the network packets, based on expert knowledge.

19 Time Traffic Attributes: To calculate these attributes, we considered the connections that occurred in the past 2 seconds.

20 Machine Traffic Attributes: To calculate these attributes, we took into account the previous 100 connections.
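The two windowed attribute groups on slides 19 and 20 can be illustrated with one counter each: a time window of 2 seconds and a history window of 100 connections. The sketch assumes connections are given as time-ordered `(timestamp, destination_host)` pairs and that the current connection is counted too; both the data layout and that inclusion are assumptions made for this example.

```python
def count_last_two_seconds(connections, index, window=2.0):
    """Connections to the same destination host within `window` seconds up to and including the current one."""
    now, host = connections[index]
    return sum(
        1
        for ts, h in connections[: index + 1]
        if h == host and now - ts <= window
    )


def dst_host_count(connections, index, history=100):
    """Connections to the same destination host among the last `history` connections, current one included."""
    start = max(0, index - history + 1)
    _, host = connections[index]
    return sum(1 for _, h in connections[start : index + 1] if h == host)


# Hypothetical connection log: (timestamp in seconds, destination host).
connections = [
    (0.0, "A"),
    (0.5, "B"),
    (1.0, "A"),
    (2.5, "A"),
    (4.0, "A"),
]

time_based = count_last_two_seconds(connections, 3)  # window ends at t = 2.5
host_based = dst_host_count(connections, 4)          # looks at all 5 connections
```

The other attributes in these two groups (rates of errors, same-service shares and so on) follow the same pattern: pick the window, filter the connections in it, then count or average.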

21 Network Traffic Classification

22 Selected Features: The eight features used in previous work by Mukkamala are: src_bytes, dst_bytes, count, srv_count, dst_host_count, dst_host_srv_count, dst_host_same_src_port_rate, dst_host_srv_diff_host_rate.

23 Selected Features: The 24 features used in previous work by Natesan are: duration, protocol_type, service, flag, src_bytes, dst_bytes, hot, num_failed_logins, logged_in, num_compromised, root_shell, num_root, num_file_creations, num_shells, num_access_files, is_host_login, is_guest_login, count, serror_rate, rerror_rate, diff_srv_rate, dst_host_count, dst_host_diff_srv_rate, dst_host_srv_serror_rate.
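Applying one of the two feature subsets from slides 22 and 23 amounts to projecting each connection record onto the chosen columns. The sketch below writes the names in lowercase underscore form, on the assumption that this matches the KDD Cup '99 column names; the `project` helper and the sample record are hypothetical.

```python
# Eight features attributed to Mukkamala on slide 22.
MUKKAMALA_FEATURES = [
    "src_bytes", "dst_bytes", "count", "srv_count",
    "dst_host_count", "dst_host_srv_count",
    "dst_host_same_src_port_rate", "dst_host_srv_diff_host_rate",
]

# Twenty-four features attributed to Natesan on slide 23.
NATESAN_FEATURES = [
    "duration", "protocol_type", "service", "flag", "src_bytes", "dst_bytes",
    "hot", "num_failed_logins", "logged_in", "num_compromised", "root_shell",
    "num_root", "num_file_creations", "num_shells", "num_access_files",
    "is_host_login", "is_guest_login", "count", "serror_rate", "rerror_rate",
    "diff_srv_rate", "dst_host_count", "dst_host_diff_srv_rate",
    "dst_host_srv_serror_rate",
]


def project(record, feature_names):
    """Return the values of the selected features, in the order given."""
    return [record[name] for name in feature_names]


# Hypothetical record that carries (at least) the selected columns.
record = {name: 0 for name in NATESAN_FEATURES + MUKKAMALA_FEATURES}
record["src_bytes"] = 181
record["dst_bytes"] = 5450

selected = project(record, MUKKAMALA_FEATURES)
```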

24 Proposed Features

25

26 Data Pre-processing

27 Simulation Experiment

28 Simulation Process Design

29 Experiment Result
• Using the payload feature can improve the accuracy of the IDS in detecting R2L attacks.
• Using SVM with the RBF kernel, the detection accuracy reaches up to 98.2%.
• Based on the experiments, the best average detection rate across the feature sets is obtained with 28 features with payload.
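The slide reports accuracy and detection figures without defining them. A common reading, assumed here, is accuracy = correctly classified records / all records, and detection rate for a class = correctly detected records of that class / all records of that class. The labels and predictions below are made up for illustration and are not experiment data.

```python
def accuracy(y_true, y_pred):
    """Share of records whose predicted label equals the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def detection_rate(y_true, y_pred, attack_label):
    """Share of records of class attack_label that were predicted as that class."""
    predictions_for_class = [p for t, p in zip(y_true, y_pred) if t == attack_label]
    if not predictions_for_class:
        return 0.0
    return sum(p == attack_label for p in predictions_for_class) / len(predictions_for_class)


# Hypothetical labels, not taken from the experiment.
y_true = ["normal", "r2l", "r2l", "dos"]
y_pred = ["normal", "r2l", "normal", "dos"]

overall = accuracy(y_true, y_pred)
r2l_rate = detection_rate(y_true, y_pred, "r2l")
```

Reporting the per-class rate next to the overall accuracy matters for rare classes such as R2L, where a classifier can score high overall while missing most of the attacks.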

