Modeling IDS using hybrid intelligent systems

Modeling IDS using hybrid intelligent systems Peddabachigari et al. (2007) Presenter: Andy Tang

Background
Intrusion detection system (IDS): second line of defense, after authentication / encryption.
Two main types:
  Misuse intrusion: well-defined patterns of attack, encoded in advance.
  Anomaly intrusion: deviation from baseline behaviors, approximated by differences.
Machine learning (ML) paradigm of IDS: model intrusion detection as a binary classification task with input features from various sensors.
  Neural nets (NN)
  Genetic programming (GP)
  Fuzzy logic (FL)
  Decision trees (DT)
  Support vector machines (SVM)

General IDS model

General IDS model
Sequence of events E(1) – E(2) – … – E(n) ⇒ classification at E(n)

General IDS model
Updated rules (statistical or rule-based) dependent on recent behaviors.

General IDS model
Updates baseline behaviors.

General IDS model
Some limitations:
  A misuse-detection paradigm; requires complementary anomaly detection.
  Rule-based (ES), with limited representation power w.r.t. transition patterns.
  Does not scale with complex (i.e., non-linear) sequences of behaviors.
  Depends on the quality and quantity of pre-defined rules.

ML style of IDS
Goal: represent characteristics of intrusion behavior by adaptive behavioral “models”.
  End-to-end pipeline from input data streams to classification rules.
  Direct feedback from empirical performance to adjust extracted patterns and model parameters.

ML style of IDS
Neural nets (NN)
  Inputs: a sliding window of w previous records.
  Output: classification rule.
  Learning mechanism: backpropagation by vector-Jacobian products.
  “Hidden layers” are intermediate transformed feature spaces ≈ behavioral patterns.
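As a rough illustration of the sliding-window input described on this slide, the sketch below flattens the w most recent records into one input vector per time step. The function name `sliding_windows` and the choice to include the current record in the window are assumptions made for the example; the slides do not specify either.

```python
from typing import List, Sequence


def sliding_windows(records: Sequence[Sequence[float]], w: int) -> List[List[float]]:
    """Build one flattened input vector per time step from the w most recent records.

    Assumption: the window ends at (and includes) the current record.
    """
    if w < 1:
        raise ValueError("window size w must be at least 1")
    windows = []
    for end in range(w, len(records) + 1):
        window = records[end - w:end]
        # Concatenate the w records into a single feature vector.
        windows.append([value for record in window for value in record])
    return windows


# Four toy records with two features each, window size w = 2.
toy_records = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]
toy_inputs = sliding_windows(toy_records, w=2)
print(toy_inputs)
```

Each resulting vector has w times the per-record feature count, which is what a feed-forward network would take as input under this assumption.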

ML style of IDS
Fuzzy logic (FL)
  Inputs: set of rules, operators and knowledge base (i.e., user-specified graph dependencies).
  Output: classification rule.
  Learning mechanism: linear combinations of pre-defined rules on transformed features.

ML style of IDS
Support vector machines (SVM)
  Inputs: feature space.
  Output: classification rule.
  Learning mechanism: linear combinations of support vectors.
  Deals with nonlinearity in the decision space by transforming the feature dimensions (kernel trick). More on this later.

SVM Background

SVM Background
e = vector of 1’s.
Q is SPD, also called the “kernel” matrix.
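The equations on these slides did not survive in the transcript. For orientation only, the standard soft-margin SVM dual problem uses the same symbols (e as the vector of ones, Q as the label-weighted kernel matrix); it is shown here on the assumption that the slides followed this common formulation. The penalty parameter C, the labels y, and the kernel function K are names introduced for this sketch.

```latex
\min_{\alpha}\ \tfrac{1}{2}\,\alpha^{\top} Q\,\alpha - e^{\top}\alpha
\quad \text{subject to} \quad
y^{\top}\alpha = 0, \qquad 0 \le \alpha_i \le C, \quad i = 1, \dots, n,
\qquad \text{where } Q_{ij} = y_i\, y_j\, K(x_i, x_j).
```

Training points with nonzero \alpha_i at the optimum are the support vectors.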

Kernel Trick
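The figure for this slide was not preserved. As a small numeric illustration of the kernel trick, the sketch below checks that a degree-2 polynomial kernel computed in the original 2-D space equals an ordinary dot product in an explicitly transformed 3-D space. The homogeneous form (x·z)^2 and the helper names `poly2_kernel` and `phi` are assumptions for the example; the slides only state that a polynomial kernel with p=2 was selected.

```python
import math


def dot(u, v):
    """Plain dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def poly2_kernel(x, z):
    """Homogeneous degree-2 polynomial kernel, evaluated in the original space."""
    return dot(x, z) ** 2


def phi(x):
    """Explicit feature map for 2-D inputs whose dot product equals poly2_kernel."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2.0) * x1 * x2, x2 * x2)


x = (1.0, 2.0)
z = (3.0, -1.0)

k_implicit = poly2_kernel(x, z)   # never forms the 3-D features
k_explicit = dot(phi(x), phi(z))  # forms them explicitly

print(k_implicit, k_explicit)
```

The two values agree up to floating-point rounding, which is the point of the trick: the classifier can work in the transformed space while only ever evaluating the kernel on the original features.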

SVM Cost
Sample complexity:
Runtime of QP-solver for support vector selection:
  w/ sparse feature space
  w/ dense feature space
Key assumption in paper: kernel selection (and support vector computation) is separate from the “runtime” of the SVM during training / testing!

ML style of IDS
Decision trees (DT)
  Inputs: joint feature and output space.
  Output: a set of decision rules in the joint space.
  Learning mechanism: weighted combination of decision rules.
  User predefines the max number of leaves, leading to fixed complexity.
  Already an ensemble system!
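To make "decision rule" concrete, the sketch below picks a single threshold rule on one numeric attribute by minimizing weighted Gini impurity. This is only one common splitting criterion; the slides do not say which criterion the paper's trees use, so Gini, the function names, and the toy "failed logins" values are all assumptions for illustration.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 for a pure or empty list)."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = Counter(labels)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def best_threshold(values, labels):
    """Return (threshold, weighted_gini) for the best rule 'value <= threshold'."""
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    best = (None, float("inf"))
    for i in range(1, n):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold fits between equal values
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [label for _, label in pairs[:i]]
        right = [label for _, label in pairs[i:]]
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if score < best[1]:
            best = (threshold, score)
    return best


# Toy attribute: number of failed logins per connection (hypothetical values).
failed_logins = [0, 0, 1, 4, 5, 6]
classes = ["normal", "normal", "normal", "attack", "attack", "attack"]
rule = best_threshold(failed_logins, classes)
print(rule)
```

A full tree repeats this search recursively on each side of the split until a stopping condition, such as the predefined maximum number of leaves, is reached.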

Decision tree example

ML style of IDS
Others: evolutionary computation (EC) and genetic programs (GP).
Hybrid system: DT-SVM.

Hybrid DT-SVM model

Experiments
KDD Cup 1999 data: 5 million connection records, 24 attack types categorized into 4 classes of intrusions:
  Denial of service (DOS): sequence of behaviors leading to overload of resources for a service.
  Remote to user (R2L): static attack sending packets over the network to gain access to vulnerable nodes.
  User to root (U2R): static attack that starts with access to a normal user account and gains root access through system vulnerabilities.
  Probing: sends sequential information over the network to identify new vulnerabilities.

SVM vs. DT vs. DT-SVM
Experiment design:
  Input features: 41 attributes for each connection (e.g., content features, no. of failed logins).
  Labels: multilabel (1 for each class of attack) for decision trees, multiclass (5 classes) for SVM and DT-SVM.
  “Ensemble” = DT-SVM + SVM + DT.
  Train-test split over 11,982 records (5,092 train, 6,890 test).
  Kernel selection for SVM on training sets: polynomial kernel (p=2) selected.
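The slide does not say how the three models' outputs are combined into the ensemble. One simple possibility, shown here purely as an assumption and not as the paper's method, is a per-record majority vote with ties broken in favor of the first-listed model. The function name and the toy predictions are hypothetical.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine per-model label lists (all the same length) by majority vote.

    Ties are broken in favor of the earliest model in `predictions`.
    """
    combined = []
    for votes in zip(*predictions):
        counts = Counter(votes)
        top = max(counts.values())
        # First vote, in model order, that reaches the top count.
        winner = next(v for v in votes if counts[v] == top)
        combined.append(winner)
    return combined


# Hypothetical predictions from the three models on four records.
dt_pred = ["normal", "dos", "u2r", "probe"]
svm_pred = ["normal", "dos", "normal", "r2l"]
dt_svm_pred = ["normal", "probe", "u2r", "dos"]

ensemble_pred = majority_vote([dt_pred, svm_pred, dt_svm_pred])
print(ensemble_pred)
```

Listing the decision tree first means a three-way disagreement falls back to the tree's prediction; other orderings or weighted votes are equally plausible choices.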

DT vs. SVM performance

Comparison w/ DT-SVM and ensemble
Key observations:
  Performance gain comes almost exclusively from DT.
  Statistical significance was not investigated.
  1 train-test split for each task; no variance of performance is reported.
Remark: DT is already an ensemble approach. There is no systematic approach for picking ensemble combinations.
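The missing variance estimate could be obtained by repeating the train-test split several times and reporting the spread of the scores. The sketch below does this on synthetic 1-D data with a deliberately trivial threshold classifier; the data generator, the classifier, and all function names are assumptions standing in for the real KDD features and models.

```python
import random
import statistics


def make_data(n, rng):
    """Synthetic 1-D data: class 1 is shifted upward relative to class 0."""
    data = []
    for _ in range(n):
        label = rng.choice([0, 1])
        data.append((rng.gauss(2.0 * label, 1.0), label))
    return data


def train_threshold(train):
    """Stand-in classifier: midpoint between the two class means."""
    xs0 = [x for x, y in train if y == 0]
    xs1 = [x for x, y in train if y == 1]
    return (statistics.mean(xs0) + statistics.mean(xs1)) / 2


def accuracy(threshold, test):
    """Fraction of test points on the correct side of the threshold."""
    correct = sum((x > threshold) == bool(y) for x, y in test)
    return correct / len(test)


def repeated_holdout(data, n_repeats, train_fraction, rng):
    """Accuracy on n_repeats independent random train-test splits."""
    scores = []
    for _ in range(n_repeats):
        shuffled = data[:]
        rng.shuffle(shuffled)
        cut = int(len(shuffled) * train_fraction)
        train, test = shuffled[:cut], shuffled[cut:]
        scores.append(accuracy(train_threshold(train), test))
    return scores


rng = random.Random(0)  # fixed seed so the run is repeatable
data = make_data(200, rng)
scores = repeated_holdout(data, n_repeats=10, train_fraction=0.5, rng=rng)
print(f"mean accuracy = {statistics.mean(scores):.3f}, "
      f"std = {statistics.stdev(scores):.3f}")
```

Reporting the mean together with the standard deviation makes it possible to judge whether a gap between two models is larger than the split-to-split noise.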

Moving Forward: Addition of Transfer Learning?

Discussion
1) It is interesting to consider how this ensemble combination (DT-SVM) would fare against other ensemble combinations: decision tree with neural nets (DT-NN), SVM-NN, RBF-SVM with Linear-SVM, etc. Do you think there is a general rule for choosing ensemble pairs (or feature / kernel spaces and loss functions) that best complement each other?
2) What feasibility issues (e.g., cost, runtime issues) may occur when trying to implement this ensemble IDS in CAVs?
3) Why do you think the U2R attack predictions had the lowest performance across all the learning models? Can you think of other classes of attacks similar to U2R that may be difficult for the proposed method to identify correctly?

Q & A