Published by Caren Susanna Bruce. Modified over 5 years ago.
1
Modeling IDS using hybrid intelligent systems
Peddabachigari et al. (2007) Presenter: Andy Tang
2
Background
Intrusion detection system (IDS): a second line of defense, after authentication / encryption.
Two main types:
- Misuse intrusion detection – well-defined patterns of attack, encoded in advance.
- Anomaly intrusion detection – deviation from baseline behaviors, approximated by differences.
Machine learning (ML) paradigm of IDS: model intrusion detection as a binary classification task with input features from various sensors, e.g.:
- Neural nets (NN)
- Genetic programming (GP)
- Fuzzy logic (FL)
- Decision trees (DT)
- Support vector machines (SVM)
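The binary-classification framing above can be sketched as follows. This is a minimal illustration, not code from the paper: the feature vectors, labels, and values are all toy assumptions, and a polynomial-kernel SVM is used since that is the kernel the paper later selects.

```python
# Sketch: intrusion detection as binary classification.
# Labels: 0 = normal connection, 1 = intrusion. Features are invented
# stand-ins (e.g., duration, bytes sent, failed logins).
from sklearn.svm import SVC

X_train = [
    [0.1, 2.0, 0.0],
    [0.2, 1.8, 0.0],
    [5.0, 90.0, 3.0],
    [4.5, 85.0, 4.0],
]
y_train = [0, 0, 1, 1]

clf = SVC(kernel="poly", degree=2)  # polynomial kernel, as in the paper
clf.fit(X_train, y_train)
print(clf.predict([[0.15, 1.9, 0.0]]))  # → [0] (normal)
```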
3
General IDS model
4
General IDS model
Sequence of events E(1) – E(2) – … ⇒ E(n)
5
Rules (statistical or rule-based) are updated based on recent behaviors.
General IDS model
6
Updates baseline behaviors
General IDS model
7
General IDS model
Some limitations:
- A misuse-detection paradigm; requires complementary anomaly detection.
- Rule-based (expert system), with limited representational power w.r.t. transition patterns.
- Does not scale to complex (i.e., non-linear) sequences of behaviors.
- Depends on the quality and quantity of pre-defined rules.
8
Goal: represent characteristics of intrusion behavior by adaptive behavioral “models”.
ML style of IDS
9
ML style of IDS
- End-to-end pipeline from input data streams to classification rules.
- Direct feedback from empirical performance to adjust extracted patterns and model parameters.
10
ML style of IDS: Neural Nets (NN)
- Inputs: a sliding window of w previous records.
- Output: a classification rule.
- Learning mechanism: backpropagation via vector-Jacobian products.
- "Hidden layers" are intermediate transformed feature spaces ≈ behavioral patterns.
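The sliding-window input described above can be sketched like this. The event stream, window size, and labeling rule are toy assumptions for illustration only; the MLP stands in for the NN with its hidden layers.

```python
# Sketch: build one feature vector per sliding window of w previous
# event records, then fit a small neural net on the windows.
import numpy as np
from sklearn.neural_network import MLPClassifier

# Toy stream of event codes (0 = benign event, 1 = suspicious event).
events = np.array([0, 0, 1, 0, 0, 0, 1, 1, 1, 0, 1, 1, 0, 0, 1, 0, 1, 1, 1, 1])
w = 3  # sliding window of w previous records

X = np.array([events[i:i + w] for i in range(len(events) - w)])
# Assumed toy label: a window with >= 2 suspicious events is an intrusion.
y = (X.sum(axis=1) >= 2).astype(int)

clf = MLPClassifier(hidden_layer_sizes=(8,), max_iter=3000, random_state=0)
clf.fit(X, y)
```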
11
ML style of IDS: Fuzzy Logic (FL)
- Inputs: a set of rules, operators, and a knowledge base (i.e., user-specified graph dependencies).
- Output: a classification rule.
- Learning mechanism: linear combinations of pre-defined rules on transformed features.
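A minimal fuzzy-rule sketch of the idea above, with no fuzzy-logic library. The membership functions, thresholds, and rules are invented for illustration and are not from the paper.

```python
# Two triangular membership functions over a single (assumed) input,
# "failed-login rate", feed a two-rule base whose firing strengths are
# combined into an intrusion score.
def mu_low(x):   # membership of x in "low failed-login rate"
    return max(0.0, min(1.0, (0.5 - x) / 0.5))

def mu_high(x):  # membership of x in "high failed-login rate"
    return max(0.0, min(1.0, (x - 0.2) / 0.5))

def intrusion_score(x):
    # Rule base: IF rate is high THEN intrusion; IF rate is low THEN normal.
    # Defuzzify with a weighted average of the two rule strengths.
    high, low = mu_high(x), mu_low(x)
    return high / (high + low) if (high + low) > 0 else 0.0

print(intrusion_score(0.1))   # fully "low"  → 0.0
print(intrusion_score(0.6))   # fully "high" → 1.0
print(intrusion_score(0.35))  # ambiguous   → 0.5
```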
12
ML style of IDS: Support Vector Machines (SVM)
- Inputs: feature space.
- Output: a classification rule.
- Learning mechanism: linear combinations of support vectors.
- Deals with nonlinearity in the decision space by transforming the feature dimensions (kernel trick). More on this later.
13
SVM Background
14
SVM Background
15
SVM Background
- e = vector of 1's.
- Q is SPD (symmetric positive definite), also called the "kernel" matrix.
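The quadratic program these symbols belong to is the standard soft-margin SVM dual; this is a reconstruction from the usual formulation, since the slide only names e and Q:

```latex
\max_{\alpha}\; e^{\top}\alpha \;-\; \tfrac{1}{2}\,\alpha^{\top} Q\, \alpha
\quad \text{s.t.}\quad 0 \le \alpha_i \le C,\qquad y^{\top}\alpha = 0,
\qquad \text{where } Q_{ij} = y_i\, y_j\, K(x_i, x_j).
```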
16
Kernel Trick
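The kernel trick can be checked numerically: the degree-2 polynomial kernel (x·z + 1)² equals an ordinary dot product in an explicitly transformed feature space φ. The example values below are arbitrary; this is an illustration, not material from the paper.

```python
# Verify that the degree-2 polynomial kernel equals a dot product in an
# explicit 6-dimensional feature space (for 2-D inputs).
import numpy as np

def phi(x):
    # Explicit degree-2 feature map for 2-D inputs.
    x1, x2 = x
    return np.array([1.0,
                     np.sqrt(2) * x1, np.sqrt(2) * x2,
                     x1**2, x2**2,
                     np.sqrt(2) * x1 * x2])

def poly_kernel(x, z):
    return (np.dot(x, z) + 1.0) ** 2

x = np.array([1.0, 2.0])
z = np.array([3.0, -1.0])
print(poly_kernel(x, z), np.dot(phi(x), phi(z)))  # both equal 4.0
```

The point of the trick is that evaluating the kernel never constructs φ(x), which matters when the implicit space is high- or infinite-dimensional.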
17
SVM Cost
Sample complexity:
Runtime of the QP solver for support-vector selection:
- w/ a sparse feature space
- w/ a dense feature space
Key assumption in the paper: kernel selection and support-vector computation are treated separately from the "runtime" of the SVM during training / testing.
18
ML style of IDS: Decision Trees (DT)
- Inputs: joint feature and output space.
- Output: a set of decision rules in the joint space.
- Learning mechanism: weighted combination of decision rules.
- The user predefines the max number of leaves, leading to fixed complexity.
- Already an ensemble system!
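The user-fixed leaf budget above can be sketched with scikit-learn's `max_leaf_nodes` parameter. The data here is a toy assumption, not from the paper.

```python
# Sketch: a decision tree whose complexity is fixed by capping the
# number of leaves, as described on the slide.
from sklearn.tree import DecisionTreeClassifier

X = [[0, 0], [0, 1], [1, 0], [1, 1], [2, 2], [2, 3]]
y = [0, 0, 0, 1, 1, 1]

dt = DecisionTreeClassifier(max_leaf_nodes=4, random_state=0)
dt.fit(X, y)
print(dt.get_n_leaves())  # never exceeds 4
```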
19
Decision tree example
20
ML style of IDS
Others: evolutionary computation (EC) and genetic programming (GP).
Hybrid system: DT-SVM.
21
Hybrid DT-SVM model.
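One way the hybrid DT-SVM idea can be sketched: a decision tree is trained first, and the tree's output for each record is appended as an extra feature before training the SVM. The toy data and the specific choice of passing the leaf index are assumptions for illustration; see Peddabachigari et al. (2007) for the actual pipeline.

```python
# Hedged sketch of a DT-SVM hybrid: DT output feeds the SVM as an
# additional input feature (toy data; details are assumptions).
import numpy as np
from sklearn.tree import DecisionTreeClassifier
from sklearn.svm import SVC

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1], [2, 2], [2, 3]], dtype=float)
y = np.array([0, 0, 0, 1, 1, 1])

dt = DecisionTreeClassifier(max_leaf_nodes=4, random_state=0).fit(X, y)
leaf_ids = dt.apply(X).reshape(-1, 1)   # which leaf each record falls into
X_hybrid = np.hstack([X, leaf_ids])     # original features + DT output

svm = SVC(kernel="poly", degree=2).fit(X_hybrid, y)
print(svm.score(X_hybrid, y))
```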
22
Experiments
KDD Cup 1999 data: 5 million connection records, 24 attack types categorized into 4 classes of intrusions:
- Denial of service (DoS): a sequence of behaviors that overloads the resources of a service.
- Remote to user (R2L): a static attack that sends packets over the network to gain access to vulnerable nodes.
- User to root (U2R): a static attack that starts from access to a normal user account and gains root access via system vulnerabilities.
- Probing: sends sequential information over the network to identify new vulnerabilities.
23
SVM vs. DT vs. DT-SVM Experiment Design:
- Input features: 41 attributes for each connection (e.g., content features, number of failed logins).
- Labels: multilabel (one per class of attack) for decision trees; multiclass (5 classes) for SVM and DT-SVM.
- "Ensemble" = DT-SVM + SVM + DT.
- Train-test split over 11,982 records (5,092 train, 6,890 test).
- Kernel selection for the SVM on the training set: polynomial (p = 2) selected.
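The "ensemble = DT-SVM + SVM + DT" combination can be sketched as a majority vote over the three models' predictions. The combiner and the prediction vectors below are assumptions for illustration; the slide does not specify how the outputs are merged.

```python
# Sketch: majority vote over per-record class predictions from the
# three models in the ensemble (toy predictions, 4 records).
import numpy as np

def majority_vote(*prediction_rows):
    # Each row holds one model's predicted class per record.
    preds = np.stack(prediction_rows)   # shape: (n_models, n_records)
    return np.array([np.bincount(col).argmax() for col in preds.T])

dt_pred     = np.array([0, 1, 1, 0])
svm_pred    = np.array([0, 1, 0, 0])
dt_svm_pred = np.array([1, 1, 1, 0])
print(majority_vote(dt_pred, svm_pred, dt_svm_pred))  # → [0 1 1 0]
```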
24
DT vs. SVM performance
25
Comparison w/ DT-SVM and ensemble
Key observations:
- The performance gain comes almost exclusively from the DT.
- Statistical significance was not investigated: one train-test split per task, with no variance of performance reported.
- Remark: DT is already an ensemble approach.
- No systematic approach for picking ensemble combinations.
26
Moving Forward: Addition of Transfer Learning?
27
Discussion
1) It is interesting to consider how this ensemble combination (DT-SVM) would fare against other ensemble combinations: decision tree with neural nets (DT-NN), SVM-NN, RBF-SVM with linear-SVM, etc. Do you think there is a general rule for choosing ensemble pairs (or feature / kernel spaces and loss functions) that best complement each other?
2) What feasibility issues (e.g., cost, runtime) may occur when trying to implement this ensemble IDS in CAVs?
3) Why do you think the U2R attack predictions had the lowest performance across all the learning models? Can you think of other classes of attacks similar to U2R that may be difficult for the proposed method to identify correctly?
28
Q & A