Machine Learning as Applied to Intrusion Detection By Christine Fossaceca.

1 Machine Learning as Applied to Intrusion Detection By Christine Fossaceca

2 “It’s only a matter of the ‘when,’ not the ‘if,’ that we are going to see something dramatic,” Admiral Michael Rogers, director of the National Security Agency, testifying before Congress in late 2014 on the real possibility of a cyber attack on critical infrastructure in the United States (WSJ – November 2014)

3 * A “network intrusion” or “network attack” is an event in which a user attempts to exploit a system vulnerability in order to either 1) gain access to network resources or 2) disrupt the ability of the network to operate. * Intrusion detection systems are designed to help detect network attacks and alert network operators to the presence of such attacks so that they may take appropriate action.

4 * MARK-ELM: Application of a novel Multiple Kernel Learning framework for improving the robustness of Network Intrusion Detection * I chose this paper because it was written by my dad for his PhD and I was familiar with his research. I want to go into the field of cybersecurity when I graduate, and I like reading about the different ways to enhance cybersecurity in the world today. * This system combines machine learning techniques that we have learned about in class.

5 * MARK-ELM (Multiple Adaptive Reduced Kernel Extreme Learning Machine) is a new approach to intrusion detection * Instead of using a single algorithm, this approach combines the decisions of a variety of classifiers to increase overall effectiveness and obtain a better decision than any of the individual classifiers

6 * High rate of false alarms * Inability to identify multiple types of attacks * Too much “human tuning” is required * MARK-ELM addresses all of these issues with a low rate of false alarms, multiclass classification ability, and minimal human tuning

7 * This dataset (KDD Cup 99) was used in the Third International Knowledge Discovery and Data Mining Tools Competition. * “The competition task was to build a network intrusion detector, a predictive model capable of distinguishing between ‘bad’ connections, called intrusions or attacks, and ‘good’ normal connections. This database contains a standard set of data to be audited, which includes a wide variety of intrusions simulated in a military network environment.” – UCI Repository Description * First used in 1999, it continues to be the most widely used intrusion detection benchmark dataset today

8 * However, the dataset, relative to actual network traffic, has a disproportionate amount of normal traffic and DDOS examples compared to the other attack types: Probing, User to Root (U2R), and Remote to Local (R2L)

Traffic Type   Training Examples   Percent of Total   Testing Examples   Percent of Total
Normal         43881               60.28%             43950              60.38%
DDOS           27307               37.51%             27265              37.46%
Probe          1080                1.48%              1051               1.44%
R2L            490                 0.68%              505                0.69%
U2R            30                  0.04%              22                 0.03%

Table 2.4: Typical Distribution for KDD Cup 99 Dataset, from MARK-ELM * Adapted from (Fossaceca, John M., 2014)
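To make the imbalance concrete, here is a minimal sketch that recomputes the class shares from the training counts in Table 2.4. The names `train_counts`, `class_shares`, and `imbalance_ratio` are illustrative only, and the computed values may differ from the table's printed figures in the last digit.

```python
# Training counts copied from Table 2.4 (labels as used on the slide).
train_counts = {"Normal": 43881, "DDOS": 27307, "Probe": 1080, "R2L": 490, "U2R": 30}


def class_shares(counts):
    """Return each class's share of the total, as a percentage."""
    total = sum(counts.values())
    return {label: 100.0 * n / total for label, n in counts.items()}


shares = class_shares(train_counts)
for label, pct in shares.items():
    print(f"{label:>6}: {pct:6.2f}%")

# Ratio between the most frequent and least frequent class.
imbalance_ratio = max(train_counts.values()) / min(train_counts.values())
print(f"most/least frequent class ratio: {imbalance_ratio:.0f}")
```

The ratio shows that U2R examples are outnumbered by normal traffic by more than a thousand to one, which is why a classifier can score well overall while missing the rare attack types.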

9 * Ensemble learning – combines the decisions of multiple extreme learning machines * Different outputs are combined using a weighting scheme, so that an algorithm that is good at classifying one type of attack but not another can contribute to that classification without negatively affecting the results for the other types of attacks. * Instead of using a single algorithm, this approach combines the decisions of a variety of classifiers to increase overall effectiveness and obtain a better decision than any of the individual classifiers * One big problem with past algorithms was that they had a high rate of detection for only one type of attack: DDOS
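The weighting idea on this slide can be sketched as a weighted vote. This is a generic illustration, not MARK-ELM's actual combination rule: the function name `weighted_vote` and the example weights are assumptions made for the example.

```python
from collections import defaultdict


def weighted_vote(predictions, weights):
    """Combine one predicted label per classifier into a single decision.

    predictions: list of labels, one per classifier.
    weights: list of non-negative weights, one per classifier.
    """
    if len(predictions) != len(weights):
        raise ValueError("need exactly one weight per prediction")
    scores = defaultdict(float)
    for label, weight in zip(predictions, weights):
        scores[label] += weight  # each classifier adds its weight to its label
    return max(scores, key=scores.get)


# Hypothetical example: one strong classifier says DDOS, two weaker ones say U2R.
decision = weighted_vote(["DDOS", "U2R", "U2R"], [0.5, 0.3, 0.3])
print(decision)
```

Here the two weaker classifiers together outweigh the single stronger one, so a rare attack type can still win the vote when several classifiers agree on it.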

10 * Not only is this classifier able to identify different types of attacks, but it is also robust to noise and mislabeled data because of its AdaBoost voting scheme * One of the most important algorithms used in this classifier is AdaBoost: * “Iterative ensemble learning approach that ensures parameters are chosen using a technique that emphasizes errors and continually tries to optimize parameters.” * MARK-ELM extends the MKBoost sampling algorithm, originally designed to handle only binary class data, to handle input data with multiple classes.
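As a rough illustration of how boosting "emphasizes errors", the sketch below performs one round of the SAMME variant of multiclass AdaBoost. SAMME is swapped in here as a well-known multiclass formulation; the slides do not say that MARK-ELM uses this exact update, and `boosting_round` is a hypothetical helper name.

```python
import math


def boosting_round(weights, y_true, y_pred, n_classes):
    """One SAMME-style round: return (classifier weight, new example weights)."""
    missed = [t != p for t, p in zip(y_true, y_pred)]
    # Weighted error rate of the current classifier.
    err = sum(w for w, m in zip(weights, missed) if m) / sum(weights)
    err = min(max(err, 1e-10), 1 - 1e-10)  # keep the logarithm finite
    # The extra log(n_classes - 1) term is what makes this multiclass.
    alpha = math.log((1 - err) / err) + math.log(n_classes - 1)
    # Increase the weight of every misclassified example, then renormalize.
    raised = [w * math.exp(alpha) if m else w for w, m in zip(weights, missed)]
    total = sum(raised)
    return alpha, [w / total for w in raised]


# Hypothetical example: four equally weighted examples, the U2R one is missed.
y_true = ["Normal", "DDOS", "Probe", "U2R"]
y_pred = ["Normal", "DDOS", "Probe", "Normal"]
alpha, new_weights = boosting_round([0.25] * 4, y_true, y_pred, n_classes=5)
print(alpha, new_weights)
```

After the round, the misclassified U2R example carries most of the weight, so the next classifier is pushed to get it right.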

11 * The model was trained in a reduced-kernel manner so that the full kernel is not calculated * For extremely large datasets, computing and storing a full kernel is not practical in many cases. * MARK-ELM computes a reduced kernel matrix for each round of learning using a small sample of the input data. The sample is chosen randomly with the constraint that at least one example of every class must be present in the selected sample set.
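The sampling constraint described above can be sketched as follows. This is an illustration under stated assumptions, not the paper's procedure: the helper names are hypothetical, the RBF kernel and its `gamma` are arbitrary choices, and the n-by-m shape of the reduced matrix (all rows against the sampled rows) is assumed.

```python
import math
import random


def sample_with_all_classes(labels, sample_size, rng):
    """Pick sample_size random indices with at least one index per class."""
    by_class = {}
    for i, label in enumerate(labels):
        by_class.setdefault(label, []).append(i)
    if not len(by_class) <= sample_size <= len(labels):
        raise ValueError("sample_size must cover every class and fit the data")
    # One guaranteed representative per class.
    chosen = {rng.choice(indices) for indices in by_class.values()}
    # Fill the remaining slots uniformly at random from the other examples.
    rest = [i for i in range(len(labels)) if i not in chosen]
    chosen.update(rng.sample(rest, sample_size - len(chosen)))
    return sorted(chosen)


def rbf(a, b, gamma=1.0):
    """Gaussian (RBF) kernel between two feature vectors."""
    return math.exp(-gamma * sum((x - y) ** 2 for x, y in zip(a, b)))


def reduced_kernel(X, sample_idx, gamma=1.0):
    """n x m kernel matrix between all n rows and the m sampled rows."""
    return [[rbf(x, X[j], gamma) for j in sample_idx] for x in X]


# Hypothetical toy data: 81 one-dimensional examples, one of them U2R.
labels = ["Normal"] * 50 + ["DDOS"] * 30 + ["U2R"]
X = [[float(i)] for i in range(len(labels))]
idx = sample_with_all_classes(labels, 5, random.Random(0))
K = reduced_kernel(X, idx)
print(len(K), len(K[0]))
```

With 81 examples and a sample of 5, the reduced matrix has 81 x 5 entries instead of 81 x 81, and the single U2R example is always part of the sample.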

12 * Adapted from (Fossaceca, John M., 2014)

13 Φ: x → φ(x) * General idea: the original input space can be mapped to some higher-dimensional feature space where the training set is separable. * A kernel function is defined as a function that corresponds to a dot product of two feature vectors in some expanded feature space. * Adapted from (Eck, D., 2006)
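A small worked check of this definition, using the degree-2 polynomial kernel as an assumed example (the slide does not name a specific kernel): for two-dimensional inputs, k(x, z) = (x · z)^2 equals the dot product of the explicit feature vectors φ(x) = (x1^2, √2·x1·x2, x2^2).

```python
import math


def dot(a, b):
    """Ordinary dot product of two equal-length vectors."""
    return sum(p * q for p, q in zip(a, b))


def phi(x):
    """Explicit feature map for the degree-2 polynomial kernel in 2-D."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2) * x1 * x2, x2 * x2)


def poly2_kernel(x, z):
    """Same quantity computed in the original input space."""
    return dot(x, z) ** 2


x, z = (1.0, 2.0), (3.0, -1.0)
explicit = dot(phi(x), phi(z))   # map first, then take the dot product
implicit = poly2_kernel(x, z)    # never leaves the 2-D input space
print(explicit, implicit)
```

Both routes give the same value up to floating-point rounding, which is the point of the kernel trick: the higher-dimensional space never has to be built explicitly.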

14 * Fractional-degree kernels: the degree is a value between 0 and 1 * They do not meet the standard definition of a kernel (Mercer’s condition), but were proven to still be usable in machine learning and have been shown to be effective * First discussed in (Rossius, R., Zenker, G., Ittner, A., & Dilger, W., 1998) as a means of improving the performance of SVMs. The authors argue that “fractional degrees allow a more continuous range of concepts”. * This approach had not been used before on intrusion detection data
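Assuming this slide refers to a polynomial kernel whose degree lies strictly between 0 and 1, a minimal sketch might look like the following. The exact form used in the cited work is not given in the slides, so the `(x · z + offset) ** degree` shape, the `offset` parameter, and the function name are all assumptions.

```python
def fractional_poly_kernel(x, z, degree=0.5, offset=1.0):
    """Polynomial-style kernel with a fractional degree (assumed form)."""
    if not 0.0 < degree < 1.0:
        raise ValueError("degree must lie strictly between 0 and 1")
    base = sum(a * b for a, b in zip(x, z)) + offset
    if base < 0:
        # A negative base has no real-valued fractional power.
        raise ValueError("x . z + offset must be non-negative")
    return base ** degree


# Hypothetical example with non-negative (e.g. min-max scaled) features.
value = fractional_poly_kernel((1.0, 2.0), (3.0, 1.0), degree=0.5)
print(value)
```

The guard on the base matters in practice: with features scaled to be non-negative the kernel is always real-valued, whereas arbitrary signed features can make the fractional power undefined.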

15 * Stems from the support vector machine * The extreme learning machine (ELM) is trained by computing a matrix inverse, so it trains much faster than a support vector machine (SVM) * Can handle multiclass classification directly, which is an advantage over the SVM: the SVM handles only binary classification unless special groupings such as one-against-one or one-against-all are used, and these can take time on the order of O(n^2)

16
SUPPORT VECTOR MACHINE                           EXTREME LEARNING MACHINE
Trains slower than the ELM                       Trains faster than the SVM
Can only directly handle binary classification   Can directly handle multiclass classification
Quadratic programming                            Matrix inverse
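The "matrix inverse" training step can be sketched with the commonly cited closed form for a kernel ELM, beta = (K + I/C)^-1 T, where T holds one-hot class targets. This is a generic illustration that assumes NumPy is available; the RBF kernel, the regularization constant `C`, `gamma`, and the function names are assumptions, not MARK-ELM's actual configuration.

```python
import numpy as np


def rbf_kernel(A, B, gamma=1.0):
    """Pairwise Gaussian kernel between the rows of A and the rows of B."""
    sq_dist = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-gamma * sq_dist)


def train_kernel_elm(X, y, n_classes, C=100.0, gamma=1.0):
    """Solve (K + I/C) beta = T in one step, with no iterative optimization."""
    T = np.eye(n_classes)[y]  # one-hot targets, one column per class
    K = rbf_kernel(X, X, gamma)
    # Solving the linear system is equivalent to applying the inverse,
    # and is numerically safer than forming the inverse explicitly.
    return np.linalg.solve(K + np.eye(len(X)) / C, T)


def predict_kernel_elm(X_train, beta, X_new, gamma=1.0):
    """Pick the class with the largest output for each new example."""
    return (rbf_kernel(X_new, X_train, gamma) @ beta).argmax(axis=1)


# Hypothetical toy data: three well-separated classes in 2-D.
X = np.array([[0, 0], [0, 1], [5, 5], [5, 6], [10, 0], [10, 1]], dtype=float)
y = np.array([0, 0, 1, 1, 2, 2])
beta = train_kernel_elm(X, y, n_classes=3)
pred = predict_kernel_elm(X, beta, np.array([[0, 0.5], [5, 5.5], [10, 0.5]]))
print(pred)
```

All three classes are handled by a single linear solve, with one output column per class, which is what lets the ELM avoid the one-against-one or one-against-all groupings that a binary SVM needs.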

17 * Adapted from (Fossaceca, John M., 2014)

20 * Unsupervised learning is a more real-world approach to machine learning in intrusion detection, and using this algorithm to build an unsupervised machine would be very valuable to future intrusion detection applications * This algorithm would probably also train well on data from other important areas, such as bioinformatics, and should be researched further to discover the most effective combination methods.

