Masquerade Detection, Mark Stamp


1 Masquerade Detection
Mark Stamp

2
• Masquerader: someone who makes unauthorized use of a computer
• How to detect a masquerader?
• Here, we consider…
• Anomaly-based intrusion detection system (IDS)
• Detection is based on UNIX commands
• Lots and lots of prior work on this problem
• We attempt to apply profile hidden Markov models (PHMMs)
• For comparison, we also implement other techniques (HMM and N-gram)

3 Schonlau Data Set
• Schonlau et al. collected a large data set
• Contains UNIX commands for 50 users
• 50 files, one for each user
• Each file has 15k commands: 5k from the user plus 10k of masquerade test data
• Test data: 100 blocks, 100 commands each
• Data set includes a map file
• 100 rows (test blocks), 50 columns (users)
• 0 if block is user data, 1 if masquerade data
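
The layout above can be sketched in a few lines. This is a minimal illustration, not the authors' code: the function names are hypothetical, and it assumes each user's commands have already been read into a list and the map file into a list of rows.

```python
def split_user_commands(commands, n_train=5000, block_size=100):
    """Split one user's command list into training data and test blocks.

    Assumes the first n_train commands are the user's own (training) data
    and the remainder is test data, cut into blocks of block_size commands.
    """
    train = commands[:n_train]
    rest = commands[n_train:]
    blocks = [rest[i:i + block_size] for i in range(0, len(rest), block_size)]
    return train, blocks


def masquerade_blocks(map_rows, user_index):
    """Return the indices of test blocks marked 1 (masquerade) for one user.

    map_rows is assumed to be 100 rows (test blocks) by 50 columns (users).
    """
    return [b for b, row in enumerate(map_rows) if row[user_index] == 1]
```

With 15k commands per user this gives 5k training commands and 100 test blocks of 100 commands each.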

4 Schonlau Data Set
• Map file structure
• This data set has been used for many studies
• Approximately 50 published papers

5 Previous Work
• Approaches to masquerade detection
• Information theoretic
• Text mining
• Hidden Markov models (HMM)
• Naïve Bayes
• Sequences and bioinformatics
• Support vector machines (SVM)
• Other approaches
• We briefly look at each of these

6 Information Theoretic
• Original work by Schonlau included a compression technique
• Based on theory (hope?) that legitimate commands compress more than attack commands
• Results were disappointing
• Some additional recent work
• Still not competitive with best approaches

7 Text Mining
• A few papers in this area
• One approach extracts repetitive sequences from training data
• Another paper uses principal component analysis (PCA)
• Method of “exploratory data analysis”
• Good results on Schonlau data set
• But high cost during training phase

8 Hidden Markov Models
• Several authors have used HMMs
• One of the best known approaches
• We have implemented an HMM detector
• We do sensitivity analysis on the parameters
• In particular, determine optimal N (number of hidden states)
• We also use HMMs for comparison with our PHMM results

9 Naïve Bayes
• In simplest form, relies only on command frequencies
• That is, no sequence info is used
• Several papers analyze this approach
• Among the simplest approaches
• And, results are good
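
A frequency-only score of this kind might look like the sketch below. It is an illustration under stated assumptions, not the method of any particular paper: the function names and the smoothing constant `alpha` are hypothetical, and the vocabulary is assumed to contain every command that can appear in a test block.

```python
import math
from collections import Counter


def train_frequencies(train_commands, vocabulary, alpha=0.01):
    """Estimate smoothed command probabilities from one user's training data.

    alpha is an assumed additive-smoothing constant so that unseen
    commands get a small nonzero probability.
    """
    counts = Counter(train_commands)
    total = len(train_commands) + alpha * len(vocabulary)
    return {c: (counts[c] + alpha) / total for c in vocabulary}


def block_log_likelihood(block, probs):
    """Score a block using command frequencies only; order is ignored."""
    return sum(math.log(probs[c]) for c in block)
```

A block made up of commands the user rarely issues gets a low score, regardless of the order in which the commands appear.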

10 Sequences
• In a sense, this is the opposite extreme from naïve Bayes
• Naïve Bayes only considers frequency stats
• Sequence/bioinformatics approaches focus on sequence-related information
• Schonlau’s original work included elementary sequence-based analysis

11 Bioinformatics
• We are aware of only one previous paper that uses a bioinformatics approach
• It uses the Smith-Waterman algorithm to create local alignments
• Alignments are then used directly for detection
• In contrast, we do pairwise alignments, then a multiple sequence alignment (MSA), then build a PHMM
• PHMM is used for scoring (forward algorithm)
• Our scoring is much more efficient
• Also, our results are at least as strong

12 Support Vector Machines
• Support vector machines (SVM)
• Machine learning technique
• Separate data points (i.e., classify) based on hyperplanes in high dimensional space
• Original data mapped to higher dimension, where separation is likely easier
• SVMs maximize separation
• And have low computational costs
• Used for classification and regression analysis

13 SVMs & Masquerade Detection
• SVMs have been applied to the masquerade detection problem
• Results are good
• Comparable to naïve Bayes
• Recent work using SVMs focused on improved efficiency

14 Other Approaches
• The following have also been studied
• Detect using low frequency commands
• Detect using high frequency commands
• Hybrid Bayes “one step Markov”
• Natural to consider hybrid approaches
• Multistep Markov
• Markov process of order greater than 1
• None of these has been particularly successful

15 Other Approaches (Continued)
• Non-negative matrix factorization (NMF)
• At least 2 papers on this topic
• Appears to be competitive
• Other hybrids that attempt to combine several approaches
• So far, no significant improvement over individual techniques

16 HMMs
• See previous presentation

17 HMM for Masquerade Detection
• Using the Schonlau data set we…
• Train HMM for each user
• Set thresholds
• Test the models and plot results
• Note that this has been done before
• Here, we perform sensitivity analysis
• That is, we test different numbers of hidden states, N
• Also use it for comparison with PHMM
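
One standard way to score a test block against a trained HMM is the scaled forward algorithm, sketched below. This is an illustration, not the authors' implementation: it assumes commands are encoded as integer symbol indices, that the model is given as plain lists `pi`, `A`, `B`, that every observed symbol has nonzero probability in some state, and that the score is normalized per command before comparison with a threshold. Training (e.g., Baum-Welch) is not shown.

```python
import math


def forward_log_likelihood(obs, pi, A, B):
    """Scaled forward algorithm: returns log P(obs | model).

    pi[i]   : initial probability of hidden state i
    A[i][j] : probability of moving from state i to state j
    B[i][o] : probability of emitting symbol o while in state i
    """
    n = len(pi)
    alpha = [pi[i] * B[i][obs[0]] for i in range(n)]
    log_prob = 0.0
    for t in range(len(obs)):
        if t > 0:
            alpha = [
                sum(alpha[i] * A[i][j] for i in range(n)) * B[j][obs[t]]
                for j in range(n)
            ]
        # Rescale at every step to avoid underflow on long sequences.
        scale = sum(alpha)
        log_prob += math.log(scale)
        alpha = [a / scale for a in alpha]
    return log_prob


def is_masquerade(block, model, threshold):
    """Flag a block whose per-command log-likelihood falls below threshold.

    The per-command normalization and the threshold are assumptions.
    """
    pi, A, B = model
    return forward_log_likelihood(block, pi, A, B) / len(block) < threshold
```

The cost per block grows as N squared times the block length, which is one reason a small N is attractive when detection accuracy does not depend on it.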

18 HMM Experiments
• Plotted as “ROC” curves
• Closer to origin is better
• Useful region: false positives below 5% (the shaded region)
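
The points on such a curve can be computed by sweeping a threshold over the block scores. The sketch below assumes the axes are false positive rate and false negative rate, which would be consistent with "closer to origin is better"; the slides do not state the axes, and the function names are hypothetical.

```python
def error_rates(user_scores, attack_scores, threshold):
    """Blocks scoring below threshold are flagged as masquerade.

    Returns (false positive rate, false negative rate).
    """
    fp = sum(s < threshold for s in user_scores) / len(user_scores)
    fn = sum(s >= threshold for s in attack_scores) / len(attack_scores)
    return fp, fn


def curve(user_scores, attack_scores):
    """One (fp, fn) point per distinct score used as the threshold."""
    thresholds = sorted(set(user_scores) | set(attack_scores))
    return [error_rates(user_scores, attack_scores, t) for t in thresholds]


def useful_points(points, max_fp=0.05):
    """Keep only points in the region with false positives at or below max_fp."""
    return [(fp, fn) for fp, fn in points if fp <= max_fp]
```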

19 HMM Conclusion
• Number of hidden states does not matter
• So, use N=2, since it is the most efficient

20 PHMM
• See previous presentation

21 PHMM Experiments
• A problem with Schonlau data…
• For a given user, 5000 commands
• No begin/end session markers
• So, must split it up to obtain multiple sequences
• But where to split the sequence?
• And what about the tradeoff between number of sequences and length of each sequence?
• That is, how to decide length/number?

22 PHMM Experiments
• Experiments done for following cases:
• See next slide…

23 PHMM Experiments
• Tests various numbers of sequences
• Best results: 5 sequences, 1k commands each
• This case in next slide
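
Since the data has no session markers, the simplest way to obtain multiple training sequences is to cut the 5000 commands into equal-length pieces. The sketch below assumes exactly that (contiguous, equal-length, non-overlapping pieces); the slides do not say how the split points were chosen, and the function name is hypothetical.

```python
def split_into_sequences(commands, n_sequences):
    """Cut a command list into n_sequences contiguous, equal-length pieces.

    Any leftover commands at the end (when the length does not divide
    evenly) are dropped.
    """
    length = len(commands) // n_sequences
    return [commands[i * length:(i + 1) * length] for i in range(n_sequences)]
```

For 5000 training commands, `n_sequences=5` gives the 5 sequences of 1k commands each reported as the best case.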

24 PHMM Comparison
• Compare PHMM to “weighted N-gram” and HMM
• HMM is best
• PHMM is competitive

25 PHMM Detector
• PHMM is at a disadvantage on Schonlau data
• PHMM uses positional information
• Such info is not available for Schonlau data
• We have to guess the positions for PHMM
• How to get a fairer comparison between HMM and PHMM?
• We need a different data set
• Only option is a simulated data set

26 Simulated Data
• We generate simulated data as follows
• Using Schonlau data, construct a Markov chain for each user
• Use the resulting Markov chain to generate sequences representing user behavior
• Restrict “begin” to more common commands
• What’s the point?
• Simulated sequences have a sensible begin and end
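
The generation procedure above can be sketched as follows. This is a minimal illustration, assuming a first-order Markov chain built from consecutive command pairs and a "begin" set made of the user's `top_k` most frequent commands; the slides do not give the chain order, the begin rule, or how sequence ends are chosen, so all three are assumptions and the function names are hypothetical.

```python
import random
from collections import Counter, defaultdict


def build_chain(commands):
    """Count command-to-next-command transitions in one user's data."""
    transitions = defaultdict(Counter)
    for cur, nxt in zip(commands, commands[1:]):
        transitions[cur][nxt] += 1
    return transitions


def common_commands(commands, top_k):
    """The top_k most frequent commands, used here as allowed start states."""
    return [c for c, _ in Counter(commands).most_common(top_k)]


def generate(transitions, start_candidates, length, rng):
    """Generate one simulated sequence of the given length."""
    state = rng.choice(start_candidates)
    seq = [state]
    while len(seq) < length:
        successors = transitions.get(state)
        if not successors:
            # Dead end (command never followed by anything): restart.
            state = rng.choice(start_candidates)
        else:
            cmds, weights = zip(*successors.items())
            state = rng.choices(cmds, weights=weights)[0]
        seq.append(state)
    return seq
```

Passing a seeded `random.Random` instance as `rng` makes the simulated data reproducible across experiments.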

27 Simulated Data
• Training data and user data for scoring are generated using the Markov chain
• Attack data is taken from Schonlau data
• How much data to generate?
• For the first test, we generate the same amount of simulated data as is in the Schonlau set
• That is, 5k commands per user

28 Detection with Simulated Data
• PHMM vs HMM, Round 2
• It’s close, but HMM still wins!

29 Limited Training Data
• What if less training data is available?
• In a real application, initially, training data is limited
• Can’t detect attacks until sufficient training data has been accumulated
• So, the less data required, the better
• Experiments using simulated data with limited training data
• Used 200 to 800 commands for training

30 Limited Training Data
• PHMM vs HMM, Round 3
• With 400 commands or fewer, PHMM wins big!

31 Conclusion
• PHMM is competitive with best approaches
• PHMM likely to do better, given better training data (begin/end info)
• PHMM much better than HMM when limited training data is available
• Of practical importance
• Why does it make sense that PHMM would do better with limited training data?

32 Conclusion
• Given current state of research…
• Optimal masquerade detection approach
• Initially, collect a small training set
• Train PHMM and use it for detection
• If no attack is detected, continue to collect data
• When sufficient data is available, train HMM
• From then on, use HMM for detection

33 Future Work
• Collect a better real data set!
• Many problems/limitations with Schonlau data
• Improved data set could be the basis for lots and lots of research
• Directly compare PHMM/bioinformatics approaches with previous work (HMM, naïve Bayes, SVM, etc.)
• Consider hybrid techniques
• Other techniques?

34 References
• L. Huang and M. Stamp, Masquerade detection using profile hidden Markov models, to appear in Computers and Security
• M. Schonlau, Masquerading user data

