Statistical based IDS background introduction


1 Statistical based IDS background introduction

2 Statistical IDS background
- Why do we do this project
- Attack introduction
- IDS architecture
- Data description
- Feature extraction
- Statistical method introduction
- Result analysis

3 Project goals
The Internet carries a variety of network attacks, including denial-of-service (DoS) attacks and port scans.
Related work
- Overall traffic detection
- Flow-level detection
Our goals
- Detect both attacks at the same time
- Differentiate DoS attacks from port scans

4 Attack introduction
TCP SYN flooding
- An important form of DoS attack
- Exploits TCP's three-way handshake mechanism and its limited capacity for maintaining half-open connections
- Feature: spoofed source IP address
- Recent variant: reflected SYN/ACK flooding attacks

5 Attack introduction
Port scan
- Horizontal scan
- Vertical scan
- Block scan
Feature: real source IP address

6 Statistical IDS architecture
- Learning part
- Detection part

7 Data description
DARPA98 data
- The first standard corpus for evaluating network intrusion detection systems
- From the Information Systems Technology Group (IST) of MIT Lincoln Laboratory, under Defense Advanced Research Projects Agency (DARPA ITO) and Air Force Research Laboratory (AFRL/SNHS) sponsorship
- Seven weeks of training data
- Two weeks of detection data

8 Data description
DARPA98 data format
> : S ACK : (0) win 512 <mss 1460>
- Time stamp:
- Source IP address + port:
- Destination IP address + port:
- TCP flag: S (may also be R, F, or P)
- ACK flag: ACK
- Other part of packet header: : (0) win 512 <mss 1460>
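A record in this layout could be split into its fields with a regular expression. The sketch below is only illustrative: it assumes a tcpdump-style text line with a leading timestamp, and the sample line, addresses, ports, and the `parse_line` helper are all hypothetical rather than taken from the DARPA98 data.

```python
import re

# Hypothetical sample line; addresses, ports, and sequence numbers are
# illustrative only and do not come from the DARPA98 data.
SAMPLE = ("10:00:01.123456 192.0.2.10.1025 > 198.51.100.5.80: "
          "S 1000:1000(0) win 512 <mss 1460>")

# Assumed layout: time, src ip.port, '>', dst ip.port, ':', TCP flag, rest.
LINE_RE = re.compile(
    r"^(?P<time>\S+)\s+"
    r"(?P<src_ip>\d+\.\d+\.\d+\.\d+)\.(?P<src_port>\d+)\s+>\s+"
    r"(?P<dst_ip>\d+\.\d+\.\d+\.\d+)\.(?P<dst_port>\d+):\s+"
    r"(?P<flags>[SFRP.]+)\s*(?P<rest>.*)$"
)


def parse_line(line):
    """Return a dict of header fields, or None if the line does not match."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    record = match.groupdict()
    record["src_port"] = int(record["src_port"])
    record["dst_port"] = int(record["dst_port"])
    # Treat a standalone 'ack' token in the remainder as the ACK flag.
    record["has_ack"] = "ack" in record["rest"].lower().split()
    return record


if __name__ == "__main__":
    print(parse_line(SAMPLE))
```

Lines that do not fit the assumed layout return `None`, so a caller can skip them instead of failing on malformed input.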

9 Feature extraction
Calculate the metrics over every 5-minute window of traffic.
For example:
- SYN-SYN_ACK pair
- SYN-FIN + SYN-RSTactive pair
- traffic volume
- SYN packet volume
- ……
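Per-window metrics of this kind can be gathered by bucketing packets into 5-minute intervals and counting flags. The sketch below assumes each packet is already reduced to a `(timestamp_seconds, flag, has_ack)` tuple, and it assumes a pair metric can be taken as the difference of two counts; both the tuple layout and that interpretation are assumptions, not something the slides state.

```python
from collections import defaultdict

WINDOW_SECONDS = 300  # 5-minute interval


def window_counts(packets):
    """Count packets per 5-minute window.

    packets: iterable of (timestamp_seconds, flag, has_ack) tuples,
    a hypothetical layout chosen for this sketch.
    """
    windows = defaultdict(
        lambda: {"total": 0, "syn": 0, "syn_ack": 0, "fin": 0, "rst": 0}
    )
    for ts, flag, has_ack in packets:
        counts = windows[int(ts // WINDOW_SECONDS)]
        counts["total"] += 1  # traffic volume
        if flag == "S" and has_ack:
            counts["syn_ack"] += 1
        elif flag == "S":
            counts["syn"] += 1  # SYN packet volume
        elif flag == "F":
            counts["fin"] += 1
        elif flag == "R":
            counts["rst"] += 1
    return dict(windows)


def syn_minus_syn_ack(counts):
    """Assumed form of the SYN-SYN_ACK pair metric: a count difference."""
    return counts["syn"] - counts["syn_ack"]


if __name__ == "__main__":
    demo = [(0, "S", False), (1, "S", True), (10, "S", False),
            (299.9, "F", False), (300, "R", False)]
    for index, counts in sorted(window_counts(demo).items()):
        print(index, counts, syn_minus_syn_ack(counts))
```

A large positive difference in a window means many SYN packets went unanswered, which is the kind of signal a SYN flood would be expected to produce.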

10 Statistical method
Statistical based IDS
Goals: Use statistical metrics and algorithms to distinguish anomalous traffic from benign traffic, and to distinguish between different types of attacks.
- Advantage: detects unknown attacks
- Disadvantage: false positives and false negatives

11 Hidden Markov Model (HMM)
HMM is a very useful statistical learning model. It has been applied successfully in speech recognition.
- Advantage
  1. Analyzes sequence data (represented by observation probabilities and transition probabilities)
  2. Supports both unsupervised and supervised training
  3. High accuracy
- Disadvantage
  Comparatively long training time
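To show how observation and transition probabilities combine on sequence data, the sketch below computes the likelihood of an observation sequence with the standard forward algorithm. The two hidden states, the two-symbol alphabet, and every probability value are hypothetical placeholders, not parameters from this project.

```python
def forward_likelihood(obs, start, trans, emit):
    """Probability of a non-empty observation sequence under an HMM.

    obs:   list of observation symbol indices
    start: start[i] is the probability of beginning in state i
    trans: trans[i][j] is the probability of moving from state i to state j
    emit:  emit[i][o] is the probability of emitting symbol o in state i
    """
    n = len(start)
    # Initialisation: start in state i and emit the first symbol.
    alpha = [start[i] * emit[i][obs[0]] for i in range(n)]
    # Induction: sum over predecessor states, then emit the next symbol.
    for symbol in obs[1:]:
        alpha = [
            sum(alpha[i] * trans[i][j] for i in range(n)) * emit[j][symbol]
            for j in range(n)
        ]
    # Termination: sum over all possible final states.
    return sum(alpha)


if __name__ == "__main__":
    # Hypothetical model: state 0 = benign, state 1 = attack;
    # symbol 0 = low metric value, symbol 1 = high metric value.
    start = [0.6, 0.4]
    trans = [[0.7, 0.3], [0.4, 0.6]]
    emit = [[0.9, 0.1], [0.2, 0.8]]
    print(forward_likelihood([0, 0, 1, 1], start, trans, emit))
```

In a detection setting, one option is to score each window sequence against a model trained on benign traffic and flag sequences with unusually low likelihood.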

12 Double Gaussian model
Introduction
- Two Gaussian distribution models are used to represent two classes of behavior
- Compute two probabilities for the current behavior, one from each class's Gaussian parameters
- Compare them: the current behavior belongs to the class with the larger probability
Training period
- Estimate the Gaussian parameters of the two classes
Detection period
- Use the two-class Gaussian parameters to compute the probabilities and compare them
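The training and detection periods can be sketched for a single one-dimensional metric as follows. The class names `normal` and `attack`, the training values, and the helper functions are assumptions made for illustration; the sketch also assumes each class's training data has non-zero spread.

```python
import math
from statistics import mean, pstdev


def fit_gaussian(values):
    """Training period: estimate (mean, standard deviation) for one class."""
    return mean(values), pstdev(values)


def gaussian_pdf(x, mu, sigma):
    """Density of a Gaussian with mean mu and standard deviation sigma at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))


def classify(x, normal_params, attack_params):
    """Detection period: pick the class with the larger density at x."""
    p_normal = gaussian_pdf(x, *normal_params)
    p_attack = gaussian_pdf(x, *attack_params)
    return "attack" if p_attack > p_normal else "normal"


if __name__ == "__main__":
    # Hypothetical per-window metric values for each class.
    normal_params = fit_gaussian([1, 2, 3, 2, 1, 3])
    attack_params = fit_gaussian([40, 50, 60, 45, 55])
    for value in (2, 48):
        print(value, classify(value, normal_params, attack_params))
```

Each window is classified independently of the ones before it, which is why this model is fast but carries no sequence information.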

13 Double Gaussian model
Advantage
- Simple, easy to understand
- Fast
Disadvantage
- No sequence characteristic

14 Result analysis
Evaluation
- Important quantitative analysis: false positives + false negatives
- Looking at metric values and finding the reasons
- Repeating experiments
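False positives and false negatives can be counted by comparing the detector's per-window decisions with ground-truth labels. The sketch below assumes both are given as equal-length lists of booleans (`True` meaning attack); the function name and the sample data are hypothetical.

```python
def error_rates(truth, predicted):
    """Count false positives/negatives and convert them to rates.

    truth, predicted: equal-length lists of booleans, True = attack.
    """
    pairs = list(zip(truth, predicted))
    false_pos = sum(1 for t, p in pairs if not t and p)
    false_neg = sum(1 for t, p in pairs if t and not p)
    negatives = sum(1 for t in truth if not t)
    positives = sum(1 for t in truth if t)
    return {
        "false_positives": false_pos,
        "false_negatives": false_neg,
        # Rates are relative to the number of truly benign / truly attack windows.
        "fp_rate": false_pos / negatives if negatives else 0.0,
        "fn_rate": false_neg / positives if positives else 0.0,
    }


if __name__ == "__main__":
    truth = [True, True, False, False, False]
    predicted = [True, False, True, False, False]
    print(error_rates(truth, predicted))
```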

