Network Intrusion Detection Using Random Forests Jiong Zhang Mohammad Zulkernine School of Computing Queen's University Kingston, Ontario, Canada.


1 Network Intrusion Detection Using Random Forests
Jiong Zhang, Mohammad Zulkernine
School of Computing, Queen's University, Kingston, Ontario, Canada

2 Outline (PST2005, Jiong Zhang and Mohammad Zulkernine)
- Motivation
- Intrusion detection system
- Data mining meets intrusion detection
- Proposed architecture
- Challenges and solutions
- Experimental results
- Conclusion and future work

3 Motivation
- An intrusion prevention system (firewall) cannot prevent all attacks.
[Diagram: Intruder, Internet, Firewall, Victim]

4 Motivation (contd.)
Statistical data for intrusions
- Total reported losses in 2004: $141,496,560. Source: FBI survey for Year 2004
- 50% of security breaches are undetected. Source: FBI Statistics for Year 2000

5 Intrusion Detection Techniques
- Misuse Detection
  - Extracts patterns of known intrusions
  - Cannot detect novel intrusions
  - Has low false positive rate
- Anomaly Detection
  - Builds profiles for normal activities
  - Uses the deviations from the profiles to detect attacks
  - Can detect unknown attacks
  - Has high false positive rate
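The contrast between the two techniques can be illustrated with a minimal sketch. This is not the authors' detector: the signature strings, the connection-rate numbers, and the three-standard-deviation threshold are all assumptions chosen for illustration. The misuse detector only matches known patterns, so it misses a novel attack, while the anomaly detector flags anything that deviates strongly from a profile of normal activity.

```python
from statistics import mean, stdev

# Hypothetical signatures of known attacks (illustrative only).
KNOWN_SIGNATURES = {"/etc/passwd", "' OR 1=1"}


def misuse_detect(payload: str) -> bool:
    """Flag a payload that contains any known attack signature."""
    return any(sig in payload for sig in KNOWN_SIGNATURES)


def build_profile(normal_values):
    """Summarise normal activity by its mean and standard deviation."""
    return mean(normal_values), stdev(normal_values)


def anomaly_detect(value, profile, threshold=3.0):
    """Flag a value more than `threshold` standard deviations from the mean."""
    mu, sigma = profile
    return abs(value - mu) > threshold * sigma


# Connections per second seen during normal operation (made-up numbers).
profile = build_profile([10, 12, 11, 9, 13, 10, 11])

known_attack = misuse_detect("GET /../../etc/passwd")   # matches a signature
novel_attack_missed = misuse_detect("GET /new-exploit")  # no signature exists
burst_flagged = anomaly_detect(250, profile)             # far outside the profile
normal_ok = anomaly_detect(11, profile)                  # within the profile
```

A lower threshold in `anomaly_detect` would catch more attacks but also flag more legitimate traffic, which is the false-positive trade-off the slide refers to.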

6 Network Intrusion Detection System (NIDS)
- Monitors network traffic to detect intrusions
- Monitors more targets on a network
- Detects some attacks that host-based systems miss
- Does not affect network operations

7 Current NIDS
Many current NIDSs (like Snort):
- Are rule-based
- Are unable to detect novel attacks
- Have a high maintenance cost

8 Rule Based vs. Data Mining
- Rule based systems: Intrusion Data -> Security Experts -> Rules
- Data mining based systems: Labeled Data -> Data Mining Engine -> Patterns

9 Data Mining Meets Intrusion Detection
- Extract patterns of intrusions for misuse detection
- Build profiles of normal activities for anomaly detection
- Build classifiers to detect attacks
- Some IDSs have successfully applied data mining techniques in intrusion detection

10 Proposed Architecture
[Figure: Architecture of the proposed NIDS. Off-line part: Data Set, Off-line Pre-processor, Database (Off line), Pattern Builder. On-line part: Networks, Sensors, On-line Pre-processors, Database (On line), Detector, Alarmer. Data-flow labels: Packets, Audited data, Feature vectors, Training data, Patterns, Alarms.]

11 Random Forests
- Unsurpassable in accuracy among the current data mining algorithms
- Runs efficiently on large data sets with many features
- Gives estimates of which features are important
- No nominal data problem
- No over-fitting

12 Imbalanced Intrusion
- Problems
  - Higher error rate for minority intrusions
  - Some minority intrusions are more dangerous
  - Need to improve the performance for the minority intrusions
- Proposed Solution
  - Down-sample the majority intrusions and over-sample the minority intrusions
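The proposed solution can be sketched as follows. The slides do not say how many records are kept per class, so the single per-class `target` count, the function name `rebalance`, and the toy labels are assumptions made for this illustration only.

```python
import random
from collections import defaultdict


def rebalance(records, labels, target, seed=0):
    """Return resampled (records, labels) with `target` examples per class.

    Classes with at least `target` records are down-sampled without
    replacement; smaller classes keep every original record and are
    over-sampled with replacement until they reach `target`.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for rec, lab in zip(records, labels):
        by_class[lab].append(rec)

    out_records, out_labels = [], []
    for lab, recs in by_class.items():
        if len(recs) >= target:
            chosen = rng.sample(recs, target)  # down-sample the majority
        else:
            extra = rng.choices(recs, k=target - len(recs))
            chosen = recs + extra              # over-sample the minority
        out_records.extend(chosen)
        out_labels.extend([lab] * target)
    return out_records, out_labels


# Toy data: 1000 majority records and 5 minority records (labels illustrative).
records = list(range(1005))
labels = ["dos"] * 1000 + ["u2r"] * 5
new_records, new_labels = rebalance(records, labels, target=50)
counts = {lab: new_labels.count(lab) for lab in set(new_labels)}
```

Down-sampling also shrinks the training set, so a classifier trained on the rebalanced data builds its patterns on far fewer records than the original set contains.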

13 Feature Selection
- Essential for improving detection rate
- Reduces the computational cost
- Many NIDSs select features by intuition or domain knowledge

14 Feature Selection over the KDD’99 Dataset
- Calculate variable importance using random forests.
- Select the 38 most important features in detection.
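Random forests estimate variable importance by permuting one feature's values and measuring how much accuracy drops. The sketch below applies that idea in a simplified form: it shuffles each column of a fixed evaluation set for a generic `predict` function, rather than using the per-tree out-of-bag samples of a real forest. The function names and the toy model are assumptions, not the authors' code.

```python
import random


def accuracy(predict, rows, labels):
    """Fraction of rows that `predict` classifies correctly."""
    return sum(predict(r) == y for r, y in zip(rows, labels)) / len(rows)


def permutation_importance(predict, rows, labels, seed=0):
    """Importance of feature j = accuracy drop after shuffling column j."""
    rng = random.Random(seed)
    baseline = accuracy(predict, rows, labels)
    importances = []
    for j in range(len(rows[0])):
        column = [r[j] for r in rows]
        rng.shuffle(column)
        shuffled = [r[:j] + [v] + r[j + 1:] for r, v in zip(rows, column)]
        importances.append(baseline - accuracy(predict, shuffled, labels))
    return importances


def top_k_features(importances, k):
    """Indices of the k features with the highest importance."""
    ranked = sorted(range(len(importances)),
                    key=lambda j: importances[j], reverse=True)
    return sorted(ranked[:k])


# Toy model: the label depends only on feature 0; features 1 and 2 are noise.
rng = random.Random(1)
rows = [[rng.random(), rng.random(), rng.random()] for _ in range(200)]
labels = [int(r[0] > 0.5) for r in rows]


def predict(row):
    return int(row[0] > 0.5)


importances = permutation_importance(predict, rows, labels)
selected = top_k_features(importances, k=1)
```

On the KDD’99 data the same ranking step would be called with `k=38`, dropping the three features whose importance is lowest.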

15 Some Features
- The two most important features
  - Feature 3. service type, such as http, telnet, and ftp
  - Feature 23. count, number of connections to the same host as the current one during the past two seconds
- The three least important features
  - Feature 7. land, 1 if connection is from/to the same host/port; 0 otherwise
  - Feature 20. num_outbound_cmds, number of outbound commands in an ftp session
  - Feature 21. is_hot_login, 1 if the login belongs to the “hot” list; 0 otherwise

16 Parameter Optimization for Random Forests
- Optimize the parameter Mtry of random forests to improve detection rate.
- Choose 15 as the optimal value, at which the out-of-bag (oob) error rate reaches its minimum.
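Mtry is the number of features tried at random when a tree chooses a split, and the oob error is measured on the records each tree did not see in its bootstrap sample. The sketch below shows the selection procedure on toy data. To stay short it uses one-split trees (decision stumps) instead of full trees, and the function names, tree count, and data are assumptions for illustration, not the authors' setup.

```python
import random


def best_stump(rows, labels, features):
    """Return (feature, threshold, left_label, right_label) with fewest errors."""
    best = None
    for j in features:
        for t in sorted({r[j] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[j] <= t]
            right = [y for r, y in zip(rows, labels) if r[j] > t]
            l_lab = max(set(left), key=left.count)
            r_lab = max(set(right), key=right.count) if right else l_lab
            errors = (sum(y != l_lab for y in left)
                      + sum(y != r_lab for y in right))
            if best is None or errors < best[0]:
                best = (errors, j, t, l_lab, r_lab)
    return best[1:]


def oob_error(rows, labels, mtry, n_trees=15, seed=0):
    """Out-of-bag error of a forest of stumps that each try `mtry` features."""
    rng = random.Random(seed)
    n, d = len(rows), len(rows[0])
    votes = [{} for _ in range(n)]  # votes only from trees that did not see row i
    for _ in range(n_trees):
        bag = [rng.randrange(n) for _ in range(n)]  # bootstrap sample
        feats = rng.sample(range(d), mtry)          # random feature subset
        j, t, l_lab, r_lab = best_stump(
            [rows[i] for i in bag], [labels[i] for i in bag], feats)
        for i in set(range(n)) - set(bag):          # out-of-bag rows
            pred = l_lab if rows[i][j] <= t else r_lab
            votes[i][pred] = votes[i].get(pred, 0) + 1
    scored = [(v, y) for v, y in zip(votes, labels) if v]
    return sum(max(v, key=v.get) != y for v, y in scored) / len(scored)


def choose_mtry(rows, labels, candidates):
    """Return the candidate with the lowest oob error, plus all errors."""
    errors = {m: oob_error(rows, labels, m) for m in candidates}
    return min(errors, key=errors.get), errors


# Toy data: 4 features, only feature 2 carries signal.
rng = random.Random(42)
rows = [[rng.random() for _ in range(4)] for _ in range(60)]
labels = [int(r[2] > 0.5) for r in rows]
best_mtry, errors = choose_mtry(rows, labels, candidates=[1, 2, 3, 4])
```

Because the oob error comes from records left out of each bootstrap sample, this search needs no separate validation set.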

17 Performance Comparison on the KDD’99 Dataset
- Our approach provides a lower overall error rate and cost than the best KDD’99 result.
- Feature selection can improve the performance of intrusion detection.

18 Conclusion and Future Work
- The random forests algorithm can help improve detection performance and select features.
- Sampling techniques can reduce the time to build patterns and increase the detection rate of minority intrusions.
- In the future, we will focus on anomaly detection and a multiple classifier architecture.


