Nick Duffield, Patrick Haffner, Balachander Krishnamurthy, Haakon Ringberg Rule-Based Anomaly Detection on IP Flows.


1 Nick Duffield, Patrick Haffner, Balachander Krishnamurthy, Haakon Ringberg Rule-Based Anomaly Detection on IP Flows

2 Unwanted traffic detection
Intrusion Detection Systems (IDSes) protect the edge of a network
- Inspect IP packets
- Look for worms, DoS, scans, instant messaging, etc.
Many IDSes leverage known signatures of traffic
- e.g., Slammer packets contain “MS-SQL” (say) in the payload
- or AOL IM packets use specific TCP ports and application headers
[Figure: Enterprise network; packet layout: IP header, TCP header, App header, Payload]

3 Packet and rule-based IDSes
A predicate is a boolean function on a packet feature
- e.g., TCP port = 80
A signature (or rule) is a set of predicates
Benefits
- Programmable
- Leverage existing community
  - Many rules already exist
  - CERT, SANS Institute, etc.
- Classification “for free”

4 Drawbacks
- Packet inspection at the edge requires deployment at many interfaces
- Too many packets per second
A predicate is a boolean function on a packet feature
- e.g., TCP port = 80
A signature (or rule) is a set of predicates

5 Packet and rule-based IDSes
Drawbacks
- Packet inspection at the edge requires deployment at many interfaces
- Too many packets per second
- DPI predicates can be computationally expensive
Example rule: packet has
- Port number X, Y, or Z
- Contains pattern “foo” within the first 20 bytes
- Contains pattern “ba*r” within the first 40 bytes
A predicate is a boolean function on a packet feature
- e.g., TCP port = 80
A signature (or rule) is a set of predicates
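As a rough illustration of the definitions on these slides, the sketch below models a predicate as a boolean function on a packet and a rule as a list of predicates. It assumes that a rule fires only when every predicate holds. The `Packet` class, the helper names, and the sample ports are hypothetical and are not Snort's rule engine. Only literal patterns are handled; a wildcard pattern such as “ba*r” would need a regular-expression matcher.

```python
from dataclasses import dataclass
from typing import Callable, List, Set


@dataclass
class Packet:
    """Hypothetical, heavily simplified packet: one header field plus payload."""
    dst_port: int
    payload: bytes


# A predicate is a boolean function on a packet feature.
Predicate = Callable[[Packet], bool]


def port_in(ports: Set[int]) -> Predicate:
    """Header predicate: destination port is one of the given ports."""
    return lambda pkt: pkt.dst_port in ports


def payload_contains(pattern: bytes, within: int) -> Predicate:
    """Payload predicate: literal pattern lies entirely in the first `within` bytes."""
    return lambda pkt: pattern in pkt.payload[:within]


def matches(rule: List[Predicate], pkt: Packet) -> bool:
    """Assumption: a rule matches only if all of its predicates hold."""
    return all(pred(pkt) for pred in rule)


# Example rule: port X, Y, or Z and "foo" within the first 20 bytes.
rule = [port_in({80, 8000, 8080}), payload_contains(b"foo", within=20)]

hit = matches(rule, Packet(dst_port=80, payload=b"xxfooyy"))
miss = matches(rule, Packet(dst_port=80, payload=b"x" * 30 + b"foo"))
```

The payload predicate has to scan packet bytes, which is the part that becomes expensive at high packet rates; the port predicate only reads a fixed header field.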

6 Our idea: IDS on IP flows
How well can rule-based IDSes be mimicked on IP flows?
- Efficient
  - Only fixed-offset rule predicates
  - More compact (no payload)
- Flow collection infrastructure is ubiquitous
- IP flows capture the concept of a connection
Flow record fields: src IP, dst IP, src Port, dst Port, Duration, # Packets (example row values: A, B, 5 min, 36)

7 Idea
1. IDSes associate a “label” with every packet
2. An IP flow is associated with a set of packets
3. Our system associates the labels with flows
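The three steps can be sketched as follows. This is a minimal illustration, not the authors' system: it assumes a flow is keyed by a 5-tuple and that a flow inherits the union of the labels raised on its packets. The key layout, the function name, and the sample data are all hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple

# Hypothetical flow key: (src IP, dst IP, src port, dst port, protocol).
FlowKey = Tuple[str, str, int, int, str]


def label_flows(
    packets: Iterable[Tuple[FlowKey, Set[str]]]
) -> Dict[FlowKey, Set[str]]:
    """Give each flow the union of the IDS labels raised on its packets."""
    flow_labels: Dict[FlowKey, Set[str]] = defaultdict(set)
    for key, labels in packets:
        # Touching the key also records flows whose packets raised no alarm.
        flow_labels[key] |= labels
    return dict(flow_labels)


slammer_flow: FlowKey = ("A", "B", 4321, 1434, "udp")
web_flow: FlowKey = ("C", "D", 5555, 80, "tcp")

# Each entry: the flow a packet belongs to and the labels the IDS gave it.
packets = [
    (slammer_flow, {"slammer"}),
    (web_flow, set()),
    (web_flow, set()),
]

labels = label_flows(packets)
```

Flows with an empty label set serve as negative examples when a classifier is later trained on flow features.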

8 Snort rule taxonomy
Rule class        | Description                 | Example
Header-only       | Inspect only IP flow header | e.g., port numbers
Meta-information  | Inexact correspondence      | e.g., TCP flags
Payload-dependent | Inspects packet payload     | e.g., “contains ab*c”
Relies on features that cannot be exactly reproduced in the IP flow realm

9 Simple translation
3. Our system associates the labels with flows
Simple rule translation would capture only flow predicates
- Low accuracy or low applicability
Example: Slammer Worm
- Snort rule: dst port = MS SQL, contains “Slammer”
- Only flow predicates: dst port = MS SQL
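A minimal sketch of what such a simple translation amounts to: keep the predicates a flow record can evaluate and drop the rest. The `Predicate` class and its `needs_payload` flag are hypothetical names used only for this illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Predicate:
    """Hypothetical rule predicate, tagged by whether it needs packet payload."""
    text: str
    needs_payload: bool


def translate_to_flow_rule(rule: List[Predicate]) -> List[Predicate]:
    """Simple translation: keep only predicates that a flow record can evaluate."""
    return [p for p in rule if not p.needs_payload]


slammer_rule = [
    Predicate("dst port = MS SQL", needs_payload=False),
    Predicate('contains "Slammer"', needs_payload=True),
]

flow_rule = translate_to_flow_rule(slammer_rule)
```

The translated rule matches every flow to the MS SQL port, which is why the slide expects low accuracy; a rule made only of payload predicates would translate to nothing at all, which is the low-applicability case.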

10 Machine Learning (ML)
3. Our system associates the labels with flows
Leverage ML to learn mapping from “IP flow space” to label
- IP flow space = src port × # packets × flags × duration
- Label: one value if the alarm is raised, another otherwise
[Figure: flows plotted on axes src port and # packets]

11 Boosting
Boosting combines a set of weak learners to create a strong learner
[Figure: weak learners h1, h2, h3 combined through a sign function into H_final]
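To make the idea concrete, here is a minimal sketch of AdaBoost with threshold stumps as the weak learners. AdaBoost is one common boosting algorithm, chosen here for illustration; the slides do not say which variant was used. The single feature (packets per flow) and the toy labels are made up.

```python
import math
from typing import List, Sequence, Tuple

Stump = Tuple[int, float, int]  # (feature index, threshold, polarity)


def stump_predict(stump: Stump, x: Sequence[float]) -> int:
    """Weak learner: answer `polarity` if the feature exceeds the threshold."""
    feat, thresh, polarity = stump
    return polarity if x[feat] > thresh else -polarity


def best_stump(X, y, w) -> Tuple[Stump, float]:
    """Exhaustively pick the stump with the lowest weighted error."""
    best, best_err = None, float("inf")
    for feat in range(len(X[0])):
        for thresh in sorted({x[feat] for x in X}):
            for polarity in (1, -1):
                stump = (feat, thresh, polarity)
                err = sum(wi for xi, yi, wi in zip(X, y, w)
                          if stump_predict(stump, xi) != yi)
                if err < best_err:
                    best, best_err = stump, err
    return best, best_err


def adaboost(X, y, rounds: int) -> List[Tuple[float, Stump]]:
    """Train a weighted ensemble of stumps; labels must be +1 or -1."""
    n = len(X)
    w = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        stump, err = best_stump(X, y, w)
        err = min(max(err, 1e-10), 1 - 1e-10)  # avoid log(0) and division by 0
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, stump))
        # Raise the weight of misclassified examples, lower the others.
        w = [wi * math.exp(-alpha * yi * stump_predict(stump, xi))
             for xi, yi, wi in zip(X, y, w)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble


def predict(ensemble, x) -> int:
    """Strong learner: sign of the weighted vote of the weak learners."""
    score = sum(alpha * stump_predict(stump, x) for alpha, stump in ensemble)
    return 1 if score >= 0 else -1


# Toy data: one feature (packets per flow); +1 means the alarm was raised.
X = [(1,), (5,), (40,)]
y = [1, -1, 1]

one_round = adaboost(X, y, rounds=1)
three_rounds = adaboost(X, y, rounds=3)
```

On this toy set no single stump separates the labels, so one round still misclassifies a flow, while the vote of three stumps classifies all three correctly.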

12 Benefit of Machine Learning (ML)
Rule translation would capture flow-only predicates
- Low accuracy or low applicability
ML algorithms discover new predicates that capture the rule
- Latent correlations between predicates
- Capturing same subspace using different dimensions
Example: Slammer Worm
- Snort rule: dst port = MS SQL, contains “Slammer”
- Only flow predicates: dst port = MS SQL
- ML-generated rule: dst port = MS SQL, packet size = 404, flow duration

13 Architecture
1. Operate at a small # of interfaces
2. Use ML algorithms to learn to classify on IP flows
3. Apply learned classifiers across all/other interfaces

14 Evaluation
Border router on OC-3 link
- Used Snort rules in place
- Unsampled NetFlow v5 and packet traces
Statistics
- One month, 2 MB/s average, 1 billion flows
- 400k Snort alarms

15 Accuracy metrics
Receiver Operating Characteristic (ROC)
- Full FP vs TP tradeoff
- But need a single number
  - Area Under Curve (AUC)
  - Average Precision (AP)
AP of p means (1 - p) / p FP per TP
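The conversion from precision to false positives per true positive follows from p = TP / (TP + FP), which gives FP / TP = (1 - p) / p. The sketch below checks that arithmetic and also computes average precision over a ranked list, using the standard non-interpolated definition; the slides do not spell out which AP variant was used, so treat that part as an assumption. Function names and sample scores are illustrative.

```python
from typing import Sequence


def average_precision(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Mean of the precision values measured at the rank of each positive."""
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits = 0
    precisions = []
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            precisions.append(hits / rank)
    return sum(precisions) / len(precisions) if precisions else 0.0


def fp_per_tp(p: float) -> float:
    """False positives per true positive at precision p: (1 - p) / p."""
    return (1 - p) / p


# Three flows ranked by classifier score; 1 marks a flow whose alarm was raised.
ap = average_precision([0.9, 0.8, 0.7], [1, 0, 1])

# The accuracy values 0.95 and 0.70 that appear on the classifier accuracy slide.
fp_at_095 = round(100 * fp_per_tp(0.95))
fp_at_070 = round(100 * fp_per_tp(0.70))
```

With these inputs, 0.95 works out to about 5 FP per 100 TP and 0.70 to about 43 FP per 100 TP, which matches the callouts on the classifier accuracy slide.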

16 Classifier accuracy
Training on week 1, testing on week n
- High degree of accuracy for header and meta
- Minimal drift within a month
Accuracy by rule class (columns: Week 1-2, Week 1-3, Week 1-4)
- Header rules: 1.00, 0.99
- Meta-information: 1.00, 0.95
- Payload: 0.70, 0.71, 0.70
Callouts: 5 FP per 100 TP; 43 FP per 100 TP

17 Difference in rule accuracy
Accuracy is a function of correlation between flow and packet-level features
Rule                    | Overall accuracy | w/o dst port | w/o mean packet size
MS-SQL version overflow | 1.00             | 0.99         | 0.83
ICMP PING speedera      | 0.82             | 0.79         | 0.06
NON-RFC HTTP DELIM      | 0.48             | 0.02         | 0.22

18 Choosing an operating point
[Figure: Venn diagram of X and Z with overlap Y]
- X = alarms we want raised
- Z = alarms that are raised
- Precision = Y / Z (exactness)
- Recall = Y / X (completeness)
AP is a single number, but not most intuitive
Precision & recall are useful for operators
- “I need to detect 99% of these alarms!”
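In set terms, Y is the overlap of X and Z, so both ratios can be computed directly. The flow identifiers below are made up for the example.

```python
from typing import Set, Tuple


def precision_recall(wanted: Set[str], raised: Set[str]) -> Tuple[float, float]:
    """Precision = |Y| / |Z| and recall = |Y| / |X|, where Y is the overlap."""
    overlap = wanted & raised  # Y: alarms that were both wanted and raised
    precision = len(overlap) / len(raised) if raised else 0.0
    recall = len(overlap) / len(wanted) if wanted else 0.0
    return precision, recall


wanted = {"f1", "f2", "f3", "f4"}   # X: alarms we want raised
raised = {"f3", "f4", "f5"}         # Z: alarms that are raised

precision, recall = precision_recall(wanted, raised)
```

Here two of the three raised alarms were wanted (precision 2/3), and two of the four wanted alarms were raised (recall 1/2).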

19 Choosing an operating point
AP is a single number, but not most intuitive
Precision & recall are useful for operators
- “I need to detect 99% of these alarms!”
Precision by rule (columns: precision w/ recall = 1.00, precision w/ recall = 0.99)
- MS-SQL version overflow: 1.00
- ICMP PING speedera: 0.02, 0.83
- CHAT AIM receive message: 0.02, 0.11
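One way an operator could pick such an operating point is to walk down the flows ranked by classifier score until the target recall is reached and read off the precision there. The sketch below assumes exactly that procedure; the function name and the sample scores are illustrative, and tied scores are not treated specially.

```python
from typing import Sequence


def precision_at_recall(scores: Sequence[float], labels: Sequence[int],
                        target_recall: float) -> float:
    """Precision at the shortest ranked prefix whose recall meets the target."""
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    total_pos = sum(labels)
    if total_pos == 0:
        raise ValueError("need at least one positive flow")
    tp = 0
    for k, (_, label) in enumerate(ranked, start=1):
        tp += label
        if tp / total_pos >= target_recall:
            return tp / k  # k flows flagged so far, tp of them correct
    raise ValueError("target recall must be at most 1.0")


scores = [0.9, 0.8, 0.6, 0.4, 0.2]
labels = [1, 1, 0, 0, 1]  # 1 marks a flow whose alarm should be raised

full_recall_precision = precision_at_recall(scores, labels, 1.0)
relaxed_precision = precision_at_recall(scores, labels, 0.6)
```

In this toy ranking, catching the last positive requires flagging two extra flows, so precision drops from 1.0 to 0.6. The same effect explains why giving up 1% of recall can raise precision sharply for some rules.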

20 Computational efficiency
1. Machine learning (boosting)
   - 33 hours per rule for one week of OC48
2. Classification of flows
   - 57k flows/sec on a 1.5 GHz Itanium 2
   - Line rate classification for OC48

21 Conclusion
Applying Snort alarms to flows is feasible
ML algorithms discover latent correlations between packet and flow predicates
- High degree of accuracy for many rules
- Minimal drift within a month
Prototype can scale up to OC48 speeds
Qualitatively predictive rule taxonomy
Future work
- Performance on sampled NetFlow
- Cross-site training/classification

22 Thank you! Questions?
Nick Duffield, Patrick Haffner, Balachander Krishnamurthy, Haakon Ringberg

