Firewalls and Intrusion Detection Systems
David Brumley (dbrumley@cmu.edu)
Carnegie Mellon University

IDS and Firewall Goals
- Expressiveness: What kinds of policies can we write?
- Effectiveness: How well does it detect attacks while avoiding false positives?
- Efficiency: How many resources does it take, and how quickly does it decide?
- Ease of use: How much training is necessary? Can a non-security expert use it?
- Security: Can the system itself be attacked?
- Transparency: How intrusive is it to use?

Firewalls
Dimensions:
1. Host vs. Network
2. Stateless vs. Stateful
3. Network Layer

Firewall Goals
Provide defense in depth by:
1. Blocking attacks against hosts and services
2. Controlling traffic between zones of trust

Logical Viewpoint
[Figure: a firewall sits between Inside and Outside.]
For each message m, either:
- Allow, with or without modification
- Block, by dropping or sending a rejection notice
- Queue

Placement
Host-based firewall (the firewall sits between the host and Outside). Features:
- Faithful to local configuration
- Travels with you
Network-based firewall (one firewall between Hosts A, B, C and Outside). Features:
- Protects the whole network
- Can make decisions on all of the traffic (traffic-based anomaly)

Parameters
Types of firewalls:
1. Packet filtering
2. Stateful inspection
3. Application proxy
Policies:
1. Default allow
2. Default deny

Recall: Protocol Stack
Layers, top to bottom:
- Application (e.g., SSL)
- Transport (e.g., TCP, UDP)
- Network (e.g., IP)
- Link Layer (e.g., Ethernet)
- Physical
Encapsulation: the application message is the data; the transport layer prepends a TCP header, the network layer prepends an IP header, and the link layer adds an Ethernet header and trailer.

Stateless Firewall
Filter by packet header fields:
1. IP fields (e.g., src, dst)
2. Protocol (e.g., TCP, UDP, ...)
3. Flags (e.g., SYN, ACK)
[Figure: protocol stack (Application, Transport, Network, Link Layer) with the firewall between Outside and Inside.]
Example: only allow incoming DNS packets to nameserver A.A.A.A.
    Allow UDP port 53 to A.A.A.A
    Deny UDP port 53 all
Fail-safe: good practice.
e.g., ipchains in Linux 2.2
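The DNS example can be sketched as a first-match rule list with a default-deny fallback. This is a minimal illustration, not how ipchains is implemented: the `Packet` and `Rule` types, the `decide` function, and the address `10.0.0.53` (standing in for A.A.A.A) are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Packet:
    proto: str      # "udp" or "tcp"
    dst_ip: str
    dst_port: int


@dataclass(frozen=True)
class Rule:
    action: str               # "allow" or "deny"
    proto: str
    dst_port: int
    dst_ip: Optional[str]     # None matches any destination

    def matches(self, pkt: Packet) -> bool:
        return (self.proto == pkt.proto
                and self.dst_port == pkt.dst_port
                and (self.dst_ip is None or self.dst_ip == pkt.dst_ip))


def decide(rules, pkt, default="deny"):
    """First matching rule wins; otherwise fall back to the default policy."""
    for rule in rules:
        if rule.matches(pkt):
            return rule.action
    return default


NAMESERVER = "10.0.0.53"  # hypothetical stand-in for A.A.A.A
RULES = [
    Rule("allow", "udp", 53, NAMESERVER),  # Allow UDP port 53 to A.A.A.A
    Rule("deny", "udp", 53, None),         # Deny UDP port 53 all
]

print(decide(RULES, Packet("udp", NAMESERVER, 53)))   # DNS to the nameserver
print(decide(RULES, Packet("udp", "10.0.0.7", 53)))   # DNS to any other host
```

Each decision looks only at the header fields of the packet in hand, which is what makes the filter stateless.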

Need to keep state
Example: TCP handshake through the firewall (Inside, Firewall, Outside)
- SYN: SN_C ← rand_C, AN_C ← 0
- SYN/ACK: SN_S ← rand_S, AN_S ← SN_C
- ACK: SN ← SN_C + 1, AN ← SN_S
States shown: Listening, Wait, Store SN_C and SN_S, Established.
Desired policy: every SYN/ACK must have been preceded by a SYN.

Stateful Inspection Firewall
Added state (plus the obligation to manage it):
- Timeouts
- Size of table
[Figure: protocol stack (Application, Transport, Network, Link Layer) with a stateful firewall between Outside and Inside.]
e.g., iptables in Linux 2.4

Stateful: More Expressive
Example: TCP handshake (same exchange as above). The firewall:
- Records SN_C in its table
- Verifies AN_S against its table
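A rough sketch of the "record, then verify" policy follows. It uses the slide's simplified numbering, in which the SYN/ACK acknowledges SN_C itself (real TCP acknowledges SN_C + 1). The class name `HandshakeTracker`, its methods, and the `max_entries` bound are hypothetical, not taken from iptables.

```python
class HandshakeTracker:
    """Allow a SYN/ACK only if a matching SYN was seen first."""

    def __init__(self, max_entries=1024):
        self.max_entries = max_entries   # bound on table size (state must be managed)
        self.pending = {}                # (client, server) -> client's sequence number

    def on_syn(self, client, server, seq):
        if len(self.pending) >= self.max_entries:
            return "block"               # table full: refuse to hold more state
        self.pending[(client, server)] = seq   # record SN_C in the table
        return "allow"

    def on_syn_ack(self, server, client, ack):
        expected = self.pending.get((client, server))
        if expected is None or ack != expected:
            return "block"               # no preceding SYN, or AN_S does not match
        del self.pending[(client, server)]
        return "allow"


fw = HandshakeTracker()
print(fw.on_syn_ack("server", "client", 100))  # unsolicited SYN/ACK
print(fw.on_syn("client", "server", 100))
print(fw.on_syn_ack("server", "client", 100))  # matches the recorded SYN
```

A real implementation would also expire entries on a timeout; the fixed bound here only illustrates why table size has to be managed.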

State Holding Attack
Assume a stateful TCP policy. The attacker (outside the firewall):
1. SYN floods
2. Exhausts resources
3. Sneaks a packet through

Fragmentation
IP header fields (4 octets per row): Ver, IHL, TOS, Total Length; ID, 0, DF, MF, Frag ID; ...
- DF: Don't fragment (0 = OK, 1 = Don't)
- MF: More fragments (0 = Last, 1 = More)
- Frag ID = octet number
Example: data split into three fragments of, say, n bytes each:
    Frag 1: IP Hdr, DF=0, MF=1, ID=0
    Frag 2: IP Hdr, DF=0, MF=1, ID=n
    Frag 3: IP Hdr, DF=1, MF=0, ID=2n

Reassembly
The fragments are placed at their offsets: Frag 1 at byte 0, Frag 2 at byte n, Frag 3 at byte 2n.

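Overlapping Fragment Attack
TCP header layout (4 octets per row): Source Port, Destination Port; Sequence Number; ...
Assume the firewall policy:
- Allow incoming port 80 (HTTP)
- Block incoming port 22 (SSH)
Packet 1: DF=1, MF=1, ID=0, carrying 1234 (src port), 80 (dst port), ...
Packet 2: DF=1, MF=1, ID=2, carrying 22, ...
After reassembly the TCP header (which is just data to IP) reads 1234, 22: the overlapping fragment overwrites the destination port and bypasses the policy.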
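The bypass can be reproduced with a toy reassembler in which later fragments overwrite earlier bytes. This is a sketch under the slide's simplification that the fragment offset counts bytes (real IP fragment offsets are in units of 8 bytes); `reassemble`, `dst_port`, and the fragment tuples are hypothetical names.

```python
import struct


def reassemble(fragments):
    """Reassemble (offset, payload) pairs; later fragments overwrite earlier bytes."""
    size = max(offset + len(data) for offset, data in fragments)
    buf = bytearray(size)
    for offset, data in fragments:
        buf[offset:offset + len(data)] = data
    return bytes(buf)


def dst_port(tcp_bytes):
    """Read the destination port from the first 4 bytes of a TCP header."""
    _src, dst = struct.unpack("!HH", tcp_bytes[:4])
    return dst


# Packet 1: src port 1234, dst port 80, followed by the rest of the header.
frag1 = (0, struct.pack("!HH", 1234, 80) + b"\x00" * 16)
# Packet 2: starts at byte 2 and overwrites the destination port with 22.
frag2 = (2, struct.pack("!H", 22))

print(dst_port(frag1[1]))                    # what a first-fragment check sees
print(dst_port(reassemble([frag1, frag2])))  # what the host sees after reassembly
```

A firewall that inspects only the first fragment sees port 80, while the protected host ends up with a segment addressed to port 22.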

Stateful Firewalls
Pros:
- More expressive
Cons:
- State-holding attack
- Mismatch between the firewall's understanding of the protocol and the protected hosts'

Application Firewall
Check protocol messages directly. Examples:
- SMTP virus scanner
- Proxies
- Application-level callbacks
[Figure: protocol stack (Application, Transport, Network, Link Layer) with a stateful firewall between Outside and Inside.]

Firewall Placement

Demilitarized Zone (DMZ)
[Figure: a firewall connects Inside, Outside, and a DMZ containing WWW, NNTP, DNS, and SMTP.]

Dual Firewall
[Figure: Inside, Interior Firewall, Hub with DMZ, Exterior Firewall, Outside.]

Design Utilities
- Solsoft
- Securify

References
Elizabeth D. Zwicky, Simon Cooper, D. Brent Chapman, William R. Cheswick, Steven M. Bellovin, Aviel D. Rubin

Intrusion Detection and Prevention Systems

Logical Viewpoint
[Figure: an IDS/IPS sits between Inside and Outside.]
For each message m, either:
- Report m (IPS: drop or log)
- Allow m
- Queue m

Overview
- Approach: Policy vs. Anomaly
- Location: Network vs. Host
- Action: Detect vs. Prevent

Policy-Based IDS
Use pre-determined rules to detect attacks.
Examples: regular expressions (Snort), cryptographic hashes (Tripwire, Snort)

Example Snort rules

Detect any fragments less than 256 bytes:
    alert tcp any any -> any any (minfrag: 256; msg: "Tiny fragments detected, possible hostile activity";)

Detect IMAP buffer overflow:
    alert tcp any any -> 192.168.1.0/24 143 (content: "|90C8 C0FF FFFF|/bin/sh"; msg: "IMAP buffer overflow!";)

Modeling System Calls [Wagner & Dean 2001]

    f(int x) {
        if (x) { getuid(); } else { geteuid(); }
        x++;
    }

    g() {
        fd = open("foo", O_RDONLY);
        f(0);
        close(fd);
        f(1);
        exit(0);
    }

[Figure: automaton with nodes Entry(f), Exit(f), Entry(g), Exit(g) and edges labeled open(), close(), exit(), getuid(), geteuid().]
An execution inconsistent with the automaton indicates an attack.
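As a small illustration of checking a trace against such a model, the sketch below linearizes g() into a sequence of permitted system calls. It is a simplification assumed for this example, not the Wagner and Dean construction itself: `MODEL` and `consistent` are hypothetical names, and the model deliberately does not track the value of x, so either getuid or geteuid is accepted at each call to f.

```python
# Each step lists the system calls the model permits at that point in g().
MODEL = [
    {"open"},
    {"getuid", "geteuid"},   # f(0): the model does not track x
    {"close"},
    {"getuid", "geteuid"},   # f(1)
    {"exit"},
]


def consistent(trace, model=MODEL):
    """Return True if the observed trace is a prefix of a run the model allows."""
    if len(trace) > len(model):
        return False
    return all(call in allowed for call, allowed in zip(trace, model))


print(consistent(["open", "geteuid", "close", "getuid", "exit"]))  # normal run
print(consistent(["open", "execve"]))                              # injected call
```

Because the model ignores x, it also accepts some traces the program can never produce (for example getuid at the first call to f), which is the kind of imprecision such models trade for efficiency.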

Anomaly Detection
[Figure: the IDS compares each new event against the distribution of "normal" events and labels it Safe or Attack.]

Example: Working Sets
- Days 1 to 300: Alice's working set of hosts is reddit, xkcd, slashdot, fark.
- Day 300: Alice contacts 18487, which is outside the working set.
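The working-set idea reduces to set membership. A minimal sketch, assuming the training period is simply the set of hosts seen so far (the class `WorkingSetDetector` and its methods are hypothetical):

```python
class WorkingSetDetector:
    """Flag any host that was never seen during the training period."""

    def __init__(self):
        self.working_set = set()

    def train(self, hosts):
        self.working_set.update(hosts)

    def is_anomalous(self, host):
        return host not in self.working_set


alice = WorkingSetDetector()
alice.train(["reddit", "xkcd", "slashdot", "fark"])   # days 1 to 300
print(alice.is_anomalous("xkcd"))    # inside the working set
print(alice.is_anomalous("18487"))   # outside the working set
```

A new but harmless site would also be flagged, which is one reason learning the "normal" distribution is hard in practice.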

Anomaly Detection
Pros:
- Does not require pre-determining policy (can catch an "unknown" threat)
Cons:
- Requires that attacks are not strongly related to known traffic
- Learning distributions is hard

Automatically Inferring the Evolution of Malicious Activity on the Internet
David Brumley (Carnegie Mellon University), Shobha Venkataraman (AT&T Research), Oliver Spatscheck (AT&T Research), Subhabrata Sen (AT&T Research)

Evil is constantly on the move
[Figure: network topology with a Tier 1 provider, a spam haven, and other networks.]
Input: labeled IPs from SpamAssassin, IDS logs, etc.
Goal: characterize regions changing from bad to good (Δ-good) or good to bad (Δ-bad).

Research Questions
Given a sequence of labeled IPs:
1. Can we identify the specific regions on the Internet that have changed in malice?
2. Are there regions on the Internet that change their malicious activity more frequently than others?

Challenges
1. Infer the right granularity
   - Previous work: fixed granularity
   - Per-IP granularity (e.g., Spamcop): per-IP is often not interesting
   - BGP granularity (e.g., network-aware clusters [KW'00])
   - Idea: infer the granularity (coarse for some regions, medium for others, fine for a well-managed network)
2. We need online algorithms
   - A fixed-memory device on a high-speed link

Research Questions
Given a sequence of labeled IPs, we present:
1. Δ-Change: Can we identify the specific regions on the Internet that have changed in malice?
2. Δ-Motion: Are there regions on the Internet that change their malicious activity more frequently than others?

Background
1. IP prefix trees
2. TrackIPTree algorithm

IP Prefixes
i/d denotes all IP addresses covered by the first d bits of i.
- Ex: 8.1.0.0/16 covers 8.1.0.0-8.1.255.255
- Ex: 1.2.3.4/32 covers 1 host (all bits)

IP Prefix Tree
An IP prefix tree is formed by masking each bit of an IP address.
- Root (whole net): 0.0.0.0/0
- Level 1: 0.0.0.0/1, 128.0.0.0/1
- Level 2: 0.0.0.0/2, 64.0.0.0/2, 128.0.0.0/2, 192.0.0.0/2
- Deeper nodes shown: 128.0.0.0/3, 160.0.0.0/3, 128.0.0.0/4, 144.0.0.0/4, ..., 0.0.0.0/31
- Leaves (one host): 0.0.0.0/32, 0.0.0.1/32

k-IPTree Classifier
A k-IPTree classifier [VBSSS'09] is an IP tree with at most k leaves, each leaf labeled good ("+") or bad ("-").
[Figure: a 6-IPTree over the same prefix tree, with leaves labeled + or -.]
- Ex: 64.1.1.1 is bad
- Ex: 1.1.1.1 is good
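Classifying an address with such a tree amounts to finding the leaf prefix that covers it. The sketch below uses Python's standard `ipaddress` module. Only the two labels quoted on the slide (64.1.1.1 bad, 1.1.1.1 good) come from the source; the other leaf labels, and the names `LEAVES` and `classify`, are hypothetical.

```python
import ipaddress

# Leaf prefix -> label. The first two labels follow the slide's examples;
# the remaining three are made up so that the leaves cover the whole space.
LEAVES = {
    ipaddress.ip_network("0.0.0.0/2"): "+",
    ipaddress.ip_network("64.0.0.0/2"): "-",
    ipaddress.ip_network("128.0.0.0/3"): "+",   # hypothetical label
    ipaddress.ip_network("160.0.0.0/3"): "-",   # hypothetical label
    ipaddress.ip_network("192.0.0.0/2"): "+",   # hypothetical label
}


def classify(ip, leaves=LEAVES):
    """Return the label of the longest leaf prefix covering `ip`."""
    addr = ipaddress.ip_address(ip)
    matches = [net for net in leaves if addr in net]
    if not matches:
        raise ValueError(f"no leaf covers {ip}")
    best = max(matches, key=lambda net: net.prefixlen)
    return leaves[best]


print(classify("64.1.1.1"))
print(classify("1.1.1.1"))
```

A linear scan over the leaves keeps the example short; an actual prefix tree would walk one bit per level instead.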

TrackIPTree Algorithm [VBSSS'09]
- In: stream of labeled IPs
- Out: k-IPTree
[Figure: TrackIPTree consumes the stream and maintains a tree with nodes at /0, /1, /16, /17, /18 and leaves labeled + or -.]

Δ-Change Algorithm
1. Approach
2. What doesn't work
3. Intuition
4. Our algorithm

Goal: identify online the specific regions on the Internet that have changed in malice.
- Epoch 1: IP stream s_1 gives tree T_1. Epoch 2: IP stream s_2 gives tree T_2. And so on.
- Δ-Bad: a change from good to bad
- Δ-Good: a change from bad to good
- False positive: misreporting that a change occurred
- False negative: missing a real change

Idea: divide time into epochs and diff
1. Use TrackIPTree on labeled IP stream s_1 to learn T_1
2. Use TrackIPTree on labeled IP stream s_2 to learn T_2
3. Diff T_1 and T_2 to find Δ-Good and Δ-Bad
✗ Different granularities! (In the example, T_1 splits down to /18 while T_2 stops at /16.)

Δ-Change Algorithm
Main idea: use classification errors between T_{i-1} and T_i to infer Δ-Good and Δ-Bad.
[Figure: pipeline of the algorithm.]
- TrackIPTree takes T_{i-2} to T_{i-1} on stream S_{i-1}, and T_{i-1} to T_i on stream S_i.
- A fixed tree is annotated with classification error on S_{i-1}, giving T_old,i-1, and on S_i, giving T_old,i.
- Comparing the (weighted) classification error of T_old,i-1 and T_old,i (note both are based on the same tree) gives Δ-Good and Δ-Bad.

Comparing (Weighted) Classification Error
The same tree, rooted at a /16, annotated for two epochs (nodes in the order they appear on the slide):

Node   T_old,i-1             T_old,i
1      IPs: 200, Acc: 40%    IPs: 170, Acc: 13%
2      IPs: 150, Acc: 90%    IPs: 100, Acc: 10%
3      IPs: 110, Acc: 95%    IPs: 80, Acc: 5%
4      IPs: 40, Acc: 80%     IPs: 20, Acc: 20%
5      IPs: 50, Acc: 30%     IPs: 70, Acc: 20%

The figure is shown four times, each highlighting one outcome: Δ-Change Somewhere, Insufficient Change, Insufficient Traffic, and Δ-Change Localized.
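One way to read these four annotations is as a per-node test followed by a descent to the deepest node that still passes. The sketch below is an interpretation, not the paper's algorithm: it compares plain accuracy rather than a weighted error, the thresholds `min_ips` and `min_change` are made up, and the tree shape is inferred from the IP counts on the slide (200 = 150 + 50, 150 = 110 + 40). Node names such as `root/L` are hypothetical.

```python
# node -> (IPs in epoch i-1, accuracy in epoch i-1, IPs in epoch i, accuracy in epoch i)
STATS = {
    "root":     (200, 0.40, 170, 0.13),
    "root/L":   (150, 0.90, 100, 0.10),
    "root/L/L": (110, 0.95,  80, 0.05),
    "root/L/R": ( 40, 0.80,  20, 0.20),
    "root/R":   ( 50, 0.30,  70, 0.20),
}
CHILDREN = {
    "root": ["root/L", "root/R"],
    "root/L": ["root/L/L", "root/L/R"],
}


def changed(stats, min_ips=50, min_change=0.25):
    """A node counts as changed only with enough traffic AND enough accuracy shift."""
    ips_prev, acc_prev, ips_cur, acc_cur = stats
    enough_traffic = min(ips_prev, ips_cur) >= min_ips
    enough_change = abs(acc_prev - acc_cur) >= min_change
    return enough_traffic and enough_change


def localize(node, stats=STATS, children=CHILDREN):
    """Return the deepest changed nodes at or below `node`."""
    if not changed(stats[node]):
        return []
    deeper = [n for child in children.get(node, [])
              for n in localize(child, stats, children)]
    return deeper or [node]


print(localize("root"))
```

With these made-up thresholds, `root/R` fails on insufficient change, `root/L/R` fails on insufficient traffic, and the change is localized to `root/L/L`.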

Evaluation
1. What are the performance characteristics?
2. Are we better than previous work?
3. Do we find cool things?

Performance
In our experiments, we:
- let k = 100,000 (k-IPTree size)
- processed 30-35 million IPs (one day's traffic)
- used a 2.4 GHz processor
Identified Δ-Good and Δ-Bad in <22 min using <3 MB memory.

How do we compare to network-aware clusters (by prefix)?
2.5x as many changes on average!

Spam
[Figure annotation: Grum botnet takedown]

Botnets
[Figure annotations: 22.1 and 28.6 thousand new DNSChanger bots appeared; 38.6 thousand new Conficker and Sality bots]

Caveats and Future Work
"For any distribution on which an ML algorithm works well, there is another on which it works poorly." (The "No Free Lunch" Theorem)
Our algorithm is efficient and works well in practice, but a very powerful adversary could fool it into having many false negatives. A formal characterization is future work.

Detection Theory
Base rate, fallacies, and detection systems

Let Ω be the set of all possible events. For example:
- Audit records produced on a host
- Network packets seen

Let I ⊆ Ω be the set of intrusion events.
Intrusion rate: Pr[I] = |I| / |Ω|
Example: an IDS received 1,000,000 packets, and 20 of them corresponded to an intrusion. The intrusion rate is Pr[I] = 20/1,000,000 = 0.00002.

Let A ⊆ Ω be the set of alerts.
Alert rate: Pr[A] = |A| / |Ω|
Defn: Sound: A ⊆ I (every alert is an intrusion)
Defn: Complete: I ⊆ A (every intrusion raises an alert)

Defn: False positive: A ∩ ¬I (alert without an intrusion)
Defn: False negative: ¬A ∩ I (intrusion without an alert)
Defn: True positive: A ∩ I
Defn: True negative: ¬A ∩ ¬I

Defn: Detection rate: Pr[A|I] = Pr[A ∩ I] / Pr[I]
Think of the detection rate as the set of intrusions raising an alert normalized by the set of all intrusions.

Another Viewpoint: Probability Trees
[Figure: probability tree showing the detection rate.]

70 Ω IA 18 4 2

Defn: Bayesian detection rate: Pr[I|A] = Pr[A ∩ I] / Pr[A]
Think of the Bayesian detection rate as the set of intrusions raising an alert normalized by the set of all alerts (vs. the detection rate, which normalizes on intrusions). This is the crux of IDS usefulness!

The Bayesian detection rate is the probability that an alert signifies an intrusion. The crux of deciding whether a detection system is useful often comes down to this equation.

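[Venn diagram over Ω: 2 events in I only, 4 in both I and A, 18 in A only.]
Bayesian detection rate: 4 / (4 + 18) ≈ 18%. Only about 18% of all alerts are true positives!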

Challenge
We're often given the detection rate and know the intrusion rate, and want to calculate the Bayesian detection rate:
- 99% accurate medical test
- 99% accurate IDS
- 99% accurate test for deception
- ...

Fact: 76 Proof:

Calculating Bayesian Detection Rate Fact: So to calculate the Bayesian detection rate: One way is to compute: 77
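The equations for this step are not present in the transcript. A standard way to write the calculation, assuming the slide used Bayes' rule together with the law of total probability, expresses the Bayesian detection rate in terms of the intrusion rate Pr[I], the detection rate Pr[A|I], and the false alarm rate Pr[A|¬I]:

```latex
\Pr[I \mid A]
  = \frac{\Pr[A \cap I]}{\Pr[A]}
  = \frac{\Pr[I]\,\Pr[A \mid I]}
         {\Pr[I]\,\Pr[A \mid I] + \Pr[\lnot I]\,\Pr[A \mid \lnot I]}
```

The numerator uses Pr[A ∩ I] = Pr[I] Pr[A|I]; the denominator splits A into its part inside I and its part outside I.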

Example
- There are 1,000 people in the city.
- 1 is a terrorist, and we have their pictures. Thus the base rate of terrorists is 1/1000.
- Suppose we have a new terrorist facial recognition system that is 99% accurate:
  - 99/100 times when someone is a terrorist there is an alarm.
  - For every 100 good guys, the alarm only goes off once.
An alarm went off. Is the suspect really a terrorist?

Example
Answer: The facial recognition system is 99% accurate. That means there is only a 1% chance the guy is not the terrorist.
Wrong!

Formalization
- 1 is a terrorist, and we have their pictures. Thus the base rate of terrorists is 1/1000: P[T] = 0.001
- 99/100 times when someone is a terrorist there is an alarm: P[A|T] = 0.99
- For every 100 good guys, the alarm only goes off once: P[A | not T] = 0.01
- Want to know P[T|A]

Intuition: given 999 good guys, we have 999 × 0.01 ≈ 9-10 false alarms.
Guesses?
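Plugging the three numbers into Bayes' rule gives the answer directly. This is a small sketch; the function name `bayesian_detection_rate` and its parameter names are chosen for this example only.

```python
def bayesian_detection_rate(base_rate, detection_rate, false_alarm_rate):
    """P[T|A] = P[T]P[A|T] / (P[T]P[A|T] + P[not T]P[A|not T])."""
    true_alarms = base_rate * detection_rate
    false_alarms = (1 - base_rate) * false_alarm_rate
    return true_alarms / (true_alarms + false_alarms)


p = bayesian_detection_rate(base_rate=0.001, detection_rate=0.99,
                            false_alarm_rate=0.01)
print(f"P[T|A] = {p:.3f}")   # roughly 0.09
```

The result is about 9%: the roughly ten false alarms from the 999 good guys swamp the single true alarm, which matches the intuition above.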

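Computing the Bayesian detection rate
Pr[I|A] = Pr[A ∩ I] / Pr[A], where both Pr[A ∩ I] and Pr[A] start out unknown.

Recall, to get Pr[A]:
Fact: Pr[A] = Pr[I] Pr[A|I] + Pr[¬I] Pr[A|¬I]
Proof: A is the disjoint union of A ∩ I and A ∩ ¬I, and Pr[A ∩ X] = Pr[X] Pr[A|X] for any event X.

...and to get Pr[A ∩ I]:
Fact: Pr[A ∩ I] = Pr[I] Pr[A|I]
Proof: by the definition of conditional probability, Pr[A|I] = Pr[A ∩ I] / Pr[I].

With both terms known (✓ ✓), the Bayesian detection rate follows.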

For IDS
Let:
- I be an intrusion, A an alert from the IDS
- 1,000,000 msgs per day be processed
- 2 attacks occur per day
- 10 messages per attack
[Figure: false positives vs. true positives.]
- 70% detection requires FP < 1/100,000
- 80% detection generates 40% FP
From Axelsson, RAID 99
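The same calculation shows why the false alarm rate dominates at these base rates. The sketch below assumes 10 messages per attack, so the intrusion rate is 2 × 10 / 1,000,000, and sweeps a few false alarm rates at 70% detection. The rates swept and the function name `alert_precision` are illustrative choices, not figures from Axelsson's paper.

```python
MSGS_PER_DAY = 1_000_000
ATTACKS_PER_DAY = 2
MSGS_PER_ATTACK = 10   # assumed reading of the slide: 10 messages per attack

# Fraction of messages that belong to an intrusion.
INTRUSION_RATE = ATTACKS_PER_DAY * MSGS_PER_ATTACK / MSGS_PER_DAY


def alert_precision(detection_rate, false_alarm_rate, intrusion_rate=INTRUSION_RATE):
    """Pr[I|A] by Bayes' rule."""
    hits = intrusion_rate * detection_rate
    false_alarms = (1 - intrusion_rate) * false_alarm_rate
    return hits / (hits + false_alarms)


for fp_rate in (1e-3, 1e-4, 1e-5):
    share = alert_precision(0.7, fp_rate)
    print(f"FP rate {fp_rate:.0e}: {share:.1%} of alerts are real intrusions")
```

Even a false alarm rate of one in a thousand leaves only about 1% of alerts pointing at real intrusions; the rate has to fall to around one in 100,000 before most alerts are real.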

Conclusion
- Firewalls
  - 3 types: packet filtering, stateful, and application
  - Placement and DMZ
- IDS
  - Anomaly vs. policy-based detection
- Detection theory
  - Base rate fallacy

Questions?

Thought

Overview
- Approach: Policy vs. Anomaly
- Location: Network vs. Host
- Action: Detect vs. Prevent

Type                  Example
Host, Rule, IDS       Tripwire
Host, Rule, IPS       Personal firewall
Net, Rule, IDS        Snort
Net, Rule, IPS        Network firewall
Host, Anomaly, IDS    System call monitoring
Host, Anomaly, IPS    NX bit?
Net, Anomaly, IDS     Working set of connections
Net, Anomaly, IPS
