Use of Measurements in Anomaly Detection CS 8803: Network Measurements Seminar Instructor: Constantinos Dovrolis Fall 2003 Presenter: Buğra Gedik.


1 Use of Measurements in Anomaly Detection CS 8803: Network Measurements Seminar Instructor: Constantinos Dovrolis Fall 2003 Presenter: Buğra Gedik

2 Outline
We’ll be discussing 3 papers:
- Inferring DoS Activity. Paper: D. Moore, G. M. Voelker, and S. Savage. Inferring Internet Denial-of-Service Activity. In Proceedings of the USENIX Annual Technical Conference (USENIX 2001).
- Code-Red Worm. Paper: D. Moore, C. Shannon, and J. Brown. Code-Red: A Case Study on the Spread and Victims of an Internet Worm. In Proceedings of the ACM Internet Measurement Workshop (IMW 2002).
- DoS Attacks and Flash Crowds. Paper: J. Jung, B. Krishnamurthy, and M. Rabinovich. Flash Crowds and Denial of Service Attacks: Characterization and Implications for CDNs and Web Sites. In Proceedings of the International World Wide Web Conference (WWW 2002).

3 Inferring Internet Denial-of-Service Activity David Moore Geoffrey M. Voelker Stefan Savage In Proceedings of the USENIX Annual Technical Conference (USENIX 2001).

4 Problem Statement & Solution Overview
- Problem: How prevalent are denial-of-service attacks in the Internet today? This paper considers only flooding attacks.
- Technique: Use backscatter analysis to estimate the worldwide prevalence of DoS attacks.

5 Backscatter Analysis

6 Some Limiting Assumptions  Address uniformity: Attackers spoof source addresses at random.  Reliable delivery: Attack traffic is delivered reliably to the victim and backscatter is delivered reliably to the monitor.  Backscatter hypothesis: Unsolicited packets observed by the monitor represent backscatter.

7 Address uniformity
- May not hold because:
  - Some ISPs employ ingress filtering; as a result, the attacker may be forced to restrict the address space it spoofs.
  - Reflector attacks: a different kind of flooding attack that is not captured by backscatter analysis, e.g. Smurf or Fraggle attacks.
- The main motivation for the assumption:
  - Many direct DoS attack “tools” use random address spoofing, e.g. Shaft, TFN, TFN2k, trinoo, Stacheldraht, mstream, Trinity.
  - It is possible to use tests like A2 to test uniformity.
[Figure labels, ingress filtering: customer AS, provider AS, attacker, spoofed packet, ingress filter, victim. Figure labels, reflector attack: attacker, packet spoofed with victim’s IP, multicast group, responses.]

8 Reliable delivery
- May not hold because:
  - During the attack, packets may be dropped due to congestion.
  - An IDS may filter the packets.
  - Some types of attacks may not produce backscatter.
- Many attacks generate backscatter: most types of flooding attacks do generate a response.

9 Backscatter hypothesis  May not hold because Any host on the internet can send unsolicited packets to the monitored network  Motivation of the assumption Packets that are consistently targeted to a specific address in the monitored network can be filtered easily Although a concerted effort by a third party can bias the results, this is quite unlikely

10 Extrapolating Backscatter Analysis Results
- Let n be the number of monitored IP addresses, and consider an attack with m packets.
- Then the expected number of backscatter packets observed from the attack, E(X), is: $E(X) = \frac{n m}{2^{32}}$
- Similarly, if the observed rate of an attack is R’, then a lower bound on the real rate R is: $R \ge R' \cdot \frac{2^{32}}{n}$
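The extrapolation on this slide can be checked with a few lines of arithmetic. The sketch below is only an illustration: the function names are hypothetical, and the example assumes a /8 monitor (n = 2^24 addresses) and uniformly random spoofing, as in the address uniformity assumption.

```python
ADDRESS_SPACE = 2 ** 32  # size of the IPv4 address space


def expected_backscatter(m: int, n: int) -> float:
    """Expected backscatter packets seen by a monitor covering n addresses
    when the victim answers m attack packets with uniformly spoofed sources."""
    return n * m / ADDRESS_SPACE


def extrapolated_rate(observed_rate: float, n: int) -> float:
    """Scale a rate observed at the monitor up to the whole address space."""
    return observed_rate * ADDRESS_SPACE / n


n = 2 ** 24  # a /8 network, i.e. 1/256 of the address space
print(expected_backscatter(1_000_000, n))  # 3906.25
print(extrapolated_rate(2.0, n))           # 512.0
```

With a /8 monitor the scaling factor is simply 256, so 2 backscatter packets per second at the monitor correspond to roughly 512 attack packets per second at the victim.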

11 Attack Classification
Two types of classification are done:
- Flow-based classification: used to classify individual attacks, answering the questions how many, how long, and what kind.
- Event-based classification: analyzes the severity of attacks on short time scales.

12 Flow-based classification
- A flow is defined as a series of consecutive packets sharing the same target (victim’s address) and the same IP protocol.
- If no more packets are observed from a flow for 5 minutes, the flow is assumed to have ended.
- All flows that do not have more than 100 packets or that last less than 60 seconds are discarded.
- Flows that are backscattered to only a single IP address in the monitored range are discarded.
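The flow rules on this slide can be sketched as a small grouping routine. This is an illustration under assumptions, not the authors' tooling: the `Packet` record and its field names are hypothetical, and the exact boundary handling (strictly more than 100 packets, at least 60 seconds) is one reading of the slide.

```python
from typing import Iterable, List, NamedTuple


class Packet(NamedTuple):
    ts: float         # arrival time at the monitor, in seconds
    victim: str       # source of the backscatter packet, i.e. the victim
    proto: str        # IP protocol of the backscatter packet
    monitor_dst: str  # monitored address that received the packet


FLOW_TIMEOUT = 300   # 5 minutes of silence ends a flow
MIN_PACKETS = 100    # flows need more than this many packets
MIN_DURATION = 60    # flows must last at least this many seconds


def build_flows(packets: Iterable[Packet]) -> List[List[Packet]]:
    """Group packets by (victim, protocol), splitting on 5-minute gaps."""
    open_flows = {}
    finished = []
    for p in sorted(packets, key=lambda p: p.ts):
        key = (p.victim, p.proto)
        flow = open_flows.get(key)
        if flow is not None and p.ts - flow[-1].ts > FLOW_TIMEOUT:
            finished.append(flow)  # the gap closes the previous flow
            flow = None
        if flow is None:
            flow = []
            open_flows[key] = flow
        flow.append(p)
    finished.extend(open_flows.values())
    return finished


def keep_flow(flow: List[Packet]) -> bool:
    """Apply the three discard rules from the slide."""
    duration = flow[-1].ts - flow[0].ts
    distinct_targets = {p.monitor_dst for p in flow}
    return (len(flow) > MIN_PACKETS
            and duration >= MIN_DURATION
            and len(distinct_targets) > 1)
```

Sorting by timestamp first keeps the timeout check simple; a streaming version would instead expire idle flows as time advances.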

13 Examining the Flows
- Determine the type of attack by examining:
  - TCP flag settings
  - ICMP packets
- Look at the distributions of:
  - IP addresses, using the A2 uniformity test to validate the assumption at a significance level of 0.05
  - port addresses
- Classify the victim by examining:
  - DNS information of the victim
  - AS-level information of the victim from BGP tables
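A uniformity check of this kind can be sketched as follows, assuming the A2 test named on the slide is the Anderson-Darling A² statistic against a fully specified uniform distribution. The function name is hypothetical, and 2.492 is the standard tabulated 5% critical value for that case; whether the authors applied exactly this procedure is an assumption.

```python
import math
from typing import Sequence

CRITICAL_5_PERCENT = 2.492  # tabulated A^2 critical value, fully specified null


def anderson_darling_uniform(addresses: Sequence[int]) -> float:
    """A^2 statistic for 32-bit addresses against Uniform(0, 1)."""
    n = len(addresses)
    eps = 1e-12  # keep values strictly inside (0, 1) so the logs are finite
    u = sorted(min(max(a / 2 ** 32, eps), 1 - eps) for a in addresses)
    s = sum((2 * i + 1) * (math.log(u[i]) + math.log(1.0 - u[n - 1 - i]))
            for i in range(n))
    return -n - s / n


# Evenly spread addresses versus addresses squeezed into 1/16 of the space.
spread = [int((i + 0.5) / 1000 * 2 ** 32) for i in range(1000)]
clustered = [int((i + 0.5) / 1000 * 2 ** 28) for i in range(1000)]
print(anderson_darling_uniform(spread) < CRITICAL_5_PERCENT)     # True
print(anderson_darling_uniform(clustered) < CRITICAL_5_PERCENT)  # False
```

A statistic below the critical value means uniformity is not rejected at the 0.05 level, which is the condition under which the extrapolation from the monitored range is meaningful.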

14 Event-based Classification  An attack event is defined by a victim emitting at least 10 backscatter packets during a one minute period  Attacks are not classified based on type, only criterion is the victim’s IP address  For each minute, the victims that are under attack and the intensity of each attack is determined and recorded
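The event definition above amounts to counting backscatter packets per victim per one-minute bucket. The sketch below assumes packets arrive as hypothetical `(timestamp_seconds, victim_ip)` pairs and uses minutes aligned to multiples of 60 seconds, which the slide does not specify.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple


def attack_events(packets: Iterable[Tuple[float, str]],
                  threshold: int = 10) -> Dict[Tuple[int, str], int]:
    """Return {(minute_index, victim): packet_count} for every minute in
    which a victim emitted at least `threshold` backscatter packets."""
    counts = Counter((int(ts // 60), victim) for ts, victim in packets)
    return {key: c for key, c in counts.items() if c >= threshold}


sample = ([(i, "1.1.1.1") for i in range(12)]          # 12 packets in minute 0
          + [(i, "2.2.2.2") for i in range(9)]         # 9 packets: below threshold
          + [(120 + i, "1.1.1.1") for i in range(10)])  # 10 packets in minute 2
print(sorted(attack_events(sample)))  # [(0, '1.1.1.1'), (2, '1.1.1.1')]
```

The per-minute count doubles as the intensity of each attack, so the same dictionary gives both who is under attack and how hard.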

15 Experimental Setup
- The monitored /8 network represents 1/256 of the total Internet address space.
- From February 1st to February 25th, Ethernet traffic is captured using a shared hub with the ingress router.

16 Summary of Observed Attacks  5000 distinct victim IP addresses in more than 2000 distinct DNS domains

17 Attack/Response Protocols
- ~50% of the attacks generate TCP (RST ACK), suggesting they are TCP flood attacks destined to closed ports.
- ~15% of the attacks generate ICMP host unreachable containing a TCP header that includes the victim’s IP, again suggesting a TCP flood.
- ~12% of the attacks generate ICMP (TTL Exceeded). Strange! These were caused by attacks with a very high rate, and they correspond to around 50% of all backscatter packets observed.
- ~8% of the attacks generate TCP (SYN ACK), suggesting SYN floods.

18 Attack Rate  Uniform Random Attacks are the ones whose source IP addresses satisfy the A2 test  500 SYN packets per second are enough to overwhelm a server (~40% of attacks satisfy this)  14,000 SYN packets per second are enough to overwhelm a server with specialized firewalls (~2.5% of attacks satisfy this)

19 Attack Duration  50% of the attacks are less than 10 minutes  80% of the attacks are less than 30 minutes  90% of the attacks are less than 60 minutes

20 Victim Classification  Significant fraction of attacks targeted to home machines, either dial-up or broadband  Within home users, cable-modem users have experienced some intense attacks with rates going up to 1,000 packets per second.  Significant number of attacks to IRC servers

21 Victim Classification
- No single AS or small set of ASes is a major target.
- 65% of the victims were attacked once and 18% twice.

22 Validation
- 98% of the packets attributed to backscatter do not themselves provoke a response, so they cannot be packets used to probe the monitored network.
- 98% of the victim IP addresses are also encountered in other traces extracted from different datasets collected during the same period.

23 Code-Red: A Case Study on the Spread and Victims of an Internet Worm David Moore Colleen Shannon Jeffery Brown In Proceedings of the ACM Internet Measurement Workshop (IMW 2002)

24 Analysis of the Code-Red Worm
- Worms: self-replicating viruses
- Code-Red worm classification:
  - Code-RedI-v1: memory-resident, static seed, infect/spread/attack
  - Code-RedI-v2: memory-resident, random seed, infect/spread/attack
  - Code-RedII: disk-resident, intelligent, infect/backdoor/spread
- Data sets: packet header trace of hosts sending unsolicited TCP SYN packets to a /8 (class A) network and two /16 networks, July 4 to August 21
  - July 12, 2001: Code-RedI-v1 set loose
  - July 19, 2001: Code-RedI-v2 set loose
  - August 4, 2001: Code-RedII set loose
- Hosts that have sent at least two unsolicited TCP SYN packets (on port 80) to the /8 network are suspected to be infected.
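The infected-host criterion (at least two unsolicited TCP SYN packets on port 80) is easy to express as a counting filter. In this sketch the input is a hypothetical list of `(source_ip, destination_port)` pairs already extracted from the SYN trace; the function name is illustrative.

```python
from collections import Counter
from typing import Iterable, Set, Tuple


def suspected_infected(syn_records: Iterable[Tuple[str, int]],
                       min_probes: int = 2,
                       port: int = 80) -> Set[str]:
    """Sources that sent at least `min_probes` SYNs to `port`."""
    counts = Counter(src for src, dst_port in syn_records if dst_port == port)
    return {src for src, c in counts.items() if c >= min_probes}


records = [("a", 80), ("a", 80), ("b", 80), ("c", 22), ("c", 22), ("c", 80)]
print(suspected_infected(records))  # {'a'}
```

Requiring two probes rather than one filters out sources that hit the monitored range only once, for example through a mistyped address.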

25 Code-RedI Worms
- Infection phase (from the beginning to the end of the 19th of the month): an infected host running the MS IIS HTTP server sends a bogus HTTP request containing the worm to randomly generated IPs. The request leverages a buffer overflow vulnerability in the MS IIS HTTP server.
- Attack phase (from the beginning of the 20th to the end of the month): DoS attack on www.whitehouse.gov.
[Figure labels for the probed targets: host running MS IIS HTTP server (host infected); no MS IIS running; no such host.]

26 Unsolicited SYN probes, Code-Redv1
- The trace includes a large number of probes to 23 IP addresses within the monitored /8 network.
- Using the same static seed, the first 1 million IP addresses are generated by reverse engineering the worm code.
- Those 23 addresses indeed appear in the generated sequence.
- 3 source addresses in the trace do not belong to the generated IP addresses; they must be the initial hosts infected manually (Atlanta, USA; Cambridge, USA; GuangDong, China).

27 Host Infection Rate, Code-Redv2  More than 359,000 unique IP addresses are infected with the Code-RedI worm within a day between midnight of July 19 and July 20.

28 Deactivation rate for Code-Redv1
- A clear time-of-day effect is seen in the figure.
- Many machines are shut down during the night.
- This is an indication that many home and office users are affected by the worm.
- The worm is programmed to switch to its attack phase on July 20, thus we have a sudden increase in the deactivation rate at midnight.

29 Host Classification  Reverse DNS lookups are used to characterize the hosts  It is clear that a surprisingly large number of hosts are dial-up and broadband users  Diurnal variations are observed, which suggests that a majority of the infected hosts are not production web servers

30 Investigating time of day effect
- Find the location of hosts using the IxMapping service (http://www.ipmapper.com).
- Convert UTC time to local time for each host and plot active hosts as a function of time.
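The UTC-to-local conversion can be sketched with the standard library alone. The assumption here is that geolocation has already produced a UTC offset in hours for each host; the `(host, utc_time, utc_offset_hours)` record layout and function names are hypothetical.

```python
from collections import Counter
from datetime import datetime, timedelta, timezone
from typing import Iterable, Tuple


def local_hour(utc_ts: datetime, utc_offset_hours: float) -> int:
    """Hour of day at the host's location; utc_ts must be timezone-aware."""
    local = utc_ts.astimezone(timezone(timedelta(hours=utc_offset_hours)))
    return local.hour


def active_hosts_by_local_hour(
        observations: Iterable[Tuple[str, datetime, float]]) -> Counter:
    """Count distinct hosts seen active in each local hour of the day."""
    seen = {(local_hour(ts, off), host) for host, ts, off in observations}
    return Counter(hour for hour, _ in seen)


ts = datetime(2001, 7, 19, 14, 0, tzinfo=timezone.utc)
print(local_hour(ts, -4))  # 10
print(local_hour(ts, 8))   # 22
```

Plotting the resulting counts against the hour of day is what exposes the diurnal pattern: hosts in different time zones line up once their timestamps are expressed in local time.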

31 The Effect of DHCP
- Between August 2 and August 16, 2 million infected addresses are observed.
- However, only 143,000 hosts were active in the most active 10-minute period.
- This can be attributed to DHCP.
- DHCP inflates the infected host count.
- However, NAT usage may deflate the count.
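The gap between the two numbers comes from counting addresses over different time scales. The sketch below contrasts the two counts on hypothetical `(timestamp_seconds, source_ip)` observations; it uses 10-minute windows aligned to multiples of 600 seconds, which is an assumption rather than the authors' exact windowing.

```python
from collections import defaultdict
from typing import Iterable, Tuple


def total_vs_peak(observations: Iterable[Tuple[float, str]],
                  window: int = 600) -> Tuple[int, int]:
    """Return (distinct addresses over the whole trace,
    largest number of distinct addresses in any single window)."""
    per_window = defaultdict(set)
    overall = set()
    for ts, ip in observations:
        overall.add(ip)
        per_window[int(ts // window)].add(ip)
    peak = max((len(s) for s in per_window.values()), default=0)
    return len(overall), peak


# One host that gets a new DHCP lease in every window, plus one stable host.
obs = [(0, "a"), (700, "b"), (1400, "c"), (10, "d")]
print(total_vs_peak(obs))  # (4, 2)
```

In this toy trace two machines produce four addresses overall but never more than two in one window, which mirrors how address churn inflates the long-term count.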

32 Flash Crowds and Denial of Service Attacks: Characterization and Implications for CDNs and Web Sites J. Jung B. Krishnamurthy M. Rabinovich In Proceedings of the International World Wide Web Conference (WWW 2002)

33 Definitions & Problem Statement  Definitions: Flash Event (FE): A FE is a large surge in traffic to a particular Web site causing dramatic increase in server load and putting severe strain on the network links. Denial of Service Attack (DoS) : A DoS is an explicit attempt by attackers to prevent legitimate users of a service from using that service.  Problem: How to differentiate DoS attacks from Flash Events ? How to improve CDN performance for handling FEs ?

34 Some Example DoS Attacks
- TCP SYN Attack: spoofed SYN packets
- UDP Attacks: connect chargen-echo
- Ping of Death: oversized ICMP packets cause crash
- Smurf Attack: ping various hosts with the victim’s address
- Fraggle and Snork Attacks: echo and WinNT RPC
- Flooding Attack: flood network with useless packets
- DDoS Attacks !!!

35 Example Flash Events  Popular Events, like Elections Olympics  Catastrophic events, like Sept. 11  Popular Webcasts  Play-along Web Sites (for TV shows)

36 Dimensions of the Comparison  The comparison between DoS and FE is done along the following dimensions: Traffic Patterns Client Characteristics File Reference Characteristics

37 Flash Events
Datasets studied:
- Play-along: play-along Web site for a popular TV show
- Chile: the Chile Web site that hosted continuously updated election results of the 1999 election

38 Traffic Volume Request rate grows dramatically during the FE But the duration of the FE is relatively short

39 Traffic Volume Request rates increase rapidly during the initial period of the attack But the increase is far from instantaneous, enough room for adaptation

40 Characterizing Clients Number of clients in a FE is commensurate with the request rate

41 Characterizing Clients There is no clear increase in per-client request rates

42 Old and New clusters  Old clusters: clusters that have been seen before the FE  New clusters: clusters that have been seen during the FE but not before  The percentage of old clusters during the FE is 42.7% for Play-along and 82.9% for Chile Significant proportion of the clusters seen during the FE consists of old clusters Request distribution over clusters is highly skewed
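The old-cluster percentage is a set comparison between the clusters seen before the event and those seen during it. The sketch below assumes clients have already been mapped to cluster identifiers by some earlier step; the function name and the sample identifiers are hypothetical.

```python
from typing import Iterable


def old_cluster_fraction(before: Iterable[str], during: Iterable[str]) -> float:
    """Fraction of clusters seen during the event that were also seen before it."""
    before_set, during_set = set(before), set(during)
    if not during_set:
        return 0.0
    return len(during_set & before_set) / len(during_set)


before = ["c1", "c2", "c3"]
during = ["c2", "c3", "c4", "c5", "c2"]
print(old_cluster_fraction(before, during))  # 0.5
```

The same function applied to a trace from a DoS attack would be expected to return a much smaller value if, as in the attack traces discussed later, most participating clusters are new.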

43 File Reference Characteristics  Over 60% of documents are accessed only during flash events  Less than 10% of documents account for more than 90% of the requests  File reference distribution is highly Zipf-like
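The concentration claim (a small share of documents receiving most of the requests) can be measured directly from a request log. This sketch assumes the log has been reduced to a hypothetical list of requested document names, one entry per request.

```python
import math
from collections import Counter
from typing import Iterable


def top_share(requests: Iterable[str], top_fraction: float = 0.10) -> float:
    """Share of all requests that go to the most popular `top_fraction`
    of distinct documents."""
    counts = sorted(Counter(requests).values(), reverse=True)
    if not counts:
        return 0.0
    k = max(1, math.ceil(top_fraction * len(counts)))
    return sum(counts[:k]) / sum(counts)


# Ten documents: one receives 91 requests, the other nine receive one each.
log = ["index.html"] * 91 + ["page%d.html" % i for i in range(9)]
print(top_share(log))  # 0.91
```

A value above 0.9 for `top_fraction=0.10` corresponds to the skew described on this slide.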

44 DoS Attacks
Datasets studied:
- esg and ol: log files that recorded more than 1 million requests within 60 days. A password cracking attack is performed during this period.
- bit.nl, creighton, fullnote, rellim, sptcccxus: collection of 5 traces that recorded requests to Web servers from machines infected by the Code-Red worm.

45 Traffic Volume & Client Characteristics (Code-Red) The surge occurred because of new clusters joining the attack For traces that contain both infected and non-infected client requests, less than 14.3% of the clusters during the attack were old clusters (even smaller for password cracking)

46 Client Characteristics (Code-Red)
- Request rates per client do not change during the attack.
- Requests are spread more evenly across a larger number of clusters.

47 Comparison of FE and DoS ?

48 Implications to CDNs
- How can we handle FEs more effectively using CDNs?
- We have seen that most requests during a FE are to documents that are not accessed before the FE.
- This causes a lot of cache misses, which overload the origin server.
- One solution is to use cooperative caches, but this introduces high delays.
- The authors propose an alternative approach that does not incur a high delay yet decreases the load on the origin server.

49 Illustration of the Problem
[Figure: client, CDN DNS server, CDN servers, and origin server. Messages shown: request address / receive address (client and CDN DNS server), request obj (client to CDN server), cache miss, request doc / receive doc (CDN server and origin server), request obj from several CDN servers.]

50 Adaptive CDN

