BotMiner: Clustering Analysis of Network Traffic for Protocol- and Structure-Independent Botnet Detection Written by Guofei Gu, Roberto Perdisci, Junjie.


1 BotMiner: Clustering Analysis of Network Traffic for Protocol- and Structure-Independent Botnet Detection Written by Guofei Gu, Roberto Perdisci, Junjie Zhang, and Wenke Lee Georgia Institute of Technology Presented by Latasha A. Gibbs University of South Carolina

2 OUTLINE Definitions and Introduction to Botnet Problem
Detection Framework and Implementation Traffic Monitors and Clustering Experiments & Evaluations Related Work Future Work & Conclusion

3 What is a Bot? A software application that runs automated tasks over the Internet. Performs tasks that are simple and structurally repetitive. Used when emulation of human activity is required, or where a response speed faster than that of humans is required. Examples include gaming bots, chat bots, and auction-site robots.

4 What is the Command and Control (C&C) Channel?
The Command and Control (C&C) channel is needed so bots can receive their commands and coordinate fraudulent activities. The C&C channel is the means by which individual bots form a botnet.

5 Definition of Botnet A collection of compromised computers connected to the Internet. In the paper: a coordinated group of malware instances that are controlled via a C&C communication channel.

6 Botnet Diagram

7 The Problem Botnets are becoming one of the most serious threats to Internet security (“1 quarter of all PCs are part of a botnet” –Vint Cerf). Botnets are evolving and becoming more flexible. Prior to this research, most detection approaches worked only on specific command and control (C&C) protocols (such as IRC and HTTP) and structures (such as centralized).

8 Centralized Structure VS. Peer-to-Peer (P2P) Structure

9 Top 10 Most Wanted Botnets http://www. networkworld
Botnet (compromised US computers): Zeus (3.6 million), Koobface (2.9 million), TidServ (1.5 million), Trojan.Fakeavalert (1.4 million), TR/DIdr.Agent.JKH (1.2 million), Monkif (520,000), Hamweq (480,000), Swizzor (370,000), Gammima (230,000), Conficker (210,000)

10 Botnets are utilized to perform the following:
Distributed Denial-of-Service Attacks, Spam, Phishing, Identity Theft, Information Exfiltration

11 OUTLINE Definitions and Introduction to Botnet Problem
BotMiner Detection Framework and Implementation Traffic Monitors and Clustering Evaluations Related Work Future Work Conclusion

12 MAIN COMPONENTS OF BOTMINER DETECTION SYSTEM
C-PLANE MONITOR, A-PLANE MONITOR, C-PLANE CLUSTERING, A-PLANE CLUSTERING, CROSS-PLANE CORRELATOR

13 Architecture of BotMiner

14 OUTLINE Definitions and Introduction to Botnet Problem
Detection Framework and Implementation Traffic Monitors and Clustering Evaluations Related Work Future Work Conclusion

15 Traffic Monitors
C-PLANE MONITOR: Captures network flows and records information on “who is talking to whom”. The fcapture tool was used (very efficient on high-speed networks). Each flow record contained: time, duration, source IP, destination IP, destination port, and number of packets/bytes transferred in both directions.
A-PLANE MONITOR: Logs information on “who is doing what”. Based on Snort (open-source intrusion detection tool). Capable of detecting scanning activities, spamming, and binary downloading.

16 C-PLANE CLUSTERING Section 2.5
Responsible for reading logs generated by the C-plane monitor and finding clusters of machines that share similar communication patterns. First, irrelevant traffic flows are filtered out in 2 steps: basic filtering and white-listing. Basic filtering (F1) filters out flows that are not directed from internal hosts to external hosts*; F2 filters out flows that contain one-way traffic. White-list filtering filters out flows whose destinations are legitimate servers, based upon the US top 100 most popular websites from Alexa.com. These 2 steps are not critical for proper functioning of the C-plane clustering module, but they help reduce the traffic workload and make the actual clustering process more efficient. After basic filtering and white-listing, traffic is reduced further by aggregating related flows into communication flows (C-flows). * If the C-PLANE monitor is deployed/tested in a LAN, traffic between internal hosts is visible and this filtering applies. If the C-PLANE monitor is deployed at the edge router, this traffic will not be seen.
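
The filtering steps on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Flow` record, the internal address range, and the white-list contents are hypothetical and do not come from the paper or from the fcapture tool.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network


@dataclass(frozen=True)
class Flow:
    """Hypothetical flow record; field names are assumptions for this sketch."""
    src_ip: str
    dst_ip: str
    dst_port: int
    proto: str
    pkts_out: int  # packets sent by the source
    pkts_in: int   # packets sent back by the destination


# Assumed internal range and white-list (documentation-only addresses).
INTERNAL_NET = ip_network("10.0.0.0/8")
WHITELIST = {"203.0.113.10"}


def is_internal(ip: str) -> bool:
    return ip_address(ip) in INTERNAL_NET


def keep_flow(flow: Flow) -> bool:
    # F1: keep only flows directed from an internal host to an external host.
    if not is_internal(flow.src_ip) or is_internal(flow.dst_ip):
        return False
    # F2: drop one-way traffic (no packets in one of the two directions).
    if flow.pkts_out == 0 or flow.pkts_in == 0:
        return False
    # White-listing: drop flows whose destination is a known legitimate server.
    if flow.dst_ip in WHITELIST:
        return False
    return True


def filter_flows(flows):
    return [f for f in flows if keep_flow(f)]
```

Because both steps only reduce workload, skipping them should change the cost of clustering rather than its correctness.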

17 ARCHITECTURE OF C-PLANE CLUSTERING Figure 3

18 C-PLANE CLUSTERING CONT’D
Given an epoch E (1 day), all m TCP/UDP flows that share the same protocol (TCP or UDP), source IP, destination IP, and port are aggregated into the same C-flow, denoted as $c_i = \{f_j\}_{j=1..m}$, where each $f_j$ is a single TCP/UDP flow. Basically, the set $\{c_i\}_{i=1..n}$ of all the n C-flows tells “who was talking to whom” during that epoch.
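
The aggregation can be illustrated as a simple group-by on the four shared fields. The `FlowRecord` type and its field names are assumptions made for this sketch; all input flows are assumed to belong to the same epoch.

```python
from collections import defaultdict
from typing import NamedTuple


class FlowRecord(NamedTuple):
    """Hypothetical flow record for one TCP/UDP flow within a single epoch."""
    proto: str
    src_ip: str
    dst_ip: str
    dst_port: int
    packets: int
    nbytes: int


def aggregate_c_flows(flows):
    """Group flows that share protocol, source IP, destination IP and port."""
    c_flows = defaultdict(list)
    for flow in flows:
        key = (flow.proto, flow.src_ip, flow.dst_ip, flow.dst_port)
        c_flows[key].append(flow)
    return dict(c_flows)
```

Each dictionary key identifies one C-flow, and its value is the list of individual flows that make it up.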

19 Vector Representation of C-flows
To apply clustering algorithms to C-flows, they must be translated into a suitable vector representation. A number of statistical features are extracted from each C-flow and translated into a d-dimensional pattern vector. Given a C-flow, the discrete sample distribution is computed for 4 variables: the number of flows per hour (fph), the number of packets per flow (ppf), the average number of bytes per packet (bpp), and the average number of bytes per second (bps).
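
A rough sketch of turning one C-flow into a pattern vector follows. The slide does not say how each sample distribution is summarised, so this example assumes a mean and a variance per variable, which gives d = 8; the tuple layout of a flow and the 24-hour epoch are also assumptions.

```python
from statistics import fmean, pvariance

HOURS_PER_EPOCH = 24  # assumed: one epoch is one day


def c_flow_vector(flows):
    """Build a feature vector from one C-flow.

    flows: list of (start_sec, duration_sec, packets, nbytes) tuples, where
    start_sec is measured from the beginning of the epoch (assumed layout).
    """
    flows_per_hour = [0] * HOURS_PER_EPOCH
    ppf, bpp, bps = [], [], []
    for start, duration, packets, nbytes in flows:
        flows_per_hour[int(start // 3600) % HOURS_PER_EPOCH] += 1
        ppf.append(packets)
        bpp.append(nbytes / packets if packets else 0.0)
        bps.append(nbytes / duration if duration > 0 else 0.0)

    # Assumption: summarise each of the four samples by mean and variance.
    vector = []
    for sample in (flows_per_hour, ppf, bpp, bps):
        vector.extend((fmean(sample), pvariance(sample)))
    return vector
```

A richer summary (for example, several quantiles per variable) would follow the same pattern and only change the length of the vector.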

20 Example of Results Both graphs depict the statistical distribution for the same client, but the top graph shows a temporal distribution and the other is showing a spatial distribution.

21 2-Step Clustering Clustering C-flows is very expensive.
Because the percentage of machines in a network that are infected by bots is generally small, the authors need to separate the few botnet-related C-flows from a large number of benign C-flows. To cope with the complexity of clustering, the task is broken down into two steps.

22 2-Step Clustering of C-flows
At the first step, they perform coarse-grained clustering on a reduced feature space using a simple clustering algorithm. The result of the first-step clustering is a set of relatively large clusters of C-flows. A second step of clustering is then done on each of these clusters. They implemented both the 1st and 2nd steps using the X-means clustering algorithm (an efficient algorithm based on K-means). X-means is fast and scales well with respect to the size of the dataset.
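
The two-step idea can be sketched as follows. Note the substitution: this example uses plain k-means with a fixed k as a stand-in for X-means (which chooses the number of clusters itself), and it assumes the reduced feature space is simply the first `reduced_dims` components of each vector. Function names and parameters are hypothetical.

```python
import math


def kmeans(points, k, iters=50):
    """Plain k-means with deterministic farthest-point initialisation."""
    k = min(k, len(points))
    centers = [points[0]]
    while len(centers) < k:
        # Pick the point farthest from all centers chosen so far.
        centers.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centers))
        )
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [
            min(range(k), key=lambda j: math.dist(p, centers[j])) for p in points
        ]
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centers[j] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels


def two_step_cluster(vectors, reduced_dims, k_coarse, k_fine):
    """Coarse clustering on a reduced feature space, then refine each cluster."""
    coarse = kmeans([v[:reduced_dims] for v in vectors], k_coarse)
    result = {}
    for c in sorted(set(coarse)):
        idx = [i for i, lab in enumerate(coarse) if lab == c]
        # Second step: re-cluster this coarse cluster on the full vectors.
        fine = kmeans([vectors[i] for i in idx], k_fine)
        for i, lab in zip(idx, fine):
            result[i] = (c, lab)
    return [result[i] for i in range(len(vectors))]
```

The coarse pass only ever sees the cheap features, so the more expensive full-dimensional comparison is confined to points that already look similar.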

23 A-PLANE CLUSTERING In this stage, two-layer clustering is performed on activity logs. For scan activity, one feature could be the scanned ports (e.g., two machines scanning the same ports); another could be the target subnet/distribution (e.g., machines scanning the same subnet). For spam activity, two machines could be clustered together if their SMTP connection destinations are highly overlapped. In the paper, the authors cluster scanning activities according to the destination scanning ports.
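
As a minimal illustration of clustering scan activity by destination port, the sketch below puts two hosts in the same cluster when they scanned exactly the same set of ports. The exact-match rule and the `(host, port)` log format are simplifying assumptions, not the paper's procedure.

```python
from collections import defaultdict


def cluster_by_scan_ports(scan_log):
    """Group hosts by the set of destination ports they scanned.

    scan_log: iterable of (host, dst_port) scan events (assumed format).
    Returns a dict mapping each port set to the hosts that scanned it.
    """
    ports_by_host = defaultdict(set)
    for host, port in scan_log:
        ports_by_host[host].add(port)

    clusters = defaultdict(set)
    for host, ports in ports_by_host.items():
        clusters[frozenset(ports)].add(host)
    return dict(clusters)
```

A looser rule, such as a minimum overlap between port sets, would be the natural counterpart to the "highly overlapped" SMTP destinations mentioned for spam.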

24 CROSS-PLANE CORRELATION Section 2.7
The idea is to cross-check both sets of clusters (A-PLANE and C-PLANE) to find out whether there is evidence of a host being part of a botnet. The first step is to compute the bot score s(h) for each host h on which at least one kind of suspicious activity has been performed. Higher values are assigned to “strong” activities like spam or exploits; lower values are assigned to “weak” activities like scanning or binary downloads. Hosts that have a score below a certain threshold are filtered out. The remaining, most suspicious hosts are grouped together according to a similarity metric that takes into account the A-PLANE and C-PLANE clusters.
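
The scoring and thresholding step might look like the sketch below. This is a simplified score, not the paper's exact formula: it assumes each host gains, for every A-plane and C-plane cluster pair it belongs to, the activity weight times the overlap (Jaccard index) of the two clusters. The weight values are invented for illustration.

```python
# Assumed weights: "strong" activities score higher than "weak" ones.
ACTIVITY_WEIGHT = {
    "spam": 1.0,
    "exploit": 1.0,
    "scan": 0.5,
    "binary_download": 0.5,
}


def jaccard(a, b):
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def bot_scores(a_clusters, c_clusters):
    """Simplified bot score s(h) for every host with suspicious activity.

    a_clusters: list of (activity_type, set_of_hosts) pairs
    c_clusters: list of sets of hosts
    """
    scores = {}
    for activity, a_hosts in a_clusters:
        weight = ACTIVITY_WEIGHT[activity]
        for host in a_hosts:
            scores.setdefault(host, 0.0)
            for c_hosts in c_clusters:
                if host in c_hosts:
                    scores[host] += weight * jaccard(a_hosts, c_hosts)
    return scores


def suspicious_hosts(scores, threshold):
    """Keep only hosts whose score reaches the threshold."""
    return {h for h, s in scores.items() if s >= threshold}
```

Hosts that appear only in an A-plane cluster, with no matching communication pattern, keep a score of zero and fall below any positive threshold.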

25 Hierarchical Clustering & Dendrogram
The figure shows a hypothetical example. The Davies-Bouldin (DB) validation index is used to find the best dendrogram cut. The figure shows that the best cut suggested by the DB index is at height 90.
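
To make the cut selection concrete, the sketch below computes the standard Davies-Bouldin index for a labelling and picks the candidate with the lowest value. It assumes the candidate labellings were already produced by cutting a dendrogram at different heights (that step is not shown) and that points are numeric tuples; the function names are hypothetical.

```python
import math


def davies_bouldin(points, labels):
    """Davies-Bouldin index of a clustering; lower values are better."""
    groups = {}
    for point, label in zip(points, labels):
        groups.setdefault(label, []).append(point)
    if len(groups) < 2:
        raise ValueError("the index needs at least two clusters")

    centroids = {
        lab: tuple(sum(col) / len(members) for col in zip(*members))
        for lab, members in groups.items()
    }
    # Scatter: mean distance of a cluster's members to its centroid.
    scatter = {
        lab: sum(math.dist(p, centroids[lab]) for p in members) / len(members)
        for lab, members in groups.items()
    }
    total = 0.0
    for i in groups:
        # Worst-case similarity of cluster i to any other cluster.
        # Assumes no two centroids coincide (otherwise division by zero).
        total += max(
            (scatter[i] + scatter[j]) / math.dist(centroids[i], centroids[j])
            for j in groups
            if j != i
        )
    return total / len(groups)


def best_cut(points, candidate_labelings):
    """Return the candidate labelling with the lowest DB index."""
    return min(candidate_labelings, key=lambda labels: davies_bouldin(points, labels))
```

Compact, well-separated clusters give a small index, which is why the minimum is taken as the best cut.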

26 OUTLINE Definitions and Introduction to the Botnet Problem
Detection Framework and Implementation Traffic Monitors Evaluations Related Work Future Work Conclusion

27 EVALUATIONS Tested performance on several real-world network traces (campus network). The C-PLANE and A-PLANE monitors were run continuously for 10 days. Collected 6 different botnets (IRC and HTTP), plus two P2P botnets, namely Nugache (82 bots) and Storm (13 bots); the network trace lasted a whole day.

28 10 DAYS

29 Collected Trace Results

30 Detection Results

31 OUTLINE Definitions and Introduction to Botnet Problem
Detection Framework and Implementation Traffic Monitors Evaluations Limitations and Evasion Related Work Future Work Conclusion

32 Limitations Adversaries that find details about the BotMiner detection framework and implementation will find ways to evade detection. Attackers may be able to evade C-PLANE and A-PLANE monitoring and clustering, or cross-plane correlation analysis.

33 Evading C-PLANE Monitoring and Clustering
Evasion method: misuse white-listing. Example: botnets may try to use a legitimate website to evade detection, for example, use it to locate a secondary URL which really is a source of command hosting or binary downloading; botnets will be able to hide the secondary URL and corresponding communications. Trade-off: white-listing is optional; it reduces the volume of monitored traffic and improves the efficiency of BotMiner.

34 Evading C-PLANE Monitoring and Clustering Cont’d
Evasion method: manipulate communication patterns. Examples: switch between multiple C&C servers; randomize individual communication patterns (e.g., by injecting random packets in a flow or by padding random bytes in a packet); use covert channels to hide the actual C&C communications.

35 Evading A-PLANE Monitoring and Clustering
Evasion method: perform very stealthy malicious activities. Example: scan very slowly (e.g., send one scan per hour). Evasion method: vary the way bots are commanded in the same monitored network. Example: the “botmaster” sends out different commands to each bot.

36 Evading Cross-Plane Analysis
The “botmaster” can send commands for extremely delayed tasks, so that malicious activities are performed on different days. Trade-off: the “botmaster” also suffers because, as the C&C communications slow down, the efficiency of controlling the bot army declines.

37 SOLUTIONS Use multiple days of data. Cross-check back several days.
TRADE-OFF: More false positives may be generated. If the PC is powered off or disconnected from the Internet, the bot is unavailable to the “botmaster”.

38 Related Work Paper by Gu, Zhang, and Lee
BotSniffer: a proposed approach that uses network-based anomaly detection to identify botnet C&C channels in local area networks without any prior knowledge of signatures or C&C server addresses. Contribution: understanding and detecting the C&C channel has great value in the battle against botnets. Note: if an active C&C server is taken down or interrupted, the “botmaster” will not be able to control the botnet.

39 BotSniffer Architecture

40 BotSniffer Cont’d If certain conditions are satisfied, BotSniffer has the ability to detect the botnet C&C channel even if there is only 1 bot in the monitored network. BotSniffer was tested on several network traces in two modes: stand alone and normal traces. BotSniffer has two main components: the monitor engine and the correlation engine. C&C detection module relies on known signatures. Possible evasion methods include evasion using white-list, evasion by long delays, evasion by injecting random noise packets, and evasion by encryption.

41 Related Work Researchers use honeypot techniques to collect and analyze bots (e.g., Nepenthes). TAMD is a system used to detect malware (including botnets) by aggregating traffic that shares the same destination, similar payloads, and hosts with similar OS platforms. Rishi is a signature-based IRC botnet detection system that matches known IRC bot nickname patterns.

42 Related Work Cont’d Most of the systems mentioned in the paper are limited to specific botnet protocols and structures, and many work only on IRC-based botnets.

43 OUTLINE Definitions and Introduction to the Botnet Problem
Detection Framework and Implementation Traffic Monitors and Clustering Experiments & Evaluations Related Work Future Work & Conclusion

44 Future Work Develop new techniques to monitor/cluster communication and activity patterns of botnets, making them more robust to evasion attempts. Improve efficiency of C-flow converting and clustering algorithms. Combine different correlation techniques. Develop a new real-time detection system based on layered design using sampling techniques that work in large high-speed networks.

45 Predictions Researching home networks and mobile devices, since they are primary targets. Researching socialbots, since Internet criminals are gathering and selling vast quantities of data. Monitoring virtual environments, since “botmasters” are now able to detect whether defenders are using virtual machines.

46 Conclusion Botnet detection is a challenging problem
The BotMiner detection system is independent of the protocol and structure used by most botnets. BotMiner shows excellent detection accuracy on various types of botnets, including IRC, HTTP, and P2P, with a very low false positive rate on normal traffic.

47 Free Tools RUBotted (2.0 Beta) by Trend Micro
BotHunter (Windows, Linux, FreeBSD, and Mac OS); Microsoft Security Essentials

48 REFERENCES
[1] G. Gu, J. Zhang, and W. Lee. BotSniffer: Detecting botnet command and control channels in network traffic. In Proceedings of the 15th Annual Network and Distributed System Security Symposium (NDSS ’08), 2008.
[2] Botnet.
[3] BotHunter.
[4] Messmer, Ellen. America’s 10 Most Wanted Botnets. July 22,
[5] RUBotted.
[6] Whitelist.
[7] P. Baecher, T. Holz, M. Kotter, and G. Wicherski. Know your enemy: Tracking botnets
[8] B. Stone-Gross, M. Cova, L. Cavallaro, B. Gilbert, M. Szydlowski, R. Kemmerer, C. Kruegel, G. Vigna. Your Botnet is My Botnet: Analysis of a Botnet TakeOver. In Proceedings of the ACM CCS, 2009.
[9] P. Baecher, M. Koeter, T. Holz, M. Dornseif, and F. Freiling. The nepenthes platform: An efficient approach to collect malware. In Proceedings of International Symposium on Recent Advances in Intrusion Detection (RAID ’06), Hamburg, September 2006.
[10] Anderson, Nate.

49 Questions or Comments…
Thank you!

