Presentation is loading. Please wait.

Presentation is loading. Please wait.

Guofei Gu, Roberto Perdisci, Junjie Zhang, and Wenke Lee College of Computing, Georgia Institute of Technology USENIX Security '08 Presented by Lei Wu.

Similar presentations


Presentation on theme: "Guofei Gu, Roberto Perdisci, Junjie Zhang, and Wenke Lee College of Computing, Georgia Institute of Technology USENIX Security '08 Presented by Lei Wu."— Presentation transcript:

1 Guofei Gu, Roberto Perdisci, Junjie Zhang, and Wenke Lee College of Computing, Georgia Institute of Technology USENIX Security '08 Presented by Lei Wu April 13 th, 2009

2 Motivation and Background System description Experimental analysis Conclusion

3 Motivation and Background System description Experimental analysis Conclusion

4 This paper proposes a general detection framework BotMiner that is independent of botnet Command and Control (C&C) protocol and structure, and requires no a priori knowledge of botnets

5 Bot A malware instance that runs autonomously and automatically on a compromised computer (zombie) without owner’s consent Botnet: network of bots controlled by criminals Definition: “A coordinated group of malware instances that are controlled by a botmaster via some C&C channel” 25% of Internet PCs are part of a botnet!

6 Why BotMiner? Traditional methods are not enough. Botnets can change their C&C content (encryption, etc.), protocols (IRC, HTTP, etc.), structures (P2P, etc.), C&C servers, infection models …

7 Cluster similar communication traffic and similar malicious traffic, and performs cross cluster correlation to identify the hosts that share both similar communication patterns and similar malicious activity patterns

8 Revisit the definition of Botnet again “A coordinated group of malware instances that are controlled by a botmaster via some C&C channel” We need to monitor two planes C-plane (C&C communication plane): “who is talking to whom” A-plane (malicious activity plane): “who is doing what” Horizontal correlation Bots are for long-term use Botnet: communication and activities are coordinated/similar

9 Motivation and Background System description Experimental analysis Conclusion

10

11 A-Plane Monitor + Clustering C-Plane Monitor + Clustering Cross-Plane Correlation Network Traffic Report

12 A-Plane Monitor + Clustering C-Plane Monitor + Clustering Cross-Plane Correlation Network Traffic Report

13 Log information on who is doing what Monitor four types of malicious activities Scanning Spamming Binary downloading Exploit attempts Based on Snort, adapt some existing intrusion detection techniques (e.g. BotHunter, PEHunter)

14 Two-layer clustering on activity logs

15 A-Plane Monitor + Clustering C-Plane Monitor + Clustering Cross-Plane Correlation Network Traffic Report

16 Capture network flows and records information on who is talking to whom Adapt an efficient network flow capture tool named fcapture, which is based on Judy library Each flow record contains the following information: time, duration, source IP, source port, destination IP, destination port, and the number of packets and bytes transferred in both directions

17 Architecture of the C-plane clustering First two steps are not critical, however, they can reduce the traffic workload and make the actual clustering process more efficient In the third step, given an epoch E (typically one day), all TCP/UDP flows that shares the same protocol, source IP, destination IP and port, are aggregated into the same C- flow

18 Extract a number of statistical features from each C- flow and translate them into d-dimensional pattern vectors compute the discrete sample distribution of (currently) four random variables the number of flows per hour (fph) the number of packets per flow (ppf) the average number of bytes per packets (bpp) the average number of bytes per second (bps) Temporal related statistical distribution information: FPH and BPS Spatial related statistical distribution information: BPP and PPF

19 Compute the overall discrete sample distribution of the random variable considering all the C-flows in the traffic for an epoch E, then describe that random variable (approximate) distribution as a vector of 13 elements. Apply the same algorithm for all four random variables, and therefore we map each C-flow into a pattern vector of d = 52 elements

20

21 Why multi-step? Coarse-grained clustering Using reduced feature space: mean and variance of the distribution of FPH, PPF, BPP, BPS for each C-flow (2*4=8) Efficient clustering algorithm: X-means Fine-grained clustering Using full feature space (13*4=52)

22 A-Plane Monitor + Clustering C-Plane Monitor + Clustering Cross-Plane Correlation Network Traffic Report

23 Botnet score s (h) for every host h h will receive a high score if it has performed multiple types of suspicious activities, and if other hosts that were clustered with h also show the same multiple types of activities Similarity score between host h i and h j Two hosts in the same A-clusters and in at least one common C-cluster are clustered together

24 Use the Davies-Bouldin (DB) validation index to find the best dendrogram cut, which produces the most compact and well separated clusters

25 Motivation and Background System description Experimental analysis Conclusion

26

27

28 Motivation and Background System description Experimental analysis Conclusion

29 Evading C-plane monitoring and clustering Misuse whitelist Manipulate communication patterns Evading A-plane monitoring and clustering Very stealthy activity Individualize bots’ communication/activity Evading cross-plane analysis Extremely delayed task

30

31 Propose a detection framework which is independent of botnet C&C protocol and structure, and requires no a priori knowledge of specific botnets Build a prototype system based on the general detection framework, and evaluate it with multiple real-world network traces including normal traffic and several real-world botnet traces

32 Offline system Long time data collection and analysis No incremental ability of analysis The experiment is not convincing enough Only shows the system performance on day-2, what about the other days? Not a real “real world experiment”

33 Fast detection and online analysis More efficient clustering, more robust features More experiments in different and real network environment

34 Sides of the paper in USENIX Security’08 http://faculty.cs.tamu.edu/guofei/paper/botMiner-Security08-slides.pdf Sad Planet, Kayak Adventure. Botnets on the Rampage http://birdhouse.org/blog/2006/11/16/botnets-on-the-rampage/ Beware of Potential Confickor BotNet Chaos http://thejunction.net/2009/03/25/april-1st-beware-of-potential-botnet-chaos/ Oracle Data Mining Mining Techniques and Algorithms http://www.oracle.com/technology/products/bi/odm/odm_techniques_algorithms. html

35 Question?


Download ppt "Guofei Gu, Roberto Perdisci, Junjie Zhang, and Wenke Lee College of Computing, Georgia Institute of Technology USENIX Security '08 Presented by Lei Wu."

Similar presentations


Ads by Google