Wenke Lee and Nick Feamster Georgia Tech Botnet and Spam Detection in High-Speed Networks.

1 Botnet and Spam Detection in High-Speed Networks
Wenke Lee and Nick Feamster, Georgia Tech

2 Overview
Problem: botnet and spam detection in high-speed networks
Common theme: examine network-level properties and build a classifier
Two systems: BotMiner and SNARE
  – Overview
  – Integration with SMITE architecture
Current integration status and plan

3 BotMiner: Structure and Protocol Independent
Botnets can change their C&C content (encryption, etc.), protocols (IRC, HTTP, etc.), structures (P2P, etc.), C&C servers, infection models, …

4 Definition of a Botnet
A coordinated group of malware instances that are controlled by a botmaster via some C&C channel
  – Hosts that have similar C&C-like traffic and similar malicious activities
We need to monitor two planes
  – C-plane (C&C communication plane): who is talking to whom
  – A-plane (malicious activity plane): who is doing what

5 BotMiner Architecture
[Figure: BotMiner architecture diagram; labels: Sensors, Algorithms, Correlation]

6 Cross-plane Correlation
Botnet score s(h) for every host h
  – A host has a higher score if it is in more activity clusters and in both activity and communication clusters
  – A host with a high score is considered a bot
Similarity score between bot hosts h_i and h_j
  – Two hosts in the same A-clusters and in at least one common C-cluster are clustered together
  – Each cluster is considered a botnet
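The scoring idea on this slide can be sketched in a few lines. This is only an illustration: the slide does not give the actual formula for s(h), so the weighting below (one point per activity cluster, plus one point per activity/communication cluster pair) and the names `botnet_scores` and `likely_bots` are assumptions made for the example.

```python
def botnet_scores(a_clusters, c_clusters):
    """Toy cross-plane score (assumed weighting, not BotMiner's formula).

    a_clusters, c_clusters: lists of sets of host identifiers.
    A host gets one point per activity cluster it is in, plus one point
    for every (activity cluster, communication cluster) pair it is in.
    """
    hosts = set()
    for cluster in list(a_clusters) + list(c_clusters):
        hosts |= set(cluster)

    scores = {}
    for h in hosts:
        n_a = sum(1 for c in a_clusters if h in c)  # A-plane memberships
        n_c = sum(1 for c in c_clusters if h in c)  # C-plane memberships
        scores[h] = n_a + n_a * n_c
    return scores


def likely_bots(scores, threshold):
    """Hosts whose score reaches the (hypothetical) threshold."""
    return sorted(h for h, s in scores.items() if s >= threshold)


a_clusters = [{"h1", "h2"}, {"h1", "h3"}]
c_clusters = [{"h1", "h2", "h4"}]
scores = botnet_scores(a_clusters, c_clusters)
suspects = likely_bots(scores, threshold=2)
```

A host that appears only in a communication cluster (here `h4`) scores zero under this weighting, which mirrors the slide's point that both planes matter.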

7 SMITE Integration: BotMiner

8 Integrating BotMiner and SMITE
Sensors
  – Feature extraction for C-plane and A-plane clustering
  – C-flow temporal and statistical features (SMITE flow analysis sensors)
    Counting packets and connections between each pair of endpoints: bytes per second, flows per hour, bytes per packet, packets per flow
  – A-plane header and payload features (SMITE flow sensors, AVIES)
    Destination IP addresses and ports, payload bytes/strings
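As a rough illustration of the four C-flow features named on this slide, the sketch below aggregates flow records per endpoint pair. The `Flow` record layout, the function name `cflow_features`, and the use of simple averages over an observation window are assumptions for the example; the slide does not specify how the SMITE sensors represent flows.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Flow:
    """Hypothetical flow record; field names are illustrative."""
    src: str
    dst: str
    start: float   # seconds since start of the observation window
    end: float
    n_packets: int
    n_bytes: int


def cflow_features(flows, window_seconds):
    """Average per-pair features over one observation window."""
    groups = defaultdict(list)
    for f in flows:
        groups[(f.src, f.dst)].append(f)

    features = {}
    for pair, fs in groups.items():
        total_packets = sum(f.n_packets for f in fs)
        total_bytes = sum(f.n_bytes for f in fs)
        active = sum(max(f.end - f.start, 0.0) for f in fs)  # seconds with traffic
        features[pair] = {
            "flows_per_hour": len(fs) / (window_seconds / 3600.0),
            "packets_per_flow": total_packets / len(fs),
            "bytes_per_packet": total_bytes / total_packets if total_packets else 0.0,
            "bytes_per_second": total_bytes / active if active else 0.0,
        }
    return features


flows = [
    Flow("a", "b", 0.0, 10.0, 10, 1000),
    Flow("a", "b", 100.0, 110.0, 30, 3000),
]
feats = cflow_features(flows, window_seconds=3600)
```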

9 Integrating BotMiner and SMITE
Algorithms
  – C-plane clustering
    Multi-step clustering based on statistical and temporal C-flow features
  – A-plane clustering
    Based on activity-specific similarity measures: e.g., spread of destination IP addresses and ports, and payload similarity
    Analyze additional alerts from other detection algorithms
  – Bot scoring and botnet clustering methods
    Scoring based on participation in C-plane and A-plane clusters
    Clustering based on common memberships in the C-plane and A-plane clusters
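Clustering hosts by common cluster membership might look like the following sketch. It makes two simplifying assumptions that are not from the slides: hosts are linked when they share at least one A-plane cluster and at least one C-plane cluster, and linked hosts are merged transitively with a small union-find. The function name `group_bots` is hypothetical.

```python
from itertools import combinations


def group_bots(hosts, a_clusters, c_clusters):
    """Group hosts that share membership in both planes (assumed rule)."""
    parent = {h: h for h in hosts}

    def find(h):
        # Follow parent links to the group representative, compressing the path.
        while parent[h] != h:
            parent[h] = parent[parent[h]]
            h = parent[h]
        return h

    def share(h1, h2, clusters):
        return any(h1 in c and h2 in c for c in clusters)

    for h1, h2 in combinations(sorted(hosts), 2):
        if share(h1, h2, a_clusters) and share(h1, h2, c_clusters):
            parent[find(h1)] = find(h2)

    groups = {}
    for h in hosts:
        groups.setdefault(find(h), set()).add(h)
    return sorted(sorted(g) for g in groups.values())


hosts = ["h1", "h2", "h3", "h4"]
a_clusters = [{"h1", "h2", "h3"}]
c_clusters = [{"h1", "h2"}, {"h3", "h4"}]
groups = group_bots(hosts, a_clusters, c_clusters)
```

In this toy input, `h3` shares an activity cluster with `h1` and `h2` but no communication cluster, so it stays in its own group.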

10 Integrating BotMiner and SMITE
Cross-plane correlation
  – Botnet detection involves both vertical and horizontal analysis/clustering:
    Vertical: what activities a host has been involved in (bot detection)
    Horizontal: what other hosts have similar (vertical) behavior patterns (botnet detection)

11 Network-Based Spam Detection
Filter email based on how it is sent, in addition to simply what is sent.
Network-level properties are less malleable
  – Hosting or upstream ISP (AS number)
  – Membership in a botnet (spammer, hosting infrastructure)
  – Network location of sender and receiver
  – Set of target recipients

12 Finding the Right Features
Goal: Sender reputation from a single packet header?
  – Low overhead
  – Fast classification
  – In-network
  – Perhaps more evasion resistant
Key challenge
  – What features satisfy these properties and can distinguish spammers from legitimate senders?

13 Network-Level Features
Single-Packet
  – AS of sender's IP
  – Distance to k nearest senders
  – Status of email service ports
  – Geodesic distance
  – Time of day
Single-Message
  – Number of recipients
  – Length of message
Aggregate (Multiple Message/Recipient)

14 Sender-Receiver Geodesic Distance
90% of legitimate messages travel 2,200 miles or less
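One common way to compute a sender-receiver distance is the haversine great-circle formula, sketched below. The slides do not say how distance was computed or how IP addresses were mapped to coordinates, so both the formula choice and the assumption that latitude/longitude are already available (for example from an IP geolocation database) are illustrative.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius


def geodesic_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    # Clamp to 1.0 to guard against floating-point overshoot before asin.
    return 2 * EARTH_RADIUS_MILES * math.asin(min(1.0, math.sqrt(a)))


# Approximate coordinates for Atlanta and New York City.
atl_to_nyc = geodesic_miles(33.749, -84.388, 40.7128, -74.0060)
within_legit_range = atl_to_nyc <= 2200  # threshold taken from the slide
```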

15 Density of Senders in IP Space
For spammers, k nearest senders are much closer in IP space
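A minimal sketch of this feature, assuming IPv4 addresses are treated as 32-bit integers and "distance" means their absolute numeric difference. The function name and the choice to average over the k nearest senders are assumptions for the example.

```python
import ipaddress


def avg_distance_to_k_nearest(sender_ip, other_ips, k):
    """Mean numeric distance from sender_ip to its k nearest neighbours."""
    target = int(ipaddress.IPv4Address(sender_ip))
    distances = sorted(
        abs(int(ipaddress.IPv4Address(ip)) - target) for ip in other_ips
    )
    nearest = distances[:k]
    # With no other senders observed, report an infinite distance.
    return sum(nearest) / len(nearest) if nearest else float("inf")


others = ["10.0.0.1", "10.0.0.7", "10.0.1.5", "192.168.0.1"]
density_feature = avg_distance_to_k_nearest("10.0.0.5", others, k=2)
```

A small value means the sender sits in a densely populated region of sender IP space, which the slide associates with spammers.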

16 Local Time of Day at Sender
Spammers peak at different local times of day
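The local-time feature could be derived as in the sketch below, assuming the sender's UTC offset is already known (for instance from geolocation, which the slides do not detail). The helper names are hypothetical.

```python
from collections import Counter
from datetime import datetime, timedelta, timezone


def local_hour(utc_epoch_seconds, utc_offset_hours):
    """Hour of day (0-23) at the sender, given its UTC offset in hours."""
    tz = timezone(timedelta(hours=utc_offset_hours))
    return datetime.fromtimestamp(utc_epoch_seconds, tz=tz).hour


def hourly_histogram(events):
    """Count messages per local hour; events are (epoch, offset) pairs."""
    return Counter(local_hour(ts, off) for ts, off in events)


# 1_000_000_000 is 2001-09-09 01:46:40 UTC.
events = [(1_000_000_000, -5), (1_000_000_000, 9), (1_000_003_600, 9)]
hist = hourly_histogram(events)
```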

17 Combining Features: RuleFit
Put features into the RuleFit classifier
10-fold cross-validation on one day of query logs from a large spam filtering appliance provider
Comparable performance to Spamhaus
  – Incorporating into the system can further reduce false positives (FPs)
Using only network-level features
Completely automated
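For readers unfamiliar with 10-fold cross-validation, the sketch below shows only the data-splitting step in plain Python. It does not implement RuleFit, and the function name `k_fold_indices` is made up for the example; in practice the samples would usually be shuffled first and each fold would train and evaluate a classifier.

```python
def k_fold_indices(n_samples, k=10):
    """Yield (train_indices, test_indices) for each of k contiguous folds."""
    indices = list(range(n_samples))
    # Spread the remainder so fold sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size


folds = list(k_fold_indices(25, k=10))
```

Each sample is held out exactly once, so every message in the log contributes to the reported error rate without having been seen in training for its own fold.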

18 Sample Results
False positives reduced to 0.14%

19 Integrating SNARE and SMITE
[Figure: integration diagram; labels: Sensors, Algorithms]

20 SMITE Integration Challenges
Sources of labeled data
  – SNARE requires clean sources of labeled data for training
Data collection
  – SNARE's performance improves when behavior can be observed across multiple domains
Availability of external data in RTEN testbed

21 SMITE Integration: Current Work
Study pipeline architecture and code
Modify flow-analyzer to dump 5-tuple flow information
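For reference, a 5-tuple flow record is conventionally keyed by source IP, destination IP, source port, destination port, and protocol. The sketch below aggregates packets into such records; it is not the SMITE flow-analyzer, and the record fields and function name are assumptions for illustration.

```python
from typing import NamedTuple


class FiveTuple(NamedTuple):
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    protocol: str


def aggregate_flows(packets):
    """Aggregate (timestamp, FiveTuple, size_bytes) packets into flow records."""
    flows = {}
    for ts, key, size in packets:
        rec = flows.get(key)
        if rec is None:
            flows[key] = {"first_seen": ts, "last_seen": ts, "packets": 1, "bytes": size}
        else:
            rec["first_seen"] = min(rec["first_seen"], ts)
            rec["last_seen"] = max(rec["last_seen"], ts)
            rec["packets"] += 1
            rec["bytes"] += size
    return flows


smtp = FiveTuple("192.0.2.1", "198.51.100.7", 40000, 25, "tcp")
dns = FiveTuple("192.0.2.1", "198.51.100.53", 53000, 53, "udp")
packets = [(1.0, smtp, 60), (1.5, dns, 80), (2.0, smtp, 1500)]
flow_table = aggregate_flows(packets)
```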

22 SMITE Integration: Step 1
Modify flow-analyzer with SMITE team to generate 5-tuple flow information (mid-March)
Spam/scan detection, flow aggregation in BotMiner; spam feature extraction in SNARE (end of March)
Clustering and correlation in BotMiner; classifier in SNARE (end of April)

23 SMITE Integration: Step 2
Evaluate performance of BotMiner and SNARE
  – How many hours to process one day of traffic, or what is the lag time between event and detection?
Design real-time detection algorithms
  – A two-tier system: the off-line module outputs lists of suspicious hosts, and the real-time module inspects all packets of these hosts; or, the off-line module outputs clusters
Design algorithms to handle asymmetric traffic
  – Cluster on each direction of traffic and cross-correlate
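The two-tier idea can be sketched as a watch list produced off-line and a cheap per-packet membership check done in real time. This is a design illustration only; the function names and the choice to match on either endpoint are assumptions, and the slides describe the system as still to be designed.

```python
def make_realtime_filter(suspicious_hosts):
    """Build a per-packet predicate from the off-line module's host list."""
    watch = frozenset(suspicious_hosts)  # O(1) membership tests

    def should_inspect(src_ip, dst_ip):
        # Inspect a packet if either endpoint is on the watch list.
        return src_ip in watch or dst_ip in watch

    return should_inspect


# Hypothetical output of the off-line tier.
should_inspect = make_realtime_filter(["192.0.2.10", "192.0.2.11"])
packets = [
    ("192.0.2.10", "198.51.100.1"),
    ("203.0.113.5", "198.51.100.1"),
    ("203.0.113.5", "192.0.2.11"),
]
inspected = [p for p in packets if should_inspect(*p)]
```

Keeping the real-time tier to a set lookup means the expensive clustering can run with a lag while packet inspection stays fast enough for a high-speed link.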
