Netflow Data-Mining Techniques Chris Poetzel Argonne National Laboratory Scott Pinkerton.

1 Netflow Data-Mining Techniques
Chris Poetzel and Scott Pinkerton
Argonne National Laboratory

2 Netflow Data Mining
22 July 2004, ESCC Meeting
Argonne Background Information
Sliding Window Analysis
Using Contextual Knowledge to Adjust Data-Mining
Incident Investigation
Integration, Integration, Integration
Future
Conclusions

3 ANL Background
Utilize OSU’s Flow-Tools, written by Mark Fullmer
Collecting from 14 different routers/switches at ANL-East
~600 GB currently stored and growing
1-year retention period desired; backing off as we add devices
Current collection/analysis station: IBM 360, Red Hat Linux, 8 GB RAM, 4 × 1.6 GHz CPUs

4 Sliding Window Analysis
The raw volume of Netflow data can make data-mining long and cumbersome
Implemented a 5-minute sliding window for analysis
– Every minute, check the previous 5 minutes of data (via cron jobs)
– Reduces processing time (~20 secs)
– Catches the vast majority of scans/probes in near real time
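The window selection on this slide can be sketched in Python as follows. This is only an illustration: the `Flow` record, its fields, and the `in_window` helper are hypothetical names, and the actual analysis runs flow-tools from cron, which is not reproduced here.

```python
from typing import Iterable, List, NamedTuple

class Flow(NamedTuple):
    ts: float   # flow timestamp, seconds since the epoch (assumed field)
    src: str
    dst: str

WINDOW_SECONDS = 5 * 60  # the 5-minute window from the slide

def in_window(flows: Iterable[Flow], now: float,
              window: float = WINDOW_SECONDS) -> List[Flow]:
    """Return only the flows seen in the `window` seconds ending at `now`."""
    return [f for f in flows if now - window < f.ts <= now]

flows = [Flow(0, "a", "x"), Flow(100, "b", "y"),
         Flow(290, "c", "z"), Flow(400, "d", "w")]

# Simulate three consecutive once-a-minute runs. Successive windows overlap,
# so each flow is examined by several runs before it ages out.
for now in (300, 360, 420):
    print(now, [f.src for f in in_window(flows, now)])
```

Because each run only touches five minutes of records, the cost per run stays roughly constant no matter how much history is stored.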

5 Contextual Knowledge
Which way is the data flowing?
Contextual knowledge will affect what we search for and what we do with the results
[Diagram: flow source and destination, each labeled IN or OUT]
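One way to label a flow with its direction is to test both endpoints against the site's own address ranges. The sketch below assumes this approach; the prefixes in `INTERNAL_NETS` and the function names are placeholders, not the lab's actual configuration.

```python
import ipaddress

# Hypothetical site prefixes; a real deployment would list its own networks.
INTERNAL_NETS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("192.168.0.0/16"),
]

def side(addr: str) -> str:
    """Return "IN" if the address belongs to the site, otherwise "OUT"."""
    ip = ipaddress.ip_address(addr)
    return "IN" if any(ip in net for net in INTERNAL_NETS) else "OUT"

def direction(src: str, dst: str) -> str:
    """Label a flow as one of the four cases: OUT -> IN, IN -> OUT, IN -> IN, OUT -> OUT."""
    return f"{side(src)} -> {side(dst)}"

print(direction("198.51.100.7", "10.1.2.3"))
print(direction("10.1.2.3", "198.51.100.7"))
```

Each of the four labels can then be routed to its own detection script and its own response.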

6 OUT -> IN
– Receive many class B/C scans a day
– Only watch for scans on open FW ports
  Dynamically read the FW config every ½ hour to determine which ports are open in the FW
– Use Netflow data to look for scans on open FW ports
  Fast scans: script executed every minute, looking at the past 5 minutes of data to catch fast scanners
  Slow scans: script run every hour, looking at the previous 24 hours of data to catch slow scanners
– Once a scanner is detected, send its IP for FW shun
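A minimal Python sketch of the scan check, assuming a scanner is any outside source that touches many distinct inside hosts on open firewall ports. The function name, the `(src, dst, dport)` tuple layout, and the threshold are assumptions; the slides do not give the thresholds actually used.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple

def find_scanners(flows: Iterable[Tuple[str, str, int]],
                  open_ports: Set[int], min_targets: int) -> List[str]:
    """Return sources that hit at least `min_targets` distinct hosts on open ports."""
    targets: Dict[str, Set[str]] = defaultdict(set)
    for src, dst, dport in flows:
        if dport in open_ports:      # only open firewall ports are of interest
            targets[src].add(dst)
    return sorted(s for s, d in targets.items() if len(d) >= min_targets)

open_ports = {22, 80}  # would be refreshed from the firewall config
flows = (
    [("198.51.100.7", f"10.0.0.{i}", 22) for i in range(1, 6)]      # sweeps 5 hosts
    + [("203.0.113.9", "10.0.0.1", 80)] * 2                         # ordinary client
    + [("192.0.2.4", f"10.0.0.{i}", 9999) for i in range(1, 6)]     # closed port
)
print(find_scanners(flows, open_ports, min_targets=5))
```

The same function could serve both schedules: fed the last 5 minutes of flows for fast scanners and the last 24 hours for slow ones, presumably with a different threshold for each.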

7 IN -> OUT
Looking for problem machines at the Lab
– First approximation is to look at machines that have contacted a large number of Internet hosts in a short period of time
– Can indicate a compromised/infected machine
Exclude a number of internal machines based on a priori knowledge
– Email servers, domain controllers, network scanning machines (ignore)
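The first approximation above, with its exclusion list, might look like the following Python sketch. The function name, the `(src, dst)` tuple layout, the sample addresses, and the threshold of 20 peers are all assumptions made for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple

def high_fanout_hosts(flows: Iterable[Tuple[str, str]],
                      excluded: Set[str], max_peers: int) -> Dict[str, int]:
    """Map each non-excluded internal host to its peer count if it exceeds `max_peers`."""
    peers: Dict[str, Set[str]] = defaultdict(set)
    for src, dst in flows:
        if src in excluded:          # known-busy machines, e.g. mail servers
            continue
        peers[src].add(dst)
    return {src: len(dsts) for src, dsts in peers.items() if len(dsts) > max_peers}

excluded = {"10.0.0.25"}  # hypothetical mail server
flows = (
    [("10.0.0.25", f"203.0.113.{i}") for i in range(50)]     # excluded, ignored
    + [("10.0.0.77", f"198.51.100.{i}") for i in range(40)]  # suspicious fan-out
    + [("10.0.0.12", "192.0.2.1")] * 3                       # normal host
)
print(high_fanout_hosts(flows, excluded, max_peers=20))
```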

8 IN -> IN
Requires collection on multiple internal switches/routers
Detect internal scanning
– Cron job runs every hour
– Infected host scanning local subnet/supernet
– Detect unauthorized internal network scans
Post-mortem forensic value
– What did an internally compromised machine do once it was compromised?
– Track down cross-contamination

9 OUT -> OUT
May not apply to every site
Co-location personal or transport traffic constitutes OUT -> OUT traffic on a network
Scans in the OUT -> OUT direction are detected and the appropriate network admin/security personnel are notified

10 Incident Investigation 1/2
What to do when an incident happens? (Besides pull your hair out)
Netflow data is invaluable in cyber security investigations.
Start by classifying IP addresses into a taxonomy
– Possible bad guy
– Possible victims
– Possible intermediary (stepping stone, rootkit resource site, etc.)
– This process can be aided by host syslog, etc.

11 Incident Investigation 2/2
By identifying the possible victims, the process of containment and clean-up becomes much easier
Netflow has become an invaluable tool for our cyber security team

12 Integration³
To improve the signal-to-noise ratio of cyber security events, correlating Netflow data with other data sources has been very helpful
– IDS logs
– ARP/CAM tables: MAC “persistence”
– Firewall logs
– DHCP/VPN logs
– Host-based syslog

13 IDS & Netflow Logs
Used to cross-validate an IDS alarm and a Netflow alarm against each other
IDS alarms usually give specific points of attack
Netflow can be used to provide the background or framework of an attack
Netflow + IDS can provide a better perspective on cyber security events
Store IDS and Netflow logs in the same directory structure to make searching easier
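As a rough illustration of using Netflow to supply the background for an IDS alarm, the Python sketch below pulls every flow that involves the alarmed address within a time margin around the alarm. The `Alert` and `Flow` records, their fields, and the 10-minute margin are hypothetical; real IDS and flow-tools output would need to be parsed into this shape first.

```python
from typing import Iterable, List, NamedTuple

class Alert(NamedTuple):
    ts: float        # alarm time, seconds since the epoch (assumed field)
    attacker: str    # address named in the alarm
    signature: str

class Flow(NamedTuple):
    ts: float
    src: str
    dst: str
    dport: int

def flows_around(alert: Alert, flows: Iterable[Flow],
                 margin: float = 600.0) -> List[Flow]:
    """Return flows involving the alarmed address within `margin` seconds of the alarm."""
    return [f for f in flows
            if abs(f.ts - alert.ts) <= margin
            and alert.attacker in (f.src, f.dst)]

alert = Alert(1000.0, "198.51.100.7", "example-signature")
flows = [
    Flow(700.0, "198.51.100.7", "10.0.0.5", 22),    # probe before the alarm
    Flow(1010.0, "10.0.0.5", "198.51.100.7", 443),  # victim calling back
    Flow(1005.0, "203.0.113.9", "10.0.0.5", 80),    # unrelated host
    Flow(5000.0, "198.51.100.7", "10.0.0.6", 22),   # outside the margin
]
for f in flows_around(alert, flows):
    print(f)
```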

14 VPN/Dial-Up Scan/Virus Detection
Marriage of many data sources
Each dial-up/VPN login initiates a virus scan of the connected host
Each dial-up/VPN-connected host is monitored via Netflow for outbound scanning activity
If a remotely connected host is determined to be virally infected or behaving maliciously, the connection is terminated and the user account is locked
All actions are performed via automated scripts, with no human intervention

15 Future
Host profiling via Netflow
– Determine what “normal” behavior for a host is, then alert when it varies from the norm
– Some IDS products are attempting this approach (Network Flight Recorder, Lancope)
Visualization of Netflow data
– Charts, graphs, animations of network conversations
– Work being done by NCSA
Better integration with other data sources
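The slides do not say how host profiling would be done. One simple possibility, shown here purely as an assumption, is to keep a per-host history of some metric (for example, distinct peers per hour) and flag values that fall more than `k` standard deviations from the historical mean. The function name and the choice of `k = 3` are illustrative.

```python
import statistics
from typing import Sequence

def is_abnormal(history: Sequence[float], current: float, k: float = 3.0) -> bool:
    """Flag `current` if it is more than `k` sample standard deviations from the mean.

    `history` needs at least two observations for the deviation to be defined.
    """
    mean = statistics.mean(history)
    spread = statistics.stdev(history)
    return abs(current - mean) > k * spread

# Hypothetical hourly peer counts for one host.
history = [10, 12, 11, 9, 13]
print(is_abnormal(history, 12))   # within the usual range
print(is_abnormal(history, 40))   # far outside it
```

A real profile would likely track several metrics per host and account for time-of-day patterns; this shows only the basic comparison against a baseline.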

16 Conclusions
Collecting Netflow data to support cyber security activities is tremendously helpful. It is an invaluable data source for post-mortem forensic analysis, as well as an extremely helpful tool for real-time detection, notification, and active response (blocking an IP address).

17 Thanks
Chris Poetzel: 630-252-7431
Scott Pinkerton: 630-252-9770
