Monitoring and Intrusion Detection Nick Feamster CS 4251 Fall 2008.

Passive vs. Active Measurement Passive Measurement: Collection of packets, flow statistics of traffic that is already flowing on the network –Packet traces –Flow statistics –Application-level logs Active Measurement: Inject probing traffic to measure various characteristics –Traceroute –Ping –Application-level probes (e.g., Web downloads)

Monitoring Internet Traffic Hundreds of megabits per second Cannot afford to look at all traffic Goals –High-speed monitoring –Low false positives

Passive Traffic Data Measurement SNMP byte/packet counts: everywhere Packet monitoring: selected locations Flow monitoring: typically at edges (if possible) –Direct computation of the traffic matrix –Input to denial-of-service attack detection Deep Packet Inspection: also at edge, where possible

Two Main Approaches Packet-level Monitoring –Keep packet-level statistics –Examine (and potentially, log) variety of packet-level statistics. Essentially, anything in the packet. –Timing Flow-level Monitoring –Monitor packet-by-packet (though sometimes sampled) –Keep aggregate statistics on a flow

Packet-level Monitoring Passive monitoring to collect full packet contents (or at least headers) Advantages: lots of detailed information –Precise timing information –Information in packet headers Disadvantages: overhead –Hard to keep up with high-speed links –Often requires a separate monitoring device

Full Packet Capture (Passive) Example: Georgia Tech OC3Mon Rack-mounted PC Optical splitter Data Acquisition and Generation (DAG) card Source: endace.com

What is a flow? Source IP address Destination IP address Source port Destination port Layer 3 protocol type TOS byte (DSCP) Input logical interface (ifIndex)

Cisco NetFlow Basic output: Flow record –Most common version is v5 –Current version (9) is being standardized in the IETF (template-based) –More flexible record format –Much easier to add new flow record types Export from the core network to a collector (PC) for collection and aggregation –Export packets of approximately 1500 bytes, each carrying multiple flow records –Sent more frequently if traffic increases

Flow Record Contents Source and Destination, IP address and port Packet and byte counts Start and end times ToS, TCP flags Basic information about the flow… …plus, information related to routing Next-hop IP address Source and destination AS Source and destination prefix

Aggregating Packets into Flows Criterion 1: Set of packets that belong together –Source/destination IP addresses and port numbers –Same protocol, ToS bits, … –Same input/output interfaces at a router (if known) Criterion 2: Packets that are close together in time –Maximum inter-packet spacing (e.g., 15 sec, 30 sec) –Example: flows 2 and 4 are different flows due to time (Figure: packet timeline grouped into flows 1 to 4)
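The two criteria above can be sketched in a few lines of Python. This is only an illustration, not NetFlow's actual cache logic: the `Packet` and `Flow` types, the `aggregate` function, and the 30-second default timeout are assumptions made for the example, and the key uses only addresses, ports, and protocol.

```python
from dataclasses import dataclass
from typing import Dict, List, NamedTuple, Tuple


class Packet(NamedTuple):
    """Hypothetical packet summary used only for this sketch."""
    ts: float       # arrival time in seconds
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    proto: int
    size: int       # bytes


@dataclass
class Flow:
    key: Tuple
    start: float
    end: float
    packets: int = 0
    octets: int = 0


def aggregate(packets: List[Packet], timeout: float = 30.0) -> List[Flow]:
    """Group packets into flows by key (criterion 1) and time gap (criterion 2)."""
    active: Dict[Tuple, Flow] = {}
    finished: List[Flow] = []
    for p in sorted(packets, key=lambda pkt: pkt.ts):
        key = (p.src_ip, p.dst_ip, p.src_port, p.dst_port, p.proto)
        flow = active.get(key)
        # A gap longer than the timeout closes the old flow and starts a new one.
        if flow is not None and p.ts - flow.end > timeout:
            finished.append(flow)
            flow = None
        if flow is None:
            flow = Flow(key=key, start=p.ts, end=p.ts)
            active[key] = flow
        flow.end = p.ts
        flow.packets += 1
        flow.octets += p.size
    finished.extend(active.values())
    return sorted(finished, key=lambda f: f.start)
```

Two packets with the same key that arrive 100 seconds apart end up in different flow records, which is the situation described for flows 2 and 4.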

Reducing Measurement Overhead Filtering: on interface –destination prefix for a customer –port number for an application (e.g., 80 for Web) Sampling: before insertion into flow cache –Random, deterministic, or hash-based sampling –1-out-of-n or stratified based on packet/flow size –Two types: packet-level and flow-level Aggregation: after cache eviction –packets/flows with same next-hop AS –packets/flows destined to a particular service

Packet Sampling for Flow Monitoring Packet sampling before flow creation (Sampled NetFlow) –1-out-of-m sampling of individual packets (e.g., m=100) –Creation of flow records over the sampled packets Reducing overhead –Avoid per-packet overhead on (m-1)/m packets –Avoid creating records for a large number of small flows Increasing overhead (in some cases) –May split some long transfers into multiple flow records –… due to larger time gaps between successive packets (Figure: timeline in which unsampled packets leave a gap longer than the timeout, producing two flows)
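A minimal sketch of 1-out-of-m packet sampling, assuming packets are represented only by their sizes. The function names `sample_packets` and `estimate_totals` are made up for this example. Scaling the sampled counts by m gives an estimate of the original packet and byte totals.

```python
import random
from typing import List, Optional, Sequence, Tuple


def sample_packets(sizes: Sequence[int], m: int,
                   rng: Optional[random.Random] = None) -> List[int]:
    """Return the sizes of the sampled packets.

    With rng=None every m-th packet is kept (deterministic sampling);
    otherwise each packet is kept independently with probability 1/m.
    """
    if rng is None:
        return [s for i, s in enumerate(sizes) if i % m == 0]
    return [s for s in sizes if rng.random() < 1.0 / m]


def estimate_totals(sampled: Sequence[int], m: int) -> Tuple[int, int]:
    """Scale sampled packet and byte counts back up by the factor m."""
    return len(sampled) * m, sum(sampled) * m
```

With m=100, only about one packet in a hundred reaches the flow cache, so a long transfer may appear as several short flow records if the gaps between its sampled packets exceed the flow timeout.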

Sampling: Flow-Level Sampling Sampling of flow records evicted from flow cache –When evicting flows from table or when analyzing flows Stratified sampling to put weight on heavy flows –Select all long flows and sample the short flows Reduces the number of flow records –Still measures the vast majority of the traffic (Figure: six example flows: Flow 1, 40 bytes; Flow 2, bytes; Flow 3, 8196 bytes; Flow 4, bytes; Flow 5, 532 bytes; Flow 6, 7432 bytes; each sampled with 100%, 10%, or 0.1% probability)
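One way to "select all long flows and sample the short flows" is size-dependent sampling, sketched below. The rule p = min(1, size / threshold) and the 1/p weighting are assumptions chosen for illustration; the slide does not say which rule produced the probabilities in its figure.

```python
import random
from typing import List, Sequence, Tuple


def sampling_probability(size_bytes: int, threshold: int) -> float:
    """Flows at or above the threshold are always kept; smaller ones
    are kept with probability proportional to their size."""
    return min(1.0, size_bytes / threshold)


def sample_flows(sizes: Sequence[int], threshold: int,
                 rng: random.Random) -> List[Tuple[int, float]]:
    """Return (size, weight) pairs for the flow records that were kept."""
    kept = []
    for size in sizes:
        p = sampling_probability(size, threshold)
        if rng.random() < p:
            # Weight 1/p compensates for the records that were dropped.
            kept.append((size, 1.0 / p))
    return kept


def estimate_bytes(kept: Sequence[Tuple[int, float]]) -> float:
    """Weighted sum of kept records, an estimate of total bytes."""
    return sum(size * weight for size, weight in kept)
```

Because heavy flows are never dropped, most of the traffic volume is still measured exactly, and only the many small flows are thinned out.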

High-Speed Packet Sampling Traffic arrives at high rates –High volume –Some analysis scales with the size of the input Possible approaches –Random packet sampling –Targeted packet sampling

Approach Idea: Bias sampling of traffic towards subpopulations based on conditions of traffic Two modules –Counting: Count statistics of each traffic flow –Sampling: Sample packets based on (1) overall target sampling rate (2) input conditions (Figure labels: Traffic stream, Counting, Traffic subpopulations, Sampling, Input conditions, Overall sampling rate, Instantaneous sampling probability)

Challenges How to specify subpopulations? –Solution: multi-dimensional array specification How to maintain counts for each subpopulation? –Solution: rotating array of counting Bloom filters How to derive instantaneous sampling probabilities from overall constraints? –Solution: multi-dimensional counter array, and scaling based on target rates

Specifying Subpopulations Idea: Use concatenation of header fields (tuples) as a key for a subpopulation –These keys specify a group of packets that will be counted together
    # base sampling rate
    sampling_rate = 0.01
    # number of tuples
    tuples = 2
    # number of conditions
    conditions = 1
    # tuple definitions
    tuple_1 := srcip.dstip
    tuple_2 := srcip.srcport.dstport
    # condition : sampling budget
    tuple_1 in (30, inf] AND tuple_2 in (0, 5]: 0.5
tuple_1: Count groups of packets with the same source and destination IP address
tuple_2: Count groups of packets with the same source IP, source port, and destination port

Sampling Rates for Subpopulations Operator specifies –Overall sampling rate –Conditional rate within each class FlexSample computes instantaneous sampling probabilities based on this
    # base sampling rate
    sampling_rate = 0.01
    # number of tuples
    tuples = 2
    # number of conditions
    conditions = 1
    # tuple definitions
    tuple_1 := srcip.dstip
    tuple_2 := srcip.srcport.dstport
    # condition : sampling budget
    tuple_1 in (30, inf] AND tuple_2 in (0, 5]: 0.5
sampling_rate = 0.01: Sample one in 100 packets on average
Budget 0.5: Within the 1/100 budget, half of sampled packets should come from groups satisfying this condition

Examining the Condition Biases sampling towards packets from (source IP, destination IP) pairs which –Have sent more than 30 packets –Have sent at most 5 packets on each (source IP, source port, destination port) combination, i.e., traffic spread thinly across many ports Application: Portscan
    # base sampling rate
    sampling_rate = 0.01
    # number of tuples
    tuples = 2
    # number of conditions
    conditions = 1
    # tuple definitions
    tuple_1 := srcip.dstip
    tuple_2 := srcip.srcport.dstport
    # condition : sampling budget
    tuple_1 in (30, inf] AND tuple_2 in (0, 5]: 0.5
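To see which packets the condition selects, the sketch below evaluates it literally as written in the configuration: the tuple_1 count must lie in (30, inf] and the tuple_2 count in (0, 5]. It uses exact Python counters in place of counting Bloom filters, and the function names and packet representation are hypothetical.

```python
import math
from collections import Counter
from typing import Iterable, List, Tuple

# (src_ip, dst_ip, src_port, dst_port); a made-up packet summary.
PacketKey = Tuple[str, str, int, int]


def in_half_open(count: float, low: float, high: float) -> bool:
    """True if count lies in the interval (low, high]."""
    return low < count <= high


def flag_packets(packets: Iterable[PacketKey]) -> List[bool]:
    """For each packet, report whether it falls in the biased class."""
    tuple_1: Counter = Counter()  # keyed by srcip.dstip
    tuple_2: Counter = Counter()  # keyed by srcip.srcport.dstport
    flags = []
    for src, dst, sport, dport in packets:
        tuple_1[(src, dst)] += 1
        tuple_2[(src, sport, dport)] += 1
        flags.append(
            in_half_open(tuple_1[(src, dst)], 30, math.inf)
            and in_half_open(tuple_2[(src, sport, dport)], 0, 5)
        )
    return flags
```

A host that sends one packet to each of 40 destination ports on the same target starts matching at its 31st packet, while a 40-packet transfer on a single port pair never matches because its tuple_2 count has already passed 5 by then.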

Sampling Lookup Table Problem: Conditions may not be completely specified Solution: Sampling budget lookup table –Lookup table for allocating sampling budget to each class
    # tuple definitions
    tuple_1 := srcip.dstip
    tuple_2 := srcip.srcport.dstport
    # condition : sampling budget
    tuple_1 in (30, inf] AND tuple_2 in (0, 5]: 0.5
(Figure: lookup table; entries not given by a condition are deduced values)
Next problem: Determining which condition each packet satisfies

Counting Subpopulations Each packet belongs to a particular range in n-dimensional space Counts for each condition –Maintain counter (counting Bloom filter) for each tuple in every subcondition –Rotate counters to expunge stale values Details: 1. Number of counters 2. How often to rotate
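A rough sketch of a rotating array of counting Bloom filters follows. The class names, the SHA-256 based hashing, and the default sizes are assumptions for illustration; they are not taken from FlexSample itself. Counts can be overestimated when keys collide, but they are never underestimated within the retained windows.

```python
import hashlib
from collections import deque
from typing import Deque, Set


class CountingBloomFilter:
    """Approximate per-key counter backed by a fixed array of counters."""

    def __init__(self, size: int = 1024, hashes: int = 3) -> None:
        self.size = size
        self.hashes = hashes
        self.counters = [0] * size

    def _indexes(self, key: str) -> Set[int]:
        indexes = set()
        for i in range(self.hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            indexes.add(int.from_bytes(digest[:8], "big") % self.size)
        return indexes

    def add(self, key: str) -> None:
        for idx in self._indexes(key):
            self.counters[idx] += 1

    def count(self, key: str) -> int:
        # The smallest counter is the tightest upper bound on the true count.
        return min(self.counters[idx] for idx in self._indexes(key))


class RotatingCounter:
    """Keeps one filter per time window and drops the oldest on rotation."""

    def __init__(self, windows: int = 4, size: int = 1024, hashes: int = 3) -> None:
        self._size, self._hashes = size, hashes
        self.filters: Deque[CountingBloomFilter] = deque(
            CountingBloomFilter(size, hashes) for _ in range(windows)
        )

    def add(self, key: str) -> None:
        self.filters[-1].add(key)  # newest window receives all updates

    def count(self, key: str) -> int:
        return sum(f.count(key) for f in self.filters)

    def rotate(self) -> None:
        self.filters.popleft()  # expunge the stalest window
        self.filters.append(CountingBloomFilter(self._size, self._hashes))
```

The two details listed on the slide map onto the constructor arguments: `size` and `windows` set the number of counters, and how often the caller invokes `rotate()` sets the rotation period.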

Deriving Instantaneous Sampling Rates Problem: Traffic rates are dynamic –Relative fractions of packets in each class may change Solution: Count packets in each sampling class, and adjust probabilities to rebalance according to the lookup table –Instantaneous rate = overall rate * (target rate) / (actual rate) –Keep track of actual rate using Bloom filter array and EWMA
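The rebalancing formula can be sketched as below. The example reads "target rate" as the share of the sampling budget assigned to a class and "actual rate" as that class's share of arriving packets, smoothed with an EWMA. That reading, the class name `RateBalancer`, and the smoothing factor `alpha` are assumptions, and exact per-interval counts stand in for the Bloom filter array.

```python
from typing import Dict


class RateBalancer:
    """Tracks each class's traffic share and derives sampling probabilities."""

    def __init__(self, overall_rate: float, targets: Dict[str, float],
                 alpha: float = 0.1) -> None:
        self.overall_rate = overall_rate
        self.targets = dict(targets)
        self.alpha = alpha
        # Start by assuming traffic already matches the target mix.
        self.actual = dict(targets)

    def observe(self, counts: Dict[str, int]) -> None:
        """Fold one interval's per-class packet counts into the EWMA."""
        total = sum(counts.values())
        if total == 0:
            return
        for cls in self.targets:
            fraction = counts.get(cls, 0) / total
            self.actual[cls] = ((1 - self.alpha) * self.actual[cls]
                                + self.alpha * fraction)

    def probability(self, cls: str) -> float:
        """Instantaneous rate = overall rate * target rate / actual rate."""
        actual = self.actual[cls]
        if actual <= 0.0:
            return 1.0  # class not seen recently: sample whatever arrives
        return min(1.0, self.overall_rate * self.targets[cls] / actual)
```

For instance, with an overall rate of 0.01 and a 50/50 budget split, a class that makes up 1% of the traffic is sampled with probability 0.5, and the remaining 99% with probability of about 0.005, so roughly 10 packets in 1000 are still sampled overall.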

Example Evaluation: Portscan Setup –Parameters as above –Nmap scan injected into full one-hour trace from department network Results –FlexSample can capture 10x more of the portscan packets if all sampling budget is allocated to portscan class –Bias can be configured

Packet Capture on High-Speed Links Example: Georgia Tech OC3Mon Rack-mounted PC Optical splitter Data Acquisition and Generation (DAG) card Source: endace.com

Characteristics of Packet Capture Allows inspection of every packet on 10G links Disadvantages –Costly –Requires splitting optical fibers –Must be able to filter/store data

Online Scams Often advertised in spam messages URLs point to various point-of-sale sites These scams continue to be a menace –As of August 2007, one in every 87 emails constituted a phishing attack Scams often hosted on bullet-proof domains Problem: Study the dynamics of online scams, as seen at a large spam sinkhole

Online Scam Hosting is Dynamic A URL received in an email message may point to different sites over time Maintains agility as sites are shut down, blacklisted, etc. One mechanism for hosting sites: fast flux

Overview of Dynamics Source: HoneyNet Project

Why Study Dynamics? Understanding –What are the possible invariants? –How many different scam-hosting sites are there? Detection –Today: Blacklisting based on URLs –Instead: Identify the network-level behavior of a scam-hosting site

Summary of Findings What are the rates and extents of change? –Different from legitimate load balancing –Different across scam campaigns How are dynamics implemented? –Many scam campaigns change DNS mappings at all three locations in the DNS hierarchy: A record, NS record, and IP address of the NS record Conclusion: Might be able to detect based on monitoring the dynamic behavior of URLs

Data Collection One month of spamtrap data –115,000 emails –384 unique domains –24 unique spam campaigns

Top 3 Spam Campaigns Some campaigns hosted by thousands of IPs Most scam domains exhibit some type of flux Sharing of IP addresses across different roles (authoritative NS and scam hosting)

Time Between Changes How quickly do DNS-record mappings change? Scam domains change on shorter intervals than their TTL values Domains within the same campaign exhibit similar rates of change
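One way to measure time between changes is to poll a domain's DNS records repeatedly and record how long each answer set stays unchanged. The sketch below assumes the polling results are already available as (timestamp, set of IP addresses) pairs; the function name and the sample data are hypothetical.

```python
from typing import FrozenSet, List, Sequence, Tuple

Observation = Tuple[float, FrozenSet[str]]  # (poll time in seconds, A records)


def change_intervals(observations: Sequence[Observation]) -> List[float]:
    """Return the time between successive changes of the answer set.

    The first observation is treated as the time of the initial mapping.
    """
    if not observations:
        return []
    ordered = sorted(observations, key=lambda obs: obs[0])
    last_change, current = ordered[0]
    intervals = []
    for ts, ips in ordered[1:]:
        if ips != current:
            intervals.append(ts - last_change)
            last_change, current = ts, ips
    return intervals
```

Comparing these intervals with the TTL returned in each response shows whether a domain's mappings change faster than its advertised TTL would suggest.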

Rates of Change Domains that exhibit fast flux change more rapidly than legitimate domains Rates of change are inconsistent with actual TTL values

Rates of Accumulation How quickly do scams accumulate new IP addresses? Rates of accumulation differ across campaigns Some scams only begin accumulating IP addresses after some time
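The rate of accumulation can be read off a curve of cumulative distinct IP addresses over time. A minimal sketch, assuming the same kind of (timestamp, set of IP addresses) polling data as input; the function name is made up for this example.

```python
from typing import Iterable, List, Sequence, Set, Tuple


def accumulation_curve(
    observations: Sequence[Tuple[float, Iterable[str]]]
) -> List[Tuple[float, int]]:
    """Return (timestamp, distinct IPs seen so far) for each observation."""
    seen: Set[str] = set()
    curve = []
    for ts, ips in sorted(observations, key=lambda obs: obs[0]):
        seen.update(ips)
        curve.append((ts, len(seen)))
    return curve
```

A flat stretch at the start of the curve followed by a steady rise would correspond to a scam that only begins accumulating addresses after some time.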

Rates of Accumulation

Location of Change in Hierarchy Scam networks use a different portion of the IP address space than legitimate sites –30/8 – 60/8: lots of legitimate sites, no scam sites DNS lookups for scam domains are often more widely distributed than those for legitimate sites

Location in IP Address Space Scam campaign infrastructure is considerably more concentrated in the 80/8-90/8 range

Distribution of DNS Records

Registrars Involved in Changes About 70% of domains still active are registered at eight registrars Three registrars responsible for 257 domains (95% of those still marked as active)

Conclusion Scam campaigns rely on a dynamic hosting infrastructure Studying the dynamics of that infrastructure may help us develop better detection methods Dynamics –Rates of change differ from legitimate sites, and differ across campaigns –Dynamics implemented at all levels of DNS hierarchy Location –Scam sites distributed more across IP address space