Traffic Analysis of UDP-based Flows in ourmon Jim Binkley, Divya Parekh Portland State University Computer.


Traffic Analysis of UDP-based Flows in ourmon Jim Binkley, Divya Parekh Portland State University Computer Science Courtesy of John McHugh

2 Outline
  problem space - and short ourmon intro
  UDP flow tuple
  UDP work weight
  UDP guesstimator
  problems (DNS and p2p as scanners)
  packet-size based UDP application guessing
  conclusions

3 motivation - problem space
  UDP-based DOS attacks certainly exist
  p2p searching courtesy of Distributed Hash Tables is on the rise (UDP to search, TCP to fetch)
    Kademlia protocol - P. Maymounkov and D. Mazieres
  the stormworm botnet is UDP/p2p based
    built on an edonkey-related protocol (overnet)
  p2p-based apps are not just for file sharing
    Joost - cable TV, Skype - VOIP
  goal: focus on UDP flow activity in terms of security and p2p

4 brief ourmon intro
  2 part system: front-end, back-end
    front-end: packet sniffer, outputs ASCII files
    back-end: web interface with graphs and aggregated logs
  front-end produces:
    scalars that produce RRDTOOL web graphs, either hardwired or programmable (BPF)
    various kinds of top-N lists (ourmon flows)
  back-end:
    web access plus graphics processing, log aggregation
    30-second view and hourly aggregation views
    event log for important security events

5 ourmon architectural breakdown
  [diagram: probe box (FreeBSD) feeding graphics box (BSD or Linux)]
  probe box:
    inputs: pkts from NIC/kernel BPF buffer; ourmon.conf config file
    runtime: 1. N BPF expressions 2. topn (hash table) of flows and other things (tuples or lists) 3. some hardwired C filters (scalars of interest)
    output: mon.lite report file
  graphics box:
    outputs: 1. RRDTOOL strip charts 2. histogram top N graphs 3. various ASCII reports, hourly summaries or report period
    tcpworm.txt etc.
  filters: BPF expressions, lists, some hardwired C filters

6 ourmon flow breakdown
  top N traditional (IP.port->IP.port) flows
    IP, UDP, TCP, ICMP
    hourly summarizations and web histograms
  IP host centric flows at Layer 4
    TCP (presented in TCP port report)
    UDP (presented in UDP port report) <----- (this is what we are talking about here)
  Layer 7 specific flows
    now include IRC channels and hosts in channels
    DNS and ssh flows (spin-off of traditional flows)

7 UDP port report
  UDP centric top N tuple
    collected by front-end every 30 seconds
    hourly summarizations made by back-end
  flow tuple fields:
    IP address - key
    IP dst address - one sampled IP dst
    UDP work weight - noise measurement (sort by)
    SENT - count of packets sent
    RECV - count of packets returned to IP
    ICMPERRORS - ICMP errors returned (unreachables in particular)

8 UDP port report tuple, cont.
  L3D - count of unique remote IP addresses in 30-second sample period
  L4D - count of unique remote UDP dst ports
  SIZEINFO - size histogram, 5 buckets: <= 40, , 1000, 1500 (this is L7 payload size)
  SA - running average of sent payload size
  RA - running average of recv. payload size
  APPFLAGS - tags based on L7 regular expressions
    s for spim, d for DNS, b for Bittorrent, etc.
  PORTSIG - first ten dst ports seen, with packet counts expressed as frequency in 30 sec report
    e.g., [53,100] meaning 100% sent to port 53
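The PORTSIG field described above can be illustrated with a small sketch. This is not ourmon's code: the function name `port_signature`, the choice to keep ports in first-seen order, and the choice to compute percentages over the tracked ports only are all assumptions made for the example.

```python
from collections import Counter


def port_signature(dst_ports, max_ports=10):
    """Build a PORTSIG-like list of [port, percent] pairs.

    dst_ports: destination port of each UDP packet a host sent in one
    sample period. Only the first `max_ports` distinct ports are tracked
    (assumption: packets to later ports are ignored entirely).
    """
    counts = Counter()
    first_seen = []  # distinct ports in the order they first appeared
    for port in dst_ports:
        if port not in counts:
            if len(first_seen) >= max_ports:
                continue  # already tracking the maximum number of ports
            first_seen.append(port)
        counts[port] += 1

    total = sum(counts.values())
    if total == 0:
        return []
    # Express each packet count as a rounded percentage of tracked packets.
    return [[port, round(100 * counts[port] / total)] for port in first_seen]


if __name__ == "__main__":
    print(port_signature([53, 53, 53, 123]))
```

A host that sends only to port 53 yields `[[53, 100]]`, matching the slide's example.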

9 UDP work weight calculation
  per IP host
  UDP ww = (SENT * ICMPERRORS) + RECV
    if ICMPERRORS == 0, then just SENT + RECV
  we sort the top N report by the UDP ww
  results can basically be divided into about 3 bands (numbers are relative to ethernet speed, 1 Gbit in our case):
    TOO HIGH (> 10 million in our case)
    BUSY million (p2p/games/dns servers)
    LOW (most - e.g., clients doing DNS) < 1000
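The calculation above is small enough to write out directly. The following is a sketch rather than ourmon's front-end code; the function names and the sample host records are made up for illustration.

```python
def udp_work_weight(sent, recv, icmp_errors):
    """UDP work weight for one host in one sample period.

    With no ICMP errors the weight is simply SENT + RECV; otherwise the
    sent count is multiplied by the error count so that heavy error
    makers rise to the top of the report.
    """
    if icmp_errors == 0:
        return sent + recv
    return sent * icmp_errors + recv


def top_n_by_work_weight(hosts, n=10):
    """Sort host records (dicts with sent/recv/icmp_errors) by work weight."""
    return sorted(
        hosts,
        key=lambda h: udp_work_weight(h["sent"], h["recv"], h["icmp_errors"]),
        reverse=True,
    )[:n]


if __name__ == "__main__":
    # Hypothetical hosts: a quiet DNS client and a noisy error-generating host.
    hosts = [
        {"ip": "host-a", "sent": 20, "recv": 20, "icmp_errors": 0},
        {"ip": "host-b", "sent": 5000, "recv": 10, "icmp_errors": 300},
    ]
    for h in top_n_by_work_weight(hosts):
        print(h["ip"], udp_work_weight(h["sent"], h["recv"], h["icmp_errors"]))
```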

10 theory behind UDP workweight
  if a host is doing scanning or p2p it may generate SENT * ERROR packets and hence appear higher in the report
    scanning error generation is obvious
    p2p error generation happens because a p2p host has a set of peers, some of which are stale
  if a host is just busy, we add SENT + RECV
    some hosts may recv more packets than they send, e.g., JOOST p2p video apps
  result: big error makers go to the top, busy hosts next

11 some added features of UDP work weight
  we graph the very first tuple (the winner!) over the day
    gives an average distribution and shows spikes
    average day shown in next slide
  if work weight > HIGH THRESHOLD, we record N packets with an automated tcpdump mechanism
    this has proved effective in the past in catching DOS attack sources and targets
    even when monitoring fails because the DOS was too much for the probe, so far we have always managed to capture sufficient packets

12 daily graph of top UDP work weights
  [graph: top single work weight per 30-second period for a typical day]
  note: peaks here are usually SPIM, outside in

13 contrived UDP port report (simplified)
  columns: IP src | ww | Guess | SENT | RECV | ICMP ERR | L3D / L4D | App flags | portsig
  rows (as extracted; some cell values were lost): 1*20 million scan / 527 bmany 212 million ipscan / 2 s1026, *49000p2p / 1297 bmany 43321p2p / 279 d53

14 UDP guesstimator algorithm
  attempt to guess what a host is up to based on attributes
    principally L3D/L4D and workweight
  goal: use only L3 and L4 attributes, not L7 attributes, and avoid destination port semantics
    thus it should work if bittorrent is on port 53 and encrypted
  per IP host guess
  basically a decision tree with 3 thresholds
    WW high threshold - set at 10 million
    L3D/L4D - p2p counts (say 10 for a low threshold)

15 rough algorithm
  guess = unknown
  if ww > HIGHTHRESHOLD
      guess = scanner
      if L4D is HIGH and L3D is LOW
          guess = portscanner
      else if L3D is HIGH and L4D is LOW
          guess = ipscanner
  else if L3D and L4D > P2PTHRESHOLD
      guess = p2p
  we have HIGHTHRESHOLD at 10 million, port thresholds at 10 (might be higher/lower depending on locality)
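The rough algorithm above can be sketched as a small function. The slide text does not show indentation, so the nesting here is one reading of it: the port-scanner and IP-scanner cases refine the scanner guess, and the p2p test applies only below the work-weight threshold. Treating HIGH and LOW as "above" and "at or below" the same count threshold of 10 is also an assumption, as are the function and constant names.

```python
HIGH_THRESHOLD = 10_000_000  # work-weight threshold quoted on the slide
P2P_THRESHOLD = 10           # L3D/L4D count threshold quoted on the slide


def guess_host(ww, l3d, l4d, high=HIGH_THRESHOLD, count=P2P_THRESHOLD):
    """Guess what a host is doing from L3/L4 attributes only."""
    guess = "unknown"
    if ww > high:
        guess = "scanner"
        if l4d > count and l3d <= count:
            guess = "portscanner"   # many ports, few destinations
        elif l3d > count and l4d <= count:
            guess = "ipscanner"     # many destinations, few ports
    elif l3d > count and l4d > count:
        guess = "p2p"               # many destinations and many ports
    return guess


if __name__ == "__main__":
    # Hypothetical attribute values, not taken from a real report.
    print(guess_host(ww=12_000_000, l3d=500, l4d=2))
    print(guess_host(ww=49_000, l3d=800, l4d=1297))
```

Under this reading a busy DNS server with many destinations and many source ports falls into the p2p branch, which is the first of the two errors the next slide discusses.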

16 how well does it work?
  it is really only pointing out obvious attribute aspects
    but this is helpful to a busy analyst
  two interesting errors:
    1. because DNS servers are typically busy and send to many ports and many destinations, they are diagnosed as p2p -- true, but somehow annoying
       our L7 pattern is complex and is probably sufficient, as DNS isn't going to be encrypted
    2. some p2p hosts -- typically with stale caches -- may be diagnosed as scanners
       in a sense this is true
  note that p2p/scanner overlap is a long-standing problem

17 application guessing - limited experiment
  inspired by Collins, Reiter: Finding Peer-To-Peer File Sharing Using Coarse Network Behaviors, Sept
  decided to try to use packet sizes to see if we could guess UDP-based applications
    SIZEINFO and SA/RA fields used for the most part
    thus 7 attributes in all: basic sent size histogram + SA, RA
  initially only done if the guesstimator guesses p2p
    had to back that off for Skype
  only tested in a lab using Windows Vista and applications (some testing on a Mac)
    culled stats from 30 second UDP port reports
  this information is appended to the guess, e.g., p2p:joost

18 approach
  limited testing - lab only (barring stormworm, where we got pcap traces from elsewhere)
  gathered attribute stats and graphed them
  per attribute, chose lower and upper thresholds based on >= 90% of samples
    note that the byte SIZE attribute was always 0 (not used)
  result coded as a decision tree forest
    really a set of if tests - not if-then-else
    therefore results could overlap (fuzzy match)
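The approach above (per-attribute lower and upper thresholds, applied as independent if tests) can be sketched as follows. Everything specific here is hypothetical: the slides do not give the actual thresholds, so the profiles, attribute names, and the symmetric-trimming rule used to cover at least 90% of samples are assumptions for illustration only.

```python
def bounds_covering(values, coverage_pct=90):
    """Pick (lower, upper) bounds that contain at least coverage_pct% of values.

    Assumption: trim an equal number of samples from each tail.
    """
    ordered = sorted(values)
    n = len(ordered)
    keep = -(-n * coverage_pct // 100)  # ceiling of n * coverage_pct / 100
    drop = (n - keep) // 2              # samples trimmed from each tail
    return ordered[drop], ordered[n - 1 - drop]


def match_apps(sample, profiles):
    """Return every app whose attribute ranges all contain the sample.

    Each profile is tested independently (a set of if tests, not
    if-then-else), so more than one app may match.
    """
    matches = []
    for app, ranges in profiles.items():
        if all(lo <= sample[attr] <= hi for attr, (lo, hi) in ranges.items()):
            matches.append(app)
    return matches


if __name__ == "__main__":
    # Hypothetical lab measurements of SA (average sent payload size).
    sa_samples = [118, 120, 121, 125, 130, 131, 133, 140, 141, 300]
    profiles = {"app_x": {"sa": bounds_covering(sa_samples)}}
    print(profiles)
    print(match_apps({"sa": 128}, profiles))
```

Because the profiles are evaluated independently, a report line could end up tagged with more than one application, which is the overlap the slide calls a fuzzy match.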

19 apps/protocols in experiment
  application        protocol
  edonkey            emule
  bittorrent
  azureus            bittorrent
  utorrent           bittorrent
  limewire           gnutella or bittorrent
  joost
  skype
  stormworm (UDP)    emule variant

20 results?!
  suggestive and interesting, but not 100% conclusive, that this approach might be valuable
  problems:
    not enough testing, but it seemingly worked well barring skype
    not enough apps (should have included DNS! and probably NTP)
    we may be finding app classes, not particular apps
    we don't know all the p2p apps on our network
      it is a university, although bittorrent and gnutella are dominant
    perhaps should have more buckets, look at recv packet buckets, better threshold estimation, etc.
    we could not get skype to behave - could catch it sometimes, other times not; it is not necessarily p2p, not necessarily UDP

21 conclusions
  UDP centric port tuple is useful for host behavior analysis with simple stats and a top N sort
  UDP ww is a good simple stat
    helps us track down blatant security problems
    measure of noise and load
  guesstimator is useful for dividing the world into security threats vs p2p
    based on non-L7 data
    saves time spent looking at data
    best to learn DNS servers though
  application guessing is promising -- would be nice if researchers elsewhere would pursue it as well

22 ourmon on sourceforge
  open source
  new release (2.9) including work here expected Spring 2009
    UDP port report guesstimator etc., plus hourly UDP summarization for port report
    ssh flow statistics (global site logging)
    expanded DNS statistics (errors, top N queries)
    expanded blacklist mechanism (can handle net/mask)
  ourmon.sourceforge.net (version 2.81) currently supports threads in front-end