Tracking Malicious Regions of the IP Address Space Dynamically.

Introduction
Recent focus on network-level behaviour of malicious activity:
• Spam originating from different /24 prefixes [RF06]
• Spam originating from long-lived network-aware clusters [VSSHS07]
• Malicious activity originating from different /16 prefixes [CSFJWS07]
Application:
• Real-time, effective mechanism for blocking traffic at line speed
Challenges:
• Transience of individual IP addresses: due to, e.g., bots, DHCP effects
• Effective clustering: a variety of different clusterings of IP prefixes is possible
• Scaling to large volumes of data
[RF06] Ramachandran & Feamster, Understanding the Network-level Behavior of Spammers, SIGCOMM '06
[VSSHS07] Venkataraman et al., Exploiting Network Structure for Proactive Spam Mitigation, Security '07
[CSFJWS07] Collins et al., Using Uncleanliness to Predict Future Botnet Addresses, IMC '07

Study: Transience of Individual IPs
Measurement study: analysis of spamming IP addresses & regions
Data: enterprise mail logs over 6 months, 28 million emails
Spamming IPs present on many days contribute little spam!

Study: Persistence of IP Prefix Clusters
Network-aware clusters [KW00]:
• Pre-defined set of IP prefixes that reflect Internet routing structure
• An IP address belongs to the cluster with the longest matching prefix
Most spam comes from "bad" clusters present for many days:
• 90% of total spam comes from "bad" clusters present for 60+ days
[Figure legend: Bad Clusters, Bad IPs]
[KW00] Krishnamurthy & Wang, On Network-Aware Clustering of Web Clients, SIGCOMM '00
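
The longest-matching-prefix rule for assigning an IP address to a cluster can be sketched as follows. This is a minimal illustration using Python's standard ipaddress module; the function name and the cluster list (documentation address ranges) are hypothetical, not data from the study.

```python
import ipaddress

def longest_matching_prefix(ip, prefixes):
    """Return the most specific prefix in `prefixes` that contains `ip`, or None."""
    addr = ipaddress.ip_address(ip)
    best = None
    for p in prefixes:
        net = ipaddress.ip_network(p)
        # Keep the containing network with the longest prefix length.
        if addr in net and (best is None or net.prefixlen > best.prefixlen):
            best = net
    return best

# Hypothetical cluster set (documentation ranges, not real measurement data).
clusters = ["192.0.2.0/24", "192.0.2.128/25", "198.51.100.0/24"]
print(longest_matching_prefix("192.0.2.200", clusters))
```

A linear scan is enough to show the rule; a deployment at line speed would more plausibly use a trie keyed on address bits.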

Introduction (continued)
Recent focus on network-level behaviour of malicious activity:
• Spam originating from different /24 prefixes [RF06]
• Spam originating from long-lived network-aware clusters [VSSHS07]
• Malicious activity originating from different /16 prefixes [CSFJWS07]
Challenges:
• Transience of individual IP addresses: due to, e.g., bots, DHCP effects
• Scaling to large volumes of data
• Effective clustering: a variety of different clusterings of IP prefixes is possible
Question:
• Can we automatically compute the optimal clustering for tracking and predicting malicious activity?

Problem, v.1 (naïve)
Input: IP addresses labelled normal (+) / malicious (-)
  e.g., email logs with the sending mail server's IP, where SpamAssassin labels each message spam (-) or non-spam (+)
Output: tree of IP prefixes that
• is optimal for classifying IPs as normal (+) / malicious (-)
• contains no more than k leaves
Limit the output IP tree to k leaves:
• Small output state
• Avoid overfitting
[Figure: IP prefix tree, from short prefixes through x.x.x.x/24 and x.x.x.x/30 down to x.x.x.x/32 (a single IP address)]

Challenges
As stated, version 1 is easy: use dynamic programming!
However, the IP tree is over dynamic and evolving data:
• Data is collected and labelled over time (e.g., email logs, traffic logs)
• Compromised/malicious IP prefixes may change over time
  Want to compute the updated tree without going back to past data
Low space & low overhead: up to 2^32 leaves in the IP tree!
  Want the algorithm to use space near-linear in k
Adversarially generated IP addresses:
  Want guarantees as a function of the data seen & the adversary's power
Approach: design online algorithms to address all issues
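
The slide says only that version 1 can be solved by dynamic programming and gives no recurrence. One plausible formulation, offered here as an assumption, is a recursion over the binary prefix tree: the cheapest tree under a prefix with a leaf budget either stops at that prefix (majority label) or splits the budget between its two children. The function name, the 4-bit toy addresses, and the sample data are all hypothetical.

```python
from functools import lru_cache

def best_tree_cost(samples, k, bits=32):
    """Minimum number of misclassified samples over prefix trees with at most k leaves.

    samples: iterable of (address_as_int, label) with label +1 or -1.
    """
    samples = tuple(samples)

    def counts(prefix, depth):
        # Count positive and negative samples whose top `depth` bits equal `prefix`.
        shift = bits - depth
        pos = neg = 0
        for addr, label in samples:
            if addr >> shift == prefix:
                if label > 0:
                    pos += 1
                else:
                    neg += 1
        return pos, neg

    @lru_cache(maxsize=None)
    def cost(prefix, depth, leaves):
        pos, neg = counts(prefix, depth)
        as_leaf = min(pos, neg)  # a leaf predicts its majority label
        if leaves == 1 or depth == bits or as_leaf == 0:
            return as_leaf
        best = as_leaf
        for left in range(1, leaves):  # split the leaf budget between children
            best = min(best,
                       cost(prefix << 1, depth + 1, left)
                       + cost((prefix << 1) | 1, depth + 1, leaves - left))
        return best

    return cost(0, 0, k)

# Toy data on 4-bit "addresses": 00xx is good, 01xx is bad, 1xxx is good.
toy = [(0b0000, +1), (0b0001, +1), (0b0100, -1), (0b0101, -1),
       (0b1000, +1), (0b1100, +1)]
for k in (1, 2, 3):
    print(k, best_tree_cost(toy, k, bits=4))
```

This offline sketch rescans all past data for every node, which is exactly what the online setting rules out.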

Online Learning
Low space & overhead: IPs seen one at a time
• Algorithm only needs to maintain a small internal state
Naturally incorporates the dynamic aspect of the data
Guarantees: function of the mistakes made by the optimal offline tree
• Quantifies the cost of learning the tree online
• Data may be generated adversarially
[Figure: online learning loop with labels Data (e.g., mail logs labelled +/-), IP, Algorithm, Predicted Label, Correct Label]

Problem, v.2: Online Learning of Static Tree
Input: a stream of IPs labelled normal (+) / malicious (-)
Output: tree of IP prefixes that
• predicts nearly as well as the OPTIMAL offline tree with k leaves
• uses low space in the online model of learning
[Figure: IP prefix tree, from short prefixes through x.x.x.x/24 and x.x.x.x/30 down to x.x.x.x/32 (a single IP address)]

Dynamic IP Prefix Tree
Malicious IP regions may change over time
New goal: predict nearly as well as the optimal changing tree
• The optimal tree may make two kinds of changes: splitting a leaf & changing a leaf's sign
• Prediction guarantee: our mistakes = O(OPTIMAL tree's cost)
  OPTIMAL tree's cost: a function of the mistakes it makes and the changes it makes
[Figure: #1: leaf changes sign (+, -); #2: leaf splits]

Problem: Online Learning of Dynamic Tree
Compute the IP tree online to predict nearly as well as the best changing tree with k leaves, using space Õ(k)
[Figure: IP prefix tree with x.x.x.x/24 nodes, down to a single IP address]

Related Problems
Machine learning:
• Predicting as well as the best pruning of a decision tree [HS97]
  Our requirements: low space, tracking a dynamic tree online, real-world implementation
• Learning decision trees in streaming models [DH00]
  Our requirement: adversarial data
Simpler problem: tree fixed; no need to "learn" a tree
Online algorithms:
• Paging algorithms
[HS97] Helmbold & Schapire, Predicting Nearly As Well As the Best Pruning of a Decision Tree, Machine Learning '97
[DH00] Domingos & Hulten, Mining High-Speed Data Streams, KDD 2000

Overview of our Algorithm
Given an IP address, predict:
• Trace the IP's path on the tree & flag all nodes on the path (i.e., all prefixes of the input IP in the tree)
• Label the IP by combining the weights of the flagged nodes
Given the correct label, update:
• If predicted correctly, do nothing
• If predicted incorrectly:
    update the flagged nodes' weights
    grow the tree by splitting a leaf (if necessary)
[Figure: IP a.b.c.d, Predicted Label, Correct Label]
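
The "trace the path and flag nodes" step can be sketched as below. The representation is an assumption made for illustration: the tree is a set of (prefix_value, prefix_length) pairs for IPv4, and the function name and example tree are hypothetical.

```python
import ipaddress

def flagged_nodes(ip, tree):
    """Return the prefixes of `ip` that are nodes of `tree`, root first.

    tree: set of (prefix_value, prefix_length) pairs; the root is (0, 0).
    """
    addr = int(ipaddress.IPv4Address(ip))
    path = []
    for length in range(33):
        # Keep the top `length` bits and clear the rest.
        mask = ((1 << length) - 1) << (32 - length)
        node = (addr & mask, length)
        if node not in tree:
            break  # the path ends at a leaf of the current tree
        path.append(node)
    return path

# Hypothetical tree: root, both /1 children, and both /2 children under 128.0.0.0/1.
tree = {(0, 0), (0, 1), (0x80000000, 1), (0x80000000, 2), (0xC0000000, 2)}
print(flagged_nodes("192.0.2.1", tree))  # root, 128.0.0.0/1, 192.0.0.0/2
print(flagged_nodes("10.0.0.1", tree))   # root, 0.0.0.0/1
```

Each lookup touches at most 33 nodes, so prediction cost is independent of the number of addresses seen so far.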

Four Algorithmic Questions
Fixed tree structure:
• Relative importance of flagged nodes?
• Label of flagged node: positive/negative?
Changing the tree structure:
• When to grow the tree?
• How to maintain a small tree?

Relative Importance of Nodes
Use the sleeping experts algorithm:
• Every node is an expert
• Each flagged node is an "awake" expert
• Best expert: leaf of the optimal tree
Algorithm:
• Each node has a "relative" weight
• Predict with all experts that are awake, e.g., by a weighted coin toss
• Update by redistributing weight between awake experts
[Figure: tree with node weights w_0, w_1, w_2, w_10, w_15, w_16, w_17]
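
A minimal sketch of one sleeping-experts round follows. The slide does not give the exact update rule, so the multiplicative penalty `beta` and the rescaling step are assumptions in the style of the standard sleeping-experts update; function names and weights are hypothetical.

```python
import random

def choose_expert(weights, awake, rng=random):
    """Weighted coin toss: pick one awake expert with probability proportional to its weight."""
    awake = list(awake)
    return rng.choices(awake, weights=[weights[e] for e in awake], k=1)[0]

def update_awake(weights, awake, wrong, beta=0.5):
    """Penalise awake experts that erred, then rescale the awake experts.

    The rescaling keeps the total weight of the awake set unchanged, so
    weight is only redistributed among awake experts; sleeping experts
    are left untouched. `beta` is an assumed penalty factor in (0, 1).
    """
    before = sum(weights[e] for e in awake)
    for e in awake:
        if e in wrong:
            weights[e] *= beta
    after = sum(weights[e] for e in awake)
    for e in awake:
        weights[e] *= before / after

# Hypothetical weights: three prefixes on the IP's path are awake, one node sleeps.
weights = {"/0": 1.0, "/8": 1.0, "/16": 1.0, "elsewhere": 1.0}
awake = ["/0", "/8", "/16"]
update_awake(weights, awake, wrong={"/0"})
print(weights)
print(choose_expert(weights, awake))
```

Because only flagged nodes are touched, the cost of a round is bounded by the path length, not the tree size.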

Label of Flagged Nodes
Shifting experts problem:
• Each node has 2 internal experts:
    "positive" expert: label +
    "negative" expert: label -
• Best expert: label of node
Algorithm:
• Predict by a weighted coin toss between the + & - experts
• Update by shifting weight from the incorrect to the correct expert
Tracking a dynamic tree:
• Automatically incorporates a leaf changing sign
[Figure: node with internal expert weights w_+ and w_-]
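
The two internal experts of a node can be sketched as a pair of weights that sum to 1. The additive shift of size `eps` follows the Analysis slide's "additive change to internal node weight per mistake"; the class name, the starting weights, and the clipping to [0, 1] are assumptions.

```python
import random

class NodeLabel:
    """Positive and negative internal experts of one tree node."""

    def __init__(self):
        self.w_plus = 0.5   # weight of the "label +" expert
        self.w_minus = 0.5  # weight of the "label -" expert

    def predict(self, rng=random):
        # Weighted coin toss between the two experts.
        return +1 if rng.random() < self.w_plus else -1

    def update(self, correct_label, eps=0.1):
        # Shift eps of weight from the incorrect expert to the correct one,
        # clipping so that both weights stay in [0, 1].
        if correct_label > 0:
            self.w_plus = min(1.0, self.w_plus + eps)
        else:
            self.w_plus = max(0.0, self.w_plus - eps)
        self.w_minus = 1.0 - self.w_plus

node = NodeLabel()
for _ in range(3):
    node.update(-1, eps=0.25)  # the region keeps sending spam
print(node.w_plus, node.w_minus, node.predict())
```

Since the shift is additive, a node whose region changes from malicious to normal recovers after a number of updates proportional to 1/eps, which is how a leaf changing sign is tracked.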

Growing the Tree
Algorithm:
• Track the number of mistakes made by each leaf
• Split a leaf after it makes sufficiently many mistakes
Tracking a dynamic tree:
• Also incorporates changes caused by leaf splitting
[Figure: leaf annotated "# mistakes?"]
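
The split rule can be sketched with a per-leaf mistake counter. Leaves are represented as hypothetical (prefix_value, prefix_length) pairs for IPv4; the threshold is a parameter here (the Analysis slide suggests on the order of 1/ε), and the function name is an assumption.

```python
def record_mistake(mistakes, leaf, threshold):
    """Count one mistake at `leaf`; return its two children if it should split, else None."""
    mistakes[leaf] = mistakes.get(leaf, 0) + 1
    prefix, length = leaf
    if mistakes[leaf] < threshold or length == 32:
        return None  # not enough mistakes yet, or a /32 cannot be split
    del mistakes[leaf]  # the leaf becomes an internal node
    bit = 1 << (31 - length)  # the next address bit below this prefix
    return [(prefix, length + 1), (prefix | bit, length + 1)]

mistakes = {}
leaf = (0xC0000000, 2)  # 192.0.0.0/2
results = [record_mistake(mistakes, leaf, threshold=3) for _ in range(3)]
print(results)
```

Only the third mistake triggers the split, producing 192.0.0.0/3 and 224.0.0.0/3.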

Maintaining Small-Space Tree
Convert to a paging problem:
• Each node is a page
• Size of optimal cache: 2k
• Q: which node to discard?
Use paging algorithms & competitive analysis:
• e.g., using Flush-When-Full (FWF): start from scratch after the tree has grown to max size
[Figure: tree with nodes marked "?"]
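
Flush-When-Full applied to the tree can be sketched as a node set with a capacity: when a split would exceed the capacity, everything except the root is discarded. The class name, the capacity value, and the (prefix_value, prefix_length) node representation are assumptions for illustration.

```python
class FlushWhenFullTree:
    """Node set that restarts from the root when it would exceed `capacity`."""

    ROOT = (0, 0)  # the /0 prefix

    def __init__(self, capacity):
        self.capacity = capacity
        self.nodes = {self.ROOT}
        self.flushes = 0

    def add_children(self, children):
        """Add the children of a split leaf; flush instead if they do not fit.

        Returns True if the children were added, False if the tree was flushed.
        """
        if len(self.nodes) + len(children) > self.capacity:
            self.nodes = {self.ROOT}  # start from scratch
            self.flushes += 1
            return False
        self.nodes.update(children)
        return True

t = FlushWhenFullTree(capacity=3)
print(t.add_children([(0, 1), (0x80000000, 1)]), len(t.nodes))
print(t.add_children([(0, 2), (0x40000000, 2)]), len(t.nodes), t.flushes)
```

FWF needs no per-node bookkeeping such as recency, which keeps the state small at the cost of relearning the tree after each flush.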

Analysis
Define ε so that:
• additive change to internal node weight per mistake: ε
• multiplicative loss factor to relative node weight per mistake: 1/ε
• mistakes needed before splitting a node: 1/ε
Then:
E[# mistakes of algorithm] = (1 + ε) (mistakes of OPTIMAL)
    + (1/ε) (sign changes of OPTIMAL's leaves)
    + (log k / ε) (splits of OPTIMAL's leaves)
Space required using FWF: O(k log k / ε^2)
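
To see how ε trades off the three terms, the expression can be evaluated for made-up numbers. This sketch takes the slide's expression literally; the base of the logarithm is not stated, so base 2 is an assumption, and all input values are hypothetical.

```python
import math

def mistake_bound(eps, opt_mistakes, opt_sign_changes, opt_splits, k):
    """Evaluate the three terms of the expected-mistakes expression.

    Assumption: log is taken base 2; the slide does not state the base.
    """
    return ((1 + eps) * opt_mistakes
            + (1 / eps) * opt_sign_changes
            + (math.log2(k) / eps) * opt_splits)

# Hypothetical workload: OPTIMAL makes 1000 mistakes, 5 sign changes, 10 splits, k = 1024.
for eps in (0.5, 0.1):
    print(eps, mistake_bound(eps, 1000, 5, 10, 1024))
```

With these numbers the values are about 1710 for ε = 0.5 and about 2150 for ε = 0.1: a smaller ε tightens the first term but inflates the cost charged for each change OPTIMAL makes.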

Implementation Issues
Sparse data from some IP prefixes:
• Might not see any IP addresses from some prefixes
Clusters might be too big to be meaningful:
• Include a loss function to penalize prefixes for "nothingness"
Efficient implementations for large-scale data:
• Coalesce nodes appropriately in the binary tree (effectively becomes a trie)
• Randomization calls are computationally expensive
Experimental performance: in progress