Early Detection of Outgoing Spammers in Large-Scale Service Provider Networks. Yehonatan Cohen, Daniel Gordon, Danny Hendler (Ben-Gurion University)

Presentation transcript:

Early Detection of Outgoing Spammers in Large-Scale Service Provider Networks
Yehonatan Cohen, Daniel Gordon and Danny Hendler
Ben-Gurion University
DIMVA 2013

Talk outline
- Preliminaries
- ErDOS: An Early Detection Scheme for Outgoing Spam
- Evaluation
- Conclusions and Future Work

Preliminaries
- Spam
  o Unsolicited mail, typically sent in large quantities
- Hazards
  o Malware distribution
  o Phishing
  o Resource consumption
  o Poor user experience
- Detection may be attempted when
  o Mail is sent (outgoing spam detection)
  o Mail is received (incoming spam detection)

Outgoing spam detection
- Spam can be blocked before leaving the Email Service Provider (ESP)
- Advantages
  o Reduces load on ESP infrastructure
  o Prevents damage to ESP reputation
  o Detection may be based on hosted accounts' activity

Outgoing spam filtering techniques
- Contents-based filtering:
  o Learn and identify textual patterns typical of spam messages
  o May be tricked by manipulating spam content
    - Image-based spam
    - Random string insertion (hash busters)
  o Non-negligible false negative rate

Outgoing spam filtering techniques (cont'd)
- Inter-account communication patterns analysis:
  o Models accounts' behaviour
  o Based on inter-account social interactions
  o Typically utilizes machine-learning techniques
  o May leverage ESP account identification

Our goals
- Devise an effective detector of outgoing spammers for large ESPs (the ErDOS detector)
- Emphasis on early detection
  o Detects spammers before the contents-based filter
- Short training periods
  o Highly adaptive to changing spamming patterns

Most relevant related work
- Lam & Yeung, CEAS 2007
  o Introduce "social-network"-based outgoing spam detection
  o Use the k-NN classifier
  o Relatively small dataset (ENRON)
  o Labeling based on simulated spammer accounts
- Tseng & Chen, CSE 2009
  o Use the same set of features
  o Use an SVM classifier
  o Larger, non-ESP dataset (University server)
  o Incremental model update
  o Labeling based on pure accounts
  o Account identification based on "from" header field

Comparison with data-sets of previous work

              Our data set      Our data set         NTU       Enron
              (in/out)          (outgoing)
#mails        9.86E7            2.13E8               2.86E6    5.17E5
#accounts     5.63E7            5.81E7               6.37E5    3.67E4
#edges        7.40E7            12.90E7              -         3.68E5
time period   4 days (in/out)   26 days (outgoing)   10 days   3.5 years

contents: spam & ham, ham

- Collected by a very large ESP
- Consists of incoming and outgoing log files
  o 4 days of bi-directional data + 22 days of outgoing traffic only
- Both incoming and outgoing messages are labeled as spam/ham by a content-based detector


Talk outline
- Preliminaries
- ErDOS: An Early Detection Scheme for Outgoing Spam
  o Computation Flow
  o Features
- Evaluation
- Conclusions and Future Work

The ErDOS detector: computation flow
1. Pre-processing: compute account feature values based on a single day of logs (output: feature values computed)
2. Determine accounts' classification (output: classified data set)
3. Undersampling: extract all spammers and an equal number of legitimate accounts as the training set (outputs: training set; remainder of accounts not in training set)
4. Build rotation forest model (output: classification model)
5. Assign account scores using the classification model (output: scored accounts)
6. Construct suspect accounts list of configurable size
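The undersampling and suspect-list steps of this flow can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names `undersample` and `suspects_list`, the dictionary inputs, and the fixed random seed are all assumptions made here, and the model-building step (the rotation forest) is deliberately left out.

```python
import random


def undersample(labels, seed=0):
    """Split accounts into a balanced training set and the remainder.

    labels: dict mapping account -> True if tagged as spammer (hypothetical format).
    Keeps all spammers and samples an equal number of legitimate accounts.
    """
    rng = random.Random(seed)
    spammers = sorted(a for a, is_spam in labels.items() if is_spam)
    legit = sorted(a for a, is_spam in labels.items() if not is_spam)
    # Cap the sample size in case there are fewer legitimate accounts than spammers.
    sampled_legit = rng.sample(legit, min(len(spammers), len(legit)))
    training = spammers + sampled_legit
    in_training = set(training)
    remainder = [a for a in sorted(labels) if a not in in_training]
    return training, remainder


def suspects_list(scores, size):
    """Return the `size` highest-scoring accounts (ties broken by account name)."""
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [account for account, _ in ranked[:size]]
```

In this sketch the scores passed to `suspects_list` would come from whatever classifier is trained on the balanced set; the slides name a rotation forest for that role.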


ErDOS features: IOR
- Legitimate users
  o Maintain social interactions
  o Often belong to mailing lists
- Spammers
  o Sent messages are seldom replied to
- An account's IOR = #incoming / #outgoing mails
- A low IOR is characteristic of spammers
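As a rough illustration of the IOR definition above, the ratio can be computed from a log of (sender, recipient) pairs. The log format and the function name `compute_ior` are assumptions for this sketch; accounts with no outgoing mail are simply skipped, which is a choice made here and not something the slides specify.

```python
from collections import Counter


def compute_ior(messages):
    """Compute IOR = #incoming / #outgoing for every account that sent mail.

    messages: iterable of (sender, recipient) pairs (hypothetical log format).
    """
    outgoing = Counter()
    incoming = Counter()
    for sender, recipient in messages:
        outgoing[sender] += 1
        incoming[recipient] += 1
    # Counter returns 0 for accounts that never received anything.
    return {acct: incoming[acct] / n_out for acct, n_out in outgoing.items()}
```

An account that sends many messages and receives none gets an IOR of 0, matching the slide's observation that a low IOR is characteristic of spammers.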

ErDOS features: IOR (cont'd)

ErDOS features: IOR versus CR
- Communication Reciprocity (CR)
  o Fraction of recipients who responded to an account's emails
  o Defined by Gomes et al.
- IOR is superior for short training periods
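For comparison with IOR, CR can be sketched over the same kind of log. This is a simplified reading of the definition: a recipient counts as having "responded" if they sent any message to the account within the log window, ignoring message order and threading. That approximation, the log format, and the name `communication_reciprocity` are assumptions of this sketch.

```python
from collections import defaultdict


def communication_reciprocity(messages):
    """Compute CR: fraction of an account's recipients who also mailed the account.

    messages: iterable of (sender, recipient) pairs (hypothetical log format).
    """
    recipients = defaultdict(set)
    for sender, recipient in messages:
        recipients[sender].add(recipient)
    cr = {}
    for account, rcpts in recipients.items():
        # A recipient "responded" if the account appears among that recipient's own recipients.
        responders = sum(1 for r in rcpts if account in recipients.get(r, ()))
        cr[account] = responders / len(rcpts)
    return cr
```

Unlike IOR, CR only credits mail coming from addresses the account itself wrote to, which may be why it needs longer observation windows to become informative.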

ErDOS features: IEBC
- IEBC (Internal/External Behaviour Consistency)
  o An account can send/receive emails to/from
    - Internal addresses (accounts hosted by the ESP)
    - External addresses
  o Legitimate accounts show correlation between internal and external IOR, spammers less so
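The two quantities that IEBC compares can be sketched as below. The slides do not say how the internal and external IORs are combined into a single feature, so this sketch only computes the pair and leaves the comparison open. The `is_internal` predicate, the log format, and returning `None` when an account has no outgoing mail in a direction are assumptions.

```python
from collections import Counter


def internal_external_ior(messages, account, is_internal):
    """Return (internal IOR, external IOR) for one account.

    messages: iterable of (sender, recipient) pairs (hypothetical log format).
    is_internal: function telling whether an address is hosted by the ESP (assumed).
    """
    counts = Counter()
    for sender, recipient in messages:
        if sender == account:
            counts["out_int" if is_internal(recipient) else "out_ext"] += 1
        if recipient == account:
            counts["in_int" if is_internal(sender) else "in_ext"] += 1

    def ratio(n_in, n_out):
        # Undefined when the account sent nothing in this direction.
        return n_in / n_out if n_out else None

    return (ratio(counts["in_int"], counts["out_int"]),
            ratio(counts["in_ext"], counts["out_ext"]))
```

A legitimate account would be expected to show similar values in both positions of the returned pair, while a spammer's values may diverge.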

ErDOS features: #outgoing messages
- Number of outgoing messages
  o Spamming accounts send more emails than legitimate accounts
  o Insufficient for detecting low-volume spammers

ErDOS: Sender Accounts' Characteristics
- A large fraction of spammers' incoming mail is spam!
  o Legitimate accounts seldom send emails to spamming accounts
  o Dictionary attacks may cause spammers to spam each other
- Analyse senders' characteristics


Accuracy for Single-Day training
- Evaluate accuracy attained for single-day logs
  o Accounts are classified based on the tags of the contents-based detector
  o True Positive (TP) and False Positive (FP) values are averaged over the available 4 days of bidirectional data
[Table: TP and FP for ErDOS, LY-knn, and MailNET]

Early detection evaluation
- Spamming accounts detected before the contents-based detector
  o Suspected by the detector, send messages tagged as spam only on later days
  o Evaluation uses all 26 days of data
- Early detection quality criteria:
  o e-Precision: fraction of early detected accounts out of the suspects list
  o Enrichment Factor (EF): ratio between the detector's e-Precision and that of a random accounts list
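The two quality criteria can be written out as a short sketch. It assumes that the e-Precision of a random accounts list equals the fraction of early-detectable accounts in the whole population; the function names and the set-based inputs are illustrative, and the case of a population with no early detections (division by zero) is not handled.

```python
def e_precision(accounts, early_detected):
    """Fraction of `accounts` that are early-detected spammers."""
    accounts = list(accounts)
    hits = sum(1 for a in accounts if a in early_detected)
    return hits / len(accounts)


def enrichment_factor(suspects, early_detected, population):
    """Ratio of the suspects list's e-Precision to that of a random list.

    The random-list e-Precision is taken to be the population base rate (assumption).
    """
    baseline = e_precision(population, early_detected)
    return e_precision(suspects, early_detected) / baseline
```

With made-up numbers, a 4-account suspects list containing 2 early detections, drawn from a population of 100 accounts of which 4 are early-detectable, has an e-Precision of 0.5 against a baseline of 0.04.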

Early detection
- Early detection results, averaged over 4 days:
[Table: rows #accounts, Early detections, e-Precision; columns ErDOS's suspects, Entire population. Surviving cell text: #accounts 100; Early detections 90.53]
- Prior art's early detection results compared to ErDOS:
[Table: e-Precision and EF for ErDOS, LY-knn, and MailNET]

Early detection (cont'd)
- e-Precision for varying suspects list lengths:
[Figure: e-Precision versus suspects list length]


Conclusions and Future Work
- Conclusions
  o Outgoing spam detection for ESPs has its own unique nature
  o Contents-based filtering is not enough
  o Early detection of spamming accounts can be achieved by combining a contents-based filter with a network-level-based detector
- Future Work
  o Enhancing ErDOS's early detection performance with additional features
  o An expert detector for low-volume spammers, based on ErDOS's computation flow and features