Spam
Sagar Vemuri
Slides courtesy: Anirudh Ramachandran, Nick Feamster

2 Agenda
Understanding Spam
  – What is Spam?
  – Statistics
  – Types of Spam
  – Spamming Methods
  – Spam Mitigation Methods
Understanding the Network-Level Behavior of Spammers
  – Data Collection Methods
  – Statistics
  – BGP Spectrum Agility, Botnets, Harvesting
  – Drawbacks

3 What is Spam?
Unsolicited commercial message
“Spam is email that is both unsolicited by the recipient and sent in substantively identical form to many recipients”
As of the last quarter of 2005, estimates indicate that about 80-85% of all email is spam
Microsoft founder Bill Gates receives four million emails per year, most of them spam

4 Some statistics
An email spam is sent to 600 addresses
First large-scale spam sent to 6000 newsgroups, reaching millions of people
(June) 30 billion per day
(June) 55 billion per day
(December) 85 billion per day
(February) 90 billion per day

5 Products advertised
Porn site subscriptions
Prescription drugs
Printer ink cartridges
Counterfeit software
Mortgage offers
Fake diplomas from non-existent or non-accredited universities

6 Types of Spam
Email spam
IM spam
  – Also called ‘Spim’
  – 1.2 billion spam IM messages in 2004
SMS spam
  – Also called ‘m-spam’
Image spam
  – Text of a message stored as GIF or JPEG and displayed in the email
  – Prevents text-based spam filters from detecting it

7 Spamming Methods
Direct spamming
  – By purchasing upstream connectivity from “spam-friendly ISPs”
Open relays and proxies
  – Mail servers that allow unauthenticated Internet hosts to connect and relay mail through them
Botnets
  – Collection of machines acting under one centralized controller, e.g. Bobax
BGP Spectrum Agility
  – IP hijacking techniques

8 Spam Mitigation
Filtering
  – Based on content
  – Uses features in the email’s headers and body
  – E.g. SpamAssassin
Blacklisting
  – IP addresses of known spam sources are used to classify email
  – More than 30 widely used blacklists available today

9 Content-based Filtering
Content-based properties are malleable
  – Low cost of evasion: spammers can easily alter features of an email’s content
  – Customization: customized emails are easy to generate
  – High cost to filter maintainers: filters must be continually updated as content-changing techniques become more sophisticated
Content-based filters are applied at the destination
  – Too little, too late: wasted network bandwidth, storage, etc. Many users receive (and store) the same spam content

10 DNS Blacklisting
Aggressive filters have many false positives
One list might not have all the information about spamming IPs
Need to consult multiple lists
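Consulting multiple lists can be sketched in a few lines. A DNSBL is conventionally queried by reversing the octets of the IPv4 address and appending the list's zone name; a listed address resolves, an unlisted one does not. The zone names and the `is_listed` callback below are hypothetical placeholders, not lists or code used in the study. In practice `is_listed` might wrap a DNS lookup and treat a resolution failure as "not listed".

```python
import ipaddress

def dnsbl_query_name(ip: str, zone: str) -> str:
    """Build the DNS name to look up for an IPv4 address in a DNSBL zone."""
    octets = str(ipaddress.IPv4Address(ip)).split(".")
    return ".".join(reversed(octets)) + "." + zone

def count_listings(ip, zones, is_listed):
    """Count how many blacklists list the address.

    is_listed is a caller-supplied function (an assumption of this sketch)
    that takes a query name and returns True if the DNSBL lists it.
    """
    return sum(1 for zone in zones if is_listed(dnsbl_query_name(ip, zone)))

# Offline demonstration with made-up zones and a fake resolver.
zones = ["bl-a.example", "bl-b.example", "bl-c.example"]
fake_listed = {"4.3.2.1.bl-a.example", "4.3.2.1.bl-c.example"}
hits = count_listings("1.2.3.4", zones, fake_listed.__contains__)
print(hits)  # 2
```

Passing the resolver in as a function keeps the sketch testable without network access.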

11 Network-level Spam Filtering
Network-level properties are harder to change than content
Network-level properties
  – IP addresses and IP address ranges (prevalence)
  – Change of addresses over time (persistence)
  – Distribution according to operating system, country and AS
  – Characteristics of botnets and short-lived route announcements
Help develop better spam filters

12 Spamming Patterns
Network-level properties of spam arrival
  – From where? What IP address space? ASes? What OSes?
  – What techniques? Botnets, short-lived route announcements, shady ISPs
  – Capabilities and limitations? Bandwidth, size of botnet army

Understanding the Network-Level Behavior of Spammers
Anirudh Ramachandran, Nick Feamster (Georgia Tech)

14 Data Collection
Primary dataset: actual spam messages collected at a large spam sinkhole
Corpus of logs from a large email provider
Command and control traffic from a Bobax botnet
BGP route advertisements from an upstream border router in the same network
Also captured traceroutes, DNSBL results and passive TCP host fingerprints simultaneously with spam arrival

15 Data Collection Setup [figure]

16 Data collected when the spam is received
IP address of the relay that established the SMTP connection to the sinkhole
Traceroute to that IP address, to help estimate the network location of the mail relay
Passive “p0f” TCP fingerprint, to determine the OS of the mail relay
Result of DNS blacklist (DNSBL) lookups for that mail relay at eight different DNSBLs

17 MailAvenger
Highly configurable SMTP server that collects many useful statistics

18 Spam per Day
Both the amount of spam and the number of distinct IP addresses increase over time

19 IP Address Distribution
The majority of spam is sent from a relatively small fraction of IP address space
The distribution is the same for legitimate mail

20 AS distribution
A large fraction of spam is received from just a handful of ASes
12% of all received spam originates in just two ASes (from Korea and China)
The top 20 ASes are responsible for sending nearly 37% of all spam
Spam filtering efforts might be more effective if focused on identifying high-volume, persistent groups of spammers by AS number rather than on blacklisting individual IP addresses
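A figure such as "the top 20 ASes send nearly 37% of all spam" is a top-k share over per-AS message counts. The sketch below shows one way to compute it, assuming each received message has already been mapped to an origin AS number; the AS numbers and counts are illustrative values, not data from the study.

```python
from collections import Counter

def top_k_share(as_numbers, k):
    """Fraction of messages sent from the k most active ASes."""
    counts = Counter(as_numbers)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    top = sum(n for _, n in counts.most_common(k))
    return top / total

# One origin AS number per received message (illustrative values only).
origin_as = [64500] * 5 + [64501] * 3 + [64502, 64503]
share = top_k_share(origin_as, 2)
print(share)  # 0.8
```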

21 Distribution across ASes
Still, about 40% of spam comes from the U.S.

22 Distribution Across Operating Systems
About 4% of known hosts are non-Windows
These hosts are responsible for about 8% of received spam

23 Persistence
More than half of the client IPs appear fewer than twice
85% of the client IP addresses sent fewer than 10 emails to the sinkhole
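The persistence figures above are fractions of distinct client IPs whose message count falls below a threshold. A minimal sketch, assuming the sinkhole log has been reduced to one client IP per received message (the addresses below are documentation-range placeholders):

```python
from collections import Counter

def fraction_below(client_ips, threshold):
    """Fraction of distinct client IPs that sent fewer than `threshold` messages."""
    per_ip = Counter(client_ips)
    if not per_ip:
        return 0.0
    low = sum(1 for n in per_ip.values() if n < threshold)
    return low / len(per_ip)

# One entry per received message.
log = ["192.0.2.1", "192.0.2.1", "192.0.2.2", "192.0.2.3", "192.0.2.4"]
print(fraction_below(log, 2))   # 0.75: three of four IPs appear only once
print(fraction_below(log, 10))  # 1.0
```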

24 Effectiveness of Blacklists
Nearly 80% of all spam was received from mail relays that appear in at least one of eight blacklists
More than 50% of spam came from relays listed in two or more blacklists
If spammers use BGP spectrum agility, then 50% of the IP addresses do not appear in any blacklist
About 30% appear in more than one blacklist
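Statements of the form "listed in at least k blacklists" reduce to a simple coverage calculation once each message is annotated with the number of blacklists that listed its relay. The sketch below assumes that annotation already exists; the counts are made up for illustration.

```python
def coverage(listing_counts, min_lists):
    """Fraction of messages whose relay is listed in at least min_lists blacklists."""
    if not listing_counts:
        return 0.0
    covered = sum(1 for n in listing_counts if n >= min_lists)
    return covered / len(listing_counts)

# Number of blacklists listing the relay, one entry per message (illustrative).
listings = [0, 1, 2, 3, 0]
print(coverage(listings, 1))  # 0.6
print(coverage(listings, 2))  # 0.4
```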

25 Effectiveness of Blacklists

26 Effectiveness of Blacklists

27 Spam From Botnets

28 Most Bot IP Addresses Do Not Return
65% of bots send mail to a domain only once over 18 months
Collaborative spam filtering seems to be helping track bot IP addresses
[Figure: percentage of bots vs. lifetime (seconds)]

29 Most Bots Send Low Volumes of Spam
Most bot IP addresses send very little spam, regardless of how long they have been spamming
[Figure: amount of spam vs. lifetime (seconds)]

30 BGP Spectrum Agility
Log IP addresses of SMTP relays
Correlate with BGP route advertisements seen at the network where the spam trap is co-located
A small club of persistent players appears to be using this technique
Common short-lived prefixes and ASes: / / /
~10 minutes
Somewhere between 1-10% of all spam (some clearly intentional, others might be flapping)
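The correlation step can be sketched as follows: keep the route announcements that were withdrawn within a short window, then check which spam arrivals came from an address inside such a prefix while it was announced. The `Epoch` record, the 600-second cutoff (taken loosely from the "~10 minutes" above) and the sample data are assumptions of this sketch, not the study's actual tooling, and a real analysis would also need longest-prefix matching against the full routing table.

```python
import ipaddress
from dataclasses import dataclass

@dataclass
class Epoch:
    prefix: str       # announced prefix, e.g. "198.51.100.0/24"
    announced: float  # announcement time, in seconds
    withdrawn: float  # withdrawal time, in seconds

def short_lived(epochs, max_seconds=600):
    """Announcements that lasted no longer than max_seconds."""
    return [e for e in epochs if e.withdrawn - e.announced <= max_seconds]

def spam_during_short_epochs(spam_events, epochs, max_seconds=600):
    """Return (ip, time) arrivals that fall inside a short-lived announcement."""
    candidates = short_lived(epochs, max_seconds)
    matches = []
    for ip, t in spam_events:
        addr = ipaddress.ip_address(ip)
        for e in candidates:
            in_prefix = addr in ipaddress.ip_network(e.prefix)
            if in_prefix and e.announced <= t <= e.withdrawn:
                matches.append((ip, t))
                break
    return matches

# Illustrative data using documentation prefixes.
epochs = [Epoch("198.51.100.0/24", 1000, 1500), Epoch("203.0.113.0/24", 0, 100000)]
spam = [("198.51.100.7", 1200), ("198.51.100.9", 5000), ("203.0.113.5", 50)]
print(spam_during_short_epochs(spam, epochs))  # [('198.51.100.7', 1200)]
```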

31 Why Such Big Prefixes?
Flexibility: client IPs can be scattered throughout dark space within a large /8
  – Same sender usually returns with different IP addresses
Visibility: route typically won’t be filtered (nice and short)

32 Characteristics of IP-Agile Senders
IP addresses are widely distributed across the /8 space
IP addresses typically appear only once at the sinkhole
Depending on which /8, 60-80% of these IP addresses were not reachable by traceroute when spot-checked
Some IP addresses were in allocated, albeit unannounced, space
Some AS paths associated with the routes contained reserved AS numbers

33 Length of short-lived BGP epochs

34 The Effectiveness of Blacklisting
~80% listed on average
~95% of bots listed in one or more blacklists
Only about half of the IPs spamming from short-lived BGP routes are listed in any blacklist
Spam from IP-agile senders tends to be listed in fewer blacklists
[Figure: fraction of all spam received vs. number of DNSBLs listing this spammer]

35 Harvesting
Tracking Web-based harvesting
  – Register domain, set up MX record
  – Post, link to page with randomly generated email addresses
  – Log requests
  – Wait for spam
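The tracking steps above can be illustrated with a small in-memory sketch: each page request is answered with a fresh random address, and the request is logged against that address so that spam arriving later can be traced back to the crawler that harvested it. The `HarvestTrap` class, its method names and the `trap.example` domain are hypothetical; the slides do not describe the actual implementation.

```python
import secrets

class HarvestTrap:
    """Issue one unique address per page request and remember who fetched it."""

    def __init__(self, domain):
        self.domain = domain
        self.issued = {}  # address -> (client_ip, timestamp)

    def serve_address(self, client_ip, timestamp):
        # Random local part, so each address is seen by exactly one request.
        address = f"{secrets.token_hex(8)}@{self.domain}"
        self.issued[address] = (client_ip, timestamp)
        return address

    def trace(self, recipient):
        """Return the (client_ip, timestamp) that harvested this address, if any."""
        return self.issued.get(recipient)

trap = HarvestTrap("trap.example")
addr = trap.serve_address("192.0.2.10", 100.0)
print(trap.trace(addr))  # ('192.0.2.10', 100.0)
```

Comparing the logged crawler IP with the relay that later delivers spam to the address is what allows observations such as the harvester and the spammer sitting in different ASes.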

36 Harvesting
Domain was registered on November 19, 2005
SMTP server was set up on December 6; harvesting occurred on January 16, 2006
First spam came on January 20, 2006 (phishing attack)
The harvester and the spammers were not in the same AS
Attack was coordinated between two machines
  – One machine sent to half of the addresses listed alphabetically, the other machine to the other half

37 Spam Mitigation
Spam filtering requires a better notion of host identity
  – IP address is not enough to identify a host
IP address range based filtering is more effective than single IP address based filtering
  – Some IP address ranges send more spam than others
Securing Internet routing is necessary for bolstering identity and traceability of email senders
  – Without it, the BGP spectrum agility method can be used more
Network-level properties can make current spam filters more effective
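Range-based filtering can be sketched by aggregating spam sources into fixed-length prefixes and flagging prefixes that exceed a message threshold. The /24 granularity and the threshold are arbitrary choices for this sketch, not values recommended in the slides.

```python
import ipaddress
from collections import Counter

def spam_by_prefix(ips, prefix_len=24):
    """Count spam messages per covering prefix of the given length."""
    counts = Counter()
    for ip in ips:
        net = ipaddress.ip_network(f"{ip}/{prefix_len}", strict=False)
        counts[str(net)] += 1
    return counts

def blocked_prefixes(ips, min_messages, prefix_len=24):
    """Prefixes that sent at least min_messages spam messages."""
    per_prefix = spam_by_prefix(ips, prefix_len)
    return {p for p, n in per_prefix.items() if n >= min_messages}

# Each address appears once, so a per-IP threshold of 3 would flag nothing.
senders = ["198.51.100.1", "198.51.100.2", "198.51.100.3", "203.0.113.9"]
print(blocked_prefixes(senders, 3))  # {'198.51.100.0/24'}
```

The example shows why ranges can help: senders that each appear only once still add up within their shared prefix.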

38 Conclusion
A detailed study examining network-level properties
Reveals botnet characteristics in sending spam
Shows the existence of the BGP spectrum agility method
Datasets are substantial, but not comprehensive
  – Comparison between spam and legitimate mail is questionable
  – The comparison uses legitimate mail of a single domain; would repeating it across several domains be better?
  – Analysis of IP addresses and address ranges fails to draw important conclusions
Does not analyze other types of spam, apart from email spam
Data analysis from a single vantage point