Spamscatter 1 Aug. 9th, 2007, USENIX Security 2007. Spamscatter: David S. Anderson, Chris Fleizach, Stefan Savage, and Geoffrey M. Voelker University of.

Presentation transcript:

Slide 1 (Introduction): Spamscatter: Characterizing Internet Scam Hosting Infrastructure
David S. Anderson, Chris Fleizach, Stefan Savage, and Geoffrey M. Voelker
University of California, San Diego
USENIX Security 2007, Aug. 9th, 2007

Slide 2 (Introduction): Motivation
- 70 billion spam messages are sent every day for a simple reason: to advertise websites.
- A scam, then, is any website marketed using spam.
- This online resource is directly implicated in the spam profit cycle, which makes it rarer and more valuable.
- Characterizing the scam infrastructure helps:
  - Reveal the dynamics and business pressures exerted on spammers
  - Identify means to reduce unwanted sites and spam

Slide 3 (Introduction): Spamscatter Approach
- Mine a large quantity of spam
  - Extract URLs
  - Probe machines hosting the scams
- This works because URLs must be correct
  - Follow the scent of money…
- All we need is a reliably large source of spam
  - We have access to a four-letter, top-level domain producing 150K spam messages per day
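The "extract URLs" step can be sketched as follows. This is a minimal illustration, not the authors' extractor: the regular expression `URL_RE` and the helper `extract_hosts` are hypothetical names, and the pattern only handles plain http/https links in a message body.

```python
import re
from urllib.parse import urlsplit

# Hypothetical pattern: an http(s) URL runs until whitespace or a common delimiter.
URL_RE = re.compile(r"https?://[^\s<>\"')]+", re.IGNORECASE)

def extract_hosts(message: str) -> set[str]:
    """Return the distinct hostnames linked from a spam message body."""
    hosts = set()
    for url in URL_RE.findall(message):
        try:
            host = urlsplit(url).hostname  # hostname is returned lowercased
        except ValueError:
            continue  # skip malformed URLs instead of failing the whole message
        if host:
            hosts.add(host)
    return hosts

if __name__ == "__main__":
    body = 'Cheap software at http://Deals.example.com/buy or <a href="http://deals.example.com/x">here</a>'
    print(extract_hosts(body))
```

Collecting hostnames rather than full URLs reflects the slide's goal of probing the machines behind the links; a real extractor would also need to decode MIME parts and obfuscated links, which this sketch ignores.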

Slide 4 (Introduction): Understanding scams
- Are scams distributed across different servers?
- Do different scams share the same server?
- How long do scams stay active?
- How reliable is their hosting?
- Where are scam servers located?
- Why is it useful to study these characteristics?

Slide 5 (Methodology): Spamscatter and the Scam

Slide 6 (Methodology): Methodology
- Data collection
  - Extract links from large spam feed
  - Probe links every 3 hours for 7 days
  - Record browser redirection
  - Save screenshots
- Analysis
  - Identify scams across servers and domains
  - Report on distributed and shared infrastructure, lifetime, stability, and location
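To make the probing cadence concrete, here is a small sketch of the schedule implied by "every 3 hours for 7 days". The function name `probe_times` and the choice to start at the moment a link is first seen and to exclude the final endpoint are assumptions, not details from the slides.

```python
from datetime import datetime, timedelta

def probe_times(first_seen: datetime, interval_hours: int = 3, days: int = 7) -> list[datetime]:
    """List the probe times for one link: every `interval_hours` for `days` days."""
    step = timedelta(hours=interval_hours)
    end = first_seen + timedelta(days=days)
    times = []
    t = first_seen
    while t < end:  # assumption: the probe exactly at the 7-day mark is not included
        times.append(t)
        t += step
    return times

if __name__ == "__main__":
    schedule = probe_times(datetime(2006, 1, 1, 0, 0))  # arbitrary illustrative start time
    print(len(schedule), schedule[0], schedule[-1])
```

Under these assumptions each link is probed 56 times (7 days x 8 probes per day).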

Slide 7 (Methodology): Identifying Scams
- Goal: identify multiple hosts in the same scam, since many scams are spread across different IPs and domain names
- Naïve approaches:
  1. Correlate independent spam e-mails
  2. Use HTML content returned from the web server
- Limitations:
  - Spam has too much chaff and obfuscation
  - HTML is uninteresting and mostly composed of images
  - Web crawlers fail with frames, iframes, and JavaScript

Slide 8 (Methodology): Image Shingling
- Solution: use rendered screenshots of web pages for correlation
  - How to compare upwards of 10,000 images?
- Image shingling, based on the text shingling idea [BRO97]
  - Fragment images into blocks and hash the blocks
  - Two images are similar if T% of the hashed blocks are the same (T = 70-80%)
  - Shingling allows us to essentially compare all images in O(N lg N)
  - Resilient to small variations among images
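The block-hash comparison described on this slide can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' implementation: images are modeled as 2D lists of pixel values, the block size, the SHA-256 hash, and the "shared blocks over the larger block set" similarity measure are all choices made here, and the names `block_hashes`, `similarity`, and `same_scam` are hypothetical. It compares one pair directly and does not reproduce the O(N lg N) all-pairs comparison.

```python
import hashlib

def block_hashes(pixels, block=2):
    """Split a 2D grid of pixel values into block x block tiles and hash each tile."""
    hashes = set()
    rows, cols = len(pixels), len(pixels[0])
    for r in range(0, rows, block):
        for c in range(0, cols, block):
            tile = tuple(tuple(row[c:c + block]) for row in pixels[r:r + block])
            hashes.add(hashlib.sha256(repr(tile).encode()).hexdigest())
    return hashes

def similarity(a, b):
    """Fraction of hashed blocks shared by two images (assumed measure)."""
    if not a or not b:
        return 0.0
    return len(a & b) / max(len(a), len(b))

def same_scam(img_a, img_b, threshold=0.7):
    """Treat two screenshots as the same scam if enough hashed blocks match."""
    return similarity(block_hashes(img_a), block_hashes(img_b)) >= threshold

if __name__ == "__main__":
    base = [[1, 1, 2, 2], [1, 1, 2, 2], [3, 3, 4, 4], [3, 3, 4, 4]]
    variant = [[1, 1, 2, 2], [1, 1, 2, 2], [3, 3, 9, 9], [3, 3, 9, 9]]  # one block differs
    print(same_scam(base, variant))
```

Because only whole-block hashes are compared, a small localized change (one differing block out of four in the example) leaves most hashes intact, which is the resilience property the slide mentions.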

Slide 9 (An Example Scam): An Example Scam: "Downloadable Software"
Scam perspective:
- 99 observed virtual hosts
- 3 IP addresses
- Operated for months
- 85 senders
- No forwarding used
- 5535 probes (97% successful)

Slide 10 (An Example Scam): Clustering with Image Shingling
- Images differ slightly
- Some pages rotate content

Slide 11 (An Example Scam): Location
- 2 web servers in China; 1 web server in Russia
- 85 senders from 30 countries (28 from US)
- Map legend:
  - Blue: web servers hosting "Downloadable Software"
  - Red: spam relays (hosts that sent us spam)

Slide 12 (An Example Scam): Shared Infrastructure
- One of the IPs ( ) hosting "Downloadable Software" was also hosting "Toronto Pharmacy"
- Server located in Guangzhou, China

Slide 13 (Results): Summary Statistics
- 1 week of spam collection: Nov. 28th to Dec. 4th
- 2 weeks of probing: Nov. 28th to Dec. 11th

Some counts are truncated in this transcript and are shown as-is.

Stage                           Count      Share of previous stage
Spam messages                   1,087,
Contain links                   ,700       30%
Distinct links                  36,390     11.3%
Resolve to unique IP addresses  7,029      19.3%
Resolve to distinct scams       2,         %
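As a quick arithmetic check on the funnel, the one percentage whose two inputs both survive intact in this transcript can be recomputed: 7,029 unique IP addresses out of 36,390 distinct links. The helper name `share` is hypothetical.

```python
def share(part: int, whole: int) -> float:
    """Return `part` as a percentage of `whole`, rounded to one decimal place."""
    return round(100 * part / whole, 1)

if __name__ == "__main__":
    distinct_links = 36_390
    unique_ips = 7_029
    print(share(unique_ips, distinct_links))  # matches the 19.3% stated on the slide
```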

Slide 14 (Results - Infrastructure): Distributed Infrastructure
To what extent is the infrastructure distributed for scams?
- Most scams are not distributed:
  - 94% used one IP
- Top three distributed scams were extensive
  - 22, 30, and 45 IPs
- Top three virtual-hosted scams
  - 110, 695, and 3029 domain names

Slide 15 (Results - Infrastructure): Shared Infrastructure
To what extent do multiple scams share infrastructure?
- 38% of scams were hosted on a machine with at least one other scam
- 10 IPs hosted 10 or more scams
- Top three shared IPs
  - 15, 18, and 22 scams

Slide 16 (Results - Lifetime): Scam Lifetime & Stability
How long are scams active, and how reliable are the hosts?
- Scam web hosts seem to be taken down shortly after scams disappear
- Overall scam lifetime approached two weeks
- Reliability is high: usually > 97%

Slide 17 (Results - Lifetime): Spam campaign lifetime
How long do spam campaigns last for a scam?
- 137 spam messages per scam (average)
- Most spam campaigns are relatively short
  - 88% last 20 hours or less
- Only 8% last more than 2 days
- Scam lifetimes are considerably longer
  - on average one week
[Figure annotations: < 20 hours, < 2 days]

Slide 18 (Results - Location): Location
Where are scam hosting servers located?
- Map legend:
  - Blue: web servers
  - Red: spam relays

Slide 19 (Results - Location): Location

Web Servers
Rank  Country  Count  Percent
1     usa      5884   57.40%
2     chn      741    7.23%
3     can      379    3.70%
4     gbr      315    3.07%
5     fra      314    3.06%
6     deu      258    2.52%
7     rus      185    1.80%
8     kor      181    1.77%

Spam Relays
Rank  Country  Count  Percent
1     usa      54159  14.50%
2     fra      26371  7.06%
3     esp      25196  6.75%
4     chn      24833  6.65%
5     pol      21199  5.68%
6     ind      20235  5.42%
7     deu      18678  5.00%
8     kor      17446  4.67%

Slide 20 (Results - Categorization): Scam Categorization

Scam category                  % of scams
Uncategorized                  %
Information Technology         16.67%
Dynamic Content                %
Business and Economy           6.23%
Shopping                       4.30%
Financial Data and Services    %
Illegal or Questionable        2.15%
Adult                          1.80%
Message Boards and Clubs       1.80%
Web Hosting                    1.63%

Slide 21 (Results - Categorization): Lifetime of scams with Categorization
- More than 40% of malicious scams disappear before 120 hours
- The same is true for less than 15% of all scams

Slide 22 (Conclusion): Summary
- Started with over 1 million spam messages and coalesced them to fewer than 2,500 scams
- Image shingling allowed us to scalably determine whether two sites were part of the same scam
- Most scams use one web server (vulnerable to blacklisting)
  - Scams may use many virtual domains that point to one IP
- Most scams are not malicious per se
- Scam infrastructure is more stable, longer lived, and more concentrated in the US than spam senders

Slide 23 (Conclusion): Questions and Answers
"Spammers beware; These boffins are on the prowl"

Slide 24 (Supplementary Information): Spamscope Visibility
- Collected spam from news.admin.net-abuse.sightings, a newsgroup for contributing spam
- For a 3-day period, we saw:
  - 6,977 spam messages from the newsgroup, yielding 205 scams
  - 113,216 spam messages from our feed, yielding 1,687 scams
- 12% of the newsgroup scams were in ours
- The "largest" scams (most e-mails and most domains/IP) were seen in both feeds

Slide 25 (Results - Blacklisting): Blacklists

Host type   Classification  % of hosts
Spam relay  Open proxy      72.27%
Spam relay  Spam host       5.86%
Scam host   Open proxy      2.06%
Scam host   Spam host       14.86%

- 9.7% of the scam hosts also sent us spam

Slide 26 (Supplementary Information): Web Server OS

Rank  Operating system           Percent
1     Linux recent 2.4 (1)       11.97%
2     Windows 2000 (SP1+)        11.05%
3     Akamai ???                 10.86%
4     Windows 2000 SP4           8.25%
5     Linux recent 2.4 (2)       7.84%
6     FreeBSD                    %
7     Slashdot or BusinessWeek   7.04%
8     FreeBSD                    %
9     Windows XP SP1             5.90%
10    Linux older                %

Slide 27 (Supplementary Information): URL Classification

WISP category                                Percent
WISP Dynamic Content                         %
WISP Uncategorized                           %
WISP Illegal or Questionable                 %
WISP Information Technology                  9.051%
WISP Shopping                                4.872%
WISP Business and Economy                    4.733%
WISP Financial Data and Services             4.626%
WISP Personals and Dating                    1.867%
WISP Advertisements                          1.249%
WISP Educational Institutions                1.247%
WISP Pay-to-Surf                             1.022%
WISP Search Engines and Portals              0.884%
WISP Supplements and Unregulated Compounds   0.865%
WISP Sex                                     0.862%

Slide 28 (Supplementary Information): Image Clustering
- 1 week of spam collection: Nov. 28th to Dec. 4th
- 2 weeks of probing: Nov. 28th to Dec. 11th
- Total probes: 2,541, (the remaining digits and the counts for the next two stages are missing from this transcript)
- 9.8% of probes result in a captured image
- 3.8% of screenshots are the 'first' screenshot for a scam (clusters detected by image shingling)

Slide 29 (Supplementary Information): Image Shingling
- For a typical day of screenshots, we tested various thresholds
- A 70% threshold provided a good balance between flexibility and accuracy

Slide 30 (Supplementary Information): Overlap of pairs of scams on the same server
For scams running on the same server, how much time do they overlap?
- 96% of all scam pairs overlapped with each other when they remained active
- Only 10% of scams fully overlapped each other
[Figure annotation: One week]

Slide 31 (Supplementary Information): IP ranges
What are the network locations of scams and spam relays?
- The cumulative distribution of IP addresses is highly non-uniform
- Majority of spam relays (60%) fall between 58.* -> 91.*
- Most scams (50%) fall between 64.* -> 72.*
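A range statement such as "between 64.* -> 72.*" can be computed by looking at the first octet of each address. The sketch below is only an illustration: the function name `fraction_in_octet_range` is hypothetical, the range is assumed to be inclusive at both ends, and the sample addresses are made up rather than taken from the study.

```python
import ipaddress

def fraction_in_octet_range(addresses, low, high):
    """Fraction of IPv4 addresses whose first octet lies in [low, high]."""
    if not addresses:
        return 0.0
    # Shifting the 32-bit integer form right by 24 bits leaves the first octet.
    first_octets = [int(ipaddress.IPv4Address(a)) >> 24 for a in addresses]
    hits = sum(1 for octet in first_octets if low <= octet <= high)
    return hits / len(first_octets)

if __name__ == "__main__":
    sample = ["64.1.2.3", "72.9.9.9", "10.0.0.1", "200.1.1.1"]  # made-up addresses
    print(fraction_in_octet_range(sample, 64, 72))
```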