Heat-seeking Honeypots: Design and Experience John P. John, Fang Yu, Yinglian Xie, Arvind Krishnamurthy and Martin Abadi WWW 2011 Presented by Elias P.

Presentation transcript:

Heat-seeking Honeypots: Design and Experience
John P. John, Fang Yu, Yinglian Xie, Arvind Krishnamurthy and Martin Abadi
WWW 2011
Presented by Elias P. Papadopoulos

Compromising Web Servers
- Compromised servers host phishing and malware pages and redirect user traffic to malicious sites
- Almost 90% of Web attacks take place through legitimate sites that have been compromised
- Over 50% of popular search keywords have at least one malicious link to a compromised site
- Compromised servers can communicate with clients behind NATs and firewalls

Honeypots
A honeypot is a computer security mechanism set to detect, deflect, or counteract attempts to gain unauthorized access to information systems.
- Client-based: detect malicious servers that attack clients
- Server-based: emulate vulnerable services/software and passively wait for attackers

Heat-seeking Honeypots
1. Actively attract attackers
2. Dynamically generate and deploy honeypot pages
3. Advertise honeypot pages to attackers via search engines
4. Analyze the honeypot logs to identify attack patterns

Heat-seeking Honeypot Architecture

Attacker queries
How attackers find vulnerable Web servers:
1. Brute-force port scanning on the Internet
2. Make use of search engines
Identify malicious queries in the Bing log, e.g. “phpizabi v0.848b c1 hfp1”

Creation of Honeypot Pages
Deployment:
- Issue the attacker queries to search engines (Bing and Google) and take the top three results
- The crawler fetches the Web pages at these URLs
- Strip all JavaScript content and rewrite all links on the page to point to the local copy
- Additionally, install a few common Web applications, using a different VM for each app
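
The page-sanitizing step can be sketched in Python as below. This is a minimal illustration and not the authors' tool: the class name `HoneypotPageRewriter`, the function `rewrite_page`, and the `/honeypot` prefix are hypothetical, and the sketch assumes that dropping `<script>` elements and inline `on*` handlers is enough to remove JavaScript from a fetched page.

```python
import html
from html.parser import HTMLParser
from urllib.parse import urlparse


class HoneypotPageRewriter(HTMLParser):
    """Re-emits a page without scripts and with every href made local."""

    def __init__(self, local_prefix):
        super().__init__()
        self.local_prefix = local_prefix  # hypothetical local mount point
        self.out = []
        self.in_script = 0

    def _local(self, url):
        # Keep only the path and query, then serve them under the local prefix.
        parsed = urlparse(url)
        path = parsed.path or "/"
        if not path.startswith("/"):
            path = "/" + path
        local = self.local_prefix + path
        return local + ("?" + parsed.query if parsed.query else "")

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self.in_script += 1
            return
        if self.in_script:
            return
        parts = [tag]
        for name, value in attrs:
            if name.startswith("on"):  # inline handlers such as onclick
                continue
            if value is None:
                parts.append(name)
                continue
            if name == "href":
                value = self._local(value)
            parts.append(f'{name}="{html.escape(value, quote=True)}"')
        self.out.append("<" + " ".join(parts) + ">")

    def handle_endtag(self, tag):
        if tag == "script":
            self.in_script = max(0, self.in_script - 1)
        elif not self.in_script:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if not self.in_script:
            self.out.append(html.escape(data, quote=False))


def rewrite_page(html_text, local_prefix="/honeypot"):
    parser = HoneypotPageRewriter(local_prefix)
    parser.feed(html_text)
    parser.close()
    return "".join(parser.out)
```

Comments and doctype declarations are dropped by this sketch; a real deployment would likely need to preserve more of the original page so that it still matches the attacker's search query.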

Advertising Honeypot Pages
- Submit the URLs of the honeypot pages to the search engines and wait for the crawlers to visit them
- Increase the chance of honeypot pages being returned in search results by boosting their PageRank
- Add hidden links (not visible to regular users) pointing to the honeypot pages on other public Web sites

Detecting Malicious Traffic
Process the visitor log and automatically extract attack traffic:
1. Identify crawlers
   - Well-known crawlers: Google’s crawler uses the user agent Mozilla/5.0 (compatible;Googlebot/2.1;+
   - Characterize the behavior of known crawlers
   - Identify unknown crawlers
2. Identify malicious traffic

Identifying Crawlers 1/2
Known crawlers
- Look at the user agent string and verify that the IP address matches the organization
- A single search engine uses multiple IP addresses to crawl different pages, so IPs are grouped by AS
- Most crawlers visit static links
- Only one crawler visits dynamic links
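
A minimal sketch of the known-crawler check, assuming a lookup table from user-agent token to the networks owned by that organization. The table `KNOWN_CRAWLERS` and the function `is_verified_crawler` are hypothetical, and the address ranges are reserved documentation prefixes, not real crawler ranges.

```python
import ipaddress

# Hypothetical table: user-agent token -> networks assumed to belong to the operator.
# These are documentation-only prefixes used purely for illustration.
KNOWN_CRAWLERS = {
    "Googlebot": [ipaddress.ip_network("192.0.2.0/24")],
    "bingbot": [ipaddress.ip_network("198.51.100.0/24")],
}


def is_verified_crawler(user_agent, ip):
    """True only if the user agent names a known crawler AND the IP matches it."""
    addr = ipaddress.ip_address(ip)
    for token, networks in KNOWN_CRAWLERS.items():
        if token.lower() in user_agent.lower():
            return any(addr in net for net in networks)
    return False
```

A visitor that claims a crawler user agent from an unrelated address fails the check, which is the case the IP verification is meant to catch.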

Identifying Crawlers 2/2
Unknown crawlers
- Other IPs are also grouped by AS number
- They show behavior similar to the known crawlers
- Threshold: K = |P| / |C|, the fraction of crawlable pages visited (P: pages visited by an AS, C: crawlable pages)
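
The threshold K = |P| / |C| can be computed per AS roughly as follows. This is a sketch under assumptions: the function names are hypothetical, visits are assumed to arrive as (AS number, page) pairs, and the 0.75 cutoff is chosen only for illustration.

```python
from collections import defaultdict


def crawl_fraction_by_as(visits, crawlable_pages):
    """Return K = |P| / |C| for each AS, given (as_number, page) visit pairs."""
    crawlable = set(crawlable_pages)
    if not crawlable:
        return {}
    pages_by_as = defaultdict(set)
    for asn, page in visits:
        if page in crawlable:  # ignore requests for pages outside C
            pages_by_as[asn].add(page)
    return {asn: len(pages) / len(crawlable) for asn, pages in pages_by_as.items()}


def likely_crawlers(visits, crawlable_pages, threshold=0.75):
    """ASes whose K meets the (illustrative) threshold."""
    fractions = crawl_fraction_by_as(visits, crawlable_pages)
    return {asn for asn, k in fractions.items() if k >= threshold}
```

For example, an AS that fetched three of four crawlable pages gets K = 0.75 and is treated as a crawler under this cutoff, while an AS that fetched a single page is not.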

Identifying Malicious Traffic
- Attackers do not target static pages
- They try to access non-existent or private files
- Whitelist: all the dynamic and static links, for each site
  - The exact set of links present in the honeypots
  - Files visited by well-behaved crawlers (robots.txt)
- Requests for links not contained in the whitelist are treated as attack traffic
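
The whitelist test reduces to a set-membership check. The sketch below assumes requests are compared by exact path; `build_whitelist` and `attack_requests` are hypothetical names, not part of the presented system.

```python
def build_whitelist(honeypot_links, crawler_visited_files):
    """Whitelist = links present in the honeypot plus files fetched by well-behaved crawlers."""
    return set(honeypot_links) | set(crawler_visited_files)


def attack_requests(requested_paths, whitelist):
    """Keep only the requests whose path is not in the whitelist."""
    return [path for path in requested_paths if path not in whitelist]


whitelist = build_whitelist(
    honeypot_links=["/index.html", "/forum/view.php"],
    crawler_visited_files=["/robots.txt"],
)
suspicious = attack_requests(
    ["/index.html", "/robots.txt", "/admin/config.php"], whitelist
)
```

Here only `/admin/config.php` ends up in `suspicious`, since the other two paths are on the whitelist.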

Results
- Experiment duration: 3 months
- Place: University of Washington CS personal home page
- 96 automatically generated honeypot Web pages
- 4 manually installed Web application software packages
- Received visits from different IPs

Distinguishing malicious visits
Links visited by only one crawler are dynamic links in the software, which have low PageRank.

Crawler Visits
- Bi-modal distribution
- 16 ASes crawl more than 75% of the hosted pages
- 18 ASes visit less than 25% of the pages

Attacker Visits (Joomla)

Attacker Visits

Geographic Locations & Discovery Time

Comparing Honeypots
1. Web Server
   - No hostname
   - No hyperlinks
2. Vulnerable Software
   - Pages accessible on the Internet
   - Search engines can find them
3. Heat-seeking Honeypot Pages
   - Generated as simple static HTML pages

Comparison of the total number of visits and the number of distinct IP addresses

Attack Types

Attack Types

Applying whitelists to the Internet
- Random set of 100 Web servers whose HTTP access logs are indexed by search engines
- A request is defined to be from an attacker if the link is:
  - Not present in the whitelist, i.e. not accessed by a crawler
  - Not present at all, i.e. the request results in an HTTP 404 error
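
A rough sketch of this heuristic applied to an access log. It assumes the logs are in Common Log Format and reads the two conditions as jointly required (path not whitelisted and status 404); the regular expression and the function `attacker_fraction` are illustrative, not taken from the presentation.

```python
import re

# Matches Common Log Format: host ident user [date] "METHOD path proto" status size
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]*\] "(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) '
)


def attacker_fraction(log_lines, whitelist):
    """Fraction of parsed requests that are off-whitelist and returned 404."""
    total = attacks = 0
    for line in log_lines:
        match = LOG_LINE.match(line)
        if not match:
            continue  # skip lines that are not in the assumed format
        total += 1
        if match["path"] not in whitelist and match["status"] == "404":
            attacks += 1
    return attacks / total if total else 0.0


sample_log = [
    '203.0.113.7 - - [10/Oct/2010:13:55:36 -0700] "GET /index.html HTTP/1.1" 200 2326',
    '203.0.113.9 - - [10/Oct/2010:13:56:01 -0700] "GET /admin/config.php HTTP/1.1" 404 512',
    '203.0.113.7 - - [10/Oct/2010:13:57:12 -0700] "GET /about.html HTTP/1.1" 200 1804',
    '203.0.113.9 - - [10/Oct/2010:13:58:40 -0700] "GET /setup.php HTTP/1.1" 404 512',
]
fraction = attacker_fraction(sample_log, {"/index.html", "/about.html"})
```

In this made-up log, two of the four requests target paths that are neither whitelisted nor present on the server, so `fraction` is 0.5.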

Applying whitelists to the Internet For 20% of the sites, almost 90% of the traffic came from attackers

Conclusion
Heat-seeking Honeypots
- Deploy honeypot pages corresponding to vulnerable pages
- Attract attackers
- Detect malicious IP addresses only through their Web access patterns
- False-negative rate of at most 1%