JOHN P. JOHN FANG YU YINGLIAN XIE MARTÍN ABADI ARVIND KRISHNAMURTHY PRESENTATION BY SAM KLOCK Searching the Searchers with SearchAudit.


Motivation
- We can find this via a Google search

Motivation (cont’d)
- Search engines open opportunities for attackers
  - Construct clever queries
  - Find vulnerable sites
  - Plant malware; spam (e.g., MyDoom)
  - Do so stealthily and cheaply
- Mitigation strategy: identify malicious queries
  - May be able to deny results to user
  - Identify attackers (probably bots)
  - Interpret strategy, then anticipate and prevent
- The question: how to do so

Proposed Approach
- SearchAudit
  - Framework for identifying malicious queries
- Input:
  - Seed set of known malicious queries
  - Search logs
- Output:
  - Large set of suspicious queries
  - Regular expressions matching queries
- Figure: seed set + search logs -> SearchAudit -> expanded set + regular expressions
- Example queries (seed set and expanded set):
  - inurl:gotoURL.asp?url=
  - filetype:asp inurl:"shopdisplayproducts.asp"
  - ext:pl inurl:cgi intitle:"FormMail *" -"*Referrer" -"* Denied" -sourceforge -error -cvs -input
  - filetype:cgi inurl:tseekdir.cgi
  - ...
- Example regular expressions (two are truncated in the transcript):
  - "/includes/joomla\.php" site:\.[a-zA-Z]{2,3}
  - "/includes/class_item\.php" … 4}
  - "php-nuke" … 4}
  - "modules\.php\?op=modload" site:\.[a-zA-Z0-9]{2,6}

Proposed Approach (cont’d)
- Needed to implement:
  - Seed set: milw0rm.com
  - Search logs: Microsoft Research (Bing)
  - Way to expand seed set into more queries
  - Way to infer regular expressions
- Intended benefits:
  - Harvesting lots of information
    - Three months: ~1.2 TB of logs
  - Interpret relationship between queries and attacks
  - Use queries to find potential victims
  - Stop attacks

SearchAudit
- Query identification
- Query analysis

Query Identification: Expansion
- Basic idea: bootstrap on seed set
  - Search logs for exact matches to seed queries
  - Record IPs of hosts making seed queries
  - Add other queries from those IPs to set
  - Intuition: a host that makes one malicious query will probably make more
- Account for DHCP: only take queries made by an IP on the same day as its seed query
- Figure: seed queries -> log search -> IP addresses -> queries made by IPs (on same day)
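The expansion step above can be sketched in a few lines. This is a minimal illustration, not the authors' Dryad/DryadLINQ implementation: the `LogEntry` record, its field names, and the sample log rows are assumptions made for the example, since the slides do not give the log schema.

```python
from collections import namedtuple

# Hypothetical log record; the real search-log schema is not given in the slides.
LogEntry = namedtuple("LogEntry", ["ip", "day", "query"])


def expand_seed_queries(log, seed_queries):
    """Return (expanded_queries, suspect_keys) from a pass over the log."""
    seeds = set(seed_queries)
    # Step 1: find (ip, day) pairs that issued an exact seed query.
    # Keying on the day as well as the IP is a simple way to account for DHCP,
    # because the same address may belong to a different host on another day.
    suspect_keys = {(e.ip, e.day) for e in log if e.query in seeds}
    # Step 2: collect every query issued by those IPs on those days.
    expanded = {e.query for e in log if (e.ip, e.day) in suspect_keys}
    return expanded, suspect_keys


# Made-up sample data for illustration only.
log = [
    LogEntry("10.0.0.1", "2009-12-01", "inurl:gotoURL.asp?url="),
    LogEntry("10.0.0.1", "2009-12-01", "filetype:cgi inurl:tseekdir.cgi"),
    LogEntry("10.0.0.1", "2009-12-02", "weather seattle"),
    LogEntry("10.0.0.2", "2009-12-01", "cheap flights"),
]
expanded, suspects = expand_seed_queries(log, ["inurl:gotoURL.asp?url="])
```

In this toy log, the query made by 10.0.0.1 on the following day is left out, which is the point of the same-day restriction.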

Query Identification: Regular Expressions
- Goals:
  - Account for variation in queries
  - Take advantage of scripting
- See paper for generation algorithm
- Compute score for generated expressions
  - Lower score: more specific
  - Goal: discard overly general expressions (score > 0.6)
- Consolidate to avoid overlap
- Avoid proxies, public NAT for performance
- Loopback for more queries
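The filtering and matching steps can be sketched as follows. The generation algorithm and the score definition are in the paper and are not reproduced here; the scores below are made-up numbers, and `keep_specific` and `match_log` are hypothetical helper names. Only the 0.6 cutoff and the Joomla expression come from the slides.

```python
import re


def keep_specific(scored_patterns, threshold=0.6):
    """Keep expressions whose score is at or below the threshold.

    A lower score means a more specific expression, so anything above
    the threshold is treated as overly general and discarded.
    """
    return [pattern for pattern, score in scored_patterns if score <= threshold]


def match_log(patterns, queries):
    """Return the queries matched by at least one expression."""
    compiled = [re.compile(p) for p in patterns]
    return [q for q in queries if any(c.search(q) for c in compiled)]


scored = [
    (r'"/includes/joomla\.php" site:\.[a-zA-Z]{2,3}', 0.1),  # assumed score
    (r'.*', 0.95),                                           # assumed score
]
patterns = keep_specific(scored)
queries = ['"/includes/joomla.php" site:.uk', "weather seattle"]
matched = match_log(patterns, queries)
```

Feeding the IPs behind `matched` back into the expansion step would correspond to the loopback mentioned on the slide.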

Query Identification: Results
- Data from Bing and milw0rm
  - 500 queries
  - Logs for Feb. 2009, Dec. 2009, Jan.
  - ~2 billion views per month
- System implemented on Dryad/DryadLINQ
- Initial observations:
  - Using specificity scores < 0.6 seems to be effective
    - Based on cookie heuristic
  - Proxy elimination does not limit results

Query Identification: Results (cont’d)
- Query expansion:
  - 122 of 500 queries matched in logs: 174 unique IPs
  - Expanded to 800 unique queries, 264 IPs
  - Regular expressions matched 3,560 queries, 1,001 IPs
- Incomplete seeds
  - Tried with subsets of original set
  - Coverage still good

Query Identification: Results (cont’d)
- Loopback:
  - Multiple loopbacks got more results
  - One iteration is good enough
- Overall statistics
  - Tens of thousands of IPs each month
  - Hundreds of thousands of unique queries each month
  - Dec. 2009: a set of unusual attacker IPs caused a spike

Query Identification: Verification
- Want to show queries are malicious
  - Sometimes easy: 73% of queries associated with security/hacker sites
  - What about others? No ground truth exists
- So: look for bot-like features
  - Individual level (one IP)
  - Group level (multiple IPs)
- Individual bots
  - New cookie
  - Whether a link was clicked
- Groups of bots
  - Data often fixed by botnets
    - User agent string
    - Metadata for requests
  - Tendencies dictated by scripts
    - Pages viewed per query
    - Time between queries
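The two individual-level features can be computed per IP as simple rates. This is a sketch under assumptions: the record layout `(ip, new_cookie, clicked)` and the sample rows are invented for illustration, and the slides do not say what thresholds, if any, were applied to these rates.

```python
from collections import defaultdict


def individual_bot_features(records):
    """Compute per-IP rates of new-cookie queries and of clicked results.

    records: iterable of (ip, new_cookie, clicked) tuples, where the two
    flags are booleans taken from each logged query (assumed layout).
    """
    totals = defaultdict(lambda: [0, 0, 0])  # [queries, new cookies, clicks]
    for ip, new_cookie, clicked in records:
        counts = totals[ip]
        counts[0] += 1
        counts[1] += int(new_cookie)
        counts[2] += int(clicked)
    return {
        ip: {"new_cookie_rate": new / total, "click_rate": clicks / total}
        for ip, (total, new, clicks) in totals.items()
    }


# Made-up sample: host "a" behaves like a script, host "b" like a person.
records = [
    ("a", True, False),
    ("a", True, False),
    ("b", False, True),
    ("b", False, False),
]
features = individual_bot_features(records)
```

A host that presents a fresh cookie on every query and never clicks a result, like "a" here, is the bot-like pattern the slide describes.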

Query Identification: Verification (cont’d)
- Substantial variation between host behavior for normal queries and suspicious queries

Observations on Stage One
- Regular expressions can become obsolete
  - Just need fresh logs and a new seed to get new ones
- Attacker awareness of technique yields adaptation
  - Example: mix in normal user queries
    - Goal: trick SearchAudit into identifying the host as a proxy
    - Hard to do: needs to be appropriate to time and place
    - Anyway: proxy elimination is optimization only
  - Injecting randomness also possible, but makes querying less productive
  - Could obviate cookie heuristic, but it is replaceable
- All attackers need to be careful to succeed

Query Analysis

- 42,000 IPs gave suspicious queries globally
  - U.S., Russia, China contribute almost 50%
  - 10% of IPs gave 90% of queries
- Found 200 regular expressions
- Reveal three kinds of attack-related queries:
  - Vulnerable web sites
  - Forum spamming
  - Phishing on Windows Live Messenger

Queries for Vulnerable Websites
- Queries look for exploitable server vulnerabilities
  - GET variables embedded in URL (for SQL injection)
  - Server software with known vulnerabilities (e.g., status pages)
- SearchAudit as a defense:
  - Pull suspicious queries for vulnerabilities
  - Run queries; gather results
  - Inspect results for vulnerabilities
  - Notify sites of vulnerabilities
- Example: the query inurl:index.php?content=X finds pages that can then be probed with a URL such as index.php?content=X'%20OR%20'1=1'

Queries for Vulnerable Websites (cont’d)
- With identified queries:
  - Sampled 5,000 queries
  - Obtained 80,490 URLs from 39,475 sites
- Compared to malware/phishing lists:
  - 3-4% on anti-phishing lists
  - 1.5% on anti-malware lists
- SQL injection vulnerability:
  - Add a single-quote to variable in URL
  - Look for SQL error
  - 12% of examined URLs showed an error
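The single-quote check can be sketched with the standard library alone. This is an illustration under assumptions, not the tool used in the study: the helper names and the `example.com` URL are hypothetical, and the error markers are a small illustrative sample of database error text rather than a complete list. The sketch only builds the probe URLs and inspects a response body; fetching the page is left out.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Illustrative sample of error strings; real scanners use longer lists.
SQL_ERROR_MARKERS = (
    "you have an error in your sql syntax",  # MySQL
    "unclosed quotation mark",               # Microsoft SQL Server
)


def add_quote_to_params(url):
    """Return one probe URL per query parameter, with a quote appended."""
    parts = urlsplit(url)
    params = parse_qsl(parts.query, keep_blank_values=True)
    probes = []
    for i, (name, value) in enumerate(params):
        mutated = list(params)
        mutated[i] = (name, value + "'")
        # urlencode percent-encodes the quote as %27.
        probes.append(urlunsplit(parts._replace(query=urlencode(mutated))))
    return probes


def looks_like_sql_error(body):
    """True if the response body contains a known SQL error marker."""
    lowered = body.lower()
    return any(marker in lowered for marker in SQL_ERROR_MARKERS)


probes = add_quote_to_params("http://example.com/index.php?content=X")
```

A page whose response to a probe URL trips `looks_like_sql_error` would count toward the 12% reported on the slide.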

Queries for Forum Spamming
- Query motivation:
  - Find scriptable forums
  - Good for spam, PageRank
- Found 46 applicable regular expressions
- Most IPs show transient behavior: probably bots
  - All regular expression groups show at least one group similarity feature
- IPs got less aggressive over time: more stealthy

Queries for Forum Spamming (cont’d)
- Validation
  - Project Honey Pot
    - Dynamically generates an e-mail address for each visiting IP
    - Any e-mail received at that address must be spam
  - 12% of all IPs listed (vs. 0.5% for normal IPs)
- Applications
  - Use queries to find and clean targeted pages
  - Deny results to malicious queries

Phishing via Windows Live Messenger
- Queries triggered by normal users
  - Victim receives message from a contact
  - Follows link for party photos
  - Taken to fake WLM login
  - After giving credentials, redirected to Bing search for “party”
- Attackers redirect to a Bing search to avoid the costs of hosting

Phishing via WLM (cont’d)
- Detect via query referral field (source page)
  - Found two regular expressions for referrals
  - Both expressions: victim username embedded in URL
- Over 180 phishing domains detected across 12 IPs
- Compromised accounts show different login behaviors

Conclusion
- Presented framework for finding suspicious queries
  - Input: search logs, small set of seed queries
  - Output: regular expressions, millions of suspicious queries
- Analyzed suspicious queries
  - Identified possible attacks
  - Suggested means of prevention
- Generally: attempted to demonstrate relationship between suspicious queries and the possibility of attack