PhishScore: Hacking Phishers’ Minds

Presentation transcript:

PhishScore: Hacking Phishers’ Minds CNSM 2014 – Fault Tolerance and Security Track November 18, 2014 Samuel Marchal, Jérôme François, Radu State and Thomas Engel {samuel.marchal,radu.state,thomas.engel}@uni.lu jerome.francois@inria.fr

PhishScore at a glance PhishScore: Hacking Phishers‘ Minds – Samuel Marchal 1 / 16

What is Phishing?
Use of technical subterfuge and social engineering to steal any kind of valuable consumer data:
- Identity information
- Website credentials: login, password, etc.
- Credit card information
- Etc.
Causes billions of dollars in losses every year.

Phishing techniques and statistics
- Web based delivery
- Trojan hosts
- Content injection (website)
- Phishing emails
- Instant messaging
- Fake websites
- etc.

Phishing website example

Phishing URLs characteristics
www.paypal.creasconsultores.com/www.paypal.com/Resolutioncenter.php
shevkun.org/css/paypal.com/cgi-bin/cmd%3D_login-submit/css/websc.php
us-mg6.mail.yahoo.com.dwarkamaigroup.com/Yahoo.html
emailoans.hostingventure.com.au/bankofamerica.com
nitkowski.pl/components/wellsfargo/questions.php
URL characteristics:
- Long URLs (many domain levels, long path, etc.)
- Composed of many labels
- Embed the targeted brand at different URL levels, e.g. Yahoo, Wells Fargo
- Embed specific keywords

Prior Work
URL lexical analysis:
- Garera et al. [WORM '07]: logistic regression with word-based features
- Ma et al. [SIGKDD '09]: batch classification method with lexical and host-based features
- Blum et al. [AISec '10]: refined technique with a binary feature for each word/level
- Le et al. [Infocom '11]: batch and online learning with lexical features and URL features

Phishing URLs characteristics
www.paypal.creasconsultores.com/www.paypal.com/Resolutioncenter.php
shevkun.org/css/paypal.com/cgi-bin/cmd%3D_login-submit/css/websc.php
us-mg6.mail.yahoo.com.dwarkamaigroup.com/Yahoo.html
emailoans.hostingventure.com.au/bankofamerica.com
nitkowski.pl/components/wellsfargo/questions.php
The registered domain has no relationship with the rest of the URL.
- Most parts of a URL can be freely defined
- Except the registered domain: main level domain + public suffix
http://4ld.3ld.mld.ps/path1/path2?key1=value1&key2=value2
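The decomposition on this slide can be sketched in a few lines of Python. This is only an illustration, not the authors' implementation: the function name `split_url` and the tiny `PUBLIC_SUFFIXES` set are assumptions standing in for a lookup against the full Public Suffix List.

```python
from urllib.parse import urlsplit

# Hypothetical, tiny stand-in for the real Public Suffix List (assumption).
PUBLIC_SUFFIXES = {"com", "org", "pl", "au", "com.au"}


def split_url(url):
    """Split a URL into subdomains, mld, public suffix (ps), path and query."""
    if "://" not in url:
        url = "http://" + url  # urlsplit needs a scheme to find the host
    parts = urlsplit(url)
    labels = parts.hostname.split(".")
    # Scan from the longest candidate to the shortest: first hit is the
    # longest matching public suffix.
    for i in range(len(labels)):
        candidate = ".".join(labels[i:])
        if candidate in PUBLIC_SUFFIXES:
            break
    else:
        raise ValueError("no known public suffix in host")
    if i == 0:
        raise ValueError("host is a bare public suffix")
    mld = labels[i - 1]
    return {
        "subdomains": labels[: i - 1],          # 4ld, 3ld, ...
        "mld": mld,                             # main level domain
        "ps": candidate,                        # public suffix
        "registered_domain": f"{mld}.{candidate}",
        "path": parts.path,
        "query": parts.query,
    }


info = split_url("www.paypal.creasconsultores.com/www.paypal.com/Resolutioncenter.php")
print(info["registered_domain"], info["subdomains"])
```

For the first example URL, the brand name "paypal" lands in the freely definable subdomains and path, while the registered domain is unrelated to it.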

Proposition for Phishing URL Detection
Hypothesis:
- Components of legitimate URLs are all related
- Registered domains (mld.ps) of phishing URLs are not related to the rest of the URL
Analyse the relatedness between mld.ps and the remaining part of a URL: intra-URL relatedness

Intra-URL relatedness
URL label extraction: http://4ld.3ld.mld.ps/path1/path2?key1=value1&key2=value2
- Basic splitting
- "mld" & "mld.ps"
Example: login.paypal.com/securepayment
- RDurl = {paypal; paypal.com}
- REMurl = {login; secure; payment}
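A minimal sketch of how the two sets might be built for the slide's example. The slide only says "basic splitting", so everything beyond splitting on separators is an assumption: the greedy `segment` helper, the three-word `VOCAB`, and passing the public suffix in by hand are all hypothetical choices made so that "securepayment" can be cut into "secure" and "payment".

```python
import re

# Hypothetical vocabulary used only to cut concatenated words (assumption).
VOCAB = {"login", "secure", "payment"}


def segment(token, vocab):
    """Greedy longest-prefix split into vocab words; keep token whole on failure."""
    words, i = [], 0
    while i < len(token):
        for j in range(len(token), i, -1):
            if token[i:j] in vocab:
                words.append(token[i:j])
                i = j
                break
        else:
            return [token]  # no known word starts here: give up on splitting
    return words


def label_sets(url, public_suffix, vocab=VOCAB):
    """Return (RDurl, REMurl) for a URL whose public suffix is given by the caller."""
    url = url.split("://", 1)[-1].lower()
    host, _, rest = url.partition("/")
    suffix = "." + public_suffix
    if not host.endswith(suffix):
        raise ValueError("host does not end with the given public suffix")
    front = host[: -len(suffix)].split(".")
    mld = front[-1]
    rd = {mld, f"{mld}.{public_suffix}"}
    # Remaining labels: subdomains plus path/query pieces split on separators.
    tokens = front[:-1] + re.split(r"[^a-z0-9]+", rest)
    rem = set()
    for token in tokens:
        if token:
            rem.update(segment(token, vocab))
    return rd, rem


rd_url, rem_url = label_sets("login.paypal.com/securepayment", "com")
print(sorted(rd_url), sorted(rem_url))
```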

Intra-URL relatedness evaluation
How to evaluate intra-URL relatedness?
- Compare the two sets RDurl and REMurl
- Existing word relatedness techniques: WordNet [Miller90], NGD [Cilibrasi07], Disco [Kolb08], etc.
- Problem: all are dictionary based, and "Internet" vocabulary is not necessarily contained in a dictionary
Idea: use search engine query data
- Web searches reflect the cognitive behaviour of users looking for services on the Internet (what phishers try to identify and mimic)
- Query well-known services: Google Trends & Yahoo Clues
- See which words are requested together in search engines to infer word relatedness

Intra-URL relatedness evaluation

Features set
12 features representing intra-URL relatedness:
cardrem, JRR, JRA, JAA, JAR, JARrd, JARrem, mldres, mld.psres, ranking, ratioArem, ratioRrem
They capture:
- Word set relatedness (Jaccard index)
- Words embedded in URL
- Popularity of registered domain
- Popularity of words in URL
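The Jaccard index named on this slide is the size of the intersection of two sets divided by the size of their union. The sketch below shows only that generic computation; the two word sets are made up for illustration, and how the slide's J-features choose their input sets is not reproduced here.

```python
def jaccard(a, b):
    """Jaccard index |A ∩ B| / |A ∪ B|; defined here as 0.0 for two empty sets."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


# Hypothetical word sets, e.g. words related to the registered domain
# versus words related to the rest of the URL (illustrative values only).
related_to_rd = {"paypal", "payment", "login"}
related_to_rem = {"login", "secure", "payment"}
print(jaccard(related_to_rd, related_to_rem))
```

A value near 1 means the two sets share most of their words; a value near 0 means they are unrelated, which is what the hypothesis expects for phishing URLs.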

Feature analysis
Datasets:
- 48,009 phishing URLs (source: PhishTank)
- 48,009 legitimate URLs (source: DMOZ)
Feature extraction for the whole dataset

URL classification
Machine learning approach:
- Determine the best classifier to identify phishing URLs
- 7 classifiers tested: Random Forest, C4.5, JRip, SVM, etc.
- 10-fold cross-validation on the presented feature set (96,018 URLs)
Random Forest:
- 94.91% accuracy
- 1.44% FP rate
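For readers unfamiliar with the two reported metrics, here is a small sketch of how accuracy and false-positive rate are usually computed from true and predicted labels, taking "phishing" as the positive class. The function name and the eight toy labels are hypothetical and are not results from the presentation.

```python
def accuracy_and_fp_rate(y_true, y_pred, positive="phishing"):
    """Return (accuracy, false-positive rate) with `positive` as the positive class."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1  # legitimate URL flagged as phishing
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    fp_rate = fp / (fp + tn) if (fp + tn) else 0.0
    return accuracy, fp_rate


# Toy labels for illustration only (not data from the study).
y_true = ["phishing"] * 4 + ["legitimate"] * 4
y_pred = ["phishing", "phishing", "phishing", "legitimate",
          "legitimate", "legitimate", "legitimate", "phishing"]
print(accuracy_and_fp_rate(y_true, y_pred))
```

The FP rate is measured over legitimate URLs only, so a low value means few legitimate sites would be wrongly blocked.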

URL rating
Random Forest based rating system:
- Use the soft prediction score in [0;1] as URL score (1: phishing URL, 0: legitimate URL)
- Score 0: 22,863 legitimate // 40 phishing
- Score 1: 26 legitimate // 34,790 phishing
- 99.89% correctness on 60.11% of the dataset
- Scores in [0;0.1] and [0.9;1]: 99.22% correctness on 83.97% of the dataset
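The 99.89% and 60.11% figures follow from the four counts on this slide. The sketch below redoes that arithmetic, assuming the dataset size is the sum of the two 48,009-URL sets from the feature analysis slide; the function name is hypothetical.

```python
def extreme_score_stats(legit_at_0, phish_at_0, legit_at_1, phish_at_1, total):
    """Coverage and correctness for URLs scored exactly 0 or exactly 1."""
    rated = legit_at_0 + phish_at_0 + legit_at_1 + phish_at_1
    correct = legit_at_0 + phish_at_1  # legitimate at 0, phishing at 1
    coverage = rated / total
    correctness = correct / rated
    return coverage, correctness


# Counts from the slide; total assumed to be 48,009 phishing + 48,009 legitimate.
coverage, correctness = extreme_score_stats(
    legit_at_0=22_863, phish_at_0=40,
    legit_at_1=26, phish_at_1=34_790,
    total=48_009 + 48_009,
)
print(f"coverage = {coverage:.2%}, correctness = {correctness:.2%}")
```

The same calculation cannot be repeated for the [0;0.1] and [0.9;1] bands, since the slide does not give their per-class counts.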

Conclusion
PhishScore:
- Lexical analysis to detect phishing URLs: intra-URL relatedness, with word relatedness inferred from search engine query data
- Phishing URL detection: 95% accuracy (FP rate = 1.44%)
- URL rating system: >99% correctness for >80% of URLs
Future Work:
- Use distributed online processing (Big Data) to reduce delay
- Implementation as a phishing email filter and a browser add-on


Phishing summary
Is there a global characteristic of phishing? No:
- It seeks to steal different kinds of data
- It targets several industry sectors
- It uses various techniques
But most phishing attacks rely on fake websites using redirecting links.
Phishing detection technique with wide scope: phishing URL identification