Jhih-sin Jheng 2009/09/01 Machine Learning and Bioinformatics Laboratory.


Jhih-sin Jheng 2009/09/01 Machine Learning and Bioinformatics Laboratory

Reference
Measurement and Classification of Humans and Bots in Internet Chat
Steven Gianvecchio, Mengjun Xie, Zhenyu Wu, and Haining Wang
Department of Computer Science, The College of William and Mary
USENIX Security, 2008

Outline
- Background
- Measurement
- Classification System
- Experimental Evaluation
- Conclusion

Chat Bots vs. Botnets
- Botnets: networks of compromised machines
  - some use chat systems (IRC) for command and control (C&C); others use P2P, HTTP, etc.
  - abuse various systems
- Chat bots: automated chat programs
  - some are helpful, e.g., chat loggers
  - others can abuse chat systems and their users: send spam, spread malicious software, mount phishing attacks
- Our focus is on the Yahoo! Chat system.

Measurement
- August-November 2007: we collect data
- August 2007: Yahoo! adds CAPTCHA
  - very few chat bots
- October 2007: bots are back

Measurement
- August and November 2007: many chat bots
- 1,440 hours of chat logs
- 147 chat logs
- 21 chat rooms

Measurement
- To create our dataset, we read the chat logs and label each chat user as human, bot, or ambiguous.
- In total, we recognized 14 different types of chat bots, which differ in
  - triggering mechanisms
  - text generation techniques

Types of Chat Bots
- Periodic bots: send messages based on periodic timers
- Random bots: send messages based on random timers
- Responder bots: respond to messages of other users
- Replay bots: replay messages of other users

Humans
- inter-message delay: evidence of a heavy tail
- message size: well fit by an exponential distribution (λ = 0.034)

Periodic Bots
- inter-message delay: several clusters with high probabilities
- message size: messages built from templates; sizes approximate a normal distribution

Random Bots
- inter-message delay: equilikely distribution at 40, 64, and 88; uniform distribution
- message size: messages selected from a small database
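The timer behaviour described for periodic and random bots can be illustrated with a small simulation. This is only a sketch: the function names, the jitter parameter, the 60-unit period, and the uniform range are assumptions made for illustration, and the slide does not state the time unit for 40, 64, and 88.

```python
import random

# Delay values quoted on the slide for the equilikely random bot
# (the time unit is not stated on the slide).
EQUILIKELY_DELAYS = (40, 64, 88)


def periodic_delays(n, period, jitter, rng):
    """Delays clustered tightly around one timer period (hypothetical jitter)."""
    return [period + rng.uniform(-jitter, jitter) for _ in range(n)]


def equilikely_delays(n, rng):
    """Each delay is one of a few fixed values, all equally likely."""
    return [rng.choice(EQUILIKELY_DELAYS) for _ in range(n)]


def uniform_delays(n, low, high, rng):
    """Delays drawn uniformly from [low, high] (the range is an assumption)."""
    return [rng.uniform(low, high) for _ in range(n)]


rng = random.Random(0)  # fixed seed so the sketch is repeatable
periodic = periodic_delays(1000, period=60.0, jitter=1.0, rng=rng)
equilikely = equilikely_delays(1000, rng)
uniform = uniform_delays(1000, low=45.0, high=125.0, rng=rng)
```

A histogram of `periodic` or `equilikely` shows a few sharp clusters, which is the kind of regularity that separates these bots from the heavy-tailed delays of humans.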

Responder Bots
- inter-message delay: human-like timing
- message size: multiple templates of different lengths

Replay Bots
- inter-message delay: cluster with high probabilities (replay bots are periodic)
- message size: human-like size, well fit by an exponential distribution (λ = 0.028)
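To make the two exponential fits concrete, the sketch below computes the mean message size implied by each quoted rate (the mean of an exponential distribution is 1/λ) and shows the standard maximum-likelihood estimate of λ from a sample. The helper names are illustrative, and the slides do not state the unit of message size.

```python
import math

LAMBDA_HUMAN = 0.034   # rate quoted for human message sizes
LAMBDA_REPLAY = 0.028  # rate quoted for replay-bot message sizes


def exponential_mean(rate):
    """Mean of an exponential distribution with the given rate."""
    return 1.0 / rate


def exponential_cdf(x, rate):
    """Probability that a value is at most x."""
    return 1.0 - math.exp(-rate * x)


def fit_rate(sizes):
    """Maximum-likelihood rate estimate: the reciprocal of the sample mean."""
    return len(sizes) / sum(sizes)


human_mean = exponential_mean(LAMBDA_HUMAN)    # about 29.4
replay_mean = exponential_mean(LAMBDA_REPLAY)  # about 35.7
```

The two means are close, which is consistent with the slide's description of replay-bot message sizes as human-like.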

Classification System
- Entropy classifier
  - detects abnormal behavior based on message sizes and inter-message delays
  - accurate but slow
- Machine learning classifier
  - detects "learned" patterns based on message content
  - fast but must be trained

Entropy Classifier
- Observation: chat bots are less complex than humans and thus lower in entropy
- exploits the low entropy of chat bots
- Corrected Conditional Entropy test (CCE): estimates higher-order entropy
- Entropy test (EN): estimates first-order entropy
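The two tests can be sketched as follows. The slides do not define either estimator, so this is an assumption-laden illustration: EN is taken to be the Shannon entropy of binned values, and CCE uses the common formulation in which the conditional entropy of length-m patterns is corrected by the fraction of patterns seen only once, multiplied by the first-order entropy, and the minimum over m is kept. The bin count, `max_m`, and all function names are hypothetical.

```python
import math
import random
from collections import Counter


def shannon_entropy(symbols):
    """First-order entropy (in bits) of a sequence of hashable symbols."""
    n = len(symbols)
    return -sum((c / n) * math.log2(c / n) for c in Counter(symbols).values())


def bin_values(values, num_bins):
    """Map numeric values to integer bin indices using equal-width bins."""
    low, high = min(values), max(values)
    if high == low:
        return [0] * len(values)
    width = (high - low) / num_bins
    return [min(int((v - low) / width), num_bins - 1) for v in values]


def corrected_conditional_entropy(symbols, max_m=5):
    """Minimum over pattern lengths of the corrected conditional entropy."""
    first_order = shannon_entropy(symbols)
    best = first_order
    previous = first_order
    for m in range(2, max_m + 1):
        if len(symbols) < m:
            break
        patterns = [tuple(symbols[i:i + m]) for i in range(len(symbols) - m + 1)]
        current = shannon_entropy(patterns)
        counts = Counter(patterns)
        unique_fraction = sum(1 for c in counts.values() if c == 1) / len(patterns)
        # The raw estimate can dip slightly below zero because there is one
        # fewer length-m window than length-(m-1) window, so clamp it at zero.
        conditional = max(0.0, current - previous)
        best = min(best, conditional + unique_fraction * first_order)
        previous = current
    return best


# Illustrative inputs: a perfectly periodic sender and an irregular one.
periodic_symbols = bin_values([60.0] * 200, num_bins=5)
demo_rng = random.Random(0)
irregular_symbols = [demo_rng.randrange(5) for _ in range(200)]

periodic_cce = corrected_conditional_entropy(periodic_symbols)
irregular_cce = corrected_conditional_entropy(irregular_symbols)
```

A classifier built on these estimates would flag a user whose EN or CCE falls below a chosen threshold; the slides do not give the thresholds, so none is assumed here.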

Machine Learning Classifier
- Observation: chat spam detection, like spam detection in general, is a text classification problem
- exploits message content of chat bots
- CRM114: a powerful text classification system
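To illustrate what treating chat spam as text classification means, here is a minimal word-count naive Bayes classifier. It is a stand-in written for this transcript, not CRM114 and not the configuration used in the presented work. The class name, the whitespace tokenization, and the example messages are all invented for illustration.

```python
import math
from collections import Counter


class NaiveBayesText:
    """Tiny two-class naive Bayes text classifier (illustrative stand-in)."""

    def __init__(self):
        self.word_counts = {"bot": Counter(), "human": Counter()}
        self.message_counts = Counter()

    def train(self, label, message):
        """Add one labelled message to the corpus for that label."""
        self.word_counts[label].update(message.lower().split())
        self.message_counts[label] += 1

    def classify(self, message):
        """Return the more likely label; needs one training message per class."""
        vocabulary = set(self.word_counts["bot"]) | set(self.word_counts["human"])
        total_messages = sum(self.message_counts.values())
        scores = {}
        for label, counts in self.word_counts.items():
            total_words = sum(counts.values())
            score = math.log(self.message_counts[label] / total_messages)
            for word in message.lower().split():
                # Add-one smoothing keeps unseen words from zeroing the score.
                score += math.log((counts[word] + 1) / (total_words + len(vocabulary)))
            scores[label] = score
        return max(scores, key=scores.get)


classifier = NaiveBayesText()
classifier.train("bot", "visit my profile for free pics")
classifier.train("bot", "free pics click my profile")
classifier.train("human", "hey how are you today")
classifier.train("human", "anyone here from texas")

spam_label = classifier.classify("click for free pics")
chat_label = classifier.classify("how are you")
```

Because such a classifier only knows the wording it was trained on, it must be retrained when bots change their text, which matches the "fast but must be trained" trade-off noted earlier.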

Hybrid Classification System
- entropy classifier builds and maintains the bot corpus
- machine learning classifier uses the bot and human corpora
[Diagram labels: INPUT, ENTROPY CLASSIFIER, MACHINE LEARNING CLASSIFIER, BOT CORPUS, HUMAN CORPUS, CLASSIFY AS CHAT BOT, CLASSIFY AS HUMAN]
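One possible reading of the hybrid design is sketched below. The control flow is an assumption, since the diagram's arrows did not survive in this transcript: the fast machine learning check runs first, the slower entropy check runs once enough messages are available, and entropy-detected bots are added to the bot corpus. The message counts are those quoted in the evaluation slides; the function and parameter names are hypothetical, and the two classifiers are passed in as plain callables.

```python
ML_MIN_MESSAGES = 25        # messages used by the machine learning classifier
ENTROPY_MIN_MESSAGES = 100  # messages used by the entropy classifier


def hybrid_classify(messages, ml_is_bot, entropy_is_bot, bot_corpus):
    """Classify one user's messages as 'bot', 'human', or 'undecided'.

    ml_is_bot and entropy_is_bot are callables taking the message list and
    returning True for a bot; bot_corpus is a list extended in place.
    """
    # Known bots: the trained classifier can decide early.
    if len(messages) >= ML_MIN_MESSAGES and ml_is_bot(messages):
        return "bot"
    # Unknown bots: fall back to the entropy tests once enough data exists.
    if len(messages) >= ENTROPY_MIN_MESSAGES:
        if entropy_is_bot(messages):
            bot_corpus.extend(messages)  # feeds later retraining
            return "bot"
        return "human"
    return "undecided"
```

For example, `hybrid_classify(msgs, ml_is_bot, entropy_is_bot, corpus)` with 30 messages that the machine learning classifier does not recognize returns "undecided" until the user has sent 100 messages.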

Experimental Evaluation
- Types of chat bots
  - periodic bots
  - random bots
  - responder bots
  - replay bots
- Classifiers
  - entropy classifier: 100 messages
  - machine learning classifier: 25 messages

Experimental Evaluation
- Classification tests
  - Ent: entropy classifier
  - SupML: fully supervised ML classifier, trained on AUG BOTS
  - SupMLre: fully supervised ML classifier, retrained on NOV BOTS
  - EntML: entropy-trained ML classifier on AUG BOTS

Entropy Classifier

            AUG BOTS (TP)                 NOV BOTS (TP)                 human (FP)
test        periodic  random   respond    periodic  random   replay
EN(imd)     121/121   68/68    1/30       51/51     109/109  40/40      7/1713
CCE(imd)    121/121   49/68    4/30       51/51     109/109  40/40      11/1713
EN(ms)      92/121    7/68     8/30       46/51     34/109   0/40       7/1713
CCE(ms)     77/121    8/68     30/30      51/51     6/109    0/40       11/1713
OVERALL     121/121   68/68    30/30      51/51     109/109  40/40      17/

- EN: entropy
- CCE: corrected conditional entropy
- (imd): inter-message delay
- (ms): message size

Entropy Classifier: EN(imd) and CCE(imd)
- problems against responder bots
- detect most other chat bots

Entropy Classifier: EN(ms) and CCE(ms)
- problems against random and replay bots
- detect most other chat bots

Entropy Classifier: OVERALL
- detects all chat bots
- false positive rate is ~0.01
- 100 messages

Entropy and Machine Learning Classifiers

            AUG BOTS (TP)                 NOV BOTS (TP)                 human (FP)
test        periodic  random   respond    periodic  random   replay
Ent         121/121   68/68    30/30      51/51     109/109  40/40      17/1713
SupML       121/121   68/68    30/30      14/51     104/109  1/40       0/1713
SupMLre     121/121   68/68    30/30      51/51     109/109  40/40      0/1713
EntML       121/121   68/68    30/30      51/51     109/109  40/40      1/

- Ent: entropy classifier (from last slide)
- SupML: fully supervised ML classifier, trained on AUG BOTS
- SupMLre: fully supervised ML classifier, retrained on NOV BOTS
- EntML: entropy-trained ML classifier on AUG BOTS

Entropy and Machine Learning Classifiers: Ent
- same as the OVERALL results of the entropy classifier

Entropy and Machine Learning Classifiers: SupML and SupMLre
- SupML
  - has problems against November bots
  - needs to be retrained for new bots
- SupMLre
  - detects all bots

Entropy and Machine Learning Classifiers: EntML
- false positive rate is ~ (Ent is ~0.01)
- 25 messages
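The rates quoted on these slides follow directly from the table counts. The short sketch below redoes the arithmetic for the entries that are complete in this transcript: the Ent false positive count (17 of 1713 humans) and the SupML detections on the November bots. Variable names are illustrative.

```python
def rate(numerator, denominator):
    """Fraction of cases, e.g. detected bots over all bots."""
    return numerator / denominator


# Entropy classifier: humans wrongly flagged as bots.
ent_fp_rate = rate(17, 1713)  # about 0.0099, i.e. roughly 0.01

# SupML trained on August bots, tested on November bots
# (periodic, random, replay columns).
supml_nov_detected = 14 + 104 + 1
nov_bot_total = 51 + 109 + 40
supml_nov_tp_rate = rate(supml_nov_detected, nov_bot_total)  # 119/200
```

SupML detects 119 of the 200 November bots, which is why the slides say it needs retraining, while the retrained SupMLre and the entropy-trained EntML detect all 200.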

Conclusion
- Measurements
  - overall, chat bots are less complex than humans
  - some chat bots are more human-like
- Classification system
  - exploits the benefits of both classifiers
  - quickly classifies known chat bots
  - accurately classifies unknown chat bots

Thank you!