Adversarial Learning: Practice and Theory. Daniel Lowd, University of Washington. July 14th, 2006. Joint work with Chris Meek, Microsoft Research.

Presentation transcript:

Adversarial Learning: Practice and Theory. Daniel Lowd, University of Washington. July 14th, 2006. Joint work with Chris Meek, Microsoft Research. “If you know the enemy and know yourself, you need not fear the result of a hundred battles.” -- Sun Tzu, 500 BC

2 Content-based Spam Filtering. From: Cheap mortgage now!!! Feature weights: cheap = 1.0, mortgage = 1.5. Total score = 2.5 > 1.0 (threshold): Spam

3 Good Word Attacks. From: Cheap mortgage now!!! Corvallis OSU. Feature weights: cheap = 1.0, mortgage = 1.5, Corvallis = -1.0, OSU = -1.0. Total score = 0.5 < 1.0 (threshold): OK
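The arithmetic on the two slides above can be reproduced with a short sketch. The weights and threshold are the ones shown on the slides; the tokenization (lowercasing, stripping punctuation, counting each distinct word once) is an assumption made for illustration and not a description of the actual filter.

```python
# Weights and threshold copied from the slides; a real filter has far more features.
WEIGHTS = {"cheap": 1.0, "mortgage": 1.5, "corvallis": -1.0, "osu": -1.0}
THRESHOLD = 1.0


def score(message: str) -> float:
    """Sum the weights of the distinct known words in the message."""
    # Assumed tokenization: split on whitespace, strip punctuation, lowercase.
    words = {w.strip("!.,:").lower() for w in message.split()}
    return sum(WEIGHTS.get(w, 0.0) for w in words)


def is_spam(message: str) -> bool:
    """A message is labeled spam when its score exceeds the threshold."""
    return score(message) > THRESHOLD


original = "Cheap mortgage now!!!"
attacked = original + " Corvallis OSU"  # good words appended by the spammer

print(score(original), is_spam(original))
print(score(attacked), is_spam(attacked))
```

Appending words with negative weights lowers the total without touching the spammy words, which is the whole attack.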

4 Outline. Practice: good word attacks (passive attacks, active attacks, experimental results). Theory: ACRE learning (definitions and examples, learning linear classifiers, experimental results).

5 Attacking Spam Filters. Can we efficiently find a list of “good words”? Types of attacks: passive attacks (no filter access); active attacks (test emails allowed). Metrics: expected number of words required to get the median (blocked) spam past the filter; number of query messages sent.

6 Filter Configuration. Models used: Naïve Bayes (generative); Maximum Entropy (Maxent, discriminative). Training: 500,000 messages from the Hotmail feedback loop; 276,000 features. Maxent let 30% less spam through.

7 Comparison of Filter Weights [figure; labels: “spammy”, “good”]

8 Passive Attacks. Heuristics: select random dictionary words (Dictionary); select the most frequent English words (Freq. Word); select the words with the highest ratio of English frequency to spam frequency (Freq. Ratio). Spam corpus: spamarchive.org. English corpora: Reuters news articles, Written English, Spoken English, 1992 USENET.
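The Freq. Ratio heuristic can be sketched as follows. This is only an illustration on toy token lists: the function name `freq_ratio_words` and the add-one smoothing of the spam counts (to avoid dividing by zero for words never seen in spam) are assumptions, not details given in the slides.

```python
from collections import Counter


def freq_ratio_words(english_tokens, spam_tokens, n):
    """Return the n words with the highest English-frequency / spam-frequency ratio."""
    eng = Counter(english_tokens)
    spam = Counter(spam_tokens)
    eng_total = sum(eng.values())
    spam_total = sum(spam.values())

    def ratio(word):
        p_english = eng[word] / eng_total
        # Assumed add-one smoothing so unseen spam words do not divide by zero.
        p_spam = (spam[word] + 1) / (spam_total + 1)
        return p_english / p_spam

    return sorted(eng, key=ratio, reverse=True)[:n]


# Toy corpora standing in for the English and spam corpora.
english = ["meeting", "meeting", "meeting", "free", "report", "report"]
spam = ["free", "free", "free", "cheap", "meeting"]
good_words = freq_ratio_words(english, spam, 2)
print(good_words)
```

No filter access is needed here: the ranking uses only public corpora, which is what makes the attack passive.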

9 Passive Attack Results

10 Active Attacks. Learn which words are best by sending test messages (queries) through the filter. First-N: find n good words using as few queries as possible. Best-N: find the best n words.

11 First-N Attack. Step 1: Find a “Barely spam” message. [figure; labels: Legitimate, Threshold, Spam, Original legit., “Barely legit.”, “Barely spam”, Original spam; example messages: Hi, mom!; mortgage now!!!; Cheap mortgage now!!!]

12 First-N Attack. Step 2: Test each word. [figure; labels: Legitimate, Threshold, Spam, “Barely spam” message, Good words, Less good words]
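A minimal sketch of the two First-N steps, assuming the attacker can query a black-box `is_spam` oracle on a set of words. The function names, the word-set message representation, the order in which words are removed and added in step 1, and the toy filter used for the demonstration are all hypothetical.

```python
def find_barely_spam(is_spam, spam_words, legit_words):
    """Step 1: walk from a spam message toward a legitimate one, one word at a time,
    and return the last message that is still labeled spam."""
    current = set(spam_words)
    steps = [("remove", w) for w in sorted(set(spam_words) - set(legit_words))]
    steps += [("add", w) for w in sorted(set(legit_words) - set(spam_words))]
    for op, word in steps:
        candidate = current - {word} if op == "remove" else current | {word}
        if not is_spam(candidate):
            return current
        current = candidate
    raise ValueError("the legitimate message is itself labeled spam")


def first_n(is_spam, barely_spam, candidates, n):
    """Step 2: a word is 'good' if adding it alone flips the barely-spam message."""
    good = []
    for word in candidates:
        if word in barely_spam:
            continue
        if not is_spam(barely_spam | {word}):  # one query per candidate word
            good.append(word)
            if len(good) == n:
                break
    return good


# Toy linear filter used only to exercise the sketch.
toy_weights = {"cheap": 1.0, "mortgage": 1.5, "hi": -0.2, "mom": -0.3,
               "corvallis": -1.0, "osu": -1.0}
toy_filter = lambda words: sum(toy_weights.get(w, 0.0) for w in words) > 1.0

barely = find_barely_spam(toy_filter, {"cheap", "mortgage", "now"}, {"hi", "mom"})
found = first_n(toy_filter, barely, ["the", "hi", "mom", "corvallis", "osu"], 2)
print(barely, found)
```

Because the barely-spam message sits just above the threshold, a single query per candidate is enough to tell whether the word has a usefully negative weight.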

13 Best-N Attack. Key idea: use spammy words to sort the good words. [figure; labels: Legitimate, Threshold, Spam, Better, Worse]

14 Active Attack Results (n = 100). Best-N twice as effective as First-N. Maxent more vulnerable to active attacks. Active attacks much more effective than passive attacks.

15 Outline. Practice: good word attacks (passive attacks, active attacks, experimental results). Theory: ACRE learning (definitions and examples, learning linear classifiers, experimental results).

16 How to formalize? Q: What’s the spammer’s goal? A: Find the best possible spam message that gets through a spam filter. Q: How? A: By sending test messages through the filter to learn about it.

17 Not just spam! Credit card fraud detection, network intrusion detection, terrorist detection, loan approval, Web page search rankings, …many more…

18 Definitions. Instance space: X = {X1, X2, …, Xn}; each Xi is a feature; instances x ∈ X (e.g., emails). Classifier: c(x): X → {+, −}, c ∈ C, concept class (e.g., linear classifier). Adversarial cost function: a(x): X → R, a ∈ A (e.g., more legible spam is better).

19 Adversarial Classifier Reverse Engineering (ACRE). Task: minimize a(x) subject to c(x) = −. Problem: the adversary doesn’t know c(x)!

20 Adversarial Classifier Reverse Engineering (ACRE). Task: minimize a(x) subject to c(x) = −, within a factor of k. Given: full knowledge of a(x); one positive and one negative instance, x+ and x−; a polynomial number of membership queries.

21 Adversarial Classifier Reverse Engineering (ACRE). IF an algorithm exists that, for any a ∈ A, c ∈ C, minimizes a(x) subject to c(x) = − within factor k, GIVEN full knowledge of a(x), positive and negative instances x+ and x−, and a polynomial number of membership queries, THEN we say that concept class C is ACRE k-learnable under a set of cost functions A.
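The definition above can be written compactly as follows. The name MinCost and the symbol x* for the instance returned by the algorithm are notation introduced here for illustration; the slides state the definition only in words.

```latex
\[
\mathrm{MinCost}(a, c) \;=\; \min_{x \in X \,:\, c(x) = -} a(x)
\]
\[
C \text{ is ACRE } k\text{-learnable under } A
\quad\Longleftrightarrow\quad
\begin{array}{l}
\text{some algorithm, for every } a \in A \text{ and } c \in C, \text{ given } a,\ x^{+},\ x^{-}, \\
\text{and polynomially many membership queries to } c, \\
\text{returns } x^{*} \text{ with } c(x^{*}) = - \text{ and } a(x^{*}) \le k \cdot \mathrm{MinCost}(a, c).
\end{array}
\]
```

With k = 1 the adversary finds a cheapest negative instance exactly; larger k allows a proportionally worse one.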

22 Example: trivial cost function. Suppose A is the set of functions where m instances have cost b and all other instances cost b’ > b. Test each of the m b-cost instances. If none is negative, choose x−.

23 Example: Boolean conjunctions. Suppose C is all conjunctions of Boolean literals (e.g., x1 ∧ ¬x3). Starting with x+, toggle each xi in turn: x+ = (x1 = T, x2 = F, x3 = F, x4 = T). Guess: (x1 ∧ ¬x2 ∧ ¬x3 ∧ x4)

24 Example: Boolean conjunctions. Suppose C is all conjunctions of Boolean literals (e.g., x1 ∧ ¬x3). Starting with x+, toggle each xi in turn: x+ = (T, F, F, T). Guess: (x1 ∧ ¬x2 ∧ ¬x3 ∧ x4)

25 Example: Boolean conjunctions. Suppose C is all conjunctions of Boolean literals (e.g., x1 ∧ ¬x3). Starting with x+, toggle each xi in turn: x+ = (T, F, F, T), x’ = (F, F, F, T), c(x’) = −. Guess: (x1 ∧ ¬x2 ∧ ¬x3 ∧ x4)

26 Example: Boolean conjunctions. Suppose C is all conjunctions of Boolean literals (e.g., x1 ∧ ¬x3). Starting with x+, toggle each xi in turn: x+ = (T, F, F, T), x’ = (T, T, F, T), c(x’) = +. Guess: (x1 ∧ ¬x2 ∧ ¬x3 ∧ x4)

27 Example: Boolean conjunctions. Suppose C is all conjunctions of Boolean literals (e.g., x1 ∧ ¬x3). Starting with x+, toggle each xi in turn: x+ = (T, F, F, T), x’ = (T, F, T, T), c(x’) = −. Guess: (x1 ∧ ¬x2 ∧ ¬x3 ∧ x4)

28 Example: Boolean conjunctions. Suppose C is all conjunctions of Boolean literals (e.g., x1 ∧ ¬x3). Starting with x+, toggle each xi in turn: x+ = (T, F, F, T), x’ = (T, F, F, F), c(x’) = +. Guess: (x1 ∧ ¬x2 ∧ ¬x3 ∧ x4). Final answer: (x1 ∧ ¬x3)

29 Example: Boolean conjunctions. Suppose C is all conjunctions of Boolean literals (e.g., x1 ∧ ¬x3). Starting with x+, toggle each xi in turn. Exact conjunction is learnable in n queries. Now we can optimize any cost function. In general: concepts learnable with membership queries are ACRE 1-learnable.
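The toggling procedure from the Boolean conjunction slides can be sketched in a few lines. The function name `learn_conjunction` and the representation of a literal as an (index, required value) pair are choices made for this illustration; indices are 0-based, so x1 is index 0.

```python
def learn_conjunction(classify, x_pos):
    """Recover the literals of a hidden conjunction using one query per feature.

    classify(x) returns True for the + class; x_pos is a known positive instance.
    """
    literals = []
    for i, value in enumerate(x_pos):
        toggled = list(x_pos)
        toggled[i] = not value
        # If toggling feature i turns the instance negative, the conjunction
        # must require feature i to keep the value it has in x_pos.
        if not classify(tuple(toggled)):
            literals.append((i, value))
    return literals


# Hidden concept from the slides: x1 AND NOT x3.
hidden = lambda x: x[0] and not x[2]
x_pos = (True, False, False, True)

learned = learn_conjunction(hidden, x_pos)
print(learned)
```

The result encodes "x1 must be true, x3 must be false", matching the final answer on the slide after exactly n = 4 queries.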

30 Comparison to other theoretical learning methods. Probably Approximately Correct (PAC): accuracy over the same distribution. Membership queries: exact classifier. ACRE: single low-cost, negative instance.

31 Linear Cost Functions. Cost is weighted L1 distance from some “ideal” instance x_a.
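Written out, a weighted L1 distance from the ideal instance takes the form below. The per-feature cost symbols a_i and the component notation x_{a,i} are assumptions made here, since the slide's own formula did not survive in the transcript.

```latex
\[
a(x) \;=\; \sum_{i=1}^{n} a_i \,\bigl\lvert x_i - x_{a,i} \bigr\rvert ,
\qquad a_i > 0 .
\]
```

In the special case where every a_i = 1 and the features are Boolean, this is simply the number of features on which x differs from x_a.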

32 Linear Classifier. c(x) = + iff (w · x > T). Examples: Naïve Bayes, maxent, SVM with linear kernel.

33 Theorem 1: Continuous features. Linear classifiers with continuous features are ACRE (1+ε)-learnable under linear cost functions. Proof sketch: only need to change the highest weight/cost feature; we can efficiently find this feature using line searches in each dimension.
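The line-search idea in the proof sketch can be illustrated as follows. This is a simplified sketch and not the procedure from the talk: it assumes x_a is itself positive, searches only within a bounded step `max_step` in each direction, and uses plain bisection; the function name and parameters are hypothetical.

```python
def cheapest_single_change(is_positive, x_a, costs, max_step=1e3, tol=1e-6):
    """Find the cheapest way to make x_a negative by changing one feature.

    Returns (cost, instance), or None if no single-feature change within
    max_step crosses the decision boundary.
    """
    best = None
    for i, c_i in enumerate(costs):
        for direction in (1.0, -1.0):
            def moved(t, i=i, direction=direction):
                x = list(x_a)
                x[i] += direction * t
                return x

            if is_positive(moved(max_step)):
                continue  # boundary not reached in this direction
            lo, hi = 0.0, max_step  # moved(lo) is positive, moved(hi) is negative
            while hi - lo > tol:
                mid = (lo + hi) / 2
                if is_positive(moved(mid)):
                    lo = mid
                else:
                    hi = mid
            cost = c_i * hi
            if best is None or cost < best[0]:
                best = (cost, moved(hi))
    return best


# Toy linear classifier: positive iff w . x > T.
w, T = [2.0, -1.0], 1.0
is_positive = lambda x: sum(wi * xi for wi, xi in zip(w, x)) > T

result = cheapest_single_change(is_positive, [2.0, 1.0], [1.0, 1.0])
print(result)
```

In the toy example the first feature has the larger weight-to-cost ratio, so the cheapest negative instance changes only that feature, as the proof sketch suggests.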

34 Theorem 2: Boolean features. Linear classifiers with Boolean features are ACRE 2-learnable under uniform linear cost functions. Harder problem: can’t do line searches. Uniform linear cost: unit cost per “change”.

35 Algorithm. Iteratively reduce cost in two ways: 1. Remove any unnecessary change: O(n). 2. Replace any two changes with one: O(n^3).
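The two cost-reduction moves can be sketched as a simple loop over membership queries. This is a hedged illustration under stated assumptions: instances are Boolean tuples, a "change" is a feature on which the candidate differs from x_a, and the names `flip` and `reduce_cost` are invented here.

```python
from itertools import combinations


def flip(x, indices):
    """Return a copy of x with the features in `indices` toggled."""
    return tuple((not v) if i in indices else v for i, v in enumerate(x))


def reduce_cost(is_positive, x_a, x_neg):
    """Start from a negative instance and shrink its set of changes from x_a."""
    n = len(x_a)
    changes = {i for i in range(n) if x_a[i] != x_neg[i]}
    improved = True
    while improved:
        improved = False
        # Move 1: remove any single change that is not needed.
        for i in sorted(changes):
            trial = changes - {i}
            if not is_positive(flip(x_a, trial)):
                changes, improved = trial, True
        # Move 2: replace two current changes with one unused change.
        for i, j in combinations(sorted(changes), 2):
            for k in sorted(set(range(n)) - changes):
                trial = (changes - {i, j}) | {k}
                if not is_positive(flip(x_a, trial)):
                    changes, improved = trial, True
                    break
            if improved:
                break
    return flip(x_a, changes)


# Toy linear classifier over Boolean features: positive iff w . x > T.
w, T = [1.0, 1.0, 1.0, -3.0, -1.0], 0.5
is_positive = lambda x: sum(wi for wi, v in zip(w, x) if v) > T

x_a = (True, True, True, False, False)        # ideal instance (labeled positive)
x_neg = (False, False, False, False, False)   # known negative instance, 3 changes
y = reduce_cost(is_positive, x_a, x_neg)
print(y)
```

Every accepted move lowers the number of changes by one, so the loop terminates; in the toy run the three removals in x_neg collapse to the single addition of the strongly negative fourth feature.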

36 Proof Sketch (Contradiction). Suppose there is some negative instance x with less than half the cost of y. Then x’s average change is twice as good as y’s, so we can replace y’s two worst changes with x’s single best change. But we already tried every such replacement!

37 Application: Spam Filtering. Spammer goal: minimally modify a spam message so that it gets past a spam filter. Corresponding ACRE problem: spam filter → linear classifier with Boolean features; “minimally modify” → uniform linear cost function.

38 Experimental Setup. Filter configuration (same as before): Naïve Bayes (NB) and maxent (ME) filters; 500,000 Hotmail messages for training; > 250,000 features. Adversary feature sets: 23,000 English words (Dict); 1,000 random English words (Rand).

39 Results. Reduced feature set almost as good. Cost ratio is excellent. Number of queries is reasonable (parallelize). Less efficient than good word attacks, but guaranteed to work. [Table: Cost, Ratio, and Queries for Dict NB, Dict ME, Rand NB, and Rand ME; values not recoverable apart from the fragment “,472k” for Dict NB]

40 Future Work. Within the ACRE framework: other concept classes and cost functions; other real-world domains. ACRE extensions: Adversarial Regression Reverse Engineering; Relational ACRE; background knowledge (passive attacks).

41 Related Work. [Dalvi et al., 2004] Adversarial classification: game-theoretic approach; assume attacker chooses optimal strategy against classifier; assume defender modifies classifier knowing attacker strategy. [Kolter and Maloof, 2005] Concept drift: mixture of experts; theoretical bounds against adversary.

42 Conclusion. Spam filters are very vulnerable: can make lists of good words without filter access; with filter access, better attacks are available. ACRE learning is a natural formulation for adversarial problems: pick a concept class, C; pick a set of cost functions, A; devise an algorithm to optimize through querying.