
1 Strategy-Proof Classification. Reshef Meir, School of Computer Science and Engineering, Hebrew University. Joint work with Ariel D. Procaccia and Jeffrey S. Rosenschein.

2 Outline (~12 minutes):
– An example of strategic labels in classification
– Motivation
– Our model
– Previous work (positive results)
– An impossibility theorem
– More results (if there is time)

3 Strategic labeling: an example. [Figure: a labeled dataset on which the ERM classifier makes 5 errors.]

4 There is a better classifier! (for me…)

5 If I only change the labels… [Figure: with the manipulated labels, the selected classifier makes 2+4 = 6 errors.]

6 Classification. The supervised classification problem:
– Input: a set of labeled data points {(x_i, y_i)}_{i=1..m}
– Output: a classifier c from some predefined concept class C (functions of the form f : X → {-,+})
– We usually want c not only to classify the sample correctly, but to generalize well, i.e. to minimize R(c) ≡ E_{(x,y)~D}[c(x) ≠ y], the expected number of errors w.r.t. the distribution D
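For intuition, the generalization risk R(c) can be estimated by sampling from D. Below is a minimal Monte Carlo sketch in Python; the distribution D used here is made up purely for illustration and is not from the slides.

```python
import random

def risk_estimate(c, sample_d, n=100_000):
    """Monte Carlo estimate of R(c) = E_{(x,y)~D}[c(x) != y]."""
    errors = 0
    for _ in range(n):
        x, y = sample_d()
        if c(x) != y:
            errors += 1
    return errors / n

# A made-up distribution D (an assumption of this example): x is uniform on
# [0, 1], the true label is the sign of x - 0.5, and labels are flipped with
# probability 0.1.
def sample_d():
    x = random.random()
    y = +1 if x >= 0.5 else -1
    if random.random() < 0.1:
        y = -y
    return x, y

c = lambda x: +1 if x >= 0.5 else -1
print(risk_estimate(c, sample_d))  # roughly 0.10, the label-noise rate
```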

7 Classification (cont.). A common approach is to return the ERM, i.e. the concept in C that is best w.r.t. the given samples (has the lowest number of errors). This generalizes well under some assumptions on the concept class C. With multiple experts, we can't trust our ERM!
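A minimal sketch of ERM over a finite concept class (Python; the threshold concepts and all names below are assumptions chosen for illustration):

```python
from typing import Callable, List, Tuple

# A concept maps a point to a label in {-1, +1}.
Concept = Callable[[float], int]
Sample = List[Tuple[float, int]]

def empirical_errors(c: Concept, sample: Sample) -> int:
    """Number of sample points that c labels incorrectly."""
    return sum(1 for x, y in sample if c(x) != y)

def erm(concepts: List[Concept], sample: Sample) -> Concept:
    """Empirical Risk Minimization: the concept with the fewest sample errors."""
    return min(concepts, key=lambda c: empirical_errors(c, sample))

# Example: threshold classifiers on the real line.
thresholds = [0.0, 0.25, 0.5, 0.75, 1.0]
concepts = [lambda x, t=t: +1 if x >= t else -1 for t in thresholds]
sample = [(0.1, -1), (0.3, -1), (0.6, +1), (0.9, +1)]
best = erm(concepts, sample)
print(empirical_errors(best, sample))  # 0 errors on this sample
```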

8 Where do we find "experts" with incentives? Example 1: a firm learning purchase patterns.
– Information is gathered from local retailers
– The resulting policy affects them
– "The best policy is the policy that fits my pattern"

9 Example 2: Internet polls / expert systems. [Figure: users → reported dataset → classification algorithm → classifier.]

10 Related work:
– A study of SP mechanisms in regression learning: O. Dekel, F. Fischer and A. D. Procaccia, Incentive Compatible Regression Learning, SODA 2008
– No SP mechanisms for clustering: J. Perote-Peña and J. Perote, The Impossibility of Strategy-Proof Clustering, Economics Bulletin, 2003

11 A problem instance is defined by:
– A set of agents I = {1,...,n}
– A partial dataset for each agent i ∈ I: X_i = {x_i1,...,x_i,m(i)} ⊆ X
– For each x_ik ∈ X_i, agent i has a label y_ik ∈ {-,+}; each pair s_ik = ⟨x_ik, y_ik⟩ is an example
– All examples of a single agent compose the labeled dataset S_i = {s_i1,...,s_i,m(i)}
– The joint dataset S = ⟨S_1, S_2,…, S_n⟩ is our input; m = |S|
– We denote the dataset with the reported labels by S′
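A rough sketch of how such an instance could be represented in code (Python; the class and field names are assumptions for illustration, not from the paper):

```python
from dataclasses import dataclass
from typing import List

@dataclass
class AgentData:
    """The labeled dataset S_i of a single agent."""
    points: List[float]  # X_i = {x_i1, ..., x_i,m(i)}
    labels: List[int]    # y_ik in {-1, +1}, one label per point

# The joint dataset S = <S_1, ..., S_n>.  The reported dataset S' has the
# same points but possibly different (manipulated) labels.
S = [
    AgentData(points=[0.1, 0.4, 0.8], labels=[-1, -1, +1]),  # agent 1
    AgentData(points=[0.2, 0.9],      labels=[+1, +1]),      # agent 2
]
m = sum(len(a.points) for a in S)  # m = |S| = 5
```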

12 Input: an example. [Figure: three agents; each agent i contributes points X_i ∈ X^{m_i} with labels Y_i ∈ {-,+}^{m_i}, and the joint dataset is S = ⟨S_1, S_2,…, S_n⟩ = ⟨(X_1,Y_1),…, (X_n,Y_n)⟩.]

13 Incentives and mechanisms:
– A mechanism M receives a labeled dataset S′ and outputs c ∈ C
– Private risk of agent i: R_i(c, S) = |{k : c(x_ik) ≠ y_ik}| / m_i
– Global risk: R(c, S) = |{(i,k) : c(x_ik) ≠ y_ik}| / m
– We allow non-deterministic mechanisms: the outcome is a random variable and we measure the expected risk
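The two risk measures translate directly into code. A small illustrative sketch (Python), using a simplified per-agent representation that is an assumption of this example:

```python
from typing import Callable, List, Tuple

Concept = Callable[[float], int]
AgentSample = List[Tuple[float, int]]  # one agent's (x_ik, y_ik) pairs

def private_risk(c: Concept, agent: AgentSample) -> float:
    """R_i(c, S): fraction of agent i's own points that c misclassifies."""
    return sum(1 for x, y in agent if c(x) != y) / len(agent)

def global_risk(c: Concept, agents: List[AgentSample]) -> float:
    """R(c, S): fraction of all points, over all agents, that c misclassifies."""
    errors = sum(1 for agent in agents for x, y in agent if c(x) != y)
    m = sum(len(agent) for agent in agents)
    return errors / m

# Example with a single threshold classifier at 0.5:
c = lambda x: +1 if x >= 0.5 else -1
agents = [[(0.1, -1), (0.4, +1)], [(0.8, +1)]]
print(private_risk(c, agents[0]))  # 0.5 (one of agent 1's two points is wrong)
print(global_risk(c, agents))      # 1/3
```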

14 We compare the outcome of M to the ERM: c* = ERM(S) = argmin_{c ∈ C} R(c, S), and r* = R(c*, S). Can our mechanism simply compute and return the ERM?

15 Requirements (most important slide):
1. Good approximation: ∀S, R(M(S), S) ≤ β∙r*
2. Strategy-proofness (SP): ∀i, S, S_i′: R_i(M(S_{-i}, S_i′), S) ≥ R_i(M(S), S)
ERM(S) is 1-approximating but not SP; ERM(S_1) is SP but gives a bad approximation. Are there any mechanisms that guarantee both SP and good approximation?
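On a tiny instance, the SP condition at a given profile S can be checked by brute force: for every agent and every possible relabeling of that agent's points, misreporting must not lower the agent's true private risk. A minimal sketch for deterministic mechanisms (Python; the mechanism interface and all names are assumptions for illustration, and full SP would also quantify over all profiles S):

```python
from itertools import product
from typing import Callable, List, Tuple

Concept = Callable[[float], int]
AgentSample = List[Tuple[float, int]]

def private_risk(c: Concept, agent: AgentSample) -> float:
    return sum(1 for x, y in agent if c(x) != y) / len(agent)

def no_profitable_manipulation(mechanism: Callable[[List[AgentSample]], Concept],
                               agents: List[AgentSample]) -> bool:
    """Check the SP condition at the profile `agents`: no single agent can
    lower its true private risk by relabeling its own points."""
    truthful_c = mechanism(agents)
    for i, agent in enumerate(agents):
        truthful_risk = private_risk(truthful_c, agent)
        points = [x for x, _ in agent]
        # Enumerate every possible report S_i' of agent i (all relabelings).
        for fake_labels in product([-1, +1], repeat=len(points)):
            report = agents[:i] + [list(zip(points, fake_labels))] + agents[i + 1:]
            if private_risk(mechanism(report), agent) < truthful_risk:
                return False  # found a profitable manipulation
    return True
```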

16 Restricted settings. A very small concept class: |C| = 2.
– There is a deterministic SP mechanism that obtains a 3-approximation ratio
– This bound is tight
– Randomization can improve the bound to 2
R. Meir, A. D. Procaccia and J. S. Rosenschein, Incentive Compatible Classification under Constant Hypotheses: A Tale of Two Functions, AAAI 2008

17 Restricted settings (cont.). Agents with similar interests: there is a randomized SP 3-approximation mechanism (works for any class C). R. Meir, A. D. Procaccia and J. S. Rosenschein, Incentive Compatible Classification with Shared Inputs, IJCAI 2009.

18 But not everything shines… Without restrictions on the input, we cannot guarantee a constant approximation ratio. Our main result is the following theorem: there is a concept class C for which no deterministic SP mechanism has an o(m)-approximation ratio.

19 Deterministic lower bound. Proof idea:
– First, construct a classification problem that is equivalent to a voting problem with 3 candidates
– Then, use the Gibbard-Satterthwaite theorem to show that there must be a dictator
– Finally, the dictator's opinion may be very far from the optimal classification

20 Proof (1). Construction: we have X = {a, b} and 3 classifiers (given by a table on the slide; later slides call them c_a, c_b and c_ab). The dataset contains two types of agents, with samples distributed unevenly over a and b. We do not set the labels; instead, we denote by Y all the possible labelings of an agent's dataset.

21 Proof (2). Let P be the set of all 6 orders over C. A voting rule is a function of the form f : P^n → C, but our mechanism is a function M : Y^n → C (its input is labels, not orders)! Lemma 1: there is a valid mapping g : P^n → Y^n such that M ∘ g is a voting rule.

22 Proof (3). Lemma 2: if M is SP and guarantees any bounded approximation ratio, then f = M ∘ g is dictatorial. Proof: (f is onto) any profile that some classifier c classifies perfectly must induce the selection of c. (f is SP) suppose there were a manipulation of f; by mapping that profile to labels with g, we would obtain a manipulation of M, contradicting its SP. By the Gibbard-Satterthwaite theorem, f must be dictatorial.

23 Proof (4). Finally, f (and thus M) can only be dictatorial. Assume w.l.o.g. that the dictator is agent 1, of type I_a. We now label the data points as shown on the slide: the optimal classifier is c_ab, which makes 2 errors, while the dictator selects c_a, which makes m/2 errors.

24 Real concept classes. We have shown that there are no good (deterministic) SP mechanisms, but only for a synthetically constructed class. We are interested in more common classes that are actually used in machine learning, for example linear classifiers and Boolean conjunctions.

25 Linear classifiers. [Figure: the construction adapted to linear classifiers, with regions labeled "a" and "b" and classifiers c_a, c_b, c_ab; one candidate classifier makes only 2 errors while another makes Ω(√m) errors.]

26 A lower bound for randomized SP mechanisms. A lottery over dictatorships is still bad: Ω(k) instead of Ω(m), where k is the size of the largest dataset controlled by a single agent (m ≈ k∙n). However, it is not clear how to rule out other randomized mechanisms: G-S applies only to deterministic mechanisms, and another theorem by Gibbard ['79] can help, but only under additional assumptions.

27 Upper bounds. Our lower bounds do not leave much hope for good SP mechanisms, but we would still like to know whether they are tight. A deterministic SP O(m)-approximation is easy: break ties iteratively according to dictators (see the sketch below). What about randomized SP O(k) mechanisms?
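One plausible reading of "break ties iteratively according to dictators", sketched for a finite concept class (Python). The slide gives no pseudocode, so the exact procedure, the fixed agent ordering, and all names below are assumptions for illustration:

```python
from typing import Callable, List, Tuple

Concept = Callable[[float], int]
AgentSample = List[Tuple[float, int]]

def errors(c: Concept, agent: AgentSample) -> int:
    return sum(1 for x, y in agent if c(x) != y)

def serial_tie_breaking(concepts: List[Concept],
                        agents: List[AgentSample]) -> Concept:
    """Agent 1 keeps only the concepts minimizing its own errors; agent 2
    breaks the remaining ties, and so on.  Later agents only choose among
    concepts that earlier agents are indifferent between, which is the
    intuition behind strategy-proofness here."""
    candidates = list(concepts)
    for agent in agents:                      # fixed, publicly known order
        best = min(errors(c, agent) for c in candidates)
        candidates = [c for c in candidates if errors(c, agent) == best]
        if len(candidates) == 1:
            break
    return candidates[0]
```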

28-32 The iterative random dictator (IRD), illustrated with linear classifiers on R^1. [Figure, built up over slides 28-32: successive iterations of the mechanism on a one-dimensional dataset; iteration 1 makes 2 errors, iteration 2 makes 5 errors, iterations 3 and 4 make 0 errors, and iteration 5 makes 1 error.] Theorem: the IRD is O(k^2)-approximating for linear classifiers in R^1.

33 Future work:
– Other concept classes
– Other loss functions
– Alternative assumptions on the structure of the data
– Other models of strategic behavior
– …
