
1 Strategy-Proof Classification. Reshef Meir, School of Computer Science and Engineering, Hebrew University. Joint work with Ariel D. Procaccia and Jeffrey S. Rosenschein.

2 Outline (~12 minutes):
– An example of strategic labels in classification
– Motivation
– Our model
– Previous work (positive results)
– An impossibility theorem
– More results (if there is time)

3 Strategic labeling: an example. [Figure: a labeled dataset on which the ERM classifier makes 5 errors.]

4 There is a better classifier! (for me…)

5 If I only change the labels… [Figure: with the manipulated labels, the selected classifier makes 2+4 = 6 errors.]

6 Classification. The supervised classification problem:
– Input: a set of labeled data points {(x_i, y_i)}_{i=1..m}
– Output: a classifier c from some predefined concept class C (functions of the form f : X → {-,+})
– We usually want c not only to classify the sample correctly, but to generalize well, i.e. to minimize R(c) ≡ E_{(x,y)~D}[c(x) ≠ y], the expected number of errors w.r.t. the distribution D
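For intuition, the generalization risk R(c) can be estimated by sampling from D. Below is a minimal Monte Carlo sketch in Python; the distribution D used here is made up purely for illustration and is not from the slides.

```python
import random

def risk_estimate(c, sample_d, n=100_000):
    """Monte Carlo estimate of R(c) = E_{(x,y)~D}[c(x) != y]."""
    errors = 0
    for _ in range(n):
        x, y = sample_d()
        if c(x) != y:
            errors += 1
    return errors / n

# A made-up distribution D (an assumption of this example): x is uniform on
# [0, 1], the true label is the sign of x - 0.5, and labels are flipped with
# probability 0.1.
def sample_d():
    x = random.random()
    y = +1 if x >= 0.5 else -1
    if random.random() < 0.1:
        y = -y
    return x, y

c = lambda x: +1 if x >= 0.5 else -1
print(risk_estimate(c, sample_d))  # roughly 0.10, the label-noise rate
```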

7 Classification (cont.). A common approach is to return the ERM, i.e. the concept in C that is best w.r.t. the given samples (has the lowest number of errors). This generalizes well under some assumptions on the concept class C. With multiple experts, we can't trust our ERM!
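A minimal sketch of ERM over a finite concept class (Python; the threshold concepts and all names below are assumptions chosen for illustration):

```python
from typing import Callable, List, Tuple

# A concept maps a point to a label in {-1, +1}.
Concept = Callable[[float], int]
Sample = List[Tuple[float, int]]

def empirical_errors(c: Concept, sample: Sample) -> int:
    """Number of sample points that c labels incorrectly."""
    return sum(1 for x, y in sample if c(x) != y)

def erm(concepts: List[Concept], sample: Sample) -> Concept:
    """Empirical Risk Minimization: the concept with the fewest sample errors."""
    return min(concepts, key=lambda c: empirical_errors(c, sample))

# Example: threshold classifiers on the real line.
thresholds = [0.0, 0.25, 0.5, 0.75, 1.0]
concepts = [lambda x, t=t: +1 if x >= t else -1 for t in thresholds]
sample = [(0.1, -1), (0.3, -1), (0.6, +1), (0.9, +1)]
best = erm(concepts, sample)
print(empirical_errors(best, sample))  # 0 errors on this sample
```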

8 Where do we find "experts" with incentives? Example 1: a firm learning purchase patterns.
– Information is gathered from local retailers
– The resulting policy affects them
– "The best policy is the policy that fits my pattern"

9 Example 2: Internet polls / expert systems. [Figure: users → reported dataset → classification algorithm → classifier.]

10 Related work:
– A study of SP mechanisms in regression learning: O. Dekel, F. Fischer and A. D. Procaccia, Incentive Compatible Regression Learning, SODA 2008
– No SP mechanisms for clustering: J. Perote-Peña and J. Perote, The Impossibility of Strategy-Proof Clustering, Economics Bulletin, 2003

11 A problem instance is defined by:
– A set of agents I = {1,...,n}
– A partial dataset for each agent i ∈ I: X_i = {x_i1,...,x_i,m(i)} ⊆ X
– For each x_ik ∈ X_i, agent i has a label y_ik ∈ {-,+}; each pair s_ik = ⟨x_ik, y_ik⟩ is an example
– All examples of a single agent compose the labeled dataset S_i = {s_i1,...,s_i,m(i)}
– The joint dataset S = ⟨S_1, S_2,…, S_n⟩ is our input; m = |S|
– We denote the dataset with the reported labels by S′
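A rough sketch of how such an instance could be represented in code (Python; the class and field names are assumptions for illustration, not from the paper):

```python
from dataclasses import dataclass
from typing import List

@dataclass
class AgentData:
    """The labeled dataset S_i of a single agent."""
    points: List[float]  # X_i = {x_i1, ..., x_i,m(i)}
    labels: List[int]    # y_ik in {-1, +1}, one label per point

# The joint dataset S = <S_1, ..., S_n>.  The reported dataset S' has the
# same points but possibly different (manipulated) labels.
S = [
    AgentData(points=[0.1, 0.4, 0.8], labels=[-1, -1, +1]),  # agent 1
    AgentData(points=[0.2, 0.9],      labels=[+1, +1]),      # agent 2
]
m = sum(len(a.points) for a in S)  # m = |S| = 5
```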

12 Input: an example. [Figure: three agents; each agent i contributes points X_i ∈ X^{m_i} with labels Y_i ∈ {-,+}^{m_i}, and the joint dataset is S = ⟨S_1, S_2,…, S_n⟩ = ⟨(X_1,Y_1),…, (X_n,Y_n)⟩.]

13 Incentives and mechanisms:
– A mechanism M receives a labeled dataset S′ and outputs c ∈ C
– Private risk of agent i: R_i(c, S) = |{k : c(x_ik) ≠ y_ik}| / m_i
– Global risk: R(c, S) = |{(i,k) : c(x_ik) ≠ y_ik}| / m
– We allow non-deterministic mechanisms: the outcome is a random variable and we measure the expected risk
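The two risk measures translate directly into code. A small illustrative sketch (Python), using a simplified per-agent representation that is an assumption of this example:

```python
from typing import Callable, List, Tuple

Concept = Callable[[float], int]
AgentSample = List[Tuple[float, int]]  # one agent's (x_ik, y_ik) pairs

def private_risk(c: Concept, agent: AgentSample) -> float:
    """R_i(c, S): fraction of agent i's own points that c misclassifies."""
    return sum(1 for x, y in agent if c(x) != y) / len(agent)

def global_risk(c: Concept, agents: List[AgentSample]) -> float:
    """R(c, S): fraction of all points, over all agents, that c misclassifies."""
    errors = sum(1 for agent in agents for x, y in agent if c(x) != y)
    m = sum(len(agent) for agent in agents)
    return errors / m

# Example with a single threshold classifier at 0.5:
c = lambda x: +1 if x >= 0.5 else -1
agents = [[(0.1, -1), (0.4, +1)], [(0.8, +1)]]
print(private_risk(c, agents[0]))  # 0.5 (one of agent 1's two points is wrong)
print(global_risk(c, agents))      # 1/3
```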

14 We compare the outcome of M to the ERM: c* = ERM(S) = argmin_{c ∈ C} R(c, S), and r* = R(c*, S). Can our mechanism simply compute and return the ERM?

15 Requirements (most important slide):
1. Good approximation: ∀S, R(M(S), S) ≤ β∙r*
2. Strategy-proofness (SP): ∀i, S, S_i′: R_i(M(S_{-i}, S_i′), S) ≥ R_i(M(S), S)
ERM(S) is 1-approximating but not SP; ERM(S_1) is SP but gives a bad approximation. Are there any mechanisms that guarantee both SP and good approximation?
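On a tiny instance, the SP condition at a given profile S can be checked by brute force: for every agent and every possible relabeling of that agent's points, misreporting must not lower the agent's true private risk. A minimal sketch for deterministic mechanisms (Python; the mechanism interface and all names are assumptions for illustration, and full SP would also quantify over all profiles S):

```python
from itertools import product
from typing import Callable, List, Tuple

Concept = Callable[[float], int]
AgentSample = List[Tuple[float, int]]

def private_risk(c: Concept, agent: AgentSample) -> float:
    return sum(1 for x, y in agent if c(x) != y) / len(agent)

def no_profitable_manipulation(mechanism: Callable[[List[AgentSample]], Concept],
                               agents: List[AgentSample]) -> bool:
    """Check the SP condition at the profile `agents`: no single agent can
    lower its true private risk by relabeling its own points."""
    truthful_c = mechanism(agents)
    for i, agent in enumerate(agents):
        truthful_risk = private_risk(truthful_c, agent)
        points = [x for x, _ in agent]
        # Enumerate every possible report S_i' of agent i (all relabelings).
        for fake_labels in product([-1, +1], repeat=len(points)):
            report = agents[:i] + [list(zip(points, fake_labels))] + agents[i + 1:]
            if private_risk(mechanism(report), agent) < truthful_risk:
                return False  # found a profitable manipulation
    return True
```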

16 Restricted settings. A very small concept class: |C| = 2.
– There is a deterministic SP mechanism that obtains a 3-approximation ratio
– This bound is tight
– Randomization can improve the bound to 2
R. Meir, A. D. Procaccia and J. S. Rosenschein, Incentive Compatible Classification under Constant Hypotheses: A Tale of Two Functions, AAAI 2008

17 Restricted settings (cont.). Agents with similar interests: there is a randomized SP 3-approximation mechanism (works for any class C). R. Meir, A. D. Procaccia and J. S. Rosenschein, Incentive Compatible Classification with Shared Inputs, IJCAI 2009.

18 But not everything shines… Without restrictions on the input, we cannot guarantee a constant approximation ratio. Our main result is the following theorem: there is a concept class C for which no deterministic SP mechanism has an o(m)-approximation ratio.

19 Deterministic lower bound. Proof idea:
– First, construct a classification problem that is equivalent to a voting problem with 3 candidates
– Then, use the Gibbard-Satterthwaite theorem to show that there must be a dictator
– Finally, the dictator's opinion may be very far from the optimal classification

20 Proof (1). Construction: we have X = {a, b} and 3 classifiers (given by a table on the slide; later slides call them c_a, c_b and c_ab). The dataset contains two types of agents, with samples distributed unevenly over a and b. We do not set the labels; instead, we denote by Y all the possible labelings of an agent's dataset.

21 Proof (2). Let P be the set of all 6 orders over C. A voting rule is a function of the form f : P^n → C, but our mechanism is a function M : Y^n → C (its input is labels, not orders)! Lemma 1: there is a valid mapping g : P^n → Y^n such that M ∘ g is a voting rule.

22 Proof (3). Lemma 2: if M is SP and guarantees any bounded approximation ratio, then f = M ∘ g is dictatorial. Proof: (f is onto) any profile that some classifier c classifies perfectly must induce the selection of c. (f is SP) suppose there were a manipulation of f; by mapping that profile to labels with g, we would obtain a manipulation of M, contradicting its SP. By the Gibbard-Satterthwaite theorem, f must be dictatorial.

23 Proof (4). Finally, f (and thus M) can only be dictatorial. Assume w.l.o.g. that the dictator is agent 1, of type I_a. We now label the data points as shown on the slide: the optimal classifier is c_ab, which makes 2 errors, while the dictator selects c_a, which makes m/2 errors.

24 Real concept classes. We have shown that there are no good (deterministic) SP mechanisms, but only for a synthetically constructed class. We are interested in more common classes that are actually used in machine learning, for example linear classifiers and Boolean conjunctions.

25 Linear classifiers. [Figure: the construction adapted to linear classifiers, with regions labeled "a" and "b" and classifiers c_a, c_b, c_ab; one candidate classifier makes only 2 errors while another makes Ω(√m) errors.]

26 A lower bound for randomized SP mechanisms. A lottery over dictatorships is still bad: Ω(k) instead of Ω(m), where k is the size of the largest dataset controlled by a single agent (m ≈ k∙n). However, it is not clear how to rule out other randomized mechanisms: G-S applies only to deterministic mechanisms, and another theorem by Gibbard ['79] can help, but only under additional assumptions.

27 Upper bounds. Our lower bounds do not leave much hope for good SP mechanisms, but we would still like to know whether they are tight. A deterministic SP O(m)-approximation is easy: break ties iteratively according to dictators (see the sketch below). What about randomized SP O(k) mechanisms?
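One plausible reading of "break ties iteratively according to dictators", sketched for a finite concept class (Python). The slide gives no pseudocode, so the exact procedure, the fixed agent ordering, and all names below are assumptions for illustration:

```python
from typing import Callable, List, Tuple

Concept = Callable[[float], int]
AgentSample = List[Tuple[float, int]]

def errors(c: Concept, agent: AgentSample) -> int:
    return sum(1 for x, y in agent if c(x) != y)

def serial_tie_breaking(concepts: List[Concept],
                        agents: List[AgentSample]) -> Concept:
    """Agent 1 keeps only the concepts minimizing its own errors; agent 2
    breaks the remaining ties, and so on.  Later agents only choose among
    concepts that earlier agents are indifferent between, which is the
    intuition behind strategy-proofness here."""
    candidates = list(concepts)
    for agent in agents:                      # fixed, publicly known order
        best = min(errors(c, agent) for c in candidates)
        candidates = [c for c in candidates if errors(c, agent) == best]
        if len(candidates) == 1:
            break
    return candidates[0]
```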

28-32 The iterative random dictator (IRD), illustrated with linear classifiers on R^1. [Figure, built up over slides 28-32: successive iterations of the mechanism on a one-dimensional dataset; iteration 1 makes 2 errors, iteration 2 makes 5 errors, iterations 3 and 4 make 0 errors, and iteration 5 makes 1 error.] Theorem: the IRD is O(k^2)-approximating for linear classifiers in R^1.

33 Future work:
– Other concept classes
– Other loss functions
– Alternative assumptions on the structure of the data
– Other models of strategic behavior
– …
