Published by Issac Walcott. Modified about 1 year ago.

1
Learning with continuous experts using Drifting Games
Work with Robert E. Schapire, Princeton University

2
The online expert game
Players: N experts, Nature, Master.
Each round: Master: [...]  Expert i: [...]  Master: [...]  Nature: [...]
Repeats indefinitely.
Master's goal: minimize total number of mistakes.
Constraint: some expert makes at most k mistakes.

3
Continuous Experts
Continuous experts predict real numbers in [-1,+1].
Interpretation: [...]
Given y, expected number of mistakes or loss: [...]
Constraint: some expert with loss at most k.
Figure: predicts [...] w.p. [...], [...] w.p. 0.2; mistake w.p. 0.8.
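The interpretation formulas on this slide were lost. A minimal sketch of the usual reading follows, assuming a prediction x in [-1,+1] means "predict +1 with probability (1+x)/2", so the expected number of mistakes against a label y in {-1,+1} is |x - y|/2. The function names are hypothetical, and x = 0.6 is chosen only because it reproduces the 0.2 / 0.8 figures that survive in the transcript.

```python
def prob_predict_plus(x):
    """Probability that a continuous prediction x in [-1, +1] is read as +1.

    Assumption: x is interpreted as a randomized binary prediction.
    """
    if not -1.0 <= x <= 1.0:
        raise ValueError("prediction must lie in [-1, +1]")
    return (1.0 + x) / 2.0


def expected_loss(x, y):
    """Expected number of mistakes of prediction x against label y in {-1, +1}."""
    if y not in (-1, +1):
        raise ValueError("label must be -1 or +1")
    if not -1.0 <= x <= 1.0:
        raise ValueError("prediction must lie in [-1, +1]")
    return abs(x - y) / 2.0


# x = 0.6 predicts +1 w.p. 0.8 and -1 w.p. 0.2; against y = -1 it errs w.p. 0.8.
p_plus = prob_predict_plus(0.6)
loss = expected_loss(0.6, -1)
```

Under this reading a binary expert (x in {-1,+1}) has loss 0 or 1, and an abstaining prediction x = 0 has loss 1/2 whatever the label is.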

4
Previous work
Exp weights: regret [...]
For binary experts, BW improves this to [...]
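For readers unfamiliar with the exponential-weights baseline, here is a minimal sketch of the deterministic Weighted Majority rule for binary experts. It illustrates the family of algorithms the slide refers to; it is not the exact variant or bound from the talk. The function name, the discount `beta`, and the toy data are assumptions.

```python
def weighted_majority(expert_predictions, labels, beta=0.5):
    """Run deterministic Weighted Majority and return the master's mistake count.

    expert_predictions: list of rounds, each a list of +1/-1 votes (one per expert).
    labels: list of +1/-1 outcomes, one per round.
    beta: multiplicative penalty in (0, 1) applied to experts that err.
    """
    n_experts = len(expert_predictions[0])
    weights = [1.0] * n_experts
    master_mistakes = 0
    for votes, label in zip(expert_predictions, labels):
        # Predict the sign of the weighted vote (ties broken toward +1).
        weighted_vote = sum(w * v for w, v in zip(weights, votes))
        guess = 1 if weighted_vote >= 0 else -1
        if guess != label:
            master_mistakes += 1
        # Exponentially discount every expert that was wrong this round.
        weights = [w * beta if v != label else w for w, v in zip(weights, votes)]
    return master_mistakes


# Toy run: expert 0 is always right, experts 1 and 2 are always wrong.
rounds = [[+1, -1, -1] for _ in range(10)]
outcomes = [+1] * 10
mistakes = weighted_majority(rounds, outcomes)
```

In the toy run the master errs only in the first round, after which the perfect expert outweighs the other two.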

5
Our results
Similar improvements for more general experts
Abstaining experts: make -1,0,+1 predictions
Continuous experts: make predictions in [-1,+1]
For integral 2k, same regret as abstaining experts
For general 2k, an additional additive constant
Lower bound constructions
For [...], tight up to an additive [...] term

6
Outline
Approximate game value
Derive master strategy
Approximation is tight
Future work

7
Game Value
Natural approach: compute minimax value of state
State: cumulative losses
Solved by Dynamic Programming on states.
Exponentially large state space.
Need approximation
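The exact dynamic program described above can be sketched for tiny instances. The code below is an illustrative brute force under stated assumptions: binary experts, a deterministic master, and a state given by the sorted mistake counts of the experts still within budget k. The name `game_value` is hypothetical. The loop over all 2^N vote patterns makes the exponential cost concrete.

```python
from functools import lru_cache
from itertools import product


def game_value(n_experts, k):
    """Minimax number of master mistakes when some expert errs at most k times."""
    if n_experts < 1:
        raise ValueError("need at least one expert")

    @lru_cache(maxsize=None)
    def value(state):
        # state: sorted mistake counts of experts that are still within budget.
        def step(votes, label):
            nxt = (m + (v != label) for m, v in zip(state, votes))
            return tuple(sorted(m for m in nxt if m <= k))

        best = 0
        for votes in product((+1, -1), repeat=len(state)):
            if len(set(votes)) == 1:
                # Unanimous vote: the master follows it. Nature may contradict
                # everyone only if some expert stays within budget afterwards.
                nxt = step(votes, -votes[0])
                if nxt:
                    best = max(best, 1 + value(nxt))
                continue
            # Split vote: the master picks the prediction whose worst case
            # (over nature's label) is smallest.
            worst_per_prediction = [
                max((y != label) + value(step(votes, label)) for label in (+1, -1))
                for y in (+1, -1)
            ]
            best = max(best, min(worst_per_prediction))
        return best

    return value(tuple([0] * n_experts))


halving_value = game_value(4, 0)
```

With k = 0 the recursion reproduces the halving argument: four experts, one of them perfect, force exactly two master mistakes.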

8
Approximating Game Value
Approximation: assume experts are distributed.
State space drops to O(N) from [...].
But distributed experts are less adversarial!
Solution: allow experts to make random choices.
Figure labels: Random, Colluding, Non-random, Distributed, Large no. of experts

9
Potential for random expert
One random expert, fixed number t of rounds.
Recall: master makes mistake in each round
Q: How much loss can one force the expert to suffer?
[...] ~ Prob {optimal random expert can be forced to suffer more than (k-s) loss in t rounds}.
Expected number of violating experts ~ [...] (state-potential)
Upper bound: max t s.t. potential < 1.

10
The OS potential
Base case: [...]
Intuition: Optimal random expert is balanced.
Recurrence: [...]
Optimal expert insensitive to master's response.

11
Derive master strategy
Dual form captures notion of balance.
A measure of expert-bias ~ [...]
State-potential drops by above quantity if: [...]
For optimal random expert: [...]
Upper bound: [...]

12
The online expert game
Players: N experts, Nature, Master.
Each round: Master: [...]  Expert i: [...]  Master: [...]  Nature: [...]
Repeats T times
Master's goal: min. regret
Expert mistake bound of k

13
Computing [...]
Binary experts: {-1,+1}
Randomly choose -1 or +1
Recovers BW potential
Abstaining experts: {-1,0,+1}
Play 0 or {-1,+1} at random
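The slide's own expression for the potential is not preserved. As a hedged illustration of the binary case, the sketch below uses the standard Binomial Weights form: an expert that flips a fair coin each round, while the master is wrong every round, stays within a remaining budget of b mistakes over t rounds with probability P[Bin(t, 1/2) <= b]. The resulting mistake bound is the largest t for which N times that probability is at least 1. Function names are hypothetical, and the sign convention may differ from the one used in the talk.

```python
from math import comb


def survival_probability(t, budget):
    """P[Bin(t, 1/2) <= budget]: a coin-flipping expert stays within budget."""
    if budget < 0:
        return 0.0
    return sum(comb(t, i) for i in range(min(budget, t) + 1)) / 2 ** t


def bw_mistake_bound(n_experts, k):
    """Largest t such that n_experts * P[Bin(t, 1/2) <= k] >= 1.

    Integer arithmetic is used so the comparison is exact.
    """
    t = 0
    while n_experts * sum(comb(t + 1, i) for i in range(min(k, t + 1) + 1)) >= 2 ** (t + 1):
        t += 1
    return t


bound_four_experts = bw_mistake_bound(4, 0)
```

For k = 0 this reduces to floor(log2 N), the halving bound, which is one sanity check on the form assumed here.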

14
Computing [...] for continuous experts
y ranges over infinitely many values [-1,+1]!
Technical lemma: If [...] is piecewise convex, continuous, with pieces breaking at integers, then same holds for [...].

15
Computing [...] for continuous experts
Compute potential by Dynamic Programming: [...] (Piecewise convex in y)

16
Figure labels: s-1, s, s+1, s+y, window, Integer

17
Computing [...] for continuous experts
Compute potential by Dynamic Programming: [...] (Piecewise convex in y)
For integral s and 2k, optimal y in {-1,0,+1}
Same bounds for continuous and abstaining cases!

18
Outline
Approximate game value
Derive master strategy
Approximation is tight
Future work

19
Approximation is tight
Recall: [...]
Sample: [...]
Drop in potential: [...]
Lipschitz: [...], yields: [...]
Hoeffding bounds to get small expert-bias, preserving potential.
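The slide invokes Hoeffding bounds without stating them. For reference, the standard inequality is given below for independent random variables X_1, ..., X_n with X_i in [a_i, b_i] and S_n = X_1 + ... + X_n. How the talk instantiates it (which variables are sampled and what deviation is used) is not recoverable from the transcript, so the symbols here are generic.

```latex
\Pr\bigl[\,\lvert S_n - \mathbb{E}[S_n] \rvert \ge \varepsilon \,\bigr]
  \;\le\; 2 \exp\!\left( - \frac{2\varepsilon^{2}}{\sum_{i=1}^{n} (b_i - a_i)^{2}} \right),
  \qquad \varepsilon > 0 .
```

When every X_i lies in [-1,+1], the denominator is 4n, so deviations of order sqrt(n log(1/delta)) occur with probability at most delta.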

20
Improved lower bounds
Sampling analysis relies on approx. bounds
De-randomize: take balanced partition of [...]
Abstaining case: balance with [...]
Most balanced partition has difference [...]
Second observation: [...] decays with t
Reduces gap from [...] to [...]

21
Conclusion and future work
Ideas based on Schapire's drifting game.
Close gap of O(log k) to show abstaining experts as powerful as continuous ones.
Boosting with confidence-rated predictors.
Optimal algorithms for multi-class boosting.

22
Thank you
