1 Learning with continuous experts using Drifting Games
Work with Robert E. Schapire, Princeton University

2 The online expert game
• Master:
• Expert i:
• Master:
• Nature:
• Repeats indefinitely.
• Master’s goal: minimize total number of mistakes.
• Constraint: some expert makes at most k mistakes.
[Figure: N Experts, Master, Nature; example predictions +1, +1]

3 Continuous Experts
• Continuous experts predict real numbers in [-1,+1].
• Interpretation: an expert that predicts 0.6 predicts +1 w.p. 0.8 and -1 w.p. 0.2 (in the example shown: mistake w.p. 0.8).
• Given y, expected number of mistakes or loss:
• Constraint: some expert with loss at most k.
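
The interpretation above can be sketched in a few lines. This is a minimal illustration, not the slide's own formula (which did not survive in the transcript): it assumes a prediction x in [-1,+1] is read as "+1 with probability (1+x)/2", which matches the 0.6 → 0.8 / 0.2 example. The names `prob_plus_one` and `expected_loss` are hypothetical.

```python
def prob_plus_one(prediction):
    """Probability that an expert predicting `prediction` in [-1, +1] says +1.

    Assumed mapping (consistent with the slide's 0.6 -> 0.8 example).
    """
    if not -1.0 <= prediction <= 1.0:
        raise ValueError("prediction must lie in [-1, +1]")
    return (1.0 + prediction) / 2.0


def expected_loss(prediction, outcome):
    """Expected number of mistakes against a binary outcome in {-1, +1}."""
    if outcome not in (-1, 1):
        raise ValueError("outcome must be -1 or +1")
    # The randomized expert errs exactly when its +1/-1 draw differs
    # from the outcome.
    p = prob_plus_one(prediction)
    return 1.0 - p if outcome == 1 else p


if __name__ == "__main__":
    print(prob_plus_one(0.6))      # roughly 0.8
    print(expected_loss(0.6, -1))  # roughly 0.8: mistake whenever it says +1
```

Under this reading the slide's "mistake w.p. 0.8" corresponds to an outcome of -1, and the loss equals half the distance between prediction and outcome.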

4 Previous work
• Exp weights: regret
• For binary experts, BW improves this to
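
For readers unfamiliar with the exponential-weights baseline named above, here is a minimal sketch of one round in its textbook form. It is not taken from the slides: the learning rate `eta`, the loss |x - y| / 2, and the function names are assumptions made for illustration.

```python
import math


def exp_weights_predict(weights, predictions):
    """Master's prediction: weighted average of expert predictions in [-1, +1]."""
    total = sum(weights)
    return sum(w * x for w, x in zip(weights, predictions)) / total


def exp_weights_update(weights, predictions, outcome, eta):
    """Multiply each weight by exp(-eta * loss), with assumed loss |x - y| / 2."""
    return [
        w * math.exp(-eta * abs(x - outcome) / 2.0)
        for w, x in zip(weights, predictions)
    ]


if __name__ == "__main__":
    weights = [1.0, 1.0, 1.0]
    predictions = [1.0, -1.0, 0.2]
    print(exp_weights_predict(weights, predictions))
    weights = exp_weights_update(weights, predictions, outcome=1, eta=0.5)
    print(weights)  # the expert that predicted -1 loses the most weight
```

Experts that incur more loss are down-weighted exponentially, so the master's average drifts toward the experts that have done well so far.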

5 Our results
• Similar improvements for more general experts
• Abstaining experts: make -1,0,+1 predictions
• Continuous experts: make predictions in [-1,+1]
• For integral 2k, same regret as abstaining experts
• For general 2k, an additional additive constant
• Lower bound constructions
• For [...], tight up to an additive [...] term

6 Outline
• Approximate game value
• Derive master strategy
• Approximation is tight
• Future work

7 Game Value
• Natural approach: compute minimax value of state
• State: cumulative losses
• Solved by Dynamic Programming on states.
• Exponentially large state space. Need approximation

8 Approximating Game Value
• Approximation: assume experts are distributed.
• State space drops to O(N) from [...].
• But distributed experts are less adversarial!
• Solution: allow experts to make random choices.
[Figure labels: Random, Colluding, Non-random, Distributed, Large no. of experts]

9 Potential for random expert
• One random expert, fixed number t of rounds.
• Recall: master makes mistake in each round
• Q: How much loss can one force the expert to suffer?
• [...] ~ Prob {optimal random expert can be forced to suffer more than (k-s) loss in t rounds}.
• Expected number of violating experts ~ [...] (state-potential)
• Upper bound: max t s.t. potential < 1.

10 The OS potential
• Base case:
• Intuition: Optimal random expert is balanced.
• Recurrence:
• Optimal expert insensitive to master’s response.

11 Derive master strategy
• Dual form captures notion of balance.
• A measure of expert-bias ~
• State-potential drops by above quantity if:
• For optimal random expert:
• Upper bound:

12 The online expert game
• Master:
• Expert i:
• Master:
• Nature:
• Repeats T times
• Master’s goal: min. regret
• Expert mistake bound of k
[Figure: N Experts, Master, Nature; example values +1, +1, 0.1, 0.5, 0.2, 0.1]

13 Computing [...]
• Binary experts: {-1,+1}
• Randomly choose -1 or +1
• Recovers BW potential
• Abstaining experts: {-1,0,+1}
• Play 0 or {-1,+1} at random
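
The binary case can be illustrated with a small dynamic program. The slide's own recurrence is missing from the transcript, so the form below is an assumption: a random expert that picks -1 or +1 uniformly errs with probability 1/2 in each round, and the potential is taken to be the probability that an expert with s mistakes so far ends with more than k after t more rounds. The names `violation_potential` and `binomial_tail` are hypothetical.

```python
from functools import lru_cache
from math import comb


@lru_cache(maxsize=None)
def violation_potential(t, s, k):
    """Chance that an expert with s mistakes exceeds k after t more rounds."""
    if t == 0:
        # Assumed base case: the expert has violated the bound iff s > k.
        return 1.0 if s > k else 0.0
    # Balanced random expert: errs with probability 1/2 in each round.
    stay = violation_potential(t - 1, s, k)
    step = violation_potential(t - 1, s + 1, k)
    return 0.5 * stay + 0.5 * step


def binomial_tail(t, m):
    """P[Binomial(t, 1/2) > m], the closed form of the recurrence above."""
    return sum(comb(t, j) for j in range(max(m + 1, 0), t + 1)) / 2 ** t


if __name__ == "__main__":
    t, s, k = 10, 2, 5
    print(violation_potential(t, s, k))
    print(binomial_tail(t, k - s))  # same value, computed in closed form
```

Under these assumptions the recurrence collapses to a binomial tail, which is the kind of quantity that binomial-weights style analyses work with.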

14 Computing [...] for continuous experts
• y ranges over infinitely many values [-1,+1]!
• Technical lemma: If [...] is piecewise convex, continuous, with pieces breaking at integers, then the same holds for [...].

15 Computing [...] for continuous experts
• Compute potential by Dynamic Programming (Piecewise convex in y)

16 [Figure: axis marks s-1, s, s+1 and s+y; labels: window, Integer]

17 Computing [...] for continuous experts
• Compute potential by Dynamic Programming (Piecewise convex in y)
• For integral s and 2k, optimal y in {-1,0,+1}
• Same bounds for continuous and abstaining cases!

18 Outline
• Approximate game value
• Derive master strategy
• Approximation is tight
• Future work

19 Approximation is tight
• Recall:
• Sample:
• Drop in potential:
• Lipschitz: [...], yields: [...]
• Hoeffding bounds to get small expert-bias, preserving potential.
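
The Hoeffding step can be made concrete with a short calculation. This sketch uses the standard inequality for N independent variables in [-1,+1]: their average deviates from its mean by at least eps with probability at most 2·exp(-N·eps²/2). Treating the expert-bias as such an average of independently sampled predictions is an assumption, and `hoeffding_bias_bound` is a hypothetical name.

```python
import math


def hoeffding_bias_bound(n_experts, eps):
    """Upper bound on P[|average - mean| >= eps] for values in [-1, +1]."""
    if n_experts <= 0 or eps <= 0:
        raise ValueError("n_experts and eps must be positive")
    # Hoeffding with range width 2 per variable: 2 * exp(-N * eps^2 / 2).
    # Capped at 1 because it is a probability bound.
    return min(1.0, 2.0 * math.exp(-n_experts * eps ** 2 / 2.0))


if __name__ == "__main__":
    for n in (100, 800, 5000):
        print(n, hoeffding_bias_bound(n, 0.1))
```

The bound shrinks exponentially in the number of sampled experts, which is why a large pool keeps the bias small with high probability.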

20 Improved lower bounds
• Sampling analysis relies on approx. bounds
• De-randomize: take balanced partition of [...]
• Abstaining case: balance with [...]
• Most balanced partition has difference [...]
• Second observation: [...] decays with t
• Reduces gap from [...] to [...]

21 Conclusion and future work
• Ideas based on Schapire’s drifting game.
• Close gap of O(log k) to show abstaining experts as powerful as continuous ones.
• Boosting with confidence rated predictors.
• Optimal algorithms for multi-class boosting.

22 Thank you
