# Learning with continuous experts using Drifting Games (work with Robert E. Schapire, Princeton University)

## Presentation transcript

1 Learning with continuous experts using Drifting Games (work with Robert E. Schapire, Princeton University)

2 The online expert game

- Master: [...]
- Expert i: [...]
- Master: [...]
- Nature: [...]
- Repeats indefinitely.
- Master’s goal: minimize total number of mistakes.
- Constraint: some expert makes at most k mistakes.

[Figure: N Experts, Nature, Master; labels +1, +1]

3 Continuous Experts

- Continuous experts predict real numbers in [-1,+1].
- Interpretation: an expert that predicts 0.6 is read as predicting +1 w.p. 0.8 and -1 w.p. 0.2, so it makes a mistake w.p. 0.8 when the outcome is -1.
- Given y, expected number of mistakes or loss: [...]
- Constraint: some expert with loss at most k.
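The interpretation on this slide can be sketched in code. The sketch assumes that a prediction x in [-1,+1] stands for guessing +1 with probability (1+x)/2, which matches the slide's example (0.6 maps to 0.8), so the expected number of mistakes against an outcome y in {-1,+1} would be |x - y|/2. The symbol x and the function names are illustrative and do not come from the talk.

```python
def prob_plus_one(x: float) -> float:
    """Probability of guessing +1 under the assumed reading of x in [-1, +1]."""
    if not -1.0 <= x <= 1.0:
        raise ValueError("prediction must lie in [-1, +1]")
    return (1.0 + x) / 2.0


def expected_loss(x: float, y: int) -> float:
    """Expected number of mistakes of prediction x against outcome y in {-1, +1}.

    Assumption: loss = |x - y| / 2, i.e. the probability that the randomized
    +/-1 guess derived from x disagrees with y.
    """
    if not -1.0 <= x <= 1.0:
        raise ValueError("prediction must lie in [-1, +1]")
    if y not in (-1, 1):
        raise ValueError("outcome must be -1 or +1")
    return abs(x - y) / 2.0
```

Under this reading a binary expert (x = ±1) has loss 0 or 1, and an abstaining expert (x = 0) has loss 1/2 whatever the outcome.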

4 Previous work

- Exp weights: regret [...]
- For binary experts, BW improves this to [...]
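For readers unfamiliar with the exponential-weights baseline named on this slide, a minimal sketch of one common variant follows. It is not the algorithm analyzed in the talk: the learning rate `eta`, the weighted-average master prediction, and the loss |x - y|/2 are assumptions of this sketch, and all names are illustrative.

```python
import math
from typing import List, Sequence, Tuple


def exp_weights_round(
    weights: Sequence[float],
    predictions: Sequence[float],
    outcome: int,
    eta: float,
) -> Tuple[float, List[float]]:
    """Play one round and return (master_prediction, updated_weights).

    The master predicts the weighted average of the experts' predictions,
    then each weight is multiplied by exp(-eta * loss) and renormalized.
    """
    total = sum(weights)
    master = sum(w * x for w, x in zip(weights, predictions)) / total
    # Assumed loss of each expert: |x - y| / 2, a value in [0, 1].
    losses = [abs(x - outcome) / 2.0 for x in predictions]
    updated = [w * math.exp(-eta * loss) for w, loss in zip(weights, losses)]
    norm = sum(updated)
    return master, [w / norm for w in updated]


def run_exp_weights(
    rounds: Sequence[Tuple[Sequence[float], int]], eta: float = 0.5
) -> Tuple[float, List[float]]:
    """Run over (predictions, outcome) pairs; return (master_loss, final_weights)."""
    n = len(rounds[0][0])
    weights = [1.0 / n] * n  # start from the uniform distribution
    master_loss = 0.0
    for predictions, outcome in rounds:
        master, weights = exp_weights_round(weights, predictions, outcome, eta)
        master_loss += abs(master - outcome) / 2.0
    return master_loss, weights
```

In a run where one expert always matches the outcome, its weight grows fastest, so the master's prediction drifts toward that expert over time.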

5 Our results

- Similar improvements for more general experts
  - Abstaining experts: make -1,0,+1 predictions
  - Continuous experts: make predictions in [-1,+1]
    - For integral 2k, same regret as abstaining experts
    - For general 2k, an additional additive constant
- Lower bound constructions: for [...], tight up to an additive [...] term

6 Outline

- Approximate game value
- Derive master strategy
- Approximation is tight
- Future work

7 Game Value

- Natural approach: compute minimax value of state
- State: cumulative losses [...]
- Solved by Dynamic Programming on states.
- Exponentially large state space, so an approximation is needed.

8 Approximating Game Value

- Approximation: assume experts are distributed [...].
- State space drops to O(N) from [...].
- But distributed experts are less adversarial!
- Solution: allow experts to make random choices.

[Figure labels: Random, Colluding, Non-random, Distributed, Large no. of experts]

9 Potential for random expert

- One random expert, fixed number t of rounds.
- Recall: master makes mistake in each round.
- Q: How much loss can one force the expert to suffer?
- [...] ~ Prob {optimal random expert can be forced to suffer more than (k-s) loss in t rounds}.
- Expected number of violating experts ~ [...] (state-potential)
- Upper bound: max t s.t. potential < 1.

10 The OS potential

- Base case: [...]
- Intuition: Optimal random expert is balanced.
- Recurrence: [...]
- Optimal expert insensitive to master’s response.

11 Derive master strategy

- Dual form captures notion of balance.
- A measure of expert-bias ~ [...]
- State-potential drops by above quantity if: [...]
- For optimal random expert: [...]
- Upper bound: [...]

12 The online expert game

- Master: [...]
- Expert i: [...]
- Master: [...]
- Nature: [...]
- Repeats T times
- Master’s goal: min. regret
- Expert mistake bound of k

[Figure: N Experts, Nature, Master; labels +1, +1, 0.1, 0.5, 0.2, 0.1]

13 Computing [...]

- Binary experts: {-1,+1}
  - Randomly choose -1 or +1
  - Recovers BW potential
- Abstaining experts: {-1,0,+1}
  - Play 0 or {-1,+1} at random
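The binary case can be illustrated with a small dynamic program. The slide's own base case and recurrence are not preserved in this transcript, so the sketch below assumes a binomial-style form: an expert with s mistakes so far errs with probability 1/2 in each of the t remaining rounds, and the potential is the probability that it finishes with at most k mistakes. The function names and the threshold helper are this sketch's own.

```python
from fractions import Fraction
from math import comb


def potential(t: int, s: int, k: int) -> Fraction:
    """Assumed potential: Pr[s + Binomial(t, 1/2) <= k], computed by DP.

    Assumed base case:   phi_0(s) = 1 if s <= k else 0
    Assumed recurrence:  phi_t(s) = (phi_{t-1}(s) + phi_{t-1}(s + 1)) / 2
    """
    # phi[j] holds the potential at position s + j for the current round count.
    phi = [Fraction(int(s + j <= k)) for j in range(t + 1)]
    for _ in range(t):
        phi = [(a + b) / 2 for a, b in zip(phi, phi[1:])]
    return phi[0]


def potential_closed_form(t: int, s: int, k: int) -> Fraction:
    """Same quantity as a binomial tail sum, for cross-checking the DP."""
    return Fraction(sum(comb(t, j) for j in range(k - s + 1)), 2 ** t)


def largest_t_with_survivor(n_experts: int, k: int) -> int:
    """Largest t at which n_experts * phi_t(0) is still at least 1.

    This is a counting threshold chosen for illustration; how it relates to
    the slides' exact upper-bound condition is an assumption.
    """
    t = 0
    while n_experts * potential(t + 1, 0, k) >= 1:
        t += 1
    return t
```

With k = 0 the threshold reduces to log2 of the number of experts, the familiar halving count, which is a useful sanity check on the recurrence.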

14 Computing for continuous experts

- y ranges over infinitely many values in [-1,+1]!
- Technical lemma: if [...] is piecewise convex, continuous, with pieces breaking at integers, then the same holds for [...].

15 Computing for continuous experts

- [...] (piecewise convex in y)
- Compute potential by Dynamic Programming

16 [Figure: axis labels s, s-1, s+1, s+y; annotations: window, Integer]

17 Computing for continuous experts

- [...] (piecewise convex in y)
- Compute potential by Dynamic Programming
- For integral s and 2k, optimal y in {-1,0,+1}
- Same bounds for continuous and abstaining cases!

18 Outline

- Approximate game value
- Derive master strategy
- Approximation is tight
- Future work

19 Approximation is tight

- Recall: [...]
- Sample: [...]
- Drop in potential: [...]
- Lipschitz: [...], yields: [...]
- Hoeffding bounds to get small expert-bias, preserving potential.
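For reference, the standard two-sided Hoeffding inequality that a sampling argument of this kind would rely on is stated here. The symbols $X_i$, $N$, and $\epsilon$ are notation for this note only, and modeling the sampled predictions as independent, zero-mean random variables bounded in $[-1,+1]$ is an assumption, not something the transcript states.

```latex
\Pr\left[\,\left|\frac{1}{N}\sum_{i=1}^{N} X_i\right| \ge \epsilon\,\right]
\;\le\; 2\exp\left(-\frac{N\epsilon^{2}}{2}\right)
```

Setting the right-hand side to $\delta$ gives $\epsilon = \sqrt{2\ln(2/\delta)/N}$, so under these assumptions the average of $N$ sampled predictions stays within $O(\sqrt{\ln(1/\delta)/N})$ of zero with probability at least $1-\delta$.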

20 Improved lower bounds

- Sampling analysis relies on approx. bounds
- De-randomize: take balanced partition of [...]
- Abstaining case: balance with [...]. Most balanced partition has difference [...]
- Second observation: [...] decays with t
- Reduces gap from [...] to [...]
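The idea of replacing random choices with a deterministic balanced partition can be illustrated with a generic greedy heuristic. This is not the construction from the talk, whose details are lost in the transcript; it is a sketch that assumes non-negative weights and relies on the elementary fact that placing each item on the currently lighter side keeps the final difference at most the largest weight. The function name is hypothetical.

```python
from typing import List, Sequence, Tuple


def greedy_balanced_partition(
    weights: Sequence[float],
) -> Tuple[List[int], List[int], float]:
    """Split item indices into two sides with nearly equal total weight.

    Items are taken in decreasing weight order and each goes to the lighter
    side. Returns (left_indices, right_indices, absolute_difference).
    """
    if any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative")
    order = sorted(range(len(weights)), key=lambda i: weights[i], reverse=True)
    left: List[int] = []
    right: List[int] = []
    left_sum = 0.0
    right_sum = 0.0
    for i in order:
        if left_sum <= right_sum:
            left.append(i)
            left_sum += weights[i]
        else:
            right.append(i)
            right_sum += weights[i]
    return left, right, abs(left_sum - right_sum)
```

The greedy rule does not in general find the most balanced partition, which is a harder combinatorial problem; it only guarantees the difference bound stated above.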

21 Conclusion and future work

- Ideas based on Schapire’s drifting game.
- Close gap of O(log k) to show abstaining experts as powerful as continuous ones.
- Boosting with confidence rated predictors.
- Optimal algorithms for multi-class boosting.

22 Thank you
