Learning with continuous experts using Drifting Games (joint work with Robert E. Schapire, Princeton University)


1 Learning with continuous experts using Drifting Games. Joint work with Robert E. Schapire, Princeton University.

2 The online expert game
 Master:
 Expert i:
 Master:
 Nature:
 Repeats indefinitely.
 Master's goal: minimize the total number of mistakes.
 Constraint: some expert makes at most k mistakes.
(Diagram: Master interacting with N experts and Nature.)
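The protocol above can be sketched in code with a generic exponential-weights master (a minimal illustration of the game loop, not the BW or drifting-game strategy discussed later; `play_expert_game` and `eta` are illustrative names):

```python
import math

def play_expert_game(expert_preds, outcomes, eta=0.5):
    """Run the online expert game with an exponential-weights master.

    expert_preds[t][i] in {-1, +1}: expert i's prediction in round t.
    outcomes[t] in {-1, +1}: nature's label for round t.
    Returns (master's mistakes, per-expert mistake counts).
    """
    n = len(expert_preds[0])
    weights = [1.0] * n
    master_mistakes = 0
    expert_mistakes = [0] * n
    for preds, y in zip(expert_preds, outcomes):
        # Master predicts the weighted-majority vote of the experts.
        vote = sum(w * p for w, p in zip(weights, preds))
        guess = 1 if vote >= 0 else -1
        if guess != y:
            master_mistakes += 1
        # Experts that erred lose weight; tally their mistakes.
        for i, p in enumerate(preds):
            if p != y:
                expert_mistakes[i] += 1
                weights[i] *= math.exp(-eta)
    return master_mistakes, expert_mistakes
```

With one always-correct expert, the master quickly locks onto it and makes few mistakes overall.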

3 Continuous Experts
 Continuous experts predict real numbers in [-1,+1].
 Interpretation:
 Given y, the expected number of mistakes, or loss:
 Constraint: some expert has loss at most k.
(Figure: example of a randomized prediction, one sign w.p. 0.8 and the other w.p. 0.2, giving a mistake w.p. 0.8.)

4 Previous work
 Exponential weights: regret
 For binary experts, BW improves this to

5 Our results
 Similar improvements for more general experts:
 Abstaining experts: make -1, 0, +1 predictions
 Continuous experts: make predictions in [-1,+1]
 For integral 2k, the same regret as abstaining experts
 For general 2k, an additional additive constant
 Lower bound constructions: for , tight up to an additive term

6 Outline
 Approximate game value
 Derive master strategy
 Approximation is tight
 Future work

7 Game Value
 Natural approach: compute the minimax value of a state.
 State: cumulative losses.
 Solved by dynamic programming on states.
 Exponentially large state space: need approximation.

8 Approximating Game Value
 Approximation: assume experts are distributed.
 State space drops from exponential to O(N).
 But distributed experts are less adversarial!
 Solution: allow experts to make random choices.
(Diagram: random vs. non-random, colluding vs. distributed, for a large number of experts.)

9 Potential for random expert
 One random expert, fixed number t of rounds.
 Recall: the master makes a mistake in each round.
 Q: How much loss can one force the expert to suffer?
 State-potential ~ Prob{optimal random expert can be forced to suffer more than (k-s) loss in t rounds}.
 Expected number of violating experts ~
 Upper bound: max t s.t. potential < 1.
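The state-potential can be made concrete under one simplifying assumption: the random expert errs independently with probability 1/2 in each round, so its remaining loss is Binomial(t, 1/2). This is a sketch of the idea, not the talk's exact potential; `potential` and `mistake_upper_bound` are illustrative names:

```python
from math import comb

def potential(s, t, k):
    """Prob. an expert that has already suffered loss s is forced to
    suffer more than its remaining budget (k - s) over t rounds.
    Sketch assumption: it errs independently w.p. 1/2 each round,
    so its remaining loss is Binomial(t, 1/2)."""
    budget = k - s
    if budget < 0:
        return 1.0
    # P[Binomial(t, 1/2) > budget]
    return sum(comb(t, j) for j in range(budget + 1, t + 1)) / 2 ** t

def mistake_upper_bound(n_experts, k):
    """Largest t for which the expected number of violating experts,
    n_experts * potential(0, t, k), stays below 1.
    (Terminates for n_experts >= 2, since the potential tends to 1.)"""
    t = 0
    while n_experts * potential(0, t + 1, k) < 1:
        t += 1
    return t
```

The bound on the master's mistakes is then the largest horizon t at which, in expectation, no expert can be forced over its budget.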

10 The OS potential
 Base case:
 Intuition: the optimal random expert is balanced.
 Recurrence:
 The optimal expert is insensitive to the master's response.
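A recurrence of this shape can be checked numerically against the tail formula behind the previous slide's state-potential, under the same fair-coin sketch (one simple way to realize "balanced": the expert errs w.p. 1/2, so its loss state moves s -> s+1 or stays). Function names are illustrative, not from the talk:

```python
from math import comb

def phi_closed(s, t, k):
    """P[Binomial(t, 1/2) > k - s]: chance a fair-coin expert at loss s
    exceeds its budget within t rounds (the tail-form potential)."""
    budget = k - s
    if budget < 0:
        return 1.0
    return sum(comb(t, j) for j in range(budget + 1, t + 1)) / 2 ** t

def phi_recur(s, t, k):
    """Recurrence sketch: base case phi_0(s) = 1 iff the budget is
    already exceeded; each round the balanced expert errs w.p. 1/2."""
    if t == 0:
        return 1.0 if s > k else 0.0
    return 0.5 * phi_recur(s + 1, t - 1, k) + 0.5 * phi_recur(s, t - 1, k)
```

The two definitions agree exactly, which is the point of defining the potential by a recurrence: it can be computed by dynamic programming without the closed form.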

11 Derive master strategy
 The dual form captures the notion of balance.
 A measure of expert-bias ~
 The state-potential drops by the above quantity if:
 For the optimal random expert:
 Upper bound:
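One way to read the dual/balance idea in code: weight each expert by how sensitive the potential is to charging it one more mistake, and take a weighted-majority vote. This is a sketch built on the fair-coin potential from the earlier slides, not the talk's exact dual formula; all names are illustrative:

```python
from math import comb

def phi(s, t, k):
    # P[Binomial(t, 1/2) > k - s]: the fair-coin potential sketch.
    budget = k - s
    if budget < 0:
        return 1.0
    return sum(comb(t, j) for j in range(budget + 1, t + 1)) / 2 ** t

def master_predict(states, preds, t, k):
    """Predict the weighted-majority vote, weighting expert i by how
    much the potential would rise if it suffered one more mistake."""
    weights = [phi(s + 1, t - 1, k) - phi(s, t - 1, k) for s in states]
    vote = sum(w * x for w, x in zip(weights, preds))
    return 1 if vote >= 0 else -1
```

Note that an expert further from its budget (smaller cumulative loss s) gets the larger weight here, as in binomial-weights-style strategies.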

12 The online expert game
 Master:
 Expert i:
 Master:
 Nature:
 Repeats T times.
 Master's goal: minimize regret.
 Expert mistake bound of k.
(Diagram: Master interacting with N experts and Nature.)

13 Computing
 Binary experts: {-1,+1}
 Randomly choose -1 or +1
 Recovers the BW potential
 Abstaining experts: {-1,0,+1}
 Play 0, or {-1,+1} at random

14 Computing for continuous experts
 y ranges over infinitely many values in [-1,+1]!
 Technical lemma: if is piecewise convex and continuous, with pieces breaking at integers, then the same holds for .

15 Computing for continuous experts
 (piecewise convex in y)
 Compute the potential by dynamic programming.

16 (Figure: the state window around an integer s, with transitions to s-1, s+1, and s+y.)

17 Computing for continuous experts
 (piecewise convex in y)
 Compute the potential by dynamic programming.
 For integral s and 2k, the optimal y lies in {-1, 0, +1}.
 Same bounds for the continuous and abstaining cases!
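The payoff of the technical lemma can be illustrated numerically: if the adversary's objective is convex on each integer piece of [-1,+1], its maximum over the interval is attained at a breakpoint, so only y in {-1, 0, +1} need be checked. The function below is a toy stand-in, not the actual potential:

```python
def argmax_breakpoints(f):
    """Max of a function convex on [-1,0] and on [0,+1] is attained at
    an endpoint of a piece, so comparing y in {-1, 0, +1} suffices."""
    return max((-1.0, 0.0, 1.0), key=f)

# Toy objective, convex on each integer piece of [-1, +1]
# (it equals |y| * (|y| - 1), so it is <= 0 with zeros at -1, 0, +1).
f = lambda y: y * y - abs(y)

best = max(f(y) for y in (-1.0, 0.0, 1.0))
# A brute-force grid search over [-1, +1] never beats the breakpoints.
grid_best = max(f(-1.0 + i / 500.0) for i in range(1001))
assert grid_best <= best + 1e-12
```

This is why the dynamic program stays tractable: each maximization over the continuum [-1,+1] collapses to three candidate values.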

18 Outline
 Approximate game value
 Derive master strategy
 Approximation is tight
 Future work

19 Approximation is tight
 Recall:
 Sample:
 Drop in potential:
 Lipschitz: , yields:
 Hoeffding bounds give small expert-bias while preserving the potential.

20 Improved lower bounds
 The sampling analysis relies on approximation bounds.
 De-randomize: take a balanced partition of .
 Abstaining case: balance with ; the most balanced partition has difference .
 Second observation: decays with t.
 Reduces the gap from to .

21 Conclusion and future work
 Ideas based on Schapire's drifting games.
 Close the gap of O(log k), to show abstaining experts are as powerful as continuous ones.
 Boosting with confidence-rated predictors.
 Optimal algorithms for multi-class boosting.

22 Thank you