# Calibrated Learning and Correlated Equilibrium
By: Dean Foster and Rakesh Vohra. Presented by: Jason Sorensen



## Nash Equilibrium

A Nash Equilibrium (N.E.) is a set of strategies in which no player benefits by unilaterally altering their own strategy. The mini-max theorem gives the value of a zero-sum game at an N.E., and in any game with finite strategy sets an N.E. exists. An N.E. exists and is played if:
1. Each player is rational and believes the others to be so
2. The utility matrix is accurate
3. Play is executed flawlessly
4. The players can accurately deduce the N.E. solution
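To make the definition concrete, here is a minimal Python sketch (not from the paper) that checks whether a mixed-strategy pair is an N.E. by testing every pure-strategy deviation; `is_nash` and the matching-pennies matrices below are illustrative choices, not anything from the slides.

```python
# Check the Nash condition: no player gains by unilaterally deviating.

def expected_payoff(payoff, p, q):
    """Expected payoff when the row player mixes with p and column with q."""
    return sum(p[i] * q[j] * payoff[i][j]
               for i in range(len(p)) for j in range(len(q)))

def is_nash(row_payoff, col_payoff, p, q, tol=1e-9):
    """True if (p, q) is a (mixed) N.E.: no pure deviation improves
    either player's expected payoff."""
    n, m = len(p), len(q)
    v_row = expected_payoff(row_payoff, p, q)
    v_col = expected_payoff(col_payoff, p, q)
    best_row = max(expected_payoff(row_payoff,
                                   [1.0 if k == i else 0.0 for k in range(n)], q)
                   for i in range(n))
    best_col = max(expected_payoff(col_payoff, p,
                                   [1.0 if k == j else 0.0 for k in range(m)])
                   for j in range(m))
    return best_row <= v_row + tol and best_col <= v_col + tol

# Matching pennies (zero sum): the unique N.E. is both players mixing 50/50,
# and the mini-max value of the game is 0.
row = [[1, -1], [-1, 1]]
col = [[-1, 1], [1, -1]]
print(is_nash(row, col, [0.5, 0.5], [0.5, 0.5]))  # True
print(is_nash(row, col, [1.0, 0.0], [0.5, 0.5]))  # False: column can deviate
```

Checking only pure deviations suffices because a mixed deviation can never beat the best pure strategy against a fixed opponent mixture.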

## But…

- In many instances, repeated play may not converge to a Nash Equilibrium; in fact, no method is guaranteed to converge to an N.E. in general
- The payoff matrix may be unknown
- There may be multiple equilibria
- The opponent may be irrational
- N.E. is inconsistent with the Bayesian perspective

## Fictitious Play

Each player assumes the opponent's mixed strategy is the empirical frequency of the opponent's moves so far, and plays a best response to it.
- N.E. are absorbing points of F.P.
- If F.P. converges, it converges to an N.E.
- F.P. converges if the game:
  1. Is 2x2 with a generic payoff matrix
  2. Is zero sum
  3. Is solvable by iterated elimination of dominated strategies
  4. Is a potential game
- Convergence is not guaranteed for general games
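The update rule above is easy to simulate. Below is a hedged sketch of fictitious play (the function name, seed moves, and round count are my choices, not the paper's); on matching pennies, which is zero sum, the empirical frequencies of both players approach the mixed N.E. (1/2, 1/2).

```python
# Fictitious play: each player best-responds to the opponent's
# empirical frequency of past moves.

def fictitious_play(row_pay, col_pay, rounds=5000):
    n, m = len(row_pay), len(row_pay[0])
    row_counts = [1] + [0] * (n - 1)   # arbitrary seed moves
    col_counts = [1] + [0] * (m - 1)
    for _ in range(rounds):
        # Each player best-responds to the opponent's move counts
        # (proportional to empirical frequencies, so no normalization needed).
        i = max(range(n), key=lambda a: sum(col_counts[j] * row_pay[a][j]
                                            for j in range(m)))
        j = max(range(m), key=lambda b: sum(row_counts[a] * col_pay[a][b]
                                            for a in range(n)))
        row_counts[i] += 1
        col_counts[j] += 1
    total = rounds + 1
    return ([c / total for c in row_counts],
            [c / total for c in col_counts])

# Matching pennies is zero sum, so the empirical frequencies converge.
row = [[1, -1], [-1, 1]]
col = [[-1, 1], [1, -1]]
p, q = fictitious_play(row, col)
print(p[0], q[0])   # both close to 0.5
```

The actions cycle in ever-longer runs, but the long-run frequencies still settle at the equilibrium mixture, which is exactly the sense in which F.P. "converges" for zero-sum games.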

## Correlated Equilibrium

A correlated equilibrium (C.E.) is a distribution over the combined moves of both players, which both players follow; this is how it differs from an N.E.
- More general than N.E. (every N.E. is a C.E.)
- Neither player can benefit from deviating from the C.E. unless the other player deviates as well
- Calibrated forecasts lead to C.E.
- Consistent with the Bayesian viewpoint
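The incentive condition can be checked directly: conditional on each recommended action, no unilateral deviation may raise that player's expected payoff. A minimal sketch (my own notation, not the paper's):

```python
# Verify the C.E. incentive constraints for a distribution over joint moves.

def is_correlated_eq(row_pay, col_pay, dist, tol=1e-9):
    """dist[i][j] = probability that (row move i, column move j) is drawn."""
    n, m = len(row_pay), len(row_pay[0])
    for i in range(n):          # row player, recommended move i
        for i2 in range(n):     # candidate deviation i -> i2
            if sum(dist[i][j] * (row_pay[i2][j] - row_pay[i][j])
                   for j in range(m)) > tol:
                return False
    for j in range(m):          # column player, recommended move j
        for j2 in range(m):
            if sum(dist[i][j] * (col_pay[i][j2] - col_pay[i][j])
                   for i in range(n)) > tol:
                return False
    return True

# Matching pennies: the uniform product distribution (its unique N.E.,
# viewed as a joint distribution) is a C.E.; a point mass on one cell is not.
row = [[1, -1], [-1, 1]]
col = [[-1, 1], [1, -1]]
print(is_correlated_eq(row, col, [[0.25, 0.25], [0.25, 0.25]]))  # True
print(is_correlated_eq(row, col, [[1.0, 0.0], [0.0, 0.0]]))      # False
```

The first example also illustrates the slide's point that every N.E. is a C.E.: any product of equilibrium mixtures passes these constraints.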

## An example: C.E. vs N.E.

The game of Dare or Chicken (row player's payoff listed first):

|       | D    | C    |
|-------|------|------|
| **D** | 0, 0 | 7, 2 |
| **C** | 2, 7 | 6, 6 |

- Three N.E. exist: (D,C), (C,D), and a mixed N.E. in which each player Dares with p = 1/3
- A C.E. puts p = 1/3 on each of (C,C), (D,C), (C,D)
- Expected payoff at the mixed N.E. is 14/3 ≈ 4.67; at the C.E. it is 5
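The payoff comparison is a short computation; this sketch (indexing D=0, C=1, my convention) works with the row player's payoffs only, since the game is symmetric.

```python
# Dare/Chicken payoffs for the row player, rows/columns indexed D=0, C=1.
row_pay = [[0, 7], [2, 6]]

# Mixed N.E.: each player independently Dares with probability 1/3.
p = [1/3, 2/3]
ne_payoff = sum(p[i] * p[j] * row_pay[i][j]
                for i in range(2) for j in range(2))

# C.E.: probability 1/3 each on (D,C), (C,D), and (C,C).
ce = {(0, 1): 1/3, (1, 0): 1/3, (1, 1): 1/3}
ce_payoff = sum(w * row_pay[i][j] for (i, j), w in ce.items())

print(ne_payoff)  # 14/3 ≈ 4.67
print(ce_payoff)  # 5.0
```

The C.E. does better because it never wastes probability on the disastrous (D,D) cell, which independent mixing cannot avoid.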

## The Shapley game

|       | 1    | 2    | 3    |
|-------|------|------|------|
| **1** | 1, 0 | 0, 1 | 0, 0 |
| **2** | 0, 0 | 1, 0 | 0, 1 |
| **3** | 0, 1 | 0, 0 | 1, 0 |

- F.P. cycles through (1,1), (1,2), (2,2), (2,3), (3,3), (3,1) and does NOT converge to the N.E.
- N.E.: both players mix (1/3, 1/3, 1/3); N.E. payoff = 1/3
- A C.E. puts p = 1/6 on each state of the cycle; C.E. payoff = 1/2
- This C.E. is not a mixture of Nash Equilibriums
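The two payoff figures on this slide can be checked with a few lines (moves are 1-indexed on the slide, 0-indexed here; the matrix variables are my names):

```python
# Shapley game payoff matrices (row player, column player).
row_pay = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
col_pay = [[0, 1, 0], [0, 0, 1], [1, 0, 0]]

# N.E.: both players mix (1/3, 1/3, 1/3), so every cell has probability 1/9.
ne_row = sum(row_pay[i][j] for i in range(3) for j in range(3)) / 9

# C.E.: probability 1/6 on each state of the fictitious-play cycle.
cycle = [(0, 0), (0, 1), (1, 1), (1, 2), (2, 2), (2, 0)]
ce_row = sum(row_pay[i][j] for i, j in cycle) / 6
ce_col = sum(col_pay[i][j] for i, j in cycle) / 6

print(ne_row)          # 1/3
print(ce_row, ce_col)  # 0.5 0.5
```

Both players get 1/2 under the C.E. because the cycle alternates which player "wins", while independent uniform mixing wins each player only 3 cells out of 9.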

## Calibrated Forecasts

A forecast is calibrated if, averaging over all instances in which x is predicted with probability p, x occurred (approximately) p percent of those times.
- If each player plays a best response to a calibrated forecast of the opponent's play, the joint empirical distribution of play converges to the set of correlated equilibria
- For all but a measure-zero set of games, there is a calibrated forecast that converges to any given C.E.
- A player can guarantee calibration with a randomized forecast, regardless of what learning rule the opponent uses
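The definition suggests a simple empirical score: group rounds by the probability announced, and compare each group's announced probability with its realized frequency. A sketch under my own conventions (the function name and the frequency-weighted absolute-error score are illustrative, not the paper's calibration measure):

```python
# Measure how far a sequence of probability forecasts is from calibrated.
from collections import defaultdict

def calibration_score(forecasts, outcomes):
    """Weighted average of |empirical frequency - announced probability|,
    grouped by announced probability (outcomes are 0/1). Zero = calibrated."""
    buckets = defaultdict(list)
    for prob, happened in zip(forecasts, outcomes):
        buckets[prob].append(happened)
    n = len(forecasts)
    return sum(len(xs) / n * abs(sum(xs) / len(xs) - prob)
               for prob, xs in buckets.items())

# Against an alternating sequence, always forecasting 0.5 is calibrated
# (the event occurs exactly half the time 0.5 is announced)...
outcomes = [1, 0] * 50
print(calibration_score([0.5] * 100, outcomes))  # 0.0
# ...while always forecasting 0.9 is badly miscalibrated.
print(calibration_score([0.9] * 100, outcomes))  # ~0.4
```

Note the constant-0.5 forecaster shows why calibration alone is a weak test: it ignores the alternating pattern entirely, yet still passes. The strength of the Foster–Vohra result is that even this weak property, combined with best-responding, is enough to drive play to the set of correlated equilibria.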

## This is all great, but…

- How long does convergence really take … to a calibrated forecast? … to a correlated equilibrium?
- How do we know which C.E. we are learning?
- Was N.E. really so bad in the first place? What is the expected payoff increase of a general C.E. over an N.E.?
- In a 2003 paper, Foster showed a powerful method for learning N.E.
- _insert your own problem with paper here_

