Nash’s Theorem Theorem (Nash, 1951): Every finite game (finite number of players, finite number of pure strategies) has at least one mixed-strategy Nash equilibrium. Nash, John (1951), "Non-Cooperative Games," The Annals of Mathematics 54(2): 286-295. (John Nash did not call them "Nash equilibria"; that name came later.) He shared the 1994 Nobel Memorial Prize in Economic Sciences with game theorists Reinhard Selten and John Harsanyi for his work on Nash equilibria. He suffered from schizophrenia in the 1950s and 1960s, as depicted in the 2001 film "A Beautiful Mind". He nevertheless recovered enough to return to academia and continue his research.

More on Constant-Sum Games Minimax Theorem (John von Neumann, 1928): For every two-person, zero-sum game with finitely many pure strategies, there exist a mixed strategy for each player and a value V such that: Given player 2's strategy, the best possible payoff for player 1 is V. Given player 1's strategy, the best possible payoff for player 2 is -V. The existence-of-strategies part is a special case of Nash's theorem, and a precursor to it. This says that player 1 can guarantee himself a payoff of at least V, and player 2 can guarantee himself a payoff of at least -V; if both players play optimally, that is exactly what they will get. It's called "minimax" because each player achieves this value by pursuing a strategy that minimizes the maximum payoff of the other player. We'll come back to this. Definition: The value V is called the value of the game. E.g., the value of Rock-Paper-Scissors is 0; the best that P1 can hope to achieve, assuming P2 plays optimally (probability 1/3 on each action), is a payoff of 0.
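The Rock-Paper-Scissors claim can be checked directly. A minimal sketch (the matrix encoding is my own, with moves ordered Rock, Paper, Scissors):

```python
# Payoff matrix for Player 1 in Rock-Paper-Scissors:
# rows = P1's move, columns = P2's move, order (R, P, S).
A = [[0, -1, 1],
     [1, 0, -1],
     [-1, 1, 0]]

p2 = [1/3, 1/3, 1/3]  # P2's optimal (uniform) mixed strategy

# P1's expected payoff for each pure strategy against P2's mix:
payoffs = [sum(A[i][j] * p2[j] for j in range(3)) for i in range(3)]
# Every entry is 0, so P1 cannot do better than the game's value, 0.
```

Since every pure strategy earns exactly 0 against the uniform mix, no mixture of them can do better, which is what the value V = 0 means here.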

Computing Nash Equilibria In general, computing a Nash equilibrium is computationally expensive (the problem is complete for a complexity class called PPAD, and it is not known whether that class equals P). For two-person, constant-sum games, the problem reduces to another problem called "Linear Programming," which is in P.

Computing Nash Equilibria: 2-person, Zero-Sum Games The Odds and Evens game has no pure-strategy Nash equilibrium. By Nash's theorem, it must have a mixed-strategy Nash equilibrium. How can we find it?

The Odds and Evens Game (each cell lists Odd's payoff, then Even's):

                          Odd Player
                      1 finger    2 fingers
Even     1 finger      -2, +2      +3, -3
Player   2 fingers     +3, -3      -4, +4

Computing Nash Equilibria: 2-person, Zero-Sum Games Let's start by making some definitions. Let p1 be the probability that the Even player plays 1 finger in the Nash equilibrium. So with probability 1-p1, Even will play 2 fingers.

Computing Nash Equilibria: 2-person, Zero-Sum Games Likewise, let q1 be the probability that the Odd player plays 1 finger in the Nash equilibrium. So with probability 1-q1, Odd will play 2 fingers.

Computing Nash Equilibria: 2-person, Zero-Sum Games Next, let's write down what we know about the outcomes, in terms of p1 and q1. In equilibrium, Odd's expected payoff is:
q1*p1*(-2) + q1*(1-p1)*(+3) + (1-q1)*p1*(+3) + (1-q1)*(1-p1)*(-4)

Computing Nash Equilibria: 2-person, Zero-Sum Games Likewise, in equilibrium, Even's expected payoff is:
q1*p1*(+2) + q1*(1-p1)*(-3) + (1-q1)*p1*(-3) + (1-q1)*(1-p1)*(+4)
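These two expected-payoff expressions can be sanity-checked in code. Since the game is zero-sum, they should cancel for every (p1, q1) pair; a quick sketch:

```python
# Odd's and Even's expected payoffs as functions of p1 and q1,
# transcribed from the expressions above.
def odd_payoff(p1, q1):
    return (q1 * p1 * -2 + q1 * (1 - p1) * 3
            + (1 - q1) * p1 * 3 + (1 - q1) * (1 - p1) * -4)

def even_payoff(p1, q1):
    return (q1 * p1 * 2 + q1 * (1 - p1) * -3
            + (1 - q1) * p1 * -3 + (1 - q1) * (1 - p1) * 4)

# Zero-sum sanity check: the payoffs cancel for every strategy pair.
checks = [odd_payoff(p, q) + even_payoff(p, q)
          for p in (0, 0.25, 0.5, 1) for q in (0, 0.3, 1)]
```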

Computing Nash Equilibria: 2-person, Zero-Sum Games Observation: If Even selects p1 so that Odd gets a higher utility by playing 1 finger instead of 2 fingers, then Odd will always select 1 finger. But that can't be an equilibrium! (Why not?)

Computing Nash Equilibria: 2-person, Zero-Sum Games Observation: Likewise, if Even selects p1 so that Odd gets a higher utility by playing 2 fingers instead of 1 finger, then Odd will always select 2 fingers. But that can't be an equilibrium, either!

Computing Nash Equilibria: 2-person, Zero-Sum Games Observation: So, the only possible equilibrium has Even selecting p1 so that Odd's payoff for selecting 1 finger equals Odd's payoff for selecting 2 fingers.

Computing Nash Equilibria: 2-person, Zero-Sum Games In algebra: Odd's payoff when Even plays 1 finger with probability p1 and Odd always plays 1 finger:
p1*(-2) + (1-p1)*(+3)
Odd's payoff when Even plays 1 finger with probability p1 and Odd always plays 2 fingers:
p1*(+3) + (1-p1)*(-4)

Computing Nash Equilibria: 2-person, Zero-Sum Games Our observation says these should be equal:
p1*(-2) + (1-p1)*(+3) = p1*(+3) + (1-p1)*(-4)
=> -2*p1 + 3 - 3*p1 = 3*p1 - 4 + 4*p1
=> 7 = 12*p1
=> p1 = 7/12

Computing Nash Equilibria: 2-person, Zero-Sum Games We could have done this for either player; here it is from Odd's perspective:
q1*(+2) + (1-q1)*(-3) = q1*(-3) + (1-q1)*(+4)
=> 2*q1 - 3 + 3*q1 = -3*q1 + 4 - 4*q1
=> 12*q1 = 7
=> q1 = 7/12
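The two indifference calculations have the same shape, so they can be packaged as one small helper. A sketch using exact arithmetic with Python's fractions module (the function name is my own):

```python
from fractions import Fraction

def indifference(a, b, c, d):
    """Solve a*p + b*(1-p) = c*p + d*(1-p) for p: the mixing
    probability that makes the opponent indifferent between an
    action with payoffs (a, b) and an action with payoffs (c, d)."""
    return Fraction(d - b, (a - b) - (c - d))

# Even's mix (makes Odd indifferent between 1 and 2 fingers):
p1 = indifference(-2, 3, 3, -4)   # 7/12
# Odd's mix (makes Even indifferent between 1 and 2 fingers):
q1 = indifference(2, -3, -3, 4)   # 7/12
```

The closed form follows from rearranging a*p + b*(1-p) = c*p + d*(1-p) to p*(a - b - c + d) = d - b.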

Computing Nash Equilibria: 2-person, Zero-Sum Games So now we know a mixed-strategy Nash equilibrium:
POdd(1 finger) = 7/12, POdd(2 fingers) = 5/12
PEven(1 finger) = 7/12, PEven(2 fingers) = 5/12
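The claimed equilibrium can be verified by plugging it back in: against the opponent's mix, each of a player's pure strategies must earn the same expected payoff, so neither player can gain by deviating. A quick check with exact fractions:

```python
from fractions import Fraction as F

p1 = F(7, 12)  # P(Even plays 1 finger)
q1 = F(7, 12)  # P(Odd plays 1 finger)

# Odd's expected payoff for each pure strategy against Even's mix
# (these come out equal, so Odd is indifferent):
odd_plays_1 = p1 * -2 + (1 - p1) * 3
odd_plays_2 = p1 * 3 + (1 - p1) * -4
# Even's expected payoff for each pure strategy against Odd's mix:
even_plays_1 = q1 * 2 + (1 - q1) * -3
even_plays_2 = q1 * -3 + (1 - q1) * 4
```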

Quiz: 2-person, Zero-Sum Games What is the value of this game for Even? (Remember, the value of the game is the expected payoff for the player in equilibrium.) Likewise, what is the value of the game for Odd?

Answer: 2-person, Zero-Sum Games You can get the value for Even three ways. Recall: in equilibrium, Even's expected payoff is:
q1*p1*(+2) + q1*(1-p1)*(-3) + (1-q1)*p1*(-3) + (1-q1)*(1-p1)*(+4)
Or, q1*(+2) + (1-q1)*(-3), or q1*(-3) + (1-q1)*(+4).
These all equal -1/12.

Answer: 2-person, Zero-Sum Games You can get the value for Odd the same three ways, or you can just note that this is a zero-sum game, so the value for Odd must be the opposite of the value for Even: +1/12. In other words, it's better to be the Odd player than the Even player, since Odd wins on average.

2-person games with more actions When each player has more than two actions available, the simple algorithm I gave will no longer work. However, it is still possible to compute Nash equilibria for zero-sum games in polynomial time using a technique called Linear Programming. Linear Programming is a well-known kind of problem with existing solvers, and I won't cover it in detail here.
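As an illustrative sketch of what that linear program looks like (assuming scipy is available; any LP solver works the same way), here is Even's side of the Odds and Evens game: maximize a guaranteed payoff v subject to Even's mix earning at least v against each of Odd's pure strategies.

```python
import numpy as np
from scipy.optimize import linprog

# Even's payoff matrix: rows = Even's action (1 or 2 fingers),
# columns = Odd's action.
A = np.array([[2.0, -3.0],
              [-3.0, 4.0]])

# Variables z = (x1, x2, v): Even's mixed strategy and the guaranteed
# payoff v.  Maximize v subject to (x^T A)_j >= v for every column j,
# x >= 0, x1 + x2 = 1.  linprog minimizes, so we minimize -v.
c = [0.0, 0.0, -1.0]
A_ub = np.hstack([-A.T, np.ones((2, 1))])   # v - (x^T A)_j <= 0
b_ub = [0.0, 0.0]
A_eq = [[1.0, 1.0, 0.0]]
b_eq = [1.0]
bounds = [(0, 1), (0, 1), (None, None)]

res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=bounds)
x1, x2, value = res.x                        # optimal mix and game value
```

This recovers PEven(1 finger) = 7/12 and the value -1/12 for Even, matching the hand calculation above, and the same formulation scales to any number of actions.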

Quiz: Computing an Equilibrium for Zero-Sum Games In equilibrium: What is the probability that P1 plays X? What is the probability that P2 plays X? What is the value of the game for P1?

(Each cell lists P1's payoff, then P2's.)

                        Player 1
                      X          Y
Player    X        +5, -5     +2, -2
2         Y        +3, -3     +6, -6

Answer: Computing an Equilibrium for Zero-Sum Games In equilibrium: What is the probability that P1 plays X? 2/3. What is the probability that P2 plays X? 1/2. What is the value of the game for P1? 4.
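These answers follow from the same indifference conditions as before; a quick check with exact fractions:

```python
from fractions import Fraction as F

# P1's payoff matrix: rows = P1's action (X, Y), cols = P2's action (X, Y).
A = [[F(5), F(3)],
     [F(2), F(6)]]

# q = P(P2 plays X) makes P1 indifferent between X and Y:
#   5q + 3(1-q) = 2q + 6(1-q)
q = F(6 - 3, (5 - 2) + (6 - 3))   # = 1/2
# p = P(P1 plays X) makes P2 indifferent (zero-sum, so P2's payoffs
# are the negation of P1's):
#   5p + 2(1-p) = 3p + 6(1-p)
p = F(6 - 2, (5 - 2) + (6 - 3))   # = 2/3

value = A[0][0] * q + A[0][1] * (1 - q)   # P1's payoff playing X: 4
```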

Games beyond this class's limits There are MANY aspects of games and game theory in AI that we will not cover. I'll briefly mention some of them: Repeated games and learning. Communication between agents. Mechanism design: how to create games so that agents have incentives to behave in desirable ways (e.g., voting and auctions).

1. Repeated Games and Learning Many games (e.g., Rock-Paper-Scissors) are typically played multiple times. These are called repeated games. Repetition can change the incentive structure and the best strategies: e.g., in the Prisoner's Dilemma, it might be better to say nothing (cooperate) if you believe you can teach your opponent to cooperate and say nothing as well.
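A tiny simulation illustrates the point. The payoff numbers and the "tit-for-tat" opponent are illustrative choices of mine, not from the slides: against an opponent that copies your previous move, always cooperating beats always defecting over 100 rounds, even though defecting dominates in the one-shot game.

```python
# One-shot Prisoner's Dilemma payoffs: (my move, opponent's move) -> my payoff.
PAYOFF = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}

def total_vs_tit_for_tat(strategy, rounds=100):
    """Total payoff of `strategy` against tit-for-tat over `rounds` plays."""
    total, my_prev = 0, None
    for r in range(rounds):
        tft_move = "C" if my_prev is None else my_prev  # copies our last move
        my_move = strategy(r)
        total += PAYOFF[(my_move, tft_move)]
        my_prev = my_move
    return total

always_cooperate = total_vs_tit_for_tat(lambda r: "C")  # 100 * 3 = 300
always_defect = total_vs_tit_for_tat(lambda r: "D")     # 5 + 99 * 1 = 104
```

Defecting grabs the temptation payoff once, then gets punished every round afterward, which is exactly the "teaching" effect described above.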

Learning and Teaching in Repeated Games The history of play in repeated games offers examples of your opponent's strategy. This provides an opportunity for learning. It also provides an opportunity for teaching! In multi-agent settings with repeated games, every agent is both a learner and a teacher.

Example learning strategy: "Fictitious Play" Idea: build a model of the opponent's strategy, and then play a best response. Fictitious Play Learning: Create an array A that has an entry for each of the opponent's actions. Initialize with prior beliefs. Repeat: Assuming the counts in A represent the opponent's mixed strategy, play a best response to A. Observe the opponent's action, and update the appropriate count in A.
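A minimal sketch of this loop on the Odds and Evens game from earlier (variable names are my own). Both players run fictitious play against each other, and the empirical frequencies should approach the 7/12 equilibrium probabilities:

```python
# Odd's payoff matrix, indexed [odd_move][even_move]; action 0 = "1 finger",
# action 1 = "2 fingers".  Even's payoff is the negation (zero-sum).
A_ODD = [[-2, 3], [3, -4]]
A_EVEN = [[-A_ODD[o][e] for o in range(2)] for e in range(2)]

def best_response(payoff, opp_counts):
    """Best pure action against the opponent's empirical mix (visit counts)."""
    scores = [sum(row[b] * opp_counts[b] for b in range(2)) for row in payoff]
    return scores.index(max(scores))

odd_counts, even_counts = [1, 1], [1, 1]   # prior beliefs
for _ in range(100_000):
    odd_move = best_response(A_ODD, even_counts)
    even_move = best_response(A_EVEN, odd_counts)
    odd_counts[odd_move] += 1
    even_counts[even_move] += 1

odd_freq = odd_counts[0] / sum(odd_counts)     # should approach 7/12
even_freq = even_counts[0] / sum(even_counts)  # should approach 7/12
```

Convergence here is only of the empirical frequencies, and it is slow; the players' individual moves keep cycling, which is consistent with the theorems on the next slide.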

Some Theoretical Results about Fictitious Play Theorem: If both players use fictitious play, and if the empirical distribution of their chosen actions converges, then it converges to a Nash equilibrium. Theorem: In zero-sum games, if both players use fictitious play, the empirical distributions of their actions always converge to a Nash equilibrium.

2. Communication in Games Sometimes, communication can improve player outcomes. In each game below, Player 1 says: "I will play C". Response?

Prisoner's Dilemma (Player 1's payoff listed first):

               Player 2
              D          C
P1    D    +1, +1     +5, 0
      C    0, +5      +3, +3

Coordination game (Player 1's payoff listed first): both players want to choose the same action; (D, D) gives (+1, +1) and mismatched choices give (0, 0).

2. Communication in Games In the coordination game, P1's statement is self-committing and self-revealing, so it is believable. In the Prisoner's Dilemma it is neither: P1 does better playing D no matter what P2 does, so P2 should not believe the statement.

3. Mechanism Design: Creating Games with Desired Outcomes Elections and auctions are examples of games: they involve multiple agents, possible actions for each agent (whom to vote for, how much to bid), and outcomes that depend on all of the agents' actions. "Mechanism Design" is the study of creating a reward structure so that we get good outcomes, such as that the most popular politician gets elected, or that the person who values a good most wins the auction.

Arrow's Theorem Definition: A voting mechanism is dictatorial if it exactly follows the preferences of a single voter (called the dictator). Theorem (Arrow, 1951), informally: Consider any voting mechanism in which voters express their true preferences over the outcomes (candidates), and which (a) has at least 3 outcomes, (b) always selects the most popular outcome, and (c) makes the choice between two outcomes unaffected by other, less-popular outcomes. Then the mechanism must be dictatorial. Note: This is a well-known example of an impossibility theorem: a theorem that says it is impossible to design a game with a certain list of desirable properties. This theorem and many like it don't apply to certain kinds of voting, such as rating systems (where voters rate each outcome, for example on a scale of 1-10, rather than ranking the outcomes). But it does apply to most voting mechanisms in modern democracies. Which property does the US presidential voting system fail on?
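A small hypothetical election (the ballot counts are made up for illustration) shows property (c) failing under plurality voting: removing a losing candidate flips the winner between the other two.

```python
from collections import Counter

# Hypothetical first-choice ballots: A gets 4, B gets 3, C gets 2,
# and C's supporters all rank B second.
ballots = ["A"] * 4 + ["B"] * 3 + ["C"] * 2
second_choice = {"C": "B"}

def plurality_winner(votes):
    return Counter(votes).most_common(1)[0][0]

with_c = plurality_winner(ballots)   # A wins, 4-3-2
# Remove C from the race; C's voters switch to their second choice:
without_c = plurality_winner([second_choice.get(v, v) for v in ballots])
# Now B wins 5-4: the choice between A and B depended on C's presence.
```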

Second-Price Auctions Definition: A second-price auction awards the good to the highest bidder, and charges a price equal to the second-highest bid.

Quiz: Second-Price Auctions Fill in the matrix of payoffs. Let v = 10 be your value for a good. Let b be your bid. Let c be the highest bid by anyone else in the auction. Your payoff is: v - c if b > c (you win the auction); 0 if b <= c (you lose the auction).

          C=7    C=9    C=11   C=13
B=12       ?      ?      ?      ?
B=10       ?      ?      ?      ?
B=8        ?      ?      ?      ?

Is there a dominant strategy? If yes, is the strategy "truth-revealing"? (That is, does the strategy make you bid exactly how much you value the good?)

Answer: Second-Price Auctions Fill in the matrix of payoffs. Let v = 10 be your value for a good. Let b be your bid. Let c be the highest bid by anyone else in the auction. Your payoff is: v - c if b > c (you win the auction); 0 if b <= c (you lose the auction).

          C=7    C=9    C=11   C=13
B=12       3      1     -1      0
B=10       3      1      0      0
B=8        3      0      0      0

Is there a dominant strategy? Yes: bid b = 10. If yes, is the strategy "truth-revealing"? Yes; the dominant strategy matches the value v = 10.
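The payoff rule and the dominance claim are easy to check in code; a short sketch (the helper name and the grid of bids checked are my own):

```python
def payoff(v, b, c):
    """Second-price auction payoff: win and pay c when b > c, else 0."""
    return v - c if b > c else 0

v = 10
# The quiz table: rows are our bid b, columns are the highest other bid c.
table = {b: [payoff(v, b, c) for c in (7, 9, 11, 13)] for b in (12, 10, 8)}

# Truthful bidding (b = v) is weakly dominant: for every competing bid c
# on a grid, no other bid b ever does strictly better.
truthful_dominates = all(payoff(v, v, c) >= payoff(v, b, c)
                         for b in range(0, 21) for c in range(0, 21))
```

Intuitively, the price you pay never depends on your own bid, so bidding above v only risks winning at a loss, and bidding below v only risks losing a profitable win.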

Second-Price Auctions Definition: A second-price auction awards the good to the highest bidder, and charges a price equal to the second-highest bid. Some properties (under a number of assumptions that I won't get into): They are Pareto efficient. They are dominant-strategy truthful: the best strategy is to bid exactly what the good is worth to you. It is always worthwhile for agents to take part in the auction. The auctioneer never loses money. These auctions belong to a family of auctions called Vickrey-Clarke-Groves (VCG) mechanisms, which are the only possible mechanisms with the first two properties.