1. Game Theory
Developed to explain the optimal strategy in two-person interactions.
Initially, von Neumann and Morgenstern: zero-sum games.
John Nash: nonzero-sum games.
Harsanyi, Selten: incomplete information.
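2. An example: Big Monkey and Little Monkey
[Game tree: Big Monkey moves first, choosing c (climb) or w (wait); Little Monkey then chooses c or w. Payoffs are listed as Big Monkey, Little Monkey.]
    Big Monkey w, Little Monkey w: 0,0
    Big Monkey w, Little Monkey c: 9,1
    Big Monkey c, Little Monkey w: 4,4
    Big Monkey c, Little Monkey c: 5,3
What should Big Monkey do?
If BM waits, LM will climb, so BM gets 9.
If BM climbs, LM will wait, so BM gets 4.
BM should wait.
What about LM? He does the opposite of BM (even though we'll never get to the right side of the tree).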
3. An example: Big Monkey and Little Monkey
These strategies (w for Big Monkey; cw for Little Monkey, meaning climb if Big Monkey waits and wait if Big Monkey climbs) are called best responses.
Given what the other player is doing, this is the best thing to do.
A solution where everyone is playing a best response is called a Nash equilibrium.
No one can unilaterally change and improve things.
This representation of a game is called extensive form.
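The reasoning on these two slides (work out Little Monkey's reply to each Big Monkey move, then let Big Monkey choose) can be sketched in a few lines of Python. This is only an illustration: the names `TREE` and `backward_induction` are made up here, and the payoffs are the ones read off the game tree, ordered as (Big Monkey, Little Monkey).

```python
# Payoffs are (Big Monkey, Little Monkey); "c" = climb, "w" = wait.
# Big Monkey moves first; Little Monkey replies after seeing that move.
TREE = {
    "w": {"w": (0, 0), "c": (9, 1)},
    "c": {"w": (4, 4), "c": (5, 3)},
}


def backward_induction(tree):
    """Solve the last move first, then the first move.

    Returns (Big Monkey's move, Little Monkey's reply plan, resulting payoffs).
    """
    # Step 1: for each Big Monkey move, Little Monkey picks the reply
    # that maximizes Little Monkey's own payoff (index 1).
    lm_plan = {
        bm_move: max(replies, key=lambda lm_move: replies[lm_move][1])
        for bm_move, replies in tree.items()
    }
    # Step 2: Big Monkey anticipates those replies and maximizes index 0.
    bm_move = max(tree, key=lambda move: tree[move][lm_plan[move]][0])
    return bm_move, lm_plan, tree[bm_move][lm_plan[bm_move]]


bm_move, lm_plan, payoffs = backward_induction(TREE)
print(bm_move, lm_plan, payoffs)
```

Little Monkey's plan comes out as a reply to each possible move, which is what the label cw on the slide encodes.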
4. An example: Big Monkey and Little Monkey
What if the monkeys have to decide simultaneously?
[Game tree as before, with leaf payoffs 0,0; 9,1; 6-2,4; 7-2,3 (that is, 4,4 and 5,3).]
Now Little Monkey has to choose before he sees Big Monkey move.
Two pure-strategy Nash equilibria: (c,w) and (w,c).
Also a third Nash equilibrium: each monkey chooses between c and w with probability 0.5 (mixed strategy).
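The 0.5 in the mixed equilibrium comes from an indifference condition: each monkey randomizes so that the other monkey gets the same expected payoff from c as from w. A minimal sketch of that calculation follows; the function name `mixed_2x2` is hypothetical, and it assumes a 2x2 game with a fully mixed equilibrium, so that the denominators are nonzero.

```python
from fractions import Fraction

# Simultaneous game; entries are (Big Monkey, Little Monkey) payoffs.
# Rows: Big Monkey plays c, w. Columns: Little Monkey plays c, w.
PAYOFFS = [
    [(5, 3), (4, 4)],
    [(9, 1), (0, 0)],
]


def mixed_2x2(payoffs):
    """Return (p, q) = (P(row plays its first action), P(column plays its first action)).

    Assumes the 2x2 game has a fully mixed equilibrium.
    """
    (a, b), (c, d) = [[cell[0] for cell in row] for row in payoffs]  # row player's payoffs
    (e, f), (g, h) = [[cell[1] for cell in row] for row in payoffs]  # column player's payoffs
    # Row mixes so that column is indifferent:
    #   p*e + (1-p)*g == p*f + (1-p)*h
    p = Fraction(h - g, e - g - f + h)
    # Column mixes so that row is indifferent:
    #   q*a + (1-q)*b == q*c + (1-q)*d
    q = Fraction(d - b, a - b - c + d)
    return p, q


p, q = mixed_2x2(PAYOFFS)
print(p, q)
```

Exact fractions are used so the result can be compared directly with the probability quoted on the slide.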
5. An example: Big Monkey and Little Monkey
It can often be easier to analyze a game through a different representation, called normal form.
                     Little Monkey
                     c        w
    Big Monkey  c    5,3      4,4
                w    9,1      0,0
6. Choosing Strategies
In the simultaneous game, it's harder to see what each monkey should do.
A mixed strategy can be optimal.
Trick: how can a monkey maximize its payoff, given that it knows the other monkey will play a Nash strategy?
Oftentimes, other techniques can be used to prune the number of possible actions.
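7. Eliminating Dominated Strategies
The first step is to eliminate actions that are worse than another action, no matter what.
[Game tree: Big Monkey chooses w or c, then Little Monkey chooses w or c; leaf payoffs 0,0; 9,1; 6-2,4; 7-2,3. Annotations mark the two paths Little Monkey will never choose.]
Little Monkey will never choose w after Big Monkey waits (0 < 1), or c after Big Monkey climbs (3 < 4).
We can see that Big Monkey will always choose w.
So the tree reduces to: 9,1.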
8. Eliminating Dominated Strategies
We can also use this technique in normal-form games:
                 Column
                 a        b
    Row     a    9,1      4,4
            b    5,3      0,0
9. Eliminating Dominated Strategies
For any column action, row will prefer a (9 > 5 and 4 > 0).
10. Eliminating Dominated Strategies
Given that row will pick a, column will pick b (4 > 1).
(a,b) is the unique Nash equilibrium.
11. Prisoner's Dilemma
Each player can cooperate or defect.
                         Column
                         cooperate   defect
    Row     cooperate    -1,-1       -10,0
            defect       0,-10       -8,-8
12. Prisoner's Dilemma
Defecting is a dominant strategy for row (0 > -1 and -8 > -10).
13. Prisoner's Dilemma
Defecting is also a dominant strategy for column.
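The dominance claims for the Prisoner's Dilemma can be checked mechanically. The sketch below assumes the usual layout of the payoff table (row's payoff first, with the lone defector receiving 0 and the lone cooperator -10); the helper names `payoff` and `is_strictly_dominant` are illustrative, not part of the slides.

```python
# Payoffs are (row, column), keyed by (row action, column action).
PD = {
    ("cooperate", "cooperate"): (-1, -1),
    ("cooperate", "defect"): (-10, 0),
    ("defect", "cooperate"): (0, -10),
    ("defect", "defect"): (-8, -8),
}
ACTIONS = ("cooperate", "defect")


def payoff(game, player, own, other):
    """Payoff to `player` (0 = row, 1 = column) for playing `own` against `other`."""
    profile = (own, other) if player == 0 else (other, own)
    return game[profile][player]


def is_strictly_dominant(game, player, action):
    """True if `action` beats every alternative against every opponent action."""
    return all(
        payoff(game, player, action, other) > payoff(game, player, alt, other)
        for alt in ACTIONS
        if alt != action
        for other in ACTIONS
    )


for player, name in enumerate(("row", "column")):
    print(name, is_strictly_dominant(PD, player, "defect"))
```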
14. Prisoner's Dilemma
Even though both players would be better off cooperating, defecting is each player's dominant strategy, so mutual defection is the outcome.
What drives this?
One-shot game
Inability to trust your opponent
Perfect rationality
15. Prisoner's Dilemma
Relevant to: arms negotiations, online payment, product descriptions, workplace relations.
How do players escape this dilemma?
Play repeatedly
Find a way to 'guarantee' cooperation
Change payment structure
16. Definition of Nash Equilibrium
A game has n players.
Each player i has a strategy set $S_i$. These are the player's possible actions.
Each player has a payoff function $\pi_i : S \to \mathbb{R}$, where $S = S_1 \times \dots \times S_n$ is the set of strategy profiles.
A strategy $t_i$ in $S_i$ is a best response if there is no other strategy in $S_i$ that produces a higher payoff, given the opponents' strategies.
17. Definition of Nash Equilibrium
A strategy profile is a list (s1, s2, …, sn) of the strategies each player is using.
If each strategy is a best response given the other strategies in the profile, the profile is a Nash equilibrium.
Why is this important?
If we assume players are rational, they will play Nash strategies.
Even less-than-rational play will often converge to Nash in repeated settings.
18. An Example of a Nash Equilibrium
                 Column
                 a        b
    Row     a    1,2      0,1
            b    2,1      1,0
(b,a) is a Nash equilibrium.
To prove this:
Given that column is playing a, row's best response is b.
Given that row is playing b, column's best response is a.
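The definition translates directly into a brute-force check: a profile is a Nash equilibrium if neither player can gain by deviating alone. The sketch below applies it to this example, assuming the payoff table reads row a = (1,2), (0,1) and row b = (2,1), (1,0); the function name `pure_nash_equilibria` is hypothetical, and the search covers pure strategies only.

```python
from itertools import product

# Payoffs are (row, column), keyed by (row action, column action).
# Table layout is an assumption: row a = (1,2), (0,1); row b = (2,1), (1,0).
GAME = {
    ("a", "a"): (1, 2), ("a", "b"): (0, 1),
    ("b", "a"): (2, 1), ("b", "b"): (1, 0),
}
ROW_ACTIONS = ("a", "b")
COL_ACTIONS = ("a", "b")


def pure_nash_equilibria(game, row_actions, col_actions):
    """List every pure profile where neither player gains by deviating alone."""
    equilibria = []
    for r, c in product(row_actions, col_actions):
        # Row's action must be a best response to column's action ...
        row_ok = all(game[(r, c)][0] >= game[(alt, c)][0] for alt in row_actions)
        # ... and column's action a best response to row's action.
        col_ok = all(game[(r, c)][1] >= game[(r, alt)][1] for alt in col_actions)
        if row_ok and col_ok:
            equilibria.append((r, c))
    return equilibria


print(pure_nash_equilibria(GAME, ROW_ACTIONS, COL_ACTIONS))
```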
19. Finding Nash Equilibria: Dominated Strategies
What to do when it's not obvious what the equilibrium is?
In some cases, we can eliminate dominated strategies.
These are strategies that are inferior for every opponent action.
In the previous example, row's strategy a is dominated.
20. Example
A 3x3 example:
                 Column
                 a        b        c
    Row     a    73,25    57,42    66,32
            b    80,26    35,12    32,54
            c    28,27    63,31    54,29
21. Example
c dominates a for the column player (32 > 25, 54 > 26, 29 > 27).
22. Example
With column's a removed, b is then dominated by both a and c for the row player.
23. Example
Given this, b dominates c for the column player (42 > 32, 31 > 29), so the column player will always play b.
24. Example
Since column is playing b, row will prefer c (63 > 57).
25. Example
We verify that (c,b) is a Nash equilibrium by observation:
If row plays c, b is the best response for column.
If column plays b, c is the best response for row.
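The elimination steps in this 3x3 example follow a simple loop: remove any strictly dominated action, then re-check everything against what remains. A minimal sketch is below; the names `dominated` and `iterated_elimination` are illustrative, and only strict dominance by another pure strategy is considered.

```python
# Payoffs are (row, column), keyed by (row action, column action).
GAME = {
    ("a", "a"): (73, 25), ("a", "b"): (57, 42), ("a", "c"): (66, 32),
    ("b", "a"): (80, 26), ("b", "b"): (35, 12), ("b", "c"): (32, 54),
    ("c", "a"): (28, 27), ("c", "b"): (63, 31), ("c", "c"): (54, 29),
}


def dominated(game, player, action, own, others):
    """True if another remaining action is strictly better against all `others`."""
    def u(mine, theirs):
        profile = (mine, theirs) if player == 0 else (theirs, mine)
        return game[profile][player]

    return any(
        all(u(alt, o) > u(action, o) for o in others)
        for alt in own
        if alt != action
    )


def iterated_elimination(game, rows, cols):
    """Repeatedly delete strictly dominated actions until nothing changes."""
    rows, cols = list(rows), list(cols)
    changed = True
    while changed:
        changed = False
        for r in list(rows):
            if dominated(game, 0, r, rows, cols):
                rows.remove(r)
                changed = True
        for c in list(cols):
            if dominated(game, 1, c, cols, rows):
                cols.remove(c)
                changed = True
    return rows, cols


print(iterated_elimination(GAME, "abc", "abc"))
```

When a single action survives for each player, as here, the surviving profile is the equilibrium the slides arrive at by hand.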
26. Example #2
You try this one:
                 Column
                 a        b        c
    Row     a    2,2      1,1      4,0
            b    1,2      4,1      3,5
27. Coordination Games
Consider the following problem: a supplier and a buyer need to decide whether to adopt a new purchasing system.
                       Buyer
                       new      old
    Supplier    new    20,20    0,0
                old    0,0      5,5
No dominated strategies!
28. Coordination Games
This game has two Nash equilibria: (new,new) and (old,old).
Real-life examples: Betamax vs VHS, Mac vs Windows vs Linux, others?
Each player wants to do what the other does, which may be different from what they say they'll do.
How to choose a strategy? Nothing is dominated.
29. Solving Coordination Games
Coordination games turn out to be an important real-life problem: technology/policy/strategy adoption, delegation of authority, synchronization.
Human agents tend to use "focal points": solutions that seem to make "natural sense", e.g. pick a number between 1 and 10.
Social norms/rules are also used, e.g. driving on the right/left side of the road.
These strategies change the structure of the game.
30. Price-matching Example
Two sellers are offering the same book for sale.
This book costs each seller $25.
The lowest price gets all the customers; if they match, profits are split.
What is the Nash equilibrium strategy?
31. Mixed strategies
Unfortunately, not every game has a pure-strategy equilibrium, e.g. rock-paper-scissors.
However, every finite game has a mixed-strategy Nash equilibrium.
Each action is assigned a probability of play.
Each player is indifferent between the actions played with positive probability, given the other players' probabilities.
32. Mixed Strategies
In many games (such as coordination games) a player might not have a single best pure strategy.
Instead, optimizing payoff might require a randomized strategy (also called a mixed strategy).
                          Wife
                          football   shopping
    Husband   football    2,1        0,0
              shopping    0,0        1,2
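33. Strategy Selection
                          Wife
                          football   shopping
    Husband   football    2,1        0,0
              shopping    0,0        1,2
If we limit to pure strategies, with each player treating the other's choice as a 50/50 guess:
Husband: U(football) = 0.5 * 2 + 0.5 * 0 = 1, U(shopping) = 0.5 * 0 + 0.5 * 1 = ½
Wife: U(shopping) = 0.5 * 0 + 0.5 * 2 = 1, U(football) = 0.5 * 1 + 0.5 * 0 = ½
Problem: this won't lead to coordination! The husband picks football and the wife picks shopping.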
34. Mixed strategy
Instead, each player selects a probability associated with each action.
Goal: the utility of each action is equal, so players are indifferent to choices at this probability.
a = probability husband chooses football
b = probability wife chooses shopping
Since payoffs must be equal, for husband: b * 1 = (1 - b) * 2, so b = 2/3.
For wife: a * 1 = (1 - a) * 2, so a = 2/3.
In each case, the expected payoff is 2/3.
2/9 of the time they go to football, 2/9 shopping, and 5/9 they miscoordinate.
If they could synchronize ahead of time they could do better.
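The 2/9, 2/9, 5/9 split and the expected payoff of 2/3 follow from multiplying the two independent probabilities. A short check is sketched below, with a and b as defined on the slide; the function names are made up for illustration, and the payoffs are (husband, wife) with 0,0 whenever the two end up in different places.

```python
from fractions import Fraction

a = Fraction(2, 3)  # probability husband chooses football
b = Fraction(2, 3)  # probability wife chooses shopping

# Payoffs are (husband, wife), keyed by (husband's choice, wife's choice).
PAYOFFS = {
    ("football", "football"): (2, 1),
    ("football", "shopping"): (0, 0),
    ("shopping", "football"): (0, 0),
    ("shopping", "shopping"): (1, 2),
}


def outcome_distribution(a, b):
    """Probability of each outcome when both randomize independently."""
    return {
        ("football", "football"): a * (1 - b),
        ("football", "shopping"): a * b,
        ("shopping", "football"): (1 - a) * (1 - b),
        ("shopping", "shopping"): (1 - a) * b,
    }


def expected_payoffs(a, b):
    """Expected (husband, wife) payoffs under the mixed strategies."""
    dist = outcome_distribution(a, b)
    husband = sum(prob * PAYOFFS[outcome][0] for outcome, prob in dist.items())
    wife = sum(prob * PAYOFFS[outcome][1] for outcome, prob in dist.items())
    return husband, wife


dist = outcome_distribution(a, b)
miscoordinate = dist[("football", "shopping")] + dist[("shopping", "football")]
print(dist[("football", "football")], dist[("shopping", "shopping")], miscoordinate)
print(expected_payoffs(a, b))
```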
35. Example: Rock paper scissors
                        Column
                        rock     paper    scissors
    Row     rock        0,0      -1,1     1,-1
            paper       1,-1     0,0      -1,1
            scissors    -1,1     1,-1     0,0
36. Setup
Player 1 plays rock with probability pr, scissors with probability ps, and paper with probability 1 - pr - ps.
P2: Utility(rock) = 0*pr + 1*ps - 1*(1 - pr - ps) = 2ps + pr - 1
P2: Utility(scissors) = 0*ps + 1*(1 - pr - ps) - 1*pr = 1 - 2pr - ps
P2: Utility(paper) = 0*(1 - pr - ps) + 1*pr - 1*ps = pr - ps
In equilibrium, each player chooses a probability for each strategy so that the opponent's expected payoff for each strategy is the same. Setting the three utilities equal gives pr = ps = 1/3.
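The three utility expressions can be evaluated directly to see why the uniform mix is special. The sketch below (function names are hypothetical) computes Player 2's expected payoff for each pure action and shows that any other mix by Player 1 leaves Player 2 with a strictly better reply.

```python
from fractions import Fraction


def p2_utilities(pr, ps):
    """Player 2's expected payoff for each pure action, given Player 1's mix."""
    pp = 1 - pr - ps  # probability that Player 1 plays paper
    return {
        "rock": 0 * pr + 1 * ps - 1 * pp,      # ties rock, beats scissors, loses to paper
        "scissors": 0 * ps + 1 * pp - 1 * pr,  # ties scissors, beats paper, loses to rock
        "paper": 0 * pp + 1 * pr - 1 * ps,     # ties paper, beats rock, loses to scissors
    }


def p2_best_replies(pr, ps):
    """Actions that maximize Player 2's expected payoff against Player 1's mix."""
    utilities = p2_utilities(pr, ps)
    best = max(utilities.values())
    return sorted(action for action, value in utilities.items() if value == best)


third = Fraction(1, 3)
print(p2_utilities(third, third))                     # all three payoffs are equal
print(p2_best_replies(Fraction(1, 2), Fraction(1, 4)))  # a rock-heavy mix can be exploited
```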
37. Repeated games
Many games get played repeatedly.
A common strategy for the husband-wife problem is to alternate.
This leads to payoffs of 1, 2, 1, 2, …, an average of 1.5 per week.
It requires initial synchronization, plus trust that the partner will go along.
Difference in formulation: we are now thinking of the game as a repeated set of interactions, rather than as a one-shot exchange.
38. Repeated vs Stage Games
There are two types of multiple-action games:
Stage games: players take a number of actions and then receive a payoff. Examples: checkers, chess, bidding in an ascending auction.
Repeated games: players repeatedly play a shorter game, receiving payoffs along the way. Examples: poker, blackjack, rock-paper-scissors, etc.
39. Analyzing Stage Games
Analyzing stage games requires backward induction.
We start at the last action, determine what should happen there, and work backwards, just like a game tree in extensive form.
Strange things can happen here, e.g. the Centipede game:
Players alternate. Each can either cooperate and get $1 from nature, or defect and steal $2 from the opponent.
The game ends when one player has $100 or one player defects.
40. Analyzing Repeated Games
Analyzing repeated games requires us to examine the expected utility of different actions.
Assumption: the game is played "infinitely often", so weird endgame effects go away.
Prisoner's Dilemma again: in this case, tit-for-tat outperforms defection.
Collusion can also be explained this way: the short-term gain from undercutting is less than the long-run gains from avoiding competition.
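The tit-for-tat claim can be illustrated with a small simulation using the Prisoner's Dilemma payoffs from the earlier slides. This is only a sketch under a stated simplification: the slide assumes infinitely repeated play, while the code plays a fixed number of rounds, and the names `tit_for_tat`, `always_defect`, and `play` are made up here.

```python
# Stage-game payoffs (row, column); "C" = cooperate, "D" = defect.
PD = {
    ("C", "C"): (-1, -1), ("C", "D"): (-10, 0),
    ("D", "C"): (0, -10), ("D", "D"): (-8, -8),
}


def tit_for_tat(opponent_history):
    """Cooperate first, then copy the opponent's previous move."""
    return "C" if not opponent_history else opponent_history[-1]


def always_defect(opponent_history):
    """Defect in every round, whatever the opponent has done."""
    return "D"


def play(strategy_1, strategy_2, rounds):
    """Return total payoffs for two strategies over a fixed number of rounds."""
    moves_1, moves_2 = [], []
    total_1 = total_2 = 0
    for _ in range(rounds):
        # Each strategy sees only the opponent's past moves.
        m1 = strategy_1(moves_2)
        m2 = strategy_2(moves_1)
        u1, u2 = PD[(m1, m2)]
        total_1 += u1
        total_2 += u2
        moves_1.append(m1)
        moves_2.append(m2)
    return total_1, total_2


print(play(tit_for_tat, tit_for_tat, 10))
print(play(always_defect, tit_for_tat, 10))
```

Against a tit-for-tat opponent, the defector gains once in the first round and then pays for it in every later round, which is the short-term versus long-run trade-off the slide describes.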