Presentation on theme: "Network Theory and Dynamic Systems Game Theory: Mixed Strategies"— Presentation transcript:
1Network Theory and Dynamic Systems Game Theory: Mixed Strategies Prof. Dr. Steffen Staab
2Rationale to randomize RandomizationRationale to randomizeMatching pennies do not result in a Nash equilibrium, because opponent would switch if he knew your choice→ Randomize your choice such that your opponent cannot determine the best response→ Mixed strategyPlayer 1\ Player 2HeadTail1,-1-1,1Players secretely choose head or tealPlayer 1 wins if the pennies matchPlayer 2 wins if they mismatch
3Nash EquilibriumDefinition: A Nash equilibrium for a mixed-strategy game is a pair of strategies (S,T), where S and T are both probability distributions over individual strategies available to player 1 and 2, respectively, such that each is a best response to the other. Can a pure strategy lead to a Nash equilibrium for the pennies game?
4Each player chooses a probability for each strategy Mixed StrategyEach player chooses a probability for each strategyPlayer 1 chooses Head with p and Tail with 1-pPlayer 2 chooses Head with q and Tail with 1-qReward for player 1:pq*1 + (1-p)(1-q) * 1 + p(1-q) * (-1) + (1-p)q * (-1) =pq pq - p - q - p + pq - q + pq =1 – 2 (p+q) + 4pqPlayer 1\ Player 2HeadTail1,-1-1,1
5Nash Equilibrium for p=q=0.5 ! Mixed StrategyReward for player 1:pq*1 + (1-p)(1-q) * 1 + p(1-q) * (-1) + (1-p)q * (-1) =pq pq - p - q - p + pq - q + pq =1 – 2 (p+q) + 4pqReward for player 2: Multiply with -1: (p+q) - 4pqChoose optimum: differentiate for both variablesd(1 – 2 (p+q) + 4pq)/dp = q = 0d(1 – 2 (p+q) + 4pq)/dq = p = 0(also for player 2)No better result achievable for p=q=0.5Player 1\ Player 2HeadTail1,-1-1,1Nash Equilibrium for p=q=0.5 !
6NoteWhile there need not be a Nash equilibrium for the pure strategies, there is always at least one Nash equilibrium for mixed strategies.
7How to solve this a bit more easily? Consider matching pennies again:If Player 2 chooses a probability of q, and Player 1 chooses Head the expected payoff to Player 1 is:(-1) (q) + 1 (1-q) = 1-2qIf Player 1 chooses Tail:1 (q) + (-1) (1-q) = 2q-1Note 1: no pure strategy can be part of a Nash equilibriumNote 2a: If 1-2q > 2q-1 then Head is the best strategy for P1Note 2b: If 1-2q < 2q-1 then Tail is the best strategy for P1Note 2c: Pure strategies do not lead to Nash equilibrium, thus Player 2 must choose a point where 1-2q = 2q-1Note 2d: Likewise for Player 1
8Derived PrincipleA mixed equilibrium arises when the probabilities used by each player make his opponent indifferent between his two options.
9Applying the „indifference principle“ to another example „American Football“Assume defense chooses a probability of q for defending against the pass:Then payoff for offense from passing is0 q + 10 (1-q) = 10-10qPayoff from running is5q + 0 (1-q) = 5q10-10q=5q is solved for q=2/3Likewise p=1/3Offense\ DefenseDefend PassDefend RunPass0,010,-10Run5,-5
10Multiple Nash equilibria How many Nash equilibria in this case?Student 1 \ Student 2Power PointKeynote1,10,02,2Unbalanced coordination game
11Multiple Nash equilibria Mixed-strategy equilibrium1 q + 0 (1-q) = 0q + 2 (1-q)→ q=2/3Also, because of symmetry→ p=2/3Student 1 \ Student 2Power PointKeynote1,10,02,2Unbalanced coordination game
13Linear optimization task Linear ProgrammingLinear optimization taskMaximize or minimize a linear objective function in multiple variablesLinear constraintsGiven with linear (in-)equalities
14Players x and y choose election topics to gather votes. Zero sum gamePlayers x and y choose election topics to gather votes.Depending on the chosen topics they gain or loose votes.What is the best mixed strategy?yx
15Dual argument for player y: Zero Sum GamePlayer x knows that player y tries to minimize his loss by minimizing the following two values: While x tries to maximize this valueDual argument for player y:
16Write zero sum game as linear programs Spieler xSpieler y-2y1+y2-2
17Given primal problem: Maximize cTx subject to Ax ≤ b, x ≥ 0; Dual Linear ProgramsEvery linear programming problem, referred to as a primal problem, can be converted into a dual problem:Given primal problem: Maximize cTx subject to Ax ≤ b, x ≥ 0;Has same value of the objective function as the corresponding dual problem: Minimize bTy subject to ATy ≥ c, y ≥ 0.
18Write zero sum game as linear programs Spieler xSpieler y-2y1+y2-2Observation:The two problems are dual to each other, thus they have the same objective function value: Equilibrium
19There are mixed strategies that are optimal for both players. Minimax TheoremMinimax-Theorem:There are mixed strategies that are optimal for both players.
20What are the best strategies now? Player x chooses strategies with Zero Sum GameWhat are the best strategies now?Player x chooses strategies withPlayer y chooses strategies withBoth achieve an objective function value of
22Nash Equilibrium Optimality – 1 Individual optimum (-4,-4) Not a global optimumSuspect 1 \ Suspect 2Not-ConfessConfess-1, -1-10, 00, -10-4, -4Prisoners‘ Dilemma
23Social OptimalityDefinition: A choice of strategies – one by each player – is a social welfare maximizer (or socially optimal) if it maximizes the sum of the players‘ payoffs. However: Adding payoffs does not always make sense!Suspect 1 \ Suspect 2Not-ConfessConfess-1, -1-10, 00, -10-4, -4Prisoners‘ Dilemma
24Core here: binding agreement Pareto OptimalityDefinition: A choice of strategies – one by each player – is Pareto-optimal if there is no other choice of strategies in which all players receive payoffs at least as high, and at least one player receives a strictly higher payoff.Core here: binding agreement(not-confess,not-confess) is Pareto-optimal, but not a Nash equilibriumNash equilibrium is only Non-Pareto optimal choice!Suspect 1 \ Suspect 2Not-ConfessConfess-1, -1-10, 00, -10-4, -4
26each player has set of possible strategies Multiplayer GamesStructure:n playerseach player has set of possible strategieseach player has a payoff function Pi such that for reach outcome consisting of strategies chosen by the players (S1, S2,…Sn), there is a payoff Pi(S1, S2,…Sn) to player i.Best response: A strategy Si is a best response by Player i to a choice of strategies (S1, S2,…Si-1, Si+1,…,Sn), ifPi(S1, …Si-1, Si, Si+1,…,Sn) ≥ Pi(S1, …Si-1, S'i, Si+1,…,Sn)for all other possible strategies S'i available to player i.
27each player has set of possible strategies Multiplayer GamesStructure:n playerseach player has set of possible strategieseach player has a payoff function Pi such that for reach outcome consisting of strategies chosen by the players (S1, S2,…Sn), there is a payoff Pi(S1, S2,…Sn) to player i.Nash equilibrium: A choice of strategies (S1, S2,…Sn) is a Nash equilibrium if each strategy it contains is a best response to all others.
29Dominated StrategiesDefinition: A strategy is strictly dominated if there is some other strategy available to the same player that produces a strictly higher payoff in response to every choice of strategies by the other players. (notion is only useful if there are more than two strategies)
30Facility Location Game (cf Hotelling‘s law) Firm 1: Strategy A is dominated by CFirm 2: Strategy F is dominated by DReason about the structure and delete these strategies
31Facility Location Game (cf Hotelling‘s law) Firm 1: Strategy E is dominated by CFirm 2: Strategy B is dominated by DReason about the structure and delete these strategies
32Facility Location Game (cf Hotelling‘s law) One pair of strategies remainingThe unique Nash equilibriumGeneralizes from 6 to an arbitrary number of nodes