CPS 296.1 LP and IP in Game theory (Normal-form Games, Nash Equilibria and Stackelberg Games) Joshua Letchford.

Presentation transcript:

CPS 296.1 LP and IP in Game theory (Normal-form Games, Nash Equilibria and Stackelberg Games) Joshua Letchford

Rock-paper-scissors – Seinfeld variant

            Rock    Paper   Scissors
Rock        0, 0    1, -1   1, -1
Paper      -1, 1    0, 0   -1, 1
Scissors   -1, 1    1, -1   0, 0

MICKEY: All right, rock beats paper! (Mickey smacks Kramer's hand for losing) KRAMER: I thought paper covered rock. MICKEY: Nah, rock flies right through paper. KRAMER: What beats rock? MICKEY: (looks at hand) Nothing beats rock.

Dominance
Player i's strategy s_i strictly dominates s_i' if
– for any s_{-i}, u_i(s_i, s_{-i}) > u_i(s_i', s_{-i})
s_i weakly dominates s_i' if
– for any s_{-i}, u_i(s_i, s_{-i}) ≥ u_i(s_i', s_{-i}); and
– for some s_{-i}, u_i(s_i, s_{-i}) > u_i(s_i', s_{-i})
-i = "the player(s) other than i"
In the Seinfeld variant above, Rock strictly dominates Paper and weakly dominates Scissors:

            Rock    Paper   Scissors
Rock        0, 0    1, -1   1, -1
Paper      -1, 1    0, 0   -1, 1
Scissors   -1, 1    1, -1   0, 0
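A small illustrative check of these definitions (a sketch, not course code; the payoff layout, with rows being player i's strategies rock/paper/scissors in the Seinfeld variant and columns the opponent's, and the function names are my own assumptions):

```python
import numpy as np

def strictly_dominates(U_i, s, s_prime):
    # s strictly dominates s_prime: strictly better against every opponent strategy
    return bool(np.all(U_i[s] > U_i[s_prime]))

def weakly_dominates(U_i, s, s_prime):
    # s weakly dominates s_prime: never worse, and strictly better at least once
    return bool(np.all(U_i[s] >= U_i[s_prime]) and np.any(U_i[s] > U_i[s_prime]))

# Player i's payoffs in the Seinfeld variant (rows/columns: rock, paper, scissors)
U_i = np.array([[0, 1, 1],
                [-1, 0, -1],
                [-1, 1, 0]])
print(strictly_dominates(U_i, 0, 1))  # True: rock strictly dominates paper
print(weakly_dominates(U_i, 0, 2))    # True: rock weakly dominates scissors
```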

Mixed strategies
Mixed strategy for player i = probability distribution over player i's (pure) strategies
E.g., 1/3, 1/3, 1/3
Example of dominance by a mixed strategy (player i's payoffs; play each of the first two rows with probability 1/2):

1/2    3, 0    0, 0
1/2    0, 0    3, 0
       1, 0    1, 0

The 1/2–1/2 mixture earns 1.5 in expectation against either column, while the third row earns only 1.
Usage: σ_i denotes a mixed strategy, s_i denotes a pure strategy

Checking for dominance by mixed strategies
Linear program for checking whether strategy s_i* is strictly dominated by a mixed strategy:
maximize ε
such that:
– for any s_{-i}, Σ_{s_i} p_{s_i} u_i(s_i, s_{-i}) ≥ u_i(s_i*, s_{-i}) + ε
– Σ_{s_i} p_{s_i} = 1
Linear program for checking whether strategy s_i* is weakly dominated by a mixed strategy:
maximize Σ_{s_{-i}} [(Σ_{s_i} p_{s_i} u_i(s_i, s_{-i})) − u_i(s_i*, s_{-i})]
such that:
– for any s_{-i}, Σ_{s_i} p_{s_i} u_i(s_i, s_{-i}) ≥ u_i(s_i*, s_{-i})
– Σ_{s_i} p_{s_i} = 1
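A minimal sketch of the first LP using scipy.optimize.linprog; the function name, the payoff-matrix layout (rows are player i's pure strategies, columns the opponent's strategy profiles), and the tolerance are illustrative assumptions:

```python
import numpy as np
from scipy.optimize import linprog

def strictly_dominated_by_mixture(U, target):
    """Is pure strategy `target` strictly dominated by some mixed strategy?
    U[s][j] = player i's payoff for pure strategy s against opponent profile j."""
    U = np.asarray(U, dtype=float)
    m, n = U.shape
    # Variables: p_1, ..., p_m, epsilon; linprog minimizes, so minimize -epsilon
    c = np.zeros(m + 1)
    c[-1] = -1.0
    # For every opponent profile j:  -(sum_s p_s U[s, j]) + epsilon <= -U[target, j]
    A_ub = np.hstack([-U.T, np.ones((n, 1))])
    b_ub = -U[target, :]
    # Probabilities sum to one (epsilon gets coefficient 0)
    A_eq = np.append(np.ones(m), 0.0).reshape(1, -1)
    b_eq = [1.0]
    bounds = [(0, 1)] * m + [(None, None)]
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=bounds)
    eps = -res.fun
    return eps > 1e-9, res.x[:m]

# The 3x2 example from the previous slide: mixing rows 1 and 2 beats row 3
U = [[3, 0], [0, 3], [1, 1]]
print(strictly_dominated_by_mixture(U, 2))  # (True, approximately [0.5, 0.5, 0.0])
```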

Best-response strategies
Suppose you know your opponent's mixed strategy
– E.g., your opponent plays rock 50% of the time and scissors 50%
What is the best strategy for you to play?
Rock gives .5*0 + .5*1 = .5
Paper gives .5*1 + .5*(-1) = 0
Scissors gives .5*(-1) + .5*0 = -.5
So the best response to this opponent strategy is to (always) play rock
There is always some pure strategy that is a best response
– Suppose you have a mixed strategy that is a best response; then every one of the pure strategies that that mixed strategy places positive probability on must also be a best response
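The same computation in a few lines of numpy (the payoff matrix below is standard rock-paper-scissors, rows our strategies and columns the opponent's; this encoding is my own assumption):

```python
import numpy as np

# Our payoffs in standard rock-paper-scissors
# (rows: our rock/paper/scissors, columns: opponent's rock/paper/scissors)
our_payoffs = np.array([[0, -1,  1],
                        [1,  0, -1],
                        [-1, 1,  0]])
opponent_mix = np.array([0.5, 0.0, 0.5])     # 50% rock, 50% scissors

expected = our_payoffs @ opponent_mix        # [0.5, 0.0, -0.5], as on the slide
print(expected, expected.argmax())           # best response: index 0 (rock)
```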

How to play matching pennies
Assume opponent knows our mixed strategy
If we play L 60%, R 40%...
... opponent will play R...
... we get .6*(-1) + .4*(1) = -.2
What's optimal for us? What about rock-paper-scissors?

Us \ Them     L        R
L            1, -1    -1, 1
R           -1, 1     1, -1
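One way to see the answer: if we play L with probability p, we get p*1 + (1-p)*(-1) = 2p - 1 when the opponent plays L and p*(-1) + (1-p)*1 = 1 - 2p when the opponent plays R. An opponent who knows p picks whichever is worse for us, so we maximize min(2p - 1, 1 - 2p), which is achieved at p = .5 for a guaranteed expected payoff of 0; the same indifference argument gives the uniform (1/3, 1/3, 1/3) mix in rock-paper-scissors.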

General-sum games
You could still play a minimax strategy in general-sum games
– I.e., pretend that the opponent is only trying to hurt you
But this is not rational:

          Left    Right
Up        0, 0    3, 1
Down      1, 0    2, 1

If Column was trying to hurt Row, Column would play Left, so Row should play Down
In reality, Column will play Right (strictly dominant), so Row should play Up
Is there a better generalization of minimax strategies in zero-sum games to general-sum games?

Nash equilibrium [Nash 50]
A vector of strategies (one for each player) is called a strategy profile
A strategy profile (σ_1, σ_2, …, σ_n) is a Nash equilibrium if each σ_i is a best response to σ_{-i}
– That is, for any i, for any σ_i', u_i(σ_i, σ_{-i}) ≥ u_i(σ_i', σ_{-i})
Note that this does not say anything about multiple agents changing their strategies at the same time
In any (finite) game, at least one Nash equilibrium (possibly using mixed strategies) exists [Nash 50]
(Note - singular: equilibrium, plural: equilibria)

The presentation game

Audience \ Presenter         Put effort into presentation (E)    Do not put effort into presentation (NE)
Pay attention (A)                       4, 4                                 -16, -14
Do not pay attention (NA)               0, -2                                 0, 0

(payoffs listed as (audience, presenter))
Pure-strategy Nash equilibria: (A, E), (NA, NE)
Mixed-strategy Nash equilibrium: ((1/10 A, 9/10 NA), (4/5 E, 1/5 NE))
– Utility 0 for audience, -14/10 for presenter
– Can see that some equilibria are strictly better for both players than other equilibria
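A quick check of the mixed equilibrium: against (4/5 E, 1/5 NE), the audience gets 4/5*4 + 1/5*(-16) = 0 from A and 0 from NA, so it is willing to mix; against (1/10 A, 9/10 NA), the presenter gets 1/10*4 + 9/10*(-2) = -14/10 from E and 1/10*(-14) + 9/10*0 = -14/10 from NE, so the presenter is willing to mix as well.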

Some properties of Nash equilibria
If you can eliminate a strategy using strict dominance or even iterated strict dominance, it will not occur (i.e., it will be played with probability 0) in any Nash equilibrium
– Weakly dominated strategies may still be played in some Nash equilibrium
In 2-player zero-sum games, a profile is a Nash equilibrium if and only if both players play minimax strategies
– Hence, in such games, if (σ_1, σ_2) and (σ_1', σ_2') are Nash equilibria, then so are (σ_1, σ_2') and (σ_1', σ_2)
No equilibrium selection problem here!

Solving for a Nash equilibrium using MIP (2 players) [Sandholm, Gilpin, Conitzer AAAI05]
maximize whatever you like (e.g., social welfare)
subject to:
– for both i, Σ_{s_i} p_{s_i} = 1
– for both i, for any s_i, Σ_{s_{-i}} p_{s_{-i}} u_i(s_i, s_{-i}) = u_{s_i}
– for both i, for any s_i, u_i ≥ u_{s_i}
– for both i, for any s_i, p_{s_i} ≤ b_{s_i}
– for both i, for any s_i, u_i − u_{s_i} ≤ M(1 − b_{s_i})
b_{s_i} is a binary variable indicating whether s_i is in the support, M is a large number
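A sketch of this MIP in the PuLP modeling library; the helper name, the default big-M value, and the choice of social welfare as the objective are illustrative, not taken from the paper:

```python
import pulp

def nash_mip(U1, U2, M=1000.0):
    """Nash equilibrium of a 2-player game maximizing u1 + u2 (social welfare).
    U1, U2: payoff matrices for the row and column player; M must exceed the
    largest possible payoff difference."""
    m, n = len(U1), len(U1[0])
    prob = pulp.LpProblem("nash", pulp.LpMaximize)
    p1 = [pulp.LpVariable(f"p1_{s}", 0, 1) for s in range(m)]          # mixed strategies
    p2 = [pulp.LpVariable(f"p2_{t}", 0, 1) for t in range(n)]
    u1, u2 = pulp.LpVariable("u1"), pulp.LpVariable("u2")              # equilibrium utilities
    us1 = [pulp.LpVariable(f"us1_{s}") for s in range(m)]              # per-pure-strategy utilities
    us2 = [pulp.LpVariable(f"us2_{t}") for t in range(n)]
    b1 = [pulp.LpVariable(f"b1_{s}", cat="Binary") for s in range(m)]  # support indicators
    b2 = [pulp.LpVariable(f"b2_{t}", cat="Binary") for t in range(n)]

    prob += u1 + u2                                  # objective: social welfare
    prob += pulp.lpSum(p1) == 1
    prob += pulp.lpSum(p2) == 1
    for s in range(m):
        prob += us1[s] == pulp.lpSum(U1[s][t] * p2[t] for t in range(n))
        prob += u1 >= us1[s]                         # u1 is the best pure-strategy utility
        prob += p1[s] <= b1[s]                       # only support strategies get probability
        prob += u1 - us1[s] <= M * (1 - b1[s])       # support strategies must be best responses
    for t in range(n):
        prob += us2[t] == pulp.lpSum(U2[s][t] * p1[s] for s in range(m))
        prob += u2 >= us2[t]
        prob += p2[t] <= b2[t]
        prob += u2 - us2[t] <= M * (1 - b2[t])
    prob.solve(pulp.PULP_CBC_CMD(msg=False))
    return [v.value() for v in p1], [v.value() for v in p2], u1.value(), u2.value()

# The presentation game from the earlier slide (rows: audience, columns: presenter)
print(nash_mip([[4, -16], [0, 0]], [[4, -14], [-2, 0]]))
# expected: the (A, E) equilibrium, i.e. p1 = [1, 0], p2 = [1, 0], utilities (4, 4)
```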

Stackelberg (commitment) games (My research)
Unique Nash equilibrium is (R, L)
– This has a payoff of (2, 1)

       L        R
L    1, -1    3, 1
R    2, 1     4, -1

Commitment
What if the officer has the option to (credibly) announce where he will be patrolling?
This would give him the power to "commit" to being at one of the buildings
– This would be a pure-strategy Stackelberg game

       L          R
L    (1, -1)    (3, 1)
R    (2, 1)     (4, -1)

Commitment…
If the officer can commit to always being at the left building, then the vandal's best response is to go to the right building
– This leads to an outcome of (3, 1)

       L          R
L    (1, -1)    (3, 1)

Committing to mixed strategies
What if we give the officer even more power: the ability to commit to a mixed strategy
– This results in a mixed-strategy Stackelberg game
– E.g., the officer commits to flipping a weighted coin which decides where he patrols

       L          R
L    (1, -1)    (3, 1)
R    (2, 1)     (4, -1)

Committing to mixed strategies is more powerful
Suppose the officer commits to the following strategy: {(.5 + ε) L, (.5 − ε) R}
– The vandal's best response is R
– As ε goes to 0, this converges to a payoff of (3.5, 0)

       L          R
L    (1, -1)    (3, 1)
R    (2, 1)     (4, -1)
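To check: against {(.5 + ε) L, (.5 − ε) R}, the vandal gets (.5 + ε)(-1) + (.5 − ε)(1) = -2ε from L and (.5 + ε)(1) + (.5 − ε)(-1) = 2ε from R, so R is his strict best response for any ε > 0; the officer then gets (.5 + ε)(3) + (.5 − ε)(4) = 3.5 − ε, which approaches 3.5 as ε goes to 0.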

Stackelberg games in general One of the agents (the leader) has some advantage that allows her to commit to a strategy (pure or mixed) The other agent (the follower) then chooses his best response to this

Visualization

      L       C       R
U    0, 1    1, 0    0, 0
M    4, 0    0, 1    0, 0
D    1, 0    1, 1

(Figure: the leader's mixed-strategy simplex with vertices (1,0,0) = U, (0,1,0) = M, (0,0,1) = D, partitioned into regions according to the follower's best response L, C, or R.)

Easy polynomial-time algorithm for two players
For every column t separately, we solve for the best mixed row strategy (defined by p_s) that induces player 2 to play t
maximize Σ_s p_s u_1(s, t)
subject to
for any t', Σ_s p_s u_2(s, t) ≥ Σ_s p_s u_2(s, t')
Σ_s p_s = 1
(May be infeasible)
Pick the t that is best for player 1
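A sketch of this multiple-LP method using scipy.optimize.linprog (function and variable names and the matrix layout are illustrative; the weak inequality resolves the follower's ties in the leader's favor, matching the tie-breaking assumption later in the talk):

```python
import numpy as np
from scipy.optimize import linprog

def optimal_commitment(U1, U2):
    """Optimal mixed strategy for the row player (leader) to commit to.
    U1, U2: leader and follower payoffs (rows = leader, columns = follower)."""
    U1, U2 = np.asarray(U1, float), np.asarray(U2, float)
    m, n = U1.shape
    best = (-np.inf, None, None)
    for t in range(n):                          # try to induce each follower action t
        c = -U1[:, t]                           # maximize the leader's payoff given t
        # Follower must weakly prefer t to every t':  p . (U2[:, t'] - U2[:, t]) <= 0
        A_ub = (U2 - U2[:, [t]]).T
        b_ub = np.zeros(n)
        A_eq, b_eq = np.ones((1, m)), [1.0]
        res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq,
                      bounds=[(0, 1)] * m)
        if res.success and -res.fun > best[0]:  # feasible and better for the leader
            best = (-res.fun, res.x, t)
    return best

# The officer/vandal game from the earlier slides
U_officer = [[1, 3], [2, 4]]
U_vandal = [[-1, 1], [1, -1]]
print(optimal_commitment(U_officer, U_vandal))
# leader value 3.5 with p close to [0.5, 0.5], inducing the vandal to play column 1 (R)
```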

(a particular kind of) Bayesian games
(Figure: a leader utility matrix together with two follower utility matrices, one for follower type 1 (probability .6) and one for follower type 2 (probability .4).)

Multiple types - visualization
(Figure: one best-response simplex per follower type, each with vertices (1,0,0), (0,1,0), (0,0,1) and regions L, C, R, plus a combined simplex whose regions correspond to pairs of best responses such as (R, C).)

Solving Bayesian games
There's a known MIP for this [1]
Details omitted because it's rather nasty.
The main trick of the MIP is encoding an exponential number of LPs into a single MIP
Used in the ARMOR system deployed at LAX
[1] Paruchuri et al. Playing Games for Security: An Efficient Exact Algorithm for Solving Bayesian Stackelberg Games

(In)approximability (# types)-approximation: optimize for each type separately using the LP method. Pick the solution that gives the best expected utility against the entire type distribution. Can’t do any better in polynomial time, unless P=NP –Reduction from INDEPENDENT-SET For adversarially chosen types, cannot decide in polynomial time whether it is possible to guarantee positive utility, unless P=NP –Again, a MIP formulation can be given

Reduction from independent set
(Figure: payoff matrices for the reduction, namely leader utilities and follower utilities for types 1, 2, and 3, over leader actions a_l1, a_l2, a_l3 and follower actions A, B.)

Extensive-form games Often games have an inherent time structure –In these cases, it is often easier to represent these games in the extensive form The focus of my most recent paper (EC ‘10) was to determine in which extensive-form games the Stackelberg solution can be found efficiently

Stackelberg games in extensive form
(Figure: a two-player game tree. Player 2 moves at the root, choosing between two Player 1 nodes; the left Player 1 node has leaves (1, 3) and (0, 1), the right has leaves (2, 2) and (3, 0). Subgame perfect Nash equilibrium: (1, 3). Pure strategy commitment: (2, 2). Mixed strategy commitment, 50%/50% at the right node: (2.5, 1).)

Other aspects considered Pure or mixed strategy commitment Perfect vs imperfect information Chance nodes Restricted or costly commitment –Player 1 either incurs a cost for committing at some nodes/information sets or is unable to do so Tree vs DAG –The key difference in a DAG is the inability for player 1 to commit differently based on what path is taken to a node/information set

Overview of results (decision tree)
(Figure: a decision tree over the problem features (chance vs. no chance nodes, imperfect vs. perfect information, pure vs. mixed commitment, tree vs. DAG, two vs. three or more players, restrictions vs. no restrictions on commitment), with each case labeled P or NP-hard and one case left open.)

Case 1: pure strategy commitment
THEOREM. Can be solved in O(nm) time when:
– perfect information
– tree form
– no chance nodes
– no costs/restrictions
– pure strategy commitment
– any number of players
n is the number of internal nodes, m the number of leaf nodes

Case 1: algorithm Two main steps –An upward pass to determine what subset of each node’s descendant leaf nodes can be achieved –A downward pass to determine the correct commitment at each node This is both on and off the path to the desired outcome

The upward pass
At player 1 nodes
– Take the union of all children's achievable sets
At player i ≠ 1 nodes
– Determine the pruning value for each child: the maximum, over the other children, of the minimum u_i achievable in that other child (this is how much we can punish player i for not going to this child)
– Prune each child's set at its pruning value (discarding outcomes that give player i less), then take the union of what remains

Case 1 example: upward pass
(Figure: the game tree from before. Player 2 is at the root; the left Player 1 node has leaves (1, 3) and (0, 1), the right Player 1 node has leaves (2, 2) and (3, 0). The left child's achievable set is ((1,3), (0,1)) with pruning value 0, so nothing is pruned; the right child's achievable set is ((2,2), (3,0)) with pruning value 1, so (3,0) is pruned. The root's achievable set is ((1,3), (0,1), (2,2)).)
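A sketch of the upward pass under an assumed ad-hoc tree encoding (leaves are payoff tuples; internal nodes are dicts with a player index and a list of children). Run on the example above, it reproduces the root's achievable set:

```python
def achievable(node):
    """Outcomes (leaf payoff tuples) player 1 can induce in this subtree."""
    if isinstance(node, tuple):                       # leaf: a payoff vector
        return {node}
    player, children = node["player"], node["children"]
    child_sets = [achievable(c) for c in children]
    if player == 1:
        # Player 1 commits here, so anything achievable in some child is achievable
        return set().union(*child_sets)
    result = set()
    for j, s in enumerate(child_sets):
        others = [child_sets[k] for k in range(len(children)) if k != j]
        # Pruning value: the best player `player` could still get by deviating to
        # another child, assuming player 1 punishes deviations as hard as possible
        prune = max(min(o[player - 1] for o in other) for other in others) \
            if others else float("-inf")
        result |= {o for o in s if o[player - 1] >= prune}
    return result

tree = {"player": 2, "children": [
    {"player": 1, "children": [(1, 3), (0, 1)]},
    {"player": 1, "children": [(2, 2), (3, 0)]},
]}
print(achievable(tree))   # {(1, 3), (0, 1), (2, 2)} (in some order)
```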

The downward pass
A recursive algorithm
– At player 1 nodes: simply commit on the path to the desired node and recurse on that child
– At player i ≠ 1 nodes: recurse towards the desired outcome, as well as towards the smallest outcome (in player i's utility) in every other child

Case 1 example: downward pass
(Figure: the same tree. The desired outcome (2, 2) lies in the right subtree, so player 1 commits to (2, 2) there and commits to the punishing outcome (0, 1) in the left subtree; player 2 then prefers the right subtree, yielding (2, 2).)

Case 2: mixed (behavioral) strategy commitment
THEOREM. Can be solved in O(nm²) time when:
– perfect information
– tree form
– no chance nodes
– no costs/restrictions
– mixed strategy commitment
– two players
n is the number of internal nodes, m the number of leaf nodes

Case 2: algorithm (sketch) Two main steps –An upward pass to determine what mixtures of each node’s descendants can be achieved –A downward pass to determine the correct commitment to achieve the best mixed strategy

The upward pass
This time we will need to store mixed strategies (meaning convex sets), rather than points
– It turns out that, since our eventual goal is to maximize player 1's utility, maintaining the ceiling of these convex sets (a collection of line segments) is enough
– For computational reasons, we never actually compute the ceiling itself, but instead maintain a slightly larger superset of it

The upward pass
At player 1 nodes
– Take the union of all children's achievable sets (represented as line segments)
– Also, for endpoints of line segments from two different children, we can take convex combinations; this may result in another segment (these endpoints will either be leaf nodes or points generated at player 2 nodes)

The upward pass At player 2 nodes –For each child find the pruning value –Prune each line segment at this value (if either end point is smaller than this value) –Take the union of all children’s achievable sets

Case 2 example: upward pass
(Figure: the same tree. The left child's segment is ((1,3), (0,1)) with pruning value 0; the right child's segment ((2,2), (3,0)) has pruning value 1 and is pruned at the point (2.5, 1), leaving the segment ((2,2), (2.5,1)). The root's achievable set is the pair of segments ((1,3), (0,1)) and ((2,2), (2.5,1)).)

The downward pass
A recursive algorithm
– At player 1 nodes: compute and commit to the necessary probabilities, and recurse on the children that receive positive probability
– At player 2 nodes: recurse towards the desired outcome, as well as towards the smallest outcome in every other child (note: player 2 never needs to randomize)

Case 2 example: downward pass
(Figure: the same tree. To achieve (2.5, 1), player 1 commits to a 50%/50% mixture over (2, 2) and (3, 0) in the right subtree and to (0, 1) in the left subtree; player 2, with ties broken in player 1's favor, then goes right.)

Chance nodes
Moves by a player with a fixed behavioral strategy that has no stake in the game
– Usually referred to as moves by Nature
– The behavioral strategy is common knowledge
– We don't include Nature when we count the number of players

Chance node results
THEOREM. It is NP-hard to solve for the optimal strategy to commit to in a game with:
– chance nodes,
– two players,
– tree form,
– perfect information,
– no costs/restrictions,
– pure or mixed strategy commitment.
We prove this via reduction from Knapsack.

Knapsack
Set of N items
– Each has a value p_i and a weight w_i
Find a subset of items that
– Maximizes the sum of the p_i of the items in the subset
– s.t. the sum of the w_i of the items in the subset is at most a given limit W

Knapsack reduction One subtree for each item –Player 1 commitment determines if the item is included or not Chance node to make all items be considered Player 2 at the top makes a choice to enforce the weight limit

Knapsack reduction
(Figure: the reduction game tree. A Player 2 node at the top imposes the weight constraint, with an outside option worth (0, -W); below it, a chance node C plays each item's subtree with probability 1/N, forcing all items to be considered; within item i's subtree, player 1's commitment determines whether the item is included, leading to leaves of the form (Nw_i, -Nw_i), (0, -Nw_i), and (0, 0).)

Open questions Are there good heuristics/approximation algorithms for any of the NP-hard cases? Are there other restrictions that allow for fast algorithms? Are the given algorithms tight or is there room for improvement?

Thank you for your attention
(Figure: the decision-tree overview of results, repeated from the earlier slide.)

Pure-strategy extensive form representation of normal form
(Figure: a game tree in which Player 1 first chooses Left, i.e. (1,0), or Right, i.e. (0,1), and Player 2 then responds. After Left, Player 2 chooses between (1, -1) and (3, 1); after Right, between (2, 1) and (4, -1). Player 2's best responses give values (3, 1) and (2, 1) at the two Player 2 nodes.)

Mixed strategy extensive form representation of normal form
(Figure: Player 1 first chooses a mixed strategy, e.g. (1,0) (= Up), (0,1) (= Down), (.5, .5), ..., and Player 2 then responds; after the commitment (.5, .5), Player 2's Left and Right lead to expected payoffs (1.5, 0) and (3.5, 0).)
While conceptually useful, this is not useful computationally: the tree has infinite size

Tie breaking
As is commonly done, we assume that all players break ties in player 1's favor
Consider a case where player 1 makes a mixed-strategy commitment between two choices, (1,0) and (0,1). If player 2 has a choice between the result of player 1's commitment and (0, .5):
– Player 1 can commit to a (.5 + ε) probability of playing (0,1) and a (.5 − ε) probability of playing (1,0)
– Then, player 2 will prefer the outcome of player 1's commitment.

DAG
(Figure: the extensive-form representation above, redrawn as a DAG.)

DAG example
(Figure: a game with chance nodes (C), Player 1, and Player 2 choosing H or T, with leaf payoffs (0, 2), (2, 0), (0, 1), and (1, 0).)