# Chapter 4 Sequential Games


Extensive Form Games. Any finite game of perfect information has a pure-strategy Nash equilibrium, which can be found by backward induction. Chess is a finite game of perfect information; therefore it is a "trivial" game from a game-theoretic point of view. [The original slide also shows a small game tree with moves H and T and payoffs (1,2), (4,0), (2,1).]

Extensive Form Games - Intro
A game can have complex temporal structure. The extensive form specifies:
- who moves when and under what circumstances
- what actions are available when a player is called upon to move
- what is known when a player is called upon to move
- what payoffs each player receives

Players' information is captured by information sets. The foundation is a game tree.

Big Monkey and Little Monkey eat warifruit, which dangle from the extreme tip of a lofty branch of the waritree. A waritree produces only one fruit. To get the warifruit, at least one of the monkeys must climb the tree and shake the branch bearing the fruit until the fruit comes loose and falls to the ground. A warifruit is worth 10 calories of energy. Climbing the tree uses 2 calories for Big Monkey, but uses no energy for Little Monkey, who is smaller. If Little Monkey climbs the tree and shakes it down, Big Monkey will eat 90% of the fruit (or 9 calories) before Little Monkey climbs back down, and Little Monkey will get only 10% of the fruit (or 1 calorie). If Big Monkey climbs the tree and Little Monkey waits, Little Monkey will get 40% of the fruit and Big Monkey will get 60%. If both monkeys climb the tree, Big Monkey will get 70% of the fruit and Little Monkey will get 30%. Assume each monkey is simply interested in maximizing his caloric intake. Each monkey can decide to climb the tree or wait at the bottom.
a. What is likely to happen if Big Monkey makes his decision first?
b. What is likely to happen if Little Monkey must decide first?
c. What if they both decide simultaneously?

Fundamental Tools Big Monkey (BM) – Little Monkey (LM)
Warifruit from the waritree (only one per tree) = 10 calories. A monkey must climb the tree to get the fruit. Cost of climbing: 2 calories for Big Monkey, zero for Little Monkey. Gross payoffs (before climbing costs):
- Both climb: BM 7 calories, LM 3 calories
- Only BM climbs: BM 6 calories, LM 4 calories
- Only LM climbs: BM 9 calories, LM 1 calorie

What will they do to maximize payoff, taking the climbing cost into account?

Fundamental Tools Extensive form games--Definition
An extensive form game G consists of:
- Players
- A game tree
- Payoffs: each terminal node t is assigned a payoff π_i(t) ∈ ℝ for each player i

G has the tree property: there is only one path from the root to any terminal node. Stochastic events are modeled by a fictitious player, Nature, with a probability assigned to each branch whose head node belongs to Nature.

Fundamental Tools Extensive form games—Illustration (BM-LM)
There are 3 possibilities: BM decides first, LM decides first, or both decide simultaneously.

BM decides first. [Game tree: Big Monkey chooses w (wait) or c (climb); at each resulting node Little Monkey chooses w or c. Payoffs (BM, LM): (w,w) = (0,0), (w,c) = (9,1), (c,w) = (4,4), (c,c) = (5,3).]
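The BM-first game can be solved mechanically by rollback. A minimal sketch in Python, using the payoffs from the text in (BM, LM) order; the function name is illustrative:

```python
# Backward induction for the Big Monkey / Little Monkey game in which
# Big Monkey (BM) moves first. Payoffs are (BM, LM):
payoffs = {
    ("w", "w"): (0, 0), ("w", "c"): (9, 1),
    ("c", "w"): (4, 4), ("c", "c"): (5, 3),
}

def solve_bm_first():
    # For each BM move, LM picks the reply maximizing LM's payoff (index 1).
    lm_reply = {
        bm: max("wc", key=lambda lm: payoffs[(bm, lm)][1]) for bm in "wc"
    }
    # BM anticipates LM's replies and maximizes BM's own payoff (index 0).
    bm_move = max("wc", key=lambda bm: payoffs[(bm, lm_reply[bm])][0])
    return bm_move, lm_reply[bm_move], payoffs[(bm_move, lm_reply[bm_move])]

print(solve_bm_first())  # ('w', 'c', (9, 1)): BM waits, LM climbs
```

So when Big Monkey moves first he waits, Little Monkey climbs, and the payoffs are (9, 1).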

Fundamental Tools Extensive form games—Illustration (BM-LM)
Strategies. BM: Wait (w) or Climb (c). LM moves second, so LM's strategy specifies an action for each of BM's possible moves, written as (reply to w, reply to c):
- Climb no matter what BM does (cc)
- Wait no matter what BM does (ww)
- Do the same thing BM does (wc)
- Do the opposite of what BM does (cw)

A series of actions that fully defines the behavior of a player is a strategy. A strategy for a player is a complete plan of how to play the game: it prescribes his choice at every information set (in this case, at every node).

Fundamental Tools Extensive form games—Illustration (BM-LM)
LM decides first: the roles are reversed. Utility = (LM, BM). [Game tree: Little Monkey chooses w or c; at each resulting node Big Monkey chooses w or c. Payoffs (LM, BM): (w,w) = (0,0), (w,c) = (4,4), (c,w) = (1,9), (c,c) = (3,5).]

Fundamental Tools Extensive form games—Illustration (BM-LM)
They choose simultaneously. An information set is a set of nodes at which (i) the same player chooses, and (ii) the player choosing does not know which node is the actual choice node; it is represented by a dotted line joining the nodes. [Figure: the BM-first tree with Little Monkey's two nodes joined in a single information set, equivalent to the matrix with payoffs (BM, LM): (w,w) = (0,0), (w,c) = (9,1), (c,w) = (4,4), (c,c) = (5,3).]

The key to representing information in a game tree is realizing the connection between nodes and history. If you know which node you have reached, you know precisely the history of play. To express uncertainty, we use the concept of an information set (the set of nodes you could be in at a given time).

Composition of information sets
- Each decision node is in exactly one information set.
- All nodes of an information set must belong to the same player.
- Every node of an information set must have exactly the same set of available actions.

If every information set of every player is a singleton, we have a game of perfect information.
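These conditions can be checked mechanically. A minimal sketch, where `Node` and `InfoSet` are illustrative structures invented for this example (not from any library):

```python
# Checking the composition conditions for information sets.
from dataclasses import dataclass, field

@dataclass(frozen=True)
class Node:
    player: str        # who moves at this node
    actions: frozenset # actions available at this node

@dataclass
class InfoSet:
    player: str
    nodes: list = field(default_factory=list)

def is_valid(info_set):
    """All nodes belong to the set's player and offer identical actions."""
    return all(n.player == info_set.player for n in info_set.nodes) and \
           len({n.actions for n in info_set.nodes}) <= 1

def perfect_information(info_sets):
    """Perfect information iff every information set is a singleton."""
    return all(len(s.nodes) == 1 for s in info_sets)

# Two P2 nodes with the same actions form a legal (imperfect-info) set:
a = Node("P2", frozenset({"L", "R"}))
b = Node("P2", frozenset({"L", "R"}))
s = InfoSet("P2", [a, b])
print(is_valid(s), perfect_information([s]))  # True False
```

The set is valid, but because it contains two nodes the game is not one of perfect information.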

Fundamental Tools Normal form games--Definition
The n-player normal form game consists of:
- Players i = 1, …, n.
- A set S_i of strategies for each player i = 1, …, n. We call s = (s_1, …, s_n), where s_i ∈ S_i for i = 1, …, n, a strategy profile for the game. Each s_i is a strategy for player i.
- A function π_i : S → ℝ for each player i = 1, …, n, where S is the set of strategy profiles, so π_i(s) is player i's payoff when strategy profile s is chosen.

Fundamental Tools Normal form games--Illustration
Another way to depict the BM-LM game (where BM chooses first). LM's strategies specify an action for each of BM's moves: climb no matter what BM does (cc), wait no matter what (ww), do the same thing BM does (wc), do the opposite (cw). Payoffs are (BM, LM):

| BM \ LM | cc  | cw  | wc  | ww  |
|---------|-----|-----|-----|-----|
| w       | 9,1 | 9,1 | 0,0 | 0,0 |
| c       | 5,3 | 4,4 | 5,3 | 4,4 |

Fundamental Tools Normal form games--Illustration
Do not eliminate weakly dominated strategies here: doing so can eliminate Nash equilibria of the game. Payoffs are (BM, LM):

| BM \ LM | cc  | cw  | wc  | ww  |
|---------|-----|-----|-----|-----|
| w       | 9,1 | 9,1 | 0,0 | 0,0 |
| c       | 5,3 | 4,4 | 5,3 | 4,4 |
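The pure-strategy Nash equilibria of this normal form can be found by checking every cell for mutual best responses. A sketch; the four LM columns are filled in from the strategy definitions in the text (an assumption, since the transcript compresses duplicate cells):

```python
# Pure-strategy Nash equilibria of the BM-LM normal form.
# Rows: BM (w, c).  Columns: LM strategies (reply to w, reply to c).
rows = ["w", "c"]
cols = ["cc", "cw", "wc", "ww"]
payoff = {  # (BM payoff, LM payoff)
    ("w", "cc"): (9, 1), ("w", "cw"): (9, 1), ("w", "wc"): (0, 0), ("w", "ww"): (0, 0),
    ("c", "cc"): (5, 3), ("c", "cw"): (4, 4), ("c", "wc"): (5, 3), ("c", "ww"): (4, 4),
}

def pure_nash():
    eq = []
    for r in rows:
        for c in cols:
            # r must be a best response to c, and c to r.
            best_row = all(payoff[(r, c)][0] >= payoff[(r2, c)][0] for r2 in rows)
            best_col = all(payoff[(r, c)][1] >= payoff[(r, c2)][1] for c2 in cols)
            if best_row and best_col:
                eq.append((r, c))
    return eq

print(pure_nash())  # [('w', 'cc'), ('w', 'cw'), ('c', 'ww')]
```

Of the three pure equilibria, (w, cw) corresponds to the backward-induction outcome; (w, cc) and (c, ww) rely on off-path choices by LM that are not sequentially rational.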

Sequential games. If players take turns to move, we have a sequential game (sometimes called a dynamic game). We model a sequential game using a 'game tree' (an 'extensive form representation').

It can be shown that every strategic form game can be represented by an extensive form game, and vice versa. But strategies that are in equilibrium in a strategic form game are not necessarily equilibrium strategies in the corresponding extensive form game. Example: the monopolist. Fighting may make sense as part of an equilibrium threat, but not once the other firm has already entered. We therefore need to define an equilibrium concept for extensive form games.

Problems with Nash equilibrium
- The sequential nature of the game is lost when an extensive form game is represented in strategic form.
- Some Nash equilibria rely on playing actions that are not rational once the corresponding action node has actually been reached. In other words, a choice only makes sense given what the opponent will do.
- Nash equilibrium does not distinguish between credible and non-credible threats.

Solving sequential games
To solve a sequential game we look for the 'subgame perfect Nash equilibrium'. For our purposes, this means we solve the game using 'rollback'. To use rollback, start at the end of each branch and work backwards, eliminating all but the optimal choice for the relevant player. (Technical point: this trick only works directly if every information set is a singleton. If you don't know which node you are at, you cannot simply pick the optimal action there.)

Subgame. Its game tree is a branch of the original game tree.
The information sets in the branch coincide with the information sets of the original game and cannot include nodes that are outside the branch. The payoff vectors are the same as in the original game.

Subgame perfect equilibrium & credible threats
- Proper subgame: a subtree (of the game tree) whose root is alone in its information set.
- Subgame perfect equilibrium: a strategy profile that is a Nash equilibrium in every proper subgame (including the root), whether or not that subgame is reached along the equilibrium path of play.

On October 22, 1962, after reviewing newly acquired intelligence, President John F. Kennedy informed the world that the Soviet Union was building secret missile bases in Cuba, a mere 90 miles off the shores of Florida. After weighing such options as an armed invasion of Cuba and air strikes against the missiles, Kennedy decided on a less dangerous response. In addition to demanding that Russian Premier Nikita S. Khrushchev remove all the missile bases and their deadly contents, Kennedy ordered a naval quarantine (blockade) of Cuba in order to prevent Russian ships from bringing additional missiles and construction materials to the island. In response to the American naval blockade, Premier Khrushchev authorized his Soviet field commanders in Cuba to launch their tactical nuclear weapons if invaded by U.S. forces. Deadlocked in this manner, the two leaders of the world's greatest nuclear superpowers stared each other down for seven days - until Khrushchev blinked. On October 28, thinking better of prolonging his challenge to the United States, the Russian Premier conceded to President Kennedy's demands by ordering all Soviet supply ships away from Cuban waters and agreeing to remove the missiles from Cuba's mainland. After several days of teetering on the brink of nuclear holocaust, the world breathed a sigh of relief.

Example: Cuban Missile Crisis
[Game tree: Kennedy chooses Arm or Retract. Retract ends the game with payoffs (−1, 1). If Kennedy plays Arm, Khrushchev chooses Nuke, giving (−100, −100), or Fold, giving (10, −10); payoffs are (Kennedy, Khrushchev).] Pure strategy Nash equilibria: (Arm, Fold) and (Retract, Nuke). Pure strategy subgame perfect equilibrium: (Arm, Fold). Conclusion: Khrushchev's Nuke threat was not credible.

Backwards induction:
1. Start from the smallest subgames containing the terminal nodes of the game tree.
2. Determine the action that a rational player would choose at that action node. At action nodes immediately adjacent to terminal nodes, the player simply maximizes her own utility; she no longer cares about strategic interaction, since regardless of how she moves, nobody else can affect the resulting payoff.
3. Replace the subgame with the payoffs of the terminal node that would be reached if that action were played.
4. Repeat until there are no action nodes left.
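The steps above can be sketched as a short recursion. This is an illustrative sketch, not the chapter's own code; the tree encoding (leaves as payoff tuples, internal nodes as dicts) is an assumption:

```python
# A minimal recursive implementation of rollback for a perfect-information tree.
# A tree is either a leaf (a tuple of payoffs, one per player) or a dict
# {"player": index, "moves": {action: subtree}}.

def rollback(node):
    """Return (payoff_vector, plan) where plan maps node ids to chosen actions."""
    if isinstance(node, tuple):        # terminal node: payoffs, nothing to choose
        return node, {}
    i = node["player"]                 # index of the player moving here
    plan, best_action, best_payoff = {}, None, None
    for action, subtree in node["moves"].items():
        payoff, subplan = rollback(subtree)   # solve the subgame first
        plan.update(subplan)
        if best_payoff is None or payoff[i] > best_payoff[i]:
            best_action, best_payoff = action, payoff
    plan[id(node)] = best_action
    return best_payoff, plan

# The tree of Example 4.9 later in the chapter (players indexed 0 and 1):
B = {"player": 1, "moves": {"L'": (2, 1), "R'": (0, 3)}}
C = {"player": 1, "moves": {"L''": (4, 1), "R''": (1, 0)}}
root = {"player": 0, "moves": {"L": B, "R": C}}
value, plan = rollback(root)
print(value, plan[id(root)])  # (4, 1) R
```

Ties are broken arbitrarily here (the first maximizing action found is kept), which is enough for finding one subgame perfect outcome.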


The predation game Nasty Guys is an incumbent firm producing bricks
SIC (Sweet Innocent Corporation) is a potential new entrant in the brick market. Nasty Guys says that if SIC enters then it will “squish them like a bug”. What should SIC do?

The predation game. [Game tree: SIC chooses Don't enter, ending the game with SIC = 0, NG = 100, or Enter. After entry, NG chooses Fight (payoffs not shown in the transcript) or Don't fight, giving SIC = 30, NG = 30.]

The predation game. If SIC actually enters, then 'fighting' is a non-credible threat: it hurts SIC, but it also hurts NG. So SIC knows the threat is just a bluff.

The predation game. So the equilibrium is: SIC will enter and NG will not fight, giving SIC = 30, NG = 30.

Credible commitments. When Cortes arrived in Mexico he ordered that his ships be burnt. This seems silly: his troops were vastly outnumbered, so surely it is better to keep an 'escape route' home?

Think of Cortes trying to motivate his own soldiers
[Game tree: Cortes (C) chooses Keep Ships or Burn Ships; in either case his soldiers (S) choose Fight Hard or Be Careful. Fight Hard gives C = 100, S = 0 in both branches. Be Careful gives C = 0, S = 10 if the ships are kept, but C = −100, S = −100 if the ships are burnt.]

If no retreat is possible, the soldiers will fight hard or die. But if retreat is possible, they may fight less hard and 'run away'.

So Cortes wants to burn his ships. It is a credible commitment not to retreat, and this alters how his own troops behave.

Hold up. Hold up occurs if one party has to incur sunk costs before bargaining with another party. For example, hardware manufacturers and software developers: hardware manufacturers want software developers to make applications for their hardware, but most of the cost of software is sunk. So if the parties bargain after the software is designed, the hardware manufacturer can seize most of the benefits.

Holdup: in equilibrium, no one designs software. Payoffs = (software designer, Nintendo). [Game tree: the software designer chooses Don't design, ending the game with (0; 0), or Design; Nintendo then chooses Bargain hard (= pay a low price), giving (−\$50,000; \$250,000), or Bargain 'soft', giving (\$100,000; \$100,000).]

Strategies in extensive form
A strategy in an extensive form game is a complete description of the actions that a player performs at every action node at which it is her turn to move. Key point: it is not sufficient to specify responses only at those action nodes that are arrived at via some particular sequence of plausible play; a strategy must prescribe an action at every action node where that player moves.

Definition: The strategy set of agent i is the Cartesian product of the sets of children nodes of each information set belonging to i. Definition: An information set I is a subset of the nodes in a game tree belonging to player P such that:
- all i ∈ I belong to P;
- for i, j ∈ I there is no path from i to j;
- all i ∈ I have the same number of outgoing edges.

Sequential Prisoner's Dilemma (the dotted line means P2 doesn't know which node he is at). [Game tree: P1 chooses Confess or Deny; P2, without observing P1's move, chooses Confess or Deny. Payoffs (P1, P2): (C,C) = (−5,−5), (C,D) = (0,−10), (D,C) = (−10,0), (D,D) = (−1,−1).]

With perfect information, each information set is a singleton (you always know which node you are at). A strategy profile (s_1, s_2, …, s_n) determines a unique path from the root to some terminal node (where s_1 states what player 1 will do in every situation). We say this unique path is supported by the strategy profile. A path supported by a Nash equilibrium is called an equilibrium path. A Nash equilibrium in a sequential game (perfect or imperfect information) satisfies u_i(s_i*, s_{−i}*) ≥ u_i(s_i, s_{−i}*) for every s_i and every player i. Note that two different strategy profiles can have the same path. Every path from the root to a terminal node is supported by at least one strategy profile.

Example 4.9. [Game tree: P1 at node A chooses L (to node B) or R (to node C). P2 at B chooses L′ (node D, payoffs (2,1)) or R′ (node E, payoffs (0,3)); P2 at C chooses L″ (node F, payoffs (4,1)) or R″ (node G, payoffs (1,0)).]

R followed by L″ is the backward-induction path, giving (4,1). Strategies ({R}, {R′, L″}) and ({R}, {L′, L″}) both support it. ({R}, {R′, L″}) means P1 takes R, and P2 takes R′ if at node B and L″ if at node C. [Note: the notation is compact; you have to read it carefully to get the meaning.]

Theorem (Kuhn): Every sequential game with perfect information has a Nash equilibrium, which can be found by backwards induction. [The original slide shows a small tree: P1 at node A, P2 at nodes B and C, with terminal payoffs (1,0), (0,1), (2,2).]

Example 4.12 Stackelberg Duopoly Model
Stackelberg duopoly: a market with exactly two firms, where the leader (firm 1) first chooses how much to produce and the follower (firm 2) then chooses its own quantity. It corresponds to a sequential game and can be solved by backward induction: for each quantity q1, the follower chooses its best response q2. Profits are π_i(q1, q2) = q_i[p(q) − c_i], where q = q1 + q2, p(q) = A − q is the market clearing price when total output is q, and c_i is the marginal cost of production for firm i. That is, the profit of each firm is π_i(q1, q2) = q_i[A − q1 − q2 − c_i].

Solving by backwards induction
This is a two-person sequential game with two stages and perfect information. Find the follower's best response to each choice of q1: q2* maximizes π_2(q1, q2) = q2[A − q1 − q2 − c2] = −(q2)² + q2[A − q1 − c2]. The first derivative is −2q2 + A − q1 − c2 and the second derivative is −2 < 0, so the maximizer is q2* = (A − q1 − c2)/2.

Continuing: firm 1 should anticipate this response and choose q1 to maximize π_1(q1, q2*) = q1[A − q1 − (A − q1 − c2)/2 − c1] = ½(−q1² + (A + c2 − 2c1)q1). Setting the derivative −q1 + ½(A + c2 − 2c1) to zero gives q1* = ½(A + c2 − 2c1) and hence q2* = ¼(A + 2c1 − 3c2).
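As a quick numeric sanity check of these closed-form quantities, a grid search over the leader's quantity should agree with the formula. The parameter values A = 100, c1 = c2 = 10 are illustrative assumptions, not from the text:

```python
# Numeric check of the Stackelberg quantities
#   q1* = (A + c2 - 2*c1)/2  and  q2* = (A + 2*c1 - 3*c2)/4.

def follower_best_response(q1, A, c2):
    # Firm 2 maximizes q2*(A - q1 - q2 - c2); the FOC gives (A - q1 - c2)/2.
    return max((A - q1 - c2) / 2, 0.0)

def leader_profit(q1, A, c1, c2):
    q2 = follower_best_response(q1, A, c2)
    return q1 * (A - q1 - q2 - c1)

A, c1, c2 = 100.0, 10.0, 10.0
q1_star = (A + c2 - 2 * c1) / 2          # 45.0
q2_star = (A + 2 * c1 - 3 * c2) / 4      # 22.5

# Grid search over the leader's quantity (step 0.01) agrees with the formula.
grid = [i / 100 for i in range(0, 10001)]
q1_best = max(grid, key=lambda q1: leader_profit(q1, A, c1, c2))
print(q1_star, q2_star, q1_best)  # 45.0 22.5 45.0
```

Because the leader's reduced profit ½(−q1² + (A + c2 − 2c1)q1) is strictly concave, the grid maximizer is unique and matches the closed form exactly.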

Types of games

subgame perfect equilibrium
A strategy profile of a sequential game is a subgame perfect equilibrium if it is a Nash equilibrium for every subgame of the original game. In other words, the strategies must remain optimal even in parts of the tree that play never reaches. A Nash equilibrium that is not subgame perfect relies on an action that would not be optimal if the other player actually played differently and that part of the tree were reached.

Imperfect information
SPE is not an appropriate equilibrium concept here, because most games with imperfect information have too few proper subgames to rule out extraneous Nash equilibria. Alternative equilibrium concepts:
- Bayesian equilibrium / perfect Bayesian equilibrium
- Sequentially rational equilibrium
- Forward induction
- Trembling hand equilibrium

This is a topic of active research.

A Nash equilibrium that fails to be subgame perfect is also known as a Nash equilibrium supported by non-credible behavior. To find a subgame perfect equilibrium, use backward induction on the subgames of the original problem.

Bob and Betty. Bob and Betty must cook, wash dishes, and vacuum. Bob can't cook very well and just doesn't like to wash dishes, so they have concocted the following game for allocating the tasks. Betty moves first and chooses between cooking and doing the dishes. If she chooses dishes, then Bob chooses to Go Out or Cook. On the other hand, if Betty chooses to cook, then they simultaneously choose between the remaining two tasks: vacuuming and doing the dishes. The payoffs are at the end of the tree. You may conclude whatever you want about the relationships between the payoffs and the preferences of Bob and Betty for doing chores and being together.

Note: use the normal form game to pick what Betty should do at the information set containing nodes C and D.

There are three subgames: two proper subgames, one beginning at node B and one beginning at node A, and the game itself. There are two paths to Nash equilibria. The first (Path One), from the root through node A, is Betty: Dishes, Bob: Out. The other (Path Two), from the root through node B to the information set containing nodes C and D, is Betty: Cook, Bob: Vacuum, Betty: Vacuum. You must use the normal form on the subgame beginning at node B in order to find this second path. Path Two is supported by the strategy profile ({Cook, Vacuum}, {Out, Vacuum}): Betty plays Cook at the root and Vacuum at C or D; Bob plays Out at node A and Vacuum at node B. This path is a subgame perfect equilibrium. For the subgame starting at A, the proposed strategy profile reduces to ({·}, {Out}), which is a Nash equilibrium. For the subgame starting at node B, the strategy profile reduces to ({·, Vacuum}, {·, Vacuum}), which is a Nash equilibrium. Are there any other strategy profiles that will support Path Two?

The strategy profile ({Dishes, Vacuum}, {Out, Vacuum}) supports Path One, the road to a Nash equilibrium. At the root node Betty plays Dishes and at node A Bob plays Out. If they should happen to find themselves at the subgame starting at node B, then they both play vacuum, which is a Nash Equilibrium. But this strategy profile is not the only one that supports Path One. Path One is also supported by the strategy profile ({Dishes, Dishes}, {Out, Dishes}). Betty plays Dishes at the root and Dishes at nodes C or D; Bob plays Out at node A and Dishes at node B. This is a Nash equilibrium profile since Out is Bob's best response to a play of Dishes by Betty at the root node. But this is not a subgame perfect equilibrium since a play of Dishes by Bob at node B and a play of Dishes by Betty at C or D could be improved upon. In this case we have a strategy profile that involves a Nash equilibrium in one subgame, but noncredible plays in another subgame.

Consider the strategy profile ({Dishes, Vacuum}, {Out, Dishes}), which also supports Path One.
Is this profile a subgame perfect equilibrium? No. This profile does result in a Nash equilibrium in the subgame beginning at node A, but there is a hitch: in the subgame that includes the root, Betty would never play Dishes at the root, as called for by the profile. Are there any more strategy profiles that support Path One? Consider ({Dishes, Dishes}, {Out, Vacuum}) and explain why it is not a subgame perfect equilibrium. As in the normal form games we have seen, there may be multiple Nash equilibria in an extensive form game. The principle of subgame perfect equilibrium is to eliminate those Nash equilibria that are based on non-credible or unreasonable promises or threats. In the analysis of the second Bob and Betty example we eliminated two strategy profiles that involved a Nash equilibrium in one subgame but unreasonable behavior in other subgames.

In-Class Exercise. Ask the students to choose partners from the other side of the room, or have them imagine that each is playing with one person sitting on the other side of the room. Each student will eventually be asked to write either pink or purple. If both students in the real or imaginary pair write pink, the person on the right-hand side of the room gets 50 points and the person on the left-hand side gets 40 points. ("Right-hand" and "left-hand" are defined from the students' point of view.) If both write purple, the person on the left-hand side gets 50 points and the person on the right-hand side gets 40 points. If the answers don't match, neither player gets anything. To play without the delay tactic, simply ask the students to choose a color and write the choice. Then play again, immediately, but explain that you will flip a coin first. If it comes up heads, those on the right-hand side of the room get to write their answers first; otherwise those on the left-hand side write first. What happens if we delay?

Games like the battle of the two cultures or chicken have first-mover advantages in their sequential move versions; the tennis-point example has a second-mover advantage in its sequential-move version. Other games show no change in equilibrium as a result of the change in rules; games like the prisoners’ dilemma, in which both players have dominant strategies, fall into this category.

Prove or disprove! When both players have a dominant strategy, the dominant-strategy equilibrium will hold in both the simultaneous and the sequential versions of the game.

Sequential Monopolist View
What are Nash equilibria? Are they subgame perfect?

Thought Question How do we change a game to our advantage?
Use commitment, threats, and promises to change the nature of a game.

Commitment Reduce freedom of action by commitment.
Thereby securing a "first mover advantage." The commitment has to be observable, and it has to be credible (believable); reputation leads to credibility.

Commitment – An Example
Payoffs are (Teacher, Student):

| Teacher \ Student | Punctual | Late |
|-------------------|----------|------|
| Weak              | 4, 3     | 2, 4 |
| Tough             | 3, 2     | 1, 1 |

Commitment – An Example For those that intend to teach…
| Teacher \ Student | Punctual | Late |
|-------------------|----------|------|
| Weak              | 4, 3     | 2, 4 (N.E.) |
| Tough             | 3, 2     | 1, 1 |

Commitment – An Example But if we announce we are tough
| Teacher \ Student | Punctual | Late |
|-------------------|----------|------|
| Weak              | 4, 3     | 2, 4 |
| Tough             | 3, 2     | 1, 1 |

Commitment – An Example Get different NE
| Teacher \ Student | Punctual | Late |
|-------------------|----------|------|
| Weak              | 4, 3     | 2, 4 |
| Tough             | 3, 2 (N.E.) | 1, 1 |
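The shift in equilibrium caused by the commitment can be checked with a best-response search over the teacher-student payoffs above (taken as (Teacher, Student) order). A sketch; committing to Tough is modeled simply by deleting the Weak row:

```python
# Teacher-student game: commitment to Tough changes the equilibrium.
payoff = {  # (Teacher payoff, Student payoff)
    ("Weak", "Punctual"): (4, 3), ("Weak", "Late"): (2, 4),
    ("Tough", "Punctual"): (3, 2), ("Tough", "Late"): (1, 1),
}

def pure_nash(rows, cols):
    """Cells where the row is a best response to the column and vice versa."""
    return [
        (r, c)
        for r in rows for c in cols
        if all(payoff[(r, c)][0] >= payoff[(r2, c)][0] for r2 in rows)
        and all(payoff[(r, c)][1] >= payoff[(r, c2)][1] for c2 in cols)
    ]

print(pure_nash(["Weak", "Tough"], ["Punctual", "Late"]))  # [('Weak', 'Late')]
print(pure_nash(["Tough"], ["Punctual", "Late"]))          # [('Tough', 'Punctual')]
```

Without commitment the equilibrium is (Weak, Late) and the teacher gets 2; committing to Tough yields (Tough, Punctual) and the teacher gets 3.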

Strategic Moves and Threats!
Deterrence and compellence are achieved through either a threat with an implied promise or a promise with an implied threat. Either one requires credibility. A threat must involve some cost to the threatener as well: if the threatener preferred carrying out the threatened action anyway, she would simply do it, and the "promise" part of the threat would never be kept. This cost is problematic because it might tempt the threatener to avoid actually carrying out the threat, making the threat less credible. Thus imposing a cost on herself is a necessary condition for a threat to be successful, but not a sufficient one. CAREFUL!

Trade Negotiation. Payoffs are (U.S., Japan):

| U.S. \ Japan | Open | Closed |
|--------------|------|--------|
| Open         | 4, 3 | 3, 4   |
| Closed       | 2, 1 | 1, 2   |


Changing the game – A threat!
A threat: "We will close our market if you close yours." And a promise: "We will open our market if you open yours." This effectively reduces Japan's options to:
- If Japan stays open, the U.S. stays open, giving Japan 3 and the U.S. 4 (favorable to the U.S. compared with the no-threat scenario).
- If Japan stays closed, the U.S. closes as well, giving Japan 2 and the U.S. 1.

Trade Relations – Threats in Action.
The threat is credible partly because it is costly to the US. If playing Closed were not costly to the US, Japan would not believe that the US would otherwise play Open, and would never open her own borders. If Japan calls the US's bluff, the US is tempted not to carry out the threat; cutting off its own freedom of action in the future (since an unexecuted threat will not be believed again) may keep the threat credible. If the Japanese market is already open, the threat is part of a deterrence strategy; if it is closed, the threat is part of a compellence strategy. A good threat is one that does not need to be carried out. NOTE: if the size of the threat is too big to be credible, a probabilistic element may be introduced (a gamble on whether the threat will be carried out and on the size of the penalty).

Threats in action (cont)
Japan may respond by agreeing in principle and then stalling, banking on the US's lack of desire to impose a cost on itself, a technique called salami tactics. Salami tactics: fail to comply with the other's wishes (particularly in compellence situations) in such a small way that it is not worth the threatener's while to carry out her threat; if that works, transgress a little more. The way to oppose salami tactics is a graduated threat. The technique is also used in politics: a gradual process of threats and alliances as a means of overcoming opposition, allowing a player to exert influence and eventually dominate the political landscape, slice by slice.

Prisoners Dilemma – Promises to Keep!
Tit for Tat strategies are examples of promises that act as a deterrent to cheating.

Promises or Threats?
Deterrence ("prevent you from doing something"):
- Threat: "I will punish you if you do it." This only requires me to wait until you mess up; no monitoring.
- Promise: "I will reward you if you don't do it." This requires me to monitor you constantly.
For deterrence, threats are cheaper than promises.

Compellence ("I want you to do something"):
- Threat: "I will punish you if you don't do it." This requires me to monitor you constantly.
- Promise: "I will reward you if you do it." This only requires me to wait until you do what I want!
For compellence, promises are cheaper than threats.

Countering Threats:
- Irrationality: act so crazy that threats have no effect on your behavior.
- Cut off communication so threats cannot reach you.
- Open escape routes for the enemy, tempting them to renege on threats.
- Undermine the opponent's motive to maintain a reputation by promising secrecy if she does not punish you.

Credibility Devices. Since credibility brings the temptation to renege, certain devices are required to ensure it. One family of devices reduces freedom of action:
- Automatic fulfillment (a doomsday device). For example, according to a book obtained by the Huffington Post, Saudi Arabia has crafted a plan to protect itself from a possible invasion or internal attack: a series of explosives, including radioactive "dirty bombs," that would cripple Saudi Arabian oil production and distribution systems for decades.
- Burning bridges.
- Cutting off communication, so nobody can argue with you about your threat.

Credibility Devices Changing payoffs by using reputation:
- Make the game part of a larger game, so that payoffs in the current game are linked to repercussions in the other game.
- Divide the game into little subgames to allow reputation to work.
- Reduce monitoring costs, and thus change payoffs, by allowing players to monitor each other.
- Irrationality: make the opponent worry that you just might Nuke because you are irrational.
- Contracts: have a way of punishing if the deal is not kept.

Brinkmanship: the policy or practice, especially in politics and foreign policy, of pushing a dangerous situation to the brink of disaster (to the limits of safety) in order to achieve the most advantageous outcome by forcing the opposition to make concessions. This might be achieved through diplomatic maneuvers, creating the impression that one is willing to use extreme methods rather than concede. During the Cold War, the threat of nuclear force was often used as such a deterrent.

Solving Extensive Form Games
The usual procedure is to convert the extensive form game to strategic form and find its equilibria. However, some of these equilibria have important drawbacks because they ignore the dynamic nature of the extensive form. Reinhard Selten was the first to argue, in his 1965 article, that some Nash equilibria are "more reasonable" than others.

Selten’s Game

The strategic form representation has two pure-strategy Nash equilibria, (D, L) and (U, R). Look closely at the Nash equilibrium (U, R) and what it implies for the extensive form. In the profile (U, R), player 2's information set is never reached, and she loses nothing by playing R there. But there is something "wrong" with this equilibrium: if player 2's information set is ever reached, she would be strictly better off choosing L instead of R. In effect, player 2 is threatening player 1 with an action that would not be in her own interest to carry out. We are interested in the sequencing of moves precisely because players get to reassess their plans of action in light of past moves.

To anticipate a bit of what follows, the problem with the (U, R) solution is that it specifies a non-credible action at an information set that is off the equilibrium path of play. Player 2's information set is never reached if player 1 chooses U. Consequently, Nash equilibrium cannot pin down the optimality of the action at that information set.

Little Horsey. Consider the following simple game. Player 1 chooses between U, M, and D. If he chooses D, the game ends. If he chooses either U or M, player 2 chooses between L and R without knowing which action player 1 has taken, except that it was not D.

Where are the NE? Convert the game to normal form and use the standard techniques.

There are two players and player 1 receives a book which, with probability p is a small game theory pocket reference, and with probability 1 − p is a Star Trek data manual. The player sees the book, wraps it up, and decides whether to offer it to player 2 as a gift. Player 2 hates Star Trek and is currently suffering in a graduate game theory course, so she would prefer to get one but not the other. Unfortunately, she cannot know what is being offered until she accepts it.

Player 1 observes Nature’s move and offers the wrapped gift to player 2. If the gift is accepted, then player 1 derives a positive payoff because everyone likes when their gifts are accepted. Player 1 hates the humiliation of having a gift rejected, so the payoff is −1. Player 2 strictly prefers accepting the game theory book to not accepting it; she is indifferent between not accepting this book and accepting the Star Trek manual, but hates rejecting the Star Trek manual more than the game theory book because while dissing game theory is cool, dissing Star Trek is embarrassing. Let’s construct the strategic form of this game

Two Nash equilibria: (GG, Y) and (NN, N).
The strategy GN means "give if I get the game theory book, don't give if I get the Star Trek manual." (GG, Y) is a Nash equilibrium for any value of p, because p − 1 < 0 < p < 1: player 1 offers the gift regardless of its type, and player 2 always accepts. In addition, (NN, N) is a Nash equilibrium: player 1 never offers any gift, and player 2 refuses any gift if offered. But (NN, N) is clearly irrational: if the game ever reaches player 2's information set, accepting a gift strictly dominates not accepting it, regardless of the gift.

Because the strategic form ignores timing, Nash equilibrium only ensures optimality at the start of the game. That is, equilibrium strategies are optimal if the other players follow their equilibrium strategies. But we cannot see whether the strategies continue to be optimal once the game begins.

We shall require that player 2 form beliefs about the probability of being at any particular node in her information set. Obviously, if the information set consists of a single node, then, if that information set is reached, the probability of being at that node is 1. Let’s look at the game in the figure. Let q denote player 2’s belief that she is at the left node in her information set (given that the set is reached, this is the probability of player 1 having offered the game theory book), and 1 − q be the probability that she is at the right node (given that the set is reached, this is the probability of player 1 having offered the Star Trek book).

That is, p is player 2’s initial belief (or the prior) of the book being a game theory reference; and q is player 2’s conditional belief (updated belief, or posterior) For example, I believe that if you receive a game theory book, there is a 90% chance you will offer the book as a gift, but if you receive star trek, there is a 50% chance you will offer it as a gift. Requirement 1 (Beliefs). At each information set, the player who gets to move must have a belief about which node in the information set has been reached by the play of the game. The belief will be a probability distribution over the nodes.

A strategy is sequentially rational for player i at the information set h if player i would actually want to choose the action prescribed by the strategy if h is reached. A continuation game refers to the information set and all nodes that follow from that information set.

Take any two nodes x and y such that y follows x in the game tree, and consider the mixed strategy profile σ. Let P(y|σ, x) denote the probability of reaching y starting from x and following the moves prescribed by σ. That is, P(y|σ, x) is the conditional probability that the path of play goes through y after x if all players choose according to σ and the game starts at x.

Player i's expected utility in the continuation game starting at node x is then:
U_i(σ|x) = Σ_{z∈Z} P(z|σ, x) u_i(z),
where Z is the set of terminal nodes in the game and u_i(·) is the Bernoulli payoff function specifying player i's utility from the outcome z ∈ Z.

In the gift game, player 2 can calculate the expected payoff from choosing Y, which is q(1) + (1 − q)(0) = q, and the expected payoff from choosing N, which is q(0) + (1 − q)(−1) = q − 1. Since q > q − 1 for all values of q, it is never optimal for player 2 to choose N, regardless of her beliefs. Therefore the strategy N is not sequentially rational: there is no belief player 2 could hold that would make N optimal at her information set. In other words, the unique sequentially rational strategy is to choose Y with certainty.
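The dominance argument can be verified numerically for any belief q, using player 2's payoffs as stated in the text (1 for accepting the game theory book, 0 for accepting the Star Trek manual or rejecting the book, −1 for rejecting the manual):

```python
# Sequential rationality at player 2's information set in the gift game.
def expected_payoff(action, q):
    """q = belief that the wrapped gift is the game theory book."""
    if action == "Y":
        return q * 1 + (1 - q) * 0      # = q
    return q * 0 + (1 - q) * (-1)       # = q - 1

# Y beats N for every belief q in [0, 1], so N is never sequentially rational.
beliefs = [i / 10 for i in range(11)]
assert all(expected_payoff("Y", q) > expected_payoff("N", q) for q in beliefs)
print([round(expected_payoff("Y", q) - expected_payoff("N", q), 10) for q in beliefs])
# the advantage of Y over N is 1 at every belief
```

Because the gap q − (q − 1) = 1 is constant, no belief can rationalize N, exactly as the text argues.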