Communication Networks A Second Course Jean Walrand Department of EECS University of California at Berkeley.

Repeated, Bargaining, Dynamic
- Motivation
- Repeated Games
- Bargaining
- Dynamic Games

Motivation
So far: one-shot (static) games.
Many games are repeated or dynamic:
- Repeated interactions (market, network usage)
- Multiple stages (chess, card games)
- Negotiations (bargaining)
New effects:
- Past actions affect information
- Reputation of players
- Threats
- Learning

Repeated Games
- Example 1: Prisoners’ Dilemma
- Example 2: Battle of the Sexes
- Folk Theorem

Repeated Prisoners’ Dilemma
Model (P1 picks the row, P2 picks the column; each cell lists the rewards of P1, P2):
         L       R
  T    4, 4    1, 5
  B    5, 1    2, 2
One-shot: (B, R) is the unique NE.
Intuition: If players know the game is repeated, they might want to collaborate and play (T, L).
Finitely repeated game: Assume the game is repeated N times (N > 1).
- The reward of each player is the sum of the step rewards.
- Players see and remember the past actions.
- Both players play simultaneously at each step.
- Both players know the matrix of rewards.
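The one-shot claim can be checked by enumeration. The following is a minimal sketch, assuming the reward matrix reads T: (4, 4), (1, 5) and B: (5, 1), (2, 2); the function name `pure_nash_equilibria` is made up for this illustration. It only tests pure actions, which is enough here because B strictly dominates T and R strictly dominates L.

```python
# Stage-game rewards: rows T/B belong to player 1, columns L/R to player 2.
REWARDS = {
    ("T", "L"): (4, 4), ("T", "R"): (1, 5),
    ("B", "L"): (5, 1), ("B", "R"): (2, 2),
}
ROWS, COLS = ("T", "B"), ("L", "R")


def pure_nash_equilibria(rewards, rows, cols):
    """Return the action pairs from which no player gains by deviating alone."""
    equilibria = []
    for a in rows:
        for b in cols:
            r1, r2 = rewards[(a, b)]
            # Player 1 cannot do better by switching rows against column b.
            best_for_1 = all(rewards[(a2, b)][0] <= r1 for a2 in rows)
            # Player 2 cannot do better by switching columns against row a.
            best_for_2 = all(rewards[(a, b2)][1] <= r2 for b2 in cols)
            if best_for_1 and best_for_2:
                equilibria.append((a, b))
    return equilibria


print(pure_nash_equilibria(REWARDS, ROWS, COLS))
```

Both players would prefer (T, L) with rewards (4, 4), yet the only pair that survives the check is (B, R) with rewards (2, 2).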

Repeated Prisoners’ Dilemma
Definitions:
- Strategy: Specifies what to do at each step, given the available information.
- Subgame Perfect Equilibrium (SPE): Pair of strategies from which no player has an incentive to deviate unilaterally, at any step and after any history of past actions.
Theorem: The unique SPE for the N-time repeated PD is as follows: Player 1 plays B at every step; Player 2 plays R at every step.
Proof: Backward induction. At the last step the players face the one-shot game, so they play (B, R) whatever happened before. Given that, nothing done at step N – 1 can change the last step, so they play (B, R) at step N – 1 as well, and so on back to the first step.
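The backward-induction argument can be traced step by step. This is a minimal sketch under the same reading of the reward matrix (T: (4, 4), (1, 5); B: (5, 1), (2, 2)); the helper names are hypothetical. It relies on the fact used in the proof: the continuation reward does not depend on the current actions, so it only adds a constant to every cell of the stage game.

```python
REWARDS = {
    ("T", "L"): (4, 4), ("T", "R"): (1, 5),
    ("B", "L"): (5, 1), ("B", "R"): (2, 2),
}
ROWS, COLS = ("T", "B"), ("L", "R")


def stage_equilibria(continuation):
    """Pure NEs of the stage game when a fixed continuation reward is added to each cell."""
    c1, c2 = continuation
    total = {cell: (r1 + c1, r2 + c2) for cell, (r1, r2) in REWARDS.items()}
    found = []
    for a in ROWS:
        for b in COLS:
            ok1 = all(total[(a2, b)][0] <= total[(a, b)][0] for a2 in ROWS)
            ok2 = all(total[(a, b2)][1] <= total[(a, b)][1] for b2 in COLS)
            if ok1 and ok2:
                found.append((a, b))
    return found


def backward_induction(n_steps):
    """Solve the N-step game from the last step back; return the play path and total rewards."""
    continuation = (0, 0)  # nothing is earned after the last step
    path = []
    for _ in range(n_steps):
        (cell,) = stage_equilibria(continuation)  # raises if the NE is not unique
        path.append(cell)
        r1, r2 = REWARDS[cell]
        continuation = (continuation[0] + r1, continuation[1] + r2)
    return path[::-1], continuation


print(backward_induction(5))
```

For N = 5 the path is (B, R) at every step and each player collects 10 in total, compared with 20 if they could commit to (T, L).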

Repeated Prisoners’ Dilemma
Infinitely repeated game: The game is repeated forever.
- The reward of each player is the sum of the discounted step rewards: $R_i = (1 - \beta) \sum_{n=0}^{\infty} \beta^n R_i(n)$, where $0 < \beta < 1$.
- Players see and remember the past actions.
- Both players play simultaneously at each step.
- Both players know the matrix of rewards.
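The factor in front of the sum normalizes the reward so that a constant step reward c yields a total reward c. A small numerical sketch, assuming the discount factor is called beta and truncating the infinite sum at a finite horizon (both the name and the truncation are choices made for this illustration):

```python
def discounted_reward(step_rewards, beta):
    """Normalized discounted sum (1 - beta) * sum_n beta**n * R(n) over a finite prefix."""
    return (1 - beta) * sum(beta ** n * r for n, r in enumerate(step_rewards))


beta = 0.9
horizon = 2000  # truncation length; beta**horizon is negligible for beta = 0.9

# Constant step reward 4, as when (T, L) is played forever.
always_cooperate = discounted_reward([4] * horizon, beta)

# Step reward 5 once, then 2 forever, as when a player deviates and (B, R) follows.
deviate_once = discounted_reward([5] + [2] * (horizon - 1), beta)

print(always_cooperate, deviate_once)
```

With beta = 0.9 the first value is close to 4 and the second is close to (1 - 0.9) * 5 + 0.9 * 2 = 2.3.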

Repeated Prisoners’ Dilemma
Definition: r is in C $\Leftrightarrow$ r is a convex combination of the reward pairs and r dominates a NE, i.e., each component of r exceeds the corresponding NE reward (here 2, the reward of (B, R)).
Theorem: Consider the infinitely repeated PD. For any r in C, there is some $\beta_0 < 1$ such that, for every $\beta \ge \beta_0$, the game has an SPE whose rewards approximate r.

Repeated Prisoners’ Dilemma
Proof: Pick r in C. Then r $\approx$ the rewards of playing the pair $(a_k, b_k)$ of actions a fraction $p_k \approx N_k/N$ of the time, k = 1, …, 4, where $N = N_1 + \dots + N_4$.
Strategy of player 1 [2, resp.]: “We play $(a_1, b_1)$ for $N_1$ steps, then … $(a_4, b_4)$ for $N_4$ steps, and we repeat forever. If you deviate at any time, I play B [R, resp.] forever thereafter.”
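The cycle construction can be illustrated with exact arithmetic. This is a sketch only: the step counts below are an arbitrary example, not values from the slides, and `cycle_average` is a hypothetical helper. It computes the average reward pair of one cycle, which is what the discounted rewards approach when the discount factor is close to 1.

```python
from fractions import Fraction

REWARDS = {
    ("T", "L"): (4, 4), ("T", "R"): (1, 5),
    ("B", "L"): (5, 1), ("B", "R"): (2, 2),
}


def cycle_average(counts):
    """Average reward pair when each cell is played counts[cell] times per cycle."""
    n = sum(counts.values())
    r1 = sum(Fraction(c, n) * REWARDS[cell][0] for cell, c in counts.items())
    r2 = sum(Fraction(c, n) * REWARDS[cell][1] for cell, c in counts.items())
    return r1, r2


# Example cycle of N = 10 steps (illustrative numbers).
counts = {("T", "L"): 6, ("T", "R"): 3, ("B", "L"): 1, ("B", "R"): 0}
r = cycle_average(counts)
in_C = r[0] > 2 and r[1] > 2  # dominates the NE rewards (2, 2)
print(r, in_C)
```

This cycle averages (16/5, 4), which dominates (2, 2) and is therefore a point of C.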

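Repeated Prisoners’ Dilemma
Proof (continued):
Reward of P2 at steps n, n + 1, … if he deviates at time n: at most $(1 - \beta)\beta^n \cdot 5 + \beta^{n+1} \cdot 2 =: v$, since a deviation earns at most 5 at step n and P1 then plays B forever, which leaves P2 at most 2 per step.
Reward of P2 at steps n, n + 1, … if he does not deviate: at least $(1 - \beta)\beta^n \cdot 1 + \beta^{n+1} r_2 - \epsilon =: w$ if $\beta$ is large enough.
Note that w > v for $\beta$ large enough since $r_2 > 2$.
The strategy is an SPE: Say P2 deviates at time n. Then P1 plays B forever and, knowing this, P2 must play R forever, and P1 must accordingly play B forever since (B, R) is a NE.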
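For the simplest target, r = (4, 4) obtained by playing (T, L) at every step, the comparison can be done exactly. The sketch below assumes the threat strategy described in the proof and the reward matrix T: (4, 4), (1, 5); B: (5, 1), (2, 2); the function names are made up. Rewards are normalized from the deviation step onward.

```python
from fractions import Fraction


def cooperate_value():
    """Normalized reward of playing (T, L) forever."""
    return Fraction(4)


def deviate_value(beta):
    """Best deviation: 5 once, then the punishment (B, R) worth 2 forever."""
    return (1 - beta) * 5 + beta * 2


def threat_is_enough(beta):
    """True when deviating is not profitable for this discount factor."""
    return cooperate_value() >= deviate_value(beta)


for beta in (Fraction(3, 10), Fraction(1, 3), Fraction(1, 2)):
    print(beta, deviate_value(beta), threat_is_enough(beta))
```

Since the deviation is worth 5 - 3 * beta, the threat deters it exactly when beta is at least 1/3, so 1/3 plays the role of the threshold for this particular r. The game is symmetric, so the same bound applies to both players.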

Repeated Prisoners’ Dilemma
Comments:
- The SPE that enforces the rewards r is a “threat strategy.”
- The key step in the argument is to show that the threat strategy is an SPE. In other words, the threat is “credible.”
- This is the case because the threat is to revert to playing a NE forever.

Repeated Battle of the Sexes
Model (P1 picks the row, P2 picks the column; each cell lists the rewards of P1, P2):
         L       R
  T    4, 1    2, 2
  B    0, 0    1, 4
Theorem: Consider the infinitely repeated game. For any r in C, there is some $\beta_0 < 1$ such that, for every $\beta \ge \beta_0$, the game has an SPE whose rewards approximate r.

Folk Theorem

Bargaining
Question: What is a reasonable agreement between two parties if negotiations are expensive?
Model: Alice and Bob bargain on how to divide a pie of value 1. After each step, the value of the pie gets multiplied by $\delta_1 < 1$ for Alice and by $\delta_2 < 1$ for Bob. Alice makes the initial offer, which Bob either accepts or refuses; if he refuses, he makes an alternate offer, and so on.
Theorem (Rubinstein-Stahl, 1972, 1982): SPE: Alice always demands the fraction $x := (1 - \delta_2)/(1 - \delta_1 \delta_2)$ of the pie and Bob accepts an offer iff it is at least $z := \delta_2 (1 - \delta_1)/(1 - \delta_1 \delta_2)$.

Bargaining
Theorem (Rubinstein-Stahl, 1972, 1982): SPE: Alice always demands the fraction $x_1 := (1 - \delta_2)/(1 - \delta_1 \delta_2)$ of the pie and Bob accepts an offer iff it is at least $z_2 := \delta_2 (1 - \delta_1)/(1 - \delta_1 \delta_2)$.
Comments:
- $z_2 = 1 - x_1$ is the smallest offer Bob accepts.
- Alice cannot gain by proposing a larger $1 - x_1$ to Bob.
- If Alice offers less than $1 - x_1$, Bob refuses and offers $x_2 = (1 - \delta_1)/(1 - \delta_1 \delta_2)$, so that Alice would get only $\delta_1 (1 - x_2) = \delta_1^2 x_1 < x_1$.
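The identities in these comments can be verified with exact fractions. A minimal sketch, assuming the per-step factors are written d1 for Alice and d2 for Bob and taking d1 = 9/10, d2 = 4/5 purely as example values (the function name is hypothetical):

```python
from fractions import Fraction


def rubinstein_shares(d1, d2):
    """Return (x1, x2, z2) for per-step factors d1 (Alice) and d2 (Bob), both in (0, 1)."""
    x1 = (1 - d2) / (1 - d1 * d2)  # Alice's demand when she proposes
    x2 = (1 - d1) / (1 - d1 * d2)  # Bob's demand when he proposes
    z2 = d2 * x2                   # smallest offer Bob accepts
    return x1, x2, z2


d1, d2 = Fraction(9, 10), Fraction(4, 5)  # example values only
x1, x2, z2 = rubinstein_shares(d1, d2)

print("Alice demands", x1)
print("Bob accepts at least", z2, "which equals 1 - x1:", z2 == 1 - x1)
print("Alice's value after a refusal:", d1 * (1 - x2), "=", d1 ** 2 * x1)
```

Writing z2 as d2 * x2 makes the logic visible: Bob is indifferent between receiving z2 now and receiving his own demand x2 one step later, when the pie is worth only d2 to him. With these example values Alice demands 5/7, Bob accepts any offer of at least 2/7, and a rejected lowball offer would leave Alice with 81/140, which is less than 5/7.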

Bargaining
Theorem: SPE: Alice always demands the fraction $x_1 := (1 - \delta_2)/(1 - \delta_1 \delta_2)$ of the pie and Bob accepts an offer iff it is at least $z_2 := \delta_2 (1 - \delta_1)/(1 - \delta_1 \delta_2)$.
Proof:

Dynamic Games
- Example 1
- Example 2
- Example 3

Dynamic Game: Example 1
If P1 does not choose L:
Claim: 2 NEs = (L, L) and (R, R)
Fact: Only one SPE: (R, R)
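The gap between NE and SPE can be reproduced on a small tree. The game tree of this example is not available in the text, so the sketch below uses a hypothetical tree with made-up rewards, chosen only so that it shows the same pattern: P1 picks L, which ends the game with rewards (1, 2), or R, after which P2 picks L for (0, 0) or R for (2, 1).

```python
# Hypothetical tree; these rewards are NOT taken from the slides.
END_IF_P1_L = (1, 2)                      # P1 picks L: the game ends
AFTER_P1_R = {"L": (0, 0), "R": (2, 1)}   # P1 picks R: P2 then picks L or R
ACTIONS = ("L", "R")


def outcome(a1, a2):
    """Rewards when P1 plans a1 and P2 plans a2 (a2 only matters after R)."""
    return END_IF_P1_L if a1 == "L" else AFTER_P1_R[a2]


def nash_equilibria():
    """Plans (a1, a2) from which no player gains by changing only his own plan."""
    found = []
    for a1 in ACTIONS:
        for a2 in ACTIONS:
            u1, u2 = outcome(a1, a2)
            ok1 = all(outcome(d, a2)[0] <= u1 for d in ACTIONS)
            ok2 = all(outcome(a1, d)[1] <= u2 for d in ACTIONS)
            if ok1 and ok2:
                found.append((a1, a2))
    return found


def subgame_perfect():
    """Backward induction: solve P2's subgame first, then P1's choice."""
    a2 = max(ACTIONS, key=lambda a: AFTER_P1_R[a][1])
    a1 = max(ACTIONS, key=lambda a: outcome(a, a2)[0])
    return (a1, a2)


print(nash_equilibria(), subgame_perfect())
```

In this hypothetical tree (L, L) is a NE only because P2 threatens to play L after R, which he would not carry out once that subgame is reached; backward induction discards the threat and leaves (R, R).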

Dynamic Game: Example 2
SPE:
P1: L
P2: R
P1: R if P2 = L, L otherwise

Dynamic Game: Example 3 Cannot solve. Equivalent matrix game
