Published by Alejandra Hambly. Modified over 4 years ago.

1
Evolution and Repeated Games
D. Fudenberg (Harvard)
E. Maskin (IAS, Princeton)

2
Theory of repeated games important
– central model for explaining how self-interested agents can cooperate
– used in economics, biology, political science and other fields

3
But theory has a serious flaw:
– although cooperative behavior is possible, so is uncooperative behavior (and everything in between)
– theory doesn’t favor one behavior over another
– theory doesn’t make sharp predictions

4
Evolution (biological or cultural) can promote efficiency
– might hope that uncooperative behavior will be “weeded out”
– this view expressed in Axelrod (1984)

5
Basic idea:
Start with a population playing the repeated-game strategy Always D
Consider a small group of mutants using Conditional C (play C until someone plays D, thereafter play D)
– does essentially the same against Always D as Always D does
– does much better against Conditional C than Always D does
Thus Conditional C will invade Always D
– uncooperative behavior driven out
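The comparison on this slide can be checked with a small simulation. The following is only a sketch: the stage-game payoffs (2 for mutual C, 0 for mutual D, 3 and −1 for a lone defector and a lone cooperator) are an assumption inferred from the arithmetic on a later slide, and an undiscounted average over a long finite horizon stands in for a small discount rate r. The function names are illustrative, not from the talk.

```python
# Assumed Prisoner's Dilemma payoffs: (payoff to player 1, payoff to player 2).
PAYOFF = {
    ("C", "C"): (2, 2),
    ("C", "D"): (-1, 3),
    ("D", "C"): (3, -1),
    ("D", "D"): (0, 0),
}

def always_d(own_history, other_history):
    return "D"

def conditional_c(own_history, other_history):
    # Play C until someone has played D, thereafter play D.
    if "D" in own_history or "D" in other_history:
        return "D"
    return "C"

def average_payoffs(strategy_1, strategy_2, periods=100):
    """Average per-period payoffs of the two strategies (no mistakes)."""
    h1, h2 = [], []
    total_1 = total_2 = 0
    for _ in range(periods):
        a1 = strategy_1(h1, h2)
        a2 = strategy_2(h2, h1)
        u1, u2 = PAYOFF[(a1, a2)]
        total_1 += u1
        total_2 += u2
        h1.append(a1)
        h2.append(a2)
    return total_1 / periods, total_2 / periods

dd = average_payoffs(always_d, always_d)            # Always D against itself
cd = average_payoffs(conditional_c, always_d)       # mutant against incumbent
cc = average_payoffs(conditional_c, conditional_c)  # mutant against mutant
print(dd, cd, cc)
```

Under these assumptions Conditional C loses only the first period against Always D (average −0.01 versus 0 over 100 periods) but earns 2 per period against itself, which is the sense in which it "does essentially the same" against the incumbent and "much better" against its own kind.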

6
But consider ALT: alternate between C and D until pattern broken, thereafter play D
ALT can’t be invaded by some other strategy
– other strategy would have to alternate, or else would do much worse against ALT than ALT does
Thus ALT is “evolutionarily stable”
But ALT is quite inefficient (average payoff 1)
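A minimal sketch of ALT, again assuming the payoffs 2, 0, 3, −1 inferred from a later slide and an undiscounted long-run average in place of a small r. It assumes ALT starts with C (the slide does not say which action comes first) and uses Always D as one example of a non-alternating rival.

```python
# Assumed Prisoner's Dilemma payoffs: (payoff to player 1, payoff to player 2).
PAYOFF = {
    ("C", "C"): (2, 2),
    ("C", "D"): (-1, 3),
    ("D", "C"): (3, -1),
    ("D", "D"): (0, 0),
}

def alt(own_history, other_history):
    # Intended pattern (assumed to start with C): C, D, C, D, ...
    for t, (a, b) in enumerate(zip(own_history, other_history)):
        expected = "C" if t % 2 == 0 else "D"
        if a != expected or b != expected:
            return "D"  # pattern broken: play D forever
    return "C" if len(own_history) % 2 == 0 else "D"

def always_d(own_history, other_history):
    return "D"

def average_payoffs(strategy_1, strategy_2, periods=100):
    """Average per-period payoffs of the two strategies (no mistakes)."""
    h1, h2 = [], []
    total_1 = total_2 = 0
    for _ in range(periods):
        a1 = strategy_1(h1, h2)
        a2 = strategy_2(h2, h1)
        u1, u2 = PAYOFF[(a1, a2)]
        total_1 += u1
        total_2 += u2
        h1.append(a1)
        h2.append(a2)
    return total_1 / periods, total_2 / periods

alt_vs_alt = average_payoffs(alt, alt)      # ALT against itself
d_vs_alt = average_payoffs(always_d, alt)   # non-alternating rival against ALT
print(alt_vs_alt, d_vs_alt)
```

ALT earns an average of 1 against itself, while the rival gains 3 once and then 0 forever (average 0.03), so it does much worse against ALT than ALT does.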

7
Still, ALT highly inflexible
– relies on perfect alternation
– if pattern broken, get D forever
What if there is a (small) probability of mistake in execution?
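To see how fragile ALT is, the sketch below injects a single execution mistake. The `mistakes` argument is a hypothetical, deterministic stand-in for the random mistakes with probability p, chosen so that the output is reproducible; the payoffs and the assumption that ALT starts with C are the same illustrative assumptions as before.

```python
# Assumed Prisoner's Dilemma payoffs: (payoff to player 1, payoff to player 2).
PAYOFF = {
    ("C", "C"): (2, 2),
    ("C", "D"): (-1, 3),
    ("D", "C"): (3, -1),
    ("D", "D"): (0, 0),
}

def alt(own_history, other_history):
    # Intended pattern (assumed to start with C): C, D, C, D, ...
    for t, (a, b) in enumerate(zip(own_history, other_history)):
        expected = "C" if t % 2 == 0 else "D"
        if a != expected or b != expected:
            return "D"  # pattern broken: play D forever
    return "C" if len(own_history) % 2 == 0 else "D"

def play(strategy_1, strategy_2, periods, mistakes=None):
    """Return the realized action path.

    `mistakes` maps (period, player) to the action played by mistake.
    Histories record realized actions, so both players see the mistake.
    """
    mistakes = mistakes or {}
    h1, h2, path = [], [], []
    for t in range(periods):
        a1 = mistakes.get((t, 1), strategy_1(h1, h2))
        a2 = mistakes.get((t, 2), strategy_2(h2, h1))
        h1.append(a1)
        h2.append(a2)
        path.append((a1, a2))
    return path

def average_payoff_1(path):
    return sum(PAYOFF[pair][0] for pair in path) / len(path)

clean = play(alt, alt, 100)
# Player 1 intends C in period 2 but plays D by mistake.
broken = play(alt, alt, 100, mistakes={(2, 1): "D"})
print(broken[:6])
print(average_payoff_1(clean), average_payoff_1(broken))
```

After the one mistake both players play D in every remaining period, so player 1's average over 100 periods falls from 1 to 0.05.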

8
Consider mutant strategy identical to ALT except if (by mistake) alternating pattern broken
– signals “intention” to cooperate by playing C in following period
– if other strategy plays C too, […]
– if other strategy plays D, […]

9
Main results in paper (for 2-player symmetric repeated games)
(1) If s evolutionarily stable and
– discount rate r small (future important)
– mistake probability p small (but p > 0)
then s (almost) “efficient”
(2) If payoffs (v, v) “efficient”, then there exists ES strategy s (almost) attaining (v, v), provided
– r small
– p small relative to r
Generalizes Fudenberg-Maskin (1990), in which r = p = 0

10
Finite symmetric 2-player game […] if […]
normalize payoffs so that […]

11
[…] strongly efficient if […]

12
Repeated game: g repeated infinitely many times
period t history […]
H = set of all histories
repeated game strategy […]
– assume finitely complex (playable by finite computer)
in each period, probability p that i makes mistake
– chooses […] (equal probabilities for all actions)
– mistakes independent across players

13
13

14
informally, s is evolutionarily stable (ES) if no mutant s' can invade a population with big proportion s and small proportion s'
formally, s is ES w.r.t. […] if for all […] and all […]
evolutionary stability
– expressed statically here
– but can be given precise dynamic meaning

15
population of […]
suppose time measured in “epochs” T = 1, 2, ...
strategy state in epoch T […]
– most players in population use […]
group of mutants (of size a) plays s'
– a drawn randomly from […]
– s' drawn randomly from finitely complex strategies
M random drawings of pairs of players
– each pair plays repeated game
[…] = strategy with highest average score

16
Theorem 1: For any […] there exists […] such that, for all […], there exists […] such that, for all […]
(i) if s not ES, […]
(ii) if […]

17
Let […]
Theorem 2: Given […], there exist […] such that, for all […], if s is ES w.r.t. […] then […]

18
18

19
Proof: Suppose […]
will construct mutant s' that can invade
let […]
if s = ALT, […] = any history for which alternating pattern broken

20
Construct s' so that
if h not a continuation of […], […]
after […], strategy s'
– “signals” willingness to cooperate by playing differently from s for 1 period (assume s is pure strategy)
– if other player responds positively, plays strongly efficiently thereafter
– if not, plays according to s thereafter
after […], strategy s'
– responds positively if other strategy has signaled, and thereafter plays strongly efficiently
– plays according to s otherwise

21
because […] is already worst history, s' loses for only 1 period by signaling (small loss if r small)
if p small, probability that s' “misreads” other player’s intention is small
hence, s' does nearly as well against s as s does against itself (even after […])
s' does very well against itself (strong efficiency), after […]

22
remains to check how well s does against s'
by definition of […], […]
Ignoring effect of p, […]
Also, after deviation by s', punishment started again, and so […]
Hence […]
so s does appreciably worse against s' than s' does against s'

23
Summing up, we have: […]
s is not ES

24
Theorem 2 implies for Prisoner’s Dilemma that, for any […], […]
doesn’t rule out punishments of arbitrary (finite) length

25
Consider strategy s with “cooperative” and “punishment” phases
– in cooperative phase, play C
– stay in cooperative phase until one player plays D, in which case go to punishment phase
– in punishment phase, play D
– stay in punishment phase for m periods (and then go back to cooperative phase) unless at some point some player chooses C, in which case restart punishment
For any m, […]
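The two-phase strategy above is a small finite automaton, which can be sketched as follows. This is an illustration only: `make_punisher` and the `mistakes` argument are hypothetical names, and the single scheduled mistake stands in for the random mistakes with probability p.

```python
def make_punisher(m):
    """Strategy with a cooperative phase and an m-period punishment phase."""
    def strategy(own_history, other_history):
        remaining = 0  # punishment periods still to serve; 0 = cooperative phase
        for a, b in zip(own_history, other_history):
            if remaining == 0:
                # Cooperative phase: any D sends play to the punishment phase.
                if "D" in (a, b):
                    remaining = m
            elif "C" in (a, b):
                # Punishment phase: any C restarts the punishment.
                remaining = m
            else:
                # Punishment phase, both played D: one period served.
                remaining -= 1
        return "C" if remaining == 0 else "D"
    return strategy

def play(strategy_1, strategy_2, periods, mistakes=None):
    """Realized action path; `mistakes` maps (period, player) to a forced action."""
    mistakes = mistakes or {}
    h1, h2, path = [], [], []
    for t in range(periods):
        a1 = mistakes.get((t, 1), strategy_1(h1, h2))
        a2 = mistakes.get((t, 2), strategy_2(h2, h1))
        h1.append(a1)
        h2.append(a2)
        path.append((a1, a2))
    return path

s = make_punisher(3)
# Player 1 plays D by mistake in period 1; punishment lasts 3 periods.
path = play(s, s, 7, mistakes={(1, 1): "D"})
print(path)
```

With m = 3, the mistake in period 1 is followed by three periods of (D, D) and then a return to (C, C), so the length of the punishment is exactly the parameter m.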

26
Can sharpen Theorem 2 for Prisoner’s Dilemma:
Given […], there exist […] such that, for all […], if s is ES w.r.t. […] then it cannot entail a punishment lasting more than […] periods
Proof: very similar to that of Theorem 2

27
For r and p too big, ES strategy s may not be “efficient”
if […]
if […]
fully cooperative strategies in Prisoner’s Dilemma generate payoffs […]

28
Theorem 3: Let […]
For all […], […] for all […]

29
Proof: Construct s so that
along equilibrium path of (s, s), payoffs are (approximately) (v, v)
punishments are nearly strongly efficient
– deviating player (say 1) minimaxed long enough to wipe out gain
– thereafter go to strongly efficient point
– overall payoffs after deviation: […]
if r and p small, (s, s) is a subgame perfect equilibrium

30
In Prisoner’s Dilemma, consider s that
– plays C the first period
– thereafter, plays C if and only if either both players played C previous period or neither did
strategy s
– is efficient
– entails punishments that are as short as possible
– is modification of Tit-for-Tat (C the first period; thereafter, do what other player did previous period)
Tit-for-Tat not ES
– if mistake (D, C) occurs then get wave of alternating punishments: (C, D), (D, C), (C, D), ... until another mistake made
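The contrast between Tit-for-Tat and its modification after a single mistake can be traced with a short sketch. The `mistakes` argument is a hypothetical, deterministic stand-in for the random mistakes with probability p, and the function names are illustrative.

```python
def tit_for_tat(own_history, other_history):
    # C the first period; thereafter, do what the other player did last period.
    return other_history[-1] if other_history else "C"

def modified_tit_for_tat(own_history, other_history):
    # C the first period; thereafter, C if and only if both players
    # took the same action in the previous period.
    if not own_history:
        return "C"
    return "C" if own_history[-1] == other_history[-1] else "D"

def play(strategy_1, strategy_2, periods, mistakes=None):
    """Realized action path; `mistakes` maps (period, player) to a forced action."""
    mistakes = mistakes or {}
    h1, h2, path = [], [], []
    for t in range(periods):
        a1 = mistakes.get((t, 1), strategy_1(h1, h2))
        a2 = mistakes.get((t, 2), strategy_2(h2, h1))
        h1.append(a1)
        h2.append(a2)
        path.append((a1, a2))
    return path

# Player 1 plays D by mistake in period 1, producing the outcome (D, C).
one_mistake = {(1, 1): "D"}
tft_path = play(tit_for_tat, tit_for_tat, 6, one_mistake)
mod_path = play(modified_tit_for_tat, modified_tit_for_tat, 6, one_mistake)
print(tft_path)
print(mod_path)
```

Tit-for-Tat never recovers on its own: the path alternates (C, D), (D, C), ... after the mistake. The modified strategy punishes for one period with (D, D) and then returns to (C, C).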

31
Let s = play d as long as, in all past periods, either
– both players played d, or
– neither played d
if single player deviates from d
– henceforth, that player plays b
– other player plays a
s is ES even though inefficient
– any attempt to improve on efficiency is punished forever
– can’t invade during punishment, because punishment efficient

32
Consider potential invader s'
For any h, s' cannot do better against s than s does against itself, since (s, s) equilibrium
hence, for all h, […] and so […]
For s' to invade, need […]
Claim: […] implies h' involves deviation from equilibrium path of (s, s)
only other possibility:
– s' different from s on equilibrium path
– then s' punished by […]
– violates […]
we thus have […]
Hence, from rhs of […]

33
For Theorem 3 to hold, p must be small relative to r
consider modified Tit-for-Tat against itself (play C if and only if both players took same action last period)
with every mistake, there is an expected loss of
2 – (½ · 3 + ½ · (−1)) = 1 the first period
2 – 0 = 2 the second period
so overall the expected loss from mistakes is approximately […]
By contrast, a mutant strategy that signals, etc. and doesn’t punish at all against itself loses only about […]
so if r is small enough relative to p, mutant can invade
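The per-mistake arithmetic on this slide can be reproduced from a payoff table. The table is an assumption read off the slide's own numbers (2 for mutual C, 0 for mutual D, 3 and −1 when exactly one player defects), and the sketch ignores discounting between the two periods.

```python
from fractions import Fraction

# Assumed Prisoner's Dilemma payoffs: (payoff to player 1, payoff to player 2).
PAYOFF = {
    ("C", "C"): (2, 2),
    ("C", "D"): (-1, 3),
    ("D", "C"): (3, -1),
    ("D", "D"): (0, 0),
}

half = Fraction(1, 2)
baseline = PAYOFF[("C", "C")][0]  # payoff per period without mistakes

# Period of the mistake: a player is equally likely to be the one who
# erred (gets 3) or the one who was erred against (gets -1).
expected_in_mistake_period = (
    half * PAYOFF[("D", "C")][0] + half * PAYOFF[("C", "D")][0]
)
loss_first = baseline - expected_in_mistake_period

# Following period: modified Tit-for-Tat has both players play D.
loss_second = baseline - PAYOFF[("D", "D")][0]

loss_per_mistake = loss_first + loss_second
print(loss_first, loss_second, loss_per_mistake)
```

Each mistake therefore costs a player about 1 + 2 = 3 in expectation before cooperation resumes. How that per-mistake figure aggregates over time depends on p and r, which this sketch does not model.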
