1 Best-Reply Mechanisms Noam Nisan, Michael Schapira and Aviv Zohar.

2 On The Agenda Best-Reply Dynamics. Convergence issues: Max-Solvable Games. Strategic issues: Universally-Max-Solvable Games. Best Reply as a Mechanism. Examples: Single Item Auction, Matching, Congestion Control.

3 Best-Reply Dynamics Repeatedly: –Fix the strategies of all players but one. –Set that player's strategy to be a best reply to the others. Greedy, myopic. A natural naïve approach for computing pure Nash. Often used as an actual strategy (Internet protocols, markets, …). Does it make sense to use best-reply in such settings?

4 Example: Battle of the Sexes. Payoffs are (Row Player, Column Player); rows are the Row Player's strategies, columns the Column Player's.
2,1   0,0
0,0   1,2
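A minimal sketch of round-robin best-reply dynamics on this game, assuming the slide's matrix reads (2,1), (0,0) in the first row and (0,0), (1,2) in the second. The names `PAYOFFS`, `best_reply` and `run` are illustrative and not from the talk. It shows that the outcome depends on the starting profile, since the game has two pure Nash equilibria.

```python
# Payoffs as (row_payoff, col_payoff); an assumed reading of the slide's matrix.
PAYOFFS = [
    [(2, 1), (0, 0)],
    [(0, 0), (1, 2)],
]

def best_reply(player, profile):
    """Best reply of `player` (0 = row, 1 = column) to the other's current strategy."""
    def payoff(s):
        r, c = (s, profile[1]) if player == 0 else (profile[0], s)
        return PAYOFFS[r][c][player]
    return max(range(2), key=payoff)  # on ties, the lowest index wins

def run(profile, max_rounds=10):
    """Round-robin best reply; returns the fixed point or None if none is reached."""
    profile = tuple(profile)
    for _ in range(max_rounds):
        changed = False
        for player in (0, 1):
            s = best_reply(player, profile)
            if s != profile[player]:
                profile = (s, profile[1]) if player == 0 else (profile[0], s)
                changed = True
        if not changed:
            return profile
    return None

print(run((0, 1)))  # row player moves first and switches to the second row
print(run((1, 0)))  # row player moves first and switches to the first row
```

Starting from (0, 1) the dynamics settle on the second equilibrium, and starting from (1, 0) on the first, so best reply alone does not select a unique outcome here.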

5 Three Desirable Properties An equilibrium point – pure Nash –At some point in time everything settles down. –Does not have to exist (e.g. rock-paper-scissors). (Fast) convergence to equilibrium –Polynomial in the size of strategy spaces. Incentive Compatibility –Players will want to follow the prescribed strategy.

6 Potential Games Defined using better-reply dynamics [Monderer&Shapley] Potential games = all games for which better reply always converges. Convergence may take exponential time. –It is PLS-Complete to find a pure Nash. [Fabrikant, Papadimitriou, Talwar] Not incentive compatible (an example later).

7 Max Dominated Strategies Definition: A strategy is max-dominated if it is not a best-reply to any strategy-profile of the other players. –Any strictly-dominated strategy is max-dominated. –Ties can be handled too. (Not in this talk.) [Figure: payoff table with rows 2 3 1, 0 1 2 and 2 3; one strategy is marked as max-dominated.]

8 Max Solvable Games Definition: A max-solvable game is a game in which iterated elimination of max-dominated strategies leaves only one strategy for each player. [Figure: example two-player payoff matrix with entries 5,1 3,5 1,2 in the first row and 2,0 4,1 in the second.]
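The elimination procedure in this definition can be sketched as follows. This is a hedged illustration, not the authors' code: the function name `max_solve`, the dictionary encoding of payoffs, and both example games are assumptions made here. Strategies that tie for best reply are all kept, which is one possible way to handle ties.

```python
from itertools import product

def max_solve(payoffs, n_strategies):
    """Iteratively eliminate max-dominated strategies.

    payoffs: dict mapping a strategy profile (tuple) to a tuple of payoffs.
    n_strategies: number of strategies of each player.
    Returns the surviving strategy sets, one per player.
    """
    n = len(n_strategies)
    alive = [set(range(m)) for m in n_strategies]
    changed = True
    while changed:
        changed = False
        for i in range(n):
            others = [sorted(alive[j]) for j in range(n) if j != i]
            best = set()  # strategies of i that are a best reply to something
            for rest in product(*others):
                def u(s):
                    return payoffs[rest[:i] + (s,) + rest[i:]][i]
                top = max(u(s) for s in alive[i])
                best |= {s for s in alive[i] if u(s) == top}
            if best != alive[i]:
                alive[i] = best
                changed = True
    return alive

# Hypothetical 2x2 game: elimination leaves one strategy per player.
GAME = {(0, 0): (5, 3), (0, 1): (0, 0), (1, 0): (10, 1), (1, 1): (4, 4)}
# Battle of the Sexes: nothing can be eliminated, so it is not max-solvable.
BOS = {(0, 0): (2, 1), (0, 1): (0, 0), (1, 0): (0, 0), (1, 1): (1, 2)}

print(max_solve(GAME, [2, 2]))
print(max_solve(BOS, [2, 2]))
```

A game is max-solvable in the sense above exactly when every returned set has a single element.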

9 Convergence Theorem: Max-solvable games have a unique pure Nash equilibrium. Theorem: In max-solvable games with n players, any (round-robin) best-reply dynamics converges within $n \sum_i m_i$ steps. –$m_i$ is the size of the strategy space of player i.

10 Asynchronous Convergence –Players do not have to act one at a time. –Best-reply relies on the current action of others. What if these messages get delayed? [Figure: the Battle of the Sexes payoff matrix, 2,1 0,0 / 0,0 1,2.]

11 Asynchronous Convergence Theorem: Max-solvable games converge under any asynchronous timing that –does not delay any player's activation indefinitely. –does not delay messages indefinitely.

12 Incentive Compatibility Prescribed behavior: Best-Reply. –Will you follow it? Notice: not a fully observable setting. A player does not always know the utilities of others. –To play best-reply a player only needs to know his own utility and the actions of others. Max solvable games are not enough to guarantee incentive compatibility.

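13 Example: Not Incentive Compatible. Payoffs are (Row Player, Column Player).
5,3    0,0
10,1   4,4
Best-reply leads to the bottom-right outcome (4,4). If the Row Player instead insists on the top row, the Column Player's best reply is the left column, and the Row Player gets 5 rather than 4.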

14 Universally Max Dominated Strategies Definition: A set of strategies for some player is universally-max-dominated if its best payoff is strictly worse than all payoffs of the other strategies. [Figure: payoff table with rows 9 7 8, 8 6 5, 1 2 3 and 0 4 3, with labels marking one set of strategies as universally-max-dominated and another as not universally-max-dominated.]

15 Universally Max Solvable Games Definition: A game is universally-max-solvable if repeated elimination of universally-max-dominated strategies leaves only one strategy profile. Every universally-max-solvable game is also max-solvable.

16 Universally-Max-Solvable Games Theorem: The pure Nash equilibrium in universally-max-solvable games is collusion-proof. –No group of players can change strategies without hurting at least one member. Corollary: The pure Nash equilibrium is also Pareto optimal.

17 Best Reply Mechanisms Players have hidden utility functions (Their types) For simplicity, we assume a central mechanism that queries them about best-replies. The goal: to decide on a strategy profile for them to play that is hopefully a pure Nash. Needed: A penalty that the mechanism can activate to punish players that did not converge. –Natural in our examples. –Needs to be worse than the equilibrium outcome.

18 Best Reply Mechanisms The mechanism: –Start with some strategy profile. –Go over the players in round-robin order and repeatedly update their best-reply. –If in some round no one changes strategy, stop and output the strategy profile. –If a certain (polynomial) number of rounds have passed and players still did not converge, invoke the penalty.
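The four steps of the mechanism can be sketched as below. This is a minimal illustration under stated assumptions: players are modeled as functions that report a strategy given the current profile, the penalty is represented by returning `None`, and the names `best_reply_mechanism`, `truthful`, `GAME` and `PENNIES` are hypothetical, as are both example games.

```python
from typing import Callable, Optional, Sequence, Tuple

Profile = Tuple[int, ...]
Reporter = Callable[[Profile], int]  # a player's answer to a best-reply query

def best_reply_mechanism(players: Sequence[Reporter], start: Profile,
                         max_rounds: int) -> Optional[Profile]:
    """Query players round-robin; return the stable profile, or None (penalty)."""
    profile = tuple(start)
    for _ in range(max_rounds):
        changed = False
        for i, report in enumerate(players):
            s = report(profile)
            if s != profile[i]:
                profile = profile[:i] + (s,) + profile[i + 1:]
                changed = True
        if not changed:
            return profile  # no one changed strategy in a full round
    return None  # round limit reached: invoke the penalty

def truthful(i: int, payoffs: dict, m: int) -> Reporter:
    """A player who always reports a true best reply (keeps current one on ties)."""
    def report(profile: Profile) -> int:
        def u(s: int) -> int:
            return payoffs[profile[:i] + (s,) + profile[i + 1:]][i]
        return max(range(m), key=lambda s: (u(s), s == profile[i]))
    return report

# Hypothetical game in which best reply converges.
GAME = {(0, 0): (5, 3), (0, 1): (0, 0), (1, 0): (10, 1), (1, 1): (4, 4)}
# Matching pennies has no pure Nash, so the round limit is hit.
PENNIES = {(0, 0): (1, -1), (0, 1): (-1, 1), (1, 0): (-1, 1), (1, 1): (1, -1)}

print(best_reply_mechanism([truthful(0, GAME, 2), truthful(1, GAME, 2)], (0, 0), 8))
print(best_reply_mechanism([truthful(0, PENNIES, 2), truthful(1, PENNIES, 2)], (0, 0), 8))
```

The round limit is passed in as a parameter because the slide only says it is polynomial; a concrete bound would depend on the game.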

19 Best Reply Mechanisms Theorem: For a universally-max-solvable game the given mechanism is incentive compatible in ex-post Nash equilibrium. Meaning: when queried you will always report your best-reply, and not some other strategy. The result of the mechanism will be the pure Nash equilibrium of the game. "Ex-post" means that you will not act differently even if you knew the specific utility functions of all others. All you assume: they also play best-reply.

20 Examples of Universally-Max-Solvable Games

21 Single Item Auction A single item is being auctioned. Each player has a private value in {1, 2, …, k}. Players announce what they are willing to pay. Highest bidder gets the item for his bid. (Ties are broken in some predefined way.)

22 Single Item Auction Utility of a player: –0 if he did not win. –Valuation minus payment if he did win. Best Reply Strategy: If Bid > Valuation, decrease bid to valuation (this involves tie breaking). If not highest bidder and Bid < Valuation, increase bid by 1.

23 The Mechanism Start at any initial bids (not necessarily 0). Query players in order and ask if they want to change their bid. When no one wants to change, allocate the item. If there is no convergence after $k \cdot n^2$ rounds, give the item to no one. Notice: –We do not force ascending bids. –We do not have to start at 0.
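A sketch of this auction mechanism, transcribing only the two bid-update rules stated on the previous slide. The tie-breaking rule (lowest index wins) and the names `winner`, `best_reply_bid` and `run_auction` are assumptions made for illustration; the talk only says ties are broken in some predefined way.

```python
def winner(bids):
    """Highest bidder; ties go to the lowest index (an assumed tie-breaking rule)."""
    return max(range(len(bids)), key=lambda i: (bids[i], -i))

def best_reply_bid(i, bids, valuation):
    """The two update rules from the slide, applied to player i."""
    if bids[i] > valuation:
        return valuation            # never bid above your valuation
    if winner(bids) != i and bids[i] < valuation:
        return bids[i] + 1          # not winning and can still afford to raise
    return bids[i]

def run_auction(valuations, bids, k):
    """Returns (winning player, price), or None if the round limit is reached."""
    n = len(valuations)
    bids = list(bids)
    for _ in range(k * n * n):      # round limit k * n^2
        changed = False
        for i in range(n):
            new_bid = best_reply_bid(i, bids, valuations[i])
            if new_bid != bids[i]:
                bids[i] = new_bid
                changed = True
        if not changed:
            w = winner(bids)
            return w, bids[w]
    return None                     # penalty: the item goes to no one

print(run_auction(valuations=[5, 3], bids=[1, 1], k=5))
```

With valuations 5 and 3 and both bids starting at 1, the bids climb one step at a time until the second player reaches his valuation, and the first player wins at price 3 through the assumed tie-breaking rule.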

24 Single Item Auction Theorem: The single item auction is universally-max-solvable (after tie breaking). Therefore: –A unique pure Nash exists. –We converge to it quickly if everyone is truthful. –The mechanism we suggested is incentive compatible. Note that this is just the English auction behavior (but with rules that are less strict).

25 Congestion Control The setting: A simplified model of packets flowing through a computer network. Assume a network graph with capacities on the edges (like a flow problem). [Figure: network graph with edge capacities 2, 3, 2, 1, 1, 4, 3.]

26 Congestion Control Flows have a fixed unchangeable single path. Vertices that get more flow than they can send out must dump some. [Figure: the network graph with edge capacities 2, 3, 2, 1, 1, 4, 3, sources S1, S2 and targets T1, T2.]

27 Congestion Control Policy of the vertices: Distribute the capacity of an edge equally between flows. If some flow does not use its full share, distribute it evenly among the others. Similar to the fair-queuing strategy in the Internet. Maximizes the minimal flow.
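The sharing policy on a single edge can be sketched as a progressive-filling computation. This is an illustration under assumptions: the function name `fair_share`, the single-edge setting, and the example numbers are not from the talk, which describes the policy only in words.

```python
def fair_share(capacity, demands):
    """Split one edge's capacity among flows, max-min fairly.

    Each flow gets an equal share; if a flow demands less than its share,
    the unused part is redistributed evenly among the remaining flows.
    """
    alloc = [0.0] * len(demands)
    # Handle flows from smallest demand to largest.
    active = sorted(range(len(demands)), key=lambda i: demands[i])
    remaining = float(capacity)
    while active:
        share = remaining / len(active)
        i = active[0]
        if demands[i] <= share:
            # This flow is fully served; its leftover goes back into the pool.
            alloc[i] = float(demands[i])
            remaining -= demands[i]
            active.pop(0)
        else:
            # Every remaining flow wants more than the equal share.
            for j in active:
                alloc[j] = share
            break
    return alloc

print(fair_share(10, [2, 8, 8]))
```

In the example, the flow demanding 2 is fully served and the remaining capacity of 8 is split evenly between the two larger flows.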

28 Congestion Control Game Each flow is a player. Utility of a player: How much he manages to send through. Decides alone how much to send through the network. Players do not know the structure of the network. Only know how much of their flow goes through, or if there is free capacity.

29 Congestion Control Best-reply strategy: –If there is free capacity, increase your flow. –If you lose some of your flow, decrease your flow. (This is tie breaking between outcomes with equal payoff.) Theorem: Congestion control is universally-max-solvable. Natural Penalty: Everyone sends full flows. We thus have: –A Pareto optimal pure Nash that maximizes min flow. –Fast convergence. –Incentive compatibility of following best-reply.

30 Stable Roommates A set of college students needs to be paired up to share dorm rooms. Each student has strict preferences over the other students (these are private). We allow students to announce a single person they want to pair up with.

31 Stable Roommates Game A player gets the utility associated with the roommate he selected if: –that roommate selected him –that roommate would prefer him over his current selection. Nash equilibria in this game are stable matchings. There may be several.

32 Stable Roommates The mechanism: –Allow students to iteratively update their selection –Stop after students no longer change –If after a while players do not stop, match no one.

33 Preference Cycles Definition: A preference cycle is a cycle of players such that each player prefers the following player more than the previous player.
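A brute-force check of this definition, as a sketch for small instances only. It assumes every player ranks all other players, considers cycles of three or more players, and reads "previous" and "following" as the neighbors along the cycle; the function name `has_preference_cycle` and the example preference lists are hypothetical.

```python
from itertools import permutations

def has_preference_cycle(prefs):
    """prefs[p] lists the other players in p's order of preference, best first."""
    players = list(prefs)
    rank = {p: {q: r for r, q in enumerate(order)} for p, order in prefs.items()}
    for k in range(3, len(players) + 1):
        for cycle in permutations(players, k):
            # Each player must prefer the next player over the previous one.
            if all(rank[cycle[i]][cycle[(i + 1) % k]] < rank[cycle[i]][cycle[i - 1]]
                   for i in range(k)):
                return True
    return False

# Cyclic preferences: A likes B best, B likes C best, C likes A best.
cyclic = {"A": ["B", "C"], "B": ["C", "A"], "C": ["A", "B"]}
# Here A and B prefer each other, so no cycle exists.
acyclic = {"A": ["B", "C"], "B": ["A", "C"], "C": ["A", "B"]}

print(has_preference_cycle(cyclic))
print(has_preference_cycle(acyclic))
```

The search enumerates all orderings of all subsets, so it is exponential in the number of players and meant only to make the definition concrete.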

34 Stable Roommates Theorem: A roommate matching game that has no preference cycle is a universally-max-solvable game. Example of no-preference-cycle: bipartite graphs with an agreed preference (med. students and hospitals). Therefore for no-preference-cycle cases: –There is a unique stable matching. –Best-reply converges to it (asynchronously) and quickly. –The mechanism we offered is incentive compatible.

35 Thanks!
