# Price of Total Anarchy June 2008 Slides by Israel Shalom Based on “Regret Minimization and the Price of Total Anarchy” By Avrim Blum, MohammadTaghi Hajiaghayi,


Price of Total Anarchy. June 2008. Slides by Israel Shalom, based on “Regret Minimization and the Price of Total Anarchy” by Avrim Blum, MohammadTaghi Hajiaghayi, Katrina Ligett and Aaron Roth.

Agenda

- Preliminaries
  - Game Theory Basics
  - Regret Minimization
- Hotelling games
- Valid games
- Atomic congestion games
- Algorithmic efficiency

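Games in Strategic Form

- The game has $k$ players, $\{1, \dots, k\}$.
- Each player $i$ has a set $A_i$ of available pure strategies.
- $A = A_1 \times \dots \times A_k$ denotes the set of strategy profiles.
- Each player has an individual utility (payoff) function $\alpha_i : A \to \mathbb{R}$.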

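Games in Strategic Form – cont'd

Examples (each entry lists the row player's value, then the column player's value):

Rock, Paper, Scissors (payoffs):

| | Rock | Paper | Scissors |
|---|---|---|---|
| Rock | 0, 0 | -1, 1 | 1, -1 |
| Paper | 1, -1 | 0, 0 | -1, 1 |
| Scissors | -1, 1 | 1, -1 | 0, 0 |

Prisoner's Dilemma (costs, lower is better):

| | Deny | Confess |
|---|---|---|
| Deny | 1, 1 | 5, 0 |
| Confess | 0, 5 | 3, 3 |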

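Mixed Strategies

- Players can also play "mixed strategies": a probability distribution over player $i$'s pure strategies $A_i$. We denote the set of these distributions by $S_i$.
- $S = S_1 \times \dots \times S_k$ denotes the set of mixed strategy profiles.
- The payoffs are now defined as the expected value of the utility $\alpha_i$ over the randomness of the players.
- This expected payoff is sometimes given its own notation.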

Best Response and Nash Equilibria

- Lowercase letters will usually denote elements: $a_i \in A_i$, $s_i \in S_i$, …
- We denote by $s_{-i}$ the selected strategies of the players other than $i$, so that $s = (s_i, s_{-i})$.
- A strategy $s_i$ is a best response to $s_{-i}$ if for all $s_i' \in S_i$: $\alpha_i(s_i, s_{-i}) \ge \alpha_i(s_i', s_{-i})$.
- A strategy profile $s$ is a Nash equilibrium if for all $i$, $s_i$ is a best response to $s_{-i}$.
- Pure equilibria might exist, but every finite game has at least one mixed Nash equilibrium.

Nash Equilibria

Examples:

- Rock, Paper, Scissors: mixed equilibrium ([1/3, 1/3, 1/3], [1/3, 1/3, 1/3])
- Prisoner's Dilemma: pure equilibrium (Confess, Confess)

| | Rock | Paper | Scissors |
|---|---|---|---|
| Rock | 0, 0 | -1, 1 | 1, -1 |
| Paper | 1, -1 | 0, 0 | -1, 1 |
| Scissors | -1, 1 | 1, -1 | 0, 0 |

| | Deny | Confess |
|---|---|---|
| Deny | 1, 1 | 5, 0 |
| Confess | 0, 5 | 3, 3 |

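Social Optimum

- Sometimes we'll define a social utility (welfare) function, similar to payoffs: $\gamma : A \to \mathbb{R}$.
- Choices that would make sense: the sum of the players' payoffs, $\gamma(a) = \sum_i \alpha_i(a)$, or their minimum, $\gamma(a) = \min_i \alpha_i(a)$.
- For mixed strategies, we'll look at the expected value (analogous to the payoff in mixed strategies).
- The socially optimal strategy profile and OPT are: $a^* = \arg\max_{a \in A} \gamma(a)$ and $\mathrm{OPT} = \gamma(a^*)$.
- We assume a maximizing game throughout; minimization is analogous.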

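Price of Anarchy

- Let $N$ denote the set of all Nash equilibria of the game.
- The price of anarchy compares the worst Nash equilibrium with the optimum. For a maximization game with social utility $\gamma$: $\mathrm{PoA} = \dfrac{\mathrm{OPT}}{\min_{s \in N} \gamma(s)}$.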

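Price of Anarchy

Prisoner's Dilemma (entries are costs, so this is a minimization game; the social cost is the sum of the two costs):

| | Deny | Confess |
|---|---|---|
| Deny | 1, 1 | 5, 0 |
| Confess | 0, 5 | 3, 3 |

- OPT = 2, attained at (Deny, Deny).
- The Nash equilibrium (Confess, Confess) has social cost N = 6.
- Notice that the fraction is flipped (minimization game): the price of anarchy is N / OPT = 6 / 2 = 3.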
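The Prisoner's Dilemma numbers can be checked by brute force. The following is a minimal sketch, assuming the table entries are costs and the social cost is their sum; the names `COST`, `is_pure_nash` and `social_cost` are illustrative and not from the slides, and only pure equilibria are enumerated.

```python
from itertools import product

# Costs as (row player, column player); lower is better.
ACTIONS = ["Deny", "Confess"]
COST = {
    ("Deny", "Deny"): (1, 1),
    ("Deny", "Confess"): (5, 0),
    ("Confess", "Deny"): (0, 5),
    ("Confess", "Confess"): (3, 3),
}


def is_pure_nash(profile):
    """True if no player can lower their own cost by deviating alone."""
    for player in (0, 1):
        for alternative in ACTIONS:
            deviated = list(profile)
            deviated[player] = alternative
            if COST[tuple(deviated)][player] < COST[profile][player]:
                return False
    return True


def social_cost(profile):
    """Assumed social cost: the sum of both players' costs."""
    return sum(COST[profile])


profiles = list(product(ACTIONS, repeat=2))
pure_nash = [p for p in profiles if is_pure_nash(p)]
opt = min(social_cost(p) for p in profiles)
worst_nash = max(social_cost(p) for p in pure_nash)
price_of_anarchy = worst_nash / opt  # flipped ratio for a minimization game
```

Because Confess is a strictly dominant action here, the pure enumeration already finds the only equilibrium of this game.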

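Regret Minimization

- Let $a^1, \dots, a^T$ denote the strategy profiles played in $T$ steps.
- We define the regret of player $i$ in a maximization game with utility $\alpha_i$ as: $\max_{a_i \in A_i} \frac{1}{T}\sum_{t=1}^{T} \alpha_i(a_i, a^t_{-i}) - \frac{1}{T}\sum_{t=1}^{T} \alpha_i(a^t)$.
- Intuitively, this is "how much more $i$ could have gained on average had he played a single strategy throughout the game".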
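As an illustration of the regret definition, here is a small sketch that computes a player's average regret against the best fixed action in hindsight, using Rock, Paper, Scissors payoffs. The function names and the sample play sequence are hypothetical.

```python
ACTIONS = ["Rock", "Paper", "Scissors"]
BEATS = {"Rock": "Scissors", "Paper": "Rock", "Scissors": "Paper"}


def payoff(mine, theirs):
    """Payoff of playing `mine` against `theirs`: 1 win, 0 tie, -1 loss."""
    if mine == theirs:
        return 0
    return 1 if BEATS[mine] == theirs else -1


def average_regret(my_plays, their_plays):
    """Best fixed action's average payoff minus the realized average payoff."""
    steps = len(my_plays)
    realized = sum(payoff(m, o) for m, o in zip(my_plays, their_plays)) / steps
    best_fixed = max(
        sum(payoff(a, o) for o in their_plays) / steps for a in ACTIONS
    )
    return best_fixed - realized


# Hypothetical sequence: always Rock against a varying opponent.
mine = ["Rock", "Rock", "Rock", "Rock"]
theirs = ["Paper", "Paper", "Rock", "Scissors"]
regret = average_regret(mine, theirs)
```

In this sequence the realized average payoff is -0.25, while always playing Scissors would have earned 0.25, so the average regret is 0.5.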

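Regret Minimization

- When a player $i$ uses a regret-minimizing algorithm, for any sequence $a^1, \dots, a^T$ we have the property: $\max_{a_i \in A_i} \frac{1}{T}\sum_{t=1}^{T} \alpha_i(a_i, a^t_{-i}) - \frac{1}{T}\sum_{t=1}^{T} \mathbb{E}[\alpha_i(a^t)] \le R(T)$
- Where:
  - $R(T)$ vanishes as $T \to \infty$
  - $T_\epsilon$ denotes the number of steps before $R(T) \le \epsilon$
  - The expectation is over the algorithm's randomness
- In other words, the expected value of the regret vanishes.
- Notice that this is for maximizing games.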

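Regret Minimization

- This implies that for any sequence $a^1, \dots, a^T$, if player $i$ is regret-minimizing, then: $\frac{1}{T}\sum_{t=1}^{T} \mathbb{E}[\alpha_i(a^t)] \ge \max_{a_i \in A_i} \frac{1}{T}\sum_{t=1}^{T} \alpha_i(a_i, a^t_{-i}) - R(T)$
- The price of total anarchy is defined as: $\max \dfrac{\mathrm{OPT}}{\frac{1}{T}\sum_{t=1}^{T} \gamma(a^t)}$
- Where the max is taken over $a^1, \dots, a^T$ that are play profiles with the regret-minimization property.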

Regret and NE

- Notice that when playing a Nash equilibrium, all players have zero regret.
- If there were a better "constant" response, a player could improve by moving to it.
- Therefore, the price of total anarchy in any game is an upper bound on the price of anarchy.

[Figure: the set of Nash equilibria (NE) drawn inside the set of regret-minimizing strategies]

Advantages of Regret Minimization

- Computational
  - Nash equilibria are hard (PPAD-hard) to calculate, even for small action spaces
  - There are efficient regret minimization algorithms for a polynomial number of actions
- Motivational
  - No particular reason for players to converge to a NE
  - There might be multiple equilibria, and agents may individually prefer different ones
  - Byzantine players' actions are not taken into account in NE
  - Regret minimization considers only local information, which is much more practical

Agenda

- Preliminaries
- Hotelling games
  - Definition
  - POA/POTA
  - Generalization
- Valid games
- Atomic congestion games
- Algorithmic efficiency

Hotelling - Game Definition

- Souvenir stand owners in Paris:
  - There are tourists every day; they buy from whichever stand they find first
  - Each stand owner wishes to maximize his own sales
  - We want "fairness": the social welfare function is the minimum, over stand owners, of the sales made
- Formally:
  - We have an $n$-vertex graph
  - Each of the $k$ sellers locates himself at a vertex
  - Each day, a tourist at each vertex goes to the closest seller
  - If there is a "tie" between sellers, they split the gains
  - Minimum utility: $\gamma(a) = \min_i \alpha_i(a)$, where $\alpha_i$ is seller $i$'s payoff

Hotelling - Optimum Solution

- Notice that the sum of payoffs is always exactly $n$
- Therefore, the social optimum is achieved when all players have equal payoffs
- This can happen if all players play on the same vertex
- Therefore $\mathrm{OPT} = n/k$
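The payoff rule and the constant-sum observation can be sketched in code. This is an illustrative implementation under stated assumptions: the graph is an adjacency dictionary, distances are hop counts computed by breadth-first search, and ties are split evenly. The names `bfs_distances` and `hotelling_payoffs` and the 6-vertex path graph are made up for this example.

```python
from collections import deque
from fractions import Fraction


def bfs_distances(graph, source):
    """Hop distance from `source` to every reachable vertex."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for w in graph[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    return dist


def hotelling_payoffs(graph, locations):
    """locations[i] is seller i's vertex; returns each seller's payoff."""
    dist = {v: bfs_distances(graph, v) for v in set(locations)}
    payoffs = [Fraction(0)] * len(locations)
    for tourist in graph:
        d = [dist[loc].get(tourist, float("inf")) for loc in locations]
        closest = min(d)
        winners = [i for i, di in enumerate(d) if di == closest]
        for i in winners:  # tied sellers split the tourist evenly
            payoffs[i] += Fraction(1, len(winners))
    return payoffs


# Hypothetical example: a path on 6 vertices.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}
spread_out = hotelling_payoffs(path, [0, 3])      # unequal split of the 6 tourists
same_vertex = hotelling_payoffs(path, [2, 2, 2])  # everyone gets n/k = 2
```

In both placements the payoffs sum to n = 6, and co-locating all sellers attains the fairness optimum n/k.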

Hotelling – POA

- Theorem 3.1: The price of anarchy in the Hotelling game is $(2k-2)/k$.
- Proof:
  - We show that in a Nash equilibrium $S$ every player gains at least $n/(2k-2)$. Since the optimum gives each player $n/k$, the ratio is then at most $(n/k) / (n/(2k-2)) = (2k-2)/k$.
  - Assume the contrary, that player $i$ gains less than that in $S$.
  - Consider player $i$ "leaving" the game. The total payoff is still $n$, so the average payoff of the remaining players is now $n/(k-1)$.
  - There must be at least one player $h$ gaining at least the average, playing the vertex $v_h$.
  - Player $i$ can assure himself $n/(2k-2)$ by moving to $v_h$ and splitting with $h$.
  - This contradicts $S$ being a Nash equilibrium.

Theorem 3.1 – cont'd

- We are left with showing tightness
- Consider a game with $k-1$ stars
- $k-1$ players play at the centers of their own stars, and player $k$ plays uniformly over all the star centers
- This is a NE
- The randomizing player earns $n/(2k-2)$

[Figure: stars labeled 1, 2, 3, ..., k-1]

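Hotelling – POTA

- Let $O_i^t$ be the set of strategies played at time $t$ by the players other than $i$, and let $\bar{O}_i^t$ be the strategy of playing a uniformly random strategy from the strategies in $O_i^t$.
- Define $\Delta_i^{t \to u} = \alpha_i(\bar{O}_i^t \oplus O_i^u) - \frac{n}{2k-2}$, the amount by which player $i$ beats $n/(2k-2)$ when he plays $\bar{O}_i^t$ against the others' time-$u$ plays.
- Notice that $\Delta_i^{t \to t} \ge 0$, since when player $i$ is removed, the rest have an average payoff of $n/(k-1)$.
- Lemma 3.4: For all $i$, for all $1 \le t, u \le T$, $\Delta_i^{t \to u} + \Delta_i^{u \to t} \ge 0$. (Trivial for $t = u$.)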

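Lemma 3.4 - Proof

- Consider a $(2k-2)$-player game
- Each player other than $i$ is replicated twice: once as a time-$t$ player and once as a time-$u$ player, playing the strategies those players used at time $t$ and at time $u$ respectively
- The average payoff is $n/(2k-2)$
- If player $i$ replaces a random time-$t$ player, his expected payoff is the average payoff of the time-$t$ players
- If we further remove the remaining time-$t$ players, his payoff can only improve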

The Imaginary Game, n = 10, k = 4

[Figure: time-t players and time-u players. The expected payoff from replacing a time-t player in the imaginary game is at most the expected payoff from replacing a time-t player and removing the other time-t players.]

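Lemma 3.4 – cont'd

- The same argument holds for replacing a time-u player. The time-t and time-u averages add up to $n/(k-1)$, so the two bounds together give the lemma.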

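Hotelling – POTA

- Theorem 3.2: Each regret-minimizing player has payoff at least $n/(2k-2)$.
- Proof:
  - Given a sequence of $T$ plays, select a random time $u$.
  - Write $O_i^t$ for the other players' strategies at time $t$ and $\bar{O}_i^u$ for copying a uniformly random one of the other players' time-$u$ strategies. The average expected payoff had we played $\bar{O}_i^u$ throughout is: $\frac{1}{T}\sum_{t=1}^{T} \alpha_i(\bar{O}_i^u \oplus O_i^t) = \frac{n}{2k-2} + \frac{1}{T}\sum_{t=1}^{T} \Delta_i^{u \to t}$
  - Averaging over the different $u$, we reach: $\frac{n}{2k-2} + \frac{1}{T^2}\sum_{u=1}^{T}\sum_{t=1}^{T} \Delta_i^{u \to t}$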

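Hotelling – POTA

- We reached an average of $n/(2k-2)$ plus a second term that averages the gains over all pairs of times $(u, t)$
- The second term is non-negative due to Lemma 3.4
- There is a value of $u$ that achieves at least the average
- For that $u$, if player $i$ mixes between the strategies the other players used at time $u$, he'll achieve at least $n/(2k-2)$
- A regret-minimizing player achieves this expected payoff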

Hotelling – POTA

- Corollary: The price of total anarchy in the Hotelling game is $(2k-2)/k$, matching the price of anarchy.
- Notice that in the proof we haven't made any assumptions about how the other players behave, so the proof holds even in the presence of Byzantine players making arbitrary (or adversarial) decisions!

Generalized Hotelling Game

- Notice that in the proof we have used only three features of the Hotelling game:
  - Constant sum: the sum of utilities is constant
  - Symmetric: the "names" of the stand owners don't matter
  - Monotone: any player can "leave" the game and the sum does not change
- We call such games with the "fairness" social utility generalized Hotelling games.
- Theorem 3.6: In any $k$-player generalized Hotelling game, the price of total anarchy among regret-minimizing players is $(2k-2)/k$, even in the presence of arbitrarily many Byzantine players.

Non-Convergence

- Consider the game with:
  - Players $\{0, \dots, k-1\}$
  - $k-1$ $n$-vertex stars, with centers at $v_0, \dots, v_{k-2}$, and an isolated vertex $v_{k-1}$
- There is a play sequence in which no single vertex offers a player an expected payoff above what that player already earns: no regrets
- However, this is not Nash! Players at the isolated vertex will deviate!

[Figure: stars labeled 1, 2, 3, ..., k-1 and an isolated vertex k]

Break?

Agenda

- Preliminaries
- Hotelling games
- Valid games
  - Definition
  - Market sharing game
  - POA/POTA
  - Byzantine players
- Atomic congestion games
- Algorithmic efficiency

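Valid Games – Definitions

- Consider a $k$-player maximization game
- For each player, there is a groundset of actions $V_i$
- Player $i$ plays from some feasible set $A_i \subseteq 2^{V_i}$
- Definitions:
  - Let $V = \bigcup_i V_i$ and let $f : 2^V \to \mathbb{R}$
  - The discrete derivative of $f$ at $X \subseteq V$ in the direction $D \subseteq V \setminus X$ is $f'_D(X) = f(X \cup D) - f(X)$
  - The function $f$ is said to be submodular if for $A \subseteq B$ and every $D \subseteq V \setminus B$: $f'_D(A) \ge f'_D(B)$
- This should remind us of "concavity": decreasing marginal utility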

Submodularity

- Adding something to a smaller set makes a bigger difference

[Figure: nested sets A and B inside V, with items car, house, villa, high-def, jacuzzi]
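The "smaller set, bigger difference" property can be tested exhaustively on a tiny groundset. The sketch below assumes the standard discrete-derivative definition of submodularity; the coverage function, the `COVERS` data and the function names are illustrative choices, not taken from the slides.

```python
from itertools import combinations


def subsets(items):
    """Yield every subset of `items` as a frozenset."""
    items = list(items)
    for r in range(len(items) + 1):
        for combo in combinations(items, r):
            yield frozenset(combo)


def derivative(f, X, D):
    """Discrete derivative of f at X in the direction D."""
    return f(X | D) - f(X)


def is_submodular(f, ground):
    """Check f'_D(A) >= f'_D(B) for all A within B and D outside B."""
    ground = frozenset(ground)
    for B in subsets(ground):
        for A in subsets(B):
            for D in subsets(ground - B):
                if derivative(f, A, D) < derivative(f, B, D):
                    return False
    return True


# Hypothetical coverage function: each item covers some elements.
COVERS = {"x": {1, 2}, "y": {2, 3}, "z": {3, 4}}


def coverage(X):
    return len(set().union(*(COVERS[item] for item in X)))


def squared_size(X):
    return len(X) ** 2  # increasing marginal gains, so not submodular


coverage_ok = is_submodular(coverage, COVERS)
squared_ok = is_submodular(squared_size, COVERS)
```

The brute-force check is exponential in the groundset size, so it is only meant for small illustrative instances.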

Valid Games – Definitions

- We write $s_{<i}$ for the strategies of the players with index smaller than $i$. We also treat such partial profiles (for example $s_{-i}$) as complete strategies when applying $\gamma$ to them, meaning that the remaining players play the empty set.
- Definition 4.2: A game with private utility functions $\alpha_i$ and social utility function $\gamma$ is valid if:
  - $\gamma$ is submodular
  - For all $i$, $s$: $\alpha_i(s) \ge \gamma'_{s_i}(s_{-i})$ (private fairness)
  - For all $s$: $\sum_i \alpha_i(s) \le \gamma(s)$ (social fairness)

Valid Games – Example

- Market sharing game (Goemans et al., 2005)
  - Players are ISPs
  - Markets are towns
  - Each market has a price and a value
  - Each player can "enter" a market he has an edge towards, subject to a budget constraint
  - A player's payoff per market is the market's value divided by the number of entrants
  - Sum social utility, or equivalently the sum of values at entered markets

[Figure: bipartite graph of players and markets; the numbers 5, 3, 9, 2 appear as labels]

Valid Games – Price of Anarchy

- Vetta, 2002: In a valid game, if $S$ is a NE strategy and $\Omega$ is the optimal strategy, then $\gamma(\Omega)$ is at most $2\gamma(S)$ minus a sum of discrete derivatives of $\gamma$.
- Corollary: if $\gamma$ is non-decreasing, then we have POA $\le 2$ (the derivatives are always non-negative).
- Theorem 4.3, Corollary 4.2 (no proofs): POTA matches POA in valid games (up to $\epsilon$).

Valid Games – Byzantine Players

- Theorem 4.5: In a valid game with nondecreasing social welfare, if $k$ players minimize regret while the Byzantine players play arbitrary strategies, the average social welfare is at least roughly $\mathrm{OPT}/2$, up to a vanishing regret term.
- Proof: Assume the contrary.

Theorem 4.5 – cont'd

The derivation on this slide uses, in order: non-decreasing, gradually inserting, submodularity, private fairness.

Gradual Insertion

[Figure: sets A and B with items car, house, villa, jacuzzi]

Theorem 4.5 – cont'd

The derivation on this slide uses, in order: summing, the assumption (the first term is less than half), social fairness, rearranging the sum.

Theorem 4.5 – cont'd

- At least one player must match that bound, which contradicts regret minimization!
- Note that the comparison is to the old OPT (without the Byzantine players)
- But it's fair: Byzantine players may be acting even against their own interest, so we can't say anything about them

Agenda

- Preliminaries
- Hotelling games
- Valid games
- Atomic congestion games
  - Definition
  - Sum social utility: POTA
  - Makespan utility: lower bounds
- Algorithmic efficiency

Congestion Games

- A congestion game is a minimization game with $k$ players
- For each player, there is a set of facilities $V_i$
- Player $i$ plays from some feasible set $A_i \subseteq 2^{V_i}$
- In weighted games, player $i$ has a weight $w_i$
- For unweighted games, we assume $w_i = 1$
- The load on facility $e$ is defined as $\ell_e(a) = \sum_{i : e \in a_i} w_i$
- Each facility $e$ has an associated latency function $f_e$
- Player $i$ playing $a_i$ experiences cost $c_i(a) = \sum_{e \in a_i} f_e(\ell_e(a))$
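The load and cost definitions translate directly into code. The following sketch assumes each player's action is a set of facility names and uses the identity latency f_e(x) = x purely as an example; `loads`, `player_costs` and the three-player profile are hypothetical.

```python
from collections import Counter


def loads(profile, weights=None):
    """Load on each facility: total weight of the players using it."""
    weights = weights or [1] * len(profile)  # unweighted: every w_i = 1
    load = Counter()
    for facilities, w in zip(profile, weights):
        for e in facilities:
            load[e] += w
    return load


def player_costs(profile, latency, weights=None):
    """Cost of player i: sum of f_e(load on e) over the facilities i uses."""
    load = loads(profile, weights)
    return [sum(latency[e](load[e]) for e in facilities) for facilities in profile]


# Hypothetical unweighted instance with identity latencies (an assumption).
profile = [{"e1", "e2"}, {"e2"}, {"e2", "e3"}]
latency = {e: (lambda x: x) for e in ("e1", "e2", "e3")}
costs = player_costs(profile, latency)
total_cost = sum(costs)
```

With identity latencies the sum of the players' costs equals the sum of squared loads, here 1 + 9 + 1 = 11.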

Atomic Congestion Games

- We'll consider a specific kind of congestion game:
  - Unweighted
  - Linear latencies
- We will use the sum social utility: the sum of the players' costs
- Previously known results:
  - POA for pure strategies is 2.5 (Awerbuch et al., 2005)
  - POA for mixed strategies is also 2.5 (Christodoulou and Koutsoupias, 2007)
- Theorem 5.1: POTA in this setting is 2.5
- This implies the previously known results, since the price of total anarchy is an upper bound on the price of anarchy!

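Theorem 5.1 – Proof

- Let $a^*$ be the optimal play
- Since we have no regret, each player $i$ pays on average no more than he would by always playing his part $a_i^*$ of the optimal play
- Summing over the players and rearranging the sum gives the inequality used in the rest of the proof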

Theorem 5.1 – cont'd

- The geometric mean is at most the arithmetic mean; this yields the next inequality
- Recall our earlier equation

[Equations (1), (2), (3)]

Theorem 5.1 – cont'd

- Multiplying both sides by two
- Further relaxing the inequality
- We're done!

Parallel Link Congestion Game

- Consider $n$ identical links and $k$ weighted players
- Each player selects which single link to use
- Each player pays the sum of the weights on his link

Parallel Link Congestion Games – cont'd

- Claim: In the parallel link congestion game with the maximum expected job latency as social cost function, POTA is 2
- Proof:
  - Rescale the weights so that OPT = 1
  - The total weight is at most $n$, and each weight is at most 1
  - The total latency over $T$ plays is at most $Tn$, so at least one link $e^*$ has total latency at most $T$, that is, average latency $\ell(e^*) \le 1$
  - A regret-minimizing player will be competitive with moving to $e^*$
  - We expect at most $\ell(e^*) + w_i \le 2$

Parallel Links Congestion Game – cont'd

- The unweighted case with the sum social utility is called the "load balancing game"
- It is a special case of the atomic congestion games discussed before, so POA and POTA are at most 2.5
- If $k \gg n$ (a likely case) and the server speeds are relatively bounded, we can say even more
- Theorem 5.6 (no proof): In this setting, POTA is $1 + o(1)$
- Corollary 5.7: In this setting, POA is $1 + o(1)$, even for mixed strategies

Parallel Link Congestion Games – cont'd

- Usually, we consider the makespan social utility function: the load on the most loaded link
- Why doesn't our argument from before hold?
  - Because $\mathbb{E}[\max\{X\}] \ge \max\{\mathbb{E}[X]\}$, and the inequality can be strict
- The POA for 2-link games is 3/2 (Koutsoupias and Papadimitriou, 1999)
- The POA for $n$-link games is $\Theta(\log n / \log\log n)$ (Koutsoupias, Mavronicolas, Spirakis, 1999)
- Theorem 5.4 (no proof): POTA for this game with two links is 3/2.
- Theorem 5.5: POTA for this game with $n$ links is $\Omega(\sqrt{n})$.
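The gap between the expected maximum load and the maximum expected load can be seen on a tiny instance. This sketch assumes unit-weight players who each pick a link uniformly at random (an illustrative distribution, not one taken from the slides) and enumerates all outcomes exactly; `makespan_stats` is a hypothetical helper name.

```python
from fractions import Fraction
from itertools import product


def makespan_stats(n_links, n_players):
    """Return (max over links of E[load], E[max load]) under uniform play."""
    outcomes = list(product(range(n_links), repeat=n_players))
    p = Fraction(1, len(outcomes))  # every outcome is equally likely
    expected_load = [Fraction(0)] * n_links
    expected_max = Fraction(0)
    for outcome in outcomes:
        load = [outcome.count(link) for link in range(n_links)]
        for link in range(n_links):
            expected_load[link] += p * load[link]
        expected_max += p * max(load)
    return max(expected_load), expected_max


max_of_expectations, expectation_of_max = makespan_stats(2, 2)
```

For two players on two links, every link has expected load 1, yet the expected makespan is 3/2, because half of the time both players collide on the same link.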

Theorem 5.5 – Proof Sketch

- Consider $n$ links, $n$ players, unit weights. OPT = 1
- Resembles what we did in Hotelling (for non-convergence):
  - Split the players into $\sqrt{n}$ groups
  - At time $t$, group $t \bmod \sqrt{n}$ plays at link 1, while the rest play on distinct links and get an average latency close to 1
  - This minimizes regret: for any fixed link, the player would need to share the link most of the time (latency ~2)
  - Still, at each time, link 1 has a whole group: a maximum latency of $\sqrt{n}$
- Notice that this holds even for unweighted players!

Agenda

- Preliminaries
- Hotelling games
- Valid games
- Atomic congestion games
- Algorithmic efficiency

Algorithmic Efficiency

- Weighted Majority Algorithm (Littlestone, Warmuth)
  - Initialize a weight for every strategy $i$
  - Update the weights at time $t$, using a small tradeoff parameter (0.01) and the loss at time $t-1$
  - Has a bounded expected regret over time $T$
  - Explained in "Algorithmic Game Theory", chapter 4
- Polynomial in the number of strategies (Hotelling, congestion games)
- Not as good in valid games (the number of strategies is exponential in the size of the groundset)
- We're assuming a minimizing game with [0,1] loss and $n$ strategies.
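A multiplicative-weights learner of this kind can be sketched in a few lines. The exact update rule did not survive in the transcript, so the code below assumes the "polynomial weights" variant, where each weight is multiplied by (1 - eta * loss), with losses in [0, 1]. The function name, the parameter name `eta` and the toy loss sequence are all illustrative. To keep the example deterministic it tracks the expected loss of the randomized player instead of sampling actions.

```python
def polynomial_weights(loss_rounds, eta=0.01):
    """Run a multiplicative-weights learner; return (average regret, weights).

    loss_rounds[t][i] is the loss in [0, 1] of strategy i at time t.
    Assumed update rule: w_i <- w_i * (1 - eta * loss_i).
    """
    n = len(loss_rounds[0])
    weights = [1.0] * n            # initialize every weight to 1
    expected_loss = 0.0            # expected loss of the randomized player
    cumulative = [0.0] * n         # total loss of each fixed strategy
    for losses in loss_rounds:
        total = sum(weights)
        # Play strategy i with probability proportional to its weight.
        expected_loss += sum(w / total * l for w, l in zip(weights, losses))
        for i, l in enumerate(losses):
            cumulative[i] += l
            weights[i] *= 1.0 - eta * l
    average_regret = (expected_loss - min(cumulative)) / len(loss_rounds)
    return average_regret, weights


# Toy sequence: strategy 0 never loses, strategy 1 always loses.
rounds = [[0.0, 1.0]] * 1000
avg_regret, final_weights = polynomial_weights(rounds, eta=0.1)
```

Each step costs time linear in the number of strategies, which is why this approach suits games with polynomially many strategies and is less attractive when the strategy set is exponential in the groundset.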

Questions?
