Connections between Learning Theory, Game Theory, and Optimization Maria Florina (Nina) Balcan Lecture 14, October 7 th 2010.

Connections between Learning Theory, Game Theory, and Optimization Maria Florina (Nina) Balcan Lecture 14, October 7 th 2010

Improved Equilibria via Public Service Advertising

Good equilibria, Bad equilibria Many games have both bad and good equilibria. In some places, everyone drives their own car. In some, everybody uses and pays for good public transit.

G Fair cost-sharing Fair cost-sharing: n players in weighted directed graph G. Player i wants to get from s i to t i, and they share cost of edges they use with others.

Fair cost-sharing s t 1n Player i wants to get from s i to t i. All players share cost of edges they use with others. n players in directed graph G, each edge e costs c e. Good equilibrium: all use edge of cost 1. (paying 1/n each) Bad equilibrium: all use edge of cost n. (paying 1 each) Each player wants to minimize his own cost.

Inefficiency of equilibria, PoA and PoS Price of Stability (PoS): ratio of best Nash equilibrium to OPT. Price of Anarchy (PoA): ratio of worst Nash equilibrium to OPT. Significant effort spent on understanding these in CS. [Koutsoupias-Papadimitriou’99] [Anshelevich et. al, 2004] E.g., for fair cost-sharing, PoS is log(n), whereas PoA is n. “Algorithmic Game Theory”, Nisan, Roughgarden, Tardos, Vazirani

Fair Cost Sharing Player i wants to get from s i to t i, and minimize its cost. all players share cost of edges they use with others. n players in directed graph G, each edge e costs c e. PoA is n; PoS is log(n). s t 1n PoA is O(n):in any Nash no player pays more than OPT s t 1n PoA is  (n):

Fair Cost Sharing … 1 1/21/n-1 s1s1 snsn t 000 1+ ² PoA is n; PoS is log(n). PoS is  (log(n)): 1/n 0 0 Player i wants to get from s i to t i, and minimize its cost. all players share cost of edges they use with others. n players in directed graph G, each edge e costs c e.

Fair Cost Sharing t PoS is  (log(n)): potential function argument Player i wants to get from s i to t i, and minimize its cost. all players share cost of edges they use with others. n players in directed graph G, each edge e costs c e. PoA is n; PoS is log(n).

Fair Cost Sharing t where Social cost of is Player i wants to get from s i to t i, and minimize its cost. all players share cost of edges they use with others. n players in directed graph G, each edge e costs c e.

Fair Cost Sharing t where Social cost of is Player i wants to get from s i to t i, and minimize its cost. all players share cost of edges they use with others. n players in directed graph G, each edge e costs c e. A player moves, change in player’s cost = change in potential Proof: player i moves, get from S to S’; let A be the edges in S but not in S’, and B the edges in S’ but not in S. Its change in cost: Change in the potential

Fair Cost Sharing t where Social cost of is Player i wants to get from s i to t i, and minimize its cost. all players share cost of edges they use with others. n players in directed graph G, each edge e costs c e.

Fair Cost Sharing PoS is  (log(n)): potential function argument The potential does not increase & reach a pure Nash of cost · H(n) ¢ OPT. Iterate best-response dynamics starting from an optimal solution [i.e, while there is a player that can improve, pick an arbitrary such player and let him to best response]. Player i wants to get from s i to t i, and minimize its cost. all players share cost of edges they use with others. n players in directed graph G, each edge e costs c e. Potential always decreases, finite # of states, so reach a pure Nash.

Congestion games more generally Game defined by n players and m resources. Cost of a resource j is a function f j (n j ) of the number n j of players using it. Each player i chooses a set of resources (e.g., a path) from collection S i of allowable sets of resources (e.g., paths from s i to t i ). Cost incurred by player i is the sum, over all resources being used, of the cost of the resource. Generic potential function: Best-response dynamics always gives an equilibrium.

Congestion games more generally Always have a pure-strategy equilibrium. Have a potential function s.t. whenever a player switches, potential drops by exactly that player’s improvement. Nice general class of games with many players. –Best-response dynamics always gives an equilibrium. But maybe a large gap between the quality of the best and the worst equilibrium. Lots of work on understanding properties of these games and quality of their equilibria.

Good equilibria, Bad equilibria Many games have both bad and good equilibria. In some places, everyone drives their own car. In some, everybody uses and pays for good public transit.

Guiding from Bad to Good Standard motivation for PoS: If a central authority could suggest a low-cost Nash (ride public transit), and everyone followed the suggestion, then this would be stable. Price of Anarchy (PoA): ratio of worst Nash equilibrium to OPT. Price of Stability (PoS): ratio of best Nash equilibrium to OPT. Can a helpful authority encourage (guide) behavior to move from a bad state to a good state?

What if only some  fraction will pay attention? Can the authority guide behavior to a good state? Will it just snap back? How does this depend on  ? Guiding from Bad to Good [Balcan-Blum-Mansour, SODA 2009]

Main Model 1.Authority launches advertising, proposing joint action s ad. 0. n players initially playing some arbitrary equilibrium. … 1 111 s1s1 snsn t 000 k

Main Model 1.Authority launches advertising, proposing joint action s ad. Each player i follows with probability . Call players that follow receptive players 0. n players initially playing some arbitrary equilibrium. … 1 111 s1s1 snsn t 000 k

Main Model 1.Authority launches advertising, proposing joint action s ad. 2.Remaining (non-receptive) players fall to some arbitrary equilibrium for themselves, given play of receptive players. 3.All players follow best-response dynamics to an overall Nash equilibrium. potential games, pure Nash eqs. Each player i follows with probability . Call players that follow receptive players Notes: social cost: 0. n players initially playing some arbitrary equilibrium.

Main Results If only a constant fraction  of the players follow the advice, then we can still get within O(1/  ) of the PoS. Extend to cost-sharing + linear delays. (PoS = log(n), PoA = n) (PoS = 1, PoA =  (n 2 )) Threshold behavior: for  > ½, can get ratio O(1), but for  < ½, ratio stays  (n 2 ). (assume degrees  (log n)).

Fair Cost Sharing … 1 111 s1s1 snsn t 000 k Note: this is best you can hope for. E.g., k =2  n. If only a constant fraction  of the players follow the advice, then we get within O(1/  ) of the PoS. (PoS = log(n), PoA = n)

Fair Cost Sharing If only a constant fraction  of the players follow the advice, then we get within O(1/  ) of the PoS. (PoS = log(n), PoA = n) Advertiser proposes OPT (any apx also works) random vars Phase 1:

Fair Cost Sharing - Moreover, this option is guaranteed to be at least as good as if other NR players didn’t exist. If only a constant fraction  of the players follow the advice, then we get within O(1/  ) of the PoS. (PoS = log(n), PoA = n) - In any NE a non-receptive player i, can’t improve by switching to his path P i OPT in OPT. Cost of non-receptive players at the end of Phase 2

Fair Cost Sharing If only a constant fraction  of the players follow the advice, then we get within O(1/  ) of the PoS. (PoS = log(n), PoA = n) - In any NE a non-receptive player i, can’t improve by switching to his path P i OPT in OPT. Cost of non-receptive players at the end of Phase 2

Fair Cost Sharing If only a constant fraction  of the players follow the advice, then we get within O(1/  ) of the PoS. (PoS = log(n), PoA = n) - In any NE a non-receptive player i, can’t improve by switching to his path P i OPT in OPT. Cost of non-receptive players at the end of Phase 2 - Calculate total cost of these guaranteed options. Rearrange sum...

Fair Cost Sharing If only a constant fraction  of the players follow the advice, then we get within O(1/  ) of the PoS. (PoS = log(n), PoA = n) Cost of non-receptive players at the end of Phase 2 Cost of receptive players at the end of Phase 2

Fair Cost Sharing If only a constant fraction  of the players follow the advice, then we get within O(1/  ) of the PoS. (PoS = log(n), PoA = n) Cost of non-receptive players at the end of Phase 2 Use: X ~ Bi(n,p) Cost of receptive players at the end of Phase 2

Fair Cost Sharing If only a constant fraction  of the players follow the advice, then we get within O(1/  ) of the PoS. (PoS = log(n), PoA = n) Expected total cost at the end of Phase 2: O(OPT/  ). In Phase 3, potential argument shows behavior cannot get worse by more than an additional log(n) factor.

Cost Sharing, Extension - Still get same guarantee, but proof is trickier + linear delays: Problem: can’t argue as if remaining NR players didn’t exist since they add to delays

Cost Sharing, Extension - Still get same guarantee, but proof is trickier - Shadow game wrt non-receptieve players: pure linear latency fns. Offset defined by equilib. at end of phase 2. # users on e at end of phase 2 - This game has good PoA (5/2). + linear delays:

Cost Sharing, Extension - Still get same guarantee, but proof is trickier - Shadow game: pure linear latency fns - Behavior of NR at end of phase 2 is equilib for this game too. - Show Cost of the of nonreceptive players at the end of step 2: O(OPT/  ). + linear delays:

Cost Sharing, Extension - Still get same guarantee, but proof is trickier Need to still argue about the cost of the receptive players. Edge by edge charging: - more receptive players, loose a factor of two compared to OPT - more non-receptive players, already paid for, loose a factor of two Cost of the of nonreceptive players at the end of step 2: O(OPT/  ). + linear delays:

Party affiliation games Given graph G, each edge labeled + or -. Vertices have two actions: RED or BLUE. Pay 1 for each + edge with endpoints of different color, and each – edge with endpoints of same color. Special cases: + + + -- All + edges is consensus game. All – edges is cut-game.

Party affiliation games OPT is an equilibrium so PoS = 1. But even for consensus games, PoA =  (n 2 ) Clique with perfect matching removed all edges labeled plus

Party affiliation games (PoS = 1, PoA =  (n 2 )) - Threshold behavior: for  > ½, can get ratio O(1), but for  < ½, ratio stays  (n 2 ). (assume degrees  (log n)). - Same example as for consensus PoA, but sparser across cut. (lower bound) Degree ° n/8 across cut, ° =1/2- ®

Party affiliation games (PoS = 1, PoA =  (n 2 )) - Threshold behavior: for  > ½, can get ratio O(1), but for  < ½, ratio stays  (n 2 ). (assume degrees  (log n)). - Same example as for consensus PoA, but sparser across cut. For large n, whp all nodes have at most a 1/2- ° /2 fraction on neighbs in R Initially, each node has a ° /4 fraction on nodes of the other color. So, players “locked” into place Degree ° n/8 across cut, ° =1/2- ® (lower bound)

Party affiliation games (upper bound, consensus games) - Advertising strategy = follow OPT, e.g. all red. - By Hoeffding, all nodes with degree log n/( ® -1/2) 2 have more than half of their neighbors in the set R, with prob. 1-1/n. - At the end of step two, all nodes are red. (PoS = 1, PoA =  (n 2 )) - Threshold behavior: for  > ½, can get ratio O(1), but for  < ½, ratio stays  (n 2 ). (assume degrees  (log n)). Note: for general cut games, OPT might not have zero cost for each player.

Party affiliation games - Split nodes into those incurring low-cost vs those incurring high-cost under OPT. (upper bound, general party affiliation games) - Advertising strategy = follow OPT. - Show that low-cost will switch to behavior in OPT. For high-cost, don’t care. - Cost only improves in final best-response process. (PoS = 1, PoA =  (n 2 )) - Threshold behavior: for  > ½, can get ratio O(1), but for  < ½, ratio stays  (n 2 ). (assume degrees  (log n)).

Party affiliation games S is a ¯ -dominating if every vertex not in S has more than a ½+ ¯ fraction of neighbs in S. If ® > ½+2 ¯, then set R of receptive players is ¯ -dominating whp (PoS = 1, PoA =  (n 2 )) - Threshold behavior: for  > ½, can get ratio O(1), but for  < ½, ratio stays  (n 2 ). (assume degrees  (log n)). Split nodes into those incurring low-cost (less than a ¯ -fraction of incident edges incur a cost in OPT) vs those incurring high- cost under OPT. Low-cost will switch to behavior in OPT. For high-cost, can only incur a cost of only 1/ ¯ more their cost in OPT. (upper bound, general party affiliation games)

Summary Analyze ability of a central authority to guide behavior to a good equilibrium even if only ® fraction of players are paying attention.

Influencing Dynamics Play Best Response Play the Advertised Behavior Each player has a few abstract actions. Expert 1Expert 2 Uses a learning, experts based alg. to decide which one to use A more adaptive model [Balcan Blum Mansour, ICS 2010] [no rigid separation between receptive vs non-receptive players]

Open Questions Get around problem of natural dynamics converging to poor equilibrium without central authority by giving players more information about the game?

Connections between Learning Theory, Game Theory, and Optimization Maria Florina (Nina) Balcan Lecture 14, October 7 th 2010.

Similar presentations

Presentation on theme: "Connections between Learning Theory, Game Theory, and Optimization Maria Florina (Nina) Balcan Lecture 14, October 7 th 2010."— Presentation transcript:

Similar presentations

About project

Feedback

Log in

Auth with social network:

Connections between Learning Theory, Game Theory, and Optimization Maria Florina (Nina) Balcan Lecture 14, October 7 th 2010.

Similar presentations

Presentation on theme: "Connections between Learning Theory, Game Theory, and Optimization Maria Florina (Nina) Balcan Lecture 14, October 7 th 2010."— Presentation transcript:

Similar presentations

About project

Feedback