Competitive Safety Analysis: Robust Decision-Making in Multi-Agent Systems
Moshe Tennenholtz
Summarized by Yi Seung-Joon
(c) 2003 SNU Biointelligence Lab.
Introduction
Problem: how to select a proper action in a given environment when facing other agents.
We wish to equip an agent with an action that guarantees some desired outcome without relying on other agents' rationality.
Introduction
Game-theoretic equilibrium analysis
–Can identify the Nash equilibria that may emerge in that setting.
–Adopting the strategy prescribed by a Nash equilibrium may be quite dangerous for our agent, as other agents may fail to choose the strategies prescribed by that equilibrium.
Safety-level strategy
–A robust protocol that guarantees an expected payoff similar to the one obtained in a Nash equilibrium, without relying on other agents' behavior.
Definitions and Notations
Game: a tuple g = ⟨N, {Si}, {Ui}⟩ (i ∈ N), where
N: a set of n players
Si: a finite set of pure strategies available to player i
Ui: the payoff function of player i
Δ(Si): the set of probability distributions over the elements of Si
Definitions and Notations
An element t ∈ Δ(Si) is called a mixed strategy of player i.
Pure strategy: one element of Si is assigned probability 1.
Strictly mixed strategy: all elements of Si are assigned positive probability.
Strategy profile: a tuple t = (t1, …, tn) with ti ∈ Δ(Si) for every i ∈ N.
Ui(t): the expected payoff of player i given the strategy profile t.
Definitions and Notations
Domination and non-reducibility: a game is non-reducible if there do not exist e, f ∈ Si, for some i ∈ N, such that e dominates f.
Generic: a game is generic if different strategies of player i, assuming a fixed strategy profile for the rest of the players, lead to different payoffs.
Nash equilibrium
A strategy profile t is a Nash equilibrium if no player can gain by deviating unilaterally: for every i ∈ N and every t′i ∈ Δ(Si), Ui(t) ≥ Ui(t1, …, ti-1, t′i, ti+1, …, tn).
Pure strategy Nash equilibrium: ti is a pure strategy for every i.
Strictly mixed strategy Nash equilibrium: ti is a strictly mixed strategy for every i.
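
The Nash condition can be checked mechanically for a two-player game. The following is a minimal sketch, not taken from the slides: the function names and the coordination-game payoffs are hypothetical, and it relies on the standard fact that testing deviations to pure strategies is sufficient because expected payoff is linear in a player's own mixture.

```python
from fractions import Fraction

def expected_payoff(U, t1, t2):
    """Expected payoff under matrix U when player 1 mixes with t1, player 2 with t2."""
    return sum(t1[r] * t2[c] * U[r][c]
               for r in range(len(t1)) for c in range(len(t2)))

def pure(k, n):
    """Pure strategy k out of n, written as a probability vector."""
    return [1 if j == k else 0 for j in range(n)]

def is_nash(U1, U2, t1, t2):
    """Check that neither player gains by a unilateral (pure) deviation."""
    v1 = expected_payoff(U1, t1, t2)
    v2 = expected_payoff(U2, t1, t2)
    n_rows, n_cols = len(U1), len(U1[0])
    ok1 = all(expected_payoff(U1, pure(r, n_rows), t2) <= v1 for r in range(n_rows))
    ok2 = all(expected_payoff(U2, t1, pure(c, n_cols)) <= v2 for c in range(n_cols))
    return ok1 and ok2

# Hypothetical coordination game: both players get 1 on a match, 0 otherwise.
U1 = [[1, 0], [0, 1]]
U2 = [[1, 0], [0, 1]]
half = [Fraction(1, 2), Fraction(1, 2)]
print(is_nash(U1, U2, [1, 0], [1, 0]))  # pure strategy profile
print(is_nash(U1, U2, half, half))      # strictly mixed profile
print(is_nash(U1, U2, [1, 0], [0, 1]))  # mismatched profile
```

The first two profiles satisfy the condition and the third does not, since either player would gain by switching to match the other.
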
Safety Level Value
Given a game g and a mixed strategy t ∈ Δ(Si) of player i, the safety level value obtained by i when choosing t in g, denoted val(t, i, g), is the minimal expected payoff that player i may obtain when employing t against arbitrary strategy profiles of the other players.
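
For a two-player game, val(t, i, g) can be computed by scanning the other player's pure strategies. This is a minimal sketch under that two-player assumption; the function name `safety_level_value` and the example payoffs are hypothetical and do not come from the slides.

```python
from fractions import Fraction

def safety_level_value(t, payoff):
    """Return val(t, i, g) for a two-player game.

    t      -- mixed strategy of player i, one probability per own pure strategy
    payoff -- payoff[r][c] is player i's payoff when i plays row r and the
              other player plays column c
    """
    n_cols = len(payoff[0])
    # Expected payoff is linear in the opponent's mixture, so its minimum is
    # reached at one of the opponent's pure strategies (columns).
    return min(
        sum(t[r] * payoff[r][c] for r in range(len(t)))
        for c in range(n_cols)
    )

# Hypothetical 2x2 payoffs for player i (not taken from the slides).
example = [[Fraction(3), Fraction(0)],
           [Fraction(1), Fraction(2)]]
uniform = [Fraction(1, 2), Fraction(1, 2)]
print(safety_level_value(uniform, example))  # column 0 gives 2, column 1 gives 1
```

A safety-level strategy is then any t that maximizes this function over Δ(Si).
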
Safety-Level Strategy
A safety-level strategy (or probabilistic maximin strategy) of player i is a strategy t′ of player i for which val(·, i, g) is maximal, that is, val(t′, i, g) ≥ val(t, i, g) for every t ∈ Δ(Si).
In a Nash equilibrium, every strategy in the support of a player's mixed strategy should lead to the same expected payoff.
The expected payoff of a safety-level strategy should be identical for any strategy of the other player.
Decentralized Load Balancing
Two rational players need to submit messages in a simple communication network: two parallel communication lines e1, e2 connecting nodes s and t.
Each player needs to decide on the route to take.
e1 is the faster line: the value of transmitting a message along e1 is X > 0, while the value of transmitting a message along e2 is αX for 0.5 < α < 1.
If both players choose the same line, the value for each of them drops by a factor of two.
Decentralized Load Balancing
The optimal safety-level value for a player in the decentralized load balancing game equals its expected payoff in the strictly mixed strategy equilibrium of that game.
Nash equilibrium strategy: choose e1 with probability p = (2-α)/(1+α)
Safety-level strategy: choose e1 with probability p = α/(1+α)
The two strategies differ but result in the same expected payoff, 1.5αX/(1+α).
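
The two probabilities above can be checked with exact arithmetic. The sketch below assumes X = 1 and a sample value α = 3/4 (both chosen only for illustration); the helper names `payoff` and `expected` are hypothetical.

```python
from fractions import Fraction

def payoff(own, other, alpha, X=Fraction(1)):
    """Payoff of a player choosing line `own` ('e1' or 'e2') against `other`."""
    base = X if own == "e1" else alpha * X
    # Sharing a line halves the value for both players.
    return base / 2 if own == other else base

def expected(p_own, p_other, alpha):
    """Expected payoff when each player picks e1 with the given probability."""
    total = 0
    for own, po in (("e1", p_own), ("e2", 1 - p_own)):
        for other, pt in (("e1", p_other), ("e2", 1 - p_other)):
            total += po * pt * payoff(own, other, alpha)
    return total

alpha = Fraction(3, 4)                      # assumed sample value in (0.5, 1)
p_nash = (2 - alpha) / (1 + alpha)          # equilibrium probability of e1
p_safe = alpha / (1 + alpha)                # safety-level probability of e1
target = Fraction(3, 2) * alpha / (1 + alpha)

nash_value = expected(p_nash, p_nash, alpha)
safe_vs_e1 = expected(p_safe, 1, alpha)     # opponent always picks e1
safe_vs_e2 = expected(p_safe, 0, alpha)     # opponent always picks e2
print(nash_value, safe_vs_e1, safe_vs_e2, target)
```

With these values all four printed numbers are 9/14: the safety-level strategy earns the same amount whichever line the other player picks, and that amount matches the equilibrium payoff.
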
Leader Election: Decentralized Voting
The players vote on the identity of the player who will take the lead on a particular task.
Players 1 and 2 can each vote for 1 or 2, denoted a1 and a2.
A failure to reach agreement leads to a payoff of 0.
Agreement leads to positive payoffs: (a, b) for players 1 and 2 when both vote a1, and (c, d) when both vote a2.
Leader Election: Decentralized Voting
The optimal safety-level value for a player in the leader election game equals its expected payoff in the strictly mixed strategy equilibrium of that game.
Nash equilibrium strategy
–Player 1: choose a1 with probability p = d/(b+d)
–Player 2: choose a1 with probability q = c/(a+c)
Safety-level strategy
–Player 1: choose a1 with probability p′ = c/(a+c)
–Player 2: choose a1 with probability q′ = d/(b+d)
The strategies differ (p = q′ and q = p′, so each player's safety-level strategy is the other player's equilibrium strategy), but the expected payoffs coincide.
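
The probabilities for player 1 can be rederived in a few lines. This worked sketch assumes the payoff convention implied by the equilibrium formulas: agreement on a1 pays a to player 1 and b to player 2, agreement on a2 pays c and d, and disagreement pays 0. Here p and p′ are player 1's probabilities of voting a1 and q is player 2's.

```latex
\begin{align*}
p\,b &= (1-p)\,d &&\Rightarrow\quad p = \frac{d}{b+d} && \text{(equilibrium: player 2 is indifferent)}\\
q\,a &= (1-q)\,c &&\Rightarrow\quad q = \frac{c}{a+c} && \text{(equilibrium: player 1 is indifferent)}\\
p'\,a &= (1-p')\,c &&\Rightarrow\quad p' = \frac{c}{a+c} && \text{(safety level: player 1's payoff ignores player 2's vote)}\\
U_1(p, q) &= q\,a = \frac{ac}{a+c}, && \mathrm{val}(p', 1, g) = p'\,a = \frac{ac}{a+c}
\end{align*}
```

Player 1 therefore secures ac/(a+c) either way; the same argument with b and d in place of a and c applies to player 2.
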
Safety Level in General 2x2 Games
Non-reducible generic 2x2 games
Theorem: Let G be a 2x2 non-reducible generic game. If the optimal safety level value of a player is obtained by a strictly mixed strategy, then this value coincides with the expected payoff of that player in a Nash equilibrium of G.
Beyond 2x2 Games
Set-theoretic games: the sets of strategies available to the players are identical, and the payoff of each player is uniquely determined by the set of strategies selected by each player.
Given a 2-person set-theoretic game g with a strictly mixed strategy Nash equilibrium, the value of an optimal safety-level strategy of a player equals its expected payoff in that equilibrium.
Competitive Safety Strategies
Let S be a set of strategies. Consider a family of games (g1, g2, …, gj) where i is a player in each of them, its set of strategies in each of these games is S, and there are j players in addition to i in gj.
A mixed strategy t ∈ Δ(S) is called a C-competitive strategy if there exists some constant C > 0 such that, as j grows, player i's expected payoff in a Nash equilibrium of gj is at most C times val(t, i, gj).
Competitive Safety Strategies
Extended decentralized load-balancing setting: n players submit their messages along e1 and e2.
The payoff for a player choosing e1 is X/k, and the payoff for choosing e2 is αX/k, where k is the number of players using that line.
There exists a 9/8-competitive safety strategy for the extended decentralized load-balancing setting.
Competitive Safety Analysis in Bayesian Games
Games with incomplete information
First-price auction
–A good g is put up for sale, and there are n potential buyers.
–Each buyer has a private valuation (maximal willingness to pay) for g that is drawn from a uniform distribution on the interval of real numbers [0,1].
–The distribution of agent valuations is commonly known.
–Each potential buyer is asked to submit a bid for the good g. We assume that the bid of a buyer with valuation v is a number in the interval [0,v].
–The good is allocated to the bidder with the highest bid.
–If a player wins and pays p, its payoff is v-p. Otherwise its payoff is 0.
Competitive Safety Analysis in Bayesian Games
Bayesian game
Equilibrium: each buyer with valuation v bids p = (1-1/n)v, for an expected payoff of v^n/n.
Safety-level strategy: the same bid p = (1-1/n)v, which guarantees an expected payoff approaching v^n/(ne) as n approaches infinity.
There exists an e-competitive strategy for the first-price auction setup.
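
The ratio between the equilibrium payoff and the guaranteed payoff can be tabulated numerically. This is a sketch under one stated assumption: the worst case for the bidder is that every other buyer bids its full valuation, so the bid (1-1/n)v wins only when all n-1 other valuations fall below it. The function names and the sample valuation 0.9 are hypothetical.

```python
import math

def equilibrium_value(v, n):
    """Expected payoff v**n / n when all n bidders bid (1 - 1/n) * valuation."""
    return v ** n / n

def safety_value(v, n):
    """Guaranteed expected payoff of bidding (1 - 1/n) * v.

    Assumed worst case: every other bidder bids its full valuation, so we
    win only if all n - 1 other uniform valuations fall below our bid.
    """
    bid = (1 - 1 / n) * v
    return bid ** (n - 1) * (v - bid)

for n in (2, 10, 100, 1000):
    ratio = equilibrium_value(0.9, n) / safety_value(0.9, n)
    print(n, ratio)

print("e =", math.e)
```

Under this assumption the ratio equals (1-1/n)^-(n-1), which does not depend on v and rises toward e from below as n grows, in line with the e-competitive claim.
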
Discussion
To build robust protocols, relying on standard equilibrium analysis might not be satisfactory, and safety guarantees are required.
The result that a safety-level strategy may yield the value of a Nash equilibrium in games that are not zero-sum provides a powerful normative tool for computer scientists and AI researchers interested in protocols for non-cooperative environments.