Playing Games for Security: An Efficient Exact Algorithm for Solving Bayesian Stackelberg Games Praveen Paruchuri, Jonathan P. Pearce, Sarit Kraus Catherine.

Slides:

Advertisements

Similar presentations

N. Basilico, N. Gatti, F. Amigoni DEI, Politecnico di Milano Leader-Follower Strategies for Robotic Patrolling in Environments with Arbitrary Topology.

Advertisements

S. Ceppi, N. Gatti, and N. Basilico DEI, Politecnico di Milano Computing Bayes-Nash Equilibria through Support Enumeration Methods in Bayesian Two-Player.

Thursday, March 7 Duality 2 – The dual problem, in general – illustrating duality with 2-person 0-sum game theory Handouts: Lecture Notes.

An Introduction to Game Theory Part V: Extensive Games with Perfect Information Bernhard Nebel.

NON - zero sum games.

Nash’s Theorem Theorem (Nash, 1951): Every finite game (finite number of players, finite number of pure strategies) has at least one mixed-strategy Nash.

Game Theory Assignment For all of these games, P1 chooses between the columns, and P2 chooses between the rows.

Continuation Methods for Structured Games Ben Blum Christian Shelton Daphne Koller Stanford University.

1 University of Southern California Keep the Adversary Guessing: Agent Security by Policy Randomization Praveen Paruchuri University of Southern California.

This Segment: Computational game theory Lecture 1: Game representations, solution concepts and complexity Tuomas Sandholm Computer Science Department Carnegie.

Mixed Strategies CMPT 882 Computational Game Theory Simon Fraser University Spring 2010 Instructor: Oliver Schulte.

Stackelberg -leader/follower game 2 firms choose quantities sequentially (1) chooses its output; then (2) chooses it output; then the market clears This.

Game Theoretical Insights in Strategic Patrolling: Model and Analysis Nicola Gatti – DEI, Politecnico di Milano, Piazza Leonardo.

Chapter Twenty-Eight Game Theory. u Game theory models strategic behavior by agents who understand that their actions affect the actions of other agents.

For any player i, a strategy weakly dominates another strategy if (With at least one S -i that gives a strict inequality) strictly dominates if where.

MIT and James Orlin © Game Theory 2-person 0-sum (or constant sum) game theory 2-person game theory (e.g., prisoner’s dilemma)

Network Theory and Dynamic Systems Game Theory: Mixed Strategies

Computing equilibria of security games using linear integer programming Dima Korzhyk

General-sum games You could still play a minimax strategy in general- sum games –I.e., pretend that the opponent is only trying to hurt you But this is.

Extensive-form games. Extensive-form games with perfect information Player 1 Player 2 Player 1 2, 45, 33, 2 1, 00, 5 Players do not move simultaneously.

Security via Strategic Randomization Milind Tambe Fernando Ordonez Praveen Paruchuri Sarit Kraus (Bar Ilan, Israel) Jonathan Pearce, Jansuz Marecki James.

Computing Optimal Randomized Resource Allocations for Massive Security Games Presenter : Jen Hua Chi Advisor : Yeong Sung Lin.

Lecture 1 - Introduction 1.  Introduction to Game Theory  Basic Game Theory Examples  Strategic Games  More Game Theory Examples  Equilibrium  Mixed.

Chapter Twenty-Eight Game Theory. u Game theory models strategic behavior by agents who understand that their actions affect the actions of other agents.

1 Computing Nash Equilibrium Presenter: Yishay Mansour.

6.1 Consider a simultaneous game in which player A chooses one of two actions (Up or Down), and B chooses one of two actions (Left or Right). The game.

Finite Mathematics & Its Applications, 10/e by Goldstein/Schneider/SiegelCopyright © 2010 Pearson Education, Inc. 1 of 68 Chapter 9 The Theory of Games.

1 Algorithms for Computing Approximate Nash Equilibria Vangelis Markakis Athens University of Economics and Business.

Distributed Combinatorial Optimization

DANSS Colloquium By Prof. Danny Dolev Presented by Rica Gonen

A Game-Theoretic Approach to Determining Efficient Patrolling Strategies for Mobile Robots Francesco Amigoni, Nicola Gatti, Antonio Ippedico.

Commitment without Regrets: Online Learning in Stackelberg Security Games Nika Haghtalab Carnegie Mellon University Joint work with Maria-Florina Balcan,

Game Theory for Security: Lessons Learned from Deployed Applications Milind Tambe Chris Kiekintveld Richard John & teamcore.usc.edu.

MAKING COMPLEX DEClSlONS

Chapter 9 Games with Imperfect Information Bayesian Games.

CPS LP and IP in Game theory (Normal-form Games, Nash Equilibria and Stackelberg Games) Joshua Letchford.

Game-theoretic analysis tools Tuomas Sandholm Professor Computer Science Department Carnegie Mellon University.

Linear Programming Erasmus Mobility Program (24Apr2012) Pollack Mihály Engineering Faculty (PMMK) University of Pécs João Miranda

Static Games of Incomplete Information

1 What is Game Theory About? r Analysis of situations where conflict of interests is present r Goal is to prescribe how conflicts can be resolved 2 2 r.

Game Theory Optimal Strategies Formulated in Conflict MGMT E-5070.

Integer LP In-class Prob

Algorithms for solving two-player normal form games

OR Chapter 7. The Revised Simplex Method  Recall Theorem 3.1, same basis  same dictionary Entire dictionary can be constructed as long as we.

Incomplete information: Perfect Bayesian equilibrium

Dynamic games, Stackelburg Cournot and Bertrand

1 Algorithms for Computing Approximate Nash Equilibria Vangelis Markakis Athens University of Economics and Business.

5.1.Static Games of Incomplete Information

Extensive Form (Dynamic) Games With Perfect Information (Theory)

Econ 805 Advanced Micro Theory 1 Dan Quint Fall 2009 Lecture 1 A Quick Review of Game Theory and, in particular, Bayesian Games.

Game Theory Georg Groh, WS 08/09 Verteiltes Problemlösen.

Commitment to Correlated Strategies Vince Conitzer and Dima Korzhyk.

Keep the Adversary Guessing: Agent Security by Policy Randomization

Hello everyone, welcome to this talk about~~~

C. Kiekintveld et al., AAMAS 2009 (USC) Presented by Neal Gupta

Non-additive Security Games

The minimum cost flow problem

Communication Complexity as a Lower Bound for Learning in Games

Commitment to Correlated Strategies

CPS Extensive-form games

Enumerating All Nash Equilibria for Two-person Extensive Games

Lecture 20 Linear Program Duality

Vincent Conitzer Extensive-form games Vincent Conitzer

CPS 173 Extensive-form games

9.3 Linear programming and 2 x 2 games : A geometric approach

Lecture Game Theory.

M9302 Mathematical Models in Economics

Normal Form (Matrix) Games

Presentation transcript:

Playing Games for Security: An Efficient Exact Algorithm for Solving Bayesian Stackelberg Games Praveen Paruchuri, Jonathan P. Pearce, Sarit Kraus Catherine (Ying) Liu, School of Computer Science, University of Waterloo

Outline Introduction Problem Definition DOBSS Approach – Mixed-Integer Quadratic Program – Decomposed MIQP – Arriving at DOBSS: Decomposed MILP Experiments – Experimental Domain – Experimental Results Conclusion Outline, Playing Games for Security

Introduction Introduction, Playing Games for Security

Introduction Stackelberg Game One agent (the leader) must commit to a strategy that can be observed by the other agent (the follower) Bayesian Stackelberg Game Stackelberg Game + Leader’s uncertainty about the types of adversary he may face Introduction, Playing Games for Security

Introduction Introduction, Playing Games for Security Example of Stackelberg Game Security Problem 1. Simultaneous Moves: Nash Equilibrium (a,c)- Leader’s payoff=2 2. Let’s play Stackelberg Game! cd a2,14,0 b1,03,2 Leader’s Committed Strategy Follower’s Pure Strategy Leader’s Payoff Case 1Pure Strategy: bd3 Case 2Mixed Strategy: (a-0.5,b-0.5) d4*0.5+3*0.5=3.5

Our Target To determine the optimal strategy for a leader to commit to in a Bayesian Stackelberg game What is the Problem? Choosing an optimal strategy for the leader to commit to in a Bayesian Stackelberg game is NP-hard! Existing Solutions Idea 1: Harsanyi Transformation Reference: J.C.Harsanyi and R.Selten. A generalized Nash solution for two-person bargaining games with incomplete information. Management Science, 18(5):80-106, Idea 2: MIP-Nash Reference: T. Sandholm, A. Gilpin, and V. Conizer. Mixed-integer programming methods for finding nash equilibria. In AAAI, Idea 3: ASAP Preference: P. Paruchuri, J.P.Pearce, M.Tambe, G.Ordonez, and S.Kraus. An efficient heuristic approach for security against multiple adversaries. In AAMAS, Introduction, Playing Games for Security Introduction

ADVANTAGES of DOBSS [ Compared to Harsanyi Transformation and MIP Nash ] 1. Compact form of Bayesian game 2. Only 1 mixed-integer linear program required to be solved 3. Direct search for an optimal leader strategy rather than a Nash equilibrium Introduction, Playing Games for Security Introduction

Problem Definition Two agents: the leader and the follower Set of possible types for the leader: Set of possible types for the follower: Agent’s set of strategies: Agent’s Utility function Un: Target: Find the optimal mixed strategy for the leader to commit to, given that the follower may know this mixed strategy when choosing his own strategy Problem Definition, Playing Games for Security

DOBSS Mixed-Integer Quadratic Program Decomposed MIQP Arriving at DOBSS: Decomposed MILP DOBSS, Playing Games for Security

Mixed-Integer Quadratic Program Single follower type scenario The Follower A reward-maximizing pure strategy The Leader Mixed strategy that gives the highest payoff, given follower’s strategy REASON DOBSS: Mixed-Integer Quadratic Program, Playing Games for Security cd a2,14,0 b1,03,2

Notions : the proportion of times in which the leader’s pure strategy i is used in the policy X: the index sets of the leader’s pure strategies Q: the index sets of the follower’s pure strategies R: the leader’s payoff matrix : the reward of the leader when the leader takes pure strategy i and the follower takes pure strategy j C: the follower’s payoff matrix : the reward of the follower when the leader takes pure strategy i and the follower takes pure strategy j DOBSS: Mixed-Integer Quadratic Program, Playing Games for Security Mixed-Integer Quadratic Program

The Optimal Problem for the Follower Primal Problem s.t. (1) Dual problem Linear ProgrammingLinear Programming s.t. (2) Complementary Slackness Linear ProgrammingLinear Programming DOBSS: Mixed-Integer Quadratic Program, Playing Games for Security Mixed-Integer Quadratic Program

Dual Problem Every linear programming problem, referred to as a primal problem, can be converted into a dual problem, which provides an upper bound to the optimal value of the primal problem. We can express the Primal problem (P) as: The corresponding Dual problem (D) is: Complementary Slackness Suppose x and y are feasible solutions to (P) and (D). Then x and y are optimal if and only if the following conditions are satisfied: Background Information: Linear Programming, Playing Games for Security Linear Programming

The Optimal Problem for the Leader (4) s.t. Constraints: (1)(4): Enforce a feasible mixed policy for the leader (2)(5): Enforce a feasible pure strategy for the follower (3): Leftmost inequality: Enforces dual feasibility of the follower’s problem Rightmost inequality: Complementary slackness constraint for an optimal pure strategy q for the follower DOBSS: Mixed-Integer Quadratic Program, Playing Games for Security Mixed-Integer Quadratic Program

DOBSS: Decomposed MIQP, Playing Games for Security Notions : a priori probability that a follower of type will appear L: the set of follower types X: the index sets of the leader’s pure strategies Q: the index sets of the follower ’s pure strategies : the leader’s payoff matrix ( ) : the follower’s payoff matrix ( ) Formula (5) s.t. Decomposed MIQP

DOBSS: Decomposed MIQP, Playing Games for Security Example: Entry Deterrence Problem Follower Types Decomposed MIQP Incumbent ExpandDon’t Expand EntrantEnter-1,α1,1 Stay Out0, β0,3 Scenario 1 (prob- 2/3): α=2, β=4Scenario 2 (prob- 1/3): α=-1, β=0 Incumbent is a low cost firm (type )Incumbent is a high cost firm (type )

ExpandDon’t Expand Enter -1,-11,1 Stay Out 0,00,3 ExpandDon’t Expand Enter -1,21,1 Stay Out 0,40,3 Decomposed MIQP Followers’ optimal strategies Incumbent has a dominant strategy: Expand! Don’t Expand! Leader’s Optimal Strategy, given followers’ optimal choices DOBSS: Decomposed MIQP, Playing Games for Security

Question: Does this decomposition cause any suboptimality? Proposition 1. Problem (5) is equivalent to Problem (4) with the payoff matrix from the Harsanyi transformation for a Bayesian Stackelberg game. Decomposed MIQP DOBSS: Decomposed MIQP, Playing Games for Security

Proof of Proposition 1 [Decomposed MIQP] Leader’s optimal strategy: [Harsanyi Transformation] Incumbent has 4 strategies: For the leader: Stay Out Nash Equilibrium: DOBSS: Decomposed MIQP, Playing Games for Security Decomposed MIQP Incumbent ExpandDon’t Expand EntrantEnter-1,α1,1 Stay Out0, β0,3 (Ex, Ex)(Ex, Don’t)(Don’t, Ex)(Don’t, Don’t) Enter-1, (2,-1), (2,1), (1,-1)1, (1,1) Stay Out0, (4,0)0, (4,3)0, (3,0)0, (3,3)

Decomposed MIQP (5) s.t. DOBSS: MILP (7) s.t. Arriving at DOBSS:MILP DOBSS: Arriving at DOBSS-MILP, Playing Games for Security

Proposition 2. Problem (5) and Problem (7) is equivalent Proposition 3. The DOBSS procedure exponentially reduces the problem over the Multiple-LPs approach in the number of adversary types. DOBSS: Arriving at DOBSS-MILP, Playing Games for Security Arriving at DOBSS:MILP

Experiments, Playing Games for Security Experimental Domain Experimental Results Experiments

A Stackelberg game in the experimental domain consisting of: 1. Two players: the security agent, the robber 2. A world consisting of m houses, 1…m 3. The security agent’s set of pure strategies consists of possible routes of d houses to patrol 4. The robber will know the mixed strategy the security agent has chosen Experimental Domain, Playing Games for Security Experimental Domain

Three sets of experiments 1.Comparison with runtimes of the four methods: DOBSS, ASAP, the multiple-LPs method and MIP-Nash 2.Infeasibility issue of ASAP 3.Quality results for ASAP & MIP-Nash Experimental Results, Playing Games for Security Experimental Results

A. Runtime results from two, three and four houses for all the four methods DOBSS, ASAP, the multiple-LPs method and MIP-Nash Experimental Results, Playing Games for Security Experimental Results

A. Runtime results from two, three and four houses for all the four methods Experimental Results, Playing Games for Security DOBSS, ASAP, the multiple-LPs method and MIP-Nash Experimental Results

A. Runtime results from two, three and four houses for all the four methods Experimental Results, Playing Games for Security DOBSS, ASAP, the multiple-LPs method and MIP-Nash Experimental Results

B. Runtimes of DOBSS and ASAP for five to seven houses Speedup: Experimental Results, Playing Games for Security DOBSS, ASAP, the multiple-LPs method and MIP-Nash Experimental Results

1.DOBSS and ASAP outperform the other two procedures with respect to runtimes 2.DOBSS has a faster algorithm runtime than ASAP Conclusion, Playing Games for Security Conclusion

A new game: Bayesian Stackelberg Game Value of the game: Modeling domains involving security (patrolling, setting up checkpoints, network routing, and transportation systems) New Solution: DOBSS Mixed-Integer Quadratic Program  Decomposed MIQP  Decomposed MILP-DOBSS Why DOBSS? a). DOBSS and ASAP outperform the other two procedures with respect to runtimes b). DOBSS has a faster algorithm runtime than ASAP Take-home Message, Playing Games for Security Take-home Message