Stochastic Zero-sum and Nonzero-sum ω-regular Games: A Survey of Results. Krishnendu Chatterjee, Chess Review, May 11, 2005.

Presentation transcript:

Stochastic Zero-sum and Nonzero-sum ω-regular Games: A Survey of Results. Krishnendu Chatterjee, Chess Review, May 11, 2005

5/11/05 2 Outline 1. Stochastic games: informal descriptions. 2. Classes of game graphs. 3. Objectives. 4. Strategies. 5. Outline of results. 6. Open Problems.

5/11/05 4 Stochastic Games: Games played on game graphs with stochastic transitions, introduced by Shapley [Sha53]. A framework for modeling interaction between components and agents, e.g., controller vs. system.

5/11/05 5 Stochastic Games: Where (arena): game graphs. What for: ω-regular objectives. How: strategies.

5/11/05 6 Game Graphs: Two broad classes. Turn-based games: players make moves in turns. Concurrent games: players make moves simultaneously and independently.

5/11/05 7 Classification of Games: Games fall into two broad categories. Zero-sum games are strictly competitive, e.g., matrix games. Nonzero-sum games are not strictly competitive, e.g., bimatrix games.

5/11/05 8 Goals: Determinacy: minmax and maxmin values for zero-sum games. Equilibrium: existence of equilibrium payoffs for nonzero-sum games. Computational issues. Strategy classification: the simplest class of strategies that suffices for determinacy and equilibrium.

5/11/05 9 Outline 1. Stochastic games: informal descriptions. 2. Classes of game graphs. 3. Objectives. 4. Strategies. 5. Outline of results. 6. Open Problems.

Turn-based Games

5/11/05 11 Turn-based Probabilistic Games: A turn-based probabilistic game is defined as $G = (V, E, (V_1, V_2, V_0))$, where $(V, E)$ is a graph and $(V_1, V_2, V_0)$ is a partition of $V$. At vertices in $V_1$, player 1 makes moves. At vertices in $V_2$, player 2 makes moves. At vertices in $V_0$, successors are chosen randomly.
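The definition above can be written down directly as a small data structure. The following is only a sketch: the class name `TurnBasedGame`, the field names, and the requirement that every vertex has at least one successor are assumptions made for illustration and do not come from the slides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TurnBasedGame:
    """G = (V, E, (V1, V2, V0)); V is taken to be the key set of `edges`."""
    edges: dict      # vertex -> list of successor vertices (the edge relation E)
    v1: frozenset    # vertices where player 1 moves
    v2: frozenset    # vertices where player 2 moves
    v0: frozenset    # vertices where the successor is chosen randomly

    def __post_init__(self):
        vertices = set(self.edges)
        parts = [self.v1, self.v2, self.v0]
        # A partition covers V and its parts are pairwise disjoint.
        covers = set().union(*parts) == vertices
        disjoint = sum(len(p) for p in parts) == len(vertices)
        if not (covers and disjoint):
            raise ValueError("(V1, V2, V0) is not a partition of V")
        # Assumption: every vertex has a successor, so plays never get stuck.
        for v, succs in self.edges.items():
            if not succs or not set(succs) <= vertices:
                raise ValueError(f"bad successor list at {v!r}")

    def owner(self, v):
        """Return 1 or 2 for a player vertex and 0 for a random vertex."""
        return 1 if v in self.v1 else 2 if v in self.v2 else 0


# Hypothetical three-vertex example.
game = TurnBasedGame(
    edges={"a": ["b", "r"], "b": ["a"], "r": ["a", "b"]},
    v1=frozenset({"a"}), v2=frozenset({"b"}), v0=frozenset({"r"}),
)
```

Setting `v0` to the empty set gives a turn-based deterministic game, and setting `v2` to the empty set gives an MDP.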

5/11/05 12 A Turn-based Probabilistic Game

5/11/05 13 Special Cases: Turn-based deterministic games: $V_0 = \emptyset$. No randomness, deterministic transitions. Markov decision processes (MDPs): $V_2 = \emptyset$. No adversary.

5/11/05 14 Applications: MDPs (1½-player games): control in the presence of uncertainty; games against nature. Turn-based deterministic games (2-player games): control in the presence of an adversary, control in an open environment, or controller synthesis; games against an adversary. Turn-based stochastic games (2½-player games): control in the presence of an adversary and nature, controller synthesis for stochastic reactive systems; games against an adversary and nature.

5/11/05 15 Game played: A token is placed on an initial vertex. If the current vertex is a player 1 vertex, player 1 chooses the successor. If it is a player 2 vertex, player 2 chooses the successor. If it is a random vertex, the token proceeds to a successor chosen uniformly at random. The play generates an infinite sequence of vertices.
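As an illustration of how a play unfolds, here is a minimal sketch that generates a finite prefix of a play. It assumes both players use pure memoryless strategies, given as plain dictionaries; the function name `play_prefix`, its arguments, and the example graph are hypothetical.

```python
import random


def play_prefix(edges, owner, sigma, pi, start, steps, rng=None):
    """Move the token `steps` times and return the visited vertices.

    edges: vertex -> list of successors; owner: vertex -> 1, 2, or 0 (random).
    sigma, pi: pure memoryless strategies (vertex -> chosen successor).
    """
    rng = rng or random.Random()
    play = [start]
    for _ in range(steps):
        v = play[-1]
        if owner[v] == 1:
            nxt = sigma[v]               # player 1 chooses the successor
        elif owner[v] == 2:
            nxt = pi[v]                  # player 2 chooses the successor
        else:
            nxt = rng.choice(edges[v])   # uniform choice at a random vertex
        if nxt not in edges[v]:
            raise ValueError(f"{nxt!r} is not a successor of {v!r}")
        play.append(nxt)
    return play


# Hypothetical example graph.
edges = {"a": ["b", "r"], "b": ["a"], "r": ["a", "b"]}
owner = {"a": 1, "b": 2, "r": 0}
prefix = play_prefix(edges, owner, {"a": "r"}, {"b": "a"}, "a", 20, random.Random(0))
```

A real play is infinite; the sketch only produces the first `steps` moves.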

Concurrent Games

5/11/05 17 Concurrent game: Players make moves simultaneously. Finite set of states $S$. Finite set of actions $A$. Action assignments $\Gamma_1, \Gamma_2 : S \to 2^A \setminus \emptyset$. Probabilistic transition function $\delta(s, a_1, a_2)(t) = \Pr[\, t \mid s, a_1, a_2 \,]$.
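One step of a concurrent game can be sketched as follows. When both players randomize independently over their actions, the probability of reaching state t is the sum, over action pairs, of the product of the two action probabilities and the transition probability. The dictionary encoding and the concrete transitions below are assumptions chosen for illustration; they are not the transitions of the figure in the deck.

```python
def successor_distribution(delta, s, xi1, xi2):
    """Distribution over next states when the players mix independently.

    delta: (state, a1, a2) -> {next_state: probability}
    xi1, xi2: action -> probability, for player 1 and player 2 at state s.
    """
    dist = {}
    for a1, p1 in xi1.items():
        for a2, p2 in xi2.items():
            for t, p in delta[(s, a1, a2)].items():
                dist[t] = dist.get(t, 0.0) + p1 * p2 * p
    return dist


# Hypothetical transitions at s0: actions a, b for player 1 and c, d for player 2.
delta = {
    ("s0", "a", "c"): {"s0": 1.0},
    ("s0", "b", "d"): {"s0": 1.0},
    ("s0", "a", "d"): {"s1": 1.0},
    ("s0", "b", "c"): {"s2": 1.0},
}
uniform = successor_distribution(delta, "s0", {"a": 0.5, "b": 0.5}, {"c": 0.5, "d": 0.5})
```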

5/11/05 18 Concurrent game: [Figure: a concurrent game with states $s_0$, $s_1$, $s_2$ and edges labeled ac,bd; ad; bc.] Actions at $s_0$: a, b for player 1; c, d for player 2.

5/11/05 19 Concurrent games Games with simultaneous interaction. Model synchronous interaction.

5/11/05 20 Stochastic games: [Figure: classes of stochastic games: 1½-player, 2-player, 2½-player, and concurrent games.]

5/11/05 21 Outline 1. Stochastic games: informal descriptions. 2. Classes of game graphs. 3. Objectives. 4. Strategies. 5. Outline of results. 6. Open problems.

Objectives

5/11/05 23 Plays: A play is an infinite sequence of vertices, or infinite trajectory. $V^\omega$: the set of all infinite plays or infinite trajectories.

5/11/05 24 Objectives: Plays are infinite sequences of vertices. An objective is a subset of plays, $\Phi_1 \subseteq V^\omega$. A play is winning for player 1 if it is in $\Phi_1$. Zero-sum game: $\Phi_2 = V^\omega \setminus \Phi_1$.

5/11/05 25 Reachability and Safety: Let $R \subseteq V$ be a set of target vertices. The reachability objective requires that the set $R$ is visited. Let $S \subseteq V$ be a set of safe vertices. The safety objective requires that no vertex outside $S$ is ever visited.

5/11/05 26 Büchi Objective: Let $B \subseteq V$ be a set of Büchi vertices. The Büchi objective requires that the set $B$ is visited infinitely often.

5/11/05 27 Rabin-Streett: Let $\{(E_1, F_1), (E_2, F_2), \ldots, (E_d, F_d)\}$ be a set of pairs of vertex sets. Rabin: requires that there is a pair $(E_j, F_j)$ such that $E_j$ is visited finitely often and $F_j$ is visited infinitely often. Streett: requires that for every pair $(E_j, F_j)$, if $F_j$ is visited infinitely often then $E_j$ is visited infinitely often. Rabin-chain: objectives that are both Rabin and Streett, a complementation-closed subclass of Rabin objectives.
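These objectives can be checked mechanically on an ultimately periodic play, that is, a finite prefix followed by a cycle repeated forever. On such a play the vertices visited infinitely often are exactly those on the cycle. The sketch below relies on that restriction; general plays need not have this shape, and the function names are chosen here for illustration.

```python
def reachability(prefix, cycle, targets):
    """Some vertex of the target set is visited."""
    return any(v in targets for v in list(prefix) + list(cycle))


def safety(prefix, cycle, safe):
    """No vertex outside the safe set is ever visited."""
    return all(v in safe for v in list(prefix) + list(cycle))


def buchi(prefix, cycle, b):
    """The set b is visited infinitely often, i.e. it meets the cycle."""
    return bool(set(cycle) & set(b))


def rabin(prefix, cycle, pairs):
    """Some pair (E, F) has E visited finitely often and F infinitely often."""
    inf = set(cycle)
    return any(not (inf & set(e)) and bool(inf & set(f)) for e, f in pairs)


def streett(prefix, cycle, pairs):
    """For every pair (E, F): F infinitely often implies E infinitely often."""
    inf = set(cycle)
    return all(not (inf & set(f)) or bool(inf & set(e)) for e, f in pairs)


# Hypothetical play a (b c)^omega.
prefix, cycle = ["a"], ["b", "c"]
pairs = [({"a"}, {"b"})]
```

For the same list of pairs, the Streett condition as stated is the complement of the Rabin condition, so exactly one of `rabin` and `streett` holds on any such play.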

5/11/05 28 Objectives: ω-regular: $\cup$, $\circ$, $*$, $\omega$. Safety, reachability, liveness, etc. Rabin and Streett objectives are canonical ways to express ω-regular objectives. [Figure: ω-regular objectives inside Borel objectives.]

5/11/05 29 Outline 1. Stochastic games: informal descriptions. 2. Classes of game graphs. 3. Objectives. 4. Strategies. 5. Outline of results. 6. Open problems.

Strategies

5/11/05 31 Strategy: Given a finite sequence of vertices that represents the history of the play, a strategy $\sigma$ for player 1 chooses a probability distribution over the set of successors: $\sigma : V^* \cdot V_1 \to D$.

5/11/05 32 Subclasses of Strategies: Memoryless (stationary) strategies: the strategy is independent of the history of the play and depends only on the current vertex, $\sigma : V_1 \to D$. Pure strategies: the strategy chooses a single successor rather than a probability distribution. Pure-memoryless strategies: both pure and memoryless (the simplest class).
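The difference between these classes can be made concrete by treating a strategy as a function from histories to distributions. This is only a sketch: the encoding of a distribution as a dictionary from successors to probabilities, and the names `memoryless`, `is_pure`, and `alternate`, are assumptions made for illustration.

```python
def memoryless(choice):
    """Lift a map from player 1 vertices to distributions into a strategy on histories."""
    def strategy(history):
        return choice[history[-1]]   # looks only at the current vertex
    return strategy


def is_pure(strategy, history):
    """A pure choice puts probability 1 on a single successor."""
    return max(strategy(history).values()) == 1


def alternate(history):
    """Pure but not memoryless: at vertex 'a', alternate between 'b' and 'c'."""
    visits = history.count("a")
    return {"b": 1} if visits % 2 == 1 else {"c": 1}


# Hypothetical randomized memoryless strategy at a player 1 vertex 'a'.
coin = memoryless({"a": {"b": 0.5, "c": 0.5}})
```

Here `coin` is memoryless but not pure, while `alternate` is pure but needs one bit of memory.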

5/11/05 33 Strategies: The sets of strategies: $\Sigma$ is the set of strategies $\sigma$ for player 1; $\Pi$ is the set of strategies $\pi$ for player 2.

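5/11/05 34 Values: Given objectives $\Phi_1$ and $\Phi_2 = V^\omega \setminus \Phi_1$, the values for the players are $v_1(\Phi_1)(v) = \sup_{\sigma \in \Sigma} \inf_{\pi \in \Pi} \Pr_v^{\sigma,\pi}(\Phi_1)$ and $v_2(\Phi_2)(v) = \sup_{\pi \in \Pi} \inf_{\sigma \in \Sigma} \Pr_v^{\sigma,\pi}(\Phi_2)$.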

5/11/05 35 Determinacy: $v_1(\Phi_1)(v) + v_2(\Phi_2)(v) = 1$. Determinacy means sup inf = inf sup, as in von Neumann's minmax theorem for matrix games.
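For the matrix-game case mentioned above, the equality sup inf = inf sup can be checked by hand on a 2x2 game. The sketch below assumes integer payoffs to the row player and uses the fact that the minimum of two linear functions of the mixing probability is maximized at an endpoint or where the two lines cross. The function names and the example matrices are illustrative only.

```python
from fractions import Fraction


def maxmin(m):
    """sup over row mixes p of the inf over columns, for a 2x2 integer matrix m."""
    (a, b), (c, d) = m
    candidates = [Fraction(0), Fraction(1)]
    denom = (a - c) - (b - d)
    if denom != 0:
        p = Fraction(d - c, denom)       # where the two column payoffs cross
        if 0 <= p <= 1:
            candidates.append(p)
    return max(min(p * a + (1 - p) * c, p * b + (1 - p) * d) for p in candidates)


def minmax(m):
    """inf over column mixes of the sup over rows, via the negated transpose."""
    (a, b), (c, d) = m
    return -maxmin([[-a, -c], [-b, -d]])


# Payoff 1 means the row player wins, 0 means the column player wins.
pennies = [[1, 0], [0, 1]]
v1 = maxmin(pennies)        # value for the row player
v2 = 1 - minmax(pennies)    # value for the column player, who wins with the complement
```

In this example both values are 1/2, so they sum to 1, which is the determinacy statement specialized to a one-shot game.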

5/11/05 36 Optimal strategies: A strategy $\sigma$ is optimal for objective $\Phi_1$ if $v_1(\Phi_1)(v) = \inf_{\pi} \Pr_v^{\sigma,\pi}(\Phi_1)$. The definition for player 2 is analogous.

5/11/05 37 Zero-sum and nonzero-sum games: Zero-sum: $\Phi_2 = V^\omega \setminus \Phi_1$. Nonzero-sum: the players have objectives $\Phi_1$ and $\Phi_2$, and each is happy with achieving its own goal.

5/11/05 38 Concept of rationality Zero sum game: Determinacy. Nonzero sum game: Nash equilibrium.

5/11/05 39 Nash Equilibrium: A pair of strategies $(\sigma_1, \sigma_2)$ is an ε-Nash equilibrium if for all $\sigma'_1, \sigma'_2$: $\mathrm{Value}_2(\sigma_1, \sigma'_2) \le \mathrm{Value}_2(\sigma_1, \sigma_2) + \varepsilon$ and $\mathrm{Value}_1(\sigma'_1, \sigma_2) \le \mathrm{Value}_1(\sigma_1, \sigma_2) + \varepsilon$. Neither player gains more than ε by deviating from the equilibrium strategy. A 0-Nash equilibrium is called a Nash equilibrium. Nash's theorem guarantees the existence of a Nash equilibrium in nonzero-sum matrix games.
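The two inequalities can be tested directly in a bimatrix game, the one-shot setting of Nash's theorem. The sketch below assumes exact rational payoffs and checks only pure deviations, which suffices because a player's expected payoff is linear in its own mixed strategy. The function name `is_eps_nash` and the matching-pennies payoffs are chosen here for illustration.

```python
from fractions import Fraction


def is_eps_nash(A, B, x, y, eps):
    """Check the two deviation inequalities for the mixed profile (x, y).

    A[i][j], B[i][j]: payoffs of player 1 and player 2 for row i, column j.
    x, y: probability vectors over rows and columns.
    """
    rows, cols = range(len(A)), range(len(A[0]))
    value1 = sum(x[i] * y[j] * A[i][j] for i in rows for j in cols)
    value2 = sum(x[i] * y[j] * B[i][j] for i in rows for j in cols)
    # Best payoff each player can get by deviating to a pure strategy.
    best1 = max(sum(y[j] * A[i][j] for j in cols) for i in rows)
    best2 = max(sum(x[i] * B[i][j] for i in rows) for j in cols)
    return best1 <= value1 + eps and best2 <= value2 + eps


# Matching pennies as a bimatrix game.
A = [[1, -1], [-1, 1]]
B = [[-1, 1], [1, -1]]
half = [Fraction(1, 2), Fraction(1, 2)]
pure = [Fraction(1), Fraction(0)]
```

With these payoffs the uniform profile is a 0-Nash equilibrium, while the pure profile `(pure, pure)` is only an ε-Nash equilibrium for ε at least 2.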

5/11/05 40 Computational Issues Algorithms to compute values in games. Identify the simplest class of strategies that suffices for optimality or equilibrium.

5/11/05 41 Outline 1. Stochastic games: informal descriptions. 2. Classes of game graphs. 3. Objectives. 4. Strategies. 5. Outline of results. 6. Open problems.

Outline of results

5/11/05 43 History and results: MDPs. Complexity of MDPs [PapTsi89]. MDPs with ω-regular objectives [CouYan95, deAl97].

5/11/05 44 History and results: Two-player games. Determinacy (sup inf = inf sup) theorem for Borel objectives [Mar75]. Finite-memory determinacy (i.e., finite-memory optimal strategies exist) for ω-regular objectives [GurHar82]. Pure memoryless optimal strategies exist for Rabin objectives, and the problem is NP-complete [EmeJut88].

5/11/05 45 History and results: 2½-player games. Reachability objectives [Con92]: pure memoryless optimal strategies exist; the decision problem is in NP ∩ coNP.
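Reachability values in a 2½-player game can be approximated by value iteration: player 1 maximizes, player 2 minimizes, and random vertices average uniformly over their successors. This is a sketch of that standard iteration, not the NP ∩ coNP procedure of [Con92]; the iterates approach the values from below and, on graphs with cycles, reach them only in the limit. The function name and the example graph are hypothetical.

```python
def reach_values(edges, owner, targets, iterations=1000):
    """Approximate, for each vertex, the value of reaching `targets`.

    edges: vertex -> list of successors; owner: vertex -> 1, 2, or 0 (random).
    """
    val = {v: 1.0 if v in targets else 0.0 for v in edges}
    for _ in range(iterations):
        new = {}
        for v, succs in edges.items():
            if v in targets:
                new[v] = 1.0
            elif owner[v] == 1:
                new[v] = max(val[t] for t in succs)               # player 1 maximizes
            elif owner[v] == 2:
                new[v] = min(val[t] for t in succs)               # player 2 minimizes
            else:
                new[v] = sum(val[t] for t in succs) / len(succs)  # uniform average
        val = new
    return val


# Hypothetical game: target t, sink s, random vertex r, player vertices u and w.
edges = {"u": ["r", "s"], "w": ["t", "r"], "r": ["t", "s"], "t": ["t"], "s": ["s"]}
owner = {"u": 1, "w": 2, "r": 0, "t": 0, "s": 0}
values = reach_values(edges, owner, {"t"})
```

In this example the best move for player 1 at `u` and for player 2 at `w` is the same at every visit, in line with the statement that pure memoryless optimal strategies exist.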

5/11/05 46 History and results: Concurrent zero-sum games. Detailed analysis of concurrent games [FilVri97]. Determinacy theorem for all Borel objectives [Mar98]. Concurrent ω-regular games: Reachability objectives [deAlHenKup98]. Rabin-chain objectives [deAlHen00]. Rabin-chain objectives [deAlMaj01].

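5/11/05 47 Zero sum games: [Table: game classes (1½-player, 2-player, 2½-player, concurrent) against objective classes (Borel, ω-regular), with citations CY95, dAl97, Mar75, GH82, EJ88, Mar98, dAM01, dAH00 placed in the cells.]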

5/11/05 48 Zero sum games: 2½-player games with Rabin and Streett objectives [CdeAlHen 05a]. Pure memoryless optimal strategies exist for Rabin objectives in 2½-player games. Deciding 2½-player games with Rabin objectives is NP-complete. Deciding 2½-player games with Streett objectives is coNP-complete.

5/11/05 49 Zero sum games: [Figure: 2½-player games with Rabin objectives, related to 2-player games with Rabin objectives [EmeJut88] along the game-graph axis and to 2½-player games with reachability objectives [Con92] along the objectives axis.]

5/11/05 50 Zero-sum games: [Table: Reachability, 2½-player: pure memoryless (PM) [Con92]. Rabin, 2-player: PM, NP-complete [EJ88]. Rabin, 2½-player: PM, NP-complete.]

5/11/05 51 Zero sum games: Concurrent games with parity objectives. Infinite-memory strategies are required even for Büchi objectives [deAlHen00]. There are polynomial witnesses for infinite-memory strategies and a polynomial-time verification procedure. Complexity: NP ∩ coNP [CdeAlHen 05b].

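5/11/05 52 Zero sum games: [Table: game classes (1½-player, 2-player, 2½-player, concurrent) against objective classes (Borel, ω-regular), with citations CY98, dAl97, Mar75, GH82, EJ88, Mar98, dAM01, dAH00 placed in the cells.]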

5/11/05 53 Zero sum games: [Table: complexity for ω-regular objectives by game class (1½-player, 2-player, 2½-player, concurrent). Entries: EJ88; dAM01 3EXP, improved to NP, coNP; dAM01 3EXP, improved to NP ∩ coNP.]

5/11/05 54 History: Nonzero-sum Games Two-player nonzero-sum stochastic games with limit-average payoff. [Vie00a, Vie00b] Closed sets (Safety). [SecSud02]

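5/11/05 55 Nonzero sum games: [Table: game classes (n-player concurrent, n-player turn-based, 2-player concurrent) against objective classes (Borel, ω-regular, R, S, limit-average). Entries: Nash [SecSud02]; ε-Nash [Vie00].]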

5/11/05 56 Nonzero sum games: For all n-player concurrent games with reachability objectives for all players, ε-Nash equilibria exist for all ε > 0 in memoryless strategies [CMajJur 04]. For all n-player turn-based stochastic games with Borel objectives for the players, ε-Nash equilibria exist for all ε > 0 in pure strategies [CMajJur 04]. The result strengthens to exact Nash equilibria for n-player turn-based deterministic games with Borel objectives and for n-player turn-based stochastic games with ω-regular objectives.

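5/11/05 57 Nonzero sum games: [Table: game classes (n-player concurrent, n-player turn-based, 2-player concurrent) against objective classes (Borel, ω-regular, R, S, limit-average). Entries: Nash [SecSud02]; ε-Nash [Vie00]; ε-Nash; Nash.]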

5/11/05 58 Nonzero sum games: For 2-player concurrent games with ω-regular objectives for both players, ε-Nash equilibria exist for all ε > 0 [C 05]. There is a polynomial witness and a polynomial-time verification procedure to compute an ε-Nash equilibrium.

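5/11/05 59 Nonzero sum games: [Table: game classes (n-player concurrent, n-player turn-based, 2-player concurrent) against objective classes (Borel, ω-regular, R, S, limit-average). Entries: Nash [SecSud02]; ε-Nash [Vie00]; ε-Nash; Nash; ε-Nash.]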

5/11/05 60 Outline 1. Stochastic games: informal descriptions. 2. Classes of game graphs. 3. Objectives. 4. Strategies. 5. Outline of results. 6. Open Problems.

5/11/05 61 Major open problems: 2-player Rabin-chain games, 2½-player reachability games, and 2½-player Rabin-chain games are in NP ∩ coNP. Is there a polynomial-time algorithm?

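5/11/05 62 Nonzero sum games: [Table: game classes (n-player concurrent, n-player turn-based, 2-player concurrent) against objective classes (Borel, ω-regular, R, S, limit-average). Entries: Nash [SecSud02]; ε-Nash [Vie00]; ε-Nash; Nash; ε-Nash.]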

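5/11/05 63 Nonzero sum games: [Table: game classes (n-player concurrent, n-player turn-based, 2-player concurrent) against objective classes (Borel, ω-regular, R, S, limit-average). Only one entry is shown: ε-Nash.]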

5/11/05 64 Conclusion Stochastic games Rich theory. Communities: Descriptive Set Theory, Stochastic Game Theory, Probability Theory, Control Theory, Optimization Theory, Complexity Theory, Formal Verification …. Several open theoretical problems.

Joint work with Thomas A. Henzinger, Luca de Alfaro, Rupak Majumdar, and Marcin Jurdzinski.

5/11/05 66 References
[Sha53] L.S. Shapley, "Stochastic Games", 1953.
MDPs:
[PapTsi88] C. Papadimitriou and J. Tsitsiklis, "The complexity of Markov decision processes".
[deAl97] L. de Alfaro, "Formal verification of Probabilistic Systems", PhD Thesis, Stanford.
[CouYan95] C. Courcoubetis and M. Yannakakis, "The complexity of probabilistic verification".
Two-player games:
[Mar75] Donald Martin, "Borel Determinacy".
[GurHar82] Yuri Gurevich and Leo Harrington, "Tree automata and games".
[EmeJut88] E.A. Emerson and C. Jutla, "The complexity of tree automata and logic of programs".
2½-player games:
[Con92] A. Condon, "The Complexity of Stochastic Games", 1992.

5/11/05 67 References
Concurrent zero-sum games:
[FilVri97] J. Filar and F. Vrieze, "Competitive Markov Decision Processes", (Book) Springer.
[Mar98] D. Martin, "The determinacy of Blackwell games".
[deAlHenKup98] L. de Alfaro, T.A. Henzinger and O. Kupferman, "Concurrent reachability games", 1998.
[deAlHen00] L. de Alfaro and T.A. Henzinger, "Concurrent ω-regular games".
[deAlMaj01] L. de Alfaro and R. Majumdar, "Quantitative solution of ω-regular games".
Concurrent nonzero-sum games:
[Vie00a] N. Vieille, "Two player Stochastic games I: a reduction".
[Vie00b] N. Vieille, "Two-player Stochastic games II: the case of recursive games".
[SecSud01] P. Secchi and W. Sudderth, "Stay-in-a-set-games", 2001.

5/11/05 68 References
[CJurHen 03] K. Chatterjee, M. Jurdzinski and T.A. Henzinger, "Simple stochastic parity games".
[CJurHen 04] K. Chatterjee, M. Jurdzinski and T.A. Henzinger, "Quantitative stochastic parity games".
[CMajJur 04] K. Chatterjee, R. Majumdar and M. Jurdzinski, "On Nash equilibrium in stochastic games".
[CdeAlHen 05a] K. Chatterjee, L. de Alfaro and T.A. Henzinger, "The complexity of stochastic Rabin and Streett games".
[CdeAlHen 05b] K. Chatterjee, L. de Alfaro and T.A. Henzinger, "The complexity of quantitative concurrent parity games".
[C 05] K. Chatterjee, "Two-player nonzero-sum ω-regular games", 2005.

Thanks !!!