Uri Zwick Tel Aviv University Simple Stochastic Games Mean Payoff Games Parity Games.

Presentation transcript:

Uri Zwick Tel Aviv University Simple Stochastic Games Mean Payoff Games Parity Games

Zero sum games
[Figure: payoff matrix]
Mixed strategies
Max-min theorem …
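As a small illustration of mixed strategies and the max-min theorem, the sketch below solves a 2x2 zero-sum game exactly. The function name `solve_2x2` and the example matrices are hypothetical and are not taken from the slide, whose matrix did not survive extraction. The row player is assumed to maximize.

```python
from fractions import Fraction


def solve_2x2(a, b, c, d):
    """Value and optimal row mix for the zero-sum game [[a, b], [c, d]].

    The row player maximizes and the column player minimizes.
    Returns (value, p), where p is the probability of playing the first row.
    """
    a, b, c, d = map(Fraction, (a, b, c, d))
    maximin = max(min(a, b), min(c, d))  # best pure guarantee for the row player
    minimax = min(max(a, c), max(b, d))  # best pure guarantee for the column player
    if maximin == minimax:
        # Saddle point: pure strategies are already optimal.
        p = Fraction(1) if min(a, b) >= min(c, d) else Fraction(0)
        return maximin, p
    # No saddle point: mix so that both columns yield the same expected payoff.
    denom = a - b - c + d
    p = (d - c) / denom
    value = (a * d - b * c) / denom
    return value, p


# Hypothetical examples (not the matrix from the slide).
matching_pennies = solve_2x2(1, -1, -1, 1)
skewed_game = solve_2x2(2, -1, -1, 1)
saddle_game = solve_2x2(3, 1, 0, -2)
```

In the first two examples neither player has a pure optimal strategy, so the value is only attained by mixing. The third has a saddle point and needs no randomization.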

Stochastic games [Shapley (1953)]
[Figure: several payoff matrices]
Mixed positional (memoryless) optimal strategies

Simple Stochastic games (SSGs)
[Figure: payoff matrices, each with a single row or column]
Every game has only one row or column
Pure positional (memoryless) optimal strategies

Simple Stochastic games (SSGs): Graphic representation
Vertex types: MAX (M), min (m), RAND (R)
The players construct an (infinite) path e_0, e_1, …
Terminating version
Non-terminating version
Discounted version
Fixed duration games are easily solved using dynamic programming

Simple Stochastic games (SSGs): Graphic representation, example
[Figure: example game graph with a start vertex and MAX, min and RAND vertices]

Simple Stochastic games (SSGs): Reachability version [Condon (1992)]
Vertex types: MAX (M), min (m), RAND (R), 0-sink, 1-sink
Objective: MAX / min the probability of getting to the 1-sink
Technical assumption: the game halts with probability 1
No weights
All probabilities are ½

Simple Stochastic games (SSGs): Basic properties
Every vertex in the game has a value
Both players have positional optimal strategies
Positional strategy for MAX: a choice of an outgoing edge from each MAX vertex
Decision version: Is the value ≥ v?

“Solving” binary SSGs
The values v_i of the vertices of a game are the unique solution of the following equations:
[equations not preserved in the transcript]
Corollary: the decision version is in NP ∩ co-NP
The values are rational numbers requiring only a linear number of bits
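The equations themselves are missing from the transcript. The sketch below assumes they are the usual local optimality conditions for a halting binary reachability SSG: a MAX vertex takes the maximum of its two successors' values, a min vertex the minimum, a RAND vertex their average, and the sinks have values 0 and 1. Under that assumption, the NP ∩ co-NP corollary amounts to guessing a value vector and checking it exactly, as in this hypothetical checker. The game encoding and all names are illustrative.

```python
from fractions import Fraction


def satisfies_equations(game, values):
    """Check a guessed value vector against the (assumed) optimality equations.

    game maps a vertex name to ("MAX" | "MIN" | "RAND", succ1, succ2),
    or to ("SINK", payoff) with payoff 0 or 1.
    """
    for vertex, spec in game.items():
        kind = spec[0]
        if kind == "SINK":
            expected = Fraction(spec[1])
        else:
            x, y = values[spec[1]], values[spec[2]]
            if kind == "MAX":
                expected = max(x, y)
            elif kind == "MIN":
                expected = min(x, y)
            else:  # RAND: both successors have probability 1/2
                expected = (x + y) / 2
        if values[vertex] != expected:
            return False
    return True


# Hypothetical example game (not the one drawn on the slides).
game = {
    "zero": ("SINK", 0),
    "one": ("SINK", 1),
    "r1": ("RAND", "one", "zero"),
    "r2": ("RAND", "r1", "one"),
    "s": ("MAX", "r1", "r2"),
    "m": ("MIN", "r1", "r2"),
}
good_guess = {
    "zero": Fraction(0), "one": Fraction(1),
    "r1": Fraction(1, 2), "r2": Fraction(3, 4),
    "s": Fraction(3, 4), "m": Fraction(1, 2),
}
bad_guess = dict(good_guess, s=Fraction(1, 2))
```

Because the check uses exact rational arithmetic, a certificate of linear bit length can be verified in polynomial time.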

Markov Decision Processes (MDPs)
[Figure: game graph with vertex legend MAX (M), min (m), RAND (R)]
Theorem [Derman (1970)]: Values and optimal strategies of an MDP can be found by solving an LP

NP ∩ co-NP: Another proof
Deciding whether the value of a game is at least (at most) v is in NP ∩ co-NP
To show that the value is ≥ v, guess an optimal strategy σ for MAX
Find an optimal counter-strategy τ for min by solving the resulting MDP
Is the problem in P?
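The certificate check described here can be sketched as follows. Once a MAX strategy is fixed, only min and RAND vertices make choices, so the game is an MDP. The slides solve that MDP with an LP. The sketch below swaps in plain value iteration instead, which only approximates the values and is used here purely for illustration. The game encoding, the function name and the example are hypothetical.

```python
def min_best_response_values(game, sigma, rounds=200):
    """Approximate vertex values when MAX plays sigma and min responds optimally.

    game maps a vertex to ("MAX" | "MIN" | "RAND", succ1, succ2) or ("SINK", payoff).
    sigma maps each MAX vertex to the successor it chooses.
    Assumes the game halts with probability 1, so the iteration converges.
    """
    values = {v: 0.0 for v in game}
    for _ in range(rounds):
        new = {}
        for v, spec in game.items():
            kind = spec[0]
            if kind == "SINK":
                new[v] = float(spec[1])
            elif kind == "MAX":
                new[v] = values[sigma[v]]  # MAX's choice is fixed by sigma
            elif kind == "MIN":
                new[v] = min(values[spec[1]], values[spec[2]])
            else:  # RAND: average of the two successors
                new[v] = 0.5 * (values[spec[1]] + values[spec[2]])
        values = new
    return values


# Hypothetical example game (not from the slides).
game = {
    "zero": ("SINK", 0),
    "one": ("SINK", 1),
    "r1": ("RAND", "one", "zero"),
    "r2": ("RAND", "r1", "one"),
    "s": ("MAX", "r1", "r2"),
    "r3": ("RAND", "r3", "s"),
}
good = min_best_response_values(game, {"s": "r2"})
poor = min_best_response_values(game, {"s": "r1"})
```

With the strategy that sends `s` to `r2`, the computed value at `s` certifies that the game value there is at least 3/4. The other strategy only certifies 1/2.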

Mean Payoff Games (MPGs) [Ehrenfeucht, Mycielski (1979)]
[Figure: game graph with vertex legend MAX (M), min (m), RAND (R)]
Non-terminating version
Discounted version
MPGs, Reachability SSGs (PZ’96)
Pseudo-polynomial algorithm (PZ’96)

Mean Payoff Games (MPGs) [Ehrenfeucht, Mycielski (1979)]
Value: the average weight of the cycle that the play ends up in
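When both players use positional strategies, the play from any start vertex follows a path into a cycle and then repeats it forever, so the mean payoff is the average edge weight on that cycle. A minimal sketch, assuming integer edge weights and a hypothetical encoding in which `choice[v]` is the successor picked at `v` by whichever player owns it:

```python
from fractions import Fraction


def cycle_mean(choice, weight, start):
    """Mean payoff of the play from `start` under fixed positional choices.

    choice[v] is the successor chosen at v; weight[(u, v)] is an integer edge weight.
    """
    position = {}  # vertex -> index in `path` at its first visit
    path = []
    v = start
    while v not in position:
        position[v] = len(path)
        path.append(v)
        v = choice[v]
    cycle = path[position[v]:]  # the part of the play that repeats forever
    total = sum(weight[(u, choice[u])] for u in cycle)
    return Fraction(total, len(cycle))


# Hypothetical example: a -> b -> c -> b -> c -> ...
choice = {"a": "b", "b": "c", "c": "b"}
weight = {("a", "b"): 10, ("b", "c"): 3, ("c", "b"): -1}
mean_from_a = cycle_mean(choice, weight, "a")
```

The weight 10 on the initial edge does not affect the result, since a finite prefix has no influence on the long-run average.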

Parity Games (PGs)
[Figure: game graph with priorities on the vertices, e.g. an EVEN vertex of priority 3 and an ODD vertex of priority 8]
EVEN wins if the largest priority seen infinitely often is even
Equivalent to many interesting problems in automata and verification:
Non-emptiness of ω-tree automata
Modal μ-calculus model checking

Parity Games (PGs) reduce to Mean Payoff Games (MPGs) [Stirling (1993)] [Puri (1995)]
[Figure: EVEN vertex of priority 3, ODD vertex of priority 8]
Change priority k to payoff (−n)^k
Move the payoff to the outgoing edges
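The reduction can be sketched in a few lines, reading the payoff as (−n)^k with n the number of vertices. On any simple cycle the term of largest priority outweighs the sum of all others, so the sign of the cycle's total weight agrees with the parity of its largest priority. The function names and the example graph are hypothetical.

```python
def parity_to_mean_payoff(priority, edges):
    """Give every edge (u, v) the weight (-n)^priority[u], n = number of vertices."""
    n = len(priority)
    return {(u, v): (-n) ** priority[u] for (u, v) in edges}


def winner_by_parity(cycle, priority):
    """Winner of a play that repeats `cycle` forever, by the parity condition."""
    return "EVEN" if max(priority[v] for v in cycle) % 2 == 0 else "ODD"


def winner_by_payoff(cycle, weight):
    """Winner of the same play, judged by the sign of the total cycle weight."""
    k = len(cycle)
    total = sum(weight[(cycle[i], cycle[(i + 1) % k])] for i in range(k))
    return "EVEN" if total > 0 else "ODD"


# Hypothetical example with three vertices.
priority = {"a": 3, "b": 8, "c": 1}
edges = [("a", "b"), ("b", "c"), ("c", "a"), ("a", "c")]
weight = parity_to_mean_payoff(priority, edges)
```

Here the cycle a, b, c has largest priority 8 and positive total weight, while the cycle a, c has largest priority 3 and negative total weight.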

Simple Stochastic games (SSGs): Additional properties
An SSG is said to be binary if the outdegree of every non-sink vertex is 2
A switch is a change of a strategy at a single vertex
A switch is profitable for MAX if it increases the value of the game (the sum of the values of all vertices)
A strategy is optimal iff no switch is profitable

A randomized subexponential algorithm for binary SSGs [Ludwig (1995)] [Kalai (1992), Matousek-Sharir-Welzl (1992)]
Start with an arbitrary strategy σ for MAX
Choose a random vertex i ∈ V_MAX
Find the optimal strategy σ’ for MAX in the game in which the only outgoing edge from i is (i, σ(i))
If switching σ’ at i is not profitable, then σ’ is optimal
Otherwise, let σ ← (σ’)^i, the strategy σ’ switched at i, and repeat

There is a hidden order of the MAX vertices under which the optimal strategy returned by the first recursive call correctly fixes the strategy of MAX at vertices 1, 2, …, i
[Figure: the MAX vertices in this order; vertices 1, 2, …, i are marked "All correct! Would never be switched!"]

Exponential algorithm for PGs [McNaughton (1993)] [Zielonka (1998)]
[Figure: the vertices of highest priority (even); the set A of vertices from which EVEN can force the game to enter them; the first and second recursive calls]
In the worst case, both recursive calls are on games of size n − 1
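The recursive scheme on this slide can be sketched as below. This is a textbook-style rendering of the McNaughton/Zielonka recursion rather than a transcription of the slide, and the data encoding is hypothetical: player 0 is EVEN, player 1 is ODD, `edges[v]` lists the successors of `v`, and every vertex is assumed to have at least one successor.

```python
def attractor(vertices, edges, owner, player, target):
    """Vertices of the subgame from which `player` can force a visit to `target`."""
    attr = set(target)
    changed = True
    while changed:
        changed = False
        for v in vertices - attr:
            succ = [w for w in edges[v] if w in vertices]
            if owner[v] == player:
                ok = any(w in attr for w in succ)
            else:
                ok = all(w in attr for w in succ)
            if ok:
                attr.add(v)
                changed = True
    return attr


def zielonka(vertices, edges, owner, priority):
    """Return (winning set of EVEN, winning set of ODD) for the induced subgame."""
    if not vertices:
        return set(), set()
    d = max(priority[v] for v in vertices)
    player, opponent = d % 2, 1 - d % 2
    top = {v for v in vertices if priority[v] == d}
    a = attractor(vertices, edges, owner, player, top)
    win = list(zielonka(vertices - a, edges, owner, priority))  # first recursive call
    if not win[opponent]:
        win[player], win[opponent] = set(vertices), set()
        return win[0], win[1]
    b = attractor(vertices, edges, owner, opponent, win[opponent])
    win = list(zielonka(vertices - b, edges, owner, priority))  # second recursive call
    win[opponent] = win[opponent] | b
    return win[0], win[1]


# Hypothetical example: EVEN can loop at a (priority 2), ODD can loop at b (priority 1).
owner = {"a": 0, "b": 1}
priority = {"a": 2, "b": 1}
edges = {"a": ["a", "b"], "b": ["a", "b"]}
even_wins, odd_wins = zielonka({"a", "b"}, edges, owner, priority)
```

Each call removes at least one vertex before recursing, which is where the worst-case bound of two calls on games of size n − 1 comes from.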

Deterministic subexponential algorithm for PGs [Jurdzinski, Paterson, Z (2006)]
[Figure: a dominion removed before the second recursive call]
Idea: Look for small dominions!
Dominion: a (small) set from which one of the players can win without the play ever leaving this set
Dominions of size s can be found in O(n^s) time

Open problems
● Polynomial algorithms?
● Faster subexponential algorithms for parity games?
● Deterministic subexponential algorithms for MPGs and SSGs?
● Faster pseudo-polynomial algorithms for MPGs?