Published by Lesly Lugar. Modified over 4 years ago.

1
Uri Zwick, Tel Aviv University
Simple Stochastic Games, Mean Payoff Games, Parity Games

2
Zero-sum games
[Figure: 3×3 payoff matrix; extracted entries 1, 2, −3, 0, −5, 2, 1, 7, −2]
● Mixed strategies
● Max-min theorem …
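For reference, the max-min (minimax) theorem mentioned on this slide can be stated as follows. The notation is assumed here and does not come from the slide: $A$ is an $m \times n$ payoff matrix for the row player, and $x \in \Delta_m$, $y \in \Delta_n$ are mixed strategies (probability vectors) of the row and column player.

```latex
\max_{x \in \Delta_m} \; \min_{y \in \Delta_n} \; x^{\mathsf T} A y
\;=\;
\min_{y \in \Delta_n} \; \max_{x \in \Delta_m} \; x^{\mathsf T} A y
```

The common value of the two sides is the value of the game.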

3
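Stochastic games [Shapley (1953)]
[Figure: several payoff matrices, one per game state, whose entries combine payoffs with moves to other states; the layout is not recoverable]
● Mixed positional (memoryless) optimal strategies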

4
Simple stochastic games (SSGs)
[Figure: payoff matrices, each consisting of a single row or a single column]
● Every game has only one row or column
● Pure positional (memoryless) optimal strategies

5
Simple stochastic games (SSGs): graphic representation
[Figure: legend of vertex types MAX (M), min (m), RAND (R)]
The players construct an (infinite) path e_0, e_1, …
● Terminating version
● Non-terminating version
● Discounted version
Fixed-duration games are easily solved using dynamic programming.
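The dynamic-programming remark can be illustrated with a minimal sketch. It assumes (the slide does not spell this out) that the payoff of a fixed-duration game is the total weight of the edges traversed in a given number of moves, and that RAND picks each outgoing edge with equal probability. The function name, the `kind`/`edges` encoding, and the example game are hypothetical.

```python
from fractions import Fraction

def fixed_duration_values(kind, edges, steps):
    """Expected total payoff of a game lasting exactly `steps` moves.

    kind[v]  is "MAX", "MIN" or "RAND" (assumed encoding).
    edges[v] is a list of (successor, weight) pairs.
    """
    value = {v: Fraction(0) for v in kind}  # zero moves left: payoff 0
    for _ in range(steps):
        new = {}
        for v in kind:
            # Payoff of each move: edge weight plus value of the shorter game.
            options = [Fraction(w) + value[u] for u, w in edges[v]]
            if kind[v] == "MAX":
                new[v] = max(options)
            elif kind[v] == "MIN":
                new[v] = min(options)
            else:  # RAND: uniform average over outgoing edges
                new[v] = sum(options) / len(options)
        value = new
    return value

# Hypothetical three-vertex example.
kind = {"a": "MAX", "b": "MIN", "c": "RAND"}
edges = {
    "a": [("b", 2), ("c", -1)],
    "b": [("a", -5), ("c", 7)],
    "c": [("a", 4), ("b", 0)],
}
two_step = fixed_duration_values(kind, edges, 2)
```

Each round of the loop extends the horizon by one move, so the running time is proportional to the number of moves times the number of edges.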

6
Simple stochastic games (SSGs): graphic representation, example
[Figure: example game graph with MAX, min, and RAND vertices and a marked start vertex]

7
Simple stochastic games (SSGs): reachability version [Condon (1992)]
[Figure: legend of vertex types MAX (M), min (m), RAND (R), plus a 0-sink and a 1-sink]
Objective: maximize / minimize the probability of getting to the 1-sink
Technical assumption: the game halts with probability 1
● No weights
● All probabilities are ½
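To make the reachability objective concrete, here is a small sketch that approximates the vertex values by value iteration. Value iteration is a standard technique swapped in for illustration and is not named on the slide; under the halting assumption the iterates converge to the values, but in general only in the limit. The encoding (`kind`, `succ`, the labels `"SINK0"`/`"SINK1"`) and the example game are assumptions.

```python
from fractions import Fraction

def iterate_values(kind, succ, rounds=100):
    """Approximate the values of a binary reachability SSG.

    kind[v] is "MAX", "MIN", "RAND", "SINK0" or "SINK1" (assumed encoding);
    succ[v] is a pair of successors for every non-sink vertex.
    """
    value = {v: Fraction(1 if kind[v] == "SINK1" else 0) for v in kind}
    for _ in range(rounds):
        new = {}
        for v, k in kind.items():
            if k in ("SINK0", "SINK1"):
                new[v] = value[v]          # sinks keep their fixed value
                continue
            a, b = (value[u] for u in succ[v])
            if k == "MAX":
                new[v] = max(a, b)
            elif k == "MIN":
                new[v] = min(a, b)
            else:                          # RAND: both edges have probability 1/2
                new[v] = (a + b) / 2
        value = new
    return value

# Hypothetical example: every cycle passes through a RAND vertex,
# so the game halts with probability 1.
kind = {"zero": "SINK0", "one": "SINK1",
        "a": "MAX", "m": "MIN", "r1": "RAND", "r2": "RAND"}
succ = {"a": ("r1", "r2"), "m": ("r1", "a"),
        "r1": ("one", "zero"), "r2": ("one", "m")}
values = iterate_values(kind, succ)
```

In this particular example the iteration stabilises exactly after a few rounds; that is a property of the example, not of the method.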

8
Simple stochastic games (SSGs): basic properties
● Every vertex in the game has a value v
● Both players have positional optimal strategies
● Positional strategy for MAX: a choice of an outgoing edge from each MAX vertex
● Decision version: is the value at least v?

9
“Solving” binary SSGs
The values v_i of the vertices of a game are the unique solution of the following equations:
Corollary: the decision version is in NP ∩ co-NP
The values are rational numbers requiring only a linear number of bits
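The equations this slide refers to are not reproduced in the text. A sketch of their standard form for a binary reachability SSG is given here, assuming every non-sink vertex $i$ has exactly two successors, called $j$ and $k$ for illustration:

```latex
v_i =
\begin{cases}
\max(v_j, v_k)          & \text{if } i \text{ is a MAX vertex} \\
\min(v_j, v_k)          & \text{if } i \text{ is a min vertex} \\
\tfrac{1}{2}(v_j + v_k) & \text{if } i \text{ is a RAND vertex} \\
0                       & \text{if } i \text{ is the 0-sink} \\
1                       & \text{if } i \text{ is the 1-sink}
\end{cases}
```

Uniqueness of the solution relies on the assumption that the game halts with probability 1.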

10
Markov decision processes (MDPs)
[Figure: legend of vertex types MAX (M), min (m), RAND (R)]
Theorem [Derman (1970)]: the values and optimal strategies of an MDP can be found by solving an LP.
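One standard way to write such an LP, not necessarily the formulation shown on the slide, is sketched here for a binary reachability game in which only MAX and RAND vertices remain. It assumes the game halts with probability 1 under every strategy; the symbols $V_{\mathrm{MAX}}$, $j$, and $k$ are notation introduced for illustration.

```latex
\begin{aligned}
\text{minimize}\quad   & \sum_i v_i \\
\text{subject to}\quad & v_i \ge v_j
                         && \text{for every edge } (i,j) \text{ with } i \in V_{\mathrm{MAX}}, \\
                       & v_i = \tfrac{1}{2}(v_j + v_k)
                         && \text{for every RAND vertex } i \text{ with successors } j, k, \\
                       & v_{0\text{-sink}} = 0, \qquad v_{1\text{-sink}} = 1 .
\end{aligned}
```

When only min and RAND vertices remain, the symmetric program maximizes $\sum_i v_i$ subject to $v_i \le v_j$ on the edges leaving min vertices.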

11
NP ∩ co-NP: another proof
Deciding whether the value of a game is at least (at most) v is in NP ∩ co-NP.
To show that the value is at least v, guess an optimal strategy for MAX. Find an optimal counter-strategy for min by solving the resulting MDP.
Is the problem in P?

12
Mean payoff games (MPGs) [Ehrenfeucht, Mycielski (1979)]
[Figure: legend of vertex types MAX (M), min (m), RAND (R)]
● Non-terminating version
● Discounted version
● MPGs reduce to reachability SSGs (PZ’96)
● Pseudo-polynomial algorithm (PZ’96)

13
Mean payoff games (MPGs) [Ehrenfeucht, Mycielski (1979)]
Value: average of the cycle
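As an illustration of "average of the cycle": once both players fix positional strategies, the play from any start vertex runs into a cycle, and the payoff is the mean edge weight on that cycle. The sketch below assumes integer edge weights and a hypothetical encoding in which `choice[v]` is the successor selected at `v` by whichever player owns it.

```python
from fractions import Fraction

def cycle_mean(choice, weight, start):
    """Mean weight of the cycle reached from `start`.

    choice[v]       successor selected at v (positional strategies, assumed)
    weight[(u, v)]  integer weight of edge u -> v
    """
    position = {}   # vertex -> index at which it first appears on the path
    path = []
    v = start
    while v not in position:
        position[v] = len(path)
        path.append(v)
        v = choice[v]
    cycle = path[position[v]:]          # drop the prefix leading into the cycle
    total = sum(weight[(u, choice[u])] for u in cycle)
    return Fraction(total, len(cycle))

# Hypothetical example: a -> b -> c -> b -> c -> ...
choice = {"a": "b", "b": "c", "c": "b"}
weight = {("a", "b"): 10, ("b", "c"): 3, ("c", "b"): -1}
mean = cycle_mean(choice, weight, "a")
```

The weight of the edge from `a` to `b` does not affect the result, because a finite prefix contributes nothing to a long-run average.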

14
Parity games (PGs)
[Figure: game graph with EVEN and ODD vertices labelled with priorities, e.g. 3 and 8]
EVEN wins if the largest priority seen infinitely often is even.
Equivalent to many interesting problems in automata and verification:
● Non-emptiness of ω-tree automata
● Modal μ-calculus model checking

15
Parity games (PGs)
[Figure: game graph with EVEN and ODD vertices labelled with priorities, e.g. 3 and 8]
Change priority k to payoff (−n)^k to obtain a mean payoff game (MPG).
Move payoffs to outgoing edges.
[Stirling (1993)] [Puri (1995)]
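A small sketch of why this payoff assignment works: on a simple cycle of at most n vertices, the term of the largest priority dominates the sum, so the sign of the mean payoff agrees with the parity of the largest priority. The function names are hypothetical, and the sketch assumes EVEN plays the maximizer in the resulting mean payoff game and wins when the mean is positive.

```python
def payoff_for_priority(priority, n):
    """Payoff (-n)^k assigned to a vertex of priority k in a game with n vertices."""
    return (-n) ** priority

def winner_by_parity(cycle_priorities):
    """Winner of a play that repeats the given cycle forever, by the parity rule."""
    return "EVEN" if max(cycle_priorities) % 2 == 0 else "ODD"

def winner_by_mean_payoff(cycle_priorities, n):
    """Winner of the same play judged by the sign of the cycle's total payoff.

    Assumes the cycle is simple, i.e. it has at most n vertices.
    """
    total = sum(payoff_for_priority(k, n) for k in cycle_priorities)
    return "EVEN" if total > 0 else "ODD"

# Hypothetical cycles in a game with n = 4 vertices.
n = 4
cycles = [[3, 8, 1], [3, 2, 1], [0], [1, 1, 2, 2]]
agreement = [winner_by_parity(c) == winner_by_mean_payoff(c, n) for c in cycles]
```

The sign of the total and the sign of the mean coincide, so the sketch compares totals and avoids a division.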

16
Simple stochastic games (SSGs): additional properties
● An SSG is said to be binary if the outdegree of every non-sink vertex is 2
● A switch is a change of a strategy at a single vertex
● A strategy is optimal iff no switch is profitable
● A switch is profitable for MAX if it increases the value of the game (sum of values of all vertices)

17
A randomized subexponential algorithm for binary SSGs [Ludwig (1995)] [Kalai (1992), Matousek-Sharir-Welzl (1992)]
● Start with an arbitrary strategy σ for MAX
● Choose a random vertex i ∈ V_MAX
● Find the optimal strategy σ′ for MAX in the game in which the only outgoing edge from i is (i, σ(i))
● If switching σ′ at i is not profitable, then σ′ is optimal
● Otherwise, let σ ← (σ′)^i, that is, σ′ with the choice at i switched, and repeat
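The steps on this slide can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not Ludwig's implementation: the strategy evaluation (`evaluate`) uses a fixed number of value-iteration rounds as a stand-in for solving the MDP exactly, which is only an approximation in general, and the encoding (`kind`, `succ`, `"SINK0"`/`"SINK1"`), all function names, and the example game are hypothetical.

```python
import random
from fractions import Fraction

def evaluate(kind, succ, sigma, rounds=200):
    """Values when MAX follows sigma and min answers optimally.

    Stand-in evaluation by value iteration (assumed, approximate in general).
    """
    v = {x: Fraction(0) for x in kind}
    for _ in range(rounds):
        new = {}
        for x, k in kind.items():
            if k == "SINK1":
                new[x] = Fraction(1)
            elif k == "SINK0":
                new[x] = Fraction(0)
            elif k == "MAX":
                new[x] = v[sigma[x]]            # MAX is bound to sigma
            elif k == "MIN":
                new[x] = min(v[y] for y in succ[x])
            else:                               # RAND, probability 1/2 each
                new[x] = sum(v[y] for y in succ[x]) / 2
        v = new
    return v

def ludwig(kind, succ, sigma, free=None, rng=random):
    """Randomized recursive search for an optimal MAX strategy (binary SSG)."""
    if free is None:
        free = [x for x in kind if kind[x] == "MAX"]
    if not free:
        return dict(sigma)                      # nothing left to choose
    i = rng.choice(free)
    rest = [x for x in free if x != i]
    best = ludwig(kind, succ, sigma, rest, rng)  # game with i fixed to sigma[i]
    v = evaluate(kind, succ, best)
    a, b = succ[i]
    other = a if best[i] == b else b
    if v[other] <= v[best[i]]:
        return best                             # switch at i is not profitable
    switched = dict(best)
    switched[i] = other
    return ludwig(kind, succ, switched, free, rng)

# Hypothetical example; every cycle passes through a RAND vertex.
kind = {"zero": "SINK0", "one": "SINK1", "a": "MAX", "b": "MAX",
        "m": "MIN", "r1": "RAND", "r2": "RAND"}
succ = {"a": ("r1", "r2"), "b": ("zero", "a"), "m": ("r1", "a"),
        "r1": ("one", "zero"), "r2": ("one", "m")}
optimal = ludwig(kind, succ, {"a": "r1", "b": "zero"})
```

The recursion removes one MAX vertex from the set of free choices at each level, which mirrors the structure of the slide's description; the subexponential running-time bound concerns the expected number of switches and is not demonstrated by this sketch.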

18
There is a hidden order of the MAX vertices under which the optimal strategy returned by the first recursive call correctly fixes the strategy of MAX at vertices 1, 2, …, i.
[Figure: the MAX vertices in this order; vertices 1, …, i are marked “All correct! Would never be switched!”]

19
Exponential algorithm for PGs [McNaughton (1993)] [Zielonka (1998)]
[Figure: game graph with regions labelled “Vertices of highest priority (even)”, “Vertices from which EVEN can force the game to enter A”, “First recursive call”, and “Second recursive call”]
In the worst case, both recursive calls are on games of size n − 1.
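A compact sketch of the recursive scheme in the usual textbook form of Zielonka's algorithm follows. The encoding is an assumption: players are numbered 0 (EVEN) and 1 (ODD), `owner[v]` is the player who moves at `v`, and every vertex is assumed to have at least one successor inside each subgame considered.

```python
def attractor(vertices, succ, owner, player, target):
    """Vertices of the subgame from which `player` can force a visit to `target`."""
    attr = set(target)
    changed = True
    while changed:
        changed = False
        for v in vertices - attr:
            nxt = [u for u in succ[v] if u in vertices]
            if owner[v] == player:
                ok = any(u in attr for u in nxt)   # player picks one good edge
            else:
                ok = all(u in attr for u in nxt)   # opponent cannot avoid attr
            if ok:
                attr.add(v)
                changed = True
    return attr

def solve(vertices, succ, owner, priority):
    """Return (W_even, W_odd), the winning regions of the two players."""
    if not vertices:
        return set(), set()
    p = max(priority[v] for v in vertices)
    player = p % 2                                  # who likes the top priority
    top = {v for v in vertices if priority[v] == p}
    a = attractor(vertices, succ, owner, player, top)
    win = list(solve(vertices - a, succ, owner, priority))   # first recursive call
    if not win[1 - player]:
        win[player], win[1 - player] = set(vertices), set()
        return tuple(win)
    b = attractor(vertices, succ, owner, 1 - player, win[1 - player])
    win = list(solve(vertices - b, succ, owner, priority))   # second recursive call
    win[1 - player] = win[1 - player] | b
    return tuple(win)

# Hypothetical example: EVEN owns x and y, ODD owns z.
succ = {"x": ["y"], "y": ["x", "z"], "z": ["z"]}
owner = {"x": 0, "y": 0, "z": 1}
priority = {"x": 2, "y": 1, "z": 3}
w_even, w_odd = solve({"x", "y", "z"}, succ, owner, priority)
```

In the example EVEN keeps the play on the cycle between `x` and `y`, whose largest priority is 2, while a play that reaches `z` stays there with priority 3.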

20
Deterministic subexponential algorithm for PGs, Jurdzinski, Paterson, Z (2006)
[Figure: game graph with a region labelled “Dominion” and the second recursive call]
Idea: look for small dominions!
A dominion is a (small) set from which one of the players can win without the play ever leaving this set.
Dominions of size s can be found in O(n^s) time.

21
Open problems
● Polynomial algorithms?
● Faster subexponential algorithms for parity games?
● Deterministic subexponential algorithms for MPGs and SSGs?
● Faster pseudo-polynomial algorithms for MPGs?
