Games Henry Kautz.

ExpectiMiniMax: Alpha-Beta Pruning
Cutoffs at Max and Min nodes work just as before.
If the range of values is bounded, cutoffs can also be added to Chance nodes.
Assume that all branches not yet searched have the worst-case result:
  L = lowest value achievable (-10)
  U = highest value achievable (10)

ExpectiMiniMax: Cutoffs
[Figure: two diagrams of a chance node, one for a beta cutoff and one for an alpha cutoff. Each diagram labels the values seen, the values to come, and the current value.]
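The chance-node cutoff described above can be sketched as follows. This is a minimal illustration and not code from the slides: the function name `chance_value`, the `(probability, child)` representation, and the `evaluate` callback are all assumptions made for the example. After each child is searched, the unsearched probability mass is assumed to take the bound U (for the alpha test) or L (for the beta test).

```python
from fractions import Fraction

L, U = -10, 10  # value bounds, as on the slide


def chance_value(children, alpha, beta, evaluate):
    """Expected value of a chance node with bound-based cutoffs.

    children: list of (probability, child) pairs whose probabilities sum to 1.
    evaluate: hypothetical callback returning the value of a child.
    Returns the exact expectation, or a bound that already justifies a cutoff.
    """
    seen = Fraction(0)       # sum of p * v over the children searched so far
    remaining = Fraction(1)  # probability mass not yet searched
    for p, child in children:
        seen += p * evaluate(child)
        remaining -= p
        upper = seen + remaining * U  # every unsearched branch returns U
        lower = seen + remaining * L  # every unsearched branch returns L
        if upper <= alpha:  # alpha cutoff: Max already has something as good
            return upper
        if lower >= beta:   # beta cutoff: Min already has something as good
            return lower
    return seen


# Example: with alpha = 5, the second branch is never evaluated, because
# even a best-case 10 there yields only (1/2)(-10) + (1/2)(10) = 0 <= 5.
half = Fraction(1, 2)
print(chance_value([(half, -10), (half, 10)], 5, float("inf"), lambda v: v))  # 0
```

The tighter the interval [L, U], the sooner the bounds cross alpha or beta, so cutoffs at chance nodes only help when the value range is genuinely bounded.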

Probabilistic STRIPS
Planning domain: Hungry Monkey

shake:
  if (ontable)
    Prob(2/3) -> +1 banana
    Prob(1/3) -> no change
  else
    Prob(1/6) -> +1 banana
    Prob(5/6) -> no change

jump:
  if (~ontable)
    Prob(2/3) -> ontable
    Prob(1/3) -> ~ontable
  else
    ontable

What is the expected reward?
[1] shake
[2] jump; shake
[3] jump; shake; shake
[4] jump; if (~ontable) { jump; shake } else { shake; shake }
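The four plans can be evaluated mechanically. The sketch below rests on two assumptions that the slides do not state in words: the monkey starts off the table with no bananas (suggested by the 2/3 and 1/3 branches under the root jump in the game tree), and the reward is the number of bananas held at the end. The plan encoding with `"if_ontable"` tuples is made up for this example.

```python
from fractions import Fraction as F

# A state is (ontable, bananas).
START = (False, 0)  # assumption: off the table, no bananas


def shake(state):
    ontable, bananas = state
    p = F(2, 3) if ontable else F(1, 6)
    return [(p, (ontable, bananas + 1)), (1 - p, (ontable, bananas))]


def jump(state):
    ontable, bananas = state
    if ontable:
        return [(F(1), (True, bananas))]
    return [(F(2, 3), (True, bananas)), (F(1, 3), (False, bananas))]


def expected_reward(plan, state=START):
    """plan: tuple of steps. A step is an action function or a
    hypothetical ("if_ontable", then_plan, else_plan) tuple."""
    if not plan:
        return F(state[1])  # reward = bananas held (assumption)
    step, rest = plan[0], plan[1:]
    if isinstance(step, tuple):
        _, then_plan, else_plan = step
        branch = then_plan if state[0] else else_plan
        return expected_reward(branch + rest, state)
    return sum(p * expected_reward(rest, nxt) for p, nxt in step(state))


plans = {
    1: (shake,),
    2: (jump, shake),
    3: (jump, shake, shake),
    4: (jump, ("if_ontable", (shake, shake), (jump, shake))),
}
for number, plan in plans.items():
    print(number, expected_reward(plan))
# 1 1/6
# 2 1/2
# 3 1
# 4 19/18
```

Under these assumptions the conditional plan [4] beats the fixed three-step plan [3] because it retries the jump when the first one fails.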

ExpectiMax

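Hungry Monkey: 2-Ply Game Tree
[Figure: two-ply game tree. The root Max node chooses jump or shake. jump leads to a chance node with branches 2/3 and 1/3; shake leads to a chance node with branches 1/6 and 5/6. Leaf rewards, left to right: 0 0 1 0 | 0 0 1 0 | 1 1 2 1 | 0 0 1 0]

ExpectiMax 1 – Chance Nodes
[Figure: the same tree with the lower chance nodes backed up. Values, left to right (jump, shake under each Max node): 0, 2/3 | 0, 1/6 | 1, 7/6 | 0, 1/6]

ExpectiMax 2 – Max Nodes
[Figure: the same tree with the lower Max nodes backed up. Values, left to right: 2/3, 1/6, 7/6, 1/6]

ExpectiMax 3 – Chance Nodes
[Figure: the same tree with the top chance nodes backed up: jump = (2/3)(2/3) + (1/3)(1/6) = 1/2, shake = (1/6)(7/6) + (5/6)(1/6) = 1/3]

ExpectiMax 4 – Max Node
[Figure: the same tree with the root Max node backed up. The root takes the larger of 1/2 and 1/3, so its value is 1/2 and the best first action is jump.]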

Policies
The result of the ExpectiMax analysis is a conditional plan (also called a policy):
  Optimal plan for 2 steps: jump; shake
  Optimal plan for 3 steps: jump; if (ontable) {shake; shake} else {jump; shake}
Probabilistic planning can be generalized in many ways, including action costs and hidden state.
The general problem is that of solving a Markov Decision Process (MDP).
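A policy can be read off an ExpectiMax search by recording the best action at every Max node. The following is a sketch under the same assumptions as the Hungry Monkey slides suggest (start off the table with no bananas, reward equal to the bananas held when the steps run out). The names `expectimax` and `ACTIONS` are illustrative, not taken from the lecture.

```python
from fractions import Fraction as F
from functools import lru_cache


def shake(state):
    ontable, bananas = state
    p = F(2, 3) if ontable else F(1, 6)
    return [(p, (ontable, bananas + 1)), (1 - p, (ontable, bananas))]


def jump(state):
    ontable, bananas = state
    if ontable:
        return [(F(1), (True, bananas))]
    return [(F(2, 3), (True, bananas)), (F(1, 3), (False, bananas))]


ACTIONS = {"shake": shake, "jump": jump}


@lru_cache(maxsize=None)
def expectimax(state, steps):
    """Return (value, best_action) for a Max node with `steps` actions left."""
    if steps == 0:
        return F(state[1]), None  # leaf: reward = bananas held (assumption)
    best = None
    for name, action in ACTIONS.items():
        # Chance node: probability-weighted average of the successor values.
        q = sum(p * expectimax(nxt, steps - 1)[0] for p, nxt in action(state))
        if best is None or q > best[0]:
            best = (q, name)
    return best


start = (False, 0)  # assumption: off the table, no bananas
print(expectimax(start, 2))       # (Fraction(1, 2), 'jump')
print(expectimax(start, 3))       # (Fraction(19, 18), 'jump')
print(expectimax((True, 0), 2))   # (Fraction(4, 3), 'shake')
print(expectimax((False, 0), 2))  # (Fraction(1, 2), 'jump')
```

The last two lines show the conditional part of the 3-step plan: after the first jump, the best action is shake if the monkey is on the table and jump if it is not.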

Gambler’s Paradox
How much would you pay to play the following game?
Flip a coin. If heads, you win $2.
Otherwise: flip again. If heads, you win $4.
Otherwise: flip again. If heads, you win $8.
Otherwise: flip again. If heads, you win $16.
Otherwise: keep flipping, with the prize doubling each time, until heads comes up.

Expected Value
The expected value is INFINITE!
(1/2)*2 + (1/4)*4 + (1/8)*8 + … = 1 + 1 + 1 + …
“Rationally” you should be willing to pay ANY fixed amount.
In real life, people will pay about $20.
– This is consistent with logarithmic utility of money
– (1/2)*log(2) + (1/4)*log(4) + (1/8)*log(8) + … (this series converges)
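The contrast between the two series can be checked numerically. The sketch below truncates the game after a fixed number of flips, which is an assumption needed to get finite sums; the function names are illustrative. It also applies the logarithm to the prize alone and ignores the player's existing wealth, a simplification that affects the dollar figure one would derive from it.

```python
import math


def expected_payoff(max_flips):
    """Expected winnings in dollars if the game stops after max_flips flips."""
    return sum(0.5 ** k * 2 ** k for k in range(1, max_flips + 1))


def expected_log_utility(max_flips):
    """Expected natural-log utility of the prize, same truncation."""
    return sum(0.5 ** k * math.log(2 ** k) for k in range(1, max_flips + 1))


for n in (10, 20, 40):
    print(n, expected_payoff(n), round(expected_log_utility(n), 6))
# The payoff column grows without bound (each term contributes exactly 1),
# while the log-utility column levels off near 2*ln(2) = ln(4).
```

Each term of the dollar series contributes exactly 1, so the truncated expectation equals the number of flips allowed. The log-utility series is the sum of k*ln(2)/2^k, which converges to 2*ln(2). How that translates into a price a person would actually pay depends on how much money they already have.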
