1
Games
Henry Kautz
30
ExpectiMiniMax: Alpha-Beta Pruning
Cutoffs at Max and Min nodes work just as before.
If the range of values is bounded, cutoffs can also be added to Chance nodes.
Assume that all branches not yet searched have the worst-case result.
L = lowest value achievable (-10)
U = highest value achievable (10)
31
ExpectiMiniMax: Cutoffs
[Figure: beta cutoff and alpha cutoff at a chance node. Each diagram is labeled with the values seen, the values to come, and the current value.]
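The chance-node cutoff idea can be sketched in Python as follows. This is a minimal sketch, not the slide's own formula (which did not survive in the transcript): it assumes each child is given as a hypothetical `(probability, thunk)` pair whose thunk returns the child's exact value, and it uses the bounds L = -10 and U = 10 from the previous slide. After each child, the unsearched branches are assumed to take the worst-case value for the side that wants to prune.

```python
def chance_value(children, alpha, beta, lo=-10.0, hi=10.0):
    """Evaluate a chance node with early cutoffs (illustrative sketch).

    children: list of (probability, thunk) pairs; calling thunk() returns
              the exact value of that child. These names are assumptions.
    lo, hi:   L and U, the lowest and highest achievable values.
    """
    seen = 0.0        # probability-weighted sum of the values seen so far
    remaining = 1.0   # probability mass of the values still to come
    for prob, thunk in children:
        seen += prob * thunk()
        remaining -= prob
        upper = seen + remaining * hi  # best case for the unsearched branches
        lower = seen + remaining * lo  # worst case for the unsearched branches
        if upper <= alpha:
            return upper  # alpha cutoff: Max already has something at least as good
        if lower >= beta:
            return lower  # beta cutoff: Min already has something at least as good
    return seen


if __name__ == "__main__":
    never = lambda: (_ for _ in ()).throw(RuntimeError("should be pruned"))
    # With beta = 0, seeing a 10 on a 50% branch already guarantees >= 0.
    print(chance_value([(0.5, lambda: 10.0), (0.5, never)], -10.0, 0.0))
```

With an open window (alpha = -infinity, beta = +infinity) no cutoff fires and the function returns the ordinary expected value.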
48
Probabilistic STRIPS Planning
Domain: Hungry Monkey
shake: if (ontable)
         Prob(2/3) -> +1 banana
         Prob(1/3) -> no change
       else
         Prob(1/6) -> +1 banana
         Prob(5/6) -> no change
jump:  if (~ontable)
         Prob(2/3) -> ontable
         Prob(1/3) -> ~ontable
       else ontable
49
What is the expected reward?
[1] shake
[2] jump; shake
[3] jump; shake; shake
[4] jump; if (~ontable) { jump; shake } else { shake; shake }
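One way to check the four plans is to enumerate outcomes exactly. The sketch below assumes the monkey starts off the table (`ontable=False`), which matches the 2/3 and 1/3 jump probabilities at the root of the game tree; the plan encoding (a list of action names, with a hypothetical `("if_ontable", then_plan, else_plan)` tuple for the conditional) is an illustration, not notation from the slides.

```python
from fractions import Fraction as F


def outcomes(action, ontable):
    """Return [(probability, next_ontable, bananas_gained)] for one action."""
    if action == "shake":
        p = F(2, 3) if ontable else F(1, 6)
        return [(p, ontable, 1), (1 - p, ontable, 0)]
    if action == "jump":
        if ontable:
            return [(F(1), True, 0)]          # already on the table: stay there
        return [(F(2, 3), True, 0), (F(1, 3), False, 0)]
    raise ValueError(f"unknown action: {action}")


def expected_reward(plan, ontable=False):
    """Expected number of bananas from executing plan in the given state."""
    if not plan:
        return F(0)
    step, rest = plan[0], list(plan[1:])
    if isinstance(step, tuple):               # ("if_ontable", then_plan, else_plan)
        _, then_plan, else_plan = step
        branch = then_plan if ontable else else_plan
        return expected_reward(list(branch) + rest, ontable)
    return sum(p * (gain + expected_reward(rest, nxt))
               for p, nxt, gain in outcomes(step, ontable))


PLANS = {
    1: ["shake"],
    2: ["jump", "shake"],
    3: ["jump", "shake", "shake"],
    4: ["jump", ("if_ontable", ["shake", "shake"], ["jump", "shake"])],
}

if __name__ == "__main__":
    for number, plan in PLANS.items():
        print(number, expected_reward(plan))  # 1/6, 1/2, 1, 19/18
```

Under these assumptions the conditional plan [4] comes out slightly ahead of the fixed three-step plan [3].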
50
ExpectiMax
51
Hungry Monkey: 2-Ply Game Tree
[Figure: 2-ply game tree with actions jump and shake. Edge probabilities shown: 2/3, 1/3, 1/6, 5/6. Leaf rewards, left to right: 0 0 1 0, 0 0 1 0, 1 1 2 1, 0 0 1 0.]
52
ExpectiMax 1 – Chance Nodes
[Figure: the 2-ply game tree with the lowest chance nodes filled in. Values (jump, shake) under each second-level max node, left to right: (0, 2/3), (0, 1/6), (1, 7/6), (0, 1/6).]
53
ExpectiMax 2 – Max Nodes
[Figure: the 2-ply game tree with the second-level max nodes filled in, left to right: 2/3, 1/6, 7/6, 1/6.]
54
ExpectiMax 3 – Chance Nodes
[Figure: the 2-ply game tree with the top-level chance nodes filled in: jump = 1/2, shake = 1/3.]
55
ExpectiMax 4 – Max Node
[Figure: the 2-ply game tree with the root max node choosing between jump (1/2) and shake (1/3).]
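The four ExpectiMax passes above amount to a short recursion: chance nodes average over outcomes, max nodes take the best action. The following is a minimal sketch on the Hungry Monkey domain, assuming the monkey starts off the table and that reward is the number of bananas collected; the function and variable names are illustrative.

```python
from fractions import Fraction as F

ACTIONS = ("jump", "shake")


def outcomes(action, ontable):
    """Return [(probability, next_ontable, bananas_gained)] for one action."""
    if action == "shake":
        p = F(2, 3) if ontable else F(1, 6)
        return [(p, ontable, 1), (1 - p, ontable, 0)]
    if ontable:                                # jump while on the table
        return [(F(1), True, 0)]
    return [(F(2, 3), True, 0), (F(1, 3), False, 0)]


def expectimax(steps, ontable=False):
    """Return (expected reward, best first action) with `steps` actions left."""
    if steps == 0:
        return F(0), None
    best_value, best_action = None, None
    for action in ACTIONS:                     # max node: try every action
        value = sum(                           # chance node: weighted average
            p * (gain + expectimax(steps - 1, nxt)[0])
            for p, nxt, gain in outcomes(action, ontable)
        )
        if best_value is None or value > best_value:
            best_value, best_action = value, action
    return best_value, best_action


if __name__ == "__main__":
    print(expectimax(2))  # value 1/2, first action jump
    print(expectimax(3))  # value 19/18, first action jump
```

For two steps this reproduces the root values in the tree (jump = 1/2, shake = 1/3).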
56
Policies
The result of the ExpectiMax analysis is a conditional plan (also called a policy):
Optimal plan for 2 steps: jump; shake
Optimal plan for 3 steps: jump; if (ontable) { shake; shake } else { jump; shake }
Probabilistic planning can be generalized in many ways, including action costs and hidden state.
The general problem is that of solving a Markov Decision Process (MDP).
57
Gambler’s Paradox
How much would you pay to play the following game?
Flip a coin. If heads, you win $2.
Otherwise: flip again. If heads, you win $4.
Otherwise: flip again. If heads, you win $8.
Otherwise: flip again. If heads, you win $16.
And so on, with the prize doubling after every tails.
58
Expected Value
Expected value is INFINITE!
(1/2)*2 + (1/4)*4 + (1/8)*8 + …
“Rationally” you should pay ANY fixed amount.
In real life, people will pay about $20.
–This is consistent with logarithmic utility of money
–(1/2)*log(2) + (1/4)*log(4) + (1/8)*log(8) + …
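The contrast between the two series can be checked numerically. This is a small sketch that truncates the game after a fixed number of flips; the function names are illustrative, and taking the logarithm in base 2 is an assumption (the slide does not state a base).

```python
import math


def truncated_expected_value(max_flips):
    """Sum of (1/2)^k * 2^k for k = 1..max_flips: each term contributes 1."""
    return sum((0.5 ** k) * (2 ** k) for k in range(1, max_flips + 1))


def truncated_expected_log_utility(max_flips):
    """Sum of (1/2)^k * log2(2^k) for k = 1..max_flips (base 2 is assumed)."""
    return sum((0.5 ** k) * math.log2(2 ** k) for k in range(1, max_flips + 1))


if __name__ == "__main__":
    for n in (10, 20, 40):
        print(n, truncated_expected_value(n), truncated_expected_log_utility(n))
```

The first column grows without bound (it equals the number of flips allowed), while the log-utility series converges; with base-2 logarithms its limit is 2.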