Games Henry Kautz.


1 Games Henry Kautz


30 ExpectiMiniMax: Alpha-Beta Pruning
Cutoffs at Max and Min nodes work just as before
If range of values is bounded, can add cutoffs to Chance nodes
Assume that all branches not searched have the worst-case result
L = lowest value achievable (-10)
U = highest value achievable (10)

31 ExpectiMiniMax: Cutoffs
Beta cutoff: [formula not recovered; annotated terms: values seen, values to come, current value]
Alpha cutoff: [formula not recovered; annotated terms: values seen, values to come, current value]
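The cutoff formulas did not survive in this transcript, so the following is only a sketch of one standard way to prune at Chance nodes when values are bounded by L and U. It is not necessarily the formulation on the slide. The tree encoding (tuples tagged "max", "min", "chance"), the function name `search`, and the `count` argument are assumptions made for illustration. For simplicity, children of a Chance node are searched with the full (L, U) window.

```python
from fractions import Fraction

L, U = -10, 10  # lowest / highest achievable value, as on the slide


def search(node, alpha=L, beta=U, count=None):
    """Fail-soft expectiminimax with alpha-beta cutoffs.

    A node is a number (leaf), ("max", [children]), ("min", [children]),
    or ("chance", [(probability, child), ...]).  `count` is an optional
    one-element list that tallies visited leaves.
    """
    if not isinstance(node, tuple):  # leaf
        if count is not None:
            count[0] += 1
        return node
    kind, children = node
    if kind == "max":
        best = L
        for child in children:
            best = max(best, search(child, max(alpha, best), beta, count))
            if best >= beta:  # ordinary beta cutoff
                break
        return best
    if kind == "min":
        best = U
        for child in children:
            best = min(best, search(child, alpha, min(beta, best), count))
            if best <= alpha:  # ordinary alpha cutoff
                break
        return best
    # Chance node: bound the final average using the unseen probability mass.
    seen = Fraction(0)       # sum of p * value over children seen so far
    remaining = Fraction(1)  # probability mass of children still to come
    for prob, child in children:
        seen += prob * search(child, L, U, count)
        remaining -= prob
        if seen + remaining * U <= alpha:  # even if the rest were U: too low
            return seen + remaining * U
        if seen + remaining * L >= beta:   # even if the rest were L: too high
            return seen + remaining * L
    return seen


half = Fraction(1, 2)
# Max over two Chance nodes; the second is cut off after its first child.
max_tree = ("max", [("chance", [(half, 6), (half, 4)]),
                    ("chance", [(half, -10), (half, 8)])])
# Min over two Chance nodes; the second is cut off after its first child.
min_tree = ("min", [("chance", [(half, -4), (half, 0)]),
                    ("chance", [(half, 10), (half, -4)])])

max_count, min_count = [0], [0]
max_value = search(max_tree, count=max_count)
min_value = search(min_tree, count=min_count)
```

In `max_tree` the first Chance node is worth 5. After the second Chance node sees -10, its value can be at most 0, so its last leaf is skipped.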


48 Probabilistic STRIPS
Planning domain: Hungry Monkey
shake: if (ontable)
           Prob(2/3) -> +1 banana
           Prob(1/3) -> no change
       else
           Prob(1/6) -> +1 banana
           Prob(5/6) -> no change
jump:  if (~ontable)
           Prob(2/3) -> ontable
           Prob(1/3) -> ~ontable
       else ontable

49 What is the expected reward?
[1] shake
[2] jump; shake
[3] jump; shake; shake
[4] jump; if (~ontable) { jump; shake } else { shake; shake }
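As a rough check on the four plans, the sketch below evaluates each one against the Hungry Monkey domain. It assumes the monkey starts off the table with no bananas, which is consistent with the game tree on the later slides, and that reward is the number of bananas collected. The plan encoding and the names `outcomes` and `expected_reward` are hypothetical.

```python
from fractions import Fraction as F


def outcomes(action, ontable):
    """Return [(probability, new_ontable, bananas_gained)] for one action."""
    if action == "shake":
        p = F(2, 3) if ontable else F(1, 6)
        return [(p, ontable, 1), (1 - p, ontable, 0)]
    if action == "jump":
        if ontable:
            return [(F(1), True, 0)]
        return [(F(2, 3), True, 0), (F(1, 3), False, 0)]
    raise ValueError(action)


def expected_reward(plan, ontable=False):
    """Expected bananas for a plan: a list of "jump", "shake", or
    ("if_not_ontable", then_plan, else_plan) steps."""
    if not plan:
        return F(0)
    step, rest = plan[0], plan[1:]
    if isinstance(step, tuple):  # conditional step: pick a branch by state
        _, then_plan, else_plan = step
        branch = then_plan if not ontable else else_plan
        return expected_reward(list(branch) + list(rest), ontable)
    return sum(p * (gain + expected_reward(rest, new_ontable))
               for p, new_ontable, gain in outcomes(step, ontable))


plan1 = ["shake"]
plan2 = ["jump", "shake"]
plan3 = ["jump", "shake", "shake"]
plan4 = ["jump", ("if_not_ontable", ["jump", "shake"], ["shake", "shake"])]

rewards = [expected_reward(p) for p in (plan1, plan2, plan3, plan4)]
# rewards == [1/6, 1/2, 1, 19/18] under the stated assumptions
```

Under these assumptions the conditional plan [4] does slightly better than the fixed three-step plan [3], because it retries the jump when the first one fails.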

50 ExpectiMax

51 Hungry Monkey: 2-Ply Game Tree
[Figure: 2-ply game tree. Actions: jump, shake. Branch probabilities: jump 2/3, 1/3; shake 1/6, 5/6. Leaf rewards, left to right: 0 0 1 0, 0 0 1 0, 1 1 2 1, 0 0 1 0]

52 ExpectiMax 1 – Chance Nodes
[Figure: same tree with the lower Chance nodes filled in (jump, shake): 0, 2/3; 0, 1/6; 1, 7/6; 0, 1/6]

53 ExpectiMax 2 – Max Nodes
[Figure: same tree with the lower Max nodes filled in: 2/3, 1/6, 7/6, 1/6]

54 ExpectiMax 3 – Chance Nodes
[Figure: same tree with the top Chance nodes filled in: jump 1/2, shake 1/3]

55 ExpectiMax 4 – Max Node
[Figure: same tree with the root Max node filled in: 1/2 (jump)]
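The bottom-up ExpectiMax pass can be sketched as a short recursion. This is an illustrative version, not the author's code. It assumes the monkey starts off the table, and it tracks only future reward at each node (the slides label nodes with cumulative reward, so inner values differ by the bananas already collected, while the root value is the same). The names `outcomes` and `expectimax` are hypothetical.

```python
from fractions import Fraction as F
from functools import lru_cache


def outcomes(action, ontable):
    """Return [(probability, new_ontable, bananas_gained)] for one action."""
    if action == "shake":
        p = F(2, 3) if ontable else F(1, 6)
        return [(p, ontable, 1), (1 - p, ontable, 0)]
    if ontable:  # jump while already on the table: nothing changes
        return [(F(1), True, 0)]
    return [(F(2, 3), True, 0), (F(1, 3), False, 0)]


@lru_cache(maxsize=None)
def expectimax(steps_left, ontable=False):
    """Return (expected future reward, best action) at a Max node."""
    if steps_left == 0:
        return F(0), None
    best = None
    for action in ("jump", "shake"):
        # Chance node: probability-weighted average over the outcomes.
        value = sum(p * (gain + expectimax(steps_left - 1, nxt)[0])
                    for p, nxt, gain in outcomes(action, ontable))
        # Max node: keep the action with the highest expected value.
        if best is None or value > best[0]:
            best = (value, action)
    return best


two_step = expectimax(2)    # (Fraction(1, 2), "jump")
three_step = expectimax(3)  # (Fraction(19, 18), "jump")
```

For two steps this reproduces the root value 1/2 with jump as the first action, matching the tree above.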

56 Policies
The result of the ExpectiMax analysis is a conditional plan (also called a policy):
Optimal plan for 2 steps: jump; shake
Optimal plan for 3 steps: jump; if (ontable) {shake; shake} else {jump; shake}
Probabilistic planning can be generalized in many ways, including action costs and hidden state
The general problem is that of solving a Markov Decision Process (MDP)

57 Gambler’s Paradox
How much would you pay to play the following game?
Flip a coin. If heads, you win $2.
Otherwise: flip again. If heads, you win $4.
Otherwise: flip again. If heads, you win $8.
Otherwise: flip again. If heads, you win $16.
And so on, with the prize doubling after each tails.

58 Expected Value
Expected value is INFINITE!
(1/2)*2 + (1/4)*4 + (1/8)*8 + …
“Rationally” you should be willing to pay ANY fixed amount.
In real life, people will pay about $20.
– This is consistent with logarithmic utility of money
– (1/2)*log(2) + (1/4)*log(4) + (1/8)*log(8) + …
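The two series can be compared numerically. In the sketch below, every term of the expected-value series equals 1, so the partial sum after n rounds is n and grows without bound. The log-utility series converges to 2*log(2). The function names are hypothetical. Utility is assumed to be the natural log of the winnings alone, ignoring existing wealth, which is a simplification.

```python
import math


def expected_value(rounds):
    """Partial sum of (1/2)*2 + (1/4)*4 + ... over the first `rounds` flips."""
    return sum((0.5 ** k) * (2 ** k) for k in range(1, rounds + 1))


def expected_log_utility(rounds):
    """Partial sum of (1/2)*log(2) + (1/4)*log(4) + ... (natural log)."""
    return sum((0.5 ** k) * math.log(2 ** k) for k in range(1, rounds + 1))


ev_50 = expected_value(50)          # one unit per round: 50.0
log_u = expected_log_utility(200)   # close to 2 * log(2)
certainty_equivalent = math.exp(log_u)  # dollar amount with the same utility
```

Under this particular assumption the certainty equivalent comes out to about $4. Other utility models, for instance ones that account for the player's existing wealth, give different figures.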

