Games 1 Alpha-Beta Example [-∞, +∞] Range of possible values Do DF-search until first leaf.


Presentation on theme: "Games 1 Alpha-Beta Example [-∞, +∞] Range of possible values Do DF-search until first leaf."— Presentation transcript:

1 Games 1 Alpha-Beta Example [-∞, +∞] Range of possible values. Do a depth-first search until the first leaf.

2 Games 2 Alpha-Beta Example (continued) [-∞,3] [-∞,+∞]

3 Games 3 Alpha-Beta Example (continued) [-∞,3] [-∞,+∞]

4 Games 4 Alpha-Beta Example (continued) [3,+∞] [3,3]

5 Games 5 Alpha-Beta Example (continued) [-∞,2] [3,+∞] [3,3] This node is worse for MAX

6 Games 6 Alpha-Beta Example (continued) [-∞,2] [3,14] [3,3] [-∞,14]

7 Games 7 Alpha-Beta Example (continued) [-∞,2] [3,5] [3,3] [-∞,5]

8 Games 8 Alpha-Beta Example (continued) [2,2] [-∞,2] [3,3]

9 Games 9 Alpha-Beta Example (continued) [2,2] [-∞,2] [3,3]

10 Games 10 Comments about Alpha-Beta Pruning
- Pruning does not affect the final result
- Entire subtrees can be pruned
- With good move ordering, alpha-beta pruning can look roughly twice as far ahead as minimax in the same amount of time
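The pruning walked through on the slides above can be sketched in a few lines of Python. This is a minimal illustration, not the presenter's code. The tree is an assumption: the leaf values 3, 2, 14, 5, 2 match the intervals shown on the slides, while the remaining leaves (12, 8, 4, 6) are filled in from the usual textbook version of this example.

```python
import math

def alphabeta(node, maximizing, alpha=-math.inf, beta=math.inf, visited=None):
    """Return the minimax value of `node`, skipping branches that cannot matter.

    A node is either a number (leaf) or a list of child nodes.
    `visited` collects the leaves that were actually evaluated.
    """
    if visited is None:
        visited = []
    if not isinstance(node, list):      # leaf: record it and return its utility
        visited.append(node)
        return node
    if maximizing:
        value = -math.inf
        for child in node:
            value = max(value, alphabeta(child, False, alpha, beta, visited))
            alpha = max(alpha, value)   # best value MAX can guarantee so far
            if alpha >= beta:           # MIN above will never allow this branch
                break
        return value
    value = math.inf
    for child in node:
        value = min(value, alphabeta(child, True, alpha, beta, visited))
        beta = min(beta, value)         # best value MIN can guarantee so far
        if alpha >= beta:               # MAX above already has something better
            break
    return value

# Assumed example tree: MAX root with three MIN children.
TREE = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]

seen = []
root_value = alphabeta(TREE, True, visited=seen)
print(root_value)  # 3
print(seen)        # [3, 12, 8, 2, 14, 5, 2]; leaves 4 and 6 are never examined
```

The second MIN node is abandoned after its first leaf (2), because MAX already has 3 available, which mirrors the "This node is worse for MAX" step on slide 5.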

11 Games 11 Heuristic Evaluation Function (EVAL)
- Idea: produce an estimate of the expected utility of the game from a given position.
- Performance depends on the quality of EVAL.
- Must be able to differentiate between good and bad board states
- Exact values are not important

12 Games 12 Heuristic Evaluation Function (EVAL)
- Must be consistent with the utility function
  - values for terminal nodes (or at least their order) must be the same
- Should reflect the actual chances of winning
- Frequently, weighted linear functions are used
  - $E = w_1 f_1 + w_2 f_2 + \dots + w_n f_n$
  - a combination of features, weighted by their relevance
- Example in chess
  - Weights: pawn = 1, knight = bishop = 3, rook = 5, queen = 9

13 Games 13 Example Chess Score
- Black has 5 pawns, 1 bishop, 2 rooks
  - Score = 1*(5) + 3*(1) + 5*(2) = 5 + 3 + 10 = 18
- White has 5 pawns, 1 rook
  - Score = 1*(5) + 5*(1) = 5 + 5 = 10
- Overall scores for this board state:
  - black = 18 - 10 = 8
  - white = 10 - 18 = -8
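The weighted linear evaluation and the chess score above can be reproduced with a short Python sketch. The function and dictionary names are illustrative assumptions. The only features used are piece counts, which is the simplest possible choice of the features f_i.

```python
# Material weights as listed on the slide (the king is not scored).
WEIGHTS = {"pawn": 1, "knight": 3, "bishop": 3, "rook": 5, "queen": 9}

def material_score(pieces):
    """Weighted linear sum w_1*f_1 + ... + w_n*f_n, where f_i is a piece count."""
    return sum(WEIGHTS[name] * count for name, count in pieces.items())

def evaluate(own, other):
    """Score of a position from the point of view of the side owning `own`."""
    return material_score(own) - material_score(other)

black = {"pawn": 5, "bishop": 1, "rook": 2}
white = {"pawn": 5, "rook": 1}

print(material_score(black))   # 18
print(material_score(white))   # 10
print(evaluate(black, white))  # 8
print(evaluate(white, black))  # -8
```

Because the two overall scores are negatives of each other, a search only needs to keep one of them and flip the sign when the side to move changes.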

14 Games 14 Example: Tic-Tac-Toe
- Simple evaluation function: E(s) = (rx + cx + dx) - (ro + co + do), where r, c, d are the numbers of row, column and diagonal lines still available, and x and o are the pieces of the two players
- 1-ply lookahead
  - start at the top of the tree
  - evaluate all 9 choices for player 1
  - pick the maximum E-value
- 2-ply lookahead
  - also looks at the opponent's possible moves
  - assumes that the opponent picks the minimum E-value
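A minimal Python sketch of this evaluation function and the two lookahead depths follows. It assumes that a line counts as "still available" to a player when the opponent has no piece on it. The board encoding (a 9-character string, index 0 to 8, row by row) and all function names are illustrative choices rather than part of the slides.

```python
# The 8 winning lines: 3 rows, 3 columns, 2 diagonals.
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
         (0, 3, 6), (1, 4, 7), (2, 5, 8),
         (0, 4, 8), (2, 4, 6)]

def open_lines(board, player):
    """Number of lines that contain no piece of the opponent."""
    opponent = "O" if player == "X" else "X"
    return sum(1 for line in LINES if all(board[i] != opponent for i in line))

def evaluate(board):
    """E(s) = lines still open for X minus lines still open for O."""
    return open_lines(board, "X") - open_lines(board, "O")

def moves(board, player):
    """Yield (square, resulting board) for every empty square."""
    for i, cell in enumerate(board):
        if cell == " ":
            yield i, board[:i] + player + board[i + 1:]

def one_ply(board):
    """X picks the move with the maximum E-value."""
    scores = {i: evaluate(child) for i, child in moves(board, "X")}
    best = max(scores, key=scores.get)
    return best, scores[best]

def two_ply(board):
    """X picks the move whose worst O reply (minimum E-value) is best."""
    scores = {}
    for i, child in moves(board, "X"):
        scores[i] = min(evaluate(reply) for _, reply in moves(child, "O"))
    best = max(scores, key=scores.get)
    return best, scores[best]

EMPTY = " " * 9
print(one_ply(EMPTY))  # (4, 4): centre square, E = 8 - 4 = 4
print(two_ply(EMPTY))  # (4, 1): centre square, worst case E = 5 - 4 = 1
```

Under this reading, a corner opening scores 8 - 5 = 3 and an edge opening 8 - 6 = 2, so both lookahead depths prefer the centre.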

15 Games 15 Tic-Tac-Toe 1-Ply
- E(s11) = 8 - 5 = 3
- E(s12) = 8 - 6 = 2
- E(s13) = 8 - 5 = 3
- E(s14) = 8 - 6 = 2
- E(s15) = 8 - 4 = 4
- E(s16) = 8 - 6 = 2
- E(s17) = 8 - 5 = 3
- E(s18) = 8 - 6 = 2
- E(s19) = 8 - 5 = 3
- E(s0) = Max{E(s11), ..., E(s1n)} = Max{2,3,4} = 4
[Board diagrams not reproduced]

16 Games 16 Tic-Tac-Toe 2-Ply
- First ply:
  - E(s1:1) = 8 - 5 = 3
  - E(s1:2) = 8 - 6 = 2
  - E(s1:3) = 8 - 5 = 3
  - E(s1:4) = 8 - 6 = 2
  - E(s1:5) = 8 - 4 = 4
  - E(s1:6) = 8 - 6 = 2
  - E(s1:7) = 8 - 5 = 3
  - E(s1:8) = 8 - 6 = 2
  - E(s1:9) = 8 - 5 = 3
  - E(s0) = Max{E(s11), ..., E(s1n)} = Max{2,3,4} = 4
- Second ply:
  - E(s2:1) = 6 - 5 = 1
  - E(s2:2) = 5 - 5 = 0
  - E(s2:3) = 6 - 5 = 1
  - E(s2:4) = 4 - 5 = -1
  - E(s2:5) = 6 - 5 = 1
  - E(s2:6) = 5 - 5 = 0
  - E(s2:7) = 6 - 5 = 1
  - E(s2:8) = 5 - 5 = 0
  - E(s2:9) = 5 - 6 = -1
  - E(s2:10) = 5 - 6 = -1
  - E(s2:11) = 5 - 6 = -1
  - E(s2:12) = 4 - 6 = -2
  - E(s2:13) = 6 - 6 = 0
  - E(s2:14) = 5 - 6 = -1
  - E(s2:15) = 6 - 6 = 0
  - E(s2:16) = 5 - 6 = -1
  - E(s2:41) = 5 - 4 = 1
  - E(s2:42) = 6 - 4 = 2
  - E(s2:43) = 5 - 4 = 1
  - E(s2:44) = 6 - 4 = 2
  - E(s2:45) = 6 - 4 = 2
  - E(s2:46) = 5 - 4 = 1
  - E(s2:47) = 6 - 4 = 2
  - E(s2:48) = 5 - 4 = 1
[Board diagrams not reproduced]

17 Games 17 Checkers Case Study
- Initial board configuration
  - Black: single on 20, single on 21, king on 31
  - Red: single on 23, king on 22
- Evaluation function: E(s) = (5 x1 + x2) - (5 r1 + r2), where x1 = black king advantage, x2 = black single advantage, r1 = red king advantage, r2 = red single advantage
[Figure: checkers board with squares numbered 1 to 32]
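As a worked instance of this evaluation function, the sketch below assumes that each "advantage" term is simply the number of kings or singles that side has on the board. That reading is an assumption, and the function and parameter names are hypothetical.

```python
def checkers_eval(black_kings, black_singles, red_kings, red_singles):
    """E(s) = (5*x1 + x2) - (5*r1 + r2): a king counts five times a single."""
    return (5 * black_kings + black_singles) - (5 * red_kings + red_singles)

# Initial configuration from the slide:
# Black has 1 king and 2 singles, Red has 1 king and 1 single.
start = checkers_eval(black_kings=1, black_singles=2, red_kings=1, red_singles=1)
print(start)  # (5 + 2) - (5 + 1) = 1

# Hypothetical follow-up position in which Red's king has been captured.
after_capture = checkers_eval(black_kings=1, black_singles=2, red_kings=0, red_singles=1)
print(after_capture)  # (5 + 2) - (0 + 1) = 6
```

Positive values favour Black and negative values favour Red, so Black plays the MAX role under this function.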

18 Games 18 Checkers MiniMax Example [Figure: game tree with alternating MAX and MIN levels over the numbered board; edges are labeled with moves: 20 -> 16, 21 -> 17, 31 -> 26, 31 -> 27, 22 -> 17, 22 -> 18, 22 -> 25, 22 -> 26, 23 -> 26, 23 -> 27, 21 -> 14, 16 -> 11, 31 -> 24, 22 -> 13, 22 -> 31, 23 -> 30, 23 -> 32; node values not reproduced]

19 Games 19 Checkers Alpha-Beta Example [Figure: same game tree; bounds annotated on the figure: 1, 6]

20 Games 20 Checkers Alpha-Beta Example [Figure: same game tree; bounds annotated on the figure: 1, 1]

21 Games 21 Checkers Alpha-Beta Example [Figure: same game tree; bounds annotated on the figure: 1, 1] Cutoff: no need to examine further branches

22 Games 22 Checkers Alpha-Beta Example [Figure: same game tree; bounds annotated on the figure: 1, 1]

23 Games 23 Checkers Alpha-Beta Example [Figure: same game tree; bounds annotated on the figure: 1, 1] Cutoff: no need to examine further branches

24 Games 24 Checkers Alpha-Beta Example [Figure: same game tree; bounds annotated on the figure: 1, 1]

25 Games 25 Checkers Alpha-Beta Example [Figure: same game tree; bounds annotated on the figure: 1, 0]

26 Games 26 Checkers Alpha-Beta Example [Figure: same game tree; bounds annotated on the figure: 1, -4]

27 Games 27 Checkers Alpha-Beta Example [Figure: same game tree; bounds annotated on the figure: 1, -4] Cutoff: no need to examine further branches

28 Games 28 Checkers Alpha-Beta Example [Figure: same game tree; bounds annotated on the figure: 1, -8]

29 Games 29 Horizon Problem
- Moves may have disastrous consequences in the future, but the consequences are not visible
- Agent cannot see far enough into the search space

30 Games 30 Games with Chance
- In many games, there is a degree of unpredictability through random elements
  - throwing dice, card distribution, roulette wheel, …
- This requires chance nodes in addition to the Max and Min nodes
  - branches indicate possible variations
  - each branch indicates the outcome and its likelihood (probability)

31 Games 31 Games with Chance [Figure: game tree with chance nodes]

32 Games 32 Decisions with Chance
- The utility value of a position depends on the random element
  - the definite minimax value must be replaced by an expected value
- Calculation of expected values
  - utility function for terminal nodes
  - for all other nodes
    - calculate the utility for each chance event
    - weight it by the probability that the event occurs
    - add up the weighted utilities
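The expected-value calculation above can be sketched as a small recursive function. The node encoding (tuples tagged "max", "min" or "chance") and the example tree with its probabilities are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def expectiminimax(node):
    """Value of a game tree that mixes MAX, MIN and chance nodes.

    A leaf is a number. An inner node is (kind, children), where the children
    of a chance node are (probability, child) pairs.
    """
    if isinstance(node, (int, float)):   # terminal node: utility function
        return node
    kind, children = node
    if kind == "max":
        return max(expectiminimax(child) for child in children)
    if kind == "min":
        return min(expectiminimax(child) for child in children)
    if kind == "chance":
        # Weight each outcome by its probability and add the results up.
        return sum(p * expectiminimax(child) for p, child in children)
    raise ValueError(f"unknown node kind: {kind!r}")

# Hypothetical tree: MAX chooses between two moves, each followed by a coin flip.
TREE = ("max", [
    ("chance", [(0.5, ("min", [2, 4])), (0.5, ("min", [6, 8]))]),    # 0.5*2 + 0.5*6 = 4.0
    ("chance", [(0.5, ("min", [0, 10])), (0.5, ("min", [10, 12]))]), # 0.5*0 + 0.5*10 = 5.0
])

print(expectiminimax(TREE))  # 5.0
```

Unlike plain minimax, the result depends on the actual magnitudes of the leaf values and not only on their order, since they are averaged at chance nodes.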

33 Games 33 More interesting (but still trivial) game
- Deal four cards face up
- Player 1 chooses a card
- Player 2 throws a die
  - If it’s a six, player 2 chooses a card, swaps it with player 1’s and keeps player 1’s card
  - If it’s not a six, player 2 just chooses a card
- Player 1 chooses next card
- Player 2 takes the last card

34 Games 34 Expectiminimax Diagram

35 Games 35 Expectiminimax Calculations

36 Games 36 Games and Computers
- State of the art for some game programs
  - Chess
  - Checkers
  - Othello
  - Backgammon
  - Go

37 Games 37 Chess
- Deep Blue, a special-purpose parallel computer, defeated the world champion Garry Kasparov in 1997
  - the human player didn’t show his best game
  - some claim that the circumstances were questionable
  - Deep Blue used a massive database of games from the literature
- Fritz, a program running on an ordinary PC, drew an eight-game match against the world champion Vladimir Kramnik in 2002
  - top programs and top human players are roughly equal

38 Games 38 Checkers
- Arthur Samuel developed a checkers program in the 1950s that learned its own evaluation function
  - it reached expert level in the 1960s
- Chinook became world champion in 1994
  - its human opponent, Dr. Marion Tinsley, withdrew for health reasons
  - Tinsley had been the world champion for 40 years
  - Chinook uses off-the-shelf hardware, alpha-beta search, and an endgame database for six-piece positions

39 Games 39 Othello
- Logistello defeated the human world champion in 1997
- Many programs play far better than humans
  - smaller search space than chess
  - little evaluation expertise available

40 Games 40 Backgammon
- TD-Gammon, a neural-network-based program, ranks among the best players in the world
  - improves its own evaluation function through learning techniques
  - search-based methods are practically hopeless
    - chance elements, branching factor

41 Games 41 Go
- Humans play far better
  - large branching factor (around 360)
  - search-based methods are hopeless
- Rule-based systems play at amateur level
- The use of pattern-matching techniques can improve the capabilities of programs
  - difficult to integrate
- $2,000,000 prize for the first program to defeat a top-level player

42 Games 42 Chapter Summary
- Many game techniques are derived from search methods
- The minimax algorithm determines the best move for a player by calculating the complete game tree
- Alpha-beta pruning dismisses parts of the search tree that are provably irrelevant
- An evaluation function gives an estimate of the utility of a state when a complete search is impractical
- Chance events can be incorporated into the minimax algorithm by considering the weighted probabilities of chance events

