# 159.302 Lecture 12. Last time: CSPs, backtracking, forward checking. Today: Game Playing.


## Types of Games

| | Deterministic | Chance |
|---|---|---|
| Perfect information | Chess, Checkers, Go | Backgammon, Monopoly |
| Imperfect information | Battleships | Bridge, Poker, Scrabble |

## Two player game

Two players, MAX and MIN. MAX moves first and they take turns until the game is over.

Properties:

- Initial state: e.g. the board configuration in chess.
- Successor function: a list of (move, state) pairs specifying the legal moves.
- Terminal test: is the game finished?
- Utility function: gives a numerical value for terminal states, e.g. win (+1), lose (-1) and draw (0).

MAX uses a search tree to determine its next move.
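The four components listed above can be sketched for noughts and crosses roughly as follows. This is only an illustration: the board encoding (a tuple of nine cells) and all function names are assumptions made for this sketch, not part of the lecture.

```python
from typing import List, Optional, Tuple

# Assumed encoding: 9 cells in row-major order, each "X", "O" or " ".
Board = Tuple[str, ...]

# The eight winning lines (rows, columns, diagonals) as cell indices.
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
         (0, 3, 6), (1, 4, 7), (2, 5, 8),
         (0, 4, 8), (2, 4, 6)]


def initial_state() -> Board:
    """Initial state: the empty board."""
    return (" ",) * 9


def to_move(board: Board) -> str:
    """MAX plays "X" and moves first, so X moves when the counts are equal."""
    return "X" if board.count("X") == board.count("O") else "O"


def winner(board: Board) -> Optional[str]:
    """Return "X" or "O" if a line is complete, otherwise None."""
    for a, b, c in LINES:
        if board[a] != " " and board[a] == board[b] == board[c]:
            return board[a]
    return None


def terminal_test(board: Board) -> bool:
    """The game is finished when someone has won or the board is full."""
    return winner(board) is not None or " " not in board


def successors(board: Board) -> List[Tuple[int, Board]]:
    """Successor function: (move, state) pairs for every legal move."""
    if terminal_test(board):
        return []
    player = to_move(board)
    return [(i, board[:i] + (player,) + board[i + 1:])
            for i, cell in enumerate(board) if cell == " "]


def utility(board: Board) -> int:
    """Utility from MAX's point of view: win +1, lose -1, draw 0."""
    w = winner(board)
    return 1 if w == "X" else -1 if w == "O" else 0
```

For example, `successors(initial_state())` yields nine (move, state) pairs, one for each empty cell.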

## Game tree for noughts and crosses

## What is the Optimal Strategy?

Assume MIN plays perfectly. Assign each node a value given by:

- the utility function, if it is a terminal node;
- the minimum of the values of its successor nodes, if it is a MIN node;
- the maximum of the values of its successor nodes, if it is a MAX node.

## Minimax Algorithm

For each move by the computer:

1. Perform a depth-first search as far as the terminal states.
2. Assign utilities at each terminal state.
3. Propagate the minimax choices upwards:
   - if the parent is a minimizer (opponent), propagate up the minimum value of the children;
   - if the parent is a maximizer (computer), propagate up the maximum value of the children.
4. Choose the move (the child of the current node) corresponding to the maximum of the minimax values of the children.

## Minimax Algorithm

    int MINIMAX(N) {
        if N is a leaf then
            return the score of this leaf
        else
            let N1, N2, ..., Nm be the successors of N
            if N is a Min node then
                return min{MINIMAX(N1), ..., MINIMAX(Nm)}
            else
                return max{MINIMAX(N1), ..., MINIMAX(Nm)}
    }
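A minimal runnable sketch of the pseudocode above, assuming the game tree is given explicitly as nested Python lists whose integer leaves are the terminal scores. The tree representation, the function names and the example tree are assumptions for illustration only.

```python
from typing import List, Union

# Assumed representation: a leaf is an int score, an inner node is a list of children.
Tree = Union[int, List["Tree"]]


def minimax(node: Tree, is_max: bool) -> int:
    """Minimax value of `node`; `is_max` says whether MAX is to move there."""
    if isinstance(node, int):  # leaf: return its score
        return node
    values = [minimax(child, not is_max) for child in node]
    return max(values) if is_max else min(values)


def best_move(root: List[Tree]) -> int:
    """Index of the child of a MAX root with the highest minimax value."""
    values = [minimax(child, False) for child in root]
    return values.index(max(values))


# Hypothetical two-ply tree: MAX at the root, three MIN nodes below it.
tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
print(minimax(tree, True))  # 3
print(best_move(tree))      # 0
```

The three MIN nodes have values 3, 2 and 2, so MAX chooses the first move and the root has value 3.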

## Example

## A Real Game

- Start with a stack of coins.
- Each player in turn divides one of the current stacks into two unequal stacks (one having more coins than the other).
- The game ends when every stack contains one or two coins, since such a stack cannot be divided into two unequal stacks.
- The first player who cannot play loses.

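## Game Tree

Game tree for a starting stack of 7 coins, level by level. In this tree MIN makes the first move.

- MIN's turn: (7)
- MAX's turn: (6, 1), (5, 2), (4, 3)
- MIN's turn: (5, 1, 1), (4, 2, 1), (3, 2, 2), (3, 3, 1)
- MAX's turn: (4, 1, 1, 1), (3, 2, 1, 1), (2, 2, 2, 1)
- MIN's turn: (3, 1, 1, 1, 1), (2, 2, 1, 1, 1)
- MAX's turn: (2, 1, 1, 1, 1, 1)

Terminal positions:

- (2, 2, 2, 1): MAX loses
- (2, 2, 1, 1, 1): MIN loses
- (2, 1, 1, 1, 1, 1): MAX loses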
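The 7-coin game tree above can be generated and solved exhaustively with a short sketch like the one below. The state encoding (a tuple of stack sizes sorted in descending order) and the function names are assumptions for illustration; the rule used is the one stated for the game, that the player who cannot play loses.

```python
from functools import lru_cache
from typing import List, Tuple

# Assumed encoding: stack sizes sorted in descending order, e.g. (4, 2, 1).
State = Tuple[int, ...]


def successors(state: State) -> List[State]:
    """All positions reachable by splitting one stack into two unequal stacks."""
    result = set()
    for i, stack in enumerate(state):
        rest = state[:i] + state[i + 1:]
        # Choose a < stack - a, so the two new stacks are never equal.
        for a in range(1, (stack - 1) // 2 + 1):
            result.add(tuple(sorted(rest + (a, stack - a), reverse=True)))
    return sorted(result, reverse=True)


@lru_cache(maxsize=None)
def mover_wins(state: State) -> bool:
    """True if the player about to move can force a win from `state`.

    A player with no legal move loses, so a position is winning exactly
    when some successor is losing for the opponent.
    """
    return any(not mover_wins(s) for s in successors(state))


print(successors((7,)))  # [(6, 1), (5, 2), (4, 3)]
print(mover_wins((7,)))  # False
```

With 7 coins the player who moves first cannot force a win, so whoever moves second wins with perfect play.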

## Complexity

- Time complexity: $O(b^m)$
- Space complexity: $O(bm)$

where $b$ is the branching factor and $m$ is the maximum depth. For chess, $b \approx 35$ and $m \approx 100$, so a full search is not feasible.
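To get a feel for why the chess figures above rule out exhaustive search, the following sketch simply computes $b^m$ for $b = 35$ and $m = 100$ using Python's arbitrary-precision integers. The variable names are illustrative.

```python
# Approximate chess figures quoted in the lecture.
b = 35   # branching factor
m = 100  # maximum depth

nodes = b ** m           # exact integer, order of the number of nodes minimax visits
digits = len(str(nodes))

print(digits)  # 155
```

The result has 155 decimal digits, i.e. on the order of $10^{154}$ nodes.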

## Evaluation functions

- It is often not practical to search all the way to the terminal states.
- Use a heuristic evaluation function to estimate which moves lead to better terminal states.
- A cutoff test must be used to limit the search.
- Choosing a good evaluation function is very important to the efficiency of this method.
- The evaluation function must agree with the utility function on terminal states, and it must be a good predictor of the terminal values.
- If the evaluation function is infallible then no search is necessary: just choose the move that gives the best position.
- The better the evaluation function, the less search needs to be done.

## Cutting off Search

- Fixed depth: OK if we know how long it will take to evaluate the tree.
- Iterative deepening: good if there is a time limit; just keep going until time is up and use the best move found so far.
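Both cutoff strategies can be sketched together: a depth-limited minimax, wrapped in a loop that deepens until a time limit expires. To keep the example self-contained it uses a hypothetical toy game that is not from the lecture (remove 1, 2 or 3 counters from a pile; the player who cannot move loses) and a deliberately crude evaluation that returns 0 at the cutoff. For simplicity the clock is only checked between iterations; a real program would also check it inside the search.

```python
import time
from typing import List, Optional, Tuple


def successors(n: int) -> List[Tuple[int, int]]:
    """(move, state) pairs for the toy game: remove 1, 2 or 3 counters."""
    return [(k, n - k) for k in (1, 2, 3) if k <= n]


def value(n: int, max_to_move: bool, depth: int) -> int:
    """Depth-limited minimax value from MAX's point of view."""
    if n == 0:
        # Terminal: the player to move cannot play and loses.
        return -1 if max_to_move else 1
    if depth == 0:
        return 0  # cutoff reached: crude evaluation, outcome unknown
    values = [value(s, not max_to_move, depth - 1) for _, s in successors(n)]
    return max(values) if max_to_move else min(values)


def iterative_deepening(n: int, time_limit: float,
                        max_depth: int = 50) -> Optional[int]:
    """Best move for MAX found by the deepest search completed in time."""
    deadline = time.monotonic() + time_limit
    best = None
    for depth in range(1, max_depth + 1):
        if time.monotonic() >= deadline:
            break  # time is up: keep the best move found so far
        scored = [(value(s, False, depth - 1), move)
                  for move, s in successors(n)]
        best = max(scored)[1]
    return best


print(iterative_deepening(10, time_limit=5.0, max_depth=10))  # 2
```

From a pile of 10, removing 2 leaves 8 counters, a losing position for the opponent, and the deepest iteration finds this.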

## Can we do better?

Yes. Minimax examines some branches that are already known to be bad.

Alpha-beta pruning:

- Keep track of the best value MAX can be sure of so far (alpha) and the worst value MIN can hold MAX to so far (beta).
- If one of the successors of a MIN node is worse than the best so far (alpha), go no further.
- If one of the successors of a MAX node is better than the worst so far (beta), go no further.

## Alpha-Beta Algorithm

    int MAX-VALUE (state, alpha, beta) {
        if CUTOFF-TEST (state) then return EVAL (state)
        v = -MAXVAL
        for each s in SUCCESSORS (state) {
            v = MAX (v, MIN-VALUE (s, alpha, beta))
            if (v >= beta) return v
            if (v > alpha) alpha = v
        }
        return v
    }

    int MIN-VALUE (state, alpha, beta) {
        if CUTOFF-TEST (state) then return EVAL (state)
        v = MAXVAL
        for each s in SUCCESSORS (state) {
            v = MIN (v, MAX-VALUE (s, alpha, beta))
            if (v <= alpha) return v
            if (v < beta) beta = v
        }
        return v
    }
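A runnable sketch of the two routines above, merged into a single function and applied to an explicit tree of nested lists. The tree representation, the `counter` argument used to count leaf evaluations, and the example tree are assumptions for illustration; here the cutoff test is simply "the node is a leaf".

```python
import math
from typing import List, Optional, Union

# Assumed representation: a leaf is an int score, an inner node is a list of children.
Tree = Union[int, List["Tree"]]


def alphabeta(node: Tree, is_max: bool,
              alpha: float = -math.inf, beta: float = math.inf,
              counter: Optional[List[int]] = None) -> float:
    """Alpha-beta value of `node`; counter[0] counts evaluated leaves."""
    if isinstance(node, int):  # cutoff test: leaf reached
        if counter is not None:
            counter[0] += 1
        return node
    if is_max:
        v = -math.inf
        for child in node:
            v = max(v, alphabeta(child, False, alpha, beta, counter))
            if v >= beta:  # MIN above will never allow this branch
                return v
            alpha = max(alpha, v)
        return v
    v = math.inf
    for child in node:
        v = min(v, alphabeta(child, True, alpha, beta, counter))
        if v <= alpha:  # MAX above already has something at least as good
            return v
        beta = min(beta, v)
    return v


# Hypothetical two-ply tree: MAX at the root, three MIN nodes below it.
tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
evaluated = [0]
print(alphabeta(tree, True, counter=evaluated), evaluated[0])  # 3 7
```

Plain minimax would evaluate all 9 leaves of this tree. Alpha-beta returns the same value, 3, after evaluating 7, because the second MIN node is abandoned as soon as the leaf 2 is seen.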

## Alpha-Beta Pruning

## Effectiveness of alpha-beta pruning

What are the maximum savings possible? Suppose the tree is ordered as follows (^ marks a MAX level, v a MIN level; leaf values are listed left to right):

- ^ root
  - v first MIN node
    - ^ 14\*, 15\*, 16\*
    - ^ 17\*, 18, 19
    - ^ 20\*, 21, 22
  - v second MIN node
    - ^ 13\*, 14\*, 15\*
    - ^ 26, 27, 28
    - ^ 29, 30, 31
  - v third MIN node
    - ^ 12\*, 13\*, 14\*
    - ^ 35, 36, 37
    - ^ 38, 39, 40

Only those nodes marked (\*) need be evaluated.

How many static evaluations are needed? If $b$ is the branching factor (3 above) and $d$ is the depth (3 above):

- $s = 2b^{d/2} - 1$ if $d$ is even
- $s = b^{(d+1)/2} + b^{(d-1)/2} - 1$ if $d$ is odd

## Effectiveness of alpha-beta pruning

- For our tree, $d = 3$ and $b = 3$, so $s = 11$.
- This is only for the perfectly arranged tree. It gives a lower bound on the number of evaluations of approximately $2b^{d/2}$.
- The worst case is $b^d$ (the same as minimax).
- In practice, for reasonable games, the complexity is $O(b^{3d/4})$.
- Using minimax with alpha-beta pruning allows us to look ahead about half as far again as without.
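The count of 11 can be checked with a short sketch that evaluates the best-case formula and also runs alpha-beta on the ordered example tree, counting the leaves it evaluates. The function names and the nested-list encoding of the tree are assumptions for illustration.

```python
import math


def min_static_evaluations(b: int, d: int) -> int:
    """Best-case number of static evaluations for branching b and depth d."""
    if d % 2 == 0:
        return 2 * b ** (d // 2) - 1
    return b ** ((d + 1) // 2) + b ** ((d - 1) // 2) - 1


def count_evaluations(node, is_max, alpha=-math.inf, beta=math.inf):
    """Return (value, number of leaves evaluated) under alpha-beta pruning."""
    if isinstance(node, int):
        return node, 1
    v = -math.inf if is_max else math.inf
    total = 0
    for child in node:
        child_value, n = count_evaluations(child, not is_max, alpha, beta)
        total += n
        if is_max:
            v = max(v, child_value)
            if v >= beta:
                break  # prune the remaining children
            alpha = max(alpha, v)
        else:
            v = min(v, child_value)
            if v <= alpha:
                break  # prune the remaining children
            beta = min(beta, v)
    return v, total


# The ordered example tree: MAX root, MIN nodes, MAX nodes, then leaves.
tree = [
    [[14, 15, 16], [17, 18, 19], [20, 21, 22]],
    [[13, 14, 15], [26, 27, 28], [29, 30, 31]],
    [[12, 13, 14], [35, 36, 37], [38, 39, 40]],
]

print(min_static_evaluations(3, 3))   # 11
print(count_evaluations(tree, True))  # (16, 11)
```

Both agree: 11 of the 27 leaves are evaluated, and the root has value 16.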
