Games CPSC 386 Artificial Intelligence Ellen Walker Hiram College
Why Games? Small number of rules Well-defined knowledge set Easy to evaluate performance Large search spaces (too large for exhaustive search) Fame & Fortune, e.g. Chess
Example Games & Best Computer Players (sec. 6.6 w/updates) Chess - Deep Blue (beat Kasparov); Deep Junior (tied Kasparov); Hydra (scheduled to play British champion for 80,000 pounds) Checkers - Chinook (world champion) Go - Goemate, Go4++ (rated “weak amateur”) Othello - Iago (world championship level), Logistello (defeated world champion, now retired) Backgammon - TD-Gammon (neural network that learns to play using “reinforcement learning”)
Properties of Games Two-Player Zero-sum –If it’s good for one player, it’s bad for the opponent and vice versa Perfect information –All relevant information is apparent to both players (no hidden cards)
Game as Search Problem –State space search Each potential board or game position is a state Each possible move is an operation Space can be BIG: –large branching factor (chess avg. 35) –deep search for game (chess avg. 50 ply) –Components of any search technique Move generator (successor function) Terminal test (end of game?) Utility function (win, lose or draw?)
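The three components listed above can be sketched for tic-tac-toe. This is a minimal illustration rather than code from the course: the board encoding (a tuple of nine cells) and the names `successors`, `is_terminal`, and `utility` are assumptions chosen for this example.

```python
# Board: tuple of 9 cells read row by row, each "X", "O", or " " (blank).
WIN_LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),   # rows
             (0, 3, 6), (1, 4, 7), (2, 5, 8),   # columns
             (0, 4, 8), (2, 4, 6)]              # diagonals
EMPTY = (" ",) * 9

def winner(board):
    """Return "X" or "O" if that player has three in a row, else None."""
    for a, b, c in WIN_LINES:
        if board[a] != " " and board[a] == board[b] == board[c]:
            return board[a]
    return None

def to_move(board):
    """X moves first, so X is to move whenever the piece counts are equal."""
    return "X" if board.count("X") == board.count("O") else "O"

def successors(board):
    """Move generator: every board reachable in one move."""
    player = to_move(board)
    return [board[:i] + (player,) + board[i + 1:]
            for i, cell in enumerate(board) if cell == " "]

def is_terminal(board):
    """Terminal test: someone has won, or the board is full."""
    return winner(board) is not None or " " not in board

def utility(board):
    """Utility from X's point of view: win 1, loss -1, draw 0."""
    w = winner(board)
    return 1 if w == "X" else -1 if w == "O" else 0

print(len(successors(EMPTY)))  # 9 first moves before symmetry is considered
```

Any search technique can then be written against these three functions without knowing the rules of the game.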
Game Tree Root is initial state Next level is all of first player’s moves Next level is all of second player’s moves Example: Tic Tac Toe –Root: 9 blank squares –Level 1: 3 different boards (corner, center and edge X) –Level 2 below center: 2 different boards (corner, edge) –Etc. Utility function: win for X is 1, win for O is -1 –X is Maximizer, O is minimizer
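The counts in the tic-tac-toe example (3 boards at level 1, 2 boards below the center move) follow from treating rotations and reflections of a board as the same position. The sketch below checks them. It assumes a 9-character string encoding with "." for a blank square; `canonical` and `distinct_successors` are names made up for this illustration.

```python
ROT = [6, 3, 0, 7, 4, 1, 8, 5, 2]      # new[i] = old[ROT[i]]: rotate 90 degrees clockwise
MIRROR = [2, 1, 0, 5, 4, 3, 8, 7, 6]   # new[i] = old[MIRROR[i]]: flip left to right

def apply(perm, board):
    """Rearrange the squares of a 9-character board string."""
    return "".join(board[j] for j in perm)

def canonical(board):
    """Smallest string among the 8 rotations and reflections of the board."""
    variants = []
    for b in (board, apply(MIRROR, board)):
        for _ in range(4):
            variants.append(b)
            b = apply(ROT, b)
    return min(variants)

def distinct_successors(board):
    """Successor boards with symmetric duplicates merged."""
    player = "X" if board.count("X") == board.count("O") else "O"
    children = {canonical(board[:i] + player + board[i + 1:])
                for i, cell in enumerate(board) if cell == "."}
    return sorted(children)

level1 = distinct_successors("." * 9)
below_center = distinct_successors("....X....")
print(len(level1), len(below_center))  # prints: 3 2
```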
Minimax Strategy Max’s goal: get to 1 Min’s goal: get to -1 Max’s strategy –Choose moves that will lead to a win, even though min is trying to block Minimax value of a node (backed up value): –If N is terminal, use the utility value –If N is a Max move, take max of successors –If N is a Min move, take min of successors
Minimax Values: 2-Ply Example
Minimax Algorithm Depth-first search to bottom of tree As search “unwinds”, compute backed up values Backed-up value of root determines which step to take. Assumes: –Both players are playing this strategy (optimally) –Tree is small enough to search completely
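A small sketch of how the backed-up values pick the step to take at the root. It assumes a nested-list tree with utility values at the leaves and Max to move at the root; `minimax_decision` is a name chosen here, and the tree is hypothetical.

```python
def minimax_value(node, is_max):
    """Depth-first search; values are combined as the recursion unwinds."""
    if isinstance(node, (int, float)):
        return node
    values = [minimax_value(child, not is_max) for child in node]
    return max(values) if is_max else min(values)

def minimax_decision(root):
    """Return (index of Max's best move, backed-up value of the root)."""
    values = [minimax_value(child, False) for child in root]  # Min replies next
    best = max(range(len(values)), key=values.__getitem__)
    return best, values[best]

tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
print(minimax_decision(tree))  # prints (0, 3): take the first move
```

The choice is only as good as the two assumptions above: the opponent plays the same strategy, and the tree is small enough to search completely.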
Alpha-Beta Pruning We don’t really have to look at all subtrees! Recognize when a position can never be chosen in minimax no matter what its children are –Max (3, Min(2,x,y) …) is always ≥ 3 –Min (2, Max(3,x,y) …) is always ≤ 2 –We know this without knowing x and y!
Alpha-Beta Pruning Alpha = the value of the best choice we’ve found so far for MAX (highest) Beta = the value of the best choice we’ve found so far for MIN (lowest) Cut off a Min node as soon as its value drops to Alpha or below (Max already has a choice at least as good) Cut off a Max node as soon as its value rises to Beta or above (Min already has a choice at least as good)
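One common way to write alpha-beta is sketched below, assuming a nested-list tree with utility values at the leaves. The optional `seen` list is an addition for illustration that records which leaves were actually evaluated; the tree values are hypothetical.

```python
import math

def alphabeta(node, is_max, alpha=-math.inf, beta=math.inf, seen=None):
    """Minimax value of node, skipping subtrees that cannot affect the result."""
    if isinstance(node, (int, float)):
        if seen is not None:
            seen.append(node)
        return node
    if is_max:
        value = -math.inf
        for child in node:
            value = max(value, alphabeta(child, False, alpha, beta, seen))
            if value >= beta:        # Min above already has something better
                break
            alpha = max(alpha, value)
        return value
    value = math.inf
    for child in node:
        value = min(value, alphabeta(child, True, alpha, beta, seen))
        if value <= alpha:           # Max above already has something better
            break
        beta = min(beta, value)
    return value

tree = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
seen = []
print(alphabeta(tree, True, seen=seen))  # prints 3, same as plain minimax
print(seen)                              # prints [3, 12, 8, 2, 14, 5, 2]
```

In the middle subtree the leaves 4 and 6 are never evaluated: once Min can reach 2 there, Max (who already has 3) will not choose that subtree.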
Alpha-Beta Example [Figure: alpha-beta example tree; leaves marked x are not evaluated]
Notes on Alpha-Beta Pruning Effectiveness depends on order of successors (middle vs. last node of 2-ply example) If we can evaluate best successor first, search is O(b^(d/2)) instead of O(b^d) This means that in the same amount of time, alpha-beta search can search twice as deep!
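The effect of successor order can be seen by counting evaluated leaves for the same tree under different orderings. This is a sketch with a hypothetical 2-ply tree in nested-list form; `count_leaves` is a helper invented for the comparison.

```python
import math

def count_leaves(tree):
    """Run alpha-beta from a Max root; return (root value, leaves examined)."""
    examined = 0

    def search(node, is_max, alpha, beta):
        nonlocal examined
        if isinstance(node, (int, float)):
            examined += 1
            return node
        value = -math.inf if is_max else math.inf
        for child in node:
            child_value = search(child, not is_max, alpha, beta)
            if is_max:
                value = max(value, child_value)
                alpha = max(alpha, value)
            else:
                value = min(value, child_value)
                beta = min(beta, value)
            if alpha >= beta:       # remaining successors cannot matter
                break
        return value

    return search(tree, True, -math.inf, math.inf), examined

best_first = [[3, 12, 8], [2, 4, 6], [2, 5, 14]]   # good moves tried early
mixed = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]        # last subtree badly ordered
worst_first = [[2, 4, 6], [14, 5, 2], [3, 12, 8]]  # best subtree tried last

for name, tree in [("best", best_first), ("mixed", mixed), ("worst", worst_first)]:
    print(name, count_leaves(tree))
# prints: best (3, 5) / mixed (3, 7) / worst (3, 9)
```

The root value is 3 in every case; only the amount of work changes, from 5 leaves with good ordering to all 9 with the worst ordering.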
Optimizing Minimax Search Use alpha-beta cutoffs –Evaluate most promising moves first Remember prior positions, reuse their backed-up values –Transposition table (like closed list in A*) Avoid generating equivalent states (e.g. 4 different first corner moves in tic tac toe) But, we still can’t search a game like chess to the end!
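A transposition table can be as simple as a dictionary from positions to backed-up values. The sketch below applies it to full tic-tac-toe search; the 9-character string encoding (" " for a blank square) and the function names are assumptions for illustration. It does not merge symmetric boards, which would shrink the table further.

```python
WIN_LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8), (0, 3, 6),
             (1, 4, 7), (2, 5, 8), (0, 4, 8), (2, 4, 6)]

def winner(board):
    """Return "X" or "O" if that player has three in a row, else None."""
    for a, b, c in WIN_LINES:
        if board[a] != " " and board[a] == board[b] == board[c]:
            return board[a]
    return None

def minimax(board, table):
    """Backed-up value of board (X is Max, O is Min), reusing stored values."""
    if board in table:                   # position reached before by another move order
        return table[board]
    w = winner(board)
    if w is not None:
        value = 1 if w == "X" else -1
    elif " " not in board:
        value = 0                        # full board, nobody won: draw
    else:
        player = "X" if board.count("X") == board.count("O") else "O"
        children = [minimax(board[:i] + player + board[i + 1:], table)
                    for i, cell in enumerate(board) if cell == " "]
        value = max(children) if player == "X" else min(children)
    table[board] = value
    return value

table = {}
root_value = minimax(" " * 9, table)
print(root_value)    # prints 0: tic-tac-toe is a draw under optimal play
print(len(table))    # number of distinct positions stored (a few thousand)
```

Each position is searched once no matter how many move sequences lead to it, which is the same role the closed list plays in A*.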
When you can’t search to the end Replace terminal test (end of game) by cutoff test (don’t search deeper) Replace utility function (win/lose/draw) by heuristic evaluation function that estimates results on the best path below this board –Like A* search, good evaluation functions mean good results (and vice versa) Replace move generator by plausible move generator (don’t consider “dumb” moves)
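The first two replacements (cutoff test and evaluation function) can be sketched on a toy game. Everything here is an assumption for illustration: the game is a Nim variant (take 1 to 3 stones, whoever takes the last stone wins), and `evaluate` uses a made-up rule of thumb that facing a multiple of 4 is bad for the player to move.

```python
def successors(stones):
    """Move generator: take 1, 2, or 3 stones."""
    return [stones - take for take in (1, 2, 3) if take <= stones]

def utility(is_max_to_move):
    """Exact result when no stones remain, from Max's point of view.
    The previous player took the last stone, so the player to move has lost."""
    return -1.0 if is_max_to_move else 1.0

def evaluate(stones, is_max_to_move):
    """Heuristic estimate strictly between the loss and win utilities."""
    good_for_mover = stones % 4 != 0
    good_for_max = good_for_mover == is_max_to_move
    return 0.5 if good_for_max else -0.5

def h_minimax(stones, is_max, depth, cutoff_depth):
    """Minimax with a cutoff test in place of searching to the end."""
    if stones == 0:                     # true end of game: exact utility
        return utility(is_max)
    if depth >= cutoff_depth:           # cutoff test: stop and estimate
        return evaluate(stones, is_max)
    values = [h_minimax(s, not is_max, depth + 1, cutoff_depth)
              for s in successors(stones)]
    return max(values) if is_max else min(values)

print(h_minimax(10, True, 0, 2))  # prints 0.5: an estimate, search was cut off
print(h_minimax(2, True, 0, 2))   # prints 1.0: the end of the game was reached
```

Keeping the estimates inside the range of the true utilities means a proven win still outranks any position that merely looks good.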
Good evaluation functions… Order terminal states in the same order as the utility function Don’t take too long (we want to search as deep as possible in limited time) Should be as accurate as possible (estimate chances of winning from that position…) –Human knowledge (e.g. material value) –Known solution (e.g. endgame) –Pre-searched examples (take features, average value of endgame of all games with that feature)
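As an example of encoding human knowledge, a material-value evaluation for chess might look like the sketch below. It assumes the conventional piece values (pawn 1, knight 3, bishop 3, rook 5, queen 9) and takes the piece-placement field of a FEN string, where uppercase letters are White pieces; the function name `material` is chosen for this example.

```python
PIECE_VALUES = {"p": 1, "n": 3, "b": 3, "r": 5, "q": 9}  # kings are not counted

def material(placement):
    """White material minus Black material for a FEN piece-placement string."""
    score = 0
    for ch in placement:
        value = PIECE_VALUES.get(ch.lower())
        if value is not None:           # skips digits, "/" separators, and kings
            score += value if ch.isupper() else -value
    return score

start = "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR"
no_black_queen = "rnb1kbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR"
print(material(start))           # prints 0
print(material(no_black_queen))  # prints 9
```

A function like this is cheap to compute, which matters because it runs at every cutoff position; accuracy is what the other knowledge sources on the slide would add.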
How Deep to Search? Until time runs out (the original application of Iterative Deepening!) Until values don’t seem to change (quiescence) Deep enough to avoid horizon effect (delaying moves that push an inevitable outcome beyond the depth of the search) Singular extensions - search best (apparent) paths deeper than others –Tends to limit horizon effect, since these are the moves that will exhibit it
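Searching "until time runs out" is usually done by iterative deepening: search to depth 1, then 2, and so on, keeping the best move from the last completed depth. The sketch below uses a toy Nim variant (take 1 to 3 stones, last stone wins) as an assumed example game. For simplicity it checks the clock only between iterations, and `max_depth` is an added safety cap.

```python
import time

def successors(stones):
    """Positions reachable by taking 1, 2, or 3 stones."""
    return [stones - take for take in (1, 2, 3) if take <= stones]

def value(stones, is_max, depth_left):
    """Depth-limited minimax value from Max's point of view."""
    if stones == 0:
        return -1 if is_max else 1      # the player to move has already lost
    if depth_left == 0:
        return 0                        # cutoff with no knowledge: call it even
    values = [value(s, not is_max, depth_left - 1) for s in successors(stones)]
    return max(values) if is_max else min(values)

def iterative_deepening(stones, time_limit, max_depth):
    """Return (stones Max should leave, deepest completed search depth)."""
    deadline = time.monotonic() + time_limit
    best_move = successors(stones)[0]   # fallback if no iteration completes
    depth_reached = 0
    for depth in range(1, max_depth + 1):
        if time.monotonic() >= deadline:
            break
        scored = [(value(s, False, depth - 1), s) for s in successors(stones)]
        best_move = max(scored)[1]
        depth_reached = depth
    return best_move, depth_reached

print(iterative_deepening(10, 5.0, 10))  # prints (8, 10): leave 8 stones
```

A move is always available when the clock runs out, and the shallow iterations cost little compared with the deepest one.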