Game Playing "Abstract" games are of interest to AI (artificial intelligence) because: – Game states are accessible and easy to represent – Games are usually restricted to a small number of well-defined actions – Successful programs which play complex games are evidence of machine intelligence Game playing goes beyond the search technique A* because of opponent behavior, which is unpredictable.
Games vs. Search Problems ● “Unpredictable” opponent – Solution is a contingency plan ● Time limits – Unlikely to find goal, must approximate ● Plan of attack: – Algorithm for perfect play: Minimax – Finite horizon and evaluation functions – “Pruning” to reduce costs
Types of Games ● Deterministic, perfect information: chess, checkers, go, othello ● Chance, perfect information: backgammon, monopoly ● Deterministic, imperfect information: battleship ● Chance, imperfect information: bridge, poker, scrabble, nuclear war
Two-Person Zero-Sum Games ● A zero-sum game is one in which a gain by one player (MAX) is equivalent to a loss by the other (MIN), leading to a sum of zero overall advantage ● A two-person game defines a state space that has the game's starting configuration as its root ● Final states in the state space signify either a win for MAX, or a win for MIN, or a draw ● MAX's goal: maximize the value of the final state. ● MIN's goal: minimize the value of the final state
Formal Parts of a Game ● Initial state, including whose move it is (either MAX or MIN) ● Operators, defining the legal moves ● Terminal test, determining when the game is over ● Utility function, giving a numeric value to the outcome of a game: – A higher utility is a win for MAX; a lower utility is a win for MIN. ● These parts define a game tree
Partial Game Tree for Tic-Tac-Toe [Figure: the levels of the tree alternate MAX (X), MIN (O), MAX (X), MIN (O); each terminal state is labeled with its utility: win for MIN, draw, or win for MAX.]
One Version of the Game of Nim ● Start with one pile of N objects, say coins ● A move consists of dividing any pile into two unequal-size piles ● The first player who cannot move loses
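The move rule above can be sketched as a successor function. This is a minimal illustration, not part of the original slides: the name `successors` and the representation of a state as a tuple of pile sizes in descending order are assumptions made for the example.

```python
def successors(state):
    """Return every state reachable in one move.

    Assumed representation: a state is a tuple of pile sizes in
    descending order, e.g. (4, 2, 1). A move splits one pile into
    two non-empty piles of unequal size.
    """
    result = set()
    for i, pile in enumerate(state):
        rest = state[:i] + state[i + 1:]
        # small runs over 1 .. (pile - 1) // 2, so small < pile - small.
        for small in range(1, (pile - 1) // 2 + 1):
            new_state = rest + (pile - small, small)
            result.add(tuple(sorted(new_state, reverse=True)))
    return sorted(result, reverse=True)


print(successors((7,)))          # [(6, 1), (5, 2), (4, 3)]
print(successors((4, 2, 1)))     # [(3, 2, 1, 1)]
print(successors((2, 2, 2, 1)))  # [] : the player to move loses
```

Piles of size 1 or 2 produce no moves, because 2 can only be split into two equal piles.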
Partial Game Tree for Nim with N=7 [Figure: root 7; second level 5-2, 4-3, 6-1; third level 4-2-1, 3-3-1, 3-2-2, 4-2-1, 5-1-1.] Note that state 4-2-1 is repeated. We can simplify the structure by drawing a general graph.
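Complete State Space for Nim (7) [Figure: state graph by level, with the player to move at each level. MIN: 7. MAX: 5-2, 4-3, 6-1. MIN: 4-2-1, 5-1-1, 3-2-2, 3-3-1. MAX: 3-2-1-1, 2-2-2-1, 4-1-1-1. MIN: 3-1-1-1-1, 2-2-1-1-1. MAX: 2-1-1-1-1-1. Terminal states are labeled as a win for MAX or a win for MIN.]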
A Forced Win for MAX (Bold Lines) [Figure: the complete state graph for Nim (7), with the moves that force a win for MAX drawn in bold.] If MIN goes first, and MAX plays intelligently, a win can be guaranteed for MAX
Game Tree Terminology ● Each level in a game search tree is called a "ply". "2-ply" corresponds to a player's move and the opponent's response ● Here is a trivial 2-ply tree:
Interpreting the Game Tree ● MAX is to move first ● MAX can choose among 3 actions $A_1$, $A_2$ and $A_3$ ● MIN can respond to move $A_i$ with $A_{i1}$, $A_{i2}$, or $A_{i3}$ ● There are 9 terminal states, whose utility values for MAX are computed and shown below the state ● On the basis of the terminal utilities, the utilities of nonterminal states are "backed up" the tree to the root, indicating that MAX should choose $A_1$ ● How?
The Minimax Procedure for Simple Games 1 Generate entire game tree 2 Apply utility function to each terminal state 3 Determine utility of states at previous ply by asking, "If MIN had these choices at this ply, what would MIN choose?" (Answer: the minimum utility state) 4 At ply previous to THAT, determine utility by taking the maximum of the minimums taken by MIN 5 Continue in this way up the tree until root is reached
Exhaustive Minimax for Nim [Figure: the complete state graph for Nim (7), levels alternating MIN and MAX from the root, with every state annotated by its backed-up value: 1 for a win for MAX, 0 for a win for MIN. The terminal states carry the terminal utility values for MAX.]
A Forced Win for MAX (Bold Lines) [Figure: the state graph for Nim (7) annotated with minimax values, with the moves that force a win for MAX drawn in bold.] If MIN goes first, and MAX plays intelligently, a win can be guaranteed for MAX
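The forced-win claim can be checked by running exhaustive minimax over the Nim (7) state space. The following is a sketch under stated assumptions: states are tuples of pile sizes in descending order, utilities are 1 (win for MAX) and 0 (win for MIN), and the names `successors` and `minimax_value` are chosen for this example. Memoization with `lru_cache` plays the role of the general graph, so a repeated state such as 4-2-1 is evaluated only once per player to move.

```python
from functools import lru_cache


def successors(state):
    """States reachable by splitting one pile into two unequal piles."""
    result = set()
    for i, pile in enumerate(state):
        for small in range(1, (pile - 1) // 2 + 1):
            piles = state[:i] + state[i + 1:] + (pile - small, small)
            result.add(tuple(sorted(piles, reverse=True)))
    return result


@lru_cache(maxsize=None)
def minimax_value(state, max_to_move):
    """Return 1 if MAX can force a win from `state`, otherwise 0."""
    nexts = successors(state)
    if not nexts:
        # Terminal state: the player who cannot move loses.
        return 0 if max_to_move else 1
    values = [minimax_value(s, not max_to_move) for s in nexts]
    return max(values) if max_to_move else min(values)


print(minimax_value((7,), False))  # MIN moves first: 1, a win for MAX
print(minimax_value((7,), True))   # MAX moves first: 0, a win for MIN
```

With seven coins the second player wins under best play, which is why the guarantee for MAX depends on MIN going first.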
Implementing Minimax (Pseudocode)
    Move minimaxDecision(State s)
        for each Move m from s do
            value[m] = minimaxValue(nextState(s, m))
        return the m with the highest value[m]
Suppose:
    Integer utility(State s);          // returns a state's utility value
    State nextState(State s, Move m);  // returns the new state resulting from applying m to s
    Successors expand(State s);        // returns all of the possible next states
Implementing Minimax (cont'd)
    Integer minimaxValue(State s)
        if s is a terminal state then
            return utility(s)
        else
            successors = expand(s)
            if it is MAX's turn to move then
                return the highest minimaxValue of successors
            else
                return the lowest minimaxValue of successors
Trace of Minimax on Example Tree
    > minimaxDecision(R)
        > minimaxValue(X)
            > minimaxValue(A) => returns 3
            > minimaxValue(B) => returns 12
            > minimaxValue(C) => returns 8
          returns 3 [minimum value of 3, 12, and 8]
        > minimaxValue(Y)
            > minimaxValue(D) => returns 2
            > minimaxValue(E) => returns 4
            > minimaxValue(F) => returns 6
          returns 2 [minimum value of 2, 4, and 6]
        > minimaxValue(Z)
            > minimaxValue(G) => returns 14
            > minimaxValue(H) => returns 5
            > minimaxValue(I) => returns 2
          returns 2 [minimum value of 14, 5, and 2]
      returns $A_1$ [the move that produces the state with the highest value in 3, 2, and 2]
Efficiency of Minimax The branching factor $b$ of a game is the average number of possible moves from a state (3 in the example). The number of calls to minimaxValue depends upon $b$ and the depth $N$ of the game tree. In the example, $3^2 + 3 = 12$. The lower-order terms can be ignored, so in general the number of calls to minimaxValue is $O(b^N)$.
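The call count can be checked numerically. The sketch below assumes a uniform tree in which every nonterminal state has exactly b successors; the function name `minimax_value_calls` is made up for the example.

```python
def minimax_value_calls(b, depth):
    """Calls to minimaxValue on a uniform tree: b + b^2 + ... + b^depth."""
    return sum(b ** k for k in range(1, depth + 1))


print(minimax_value_calls(3, 2))  # 12, matching the example tree

# The deepest level dominates: the ratio to b^depth stays close to 1.
for depth in (2, 4, 8):
    calls = minimax_value_calls(35, depth)
    print(depth, calls, calls / 35 ** depth)
```

For b = 35 the total stays within about 3% of the last term alone, which is why only that term is kept in the big-O bound.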
Efficiency Comparisons [Figure: growth curves plotted against N for three costs: $O(N)$, array implementation of priority queue; $O(\log N)$, binary heap implementation of priority queue; $O(b^N)$, minimaxValue time.]
Comparison of Big-O Rates of Growth
$N$ | $\log_2 N$ | $N^2$ | $2^N$
1 | 0 | 1 | 2
2 | 1 | 4 | 4
4 | 2 | 16 | 16
8 | 3 | 64 | 256
16 | 4 | 256 | 65,536
32 | 5 | 1024 | 4,294,967,296
64 | 6 | 4096 | 5 years' worth of instructions on a supercomputer
128 | 7 | 16,384 | 600,000 times greater than age of universe in nanosecs
Properties of Minimax ● Complete? That is, will it find a move given enough time? – Yes, if tree is finite ● Optimal? That is, if there is a forced win, will it find it? – Yes, provided opponent is trying to win ● Time complexity: $O(b^m)$ ● Space complexity: $O(bm)$ (depth-first search) ● For chess, $b \approx 35$, $m$ [...] for reasonable games
When the Game Tree Cannot Be Exhaustively Searched ● Alter minimax in two ways: – Replace the terminal test with a cutoff test, so only a subtree of the entire tree is searched. – Replace utility function with an evaluation function eval and apply it to the leaves of the subtree ● Usually the cutoff is a fixed ply depth N determined by the available resources of time and memory ● This strategy is called N-move lookahead
Minimax Modified
    Integer minimaxValue(State s)
        if s does not survive cutoff test then
            return eval(s)
        else
            successors = expand(s)
            if it is MAX's turn to move then
                return the highest minimaxValue of successors
            else
                return the lowest minimaxValue of successors
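A runnable sketch of the modified procedure follows. The cutoff test used here (depth budget exhausted, or no successors) is one simple choice. The toy game, in which nested lists are positions and integers are final outcomes, and the averaging heuristic in `evaluate` are invented for illustration only.

```python
def minimax_value(state, depth, max_to_move, expand, evaluate):
    """Depth-limited minimax value of `state`."""
    children = expand(state)
    if depth == 0 or not children:  # state does not survive the cutoff test
        return evaluate(state)
    values = [
        minimax_value(child, depth - 1, not max_to_move, expand, evaluate)
        for child in children
    ]
    return max(values) if max_to_move else min(values)


def expand(state):
    """Successors in the toy game: the elements of a nested list."""
    return state if isinstance(state, list) else []


def leaves(state):
    if isinstance(state, int):
        return [state]
    return [leaf for child in state for leaf in leaves(child)]


def evaluate(state):
    """Made-up heuristic: the average of the outcomes below this position."""
    values = leaves(state)
    return sum(values) / len(values)


TREE = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
print(minimax_value(TREE, 2, True, expand, evaluate))  # full depth: 3.0
print(minimax_value(TREE, 1, True, expand, evaluate))  # 1-ply lookahead: about 7.67
```

With the shallower cutoff the returned value is only an estimate, so its quality depends entirely on the evaluation function.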
Evaluation Functions Somewhat like the 8-puzzle heuristic, an evaluation function estimates the utility of the game from a given (nonterminal) position. Chess example: Add up the "material values" of pieces: pawn 1, bishop 3, knight 3, rook 5, queen 9. Other features such as "pawn structure" or "king safety" can be given values
Chess Board Evaluations [Figure: a board position in which White has the better pawn structure.]
Weighted Linear Functions As Evaluation Functions Suppose: – there are $n$ features to be included in the evaluation – $f_1, f_2, \ldots, f_n$ are the number of pieces with each feature – $w_1, w_2, \ldots, w_n$ are the weights associated with each feature Then the evaluation function can be computed by: $w_1 f_1 + w_2 f_2 + \cdots + w_n f_n$
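As an illustration, the weighted sum can be computed for chess material using the piece values given earlier. One assumption goes beyond the slides: each feature $f_i$ is taken to be White's count of a piece type minus Black's count, so that a positive score favors White. The piece counts are invented for the example.

```python
# Material values from the chess example.
WEIGHTS = {"pawn": 1, "bishop": 3, "knight": 3, "rook": 5, "queen": 9}


def evaluate(white, black, weights=WEIGHTS):
    """Return w_1*f_1 + ... + w_n*f_n from White's point of view.

    Assumption: f_i is White's count of piece i minus Black's count.
    """
    return sum(
        w * (white.get(piece, 0) - black.get(piece, 0))
        for piece, w in weights.items()
    )


white = {"pawn": 8, "bishop": 2, "knight": 2, "rook": 2, "queen": 1}
black = {"pawn": 8, "bishop": 2, "knight": 1, "rook": 2, "queen": 1}
print(evaluate(white, black))  # 3: White is ahead by a knight
```

Features such as pawn structure or king safety would enter the same sum as further weighted terms.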
Problems with Search Cutoff 1 Arbitrary depth limit may fail to recognize an impending disaster: Suppose the lookahead stops at this point. White is ahead by a knight and thus has material advantage. But the eval function will not take into account that white's queen is about to be lost without compensation.
Problems with Search Cutoff (cont'd) 2 "Horizon problem": cutting off the search may fail to foresee a significant event that is inevitable: Black is slightly ahead in material, but when white advances pawn to the eighth row it becomes a queen. Black can be fooled into thinking the queening move can be avoided by checking white with the rook. If the lookahead is not far enough, the queening move will be pushed "over the horizon" of what can be predicted.
The Need for Game Tree Pruning ● An ordinary computer can search about 1000 chess states per second ● Tournament chess allows 150 seconds per move, so 150,000 states can be searched ● The branching factor $b$ of chess is about 35 Q: How many ply $p$ can the computer look ahead? A: $35^p = 150{,}000$, so $3 < p < 4$. Thus the computer can do a lookahead of 3 or 4 ply ● 4-ply: human novice ● 8-ply: human master, typical PC ● 12-ply: Kasparov, Deep Blue
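The lookahead estimate follows from solving $35^p = 150{,}000$ for $p$ with logarithms. A short check, using only the figures quoted above (the function name is made up for the example):

```python
import math


def lookahead_depth(states_per_second, seconds_per_move, branching_factor):
    """Return the state budget and the depth p with b^p equal to that budget."""
    budget = states_per_second * seconds_per_move
    return budget, math.log(budget) / math.log(branching_factor)


budget, p = lookahead_depth(1000, 150, 35)
print(budget)            # 150000
print(round(p, 2))       # 3.35
print(35 ** 3, 35 ** 4)  # 42875 1500625, which bracket the budget
```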
Game Tree Pruning (cont'd) [Figure: a game tree with a depth cutoff line; the part of the tree below the cutoff is not examined.]
Game Tree Pruning (cont'd) Suppose you can determine that these states will never be reached
Game Tree Pruning (cont'd) Then all of their descendants can be ignored
Game Tree Pruning (cont'd) So the depth cutoff can be increased [Figure: the cutoff line moves from its original depth to a deeper one, so that additional levels of nodes can be examined.]
Pruning This technique recognizes when a game tree state can NEVER BE REACHED IN ACTUAL PLAY. [Figure: root R with successor states A, B, and C; a marker shows how far the lookahead into B's subtree has gone.] After looking ahead to here, MAX knows that the utility for MIN of state B will be 2 or less. Since this cannot beat the utility already found for state A, MAX knows that B will not be chosen. So the rest of the subtree can be ignored (pruned).
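Pruning Example (cont'd) [Figure: a MAX root over three MIN nodes. The first MIN node has leaves 3, 12, and 8, so its value is 3 and the root's value is at least 3. The second MIN node's first leaf is 2, so its value is at most 2; its two remaining leaves, marked X, do not need to be analyzed.]
Pruning Example (cont'd) [Figure: the third MIN node's first leaf is 14, so its value is at most 14.]
Pruning Example (cont'd) [Figure: the third MIN node's second leaf is 5, so its value is at most 5.]
Pruning Example (cont'd) [Figure: the third MIN node's last leaf is 2, so its value is 2 and the root's value is 3.]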
General Principle If at node n a player has already noticed that a better choice existed at node m at some point further up the tree, then n will never be reached
Implementation of α-β Search ● Similar to minimaxValue, except that two mutually recursive functions are used: – maxValue is called when it is MAX's turn – minValue is called when it is MIN's turn ● Since a depth-first search of the subtree is done, it is easy to pass along: – the best score for MAX so far along the current path (α) – the best score for MIN so far along the current path (β)
Implementation (cont'd) ● α is initialized to -∞ and only increases ● β is initialized to ∞ and only decreases ● If β ever becomes less than or equal to α, the search (from the current node) is abandoned
Minimax Modified for Pruning
    Move minimaxDecision(State s)
        global Integer α = -∞
        global Integer β = ∞
        for each Move m from s do
            value[m] = minimaxValue(nextState(s, m), α, β)
        return the m with the highest value[m]

    Integer minimaxValue(State s, Integer α, Integer β)
        if it is MAX's turn to move then
            return maxValue(s, α, β)
        else
            return minValue(s, α, β)
Implementation (cont'd)
    Integer maxValue(State s, Integer α, Integer β)
        if s does not survive cutoff test then return eval(s)
        for each successor in expand(s) do
            α = Maximum(α, minValue(successor, α, β))
            if β <= α then return β    // MIN already has a better option elsewhere
        return α

    Integer minValue(State s, Integer α, Integer β)
        if s does not survive cutoff test then return eval(s)
        for each successor in expand(s) do
            β = Minimum(β, maxValue(successor, α, β))
            if β <= α then return α    // MAX already has a better option elsewhere
        return β
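The two mutually recursive functions can be written as runnable code and tried on the example tree. This is a sketch under assumptions: the tree is encoded as nested lists with integer leaves, the cutoff test is simply "the node is a leaf", and the `visited` list is an addition for this example that records which leaves are actually evaluated.

```python
import math


def max_value(node, alpha, beta, visited):
    """Value of a node where MAX is to move."""
    if isinstance(node, int):  # leaf: stands in for the cutoff test and eval
        visited.append(node)
        return node
    for child in node:
        alpha = max(alpha, min_value(child, alpha, beta, visited))
        if beta <= alpha:
            return beta  # cutoff: MIN will never allow this node
    return alpha


def min_value(node, alpha, beta, visited):
    """Value of a node where MIN is to move."""
    if isinstance(node, int):
        visited.append(node)
        return node
    for child in node:
        beta = min(beta, max_value(child, alpha, beta, visited))
        if beta <= alpha:
            return alpha  # cutoff: MAX will never choose this node
    return beta


TREE = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
visited = []
print(max_value(TREE, -math.inf, math.inf, visited))  # 3
print(visited)  # [3, 12, 8, 2, 14, 5, 2]
```

Leaves 4 and 6 are never evaluated, and the root value of 3 is the same as plain minimax returns for this tree.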
Properties of α-β Search ● Pruning does not affect final result ● Good move ordering improves effectiveness of pruning ● With “perfect ordering” time complexity = $O(b^{m/2})$ – Doubles depth of search – Can reach depth 8 and play good chess
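The effect of move ordering can be seen by counting evaluated leaves for two orderings of the same small tree. The sketch below uses a single recursive function rather than the two-function form, and the nested-list trees and the names `alphabeta` and `leaves_examined` are assumptions made for the example.

```python
import math


def alphabeta(node, alpha, beta, maximizing, leaves):
    """Pruned minimax value; `leaves` collects the leaves actually evaluated."""
    if isinstance(node, int):
        leaves.append(node)
        return node
    for child in node:
        value = alphabeta(child, alpha, beta, not maximizing, leaves)
        if maximizing:
            alpha = max(alpha, value)
        else:
            beta = min(beta, value)
        if beta <= alpha:
            break  # the remaining children cannot change the result
    return alpha if maximizing else beta


def leaves_examined(tree):
    leaves = []
    value = alphabeta(tree, -math.inf, math.inf, True, leaves)
    return value, len(leaves)


as_drawn = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]
best_first = [[3, 8, 12], [2, 4, 6], [2, 5, 14]]  # MIN's best reply listed first
print(leaves_examined(as_drawn))    # (3, 7)
print(leaves_examined(best_first))  # (3, 5)
```

Both orderings return the same value, but listing MIN's strongest reply first lets the search skip two more of the nine leaves.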
History of Chess Programs Chess ratings: 1000: beginning human, 2750: world champion ● 1970: Early winners of ACM North American Computer Chess Championships were rated less than 2000, used: – search – book openings – infallible endgame algorithms
History of Chess Programs (cont'd) ● 1982: Belle became first master-level program (2200) – used special-purpose hardware – searched several million positions per move ● 1987: HiTech was first program to beat a human grandmaster – special-purpose hardware – searched ten million positions per move – used most accurate eval function yet
History of Chess Programs (cont'd) ● 1995: Deep Thought 2 beat Danish Olympic team – used simple eval function – searched 1/2 billion states to 10-ply ● 1997: Deep Blue beats Kasparov 3.5 - 2.5 – 32-node IBM RS/6000 SP high-performance computer – Each node of the SP employs a single microchannel card containing 8 dedicated VLSI chess processors – System is capable of calculating 60 billion moves within three minutes, which is the time allotted to each player's move in classical chess
Ratings of Human and Machine Chess Champions [Figure: chart of the ratings of human and machine chess champions; Deep Blue is marked.]
Other Games ● Checkers: Program Chinook official world champion (as of 1994) ● Othello (Reversi): Programs are far better than humans ● Backgammon: Tesauro's program using neural net learning is ranked among top three players in world ● Go: Branching factor of 360 makes regular search methods impossible. $2M prize to first program to defeat top-level player.