2Two Player Games Competitive rather than cooperative Zero sum game One player loses, one player winsZero sum gameOne player wins what the other one losesSee game theory for the mathematicsGetting an agent to play a gameBoils down to how it plays each moveExpress this as a search problemCannot backtrack once a move has been made (episodic)
3(Our) Basis of Game Playing: Search for best move every time Search for OpponentMove MovesInitial Board State Board State Board State 3Search for OpponentMove MovesBoard State Board State 5
4Lookahead Search If I played this move If I played this move… Then they might play that moveThen I could do that moveAnd they would probably do that moveOr they might play that moveAnd they would play that moveOr I could play that moveAnd they would do that moveIf I played this move…
5Lookahead Search (best moves) If I played this moveThen their best move would beThen my best move would beOr another good move for them is…Etc.
6Minimax Search Like children sharing a cake Underlying assumption Opponent acts rationallyEach player moves in such a way as toMaximise their final winnings, minimise their lossesi.e., play the best move at the timeMethod:Calculate the guaranteed final scores for each moveAssuming the opponent will try to minimise that scoreChoose move that maximises this guaranteed score
7Example Trivial Game Deal four playing cards out, face up Player 1 chooses one, player 2 chooses onePlayer 1 chooses another, player 2 chooses anotherAnd the winner is….Add the cards upThe player with the highest even numberScores that amount (in pounds sterling from opponent)
8For Trivial Games Draw the entire search space Put the scores associated with each final board state at the ends of the pathsMove the scores from the ends of the paths to the starts of the pathsWhenever there is a choice use minimax assumptionThis guarantees the scores you can getChoose the path with the best score at the topTake the first move on this path as the next move
13For Real Games Search space is too large So we cannot draw (search) the entire spaceFor example: chess has branching factor of ~35Suppose our agent searches 1000 board states per secondAnd has a time limit of 150 secondsSo can search 150,000 positions per moveThis is only three or four ply look aheadBecause 353 = 42,875 and 354 = 1,500,625Average humans can look ahead six-eight ply
14Cutoff Search Must use a heuristic search Use an evaluation function Estimate the guaranteed score from a board stateDraw search space to a certain depthDepth chosen to limit the time takenPut the estimated values at the end of pathsPropagate them to the top as beforeQuestion:Is this a uniform path cost, greedy or A* search?
15Evaluation Functions Must be able to differentiate between Good and bad board statesExact values not importantIdeally, the function would return the true scoreFor goal statesExample in chessWeighted linear functionWeights:Pawn=1, knight=bishop=3, rook=5, queen=9
17Evaluation Function for our Game Evaluation after the first moveCount zero if it’s odd, take the number if its evenEvaluation function here would choose 10But this would be disastrous for the player
18Problems with Evaluation Functions Horizon problemAgent cannot see far enough into search spacePotentially disastrous board position after seemingly good onePossible solutionReduce the number of initial moves to look atAllows you to look further into the search spaceNon-quiescent searchExhibits big swings in the evaluation functionE.g., when taking pieces in chessSolution: advance search past non-quiescent part
19Pruning Want to visit as many board states as possible Want to avoid whole branches (prune them)Because they can’t possibly lead to a good scoreExample: having your queen taken in chess(Queen sacrifices often very good tactic, though)Alpha-beta pruningCan be used for entire search or cutoff searchRecognize that a branch cannot produce better scoreThan a node you have already evaluated
20Alpha-Beta Pruning for Player 1 1. Given a node N which can be chosen by player one, then if there is another node, X, along any path, such that (a) X can be chosen by player two (b) X is on a higher level than N and (c) X has been shown to guarantee a worse score for player one than N, then the parent of N can be pruned.2. Given a node N which can be chosen by player two, then if there is a node X along any path such that (a) player one can choose X (b) X is on a higher level than N and (c) X has been shown to guarantee a better score for player one than N, then the parent of N can be pruned.
21Example of Alpha-Beta Pruning player 1Pruneplayer 2Depth first search a good idea hereSee notes for explanation
22Games with Chance Many more interesting games Example: backgammon Have an element of chanceBrought in by throwing a die, tossing a coinExample: backgammonSee Gerry Tesauro’s TD-Gammon programIn these casesWe can no longer calculate guaranteed scoresWe can only calculate expected scoresUsing probability to guide us
23Expectimax Search Going to draw tree and move values as before Whenever there is a random eventAdd an extra node for each possible outcome which will change the board states possible after the eventE.g., six extra nodes if each roll of die affects stateWork out all possible board states from chance nodeWhen moving score values up through a chance nodeMultiply the value by the probability of the event happeningAdd together all the multiplicandsGives you expected value coming through the chance node
24More interesting (but still trivial) game Deal four cards face upPlayer 1 chooses a cardPlayer 2 throws a dieIf it’s a six, player 2 chooses a card, swaps it with player 1’s and keeps player 1’s cardIf it’s not a six, player 2 just chooses a cardPlayer 1 chooses next cardPlayer 2 takes the last card
27Games Played by Computer Games played perfectly:Connect four, noughts & crosses (tic-tac-toe)Best move pre-calculated for each board stateSmall number of possible board statesGames played well:Chess, draughts (checkers), backgammonScrabble, tetris (using ANNs)Games played badly:Go, bridge, soccer
28Philosophical Questions Q1. Is how computers plays chessMore fundamental than how people play chess?In science, simple & effective techniques are valuedMinimax cutoff search is simple and effectiveBut this is seen by some as stupid and “non-AI”Drew McDermott:"Saying Deep Blue doesn't really think about chess is like saying an airplane doesn't really fly because it doesn't flap its wings”Q2. If aliens came to Earth and challenged us to chess…Would you send Deep Blue or Kasparov into battle?