
1 Adversarial Search 1 (Game Playing)

2 Outline Motivation Optimal decisions Minimax algorithm α-β pruning

3 Motivation Games are a form of multi-agent environment
Any given agent needs to consider the actions of the other agents and how they affect its own success. The unpredictability of other agents can introduce many possible contingencies into the agent's problem-solving process. Multi-agent environments may be cooperative or competitive.

4 Background on Multi-agent Environments

5 Environments: Single agent vs. multiagent
The distinction is made as follows. If A is an agent (say, a taxi driver), how should it treat another object B: as an agent, or as a stochastically behaving object? The key distinction is whether B's behavior is best described as maximizing a performance measure whose value depends on agent A's behavior. Examples?

6 Environments: Single agent vs. multiagent
Example 1: Chess. The opponent B is trying to maximize its performance measure, which in turn minimizes agent A's performance measure. Chess is therefore a competitive multiagent environment.

7 Environments: Single agent vs. multiagent
Example 2: The taxi-driving environment is a partially cooperative multiagent environment: avoiding collisions maximizes the performance measure of all agents. It is also partially competitive, because only one car can occupy a given parking space.

8 Game theory Mathematical game theory, a branch of economics, views any multiagent environment as a game provided that the impact of each agent on the others is "significant", regardless of whether the agents are competitive or cooperative. (Game theory is covered in Ch. 17.) Note: environments with a very large number of agents are often viewed as economies rather than games.

9 Games in AI In AI the most common games are turn-taking, two-player games that are:
Zero-sum: one player's loss is another's gain. Perfect information: each player knows the entire game state (no information is hidden from either player). Deterministic: no element of chance. What this means: deterministic, fully observable environments in which two agents act alternately and in which the utility values at the end of the game are always equal and opposite.

10 What is adversarial? It is the opposition between the agents' utility functions that makes the situation adversarial. Ex: if one player wins a game of chess (+1), the other player necessarily loses (-1).

11 Motivation Contd … Games are a form of multi-agent environment
What do other agents do, and how do they affect our success? Competitive multi-agent environments in which the agents' goals are in conflict give rise to adversarial search problems, known as games. Why study games? Games are fun! They have played a historical role in AI, and studying games teaches us how to deal with other agents trying to foil our plans.

12 Motivation Contd … Huge state spaces, but:
The state of a game is easy to represent, and agents are restricted to a small number of actions. Outcomes are defined by precise rules: a clear set of legal moves and well-defined outcomes (e.g. win, lose, draw). Games are a nice, clean environment with clear criteria for success.

13 Motivation Contd … Physical games
Physical games like tennis, croquet, ice hockey, etc. have complicated descriptions, a larger range of possible actions, and imprecise rules defining legal actions. With the exception of robot soccer, physical games have not attracted much interest from the AI community.

14 Motivation Contd … Game playing was one of the first tasks undertaken in AI: chess, checkers, Othello, backgammon. Unlike toy problems, games are interesting precisely because they are too hard to solve exactly. Ex: chess has an average branching factor of about 35, and if there are 50 moves by each player, the search tree has about 35^100 (roughly 10^154) nodes.

15 Motivation Contd … Too hard to solve
Games, like the real world, require the ability to make some decision even when calculating the optimal decision is infeasible. Games also penalize inefficiency severely: an inefficient implementation costs more and takes more time, so game-playing research has come up with ideas on how to make the best possible use of time.

16 Drawing game trees Even a game as simple as tic-tac-toe is too complex for us to draw its entire game tree; a partial tree is shown next.

17 Game tree (2-player, deterministic, turns)

18 More complicated games
Most card games (e.g. Hearts, Bridge) and Scrabble are non-deterministic and lack perfect information. Other variations include cooperative games and real-time strategy games, which lack alternating moves (e.g. Warcraft).

19 Types of Games Note: No chance (e.g., using dice) involved

20 Types of Games

21 Optimal decisions in games

22 Game setup – Two player Two players, A and B, known as MAX and MIN
MAX represents the player trying to win, i.e., to MAXimize performance; MIN is the opponent, who attempts to MINimize MAX's score. High values are assumed to be good for MAX and bad for MIN. Assume that MIN uses the same information and always attempts to move to a state that is worst for MAX. MAX moves first, and the players take turns until the game is over; the winner gets a reward and the loser a penalty. Note: we consider zero-sum games in this chapter.

23 Two-Player Games A game formulated as a search problem:
Initial state: ? Actions: ? Terminal state: ? Utility function: ? Transition Model?

24 Game setup – Two player The initial state S0, which specifies how the game is set up at the start. PLAYER(s): defines which player has the move in a state. ACTIONS(s): returns the set of legal moves in a state. RESULT(s, a): the transition model, which defines the result of a move. TERMINAL-TEST(s): a terminal test, which is true when the game is over and false otherwise; states where the game has ended are called terminal states. UTILITY(s, p): a utility function (also called an objective function or payoff function) defines the final numeric value for a game that ends in terminal state s for a player p. A code sketch of this formulation follows.
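The sketch below is one illustrative way to package these six elements in Python for tic-tac-toe. The class and method names (TicTacToe, initial_state, player, actions, result, terminal_test, utility) simply mirror the slide's notation; they are assumptions made for this example, not a fixed API.

from typing import List, Optional, Tuple

class TicTacToe:
    """3x3 tic-tac-toe expressed through the six elements above."""

    def initial_state(self) -> Tuple[str, ...]:
        return tuple("." * 9)                         # S0: empty board, X (MAX) to move

    def player(self, s) -> str:
        # PLAYER(s): X moves whenever both sides have placed equally many marks
        return "X" if s.count("X") == s.count("O") else "O"

    def actions(self, s) -> List[int]:
        # ACTIONS(s): indices of the empty squares
        return [i for i, c in enumerate(s) if c == "."]

    def result(self, s, a: int):
        # RESULT(s, a): transition model - place the mover's mark on square a
        return s[:a] + (self.player(s),) + s[a + 1:]

    def terminal_test(self, s) -> bool:
        # TERMINAL-TEST(s): true once the game is decided or the board is full
        return self.utility(s, "X") is not None

    def utility(self, s, p: str) -> Optional[float]:
        # UTILITY(s, p): +1 / -1 / 0 from player p's point of view, None if undecided
        lines = [(0, 1, 2), (3, 4, 5), (6, 7, 8), (0, 3, 6),
                 (1, 4, 7), (2, 5, 8), (0, 4, 8), (2, 4, 6)]
        for i, j, k in lines:
            if s[i] != "." and s[i] == s[j] == s[k]:
                return 1.0 if s[i] == p else -1.0
        return 0.0 if "." not in s else None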

25 Game setup – Two player A game can be defined as a search problem:
Initial state: the board position, which also identifies the player to move (e.g. a board configuration of chess). Terminal test: is the game finished? States at which the game has ended are called terminal states. Utility function: gives a numerical value for terminal states, e.g. win (+1), lose (-1) and draw (0) in tic-tac-toe or chess; backgammon ranges from +192 to -192.

26 Game Trees Represent the problem space for a game by a game tree
Nodes represent board positions; edges represent legal moves. The root node is the position in which a decision must be made. An evaluation function f assigns real-number scores to board positions. Terminal nodes represent ways the game could end, labeled with the desirability of that ending (e.g. win/lose/draw or a numerical score).

27 Game Trees VS. Search Trees
For tic-tac-toe the game tree is relatively small: fewer than 9! = 362,880 terminal nodes. But for chess there are over 10^40 nodes, so the game tree is best thought of as a theoretical construct that we cannot realize in the physical world. Search tree: regardless of the size of the game tree, it is MAX's job to search for a good move. We use the term search tree for a tree that is superimposed on the full game tree and examines enough nodes to allow a player to determine what move to make.

28 Game Tree for Tic Tac Toe
MAX has 9 possible moves and places 'x'; MIN places 'o'. They play alternately until they reach a terminal state: a state where one player has three in a row or all the squares are filled. It is MAX's job to use the search tree to determine the best move. Terminal states are assigned a utility value according to the rules of the game.

29 Optimal strategies

30 Optimal strategies In a normal search problem, what is the optimal solution? A sequence of steps leading to a goal state. What about games? MIN also gets to make decisions, so MAX must find a contingent strategy, which specifies MAX's move in the initial state, then MAX's moves in the states resulting from every possible response by MIN, then MAX's moves in the states resulting from every possible response by MIN to those moves, and so on. We will see how to find the optimal strategy (the minimax procedure).

31 Utility function How many evaluation functions do we need for the two players MAX and MIN?
The zero-sum assumption allows us to use a single evaluation function to describe the goodness of a board with respect to both players; one of the players just has to negate the function's return value. Positive numbers indicate positions favorable to MAX, negative numbers positions favorable to MIN. f(n) > 0: position n is good for MAX and bad for MIN. f(n) < 0: position n is bad for MAX and good for MIN. f(n) near 0: position n is neutral. f(n) >> 0: win for MAX. f(n) << 0: win for MIN. (A small code sketch of the negation trick follows.)
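A minimal sketch of the single-evaluation-function idea, assuming a hypothetical f_for_max written from MAX's point of view (the function names are illustrative only):

def f_for_max(position) -> float:
    ...  # game-specific heuristic: positive = good for MAX, negative = good for MIN

def f_for_min(position) -> float:
    # The zero-sum assumption means MIN can simply negate MAX's evaluation.
    return -f_for_max(position)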

32 An example (partial) game tree for Tic-Tac-Toe
f(n) = +1 if the position is a win for X. f(n) = -1 if the position is a win for O. f(n) = 0 if the position is a draw.

33 Generate game tree
(root: the empty board)

34 Generate game tree
(board diagrams: positions after MAX's first moves)

35 Generate game tree
(board diagrams: positions after MIN's replies)

36 Generate game tree
(board diagrams; each level of the tree is one ply, i.e. one move by a single player)

37 Drawing game trees So we adopt the game tree conventions shown next
Game trees are searched level by level, i.e., ply by ply; each move by a player defines a new ply of the game tree. Each level in the game tree is labeled according to the player who moves at that point in the game, MIN or MAX. MAX nodes: nodes at even-numbered depths correspond to positions in which it is MAX's move next. MIN nodes: nodes at odd-numbered depths correspond to positions in which it is MIN's move next.

38 Applying Minimax to Tic Tac Toe

39 Tic-Tac-Toe Initial state: a 3x3 board position containing O's and X's. Actions (operators): placing O's or X's in vacant positions alternately. Terminal test: determines when the game is over. Utility function: f(n) = (number of complete rows, columns or diagonals still open for the player) - (number of complete rows, columns or diagonals still open for the opponent). (The slide's worked board example and its f value are omitted here; a code sketch of the heuristic follows.)
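The sketch below is one way to compute this open-lines heuristic in Python; the board encoding (a 9-character string of 'X', 'O' and '.') is an assumption made for illustration.

LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8), (0, 3, 6),
         (1, 4, 7), (2, 5, 8), (0, 4, 8), (2, 4, 6)]

def open_lines(board: str, opponent: str) -> int:
    # A line is still open for a player if it contains none of the opponent's marks.
    return sum(1 for line in LINES if all(board[i] != opponent for i in line))

def f(board: str) -> int:
    # f(n) = open lines for X (the player) minus open lines for O (the opponent)
    return open_lines(board, opponent="O") - open_lines(board, opponent="X")

# Example: X has just taken the centre square on an otherwise empty board.
print(f("....X...."))   # 8 - 4 = 4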

40 Example : Tic-Tac-Toe MAX marks crosses and MIN marks circles, and it is MAX's turn to play first. With a depth bound of 2, conduct a breadth-first search, using the following evaluation function f(n) of a position n: if n is not a win for either player, f(n) = (no. of complete rows, columns, or diagonals that are still open for MAX) - (no. of complete rows, columns, or diagonals that are still open for MIN); if n is a win for MAX, f(n) = +∞; if n is a win for MIN, f(n) = -∞.

41 Example : Tic-Tac-Toe (2)
First move

42 Example : Tic-Tac-Toe (3)

43 Example : Tic-Tac-Toe (4)

44 Problems

45 Compute Two-ply minimax for tic-tac-toe at the following state

46 Compute Two-ply minimax for tic-tac-toe at the following state

47 Building Minimax Procedure

48 A 2-ply Game tree - Hypothetical
The possible moves for MAX at the root node are labeled A1, A2, and A3; the possible replies to A1 for MIN are A11, A12, A13. Assume the game ends after one move each by MAX and MIN. In game parlance, we say that this tree is one move deep, consisting of two half-moves, each of which is called a ply. Assume the utilities of the terminal states in this game range from 2 to 14. Given a game tree, the optimal strategy can be determined from the minimax value of each node, MINIMAX(s).

49 A 2-ply Game tree - Hypothetical
MAX chooses among A1, A2, A3 at the root (1st ply). MIN replies at the 2nd ply, leading to terminal utilities: under A1, moves A11, A12, A13 give 3, 12, 8; under A2, moves A21, A22, A23 give 2, 4, 6; under A3, moves A31, A32, A33 give 14, 5, 2. Note: an action by one player is called a ply; two plies (an action and a counter-action) are called a move. MAX nodes are drawn as triangles and MIN nodes as inverted triangles.

50 A 2-ply Game tree What is the MAX’s best move at the root?
What is MIN's best reply? Compute the minimax value of each node, label the nodes with their minimax values, and apply the minimax definition.

51 Definition MINIMAX (s):
Given a game tree, the optimal strategy can be determined by using the minimax value of each node, denoted MINIMAX(s). If a state is a MAX node, give it the maximum value among its children; if it is a MIN node, give it the minimum value among its children. The minimax value of a terminal state is just its utility:
MINIMAX(s) =
  UTILITY(s)                                          if TERMINAL-TEST(s)
  max over a in ACTIONS(s) of MINIMAX(RESULT(s, a))   if PLAYER(s) = MAX
  min over a in ACTIONS(s) of MINIMAX(RESULT(s, a))   if PLAYER(s) = MIN

52 A 2-ply Game tree Applying the minimax definition to the hypothetical game tree: What is MAX's best move at the root? A1, as it leads to the successor with the highest minimax value. What is MIN's best reply? A11, because it leads to the successor with the lowest minimax value. (A small worked computation follows.)
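A quick check of this answer in Python, using the terminal utilities from slide 49:

branches = {"A1": [3, 12, 8], "A2": [2, 4, 6], "A3": [14, 5, 2]}
backed_up = {a: min(vals) for a, vals in branches.items()}   # MIN level: A1 -> 3, A2 -> 2, A3 -> 2
best_move = max(backed_up, key=backed_up.get)                # MAX level: 'A1', minimax value 3
print(best_move, backed_up[best_move])                       # A1 3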

53 A 2-ply Game tree - Hypothetical
(The hypothetical 2-ply tree of slide 49, repeated for reference.)

54 Minimax Rule Goal of game tree search: to determine one move for the MAX player that maximizes MAX's guaranteed payoff for a given game tree, regardless of the moves MIN will take. The value of each node (MAX or MIN) is determined by (backed up from) the values of its children. MAX plays the worst-case scenario: always assume MIN takes moves that maximize MIN's payoff (i.e., minimize MAX's payoff). For a MAX node, the backed-up value is the maximum of the values associated with its children; for a MIN node, the backed-up value is the minimum of the values associated with its children.

55 Minimax Tree (figure: MAX nodes, MIN nodes and f values labeled; A1 is selected as the next move)

56 Minimax procedure Create the start node as a MAX node with the current board configuration. Expand nodes down to some depth (i.e., ply) of lookahead in the game. Apply the evaluation function at each of the leaf nodes. Obtain the backed-up values for each of the non-leaf nodes from its children by the minimax rule, until a value is computed for the root node. Pick the operator associated with the child node whose backed-up value determined the value at the root as the move for MAX. (A code sketch of this procedure follows.)
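A minimal sketch of that procedure, assuming a game object with the actions/result/terminal_test interface sketched earlier and a separate static evaluation function; these names are illustrative, not a fixed API.

def minimax_decision(game, state, depth, evaluate):
    """Return the MAX move whose backed-up value determines the root value."""
    def value(s, d, maximizing):
        if game.terminal_test(s) or d == 0:
            return evaluate(s)                      # static evaluation at the lookahead frontier
        children = [value(game.result(s, a), d - 1, not maximizing)
                    for a in game.actions(s)]
        return max(children) if maximizing else min(children)

    # The root is a MAX node: pick the action whose MIN-level backed-up value is largest.
    return max(game.actions(state),
               key=lambda a: value(game.result(state, a), depth - 1, False))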

57 Applying Minimax Definition
The minimax decision

58 Minimax Search (figure: a small tree whose leaves carry static evaluator values 2, 7, 1 and 8; the MAX and MIN levels are labeled and the move selected by minimax is highlighted)

59 Minimax algorithm Algorithm:
Generate the game tree completely. Determine the utility of each terminal state. Propagate the utility values upward in the tree by applying the MIN and MAX operators on the nodes in the current level. At the root node, use the minimax decision to select the move with the max (of the min) utility value. Steps 2 and 3 of the algorithm assume that the opponent will play perfectly.

60 Minimax Algorithm

61 Explanation The algorithm for calculating minimax decisions
It returns the action corresponding to the best possible move, that is, the move that leads to the outcome with the best utility, under the assumption that the opponent plays to minimize utility. The functions MAX-VALUE and MIN-VALUE go through the whole game tree, all the way to the leaves, to determine the backed-up value of a state. The notation argmax over a ∈ S of f(a) computes the element a of set S that has the maximum value of f(a).

62 Minimax Assumption Minimax finds the contingent strategy for MAX assuming an infallible MIN opponent. Minimax assumption: both players play optimally. The definition of optimal play for MAX assumes MIN plays optimally: it maximizes the worst-case outcome for MAX. But if MIN does not play optimally, MAX will do at least as well, and possibly better (this can be proven).

63 MINIMAX Code The slide's pseudocode, rewritten as runnable Python (the node interface: is_leaf, score, is_min_node and successors, is assumed for illustration):

def MINIMAX(N):
    # A leaf: return the estimated score of this position.
    if N.is_leaf():
        return N.score()
    successors = N.successors()                  # N1, N2, ..., Nm
    if N.is_min_node():
        return min(MINIMAX(Ni) for Ni in successors)
    return max(MINIMAX(Ni) for Ni in successors)

64 Minimax Properties Minimax is for deterministic, fully observable games
That is, it plays perfect-information games: deterministic environments in which nothing is hidden from either player.

65 Applying minimax to complicated games
How can minimax be applied to complicated games? It is not possible to expand the game tree all the way to the leaf nodes (the complete tree is infeasible, e.g. in chess). Instead, the state space is searched to a predefined number of levels, determined by the available resources of time and memory. This strategy is an n-ply lookahead, where n is the number of levels explored. The leaves of this subtree are not the terminal states of the game, so it is not possible to give them values that reflect a win or a loss.

66 Applying minimax to complicated games
How can minimax be applied to complicated games? Each leaf node is given a value according to some heuristic evaluation function. The value that is propagated back to the root node is then not an indication of whether or not a win can be achieved, but the heuristic value of the best state that can be reached in n moves from the start node. Backed-up values are based on "looking ahead" in the game tree; lookahead increases the power of a heuristic by allowing it to apply over a greater area of the search space. Minimax thus consolidates these evaluations into a single value for an ancestor state.

67 Heuristic vs. Brute force
Zero-sum games: one player's loss is another player's gain. A winning strategy for this type of game is to minimize the maximum potential gain of the opponent, and to assume your opponent is following the same strategy. This is better than brute-force lookahead: considering all possible moves to the end of the game and picking a move that leads to a win, if possible. Why not program computer chess that way?

68 Heuristics in games Heuristics in chess
An example heuristic in chess is the difference in the number of pieces belonging to MAX and MIN. (A code sketch of this heuristic follows.)
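One illustrative way to code such a piece-count heuristic; the board encoding (a dict mapping squares to piece letters, uppercase for MAX) and the optional weights are assumptions for this sketch. The slide's simplest version just weights every piece as 1.

# Optional per-piece weights; set them all to 1 for the plain piece-count difference.
PIECE_VALUE = {"P": 1, "N": 3, "B": 3, "R": 5, "Q": 9, "K": 0}

def material(board: dict) -> int:
    """board: hypothetical mapping square -> piece letter, uppercase = MAX's pieces."""
    score = 0
    for piece in board.values():
        value = PIECE_VALUE.get(piece.upper(), 0)
        score += value if piece.isupper() else -value
    return score

# Example: MAX has an extra knight in an otherwise balanced position.
print(material({"e1": "K", "e8": "k", "d4": "N"}))   # -> 3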

69 Minimax: properties Complete: ? Optimal: ? Time complexity: ?
Space complexity: ?

70 Minimax: properties The minimax algorithm is depth-first search
Complete? Yes, for a finite state space (finite tree). Optimal? Yes, against an optimal opponent. Time complexity? O(b^m). Space complexity? O(bm) if all successors are generated at once, or O(m) if successors are generated one at a time (b is the branching factor, m the maximum depth). (A quick numeric illustration follows.)
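As a quick sanity check of the O(b^m) figure, using the chess numbers quoted earlier (branching factor about 35, roughly 100 plies):

import math

b, m = 35, 100                        # chess: branching factor ~35, ~100 plies
exponent = m * math.log10(b)
print(f"b**m ~= 10^{exponent:.0f}")   # prints b**m ~= 10^154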

71 State space search vs. minimax search
Performance depends on the quality of the evaluation function (domain knowledge) and the depth of the search (computing power and the search algorithm). Game-tree search differs from ordinary state-space search: we search not for a complete solution but for one move only, no cost is associated with each arc, and MAX does not know how MIN is going to counter each of his moves. The time complexity is impractical for real games, but the minimax rule is the basis for other game-tree search algorithms.

72 Multiplayer Games Many popular games allow more than two players.
How can the minimax idea be extended to multiplayer games? This is straightforward from the technical viewpoint, but it raises some interesting new conceptual issues.

73 Multiplayer Games Many games allow more than two players
Replace the single value for each node with a vector of values. In two-player zero-sum games the two-element vector reduces to a single value, because the values are always opposite. Treat the utility function as returning a vector of values. Ex: for 3 players A, B, C, a vector (vA, vB, vC) is associated with each node.

74 Multiplayer Games Computing minimax values
Consider a node X where player C chooses what to do. There are two choices, leading to two terminal states with vectors (1, 2, 6) and (4, 2, 3). C should choose (1, 2, 6), since 6 > 3, so the backed-up value of node X is (1, 2, 6). In general, the backed-up value of a node n is the utility vector of the successor that has the highest value for the player choosing at n. (A one-line code sketch follows.)
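A tiny sketch of that back-up rule, reusing the slide's example at node X where player C (index 2 in each vector) moves; the function name is illustrative.

def backed_up(children, mover_index):
    # Choose the child whose utility vector is best for the player moving at this node.
    return max(children, key=lambda vec: vec[mover_index])

print(backed_up([(1, 2, 6), (4, 2, 3)], mover_index=2))   # -> (1, 2, 6), since 6 > 3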

75 Extending Minimax to Multiplayer games
Note: issues specific to optimal strategy in multiplayer games, such as alliances, are not dealt with here.

76 Alpha Beta Pruning

