Adversarial Search 1 (Game Playing)

Outline: Motivation, optimal decisions, minimax algorithm, α-β pruning.

Motivation. Games are a form of multi-agent environment. Any given agent needs to consider the actions of other agents and how they affect its own success. The unpredictability of other agents can introduce many possible contingencies into the agent's problem-solving process. Multi-agent environments can be cooperative or competitive.

Background on multi-agent environments

Environments: Single agent vs. multiagent. The distinction is drawn as follows. If A is an agent (say, a taxi driver), how should it treat another object B: as an agent, or as a stochastically behaving object? The key distinction is whether B's behavior is best described as maximizing a performance measure whose value depends on agent A's behavior. Examples?

Environments: Single agent vs. multiagent. Example 1: Chess. The opponent B is trying to maximize its own performance measure, which in turn minimizes agent A's performance measure. Chess is therefore a competitive multiagent environment.

Environments: Single agent vs. multiagent. Example 2: Taxi driving. The taxi-driving environment is a partially cooperative multiagent environment, because avoiding collisions maximizes the performance measure of all agents. It is also partially competitive, because only one car can occupy a parking space.

Game theory. Mathematical game theory, a branch of economics, views any multi-agent environment as a game, provided that the impact of each agent on the others is "significant", regardless of whether the agents are competitive or cooperative (game theory is covered in Ch. 17). Note: Environments with a very large number of agents are often viewed as economies rather than games.

Games in AI. In AI, the most common games are turn-taking, two-player, zero-sum (one player's loss is the other's gain), perfect-information (each player knows the entire game state; no information is hidden from either player), and deterministic (no element of chance). What does this mean? These are deterministic, fully observable environments with two agents whose actions alternate and in which the utility values at the end of the game are always equal and opposite.

What is adversarial? The opposition between the agents' utility functions is what makes the situation adversarial. Example: if one player wins a game of chess (+1), the other player necessarily loses (-1).

Motivation, contd. Games are a form of multi-agent environment: what do other agents do, and how do they affect our success? Competitive multi-agent environments, in which the agents' goals are in conflict, give rise to adversarial search problems, known as games. Why study games? Games are fun. They have a historical role in AI. Studying games teaches us how to deal with other agents that are trying to foil our plans.

Motivation, contd. Games have huge state spaces, yet the state of a game is easy to represent. Agents are restricted to a small number of actions, and outcomes are defined by precise rules: a clear set of legal moves and well-defined outcomes (e.g. win, lose, draw). Games offer a nice, clean environment with clear criteria for success.

Motivation, contd. Physical games. Physical games such as tennis, croquet, and ice hockey have complicated descriptions, a larger range of possible actions, and imprecise rules for defining legal actions. With the exception of robot soccer, physical games have not attracted much interest from the AI community (more at http://www.robocup.org).

Motivation, contd. Game playing was one of the first tasks undertaken in AI: chess, checkers, Othello, backgammon. Games, unlike toy problems, are interesting because they are too hard to solve. Example: chess has an average branching factor of about 35. If each player makes 50 moves, the search tree has about $35^{100}$ nodes.
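
To get a feel for the size of that number, the arithmetic can be checked directly. The snippet below is only an illustration of the estimate on the slide; the constant names are chosen here and do not come from the slides.

```python
import math

BRANCHING_FACTOR = 35  # average number of legal moves per chess position (from the slide)
PLIES = 100            # 50 moves by each of the two players

# Python integers have arbitrary precision, so the power is exact.
nodes = BRANCHING_FACTOR ** PLIES
digits = len(str(nodes))

# The same magnitude expressed as a power of ten.
exponent = PLIES * math.log10(BRANCHING_FACTOR)

print(f"35^100 has {digits} decimal digits (about 10^{exponent:.1f})")
```

The result is a number with more than 150 digits, which is why exhaustive search of the chess tree is out of the question.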

Motivation, contd. Too hard to solve. Games, like the real world, require the ability to make some decision even when calculating the optimal decision is infeasible. Games also penalize inefficiency severely: an inefficient implementation of a game algorithm costs more and takes more time. Game-playing research has therefore come up with ideas on how to make the best possible use of time.

Drawing game trees. Even for a game as simple as tic-tac-toe, the entire game tree is too complex to draw, as shown next.

Game tree (2-player, deterministic, turns)

More complicated games. Most card games (e.g. Hearts, Bridge) and Scrabble are non-deterministic and lack perfect information. Other examples are cooperative games and real-time strategy games, e.g. Warcraft, which lack alternating moves.

Types of Games Note: No chance (e.g., using dice) involved

Types of Games

Optimal decisions in games

Game setup – Two player. There are two players, A and B, known as MAX and MIN. MAX represents the player trying to win, i.e. to MAXimize performance. MIN is the opponent, who attempts to MINimize MAX's score. High values are assumed to be good for MAX and bad for MIN. Assume that MIN uses the same information as MAX and always attempts to move to a state that is worst for MAX. MAX moves first, and the players take turns until the game is over. The winner gets a reward and the loser gets a penalty. Note: We consider zero-sum games in this chapter.

Two-Player Games A game formulated as a search problem: Initial state: ? Actions: ? Terminal state: ? Utility function: ? Transition Model?

Game setup – Two player. S0: The initial state, which specifies how the game is set up at the start. PLAYER(s): Defines which player has the move in state s. ACTIONS(s): Returns the set of legal moves in state s. RESULT(s, a): The transition model, which defines the result of move a in state s. TERMINAL-TEST(s): A terminal test, which is true when the game is over and false otherwise. States where the game has ended are called terminal states. UTILITY(s, p): A utility function (also called an objective function or payoff function), which defines the final numeric value for a game that ends in terminal state s for player p.
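
The six components above map naturally onto six small functions. The following is a minimal sketch for tic-tac-toe, not code from the lecture: the board encoding (a 9-tuple of "X", "O", or " ") and the helper `winner` are assumptions made for illustration.

```python
from typing import List, Optional, Tuple

State = Tuple[str, ...]  # assumed encoding: 9 cells, row by row, each "X", "O", or " "

LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),   # rows
         (0, 3, 6), (1, 4, 7), (2, 5, 8),   # columns
         (0, 4, 8), (2, 4, 6)]              # diagonals

S0: State = (" ",) * 9  # initial state: empty board


def player(s: State) -> str:
    """PLAYER(s): X (MAX) moves first, so X moves when the counts are equal."""
    return "X" if s.count("X") == s.count("O") else "O"


def actions(s: State) -> List[int]:
    """ACTIONS(s): indices of the empty cells (meant for non-terminal states)."""
    return [i for i, cell in enumerate(s) if cell == " "]


def result(s: State, a: int) -> State:
    """RESULT(s, a): the state after the player to move marks cell a."""
    return s[:a] + (player(s),) + s[a + 1:]


def winner(s: State) -> Optional[str]:
    """Hypothetical helper: the mark with three in a row, or None."""
    for i, j, k in LINES:
        if s[i] != " " and s[i] == s[j] == s[k]:
            return s[i]
    return None


def terminal_test(s: State) -> bool:
    """TERMINAL-TEST(s): someone has won or the board is full."""
    return winner(s) is not None or " " not in s


def utility(s: State, p: str) -> int:
    """UTILITY(s, p): +1 for a win, -1 for a loss, 0 for a draw."""
    w = winner(s)
    if w is None:
        return 0
    return 1 if w == p else -1
```

For example, `result(S0, 4)` puts an X in the center, after which `player` reports that it is O's turn.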

Game setup – Two player. A game can be defined as a search problem. Initial state: the board position together with the player to move, e.g. the starting board configuration in chess. Terminal test: is the game finished? States at which the game has ended are called terminal states. Utility function: gives a numerical value for terminal states, e.g. win (+1), lose (-1), and draw (0) in tic-tac-toe and chess. In backgammon the values range from +192 to -192.

Game Trees. The problem space for a game is represented by a game tree. Nodes represent board positions; edges represent legal moves. The root node is the position in which a decision must be made. An evaluation function f assigns real-number scores to board positions. Terminal nodes represent ways the game could end, labeled with the desirability of that ending (e.g. win/lose/draw or a numerical score).

Game Trees vs. Search Trees. For tic-tac-toe the game tree is relatively small: fewer than 9! = 362,880 terminal nodes. For chess there are over $10^{40}$ nodes, so the game tree is best thought of as a theoretical construct that we cannot realize in the physical world. Search tree: regardless of the size of the game tree, it is MAX's job to search for a good move. We use the term search tree for a tree that is superimposed on the full game tree and examines enough nodes to allow a player to determine what move to make.
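
The bound for tic-tac-toe can be checked by enumerating the game tree. The sketch below counts terminal nodes by exhaustive depth-first expansion; the board encoding (a list of nine cells) and the function names are assumptions made for this illustration.

```python
import math

LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
         (0, 3, 6), (1, 4, 7), (2, 5, 8),
         (0, 4, 8), (2, 4, 6)]


def has_three_in_a_row(board):
    """True if either player has completed a row, column, or diagonal."""
    return any(board[i] != " " and board[i] == board[j] == board[k]
               for i, j, k in LINES)


def count_terminal_nodes(board, to_move="X"):
    """Count the leaves of the game tree below the given position."""
    if has_three_in_a_row(board) or " " not in board:
        return 1  # the game is over here: one terminal node
    other = "O" if to_move == "X" else "X"
    total = 0
    for i in range(9):
        if board[i] == " ":
            board[i] = to_move              # make the move
            total += count_terminal_nodes(board, other)
            board[i] = " "                  # undo it
    return total


terminal_nodes = count_terminal_nodes([" "] * 9)
print(terminal_nodes, "terminal nodes; 9! =", math.factorial(9))
```

Games that end early with three in a row cut off whole subtrees, which is why the count (255,168) stays below 9!.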

Game Tree for Tic-Tac-Toe. From the initial state MAX has 9 possible moves. MAX places an 'x' and MIN places an 'o'. They play alternately until a terminal state is reached: a state in which one player has three in a row or all the squares are filled. It is MAX's job to use the search tree to determine the best move. Terminal states are assigned utility values according to the rules of the game.

Optimal strategies

Optimal strategies. In a normal search problem, what is the optimal solution? A sequence of steps leading to a goal state. What about games? MIN also makes decisions, so MAX must find a contingent strategy. It specifies MAX's move in the initial state, then MAX's moves in the states resulting from every possible response by MIN, then MAX's moves in the states resulting from every possible response by MIN to those moves, and so on. We will see how to find the optimal strategy (the minimax procedure).

Utility function. How many functions are needed for the two players MAX and MIN? The zero-sum assumption allows us to use a single evaluation function to describe the goodness of a board with respect to both players; one of the players just has to negate the value returned by the function. Positive numbers favor MAX and negative numbers favor MIN. f(n) > 0: position n is good for MAX and bad for MIN. f(n) < 0: position n is bad for MAX and good for MIN. f(n) near 0: position n is neutral. f(n) >> 0: win for MAX. f(n) << 0: win for MIN.

An example (partial) game tree for Tic-Tac-Toe. f(n) = +1 if the position is a win for X. f(n) = -1 if the position is a win for O. f(n) = 0 if the position is a draw.

Generate game tree

Generate game tree [figure sequence: the tic-tac-toe tree grows level by level as X and O alternate; one level is labeled "1 ply" and a pair of levels "1 move"]

Drawing game trees. So we adopt the game tree shown next. Game trees are searched level by level, and each level is called a ply. Each move by a player defines a new ply of the game tree. Each level in the game tree is labeled according to the player who moves at that point in the game, MIN or MAX. MAX nodes: nodes at even-numbered depths correspond to positions in which it is MAX's move next. MIN nodes: nodes at odd-numbered depths correspond to positions in which it is MIN's move next.

Applying Minimax to Tic Tac Toe

Tic-Tac-Toe. [Figure: a board with one X and one O, for which f(n) = 6 - 5 = 1.] Initial state: a board position, represented as a 3x3 matrix of O's and X's. Actions (operators): placing O's and X's in vacant positions alternately. Terminal test: determines whether the game is over. Utility function: f(n) = (number of complete rows, columns, or diagonals still open for the player) - (number of complete rows, columns, or diagonals still open for the opponent).

Example: Tic-Tac-Toe. MAX marks crosses, MIN marks circles, and it is MAX's turn to play first. With a depth bound of 2, conduct a breadth-first search and apply the evaluation function f(n) to each position n at the depth bound. If n is not a win for either player, f(n) = (number of complete rows, columns, or diagonals still open for MAX) - (number of complete rows, columns, or diagonals still open for MIN). If n is a win for MAX, f(n) = +∞. If n is a win for MIN, f(n) = -∞.
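
The evaluation function and the two-ply search can be sketched in a few lines. This is an illustrative reading of the slide, not the lecturer's code: the board is assumed to be a 9-character string read row by row, and the names `open_lines`, `evaluate`, and `two_ply_values` are invented here.

```python
import math

LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
         (0, 3, 6), (1, 4, 7), (2, 5, 8),
         (0, 4, 8), (2, 4, 6)]


def has_win(board, mark):
    return any(all(board[i] == mark for i in line) for line in LINES)


def open_lines(board, opponent):
    """Lines that contain no mark of `opponent`, i.e. still open for the other side."""
    return sum(all(board[i] != opponent for i in line) for line in LINES)


def evaluate(board):
    """f(n) from MAX's (X's) point of view."""
    if has_win(board, "X"):
        return math.inf
    if has_win(board, "O"):
        return -math.inf
    return open_lines(board, "O") - open_lines(board, "X")


def place(board, i, mark):
    return board[:i] + mark + board[i + 1:]


def two_ply_values(board):
    """Backed-up value of each X move, assuming O replies to minimize f."""
    values = {}
    for i in (c for c in range(9) if board[c] == " "):
        after_x = place(board, i, "X")
        replies = [place(after_x, j, "O") for j in range(9) if after_x[j] == " "]
        if has_win(after_x, "X") or not replies:
            values[i] = evaluate(after_x)   # game already over after X's move
        else:
            values[i] = min(evaluate(r) for r in replies)
    return values


first_move_values = two_ply_values(" " * 9)
best_first_move = max(first_move_values, key=first_move_values.get)
print(first_move_values, "best cell:", best_first_move)
```

From the empty board this sketch backs up 1 for the center, -1 for each corner, and -2 for each edge, so the center (cell 4) is chosen. An X in a corner with an O on a non-adjacent edge evaluates to 6 - 5 = 1, the same value as the board on the earlier slide.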

Example : Tic-Tac-Toe (2) First move

Example : Tic-Tac-Toe (3)

Example : Tic-Tac-Toe (4)

Problems

Compute Two-ply minimax for tic-tac-toe at the following state

Compute Two-ply minimax for tic-tac-toe at the following state

Building Minimax Procedure

A 2-ply Game tree - Hypothetical. The possible moves for MAX at the root node are labeled A1, A2, and A3. The possible replies to A1 for MIN are A11, A12, and A13, and so on. Assume the game ends after one move each by MAX and MIN. In game parlance, we say that this tree is one move deep, consisting of two half-moves, each of which is called a ply. Assume the utilities of the terminal states in this game range from 2 to 14. Given a game tree, the optimal strategy can be determined from the minimax value of each node, MINIMAX(s).

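A 2-ply Game tree - Hypothetical. [Figure: MAX at the root chooses among A1, A2, and A3 (1st ply). MIN then replies (2nd ply): A11, A12, A13 lead to terminal utilities 3, 12, 8; A21, A22, A23 lead to 2, 4, 6; A31, A32, A33 lead to 14, 5, 2.] Note: An action by one player is called a ply; two ply (an action and a counter-action) make one move. MAX nodes are drawn as upward-pointing triangles and MIN nodes as inverted triangles.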

A 2-ply Game tree. What is MAX's best move at the root? What is MIN's best reply? To answer, apply the minimax definition: compute the minimax value of each node and label the nodes with their values.

Definition of MINIMAX(s). Given a game tree, the optimal strategy can be determined from the minimax value of each node, denoted MINIMAX(s). If the state is a MAX node, give it the maximum value among its children. If the state is a MIN node, give it the minimum value among its children. The minimax value of a terminal state is just its utility.

$$
\mathrm{MINIMAX}(s) =
\begin{cases}
\mathrm{UTILITY}(s) & \text{if } \mathrm{TERMINAL\text{-}TEST}(s) \\
\max_{a \in \mathrm{ACTIONS}(s)} \mathrm{MINIMAX}(\mathrm{RESULT}(s, a)) & \text{if } \mathrm{PLAYER}(s) = \mathrm{MAX} \\
\min_{a \in \mathrm{ACTIONS}(s)} \mathrm{MINIMAX}(\mathrm{RESULT}(s, a)) & \text{if } \mathrm{PLAYER}(s) = \mathrm{MIN}
\end{cases}
$$
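
As a worked instance of the definition, take the hypothetical 2-ply tree introduced earlier, whose terminal utilities are 3, 12, 8 under A1, then 2, 4, 6 under A2, and 14, 5, 2 under A3. Writing $s_1, s_2, s_3$ for the MIN nodes reached by A1, A2, and A3 (labels introduced here for convenience):

$$
\begin{aligned}
\mathrm{MINIMAX}(s_1) &= \min(3, 12, 8) = 3 \\
\mathrm{MINIMAX}(s_2) &= \min(2, 4, 6) = 2 \\
\mathrm{MINIMAX}(s_3) &= \min(14, 5, 2) = 2 \\
\mathrm{MINIMAX}(\text{root}) &= \max(3, 2, 2) = 3
\end{aligned}
$$

The root value 3 is reached through A1.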

A 2-ply Game tree. Apply the minimax definition to the hypothetical game tree. What is MAX's best move at the root? A1, because it leads to the successor with the highest minimax value. What is MIN's best reply? A11, because it leads to the successor with the lowest minimax value.

Minimax Rule. Goal of game tree search: to determine the one move for MAX that maximizes MAX's guaranteed payoff for a given game tree, regardless of the moves MIN will take. The value of each node (MAX and MIN) is determined by (backed up from) the values of its children. MAX plans for the worst-case scenario: always assume that MIN moves to maximize its own payoff (i.e., to minimize the payoff of MAX). For a MAX node, the backed-up value is the maximum of the values of its children. For a MIN node, the backed-up value is the minimum of the values of its children.

Minimax Tree. [Figure: a game tree showing MAX nodes, MIN nodes, and f values.] A1 is selected as the next move.

Minimax procedure. (1) Create the start node as a MAX node with the current board configuration. (2) Expand nodes down to some depth (i.e., ply) of lookahead in the game. (3) Apply the evaluation function at each of the leaf nodes. (4) Obtain the backed-up value for each non-leaf node from its children by the minimax rule, until a value is computed for the root node. (5) Pick, as the move for MAX, the operator associated with the child node whose backed-up value determined the value at the root.
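
The procedure above can be sketched as a depth-limited search. Everything in the example is an assumption for illustration: the tree is the hypothetical 2-ply tree with invented leaf names, and the heuristic scores attached to the interior nodes B, C, and D are made up to show how the chosen move can depend on the lookahead depth.

```python
# Hypothetical game: each state maps to a list of (move, successor) pairs.
TREE = {
    "root": [("A1", "B"), ("A2", "C"), ("A3", "D")],
    "B": [("A11", "b1"), ("A12", "b2"), ("A13", "b3")],
    "C": [("A21", "c1"), ("A22", "c2"), ("A23", "c3")],
    "D": [("A31", "d1"), ("A32", "d2"), ("A33", "d3")],
}

# Evaluation function values; the scores for B, C, D are invented heuristics.
SCORES = {"b1": 3, "b2": 12, "b3": 8,
          "c1": 2, "c2": 4, "c3": 6,
          "d1": 14, "d2": 5, "d3": 2,
          "B": 7, "C": 4, "D": 9}


def successors(state):
    return TREE.get(state, [])


def evaluate(state):
    return SCORES[state]


def backed_up_value(state, depth, is_max):
    """Steps 2-4: expand to the depth bound, evaluate leaves, back values up."""
    children = successors(state)
    if depth == 0 or not children:
        return evaluate(state)
    values = [backed_up_value(child, depth - 1, not is_max)
              for _, child in children]
    return max(values) if is_max else min(values)


def choose_move(state, depth):
    """Step 5: pick the move whose child determined the root value (depth >= 1)."""
    move, _ = max(successors(state),
                  key=lambda mc: backed_up_value(mc[1], depth - 1, False))
    return move


print(choose_move("root", depth=2), choose_move("root", depth=1))
```

With a two-ply lookahead the sketch selects A1 (backed-up value 3). With only one ply it trusts the invented heuristic scores of B, C, and D and selects A3 instead, which is the kind of mistake deeper lookahead is meant to avoid.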

Applying Minimax Definition The minimax decision

Minimax Search. [Figure: a game tree with alternating MAX and MIN levels and static evaluator values 2, 7, 1, 8 at the leaves; the move selected by minimax is highlighted.]

Minimax algorithm. Algorithm: (1) Generate the game tree completely. (2) Determine the utility of each terminal state. (3) Propagate the utility values upward in the tree by applying the MIN and MAX operators to the nodes in the current level. (4) At the root node, use the minimax decision to select the move with the max (of the min) utility value. Steps 2 and 3 in the algorithm assume that the opponent will play perfectly.

Minimax Algorithm

Explanation. The algorithm for calculating minimax decisions returns the action corresponding to the best possible move, that is, the move that leads to the outcome with the best utility, under the assumption that the opponent plays to minimize utility. The functions MAX-VALUE and MIN-VALUE go through the whole game tree, all the way to the leaves, to determine the backed-up value of a state. The notation $\operatorname{argmax}_{a \in S} f(a)$ denotes the element a of set S that has the maximum value of f(a).
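
The structure described here, two mutually recursive functions plus an argmax at the root, might look as follows in Python. This is a sketch under stated assumptions rather than the algorithm figure from the slide: the dictionary encoding of the hypothetical 2-ply tree and the interior state names B, C, D are invented for the example.

```python
import math

# Hypothetical 2-ply tree: interior states map actions to successors,
# and terminal states are represented directly by their utility.
GAME_TREE = {
    "root": {"A1": "B", "A2": "C", "A3": "D"},
    "B": {"A11": 3, "A12": 12, "A13": 8},
    "C": {"A21": 2, "A22": 4, "A23": 6},
    "D": {"A31": 14, "A32": 5, "A33": 2},
}


def terminal_test(state):
    return not isinstance(state, str)


def utility(state):
    return state


def actions(state):
    return list(GAME_TREE[state])


def result(state, action):
    return GAME_TREE[state][action]


def max_value(state):
    """Backed-up value of a state where MAX is to move."""
    if terminal_test(state):
        return utility(state)
    v = -math.inf
    for a in actions(state):
        v = max(v, min_value(result(state, a)))
    return v


def min_value(state):
    """Backed-up value of a state where MIN is to move."""
    if terminal_test(state):
        return utility(state)
    v = math.inf
    for a in actions(state):
        v = min(v, max_value(result(state, a)))
    return v


def minimax_decision(state):
    """argmax over MAX's actions of the value MIN can hold MAX to."""
    return max(actions(state), key=lambda a: min_value(result(state, a)))


print(minimax_decision("root"), max_value("root"))
```

Python's `max` with a `key` function plays the role of argmax: it returns the action itself, not the value that action achieves.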

Minimax Assumption. Minimax finds the contingent strategy for MAX assuming an infallible MIN opponent. Minimax assumption: both players play optimally. The definition of optimal play for MAX assumes that MIN also plays optimally, so it maximizes the worst-case outcome for MAX. If MIN does not play optimally, MAX will do even better (this can be proven).

MINIMAX Code

function MINIMAX(N)
begin
    if N is a leaf then
        return the estimated score of this leaf
    else
        let N1, N2, ..., Nm be the successors of N;
        if N is a MIN node then
            return min{MINIMAX(N1), ..., MINIMAX(Nm)}
        else
            return max{MINIMAX(N1), ..., MINIMAX(Nm)}
end MINIMAX;

Minimax Properties. Minimax applies to deterministic, fully observable (perfect-information) games: it computes optimal play for deterministic environments with perfect information.

Applying minimax to complicated games. How can minimax be applied to complicated games? It is not possible to expand the game tree all the way to the leaf nodes (the complete tree is infeasible, e.g. in chess). Instead, the state space is searched to a predefined number of levels, determined by the available time and memory. This strategy is called an n-ply lookahead, where n is the number of levels explored. The leaves of this subgraph are not terminal states of the game, so it is not possible to give them values that reflect a win or a loss.

Applying minimax to complicated games. Instead, each leaf node is given a value according to some heuristic evaluation function. The value that is propagated back to the root node is not an indication of whether a win can be achieved, but the heuristic value of the best state that can be reached in n moves from the start node. Backed-up values are based on looking ahead in the game tree. Lookahead increases the power of a heuristic by allowing it to be applied over a greater area of the search space. Minimax consolidates these evaluations into a single value for an ancestor state.

Heuristic vs. Brute force. Zero-sum games: one player's loss is another player's gain. A winning strategy for this type of game is to minimize the maximum potential gain of the opponent, and to assume that the opponent is following the same strategy. This is better than brute-force lookahead, which would consider all possible moves to the end of the game and pick the move that leads to a win, if possible. Why not program computer chess that way?

Heuristics in games. Heuristics in chess: the difference in the number of pieces belonging to MAX and MIN.
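
A piece-count heuristic of this kind takes only a few lines. The sketch below assumes, purely for illustration, that the position is given as the piece-placement field of a FEN string, where uppercase letters are White's pieces (taken here to be MAX) and lowercase letters are Black's (MIN); the function name is invented.

```python
def piece_difference(placement: str) -> int:
    """Number of MAX (uppercase) pieces minus number of MIN (lowercase) pieces.

    Digits (runs of empty squares) and '/' (rank separators) are neither
    uppercase nor lowercase, so they are ignored.
    """
    max_pieces = sum(ch.isupper() for ch in placement)
    min_pieces = sum(ch.islower() for ch in placement)
    return max_pieces - min_pieces


START = "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR"
print(piece_difference(START))
```

The starting position is balanced, so the heuristic is 0 there. Practical chess programs usually weight pieces by type instead of counting them equally, but the plain count matches the heuristic stated on the slide.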

Minimax: properties Complete: ? Optimal: ? Time complexity: ? Space complexity: ?

Minimax: properties. The minimax algorithm performs a depth-first search; let b be the branching factor and m the maximum depth of the tree. Complete? Yes, for a finite state space (finite tree). Optimal? Yes (against an optimal opponent). Time complexity? $O(b^m)$. Space complexity? $O(bm)$ if all successors are generated at once, $O(m)$ if successors are generated one at a time.
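
The growth behind the time bound is easy to tabulate. The sketch below assumes an idealized uniform tree in which every interior node has exactly b successors and every leaf lies at depth m; the function names are illustrative.

```python
def nodes_in_uniform_tree(b: int, m: int) -> int:
    """Total nodes minimax visits: 1 + b + b^2 + ... + b^m, which is O(b^m)."""
    return sum(b ** d for d in range(m + 1))


def nodes_kept_in_memory(b: int, m: int, all_at_once: bool) -> int:
    """Rough size of the depth-first frontier along one root-to-leaf path.

    Generating all successors at once keeps about b nodes per level (O(bm));
    generating them one at a time keeps about one per level (O(m)).
    """
    return b * m if all_at_once else m


for depth in (2, 4, 6):
    print(depth, nodes_in_uniform_tree(35, depth),
          nodes_kept_in_memory(35, depth, all_at_once=True))
```

With b = 35, each additional pair of plies multiplies the number of visited nodes by more than a thousand, while the memory needed grows only linearly with depth.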

State space search vs. minimax search. Performance depends on the quality of the evaluation function (domain knowledge) and the depth of the search (computing power and search algorithm). Minimax differs from ordinary state-space search: it searches not for a complete solution but for one move only; no cost is associated with each arc; and MAX does not know how MIN is going to counter each of its moves. The time complexity is impractical for real games, but the minimax rule is the basis for other game-tree search algorithms.

Multiplayer Games. Many popular games allow more than two players. How can the minimax idea be extended to multiplayer games? This is straightforward from the technical viewpoint but raises some interesting new conceptual issues.

Multiplayer Games. Many games allow more than two players. Replace the single value for each node with a vector of values, and have the utility function return a vector. In two-player zero-sum games the two-element vector can be reduced to a single value because the two values are always opposite. Example: for three players A, B, and C, a vector (vA, vB, vC) is associated with each node.

Multiplayer Games: computing minimax values. Consider a node X where player C chooses what to do. There are two choices, leading to two terminal states with utility vectors (1, 2, 6) and (4, 2, 3). C should choose (1, 2, 6), since 6 > 3, so the backed-up value of node X is (1, 2, 6). In general, the backed-up value of a node n is the utility vector of the successor that has the highest value for the player choosing at n.
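
The vector backup rule can be written as one short recursive function. The sketch assumes that players are numbered 0, 1, 2 for A, B, C and move in that fixed order, that leaves are tuples of utilities, and that interior nodes are lists of children; the function name is invented.

```python
def backed_up_vector(node, player, num_players):
    """Return the utility vector backed up to `node`.

    Leaves are tuples (vA, vB, vC, ...). At an interior node the player to
    move picks the successor whose vector is best in that player's component.
    """
    if isinstance(node, tuple):
        return node
    next_player = (player + 1) % num_players
    child_vectors = [backed_up_vector(child, next_player, num_players)
                     for child in node]
    return max(child_vectors, key=lambda v: v[player])


# Node X from the slide: player C (index 2) chooses between two terminal states.
X = [(1, 2, 6), (4, 2, 3)]
print(backed_up_vector(X, player=2, num_players=3))
```

If player A (index 0) were choosing at the same node, the backed-up vector would be (4, 2, 3) instead, since 4 > 1.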

Extending Minimax to Multiplayer games. Note: strategic issues in multiplayer games, such as alliances, are not dealt with here.

Alpha Beta Pruning