Adversarial Search 2 (Game Playing)

Outline: Motivation; Optimal decisions; Minimax algorithm; α-β pruning.

Alpha-Beta Pruning

Limitation of Minimax Search The minimax algorithm requires expanding the entire state space. This is a severe limitation, especially for problems with a large state space. Some nodes in the search can be proven to be irrelevant to the outcome of the search.

Alpha-Beta Pruning Idea: if we have an idea that is surely bad, do not take time to see how truly awful it is. Example: if we know half-way through a calculation that it will fail, there is no point in doing the rest of it.

A 2-ply Game tree – Ex1 [Figure: a two-ply game tree. The MAX root has moves A1, A2, A3 (1st ply) leading to three MIN nodes; the MIN nodes' moves (2nd ply) lead to leaves with values A11, A12, A13: 3, 12, 8; A21, A22, A23: 2, 4, 6; A31, A32, A33: 14, 5, 2.] Note: an action by one player is called a ply; two plies (an action and a counter-action) make up a move. MAX nodes are drawn as triangles and MIN nodes as inverted triangles.
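For the sketches that follow, this example tree can be written as nested lists (a hypothetical Python encoding, not part of the original slides); each inner list holds the leaf values below one MIN node, left to right:

EXAMPLE_TREE = [
    [3, 12, 8],   # MIN node reached by A1 (leaves A11, A12, A13)
    [2, 4, 6],    # MIN node reached by A2 (leaves A21, A22, A23)
    [14, 5, 2],   # MIN node reached by A3 (leaves A31, A32, A33)
]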

Alpha-Beta Pruning Example [Figure: the Ex1 tree with the second and third leaves under the middle MIN node replaced by unknowns x and y; the root is labeled ≥ 3.] max(3, min(2, x, y), …) is always ≥ 3. We know this without knowing x and y.

Alpha-Beta Pruning Example MINIMAX(root) = max(min(3, 12, 8), min(2, x, y), min(14, 5, 2)) = max(3, z, 2), where z = min(2, x, y) ≤ 2, so MINIMAX(root) = 3. In other words, the value of the root, and hence the minimax decision, are independent of the values of the pruned leaves x and y.
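A quick check of this claim using the nested-list encoding above (a sketch; the particular values tried for x and y are arbitrary):

# The root value is 3 regardless of what the pruned leaves x and y turn out to be.
for x in (-100, 0, 100):
    for y in (-100, 0, 100):
        tree = [[3, 12, 8], [2, x, y], [14, 5, 2]]
        assert max(min(child) for child in tree) == 3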

General alpha-beta pruning Alpha-beta pruning can be applied to trees of any depth. It is often possible to prune entire subtrees rather than just leaves.

General alpha-beta pruning Consider a node n in the tree. If the player has a better choice m at the parent node of n, or at any choice point further up, then n will never be reached in play. So once we have learned enough about n (by examining some of its descendants) to reach this conclusion, we can prune it.

alpha-beta pruning (a) The first leaf below B has the value 3. Hence, B, which is a MIN node, has a value of at most 3.

alpha-beta pruning (b) The second leaf below B has a value of 12; MIN would avoid this move, so the value of B is still at most 3.

alpha-beta pruning (c) The third leaf below B has a value of 8; we have seen all B's successor states, so the value of B is exactly 3. Now, we can infer that the value of the root is at least 3, because MAX has a choice worth 3 at the root.

alpha-beta pruning (d) The first leaf below C has the value 2. Hence, C, which is a MIN node, has a value of at most 2. But we know that B is worth 3, so MAX would never choose C. Therefore, there is no point in looking at the other successor states of C. This is an example of alpha-beta pruning.

alpha-beta pruning (e) The first leaf below D has the value 14, so D is worth at most 14. This is still higher than MAX's best alternative (i.e., 3), so we need to keep exploring D's successor states. Notice also that we now have bounds on all of the successors of the root, so the root's value is also at most 14.

alpha-beta pruning (f) The second successor of D is worth 5, so again we need to keep exploring. The third successor is worth 2, so now D is worth exactly 2. MAX's decision at the root is to move to B, giving a value of 3.

Alpha-Beta Algorithm Depth-first search: only nodes along a single path from the root are considered at any time. α = highest-value choice found so far at any choice point along the path for MAX (initially α = −∞). β = lowest-value choice found so far at any choice point along the path for MIN (initially β = +∞). Pass the current values of α and β down to child nodes during the search. Update the values of α and β during the search: MAX updates α at MAX nodes; MIN updates β at MIN nodes. Prune the remaining branches at a node when α ≥ β.
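A minimal Python sketch of the procedure described on this slide, using the nested-list tree encoding introduced earlier (a plain number is a leaf utility for MAX). This is an illustration, not code from the original slides:

import math

def alpha_beta(node, maximizing, alpha=-math.inf, beta=math.inf):
    # Leaf: return its utility value.
    if not isinstance(node, list):
        return node
    if maximizing:
        value = -math.inf
        for child in node:
            value = max(value, alpha_beta(child, False, alpha, beta))
            alpha = max(alpha, value)      # MAX updates alpha at MAX nodes
            if alpha >= beta:              # prune the remaining branches
                break
        return value
    else:
        value = math.inf
        for child in node:
            value = min(value, alpha_beta(child, True, alpha, beta))
            beta = min(beta, value)        # MIN updates beta at MIN nodes
            if alpha >= beta:              # prune the remaining branches
                break
        return value

# Ex1 tree from the figure above.
print(alpha_beta([[3, 12, 8], [2, 4, 6], [14, 5, 2]], True))   # -> 3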

alpha-beta pruning alpha = the value of the best (i.e., highest-value) choice we have found so far at any choice point along the path for MAX. beta = the value of the best (i.e., lowest-value) choice we have found so far at any choice point along the path for MIN. Alpha-beta search updates the values of alpha and beta as it goes along and prunes the remaining branches at a node (i.e., terminates the recursive call) as soon as the value of the current node is known to be worse than the current alpha or beta value for MAX or MIN, respectively

Alpha Value An alpha value is an initial or temporary value associated with a MAX node. Because MAX nodes are given the maximum value among their children, an alpha value can never decrease; it can only go up.

Beta Value A beta value is an initial or temporary value associated with a MIN node. Because MIN nodes are given the minimum value among their children, a beta value can never increase; it can only go down.

Alpha-Beta Procedure Depth-first search of the game tree, keeping track of: Alpha: highest value seen so far at MAX nodes (maximizing levels). Beta: lowest value seen so far at MIN nodes (minimizing levels). Pruning: while evaluating a MIN child of a MAX node, stop expanding its remaining successors once a value ≤ alpha has been seen (values lower than alpha are cut off); while evaluating a MAX child of a MIN node, stop expanding its remaining successors once a value ≥ beta has been seen (values greater than beta are cut off).
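To see the pruning at work, here is a hypothetical counting variant of the alpha_beta sketch above that records how many leaves are actually evaluated on the Ex1 tree:

import math

def alpha_beta_count(node, maximizing, alpha=-math.inf, beta=math.inf, stats=None):
    if stats is None:
        stats = {"leaves": 0}
    if not isinstance(node, list):                # leaf: count and return it
        stats["leaves"] += 1
        return node, stats
    best = -math.inf if maximizing else math.inf
    for child in node:
        value, _ = alpha_beta_count(child, not maximizing, alpha, beta, stats)
        if maximizing:
            best = max(best, value)
            alpha = max(alpha, best)
        else:
            best = min(best, value)
            beta = min(beta, best)
        if alpha >= beta:                         # cutoff: remaining siblings pruned
            break
    return best, stats

value, stats = alpha_beta_count([[3, 12, 8], [2, 4, 6], [14, 5, 2]], True)
print(value, stats["leaves"])                     # 3 7 -- only 7 of the 9 leaves are examined

The two skipped leaves are exactly the second and third successors of the middle MIN node, which is the cutoff the trace below walks through.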

alpha-beta pruning

Alpha Beta Procedure – Trace

Alpha-Beta Example Ex1 Revisited Do DF-search until the first leaf. α, β initial values: α = −∞, β = +∞. α, β passed to child nodes: α = −∞, β = +∞.

Alpha-Beta Example (continued) Root: α = −∞, β = +∞. MIN updates β based on child nodes: α = −∞, β = 3.

Alpha-Beta Example (continued) Root: α = −∞, β = +∞. MIN updates β based on child nodes: α = −∞, β = 3. No change.

Alpha-Beta Example (continued) MAX updates α based on child nodes: α = 3, β = +∞. 3 is returned as the node value.

Alpha-Beta Example (continued) Root: α = 3, β = +∞. α, β passed to child nodes: α = 3, β = +∞.

Alpha-Beta Example (continued) Root: α = 3, β = +∞. MIN updates β based on child nodes: α = 3, β = 2.

Alpha-Beta Example (continued) Root: α = 3, β = +∞. At the MIN node, α = 3, β = 2: α ≥ β, so prune.

Alpha-Beta Example (continued) MAX updates α based on child nodes: no change, α = 3, β = +∞. 2 is returned as the node value.

Alpha-Beta Example (continued) Root: α = 3, β = +∞. α, β passed to child nodes: α = 3, β = +∞.

Alpha-Beta Example (continued) Root: α = 3, β = +∞. MIN updates β based on child nodes: α = 3, β = 14.

Alpha-Beta Example (continued) Root: α = 3, β = +∞. MIN updates β based on child nodes: α = 3, β = 5.

Alpha-Beta Example (continued) Root: α = 3, β = +∞. 2 is returned as the node value.

Alpha-Beta Example (continued) MAX calculates the same node value as plain minimax, and makes the same move (to B, with value 3).

alpha-beta pruning

Analysis of Alpha-Beta Pruning

Analysis of Alpha-Beta Pruning For the same tree, different move orderings cause different branches to be cut. If a node evaluates the child with the best possible outcome earlier, it can decide to cut earlier. For a MIN node, this means evaluating first the child branch that gives the lowest value; for a MAX node, the child branch that gives the highest value.

Effectiveness of Alpha-Beta Search The effectiveness of alpha-beta pruning is highly dependent on the order in which the states are examined. Example: we could not prune any successors of D because the worst successors (from the point of view of MIN) were generated first. If the third successor of D had been generated first, we would have been able to prune the other two. This suggests that it might be worthwhile to try to examine first the successors that are likely to be best.
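Using the hypothetical alpha_beta_count sketch above, reordering D's successors so that MIN's best reply (the 2) is generated first shows this effect:

# Original ordering of D's children (14, 5, 2): 7 leaves examined in total.
# Best-first ordering for MIN (2, 5, 14): D's other two children are pruned.
_, stats = alpha_beta_count([[3, 12, 8], [2, 4, 6], [2, 5, 14]], True)
print(stats["leaves"])                            # 5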

A 2-ply Game tree – Ex1 [Figure: the same example tree, repeated. MAX moves A1, A2, A3 (1st ply) lead to MIN nodes whose leaves (2nd ply) have values 3, 12, 8; 2, 4, 6; and 14, 5, 2.]

Alpha-Beta Example 2

Effectiveness of Alpha-Beta Search Best case: each player's best move is the left-most child (i.e., evaluated first). E.g., sort moves by the move values remembered from the previous search. E.g., in chess, expand captures first, then threats, then forward moves, etc.

Effectiveness of Alpha-Beta Search Worst case: branches are ordered so that no pruning takes place; alpha-beta gives no improvement over exhaustive search. Best case: each player's best move is the left-most alternative (i.e., evaluated first). In practice the cost is often O(b^(m/2)) rather than O(b^m). This is the same as having a branching factor of sqrt(b), since (sqrt(b))^m = b^(m/2); i.e., the effective branching factor is the square root of b instead of b. In chess, for example, this goes from b ≈ 35 to b ≈ 6, which permits much deeper search in the same amount of time and makes computer chess competitive with humans.
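A rough sense of the difference, using the slide's chess-like branching factor b ≈ 35 and an assumed search depth m = 8 (the depth is illustrative only):

b, m = 35, 8
print(b ** m)               # 2251875390625 nodes, order of b^m for exhaustive minimax
print(round(b ** (m / 2)))  # 1500625 nodes, order of b^(m/2) in the best case
print(round(b ** 0.5, 1))   # 5.9 -- effective branching factor ~ sqrt(35), i.e., about 6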

Alpha-Beta Pruning - Summary Pruning does not affect the final result. Entire subtrees can be pruned. Good move ordering improves the effectiveness of pruning. Repeated states are again possible; store them in memory (a transposition table).
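A minimal sketch of the transposition-table idea, shown for plain minimax for simplicity; the helpers children, utility, and is_terminal are hypothetical, states are assumed hashable, and real game programs combine the table with alpha-beta and also store bound information:

def minimax_tt(state, maximizing, children, utility, is_terminal, table=None):
    # table maps (state, player-to-move) to an already-computed minimax value.
    if table is None:
        table = {}
    key = (state, maximizing)
    if key in table:                              # repeated state: reuse stored value
        return table[key]
    if is_terminal(state):
        value = utility(state)
    else:
        values = [minimax_tt(s, not maximizing, children, utility, is_terminal, table)
                  for s in children(state)]
        value = max(values) if maximizing else min(values)
    table[key] = value                            # remember this state for later
    return value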