
1 Artificial Intelligence
Ian Gent
ipg@cs.st-and.ac.uk
Games 2: Alpha Beta

2 Alpha Beta Search
Part I: The idea of Alpha Beta Search
Part II: The details of Alpha Beta Search
Part III: Results of using Alpha Beta

3 Reminder
- We consider 2-player perfect information games
- Two players, Min and Max
- Leaf nodes are given a definite score
- Backing up by MiniMax defines a score for all nodes
- Usually we can't search the whole tree
  - Use a static evaluation function instead
- MiniMax is hopelessly inefficient
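As a point of comparison for what follows, here is a minimal sketch of the MiniMax back-up described above (not from the slides). It assumes a toy tree representation: a node is either a number (a leaf already scored by the static evaluation function) or a list of child nodes.

```python
def minimax(node, maximising=True):
    """Back up the MiniMax score of `node`.

    A node is either a numeric leaf score or a list of child nodes;
    Max and Min alternate levels.
    """
    if not isinstance(node, list):        # leaf: definite score
        return node
    values = [minimax(child, not maximising) for child in node]
    return max(values) if maximising else min(values)
```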

4 What's wrong with MiniMax
- MiniMax is horrendously inefficient
- If we go to depth d with branching rate b,
  - we must explore b^d nodes
- but many nodes are wasted
- We needlessly calculate the exact score at every node
- but at many nodes we don't need to know the exact score
  - e.g. the nodes outlined in the slide's diagram are irrelevant

5 The Solution
- Start propagating costs as soon as leaf nodes are generated
- Don't explore nodes which cannot affect the choice of move
  - i.e. don't explore those that we can prove are no better than the best found so far
- This is the idea behind alpha-beta search

6 Alpha-Beta search
- Alpha-Beta = α-β
- Uses the same insight as branch and bound
- When we cannot do better than the best so far
  - we can cut off search in this part of the tree
- More complicated because of opposite score functions
- To implement this we will manipulate alpha and beta values, and store them on internal nodes in the search tree

7 Alpha and Beta values
- At a Max node we will store an alpha value
  - the alpha value is a lower bound on the exact minimax score
  - the true value might be ≥ α
  - if we know Min can choose moves with score < α
    - then Min will never choose to let Max go to a node where the score will be α or more
- At a Min node, the β value is similar but opposite
- Alpha-Beta search uses these values to cut search

8 Alpha Beta in Action
- Why can we cut off search?
- Beta = 1 < alpha = 2, where the alpha value is at an ancestor node
- At the ancestor node, Max had a choice to get a score of at least 2 (maybe more)
- Max is not going to move right to let Min guarantee a score of 1 (maybe less)

9 Alpha and Beta values
- A Max node has an α value
  - the alpha value is a lower bound on the exact minimax score
  - with best play Max can guarantee scoring at least α
- A Min node has a β value
  - the beta value is an upper bound on the exact minimax score
  - with best play Min can guarantee scoring no more than β
- At a Max node, if an ancestor Min node has β < α
  - Min's best play must never let Max move to this node
    - therefore this node is irrelevant
  - if β = α, Min can do as well without letting Max get here
    - so again we need not continue

10 Alpha-Beta Pruning Rule
- Two key points:
  - alpha values can never decrease
  - beta values can never increase
- Search can be discontinued at a node if:
  - it is a Max node and
    - the alpha value is ≥ the beta of any Min ancestor
    - this is beta cutoff
  - or it is a Min node and
    - the beta value is ≤ the alpha of any Max ancestor
    - this is alpha cutoff
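A sketch of this pruning rule in code, using the same toy nested-list trees as the MiniMax sketch above (an illustrative assumption, not the slides' own implementation). Passing α and β down the recursion is what lets a node compare itself against the bounds established by its Max and Min ancestors.

```python
def alphabeta(node, alpha=float("-inf"), beta=float("inf"), maximising=True):
    """Return the final backed-up value of `node` under alpha-beta search.

    alpha = best score a Max ancestor (or this node) can already guarantee;
    beta  = best (i.e. lowest) score a Min ancestor can already guarantee.
    """
    if not isinstance(node, list):                 # leaf: static evaluation score
        return node
    if maximising:
        value = float("-inf")
        for child in node:
            value = max(value, alphabeta(child, alpha, beta, False))
            alpha = max(alpha, value)              # alpha never decreases
            if alpha >= beta:                      # beta cutoff
                break
        return value
    else:
        value = float("inf")
        for child in node:
            value = min(value, alphabeta(child, alpha, beta, True))
            beta = min(beta, value)                # beta never increases
            if beta <= alpha:                      # alpha cutoff
                break
        return value
```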

11 Calculating Alpha-Beta values
- Alpha-Beta calculations are similar to MiniMax
  - but the pruning rule cuts down search
- Use the concept of the 'final backed up value' of a node
  - this might be the minimax value
  - or it might be an approximation where search was cut off
    - less than the true minimax value at a Max node
    - more than the true minimax value at a Min node
    - in either case, we don't need to know the true value

12 Final backed up value
- Like MiniMax
- At a Max node:
  - the α value is equal to the largest final backed up value of its successors
    - this can be all successors (if no beta cutoff)
    - or all successors used until beta cutoff occurs
- At a Min node:
  - the β value is equal to the smallest final backed up value of its successors
    - the min of all successors used until alpha cutoff occurs
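A small worked example of the `alphabeta` sketch above on a made-up tree (the numbers are illustrative, not the slides' example), showing a final backed-up value produced by an alpha cutoff:

```python
# Max to move at the root; each inner list is a Min node; numbers are
# static-evaluation scores at the leaves.
tree = [[3, 5], [2, 0]]

print(alphabeta(tree))   # -> 3
# The left Min node backs up min(3, 5) = 3, so the root's alpha becomes 3.
# At the right Min node the first leaf gives beta = 2 <= alpha = 3: an alpha
# cutoff.  The leaf scoring 0 is never examined, so that node's final backed-up
# value (2) is more than its true minimax value (0), which doesn't matter
# because Max will never move there.
```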

13 Calculating alpha values
- At a Max node
  - after we obtain the final backed up value of the first child
    - we can set α of the node to this value
  - when we get the final backed up value of the second child
    - we can increase α if the new value is larger
  - when we have the final child, or if beta cutoff occurs
    - the stored α becomes the final backed up value
    - only then can we set the β of the parent Min node
    - only then can we guarantee that α will not increase
- Note the difference
  - setting the alpha value of the current node as we go along
  - vs. propagating the value up only when it is finalised

14 Calculating beta values
- At a Min node
  - after we obtain the final backed up value of the first child
    - we can set β of the node to this value
  - when we get the final backed up value of the second child
    - we can decrease β if the new value is smaller
  - when we have the final child, or if alpha cutoff occurs
    - the stored β becomes the final backed up value
    - only then can we set the α of the parent Max node
    - only then can we guarantee that β will not decrease
- Note the difference
  - setting the beta value of the current node as we go along
  - vs. propagating the value up only when it is finalised
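To make the "update as we go vs. propagate only when finalised" distinction concrete, here is a hand trace of one Max node in the `alphabeta` sketch above, with made-up numbers:

```python
# Suppose a Max node inherits beta = 6 from a Min ancestor, and its children's
# final backed-up values arrive in the order 2, 7, ...
#
#   after child 1: value = 2, alpha = max(-inf, 2) = 2   (2 < 6, keep searching)
#   after child 2: value = 7, alpha = max(2, 7)    = 7   (7 >= 6: beta cutoff)
#
# The remaining children are never searched.  Only now is 7 propagated upward
# as the node's final backed-up value; it may understate the true minimax score,
# but the Min ancestor will never let play reach this node anyway.
```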

15 Move ordering Heuristics
- Variable ordering heuristics are irrelevant
- Value ordering heuristics = move ordering heuristics
- The optimal move ordering heuristic for alpha-beta...
  - ... is to consider the best move first
  - i.e. test the move which will turn out to have the best final backed up value
  - of course this is impossible in practice
- The pessimal move ordering heuristic...
  - ... is to consider the worst move first
  - i.e. test the move which will have the worst final backed up value

16 Move ordering Heuristics
- In practice we need quick and dirty heuristics
  - these will be neither optimal nor pessimal
- E.g. order moves by the static evaluation function
  - if it's reasonable, the most promising moves are likely to give good scores
  - should be nearer optimal than random
- If the static evaluation function is expensive
  - we need even quicker heuristics
- In practice move ordering heuristics are vital
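A sketch of the "order moves by static evaluation" idea. The `successors` and `static_eval` callables are assumed interfaces (the slides do not define them); the point is only that the most promising-looking child is searched first.

```python
def ordered_children(position, successors, static_eval, maximising):
    """Return the children of `position` sorted so that the most promising
    (by the cheap static evaluation) is searched first: highest-scoring
    first at a Max node, lowest-scoring first at a Min node."""
    return sorted(successors(position), key=static_eval, reverse=maximising)
```

Sorting costs one static evaluation per child, which is why the slide notes that an expensive evaluation function calls for even quicker ordering heuristics.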

17 Theoretical Results
- With pessimal move ordering,
  - alpha-beta makes no reduction in search cost
- With optimal move ordering
  - alpha-beta cuts the amount of search to the square root
  - i.e. from b^d to √(b^d) = b^(d/2)
  - equivalently, we can search to twice the depth
    - at the same cost
- With heuristics, performance is in between
- Alpha-beta search is vital to successful computer play in 2-player perfect information games
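A concrete, purely illustrative instance of the b^d to b^(d/2) result, using a chess-like branching factor of about 35 at depth 8:

```python
b, d = 35, 8          # illustrative branching factor and search depth
print(b ** d)         # MiniMax: 2,251,875,390,625 nodes (about 2.3e12)
print(b ** (d // 2))  # optimally ordered alpha-beta: 1,500,625 nodes (about 1.5e6)
```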

18 Summary and Next Lecture
- Game trees are similar to search trees
  - but have opposing players
- MiniMax characterises the value of nodes in the tree
  - but is horribly inefficient
- Use static evaluation when the tree is too big
- Alpha-beta can cut off nodes that need not be searched
  - can allow search up to twice as deep as minimax
- Next time:
  - Chinook, the world champion Checkers player

