CSCI 4310 Lecture 6: Adversarial Tree Search
Text: Winston, Chapter 6
Adversaries: Many games involve two players, i.e., competing agents. Our previous trees explored a search space; now the tree must model an adversary.
Adversaries: Now, alternating levels of the tree represent each player's turn: P1 decision, P2 decision, P1 decision, ...
Adversaries: Make choices based on a scoring function for each node. Ideally, fully expand the tree and follow the choice that leads to the best subtree.
Adversaries: (Figure: a completed game.)
Scoring path outcomes: Domain specific; we need expert advice. Tic-tac-toe is easy: maximize winning possibilities. (Figure: a position with a score of 3 for X.)
Scoring path outcomes: Other games are much more difficult, or have no apparent best strategy. (Figure: a position with a score of 4 for X.)
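The tic-tac-toe idea above, counting the winning lines still open for a player, can be sketched as follows. This is a hypothetical helper, not from the text; the board encoding (3x3 lists of 'X'/'O'/' ') is an assumption:

```python
# All eight winning lines of tic-tac-toe, as (row, col) coordinates.
LINES = [
    [(0, 0), (0, 1), (0, 2)], [(1, 0), (1, 1), (1, 2)], [(2, 0), (2, 1), (2, 2)],  # rows
    [(0, 0), (1, 0), (2, 0)], [(0, 1), (1, 1), (2, 1)], [(0, 2), (1, 2), (2, 2)],  # cols
    [(0, 0), (1, 1), (2, 2)], [(0, 2), (1, 1), (2, 0)],                            # diagonals
]

def open_lines(board, player):
    """Score a position for `player` by counting lines with no opponent piece."""
    opponent = 'O' if player == 'X' else 'X'
    return sum(1 for line in LINES
               if all(board[r][c] != opponent for r, c in line))

# Example: O took a corner, X took the center.  O blocks 3 of X's lines,
# leaving 5 open for X; X's center blocks 4 of O's lines, leaving 4 for O.
board = [['O', ' ', ' '],
         [' ', 'X', ' '],
         [' ', ' ', ' ']]
```

On an empty board every line is open, so `open_lines` returns 8 for either player.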
Scoring path outcomes: Tuning parameters are important. Piece count often underperforms; it is not effective in chess, unless you are playing against a chimp. Positional strategy is more involved and works well in Othello (see the Moriarty paper on neural-network Othello play). Computers crush the best human Othello players.
Scoring path outcomes: Make it fast (we will do this often). Every potential game along a path we have not pruned will need to be evaluated by the scoring function.
Minimax: P1 is maximizing. The maximizer (P1) must take P2's decisions into account. (Figure: P1 decision = MAX, P2 decision = MIN, P1 decision = MAX; leaf scores s_g1 = 2, s_g2 = 7, s_g3 = 1, s_g4 = 8; the MIN nodes take values Min = 2 and Min = 1.)
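A minimal minimax sketch over the figure's tree. Encoding game states as nested lists with scored leaves is my assumption for illustration, not the book's representation:

```python
def minimax(node, maximizing):
    """Full-depth minimax: leaves are scores, internal nodes are lists."""
    if not isinstance(node, list):       # leaf: already scored
        return node
    values = [minimax(child, not maximizing) for child in node]
    return max(values) if maximizing else min(values)

# The figure's leaf scores: s_g1=2, s_g2=7, s_g3=1, s_g4=8.
# MIN takes min(2, 7) = 2 on the left and min(1, 8) = 1 on the right,
# so MAX at the root picks the left subtree with value 2.
tree = [[2, 7], [1, 8]]
```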
Minimax: In practice, we limit the search depth. We assume that, if we are the maximizing player, the other player will minimize. This can lead to poor play against poor players: assuming the opponent always plays the minimizing move is sometimes risky.
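A depth-limited sketch of the idea above: when the depth budget runs out, fall back on the static scoring function instead of recursing to the leaves. The `score` and `successors` hooks are hypothetical names for game-supplied functions:

```python
def minimax_depth_limited(state, depth, maximizing, score, successors):
    # Generic depth-limited minimax: `successors(state)` yields child
    # states (empty for terminal positions); `score(state)` is the
    # domain-specific evaluation function.
    children = successors(state)
    if depth == 0 or not children:
        return score(state)              # cut off: trust the heuristic here
    values = [minimax_depth_limited(c, depth - 1, not maximizing, score, successors)
              for c in children]
    return max(values) if maximizing else min(values)

# Toy hooks over a nested-list tree: internal nodes are lists, leaves are scores.
succ = lambda s: s if isinstance(s, list) else []
score = lambda s: s if isinstance(s, (int, float)) else 0
```

With `depth` at least the tree height this reduces to plain minimax; with `depth = 0` it just returns the static score of the current state.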
Zugzwang (from Wikipedia): a situation where someone is forced to act, but would prefer not to. In chess, zugzwang occurs when one player is put at a disadvantage because he has to make a move.
α-β pruning: We do not need to evaluate every possibility down to the leaves of the tree. Book: if you have an idea that is surely bad, do not take time to see how truly awful it is.
α-β pruning: worked example on a three-level tree (MAX at the root, MIN below it, MAX below that). The figure is walked through step by step:
DFS starts by exploring the leftmost child of each node; the scoring function assigns values to the leaves (8, 7, 2 in the first group).
The maximizer will choose 8 at this node.
The next leaf is already better for the maximizer than the 8 of the sibling node. But...
...the minimizer will never choose this path over the 8, so we need not score these nodes.
The minimizer one level above only cares whether this subtree is < 8.
It is, after the first node (value = 2), so we proceed. The minimizer will choose 4 here; will the maximizer at the root?
We need to score the leftmost branch of the center subtree for a baseline.
Once we hit 9, the minimizer at the level above will never choose this node; there is no need to evaluate its siblings.
The minimizer one level above wants a maximum of 4 from the children of this node.
Oops: we can ignore the siblings (this happens twice in the figure).
The maximizer at the root wants ≥ 6, which means...
...the minimum of these siblings must be ≥ 6, which means...
...the maximum of these groups must be ≥ 6. That did not work, so...
...we discontinue further evaluation.
The maximizer will follow the middle path. Alpha-beta pruned 11 of the 27 nodes (about 40%) that this three-level minimax search would otherwise evaluate.
α-β pruning: The effectiveness of alpha-beta pruning depends on the ordering of successor nodes in the minimax DFS. We can get lucky from the perspective of MIN or MAX.
α-β pruning: Plain minimax is O(b^d), with branching factor b and search depth d. With alpha-beta and a perfect ordering of successors, this drops to O(b^(d/2)); a realistic figure is O(b^(3d/4)). The computational savings may allow a larger d in simulations.
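To see what the exponent buys, plug in chess-like numbers (b = 35 is an illustrative branching factor, not from the slides):

```python
b, d = 35, 6
plain = b ** d                 # minimax: b^d ~= 1.8 billion nodes
perfect = b ** (d // 2)        # alpha-beta, perfect ordering: b^(d/2) = 42,875 nodes
realistic = b ** (3 * d / 4)   # alpha-beta, realistic ordering: b^(3d/4) ~= 8.9 million

# Equivalently: for the node budget of plain minimax to depth d,
# perfectly ordered alpha-beta could search to depth 2d.
```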
α-β pruning: Recursive algorithm on p. 110. Or, see a nice explanation with application to Othello.
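One common recursive formulation, again over nested-list trees. This is a sketch; the book's p. 110 version may differ in details:

```python
import math

def alphabeta(node, alpha, beta, maximizing):
    """Depth-first minimax with alpha-beta cutoffs.

    alpha: best value MAX can already guarantee on the path to the root.
    beta:  best value MIN can already guarantee on the path to the root.
    """
    if not isinstance(node, list):       # leaf: already scored
        return node
    if maximizing:
        value = -math.inf
        for child in node:
            value = max(value, alphabeta(child, alpha, beta, False))
            alpha = max(alpha, value)
            if alpha >= beta:            # beta cutoff: MIN above won't allow this
                break
        return value
    else:
        value = math.inf
        for child in node:
            value = min(value, alphabeta(child, alpha, beta, True))
            beta = min(beta, value)
            if beta <= alpha:            # alpha cutoff: MAX above won't allow this
                break
        return value
```

Called with the window `(-math.inf, math.inf)` at the root, it returns the same value as plain minimax while skipping subtrees that cannot affect the root's choice.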
Horizon effect: Not seeing the big picture. Going for quick points may lead to a state that is worse than the original, even with the extra points. We did not search deep enough; we did not have time.
Horizon effect: This problem occurs in many algorithms; attempting to avoid local optima may be considered a defining characteristic of many of them. Neural networks can be overtrained, GAs can converge too soon, greedy search has similar problems, etc.
Horizon effect: P. 114 describes heuristics that attempt to mitigate the horizon effect.
McAllester's conspiracy numbers: Nodes that, if changed, could cause different choices higher in the search are expanded for further evaluation. David Allen McAllester, "Conspiracy numbers for min-max search," Artificial Intelligence, 35(3), July 1988.