# Adversarial Search Reference: “Artificial Intelligence: A Modern Approach, 3rd ed.” (Russell and Norvig)


## Goal

Find the best move to make in a two-agent, zero-sum game.

- Zero-sum: win = +1, lose = -1, so player 1's score + player 2's score = 0
- Ideally, do this as quickly as possible

Terms:

- MAX = us; we're trying to maximize our score
- MIN = opponent; they're trying to minimize our score

## Brute-Force (minimax)

Given B (the current board state), create a search tree: B's children B1 … Bn are the boards after each of MAX's possible moves; their children B1,1 … B1,m, B2,1 … B2,p are the boards after each of MIN's replies; and so on, alternating MAX and MIN levels down to terminal boards (e.g. a leaf Bq,r scored +1, a win for MAX). [Slide shows this as a tree diagram.]

## Problems

A lot of states to calculate and evaluate!

- For tic-tac-toe, at most 9! = 362,880 states
- For chess, over 10^40 states (zillions of years to calculate)

We may need to limit the ply (the number of times both MIN and MAX move):

- Cuts down on the search tree size
- But then we're not always seeing the game to its end
- Often necessitates a heuristic score of the board (from MAX's point of view)

Also, there are many win/loss cases; which is best? If we get to the win through nodes where MIN picks their best move, we stand a better chance of winning.

## Minimax algorithm

Let's say the heuristics (shown beside the boxes on the slide) look like this, from MAX's point of view, with a 1-ply look-ahead: MAX's moves lead to B1, B2, B3, and MIN's replies from those lead to leaves scored (3, 12, 8), (2, 4, 6), and (14, 5, 2) respectively.

MIN wants to minimize the score, so they would choose the lowest value on their turn(s): 3, 2, and 2. MAX wants to maximize the score, so they would choose the highest of those: 3. So, against an optimal opponent, MAX will get a score of 3 if they make move #1. The values are "backed up" the tree.

## Analysis

Always picks the optimal solution (assuming the heuristic is good). But it does a complete depth-first traversal of states (up to the max ply).

## Another way of looking at minimax

minimax(B) = max(min(3, 12, 8), min(2, 4, 6), min(14, 5, 2)) = max(3, 2, 2) = 3

But notice what happens if we hadn't evaluated the 4 or the 6:

minimax(B) = max(min(3, 12, 8), min(2, x, y), min(14, 5, 2))
= max(3, min(2, x, y), 2)
= max(3, z, 2), where z ≤ 2 (why? because min(2, x, y) can never exceed 2)
= 3

Because the first branch yields 3 and the second branch is at best 2, MAX would never choose the second branch, so its remaining leaves don't matter. The trick is: how can we determine this algorithmically?

## alpha-beta search

Track these two values (each recursive call has its own copy):

- α: the best (highest) value found so far for MAX along the path through this node
- β: the best (lowest) value found so far for MIN along the path through this node

Together these bound the range of values MAX can expect if the game goes through this node.

## alpha-beta search, cont.

If looking at a MAX node:

- Possibly update α (if a child branch's value is higher)
- Terminate early if we see a child branch whose value is at least β
- Return the maximal child value that we looked at [and the action]

If looking at a MIN node:

- Possibly update β (if a child branch's value is lower)
- Terminate early if we see a child branch whose value is at most α
- Return the minimal child value that we looked at [and the action]

## alpha-beta algorithm

    def alpha_beta(state):
        v = max_value(state, -∞, +∞)
        return move with value v

    def max_value(state, α, β):
        if ending_state(state): return value(state)
        v = -∞
        for each move in actions(state):
            r = result(state, move)
            v = max(v, min_value(r, α, β))
            if v ≥ β: return v
            α = max(α, v)
        return v

    def min_value(state, α, β):
        if ending_state(state): return value(state)
        v = +∞
        for each move in actions(state):
            r = result(state, move)
            v = min(v, max_value(r, α, β))
            if v ≤ α: return v
            β = min(β, v)
        return v
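A runnable Python rendering of the pseudocode above. As an assumed stand-in for a real game, the tree is a nested list whose leaves are heuristic values: "is a number" plays the role of ending_state, and iterating over child subtrees plays the role of actions/result.

```python
import math

def max_value(node, alpha, beta):
    if isinstance(node, (int, float)):  # ending_state: a leaf
        return node
    v = -math.inf
    for child in node:                  # actions/result: child subtrees
        v = max(v, min_value(child, alpha, beta))
        if v >= beta:                   # MIN above would never allow this
            return v
        alpha = max(alpha, v)
    return v

def min_value(node, alpha, beta):
    if isinstance(node, (int, float)):
        return node
    v = math.inf
    for child in node:
        v = min(v, max_value(child, alpha, beta))
        if v <= alpha:                  # MAX above would never allow this
            return v
        beta = min(beta, v)
    return v

def alpha_beta(tree):
    return max_value(tree, -math.inf, math.inf)

print(alpha_beta([[3, 12, 8], [2, 4, 6], [14, 5, 2]]))  # 3
```

On this tree the 4 and 6 leaves are never examined: once the second branch's running minimum drops to 2 (≤ α = 3), min_value returns immediately.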

## alpha-beta trace (example)

The sequence of calls on the slide's example tree (root A, MAX to move; the tree diagram itself is not reproduced here). Note a MIN node starts with v = +∞ and a MAX node with v = -∞:

- max_value(A) [α = -∞, β = ∞, v = -∞]
- min_value(B) [α = -∞, β = ∞, v = +∞]
- max_value(C) [α = -∞, β = ∞, v = -∞]
- min_value(D) [α = -∞, β = ∞]
- min_value(E) [α = 5, β = ∞]
- min_value(F) [α = 5, β = ∞]
- max_value(G) [α = -∞, β = 8, v = -∞]
- min_value(H) [α = -∞, β = 8] — breaks out of the loop early b/c v (12) ≥ β (8)
- min_value(K) [α = 8, β = ∞, v = +∞]
- max_value(L) [α = 8, β = ∞, v = -∞]
- min_value(M) [α = 8, β = ∞]
- min_value(N) [α = 8, β = ∞] — breaks out of the loop early b/c v (4) ≤ α (8)

Result: the backed-up values reach the root, and MAX chooses the move A -> B.

## Analysis

Alpha-beta pruning can shave off some state checks.

Move ordering:

- Alpha-beta does best when the most promising moves are examined first: lowest => highest for MIN nodes, highest => lowest for MAX nodes.
- Sometimes it's possible to order moves, e.g. in chess: captures first, then threats, then forward moves, then backward moves.
- Sometimes you can't, though.

Worst case: alpha-beta pruning prunes nothing, and you're back to plain minimax.

Cutoff-depth (or time) constraints still apply, as with minimax.
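A small, illustrative experiment on the move-ordering point: count how many leaves alpha-beta evaluates when the best branch is searched first versus last. The trees are nested lists of leaf heuristic values, an assumed toy representation (not from the slides).

```python
import math

def search(node, alpha, beta, maximizing, counter):
    if isinstance(node, (int, float)):
        counter[0] += 1                 # count each leaf we evaluate
        return node
    if maximizing:
        v = -math.inf
        for child in node:
            v = max(v, search(child, alpha, beta, False, counter))
            if v >= beta:
                return v
            alpha = max(alpha, v)
        return v
    v = math.inf
    for child in node:
        v = min(v, search(child, alpha, beta, True, counter))
        if v <= alpha:
            return v
        beta = min(beta, v)
    return v

def count_leaves(tree):
    counter = [0]
    search(tree, -math.inf, math.inf, True, counter)
    return counter[0]

good = [[3, 12, 8], [2, 4, 6], [14, 5, 2]]  # best MAX branch first
bad = [[2, 4, 6], [14, 5, 2], [3, 12, 8]]   # best MAX branch last
print(count_leaves(good), count_leaves(bad))  # 7 9
```

Same game, same answer (3), but searching the strongest branch first raises α sooner, so later branches get cut off after fewer leaf evaluations.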

## Games of Chance (stochastic games)

## "Modern" Applications

Deep Blue (IBM, 1997) beat Garry Kasparov. Algorithms (in the lineage of Chess 4.0):

- a playbook of common opening and closing moves
- alpha-beta search
- quiescence search: searching branches that look "promising" (by the heuristic) a bit deeper; helps avoid the horizon problem
- a few more optimizations
