Adversarial Search CMPT 420 / CMPG 720.

Adversarial Search CMPT 420 / CMPG 720

Outline Game playing Game trees Minimax Alpha-beta pruning

Games vs. search problems
competitive environments: agents’ goals are in conflict adversarial search problems (games)

Types of Games chess, checkers, go, othello backgammon, monopoly
deterministic chance perfect information chess, checkers, go, othello backgammon, monopoly bridge, poker imperfect information

Games deterministic, fully-observable, turn-taking, two–player, zero-sum games Utility values at the end are equal and opposite Tic-tac-toe

Game Search Formulation
Two players MAX and MIN take turns (with MAX playing first) S0: Player(s): Action(s): Result(s,a): Terminal-test(s): Utility(s,p):

S0: initial state Player(s): Action(s): Result(s,a): Terminal-test(s): Utility(s,p):

S0: initial state Player(s): which player has the move in a state Action(s): Result(s,a): Terminal-test(s): Utility(s,p):

S0: initial state Player(s): which player has the move in a state Action(s): set of legal moves in a state Result(s,a): Terminal-test(s): Utility(s,p):

S0: initial state Player(s): which player has the move in a state Action(s): set of legal moves in a state Result(s,a): transition model Terminal-test(s): Utility(s,p):

S0: initial state Player(s): which player has the move in a state Action(s): set of legal moves in a state Result(s,a): transition model Terminal-test(s): true/false (terminal states) Utility(s,p):

S0: initial state Player(s): which player has the move in a state Actions(s): set of legal moves in a state Result(s,a): transition model Terminal-test(s): true/false (terminal states) Utility(s,p): utility function defines the final value of a game that ends in terminal state s for a player p zero-sum games: same total payoff

Game tree (1-player)

Partial Game Tree for Tic-Tac-Toe

Optimal strategies MAX uses search tree to determine next move.
Assumption: Both players play optimally!! Given a game tree, the optimal strategy can be determined by using the minimax value of each node

Minimax The minimax value of a node is the utility (for Max) of being in the corresponding state, assuming that both players play optimally. Minimax(s) = if Terminal-test(s) if Player(s) = Max if Player(s) = Min

Minimax The minimax value of a node is the utility (for Max) of being in the corresponding state, assuming that both players play optimally. Minimax(s) = Utility (s) if Terminal-test(s) max of Minimax(Result(s,a)) if Player(s) = Max min of Minimax(Result(s,a)) if Player(s) = Min

Optimal Play 2 7 1 8 2 7 1 8 2 7 1 8 2 7 1 8 2 7 1 8 This is the optimal play MAX MIN

Two-Ply Game Tree

Two-Ply Game Tree Minimax maximizes the worst-case outcome for max.
The minimax decision Minimax maximizes the worst-case outcome for max.

What if MIN does not play optimally?
Definition of optimal play for MAX assumes MIN plays optimally: maximizes worst-case outcome for MAX. But if MIN does not play optimally, MAX can do even better.

Minimax Algorithm function MINIMAX-DECISION(state) returns an action
inputs: state, current state in game vMAX-VALUE(state) return the action in SUCCESSORS(state) with value v function MAX-VALUE(state) returns a utility value if TERMINAL-TEST(state) then return UTILITY(state) v  -∞ for a,s in SUCCESSORS(state) do v  MAX(v,MIN-VALUE(s)) return v function MIN-VALUE(state) returns a utility value if TERMINAL-TEST(state) then return UTILITY(state) v  ∞ for a,s in SUCCESSORS(state) do v  MIN(v,MAX-VALUE(s)) return v

Properties of minimax Complete? Optimal? Time complexity?
Yes (if tree is finite) Optimal? Yes (against an optimal opponent) Time complexity? O(bm) Space complexity? O(bm) (depth-first exploration) For chess, b ≈ 35, m ≈100 for "reasonable" games  exact solution is infeasible

Alpha-Beta Pruning Problem with minimax search: exponential in the depth of the tree Can we cut it in half? It is possible to compute the minimax decision without looking at every node. pruning: eliminate some parts of the tree

Alpha-beta pruning We can improve on the performance of the minimax algorithm through alpha-beta pruning MAX MIN MAX 2 7 1 ?

Alpha-beta pruning We can improve on the performance of the minimax algorithm through alpha-beta pruning MAX We don’t need to compute the value at this node. No matter what it is, it can’t affect the value of the root node. MIN MAX 2 7 1 ?

Alpha-Beta Example Do DFS until the first leaf [-∞,+∞] [-∞, +∞]
Range of possible values [-∞,+∞] [-∞, +∞]

Alpha-Beta Example Do DFS until first leaf [-∞,+∞] [-∞, +∞]
Range of possible values [-∞,+∞] [-∞, +∞]

Alpha-Beta Example (continued)
[-∞,+∞] [-∞,3]

[-∞,+∞] [3,3]

[3,+∞] [3,3]

[3,+∞] [3,3] [-∞, ∞]

[3,+∞] [3,3] [-∞,2]

[3,+∞] This node is worse for MAX [3,3] [-∞,2]

, [3,14] [3,3] [-∞,2] [-∞, ∞]

, [3,14] [3,3] [-∞,2] [-∞,14]

, [3,5] [3,3] [−∞,2] [-∞,5]

[3,3] [−∞,2] [2,2]

[3,3] [3,3] [-∞,2] [2,2]

α-β pruning example Minimax(root)
= max(min(3,12,8),min(2,x,y),min(14,5,2)) = max(3,min(2,x,y),2) = 3

α-β pruning We made the same minimax decision without ever evaluating two of the leaf nodes! They are independent. It is possible to prune entire subtrees.

Why is it called α-β? α = value of the best choice found so far at any choice point along the path for max If v is worse than α, max will avoid it  prune that branch Define β similarly for min

Alpha-Beta Algorithm function ALPHA-BETA-SEARCH(state) returns an action inputs: state, current state in game vMAX-VALUE(state, - ∞ , +∞) return the action in SUCCESSORS(state) with value v function MAX-VALUE(state, , ) returns a utility value if TERMINAL-TEST(state) then return UTILITY(state) v  - ∞ for a,s in SUCCESSORS(state) do v  MAX(v,MIN-VALUE(s,  , )) if v ≥  then return v   MAX( ,v) return v

Alpha-Beta Algorithm function MIN-VALUE(state,  , ) returns a utility value if TERMINAL-TEST(state) then return UTILITY(state) v  + ∞ for a,s in SUCCESSORS(state) do v  MIN(v,MAX-VALUE(s,  , )) if v ≤  then return v   MIN( ,v) return v

Comments: Alpha-Beta Pruning
Pruning does not affect the final results. Entire subtrees can be pruned. Good move ordering improves effectiveness of pruning. With “perfect ordering,” time complexity is O(bm/2) Alpha-beta pruning can look twice as far as minimax in the same amount of time

Deterministic games in practice
Checkers: Chinook ended 40-year-reign of human world champion Marion Tinsley in Used a precomputed endgame database defining perfect play for all positions involving 8 or fewer pieces on the board, a total of 444 billion positions. Chess: Deep Blue defeated human world champion Garry Kasparov in a six-game match in Deep Blue searches 200 million positions per second, uses very sophisticated evaluation, and undisclosed methods for extending some lines of search up to 40 ply. Othello: Logistello defeated the human world champion. It is generally acknowledged that human are no match for computers at Othello.

Adversarial Search CMPT 420 / CMPG 720.

Similar presentations

Presentation on theme: "Adversarial Search CMPT 420 / CMPG 720."— Presentation transcript:

Similar presentations

About project

Feedback

Log in

Auth with social network:

Adversarial Search CMPT 420 / CMPG 720.

Similar presentations

Presentation on theme: "Adversarial Search CMPT 420 / CMPG 720."— Presentation transcript:

Similar presentations

About project

Feedback