Adversarial Search CMPT 420 / CMPG 720.

1 Adversarial Search CMPT 420 / CMPG 720

2

3 Outline
Game playing
Game trees
Minimax
Alpha-beta pruning

4 Games vs. search problems
Competitive environments, in which the agents’ goals are in conflict, give rise to adversarial search problems (games).

5 Types of Games
                        deterministic                  chance
perfect information     chess, checkers, go, othello   backgammon, monopoly
imperfect information                                  bridge, poker

6 Games We consider deterministic, fully observable, turn-taking, two-player, zero-sum games. Utility values at the end of the game are equal and opposite. Example: tic-tac-toe.

7 Game Search Formulation
Two players, MAX and MIN, take turns, with MAX playing first. A game is defined by six elements: S0, Player(s), Actions(s), Result(s,a), Terminal-test(s), and Utility(s,p).

13 Game Search Formulation
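S0: initial state
Player(s): which player has the move in state s
Actions(s): the set of legal moves in state s
Result(s,a): transition model, the state that results from making move a in state s
Terminal-test(s): true if the game is over in state s, false otherwise (states where it is true are terminal states)
Utility(s,p): utility function, the final numeric value for player p of a game that ends in terminal state s
Zero-sum games: the total payoff to all players is the same for every outcome of the game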
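The six elements above can be sketched for tic-tac-toe as follows. This is only an illustration, not code from the slides: the board encoding (a 9-tuple in row-major order), the marks "X" for MAX and "O" for MIN, the helper `winner`, and the utility values +1 / 0 / -1 are all assumptions.

```python
from typing import List, Optional, Tuple

# Assumed encoding: 9 cells in row-major order, None means empty.
State = Tuple[Optional[str], ...]

INITIAL_STATE: State = (None,) * 9  # S0: the empty board

# Index triples that form a row, a column, or a diagonal.
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
         (0, 3, 6), (1, 4, 7), (2, 5, 8),
         (0, 4, 8), (2, 4, 6)]


def player(s: State) -> str:
    """Player(s): MAX ("X") moves first, so X moves when the counts are equal."""
    return "X" if s.count("X") == s.count("O") else "O"


def actions(s: State) -> List[int]:
    """Actions(s): indices of the empty cells."""
    return [i for i, cell in enumerate(s) if cell is None]


def result(s: State, a: int) -> State:
    """Result(s,a): the board after the player to move marks cell a."""
    return s[:a] + (player(s),) + s[a + 1:]


def winner(s: State) -> Optional[str]:
    """Hypothetical helper: the mark that owns a full line, if any."""
    for i, j, k in LINES:
        if s[i] is not None and s[i] == s[j] == s[k]:
            return s[i]
    return None


def terminal_test(s: State) -> bool:
    """Terminal-test(s): someone has won or the board is full."""
    return winner(s) is not None or all(cell is not None for cell in s)


def utility(s: State, p: str) -> int:
    """Utility(s,p): +1 for a win, -1 for a loss, 0 for a draw (assumed scale)."""
    w = winner(s)
    if w is None:
        return 0
    return 1 if w == p else -1
```

With this scale, `utility(s, "X") + utility(s, "O")` is 0 in every terminal state, which is the zero-sum property stated on the slide.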

14 Game tree (1-player)

15

16 Partial Game Tree for Tic-Tac-Toe

17 Optimal strategies MAX uses the search tree to determine its next move.
Assumption: both players play optimally. Given a game tree, the optimal strategy can be determined from the minimax value of each node.

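19 Minimax The minimax value of a node is the utility (for MAX) of being in the corresponding state, assuming that both players play optimally.
$$
\text{Minimax}(s) =
\begin{cases}
\text{Utility}(s) & \text{if Terminal-test}(s) \\
\max_{a \in \text{Actions}(s)} \text{Minimax}(\text{Result}(s,a)) & \text{if Player}(s) = \text{MAX} \\
\min_{a \in \text{Actions}(s)} \text{Minimax}(\text{Result}(s,a)) & \text{if Player}(s) = \text{MIN}
\end{cases}
$$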
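The recursive definition can be transcribed almost directly. The sketch below assumes a game tree given as nested Python lists whose leaves are utilities for MAX, with MAX and MIN alternating by level; that encoding and the example tree (leaves 2, 7, 1, 8 under two MIN nodes) are assumptions made for illustration.

```python
def minimax(node, max_to_move=True):
    """Return the minimax value of a nested-list game tree.

    A leaf (a number) is a terminal state whose value is its utility for MAX.
    An inner node (a list) holds the states reachable in one move.
    """
    if isinstance(node, (int, float)):      # Terminal-test(s)
        return node                         # Utility(s)
    values = [minimax(child, not max_to_move) for child in node]
    return max(values) if max_to_move else min(values)


# Assumed shape: MAX at the root, two MIN nodes, four leaves.
tree = [[2, 7], [1, 8]]
root_value = minimax(tree)  # max(min(2, 7), min(1, 8))
```

Here the left MIN node is worth 2 and the right one is worth 1, so the root is worth 2.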

20 Optimal Play [Figure: game tree with alternating MAX and MIN levels and leaf values 2, 7, 1, 8; the optimal play is highlighted.]

21 Two-Ply Game Tree

22 Two-Ply Game Tree

23 Two-Ply Game Tree

24 Two-Ply Game Tree The minimax decision
Minimax maximizes the worst-case outcome for MAX.

25 What if MIN does not play optimally?
Definition of optimal play for MAX assumes MIN plays optimally: maximizes worst-case outcome for MAX. But if MIN does not play optimally, MAX can do even better.

26 Minimax Algorithm
function MINIMAX-DECISION(state) returns an action
    inputs: state, current state in game
    v ← MAX-VALUE(state)
    return the action in SUCCESSORS(state) with value v

function MAX-VALUE(state) returns a utility value
    if TERMINAL-TEST(state) then return UTILITY(state)
    v ← -∞
    for a, s in SUCCESSORS(state) do
        v ← MAX(v, MIN-VALUE(s))
    return v

function MIN-VALUE(state) returns a utility value
    if TERMINAL-TEST(state) then return UTILITY(state)
    v ← +∞
    for a, s in SUCCESSORS(state) do
        v ← MIN(v, MAX-VALUE(s))
    return v
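A runnable sketch of the pseudocode above, under stated assumptions: the game tree is a nested dict mapping hypothetical action names (a1, b1, ...) to successor states, leaves are utilities for MAX, and the leaf values are borrowed from the α-β example in these slides with 4 and 6 assumed for the unspecified x and y.

```python
import math

# Hypothetical two-ply game: MAX picks a1..a3, then MIN picks a reply.
TREE = {
    "a1": {"b1": 3, "b2": 12, "b3": 8},
    "a2": {"c1": 2, "c2": 4, "c3": 6},   # 4 and 6 are assumed values
    "a3": {"d1": 14, "d2": 5, "d3": 2},
}


def terminal_test(state):
    return not isinstance(state, dict)


def utility(state):
    return state


def successors(state):
    """Yield (action, resulting state) pairs."""
    return state.items()


def max_value(state):
    if terminal_test(state):
        return utility(state)
    v = -math.inf
    for _action, s in successors(state):
        v = max(v, min_value(s))
    return v


def min_value(state):
    if terminal_test(state):
        return utility(state)
    v = math.inf
    for _action, s in successors(state):
        v = min(v, max_value(s))
    return v


def minimax_decision(state):
    """Return the action whose successor has the highest MIN-VALUE."""
    best_action, _ = max(successors(state), key=lambda pair: min_value(pair[1]))
    return best_action
```

Instead of computing v and then looking up an action with that value, this version selects the maximizing action directly; the decision is the same.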

27 Properties of minimax
Complete? Yes (if the tree is finite)
Optimal? Yes (against an optimal opponent)
Time complexity? O(b^m)
Space complexity? O(bm) (depth-first exploration)
Here b is the branching factor (the number of legal moves per state) and m is the maximum depth of the tree. For chess, b ≈ 35 and m ≈ 100 for "reasonable" games → an exact solution is infeasible.
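To get a feel for why the exact solution is infeasible, the chess figures can be plugged in directly. This is back-of-the-envelope arithmetic only; b^m and bm are orders of growth, not exact node counts.

```python
b, m = 35, 100  # branching factor and depth quoted for chess

time_nodes = b ** m    # order of the number of states minimax examines
space_nodes = b * m    # order of the number of states held in memory at once

# Number of decimal digits in b**m, as a measure of its size.
digits = len(str(time_nodes))

print(f"time  ~ a {digits}-digit number of states")
print(f"space ~ {space_nodes} states")
```

The time figure is on the order of 10^154 states, while the memory requirement stays in the thousands; time, not space, is the obstacle.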

28 Alpha-Beta Pruning Problem with minimax search: the number of states it examines is exponential in the depth of the tree. Can we cut the exponent in half? It is possible to compute the minimax decision without looking at every node. Pruning: eliminate some parts of the tree from consideration.

29 Alpha-beta pruning We can improve on the performance of the minimax algorithm through alpha-beta pruning. [Figure: game tree with levels labeled MAX, MIN, MAX and leaf values 2, 7, 1, ?]

30 Alpha-beta pruning We don’t need to compute the value at the node marked "?". No matter what it is, it can’t affect the value of the root node. [Figure: same tree as on the previous slide.]
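A short derivation shows why the unknown leaf cannot matter. It assumes the tree has MAX at the root and two MIN nodes below it, with leaves 2 and 7 under the first and leaves 1 and "?" under the second; the figure itself is not part of this transcript.

$$
\begin{aligned}
\text{Minimax}(\text{root}) &= \max(\min(2, 7),\ \min(1, ?)) \\
&= \max(2,\ z) \quad \text{where } z = \min(1, ?) \le 1 \\
&= 2
\end{aligned}
$$

Since z can never exceed 1, MAX prefers the first move whatever value "?" takes.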

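31 Alpha-Beta Example Do DFS until the first leaf. Each node is labeled with its range of possible values.
root [-∞,+∞]; left MIN node [-∞,+∞]

32 Alpha-Beta Example Do DFS until the first leaf. Each node is labeled with its range of possible values.
root [-∞,+∞]; left MIN node [-∞,+∞]

33 Alpha-Beta Example (continued)
root [-∞,+∞]; left MIN node [-∞,3]

34 Alpha-Beta Example (continued)
root [-∞,+∞]; left MIN node [-∞,3]

35 Alpha-Beta Example (continued)
root [-∞,+∞]; left MIN node [3,3]

36 Alpha-Beta Example (continued)
root [3,+∞]; left MIN node [3,3]

37 Alpha-Beta Example (continued)
root [3,+∞]; left MIN node [3,3]; middle MIN node [-∞,+∞]

38 Alpha-Beta Example (continued)
root [3,+∞]; left MIN node [3,3]; middle MIN node [-∞,2]

39 Alpha-Beta Example (continued)
root [3,+∞]; left MIN node [3,3]; middle MIN node [-∞,2] (this node is worse for MAX)

40 Alpha-Beta Example (continued)
root [3,14]; left MIN node [3,3]; middle MIN node [-∞,2]; right MIN node [-∞,+∞]

41 Alpha-Beta Example (continued)
root [3,14]; left MIN node [3,3]; middle MIN node [-∞,2]; right MIN node [-∞,14]

42 Alpha-Beta Example (continued)
root [3,5]; left MIN node [3,3]; middle MIN node [-∞,2]; right MIN node [-∞,5]

43 Alpha-Beta Example (continued)
left MIN node [3,3]; middle MIN node [-∞,2]; right MIN node [2,2]

44 Alpha-Beta Example (continued)
root [3,3]; left MIN node [3,3]; middle MIN node [-∞,2]; right MIN node [2,2]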

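45 α-β pruning example
$$
\begin{aligned}
\text{Minimax}(\text{root}) &= \max(\min(3,12,8),\ \min(2,x,y),\ \min(14,5,2)) \\
&= \max(3,\ z,\ 2) \quad \text{where } z = \min(2,x,y) \le 2 \\
&= 3
\end{aligned}
$$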

46 α-β pruning We made the same minimax decision without ever evaluating two of the leaf nodes (x and y). The value of the root is independent of them. It is possible to prune entire subtrees.

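47 Why is it called α-β? α = the value of the best (highest-value) choice found so far at any choice point along the path for MAX. If v is worse than α, MAX will avoid it → prune that branch. β is defined similarly for MIN: the value of the best (lowest-value) choice found so far at any choice point along the path for MIN.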

48 Alpha-Beta Algorithm
function ALPHA-BETA-SEARCH(state) returns an action
    inputs: state, current state in game
    v ← MAX-VALUE(state, -∞, +∞)
    return the action in SUCCESSORS(state) with value v

function MAX-VALUE(state, α, β) returns a utility value
    if TERMINAL-TEST(state) then return UTILITY(state)
    v ← -∞
    for a, s in SUCCESSORS(state) do
        v ← MAX(v, MIN-VALUE(s, α, β))
        if v ≥ β then return v
        α ← MAX(α, v)
    return v

49 Alpha-Beta Algorithm
function MIN-VALUE(state, α, β) returns a utility value
    if TERMINAL-TEST(state) then return UTILITY(state)
    v ← +∞
    for a, s in SUCCESSORS(state) do
        v ← MIN(v, MAX-VALUE(s, α, β))
        if v ≤ α then return v
        β ← MIN(β, v)
    return v
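The MAX-VALUE / MIN-VALUE pair can be sketched in Python as a single function with a flag for whose turn it is. Assumptions: the tree is a nested list with numeric leaves (utilities for MAX), the `visited` list is added only to record which leaves get evaluated, and the values 4 and 6 stand in for the unspecified x and y of the example.

```python
import math


def alphabeta(node, alpha=-math.inf, beta=math.inf, max_to_move=True, visited=None):
    """Return the minimax value of node, skipping branches that cannot matter.

    visited, if given, collects every leaf that is actually evaluated.
    """
    if visited is None:
        visited = []
    if not isinstance(node, list):          # terminal state
        visited.append(node)
        return node
    if max_to_move:
        v = -math.inf
        for child in node:
            v = max(v, alphabeta(child, alpha, beta, False, visited))
            if v >= beta:                   # MIN above would never allow this
                return v
            alpha = max(alpha, v)
        return v
    v = math.inf
    for child in node:
        v = min(v, alphabeta(child, alpha, beta, True, visited))
        if v <= alpha:                      # MAX above already has something better
            return v
        beta = min(beta, v)
    return v


X, Y = 4, 6  # assumed stand-ins for the leaves the example leaves unspecified
tree = [[3, 12, 8], [2, X, Y], [14, 5, 2]]
visited = []
value = alphabeta(tree, visited=visited)
```

On this tree the search returns 3 and evaluates seven of the nine leaves; the leaves standing in for x and y are never visited, so changing them cannot change the result.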

50 Comments: Alpha-Beta Pruning
Pruning does not affect the final result. Entire subtrees can be pruned. Good move ordering improves the effectiveness of pruning. With “perfect ordering,” time complexity is O(b^(m/2)), so alpha-beta pruning can look twice as far ahead as minimax in the same amount of time.
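The "twice as far" claim follows from the exponents, as this small calculation suggests. It treats b^m and b^(m/2) as node counts, which ignores constant factors, and the function names are hypothetical; b = 35 is the chess branching factor quoted earlier.

```python
import math


def minimax_nodes(b, m):
    """Order of the number of states minimax examines to depth m."""
    return b ** m


def alphabeta_best_case_nodes(b, m):
    """Order of the number of states alpha-beta examines to depth m
    with perfect move ordering (m assumed even to keep integers)."""
    return b ** (m // 2)


b = 35
# The same budget that takes minimax to depth 4 takes
# perfectly ordered alpha-beta to depth 8.
same_budget = minimax_nodes(b, 4) == alphabeta_best_case_nodes(b, 8)

# Equivalently, the effective branching factor drops from b to sqrt(b).
effective_b = math.sqrt(b)
```

For chess this means an effective branching factor of about 6 instead of 35, provided the best moves are tried first.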

51 Deterministic games in practice
Checkers: Chinook ended the 40-year reign of human world champion Marion Tinsley. It used a precomputed endgame database defining perfect play for all positions involving 8 or fewer pieces on the board, a total of 444 billion positions.
Chess: Deep Blue defeated human world champion Garry Kasparov in a six-game match. Deep Blue searches 200 million positions per second, uses very sophisticated evaluation, and uses undisclosed methods for extending some lines of search up to 40 ply.
Othello: Logistello defeated the human world champion. It is generally acknowledged that humans are no match for computers at Othello.

