Agents that can play multi-player games. Recall: Single-player, fully-observable, deterministic game agents An agent that plays Peg Solitaire involves.

Agents that can play multi-player games

Recall: Single-player, fully-observable, deterministic game agents
An agent that plays Peg Solitaire involves:
- A representation of the initial state;
- A method to generate new states from existing ones;
- A test for whether a state is a goal state.
[Figures: initial board for Triangle Peg Solitaire; a jump, with resulting board; the goal state.]

Recall: Single-player, fully-observable, deterministic game agents
[Figure: search tree running from the initial board for Triangle Peg Solitaire, through a jump with its resulting board, to the goal state. Labels: initial state; successor state axioms or STRIPS effects; goal state.]
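The three ingredients above (initial state, successor generation, goal test) can be sketched in a few lines of Python. This is a minimal illustration only: it assumes a hypothetical one-row peg board rather than the triangular board in the slides, and the names `INITIAL_STATE`, `successors`, `is_goal`, and `solve` are made up for the example.

```python
from collections import deque

# Hypothetical toy board: a single row of holes, True = peg, False = empty.
# (The slides use a triangular board; a row keeps the sketch short.)
INITIAL_STATE = (True, True, False, True)

def successors(state):
    """Yield every state reachable by one jump: a peg hops over an
    adjacent peg into an empty hole, and the jumped peg is removed."""
    n = len(state)
    for src in range(n):
        for step in (-1, 1):
            over, dst = src + step, src + 2 * step
            if 0 <= dst < n and state[src] and state[over] and not state[dst]:
                new = list(state)
                new[src], new[over], new[dst] = False, False, True
                yield tuple(new)

def is_goal(state):
    """Goal test: exactly one peg remains."""
    return sum(state) == 1

def solve(initial):
    """Breadth-first search; returns the list of states from the initial
    state to a goal state, or None if no goal is reachable."""
    frontier = deque([[initial]])
    seen = {initial}
    while frontier:
        path = frontier.popleft()
        if is_goal(path[-1]):
            return path
        for nxt in successors(path[-1]):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append(path + [nxt])
    return None
```

Calling `solve(INITIAL_STATE)` returns the sequence of boards from the start to a single remaining peg.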

Goal state vs. Terminal states and Utilities
[Figure: the goal state alongside terminal states with utilities +2, +1, and -1.]

Quiz: Goal states vs. Terminal states and Utilities
What could go wrong when using A* or breadth-first or other strategies with terminal states?
[Figure: search tree labeled with the initial state, successor state axioms or STRIPS effects, and terminal states with utilities +1 and +2.]

Answer: Goal states vs. Terminal states and Utilities
You’re guaranteed to find the best path to the terminal state that is found. You’re NOT guaranteed to find the best terminal state (the one with highest utility), unless you do an exhaustive search.
[Figure: search tree labeled with the initial state, successor state axioms or STRIPS effects, and terminal states with utilities +1 and +2.]
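A small sketch of this point, on a made-up tree (the node names and utilities below are hypothetical, not taken from the figure): breadth-first search stops at the shallowest terminal state, while an exhaustive search compares every terminal state.

```python
from collections import deque

# Hypothetical search tree: inner nodes map to their children;
# terminal states appear only in UTILITY.
TREE = {
    "start": ["a", "b"],
    "a": ["t1"],
    "b": ["c"],
    "c": ["t2"],
}
UTILITY = {"t1": 1, "t2": 2}

def first_terminal_bfs(root):
    """Return the first terminal state that breadth-first search reaches."""
    frontier = deque([root])
    while frontier:
        node = frontier.popleft()
        if node in UTILITY:
            return node
        frontier.extend(TREE[node])
    return None

def best_terminal(root):
    """Exhaustive depth-first search: return the terminal with highest utility."""
    if root in UTILITY:
        return root
    candidates = [best_terminal(child) for child in TREE[root]]
    return max(candidates, key=UTILITY.__getitem__)
```

Here breadth-first search returns `t1` (utility +1) because it is shallower, even though `t2` (utility +2) is the better terminal state.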

Hex: Two-player, zero-sum game (Also, deterministic and fully-observable.)
Hex:
- Two players, red and blue.
- Board is N x N, with hexagonal spaces.
- Two opposite sides are red, and the other two sides are blue.
- Each player’s objective is to build a path connecting the sides of his or her color.
- Players alternate turns, and place a single piece of their color on their turn.
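The winning condition can be checked with a simple graph search. The sketch below is one possible encoding, not the one used in the lecture: it assumes the board is stored as N rows of N cells in a rhombus layout, that red owns the top and bottom sides, and that blue owns the left and right sides. The names `NEIGHBORS` and `has_won` are illustrative.

```python
from collections import deque

# Assumed conventions (not from the slides): cells hold "R", "B", or ".";
# red connects the top row to the bottom row, blue connects the left
# column to the right column. In this layout each hexagon touches six cells.
NEIGHBORS = [(-1, 0), (-1, 1), (0, -1), (0, 1), (1, -1), (1, 0)]

def has_won(board, player):
    """Return True if `player` ("R" or "B") has a connected path
    between their two sides of the board."""
    n = len(board)
    if player == "R":
        starts = [(0, c) for c in range(n) if board[0][c] == "R"]
        reached_far_side = lambda r, c: r == n - 1
    else:
        starts = [(r, 0) for r in range(n) if board[r][0] == "B"]
        reached_far_side = lambda r, c: c == n - 1
    frontier = deque(starts)
    seen = set(starts)
    while frontier:
        r, c = frontier.popleft()
        if reached_far_side(r, c):
            return True
        for dr, dc in NEIGHBORS:
            nr, nc = r + dr, c + dc
            if (0 <= nr < n and 0 <= nc < n
                    and (nr, nc) not in seen
                    and board[nr][nc] == player):
                seen.add((nr, nc))
                frontier.append((nr, nc))
    return False
```

For example, `has_won([".R.", "R..", "R.."], "R")` is true because the red pieces at (0, 1) and (1, 0) are adjacent on a hexagonal grid.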

Hex: Two-player, zero-sum game
Some fun facts:
- There are no ties in Hex (proved by John Nash).
- First player has a distinct advantage (also proved by Nash).
- In tournament play, it’s common to use the “pie rule”, for fairness: after the first player makes the first move, the second player can choose whether to switch sides. (We will ignore this rule.)

Hex Question What is red’s best move (red’s turn next)?

Hex Question What is red’s best move (red’s turn next)? This orange one looks pretty good: only one more square, and red will win. Using a simple heuristic, this looks like it’s getting close to the goal.

Hex Question What is red’s best move (red’s turn next)? However, if red moves to the orange square, the blue player can win on the next turn!

Quiz: Hex Question If red moves to the orange square, what is blue’s best move?

Answer: Hex Question Blue has no good moves left!

Answer: Hex Question Blue has no good moves left! This one’s bad – red can still connect the paths.

Answer: Hex Question Blue has no good moves left! And this one’s bad too – red can still connect the paths.

Reasoning about 2-player games To pick a good move, each player has to think about the other player’s possible responses!

Extensive Form Representation of Games
Notation:
- Two players, Max (∆) and Min (∇).
- Terminal states are shown with a number giving the utility for Max (∆). (Since we’re doing zero-sum games, the utility for Min (∇) is just the opposite of this number.)

Extensive Form Representation of Games
[Figure: game tree. Levels alternate between Max’s turn (∆) and Min’s turn (∇). Edges are Max’s or Min’s possible actions, nodes are the resulting worlds/boards, and the leaves are terminal states labeled with the utility for Max (for example +1 and +2).]

Minimax (Backup) Algorithm
Basic Idea: Compute ∆’s Value(n) for each node n in the game tree, starting with the leaves and working up (“backup”). We’ll use a depth-first tree traversal. Once this is calculated, Max will choose an action that leads to a child node with the highest possible value.
[Figure: game tree with a ∆ root and three ∇ children; the leaf utilities are {3, 4, 4} on the left, {1, 8, 12} in the middle, and {2, 15, 20} on the right.]

Minimax (Backup) Algorithm
[Figure: the same game tree.]

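Minimax (Backup) Algorithm
Value of the left ∇ node: min {3, 4, 4} = 3
Value of the right ∇ node: min {2, 20, 15} = 2
[Figure: the same game tree, with leaf utilities {3, 4, 4}, {1, 8, 12}, and {2, 15, 20}.]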

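Quiz: Minimax (Backup) Algorithm
Value of the left ∇ node: min {3, 4, 4} = 3
Value of the right ∇ node: min {2, 20, 15} = 2
1. What is the Value of the middle ∇ node?
2. What is the Value of the top ∆ node?
[Figure: the same game tree, with leaf utilities {3, 4, 4}, {1, 8, 12}, and {2, 15, 20}.]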

Answer: Minimax (Backup) Algorithm
1. What is the Value of the middle ∇ node? min {1, 8, 12} = 1
2. What is the Value of the top ∆ node? max {3, 1, 2} = 3
[Figure: the same game tree, with leaf utilities {3, 4, 4}, {1, 8, 12}, and {2, 15, 20}.]
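The backup computation in this example can be written as a short recursive function. The sketch below assumes a tree encoded as nested Python lists with numbers at the leaves. That encoding and the names `minimax`, `TREE`, and `best_action` are choices made for this illustration. The leaf values are the ones shown in the figure.

```python
def minimax(node, maximizing):
    """Back up Max's value depth-first: leaves are numbers,
    inner nodes are lists of child nodes."""
    if not isinstance(node, list):
        return node  # terminal state: utility for Max
    values = [minimax(child, not maximizing) for child in node]
    return max(values) if maximizing else min(values)

# Root is a Max node with three Min children (left, middle, right).
TREE = [[3, 4, 4], [1, 8, 12], [2, 15, 20]]

def best_action(tree):
    """Index of the root action leading to the child with the highest value."""
    values = [minimax(child, maximizing=False) for child in tree]
    return max(range(len(values)), key=values.__getitem__)
```

With this encoding, `minimax(TREE, True)` backs up the values 3, 1, and 2 from the Min nodes and returns 3, and `best_action(TREE)` picks the leftmost action (index 0).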

Quiz: Minimax
1. Compute the value of each node in the game tree.
2. Which action should Max take?
3. What is Min’s optimal response?
[Figure: game tree with a ∆ root whose actions a, b, and c lead to three ∇ nodes, with further ∆ nodes and terminal utilities below them.]

Answer: Minimax
1. Compute the value of each node in the game tree.
2. Which action should Max take? Action on right (c)
3. What is Min’s optimal response? Action on right
[Figure: the same game tree with the backed-up values filled in.]

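From Extensive Form to Normal Form Games
Every “extensive form” game (even ones where you don’t have zero-sum utilities) can be made into a “normal form” game. Each sequence of actions for a player becomes a row or a column. The size of the resulting matrix can be exponential in the size of the game tree.
[Figure: game tree. Max (∆) chooses A or B, then Min (∇) chooses C or D. After A: C ends the game with utility 4, and D leads to a second Max move with utilities 5 (A) and 7 (B). After B: C ends the game with utility 1, and D leads to a second Max move with utilities 4 (A) and 10 (B).]
Normal form (rows: Max’s action sequences; columns: Min’s actions; entries: utility for Max, utility for Min):

        C         D
A, A    +4, -4    +5, -5
A, B    +4, -4    +7, -7
B, A    +1, -1    +4, -4
B, B    +1, -1    +10, -10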
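The conversion can be sketched by enumerating every action sequence for Max against every action for Min and playing each combination through the tree. The nested-dictionary encoding of the tree and the names `GAME`, `play`, and `normal_form` are assumptions made for this example. The payoffs are the ones in the matrix above.

```python
from itertools import product

# Game tree as nested dicts: Max's first action -> Min's action -> either a
# terminal utility for Max, or a dict for Max's second action.
GAME = {
    "A": {"C": 4, "D": {"A": 5, "B": 7}},
    "B": {"C": 1, "D": {"A": 4, "B": 10}},
}

def play(first, second, min_action):
    """Utility for Max when Max plans (first, second) and Min plays min_action."""
    outcome = GAME[first][min_action]
    if isinstance(outcome, dict):  # Max gets a second move
        outcome = outcome[second]
    return outcome

def normal_form():
    """Map ((first, second), min_action) to (utility for Max, utility for Min)."""
    matrix = {}
    for first, second in product("AB", repeat=2):
        for min_action in "CD":
            u = play(first, second, min_action)
            matrix[(first, second), min_action] = (u, -u)  # zero-sum
    return matrix
```

Note that Max’s second action only matters when Min plays D, which is why the C column repeats the same payoff for rows that share a first action.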

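From Normal Form games to Extensive Form games
Not every Normal Form game can be represented using the Extensive Form I have shown you so far.
[Figure: a 2x2 normal-form payoff matrix in which each player chooses C or D (legible entries: +2, -2; -3, +3; +4, -4), next to two candidate game trees, one with ∆ moving first and one with ∇ moving first, both marked with question marks.]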

From Normal Form games to Extensive Form games
Can introduce new notation – information states – that allows the Extensive Form to represent any Normal Form game.
[Figure: the same payoff matrix and the two game trees.]

From Normal Form games to Extensive Form games
Information states are also useful for handling Partial Observability in turn-based games. E.g., in Poker, they can be used to represent the set of all hands your opponent may have been dealt.
[Figure: the same payoff matrix and the two game trees.]

Perfect Information Games
Definition: A game in extensive form has perfect information if every information state has only one node. (This is the same as our original version of game trees.) Perfect Information is basically just another name for full observability for game trees. We’ll talk more about partial observability later.
Theorem (Zermelo, 1913): Every finite, perfect-information game in extensive form has a pure-strategy Nash equilibrium.

Relation between Minimax Algorithm and Minimax Theorem
Recall that the Minimax Theorem says every 2-player, zero-sum game has a Value for each player and a Nash Equilibrium. The guy who proved this (von Neumann) used essentially the Minimax algorithm to prove the theorem. The Value of the root node in the Minimax algorithm is the same as the Value of the game for the Max player.
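As a rough numerical check of the last sentence, the sketch below reuses the small game from the earlier normal-form slide and compares three quantities: the root value backed up through the tree, the best worst-case row of the payoff matrix, and the best worst-case column. It keeps the slide’s simplified matrix with two columns for Min. The list encoding and the names `TREE`, `PAYOFF`, and `minimax` are assumptions for this illustration.

```python
def minimax(node, maximizing):
    """Back up Max's value: leaves are numbers, inner nodes are lists."""
    if not isinstance(node, list):
        return node
    values = [minimax(child, not maximizing) for child in node]
    return max(values) if maximizing else min(values)

# Tree: Max picks A or B; Min picks C (terminal) or D (Max moves again).
TREE = [[4, [5, 7]], [1, [4, 10]]]

# Same game as a payoff matrix (utility for Max), rows = Max's action
# sequences, columns = Min's actions, as on the normal-form slide.
PAYOFF = {
    "AA": {"C": 4, "D": 5},
    "AB": {"C": 4, "D": 7},
    "BA": {"C": 1, "D": 4},
    "BB": {"C": 1, "D": 10},
}

root_value = minimax(TREE, True)
# Max picks the row whose worst case is best.
row_guarantee = max(min(row.values()) for row in PAYOFF.values())
# Min picks the column whose worst case (for Min) is best.
column_guarantee = min(max(row[col] for row in PAYOFF.values()) for col in "CD")
```

All three quantities come out to 4 for this game.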

Quiz: Time Complexity of Minimax
Let b be the branching factor of the game tree. Let m be the depth of the game tree. What is the time complexity of Minimax?
- O(b+m)?
- O(bm)?
- O(b^m)?
- O(m^b)?
[Figure: the game tree from the Minimax quiz.]

Answer: Time Complexity of Minimax
Let b be the branching factor of the game tree. Let m be the depth of the game tree. What is the time complexity of Minimax?
- O(b+m)?
- O(bm)?
- O(b^m) (answer)
- O(m^b)?
[Figure: the game tree from the Minimax quiz.]

Quiz: Space Complexity of Minimax
Let b be the branching factor of the game tree. Let m be the depth of the game tree. What is the space complexity of Minimax?
- O(b+m)?
- O(bm)?
- O(b^m)?
- O(m^b)?
[Figure: the game tree from the Minimax quiz.]

Answer: Space Complexity of Minimax
Let b be the branching factor of the game tree. Let m be the depth of the game tree. What is the space complexity of Minimax?
- O(b+m)?
- O(bm) (answer)
- O(b^m)?
- O(m^b)?
[Figure: the game tree from the Minimax quiz.]
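A small experiment can make both answers concrete. The sketch below walks a uniform tree depth-first, the same order Minimax uses, and reports how many nodes it visits and how deep the recursion goes. The function name `count_nodes` and the uniform-tree assumption are illustrative only.

```python
def count_nodes(b, m):
    """Visit a uniform tree (branching factor b, depth m) depth-first.
    Returns (number of nodes visited, deepest recursion level reached)."""
    if m == 0:
        return 1, 0  # a leaf
    total, deepest = 1, 0
    for _ in range(b):
        n, d = count_nodes(b, m - 1)
        total += n
        deepest = max(deepest, d + 1)
    return total, deepest
```

The visited-node count grows like b^m (for b = 3 and m = 4 it is 1 + 3 + 9 + 27 + 81 = 121), while the recursion is only ever m levels deep. If each level keeps its b generated children in memory, that gives the O(bm) space bound.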

Quiz: Complexity of Minimax
Chess has an average branching factor of ~30, and each game takes on average ~40 moves. If it takes ~1 millisecond to compute the value of each board position in the game tree, how long does it take to figure out the value of the game using Minimax?
- A few milliseconds?
- A few seconds?
- A few minutes?
- A few hours?
- A few days?
- A few years?
- A few decades?
- A few millennia (thousands of years)?
- More time than the age of the universe?

Answer: Complexity of Minimax
Chess has an average branching factor of ~30, and each game takes on average ~40 moves. If it takes ~1 millisecond to compute the value of each board position in the game tree, how long does it take to figure out the value of the game using Minimax?
Answer: More time than the age of the universe.
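The arithmetic behind this answer is short enough to check directly. The sketch below treats the ~40 figure as the depth m of the tree, counts only the b^m leaf positions, and uses a commonly cited estimate of about 13.8 billion years for the age of the universe. Those modeling choices and the constant names are assumptions for this illustration.

```python
BRANCHING = 30                 # b, from the slide
DEPTH = 40                     # m, reading "~40" as the tree depth (assumption)
SECONDS_PER_POSITION = 1e-3    # ~1 millisecond per board position
SECONDS_PER_YEAR = 365.25 * 24 * 3600
AGE_OF_UNIVERSE_YEARS = 1.38e10  # commonly cited estimate (assumption)

positions = BRANCHING ** DEPTH   # about 1.2e59 leaf positions
years = positions * SECONDS_PER_POSITION / SECONDS_PER_YEAR
ratio = years / AGE_OF_UNIVERSE_YEARS
```

Under these assumptions `years` is on the order of 10^48, many orders of magnitude longer than the age of the universe.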

Strategies for coping with complexity
- Reduce b
- Reduce m
- Memoize
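As one illustration of the third strategy, the sketch below memoizes Minimax with `functools.lru_cache`, so that a position reached by different move orders is only evaluated once. The game is a hypothetical stand-in that does not appear in the slides: a pile of stones, each player removes 1 to 3 stones per turn, and whoever takes the last stone wins.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def value(stones, max_to_move):
    """Minimax value for Max (+1 = Max wins, -1 = Min wins) of a toy
    take-away game; results are cached per (stones, max_to_move) state."""
    if stones == 0:
        # The previous player took the last stone and won.
        return -1 if max_to_move else +1
    children = [value(stones - take, not max_to_move)
                for take in (1, 2, 3) if take <= stones]
    return max(children) if max_to_move else min(children)
```

Without the cache the number of calls grows exponentially with the pile size; with it, at most two states per pile size (one for each player to move) are ever computed.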
