
Two-player, zero-sum, perfect-information games
CPS 270: Artificial Intelligence
http://www.cs.duke.edu/courses/fall08/cps270/
Instructor: Vincent Conitzer

Game playing

Rich tradition of creating game-playing programs in AI; many similarities to search.
Most of the games studied:
- have two players;
- are zero-sum: what one player wins, the other loses;
- have perfect information: the entire state of the game is known to both players at all times.
E.g., tic-tac-toe, checkers, chess, Go, backgammon, … We will focus on these for now.
Recently there has been more interest in other games, especially games without perfect information, e.g., poker; such games require probability theory and game theory.

“Sum to 2” game

Player 1 moves, then player 2, and finally player 1 again; each move is 0 or 1.
Player 1 wins if and only if all moves together sum to 2.
[Figure: the game tree, with edges labeled by the moves 0 and 1. Player 1’s utility (+1 or -1) is in the leaves; player 2’s utility is the negative of this.]

Backward induction (aka minimax)

From the leaves upward, analyze the best decision for the player at each node, and give the node that value.
Once we know the values, it is easy to find the optimal action: choose the action leading to the best value.
[Figure: the same game tree, with every internal node labeled by its backed-up value; the root’s value is 1.]

Modified game

Same procedure: from the leaves upward, analyze the best decision for the player at each node, and give the node that value.
[Figure: a game tree with leaf values -1, -2, -3, 4, -5, 6, 7, -8; the backed-up value at the root is 6.]

A recursive implementation

Value(state)
  If state is terminal, return its value
  If player(state) = player 1
    v := -infinity
    For each action
      v := max(v, Value(successor(state, action)))
    Return v
  Else
    v := infinity
    For each action
      v := min(v, Value(successor(state, action)))
    Return v

Space? Time? (With depth-first evaluation, space is O(bm) for branching factor b and depth m; time is O(b^m), since every node is visited.)
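To make the recursion concrete, here is a minimal runnable Python sketch for the “Sum to 2” game above; the tuple-of-moves state representation and the function names are our own illustration, not from the slides.

```python
# Minimax (backward induction) for the "Sum to 2" game: three
# alternating moves, each 0 or 1; player 1 moves first and third, and
# gets utility +1 iff the moves sum to 2 (and -1 otherwise).

def value(state):
    """Minimax value of a state for player 1; a state is the tuple of
    moves made so far."""
    if len(state) == 3:                      # terminal: all moves made
        return 1 if sum(state) == 2 else -1
    player1_to_move = len(state) % 2 == 0    # plies 0 and 2 are player 1's
    best = max if player1_to_move else min
    return best(value(state + (move,)) for move in (0, 1))

print(value(()))  # 1: player 1 can force the moves to sum to exactly 2
```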

Do we need to see all the leaves?

Do we need to see the value of the question mark here?
[Figure: a partially evaluated game tree with leaf values -1, -2, ?, 4.]

Do we need to see all the leaves?

Do we need to see the values of the question marks here?
[Figure: a partially evaluated game tree with leaf values -1, -2, ?, ?, -5, 6, 7, -8.]

Alpha-beta pruning

Pruning = cutting off parts of the search tree (because you realize you don’t need to look at them). When we considered A*, we also pruned large parts of the search tree.
Maintain alpha = the value of the best option for player 1 along the path so far, and beta = the value of the best option for player 2 along the path so far.

Pruning on beta

Beta at node v is -1: player 2 can already guarantee -1 elsewhere along the path.
We know the value of node v is going to be at least 4, so the -1 route will be preferred; there is no need to explore node v further.
[Figure: the tree from two slides back, where node v is the player 1 node with one child of value 4 and one unexplored child (?).]

Pruning on alpha

Alpha at node w is 6: player 1 can already guarantee 6 elsewhere along the path.
We know the value of node w is going to be at most -1, so the 6 route will be preferred; there is no need to explore node w further.
[Figure: the tree from the previous example, where node w is a player 2 node above the unexplored (?) leaves; leaf values -1, -2, ?, ?, -5, 6, 7, -8.]

Modifying the recursive implementation to do alpha-beta pruning

Value(state, alpha, beta)
  If state is terminal, return its value
  If player(state) = player 1
    v := -infinity
    For each action
      v := max(v, Value(successor(state, action), alpha, beta))
      If v >= beta, return v
      alpha := max(alpha, v)
    Return v
  Else
    v := infinity
    For each action
      v := min(v, Value(successor(state, action), alpha, beta))
      If v <= alpha, return v
      beta := min(beta, v)
    Return v

(The initial call is Value(start state, -infinity, +infinity).)
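The same pruning logic as a runnable Python sketch, reusing the hypothetical state representation from the minimax sketch above:

```python
import math

# Alpha-beta pruning for the "Sum to 2" game.  alpha/beta are the best
# values player 1/player 2 can already guarantee on the path so far.

def value(state, alpha=-math.inf, beta=math.inf):
    if len(state) == 3:                       # terminal
        return 1 if sum(state) == 2 else -1
    if len(state) % 2 == 0:                   # player 1 (maximizer) moves
        v = -math.inf
        for move in (0, 1):
            v = max(v, value(state + (move,), alpha, beta))
            if v >= beta:                     # player 2 avoids this node
                return v
            alpha = max(alpha, v)
        return v
    else:                                     # player 2 (minimizer) moves
        v = math.inf
        for move in (0, 1):
            v = min(v, value(state + (move,), alpha, beta))
            if v <= alpha:                    # player 1 avoids this node
                return v
            beta = min(beta, v)
        return v

print(value(()))  # 1, as before, but visiting fewer nodes
```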

Benefits of alpha-beta pruning

Without pruning, we need to examine O(b^m) nodes (branching factor b, depth m).
With pruning, the savings depend on which nodes we consider first:
- if we choose a random successor, we need to examine O(b^(3m/4)) nodes;
- if we manage to choose the best successor first, we need to examine only O(b^(m/2)) nodes.
Practical heuristics for choosing the next successor to consider get quite close to this.
So we can effectively look twice as deep! That is the difference between reasonable and expert play.
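As a rough worked example (assuming chess’s commonly cited branching factor of about 35, a number not from the slides): an exhaustive 8-ply search examines on the order of 35^8 ≈ 2.3 × 10^12 nodes, while alpha-beta with perfect move ordering examines roughly 35^(8/2) = 35^4 ≈ 1.5 × 10^6 nodes, comparable to an exhaustive 4-ply search.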

Repeated states

As in search, multiple sequences of moves may lead to the same state.
Again, we can keep track of previously seen states (the store is usually called a transposition table in this context).
We may not want to keep track of all previously seen states…
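A minimal sketch of a transposition table layered on the earlier minimax sketch (the dictionary cache and the canonical key are our own illustration; real game programs typically use fixed-size hash tables with replacement policies):

```python
# Minimax with a transposition table for the "Sum to 2" game.  Two move
# sequences of the same length and the same sum are strategically
# identical, so we key the cache on that pair and compute each
# equivalence class of states only once.

transposition_table = {}

def value(state):
    key = (len(state), sum(state))           # canonical form of the state
    if key in transposition_table:
        return transposition_table[key]
    if len(state) == 3:                      # terminal
        v = 1 if sum(state) == 2 else -1
    elif len(state) % 2 == 0:                # player 1 to move
        v = max(value(state + (m,)) for m in (0, 1))
    else:                                    # player 2 to move
        v = min(value(state + (m,)) for m in (0, 1))
    transposition_table[key] = v
    return v

print(value(()))  # still 1; e.g., (0, 1) and (1, 0) now share one entry
```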

Using evaluation functions

Most games are too big to solve even with alpha-beta pruning.
Solution: only look ahead to a limited depth, stopping at nonterminal nodes; evaluate nodes at the depth cutoff with a heuristic (aka an evaluation function).
E.g., in chess, material value: a queen is worth 9 points, a rook 5, a bishop 3, a knight 3, a pawn 1; the heuristic is the difference between the players’ material values.
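A sketch of how the depth cutoff and evaluation function slot into the search; the longer “sum” game and its evaluation function below are our own toy illustration, not from the slides:

```python
import math

# Depth-limited alpha-beta on a longer variant of the earlier game:
# nine alternating 0/1 moves, and player 1 wins iff they sum to
# exactly 5.  Below the cutoff we return a heuristic estimate instead
# of searching to the end of the game.

NUM_MOVES, TARGET, DEPTH_CUTOFF = 9, 5, 4

def evaluate(state):
    """Made-up heuristic for nonterminal states: prefer sums close to
    the 'on pace' value TARGET * (moves so far) / NUM_MOVES."""
    return -abs(sum(state) - TARGET * len(state) / NUM_MOVES)

def value(state, depth=0, alpha=-math.inf, beta=math.inf):
    if len(state) == NUM_MOVES:                   # true terminal node
        return 1 if sum(state) == TARGET else -1
    if depth >= DEPTH_CUTOFF:                     # cutoff: use heuristic
        return evaluate(state)
    maximizing = len(state) % 2 == 0
    v = -math.inf if maximizing else math.inf
    for move in (0, 1):
        child_v = value(state + (move,), depth + 1, alpha, beta)
        if maximizing:
            v = max(v, child_v)
            if v >= beta:
                return v
            alpha = max(alpha, v)
        else:
            v = min(v, child_v)
            if v <= alpha:
                return v
            beta = min(beta, v)
    return v

print(value(()))  # a heuristic estimate of the start state, not its true value
```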

Chess example

White to move; depth cutoff: 3 ply (a ply = a move by one player).
[Figure: chess diagrams for the search, with lines such as White Rd8+ …, Black Kb7, White Rxf8, Re8 …, and node values 2 and -1.]

Chess (bad) example

White to move; depth cutoff: 3 ply.
[Figure: a similar position searched to the same 3-ply cutoff, again with lines White Rd8+ …, Black Kb7, White Rxf8, Re8 …, and node values 2 and -1.]
The depth cutoff obscures the fact that the white rook will be captured.

Addressing this problem

Try to evaluate whether nodes are quiescent. Quiescent = the evaluation function will not change rapidly in the near future.
Only apply the evaluation function to quiescent nodes: if there is an “obvious” move at a state (e.g., a capture), apply it before applying the evaluation function.
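This idea is often implemented as a small “quiescence search” at the cutoff. The sketch below is our own illustration; evaluate and noisy_moves (e.g., a generator of captures and checks only) are caller-supplied, hypothetical helpers:

```python
import math

def quiescence_value(state, maximizing, evaluate, noisy_moves,
                     alpha=-math.inf, beta=math.inf):
    """At the depth cutoff, keep following 'non-quiet' moves until the
    position settles down, instead of statically evaluating a position
    in the middle of, say, a capture exchange."""
    stand_pat = evaluate(state)              # score if we stop right here
    children = noisy_moves(state)            # e.g., captures and checks
    if not children:                         # quiescent: trust evaluation
        return stand_pat
    v = stand_pat                            # we may decline the noisy moves
    for child in children:
        cv = quiescence_value(child, not maximizing, evaluate,
                              noisy_moves, alpha, beta)
        if maximizing:
            v = max(v, cv)
            if v >= beta:
                return v
            alpha = max(alpha, v)
        else:
            v = min(v, cv)
            if v <= alpha:
                return v
            beta = min(beta, v)
    return v
```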

Playing against suboptimal players

Minimax is optimal against other minimax players. What about against players that play in some other way?

Many-player, general-sum games of perfect information

Basic backward induction still works (though it is no longer called minimax). But what if the other players do not play this way?
[Figure: a three-player game tree whose leaves carry utility vectors such as (1,2,3) and (3,4,2); the vector of numbers gives each player’s utility, and each player maximizes their own entry.]
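A sketch of backward induction with utility vectors (the tree encoding and the tiny three-player example below are our own, not the tree from the slide): at a node where player i moves, pick the child whose vector is largest in coordinate i.

```python
# Backward induction for many-player, general-sum games.  Trees are
# nested tuples: ("leaf", utility_vector) or ("node", player, children),
# where player is a 0-based index into the utility vectors.

def backward_induction(tree):
    if tree[0] == "leaf":
        return tree[1]                                  # the utility vector
    _, player, children = tree
    values = [backward_induction(child) for child in children]
    return max(values, key=lambda u: u[player])         # selfish choice

tree = ("node", 0, [                    # player 1 (index 0) moves first
    ("node", 1, [                       # then player 2 (index 1)
        ("leaf", (1, 2, 3)),
        ("leaf", (3, 4, 2)),
    ]),
    ("leaf", (2, 1, 1)),
])
print(backward_induction(tree))  # (3, 4, 2): player 2 prefers 4 to 2
```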

Games with random moves by “Nature”

E.g., games with dice (Nature chooses the dice roll). Backward induction still works: at a Nature node, take the expected value over its children.
Evaluation functions now need to be cardinally right (not just ordinally).
For two-player zero-sum games with random moves, can we generalize alpha-beta? How?
[Figure: a game tree with Nature nodes whose branch probabilities are 50%/50% and 60%/40%, over leaf utility vectors (1,3), (3,2), (3,4), (1,2); a Nature node’s value is the expectation, e.g., (2, 3.5) = 0.5·(1,3) + 0.5·(3,4).]
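A sketch of the expectation step (this generalization is often called expectiminimax in the two-player zero-sum case; the tree encoding below is again our own illustration):

```python
# Backward induction with chance nodes: at a Nature node, the value is
# the probability-weighted average of the children's values.  Nodes:
# ("leaf", v), ("max", children), ("min", children), or
# ("chance", [(probability, child), ...]).

def value(tree):
    kind = tree[0]
    if kind == "leaf":
        return tree[1]
    if kind == "max":
        return max(value(child) for child in tree[1])
    if kind == "min":
        return min(value(child) for child in tree[1])
    return sum(p * value(child) for p, child in tree[1])  # chance node

tree = ("max", [
    ("chance", [(0.5, ("leaf", 1)), (0.5, ("leaf", 3))]),  # worth 2.0
    ("chance", [(0.6, ("leaf", 3)), (0.4, ("leaf", 1))]),  # worth 2.2
])
print(value(tree))  # 2.2: player 1 prefers the second chance node
```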

Games with imperfect information

Players cannot necessarily see the whole current state of the game, e.g., card games.
Ridiculously simple poker game: Player 1 receives the King (winning) or the Jack (losing); Player 1 can bet or stay; Player 2 can call or fold.
Backward induction does not work here; random strategies are needed for optimality! (More later in the course.)
[Figure: the game tree: “Nature” deals player 1 the King or the Jack; player 1 bets or stays; player 2 calls or folds; the payoffs to player 1 at the leaves are 2, 1, 1, 1, -2, 1, -1, 1. Dashed lines indicate states that player 2 cannot distinguish.]

Intuition for the need of random strategies

Suppose my strategy is “bet on King, stay on Jack.” What will you do? What is your expected utility?
What if my strategy is “always bet”?
What if my strategy is “always bet when given the King, and bet 10% of the time when given the Jack”?
[Figure: the same poker game tree as on the previous slide.]

The state of the art for some games

Chess: in 1997, IBM’s Deep Blue defeated Kasparov; there is still debate about whether computers are really better.
Checkers: a computer has been world champion since 1994; there was still debate about whether computers are really better, until 2007, when checkers was solved optimally by computer.
Go: computers are still not very good; the branching factor is really high, though there has been some recent progress.
Poker: competitive with top humans in some two-player games; the 3+ player case is much less well understood.

Is this of any value to society?

Some of the techniques developed for games have found applications in other domains, especially “adversarial” settings.
Real-world strategic situations are usually not two-player, perfect-information, zero-sum, … but game theory does not need any of those assumptions.
Example application: security scheduling at airports.