Perfect-Information Games


Perfect-Information Games Chapter 7 Two Player Perfect-Information Games

Computer Chess A natural domain for studying AI: the game is well structured and is a perfect-information game. Early programmers and AI researchers were often amateur chess players as well.

Brief History of Computer Chess Maelzel’s Chess Machine 1769 - Chess automaton built by Baron Wolfgang von Kempelen of Austria. It appeared to move the pieces on a board on top of the machine automatically, and it played excellent chess. The puzzle of how the machine played was solved in 1836 by Edgar Allan Poe.

Brief History of Computer Chess Maelzel’s Chess Machine

Brief History of Computer Chess Early 1950s - The first serious paper on computer chess was written by Claude Shannon. It described minimax search with a heuristic static evaluation function and anticipated the need for more selective search algorithms. 1956 - Invention of alpha-beta pruning by John McCarthy. It was used in early programs such as Samuel’s checkers player and Newell, Shaw and Simon’s chess program.

Brief History of Computer Chess 1982 - Development of Belle by Condon and Thompson. Belle was the first machine whose hardware was specifically designed to play chess, in order to achieve speed and search depth. 1997 - The Deep Blue machine was the first to defeat the human world champion, Garry Kasparov, in a six-game match.

Checkers 1952 - Samuel developed a checkers program that learned its own evaluation function through self-play. 1992 - Chinook (J. Schaeffer) wins the U.S. Open. At the world championship, Marion Tinsley beat Chinook.

Othello Othello programs are better than the best humans. A large number of pieces change hands in each move. The best Othello program today is Logistello (Michael Buro).

Backgammon Unlike the games above, backgammon includes a roll of the dice, introducing a random element. The best backgammon program is TD-Gammon (Gerry Tesauro). It is comparable to the best human players today, and it learns an evaluation function using temporal-difference learning.

Card games In addition to a random element, hidden information is introduced. The best bridge program is GIB (M. Ginsberg); bridge programs are not competitive with the best human players. Poker programs fare even worse relative to their human counterparts; poker involves a strong psychological element when played by people.

Other games - Summary In general, the greater the branching factor, the worse the performance. Go - branching factor 361, very poor performance. Checkers - branching factor 4, very good performance. Backgammon is an exception: despite a large branching factor, it still gets good results.

Brute-Force Search We begin by considering a purely brute-force approach to game playing. Clearly, this is only feasible for small games, but it provides a basis for further discussion. Example - 5-stone Nim, played by 2 players with a pile of stones. Each player removes one or two stones from the pile; the player who removes the last stone wins the game.

Example - Game Tree for 5-Stone Nim [Game tree diagram: pile sizes 4, 3, 2, 1 below the root; OR nodes where the first player moves, AND nodes where the opponent moves; leaves marked + (win) or x (loss)]
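The brute-force idea for Nim can be sketched directly: a position is a win for the player to move if some move leads to a position that is a loss for the opponent (an OR node over AND nodes). Function names here are illustrative, not from the original.

```python
def nim_wins(stones):
    """Return True if the player to move wins with perfect play.

    Rules: remove one or two stones; taking the last stone wins.
    """
    if stones == 0:
        return False  # the previous player took the last stone and won
    # OR node: win if any move leaves the opponent in a losing position.
    return any(not nim_wins(stones - take)
               for take in (1, 2) if take <= stones)

print(nim_wins(5))  # -> True: 5-stone Nim is a first-player win
```

Positions that are multiples of 3 come out as losses for the player to move, which matches the + and x labels in the game tree.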

Minimax The minimax theorem - every two-person zero-sum game is a forced win for one player or a forced draw for both players, and in principle these optimal minimax strategies can be computed. Performing this computation on tic-tac-toe results in the root being labeled a draw.

A strategy A strategy is a method which tells the player how to play in every possible scenario. It can be described implicitly or explicitly. An explicit strategy is a subtree of the search tree which branches only at the opponent’s moves. The size of this subtree is about b^(d/2).

Example - strategy for 5-Stone Nim [The same game tree, with a winning-strategy subtree highlighted: one move chosen at each OR node, all replies kept at each AND node]

Minimax propagation Start from the leaves. At each step, compute a node’s value from the values of all its children: take the maximum if it is A’s turn, or the minimum if it is B’s turn. The result at the root is the value of the tree.

Illustration of the minimax principle

Heuristic Evaluation Functions Problem: how do we evaluate positions where brute force is out of the question? Solution: use a heuristic static evaluation function to estimate the merit of a position when the final outcome has not yet been determined.

Example of a heuristic function Chess: the number of pieces of each type on the board, multiplied by the piece type’s relative value, summed for each color. Subtracting the weighted material of the black player from the weighted material of the white player gives the relative strength of the position.
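A minimal sketch of this material count, assuming the conventional piece values and a simplified board representation (a list of piece letters, uppercase for White) chosen just for illustration:

```python
# Conventional relative values; the king is not counted.
PIECE_VALUES = {'p': 1, 'n': 3, 'b': 3, 'r': 5, 'q': 9}

def material(board):
    """Weighted material balance: positive favours White."""
    score = 0
    for piece in board:
        value = PIECE_VALUES.get(piece.lower(), 0)
        # Uppercase pieces belong to White, lowercase to Black.
        score += value if piece.isupper() else -value
    return score

# White queen + pawn (10) vs. Black rook + two pawns (7).
print(material(['Q', 'P', 'r', 'p', 'p']))  # -> 3
```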

Heuristic Evaluation Functions A heuristic static evaluation function for a two-player game is a function from a state to a number. The goal of a two-player game is to reach a winning state; the number of moves required to get there is unimportant. Other features must be taken into account to arrive at an overall evaluation function.

Heuristic Evaluation Functions Given a heuristic static evaluation function, it is straightforward to write a program to play a game. From any given position, we simply generate all the legal moves, apply our static evaluator to the position resulting from each move, and then move to the position with the largest or smallest evaluation, depending on whether we are MAX or MIN.

Example - tic-tac-toe Behavior of the Evaluation Function First, detect whether the game is over: if X is the maximizer, the function should return ∞ if there are three X’s in a row and -∞ if there are three O’s in a row. Otherwise, count the number of different rows, columns, and diagonals still open to X, minus the number still open to O.

Example: First moves of tic-tac-toe [Opening moves evaluated as (lines open to X) - (lines open to O): corner 3 - 0 = 3, center 4 - 0 = 4, edge 2 - 0 = 2]

Example - tic-tac-toe Behavior of the Evaluation Function This algorithm is extremely efficient, requiring time that is only linear in the number of legal moves. Its drawback is that it considers only the immediate consequences of each move (it doesn’t look over the horizon).
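The one-ply player above can be sketched as follows. The board representation (a 9-element list of 'X', 'O' or None) and all function names are assumptions made for this sketch:

```python
import math

# The 8 winning lines: rows, columns, diagonals (cell indices 0-8).
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
         (0, 3, 6), (1, 4, 7), (2, 5, 8),
         (0, 4, 8), (2, 4, 6)]

def evaluate(board):
    """(lines open to X) - (lines open to O), with +/-inf for wins."""
    score = 0
    for a, b, c in LINES:
        cells = [board[a], board[b], board[c]]
        if cells == ['X'] * 3:
            return math.inf
        if cells == ['O'] * 3:
            return -math.inf
        if 'O' not in cells:
            score += 1  # line still open to X
        if 'X' not in cells:
            score -= 1  # line still open to O
    return score

def best_move_for_x(board):
    """One ply: evaluate every legal move, keep the largest."""
    moves = [i for i, cell in enumerate(board) if cell is None]
    def value(i):
        child = board[:]
        child[i] = 'X'
        return evaluate(child)
    return max(moves, key=value)

print(best_move_for_x([None] * 9))  # -> 4, the centre (4 - 0 = 4)
```

On an empty board this reproduces the values from the slide: corner 3, centre 4, edge 2, so the centre is chosen.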

Minimax Search Where does X go? [Tree of candidate X moves with backed-up values 1, -1, -1, -2 and frontier evaluations such as 4 - 3 = 1 and 4 - 2 = 2]

Minimax search Search as deeply as possible given the computational resources of the machine and the time constraints on the game. Evaluate the nodes at the search frontier by the heuristic function. Where MIN is to move, save the minimum of its children’s values; where MAX is to move, save the maximum of its children’s values. A move is made to a child of the root with the largest or smallest value, depending on whether MAX or MIN is moving.

Minimax search example [Minimax tree with alternating MAX and MIN levels; frontier values 4, 14, 13, 12, 11, 2, 10, 1, 9, 8, 7, 6, 3, 5]

Nash equilibrium Nash equilibrium: once an agreement has been reached, it is not worthwhile for any of the players to deviate from it, given that the other players do not deviate. Example - the marketplace. Agreement: no one sells a hot dog for less than $10. Is it worthwhile for me to reduce the price? No - they will burn my stand.

Example: prisoners’ dilemma Should a prisoner “rat” on his friend? Each prisoner chooses to stay quiet or to rat:

            quiet     rat
  quiet     1, 1      dead
  rat       dead      3, 3

(quiet, quiet) is no equilibrium; (rat, rat) is the equilibrium.

Nash equilibrium The values along the principal branch are in Nash equilibrium [the same minimax tree as before, with the principal branch highlighted]

Alpha-Beta Pruning By using alpha-beta pruning the minimax value of the root of a game tree can be determined without having to examine all the nodes.

Alpha-Beta Pruning Example [MAX/MIN tree with root value 4; leaf values 4, 5, 3, 6, 7, 1, 2; pruned interior nodes carry bounds such as >=6, <=2, <=3, <=1]

Alpha-Beta Deep pruning - the right half of the tree in the example. The next slide gives code for alpha-beta pruning: MAXIMIN assumes that its argument node is a maximizing node; MINIMAX assumes a minimizing node. V(N) is the heuristic static evaluation of node N.

MAXIMIN (node: N, lowerbound: alpha, upperbound: beta)
  IF N is at the search depth, RETURN V(N)
  FOR each child Ni of N
    value := MINIMAX(Ni, alpha, beta)
    IF value > alpha, alpha := value
    IF alpha >= beta, RETURN alpha
  RETURN alpha

MINIMAX (node: N, lowerbound: alpha, upperbound: beta)
  IF N is at the search depth, RETURN V(N)
  FOR each child Ni of N
    value := MAXIMIN(Ni, alpha, beta)
    IF value < beta, beta := value
    IF beta <= alpha, RETURN beta
  RETURN beta

Performance of Alpha-Beta Efficiency depends on the order in which nodes are encountered at the search frontier. Best case - effective branching factor b^(1/2), achieved if the largest child of a MAX node is generated first and the smallest child of a MIN node is generated first. Worst case - b, no better than plain minimax. Average case - b^(3/4), with random ordering.

Games with chance Chance nodes: nodes where chance events happen (rolling dice, flipping a coin, etc.). Evaluate the expected value by weighting each outcome by its probability: if C is a chance node, P(di) is the probability of roll di, and S(C, di) is the set of positions generated by applying all legal moves for roll di to C, then the value of C is the sum over i of P(di) times the best value in S(C, di).
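This expectiminimax rule can be sketched over explicit trees. The tagged-tuple representation below - ('max', children), ('min', children), ('chance', [(probability, child), ...]), or a leaf number - is an assumption made for this sketch:

```python
def expectiminimax(node):
    """Value of a game tree containing MAX, MIN and chance nodes."""
    if isinstance(node, (int, float)):
        return node
    kind, children = node
    if kind == 'max':
        return max(expectiminimax(c) for c in children)
    if kind == 'min':
        return min(expectiminimax(c) for c in children)
    # Chance node: probability-weighted average of its outcomes.
    return sum(p * expectiminimax(c) for p, c in children)

# A fair coin flip between a MIN node worth 2 and a MIN node worth 4.
tree = ('chance', [(0.5, ('min', [2, 5])),
                   (0.5, ('min', [4, 7]))])
print(expectiminimax(tree))  # -> 3.0
```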

Games with chance Backgammon board

Search tree with probabilities [MAX and MIN levels interleaved with chance nodes of probability 0.5; leaf values 2, 4, 0, -2, 2, 4, 7, 4, 6, 0, 5, -2 backed up to expected values 3 and -1]

Search tree with probabilities

Additional Enhancements A number of additional improvements have been developed to improve performance with limited computation. We briefly discuss the most important of these below.

Node Ordering By using node ordering we can get close to the optimal b^(1/2). Node ordering: instead of generating the tree left-to-right, we reorder the tree based on the static evaluations of the interior nodes. To save space, only the immediate children are reordered after the parent is fully expanded.

Iterative Deepening Another idea is to use iterative deepening. In two-player games played under time controls, when time runs out, the move recommended by the last completed iteration is made. It can be combined with node ordering to improve pruning efficiency: instead of the heuristic value, we can order nodes by their value from the previous iteration.

Quiescence A quiescence search makes a secondary search from any position whose value is unstable. This way a stable evaluation is obtained.

Transposition Tables For efficiency, it is important to detect when a state has already been searched. To detect searched states, previously generated game states, with their minimax values, are saved in a transposition table.
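A minimal sketch of the idea, illustrated on Nim, where many move orders reach the same pile size; the dictionary keyed by (state, player-to-move) plays the role of the transposition table. All names are illustrative:

```python
def nim_value(stones, maximizing, table=None):
    """Minimax value of Nim from MAX's viewpoint (+1 win, -1 loss)."""
    if table is None:
        table = {}
    key = (stones, maximizing)
    if key in table:                 # state already searched: reuse it
        return table[key]
    if stones == 0:                  # previous player took the last stone
        value = -1 if maximizing else 1
    else:
        children = [nim_value(stones - t, not maximizing, table)
                    for t in (1, 2) if t <= stones]
        value = max(children) if maximizing else min(children)
    table[key] = value               # save the value for transpositions
    return value

print(nim_value(5, True))  # -> 1: 5-stone Nim is a first-player win
```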

Opening Book Most board games start with the same initial state. A table of good initial moves, based on human expertise, is used; it is known as an opening book.

Endgame Databases A database of endgame positions, with their minimax values, is used. In checkers, the endgame database covers positions with eight or fewer pieces on the board. Endgame databases are calculated by a technique called retrograde analysis.

Special-Purpose Hardware The faster the machine, the deeper the search in the time available, and the better it plays. The best machines today are based on special-purpose hardware designed and built only to play chess.

Selective Search The fundamental reason that humans are competitive with computers is that they are very selective in their choice of positions to examine, unlike programs, which do full-width fixed-depth searches. Selective search: search only the “interesting” parts of the tree. Example - best-first minimax.

Best-First Minimax Given a partially expanded minimax tree, the backed-up minimax value of the root is determined by one of the leaf nodes, as is the value of every node on the path from the root to that leaf. This path is known as the principal variation, and the leaf as the principal leaf. In general, best-first minimax will generate an unbalanced tree and make different move decisions than full-width fixed-depth alpha-beta.

Best-First Minimax Search - Example [Animation over four slides: at each step the principal leaf is expanded and new frontier values are backed up, changing the root value from 6 to 4, then to 2, and finally to 5 with frontier values 1, 7, 8, 3]

Best-First Search Full-width search is good insurance against missing a move (and making a mistake). Most game programs that use selective searches use a combined algorithm that starts with a full-width search to a nominal depth, and then searches more selectively below that depth.