Introduction n One of the major problems of the algorithm schemas we have seen so far is that they take exponential time to find an optimal solution. n Therefore, these algorithms are usually used only for relatively small problems. n However, most often we do not need the optimal solution; a near-optimal solution is good enough for most “real” problems.
n A serious drawback of the above algorithms is that they must search all the way to a complete solution before committing to even the first move of the solution. n The reason is that an optimal first move cannot be guaranteed until the entire solution is found and shown to be at least as good as any other solution. Introduction
n Heuristic search in two-player games adopts an entirely different set of assumptions. n Example - chess: u actions are committed before their consequences are known u there is a limited amount of time per move u a move that has been made cannot be revoked. Two Player Games
Real-Time Single-Agent Search n Our goal is to apply the assumptions of two-player games to single-agent heuristic search. n So far we had to examine all of the available moves before committing to one, and whenever we backtracked, every move already tried was wasted effort that gained us no information.
Minimin Lookahead Search n Similar to the minimax search used in two-player games, we use an algorithm called minimin for the single problem-solving agent. n This algorithm always looks for the minimum-cost route to the goal by choosing the minimum-cost node at every level, because there is only one player, who makes all of the decisions.
n The search alternates between two modes: we do a minimin lookahead in a planning mode, and at the end of the search we execute the best move that was found. From the new state we repeat the lookahead search procedure. n There are a few ways to define the evaluation and the search horizon in this algorithm: u A* evaluation function: f(n) = g(n)+h(n) u Fixed-depth horizon: search to a fixed g(n) cost. u Fixed f(n) cost: search the frontier for the node of minimum cost. Minimin Lookahead Search
n If a goal state is encountered before the search horizon, then the path is terminated and a heuristic value of zero is assigned to the goal. n If a path ends in a non-goal dead end before the horizon is reached, then a heuristic value of infinity is assigned to the dead-end node, guaranteeing that the path will not be chosen. Minimin Lookahead Search
Branch-and-Bound Pruning n An obvious question is whether every frontier node must be examined to find one of minimum cost. n If we allow heuristic evaluations of interior nodes, then pruning is possible. By using an admissible f function, we can apply the branch-and-bound method, in order to reduce the number of nodes checked.
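The lookahead with pruning can be sketched roughly as follows. This is only an illustrative sketch, not the presentation's own code: the names `minimin`, `best_move`, `GRAPH` and `H` are hypothetical, and the pruning step assumes that f = g + h never decreases along a path (a consistent heuristic).

```python
import math

def minimin(state, depth, successors, h, is_goal, g=0, bound=math.inf):
    """Lowest f = g + h among frontier nodes within `depth` moves, never above `bound`."""
    if is_goal(state):
        return g                      # a goal gets heuristic value zero
    if depth == 0:
        return g + h(state)           # frontier node: static evaluation
    best = bound                      # a dead end leaves `best` unchanged
    for child, cost in successors(state):
        # Branch-and-bound: assumes f is non-decreasing along a path, so a
        # child whose f already reaches `best` cannot lead to anything better.
        if g + cost + h(child) >= best:
            continue
        best = min(best, minimin(child, depth - 1, successors, h, is_goal,
                                 g + cost, best))
    return best

def best_move(state, depth, successors, h, is_goal):
    """Pick the child with the lowest backed-up minimin value."""
    best_val, best_child = math.inf, None
    for child, cost in successors(state):
        val = minimin(child, depth - 1, successors, h, is_goal, cost, best_val)
        if val < best_val:
            best_val, best_child = val, child
    return best_child, best_val

# Hypothetical toy problem: unit-cost edges, goal state "d".
GRAPH = {"s": [("a", 1), ("b", 1)], "a": [("c", 1)], "b": [("d", 1)], "c": [], "d": []}
H = {"s": 2, "a": 3, "b": 1, "c": 4, "d": 0}

move, value = best_move("s", 2, GRAPH.__getitem__, H.__getitem__, lambda s: s == "d")
```

In this toy problem the first child backs up a value of 6, which then serves as the bound while the second child is searched.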
Efficiency of Branch and Bound [Figure: nodes generated per move (log scale, 10 to 1,000,000) versus search depth (10 to 50); the curves are labeled 8, 15, 24 and 99.]
n From the graph above we can see the advantage of branch-and-bound pruning versus brute-force minimin search. n For example: u with a budget of a million nodes per move in the 8 puzzle, brute-force search reaches a depth of 25 moves, whereas branch-and-bound reaches 35 moves, about 40% deeper! n We can also see that we get better results for the 15 puzzle than for the 8 puzzle. Efficiency of Branch and Bound
An Analytic Model n Is the former surprising result specific to the sliding-tile puzzles, or is it more general? n A model has been defined as a tree with uniform branching factor and depth, in which each edge is independently assigned a cost of 0 with probability p, and 1 otherwise. n This model represents the tile puzzle: in the tile puzzle, each movement of a tile either increases or decreases the h value by one.
n Since each move increases the g function by one, the f function either increases by 2 or doesn’t increase at all. n It has been proved that if the expected number of zero-cost edges below a node is less than one, finding the lowest-cost route takes exponential time, while if the expected number is more than one, the time is polynomial. An Analytic Model
n For example: u if the probability is 0.5-for a binary tree the expected number of zero cost edges is 2*0.5=1, whereas for a ternary tree the expected number of zero cost edges is 3*0.5=1.5 ! We see that a ternary tree can be searched more efficiently than a binary tree ! An Analytic Model
n But this model is not entirely accurate, for a number of reasons: u We can predict the results only up to a certain point. u The model applies only to a limited depth, because it assumes that the probability of a zero-cost edge is the same for all edges, whereas in the sliding-tile puzzle, beyond some depth, the probability is not the same for every node (the probability of a positive-cost edge increases). An Analytic Model
Real-Time-A* (RTA*) n So far, we only found a solution for one move at a time, not for a sequence of moves. n The initial idea would be to simply repeat the single-move procedure several times. But that leads to several problems.
Problems: u We might move to a node that has already been visited, and end up in an infinite loop. u If we don’t allow visits to previously visited nodes, we may reach a state all of whose neighbors have already been visited. u Because only limited information is known in each state, we want to allow backtracking, as long as we do not repeat the same moves from that state. Real-Time-A* (RTA*)
Solution: u We should allow backtracking only if the cost of returning to that point plus the estimated cost from there is less than the estimated cost from the current point. Real-Time-A* (RTA*) is an efficient algorithm for implementing this solution. Real-Time-A* (RTA*)
RTA* Algorithm n In RTA* the value of node n is f(n) = g(n)+h(n), as in A*. n The difference is that g(n) is computed differently: in RTA*, g(n) is the distance of node n from the current state, not from the initial state. n A straightforward implementation stores the nodes in an open list and, after each move, updates the g value of every node on the list relative to the new current state.
n With this implementation, the time to make a move is linear in the size of the open list. n It is not clear exactly how to update the g values. n It is not clear how to find the path to the next destination node chosen from the open list. n However, these problems can be solved in constant time per move! RTA* - The Drawbacks
RTA* - Example [Figure: example graph with nodes a, b, c, d, e, i, j, k, m and their heuristic values.]
n In the example, we start at node a and compute f(b)=1+1=2, f(c)=1+2=3, f(d)=1+3=4. n The problem solver moves to b because its f value is minimal, and updates h(a)=f(c)=3, the second-best value. Then it generates nodes e and i and computes f(e)=1+4=5, f(i)=1+5=6, f(a)=1+3=4. n The problem solver moves back to a and updates h(b)=f(e)=5, so that now f(b)=1+5=6, and so on. RTA* - Example
n As we can see, we won’t get into an infinite loop even though we do allow backtracking, since each time we gather more information and decide on the next move accordingly. n Note: RTA* does not require an accurate or admissible heuristic function and will find a solution in any case (though a good heuristic function will give better results). n The running time of RTA* is linear in the number of moves made, and so is the size of the hash table it stores. RTA* - Example
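The behavior described above can be sketched in a few lines. This is a minimal sketch assuming a lookahead of depth one; the function name `rta_star`, the graph `GRAPH` and the initial values `H0` are made up for illustration and are not the graph from the slide.

```python
import math

def rta_star(start, goal, neighbors, h0, max_moves=100):
    """Depth-one RTA*: move to the best neighbor, store the second-best f."""
    h = dict(h0)                      # hash table of updated heuristic values
    path = [start]
    current = start
    while current != goal and len(path) <= max_moves:
        # f(n) = g(n) + h(n), where g is measured from the *current* state.
        scored = sorted((cost + h[n], n) for n, cost in neighbors[current])
        _, best_n = scored[0]
        # The second-best f is what it would cost to come back and try
        # something else; infinity if there is no alternative.
        h[current] = scored[1][0] if len(scored) > 1 else math.inf
        current = best_n
        path.append(current)
    return path, h

# Hypothetical graph: "a" is a dead end that initially looks as good as "b".
GRAPH = {"s": [("a", 1), ("b", 1)], "a": [("s", 1)],
         "b": [("s", 1), ("g", 1)], "g": [("b", 1)]}
H0 = {"s": 2, "a": 1, "b": 1, "g": 0}

path, learned = rta_star("s", "g", GRAPH, H0)
```

Here the solver first steps into the dead end, backtracks because returning through s is now the only option, and then never goes back to a since its stored value has become infinite.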
Completeness of RTA* n RTA* is complete under the following conditions: u The problem space must be finite. u A goal must be reachable from every state. u There can’t be cycles in the graph with zero or negative cost. u The heuristic values returned must be finite.
Correctness of RTA* n RTA* makes decisions based on limited information, and therefore the quality of the decision it makes is the best relative to the part of the search space it has seen so far. n The nodes that need to be expanded by RTA* are similar to the open list in A*. n The main difference is in the definitions of g and h. n The completeness of RTA* can be proved by induction on the number of moves made.
Solution Quality vs. Computation n We should also consider the quality of the solution that is returned by RTA*. n This depends on the accuracy of the heuristic function and on the search depth. n A choice has to be made among families of heuristic functions: some are more accurate but more expensive to compute, while others are less accurate but cheaper to compute.
Learning-RTA* (LRTA*) n Until now, RTA* solved single-trial problems. n We would now like to improve the algorithm so that it also works well over multiple problem-solving trials.
n The algorithm is the same, except for one change that makes it suitable for the new setting: each time, the algorithm stores the best value of the heuristic function instead of the second-best value. Learning-RTA* (LRTA*)
Convergence of LRTA* n An important advantage of LRTA* is that, over repeated problem-solving trials, the heuristic values converge to the exact values! n This holds under the following conditions: u The initial and goal states are chosen randomly. u The initial heuristic values are admissible, i.e. they do not overestimate the distance to the nearest goal. u Ties are broken randomly; otherwise, once we find one optimal solution, we might keep finding the same one each time and never find the other paths to the goal.
Theorem 5.2: In a finite space with finite positive edge costs and non-overestimating initial heuristic values, in which a goal state is reachable from every state, over repeated trials of LRTA* the heuristic values will eventually converge to their exact values along every optimal path. Convergence of LRTA*
Conclusion n In large-scale real-time applications, we can’t use the standard single-agent heuristic search algorithms, because of their high cost and because they do not return a move before the whole solution has been found. n Minimin solves the problem in such cases. n Branch-and-bound pruning greatly improves the results given by minimin. n RTA* efficiently solves the problem of when to abandon the current path for a better-looking one.
n RTA* guarantees finding a solution. n RTA* makes locally optimal decisions. n The more lookahead is used, the higher the cost but the better the quality of the solution. n Families of heuristics trade off the accuracy of the solution against computational complexity. n The optimal level of lookahead depends on the relative costs of simulating vs. executing moves. n LRTA* handles repeated problem-solving trials while preserving completeness. Conclusion
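The convergence claim can be illustrated on a tiny example. The sketch below assumes a depth-one lookahead, all-zero (hence non-overestimating) initial values and a fixed start and goal for simplicity; `lrta_star_trial` and `CHAIN` are hypothetical names.

```python
import random

def lrta_star_trial(start, goal, neighbors, h, rng):
    """One LRTA* trial; `h` is updated in place and kept between trials."""
    current, moves = start, 0
    while current != goal:
        scored = [(cost + h[n], n) for n, cost in neighbors[current]]
        best_f = min(f for f, _ in scored)
        h[current] = best_f           # store the BEST value (RTA* stores the second best)
        # Break ties randomly, as the convergence conditions require.
        current = rng.choice([n for f, n in scored if f == best_f])
        moves += 1
    return moves

# Hypothetical state space: a chain s - a - b - g with unit edge costs.
CHAIN = {"s": [("a", 1)], "a": [("s", 1), ("b", 1)],
         "b": [("a", 1), ("g", 1)], "g": [("b", 1)]}
h = {state: 0 for state in CHAIN}     # zero never overestimates
rng = random.Random(0)
trial_lengths = [lrta_star_trial("s", "g", CHAIN, h, rng) for _ in range(5)]
```

After a few trials the stored values equal the exact distances to the goal (3, 2, 1, 0 along the chain) and stop changing.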
Heuristic from Relaxed Models n A heuristic function returns the exact cost of reaching a goal in a simplified or relaxed version of the original problem. n This means that we remove some of the constraints of the problem we are dealing with.
Heuristic from Relaxed Models - Example n Consider the problem of navigating in a network of roads from an initial location to a goal location. n A good heuristic is the straight-line distance between two points. n We remove the constraint of the original problem that we have to move along the roads, and assume that we are allowed to move in a straight line between two points. Thus we get a relaxation of the original problem.
Relaxation example - TSP problem n We can describe the problem as a graph with 3 constraints: 1 Our tour covers all the cities. 2 Every node has degree two: an edge entering the node and an edge leaving the node. 3 The graph is connected. n If we remove constraint 2: we get a spanning graph, and the optimal solution to this problem is an MST (minimum spanning tree). n If we remove constraint 3: the graph need not be connected, and the optimal solution to this problem is the solution to the assignment problem.
Relaxation example - Tile Puzzle problem n One of the constraints in this problem is that a tile can only slide into the position occupied by the blank. n If we remove this constraint, we allow any tile to be moved to any horizontally or vertically adjacent position. The exact cost of solving this relaxed problem is the sum, over all tiles, of each tile’s Manhattan distance to its goal location.
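As a concrete illustration, the Manhattan distance heuristic for a sliding-tile puzzle can be computed as below. The state encoding (a tuple listing the tile at each location in row-major order, with 0 for the blank) and the function name `manhattan` are assumptions made for this sketch.

```python
def manhattan(state, goal, width):
    """Sum over all tiles of |row difference| + |column difference| to the goal.

    `state` and `goal` list the tile at each location in row-major order;
    0 stands for the blank, which is not counted.
    """
    goal_index = {tile: i for i, tile in enumerate(goal)}
    total = 0
    for i, tile in enumerate(state):
        if tile == 0:
            continue
        j = goal_index[tile]
        total += abs(i // width - j // width) + abs(i % width - j % width)
    return total

GOAL = (0, 1, 2, 3, 4, 5, 6, 7, 8)             # Eight Puzzle goal, blank in a corner
one_move_away = (1, 0, 2, 3, 4, 5, 6, 7, 8)    # tile 1 is one step from home
far_corner = (8, 1, 2, 3, 4, 5, 6, 7, 0)       # tile 8 is in the opposite corner
```

Because each tile is treated independently, the value is the exact solution cost of the relaxed problem and therefore a lower bound on the cost in the real puzzle.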
The STRIPS Problem formulation n We would like to derive such heuristics automatically. n In order to do that we need a formal description language that is richer than the problem space graph. n One such language is called STRIPS. n In this language we have predicates and operators. n Let’s see a STRIPS representation of the Eight Puzzle Problem.
STRIPS - Eight Puzzle Example 1 On(x,y) = tile x is in location y. 2 Clear(z) = location z is clear. 3 Adj(y,z) = location y is adjacent to location z. 4 Move(x,y,z) = move tile x from location y to location z. In the language, each operator has: n A precondition list - for example, to execute Move(x,y,z) we must have On(x,y), Clear(z) and Adj(y,z). n An add list - predicates that were not true before the operator was executed and are true afterwards. n A delete list - a subset of the preconditions that are no longer true after the operator has been executed.
n Now, in order to construct a simplified or relaxed problem, we only have to remove some of the preconditions. n For example - by removing Clear(z) we allow tiles to move to any adjacent location, occupied or not. n In general, the hard part is to identify which relaxed problems have the property that their exact solution can be efficiently computed.
Admissibility and Consistency n The heuristics that are derived by this method are both admissible and consistent. n Admissibility means that the cost of the optimal solution in the simplified graph is less than or equal to the cost of the lowest-cost path in the original graph. n Note: the cost in the simplified graph should be as close as possible to the cost in the original graph. n Consistency means that for every neighbor n’ of n, h(n) ≤ c(n,n’)+h(n’), where h(n) is the actual optimal cost of reaching a goal in the graph of the relaxed problem.
n We begin by presenting an alternative derivation of the Manhattan distance heuristic for the sliding-tile puzzles. n Any description of this problem is likely to describe the goal state as a set of subgoals, where each subgoal is to correctly position an individual tile, ignoring the interaction with the other tiles. Heuristic from Multiple Subgoals
Enhancing the Manhattan distance n In the Manhattan distance, for each tile we looked for the optimal solution while ignoring the other tiles and counting only moves of the tile in question. n Therefore the heuristic function we get isn’t accurate. [Figure: sliding-tile puzzle board.]
n We can perform a single search for each tile, starting from its goal position, and record how many moves of the tile are required to move it to every other position. n Doing this for all tiles results in a table which gives, for every possible position of each tile, its Manhattan distance from its goal position. n Then, since each move moves one tile, for a given state we add the Manhattan distances of all tiles to get an admissible heuristic for the state. Enhancing the Manhattan distance
n However, this heuristic function isn’t accurate, since it ignores the interactions between the tiles. n The obvious next step is to repeat the process on all possible pairs of tiles. n In other words, for each pair of tiles, and each combination of positions they could occupy, perform a search to their goal positions, and count only moves of the two tiles of interest. We call this value the pairwise distance of the two tiles from their goal locations. Enhancing the Manhattan distance
n Of course the goal is to find the shortest path from the goal state to all possible positions of the two tiles, where only moves of the two tiles of interest are counted. n For almost all pairs of tiles and positions, their pairwise distances will equal the sum of their Manhattan distances from their goal positions. n However, there are three types of cases where the pairwise distance exceeds the combined Manhattan distance. Enhancing the Manhattan distance
Enhancing the Manhattan distance - the first case 1 Two tiles are in the same row or column but are reversed relative to their goal positions. To reach their goal positions, one tile must move up or down out of the row to enable the other one to pass, and then return to the row and continue to its place. [Figure: tiles 1 and 2 reversed in a row.] Cost relative to Manhattan: +2
Enhancing the Manhattan distance - the second case 2 The corners of the puzzle. n If the 3 tile is in its goal position, but some tile other than the 4 is in the 4 position, the 3 tile will have to move temporarily to correctly position the 4 tile. This requires two moves of the 3 tile, one to move it out of position, and another to move it back. Thus the pairwise distance of the two tiles will exceed the sum of their Manhattan distances by two moves. [Figure: tiles 3 and 4 near a corner.] Cost relative to Manhattan: +2
Enhancing the Manhattan distance - the third case 3 In the last moves of the solution. A detailed explanation is on the next slide. [Figure: tiles 1 and 5 next to the upper-left corner.] Cost relative to Manhattan: +2
n In the goal state the blank is in the upper-left corner, so before the last move either the 1 or the 5 tile must be in that corner. Thus, the last move must move either the 1 tile right, or the 5 tile down. n Since the Manhattan distance of these tiles is computed to their goal positions, unless the 1 tile is in the left-most column, its Manhattan distance will not accommodate a path through the upper-left corner. Similarly, unless the 5 tile is in the top row, its Manhattan distance will not accommodate a path through the upper-left corner. n Thus, if the 1 tile is not in the left-most column, and the 5 tile is not in the top row, we can add two moves to the sum of their Manhattan distances: the pairwise distance of the 1 and 5 tiles will be two moves greater than the sum of their Manhattan distances, unless the 1 tile starts in the left column or the 5 tile starts in the top row. Enhancing the Manhattan distance - the third case
n The states of these searches are distinguishable only by the positions of the two tiles and the blank, and hence there are O(n^3) different states, where n is the number of tiles. n Since there are C(n,2) = O(n^2) pairs of tiles, there are O(n^2) such searches to perform, for an overall time complexity of O(n^5). The size of the resulting table is O(n^4), one entry for each pair of tiles in each combination of positions. Enhancing the Manhattan distance
n The next question is how to automatically handle the interactions between these individual heuristics to compute an overall admissible heuristic estimate for a particular state. n We represent a state as a graph with a node for each tile and an edge between each pair of tiles, labeled with their pairwise distance. We need to select a set of edges, no two of which share a node, in such a way that the sum of the selected edges is maximized. This problem is called the maximum weighted matching problem, and can be solved in O(n^3) time, where n is the number of nodes. Applying the Heuristics
n Of course, the heuristic built from pairs is still not exact. n For example, for the case: [Figure: configuration of the tiles 1, 2 and 3 with two unspecified tiles marked X.] n Therefore, in order to get the full power of these heuristics, we need to extend the idea from pairs of tiles to triples, and so on. Higher-Order Heuristics
Pattern Databases n In the tile puzzles seen earlier, each legal move moves only one tile and therefore affects only one subgoal. n This enables us to add up the heuristic estimates for the individual tiles. n This is not true for all problems. n For example, in Rubik’s Cube each legal twist moves a large fraction of the individual cubies.
n The simple heuristic is a 3-dimensional Manhattan distance, where for every cubie we compute the minimum number of moves required to correctly position and orient it, and sum these values over all cubies. n Here we have to divide the sum by 8, since every move moves 8 cubies. n A better heuristic is as before, but with the sums computed separately for the edge cubies and for the corner cubies, taking the larger of the two resulting values. n For the edge cubies we divide the sum by 4, and for the corner cubies we also divide the sum by 4, since every move moves 4 edge cubies and 4 corner cubies. Pattern Databases
n We can compute the heuristic function by a table lookup, which is often more efficient since it saves time during execution of the program. n Tables used in this way are called pattern databases. n A pattern database stores the number of moves needed to solve different patterns of subsets of the puzzle elements. Pattern Databases
n For example, the Manhattan distance function is usually computed with the aid of a small table that contains the Manhattan distance for each cubie from all possible positions and orientations. n The idea can be developed much further. The 8 corner cubies can be in any of 8! permutations, and each cubie can be in 3 different orientations, but the orientation of the last cubie is determined by the other 7. n This results in 8!*3^7 = 88,179,840 different states. Pattern Databases
n We can use a breadth first search and record in a table the number of moves required to solve each combination of corner cubies (This table requires 42 megabytes). n During an IDA* search as each state is generated, a unique index into the heuristic table is computed, followed by a reference to the table. The stored value is the number of moves needed to solve just the corner cubies, and thus a lower bound on the number of moves needed to solve the entire puzzle. Pattern Databases
n We can improve the heuristic by considering the 12 edge cubies as well. The edge cubies can be in one of 12! permutations, and each can be in one of two different orientations, but the orientation of the last cubie is determined by the other 11. However, a table for all 12 requires too much memory. n Therefore we compute and store pattern databases for subsets of the edge cubies. n We can compute the possible combinations for 6 cubies (for 7 cubies it would take too much memory). The number of possible combinations for 6 of the 12 edge cubies is 12!/6! * 2^6 = 42,577,920. n Similarly, we can compute the corresponding heuristic table for the remaining 6 edge cubies. Pattern Databases
n The heuristic used for the experiments is the maximum of all 3 of these values: all 8 corner cubies, and 2 groups of 6 edge cubies each. n The total amount of memory for all 3 tables is 82 megabytes. n The total time to generate all 3 heuristic tables was about an hour. n Even though the resulting heuristic value is only slightly larger than the value for the corner cubies alone, it gives a significant performance improvement. n Given more memory we could compute and store even larger pattern databases. Pattern Databases
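The table sizes quoted above can be checked with a short calculation. The sketch below assumes 4 bits per table entry, which is not stated on the slides but is consistent with the quoted memory figures; the helper `table_mib` is hypothetical.

```python
from math import factorial

# 8 corner cubies: 8! permutations; 7 free orientations with 3 choices each.
corner_states = factorial(8) * 3**7

# 6 of the 12 edge cubies: 12!/6! placements; 2 orientations for each of the 6.
edge_states = factorial(12) // factorial(6) * 2**6

def table_mib(entries, bits_per_entry=4):
    """Table size in mebibytes (ASSUMPTION: 4 bits per entry)."""
    return entries * bits_per_entry / 8 / 2**20

corner_mib = table_mib(corner_states)          # about 42
edge_mib = table_mib(edge_states)              # about 20 for each edge table
total_mib = corner_mib + 2 * edge_mib          # about 82 for all three tables
```

Under this assumption the corner table alone accounts for roughly half of the total memory.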
Computer Chess A natural domain for studying AI n The game is well structured. n Perfect information game. n Early programmers and AI researchers were often amateur chess players as well.
Brief History of Computer Chess n Maelzel’s Chess Machine u 1769 - Chess automaton built by Baron Wolfgang von Kempelen of Austria. u Appeared to automatically move the pieces on a board on top of the machine, and played excellent chess. u The puzzle of how the machine played was solved in 1836 by Edgar Allan Poe.
Brief History of Computer Chess [Figure: Maelzel’s Chess Machine]
n Early 1950’s - First serious paper on computer chess was written by Claude Shannon. It described minimax search with a heuristic static evaluation function and anticipated the need for more selective search algorithms. n 1956 - Invention of alpha-beta pruning by John McCarthy. Used in early programs such as Samuel’s checkers player and Newell, Shaw and Simon’s chess program. Brief History of Computer Chess
n 1982 - Development of Belle by Condon and Thompson. Belle was the first machine whose hardware was specifically designed to play chess, in order to achieve speed and search depth. n 1997 - Deep Blue was the first machine to defeat the human world champion, Garry Kasparov, in a six-game match. Brief History of Computer Chess
Checkers n 1952 - Samuel developed a checkers program that learned its own evaluation function through self-play. n 1992 - Chinook (J. Schaeffer) wins the U.S. Open. At the world championship, Marion Tinsley beat Chinook.
Othello n Othello programs better than the best humans. n Large number of pieces change hands in each move. n Best Othello program today is Logistello (Michael Buro).
Backgammon n Unlike the above games, backgammon includes a roll of the dice, introducing a random element. n Best backgammon program: TD-Gammon (Gerry Tesauro). Comparable to the best human players today. n Learns an evaluation function using temporal-difference learning.
Card games n In addition to a random element, there is hidden information. n Best bridge program: GIB (M. Ginsberg). n Bridge programs are not competitive with the best human players. n Poker programs are even worse relative to their human counterparts. n Poker involves a strong psychological element when played by people.
Other games - Summary n The greater the branching factor the worse the performance. n Go - branching factor 361 very poor performance. Checkers - branching factor 4 - very good performance. n Backgammon - exception. Large branching factor still gets good results.
Brute-Force Search n We begin by considering a purely brute-force approach to game playing. n Clearly, this will only be feasible for small games, but it provides a basis for further discussion. n Example - 5-stone Nim u Played by 2 players with a pile of stones. u On each turn, a player removes one or two stones from the pile. u The player who removes the last stone wins the game.
Example - Game Tree for 5-Stone Nim [Figure: game tree rooted at 5 stones, with children 4 and 3, continuing down to 0 stones; OR nodes and AND nodes alternate by level.]
Minimax n Minimax theorem - Every two-person zero-sum game is a forced win for one player or a forced draw for either player, and in principle these optimal minimax strategies can be computed. n Performing this algorithm on tic-tac-toe results in the root being labeled a draw.
Heuristic Evaluation Functions n Problem: How to evaluate positions when brute force is out of the question? n Solution: Use a heuristic static evaluation function to estimate the merit of a position when the final outcome has not yet been determined.
Example of heuristic Function n Chess: u The number of pieces of each type on the board is multiplied by the relative value of that type and summed up for each color. By subtracting the weighted material of the black player from the weighted material of the white player we get the relative strength of the position for each player.
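Brute-force search of a game this small fits in a few lines. The sketch below assumes each move removes one or two stones, as the game tree for this example suggests; the function name `player_to_move_wins` is hypothetical.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def player_to_move_wins(stones):
    """True if the player about to move can force a win with `stones` left."""
    if stones == 0:
        return False   # the opponent just took the last stone and won
    # ASSUMPTION: a move removes one or two stones. A position is a win if
    # some move leaves the opponent in a losing position.
    return any(not player_to_move_wins(stones - take)
               for take in (1, 2) if take <= stones)

first_player_wins = player_to_move_wins(5)
```

The search labels 5-stone Nim a forced win for the first player (take two stones, leaving three); under these rules the losing positions for the player to move are the multiples of three.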
n A heuristic static evaluation function for a two player game is a function from a state to a number. n The goal of a two player game is to reach a winning state, but the number of moves required to get there is unimportant. n Other features must be taken into account to get to an overall evaluation function. Heuristic Evaluation Functions
n Given a heuristic static evaluation function, it is straightforward to write a program to play a game. n From any given position, we simply generate all the legal moves, apply our static evaluator to the position resulting from each move, and then move to the position with the largest or smallest evaluation, depending on whether we are MAX or MIN. Heuristic Evaluation Functions
Example - tic-tac-toe Behavior of Evaluation Function n Detect if the game is over. n If X is the maximizer, the function should return ∞ if there are three X’s in a row and -∞ if there are three O’s in a row. n Otherwise, count the number of different rows, columns, and diagonals occupied only by X, and subtract the number occupied only by O.
Example: First moves of tic-tac-toe [Figure: the three distinct first moves for X and their evaluations: corner 3-0 = 3, center 4-0 = 4, edge 2-0 = 2.]
n This algorithm is extremely efficient, requiring time that is only linear in the number of legal moves. n Its drawback is that it only considers the immediate consequences of each move (it doesn’t look beyond the horizon). Example - tic-tac-toe Behavior of Evaluation Function
Minimax Search - Where does X go? [Figure: two-ply tic-tac-toe search tree; visible leaf evaluations include 4-3 = 1 and 4-2 = 2.]
Minimax search 1. Search as deeply as possible given the computational resources of the machine and the time constraints on the game. 2. Evaluate the nodes at the search frontier with the heuristic function. 3. Where MIN is to move, save the minimum of its children’s values. Where MAX is to move, save the maximum of its children’s values. 4. A move is made to a child of the root with the largest or smallest value, depending on whether MAX or MIN is moving.
Minimax search example [Figure: minimax tree with alternating MAX and MIN levels and backed-up values.]
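A static evaluator of this kind might look as follows. This is a sketch under stated assumptions: the board is encoded as a 9-character string in row-major order, X is the maximizer, and the score is the number of lines occupied only by X minus the number occupied only by O; the names `LINES` and `evaluate` are made up for the example.

```python
import math

# The 8 winning lines: 3 rows, 3 columns, 2 diagonals (row-major indices).
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
         (0, 3, 6), (1, 4, 7), (2, 5, 8),
         (0, 4, 8), (2, 4, 6)]

def evaluate(board):
    """Static evaluation of a 9-character board of 'X', 'O' and ' ' (X is MAX)."""
    x_lines = o_lines = 0
    for line in LINES:
        cells = [board[i] for i in line]
        if cells == ["X"] * 3:
            return math.inf           # X has won
        if cells == ["O"] * 3:
            return -math.inf          # O has won
        if "X" in cells and "O" not in cells:
            x_lines += 1
        if "O" in cells and "X" not in cells:
            o_lines += 1
    return x_lines - o_lines

corner = "X" + " " * 8
center = " " * 4 + "X" + " " * 4
edge = " X" + " " * 7
```

A one-ply player would apply `evaluate` to the position after each legal move and pick the largest value, which for the first move is the center square.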
Alpha-Beta Pruning n By using alpha-beta pruning the minimax value of the root of a game tree can be determined without having to examine all the nodes.
Alpha-Beta Pruning Example [Figure: game tree with nodes a through r on alternating MAX and MIN levels; bounds such as <=2, <=1, <=3 and >=6 mark where subtrees are pruned.]
Alpha-Beta n Deep pruning - right half of the tree in the example. n The next slide gives code for alpha-beta pruning: u MAXIMIN - assumes that its argument node is a maximizing node. u MINIMAX - the same, but assumes that its argument node is a minimizing node. u V(N) - heuristic static evaluation of node N.
MAXIMIN (node: N, lowerbound: alpha, upperbound: beta)
  IF N is at the search depth, RETURN V(N)
  FOR each child Ni of N
    value = MINIMAX(Ni, alpha, beta)
    IF value > alpha, alpha := value
    IF alpha >= beta, RETURN alpha
  RETURN alpha
MINIMAX (node: N, lowerbound: alpha, upperbound: beta)
  IF N is at the search depth, RETURN V(N)
  FOR each child Ni of N
    value = MAXIMIN(Ni, alpha, beta)
    IF value < beta, beta := value
    IF beta <= alpha, RETURN beta
  RETURN beta
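The two mutually recursive routines can be folded into one function with a flag for whose turn it is. The following is an illustrative sketch rather than the slide's code: the game tree is a nested list whose leaves are static values, and `alphabeta`, `TREE`, `children` and `evaluate` are hypothetical names.

```python
import math

def alphabeta(node, depth, alpha, beta, maximizing, children, evaluate):
    """Minimax value of `node` within the (alpha, beta) window."""
    kids = children(node)
    if depth == 0 or not kids:
        return evaluate(node)
    if maximizing:
        for child in kids:
            alpha = max(alpha, alphabeta(child, depth - 1, alpha, beta,
                                         False, children, evaluate))
            if alpha >= beta:
                break                 # cutoff: MIN above will never allow this
        return alpha
    for child in kids:
        beta = min(beta, alphabeta(child, depth - 1, alpha, beta,
                                   True, children, evaluate))
        if beta <= alpha:
            break                     # cutoff: MAX above already has better
    return beta

# Hypothetical two-ply tree: MAX at the root, MIN below, leaves are values.
TREE = [[3, 5], [2, 9], [0, 1]]
visited = []                          # leaves actually evaluated, in order

def children(node):
    return node if isinstance(node, list) else []

def evaluate(leaf):
    visited.append(leaf)
    return leaf

value = alphabeta(TREE, 2, -math.inf, math.inf, True, children, evaluate)
```

On this tree the root value is 3, and the leaves 9 and 1 are never evaluated because their MIN parents are already known to be worth no more than 2 and 0.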
Performance of Alpha-Beta n Efficiency depends on the order in which the nodes are encountered at the search frontier. n Best case - effective branching factor b^(1/2) - if the largest child of a MAX node is generated first, and the smallest child of a MIN node is generated first. n Worst case - b. n Average case - b^(3/4) - random ordering.
Additional Enhancements n A number of additional improvements have been developed to improve performance with limited computation. n We briefly discuss the most important of these below.
Node Ordering n By using node ordering we can get close to b^(1/2). n With node ordering, instead of generating the tree left-to-right, we reorder the tree based on the static evaluations of the interior nodes. n To save space, only the immediate children are reordered after the parent is fully expanded.
Iterative Deepening n Another idea is to use iterative deepening. In two-player games played under a time limit, when time runs out, the move recommended by the last completed iteration is made. n It can be combined with node ordering to improve pruning efficiency: instead of using the heuristic value, we can order nodes by their value from the previous iteration.
Quiescence n Quiescence search performs a secondary search from a position whose value is unstable. n In this way a stable evaluation is obtained.
Transposition Tables n For efficiency, it is important to detect when a state has already been searched. n In order to detect a previously searched state, generated game states are saved, with their minimax values, in a transposition table.
Opening Book n Most board games start with the same initial state. n A table of good initial moves, based on human expertise, is used; it is known as an opening book.
Endgame Databases n A database of endgame positions, with their minimax values, is used. n In checkers, endgame databases exist for positions with eight or fewer pieces on the board. n The technique for calculating endgame databases is called retrograde analysis.
Special Purpose Hardware n The faster the machine, the deeper the search in the time available and the better it plays. n The best machines today are based on special-purpose hardware designed and built only to play chess.
Selective Search n The fundamental reason that humans are competitive with computers is that they are very selective in their choice of positions to examine, unlike programs, which do full-width fixed-depth searches. n Selective search: search only the “interesting” part of the tree. n Example - Best-first minimax.
Best First Minimax n Given a partially expanded minimax tree, the backed-up minimax value of the root is determined by one of the leaf nodes, as is the value of every node on the path from the root to that leaf. n This path is known as the principal variation, and the leaf is known as the principal leaf. n In general, best-first minimax will generate an unbalanced tree, and make different move decisions than full-width fixed-depth alpha-beta.
Best First minimax search - Example [Figures: four successive snapshots of a best-first minimax tree; in each of the first three the principal leaf is marked “expand it”, and the next snapshot shows the tree after that expansion.]
Best First search n Full-width search is good insurance against missing a move (and making a mistake). n Most game programs that use selective search use a combined algorithm that starts with a full-width search to a nominal depth, and then searches more selectively below that depth.