
1 Applications of informed search
- optimization problems: Algorithm A, admissibility, A*
- zero-sum game playing: minimax principle, alpha-beta pruning
Dave Reed

2 Optimization problems
consider a related search problem: instead of finding the shortest path (i.e., fewest moves) to a solution, suppose we want to minimize some cost
EXAMPLE: airline travel problem
- could associate costs with each flight, try to find the cheapest route
- could associate distances with each flight, try to find the shortest route
we could use a strategy similar to breadth first search
- repeatedly extend the minimal-cost path
- search is guided by the cost of the path so far
but such a strategy ignores heuristic information
- would like to utilize a best first approach, but it is not directly applicable
- best first search is guided by the (estimated) remaining cost of the path
IDEAL: combine the intelligence of both strategies
- cost-so-far component of breadth first search (to optimize actual cost)
- cost-remaining component of best first search (to make use of heuristics)

3 Algorithm A
associate 2 costs with a path:
- g: actual cost of the path so far
- h: heuristic estimate of the remaining cost to the goal*
- f = g + h: combined heuristic cost estimate
*note: the heuristic value is inverted relative to best first
Algorithm A: best first search using f as the heuristic

4 Travel problem revisited
g: cost is the actual distance of each flight
h: cost estimate is the crow-flies distance to the goal

%% h(Loc, Goal, Value) : Value is crow-flies
%% distance from Loc to Goal
%%%%%%%%%%%%%%%%%%%%%%%%
h(loc(omaha), loc(los_angeles), 1700).
h(loc(chicago), loc(los_angeles), 2200).
h(loc(denver), loc(los_angeles), 1400).
h(loc(los_angeles), loc(los_angeles), 0).
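To see how f combines the two costs, here is a small worked query (a sketch only; it uses the h/3 facts above together with the move/3 facts listed with the travel example two slides below), extending loc(omaha) one step toward loc(los_angeles):

?- move(loc(omaha), Next, G),       % g: actual distance of the flight taken
   h(Next, loc(los_angeles), H),    % h: crow-flies estimate of what remains
   F is G + H.                      % f = g + h
Next = loc(chicago), G = 500, H = 2200, F = 2700 ;
Next = loc(denver), G = 600, H = 1400, F = 2000.

Algorithm A always extends the path with the smallest f, so the Denver route (f = 2000) is expanded before the Chicago route (f = 2700), which matches the first answer on the travel example slide.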

5 Algorithm A implementation
differences from best first search:
- two values are associated with each path (F is the total estimated cost, G is the actual cost so far), so each agenda entry has the form F:G:Path
- since extend needs to know the actual cost of the current path, G must be passed to it
- new feature of bagof: variables that appear only in the goal (the 2nd argument), here Cost and H, must be existentially quantified with ^, otherwise bagof produces a separate bag for each of their bindings

%% Algorithm A (for trees)
%%
%% atree(Current, Goal, Path): Path is a list of states
%% that lead from Current to Goal with no duplicate states.
%%
%% atree_help(ListOfPaths, Goal, Path): Path is a list of
%% states (with associated F and G values) that lead from one
%% of the paths in ListOfPaths (a list of lists) to Goal with
%% no duplicate states.
%%
%% extend(G:Path, Goal, ListOfPaths): ListOfPaths is the list
%% of all possible paths (with associated F and G values) obtainable
%% by extending Path (at the head) with no duplicate states.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

atree(State, Goal, G:Path) :-
    atree_help([0:0:[State]], Goal, G:RevPath),
    reverse(RevPath, Path).

atree_help([_:G:[Goal|Path]|_], Goal, G:[Goal|Path]).
atree_help([_:G:Path|RestPaths], Goal, SolnPath) :-
    extend(G:Path, Goal, NewPaths),
    append(RestPaths, NewPaths, TotalPaths),
    sort(TotalPaths, SortedPaths),
    atree_help(SortedPaths, Goal, SolnPath).

extend(G:[State|Path], Goal, NewPaths) :-
    bagof(NewF:NewG:[NextState,State|Path],
          Cost^H^( move(State, NextState, Cost),
                   not(member(NextState, [State|Path])),
                   h(NextState, Goal, H),
                   NewG is G+Cost,
                   NewF is NewG+H ),
          NewPaths), !.
extend(_, _, []).

6 Travel example
?- atree(loc(omaha), loc(los_angeles), Path).
Path = 2000:[loc(omaha), loc(denver), loc(los_angeles)] ;
Path = 2700:[loc(omaha), loc(chicago), loc(los_angeles)] ;
Path = 2900:[loc(omaha), loc(chicago), loc(denver), loc(los_angeles)] ;
No

note: Algorithm A finds the path with least cost (here, distance), not necessarily the path with fewest steps
- suppose the flight from Chicago to L.A. were 2500 miles (instead of 2200): Omaha → Chicago → Denver → LA (500 + 1000 + 1400 = 2900 miles) would then be shorter than Omaha → Chicago → LA (500 + 2500 = 3000 miles)

%% travelcost.pro Dave Reed 3/15/02
%%%%%%%%%%%%%%%%%%%%%%%%%
move(loc(omaha), loc(chicago), 500).
move(loc(omaha), loc(denver), 600).
move(loc(chicago), loc(denver), 1000).
move(loc(chicago), loc(los_angeles), 2200).
move(loc(chicago), loc(omaha), 500).
move(loc(denver), loc(los_angeles), 1400).
move(loc(denver), loc(omaha), 600).
move(loc(los_angeles), loc(chicago), 2200).
move(loc(los_angeles), loc(denver), 1400).

%% h(Loc, Goal, Value) : Value is crow-flies
%% distance from Loc to Goal
%%%%%%%%%%%%%%%%%%%%%%%%
h(loc(omaha), loc(los_angeles), 1700).
h(loc(chicago), loc(los_angeles), 2200).
h(loc(denver), loc(los_angeles), 1400).
h(loc(los_angeles), loc(los_angeles), 0).

7 8-puzzle example
g: actual cost of each move is 1
h: remaining cost estimate is # of tiles out of place

?- atree(tiles([1,2,3],[8,6,space],[7,5,4]), tiles([1,2,3],[8,space,4],[7,6,5]), Path).
Path = 3:[tiles([1, 2, 3], [8, 6, space], [7, 5, 4]),
          tiles([1, 2, 3], [8, 6, 4], [7, 5, space]),
          tiles([1, 2, 3], [8, 6, 4], [7, space, 5]),
          tiles([1, 2, 3], [8, space, 4], [7, 6, 5])]
Yes

?- atree(tiles([2,3,4],[1,8,space],[7,6,5]), tiles([1,2,3],[8,space,4],[7,6,5]), Path).
Path = 5:[tiles([2, 3, 4], [1, 8, space], [7, 6, 5]),
          tiles([2, 3, space], [1, 8, 4], [7, 6, 5]),
          tiles([2, space, 3], [1, 8, 4], [7, 6, 5]),
          tiles([space, 2, 3], [1, 8, 4], [7, 6, 5]),
          tiles([1, 2|...], [space, 8|...], [7, 6|...]),
          tiles([1|...], [8|...], [7|...])]
Yes

here, Algorithm A finds the same paths as best first search
- not surprising, since g is trivial (every move costs 1)
- still, this is not guaranteed to be the case

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%% h(Board, Goal, Value) : Value is the number of tiles out of place
h(tiles(Row1,Row2,Row3), tiles(Goal1,Goal2,Goal3), Value) :-
    diff_count(Row1, Goal1, V1),
    diff_count(Row2, Goal2, V2),
    diff_count(Row3, Goal3, V3),
    Value is V1 + V2 + V3.

diff_count([], [], 0).
diff_count([H1|T1], [H2|T2], Count) :-
    diff_count(T1, T2, TCount),
    ( H1 = H2, Count is TCount
    ; H1 \= H2, Count is TCount+1 ).

8 Algorithm A vs. hill-climbing
if the cost estimate function h is perfect, then f is a perfect heuristic
- Algorithm A is deterministic
if we know the actual costs for each state, Alg. A reduces to hill-climbing

9 Admissibility
in general, actual costs are unknown at the start – must rely on heuristics
if the heuristic is imperfect, Alg. A is NOT guaranteed to find an optimal solution
if a control strategy is guaranteed to find an optimal solution (when a solution exists), we say it is admissible
if the cost estimate h never overestimates the actual cost, then Alg. A is admissible
(when admissible, Alg. A is commonly referred to as Alg. A*)

10 Admissible examples
is our heuristic for the travel problem admissible?
- h(State, Goal) = crow-flies distance from State to Goal
is our heuristic for the 8-puzzle admissible?
- h(State, Goal) = number of tiles out of place, including the space
is our heuristic for Missionaries & Cannibals admissible?
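For the travel problem, a quick spot-check (a sketch; check_admissible is a hypothetical helper, not part of the slides' code) compares each city's crow-flies estimate against the cheapest actual cost that atree finds:

%% does h ever overestimate the cheapest actual cost to Goal?
check_admissible(Goal) :-
    forall(h(Loc, Goal, H),
           ( atree(Loc, Goal, G:_) -> H =< G   % first answer returned is the cheapest here
           ; true )).                          % no route: nothing to check

?- check_admissible(loc(los_angeles)).
Yes

Since the crow-flies distance can never exceed the distance actually flown, the travel heuristic passes the check.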

11 Cost of the search
the closer h is to the actual cost function, the fewer states considered
- however, the cost of computing h tends to go up as it improves
also, admissibility is not always needed or desired
- Graceful Decay of Admissibility: if h rarely overestimates the actual cost by more than D, then Alg. A will rarely find a solution whose cost exceeds optimal by more than D
the best algorithm is one that minimizes the total cost: the cost of performing the search plus the cost of the solution found

12 Algorithm A (for graphs)
representing the search space as a tree is conceptually simpler
- but must store entire paths, which include duplicates of states
more efficient to store the search space as a graph
- store states w/o duplicates; need only remember the parent of each state (so that the solution path can be reconstructed at the end)
IDEA: keep 2 lists of states (along with f & g values and a parent pointer)
- OPEN: states reached by the search, but not yet expanded
- CLOSED: states reached and already expanded
while more efficient, the graph implementation is trickier
- when a state moves to the CLOSED list, it may not be finished
- may have to revise values of states if new (better) paths are found (see the sketch below)
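A minimal sketch of that last point (the node(State, F, G, Parent) record and the predicate name update_closed are illustrative assumptions, not the slides' graph code): if a newly generated path reaches a state already on CLOSED with a smaller g, the stale record is removed and the state is placed back on OPEN with its revised values.

%% update_closed(Old, New, Open0, Closed0, Open, Closed):
%% Old is the record already on CLOSED, New is a freshly generated
%% record for the same state.
update_closed(node(S, _, OldG, _), node(S, NewF, NewG, NewParent),
              Open0, Closed0, Open, Closed) :-
    NewG < OldG, !,                                  % better path found
    select(node(S, _, _, _), Closed0, Closed),       % drop the stale record
    Open = [node(S, NewF, NewG, NewParent)|Open0].   % re-open for expansion
update_closed(_, _, Open, Closed, Open, Closed).     % otherwise keep everything as is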

13 Flashlight example
consider the flashlight puzzle discussed in class:

Four people are on one side of a bridge. They wish to cross to the other side, but the bridge can only take the weight of two people at a time. It is dark and they only have one flashlight, so they must share it in order to cross the bridge. Assuming each person moves at a different speed (able to cross in 1, 2, 5 and 10 minutes, respectively), find a series of crossings that gets all four across in the minimal amount of time.

state representation? cost of a move? heuristic?

14 Flashlight implementation
state representation must identify the locations of each person and the flashlight:
bridge(SetOfPeopleOnLeft, SetOfPeopleOnRight, FlashlightLocation)
note: can use a list to represent a set, but must be careful of permutations
- e.g., [1,2,5,10] = [1,5,2,10] as sets, so must make sure there is only one list representation per set
- solution: maintain the lists in sorted order, so only one permutation is possible
only 3 possible moves:
1. if the flashlight is on left and only 1 person on left, then move that person to the right (cost is time it takes for that person)
2. if flashlight is on left and at least 2 people on left, then select 2 people from left and move them to right (cost is max time of the two)
3. if the flashlight is on right, then select a person from right and move them to left (cost is time for that person)
heuristic: h(State, Goal) = number of people in the wrong place

15 Flashlight code
%% flashlight.pro Dave Reed 3/15/02
%%
%% This file contains the state space definition
%% for the flashlight puzzle.
%%
%% bridge(Left, Right, Loc): Left is a (sorted) list of people
%% on the left shore, Right is a (sorted) list of people on
%% the right shore, and Loc is the location of the flashlight.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

move(bridge([P], Right, left), bridge([], NewRight, right), P) :-
    merge([P], Right, NewRight).
move(bridge(Left, Right, left), bridge(NewLeft, NewRight, right), Cost) :-
    length(Left, L), L >= 2,
    select(P1, Left, Remain),
    select(P2, Remain, NewLeft),
    merge([P1], Right, TempRight),
    merge([P2], TempRight, NewRight),
    Cost is max(P1, P2).
move(bridge(Left, Right, right), bridge(NewLeft, NewRight, left), P) :-
    select(P, Right, NewRight),
    merge([P], Left, NewLeft).

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%% h(State, Goal, Value) : Value is # of people out of place
h(bridge(Left, Right, _), bridge(GoalLeft, GoalRight, _), Value) :-
    diff_count(Left, GoalLeft, V1),
    diff_count(Right, GoalRight, V2),
    Value is V1 + V2.

diff_count([], _, 0).
diff_count([H|T], L, Count) :-
    diff_count(T, L, TCount),
    ( member(H,L), !, Count is TCount
    ; Count is TCount+1 ).

library predicates used:
- merge(L1, L2, L3): L3 is the result of merging sorted lists L1 & L2
- select(X, L, R): R is the remains after removing X from list L

16 Flashlight answers
?- atree(bridge([1,2,5,10],[],left), bridge([],[1,2,5,10],right), Path).
Path = 17:[bridge([1, 2, 5, 10], [], left), bridge([5, 10], [1, 2], right),
           bridge([1, 5, 10], [2], left), bridge([1], [2, 5, 10], right),
           bridge([1, 2], [5, 10], left), bridge([], [1|...], right)] ;
Path = 17:[bridge([1, 2, 5, 10], [], left), bridge([5, 10], [1, 2], right),
           bridge([2, 5, 10], [1], left), bridge([2], [1, 5, 10], right),
           bridge([1, 2], [5, 10], left), bridge([], [1|...], right)] ;
Path = 19:[bridge([1, 2, 5, 10], [], left), bridge([2, 5], [1, 10], right),
           bridge([1, 2, 5], [10], left), bridge([2], [1, 5, 10], right),
           bridge([1, 2], [5, 10], left), bridge([], [1|...], right)]
Yes

Algorithm A finds optimal solutions to the puzzle
- note: more than one solution is optimal
- can use ';' to enumerate solutions, from best to worst

17 Search in game playing
consider games involving:
- 2 players
- perfect information
- zero-sum (player's gain is opponent's loss)
examples: tic-tac-toe, checkers, chess, othello, …
non-examples: poker, backgammon, prisoner's dilemma, …
von Neumann (the father of game theory) showed that for such games, there is always a "rational" strategy
- that is, can always determine a best move, assuming the opponent is equally rational

18 Game trees
idea: model the game as a search tree
- associate a value with each game state (possible since zero-sum)
  player 1 wants to maximize the state value (call him/her MAX)
  player 2 wants to minimize the state value (call him/her MIN)
- players alternate turns, so differentiate MAX and MIN levels in the tree
the leaves of the tree will be end-of-game states

19 Minimax search
can visualize the search bottom-up (start at leaves, work up to root)
likewise, can search top-down using recursion
minimax search:
- at a MAX level, take the maximum of all possible moves
- at a MIN level, take the minimum of all possible moves
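A minimal Prolog sketch of this recursion (assuming two illustrative predicates that are not part of these slides: moves(State, Successors) enumerates the states reachable in one move, and value(State, V) gives the value of an end-of-game state):

%% minimax(State, Player, Value): Value is the backed-up value of State
%% when it is Player's (max or min) turn to move.
minimax(State, _Player, V) :-
    value(State, V), !.                             % leaf: end-of-game value
minimax(State, max, V) :-
    moves(State, Succs),
    findall(SV, (member(S, Succs), minimax(S, min, SV)), Vs),
    max_list(Vs, V).                                % MAX level: take the maximum
minimax(State, min, V) :-
    moves(State, Succs),
    findall(SV, (member(S, Succs), minimax(S, max, SV)), Vs),
    min_list(Vs, V).                                % MIN level: take the minimum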

20 Minimax example

21 In-class exercise

22 Minimax in practice
while the Minimax Principle holds for all 2-party, perfect information, zero-sum games, an exhaustive search to find the best move may be infeasible
EXAMPLE: in an average chess game, ~100 moves with ~35 options/move
- ~35^100 states in the search tree!
practical alternative: limit the search depth and use heuristics (see the sketch below)
- expand the search tree a limited number of levels (limited look-ahead)
- evaluate the "pseudo-leaves" using a heuristic: high value → good for MAX, low value → good for MIN
- back up the heuristic estimates to determine the best-looking move: at a MAX level, take the maximum; at a MIN level, take the minimum
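A hedged sketch of this limited look-ahead (the only changes from the earlier minimax sketch are a depth counter and a call to an assumed evaluation predicate heuristic/2, e.g. the tic-tac-toe evaluation on the next slide, at the pseudo-leaves):

minimax(State, _Player, _Depth, V) :-
    value(State, V), !.                             % true end-of-game state
minimax(State, _Player, 0, V) :-
    !, heuristic(State, V).                         % depth limit reached: pseudo-leaf
minimax(State, max, Depth, V) :-
    D1 is Depth - 1,
    moves(State, Succs),
    findall(SV, (member(S, Succs), minimax(S, min, D1, SV)), Vs),
    max_list(Vs, V).                                % MAX level: take the maximum
minimax(State, min, Depth, V) :-
    D1 is Depth - 1,
    moves(State, Succs),
    findall(SV, (member(S, Succs), minimax(S, max, D1, SV)), Vs),
    min_list(Vs, V).                                % MIN level: take the minimum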

23 Tic-tac-toe example
suppose a look-ahead of 2 moves

heuristic(State) =
  1000    if win for MAX (X)
  -1000   if win for MIN (O)
  (# rows/cols/diags open for MAX) – (# rows/cols/diags open for MIN)   otherwise
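A hedged Prolog sketch of this evaluation function (the 9-element board list with cells x, o, and e for empty is an assumed representation, not one from the slides):

%% line(Cells, Board): Cells are the contents of one row, column, or diagonal.
line([A,B,C], Board) :-
    member([I,J,K], [[1,2,3],[4,5,6],[7,8,9],      % rows
                     [1,4,7],[2,5,8],[3,6,9],      % columns
                     [1,5,9],[3,5,7]]),            % diagonals
    nth1(I, Board, A), nth1(J, Board, B), nth1(K, Board, C).

heuristic(Board, 1000)  :- line([x,x,x], Board), !.      % win for MAX (X)
heuristic(Board, -1000) :- line([o,o,o], Board), !.      % win for MIN (O)
heuristic(Board, V) :-
    findall(l, (line(L, Board), \+ member(o, L)), OpenX),   % lines still open for MAX
    findall(l, (line(L, Board), \+ member(x, L)), OpenO),   % lines still open for MIN
    length(OpenX, NX), length(OpenO, NO),
    V is NX - NO.

For example, an opening move by X in the center leaves 8 lines open for MAX and 4 for MIN:

?- heuristic([e,e,e, e,x,e, e,e,e], V).
V = 4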

24 α-β bounds
sometimes, it isn't necessary to search the entire tree
α-β technique: associate bounds with states in the search
- associate a lower bound α with MAX nodes: α can only increase
- associate an upper bound β with MIN nodes: β can only decrease

25 α-β pruning
discontinue search below a MIN node if its β value <= the α value of an ancestor
discontinue search below a MAX node if its α value >= the β value of an ancestor
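A minimal alpha-beta sketch in the same style as the minimax sketches above (again assuming the illustrative moves/2 and value/2 predicates; this is not the slides' code):

%% alphabeta(State, Player, Alpha, Beta, Value): backed-up value of State,
%% searching only moves whose value could still fall between Alpha and Beta.
alphabeta(State, _Player, _Alpha, _Beta, V) :-
    value(State, V), !.                         % end-of-game state
alphabeta(State, max, Alpha, Beta, V) :-
    moves(State, Succs),
    ab_max(Succs, Alpha, Beta, V).
alphabeta(State, min, Alpha, Beta, V) :-
    moves(State, Succs),
    ab_min(Succs, Alpha, Beta, V).

ab_max([], Alpha, _Beta, Alpha).                % no moves left: best value found so far
ab_max([S|Ss], Alpha, Beta, V) :-
    alphabeta(S, min, Alpha, Beta, SV),
    Alpha1 is max(Alpha, SV),                   % lower bound can only increase
    ( Alpha1 >= Beta -> V = Alpha1              % alpha >= beta: prune remaining moves
    ; ab_max(Ss, Alpha1, Beta, V) ).

ab_min([], _Alpha, Beta, Beta).
ab_min([S|Ss], Alpha, Beta, V) :-
    alphabeta(S, max, Alpha, Beta, SV),
    Beta1 is min(Beta, SV),                     % upper bound can only decrease
    ( Beta1 =< Alpha -> V = Beta1               % beta <= alpha: prune remaining moves
    ; ab_min(Ss, Alpha, Beta1, V) ).

Called with bounds wider than any possible state value (e.g., alphabeta(Start, max, -10000, 10000, V) for the ±1000 tic-tac-toe evaluation), this returns the same root value as plain minimax while skipping subtrees as soon as the two bounds meet.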

26 Larger example

27 Tic-tac-toe example
α-β vs. minimax:
- worst case: α-β examines as many states as minimax
- best case: assuming branching factor B and depth D, α-β examines only ~2B^(D/2) states (i.e., as many as minimax on a tree with half the depth)
  e.g., for the chess figures above (B ≈ 35, D ≈ 100), that is roughly 2·35^50 states instead of 35^100

