1 search CS 331/531 Dr M M Awais A* Examples:. 2 search CS 331/531 Dr M M Awais 8-Puzzle 0+41+5 1+3 3+3 3+4 3+24+15+2 5+0 2+3 2+4 2+3 f(N) = g(N) + h(N)

1 search CS 331/531 Dr M M Awais A* Examples:

2 search CS 331/531 Dr M M Awais 8-Puzzle 0+41+5 1+3 3+3 3+4 3+24+15+2 5+0 2+3 2+4 2+3 f(N) = g(N) + h(N) with h(N) = number of misplaced tiles

3 search CS 331/531 Dr M M Awais Robot Navigation

4 search CS 331/531 Dr M M Awais Robot Navigation 0211 587 7 3 4 7 6 7 632 8 6 45 233 36524435 546 5 6 4 5 f(N) = h(N), with h(N) = Manhattan distance to the goal (not A*)

5 search CS 331/531 Dr M M Awais Robot Navigation 0211 587 7 3 4 7 6 7 632 8 6 45 233 36524435 546 5 6 4 5 f(N) = h(N), with h(N) = Manhattan distance to the goal (not A*) 7 0

6 search CS 331/531 Dr M M Awais Robot Navigation f(N) = g(N)+h(N), with h(N) = Manhattan distance to goal (A*) 0211 587 7 3 4 7 6 7 632 8 6 45 233 36524435 546 5 6 4 5 7+0 6+1 8+1 7+0 7+2 6+17+2 6+1 8+1 7+2 8+37+26+3 5+4 4+5 3+6 2+78+37+4 6+5 5+6 6+35+62+73+8 4+7 5+64+7 3+8 4+73+8 2+9 3+10 2+9 3+82+9 1+10 0+11

7 search CS 331/531 Dr M M Awais Adversary Search (Games) The aim is to move in such a way as to ‘stop’ the opponent from making a good / winning move. Game playing can use Tree - Search. The tree or game - tree alternates between two players.

8 search CS 331/531 Dr M M Awais Games? Games are a form of multi-agent environment What do other agents do How do they affect our success? Cooperative vs. competitive multi-agent environments. Competitive multi-agent environments give rise to adversarial problems (games) Why study games? Fun; historically entertaining Interesting subject of study because they are hard Easy to represent and agents restricted to small number of actions

9 search CS 331/531 Dr M M Awais Games vs. Search Search – no adversary Solution is (heuristic) method for finding goal Heuristics and CSP techniques can find optimal solution Evaluation function: estimate of cost from start to goal through given node Examples: path planning, scheduling activities Games – adversary Solution is strategy (strategy specifies move for every possible opponent reply). Time limits force an approximate solution Evaluation function: evaluate “goodness” of game position Examples: chess, checkers, Othello, backgammon

10 search CS 331/531 Dr M M Awais Types of Games

11 search CS 331/531 Dr M M Awais Game setup Two players: MAX and MIN MAX and MIN take turns until the game is over. Winner gets award, looser gets penalty. Games as search: Initial state: e.g. board configuration of chess Successor function: list of (move,state) pairs specifying legal moves. Terminal test: Is the game finished? Utility function: Gives numerical value of terminal states. E.g. win (+1), loose (-1) and draw (0) in tic-tac-toe (next) MAX uses search tree to determine next move.

12 search CS 331/531 Dr M M Awais Things to Remember: 1.Every move is vital 2.The opponent could win at the next move or subsequent moves. 3.Keep track of the safest moves 4.The opponent is well - informed 5.How the opponent is likely to response to your moves.

13 search CS 331/531 Dr M M Awais Two move win Player 1 = P 1 Player 2 = P 2 Safest move for P 1 is always A to C Safest move for P 2 is always A to D (if allowed 1 st move) P1P2 P1 P1 P2 P2 A B C D EFGHIJ P1 moves P2 moves wins

14 search CS 331/531 Dr M M Awais MINIMAX Procedure for Games Assumption: Opponent has same knowledge of state space and makes a consistent effort to WIN. MIN:Label for the opponent trying to minimize other player’s (MAX) score. MAX: Player trying to win (maximise advantage) BOTH MAX AND MIN ARE EQUALLY INFORMED

15 search CS 331/531 Dr M M Awais Rules 1. Label levels MAX and MIN 2. Assign values to leaf nodes: 0 if MIN wins 1 if MAX wins 3. Propagate values up the graph. If parent is MAX, assign it Max-value of its children If parent is MIN, assign it min-value of its children MAX MIN MAX MIN

16 search CS 331/531 Dr M M Awais Rules 1. Label level’s MAX and MIN 2. Assign values to leaf nodes: 0 if MIN wins 1 if MAX wins 3. Propagate values up the graph. If parent is MAX, assign it Max-value of its children If parent is MIN, assign it min-value of its children MAX MIN MAX MIN 0 1

17 search CS 331/531 Dr M M Awais Rules 3. Propagate values up the graph. If parent is MAX, assign it Max-value of its children If parent is MIN, assign it min-value of its children MAX MIN MAX MIN 0 1 1 Max(0,1) = 1 1 Max(1) = 1

18 search CS 331/531 Dr M M Awais Rules 3. Propagate values up the graph. If parent is MAX, assign it Max-value of its children If parent is MIN, assign it min-value of its children MAX MIN MAX MIN 0 1 1 1 Min(1) = 1 1 1

19 search CS 331/531 Dr M M Awais Rules 3. Propagate values up the graph. If parent is MAX, assign it Max-value of its children If parent is MIN, assign it min-value of its children MAX MIN MAX MIN 0 1 1 1 1 Max(1,1) 1 Min(1) = 1 1

20 search CS 331/531 Dr M M Awais Utility Values Leaf Nodes represent the result of the game Results could be WIN or LOOSE for any player WIN for MAX is 1, LOOSE for MAX is 0 These values are known as Utility values / functions Draw could be another result, in this case WIN for MAX could be 1 LOOSE for MAX could be –1 DRAW could be 0

21 search CS 331/531 Dr M M Awais Game tree (2-player, deterministic, turns)

22 search CS 331/531 Dr M M Awais MINMAX Unfinished Games Apply from the leaf node to the start node Or, Result nodes are necessary to be in the search space What if you want to evaluate the game status at an intermediate level E.g., The game finishes at level 5 We want to find out the relative advantage of MAX upto level 3. Solution: Evaluate intermediate nodes through a heuristic and then apply MINMAX

23 search CS 331/531 Dr M M Awais Minimaxing to fixed ply depth (Complex games) Strategy: n - move look ahead - Suppose you start in the middle of the game. WIN/LOOSE - One cannot assign WIN/LOOSE values at that stage - In this case some heuristics evaluation is applied - Values are then projected back to supply indications of WINNING/LOOSING trend.

24 search CS 331/531 Dr M M Awais TIC - TAC - TOE HEURISTIC FUNCTION: TIC - TAC - TOE M(n) = Total of possible winning lines for MAX O(n) = Trial of Opponents winning lines E(n) = M(n) - O(n) X X X O O O X X

25 search CS 331/531 Dr M M Awais TIC - TAC - TOE HEURISTIC FUNCTION: TIC - TAC - TOE M(n) = Total of possible winning lines for MAX O(n) = Trial of Opponents winning lines E(n) = M(n) - O(n) X X X O O O X X M(n)=4 M(n)=5

26 search CS 331/531 Dr M M Awais TIC - TAC - TOE HEURISTIC FUNCTION: TIC - TAC - TOE M(n) = Total of possible winning lines for MAX O(n) = Trial of Opponents winning lines E(n) = M(n) - O(n) X X X O O O X X M(n)=4 O(n)=2 E(n)=2 M(n)=5 O(n)=1 E(n)=4

27 search CS 331/531 Dr M M Awais Two-Ply Game Tree

30 search CS 331/531 Dr M M Awais Two-Ply Game Tree The minimax decision Minimax maximizes the worst-case outcome for max.

31 search CS 331/531 Dr M M Awais Problem of minimax search Number of games states is exponential to the number of moves.

32 search CS 331/531 Dr M M Awais Solution Do not examine every node Alpha-beta pruning Alpha = value of best choice found so far at any choice point along the MAX path Beta = value of best choice found so far at any choice point along the MIN path

33 search CS 331/531 Dr M M Awais Alpha - Beta Procedures Minimax procedure pursues all branches in the space. Some of them could have been ignored or pruned. To improve efficiency pruning is applied to two person games

34 search CS 331/531 Dr M M Awais Simple Idea if A > 5 OR B < 0 If the first condition A > 5 succeeds then B < 0 may not be evaluated. if A > 5 AND B < 0 If the first condition A > 5 fails then B < 0 may not be evaluated.

35 search CS 331/531 Dr M M Awais Implementation FORWARD PASS: APPLY DEPTH FIRST SEARCH REACH THE LEAF NODE BACKWARD PASS: PROPAGATE THE VALUES TO THE ROOT NODE

36 search CS 331/531 Dr M M Awais a b = 0.4 g = -0.2 e c MAX MIN MAX MIN -0.2 (at least) Why –0.2 is the least value?

37 search CS 331/531 Dr M M Awais a b = 0.4 g = -0.2 e c MAX MIN MAX MIN -0.2 Suppose this node takes a value less than –0.2 Value for node e will not change and remains at –0.2

38 search CS 331/531 Dr M M Awais a b = 0.4 g = -0.2 e c MAX MIN MAX MIN v Suppose this node takes a value greater than –0.2, say v Value for node e will change to v

39 search CS 331/531 Dr M M Awais a b = 0.4 g = -0.2 e c MAX MIN MAX MIN v WHAT IS THE LOWER BOUND ON v ? Lower bound is the value at node g

40 search CS 331/531 Dr M M Awais a b = 0.4 g = -0.2 e c MAX MIN MAX MIN  =-0.2 (at least) Minimum advantage for e MAX node is –0.2 This is called the ALPHA Value for MAX Node

41 search CS 331/531 Dr M M Awais a b = 0.4 g = -0.2 e c MAX MIN MAX MIN -0.2 (at most) Why –0.2 is the AT MOST value For node c ?  =-0.2 (at least)

42 search CS 331/531 Dr M M Awais a b = 0.4 g = -0.2 e c MAX MIN MAX MIN v Suppose this node takes a value v less than –0.2 Value for node c will change to v  =-0.2 (at least)

43 search CS 331/531 Dr M M Awais a b = 0.4 g = -0.2 e c MAX MIN MAX MIN -0.2 Suppose this node takes a value greater than –0.2 Value for node c will not change and will remain at –0.2  =-0.2 (at least)

44 search CS 331/531 Dr M M Awais a b = 0.4 g = -0.2 e c MAX MIN MAX MIN -0.2 WHAT IS THE UPPER BOUND ON v ? UPPER bound is the value at node e  =-0.2 (at least)

45 search CS 331/531 Dr M M Awais a b = 0.4 g = -0.2 e c MAX MIN MAX MIN  = -0.2 (at most) Maximum advantage for c MIN node is –0.2 This is called the BETA Value for MIN Node  =-0.2 (at least)

46 search CS 331/531 Dr M M Awais a b = 0.4 g = -0.2 e c MAX MIN MAX MIN  = -0.2 (at most) FIND THE ALPHA VALUE FOR NODE a ?  =-0.2 (at least)

47 search CS 331/531 Dr M M Awais a b = 0.4 g = -0.2 e c MAX MIN MAX MIN  = -0.2 (at most)  =-0.2 (at least)  = 0.4 (at least) The least advantage which MAX can get in this portion of the game is 0.4

48 search CS 331/531 Dr M M Awais a b = 0.4 g = -0.2 e c MAX MIN MAX MIN  = -0.2 (at most)  =-0.2 (at least)  = 0.4 (at least) IF this least advantage is acceptable, then Expanding to c and to all the proceeding nodes can be neglected: Prune away link to c With ALPHA=0.4

49 search CS 331/531 Dr M M Awais - MAX node neglects values <= a (atleast it can score) at MIN nodes below it. - MIN node neglects values >= b (almost it can score) at MAX nodes below it A B =10 C G=0 H MAX MIN C node can score ATMOST 0 nothing above 0 (beta) A node can score ATLEAST 10 nothing less than 10 (alpha)

50 search CS 331/531 Dr M M Awais Alpha-Beta Example [-∞, +∞] Range of possible values Do DF-search until first leaf

51 search CS 331/531 Dr M M Awais Alpha-Beta Example (continued) [-∞,3] [-∞,+∞]

52 search CS 331/531 Dr M M Awais Alpha-Beta Example (continued) [-∞,3] [-∞,+∞]

53 search CS 331/531 Dr M M Awais Alpha-Beta Example (continued) [3,+∞] [3,3]

54 search CS 331/531 Dr M M Awais Alpha-Beta Example (continued) [-∞,2] [3,+∞] [3,3] This node is worse for MAX

55 search CS 331/531 Dr M M Awais Alpha-Beta Example (continued) [-∞,2] [3,14] [3,3][-∞,14],

56 search CS 331/531 Dr M M Awais Alpha-Beta Example (continued) [−∞,2] [3,5] [3,3][-∞,5],

57 search CS 331/531 Dr M M Awais Alpha-Beta Example (continued) [2,2] [−∞,2] [3,3]

58 search CS 331/531 Dr M M Awais Alpha-Beta Example (continued) [2,2] [-∞,2] [3,3]

1 search CS 331/531 Dr M M Awais A* Examples:. 2 search CS 331/531 Dr M M Awais 8-Puzzle 0+41+5 1+3 3+3 3+4 3+24+15+2 5+0 2+3 2+4 2+3 f(N) = g(N) + h(N)

Similar presentations

Presentation on theme: "1 search CS 331/531 Dr M M Awais A* Examples:. 2 search CS 331/531 Dr M M Awais 8-Puzzle 0+41+5 1+3 3+3 3+4 3+24+15+2 5+0 2+3 2+4 2+3 f(N) = g(N) + h(N)"— Presentation transcript:

Similar presentations

About project

Feedback

Log in

Auth with social network:

1 search CS 331/531 Dr M M Awais A* Examples:. 2 search CS 331/531 Dr M M Awais 8-Puzzle 0+41+5 1+3 3+3 3+4 3+24+15+2 5+0 2+3 2+4 2+3 f(N) = g(N) + h(N)

Similar presentations

Presentation on theme: "1 search CS 331/531 Dr M M Awais A* Examples:. 2 search CS 331/531 Dr M M Awais 8-Puzzle 0+41+5 1+3 3+3 3+4 3+24+15+2 5+0 2+3 2+4 2+3 f(N) = g(N) + h(N)"— Presentation transcript:

Similar presentations

About project

Feedback