Presentation on theme: "Heuristic Search in Artificial Intelligence". Presentation transcript:
1 Heuristic Search in Artificial Intelligence. Course written by Richard E. Korf, UCLA. The slides were made by students of this course from Bar-Ilan University, Tel-Aviv, Israel.
2 Chapter 1: Problems and Problem Spaces
3 Problems. There are 3 general categories of problems in AI: single-agent pathfinding problems, two-player games, and constraint satisfaction problems.
4 Single-Agent Pathfinding Problems. In each of these problems a single problem-solver makes all the decisions, and the task is to find a sequence of primitive steps that takes us from the initial location to the goal location. Famous examples: Rubik's Cube (Erno Rubik, 1975); the Sliding-Tile puzzle; navigation (Travelling Salesman Problem).
5 Two-Player Games. In a two-player game, one must consider the moves of an opponent, and the ultimate goal is a strategy that guarantees a win whenever possible. Two-player perfect-information games have received the most attention from researchers so far, but researchers are now starting to consider more complex games, many of which involve an element of chance. The best Chess, Checkers, and Othello players in the world are computer programs!
6 Constraint-Satisfaction Problems. In these problems a single agent also makes all the decisions, but we are not concerned with the sequence of steps required to reach the solution, only with the solution itself. The task is to identify a state of the problem in which all the constraints of the problem are satisfied. Famous examples: the Eight Queens Problem; Number Partitioning.
7 Problem Spaces. A problem space consists of a set of states of a problem and a set of operators that change the state. State: a symbolic structure that represents a single configuration of the problem in sufficient detail to allow problem solving to proceed. Operator: a function that takes a state and maps it to another state.
8 Problem Spaces. Not all operators are applicable to all states. The conditions that must be true for an operator to be legally applied to a state are known as the preconditions of the operator. Examples: 8-Puzzle: the states are the different permutations of the tiles, and the operators move the blank tile up, down, right or left. Chess: the states are the different locations of the pieces on the board, and the operators are the legal moves according to the rules of chess.
9 Problem Spaces. A problem instance consists of a problem space, an initial state, and a set of goal states. There may be a single goal state, or a set of goal states, any one of which would satisfy the goal criteria. In addition, the goal could be stated explicitly, or implicitly by giving a rule for determining when the goal has been reached. All 4 combinations are possible: [single or set of goal states] x [explicit or implicit].
10 Problem Spaces. For constraint satisfaction problems, the goal will always be represented implicitly, since an explicit description is the solution itself. Example: 4-Queens has 2 different goal states. Here the goal is stated explicitly. [Figure: 4-Queens board with queens marked Q]
11 Problem Representation. For some problems, the choice of a problem space is not obvious. The choice of representation for a problem can have an enormous impact on the efficiency of solving it. There are no algorithms for problem representation. One general rule is that a smaller representation, in the sense of fewer states to search, is often better than a larger one.
12 Problem Representation. For example, consider the 8-Queens problem, where every state is a placement of the 8 queens on the board. The number of ways to place all 8 queens on the board is 64 choose 8, which is over 4 billion. A solution cannot have more than one queen per row, so we may assign each queen to a separate row, leaving 8^8 > 16 million possibilities. Likewise, a solution cannot have 2 queens in the same column, which reduces the space to 8!, only 40,320 possibilities.
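The three counts on this slide can be checked directly. The sketch below is only an illustration; the function name `queens_space_sizes` and the dictionary keys are made up for this example and are not part of the course material.

```python
from math import comb, factorial


def queens_space_sizes(n: int = 8) -> dict[str, int]:
    """Sizes of three candidate state spaces for the n-Queens problem."""
    return {
        # n queens placed on any n distinct squares of the n x n board
        "any_squares": comb(n * n, n),
        # one queen per row, each free to sit in any column
        "one_per_row": n ** n,
        # one queen per row and per column: a permutation of the columns
        "one_per_row_and_column": factorial(n),
    }


for name, size in queens_space_sizes(8).items():
    print(f"{name}: {size:,}")
```

Each tighter representation bakes one more constraint into the state description, so the search never visits states that violate it.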
13 Problem-Space Graphs. A problem-space graph is a mathematical abstraction often used to represent a problem space. The states are represented by the nodes of the graph. The operators are represented by edges between nodes. Edges may be undirected or directed.
14 Problem-Space Graphs. Example: a small part of the 8-puzzle problem-space graph. [Figure]
15 Problem-Space Graphs. In most problem spaces there is more than one path between a pair of nodes. Detecting when the same state has been regenerated via a different path requires saving all the previously generated states and comparing newly generated states against the saved states. Many search algorithms don't detect when a state has previously been generated. The cost of this is that any state that can be reached by 2 different paths will be represented by duplicate nodes. The benefits are memory savings and simplicity.
16 Branching Factor and Solution Depth. The branching factor of a node is the number of children it has, not counting its parent if the operator is reversible; it is a function of the problem space. The branching factor of a problem space is the average number of children of the nodes in the space. The solution depth in a single-agent problem is the length of the shortest path from the initial node to a goal node; it is a function of the particular problem instance.
17 Eliminating Duplicate Nodes. In many cases we can reduce the size of the search tree by eliminating some simple duplicate paths. In general, we never apply an operator and its inverse in succession, since no optimal path can contain such a sequence. Therefore we never list the parent of a node as one of its children. This reduces the branching factor of the problem by approximately 1.
18 Types of Problem Spaces. There are several types of problem spaces: state space; problem reduction space; AND/OR graphs.
19 State Space. The states represent situations of the problem. The operators represent actions in the world. Forward search: the root of the problem space represents the start state, and the search proceeds forward to a goal state. Backward search: the root of the problem space represents the goal state, and the search proceeds backward to the initial state. For example, in Rubik's Cube and the Sliding-Tile Puzzle, either a forward or a backward search is possible.
20 Problem Reduction Space. In a problem reduction space, the nodes represent problems to be solved or goals to be achieved, and the edges represent the decomposition of the problem into subproblems. This is best illustrated by the example of the Towers of Hanoi problem. [Figure: Towers of Hanoi with pegs A, B, and C]
21 Problem Reduction Space. The root node, labeled "3AC", represents the original problem of transferring all 3 disks from peg A to peg C. The goal can be decomposed into three subgoals: 2AB, 1AC, 2BC. In order to achieve the goal, all 3 subgoals must be achieved. [Figure: problem-reduction tree with root 3AC, subgoals 2AB, 1AC, 2BC, and leaf nodes including 1AB, 1CB, 1BA, 1BC]
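The decomposition on this slide can be sketched in a few lines. This is an illustration only: the helper names `other_peg`, `decompose`, and `primitive_moves` are hypothetical, and the labels follow the slide's "count, source peg, destination peg" convention.

```python
def other_peg(src: str, dst: str) -> str:
    """Return the peg that is neither the source nor the destination."""
    return ({"A", "B", "C"} - {src, dst}).pop()


def decompose(n: int, src: str, dst: str) -> list[str]:
    """Subgoals of the problem 'move n disks from src to dst' (an AND node)."""
    if n == 1:
        return []  # a single-disk move is primitive, nothing to decompose
    via = other_peg(src, dst)
    return [f"{n - 1}{src}{via}", f"1{src}{dst}", f"{n - 1}{via}{dst}"]


def primitive_moves(n: int, src: str, dst: str) -> list[str]:
    """Leaves of the reduction tree, read left to right."""
    if n == 1:
        return [f"1{src}{dst}"]
    via = other_peg(src, dst)
    return (primitive_moves(n - 1, src, via)
            + [f"1{src}{dst}"]
            + primitive_moves(n - 1, via, dst))


print(decompose(3, "A", "C"))        # subgoals of the root node 3AC
print(primitive_moves(3, "A", "C"))  # the 7 single-disk moves
```

Because every subgoal must be achieved, each non-leaf node in this tree behaves as an AND node.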
30 AND/OR Graphs. An AND graph consists entirely of AND nodes; to solve the problem represented by an AND node, you need to solve the problems represented by all of its children (Towers of Hanoi example). An OR graph consists entirely of OR nodes; to solve the problem represented by an OR node, you only need to solve the problem represented by one of its children (Eight Puzzle tree example).
31 AND/OR Graphs (slide dated 19 Nisan 5777). An AND/OR graph consists of both AND nodes and OR nodes. One source of AND/OR graphs is a problem where the effect of an action cannot be predicted in advance, as in an interaction with the physical world. Example: the counterfeit-coin problem.
32 Two-Player Game Trees. The most common source of AND/OR graphs is two-player perfect-information games. Example: game tree for 5-Stone Nim. [Figure: Nim game tree with node values 5, 4, 3, 2, 1; levels labeled as OR nodes and AND nodes]
33 Solution Subgraph for AND/OR Trees. In general, a solution to an AND/OR graph is a subgraph with the following properties: it contains the root node; for every OR node included in the solution subgraph, one child is included; for every AND node included in the solution subgraph, all the children are included; every terminal node in the solution subgraph is a solved node.
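The properties above translate into a short recursive test for whether a solution subgraph exists below a node. The following is a minimal sketch for trees; the `Node` class and its fields are assumptions made for this example, not a structure defined in the course.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    kind: str                      # "AND", "OR", or "LEAF"
    solved: bool = False           # only meaningful for terminal (LEAF) nodes
    children: list["Node"] = field(default_factory=list)


def is_solvable(node: Node) -> bool:
    """True if a solution subgraph rooted at `node` exists."""
    if node.kind == "LEAF":
        return node.solved
    results = (is_solvable(child) for child in node.children)
    # An AND node needs all of its children; an OR node needs only one.
    return all(results) if node.kind == "AND" else any(results)


tree = Node("OR", children=[
    Node("AND", children=[Node("LEAF", solved=True), Node("LEAF", solved=False)]),
    Node("AND", children=[Node("LEAF", solved=True), Node("LEAF", solved=True)]),
])
print(is_solvable(tree))
```

In the example tree the first AND branch fails because one of its leaves is unsolved, but the root is an OR node, so the second branch is enough.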
34 Solutions. The notion of a solution is different for the different problem types. For a path-finding problem, an optimal solution is a solution of lowest cost. For a CSP, if there is a cost function associated with a state of the problem, an optimal solution would again be one of lowest cost. For a 2-player game: if the solution is simply a move to be made, an optimal solution would be the best possible move that can be made in a given situation; if the solution is considered a complete strategy subgraph, then an optimal solution might be one that forces a win in the fewest number of moves in the worst case.
35 Combinatorial Explosion. The number of different states of the problems above is enormous, and grows extremely fast as the problem size increases. Examples of the number of different possibilities:
36 Combinatorial Explosion. The combinatorial explosion of the number of possible states as a function of problem size is a key characteristic that separates artificial intelligence search algorithms from those in other areas of computer science. Techniques that rely on storing all possibilities in memory, or even generating all possibilities, are out of the question except for the smallest of these problems. As a result, the problem-space graphs of AI problems are usually represented implicitly, by specifying an initial state and a set of operators to generate new states.
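As an illustration of an implicit representation, the sketch below describes the 8-puzzle by a successor function alone, without ever building the graph. The tuple encoding (row-major, 0 for the blank) and the names `State`, `MOVES`, and `successors` are choices made for this example.

```python
from typing import Iterator

State = tuple[int, ...]  # 9 entries in row-major order; 0 marks the blank

# Change in the blank's index for each operator (moving the blank).
MOVES = {"up": -3, "down": 3, "left": -1, "right": 1}


def successors(state: State) -> Iterator[tuple[str, State]]:
    """Yield (operator name, new state) for every applicable operator."""
    blank = state.index(0)
    row, col = divmod(blank, 3)
    for name, delta in MOVES.items():
        # Preconditions: the blank must stay on the 3 x 3 board.
        if (name == "up" and row == 0) or (name == "down" and row == 2):
            continue
        if (name == "left" and col == 0) or (name == "right" and col == 2):
            continue
        target = blank + delta
        tiles = list(state)
        tiles[blank], tiles[target] = tiles[target], tiles[blank]
        yield name, tuple(tiles)


start: State = (1, 2, 3, 4, 0, 5, 6, 7, 8)
for move, child in successors(start):
    print(move, child)
```

Only the states that a search actually generates ever exist in memory, which is what makes large problem spaces workable at all.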
37 Search Algorithms. This course focuses on systematic search algorithms that are applicable to the different problem types; a central concern is their efficiency. There are 3 primary measures of the efficiency of a search algorithm: the quality of the solution returned (is it optimal or not); the running time of the algorithm; the amount of memory required by the algorithm.
38 The Next Chapters.
Chapter 2: brute-force searches.
Chapter 3: heuristic search algorithms.
Chapter 4: search algorithms that run in linear space.
Chapter 5: search algorithms for the case where individual moves of a solution must be executed in the real world before a complete optimal solution can be computed.
Chapter 6: methods for deriving the heuristic function.
Chapter 7: 2-player perfect-information games.
Chapter 8: analysis of alpha-beta minimax.
Chapter 9: games with more than 2 players.
Chapter 10: the decision quality of minimax.
Chapter 11: automatic learning of heuristic functions for 2-player games.
Chapter 12: constraint satisfaction problems.
Chapter 13: parallel search algorithms.
40 Brute-Force Search. The most general search algorithms are brute-force searches, which do not use any domain-specific knowledge. A brute-force search requires: a state description; a set of legal operators; an initial state; a description of the goal state. We will assume that all edges have unit cost. To generate a node means to create the data structure corresponding to that node. To expand a node means to generate all the children of that node.
41 Breadth-First Search (BFS). BFS expands nodes in order of their depth from the root, generating one level of the tree at a time. It is implemented with a first-in first-out (FIFO) queue. At each cycle the node at the head of the queue is removed and expanded, and its children are placed at the end of the queue.
42 Breadth-First Search (BFS). The numbers represent the order in which BFS generates the nodes. [Figure: search tree with nodes numbered 1 to 14 in BFS generation order]
43 Solution Quality. BFS continues until a goal node is generated. There are two ways to report the actual solution path: store with each node the sequence of moves made to reach that node, or store with each node a pointer back to its parent, which is more memory efficient. If a goal exists in the tree, BFS will find a shortest path to a goal.
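A minimal sketch of BFS with a FIFO queue and parent pointers follows. The function name `bfs` and its `is_goal` and `successors` parameters are assumptions of this example. Note that the parent dictionary also acts as duplicate detection, which the slides treat as optional; states must be hashable for this to work.

```python
from collections import deque


def bfs(start, is_goal, successors):
    """Return a shortest path from start to a goal as a list, or None."""
    if is_goal(start):
        return [start]
    parent = {start: None}           # parent pointers, also marks seen states
    queue = deque([start])           # FIFO queue of nodes awaiting expansion
    while queue:
        node = queue.popleft()       # remove the node at the head of the queue
        for child in successors(node):
            if child in parent:      # already generated via another path
                continue
            parent[child] = node
            if is_goal(child):       # stop as soon as a goal is generated
                path = [child]
                while parent[path[-1]] is not None:
                    path.append(parent[path[-1]])
                return path[::-1]
            queue.append(child)      # children go to the end of the queue
    return None


# Usage on a small binary tree whose node n has children 2n and 2n + 1.
def tree_children(n):
    return [2 * n, 2 * n + 1] if n < 8 else []


print(bfs(1, lambda n: n == 5, tree_children))
```

Following the parent pointers back from the goal recovers the path without storing a move sequence in every node.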
44 Time Complexity. We assume that each node can be generated in constant time. N(b,d), the total number of nodes generated, is a function of the branching factor b and the solution depth d. The number of nodes depends on where at level d the goal node is found; in the worst case, we have to generate all the nodes at level d.
45 Time Complexity. The time complexity of BFS is O(b^d).
46 Space Complexity. To report the solution we need to store all the nodes generated, so space complexity = time complexity = O(b^d). Example: machine speed = 100 MHz; generating a new state takes 100 instructions, giving 10^6 nodes/sec; node size = 4 bytes; total memory = 1 GB = 10^9 bytes; node capacity = 10^9 / 4 = 250 * 10^6 nodes. After 250 seconds (about 4 minutes) the memory is exhausted!
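The arithmetic in this example can be reproduced as follows. The function name `seconds_until_memory_full` is hypothetical, and the sketch assumes one instruction per clock cycle, which the slide's figures imply but do not state.

```python
def seconds_until_memory_full(clock_hz: int, instructions_per_node: int,
                              node_bytes: int, memory_bytes: int) -> float:
    """Time for a store-everything search to fill memory."""
    # Assumption: one instruction per clock cycle.
    nodes_per_second = clock_hz / instructions_per_node
    node_capacity = memory_bytes / node_bytes
    return node_capacity / nodes_per_second


seconds = seconds_until_memory_full(
    clock_hz=100_000_000,        # 100 MHz
    instructions_per_node=100,   # 10^6 nodes generated per second
    node_bytes=4,
    memory_bytes=10**9,          # 1 GB
)
print(f"{seconds:.0f} s, about {seconds / 60:.1f} minutes")
```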
47 Space Complexity. The previous example is based on current technology. The problem won't go away: as memories increase in size, processors get faster and our appetite for solving larger problems grows. BFS, and any algorithm that must store all the nodes, is severely space-bound and will exhaust the memory in minutes.
48 Depth-First Search (DFS). DFS always generates next a child of the deepest node that has not yet been completely expanded. The first implementation uses a last-in first-out (LIFO) stack. At each cycle the node at the top of the stack is removed and expanded, and its children are placed on top of the stack.
49 DFS: stack implementation. The numbers represent the order in which DFS generates the nodes. [Figure: search tree with nodes numbered 1 to 14 in stack-based DFS generation order]
50 Depth-First Search (DFS). The second implementation is recursive. The recursive function takes a node as an argument and performs a DFS below that node. The function loops through each of the node's children and makes a recursive call to perform a DFS below each child in turn.
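The recursive implementation can be sketched as below. The name `dfs` and its parameters are assumptions of this example; the `limit` argument is a depth cutoff of the kind the later slides introduce for infinite trees, included here so that the example terminates.

```python
def dfs(node, is_goal, successors, limit):
    """Return a path from node to a goal using at most `limit` moves, or None."""
    if is_goal(node):
        return [node]
    if limit == 0:                   # cutoff reached, do not expand further
        return None
    for child in successors(node):   # loop through the node's children
        path = dfs(child, is_goal, successors, limit - 1)
        if path is not None:
            return [node] + path
    return None


# Usage on an infinite binary tree whose node n has children 2n and 2n + 1.
def tree_children(n):
    return [2 * n, 2 * n + 1]


print(dfs(1, lambda n: n == 5, tree_children, limit=2))
print(dfs(1, lambda n: n == 5, tree_children, limit=1))
```

The only memory in use at any moment is the chain of recursive calls from the root to the current node.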
51 DFS: recursive implementation. The numbers represent the order in which DFS generates the nodes. [Figure: search tree with nodes numbered 1 to 14 in recursive DFS generation order]
52 Space Complexity. The space complexity is linear in the maximum search depth. Let d be the maximum depth of the search and b the branching factor. Depth-first generation stores O(d) nodes. Depth-first expansion stores O(b*d) nodes. DFS is time-limited rather than space-limited.
53 Time Complexity and Solution Quality. Searching to depth d, DFS generates the same set of nodes as BFS, so the time complexity of DFS is O(b^d). However, on an infinite tree DFS may not terminate. For example, the Eight Puzzle has 181,440 states, but every path in its search tree is infinitely long, so DFS will never end.
54 Time Complexity and Solution Quality. The solution for an infinite tree is to impose an artificial cutoff depth on the search. If the chosen cutoff depth is less than d, the algorithm won't find a solution. If the cutoff depth is greater than d, the time complexity is larger than that of BFS. The first solution DFS finds may not be the optimal one.
55 Depth-First Iterative-Deepening (DFID). DFID combines the best features of BFS and DFS. It first performs a DFS to depth one, then starts over and executes a DFS to depth two, and continues to run DFS to successively greater depths until a solution is found.
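A minimal sketch of DFID, built on a depth-limited recursive DFS, is shown below. The names `depth_limited` and `dfid` are assumptions of this example, and `max_depth` is a hypothetical safeguard so the loop stops when no goal exists.

```python
def depth_limited(node, is_goal, successors, limit):
    """DFS below `node` that never goes more than `limit` moves deep."""
    if is_goal(node):
        return [node]
    if limit == 0:
        return None
    for child in successors(node):
        path = depth_limited(child, is_goal, successors, limit - 1)
        if path is not None:
            return [node] + path
    return None


def dfid(start, is_goal, successors, max_depth=50):
    """Run DFS to depth 1, 2, 3, ... until a solution is found."""
    for limit in range(1, max_depth + 1):
        path = depth_limited(start, is_goal, successors, limit)
        if path is not None:
            return path              # found on the shallowest possible iteration
    return None


# Usage on an infinite binary tree whose node n has children 2n and 2n + 1.
def tree_children(n):
    return [2 * n, 2 * n + 1]


print(dfid(1, lambda n: n == 11, tree_children))
```

Each iteration throws away the previous one's work and starts again from the root, which is what keeps the memory use down to a single path.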
56 Depth-First Iterative-Deepening (DFID). The numbers represent the order in which DFID generates the nodes. [Figure: search tree in which each node is labeled with the steps, from 1 to 22, at which DFID generates it]
57 Solution Quality. DFID never generates a node until all shallower nodes have already been generated. The first solution found by DFID is guaranteed to be along a shortest path.
58 Space Complexity. Like DFS, at any given point DFID saves only a stack of nodes. The space complexity is only O(d).
59 Time Complexity. DFID does not waste a great deal of time in the iterations prior to the one that finds a solution; this extra work is usually insignificant. The ratio of the number of nodes generated by DFID to those generated by BFS on a tree is approximately b/(b-1). The total number of nodes generated by DFID is O(b^d).
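The cost of the repeated iterations can be checked numerically. The sketch below counts worst-case node generations on a uniform tree, assuming the root is not counted and every iteration generates its whole tree; the function names are made up for this example.

```python
def bfs_nodes(b: int, d: int) -> int:
    """Nodes generated by BFS to depth d on a uniform tree (root excluded)."""
    return sum(b ** i for i in range(1, d + 1))


def dfid_nodes(b: int, d: int) -> int:
    """Nodes generated by DFID: one full DFS for each depth 1, 2, ..., d."""
    return sum(bfs_nodes(b, k) for k in range(1, d + 1))


for b in (2, 3, 10):
    ratio = dfid_nodes(b, 15) / bfs_nodes(b, 15)
    print(f"b={b}: DFID/BFS = {ratio:.4f}, b/(b-1) = {b / (b - 1):.4f}")
```

The ratio approaches b/(b-1) as the depth grows, so the overhead is a constant factor that shrinks as the branching factor increases.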
60 Optimality of DFID. Theorem 2.1: DFID is asymptotically optimal in terms of time and space among all brute-force shortest-path algorithms on a tree with unit edge costs. Steps of proof: verify that DFID is optimal in terms of solution quality, time complexity, and space complexity.
61 Optimality of DFID: Solution Quality. Since DFID generates all nodes at a given level before any nodes at the next deeper level, the first solution it finds is arrived at via an optimal path.
62 Optimality of DFID: Time Complexity. Assume, to the contrary, that there is an algorithm A that runs on problem P, finds a shortest path to a goal, and runs in time less than b^d. Since its running time is less than b^d and there are b^d nodes at depth d, there must be at least one node n at depth d that A does not generate when solving P.
63 Optimality of DFID: Time Complexity. Construct a new problem P', identical to P except that n is the only goal. A examines the same nodes in both P and P', so A does not examine the node n. A therefore fails to solve P', since n is the only goal node. Hence no algorithm runs in better than O(b^d) time. Since DFID takes O(b^d) time, its time complexity is asymptotically optimal.
64 Optimality of DFID: Space Complexity. There is a well-known result from computer science that any algorithm that takes f(n) time must use at least log f(n) space. We have already seen that any brute-force search must take at least b^d time, so any such algorithm must use at least log(b^d) = d log b space, which is O(d) space. Since DFID uses O(d) space, it is asymptotically optimal in space.
65 Graphs with Cycles. On a graph with cycles, BFS can be more efficient because it can detect all duplicate nodes, whereas DFS cannot. The complexity of BFS grows only with the number of nodes at a given depth.
66 Graphs with Cycles. The complexity of DFS depends on the number of paths of a given length. In a graph with a large number of very short cycles, BFS is preferable to DFS if sufficient memory is available. In a square grid with radius r, there are O(r^2) nodes and O(4^r) paths.
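The gap between nodes and paths in a grid is easy to see numerically. This sketch assumes a 4-connected grid and measures the radius in moves from the start cell; `cells_within` and `paths_of_length` are names invented for the example.

```python
STEPS = ((1, 0), (-1, 0), (0, 1), (0, -1))  # right, left, up, down


def cells_within(r: int) -> int:
    """Distinct grid cells reachable in at most r moves (what BFS stores)."""
    seen = {(0, 0)}
    frontier = {(0, 0)}
    for _ in range(r):
        next_frontier = set()
        for x, y in frontier:
            for dx, dy in STEPS:
                cell = (x + dx, y + dy)
                if cell not in seen:       # duplicate detection
                    seen.add(cell)
                    next_frontier.add(cell)
        frontier = next_frontier
    return len(seen)


def paths_of_length(r: int) -> int:
    """Move sequences of length r (what DFS without pruning explores)."""
    return 4 ** r


for r in (5, 10, 15):
    print(r, cells_within(r), paths_of_length(r))
```

The cell count grows quadratically (it equals 2r^2 + 2r + 1 here), while the path count grows exponentially.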
67 Pruning Duplicate Nodes in DFS. Eliminate the parent of each node as one of its children. This is easily done with a finite-state machine (FSM), and it reduces the branching factor from 4 to 3. [Figure: FSM with states start, right, up, left, down]
68 Pruning Duplicate Nodes in DFS. A more efficient FSM allows sequences of moves up only or down only, and sequences of moves left only or right only. The time complexity of DFS controlled by this FSM is, like that of BFS, O(r^2). [Figure: FSM with states start, right, left, up, down]
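One FSM consistent with this description is sketched below. The exact transitions in the slide's figure are not preserved, so the table `ALLOWED` is an assumption: horizontal moves in a single direction come first, followed by vertical moves in a single direction. Under that assumption a depth-limited DFS generates every grid cell exactly once.

```python
STEP = {"left": (-1, 0), "right": (1, 0), "up": (0, 1), "down": (0, -1)}

# Assumed FSM: the state is the last move made; values are the moves allowed next.
ALLOWED = {
    "start": ("left", "right", "up", "down"),
    "left": ("left", "up", "down"),
    "right": ("right", "up", "down"),
    "up": ("up",),
    "down": ("down",),
}


def fsm_dfs(r: int) -> list[tuple[int, int]]:
    """Cells generated by a DFS to depth r whose moves are filtered by the FSM."""
    generated = [(0, 0)]

    def visit(cell, fsm_state, depth):
        if depth == r:
            return
        for move in ALLOWED[fsm_state]:
            dx, dy = STEP[move]
            child = (cell[0] + dx, cell[1] + dy)
            generated.append(child)
            visit(child, move, depth + 1)

    visit((0, 0), "start", 0)
    return generated


cells = fsm_dfs(4)
print(len(cells), len(set(cells)))  # equal counts mean no duplicate nodes
```

The FSM needs no memory of previously visited cells; it prunes duplicates purely from the last move made.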
69 Node Generation Times. BFS, DFS, and DFID generate asymptotically the same number of nodes on a tree, but DFS and DFID are more efficient than BFS in the time needed to generate each node. In general, the amount of time to generate a node is proportional to the size of the state representation. If DFS is implemented as a recursive program, a move requires only constant time, instead of time linear in the number of tiles. This advantage of DFS becomes increasingly significant the larger the state description.
70 Backward Chaining/Search. The root node represents the goal state, and we search backward until we reach the initial state. Requirements: the goal state must be represented explicitly, and we must be able to reason backwards about the operators.
71 Bidirectional Search. Main idea: simultaneously search forward from the initial state and backward from the goal state, until the two search frontiers meet at a common state.
72 Solution Quality. Bidirectional search guarantees finding a shortest path from the initial state to the goal state, if one exists. Assume that there is a solution of length d and that both searches are breadth-first. When the forward search has proceeded to depth k, its frontier will contain all nodes at depth k from the initial state.
73 Solution Quality. When the backward search has proceeded to depth d-k, its frontier will contain all states at depth d-k from the goal state. Consider a state s on an optimal solution path, at depth k from the initial state and at depth d-k from the goal state. The state s is in the frontier of both searches, so the algorithm will find the match and return the optimal solution.
74 Time Complexity. If the two search frontiers meet in the middle, each search proceeds to depth d/2 before they meet. The total number of nodes generated is O(2b^(d/2)) = O(b^(d/2)). But this isn't the asymptotic time complexity, because we have to compare every new node with the opposite search frontier. Naively, comparing each node with the entire opposite search frontier costs O(b^(d/2)) per node.
75 Time Complexity. With the naive comparison, the time complexity of the whole algorithm becomes O(b^d). A more efficient approach is to use hash tables. In the average case, the time to hash and compare is constant, so the asymptotic time complexity is O(b^(d/2)).
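The hash-table idea can be sketched with two breadth-first searches whose generated states are kept in dictionaries, so each membership test takes constant time on average. This is a minimal sketch, not the course's implementation: the name `bidirectional_bfs` is invented, operators are assumed to be reversible so that `neighbors` serves both directions, and only the solution length is returned.

```python
def bidirectional_bfs(start, goal, neighbors):
    """Return the length of a shortest path from start to goal, or None."""
    if start == goal:
        return 0
    dist_f, dist_b = {start: 0}, {goal: 0}   # hash tables of generated states
    frontier_f, frontier_b = [start], [goal]
    while frontier_f and frontier_b:
        # Expand the smaller frontier by one complete level.
        forward = len(frontier_f) <= len(frontier_b)
        frontier = frontier_f if forward else frontier_b
        dist, other = (dist_f, dist_b) if forward else (dist_b, dist_f)
        best, next_frontier = None, []
        for node in frontier:
            for child in neighbors(node):
                if child in dist:
                    continue
                dist[child] = dist[node] + 1
                if child in other:           # constant-time frontier check
                    total = dist[child] + other[child]
                    best = total if best is None else min(best, total)
                next_frontier.append(child)
        if best is not None:
            return best
        if forward:
            frontier_f = next_frontier
        else:
            frontier_b = next_frontier
    return None


# Usage on a ring of 10 states, where each state touches its two neighbours.
def ring(n):
    return [(n - 1) % 10, (n + 1) % 10]


print(bidirectional_bfs(0, 4, ring), bidirectional_bfs(0, 7, ring))
```

Finishing the whole level before returning keeps the answer optimal even when several meeting points appear in the same level.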
76 Space Complexity. The simplest implementation of bidirectional search uses BFS in one direction; the search in the other direction can be depth-first, such as DFID. At least one of the frontiers must be stored in memory. The space complexity of bidirectional search is dominated by the BFS and is O(b^(d/2)). Bidirectional search is space-bound. Bidirectional search is much more time-efficient than unidirectional search.