# Review for Finals 2011 159.302. SEARCH CONSTRAINT SATISFACTION PROBLEMS GAMES FUZZY LOGIC NEURAL NETWORKS GENETIC ALGORITHM (not included in the exam)

## Presentation on theme: "Review for Finals 2011 159.302. SEARCH CONSTRAINT SATISFACTION PROBLEMS GAMES FUZZY LOGIC NEURAL NETWORKS GENETIC ALGORITHM (not included in the exam)"— Presentation transcript:

Review for Finals 2011 159.302

SEARCH CONSTRAINT SATISFACTION PROBLEMS GAMES FUZZY LOGIC NEURAL NETWORKS GENETIC ALGORITHM (not included in the exam) LOGIC (not included in the exam)

SEARCH – 15 marks CONSTRAINT SATISFACTION PROBLEMS – 10 marks GAMES – 8 marks FUZZY LOGIC – 7 marks NEURAL NETWORKS– 12 marks Allotment of marks Total = 60 marks FUNDAMENTALS (true or false)– 8 marks

Search

Input: MultipleObstacles: x, y, angle Target’s x, y, angle Robot Navigation Output: Robot angle, speed Obstacle Avoidance, Target Pursuit, Opponent Evasion

Cascade of Fuzzy Systems Adjusted Speed Adjusted Angle Next Waypoint N Y Adjusted Speed Adjusted Angle Fuzzy System 1: Target Pursuit Fuzzy System 2: Speed Control for Target Pursuit Fuzzy System 3: Obstacle Avoidance Fuzzy System 4: Speed Control for Obstacle Avoidance ObstacleDistance < MaxDistanceTolerance and closer than Target Actuators Path planning Layer: The A* Algorithm Multiple Fuzzy Systems employ the various robot behaviours Multiple Fuzzy Systems employ the various robot behaviours Fuzzy System 1 Fuzzy System 2 Fuzzy System 3 Fuzzy System 4 Path Planning Layer Central Control Target Pursuit Obstacle Avoidance

7SEARCH Background and Motivation General Idea: Search allows exploring alternatives 1 Background Uninformed vs. Informed Searches Any Path vs. Optimal Path Implementation and Performance Topics for Discussion: These algorithms provide the conceptual backbone of almost every approach to the systematic exploration of alternatives.

8SEARCH Graph Search as Tree Search 1 C A D G graph search problems tree search problems We can turn graph search problems (from S to G) into tree search problems by: 1. Replacing undirected links by 2 directed links 2. Avoiding loops in path (or keeping track of visited nodes globally) S B S D D CGCG AB G C

9SEARCH More Abstract Example of Graphs Planning actions (graph of possible states of the world) 1 ABC AB C AB C A B C A B C Put C on A Put C on B Put B on C Put A on C Here, the nodes denote descriptions of the state of the world Path = “plan of actions”

10SEARCH Classes of Search 3 ClassNameOperation Any Path Uninformed Depth-FirstSystematic exploration of the whole tree until a goal is found Breadth-First Any Path Informed Best-First Uses heuristic measure of goodness of a state (e.g. estimated distance to goal) Optimal Uninformed Uniform-Cost Uses path-length measure. Finds “shortest” path. Optimal Informed A* Uses path “length” measure and heuristic. Finds “shortest” path.

11SEARCH Simple Search Algorithm A Search Node is a path from some state X to the start state e.g. (X B A S) e.g. (X B A S) 3 1.Initialise Q with search node (S) as only entry; set Visited = (S). The state of a search node is the most recent state of the path e.g. X e.g. X Let Q be a list of search nodes e.g. (X B A S) (C B A S) …) e.g. (X B A S) (C B A S) …) Let S be the start state. 2. If Q is empty, fail. Else, pick some search node N from Q. 3. If state (N) is a goal, return N (we’ve reached the goal). 4. (Otherwise) Remove N from Q. 5. Find all the descendants of state (N) not in Visited and create all the one-step extensions of N to each descendant. 6. Add the extended paths to Q; add children of state (N) to Visited. 7. Go to Step 2. Critical Decisions: Step 2: picking N from Q. Step 6: adding extensions of N to Q.

12SEARCH Implementing the Search Strategies 4 1. Pick first element of Q. Breadth-First 2. Add path extensions to end of Q. 1. Pick first element of Q. Depth-First (backtracking search) 2. Add path extensions to front of Q. 1. Pick best (measured heuristic value of state) element of Q. Best-First (greedy search) 2. Add path extensions anywhere in Q (it may be efficient to keep the Q ordered in some way so as to make it easier to find the “best” element. Heuristic functions are applied in the hope of completing the search quicker or finding a relatively good goal state. It does not guarantee finding the “best” path though.

13SEARCH Depth-First with Visited List 4 firstfront Pick first element of Q; Add path extensions to front of Q. StepQVisited 1(S)S 2(AS)(BS)A, B, S 3(CAS)(DAS)(BS)C, D, B, A, S 4(DAS)(BS)C, D, B, A, S 5(GDAS)(BS)G,C,D,B,A,S Sequence of State Expansions: S-A-C-D-G C A D G S B In DFS – nodes are pulled off the queue and inserted into the queue using a stack.

14SEARCH Visited States 4 generally improves time efficiency substantial additional space Keeping track of visited states generally improves time efficiency when searching graphs, without affecting correctness. Note, however, that substantial additional space may be required to keep track of visited states. If all we want to do is find a path from the start to the goal, there is no advantage to adding a search node whose state is already the state of another search node. Any state reachable from the node the second time would have been reachable from that node the first time. Note that, when using Visited, each state will only ever have at most one path to it (search node) in Q. We’ll have to revisit this issue when we look at optimal searching

15SEARCH Worst Case Running Time 4 In the worst case, all the searches, with or without visited list may have to visit each state at least once. d is depth b is branching factor Number of States in the tree = b d < (b d+1 – 1)/(b-1) < b d+1 d=0 d=1 d=2 d=3 b=2 Max Time is proportional to the Max. number of Nodes visited. So, all searches will have worst case running times that are at least proportional to the total number of states and therefore exponential in the “depth” parameter.

16SEARCH Worst Case Space 4 Depth-first maximum Q size: (b-1)d ≈ bd d=0 d=1 d=2 d=3 Max Q Size = Max (#visited – #expanded). visited expanded b=2 Breadth-first max. Q size: b d

17SEARCH Cost and Performance of Any-Path Methods 4 Search Method Worst TimeWorst SpaceFewest states? Guaranteed to find a path? Depth-First b d+1 bd No Yes* Breadth-First b d+1 bdbd Yes Best-First b d+1 ** bdbd No Yes* without using a Visited List Searching a tree with branching factor b and depth d (without using a Visited List) *If there are no indefinitely long paths in the search space **Best-First needs more time to locate the best node in Q. Worst case time is proportional to the number of nodes added to Q. Worst case space is proportional to maximal length of Q.

18SEARCH Cost and Performance of Any-Path Methods 4 Search Method Worst Time Worst Space Fewest states? Guarantee d to find a path? Depth-First b d+1 bd b d+1 No Yes* Breadth-First b d+1 b d b d+1 Yes Best-First b d+1 ** b d b d+1 No Yes* with Visited List Searching a tree with branching factor b and depth d (with Visited List) *If there are no indefinitely long paths in the search space **Best-First needs more time to locate the best node in Q. Worst case time is proportional to the number of nodes added to Q. and Visited list. Worst case space is proportional to maximal length of Q and Visited list.

19SEARCH States vs. Path 4 d=0 d=1 d=2 d=3 b=2 Using a Visited list helps prevent loops; that is, no path visits a state more than once. However, using the Visited list for very large spaces may not be appropriate as the space requirements would be prohibitive. Using a Visited list, the worst-case time performance is limited by the number of states in the search space rather than the number of paths through the nodes in the space (which may be exponentially larger than the number of states.

20SEARCH Space (the final frontier) 4 In large search problems, memory is often the limiting factor. Imagine searching a tree with branching factor 8 and depth 10. Assume a node requires just 8 bytes of storage. Then, Breadth- First search might require up to: (2 3 ) 10 * 2 3 = 2 33 bytes = 8,000 Mbytes = 8 Gbytes Progressive Deepening Search (PDS) One strategy is to trade time for memory. For example, we can emulate Breadth-First search by repeated applications of Depth-First search, each up to a preset depth limit. This is called Progressive Deepening Search (PDS): 1.C=1 2.Do DFS to max. depth C. If path is found, return it. 3.Otherwise, increment C and go to Step 2.

See Tutorial on Search

22SEARCH Simple Search Algorithm A Search Node is a path from some state X to the start state e.g. (X B A S) e.g. (X B A S) 3 1. Initialise Q with search node (S) as only entry; set Visited = (S). The state of a search node is the most recent state of the path e.g. X e.g. X Let Q be a list of search nodes e.g. (X B A S) (C B A S) …) e.g. (X B A S) (C B A S) …) Let S be the start state. 2. If Q is empty, fail. Else, pick some search node N from Q. 3. If state (N) is a goal, return N (we’ve reached the goal). 4. (Otherwise) Remove N from Q. 5. Find all the descendants of state (N) not in Visited and create all the one- step extensions of N to each descendant. 6. Add the extended paths to Q; add children of state (N) to Visited. 7. Go to Step 2. Visited List Optimal Do NOT use Visited List for Optimal Searching! Critical Decisions: Step 2: picking N from Q. Step 6: adding extensions of N to Q.

23SEARCH Uniform Cost 4 StepQ 1(0 S) 2(2 AS)(5 BS) 3(4 CAS)(6 DAS)(5 BS) 4(6 DAS)(5 BS) 5(6 DBS)(10 GBS)(6 DAS) 6(8 GDBS)(9 CDBS)(10 GBS)(6 DAS) 7(8 GDAS)(9 CDAS)(8 GDBS)(9 CDBS) (10 GBS) C A D G S B 2 2 4 5 3 2 5 1 Uniform Cost enumerates paths in order of total path cost! S DC CG DG CG AB 2 6 5 9 8 4 6 10 98 Sequence of State Expansions: S – A – C – B – D – D - G The sequence of path extensions corresponds precisely to path-length order, so it is not surprising we find the shortest path. 0 2 4 5 6 6 8

24SEARCH Simple Optimal Search Algorithm: Uniform Cost + Strict Expanded List A Search Node is a path from some state X to the start state e.g. (X B A S) e.g. (X B A S) 3 set Expanded = {}. 1. Initialise Q with search node (S) as only entry, set Expanded = {}. The state of a search node is the most recent state of the path e.g. X e.g. X Let Q be a list of search nodes e.g. (X B A S) (C B A S) …) e.g. (X B A S) (C B A S) …) Let S be the start state. 2. If Q is empty, fail. Else, pick some search node N from Q. 3. If state (N) is a goal, return N (we’ve reached the goal). 4. (Otherwise) Remove N from Q. If state (N) in Expanded, go to Step 2; otherwise, add state (N) to Expanded List. 5. If state (N) in Expanded, go to Step 2; otherwise, add state (N) to Expanded List. (Not in Expanded) 6. Find all the children of state(N) (Not in Expanded) and create all the one-step extensions of N to each descendant. Add all the extended paths to Q. If descendant state already in Q, keep only shorter path to state in Q. 7. Add all the extended paths to Q. If descendant state already in Q, keep only shorter path to state in Q. Take note that we need to add some precautionary measures in the adding of extended paths to Q. 8. Go to Step 2. In effect, discard that state

25SEARCH Uniform Cost (with Strict Expanded List) 4 C A D G S B 2 2 4 5 3 2 5 1 StepQExpanded 1(0 S) 2(2 AS)(5 BS)S 3(4 CAS)(6 DAS)(5 BS)S, A 4(6 DAS)(5 BS)S, A, C 5(6 DBS)(10 GBS)(6 DAS)S, A, C, B 6(8 GDAS)(9 CDAS)(10 GBS)S, A, C, B, D Sequence of State Expansions: S – A – C – B – D - G Remove because C was expanded already Remove because there’s a new shorter path to G Remove because D is already in Q (our convention: take element at the front of the Q if a path leading to the same state is in Q).

26SEARCH Classes of Search 3 ClassNameOperation Any Path Uninformed Depth-First Systematic exploration of the whole tree until a goal is found Breadth-First Any Path Informed Best-First Uses heuristic measure of goodness of a state (e.g. estimated distance to goal) Optimal Uninformed Uniform-Cost Uses path “length” measure. Finds “shortest” path. Optimal Informed A* Uses path “length” measure and heuristic. Finds “shortest” path.

27SEARCH Why use estimate of goal distance? 3 AB SOrder in which UC looks at states. A and B are same distance from start S, so will be looked at before any longer paths. No “bias” towards goal. S ABG Assume states are points in the Euclidean space

28SEARCH Why use estimate of goal distance? 3 S A S GOrder of examination using distance from S + estimate of distance to G. “bias”Note: “bias” toward the goal; The points away from G look worse AB SOrder in which UC looks at states. A and B are same distance from start S, so will be looked at before any longer paths. No “bias” towards goal. Assume states are points in the Euclidean space BG

29SEARCH Goal Direction 3 UC is really trying to identify the shortest path to every state in the graph in order. It has no particular bias to finding a path to a goal early in the search. h(N) (h)We can introduce such a bias by means of heuristic function h(N), which is an estimate (h) of the distance from a state to the goal. gInstead of enumerating paths in order of just length (g), enumerate paths in terms of fg + h f = estimated total path length = g + h. admissibleAn estimate that always underestimates the real path length to the goal is called admissible. For example, an estimate of 0 is admissible (but useless). Straight line distance is admissible estimate for path length in Euclidean space. Use of an admissible estimate guarantees that UC will still find the shortest path. with an admissible estimateA* SearchUC with an admissible estimate is known as A* Search.

30SEARCHA* 4 C A D G S B 2 2 4 5 3 2 5 1 StepQ 1(0 S) 2(4 AS)(8 BS) 3(5 CAS)(7 DAS)(8 BS) 4(7 DAS)(8 BS) 5(8 GDAS)(10 CDAS)(8 BS) Sequence of State Expansions: S – A – C – D - G Heuristic Values: A=2, B=3, C=1, D=1, S=0, G=0 bestanywherePick best (by path length + heuristic) element of Q, Add path extensions anywhere in Q.

31SEARCH States vs. Paths 4 Uniform-CostSearch A*Visited List We have ignored the issue of revisiting states in our discussion of Uniform-Cost Search and A*. We indicated that we could not use the Visited List and still preserve optimality, but can we use something else that will keep the worst-case cost of a search proportional to the number of states in a graph rather than to the number of non-looping paths? d=0 d=1 d=2 d=3 b=2

32SEARCHConsistency 4 To enable implementing A* using the strict Expanded List, H needs to satisfy the following consistency (also known as monotonicity) conditions 0 1. h(S i ) = 0, if n i is a goal <= 2. h(S i ) - h(S j ) <= c( S i, S j ), if n j a child of n i triangle inequality That is, the heuristic cost in moving from one entry to the next cannot decrease by more than the arc cost between the states. This is a kind of triangle inequality This condition is a highly desirable property of a heuristic function and often simply assumed (more on this later). nini njnj goal h(S i ) h(S j ) C(S i, S j )

33SEARCH A* (with Strict Expanded List) 4 Note that the heuristic is admissible and consistent. A C G S B 1 1 2 100 2 Heuristic Values: 88 A=100, B=88, C=100, S=90, G=0 StepQExpanded List 1(90 S) 2(90 BS)(101 AS)S 3(101 AS)(104 CBS)A, S 4(102 CAS)(104 CBS)C, A, S 5(102 GCAS)G, C, A, S Underlined paths are chosen for extension. If we modify the heuristic in the example we have been considering so that it is consistent, as we have done here by increasing the value of h(B), then A* (with the Expanded List) will work.

34SEARCH Dealing with inconsistent heuristic 4 What can we do if we have an inconsistent heuristic but we still want optimal paths? Modify A* so that it detects and corrects when inconsistency has led us astray. Assume we are adding node 1 to Q and node 2 is present in Expanded List with node 1.state = node 2.state. Strict: Do NOT add node 1 to Q. Non-Strict Expanded List: If (node 1.path_length < node 2.path_length), then 1.Delete node 2 from Expanded List 2.Add node 1 to Q.

35SEARCH Optimality and Worst Case Complexity 4 AlgorithmHeuristicExpanded List Optimality Guaranteed? Worst Case & Expansions Uniform Cost NoneStrictYesN A*AdmissibleNoneYes>N A*ConsistentStrictYesN A*AdmissibleStrictNoN A*AdmissibleNon-StrictYes>N

Fuzzy Logic

Fuzzy Inference Process Fuzzification Rule Evaluation Defuzzification e.g. theta e.g. force Fuzzification: Translate input into truth values Rule Evaluation: Compute output truth values Defuzzification: Transfer truth values into output Fuzzy Inference Process

Obstacle Avoidance Problem obstacle (obsx, obsy)  (x,y) Can you describe how the robot should turn based on the position and angle of the obstacle? Robot Navigation Demonstration Obstacle Avoidance & Target Pursuit

Another example: Fuzzy Sets for Robot Navigation Angle and Distance Sub ranges for angles & distances overlap * SMALL MEDIUM LARGE NEAR FAR VERY FAR

Fuzzy Systems for Obstacle Avoidance NEARFARVERY FAR SMALLVery SharpSharp TurnMed Turn MEDIUMSharp TurnMed TurnMild Turn LARGEMed TurnMild TurnZero Turn Nearest Obstacle (Distance and Angle) NEARFARVERY FAR SMALLVery SlowSlow SpeedFast MEDIUMSlow SpeedFast SpeedVery Fast LARGEFast SpeedVery FastTop Speed e.g. If the Distance from the Obstacle is NEAR and the Angle from the Obstacle is SMALL Then turn Very Sharply. e.g. If the Distance from the Obstacle is NEAR and the Angle from the Obstacle is SMALL Then turn Very Sharply. Fuzzy System 3 (Steering) Fuzzy System 3 (Steering) Fuzzy System 4 (Speed Adjustment) e.g. If the Distance from the Obstacle is NEAR and the Angle from the Obstacle is SMALL Then move Very Slowly. e.g. If the Distance from the Obstacle is NEAR and the Angle from the Obstacle is SMALL Then move Very Slowly. Vision System Angle Speed

Fuzzification 0.00.5-0.51.0-2.5-3.03.02.5 1.0 0.0 NEGATIVEPOSITIVEZERO Fuzzy Sets = { Negative, Zero, Positive } Fuzzification Example x = 0.25 Crisp Input: x = 0.25 What is the degree of membership of x in each of the Fuzzy Sets? Assuming that we are using trapezoidal membership functions. 1. Fuzzification

Fuzzification 0.00.5-0.51.0-2.5-3.03.02.5 1.0 0.0 NEGATIVEPOSITIVEZERO Fuzzy Sets = { Negative, Zero, Positive } Fuzzification Example

Sample Calculations Crisp Input: F zero (0.25) F negative (0.25) F positive (0.25) 3 - 2.5

Sample Calculations Crisp Input: F zero (-0.25) F negative (-0.25) F positive (-0.25)

LeftTrapezoid Left_Slope = 0 Right_Slope = 1 / (A - B) CASE 1: X < a Membership Value = 1 CASE 2: X >= b Membership Value = 0 CASE 3: a < x < b Membership Value = Right_Slope * (X - b) Trapezoidal Membership Functions ab

RightTrapezoid Left_Slope = 1 / (B - A) Right_Slope = 0 CASE 1: X <= a Membership Value = 0 CASE 2: X >= b Membership Value = 1 CASE 3: a < x < b Membership Value = Left_Slope * (X - a) ab

Trapezoidal Membership Functions Regular Trapezoid Left_Slope = 1 / (B - A) Right_Slope = 1 / (C - D) CASE 1: X = d Membership Value = 0 CASE 2: X >= b And X <= c Membership Value = 1 CASE 3: X >= a And X <= b Membership Value = Left_Slope * (X - a) CASE 4: (X >= c) And (X <= d) Membership Value = Right_Slope * (X - d) abcd

Inputs are applied to a set of if/then control rules. 2. Rule Evaluation e.g. IF temperature is very hot, THEN set fan speed very high. The results of various rules are summed together to generate a set of “fuzzy outputs”. Fuzzy Control NLNS ZEPS PL Different stages of Fuzzy control N ZE P N ZE P FAMM Outputs NL=-5 NS=-2.5 ZE=0 PS=2.5 PL=5.0 W1W4W7 W2W5W8 W3W6W9 x y

Fuzzy Control Assuming that we are using the conjunction operator (AND) in the antecedents of the rules. NLNS ZEPS PL N ZE P N ZE P FAMM x y W1W4W7 W2W5W8 W3W6W9 2. Rule Evaluation

Fuzzy Control = -1.25/ 4.5 = -0.278 Defuzzification Example Assuming that we are using the center of mass defuzzification method. NLNS ZEPS PL N ZE P N ZE P FAMM x y W1W4W7 W2W5W8 W3W6W9 3. Defuzzification

Neural Networks

Three-layer Network to Solve the XOR Problem 0.91 0.98 1 1 1 0 -3.29 -4.95 7.1 -2.76 10.9 Output Layer Hidden Layer Input Layer

Training a Network BACKPROPAGATION TRAINING -4.95 7.1 0.91 0.98 1 1 -2.76 10.9 7.1 -3.29 x yzh bhbhbhbh bzbzbzbz OkOk OjOj OiOi i (INPUT) j (HIDDEN) k (OUTPUT) W jk W ij W ik t k – target output, O k – actual output

O k (1) OjOj O k (0) O k (2) j (HIDDEN) k (OUTPUT) k –subscript for all output nodes that connect to node j BACKPROPAGATION TRAINING

We will now look at the formulas for adjusting the weights that lead into the output units of a back­propagation network. The actual activation value of an output unit, k, will be o k and the target for unit, k, will be t k. First of all there is a term in the formula for  k, the error signal: Training a Network BACKPROPAGATION TRAINING where f’ is the derivative of the activation function, f. If we use the usual activation function: the derivative term is: -4.95 7.1 0.91 0.98 1 1 -2.76 10.9 7.1 -3.29 x yzh bhbhbhbh bzbzbzbz

The formula to change the weight, w jk between the output unit, k, and unit j is: Training a Network BACKPROPAGATION TRAINING where  is some relatively small positive constant called the learning rate. With the network given, assuming that all weights start with zero values, and with  = 0.1 we have: 0.0 0.5 1 1 0.0 x yzh bhbhbhbh bzbzbzbz

The k subscript is for all the units in the output layer however in this example there is only one unit. In the example, then: Training a Network BACKPROPAGATION TRAINING 0.0 0.5 1 1 0.0 x yzh bhbhbhbh bzbzbzbz The formula for computing the error  j for a hidden unit, j, is:

Training a Network BACKPROPAGATION TRAINING The weight change formula for a weight, w ij that goes between the hidden unit, j and the input unit, i is essentially the same as before: The new weights will be: 0.0 0.5 1 1 0.0 x yzh bhbhbhbh bzbzbzbz

Iterative minimization of error over training set: 1. Put one of the training patterns to be learned on the input units. 2. Find the values for the hidden unit and output unit. 3. Find out how large the error is on the output unit. 4. Use one of the back-propagation formulas to adjust the weights leading into the output unit. 5. Use another formula to find out errors for the hidden layer unit. 6. Adjust the weights leading into the hidden layer unit via another formula. 7. Repeat steps 1 thru 6 for the second, third patterns, etc… Backpropagation Training

Sum of Squared Errors This error is minimised during training. where T k where T k is the target output; O k O k is the actual output of the network; m m is the total number of output units; n n is the number of training exemplars;

Training Neural Nets m Given: Data set, desired outputs and a Neural Net with m weights. Find a setting for the weights that will give good predictive performance on new data. Estimate expected performance on new data. three subsets 1.Split data set (randomly) into three subsets: nTraining set – used for picking weights nValidation set – used to stop training nTest set – used to evaluate performance 2.Pick random, small weights as initial values. iterative minimization of error over training set 3.Perform iterative minimization of error over training set. error 4.Stop when error on validation set reaches a minimum (to avoid over-fitting). 5.Repeat training (from Step 2) several times (to avoid local minima). new data 6.Use best weights to compute error on test set, which is the estimate of performance on new data. Do not repeat training to improve this.

Multi-Layer Feed-Forward Neural Network Why do we need BIAS UNITS (or Threshold Nodes)? Without thresholds, it would be impossible to approximate functions which assign nonzero output to zero input. Apart from improving the speed of learning for some problems (XOR problem), bias units or threshold nodes are required for universal approximation. Without them, the feedforward network always assigns 0 output to 0 input. Without thresholds, it would be impossible to approximate functions which assign nonzero output to zero input. Threshold nodes are needed in much the same way that the constant polynomial ‘1’ is required for approximation by polynomials. -4.95 7.1 0.91 0.98 1 1 -2.76 10.9 7.1 -3.29 x yzh bhbhbhbh bzbzbzbz

Data Sets Split data set (randomly) into three subsets: 1.Training set – used for picking weights 2.Validation set – used to stop training 3.Test set – used to evaluate performance

Training

Input Representation All the signals in a neural net are [0, 1]. Input values should also be scaled to this range (or approximately so) so as to speed training. If the input values are discrete, e.g. {A,B,C,D} or {1,2,3,4}, they need to be coded in unary form.

Output Representation A neural net with a single sigmoid output is aimed at binary classification. Class is 0 if y < 0.5 and 1 otherwise. For multi-class problems Can use one output per class (unary encoding) There may be confusing outputs (two outputs > 0.5 in unary encoding) softmax More sophisticated method is to use special softmax units, which force outputs to sum to 1.

Neural Networks Regression Problems? Classification Problems? Backpropagation Learning Practice solving using the tutorials. Overfitting problem in network training

CSP Constraint Propagation Simple Backtracking (BT) Simple Backtracking with Forward Checking (BT-FC) Simple Backtracking with Forward Checking with Dynamic Variable and Value Ordering Practice solving using the tutorials.

Games Min-Max Alpha-Beta Practice solving using the tutorials

Download ppt "Review for Finals 2011 159.302. SEARCH CONSTRAINT SATISFACTION PROBLEMS GAMES FUZZY LOGIC NEURAL NETWORKS GENETIC ALGORITHM (not included in the exam)"

Similar presentations