Learning and Perceptrons

CIS 488/588 Bruce R. Maxim UM-Dearborn 11/30/2018

Momentum and Friction
When human players use a mouse to aim:
    Momentum turns the view more than they expect for large angles (the "ballistic mouse" effect).
    Friction slows down the turn for small angles.
    Adjustments are needed to avoid losing accuracy.
AI players simply aim at an exact target and shoot.
Perfect shooters may not be fun to play against, so turning errors can be introduced.


Explicit Model
A mathematical function for computing the actual turning angle in terms of the desired angle and the previous output:
    … = … * noise(angle)
    output(t) = (angle * β) * α + output(t - 1) * (1 - α)
α = scaling factor for blending the previous output with the angle request, in the range [0.3, 0.5].
β = initialized to a random value in the range [0.9, 1.1].
noise( ) returns a value in the range [-1, 1]; could use cos(angle² * … / angle).
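The blending step of the explicit model can be sketched in Python. This is only an illustration under stated assumptions: the class name `TurnModel`, the parameter names `alpha` (blend factor) and `beta` (scale factor), and the starting value of zero for the previous output are all hypothetical, and the noise term is left out because its exact form is not given here.

```python
import random


class TurnModel:
    """Blends each requested angle with the previous output (hypothetical helper)."""

    def __init__(self, alpha=0.4, beta=None, rng=None):
        rng = rng or random.Random()
        self.alpha = alpha  # blend factor, expected in [0.3, 0.5]
        # scale factor, drawn once from [0.9, 1.1] unless supplied
        self.beta = rng.uniform(0.9, 1.1) if beta is None else beta
        self.previous = 0.0  # output(t - 1), assumed to start at zero

    def turn(self, angle):
        """Return the actual turn for a desired angle and remember it."""
        output = (angle * self.beta) * self.alpha + self.previous * (1.0 - self.alpha)
        self.previous = output
        return output
```

With `alpha = 0.5` and `beta = 1.0`, two successive requests of 10 degrees give 5.0 and then 7.5, so the view lags behind the request much as momentum would make it.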

Linear Approximation
We can use a perceptron to approximate the function described earlier.
Once the animat learns a faster approximation of the function, the explicit function can be removed from the AI code.
Aiming errors just become a constraint on animat behavior.


Methodology
The approximation is computed by training the network iteratively.
The desired output is computed for random inputs.
By grouping results, the batch algorithm can be used to find values for the weights and bias.
A small perceptron is applied twice (to get pitch and then yaw) rather than creating a larger one that does both.
This reduces memory use at the expense of programming time.
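The methodology above might look like the following Python sketch. It assumes a noise-free version of the blending function as the training target, a linear unit with two inputs (requested angle and previous output, both scaled to [-1, 1]) plus a bias, and hypothetical names and parameter values throughout; it is not the code used by the animat.

```python
import random


def target(angle, previous, alpha=0.4, beta=1.0):
    """Function to approximate (noise-free blending model, assumed values)."""
    return angle * beta * alpha + previous * (1.0 - alpha)


def make_samples(count=50, seed=0):
    """Desired outputs computed for random inputs scaled to [-1, 1]."""
    rng = random.Random(seed)
    samples = []
    for _ in range(count):
        inputs = (rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0))
        samples.append((inputs, target(*inputs)))
    return samples


def train_batch(samples, epochs=500, learning_rate=0.5):
    """Batch delta rule for a linear unit with two inputs and a bias."""
    w = [0.0, 0.0]
    bias = 0.0
    n = len(samples)
    for _ in range(epochs):
        grad_w = [0.0, 0.0]
        grad_b = 0.0
        # accumulate the error gradient over the whole batch
        for (x0, x1), desired in samples:
            error = desired - (w[0] * x0 + w[1] * x1 + bias)
            grad_w[0] += error * x0
            grad_w[1] += error * x1
            grad_b += error
        # apply one averaged update per pass over the batch
        w[0] += learning_rate * grad_w[0] / n
        w[1] += learning_rate * grad_w[1] / n
        bias += learning_rate * grad_b / n
    return w, bias
```

Calling `train_batch(make_samples())` should leave the two weights close to `alpha * beta` and `1 - alpha`. The same trained unit can then be evaluated once for pitch and once for yaw.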

Missy
A perceptron is used to approximate the aiming error function.
Learning is accomplished using a batch algorithm during initialization (cannot locate code for Missy).

Evaluation
A linear approximation of the non-linear function is close enough for this task.
Turning behavior is more realistic since it visibly shows signs of taking momentum and friction into account.
Training is done quickly, using as few as 10 samples during initialization to get the weights to settle into equilibrium.

Accumulating Errors
Momentum and friction cause errors, or drift, that tend to accumulate after several turns.
These errors make the AI's performance more realistic.
Ignoring the variations in aiming will make the AI too error prone to challenge human players.

Inverse Error
To compensate for aiming errors, we could define an inverse error function to help correct them.
Not every function has a definable inverse, so the AI would be better served by a math-free method of approximating this type of function.
Given enough trial and error through simulation opportunities, the AI should be able to predict the corrected angles needed.

Learning - 1
In effect, the AI learns how to deal with aiming errors by receiving evaluative feedback.
Using this feedback, the AI can incrementally improve its task performance.
The AI uses its sensors to detect the actual angles the body was turned since the last update.
Unfortunately, the AI learns to shoot where it should have shot last time.

Learning - 2
With enough trials the AI can learn to anticipate where to shoot (the NN weights provide a crude memory to work with).
Both the inputs and outputs will need to be scaled because the perceptron will have to deal with values that do not fall within the unit interval.

Aimy
A perceptron is used to learn the corrected angles needed to prevent undershooting and overshooting.
It gathers data from its sensors to determine how far its body turned for each requested angle.
Incremental training is used to approximate the inverse function needed to prevent aiming errors.
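A rough Python sketch of this kind of incremental learning follows. The simulated body (`simulated_body_turn`, which undershoots every request by 20%), the 90 degree scaling constant, and the learning rate are all assumptions made for illustration; a real animat would read the actual turn from its sensors instead.

```python
import random

MAX_ANGLE = 90.0  # assumed scaling constant mapping angles into [-1, 1]


def simulated_body_turn(requested):
    """Hypothetical body that undershoots every request by 20%."""
    return 0.8 * requested


def train_inverse(trials=2000, learning_rate=0.5, seed=0):
    """Incremental delta rule: learn which request produces an observed turn."""
    rng = random.Random(seed)
    weight, bias = 1.0, 0.0
    for _ in range(trials):
        requested = rng.uniform(-MAX_ANGLE, MAX_ANGLE)
        actual = simulated_body_turn(requested)  # sensed after the turn
        x = actual / MAX_ANGLE                   # scaled input
        desired = requested / MAX_ANGLE          # scaled target
        error = desired - (weight * x + bias)
        # update immediately after each sample (no batching)
        weight += learning_rate * error * x
        bias += learning_rate * error
    return weight, bias


def corrected_request(desired_turn, weight, bias):
    """Angle to request so that the body actually turns by desired_turn."""
    return (weight * desired_turn / MAX_ANGLE + bias) * MAX_ANGLE
```

After training on this toy body, asking for a 40 degree turn should produce a corrected request of about 50 degrees, which the body then undershoots back to roughly 40.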

Evaluation - 1
The animat should have the opportunity to correct its aiming while moving around.
Perceptrons can learn more quickly when more training samples are presented.
The animat can correct its aim in only two dimensions (pitch and yaw).
Only when pitch is near horizontal can the animat aim while it is moving.

Evaluation - 2
When looking fully up or fully down, no forward movement is possible, which prevents learning.
To prevent this trap, the animat is only allowed to control yaw until satisfactory results are obtained.
The worst that happens is the animat spinning around while learning.

Evaluation - 3
The way in which the yaw is chosen determines the angles available for learning.
If the animat has full control over the yaw, it can decide what to learn and what to ignore (the effect may be for the NN to always predict the same turn to correct aiming errors).
This is a good reason for forcing the NN to examine a variety of randomly generated angles during training, to get a more representative training set and better learning.

Multilayer Perceptrons
Single layer perceptrons can only deal with linear problems.
Non-linear problems can only be approximated by single layer perceptrons.
Multilayer perceptrons (MLP):
    Have extra middle layers known as “hidden” layers.
    The middle layers require more sophisticated activation functions than single layer perceptrons (e.g. linear activations would make an MLP behave like a single layer perceptron).
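The point about linear activations can be checked with a small worked example in Python. The weight values below are arbitrary, chosen only for illustration: two purely linear layers applied in sequence give the same result as a single layer whose weights are the product of the two weight matrices.

```python
def mat_vec(matrix, vector):
    """Multiply a weight matrix (list of rows) by an input vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def mat_mul(a, b):
    """Multiply two matrices stored as lists of rows."""
    columns = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in columns] for row in a]


hidden_weights = [[1.0, 2.0], [0.5, -1.0], [3.0, 0.0]]  # 2 inputs -> 3 hidden
output_weights = [[2.0, -1.0, 0.5]]                     # 3 hidden -> 1 output
inputs = [1.0, 2.0]

# two layers with linear (identity) activations
two_layer = mat_vec(output_weights, mat_vec(hidden_weights, inputs))

# one layer using the combined weight matrix
single_layer = mat_vec(mat_mul(output_weights, hidden_weights), inputs)
```

Both `two_layer` and `single_layer` evaluate to `[13.0]`, which is why a non-linear activation is needed in the hidden layer for the extra layer to add anything.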


Topology
MLP topology is said to be feed-forward because there are no backward (recurrent) connections.
There can be an arbitrary number of hidden layers in an MLP.
Adding too many hidden layers increases the computational complexity of the network.
One hidden layer is usually enough to allow the MLP to be a universal approximator capable of approximating any continuous function.

Hidden Layers
In some cases, there may be many interdependencies among the input variables, and adding an extra hidden layer can be helpful.
Adding hidden layers can sometimes reduce the total number of weights needed for a suitable approximation.
An MLP with two hidden layers can approximate any non-continuous function.

Hidden Neurons
Choosing the number of neurons in the hidden layer is an art that often depends on the AI designer’s intuition and experience.
The neurons in the hidden layer are needed to represent the problem knowledge internally.
As the number of dimensions grows, the complexity of the decision surface (path through hidden layer) increases.
Basically, the output on one side of the surface is positive and on the other side negative.

Connections
Neurons can be fully connected to one another within and between layers.
Neurons can also be sparsely connected and even skip layers (e.g. straight from input to output).
Most MLPs are fully connected to simplify programming.

Activation Function Properties
Derivable (known and computable derivative)
Continuous (derivative defined everywhere)
Complexity (nonlinear for higher order tasks)
Monotonic (derivative positive)
Bounded (activation output and its derivative are finite)
Polarity (bipolar preferred to positive)

Activation Functions
Activation functions for the input and output layers are usually one of the following: step, linear, threshold logic, sigmoid.
Hidden layer activation functions might be one of the following:
    Sigmoid: sig(x) = 1/(1 + e^(-x))
    Hyperbolic tangent
    Bipolar sigmoid: sigb(x) = 2/(1 + e^(-x)) - 1
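For reference, the two sigmoid variants and their derivatives can be written in a few lines of Python. The function names are illustrative only; the derivative formulas follow from differentiating the definitions above.

```python
import math


def sigmoid(x):
    """sig(x) = 1 / (1 + e^(-x)), output in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def sigmoid_derivative(x):
    """Derivative of the sigmoid, expressed through its own output."""
    s = sigmoid(x)
    return s * (1.0 - s)


def bipolar_sigmoid(x):
    """sigb(x) = 2 / (1 + e^(-x)) - 1, output in (-1, 1)."""
    return 2.0 / (1.0 + math.exp(-x)) - 1.0


def bipolar_sigmoid_derivative(x):
    """Derivative of the bipolar sigmoid, expressed through its own output."""
    s = bipolar_sigmoid(x)
    return 0.5 * (1.0 + s) * (1.0 - s)
```

The bipolar sigmoid is the hyperbolic tangent with a rescaled argument: `bipolar_sigmoid(x)` equals `math.tanh(x / 2)` up to rounding.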

Role of Hidden Layers
The use of a hidden layer implies that the information needed to compute the output must be filtered before passing it on to the next layer.
Each layer of the MLP receives its input from the previous layer and passes its modified output on to the next layer.

Feed-Forward Algorithm
current = input;  // process input layer
for layer = 1 to n
{
    for i = 1 to m  // compute output of each neuron
    {
        // multiply arrays and sum result
        s = NetSum(neuron[i].weights, current);
        output[i] = Activate(s);
    }
    // next layer uses this layer’s output as input
    current = output;
}
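The same loop can be written as a short runnable Python sketch. The data layout (each layer is a list of `(weights, bias)` pairs, one per neuron) and the explicit bias term are assumptions made for this example, not something the pseudocode specifies.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def feed_forward(layers, inputs, activate=sigmoid):
    """Propagate inputs through every layer and return the final outputs.

    layers: list of layers; each layer is a list of (weights, bias) per neuron.
    """
    current = list(inputs)
    for layer in layers:
        output = []
        for weights, bias in layer:
            # multiply weights by inputs and sum the result
            net_sum = sum(w * x for w, x in zip(weights, current)) + bias
            output.append(activate(net_sum))
        # the next layer uses this layer's output as its input
        current = output
    return current
```

For instance, a network whose weights and biases are all zero outputs 0.5 from every sigmoid neuron regardless of the input.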

Benefits of MLP
The importance of MLPs is not that they really mimic animal brains; they do not.
MLPs have a thoroughly researched mathematical foundation and have been proven to work well in some applications.
MLPs can be trained to do interesting things, and this training really just involves numeric optimization (minimizing output error).

Back Propagation - 1
BP is the process of filtering error from the output layer back through the preceding layers.
BP was developed in response to the fact that single layer perceptron algorithms do not train hidden layers.
BP is the essence of most MLP learning algorithms.


Back Propagation - 2
BP is a form of hill climbing known as “gradient ascent” when framed as maximizing performance; equivalently, it is gradient descent on the output error.
    Several directions are tried simultaneously.
    The “steepest gradient” is used to direct the search.
Training may require thousands of backpropagations.
BP can get stuck or become unstable during training.
BP can be done in stages.

Back Propagation - 3
BP can train a net to recognize several concepts simultaneously.
Trained neural networks can be used to make predictions.
Too many trainable weights relative to the number of training facts can lead to overfitting problems.

Back Propagation Algorithm
Given: a set of input-output pairs
Task: compute weights for a 3 layer network that maps inputs to corresponding outputs
Algorithm:
1. Determine the number of neurons required
2. Initialize weights to random values
3. Set activation values for threshold units
4. Choose an input-output pair and assign activation levels to input neurons
5. Propagate activations from input neurons to hidden layer neurons; for each neuron, h_j = 1/(1 + e^(-Σ_i w1_ij x_i))
6. Propagate activations from hidden layer neurons to output neurons; for each neuron, o_j = 1/(1 + e^(-Σ_i w2_ij h_i))
7. Compute error for output neurons by comparing the desired pattern to the actual output
8. Compute error for neurons in hidden layer
9. Adjust weights between hidden layer and output layer
10. Adjust weights between input layer and hidden layer
11. Go to step 4
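The steps of the algorithm can be sketched end to end in Python. This is a minimal illustration, not a reference implementation: the XOR training set, the hidden layer size, the learning rate, the number of epochs, and every function name are assumptions, and threshold units are modelled as an extra input fixed at 1.0.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def init_weights(rows, cols, rng):
    """Step 2: random initial weights; the extra column is the threshold weight."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols + 1)] for _ in range(rows)]


def forward(w1, w2, x):
    """Steps 4-6: propagate activations; threshold units are fixed at 1.0 (step 3)."""
    xb = list(x) + [1.0]
    hb = [sigmoid(sum(w * v for w, v in zip(row, xb))) for row in w1] + [1.0]
    o = [sigmoid(sum(w * v for w, v in zip(row, hb))) for row in w2]
    return xb, hb, o


def train_step(w1, w2, x, desired, rate):
    """Steps 7-10 for one input-output pair."""
    xb, hb, o = forward(w1, w2, x)
    # step 7: output error scaled by the sigmoid derivative o * (1 - o)
    delta_o = [(d - y) * y * (1.0 - y) for d, y in zip(desired, o)]
    # step 8: hidden error gathered back through the output weights
    delta_h = [
        hb[j] * (1.0 - hb[j]) * sum(delta_o[k] * w2[k][j] for k in range(len(w2)))
        for j in range(len(w1))
    ]
    # step 9: adjust weights between hidden layer and output layer
    for k, row in enumerate(w2):
        for j in range(len(row)):
            row[j] += rate * delta_o[k] * hb[j]
    # step 10: adjust weights between input layer and hidden layer
    for j, row in enumerate(w1):
        for i in range(len(row)):
            row[i] += rate * delta_h[j] * xb[i]


def total_error(w1, w2, samples):
    """Sum of squared output errors over all samples."""
    return sum(
        0.5 * (d - y) ** 2
        for x, desired in samples
        for d, y in zip(desired, forward(w1, w2, x)[2])
    )


def train(samples, hidden=4, epochs=2000, rate=0.5, seed=1):
    rng = random.Random(seed)
    w1 = init_weights(hidden, len(samples[0][0]), rng)  # step 1: sizes chosen here
    w2 = init_weights(len(samples[0][1]), hidden, rng)
    before = total_error(w1, w2, samples)
    for _ in range(epochs):  # step 11: keep cycling through the pairs
        for x, desired in samples:
            train_step(w1, w2, x, desired, rate)
    return w1, w2, before, total_error(w1, w2, samples)


XOR = [((0.0, 0.0), (0.0,)), ((0.0, 1.0), (1.0,)),
       ((1.0, 0.0), (1.0,)), ((1.0, 1.0), (0.0,))]
```

`train(XOR)` returns the trained weights together with the total error before and after training, which gives a quick way to see whether the weights are moving in the right direction. As the slides note, training can still get stuck, so a different seed or hidden layer size may be needed.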

Backprop - 1
// compute gradient in last layer neurons
for j = 1 to m
    delta[j] = deriv_activate(net_sum[j]) * (desired[j] - output[j]);
// process the preceding layers, from last - 1 back to first
for i = last - 1 to first
{
    for j = 1 to m  // each neuron j in layer i
    {
        total = 0;
        for k = 1 to n  // each neuron k in the following layer
            total += delta[k] * weights[j][k];
        delta[j] = deriv_activate(net_sum[j]) * total;
    }
}

Backprop - 2
// compute each weight wij using steepest descent on the error gradient
for j = 1 to m
    for i = 1 to n
        // adjust weights using error gradient
        weight[j][i] += learning_rate * delta[j] * output[i];
// The generalized delta rule is used to compute each weight wij
// learning_rate is set by the knowledge engineer (KE)
// delta[j] is the gradient of the neuron j error
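Written as equations, the updates used in the backprop pseudocode take the following form. The notation is an assumption introduced here: $w_{ji}$ is the weight from neuron $i$ into neuron $j$, $y_i$ is the output of neuron $i$, $d_j$ is the desired output, $\mathit{net}_j$ is the net sum of neuron $j$, $f$ is the activation function, and $\eta$ is the learning rate; a sum-of-squares output error is assumed.

```latex
\begin{aligned}
\delta_j &= f'(\mathit{net}_j)\,(d_j - y_j)
  && \text{for an output neuron } j \\
\delta_j &= f'(\mathit{net}_j) \sum_k \delta_k\, w_{kj}
  && \text{for a hidden neuron } j \text{ feeding neurons } k \\
\Delta w_{ji} &= \eta\, \delta_j\, y_i
  && \text{weight change for every connection}
\end{aligned}
```

For the sigmoid activation, $f'(\mathit{net}_j) = y_j (1 - y_j)$, so the derivative can be computed from the neuron's output alone.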

Quick Propagation
Batch technique.
Exploits locally adaptive techniques to adjust the step magnitude based on local parameters.
Uses knowledge of higher-order derivatives (e.g. Newton’s method).
Allows for better prediction of the slope of the curve and the location of minima.
Weights are updated using a method similar to backprop.

Quickprop - 1
// Requires two additional arrays for step and gradient;
// it remembers the last set of values
// New weight update replaces steepest descent
for j = 1 to m
    for i = 1 to n
    {
        // compute gradient and step
        new_gradient[j][i] = -delta[j] * input[i];
        new_step[j][i] = new_gradient[j][i] / (old_gradient[j][i] - new_gradient[j][i]) * old_step[j][i];

Quickprop - 2
        // adjust weight
        weight[j][i] += new_step[j][i];
        // store values for next iteration
        old_step[j][i] = new_step[j][i];
        old_gradient[j][i] = new_gradient[j][i];
    }
Note: since this is a batch algorithm, the gradients for all training samples are added together.
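The quickprop step can be tried on a single weight in Python. The toy error function E(w) = (w - 3)^2, the bootstrap step taken with plain steepest descent, and the guard against a zero denominator are all assumptions added for this sketch.

```python
def quickprop_step(new_gradient, old_gradient, old_step):
    """Step toward the minimum of the parabola fitted through two gradients."""
    denominator = old_gradient - new_gradient
    if denominator == 0.0:
        return 0.0  # assumption: skip the update when the fit is degenerate
    return new_gradient / denominator * old_step


def error_gradient(w):
    """Gradient of the toy error E(w) = (w - 3)^2."""
    return 2.0 * (w - 3.0)


def minimise(start=0.0, first_rate=0.1, iterations=5):
    w = start
    old_gradient = error_gradient(w)
    old_step = -first_rate * old_gradient  # bootstrap with one steepest-descent step
    w += old_step
    for _ in range(iterations):
        new_gradient = error_gradient(w)
        step = quickprop_step(new_gradient, old_gradient, old_step)
        w += step
        # store values for the next iteration
        old_step, old_gradient = step, new_gradient
    return w
```

Because the toy error is exactly a parabola, the first quickprop step already lands on the minimum at w = 3 (up to rounding). On real error surfaces the fit is only approximate, so more iterations are needed.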

Resilient Propagation
Weights are updated only after all training samples have been seen.
The step size is not determined by the gradient, unlike steepest descent techniques.
The equations are not too hard to implement.

Rprop - 1
// New weight update replaces steepest descent
for j = 1 to m
    for i = 1 to n
    {
        // compute gradient and step
        new_gradient[j][i] = -delta[j] * input[i];
        // analyze change to get size of update
        if (new_gradient[j][i] * old_gradient[j][i] > 0)
            new_update[j][i] = nplus * old_update[j][i];
        else if (new_gradient[j][i] * old_gradient[j][i] < 0)
            new_update[j][i] = nminus * old_update[j][i];
        else
            new_update[j][i] = old_update[j][i];

Rprop - 2
        // determine step direction
        if (new_gradient[j][i] > 0)
            step[j][i] = -new_update[j][i];
        else if (new_gradient[j][i] < 0)
            step[j][i] = new_update[j][i];
        else
            step[j][i] = 0;
        // adjust weight and store values
        weight[j][i] += step[j][i];
        old_update[j][i] = new_update[j][i];
        old_gradient[j][i] = new_gradient[j][i];
    }
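A single-weight version of the Rprop update can be sketched in Python as follows. The increase and decrease factors (1.2 and 0.5), the initial update size, and the lower and upper bounds on the update are commonly quoted defaults that are assumed here, as is the function name; none of them come from the slides.

```python
def rprop_minimise(gradient, w, iterations=100, n_plus=1.2, n_minus=0.5,
                   initial_update=0.1, min_update=1e-6, max_update=50.0):
    """Minimise a one-dimensional error using only the sign of its gradient."""
    old_gradient = 0.0
    update = initial_update
    for _ in range(iterations):
        new_gradient = gradient(w)
        # analyze the change in sign to get the size of the update
        change = new_gradient * old_gradient
        if change > 0:
            update = min(update * n_plus, max_update)   # same direction: speed up
        elif change < 0:
            update = max(update * n_minus, min_update)  # overshot: slow down
        # the step direction depends only on the sign of the gradient
        if new_gradient > 0:
            w -= update
        elif new_gradient < 0:
            w += update
        old_gradient = new_gradient
    return w
```

For example, `rprop_minimise(lambda w: 2.0 * (w - 3.0), 0.0)` walks toward the minimum at w = 3, growing the step while the gradient keeps its sign and shrinking it each time the minimum is overshot.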

Building Neural Networks
Define the problem in terms of neurons
    think in terms of layers
Represent information as neurons
    operationalize neurons
    select their data type
    locate data for testing and training
Define the network
Train the network
Test the network

Generalization – 1
The learning phase is responsible for optimizing the weights from the training examples.
It would be good if the NN could also process new or unseen examples correctly (generalization).
An NN that is bound too tightly to its training examples is said to be overfitting.
Overfitting is generally not a problem with single layer perceptrons.

Generalization – 2
For an MLP, the number of hidden neurons affects the complexity of the decision surface.
Need to find the trade-off between the number of hidden neurons and result quality.
Incorrect or incomplete data interferes with generalization.
Bad training examples are usually to blame for the failure of an MLP to learn concepts.

Testing and Validation
Training sets: used to optimize the weights for a given set of parameters.
Validation sets: used to check the quality of training and help to find the best combination of parameters.
Testing sets: used to check the final quality of validated perceptrons (no test info is used to improve the NN).
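One simple way to produce the three sets is to shuffle the available samples once and slice them. The 60/20/20 split, the fixed seed, and the function name in this Python sketch are assumptions chosen for illustration.

```python
import random


def split_samples(samples, train_fraction=0.6, validation_fraction=0.2, seed=0):
    """Shuffle once, then slice into training, validation, and testing sets."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    train_end = round(len(shuffled) * train_fraction)
    validation_end = train_end + round(len(shuffled) * validation_fraction)
    return (shuffled[:train_end],
            shuffled[train_end:validation_end],
            shuffled[validation_end:])
```

The testing slice is set aside and only consulted once the parameters have been chosen using the validation slice.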

Batch vs Incremental
Batch training is preferred over incremental training:
    It converges to an answer faster.
    It has greater accuracy.
Incremental data can be gathered for batch processing if necessary.
Incremental approaches are best suited for real-time, in-game learning (they require less memory).

Forgetting
With incremental learning, it may be wise to slow down the learning rate later in the game to avoid forgetting earlier lessons.
There is no formal approach to reducing the learning rate; linear or exponential decay strategies are often successful.
This implies that learning will eventually become frozen as time passes.
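The two decay strategies mentioned can be expressed as small Python helpers. The function names, parameters, and the clamp at zero are assumptions made for this sketch.

```python
def linear_decay(initial_rate, step, total_steps):
    """Rate falls in a straight line and reaches zero at total_steps."""
    remaining = max(0.0, 1.0 - step / total_steps)
    return initial_rate * remaining


def exponential_decay(initial_rate, decay, step):
    """Rate is multiplied by a constant factor (0 < decay < 1) at every step."""
    return initial_rate * decay ** step
```

With linear decay the rate hits zero and learning freezes completely; with exponential decay it only approaches zero, so learning slows without ever fully stopping.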

Perceptron Advantages
Good mathematical foundation.
If a solution exists, it can be found.
Work best for well defined problems.
If things go wrong, the parameters can be adjusted.
Lots of training algorithms exist.
MLP works easily with continuous values.
Deals well with noise.

Perceptron Disadvantages - 1
NNs do not contain an easily understood representation of their knowledge.
MLP depends entirely on the algorithms used to create it.
MLP does not scale well.
Once trained, an MLP is not updated without retraining.
Retraining does not preserve previous MLP knowledge.

Perceptron Disadvantages - 2
Design of inputs and outputs can have a profound impact on MLP success.
Input may require pre-processing and outputs may require post-processing.
Getting the right number of layers and neurons requires trial and error.

Onno
Uses a large neural network to handle shooting (prediction, target selection, aiming).
Input is similar to that described in previous chapters.
Results are moderate, but demonstrate the versatility of MLP and the benefits of decomposing behaviors.
