Dr. Kenneth Stanley September 6, 2006

CAP6938 Neuroevolution and Developmental Encoding: Neural Network Weight Optimization

Review
Remember, the values of the weights and the topology determine the functionality. Given a topology, how are weights optimized? Weights are just parameters on a structure.

Two Cases
Output targets are known. Output targets are not known.
[Figure: network with inputs X1, X2; hidden units H1, H2; output out1; weights w11, w21, w12]

Decision Boundaries: OR is linearly separable
[Figure: OR function, input/output plot with a bias input; one negative point and three positive points]
Linearly separable problems do not require hidden nodes (nonlinearities).

Decision Boundaries: XOR is not linearly separable
[Figure: XOR function, input/output plot with a bias input; two negative points and two positive points]
XOR requires at least one hidden node.
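To make the two decision-boundary claims concrete, here is a minimal sketch with hand-picked weights. The weights, the threshold activation, and the function names `step`, `or_unit`, and `xor_net` are illustrative assumptions, not taken from the slides. A single threshold unit with a bias is enough for OR; the XOR network uses one hidden node plus direct input-to-output connections.

```python
def step(a):
    """Threshold activation: fire (1) when the net input is positive."""
    return 1 if a > 0 else 0

def or_unit(x1, x2):
    # One unit, no hidden nodes: the line x1 + x2 = 0.5 separates (0, 0) from the rest.
    return step(1.0 * x1 + 1.0 * x2 - 0.5)

def xor_net(x1, x2):
    # Hidden node fires only for the input (1, 1).
    h = step(1.0 * x1 + 1.0 * x2 - 1.5)
    # Output behaves like OR, but the hidden node vetoes the (1, 1) case.
    return step(1.0 * x1 + 1.0 * x2 - 2.0 * h - 0.5)

INPUTS = [(0, 0), (0, 1), (1, 0), (1, 1)]
or_outputs = [or_unit(a, b) for a, b in INPUTS]
xor_outputs = [xor_net(a, b) for a, b in INPUTS]
```

Without the hidden node, no choice of the three remaining weights reproduces the XOR outputs, which is what "not linearly separable" means here.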

Hebbian Learning
Change weights based on the correlation of connected neurons. Learning rules are local. The simple Hebb rule works best when the relevance of inputs to outputs is independent. The simple Hebb rule grows weights without bound. It can be made incremental.
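The unbounded growth can be seen in a small sketch. It assumes the common textbook form of the simple Hebb rule, Δw = η·x·y, which may differ in notation from the rule shown on the slide; the linear unit y = w·x, the learning rate, and the names `hebb_update` and `run_hebb` are hypothetical choices for illustration.

```python
def hebb_update(w, x, y, eta=0.1):
    """Assumed simple Hebb rule: strengthen w in proportion to correlated activity."""
    return w + eta * x * y

def run_hebb(steps=100, w=0.1, eta=0.1, x=1.0):
    """Repeatedly present the same input to one linear unit and record the weight."""
    history = [w]
    for _ in range(steps):
        y = w * x                      # postsynaptic activity of a linear unit
        w = hebb_update(w, x, y, eta)  # local update: uses only x, y, and w
        history.append(w)
    return history

weights = run_hebb()
```

Because every update adds a positive amount proportional to the current weight, the weight grows geometrically and never settles, which motivates rules with a maximum magnitude or a decay term.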

More Complex Local Learning Rules
Hebbian learning with a maximum magnitude has separate rules for excitatory and inhibitory connections. The second term in each rule is a decay term (forgetting). Decay happens when the presynaptic node does not affect the postsynaptic node. Other rules are possible.
Videos: watch the connections change.
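As a rough illustration of a bounded rule with a forgetting term, the sketch below covers an excitatory connection only. The exact formula is an assumption and is not the rule from the slide: growth is scaled by the remaining headroom `w_max - w`, and decay applies when the presynaptic node is active but the postsynaptic node is not. The name `bounded_hebb_step` and all parameter values are hypothetical.

```python
def bounded_hebb_step(w, pre, post, w_max=1.0, eta=0.1, decay=0.05):
    """One update of an assumed bounded Hebbian rule; pre and post lie in [0, 1]."""
    growth = eta * (w_max - w) * pre * post      # shrinks to zero as w nears w_max
    forgetting = decay * w * pre * (1.0 - post)  # pre fired but post did not follow
    return w + growth - forgetting

# Correlated activity: the weight approaches w_max but does not pass it.
w_grown = 0.0
for _ in range(100):
    w_grown = bounded_hebb_step(w_grown, pre=1.0, post=1.0)

# Uncorrelated activity: the weight decays back toward zero.
w_decayed = w_grown
for _ in range(10):
    w_decayed = bounded_hebb_step(w_decayed, pre=1.0, post=0.0)
```

An inhibitory version would mirror this with a negative bound; it is omitted here to keep the sketch short.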

Perceptron Learning: converges on correct weights when the problem is linearly separable
Single-layer learning rule. The rule is applied until the boundary is learned. The unit includes a bias input.
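A minimal sketch of single-layer perceptron learning follows, assuming the standard error-correction form (add learning rate times error times input to each weight). The threshold unit, the learning rate of 1.0, the epoch limit, and the names `train_perceptron` and `predict` are illustrative assumptions.

```python
OR_DATA = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]
XOR_DATA = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]

def predict(w, b, x):
    """Threshold unit with a bias weight b."""
    return 1 if w[0] * x[0] + w[1] * x[1] + b > 0 else 0

def train_perceptron(samples, lr=1.0, max_epochs=100):
    """Apply the rule until a full pass makes no errors, or give up."""
    w, b = [0.0, 0.0], 0.0
    for epoch in range(1, max_epochs + 1):
        errors = 0
        for x, target in samples:
            delta = target - predict(w, b, x)   # +1, 0, or -1
            if delta != 0:
                w[0] += lr * delta * x[0]
                w[1] += lr * delta * x[1]
                b += lr * delta                  # bias input is always 1
                errors += 1
        if errors == 0:
            return w, b, epoch                   # boundary learned
    return w, b, None                            # no boundary found in time

or_w, or_b, or_epochs = train_perceptron(OR_DATA)
xor_w, xor_b, xor_epochs = train_perceptron(XOR_DATA)
```

On OR the loop stops once every example is classified correctly. On XOR it always runs out of epochs, since no single line separates the two classes.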

Backpropagation: designed for at least one hidden layer
First, activation propagates to the outputs. Then, errors are computed and assigned. Finally, weights are updated. The sigmoid is a common activation function.
Notation: x’s are inputs, z’s are hidden units, y’s are outputs, t’s are targets, v’s are layer 1 weights, w’s are layer 2 weights.
[Figure: two-layer network with inputs X1, X2; hidden units z1, z2; outputs y1, y2; targets t1, t2; layer 1 weights v11, v12, v21, v22; layer 2 weights w11, w12, w21, w22]

Backpropagation Algorithm
Initialize weights.
While the stopping condition is false, for each training pair:
    Compute outputs by forward activation.
    Backpropagate error:
        For each output unit, compute the error (target minus output, times slope).
        Compute the weight correction (learning rate times error times hidden output).
        Send the error back to the hidden units.
    Calculate the error contribution for each hidden unit.
    Adjust weights by adding the weight corrections.
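The steps above can be sketched for a small sigmoid network trained on XOR. This is one possible reading of the algorithm under stated assumptions, not the course's reference code: the network size (three hidden units), learning rate, epoch count, random seed, and all function names are hypothetical, and a bias is handled by appending a constant 1.0 to each layer's input.

```python
import math
import random

def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))

def init_weights(n_in, n_hid, n_out, rng):
    """Small random weights; the last weight in each row is the bias."""
    v = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)] for _ in range(n_hid)]
    w = [[rng.uniform(-0.5, 0.5) for _ in range(n_hid + 1)] for _ in range(n_out)]
    return v, w

def forward(x, v, w):
    """Forward activation: inputs x -> hidden units z -> outputs y."""
    xb = x + [1.0]
    zb = [sigmoid(sum(vi * xi for vi, xi in zip(row, xb))) for row in v] + [1.0]
    y = [sigmoid(sum(wi * zi for wi, zi in zip(row, zb))) for row in w]
    return xb, zb, y

def train_pair(x, t, v, w, lr):
    xb, zb, y = forward(x, v, w)
    # Output error: (target minus output) times the sigmoid slope y(1 - y).
    d_out = [(tk - yk) * yk * (1.0 - yk) for tk, yk in zip(t, y)]
    # Error sent back to each hidden unit, times that unit's own slope.
    d_hid = [sum(d_out[k] * w[k][j] for k in range(len(w))) * zb[j] * (1.0 - zb[j])
             for j in range(len(v))]
    # Weight corrections: learning rate times error times the unit's input.
    for k, row in enumerate(w):
        for j in range(len(row)):
            row[j] += lr * d_out[k] * zb[j]
    for j, row in enumerate(v):
        for i in range(len(row)):
            row[i] += lr * d_hid[j] * xb[i]

def total_error(data, v, w):
    return sum((tk - yk) ** 2 for x, t in data
               for tk, yk in zip(t, forward(x, v, w)[2]))

XOR = [([0.0, 0.0], [0.0]), ([0.0, 1.0], [1.0]),
       ([1.0, 0.0], [1.0]), ([1.0, 1.0], [0.0])]
v, w = init_weights(2, 3, 1, random.Random(0))
error_before = total_error(XOR, v, w)
for _ in range(2000):              # stopping condition: fixed number of epochs
    for x, t in XOR:
        train_pair(x, t, v, w, lr=0.5)
error_after = total_error(XOR, v, w)
```

The hidden-unit errors are computed before the layer 2 weights change, so both layers are corrected using the same forward pass. How low `error_after` gets depends on the initial weights.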

Example Applications: anything with a set of examples and known targets
XOR. Character recognition. NETtalk: reading English aloud. Failure prediction.
Disadvantage: training can become trapped in local optima.

Output Targets Often Not Available
(Stone, Sutton, and Kuhlmann 2005)

One Approach: Value Function Reinforcement Learning
Divide the world into states and actions. Assign values to states. Gradually learn the most promising states and actions.
[Figure: world with Start and Goal marked]

Learning to Navigate
[Figure: navigation from Start to Goal, shown in panels at T=1, T=56, T=350, and T=703]

How to Update State/Action Values
State/action values are updated with the Q-learning rule. Exploration increases the accuracy of the Q-values. The best actions to take in different states become known. Works only in Markovian domains.
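A small tabular sketch shows how state/action values can be learned. It assumes the standard Q-learning update, Q(s,a) ← Q(s,a) + α[r + γ·max Q(s',a') − Q(s,a)]. The five-state corridor world, the reward of 1 at the goal, epsilon-greedy exploration, and all parameter values and names are hypothetical choices made for this illustration.

```python
import random

def q_learning(n_states=5, episodes=200, alpha=0.5, gamma=0.9,
               epsilon=0.2, seed=0):
    """Learn Q-values for a corridor: start at state 0, goal at the last state."""
    rng = random.Random(seed)
    moves = (-1, +1)                       # action 0 = left, action 1 = right
    goal = n_states - 1
    q = [[0.0, 0.0] for _ in range(n_states)]
    for _ in range(episodes):
        s = 0
        while s != goal:
            # Explore with probability epsilon, or when both actions look equal.
            if rng.random() < epsilon or q[s][0] == q[s][1]:
                a = rng.randrange(2)
            else:
                a = 0 if q[s][0] > q[s][1] else 1
            s_next = min(max(s + moves[a], 0), goal)   # walls at both ends
            r = 1.0 if s_next == goal else 0.0
            best_next = 0.0 if s_next == goal else max(q[s_next])
            # Move Q(s, a) toward the reward plus the discounted best next value.
            q[s][a] += alpha * (r + gamma * best_next - q[s][a])
            s = s_next
    return q

q_table = q_learning()
greedy_actions = [0 if left > right else 1 for left, right in q_table[:-1]]
```

The next state and reward depend only on the current state and action, so this world is Markovian, matching the condition stated above.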

Backprop in RL
The state/action table can be estimated by a neural network. The target learned by the network is the Q-value.
[Figure: a neural network (NN) takes State_description and Action as inputs and outputs Value]

Next Week: Evolutionary Computation
EC does not require targets. EC can be a kind of RL. EC is policy search. EC is more than RL.
For 9/11: Mitchell ch. 1 (pp. 1-31) and ch. 2 (pp ). Note: Section 2.3 is "Evolving Neural Networks".
