A note about gradient descent: consider the function $f(x) = (x - x_0)^2$. Its derivative is $f'(x) = 2(x - x_0)$. By gradient descent: $x \leftarrow x - \eta f'(x) = x - 2\eta(x - x_0)$, so $x$ approaches the minimum at $x_0$ from either side.
Solving the differential equation: in continuous time, gradient descent gives $\frac{dx}{dt} = -\eta\,\frac{df}{dx} = -2\eta(x - x_0)$, or in the general form $\tau\,\frac{dx}{dt} = -(x - x_0)$. What is the solution of this type of equation? Try: $x(t) = x_0 + A\,e^{-t/\tau}$.
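A minimal numerical sketch of the point above (the step size, starting value and value of $x_0$ below are illustrative assumptions): the update $x \leftarrow x - \eta f'(x)$ shrinks the distance to $x_0$ by a constant factor each step, the discrete counterpart of the exponential solution.

```matlab
% Sketch: gradient descent on f(x) = (x - x0)^2 (eta, x0 and x(1) are assumed values).
x0  = 3;                           % location of the minimum
eta = 0.1;                         % learning rate
x   = 0;                           % starting point
for t = 1:50
    x = x - eta * 2 * (x - x0);    % x <- x - eta * f'(x)
end
disp(x)                            % near x0: the distance shrinks by (1 - 2*eta) per step
```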
THE PERCEPTRON (classification). Threshold unit: $O^\mu = \Theta\!\left(\sum_i w_i x_i^\mu - \theta\right)$, where $O^\mu$ is the output for input pattern $x^\mu$, $w_i$ are the synaptic weights and $y^\mu$ is the desired output. (Diagram: inputs feeding a single threshold unit through weights $w_1,\dots,w_5$.)
AND truth table $(x_1, x_2 \to y)$: $(0,0)\to 0$, $(0,1)\to 0$, $(1,0)\to 0$, $(1,1)\to 1$. Linearly separable.
OR truth table $(x_1, x_2 \to y)$: $(0,0)\to 0$, $(0,1)\to 1$, $(1,0)\to 1$, $(1,1)\to 1$. Linearly separable.
Perceptron learning rule: $\Delta w_i = \eta\,(y^\mu - O^\mu)\,x_i^\mu$. Convergence proof: Hertz, Krogh, Palmer (HKP) – did you receive the ? Assignment 3a: program in MATLAB a perceptron with the perceptron learning rule and solve the OR, AND and XOR problems. (Due before Feb 27) Show Demo
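A minimal MATLAB sketch of a threshold unit trained with the perceptron learning rule, shown here on the AND problem; the learning rate, initialization and variable names are illustrative assumptions, not the assignment specification.

```matlab
% Sketch of a perceptron with the perceptron learning rule (names and
% values are illustrative, not part of the assignment specification).
X   = [0 0; 0 1; 1 0; 1 1];    % input patterns x^mu (one per row)
y   = [0; 0; 0; 1];            % desired outputs: the AND problem
w   = zeros(2,1);  theta = 0;  % weights and threshold
eta = 0.5;                     % learning rate
for epoch = 1:100
    for mu = 1:size(X,1)
        O     = double(X(mu,:)*w - theta >= 0);   % threshold unit output
        w     = w     + eta*(y(mu) - O)*X(mu,:)'; % dw_i = eta (y - O) x_i
        theta = theta - eta*(y(mu) - O);          % the threshold learns too
    end
end
disp([X, double(X*w - theta >= 0)])   % learned outputs next to the inputs
```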
Summary – what can perceptrons do and how?
Linear single-layer network (approximation, curve fitting). Linear unit: $O^\mu = \sum_i w_i x_i^\mu$, where $O^\mu$ is the output for input pattern $x^\mu$, $w_i$ are the synaptic weights and $y^\mu$ is the desired output. Minimize the mean square error: $E = \frac{1}{2}\sum_\mu (y^\mu - O^\mu)^2$, or $E = \frac{1}{2}\sum_\mu \left(y^\mu - \sum_i w_i x_i^\mu\right)^2$.
The best solution is obtained when $E$ is minimal. For linear neurons there is an exact solution for this, called the pseudo-inverse (see HKP). Looking for a solution by gradient descent: $\Delta w_i = -\eta\,\frac{\partial E}{\partial w_i}$. (Figure: $E$ plotted against $w$, with the negative gradient pointing toward the minimum.) By the chain rule: $\frac{\partial E}{\partial w_i} = \sum_\mu \frac{\partial E}{\partial O^\mu}\,\frac{\partial O^\mu}{\partial w_i}$.
Since $O^\mu = \sum_i w_i x_i^\mu$, we have $\frac{\partial O^\mu}{\partial w_i} = x_i^\mu$, and since the error is $E = \frac{1}{2}\sum_\mu (y^\mu - O^\mu)^2$, we have $\frac{\partial E}{\partial O^\mu} = -(y^\mu - O^\mu)$. Therefore: $\Delta w_i = \eta \sum_\mu (y^\mu - O^\mu)\,x_i^\mu$. Which types of problems can a linear network solve?
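As a concrete illustration of the resulting delta rule $\Delta w_i = \eta \sum_\mu (y^\mu - O^\mu)\,x_i^\mu$, here is a small batch gradient descent sketch for a single linear unit; the data, learning rate and iteration count are assumptions for illustration.

```matlab
% Batch gradient descent for a single linear unit (illustrative data).
x = linspace(0,1,20)';
X = [ones(20,1), x];              % bias input plus one real input
y = 2*x + 0.5;                    % target: a linear function of x
w = zeros(2,1);  eta = 0.02;
for t = 1:2000
    O = X*w;                      % linear outputs O^mu
    w = w + eta * X' * (y - O);   % dw = eta * sum_mu (y - O) x^mu
end
disp(w')                          % approaches [0.5 2]
```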
Sigmoidal neurons: $O^\mu = g(h^\mu) = \frac{1}{1 + e^{-h^\mu}}$, with $h^\mu = \sum_i w_i x_i^\mu$. Which types of problems can a sigmoidal network solve? Assignment 3b – implement a one-layer linear and a one-layer sigmoidal network; fit 1D data from a linear, a sigmoid and a quadratic function with both networks. For example:
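A sketch of the sigmoidal case, where the chain rule adds a factor $g'(h^\mu)$ to the update (for the logistic sigmoid, $g'(h) = O(1-O)$); the target function, learning rate and iteration count below are illustrative assumptions, not the assignment data.

```matlab
% Gradient descent for a single sigmoidal unit (illustrative target).
g  = @(h) 1./(1 + exp(-h));             % logistic sigmoid
x  = linspace(-3,3,50)';  X = [ones(50,1), x];
y  = g(2*x - 1);                        % target: itself a sigmoid of x
w  = zeros(2,1);  eta = 0.5;
for t = 1:20000
    O     = g(X*w);
    delta = (y - O).*O.*(1 - O);        % (y - O) g'(h), since g' = O(1-O)
    w     = w + eta * X'*delta / numel(y);   % averaged gradient step
end
disp(w')                                % moves toward the generating weights [-1 2]
```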
Multi-layer networks: can solve non-linearly separable classification problems, and can approximate any arbitrary function, given 'enough' units in the hidden layer. (Diagram: input layer → hidden layer → output layer.)
Note: the input-to-hidden weights $w_{ji}$ form a matrix, not a vector.
Solving linearly inseparable problems. XOR truth table $(x_1, x_2 \to y)$: $(0,0)\to 0$, $(0,1)\to 1$, $(1,0)\to 1$, $(1,1)\to 0$. Hint: XOR = ($x_1$ OR $x_2$) AND NOT ($x_1$ AND $x_2$).
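The hint can be checked by wiring threshold units by hand: one hidden unit computes OR, one computes AND, and the output unit computes OR AND NOT AND. The particular thresholds below are one possible choice, assumed for illustration.

```matlab
% XOR built by hand from threshold units: XOR = (x1 OR x2) AND NOT (x1 AND x2).
step = @(h) double(h >= 0);
X  = [0 0; 0 1; 1 0; 1 1];
h1 = step(X(:,1) + X(:,2) - 0.5);   % hidden unit 1 computes OR
h2 = step(X(:,1) + X(:,2) - 1.5);   % hidden unit 2 computes AND
O  = step(h1 - h2 - 0.5);           % output: OR AND NOT AND
disp([X O])                         % O = [0 1 1 0]', i.e. XOR
```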
How do we learn a multi-layer network? The credit assignment problem!
Gradient descent / back-propagation, the solution to the credit assignment problem. From hidden layer to output weights: $\Delta W_j = \eta \sum_\mu \delta^\mu V_j^\mu$, where $\delta^\mu = g'(h^\mu)\,(y^\mu - O^\mu)$ and $h^\mu = \sum_j W_j V_j^\mu$.
For the input to hidden layer weights: $\Delta w_{ji} = \eta \sum_\mu \delta_j^\mu x_i^\mu$, where $\delta_j^\mu = g'(h_j^\mu)\,W_j\,\delta^\mu$ and $h_j^\mu = \sum_i w_{ji} x_i^\mu$.
Assignment 3c: program a 2-layer network in MATLAB and solve the XOR problem. Fit the curve $x(x-1)$ between 0 and 1. How many hidden units did you need?
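A minimal back-propagation sketch for a two-hidden-unit sigmoidal network on XOR, using the update rules above; the network size, initialization, learning rate and number of epochs are assumptions, and whether training reaches zero error depends on the random initialization.

```matlab
% Back-propagation for a 2-layer sigmoidal network on XOR (sizes,
% learning rate and initialisation are illustrative assumptions).
g   = @(h) 1./(1 + exp(-h));
X   = [0 0; 0 1; 1 0; 1 1];   y = [0; 1; 1; 0];
nH  = 2;                              % hidden units
w   = 0.5*randn(nH, 3);               % input -> hidden weights (a matrix!)
W   = 0.5*randn(1, nH+1);             % hidden -> output weights
eta = 0.5;
for t = 1:20000
    for mu = 1:4
        xb = [X(mu,:) 1]';            % input pattern plus bias
        V  = g(w*xb);  Vb = [V; 1];   % hidden activations plus bias
        O  = g(W*Vb);                 % network output
        dO = (y(mu) - O)*O*(1 - O);            % output delta
        dV = (W(1:nH)'*dO).*V.*(1 - V);        % hidden deltas, back-propagated
        W  = W + eta*dO*Vb';          % hidden -> output update
        w  = w + eta*dV*xb';          % input -> hidden update
    end
end
Vall = g(w*[X ones(4,1)]');           % hidden activations for all patterns
disp([X, g(W*[Vall; ones(1,4)])'])    % outputs should be near [0 1 1 0]'
```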
Formal neural networks can accomplish many tasks, for example: perform complex classification, learn arbitrary functions, account for associative memory. Some applications: robotics, character recognition, speech recognition, medical diagnostics. This is not neuroscience, but it is motivated loosely by neuroscience and carries important information for neuroscience as well. For example: memory, learning and some aspects of development are assumed to be based on synaptic plasticity.
What did we learn today? Is BackProp biologically realistic?