
1 Rutgers CS440, Fall 2003 Neural networks Reading: Ch. 20, Sec. 5, AIMA 2nd Ed.

2 Rutgers CS440, Fall 2003 Outline: human learning, the brain, and neurons; artificial neural networks; the perceptron; learning in NNs; multi-layer NNs; Hopfield networks.

3 Rutgers CS440, Fall 2003 Brains & neurons The brain is composed of over a billion nerve cells that communicate with each other through specialized contacts called synapses. ~1000 synapses / neuron => extensive and elaborate neural circuits. Some synapses excite neurons and cause them to generate signals called action potentials, large transient voltage changes that propagate down their axons. Other synapses are inhibitory and prevent the neuron from generating action potentials. The action potential propagates down the axon to the sites where the axon has made synapses with the dendrites of other nerve cells. http://www.medicine.mcgill.ca/physio/cooperlab/cooper.htm

4 Rutgers CS440, Fall 2003 Artificial neurons Computational neuroscience: the McCulloch-Pitts model. Input links x_1, …, x_N carry weights w_j1, …, w_jN, and a fixed bias input x_0 = -1 carries the offset weight w_j0. Input function: in_j = Σ_i w_ji x_i. Activation function g_j produces the output g_j(in_j), which is sent along the output links.
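
For concreteness, here is a minimal Python sketch of such a unit (not from the original slides; the function names are illustrative), using the slide's convention that a fixed input x_0 = -1 carries the offset weight w_j0:

import math

def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))

def unit_output(weights, inputs, activation=sigmoid):
    # weights[0] is the offset w_j0 on the fixed bias input x_0 = -1;
    # weights[1:] are w_j1 ... w_jN on the input links x_1 ... x_N
    in_j = weights[0] * -1 + sum(w * x for w, x in zip(weights[1:], inputs))
    return activation(in_j)

print(unit_output([0.5, 1.0, 1.0], [1, 0]))   # one unit, two inputs, sigmoid output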

5 Rutgers CS440, Fall 2003 Activation function Threshold activation function (hard threshold, step function). Sigmoid activation function (soft threshold). The bias / offset shifts the activation function left or right.
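
A small sketch contrasting the two activation functions and the effect of the offset (assumed formulas: step(a) = 1 if a ≥ 0 else 0, and sigmoid(a) = 1 / (1 + e^(-a)), the usual choices):

import math

def step(a):                       # hard threshold
    return 1 if a >= 0 else 0

def sigmoid(a):                    # soft threshold
    return 1.0 / (1.0 + math.exp(-a))

w, w0 = 1.0, 1.0                   # with x_0 = -1, in = w*x - w0: the offset shifts the transition to x = 1
for x in [-2, -1, 0, 1, 2]:
    a = w * x - w0
    print(x, step(a), round(sigmoid(a), 3))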

6 Rutgers CS440, Fall 2003 Implementing functions Artificial neurons can implement different (boolean) functions. OR (X ∨ Y): w_x = 1, w_y = 1, w_0 = 0.5. AND (X ∧ Y): w_x = 1, w_y = 1, w_0 = 1.5. NOT (¬X): w_x = -1, w_0 = -0.5.
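
A quick check of these weights (a sketch, not from the slides; it assumes the hard-threshold unit with bias input x_0 = -1 from the previous slide):

def threshold_unit(w0, ws, xs):
    # fires (outputs 1) when sum_i w_i * x_i - w0 >= 0; the -w0 term comes from x_0 = -1
    return 1 if sum(w * x for w, x in zip(ws, xs)) - w0 >= 0 else 0

AND = lambda x, y: threshold_unit(1.5, [1, 1], [x, y])
OR  = lambda x, y: threshold_unit(0.5, [1, 1], [x, y])
NOT = lambda x:    threshold_unit(-0.5, [-1], [x])

for x in (0, 1):
    for y in (0, 1):
        print(f"x={x} y={y}  AND={AND(x, y)}  OR={OR(x, y)}")
    print(f"x={x}  NOT={NOT(x)}")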

7 Rutgers CS440, Fall 2003 Networks of neurons Feed-forward networks: single-layer networks and multi-layer networks; feed-forward networks implement functions and have no internal state. Recurrent networks: Hopfield networks have symmetric weights (W_ij = W_ji); Boltzmann machines use stochastic activation functions; recurrent neural nets have directed cycles with delays.

8 Rutgers CS440, Fall 2003 Perceptron Rosenblatt, 1958. A single-layer, feed-forward network: inputs x_1, …, x_5 in the input layer connect through weights w_ji directly to outputs y_1, …, y_3 in the output layer.
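
Because there is one weight per input-output pair, the forward pass of a single-layer perceptron is just a matrix-vector product followed by thresholding. A minimal NumPy sketch (illustrative; the particular weights here happen to make y_1 act like OR and y_2 like AND):

import numpy as np

def perceptron_forward(W, w0, x):
    # W[j, i] = w_ji connects input x_i to output y_j; w0[j] is output j's offset
    return (W @ x - w0 >= 0).astype(int)

W  = np.array([[1.0, 1.0],
               [1.0, 1.0]])
w0 = np.array([0.5, 1.5])
print(perceptron_forward(W, w0, np.array([1.0, 0.0])))   # -> [1 0]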

9 Rutgers CS440, Fall 2003 Expressiveness of perceptron With a threshold activation function, a perceptron can represent AND, OR, NOT, and majority. It is a linear separator (linear discriminant classifier) whose decision boundary is the line W·x = 0.

10 Rutgers CS440, Fall 2003 Expressiveness of perceptron (cont'd) Can only separate linearly separable classes. (Figure: in the (x_1, x_2) plane, AND and OR are linearly separable but XOR is not; the last panel shows two general classes C1 and C2.)

11 Rutgers CS440, Fall 2003 Perceptron example: geometric interpretation Two inputs, one output. The decision boundary is the line w'x - w_0 = 0, with unit normal w = (w_1, w_2) = (cos θ, sin θ) at angle θ and offset w_0. For a point x_p in the (x_1, x_2) plane, w'x_p is its projection onto w, and w'x_p - w_0 is its signed distance from the boundary.
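
A worked numeric check of this interpretation (illustrative values; θ = 45° and w_0 = 1 are assumptions, not from the slide):

import numpy as np

theta, w0 = np.pi / 4, 1.0
w = np.array([np.cos(theta), np.sin(theta)])   # unit-length normal of the line w'x - w0 = 0

x_p = np.array([2.0, 1.0])
print(round(w @ x_p - w0, 3))                  # ~1.121: x_p lies this far on the positive side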

12 Rutgers CS440, Fall 2003 Perceptron learning Given a set of (linearly separable) points X+ and X- (or a set of points with labels, D = { (x, y)_k }, y ∈ { 0, 1 }), find a perceptron that separates (classifies) the two. Algorithm: minimize the error of classification. How? Start with some random weights and adjust them iteratively so that the error is minimized, e.g. by gradient descent on the squared error, which gives the update w_i ← w_i + α (y − g(in)) g'(in) x_i.
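
A minimal sketch of that update rule for a single sigmoid unit (illustrative code, not the slide's; it assumes the delta rule w_i ← w_i + α (y − g(in)) g'(in) x_i with g'(in) = g(in)(1 − g(in)), and uses OR as the example data):

import math, random

def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))

def train_unit(data, alpha=0.5, epochs=1000):
    # data: list of (x, y) with y in {0, 1}; x[0] = -1 is the fixed bias input
    random.seed(0)
    w = [random.uniform(-0.5, 0.5) for _ in range(len(data[0][0]))]
    for _ in range(epochs):
        for x, y in data:
            g = sigmoid(sum(wi * xi for wi, xi in zip(w, x)))
            # delta rule: w_i <- w_i + alpha * (y - g) * g'(in) * x_i
            w = [wi + alpha * (y - g) * g * (1 - g) * xi for wi, xi in zip(w, x)]
    return w

data = [([-1, 0, 0], 0), ([-1, 0, 1], 1), ([-1, 1, 0], 1), ([-1, 1, 1], 1)]   # OR
w = train_unit(data)
print([round(sigmoid(sum(wi * xi for wi, xi in zip(w, x)))) for x, _ in data])  # expect [0, 1, 1, 1]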

13 Rutgers CS440, Fall 2003 Analysis of the perceptron learning algorithm Consider the 1D case: 1 input, 1 output. (Figure: the offset weight moves from w_0^(l-1) to w_0^(l) in proportion to the error on a misclassified training point.)

14 Rutgers CS440, Fall 2003 Perceptron learning (cont'd) Perceptron learning converges to a consistent classifier if the data is linearly separable (the perceptron convergence theorem).

15 Rutgers CS440, Fall 2003 Multilayer perceptrons One or more hidden layers, usually fully connected. A network with a single hidden layer (2 layers) can represent all continuous functions; with two hidden layers (3 layers), all functions. Learning: the backpropagation algorithm, an extension of perceptron learning, which backpropagates the error from the output layer into the hidden layer(s). (Figure: inputs x_1, …, x_5, a hidden layer with weights w^(1)_ji, and outputs y_1, y_2 with weights w^(2)_ji.)
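
A compact sketch of backpropagation for a network with one hidden layer, learning XOR (the function a single perceptron cannot represent). This is illustrative code, not from the slides; the architecture (4 hidden units), learning rate, and iteration count are assumptions:

import numpy as np

rng = np.random.default_rng(0)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], float)
Y = np.array([[0], [1], [1], [0]], float)            # XOR: not linearly separable
sig = lambda a: 1 / (1 + np.exp(-a))

W1, b1 = rng.normal(0, 1, (2, 4)), np.zeros(4)       # input -> hidden weights w^(1)_ji
W2, b2 = rng.normal(0, 1, (4, 1)), np.zeros(1)       # hidden -> output weights w^(2)_ji

for _ in range(10000):
    H = sig(X @ W1 + b1)                             # forward pass through the hidden layer
    out = sig(H @ W2 + b2)                           # forward pass through the output layer
    d_out = (out - Y) * out * (1 - out)              # error term at the output layer
    d_hid = (d_out @ W2.T) * H * (1 - H)             # error backpropagated into the hidden layer
    W2 -= 0.5 * H.T @ d_out; b2 -= 0.5 * d_out.sum(0)
    W1 -= 0.5 * X.T @ d_hid; b1 -= 0.5 * d_hid.sum(0)

print(out.round(2).ravel())                          # should approach [0, 1, 1, 0]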

16 Rutgers CS440, Fall 2003 Application: handwritten digit recognition

17 Rutgers CS440, Fall 2003 Probabilistic interpretation Two classes, X+ and X-, whose elements are distributed randomly according to densities p+(x) and p-(x); the classes have prior probabilities p+ and p-. Given an element x, decide which class it belongs to. By Bayes' rule the posterior is P(X+ | x) = p+ p+(x) / (p+ p+(x) + p- p-(x)), and the output of the NN is (an estimate of) this posterior probability of the data class.
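
A small numeric illustration of the posterior that the network output approximates, assuming (purely for illustration) two 1-D Gaussian class densities with equal priors:

import math

def gaussian(x, mu, sigma):
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

p_plus, p_minus = 0.5, 0.5                     # class priors
for x in [-1.0, 0.0, 1.0]:
    num = p_plus * gaussian(x, 1.0, 1.0)       # p+ * p+(x), with class X+ centered at +1
    den = num + p_minus * gaussian(x, -1.0, 1.0)
    print(x, round(num / den, 3))              # posterior P(X+ | x) by Bayes' rule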

18 Rutgers CS440, Fall 2003 Hopfield NN "Associative memory" – associate an unknown input with entities encoded in the network. The state of the network at "time" t is computed from its activation potential at time t-1: x(t) = g( W x(t-1) ). Learning: use a set of examples X = { x_1, …, x_K } and set W to the sum of outer products, W = Σ_k x_k x_k'. Association: to associate an unknown input x_u with one of the K memorized elements, set x(0) = x_u and let the network converge.
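
A minimal sketch of this scheme in NumPy (illustrative; it assumes bipolar ±1 patterns, the outer-product (Hebbian) weights with a zeroed diagonal, and a synchronous update x(t) = sign(W x(t-1))):

import numpy as np

def hopfield_train(patterns):
    # W = sum_k x_k x_k' with zero diagonal => symmetric weights W_ij = W_ji
    W = sum(np.outer(x, x) for x in patterns).astype(float)
    np.fill_diagonal(W, 0)
    return W

def hopfield_recall(W, x, steps=10):
    # associate an unknown input with a stored pattern: x(t) = sign(W x(t-1))
    for _ in range(steps):
        x = np.sign(W @ x)
        x[x == 0] = 1                           # break ties toward +1
    return x

patterns = np.array([[1, -1, 1, -1, 1], [1, 1, -1, -1, 1]])
W = hopfield_train(patterns)
noisy = np.array([1, -1, 1, -1, -1])            # corrupted copy of the first pattern
print(hopfield_recall(W, noisy).astype(int))    # expected: [ 1 -1  1 -1  1]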

19 Rutgers CS440, Fall 2003 Hopfield NN example (Figure: a training set of stored patterns and a test example; the network state converges toward a stored pattern over iterations t = 0, 1, 2, 3.)

