# Introduction to Neural Networks, John Paxton, Montana State University, Summer 2003



Chapter 2: Simple Neural Networks for Pattern Classification

ARCHITECTURE: a bias input x0 = 1 and inputs x1, …, xn, connected to a single output unit y by weights w0, w1, …, wn; w0 is the bias weight.

f(y_in) = 1 if y_in >= 0
f(y_in) = 0 otherwise
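The single-unit architecture above can be sketched in a few lines of Python (a minimal sketch; `neuron` is my own helper name, not from the slides):

```python
def f(y_in):
    """Step activation from the slide: 1 if y_in >= 0, else 0."""
    return 1 if y_in >= 0 else 0

def neuron(w, x):
    """w = [w0, w1, ..., wn]; x = [x1, ..., xn]; the bias input x0 = 1 is implicit."""
    y_in = w[0] + sum(wi * xi for wi, xi in zip(w[1:], x))
    return f(y_in)
```

With the weights from the "Interpreting the Weights" slide (w0 = -1, w1 = 1, w2 = 1), `neuron([-1, 1, 1], [1, 1])` returns 1 and `neuron([-1, 1, 1], [0, 0])` returns 0.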

Representations
- Binary: 0 = no, 1 = yes
- Bipolar: -1 = no, 0 = unknown, 1 = yes

Bipolar is superior in practice: with binary coding, a 0 input contributes nothing to a weight update.

Interpreting the Weights
w0 = -1, w1 = 1, w2 = 1
Decision boundary: 0 = -1 + x1 + x2, i.e. x2 = 1 - x1. Points on or above the line are classified YES, points below are NO.

Modelling a Simple Problem
Should I attend this lecture? x1 = it's hot, x2 = it's raining. Weights: w0 = 2.5 (bias), w1 = -2, w2 = 1.
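A quick sketch of this unit, assuming binary inputs and the threshold activation from the architecture slide (the function name `attend` is my own):

```python
def attend(hot, raining):
    # Weights from the slide: bias 2.5, "it's hot" weighs -2, "it's raining" weighs 1.
    y_in = 2.5 + (-2) * hot + 1 * raining
    return 1 if y_in >= 0 else 0

# Evaluate every input combination.
for hot in (0, 1):
    for raining in (0, 1):
        print(hot, raining, attend(hot, raining))
```

Note that with these weights the answer is "attend" in every case: the heat penalty of -2 never outweighs the 2.5 bias.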

Linear Separability
AND and OR are linearly separable: a single line can split the 1 outputs from the 0 outputs. XOR is not linearly separable.

Hebb's Rule
1949: increase the weight between two neurons that are both "on".
1988: also increase the weight between two neurons that are both "off".
w_i(new) = w_i(old) + x_i * y

Algorithm
1. set w_i = 0 for 0 <= i <= n
2. for each training vector
3.   set x_i = s_i for all input units
4.   set y = t
5.   w_i(new) = w_i(old) + x_i * y

Example: 2-input AND (bipolar)

| s0 | s1 | s2 | t  |
|----|----|----|----|
| 1  | 1  | 1  | 1  |
| 1  | 1  | -1 | -1 |
| 1  | -1 | 1  | -1 |
| 1  | -1 | -1 | -1 |

Training Procedure

| w0 | w1 | w2 | x0 | x1 | x2 | y  |
|----|----|----|----|----|----|----|
| 0  | 0  | 0  | 1  | 1  | 1  | 1  |
| 1  | 1  | 1  | 1  | 1  | -1 | -1 |
| 0  | 0  | 2  | 1  | -1 | 1  | -1 |
| -1 | 1  | 1  | 1  | -1 | -1 | -1 |
| -2 | 2  | 2  |    |    |    |    |

Each row shows the weights before presenting the input (x0, x1, x2) with target y; the final row gives the resulting weights (-2, 2, 2).

Result Interpretation
-2 + 2x1 + 2x2 = 0, i.e. x2 = -x1 + 1.
This training procedure is order dependent, and convergence to a correct classifier is not guaranteed.
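The Hebb training run above can be reproduced with a short sketch (`hebb_train` is my own helper name):

```python
def hebb_train(samples):
    """Hebb's rule: for each (x, t) pair, update w_i += x_i * y with y = t."""
    n = len(samples[0][0])
    w = [0] * n
    for x, t in samples:
        for i in range(n):
            w[i] += x[i] * t
    return w

# Bipolar 2-input AND; x0 = 1 is the bias input.
and_data = [([1, 1, 1], 1), ([1, 1, -1], -1), ([1, -1, 1], -1), ([1, -1, -1], -1)]
print(hebb_train(and_data))  # [-2, 2, 2]
```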

Pattern Recognition Exercise
The "X" pattern:
#.#
.#.
#.#
The "O" pattern:
.#.
#.#
.#.

Pattern Recognition Exercise
- Architecture? Weights?
- Are the original patterns classified correctly?
- Are the original patterns with 1 piece of wrong data classified correctly?
- Are the original patterns with 1 piece of missing data classified correctly?

Perceptrons (1958)
A very important early neural network, with a guaranteed training procedure under certain circumstances. Architecture: as before, a bias input x0 = 1 and inputs x1, …, xn connected to the output unit y by weights w0, w1, …, wn.

Activation Function
f(y_in) = 1 if y_in > θ
f(y_in) = 0 if -θ <= y_in <= θ
f(y_in) = -1 otherwise
Graph interpretation: a step function with an "undecided" band of width 2θ around 0.

Learning Rule
w_i(new) = w_i(old) + α * t * x_i if error
α is the learning rate. Typically, 0 < α <= 1.

Algorithm
1. set w_i = 0 for 0 <= i <= n (can be random)
2. for each training exemplar do
3.   x_i = s_i
4.   y_in = Σ x_i * w_i
5.   y = f(y_in)
6.   w_i(new) = w_i(old) + α * t * x_i if error
7. if stopping condition not reached, go to 2

Example: AND concept. Bipolar inputs, bipolar target. θ = 0, α = 1.

Epoch 1

| w0 | w1 | w2 | x0 | x1 | x2 | y  | t  |
|----|----|----|----|----|----|----|----|
| 0  | 0  | 0  | 1  | 1  | 1  | 0  | 1  |
| 1  | 1  | 1  | 1  | 1  | -1 | 1  | -1 |
| 0  | 0  | 2  | 1  | -1 | 1  | 1  | -1 |
| -1 | 1  | 1  | 1  | -1 | -1 | -1 | -1 |
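Epoch 1 above can be reproduced with a short sketch of the perceptron algorithm (`train_epoch` is my own helper name; θ = 0 and α = 1 as in the example):

```python
THETA, ALPHA = 0, 1

def f(y_in):
    """Tri-valued activation: 1 above theta, 0 in the band, -1 below."""
    if y_in > THETA:
        return 1
    if y_in >= -THETA:
        return 0
    return -1

def train_epoch(w, samples):
    """One pass over the data; weights change only on an error."""
    changed = False
    for x, t in samples:
        y = f(sum(wi * xi for wi, xi in zip(w, x)))
        if y != t:
            w = [wi + ALPHA * t * xi for wi, xi in zip(w, x)]
            changed = True
    return w, changed

# Bipolar AND; x0 = 1 is the bias input.
and_data = [([1, 1, 1], 1), ([1, 1, -1], -1), ([1, -1, 1], -1), ([1, -1, -1], -1)]
w = [0, 0, 0]
w, _ = train_epoch(w, and_data)
print(w)  # weights after epoch 1: [-1, 1, 1]
```

Running a second epoch makes no further changes, so training stops with w = (-1, 1, 1).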

Exercise Continue the above example until the learning algorithm is finished.

Perceptron Learning Rule Convergence Theorem If a weight vector exists that correctly classifies all of the training examples, then the perceptron learning rule will converge to some weight vector that gives the correct response for all training patterns. This will happen in a finite number of steps.

Exercise
Show perceptron weights for the 2-of-3 concept.

| x1 | x2 | x3 | y  |
|----|----|----|----|
| 1  | 1  | 1  | 1  |
| 1  | 1  | -1 | 1  |
| 1  | -1 | 1  | 1  |
| -1 | 1  | 1  | 1  |
| 1  | -1 | -1 | -1 |
| -1 | 1  | -1 | -1 |
| -1 | -1 | 1  | -1 |
| -1 | -1 | -1 | -1 |
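One weight vector that works (my own choice, not taken from the slides) can be checked against all eight input patterns:

```python
from itertools import product

# Hand-picked weights for 2-of-3 with bipolar inputs: bias -0.5, then 1 per input.
w = [-0.5, 1, 1, 1]

def f(y_in):
    return 1 if y_in > 0 else -1

def classify(x):
    return f(w[0] + sum(wi * xi for wi, xi in zip(w[1:], x)))

for x in product((-1, 1), repeat=3):
    expected = 1 if sum(v == 1 for v in x) >= 2 else -1
    print(x, classify(x), expected)
```

With two inputs on, y_in = -0.5 + 1 = 0.5 > 0; with one input on, y_in = -0.5 - 1 = -1.5 < 0, so the boundary falls in the right place.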

Adaline (Widrow, Hoff 1960)
Adaptive Linear Network. Its learning rule minimizes the mean squared error, and it learns on all examples, not just ones with errors.

Architecture: the same single-unit layout as before, with a bias input x0 = 1 and inputs x1, …, xn connected to the output y by weights w0, w1, …, wn.

Training Algorithm
1. set w_i (small random values typical)
2. set α (0.1 typical)
3. for each training exemplar do
4.   x_i = s_i
5.   y_in = Σ x_i * w_i
6.   w_i(new) = w_i(old) + α * (t - y_in) * x_i
7. go to 3 if the largest weight change is still big enough

Activation Function
f(y_in) = 1 if y_in >= 0
f(y_in) = -1 otherwise

Delta Rule
Squared error: E = (t - y_in)^2. To minimize E, follow the negative gradient: dE/dw_i = -2(t - y_in) * x_i, giving the update Δw_i = α * (t - y_in) * x_i.

Example: AND concept
Bipolar inputs, bipolar targets. w0 = -0.5, w1 = 0.5, w2 = 0.5 minimizes E.

| x0 | x1 | x2 | y_in | t  | E    |
|----|----|----|------|----|------|
| 1  | 1  | 1  | 0.5  | 1  | 0.25 |
| 1  | 1  | -1 | -0.5 | -1 | 0.25 |
| 1  | -1 | 1  | -0.5 | -1 | 0.25 |
| 1  | -1 | -1 | -1.5 | -1 | 0.25 |
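Running the delta rule on this data recovers the error-minimizing weights above (a minimal sketch; α = 0.02 and the epoch count are my own choices, small enough that the weights settle near the optimum):

```python
ALPHA = 0.02

# Bipolar AND; x0 = 1 is the bias input.
and_data = [([1, 1, 1], 1), ([1, 1, -1], -1), ([1, -1, 1], -1), ([1, -1, -1], -1)]
w = [0.0, 0.0, 0.0]
for epoch in range(500):
    for x, t in and_data:
        y_in = sum(wi * xi for wi, xi in zip(w, x))
        # Delta rule: w_i += alpha * (t - y_in) * x_i, applied on every example.
        w = [wi + ALPHA * (t - y_in) * xi for wi, xi in zip(w, x)]
print([round(wi, 2) for wi in w])  # approximately [-0.5, 0.5, 0.5]
```

Because the targets are not exactly achievable, the weights never stop moving entirely; they oscillate within O(α) of (-0.5, 0.5, 0.5), which is why the stopping condition tests whether the largest weight change is still big enough.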

Exercise Demonstrate that you understand the Adaline training procedure.

Madaline
Many adaptive linear neurons: inputs x1, …, xm (plus a bias input of 1) feed hidden Adaline units z1, …, zk, which (plus a bias input of 1) feed the output unit y.

Madaline
MRI (1960): only learns the weights from the input layer to the hidden layer.
MRII (1987): learns all weights.
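A forward pass through a tiny Madaline can illustrate why the hidden layer matters: with hand-picked weights (my own, not the result of MRI/MRII training), two Adaline units plus an output unit compute XOR, which no single unit can:

```python
def f(y_in):
    """Adaline output activation: 1 if y_in >= 0, else -1."""
    return 1 if y_in >= 0 else -1

def madaline_xor(x1, x2):
    z1 = f(-0.5 + x1 - x2)   # hidden unit: fires only for (1, -1)
    z2 = f(-0.5 - x1 + x2)   # hidden unit: fires only for (-1, 1)
    return f(0.5 + z1 + z2)  # output unit: OR of the two hidden units

for x1 in (-1, 1):
    for x2 in (-1, 1):
        print(x1, x2, madaline_xor(x1, x2))
```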

