1 NN – cont. Alexandra I. Cristea USI intensive course Adaptive Systems April-May 2003

2 We have seen how the neuron computes; let's now see: – What can it compute? – How can it learn?

3 What does the neuron compute?

4 Perceptron, discrete neuron. First, the simple case: – no hidden layers – only one neuron – get rid of the threshold: b becomes the bias weight w0 on a constant input – Y is a Boolean function: weighted sum > 0 fires, otherwise doesn't fire
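
As an aside (not on the slides), a minimal Python sketch of this discrete neuron, with the threshold folded into the bias weight w0 on a constant input x0 = 1 as described above:

def perceptron(weights, inputs):
    # weights = [w0, w1, ..., wn]; w0 is the bias weight on a constant input 1
    s = weights[0] + sum(w * x for w, x in zip(weights[1:], inputs))
    return 1 if s > 0 else 0   # fires when the weighted sum is positive

print(perceptron([-1, 0.7, 0.9], [1, 1]))   # 1: this weight setting fires only when both inputs are 1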

5 Threshold function f (figure: the step function; the bias weight is w0 = -t = -1)

6 Y = X1 or X2 (figure: a single neuron with inputs X1, X2 computing OR, shown with its truth table)

7 Y = X1 and X2 (figure: a single neuron with inputs X1, X2 and weights 0.5 computing AND, shown with its truth table)

8 Y = or(x1,…,xn) w1=w2=…=wn=1

9 Y = and(x1,…,xn) w1=w2=…=wn=1/n
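
The slides fix w1 = ... = wn but leave the bias implicit. As a sketch (the bias values below are my own choice, picked so that the strict "fires when > 0" rule from slide 4 holds), a brute-force check of the two settings:

from itertools import product

def fires(w0, ws, xs):
    # discrete neuron from slide 4: output 1 iff w0 + sum(wi * xi) > 0
    return 1 if w0 + sum(w * x for w, x in zip(ws, xs)) > 0 else 0

n = 4
for xs in product((0, 1), repeat=n):
    # OR: w1 = ... = wn = 1, bias w0 = -0.5 (my choice)
    assert fires(-0.5, [1.0] * n, xs) == (1 if any(xs) else 0)
    # AND: w1 = ... = wn = 1/n, bias w0 = -1 + 1/(2*n) (my choice)
    assert fires(-1 + 1 / (2 * n), [1.0 / n] * n, xs) == (1 if all(xs) else 0)
print("or/and reproduced for n =", n)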

10 What are we actually doing? (figure: the same neuron computes w0 + w1*X1 + w2*X2 and compares it with 0; three weight settings, (w0, w1, w2) = (-1, 7, 9), (-1, 0.7, 0.9) and (1, 7, 9), yield three different truth tables over X1, X2)

11 Linearly Separable Set (figure: the line w0 + w1*x1 + w2*x2 = 0 in the x1-x2 plane, with w0 = -1, w1 = -0.67, w2 = 1, separating the two classes)

12 Linearly Separable Set (figure: the line w0 + w1*x1 + w2*x2 = 0, with w0 = -1, w1 = 0.25, w2 = -0.1, separating the two classes)

13 Linearly Separable Set (figure: the line w0 + w1*x1 + w2*x2 = 0, with w0 = -1, w1 = 0.25, w2 = 0.04, separating the two classes)

14 Linearly Separable Set (figure: the line w0 + w1*x1 + w2*x2 = 0, with w0 = -1, w1 = 0.167, w2 = 0.1, separating the two classes)
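
What these figures illustrate can be stated compactly: the weights define the line w0 + w1*x1 + w2*x2 = 0, and a labelled set is linearly separable when some choice of weights puts every class-1 point on the positive side of that line and every class-0 point on the other. A small sketch (the data points are my own, not taken from the slides):

def separates(w0, w1, w2, points):
    # points: list of ((x1, x2), label); the weights separate them if the sign of
    # w0 + w1*x1 + w2*x2 matches the label for every point
    return all((w0 + w1 * x1 + w2 * x2 > 0) == (label == 1)
               for (x1, x2), label in points)

# hypothetical data in the spirit of slide 11 (w0 = -1, w1 = -0.67, w2 = 1)
data = [((0.5, 2.0), 1), ((1.0, 2.5), 1), ((2.0, 1.0), 0), ((3.0, 0.5), 0)]
print(separates(-1, -0.67, 1, data))   # True: this set is linearly separable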

15 Non-linearly separable Set

16-19 Non-Linearly Separable Set (figure, repeated over four slides: points of the two classes in the x1-x2 plane; no values of w0, w1, w2 make the line w0 + w1*x1 + w2*x2 = 0 separate them, so the weight fields stay blank)

20 Perceptron Classification Theorem A finite set X can be classified correctly by a one-layer perceptron if and only if it is linearly separable.

21 Typical non-linearly separable set: Y = XOR(x1, x2) (figure: the points (0,0), (1,0), (0,1), (1,1) in the x1-x2 plane; Y = 1 at (0,1) and (1,0), Y = 0 at (0,0) and (1,1); no line w0 + w1*x1 + w2*x2 = 0 separates them)
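
A short argument (not spelled out on the slides) for why no single line works here: if a perceptron with the "fires when w0 + w1*x1 + w2*x2 > 0" rule computed XOR, the four points would require
  (0,0): w0 <= 0
  (1,0): w0 + w1 > 0
  (0,1): w0 + w2 > 0
  (1,1): w0 + w1 + w2 <= 0
Adding the two middle inequalities gives 2*w0 + w1 + w2 > 0, while adding the first and last gives 2*w0 + w1 + w2 <= 0, a contradiction. So XOR cannot be computed by a one-layer perceptron, exactly as the classification theorem on slide 20 predicts.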

22 How does the neuron learn?

23 Learning: weight computation (figure: inputs X1, X2 entering the neuron with contributions W1*X1 and W2*X2; learning means finding these weights)

24 Perceptron Learning Rule (incremental version)
FOR i := 0 TO n DO wi := random initial value ENDFOR;
REPEAT
  select a pair (x, t) in X;  (* each pair must have a positive probability of being selected *)
  IF w^T * x' > 0 THEN y := 1 ELSE y := 0 ENDIF;
  IF y ≠ t THEN
    FOR i := 0 TO n DO wi := wi + (t - y) * xi' ENDFOR
  ENDIF;
UNTIL X is correctly classified
ROSENBLATT (1962)
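
A runnable sketch of the same rule in Python (my transcription: it sweeps over the training pairs in order instead of drawing them at random, and adds a fixed iteration cap so the loop always terminates):

import random

def train_perceptron(samples, n, max_epochs=1000):
    # samples: list of (x, t) with x a tuple of n inputs and t in {0, 1}
    w = [random.uniform(-1, 1) for _ in range(n + 1)]    # w[0] is the bias weight
    for _ in range(max_epochs):
        all_correct = True
        for x, t in samples:
            xp = (1,) + tuple(x)                         # x' = x extended with x0 = 1
            y = 1 if sum(wi * xi for wi, xi in zip(w, xp)) > 0 else 0
            if y != t:                                   # wi := wi + (t - y) * xi'
                all_correct = False
                w = [wi + (t - y) * xi for wi, xi in zip(w, xp)]
        if all_correct:                                  # X is correctly classified
            return w
    return w

# e.g. learning AND of two inputs (linearly separable, so the rule converges)
print(train_perceptron([((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)], n=2))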

25 Idea of the Perceptron Learning Rule (figure): if t = 1 and y = 0 (w^T*x <= 0), the weights move towards the input: w_new = w + x; if t = 0 and y = 1 (w^T*x > 0), they move away from it: w_new = w - x. Both cases are covered by wi := wi + (t - y)*xi': w changes in the direction of +/- the input.

26

27 For multi-layer perceptrons with continuous neurons, a simple and successful learning algorithm exists.

28 BKP: Error (figure: network with input, hidden layer and output; desired outputs d1...d4 and actual outputs y1...y4 give the output errors e1 = d1 - y1, ..., e4 = d4 - y4, which must then be translated into hidden-layer errors)

29 Synapse (figure): a weight w connects neuron1 and neuron2; in forward propagation, neuron1's value y1 arrives at neuron2 as w*y1 and contributes to its internal activation y2. The weight serves as an amplifier!

30 Inverse Synapse (figure): error backward propagation over the same weight w; neuron2 carries the error e2, and the question is what error e1 = ?? neuron1 should receive. The weight serves as an amplifier!

31 Inverse Synapse (figure): error backward propagation; neuron2 carries the error e2, and neuron1 receives e1 = w*e2. The weight serves as an amplifier!

32 BKP: Error (figure, as on slide 28, now with layer labels: I1 = system input, O1 = system output, and the hidden layer playing the double role O2/I2; the output errors e1 = d1 - y1, ..., e4 = d4 - y4 are propagated back to the hidden layer)

33 Backpropagation to hidden layer (figure): the error e_i at output neuron i is sent back over the weight w[j,i] to hidden neuron j, and each hidden neuron collects the backpropagated errors from all the output neurons it feeds (input I1, output O1, hidden layer O2/I2).

34 Update rule for the 2 weight types
Weights from the hidden layer (I2) to the system output (O1):
Δw[j,i] = α * (d[i] - y[i]) * f'(S[i]) * h[j] = α * e[i] * f'(S[i]) * h[j], with S[i] = Σ_j w[j,i](t) * h[j]
(simplification: f' = 1 for a repeater, e.g.)
Weights from the system input (I1) to the hidden layer (O2):
Δw[k,j] = α * (Σ_i e[i] * w[j,i]) * f'(S[j]) * x[k] = α * e[j] * f'(S[j]) * x[k], with S[j] = Σ_k w[k,j](t) * x[k]
(simplification: f' = 1 for a repeater, e.g.)
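
A sketch of these two update rules in Python for a single hidden layer, with the slide's simplification f' = 1 (repeater units); the variable names are mine:

alpha = 0.1   # learning rate (value chosen for illustration)

def delta_update(w_hidden_out, w_in_hidden, x, h, y, d):
    # w_hidden_out[j][i]: hidden j -> output i;  w_in_hidden[k][j]: input k -> hidden j
    e_out = [d[i] - y[i] for i in range(len(y))]                  # e[i] = d[i] - y[i]
    # backpropagated hidden-layer error: e[j] = sum_i e[i] * w[j,i]
    e_hid = [sum(e_out[i] * w_hidden_out[j][i] for i in range(len(y)))
             for j in range(len(h))]
    for j in range(len(h)):                                       # dw[j,i] = alpha * e[i] * h[j]
        for i in range(len(y)):
            w_hidden_out[j][i] += alpha * e_out[i] * h[j]
    for k in range(len(x)):                                       # dw[k,j] = alpha * e[j] * x[k]
        for j in range(len(h)):
            w_in_hidden[k][j] += alpha * e_hid[j] * x[k]

With a general activation f, the output error would additionally be multiplied by f'(S[i]) and the hidden error by f'(S[j]), exactly as in the formulas above.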

35 Backpropagation algorithm
FOR s := 1 TO r DO Ws := initial matrix (often random) ENDFOR;
REPEAT
  select a pair (x, t) in X;
  y0 := x;
  # forward phase: compute the actual output yr of the network with input x
  FOR s := 1 TO r DO ys := F(Ws * ys-1) END;
  # yr is the output vector of the network
  # backpropagation phase: propagate the errors back through the network
  # and adapt the weights of all layers
  dr := Fr' * (t - yr);
  FOR s := r DOWNTO 2 DO
    ds-1 := Fs-1' * Ws^T * ds;
    Ws := Ws + ds * ys-1^T;
  END;
  W1 := W1 + d1 * y0^T
UNTIL stop criterion
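
The same algorithm as a runnable NumPy sketch (my transcription, with several choices the slide leaves open: F is taken to be the sigmoid, a learning rate eta and a fixed number of sweeps serve as the stop criterion, and each layer's input is extended with a constant 1 so the bias is learned as an ordinary weight, in the spirit of the w0 trick from slide 4):

import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def train_backprop(X, T, sizes, eta=0.5, epochs=10000, seed=0):
    # sizes = [n0, ..., nr]; W[s] maps layer s (plus a constant-1 bias input) to layer s+1
    rng = np.random.default_rng(seed)
    W = [rng.normal(scale=0.5, size=(sizes[s + 1], sizes[s] + 1))
         for s in range(len(sizes) - 1)]
    for _ in range(epochs):                          # stop criterion: fixed number of sweeps
        for x, t in zip(X, T):                       # select a pair (x, t) in X
            ys = [np.append(x, 1.0)]                 # y0 := x (extended with a 1)
            for Ws in W:                             # forward phase: ys := F(Ws * ys-1)
                ys.append(np.append(sigmoid(Ws @ ys[-1]), 1.0))
            y_r = ys[-1][:-1]                        # output vector of the network
            d = y_r * (1 - y_r) * (t - y_r)          # dr := Fr'(t - yr), sigmoid derivative
            for s in range(len(W) - 1, -1, -1):      # backpropagation phase
                h = ys[s]
                if s > 0:                            # ds-1 := Fs-1' Ws^T ds (bias column dropped)
                    d_prev = h[:-1] * (1 - h[:-1]) * (W[s][:, :-1].T @ d)
                W[s] += eta * np.outer(d, h)         # Ws := Ws + eta * ds * ys-1^T
                if s > 0:
                    d = d_prev
    return W

# e.g. the XOR problem from slide 21 with one hidden layer (2-2-1);
# a net this small can occasionally get stuck, so rerun with another seed if needed
X = [np.array(v, float) for v in [(0, 0), (0, 1), (1, 0), (1, 1)]]
T = [np.array([v], float) for v in [0.0, 1.0, 1.0, 0.0]]
W = train_backprop(X, T, sizes=[2, 2, 1])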

36 Conclusion – We have seen Boolean (binary) function representation with a single-layer perceptron – We have seen a learning algorithm for the SLP (the perceptron learning rule) – We have seen a learning algorithm for the MLP (backpropagation) – So, neurons can represent knowledge AND learn!

