1 CHAPTER 11 Back-Propagation Ming-Feng Yeh

2 Objectives A generalization of the LMS algorithm, called backpropagation, can be used to train multilayer networks. Backpropagation is an approximate steepest descent algorithm in which the performance index is mean square error. Because the error is not an explicit function of the hidden-layer weights, the chain rule of calculus is needed to calculate the derivatives.

3 Motivation The perceptron learning rule and the LMS algorithm were designed to train single-layer perceptron-like networks, which can only solve linearly separable classification problems. The backpropagation algorithm was popularized by the Parallel Distributed Processing work of Rumelhart and McClelland (1986). The multilayer perceptron, trained by backpropagation, is currently the most widely used neural network.

4 Three-Layer Network Number of neurons in each layer: $S^1$ in the first layer, $S^2$ in the second, and $S^3$ in the third (the superscript identifies the layer); the network has $R$ inputs.

5 Pattern Classification: XOR Gate
The XOR gate illustrates the limitations of the single-layer perceptron (Minsky & Papert, 1969): no single decision boundary can separate the two classes.

6 Two-Layer XOR Network
A two-layer, 2-2-1 network solves the XOR problem: each neuron in the first layer makes an individual decision with its own boundary, and the second-layer neuron combines the two decisions with an AND operation.
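
The slides give the decision boundaries graphically; as a concrete illustration, the sketch below uses one hand-picked set of hard-limit weights (an OR neuron and a NAND neuron in the first layer, an AND neuron in the second) that realizes XOR. These particular values are an assumption for illustration, not taken from the slides; many other choices work.

```python
import numpy as np

def hardlim(n):
    """Hard-limit transfer function: 1 if n >= 0, else 0."""
    return (n >= 0).astype(float)

# Illustrative weights (one of many valid choices, not the slides' values):
# first-layer neuron 1 acts as OR, neuron 2 as NAND; the output neuron is an AND.
W1 = np.array([[ 2.0,  2.0],    # OR boundary
               [-2.0, -2.0]])   # NAND boundary
b1 = np.array([-1.0, 3.0])
W2 = np.array([[2.0, 2.0]])     # AND of the two first-layer decisions
b2 = np.array([-3.0])

for p in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    a1 = hardlim(W1 @ np.array(p, dtype=float) + b1)
    a2 = hardlim(W2 @ a1 + b2)
    print(p, "->", int(a2[0]))  # prints 0, 1, 1, 0 (XOR)
```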

7 Solved Problem P11.1 Design a multilayer network to distinguish two categories, Class I and Class II. There is no single hyperplane that can separate these two categories.

8 Solution of Problem P11.1 The first layer forms individual decision boundaries; the later layers combine them with AND and OR operations to enclose each category.

9 Function Approximation
Two-layer, 1-2-1 network: a log-sigmoid hidden layer and a linear output layer.

10 Function Approximation
The centers of the steps occur where the net input to a neuron in the first layer is zero. The steepness of each step can be adjusted by changing the network weights.
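
A one-line derivation of the first statement, using the chapter's notation for a first-layer neuron with a single input $p$:

$$ n_i^1 = w_{i,1}^1\, p + b_i^1 = 0 \;\Longrightarrow\; p = -\frac{b_i^1}{w_{i,1}^1}, $$

so changing the bias shifts the center of the step, while larger weights make the step steeper.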

11–14 Effect of Parameter Changes
[Figure-only slides: plots of the network response as individual weights and biases are varied.]

15 Function Approximation
Two-layer networks, with sigmoid transfer functions in the hidden layer and linear transfer functions in the output layer, can approximate virtually any function of interest to any degree of accuracy, provided sufficiently many hidden units are available.

16 Backpropagation Algorithm
For multilayer networks the outputs of one layer become the inputs to the following layer: $a^{m+1} = f^{m+1}(W^{m+1} a^m + b^{m+1})$ for $m = 0, 1, \dots, M-1$, where $M$ is the number of layers, $a^0 = p$ is the network input, and the network output is $a = a^M$.
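
A minimal sketch of this layer-by-layer operation $a^{m+1} = f^{m+1}(W^{m+1} a^m + b^{m+1})$, with the parameters kept in plain Python lists; the function and variable names below are illustrative choices, not from the slides.

```python
import numpy as np

def logsig(n):
    return 1.0 / (1.0 + np.exp(-n))

def purelin(n):
    return n

def forward(p, weights, biases, transfer_fns):
    """Propagate the input forward: a^0 = p, a^{m+1} = f^{m+1}(W^{m+1} a^m + b^{m+1})."""
    a = p
    for W, b, f in zip(weights, biases, transfer_fns):
        a = f(W @ a + b)
    return a  # a = a^M, the network output

# Example: a 1-2-1 network with random parameters
rng = np.random.default_rng(0)
weights = [rng.standard_normal((2, 1)), rng.standard_normal((1, 2))]
biases  = [rng.standard_normal(2),      rng.standard_normal(1)]
print(forward(np.array([1.0]), weights, biases, [logsig, purelin]))
```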

17 Performance Index
Training set: $\{p_1, t_1\}, \{p_2, t_2\}, \dots, \{p_Q, t_Q\}$. Mean square error: $F(x) = E[e^2] = E[(t - a)^2]$. Vector case: $F(x) = E[e^T e] = E[(t - a)^T (t - a)]$.
Approximate mean square error (single sample): $\hat F(x) = (t(k) - a(k))^T (t(k) - a(k)) = e^T(k)\, e(k)$. Approximate steepest descent algorithm: $w_{i,j}^m(k+1) = w_{i,j}^m(k) - \alpha \frac{\partial \hat F}{\partial w_{i,j}^m}$, $\quad b_i^m(k+1) = b_i^m(k) - \alpha \frac{\partial \hat F}{\partial b_i^m}$.

18 Chain Rule
If $f(n) = e^n$ and $n = 2w$, so that $f(n(w)) = e^{2w}$, then $\frac{df}{dw} = \frac{df}{dn}\frac{dn}{dw} = e^n \cdot 2 = 2e^{2w}$.
Applied to the approximate mean square error: $\frac{\partial \hat F}{\partial w_{i,j}^m} = \frac{\partial \hat F}{\partial n_i^m}\,\frac{\partial n_i^m}{\partial w_{i,j}^m}$, $\quad \frac{\partial \hat F}{\partial b_i^m} = \frac{\partial \hat F}{\partial n_i^m}\,\frac{\partial n_i^m}{\partial b_i^m}$.
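
A quick numerical check of the slide's chain-rule example, comparing the analytic derivative $2e^{2w}$ with a central finite-difference estimate (the evaluation point and step size are arbitrary choices):

```python
import numpy as np

w = 0.7                        # arbitrary point at which to check the derivative
analytic = 2 * np.exp(2 * w)   # chain rule: (df/dn)(dn/dw) = e^n * 2 with n = 2w
h = 1e-6
numeric = (np.exp(2 * (w + h)) - np.exp(2 * (w - h))) / (2 * h)
print(analytic, numeric)       # the two values agree to several decimal places
```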

19 Sensitivity & Gradient
The net input to the $i$th neuron of layer $m$: $n_i^m = \sum_{j=1}^{S^{m-1}} w_{i,j}^m a_j^{m-1} + b_i^m$, so $\frac{\partial n_i^m}{\partial w_{i,j}^m} = a_j^{m-1}$ and $\frac{\partial n_i^m}{\partial b_i^m} = 1$. The sensitivity of $\hat F$ to changes in the $i$th element of the net input at layer $m$: $s_i^m \equiv \frac{\partial \hat F}{\partial n_i^m}$. Gradient: $\frac{\partial \hat F}{\partial w_{i,j}^m} = s_i^m a_j^{m-1}$, $\quad \frac{\partial \hat F}{\partial b_i^m} = s_i^m$.

20 Steepest Descent Algorithm
The steepest descent algorithm for the approximate mean square error: $w_{i,j}^m(k+1) = w_{i,j}^m(k) - \alpha s_i^m a_j^{m-1}$, $\quad b_i^m(k+1) = b_i^m(k) - \alpha s_i^m$. Matrix form: $W^m(k+1) = W^m(k) - \alpha s^m (a^{m-1})^T$, $\quad b^m(k+1) = b^m(k) - \alpha s^m$, where $s^m \equiv \frac{\partial \hat F}{\partial n^m} = \begin{bmatrix} \frac{\partial \hat F}{\partial n_1^m} & \frac{\partial \hat F}{\partial n_2^m} & \cdots & \frac{\partial \hat F}{\partial n_{S^m}^m} \end{bmatrix}^T$.

21 Backpropagating the Sensitivity Backpropagation exploits a recurrence relationship in which the sensitivity at layer $m$ is computed from the sensitivity at layer $m+1$. The key quantity is the Jacobian matrix $\frac{\partial n^{m+1}}{\partial n^m}$.

22 Matrix Representation The $i,j$ element of the Jacobian matrix: $\left[\frac{\partial n^{m+1}}{\partial n^m}\right]_{i,j} = \frac{\partial n_i^{m+1}}{\partial n_j^m} = w_{i,j}^{m+1} \frac{\partial a_j^m}{\partial n_j^m} = w_{i,j}^{m+1} \dot f^m(n_j^m)$, so $\frac{\partial n^{m+1}}{\partial n^m} = W^{m+1} \dot F^m(n^m)$, where $\dot F^m(n^m)$ is the diagonal matrix of transfer-function derivatives $\dot f^m(n_j^m)$.

23 Recurrence Relation
The recurrence relation for the sensitivity: $s^m = \dot F^m(n^m) (W^{m+1})^T s^{m+1}$, for $m = M-1, \dots, 2, 1$. The sensitivities are propagated backward through the network from the last layer to the first layer: $s^M \rightarrow s^{M-1} \rightarrow \cdots \rightarrow s^2 \rightarrow s^1$.
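
A sketch of this backward recurrence, assuming the net inputs $n^m$ were stored during the forward pass and that elementwise derivative functions (e.g. $a(1-a)$ for logsig, $1$ for purelin) are supplied; since $\dot F^m(n^m)$ is diagonal, it is applied as an elementwise product. The names are illustrative, not from the slides.

```python
import numpy as np

def backward_sensitivities(t, a, nets, weights, d_transfer_fns):
    """Return [s^1, ..., s^M] from the recurrence
       s^M = -2 F'^M(n^M)(t - a),  s^m = F'^m(n^m) (W^{m+1})^T s^{m+1}."""
    M = len(weights)
    s = [None] * M
    # Final-layer sensitivity, computed directly from the error
    s[M - 1] = -2.0 * d_transfer_fns[M - 1](nets[M - 1]) * (t - a)
    # Propagate backward from layer M-1 down to layer 1
    for m in range(M - 2, -1, -1):
        s[m] = d_transfer_fns[m](nets[m]) * (weights[m + 1].T @ s[m + 1])
    return s
```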

24 Backpropagation Algorithm
At the final layer the sensitivity is computed directly from the error: $s^M = -2 \dot F^M(n^M)(t - a)$, since $s_i^M = \frac{\partial \hat F}{\partial n_i^M} = \frac{\partial (t-a)^T(t-a)}{\partial n_i^M} = -2 (t_i - a_i)\,\dot f^M(n_i^M)$.

25 Summary The first step is to propagate the input forward through the network: $a^0 = p$; $a^{m+1} = f^{m+1}(W^{m+1} a^m + b^{m+1})$, $m = 0, 1, \dots, M-1$; $a = a^M$. The second step is to propagate the sensitivities backward through the network. Output layer: $s^M = -2 \dot F^M(n^M)(t - a)$. Hidden layers: $s^m = \dot F^m(n^m)(W^{m+1})^T s^{m+1}$, $m = M-1, \dots, 2, 1$. The final step is to update the weights and biases: $W^m(k+1) = W^m(k) - \alpha s^m (a^{m-1})^T$, $\quad b^m(k+1) = b^m(k) - \alpha s^m$.
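
Putting the three steps together for a single training pair, here as a minimal sketch for a two-layer logsig/purelin network (the layer structure, default learning rate, and helper names are assumptions for illustration, not taken from the slides):

```python
import numpy as np

def logsig(n):
    return 1.0 / (1.0 + np.exp(-n))

def train_step(p, t, W1, b1, W2, b2, alpha=0.1):
    """One backpropagation iteration for a two-layer logsig/purelin network."""
    # 1. Forward propagation: a^0 = p, a^1 = logsig(W1 p + b1), a^2 = W2 a^1 + b2
    a0 = p
    a1 = logsig(W1 @ a0 + b1)
    a2 = W2 @ a1 + b2
    # 2. Backward propagation of the sensitivities
    s2 = -2.0 * (t - a2)                   # purelin derivative is 1
    s1 = (a1 * (1.0 - a1)) * (W2.T @ s2)   # logsig derivative is a(1 - a)
    # 3. Update the weights and biases (approximate steepest descent)
    W2 = W2 - alpha * np.outer(s2, a1)
    b2 = b2 - alpha * s2
    W1 = W1 - alpha * np.outer(s1, a0)
    b1 = b1 - alpha * s1
    return W1, b1, W2, b2
```

Calling train_step repeatedly over the training set, with the examples presented in varying order, gives the incremental (stochastic) form of the algorithm summarized above.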

26 BP Neural Network [figure-only slide]

27 Example: Function Approximation
A 1-2-1 network is trained to approximate a target function; the error $e = t - a$ between the target and the network output drives the training.

28 Network Architecture The 1-2-1 network maps the input $p$ to the output $a$, with a log-sigmoid hidden layer and a linear output layer.

29 Initial Values The weights and biases are set to initial (small random) values, and the initial network response is plotted over the range of training inputs.

30 Forward Propagation
Initial input: $a^0 = p$. Output of the 1st layer: $a^1 = f^1(W^1 a^0 + b^1) = \mathrm{logsig}(W^1 p + b^1)$. Output of the 2nd layer: $a^2 = f^2(W^2 a^1 + b^2) = W^2 a^1 + b^2$. Error: $e = t - a^2$.

31 Transfer Function Derivatives
Log-sigmoid (1st layer): $\dot f^1(n) = \frac{d}{dn}\left(\frac{1}{1+e^{-n}}\right) = \frac{e^{-n}}{(1+e^{-n})^2} = (1 - a^1)(a^1)$. Linear (2nd layer): $\dot f^2(n) = \frac{d}{dn}(n) = 1$.

32 Backpropagation The second-layer sensitivity: $s^2 = -2 \dot F^2(n^2)(t - a) = -2 \dot f^2(n^2)\, e$.
The first-layer sensitivity: $s^1 = \dot F^1(n^1) (W^2)^T s^2$.

33 Weight Update With learning rate $\alpha$: $W^2(k+1) = W^2(k) - \alpha s^2 (a^1)^T$, $b^2(k+1) = b^2(k) - \alpha s^2$, $W^1(k+1) = W^1(k) - \alpha s^1 (a^0)^T$, $b^1(k+1) = b^1(k) - \alpha s^1$.
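
The slides' actual numbers are in the figures; the sketch below reruns the same sequence of calculations (forward pass, sensitivities, update with $\alpha = 0.1$) for a 1-2-1 logsig/purelin network using illustrative initial values and an assumed target function, so the printed numbers will differ from the slides.

```python
import numpy as np

def logsig(n):
    return 1.0 / (1.0 + np.exp(-n))

# Illustrative initial parameters for a 1-2-1 network (assumed, not the slides' values)
W1 = np.array([[-0.3], [0.4]]);  b1 = np.array([-0.5, -0.1])
W2 = np.array([[0.1, -0.2]]);    b2 = np.array([0.5])
alpha = 0.1

p = np.array([1.0])                        # training input
t = np.array([1.0 + np.sin(np.pi / 4)])    # assumed target function g(p) = 1 + sin(pi p / 4)

# Forward propagation
a1 = logsig(W1 @ p + b1)
a2 = W2 @ a1 + b2
e = t - a2

# Backward propagation of the sensitivities
s2 = -2.0 * e                              # purelin derivative = 1
s1 = a1 * (1 - a1) * (W2.T @ s2)           # logsig derivative = a(1 - a)

# Weight update with learning rate alpha
W2 -= alpha * np.outer(s2, a1);  b2 -= alpha * s2
W1 -= alpha * np.outer(s1, p);   b1 -= alpha * s1
print(e, W1.ravel(), W2.ravel())
```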

34 Choice of Network Structure
Multilayer networks can be used to approximate almost any function, if we have enough neurons in the hidden layers. We cannot say, in general, how many layers or how many neurons are necessary for adequate performance.

35 Illustrated Example 1: 1-3-1 Network [figure-only slide]

36 Illustrated Example 2: responses of 1-2-1, 1-3-1, 1-4-1, and 1-5-1 networks [figure-only slide]

37 Convergence
[Two figure panels: convergence to the global minimum and convergence to a local minimum; the numbers next to each curve indicate the sequence of iterations.]

38 Generalization In most cases the multilayer network is trained with a finite number of examples of proper network behavior: $\{p_1, t_1\}, \{p_2, t_2\}, \dots, \{p_Q, t_Q\}$. This training set is normally representative of a much larger class of possible input/output pairs. Can the network successfully generalize what it has learned to the total population?

39 Generalization Example
[Figure: on the same training data, the 1-2-1 network generalizes well, while the 1-9-1 network does not.] For a network to be able to generalize, it should have fewer parameters than there are data points in the training set.
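
As a concrete count for the two architectures in the figure: the 1-2-1 network has $2 + 2 + 2 + 1 = 7$ adjustable parameters ($W^1$, $b^1$, $W^2$, $b^2$), while the 1-9-1 network has $9 + 9 + 9 + 1 = 28$. With a small training set, the larger network has far more parameters than the data can constrain, so it can match the training points exactly without matching the underlying function between them.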

