1 Neural Networks 2nd Edition Simon Haykin
柯博昌 Chap 4. Multilayer Perceptrons

2 Introduction
Typically, the network consists of an input layer of source nodes, one or more hidden layers, and an output layer of computation nodes. Multilayer perceptrons have been applied successfully to difficult and diverse problems by training them in a supervised manner with a highly popular algorithm known as the error back-propagation algorithm. The back-propagation algorithm is based on the error-correction learning rule and may be viewed as a generalization of the LMS algorithm. It consists of two passes:
Forward pass: the input signal propagates through the network, layer by layer, producing the output(s) as the response of the network.
Backward pass: the synaptic weights are all adjusted in accordance with the error-correction rule.

3 Characteristics of multilayer perceptrons
The model of each neuron in the network includes a nonlinear activation function; the nonlinearity is smooth (differentiable everywhere), a sigmoidal nonlinearity being a common choice.
The network contains one or more layers of hidden neurons, which enable the network to learn complex tasks by extracting progressively more meaningful features from the input patterns.
The network exhibits a high degree of connectivity, determined by the synapses of the network.

4 Multilayer Perceptron Architecture
[Figures: architectural graph and signal-flow graph of the multilayer perceptron]
Function signals: an input signal that comes in at the input end of the network, propagates forward (neuron by neuron) through the network, and emerges at the output end of the network.
Error signals: originate at an output neuron and propagate backward (layer by layer) through the network. The term "error signal" means that its computation by every neuron of the network involves an error-dependent function.

5 Notations used in Back-propagation Algorithm
i, j, and k refer to different neurons; with signals propagating through the network from left to right, neuron j lies in a layer to the right of neuron i.
Iteration n: the nth training example presented to the network.
C(n): the instantaneous sum of error squares, or error energy, at iteration n.
C_av: the average of C(n) over all values of n (i.e., over the entire training set).
e_j(n): the error signal at the output of neuron j for iteration n.
d_j(n): the desired response for neuron j, used to compute e_j(n).
y_j(n): the function signal appearing at the output of neuron j at iteration n.

6 Notations used in Back-propagation Algorithm (Cont.)
w_ji(n): the synaptic weight connecting the output of neuron i to the input of neuron j at iteration n; the correction applied to this weight at iteration n is denoted Δw_ji(n).
v_j(n): the weighted sum of all synaptic inputs plus the bias of neuron j at iteration n (the induced local field).
φ_j(·): the activation function describing the input-output functional relationship of the nonlinearity associated with neuron j.
b_j: the bias applied to neuron j; it is also represented by a synapse of weight w_j0 = b_j connected to a fixed input equal to +1.
x_i(n): the ith element of the input vector.
o_k(n): the kth element of the overall output vector.
η: the learning-rate parameter.
m_l: the size (number of nodes) of layer l of the multilayer perceptron, l = 0, 1, …, L, where L is the depth of the network.

7 Back-propagation Algorithm
The error signal at the output of neuron j at iteration n (neuron j being an output node): e_j(n) = d_j(n) - y_j(n).
The error energy for neuron j: (1/2) e_j^2(n).
Total error energy: C(n) = (1/2) Σ_j e_j^2(n), where the sum runs over all neurons in the output layer.
Average squared error energy: C_av = (1/N) Σ_{n=1}^{N} C(n), where N is the total number of training examples.
In a manner similar to the LMS algorithm, back-propagation applies to w_ji(n) a correction proportional to the partial derivative ∂C(n)/∂w_ji(n), which the chain rule expresses as
∂C(n)/∂w_ji(n) = [∂C(n)/∂e_j(n)] [∂e_j(n)/∂y_j(n)] [∂y_j(n)/∂v_j(n)] [∂v_j(n)/∂w_ji(n)].
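The two error-energy quantities are cheap to evaluate; a minimal Python sketch (variable names are illustrative, not from the slides):

import numpy as np

def instantaneous_error_energy(d, y):
    # d and y hold the desired responses d_j(n) and outputs y_j(n) of the output neurons
    e = d - y                       # e_j(n) = d_j(n) - y_j(n)
    return 0.5 * np.sum(e ** 2)     # C(n) = (1/2) * sum_j e_j(n)^2

def average_error_energy(energies):
    # energies holds C(n) for n = 1, ..., N over one epoch
    return float(np.mean(energies)) # C_av = (1/N) * sum_n C(n)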

8 Back-propagation Algorithm Case 1: Neuron j is an Output Node
(n)/wji(n) represents a sensitivity factor Let

9 Back-propagation Algorithm Case 2: Neuron j is a Hidden Node
When neuron j is a hidden node, no desired response is specified for it, so its error signal must be determined recursively in terms of the error signals of all the neurons to which neuron j is directly connected. The local gradient is redefined as δ_j(n) = -[∂C(n)/∂y_j(n)] φ'_j(v_j(n)). Because C(n) depends on y_j(n) only through the induced local fields v_k(n) of the neurons k in the layer to the right of neuron j, the chain rule gives ∂C(n)/∂y_j(n) = -Σ_k δ_k(n) w_kj(n).

10 Back-propagation Algorithm (Cont.) Case 2: Neuron j is a Hidden Node
Summary (Delta Rule): the weight correction Δw_ji(n) is equal to the learning-rate parameter η times the local gradient δ_j(n) times the input signal y_i(n) of neuron j.
If neuron j is an output node, δ_j(n) = φ'_j(v_j(n)) e_j(n).
If neuron j is a hidden node, δ_j(n) = φ'_j(v_j(n)) Σ_k δ_k(n) w_kj(n), i.e., the derivative of its activation times the weighted sum of the δs of the neurons in the next layer connected to neuron j.
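As a concrete illustration of the two cases of the delta rule, a minimal Python sketch (hypothetical names, assuming logistic activations with unit slope, so φ'(v) = y(1 - y)):

import numpy as np

def logistic_prime_from_output(y):
    # derivative of the logistic function, expressed through the neuron's own output
    return y * (1.0 - y)

def output_local_gradients(d, o):
    # Case 1: delta_j(n) = phi'_j(v_j(n)) * e_j(n), with e_j(n) = d_j(n) - o_j(n)
    return logistic_prime_from_output(o) * (d - o)

def hidden_local_gradients(y_hidden, delta_next, W_next):
    # Case 2: delta_j(n) = phi'_j(v_j(n)) * sum_k delta_k(n) * w_kj(n)
    # W_next[k, j] is the weight from hidden neuron j to neuron k in the next layer
    return logistic_prime_from_output(y_hidden) * (W_next.T @ delta_next)

# The weight correction then follows the delta rule: Delta w_ji(n) = eta * delta_j(n) * y_i(n).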

11 Two passes of Computation
Forward pass: the synaptic weights remain unaltered while the function signals are computed neuron by neuron. For neuron j, v_j(n) = Σ_{i=0}^{m} w_ji(n) y_i(n), where m is the total number of inputs (excluding the bias) applied to neuron j, and y_j(n) = φ_j(v_j(n)). If neuron j is in the first hidden layer (m = m_0), then y_i(n) = x_i(n), the ith element of the input vector; if neuron j is in the output layer (m = m_L), then y_j(n) = o_j(n), the jth element of the overall output vector, and the error is e_j(n) = d_j(n) - o_j(n).
Backward pass: start at the output layer by passing the error signals leftward through the network, layer by layer, recursively computing the local gradient δ for each neuron. Following the delta rule, this recursive process permits the synaptic weights to be adjusted.
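A minimal sketch of the forward pass for a network with one hidden layer, assuming logistic activations; W1 and W2 are hypothetical weight matrices whose column 0 holds the bias, paired with a fixed +1 input:

import numpy as np

def logistic(v):
    return 1.0 / (1.0 + np.exp(-v))

def forward_pass(x, W1, W2):
    y0 = np.concatenate(([1.0], x))              # layer 0: y_i(n) = x_i(n), plus the fixed +1 bias input
    v1 = W1 @ y0                                 # induced local fields of the hidden layer
    y1 = np.concatenate(([1.0], logistic(v1)))   # hidden-layer function signals, plus bias input for the next layer
    v2 = W2 @ y1                                 # induced local fields of the output layer
    o = logistic(v2)                             # output layer: y_j(n) = o_j(n)
    return y1, o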

12 Activation Function - Logistic Function
Logistic function: y_j(n) = φ_j(v_j(n)) = 1 / (1 + exp(-a v_j(n))), a > 0, whose derivative is φ'_j(v_j(n)) = a y_j(n) [1 - y_j(n)].
If neuron j is located in the output layer, δ_j(n) = a [d_j(n) - o_j(n)] o_j(n) [1 - o_j(n)].
If neuron j is located in a hidden layer, δ_j(n) = a y_j(n) [1 - y_j(n)] Σ_k δ_k(n) w_kj(n).

13 Activation Function – Hyperbolic Tangent Function
Hyperbolic tangent function: y_j(n) = φ_j(v_j(n)) = a tanh(b v_j(n)), with a, b > 0, whose derivative is φ'_j(v_j(n)) = (b/a) [a - y_j(n)] [a + y_j(n)].
If neuron j is located in the output layer, δ_j(n) = (b/a) [d_j(n) - o_j(n)] [a - o_j(n)] [a + o_j(n)].
If neuron j is located in a hidden layer, δ_j(n) = (b/a) [a - y_j(n)] [a + y_j(n)] Σ_k δ_k(n) w_kj(n).
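Both activation derivatives can be evaluated directly from the neuron's output, which is what keeps the back-propagation updates above cheap. A small sketch, with a and b as positive slope parameters (the default values shown are only illustrative):

import numpy as np

def logistic(v, a=1.0):
    return 1.0 / (1.0 + np.exp(-a * v))

def logistic_prime_from_output(y, a=1.0):
    # phi'_j(v_j(n)) = a * y_j(n) * (1 - y_j(n))
    return a * y * (1.0 - y)

def tanh_activation(v, a=1.0, b=1.0):
    return a * np.tanh(b * v)

def tanh_prime_from_output(y, a=1.0, b=1.0):
    # phi'_j(v_j(n)) = (b/a) * (a - y_j(n)) * (a + y_j(n))
    return (b / a) * (a - y) * (a + y)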

14 Rates of Learning The back-propagation algorithm provides an "approximation" to the trajectory in weight space computed by the method of steepest descent. The smaller the learning-rate parameter η, the smaller the changes to the synaptic weights from one iteration to the next, and the smoother the trajectory in weight space. If η is too large, the network may become unstable (i.e., oscillatory).

15 Sequential and Batch Modes of Training
Epoch: one complete presentation of the entire training set during the training process. The learning process is maintained on an epoch-by-epoch basis until the synaptic weights and bias levels stabilize and the average squared error converges to some minimum value. Back-propagation learning may proceed in one of two basic ways:
Sequential Mode: the weight updating is performed after the presentation of each training example (suitable for on-line operation).
Batch Mode: the weight updating is performed after the presentation of all the training examples that constitute an epoch.
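A sketch contrasting the two update schedules; backprop_gradients is a hypothetical helper, assumed to return the gradient of C(n) with respect to each weight matrix for one example:

def train_sequential(examples, weights, eta, epochs, backprop_gradients):
    # Sequential (on-line) mode: adjust the weights after every single example.
    for _ in range(epochs):
        for x, d in examples:
            grads = backprop_gradients(x, d, weights)
            weights = [W - eta * g for W, g in zip(weights, grads)]
    return weights

def train_batch(examples, weights, eta, epochs, backprop_gradients):
    # Batch mode: accumulate the gradients over the whole epoch, then adjust once.
    for _ in range(epochs):
        totals = [0.0 * W for W in weights]
        for x, d in examples:
            grads = backprop_gradients(x, d, weights)
            totals = [t + g for t, g in zip(totals, grads)]
        weights = [W - eta * t / len(examples) for W, t in zip(weights, totals)]
    return weights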

16 Summary of the Back-propagation Algorithm
Initialization: pick all of the synaptic weights w_ji from a uniform distribution.
Presentation of training examples: present the network with an epoch of training examples.
Forward computation: compute the function signals of the network layer by layer, and the error signals at the output.
Backward computation: compute the local gradients δ and adjust the synaptic weights in accordance with the delta rule.
Iteration: iterate the forward and backward computations over new epochs until the chosen stopping criterion is met.
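Putting the five steps together, a minimal sequential-mode sketch for a single hidden layer with logistic activations; all names and hyperparameter values are illustrative, not the book's code:

import numpy as np

def train_mlp(X, D, n_hidden, eta=0.1, epochs=100, seed=0):
    rng = np.random.default_rng(seed)
    n_in, n_out = X.shape[1], D.shape[1]
    # 1. Initialization: draw the weights (bias in column 0) from a uniform distribution
    W1 = rng.uniform(-0.5, 0.5, (n_hidden, n_in + 1))
    W2 = rng.uniform(-0.5, 0.5, (n_out, n_hidden + 1))
    logistic = lambda v: 1.0 / (1.0 + np.exp(-v))
    for _ in range(epochs):                     # 5. Iteration: epoch by epoch
        for x, d in zip(X, D):                  # 2. Presentation of training examples
            # 3. Forward computation
            y0 = np.concatenate(([1.0], x))
            y1 = np.concatenate(([1.0], logistic(W1 @ y0)))
            o = logistic(W2 @ y1)
            # 4. Backward computation: local gradients, then delta-rule weight corrections
            delta2 = (d - o) * o * (1.0 - o)
            delta1 = y1[1:] * (1.0 - y1[1:]) * (W2[:, 1:].T @ delta2)
            W2 += eta * np.outer(delta2, y1)
            W1 += eta * np.outer(delta1, y0)
    return W1, W2

# Illustrative call (XOR-style data, hypothetical settings):
# W1, W2 = train_mlp(np.array([[0,0],[0,1],[1,0],[1,1]], float),
#                    np.array([[0],[1],[1],[0]], float), n_hidden=2)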

