Published by Alfredo Linder. Modified about 1 year ago.

1
Least-squares-based multilayer perceptron training with weighted adaptation: software simulation project
EE 690 Design of Embodied Intelligence

2
Outline
Multilayer perceptron
Least-squares based learning algorithm
Weighted adaptation in training
Signal-to-noise ratio figure and overfitting
Software simulation project

3
Multilayer perceptron (MLP)
Feedforward network (no recurrent connections) with units arranged in layers
[Figure: layered network mapping inputs x to outputs z]

4
Multilayer perceptron (MLP)
Efficient mapping from inputs to outputs
Powerful universal function approximation
Number of inputs and outputs determined by the data
Number of hidden neurons
Number of hidden layers
[Figure: MLP block between inputs and outputs]

5
Multilayer perceptron learning
[Figure: network with input layer, hidden layer, and output layer]
Back-propagation (BP) training algorithm: determines how much each weight is responsible for the error signal
BP has two phases:
Forward pass phase: feedforward propagation of the input signals through the network
Backward pass phase: propagation of the error backwards through the network
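The forward pass phase can be illustrated with a minimal NumPy sketch. It assumes a single hidden layer, a tanh transfer function, and samples stored as columns; the names y1, z1, y2, z2 follow the notation used on the later slides (y is the signal before the transfer function, z the signal after it), while the sizes and random weights are placeholders.

```python
import numpy as np

def forward(x, W1, b1, W2, b2, f=np.tanh):
    """Forward pass of a two-layer MLP; each column of x is one sample."""
    y1 = W1 @ x + b1   # hidden-layer signal before the transfer function
    z1 = f(y1)         # hidden-layer output
    y2 = W2 @ z1 + b2  # output-layer signal before the transfer function
    z2 = f(y2)         # network output
    return y1, z1, y2, z2

# Placeholder sizes: 3 inputs, 4 hidden neurons, 1 output, 5 samples.
rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, size=(3, 5))
W1, b1 = rng.normal(size=(4, 3)), rng.normal(size=(4, 1))
W2, b2 = rng.normal(size=(1, 4)), rng.normal(size=(1, 1))
y1, z1, y2, z2 = forward(x, W1, b1, W2, b2)
```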

6
Multilayer perceptron learning: backward pass
We want to know how to modify the weights in order to decrease the error E.
Use gradient descent on E with respect to the weights.
Drawbacks:
1. Gradient-based adjustment can end up in local minima
2. Training is time-consuming because of the large number of learning steps, and the step size needs to be configured
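A minimal sketch of one gradient-descent update may make the backward pass concrete. The slides do not give the error function or the step size, so the following are assumptions: E is half the mean squared error, the transfer function is tanh on both layers, and eta = 0.1 is an arbitrary step size. The toy data is also a placeholder.

```python
import numpy as np

def bp_step(x, d, W1, b1, W2, b2, eta=0.1):
    """One gradient-descent update of a two-layer tanh MLP.

    Returns the updated parameters and the error E measured before the update.
    """
    n = x.shape[1]
    # Forward pass
    z1 = np.tanh(W1 @ x + b1)
    z2 = np.tanh(W2 @ z1 + b2)
    # Backward pass: error terms at the output layer, then at the hidden layer
    delta2 = (z2 - d) * (1.0 - z2 ** 2)
    delta1 = (W2.T @ delta2) * (1.0 - z1 ** 2)
    # Gradient descent: w <- w - eta * dE/dw
    W2_new = W2 - eta * (delta2 @ z1.T) / n
    b2_new = b2 - eta * delta2.mean(axis=1, keepdims=True)
    W1_new = W1 - eta * (delta1 @ x.T) / n
    b1_new = b1 - eta * delta1.mean(axis=1, keepdims=True)
    return W1_new, b1_new, W2_new, b2_new, 0.5 * np.mean((z2 - d) ** 2)

# Toy problem (placeholder): 1 input, 5 hidden neurons, 1 output.
rng = np.random.default_rng(0)
x = np.linspace(-1, 1, 40).reshape(1, -1)
d = 0.5 * np.sin(np.pi * x)
W1, b1 = 0.5 * rng.normal(size=(5, 1)), np.zeros((5, 1))
W2, b2 = 0.5 * rng.normal(size=(1, 5)), np.zeros((1, 1))
errors = []
for _ in range(200):
    W1, b1, W2, b2, err = bp_step(x, d, W1, b1, W2, b2)
    errors.append(err)
```

The loop shows the second drawback listed above: many small steps are needed, and eta has to be chosen by hand.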

7
Least-squares based Learning Algorithm
Least-squares fit (LSF): obtains the minimum sum of squared errors (SSE)
For an overdetermined problem, LSF finds the solution with the minimum SSE
For an underdetermined problem, the pseudo-inverse finds the solution with the minimum norm
Can be applied to optimize either the weights or the signals on the layers
[Figure: optimized weights; optimized signals]
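The two cases can be checked numerically. The sketch below uses `numpy.linalg.lstsq` for an overdetermined system and `numpy.linalg.pinv` for an underdetermined one; the matrices are random placeholders, not data from the project.

```python
import numpy as np

rng = np.random.default_rng(0)

# Overdetermined: 20 equations, 3 unknowns. No exact solution in general;
# the least-squares fit minimises the sum of squared errors.
A = rng.normal(size=(20, 3))
b = rng.normal(size=20)
w_ls, *_ = np.linalg.lstsq(A, b, rcond=None)
sse_ls = np.sum((A @ w_ls - b) ** 2)

# Underdetermined: 3 equations, 20 unknowns. Infinitely many exact solutions;
# the pseudo-inverse returns the one with the smallest norm.
B = rng.normal(size=(3, 20))
c = rng.normal(size=3)
w_mn = np.linalg.pinv(B) @ c

# Another exact solution: add any vector from the null space of B.
v = rng.normal(size=20)
v_null = v - np.linalg.pinv(B) @ (B @ v)
w_other = w_mn + v_null
```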

8
Least-squares based Learning Algorithm (I)
I. Start with desired output signal back-propagation (signals optimization)
1. Propagation of the desired outputs back through the layers
2. Optimization of the weights between the layers
(1). y2 = f^-1(z2), scale y2 to (-1, 1).
(2). Based on W2, b2, optimize z1 to satisfy W2.z1 = y2 - b2.
(3). y1 = f^-1(z1), scale y1 to (-1, 1).
(4). Optimize W1, b1 to satisfy W1.x + b1 = y1.
(5). Evaluate z1, y1 using the new W1 and bias b1.
(6). Optimize W2, b2 to satisfy W2.z1 + b2 = y2.
(7). Evaluate z2, y2 using the new W2 and bias b2.
(8). Evaluate the MSE.
[Figure: network diagram with input x, weights W1 and bias b1, hidden signals y1 and z1, weights W2 and bias b2, output signals y2 and z2, and desired output d]
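One pass through steps (1) to (8) can be sketched in NumPy as follows. This is an interpretation, not the project's MATLAB code. It assumes f = tanh (so f^-1 = arctanh), takes z2 in step (1) to be the desired output d, and rescales the back-propagated hidden signal into (-0.9, 0.9) so that arctanh stays finite. The helper `fit_affine` and the scaling constant are hypothetical.

```python
import numpy as np

def fit_affine(inp, target):
    """Least-squares W, b such that W @ inp + b ~= target (columns are samples)."""
    A = np.vstack([inp, np.ones((1, inp.shape[1]))])
    sol, *_ = np.linalg.lstsq(A.T, target.T, rcond=None)
    return sol[:-1].T, sol[-1:].T

def ls_iteration(x, d, W2, b2):
    """One pass of algorithm (I) for a single hidden layer with tanh units."""
    y2 = np.arctanh(d)                            # (1) desired signal before f
    z1 = np.linalg.pinv(W2) @ (y2 - b2)           # (2) minimum-norm hidden signal
    z1 = 0.9 * z1 / max(np.abs(z1).max(), 1e-12)  # assumed scaling into (-1, 1)
    y1 = np.arctanh(z1)                           # (3)
    W1, b1 = fit_affine(x, y1)                    # (4)
    z1 = np.tanh(W1 @ x + b1)                     # (5)
    W2, b2 = fit_affine(z1, y2)                   # (6)
    z2 = np.tanh(W2 @ z1 + b2)                    # (7)
    mse = np.mean((z2 - d) ** 2)                  # (8)
    return W1, b1, W2, b2, mse

# Placeholder problem: 1 input, 4 hidden neurons, 1 output, 50 samples.
rng = np.random.default_rng(0)
x = np.linspace(-1, 1, 50).reshape(1, -1)
d = 0.8 * np.sin(np.pi * x)                       # desired outputs inside (-1, 1)
W2, b2 = rng.normal(size=(1, 4)), np.zeros((1, 1))
W1, b1, W2, b2, mse = ls_iteration(x, d, W2, b2)
```

Each weight update is a single linear solve, which is where the method avoids the many small steps of gradient descent.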

9
Least-squares based Learning Algorithm (I)
Weights optimization with weighted LSF
The location of x on the transfer function determines its effect on the output signal of this layer
dy/dx is the weighting term in LSF
Optimize W1, b1 to satisfy W1.x = y1 - b1 (weighted LSF)
[Figure: transfer function with input increments Δx and the resulting output increments Δy]
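A weighted least-squares fit of this kind can be sketched for one hidden neuron. The slides do not spell out the weighting, so the choice below is an assumption: each sample's residual is scaled by the slope of tanh at the target signal, so that samples in the saturated region count less. The function name `weighted_fit` is hypothetical.

```python
import numpy as np

def weighted_fit(x, y_target):
    """Fit w, b so that w @ x + b ~= y_target for one neuron.

    Each sample's residual is scaled by the slope of tanh at y_target
    (assumed weighting; columns of x are samples).
    """
    slope = 1.0 - np.tanh(y_target) ** 2             # transfer-function slope per sample
    A = np.vstack([x, np.ones((1, x.shape[1]))]).T   # one row per sample, last column for the bias
    sol, *_ = np.linalg.lstsq(A * slope[:, None], y_target * slope, rcond=None)
    return sol[:-1], sol[-1]

# Placeholder data: targets that are exactly affine in x, so the fit should recover them.
rng = np.random.default_rng(0)
x = rng.normal(size=(2, 30))
w_true, b_true = np.array([0.5, -1.0]), 0.3
y_target = w_true @ x + b_true
w, b = weighted_fit(x, y_target)
```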

10
Least-squares based Learning Algorithm (II)
II. Weights optimization with iterative fitting
W1 can be further adjusted based on the output error
Each hidden neuron acts as a basis function
Start with the 1st hidden neuron, and continue to the other neurons as long as the output error e_out exists

11
Least-squares based Learning Algorithm (III)
III. Start with input feedforward (weights optimization)
1. Propagation of the inputs forward through the layers
2. Optimization of the weights between the layers and of the signals on the layers
(1). Evaluate z1, y1 using the initial W1 and bias b1.
(2). y2 = f^-1(d).
(3). Optimize W2, b2 to satisfy W2.z1 + b2 = y2.
(4). Based on W2, b2, optimize z1 to satisfy W2.z1 + b2 = y2.
(5). y1 = f^-1(z1).
(6). Optimize W1, b1 to satisfy W1.x + b1 = y1.
(7). Evaluate y1, z1, y2, z2 using the new W1, W2 and biases b1, b2.
(8). Evaluate the MSE.
[Figure: network diagram with input x, weights W1 and bias b1, hidden signals y1 and z1, weights W2 and bias b2, output signals y2 and z2, and desired output d]

12
Least-squares based Learning Algorithm (III)
Signal optimization with weighted adaptation
The location of x on the transfer function determines how much the signal can be changed
[Figure: transfer function, axes x and y]

13
Overfitting problem
The learning algorithm can adapt the MLP to fit the training data.
For noisy training data, how closely should the data be learned? (overfitting)
The number of hidden neurons and the number of layers affect the training accuracy. They are determined by the user, and the choice is critical.
Optimized Approximation Algorithm: SNRF criterion

14
Signal-to-noise ratio figure (SNRF)
Sampled data: function value + noise
Error signal: approximation error component + noise component
Approximation error component: useful signal, should be reduced
Noise component: should not be learned
Assumption: continuous function, with white Gaussian noise (WGN) as the noise
SNRF: signal energy / noise energy
Compare SNRF_e and SNRF_WGN
Should learning stop?
If there is useful signal left unlearned: no
If noise dominates in the error signal: yes
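The slides define SNRF only as signal energy over noise energy, so the estimator below is an assumption. For samples ordered along one input dimension, it treats the correlation between neighbouring error samples as the signal energy (a continuous function changes little between neighbours, while white noise is uncorrelated) and the remainder of the total energy as noise. The function name `snrf` is hypothetical.

```python
import numpy as np

def snrf(e):
    """Estimate signal energy / noise energy of a 1-D error signal.

    Assumed estimator: neighbouring-sample correlation stands in for the
    signal energy; the rest of the total energy is attributed to noise.
    """
    e = np.asarray(e, dtype=float)
    total = np.dot(e, e)             # signal-plus-noise energy
    signal = np.dot(e[:-1], e[1:])   # correlation with the shifted copy
    return signal / (total - signal)

rng = np.random.default_rng(0)
t = np.linspace(0.0, 2.0 * np.pi, 500)
snrf_smooth = snrf(np.sin(t))            # error that is all unlearned signal
snrf_noise = snrf(rng.normal(size=500))  # error that is all white Gaussian noise
```

With this estimator a smooth error gives a large SNRF and a pure-noise error gives a value near zero, which is the contrast the comparison of SNRF_e with SNRF_WGN relies on.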

15
Signal-to-noise ratio figure (SNRF)
[Figure: training data and approximating function; error signal = approximation error component + noise component]

16
Optimization using SNRF
When SNRF_e < threshold SNRF_WGN, noise dominates in the error signal and little information is left unlearned, so learning should stop.
1. Start with a small network (small number of neurons or layers)
2. Train the MLP to obtain the training error e_train
3. Compare SNRF_e and SNRF_WGN
4. If the stopping criterion is not met, add hidden neurons and train again
Stopping criterion: SNRF_e < threshold SNRF_WGN
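The growth loop can be sketched as below. Several parts are stand-ins and not the project's method: the trainer uses a random tanh hidden layer with least-squares output weights, the SNRF estimator is the assumed neighbour-correlation one, and the threshold value 0.1 is an arbitrary placeholder for the WGN threshold. All function names are hypothetical.

```python
import numpy as np

def snrf(e):
    """Assumed SNRF estimator: neighbour correlation over the remaining energy."""
    total = np.dot(e, e)
    signal = np.dot(e[:-1], e[1:])
    return signal / (total - signal)

def fit_stand_in(x, d, n_hidden, rng):
    """Stand-in trainer: random tanh hidden layer, least-squares output weights.

    Returns the training error signal e_train.
    """
    W1 = 3.0 * rng.normal(size=(n_hidden, 1))
    b1 = rng.normal(size=(n_hidden, 1))
    A = np.vstack([np.tanh(W1 * x + b1), np.ones_like(x)]).T
    w, *_ = np.linalg.lstsq(A, d, rcond=None)
    return d - A @ w

def grow_until_noise(x, d, threshold, max_hidden=20, seed=0):
    """Add hidden neurons until SNRF_e drops below the threshold."""
    rng = np.random.default_rng(seed)
    history = []
    for n_hidden in range(1, max_hidden + 1):
        e_train = fit_stand_in(x, d, n_hidden, rng)
        history.append(snrf(e_train))
        if history[-1] < threshold:   # noise dominates: stop growing
            break
    return n_hidden, history

# Placeholder data: a smooth function plus white Gaussian noise.
rng = np.random.default_rng(1)
x = np.linspace(-1.0, 1.0, 200)
d = np.sin(3.0 * x) + 0.1 * rng.normal(size=x.size)
n_hidden, history = grow_until_noise(x, d, threshold=0.1)
```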

17
Optimization using SNRF
Applied to optimizing the number of iterations in back-propagation training, to avoid overfitting (overtraining):
1. Set the structure of the MLP
2. Train the MLP with back-propagation iterations to obtain the training error e_train
3. Compare SNRF_e and SNRF_WGN
4. If the stopping criterion is not met, keep training with more iterations

18
Software simulation project
Prepare the data
Each column is one data sample: N samples
Each row is one feature: M features
Desired output in a row vector: N values
Save “features” and “values” in a training MAT file
M x N matrix: “Features”
1 x N vector: “Values”
How to call the function
Run “main_MLP_LS.m”
Specify the MAT file path and name and the MLP parameters in the command window.
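The data layout can be checked with a short NumPy sketch before the arrays are saved. It only validates the shapes and splits the columns into training and testing sets. Taking the first columns as the training set is an assumption, the data is random placeholder data, and the function name `split_columns` is hypothetical.

```python
import numpy as np

def split_columns(features, values, n_train):
    """Check the M x N / 1 x N layout and split the samples (columns)."""
    features = np.asarray(features, dtype=float)
    values = np.asarray(values, dtype=float).reshape(1, -1)
    if features.ndim != 2 or features.shape[1] != values.shape[1]:
        raise ValueError("expected an M x N feature matrix and N desired values")
    train = (features[:, :n_train], values[:, :n_train])
    test = (features[:, n_train:], values[:, n_train:])
    return train, test

# Placeholder data: M = 4 features, N = 732 samples.
rng = np.random.default_rng(0)
features = rng.normal(size=(4, 732))
values = rng.normal(size=732)
(train_x, train_d), (test_x, test_d) = split_columns(features, values, 500)
```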

19
Software simulation project
Example session (program prompts shown verbatim, each followed by the user's input):
Input the path where data file can be found (C:*): E:\Research\MLP_LSInitial_desired\MLP_LS_package\
Input the name of data file (*.mat): mackey_glass_data.mat
There are overall 732 samples. How do you like to divide them into training and testing set?
Number of training samples: 500
Number of testing samples: 232
How many layers does MLP have? 3:2:7
How many neurons there are on each hidden layer ? 3:1:10
What kind of tranfer function you like to have on hidden neurons?
0. Linear tranfer function
1. Tangent sigmoid
2. Logrithmic sigmoid
2
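The answers 3:2:7 and 3:1:10 read as MATLAB colon ranges (start:step:stop, with the stop value included). Assuming the program sweeps every combination, the resulting grid of configurations can be listed as in the sketch below. The helper name `matlab_colon` is hypothetical and handles positive integer steps only.

```python
def matlab_colon(start, step, stop):
    """Expand a MATLAB start:step:stop range (stop inclusive, positive integer step)."""
    return list(range(start, stop + 1, step))

layers = matlab_colon(3, 2, 7)     # answer to "How many layers does MLP have?"
neurons = matlab_colon(3, 1, 10)   # answer to the neurons-per-hidden-layer prompt

# Assumed sweep: every (number of layers, neurons per hidden layer) pair.
configs = [(n_layers, n_neurons) for n_layers in layers for n_neurons in neurons]
```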

20
Software simulation project
[Figure: network diagram with x, W1, b1, y1, z1, W2, b2, y2, z2, and d]
Example session, continued (program prompts shown verbatim, each followed by the user's input):
There are 4 types of training algorithms you can choose from. Which type you like to use?
1. Least-squared based training (I)
2. Least-squared based training with iterative neuron fitting (II)
3. Least-squared based training with weighted signal adaptation (III)
4. Back-propagation training (BP)
1
How many iterations you would like to have in the training ? 3
How many Monte-Carlo runs you would like to have for the training? 2

21
Software simulation project
Results:
J_train (num_layer, num_neuron)
J_test (num_layer, num_neuron)
SNRF (num_layer, num_neuron)
Present the training and testing errors for the various configurations of the MLP
Present the optimum configuration found by SNRF
Present the comparison of the results, including errors and network structure

22
Software simulation project
Typical database and literature survey
Function approximation and classification datasets:
“IEEE Neural Networks Council Standards Committee Working Group on Data modeling Benchmarks”
“Neural Network Databases and Learning Data”
“UCI Machine Learning Repository”
Data are normalized
Multiple inputs, with a single output. For multiple-output data, use separate MLPs.
Compare the results with the literature that uses the same dataset (*)
