On Simple Adaptive Momentum
Dr Richard Mitchell
Cybernetics Intelligence Research Group
Cybernetics, School of Systems Engineering, University of Reading, UK
R.J.Mitchell@reading.ac.uk
Presented at CIS 2008. © Dr Richard Mitchell 2008
Slide 2: Overview
- Simple Adaptive Momentum (SAM) speeds the training of Multi-Layer Perceptrons (MLPs).
- It adapts the usual momentum term according to the angle between the current and previous changes in the weights of the MLP.
- In the original paper, the weight changes of the whole network are used to determine this angle; this paper considers adapting the momentum term using certain subsets of those weights.
- The work is inspired by the author's object-oriented approach to programming MLPs, used successfully in teaching.
- It is concluded that the angle is best determined using the weight changes in each layer separately.
Slide 3: Nomenclature in a Multi-Layer Network
- x_r(i) is the output of node i in layer r; w_r(i,j) is weight i of the link into node j in layer r (w_r(0,j) is the bias weight).
- [Figure: a small MLP with inputs x_1, three hidden nodes x_2(1)..x_2(3) and two outputs x_3(1), x_3(2), with the weights w_2(i,j) and w_3(i,j) labelled on each link.]
- Weight change (sketched in code below): Δ_t w_r(i,j) = η δ_r(j) x_{r-1}(i) + α Δ_{t-1} w_r(i,j)
- δ is a function of the error and varies with the activation function f(z); the error term itself also varies by layer.
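A minimal sketch in Python/NumPy of one layer's update with a fixed momentum term, using the slide's nomenclature (not the author's code; the function and variable names are illustrative assumptions):

```python
import numpy as np

def update_layer_weights(w, x_prev, delta, prev_dw, eta=0.1, alpha=0.9):
    """One momentum-based update for layer r.
    w:       weight matrix w_r(i,j), shape (n_inputs, n_nodes)
    x_prev:  outputs x_{r-1}(i) of the previous layer, shape (n_inputs,)
    delta:   error terms delta_r(j) of this layer, shape (n_nodes,)
    prev_dw: previous weight change, same shape as w
    """
    # Delta_t w_r(i,j) = eta * delta_r(j) * x_{r-1}(i) + alpha * Delta_{t-1} w_r(i,j)
    dw = eta * np.outer(x_prev, delta) + alpha * prev_dw
    return w + dw, dw          # keep dw so it can be used as prev_dw next time
```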
Slide 4: Simple Adaptive Momentum
- Swanston, Bishop and Mitchell (1994), "Simple adaptive momentum: new algorithm for training multilayer perceptrons", Electronics Letters, Vol. 30, No. 18, pp. 1498-1500.
- Concept: adapt the momentum term depending on whether the weight change this time is in the same direction as last time.
- Direction? The weight changes are held in an array, so they form a vector; there are two such vectors, the current and previous changes, Δw_c and Δw_p.
- [Figure: in 2D, with weight axes w_1 and w_2, the angle θ between the vectors Δw_c and Δw_p can be seen directly.]
Slide 5: Implementing SAM
- The simple idea is to replace the momentum constant α by α(1 + cos θ), where θ is the angle between the vectors of current and previous weight changes, Δw_c and Δw_p (see the sketch below).
- In the original paper the Δw vectors cover all the weights in the network.
- In this paper we consider adapting α at the network level, the layer level and the neuron level.
- This is inspired by the object-oriented programming of the MLP, which gives students a good example of, and practice in, the properties of OOP, albeit on an old ANN.
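As a rough illustration of this rule (a sketch under the assumptions above, not the published implementation; names are illustrative), the factor α(1 + cos θ) can be computed from the flattened current and previous weight-change vectors and used in place of the fixed momentum constant:

```python
import numpy as np

def sam_alpha(alpha, dw_current, dw_previous, eps=1e-12):
    """Return alpha * (1 + cos(theta)), where theta is the angle between the
    current and previous weight-change vectors (flattened to 1-D)."""
    c = np.ravel(dw_current)
    p = np.ravel(dw_previous)
    denom = np.linalg.norm(c) * np.linalg.norm(p)
    cos_theta = float(np.dot(c, p) / denom) if denom > eps else 0.0
    # Same direction: momentum roughly doubled; opposite direction: momentum cancelled.
    return alpha * (1.0 + cos_theta)
```

Applied at the network level this uses all the weights; the later slides vary only which weights are gathered into the two vectors.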
Slide 6: OO Approach – Network Layers
- An MLP can be programmed with an object for each neuron, but as each neuron needs the inputs from the previous layer and the deltas from the next, this requires many pointers, which is problematic for students.
- It is easier to have an object for a layer of neurons (all with the same inputs): the inputs and the weighted deltas are each held in an array.
- The base object is a layer of linearly activated neurons, LinActLayer: a single-layer network of neurons with f(z) = z.
- For neurons with sigmoidal activation only two functions differ, those for calculating the output and the delta, so SigActLayer is an object inheriting from LinActLayer: it uses the existing members and adds the two different ones.
Slide 7: Network for Hidden Layers
- A hidden layer needs an enhanced SigActLayer with its own calculate-error function (using the weighted deltas from the next layer); the existing objects already constitute a whole network.
- So SigActHidLayer is a multiple-layer network: it inherits from SigActLayer but also has a pointer to the next layer.
- Most of its functions are two lines: process its own layer, then the next.
- Class hierarchy: LinActLayer (base) → SigActLayer → SigActHidLayer (sketched below).
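A minimal sketch of this hierarchy in Python (not the author's teaching code; the method bodies and constructor signatures are illustrative assumptions):

```python
import numpy as np

class LinActLayer:
    """A single layer of linearly activated neurons, f(z) = z."""
    def __init__(self, n_inputs, n_nodes):
        self.w = np.zeros((n_inputs + 1, n_nodes))     # row 0 holds the bias weights w_r(0,j)
    def activation(self, z):
        return z
    def act_deriv(self, out):
        return np.ones_like(out)
    def outputs(self, x):
        return self.activation(np.concatenate(([1.0], x)) @ self.w)

class SigActLayer(LinActLayer):
    """Sigmoidal neurons: only the output and delta calculations differ."""
    def activation(self, z):
        return 1.0 / (1.0 + np.exp(-z))
    def act_deriv(self, out):
        return out * (1.0 - out)

class SigActHidLayer(SigActLayer):
    """A hidden layer: inherits SigActLayer and also holds a pointer to the
    next layer, so the objects chain together into a multi-layer network."""
    def __init__(self, n_inputs, n_nodes, next_layer):
        super().__init__(n_inputs, n_nodes)
        self.next = next_layer
    def outputs(self, x):
        # Two lines in spirit: process this layer, then pass the result to the next.
        return self.next.outputs(super().outputs(x))
```

For example, `SigActHidLayer(2, 10, SigActLayer(10, 1))` would chain the objects into a 2-10-1 network.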
Slide 8: SAM and the Hierarchy
- With this approach the momentum can be adjusted using the weight changes (a) over the whole network, (b) separately for each layer, or (c) separately for each neuron (see the sketch after this list).
- For (a), calculate the η * delta * inputs terms for all layers, then set α(1 + cos θ) globally.
- For (b), calculate the η * delta * inputs terms for each layer and set α(1 + cos θ) for each layer separately.
- For (c), do the same but for each neuron in each layer.
- This works easily within the hierarchy.
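The three granularities differ only in which weight changes are grouped together before the factor is computed. A sketch, assuming per-layer change matrices and the hypothetical sam_alpha helper shown earlier:

```python
import numpy as np

def sam_factors(alpha, dw_c, dw_p, mode="layer"):
    """dw_c, dw_p: lists of per-layer change matrices, shape (n_inputs+1, n_nodes)."""
    if mode == "network":   # (a) one factor from all the weights in the network
        return sam_alpha(alpha,
                         np.concatenate([m.ravel() for m in dw_c]),
                         np.concatenate([m.ravel() for m in dw_p]))
    if mode == "layer":     # (b) one factor per layer
        return [sam_alpha(alpha, c, p) for c, p in zip(dw_c, dw_p)]
    if mode == "neuron":    # (c) one factor per neuron: column j holds the weights into node j
        return [[sam_alpha(alpha, c[:, j], p[:, j]) for j in range(c.shape[1])]
                for c, p in zip(dw_c, dw_p)]
    raise ValueError("mode must be 'network', 'layer' or 'neuron'")
```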
Slide 9: Experimentation
- Three problems, each with training, validation and unseen data sets.
- Training is stopped when the error on the validation set rises.
- Each problem is run 6 times with different initial weights.
- Problem 1: 2 inputs, 10 nodes in the hidden layer, 1 output.

  SAM mode      None    Neuron   Layer   Network
  Mean epochs    867      227     202      257

  SAM mode    Train SSE   Valid SSE   Unseen SSE
  None        0.0081985   0.0065965   0.0092535
  Neuron      0.0100445   0.0084395   0.0107985
  Layer       0.0103265   0.0086805   0.0106505
  Network     0.0077125   0.0071095   0.0084845
Slide 10: Problem 2
- 5 inputs, 15 nodes in the hidden layer and 1 output.

  SAM mode      None    Neuron   Layer   Network
  Mean epochs   1712      315     262      312

  SAM mode    Train SSE   Valid SSE   Unseen SSE
  None        0.0004725   0.0005625   0.0006665
  Neuron      0.0006585   0.0007635   0.0009525
  Layer       0.0007685   0.0008745   0.0011055
  Network     0.0006215   0.0007655   0.0009505

- With SAM the network trained much more quickly, but the SSEs are worse.
- There is very little difference between the layer and whole-network modes, so...
Slide 11: Problem 3
- 5 inputs, 15 nodes in the hidden layer and 3 outputs.

  SAM mode      None    Neuron   Layer   Network
  Mean epochs   1133      497     638      977

  SAM mode    Train SSE   Valid SSE   Unseen SSE
  None        0.0044735   0.0043835   0.0054605
  Neuron      0.0048205   0.0045685   0.0057955
  Layer       0.0045675   0.0044105   0.0053225
  Network     0.0045465   0.0044055   0.0053445

- SSEs are averaged over the 3 outputs: here the layer mode is best.
Slide 12: Conclusions and Further Work
- The object-oriented hierarchy works neatly here.
- SAM clearly reduces the number of epochs taken to learn, with little extra overhead per epoch.
- In one example it increased the sum-squared errors; this needs investigating.
- It needs to be tested on other problems, but it looks as if SAM at the layer level may be best, particularly with multiple outputs.
- Momentum is used in other learning problems; SAM could be investigated for these too.