Neural Networks Lecture 14: Radial Basis Functions (November 2, 2010)


Cascade Correlation

Weights to each new hidden node are trained to maximize the covariance of the node's output with the current network error.

Covariance:

S(w) = Σ_k | Σ_p (V_p − V̄)(E_{k,p} − Ē_k) |

where
w: vector of weights to the new node
V_p: output of the new node for the p-th input sample
E_{k,p}: error of the k-th output node for the p-th input sample, before the new node is added
V̄, Ē_k: averages over the training set
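
A minimal numpy sketch of this covariance objective (the function name candidate_covariance and the array layout are illustrative assumptions, not from the slides):

    import numpy as np

    def candidate_covariance(V, E):
        """Cascade-correlation objective S for one candidate hidden node.

        V: (P,) outputs of the candidate node for the P training patterns.
        E: (P, K) errors of the K output nodes before the node is added.
        Returns S = sum_k | sum_p (V_p - mean V)(E_{k,p} - mean E_k) |.
        """
        Vc = V - V.mean()              # center the candidate's outputs
        Ec = E - E.mean(axis=0)        # center each output node's errors
        return np.abs(Vc @ Ec).sum()   # sum of |covariances| over output nodes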

Cascade Correlation

Since we want to maximize S (as opposed to minimizing some error), we use gradient ascent:

Δw_i = η ∂S/∂w_i = η Σ_{p,k} σ_k (E_{k,p} − Ē_k) f'_p I_{i,p}

where
I_{i,p}: i-th input for the p-th pattern
σ_k: sign of the correlation between the node's output and the k-th network output
η: learning rate
f'_p: derivative of the node's activation function with respect to its net input, evaluated at the p-th pattern
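
A corresponding update step, sketched under the assumption of a sigmoidal candidate node (the name candidate_gradient_step and the default learning rate are illustrative):

    import numpy as np

    def candidate_gradient_step(w, X, E, eta=0.1):
        """One gradient-ascent step on S for the candidate's weight vector w.

        X: (P, I) training inputs; E: (P, K) output errors before the node is added.
        """
        net = X @ w                            # net input for each pattern
        V = 1.0 / (1.0 + np.exp(-net))         # sigmoidal candidate output (assumed)
        fprime = V * (1.0 - V)                 # derivative of the sigmoid at each pattern
        Vc, Ec = V - V.mean(), E - E.mean(axis=0)
        sigma = np.sign(Vc @ Ec)               # sign of the correlation with each output node
        dS_dw = X.T @ (fprime * (Ec @ sigma))  # dS/dw_i = sum_{p,k} sigma_k Ec[p,k] f'_p X[p,i]
        return w + eta * dS_dw                 # ascend, since we maximize S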

Cascade Correlation

If we can find weights so that the new node's output perfectly covaries with the error in each output node, we can set the new output weights and offsets so that the new error is zero. More realistically, there will be no perfect covariance, so we instead set each output node weight so that the error is minimized. To do this, we can use gradient descent or linear regression for each individual output node weight. Each newly added hidden node further reduces the remaining network error, and so on, until we reach a desired error threshold.
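
For the linear-regression option, the output weights have a closed-form least-squares solution; a minimal sketch assuming a linear output layer (the names fit_output_weights, H, and T are illustrative):

    import numpy as np

    def fit_output_weights(H, T):
        """Least-squares fit of the output-layer weights.

        H: (P, M+1) hidden-node outputs for P patterns, including a bias column of ones.
        T: (P, K) desired outputs.  Returns W of shape (M+1, K) minimizing ||H W - T||.
        """
        W, *_ = np.linalg.lstsq(H, T, rcond=None)
        return W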

Cascade Correlation

This learning algorithm is much faster than backpropagation learning, because only one neuron is trained at a time. On the other hand, its inability to retrain neurons may prevent the cascade-correlation network from finding optimal weight patterns for encoding the given function.

Input Space Clusters

One of our basic assumptions about functions to be learned by ANNs is that inputs belonging to the same class (or requiring similar outputs) are located close to each other in the input space. Often, input vectors from the same class form clusters, i.e., local groups of data points. For such data distributions, the linear separating functions used by perceptrons, Adalines, or BPNs are not optimal.

Input Space Clusters

Example:

[Figure: data points in the (x_1, x_2) plane; class 1 samples lie inside a circle (Circle 1), class −1 samples lie outside it. Approximating the boundary with straight lines requires four of them (Lines 1-4), whereas the single circle suffices.]

A network with linearly separating functions would require four neurons plus one higher-level neuron. On the other hand, a single neuron with a local, circular "receptive field" would suffice.

Radial Basis Functions (RBFs)

To achieve such local "receptive fields," we can use radial basis functions, i.e., functions whose output depends only on the Euclidean distance D between the input vector and another ("weight") vector. A typical choice is a Gaussian function:

φ(D) = exp(−D²/c²),

where c determines the "width" of the Gaussian. However, any radially symmetric, non-increasing function could be used.
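
A one-line numpy version of such a unit, as a sketch (the name gaussian_rbf and the exact width convention are assumptions; any radially symmetric, non-increasing function of D would serve):

    import numpy as np

    def gaussian_rbf(x, center, c=1.0):
        """Gaussian radial basis unit: output depends only on D = ||x - center||."""
        D = np.linalg.norm(x - center)
        return np.exp(-(D / c) ** 2)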

Linear Interpolation: 1-Dimensional Case

For function approximation, the desired output for new (untrained) inputs could be estimated by linear interpolation. As a simple example, how do we determine the desired output of a one-dimensional function at a new input x_0 that is located between known data points x_1 and x_2? With distances D_1 and D_2 from x_0 to x_1 and x_2, respectively, linear interpolation gives

f(x_0) = f(x_1) + (f(x_2) − f(x_1)) · D_1/(D_1 + D_2),

which simplifies to:

f(x_0) = (f(x_1)/D_1 + f(x_2)/D_2) / (1/D_1 + 1/D_2).
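
A quick numeric check of the inverse-distance form against ordinary linear interpolation (the sample values below are made up for illustration):

    import numpy as np

    x1, x2, f1, f2 = 1.0, 3.0, 2.0, 8.0
    x0 = 1.5
    D1, D2 = abs(x0 - x1), abs(x0 - x2)
    f0 = (f1 / D1 + f2 / D2) / (1 / D1 + 1 / D2)
    print(f0)                                   # 3.5
    print(np.interp(x0, [x1, x2], [f1, f2]))    # 3.5, the same value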

Linear Interpolation: Multiple Dimensions

In the multi-dimensional case, hyperplane segments connect neighboring points, so the desired output for a new input x_0 is determined by the P_0 known samples that surround it:

f(x_0) = ( Σ_{p=1..P_0} f(x_p)/D_p ) / ( Σ_{p=1..P_0} 1/D_p ),

where D_p is the Euclidean distance between x_0 and x_p, and f(x_p) is the desired output value for input x_p.
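
A sketch of this inverse-distance estimate over the P_0 surrounding samples (the function name interpolate is illustrative):

    import numpy as np

    def interpolate(x0, X, f):
        """Inverse-distance estimate of f(x0) from surrounding samples.

        x0: (d,) query point; X: (P0, d) surrounding inputs; f: (P0,) their desired outputs.
        Assumes x0 does not coincide with a sample, so all D_p > 0.
        """
        D = np.linalg.norm(X - x0, axis=1)    # Euclidean distances D_p
        return (f / D).sum() / (1.0 / D).sum()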

Linear Interpolation: Multiple Dimensions

Example for f: R² → R¹ (with desired outputs indicated):

[Figure: eight sample points in the plane with desired outputs x_1: 9, x_2: 5, x_3: 4, x_4: −6, x_5: 8, x_6: 7, x_7: 6, x_8: −9, and a query point x_0 with unknown output; the distances D_2, D_3, D_6, D_7 to its four nearest neighbors x_2, x_3, x_6, and x_7 are marked.]

For the four nearest neighbors, the desired output for x_0 is

f(x_0) = (5/D_2 + 4/D_3 + 7/D_6 + 6/D_7) / (1/D_2 + 1/D_3 + 1/D_6 + 1/D_7).
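
Using the interpolate sketch above on this example (only the outputs 5, 4, 7, and 6 come from the slide; the point coordinates below are invented for illustration):

    import numpy as np

    # four nearest neighbors with the slide's outputs at made-up positions
    X_near = np.array([[0.0, 1.0], [1.0, 0.0], [1.0, 2.0], [2.0, 1.0]])
    f_near = np.array([5.0, 4.0, 7.0, 6.0])
    x0 = np.array([1.0, 1.0])
    print(interpolate(x0, X_near, f_near))   # 5.5: all four distances are equal, so this is the plain average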