
1 Fast Learning in Networks of Locally-Tuned Processing Units John Moody and Christian J. Darken Yale Computer Science Neural Computation 1, 281-294 (1989)

2 Network Architecture Responses of neurons are “locally-tuned” or “selective” for some part of the input space. The network contains a single hidden layer of these locally-tuned neurons. The hidden-layer outputs are fed to a layer of linear neurons, giving the network output. For mathematical simplicity, we’ll assume only one neuron in the linear output layer.

3 Network Architecture (2)

4 Biological Plausibility Cochlear hair cells (via their stereocilia) in the human ear exhibit locally-tuned responses to frequency. Cells in visual cortex respond selectively to stimulation that is both local in retinal position and local in angle of orientation. Prof. Wang showed locally-tuned responses to motion of particular speeds and orientations.

5 Mathematical Definitions A network of M locally-tuned units has the overall response function f(x) = Σᵢ wᵢ Rᵢ(x), where the sum runs over the M units. Here, x is a real-valued vector in the input space, Rᵢ is the response function of the i-th locally-tuned unit, and R is a radially-symmetric function with a single maximum at its center which drops to zero at large radii.

6 Mathematical Definitions (2) cᵢ and σᵢ are the center and width in the input space of the i-th unit, and wᵢ is the weight or amplitude of the unit. A simple choice for R is the unit-normalized Gaussian: Rᵢ(x) = exp(−‖x − cᵢ‖² / σᵢ²).
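As a concrete illustration (not the authors' code), here is a minimal sketch of this forward pass in Python/NumPy, assuming Gaussian units and a single linear output neuron; the function and variable names (rbf_output, centers, widths, weights) are our own.

```python
import numpy as np

def rbf_output(x, centers, widths, weights):
    """Overall response f(x) = sum_i w_i * exp(-||x - c_i||^2 / sigma_i^2).

    Illustrative sketch of the locally-tuned network with Gaussian units;
    shapes: x (n,), centers (M, n), widths (M,), weights (M,)."""
    dists_sq = np.sum((centers - x) ** 2, axis=1)   # ||x - c_i||^2 for each unit
    responses = np.exp(-dists_sq / widths ** 2)     # unit-normalized Gaussian responses
    return np.dot(weights, responses)               # single linear output neuron
```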

7 Possible Training Methods Fully supervised training to find neuron centers, widths, and amplitudes. –Uses the error gradient found by varying all parameters (no restrictions on the parameters); see the gradient expressions sketched below. –In particular, widths can grow large, thereby losing the local nature of the neurons. –Compared with backpropagation, it achieves lower error, but like BP it is very slow to train.
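For concreteness, here is a sketch of the gradients such a fully supervised scheme would descend, assuming the Gaussian unit above and squared error on a single example (our notation, not taken from the paper):

```latex
E = \tfrac{1}{2}\,\bigl(y - f(\mathbf{x})\bigr)^2, \qquad
R_i(\mathbf{x}) = \exp\!\left(-\frac{\lVert \mathbf{x}-\mathbf{c}_i\rVert^2}{\sigma_i^2}\right)

\frac{\partial E}{\partial w_i} = -\bigl(y - f(\mathbf{x})\bigr)\,R_i(\mathbf{x}), \qquad
\frac{\partial E}{\partial \mathbf{c}_i} = -\bigl(y - f(\mathbf{x})\bigr)\,w_i R_i(\mathbf{x})\,\frac{2(\mathbf{x}-\mathbf{c}_i)}{\sigma_i^2}, \qquad
\frac{\partial E}{\partial \sigma_i} = -\bigl(y - f(\mathbf{x})\bigr)\,w_i R_i(\mathbf{x})\,\frac{2\lVert \mathbf{x}-\mathbf{c}_i\rVert^2}{\sigma_i^3}
```

The σᵢ gradient is what allows widths to grow without bound, which is how the units can lose their local character.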

8 Possible Training Methods (2) Combination of supervised and unsupervised learning, a better choice? –Neuron centers and widths are determined through unsupervised learning. –Weights or amplitudes for hidden layer outputs are determined through supervised training.

9 Unsupervised Learning Determination of neuron centers, how? k-means clustering –Find a set of k neuron centers which represents a local minimum of the total squared Euclidean distance between the training vectors and their nearest neuron centers (a sketch follows below). Learning Vector Quantization (LVQ)
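A minimal sketch of batch k-means in Python/NumPy, assuming centers are initialized from randomly chosen training vectors; the function name and parameters are illustrative, and the slide's alternative (LVQ) is not shown.

```python
import numpy as np

def kmeans_centers(X, k, n_iters=100, seed=0):
    """Batch k-means: returns k centers at a local minimum of the total
    squared Euclidean distance from each training vector to its nearest
    center. X has shape (N, n)."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]  # init from data points
    for _ in range(n_iters):
        # assign each training vector to its nearest center
        labels = np.argmin(((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1), axis=1)
        # move each center to the mean of its assigned vectors
        new_centers = np.array([X[labels == j].mean(axis=0) if np.any(labels == j)
                                else centers[j] for j in range(k)])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return centers
```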

10 Unsupervised Learning (2) Determination of neuron widths, how? P nearest-neighbor heuristics –Vary widths to achieve a certain amount of response overlap between each neuron and its P nearest neighbors. Global first nearest-neighbor, P = 1 –Uses the global average distance between each neuron and its nearest neighbor as the net’s uniform width.
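A sketch of the global first nearest-neighbor heuristic (P = 1) in the same style; the helper name global_first_nn_width is our own, not from the paper.

```python
import numpy as np

def global_first_nn_width(centers):
    """Global first nearest-neighbor heuristic (P = 1): use the average
    distance between each center and its nearest neighboring center as a
    single uniform width for all units."""
    d = np.sqrt(((centers[:, None, :] - centers[None, :, :]) ** 2).sum(-1))
    np.fill_diagonal(d, np.inf)          # ignore each center's distance to itself
    return d.min(axis=1).mean()          # mean nearest-neighbor distance
```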

11 Supervised Learning Determination of weights, how? Simple case of 1 linear output –Use the Widrow-Hoff (LMS) learning rule. For a layer of linear outputs? –Simply use the gradient-descent learning rule. Either way, the problem reduces to linear optimization.
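A sketch of Widrow-Hoff (LMS) training of the output weights for the single-output case, assuming Gaussian hidden units with a uniform width; the learning rate and epoch count are hypothetical defaults. Because centers and widths are frozen, the error is quadratic in the weights, which is why this step is a linear optimization.

```python
import numpy as np

def train_weights_lms(X, y, centers, sigma, lr=0.05, n_epochs=50):
    """Widrow-Hoff (LMS / delta rule) training of the output weights only;
    centers and the uniform width sigma are fixed beforehand."""
    w = np.zeros(len(centers))
    for _ in range(n_epochs):
        for x, target in zip(X, y):
            r = np.exp(-((centers - x) ** 2).sum(axis=1) / sigma ** 2)  # hidden responses
            w += lr * (target - np.dot(w, r)) * r                       # delta rule update
    return w
```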

12 Advantages Over Backprop Training via a combination of self-organizing (unsupervised) and linear supervised techniques is much faster than backprop. For a given input, only a small fraction of the neurons (those with nearby centers) give significantly non-zero responses, so we don’t need to evaluate all neurons to get the overall output. This improves performance (see the sketch below).
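To illustrate the locality argument, here is a sketch that evaluates only the units whose centers fall within a few widths of the input; the cutoff value is an arbitrary illustrative choice, not something specified in the paper.

```python
import numpy as np

def rbf_output_pruned(x, centers, widths, weights, cutoff=3.0):
    """Evaluate only the units whose centers lie within `cutoff` widths of x;
    more distant Gaussian units contribute essentially zero to the output."""
    dists_sq = ((centers - x) ** 2).sum(axis=1)
    active = dists_sq < (cutoff * widths) ** 2          # nearby units only
    responses = np.exp(-dists_sq[active] / widths[active] ** 2)
    return np.dot(weights[active], responses)
```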

13 Advantages Over Backprop (2) Based on well-developed mathematical theory (kernel theory), yielding statistical robustness. Computational simplicity, since only one layer is involved in supervised training. Provides a guaranteed, globally optimal solution for the output weights via simple linear optimization.

14 Project Proposal Currently debugging a C++ RBF network with n-dimensional input and 1 linear output neuron. –Uses k-means clustering, the global first nearest-neighbor heuristic, and gradient descent. –Experimentation with different training algorithms. Try to reproduce results for RBF neural nets performing face recognition.

