Neural Networks 2nd Edition Simon Haykin

1 Neural Networks 2nd Edition Simon Haykin
柯博昌 Chap 2. Learning Processes

2 Learning vs. Neural Network
Learning in the context of a neural network implies the following sequence of events:
1. The neural network is stimulated by an environment.
2. The neural network undergoes changes in its free parameters as a result of this stimulation.
3. The neural network responds in a new way to the environment because of the changes that have occurred in its internal structure.

3 Error-Correction Learning
[Figure: multilayer feedforward network: the input vector x(n) passes through one or more layers of hidden neurons to output neuron k, which produces y_k(n); the desired response (target output) d_k(n) is compared with y_k(n) to form the error signal.]
Error signal: e_k(n) = d_k(n) - y_k(n)
Objective: minimize a cost function or index of performance based on the error signal, e.g. the instantaneous error energy E(n) = (1/2) e_k^2(n). The step-by-step weight adjustments are continued until the system reaches a steady state.

4 Delta Rule (Widrow-Hoff Rule)
Let w_kj(n) denote the value of synaptic weight w_kj of neuron k excited by element x_j(n) of the signal vector x(n) at time step n.
Delta rule: the adjustment applied to the synaptic weight at time step n is
  Δw_kj(n) = η e_k(n) x_j(n)
and the updated weight is
  w_kj(n+1) = w_kj(n) + Δw_kj(n)
η: a positive constant determining the rate of learning as we proceed from one step in the learning process to another (the learning-rate parameter).
In effect, w_kj(n) and w_kj(n+1) may be viewed as the old and new values of synaptic weight w_kj, respectively.
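
As a concrete illustration, here is a minimal sketch of one delta-rule update for a single linear output neuron; the array names, learning rate, and sample values are illustrative assumptions, not taken from the book.

```python
import numpy as np

# Minimal sketch of error-correction learning with the delta rule
# (illustrative names and values).
eta = 0.1                          # learning-rate parameter (eta)
w_k = np.zeros(3)                  # synaptic weights of output neuron k
x_n = np.array([0.5, -0.2, 1.0])   # input signal vector x(n)
d_k = 0.8                          # desired response d_k(n)

y_k = np.dot(w_k, x_n)             # actual output y_k(n) of a linear neuron
e_k = d_k - y_k                    # error signal e_k(n) = d_k(n) - y_k(n)
delta_w = eta * e_k * x_n          # delta rule: eta * e_k(n) * x_j(n)
w_k = w_k + delta_w                # w_kj(n+1) = w_kj(n) + delta_w_kj(n)
```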

5 Memory-based Learning
Def: All (or most) of the past experiences are explicitly stored in a large memory of correctly classified input-output examples. Without loss of generality, the desired response is restricted to be a scalar.
Ex: a binary pattern classification problem. Assume there are two classes/hypotheses denoted C1 and C2; the desired response takes the value d_i = 0 (or -1) for class C1 and d_i = 1 for class C2.
Memory-based learning involves two steps: retrieving the training data in a "local neighborhood" of the test vector x_test, and classifying x_test on the basis of the retrieved examples.
The nearest neighbor rule is a simple yet effective type of memory-based learning: x_test is assigned to the class of the stored pattern nearest to it.
The k-nearest neighbor classifier identifies the k stored patterns lying nearest to x_test and uses a majority vote to make the classification.
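
To make the neighborhood idea concrete, here is a minimal k-nearest-neighbor sketch; the function name, the Euclidean distance choice, and the toy data are assumptions for illustration.

```python
import numpy as np

# Minimal k-nearest-neighbor classifier sketch (illustrative only).
def knn_classify(x_test, X_train, d_train, k=3):
    """Majority vote among the k stored patterns nearest to x_test."""
    dists = np.linalg.norm(X_train - x_test, axis=1)   # Euclidean distances
    nearest = np.argsort(dists)[:k]                    # indices of the k nearest patterns
    votes = d_train[nearest]
    labels, counts = np.unique(votes, return_counts=True)
    return labels[np.argmax(counts)]                   # most common class among neighbors

# Toy usage: class C1 labeled 0, class C2 labeled 1.
X_train = np.array([[0.0, 0.0], [0.1, 0.2], [1.0, 1.0], [0.9, 1.1]])
d_train = np.array([0, 0, 1, 1])
print(knn_classify(np.array([0.2, 0.1]), X_train, d_train, k=3))  # -> 0 (class C1)
```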

6 Hebbian Learning
Hebb's postulate of learning is the oldest and most famous of all learning rules: when an axon of cell A is near enough to excite a cell B and repeatedly or persistently takes part in firing it, some growth process or metabolic change takes place in one or both cells (A, B).
Hebb's rule was later expanded into a two-part statement:
1. If two neurons on either side of a synapse are activated simultaneously, then the strength of that synapse is selectively increased.
2. If the two neurons are activated asynchronously, then that synapse is selectively weakened or eliminated.
Such a synapse is called a Hebbian synapse.
Characteristics of a Hebbian synapse:
Time-dependent mechanism
Local mechanism
Interactive mechanism
Conjunctional or correlational mechanism

7 Mathematical Models of Hebbian Modifications
Let w_kj denote a synaptic weight of neuron k with presynaptic and postsynaptic signals denoted by x_j and y_k, respectively.
Hebb's hypothesis (the simplest form, also known as the activity product rule):
  Δw_kj(n) = η y_k(n) x_j(n), where η is the rate of learning (a positive constant).
Covariance hypothesis:
  Δw_kj(n) = η (x_j - x̄)(y_k - ȳ), where x̄ and ȳ denote time-averaged values of the presynaptic and postsynaptic signals.
[Figure: Δw_kj plotted against postsynaptic activity y_k. Hebb's hypothesis gives a straight line through the origin with slope x_j; the covariance hypothesis gives a line whose balance point (zero weight change) is at y_k = ȳ, with a maximum depression point at lower postsynaptic activity.]
Limitation of Hebb's hypothesis: the repeated application of the input signal x_j leads to an increase in y_k and hence exponential growth that finally drives the synaptic connection into saturation. At that point no information will be stored in the synapse and selectivity is lost.
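
The two update rules above can be written directly in code; the following sketch uses illustrative signal values and assumes the time averages x_bar and y_bar have already been computed.

```python
import numpy as np

# Sketch of the two Hebbian update rules (illustrative values).
eta = 0.01               # learning rate
x_j, y_k = 0.6, 0.9      # presynaptic and postsynaptic signals at time n
x_bar, y_bar = 0.5, 0.4  # assumed time-averaged presynaptic/postsynaptic values

# Hebb's hypothesis (activity product rule): grows without bound under
# repeated application, which is the saturation problem noted above.
dw_hebb = eta * y_k * x_j

# Covariance hypothesis: the weight change depends on deviations from the
# time averages, so it can be negative (depression) as well as positive.
dw_cov = eta * (x_j - x_bar) * (y_k - y_bar)

print(dw_hebb, dw_cov)
```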

8 Competitive Learning
Characteristics:
The output neurons of the network compete among themselves to become active; only a single output neuron is active at any one time. The neuron that wins the competition is called a winner-takes-all neuron.
[Figure: architectural graph of a simple competitive learning network.]
For neuron k to be the winning neuron, its induced local field v_k must be the largest among all the neurons in the network for the specified input pattern x.
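
Below is a minimal winner-takes-all sketch using the standard competitive learning rule Δw_kj = η(x_j - w_kj), applied to the winning neuron only; the network size, learning rate, and input values are illustrative assumptions.

```python
import numpy as np

# Sketch of one winner-takes-all competitive update (illustrative values).
eta = 0.1
W = np.random.rand(4, 3)         # 4 output neurons, 3 input nodes
x = np.array([0.2, 0.7, 0.1])    # input pattern x

v = W @ x                        # induced local fields v_k of all output neurons
k = np.argmax(v)                 # winning neuron: largest induced local field
W[k] += eta * (x - W[k])         # move only the winner's weight vector toward x
```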

9 Boltzmann Learning
A stochastic learning algorithm derived from ideas rooted in statistical mechanics. A neural network designed on the basis of the Boltzmann learning rule is called a Boltzmann machine.
The neurons constitute a recurrent structure and operate in a binary manner (e.g., states +1 and -1).
Energy function: E = -(1/2) Σ_j Σ_k w_kj x_k x_j with j ≠ k, where x_j is the state of neuron j; the condition j ≠ k means that none of the neurons has self-feedback.
The machine operates by choosing a neuron at random and flipping its state with a probability that depends on the resulting change in energy and the operating temperature.
(A brief review of statistical mechanics is presented in Chapter 11.)
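
As a small illustration of the energy function above, the sketch below evaluates E for a random binary state; the symmetric random weights and network size are assumptions made purely for the example.

```python
import numpy as np

# Sketch: evaluating the Boltzmann machine energy
# E = -(1/2) * sum_{j != k} w_kj * x_k * x_j for a given binary state.
rng = np.random.default_rng(0)
n = 5
W = rng.normal(size=(n, n))
W = (W + W.T) / 2                  # symmetric weights
np.fill_diagonal(W, 0.0)           # j != k: no self-feedback
x = rng.choice([-1, +1], size=n)   # binary neuron states

E = -0.5 * x @ W @ x               # energy of the current state
print(E)
```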

10 Credit-Assignment Problem
The problem of assigning credit or blame for overall outcomes to each of the internal decisions made by the learning machine. It arises, for example, when error-correction learning is applied to a multilayer feedforward neural network, where the output error must be apportioned among the hidden neurons. (Presented in Chapter 4)

11 Learning with a Teacher (Supervised Learning)

12 Learning without a Teacher
Reinforcement learning: the learning of an input-output mapping is performed through continued interaction with the environment in order to minimize a scalar index of performance.
Unsupervised learning: there is no teacher or critic to oversee the learning process. Ex: competitive learning.
[Block diagram: the learning system interacts directly with the environment, with no teacher in the loop.]

13 Learning Tasks Pattern Association
Autoassociation: a neural network stores a set of patterns by repeatedly presenting them to the network; the network is subsequently presented with a partial description or distorted version of an original pattern, and the task is to retrieve (recall) that pattern.
Heteroassociation: an arbitrary set of input patterns is paired with another arbitrary set of output patterns.
x_k: key pattern; y_k: memorized pattern.
Pattern association: x_k → y_k, k = 1, 2, ..., q, where q is the number of patterns stored in the network.
In autoassociation, y_k = x_k; in heteroassociation, y_k ≠ x_k.

14 Learning Tasks Pattern Recognition
Def: a received pattern/signal is assigned to one of a prescribed number of classes.
[Figure: in form (A), an unsupervised network for feature extraction maps the input pattern x to a feature vector y, which a supervised network for classification then assigns to one of r classes (outputs 1, 2, ..., r); form (B) is an alternative arrangement of the pattern-recognition machine.]

15 Learning Tasks Function Approximation
Nonlinear input-output mapping: d = f(x), where x is the input vector, d is the output vector, and the function f(·) is assumed to be unknown.
Given a set of labeled examples {(x_i, d_i)}, i = 1, 2, ..., N.
Requirement: design a neural network that approximates the unknown function f(·) by a mapping F(·) such that ||F(x) - f(x)|| < ε for all x, where ε is a small positive number.
Applications: system identification, inverse system modeling.
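
The ε criterion can be checked numerically; in the toy sketch below, f is an assumed example function and a simple polynomial fit stands in for the mapping F(·) realized by a network, so everything apart from the ||F(x) - f(x)|| < ε test itself is an illustrative assumption.

```python
import numpy as np

# Toy sketch of the approximation requirement ||F(x) - f(x)|| < eps.
f = np.sin                                   # "unknown" function (assumed for illustration)
x_i = np.linspace(-np.pi, np.pi, 25)         # labeled examples (x_i, d_i)
d_i = f(x_i)

F = np.poly1d(np.polyfit(x_i, d_i, deg=7))   # approximating mapping F(.) (stand-in for a network)

x_grid = np.linspace(-np.pi, np.pi, 1000)
eps = 1e-2
max_err = np.max(np.abs(F(x_grid) - f(x_grid)))
print(max_err, max_err < eps)                # check the eps criterion on a dense grid
```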

16 Learning Tasks Control
Goal: supply appropriate inputs to the plant so that its output y tracks the reference signal d.
The error-correction algorithm needs the Jacobian matrix of the plant, whose elements are the partial derivatives ∂y_k/∂u_j of the plant outputs with respect to the plant inputs.
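
When no analytic plant model is available, one common way to obtain these partial derivatives is a finite-difference estimate, sketched below; the plant() function here is a made-up stand-in, not a model from the book.

```python
import numpy as np

# Sketch: estimating the plant Jacobian d y / d u by finite differences.
def plant(u):
    """Hypothetical plant: maps input vector u to output vector y."""
    return np.array([u[0] ** 2 + u[1], np.sin(u[1])])

def jacobian(u, h=1e-6):
    y0 = plant(u)
    J = np.zeros((y0.size, u.size))
    for j in range(u.size):
        du = np.zeros_like(u)
        du[j] = h
        J[:, j] = (plant(u + du) - y0) / h   # column j: d y / d u_j
    return J

print(jacobian(np.array([1.0, 0.5])))
```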

17 Learning Tasks Filtering and Beamforming
Filtering: extract information of interest from a set of noisy data. Ex: the cocktail party problem.
Beamforming: a spatial form of filtering, used to distinguish between the spatial properties of a target signal and background noise. Ex: echo-locating bats.

18 Memory
[Figure: signal-flow graph model of a linear neuron labeled i.]
W(k) is a weight matrix determined by the input-output pair (x_k, y_k).
The memory matrix M, obtained by summing the weight matrices over all stored pairs, M = Σ_{k=1}^q W(k), defines the overall connectivity between the input and output layers.

19 Correlation Matrix Memory
In a correlation matrix memory the estimate of the memory matrix is the sum of outer products of the memorized and key patterns:
  M = Σ_{k=1}^q y_k x_k^T
Equivalently, in recursion form:
  M_k = M_{k-1} + y_k x_k^T, k = 1, 2, ..., q, with M_0 = 0.

20 Recall
x_j: a key pattern selected at random from the stored set; y: the response yielded by the memory, y = M x_j.
Let each of the key patterns x_1, x_2, ..., x_q be normalized to have unit energy (x_k^T x_k = 1 for all k). Then
  y = M x_j = Σ_{k=1}^q y_k (x_k^T x_j) = y_j + v_j
where y_j is the desired response and v_j = Σ_{k≠j} (x_k^T x_j) y_k is the error vector due to crosstalk between the key patterns.
If these key vectors are orthogonal, then x_k^T x_j = 0 for k ≠ j, which means v_j = 0 and the recall is perfect.
But, in practice, the key patterns presented to an associative memory are neither orthogonal nor highly separated from each other, so the error vector is generally nonzero.
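
The store-and-recall behaviour of slides 19 and 20 can be demonstrated in a few lines; in the sketch below the orthonormal keys and random memorized patterns are illustrative toy data, and replacing the keys with non-orthogonal vectors would make the crosstalk term v_j appear.

```python
import numpy as np

# Sketch of a correlation matrix memory: store q key/memorized pattern pairs
# as a sum of outer products, then recall with one of the keys.
rng = np.random.default_rng(1)
q, n, m = 3, 8, 4
X = np.linalg.qr(rng.normal(size=(n, q)))[0].T   # q orthonormal key patterns (rows)
Y = rng.normal(size=(q, m))                      # q memorized patterns (rows)

M = sum(np.outer(Y[k], X[k]) for k in range(q))  # M = sum_k y_k x_k^T

y = M @ X[0]                                     # recall with the first key
print(np.allclose(y, Y[0]))                      # perfect recall: the keys are orthonormal
```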

21 Adaptation
If the operating environment is stationary (its statistical characteristics do not change with time), the essential statistics of the environment can, in theory, be learned under the supervision of a teacher.
In general, however, the environment of interest is nonstationary, so the neural network must continuously adapt its free parameters to variations in the incoming signals in a real-time fashion (continuous learning, or learning-on-the-fly).
Pseudostationary: the statistical characteristics of a nonstationary process change slowly enough over a window of short enough duration.
Dynamic approach to learning (see the sketch below):
1. Select a window short enough for the process to be treated as pseudostationary.
2. When a new data sample is received, update the window by dropping the oldest data and shifting the remaining data by one time unit.
3. Use the updated data window to retrain the network.
4. Repeat the procedure on a continuing basis.
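
A minimal sliding-window skeleton for this procedure is shown below; retrain() is a hypothetical placeholder for whatever training routine the network actually uses, and window_size is an assumed tuning parameter.

```python
from collections import deque

# Sketch of the sliding-window ("dynamic") approach to learning described above.
window_size = 100
window = deque(maxlen=window_size)   # keeps only the most recent samples

def retrain(network, samples):
    """Placeholder: retrain the network on the current data window."""
    pass

def on_new_sample(network, x, d):
    window.append((x, d))            # oldest sample is dropped automatically
    retrain(network, list(window))   # retrain on the updated window
```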

