ISMP Lab New-Student Training Course: Artificial Neural Networks


1 ISMP Lab New-Student Training Course: Artificial Neural Networks
National Cheng Kung University / Walsin Lihwa Corp., Center for Research of E-life Digital Technology. ISMP Lab New-Student Training Course: Artificial Neural Networks. Advisor: Prof. 郭耀煌. Master's student: 黃盛裕 (Class of 96). 2008/7/18

2 Outline
Introduction
Single Layer Perceptron – Perceptron Example
Single Layer Perceptron – Adaline
Multilayer Perceptron – Back-propagation neural network
Competitive Learning – Example
Radial Basis Function (RBF) Networks
Q&A and Homework

3 Artificial Neural Networks (ANN)
Simulate the human brain. Approximate any nonlinear and complex function with accuracy. (Fig.1, Fig.2)

4 Neural Networks vs. Computer
Table 1
                         Human brain              Computer
processing elements      10^14 synapses           10^8 transistors
element size             10^-6 m                  10^-6 m
energy use               30 W                     30 W (CPU)
processing speed         100 Hz                   10^9 Hz
style of computation     parallel, distributed    serial, centralized
fault tolerant           yes                      no
learns                   yes                      a little
intelligent, conscious   usually                  not (yet)

5 Biological neural networks
Fig.3

6 Biological neural networks
About 10^11 neurons in the human brain
About 10^14–10^15 interconnections
Pulse-transmission frequency is roughly a million times slower than that of electronic circuits
Yet face recognition takes a human only on the order of a hundred milliseconds
A network of artificial neurons can perform an operation in only a few millionths of a second

7 Applications of ANN
(Fig.4: application domains surrounding Neural Networks: Pattern Recognition, Prediction, Economics, Optimization, VLSI, Control, Power & Energy, AI, Bioinformatics, Communication, Signal Processing, Image Processing.)
Successful applications can be found in well-constrained environments; none is flexible enough to perform well outside its domain.

8 Challenging Problems
Pattern classification
Clustering/categorization
Function approximation
Prediction/forecasting
Optimization (e.g., the traveling salesman problem)
Retrieval by content
Control
(Fig.5)

9 Brief historical review
Three periods of extensive activity:
1940s: McCulloch and Pitts' pioneering work
1960s: Rosenblatt's perceptron convergence theorem; Minsky and Papert's work showing the limitations of the simple perceptron
1980s: Hopfield's energy approach (1982); Werbos' back-propagation learning algorithm

10 Neuron vs. Artificial Neuron
McCulloch and Pitts proposed the MP neuron model in 1943; Hebb proposed his learning rule in 1949. (Fig.6, Fig.7)

11 Outline
Introduction
Single Layer Perceptron – Perceptron Example
Single Layer Perceptron – Adaline
Multilayer Perceptron – Back-propagation neural network
Competitive Learning – Example
Radial Basis Function (RBF) Networks
Q&A and Homework

12 Elements of an Artificial Neuron
Inputs x1 … xn, weights (synapses) w1j … wnj, bias θj, a summation function, a transfer function, and output Yj.
(Fig.8: the McCulloch-Pitts model, 1943.)

13 Summation function
An adder sums the input signals, weighted by the respective synapses of the neuron. Two common forms are a weighted summation, net_j = Σi wij·xi − θj, and a Euclidean distance, net_j = ||x − wj||.
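A minimal sketch of the two summation forms in Python/NumPy; the function names are illustrative, and the "− θ" bias convention is taken from the perceptron example later in these slides:

```python
import numpy as np

def weighted_sum(x, w, theta):
    """Inner-product summation: net_j = sum_i w_ij * x_i - theta_j."""
    return np.dot(w, x) - theta

def euclidean_distance(x, w):
    """Distance-based summation: net_j = ||x - w_j|| (used by RBF and competitive units)."""
    return np.linalg.norm(x - w)

x = np.array([1.0, -1.0])
print(weighted_sum(x, np.array([1.0, -1.0]), 0.5))   # 1.5
print(euclidean_distance(x, np.array([0.0, 0.0])))   # 1.414...
```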

14 Transfer functions
An activation function limits the amplitude of the output of a neuron.
Threshold (step) function: Yj = 1 if netj ≥ 0, otherwise 0.
Piecewise-linear function: Yj rises linearly from 0 to 1 between netj = -0.5 and netj = 0.5 and saturates outside that range.

15 Transfer functions
Sigmoid function: Yj = 1 / (1 + exp(-a·netj)), where a is the slope parameter of the sigmoid function.
Radial basis function: Yj = exp(-netj² / (2a)), where a is the variance parameter of the radial basis function.
(Figures plot Yj against netj.)
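A small sketch of these transfer functions in Python/NumPy; the exact parameterizations (the [-0.5, 0.5] linear region and the Gaussian written with variance parameter a) are assumptions reconstructed from the figure axes rather than formulas quoted from the slides:

```python
import numpy as np

def threshold(net):
    """Step function: 1 if net >= 0, else 0."""
    return np.where(net >= 0, 1.0, 0.0)

def piecewise_linear(net):
    """Linear between -0.5 and 0.5, saturating at 0 and 1."""
    return np.clip(net + 0.5, 0.0, 1.0)

def sigmoid(net, a=1.0):
    """Logistic sigmoid with slope parameter a."""
    return 1.0 / (1.0 + np.exp(-a * net))

def gaussian_rbf(net, a=1.0):
    """Gaussian radial basis function with variance parameter a."""
    return np.exp(-net**2 / (2.0 * a))

net = np.linspace(-2, 2, 5)
print(threshold(net), piecewise_linear(net), sigmoid(net), gaussian_rbf(net), sep="\n")
```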

16 Network architectures
Fig.9 A taxonomy of feed-forward and recurrent/feedback network architectures.

17 Network architectures
Feed-forward networks: static (they produce only one set of output values for a given input) and memory-less (the output is independent of the previous network state).
Recurrent (or feedback) networks: dynamic systems.
Different architectures require different, appropriate learning algorithms.

18 Learning process
The ability to learn is a fundamental trait of intelligence. ANNs automatically learn from examples instead of following a set of rules specified by human experts; they appear to learn the underlying rules. This is a major advantage over traditional expert systems.

19 Learning process
Designing a learning process requires a model of the environment and an understanding of how the network weights are updated.
Three main learning paradigms: supervised, unsupervised, and hybrid.

20 Learning process
Three fundamental and practical issues of learning theory:
Capacity: how many patterns can be stored, and what functions and decision boundaries the network can form.
Sample complexity: the number of training samples needed to avoid over-fitting.
Computational complexity: the time required to learn (many learning algorithms have high complexity).

21 Learning process
Three basic types of learning rules:
Error-correction rules
Hebbian rule: if neurons on both sides of a synapse are activated synchronously and repeatedly, the synapse's strength is selectively increased.
Competitive learning rules

22 Table 2 Well-known learning algorithms.

23 Error-Correction Rules
Fig.10 The threshold function: if v > 0, then y = +1; otherwise y = 0.

24 Learning mode
On-line (sequential) mode: update the weights after each training example; more accurate; requires more computational time; faster learning convergence.
Off-line (batch) mode: update the weights only after all training data have been applied; less accurate; requires less computational time; requires extra storage.
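A minimal sketch contrasting the two update schedules for a single linear unit trained with a delta-style rule; the toy data, learning rate, and function names are illustrative assumptions:

```python
import numpy as np

def online_epoch(w, data, eta=0.1):
    """On-line (sequential) mode: update w after every (x, t) pair."""
    for x, t in data:
        y = np.dot(w, x)
        w = w + eta * (t - y) * x      # immediate correction
    return w

def batch_epoch(w, data, eta=0.1):
    """Off-line (batch) mode: accumulate corrections, apply once per epoch."""
    delta = np.zeros_like(w)
    for x, t in data:
        y = np.dot(w, x)
        delta += eta * (t - y) * x     # stored until the epoch ends
    return w + delta

data = [(np.array([1.0, 1.0]), 1.0), (np.array([1.0, -1.0]), 0.0)]
w0 = np.zeros(2)
print(online_epoch(w0, data), batch_epoch(w0, data))
```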

25 Error-Correction Rules
However, a single-layer perceptron can only separate linearly separable patterns, as long as a monotonic activation function is used. The back-propagation learning algorithm is based on the error-correction principle.

26 Preprocessing for neural networks
Input values are mapped into [-1, 1]; output values are mapped into [0, 1].
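A small sketch of this kind of min-max scaling in Python/NumPy; the column-wise mapping and the function name are assumptions, not something specified on the slide:

```python
import numpy as np

def scale(data, lo, hi):
    """Linearly map each column of `data` into the interval [lo, hi]."""
    dmin, dmax = data.min(axis=0), data.max(axis=0)
    return lo + (hi - lo) * (data - dmin) / (dmax - dmin)

X = np.array([[1.0, 200.0], [3.0, 400.0], [2.0, 250.0]])
T = np.array([[10.0], [30.0], [20.0]])
X_scaled = scale(X, -1.0, 1.0)   # inputs into [-1, 1]
T_scaled = scale(T, 0.0, 1.0)    # outputs into [0, 1]
print(X_scaled, T_scaled, sep="\n")
```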

27 Perceptron
Proposed in 1957, a single-layer Perceptron network consists of one or more artificial neurons in parallel. Each neuron in the single layer provides one network output and is usually connected to all of the external (or environmental) inputs. Supervised learning; essentially the MP neuron model plus Hebb learning. (Fig.11)

28 Perceptron Learning Algorithm
Output: Yj = 1 if netj = Σi Wij·Xi − θj > 0, otherwise 0.
Adjust weight & bias: ΔWij = η·δ·Xi and Δθj = −η·δ, with error δ = Tj − Yj.
Energy function: E = (1/2) Σj (Tj − Yj)².
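A minimal sketch of this perceptron rule in Python, applied to the AND problem used in the worked example that follows; the bipolar inputs, η = 0.1, and θ = 0.5 mirror that example, while the per-pattern (on-line) update order is an assumption:

```python
import numpy as np

# AND problem with bipolar inputs and 0/1 targets, as in the worked example.
patterns = [(np.array([-1.0, -1.0]), 0.0),
            (np.array([-1.0,  1.0]), 0.0),
            (np.array([ 1.0, -1.0]), 0.0),
            (np.array([ 1.0,  1.0]), 1.0)]

w, theta, eta = np.array([1.0, -1.0]), 0.5, 0.1

for cycle in range(50):
    changed = False
    for x, t in patterns:
        y = 1.0 if np.dot(w, x) - theta > 0 else 0.0
        delta = t - y
        if delta != 0.0:
            w += eta * delta * x       # delta-W = eta * delta * x
            theta -= eta * delta       # delta-theta = -eta * delta
            changed = True
    if not changed:                    # stop when a full cycle makes no change
        break

print("cycle:", cycle, "w:", w, "theta:", theta)
```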

29 Outline
Introduction
Single Layer Perceptron – Perceptron Example
Single Layer Perceptron – Adaline
Multilayer Perceptron – Back-propagation neural network
Competitive Learning – Example
Radial Basis Function (RBF) Networks
Q&A and Homework

30 Perceptron Example by hand (1/11)
Use a two-layer Perceptron to solve the AND problem. Initial parameters: η = 0.1, θ = 0.5, W13 = 1.0, W23 = -1.0.
(Fig.12: the network, with inputs X1 and X2 connected to output neuron X3, and the AND truth table over X1, X2, Y.)

31 Perceptron Example by hand (2/11)
1st learning cycle. Input the 1st example: X1 = -1, X2 = -1, T = 0. net = W13·X1 + W23·X2 − θ = -0.5, so Y = 0; δ = T − Y = 0; ΔW13 = η·δ·X1 = 0, ΔW23 = 0, Δθ = −η·δ = 0. The 2nd–4th examples are processed the same way.
(Table: No, X1, X2, T, net, Y, ΔW13, ΔW23, Δθ for the four training patterns.)

32 Perceptron Example by hand (3/11)
Adjust weight & bias: W13 = 1, W23 = -0.8, θ = 0.5.
2nd learning cycle. (Table: per-pattern net, Y, ΔW13, ΔW23, Δθ values.)

33 Perceptron Example by hand (4/11)
Adjust weight & bias: W13 = 1, W23 = -0.6, θ = 0.5.
3rd learning cycle. (Table: per-pattern net, Y, ΔW13, ΔW23, Δθ values.)

34 Perceptron Example by hand (5/11)
Adjust weight & bias: W13 = 1, W23 = -0.4, θ = 0.5.
4th learning cycle. (Table: per-pattern net, Y, ΔW13, ΔW23, Δθ values.)

35 Perceptron Example by hand (6/11)
Adjust weight & bias: W13 = 0.9, W23 = -0.3, θ = 0.6.
5th learning cycle. (Table: per-pattern net, Y, ΔW13, ΔW23, Δθ values.)

36 Perceptron Example by hand (7/11)
Adjust weight & bias: W13 = 0.9, W23 = -0.1, θ = 0.6.
6th learning cycle. (Table: per-pattern net, Y, ΔW13, ΔW23, Δθ values.)

37 Perceptron Example by hand (8/11)
Adjust weight & bias: W13 = 0.8, W23 = 0, θ = 0.7.
7th learning cycle. (Table: per-pattern net, Y, ΔW13, ΔW23, Δθ values.)

38 Perceptron Example by hand (9/11)
Adjust weight & bias: W13 = 0.7, W23 = 0.1, θ = 0.8.
8th learning cycle. (Table: per-pattern net, Y, ΔW13, ΔW23, Δθ values.)

39 Perceptron Example by hand (10/11)
Adjust weight & bias: W13 = 0.8, W23 = 0.2, θ = 0.7.
9th learning cycle. (Table: per-pattern net, Y, ΔW13, ΔW23, Δθ values.)

40 Perceptron Example by hand (11/11)
Adjust weight & bias: W13 = 0.8, W23 = 0.2, θ = 0.7.
10th learning cycle: no change, so learning stops. (Table: per-pattern net, Y, ΔW13, ΔW23, Δθ values.)

41 Example
Input value                Desired output value
x1 = (1, 0, 1)T            y1 = -1
x2 = (0, -1, -1)T          y2 = 1
x3 = (-1, -0.5, -1)T       y3 = 1
The learning constant is assumed to be 0.1. The initial weight vector is w0 = (1, -1, 0)T. (Fig.13)

42 Step 1: <w0, x1> = (1, -1, 0)·(1, 0, 1)T = 1. Correction is needed since y1 = -1 ≠ sign(1): w1 = w0 + 0.1·(-1 - 1)·x1 = (1, -1, 0)T - 0.2·(1, 0, 1)T = (0.8, -1, -0.2)T.
Step 2: <w1, x2> = 1.2; y2 = 1 = sign(1.2), so w2 = w1.

43 Step 3: <w2, x3> = (0.8, -1, -0.2)·(-1, -0.5, -1)T = -0.1. Correction is needed since y3 = 1 ≠ sign(-0.1): w3 = w2 + 0.1·(1 - (-1))·x3 = (0.8, -1, -0.2)T + 0.2·(-1, -0.5, -1)T = (0.6, -1.1, -0.4)T.
Step 4: <w3, x1> = (0.6, -1.1, -0.4)·(1, 0, 1)T = 0.2. Correction is needed since y1 = -1 ≠ sign(0.2): w4 = w3 + 0.1·(-1 - 1)·x1 = (0.6, -1.1, -0.4)T - 0.2·(1, 0, 1)T = (0.4, -1.1, -0.6)T.

44 Step 5: <w4, x2> = 1.7; y2 = 1 = sign(1.7), so w5 = w4.
Step 6: <w5, x3> = 0.75; y3 = 1 = sign(0.75), so w6 = w5.
w6 terminates the learning process, since it classifies all three patterns correctly:
<w6, x1> = -0.2 < 0, <w6, x2> = 1.7 > 0, <w6, x3> = 0.75 > 0.
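A short Python/NumPy sketch that replays these six steps and checks the final weight vector; it is only a verification aid, not part of the original slides:

```python
import numpy as np

X = [np.array([1.0, 0.0, 1.0]),
     np.array([0.0, -1.0, -1.0]),
     np.array([-1.0, -0.5, -1.0])]
Y = [-1.0, 1.0, 1.0]
w, eta = np.array([1.0, -1.0, 0.0]), 0.1

for step in range(6):                       # steps 1..6 cycle through the 3 patterns
    x, t = X[step % 3], Y[step % 3]
    if np.sign(np.dot(w, x)) != t:          # correction only when the sign disagrees
        w = w + eta * (t - np.sign(np.dot(w, x))) * x

print(w)                                    # -> [ 0.4 -1.1 -0.6]
print([float(np.dot(w, x)) for x in X])     # -> [-0.2, 1.7, 0.75]
```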

45 Adaline
Architecture of Adaline (Fig.14): inputs X1, X2 with weights W1, W2, a constant -1 input with bias weight b, and an input layer and output layer producing Y.
Application: filtering in communication.
Learning algorithm (Least Mean Square, LMS):
Y = purelin(ΣWX - b) = W1X1 + W2X2 - b
W(t+1) = W(t) + 2ηe(t)X(t)
b(t+1) = b(t) + 2ηe(t)
e(t) = T - Y
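A minimal sketch of LMS training for a two-input Adaline in Python/NumPy, following the update equations above; the noisy-free linear target, learning rate, and number of passes are illustrative assumptions, and the bias is treated here as a weight on the constant -1 input shown in Fig.14, so its update sign is the mirror image of the "+2ηe(t)" form above:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(200, 2))       # two inputs per sample (assumed data)
T = 0.7 * X[:, 0] - 0.3 * X[:, 1] + 0.2     # assumed linear target

w, b, eta = np.zeros(2), 0.0, 0.05
for epoch in range(20):
    for x, t in zip(X, T):
        y = np.dot(w, x) - b                # Y = W1*X1 + W2*X2 - b
        e = t - y                           # e(t) = T - Y
        w = w + 2 * eta * e * x             # W(t+1) = W(t) + 2*eta*e(t)*X(t)
        b = b - 2 * eta * e                 # bias carried on the constant -1 input

print(np.round(w, 3), round(b, 3))          # approaches (0.7, -0.3) and b = -0.2
```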

46 Perceptron in XOR problem
(Figure: the OR and AND patterns on the bipolar plane can each be separated by a single straight line, but the two XOR classes lie on opposite diagonals and cannot be separated by any single line.)

47 Outline
Introduction
Single Layer Perceptron – Perceptron Example
Single Layer Perceptron – Adaline
Multilayer Perceptron – Back-propagation neural network
Competitive Learning – Example
Radial Basis Function (RBF) Networks
Q&A and Homework

48 Multilayer Feed-Forward Networks
Fig. 15 Network architectures: A taxonomy of feed-forward and recurrent/feedback network architectures.

49 Multilayer perceptron
(Fig. 16: a typical three-layer feed-forward network architecture, with input layer x1 … xn, hidden layers carrying weights Wqi(1), Wij(2), …, Wjk(L), and output layer y1 … yn.)

50 Multilayer perceptron
The most popular class of multilayer feed-forward networks; it can form arbitrarily complex decision boundaries and represent any Boolean function.
Back-propagation: let the squared-error cost function be E = (1/2) Σj (Tj - Yj)² and adjust the weights by gradient descent on E.
A geometric interpretation follows.

51 A geometric interpretation of the role of hidden units in a two-dimensional input space
Fig.17

52 Back-propagation neural network (BPN)
Introduced in 1985.
Architecture (Fig.18): an input layer, hidden layer(s), and an output layer mapping an input vector to an output vector.

53 BPN Algorithm
Use the gradient (steepest) descent method to reduce the error.
Energy function: E = (1/2) Σj (Tj - Yj)².
Weight updates are derived layer by layer: output layer → hidden layer, then hidden layer → hidden layer, as the error is propagated backwards.
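A compact sketch of back-propagation in Python/NumPy for a small 2-4-1 sigmoid network trained on XOR by gradient descent on E = (1/2) Σj (Tj - Yj)²; the network size, learning rate, initialization, and number of epochs are illustrative assumptions:

```python
import numpy as np

def sigmoid(v):
    return 1.0 / (1.0 + np.exp(-v))

rng = np.random.default_rng(1)
X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
T = np.array([[0.], [1.], [1.], [0.]])            # XOR targets

W1 = rng.normal(0, 1, (2, 4)); b1 = np.zeros(4)   # input -> hidden
W2 = rng.normal(0, 1, (4, 1)); b2 = np.zeros(1)   # hidden -> output
eta = 0.5

for epoch in range(20000):
    # forward pass
    H = sigmoid(X @ W1 + b1)                      # hidden activations
    Y = sigmoid(H @ W2 + b2)                      # network outputs
    # backward pass: delta terms derived from the squared-error energy function
    delta_out = (T - Y) * Y * (1 - Y)             # output-layer delta
    delta_hid = (delta_out @ W2.T) * H * (1 - H)  # hidden-layer delta
    # gradient-descent weight updates
    W2 += eta * H.T @ delta_out; b2 += eta * delta_out.sum(axis=0)
    W1 += eta * X.T @ delta_hid; b1 += eta * delta_hid.sum(axis=0)

print(np.round(Y, 2))   # typically approaches [[0], [1], [1], [0]]
```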

54 Outline
Introduction
Single Layer Perceptron – Perceptron Example
Single Layer Perceptron – Adaline
Multilayer Perceptron – Back-propagation neural network
Competitive Learning – Example
Radial Basis Function (RBF) Networks
Q&A and Homework

55 Competitive Learning Rules
Known as the winner-take-all method. It is an unsupervised learning scheme, often used to cluster or categorize the input data. (Fig.19 shows the simplest competitive network.)

56 Competitive Learning Rules
A geometric interpretation of competitive learning. Fig. 20: (a) before learning, (b) after learning.
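A minimal winner-take-all sketch in Python/NumPy: the unit whose weight vector is closest to the input wins, and only its weights move toward the input. The two-cluster data, learning rate, and number of epochs are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(2)
# two clusters of 2-D input patterns (assumed data)
data = np.vstack([rng.normal([1.0, 1.0], 0.1, (50, 2)),
                  rng.normal([-1.0, -1.0], 0.1, (50, 2))])
rng.shuffle(data)

W = rng.normal(0, 0.5, (2, 2))       # one weight vector per competitive unit
eta = 0.1

for epoch in range(20):
    for x in data:
        winner = np.argmin(np.linalg.norm(W - x, axis=1))  # closest unit wins
        W[winner] += eta * (x - W[winner])                  # move only the winner toward x

print(np.round(W, 2))                # rows approach the two cluster centres
```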

57 Example

58 Examples (Cont’d.)

59 Examples (Cont’d.) Fig.21

60 Outline
Introduction
Single Layer Perceptron – Perceptron Example
Single Layer Perceptron – Adaline
Multilayer Perceptron – Back-propagation neural network
Competitive Learning – Example
Radial Basis Function (RBF) Networks
Q&A and Homework

61 Radial Basis Function network
A special class of feed-forward networks. Origin: Cover's theorem. The radial basis function (kernel function) is typically a Gaussian function. (Fig.22: inputs x1, x2 feeding radial basis units ψ1, ψ2.)

62 Radial Basis Function network
There are a variety of learning algorithms for the RBF network; the basic one is a two-step (hybrid) learning strategy. It converges much faster than back-propagation, but it typically involves a larger number of hidden units, so its runtime speed after training is slower. The relative efficiency of the RBF network and the multilayer perceptron is problem-dependent.
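A small sketch of the two-step strategy in Python/NumPy: first place the Gaussian centres (here by sampling training points, an assumed shortcut; clustering such as k-means is the usual choice), then solve the linear output weights by least squares. The toy 1-D regression data and the width parameter are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(3)
x = np.linspace(-3, 3, 60).reshape(-1, 1)           # 1-D inputs
t = np.sin(x).ravel() + rng.normal(0, 0.05, 60)     # noisy target function

# Step 1: choose centres (assumed: random training points) and a width
centres = x[rng.choice(len(x), size=10, replace=False)]
sigma = 1.0

def design_matrix(x, centres, sigma):
    """Gaussian basis functions phi_j(x) = exp(-||x - c_j||^2 / (2 sigma^2))."""
    d = np.linalg.norm(x[:, None, :] - centres[None, :, :], axis=2)
    return np.exp(-d**2 / (2 * sigma**2))

# Step 2: solve the linear output-layer weights by least squares
Phi = design_matrix(x, centres, sigma)
w, *_ = np.linalg.lstsq(Phi, t, rcond=None)

pred = Phi @ w
print("training RMSE:", np.sqrt(np.mean((pred - t)**2)))
```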

63 Issues
How many layers are needed for a given task?
How many units are needed per layer?
Generalization ability: how large should the training set be for 'good' generalization?
Although multilayer feed-forward networks have been widely used, these parameters must still largely be determined by trial and error.

64 Journals on neural networks
Neural Networks (the official journal of the International Neural Network Society, INNS)
IEEE Transactions on Neural Networks
International Journal of Neural Systems
Neurocomputing
Neural Computation

65 Books
Artificial Intelligence (AI)
  Artificial Intelligence: A Modern Approach (2nd Edition), Stuart J. Russell, Peter Norvig
Machine learning
  Machine Learning, Tom M. Mitchell
  Neuro-Fuzzy and Soft Computing: A Computational Approach to Learning and Machine Intelligence, Jyh-Shing Roger Jang, Chuen-Tsai Sun, Eiji Mizutani
Neural networks
  類神經網路模式應用與實作, 葉怡成
  應用類神經網路, 葉怡成
  類神經網路 – MATLAB的應用, 羅華強
  Neural Networks: A Comprehensive Foundation (2nd Edition), Simon Haykin
  Neural Network Design, Martin T. Hagan, Howard B. Demuth, Mark H. Beale
Genetic Algorithms
  Genetic Algorithms in Search, Optimization, and Machine Learning, David E. Goldberg
  Genetic Algorithms + Data Structures = Evolution Programs, Zbigniew Michalewicz
  An Introduction to Genetic Algorithms for Scientists and Engineers, David A. Coley

66 Homework
Use a two-layer Perceptron to solve the OR problem.
  Draw the topology (structure) of the neural network, including the number of nodes in each layer and the associated weight linkages.
  Discuss how the initial parameters (weights, bias, learning rate) affect the learning process.
  Discuss the difference between batch-mode learning and on-line learning.
Use a two-layer Perceptron to solve the XOR problem, and discuss why it cannot solve the XOR problem.

67 Thanks

