Artificial Neural Networks


1 Artificial Neural Networks
Introduction

2 Biological Inspirations
Humans perform complex tasks such as vision, motor control, and language understanding very well. One way to build intelligent machines is to try to imitate the human brain, or at least its organizational principles.

3 Human Brain
The brain is a highly complex, non-linear, and parallel computer, composed of some 10^11 neurons that are densely connected (~10^4 connections per neuron). We have just begun to understand how the brain works. A neuron is much slower (10^-3 s) than a silicon logic gate (10^-9 s); however, the massive interconnection between neurons makes up for the comparatively slow rate. Complex perceptual decisions are arrived at quickly (within a few hundred milliseconds). Plasticity: some of the neural structure of the brain is present at birth, while other parts are developed through learning, especially in the early stages of life, to adapt to the environment (new inputs).

4 Human Brain
Some of the neural structure of the brain is present at birth, while other parts are developed through learning, especially in the early stages of life, to adapt to the environment (new inputs). [Diagram: sensory input -> integration (learning) -> motor output]

5 Biological Neuron
Neurons respond slowly: ~10^-3 s, compared to ~10^-9 s for electrical circuits. The brain uses massively parallel computation: ~10^11 neurons in the brain, with ~10^4 connections per neuron.

6 Biological Neuron Dendrites(樹突): nerve fibres carrying electrical signals to the cell cell body: computes a non-linear function of its inputs Axon(軸突): single long fiber that carries the electrical signal from the cell body to other neurons Synapse(突觸): the point of contact between the axon of one cell and the dendrite of another, regulating a chemical connection whose strength affects the input to the cell.

7 Biological Neuron
A variety of different neurons exist (motor neurons, on-center off-surround visual cells, ...), with different branching structures. The connections of the network and the strengths of the individual synapses establish the function of the network.

8 Artificial Neural Networks
Computational models inspired by the human brain: massively parallel, distributed systems made up of simple processing units (neurons). Synaptic connection strengths among neurons are used to store the acquired knowledge. Knowledge is acquired by the network from its environment through a learning process.

9 Artificial Neural Networks
Early ANN models: Perceptron, ADALINE, Hopfield network. Current models: multilayer feedforward networks (multilayer perceptrons), radial basis function networks, self-organizing networks, ...

10 Properties of ANNs
Learning from examples: labeled or unlabeled. Adaptivity: changing the connection strengths to learn things. Non-linearity: the non-linear activation functions are essential. Fault tolerance: if one of the neurons or connections is damaged, the whole network still works quite well.

11 Artificial Neuron Model
[Diagram of neuron i: inputs x1 ... xm (with x0 = +1), synaptic weights wi1 ... wim, bias bi, summation, activation function f, output ai.]

12 Pre-1940: von Helmholtz, Mach, Pavlov, etc.
General theories of learning, vision, and conditioning; no specific mathematical models of neuron operation. 1940s (Hebb, McCulloch and Pitts): a mechanism for learning in biological neurons; neural-like networks can compute any arithmetic function. 1958 (Rosenblatt): the perceptron.

13 Single-Input Neuron

14 Bias
ai = f(ni) = f( Σ_{j=1..n} wij xj + bi )
An artificial neuron computes the weighted sum of its inputs and, if that value exceeds its "bias" (threshold), it "fires" (i.e. becomes active).

15 Bias
The bias can be incorporated as another weight clamped to a fixed input of +1.0. This extra free variable (bias) makes the neuron more powerful.
ai = f(ni) = f( Σ_{j=0..n} wij xj ) = f( wi · x )
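As a concrete check of the two equivalent forms above, here is a minimal MATLAB sketch; the input, weight, and bias values are made up for illustration:

x = [0.5; -1.0; 2.0];        % input vector
w = [1.0;  0.3; -0.5];       % synaptic weights
b = 0.2;                     % bias
f = @(n) 1./(1 + exp(-n));   % sigmoid activation function

a_explicit = f(w'*x + b);    % bias kept as a separate term

x_aug = [1; x];              % clamp x0 = +1
w_aug = [b; w];              % treat the bias as weight w0
a_folded = f(w_aug'*x_aug);  % same result as a_explicit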

16 Activation Functions
Also called the squashing function, as it limits the amplitude of the output of the neuron. Many types of activation functions are used:
linear: a = f(n) = n
threshold (hard limiting): a = 1 if n >= 0, a = 0 if n < 0
sigmoid: a = 1 / (1 + e^-n)
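A minimal MATLAB sketch of the three activation functions listed above, written inline so no toolbox is needed (the corresponding toolbox names would be purelin, hardlim, and logsig):

linear    = @(n) n;                  % identity (purelin)
threshold = @(n) double(n >= 0);     % hard limit: 1 if n >= 0, else 0 (hardlim)
sigmoid   = @(n) 1./(1 + exp(-n));   % squashes the output into (0, 1) (logsig)

n = -2:0.5:2;                        % sample net inputs
disp([n; linear(n); threshold(n); sigmoid(n)])   % compare the three outputs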

17 Activation functions: threshold, linear, sigmoid

18 Activation Functions

19 Ex. The input to a single-input neuron is 2.0, its weight is 2.3, and its bias is -3. What is the net input to the transfer function? What are the neuron outputs when the hard limit and linear transfer functions are applied?

20 Solution
The net input: n = wp + b = 2.3*2 + (-3) = 1.6
Output for the hard limit transfer function: a = hardlim(1.6) = 1
Output for the linear transfer function: a = purelin(1.6) = 1.6
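The same calculation can be checked in MATLAB with the transfer functions written inline, so the sketch does not depend on the Neural Network Toolbox:

p = 2.0;  w = 2.3;  b = -3;       % single-input neuron from the example
n = w*p + b;                      % net input: 2.3*2 - 3 = 1.6
a_hardlim = double(n >= 0);       % hard limit output: 1
a_purelin = n;                    % linear output: 1.6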

21 Different Network Topologies
Single-layer feed-forward networks: the input layer projects directly onto the output layer. [Figure: single-layer network, input layer -> output layer]

22 Ex. Given a two-input neuron with the following parameters: b = 1.2, w = [3, 2], and p = [-5, 6]^T, calculate the neuron output for the following transfer functions: 1. a symmetrical hard limit transfer function; 2. a hyperbolic tangent sigmoid transfer function.

23 Solution
n = wp + b = [3 2][-5; 6] + 1.2 = -15 + 12 + 1.2 = -1.8
1. a = hardlims(-1.8) = -1
2. a = tansig(-1.8) ≈ -0.9468
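A quick inline MATLAB check of the same numbers (hardlims written as sign and tansig as tanh):

w = [3 2];  b = 1.2;  p = [-5; 6];
n = w*p + b;                      % net input: 3*(-5) + 2*6 + 1.2 = -1.8
a_hardlims = sign(n);             % symmetrical hard limit: -1
a_tansig   = tanh(n);             % hyperbolic tangent sigmoid: about -0.9468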

24 Ex. A single-layer neural network is to have six inputs and two outputs. The outputs are to be limited to, and continuous over, the range 0 to 1. Specify the ANN architecture: 1. How many neurons are required? 2. What are the dimensions of the weight matrix? 3. What kind of transfer function could be used? 4. Is a bias required?

25 Solution
1. Two neurons, one for each output.
2. Two rows (one per neuron) and six columns (one per input), i.e. a 2-by-6 weight matrix.
3. logsig, since its output is continuous over the range 0 to 1.
4. There is not enough information to decide whether a bias is required.
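A sketch of the layer just described, with made-up random weights, showing the 2-by-6 weight matrix and logsig outputs bounded between 0 and 1:

R = 6;  S = 2;                    % six inputs, two neurons (one per output)
W = randn(S, R);                  % 2-by-6 weight matrix (random, for illustration only)
b = randn(S, 1);                  % one bias per neuron (the problem leaves this open)
p = randn(R, 1);                  % an arbitrary input vector
a = 1./(1 + exp(-(W*p + b)));     % logsig keeps each output in (0, 1)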

26 Different Network Topologies
Multi-layer feed-forward networks: one or more hidden layers; input projects only from previous layers onto a layer. [Figure: 2-layer (1-hidden-layer) fully connected network, input layer -> hidden layer -> output layer]

27 Different Network Topologies
Recurrent networks: networks with feedback, where some of the outputs are connected back to some of the inputs (discrete time). [Figure: recurrent network, input layer -> output layer with feedback]

28 How to Decide on a Network Topology?
Number of input nodes? The number of features. Number of output nodes? Suitable to encode the output representation. Transfer function? Suitable to the problem. Number of hidden nodes? Not exactly known.

29 Illustrative Examples

30 Boolean OR
Input-output pairs (p, t) for Boolean OR: p = (0,0), t = 0; p = (0,1), t = 1; p = (1,0), t = 1; p = (1,1), t = 1. Can you find (manually) the weights of a perceptron to do the job?

31 Boolean OR Solution
1) Pick an admissible decision boundary.
2) The weight vector should be orthogonal to the decision boundary.
3) Pick a point on the decision boundary to find the bias.
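As a worked instance of these three steps (the boundary chosen here is just one admissible option): take the boundary p1 + p2 = 0.5, so w = [1 1] is orthogonal to it, and using the boundary point (0.5, 0), b = -(w * [0.5; 0]) = -0.5. A quick MATLAB check:

w = [1 1];  b = -0.5;              % one admissible OR perceptron
P = [0 0 1 1; 0 1 0 1];            % the four OR inputs, one per column
a = double(w*P + b >= 0)           % outputs: 0 1 1 1, as required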

32 Decision Boundary
w^T p = ||w|| ||p|| cos θ, so the projection of p onto w is ||p|| cos θ = w^T p / ||w||.
All points on the decision boundary have the same inner product (= -b) with the weight vector. Therefore they have the same projection onto the weight vector, so they must lie on a line orthogonal to the weight vector.
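A quick numerical check of the projection argument, using a made-up weight vector and two made-up points on the corresponding boundary:

w = [1; 2];  b = -2;              % boundary: w'*p + b = 0, i.e. x + 2y = 2
p1 = [2; 0];  p2 = [0; 1];        % two different points on that boundary
ip1 = w'*p1;  ip2 = w'*p2;        % both inner products equal -b = 2
proj1 = ip1/norm(w);              % identical projections onto w
proj2 = ip2/norm(w);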

33 Perceptron Each neuron will have its own decision boundary.
A single neuron can classify input vectors into two categories. An S-neuron perceptron can classify input vectors into 2^S categories.

34 Apple/Banana Sorter

35 Prototype Vectors
Measurement vector and prototypes (prototype banana, prototype apple), with elements encoded as:
Shape: {1: round; -1: elliptical}
Texture: {1: smooth; -1: rough}
Weight: {1: > 1 lb.; -1: < 1 lb.}

36 Perceptron

37 Apple/Banana Example
The decision boundary should separate the prototype vectors. The weight vector should be orthogonal to the decision boundary, and should point in the direction of the vector which should produce an output of 1. The bias determines the position of the boundary.

38 Testing the Network
Test cases: a banana, an apple, and a "rough" banana (a banana whose texture measurement reads rough).
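A hedged MATLAB sketch of these tests, assuming the shape/texture/weight encoding from slide 35, prototype vectors banana = [-1; 1; -1] and apple = [1; 1; -1], and a weight vector that keys on shape alone; these particular weights are an assumption for illustration, not necessarily the ones on the slide:

w = [1 0 0];  b = 0;              % assumed weights: shape alone separates the classes
banana       = [-1;  1; -1];      % elliptical, smooth, < 1 lb (assumed prototype)
apple        = [ 1;  1; -1];      % round, smooth, < 1 lb (assumed prototype)
rough_banana = [-1; -1; -1];      % elliptical, rough, < 1 lb

a1 = sign(w*banana + b)           % -1 -> banana
a2 = sign(w*apple + b)            % +1 -> apple
a3 = sign(w*rough_banana + b)     % -1 -> still classified as banana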

39 Perceptron Learning Rule

40 Types of Learning
Supervised learning (classification): the network is provided with a set of examples of proper network behavior (inputs/targets).
Reinforcement learning: the network is only provided with a grade, or score, which indicates network performance.
Unsupervised learning (clustering): only network inputs are available to the learning algorithm; the network learns to categorize (cluster) the inputs.

41 Learning Rule Test Problem
Input-output:

42 Starting Point Random initial weight: Present p1 to the network:
Incorrect Classification.

43 Tentative Learning Rule
Set 1w to p1? Not stable. Instead, add p1 to 1w. Tentative rule: if t = 1 and a = 0, then 1w(new) = 1w(old) + p.

44 Second Input Vector (Incorrect Classification)
Modification to the rule: if t = 0 and a = 1, then 1w(new) = 1w(old) - p.

45 Third Input Vector (Incorrect Classification)
Patterns are now correctly classified.

46 Unified Learning Rule

47 Unified Learning Rule
Define the error e = t - a. The three cases then collapse into a single rule:
1w(new) = 1w(old) + e p,  b(new) = b(old) + e
A bias is a weight with an input of 1.
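A compact MATLAB sketch of this unified rule as a training loop; the training data here are made-up, linearly separable points, not the test problem from slide 41:

P = [ 2  1 -2 -1;  2 -2  2  1];    % made-up training inputs (one column per example)
T = [ 0  1  0  1];                 % made-up 0/1 targets
S = size(T,1);  R = size(P,1);
W = zeros(S, R);  b = zeros(S, 1); % start from zero weights and bias
for epoch = 1:20                   % repeat passes until the weights stop changing
    for q = 1:size(P, 2)
        a = double(W*P(:,q) + b >= 0);  % hard-limit output
        e = T(:,q) - a;                 % error e = t - a
        W = W + e*P(:,q)';              % unified rule: W(new) = W(old) + e*p'
        b = b + e;                      %               b(new) = b(old) + e
    end
end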

48 Apple/Banana Example
Training set, initial weights, and first iteration: compute the output a, the error e = t - a, and the updated weights. [Calculations shown on slide.]

49 Second Iteration

50 Check

51 Perceptron Rule Capability
The perceptron rule will always converge to weights which accomplish the desired classification, assuming that such weights exist.

52 Perceptron Limitations
Linear decision boundary only: linearly inseparable problems cannot be solved.

53 MATLAB for the Perceptron (ex. 4.7)
plotpv(P,T): P defines 2-element input vectors and the row vector T defines the vectors' target categories. We can plot these vectors with plotpv.

54 net = newp([-1 1; -1 1], 1); plotpv(P,T); plotpc(net.IW{1}, net.b{1});
newp creates a network object and configures it as a perceptron. The first argument specifies the expected ranges of the two inputs; the second specifies that there is only one neuron in the layer. IW is the input layer's initial weight and b is the bias's initial value.

55 net.adaptParam.passes = 3;
net = adapt(net,P,T); plotpc(net.IW{1}, net.b{1});
net.adaptParam.passes = 3 makes adapt pass over the training data 3 times. adapt returns a new network object that performs as a better classifier (train(net,P,T) can be used instead).

56 Classify new point p = [0.7; 1.2]; a = net(p); plotpv(p,a);
point = findobj(gca,'type','line'); set(point,'Color','red'); hold on; plotpv(P,T); plotpc(net.IW{1},net.b{1}); hold off;

57
P = [ ; ]; T = [ ];
plotpv(P,T);
net = newp([-1 1; -1 1], 1);
plotpv(P,T); plotpc(net.IW{1}, net.b{1});
net.adaptParam.passes = 3;
net = adapt(net,P,T);
p = [0.7; 1.2]; a = net(p); plotpv(p,a);
point = findobj(gca,'type','line');
set(point,'Color','red');
hold on;
hold off;

58 Ex. P4.3: We have a classification problem with four classes of input vectors (Class 1, Class 2, Class 3, Class 4). Design a perceptron network to solve this problem.

59 Solved Problem P4.3
Design a perceptron network to solve the following problem:
Class 1 (t = (0,0)): (1,1), (1,2)
Class 2 (t = (0,1)): (2,-1), (2,0)
Class 3 (t = (1,0)): (-1,2), (-2,1)
Class 4 (t = (1,1)): (-1,-1), (-2,-2)
A two-neuron perceptron creates two decision boundaries.

60 Solution of P4.3
[Figure: the four classes and the two decision boundaries, with the resulting two-neuron network diagram and its weight and bias values.]

61
P = [ ; ]; T = [ ; ];
net = newp(minmax(P), 2, 'hardlim', 'learnp');
net.IW{1,1};               % you can try net.IW{1,1} = randn(2);
net.b{1};                  % or net.b{1} = randn(2,1);
net.adaptParam.passes = 5;
net = train(net,P,T);
a = sim(net,P);
plotpc(net.IW{1,1}, net.b{1});
figure(2); plotpv(P,T);
hold on;
plotpc(net.IW{1}, net.b{1});

62 Solved Problem P4.5 Train a perceptron network to solve P4.3 problem using the perceptron learning rule.

63 Solution of P4.5

64 Solution of P4.5

65 Solution of P4.5
[Figure: the trained network diagram with its final weights and biases, and the resulting decision boundaries.]

66 Hamming Network
The Hamming network was designed explicitly to solve binary (1 or -1) pattern recognition problems. It uses both feedforward and recurrent (feedback) layers. The objective is to decide which prototype vector is closest to the input vector. When the recurrent layer converges, there will be only one neuron with nonzero output.

67 Hamming Network
[Figure: feedforward layer (W1, b1, S neurons, R inputs) followed by a recurrent layer (W2, delay D, S neurons).]
Feedforward layer: n1 = W1 p + b1, a1 = purelin(n1)
Recurrent layer: n2(t+1) = W2 a2(t), a2(t+1) = poslin(n2(t+1)), with a2(0) = a1

68 Feedforward Layer
n1 = W1 p + b1, a1 = purelin(n1)
The feedforward layer performs a correlation, or inner product, between each of the prototype patterns and the input pattern. The rows of the connection matrix W1 are set to the prototype patterns. Each element of the bias vector b1 is set to R, so that the outputs a1 can never be negative.

69 Feedforward Layer
p1: the prototype orange; p2: the prototype apple. The outputs are equal to the inner products of each prototype pattern (p1 and p2) with the input (p), plus R (= 3).

70 Feedforward Layer
The Hamming distance between two vectors is equal to the number of elements that are different. It is defined only for binary vectors. The neuron with the largest output will correspond to the prototype pattern that is closest in Hamming distance to the input pattern.
Output = 2 × (R - Hamming distance)
a1(1) = 2 × (3 - 1) = 4, a1(2) = 2 × (3 - 2) = 2

71 Recurrent Layer
n2(t+1) = W2 a2(t), a2(t+1) = poslin(n2(t+1)), a2(0) = a1
The recurrent layer of the Hamming network is what is known as a "competitive" layer. The neurons compete with each other to determine a winner. After the competition, only one neuron will have a nonzero output.

72 Recurrent Layer
Each element is reduced by the same fraction of the other, so the difference between large and small outputs is increased. The effect of the recurrent layer is to zero out all neuron outputs except the one with the largest initial value (which corresponds to the prototype pattern that is closest in Hamming distance to the input).

73 Recurrent Layer Since the outputs of successive iterations produce the same result, the network has converged. Prototype pattern number one, the orange, is chosen as the correct match.
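A self-contained MATLAB sketch of the whole Hamming network for this orange/apple example; the prototype vectors (orange = [1; -1; -1], apple = [1; 1; -1]) and the test input are assumed values in the spirit of the example, and epsilon must be less than 1/(S-1):

p1 = [ 1; -1; -1];                % prototype orange (assumed values)
p2 = [ 1;  1; -1];                % prototype apple (assumed values)
R  = numel(p1);  S = 2;
W1 = [p1'; p2'];                  % feedforward weights: rows are the prototypes
b1 = R*ones(S,1);                 % each bias set to R, so a1 is never negative

p  = [-1; -1; -1];                % test input (an elliptical, rough, light fruit)
a2 = W1*p + b1;                   % feedforward (purelin) layer: a1 = W1*p + b1

epsilon = 0.5;                    % lateral inhibition, must satisfy epsilon < 1/(S-1)
W2 = [1 -epsilon; -epsilon 1];    % recurrent "competitive" weights
for t = 1:10                      % iterate a2(t+1) = poslin(W2*a2(t)) to convergence
    a2 = max(W2*a2, 0);           % poslin: zero out negative values
end
a2                                % only the winning neuron (the orange) stays nonzero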

