Presentation is loading. Please wait.

Presentation is loading. Please wait.

Artificial Neural Networks : An Introduction

Similar presentations


Presentation on theme: "Artificial Neural Networks : An Introduction"— Presentation transcript:

1 Artificial Neural Networks : An Introduction
G.Anuradha

2 Learning Objectives Reasons to study neural computation
Comparison between biological neuron and artificial neuron Basic models of ANN Different types of connections of NN, Learning and activation function Basic fundamental neuron model-McCulloch-Pitts neuron and Hebb network

3 Reasons to study neural computation
To understand how brain actually works Computer simulations are used for this purpose To understand the style of parallel computation inspired by neurons and their adaptive connections Different from sequential computation To solve practical problems by using novel learning algorithms inspired by brain

4 Fundamental concept NN are constructed and implemented to model the human brain. Performs various tasks such as pattern-matching, classification, optimization function, approximation, vector quantization and data clustering. These tasks are difficult for traditional computers

5 Biological Neural Network

6 Neuron and a sample of pulse train

7 Biological Neuron Has 3 parts End of axon splits into fine strands
Soma or cell body:- cell nucleus is located Dendrites:- nerve connected to cell body Axon: carries impulses of the neuron End of axon splits into fine strands Each strand terminates into a bulb-like organ called synapse Electric impulses are passed between the synapse and dendrites Synapses are of two types Inhibitory:- impulses hinder the firing of the receiving cell Excitatory:- impulses cause the firing of the receiving cell Neuron fires when the total of the weights to receive impulses exceeds the threshold value during the latent summation period After carrying a pulse an axon fiber is in a state of complete nonexcitability for a certain time called the refractory period.

8 How does the brain work Each neuron receives inputs from other neurons
Use spikes to communicate The effect of each input line on the neuron is controlled by a synaptic weight Positive or negative Synaptic weight adapts so that the whole network learns to perform useful computations Recognizing objects, understanding languages, making plans, controlling the body There are 1011 neurons with 104 weights.

9 Modularity and brain Different bits of the cortex do different things
Local damage to the brain has specific effects Early brain damage makes function relocate Cortex gives rapid parallel computation plus flexibility Conventional computers requires very fast central processors for long sequential computations

10 Information flow in nervous system

11 ANN ANN posess a large number of processing elements called nodes/neurons which operate in parallel. Neurons are connected with others by connection link. Each link is associated with weights which contain information about the input signal. Each neuron has an internal state of its own which is a function of the inputs that neuron receives- Activation level

12 Comparison between brain verses computer
ANN Speed Few ms. Few nano sec. massive ||el processing Size and complexity 1011 neurons & 1015 interconnections Depends on designer Storage capacity Stores information in its interconnection or in synapse. No Loss of memory Contiguous memory locations loss of memory may happen sometimes. Tolerance Has fault tolerance No fault tolerance Inf gets disrupted when interconnections are disconnected Control mechanism Complicated involves chemicals in biological neuron Simpler in ANN

13 Artificial Neural Networks
x1 x2 X1 X2 w1 w2 Y y

14 McCulloch-Pitts Neuron Model

15 McCulloch Pits for And and or model

16 McCulloch Pitts for NOT Model

17 Advantages and Disadvantages of McCulloch Pitt model
Simplistic Substantial computing power Disadvantages Weights and thresholds are fixed Not very flexible

18 Features of McCulloch-Pitts model
Allows binary 0,1 states only Operates under a discrete-time assumption Weights and the neurons’ thresholds are fixed in the model and no interaction among network neurons Just a primitive model

19 General symbol of neuron consisting of processing node and synaptic connections

20 Neuron Modeling for ANN
Is referred to activation function. Domain is set of activation values net. Scalar product of weight and input vector Neuron as a processing node performs the operation of summation of its weighted input.

21 Binary threshold neurons
There are two equivalent ways to write the equations for a binary threshold neuron: 1 if 1 if 0 otherwise 0 otherwise

22 Sigmoid neurons These give a real-valued output that is a smooth and bounded function of their total input. Typically they use the logistic function They have nice derivatives which make learning easy 1 0.5

23 Activation function Bipolar binary and unipolar binary are called as hard limiting activation functions used in discrete neuron model Unipolar continuous and bipolar continuous are called soft limiting activation functions are called sigmoidal characteristics.

24 Activation functions Bipolar continuous Bipolar binary functions

25 Activation functions Unipolar continuous Unipolar Binary

26 Common models of neurons
Binary perceptrons Continuous perceptrons

27 Quiz Which of the following tasks are neural networks good at?
Recognizing fragments of words in a pre-processed sound wave. Recognizing badly written characters. Storing lists of names and birth dates. logical reasoning Neural networks are good at finding statistical regularities that allow them to recognize patterns. They are not good at flawlessly applying symbolic rules or storing exact numbers.

28 Basic models of ANN

29 Classification based on interconnections
Feed forward Single layer Multilayer Feed Back Recurrent

30 Feed-forward neural networks
These are the commonest type of neural network in practical applications. The first layer is the input and the last layer is the output. If there is more than one hidden layer, we call them “deep” neural networks. They compute a series of transformations that change the similarities between cases. The activities of the neurons in each layer are a non-linear function of the activities in the layer below. output units hidden units input units

31 Feedforward Network Its output and input vectors are respectively
Weight wij connects the i’th neuron with j’th input. Activation rule of ith neuron is where EXAMPLE

32 Multilayer feed forward network
Can be used to solve complicated problems

33 Feedback network When outputs are directed back as inputs to same or preceding layer nodes it results in the formation of feedback networks

34 Lateral feedback If the feedback of the output of the processing elements is directed back as input to the processing elements in the same layer then it is called lateral feedback

35 Recurrent networks These have directed cycles in their connection graph. That means you can sometimes get back to where you started by following the arrows. They can have complicated dynamics and this can make them very difficult to train. There is a lot of interest at present in finding efficient ways of training recurrent nets. They are more biologically realistic. Recurrent nets with multiple hidden layers are just a special case that has some of the hiddenhidden connections missing.

36 Recurrent neural networks for modeling sequences
time  Recurrent neural networks are a very natural way to model sequential data: They are equivalent to very deep nets with one hidden layer per time slice. Except that they use the same weights at every time slice and they get input at every time slice. They have the ability to remember information in their hidden state for a long time. But its very hard to train them to use this potential. output output output hidden hidden hidden input input input

37 An example of what recurrent neural nets can now do (to whet your interest!)
Ilya Sutskever (2011) trained a special type of recurrent neural net to predict the next character in a sequence. After training for a long time on a string of half a billion characters from English Wikipedia, he got it to generate new text. It generates by predicting the probability distribution for the next character and then sampling a character from this distribution.

38 Some text generated one character at a time by Ilya Sutskever’s recurrent neural network
In 1974 Northern Denver had been overshadowed by CNL, and several Irish intelligence agencies in the Mediterranean region. However, on the Victoria, Kings Hebrew stated that Charles decided to escape during an alliance. The mansion house was completed in 1882, the second in its bridge are omitted, while closing is the proton reticulum composed below it aims, such that it is the blurring of appearing on any well-paid type of box printer.

39 Symmetrically connected networks
These are like recurrent networks, but the connections between units are symmetrical (they have the same weight in both directions). John Hopfield (and others) realized that symmetric networks are much easier to analyze than recurrent networks. They are also more restricted in what they can do. because they obey an energy function. For example, they cannot model cycles. Symmetrically connected nets without hidden units are called “Hopfield nets”.

40 Symmetrically connected networks with hidden units
These are called “Boltzmann machines”. They are much more powerful models than Hopfield nets. They are less powerful than recurrent neural networks. They have a beautifully simple learning algorithm.

41 Basic models of ANN

42 Learning It’s a process by which a NN adapts itself to a stimulus by making proper parameter adjustments, resulting in the production of desired response Two kinds of learning Parameter learning:- connection weights are updated Structure Learning:- change in network structure

43 Training The process of modifying the weights in the connections between network layers with the objective of achieving the expected output is called training a network. This is achieved through Supervised learning Unsupervised learning Reinforcement learning

44 Classification of learning
Supervised learning:- Learn to predict an output when given an input vector. Unsupervised learning Discover a good internal representation of the input. Reinforcement learning Learn to select an action to maximize payoff.

45 Supervised Learning Child learns from a teacher
Each input vector requires a corresponding target vector. Training pair=[input vector, target vector] Neural Network W X Y (Actual output) (Input) Error (D-Y) signals Error Signal Generator (Desired Output)

46 Two types of supervised learning
Each training case consists of an input vector x and a target output t. Regression: The target output is a real number or a whole vector of real numbers. The price of a stock in 6 months time. The temperature at noon tomorrow. Classification: The target output is a class label. The simplest case is a choice between 1 and 0. We can also have multiple alternative labels.

47 Unsupervised Learning
How a fish or tadpole learns All similar input patterns are grouped together as clusters. If a matching input pattern is not found a new cluster is formed One major aim is to create an internal representation of the input that is useful for subsequent supervised or reinforcement learning. It provides a compact, low-dimensional representation of the input.

48 Self-organizing In unsupervised learning there is no feedback
Network must discover patterns, regularities, features for the input data over the output While doing so the network might change in parameters This process is called self-organizing

49 Reinforcement Learning
X NN W Y (Input) (Actual output) Error signals Error Signal Generator R Reinforcement signal

50 When Reinforcement learning is used?
If less information is available about the target output values (critic information) Learning based on this critic information is called reinforcement learning and the feedback sent is called reinforcement signal Feedback in this case is only evaluative and not instructive

51 Basic models of ANN

52 Activation Function Identity Function f(x)=x for all x
Binary Step function Bipolar Step function Sigmoidal Functions:- Continuous functions Ramp functions:-

53 Some learning algorithms we will learn are
Supervised: Adaline, Madaline Perceptron Back Propagation multilayer perceptrons Radial Basis Function Networks Unsupervised Competitive Learning Kohenen self organizing map Learning vector quantization Hebbian learning

54 Neural processing Recall:- processing phase for a NN and its objective is to retrieve the information. The process of computing o for a given x Basic forms of neural information processing Auto association Hetero association Classification

55 Neural processing-Autoassociation
Set of patterns can be stored in the network If a pattern similar to a member of the stored set is presented, an association with the input of closest stored pattern is made

56 Neural Processing- Heteroassociation
Associations between pairs of patterns are stored Distorted input pattern may cause correct heteroassociation at the output

57 Neural processing-Classification
Set of input patterns is divided into a number of classes or categories In response to an input pattern from the set, the classifier is supposed to recall the information regarding class membership of the input pattern.

58 Important terminologies of ANNs
Weights Bias Threshold Learning rate Momentum factor Vigilance parameter Notations used in ANN

59 Weights Each neuron is connected to every other neuron by means of directed links Links are associated with weights Weights contain information about the input signal and is represented as a matrix Weight matrix also called connection matrix

60 Weight matrix W= =

61 Weights contd… wij –is the weight from processing element ”i” (source node) to processing element “j” (destination node) X1 1 Xi Yj Xn w1j wij wnj bj

62 Activation Functions Used to calculate the output response of a neuron. Sum of the weighted input signal is applied with an activation to obtain the response. Activation functions can be linear or non linear Already dealt Identity function Single/binary step function Discrete/continuous sigmoidal function.

63 Bias Bias is like another weight. Its included by adding a component x0=1 to the input vector X. X=(1,X1,X2…Xi,…Xn) Bias is of two types Positive bias: increase the net input Negative bias: decrease the net input

64 Why Bias is required? The relationship between input and output given by the equation of straight line y=mx+c C(bias) Input X Y y=mx+C

65 Threshold Set value based upon which the final output of the network may be calculated Used in activation function The activation function using threshold can be defined as

66 Learning rate Denoted by α.
Used to control the amount of weight adjustment at each step of training Learning rate ranging from 0 to 1 determines the rate of learning in each time step

67 Other terminologies Momentum factor: Vigilance parameter:
used for convergence when momentum factor is added to weight updation process. Vigilance parameter: Denoted by ρ Used to control the degree of similarity required for patterns to be assigned to the same cluster

68 Neural Network Learning rules
c – learning constant

69 Hebbian Learning Rule FEED FORWARD UNSUPERVISED LEARNING The learning signal is equal to the neuron’s output

70 Features of Hebbian Learning
Feedforward unsupervised learning “When an axon of a cell A is near enough to exicite a cell B and repeatedly and persistently takes place in firing it, some growth process or change takes place in one or both cells increasing the efficiency” If oixj is positive the results is increase in weight else vice versa

71

72 Perceptron Learning rule
Learning signal is the difference between the desired and actual neuron’s response Learning is supervised

73 Example

74 Quiz Suppose we have 3D input x=(0.5,-0.5) connected to a neuron with weights w=(2,-1) and bias b=0.5. furthermore the target for x is t=0. in this case we use a binary threshold neuron for the output so that y=1 if xTw+b>=0 and 0 otherwise What will be the weights and bias after 1 iteration of perceptron learning algorithm? w= (1.5,-0.5) b=-1.5 w=(1.5,-0.5) b=-0.5 w=(2.5,-1.5) b=0.5 w=(-1.5,0.5) b=1.5

75 Delta Learning Rule Only valid for continuous activation function
Used in supervised training mode Learning signal for this rule is called delta The aim of the delta rule is to minimize the error over all training patterns

76 Delta Learning Rule Contd.
Learning rule is derived from the condition of least squared error. Calculating the gradient vector with respect to wi Minimization of error requires the weight changes to be in the negative gradient direction

77 Widrow-Hoff learning Rule
Also called as least mean square learning rule Introduced by Widrow(1962), used in supervised learning Independent of the activation function Special case of delta learning rule wherein activation function is an identity function ie f(net)=net Minimizes the squared error between the desired output value di and neti

78 Winner-Take-All learning rules

79 Winner-Take-All Learning rule Contd…
Can be explained for a layer of neurons Example of competitive learning and used for unsupervised network training Learning is based on the premise that one of the neurons in the layer has a maximum response due to the input x This neuron is declared the winner with a weight

80

81 Summary of learning rules

82 Linear Separability Separation of the input space into regions is based on whether the network response is positive or negative Line of separation is called linear-separable line. Example:- AND function & OR function are linear separable Example EXOR function Linearly inseparable. Example

83 Hebb Network Hebb learning rule is the simpliest one
The learning in the brain is performed by the change in the synaptic gap When an axon of cell A is near enough to excite cell B and repeatedly keep firing it, some growth process takes place in one or both cells According to Hebb rule, weight vector is found to increase proportionately to the product of the input and learning signal.

84 Flow chart of Hebb training algorithm
Start 1 Initialize Weights Activate output y=t Weight update For Each s:t n Bias update b(new)=b(old) + y y Activate input xi=si Stop 1

85


Download ppt "Artificial Neural Networks : An Introduction"

Similar presentations


Ads by Google