From Biological to Artificial Neural Networks
Marc Pomplun
Department of Computer Science
University of Massachusetts at Boston

From Biological to Artificial Neural Networks

Overview:
- Why Artificial Neural Networks?
- How do NNs and ANNs work?
- An Artificial Neuron
- Capabilities of Threshold Neurons
- Linear and Sigmoidal Neurons
- Learning in ANNs

Computers vs. Neural Networks

"Standard" Computers       Neural Networks
--------------------       ---------------
one CPU                    highly parallel processing
fast processing units      slow processing units
reliable units             unreliable units
static infrastructure      dynamic infrastructure

Why Artificial Neural Networks?

There are two basic reasons why we are interested in building artificial neural networks (ANNs):

Technical viewpoint: Some problems such as character recognition or the prediction of future states of a system require massively parallel and adaptive processing.

Biological viewpoint: ANNs can be used to replicate and simulate components of the human (or animal) brain, thereby giving us insight into natural information processing.

How do NNs and ANNs work?

The "building blocks" of neural networks are the neurons. In technical systems, we also refer to them as units or nodes. Basically, each neuron
- receives input from many other neurons,
- changes its internal state (activation) based on the current input,
- sends one output signal to many other neurons, possibly including its input neurons (recurrent network).

How do NNs and ANNs work?

Information is transmitted as a series of electric impulses, so-called spikes. The frequency and phase of these spikes encode the information. In biological systems, one neuron can be connected to as many as 10,000 other neurons. Neurons of similar functionality are usually organized in separate areas (or layers). Often, there is a hierarchy of interconnected layers, with the lowest layer receiving sensory input and neurons in higher layers computing more complex functions.

"Data Flow Diagram" of Visual Areas in Macaque Brain

[Diagram: blue indicates the motion perception pathway; green indicates the object recognition pathway.]

How do NNs and ANNs work?

NNs are able to learn by adapting their connectivity patterns so that the organism improves its behavior in terms of reaching certain (evolutionary) goals. The strength of a connection, or whether it is excitatory or inhibitory, depends on the state of a receiving neuron's synapses. The NN achieves learning by appropriately adapting the states of its synapses.

An Artificial Neuron

[Diagram: neuron i receives the input signals o_1, o_2, …, o_n through synapses with weights w_i1, w_i2, …, w_in; the resulting net input signal determines the neuron's activation, which in turn produces the output o_i.]

The Net Input Signal

The net input signal is the sum of all inputs after passing the synapses:

net_i(t) = Σ_j w_ij(t) · o_j(t)

This can be viewed as computing the inner product of the vectors w_i and o:

net_i(t) = w_i · o = |w_i| |o| cos φ,

where φ is the angle between the two vectors.
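As a quick illustration (added here, not part of the original slides), the net input is just a dot product; the function and variable names below are placeholders chosen for this sketch:

```python
import numpy as np

def net_input(weights: np.ndarray, outputs: np.ndarray) -> float:
    """Net input signal net_i = sum_j w_ij * o_j, i.e. the inner product w_i . o."""
    return float(np.dot(weights, outputs))

# Example: three presynaptic neurons feeding neuron i.
w_i = np.array([0.5, -0.2, 0.8])   # synaptic weights w_i1, w_i2, w_i3
o = np.array([1.0, 0.0, 1.0])      # outputs of the presynaptic neurons
print(net_input(w_i, o))           # 0.5*1 - 0.2*0 + 0.8*1 = 1.3
```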

The Net Input Signal

In most ANNs, the activation of a neuron is simply defined to equal its net input signal:

a_i(t) = net_i(t)

Then, the neuron's activation function (or output function) f_i is applied directly to net_i(t):

o_i(t) = f_i(net_i(t))

What do such functions f_i look like?

The Activation Function

One possible choice is a threshold function:

f_i(net_i(t)) = 1 if net_i(t) ≥ θ, and 0 otherwise.

The graph of this function is a step: it equals 0 for net_i(t) < θ and jumps to 1 at net_i(t) = θ.
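A minimal Python sketch of such a threshold neuron (an illustration added here, not from the slides; threshold_neuron and theta are names chosen for this example):

```python
import numpy as np

def threshold_neuron(weights: np.ndarray, inputs: np.ndarray, theta: float) -> int:
    """Fire (output 1) if the net input reaches the threshold theta, else output 0."""
    net = np.dot(weights, inputs)
    return 1 if net >= theta else 0

print(threshold_neuron(np.array([0.5, 0.8]), np.array([1.0, 1.0]), theta=1.0))  # 1
print(threshold_neuron(np.array([0.5, 0.8]), np.array([1.0, 0.0]), theta=1.0))  # 0
```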

Capabilities of Threshold Neurons

What can threshold neurons do for us? To keep things simple, let us consider such a neuron with two inputs:

[Diagram: a neuron with inputs o_1 and o_2, weights w_i1 and w_i2, and output o_i.]

The computation of this neuron can be described as the inner product of the two-dimensional vectors o and w_i, followed by a threshold operation.

Capabilities of Threshold Neurons

Let us assume that the threshold θ = 0 and illustrate the function computed by the neuron for sample vectors w_i and o:

[Diagram: the input plane (first vector component vs. second vector component) showing the weight vector w_i, a sample input vector o, and a dotted line through the origin perpendicular to w_i.]

Since the inner product is nonnegative for −90° ≤ φ ≤ 90°, in this example the neuron's output is 1 for any input vector o to the right of or on the dotted line, and 0 for any other input vector.

Capabilities of Threshold Neurons

By choosing appropriate weights w_i and threshold θ, we can place the line dividing the input space into regions of output 0 and output 1 in any position and orientation. Therefore, our threshold neuron can realize any linearly separable function R^n → {0, 1}. Although we only looked at two-dimensional input, our findings apply to any dimensionality n. For example, for n = 3, our neuron can realize any function that divides the three-dimensional input space along a two-dimensional plane (general term: (n−1)-dimensional hyperplane).

Linear Separability

Examples (two dimensions):

OR function (linearly separable): OR(0,0) = 0, OR(0,1) = 1, OR(1,0) = 1, OR(1,1) = 1. A single line can separate the inputs with output 0 from those with output 1.

XOR function (linearly inseparable): XOR(0,0) = 0, XOR(0,1) = 1, XOR(1,0) = 1, XOR(1,1) = 0. No single line can separate the two classes.

This means that a single threshold neuron cannot realize the XOR function.
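For instance, a threshold neuron with weights (1, 1) and threshold θ = 1 realizes the OR function; here is a short verification sketch (illustrative, not from the slides):

```python
import numpy as np

def threshold_neuron(w, x, theta):
    return 1 if np.dot(w, x) >= theta else 0

w, theta = np.array([1.0, 1.0]), 1.0
for x in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(x, "->", threshold_neuron(w, np.array(x), theta))
# Prints the OR truth table: 0, 1, 1, 1. No choice of (w, theta) reproduces XOR,
# since (0,1) and (1,0) cannot be separated from (0,0) and (1,1) by one line.
```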

Capabilities of Threshold Neurons

What do we do if we need a more complex function? We can also combine multiple artificial neurons to form networks with increased capabilities. For example, we can build a two-layer network with any number of neurons in the first layer giving input to a single neuron in the second layer. The neuron in the second layer could, for example, implement an AND function.

Capabilities of Threshold Neurons

What kind of function can such a network realize?

[Diagram: a two-layer network; each first-layer neuron receives the inputs o_1 and o_2, and all first-layer outputs feed into a single second-layer neuron with output o_i.]

Capabilities of Threshold Neurons

Assume that the dotted lines in the diagram represent the input-dividing lines implemented by the neurons in the first layer:

[Diagram: the input plane (1st component vs. 2nd component) with several dotted lines whose intersections bound a polygon.]

Then, for example, the second-layer neuron could output 1 if the input is within a polygon, and 0 otherwise.
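To make this concrete, here is a small sketch (added for illustration, with made-up weights and thresholds) of a two-layer threshold network: each first-layer neuron implements one dividing line, and the second-layer neuron ANDs them, so the network outputs 1 exactly inside a triangle:

```python
import numpy as np

def threshold(net, theta):
    return 1 if net >= theta else 0

# First layer: three neurons, each firing on one side of a line w.x >= theta.
# These half-planes intersect in the triangle with corners (0,0), (1,0), (0,1).
lines = [
    (np.array([1.0, 0.0]), 0.0),     # x1 >= 0
    (np.array([0.0, 1.0]), 0.0),     # x2 >= 0
    (np.array([-1.0, -1.0]), -1.0),  # x1 + x2 <= 1
]

def two_layer_net(x):
    hidden = [threshold(np.dot(w, x), theta) for w, theta in lines]
    # Second layer: AND of the first-layer outputs
    # (fires only if all len(hidden) neurons fire).
    return threshold(sum(hidden), len(hidden))

print(two_layer_net(np.array([0.2, 0.2])))  # 1: inside the triangle
print(two_layer_net(np.array([0.9, 0.9])))  # 0: outside (x1 + x2 > 1)
```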

Capabilities of Threshold Neurons

However, we still may want to implement functions that are more complex than that. An obvious idea is to extend our network even further. Let us build a network that has three layers, with arbitrary numbers of neurons in the first and second layers and one neuron in the third layer. The first and second layers are completely connected; that is, each neuron in the first layer sends its output to every neuron in the second layer.

Capabilities of Threshold Neurons

What type of function can a three-layer network realize?

[Diagram: a three-layer network; each first-layer neuron receives the inputs o_1 and o_2, the first layer is completely connected to the second layer, and all second-layer outputs feed into a single third-layer neuron with output o_i.]

Capabilities of Threshold Neurons

Assume that the polygons in the diagram indicate the input regions for which each of the second-layer neurons yields output 1:

[Diagram: the input plane (1st component vs. 2nd component) containing several separate polygons.]

Then, for example, the third-layer neuron could output 1 if the input is within any of the polygons, and 0 otherwise.

Capabilities of Threshold Neurons

The more neurons there are in the first layer, the more vertices the polygons can have. With a sufficient number of first-layer neurons, the polygons can approximate any given shape. The more neurons there are in the second layer, the more of these polygons can be combined to form the output function of the network. With a sufficient number of neurons and appropriate weight vectors w_i, a three-layer network of threshold neurons can realize any (!) function R^n → {0, 1}.
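Continuing the earlier two-layer sketch (again illustrative, with made-up weights), a third-layer neuron can OR several polygon detectors together by firing whenever at least one of them fires:

```python
import numpy as np

def threshold(net, theta):
    return 1 if net >= theta else 0

def polygon_detector(x, lines):
    """Two-layer stage: AND of half-plane neurons, output 1 inside the polygon."""
    hidden = [threshold(np.dot(w, x), t) for w, t in lines]
    return threshold(sum(hidden), len(hidden))

# Two square regions, each described by four half-plane constraints w.x >= t.
square_a = [(np.array([1., 0.]), 0.), (np.array([-1., 0.]), -1.),
            (np.array([0., 1.]), 0.), (np.array([0., -1.]), -1.)]  # [0,1] x [0,1]
square_b = [(np.array([1., 0.]), 2.), (np.array([-1., 0.]), -3.),
            (np.array([0., 1.]), 2.), (np.array([0., -1.]), -3.)]  # [2,3] x [2,3]

def three_layer_net(x):
    detectors = [polygon_detector(x, square_a), polygon_detector(x, square_b)]
    return threshold(sum(detectors), 1)  # third layer: OR of the detectors

print(three_layer_net(np.array([0.5, 0.5])))  # 1: inside square A
print(three_layer_net(np.array([2.5, 2.5])))  # 1: inside square B
print(three_layer_net(np.array([1.5, 1.5])))  # 0: outside both squares
```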

Terminology

Usually, we draw neural networks in such a way that the input enters at the bottom and the output is generated at the top. Arrows indicate the direction of data flow. The first layer, termed input layer, just contains the input vector and does not perform any computations. The second layer, termed hidden layer, receives input from the input layer and sends its output to the output layer. After applying their activation function, the neurons in the output layer contain the output vector.

Terminology

Example: network function f: R^3 → {0, 1}^2

[Diagram: the input vector enters the input layer at the bottom, flows through the hidden layer, and the output layer at the top produces the output vector.]

Linear Neurons

Obviously, the fact that threshold units can only output the values 0 and 1 restricts their applicability to certain problems. We can overcome this limitation by eliminating the threshold and simply turning f_i into the identity function, so that we get:

o_i(t) = net_i(t)

With this kind of neuron, we can build networks with m input neurons and n output neurons that compute a function f: R^m → R^n.

Linear Neurons

Linear neurons are quite popular and useful for applications such as interpolation. However, they have a serious limitation: each neuron computes a linear function, and therefore the overall network function f: R^m → R^n is also linear. This means that if an input vector x results in an output vector y, then for any factor c the input cx will result in the output cy. Obviously, many interesting functions cannot be realized by networks of linear neurons.
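This scaling property is easy to verify numerically; in the sketch below (an added illustration with arbitrary example matrices), two layers of linear neurons collapse into a single matrix:

```python
import numpy as np

# Two layers of linear neurons: y = W2 @ (W1 @ x).
W1 = np.array([[1.0, 2.0], [0.0, -1.0], [0.5, 0.5]])  # 2 inputs -> 3 hidden units
W2 = np.array([[1.0, 1.0, 2.0]])                      # 3 hidden units -> 1 output

def linear_net(x):
    return W2 @ (W1 @ x)

x = np.array([1.0, 2.0])
print(linear_net(2 * x))  # [12.]: exactly 2 * linear_net(x)
print(2 * linear_net(x))  # [12.]
print(W2 @ W1)            # the whole network is equivalent to this one matrix
```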

Sigmoidal Neurons

Sigmoidal neurons accept any vectors of real numbers as input, and they output a real number between 0 and 1. Sigmoidal neurons are the most common type of artificial neuron, especially in learning networks. A network of sigmoidal units with m input neurons and n output neurons realizes a network function f: R^m → (0, 1)^n.

Sigmoidal Neurons

f_i(net_i(t)) = 1 / (1 + e^((θ − net_i(t))/τ))

The parameter τ controls the slope of the sigmoid function, while the parameter θ controls the horizontal offset of the function in a way similar to the threshold neurons.

[Graph: f_i(net_i(t)) versus net_i(t) for τ = 1 and τ = 0.1; the smaller τ yields a steeper, more threshold-like curve.]
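In Python, this parametrized sigmoid could look like the following sketch (an added illustration; sigmoid, tau, and theta are names chosen here for the slope and offset parameters):

```python
import numpy as np

def sigmoid(net, theta=0.0, tau=1.0):
    """Sigmoidal activation 1 / (1 + exp((theta - net) / tau)).
    theta shifts the curve horizontally; a smaller tau makes it steeper."""
    return 1.0 / (1.0 + np.exp((theta - net) / tau))

print(sigmoid(0.0))           # 0.5 at net = theta
print(sigmoid(1.0, tau=1.0))  # ~0.73: gentle slope
print(sigmoid(1.0, tau=0.1))  # ~1.0: nearly a threshold function
```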

Learning in ANNs

In supervised learning, we train an ANN with a set of vector pairs, so-called exemplars. Each pair (x, y) consists of an input vector x and a corresponding output vector y. Whenever the network receives input x, we would like it to provide output y. The exemplars thus describe the function that we want to "teach" our network. Besides learning the exemplars, we would like our network to generalize, that is, give plausible output for inputs that the network had not been trained with.

Learning in ANNs

There is a tradeoff between a network's ability to precisely learn the given exemplars and its ability to generalize (i.e., inter- and extrapolate). This problem is similar to fitting a function to a given set of data points. Let us assume that you want to find a fitting function f: R → R for a set of three data points. You try to do this with polynomials of degree one (a straight line), two, and nine.

Learning in ANNs

[Graph: f(x) versus x, showing the three data points together with the degree-1, degree-2, and degree-9 fits.]

Obviously, the polynomial of degree 2 provides the most plausible fit.
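This effect is easy to reproduce; the sketch below (added here, with made-up sample points) fits polynomials of increasing degree using numpy.polyfit and compares their predictions away from the training points:

```python
import numpy as np
import warnings

# Made-up sample points for illustration.
x = np.array([0.0, 1.0, 2.0])
y = np.array([0.1, 0.9, 4.2])

with warnings.catch_warnings():
    # With only three points, the degree-9 fit is underdetermined;
    # numpy flags the poorly conditioned problem with a RankWarning.
    warnings.simplefilter("ignore")
    for degree in (1, 2, 9):
        p = np.poly1d(np.polyfit(x, y, degree))  # least-squares polynomial fit
        print(degree, p(0.5), p(3.0))  # predictions between and beyond the data
```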

Learning in ANNs

The same principle applies to ANNs:

If an ANN has too few neurons, it may not have enough degrees of freedom to precisely approximate the desired function.

If an ANN has too many neurons, it will learn the exemplars perfectly, but its additional degrees of freedom may cause it to show implausible behavior for untrained inputs; it then generalizes poorly.

Unfortunately, there are no known equations that could tell you the optimal size of your network for a given application; you always have to experiment.