# Document Analysis: Artificial Neural Networks

Prof. Rolf Ingold, University of Fribourg. Master course, spring semester 2008.


© Prof. Rolf Ingold

## Outline

- Biological vs. artificial neural networks
- Artificial neuron model
- Artificial neural networks
- Multi-layer perceptron
- Feed-forward activation
- Learning approach
- Back-propagation method
- Optimal learning
- Illustration of JavaNNS

## Biological neurons

- Artificial neural networks are inspired by the biological neurons of the central nervous system
  - each neuron is connected to many other neurons
  - information is transmitted via synapses (an electrochemical process)
  - a neuron receives input from its dendrites and transmits output via the axon to synapses

## Biological vs. artificial networks

| | Biological neural network | Artificial neural network |
|---|---|---|
| Processing | chemical | mathematical function |
| Transmission time | relatively slow | very fast |
| Number of neurons | approx. $10^{10}$ | up to $10^{6}$ |
| Number of synapses | approx. $10^{13}$ | up to $10^{8}$ |

## Artificial neuron model

- A neuron receives input signals $x_1, \dots, x_n$
- These signals are multiplied by synaptic weights $w_1, \dots, w_n$, which can be positive or negative
- The weighted signals are summed into the activation $a = \sum_{i=1}^{n} w_i x_i$, which is passed to a non-linear function $f$ with threshold $w_0$
- The output signal $y = f(a - w_0)$ is then propagated to other neurons
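The neuron model above can be sketched in a few lines of Python. This is a minimal illustration, not code from the course: the function name `neuron_output` and the choice of `tanh` as the non-linear function $f$ are assumptions.

```python
import math

def neuron_output(x, w, w0, f=math.tanh):
    """Return y = f(a - w0), where a is the weighted sum of the inputs.

    f defaults to tanh, which is only one possible non-linear function.
    """
    if len(x) != len(w):
        raise ValueError("x and w must have the same length")
    a = sum(wi * xi for wi, xi in zip(w, x))  # activation of the neuron
    return f(a - w0)

# Illustrative values: a = 0.5*1.0 + 0.25*2.0 = 1.0, which equals the threshold
y = neuron_output([1.0, 2.0], [0.5, 0.25], w0=1.0)
print(y)  # 0.0
```

With these values the activation exactly reaches the threshold, so the argument of $f$ is zero and the neuron outputs 0.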

## Characteristics of artificial neural networks

- Artificial neural networks may vary in several aspects
  - the topology of the network, i.e.
    - the number of neurons, possibly organized in layers or classes
    - how each neuron (of a given layer/class) is connected to its neighbors
  - the transfer function used in each neuron
- The use and the learning strategy have to be adapted accordingly

## Topology of the neural network

- The synaptic connections have a major influence on the behavior of the neural network
- Two main categories can be considered
  - feed-forward networks, where each neuron propagates its output signal only to neurons that have not yet been used
    - as a special case, the multi-layer perceptron has a sequence of layers such that a neuron of one layer is connected only to neurons of the next layer
  - dynamic networks, where neurons are connected without restrictions, possibly in a cyclic way

## Multi-layer perceptron

- The multi-layer perceptron (MLP) has 3 (or more) layers
  - an input layer with one input neuron per feature
  - one or several hidden layers, each with an arbitrary number of neurons connected to the previous layer
  - an output layer with one neuron per class, each neuron being connected to the previous layer
- Hidden and output layers can be completely or only partly connected
- The decision is in favor of the class corresponding to the highest output activation

## Impact of the hidden layer(s)

- Networks with hidden layers can generate arbitrary decision boundaries
  - however, the number of hidden layers has no impact on this capability: a single hidden layer with enough neurons is already sufficient

## Feed-forward activation

- As for the single perceptron, the feature space is augmented with a feature $x_0 = 1$ to take into account the bias $w_0$
- Each neuron $j$ of a hidden layer computes an activation [equation not preserved in the transcript]
- Each neuron $k$ of an output layer computes an activation [equation not preserved in the transcript]
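As a hedged sketch of the feed-forward pass, the code below assumes that each neuron applies the transfer function to the weighted sum of the augmented outputs of the previous layer, which is the usual formulation but is not confirmed by the transcript. The names `layer_forward` and `mlp_forward`, the weight layout (bias weight stored first in each row) and the use of `tanh` are illustrative assumptions.

```python
import math

def layer_forward(inputs, weights, f=math.tanh):
    """Compute the outputs of one fully connected layer.

    weights[j][0] is the bias weight of neuron j (paired with the constant 1);
    weights[j][1:] are the weights applied to the actual inputs.
    """
    augmented = [1.0] + list(inputs)  # x_0 = 1 carries the bias
    nets = [sum(w * v for w, v in zip(row, augmented)) for row in weights]
    return [f(net) for net in nets]

def mlp_forward(x, hidden_weights, output_weights):
    """Return (hidden outputs, output activations, index of the winning class)."""
    y = layer_forward(x, hidden_weights)
    z = layer_forward(y, output_weights)
    winner = max(range(len(z)), key=lambda k: z[k])  # highest output activation
    return y, z, winner

# Hand-picked weights (purely illustrative): 2 features, 2 hidden neurons, 2 classes
hidden_weights = [[0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
output_weights = [[0.0, 1.0, -1.0], [0.0, -1.0, 1.0]]
y, z, winner = mlp_forward([1.0, -1.0], hidden_weights, output_weights)
print(winner)  # 0
```

The last line of `mlp_forward` applies the decision rule of the multi-layer perceptron: the class with the highest output activation wins.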

## Transfer function

- The transfer function $f$ is supposed to be
  - monotonically increasing, within the range $[-1, +1]$
  - antisymmetric, i.e. $f(-\mathit{net}) = -f(\mathit{net})$
  - continuous and differentiable (for back-propagation)
- Typical functions are
  - step (simple threshold)
  - ramp
  - sigmoid
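The three typical transfer functions can be written down as follows. This is a sketch under assumptions: the slide does not give the exact definitions, so the saturation points of the ramp and the use of `tanh` as the sigmoid are illustrative choices that satisfy the range and antisymmetry requirements.

```python
import math

def step(net):
    """Simple threshold: -1 for negative input, +1 for positive input, 0 at exactly zero."""
    return (net > 0) - (net < 0)

def ramp(net):
    """Linear between -1 and +1, saturated outside that interval."""
    return max(-1.0, min(1.0, net))

def sigmoid(net):
    """Antisymmetric sigmoid; tanh is used here as one possible choice."""
    return math.tanh(net)

def sigmoid_derivative(net):
    """Derivative of tanh, which back-propagation needs."""
    return 1.0 - math.tanh(net) ** 2

for net in (-2.0, -0.5, 0.5, 2.0):
    print(net, step(net), ramp(net), round(sigmoid(net), 4))
```

Only the sigmoid is differentiable everywhere, which is why it is the usual choice when training with back-propagation; the step and the ramp have points where the derivative is undefined.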

## Learning in a multi-layer perceptron

- Learning consists of setting the weights $w$, based on training samples
- The method is called back-propagation, because the training error is propagated recursively from the output layer back to the hidden and input layers
- The training error on a given pattern is defined as the squared difference between the desired output and the observed output [equation not preserved in the transcript]
- In practice, the desired output is $+1$ for the correct class and $-1$ (or sometimes $0$) for all other classes
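The target encoding and the per-pattern error can be illustrated as below. The helper names `target_vector` and `pattern_error` are hypothetical, and the factor 1/2 in front of the sum of squares is an assumption (a common convention that simplifies the derivative), since the slide's formula is not available.

```python
def target_vector(correct_class, n_classes, off_value=-1.0):
    """Desired outputs: +1 for the correct class, off_value (-1 or 0) for the others."""
    return [1.0 if k == correct_class else off_value for k in range(n_classes)]

def pattern_error(targets, outputs):
    """Half the summed squared difference between desired and observed outputs.

    The factor 1/2 is an assumed convention, not taken from the slides.
    """
    return 0.5 * sum((t - z) ** 2 for t, z in zip(targets, outputs))

t = target_vector(1, 3)
print(t)                                   # [-1.0, 1.0, -1.0]
print(pattern_error(t, [-1.0, 1.0, -1.0])) # 0.0
```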

## Back-propagation of errors

- The weight vectors are changed in the direction opposite to the gradient of the error, where $\eta$ is the learning rate [equation not preserved in the transcript]
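In a commonly used notation, the gradient-descent rule behind this slide can be written as follows. The symbol $J$ for the training error is an assumption, since the original formula is missing from the transcript.

```latex
\Delta w = -\eta \, \frac{\partial J}{\partial w},
\qquad
w \leftarrow w + \Delta w
```

The minus sign makes each step reduce the error, and the learning rate $\eta$ controls the step size.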

## Error correction on the output layer

- Since the error does not directly depend upon $w_{ji}$, we apply the differential chain rule [equations not preserved in the transcript]
- Thus the update rule becomes [equation not preserved in the transcript]
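A possible reconstruction of the output-layer derivation is sketched below. It rests on several assumptions that the transcript does not confirm: $w_{ji}$ is read as the weight from neuron $i$ of the previous layer (with output $y_i$) to output neuron $j$, $t_j$ and $z_j$ are the desired and observed outputs, and the error is $J = \tfrac{1}{2} \sum_j (t_j - z_j)^2$.

```latex
\begin{aligned}
z_j &= f(\mathit{net}_j), \qquad \mathit{net}_j = \sum_{i} w_{ji}\, y_i \\
\frac{\partial J}{\partial w_{ji}}
  &= \frac{\partial J}{\partial \mathit{net}_j}\,
     \frac{\partial \mathit{net}_j}{\partial w_{ji}}
   = -\delta_j\, y_i,
\qquad \delta_j = (t_j - z_j)\, f'(\mathit{net}_j) \\
\Delta w_{ji} &= -\eta\, \frac{\partial J}{\partial w_{ji}} = \eta\, \delta_j\, y_i
\end{aligned}
```

The quantity $\delta_j$ is often called the sensitivity of neuron $j$: it combines the output error with the slope of the transfer function at the current activation.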

## Error correction on the hidden layer(s)

- Applying the chain rule once more [equations not preserved in the transcript]
- Finally the update rule becomes [equation not preserved in the transcript]
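The sketch below shows how the error can be propagated from the output layer to one hidden layer. It is an illustration under assumptions, not the slide's own formulas: `tanh` units, the error $\tfrac{1}{2}\sum_k (t_k - z_k)^2$, bias weights stored first in each row, and the hypothetical function names `forward`, `error` and `gradients`.

```python
import math

def forward(x, W_h, W_o):
    """Return augmented inputs, augmented hidden outputs and network outputs."""
    xa = [1.0] + list(x)
    y = [math.tanh(sum(w * v for w, v in zip(row, xa))) for row in W_h]
    ya = [1.0] + y
    z = [math.tanh(sum(w * v for w, v in zip(row, ya))) for row in W_o]
    return xa, ya, z

def error(x, t, W_h, W_o):
    """Half the summed squared error for one pattern."""
    _, _, z = forward(x, W_h, W_o)
    return 0.5 * sum((tk - zk) ** 2 for tk, zk in zip(t, z))

def gradients(x, t, W_h, W_o):
    """Return (dJ/dW_h, dJ/dW_o) for one pattern via back-propagation."""
    xa, ya, z = forward(x, W_h, W_o)
    # Output sensitivities: (t_k - z_k) * f'(net_k), with f' = 1 - z^2 for tanh
    d_out = [(tk - zk) * (1.0 - zk * zk) for tk, zk in zip(t, z)]
    # Hidden sensitivities: f'(net_j) * sum over k of w_kj * delta_k
    d_hid = []
    for j in range(1, len(ya)):  # index 0 is the constant bias unit
        back = sum(W_o[k][j] * d_out[k] for k in range(len(d_out)))
        d_hid.append((1.0 - ya[j] ** 2) * back)
    g_o = [[-dk * v for v in ya] for dk in d_out]
    g_h = [[-dj * v for v in xa] for dj in d_hid]
    return g_h, g_o

# Illustrative weights and pattern (made-up values)
W_h = [[0.1, 0.2, -0.3], [0.0, -0.1, 0.4]]
W_o = [[0.2, -0.5, 0.3], [-0.1, 0.4, 0.1]]
g_h, g_o = gradients([0.5, -1.0], [1.0, -1.0], W_h, W_o)
```

A weight update then subtracts the learning rate times the corresponding gradient entry. Comparing a few entries against finite differences of `error` is a simple way to check such an implementation.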

## Learning algorithm

- The learning process starts with randomly initialized weights
- The weights are adjusted iteratively, pattern by pattern, using the training set
  - the pattern is presented to the network and the feed-forward activation is computed
  - the output error is computed
  - the error is used to update the weights in reverse order, from the output layer to the hidden layers
- The process is repeated until a quality criterion is reached
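The steps above can be sketched as a small training loop. Everything specific here is an assumption made for illustration: one hidden layer of `tanh` units, targets of +1 and -1, the initialization range, the learning rate, the stopping threshold, the function name `train_mlp` and the toy data set.

```python
import math
import random

def train_mlp(samples, n_hidden, n_classes, eta=0.1, max_epochs=500,
              target_error=0.05, seed=0):
    """Train a one-hidden-layer MLP by per-pattern back-propagation.

    samples is a list of (features, class_index) pairs. Returns the weights and
    the error summed over each epoch (measured before each weight update).
    """
    rng = random.Random(seed)
    n_in = len(samples[0][0])
    # Random initialization (the range [-0.5, 0.5] is an arbitrary choice)
    W_h = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)] for _ in range(n_hidden)]
    W_o = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)] for _ in range(n_classes)]
    history = []
    for _ in range(max_epochs):
        total = 0.0
        for x, label in samples:
            t = [1.0 if k == label else -1.0 for k in range(n_classes)]
            # 1. feed-forward activation
            xa = [1.0] + list(x)
            ya = [1.0] + [math.tanh(sum(w * v for w, v in zip(row, xa))) for row in W_h]
            z = [math.tanh(sum(w * v for w, v in zip(row, ya))) for row in W_o]
            # 2. output error
            total += 0.5 * sum((tk - zk) ** 2 for tk, zk in zip(t, z))
            # 3. sensitivities, computed from the output layer backwards
            d_out = [(tk - zk) * (1.0 - zk * zk) for tk, zk in zip(t, z)]
            d_hid = [(1.0 - ya[j] ** 2) * sum(W_o[k][j] * d_out[k] for k in range(n_classes))
                     for j in range(1, n_hidden + 1)]
            # 4. weight updates: eta * sensitivity * input of the connection
            for k in range(n_classes):
                for j in range(n_hidden + 1):
                    W_o[k][j] += eta * d_out[k] * ya[j]
            for j in range(n_hidden):
                for i in range(n_in + 1):
                    W_h[j][i] += eta * d_hid[j] * xa[i]
        history.append(total)
        if total < target_error:  # quality criterion (threshold chosen arbitrarily)
            break
    return W_h, W_o, history

# Toy data (made up): class 0 has a negative first feature, class 1 a positive one
samples = [([-1.0, 0.5], 0), ([-0.5, -1.0], 0), ([0.5, 1.0], 1), ([1.0, -0.5], 1)]
W_h, W_o, history = train_mlp(samples, n_hidden=2, n_classes=2)
print(history[0], history[-1])
```

Updating after every pattern, as done here, is only one option; accumulating the updates over the whole training set before applying them is an equally common variant.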

## Risk of overfitting

- Minimizing the global error over all training samples tends to produce overfitting
- To avoid overfitting, the best strategy is to keep the weights that minimize the global error on a validation set, which is independent of the training set
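One common way to apply this strategy is early stopping: the validation error is monitored after every epoch, and training stops once it no longer improves. The sketch below is an assumption about how this could be done, not a procedure described in the slides; the function name, the `patience` parameter and the error values are all made up for illustration.

```python
def early_stopping_epoch(validation_errors, patience=3):
    """Return the epoch whose weights should be kept.

    Scans the validation errors in order and stops once `patience` consecutive
    epochs have failed to improve on the best error seen so far.
    """
    best_epoch, best_error, waited = 0, float("inf"), 0
    for epoch, err in enumerate(validation_errors):
        if err < best_error:
            best_epoch, best_error, waited = epoch, err, 0
        else:
            waited += 1
            if waited >= patience:
                break
    return best_epoch

# Made-up validation errors: they fall, then rise again as overfitting sets in
errors = [5.0, 3.0, 2.0, 2.5, 2.6, 2.7, 2.9]
print(early_stopping_epoch(errors))  # 2
```

In a real training loop the weights from the best epoch would be saved as they occur, so that they can be restored when training stops.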

## JavaNNS

- JavaNNS is an interactive software framework for experimenting with artificial neural networks
  - it has been developed at the University of Tübingen
  - it is based on SNNS, an efficient ANN kernel written in C
- It supports the following features
  - multiple topologies (MLP, dynamic networks, ...)
  - various transfer functions
  - various learning strategies
  - network pruning
  - ...

## Font recognition with JavaNNS

- Original neural network with 9 hidden units

## Pruned neural network for font recognition

- Neural network obtained after pruning
