Document Analysis: Artificial Neural Networks
Prof. Rolf Ingold, University of Fribourg
Master course, spring semester 2008
© Prof. Rolf Ingold 2 Outline
© Prof. Rolf Ingold 3 Biological neurons
Artificial neural networks are inspired by the biological neurons of the central nervous system:
- each neuron is connected to many other neurons
- information is transmitted via synapses (an electro-chemical process)
- a neuron receives input from its dendrites and transmits output via its axon to the synapses
© Prof. Rolf Ingold 4 Biological vs artificial networks
                      biological neural network    artificial neural network
processing            chemical                     mathematical function
transmission time     relatively slow              very fast
number of neurons     approx. 10^10                max. 10^4 to 10^6
number of synapses    approx. 10^13                up to 10^8
© Prof. Rolf Ingold 5 Artificial neuron
- A neuron receives input signals x_1, ..., x_n
- These signals are multiplied by synaptic weights w_1, ..., w_n, which can be positive or negative
- The resulting activation a = Σ_i w_i x_i is passed to a non-linear function f with threshold w_0
- The output signal y = f(a - w_0) is then propagated to other neurons
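A minimal Python sketch of this computation (not from the course material; tanh is chosen here as one possible non-linear function, and the name neuron_output is illustrative):

```python
import math

def neuron_output(x, w, w0):
    """Single artificial neuron: weighted sum of inputs, threshold w0, non-linear transfer."""
    a = sum(wi * xi for wi, xi in zip(w, x))  # activation a = sum_i w_i * x_i
    return math.tanh(a - w0)                  # output y = f(a - w0)

# example: two input signals, one excitatory and one inhibitory weight
print(neuron_output(x=[0.5, 1.0], w=[0.8, -0.3], w0=0.1))
```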
© Prof. Rolf Ingold 6 Characteristics of artificial neural networks
Artificial neural networks may vary in several aspects:
- the topology of the network, i.e. the number of neurons, possibly organized in layers or classes
- how each neuron (of a given layer/class) is connected to its neighbors
- the transfer function used in each neuron
The use and the learning strategy have to be adapted accordingly
© Prof. Rolf Ingold 7 Topology of the neural network
The synaptic connections have a major influence on the behavior of the neural network. Two main categories can be considered:
- feed-forward networks, where each neuron propagates its output signal only to neurons that have not yet been used; as a special case, the multi-layer perceptron has a sequence of layers such that a neuron of one layer is connected only to neurons of the next layer
- dynamic networks, where neurons are connected without restrictions, in a cyclic way
© Prof. Rolf Ingold 8 Multi-layer perceptron
The multi-layer perceptron (MLP) has 3 (or more) layers:
- an input layer with one input neuron per feature
- one or several hidden layers, each having an arbitrary number of neurons connected to the previous layer
- an output layer with one neuron per class, each neuron being connected to the previous layer
Hidden and output layers can be completely or only partly connected.
The decision is in favor of the class corresponding to the highest output activation.
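One possible way to represent such a network is with one weight matrix per connection layer. A minimal sketch, assuming a single, fully connected hidden layer; the names init_mlp, W_hidden, W_output are illustrative, not from the course:

```python
import numpy as np

def init_mlp(n_features, n_hidden, n_classes, scale=0.1, seed=0):
    """Random initial weights for an MLP with one hidden layer.
    The input is augmented with a bias component, hence n_features + 1 columns."""
    rng = np.random.default_rng(seed)
    W_hidden = rng.uniform(-scale, scale, size=(n_hidden, n_features + 1))  # row j = weights w_ji
    W_output = rng.uniform(-scale, scale, size=(n_classes, n_hidden))       # row k = weights w_kj
    return W_hidden, W_output
```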
© Prof. Rolf Ingold 9 Impact of the hidden layer(s)
Networks with hidden layers can generate arbitrary decision boundaries; in principle a single hidden layer is already sufficient, so the number of hidden layers has no impact on the expressive power!
© Prof. Rolf Ingold 10 Feed-forward activation
As for the single perceptron, the feature space is augmented with a feature x_0 = 1 to take into account the bias w_0.
Each neuron j of a hidden layer computes an activation net_j = Σ_i w_ji x_i and outputs y_j = f(net_j).
Each neuron k of the output layer computes an activation net_k = Σ_j w_kj y_j and outputs z_k = f(net_k).
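A sketch of this feed-forward pass in Python, mirroring the notation above (tanh is assumed as transfer function; the name forward is illustrative):

```python
import numpy as np

def forward(x, W_hidden, W_output, f=np.tanh):
    """Feed-forward activation of an MLP with one hidden layer.

    x        : feature vector, already augmented with x[0] = 1 for the bias
    W_hidden : row j holds the weights w_ji of hidden neuron j
    W_output : row k holds the weights w_kj of output neuron k
    """
    net_hidden = W_hidden @ x   # net_j = sum_i w_ji * x_i
    y = f(net_hidden)           # hidden outputs y_j = f(net_j)
    net_output = W_output @ y   # net_k = sum_j w_kj * y_j
    z = f(net_output)           # output activations z_k = f(net_k)
    return y, z

# decision rule: the class with the highest output activation wins
# predicted_class = np.argmax(z)
```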
© Prof. Rolf Ingold 11 Transfer function
The transfer function f is supposed to be
- monotonically increasing, within the range [-1, +1]
- antisymmetric, i.e. f(-net) = -f(net)
- continuous and differentiable (for back-propagation)
Typical functions are the simple threshold and the sigmoid.
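The tanh function satisfies all three properties; a sketch of it together with its derivative, which is needed later for back-propagation (the course's exact sigmoid may differ, this is an assumed choice):

```python
import numpy as np

def f(net):
    """tanh transfer function: monotonically increasing, antisymmetric, values in [-1, +1]."""
    return np.tanh(net)

def f_prime(net):
    """Derivative of tanh, used by back-propagation: f'(net) = 1 - tanh(net)^2."""
    return 1.0 - np.tanh(net) ** 2
```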
© Prof. Rolf Ingold 12 Learning in a multi-layer perceptron
- Learning consists of setting the weights w, based on training samples
- The method is called back-propagation, because the training error is propagated recursively from the output layer back to the hidden and input layers
- The training error on a given pattern is defined as the squared difference between the desired output and the observed output, i.e. E = ½ Σ_k (t_k - z_k)²
- In practice, the desired output is +1 for the correct class and -1 (or sometimes 0) for all other classes
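A sketch of the target coding and the per-pattern error (the ½ factor is the conventional one that simplifies the gradient; function names are illustrative):

```python
import numpy as np

def target_vector(true_class, n_classes):
    """Desired output: +1 for the correct class, -1 for all other classes."""
    t = -np.ones(n_classes)
    t[true_class] = 1.0
    return t

def pattern_error(t, z):
    """Squared difference between desired output t and observed output z."""
    return 0.5 * np.sum((t - z) ** 2)
```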
© Prof. Rolf Ingold 13 Back-propagation of errors
The weight vectors are changed in the direction of the negative gradient of the error, Δw = -η ∂E/∂w, where η is the learning rate.
© Prof. Rolf Ingold 14 Error correction on the output layer
Since the error E does not directly depend upon w_kj, we apply the chain rule ∂E/∂w_kj = (∂E/∂net_k)(∂net_k/∂w_kj)
with ∂E/∂net_k = -(t_k - z_k) f'(net_k) and ∂net_k/∂w_kj = y_j.
Thus the update rule becomes Δw_kj = η δ_k y_j with δ_k = (t_k - z_k) f'(net_k).
© Prof. Rolf Ingold 15 Error correction on the hidden layer(s)
Applying the chain rule ∂E/∂w_ji = (∂E/∂y_j)(∂y_j/∂net_j)(∂net_j/∂w_ji)
with ∂E/∂y_j = -Σ_k δ_k w_kj, ∂y_j/∂net_j = f'(net_j) and ∂net_j/∂w_ji = x_i.
Finally the update rule becomes Δw_ji = η δ_j x_i with δ_j = f'(net_j) Σ_k w_kj δ_k.
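Putting the two update rules together, a self-contained sketch of one back-propagation step for a single pattern (assuming tanh as transfer function and the single-hidden-layer representation used above; the name backprop_step is illustrative):

```python
import numpy as np

def backprop_step(x, t, W_hidden, W_output, eta=0.1):
    """One forward + backward pass for pattern x with desired output t."""
    # forward pass (tanh transfer function)
    y = np.tanh(W_hidden @ x)                              # hidden outputs y_j = f(net_j)
    z = np.tanh(W_output @ y)                              # output activations z_k = f(net_k)
    # backward pass: sensitivities of output and hidden neurons
    delta_out = (t - z) * (1.0 - z ** 2)                   # delta_k = (t_k - z_k) f'(net_k)
    delta_hid = (1.0 - y ** 2) * (W_output.T @ delta_out)  # delta_j = f'(net_j) sum_k w_kj delta_k
    # gradient-descent weight updates
    W_output = W_output + eta * np.outer(delta_out, y)     # Δw_kj = η δ_k y_j
    W_hidden = W_hidden + eta * np.outer(delta_hid, x)     # Δw_ji = η δ_j x_i
    return W_hidden, W_output, z
```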
© Prof. Rolf Ingold 16 Learning algorithm
The learning process starts with randomly initialized weights.
The weights are adjusted iteratively using patterns from the training set:
- the pattern is presented to the network and the feed-forward activation is computed
- the output error is computed
- the error is used to update the weights in reverse order, from the output layer back to the hidden layers
The process is repeated until a quality criterion is reached.
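A sketch of the full loop, building on the backprop_step sketch above (max_epochs and tol are illustrative stopping criteria, not values from the course):

```python
import numpy as np

def train(X, labels, W_hidden, W_output, n_classes, eta=0.1, max_epochs=1000, tol=1e-3):
    """Pattern-by-pattern training, repeated until the global error is small enough."""
    for epoch in range(max_epochs):
        total_error = 0.0
        for x, c in zip(X, labels):
            t = -np.ones(n_classes)
            t[c] = 1.0                                      # desired output coding
            W_hidden, W_output, z = backprop_step(x, t, W_hidden, W_output, eta)
            total_error += 0.5 * np.sum((t - z) ** 2)       # accumulate training error
        if total_error < tol:                               # quality criterion reached
            break
    return W_hidden, W_output
```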
© Prof. Rolf Ingold 17 Risk of overfitting
Minimizing the global error over all training samples tends to produce overfitting.
To avoid overfitting, the best strategy is to minimize the global error on a validation set that is independent of the training set.
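A sketch of how the validation error could be monitored during training (assumes the same weight-matrix representation and tanh transfer function as in the sketches above):

```python
import numpy as np

def validation_error(X_val, labels_val, W_hidden, W_output, n_classes):
    """Global squared error on an independent validation set."""
    err = 0.0
    for x, c in zip(X_val, labels_val):
        t = -np.ones(n_classes)
        t[c] = 1.0
        z = np.tanh(W_output @ np.tanh(W_hidden @ x))   # feed-forward output
        err += 0.5 * np.sum((t - z) ** 2)
    return err

# During training, keep the weights with the lowest validation error and stop
# once that error starts to increase (early stopping).
```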
© Prof. Rolf Ingold 18 JavaNNS
JavaNNS is an interactive software framework for experimenting with artificial neural networks; it has been developed at the University of Tübingen.
It supports the following features:
- multiple topologies (MLP, dynamic networks, ...)
- different transfer functions
- different learning strategies
- network pruning
- ...
© Prof. Rolf Ingold 19 Font recognition with JavaNNS
Original neural network with 9 hidden units
© Prof. Rolf Ingold 20 Pruned neural network for font recognition
Neural network obtained after pruning