
1 Artificial Neural Networks An Introduction

2 Outline
– Introduction
– Biological and artificial neurons
– Perceptrons (and their problems)
– Backpropagation networks
– Training
– Other ANNs (examples in HEP)

3 Introduction – What are ANNs?
Artificial Neural Networks:
– data analysis / computational modelling tools
– model complex real-world problems
– structures composed of densely interconnected simple processing elements
– each element is linked to its neighbours with varying strengths
– learning is accomplished by adjusting these strengths so that the network outputs appropriate results
– learn from experience, rather than being explicitly programmed with rules
– inspired by biological neural networks (the idea is not to replicate the operation of biological systems, but to use what is known of their functionality to solve complex problems)

4 Information processing characteristics:
– nonlinearity (allows a better fit to the data)
– fault and failure tolerance (copes with uncertain data and measurement errors)
– learning and adaptivity (allows the system to update its internal structure in response to a changing environment)
– generalization (enables the model to be applied to unlearned data)
ANNs generally outperform other computational tools in solving a variety of problems:
– Pattern classification: categorizes a set of input patterns in terms of different features
– Clustering: clusters are formed by exploring the similarities between input patterns, based on their inter-correlations
– Function approximation: the ANN is trained to approximate the underlying rules relating the inputs to the outputs

5 Biological Neuron
Three major functional units:
– Dendrites
– Cell body
– Axon
Neurons connect to one another at synapses. The amount of signal passing through a neuron depends on:
– the intensity of the signal from the feeding neurons
– their synaptic strengths
– the threshold of the receiving neuron
Hebb rule (plays a key part in learning): a synapse which repeatedly triggers the activation of a postsynaptic neuron will grow in strength; others will gradually weaken. Neurons learn by adjusting the magnitudes of the synaptic strengths.
[Figure: neuron model with inputs x1 … xn, weights w1 … wn, summed input ξ and activation g(ξ), producing output y]

6 Artificial Neurons
– The basic computational entities of an ANN; analogous to biological neurons (connection weights represent synapses).
– In 1958 Rosenblatt introduced the mechanics (the perceptron).
– Input is mapped to output: y = g(∑_i w_i x_i)
– Only when the sum exceeds the threshold limit will the neuron fire; weights can enhance or inhibit the inputs.
– The collective behaviour of the neurons is what is interesting for intelligent data processing.
[Figure: inputs x1, x2, x3 with weights w1, w2, w3 feeding the sum ∑w·x and activation g]
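As a minimal sketch (not from the slides), the firing rule y = g(∑_i w_i x_i) with a step activation can be written as:

```python
def neuron(inputs, weights, threshold):
    # Weighted sum of the inputs; the neuron fires (outputs 1)
    # only when the sum exceeds the threshold.
    xi = sum(w * x for w, x in zip(weights, inputs))
    return 1 if xi > threshold else 0

# One enhancing (positive) and one inhibiting (negative) weight:
print(neuron([1, 1], [1.0, -0.5], 0.3))  # 1.0 - 0.5 = 0.5 > 0.3, so it fires: 1
```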

7 Perceptrons
– Can be trained on a set of examples using a special learning rule (process).
– Weights are changed in proportion to the difference (error) between the target output and the perceptron's output for each example.
– Minimize the summed square error function E = 1/2 ∑_p ∑_i (o_i^(p) − t_i^(p))², with respect to the weights.
– The error is a function of all the weights and forms an irregular, multidimensional, complex hypersurface with many peaks, saddle points and minima.
– The error is minimized by finding the set of weights that corresponds to the global minimum. This is done with the gradient descent method: weights are incrementally updated in proportion to ∂E/∂w_ij.
– The update reads w_ij(t + 1) = w_ij(t) + Δw_ij, with Δw_ij = −η ∂E/∂w_ij.
– The aim is to produce a true mapping for all patterns.
[Figure: inputs x_j with weights w_ij feeding summed input ξ, a threshold and activation g(ξ), producing outputs o_i]

8 Summary of Learning for Perceptron
1. Initialize w_ij with random values.
2. Repeat until w_ij(t + 1) ≈ w_ij(t):
– Pick a pattern p from the training set.
– Feed the input to the network and calculate the output.
– Update the weights according to w_ij(t + 1) = w_ij(t) + Δw_ij, where Δw_ij = −η ∂E/∂w_ij.
When no change (within some accuracy) occurs, the weights are frozen and the network is ready to use on data it has never seen.
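The loop above can be sketched in Python (an illustrative implementation, not from the slides); η is the learning rate, and w[0] plays the role of the threshold term, paired with a constant input of 1:

```python
import random

def train_perceptron(patterns, eta=0.1, epochs=100):
    """Perceptron learning: after each pattern, every weight is nudged in
    proportion to the error (t - o) on that pattern."""
    w = [random.uniform(-0.5, 0.5) for _ in range(3)]
    for _ in range(epochs):
        for x1, x2, t in patterns:
            o = 1 if w[0] + w[1] * x1 + w[2] * x2 > 0 else 0
            for i, x in enumerate((1, x1, x2)):
                w[i] += eta * (t - o) * x  # delta-rule weight update
    return w

AND = [(1, 1, 1), (1, 0, 0), (0, 1, 0), (0, 0, 0)]
w = train_perceptron(AND)
print([1 if w[0] + w[1] * x1 + w[2] * x2 > 0 else 0 for x1, x2, _ in AND])
# prints [1, 0, 0, 0]: the learned weights reproduce the AND rule
```

Because AND is linearly separable, the perceptron convergence theorem guarantees this loop settles on correct weights.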

9 Example

x1 x2 | t (AND) | t (OR)
 1  1 |    1    |   1
 1  0 |    0    |   1
 0  1 |    0    |   1
 0  0 |    0    |   0

A perceptron learns these rules easily (i.e. it sets appropriate weights and a threshold): w = (w_0, w_1, w_2) = (−1.5, 1.0, 1.0) for AND and (−0.5, 1.0, 1.0) for OR, where w_0 corresponds to the threshold term.
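The quoted weights can be checked directly (a small verification sketch):

```python
def perceptron(w, x1, x2):
    # w[0] is the threshold term w_0, paired with a constant input of 1.
    return 1 if w[0] + w[1] * x1 + w[2] * x2 > 0 else 0

w_and, w_or = (-1.5, 1.0, 1.0), (-0.5, 1.0, 1.0)
for x1, x2 in [(1, 1), (1, 0), (0, 1), (0, 0)]:
    print(x1, x2, perceptron(w_and, x1, x2), perceptron(w_or, x1, x2))
# 1 1 1 1
# 1 0 0 1
# 0 1 0 1
# 0 0 0 0
```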

10 Problems
– Perceptrons can only perform accurately with linearly separable classes (a linear hyperplane can place one class of objects on one side of the plane and the other class on the other side). The XOR function, for example, is not linearly separable, so a single perceptron cannot learn it.
– As a result, ANN research was put on hold for some 20 years.
– Solution: additional (hidden) layers of neurons, the MLP architecture, which is able to solve non-linear classification problems.
[Figure: a linearly separable problem in the (x1, x2) plane beside a non-separable one]
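The limitation can be demonstrated on XOR (an illustrative sketch reusing the perceptron delta rule; training never succeeds because no linear boundary separates the classes):

```python
import random

def train_and_test(patterns, eta=0.1, epochs=200):
    # Simple perceptron trained with the delta rule; w[0] is the threshold term.
    w = [random.uniform(-0.5, 0.5) for _ in range(3)]
    for _ in range(epochs):
        for x1, x2, t in patterns:
            o = 1 if w[0] + w[1] * x1 + w[2] * x2 > 0 else 0
            for i, x in enumerate((1, x1, x2)):
                w[i] += eta * (t - o) * x
    return [1 if w[0] + w[1] * x1 + w[2] * x2 > 0 else 0 for x1, x2, _ in patterns]

XOR = [(1, 1, 0), (1, 0, 1), (0, 1, 1), (0, 0, 0)]
print(train_and_test(XOR))  # never equals [0, 1, 1, 0]: no single line separates XOR
```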

11 MLPs
– The learning procedure is an extension of the simple perceptron algorithm.
– Response function: o_i = g(∑_j w_ij g(∑_k w_jk x_k))
– This is non-linear, so the network is able to perform non-linear mappings. (Theory tells us that a neural network with at least one hidden layer can approximate any continuous function to arbitrary accuracy.)
– A vast number of ANN types exist.
[Figure: inputs x_k, hidden units h_j, weight layers w_jk and w_ij, outputs o_i]
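The response function amounts to a forward pass through one hidden layer. A minimal sketch (the weights are illustrative, not trained, and the sigmoid is one common choice for g):

```python
import math

def sigmoid(xi):
    # A common choice for the activation g.
    return 1.0 / (1.0 + math.exp(-xi))

def mlp_output(x, w_jk, w_ij):
    """o_i = g(sum_j w_ij * g(sum_k w_jk * x_k)): input -> hidden -> output."""
    h = [sigmoid(sum(w * xk for w, xk in zip(row, x))) for row in w_jk]
    return [sigmoid(sum(w * hj for w, hj in zip(row, h))) for row in w_ij]

# 2 inputs, 2 hidden units, 1 output:
print(mlp_output([1.0, 0.0], [[0.5, -0.3], [0.2, 0.8]], [[1.0, -1.0]]))
```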

12 Backpropagation ANNs
– The most widely used type of network
– Feedforward
– Supervised (learns a mapping from one data space to another using examples)
– The error is propagated backwards
– Versatile: used for data modelling, classification, forecasting, data and image compression, and pattern recognition

13 BP Learning Algorithm
– Like the perceptron, it uses gradient descent to minimize the error (generalized to the case with hidden layers).
– Each iteration constitutes two sweeps: a forward sweep to compute the outputs, and a backward sweep to propagate the error.
– To minimize the error we need ∂E/∂w_ij, but we also need ∂E/∂w_jk (which we get using the chain rule).
– Training an MLP using BP can be thought of as a walk in weight space along an energy surface, trying to find the global minimum and avoiding local minima.
– Unlike for the perceptron, there is no guarantee that the global minimum will be reached, but in most cases the energy landscape is smooth.

14 Summary of BP learning algorithm
1. Initialize w_ij and w_jk with random values.
2. Repeat until w_ij and w_jk have converged or the desired performance level is reached:
– Pick a pattern p from the training set.
– Present the input and calculate the output.
– Update the weights according to w_ij(t + 1) = w_ij(t) + Δw_ij and w_jk(t + 1) = w_jk(t) + Δw_jk, where Δw = −η ∂E/∂w (and likewise for any extra hidden layers).
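The two sweeps and the weight updates can be sketched for a single hidden layer and a single output (an illustrative implementation with sigmoid activations; the variable names follow the slides' w_ij / w_jk notation, and the bias is supplied as a constant input appended to each pattern):

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def forward(x, w_jk, w_ij):
    # Forward sweep: hidden activations h_j, then the single output o.
    h = [sigmoid(sum(w * xi for w, xi in zip(row, x))) for row in w_jk]
    o = sigmoid(sum(w * hj for w, hj in zip(w_ij, h)))
    return h, o

def train_bp(patterns, n_hidden=2, eta=0.5, epochs=1000):
    random.seed(1)  # fixed seed so the sketch is reproducible
    n_in = len(patterns[0][0])
    w_jk = [[random.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_hidden)]
    w_ij = [random.uniform(-1, 1) for _ in range(n_hidden)]
    for _ in range(epochs):
        for x, t in patterns:
            h, o = forward(x, w_jk, w_ij)
            # Backward sweep: output delta, then hidden deltas via the chain rule.
            d_o = (o - t) * o * (1 - o)
            d_h = [d_o * w_ij[j] * h[j] * (1 - h[j]) for j in range(n_hidden)]
            for j in range(n_hidden):
                w_ij[j] -= eta * d_o * h[j]            # gradient step on w_ij
                for k in range(n_in):
                    w_jk[j][k] -= eta * d_h[j] * x[k]  # gradient step on w_jk
    return w_jk, w_ij

def sse(patterns, w_jk, w_ij):
    # Summed square error E = 1/2 * sum_p (o - t)^2 over all patterns.
    return sum(0.5 * (forward(x, w_jk, w_ij)[1] - t) ** 2 for x, t in patterns)

# OR, with a constant bias input appended to each pattern:
pats = [([1.0, 1.0, 1.0], 1), ([1.0, 0.0, 1.0], 1),
        ([0.0, 1.0, 1.0], 1), ([0.0, 0.0, 1.0], 0)]
w_jk, w_ij = train_bp(pats)
print(sse(pats, w_jk, w_ij))  # the error has been driven down by training
```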

15 Training
– Generalization: the network's performance on a set of test patterns it has never seen before (typically lower than the performance on the training set).
– The training set is used to let the ANN capture the features in the data, or the mapping.
– The initial large drop in error is due to learning, but the subsequent slow reduction is due to:
1. network memorization (too many training cycles used), or
2. overfitting (too many hidden nodes),
in which case the network learns the individual training examples and loses its ability to generalize.
[Figure: error (e.g. SSE) versus number of hidden nodes or training cycles, for the training and testing sets; the optimum network lies where the testing error is at its minimum]

16 Other Popular ANNs
– Some applications may be solved using a variety of ANN types, some only via a specific type (it depends on the logistics of the problem).
– Hopfield networks: optimization. Presented with an incomplete or noisy pattern, the network responds by retrieving the internally stored pattern it most closely resembles.
– Kohonen networks (self-organizing): trained in an unsupervised manner to form clusters in the data. Used for pattern classification and data compression.

17 HEP Applications
– ANNs are applied everywhere from off-line data analysis to low-level experimental triggers.
– Improved signal-to-background separation (BP networks), e.g. in flavour tagging and Higgs detection.
– Feature recognition problems in track finding (feedback networks).
– Function approximation tasks (feedback networks), e.g. reconstructing the mass of a decayed particle from calorimeter information.

18 References
– http://www.doc.ic.ac.uk/~nd/surprise_96.journal/vol4/cs11/report.html
– http://www.cs.stir.ac.uk/~lss/NNIntro/InvSlides.html
– Carsten Peterson and Thorsteinn Rognvaldsson, An Introduction to Artificial Neural Networks, LU TP 91-23, September 1991 (lectures given at the 1991 CERN School of Computing, Sweden).

