Slide 1 EE3J2 Data Mining Lecture 15: Introduction to Artificial Neural Networks Martin Russell

Slide 2 EE3J2 Data Mining Objectives
– Unsupervised and supervised learning
– Modelling and discrimination
– Introduction to Artificial Neural Networks (ANNs)

Slide 3 EE3J2 Data Mining Unsupervised learning
– So far we have looked at techniques which try to discover structure in 'raw' data – data with no information about classes:
  – Gaussian Mixture Modelling
  – Clustering
– We treat the whole data set as a single entity, and try to discover underlying structure
– The analysis is unsupervised, and automatic learning of the structure of the data is unsupervised learning

Slide 4 EE3J2 Data Mining Supervised learning
– In some cases additional information is available
– For example, for speech data we might know who was speaking, or what he or she said
– This is information about the class of each piece of data
– When the analysis is driven by class labels, it is called supervised learning

Slide 5 EE3J2 Data Mining Modelling and Discrimination
– In supervised learning we can:
  – analyse the data for each class separately
  – try to discover how to distinguish between classes
– We could apply GMMs or clustering separately to model each class
– Alternatively, we could try to find a method to discriminate between the classes

Slide 6 EE3J2 Data Mining Modelling and Discrimination
[Figure: the two approaches compared – separate class models versus a single decision boundary]

Slide 7 EE3J2 Data Mining Discrimination
– In the simplest cases we can discriminate between two classes using a class boundary
– Allocation of a point to a class depends on which side of the boundary it lies
[Figure: examples of a linear decision boundary and a non-linear decision boundary]

Slide 8 EE3J2 Data Mining Artificial Neural Networks
– There are many approaches to discrimination
– A common class of approaches is based on the idea of Artificial Neural Networks (ANNs)
– Inspiration for the basic element of an ANN (the artificial neuron) comes from biology…
– …but the analogy really stops there
– ANNs are just a computational device for processing patterns – not "artificial brains"

Slide 9 EE3J2 Data Mining A model of a neuron

Slide 10 EE3J2 Data Mining An Artificial Neuron
– Simple artificial neuron
– Basic idea:
  – if the input to unit u4 is big enough, then the neuron 'fires'
  – otherwise nothing happens
– How do we calculate the input to u4?
[Figure: inputs i1, i2, i3 connect to unit u4 via weights w_{1,4}, w_{2,4}, w_{3,4}]

Slide 11 EE3J2 Data Mining Artificial Neuron (2)
– Suppose that the inputs to units 1, 2 and 3 are i1, i2 and i3
– Then the input to u4 is: w_{1,4} i_1 + w_{2,4} i_2 + w_{3,4} i_3
– In general, for an artificial neuron with N input units, the input to unit k is: input_k = \sum_{n=1}^{N} w_{n,k} i_n
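
To make the weighted sum concrete, here is a minimal Python sketch of this calculation (the function name and example values are illustrative, not from the slides):

```python
# Input to unit k: each input multiplied by its connection weight, then summed.
def unit_input(inputs, weights):
    """Computes sum_n w_{n,k} * i_n for one unit."""
    return sum(i * w for i, w in zip(inputs, weights))

# Three inputs i1, i2, i3 with weights w_{1,4}, w_{2,4}, w_{3,4} into unit u4.
print(unit_input([1.0, 0.5, -1.0], [0.2, 0.4, 0.1]))  # 0.2 + 0.2 - 0.1 = 0.3
```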

Slide 12 EE3J2 Data Mining The 'threshold' activation function
– The activation function decides whether the neuron should 'fire'
– A suitable activation function is the threshold function g: g(x) = 1 if x ≥ 0, and g(x) = 0 otherwise
– The output of u4 is then: g(w_{1,4} i_1 + w_{2,4} i_2 + w_{3,4} i_3)
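
A sketch of the complete threshold unit in Python, assuming (as slide 15 states) that the unit fires when its input is at least 0:

```python
def threshold(x):
    """Threshold activation: output 1 ('fire') when x >= 0, else 0."""
    return 1.0 if x >= 0 else 0.0

def unit_output(inputs, weights, g=threshold):
    # Weighted sum of the inputs, passed through the activation function g.
    return g(sum(i * w for i, w in zip(inputs, weights)))
```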

Slide 13 EE3J2 Data Mining Other activation functions
– Linear: g(x) = x
– Sigmoid: g(x) = 1 / (1 + e^{-x})
[Figure: the sigmoid activation function]
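
The two functions in Python; the value of sigmoid(-10) matches the figure quoted on slide 18:

```python
import math

def linear(x):
    return x  # output equals the weighted-sum input

def sigmoid(x):
    # Smooth, differentiable 'squashing' function mapping any real x into (0, 1).
    return 1.0 / (1.0 + math.exp(-x))

print(sigmoid(0.0))    # 0.5
print(sigmoid(-10.0))  # ~4.54e-05
```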

Slide 14 EE3J2 Data Mining The 'bias'
– As described, the neuron will 'fire' only if its input is greater than or equal to 0
– We can change the point of firing by introducing a bias
– This is an additional input unit whose input is fixed at 1
[Figure: unit u4 with inputs i1, i2, i3, weights w_{1,4}, w_{2,4}, w_{3,4}, and a bias input fixed at 1 with weight w_{b,4}]

Slide 15 EE3J2 Data Mining How the bias works…
– The artificial neuron 'fires' if the input to u4 is greater than or equal to 0
– i.e. if w_{1,4} i_1 + w_{2,4} i_2 + w_{3,4} i_3 + w_{b,4} ≥ 0
– But this happens only if w_{1,4} i_1 + w_{2,4} i_2 + w_{3,4} i_3 ≥ -w_{b,4}
– Or, equivalently, the firing threshold has moved from 0 to -w_{b,4}
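
One way to see this in code: treat the bias as an ordinary weight attached to a constant input of 1 (a sketch, with illustrative names):

```python
def unit_output(inputs, weights, bias_weight, g):
    # The constant input 1 multiplied by bias_weight shifts the firing point:
    # the unit now fires when the ordinary weighted sum reaches -bias_weight.
    net = sum(i * w for i, w in zip(inputs, weights)) + 1.0 * bias_weight
    return g(net)
```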

Slide 16 EE3J2 Data Mining Example (2D)
– Suppose u has a threshold or sigmoid activation function
– u will 'fire' if: 3x + y - 2 ≥ 0
[Figure: unit u with inputs x and y, weights 3 and 1, and a bias input of 1 with weight -2]

Slide 17 EE3J2 Data Mining Example (continued)
[Figure: the network drawn with input units u1, u2, bias unit u3 and output unit u4; the decision boundary 3x + y - 2 = 0 crosses the x-axis at 2/3 and the y-axis at 2]

Slide 18 EE3J2 Data Mining Example (continued)
– Assume:
  – linear activation functions for units u1, u2 and u3
  – sigmoid activation function for u4
– If the input to u1 is 2 and the input to u2 is 2, then:
  – input to u4 is 2 × 3 + 2 × 1 + 1 × (-2) = 6
  – hence the output from u4 is g(6) ≈ 0.9975
– If the input to u1 is -2 and the input to u2 is -2, then:
  – input to u4 is -2 × 3 + (-2) × 1 + 1 × (-2) = -10
  – hence the output from u4 is g(-10) ≈ 4.54 × 10^-5
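
A short check of these numbers in Python, using the sigmoid from slide 13:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

weights, bias = [3.0, 1.0], -2.0  # the unit from slide 16

for x1, x2 in [(2.0, 2.0), (-2.0, -2.0)]:
    net = weights[0] * x1 + weights[1] * x2 + bias * 1.0
    print(net, sigmoid(net))
# (2, 2)   -> net = 6.0,   g(6)   ~ 0.9975
# (-2, -2) -> net = -10.0, g(-10) ~ 4.54e-05
```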

Slide 19 EE3J2 Data Mining Example 2
[Figure: a second unit with inputs x and y, weights 2 and -1, and bias -1 (this is the unit u5 of slide 22); its decision boundary 2x - y - 1 = 0 crosses the x-axis at 1/2]

Slide 20 EE3J2 Data Mining Combining 2 Artificial Neurons
[Figure: the two example units side by side – the boundary 3x + y - 2 = 0 (intercepts 2/3 and 2) and the boundary 2x - y - 1 = 0 (x-intercept 1/2)]

Slide 21 EE3J2 Data Mining Combining neurons – artificial neural networks
[Figure: network with input units u1 (x) and u2 (y); hidden units u4 (weights 3, 1, bias -2) and u5 (weights 2, -1, bias -1) feed output unit u6 (weights 20 and -20, bias -2)]

Slide 22 EE3J2 Data Mining Combining neurons
– Input to u4 is 3 × x + 1 × y - 2
– Input to u5 is 2 × x + (-1) × y - 1
– When x = 3, y = 0:
  – input to u4 is 7, input to u5 is 5
  – output from u4 is 1, output from u5 is 0.99
  – input to u6 is 1 × 20 + 0.99 × (-20) - 2 = -1.8
  – output from u6 is 0.13
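
The whole computation written out as a forward pass in Python, with the weights as on slide 21:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def forward(x, y):
    o4 = sigmoid(3 * x + 1 * y - 2)        # hidden unit u4
    o5 = sigmoid(2 * x - 1 * y - 1)        # hidden unit u5
    return sigmoid(20 * o4 - 20 * o5 - 2)  # output unit u6

print(forward(3.0, 0.0))  # ~0.13, as computed above
```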

Slide 23 EE3J2 Data Mining Outputs
[Figure: the network output o6 tabulated/plotted as a function of the inputs i1 and i2]

Slide 24 EE3J2 Data Mining Combining neurons
[Figure: the 'firing region' of the combined network, bounded by the two linear decision boundaries (intercepts 2/3 and 2)]

Slide 25 EE3J2 Data Mining Single-layer Multi-Layer Perceptron (MLP)
[Figure: MLP with an input layer, one hidden layer and an output layer]

Slide 26 EE3J2 Data Mining Single-Layer MLP
– Can characterise arbitrary convex regions
– Defines the region using linear decision boundaries

Slide 27 EE3J2 Data Mining Two-layer MLP
[Figure: MLP with two hidden layers]

Slide 28 EE3J2 Data Mining Two-Layer MLP
– An MLP with two hidden layers can characterise arbitrary shapes
– The first hidden layer characterises convex regions
– The second hidden layer combines these convex regions
– There is no advantage in having more than two hidden layers

Slide 29 EE3J2 Data Mining MLP training
– To define an MLP we must decide:
  – the number of layers
  – the number of input units
  – the number of hidden units
  – the number of output units
– Once these are fixed, the properties of the MLP are completely determined by the values of the weights
– How do we choose the weight values? (See the sketch below for how the layer sizes fix the number of weights to be chosen.)
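
As a small illustration, once the layer sizes are chosen the number of weights is fixed (a sketch; the sizes are arbitrary example values):

```python
# Layer sizes: 2 inputs, 3 hidden units, 1 output (arbitrary example values).
sizes = [2, 3, 1]

# Each layer is fully connected to the next, plus one bias weight per unit.
n_weights = sum((n_in + 1) * n_out for n_in, n_out in zip(sizes, sizes[1:]))
print(n_weights)  # (2+1)*3 + (3+1)*1 = 13
```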

Slide 30 EE3J2 Data Mining MLP training (continued)
– MLP weights are learnt automatically from training data
– We have already seen computational techniques for estimating:
  – the parameters of GMMs
  – centroid positions in clustering
– Similarly, there is an iterative computational technique for estimating MLP weights: error back-propagation

Slide 31 EE3J2 Data Mining Error back-propagation (EBP)
– EBP is a 'gradient descent' method, like others we have seen
– The first stage is to choose initial values for the weights
– The EBP algorithm then changes the weights incrementally to identify the class boundaries
– It is only guaranteed to find a local optimum
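
The full derivation is beyond this lecture, but the flavour of the incremental update can be shown for a single sigmoid unit. This is a minimal sketch of the gradient-descent step on a made-up toy task (learning logical AND), not the course's exact algorithm:

```python
import math, random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# Toy training data: inputs (x1, x2) with class labels for logical AND.
data = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
w = [random.uniform(-0.5, 0.5) for _ in range(3)]  # two weights plus a bias weight
eta = 0.5  # learning rate (step size of the gradient descent)

for epoch in range(5000):
    for (x1, x2), target in data:
        inputs = (x1, x2, 1.0)  # bias handled as a constant input of 1
        out = sigmoid(sum(i * wi for i, wi in zip(inputs, w)))
        # Gradient of the squared error with respect to the unit's net input.
        delta = (target - out) * out * (1.0 - out)
        # Move each weight a small step downhill in the error.
        w = [wi + eta * delta * i for wi, i in zip(w, inputs)]

for (x1, x2), _ in data:
    out = sigmoid(sum(i * wi for i, wi in zip((x1, x2, 1.0), w)))
    print((x1, x2), round(out, 2))  # outputs approach 0, 0, 0, 1
```

In a multi-layer network, EBP applies the same idea layer by layer: the error gradient at the output units is propagated backwards through the weights to give an update for every hidden unit as well.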

Slide 32 EE3J2 Data Mining Other types of ANN
– Multi-Layer Perceptrons (MLPs) are not the only type of ANN
– There are many others:
  – Radial Basis Function (RBF) networks
  – Support Vector Machines (SVMs)
  – …
– There are also ANN interpretations of other methods

Slide 33 EE3J2 Data Mining Summary
– Discrimination versus modelling
– Brief introduction to neural networks
– Definition of an 'artificial neuron'
– Activation functions – linear and sigmoid
– Linear boundary defined by a single neuron
– Convex region defined by a single-layer MLP
– Two-layer MLPs