
Neural Networks Lecture 4 out of 4

Practical Considerations: Input, Architecture, Output

Multilayer Perceptron Architecture

Input

What is the input to a neural net? Binary or real-numbered strings. But how do we represent real-world input in that form? XOR needs only two (usually binary) inputs; how would you represent an image, a voice, or a document?

Input: Features
– Extraction: how to get features out of the input.
– Selection: which features to choose.
– Representation: how to represent these features as binary or real-valued strings.

Example: Audio Input
E.g. voice recognition:
– amplitude (loudness)
– pitch (frequency)
The frequency 'signature' of a voice can be represented as a sequence of numbers.

Example: Time Series Data
E.g. stock market prediction:
– Use 'past performance' figures (caveat: past performance is not a guarantee of future performance).
– Use a 'sliding window' technique: each day the ANN predicts tomorrow's performance (as sketched below). This often uses a different architecture (recurrent connections).
– ANNs are now a standard part of the analysis toolkits, and so their effect has been 'factored into' the market.
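A minimal sketch (not from the lecture) of the sliding-window idea: each training example takes a few past values as the input and the next value as the target. The window size and the toy price list are illustrative assumptions.

```python
# Sliding-window examples for time-series prediction (illustrative sketch).
def sliding_windows(series, window=3):
    """Each input is `window` past values; the target is the next value."""
    inputs, targets = [], []
    for i in range(len(series) - window):
        inputs.append(series[i:i + window])
        targets.append(series[i + window])
    return inputs, targets

prices = [101.2, 102.5, 101.8, 103.0, 104.1, 103.6, 105.2]  # toy daily figures
X, y = sliding_windows(prices, window=3)
print(X[0], "->", y[0])   # [101.2, 102.5, 101.8] -> 103.0
```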

Example: Image Processing
E.g. face recognition. (Face recognition, 'is it you?', is easier than face classification, 'who is it?')
– Naïve approach: bitmap values for each pixel.
– Features: edges, corners, even shapes.
– Markers: positions of eyes, mouths, faces (in photos).
There are lots of ways of turning an image into a string of real numbers.
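A sketch of the naïve approach above, assuming NumPy: flatten a tiny greyscale image into a string of real numbers, one per pixel, scaled to the range 0 to 1. The 3x3 'image' is invented for illustration.

```python
import numpy as np

# Toy 3x3 greyscale image; a real image would be far larger.
image = np.array([[  0,  64, 128],
                  [255, 200,  30],
                  [ 10,  90, 250]], dtype=float)

features = (image / 255.0).flatten()   # one real number per pixel, in [0, 1]
print(features)
```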

Example: Text Processing
E.g. document classification (news recommender systems). Scan the document looking for interesting features:
– Words: a dictionary of interesting words.
– Entities: people or places you care about.
– Concepts: ideas and themes, e.g. political stories, sports news.
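A sketch of the 'dictionary of interesting words' idea: a document becomes a binary feature vector, one position per dictionary word. The dictionary and the sentence are invented examples.

```python
dictionary = ["election", "goal", "minister", "match", "vote"]
document = "the minister called an election and urged everyone to vote"

words = set(document.split())
features = [1 if w in words else 0 for w in dictionary]   # binary presence/absence
print(features)   # [1, 0, 1, 0, 1]
```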

Input: Features
– Extraction: how to get features out of the input.
– Selection: which features to choose.
– Representation: how to represent these features as binary or real-valued strings.

Feature Extraction
– Audio: Fourier transforms represent a sound as a series of frequencies and their relative importance. Question: what 'granularity' do we choose?
– Time Series Data: use 'past performance' figures over a certain time period (day? minute? second?).
– Images: preprocessing to extract features like eyes; it is easier to detect simple things like edges.
– Documents: it is easy to extract words from a document!
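A sketch of the audio case, assuming NumPy: a Fourier transform turns a signal into frequency magnitudes, and the 'granularity' question becomes how many frequency bins to keep as features. The synthetic two-tone signal stands in for real audio.

```python
import numpy as np

sample_rate = 8000                                  # samples per second
t = np.arange(0, 1.0, 1.0 / sample_rate)
signal = np.sin(2 * np.pi * 440 * t) + 0.5 * np.sin(2 * np.pi * 880 * t)

spectrum = np.abs(np.fft.rfft(signal))              # magnitude of each frequency
freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate)

n_bins = 32                                         # the 'granularity' choice
features = spectrum[:n_bins]
print(freqs[:n_bins], features)
```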

Feature Selection
Which features are most important?
– Audio: which frequencies?
– Time Series Data: what time period?
– Images: which are the 'most important' features (eye positions? inter-eye distance? gaze direction?)
– Documents: the most 'informative' words (words like 'and' are probably not that useful): stopwords, information gain/entropy.
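A sketch of scoring a word by information gain, i.e. how much knowing whether the word is present reduces the entropy of the class labels. The tiny corpus and labels are invented for illustration.

```python
import math

def entropy(labels):
    """Shannon entropy of a list of class labels."""
    if not labels:
        return 0.0
    total = len(labels)
    return -sum((labels.count(c) / total) * math.log2(labels.count(c) / total)
                for c in set(labels))

def information_gain(docs, labels, word):
    """Entropy reduction from splitting documents on the word's presence."""
    with_word = [lab for d, lab in zip(docs, labels) if word in d]
    without = [lab for d, lab in zip(docs, labels) if word not in d]
    p = len(with_word) / len(labels)
    return entropy(labels) - (p * entropy(with_word) + (1 - p) * entropy(without))

docs = [{"election", "vote"}, {"goal", "match"}, {"vote", "poll"}, {"cup", "match"}]
labels = ["politics", "sport", "politics", "sport"]
print(information_gain(docs, labels, "vote"))   # 1.0: perfectly separates the classes
print(information_gain(docs, labels, "the"))    # 0.0: uninformative here
```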

Feature Representation
Binary or real? Scaling, e.g. between 0 and 1?
– Audio: normalise each pitch contribution.
– Time Series Data: normalise magnitudes; can also look at differences rather than absolute levels.
– Images: binary (presence or absence of a feature) or real-numbered (e.g. size, colour or illumination).
– Documents: binary for presence or absence of a word/entity/concept, or real-numbered for (normalised) frequency. A standard measure is TF-IDF.
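A minimal TF-IDF sketch (term frequency times inverse document frequency) over an invented three-document corpus; real systems would also handle stopwords and normalisation more carefully.

```python
import math

corpus = [
    ["the", "match", "ended", "in", "a", "draw"],
    ["the", "election", "result", "was", "a", "surprise"],
    ["the", "cup", "match", "was", "postponed"],
]

def tf_idf(term, doc, corpus):
    tf = doc.count(term) / len(doc)            # term frequency in this document
    df = sum(1 for d in corpus if term in d)   # documents containing the term
    idf = math.log(len(corpus) / df)           # rarer terms score higher
    return tf * idf

print(tf_idf("match", corpus[0], corpus))   # > 0: 'match' is fairly distinctive
print(tf_idf("the", corpus[0], corpus))     # 0.0: 'the' appears in every document
```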

Middle Layer

Multilayer Perceptron Architecture

Architecture basically means how many hidden-layer nodes to use.
– Input units are fixed (once you have decided on features).
– Output units are fixed (once you have decided on the output; see the next section).
– Most networks are feedforward and fully connected (changing this means changing the network type).
What is left to choose is the middle layer.
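A sketch, assuming NumPy and random (untrained) weights, of a fully connected feedforward network in which the one genuine architectural decision is the number of hidden units; the layer sizes here are arbitrary.

```python
import numpy as np

n_inputs, n_hidden, n_outputs = 4, 8, 1        # n_hidden is the design decision

rng = np.random.default_rng(0)
W1 = rng.normal(size=(n_inputs, n_hidden))     # input -> hidden weights
W2 = rng.normal(size=(n_hidden, n_outputs))    # hidden -> output weights

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(x):
    hidden = sigmoid(x @ W1)                   # hidden-layer activations
    return sigmoid(hidden @ W2)                # output-layer activation

print(forward(np.array([0.2, 0.7, 0.1, 0.9])))
```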

Overfitting and Underfitting
What a neural network does, under the hood, is compute a function.
– It is difficult to work out what this function is, but in most cases we will not actually care; we just want to know what the output will be for any given input.
Underfitting is fitting too simple a function, e.g. a straight line where you actually needed a curve. Overfitting is fitting too complex a function, e.g. a very wavy line where you just needed a simple curve without twists and turns. These two faults show up in different ways.

Training and Test Data
Training data: data for which we already know the answer (expected output). We can therefore work out the error and apply backpropagation to change the weights.
Test data: we also know the answer (expected output), but we do not show it to the network during training. After training, 'freeze' the weights and try the network on the test data to see how well it generalises to unseen data.
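A sketch of holding data back for testing: train on one part, keep the rest unseen until the weights are frozen. The 80/20 split is an assumed proportion, and the (input, expected output) pairs are placeholders.

```python
def train_test_split(examples, test_fraction=0.2):
    """Keep the last `test_fraction` of the examples hidden during training."""
    n_test = max(1, int(len(examples) * test_fraction))
    return examples[:-n_test], examples[-n_test:]

data = [([x / 100.0], [x / 50.0]) for x in range(100)]   # (input, expected output) pairs
train_data, test_data = train_test_split(data)
print(len(train_data), len(test_data))                    # 80 20
```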

How do over- and underfitting neural networks do on training and test data? Underfitting networks perform poorly on both the training and test sets. Overfitting networks may do very well on the training set but terribly on the test set. Underfitting networks tend to have too few hidden units; overfitting networks have too many. Some learning regimes allow the number of hidden units to be changed during training.

Validation Data
A set of data that, like the test set, is 'hidden' from the network during training. From time to time during training, we try the network out on the validation data. If the validation error starts increasing while the training error is still decreasing, we have started overfitting and should stop training. We may also allow the network to add (or remove) hidden units during this process.
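A sketch of this stopping rule, written around two hypothetical callables: train_step() runs one epoch of backpropagation and val_error() returns the current validation error. The patience threshold is an assumption, not something specified in the lecture.

```python
def train_with_early_stopping(train_step, val_error, max_epochs=1000, patience=3):
    """Stop when the validation error has not improved for `patience` epochs."""
    best = float("inf")
    bad_epochs = 0
    for _ in range(max_epochs):
        train_step()                       # one epoch of weight updates
        err = val_error()
        if err < best:
            best, bad_epochs = err, 0
        else:
            bad_epochs += 1
            if bad_epochs >= patience:
                break                      # validation error rising: stop training
    return best

# Demo with a canned error sequence standing in for a real network.
errors = iter([0.9, 0.7, 0.6, 0.61, 0.62, 0.63, 0.64])
print(train_with_early_stopping(lambda: None, lambda: next(errors)))   # 0.6
```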

Output

Multilayer Perceptron Architecture

Classification vs Regression
Classification: putting data into 'classes'.
– 'Whose face is this?'
– 'Is this an interesting web page?'
– 'Is this an example of fraud?'
Regression: function approximation.
– If I adjust the dials like this, what will the result be?
– If I change the interest rate, what will happen to the economy?
For classification, output units produce a '1' or a '0', i.e. a threshold. For regression, output units produce a real number; typically they are linear, so the output is exactly the weighted sum of the inputs.
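A sketch, assuming NumPy, of the two kinds of output unit just described: a thresholded unit for classification and a linear unit for regression, both computed from the same weighted sum. The weights and input are arbitrary.

```python
import numpy as np

weights = np.array([0.4, -0.2, 0.7])
x = np.array([1.0, 0.5, 0.3])
weighted_sum = float(weights @ x)

classification_output = 1 if weighted_sum > 0 else 0   # thresholded unit
regression_output = weighted_sum                        # linear unit: just the weighted sum

print(classification_output, regression_output)         # 1 0.51
```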

Summary
Input:
– Feature Extraction
– Feature Selection
– Feature Representation
Architecture:
– Training, Test and Validation data
– Number of hidden units
Output:
– Classification
– Regression

Summary of Module
– Neural networks are nature-inspired: units connected by weights that can be changed.
– Used for classification and regression tasks.
– Long history: logical calculus, perceptrons.
– The problem of linear separability (overcome by multilayer networks trained with backpropagation).
– Pragmatics: the main task is input processing.

Next Up…
Other nature-inspired techniques:
– Evolutionary Algorithms
– Swarms