
1 Machine Learning: Lecture 4 Artificial Neural Networks (Based on Chapter 4 of Mitchell T., Machine Learning, 1997)

2 What is an Artificial Neural Network?
§ It is a formalism for representing functions, inspired by biological systems and composed of parallel computing units, each of which computes a simple function.
§ Some useful computations taking place in feedforward multilayer neural networks are:
l Summation
l Multiplication
l Thresholding (e.g., g(x) = 1/(1+e^{-x}), the sigmoidal threshold function). Other functions are also possible.

3 Biological Motivation
Biological learning systems are built of very complex webs of interconnected neurons. The information-processing abilities of biological neural systems must follow from highly parallel processes operating on representations that are distributed over many neurons. ANNs attempt to capture this mode of computation.

4 Multilayer Neural Network Representation
[Figure: example networks built from input units, hidden units, output units, and weights, illustrating autoassociation and heteroassociation.]

5 How is a function computed by a Multilayer Neural Network?
h_j = g(Σ_i w_ji x_i),   y_1 = g(Σ_j w_kj h_j),   where g(x) = 1/(1+e^{-x})
g (the sigmoid) rises from 0 to 1, with g(0) = 1/2.
[Figure: a network with inputs x_1 ... x_6, hidden units h_1, h_2, h_3, and output y_1, connected by weights w_ji (input to hidden) and w_kj (hidden to output).]
Typically, y_1 = 1 for a positive example and y_1 = 0 for a negative example.
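As a concrete illustration, here is a minimal sketch of this forward computation in Python with NumPy (not from the original slides), assuming the 6-3-1 architecture of the figure and randomly initialized weights:

```python
import numpy as np

def g(z):
    # Sigmoidal threshold function g(z) = 1 / (1 + e^(-z))
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
w_ji = rng.normal(scale=0.5, size=(3, 6))   # input-to-hidden weights (3 hidden units, 6 inputs)
w_kj = rng.normal(scale=0.5, size=(1, 3))   # hidden-to-output weights (1 output, 3 hidden units)

def forward(x):
    # h_j = g(sum_i w_ji * x_i);  y_1 = g(sum_j w_kj * h_j)
    h = g(w_ji @ x)
    y = g(w_kj @ h)
    return h, y

x = np.array([1.0, 0.0, 1.0, 0.5, 0.2, 0.8])
h, y = forward(x)
print("hidden:", h, "output:", y)   # an output near 1 is read as a positive example
```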

6 Learning in Multilayer Neural Networks
§ Learning consists of searching through the space of all possible matrices of weight values for a combination of weights that fits a database of positive and negative examples (multi-class as well as regression problems are also possible).
§ Note that a neural network model with a set of adjustable weights defines a restricted hypothesis space corresponding to a family of functions. The size of this hypothesis space can be increased or decreased by increasing or decreasing the number of hidden units present in the network.

7 Appropriate Problems for Neural Network Learning
§ Instances are represented by many attribute-value pairs (e.g., the pixels of a picture, as in ALVINN [Mitchell, p. 84]).
§ The target function output may be discrete-valued, real-valued, or a vector of several real- or discrete-valued attributes.
§ The training examples may contain errors.
§ Long training times are acceptable.
§ Fast evaluation of the learned target function may be required.
§ The ability of humans to understand the learned target function is not important.

8 History of Neural Networks
§ 1943: McCulloch and Pitts proposed a model of a neuron --> Perceptron (read [Mitchell, section 4.4]).
§ 1960s: Widrow and Hoff explored Perceptron networks (which they called "Adalines") and the delta rule.
§ 1962: Rosenblatt proved the convergence of the perceptron training rule.
§ 1969: Minsky and Papert showed that the Perceptron cannot deal with nonlinearly-separable data sets, even those that represent simple functions such as XOR.
§ 1969-1986: Very little research on Neural Nets.
§ 1986: Invention of Backpropagation [Rumelhart and McClelland, but also Parker and, earlier on, Werbos], which can learn from nonlinearly-separable data sets.
§ Since 1985: A lot of research in Neural Nets!

9 Backpropagation: Purpose and Implementation
§ Purpose: To compute the weights of a feedforward multilayer neural network adaptively, given a set of labeled training examples.
§ Method: By minimizing the following cost function (the sum of squared error):
E = 1/2 Σ_{n=1..N} Σ_{k=1..K} [y_k^n - f_k(x^n)]^2
where N is the total number of training examples, K the total number of output units (useful for multiclass problems), and f_k is the function implemented by the neural net.
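A small sketch of this cost function (array shapes and values are illustrative, not from the slides):

```python
import numpy as np

def cost(Y, F):
    # E = 1/2 * sum_{n=1..N} sum_{k=1..K} (y_k^n - f_k(x^n))^2
    # Y: (N, K) array of targets; F: (N, K) array of network outputs.
    return 0.5 * np.sum((Y - F) ** 2)

Y = np.array([[1.0], [0.0], [1.0]])   # targets for N = 3 examples, K = 1 output
F = np.array([[0.9], [0.2], [0.7]])   # network outputs on those examples
print(cost(Y, F))                     # 0.5 * (0.1^2 + 0.2^2 + 0.3^2) = 0.07
```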

10 Backpropagation: Overview
§ Backpropagation works by applying the gradient descent rule to a feedforward network.
§ The algorithm is composed of two parts that get repeated over and over until a pre-set maximal number of epochs, EP_max, is reached.
§ Part I, the feedforward pass: the activation values of the hidden and then the output units are computed.
§ Part II, the backpropagation pass: the weights of the network are updated, starting with the hidden-to-output weights and followed by the input-to-hidden weights, with respect to the sum-of-squares error and through a series of weight update rules called the Delta Rule.

11 Backpropagation: The Delta Rule I
§ For the hidden-to-output connections (easy case):
Δw_kj = -η ∂E/∂w_kj = η Σ_{n=1..N} [y_k^n - f_k(x^n)] g'(h_k^n) V_j^n = η Σ_{n=1..N} δ_k^n V_j^n
with η corresponding to the learning rate (an extra parameter of the neural net), and
h_k^n = Σ_{j=0..M} w_kj V_j^n,   V_j^n = g(Σ_{i=0..d} w_ji x_i^n),   δ_k^n = g'(h_k^n)(y_k^n - f_k(x^n)),
where M is the number of hidden units and d the number of input units.
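A hedged sketch of this update for a single training pattern, assuming sigmoid activations (so g'(h_k) = f_k (1 - f_k)); the numerical values are illustrative only:

```python
import numpy as np

eta = 0.1                         # learning rate
V = np.array([0.3, 0.7, 0.5])     # hidden-unit activations V_j for one pattern
y = np.array([1.0])               # target y_k
f = np.array([0.6])               # network output f_k(x)

# delta_k = g'(h_k) * (y_k - f_k(x)); for the sigmoid, g'(h_k) = f_k * (1 - f_k)
delta_k = f * (1.0 - f) * (y - f)

# Delta w_kj = eta * delta_k * V_j  (the outer product gives the full update matrix)
dw_kj = eta * np.outer(delta_k, V)
print(delta_k, dw_kj)
```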

12 Backpropagation: The Delta Rule II
§ For the input-to-hidden connections (hard case: no pre-fixed values for the hidden units):
Δw_ji = -η ∂E/∂w_ji = -η Σ_{n=1..N} (∂E/∂V_j^n)(∂V_j^n/∂w_ji)   (Chain Rule)
= η Σ_{k,n} [y_k^n - f_k(x^n)] g'(h_k^n) w_kj g'(h_j^n) x_i^n = η Σ_{n=1..N} δ_j^n x_i^n
with h_j^n = Σ_{i=0..d} w_ji x_i^n,   δ_j^n = g'(h_j^n) Σ_{k=1..K} w_kj δ_k^n,
and all the other quantities already defined.
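Continuing the illustrative values from the previous sketch (delta_k = 0.096 and the hidden activations V), the corresponding input-to-hidden update might look like this:

```python
import numpy as np

eta = 0.1
x = np.array([1.0, 0.0, 1.0, 0.5, 0.2, 0.8])   # one input pattern x_i
V = np.array([0.3, 0.7, 0.5])                  # hidden activations V_j = g(h_j)
w_kj = np.array([[0.2, -0.4, 0.1]])            # hidden-to-output weights (1 x 3)
delta_k = np.array([0.096])                    # output delta from the previous sketch

# delta_j = g'(h_j) * sum_k w_kj * delta_k; for the sigmoid, g'(h_j) = V_j * (1 - V_j)
delta_j = V * (1.0 - V) * (w_kj.T @ delta_k)

# Delta w_ji = eta * delta_j * x_i
dw_ji = eta * np.outer(delta_j, x)
print(delta_j, dw_ji)
```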

13 Backpropagation: The Algorithm
1. Initialize the weights to small random values; create a random pool of all the training patterns; set EP, the number of epochs of training, to 0.
2. Pick a training pattern from the remaining pool of patterns and propagate it forward through the network.
3. Compute the deltas, δ_k, for the output layer.
4. Compute the deltas, δ_j, for the hidden layer by propagating the error backward.
5. Update all the connections such that w_ji^New = w_ji^Old + Δw_ji and w_kj^New = w_kj^Old + Δw_kj.
6. If any pattern remains in the pool, then go back to Step 2. If all the training patterns in the pool have been used, then set EP = EP + 1, and if EP < EP_max, then create a random pool of patterns and go to Step 2. If EP = EP_max, then stop.
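Putting the pieces together, here is a hedged sketch of the whole procedure for a small network trained on XOR-like data with stochastic, pattern-by-pattern updates. The architecture (2 inputs, 2 hidden units, 1 output, plus bias units via x_0 = V_0 = 1), eta, and EP_max are illustrative choices, not prescribed by the slides:

```python
import numpy as np

def g(z):
    return 1.0 / (1.0 + np.exp(-z))

# Toy training set (XOR): a leading column of 1s plays the role of x_0, the bias input.
X = np.array([[1, 0, 0], [1, 0, 1], [1, 1, 0], [1, 1, 1]], dtype=float)
Y = np.array([[0], [1], [1], [0]], dtype=float)

rng = np.random.default_rng(0)
M, d = 2, 2                                       # M hidden units, d inputs (plus bias)
w_ji = rng.normal(scale=0.5, size=(M, d + 1))     # input-to-hidden weights
w_kj = rng.normal(scale=0.5, size=(1, M + 1))     # hidden-to-output weights (V_0 = 1 is the bias)
eta, EP_max = 0.5, 5000

for EP in range(EP_max):                              # outer loop over epochs (Step 6)
    for n in rng.permutation(len(X)):                 # random pool of patterns (Steps 1-2)
        x, y = X[n], Y[n]
        V = np.concatenate(([1.0], g(w_ji @ x)))      # forward pass, hidden layer (Step 2)
        f = g(w_kj @ V)                               # forward pass, output layer
        delta_k = f * (1 - f) * (y - f)               # output-layer deltas (Step 3)
        delta_j = V[1:] * (1 - V[1:]) * (w_kj[:, 1:].T @ delta_k)  # hidden-layer deltas (Step 4)
        w_kj += eta * np.outer(delta_k, V)            # weight updates (Step 5)
        w_ji += eta * np.outer(delta_j, x)

for x, y in zip(X, Y):
    V = np.concatenate(([1.0], g(w_ji @ x)))
    # With a favorable initialization the outputs approach the targets 0, 1, 1, 0;
    # backpropagation can get stuck in local minima, so other seeds may need more epochs.
    print(y, np.round(g(w_kj @ V), 2))
```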

14 Backpropagation: The Momentum
§ To this point, Backpropagation has the disadvantage of being too slow if η is small, and it can oscillate too widely if η is large.
§ To solve this problem, we can add a momentum term to give each connection some inertia, forcing it to change in the direction of the downhill "force".
§ New Delta Rule: Δw_pq(t+1) = -η ∂E/∂w_pq + α Δw_pq(t)
where p and q are any input and hidden, or hidden and output, units; t is a time step or epoch; and α is the momentum parameter, which regulates the amount of inertia of the weights.
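A minimal sketch of this momentum update applied to one weight matrix; the values of eta, alpha, and the placeholder gradient are illustrative:

```python
import numpy as np

eta, alpha = 0.1, 0.9            # learning rate and momentum parameter

w = np.zeros((1, 3))             # some weight matrix w_pq
dw_prev = np.zeros_like(w)       # Delta w_pq(t), the previous update

def momentum_step(grad_E, dw_prev):
    # Delta w_pq(t+1) = -eta * dE/dw_pq + alpha * Delta w_pq(t)
    return -eta * grad_E + alpha * dw_prev

grad_E = np.array([[0.2, -0.1, 0.05]])   # gradient of E w.r.t. w (placeholder values)
dw = momentum_step(grad_E, dw_prev)
w += dw
dw_prev = dw                              # keep the update for the next step's inertia
print(w)
```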