Carla P. Gomes CS4700 CS 4700: Foundations of Artificial Intelligence Prof. Carla P. Gomes Module: Neural Networks Expressiveness of Perceptrons (Reading: Chapter 20.5)

Carla P. Gomes CS4700 Expressiveness of Perceptrons

Carla P. Gomes CS4700 Expressiveness of Perceptrons What hypothesis space can a perceptron represent? Simple Boolean functions (AND, OR, NOT) and even more complex Boolean functions such as the majority function. But can it represent any arbitrary Boolean function?

Carla P. Gomes CS4700 Expressiveness of Perceptrons A threshold perceptron returns 1 iff the weighted sum of its inputs (including the bias) is positive, i.e., iff \( \sum_j w_j x_j > 0 \) — that is, iff the input lies on one side of the hyperplane this sum defines. Linear discriminant function or linear decision surface. Weights determine the slope and the bias determines the offset. Perceptron ⇒ Linear Separator
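As a concrete illustration (a minimal Python sketch, not from the slides; the helper names are made up), a single threshold unit with hand-set weights computes the majority function mentioned above:

```python
# A threshold perceptron: output 1 iff the weighted sum of the inputs
# (including the bias term) is positive.
def perceptron(weights, bias, inputs):
    return 1 if bias + sum(w * x for w, x in zip(weights, inputs)) > 0 else 0

# Hand-set weights for the n-input majority function: every weight is 1 and
# the bias is -n/2, so the unit fires iff more than half the inputs are 1.
def majority(bits):
    n = len(bits)
    return perceptron([1.0] * n, -n / 2, bits)

print(majority([1, 0, 1, 1, 0]))   # 3 of 5 inputs are on -> 1
print(majority([1, 0, 0, 1, 0]))   # 2 of 5 inputs are on -> 0
```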

Carla P. Gomes CS4700 Linear Separability Perceptron used for classification. Consider an example with two inputs, x1 and x2. Can view the trained network as defining a “separation line” in the (x1, x2) plane. What is its equation?
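For reference (not on the slide), writing t for the threshold, the separation line is where the weighted sum equals the threshold:

\[ w_1 x_1 + w_2 x_2 = t \quad\Longleftrightarrow\quad x_2 = -\frac{w_1}{w_2}\, x_1 + \frac{t}{w_2}, \]

so the weights fix the slope and the threshold fixes the offset, as the previous slide states.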

Carla P. Gomes CS4700 Linear Separability: OR [figure: the four OR examples plotted in the (x1, x2) plane; a single line separates the positive from the negative examples]

Carla P. Gomes CS4700 Linear Separability: AND [figure: the four AND examples plotted in the (x1, x2) plane; a single line separates the positive from the negative examples]

Carla P. Gomes CS4700 Linear Separability: XOR [figure: the four XOR examples plotted in the (x1, x2) plane]

Carla P. Gomes CS4700 Linear Separability: XOR [figure: the four XOR examples; not linearly separable] Minsky & Papert (1969) Bad news: perceptrons can only represent linearly separable functions.

Carla P. Gomes CS4700 Linear Separability: XOR Consider a threshold perceptron (weights w1, w2, threshold T) for the logical XOR function (two inputs). Our examples are:

     x1   x2   label
(1)   0    0     0
(2)   1    0     1
(3)   0    1     1
(4)   1    1     0

Given our examples, we have the following inequalities for the perceptron:
From (1): 0 ≤ T  ⇒  T ≥ 0
From (2): w1 + 0 > T  ⇒  w1 > T
From (3): 0 + w2 > T  ⇒  w2 > T
From (4): w1 + w2 ≤ T
But adding (2) and (3) gives w1 + w2 > 2T ≥ T, contradicting (4).
So, XOR is not linearly separable.
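The contradiction can also be checked empirically. Below is a minimal sketch (plain Python, made-up helper names) of the perceptron learning rule on these truth tables: it converges after a few epochs on OR but cycles forever on XOR, so the number of epochs is capped:

```python
# Perceptron learning rule on 2-input Boolean functions.
# Inputs are augmented with a constant bias input x0 = 1; output is 1 iff w.x > 0.
def predict(w, x):
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) > 0 else 0

def train(examples, alpha=0.1, max_epochs=1000):
    w = [0.0, 0.0, 0.0]                       # bias weight + two input weights
    for epoch in range(max_epochs):
        errors = 0
        for (x1, x2), label in examples:
            x = (1, x1, x2)                   # prepend the bias input
            err = label - predict(w, x)       # +1, 0, or -1
            if err != 0:
                errors += 1
                w = [wi + alpha * err * xi for wi, xi in zip(w, x)]
        if errors == 0:                       # consistent with every example
            return w, epoch
    return w, None                            # never converged

OR  = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]
XOR = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]

print(train(OR))    # converges after a handful of epochs
print(train(XOR))   # second value is None: no separating line exists
```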

Carla P. Gomes CS4700 Convergence of Perceptron Learning Algorithm Perceptron converges to a consistent function, if…
… training data linearly separable
… step size α sufficiently small
… no “hidden” units
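For reference (a standard form, not shown on the slide), the weight update whose convergence these conditions guarantee is

\[ w_j \leftarrow w_j + \alpha\,\bigl(y - h_{\mathbf{w}}(\mathbf{x})\bigr)\,x_j , \]

where α is the step size, y the target label, and h_w(x) the perceptron’s current output.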

Perceptron learns the majority function easily; decision-tree learning (DTL) is hopeless.

Decision-tree learning (DTL) learns the restaurant function easily; a perceptron cannot represent it.

Carla P. Gomes CS4700 Good news: Adding a hidden layer allows more target functions to be represented. Minsky & Papert (1969)

Carla P. Gomes CS4700 Multi-layer Perceptrons (MLPs) Single-layer perceptrons can only represent linear decision surfaces. Multi-layer perceptrons can represent non-linear decision surfaces.
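As an illustration (a minimal Python sketch with hand-set, not learned, weights; not from the slides), one hidden layer of threshold units is enough to represent XOR:

```python
# XOR with one hidden layer of threshold units.
# h1 = OR(x1, x2), h2 = AND(x1, x2), output = h1 AND (NOT h2) = XOR(x1, x2).
def step(s):
    return 1 if s > 0 else 0

def xor_net(x1, x2):
    h1 = step(x1 + x2 - 0.5)        # OR unit
    h2 = step(x1 + x2 - 1.5)        # AND unit
    return step(h1 - h2 - 0.5)      # h1 AND (NOT h2)

for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2, xor_net(x1, x2))   # prints the XOR truth table
```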

Carla P. Gomes CS4700 Minsky & Papert (1969) “[The perceptron] has many features to attract attention: its linearity; its intriguing learning theorem; its clear paradigmatic simplicity as a kind of parallel computation. There is no reason to suppose that any of these virtues carry over to the many-layered version. Nevertheless, we consider it to be an important research problem to elucidate (or reject) our intuitive judgment that the extension is sterile.” Bad news: in 1969, no algorithm for learning in multi-layered networks was known, and no convergence theorem! Minsky & Papert (1969) pricked the neural network balloon … they almost killed the field: the “winter” of neural networks. Rumors say these results may have killed Rosenblatt…

Carla P. Gomes CS4700 Two major problems they saw were:
1. How can the learning algorithm apportion credit (or blame) to individual weights for incorrect classifications depending on a (sometimes) large number of weights?
2. How can such a network learn useful higher-order features?

Carla P. Gomes CS4700 Good news: successful credit-apportionment learning algorithms were developed soon afterwards (e.g., back-propagation). Still successful, in spite of the lack of a convergence theorem. The “Bible” (1986): Rumelhart & McClelland, Parallel Distributed Processing.
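A minimal back-propagation sketch (plain Python with NumPy, one hidden layer of sigmoid units, squared-error loss; illustrative only, the layer sizes and learning rate are arbitrary choices):

```python
import numpy as np

# Back-propagation on XOR: one hidden layer of sigmoid units, squared-error
# loss, plain batch gradient descent.
def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

W1, b1 = rng.normal(size=(2, 4)), np.zeros(4)    # input  -> hidden
W2, b2 = rng.normal(size=(4, 1)), np.zeros(1)    # hidden -> output
alpha = 0.5                                      # learning rate

for epoch in range(10000):
    # forward pass
    h = sigmoid(X @ W1 + b1)
    out = sigmoid(h @ W2 + b2)
    # backward pass: propagate the error, apportioning blame to every weight
    d_out = (out - y) * out * (1 - out)
    d_h = (d_out @ W2.T) * h * (1 - h)
    W2 -= alpha * h.T @ d_out
    b2 -= alpha * d_out.sum(axis=0)
    W1 -= alpha * X.T @ d_h
    b1 -= alpha * d_h.sum(axis=0)

print(out.round(2))   # typically close to [[0], [1], [1], [0]]
```

With most random initializations this drives the outputs toward the XOR targets, illustrating how the back-propagated error signal apportions blame across both layers of weights.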