Classification Neural Networks 1


Neural networks
Topics:
- Perceptrons: structure, training, expressiveness
- Multilayer networks: possible structures, activation functions, training with gradient descent and backpropagation

Connectionist models
Consider humans:
- Neuron switching time: ~0.001 second
- Number of neurons: ~10^10
- Connections per neuron: ~10^4 to 10^5
- Scene recognition time: ~0.1 second
100 inference steps doesn't seem like enough → massively parallel computation

Neural networks
Properties:
- Many neuron-like threshold switching units
- Many weighted interconnections among units
- Highly parallel, distributed processing
- Emphasis on tuning weights automatically

Neural network application
ALVINN: An Autonomous Land Vehicle In a Neural Network (Carnegie Mellon University Robotics Institute, 1989-1997). ALVINN is a perception system which learns to control the NAVLAB vehicles by watching a person drive. ALVINN's architecture consists of a single-hidden-layer backpropagation network. The input layer of the network is a 30×32-unit two-dimensional "retina" which receives input from the vehicle's video camera. Each input unit is fully connected to a layer of five hidden units, which are in turn fully connected to a layer of 30 output units. The output layer is a linear representation of the direction the vehicle should travel in order to keep the vehicle on the road.
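The shape just described is easy to state in code. Below is a minimal NumPy sketch of a network with that architecture (960 retina inputs, 5 hidden units, 30 outputs); the random weights, the sigmoid hidden activation, and the function names are illustrative assumptions, not ALVINN's actual parameters.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# ALVINN-shaped network: 30x32 retina -> 5 hidden units -> 30 output units.
rng = np.random.default_rng(0)
W1 = rng.normal(scale=0.1, size=(5, 30 * 32))   # input -> hidden weights (illustrative)
W2 = rng.normal(scale=0.1, size=(30, 5))        # hidden -> output weights (illustrative)

def forward(retina):
    """Map a 30x32 camera image to 30 steering-direction activations."""
    x = retina.reshape(-1)          # flatten the 2-D retina to a 960-vector
    h = sigmoid(W1 @ x)             # fully connected hidden layer
    return W2 @ h                   # linear output layer (steering representation)

steering = forward(rng.random((30, 32)))
print(steering.shape)               # (30,)
```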

Neural network application ALVINN drives 70 mph on highways!

Perceptron structure
- The model is an assembly of nodes connected by weighted links.
- The output node sums up its input values according to the weights of their links.
- The output node's sum is then compared against some threshold t:
y = 1 if Σ_i w_i x_i > t and y = 0 otherwise; or equivalently, in ±1 form, y = sign(Σ_i w_i x_i − t)
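As a minimal sketch, this node computation is a thresholded weighted sum (the function name and the ±1 output encoding are assumptions for illustration):

```python
import numpy as np

def perceptron_output(x, w, t):
    """Sum the inputs according to the link weights, then compare against threshold t."""
    return 1 if np.dot(w, x) > t else -1
```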

Example: modeling a Boolean function Output Y is 1 if at least two of the three inputs are equal to 1.
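One choice of parameters that realizes this function is equal weights of 0.3 and a threshold of 0.4 (these particular values are an assumption, though they match the usual textbook example). The sketch below checks all eight input combinations:

```python
from itertools import product

w, t = [0.3, 0.3, 0.3], 0.4                   # assumed weights/threshold realizing the function
for x in product([0, 1], repeat=3):
    y = 1 if sum(wi * xi for wi, xi in zip(w, x)) > t else 0
    assert y == (1 if sum(x) >= 2 else 0)     # "at least two of the three inputs are 1"
    print(x, "->", y)
```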

Perceptron model

Perceptron decision boundary
Perceptron decision boundaries are linear: the boundary is the set of points where Σ_i w_i x_i = t, a line in two dimensions and a hyperplane in higher dimensions. Example: the decision surface for the Boolean function on the preceding slides is a plane in the three-dimensional input space.

Expressiveness of perceptrons
- Can model any function where positive and negative examples are linearly separable. Examples: Boolean AND, OR, NAND, NOR.
- Cannot (fully) model functions that are not linearly separable. Example: Boolean XOR.
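To see the contrast concretely, the illustrative check below searches a coarse grid of weights and thresholds: some setting reproduces AND, but none reproduces XOR. (A finite grid cannot prove impossibility; the grid, names, and ranges are assumptions for demonstration only.)

```python
from itertools import product

def fits(target, w1, w2, t):
    """Does a perceptron with these weights/threshold reproduce the target function?"""
    return all((w1 * x1 + w2 * x2 > t) == target(x1, x2)
               for x1, x2 in product([0, 1], repeat=2))

grid = [i / 4 for i in range(-8, 9)]          # candidate parameters from -2.0 to 2.0
AND = lambda a, b: bool(a and b)
XOR = lambda a, b: a != b

print(any(fits(AND, w1, w2, t) for w1 in grid for w2 in grid for t in grid))  # True
print(any(fits(XOR, w1, w2, t) for w1 in grid for w2 in grid for t in grid))  # False
```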

Perceptron training process
Initialize weights with random values.
Do:
- Apply the perceptron to each training example.
- If an example is misclassified, modify the weights.
Until all examples are correctly classified, or the process has converged.

Perceptron training process
Two rules for modifying weights during training:
- Perceptron training rule: trains on thresholded outputs; driven by binary differences between correct and predicted outputs; modifies weights with incremental updates.
- Delta rule: trains on unthresholded outputs; driven by continuous differences between correct and predicted outputs; modifies weights via gradient descent.

Perceptron training rule
Initialize weights with random values.
Do:
- Apply the perceptron to each training sample i.
- If sample i is misclassified, modify all weights j.
Until all samples are correctly classified.

Perceptron training rule
If sample i is misclassified, modify all weights j:
Δw_j = η (t_i − o_i) x_{ij},  i.e.  w_j ← w_j + η (t_i − o_i) x_{ij}
where t_i is the correct output, o_i is the thresholded perceptron output, x_{ij} is input j of sample i, and η is the learning rate.
Example: with η = 0.1, t_i = 1, o_i = −1, and x_{ij} = 0.8, the update is Δw_j = 0.1 · (1 − (−1)) · 0.8 = 0.16.

Perceptron training rule
Example of processing one sample (see the sketch below).
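A minimal sketch of the whole procedure, including the per-sample update Δw_j = η(t − o)x_j; the toy AND data, the learning rate, and the ±1 target encoding are illustrative assumptions:

```python
import numpy as np

def train_perceptron(X, y, eta=0.1, max_epochs=100, seed=0):
    """Perceptron training rule: on a misclassified sample, w_j += eta * (t - o) * x_j.
    X: (n_samples, n_features); y: targets in {-1, +1}."""
    rng = np.random.default_rng(seed)
    Xb = np.hstack([X, np.ones((len(X), 1))])     # constant bias input stands in for threshold t
    w = rng.normal(scale=0.01, size=Xb.shape[1])  # initialize weights with small random values
    for _ in range(max_epochs):
        errors = 0
        for x, t in zip(Xb, y):
            o = 1 if w @ x > 0 else -1            # thresholded output
            if o != t:                            # sample misclassified: modify all weights
                w += eta * (t - o) * x
                errors += 1
        if errors == 0:                           # all samples correctly classified
            break
    return w

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([-1, -1, -1, 1])                     # Boolean AND with +/-1 targets
w = train_perceptron(X, y)
print([1 if w @ np.append(x, 1.0) > 0 else -1 for x in X])   # [-1, -1, -1, 1]
```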

Delta training rule
Based on a squared error function for the weight vector. Note that the error is the difference between the correct output and the unthresholded sum of inputs, a continuous quantity (rather than the binary difference between the correct output and the thresholded output). Weights are modified by descending the gradient of the error function.

Squared error function for weight vector w
E(w) = ½ Σ_{d∈D} (t_d − o_d)²
where D is the set of training samples, t_d is the target output for sample d, and o_d = w · x_d is the unthresholded output.

Gradient of error function
∇E(w) = [∂E/∂w_0, ∂E/∂w_1, …, ∂E/∂w_n]
Gradient descent moves the weights in the direction that decreases E: Δw = −η ∇E(w), i.e. Δw_j = −η ∂E/∂w_j.

Gradient of squared error function
∂E/∂w_j = ∂/∂w_j ½ Σ_d (t_d − o_d)² = Σ_d (t_d − o_d)(−x_{jd})
so the weight update becomes Δw_j = η Σ_d (t_d − o_d) x_{jd}.

Delta training rule
Initialize weights with random values.
Do:
- Apply the perceptron to each training sample i.
- Modify all weights j by the delta rule: Δw_j = η (t_i − o_i) x_{ij}, with o_i the unthresholded output.
Until all samples are correctly classified, or the process converges.
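A minimal sketch of the incremental delta rule, training on the unthresholded output o = w · x (the toy data, learning rate, and stopping tolerance are illustrative assumptions):

```python
import numpy as np

def train_delta(X, y, eta=0.05, max_epochs=1000, tol=1e-4, seed=0):
    """Incremental delta rule: after each sample, w_j += eta * (t - o) * x_j,
    where o = w . x is the unthresholded output (gradient descent on squared error)."""
    rng = np.random.default_rng(seed)
    Xb = np.hstack([X, np.ones((len(X), 1))])    # bias column absorbs the threshold
    w = rng.normal(scale=0.01, size=Xb.shape[1])
    for _ in range(max_epochs):
        for x, t in zip(Xb, y):
            o = w @ x                            # continuous output, no thresholding
            w += eta * (t - o) * x               # descend this sample's error gradient
        error = 0.5 * np.sum((y - Xb @ w) ** 2)  # squared error E(w) over all samples
        if error < tol:
            break
    return w

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([-1.0, -1.0, -1.0, 1.0])            # Boolean AND with +/-1 targets
w = train_delta(X, y)
print(np.sign(np.hstack([X, np.ones((4, 1))]) @ w))   # thresholding recovers AND: [-1 -1 -1 1]
```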

Gradient descent: batch vs. incremental
- Incremental mode (illustrated on the preceding slides): compute the error and weight updates for a single sample; apply the updates to the weights before processing the next sample.
- Batch mode: compute the errors and weight updates for a block of samples (possibly all samples); apply all updates to the weights simultaneously.
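For contrast with the incremental sketch above, one batch-mode step accumulates the update Δw_j = η Σ_d (t_d − o_d) x_{jd} over the whole block before applying it; a sketch under the same assumptions (Xb carries a bias column, y holds targets):

```python
import numpy as np

def batch_step(w, Xb, y, eta=0.05):
    """One batch gradient descent step on E(w) = 0.5 * sum_d (t_d - o_d)^2."""
    o = Xb @ w                        # unthresholded outputs for the whole block
    return w + eta * (y - o) @ Xb     # apply the accumulated update simultaneously
```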

Perceptron training rule vs. delta rule
- The perceptron training rule is guaranteed to correctly classify all training samples if: the samples are linearly separable, and the learning rate η is sufficiently small.
- The delta rule uses gradient descent. It is guaranteed to converge to the hypothesis with minimum squared error (given a sufficiently small η), even when: the training data contains noise, or the training data is not linearly separable.

Equivalence of perceptron and linear models
A perceptron computes the same decision function as a linear classification model: both predict by thresholding a weighted sum of the inputs, so the perceptron's decision boundary is a linear discriminant.
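A short derivation of the equivalence, in the notation of the earlier slides (the augmented-vector construction is the standard one):

```latex
\[
y \;=\; \operatorname{sign}\!\Big(\sum_{i=1}^{n} w_i x_i \,-\, t\Big)
  \;=\; \operatorname{sign}\big(\tilde{w} \cdot \tilde{x}\big),
\qquad
\tilde{w} = (-t,\, w_1, \ldots, w_n),\quad
\tilde{x} = (1,\, x_1, \ldots, x_n).
\]
% The perceptron is thus a linear discriminant: its decision boundary is the
% hyperplane \tilde{w} \cdot \tilde{x} = 0 in the augmented input space.
```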