Artificial Neural Networks


Artificial Neural Networks
Outline: Introduction, Design of Primitive Units, Perceptrons, The Backpropagation Algorithm

Basics
In contrast to perceptrons, multilayer networks can learn not only multiple decision boundaries, but boundaries that may be nonlinear. [Figure: a layered network with input nodes at the bottom, internal nodes in the middle, and output nodes at the top.]

Example
[Figure: an example decision region over the two attributes x1 and x2.]

One Single Unit
To make nonlinear partitions of the space, we need to define each unit as a nonlinear function (unlike the perceptron). One solution is to use the sigmoid unit. [Figure: a unit takes inputs x1, …, xn (plus the fixed input x0 = 1) with weights w0, w1, …, wn, computes the weighted sum g(x) = Σ_i w_i x_i, and outputs O = σ(g(x)) = 1 / (1 + e^(−g(x))).]

More Precisely
O(x1, x2, …, xn) = σ(W·X), where σ(W·X) = 1 / (1 + e^(−W·X)). The function σ is called the sigmoid or logistic function. It has the following property: dσ(y)/dy = σ(y)(1 − σ(y)).
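As a minimal sketch (not part of the slides), the sigmoid unit and the derivative property can be checked in a few lines of Python; all names here are illustrative:

```python
import numpy as np

def sigmoid(y):
    """Logistic function σ(y) = 1 / (1 + e^(−y))."""
    return 1.0 / (1.0 + np.exp(-y))

def sigmoid_unit(x, w):
    """Output of one sigmoid unit: O = σ(w · x), with x[0] = 1 for the bias weight w0."""
    return sigmoid(np.dot(w, x))

# Check dσ(y)/dy = σ(y)(1 − σ(y)) numerically at y = 0.5:
y = 0.5
numeric = (sigmoid(y + 1e-6) - sigmoid(y - 1e-6)) / 2e-6
analytic = sigmoid(y) * (1 - sigmoid(y))
print(numeric, analytic)  # both ≈ 0.2350
```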

Backpropagation Algorithm
Goal: to learn the weights for all links in an interconnected multilayer network. We begin by defining our measure of error:
E(W) = ½ Σ_m Σ_k (t_mk − o_mk)²
where k varies along the output nodes and m over the training examples. The idea is again to use gradient descent over the space of weights to find a minimum (there is no guarantee of finding the global minimum).
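A small sketch of this error measure in numpy (the target and output arrays below are illustrative, with shape examples × output nodes):

```python
import numpy as np

def sum_squared_error(targets, outputs):
    """E(W) = ½ Σ_m Σ_k (t_mk − o_mk)²."""
    return 0.5 * np.sum((targets - outputs) ** 2)

t = np.array([[1.0, 0.0], [0.0, 1.0]])  # two examples, two output nodes
o = np.array([[0.8, 0.1], [0.3, 0.7]])
print(sum_squared_error(t, o))          # 0.5 * (0.04 + 0.01 + 0.09 + 0.09) = 0.115
```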

Gradient Descent
The idea is to find a minimum of the error function E in the space of weights. [Figure: the error surface E(W) plotted over two weights w1 and w2.]

Output Nodes
[Figure: the output nodes of the network.]

Algorithm
1. Create a network with n_in input nodes, n_hidden internal nodes, and n_out output nodes.
2. Initialize all weights to small random numbers.
3. Until the error is small, for each example X:
   a. Propagate example X forward through the network.
   b. Propagate the errors backward through the network.
A sketch of the setup step follows.
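A minimal sketch of steps 1 and 2, assuming a single hidden layer and numpy (the names and the ±0.05 range are illustrative choices):

```python
import numpy as np

def make_network(n_in, n_hidden, n_out, rng=np.random.default_rng(0)):
    """Initialize all weights to small random numbers.

    W_hidden[h, i] connects input i to hidden node h; column 0 holds the
    bias weight w0 (paired with the fixed input x0 = 1 from earlier slides).
    """
    W_hidden = rng.uniform(-0.05, 0.05, size=(n_hidden, n_in + 1))
    W_out = rng.uniform(-0.05, 0.05, size=(n_out, n_hidden + 1))
    return W_hidden, W_out
```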

Propagating Forward
Given example X, compute the output of every node (applying the sigmoid function at each unit) until we reach the output nodes. [Figure: example X enters at the input nodes, activations flow through the internal nodes, and values appear at the output nodes.]
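Continuing the sketch (reusing sigmoid and the weight matrices from make_network above), the forward pass might look like this:

```python
def forward(x, W_hidden, W_out):
    """Compute the output of every node until we reach the output nodes."""
    x_b = np.concatenate(([1.0], x))            # prepend the fixed input x0 = 1
    hidden = sigmoid(W_hidden @ x_b)            # internal node outputs
    hidden_b = np.concatenate(([1.0], hidden))  # bias input for the next layer
    outputs = sigmoid(W_out @ hidden_b)         # output node values
    return hidden, outputs
```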

Propagating Error Backward
For each output node k, compute the error:
δ_k = O_k (1 − O_k)(t_k − O_k)
For each hidden unit h, calculate the error:
δ_h = O_h (1 − O_h) Σ_k W_kh δ_k
Update each network weight:
W_ji ← W_ji + ΔW_ji, where ΔW_ji = η δ_j X_ji
(X_ji is the input from node i to node j, and W_ji is the corresponding weight.)
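A sketch of these update rules, continuing the illustrative single-hidden-layer code from the previous slides:

```python
def backward(x, target, hidden, outputs, W_hidden, W_out, eta=0.1):
    """One backpropagation step; updates the weight matrices in place."""
    x_b = np.concatenate(([1.0], x))
    hidden_b = np.concatenate(([1.0], hidden))

    # δ_k = O_k (1 − O_k)(t_k − O_k) for each output node k
    delta_out = outputs * (1 - outputs) * (target - outputs)

    # δ_h = O_h (1 − O_h) Σ_k W_kh δ_k for each hidden unit h
    # (column 0 of W_out holds the bias weights, so it is skipped)
    delta_hidden = hidden * (1 - hidden) * (W_out[:, 1:].T @ delta_out)

    # W_ji ← W_ji + η δ_j X_ji
    W_out += eta * np.outer(delta_out, hidden_b)
    W_hidden += eta * np.outer(delta_hidden, x_b)
```

Calling forward and then backward once per example, until the error is small, realizes the loop in the Algorithm slide.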

Remarks on Backpropagation
1. It implements a gradient-descent search over the weight space.
2. It may become trapped in local minima.
3. In practice, it is very effective.
4. How to avoid local minima? Add momentum, use stochastic gradient descent, or train different networks with different initial values for the weights.

Generalization and Overfitting
One obvious stopping criterion for backpropagation is to continue iterating until the error is below some threshold; however, this can lead to overfitting. [Figure: error versus number of weight updates; the training-set error keeps decreasing while the validation-set error eventually rises.]

Solutions
1. Use a validation set and stop training once the error on this set is small.
2. Use 10-fold cross-validation.
3. Use weight decay: the weights are decreased slowly on each iteration (see the sketch below).
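A hedged sketch of the weight-decay idea from item 3: each iteration first shrinks every weight by a small factor, then applies the usual backpropagation step (the decay rate here is an illustrative value):

```python
def decayed_update(W, gradient_step, decay=1e-4):
    """Weight decay: slowly decrease the weights on each iteration,
    then apply the ordinary update term (e.g., η δ_j X_ji)."""
    W *= (1.0 - decay)
    W += gradient_step
    return W
```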

Historical Background
- Paul Werbos (1974) proposed the backpropagation algorithm, in which several neurons are trained together. It was rediscovered by Rumelhart, Hinton, and McClelland (1986).
- John Hopfield (1982): a neural network can find a minimum when it reaches a state of minimum energy.

Scaling Input
If one attribute is much larger than another, the weights will be adjusted merely to represent that difference in scale, which is not desirable. Solution: standardize all features prior to training, so that each feature has zero mean and a fixed variance (e.g., 1.0).
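A minimal numpy sketch of this standardization (X is an illustrative examples × features array):

```python
import numpy as np

def standardize(X):
    """Give each feature (column) zero mean and unit variance."""
    return (X - X.mean(axis=0)) / X.std(axis=0)

X = np.array([[1.0, 200.0], [2.0, 400.0], [3.0, 600.0]])
print(standardize(X))  # each column now has mean 0 and variance 1
```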

Training with Noise
If the training set is small, one can “produce” additional examples and use them as if they were normal examples, by generating them from the same distribution. Assumption: add d-dimensional Gaussian noise to the true training points. [Figure: training points in the (x1, x2) plane.]
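A sketch of this augmentation under the stated Gaussian assumption (the number of copies and the noise scale are illustrative); the labels are simply repeated for each noisy copy:

```python
import numpy as np

def augment_with_noise(X, copies=5, sigma=0.1, rng=np.random.default_rng(0)):
    """Produce extra examples by adding d-dimensional Gaussian noise
    to each true training point."""
    noisy = [X + rng.normal(0.0, sigma, size=X.shape) for _ in range(copies)]
    return np.vstack([X] + noisy)
```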

Number of Hidden Units
The number of hidden units is related to the “expressiveness” of the neural network (the complexity of the decision boundary it can represent). If the examples are easy to discriminate, few nodes are necessary; conversely, complex problems require many internal nodes. A rule of thumb is to choose roughly m/10 weights, where m is the number of training examples (e.g., about 100 weights for 1,000 examples).

Learning Rates
Different learning rates significantly affect the performance of a neural network. Optimal learning rate: one that leads to the error minimum in a single learning step. It has been found that a principled method for setting the learning rate is to assign a value “separately” to each weight.


Adding Momentum
The weight update rule can be modified so as to depend on the last iteration. At iteration s we have:
ΔW_ji(s) = η δ_j X_ji + α ΔW_ji(s − 1)
where α (0 ≤ α ≤ 1) is a constant called the momentum. It increases the speed of descent toward a local minimum, and it increases the speed along flat regions.
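A sketch of the momentum rule for one weight matrix (continuing the earlier illustrative code; grad_term corresponds to the δ_j X_ji outer products in backward):

```python
def momentum_update(W, grad_term, prev_delta, eta=0.1, alpha=0.9):
    """ΔW(s) = η · grad_term + α · ΔW(s − 1); updates W in place."""
    delta = eta * grad_term + alpha * prev_delta
    W += delta
    return delta  # keep this and pass it back in as prev_delta next iteration
```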

Cascade-Correlation
Main ideas:
1. Begin with a two-layer network and train it.
2. If the error is low enough, stop.
3. If not, do the following:
   a. Fix all weights.
   b. Add one node and connect it to all input and output units.
   c. Train the network by adjusting only the weights of the new node.
4. Go to step 2.
A sketch of this loop follows.
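A high-level sketch of the loop above; every helper here (train_two_layer, error, freeze_all_weights, add_node_connected_to_io, train_new_node_only) is a hypothetical placeholder for the listed step, not a real library API:

```python
def cascade_correlation(data, threshold):
    # All helpers below are hypothetical stand-ins for the slide's steps.
    net = train_two_layer(data)                 # step 1
    while error(net, data) > threshold:         # step 2: stop when error is low
        freeze_all_weights(net)                 # step 3a: fix all weights
        node = add_node_connected_to_io(net)    # step 3b: new node wired to inputs/outputs
        train_new_node_only(net, node, data)    # step 3c: adjust only the new node's weights
    return net                                  # step 4 is the loop itself
```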


Recurrent Networks (Time Series Analysis)
Recurrent networks have found application in time-series prediction. Main ideas: the output units are “fed back” and duplicated as auxiliary inputs. During classification, a pattern is presented to the input units, the feedforward flow is computed, and the outputs serve as auxiliary input nodes. This produces new activations, and new outputs.
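A minimal sketch of this feedback loop, reusing the illustrative forward function from earlier; it assumes W_hidden was sized for the pattern length plus n_out auxiliary inputs, and that the fed-back values start at zero (neither detail is stated on the slide):

```python
def recurrent_run(patterns, W_hidden, W_out, n_out):
    """Present each pattern with the previous outputs duplicated as auxiliary inputs."""
    aux = np.zeros(n_out)                    # assumed initial feedback values
    results = []
    for x in patterns:
        x_full = np.concatenate([x, aux])    # pattern + fed-back outputs
        _, outputs = forward(x_full, W_hidden, W_out)
        aux = outputs                        # outputs become next auxiliary inputs
        results.append(outputs)
    return results
```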
