Principles of Back-Propagation


Principles of Back-Propagation
The relation between biological vision and computer vision
Prof. Bart ter Haar Romeny

Deep Learning with Convolutional Neural Networks: how does this actually work? Error backpropagation. AlexNet (Alex Krizhevsky, 2012) on the ImageNet challenge (1.4 million images, 1000 classes): 75% → 94%. Convolution, ReLU, max pooling, convolution, convolution, etc. A typical big deep NN has (hundreds of) millions of connections: weights.

A numerical example of backpropagation on a simple network: From Prakash Jay, Senior Data Scientist @FractalAnalytics: https://medium.com/@14prakash/back-propagation-is-very-simple-who-made-it-complicated-97b794c97e5c

Approach:
- Build a small neural network as defined in the architecture on the right.
- Initialize the weights and biases randomly.
- Fix the input and output.
- Forward pass the inputs.
- Calculate the cost.
- Compute the gradients and errors.
- Backprop and adjust the weights and biases accordingly.

We initialize the network randomly (a minimal sketch follows below):
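Below is a minimal Python/NumPy sketch of this setup. The layer sizes, the random input, and the seed are illustrative assumptions (the exact values on the slides come from the linked Medium article); only the three-class target [1, 0, 0] and the layer types used later (ReLU, sigmoid, softmax) are taken from the slides.

import numpy as np

rng = np.random.default_rng(0)

# Illustrative layer sizes (assumption): 4 inputs, two hidden layers, 3 output classes.
n_in, n_h1, n_h2, n_out = 4, 5, 4, 3

# Random initialization of weights and biases, as on the slide.
W1, b1 = rng.standard_normal((n_in, n_h1)), np.zeros(n_h1)
W2, b2 = rng.standard_normal((n_h1, n_h2)), np.zeros(n_h2)
W3, b3 = rng.standard_normal((n_h2, n_out)), np.zeros(n_out)

# Fixed input (assumed values) and fixed target output (class 1, from the slides).
x = rng.standard_normal((1, n_in))
t = np.array([[1.0, 0.0, 0.0]])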

Forward pass, layer 1: matrix operation, then ReLU operation, with a worked numerical example.
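Continuing the sketch above, layer 1 is a matrix multiplication plus bias followed by an element-wise ReLU; the concrete numbers shown on the slide come from the Medium article and are not reproduced here.

def relu(z):
    # Element-wise ReLU: max(0, z).
    return np.maximum(0.0, z)

# Layer 1: matrix operation, then ReLU.
z1 = x @ W1 + b1
a1 = relu(z1)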

Forward pass, layer 2: matrix operation, then sigmoid operation, with a worked numerical example.
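Continuing the same sketch, layer 2 is again a matrix operation, now followed by an element-wise logistic sigmoid:

def sigmoid(z):
    # Element-wise logistic sigmoid: 1 / (1 + exp(-z)).
    return 1.0 / (1.0 + np.exp(-z))

# Layer 2: matrix operation, then sigmoid.
z2 = a1 @ W2 + b2
a2 = sigmoid(z2)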

Forward pass, output layer (layer 3): matrix operation, then softmax operation. Example output: [0.1985, 0.2855, 0.5158].
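Continuing the sketch, the output layer applies softmax so that the three outputs form a probability distribution. With the slide's weights this gives approximately [0.1985, 0.2855, 0.5158]; with the randomly initialized weights above the numbers will differ.

def softmax(z):
    # Shift by the row maximum for numerical stability, then normalize to sum to 1.
    e = np.exp(z - np.max(z, axis=-1, keepdims=True))
    return e / np.sum(e, axis=-1, keepdims=True)

# Output layer: matrix operation, then softmax.
z3 = a2 @ W3 + b3
y = softmax(z3)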

Analysis: the actual output should be [1.0, 0.0, 0.0], but we got [0.19858, 0.28559, 0.51583]. To calculate the error, let us use cross-entropy:

Error = -(1 * log(0.19858) + 0 * log(1 - 0.19858) + 0 * log(0.28559) + 1 * log(1 - 0.28559) + 0 * log(0.51583) + 1 * log(1 - 0.51583)) = 2.67818

We are done with the forward pass and we know the error of this first iteration (we will do this numerous times). Now let us study the backward pass.
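The error on this slide sums a binary cross-entropy term per output unit. The following snippet reproduces the 2.67818 value from the slide's numbers (it needs only the NumPy import from the first sketch):

# Predicted and actual outputs, taken from the slide.
p = np.array([0.19858, 0.28559, 0.51583])
a = np.array([1.0, 0.0, 0.0])

# Per-unit cross-entropy, summed: -(a*log(p) + (1-a)*log(1-p)).
error = -np.sum(a * np.log(p) + (1.0 - a) * np.log(1.0 - p))
print(error)  # ~2.67818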

A chain of functions: From Rohan Kapur: https://ayearofai.com/rohan-lenny-1-neural-networks-the-backpropagation-algorithm-explained-abf4609d4f9d

We recall:

For gradient descent, the derivative of this function with respect to some arbitrary weight (for example w1) is calculated by applying the chain rule, here for a simple error measure (p = predicted, a = actual):
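The formulas on this slide are not in the transcript; the following is a hedged reconstruction in standard notation, with a simple squared-error measure and one weight w1 feeding into a pre-activation z:

E = \tfrac{1}{2}(p - a)^2,
\qquad
\frac{\partial E}{\partial w_1}
  = \frac{\partial E}{\partial p}\,\frac{\partial p}{\partial z}\,\frac{\partial z}{\partial w_1}
  = (p - a)\,\frac{\partial p}{\partial z}\,\frac{\partial z}{\partial w_1}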

Important derivatives: sigmoid, ReLU, and softmax.
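These three derivatives are standard results: sigma'(z) = sigma(z)(1 - sigma(z)), ReLU'(z) = 1 for z > 0 and 0 otherwise, and for softmax dS_i/dz_j = S_i(delta_ij - S_j). A sketch in the same Python/NumPy notation as above:

def d_sigmoid(z):
    s = sigmoid(z)
    return s * (1.0 - s)            # sigma'(z) = sigma(z) * (1 - sigma(z))

def d_relu(z):
    return (z > 0).astype(float)    # 1 where z > 0, else 0

def d_softmax(z):
    # Jacobian of softmax for one input vector: dS_i/dz_j = S_i * (delta_ij - S_j).
    s = softmax(z).reshape(-1, 1)
    return np.diagflat(s) - s @ s.T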

Two slides ago, we saw that:

Going one more layer backwards, we can determine the gradients of the preceding weights in the same way, with the same chain rule, etc. And finally, the weights are updated and we iterate until convergence:
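A minimal sketch of that loop for the three-layer network above. Note one simplification relative to the slides: the output delta y - t is the standard gradient of softmax combined with the usual multi-class cross-entropy, which is not exactly the per-unit error written out earlier; the learning rate and the number of iterations are assumptions.

lr = 0.01          # learning rate (assumption)
for step in range(1000):
    # Forward pass, as on the earlier slides.
    z1 = x @ W1 + b1;  a1 = relu(z1)
    z2 = a1 @ W2 + b2; a2 = sigmoid(z2)
    z3 = a2 @ W3 + b3; y = softmax(z3)

    # Backward pass: propagate the error layer by layer with the chain rule.
    d3 = y - t                              # dE/dz3 for softmax + cross-entropy
    dW3, db3 = a2.T @ d3, d3.sum(axis=0)
    d2 = (d3 @ W3.T) * a2 * (1.0 - a2)      # through the sigmoid derivative
    dW2, db2 = a1.T @ d2, d2.sum(axis=0)
    d1 = (d2 @ W2.T) * (z1 > 0)             # through the ReLU derivative
    dW1, db1 = x.T @ d1, d1.sum(axis=0)

    # Gradient-descent update of all weights and biases; iterate until convergence.
    W1 -= lr * dW1; b1 -= lr * db1
    W2 -= lr * dW2; b2 -= lr * db2
    W3 -= lr * dW3; b3 -= lr * db3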

The numerical example is worked out in great detail by Prakash Jay on Medium.com: https://medium.com/@14prakash/back-propagation-is-very-simple-who-made-it-complicated-97b794c97e5c

Deeper reading:
https://eli.thegreenplace.net/2016/the-softmax-function-and-its-derivative
https://eli.thegreenplace.net/2018/backpropagation-through-a-fully-connected-layer/