CSC 578 Neural Networks and Deep Learning

CSC 578 Neural Networks and Deep Learning Fall 2018/19 2. Backpropagation (Some figures adapted from NNDL book) Noriko Tomuro

0. Some Terminology of Neural Networks
“N-layer neural network” – By naming convention, we do NOT count the input layer, because it has no parameters.
Size of the network – usually indicated by the number of nodes in each layer, starting from the input layer, e.g. [3, 4, 4, 1].
Hyper-parameters – parameters of the network/model whose values are set from outside (e.g. the learning rate η), rather than parameters whose values are determined and controlled internally by the algorithm. (A small configuration sketch follows below.)
Noriko Tomuro
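To make these terms concrete, here is a minimal configuration sketch (the variable names are illustrative, not from the course code): the network size and the hyper-parameters are simply values handed to the training procedure from outside.

layer_sizes = [3, 4, 4, 1]          # a 3-layer network: the input layer (3 nodes) is not counted
hyper_params = {
    "eta": 0.5,                     # learning rate η, chosen by the user
    "epochs": 30,                   # number of passes over the training data
    "mini_batch_size": 10,
}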

1. Notations in the NNDL book
Differences in the notations between Mitchell’s and NNDL (ch 1):
Perceptron output – written both in component notation and in vector notation.
Sigmoid (or logistic) function σ:
Mitchell: σ(net) = 1 / (1 + e^{−net}), where net = Σ_{i=0} w_i · x_i
NNDL: σ(z) = 1 / (1 + e^{−z}), where z = w · x + b, b is a bias, and b = −threshold
Noriko Tomuro
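A small sketch connecting the two notations (illustrative code, not from either book): the same neuron output computed with Mitchell’s net = Σ w_i·x_i convention and with NNDL’s z = w·x + b, using b = −threshold.

import numpy as np

x = np.array([0.5, -1.0, 2.0])     # inputs
w = np.array([0.1, 0.4, -0.3])     # weights
threshold = 0.2

# Mitchell: net = Σ_i w_i·x_i with the threshold folded in (x_0 = 1, w_0 = -threshold)
net = np.dot(w, x) - threshold
# NNDL: z = w·x + b with an explicit bias b = -threshold
b = -threshold
z = np.dot(w, x) + b

print(1 / (1 + np.exp(-net)), 1 / (1 + np.exp(-z)))   # same sigmoid output either way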

Mitchell vs. NNDL (continued)
Objective function (to minimize):
Mitchell: Error (sum of squared error). Note: no other error/cost function is used in the book.
NNDL: Cost function (quadratic cost; MSE). But most of the time only the generic symbol C is used, because several cost functions are discussed.
Gradient of the error/cost function: the vector of partial derivatives with respect to the weights, ∇C = (∂C/∂v_1, ∂C/∂v_2, …).
Weight change:
Weight vector: v = (v_1, v_2, …)
Individual weight: Δv_i = −η ∂C/∂v_i
Vector notation: Δv = −η∇C
Weight update rule: v → v′ = v − η∇C
Noriko Tomuro
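To make the update rule concrete, a minimal gradient-descent sketch (the cost here is a toy quadratic, C(v) = ½‖v‖², chosen only so that ∇C is trivial to write down):

import numpy as np

def grad_C(v):
    # Gradient of the toy cost C(v) = ½‖v‖²; for this cost ∇C = v
    return v

v = np.array([1.0, -2.0, 0.5])
eta = 0.1
for _ in range(100):
    v = v - eta * grad_C(v)        # weight update rule: v → v′ = v − η∇C
print(v)                           # converges toward the minimum at the origin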

Weight Update: batch vs. stochastic (Mitchell vs. NNDL)
Batch (Mitchell): the gradient is computed over the entire training set before each update, w → w − η∇E(w).
Stochastic/Online (Mitchell): the weights are updated after each individual training example d, w → w − η∇E_d(w), where E_d is the error on that single example.
Mini-batch stochastic (NNDL): the gradient is averaged over a small batch of m examples, w → w′ = w − (η/m) Σ_x ∇C_x.
Noriko Tomuro
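The three schemes differ only in how many examples contribute to each gradient. A sketch on a toy linear least-squares problem (illustrative helper names; the cost is ½‖Xw − y‖²):

import numpy as np

def grad_on(w, X, y):
    # Gradient of the quadratic cost ½‖Xw − y‖² over the given examples
    return X.T @ (X @ w - y) / len(y)

X = np.random.randn(100, 3)
y = np.random.randn(100)
w = np.zeros(3)
eta = 0.1

# Batch: one update per pass, using all examples
w = w - eta * grad_on(w, X, y)

# Stochastic/online: one update per example
for i in range(len(y)):
    w = w - eta * grad_on(w, X[i:i+1], y[i:i+1])

# Mini-batch: one update per small batch of m examples
m = 10
for start in range(0, len(y), m):
    batch = slice(start, start + m)
    w = w - eta * grad_on(w, X[batch], y[batch])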

Vector Notation and Multilayer Networks Noriko Tomuro

Bias – in a single neuron Noriko Tomuro

Bias – in a network of neurons Noriko Tomuro

2. The Backpropagation Algorithm
The Backpropagation algorithm (BP) finds/learns network weights so as to minimize the network error (cost function) by iteratively adjusting the weights.
Iterative weight updates are done by ‘rolling down’ the error surface (to the minimum point). The gradient descent algorithm is used for this procedure.
BP applies to networks with any number of layers (i.e., multi-layer neural networks). The error at the output layer is propagated back to the hidden layers, so as to adjust the weights between the hidden layers (as well as the weights connected to the output layer).
Noriko Tomuro

Mitchell vs. NNDL (ch 2)
Mitchell: The error function E is the (quadratic) sum of squared errors over multiple output units,
E(w) = ½ Σ_{d∈D} Σ_{k∈outputs} (t_{kd} − o_{kd})²
NNDL: The cost function C is left unspecified, but the activation function is the sigmoid:
σ(z) = 1 / (1 + e^{−z}), and σ′(z) = σ(z)·(1 − σ(z))
Noriko Tomuro
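These definitions translate directly into code; a small NumPy sketch (function names are illustrative):

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_prime(z):
    # σ′(z) = σ(z)·(1 − σ(z))
    return sigmoid(z) * (1.0 - sigmoid(z))

def sum_squared_error(targets, outputs):
    # E(w) = ½ Σ_d Σ_k (t_kd − o_kd)², summed over examples d and output units k
    return 0.5 * np.sum((targets - outputs) ** 2)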

Notations in the NNDL BP Algorithm (ch 2)
Indices and notation:
Activation of the jth neuron in the lth layer: a^l_j = σ(Σ_k w^l_{jk} · a^{l−1}_k + b^l_j)
Vector notation: a^l = σ(w^l · a^{l−1} + b^l)
Cost function (quadratic): C = ½ ‖y − a^L‖² = ½ Σ_j (y_j − a^L_j)²
Noriko Tomuro
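Written directly from the vector notation, a feedforward pass looks like the following sketch (the shapes follow NNDL’s convention of one weight matrix w^l and one bias vector b^l per layer; the layer sizes are just an example):

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

sizes = [3, 4, 4, 1]                                     # example layer sizes
weights = [np.random.randn(m, n) for n, m in zip(sizes[:-1], sizes[1:])]   # w^l: (this layer × previous layer)
biases = [np.random.randn(m, 1) for m in sizes[1:]]                        # b^l: one bias per neuron

def feedforward(a):
    # a^l = σ(w^l · a^(l−1) + b^l), applied layer by layer
    for w, b in zip(weights, biases):
        a = sigmoid(w @ a + b)
    return a

print(feedforward(np.random.randn(3, 1)))                # output of the final layer a^L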

The Hadamard product, s ⊙ t: the elementwise product of two vectors, (s ⊙ t)_j = s_j · t_j.
The four fundamental equations: Given z^l = w^l · a^{l−1} + b^l (or z^l_j = Σ_k w^l_{jk} · a^{l−1}_k + b^l_j), and the error of neuron j in layer l defined as δ^l_j = ∂C/∂z^l_j:
(BP1) δ^L = ∇_a C ⊙ σ′(z^L)
(BP2) δ^l = ((w^{l+1})^T · δ^{l+1}) ⊙ σ′(z^l)
(BP3) ∂C/∂b^l_j = δ^l_j
(BP4) ∂C/∂w^l_{jk} = a^{l−1}_k · δ^l_j
Noriko Tomuro
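In NumPy the Hadamard product is just the elementwise *, which is how BP1 is written in code. A tiny sketch for a toy output layer (values are arbitrary; ∇_a C = (a^L − y) assumes the quadratic cost):

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_prime(z):
    return sigmoid(z) * (1.0 - sigmoid(z))

s = np.array([1.0, 2.0])
t = np.array([3.0, 4.0])
print(s * t)                               # Hadamard product s ⊙ t = [3. 8.]

# BP1 for a toy output layer: δ^L = ∇_a C ⊙ σ′(z^L)
z_L = np.array([0.5, -1.0])                # weighted inputs of the output layer
a_L = sigmoid(z_L)                         # output activations
y = np.array([1.0, 0.0])                   # target
delta_L = (a_L - y) * sigmoid_prime(z_L)   # ∇_a C = (a^L − y) for the quadratic cost
print(delta_L)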

Rate of change of the cost: by (BP3), with respect to a bias, ∂C/∂b^l_j = δ^l_j; by (BP4), with respect to a weight, ∂C/∂w^l_{jk} = a^{l−1}_k · δ^l_j. Noriko Tomuro

NNDL BP Code
>>> import network
>>> net = network.Network([784, 30, 10])
>>> net.SGD(training_data, 30, 10, 3.0, test_data=test_data)
Noriko Tomuro

NNDL BP Code Noriko Tomuro
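The transcript does not reproduce the code itself, so the following is a condensed, illustrative sketch in the spirit of the book’s network.py (names and details may differ from the actual file): SGD shuffles the training data, splits it into mini-batches, and update_mini_batch averages the backprop gradients over each batch and applies w → w − (η/m) Σ ∇C_x; backprop implements BP1–BP4 from the earlier slides.

import random
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_prime(z):
    return sigmoid(z) * (1.0 - sigmoid(z))

class Network:
    # Condensed sketch in the spirit of NNDL's network.py; illustrative, not the actual file.
    def __init__(self, sizes):
        self.num_layers = len(sizes)
        self.biases = [np.random.randn(m, 1) for m in sizes[1:]]
        self.weights = [np.random.randn(m, n) for n, m in zip(sizes[:-1], sizes[1:])]

    def SGD(self, training_data, epochs, mini_batch_size, eta):
        # Stochastic gradient descent: shuffle, split into mini-batches, update per batch
        for _ in range(epochs):
            random.shuffle(training_data)
            for k in range(0, len(training_data), mini_batch_size):
                self.update_mini_batch(training_data[k:k + mini_batch_size], eta)

    def update_mini_batch(self, mini_batch, eta):
        # Average the backprop gradients over the batch, then step: w → w − (η/m) Σ ∇C_x
        nabla_b = [np.zeros(b.shape) for b in self.biases]
        nabla_w = [np.zeros(w.shape) for w in self.weights]
        for x, y in mini_batch:
            db, dw = self.backprop(x, y)
            nabla_b = [nb + d for nb, d in zip(nabla_b, db)]
            nabla_w = [nw + d for nw, d in zip(nabla_w, dw)]
        m = len(mini_batch)
        self.biases = [b - (eta / m) * nb for b, nb in zip(self.biases, nabla_b)]
        self.weights = [w - (eta / m) * nw for w, nw in zip(self.weights, nabla_w)]

    def backprop(self, x, y):
        # Forward pass, storing all z^l and a^l, then BP1–BP4 from the earlier slides
        a, activations, zs = x, [x], []
        for w, b in zip(self.weights, self.biases):
            z = w @ a + b
            zs.append(z)
            a = sigmoid(z)
            activations.append(a)
        delta = (activations[-1] - y) * sigmoid_prime(zs[-1])              # BP1 (quadratic cost)
        nabla_b = [None] * len(self.biases)
        nabla_w = [None] * len(self.weights)
        nabla_b[-1], nabla_w[-1] = delta, delta @ activations[-2].T        # BP3, BP4
        for l in range(2, self.num_layers):
            delta = (self.weights[-l + 1].T @ delta) * sigmoid_prime(zs[-l])   # BP2
            nabla_b[-l], nabla_w[-l] = delta, delta @ activations[-l - 1].T    # BP3, BP4
        return nabla_b, nabla_w

# Usage analogous to the transcript above, with random stand-in data:
training_data = [(np.random.randn(784, 1), np.random.randn(10, 1)) for _ in range(50)]
net = Network([784, 30, 10])
net.SGD(training_data, epochs=2, mini_batch_size=10, eta=3.0)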