Introduction to Neural Networks


1 Introduction to Neural Networks

2 Overview
The motivation for NNs
Neuron – The basic unit
Fully-connected Neural Networks
Feedforward (inference)
The linear algebra behind it
Convolutional Neural Networks & Deep Learning

3 The Brain does Complex Tasks
[Slide illustrations: computing 3×4 = 12; recognizing a tiger ("Danger!") vs. a harmless scene ("Fine").]

4 Inside the Brain

5 Real vs. Artificial Neuron
[Diagram: a biological neuron (inputs, outputs) next to an artificial neuron: inputs $I_1, I_2, I_3$ with weights $w_1, w_2, w_3$ feed a function $f(\cdot)$ that produces the output.]

6 Neuron – General Model
[Diagram: inputs $I_1, \dots, I_N$ with weights $w_1, \dots, w_N$ feed a weighted sum $\Sigma$ followed by an activation function $\sigma$, producing the activation $a$.]
$a = \sigma\left(\sum_{j=1}^{N} I_j \cdot w_j\right)$
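As a concrete illustration, a single neuron's forward pass can be written in a few lines of C. This is a minimal sketch; the function names and the choice of sigmoid as $\sigma$ are ours, not from the slides:

    #include <math.h>

    /* Sigmoid activation: sigma(z) = 1 / (1 + e^{-z}) */
    static double sigmoid(double z) { return 1.0 / (1.0 + exp(-z)); }

    /* One neuron with N inputs: a = sigma( sum_j I[j] * w[j] ) */
    double neuron(const double *I, const double *w, int N) {
        double z = 0.0;
        for (int j = 0; j < N; j++)
            z += I[j] * w[j];          /* weighted sum of the inputs */
        return sigmoid(z);             /* activation */
    }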

7 Neuron Activation Functions
The most popular activation functions:
Sigmoid was the first to be used; tanh came later.
The Rectified Linear Unit (ReLU) is the most widely used today (and also the simplest function); it allows for faster network training (discussed later).
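The three activations mentioned above are easy to write down directly; a minimal C sketch (the function names are ours):

    #include <math.h>

    double sigmoid_act(double z) { return 1.0 / (1.0 + exp(-z)); }  /* range (0, 1)  */
    double tanh_act(double z)    { return tanh(z); }                /* range (-1, 1) */
    double relu_act(double z)    { return z > 0.0 ? z : 0.0; }      /* max(0, z)     */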

8 Overview
The motivation for NNs
Neuron – The basic unit
Fully-connected Neural Networks
Feedforward (inference)
The linear algebra behind it
Convolutional Neural Networks & Deep Learning

9 Feedforward Neural Network
[Diagram: a fully-connected network with an input layer, hidden layers, and an output layer.]

10 Feedforward Neural Network

11 Feedforward – How Does it Work?
[Diagram: the flow of computation runs from the input layer ($I_1, I_2, I_3$) through the hidden layers to the output layer ($O_1, O_2$).]

12 Feedforward – What Can it Do?
Classification region vs. # of layers in a NN: more neurons → more complexity in classification.

13 Indexing Conventions
$a_3^{1}$ (activation): the superscript is the layer index and the subscript is the neuron index (here: neuron 3 of layer 1).
$w_{1,3}^{2}$ (weight): the superscript is the layer index, the first subscript is the neuron index, and the second subscript is the weight index within that neuron (here: weight 3 of neuron 1 in layer 2).

14 Feedforward – General Equations
Weights matrix of layer $j$ (row $k$ holds the weights of neuron $k$):
$W_{[m,n]}^{j} = \begin{pmatrix} w_{1,1}^{j} & \cdots & w_{1,n}^{j} \\ \vdots & & \vdots \\ w_{k,1}^{j} & \cdots & w_{k,n}^{j} \\ \vdots & & \vdots \\ w_{m,1}^{j} & \cdots & w_{m,n}^{j} \end{pmatrix}, \qquad I = \begin{pmatrix} I_1 \\ I_2 \\ \vdots \\ I_N \end{pmatrix}$
where $m$ = number of neurons in the layer and $n$ = number of inputs per neuron = number of activations of layer $j-1$.

15 The Linear Algebra Behind Feedforward: Example
1st hidden layer weights: $W_{[5,3]}^{1} = \begin{pmatrix} w_{1,1}^{1} & w_{1,2}^{1} & w_{1,3}^{1} \\ \vdots & & \vdots \\ w_{5,1}^{1} & w_{5,2}^{1} & w_{5,3}^{1} \end{pmatrix}$, input $I = \begin{pmatrix} I_1 \\ I_2 \\ I_3 \end{pmatrix}$.
1st neuron activation: $a_1^{1} = \sigma\left(\sum_{j=1}^{3} w_{1,j}^{1} \cdot I_j\right)$.
The whole layer at once:
$a^{1} = \sigma\left(W^{1} \cdot I\right) = \sigma\left(\begin{pmatrix} w_{1,1}^{1} & w_{1,2}^{1} & w_{1,3}^{1} \\ \vdots & & \vdots \\ w_{5,1}^{1} & w_{5,2}^{1} & w_{5,3}^{1} \end{pmatrix} \cdot \begin{pmatrix} I_1 \\ I_2 \\ I_3 \end{pmatrix}\right) = \begin{pmatrix} \sigma(\sum_j w_{1,j}^{1} I_j) \\ \vdots \\ \sigma(\sum_j w_{5,j}^{1} I_j) \end{pmatrix}$
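A fully-connected layer is exactly this matrix–vector product followed by an element-wise activation. A minimal C sketch under our own naming (row-major weight matrix, sigmoid as $\sigma$):

    #include <math.h>

    /* a[k] = sigma( sum_j W[k][j] * in[j] ), k = 0..m-1.
     * W is stored row-major: W[k*n + j] is weight j of neuron k. */
    void dense_layer(const double *W, const double *in, double *a, int m, int n) {
        for (int k = 0; k < m; k++) {
            double z = 0.0;
            for (int j = 0; j < n; j++)
                z += W[k * n + j] * in[j];     /* one row of the matrix times the input */
            a[k] = 1.0 / (1.0 + exp(-z));      /* sigmoid activation */
        }
    }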

16 The Linear Algebra Behind Feedforward
From layer 1: $a_{[5,1]}^{1} = \begin{pmatrix} \sigma(\sum_j w_{1,j}^{1} I_j) \\ \vdots \\ \sigma(\sum_j w_{5,j}^{1} I_j) \end{pmatrix}$.
2nd hidden layer weights: $W_{[4,5]}^{2} = \begin{pmatrix} w_{1,1}^{2} & w_{1,2}^{2} & \cdots & w_{1,5}^{2} \\ \vdots & & & \vdots \\ w_{4,1}^{2} & w_{4,2}^{2} & \cdots & w_{4,5}^{2} \end{pmatrix}$
$a^{2} = \sigma\left(W^{2} \cdot a^{1}\right) = \sigma\left(\begin{pmatrix} w_{1,1}^{2} & \cdots & w_{1,5}^{2} \\ \vdots & & \vdots \\ w_{4,1}^{2} & \cdots & w_{4,5}^{2} \end{pmatrix} \cdot \begin{pmatrix} a_1^{1} \\ a_2^{1} \\ \vdots \\ a_5^{1} \end{pmatrix}\right) = \begin{pmatrix} \sigma(\sum_j w_{1,j}^{2} a_j^{1}) \\ \vdots \\ \sigma(\sum_j w_{4,j}^{2} a_j^{1}) \end{pmatrix}$

17 The Linear Algebra Behind Feedforward
In general, for layer $k$ with $m$ neurons and $n$ inputs per neuron ($n$ = number of activations of layer $k-1$):
$a^{k} = \sigma\left(W^{k} \cdot a^{k-1}\right) = \sigma\left(\begin{pmatrix} w_{1,1}^{k} & \cdots & w_{1,n}^{k} \\ \vdots & & \vdots \\ w_{m,1}^{k} & \cdots & w_{m,n}^{k} \end{pmatrix} \cdot \begin{pmatrix} a_1^{k-1} \\ a_2^{k-1} \\ \vdots \\ a_n^{k-1} \end{pmatrix}\right) = \begin{pmatrix} \sigma(\sum_j w_{1,j}^{k} a_j^{k-1}) \\ \vdots \\ \sigma(\sum_j w_{m,j}^{k} a_j^{k-1}) \end{pmatrix}$
Unrolling the recursion: $a^{k} = \sigma\left(W^{k} \cdot a^{k-1}\right) = \sigma\left(W^{k} \cdot \sigma\left(W^{k-1} \cdot a^{k-2}\right)\right) = \dots$
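Inference is therefore just this matrix–vector product applied layer after layer. A sketch of the loop, reusing the hypothetical dense_layer from the earlier example (the naming of the size and weight arrays is ours):

    /* Feedforward through k layers; returns a pointer to the final activations.
     * sizes[0] = input dimension, sizes[i] = width of layer i (i = 1..k);
     * W[i] = row-major [sizes[i] x sizes[i-1]] weight matrix of layer i (W[0] unused).
     * buf_a and buf_b must each hold at least the widest layer. */
    const double *feedforward(double *const *W, const int *sizes, int k,
                              const double *input, double *buf_a, double *buf_b) {
        const double *in = input;
        for (int i = 1; i <= k; i++) {
            double *out = (i % 2) ? buf_a : buf_b;        /* alternate scratch buffers */
            dense_layer(W[i], in, out, sizes[i], sizes[i - 1]);
            in = out;                                     /* this layer's output feeds the next */
        }
        return in;
    }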

18 Number of Computations
Given a NN with $k$ layers (hidden + output), where layer $i$ has $n_i$ neurons and $n_0$ is the input size:
Total DMV (Dense Matrix × Vector) multiplications = $k$.
Each layer $i$ computes $W_{[n_i,\,n_{i-1}]}^{i} \cdot a^{i-1}$: an $n_i \times n_{i-1}$ matrix times a vector of length $n_{i-1}$.
Time complexity = $O\left(\sum_{i=1}^{k} n_i \cdot n_{i-1}\right)$.
Memory complexity = $O\left(\sum_{i=1}^{k} n_i \cdot n_{i-1}\right)$.
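The sum above is easy to evaluate for a concrete architecture; a small helper of our own, purely for illustration:

    /* Number of multiply-accumulates (= number of weights) for a network whose
     * layer widths are sizes[0..k], with sizes[0] being the input dimension. */
    long nn_cost(const int *sizes, int k) {
        long total = 0;
        for (int i = 1; i <= k; i++)
            total += (long)sizes[i] * sizes[i - 1];   /* n_i * n_{i-1} for layer i */
        return total;
    }

    /* Example: a 3-5-4-2 network -> 3*5 + 5*4 + 4*2 = 43 weights / multiplications. */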

19 Small Leftover – The Bias
Trick: the last activation of every layer (and of the input) is a constant 1; each neuron's bias then becomes just one more weight, i.e. an extra column in the weight matrix:
$a^{1} = \sigma\left(W^{1} \cdot I\right) = \sigma\left(\begin{pmatrix} w_{1,1}^{1} & w_{1,2}^{1} & w_{1,3}^{1} & w_{1,4}^{1} \\ \vdots & & & \vdots \\ w_{5,1}^{1} & w_{5,2}^{1} & w_{5,3}^{1} & w_{5,4}^{1} \end{pmatrix} \cdot \begin{pmatrix} I_1 \\ I_2 \\ I_3 \\ 1 \end{pmatrix}\right)$
$a^{2} = \sigma\left(W^{2} \cdot a^{1}\right) = \sigma\left(\begin{pmatrix} w_{1,1}^{2} & \cdots & w_{1,5}^{2} & w_{1,6}^{2} \\ \vdots & & & \vdots \\ w_{4,1}^{2} & \cdots & w_{4,5}^{2} & w_{4,6}^{2} \end{pmatrix} \cdot \begin{pmatrix} a_1^{1} \\ a_2^{1} \\ \vdots \\ a_5^{1} \\ 1 \end{pmatrix}\right)$
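In code the same trick is a single extra slot; a sketch with our own names, again reusing the hypothetical dense_layer from above:

    /* Layer with bias via the "constant 1" trick:
     * W_b has n+1 columns; the last column holds the biases. */
    void dense_layer_bias(const double *W_b, const double *in, double *a, int m, int n) {
        double in_b[n + 1];                  /* C99 VLA: input extended by one slot */
        for (int j = 0; j < n; j++)
            in_b[j] = in[j];
        in_b[n] = 1.0;                       /* the constant "bias" activation */
        dense_layer(W_b, in_b, a, m, n + 1); /* now an m x (n+1) matrix-vector product */
    }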

20 Classification – Softmax Layer
In classification, ideally one output is '1' and the rest are '0', but a neuron's weighted sum is not limited to 1. The solution: softmax.
$z_k = \sum_{j=1}^{N} w_{k,j}^{1} \cdot I_j, \qquad a_k = \frac{e^{z_k}}{\sum_{j \in \text{output}} e^{z_j}}, \qquad \forall k:\ a_k \in (0,1]$
(and the outputs sum to 1).
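A direct implementation of the formula; a minimal C sketch (the function name is ours). Subtracting the maximum $z$ before exponentiating is a standard guard against overflow and does not change the result:

    #include <math.h>

    /* a[k] = exp(z[k]) / sum_j exp(z[j]),  k = 0..n-1 */
    void softmax(const double *z, double *a, int n) {
        double zmax = z[0];
        for (int k = 1; k < n; k++)            /* find the max for numerical stability */
            if (z[k] > zmax) zmax = z[k];
        double sum = 0.0;
        for (int k = 0; k < n; k++) {
            a[k] = exp(z[k] - zmax);           /* shifted exponent, same ratios */
            sum += a[k];
        }
        for (int k = 0; k < n; k++)
            a[k] /= sum;                       /* normalize: outputs sum to 1 */
    }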

21 Overview
The motivation for NNs
Neuron – The basic unit
Fully-connected Neural Networks
Feedforward (inference)
The linear algebra behind it
Convolutional Neural Networks & Deep Learning

22 Convolutional Neural Networks
A main building block of deep learning; called convnets or CNNs for short.
Motivation: when the input data is an image (e.g. 1000×1000 pixels), spatial correlation is local, so it is better to put the resources elsewhere.

23 Convolutional Layer
Reduce connectivity to local regions.
Example: 1000×1000 image, 100 different filters, filter size 10×10 → 10K parameters.
Every filter is different.

24 Convnets
Sliding window computation:
[Animation: a 3×3 filter with weights $w_{1,1}, \dots, w_{3,3}$ slides over the input.]

25–32 Convnets
Sliding window computation (animation continues: the filter slides step by step across the input).

33 Conv Layer – The Math
$w$ – filter kernel of size $K \times K$; $x$ – input.
$a_{i,j} = (w * x)_{i,j} = \sum_{p=1}^{K} \sum_{q=1}^{K} w_{p,q} \cdot x_{i+p,\,j+q}$
[Diagram: input * filter = output feature map.]
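The double sum translates directly into two inner loops; a minimal C sketch with our own naming and zero-based indexing (the slides index from 1):

    /* a[i][j] = sum_{p,q} w[p][q] * x[i+p][j+q]
     * x: H x W input, w: K x K kernel, a: (H-K+1) x (W-K+1) output ("valid" region only).
     * All arrays are row-major 1-D buffers. */
    void conv2d(const double *x, int H, int W, const double *w, int K, double *a) {
        for (int i = 0; i <= H - K; i++)
            for (int j = 0; j <= W - K; j++) {
                double s = 0.0;
                for (int p = 0; p < K; p++)
                    for (int q = 0; q < K; q++)
                        s += w[p * K + q] * x[(i + p) * W + (j + q)];
                a[i * (W - K + 1) + j] = s;   /* one output pixel per window position */
            }
    }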

34 Conv Layer – Parameters
Zero padding – add surrounding zeros so that the output size equals the input size.
Stride – the number of pixels the filter moves at each step.
[Diagram: input * filter = output, with stride = 2.]
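Padding and stride together determine the output size: for input size $N$, kernel $K$, padding $P$ and stride $S$, the usual formula is $\lfloor (N + 2P - K)/S \rfloor + 1$. A one-line helper of our own, for reference:

    /* Output size along one dimension for a convolution with
     * input size N, kernel size K, zero padding P and stride S. */
    int conv_out_size(int N, int K, int P, int S) {
        return (N + 2 * P - K) / S + 1;   /* integer division = floor for positive values */
    }

    /* Example: conv_out_size(1000, 10, 0, 1) == 991; with stride 2 -> 496. */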

35 Conv Layer – Multiple Inputs
The output is the sum of the convolutions over all input feature maps; using several filters gives multiple outputs, called feature maps.
[Diagram: M input feature maps, filter kernels, Σ, N output feature maps.]

    // AL[n][y][x]  : output feature map n (N maps of size Y x X)
    // AL_1[m][y][x]: input feature map m (M maps, padded to (Y+K-1) x (X+K-1))
    // w[m][n][p][q]: K x K kernel connecting input map m to output map n
    for (n = 0; n < N; n++)              // for every output feature map
      for (m = 0; m < M; m++)            // accumulate over all input feature maps
        for (y = 0; y < Y; y++)
          for (x = 0; x < X; x++)
            for (p = 0; p < K; p++)
              for (q = 0; q < K; q++)
                AL[n][y][x] += AL_1[m][y + p][x + q] * w[m][n][p][q];

36 Pooling Layer
Multiple inputs → single output; reduces the amount of data.
Several types: max (the most common), average.
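For instance, 2×2 max pooling keeps the largest value of every 2×2 block and halves each spatial dimension; a minimal C sketch (our own naming):

    /* 2x2 max pooling with stride 2: in is H x W (H, W even), out is (H/2) x (W/2). */
    void maxpool2x2(const double *in, int H, int W, double *out) {
        for (int i = 0; i < H / 2; i++)
            for (int j = 0; j < W / 2; j++) {
                double m = in[(2 * i) * W + 2 * j];           /* top-left of the block */
                for (int p = 0; p < 2; p++)
                    for (int q = 0; q < 2; q++) {
                        double v = in[(2 * i + p) * W + (2 * j + q)];
                        if (v > m) m = v;                     /* keep the maximum */
                    }
                out[i * (W / 2) + j] = m;
            }
    }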

37 Putting it All Together – AlexNet
The first CNN to win the ImageNet challenge. ImageNet: 1.2M 256×256 images, 1000 classes.
Krizhevsky et al., "ImageNet Classification with Deep CNNs", NIPS 2012.
Zeiler, Matthew D., and Rob Fergus, "Visualizing and Understanding Convolutional Networks", European Conference on Computer Vision, 2014.

38 AlexNet – Inside the Feature Maps
Zeiler, Matthew D., and Rob Fergus, "Visualizing and Understanding Convolutional Networks", European Conference on Computer Vision, Springer International Publishing, 2014.

39 Putting it All Together – AlexNet
Trained for 1 week on 2 GPUs (GeForce GTX 580, 3 GB memory each).
Krizhevsky et al., "ImageNet Classification with Deep CNNs", NIPS 2012.

40 Architecture for Classification
Total nr. params: 60M; total nr. flops: 832M. Layer stack from the category prediction at the top down to the input (params / flops per layer):
LINEAR: 4M / 4M
FULLY CONNECTED: 16M / 16M
FULLY CONNECTED: 37M / 37M
MAX POOLING
CONV: 442K / 74M
CONV: 1.3M / 224M
CONV: 884K / 149M
MAX POOLING
LOCAL CONTRAST NORM
CONV: 307K / 223M
MAX POOLING
LOCAL CONTRAST NORM
CONV: 35K / 105M
(Slide adapted from Ranzato.) Krizhevsky et al., "ImageNet Classification with Deep CNNs", NIPS 2012.

41 Convnet – Bigger is Better?
GoogLeNet (2014) – ImageNet winner.
Szegedy, Christian, et al., "Going Deeper with Convolutions", Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015.

42 Thank you

