
Deep Learning: A Tutorial for Dummies. Param Vir Singh, Shunyuan Zhang, Nikhil Malik

Dokyun Lee: The Deep Learner http://leedokyun.com/deep-learning-reading-list.html

“Deep Learning doesn’t do different things, it does things differently.”

Performance vs. Size of Data (chart): as the amount of training data grows, the performance of deep learning algorithms keeps improving, while traditional ML algorithms plateau.

Outline
Supervised Learning: Convolutional Neural Network; Sequence Modelling: RNN and its extensions
Unsupervised Learning: Autoencoder; Stacked Denoising Autoencoder
Unsupervised (+Supervised) Learning: Generative Adversarial Networks
Reinforcement Learning: Deep Reinforcement Learning

Example (GAN): to produce poetry like Shakespeare, a generator network outputs generated Shakespeare poetry, and a discriminator network takes generated and real Shakespeare poetry as input and labels each as fake or real.

Supervised Learning. Traditional pattern recognition models work with hand-crafted features and relatively simple trainable classifiers: Input → Extract Hand-Crafted Features → Trainable Classifier (e.g. SVM, Random Forest) → Output (e.g. Outdoor: Yes or No). Limitations: it is very tedious and costly to develop hand-crafted features, and the hand-crafted features are usually highly dependent on one application.

Deep Learning. Deep learning has an inbuilt, automatic, multi-stage feature learning process that learns rich hierarchical representations (i.e. features): Input → Low-Level Features → Mid-Level Features → High-Level Features → Trainable Classifier → Output (e.g. outdoor, indoor).

Deep Learning: feature hierarchy by input type.
Image: Pixel → Edge → Texture → Motif → Part → Object
Text: Character → Word → Word-group → Clause → Sentence → Story
Each module in deep learning transforms its input representation into a higher-level one, in a way similar to the human cortex.

Let us see how it all works!

A Simple Neural Network. An artificial neural network is an information processing paradigm inspired by biological nervous systems, such as the human brain's information processing mechanism. Diagram: inputs x1, x2, x3, x4 feed a hidden layer of units a1(1), a2(1), a3(1), a4(1), which feeds a second-layer unit a1(2) and then the output Y (Input → Hidden Layers → Output).

A Simple Neural Network: computation. Each unit applies an activation function f() to a weighted sum of its inputs:
a1(1) = f(w1*x1 + w2*x2 + w3*x3 + w4*x4)
f() is an activation function such as ReLU or sigmoid. With ReLU, f(z) = max(0, z), so
a1(1) = max(0, w1*x1 + w2*x2 + w3*x3 + w4*x4)
With the sigmoid, f(z) = 1 / (1 + e^(-z)); at the output, Y = 1 / (1 + e^(-w*a1(2))), or a softmax for multi-class outputs.
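
A minimal NumPy sketch of this forward pass. The input values, the random weight matrices, the 0.1 scaling, and the single output weight w_out are illustrative assumptions, not values from the slides:

import numpy as np

def relu(z):
    return np.maximum(0.0, z)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Illustrative input and weights (4 inputs -> 4 hidden units -> 1 unit -> output).
x = np.array([1.0, 2.0, 3.0, 4.0])       # x1..x4
W1 = np.random.randn(4, 4) * 0.1          # weights into the hidden layer a(1)
W2 = np.random.randn(1, 4) * 0.1          # weights into the second-layer unit a1(2)
w_out = 0.5                               # weight from a1(2) to the output

a1 = relu(W1 @ x)          # a(1) = f(W1 x) with ReLU activation
a2 = relu(W2 @ a1)         # a1(2)
Y = sigmoid(w_out * a2)    # output Y = 1 / (1 + e^(-w * a1(2)))
print(Y)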

Number of Parameters. For the small network above (4 inputs → 4 hidden units → 1 second-layer unit → output Y via softmax): 4*4 + 4 + 1 = 21 weights.

What if the input is an image? A 400 x 400 x 3 image has 480,000 input values. With a fully connected hidden layer of the same size, the number of parameters is 480,000*480,000 + 480,000 + 1 = approximately 230 billion! Even with only 1,000 hidden units, it is 480,000*1,000 + 1,000 + 1 = approximately 480 million!
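
The arithmetic is easy to check; this snippet just reproduces the slide's parameter counts:

# Parameter counts for a fully connected layer on a 400 x 400 x 3 image,
# following the formulas on the slide.
n_inputs = 400 * 400 * 3                      # 480,000 input values

same_size_hidden = n_inputs * n_inputs + n_inputs + 1
print(f"{same_size_hidden:,}")                # ~230 billion

small_hidden = n_inputs * 1000 + 1000 + 1
print(f"{small_hidden:,}")                    # ~480 million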

Let us see how convolutional layers help.

Convolutional Layers. A small filter (e.g. the 3x3 edge-detection kernel with rows 0 1 0 / 1 -4 1 / 0 1 0) is slid over the input image to produce a convolved image. Inspired by the neurophysiological experiments conducted by Hubel and Wiesel (1962).

Convolutional Layers: What is Convolution? A 2x2 filter with weights w1, w2, w3, w4 slides over a 4x4 input image with entries a b c d / e f g h / i j k l / m n o p, producing a convolved image (feature map) with entries h1, h2, ...:
h1 = f(a*w1 + b*w2 + e*w3 + f*w4)
h2 = f(b*w1 + c*w2 + f*w3 + g*w4)
Number of parameters for one feature map = 4; for 100 feature maps = 4*100 = 400.
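
A minimal NumPy sketch of this 2x2 convolution (stride 1, no padding, assuming ReLU as the activation f; the numeric input and kernel values are illustrative):

import numpy as np

def conv2d(image, kernel):
    # Valid 2D convolution (cross-correlation, as used in CNNs) with stride 1.
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i+kh, j:j+kw] * kernel)
    return out

image = np.arange(16, dtype=float).reshape(4, 4)      # stands in for a..p
kernel = np.array([[1.0, 0.0],
                   [0.0, -1.0]])                      # w1..w4 (illustrative)
feature_map = np.maximum(0.0, conv2d(image, kernel))  # apply activation f = ReLU
print(feature_map)   # 3x3 feature map; only 4 weights are shared across positions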

Lower-Level to More Complex Features. Stacking convolutional layers (Input Image → Filter 1 → Layer 1 Feature Map → Filter 2 → Layer 2 Feature Map) builds increasingly complex features. In convolutional neural networks, hidden units are only connected to a local receptive field.

Pooling. Max pooling reports the maximum output within a rectangular neighborhood; average pooling reports the average output of a rectangular neighborhood. Example on the slide: max pooling with a 2x2 filter and a stride of 2 maps each 2x2 block of the input matrix to a single value in the output matrix.
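
A minimal NumPy sketch of 2x2 max pooling with stride 2 (the input matrix is illustrative, not the exact one from the slide):

import numpy as np

def max_pool_2x2(x):
    # Max pooling with a 2x2 window and stride 2 (assumes even height and width).
    h, w = x.shape
    out = np.zeros((h // 2, w // 2))
    for i in range(0, h, 2):
        for j in range(0, w, 2):
            out[i // 2, j // 2] = np.max(x[i:i+2, j:j+2])
    return out

x = np.array([[1, 3, 5, 4],
              [2, 4, 5, 3],
              [0, 1, 2, 2],
              [3, 1, 0, 4]], dtype=float)
print(max_pool_2x2(x))   # 2x2 output: [[4, 5], [3, 4]]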

Convolutional Neural Network. A typical feature extraction architecture stacks convolution filters and max pooling layers with an increasing number of feature maps (64, 128, 256, 512, 512), followed by fully connected layers that produce the output vector, e.g. scores for Living Room, Bed Room, Kitchen, Bathroom, Outdoor.

Convolutional Neural Networks: practical notes.
Output: binary, multinomial, continuous, or count.
Input: fixed size; padding can be used to make all images the same size.
Architecture: the choice is ad hoc and requires experimentation.
Optimization: backward propagation; the parameters of a very deep model can be estimated properly only if you have billions of images, so in practice use an architecture and trained parameters from other papers (ImageNet models, Microsoft/Google APIs, etc.).
Computing power: buy a GPU!
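
A minimal PyTorch sketch of a small CNN in this spirit, assuming five room-type classes and 400x400 RGB inputs; the layer sizes (64, 128, 256 filters) are illustrative rather than the exact architecture on the slide, and in practice one would start from a pretrained ImageNet model as the slide suggests:

import torch
import torch.nn as nn

class RoomCNN(nn.Module):
    def __init__(self, num_classes=5):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 64, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(64, 128, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(128, 256, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.classifier = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),          # pool each feature map to 1x1
            nn.Flatten(),
            nn.Linear(256, num_classes),      # fully connected output layer
        )

    def forward(self, x):
        return self.classifier(self.features(x))

model = RoomCNN()
logits = model(torch.randn(1, 3, 400, 400))   # one fake 400x400 RGB image
print(logits.shape)                            # torch.Size([1, 5])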

Automatic Colorization of Black and White Images

Optimizing Images. Post-processing feature optimization for color curves and details, for illumination, and for color tone (warmness).

Recurrent Neural Networks

Why RNN? Limitations of convolutional neural networks: they take fixed-length vectors as input, produce fixed-length vectors as output, and allow only a fixed number of computational steps. We need to model data with temporal or sequential structure and varying lengths of inputs and outputs, e.g. "This movie is ridiculously good." vs. "This movie is very slow in the beginning but picks up pace later on and has some great action sequences and comedy scenes."

Modeling Sequences.
Sentiment analysis: "Awesome tutorial." → Positive
Machine translation: "Happy Diwali" → "शुभ दीपावली"
Image captioning: image → "A person riding a motorbike on dirt road"

What is an RNN? Recurrent neural networks are connectionist models with the ability to selectively pass information across sequence steps while processing sequential data one element at a time. This allows a memory of the previous inputs to persist in the model's internal state and influence the outcome. Diagram: the input x(t) and the delayed hidden state h(t-1) feed the hidden layer, which produces h(t) and the output.

RNN (unrolled over time). Example input sequence: "RNN is awesome". Each word x_t is fed in turn, and the hidden state is updated as
h_t = f_h(w_h*h_{t-1} + w_x*x_t)
starting from h0 and giving h1, h2, h3; the output is produced as f_y(w_y*h_t). The same weights w_h and w_x are shared across all time steps.
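
A minimal NumPy sketch of this unrolled recurrence for a 3-step sequence; the hidden size, random weight matrices, and made-up word vectors are illustrative assumptions:

import numpy as np

def f_h(z):                       # hidden activation, e.g. tanh
    return np.tanh(z)

hidden = 3                        # hidden state size (illustrative)
W_h = np.random.randn(hidden, hidden) * 0.1   # hidden-to-hidden weights w_h
W_x = np.random.randn(hidden, hidden) * 0.1   # input-to-hidden weights w_x

# Made-up embeddings for the words "RNN", "is", "awesome".
sequence = [np.random.randn(hidden) for _ in range(3)]

h = np.zeros(hidden)              # h0
for x_t in sequence:              # h_t = f_h(W_h h_{t-1} + W_x x_t)
    h = f_h(W_h @ h + W_x @ x_t)
print(h)                          # h3, which would feed the output layer f_y(W_y h_3)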

RNN (unrolled over time). Example input: "RNN is so cool". The same recurrence applies: h1 = f_h(w_h*h0 + w_x*x1), and an output f_y(w_y*h1) can be produced; the pattern repeats for the remaining words.

The Vanishing Gradient Problem. RNNs are trained with backpropagation, which uses the chain rule, and the chain rule multiplies derivatives. If these derivatives are between 0 and 1, the product vanishes as the chain gets longer; if they are greater than 1, the product explodes. The sigmoid activation function in an RNN leads to this problem. ReLU avoids it in theory, but not in practice.
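
A quick numeric illustration of how a long chain of multiplied derivatives vanishes or explodes; the per-step values 0.25 and 1.5 and the 50-step horizon are illustrative (0.25 is the maximum slope of the sigmoid):

# Product of T per-step derivatives, as in backpropagation through time.
for d, label in [(0.25, "sigmoid-like (< 1)"), (1.5, "explosive (> 1)")]:
    product = 1.0
    for t in range(50):          # 50 time steps
        product *= d
    print(f"{label}: product over 50 steps = {product:.3e}")
# 0.25**50 ~ 7.9e-31 (vanishes); 1.5**50 ~ 6.4e+08 (explodes)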

Problem with Vanishing or Exploding Gradients: they don't allow us to learn long-term dependencies. Compare "Param is a hard worker." with "Param, student of Yong, is a hard worker." In the second sentence an intervening clause separates the subject from the predicate, so information must be carried across a longer gap, and a plain RNN handles this badly.

Long Short-Term Memory (LSTM). LSTMs provide a solution to the vanishing/exploding gradient problem. The solution is a memory cell, which is updated at each step in the sequence. Three gates control the flow of information to and from the memory cell. Input gate: protects the current step from irrelevant inputs. Output gate: prevents the current step from passing irrelevant information to later steps. Forget gate: limits the information passed from one cell to the next.

LSTM: cell update. The candidate update u1 = f_h(w_h*h0 + w_x*x1) is combined with the previous cell state c0 via the forget gate f1 and the input gate i1:
c1 = f1 . c0 + i1 . u1
In general, c_t = f_t . c_{t-1} + i_t . u_t.

LSTM: forget gate. The forget gate is computed from the previous hidden state and the current input:
f1 = f_f(W_hf*h0 + W_xf*x1)
In general, f_t = f_f(W_hf*h_{t-1} + W_xf*x_t).

LSTM: input gate. The input gate is computed analogously:
i1 = f_i(W_hi*h0 + W_xi*x1)
In general, i_t = f_i(W_hi*h_{t-1} + W_xi*x_t).

LSTM: output gate and hidden state. The output gate is
o_t = f_o(W_ho*h_{t-1} + W_xo*x_t)
and the new hidden state is
h_t = o_t . tanh(c_t).

LSTM (unrolled over two steps). Step 1: x1 and h0 produce the candidate u1 and the gates f1, i1, o1, giving c1 = f1 . c0 + i1 . u1 and h1 = o1 . tanh(c1). Step 2: x2 and h1 produce u2, f2, i2, o2, giving c2 and h2 in the same way.
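
A minimal NumPy sketch of a single LSTM step following these equations, with sigmoid gates and a tanh candidate; the hidden size, random weights, and input sequence are illustrative assumptions:

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

n = 4                                 # hidden/cell size (illustrative)
rng = np.random.default_rng(0)
# One weight pair (W_h*, W_x*) per gate and for the candidate update u.
W = {g: (rng.normal(0, 0.1, (n, n)), rng.normal(0, 0.1, (n, n)))
     for g in ["f", "i", "o", "u"]}

def lstm_step(x_t, h_prev, c_prev):
    f_t = sigmoid(W["f"][0] @ h_prev + W["f"][1] @ x_t)   # forget gate
    i_t = sigmoid(W["i"][0] @ h_prev + W["i"][1] @ x_t)   # input gate
    o_t = sigmoid(W["o"][0] @ h_prev + W["o"][1] @ x_t)   # output gate
    u_t = np.tanh(W["u"][0] @ h_prev + W["u"][1] @ x_t)   # candidate update
    c_t = f_t * c_prev + i_t * u_t                        # c_t = f_t . c_{t-1} + i_t . u_t
    h_t = o_t * np.tanh(c_t)                              # h_t = o_t . tanh(c_t)
    return h_t, c_t

h, c = np.zeros(n), np.zeros(n)                    # h0, c0
for x in [rng.normal(size=n) for _ in range(3)]:   # a 3-step input sequence
    h, c = lstm_step(x, h, c)
print(h)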

Combining CNN and LSTM

Visual Question Answering