Convolutional Restricted Boltzmann Machines (CRBMs) for Feature Learning. Mohammad Norouzi. Advisor: Dr. Greg Mori. Simon Fraser University, 27 Nov 2009.

Problems: human detection and handwritten digit classification.

Sliding Window Approach

Sliding Window Approach (Cont'd) [Figure: sliding windows over an image from the INRIA Person Dataset, with the classifier decision boundary]

The success or failure of an object recognition algorithm hinges on the features used. Pipeline: Input → Feature representation → Classifier → Label (Human / Background, or digit 0 / 1 / 2 / 3 / …). Our focus is learning the feature representation.

Local Feature Detector Hierarchies: moving up the hierarchy, feature detectors cover larger regions, are more complicated, and occur less frequently.

Generative & Layerwise Learning [Figure: each layer of hidden units is learned as a generative CRBM, one layer at a time]

Visual Features: Filtering. A filter kernel (the feature) is convolved with the image to produce a filter response.
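To make the filtering operation concrete, here is a minimal sketch (not the author's code; the kernel and image sizes are made up) of computing a filter response map by cross-correlating a kernel with a gray-scale image:

    import numpy as np
    from scipy.signal import correlate2d

    # Hypothetical 7x7 vertical-edge kernel; in the talk the kernels are learned CRBM filters.
    kernel = np.zeros((7, 7))
    kernel[:, :3], kernel[:, 4:] = -1.0, 1.0

    image = np.random.rand(64, 128)                       # stand-in for a gray-scale window
    response = correlate2d(image, kernel, mode='valid')   # the filter response map
    print(response.shape)                                 # (58, 122) = (64-7+1, 128-7+1)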

Our approach to feature learning is generative: a layer of binary hidden variables is connected to the image through shared local filters (the CRBM model).

Related Work

Related Work. Convolutional Neural Network (CNN) [LeCun et al. '98, Ranzato et al. CVPR'07] (discriminative) – Filtering layers are bundled with a classifier, and all the layers are learned together using error backpropagation – Does not perform well on natural images. Biologically plausible models [Serre et al. PAMI'07, Mutch and Lowe CVPR'06] (no learning) – Hand-crafted first layer and randomly selected prototypes for the second layer.

Related Work (cont'd). Deep Belief Net [Hinton et al., NC'2006] (generative & unsupervised) – A two-layer partially observed MRF, called the RBM, is the building block – Learning is performed unsupervised and layer-by-layer, from the bottom layer upwards. Our contributions: we incorporate spatial locality into RBMs and adapt the learning algorithm accordingly; we add more complicated components such as pooling and sparsity into deep belief nets.

Why Generative & Unsupervised? Discriminative learning of deep and large neural networks has not been successful – Requires large training sets – Easily gets over-fitted for large models – First-layer gradients are relatively small. Alternative hybrid approach – Learn a large set of first-layer features generatively – Switch to a discriminative model to select the discriminative features from those that were learned – Discriminative fine-tuning is helpful.

Details

CRBM. The image is the visible layer and the hidden layer is related to filter responses. The CRBM is an energy-based probabilistic model whose energy couples each hidden unit to a local image patch through the dot product of vectorized matrices (i.e., a filtering operation).
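As a reference point, the energy of a convolutional RBM with binary visible units is usually written as follows (standard formulation; the slide's exact notation is not reproduced in the transcript):

    E(v, h) = - \sum_{k=1}^{K} \sum_{i,j} h^k_{ij} (\tilde{W}^k * v)_{ij}
              - \sum_{k=1}^{K} b_k \sum_{i,j} h^k_{ij}
              - c \sum_{i,j} v_{ij}

Here W^k is the k-th filter, \tilde{W}^k is that filter flipped horizontally and vertically, * denotes valid convolution, b_k is the bias shared by the k-th group of hidden units, and c is the shared visible bias; the term h^k_{ij} (\tilde{W}^k * v)_{ij} is exactly the dot product of the vectorized filter with the image patch beneath it.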

Training CRBMs. Maximum likelihood learning of CRBMs is difficult, but Contrastive Divergence (CD) learning is applicable. For CD learning we need to compute the conditionals P(h | v) and P(v | h), alternating between the data and a reconstructed sample.
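For context, the CD-1 parameter update commonly used for (C)RBMs has the form (a standard sketch, not necessarily the exact update on the slide):

    \Delta W \propto \langle v h^\top \rangle_{data} - \langle v h^\top \rangle_{recon}

The first expectation uses hidden activations driven by a training image; the second uses activations after one Gibbs step (sample h from P(h | v), reconstruct v' from P(v | h), recompute P(h | v')). In the convolutional case the outer product is replaced by a convolution between the visible map and each hidden group's activation map.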

CRBM (Backward). Nearby hidden variables cooperate in reconstruction; the conditional probabilities take a sigmoid (logistic) form.
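In the standard binary CRBM formulation (consistent with the energy sketched above; again my notation, not the slide's), the two conditionals are:

    P(h^k_{ij} = 1 | v) = \sigma\big( (\tilde{W}^k * v)_{ij} + b_k \big)
    P(v_{ij} = 1 | h)   = \sigma\big( \sum_k (W^k * h^k)_{ij} + c \big)

with \sigma(x) = 1 / (1 + e^{-x}). The sum over filters in the backward conditional is how nearby hidden variables cooperate to reconstruct each visible unit.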

Learning the Hierarchy. The structure is trained bottom-up and layer-wise: each filtering layer is trained as a CRBM, followed by a non-linearity and a pooling (down-sampling) layer that reduces dimensionality, with a classifier on top.
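A minimal sketch of one filtering + non-linearity + pooling stage, assuming sigmoid responses and non-overlapping 2x2 max-pooling (the actual pooling size and non-linearity in the thesis may differ):

    import numpy as np
    from scipy.signal import correlate2d

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    def crbm_stage(image, filters, biases, pool=2):
        """Filter the image with each learned kernel, squash, then max-pool."""
        maps = []
        for W, b in zip(filters, biases):
            r = sigmoid(correlate2d(image, W, mode='valid') + b)   # filtering + non-linearity
            h, w = (r.shape[0] // pool) * pool, (r.shape[1] // pool) * pool
            r = r[:h, :w].reshape(h // pool, pool, w // pool, pool).max(axis=(1, 3))  # pooling
            maps.append(r)
        return np.stack(maps)   # K feature maps, fed to the next layer

    # Toy usage: random filters stand in for learned CRBM filters.
    out = crbm_stage(np.random.rand(28, 28), [np.random.randn(5, 5) for _ in range(15)], [-2.0] * 15)
    print(out.shape)   # (15, 12, 12)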

[Figure: input image, first-layer filters and their responses, and second-layer filters]

Experiments

Evaluation. MNIST digit dataset – training set: 60,000 images of digits of size 28x28; test set: 10,000 images. INRIA person dataset – training set: 2416 person windows of size 128x64 pixels and 4.5x10^6 negative windows; test set: 1132 positive and 2x10^6 negative windows.

First-layer filters. 15 filters of 7x7 learned from gray-scale images of the INRIA positive set; 15 filters of 5x5 learned from unlabeled MNIST digits.

Second-Layer Features (MNIST). The filters themselves are hard to visualize, so we show image patches that respond strongly to each filter.

Second-Layer Features (INRIA)

MNIST Results. MNIST error rate when the model is trained on the full training set.

Results. [Figures: qualitative detection results on INRIA test images with false positives marked; panels labeled 1st through 5th]

INRIA Results. Adding our large-scale features significantly improves the performance of the baseline (HOG).

Conclusion. We extended the RBM model to the Convolutional RBM, which is useful for domains with spatial locality. We exploited CRBMs to train local hierarchical feature detectors, one layer at a time and generatively. This method obtained results comparable to the state of the art in digit classification and human detection.

Thank You

Hierarchical Feature Detector [diagram]

Contrastive Divergence Learning
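The slide's figures are not reproduced in this transcript, so below is a minimal CD-1 sketch for a fully connected binary RBM (the convolutional version replaces the matrix products with convolutions); the names and learning rate are illustrative only:

    import numpy as np

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    def cd1_step(W, b, c, v0, lr=0.01, rng=np.random.default_rng(0)):
        """One contrastive divergence (CD-1) update for a binary RBM.
        W: (n_vis, n_hid) weights, b: hidden biases, c: visible biases, v0: a batch of data."""
        ph0 = sigmoid(v0 @ W + b)                           # positive phase, driven by the data
        h0 = (rng.random(ph0.shape) < ph0).astype(float)    # sample binary hidden states
        pv1 = sigmoid(h0 @ W.T + c)                         # reconstruction of the visibles
        ph1 = sigmoid(pv1 @ W + b)                          # hidden probabilities for the reconstruction
        W += lr * (v0.T @ ph0 - pv1.T @ ph1) / len(v0)      # <v h>_data - <v h>_recon
        b += lr * (ph0 - ph1).mean(axis=0)
        c += lr * (v0 - pv1).mean(axis=0)
        return W, b, c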

Training CRBMs (Cont'd). The problem of reconstructing the border region becomes severe when the number of Gibbs sampling steps is > 1 – Partition the visible units into middle and border regions. Instead of maximizing the likelihood, we (approximately) maximize the conditional likelihood of the middle region given the border.
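Written out under the partition just described (my notation, since the slide's formula is not in the transcript), the objective becomes

    max_\theta  log p( v^M | v^B ; \theta )

where v^M denotes the middle visible units, v^B the border units, and \theta the CRBM parameters; during CD learning the border units can be clamped to their observed values so that reconstructions only need to explain the middle region.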

Enforcing Feature Sparsity. The CRBM's representation is K (the number of filters) times overcomplete, so after a few CD learning iterations V is perfectly reconstructed. We enforce sparsity to tackle this problem – hidden bias terms were frozen at large negative values. Having a single non-sparse hidden unit improves the learned features – this might be related to the ergodicity condition.

Probabilistic Meaning of Max [equations not reproduced in the transcript]

The Classifier Layer. We used an SVM as our final classifier – RBF kernel for MNIST, linear kernel for INRIA – For INRIA we combined our 4th-layer outputs and HOG features. We experimentally observed that relaxing the sparsity of the CRBM's hidden units yields better results – this lets the discriminative model set the thresholds itself.
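A hypothetical sketch of this classifier stage with scikit-learn (a modern library standing in for whatever SVM implementation was actually used; the random arrays stand in for the learned features):

    import numpy as np
    from sklearn.svm import SVC, LinearSVC

    rng = np.random.default_rng(0)

    # MNIST: RBF-kernel SVM on the learned feature vectors (random stand-ins here).
    X_mnist, y_mnist = rng.random((200, 180)), rng.integers(0, 10, 200)
    svm_mnist = SVC(kernel='rbf').fit(X_mnist, y_mnist)

    # INRIA: 4th-layer CRBM outputs concatenated with HOG features, then a linear SVM.
    crbm4, hog = rng.random((200, 500)), rng.random((200, 3780))   # 3780 = HOG length for a 128x64 window
    X_inria, y_inria = np.hstack([crbm4, hog]), rng.integers(0, 2, 200)
    svm_inria = LinearSVC(dual=False).fit(X_inria, y_inria)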

Why are HOG features added? Because part-like features are very sparse; having a template of the whole human figure helps a lot.

RBM. A two-layer pairwise MRF with a full set of hidden-visible connections (weights w between hidden units h and visible units v). The RBM is an energy-based model; hidden random variables are binary, while visible variables can be binary or continuous. Inference is straightforward: the conditionals P(h | v) and P(v | h) factorize over units. Contrastive Divergence learning is used for training.
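For completeness, the standard binary RBM energy and conditionals (the slide's own equations are not reproduced in the transcript) are

    E(v, h) = - v^\top W h - b^\top h - c^\top v
    P(h_j = 1 | v) = \sigma( b_j + v^\top W_{:,j} )
    P(v_i = 1 | h) = \sigma( c_i + W_{i,:} h )

so inference in each direction is a single sigmoid per unit, which is what makes block Gibbs sampling and CD learning practical.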

Why Unsupervised Bottom-Up? Discriminative learning of deep structure has not been successful – Requires large training sets – Is easily over-fitted for large models – First-layer gradients are relatively small. Alternative hybrid approach – Learn a large set of first-layer features generatively – Later, switch to a discriminative model to select the discriminative features from those learned – Fine-tune the features discriminatively.

INRIA Results (Cont'd). Miss rate at different FPPW (false positives per window) rates; FPPI (false positives per image) is a better indicator of performance. More experiments on the size of features and the number of layers are desired.