1 TTIC_ECP: Deep Epitomic CNNs and Explicit Scale/Position Search Deep Epitomic Nets and Scale/Position Search for Image Classification TTIC_ECP team George.

Slides:



Advertisements
Similar presentations
Visual Dictionaries George Papandreou CVPR 2014 Tutorial on BASIS
Advertisements

Advanced topics.
Classification spotlights
ImageNet Classification with Deep Convolutional Neural Networks
Tiled Convolutional Neural Networks TICA Speedup Results on the CIFAR-10 dataset Motivation Pretraining with Topographic ICA References [1] Y. LeCun, L.
Karen Simonyan Andrew Zisserman
1 Part 1: Classical Image Classification Methods Kai Yu Dept. of Media Analytics NEC Laboratories America Andrew Ng Computer Science Dept. Stanford University.
Presented by: Mingyuan Zhou Duke University, ECE September 18, 2009
Deconvolutional Networks
Deep Learning and Neural Nets Spring 2015
Large-Scale Object Recognition with Weak Supervision
Beyond bags of features: Adding spatial information Many slides adapted from Fei-Fei Li, Rob Fergus, and Antonio Torralba.
PANDA: Pose Aligned Networks for Deep Attribute Modeling Ning Zhang1;2, Manohar Paluri1, Marc’Aurelio Ranzato1, Trevor Darrell2, Lubomir Bourdev1 1: Facebook.
AN ANALYSIS OF SINGLE- LAYER NETWORKS IN UNSUPERVISED FEATURE LEARNING [1] Yani Chen 10/14/
Convolutional Neural Nets
Spatial Pyramid Pooling in Deep Convolutional
Image Classification using Sparse Coding: Advanced Topics
Kuan-Chuan Peng Tsuhan Chen
A shallow introduction to Deep Learning
Detection, Segmentation and Fine-grained Localization
Deep face recognition Omkar M. Parkhi, Andrea Vedaldi, Andrew Zisserman.
Dr. Z. R. Ghassabi Spring 2015 Deep learning for Human action Recognition 1.
Epitomic Location Recognition A generative approach for location recognition K. Ni, A. Kannan, A. Criminisi and J. Winn In proc. CVPR Anchorage,
Object detection, deep learning, and R-CNNs
ECE 6504: Deep Learning for Perception Dhruv Batra Virginia Tech Topics: –(Finish) Backprop –Convolutional Neural Nets.
Deep Convolutional Nets
M. Wang, T. Xiao, J. Li, J. Zhang, C. Hong, & Z. Zhang (2014)
Neural networks in modern image processing Petra Budíková DISA seminar,
A shallow look at Deep Learning
Ross Girshick, Jeff Donahue, Trevor Darrell, Jitendra Malik Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation.
CSC321: 2011 Introduction to Neural Networks and Machine Learning Lecture 6: Applying backpropagation to shape recognition Geoffrey Hinton.
CS 188: Artificial Intelligence Learning II: Linear Classification and Neural Networks Instructors: Stuart Russell and Pat Virtue University of California,
Regionlets for Generic Object Detection IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, VOL. 37, NO. 10, OCTOBER 2015 Xiaoyu Wang, Ming.
Cascade Region Regression for Robust Object Detection
Convolutional Neural Network
Rich feature hierarchies for accurate object detection and semantic segmentation 2014 IEEE Conference on Computer Vision and Pattern Recognition Ross Girshick,
Philipp Gysel ECE Department University of California, Davis
Spatial Localization and Detection
Convolutional Neural Networks for Direct Text Deblurring
Introduction to Convolutional Neural Networks
Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition arXiv: v4 [cs.CV(CVPR)] 23 Apr 2015 Kaiming He, Xiangyu Zhang, Shaoqing.
SUPER RESOLUTION USING NEURAL NETS Hila Levi & Eran Amar Weizmann Ins
Convolutional Neural Networks
Deep Learning and Its Application to Signal and Image Processing and Analysis Class III - Fall 2016 Tammy Riklin Raviv, Electrical and Computer Engineering.
Recent developments in object detection
Deep Residual Learning for Image Recognition
Convolutional Neural Fabrics by Shreyas Saxena, Jakob Verbeek
Learning Mid-Level Features For Recognition
References [1] - Y. LeCun, L. Bottou, Y. Bengio and P. Haffner, Gradient-Based Learning Applied to Document Recognition, Proceedings of the IEEE, 86(11): ,
Combining CNN with RNN for scene labeling (segmentation)
Classification of Hand-Written Digits Using Scattering Convolutional Network Dongmian Zou Advisor: Professor Radu Balan.
Deep Belief Networks Psychology 209 February 22, 2013.
CS6890 Deep Learning Weizhen Cai
Machine Learning: The Connectionist
Object detection.
Computer Vision James Hays
Recognition IV: Object Detection through Deep Learning and R-CNNs
Introduction to Neural Networks
Object Detection + Deep Learning
A Proposal Defense On Deep Residual Network For Face Recognition Presented By SAGAR MISHRA MECE
Semantic segmentation
Neural Networks Geoff Hulten.
Outline Background Motivation Proposed Model Experimental Results
RCNN, Fast-RCNN, Faster-RCNN
边缘检测年度进展概述 Ming-Ming Cheng Media Computing Lab, Nankai University
Heterogeneous convolutional neural networks for visual recognition
Course Recap and What’s Next?
CS295: Modern Systems: Application Case Study Neural Network Accelerator Sang-Woo Jun Spring 2019 Many slides adapted from Hyoukjun Kwon‘s Gatech “Designing.
VERY DEEP CONVOLUTIONAL NETWORKS FOR LARGE-SCALE IMAGE RECOGNITION
Semantic Segmentation
Presentation transcript:

1 TTIC_ECP: Deep Epitomic CNNs and Explicit Scale/Position Search Deep Epitomic Nets and Scale/Position Search for Image Classification TTIC_ECP team George Papandreou Toyota Technological Institute at Chicago Iasonas Kokkinos Ecole Centrale Paris/INRIA

2 TTIC_ECP: Deep Epitomic CNNs and Explicit Scale/Position Search TTIC_ECP entry in a nutshell Goal: Invariance in Deep CNNs Part 1: Deep epitomic nets: local translation (deformation) Part 2: Global scaling and translation Top-5 error. All DCNNs have 6 convolutional and 2 fully-connected layers. (0) Baseline: max-pooled net 13.0% (1) epitomic DCNN 11.9% (2) epitomic DCNN+ search 10.56% Fusion (1)+(2) 10.22% ~1% gain ~1.5% gain

3 TTIC_ECP: Deep Epitomic CNNs and Explicit Scale/Position Search LeCun et al.: Gradient-Based Learning Applied to Document Recognition, Proc. IEEE 1998 Krizhevsky et al.: ImageNet Classification with Deep CNNs, NIPS 2012 Cascade of convolution + max-pooling blocks (deformation-invariant template matching) Deep Convolutional Neural Networks (DCNNs) Our work: different blocks (P1) & different architecture (P2) convolutionalfully connected

4 TTIC_ECP: Deep Epitomic CNNs and Explicit Scale/Position Search Part 1: Deep epitomic nets

5 TTIC_ECP: Deep Epitomic CNNs and Explicit Scale/Position Search Epitomes: translation-invariant patch models Patch Templates Jojic, Frey, Kannan: Epitomic analysis of appearance and shape, ICCV 2003 EM-based training Benoit, Mairal, Bach, Ponce: Sparse image representation with epitomes, CVPR 2011 Grosse, Raina, Kwong, Ng: Shift-invariant sparse coding, UAI 2007 Epitomes: a lot more for just a bit more Separate modeling: more data & less power per parameter

6 TTIC_ECP: Deep Epitomic CNNs and Explicit Scale/Position Search Papandreou, Chen, Yuille: Modeling Image Patches with a Dictionary of Mini-Epitomes, CVPR14 Mini-epitomes for image classification Dictionary of mini-epitomesDictionary of patches (K-means) Gains in (flat) BoW classification

7 TTIC_ECP: Deep Epitomic CNNs and Explicit Scale/Position Search From flat to deep: Epitomic convolution Max-Pooling Epitomic Convolution G. Papandreou: Deep Epitomic Convolutional Neural Networks, arXiv, June Max over image positions Max over epitome positions

8 TTIC_ECP: Deep Epitomic CNNs and Explicit Scale/Position Search Deep Epitomic Convolutional Nets Convolution + max-pooling Epitomic convolution Supervised dictionary learning by back-propagation G. Papandreou: Deep Epitomic Convolutional Neural Networks, arXiv, June 2014.

9 TTIC_ECP: Deep Epitomic CNNs and Explicit Scale/Position Search Deep Epitomic Convolutional Nets Parameter sharing: faster and more reliable model learning (0) Baseline: max-pooled net 13.0% (1) epitomic DCNN 11.9% ~1% gain Consistent improvements

10 TTIC_ECP: Deep Epitomic CNNs and Explicit Scale/Position Search Part 2: Global scaling and translation

11 TTIC_ECP: Deep Epitomic CNNs and Explicit Scale/Position Search Scale Invariance challenge Scale-dependent (area) Category-dependent (ear detector) Dogs

12 TTIC_ECP: Deep Epitomic CNNs and Explicit Scale/Position Search Scale Invariance challenge Scale-dependent Category-dependent (ear detector) Dogs Skyscrapers

13 TTIC_ECP: Deep Epitomic CNNs and Explicit Scale/Position Search Scale Invariance challenge Scale-dependent Category-dependent (ear detector) Dogs Skyscrapers Training set

14 TTIC_ECP: Deep Epitomic CNNs and Explicit Scale/Position Search Scale Invariance challenge Scale-dependent Category-dependent (ear detector) Dogs Skyscrapers Rule: Large skyscrapers have ears, large dogs don’t

15 TTIC_ECP: Deep Epitomic CNNs and Explicit Scale/Position Search Scale Invariant classification Scale-dependent Category-dependent A. Howard. Some improvements on deep convolutional neural network based image classification, T. Dietterich et al. Solving the multiple-instance problem with axis-parallel rectangles. Artificial Intelligence, K. Simonyan and A. Zisserman. Very deep convolutional networks for large-scale image recognition, This work: MIL: End-to-end training! ‘bag’ of features feature

16 TTIC_ECP: Deep Epitomic CNNs and Explicit Scale/Position Search Step 1: Efficient multi-scale convolutional features stitch pyramid GPU unstitch I(x,y) I(x,y,s) Patchwork(x,y)C(x,y) C(x,y,s) multi-scale convolutional features Sermanet, P., Eigen, D., Zhang, X., Mathieu, M., Fergus, R., LeCun, Y.: Overfeat : ICLR 2014 Dubout, C., Fleuret, F.: Exact acceleration of linear object detectors. ECCV 2012 Iandola, F., Moskewicz, M., Karayev, S., Girshick, R., Darrell, T., Keutzer, K.: Densenet. arXiv x220x3 5x5x512

17 TTIC_ECP: Deep Epitomic CNNs and Explicit Scale/Position Search Step 2: From fully connected to fully convolutional convolutional pyramid GPU I(x,y) Patchwork(x,y) F(x,y) stich I(x,y,s) 220x220x3 1x1x4096 convolutionalfully connected

18 TTIC_ECP: Deep Epitomic CNNs and Explicit Scale/Position Search Step 3: Global max-pooling pyramid GPU I(x,y) Patchwork(x,y) stich I(x,y,s) For free: argmax yields 48% localization error (0) Baseline: max-pooled net 13.0% (1) epitomic DCNN 11.9% ~1% gain (2) epitomic DCNN+ search 10.56% ~1.5% gain learned class-specific bias Fusion (1)+(2) 10.22% Consistent, explicit position and scale search during training and testing

19 TTIC_ECP: Deep Epitomic CNNs and Explicit Scale/Position Search DCNN: 6 Convolutional + 2 Fully Connected layers (0) Baseline: max-pooled net 13.0% (1)Epitomic DCNN 11.9% (2) search 10.56% Fusion (1)+(2) 10.22% ~1% gain ~1.5% gain The Deeper the Better: stay tuned! Deep Epitomic Nets and Scale/Position Search for Image Classification Goal: Invariance in Deep CNNs ?

20 TTIC_ECP: Deep Epitomic CNNs and Explicit Scale/Position Search Epitomic implementation details Architecture of our deep epitomic net (11.94%) Training took 3 weeks on a singe Titan (60 epochs) Standard choices for learning rate, momentum, etc.

21 TTIC_ECP: Deep Epitomic CNNs and Explicit Scale/Position Search Pyramidal search implementation details Image warp to square image. Position in mosaic is fixed Scales: 400, 300, 220, 160, 120, 90 pixels  Mosaic: 720 pixels