Classification spotlights


Large Scale Visual Recognition Challenge (ILSVRC) 2013: Classification spotlights

Additions to the ConvNet Image Classification Pipeline
Andrew Howard, Andrew Howard Consulting

Changes to training:
- Use more pixels: train on square patches taken from the whole rectangular image instead of from the cropped central square.
- Apply additional color manipulation (contrast, brightness, color balance) to the training patches.

Changes to testing:
- Make predictions at different scales and different views, which together use all pixels.
- Previous: 10 predictions (2 flips * 5 translations).
- This submission: 90 predictions (2 flips * 5 translations * 3 scales * 3 views).
- The number of predictions can be reduced with no loss of accuracy by using stagewise regression.

Higher resolution models:
- Start from a fully trained model and fine-tune it on image patches from a higher resolution image.
- This can be trained in about 1/3 the number of epochs.
- Predictions on higher resolution images are complementary to those of the base model.

Training neural networks can be quite time consuming, so the focus was on ideas that are simple to test and simple to implement for improving the convolutional neural network based image classification pipeline. Models perform better with more data, so we added more training image patches by using all of the pixels from the image rather than only selecting training patches from the cropped central square. We also applied additional color manipulations to these training image patches. At test time, we make predictions over multiple scales and multiple views of the image in order to generate diverse predictions and improve the overall combined prediction. Additionally, we build models on higher resolution images, which can be trained quickly by fine-tuning previously trained models. The final system achieves a 13.6% top-five error rate using 5 base models and 5 high resolution models together with these new additions to the pipeline.

The network structure is the same as last year's, with fully connected layers twice as large, which doesn't add much value.
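The test-time scheme described above (2 flips * 5 translations * 3 scales * 3 views = 90 predictions, combined into one) can be sketched roughly as follows. This is a minimal illustration, not the submission's code: the translation names (center plus four corners), the placeholder scale and view names, and the use of a plain average to combine predictions are all assumptions, and `in_top_k` is a hypothetical helper for the top-five criterion.

```python
import itertools
import numpy as np

# Assumed naming: the slide only says "5 translations", "3 scales", "3 views".
FLIPS = (False, True)
TRANSLATIONS = ("center", "top_left", "top_right", "bottom_left", "bottom_right")
SCALES = ("scale_1", "scale_2", "scale_3")
VIEWS = ("view_1", "view_2", "view_3")


def enumerate_test_views():
    """List every (flip, translation, scale, view) combination."""
    return list(itertools.product(FLIPS, TRANSLATIONS, SCALES, VIEWS))


def combine_predictions(per_view_probs):
    """Combine per-view class probabilities with a plain average (an assumption)."""
    probs = np.asarray(per_view_probs, dtype=float)
    return probs.mean(axis=0)


def in_top_k(probs, true_class, k=5):
    """Return True if true_class is among the k highest-scoring classes."""
    top_k = np.argsort(probs)[::-1][:k]
    return int(true_class) in top_k.tolist()
```

In use, a model would be evaluated once per entry of `enumerate_test_views()` and the resulting probability vectors passed to `combine_predictions`; an image counts toward the top-five error when `in_top_k(combined, label, k=5)` is false.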

CognitiveVision team
Cognitive Psychology Inspired Image Classification using Deep Neural Network
Kuiyuan Yang, Microsoft Research
Yalong Bai, Harbin Institute of Technology
Yong Rui, Microsoft Research

Our Classification Scheme
CognitiveVision team
Given an image, first predict its basic category (dog, cat, …); basic categories are easy to distinguish.
Then predict the subcategory with a classifier specific to that basic category:
- Dog classification: dalmatian, French bulldog, Maltese dog, English setter, …
- Cat classification: Egyptian cat, tiger cat, Siamese cat, …
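The two-stage scheme (basic category first, then a subcategory within it) can be sketched as below. This is only an illustration under stated assumptions: the classifiers are hypothetical callables that return a label-to-score dictionary, the toy scores are made up for the example, and the team's actual models are not described on the slide.

```python
def argmax_label(scores):
    """Return the label with the highest score in a {label: score} dict."""
    return max(scores, key=scores.get)


def predict_hierarchical(image, basic_classifier, sub_classifiers):
    """Predict the basic category, then the subcategory within it.

    basic_classifier and the values of sub_classifiers are hypothetical
    callables mapping an image to a {label: score} dict.
    """
    basic = argmax_label(basic_classifier(image))
    sub = argmax_label(sub_classifiers[basic](image))
    return basic, sub


# Toy stand-ins with made-up scores, using category names from the slide.
def toy_basic(image):
    return {"dog": 0.9, "cat": 0.1}


def toy_dog(image):
    return {"dalmatian": 0.7, "French bulldog": 0.1,
            "Maltese dog": 0.1, "English setter": 0.1}


def toy_cat(image):
    return {"Egyptian cat": 0.5, "tiger cat": 0.3, "Siamese cat": 0.2}


result = predict_hierarchical(None, toy_basic, {"dog": toy_dog, "cat": toy_cat})
```

One consequence of this design is that a mistake at the basic level cannot be recovered at the subcategory level, which is why the first stage needs to be the easy one.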

Caffe: Open-Sourcing Deep Learning
Yangqing Jia, Trevor Darrell, UC Berkeley
Convolutional Architecture for Fast Feature Extraction
- Seamless switching between CPU and GPU
- Fast computation (2.5 ms / image with GPU)
- Full training and testing capability
- Reference ImageNet model available
A framework to support multiple applications: classification, embedding, detection, and your next application!
Publicly available at http://caffe.berkeleyvision.org/

Experiments for large scale visual recognition
We tried:
- Deep CNN (following Krizhevsky et al. '12)
- Low-level features & spatial granularities
Where did we fail? Top-1 accuracy = 0.567
Appliances and instruments are confusing for us, including:
- TV vs. Screen
- Coffee mug vs. Cup
- Flute vs. Microphone
- …
Television (0.18), Hair spray (0.18), Coffee mug (0.10), Flute (0.10)
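A failure analysis like the one on this slide needs two things: the top-1 accuracy and a ranking of the class pairs that are most often confused. The sketch below shows one way to compute both; the function names are hypothetical and the labels in the example are toy data, not the team's results.

```python
from collections import Counter


def top1_accuracy(true_labels, predicted_labels):
    """Fraction of examples whose predicted label equals the true label."""
    if not true_labels:
        raise ValueError("need at least one example")
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)


def most_confused_pairs(true_labels, predicted_labels, n=3):
    """Return the n most frequent (true, predicted) pairs among the errors."""
    errors = Counter(
        (t, p) for t, p in zip(true_labels, predicted_labels) if t != p
    )
    return errors.most_common(n)


# Toy data only: five examples, one of them classified correctly.
true = ["television", "television", "coffee mug", "flute", "flute"]
pred = ["screen", "screen", "cup", "flute", "microphone"]
accuracy = top1_accuracy(true, pred)
worst = most_confused_pairs(true, pred, n=1)
```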

Agenda
8:30 Classification & localization
10:30 Detection
Noon Discussion panel
14:00 Invited talk by Vittorio Ferrari: Auto-annotation and self-assessment in ImageNet
14:40 Fine-Grained Challenge 2013
Time slots within the classification & localization session, including spotlights: 8:50, 9:05, 9:20, 9:35, 9:50
Time slots within the detection session, including spotlights: 10:50, 11:10, 11:30, 11:40
http://www.image-net.org/challenges/LSVRC/2013/iccv2013