Classification spotlights

Presentation transcript:

Large Scale Visual Recognition Challenge (ILSVRC) 2013: Classification spotlights

Additions to the ConvNet Image Classification Pipeline
Andrew Howard, Andrew Howard Consulting

Changes to training:
- Use more pixels: train on square patches taken from the whole rectangular image instead of only from the cropped central square.
- Apply additional color manipulation (contrast, brightness, color balance) to the training patches.

Changes to testing:
- Make predictions at different scales and different views, which together use all pixels.
- Previous: 10 predictions (2 flips * 5 translations).
- This submission: 90 predictions (2 flips * 5 translations * 3 scales * 3 views).
- The number of predictions can be reduced with no loss of accuracy by using stagewise regression.

Higher resolution models:
- Start from a fully trained model and fine-tune it on image patches from a higher resolution image.
- This can be trained in about 1/3 the number of epochs.
- Predictions on higher resolution images are complementary to those of the base model.

Final vision system:
- Achieves a 13.6% top-five error rate and is made of 5 base models and 5 higher resolution models.
- The network structure is the same as last year's, with fully connected layers twice as large, which doesn't add much value.

Training neural networks can be quite time consuming, so the focus was on ideas that are simple to test and simple to implement for improving the convolutional neural network based image classification pipeline. Models perform better with more data, so we added more training image patches by using all of the pixels from the image rather than only selecting training patches from the cropped central square. We also added color manipulations to these training image patches. At test time, we make predictions over multiple scales and multiple views of the image in order to generate diverse predictions and improve the overall combined prediction. Additionally, we build models on higher resolution images, which can be trained quickly by fine-tuning previously trained models. The final system achieves a 13.6% top-five error rate using 5 base models, 5 high resolution models, and these new additions to the pipeline.
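The count of 90 test-time predictions follows directly from the four factors listed on the slide. The sketch below enumerates the combinations and combines per-view outputs by a plain average. The names of the flips, translations, scales and views are placeholders, and the averaging rule is an assumption for illustration: the slide only says the predictions are combined and that stagewise regression can reduce their number.

```python
from itertools import product

# Placeholder names (assumptions); the slide gives only the counts.
FLIPS = ("original", "mirrored")                      # 2 flips
TRANSLATIONS = ("center", "top_left", "top_right",
                "bottom_left", "bottom_right")        # 5 translations
SCALES = ("scale_1", "scale_2", "scale_3")            # 3 scales
VIEWS = ("view_1", "view_2", "view_3")                # 3 views


def enumerate_test_views():
    """Return every (flip, translation, scale, view) combination."""
    return list(product(FLIPS, TRANSLATIONS, SCALES, VIEWS))


def average_predictions(prob_vectors):
    """Element-wise mean of equally long class-probability vectors.

    Plain averaging is an assumed combination rule, not the author's
    documented method.
    """
    if not prob_vectors:
        raise ValueError("need at least one prediction")
    count = len(prob_vectors)
    num_classes = len(prob_vectors[0])
    return [sum(p[c] for p in prob_vectors) / count
            for c in range(num_classes)]


views = enumerate_test_views()
print(len(views))  # 2 * 5 * 3 * 3 = 90
```

In a real pipeline each tuple in `views` would select one transformed crop of the test image, and the model output for every crop would be passed to `average_predictions`.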

CognitiveVision team
Cognitive Psychology Inspired Image Classification using Deep Neural Network
Kuiyuan Yang, Microsoft Research
Yalong Bai, Harbin Institute of Technology
Yong Rui, Microsoft Research

Our Classification Scheme (CognitiveVision team)

Given an image, first predict its basic category, then predict its subcategory.
- Basic category classification: dog, cat, … (easy to distinguish)
- Dog classification: dalmatian, French bulldog, Maltese dog, English setter, …
- Cat classification: Egyptian cat, tiger cat, Siamese cat, …
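The two-stage scheme can be sketched as a coarse classifier followed by a per-category fine classifier. Everything in the block is hypothetical: `classify_two_stage`, the toy classifiers, and their scores are stand-ins for illustration, not the team's models or results.

```python
def best_label(scores):
    """Return the label with the highest score in a {label: score} dict."""
    return max(scores, key=scores.get)


def classify_two_stage(image, basic_classifier, sub_classifiers):
    """Predict the basic category first, then the subcategory within it."""
    basic = best_label(basic_classifier(image))
    sub = best_label(sub_classifiers[basic](image))
    return basic, sub


# Toy stand-ins for trained networks; the scores are made up.
def toy_basic(image):
    return {"dog": 0.9, "cat": 0.1}


def toy_dog(image):
    return {"dalmatian": 0.7, "French bulldog": 0.2, "Maltese dog": 0.1}


def toy_cat(image):
    return {"Egyptian cat": 0.5, "tiger cat": 0.3, "Siamese cat": 0.2}


result = classify_two_stage(None, toy_basic, {"dog": toy_dog, "cat": toy_cat})
print(result)  # ('dog', 'dalmatian')
```

One consequence of this design is that a mistake at the basic level cannot be corrected by the subcategory classifier, which is why the basic split needs to be easy to distinguish.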

Caffe: Open-Sourcing Deep Learning
Yangqing Jia, Trevor Darrell, UC Berkeley

Convolutional Architecture for Fast Feature Extraction
- Seamless switching between CPU and GPU
- Fast computation (2.5 ms / image with GPU)
- Full training and testing capability
- Reference ImageNet model available

A framework to support multiple applications: classification, embedding, detection, and your next application!

Publicly available at http://caffe.berkeleyvision.org/

Experiments for large scale visual recognition

We tried:
- Deep CNN (following Krizhevsky et al. '12)
- Low-level features & spatial granularities

Where did we fail?
Top-1 accuracy = 0.567
Appliance and instrument categories are confusing for us, including:
- TV vs. Screen,
- Coffee mug vs. Cup,
- Flute vs. Microphone,
- …

Television (0.18), Hair spray (0.18), Coffee mug (0.10), Flute (0.10)
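For reference, top-1 accuracy counts an image as correct only when the highest-scoring class is the true one, while top-5 counts it as correct when the true class is among the five highest-scoring classes. A minimal sketch of the computation follows; the function name and the toy scores are illustrative assumptions, not data from any submission.

```python
def top_k_accuracy(score_rows, true_labels, k):
    """Fraction of samples whose true label is among the k best-scoring labels.

    score_rows: list of {label: score} dicts, one per sample.
    true_labels: list of ground-truth labels, same length as score_rows.
    """
    if len(score_rows) != len(true_labels) or not true_labels:
        raise ValueError("inputs must be non-empty and of equal length")
    hits = 0
    for scores, truth in zip(score_rows, true_labels):
        ranked = sorted(scores, key=scores.get, reverse=True)
        if truth in ranked[:k]:
            hits += 1
    return hits / len(true_labels)


# Made-up scores: the second sample confuses "television" with "screen".
toy_scores = [
    {"television": 0.6, "screen": 0.3, "flute": 0.1},
    {"television": 0.3, "screen": 0.5, "flute": 0.2},
]
toy_truth = ["television", "television"]

print(top_k_accuracy(toy_scores, toy_truth, 1))  # 0.5
print(top_k_accuracy(toy_scores, toy_truth, 2))  # 1.0
```

Confusions between closely related classes, such as TV vs. screen, lower top-1 accuracy but are often still counted as correct under a top-k criterion.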

Agenda
- 8:30 Classification & localization (slots at 8:50, 9:05, 9:20, 9:35 and 9:50, including spotlights)
- 10:30 Detection (slots at 10:50, 11:10, 11:30 and 11:40, including spotlights)
- Noon Discussion panel
- 14:00 Invited talk by Vittorio Ferrari: Auto-annotation and self-assessment in ImageNet
- 14:40 Fine-Grained Challenge 2013

http://www.image-net.org/challenges/LSVRC/2013/iccv2013