CNN architectures: mostly linear structure


CNN architectures
Mostly a linear structure; more generally a DAG (directed acyclic graph).
Examples: LeNet, AlexNet, ZF Net, GoogLeNet, VGGNet, ResNet.
http://cs231n.github.io/neural-networks-1/
VisGraph, HKUST
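The skip connections that make architectures like ResNet a DAG rather than a simple chain can be sketched with a toy residual unit (a minimal NumPy illustration, not any real network's layer sizes):

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def residual_block(x, w1, w2):
    """Toy residual unit: the skip connection means the input reaches
    the output along two paths, so the graph is a DAG, not a chain."""
    h = relu(x @ w1)
    return relu(x + h @ w2)  # skip connection: add the input back in

rng = np.random.default_rng(0)
x = rng.standard_normal((1, 8))
w1 = rng.standard_normal((8, 8)) * 0.1
w2 = rng.standard_normal((8, 8)) * 0.1
y = residual_block(x, w1, w2)
print(y.shape)  # (1, 8): input and output widths match, so the addition is valid
```

Because the skip addition requires matching shapes, real residual networks keep the width constant inside a block (or insert a projection on the skip path when it changes).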

Learned convolutional filters: Stage 1
The 9 patches with the strongest activation for each learned filter (7x7x96). These visualizations by Matt Zeiler and Rob Fergus give an idea of what the network is doing at different stages: the higher you go, the richer and more specialized the features become.
Zeiler, Matthew D., and Rob Fergus. Visualizing and Understanding Convolutional Networks. arXiv preprint arXiv:1311.2901 (2013).

Strongest activations: Stage 2

Strongest activations: Stage 3

Strongest activations: Stage 4

Strongest activations: Stage 5
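The patch-ranking step behind these visualizations — finding the image locations where a filter responds most strongly — can be sketched as follows (a toy NumPy version on a single channel; the real method also projects each location back to pixel space with a deconvnet):

```python
import numpy as np

def top_k_activations(fmap, k=9):
    """Return (row, col) locations of the k strongest responses in one
    channel's feature map -- the positions whose input patches get shown."""
    flat = np.argsort(fmap, axis=None)[::-1][:k]
    return [np.unravel_index(i, fmap.shape) for i in flat]

rng = np.random.default_rng(1)
fmap = rng.standard_normal((14, 14))  # one channel of a conv layer's output
locs = top_k_activations(fmap)
print(len(locs))  # 9 strongest positions, strongest first
```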

Open questions
That deeper is better is only known empirically; images contain hierarchical structures.
Overfitting and generalization require meaningful data! The intrinsic laws are not understood.
Networks are non-convex and need regularization.
Smaller networks are hard to train with local methods: their local minima are bad in loss, not stable, and have large variance.
Bigger networks are easier to train: they have more local minima, but better and more stable ones, with small variance.
Go as big as the computational power, and the data, allow!

CNN applications: transfer learning
Fine-tuning the CNN:
Keep some early layers — they contain more generic features (edges, color blobs) common to many visual tasks.
Fine-tune the later layers — they are more specific to the details of the classes.
CNN as feature extractor:
Remove the last fully connected layer; the remaining activations act as a descriptor, or "CNN codes", for the image.
AlexNet gives a 4096-dimensional descriptor.
http://cs231n.github.io/neural-networks-1/
VisGraph, HKUST
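The feature-extractor idea — keep the pretrained layers, drop the classifier, and use the penultimate activations as a descriptor — can be sketched with a toy fully connected stand-in for a CNN (layer names and sizes here are illustrative only; the 4096 mimics AlexNet's penultimate layer):

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

class ToyNet:
    """Toy stand-in for a pretrained CNN: generic early layers plus a
    final 1000-way classifier (sizes are illustrative, not AlexNet's)."""
    def __init__(self, rng):
        self.w_early = rng.standard_normal((32, 64)) * 0.1    # keep / freeze
        self.w_late = rng.standard_normal((64, 4096)) * 0.1   # fine-tune
        self.w_cls = rng.standard_normal((4096, 1000)) * 0.1  # drop for features

    def features(self, x):
        # "CNN codes": activations just before the removed classifier
        return relu(relu(x @ self.w_early) @ self.w_late)

    def logits(self, x):
        return self.features(x) @ self.w_cls

rng = np.random.default_rng(0)
net = ToyNet(rng)
x = rng.standard_normal((1, 32))
print(net.features(x).shape)  # (1, 4096) descriptor for a downstream classifier
```

In practice the extracted descriptors are fed to a simple classifier (e.g. a linear SVM), while fine-tuning instead continues gradient descent on the later layers with a small learning rate.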

What is object classification? Demo on Stanford CS231n.

What are visual tasks?
General visual tasks, and specific ones: 'face' and 'people' detection and recognition; OCR — don't underestimate 'small' problems.
In order of increasing difficulty:
Classification (the main or dominant object)
Localization (the dominant object, with a bounding box)
Detection (any number, any size)
Segmentation (semantic, pixel level)
I'll first define object detection from a computer vision perspective, relative to other tasks. Classification usually means predicting the class of the main object of an image. Localization means predicting the class of the main object plus a tight bounding box around it. Detection is similar to localization, except that objects can be of any size and in any number (including zero). Segmentation goes one step further by labeling every pixel of an image. These are ordered by increasing difficulty, and you probably don't need to go all the way to segmentation for many tasks.

Why are they important?
Robotics — perception is broader than vision, but visual perception is fundamental, and the bottleneck.
Self-driving cars.
Surveillance.
Perception is a big deal and is currently one of the biggest bottlenecks for applications such as robotics, self-driving cars, and surveillance (pandas only).

Face detection and recognition
Detection was already easy pre-DNN: the Viola-Jones approach (2001), with Haar features, AdaBoost, and a cascade of classifiers.
Verification: a binary classification — verify whether two images belong to the same person.
Identification: a multi-class classification — classify an image into one of N identity classes.
Key challenges: intra-personal variations and inter-personal variations.
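Verification as a binary decision can be sketched by thresholding the similarity of two face descriptors (a minimal sketch, assuming embedding-based comparison; the embeddings and the 0.5 threshold are illustrative, not from the slide):

```python
import numpy as np

def cosine_similarity(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def same_person(emb1, emb2, threshold=0.5):
    """Verification as binary classification: compare two face
    embeddings against an illustrative similarity threshold."""
    return cosine_similarity(emb1, emb2) >= threshold

e = np.array([1.0, 0.0, 1.0])
print(same_person(e, e))                             # identical embeddings: True
print(same_person(e, np.array([-1.0, 0.0, -1.0])))   # opposite embeddings: False
```

The threshold trades off the two error modes: raise it and more genuine pairs are rejected (intra-personal variation hurts); lower it and more impostor pairs are accepted (inter-personal similarity hurts).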

Are they deployed?
Classification: personal image search (Google, Baidu, Bing).
Detection:
Face detection — cameras; elections (duplicate votes); CCTV; border control; casinos; visa processing; crime solving; prosopagnosia (face blindness).
Objects — license plates; pedestrian detection (Daimler, Mobileye), e.g. the 2013 Mercedes-Benz E-Class and S-Class with warning and automatic braking, reducing accidents and their severity; vehicle detection for forward collision warning (Mobileye); traffic sign detection (Mobileye).
What has been deployed so far? Regarding the recent deep learning work, mostly classification (deployed within 6 months at Google after the acquisition of the Toronto group). Regarding more traditional vision, there has been a lot of deployment of face detection, because that is one of the easiest detection problems. More complicated detection has recently made its way into cars, for example with pedestrian detection in the 2013 Mercedes.

Pre- and post-DNN
Pre-DNN era: hand-crafted features and descriptors; bag of words; vocabulary tree.
DNN era.

The machine learning landscape, on two axes (deep vs. shallow, supervised vs. unsupervised):
Deep, supervised: Recurrent Neural Net, Convolutional Neural Net.
Shallow, supervised: Neural Net, Perceptron, SVM, Boosting.
Deep, unsupervised: Deep (sparse/denoising) Autoencoder, Deep Belief Net.
Shallow, unsupervised: Autoencoder, Sparse Coding, Restricted BM, GMM, SP, BayesNP.
To situate this tutorial in the machine learning context: we'll be talking about convnets, which sit in the deep, supervised area of machine learning, although they can be initialized with unsupervised pre-training too.
Slide: M. Ranzato