CNN architectures: mostly linear structure


CNN architectures
Mostly a linear structure; more generally a DAG (directed acyclic graph).
Examples: LeNet, AlexNet, ZF Net, GoogLeNet, VGGNet, ResNet.
http://cs231n.github.io/neural-networks-1/
VisGraph, HKUST
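The skip connections that make architectures like ResNet a DAG rather than a simple chain can be sketched with a toy residual unit (a minimal NumPy illustration, not any real network's layer sizes):

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def residual_block(x, w1, w2):
    """Toy residual unit: the skip connection means the input reaches
    the output along two paths, so the graph is a DAG, not a chain."""
    h = relu(x @ w1)
    return relu(x + h @ w2)  # skip connection: add the input back in

rng = np.random.default_rng(0)
x = rng.standard_normal((1, 8))
w1 = rng.standard_normal((8, 8)) * 0.1
w2 = rng.standard_normal((8, 8)) * 0.1
y = residual_block(x, w1, w2)
print(y.shape)  # (1, 8): input and output widths match, so the addition is valid
```

Because the skip addition requires matching shapes, real residual networks keep the width constant inside a block (or insert a projection on the skip path when it changes).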

Learned convolutional filters: Stage 1
The 9 patches with the strongest activation for each learned filter (7x7x96). These visualizations by Matt Zeiler and Rob Fergus give an idea of what the network is doing at different stages: the higher you go, the richer and more specialized the features become.
Zeiler, Matthew D., and Rob Fergus. Visualizing and Understanding Convolutional Networks. arXiv preprint arXiv:1311.2901 (2013).

Strongest activations: Stage 2

Strongest activations: Stage 3

Strongest activations: Stage 4

Strongest activations: Stage 5
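The patch-ranking step behind these visualizations — finding the image locations where a filter responds most strongly — can be sketched as follows (a toy NumPy version on a single channel; the real method also projects each location back to pixel space with a deconvnet):

```python
import numpy as np

def top_k_activations(fmap, k=9):
    """Return (row, col) locations of the k strongest responses in one
    channel's feature map -- the positions whose input patches get shown."""
    flat = np.argsort(fmap, axis=None)[::-1][:k]
    return [np.unravel_index(i, fmap.shape) for i in flat]

rng = np.random.default_rng(1)
fmap = rng.standard_normal((14, 14))  # one channel of a conv layer's output
locs = top_k_activations(fmap)
print(len(locs))  # 9 strongest positions, strongest first
```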

Open questions
That deeper is better is only known empirically; images contain hierarchical structures.
Overfitting and generalization require meaningful data! The intrinsic laws are not understood.
Networks are non-convex and need regularization.
Smaller networks are hard to train with local methods: their local minima are bad in loss, not stable, and have large variance.
Bigger networks are easier to train: they have more local minima, but better and more stable ones, with small variance.
Go as big as the computational power, and the data, allow!

CNN applications: transfer learning
Fine-tuning the CNN:
Keep some early layers — they contain more generic features (edges, color blobs) common to many visual tasks.
Fine-tune the later layers — they are more specific to the details of the classes.
CNN as feature extractor:
Remove the last fully connected layer; the remaining activations act as a descriptor, or "CNN codes", for the image.
AlexNet gives a 4096-dimensional descriptor.
http://cs231n.github.io/neural-networks-1/
VisGraph, HKUST
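The feature-extractor idea — keep the pretrained layers, drop the classifier, and use the penultimate activations as a descriptor — can be sketched with a toy fully connected stand-in for a CNN (layer names and sizes here are illustrative only; the 4096 mimics AlexNet's penultimate layer):

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

class ToyNet:
    """Toy stand-in for a pretrained CNN: generic early layers plus a
    final 1000-way classifier (sizes are illustrative, not AlexNet's)."""
    def __init__(self, rng):
        self.w_early = rng.standard_normal((32, 64)) * 0.1    # keep / freeze
        self.w_late = rng.standard_normal((64, 4096)) * 0.1   # fine-tune
        self.w_cls = rng.standard_normal((4096, 1000)) * 0.1  # drop for features

    def features(self, x):
        # "CNN codes": activations just before the removed classifier
        return relu(relu(x @ self.w_early) @ self.w_late)

    def logits(self, x):
        return self.features(x) @ self.w_cls

rng = np.random.default_rng(0)
net = ToyNet(rng)
x = rng.standard_normal((1, 32))
print(net.features(x).shape)  # (1, 4096) descriptor for a downstream classifier
```

In practice the extracted descriptors are fed to a simple classifier (e.g. a linear SVM), while fine-tuning instead continues gradient descent on the later layers with a small learning rate.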

What is object classification? Demo on Stanford CS231n.

What are visual tasks?
General visual tasks, and specific ones: 'face' and 'people' detection and recognition; OCR — don't underestimate 'small' problems.
In order of increasing difficulty:
Classification (the main or dominant object)
Localization (the dominant object, with a bounding box)
Detection (any number, any size)
Segmentation (semantic, pixel level)
I'll first define object detection from a computer vision perspective, relative to other tasks. Classification usually means predicting the class of the main object of an image. Localization means predicting the class of the main object plus a tight bounding box around it. Detection is similar to localization, except that objects can be of any size and in any number (including zero). Segmentation goes one step further by labeling every pixel of an image. These are ordered by increasing difficulty, and you probably don't need to go all the way to segmentation for many tasks.

Why are they important?
Robotics — perception is broader than vision, but visual perception is fundamental, and the bottleneck.
Self-driving cars.
Surveillance.
Perception is a big deal and is currently one of the biggest bottlenecks for applications such as robotics, self-driving cars, and surveillance (pandas only).

Face detection and recognition
Detection was already easy pre-DNN: the Viola-Jones approach (2001), with Haar features, AdaBoost, and a cascade of classifiers.
Verification: a binary classification — verify whether two images belong to the same person.
Identification: a multi-class classification — classify an image into one of N identity classes.
Key challenges: intra-personal variations and inter-personal variations.
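Verification as a binary decision can be sketched by thresholding the similarity of two face descriptors (a minimal sketch, assuming embedding-based comparison; the embeddings and the 0.5 threshold are illustrative, not from the slide):

```python
import numpy as np

def cosine_similarity(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def same_person(emb1, emb2, threshold=0.5):
    """Verification as binary classification: compare two face
    embeddings against an illustrative similarity threshold."""
    return cosine_similarity(emb1, emb2) >= threshold

e = np.array([1.0, 0.0, 1.0])
print(same_person(e, e))                             # identical embeddings: True
print(same_person(e, np.array([-1.0, 0.0, -1.0])))   # opposite embeddings: False
```

The threshold trades off the two error modes: raise it and more genuine pairs are rejected (intra-personal variation hurts); lower it and more impostor pairs are accepted (inter-personal similarity hurts).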

Are they deployed?
Classification: personal image search (Google, Baidu, Bing).
Detection:
Face detection — cameras; elections (duplicate votes); CCTV; border control; casinos; visa processing; crime solving; prosopagnosia (face blindness).
Objects — license plates; pedestrian detection (Daimler, Mobileye), e.g. the 2013 Mercedes-Benz E-Class and S-Class with warning and automatic braking, reducing accidents and their severity; vehicle detection for forward collision warning (Mobileye); traffic sign detection (Mobileye).
What has been deployed so far? Regarding the recent deep learning work, mostly classification (deployed within 6 months at Google after the acquisition of the Toronto group). Regarding more traditional vision, there has been a lot of deployment of face detection, because that is one of the easiest detection problems. More complicated detection has recently made its way into cars, for example with pedestrian detection in the 2013 Mercedes.

Pre- and post-DNN
Pre-DNN era: hand-crafted features and descriptors; bag of words; vocabulary tree.
DNN era.

The machine learning landscape, on two axes (deep vs. shallow, supervised vs. unsupervised):
Deep, supervised: Recurrent Neural Net, Convolutional Neural Net.
Shallow, supervised: Neural Net, Perceptron, SVM, Boosting.
Deep, unsupervised: Deep (sparse/denoising) Autoencoder, Deep Belief Net.
Shallow, unsupervised: Autoencoder, Sparse Coding, Restricted BM, GMM, SP, BayesNP.
To situate this tutorial in the machine learning context: we'll be talking about convnets, which sit in the deep, supervised area of machine learning, although they can be initialized with unsupervised pre-training too.
Slide: M. Ranzato