Part 3: discriminative methods


Discriminative methods. Object detection and recognition is formulated as a classification problem. The image is partitioned into a set of overlapping windows, and a decision is taken at each window about whether it contains a target object or not. Each window is represented by extracting a large number of features that encode information such as boundaries, textures, color, and spatial structure. The classification function, which maps an image window to a binary decision, is learned using methods such as SVMs, boosting, or neural networks.

Overview: general introduction, formulation of the problem in this framework and some history; exploration of a demo and the related theory; current challenges and future trends. There are many discriminative approaches for object recognition, and the literature here is very large. In this presentation I will not try to give an overview of all the important methods. Instead, I will go in depth into one single technique that is easy enough to get a good handle on in just one hour. I will also review some other techniques with the goal of illustrating the typical formulation of the object recognition problem under a discriminative framework. Then, at the end of the talk, I will discuss some of the future directions I think will become important in the next few years.

Discriminative methods. Object detection and recognition is formulated as a classification problem. The image is partitioned into a set of overlapping windows ... and a decision is taken at each window about whether it contains a target object or not. [Slide figure: a bag of image patches ("Where are the screens?") mapped into some feature space, with a decision boundary separating "computer screen" from "background".]

Formulation: binary classification. Each image patch is described by a feature vector: x1, x2, x3, ..., xN are the training patches and xN+1, xN+2, ..., xN+M are the test patches. Each training patch carries a label y in {-1, +1} indicating whether it contains the object or background; the labels of the test patches are unknown. The classification function y = F(x), where F belongs to some family of functions, is learned from the training data.

Formulation: the loss function. Find the classification function F that minimizes the loss on the training set.
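
The slide's formulas are images and do not appear in this transcript; a standard way to write the training objective, consistent with the setup above, is:

    F^{*} = \arg\min_{F \in \mathcal{F}} \; \sum_{i=1}^{N} L\big(y_i, F(x_i)\big)

where \mathcal{F} is the family of classification functions and L penalizes disagreement between the label y_i and the prediction F(x_i); the gentle boosting used later in this part takes L to be the exponential loss.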

Discriminative vs. generative. [Slide figure: three plots against x = data; roughly, the generative model plots a density over x per class, the discriminative model plots a conditional class probability between 0 and 1, and the classification function outputs a hard decision in {-1, +1}.] This is one of the arguments sometimes used to indicate some advantages that a discriminative framework might have over a generative framework.

Classifiers: NN, SVM, K-NN (nearest neighbors), additive models.

Nearest neighbor classifiers Learning involves adjusting the parameters that define the distance
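
To make the point above concrete, here is a minimal Matlab sketch (not from the course toolbox; all names are made up) of a 1-nearest-neighbor classifier in which the per-dimension weights d are the learnable "parameters that define the distance":

    function yTest = nn1(xTrain, yTrain, xTest, d)
    % 1-NN with a weighted squared Euclidean distance.
    % xTrain: N x D training features, yTrain: N x 1 labels in {-1,+1}
    % xTest:  M x D test features,     d: D x 1 positive distance weights
    M = size(xTest, 1);
    yTest = zeros(M, 1);
    for i = 1:M
        diff = xTrain - repmat(xTest(i, :), size(xTrain, 1), 1);
        dist = (diff .^ 2) * d(:);       % weighted squared distance to every training point
        [~, j] = min(dist);              % index of the nearest neighbor
        yTest(i) = yTrain(j);            % copy its label
    end

Learning the distance then amounts to choosing d (for example by cross-validation) so that nearest neighbors tend to share the query's label.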

Object models Invariance: search strategy Part based Template matching

Object models. There was a big emphasis in the past on face detection: it provided a clear problem with a well-defined visual category and many applications, and it allowed making progress on efficient techniques (Rowley, Schneiderman, Poggio, Viola, Ullman, Geman).

A simple object detector with boosting. Download: toolbox for manipulating the dataset; code and dataset. Matlab code: gentle boosting and an object detector using a part-based model; dataset with cars and computer monitors. In the next part I will focus on a particular technique, and I will provide a detailed description of the things you need to know to make it work. This part is accompanied by a Matlab toolbox that you can find on the web site of the class. It is entirely self-contained and written in Matlab. It has an implementation of gentle boosting and an object detector. The demo shows results on detecting screens and cars and can easily be trained to detect other objects. http://people.csail.mit.edu/torralba/iccv2005/

Boosting: a simple algorithm for learning robust classifiers (Schapire, Friedman). It provides an efficient algorithm for sparse visual feature selection (Tieu & Viola, Viola & Jones).

Boosting defines a classifier using an additive model: the strong classifier is a weighted combination of weak classifiers applied to the feature vector. We need to define a family of weak classifiers from which the individual terms are selected.
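
The additive model itself is an image on the slide; in common notation it reads:

    H(v) = \sum_{m=1}^{M} \alpha_m \, h_m(v)

where v is the feature vector, h_m are the weak classifiers, \alpha_m their weights, and the decision is sign(H(v)). In gentle boosting the weights \alpha_m are absorbed into the weak classifiers themselves.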

Boosting is a greedy procedure. [Slide figure: 2-D data points plotted against coordinates xt=1 and xt=2.] Each data point xt has a class label yt in {+1, -1} and a weight wt = 1.

Toy example. The weak learners come from the family of lines. Each data point has a class label yt in {+1, -1} and a weight wt = 1. The initial classifier h has p(error) = 0.5: it is at chance.

Toy example. Each data point has a class label yt in {+1, -1} and a weight wt = 1. This line seems to be the best one. This is a 'weak classifier': it performs slightly better than chance.

Toy example (this slide is repeated over several boosting iterations). Each data point has a class label yt in {+1, -1}. We update the weights: wt <- wt exp{-yt Ht}. We set a new problem for which the previous weak classifier performs at chance again.

Toy example. The strong (non-linear) classifier is built as the combination of all the weak (linear) classifiers f1, f2, f3, f4.
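
To make the weight-update mechanics concrete, here is a tiny self-contained Matlab sketch of an AdaBoost-style version of this toy example (it is not the course code; the data, the three rounds and the axis-aligned "lines" used as weak learners are all assumptions):

    % Toy boosting on 2-D points.
    x = [randn(20, 2) + 1; randn(20, 2) - 1];   % two Gaussian blobs
    y = [ones(20, 1); -ones(20, 1)];
    N = size(x, 1);
    w = ones(N, 1);                             % every data point starts with weight 1
    H = zeros(N, 1);                            % strong classifier evaluated on the data
    for m = 1:3                                 % a few boosting rounds, as in the toy example
        w = w / sum(w);
        best = struct('err', inf, 'h', []);
        for d = 1:2                             % weak learners: thresholds on one coordinate
            for thr = x(:, d)'
                for s = [-1 1]
                    h = s * sign(x(:, d) - thr + eps);
                    err = sum(w .* (h ~= y));
                    if err < best.err, best.err = err; best.h = h; end
                end
            end
        end
        err = max(best.err, 1e-12);
        alpha = 0.5 * log((1 - err) / err);     % weight of this weak classifier
        H = H + alpha * best.h;
        w = w .* exp(-y .* alpha .* best.h);    % misclassified points get heavier, as on the slides
    end
    trainError = mean(sign(H) ~= y)             % the combined classifier is non-linear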

Boosting. Different cost functions and minimization algorithms result in various flavors of boosting. In this demo I will use gentleBoosting: it is simple to implement and numerically stable.

gentleBoosting. GentleBoosting fits the additive model by minimizing the exponential loss over the training samples. The minimization is performed with Newton steps.

gentleBoosting is an iterative procedure. At each step we add a new weak classifier, chosen to minimize the cost. A Taylor approximation of this term shows that at each iteration we just need to solve a weighted least-squares problem, with the weights recomputed at that iteration. For more details: Friedman, Hastie, Tibshirani, "Additive Logistic Regression: a Statistical View of Boosting" (1998).
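
The equations on this slide are images; following the Friedman, Hastie and Tibshirani paper cited above, the step can be written as:

    J(F) = E\big[ e^{-y\,F(x)} \big]

    f_m = \arg\min_{f} \sum_{i} w_i \,\big(y_i - f(x_i)\big)^2,
    \qquad w_i = e^{-y_i F(x_i)}

after which the model is updated as F <- F + f_m and the weights as w_i <- w_i exp(-y_i f_m(x_i)).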

Weak classifiers. Generic form: fm(x). Regression stumps (one-level decision trees) are simple but commonly used in object detection. A stump thresholds a single feature x at θ and outputs a constant value on each side: b = Ew(y [x > θ]) above the threshold and a = Ew(y [x < θ]) below it. Four parameters: the feature, the threshold θ, and the two outputs a, b.
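
Here is a minimal Matlab sketch of fitting such a regression stump by exhaustive search over one candidate threshold per data value (the function name and the search strategy are assumptions, not the course toolbox code):

    function [k, theta, a, b, bestErr] = fitRegressionStump(x, y, w)
    % x: N x D features, y: N x 1 labels in {-1,+1}, w: N x 1 weights.
    % Returns the stump f(z) = b*(z > theta) + a*(z <= theta) on feature k,
    % where b and a are the weighted means of y on each side of the threshold.
    w = w / sum(w);
    bestErr = inf;
    D = size(x, 2);
    for d = 1:D
        for t = x(:, d)'                        % candidate thresholds: the observed values
            above = x(:, d) > t;
            eb = sum(w(above)  .* y(above))  / max(sum(w(above)),  eps);  % Ew(y | x > theta)
            ea = sum(w(~above) .* y(~above)) / max(sum(w(~above)), eps);  % Ew(y | x <= theta)
            f  = eb * above + ea * ~above;      % stump output on the training data
            err = sum(w .* (y - f) .^ 2);       % weighted squared error
            if err < bestErr
                bestErr = err; k = d; theta = t; b = eb; a = ea;
            end
        end
    end

A stump like this is the kind of thing selectBestWeakClassifier in the pseudocode below could return at each boosting round.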

gentleBoosting.m

    function classifier = gentleBoost(x, y, Nrounds)
    w = ones(size(y));                           % initialize weights w = 1
    % ...
    for m = 1:Nrounds
        fm = selectBestWeakClassifier(x, y, w);  % solve the weighted least-squares problem
        w = w .* exp(- y .* fm);                 % re-weight the training samples
        % store parameters of fm in classifier
    end

Demo gentleBoosting Demo using Gentle boost and stumps with hand selected 2D data: > demoGentleBoost.m

Probabilistic interpretation: generative model vs. discriminative (boosting) model. In a probabilistic framework, as opposed to the generative model, here we fit a conditional probability. Boosting is a way of fitting a logistic regression, and the additive model can be built from a set of arbitrary functions. This provides great flexibility, difficult to beat with current generative models, but there is also the danger of not understanding what the learned functions are really doing.

From images to features: Weak detectors We will now define a family of visual features that can be used as weak classifiers (“weak detectors”)

Weak detectors: textures of textures (Tieu and Viola, CVPR 2000). One of the first papers to use boosting for a vision task. Take one filter, filter the image, take the absolute value and downsample. Take another filter, take the absolute value of the output and downsample. Do this once more. Then, for each color channel independently, sum the values of all the pixels; the resulting number is one of the features. If you do this for all triples of filters, you end up with 46875 features: every combination of three filters generates a different feature. Boosting selects a sparse subset, so computations at test time are very efficient. Boosting is also very robust to overfitting.
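
A rough Matlab sketch of computing one such feature, following the filter / absolute-value / downsample cascade described above (the image file, the tiny filter bank, the single-channel simplification and the downsampling factor are all assumptions; Tieu and Viola use a specific filter bank and all triples of its filters):

    img = double(imread('office.jpg'));            % hypothetical input image
    if size(img, 3) == 3, img = mean(img, 3); end  % one channel for simplicity
    g1 = [1 0 -1];  g2 = g1';  g3 = [1 -2 1];      % a tiny, made-up filter bank
    filters = {g1, g2, g3};
    x = img;
    for level = 1:3                                % three filter-rectify-downsample stages
        x = abs(conv2(x, filters{level}, 'same')); % filter, then take the absolute value
        x = x(1:2:end, 1:2:end);                   % downsample by 2
    end
    feature = sum(x(:));                           % sum the remaining pixels -> one feature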

Weak detectors Haar filters and integral image

Weak detectors Edge probes & chamfer distances

Weak detectors, part based: similar to part-based generative models. We create weak detectors by using parts and voting for the object center location (screen model, car model). These features are the ones used for the detector on the course web site.

Weak detectors. We can create a family of "weak detectors" by collecting a set of filtered templates from a set of training objects. The simplest weak detector uses template matching. In this case we use template matching to detect the presence of a part of the object; once the part is detected, it votes for the object at the center of the object coordinate frame. In this example we are trying to detect a computer screen, and we use the top-left corner of the screen as the part. The first thing we do is compute a normalized correlation between the image and the template. This gives a peak in the response near the top-left corner of the screen and, of course, many other matches that do not correspond to screens. Then we displace the output of the correlation, taking into account the relative position between the top-left corner of the screen and the center of the screen, and we also blur the output. If we now apply a threshold, the center of the screen has a value above the threshold while most of the background gets suppressed. There are still a lot of other locations above the threshold, so this is not a good detector. But it is better than chance!
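
A simplified Matlab sketch of the steps just described (all file names, offsets and thresholds are assumptions, and plain correlation with a zero-mean template is used here instead of the normalized correlation mentioned above, to keep the sketch short):

    img = double(imread('desk.jpg'));            % hypothetical image containing a screen
    if size(img, 3) == 3, img = mean(img, 3); end
    template = img(40:60, 80:110);               % e.g. a patch around the top-left screen corner
    dy = 25; dx = 30;                            % displacement from the part to the object center

    t = template - mean(template(:));            % zero-mean template
    r = filter2(t, img, 'same');                 % correlation of the template with the image
    v = circshift(r, [dy dx]);                   % displace the responses so they vote for the center
    v = conv2(v, ones(9) / 81, 'same');          % blur the vote map
    detections = v > 0.9 * max(v(:));            % threshold: many false alarms, but better than chance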

Weak detectors. We can do a better job using filtered images: still a weak detector, but better than before. We can create a family of "weak detectors" by collecting a set of filtered templates from a set of training objects.

From images to features: weak detectors. First we create a family of weak detectors: 1) we collect a set of filtered templates from a set of training objects (so, in summary, we start by collecting many patches); 2) each weak detector works by doing template matching using one of the templates.

Weak detectors: generative model vs. discriminative (boosting) model. In a probabilistic framework, as opposed to the generative model, here we fit a conditional probability; boosting is a way of fitting a logistic regression. [Slide figure labels: image; fi, Pi; feature; gi; part template; relative position with respect to the object center.]

Training. First we evaluate all the N features on all the training images. Then we sample the feature outputs at the object centers and at random locations in the background. In other words, to train we need positive and negative examples: we get positive samples by taking the values of all the features at the center of the target, and negative samples from the background.
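
A sketch of how these samples might be assembled for one training image, assuming the feature outputs have already been computed into an H x W x Nfeat stack called featureMaps and the object centers are given as [row, col] pairs (all names here are assumptions, not the toolbox interface):

    % featureMaps: H x W x Nfeat feature outputs, centers: K x 2 object centers [row col]
    [H, W, Nfeat] = size(featureMaps);
    K = size(centers, 1);
    pos = zeros(K, Nfeat);
    for i = 1:K                                        % positives: feature values at object centers
        pos(i, :) = squeeze(featureMaps(centers(i, 1), centers(i, 2), :))';
    end
    Nneg = 200;                                        % negatives: random background locations
    rr = randi(H, Nneg, 1);  cc = randi(W, Nneg, 1);   % (ideally kept away from the objects)
    neg = zeros(Nneg, Nfeat);
    for i = 1:Nneg
        neg(i, :) = squeeze(featureMaps(rr(i), cc(i), :))';
    end
    x = [pos; neg];                                    % boosting training features
    y = [ones(K, 1); -ones(Nneg, 1)];                  % boosting training labels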

Example of a weak 'screen detector'. We run boosting: at each iteration we select the best feature hm(v) on the weighted classification problem. A decision stump is a threshold on a single feature. Each decision stump has 4 parameters: {f, θ, a, b}, where f = template index (selected from a dictionary of templates), θ = threshold, and a, b = the average class value (between -1 and +1) on each side of the threshold.

Representation. [Slide figure: selected features 1, 2, 3, 4, ..., 10, ..., 100 for the screen detector, shown as image templates.]

Representation. [Slide figure: selected features 1, 2, 3, 4, ..., 10, ..., 100 for the car detector, shown as image templates.]

Example: screen detection Feature output

Example: screen detection Feature output Thresholded output Weak ‘detector’ Produces many false alarms.

Example: screen detection Feature output Thresholded output Strong classifier at iteration 1

Example: screen detection Feature output Thresholded output Strong classifier Second weak ‘detector’ Produces a different set of false alarms.

Example: screen detection Feature output Thresholded output Strong classifier + Strong classifier at iteration 2

Example: screen detection Feature output Thresholded output Strong classifier + … Strong classifier at iteration 10

Example: screen detection Feature output Thresholded output Strong classifier + … Adding features Final classification Strong classifier at iteration 200

Demo Demo of screen and car detectors using parts, Gentle boost, and stumps: > runDetector.m

Performance measures ROC Precision-recall
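
A small Matlab sketch of computing both curves from detector scores by sweeping a threshold (variable names are made up):

    % scores: N x 1 classifier outputs, labels: N x 1 ground truth in {-1,+1}
    [~, order] = sort(scores, 'descend');
    l = labels(order);
    tp = cumsum(l == 1);                          % true positives when keeping the top-k detections
    fp = cumsum(l == -1);                         % false positives
    precision = tp ./ (tp + fp);
    recall    = tp / sum(labels == 1);
    fpr       = fp / sum(labels == -1);           % false-positive rate, for the ROC curve
    plot(recall, precision); xlabel('recall'); ylabel('precision');                           % precision-recall
    figure; plot(fpr, recall); xlabel('false positive rate'); ylabel('true positive rate');   % ROC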

Cascade of classifiers Geman Viola

Hybrid methods. In many cases algorithms cannot be classified as purely generative or discriminative: they are a mixture of both (mentioned here only briefly). Some examples: Hinton, Schiele, Perona (Welling, ICCV 2005), Poggio (morphable models and learning a face from one example).

“Head in the coffee beans problem” Can you find the head in this image?

Multiclass object detection Multiview Multiclass

Multiclass object detection

Sharing representations Multiview Multiclass

Sharing representations Ullman & Evgyne

Context Multiview Multiclass

Context: relationships between objects. First detect simple objects (with reliable detectors) that provide strong contextual constraints on the target (screen -> keyboard -> mouse).

Session 3: discriminative methods. 20' demo: simple (Ada-)boosting + object detector. 20' review + new material: Viola-Jones; Schneiderman et al.; Opelt et al.; Torralba et al.; SVM (Triggs, CVPR 05); NN (LeCun).

Session 3: Demo Simple (ada-) boosting LeCun demo???? Viola demo???

Session 3: Demo gentle Boost

Session 3: Review. Viola-Jones; Schneiderman et al.; Opelt et al.; Torralba et al.; SVM (Triggs, CVPR 05 – get video); NN (LeCun).

Session 3: multiclass Multiclass boosting, Some examples

Single category object detection and the “Head in the coffee beans problem”

Discriminative methods. Object detection and recognition is formulated as a classification problem. The image is partitioned into a set of overlapping windows. [Slide figure: a bag of image patches.]

Discriminative methods ... and a decision is taken at each window about whether it contains a target object or not. [Slide figure: decision boundary separating "computer screen" from "background".]
