Unsupervised Learning of Models for Recognition


Unsupervised Learning of Models for Recognition
M. Weber, M. Welling, P. Perona, ECCV 2000 + M. Weber's PhD thesis
Presented by Greg Shakhnarovich for 6.899 (Learning and Vision seminar), May 1, 2002

The problem
"Recognizing members of an object class", i.e. detection.
An object class is defined by common parts that:
- are visually similar (small intra-class appearance variation)
- occur in similar but varying configurations (intra-class shape variation)
- are less pose-dependent than the whole object

Meet the xyz

Spot the xyz (repeated across six image-only slides)

Meet the abc

Example: house in a rural scene
- Random BG clutter (Poisson)
- House: roof + 2 windows; rooftop and window positions normally distributed; fixed scale
- BG objects (trees, flowers, fences): random position/scale
- Occasional occlusion

Main ideas
- Unsupervised learning of relevant parts: fully automatic, from cluttered images, using only positive examples (exactly one object present)
- Learning the affine shape (constellations of parts) distribution using EM
- Decision made in a probabilistic framework

Background clutter

Related work
- Amit & Geman '99: assumes registration (alignment)
- Burl, Leung, Perona, Weber '95-'98: requires manual labeling
- Cootes, Taylor, Edwards '96-'98: AAM, models deformations

Overview
- Part selection
- Probabilistic model
- Learning the model parameters
- Results

Part selection
Detection: using normalized correlation (efficiency, good performance).
Choose the templates in 2 steps:
1. Identify points of interest (Förstner's interest operator): ~150 candidates per training image
2. Learn a vector quantization to reduce the number of candidates
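The normalized-correlation detection step can be sketched in a few lines of pure Python. This is an illustration only; the function name and list-of-lists data layout are mine, not from the paper:

```python
import math

def ncc(image, template):
    """Slide a template over an image and return the normalized
    correlation response at every valid offset (values in [-1, 1]).
    Both inputs are 2-D lists of grayscale values."""
    th, tw = len(template), len(template[0])
    H, W = len(image), len(image[0])
    tvals = [v for row in template for v in row]
    tmean = sum(tvals) / len(tvals)
    t = [v - tmean for v in tvals]
    tnorm = math.sqrt(sum(v * v for v in t))
    out = []
    for i in range(H - th + 1):
        row = []
        for j in range(W - tw + 1):
            p = [image[i + a][j + b] for a in range(th) for b in range(tw)]
            pmean = sum(p) / len(p)
            p = [v - pmean for v in p]
            pnorm = math.sqrt(sum(v * v for v in p))
            dot = sum(a * b for a, b in zip(p, t))
            denom = pnorm * tnorm
            row.append(dot / denom if denom > 0 else 0.0)
        out.append(row)
    return out
```

A patch identical to the template scores exactly 1, which is why the response is a convenient detection score: it is invariant to local brightness and contrast changes.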

Part selection
Interesting points: points/regions where the image changes significantly in two dimensions.
- Edges are not
- Corners and circular features (contours or blobs) are
Förstner's operator.
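The "changes two-dimensionally" criterion can be made concrete with the structure tensor. Below is a Förstner-style sketch using a 3x3 window and the det/trace measure; an illustration of the idea, not the exact operator from the paper:

```python
def cornerness(image):
    """Interest measure det(M)/trace(M) of the local structure tensor M
    (sums of gx^2, gy^2, gx*gy over a 3x3 window). Straight edges have a
    rank-1 tensor, so det(M) = 0; corners and blobs score high."""
    H, W = len(image), len(image[0])
    gx = [[0.0] * W for _ in range(H)]
    gy = [[0.0] * W for _ in range(H)]
    for i in range(1, H - 1):          # central differences, borders left 0
        for j in range(1, W - 1):
            gx[i][j] = (image[i][j + 1] - image[i][j - 1]) / 2.0
            gy[i][j] = (image[i + 1][j] - image[i - 1][j]) / 2.0
    w = [[0.0] * W for _ in range(H)]
    for i in range(1, H - 1):
        for j in range(1, W - 1):
            sxx = syy = sxy = 0.0
            for a in (-1, 0, 1):
                for b in (-1, 0, 1):
                    sxx += gx[i + a][j + b] ** 2
                    syy += gy[i + a][j + b] ** 2
                    sxy += gx[i + a][j + b] * gy[i + a][j + b]
            tr = sxx + syy
            w[i][j] = (sxx * syy - sxy * sxy) / tr if tr > 0 else 0.0
    return w
```

On a synthetic bright square, the measure is zero along the straight edges and in flat regions, and positive only at the corners, matching the slide's distinction.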

Part selection: interest operator

Vector Quantization
Goal: learn a small subset of best representatives; think of it as minimal-error codebook construction.
Possible solution: k-means clustering
- Number of clusters set to 100
- Discard small clusters (fewer than 10 patterns)
- Remove duplicates (up to a small shift in any direction)
- Merge/split clusters
- Select the correct number of clusters
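The VQ step can be sketched as plain k-means followed by discarding small clusters. The farthest-point initialization and the parameter defaults are my choices for a deterministic toy; the slides use k = 100 and a 10-pattern minimum:

```python
def kmeans_codebook(patterns, k, min_cluster=2, iters=10):
    """k-means over pattern vectors, then discard clusters with fewer
    than `min_cluster` members; returns the surviving centers."""
    # Deterministic farthest-point initialization (for the sketch).
    centers = [list(patterns[0])]
    while len(centers) < k:
        far = max(patterns, key=lambda p: min(
            sum((a - b) ** 2 for a, b in zip(p, c)) for c in centers))
        centers.append(list(far))
    buckets = [[] for _ in range(k)]
    for _ in range(iters):
        buckets = [[] for _ in range(k)]
        for p in patterns:                       # assignment step
            i = min(range(k), key=lambda c: sum(
                (a - b) ** 2 for a, b in zip(p, centers[c])))
            buckets[i].append(p)
        for i, bucket in enumerate(buckets):     # update step
            if bucket:  # keep the old center if a cluster went empty
                centers[i] = [sum(col) / len(bucket) for col in zip(*bucket)]
    # Discard clusters that attracted too few patterns.
    return [centers[i] for i in range(k) if len(buckets[i]) >= min_cluster]
```

With ~34,000 candidate patches the real pipeline also removes near-duplicates and merges/splits clusters; those refinements are omitted here.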

Unsupervised detector training - 2 (©WWP): "Pattern Space" (100+ dimensions)

Example: part selection
- ~34,000 candidate patches from 200 images, reduced to 99 parts
- Harris corner detector
- k-means clustering, k = 100

Generative Object Model
- Part: type (one of T) + position in the image
- Observations: X^o, where each entry x is a 2-D location
- Hypothesis: h; h_t = j means x_j is the location of the part of type t
- Occluded parts: h_t = 0
- Locations of missing parts: x^m

Example: observed detections (3-part model)

Example: observed detections (figure: parts, observations, correct hypothesis)

Probabilistic model
Joint pdf: p(X^o, x^m, h, n, b) = p(X^o, x^m | h, n) p(h | n, b) p(n) p(b)
Notation: n_t is the number of BG detections in the t-th row of X^o; b_t = 1 iff part t is detected (h_t > 0)

Example: model components (figure: parts, observations, correct hypothesis)

Number of BG part detections
Assumptions about part detections in the BG:
- independence between types
- independence between locations
Binomial → Poisson: p(n_t) = e^{-λ_t} λ_t^{n_t} / n_t!, where λ_t is the average number of BG detections of part type t
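The Poisson model for per-type background counts is easy to write down explicitly (function names are mine):

```python
import math

def poisson_pmf(n, lam):
    """p(n) = exp(-lam) * lam**n / n!: probability of n background
    detections of one part type, lam being that type's average count."""
    return math.exp(-lam) * lam ** n / math.factorial(n)

def p_bg_counts(counts, lams):
    """Independence across part types: the joint probability of the
    count vector is the product of per-type Poisson terms."""
    p = 1.0
    for n, lam in zip(counts, lams):
        p *= poisson_pmf(n, lam)
    return p
```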

Probability of FG part detection
Shouldn't assume independence; e.g., certain parts are often occluded simultaneously.
Model p(b) as a joint probability mass function with 2^T entries.

Probability of the hypothesis
Let H(n, b) be the set of hypotheses consistent with given n and b.
Assumption: all hypotheses in H(n, b) are equally likely: p(h | n, b) = 1/|H(n, b)|

Likelihood of the observations
Notation: z collects all FG part locations, x^{bg} all the BG detections.
Assuming independence between FG & BG: p(X^o, x^m | h, n) = p(z) p(x^{bg})
Modeling p(z) as a joint Gaussian over the constellation.
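A sketch of the resulting log-likelihood. For brevity this uses a diagonal covariance for the foreground positions (the model actually uses a full joint Gaussian over the constellation) together with a uniform 1/A term per background detection; names are mine:

```python
import math

def log_likelihood(fg_positions, means, variances, n_bg, area):
    """log p(X^o, x^m | h, n) under the FG/BG independence assumption:
    Gaussian foreground part positions (diagonal covariance sketch)
    times a uniform density 1/A for each background detection."""
    ll = 0.0
    for (x, y), (mx, my), (vx, vy) in zip(fg_positions, means, variances):
        ll += -0.5 * math.log(2 * math.pi * vx) - (x - mx) ** 2 / (2 * vx)
        ll += -0.5 * math.log(2 * math.pi * vy) - (y - my) ** 2 / (2 * vy)
    ll -= n_bg * math.log(area)  # background positions: (1/A)^n
    return ll
```

Note how each extra background detection costs a fixed log(A), so clutter is penalized but never forbidden.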

Example: constellation model

Positions of BG part detections
x^{bg} collects all the BG detections in X^o.
The probability of the BG detections (given their actual number) is uniform over the image: p(x^{bg}) = A^(-Σ_t n_t), where A is the image area.

Affine invariance
Must ensure TRS (translation, rotation, scale) invariance: make positions relative, i.e. model shape rather than absolute positions.
- Single reference point: eliminates translation
- Two points: eliminates rotation + scaling; dimension decreases by two
Want to keep a simple form for the densities; part detectors must be TRS-invariant, too…
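The "single reference point" option amounts to one subtraction per part (a hypothetical helper, for illustration):

```python
def to_relative(positions):
    """Translation invariance via a single reference part: subtract the
    first part's position. The reference drops out of the representation,
    so the shape dimension decreases by two."""
    rx, ry = positions[0]
    return [(x - rx, y - ry) for x, y in positions[1:]]
```

Shifting the whole constellation leaves the output unchanged, which is exactly the invariance the density model needs.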

Shape representation
A figure from: M.C. Burl and P. Perona, "Recognition of Planar Object Classes", CVPR 1996.
Dryden & Mardia: distributions in shape space (for a Gaussian in figure space).
Scale/rotation is difficult; the authors only implement translation invariance.

Classification formulation
Two classes: C_0 (object absent), C_1 (object present).
Null hypothesis: all detections are in the BG.
Decision: MAP given the detections.
A true hypothesis-testing setup?..
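The MAP decision reduces to comparing log posterior odds against zero. A sketch of the decision rule under equal priors (names and the prior parameterization are mine, not the paper's exact thresholding):

```python
import math

def detect(log_lik_object, log_lik_bg, log_prior_object=math.log(0.5)):
    """MAP decision between 'object present' and 'object absent'.
    Arguments are log-likelihoods of the observed detections under each
    class; returns (decision, log posterior odds)."""
    log_prior_bg = math.log1p(-math.exp(log_prior_object))
    log_odds = (log_lik_object + log_prior_object) - (log_lik_bg + log_prior_bg)
    return log_odds > 0, log_odds
```

Sweeping the implicit threshold instead of fixing it at zero is what traces out the ROC curves shown later.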

Model details (choosing the parts)
- Start with a pool of candidate parts
- Greedily choose optimal parts:
  - start with a random selection
  - randomly replace one of the parts and see if the model improves
  - stop when there is no more improvement
- Set the number of parts and start over
- May optimize a bit further

ML using EM (©WWP)
1. Current parameter estimate.
2. Assign probabilities to constellations: image 1 gets large P, image 2 small P, … according to the current pdf.
3. Use the probabilities as weights to re-estimate the parameters. Example for the mean μ: (large P)·x_1 + (small P)·x_2 + … = new estimate of μ.
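The loop on this slide has the same shape for any latent-variable model. A 1-D mixture of two unit-variance Gaussians makes the alternation concrete (a toy of my own; the constellation model's EM re-estimates the shape Gaussian, the table p(b) and the Poisson rates instead):

```python
import math

def em_two_means(xs, m0, m1, iters=50):
    """EM for a 1-D mixture of two unit-variance Gaussians with equal
    mixing weights and unknown means. E-step: posterior responsibility
    of component 0 for each point. M-step: means re-estimated as
    responsibility-weighted averages, the slide's
    'use probabilities as weights' step."""
    for _ in range(iters):
        r0 = []
        for x in xs:                              # E-step
            p0 = math.exp(-0.5 * (x - m0) ** 2)
            p1 = math.exp(-0.5 * (x - m1) ** 2)
            r0.append(p0 / (p0 + p1))
        w0 = sum(r0)                              # M-step
        m0 = sum(r * x for r, x in zip(r0, xs)) / w0
        m1 = sum((1 - r) * x for r, x in zip(r0, xs)) / (len(xs) - w0)
    return m0, m1
```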

Model parameter estimation
The parameters model:
- the probability of a hypothesized constellation on the object
- the probability of missing a part of the true object
- the probability of the observed number of detections in the BG
Must infer the hidden variables from the observed data: EM, maximizing the likelihood.

Experiment 1: faces
- 200 images with faces, 30 people
- 200 BG images from the same environment
- Grayscale, 240 x 160 pixels
- Random split into train + test sets
- Parts: 11 x 11 pixels
- Tried 2, 3, 4, 5 parts in the model

Learned model - faces

Sample results - faces

Sample results - faces

ROC curves: 93.5% correct

Experiment 2: cars (images high-pass filtered)

Results - cars (86.5% correct)

Results - cars

Other data sets…

Experiment 3: occlusion
More overfitting; larger parts behave worse?

Experiment 4: multi-scale

Advantages
- Unsupervised: not misled by our intuition, doesn't require expensive labeling
- Handles occlusion in a well-defined probabilistic way
- Some pose-invariance; promising results in view-based training
- Potentially fast (once trained)

Apparent limitations
- Unsupervised (not guided by our intuition)
- Must have at most a single object in the image
- Current model doesn't handle repeated parts (e.g. identical windows)
- Rotation/scale invariance not implemented (theoretically possible)
- Very expensive training
- Illumination?

That’s all… Discussion (hopefully)

EM
In each iteration, we want to find the parameters θ maximizing the likelihood of the observed data; instead we maximize Q(θ | θ̃), where θ is the value being optimized for (current iteration) and θ̃ is the current value (previous iteration).
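Written out (standard EM, with the hypotheses h and missing positions x^m as the hidden variables and θ̃ the previous estimate):

```latex
\theta^{\mathrm{new}} \;=\; \arg\max_{\theta}\; Q(\theta \mid \tilde{\theta}),
\qquad
Q(\theta \mid \tilde{\theta}) \;=\;
\mathbb{E}\!\left[\, \log p(X^{o}, x^{m}, \mathbf{h} \mid \theta)
\;\middle|\; X^{o}, \tilde{\theta} \,\right]
```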

EM: update rules
Expectations are taken w.r.t. the posterior density of the hidden variables, p(h, x^m | X^o, θ̃).