Object class recognition using unsupervised scale-invariant learning
Rob Fergus, Pietro Perona, Andrew Zisserman
Oxford University, California Institute of Technology

Goal
- Recognition of object categories
- Unassisted learning

Some object categories
- Learn from examples
- Difficulties: size variation, background clutter, occlusion, intra-class variation

Main issues
- Representation
- Learning
- Recognition

Model: Constellation of Parts
- Fischler & Elschlager 1973
- Yuille ’91
- Brunelli & Poggio ’93
- Lades, v.d. Malsburg et al. ’93
- Cootes, Lanitis, Taylor et al. ’95
- Amit & Geman ’95, ’99
- Perona et al. ’95, ’96, ’98, ’00
- Agarwal & Roth ’02

Detection & Representation of regions
- Find regions within image: use Kadir and Brady's salient region operator [IJCV ’01]
- Location: (x,y) coords. of region center
- Scale: diameter of region (pixels)
- Appearance: 11x11 patch -> normalize -> projection onto PCA basis -> coefficients c1, c2, ..., c15
- Gives representation of appearance in low-dimensional vector space
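The appearance pipeline on this slide (11x11 patch, normalize, project onto a PCA basis, keep 15 coefficients) can be sketched as follows. This is a minimal illustration, not the authors' code: the slide does not say what "normalize" means, so zero-mean, unit-variance intensity normalization is assumed here, the patches are synthetic random data, and the helper names `normalize_patch`, `fit_pca` and `project` are hypothetical.

```python
import numpy as np

def normalize_patch(patch):
    """Flatten an 11x11 patch; assumed normalization: zero mean, unit variance."""
    v = patch.astype(np.float64).ravel()
    v = v - v.mean()
    std = v.std()
    return v / std if std > 0 else v

def fit_pca(patches, n_components=15):
    """Learn a PCA basis from an (n, 121) matrix of normalized patches."""
    mean = patches.mean(axis=0)
    # Rows of vt are principal directions, sorted by decreasing variance.
    _, _, vt = np.linalg.svd(patches - mean, full_matrices=False)
    return mean, vt[:n_components]

def project(patch_vec, mean, basis):
    """Return the coefficients (c1, ..., c15) of one patch in the PCA basis."""
    return basis @ (patch_vec - mean)

rng = np.random.default_rng(0)
raw = rng.random((500, 11, 11))            # synthetic stand-in for region crops
data = np.stack([normalize_patch(p) for p in raw])
mean, basis = fit_pca(data)
coeffs = project(data[0], mean, basis)     # 15-dimensional appearance vector
```

Each detected region is then described by its location, its scale and this 15-dimensional vector in place of the 121 raw pixel values.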

Generative probabilistic model (based on Burl, Weber et al. [ECCV ’98, ’00])
Foreground model:
- Gaussian shape pdf
- Gaussian part appearance pdf
- Gaussian relative scale pdf (over log(scale))
- Prob. of detection (values shown: 0.8, 0.75, 0.9)
Clutter model:
- Uniform shape pdf
- Gaussian background appearance pdf
- Uniform relative scale pdf (over log(scale))
- Poisson pdf on # detections
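One way to write down how these pdfs combine is given below. The slides do not spell out the formula, so treat this as a sketch: the notation is an assumption, with X the locations, S the scales and A the appearances of the detected regions, h a hypothesis assigning regions to parts, theta the foreground parameters and theta_bg the clutter parameters.

```latex
% Likelihood of the image's regions under the object model, summed over
% all hypotheses h (assignments of regions to parts):
p(X, S, A \mid \theta) = \sum_{h \in H}
    \underbrace{p(A \mid X, S, h, \theta)}_{\text{appearance}}\;
    \underbrace{p(X \mid S, h, \theta)}_{\text{shape}}\;
    \underbrace{p(S \mid h, \theta)}_{\text{relative scale}}\;
    \underbrace{p(h \mid \theta)}_{\text{detection / occlusion}}

% Object present/absent decision as a likelihood ratio against the clutter model:
R = \frac{p(\text{Object} \mid X, S, A)}{p(\text{No object} \mid X, S, A)}
  \approx \frac{p(X, S, A \mid \theta)\; p(\text{Object})}
               {p(X, S, A \mid \theta_{bg})\; p(\text{No object})}
```

Under this reading, an image is labelled as containing the object when R exceeds a threshold, and varying that threshold traces out an ROC curve.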

Recognition

Motorbikes Samples from appearance model

Learning

Learning
Task: Estimation of model parameters
Chicken and Egg type problem, since we initially know neither:
- Model parameters
- Assignment of regions to foreground / background
Let the assignments be a hidden variable and use EM algorithm to learn them and the model parameters

Learning procedure
- Find regions & their location, scale & appearance
- Initialize model parameters
- Use EM and iterate to convergence:
  E-step: Compute assignments for which regions are foreground / background
  M-step: Update model parameters
- Trying to maximize likelihood: consistency in shape & appearance
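The E-step/M-step alternation can be illustrated on a toy problem that is far simpler than the constellation model. In this sketch each "region" is a single number in [0, 1], the foreground is one Gaussian and the background is uniform. The data, the initialization and the function name `em_foreground_background` are assumptions made for illustration only.

```python
import numpy as np

def em_foreground_background(x, n_iter=200, tol=1e-8):
    """Fit a Gaussian foreground + uniform background mixture on [0, 1] by EM."""
    mu, sigma, pi_fg = x.mean(), x.std(), 0.5    # crude initialization
    bg_density = 1.0                             # uniform pdf on [0, 1]
    prev_ll = -np.inf
    for _ in range(n_iter):
        # E-step: soft assignment of each region to the foreground.
        fg = pi_fg * np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))
        bg = (1.0 - pi_fg) * bg_density
        resp = fg / (fg + bg)
        # M-step: re-estimate parameters from the softly assigned regions.
        weight = resp.sum()
        mu = (resp * x).sum() / weight
        sigma = np.sqrt((resp * (x - mu) ** 2).sum() / weight) + 1e-6
        pi_fg = weight / x.size
        # Log-likelihood under the parameters used in this E-step;
        # stop once it no longer improves.
        ll = np.log(fg + bg).sum()
        if ll - prev_ll < tol:
            break
        prev_ll = ll
    return mu, sigma, pi_fg, resp

rng = np.random.default_rng(0)
# 200 foreground values clustered near 0.3, plus 300 uniform clutter values.
x = np.concatenate([rng.normal(0.3, 0.03, 200), rng.random(300)])
mu_hat, sigma_hat, pi_hat, resp = em_foreground_background(x)
```

Neither the cluster location nor the foreground/background labels are given to the routine; both are recovered jointly, which is the chicken-and-egg structure the slides describe.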

Experiments

Experimental procedure
Two series of experiments:
- Fixed-scale model: objects the same size (manual normalization)
- Scale-invariant model: objects between 100 and 550 pixels in width
Datasets: Motorbikes, Airplanes, Frontal Faces, Cars (Side), Cars (Rear), Spotted cats
Training: 50% images; no identification of object within image
Testing: 50% images; simple object present/absent test

Motorbikes

Background images evaluated with motorbike model

Frontal faces

Airplanes

Spotted cats

Summary of results (% equal error rate)
Dataset        Fixed-scale experiment   Scale-invariant experiment
Motorbikes     7.5                      6.7
Faces          4.6 (single value listed)
Airplanes      9.8                      7.0
Cars (Rear)    15.2                     9.7
Spotted cats   10.0 (single value listed)
Note: Within each series, same settings used for all datasets
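The "% equal error rate" figures are read off an ROC curve at the point where the false-positive rate equals the false-negative rate. A minimal sketch of that computation follows, assuming each test image has been given a scalar score (higher means "object present"). The scores in the example are made up, and `equal_error_rate` is a hypothetical helper, not the authors' evaluation code.

```python
import numpy as np

def equal_error_rate(pos_scores, neg_scores):
    """Return the error rate at the threshold where FPR and FNR are closest."""
    pos_scores = np.asarray(pos_scores, dtype=float)
    neg_scores = np.asarray(neg_scores, dtype=float)
    thresholds = np.unique(np.concatenate([pos_scores, neg_scores]))
    best_gap, eer = np.inf, None
    for t in thresholds:
        fnr = np.mean(pos_scores < t)     # object images rejected
        fpr = np.mean(neg_scores >= t)    # background images accepted
        gap = abs(fnr - fpr)
        if gap < best_gap:
            best_gap, eer = gap, (fnr + fpr) / 2.0
    return eer

# Made-up scores: one object image scores below one background image.
object_scores = [0.9, 0.8, 0.7, 0.35]
background_scores = [0.1, 0.2, 0.3, 0.4]
eer = equal_error_rate(object_scores, background_scores)   # 0.25, i.e. 25%
```

With finite test sets the two rates rarely cross exactly, so the sketch reports their average at the closest threshold.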

Comparison to other methods (% equal error rate)
Dataset       Ours   Others
Motorbikes    7.5    16.0   Weber et al. [ECCV ’00]
Faces         4.6    6.0    Weber
Airplanes     9.8    32.0   Weber
Cars (Side)   11.5   21.0   Agarwal & Roth [ECCV ’02]

Robustness of Algorithm

Summary
- Comprehensive probabilistic model for object classes
- Learn appearance, shape, relative scale, occlusion etc. simultaneously in scale and translation invariant manner
- Same algorithm gives <= 10% error across 5 diverse datasets with identical settings
- Datasets available from: http://www.robots.ox.ac.uk/~vgg/data
Limitations -> future work
- Very reliant on region detector -> Different part types (e.g. edgel curves)
- Only learns a single viewpoint -> Use mixture models
- Need lots of images to learn -> Bayesian learning, fewer images [ICCV ’03 (Fei Fei, Fergus, Perona)]
- Need more thorough testing -> Looking towards testing 100’s of datasets

Overview of approaches to category recognition
[Figure: methods plotted by ease of training (Unsupervised; Labelled; Normalized & labelled; Normalized & labelled & segmented) against # categories (log2). Methods shown: Schneiderman & Kanade [CVPR ’00], Viola & Jones [CVPR ’01], Weber et al. [ECCV ’00], Fergus et al. [CVPR ’03], Humans.]

Cars from rear

ROC equal error rates
- Pre-scaled data (identical settings)
- Scale-invariant learning and recognition

Image size histograms

Sampling from models: Faces, Motorbikes

Overview of Object Representation
- Objects represented as a constellation of parts
- The constellation of parts uses a generative parametric probabilistic model, which is an extension of Weber et al. [ECCV ’00]
- It handles appearance, shape, relative scale, occlusion and background statistics, all in a probabilistic manner
- Model uses salient regions found by Kadir and Brady’s interest operator
- PCA is used to give a compact representation of appearance for each region

Learning
Task: Estimation of model parameters
Done using EM algorithm in a maximum-likelihood manner:
  E-step: Compute soft assignments for which P regions are foreground / background
  M-step: Update model parameters
- 200-400 images required to avoid overfitting
- 6 part model with 30 regions/image and 400 training images takes around 24 hrs to run (~100 EM iterations)
- Foreground/background decision is automated
- Characteristics of region detector are learnt
- Generative model, so no background set used

Salient regions
- Find regions within image
- Use Kadir and Brady's salient region operator [IJCV ’01]
- Uses gray-scale input
- Finds maxima in entropy over scale and location
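The idea of looking for entropy maxima over scale can be sketched as below. This is a deliberately simplified illustration and not the Kadir and Brady operator itself: it only measures the gray-level histogram entropy inside a disc and picks the radius with the highest entropy at one fixed location, and it omits the inter-scale weighting and the search over locations. The function names and the 16-bin histogram are assumptions.

```python
import numpy as np

def local_entropy(image, cx, cy, radius, bins=16):
    """Shannon entropy (bits) of gray levels inside a disc centred at (cx, cy)."""
    ys, xs = np.ogrid[:image.shape[0], :image.shape[1]]
    mask = (xs - cx) ** 2 + (ys - cy) ** 2 <= radius ** 2
    hist, _ = np.histogram(image[mask], bins=bins, range=(0, 256))
    p = hist[hist > 0] / hist.sum()
    return float(-(p * np.log2(p)).sum())

def best_scale(image, cx, cy, radii):
    """Return the radius with the highest entropy, plus the entropy at each radius."""
    entropies = [local_entropy(image, cx, cy, r) for r in radii]
    return radii[int(np.argmax(entropies))], entropies

# Synthetic gray-scale image: dark left half, bright right half.
image = np.zeros((64, 64), dtype=np.uint8)
image[:, 32:] = 255
radii = [3, 5, 8, 12]
edge_entropy = local_entropy(image, 32, 32, 5)   # disc straddles the boundary
flat_entropy = local_entropy(image, 10, 32, 5)   # disc lies in a uniform area
scale, entropies = best_scale(image, 32, 32, radii)
```

A disc over the intensity boundary has a near-balanced two-bin histogram and an entropy close to 1 bit, while a disc over a uniform area has zero entropy, which is why textured or structured regions are the ones selected.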

Motorbikes
