LOCUS (Learning Object Classes with Unsupervised Segmentation)
A variational approach to learning model-based segmentation.
John Winn, Microsoft Research.

LOCUS (Learning Object Classes with Unsupervised Segmentation)
A variational approach to learning model-based segmentation.
John Winn, Microsoft Research Cambridge, with Nebojsa Jojic, MSR Redmond.
7th July 2006

Overview
- Learning object models
- The LOCUS model
- Experiments & results
- Extensions to LOCUS

Goal
Long-term goal: recognise ~10,000 object classes.

Learning from 'buckets' of images
(Diagram labels: learning algorithm, horse model, object segmentation, object recognition, object detection.)

Object segmentation + Horse model LOCUS

Related work

Constellation models
- Weakly supervised
- Probabilistic framework
- Sparse
- No segmentation
Object class recognition by unsupervised scale-invariant learning. R. Fergus, P. Perona, and A. Zisserman. CVPR 2003.
A Bayesian approach to unsupervised one-shot learning of object categories. L. Fei-Fei, R. Fergus, and P. Perona. ICCV 2003.

Fragment-based
- Dense model
- Supervised
- Non-probabilistic
- No global shape model
Learning to segment. E. Borenstein and S. Ullman. ECCV 2004.
Combining top-down and bottom-up segmentation. E. Borenstein, E. Sharon, and S. Ullman. CVPR 2004.

Codebook-based
- Probabilistic
- Dense model
- Supervised
- Ad-hoc inference
Combined object categorization and segmentation with an implicit shape model. B. Leibe, A. Leonardis, and B. Schiele. ECCV 2004.

OBJ CUT
- Probabilistic
- Dense model
- Supervised
- Requires video

LOCUS overview
- Weakly supervised learning: buckets of images, no annotation required.
- Probabilistic: generative model of both object and background.
- Dense model: all pixels modelled, not just at interest points.
- Combines global and local cues: models global shape and local appearance + edges.
- Iterative inference process: simultaneous localisation, segmentation, pose estimation.

The LOCUS model

LOCUS model
(Graphical model diagram.)
Shared between images: class shape π, class edge sprite μ_o, σ_o.
Different for each image: deformation field D, position & size T, mask m, edge image e, object appearance λ_1, background appearance λ_0, image.

LOCUS model: appearance
(Figure: image z and mask m, with background and object regions.)
Background mixture coefficients λ_0, object mixture coefficients λ_1.
Mixture components are shared between object and background.
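The appearance slide describes one set of shared mixture components with separate coefficient vectors for background and object. A minimal sketch of that idea follows, assuming scalar pixel values and Gaussian components; the slides do not state the component form, and `pixel_likelihoods` and all numbers are hypothetical.

```python
import numpy as np

def pixel_likelihoods(z, means, variances, lam):
    """Likelihood of each pixel value under a mixture whose components
    are shared (Gaussian here, which is an assumption) and whose mixing
    coefficients lam belong to either the object or the background."""
    z = np.asarray(z, dtype=float)[..., None]                  # (..., 1)
    norm = 1.0 / np.sqrt(2.0 * np.pi * variances)              # (K,)
    comp = norm * np.exp(-0.5 * (z - means) ** 2 / variances)  # (..., K)
    return comp @ lam                                          # (...,)

# Two shared components: dark (mean 0.2) and bright (mean 0.8).
means = np.array([0.2, 0.8])
variances = np.array([0.01, 0.01])
lam0 = np.array([0.9, 0.1])  # hypothetical background coefficients
lam1 = np.array([0.1, 0.9])  # hypothetical object coefficients

z = np.array([0.2, 0.8])     # one dark pixel, one bright pixel
p_bg = pixel_likelihoods(z, means, variances, lam0)
p_obj = pixel_likelihoods(z, means, variances, lam1)
```

With these made-up coefficients the dark pixel is better explained by the background and the bright pixel by the object, which is the cue the mask can exploit.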

LOCUS model: mask
(Figure labels: background, object.)
8-neighbour Markov Random Field (as used in GrabCut) favours segmentation along contrast edges.
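To illustrate how an 8-neighbour MRF can favour cuts along contrast edges, here is a hedged sketch of a contrast-sensitive pairwise energy. The exponential form, the choice of `beta`, and the `gamma` weight follow the form commonly associated with GrabCut and are assumptions here; the slides give no formula. `pair_views` and `mrf_energy` are hypothetical names.

```python
import numpy as np

# Offsets covering each unordered 8-neighbour pair exactly once.
OFFSETS = [(0, 1), (1, 0), (1, 1), (1, -1)]

def pair_views(a, dy, dx):
    """Aligned views (p, q) where q is the (dy, dx) neighbour of p."""
    h, w = a.shape
    if dx >= 0:
        return a[:h - dy, :w - dx], a[dy:, dx:]
    return a[:h - dy, -dx:], a[dy:, :w + dx]

def mrf_energy(mask, image, gamma=1.0):
    """Penalise label changes between neighbours, less so where the
    image contrast between the two pixels is high (assumed form)."""
    image = np.asarray(image, dtype=float)
    mask = np.asarray(mask)
    sq = [(p - q) ** 2 for p, q in (pair_views(image, *o) for o in OFFSETS)]
    mean_sq = np.mean(np.concatenate([s.ravel() for s in sq]))
    beta = 1.0 / (2.0 * mean_sq) if mean_sq > 0 else 0.0
    energy = 0.0
    for (dy, dx), s in zip(OFFSETS, sq):
        mp, mq = pair_views(mask, dy, dx)
        dist = np.hypot(dy, dx)  # diagonal neighbours count for less
        energy += np.sum((mp != mq) * gamma / dist * np.exp(-beta * s))
    return float(energy)

# Image with a vertical contrast edge between columns 1 and 2.
image = np.zeros((4, 4))
image[:, 2:] = 1.0
aligned = image > 0.5                  # cut follows the contrast edge
misaligned = np.zeros((4, 4), bool)
misaligned[:, 1:] = True               # cut lies in a uniform region
```

Under this assumed energy, `mrf_energy(aligned, image)` is lower than `mrf_energy(misaligned, image)`, since both cuts cross the same number of neighbour pairs but only the aligned one crosses high-contrast pairs.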

LOCUS model: shape/position
(Figure: the class shape π is mapped into each image by its own transformation T_1, T_2, T_3, T_4, …, T_N.)
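The transformation T covers position and size. A minimal sketch of mapping a class shape into an image frame is given below, assuming π is stored as a per-pixel object probability map and using nearest-neighbour resampling; `place_shape` and its parameters are hypothetical, not taken from the slides.

```python
import numpy as np

def place_shape(pi, out_shape, top, left, scale):
    """Resample the class shape map pi into an image of size out_shape,
    translated to (top, left) and scaled by `scale`. Pixels outside the
    transformed shape get probability 0 (an assumption)."""
    h, w = pi.shape
    ys, xs = np.mgrid[:out_shape[0], :out_shape[1]]
    # Map each image pixel back to class-shape coordinates.
    sy = np.floor((ys - top) / scale).astype(int)
    sx = np.floor((xs - left) / scale).astype(int)
    inside = (sy >= 0) & (sy < h) & (sx >= 0) & (sx < w)
    prior = np.zeros(out_shape)
    prior[inside] = pi[sy[inside], sx[inside]]
    return prior

pi = np.full((2, 2), 0.9)  # toy class shape
prior = place_shape(pi, (6, 6), top=1, left=2, scale=2.0)
```

Here the 2x2 toy shape covers a 4x4 region of the 6x6 image, starting at row 1, column 2.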

Iterative inference
(Figure: class shape π and transformations T_1, T_2, T_3, T_4, …, T_N at iteration #1.)

Non-rigid objects
(Figure: class shape π.)
Translation and scale are not enough.

LOCUS model: pose
(Figure: class shape π, transformation T, deformation field D.)
- Deformation field D is defined over 5x5 blocks.
- Prior ensures smoothness.
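A block-wise deformation field with a smoothness prior can be sketched as follows. This assumes "5x5 blocks" means one integer displacement per 5x5-pixel tile and uses a squared-difference penalty between adjacent blocks as a stand-in for the prior; neither detail is given in the slides, and `deform` and `smoothness_penalty` are hypothetical names.

```python
import numpy as np

def deform(shape_map, field, block=5):
    """Shift each block x block tile of shape_map by its (dy, dx)
    displacement stored in field[by, bx]. Sources are clipped at the
    border."""
    h, w = shape_map.shape
    ys, xs = np.mgrid[:h, :w]
    dy = field[ys // block, xs // block, 0]
    dx = field[ys // block, xs // block, 1]
    src_y = np.clip(ys - dy, 0, h - 1)
    src_x = np.clip(xs - dx, 0, w - 1)
    return shape_map[src_y, src_x]

def smoothness_penalty(field):
    """Sum of squared differences between the displacements of
    vertically and horizontally adjacent blocks (assumed prior form)."""
    f = field.astype(float)
    return float(np.sum((f[1:] - f[:-1]) ** 2)
                 + np.sum((f[:, 1:] - f[:, :-1]) ** 2))

shape_map = np.arange(100.0).reshape(10, 10)
field = np.zeros((2, 2, 2), dtype=int)   # 2x2 grid of 5x5 blocks
field[0, 0] = (0, 1)                     # move top-left block right by 1
warped = deform(shape_map, field)
```

A zero field leaves the map unchanged and has zero penalty; a field in which neighbouring blocks disagree pays a penalty that grows with the disagreement.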

LOCUS model: pose
(Figure: the class shape π is mapped into each image by its own transformation and deformation, TD_1, TD_2, TD_3, …, TD_N.)

LOCUS model: edge
(Figure: original images, their edge images e, and the class edge sprite μ_o, σ_o, related by TD_1, TD_2, TD_3, …, TD_N.)

LOCUS model: overview
(Graphical model diagram.)
Shared between images: class shape π, class edge sprite μ_o, σ_o.
Different for each image: deformation field D, position & size T, mask m, edge image e, object appearance λ_1, background appearance λ_0, image.

Inference
Aim to infer all latent variables.
- For each image: background appearance λ_0, object appearance λ_1, deformation D, transformation T, mask m.
- Class variables: shape π, edge sprite μ_o, σ_o.
Bayesian inference is carried out using variational message passing with a fully factorised variational distribution.
Optimisation of the grid-structured variational free energy terms (relating to the deformation field D and the mask m) is achieved using graph cuts.
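As a rough illustration of what a fully factorised update looks like, the sketch below computes a per-pixel posterior over the mask from a shape prior and object/background likelihoods. It is a deliberately simplified assumption: all other variables are held at point values, the MRF term is left out, and no graph cut is used. `update_mask_posterior` is a hypothetical name.

```python
import numpy as np

def update_mask_posterior(shape_prior, lik_obj, lik_bg, eps=1e-12):
    """Factorised update q(m_i = 1) for every pixel i, combining the
    transformed shape prior with appearance likelihoods. Computed in
    log space for numerical stability."""
    shape_prior = np.asarray(shape_prior, dtype=float)
    log_obj = np.log(shape_prior + eps) + np.log(np.asarray(lik_obj) + eps)
    log_bg = np.log(1.0 - shape_prior + eps) + np.log(np.asarray(lik_bg) + eps)
    return 1.0 / (1.0 + np.exp(log_bg - log_obj))

q = update_mask_posterior(
    shape_prior=[0.5, 0.9],   # prior probability of "object" per pixel
    lik_obj=[2.0, 1.0],       # p(z_i | object appearance)
    lik_bg=[1.0, 1.0],        # p(z_i | background appearance)
)
```

For the first pixel the prior is uninformative and the likelihood ratio of 2 gives q = 2/3; for the second the likelihoods tie and q falls back to the prior, 0.9. In a full scheme this update would alternate with updates for the remaining variables.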

Experiments & results

Experiments
LOCUS applied to 8 sets of 20 images, each set containing objects of the same class:
- Horses
- Faces
- Cars (rear)
- Cars (side)
- Motorbikes
- Aeroplanes
- Cows
- Trees
For each class, we ran separate experiments for color and texture appearance models.

Results: horses

Results: cars

Results: remaining classes
(Figure panels: cars (rear), faces, motorbikes, planes, cows, trees.)

Segmentation accuracy
To evaluate segmentation quantitatively, we used hand segmentations for horses and cars (side).

Method                                               Horses   Cars (side)
LOCUS (color), unannotated training images           93.1%    91.4%
LOCUS (texture), unannotated training images         93.0%    94.0%
Borenstein et al., hand-segmented training images    93.6%    -
Each image segmented separately                      88.6%    82.1%
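The slide does not define the accuracy measure. A minimal sketch, assuming it is the fraction of pixels whose object/background label matches the hand segmentation, could look like this; `pixel_accuracy` is a hypothetical helper.

```python
import numpy as np

def pixel_accuracy(pred, truth):
    """Fraction of pixels where the predicted mask agrees with the
    hand-segmented mask (assumed definition of accuracy)."""
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    if pred.shape != truth.shape:
        raise ValueError("masks must have the same shape")
    return float(np.mean(pred == truth))

pred = [[1, 1], [0, 0]]
truth = [[1, 0], [0, 0]]
acc = pixel_accuracy(pred, truth)  # 3 of 4 pixels agree
```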

Object registration
Transformation + deformation field registers object outlines (and some internal edges).

Object registration

Extensions to LOCUS

Recognition + segmentation
Object recognition using only global shape: 88% accuracy overall.

Probabilistic Index Maps
(Figure panels: 2 indices, 9 indices.)
Each image has a 'palette' of appearance models – palette invariance.

Probabilistic Index Maps

Learning objects from video
(Figure panels: object shape, object edge sprite.)

Locumotion
Add flow and track constraints to achieve motion segmentation.
Tracking/flow estimation by Larry Zitnick.

Conclusions
LOCUS gives unsupervised segmentations of accuracy equivalent to state-of-the-art supervised methods.
General-purpose model allows:
- Object localisation
- Pose estimation
- Object segmentation
- Motion segmentation/object tracking
- Object recognition/detection (in combination with discriminative model)

Questions?
