OBJ CUT & Pose Cut CVPR 05 ECCV 06

Name: OBJ CUT & Pose Cut CVPR 05 ECCV 06
Uploaded: 2017-07-29T02:55:54+00:00
Duration: PTM26S5
Channel: Justin Fletcher
Description: OBJ CUT & Pose Cut CVPR 05 ECCV 06

OBJ CUT & Pose Cut CVPR 05 ECCV 06
UNIVERSITY OF OXFORD OBJ CUT & Pose Cut CVPR 05 ECCV 06 Philip Torr M. Pawan Kumar, Pushmeet Kohli and Andrew Zisserman

Conclusion Combining pose inference and segmentation worth investigating. (tommorrow) Tracking = Detection Detection = Segmentation Tracking (pose estimation) = Segmentation.

Segmentation To distinguish cow and horse? First segmentation problem

Aim Given an image, to segment the object
Category Model Segmentation Cow Image Segmented Cow Segmentation should (ideally) be shaped like the object e.g. cow-like obtained efficiently in an unsupervised manner able to handle self-occlusion

Challenges Intra-Class Shape Variability
Intra-Class Appearance Variability Self Occlusion

Motivation Magic Wand Current methods require user intervention
Object and background seed pixels (Boykov and Jolly, ICCV 01) Bounding Box of object (Rother et al. SIGGRAPH 04) Object Seed Pixels Cow Image

Object and background seed pixels (Boykov and Jolly, ICCV 01) Bounding Box of object (Rother et al. SIGGRAPH 04) Object Seed Pixels Background Seed Pixels Cow Image

Object and background seed pixels (Boykov and Jolly, ICCV 01) Bounding Box of object (Rother et al. SIGGRAPH 04) Segmented Image

Object and background seed pixels (Boykov and Jolly, ICCV 01) Bounding Box of object (Rother et al. SIGGRAPH 04) Object Seed Pixels Background Seed Pixels Cow Image

Object and background seed pixels (Boykov and Jolly, ICCV 01) Bounding Box of object (Rother et al. SIGGRAPH 04) Segmented Image

Motivation Problem Manually intensive
Segmentation is not guaranteed to be ‘object-like’ Non Object-like Segmentation

Our Method Borenstein and Ullman, ECCV ’02
Combine object detection with segmentation Borenstein and Ullman, ECCV ’02 Leibe and Schiele, BMVC ’03 Incorporate global shape priors in MRF Detection provides Object Localization Global shape priors Automatically segments the object Note our method completely generic Applicable to any object category model

Outline Problem Formulation Form of Shape Prior Optimization Results

Problem Labelling m over the set of pixels D
Shape prior provided by parameter Θ Energy E (m,Θ) = ∑Φx(D|mx)+Φx(mx|Θ) + ∑ Ψxy(mx,my)+ Φ(D|mx,my) Unary terms Likelihood based on colour Unary potential based on distance from Θ Pairwise terms Prior Contrast term Find best labelling m* = arg min ∑ wi E (m,Θi) wi is the weight for sample Θi Unary terms Pairwise terms

MRF m (labels) D (pixels) mx Prior Ψxy(mx,my) my
Probability for a labelling consists of Likelihood Unary potential based on colour of pixel Prior which favours same labels for neighbours (pairwise potentials) m (labels) mx Prior Ψxy(mx,my) my Unary Potential Φx(D|mx) x D (pixels) y Image Plane

Example … … … … … … … … Cow Image x x y y Likelihood Ratio (Colour)
Background Seed Pixels Object Seed Pixels Φx(D|obj) x … x … Φx(D|bkg) Ψxy(mx,my) y … y … … … … … Likelihood Ratio (Colour) Prior

Example Cow Image Likelihood Ratio (Colour) Prior Background Seed
Pixels Object Seed Pixels Likelihood Ratio (Colour) Prior

Contrast-Dependent MRF
Probability of labelling in addition has Contrast term which favours boundaries to lie on image edges m (labels) mx my x Contrast Term Φ(D|mx,my) D (pixels) y Image Plane

Example … … … … … … … … Cow Image x x y y Likelihood Ratio (Colour)
Background Seed Pixels Object Seed Pixels Φx(D|obj) x … x … Φx(D|bkg) Ψxy(mx,my)+ Φ(D|mx,my) y … y … … … … … Likelihood Ratio (Colour) Prior + Contrast

Example Cow Image Likelihood Ratio (Colour) Prior + Contrast
Background Seed Pixels Object Seed Pixels Likelihood Ratio (Colour) Prior + Contrast

Our Model Θ (shape parameter) m (labels) D (pixels) Unary Potential
Probability of labelling in addition has Unary potential which depend on distance from Θ (shape parameter) Θ (shape parameter) Unary Potential Φx(mx|Θ) m (labels) mx my Object Category Specific MRF x D (pixels) y Image Plane

Example Cow Image Shape Prior Θ Distance from Θ Prior + Contrast
Background Seed Pixels Object Seed Pixels Shape Prior Θ Distance from Θ Prior + Contrast

Example Cow Image Shape Prior Θ Likelihood + Distance from Θ
Background Seed Pixels Object Seed Pixels Shape Prior Θ Likelihood + Distance from Θ Prior + Contrast

Outline Problem Formulation Form of Shape Prior Optimization Results
E (m,Θ) = ∑Φx(D|mx)+Φx(mx|Θ) + ∑ Ψxy(mx,my)+ Φ(D|mx,my) Form of Shape Prior Optimization Results

Detection BMVC 2004

Layered Pictorial Structures (LPS)
Generative model Composition of parts + spatial layout Layer 2 Spatial Layout (Pairwise Configuration) Layer 1 Parts in Layer 2 can occlude parts in Layer 1

Cow Instance Layer 2 Transformations Θ1 P(Θ1) = 0.9 Layer 1

Cow Instance Layer 2 Transformations Θ2 P(Θ2) = 0.8 Layer 1

Unlikely Instance Layer 2 Transformations Θ3 P(Θ3) = 0.01 Layer 1

How to learn LPS From video via motion segmentation see Kumar Torr and Zisserman ICCV 2005.

LPS for Detection Learning
Learnt automatically using a set of examples Detection Matches LPS to image using Loopy Belief Propagation Localizes object parts

Detection Like a proposal process.

Pictorial Structures (PS)
Fischler and Eschlager. 1973 PS = 2D Parts Configuration Aim: Learn pictorial structures in an unsupervised manner Layered Pictorial Structures (LPS) Parts + Configuration + Relative depth Identify parts Learn configuration Learn relative depth of parts

Pictorial Structures Affine warp of parts Each parts is a variable
States are image locations AND affine deformation Affine warp of parts

Pictorial Structures Each parts is a variable
States are image locations MRF favours certain configurations

Bayesian Formulation (MRF)
D = image. Di = pixels Є pi , given li (PDF Projection Theorem. ) z = sufficient statistics ψ(li,lj) = const, if valid configuration = 0, otherwise. Pott’s model

Defining the likelihood
We want a likelihood that can combine both the outline and the interior appearance of a part. Define features which will be sufficient statistics to discriminate foreground and background:

Features Outline: z1 Chamfer distance Interior: z2 Textons
Model joint distribution of z1 z2 as a 2D Gaussian.

Chamfer Match Score Outline (z1) : minimum chamfer distances over multiple outline exemplars dcham= 1/n Σi min{ minj ||ui-vj ||, τ } Image Edge Image Distance Transform

Texton Match Score Texture(z2) : MRF classifier
(Varma and Zisserman, CVPR ’03) Multiple texture exemplars x of class t Textons: 3 X 3 square neighbourhood VQ in texton space Descriptor: histogram of texton labelling χ2 distance

Bag of Words/Histogram of Textons
Having slagged off BoW’s I reveal we used it all along, no big deal. So this is like a spatially aware bag of words model… Using a spatially flexible set of templates to work out our bag of words.

2. Fitting the Model Cascades of classifiers Solving MRF
Efficient likelihood evaluation Solving MRF LBP, use fast algorithm GBP if LBP doesn’t converge Could use Semi Definite Programming (2003) Recent work second order cone programming method best CVPR 2006.

Efficient Detection of parts
Cascade of classifiers Top level use chamfer and distance transform for efficient pre filtering At lower level use full texture model for verification, using efficient nearest neighbour speed ups.

Cascade of Classifiers-for each part
Y. Amit, and D. Geman, 97?; S. Baker, S. Nayer 95

High Levels based on Outline
(x,y)

Side Note Chamfer like linear classifier on distance transform image Felzenszwalb. Tree is a set of linear classifiers. Pictorial structure is a parameterized family of linear classifiers.

Low levels on Texture The top levels of the tree use outline to eliminate patches of the image. Efficiency: Using chamfer distance and pre computed distance map. Remaining candidates evaluated using full texture model.

Efficient Nearest Neighbour
Goldstein, Platt and Burges (MSR Tech Report, 2003) Conversion from fixed distance to rectangle search bitvectorij(Rk) = 1 = 0 Nearest neighbour of x Find intervals in all dimensions ‘AND’ appropriate bitvectors Nearest neighbour search on pruned exemplars Rk Є Ii in dimension j

Recently solve via Integer Programming
SDP formulation (Torr 2001, AI stats) SOCP formulation (Kumar, Torr & Zisserman this conference) LBP (Huttenlocher, many)

Optimization Given image D, find best labelling as m* = arg max p(m|D)
Treat LPS parameter Θ as a latent (hidden) variable EM framework E : sample the distribution over Θ M : obtain the labelling m

E-Step Given initial labelling m’, determine p(Θ|m’,D) Problem
Efficiently sampling from p(Θ|m’,D) Solution We develop efficient sum-product Loopy Belief Propagation (LBP) for matching LPS. Similar to efficient max-product LBP for MAP estimate Felzenszwalb and Huttenlocher, CVPR ‘04

Results Different samples localize different parts well.
We cannot use only the MAP estimate of the LPS.

M-Step Given samples from p(Θ|m’,D), get new labelling mnew
Sample Θi provides Object localization to learn RGB distributions of object and background Shape prior for segmentation Problem Maximize expected log likelihood using all samples To efficiently obtain the new labelling

M-Step w1 = P(Θ1|m’,D) Cow Image Shape Θ1 RGB Histogram for Object
RGB Histogram for Background

M-Step Best labelling found efficiently using a Single Graph Cut
w1 = P(Θ1|m’,D) Cow Image Shape Θ1 Θ1 m (labels) Image Plane D (pixels) Best labelling found efficiently using a Single Graph Cut

Segmentation using Graph Cuts
Obj x … y … … … m z … … Bkg

M-Step w2 = P(Θ2|m’,D) Cow Image Shape Θ2 RGB Histogram for Object
RGB Histogram for Background

M-Step Best labelling found efficiently using a Single Graph Cut
w2 = P(Θ2|m’,D) Cow Image Shape Θ2 Θ2 m (labels) Image Plane D (pixels) Best labelling found efficiently using a Single Graph Cut

M-Step Θ1 Θ2 w1 + w2 + …. Image Plane Image Plane m* = arg min ∑ wi E (m,Θi) Best labelling found efficiently using a Single Graph Cut

Results Using LPS Model for Cow Image Segmentation

Results Using LPS Model for Cow
In the absence of a clear boundary between object and background Image Segmentation

Results Using LPS Model for Cow Image Segmentation

Results Using LPS Model for Horse Image Segmentation

Results Image Our Method Leibe and Schiele

Results Shape Appearance Shape+Appearance Without Φx(D|mx)
Without Φx(mx|Θ)

Face Detector and ObjCut

Do we really need accurate models?
Segmentation boundary can be extracted from edges Rough 3D Shape-prior enough for region disambiguation

Energy of the Pose-specific MRF
Energy to be minimized Unary term Pairwise potential Potts model Shape prior But what should be the value of θ?

The different terms of the MRF
Likelihood of being foreground given a foreground histogram Likelihood of being foreground given all the terms Shape prior model Grimson- Stauffer segmentation Shape prior (distance transform) Resulting Graph-Cuts segmentation Original image

Can segment multiple views simultaneously
Put the complete energy in this slide.

Solve via gradient descent
Comparable to level set methods Could use other approaches (e.g. Objcut) Need a graph cut per function evaluation

Formulating the Pose Inference Problem
Put the complete energy in this slide.

But… … to compute the MAP of E(x) w.r.t the pose, it means that the unary terms will be changed at EACH iteration and the maxflow recomputed! However… Kohli and Torr showed how dynamic graph cuts can be used to efficiently find MAP solutions for MRFs that change minimally from one time instant to the next: Dynamic Graph Cuts (ICCV05).

Dynamic Graph Cuts SA PA PB* PB SB differences between A and B solve
Simpler problem PB* differences between A and B similar Talk about jigsaw; write that A and B are similar cheaper operation PB SB computationally expensive operation

Dynamic Image Segmentation
Reuse the flows from the previous image frame (consecutive frames) Segmentation Obtained Flows in n-edges

Our Algorithm Ga Gb Maximum flow MAP solution G` difference
residual graph (Gr) MAP solution First segmentation problem Ga G` difference between Ga and Gb updated residual graph Gb second segmentation problem Consistent problem G and G*

Dynamic Graph Cut vs Active Cuts
Our method flow recycling AC cut recycling Both methods: Tree recycling

Experimental Analysis
Running time of the dynamic algorithm MRF consisting of 2x105 latent variables connected in a 4-neighborhood.

Segmentation Comparison
Grimson-Stauffer Bathia04 Our method

Segmentation Get rid of the ids and not the ideas

Conclusion Combining pose inference and segmentation worth investigating. Tracking = Detection Detection = Segmentation Tracking = Segmentation. Segmentation = SFM ??

OBJ CUT & Pose Cut CVPR 05 ECCV 06

Similar presentations

Presentation on theme: "OBJ CUT & Pose Cut CVPR 05 ECCV 06"— Presentation transcript:

Similar presentations

About project

Feedback

Log in

Auth with social network:

OBJ CUT & Pose Cut CVPR 05 ECCV 06

Similar presentations

Presentation on theme: "OBJ CUT & Pose Cut CVPR 05 ECCV 06"— Presentation transcript:

Similar presentations

About project

Feedback