Download presentation

Presentation is loading. Please wait.

1
**OBJ CUT & Pose Cut CVPR 05 ECCV 06**

UNIVERSITY OF OXFORD OBJ CUT & Pose Cut CVPR 05 ECCV 06 Philip Torr M. Pawan Kumar, Pushmeet Kohli and Andrew Zisserman

2
Conclusion Combining pose inference and segmentation worth investigating. (tommorrow) Tracking = Detection Detection = Segmentation Tracking (pose estimation) = Segmentation.

3
Segmentation To distinguish cow and horse? First segmentation problem

4
**Aim Given an image, to segment the object**

Category Model Segmentation Cow Image Segmented Cow Segmentation should (ideally) be shaped like the object e.g. cow-like obtained efficiently in an unsupervised manner able to handle self-occlusion

5
**Challenges Intra-Class Shape Variability**

Intra-Class Appearance Variability Self Occlusion

6
**Motivation Magic Wand Current methods require user intervention**

Object and background seed pixels (Boykov and Jolly, ICCV 01) Bounding Box of object (Rother et al. SIGGRAPH 04) Object Seed Pixels Cow Image

7
**Motivation Magic Wand Current methods require user intervention**

Object and background seed pixels (Boykov and Jolly, ICCV 01) Bounding Box of object (Rother et al. SIGGRAPH 04) Object Seed Pixels Background Seed Pixels Cow Image

8
**Motivation Magic Wand Current methods require user intervention**

Object and background seed pixels (Boykov and Jolly, ICCV 01) Bounding Box of object (Rother et al. SIGGRAPH 04) Segmented Image

9
**Motivation Magic Wand Current methods require user intervention**

Object and background seed pixels (Boykov and Jolly, ICCV 01) Bounding Box of object (Rother et al. SIGGRAPH 04) Object Seed Pixels Background Seed Pixels Cow Image

10
**Motivation Magic Wand Current methods require user intervention**

Object and background seed pixels (Boykov and Jolly, ICCV 01) Bounding Box of object (Rother et al. SIGGRAPH 04) Segmented Image

11
**Motivation Problem Manually intensive**

Segmentation is not guaranteed to be ‘object-like’ Non Object-like Segmentation

12
**Our Method Borenstein and Ullman, ECCV ’02**

Combine object detection with segmentation Borenstein and Ullman, ECCV ’02 Leibe and Schiele, BMVC ’03 Incorporate global shape priors in MRF Detection provides Object Localization Global shape priors Automatically segments the object Note our method completely generic Applicable to any object category model

13
Outline Problem Formulation Form of Shape Prior Optimization Results

14
**Problem Labelling m over the set of pixels D**

Shape prior provided by parameter Θ Energy E (m,Θ) = ∑Φx(D|mx)+Φx(mx|Θ) + ∑ Ψxy(mx,my)+ Φ(D|mx,my) Unary terms Likelihood based on colour Unary potential based on distance from Θ Pairwise terms Prior Contrast term Find best labelling m* = arg min ∑ wi E (m,Θi) wi is the weight for sample Θi Unary terms Pairwise terms

15
**MRF m (labels) D (pixels) mx Prior Ψxy(mx,my) my**

Probability for a labelling consists of Likelihood Unary potential based on colour of pixel Prior which favours same labels for neighbours (pairwise potentials) m (labels) mx Prior Ψxy(mx,my) my Unary Potential Φx(D|mx) x D (pixels) y Image Plane

16
**Example … … … … … … … … Cow Image x x y y Likelihood Ratio (Colour)**

Background Seed Pixels Object Seed Pixels Φx(D|obj) x … x … Φx(D|bkg) Ψxy(mx,my) y … y … … … … … Likelihood Ratio (Colour) Prior

17
**Example Cow Image Likelihood Ratio (Colour) Prior Background Seed**

Pixels Object Seed Pixels Likelihood Ratio (Colour) Prior

18
**Contrast-Dependent MRF**

Probability of labelling in addition has Contrast term which favours boundaries to lie on image edges m (labels) mx my x Contrast Term Φ(D|mx,my) D (pixels) y Image Plane

19
**Example … … … … … … … … Cow Image x x y y Likelihood Ratio (Colour)**

Background Seed Pixels Object Seed Pixels Φx(D|obj) x … x … Φx(D|bkg) Ψxy(mx,my)+ Φ(D|mx,my) y … y … … … … … Likelihood Ratio (Colour) Prior + Contrast

20
**Example Cow Image Likelihood Ratio (Colour) Prior + Contrast**

Background Seed Pixels Object Seed Pixels Likelihood Ratio (Colour) Prior + Contrast

21
**Our Model Θ (shape parameter) m (labels) D (pixels) Unary Potential**

Probability of labelling in addition has Unary potential which depend on distance from Θ (shape parameter) Θ (shape parameter) Unary Potential Φx(mx|Θ) m (labels) mx my Object Category Specific MRF x D (pixels) y Image Plane

22
**Example Cow Image Shape Prior Θ Distance from Θ Prior + Contrast**

Background Seed Pixels Object Seed Pixels Shape Prior Θ Distance from Θ Prior + Contrast

23
**Example Cow Image Shape Prior Θ Likelihood + Distance from Θ**

Background Seed Pixels Object Seed Pixels Shape Prior Θ Likelihood + Distance from Θ Prior + Contrast

24
**Example Cow Image Shape Prior Θ Likelihood + Distance from Θ**

Background Seed Pixels Object Seed Pixels Shape Prior Θ Likelihood + Distance from Θ Prior + Contrast

25
**Outline Problem Formulation Form of Shape Prior Optimization Results**

E (m,Θ) = ∑Φx(D|mx)+Φx(mx|Θ) + ∑ Ψxy(mx,my)+ Φ(D|mx,my) Form of Shape Prior Optimization Results

26
Detection BMVC 2004

27
**Layered Pictorial Structures (LPS)**

Generative model Composition of parts + spatial layout Layer 2 Spatial Layout (Pairwise Configuration) Layer 1 Parts in Layer 2 can occlude parts in Layer 1

28
**Layered Pictorial Structures (LPS)**

Cow Instance Layer 2 Transformations Θ1 P(Θ1) = 0.9 Layer 1

29
**Layered Pictorial Structures (LPS)**

Cow Instance Layer 2 Transformations Θ2 P(Θ2) = 0.8 Layer 1

30
**Layered Pictorial Structures (LPS)**

Unlikely Instance Layer 2 Transformations Θ3 P(Θ3) = 0.01 Layer 1

31
How to learn LPS From video via motion segmentation see Kumar Torr and Zisserman ICCV 2005.

32
**LPS for Detection Learning**

Learnt automatically using a set of examples Detection Matches LPS to image using Loopy Belief Propagation Localizes object parts

33
Detection Like a proposal process.

34
**Pictorial Structures (PS)**

Fischler and Eschlager. 1973 PS = 2D Parts Configuration Aim: Learn pictorial structures in an unsupervised manner Layered Pictorial Structures (LPS) Parts + Configuration + Relative depth Identify parts Learn configuration Learn relative depth of parts

35
**Pictorial Structures Affine warp of parts Each parts is a variable**

States are image locations AND affine deformation Affine warp of parts

36
**Pictorial Structures Each parts is a variable**

States are image locations MRF favours certain configurations

37
**Bayesian Formulation (MRF)**

D = image. Di = pixels Є pi , given li (PDF Projection Theorem. ) z = sufficient statistics ψ(li,lj) = const, if valid configuration = 0, otherwise. Pott’s model

38
**Defining the likelihood**

We want a likelihood that can combine both the outline and the interior appearance of a part. Define features which will be sufficient statistics to discriminate foreground and background:

39
**Features Outline: z1 Chamfer distance Interior: z2 Textons**

Model joint distribution of z1 z2 as a 2D Gaussian.

40
Chamfer Match Score Outline (z1) : minimum chamfer distances over multiple outline exemplars dcham= 1/n Σi min{ minj ||ui-vj ||, τ } Image Edge Image Distance Transform

41
**Texton Match Score Texture(z2) : MRF classifier**

(Varma and Zisserman, CVPR ’03) Multiple texture exemplars x of class t Textons: 3 X 3 square neighbourhood VQ in texton space Descriptor: histogram of texton labelling χ2 distance

42
**Bag of Words/Histogram of Textons**

Having slagged off BoW’s I reveal we used it all along, no big deal. So this is like a spatially aware bag of words model… Using a spatially flexible set of templates to work out our bag of words.

43
**2. Fitting the Model Cascades of classifiers Solving MRF**

Efficient likelihood evaluation Solving MRF LBP, use fast algorithm GBP if LBP doesn’t converge Could use Semi Definite Programming (2003) Recent work second order cone programming method best CVPR 2006.

44
**Efficient Detection of parts**

Cascade of classifiers Top level use chamfer and distance transform for efficient pre filtering At lower level use full texture model for verification, using efficient nearest neighbour speed ups.

45
**Cascade of Classifiers-for each part**

Y. Amit, and D. Geman, 97?; S. Baker, S. Nayer 95

46
**High Levels based on Outline**

(x,y)

47
Side Note Chamfer like linear classifier on distance transform image Felzenszwalb. Tree is a set of linear classifiers. Pictorial structure is a parameterized family of linear classifiers.

48
Low levels on Texture The top levels of the tree use outline to eliminate patches of the image. Efficiency: Using chamfer distance and pre computed distance map. Remaining candidates evaluated using full texture model.

49
**Efficient Nearest Neighbour**

Goldstein, Platt and Burges (MSR Tech Report, 2003) Conversion from fixed distance to rectangle search bitvectorij(Rk) = 1 = 0 Nearest neighbour of x Find intervals in all dimensions ‘AND’ appropriate bitvectors Nearest neighbour search on pruned exemplars Rk Є Ii in dimension j

50
**Recently solve via Integer Programming**

SDP formulation (Torr 2001, AI stats) SOCP formulation (Kumar, Torr & Zisserman this conference) LBP (Huttenlocher, many)

51
Outline Problem Formulation Form of Shape Prior Optimization Results

52
**Optimization Given image D, find best labelling as m* = arg max p(m|D)**

Treat LPS parameter Θ as a latent (hidden) variable EM framework E : sample the distribution over Θ M : obtain the labelling m

53
**E-Step Given initial labelling m’, determine p(Θ|m’,D) Problem**

Efficiently sampling from p(Θ|m’,D) Solution We develop efficient sum-product Loopy Belief Propagation (LBP) for matching LPS. Similar to efficient max-product LBP for MAP estimate Felzenszwalb and Huttenlocher, CVPR ‘04

54
**Results Different samples localize different parts well.**

We cannot use only the MAP estimate of the LPS.

55
**M-Step Given samples from p(Θ|m’,D), get new labelling mnew**

Sample Θi provides Object localization to learn RGB distributions of object and background Shape prior for segmentation Problem Maximize expected log likelihood using all samples To efficiently obtain the new labelling

56
**M-Step w1 = P(Θ1|m’,D) Cow Image Shape Θ1 RGB Histogram for Object**

RGB Histogram for Background

57
**M-Step Best labelling found efficiently using a Single Graph Cut**

w1 = P(Θ1|m’,D) Cow Image Shape Θ1 Θ1 m (labels) Image Plane D (pixels) Best labelling found efficiently using a Single Graph Cut

58
**Segmentation using Graph Cuts**

Obj Cut Φx(D|bkg) + Φx(bkg|Θ) x … Ψxy(mx,my)+ Φ(D|mx,my) y … … … m z … … Φz(D|obj) + Φz(obj|Θ) Bkg

59
**Segmentation using Graph Cuts**

Obj x … y … … … m z … … Bkg

60
**M-Step w2 = P(Θ2|m’,D) Cow Image Shape Θ2 RGB Histogram for Object**

RGB Histogram for Background

61
**M-Step Best labelling found efficiently using a Single Graph Cut**

w2 = P(Θ2|m’,D) Cow Image Shape Θ2 Θ2 m (labels) Image Plane D (pixels) Best labelling found efficiently using a Single Graph Cut

62
M-Step Θ1 Θ2 w1 + w2 + …. Image Plane Image Plane m* = arg min ∑ wi E (m,Θi) Best labelling found efficiently using a Single Graph Cut

63
Outline Problem Formulation Form of Shape Prior Optimization Results

64
Results Using LPS Model for Cow Image Segmentation

65
**Results Using LPS Model for Cow**

In the absence of a clear boundary between object and background Image Segmentation

66
Results Using LPS Model for Cow Image Segmentation

67
Results Using LPS Model for Cow Image Segmentation

68
Results Using LPS Model for Horse Image Segmentation

69
Results Using LPS Model for Horse Image Segmentation

70
Results Image Our Method Leibe and Schiele

71
**Results Shape Appearance Shape+Appearance Without Φx(D|mx)**

Without Φx(mx|Θ)

72
**Face Detector and ObjCut**

73
**Do we really need accurate models?**

Segmentation boundary can be extracted from edges Rough 3D Shape-prior enough for region disambiguation

74
**Energy of the Pose-specific MRF**

Energy to be minimized Unary term Pairwise potential Potts model Shape prior But what should be the value of θ?

75
**The different terms of the MRF**

Likelihood of being foreground given a foreground histogram Likelihood of being foreground given all the terms Shape prior model Grimson- Stauffer segmentation Shape prior (distance transform) Resulting Graph-Cuts segmentation Original image

76
**Can segment multiple views simultaneously**

Put the complete energy in this slide.

77
**Solve via gradient descent**

Comparable to level set methods Could use other approaches (e.g. Objcut) Need a graph cut per function evaluation

78
**Formulating the Pose Inference Problem**

Put the complete energy in this slide.

79
But… … to compute the MAP of E(x) w.r.t the pose, it means that the unary terms will be changed at EACH iteration and the maxflow recomputed! However… Kohli and Torr showed how dynamic graph cuts can be used to efficiently find MAP solutions for MRFs that change minimally from one time instant to the next: Dynamic Graph Cuts (ICCV05).

80
**Dynamic Graph Cuts SA PA PB* PB SB differences between A and B solve**

Simpler problem PB* differences between A and B similar Talk about jigsaw; write that A and B are similar cheaper operation PB SB computationally expensive operation

81
**Dynamic Image Segmentation**

Reuse the flows from the previous image frame (consecutive frames) Segmentation Obtained Flows in n-edges

82
**Our Algorithm Ga Gb Maximum flow MAP solution G` difference**

residual graph (Gr) MAP solution First segmentation problem Ga G` difference between Ga and Gb updated residual graph Gb second segmentation problem Consistent problem G and G*

83
**Dynamic Graph Cut vs Active Cuts**

Our method flow recycling AC cut recycling Both methods: Tree recycling

84
**Experimental Analysis**

Running time of the dynamic algorithm MRF consisting of 2x105 latent variables connected in a 4-neighborhood.

85
**Segmentation Comparison**

Grimson-Stauffer Bathia04 Our method

86
Segmentation Get rid of the ids and not the ideas

87
Segmentation Get rid of the ids and not the ideas

88
Conclusion Combining pose inference and segmentation worth investigating. Tracking = Detection Detection = Segmentation Tracking = Segmentation. Segmentation = SFM ??

Similar presentations

© 2017 SlidePlayer.com Inc.

All rights reserved.

Ads by Google