Presentation is loading. Please wait.

Presentation is loading. Please wait.

OBJ CUT & Pose Cut CVPR 05 ECCV 06

Similar presentations


Presentation on theme: "OBJ CUT & Pose Cut CVPR 05 ECCV 06"— Presentation transcript:

1 OBJ CUT & Pose Cut CVPR 05 ECCV 06
UNIVERSITY OF OXFORD OBJ CUT & Pose Cut CVPR 05 ECCV 06 Philip Torr M. Pawan Kumar, Pushmeet Kohli and Andrew Zisserman

2 Conclusion Combining pose inference and segmentation worth investigating. (tommorrow) Tracking = Detection Detection = Segmentation Tracking (pose estimation) = Segmentation.

3 Segmentation To distinguish cow and horse? First segmentation problem

4 Aim Given an image, to segment the object
Category Model Segmentation Cow Image Segmented Cow Segmentation should (ideally) be shaped like the object e.g. cow-like obtained efficiently in an unsupervised manner able to handle self-occlusion

5 Challenges Intra-Class Shape Variability
Intra-Class Appearance Variability Self Occlusion

6 Motivation Magic Wand Current methods require user intervention
Object and background seed pixels (Boykov and Jolly, ICCV 01) Bounding Box of object (Rother et al. SIGGRAPH 04) Object Seed Pixels Cow Image

7 Motivation Magic Wand Current methods require user intervention
Object and background seed pixels (Boykov and Jolly, ICCV 01) Bounding Box of object (Rother et al. SIGGRAPH 04) Object Seed Pixels Background Seed Pixels Cow Image

8 Motivation Magic Wand Current methods require user intervention
Object and background seed pixels (Boykov and Jolly, ICCV 01) Bounding Box of object (Rother et al. SIGGRAPH 04) Segmented Image

9 Motivation Magic Wand Current methods require user intervention
Object and background seed pixels (Boykov and Jolly, ICCV 01) Bounding Box of object (Rother et al. SIGGRAPH 04) Object Seed Pixels Background Seed Pixels Cow Image

10 Motivation Magic Wand Current methods require user intervention
Object and background seed pixels (Boykov and Jolly, ICCV 01) Bounding Box of object (Rother et al. SIGGRAPH 04) Segmented Image

11 Motivation Problem Manually intensive
Segmentation is not guaranteed to be ‘object-like’ Non Object-like Segmentation

12 Our Method Borenstein and Ullman, ECCV ’02
Combine object detection with segmentation Borenstein and Ullman, ECCV ’02 Leibe and Schiele, BMVC ’03 Incorporate global shape priors in MRF Detection provides Object Localization Global shape priors Automatically segments the object Note our method completely generic Applicable to any object category model

13 Outline Problem Formulation Form of Shape Prior Optimization Results

14 Problem Labelling m over the set of pixels D
Shape prior provided by parameter Θ Energy E (m,Θ) = ∑Φx(D|mx)+Φx(mx|Θ) + ∑ Ψxy(mx,my)+ Φ(D|mx,my) Unary terms Likelihood based on colour Unary potential based on distance from Θ Pairwise terms Prior Contrast term Find best labelling m* = arg min ∑ wi E (m,Θi) wi is the weight for sample Θi Unary terms Pairwise terms

15 MRF m (labels) D (pixels) mx Prior Ψxy(mx,my) my
Probability for a labelling consists of Likelihood Unary potential based on colour of pixel Prior which favours same labels for neighbours (pairwise potentials) m (labels) mx Prior Ψxy(mx,my) my Unary Potential Φx(D|mx) x D (pixels) y Image Plane

16 Example … … … … … … … … Cow Image x x y y Likelihood Ratio (Colour)
Background Seed Pixels Object Seed Pixels Φx(D|obj) x x Φx(D|bkg) Ψxy(mx,my) y y Likelihood Ratio (Colour) Prior

17 Example Cow Image Likelihood Ratio (Colour) Prior Background Seed
Pixels Object Seed Pixels Likelihood Ratio (Colour) Prior

18 Contrast-Dependent MRF
Probability of labelling in addition has Contrast term which favours boundaries to lie on image edges m (labels) mx my x Contrast Term Φ(D|mx,my) D (pixels) y Image Plane

19 Example … … … … … … … … Cow Image x x y y Likelihood Ratio (Colour)
Background Seed Pixels Object Seed Pixels Φx(D|obj) x x Φx(D|bkg) Ψxy(mx,my)+ Φ(D|mx,my) y y Likelihood Ratio (Colour) Prior + Contrast

20 Example Cow Image Likelihood Ratio (Colour) Prior + Contrast
Background Seed Pixels Object Seed Pixels Likelihood Ratio (Colour) Prior + Contrast

21 Our Model Θ (shape parameter) m (labels) D (pixels) Unary Potential
Probability of labelling in addition has Unary potential which depend on distance from Θ (shape parameter) Θ (shape parameter) Unary Potential Φx(mx|Θ) m (labels) mx my Object Category Specific MRF x D (pixels) y Image Plane

22 Example Cow Image Shape Prior Θ Distance from Θ Prior + Contrast
Background Seed Pixels Object Seed Pixels Shape Prior Θ Distance from Θ Prior + Contrast

23 Example Cow Image Shape Prior Θ Likelihood + Distance from Θ
Background Seed Pixels Object Seed Pixels Shape Prior Θ Likelihood + Distance from Θ Prior + Contrast

24 Example Cow Image Shape Prior Θ Likelihood + Distance from Θ
Background Seed Pixels Object Seed Pixels Shape Prior Θ Likelihood + Distance from Θ Prior + Contrast

25 Outline Problem Formulation Form of Shape Prior Optimization Results
E (m,Θ) = ∑Φx(D|mx)+Φx(mx|Θ) + ∑ Ψxy(mx,my)+ Φ(D|mx,my) Form of Shape Prior Optimization Results

26 Detection BMVC 2004

27 Layered Pictorial Structures (LPS)
Generative model Composition of parts + spatial layout Layer 2 Spatial Layout (Pairwise Configuration) Layer 1 Parts in Layer 2 can occlude parts in Layer 1

28 Layered Pictorial Structures (LPS)
Cow Instance Layer 2 Transformations Θ1 P(Θ1) = 0.9 Layer 1

29 Layered Pictorial Structures (LPS)
Cow Instance Layer 2 Transformations Θ2 P(Θ2) = 0.8 Layer 1

30 Layered Pictorial Structures (LPS)
Unlikely Instance Layer 2 Transformations Θ3 P(Θ3) = 0.01 Layer 1

31 How to learn LPS From video via motion segmentation see Kumar Torr and Zisserman ICCV 2005.

32 LPS for Detection Learning
Learnt automatically using a set of examples Detection Matches LPS to image using Loopy Belief Propagation Localizes object parts

33 Detection Like a proposal process.

34 Pictorial Structures (PS)
Fischler and Eschlager. 1973 PS = 2D Parts Configuration Aim: Learn pictorial structures in an unsupervised manner Layered Pictorial Structures (LPS) Parts + Configuration + Relative depth Identify parts Learn configuration Learn relative depth of parts

35 Pictorial Structures Affine warp of parts Each parts is a variable
States are image locations AND affine deformation Affine warp of parts

36 Pictorial Structures Each parts is a variable
States are image locations MRF favours certain configurations

37 Bayesian Formulation (MRF)
D = image. Di = pixels Є pi , given li (PDF Projection Theorem. ) z = sufficient statistics ψ(li,lj) = const, if valid configuration = 0, otherwise. Pott’s model

38 Defining the likelihood
We want a likelihood that can combine both the outline and the interior appearance of a part. Define features which will be sufficient statistics to discriminate foreground and background:

39 Features Outline: z1 Chamfer distance Interior: z2 Textons
Model joint distribution of z1 z2 as a 2D Gaussian.

40 Chamfer Match Score Outline (z1) : minimum chamfer distances over multiple outline exemplars dcham= 1/n Σi min{ minj ||ui-vj ||, τ } Image Edge Image Distance Transform

41 Texton Match Score Texture(z2) : MRF classifier
(Varma and Zisserman, CVPR ’03) Multiple texture exemplars x of class t Textons: 3 X 3 square neighbourhood VQ in texton space Descriptor: histogram of texton labelling χ2 distance

42 Bag of Words/Histogram of Textons
Having slagged off BoW’s I reveal we used it all along, no big deal. So this is like a spatially aware bag of words model… Using a spatially flexible set of templates to work out our bag of words.

43 2. Fitting the Model Cascades of classifiers Solving MRF
Efficient likelihood evaluation Solving MRF LBP, use fast algorithm GBP if LBP doesn’t converge Could use Semi Definite Programming (2003) Recent work second order cone programming method best CVPR 2006.

44 Efficient Detection of parts
Cascade of classifiers Top level use chamfer and distance transform for efficient pre filtering At lower level use full texture model for verification, using efficient nearest neighbour speed ups.

45 Cascade of Classifiers-for each part
Y. Amit, and D. Geman, 97?; S. Baker, S. Nayer 95

46 High Levels based on Outline
(x,y)

47 Side Note Chamfer like linear classifier on distance transform image Felzenszwalb. Tree is a set of linear classifiers. Pictorial structure is a parameterized family of linear classifiers.

48 Low levels on Texture The top levels of the tree use outline to eliminate patches of the image. Efficiency: Using chamfer distance and pre computed distance map. Remaining candidates evaluated using full texture model.

49 Efficient Nearest Neighbour
Goldstein, Platt and Burges (MSR Tech Report, 2003) Conversion from fixed distance to rectangle search bitvectorij(Rk) = 1 = 0 Nearest neighbour of x Find intervals in all dimensions ‘AND’ appropriate bitvectors Nearest neighbour search on pruned exemplars Rk Є Ii in dimension j

50 Recently solve via Integer Programming
SDP formulation (Torr 2001, AI stats) SOCP formulation (Kumar, Torr & Zisserman this conference) LBP (Huttenlocher, many)

51 Outline Problem Formulation Form of Shape Prior Optimization Results

52 Optimization Given image D, find best labelling as m* = arg max p(m|D)
Treat LPS parameter Θ as a latent (hidden) variable EM framework E : sample the distribution over Θ M : obtain the labelling m

53 E-Step Given initial labelling m’, determine p(Θ|m’,D) Problem
Efficiently sampling from p(Θ|m’,D) Solution We develop efficient sum-product Loopy Belief Propagation (LBP) for matching LPS. Similar to efficient max-product LBP for MAP estimate Felzenszwalb and Huttenlocher, CVPR ‘04

54 Results Different samples localize different parts well.
We cannot use only the MAP estimate of the LPS.

55 M-Step Given samples from p(Θ|m’,D), get new labelling mnew
Sample Θi provides Object localization to learn RGB distributions of object and background Shape prior for segmentation Problem Maximize expected log likelihood using all samples To efficiently obtain the new labelling

56 M-Step w1 = P(Θ1|m’,D) Cow Image Shape Θ1 RGB Histogram for Object
RGB Histogram for Background

57 M-Step Best labelling found efficiently using a Single Graph Cut
w1 = P(Θ1|m’,D) Cow Image Shape Θ1 Θ1 m (labels) Image Plane D (pixels) Best labelling found efficiently using a Single Graph Cut

58 Segmentation using Graph Cuts
Obj Cut Φx(D|bkg) + Φx(bkg|Θ) x Ψxy(mx,my)+ Φ(D|mx,my) y m z Φz(D|obj) + Φz(obj|Θ) Bkg

59 Segmentation using Graph Cuts
Obj x y m z Bkg

60 M-Step w2 = P(Θ2|m’,D) Cow Image Shape Θ2 RGB Histogram for Object
RGB Histogram for Background

61 M-Step Best labelling found efficiently using a Single Graph Cut
w2 = P(Θ2|m’,D) Cow Image Shape Θ2 Θ2 m (labels) Image Plane D (pixels) Best labelling found efficiently using a Single Graph Cut

62 M-Step Θ1 Θ2 w1 + w2 + …. Image Plane Image Plane m* = arg min ∑ wi E (m,Θi) Best labelling found efficiently using a Single Graph Cut

63 Outline Problem Formulation Form of Shape Prior Optimization Results

64 Results Using LPS Model for Cow Image Segmentation

65 Results Using LPS Model for Cow
In the absence of a clear boundary between object and background Image Segmentation

66 Results Using LPS Model for Cow Image Segmentation

67 Results Using LPS Model for Cow Image Segmentation

68 Results Using LPS Model for Horse Image Segmentation

69 Results Using LPS Model for Horse Image Segmentation

70 Results Image Our Method Leibe and Schiele

71 Results Shape Appearance Shape+Appearance Without Φx(D|mx)
Without Φx(mx|Θ)

72 Face Detector and ObjCut

73 Do we really need accurate models?
Segmentation boundary can be extracted from edges Rough 3D Shape-prior enough for region disambiguation

74 Energy of the Pose-specific MRF
Energy to be minimized Unary term Pairwise potential Potts model Shape prior But what should be the value of θ?

75 The different terms of the MRF
Likelihood of being foreground given a foreground histogram Likelihood of being foreground given all the terms Shape prior model Grimson- Stauffer segmentation Shape prior (distance transform) Resulting Graph-Cuts segmentation Original image

76 Can segment multiple views simultaneously
Put the complete energy in this slide.

77 Solve via gradient descent
Comparable to level set methods Could use other approaches (e.g. Objcut) Need a graph cut per function evaluation

78 Formulating the Pose Inference Problem
Put the complete energy in this slide.

79 But… … to compute the MAP of E(x) w.r.t the pose, it means that the unary terms will be changed at EACH iteration and the maxflow recomputed! However… Kohli and Torr showed how dynamic graph cuts can be used to efficiently find MAP solutions for MRFs that change minimally from one time instant to the next: Dynamic Graph Cuts (ICCV05).

80 Dynamic Graph Cuts SA PA PB* PB SB differences between A and B solve
Simpler problem PB* differences between A and B similar Talk about jigsaw; write that A and B are similar cheaper operation PB SB computationally expensive operation

81 Dynamic Image Segmentation
Reuse the flows from the previous image frame (consecutive frames) Segmentation Obtained Flows in n-edges

82 Our Algorithm Ga Gb Maximum flow MAP solution G` difference
residual graph (Gr) MAP solution First segmentation problem Ga G` difference between Ga and Gb updated residual graph Gb second segmentation problem Consistent problem G and G*

83 Dynamic Graph Cut vs Active Cuts
Our method flow recycling AC cut recycling Both methods: Tree recycling

84 Experimental Analysis
Running time of the dynamic algorithm MRF consisting of 2x105 latent variables connected in a 4-neighborhood.

85 Segmentation Comparison
Grimson-Stauffer Bathia04 Our method

86 Segmentation Get rid of the ids and not the ideas

87 Segmentation Get rid of the ids and not the ideas

88 Conclusion Combining pose inference and segmentation worth investigating. Tracking = Detection Detection = Segmentation Tracking = Segmentation. Segmentation = SFM ??


Download ppt "OBJ CUT & Pose Cut CVPR 05 ECCV 06"

Similar presentations


Ads by Google