ObjCut & PoseCut (CVPR 05, ECCV 06). Philip Torr, M. Pawan Kumar, Pushmeet Kohli and Andrew Zisserman. University of Oxford.




1 ObjCut & PoseCut (CVPR 05, ECCV 06). Philip Torr, M. Pawan Kumar, Pushmeet Kohli and Andrew Zisserman. University of Oxford

2 Conclusion Combining pose inference and segmentation is worth investigating (tomorrow). Tracking = Detection. Detection = Segmentation. Tracking (pose estimation) = Segmentation.

3 Segmentation How to distinguish a cow from a horse? First segmentation problem.

4 Aim Given an image, segment the object. The segmentation should (ideally) be: shaped like the object (e.g. cow-like); obtained efficiently in an unsupervised manner; able to handle self-occlusion. [Cow Image + Object Category Model → Segmented Cow]

5 Challenges Self Occlusion Intra-Class Shape Variability Intra-Class Appearance Variability

6 Motivation Magic Wand Current methods require user intervention Object and background seed pixels (Boykov and Jolly, ICCV 01) Bounding Box of object (Rother et al. SIGGRAPH 04) Cow Image Object Seed Pixels

7 Motivation Magic Wand Current methods require user intervention Object and background seed pixels (Boykov and Jolly, ICCV 01) Bounding Box of object (Rother et al. SIGGRAPH 04) Cow Image Object Seed Pixels Background Seed Pixels

8 Motivation Magic Wand Current methods require user intervention Object and background seed pixels (Boykov and Jolly, ICCV 01) Bounding Box of object (Rother et al. SIGGRAPH 04) Segmented Image

9 Motivation Magic Wand Current methods require user intervention Object and background seed pixels (Boykov and Jolly, ICCV 01) Bounding Box of object (Rother et al. SIGGRAPH 04) Cow Image Object Seed Pixels Background Seed Pixels

10 Motivation Magic Wand Current methods require user intervention Object and background seed pixels (Boykov and Jolly, ICCV 01) Bounding Box of object (Rother et al. SIGGRAPH 04) Segmented Image

11 Motivation: Problem Manually intensive. Segmentation is not guaranteed to be object-like. [Non-object-like segmentation]

12 Our Method Combine object detection with segmentation (Borenstein and Ullman, ECCV 02; Leibe and Schiele, BMVC 03). Incorporate global shape priors in the MRF. Detection provides: object localization; global shape priors. Automatically segments the object. Note: our method is completely generic, applicable to any object category model.

13 Outline Problem Formulation Form of Shape Prior Optimization Results

14 Problem Labelling m over the set of pixels D. Shape prior provided by parameter Θ. Energy: E(m,Θ) = Σx [Φx(D|mx) + Φx(mx|Θ)] + Σxy [Ψxy(mx,my) + Φ(D|mx,my)]. Unary terms: likelihood based on colour; unary potential based on distance from Θ. Pairwise terms: prior; contrast term. Find the best labelling m* = arg min_m Σi wi E(m,Θi), where wi is the weight for sample Θi.
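The energy above can be evaluated directly. A minimal numpy sketch on a 4-connected grid; the cost-table names (unary_colour, unary_shape, contrast_h, contrast_v) are ours, not the paper's:

```python
import numpy as np

def mrf_energy(m, unary_colour, unary_shape, lam, contrast_h, contrast_v):
    """Toy evaluation of E(m,Theta) = sum_x [Phi_x(D|m_x) + Phi_x(m_x|Theta)]
    + sum_xy [Psi_xy(m_x,m_y) + Phi(D|m_x,m_y)].
    m: HxW array of 0/1 labels; unary_*: HxWx2 cost tables indexed by label;
    lam: Potts prior weight; contrast_h/contrast_v: per-edge contrast costs."""
    H, W = m.shape
    rows, cols = np.indices((H, W))
    e = unary_colour[rows, cols, m].sum() + unary_shape[rows, cols, m].sum()
    # pairwise Potts prior + contrast term, paid only where neighbour labels differ
    diff_h = m[:, :-1] != m[:, 1:]          # horizontal neighbour pairs
    diff_v = m[:-1, :] != m[1:, :]          # vertical neighbour pairs
    e += ((lam + contrast_h) * diff_h).sum()
    e += ((lam + contrast_v) * diff_v).sum()
    return float(e)
```

Each differing neighbour pair pays the Potts weight plus its contrast cost, matching the pairwise sum in the slide.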

15 MRF Probability for a labelling consists of: a Likelihood, the unary potential based on the colour of the pixel, Φx(D|mx); and a Prior which favours the same label for neighbours (pairwise potentials), Ψxy(mx,my). [Image plane with pixels D, labels m; sites x, y with labels mx, my]

16 Example Cow Image, Object Seed Pixels, Background Seed Pixels. Prior Ψxy(mx,my); Likelihood Ratio (Colour) Φx(D|obj), Φx(D|bkg).

17 Example Cow Image Object Seed Pixels Background Seed Pixels Prior Likelihood Ratio (Colour)

18 Contrast-Dependent MRF The probability of a labelling in addition has a contrast term, Φ(D|mx,my), which encourages boundaries to lie on image edges. [Image plane with pixels D, labels m; sites x, y with labels mx, my]
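One common instantiation of such a contrast term is a Boykov-Jolly style exponentiated colour gradient; the exact functional form below is an assumption, not taken from the slides:

```python
import numpy as np

def contrast_weights(image, gamma, sigma):
    """Contrast cost gamma * exp(-||I_x - I_y||^2 / (2 sigma^2)) between
    horizontal neighbours: cutting between similar pixels is expensive,
    cutting across a strong image edge is cheap."""
    g2 = ((image[:, :-1] - image[:, 1:]) ** 2).sum(axis=-1)  # squared colour difference
    return gamma * np.exp(-g2 / (2.0 * sigma ** 2))
```

On a flat region the cost is the full gamma; across a sharp edge it decays towards zero, so the boundary prefers to sit there.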

19 Example Cow Image, Object Seed Pixels, Background Seed Pixels. Prior + Contrast Ψxy(mx,my) + Φ(D|mx,my); Likelihood Ratio (Colour) Φx(D|obj), Φx(D|bkg).

20 Example Cow Image Object Seed Pixels Background Seed Pixels Prior + Contrast Likelihood Ratio (Colour)

21 Our Model The probability of a labelling in addition has a unary potential which depends on the distance from Θ (shape parameter): Φx(mx|Θ). Object Category Specific MRF. [Image plane with pixels D, labels m, shape parameter Θ]

22 Example Cow Image Object Seed Pixels Background Seed Pixels Prior + Contrast Distance from Θ Shape Prior Θ

23 Example Cow Image Object Seed Pixels Background Seed Pixels Prior + Contrast Likelihood + Distance from Θ Shape Prior Θ

24 Example Cow Image Object Seed Pixels Background Seed Pixels Prior + Contrast Likelihood + Distance from Θ Shape Prior Θ

25 Outline Problem Formulation: E(m,Θ) = Σx [Φx(D|mx) + Φx(mx|Θ)] + Σxy [Ψxy(mx,my) + Φ(D|mx,my)]. Form of Shape Prior. Optimization. Results.

26 Detection BMVC 2004

27 Layered Pictorial Structures (LPS) Generative model Composition of parts + spatial layout Layer 2 Layer 1 Parts in Layer 2 can occlude parts in Layer 1 Spatial Layout (Pairwise Configuration)

28 Layered Pictorial Structures (LPS) Layer 2, Layer 1. Transformations Θ1, P(Θ1) = 0.9. Cow Instance.

29 Layered Pictorial Structures (LPS) Layer 2, Layer 1. Transformations Θ2, P(Θ2) = 0.8. Cow Instance.

30 Layered Pictorial Structures (LPS) Layer 2, Layer 1. Transformations Θ3, P(Θ3) = 0.01. Unlikely Instance.

31 How to learn LPS From video via motion segmentation; see Kumar, Torr and Zisserman, ICCV 2005.

32 LPS for Detection Learning – Learnt automatically using a set of examples Detection –Matches LPS to image using Loopy Belief Propagation –Localizes object parts

33 Detection Like a proposal process.

34 Pictorial Structures (PS) PS = 2D Parts + Configuration (Fischler and Elschlager). Aim: learn pictorial structures in an unsupervised manner: identify parts, learn configuration, learn relative depth of parts. Layered Pictorial Structures (LPS) = Parts + Configuration + Relative depth.

35 Pictorial Structures Each part is a variable. States are image locations AND affine deformation. Affine warp of parts.

36 Pictorial Structures Each part is a variable. States are image locations. The MRF favours certain configurations.

37 Bayesian Formulation (MRF) D = image; Di = pixels ∈ pi, given li (PDF Projection Theorem). z = sufficient statistics. ψ(li,lj) = const if valid configuration, 0 otherwise (Potts model).

38 Defining the likelihood We want a likelihood that can combine both the outline and the interior appearance of a part. Define features which will be sufficient statistics to discriminate foreground and background:

39 Features Outline: z1, chamfer distance. Interior: z2, textons. Model the joint distribution of z1, z2 as a 2D Gaussian.

40 Chamfer Match Score Outline (z1): minimum chamfer distance over multiple outline exemplars. dcham = (1/n) Σi min{ minj ||ui − vj||, τ }. [Image, Edge Image, Distance Transform]
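The truncated chamfer score can be sketched directly from the formula. A brute-force numpy version; the slides instead read the inner minimum off a precomputed distance transform of the edge image for speed:

```python
import numpy as np

def chamfer_score(template_pts, edge_pts, tau):
    """Truncated chamfer score d_cham = (1/n) sum_i min( min_j ||u_i - v_j||, tau ):
    mean distance from each template point u_i to its nearest image edge point v_j,
    capped at tau for robustness to missing edges."""
    diff = template_pts[:, None, :] - edge_pts[None, :, :]   # n x m x 2 differences
    d = np.sqrt((diff ** 2).sum(axis=-1)).min(axis=1)        # nearest edge per point
    return float(np.minimum(d, tau).mean())
```

Lower is better; a template lying exactly on edges scores zero.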

41 Texton Match Score Texture (z2): MRF classifier (Varma and Zisserman, CVPR 03). Multiple texture exemplars x of class t. Textons: 3×3 square neighbourhood, vector quantised in texton space. Descriptor: histogram of the texton labelling. χ² distance.
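The descriptor and its χ² comparison can be sketched as follows, assuming the texton labelling has already been produced by vector quantisation (function names are ours):

```python
import numpy as np

def texton_histogram(texton_labels, n_textons):
    """Normalised histogram of vector-quantised texton labels over a region."""
    h = np.bincount(np.ravel(texton_labels), minlength=n_textons).astype(float)
    return h / h.sum()

def chi2_distance(h1, h2, eps=1e-10):
    """Chi-squared distance between two texton histograms:
    0.5 * sum (h1 - h2)^2 / (h1 + h2)."""
    return float(0.5 * np.sum((h1 - h2) ** 2 / (h1 + h2 + eps)))
```

A candidate region is compared against exemplar histograms of each class; the smallest χ² distance wins.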

42 Bag of Words / Histogram of Textons Having slagged off BoWs, I reveal we used them all along, no big deal. So this is like a spatially aware bag-of-words model, using a spatially flexible set of templates to work out our bag of words.

43 2. Fitting the Model Cascades of classifiers for efficient likelihood evaluation. Solving the MRF: LBP, using a fast algorithm; GBP if LBP doesn't converge. Could use Semi-Definite Programming (2003); recent work shows a second-order cone programming method does best (CVPR 2006).

44 Efficient Detection of Parts Cascade of classifiers. At the top level, use chamfer matching and the distance transform for efficient pre-filtering. At lower levels, use the full texture model for verification, with efficient nearest-neighbour speed-ups.

45 Cascade of Classifiers (for each part f) Y. Amit and D. Geman, 97; S. Baker and S. Nayar, 95.

46 High Levels based on Outline (x,y)

47 Side Note Chamfer matching is like a linear classifier on the distance transform image (Felzenszwalb). A tree is a set of linear classifiers. A pictorial structure is a parameterized family of linear classifiers.

48 Low Levels on Texture The top levels of the tree use the outline to eliminate patches of the image efficiently, using the chamfer distance and a precomputed distance map. The remaining candidates are evaluated using the full texture model.

49 Efficient Nearest Neighbour Goldstein, Platt and Burges (MSR Tech Report, 2003). Conversion from a fixed-distance search to a rectangle search: bitvectorj(Rk) = 1 if Rk ∈ Ii in dimension j, 0 otherwise. To find the nearest neighbour of x: find the intervals in all dimensions, AND the appropriate bitvectors, then run a nearest-neighbour search on the pruned exemplars.
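The pruning idea can be sketched in a few lines, with boolean masks standing in for the bitvectors (a simplification of the Goldstein-Platt-Burges scheme, not their implementation):

```python
import numpy as np

def nn_with_rectangle_pruning(x, exemplars, r):
    """Any exemplar within distance r of x must lie in the axis-aligned box
    x +/- r, so intersect per-dimension candidate sets (AND of "bitvectors"),
    then run exact nearest-neighbour search only on the survivors."""
    masks = [np.abs(exemplars[:, j] - x[j]) <= r for j in range(exemplars.shape[1])]
    keep = np.logical_and.reduce(masks)      # AND of per-dimension masks
    if not keep.any():
        return None                          # no exemplar inside the search box
    cand = np.flatnonzero(keep)
    d2 = ((exemplars[cand] - x) ** 2).sum(axis=1)
    return int(cand[np.argmin(d2)])          # index into the original exemplar set
```

The box test is cheap per dimension, so most exemplars are rejected before any full distance is computed.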

50 Recently solved via Integer Programming SDP formulation (Torr 2001, AI Stats); SOCP formulation (Kumar, Torr & Zisserman, this conference); LBP (Huttenlocher, many).

51 Outline Problem Formulation Form of Shape Prior Optimization Results

52 Optimization Given image D, find the best labelling as m* = arg max p(m|D). Treat the LPS parameter Θ as a latent (hidden) variable. EM framework: E-step, sample the distribution over Θ; M-step, obtain the labelling m.

53 E-Step Given an initial labelling m, determine p(Θ|m,D). Problem: efficiently sampling from p(Θ|m,D). Solution: we develop efficient sum-product Loopy Belief Propagation (LBP) for matching the LPS, similar to efficient max-product LBP for the MAP estimate (Felzenszwalb and Huttenlocher, CVPR 04).

54 Results Different samples localize different parts well. We cannot use only the MAP estimate of the LPS.

55 M-Step Given samples from p(Θ|m,D), get a new labelling m_new. Each sample Θi provides: object localization to learn RGB distributions of object and background; a shape prior for segmentation. Problem: maximize the expected log likelihood using all samples, and efficiently obtain the new labelling.

56 M-Step Cow Image, Shape Θ1, w1 = P(Θ1|m,D). RGB histogram for object; RGB histogram for background.

57 M-Step Cow Image, Shape Θ1, w1 = P(Θ1|m,D). [Image plane: D (pixels), m (labels), Θ1] The best labelling is found efficiently using a single graph cut.

58 Segmentation using Graph Cuts [Graph: source Obj, sink Bkg, pixels x, y, z; the cut separates Obj from Bkg.] Terminal links carry the unary costs Φx(D|bkg) + Φx(bkg|Θ) and Φz(D|obj) + Φz(obj|Θ); neighbour links carry Ψxy(mx,my) + Φ(D|mx,my).
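The single graph cut can be illustrated on a tiny 1D chain of pixels, with a plain Edmonds-Karp max-flow standing in for the specialised solver used in practice (a sketch, not the paper's implementation; unary costs here fold the colour and shape terms together):

```python
from collections import deque

def min_cut_labels(unary_obj, unary_bkg, pair_w):
    """Binary segmentation via s-t min cut. t-links carry the unary costs,
    n-links the pairwise cost; pixels left on the source side take label 1."""
    n = len(unary_obj)
    s, t = n, n + 1
    cap = [[0.0] * (n + 2) for _ in range(n + 2)]
    for i in range(n):
        cap[s][i] = unary_bkg[i]     # cut if pixel i ends up background
        cap[i][t] = unary_obj[i]     # cut if pixel i ends up object
        if i + 1 < n:
            cap[i][i + 1] = pair_w   # undirected n-link between neighbours
            cap[i + 1][i] = pair_w
    while True:                      # Edmonds-Karp: shortest augmenting paths
        parent = [-1] * (n + 2)
        parent[s] = s
        q = deque([s])
        while q:
            u = q.popleft()
            for v in range(n + 2):
                if parent[v] < 0 and cap[u][v] > 1e-12:
                    parent[v] = u
                    q.append(v)
        if parent[t] < 0:
            break
        f, v = float('inf'), t       # bottleneck capacity along the path
        while v != s:
            f = min(f, cap[parent[v]][v]); v = parent[v]
        v = t                        # push flow, update residual capacities
        while v != s:
            cap[parent[v]][v] -= f; cap[v][parent[v]] += f; v = parent[v]
    seen = [False] * (n + 2)         # pixels reachable from s get the object label
    seen[s] = True
    q = deque([s])
    while q:
        u = q.popleft()
        for v in range(n + 2):
            if not seen[v] and cap[u][v] > 1e-12:
                seen[v] = True; q.append(v)
    return [1 if seen[i] else 0 for i in range(n)]
```

With two pixels preferring object and one preferring background, the cut pays one pairwise edge and labels the chain [1, 1, 0].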

59 Segmentation using Graph Cuts [Graph: source Obj, sink Bkg, pixels; the cut yields the labelling m.]

60 M-Step Cow Image, Shape Θ2, w2 = P(Θ2|m,D). RGB histogram for object; RGB histogram for background.

61 M-Step Cow Image, Shape Θ2, w2 = P(Θ2|m,D). [Image plane: D (pixels), m (labels), Θ2] The best labelling is found efficiently using a single graph cut.

62 M-Step [Image planes for Θ1 and Θ2, weighted w1 + w2 + ….] The best labelling m* = arg min_m Σi wi E(m,Θi) is found efficiently using a single graph cut.
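The weighted objective m* = arg min_m Σi wi E(m,Θi) can be checked by brute force on a tiny chain (the paper minimises the same weighted sum exactly with one graph cut; all names here are illustrative):

```python
import itertools

def best_labelling(n_pixels, unaries, weights, lam=1.0):
    """Brute-force m* = arg min_m sum_i w_i E(m,Theta_i) on a tiny 1D chain.
    unaries[i][x][l] is the unary cost of giving pixel x label l under shape
    sample Theta_i; the Potts pairwise term does not depend on Theta. The
    weighted sum of these energies stays graph-cut-minimisable, which is why
    one cut suffices in the full method."""
    best_e, best_m = float('inf'), None
    for m in itertools.product((0, 1), repeat=n_pixels):
        e = sum(w * sum(u[x][m[x]] for x in range(n_pixels))
                for u, w in zip(unaries, weights))
        e += lam * sum(m[x] != m[x + 1] for x in range(n_pixels - 1))
        if e < best_e:
            best_e, best_m = e, m
    return best_m
```

When two shape samples disagree on the middle pixel, the higher-weight sample dominates the weighted unary and decides its label.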

63 Outline Problem Formulation Form of Shape Prior Optimization Results

64 Results using the LPS model for cows. [Image / Segmentation]

65 Results using the LPS model for cows, in the absence of a clear boundary between object and background. [Image / Segmentation]

66 Results using the LPS model for cows. [Image / Segmentation]

67 Results using the LPS model for cows. [Image / Segmentation]

68 Results using the LPS model for horses. [Image / Segmentation]

69 Results using the LPS model for horses. [Image / Segmentation]

70 Results: comparison with Leibe and Schiele. [Image / Our Method / Leibe and Schiele]

71 Results: Appearance only (without Φx(mx|Θ)), Shape only (without Φx(D|mx)), and Shape+Appearance.

72 Face Detector and ObjCut

73 Do we really need accurate models? The segmentation boundary can be extracted from edges; a rough 3D shape prior is enough for region disambiguation.

74 Energy of the Pose-specific MRF Energy to be minimized: unary term + shape prior + pairwise potential (Potts model). But what should the value of θ be?

75 The different terms of the MRF Original image; likelihood of being foreground given a foreground histogram; Grimson-Stauffer segmentation; shape prior model; shape prior (distance transform); likelihood of being foreground given all the terms; resulting graph-cuts segmentation.

76 Can segment multiple views simultaneously

77 Solve via gradient descent Comparable to level-set methods. Could use other approaches (e.g. ObjCut). Needs a graph cut per function evaluation.

78 Formulating the Pose Inference Problem

79 But… to compute the MAP of E(x) w.r.t. the pose, the unary terms change at EACH iteration and the max-flow must be recomputed! However, Kohli and Torr showed how dynamic graph cuts can be used to efficiently find MAP solutions for MRFs that change minimally from one time instant to the next: Dynamic Graph Cuts (ICCV 05).

80 Dynamic Graph Cuts Solving problem PA for solution SA is a computationally expensive operation. PB is similar to PA: exploiting the differences between A and B gives a simpler problem PB*, making SB a cheaper operation to obtain.

81 Dynamic Image Segmentation [Image; flows in n-edges; segmentation obtained]

82 First segmentation problem Ga: MAP solution via our algorithm, maximum flow, residual graph (Gr). Second segmentation problem Gb: G′ = difference between Ga and Gb, giving an updated residual graph.

83 Dynamic Graph Cut vs Active Cuts Our method: flow recycling. Active Cuts: cut recycling. Both methods: tree recycling.

84 Experimental Analysis MRF consisting of 2×10⁵ latent variables connected in a 4-neighbourhood. Running time of the dynamic algorithm.

85 Segmentation Comparison Grimson-Stauffer Bathia04 Our method

86 Segmentation

87

88 Conclusion Combining pose inference and segmentation is worth investigating. Tracking = Detection. Detection = Segmentation. Tracking = Segmentation. Segmentation = SFM??

