Slide 3: Segmentation
- To distinguish cow and horse?
- First: the segmentation problem.
Slide 4: Aim
- Given an image, segment the object (Category Model + Segmentation: Cow Image → Segmented Cow).
- Segmentation should (ideally) be:
  - shaped like the object, e.g. cow-like
  - obtained efficiently in an unsupervised manner
  - able to handle self-occlusion
Slide 6: Motivation (Magic Wand)
- Current methods require user intervention:
  - object and background seed pixels (Boykov and Jolly, ICCV '01)
  - bounding box of the object (Rother et al., SIGGRAPH '04)
- [Figure: cow image with object seed pixels]
Slide 7: Motivation (Magic Wand), continued
- [Figure: cow image with object and background seed pixels]
Slide 8: Motivation (Magic Wand), continued
- [Figure: segmented image]
Slide 11: Motivation (Problem)
- Manually intensive.
- Segmentation is not guaranteed to be 'object-like'.
- [Figure: non-object-like segmentation]
Slide 12: Our Method
- Combine object detection with segmentation (Borenstein and Ullman, ECCV '02; Leibe and Schiele, BMVC '03).
- Incorporate global shape priors in the MRF.
- Detection provides object localization and global shape priors.
- Automatically segments the object.
- Note: our method is completely generic and applicable to any object category model.
Slide 13: Outline
- Problem Formulation
- Form of Shape Prior
- Optimization
- Results
Slide 14: Problem
- Labelling m over the set of pixels D.
- Shape prior provided by the parameter Θ.
- Energy:
  E(m, \Theta) = \sum_x \left( \Phi_x(D|m_x) + \Phi_x(m_x|\Theta) \right) + \sum_{(x,y)} \left( \Psi_{xy}(m_x, m_y) + \Phi(D|m_x, m_y) \right)
  - Unary terms: a likelihood based on colour, and a unary potential based on distance from Θ.
  - Pairwise terms: a prior and a contrast term.
- Find the best labelling m^* = \arg\min_m \sum_i w_i \, E(m, \Theta_i), where w_i is the weight for sample Θ_i.
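The energy above can be evaluated directly. The sketch below sums the unary and pairwise terms for a binary labelling on a 4-connected grid; all array layouts and names (`unary_col`, `unary_shape`, `contrast`) are illustrative stand-ins, not the paper's exact potentials:

```python
import numpy as np

def energy(m, unary_col, unary_shape, gamma, contrast):
    """Energy of a binary labelling m on a 4-connected grid.

    m           : (H, W) array of {0, 1} labels
    unary_col   : (H, W, 2) colour likelihood costs  Phi_x(D | m_x)
    unary_shape : (H, W, 2) shape-prior costs        Phi_x(m_x | Theta)
    gamma       : weight of the Potts prior          Psi_xy(m_x, m_y)
    contrast    : (H, W, 2) edge costs Phi(D | m_x, m_y) toward the
                  neighbour to the right (index 0) and below (index 1)
    """
    H, W = m.shape
    ii, jj = np.mgrid[0:H, 0:W]
    e = unary_col[ii, jj, m].sum() + unary_shape[ii, jj, m].sum()
    # Pairwise terms are paid only across label discontinuities.
    horiz = m[:, :-1] != m[:, 1:]
    vert = m[:-1, :] != m[1:, :]
    e += (gamma + contrast[:, :-1, 0])[horiz].sum()
    e += (gamma + contrast[:-1, :, 1])[vert].sum()
    return e
```

With all unary and contrast costs at zero, the energy is just the Potts weight times the length of the label boundary.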
Slide 15: MRF
- The probability of a labelling consists of:
  - Likelihood: a unary potential Φ_x(D|m_x) based on the colour of pixel x.
  - Prior Ψ_xy(m_x, m_y): pairwise potentials that favour the same label for neighbouring pixels.
- [Figure: image plane D (pixels) and label field m, with unary and pairwise links between x and y]
Slide 16: Example
- [Figure: cow image with object and background seed pixels; the likelihood-ratio (colour) terms Φ_x(D|obj), Φ_x(D|bkg) and the prior Ψ_xy(m_x, m_y) shown on the MRF grid]
Slide 17: Example
- [Figure: cow image, likelihood ratio (colour), and prior, given object and background seed pixels]
Slide 18: Contrast-Dependent MRF
- The probability of a labelling additionally has a contrast term Φ(D|m_x, m_y), which favours segmentation boundaries that lie on image edges.
- [Figure: label field m and image plane D with the contrast term on the edge between x and y]
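A common contrast-sensitive form of the pairwise cost is a Potts penalty attenuated by the local intensity difference, so that cutting across a strong image edge is cheap. This is a generic sketch with assumed constants, not the paper's exact term:

```python
import numpy as np

def pairwise_cost(ix, iy, gamma=2.0, sigma=5.0):
    """Pairwise cost for neighbouring pixels with intensities ix, iy.

    The Potts weight gamma is scaled down by an exponential of the
    squared intensity difference: similar neighbours are expensive to
    separate, while neighbours across an edge (large |ix - iy|) are
    cheap, so boundaries snap to image edges.
    """
    return gamma * np.exp(-((ix - iy) ** 2) / (2.0 * sigma ** 2))
```

For identical intensities the full Potts penalty is paid; across a 0-vs-255 step the cost is effectively zero.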
Slide 19: Example
- [Figure: cow image with object and background seed pixels; likelihood-ratio (colour) terms and the combined Ψ_xy(m_x, m_y) + Φ(D|m_x, m_y) prior + contrast terms on the MRF grid]
Slide 20: Example
- [Figure: cow image, likelihood ratio (colour), and prior + contrast, given object and background seed pixels]
Slide 21: Our Model
- The probability of a labelling additionally has a unary potential Φ_x(m_x|Θ), which depends on the distance from Θ (the shape parameter).
- This gives an Object Category Specific MRF.
- [Figure: Θ (shape parameter), label field m, and image plane D (pixels)]
Slide 34: Pictorial Structures (PS)
- Fischler and Elschlager, 1973.
- PS = 2D parts + configuration.
- Aim: learn pictorial structures in an unsupervised manner.
- Layered Pictorial Structures (LPS) = parts + configuration + relative depth:
  - identify parts
  - learn the configuration
  - learn the relative depth of parts
Slide 35: Pictorial Structures
- Each part is a variable; its states are image locations AND affine deformations.
- [Figure: affine warp of parts]
Slide 36: Pictorial Structures
- Each part is a variable; its states are image locations.
- The MRF favours certain configurations.
Slide 37: Bayesian Formulation (MRF)
- D = image; D_i = pixels ∈ p_i, given l_i (PDF Projection Theorem).
- z = sufficient statistics.
- ψ(l_i, l_j) = const if (l_i, l_j) is a valid configuration, 0 otherwise (Potts model).
Slide 38: Defining the Likelihood
- We want a likelihood that combines both the outline and the interior appearance of a part.
- Define features that will be sufficient statistics to discriminate foreground from background:
Slide 39: Features
- Outline: z1, chamfer distance.
- Interior: z2, textons.
- Model the joint distribution of (z1, z2) as a 2D Gaussian.
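Scoring a candidate part under the joint feature model then amounts to a 2D Gaussian log-likelihood. A minimal sketch, assuming the mean and covariance of (z1, z2) have already been estimated from training detections:

```python
import numpy as np

def gaussian2d_loglik(z, mu, cov):
    """Log-likelihood of the feature vector z = (z1, z2) under a 2D
    Gaussian with mean mu and covariance cov.

    z1 = chamfer (outline) score, z2 = texton (interior) score.
    """
    d = np.asarray(z, float) - np.asarray(mu, float)
    cov = np.asarray(cov, float)
    inv = np.linalg.inv(cov)
    # Normalizing constant: -0.5 * (k*log(2*pi) + log|cov|), k = 2.
    norm = -0.5 * (2 * np.log(2 * np.pi) + np.log(np.linalg.det(cov)))
    return norm - 0.5 * d @ inv @ d
```

Candidates whose (chamfer, texton) pair is far from the training mean receive a correspondingly lower log-likelihood.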
Slide 41: Texton Match Score
- Texture (z2): MRF classifier (Varma and Zisserman, CVPR '03).
- Multiple texture exemplars x of class t.
- Textons: 3 x 3 square neighbourhoods, vector-quantized in texton space.
- Descriptor: histogram of texton labellings.
- Match score: χ² distance.
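The descriptor and match score on this slide are straightforward to write down. A sketch, assuming each pixel in a region has already been assigned a texton label by vector quantization:

```python
import numpy as np

def texton_histogram(labels, n_textons):
    """Normalized histogram of texton labels over a region."""
    h = np.bincount(np.ravel(labels), minlength=n_textons).astype(float)
    return h / h.sum()

def chi2_distance(h1, h2, eps=1e-10):
    """Chi-squared distance between two normalized histograms.

    eps guards against division by zero in empty bins.
    """
    return 0.5 * np.sum((h1 - h2) ** 2 / (h1 + h2 + eps))
```

Identical histograms score 0; fully disjoint histograms score (up to eps) 1.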
Slide 42: Bag of Words / Histogram of Textons
- Having slagged off bag-of-words models, it turns out we used one all along.
- This is like a spatially aware bag-of-words model: a spatially flexible set of templates determines our bag of words.
Slide 43: 2. Fitting the Model
- Efficient likelihood evaluation via cascades of classifiers.
- Solving the MRF:
  - LBP, using a fast algorithm; GBP if LBP does not converge.
  - Could use semidefinite programming (2003).
  - Recent work: a second-order cone programming method performs best (CVPR 2006).
Slide 44: Efficient Detection of Parts
- Cascade of classifiers:
  - Top level: chamfer matching and the distance transform for efficient pre-filtering.
  - Lower level: the full texture model for verification, using efficient nearest-neighbour speed-ups.
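The two-level cascade can be sketched in a few lines. This is an illustration of the idea only, with hypothetical `chamfer_score` and `texture_score` callables standing in for the real classifiers:

```python
def detect_part(candidates, chamfer_score, texture_score, chamfer_thresh):
    """Two-level cascade for one part: a cheap chamfer test prunes most
    candidate locations, and only the survivors pay for the expensive
    full texture model. Returns (location, texture score) pairs for the
    surviving candidates.
    """
    # Level 1: cheap outline-based pre-filtering.
    survivors = [c for c in candidates if chamfer_score(c) < chamfer_thresh]
    # Level 2: expensive texture verification on the survivors only.
    return [(c, texture_score(c)) for c in survivors]
```

The speed-up comes from `texture_score` being evaluated on a small fraction of candidates.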
Slide 45: Cascade of Classifiers (for each part)
- Y. Amit and D. Geman, '97; S. Baker and S. Nayar, '95.
Slide 47: Side Note
- Chamfer matching is like a linear classifier on the distance-transform image (Felzenszwalb).
- A tree is a set of linear classifiers.
- A pictorial structure is a parameterized family of linear classifiers.
Slide 48: Low Levels on Texture
- The top levels of the tree use the outline to eliminate patches of the image.
- Efficiency: chamfer distance on a precomputed distance map.
- The remaining candidates are evaluated using the full texture model.
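Chamfer matching on a precomputed distance map works as follows: the distance transform is computed once per image, after which each template placement costs only a few lookups. The sketch below uses a simple two-pass city-block distance transform as a stand-in for the exact algorithm used:

```python
import numpy as np

def distance_transform(edges):
    """City-block distance transform of a binary edge map, via the
    classic two-pass raster scan."""
    H, W = edges.shape
    INF = H + W
    d = np.where(edges, 0, INF).astype(int)
    for i in range(H):                      # forward pass
        for j in range(W):
            if i > 0: d[i, j] = min(d[i, j], d[i - 1, j] + 1)
            if j > 0: d[i, j] = min(d[i, j], d[i, j - 1] + 1)
    for i in range(H - 1, -1, -1):          # backward pass
        for j in range(W - 1, -1, -1):
            if i < H - 1: d[i, j] = min(d[i, j], d[i + 1, j] + 1)
            if j < W - 1: d[i, j] = min(d[i, j], d[i, j + 1] + 1)
    return d

def chamfer_score(dt, template_pts, offset):
    """Mean distance-transform value under the template's edge points
    placed at `offset`; low scores mean a good outline match."""
    oy, ox = offset
    return float(np.mean([dt[y + oy, x + ox] for y, x in template_pts]))
```

Since the distance map is shared by all templates and placements, the per-candidate cost is linear in the number of template edge points.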
Slide 49: Efficient Nearest Neighbour
- Goldstein, Platt, and Burges (MSR Tech Report, 2003).
- Convert the fixed-distance search into a rectangle search:
  - bitvector_ij(R_k) = 1 if R_k ∈ I_i in dimension j, 0 otherwise.
- To find the nearest neighbour of x: find the intervals containing x in all dimensions, 'AND' the appropriate bitvectors, then run the nearest-neighbour search on the pruned exemplars.
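The idea can be illustrated with boolean masks as the bitvectors. This is a simplified sketch of the rectangle-search principle (per-dimension interval tests ANDed together), not the tech report's data structure:

```python
import numpy as np

def nn_with_pruning(exemplars, x, r):
    """Nearest neighbour of x among exemplars, pruned by per-dimension
    interval tests: any exemplar within Euclidean distance r of x must
    lie inside the axis-aligned rectangle [x - r, x + r], so AND-ing
    the per-dimension membership bitvectors prunes the candidate set.
    Falls back to the full set if the rectangle is empty.
    """
    E = np.asarray(exemplars, float)
    x = np.asarray(x, float)
    mask = np.ones(len(E), bool)
    for j in range(E.shape[1]):             # 'AND' the bitvectors
        mask &= np.abs(E[:, j] - x[j]) <= r
    cand = E[mask] if mask.any() else E
    d = np.linalg.norm(cand - x, axis=1)
    return cand[np.argmin(d)]
```

The exact nearest-neighbour computation is then run only on the exemplars that survive every interval test.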
Slide 50: Recently Solved via Integer Programming
- SDP formulation (Torr 2001, AIStats).
- SOCP formulation (Kumar, Torr and Zisserman, this conference).
- LBP (Huttenlocher; many others).
Slide 51: Outline
- Problem Formulation
- Form of Shape Prior
- Optimization
- Results
Slide 52: Optimization
- Given image D, find the best labelling m* = arg max_m p(m|D).
- Treat the LPS parameter Θ as a latent (hidden) variable.
- EM framework:
  - E-step: sample the distribution over Θ.
  - M-step: obtain the labelling m.
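The alternation above can be sketched as a short loop. The callables `sample_theta` and `segment` are hypothetical stand-ins for the E-step and M-step described on the following slides, not the authors' API:

```python
def objcut_em(image, init_labelling, sample_theta, segment, n_iters=5):
    """Skeleton of the EM loop: the E-step draws weighted shape samples
    (theta_i, w_i) from p(Theta | m', D); the M-step returns the
    labelling minimizing the expected energy under those samples.
    """
    m = init_labelling
    for _ in range(n_iters):
        samples = sample_theta(image, m)    # E-step: [(theta, weight), ...]
        m = segment(image, samples)         # M-step: one graph cut
    return m
```

Each iteration refines both the shape samples and the segmentation they induce.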
Slide 53: E-Step
- Given an initial labelling m', determine p(Θ|m', D).
- Problem: efficiently sampling from p(Θ|m', D).
- Solution: we develop an efficient sum-product loopy belief propagation (LBP) for matching the LPS, similar to the efficient max-product LBP for MAP estimation of Felzenszwalb and Huttenlocher (CVPR '04).
Slide 54: Results
- Different samples localize different parts well.
- We cannot use only the MAP estimate of the LPS.
Slide 55: M-Step
- Given samples from p(Θ|m', D), obtain the new labelling m_new.
- Each sample Θ_i provides:
  - object localization, used to learn the RGB distributions of object and background
  - the shape prior for segmentation
- Problem: maximize the expected log-likelihood using all samples, and obtain the new labelling efficiently.
Slide 56: M-Step
- [Figure: cow image and shape Θ_1 with weight w_1 = P(Θ_1|m', D); RGB histograms for object and background]
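The colour unary terms come from the two histograms on this slide. A sketch with an assumed binning (8 bins per channel; the paper does not fix these constants here):

```python
import numpy as np

def rgb_histogram(pixels, bins=8):
    """Normalized joint RGB histogram of an (N, 3) uint8 pixel array."""
    idx = np.clip(pixels // (256 // bins), 0, bins - 1).astype(int)
    h = np.zeros((bins, bins, bins))
    np.add.at(h, (idx[:, 0], idx[:, 1], idx[:, 2]), 1)
    return h / h.sum()

def log_likelihood_ratio(pixel, h_obj, h_bkg, bins=8, eps=1e-8):
    """log p(pixel | obj) - log p(pixel | bkg) from the two histograms;
    positive values favour the object label for this pixel."""
    i = tuple(np.clip(np.asarray(pixel) // (256 // bins), 0, bins - 1).astype(int))
    return float(np.log(h_obj[i] + eps) - np.log(h_bkg[i] + eps))
```

Pixels whose colour is frequent in the object histogram and rare in the background histogram get a strongly positive ratio, and vice versa.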
Slide 57: M-Step
- The best labelling is found efficiently using a single graph cut.
- [Figure: cow image and shape Θ_1 with weight w_1 = P(Θ_1|m', D); label field m and image plane D (pixels)]
Slide 58: Segmentation using Graph Cuts (ObjCut)
- [Figure: graph-cut construction: terminal edges carry Φ_x(D|bkg) + Φ_x(bkg|Θ) and Φ_z(D|obj) + Φ_z(obj|Θ); neighbouring pixels are linked by Ψ_xy(m_x, m_y) + Φ(D|m_x, m_y)]
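The graph construction in the figure can be demonstrated end to end with a from-scratch max-flow. This is a minimal Edmonds-Karp sketch of the standard s-t construction for binary MRF energies, not the optimized max-flow implementation real systems use:

```python
from collections import defaultdict, deque

def max_flow(cap, s, t):
    """Edmonds-Karp max-flow on a capacity dict cap[u][v]. Returns the
    flow value and the set of nodes on the source side of the cut."""
    flow = 0.0
    while True:
        parent, q = {s: None}, deque([s])    # BFS in the residual graph
        while q and t not in parent:
            u = q.popleft()
            for v, c in cap[u].items():
                if c > 1e-12 and v not in parent:
                    parent[v] = u
                    q.append(v)
        if t not in parent:
            return flow, parent              # parent's keys = source side
        path, v = [], t
        while parent[v] is not None:
            path.append((parent[v], v)); v = parent[v]
        b = min(cap[u][v] for u, v in path)  # bottleneck capacity
        for u, v in path:
            cap[u][v] -= b
            cap[v][u] = cap[v].get(u, 0.0) + b
        flow += b

def segment(obj_cost, bkg_cost, edges, lam):
    """Binary segmentation by a single graph cut. obj_cost[x] and
    bkg_cost[x] are the unary costs of the two labels (illustrative
    names); edges are neighbour pairs; lam is the Potts weight.
    Returns the set of pixels labelled 'obj'."""
    cap = defaultdict(dict)
    for x in obj_cost:
        cap['s'][x] = bkg_cost[x]            # cut if x ends on bkg side
        cap[x]['t'] = obj_cost[x]            # cut if x ends on obj side
    for u, v in edges:
        cap[u][v] = cap[u].get(v, 0.0) + lam
        cap[v][u] = cap[v].get(u, 0.0) + lam
    _, source_side = max_flow(cap, 's', 't')
    return {x for x in obj_cost if x in source_side}
```

The min-cut value equals the minimum of the MRF energy, so the source side of the cut directly yields the optimal labelling.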
Slide 73: Do We Really Need Accurate Models?
- The segmentation boundary can be extracted from edges.
- A rough 3D shape prior is enough for region disambiguation.
Slide 74: Energy of the Pose-Specific MRF
- The energy to be minimized has a unary term, a pairwise potential (Potts model), and a shape prior.
- But what should the value of θ be?
Slide 75: The Different Terms of the MRF
- [Figure panels: original image; likelihood of being foreground given a foreground histogram; likelihood of being foreground given all the terms; shape prior model; Grimson-Stauffer segmentation; shape prior (distance transform); resulting graph-cuts segmentation]
Slide 76: Can Segment Multiple Views Simultaneously
Slide 77: Solve via Gradient Descent
- Comparable to level-set methods.
- Could use other approaches (e.g. ObjCut).
- Needs one graph cut per function evaluation.
Slide 78: Formulating the Pose Inference Problem
Slide 79: But…
- To compute the MAP of E(x) w.r.t. the pose, the unary terms change at EACH iteration and the max-flow must be recomputed!
- However, Kohli and Torr showed how dynamic graph cuts can efficiently find MAP solutions for MRFs that change minimally from one time instant to the next: Dynamic Graph Cuts (ICCV '05).
Slide 80: Dynamic Graph Cuts
- [Figure: solving problem P_A for its solution S_A is a computationally expensive operation. A and B are similar, so the differences between A and B define a simpler problem P_B*; solving it is a cheaper operation and yields the solution S_B of problem P_B]
Slide 81: Dynamic Image Segmentation
- Reuse the flows from the previous image frame (consecutive frames).
- [Figure: segmentation obtained; flows in the n-edges]
Slide 82: Our Algorithm
- First segmentation problem: compute the maximum flow on G_a, giving its MAP solution and the residual graph G_r.
- Apply the difference between G_a and G_b to G_r, obtaining the updated residual graph G'.
- Second segmentation problem: solve G_b starting from G', giving its MAP solution (the problems G and G* must be consistent).
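The flow-recycling idea can be shown on a tiny graph: solve frame 1, apply the capacity difference to the residual graph, and keep augmenting instead of rebuilding. The sketch below only handles capacity increases, for which continuing from the residual graph is valid as-is; Kohli and Torr's reparameterization also handles decreases. The graph and numbers are made up for illustration:

```python
from collections import defaultdict, deque

def augment_to_completion(cap, s, t):
    """Push BFS augmenting paths until none remain. `cap` is a
    residual-capacity dict that may already contain flow from an
    earlier solve; returns only the ADDITIONAL flow pushed."""
    total = 0.0
    while True:
        parent, q = {s: None}, deque([s])
        while q and t not in parent:
            u = q.popleft()
            for v, c in cap[u].items():
                if c > 1e-12 and v not in parent:
                    parent[v] = u
                    q.append(v)
        if t not in parent:
            return total
        path, v = [], t
        while parent[v] is not None:
            path.append((parent[v], v)); v = parent[v]
        b = min(cap[u][v] for u, v in path)
        for u, v in path:
            cap[u][v] -= b
            cap[v][u] = cap[v].get(u, 0.0) + b
        total += b

# Frame 1: solve from scratch on a one-pixel graph s -> x -> t.
cap = defaultdict(dict)
cap['s']['x'] = 3.0
cap['x']['t'] = 2.0
f1 = augment_to_completion(cap, 's', 't')        # max flow = min(3, 2)

# Frame 2: the unary term on x grew. Instead of rebuilding the graph,
# add the capacity difference to the residual graph and keep augmenting.
cap['x']['t'] += 4.0                             # t-link 2.0 -> 6.0
f2 = f1 + augment_to_completion(cap, 's', 't')   # equals min(3, 6)
```

The second solve only pushes the extra unit of flow the update made possible, which is the source of the speed-up on near-identical consecutive frames.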
Slide 83: Dynamic Graph Cuts vs Active Cuts
- Our method: flow recycling.
- Active Cuts: cut recycling.
- Both methods: tree recycling.
Slide 84: Experimental Analysis
- Running time of the dynamic algorithm.
- MRF consisting of 2 x 10^5 latent variables connected in a 4-neighbourhood.