Part 4: Combined segmentation and recognition by Rob Fergus (MIT)


Aim
Given an image and object category, to segment the object.
Segmentation should (ideally) be:
- shaped like the object, e.g. cow-like
- obtained efficiently in an unsupervised manner
- able to handle self-occlusion
[Figure: Object Category Model (Cow) and Image → Segmentation → Segmented Cow]
Slide from Kumar '05

Feature-detector view

Examples of bottom-up segmentation using Normalized Cuts (Shi & Malik, 1997). Borenstein and Ullman, ECCV 2002

Jigsaw approach: Borenstein and Ullman, 2002

Perceptual and Sensory Augmented Computing
Interleaved Object Categorization and Segmentation: Implicit Shape Model (Leibe and Schiele, 2003, 2005)
[Figure: pipeline stages]
- Interest Points
- Matched Codebook Entries
- Probabilistic Voting
- Voting Space (continuous)
- Backprojection of Maxima
- Backprojected Hypotheses
- Refined Hypotheses (uniform sampling)
- Segmentation
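The voting and backprojection stages named on this slide can be illustrated with a minimal sketch. It is not the authors' implementation: the codebook contents, the offsets, and the function names are hypothetical, and votes are accumulated on exact integer positions rather than in the continuous voting space the slide mentions.

```python
from collections import Counter

# Hypothetical codebook: each entry lists offsets (dx, dy) from a matched
# interest point to the object centre, as they might be observed in training.
CODEBOOK = {
    "head": [(3, 0)],
    "leg": [(0, -2), (1, -2)],
}

def cast_votes(matches):
    """Let every matched codebook entry vote for possible object centres."""
    votes = Counter()
    voters = {}
    for x, y, entry in matches:
        offsets = CODEBOOK[entry]
        weight = 1.0 / len(offsets)  # spread one unit of evidence over the offsets
        for dx, dy in offsets:
            centre = (x + dx, y + dy)
            votes[centre] += weight
            voters.setdefault(centre, []).append((x, y))
    return votes, voters

def strongest_hypothesis(matches):
    """Pick the maximum of the voting space and backproject to its supporters."""
    votes, voters = cast_votes(matches)
    centre, score = max(votes.items(), key=lambda item: item[1])
    return centre, score, voters[centre]

# (x, y, matched codebook entry) for four interest points; the last is clutter.
matches = [(2, 5, "head"), (5, 7, "leg"), (4, 7, "leg"), (9, 9, "leg")]
centre, score, support = strongest_hypothesis(matches)
print(centre, score, support)
```

The list of supporting interest points is what a backprojection step would start from: only features that voted for the winning centre contribute to that hypothesis's segmentation.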

Random Fields for segmentation
I = image pixels (observed)
h = foreground/background labels (hidden), one label per pixel
θ = parameters
[Equation terms labelled on slide: posterior, joint, likelihood, prior]
1. Generative approach models the joint → Markov random field (MRF)
2. Discriminative approach models the posterior directly → Conditional random field (CRF)
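The four labels on this slide (posterior, joint, likelihood, prior) match the terms of Bayes' rule. A sketch of that relation, assuming θ denotes the parameters (the symbol did not survive in the transcript):

```latex
\underbrace{p(h \mid I, \theta)}_{\text{posterior}}
\;\propto\;
\underbrace{p(h, I \mid \theta)}_{\text{joint}}
\;=\;
\underbrace{p(I \mid h, \theta)}_{\text{likelihood}}\;
\underbrace{p(h \mid \theta)}_{\text{prior}}
```

A generative model specifies the two factors on the right; a discriminative model parameterizes the left-hand side directly.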

Generative Markov Random Field
[Figure: image plane with pixels I and labels h ∈ {foreground, background}; neighboring sites i, j with labels h_i, h_j]
Unary potential (likelihood): $\phi_i(I \mid h_i, \theta_i)$
Pairwise potential (MRF prior): $\psi_{ij}(h_i, h_j \mid \theta_{ij})$
The prior has no dependency on I.
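To make the roles of the two potentials concrete, here is a minimal sketch of an MRF energy with a unary data term and an image-independent smoothness prior. The Gaussian color model, its means and variance, the Potts form of the prior, and the 1-D chain of pixels are all assumptions chosen for brevity; the exhaustive search stands in for a real inference method and is only feasible for tiny inputs.

```python
import itertools

def unary_cost(pixel, label):
    """Negative log-likelihood of a grey value under an assumed Gaussian model."""
    mean = 0.8 if label == 1 else 0.2   # assumed foreground / background means
    return (pixel - mean) ** 2 / (2 * 0.25 ** 2)

def energy(pixels, labels, smoothness=1.0):
    """Unary costs plus a Potts prior that ignores the image entirely."""
    data = sum(unary_cost(p, h) for p, h in zip(pixels, labels))
    prior = sum(smoothness for a, b in zip(labels, labels[1:]) if a != b)
    return data + prior

def map_labelling(pixels, smoothness=1.0):
    """Exhaustive search for the lowest-energy labelling."""
    return min(itertools.product((0, 1), repeat=len(pixels)),
               key=lambda labels: energy(pixels, labels, smoothness))

# The second pixel is noisy: on its own it looks like foreground.
pixels = [0.1, 0.6, 0.2, 0.9, 0.8]
print(map_labelling(pixels, smoothness=0.0))  # each pixel decided independently
print(map_labelling(pixels, smoothness=1.0))  # prior smooths the noisy pixel
```

With the prior switched off, the noisy pixel is labelled foreground; with it switched on, the isolated label is removed because two label changes cost more than one poorly explained pixel.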

Conditional Random Field (discriminative approach)
Lafferty, McCallum and Pereira 2001; e.g. Kumar and Hebert 2003
[Figure: image plane with pixels I; neighboring sites i, j with labels h_i, h_j; equation terms labelled "Unary" and "Pairwise"]
Dependency on I allows introduction of pairwise terms that make use of the image. For example, neighboring labels should be similar only if pixel colors are similar → contrast term.
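A contrast term of this kind can be sketched as a penalty on differing neighboring labels that shrinks as the color difference grows. The exponential form and the `beta` and `lam` parameters below are a common choice assumed for illustration; the slide does not specify the functional form.

```python
import math

def contrast_weight(colour_i, colour_j, beta=10.0, lam=1.0):
    """Penalty for giving neighbours i and j different labels.

    Large when the two colours are similar, small across a strong edge.
    """
    diff2 = sum((a - b) ** 2 for a, b in zip(colour_i, colour_j))
    return lam * math.exp(-beta * diff2)

def pairwise_cost(h_i, h_j, colour_i, colour_j):
    """Image-dependent pairwise term: zero when the labels agree."""
    return 0.0 if h_i == h_j else contrast_weight(colour_i, colour_j)

similar = pairwise_cost(0, 1, (0.50, 0.50, 0.50), (0.52, 0.50, 0.49))
across_edge = pairwise_cost(0, 1, (0.10, 0.10, 0.10), (0.90, 0.80, 0.90))
print(similar, across_edge)
```

Cutting between two similar pixels is expensive, while cutting across a strong edge is nearly free, so label boundaries are encouraged to follow image edges.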

OBJCUT (Kumar, Torr & Zisserman 2005)
[Figure from Kumar et al., CVPR 2005: image plane with pixels I, neighboring labels h_i, h_j and shape parameter Ω]
Ω is a shape prior on the labels from a Layered Pictorial Structure (LPS) model.
Energy terms (labelled "Pairwise" and "Unary" on the slide): label smoothness, contrast, distance from Ω, color likelihood.
Segmentation by:
- Match LPS model to image (get a number of samples, each with a different pose)
- Marginalize over the samples using a single graph cut [Boykov & Jolly, 2001]
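One way to read the four term labels on this slide is as an energy with two unary and two pairwise components. The grouping and all symbol names below are assumptions made for illustration; the slide's own equation did not survive in the transcript.

```latex
E(h; \Omega, \theta) =
\sum_i \Big[
  \underbrace{\phi^{\text{col}}_i(I \mid h_i, \theta)}_{\text{color likelihood}}
  + \underbrace{\phi^{\text{shape}}_i(h_i \mid \Omega)}_{\text{distance from } \Omega}
\Big]
+ \sum_{(i,j)} \Big[
  \underbrace{\psi^{\text{smooth}}_{ij}(h_i, h_j)}_{\text{label smoothness}}
  + \underbrace{\psi^{\text{con}}_{ij}(I \mid h_i, h_j)}_{\text{contrast}}
\Big]
```

Under this reading, the shape prior enters only through a unary term, so for a fixed Ω the energy keeps the unary-plus-pairwise form that a graph cut can minimize.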

OBJCUT: Shape prior Ω, Layered Pictorial Structures (LPS) (Kumar et al. 2004, 2005)
- Generative model
- Composition of parts + spatial layout (pairwise configuration)
- Parts in Layer 2 can occlude parts in Layer 1
[Figure: Layer 1 and Layer 2]

OBJCUT: Results using LPS model for cow, in the absence of a clear boundary between object and background
[Figure: image and segmentation]

Levin & Weiss [ECCV 2006]
- Segmentation alignment with image edges
- Consistency with fragments segmentation
- Resulting min-cut segmentation

Layout Consistent Random Field (Winn and Shotton 2006)
Classifier: decision forest classifier [Lepetit et al. CVPR 2005]
Features are differences of pixel intensities

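Layout consistency (Winn and Shotton 2006)
[Figure: grid of part labels]
(7,2) (8,2) (9,2)
(7,3) (8,3) (9,3)
(7,4) (8,4) (9,4)
Neighboring pixels: next to a pixel labelled (p,q), the neighbor marked "?" is layout consistent if it is labelled (p,q), (p,q+1), (p+1,q+1) or (p-1,q+1).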
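The neighbor rule on this slide can be written as a small predicate. This sketch assumes the "?" pixel is the neighbor in the direction of increasing q (one row down) and ignores background labels; the function names are hypothetical.

```python
def layout_consistent_below(upper, lower):
    """True if part label `lower` may sit directly beneath part label `upper`.

    Labels are (p, q) grid coordinates of object parts.
    """
    p, q = upper
    return lower in {(p, q), (p, q + 1), (p + 1, q + 1), (p - 1, q + 1)}

def count_vertical_violations(label_grid):
    """Count vertically adjacent pixel pairs that break layout consistency."""
    violations = 0
    for upper_row, lower_row in zip(label_grid, label_grid[1:]):
        for upper, lower in zip(upper_row, lower_row):
            if not layout_consistent_below(upper, lower):
                violations += 1
    return violations

grid = [
    [(7, 2), (8, 2), (9, 2)],
    [(7, 3), (8, 3), (9, 3)],
    [(7, 4), (8, 2), (9, 4)],  # the middle label breaks the layout
]
print(count_vertical_violations(grid))
```

In a random field, each violating pair would incur a pairwise penalty, which discourages labellings whose parts appear in an impossible arrangement.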

Layout Consistent Random Field Layout consistency Part detector Winn and Shotton 2006

Stability of part labelling Part color key

Object-Specific Figure-Ground Segregation Stella X. Yu and Jianbo Shi, 2002

Image parsing: Tu, Zhu and Yuille 2003

Segmentation Trees (Todorovic and Ahuja, CVPR 2006). Slide from T. Wu
Task: segment out all the cars.
[Figure: Overview. Training images, multiscale segmentation, fused tree model for cars, unseen image, segmented cars]

LOCUS model (Winn and Jojic, 2005; Kannan, Jojic and Frey 2004)
Variables in the figure:
- Deformation field D
- Position & size T
- Class shape π
- Class edge sprite μ_o, σ_o
- Edge image e
- Image
- Object appearance λ_1
- Background appearance λ_0
- Mask m
Figure legend: some variables are shared between images, others are different for each image.

In this section: brief paper reviews
- Jigsaw approach: Borenstein & Ullman, 2001, 2002
- Concurrent recognition and segmentation: Yu and Shi, 2002
- Image parsing: Tu, Zhu & Yuille 2003
- Interleaved segmentation: Leibe & Schiele, 2004, 2005
- OBJCUT: Kumar, Torr, Zisserman 2005
- LOCUS: Winn and Jojic, 2005
- LayoutCRF: Winn and Shotton, 2006
- Levin and Weiss, 2006
- Todorovic and Ahuja, 2006

Summary
Strengths
- Explains every pixel of the image
- Useful for image editing, layering, etc.
Issues
- Invariance issues, especially scale and viewpoint variations
- Inference difficulties

Conditional Random Fields for Segmentation
Segmentation map x, image I
Terms: low-level pairwise term (pixel-wise similarity), high-level local term

Object-Specific Figure-Ground Segregation Some segmentation/detection results Yu and Shi, 2002

Multiscale Conditional Random Fields for Image Labeling: Xuming He, Richard S. Zemel, Miguel Á. Carreira-Perpiñán
Conditional Random Fields for Object Recognition: Ariadna Quattoni, Michael Collins, Trevor Darrell

OBJCUT (Kumar et al. 2004, 2005)
Object Category Specific MRF
[Figure: image plane with pixels D, labels m and shape parameter Θ; neighboring sites x, y with labels m_x, m_y]
The probability of a labelling additionally has a unary potential that depends on the distance from Θ (shape parameter).
Unary potential: $\Phi_x(m_x \mid \Theta)$

Localization using features

Levin and Weiss, ECCV 2006

Results: horses

Cows: Results (Leibe and Schiele, 2003, 2005)
- Segmentations from interest points
- Single-frame recognition: no temporal continuity used!

Examples of low-level image segmentation: Normalized Cuts (Shi & Malik, 1997). Borenstein & Ullman, ECCV 2002

Jigsaw approach Each patch has foreground/background mask

LayoutCRF

Interpretation of p(figure) map (Leibe and Schiele, 2003, 2005)
- per-pixel confidence in object hypothesis
- Use for hypothesis verification
[Figure: original image, p(figure), p(ground), segmentation]
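A minimal sketch of how per-pixel p(figure) and p(ground) maps could yield a segmentation and a verification score. The comparison rule and the mean-confidence score are assumptions for illustration, not the scoring used in the cited papers, and the probability values are made up.

```python
def segment(p_figure, p_ground):
    """Label a pixel as figure (1) where p(figure) exceeds p(ground)."""
    return [[1 if f > g else 0 for f, g in zip(f_row, g_row)]
            for f_row, g_row in zip(p_figure, p_ground)]

def verification_score(p_figure, mask):
    """Mean p(figure) over the pixels labelled figure (0.0 if there are none)."""
    values = [f for f_row, m_row in zip(p_figure, mask)
              for f, m in zip(f_row, m_row) if m]
    return sum(values) / len(values) if values else 0.0

# Hypothetical 2x2 probability maps for one object hypothesis.
p_figure = [[0.9, 0.6], [0.2, 0.1]]
p_ground = [[0.1, 0.3], [0.5, 0.1]]

mask = segment(p_figure, p_ground)
print(mask, verification_score(p_figure, mask))
```

A hypothesis whose figure pixels carry low confidence would receive a low score and could be rejected at the verification stage.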