LOCUS: Learning Object Classes with Unsupervised Segmentation


1 LOCUS: Learning Object Classes with Unsupervised Segmentation
16-721: Advanced Perception Nik Melchior April 10, 2006

2 Outline
- Learning Flexible Sprites in Video Layers (Jojic & Frey 2001)
- LOCUS
4/10/06 Nik Melchior - LOCUS

3 Motivation
Object category recognition and segmentation in cluttered scenes
Challenges:
- Spatial variability
- Texture variability
- Occlusion
- Pose
- Lighting
slide: Kumar, Torr, Zisserman

4 Motivation
Given an image and object category, to segment the object
[Diagram: Cow Image + Object Category Model -> Segmentation -> Segmented Cow]
Segmentation should (ideally) be:
- shaped like the object, e.g. cow-like
- obtained efficiently in an unsupervised manner
- able to handle self-occlusion
Notes: so far, we've seen attempts to do this using:
- template-like outline matching
- DT, Chamfer distance
- HoG
- fragments
- ISM
slide: Fei-Fei

5 Flexible Sprites
- Provides a stable segmentation of whole moving objects in video
- Each flexible sprite is a separate layer containing the transformation of a rigid sprite

6 Big Picture
Generative model
- Seeks the best explanation for the pixels of all images, constrained by number of layers
- Operates directly on pixels rather than low-level features
- Rigorous formulation similar to DDMCMC

7 Flexible Sprites
User specifies number of layers
Each layer contains:
- Sprite appearance 's'
- Real-valued mask 'm'
- Discretized simple transformation 'T': "permutation matrix that rearranges the pixels in s"
$$x = Tm * Ts + T\bar{m} * b + \text{noise}$$
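The compositing equation on this slide can be sketched in NumPy for a single sprite over a background. This is only an illustration under stated assumptions, not the authors' implementation: the transformation T is taken to be a circular 2-D shift (one kind of pixel permutation), the inverse mask is taken to be 1 - m, b is taken to be the background image, and the names `shift` and `composite` are hypothetical.

```python
import numpy as np

def shift(img, dy, dx):
    """Apply a translation T as a circular shift, i.e. a permutation of pixels (assumed form of T)."""
    return np.roll(img, (dy, dx), axis=(0, 1))

def composite(s, m, b, dy, dx, noise_std=0.0, rng=None):
    """Return x = Tm * Ts + (1 - Tm) * b + noise for one sprite."""
    Tm = shift(m, dy, dx)          # transformed mask
    Ts = shift(s, dy, dx)          # transformed sprite appearance
    x = Tm * Ts + (1.0 - Tm) * b   # '*' is the element-wise product
    if noise_std > 0:
        rng = rng or np.random.default_rng(0)
        x = x + rng.normal(0.0, noise_std, size=x.shape)
    return x

# Toy 4x4 example: a bright 2x2 sprite on a dark background, moved by (1, 1).
s = np.full((4, 4), 0.9)
m = np.zeros((4, 4))
m[:2, :2] = 1.0
b = np.full((4, 4), 0.1)
x = composite(s, m, b, dy=1, dx=1)
```

Because a shift only permutes pixels, transforming the inverse mask is the same as inverting the transformed mask, which is why the sketch computes `1.0 - Tm` directly.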

8 Formulation
Recursive formula for image appearance, given the appearance, mask, and transform of each layer, with zero-mean Gaussian noise:
$$p(x \mid \{s_l, m_l, T_l\}) = N\Big(x;\ \sum_{l=1}^{L} \Big(\prod_{i=l+1}^{L} T_i \bar{m}_i\Big) * T_l m_l * T_l s_l,\ \beta\Big)$$
Assuming independence of each sprite class, appearance, mask, and transformation in each layer:
$$p(x, \{c_l, T_l, s_l, m_l\}) = N\Big(x;\ \sum_{l=1}^{L} \Big(\prod_{i=l+1}^{L} T_i \bar{m}_i\Big) * T_l m_l * T_l s_l,\ \beta\Big) \prod_{l=1}^{L} N(s_l; \mu_{c_l}, \Phi_{c_l})\, N(m_l; \eta_{c_l}, \psi_{c_l})\, \pi_{c_l}\, \rho_{T_l}$$
Factors: appearance prior, mask prior, class prior (to be learned), transformation prior (uniform)
Notes:
- priors: Gaussian for s, m; uniform for T
- need to learn: prior over c (pi); means and variances of sprite appearances and masks; noise beta
- uniform prior over transformation due in part to discretization used
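The recursive formula for image appearance can be sketched as follows. This is a minimal, noise-free illustration under assumptions: the layers are already transformed, they are ordered from back to front, the rearmost layer acts as a background with an all-ones mask, and the function names are hypothetical. The first function applies the recursion one layer at a time; the second evaluates the equivalent sum over layers, each weighted by the inverse masks of the layers in front of it.

```python
import numpy as np

def compose_recursive(sprites, masks):
    """Paint layers back to front: x <- Tm * Ts + (1 - Tm) * x."""
    x = np.zeros_like(sprites[0], dtype=float)
    for Ts, Tm in zip(sprites, masks):
        x = Tm * Ts + (1.0 - Tm) * x
    return x

def compose_sum(sprites, masks):
    """Sum over layers, each attenuated by the inverse masks of all layers in front."""
    x = np.zeros_like(sprites[0], dtype=float)
    visibility = np.ones_like(x)   # running product of (1 - Tm) over layers in front
    for Ts, Tm in zip(reversed(sprites), reversed(masks)):
        x += visibility * Tm * Ts
        visibility *= 1.0 - Tm
    return x

# Toy 1-D example with three layers (back to front); values are illustrative only.
sprites = [np.full(4, 0.1), np.full(4, 0.5), np.full(4, 0.9)]
masks = [np.ones(4), np.array([1.0, 1.0, 0.0, 0.0]), np.array([0.0, 1.0, 1.0, 0.0])]
x_rec = compose_recursive(sprites, masks)
x_sum = compose_sum(sprites, masks)
```

In this toy case the front layer hides the middle layer at the one pixel where their masks overlap, and the background shows through only where both masks are zero.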

9 Formulation
Use EM to maximize the posterior
$$p(\{c_l, T_l, s_l, m_l\} \mid x) = \frac{p(x, \{c_l, T_l, s_l, m_l\})}{p(x)} \approx \prod_{l=1}^{L} q(c_l, T_l)\, q(s_l)\, q(m_l)$$
Notes:
- exact posterior intractable: (CJ)^L
- C = # classes
- J = # transformations
- L = # layers
- factorize to make it linear: CJL
Compute KL divergence of 2 distributions (written here with the posterior in the numerator, so D is the negative of the usual KL divergence and D <= 0):
$$D = \sum_{\{c_l, T_l\}} \int_{\{s_l, m_l\}} \Big(\prod_{l=1}^{L} q(c_l, T_l)\, q(s_l)\, q(m_l)\Big) \ln \frac{p(\{c_l, T_l, s_l, m_l\} \mid x)}{\prod_{l=1}^{L} q(c_l, T_l)\, q(s_l)\, q(m_l)}$$

10 Formulation
$$F = D + \ln p(x) = \sum_{\{c_l, T_l\}} \int_{\{s_l, m_l\}} \Big(\prod_{l=1}^{L} q(c_l, T_l)\, q(s_l)\, q(m_l)\Big) \ln \frac{p(x, \{c_l, T_l, s_l, m_l\})}{\prod_{l=1}^{L} q(c_l, T_l)\, q(s_l)\, q(m_l)}$$
Maximize F
Notes:
- log of numerator is a sum of Mahalanobis distances
- q ln 1/q is the entropy of q

11 E step
For each single image:
- the mean should stay close to the overall mean, but large where image does not match background
- variance should be high where the transformed image is similar to the background
- similar update for xi_T, the transform, which ensures that image matches transformed sprite

12 M step

13 Results
1 fps at 320x240

14 LOCUS
Learning Object Classes with Unsupervised Segmentation

15 Comparison to Flexible Sprites
- Classes instead of single sprites
- more complex appearance model
- addition of edge model
- 2 layers: object and background
- weak prior of palettes (appearance): one for object, one for background

16 LOCUS model
Shared between images:
- Class shape π
- Class edge sprite μo, σo
Different for each image:
- Deformation field D
- Position & size T
- Mask m
- Edge image e
- Background appearance λ0
- Object appearance λ1
Notes: Emphasise class model (edge + mask) (shared); all other variables per-image. Emphasise LEARN EVERYTHING SIMULTANEOUSLY.
Image: Winn and Jojic, 2005

17 LOCUS model
- Deformation field D: Markov Random Field
- Transform T: scaling and translation; rotation or full affine transform possible
- Mask m
- Edge model e
- Appearance model: Gaussian mixture model (histogram of colors or textures)
Notes:
- D: MRF penalises squared distance between neighboring vectors
- m: favors short segmentation boundaries
- e: not applied to background due to expected noise
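The two smoothness preferences in the notes can be illustrated with simple energy terms. The slide does not give the exact form of either prior, so everything here is an assumption: a 4-connected grid, unit weights, and a boundary length measured by counting neighbouring pixels with different labels. The function names are hypothetical.

```python
import numpy as np

def deformation_energy(D):
    """Sum of squared differences between 4-connected neighbouring vectors.

    D has shape (H, W, 2): one 2-D displacement vector per grid location.
    """
    dy = D[1:, :, :] - D[:-1, :, :]   # vertical neighbours
    dx = D[:, 1:, :] - D[:, :-1, :]   # horizontal neighbours
    return float((dy ** 2).sum() + (dx ** 2).sum())

def boundary_length(mask):
    """Count 4-connected neighbour pairs whose binary labels differ."""
    m = np.asarray(mask).astype(int)
    return int(np.abs(np.diff(m, axis=0)).sum() + np.abs(np.diff(m, axis=1)).sum())

smooth = np.ones((3, 3, 2))            # constant field: no penalty
rough = np.zeros((3, 3, 2))
rough[1, 1] = [1.0, 0.0]               # a single outlier vector in the centre

blob = np.array([[0, 0, 0, 0],
                 [0, 1, 1, 0],
                 [0, 1, 1, 0],
                 [0, 0, 0, 0]])        # compact object: short boundary
checker = np.indices((4, 4)).sum(axis=0) % 2   # checkerboard: every neighbour pair differs

e_smooth = deformation_energy(smooth)
e_rough = deformation_energy(rough)
len_blob = boundary_length(blob)
len_checker = boundary_length(checker)
```

Under these assumptions a prior proportional to the exponential of the negative energy would favor the constant deformation field over the one with an outlier, and the compact blob over the checkerboard mask.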

18 EM formulation
$$P(\pi, \mu_o, \sigma_o, \mu_b, \sigma_b \mid I) \approx \prod_i Q(\pi_i)\, Q(\mu_i^o, \sigma_i^o)\, Q(\mu_i^b, \sigma_i^b)$$
Posterior similar to flexible sprites
Iteratively optimize over 9 latent variables: $\lambda, m, T, D, \pi, \mu_o, \sigma_o, \mu_b, \sigma_b$
- mixture model parameters
- binary mask
- position and size transformation
- deformation field
- real-valued mask
- object edge model
- background edge model

19 Results

20 Results
Comparison to Borenstein fragments
- Percentages indicate correctly labeled pixels across all images
- Last row removes class shape and edge information
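The accuracy measure described here, the percentage of correctly labeled pixels pooled across all images, can be computed as in the sketch below. The function name and the toy masks are hypothetical and do not reproduce any numbers from the results table.

```python
import numpy as np

def pixel_accuracy(predicted_masks, true_masks):
    """Percentage of pixels labelled correctly, pooled across all images."""
    correct = 0
    total = 0
    for pred, true in zip(predicted_masks, true_masks):
        pred = np.asarray(pred, dtype=bool)
        true = np.asarray(true, dtype=bool)
        correct += int((pred == true).sum())
        total += true.size
    return 100.0 * correct / total

# Two toy images of different sizes; one pixel in the first is mislabelled.
preds = [np.array([[1, 1], [0, 0]]), np.array([[1, 0, 0]])]
truth = [np.array([[1, 0], [0, 0]]), np.array([[1, 0, 0]])]
acc = pixel_accuracy(preds, truth)
```

Pooling pixels before dividing means larger images carry more weight than they would if per-image accuracies were averaged.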

21 Results

