LOCUS: Learning Object Classes with Unsupervised Segmentation

Slides:



Advertisements
Similar presentations
Part 2: Unsupervised Learning
Advertisements

O BJ C UT M. Pawan Kumar Philip Torr Andrew Zisserman UNIVERSITY OF OXFORD.
Active Appearance Models
Weakly supervised learning of MRF models for image region labeling Jakob Verbeek LEAR team, INRIA Rhône-Alpes.
Learning deformable models Yali Amit, University of Chicago Alain Trouvé, CMLA Cachan.
Top-Down & Bottom-Up Segmentation
Medical Image Registration Kumar Rajamani. Registration Spatial transform that maps points from one image to corresponding points in another image.
A Bayesian Approach to Recognition Moshe Blank Ita Lifshitz Reverend Thomas Bayes
Change Detection C. Stauffer and W.E.L. Grimson, “Learning patterns of activity using real time tracking,” IEEE Trans. On PAMI, 22(8): , Aug 2000.
Agenda Introduction Bag-of-words models Visual words with spatial location Part-based models Discriminative methods Segmentation and recognition Recognition-based.
Part 4: Combined segmentation and recognition by Rob Fergus (MIT)
July 27, 2002 Image Processing for K.R. Precision1 Image Processing Training Lecture 1 by Suthep Madarasmi, Ph.D. Assistant Professor Department of Computer.
Chapter 8 Content-Based Image Retrieval. Query By Keyword: Some textual attributes (keywords) should be maintained for each image. The image can be indexed.
電腦視覺 Computer and Robot Vision I Chapter2: Binary Machine Vision: Thresholding and Segmentation Instructor: Shih-Shinh Huang 1.
Learning to estimate human pose with data driven belief propagation Gang Hua, Ming-Hsuan Yang, Ying Wu CVPR 05.
LOCUS (Learning Object Classes with Unsupervised Segmentation) A variational approach to learning model- based segmentation. John Winn Microsoft Research.
Contour Based Approaches for Visual Object Recognition Jamie Shotton University of Cambridge Joint work with Roberto Cipolla, Andrew Blake.
A Study of Approaches for Object Recognition
Problem Sets Problem Set 3 –Distributed Tuesday, 3/18. –Due Thursday, 4/3 Problem Set 4 –Distributed Tuesday, 4/1 –Due Tuesday, 4/15. Probably a total.
1Ellen L. Walker Matching Find a smaller image in a larger image Applications Find object / pattern of interest in a larger picture Identify moving objects.
Information Retrieval in Practice
Crash Course on Machine Learning
Binary Variables (1) Coin flipping: heads=1, tails=0 Bernoulli Distribution.
Computer vision.
Feature and object tracking algorithms for video tracking Student: Oren Shevach Instructor: Arie nakhmani.
Prakash Chockalingam Clemson University Non-Rigid Multi-Modal Object Tracking Using Gaussian Mixture Models Committee Members Dr Stan Birchfield (chair)
TP15 - Tracking Computer Vision, FCUP, 2013 Miguel Coimbra Slides by Prof. Kristen Grauman.
Learning and Recognizing Human Dynamics in Video Sequences Christoph Bregler Alvina Goh Reading group: 07/06/06.
#MOTION ESTIMATION AND OCCLUSION DETECTION #BLURRED VIDEO WITH LAYERS
Why is computer vision difficult?
Image Segmentation and Edge Detection Digital Image Processing Instructor: Dr. Cheng-Chien LiuCheng-Chien Liu Department of Earth Sciences National Cheng.
ECE 8443 – Pattern Recognition ECE 8423 – Adaptive Signal Processing Objectives: ML and Simple Regression Bias of the ML Estimate Variance of the ML Estimate.
MACHINE LEARNING 8. Clustering. Motivation Based on E ALPAYDIN 2004 Introduction to Machine Learning © The MIT Press (V1.1) 2  Classification problem:
Epitomic Location Recognition A generative approach for location recognition K. Ni, A. Kannan, A. Criminisi and J. Winn In proc. CVPR Anchorage,
CVPR2013 Poster Detecting and Naming Actors in Movies using Generative Appearance Models.
O BJ C UT M. Pawan Kumar Philip Torr Andrew Zisserman UNIVERSITY OF OXFORD.
Paper Reading Dalong Du Nov.27, Papers Leon Gu and Takeo Kanade. A Generative Shape Regularization Model for Robust Face Alignment. ECCV08. Yan.
Looking at people and Image-based Localisation Roberto Cipolla Department of Engineering Research team
Lecture 2: Statistical learning primer for biologists
Content-Based Image Retrieval QBIC Homepage The State Hermitage Museum db2www/qbicSearch.mac/qbic?selLang=English.
Efficient Belief Propagation for Image Restoration Qi Zhao Mar.22,2006.
Layer-finding in Radar Echograms using Probabilistic Graphical Models David Crandall Geoffrey C. Fox School of Informatics and Computing Indiana University,
Part 4: combined segmentation and recognition Li Fei-Fei.
776 Computer Vision Jan-Michael Frahm Spring 2012.
776 Computer Vision Jan-Michael Frahm Spring 2012.
Biointelligence Laboratory, Seoul National University
Data Mining, Neural Network and Genetic Programming
Segmentation of Dynamic Scenes
CH 5: Multivariate Methods
You can check broken videos in this slide here :
Paper Presentation: Shape and Matching
Dynamical Statistical Shape Priors for Level Set Based Tracking
Object detection as supervised classification
By: Kevin Yu Ph.D. in Computer Engineering
ECE 692 – Advanced Topics in Computer Vision
Computer Vision Lecture 16: Texture II
Learning to Combine Bottom-Up and Top-Down Segmentation
Learning Layered Motion Segmentations of Video
PRAKASH CHOCKALINGAM, NALIN PRADEEP, AND STAN BIRCHFIELD
Brief Review of Recognition + Context
Where did we stop? The Bayes decision rule guarantees an optimal classification… … But it requires the knowledge of P(ci|x) (or p(x|ci) and P(ci)) We.
Texture.
Transformation-invariant clustering using the EM algorithm
Foundation of Video Coding Part II: Scalar and Vector Quantization
Liyuan Li, Jerry Kah Eng Hoe, Xinguo Yu, Li Dong, and Xinqi Chu
Paper Reading Dalong Du April.08, 2011.
The EM Algorithm With Applications To Image Epitome
Unsupervised Perceptual Rewards For Imitation Learning
“Traditional” image segmentation
Presentation transcript:

LOCUS: Learning Object Classes with Unsupervised Segmentation 16-721: Advanced Perception Nik Melchior April 10, 2006

Learning Flexible Sprites in Video Layers (Jojic & Frey 2001) Outline Learning Flexible Sprites in Video Layers (Jojic & Frey 2001) LOCUS 4/10/06 Nik Melchior - LOCUS

Motivation Challenges Object category recognition and segmentation in cluttered scenes Challenges Spatial variability Texture variability Occlusion Pose Lighting 4/10/06 Nik Melchior - LOCUS slide: Kumar, Torr, Zisserman

Motivation Segmentation should (ideally) be Given an image and object category, to segment the object Object Category Model Segmentation Cow Image so far, we've seen attempts to do this using: - template-like outline matching - DT, Chamfer distance - HoG - fragments - ISM Segmented Cow Segmentation should (ideally) be shaped like the object e.g. cow-like obtained efficiently in an unsupervised manner able to handle self-occlusion 4/10/06 Nik Melchior - LOCUS slide: Fei-Fei

Flexible Sprites Provides a stable segmentation of whole moving objects in video Each flexible sprite is a separate layer containing the transformation of a rigid sprite 4/10/06 Nik Melchior - LOCUS

Big Picture Generative model Seeks the best explanation for the pixels of all images, constrained by number of layers Operates directly on pixels rather than low-level features Rigorous formulation similar to DDMCMC 4/10/06 Nik Melchior - LOCUS

Flexible Sprites User specifies number of layers Each layer contains: Sprite appearance 's' Real-valued mask 'm' Discretized simple transformation 'T' “permutation matrix that rearranges the pixels in s” 𝑥=𝑇𝑚∗𝑇𝑠𝑇 𝑚  ∗𝑏𝑛𝑜𝑖𝑠𝑒 4/10/06 Nik Melchior - LOCUS

Appearance, mask, and transform Zero-mean Gaussian noise Recursive formula for image appearance Formulation 𝑝𝑥∣ 𝑠 𝑙 , 𝑚 𝑙 , 𝑇 𝑙 =𝑁 𝑥; 𝐿   𝐿 𝑇 𝑖 𝑚 𝑖   ∗ 𝑇 𝑙 𝑚 𝑙 ∗ 𝑇 𝑙 𝑠 𝑙  , Assuming independence of each sprite class, appearance, mask, and transformation in each layer Transformation prior (uniform) Class prior (to be learned) Mask Appearance priors: - gaussian for s, m - uniform for T need to learn: - prior over c (pi) - means and variances of sprite appearances and masks - noise beta - uniform prior over transformation due in part to discretization used 𝑝𝑥, 𝑠 𝑙 , 𝑚 𝑙 , 𝑇 𝑙 =𝑁 𝑥; 𝐿   𝐿 𝑇 𝑖 𝑚 𝑖   ∗ 𝑇 𝑙 𝑚 𝑙 ∗ 𝑇 𝑙 𝑠 𝑙  , ∗ 𝐿 𝑁  𝑠 𝑙 ;  𝑐 𝑙 ,  𝑐 𝑙  𝑁  𝑚 𝑙 ;  𝑐 𝑙 ,  𝑐 𝑙   𝑐 𝑙  𝑇 𝑙  4/10/06 Nik Melchior - LOCUS

Formulation Use EM to maximize the posterior 𝑝  𝑐 𝑙 , 𝑇 𝑙 , 𝑠 𝑙 , 𝑚 𝑙 ∣𝑋 = 𝑝 𝑥, 𝑐 𝑙 , 𝑇 𝑙 , 𝑠 𝑙 , 𝑚 𝑙  𝑝 𝑥, 𝑐 𝑙 , 𝑇 𝑙 , 𝑠 𝑙 , 𝑚 𝑙  ≈ 𝐿 𝑞 𝑐 𝑙 , 𝑇 𝑙 𝑞 𝑠 𝑙 𝑞 𝑚 𝑙  exact posterior intractable - (CJ)^L - C = # classes - J = # transformations - L = # layers - factorize to make it linear CJL compute KL divergence of 2 distributions 𝐷=  𝐿 𝑞 𝑐 𝑙 , 𝑇 𝑙 𝑞 𝑠 𝑙 𝑞 𝑚 𝑙   =ln 𝑝  𝑐 𝑙 , 𝑇 𝑙 , 𝑠 𝑙 , 𝑚 𝑙 ∣𝑥 𝐿 𝑞  𝑐 𝑙 , 𝑇 𝑙 𝑞 𝑠 𝑙 𝑞 𝑚 𝑙  4/10/06 Nik Melchior - LOCUS

Formulation Maximize F 𝐹=𝐷ln𝑝𝑥 =  𝐿 𝑞 𝑐 𝑙 , 𝑇 𝑙 𝑞 𝑠 𝑙 𝑞 𝑚 𝑙   =ln 𝑝 𝑥, 𝑐 𝑙 , 𝑇 𝑙 , 𝑠 𝑙 , 𝑚 𝑙  𝐿 𝑞  𝑐 𝑙 , 𝑇 𝑙 𝑞 𝑠 𝑙 𝑞 𝑚 𝑙  Maximize F log of numerator is a sum of Mahalanobis distances q ln 1/q is the entropy of q 4/10/06 Nik Melchior - LOCUS

E step For each single image: the mean should stay close to the overall mean, but large where image does not match background variance should be high where the transformed image is similar to the background similar update for xi_T, the transform, which ensures that image matches transformed sprite 4/10/06 Nik Melchior - LOCUS

M step 4/10/06 Nik Melchior - LOCUS

Results 1 fps at 320x240 4/10/06 Nik Melchior - LOCUS

LOCUS Learning Object Classes with Unsupervised Segmentation 4/10/06 Nik Melchior - LOCUS

Comparison to Flexible Sprites Classes instead of single sprites more complex appearance model addition of edge model 2 layers: object and model weak prior of palettes (appearance) one for object, one for background 4/10/06 Nik Melchior - LOCUS

Background appearance λ0 LOCUS model Shared between images Class shape π Class edge sprite μo,σo Deformation field D Position & size T Different for each image Emphasise class model (edge + mask) (shared) – all other variables per-image. Emphasise LEARN EVERYTHING SIMULTANEOUSLY. Mask m Edge image e Background appearance λ0 Object appearance λ1 4/10/06 Nik Melchior - LOCUS Image Winn and Jojic, 2005

LOCUS model Deformation field D Markov Random Field Transform T scaling and translation rotation or full affine transform possible Mask m Edge model e Appearance model Gaussian mixture model (histogram of colors or textures) D: MRF penalises squared distance b/w neighboring vectors m: favors short segmentation boundaries e: not applied to background due to expected noise 4/10/06 Nik Melchior - LOCUS

EM formulation Posterior similar to flexible sprites 𝑃,  𝑜 ,  𝑜 ,  𝑏 ,  𝑏 ∣𝐼≈ 𝑖 𝑄   𝑖 𝑄  𝑖 𝑜 ,  𝑖 𝑜 𝑄  𝑖 𝑏 ,  𝑖 𝑏  Posterior similar to flexible sprites Iteratively optimize over 9 latent variables - mixture model parameters - binary mask - position and size transformation - deformation field - real-valued mask - object edge model - background edge model ,𝑚,𝑇,𝐷,,  𝑜 ,  𝑜 ,  𝑏 ,  𝑏 4/10/06 Nik Melchior - LOCUS

Results 4/10/06 Nik Melchior - LOCUS

Results Comparison to Borenstein fragments Percentages indicate correctly labeled pixels across all images Last row removes class shape and edge information 4/10/06 Nik Melchior - LOCUS

Results 4/10/06 Nik Melchior - LOCUS