
1 Towards Total Scene Understanding: Classification, Annotation and Segmentation in an Automatic Framework. Li-Jia Li, Richard Socher and Li Fei-Fei, Computer Vision and Pattern Recognition (CVPR) 2009. Presented by 錢雅馨 (N96994134, Institute of Engineering Science), 2011/01/16.

2 Outline: 1. Introduction 2. Hierarchical Generative Model 3. Automatic Learning 4. Inference: Classification, Annotation and Segmentation 5. Experimental Results 6. Conclusions

3 Outline: 1. Introduction 2. Hierarchical Generative Model 3. Automatic Learning 4. Inference: Classification, Annotation and Segmentation 5. Experimental Results 6. Conclusions

4 1. Introduction. This paper proposes a novel generative model for simultaneously recognizing and segmenting object and scene classes. Its key properties: robust representation of noisy data, flexible and automatic learning, and total scene understanding.

5 1. Introduction. Classification, annotation and segmentation are mutually beneficial!

6 1. Introduction. [Example image: classification gives class: Polo; annotation gives the tags Athlete, Horse, Grass, Trees, Sky, Saddle; segmentation outlines the Horse.]

7 1. Introduction. [Example image continued: classification gives class: Polo; annotation gives Athlete, Horse, Grass, Trees, Sky, Saddle; segmentation labels the regions Horse, Sky, Tree, Grass, Athlete.]

8 1. Introduction. [Complete example: the scene is classified as Polo, annotated with Athlete, Horse, Grass, Trees, Sky, Saddle, and segmented into the corresponding object regions.]

9 Related work: Tu et al. 2003 address annotation and segmentation; Li & Fei-Fei 2007 address classification and annotation; Heitz et al. 2008 address classification and segmentation. Each prior approach covers only two of the three tasks.

10 Outline: 1. Introduction 2. Hierarchical Generative Model 3. Automatic Learning 4. Inference: Classification, Annotation and Segmentation 5. Experimental Results 6. Conclusions

11 2. Hierarchical Generative Model: Generative Model. A generative model models p(x, y), or equivalently p(x|y)p(y); a discriminative model models p(y|x) directly. [Figure: 1-D illustration of the two views on the same data; from Prof. Antonio Torralba's course slides.]

12 2. Hierarchical Generative Model: Generative Model. Naïve Bayes model (c: class, w: visual words): p(c, w_1, ..., w_n) = p(c) ∏_n p(w_n | c). Once we have learnt these distributions, a query image is classified by c* = argmax_c p(c) ∏_n p(w_n | c). [Bayesian network: class node c with the visual words w_1, ..., w_n as children.]
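
To make the naïve Bayes step concrete, here is a minimal sketch in Python (not part of the original slides): the class prior and the per-class word distributions are random placeholders standing in for learned parameters, and `classify` simply applies Bayes' rule to a bag of visual-word indices.

```python
import numpy as np

# Minimal naive-Bayes sketch (illustrative, not the paper's model):
# p(c | w_1..w_n) is proportional to p(c) * prod_n p(w_n | c).
rng = np.random.default_rng(0)

C, V = 3, 50                         # number of classes, visual-word vocabulary size
prior = np.full(C, 1.0 / C)          # p(c), assumed uniform here
word_given_class = rng.dirichlet(np.ones(V), size=C)   # p(w | c), one row per class

def classify(word_ids):
    """Return the most probable class for a bag of visual-word indices."""
    log_post = np.log(prior).copy()
    for c in range(C):
        log_post[c] += np.log(word_given_class[c, word_ids]).sum()
    return int(np.argmax(log_post))

query = rng.integers(0, V, size=20)  # stand-in for a quantized query image
print("predicted class:", classify(query))
```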

13 2. Hierarchical Generative Model: Generative Model, Another Example. Gaussian mixture model: how do we infer the parameters from unlabeled data, even if we know the underlying structure of the probability distribution?
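
As a concrete illustration (assumed tooling, not from the slides): the standard answer to that question is the EM algorithm, sketched here with scikit-learn's GaussianMixture on synthetic 1-D data.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Sketch: infer mixture components from unlabeled 1-D data with EM.
# The two "true" components below are synthetic; in practice the data are
# unlabeled image features and only the number of components is assumed.
rng = np.random.default_rng(0)
data = np.concatenate([rng.normal(-2.0, 0.5, 300),
                       rng.normal( 3.0, 1.0, 700)]).reshape(-1, 1)

gmm = GaussianMixture(n_components=2, random_state=0).fit(data)
print("means:  ", gmm.means_.ravel())
print("weights:", gmm.weights_)
print("posterior of first point:", gmm.predict_proba(data[:1]))
```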

14 2. Hierarchical Generative Model: A Graphical Model. A directed graph: nodes represent variables, links show dependencies, and each node carries a conditional distribution. [Example: object class c with prior P(c); hidden mean μ with P(μ|c); hidden inverse variance γ with P(γ|c); observed data x with P(x|μ,γ).]
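
A tiny sketch of what this directed model means operationally, using ancestral sampling; every distribution and parameter value below is made up purely for illustration.

```python
import numpy as np

# Ancestral sampling from the small directed model on the slide:
# c ~ P(c),  mu ~ P(mu | c),  gamma ~ P(gamma | c),  x ~ P(x | mu, gamma).
rng = np.random.default_rng(1)

p_c        = np.array([0.4, 0.6])   # P(c) over two object classes
mu_given_c = np.array([0.0, 5.0])   # mean of P(mu | c)
gm_given_c = np.array([2.0, 0.5])   # shape of the Gamma prior on the precision

def sample_once():
    c     = rng.choice(len(p_c), p=p_c)              # class
    mu    = rng.normal(mu_given_c[c], 1.0)           # hidden mean
    gamma = rng.gamma(gm_given_c[c], 1.0)            # hidden inverse variance
    x     = rng.normal(mu, 1.0 / np.sqrt(gamma))     # observed data point
    return c, x

samples = [sample_once() for _ in range(5)]
print(samples)
```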

15 2. Hierarchical Generative Model: Spatial Latent Topic Model (Unsupervised). Learning maximizes the log-likelihood, with a Dirichlet prior over the multinomial topic distributions; the resulting optimization problem has no tractable closed-form solution.

16 2. Hierarchical Generative Model: Spatial Latent Topic Model (Supervised). For a query image I_d, find its most probable category: c* = argmax_c p(c | I_d). The topic-mixture parameter θ now becomes a C x K matrix, i.e. θ depends on the observed class c.
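
A hedged sketch of this query step: assume the class-specific topic mixtures θ (C x K) and the topic-word distributions β have already been learned (random placeholders here), and score a bag of visual words for each class.

```python
import numpy as np

# Sketch of the supervised query step: theta is a C x K matrix (one topic
# mixture per class), beta holds K topic-word distributions. Both are assumed
# to have been learned already; values here are random placeholders.
rng = np.random.default_rng(2)
C, K, V = 4, 10, 200
theta = rng.dirichlet(np.ones(K), size=C)   # p(topic | class), shape (C, K)
beta  = rng.dirichlet(np.ones(V), size=K)   # p(word  | topic), shape (K, V)
prior = np.full(C, 1.0 / C)                 # p(class)

def classify(word_ids):
    """c* = argmax_c p(c) * prod_n sum_k theta[c, k] * beta[k, w_n]."""
    word_probs = theta @ beta[:, word_ids]           # (C, n): p(w_n | c)
    scores = np.log(prior) + np.log(word_probs).sum(axis=1)
    return int(np.argmax(scores))

query = rng.integers(0, V, size=50)
print("most probable category:", classify(query))
```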

17 2. Hierarchical Generative Model. [Plate diagram of the full model: scene class C; for each of the N_r regions R, an object O, its appearance A_r and N_F patch features X; for each of the N_t tags T, a connector variable Z and a switch variable S; the outer plate runs over the D images. Example tags: Athlete, Horse, Grass, Trees, Sky, Saddle.]

18 2. Hierarchical Generative Model. The joint distribution of the random variables splits into a visual component and a text component, both rooted at the scene class C. [Example: class Polo with tags Athlete, Horse, Grass, Trees, Sky, Saddle; the plate runs over the D images.]

19 2. Hierarchical Generative Model. [Model build-up: the scene class C (e.g. Polo) generates the latent object variables O in the visual component.]

20 2. Hierarchical Generative Model. [Model build-up: each object O generates an image region R, described by color, location, texture and shape features.]

21 2. Hierarchical Generative Model. [Model build-up: each region is further described by its appearance A_r and by N_F patch-level features X.]

22 2. Hierarchical Generative Model. [Model build-up: the text component adds the N_t tags (e.g. Athlete, Horse, Grass, Trees, Sky, Saddle); each tag has a connector variable Z linking it to one of the N_r object regions.]

23 2. Hierarchical Generative Model. [Model build-up: each tag also has a switch variable S marking whether it is visually relevant (visible in the image, e.g. Horse, Athlete) or not visible.]

24 2. Hierarchical Generative Model. [Complete model: the observed tag T (e.g. Horse) is generated through its connector variable Z and switch variable S.]

25 2. Hierarchical Generative Model. The model jointly represents image features, object regions, and both visually relevant and irrelevant tags.

26 Outline: 1. Introduction 2. Hierarchical Generative Model 3. Automatic Learning 4. Inference: Classification, Annotation and Segmentation 5. Experimental Results 6. Conclusions

27 3. Automatic Learning. The framework learns automatically from Internet images and their tags (e.g. from flickr.com), offering a scalable approach that requires no additional human labor.

28 3. Automatic Learning. Exact inference in this model is intractable. [Diagram: the relationships among the random variables C, O, R, X, A_r, Z, S and T across the visual and text components.]

29 3. Automatic Learning. Learning therefore uses collapsed Gibbs sampling (R. Neal, 2000). [Diagram: the sampler combines a top-down force from the scene class with bottom-up forces from the visual information and from the text information.]
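
For intuition, here is a simplified collapsed Gibbs sampler for a plain LDA-style topic model; it is a stand-in only, since the paper's sampler also conditions on the class, connector and switch variables, and all hyperparameters below are illustrative.

```python
import numpy as np

# Simplified collapsed Gibbs sampler for an LDA-style topic model.
rng = np.random.default_rng(3)
K, V, alpha, eta = 5, 100, 0.5, 0.01
docs = [rng.integers(0, V, size=40) for _ in range(20)]   # toy corpus of word ids

z = [rng.integers(0, K, size=len(d)) for d in docs]       # initial topic assignments
ndk = np.zeros((len(docs), K)); nkv = np.zeros((K, V)); nk = np.zeros(K)
for d, doc in enumerate(docs):
    for i, w in enumerate(doc):
        ndk[d, z[d][i]] += 1; nkv[z[d][i], w] += 1; nk[z[d][i]] += 1

for sweep in range(50):
    for d, doc in enumerate(docs):
        for i, w in enumerate(doc):
            k = z[d][i]                                   # remove current assignment
            ndk[d, k] -= 1; nkv[k, w] -= 1; nk[k] -= 1
            # p(z_i = k | rest) proportional to (ndk + alpha) * (nkv + eta) / (nk + V*eta)
            p = (ndk[d] + alpha) * (nkv[:, w] + eta) / (nk + V * eta)
            k = rng.choice(K, p=p / p.sum())              # resample the topic
            z[d][i] = k
            ndk[d, k] += 1; nkv[k, w] += 1; nk[k] += 1

print("topic counts for the first documents:\n", ndk[:3])
```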

30 3. Automatic Learning. Step 1, obtain candidate tags: reduce the number of tags by keeping only words that belong to the 'physical entity' group. Step 2, initialize objects: select initialization images, obtain initial object models, and annotate the scene images. Step 3, automatic learning: add more Flickr images and their tags to jointly train the model.
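
A possible implementation of the Step 1 tag filter, assuming WordNet is used to test membership in the 'physical entity' group (the slide does not say which tool the authors used); sketched with NLTK.

```python
from nltk.corpus import wordnet as wn   # requires: nltk.download('wordnet')

PHYSICAL_ENTITY = wn.synset('physical_entity.n.01')

def is_physical_entity(tag):
    """Keep a tag if any noun sense has 'physical entity' among its hypernyms."""
    for syn in wn.synsets(tag, pos=wn.NOUN):
        hypernyms = set(syn.closure(lambda s: s.hypernyms()))
        if PHYSICAL_ENTITY in hypernyms:
            return True
    return False

raw_tags = ['horse', 'athlete', 'grass', 'happiness', '2009', 'saddle']
candidate_tags = [t for t in raw_tags if is_physical_entity(t)]
print(candidate_tags)   # e.g. ['horse', 'athlete', 'grass', 'saddle']
```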

31 Outline: 1. Introduction 2. Hierarchical Generative Model 3. Automatic Learning 4. Inference: Classification, Annotation and Segmentation 5. Experimental Results 6. Conclusions

32 4. Classification, Annotation and Segmentation. Classification: use the visual component of the model to compute the probability of each scene class, integrating out the latent objects. Annotation: given an unknown image, annotation tags are extracted from the segmentation results. Segmentation: infer the exact pixel locations of each of the objects in the scene.
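
A toy sketch of how the three outputs could be read off once inference has produced per-region object posteriors and a region map; every array below is a placeholder, not the paper's exact inference.

```python
import numpy as np

# Toy sketch: derive the three outputs from per-region object posteriors.
rng = np.random.default_rng(4)
objects = ['horse', 'athlete', 'grass', 'sky', 'tree']
classes = ['polo', 'sailing', 'rowing']

n_regions = 6
p_obj_given_region = rng.dirichlet(np.ones(len(objects)), size=n_regions)     # (R, O)
p_obj_given_class  = rng.dirichlet(np.ones(len(objects)), size=len(classes))  # (C, O)
region_of_pixel = rng.integers(0, n_regions, size=(8, 8))   # toy over-segmentation

# Classification: combine each region's object posterior with p(object | class).
class_region_score = p_obj_given_class @ p_obj_given_region.T     # (C, R)
scene = classes[int(np.argmax(np.log(class_region_score).sum(axis=1)))]

# Annotation: the object names assigned to regions become the tags.
region_obj = p_obj_given_region.argmax(axis=1)
tags = sorted({objects[o] for o in region_obj})

# Segmentation: every pixel inherits the object label of its region.
segmentation = np.array(objects)[region_obj][region_of_pixel]

print(scene, tags, segmentation[0, :4])
```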

33 4. Classification, Annotation and Segmentation. The comparison between the results in the first two columns underscores how the top-down classification contextually facilitates the annotation and segmentation tasks.

34 Outline: 1. Introduction 2. Hierarchical Generative Model 3. Automatic Learning 4. Inference: Classification, Annotation and Segmentation 5. Experimental Results 6. Conclusions

35 5. Experimental Results. Comparison of classification results.

36 5. Experimental Results. Comparison of precision and recall values for annotation.

37 5. Experimental Results. Results of segmentation on seven object categories and their mean values.

38 Outline: 1. Introduction 2. Hierarchical Generative Model 3. Automatic Learning 4. Inference: Classification, Annotation and Segmentation 5. Experimental Results 6. Conclusions

39 6. Conclusion. This paper develops a hierarchical model that unifies patch-level, object-level, and scene-level information. The work relates to several research areas: image understanding using contextual information; machine translation between words and images; simultaneous object recognition and segmentation; and learning semantic visual models from Internet data.

40 Thank You! Q & A

