Li-Jia Li, Yongwhan Lim, Li Fei-Fei, Chong Wang, David M. Blei. Building and Using a Semantivisual Image Hierarchy. CVPR, 2010.


1 Li-Jia Li, Yongwhan Lim, Li Fei-Fei, Chong Wang, David M. Blei. Building and Using a Semantivisual Image Hierarchy. CVPR, 2010

2 Outline
- Introduction
- Building the hierarchy
  - Graphical model
  - Learning
- Semantivisual image hierarchy
  - Implementation
  - Visualizing the semantivisual hierarchy
  - Quantitative evaluation
- Application
  - Annotation
  - Labeling
  - Classification

3 Introduction
- For images, a meaningful hierarchy can make organization, browsing, and searching more convenient and effective.
- Good image hierarchies can serve as knowledge ontologies for end tasks such as image retrieval, annotation, or classification.
- Existing image hierarchies:
  - Language-based
  - Low-level visual feature based

4 Building the Hierarchy
- Use a multi-modal model to represent images and textual tags on the semantivisual hierarchy.
- Each image is associated with a path of the hierarchy, and its regions can be assigned to different nodes of that path.
- Each image is decomposed into a set of over-segmented regions R = [R_1, …, R_r, …, R_N]; each of the N regions is characterized by four appearance features.

5 Building the Hierarchy
- Graphical model
- Each image-text pair (R, W) is assigned to a path C_c = [C_c1, …, C_cl, …, C_cL].

6 Building the Hierarchy
- Learning the semantivisual image hierarchy
  - Given: a set of unorganized images and the user tags associated with them.
  - Gibbs sampling: sample the concept index Z, the coupling variable S, and the path C.
- Sampling Z depends on:
  1) the likelihood of the region appearance
  2) the likelihood of the tags associated with this region
  3) the concept indices of the other regions in the same image-text pair
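The three factors listed for sampling Z can be read as a product of unnormalized weights, one per level of the image's path. The following is a minimal sketch under that reading, not the authors' exact conditional: the function names, the smoothing constant `alpha`, and the assumption that the two likelihoods are already computed per level are all illustrative.

```python
import random

def level_weights(app_lik, tag_lik, other_counts, alpha=1.0):
    """Unnormalized weight of each level for one region's concept index Z.

    app_lik[l]      : likelihood of the region's appearance under node l (assumed given)
    tag_lik[l]      : likelihood of the region's tag under node l (assumed given)
    other_counts[l] : how many other regions of the same image-text pair sit at level l
    alpha           : hypothetical smoothing constant, not taken from the slides
    """
    return [(other_counts[l] + alpha) * app_lik[l] * tag_lik[l]
            for l in range(len(app_lik))]

def sample_level(app_lik, tag_lik, other_counts, alpha=1.0, rng=random):
    """Draw a level index with probability proportional to its weight."""
    weights = level_weights(app_lik, tag_lik, other_counts, alpha)
    return rng.choices(range(len(weights)), weights=weights, k=1)[0]
```

In a full sampler this draw would be repeated for every region of every image, with `other_counts` updated after each draw.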

7 Building the Hierarchy
- Sampling S
  - Its conditional distribution depends solely on the likelihood of the tag.
- Sampling C
  - Influenced by the previous arrangement of the hierarchy (the prior probability induced by the nCRP) and by the likelihood of the image-text pair.
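The nCRP prior mentioned for sampling C can be illustrated with the standard nested Chinese restaurant process: at each node, an existing child is chosen in proportion to how many images already pass through it, and a new child is opened in proportion to a concentration parameter. The sketch below covers only this prior; in the sampler described above it would be multiplied by the likelihood of the image-text pair. The `Node` class, the function names, and the parameter `gamma` are assumptions made for illustration.

```python
import random

class Node:
    def __init__(self):
        self.count = 0       # number of images whose path passes through this node
        self.children = []

def child_probs(node, gamma):
    """nCRP prior over the next step: one entry per existing child, then a new child."""
    total = sum(c.count for c in node.children) + gamma
    probs = [c.count / total for c in node.children]
    probs.append(gamma / total)
    return probs

def sample_path(root, depth, gamma, rng=random):
    """Draw a path of `depth` nodes from the nCRP prior and update the counts."""
    root.count += 1
    path = [root]
    node = root
    for _ in range(depth - 1):
        probs = child_probs(node, gamma)
        k = rng.choices(range(len(probs)), weights=probs, k=1)[0]
        if k == len(node.children):      # the last entry means "open a new child"
            node.children.append(Node())
        node = node.children[k]
        node.count += 1
        path.append(node)
    return path
```

Calling `sample_path(root, 4, gamma)` once per image grows a four-level tree in which popular branches attract more images, which is the "previous arrangement of the hierarchy" effect.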

8 A Semantivisual Image Hierarchy -- Implementation
- 4,000 user-uploaded images and 538 unique user tags.
- Each image is divided into small patches of 10×10 pixels.
- Each patch is assigned to a codeword in a codebook of 500 visual words obtained by K-means.
- Four region codebooks are obtained: color (HSV histogram), location, texture, and normalized SIFT histogram.
- To speed up learning, the levels in a path are initialized according to tf-idf score.
- The resulting hierarchy has 121 nodes, 4 levels, and 53 paths.
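The tf-idf initialization can be sketched as follows. The slide does not say which items are scored or how scores map to levels, so this example makes two loudly labeled assumptions: the items are generic tokens (tags or region codewords), and low-scoring (common) items start near the root while high-scoring (distinctive) items start near the leaf. The function names are hypothetical.

```python
import math
from collections import Counter

def tfidf(docs):
    """docs: one list of items per image. Returns one {item: tf-idf} dict per image."""
    n_docs = len(docs)
    # Document frequency: in how many images each item appears at least once.
    df = Counter(item for doc in docs for item in set(doc))
    scores = []
    for doc in docs:
        tf = Counter(doc)
        scores.append({item: (c / len(doc)) * math.log(n_docs / df[item])
                       for item, c in tf.items()})
    return scores

def initial_levels(scores, n_levels):
    """Assumed mapping: rank items by score and spread them over levels 0..n_levels-1."""
    ranked = sorted(scores, key=lambda item: (scores[item], item))
    n = len(ranked)
    return {item: min(n_levels - 1, i * n_levels // n)
            for i, item in enumerate(ranked)}
```

With this mapping, an item that occurs in every image scores zero and starts at level 0, in line with the general-to-specific ordering the hierarchy is meant to capture.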

9 A Semantivisual Image Hierarchy -- Visualizing the Semantivisual Hierarchy
- General-to-specific relationship
- Purely visual information cannot provide a meaningful image hierarchy.
- A purely language-based hierarchy would miss close connections.

10 A Semantivisual Image Hierarchy -- A Quantitative Evaluation of Image Hierarchies
- Good clustering of images that share similar concepts: images along the same path should be annotated with more or less similar tags.
- Good hierarchical structure given a path: images and their associated tags at different levels of the path should demonstrate good general-to-specific relationships.
- A path of L levels is selected from the hierarchy.

11 Application -- Hierarchical Annotation of Images
- Given our learned image ontology, we can propose a hierarchical annotation of an unlabeled query image.
- nCRP does not perform well on sparse tag words: its proposed hierarchy assigns many words to the root node, resulting in very few paths.
- A simple clustering algorithm such as KNN cannot find a good association between the test images and the training images in our challenging dataset, which has large visual diversity.
- In contrast, our model learns an accurate association of visual and text data simultaneously.

12 Application -- Image Labeling
- Serving as an image and text knowledge ontology, our semantivisual hierarchy and model can be used for image labeling without a hierarchical relation.
- Collect the top 5 predicted words of each image.
- Our model captures the hierarchical structure of images and tags.

13 Application -- Image Classification
- Another 4,000 images are held out as test images.
- By encoding semantic meaning into the hierarchy, our semantivisual hierarchy delivers a more descriptive structure, which could be helpful for classification.

14 Conclusion
- Use images and their tags to construct a meaningful hierarchy that organizes images in a general-to-specific structure.
- Our quantitative evaluation by human subjects shows that our hierarchy is more meaningful and accurate than others.

