BUILDING AND USING A SEMANTIVISUAL IMAGE HIERARCHY
Li-Jia Li, Yongwhan Lim, Li Fei-Fei, Chong Wang, David M. Blei
CVPR, 2010

OUTLINE
• Introduction
• Building the hierarchy
  • Graphical model
  • Learning
• Semantivisual image hierarchy
  • Implementation
  • Visualizing the semantivisual hierarchy
  • Quantitative evaluation
• Application
  • Annotation
  • Labeling
  • Classification

INTRODUCTION
• A meaningful image hierarchy makes organizing, browsing, and searching images more convenient and effective.
• A good image hierarchy can serve as a knowledge ontology for end tasks such as image retrieval, annotation, and classification.
• Existing kinds of hierarchy:
  • Language-based
  • Low-level visual feature based

BUILDING THE HIERARCHY
• A multi-modal model represents images and textual tags on the semantivisual hierarchy.
• Each image is associated with a path of the hierarchy, and its regions can be assigned to different nodes along that path.
• Each image is decomposed into a set of over-segmented regions R = [R_1, …, R_r, …, R_N]; each of the N regions is characterized by four appearance features.

BUILDING THE HIERARCHY
• Graphical model
  • Each image-text pair (R, W) is assigned to a path C_c = [C_c1, …, C_cl, …, C_cL], with one node per level l = 1, …, L.

BUILDING THE HIERARCHY
• Learning the semantivisual image hierarchy
  • Input: a set of unorganized images and the user tags associated with them.
  • Gibbs sampling draws the concept index Z, the coupling variable S, and the path C.
• Sampling Z: the conditional distribution depends on
  1) the likelihood of the region appearance,
  2) the likelihood of the tags associated with this region, and
  3) the concept indices of the other regions in the same image-text pair.
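
The three factors listed for sampling Z can be illustrated with a small sketch. It is not the authors' sampler: it assumes the factors simply multiply, that the third factor enters as a smoothed count of the other regions at each level, and that `alpha` is a hypothetical smoothing constant. All names and numbers are invented for illustration.

```python
import random


def sample_level(level_counts, appearance_lik, tag_lik, alpha=1.0, rng=random):
    """Draw a level (concept index) for one region.

    level_counts[l]   : number of OTHER regions of the same image-text pair at level l
    appearance_lik[l] : likelihood of this region's appearance under the node at level l
    tag_lik[l]        : likelihood of the tags tied to this region under that node
    alpha             : hypothetical smoothing constant (an assumption, not from the slides)
    """
    # Assumed form: unnormalized weight = (count + alpha) * appearance * tag likelihood.
    weights = [
        (n + alpha) * a * t
        for n, a, t in zip(level_counts, appearance_lik, tag_lik)
    ]
    total = sum(weights)
    probs = [w / total for w in weights]
    level = rng.choices(range(len(probs)), weights=probs, k=1)[0]
    return level, probs


# Toy numbers for a 4-level path (made up for illustration).
level, probs = sample_level(
    level_counts=[3, 1, 0, 0],
    appearance_lik=[0.2, 0.5, 0.1, 0.1],
    tag_lik=[0.1, 0.4, 0.3, 0.2],
    rng=random.Random(0),
)
```

With these toy numbers the second level receives most of the probability mass, because both of its likelihood terms are the largest even though fewer of the other regions sit there.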

BUILDING THE HIERARCHY
• Sampling S: the conditional distribution depends solely on the likelihood of the tag.
• Sampling C: the conditional distribution is influenced by the previous arrangement of the hierarchy and by the likelihood of the image-text pair. It is proportional to the prior probability induced by the nCRP (nested Chinese restaurant process) times that likelihood.
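
The path update can be sketched as "nCRP prior times likelihood". The code below is a simplified illustration, not the authors' implementation: the tree is stored as a hypothetical `counts` dictionary, `gamma` is the usual CRP concentration parameter with an arbitrary value, and the likelihood is a placeholder that a real sampler would compute from the regions and tags.

```python
import math
import random


def ncrp_path_priors(counts, depth, gamma=1.0):
    """Return {candidate: prior} under a nested Chinese restaurant process.

    counts maps a node (tuple of child indices from the root) to the number of
    images whose path passes through it; the root is the empty tuple and every
    stored count is assumed to be positive.
    A candidate ending in "new" means: open a fresh branch below that node.
    """
    priors = {}

    def walk(node, prob):
        if len(node) == depth:  # complete existing path
            priors[node] = prob
            return
        denom = counts[node] + gamma
        # CRP rule: a new child is chosen with probability gamma / (n + gamma).
        priors[node + ("new",)] = prob * gamma / denom
        for child, n in counts.items():
            if len(child) == len(node) + 1 and child[:-1] == node:
                # An existing child is chosen in proportion to its count.
                walk(child, prob * n / denom)

    walk((), 1.0)
    return priors


def sample_path(priors, log_lik, rng=random):
    """Sample a candidate with probability proportional to prior * likelihood."""
    cands = list(priors)
    log_w = [math.log(priors[c]) + log_lik[c] for c in cands]
    top = max(log_w)  # subtract the maximum for numerical stability
    weights = [math.exp(w - top) for w in log_w]
    return rng.choices(cands, weights=weights, k=1)[0]


# Toy tree: 10 images, two branches below the root (numbers are made up).
counts = {(): 10, (0,): 6, (1,): 4, (0, 0): 6, (1, 0): 3, (1, 1): 1}
priors = ncrp_path_priors(counts, depth=2, gamma=1.0)
log_lik = {c: 0.0 for c in priors}  # placeholder: flat likelihood
path = sample_path(priors, log_lik, rng=random.Random(0))
```

The priors over all candidates (existing paths plus every "new branch" option) sum to one, which is a quick sanity check for this kind of enumeration.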

A SEMANTIVISUAL IMAGE HIERARCHY -- Implementation
• The dataset contains 4,000 user-uploaded images and 538 unique user tags.
• Each image is divided into small patches of 10×10 pixels.
• Each patch is assigned to a codeword in a codebook of 500 visual words obtained by K-means.
• Four region codebooks are obtained, one each for color (HSV histogram), location, texture, and normalized SIFT histogram.
• To speed up learning, we initialize the levels in a path according to tf-idf scores.
• We obtain a hierarchy of 121 nodes, 4 levels, and 53 paths.
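
The patch-to-codeword step can be sketched as follows. This is a toy illustration under several assumptions: patches are non-overlapping, each patch is described by its raw grayscale intensities rather than a real descriptor, and the codebook is given (in the slides it comes from K-means with 500 words; here it has two hand-made entries). Function names are hypothetical.

```python
def extract_patches(image, size=10):
    """Split a 2-D grayscale image (list of rows) into non-overlapping size x size patches."""
    rows, cols = len(image), len(image[0])
    patches = []
    for top in range(0, rows - size + 1, size):
        for left in range(0, cols - size + 1, size):
            # Flatten the patch row by row into one vector.
            patch = [
                image[r][c]
                for r in range(top, top + size)
                for c in range(left, left + size)
            ]
            patches.append(patch)
    return patches


def nearest_codeword(vector, codebook):
    """Index of the codeword closest to vector in squared Euclidean distance."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return min(range(len(codebook)), key=lambda k: sq_dist(vector, codebook[k]))


def bag_of_words(image, codebook, size=10):
    """Histogram of codeword assignments over all patches of the image."""
    hist = [0] * len(codebook)
    for patch in extract_patches(image, size):
        hist[nearest_codeword(patch, codebook)] += 1
    return hist


# Toy data: a 20x20 image whose left half is dark and right half is bright,
# and a 2-word codebook (one dark word, one bright word).
image = [[0] * 10 + [255] * 10 for _ in range(20)]
codebook = [[0] * 100, [255] * 100]
hist = bag_of_words(image, codebook)
```

The toy image yields four patches, two matching each codeword; with a real codebook the same histogram would have 500 bins.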

A SEMANTIVISUAL IMAGE HIERARCHY -- Visualizing the Semantivisual Hierarchy
• The learned hierarchy exhibits a general-to-specific relationship.
• Purely visual information cannot provide a meaningful image hierarchy.
• A purely language-based hierarchy would miss close connections.

A SEMANTIVISUAL IMAGE HIERARCHY -- A Quantitative Evaluation of Image Hierarchies
• Good clustering of images that share similar concepts: images along the same path should be annotated with more or less similar tags.
• Good hierarchical structure within a path: the images and their associated tags at different levels of the path should demonstrate good general-to-specific relationships.
• For the evaluation, a path of L levels is selected from the hierarchy.

APPLICATION -- Hierarchical Annotation of Images
• Given our learned image ontology, we can propose a hierarchical annotation for an unlabeled query image.
• nCRP does not perform well on sparse tag words: its proposed hierarchy assigns many words to the root node, resulting in very few paths.
• A simple clustering algorithm such as KNN cannot find a good association between the test images and the training images in our challenging dataset with large visual diversity.
• In contrast, our model learns an accurate association of visual and text data simultaneously.

APPLICATION -- Image Labeling
• Serving as an image and text knowledge ontology, our semantivisual hierarchy and model can also be used for image labeling without a hierarchical relation.
• The top 5 predicted words are collected for each image.
• Our model captures the hierarchical structure of images and tags.
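
Collecting the top 5 predicted words amounts to a top-k selection over per-word scores. A minimal sketch, assuming the model exposes some score per candidate word; the `word_scores` dictionary and its values are invented for illustration.

```python
import heapq


def top_words(word_scores, k=5):
    """Return the k words with the highest predicted score, best first."""
    best = heapq.nlargest(k, word_scores.items(), key=lambda kv: kv[1])
    return [word for word, _ in best]


# Hypothetical scores for one image (not from the slides).
word_scores = {
    "sky": 0.9, "water": 0.8, "boat": 0.7, "tree": 0.4,
    "sand": 0.3, "car": 0.1, "dog": 0.05,
}
top = top_words(word_scores)
```

The resulting flat list ignores the levels of the path, which matches the "labeling without a hierarchical relation" setting described on the slide.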

APPLICATION -- Image Classification
• Another 4,000 images are held out as test images.
• By encoding semantic meaning into the hierarchy, our semantivisual hierarchy delivers a more descriptive structure, which could be helpful for classification.

CONCLUSION
• Images and their tags are used to construct a meaningful hierarchy that organizes images in a general-to-specific structure.
• Our quantitative evaluation by human subjects shows that our hierarchy is more meaningful and accurate than the other hierarchies compared.