Towards Semantic Embedding in Visual Vocabulary

Towards Semantic Embedding in Visual Vocabulary
The Twenty-Third IEEE Conference on Computer Vision and Pattern Recognition
Session: Object Recognition II: using Language, Tuesday, 15 June 2010
Rongrong Ji 1, Hongxun Yao 1, Xiaoshuai Sun 1, Bineng Zhong 1, and Wen Gao 1,2
1 Visual Intelligence Laboratory, Harbin Institute of Technology
2 School of Electronic Engineering and Computer Science, Peking University

Overview
✴ Problem Statement
✴ Building patch-labeling correspondence
✴ Generative semantic embedding
✴ Experimental comparisons
✴ Conclusions

Introduction
Building visual codebooks
–Quantization-based approaches: K-Means, Vocabulary Tree, Approximate K-Means, etc.
–Feature-indexing-based approaches: K-D Tree, R Tree, Ball Tree, etc.
Refining visual codebooks
–Topic model decompositions: pLSA, LDA, GMM, etc.
–Spatial refinement: visual pattern mining, discriminative visual phrases, etc.
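
To make the quantization-based approach concrete, here is a minimal sketch of how a codebook, once built, turns local descriptors into a bag-of-visual-words histogram. The function names (`quantize`, `bag_of_words`) and the toy 2-D descriptors are illustrative assumptions, not part of the talk; real SIFT descriptors are 128-dimensional and the codebook would come from K-Means or a similar method.

```python
import math
from collections import Counter


def l2(a, b):
    """Euclidean distance between two descriptors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def quantize(descriptor, codebook):
    """Return the index of the nearest codeword (visual word)."""
    return min(range(len(codebook)), key=lambda k: l2(descriptor, codebook[k]))


def bag_of_words(descriptors, codebook):
    """Histogram of visual-word counts for one image."""
    counts = Counter(quantize(d, codebook) for d in descriptors)
    return [counts.get(k, 0) for k in range(len(codebook))]


# Toy data (hypothetical): two codewords, three descriptors from one image.
codebook = [(0.0, 0.0), (10.0, 10.0)]
image_descriptors = [(0.5, -0.2), (9.0, 11.0), (10.5, 9.5)]
histogram = bag_of_words(image_descriptors, codebook)  # [1, 2]
```

A linear scan over the codebook is used here for clarity; the indexing structures named on the slide (K-D Tree, Vocabulary Tree) exist to speed up exactly this nearest-codeword lookup.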

Introduction
With the prosperity of Web communities: user tags
Our contribution

Introduction
Related works
–Semi-supervised clustering
–Data compression
–Learning to build codebooks
–Learning to refine codebooks
The compression task tries to learn a codebook that minimizes the quality loss between the original and the compressed data
Dual Clustering, MRF, Co-Clustering
Include category labels in quantization
Adaptive Vocabulary, ERC-Forest, MIL minimization
Merge or split the original codewords under supervision, using category labels

Introduction
Problems to achieve this goal
–Labels are given only at the image level
–Correlative semantic labels (e.g. Flower, Rose)
–Model generality
Example image: a traditional San Francisco street view with cars and buildings

Introduction
Our contribution
–Publish a ground-truth patch-labeling set
–A generalized semantic embedding framework
  Easy to deploy into different codebook models
–Modeling label correlations
  Model correlative tagging in semantic embedding

Introduction
The proposed framework
–Build patch-level correspondence
–Supervised visual codebook construction

Building Patch-Labeling Correspondence
Collecting over 60,000 Flickr photos
–For each photo: DoG detection + SIFT
–“Face” labels (for instance)

Building Patch-Labeling Correspondence
Purify the patch-labeling correspondences
–Density-Diversity Estimation (DDE)
–Formulation
  For a semantic label $s_i$, its initial correspondence set consists of:
  –Patches extracted from images with label $s_i$
  –A correspondence link (le) from the label to each local patch
  Purify the le links in this set

Building Patch-Labeling Correspondence
Density
–For a given patch, Density reveals its representability for label $s_i$
  Based on the average neighborhood distance in m neighbors
Diversity
–For a given patch, Diversity reveals its uniqueness score for label $s_i$
  $n_m$: number of images in the neighborhood
  $n_i$: total number of images with label $s_i$

Building Patch-Labeling Correspondence
High T: favors precision over recall
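
The transcript does not preserve the Density and Diversity formulas, so the following is only a sketch of how a DDE-style purification could work under stated assumptions: density is taken as `exp(-average distance to the m nearest patches)`, diversity as `n_m / n_i`, the two are multiplied, and a link is kept when the product reaches the threshold T. The function names and the exact way the two terms are combined are hypothetical, not the authors' formulation.

```python
import math


def l2(a, b):
    """Euclidean distance between two descriptors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dde_scores(patches, image_ids, m):
    """Score every patch linked to one label.

    patches:   descriptors of all patches initially linked to the label
    image_ids: source image of each patch (same order as patches)
    m:         neighborhood size
    """
    n_i = len(set(image_ids))  # total number of images with this label
    scores = []
    for idx, p in enumerate(patches):
        others = [j for j in range(len(patches)) if j != idx]
        neighbors = sorted(others, key=lambda j: l2(p, patches[j]))[:m]
        if not neighbors:  # a lone patch has no neighborhood to judge it by
            scores.append(0.0)
            continue
        avg_dist = sum(l2(p, patches[j]) for j in neighbors) / len(neighbors)
        density = math.exp(-avg_dist)  # assumed form: closer neighbors, higher density
        n_m = len({image_ids[j] for j in neighbors})  # images in the neighborhood
        diversity = n_m / n_i  # assumed form
        scores.append(density * diversity)
    return scores


def purify(patches, image_ids, m, threshold):
    """Indices of patches whose link to the label survives threshold T."""
    scores = dde_scores(patches, image_ids, m)
    return [k for k, s in enumerate(scores) if s >= threshold]


# Toy data (hypothetical): three similar patches from three images, one outlier.
patches = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0)]
image_ids = [0, 1, 2, 3]
kept = purify(patches, image_ids, m=2, threshold=0.1)  # [0, 1, 2]
```

Raising `threshold` discards more links, which matches the slide's remark that a high T favors precision over recall.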

Case study: “Face” label (before DDE)

Building Patch-Labeling Correspondence Case study: “Face” label (after DDE)

Generative Semantic Embedding
✴ Generative Hidden Markov Random Field
✴ A Hidden Field for semantic modeling
✴ An Observed Field for local patch quantization

Generative Semantic Embedding
Hidden Field
–Each label node $s_i$ produces correspondence links (le) to a subset of patches in the Observed Field
–Links $l_{ij}$ denote semantic correlations between $s_i$ and $s_j$
WordNet::Similarity (Pedersen et al., AAAI 04)

Generative Semantic Embedding
Observed Field
–Any two nodes follow a visual metric (e.g. L2)
–Once there is a link $le_{ij}$ between $d_i$ and $s_j$, we constrain $d_i$ by $s_j$ from the Hidden Field

Generative Semantic Embedding
In the ideal case
–Each $d_i$ is conditionally independent given its neighbors in the Hidden Field
–Feature set D is regarded as (partially) generated from the Hidden Field

Generative Semantic Embedding
–Formulate the clustering procedure
  Assign a unique cluster label $c_i$ to each $d_i$
  Hence, D is quantized into a codebook with corresponding features
Cost for codebook candidate C
$P(C \mid D) = P(C)\,P(D \mid C) / P(D)$
–$P(C)$: semantic constraint
–$P(D \mid C)$: visual distortion
–$P(D)$ is a constant

Generative Semantic Embedding
Semantic Constraint P(C)
–Define an MRF on the Observed Field
–For a quantizer assignment C, its probability can be expressed as a Gibbs distribution from the Hidden Field

Generative Semantic Embedding
That is
–Two data points $d_i$ and $d_j$ in D contribute to P(C) if and only if $c_i = c_j$
Figure: $d_i$ and $d_j$ in the Observed Field; $s_x$ and $s_y$, connected by $l_{xy}$, in the Hidden Field

Generative Semantic Embedding
Visual Constraint P(D|C)
–Whether the codebook C is visually consistent with the current data D
–Visual distortion in quantization

Generative Semantic Embedding
Overall Cost
–Finding the MAP of P(C|D) can be converted into maximizing its posterior probability
–The visual constraint P(D|C)
–The semantic constraint P(C)
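
The step from the posterior to an overall cost can be written out as follows. This is a standard Bayes-rule manipulation rather than a formula recovered from the slides; the symbol $C^{*}$ for the selected codebook is introduced here for illustration.

```latex
\begin{aligned}
C^{*} &= \arg\max_{C} P(C \mid D)
       = \arg\max_{C} \frac{P(C)\,P(D \mid C)}{P(D)} \\
      &= \arg\max_{C} \bigl[\log P(C) + \log P(D \mid C)\bigr] \\
      &= \arg\min_{C} \bigl[-\log P(C) - \log P(D \mid C)\bigr]
\end{aligned}
```

Because $P(D)$ does not depend on $C$, it drops out of the optimization, leaving one term for the semantic constraint and one for the visual distortion.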

Generative Semantic Embedding
EM solution
–E step: assign local patches to the closest clusters
–M step: update the visual word centers
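
The E and M steps can be sketched as a k-means-style loop with an added semantic term. The sketch below is an assumption-laden illustration, not the authors' algorithm: each patch carries at most one label link, the semantic term is a reward of `strength * label_corr[x][y]` for sharing a cluster with another linked patch, and assignments are updated one patch at a time. All names (`semantic_kmeans`, `point_label`, `label_corr`, `strength`) are hypothetical.

```python
def sq_dist(a, b):
    """Squared Euclidean distance (the visual distortion term)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def semantic_kmeans(points, point_label, label_corr, centers, strength, iters=10):
    """k-means with a semantic reward for co-clustering correlated patches.

    points:      list of descriptor tuples (the Observed Field)
    point_label: index of the label linked to each patch, or None
    label_corr:  label_corr[x][y] is the correlation between labels x and y
    centers:     initial visual word centers
    strength:    weight of the semantic term against visual distortion
    """
    centers = [tuple(c) for c in centers]
    clusters = range(len(centers))
    assign = [min(clusters, key=lambda k: sq_dist(p, centers[k])) for p in points]
    for _ in range(iters):
        # E step: reassign each patch to the cluster with the lowest total cost.
        for i, p in enumerate(points):
            def cost(k):
                reward = 0.0
                if point_label[i] is not None:
                    for j in range(len(points)):
                        if j != i and assign[j] == k and point_label[j] is not None:
                            reward += label_corr[point_label[i]][point_label[j]]
                return sq_dist(p, centers[k]) - strength * reward
            assign[i] = min(clusters, key=cost)
        # M step: move each center to the mean of its assigned patches.
        for k in clusters:
            members = [points[i] for i in range(len(points)) if assign[i] == k]
            if members:
                centers[k] = tuple(sum(v) / len(members) for v in zip(*members))
    return assign, centers


# Toy data (hypothetical): the middle patch is visually nearer to cluster 0
# but shares a label with the two patches on the right.
points = [(0.0, 0.0), (1.0, 0.0), (2.4, 0.0), (4.0, 0.0), (5.0, 0.0)]
point_label = [0, 0, 1, 1, 1]
init = [(0.5, 0.0), (4.5, 0.0)]

sup_assign, _ = semantic_kmeans(points, point_label, [[1, 0], [0, 1]], init, 1.0)
unsup_assign, _ = semantic_kmeans(points, point_label, [[0, 0], [0, 0]], init, 1.0)
# sup_assign   == [0, 0, 1, 1, 1]
# unsup_assign == [0, 0, 0, 1, 1]
```

With every correlation set to zero the semantic term vanishes and the loop reduces to plain k-means, which is why the two runs differ only in where the middle patch lands.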

Model generalization
–To supervised codebooks with a label independence assumption
–To unsupervised codebooks: set all l = 0

Experimental Comparisons
Case study of the ratio between inter-class distance and intra-class distance, with and without semantic embedding, on the Flickr dataset.
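
The slide does not say how the two distances are computed. One plausible reading, sketched here purely as an assumption, is the mean pairwise L2 distance between features of different classes divided by the mean pairwise distance within a class; a larger ratio means better class separation. The function name `distance_ratio` and the toy data are illustrative.

```python
import math
from itertools import combinations


def l2(a, b):
    """Euclidean distance between two feature vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def distance_ratio(features, classes):
    """Mean inter-class distance divided by mean intra-class distance."""
    intra, inter = [], []
    for (a, ca), (b, cb) in combinations(zip(features, classes), 2):
        (intra if ca == cb else inter).append(l2(a, b))
    return (sum(inter) / len(inter)) / (sum(intra) / len(intra))


# Toy data (hypothetical): two well-separated classes of two features each.
features = [(0.0, 0.0), (0.0, 1.0), (3.0, 0.0), (3.0, 1.0)]
classes = [0, 0, 1, 1]
ratio = distance_ratio(features, classes)
```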

Experimental Comparisons
✴ Database
–Flickr database: 60,000 images, over 3,600 unique labels
–PASCAL VOC 05
✴ Evaluation criterion
–Mean Average Precision (MAP)
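
For reference, Mean Average Precision can be computed as in the sketch below. It assumes each query's ranked result list is given as booleans (relevant or not) and that every relevant item appears somewhere in that list; the function names are illustrative.

```python
def average_precision(ranked_relevance):
    """Average of precision values taken at each relevant result's rank."""
    hits = 0
    precision_sum = 0.0
    for rank, relevant in enumerate(ranked_relevance, start=1):
        if relevant:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / hits if hits else 0.0


def mean_average_precision(queries):
    """Mean of the per-query average precision values."""
    return sum(average_precision(q) for q in queries) / len(queries)


# Toy data (hypothetical): two queries with ranked relevance judgments.
queries = [[True, False, True], [False, True]]
map_score = mean_average_precision(queries)  # (5/6 + 1/2) / 2 = 2/3
```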

Experimental Comparisons
✴ Parameter tuning
✴ Different label noise T (DDE purification strength)
✴ Different embedding strength S
–GSE (without Hidden Field correlations Cr.), varied T, S
–GSE (without Hidden Field correlations Cr.), varied T, S + GNP
–GSE (with Cr.), varied T, S
–GSE (with Cr.), varied T, S + GNP

Experimental Comparisons
Comparison of our GSE model with Vocabulary Tree [1] and GNP [34] on the Flickr 60,000 database.

Experimental Comparisons
Confusion table comparison with the method in [24] on the PASCAL VOC dataset.

Conclusion
Our contribution
–Propagating Web labeling from images to local patches to supervise codebook quantization
–Generalized semantic embedding framework for supervised codebook building
–Modeling correlative semantic labels in supervision
Future work
–Adapt one supervised codebook to different tasks
–Move supervision forward into local feature detection and extraction

Thank you!