From Text to Image: Generating Visual Query for Image Retrieval Wen-Cheng Lin, Yih-Chen Chang and Hsin-Hsi Chen Department of Computer Science and Information.

Slides:



Advertisements
Similar presentations
A Human-Centered Computing Framework to Enable Personalized News Video Recommendation (Oh Jun-hyuk)
Advertisements

Pseudo-Relevance Feedback For Multimedia Retrieval By Rong Yan, Alexander G. and Rong Jin Mwangi S. Kariuki
SINAI-GIR A Multilingual Geographical IR System University of Jaén (Spain) José Manuel Perea Ortega CLEF 2008, 18 September, Aarhus (Denmark) Computer.
Chapter 5: Introduction to Information Retrieval
WWW 2014 Seoul, April 8 th SNOW 2014 Data Challenge Two-level message clustering for topic detection in Twitter Georgios Petkos, Symeon Papadopoulos, Yiannis.
Multimedia Answer Generation for Community Question Answering.
Content Based Image Clustering and Image Retrieval Using Multiple Instance Learning Using Multiple Instance Learning Xin Chen Advisor: Chengcui Zhang Department.
Video Google: Text Retrieval Approach to Object Matching in Videos Authors: Josef Sivic and Andrew Zisserman ICCV 2003 Presented by: Indriyati Atmosukarto.
Video Google: Text Retrieval Approach to Object Matching in Videos Authors: Josef Sivic and Andrew Zisserman University of Oxford ICCV 2003.
1 The Web as a Parallel Corpus  Parallel corpora are useful  Training data for statistical MT  Lexical correspondences for cross-lingual IR  Early.
Knowledge Science & Engineering Institute, Beijing Normal University, Analyzing Transcripts of Online Asynchronous.
Improving web image search results using query-relative classifiers Josip Krapacy Moray Allanyy Jakob Verbeeky Fr´ed´eric Jurieyy.
Search is not only about the Web An Overview on Printed Documents Search and Patent Search Walid Magdy Centre for Next Generation Localisation School of.
Longbiao Kang, Baotian Hu, Xiangping Wu, Qingcai Chen, and Yan He Intelligent Computing Research Center, School of Computer Science and Technology, Harbin.
Image Annotation and Feature Extraction
A New Approach for Cross- Language Plagiarism Analysis Rafael Corezola Pereira, Viviane P. Moreira, and Renata Galante Universidade Federal do Rio Grande.
Evaluating the Contribution of EuroWordNet and Word Sense Disambiguation to Cross-Language Information Retrieval Paul Clough 1 and Mark Stevenson 2 Department.
MediaEval Workshop 2011 Pisa, Italy 1-2 September 2011.
Probabilistic Model for Definitional Question Answering Kyoung-Soo Han, Young-In Song, and Hae-Chang Rim Korea University SIGIR 2006.
CLEF – Cross Language Evaluation Forum Question Answering at CLEF 2003 ( Bridging Languages for Question Answering: DIOGENE at CLEF-2003.
Exploiting Ontologies for Automatic Image Annotation M. Srikanth, J. Varner, M. Bowden, D. Moldovan Language Computer Corporation
1 Cross-Lingual Query Suggestion Using Query Logs of Different Languages SIGIR 07.
Learning Phonetic Similarity for Matching Named Entity Translation and Mining New Translations Wai Lam, Ruizhang Huang, Pik-Shan Cheung ACM SIGIR 2004.
Translingual Topic Tracking with PRISE Gina-Anne Levow and Douglas W. Oard University of Maryland February 28, 2000.
The CLEF 2003 cross language image retrieval task Paul Clough and Mark Sanderson University of Sheffield
Information Retrieval and Web Search Cross Language Information Retrieval Instructor: Rada Mihalcea Class web page:
Feng Zhang, Guang Qiu, Jiajun Bu*, Mingcheng Qu, Chun Chen College of Computer Science, Zhejiang University Hangzhou, China Reporter: 洪紹祥 Adviser: 鄭淑真.
MIRACLE Multilingual Information RetrievAl for the CLEF campaign DAEDALUS – Data, Decisions and Language, S.A. Universidad Carlos III de.
Péter Schönhofen – Ad Hoc Hungarian → English – CLEF Workshop 20 Sep 2007 Performing Cross-Language Retrieval with Wikipedia Participation report for Ad.
Retrieval Models for Question and Answer Archives Xiaobing Xue, Jiwoon Jeon, W. Bruce Croft Computer Science Department University of Massachusetts, Google,
Yun-Nung (Vivian) Chen, Yu Huang, Sheng-Yi Kong, Lin-Shan Lee National Taiwan University, Taiwan.
Multilingual Relevant Sentence Detection Using Reference Corpus Ming-Hung Hsu, Ming-Feng Tsai, Hsin-Hsi Chen Department of CSIE National Taiwan University.
Intelligent Database Systems Lab 國立雲林科技大學 National Yunlin University of Science and Technology Advisor : Dr. Hsu Presenter : Chien Shing Chen Author: Wei-Hao.
IIIT Hyderabad’s CLIR experiments for FIRE-2008 Sethuramalingam S & Vasudeva Varma IIIT Hyderabad, India 1.
1 Learning Sub-structures of Document Semantic Graphs for Document Summarization 1 Jure Leskovec, 1 Marko Grobelnik, 2 Natasa Milic-Frayling 1 Jozef Stefan.
You Are What You Tag Yi-Ching Huang and Chia-Chuan Hung and Jane Yung-jen Hsu Department of Computer Science and Information Engineering Graduate Institute.
Web Image Retrieval Re-Ranking with Relevance Model Wei-Hao Lin, Rong Jin, Alexander Hauptmann Language Technologies Institute School of Computer Science.
Binxing Jiao et. al (SIGIR ’10) Presenter : Lin, Yi-Jhen Advisor: Dr. Koh. Jia-ling Date: 2011/4/25 VISUAL SUMMARIZATION OF WEB PAGES.
UA in ImageCLEF 2005 Maximiliano Saiz Noeda. Index System  Indexing  Retrieval Image category classification  Building  Use Experiments and results.
Collocations and Information Management Applications Gregor Erbach Saarland University Saarbrücken.
Enhancing Cluster Labeling Using Wikipedia David Carmel, Haggai Roitman, Naama Zwerdling IBM Research Lab (SIGIR’09) Date: 11/09/2009 Speaker: Cho, Chin.
Mining Translations of OOV Terms from the Web through Crosslingual Query Expansion Ying Zhang Fei Huang Stephan Vogel SIGIR 2005.
A Model for Learning the Semantics of Pictures V. Lavrenko, R. Manmatha, J. Jeon Center for Intelligent Information Retrieval Computer Science Department,
Cluster-specific Named Entity Transliteration Fei Huang HLT/EMNLP 2005.
CIKM Opinion Retrieval from Blogs Wei Zhang 1 Clement Yu 1 Weiyi Meng 2 1 Department of.
Iterative Translation Disambiguation for Cross Language Information Retrieval Christof Monz and Bonnie J. Dorr Institute for Advanced Computer Studies.
Visual Categorization With Bags of Keypoints Original Authors: G. Csurka, C.R. Dance, L. Fan, J. Willamowski, C. Bray ECCV Workshop on Statistical Learning.
Alignment of Bilingual Named Entities in Parallel Corpora Using Statistical Model Chun-Jen Lee Jason S. Chang Thomas C. Chuang AMTA 2004.
Probabilistic Latent Query Analysis for Combining Multiple Retrieval Sources Rong Yan Alexander G. Hauptmann School of Computer Science Carnegie Mellon.
National Taiwan University, Taiwan
Improving Named Entity Translation Combining Phonetic and Semantic Similarities Fei Huang, Stephan Vogel, Alex Waibel Language Technologies Institute School.
Learning Phonetic Similarity for Matching Named Entity Translations and Mining New Translations Wai Lam Ruizhang Huang Pik-Shan Cheung Department of Systems.
Effective Automatic Image Annotation Via A Coherent Language Model and Active Learning Rong Jin, Joyce Y. Chai Michigan State University Luo Si Carnegie.
Information Retrieval using Word Senses: Root Sense Tagging Approach Sang-Bum Kim, Hee-Cheol Seo and Hae-Chang Rim Natural Language Processing Lab., Department.
AN EFFECTIVE STATISTICAL APPROACH TO BLOG POST OPINION RETRIEVAL Ben He Craig Macdonald Iadh Ounis University of Glasgow Jiyin He University of Amsterdam.
Multi-level Bootstrapping for Extracting Parallel Sentence from a Quasi-Comparable Corpus Pascale Fung and Percy Cheung Human Language Technology Center,
Intelligent Database Systems Lab 國立雲林科技大學 National Yunlin University of Science and Technology Advisor : Dr. Hsu Presenter : Chien Shing Chen Author: Wei-Hao.
Combining Text and Image Queries at ImageCLEF2005: A Corpus-Based Relevance-Feedback Approach Yih-Cheng Chang Department of Computer Science and Information.
Multilingual Information Retrieval using GHSOM Hsin-Chang Yang Associate Professor Department of Information Management National University of Kaohsiung.
A Multilingual Hierarchy Mapping Method Based on GHSOM Hsin-Chang Yang Associate Professor Department of Information Management National University of.
Analysis of Experiments on Hybridization of different approaches in mono and cross-language information retrieval DAEDALUS – Data, Decisions and Language,
Video Google: Text Retrieval Approach to Object Matching in Videos Authors: Josef Sivic and Andrew Zisserman University of Oxford ICCV 2003.
Introduction to Information Retrieval Introduction to Information Retrieval Lecture 15: Text Classification & Naive Bayes 1.
An Effective Statistical Approach to Blog Post Opinion Retrieval Ben He, Craig Macdonald, Jiyin He, Iadh Ounis (CIKM 2008)
F. López-Ostenero, V. Peinado, V. Sama & F. Verdejo
Deep Compositional Cross-modal Learning to Rank via Local-Global Alignment Xinyang Jiang, Fei Wu, Xi Li, Zhou Zhao, Weiming Lu, Siliang Tang, Yueting.
Multimedia Information Retrieval
Compact Query Term Selection Using Topically Related Text
Multimedia Information Retrieval
Web Information retrieval (Web IR)
Presentation transcript:

From Text to Image: Generating Visual Query for Image Retrieval Wen-Cheng Lin, Yih-Chen Chang and Hsin-Hsi Chen Department of Computer Science and Information Engineering National Taiwan University Taipei, TAIWAN CLEF 2004

Outline Introduction Visual representation of textual query Query translation Experiment results and discussion Conclusion CLEF 2004

Introduction Multimedia data  Language dependent Text, speech  Language independent Image, music Translingual transmedia information retrieval  Language translation  Media transformation CLEF 2004

Language Dependent Medium 1 Source Language Target Language Figure 1. Media Transformation and Language Translation Media Transformation Language Translation Intermediate Representation AC BD E Language Dependent Medium 2 Language Independent Medium CLEF 2004 Image Caption Query Text- based approach Content- based approach

Introduction We adopted text-based approaches in ImageCLEF 2003  Dictionary-based query translation  Unknown named entities were translated by similarity-based backward transliteration model This year, we explore the help of visual features to cross-language image retrieval CLEF 2004

Introduction Transforms textual queries into visual representations  Model the relationships between text and images  Visual queries are constructed from textual queries using the relationships The retrieval results using textual and visual queries are combined to generate the final ranked list CLEF 2004

Learning the relationships between images and text Learning the relationships between images and text from a set of images with textual descriptions  The textual description and image parts of an image is treated as an aligned sentence  The correlations between the textual and visual representations can be learned from the aligned sentences CLEF 2004

Learning the relationships between images and text Use blobs to represent images  Images are segmented into regions using Blobworld  The regions of all images are clustered into 2,000 clusters by K-means clustering algorithm  Each cluster is assigned a unique label (blob token)  Each image is represented by the blobs that its regions belong to CLEF 2004

Learning the relationships between images and text Correlation measurement  p(x): the occurrence probability of word x in text descriptions  p(y): the occurrence probability of blob y in image blobs  p(x,y): the probability that x and y occur in the same image CLEF 2004

Generating visual query In a textual query, nouns, verbs, and adjectives are used to generate blobs For a word w i, the top 30 blobs whose MI values with w i exceed a threshold, i.e., 0.01, are selected The set of selected blobs is the generated visual query CLEF 2004

Query translation Chinese query  English query  Segmentation, POS tagging, named entity identification  For each Chinese query term, find its translations by looking up a Chinese-English bilingual dictionary  First-two-highest-frequency The first two translations with the highest frequency of occurrence in the English image captions are selected CLEF 2004

Query translation Unknown named entities (Lin, W.C., Yang, C., and Chen, H.H. (2003). “Foreign Name Backward Transliteration in Chinese-English Cross-Language Image Retrieval”)  Apply the transformation rules to identify the name part and keyword part of a name  Keyword part  first-two-highest-frequency  Name part  similarity-based backward transliteration CLEF 2004

Query translation Build English name candidate list  The personal names and the location names in the English image captions are extracted (3,599 names) For each Chinese name, 300 candidates are selected from the 3,599 English names using an IR-based candidate filter The similarities of the Chinese name and the 300 candidates are computed at the phoneme level The top 6 candidates with the highest similarities are considered as the translations of the Chinese name CLEF 2004

Combining textual and visual information Two indices  Textual index  English captions  Visual index  image blobs Treat blobs as a language in which each blob token is a word Indexed by text retrieval system For each image, the similarity scores of textual and visual retrieval are normalized and combined using linear combination CLEF 2004

Experiment IR system  Okapi system  BM25 Learning correlations  English captions were translated into Chinese by SYSTRAN system 4 Chinese-English runs + 1 English monolingual run CLEF 2004

Official results Run Merging Weight Average Precision Textual QueryExample ImageGenerated Visual Query NTU-adhoc-CE-T-W NTU-adhoc-CE-T-WI NTU-adhoc-CE-T-WE NTU-adhoc-CE-T-WEI NTU-adhoc-EE-T-W Error of index  Long captions were truncated, thus some words were not indexed CLEF 2004

Unofficial results Run Merging Weight Average Precision Textual QueryExample ImageGenerated Visual Query NTU-CE-T-W-new NTU-CE-T-WI-new NTU-CE-T-WE-new NTU-CE-T-WEI-new NTU-EE-T-W-new CLEF 2004

Discussion Textual query only  69.72% of monolingual retrieval (CLEF2004)  55.56% of monolingual retrieval (CLEF2003) several named entities are not translated into Chinese in Chinese query set of ImageCLEF 2004 Example image only  Average precision:  The top one entry is the example image itself and is relevant to the topic except Topic 17 (the example image of Topic 17 is not in the pisec-total relevant set of Topic 17) CLEF 2004

Discussion Generated visual query only  Average precision: (24 topics) The help of generated visual query is limit  The performance of image segmentation is not good enough the majority of images are in black and white  The performance of clustering affects the performance of blobs-based approach CLEF 2004

Discussion  The quality of training data English captions are translated into Chinese by MT system Monolingual experiment (English)  Use English captions for training  Generate visual query from English query  English textual query + generated visual query  Average precision: CLEF 2004

Discussion  Which word in a query should be used to generate visual query Not all words are relative to the content of images or discriminative Manually select query terms to generate visual query  Average precision: (18 topics)  textual query run + manually selecting run  Average precision: CLEF 2004

Discussion  In some topics, the retrieved images are not relevant to the topics, while they are relevant to the query terms that are used to generate visual query  Topic 13: 1939 年聖安德魯斯高爾夫球公開賽 (The Open Championship golf tournament, St. Andrews 1939)  “ 高爾夫球 ” (golf) and “ 公開賽 ” (Open Championship) Top 10: the Open Championship golf tournament, but are not the one held in 1939 CLEF 2004

Conclusion We propose an approach that transforms textual queries into visual representations The retrieval results using textual and visual queries are combined  the performance is increased in English monolingual experiment  generated visual query has little impact in cross-lingual experiments CLEF 2004

Conclusion Using generated visual query could retrieve images relevant to the query terms that the visual query is generated from How to select appropriate terms to generate visual query and how to integrate textual and visual information effectively will be further investigated CLEF 2004

Thank you!