Total Recall: Automatic Query Expansion with a Generative Feature Model for Object Retrieval O. Chum, et al. Presented by Brandon Smith Computer Vision.

Slides:



Advertisements
Similar presentations
Image Retrieval with Geometry-Preserving Visual Phrases
Advertisements

Relevance Feedback User tells system whether returned/disseminated documents are relevant to query/information need or not Feedback: usually positive sometimes.
Location Recognition Given: A query image A database of images with known locations Two types of approaches: Direct matching: directly match image features.
Mustafa Cayci INFS 795 An Evaluation on Feature Selection for Text Clustering.
Aggregating local image descriptors into compact codes
Ziv Bar-YossefMaxim Gurevich Google and Technion Technion TexPoint fonts used in EMF. Read the TexPoint manual before you delete this box.: AA A A AA.
Three things everyone should know to improve object retrieval
Presented by Xinyu Chang
COMP423 Intelligent Agents. Recommender systems Two approaches – Collaborative Filtering Based on feedback from other users who have rated a similar set.
Database-Based Hand Pose Estimation CSE 6367 – Computer Vision Vassilis Athitsos University of Texas at Arlington.
CS4670 / 5670: Computer Vision Bag-of-words models Noah Snavely Object
GENERATING AUTOMATIC SEMANTIC ANNOTATIONS FOR RESEARCH DATASETS AYUSH SINGHAL AND JAIDEEP SRIVASTAVA CS DEPT., UNIVERSITY OF MINNESOTA, MN, USA.
Stephan Gammeter, Lukas Bossard, Till Quack, Luc Van Gool.
Special Topic on Image Retrieval Local Feature Matching Verification.
Content Based Image Clustering and Image Retrieval Using Multiple Instance Learning Using Multiple Instance Learning Xin Chen Advisor: Chengcui Zhang Department.
CVPR 2008 James Philbin Ondˇrej Chum Michael Isard Josef Sivic
Information Retrieval in Practice
Bundling Features for Large Scale Partial-Duplicate Web Image Search Zhong Wu ∗, Qifa Ke, Michael Isard, and Jian Sun CVPR 2009.
WISE: Large Scale Content-Based Web Image Search Michael Isard Joint with: Qifa Ke, Jian Sun, Zhong Wu Microsoft Research Silicon Valley 1.
Query Operations: Automatic Local Analysis. Introduction Difficulty of formulating user queries –Insufficient knowledge of the collection –Insufficient.
Object retrieval with large vocabularies and fast spatial matching
Clustering… in General In vector space, clusters are vectors found within  of a cluster vector, with different techniques for determining the cluster.
Automatic Image Annotation and Retrieval using Cross-Media Relevance Models J. Jeon, V. Lavrenko and R. Manmathat Computer Science Department University.
Video Google: Text Retrieval Approach to Object Matching in Videos Authors: Josef Sivic and Andrew Zisserman ICCV 2003 Presented by: Indriyati Atmosukarto.
MANISHA VERMA, VASUDEVA VARMA PATENT SEARCH USING IPC CLASSIFICATION VECTORS.
Video Google: Text Retrieval Approach to Object Matching in Videos Authors: Josef Sivic and Andrew Zisserman University of Oxford ICCV 2003.
Language-Independent Set Expansion of Named Entities using the Web Richard C. Wang & William W. Cohen Language Technologies Institute Carnegie Mellon University.
1 An Empirical Study on Large-Scale Content-Based Image Retrieval Group Meeting Presented by Wyman
Multimedia Databases Text II. Outline Spatial Databases Temporal Databases Spatio-temporal Databases Multimedia Databases Text databases Image and video.
Xiaomeng Su & Jon Atle Gulla Dept. of Computer and Information Science Norwegian University of Science and Technology Trondheim Norway June 2004 Semantic.
HYPERGEO 1 st technical verification ARISTOTLE UNIVERSITY OF THESSALONIKI Baseline Document Retrieval Component N. Bassiou, C. Kotropoulos, I. Pitas 20/07/2000,
Query Operations: Automatic Global Analysis. Motivation Methods of local analysis extract information from local set of documents retrieved to expand.
FLANN Fast Library for Approximate Nearest Neighbors
Nonnegative Shared Subspace Learning and Its Application to Social Media Retrieval Presenter: Andy Lim.
Chapter 5: Information Retrieval and Web Search
Overview of Search Engines
DIVINES – Speech Rec. and Intrinsic Variation W.S.May 20, 2006 Richard Rose DIVINES SRIV Workshop The Influence of Word Detection Variability on IR Performance.
Keypoint-based Recognition Computer Vision CS 543 / ECE 549 University of Illinois Derek Hoiem 03/04/10.
BACKGROUND LEARNING AND LETTER DETECTION USING TEXTURE WITH PRINCIPAL COMPONENT ANALYSIS (PCA) CIS 601 PROJECT SUMIT BASU FALL 2004.
A Simple Unsupervised Query Categorizer for Web Search Engines Prashant Ullegaddi and Vasudeva Varma Search and Information Extraction Lab Language Technologies.
Glasgow 02/02/04 NN k networks for content-based image retrieval Daniel Heesch.
A Statistical Approach to Speed Up Ranking/Re-Ranking Hong-Ming Chen Advisor: Professor Shih-Fu Chang.
Mining the Web to Create Minority Language Corpora Rayid Ghani Accenture Technology Labs - Research Rosie Jones Carnegie Mellon University Dunja Mladenic.
1 Total Recall: Automatic Query Expansion with a Generative Feature Model for Object Retrieval Ondrej Chum, James Philbin, Josef Sivic, Michael Isard and.
Xiaoying Gao Computer Science Victoria University of Wellington Intelligent Agents COMP 423.
Video Google: A Text Retrieval Approach to Object Matching in Videos Josef Sivic and Andrew Zisserman.
Large Scale Discovery of Spatially Related Images Ondřej Chum and Jiří Matas Center for Machine Perception Czech Technical University Prague.
Chapter 6: Information Retrieval and Web Search
1 Automatic Classification of Bookmarked Web Pages Chris Staff Second Talk February 2007.
A Model for Learning the Semantics of Pictures V. Lavrenko, R. Manmatha, J. Jeon Center for Intelligent Information Retrieval Computer Science Department,
Towards Semantic Embedding in Visual Vocabulary Towards Semantic Embedding in Visual Vocabulary The Twenty-Third IEEE Conference on Computer Vision and.
Chapter 8 Evaluating Search Engine. Evaluation n Evaluation is key to building effective and efficient search engines  Measurement usually carried out.
Query Suggestion. n A variety of automatic or semi-automatic query suggestion techniques have been developed  Goal is to improve effectiveness by matching.
Motivation  Methods of local analysis extract information from local set of documents retrieved to expand the query  An alternative is to expand the.
KNN & Naïve Bayes Hongning Wang Today’s lecture Instance-based classifiers – k nearest neighbors – Non-parametric learning algorithm Model-based.
Language Modeling Putting a curve to the bag of words Courtesy of Chris Jordan.
Concept-based P2P Search How to find more relevant documents Ingmar Weber Max-Planck-Institute for Computer Science Joint work with Holger Bast Torino,
Unsupervised Auxiliary Visual Words Discovery for Large-Scale Image Object Retrieval Yin-Hsi Kuo1,2, Hsuan-Tien Lin 1, Wen-Huang Cheng 2, Yi-Hsuan Yang.
Final Project Mei-Chen Yeh May 15, General In-class presentation – June 12 and June 19, 2012 – 15 minutes, in English 30% of the overall grade In-class.
DISTRIBUTED INFORMATION RETRIEVAL Lee Won Hee.
Computer Vision Group Department of Computer Science University of Illinois at Urbana-Champaign.
Xiaoying Gao Computer Science Victoria University of Wellington COMP307 NLP 4 Information Retrieval.
Video Google: Text Retrieval Approach to Object Matching in Videos Authors: Josef Sivic and Andrew Zisserman University of Oxford ICCV 2003.
KNN & Naïve Bayes Hongning Wang
COMP423 Intelligent Agents. Recommender systems Two approaches – Collaborative Filtering Based on feedback from other users who have rated a similar set.
BMVC 2010 Sung Ju Hwang and Kristen Grauman University of Texas at Austin.
Information Retrieval in Practice
Video Google: Text Retrieval Approach to Object Matching in Videos
Cheng-Ming Huang, Wen-Hung Liao Department of Computer Science
Video Google: Text Retrieval Approach to Object Matching in Videos
Presentation transcript:

Total Recall: Automatic Query Expansion with a Generative Feature Model for Object Retrieval O. Chum, et al. Presented by Brandon Smith Computer Vision – Fall 2007

Objective Given a query image of an object, retrieve all instances of that object in a large (1M+) image database

General Approach Use bag-of-visual-words architecture Use query expansion to improve performance Improve query expansion by… …(1) using spatial constraints to suppress false positives, and …(2) constructing a latent feature model of the query object

Procedure (Probabilistic Interpretation) Extract a generative model of an object from the query region Form the response set from images in the corpus

Generative Model A spatial configuration of visual words extracted from the query region, plus… A “background” distribution of words that encodes the overall frequency statistics of the corpus

Response Set Images from the corpus that contain a large number of visual words in common with the query region My subsequently be ranked using spatial information to ensure that… …(1) response and query contain similar features, and …(2) features occur in compatible spatial configurations

Objective Given a query image of an object, retrieve all instances of that object in a large (1M+) image database Do this by improving retrieval performance…

Objective Given a query image of an object, retrieve all instances of that object in a large (1M+) image database Do this by improving retrieval performance… by deriving better (generative) object models given the query region

A Better Model Keep the form fixed (still a configuration of visual words) Enrich the object model with additional information from the corpus Refer to this as a latent model

Latent Model Generalization of query expansion

Query Expansion Well-known technique in text-based information retrieval (1)Generate a query (2)Obtain an original response set (3)Use a number of high ranked documents to generate a new query (4)Repeat from (2)…

Latent Model Generalization of query expansion Well suited to this problem domain: (1) Spatial structure can be used to avoid false positives (2) The baseline image search without query expansion suffers more from false negatives than most text retrieval systems

Real-time Object Retrieval A visual vocabulary of 1M words is generated using an approximated K-means clustering method

Approximate K-means Comparable performance to exact K-means (< 1% performance degradation) at a fraction of the computational cost New data points assigned to cluster based on approximate nearest neighbor We go from O(NK) to O(N log(K)) N = number of features being clustered from K = number of clusters

Spatial Verification It is vital for query-expansion that we do not… …expand using false positives, or …use features which occur in the result image, but not in the object of interest Use hypothesize and verify procedure to estimate homography between query and target > 20 inliers = spatially verified result

Methods for Computing Latent Object Models Query expansion baseline Transitive closure expansion Average query expansion Recursive average query expansion Multiple image resolution expansion

Query expansion baseline 1.Find top 5 (unverified) results from original query 2.Average the term-frequency vectors 3.Requery once 4.Append these results

Transitive closure expansion 1.Create a priority queue based on number of inliers 2.Get top image in the queue 3.Find region corresponding to original query 4.Use this region to issue a new query 5.Add new verified results to queue 6.Repeat until queue is empty

Average query expansion 1.Obtain top (m < 50) verified results of original query 2.Construct new query using average of these results 3.Requery once 4.Append these results

Recursive average query expansion 1.Improvement of average query expansion method 2.Recursively generate queries from all verified results returned so far 3.Stop once 30 verified images are found or once no new images can be positively verified

Multiple image resolution expansion 1.For each verified result of original query… 2.Calculate relative change in resolution required to project verified region onto query region

Multiple image resolution expansion 3.Place results in 3 bands: 4.Construct average query for each band 5.Execute independent queries 6.Merge results 1.Verified images from first query first, then… 2.Expanded queries in order of number of inliers (0, 4/5)(2/3, 3/2)(5/4, infinity)

Datasets Oxford dataset Flickr1 dataset Flickr2 dataset

Oxford Dataset ~5,000 high res. (1024x768) images Crawled from Flickr using 11 landmarks as queries Ground truth labels Good – a nice, clear picture of landmark OK – more than 25% of object visible Bad – object not present Junk – less than 25% of object visible

Flickr1 Dataset ~100k high resolution images Crawled from Flickr using 145 most popular tags

Flickr2 Dataset ~1M medium res. (500 x 333) images Crawled from Flickr using 450 most popular tags

Datasets

Evaluation Procedure Compute Average Precision (AP) score for each of the 5 queries for a landmark Average these to obtain a Mean Average Precision (MAP) for the landmark Use two databases D1: Oxford + Flickr1 datasets (~100k) D2: D1 + Flickr1 datasets (~1M)

Average Precision Area under the precision-recall curve Precision = RPI / TNIR Recall = RPI / TNPC RPI = retrieved positive images TNIR = total number of images retrieved TNPC = total number of positives in the corpus Precision Recall

Retrieval Performance D1 –0.1s for typical query –1GB for index D2 –15s – 35s for typical query –4.3GB for index (use offline version)

Retrieval Performance

Method and Dataset Comparison ori = original query qeb = query expansion baseline trc = transitive closure expansion avg = average query expansion rec = recursive average query expansion sca = multiple image resolution expansion

Questions