Bag of Features Approach: recent work, using geometric information.


Problem: search for object occurrences in a very large image collection

Two subproblems: object category recognition and specific object recognition

Motivation: look for product information; look for similar products

Related work on large scale image search
Most systems build upon the BoF framework [Sivic & Zisserman 03]:
– Large (hierarchical) vocabularies [Nistér & Stewénius 06]
– Improved descriptor representation [Jégou et al 08, Philbin et al 08]
– Geometry used in index [Jégou et al 08, Perďoch et al 09]
– Query expansion [Chum et al 07]
– …
Efficiency improved by:
– Min-hash and geometric min-hash [Chum et al ]
– Compressing the BoF representation [Jégou et al. 09]

Local Features - SIFT

Creating a visual vocabulary
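
The slide does not say how the vocabulary is built. A common choice in the BoF literature is to cluster local descriptors and treat each cluster centre as a visual word. The sketch below assumes plain k-means with a naive initialisation and toy 2-D descriptors in place of 128-D SIFT; the function names `kmeans`, `quantize` and `bof_histogram` are hypothetical.

```python
from math import dist

def quantize(descriptor, centroids):
    """Return the index of the nearest centroid, i.e. the visual word id."""
    return min(range(len(centroids)), key=lambda i: dist(descriptor, centroids[i]))

def kmeans(descriptors, k, iters=10):
    """Toy k-means; the k centroids play the role of the visual vocabulary."""
    # Assumption: initialise with the first k descriptors (real systems use better seeding).
    centroids = [tuple(d) for d in descriptors[:k]]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for d in descriptors:
            clusters[quantize(d, centroids)].append(d)
        for i, members in enumerate(clusters):
            if members:  # keep the old centroid if a cluster is empty
                centroids[i] = tuple(sum(c) / len(members) for c in zip(*members))
    return centroids

def bof_histogram(descriptors, centroids):
    """Count how many descriptors of one image fall on each visual word."""
    hist = [0] * len(centroids)
    for d in descriptors:
        hist[quantize(d, centroids)] += 1
    return hist

training = [(0, 0), (10, 10), (0, 1), (10, 11), (1, 0)]
vocabulary = kmeans(training, k=2)
histogram = bof_histogram([(0, 0), (9, 9), (10, 10)], vocabulary)
```

Large vocabularies in practice rely on hierarchical or approximate clustering; this exhaustive nearest-centroid search is only meant to show the idea.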

Inverted Index: index construction, searching
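
The two steps named on this slide can be sketched as follows. This is a minimal illustration, not the scheme of any cited paper: the scoring by histogram intersection is an assumption (tf-idf weighting is the more usual choice), and `build_index` and `search` are hypothetical names.

```python
from collections import Counter, defaultdict

def build_index(database):
    """Index construction: map each visual word to the images containing it.

    database: {image_id: list of visual word ids}
    returns:  {word: {image_id: occurrence count}}
    """
    index = defaultdict(dict)
    for image_id, words in database.items():
        for word, count in Counter(words).items():
            index[word][image_id] = count
    return index

def search(index, query_words):
    """Searching: visit only the posting lists of words present in the query."""
    scores = Counter()
    for word, q_count in Counter(query_words).items():
        for image_id, d_count in index.get(word, {}).items():
            # Assumed scoring: histogram intersection over shared words.
            scores[image_id] += min(q_count, d_count)
    return scores.most_common()

database = {"a": [1, 2, 2, 3], "b": [3, 4], "c": [5]}
index = build_index(database)
results = search(index, [2, 2, 3])
```

Images that share no visual word with the query are never touched, which is what makes the structure attractive for very large collections.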

Use geometry
Possible directions:
– Change/optimize the spatial verification stage
– Insert new geometric information into the index: ordered BOF, bundled features, visual phrases
– Change the searching algorithm

Survey for today:
– Spatial Bag-of-features [Cao, CVPR2010]
– Image Retrieval with Geometry-Preserving Visual Phrases [Zhang, Jia, Chen, CVPR2011]
– Smooth Object Retrieval using a Bag of Boundaries [Arandjelović & Zisserman, ICCV2011]

Spatial BOF Basic idea:

Spatial BOF Constructing linear and circular ordered bag-of-features:
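
The figure for this slide is not in the transcript, so the following is only a sketch of one plausible reading: a linear ordered BoF projects feature positions onto a line at angle `theta` and builds one histogram per segment of that line, while a circular ordered BoF builds one histogram per angular sector around a centre point. The binning details, parameter names and the `extent` argument are assumptions.

```python
from math import atan2, cos, sin, pi

def linear_ordered_bof(features, vocab_size, theta, bins, extent):
    """features: list of (x, y, word). Projections are assumed to lie in [-extent, extent]."""
    hist = [[0] * vocab_size for _ in range(bins)]
    for x, y, word in features:
        p = x * cos(theta) + y * sin(theta)           # position along the line
        b = int((p + extent) / (2 * extent) * bins)   # segment of the line
        hist[min(bins - 1, max(0, b))][word] += 1
    return [count for sub in hist for count in sub]   # concatenate in order

def circular_ordered_bof(features, vocab_size, center, sectors):
    """One sub-histogram per angular sector around `center`."""
    cx, cy = center
    hist = [[0] * vocab_size for _ in range(sectors)]
    for x, y, word in features:
        angle = atan2(y - cy, x - cx) % (2 * pi)      # angle in [0, 2*pi)
        s = min(sectors - 1, int(angle / (2 * pi) * sectors))
        hist[s][word] += 1
    return [count for sub in hist for count in sub]

features = [(1.0, 0.5, 0), (-1.0, -0.5, 1), (-0.5, 1.0, 0)]
linear = linear_ordered_bof(features, vocab_size=2, theta=0.0, bins=2, extent=2.0)
circular = circular_ordered_bof(features, vocab_size=2, center=(0.0, 0.0), sectors=4)
```

Both outputs are plain concatenated histograms, so they keep the same format as a standard BoF vector and can be stored in the same kind of index.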

Spatial BOF Translation invariance:

Spatial BOF
Pros:
– Performs better than BOF+RANSAC on a large-scale dataset*
– Same format as standard BOF
Cons:
– Dataset dependent, because it requires training
– No results are presented for a large-scale dataset with transfer learning from another dataset
Future work:
– Check it with cross-training on a large dataset. Otherwise, it is not worth working on further.

Geometry-Preserving Visual Phrases Basic idea:

Geometry-Preserving Visual Phrases
Representation:
– Quantize the image to a 10x10 grid
– Histogram of GVPs of length k
– GVP dictionary size is “choose k from N visual words”
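
A small sketch of the representation above, under stated assumptions: features are first mapped to cells of the 10x10 grid, matching visual words vote for the cell offset between two images, and m matches that share one offset are taken to contribute "choose k from m" co-occurring GVPs of length k. The voting scheme is my reading of the translation-invariant case, and the helper names are hypothetical.

```python
from collections import Counter
from math import comb

def grid_cell(x, y, width, height, grid=10):
    """Quantize a pixel position to a (col, row) cell of a grid x grid layout."""
    return (min(grid - 1, int(x * grid / width)),
            min(grid - 1, int(y * grid / height)))

def cooccurring_gvps(query, database, k):
    """query, database: lists of (word, (col, row)). Counts length-k GVPs in common."""
    votes = Counter()
    for q_word, (q_col, q_row) in query:
        for d_word, (d_col, d_row) in database:
            if q_word == d_word:
                votes[(d_col - q_col, d_row - q_row)] += 1   # vote for the offset
    # Assumption: m matches with the same offset yield comb(m, k) phrases.
    return sum(comb(m, k) for m in votes.values())

query = [(7, (1, 1)), (8, (2, 1)), (9, (1, 3))]
database = [(7, (4, 2)), (8, (5, 2)), (9, (0, 0))]
pairs_in_common = cooccurring_gvps(query, database, k=2)
words_in_common = cooccurring_gvps(query, database, k=1)
dictionary_size = comb(100, 2)   # "choose k from N" for N = 100 words, k = 2
```

With k = 1 the count reduces to ordinary word matches, which is one way to see GVPs as a generalisation of BoF.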

Geometry-Preserving Visual Phrases
Pros:
– Outperforms BOV + RANSAC
Cons:
– Only translation invariant, because of memory constraints
Future work

BOF for smooth objects Idea: The information used for retrieval [figure panels: query object, segment, gradient]

BOF for smooth objects Results:

BOF for smooth objects
Segmentation phase:
– Over-segmentation with super-pixels
– Classification of super-pixels: 3208-dimensional feature vector (median(Mag(Grad)), 4 bits, color histogram, BOF), classified with an SVM
– Post-processing

BOF for smooth objects
Boundary description phase:
– Sample points on the boundary
– Calculate HoG at each point at 3 scales
– 340-dimensional L2-normalized vector
* The descriptor is not rotation invariant

BOF for smooth objects
Retrieval procedure:
– Boundary descriptors are quantized (k=10k)
– Standard BOF scheme*
– Spatial verification for the top 200 with a loose affine homography (errors up to 100 pixels)
* No spatial information is recorded in the histogram
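
The re-ranking step of this procedure can be sketched as below. It assumes an affine transform has already been estimated elsewhere (for example by RANSAC) and only shows the loose inlier test and the re-ordering of the shortlist; `count_inliers` and `rerank` are hypothetical names, and the defaults of 100 pixels and 200 images are taken from the slide.

```python
from math import hypot

def count_inliers(matches, affine, tol=100.0):
    """matches: list of ((qx, qy), (dx, dy)) point pairs; affine: (a, b, c, d, tx, ty)."""
    a, b, c, d, tx, ty = affine
    inliers = 0
    for (qx, qy), (dx, dy) in matches:
        # Map the query point into the database image and measure the error.
        px, py = a * qx + b * qy + tx, c * qx + d * qy + ty
        if hypot(px - dx, py - dy) <= tol:
            inliers += 1
    return inliers

def rerank(ranked_ids, inlier_counts, top=200):
    """Re-order only the first `top` BoF results by their inlier count."""
    head = sorted(ranked_ids[:top], key=lambda i: -inlier_counts.get(i, 0))
    return head + ranked_ids[top:]

matches = [((0, 0), (10, 10)), ((100, 0), (110, 10)), ((0, 100), (500, 500))]
translation = (1, 0, 0, 1, 10, 10)   # identity plus a (10, 10) shift
n_inliers = count_inliers(matches, translation)
reordered = rerank(["a", "b", "c"], {"a": 1, "b": 5}, top=2)
```

A tolerance as large as 100 pixels accepts rough alignments, which fits boundary descriptors that are sampled rather than detected at repeatable keypoints.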

BOF for smooth objects
Pros:
– Solves the smooth object retrieval problem
– Fast
Cons:
– Dataset dependent, because it requires training
– Limited to objects with “solid” materials
– Segmentation has to catch the object’s boundary
Future work:
– Eliminate the training step

Summary
– There is active research in the field of CBIR on exploiting geometric information
– Each method has its limitations
– There is still no widely accepted solution, in the way that spatial verification with RANSAC is