Yimeng Zhang, Zhaoyin Jia and Tsuhan Chen Cornell University Image Retrieval with Geometry-Preserving Visual Phrases.


1 Yimeng Zhang, Zhaoyin Jia and Tsuhan Chen Cornell University Image Retrieval with Geometry-Preserving Visual Phrases

2 Similar Image Retrieval Ranked relevant images … Image Database

3 Bag-of-Visual-Word (BoW) Images are represented as histograms of visual words (histogram length: dictionary size). Similarity of two images: cosine similarity of their histograms.
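The BoW similarity on this slide can be sketched as follows. This is a minimal illustration, not the authors' code: the function names are hypothetical, visual words are assumed to be integer IDs, and no tf-idf weighting is applied.

```python
import math
from collections import Counter


def bow_histogram(word_ids):
    """Sparse histogram: visual word ID -> number of occurrences."""
    return Counter(word_ids)


def cosine_similarity(h1, h2):
    """Cosine similarity of two sparse histograms."""
    dot = sum(count * h2.get(word, 0) for word, count in h1.items())
    norm1 = math.sqrt(sum(c * c for c in h1.values()))
    norm2 = math.sqrt(sum(c * c for c in h2.values()))
    if norm1 == 0 or norm2 == 0:
        return 0.0  # an image without features matches nothing
    return dot / (norm1 * norm2)


query = bow_histogram([1, 2, 2, 5])
candidate = bow_histogram([2, 5, 7])
print(cosine_similarity(query, candidate))
```

Storing the histogram sparsely keeps the cost proportional to the number of features in the image rather than to the dictionary size.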

4 Geometry-preserving Visual Phrases Length-k phrase: k words in a certain spatial layout … (length-2 phrases) Bag of Phrases:

5 Phrases vs. Words [Figure: matches found with words, length-2 phrases and length-3 phrases, for an irrelevant and a relevant image]

6 Previous Works

7 Geometry Verification Searching step with BoW … Post-processing (geometry verification): encodes spatial info, but only on top-ranked images

8 Modeling relationship between words. Co-occurrences in the entire image [L. Torresani et al., CVPR 2009]: no spatial information. Phrases in local neighborhoods [J. Yuan et al., CVPR07] [Z. Wu et al., CVPR10] [C. L. Zitnick, Tech. Report 07]: no long-range interactions, weak geometry. Select a subset of phrases [J. Yuan et al., CVPR07]: discards a large portion of phrases. Dimension (length-2 phrase shown): exponential in the number of words in a phrase. Previous works: reduce the number of phrases. Our work: all phrases, linear computation time.

9 Approach

10 Overview BoW BoP 1. Similarity Measure 2. Large Scale Retrieval: Inverted Files, Min-hash [Zhang and Chen, 09] This Paper

11 Co-occurring Phrases [Zhang and Chen, 09] [Figure: two images with matched words A, B, C, D, E, F] Only consider the translation difference

12 Co-occurring Phrase Algorithm [Zhang and Chen, 09] [Figure: corresponding words A, B, C, D, E, F of the two images vote in the offset space] # of co-occurring length-2 phrases: 3+1+1=5
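The offset-space idea on this slide can be sketched as below. This is a hedged reading of the slide, not the authors' implementation: features are assumed to be (word, x, y) tuples with coordinates already quantized to a grid, only translation is handled, and the function names are hypothetical. Each pair of same-word features votes for the cell given by its coordinate difference; a cell holding n votes contributes C(n, k) co-occurring length-k phrases.

```python
from collections import defaultdict
from math import comb


def offset_votes(feats_a, feats_b):
    """Vote each same-word pair into the cell given by its translation offset."""
    locations_b = defaultdict(list)
    for word, x, y in feats_b:
        locations_b[word].append((x, y))
    votes = defaultdict(int)
    for word, x, y in feats_a:
        for xb, yb in locations_b.get(word, ()):
            votes[(x - xb, y - yb)] += 1
    return votes


def count_cooccurring_phrases(feats_a, feats_b, k):
    """Sum C(n, k) over offset cells, n being the number of votes in a cell."""
    return sum(comb(n, k) for n in offset_votes(feats_a, feats_b).values())


# A, B, C keep their layout under a shift of (1, 1); D moves independently.
image_a = [("A", 0, 0), ("B", 2, 0), ("C", 0, 2), ("D", 5, 5)]
image_b = [("A", 1, 1), ("B", 3, 1), ("C", 1, 3), ("D", 9, 9)]
print(count_cooccurring_phrases(image_a, image_b, 2))  # 3
print(count_cooccurring_phrases(image_a, image_b, 3))  # 1
```

The work is one pass over the corresponding pairs plus one pass over the occupied cells, which is why the count does not require enumerating phrases explicitly.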

13 Relation with the feature vector Inner product of the feature vectors = # of co-occurring length-k phrases. M: # of corresponding pairs; in practice, linear in the number of local features. Same as BoW!

14 Inverted Index with BoW Avoid comparing with every image. Inverted index → score table (Image ID: I1, I2, …, In; Score: +1 per matching word)
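A minimal sketch of the inverted index and score table on this slide, assuming integer word IDs and unweighted votes (no tf-idf or normalization, which the slide does not show). The function names are hypothetical.

```python
from collections import defaultdict


def build_inverted_index(database):
    """database: image ID -> iterable of word IDs. Returns word ID -> set of image IDs."""
    index = defaultdict(set)
    for image_id, words in database.items():
        for word in words:
            index[word].add(image_id)
    return index


def bow_scores(index, query_words):
    """Add +1 to an image's score for every distinct query word it contains."""
    scores = defaultdict(int)
    for word in set(query_words):
        for image_id in index.get(word, ()):
            scores[image_id] += 1
    return dict(scores)


database = {"I1": [1, 2, 3], "I2": [3, 4], "I3": [7]}
index = build_inverted_index(database)
print(bow_scores(index, [1, 3, 3, 9]))  # I1 scores 2, I2 scores 1, I3 is never touched
```

Only images that share at least one word with the query are visited, which is the point of the slide's "avoid comparing with every image".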

15 Inverted Index with Word Location [Figure: each inverted-index entry also stores the word's location in the image, e.g. for I1] Assume the same word occurs only once in the same image: same memory usage as BoW

16 Score Table Compute # of Co-occurring Phrases. BoW: score table (Image ID: I1, I2, …, In; Score). BoP: compute the offset space for each image I1, I2, …, In

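17 Inverted Files with Phrases [Figure: the inverted-index entry for word w_i lists images (I1, I10, …, I8, …, I5, …); each match adds +1 to a cell of that image's offset space, with cells indexed by offsets such as (0,0), (1,0), (0,1), (0,-1), (1,-1), (-1,-1), (-1,0), …]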

18 Final Score Offset space of each image (I1, I2, …, In) → final similarity scores (Image ID: I1, I2, …, In; Score)
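Slides 15 to 18 can be put together in one sketch: an inverted index that stores word locations, one offset space per database image, and a final score computed from the votes in each cell. This is an illustration under stated assumptions, not the authors' code: coordinates are assumed to be pre-quantized, only translation is handled, scores are left unnormalized, and the function names are hypothetical.

```python
from collections import defaultdict
from math import comb


def build_location_index(database):
    """database: image ID -> list of (word, x, y). Returns word -> list of (image ID, x, y)."""
    index = defaultdict(list)
    for image_id, feats in database.items():
        for word, x, y in feats:
            index[word].append((image_id, x, y))
    return index


def bop_scores(index, query_feats, k=2):
    """Vote into each image's offset space, then sum C(n, k) over its cells."""
    offset_space = defaultdict(lambda: defaultdict(int))
    for word, x, y in query_feats:
        for image_id, xd, yd in index.get(word, ()):
            offset_space[image_id][(x - xd, y - yd)] += 1
    return {
        image_id: sum(comb(n, k) for n in cells.values())
        for image_id, cells in offset_space.items()
    }


database = {
    "I1": [("A", 1, 1), ("B", 3, 1), ("C", 1, 3)],  # same layout as the query, shifted
    "I2": [("A", 5, 0), ("B", 0, 5), ("C", 3, 3)],  # same words, different layout
}
query = [("A", 0, 0), ("B", 2, 0), ("C", 0, 2)]
print(bop_scores(build_location_index(database), query))  # {'I1': 3, 'I2': 0}
```

Both images would receive the same BoW score here, since they contain the same three words; only the phrase count separates them.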

19 Overview BoW BoP Inverted Files Min-hash: less storage and time complexity

20 Min-hash with BoW Probability of min-hash collision (same word) = Image Similarity
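The collision property on this slide can be checked numerically with a small sketch. It assumes that image similarity means set overlap (Jaccard similarity) of the images' visual-word sets and that hash functions are random permutations of the vocabulary; the helper names are hypothetical.

```python
import random


def make_permutations(vocab_size, num_hashes, seed=0):
    """One random permutation of the vocabulary per min-hash function."""
    rng = random.Random(seed)
    perms = []
    for _ in range(num_hashes):
        perm = list(range(vocab_size))
        rng.shuffle(perm)
        perms.append(perm)
    return perms


def minhash_signature(word_set, perms):
    """For each permutation, keep the smallest permuted ID among the image's words."""
    return [min(perm[w] for w in word_set) for perm in perms]


def collision_rate(sig_a, sig_b):
    """Fraction of hash functions on which the two images collide."""
    return sum(a == b for a, b in zip(sig_a, sig_b)) / len(sig_a)


def jaccard(a, b):
    return len(a & b) / len(a | b)


perms = make_permutations(vocab_size=100, num_hashes=500)
image_a, image_b = set(range(0, 20)), set(range(10, 30))
estimate = collision_rate(minhash_signature(image_a, perms),
                          minhash_signature(image_b, perms))
print(estimate, jaccard(image_a, image_b))
```

With more hash functions the collision rate concentrates around the set overlap, so a short signature can stand in for the full word set.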

21 Min-hash with Phrases Probability of k min-hash collisions with consistent geometry (details are in the paper) [Figure: offset space]

22 Other Invariances Add dimensions to the offset space; increases the memory usage [Zhang and Chen, 10]

23 Variant Matching Local histogram matching

24 Evaluation 1. BoW + Inverted Index vs. BoP + Inverted Index 2. BoW + Min-hash vs. BoP + Min-hash Post-processing methods: complementary to our work

25 Experiments – Inverted Index 5K Oxford dataset (55 queries), 1M Flickr distractors [Philbin, J. et al. 07]

26 Example Precision-recall curve Higher precision at lower recall [Plots: precision vs. recall for BoW and BoP]

27 Comparison Mean average precision: mean of the AP on 55 queries. Outperforms BoW (similar computation). Outperforms BoW+RANSAC (10 times slower on 150 top images). Larger improvement on smaller vocabulary size. [Plot legend: BoP, BoW, BoW+RANSAC, BoP+RANSAC]
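For reference, mean average precision can be computed as in the sketch below. It uses the common non-interpolated definition of AP (the mean of the precision values at the ranks where relevant images appear); the benchmark's official evaluation script may differ in details, so treat this as an assumption. The function names are hypothetical.

```python
def average_precision(ranked_ids, relevant_ids):
    """Mean of precision@rank taken at the ranks of the relevant images."""
    relevant = set(relevant_ids)
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, image_id in enumerate(ranked_ids, start=1):
        if image_id in relevant:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / len(relevant)


def mean_average_precision(results):
    """results: list of (ranked image IDs, relevant image IDs), one pair per query."""
    return sum(average_precision(r, rel) for r, rel in results) / len(results)


# Relevant images at ranks 1 and 3: AP = (1/1 + 2/3) / 2
print(average_precision(["a", "x", "b"], {"a", "b"}))
```

Because early ranks dominate the sum, a method with higher precision at low recall gains AP even when the tail of the ranking is unchanged.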

28 Computational Complexity (+Flickr 1M dataset)
Method | Memory | Runtime: Quantization | Runtime: Search
BoW | 8.1G | 0.89s | 0.137s
BoP | 8.5G | (not listed) | 0.215s
BoW+RANSAC | - | 0.89s | 4.137s
RANSAC: 4s on top 300 images

29 Experiment – Min-hash University of Kentucky dataset Min-hash with BoW: [O. Chum et al., BMVC08]

30 Conclusion Encodes more spatial information into the BoW. Can be applied to all images in the database at the searching step. Same computational complexity as BoW. Better retrieval precision than BoW+RANSAC.

