
1 Indexing Techniques Mei-Chen Yeh

2 Last week: matching two sets of features
Strategy 1: Convert each image to a fixed-length feature vector (bag-of-words) and use a conventional proximity measure. Strategy 2: Build point correspondences.

3 Last week: bag-of-words
(Figure: codeword frequency histogram built over a visual vocabulary.)

4 Matching local features: building patch correspondences
To generate candidate matches, find patches that have the most similar appearance (e.g., lowest SSD). Slide credits: Prof. Kristen Grauman

5 Matching local features: building patch correspondences
Simplest approach: compare them all and take the closest (or the closest k, or all matches within a thresholded distance). Slide credits: Prof. Kristen Grauman

6 Indexing local features
Each patch / region has a descriptor, which is a point in some high-dimensional feature space (e.g., SIFT).

7 Indexing local features
When we see close points in feature space, we have similar descriptors, which indicates similar local content.

8 Problem statement With potentially thousands of features per image, and hundreds to millions of images to search, how do we efficiently find those that are relevant to a new image?

9 50 thousand images: printed and stacked, about a 4 m pile. Slide credit: Nistér and Stewénius

10 110 million images?

11 To continue the analogy: if we printed all these images on paper and stacked them,

12 the pile would stack as high as…

13 Scalability matters!
Mount Everest. Another way to put this in perspective: Google image search not long ago claimed to index 2 billion images, although based on metadata, while we do it based on image content. So, with about 20 desktop systems like the one I just showed, it seems it may be possible to build a web-scale content-based image search engine, and we are hoping this paper will fuel the race for the first such search engine. That is some motivation. Let me now move to the contribution of the paper: as you can guess by now, it is about the scalability of recognition and retrieval. Scalability matters!

14 The Nearest-Neighbor Search Problem
Given: a set S of n points in d dimensions, and a query point q. Which point in S is closest to q? Time complexity of a linear scan: O(dn).
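
For reference, a minimal linear-scan sketch (not from the slides; NumPy assumed, and the function name is illustrative) that makes the O(dn) cost visible:

```python
import numpy as np

def linear_scan_nn(S, q):
    """Return the index of the point in S closest to query q.

    S: (n, d) array of database points, q: (d,) query vector.
    Cost is O(dn): every one of the n points is compared to q.
    """
    dists = np.sum((S - q) ** 2, axis=1)   # squared Euclidean distance to all points
    return int(np.argmin(dists))

# Example: 10,000 random 128-d points (SIFT-sized descriptors).
rng = np.random.default_rng(0)
S = rng.random((10_000, 128))
q = rng.random(128)
print(linear_scan_nn(S, q))
```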

15 The Nearest-Neighbor Search Problem

16 The Nearest-Neighbor Search Problem
r-nearest neighbor: for any query q, returns a point p ∈ S s.t. D(p, q) ≤ r (if one exists). c-approximate r-nearest neighbor: for any query q, returns a point p' ∈ S s.t. D(p', q) ≤ cr whenever there is a point p ∈ S with D(p, q) ≤ r.

17 Today Indexing local features Inverted file Vocabulary tree
Locality-sensitive hashing

18 Indexing local features: inverted file

19 Indexing local features: inverted file
For text documents, an efficient way to find all pages on which a word occurs is to use an index. We want to find all images in which a feature occurs. page ~ image, word ~ feature. To use this idea, we'll need to map our features to "visual words".

20 Text retrieval vs. image search
What makes the problems similar? What makes them different?

21 Visual words
e.g., SIFT descriptor space: each point is 128-dimensional. Extract some local features from a number of images… Slide credit: D. Nistér, CVPR 2006

22 Visual words

23 Visual words

24 Visual words

25 Each point is a local descriptor, e.g., a SIFT vector.

26 Example: Quantize into 3 words

27 Visual words
Map high-dimensional descriptors to tokens/words by quantizing the feature space: quantize via clustering, and let the cluster centers be the prototype "words". Determine which word to assign to each new image region by finding the closest cluster center.
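
A minimal sketch of this clustering-then-assignment step, using a tiny Lloyd's k-means in NumPy; `kmeans` and `assign_words` are illustrative names, not from any particular library.

```python
import numpy as np

def kmeans(X, k, n_iter=20, seed=0):
    """Plain Lloyd's k-means: returns a (k, d) array of cluster centers."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Assign each descriptor to its nearest center.
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        # Recompute each center as the mean of its members.
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return centers

def assign_words(descriptors, centers):
    """Map each descriptor to the id of its closest cluster center (its visual word)."""
    d2 = ((descriptors[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)

# Build a 3-word vocabulary from random "SIFT-like" descriptors.
rng = np.random.default_rng(1)
train = rng.random((1000, 128))
vocab = kmeans(train, k=3)
words = assign_words(rng.random((5, 128)), vocab)   # word ids in {0, 1, 2}
```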

28 Visual words Each group of patches belongs to the same visual word!
Figure from Sivic & Zisserman, ICCV 2003

29 Visual vocabulary formation
Issues: sampling strategy (where to extract features? fixed locations or interest points?); clustering / quantization algorithm; what corpus provides the features (a universal vocabulary?); vocabulary size / number of words; weight of each word?

30 Inverted file index
The index maps each word to the ids of the images containing it. Why does the index give us a significant gain in efficiency?

31 Inverted file index
A query image is matched to database images that share visual words.
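
A minimal sketch (assumed, not from the slides) of such an inverted file in plain Python: the index maps each visual word id to the set of images containing it, and a query only touches images that share at least one word with it.

```python
from collections import defaultdict, Counter

def build_inverted_file(db_words):
    """db_words: {image_id: list of visual word ids occurring in that image}.
    Returns {word_id: set of image_ids} -- the inverted file."""
    index = defaultdict(set)
    for image_id, words in db_words.items():
        for w in words:
            index[w].add(image_id)
    return index

def query(index, query_words):
    """Score database images by how many visual words they share with the query.
    Only images that share at least one word are ever touched -- that is the
    efficiency gain over scanning every database image."""
    votes = Counter()
    for w in set(query_words):
        for image_id in index.get(w, ()):
            votes[image_id] += 1
    return votes.most_common()

db = {"img1": [3, 7, 7, 42], "img2": [7, 9], "img3": [1, 2]}
index = build_inverted_file(db)
print(query(index, [7, 42, 5]))   # img1 shares two words, img2 shares one
```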

32 tf-idf weighting Term frequency – inverse document frequency
Describes the frequency of each word within an image and decreases the weights of words that appear often in the database: the weight goes up (w↗) for discriminative regions (like "economic", "trade" in text) and goes down (w↘) for common regions (like "the", "most", "we"). This is the standard weighting for text retrieval, measuring a word's importance in a particular document.

33 tf-idf weighting Term frequency – inverse document frequency
Describes the frequency of each word within an image and decreases the weights of words that appear often in the database. Standard weighting for text retrieval:
t_i = (n_id / n_d) * log(N / n_i)
where n_id = number of occurrences of word i in document d, n_d = number of words in document d, n_i = number of documents word i occurs in (in the whole database), and N = total number of documents in the database.
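
A small sketch of the weighting above; the function name and the example numbers are illustrative.

```python
import numpy as np

def tfidf(n_id, n_d, N, n_i):
    """t_i = (n_id / n_d) * log(N / n_i)

    n_id: occurrences of word i in document (image) d
    n_d : total number of words in document d
    N   : total number of documents in the database
    n_i : number of documents that contain word i
    """
    return (n_id / n_d) * np.log(N / n_i)

# A word that appears often in one image but in few database images gets a
# high weight; a word present in every database image gets weight 0.
print(tfidf(n_id=5, n_d=100, N=10_000, n_i=50))      # ~0.265
print(tfidf(n_id=5, n_d=100, N=10_000, n_i=10_000))  # 0.0
```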

34 Bag-of-Words + Inverted file
Bag-of-words representation Inverted file /research/vgoogle/index.html

35 D. Nistér and H. Stewénius. Scalable Recognition with a Vocabulary Tree, CVPR 2006.

36 We then run k-means on the descriptor space. In this setting, k defines what we call the branch factor of the tree, which indicates how fast the tree branches. In this illustration, k is three. We then run k-means again, recursively, on each of the resulting quantization cells. This defines the vocabulary tree, which is essentially a hierarchical set of cluster centers and their corresponding Voronoi regions. We typically use a branch factor of 10 and six levels, resulting in a million leaf nodes. We lovingly call this the Mega-Voc.
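
A minimal sketch (not the authors' implementation) of building such a tree by recursive k-means and quantizing a descriptor by descending it; all names are illustrative and NumPy is assumed.

```python
import numpy as np

def kmeans(X, k, n_iter=15, seed=0):
    """Tiny Lloyd's k-means; returns (centers, labels)."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)].copy()
    for _ in range(n_iter):
        labels = ((X[:, None] - centers[None]) ** 2).sum(-1).argmin(1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(0)
    labels = ((X[:, None] - centers[None]) ** 2).sum(-1).argmin(1)
    return centers, labels

def build_vocab_tree(X, branch=3, depth=2):
    """Recursively k-means the descriptors: each node stores its children's
    centers; the leaves play the role of visual words."""
    if depth == 0 or len(X) < branch:
        return None                      # leaf
    centers, labels = kmeans(X, branch)
    children = [build_vocab_tree(X[labels == j], branch, depth - 1)
                for j in range(branch)]
    return {"centers": centers, "children": children}

def quantize(tree, v, path=()):
    """Descend the tree, picking the nearest child center at each level.
    The path of choices identifies the leaf (visual word)."""
    if tree is None:
        return path
    j = int(((tree["centers"] - v) ** 2).sum(1).argmin())
    return quantize(tree["children"][j], v, path + (j,))

rng = np.random.default_rng(2)
descs = rng.random((2000, 128))
tree = build_vocab_tree(descs, branch=3, depth=2)   # the Mega-Voc uses branch 10, depth 6
print(quantize(tree, rng.random(128)))              # a path such as (1, 2)
```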

37 Visualize as a tree

38 Vocabulary Tree Training: Filling the tree
[Nister & Stewenius, CVPR'06] Slide credit: David Nister

39 Vocabulary Tree Training: Filling the tree
[Nister & Stewenius, CVPR'06] Slide credit: David Nister

40 Vocabulary Tree Training: Filling the tree
[Nister & Stewenius, CVPR'06] Slide credit: David Nister

41 Vocabulary Tree Training: Filling the tree
[Nister & Stewenius, CVPR'06] Slide credit: David Nister

42 Vocabulary Tree Training: Filling the tree
[Nister & Stewenius, CVPR'06] Slide credit: David Nister

43 Vocabulary Tree Recognition
Candidate database images are retrieved; optionally, perform geometric verification on them. [Nister & Stewenius, CVPR'06] Slide credit: David Nister

44 Think about the computational advantage of the hierarchical tree vs. a flat vocabulary!
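
A quick back-of-the-envelope comparison, using the branch factor and depth quoted earlier (branch factor 10, six levels): quantizing one descriptor against a flat vocabulary of the same size means comparing it to every leaf, while the tree only compares it to the k children at each of the L levels.

```python
k, L = 10, 6                 # branch factor and depth from the Mega-Voc example
flat = k ** L                # flat vocabulary: compare against 1,000,000 centers
hierarchical = k * L         # tree: k comparisons at each of L levels = 60
print(flat, hierarchical)    # 1000000 60
```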

45 Hashing

46 Direct addressing
Create a direct-address table with m slots, one slot per possible key: each key k in the universe U indexes slot k directly, which stores the key and its satellite data. (Figure: universe of keys U, the set K of actual keys, and the table slots.)

47 Direct addressing Search operation: O(1)
Problem: the range of keys can be large! 64-bit keys give 2^64 = 18,446,744,073,709,551,616 different keys, and a SIFT descriptor is 128 × 8 bits = 1024 bits.

48 Hashing O(1) average-case time
Use a hash function h to compute the slot from the key k. In the hash table T with m slots, slot h(k) is generally not k itself, and different keys may collide (e.g., h(k4) = h(k5)) and share a bucket.
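
As a concrete illustration of the idea above, a minimal sketch (not from the slides) of a hash table with chaining, where colliding keys share a bucket; the class and method names are invented for this example.

```python
class ChainedHashTable:
    """Hash table with chaining: colliding keys share a bucket (a Python list)."""
    def __init__(self, m=8):
        self.m = m
        self.slots = [[] for _ in range(m)]

    def _slot(self, key):
        return hash(key) % self.m        # h(k): compute the slot from the key

    def insert(self, key, value):
        self.slots[self._slot(key)].append((key, value))

    def search(self, key):               # O(1) on average under uniform hashing
        for k, v in self.slots[self._slot(key)]:
            if k == key:
                return v
        return None

t = ChainedHashTable()
t.insert("k3", "data3"); t.insert("k5", "data5")
print(t.search("k5"))
```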

49 Hashing: a good hash function
Satisfies the assumption of simple uniform hashing: each key is equally likely to hash to any of the m slots. How to design a hash function for indexing high-dimensional data?

50 (Figure: how do we hash a 128-d descriptor into a hash table T?)

51 Locality-sensitive hashing
Indyk and Motwani. Approximate nearest neighbors: towards removing the curse of dimensionality, STOC 1998.

52 Locality-sensitive hashing (LSH)
Hash functions are locality-sensitive if, for any pair of points p, q, we have: Pr[h(p)=h(q)] is "high" if p is close to q, and Pr[h(p)=h(q)] is "low" if p is far from q.

53 Locality Sensitive Hashing
A family H of functions h: R^d → U is called (r, cr, P1, P2)-sensitive if, for any p, q: if D(p, q) ≤ r then Pr[h(p)=h(q)] > P1, and if D(p, q) ≥ cr then Pr[h(p)=h(q)] < P2.

54 LSH Function: Hamming Space
Consider binary vectors, i.e., points from {0, 1}^d. Hamming distance D(p, q) = number of positions on which p and q differ. Example (d = 3): D(100, 011) = 3, D(010, 111) = 2.

55 LSH Function: Hamming Space
Define the hash function h as h_i(p) = p_i, where p_i is the i-th bit of p. Example: select the 1st dimension; then h(010) = 0 and h(111) = 1. For a randomly selected bit, Pr[h(p)≠h(q)] = D(p, q)/d, so Pr[h(p)=h(q)] = 1 − D(p, q)/d. Clearly, h is locality-sensitive.

56 LSH Function: Hamming Space
A k-bit locality-sensitive hash function is defined as g(p) = [h_1(p), h_2(p), …, h_k(p)]^T, where each h_i is chosen randomly and produces a single bit. Then Pr(similar points collide) ≥ P1^k and Pr(dissimilar points collide) ≤ P2^k. Indyk and Motwani [1998]
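
A minimal sketch of this bit-sampling family for Hamming space: k bit positions are drawn at random once, and g(p) simply reads those bits. Names are illustrative.

```python
import numpy as np

def make_bit_sampler(d, k, seed=0):
    """g(p) = (p[i1], ..., p[ik]) for k bit positions drawn at random once."""
    rng = np.random.default_rng(seed)
    idx = rng.integers(0, d, size=k)        # each h_i samples one coordinate
    return lambda p: tuple(np.asarray(p)[idx])

d, k = 16, 4
g = make_bit_sampler(d, k)
rng = np.random.default_rng(3)
p = rng.integers(0, 2, size=d)
q = p.copy(); q[0] ^= 1                     # q differs from p in a single bit
print(g(p), g(q))                           # collide with probability (1 - 1/d)^k
```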

57 LSH Function: R^2 space. Consider 2-d vectors.

58 LSH Function: R^2 space. The probability that a random hyperplane separates two unit vectors depends on the angle between them: Pr[h(p)≠h(q)] = θ(p, q)/π, so Pr[h(p)=h(q)] = 1 − θ(p, q)/π.
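
A minimal sketch of the random-hyperplane hash suggested by this slide: h(p) is the sign of a dot product with a random Gaussian direction. Names are illustrative.

```python
import numpy as np

def make_hyperplane_hash(d, seed=0):
    """h(p) = sign(a . p) for a random Gaussian direction a.
    For unit vectors p, q: Pr[h(p) != h(q)] = angle(p, q) / pi."""
    rng = np.random.default_rng(seed)
    a = rng.standard_normal(d)
    return lambda p: int(np.dot(a, p) >= 0)

d = 128
h = make_hyperplane_hash(d)
rng = np.random.default_rng(4)
p = rng.standard_normal(d); p /= np.linalg.norm(p)
q = p + 0.1 * rng.standard_normal(d); q /= np.linalg.norm(q)   # q close to p
print(h(p), h(q))   # equal with high probability, since the angle is small
```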

59 LSH pre-processing: each image (point) is entered into L hash tables indexed by independently constructed g_1, g_2, …, g_L. Preprocessing space: O(LN).

60 LSH querying: for each hash table, return the bin indexed by g_i(q), 1 ≤ i ≤ L, then perform a linear search on the union of the bins.
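
A minimal sketch of the L-table scheme from the two slides above, assuming NumPy and random-hyperplane hashes as the g_i (an illustrative choice, not prescribed by the slides): insert every point into L tables, then query the union of the L bins and linearly scan it.

```python
import numpy as np

def make_g(d, k, rng):
    """One k-bit hash g(p): the signs of k random hyperplane projections."""
    A = rng.standard_normal((k, d))
    return lambda p: tuple((A @ p >= 0).astype(int))

def build_tables(points, d, k=8, L=10, seed=0):
    rng = np.random.default_rng(seed)
    gs = [make_g(d, k, rng) for _ in range(L)]
    tables = [dict() for _ in range(L)]
    for idx, p in enumerate(points):          # preprocessing: O(L * N) space
        for g, table in zip(gs, tables):
            table.setdefault(g(p), []).append(idx)
    return gs, tables

def query(q, points, gs, tables):
    # Union of the L bins indexed by g_i(q), then a linear scan over it.
    candidates = set()
    for g, table in zip(gs, tables):
        candidates.update(table.get(g(q), []))
    if not candidates:
        return None
    cand = list(candidates)
    dists = [np.sum((points[i] - q) ** 2) for i in cand]
    return cand[int(np.argmin(dists))]

rng = np.random.default_rng(5)
points = rng.standard_normal((5000, 128))
gs, tables = build_tables(points, d=128)
q = points[42] + 0.01 * rng.standard_normal(128)
print(query(q, points, gs, tables))           # very likely 42
```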

61 W.-T. Lee and H.-T. Chen. Probing the Local-Feature Space of Interest Points, ICIP 2010.

62 Hash family
h_{a,b}(v) = ⌊(a·v + b) / r⌋. The dot product a·v projects each vector v onto a line. a: random vector sampled from a Gaussian distribution; b: real value chosen uniformly from the range [0, r]; r: segment width.

63 Building the hash table

64 Building the hash table
r: segment width = (max − min)/t. For each random projection, we get t buckets.

65 Building the hash table
Generate K projections and combine them to get an index into the hash table. How many buckets do we get? t^K

66 Building the hash table
Example: 5 projections (K = 5), 15 segments (t = 15): 15^5 = 759,375 buckets in total!
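
A minimal sketch of the scheme on the last few slides: K Gaussian projections, each cut into t segments of width (max − min)/t, combined into a single bucket index in [0, t^K). Function and variable names are illustrative.

```python
import numpy as np

def build_pstable_index(X, K=5, t=15, seed=0):
    """Hash each row of X into one of t**K buckets using K Gaussian projections."""
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    A = rng.standard_normal((K, d))           # a: Gaussian random vectors
    proj = X @ A.T                            # (N, K) projected values
    lo, hi = proj.min(axis=0), proj.max(axis=0)
    r = (hi - lo) / t                         # segment width per projection
    b = rng.uniform(0, r)                     # b: offset chosen uniformly in [0, r)
    seg = np.floor((proj - lo + b) / r).astype(int)
    seg = np.clip(seg, 0, t - 1)              # t segments per projection
    # Combine the K segment ids into a single bucket index in [0, t**K).
    bucket = np.zeros(len(X), dtype=np.int64)
    for j in range(K):
        bucket = bucket * t + seg[:, j]
    return bucket

rng = np.random.default_rng(6)
X = rng.standard_normal((10_000, 128))
buckets = build_pstable_index(X)              # K=5, t=15 -> 15**5 = 759,375 possible buckets
print(len(np.unique(buckets)), "buckets occupied")
```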

67 Sketching the Feature Space
Natural image patches (from the Berkeley segmentation database) and noise image patches (randomly generated). Collect image patches at three different sizes: 16×16, 32×32, 64×64. Each set consists of 200,000 patches.

68 Patch distribution over buckets

69 Summary Indexing techniques are essential for organizing a database and for enabling fast matching. For indexing high-dimensional data: inverted file, vocabulary tree, locality-sensitive hashing.

70 Resources and extended readings
LSH Matlab Toolbox. Yeh et al., "Adaptive Vocabulary Forests for Dynamic Indexing and Category Learning," ICCV 2007.

