1 Spectral Hashing
Y. Weiss (Hebrew U.), A. Torralba (MIT), Rob Fergus (NYU)

2 Motivation
What does the world look like? Object recognition for large-scale search. High-level image statistics, and also the relationships between objects and the scene in general.

3 Semantic Hashing
[Salakhutdinov & Hinton, 2007] Query image → semantic hash function → binary code, used as a query address into the address space; images in the database at nearby addresses are semantically similar. Quite different to a (conventional) randomizing hash.

4 1. Locality Sensitive Hashing
Gionis, Indyk & Motwani (1999). Take random projections of the data (e.g. the Gist descriptor) and quantize each projection with a few bits. No learning involved.
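
A minimal sketch of this scheme, assuming numpy. This is the sign-of-projection variant with one bit per projection; thresholding at the median (an assumption, since the slide does not specify the quantizer) balances each bit.

```python
import numpy as np

def lsh_codes(X, n_bits, rng=None):
    """Locality Sensitive Hashing sketch: threshold random projections.

    X: (n_samples, d) array of descriptors (e.g. Gist vectors).
    Returns binary codes, the projection matrix, and the thresholds,
    so novel points can be hashed the same way.
    """
    rng = np.random.default_rng(rng)
    R = rng.standard_normal((X.shape[1], n_bits))  # random directions
    proj = X @ R
    thresh = np.median(proj, axis=0)  # one threshold per projection
    codes = (proj > thresh).astype(np.uint8)
    return codes, R, thresh
```

No learning is involved beyond drawing R; storing R and the thresholds is enough to hash queries.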

5 Toy Example 2D uniform distribution

6 2. Boosting
Modified form of BoostSSC [Shakhnarovich, Viola & Darrell, 2003]. Learn a threshold & dimension for each bit (a weak classifier). Positive examples are pairs of similar images; negative examples are pairs of unrelated images.
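
The per-bit weak learner can be sketched as follows. This is a simplified, unweighted stand-in for the BoostSSC update: real boosting reweights the pairs after each bit, and the candidate thresholds here are arbitrary quantiles.

```python
import numpy as np

def learn_bit(X, pos_pairs, neg_pairs):
    """Pick the (dimension, threshold) whose bit agrees on similar pairs
    and disagrees on dissimilar pairs most often.

    pos_pairs / neg_pairs: lists of (i, j) row-index pairs into X.
    """
    best_d, best_t, best_score = 0, 0.0, -1
    for d in range(X.shape[1]):
        for t in np.quantile(X[:, d], [0.25, 0.5, 0.75]):
            bits = X[:, d] > t
            score = sum(bits[i] == bits[j] for i, j in pos_pairs) + \
                    sum(bits[i] != bits[j] for i, j in neg_pairs)
            if score > best_score:
                best_d, best_t, best_score = d, t, score
    return best_d, best_t
```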

7 Toy Example 2D uniform distribution

8 3. Restricted Boltzmann Machine (RBM)
Building block of a Deep Belief Network [Hinton & Salakhutdinov, Science 2006]. A single RBM layer connects visible units to hidden units with symmetric weights W; the units are binary & stochastic. The model attempts to reconstruct the input at the visible layer from the activation of the hidden layer.
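
One up-down pass of such a layer might look like this (a numpy sketch with assumed weight matrix W and biases; training via contrastive divergence is omitted).

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def rbm_up_down(v, W, b_h, b_v, rng):
    """Sample binary stochastic hidden units from the visible layer,
    then reconstruct the visible layer from the hidden sample.
    The same symmetric weights W are used in both directions.
    """
    p_h = sigmoid(v @ W + b_h)                       # P(h = 1 | v)
    h = (rng.random(p_h.shape) < p_h).astype(float)  # stochastic binary h
    p_v = sigmoid(h @ W.T + b_v)                     # reconstruction P(v = 1 | h)
    return h, p_v
```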

9 Multi-Layer RBM: non-linear dimensionality reduction
Input Gist vector (512 dimensions, linear units at the first layer) → Layer 1 (w1: 512 → 512) → Layer 2 (w2: 512 → 256) → Layer 3 (w3: 256 → N) → output binary code (N-dimensional).

10 Toy Example 2D uniform distribution

11 2-D Toy example: 3 bits, 7 bits, 15 bits
Hamming distance from the query point: red – 0 bits, green – 1 bit, blue – 2 bits, black – >2 bits.

12 Toy Results
Distance: red – 0 bits, green – 1 bit, blue – 2 bits.

13 Semantic Hashing
[Salakhutdinov & Hinton, 2007] Query image → semantic hash function → binary code, used as a query address into the address space; images in the database at nearby addresses are semantically similar. Quite different to a (conventional) randomizing hash.

14 Spectral Hash
Query image → non-linear dimensionality reduction → real-valued vectors → spectral hash → binary code, used as a query address into the address space; images in the database at nearby addresses are semantically similar. Quite different to a (conventional) randomizing hash.

15 Spectral Hashing (NIPS ’08)
Assume points are embedded in Euclidean space. How do we binarize so that Hamming distance approximates Euclidean distance? E.g. HamDist(x, y) = 3 for two codes differing in 3 bit positions.
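
For concreteness, Hamming distance just counts differing bits (the codes below are illustrative, not the slide's actual examples).

```python
def hamming(a, b):
    """Number of differing bits between two equal-length binary codes."""
    return sum(x != y for x, y in zip(a, b))

print(hamming([1, 0, 1, 1, 0], [0, 0, 1, 0, 1]))  # 3

# codes packed into integers: XOR, then count the set bits
print(bin(0b10110 ^ 0b00101).count("1"))          # 3
```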

16 Spectral Hashing theory
Want to minimize Yᵀ(D − W)Y subject to: each bit is on 50% of the time; the bits are independent. Sadly, this is NP-complete. Relax the problem by letting Y be continuous; it then becomes an eigenvector problem on the graph Laplacian D − W.
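
On a finite sample the relaxed problem can be solved directly (a numpy sketch; the Gaussian affinity and its width sigma are assumptions, and the trivial constant eigenvector with eigenvalue 0 is discarded).

```python
import numpy as np

def relaxed_codes(X, k, sigma=1.0):
    """Solve the continuous relaxation: eigenvectors of the graph
    Laplacian L = D - W with the k smallest nonzero eigenvalues.
    Thresholding the returned columns at zero would give binary codes.
    """
    sq_dists = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    W = np.exp(-sq_dists / (2 * sigma ** 2))   # affinity matrix
    L = np.diag(W.sum(axis=1)) - W             # unnormalized Laplacian
    eigvals, eigvecs = np.linalg.eigh(L)       # ascending eigenvalues
    return eigvecs[:, 1:k + 1]                 # skip the constant eigenvector
```

This only codes the training sample; extending to novel points is exactly the problem the following slides address.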

17 Nyström Approximation
A method for approximating eigenfunctions: interpolate between existing data points. Requires evaluating the distance to the existing data, so the cost grows linearly with the # of points. Also overfits badly in practice.

18 What about a novel data point?
Need a function to map new points into the space. Take the limit of the eigenvectors as n → ∞, carefully normalizing the graph Laplacian. An analytical form of the eigenfunctions exists for certain distributions (uniform, Gaussian), so computing the code for a new point takes constant time. For a uniform distribution, the eigenfunction depends only on the extent of the distribution (b − a).
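
For the 1-D uniform case the closed form is simple. The sketch below follows the NIPS '08 paper; shifting x by a is my normalization so the formula works on an arbitrary interval [a, b], and eps is the assumed kernel-width parameter.

```python
import numpy as np

def eigenfunction(x, k, a, b):
    """Analytical Laplacian eigenfunction (mode k) for a uniform
    distribution on [a, b]; depends only on the extent (b - a), so
    evaluating it at a novel point x is constant time."""
    return np.sin(np.pi / 2 + k * np.pi * (x - a) / (b - a))

def eigenvalue(k, a, b, eps=1.0):
    """Matching eigenvalue; increases with mode number k."""
    return 1.0 - np.exp(-(eps ** 2 / 2) * (k * np.pi / (b - a)) ** 2)
```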

19 Eigenfunctions for uniform distribution

20 The Algorithm
Input: data {xi} of dimensionality d; desired # of bits, k.
1. Fit a multidimensional rectangle to the data: run PCA to align the axes, then bound a uniform distribution.
2. For each dimension, calculate the k smallest eigenfunctions, giving dk eigenfunctions in total.
3. Pick the k eigenfunctions with the smallest eigenvalues.
4. Threshold the chosen eigenfunctions at zero to give the binary codes.
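
The steps above can be sketched end to end, assuming numpy. Ranking by eigenvalue reduces to ranking the modes by k/(b − a), since the analytical eigenvalue grows monotonically with that ratio.

```python
import numpy as np

def spectral_hash(X, n_bits):
    """Sketch of the algorithm: PCA-align, bound a rectangle, evaluate
    the analytical eigenfunctions per dimension, keep the n_bits modes
    with smallest eigenvalues, threshold at zero."""
    # 1. fit a multidimensional rectangle: PCA, then a bounding box
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    Xp = Xc @ Vt.T
    a, b = Xp.min(axis=0), Xp.max(axis=0)
    # 2./3. enumerate candidate modes (k, dim) and sort by eigenvalue,
    # i.e. by k / (b - a) for that dimension
    dims = Xp.shape[1]
    modes = [(k, j) for j in range(dims) for k in range(1, n_bits + 1)]
    modes.sort(key=lambda m: m[0] / (b[m[1]] - a[m[1]]))
    # 4. threshold the chosen eigenfunctions at zero
    codes = np.zeros((X.shape[0], n_bits), dtype=np.uint8)
    for i, (k, j) in enumerate(modes[:n_bits]):
        f = np.sin(np.pi / 2 + k * np.pi * (Xp[:, j] - a[j]) / (b[j] - a[j]))
        codes[:, i] = f > 0
    return codes
```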

21 1. Fit Multidimensional Rectangle
Run PCA to align axes Bound uniform distribution

22 2. Calculate Eigenfunctions

23 3. Pick k smallest Eigenfunctions
Sorted by eigenvalue, e.g. k = 3.

24 4. Threshold chosen Eigenfunctions

25 Back to the 2-D Toy example
3 bits, 7 bits, 15 bits. Distance: red – 0 bits, green – 1 bit, blue – 2 bits.

26 2-D Toy Example Comparison

27 10-D Toy Example

28 Experiments on Real Data

29 Input Image representation: Gist vectors
Pixels are not a convenient representation; use the Gist descriptor instead [Oliva & Torralba, IJCV 2001]. 512 dimensions per image (real-valued → 16,384 bits). L2 distance between Gist vectors is not a bad substitute for human perceptual distance. No color information.

30 LabelMe images
22,000 images (20,000 train | 2,000 test). Ground-truth segmentations for all. Assume L2 Gist distance is the true distance.

31 LabelMe data

32

33 Extensions

34 How to handle non-uniform distributions

35 Bit allocation between dimensions
Compare the value of cuts in the original space, i.e. before the pointwise nonlinearity.

36 Summary
Spectral hashing is a simple way of computing good binary codes, but it is forced to make a big assumption about the data distribution; point-wise non-linearities can map the distribution to uniform. More experiments on real data are needed.

37

38

39 Overview
Assume points are embedded in Euclidean space (e.g. the output from an RBM). How to binarize the space so that Hamming distance between points approximates L2 distance?

40 Semantic Hashing beyond 30 bits

41 Strategies for Binarization
Deliberately add noise during backprop; this forces the activations to extreme values in order to overcome the noise.

