Spectral Approaches to Nearest Neighbor Search arXiv:1408.0751 Robert Krauthgamer (Weizmann Institute) Joint with: Amirali Abdullah, Alexandr Andoni, Ravi Kannan.

Similar presentations
Property Testing of Data Dimensionality Robert Krauthgamer ICSI and UC Berkeley Joint work with Ori Sasson (Hebrew U.)

A Nonlinear Approach to Dimension Reduction Robert Krauthgamer Weizmann Institute of Science Joint work with Lee-Ad Gottlieb.
Approximating Edit Distance in Near-Linear Time Alexandr Andoni (MIT) Joint work with Krzysztof Onak (MIT)
Nearest Neighbor Search in High Dimensions Seminar in Algorithms and Geometry Mica Arie-Nachimson and Daniel Glasner April 2009.
Principal Component Analysis Based on L1-Norm Maximization Nojun Kwak IEEE Transactions on Pattern Analysis and Machine Intelligence, 2008.
Algorithmic High-Dimensional Geometry 1 Alex Andoni (Microsoft Research SVC)
Overcoming the L 1 Non- Embeddability Barrier Robert Krauthgamer (Weizmann Institute) Joint work with Alexandr Andoni and Piotr Indyk (MIT)
MIT CSAIL Vision interfaces Towards efficient matching with random hashing methods… Kristen Grauman Gregory Shakhnarovich Trevor Darrell.
Dimensionality Reduction. Outline: from distances to points (MultiDimensional Scaling, MDS); dimensionality reductions or data projections; random projections.
The Power of Comparative Reasoning. Jay Yagnik, Dennis Strelow, David Ross, Ruei-Sung Lin. Presented by Relja Arandjelović, University of Oxford, 29th November 2011.
Similarity Search in High Dimensions via Hashing
A Novel Local Feature Descriptor for Image Matching. Heng Yang, Qing Wang. ICME 2008.
Active hashing and its application to image and text retrieval. Yi Zhen, Dit-Yan Yeung. Presented by Arshad Jamal, Rajesh Dhania, Vinkal Vishnoi.
Data Structures and Functional Programming Algorithms for Big Data Ramin Zabih Cornell University Fall 2012.
Nearest Neighbor Search in high-dimensional spaces Alexandr Andoni (Microsoft Research)
Sketching and Embedding are Equivalent for Norms Alexandr Andoni (Simons Institute) Robert Krauthgamer (Weizmann Institute) Ilya Razenshteyn (CSAIL MIT)
Spectral Approaches to Nearest Neighbor Search Alex Andoni Joint work with:Amirali Abdullah Ravi Kannan Robi Krauthgamer.
Large-scale matching CSE P 576 Larry Zitnick
Affine-invariant Principal Components Charlie Brubaker and Santosh Vempala Georgia Tech School of Computer Science Algorithms and Randomness Center.
Sequential Projection Learning for Hashing with Compact Codes. Jun Wang (Columbia University, New York, USA), Sanjiv Kumar (Google Research, New York, USA), Shih-Fu Chang (Columbia University, New York, USA).
Approximate Nearest Neighbors and the Fast Johnson-Lindenstrauss Transform Nir Ailon, Bernard Chazelle (Princeton University)
Lecture 18: Syntactic Web Clustering. CS
Edit Distance and Large Data Sets. Ziv Bar-Yossef, Robert Krauthgamer, Ravi Kumar, T.S. Jayram. IBM Almaden, Technion.
Y. Weiss (Hebrew U.) A. Torralba (MIT) Rob Fergus (NYU)
Summer School on Hashing’14 Locality Sensitive Hashing Alex Andoni (Microsoft Research)
Optimal Data-Dependent Hashing for Approximate Near Neighbors
Sketching and Embedding are Equivalent for Norms Alexandr Andoni (Simons Inst. / Columbia) Robert Krauthgamer (Weizmann Inst.) Ilya Razenshteyn (MIT)
Dimensionality Reduction
FLANN Fast Library for Approximate Nearest Neighbors
Large Scale Recognition and Retrieval. What does the world look like? High-level image statistics. Object recognition for large-scale search. Focus on scaling.
Efficient Image Search and Retrieval using Compact Binary Codes
Approximation algorithms for large-scale kernel methods Taher Dameh School of Computing Science Simon Fraser University March 29th, 2010.
Beyond Locality Sensitive Hashing Alex Andoni (Microsoft Research) Joint with: Piotr Indyk (MIT), Huy L. Nguyen (Princeton), Ilya Razenshteyn (MIT)
Minimal Loss Hashing for Compact Binary Codes
Sublinear Algorithms via Precision Sampling Alexandr Andoni (Microsoft Research) joint work with: Robert Krauthgamer (Weizmann Inst.) Krzysztof Onak (CMU)
Sketching and Nearest Neighbor Search (2) Alex Andoni (Columbia University) MADALGO Summer School on Streaming Algorithms 2015.
Approximate Nearest Neighbors: Towards Removing the Curse of Dimensionality Piotr Indyk, Rajeev Motwani The 30th annual ACM symposium on theory of computing.
Nearest Neighbor Classifier: 1. K-NN Classifier; 2. Multi-Class Classification.
Chapter 13 (Prototype Methods and Nearest-Neighbors)
Fast Query-Optimized Kernel Machine Classification Via Incremental Approximate Nearest Support Vectors by Dennis DeCoste and Dominic Mazzoni International.
Database Management Systems, R. Ramakrishnan. Algorithms for clustering large datasets in arbitrary metric spaces.
Optimal Data-Dependent Hashing for Nearest Neighbor Search Alex Andoni (Columbia University) Joint work with: Ilya Razenshteyn.
Sketching and Embedding are Equivalent for Norms Alexandr Andoni (Columbia) Robert Krauthgamer (Weizmann Inst) Ilya Razenshteyn (MIT)
Sampling in Graphs Alexandr Andoni (Microsoft Research)
Iterative K-Means Algorithm Based on Fisher Discriminant. Mantao Xu, Department of Computer Science, University of Joensuu, Joensuu, Finland.
Summer School on Hashing’14 Dimension Reduction Alex Andoni (Microsoft Research)
Sketching complexity of graph cuts Alexandr Andoni joint work with: Robi Krauthgamer, David Woodruff.
Spectral Methods for Dimensionality
Data-dependent Hashing for Similarity Search
Clustering Data Streams
Grassmannian Hashing for Subspace searching
Data Driven Resource Allocation for Distributed Learning
Approximate Near Neighbors for General Symmetric Norms
Sublinear Algorithmic Tools 3
Lecture 11: Nearest Neighbor Search
Sublinear Algorithmic Tools 2
Spectral Approaches to Nearest Neighbor Search [FOCS 2014]
Sketching and Embedding are Equivalent for Norms
Lecture 7: Dynamic sampling Dimension Reduction
Near(est) Neighbor in High Dimensions
Data-Dependent Hashing for Nearest Neighbor Search
Near-Optimal (Euclidean) Metric Compression
Design of Hierarchical Classifiers for Efficient and Accurate Pattern Classification M N S S K Pavan Kumar Advisor : Dr. C. V. Jawahar.
Locality Sensitive Hashing
Fast Nearest Neighbor Search in the Hamming Space
Overcoming the L1 Non-Embeddability Barrier
Sampling in Graphs: node sparsifiers
Streaming Symmetric Norms via Measure Concentration
Lecture 15: Least Square Regression Metric Embeddings
President’s Day Lecture: Advanced Nearest Neighbor Search
Presentation transcript:

Spectral Approaches to Nearest Neighbor Search arXiv:1408.0751 Robert Krauthgamer (Weizmann Institute) Joint with: Amirali Abdullah, Alexandr Andoni, Ravi Kannan. Les Houches, January 2015

Nearest Neighbor Search (NNS)

Motivation
- Generic setup:
  - points model objects (e.g. images)
  - distance models a (dis)similarity measure
- Application areas: machine learning (the k-NN rule); signal processing, vector quantization, bioinformatics, etc.
- Distance can be: Hamming, Euclidean, edit distance, earth-mover distance, ...

Curse of Dimensionality

  Algorithm                    Query time    Space
  Full indexing                O(d log n)    n^O(d)
  No indexing (linear scan)    O(nd)         O(nd)
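The no-indexing row is just the trivial baseline; a minimal numpy sketch of it (illustrative, not from the talk):

```python
import numpy as np

def linear_scan_nn(P: np.ndarray, q: np.ndarray) -> int:
    """Return the index of the nearest neighbor of q in P.

    P has shape (n, d); the scan touches every coordinate once,
    hence O(nd) query time with only the O(nd) raw data stored.
    """
    dists = np.linalg.norm(P - q, axis=1)  # n Euclidean distances
    return int(np.argmin(dists))
```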

Approximate NNS
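For reference, the standard c-approximate relaxation used throughout this line of work: given a dataset \(P\) of \(n\) points and an approximation factor \(c \ge 1\),

\[
\text{on query } q, \text{ return any } p \in P \text{ with } \|q - p\| \le c \cdot \min_{p^\ast \in P} \|q - p^\ast\|,
\]

which for \(c = 1\) recovers the exact problem.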

NNS algorithms
It's all about space partitions!
- Low-dimensional: [Arya-Mount'93], [Clarkson'94], [Arya-Mount-Netanyahu-Silverman-Wu'98], [Kleinberg'97], [Har-Peled'02], [Arya-Fonseca-Mount'11], ...
- High-dimensional: [Indyk-Motwani'98], [Kushilevitz-Ostrovsky-Rabani'98], [Indyk'98, '01], [Gionis-Indyk-Motwani'99], [Charikar'02], [Datar-Immorlica-Indyk-Mirrokni'04], [Chakrabarti-Regev'04], [Panigrahy'06], [Ailon-Chazelle'06], [Andoni-Indyk'06], [Andoni-Indyk-Nguyen-Razenshteyn'14], [Andoni-Razenshteyn'15]
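For a concrete (if simplistic) picture of a high-dimensional space partition, here is a minimal sketch of hyperplane hashing in the spirit of [Charikar'02] cited above; all parameters are illustrative placeholders.

```python
import numpy as np

def hyperplane_code(p: np.ndarray, H: np.ndarray) -> tuple:
    """Hash p by the sign pattern of m random hyperplanes (rows of H).

    The sign pattern indexes one cell of a random partition of R^d;
    points in the same cell are candidate near neighbors.
    """
    return tuple((H @ p > 0).astype(int))

rng = np.random.default_rng(0)
d, m = 128, 16                      # dimension, hyperplanes per table
H = rng.standard_normal((m, d))     # one random partition of R^d
P = rng.standard_normal((1000, d))
buckets: dict = {}
for i, p in enumerate(P):
    buckets.setdefault(hyperplane_code(p, H), []).append(i)
# a query scans only its own bucket instead of all of P
```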

Low-dimensional

High-dimensional

Practice
- Data-aware partitions: optimize the partition to your dataset
  - PCA-tree [Sproull'91, McNames'01, Verma-Kpotufe-Dasgupta'09]
  - randomized kd-trees [Silpa-Anan-Hartley'08, Muja-Lowe'09]
  - spectral/PCA/semantic/WTA hashing [Weiss-Torralba-Fergus'08, Wang-Kumar-Chang'09, Salakhutdinov-Hinton'09, Yagnik-Strelow-Ross-Lin'11]

Practice vs Theory
- Data-aware projections often outperform (vanilla) random-projection methods
- But there are no guarantees (correctness or performance)
- JL is generally optimal [Alon'03, Jayram-Woodruff'11]
  - even for some NNS setups! [Andoni-Indyk-Patrascu'06]

Why do data-aware projections outperform random projections?
Is there an algorithmic framework to study this phenomenon?
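A toy numpy experiment (illustrative, not from the talk) showing the gap on data with low-dimensional structure: the data-aware direction, the top principal component, captures far more of the variance than a random direction.

```python
import numpy as np

rng = np.random.default_rng(1)
n, d, k = 2000, 100, 5
signal = rng.standard_normal((n, k)) @ rng.standard_normal((k, d))  # rank-k signal
X = signal + 0.1 * rng.standard_normal((n, d))                      # small noise
X -= X.mean(axis=0)

# Data-aware direction: top right singular vector (top PCA direction).
_, _, Vt = np.linalg.svd(X, full_matrices=False)
v_pca = Vt[0]
v_rand = rng.standard_normal(d)
v_rand /= np.linalg.norm(v_rand)

print("variance along PCA direction:   ", np.var(X @ v_pca))
print("variance along random direction:", np.var(X @ v_rand))
```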

Plan for the rest
- Model
- Two spectral algorithms
- Conclusion

Our model
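In rough form (a hedged sketch; the precise parameters are in arXiv:1408.0751), each observed point is a low-dimensional signal plus Gaussian noise:

\[
\tilde{p}_i = p_i + t_i, \qquad p_i \in U, \quad \dim U = k \ll d, \qquad t_i \sim \mathcal{N}(0, \sigma^2 I_d),
\]

with the query perturbed the same way, \(\tilde{q} = q + t_q\); the goal is to answer nearest neighbor queries with respect to the underlying signals \(p_i\).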

Model properties

Algorithms via PCA


PCA under noise fails

1st Algorithm: intuition
- Extract "well-captured points"
  - points with signal mostly inside the top PCA space
  - should work for a large fraction of points
- Iterate on the rest

Iterative PCA
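A minimal sketch of the iterative scheme just described, under the model above; the capture threshold alpha is an illustrative placeholder, not the paper's setting.

```python
import numpy as np

def top_k_subspace(X: np.ndarray, k: int) -> np.ndarray:
    """Rows form an orthonormal basis of the top-k PCA subspace of X."""
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    return Vt[:k]

def iterative_pca(P: np.ndarray, k: int, alpha: float = 0.9):
    """Peel off "well-captured" points round by round.

    A point is well-captured if projecting it onto the current top-k PCA
    subspace retains at least an alpha fraction of its norm; each captured
    batch is stored with its subspace, and the algorithm iterates on the rest.
    """
    groups = []
    rest = P
    while len(rest) > 0:
        V = top_k_subspace(rest, k)
        norms = np.maximum(np.linalg.norm(rest, axis=1), 1e-12)
        frac = np.linalg.norm(rest @ V.T, axis=1) / norms
        captured = frac >= alpha
        if not captured.any():          # safeguard against an empty round
            captured[:] = True
        groups.append((V, rest[captured]))
        rest = rest[~captured]
    # a query then searches each group inside its k-dim subspace,
    # i.e., a low-dimensional NNS problem per group
    return groups
```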

Simpler model

Analysis of the general model

Closeness of subspaces?

Wedin's sin-theta theorem
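Stated here in its standard form (treat the exact constants as hedged): if \(\tilde{A} = A + E\) and there is a singular-value gap \(\delta \le \sigma_k(A) - \sigma_{k+1}(\tilde{A})\), then the top-k right singular subspaces \(V_k\) of \(A\) and \(\tilde{V}_k\) of \(\tilde{A}\) satisfy

\[
\big\| \sin \Theta(V_k, \tilde{V}_k) \big\| \;\le\; \frac{\|E\|}{\delta},
\]

so a perturbation moves the top PCA subspace only slightly when the spectral gap is large, which is what the captured-points analysis needs.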

Additional issue: conditioning
- After an iteration, the noise is not random anymore!
  - non-captured points might be "biased" by the capturing criterion
- Fix: estimate the top PCA subspace from a small sample of the data
  - this might be needed purely for the analysis
  - but it does not sound like a bad idea in practice either
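A sketch of that fix, with hypothetical names and sample size; the point is that the sample is chosen independently of the points later tested against the subspace.

```python
import numpy as np

def sampled_top_k_subspace(P: np.ndarray, k: int, m: int,
                           rng: np.random.Generator) -> np.ndarray:
    """Estimate the top-k PCA subspace from a random sample of m points.

    Points outside the sample played no role in choosing the subspace,
    so applying the capturing criterion to them is not biased by the
    criterion itself.
    """
    idx = rng.choice(len(P), size=min(m, len(P)), replace=False)
    _, _, Vt = np.linalg.svd(P[idx], full_matrices=False)
    return Vt[:k]
```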

Performance of Iterative PCA

2nd Algorithm: PCA-tree
- Closer to algorithms used in practice

2 algorithmic modifications (both sketched below)
- Centering:
  - need to use centered PCA (subtract the average)
  - otherwise errors from perturbations accumulate
- Sparsification:
  - need to sparsify the set of points in each node of the tree
  - otherwise we can get a "dense" cluster: not enough variance in the signal, lots of noise
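A minimal sketch of one node split with both modifications; the variance threshold is an illustrative placeholder, and the leaf case is a stand-in for the paper's sparsification step.

```python
import numpy as np

def pca_tree_split(P: np.ndarray, min_variance: float = 1e-3):
    """Split a node by the median along its top *centered* PCA direction.

    Centering: PCA is computed on mean-subtracted points, so perturbation
    errors do not accumulate down the tree. Sparsification (stand-in): if
    the top direction carries too little variance -- a "dense" cluster,
    mostly noise -- the node is kept as a leaf instead of being split.
    """
    mu = P.mean(axis=0)
    X = P - mu                                   # centered PCA
    _, S, Vt = np.linalg.svd(X, full_matrices=False)
    if S[0] ** 2 / len(P) < min_variance:        # not enough signal variance
        return None                              # leaf: handle cluster directly
    v = Vt[0]                                    # top principal direction
    proj = X @ v
    t = np.median(proj)
    return mu, v, t, P[proj <= t], P[proj > t]   # recurse on the two halves
```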

Analysis

Wrap-up
- Here:
  - model: "low-dimensional signal + large noise"
  - like NNS in a low-dimensional space, via the "right" adaptation of PCA
- Immediate questions:
  - other, less-structured signal/noise models?
  - algorithms with runtime dependent on the spectrum?
- Broader question: an analysis that explains the empirical success?

Why do data-aware projections outperform random projections?
Is there an algorithmic framework to study this phenomenon?