Sketching and Embedding are Equivalent for Norms
Alexandr Andoni (Simons Institute)
Robert Krauthgamer (Weizmann Institute)
Ilya Razenshteyn (CSAIL MIT)

Sketching
- Compress a massive object to a small sketch
- Rich theories: high-dimensional vectors, matrices, graphs
- Similarity search, compressed sensing, numerical linear algebra
- Dimension reduction (Johnson, Lindenstrauss 1984): a random projection onto a low-dimensional subspace preserves distances
- When is sketching possible?
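
As a concrete illustration of the dimension-reduction statement (a minimal numpy sketch, not part of the original slides; all parameter choices are mine), the following projects points onto a random low-dimensional subspace and compares pairwise distances:

```python
import numpy as np
from scipy.spatial.distance import pdist

rng = np.random.default_rng(0)
n, d, k = 100, 1000, 200                    # n points in R^d, target dimension k (arbitrary)

X = rng.normal(size=(n, d))                 # data points (rows)
G = rng.normal(size=(d, k)) / np.sqrt(k)    # random projection matrix
Y = X @ G                                   # sketched points in R^k

ratios = pdist(Y) / pdist(X)                # pairwise Euclidean distance ratios
print(f"distance ratios lie in [{ratios.min():.3f}, {ratios.max():.3f}]")
```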

Similarity search
- Motivation: similarity search
- Model similarity as a metric
- Sketching may speed up computation and allow indexing
- Interesting metrics:
  - Euclidean ℓ_2: d(x, y) = (∑_i |x_i – y_i|^2)^{1/2}
  - Manhattan, Hamming ℓ_1: d(x, y) = ∑_i |x_i – y_i|
  - ℓ_p distances: d(x, y) = (∑_i |x_i – y_i|^p)^{1/p} for p ≥ 1
  - Edit distance, Earth Mover's Distance, etc.
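
For concreteness, a few lines of numpy computing these ℓ_p distances (my own illustration; the example vectors are arbitrary):

```python
import numpy as np

def lp_distance(x, y, p):
    """l_p distance (sum_i |x_i - y_i|^p)^(1/p); p = 1 is Manhattan, p = 2 is Euclidean."""
    return np.sum(np.abs(x - y) ** p) ** (1.0 / p)

x = np.array([1.0, 2.0, 3.0])
y = np.array([2.0, 0.0, 3.0])
print(lp_distance(x, y, 1), lp_distance(x, y, 2))   # 3.0 and ~2.236
```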

Sketching metrics
- Alice and Bob each hold a point from a metric space (say x and y)
- Both send s-bit sketches to Charlie
- For r > 0 and D > 1, distinguish:
  - d(x, y) ≤ r
  - d(x, y) ≥ Dr
- Shared randomness, allow 1% probability of error
- Trade-off between s and D
[Figure: Alice (x) and Bob (y) send sketch(x) and sketch(y) to Charlie, who decides whether d(x, y) ≤ r or d(x, y) ≥ Dr]

Near Neighbor Search via sketches
- Near Neighbor Search (NNS): given an n-point dataset P and a query q within r from some data point, return any data point within Dr from q
- Sketches of size s imply NNS with space n^{O(s)} and a 1-probe query
- Proof idea: amplify the probability of error to 1/n by increasing the sketch size to O(s log n); the sketch of q then determines the answer
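
A hedged illustration of the amplification step (not from the slides; the constants are arbitrary): with O(log n) independent repetitions of a 1%-error sketch combined by majority vote, the combined error drops far below 1/n.

```python
import numpy as np
from scipy.stats import binom

n = 1000                              # dataset size we want to union-bound over
base_error = 0.01                     # error probability of a single s-bit sketch
reps = int(np.ceil(3 * np.log(n)))    # O(log n) independent repetitions

# The combined (majority-vote) sketch errs only if more than half
# of the repetitions err simultaneously.
majority_error = binom.sf(reps / 2, reps, base_error)
print(f"{reps} repetitions: majority error {majority_error:.2e} vs target 1/n = {1/n:.1e}")
```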

Sketching the real line
- Distinguish |x – y| ≤ 1 vs. |x – y| ≥ 1 + ε
- Partition the line into randomly shifted pieces of size w = 1 + ε/2
- Repeat O(1 / ε^2) times
- Overall: D = 1 + ε, s = O(1 / ε^2)
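
A small numpy check of the building block (my own illustration; ε and the number of repetitions are arbitrary): under a uniformly random shift, points at distance ≤ 1 fall into the same piece of width w = 1 + ε/2 with probability about ε/2, while points at distance ≥ 1 + ε never do, so O(1/ε^2) repetitions separate the two cases.

```python
import numpy as np

rng = np.random.default_rng(2)
eps = 0.2
w = 1 + eps / 2                       # piece width
reps = 100_000                        # think of this as O(1/eps^2) repetitions

def same_piece_rate(x, y):
    """Empirical probability that x and y land in the same randomly shifted piece."""
    shifts = rng.uniform(0, w, size=reps)
    return np.mean(np.floor((x - shifts) / w) == np.floor((y - shifts) / w))

print(same_piece_rate(0.0, 1.0))        # ~ (eps/2) / (1 + eps/2), about 0.09 here
print(same_piece_rate(0.0, 1 + eps))    # 0.0, since 1 + eps > w
```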

Sketching ℓ_p for 0 < p ≤ 2
- (Indyk 2000): can reduce sketching of ℓ_p with 0 < p ≤ 2 to sketching reals via random projections
- If (G_1, G_2, …, G_d) are i.i.d. N(0, 1)'s, then ∑_i x_i G_i – ∑_i y_i G_i is distributed as ‖x – y‖_2 · N(0, 1)
- For 0 < p < 2 use p-stable distributions instead
- Again, get D = 1 + ε with s = O(1 / ε^2)
- For p > 2 sketching ℓ_p is hard: to achieve D = O(1) one needs sketch size s = Θ̃(d^{1 – 2/p}) (Bar-Yossef, Jayram, Kumar, Sivakumar 2002), (Indyk, Woodruff 2005)
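
A numpy illustration of the 2-stable projection (the median-based estimator below is a standard choice and an assumption on my part, not spelled out on the slide): each Gaussian projection of x – y is distributed as ‖x – y‖_2 · N(0, 1), so the median of the absolute projections, rescaled by the median of |N(0, 1)| ≈ 0.6745, estimates ‖x – y‖_2.

```python
import numpy as np

rng = np.random.default_rng(3)
d, k = 500, 400                      # dimension and number of projections (arbitrary)

x = rng.normal(size=d)
y = rng.normal(size=d)

G = rng.normal(size=(k, d))          # i.i.d. N(0, 1) projection vectors
proj = G @ (x - y)                   # each coordinate ~ ||x - y||_2 * N(0, 1)

est = np.median(np.abs(proj)) / 0.6745    # 0.6745 ~ median of |N(0, 1)|
print(est, np.linalg.norm(x - y))         # estimate vs. true l_2 distance
```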

Anything else?
- A map f: X → Y is an embedding with distortion D′ if for all a, b from X: d_X(a, b) / D′ ≤ d_Y(f(a), f(b)) ≤ d_X(a, b)
- Embeddings give reductions for geometric problems
- If Y has s-bit sketches for approximation D, then for X one gets s bits and approximation DD′

Metrics with good sketches
- A metric X admits sketches with s, D = O(1) if:
  - X = ℓ_p for p ≤ 2, or
  - X embeds into ℓ_p for p ≤ 2 with distortion O(1)
- Are there any other metrics with efficient sketches (D and s are O(1))?
- We don't know! Some new techniques waiting to be discovered? No new techniques?!

The main result
- If a normed space X admits sketches of size s and approximation D, then for every ε > 0 the space X embeds (linearly) into ℓ_{1–ε} with distortion O(sD / ε)
- Embedding into ℓ_p, p ≤ 2 ⇒ efficient sketches (Kushilevitz, Ostrovsky, Rabani 1998), (Indyk 2000); the main result gives the converse direction for norms
- A vector space X with ‖·‖: X → R_{≥0} is a normed space if:
  - ‖x‖ = 0 iff x = 0
  - ‖αx‖ = |α|·‖x‖
  - ‖x + y‖ ≤ ‖x‖ + ‖y‖
- Every norm gives rise to a metric: define d(x, y) = ‖x – y‖
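
A tiny numerical sanity check of the norm axioms (an illustration only; the ℓ_1 norm on R^d is an arbitrary choice):

```python
import numpy as np

rng = np.random.default_rng(6)
norm = lambda v: np.linalg.norm(v, ord=1)    # any candidate norm

d = 10
for _ in range(1000):
    x, y = rng.normal(size=d), rng.normal(size=d)
    alpha = rng.normal()
    assert abs(norm(alpha * x) - abs(alpha) * norm(x)) < 1e-9   # homogeneity
    assert norm(x + y) <= norm(x) + norm(y) + 1e-9              # triangle inequality
assert norm(np.zeros(d)) == 0                                    # ||0|| = 0
print("norm axioms hold on random samples")
```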

Sanity check
- ℓ_p spaces: p > 2 is hard, 1 ≤ p ≤ 2 is easy, p < 1 is not a norm
- Can classify mixed norms ℓ_p(ℓ_q): in particular, ℓ_1(ℓ_2) is easy, while ℓ_2(ℓ_1) is hard! (Jayram, Woodruff 2009), (Kalton 1985)
- A non-example: edit distance is not a norm, and its sketchability is largely open (Ostrovsky, Rabani 2005), (Andoni, Jayram, Pătraşcu 2010)
[Figure: classification of mixed norms, with axes ℓ_p and ℓ_q]

No embeddings → no sketches
- In the contrapositive: if a normed space does not embed into ℓ_{1–ε}, then it does not have good sketches
- Can convert sophisticated non-embeddability results into lower bounds for sketches

Example 1: the Earth Mover's Distance
- For x: [Δ]×[Δ] → R with ∑_{i,j} x_{i,j} = 0, define the Earth Mover's Distance ‖x‖_EMD as the cost of the best transportation of the positive part of x to the negative part (Monge-Kantorovich norm)
- Best upper bounds:
  - D = O(1 / ε) and s = Δ^ε (Andoni, Do Ba, Indyk, Woodruff 2009)
  - D = O(log Δ) and s = O(1) (Charikar 2002), (Indyk, Thaper 2003), (Naor, Schechtman 2005)
- No embedding into ℓ_{1–ε} with distortion O(1) (Naor, Schechtman 2005)
- ⇒ No sketches with D = O(1) and s = O(1)
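
A minimal sketch of ‖x‖_EMD as a transportation linear program (my own illustration; the ℓ_1 ground distance between grid cells and the use of scipy's LP solver are assumptions, not choices made on the slide):

```python
import numpy as np
from scipy.optimize import linprog

def emd_norm(x):
    """||x||_EMD for x on a Delta x Delta grid with total mass 0:
    minimum-cost transportation of the positive part to the negative part,
    here with l_1 ground distance between grid cells (an assumption)."""
    pos, neg = np.argwhere(x > 0), np.argwhere(x < 0)
    supply, demand = x[x > 0], -x[x < 0]
    # cost[i, j] = ground distance from the i-th source cell to the j-th sink cell
    cost = np.array([[np.abs(p - q).sum() for q in neg] for p in pos], dtype=float)
    m, n = cost.shape
    A_eq, b_eq = [], []
    for i in range(m):                       # sum_j flow[i, j] = supply[i]
        row = np.zeros(m * n); row[i * n:(i + 1) * n] = 1
        A_eq.append(row); b_eq.append(supply[i])
    for j in range(n):                       # sum_i flow[i, j] = demand[j]
        row = np.zeros(m * n); row[j::n] = 1
        A_eq.append(row); b_eq.append(demand[j])
    res = linprog(cost.ravel(), A_eq=np.array(A_eq), b_eq=np.array(b_eq),
                  bounds=(0, None), method="highs")
    return res.fun

x = np.zeros((4, 4))
x[0, 0], x[3, 3] = 1.0, -1.0                 # one unit of mass moved across the grid
print(emd_norm(x))                           # 6.0 = l_1 distance between the two cells
```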

Example 2: the Trace Norm
- For an n × n matrix A, define the Trace Norm (the Nuclear Norm) ‖A‖ to be the sum of its singular values
- Previously: lower bounds only for certain restricted classes of sketches (Li, Nguyen, Woodruff 2014)
- Any embedding into ℓ_1 requires distortion Ω(n^{1/2}) (Pisier 1978)
- ⇒ Any sketch must satisfy sD = Ω(n^{1/2} / log n)
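
For reference, the trace norm in a couple of numpy lines (illustration only):

```python
import numpy as np

rng = np.random.default_rng(4)
A = rng.normal(size=(5, 5))

trace_norm = np.linalg.svd(A, compute_uv=False).sum()   # sum of singular values
print(trace_norm, np.linalg.norm(A, ord="nuc"))          # same value via numpy's built-in nuclear norm
```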

The plan of the proof
- Theorem: if a normed space X admits sketches of size s and approximation D, then for every ε > 0 the space X embeds (linearly) into ℓ_{1–ε} with distortion O(sD / ε)
- Sketches ⟶ (information theory) ⟶ weak embedding into ℓ_2 ⟶ (nonlinear functional analysis) ⟶ linear embedding into ℓ_{1–ε}
- A map f: X → Y is (s_1, s_2, τ_1, τ_2)-threshold if:
  - d_X(x_1, x_2) ≤ s_1 implies d_Y(f(x_1), f(x_2)) ≤ τ_1
  - d_X(x_1, x_2) ≥ s_2 implies d_Y(f(x_1), f(x_2)) ≥ τ_2
- Here the "weak embedding" is a (1, O(sD), 1, 10)-threshold map from X to ℓ_2

Sketch → Threshold map
- Goal: if X has a sketch of size s and approximation D, then there is a (1, O(sD), 1, 10)-threshold map from X to ℓ_2
- Proof in the contrapositive: suppose there is no (1, O(sD), 1, 10)-threshold map from X to ℓ_2
  - ⇒ (convex duality) Poincaré-type inequalities on X
  - ⇒ ℓ^k_∞(X) has no sketches of size Ω(k) and approximation Θ(sD) (Andoni, Jayram, Pătraşcu 2010: direct sum theorem for information complexity)
  - ⇒ X has no sketches of size s and approximation D (via the direct-sum sketches on the next slide)
- Here ℓ^k_∞(X) is X^k with the norm ‖(x_1, …, x_k)‖ = max_i ‖x_i‖

Sketching direct sums
- Claim: if X has sketches of size s and approximation D, then ℓ^k_∞(X) has sketches of size O(s) and approximation Dk
- Protocol: using shared randomness, pick signs (σ_1, σ_2, …, σ_k), each ±1 with probability 1/2; Alice, holding (a_1, a_2, …, a_k), sends sketch(∑_i σ_i a_i), and Bob, holding (b_1, b_2, …, b_k), sends sketch(∑_i σ_i b_i)
- Key inequalities: max_i ‖a_i – b_i‖ ≤ ‖∑_i σ_i (a_i – b_i)‖ with probability ≥ 1/2 over the signs, and always ‖∑_i σ_i (a_i – b_i)‖ ≤ ∑_i ‖a_i – b_i‖ ≤ k · max_i ‖a_i – b_i‖
- Crucially uses the linear structure of X (not enough to be just a metric!)
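
A quick numpy check of these inequalities (my own illustration; the ℓ_2 norm and all parameters are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(5)
k, d, trials = 8, 20, 10_000

a = rng.normal(size=(k, d))
b = rng.normal(size=(k, d))
diff = a - b
max_norm = np.linalg.norm(diff, axis=1).max()         # max_i ||a_i - b_i||
sum_norm = np.linalg.norm(diff, axis=1).sum()         # sum_i ||a_i - b_i||

sigmas = rng.choice([-1.0, 1.0], size=(trials, k))    # random sign vectors
combined = np.linalg.norm(sigmas @ diff, axis=1)       # ||sum_i sigma_i (a_i - b_i)|| per trial

print("upper bound always holds:", np.all(combined <= sum_norm + 1e-9))
print("fraction of sign choices with combined >= max_i:", np.mean(combined >= max_norm))
```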

Threshold map → linear embedding
- Chain: (1, O(sD), 1, 10)-threshold map from X to ℓ_2 ⟶ (?) ⟶ uniform embedding into ℓ_2 ⟶ (Aharoni, Maurey, Mityagin 1985), (Nikishin 1973) ⟶ linear embedding into ℓ_{1–ε} with distortion O(sD / ε)
- Uniform embedding into ℓ_2: a map g: X → ℓ_2 such that L(‖x_1 – x_2‖) ≤ ‖g(x_1) – g(x_2)‖ ≤ U(‖x_1 – x_2‖), where L and U are non-decreasing, L(t) > 0 for t > 0, and U(t) → 0 as t → 0

Threshold map → uniform embedding
- Start from a map f: X → ℓ_2 such that:
  - ‖x_1 – x_2‖ ≤ 1 implies ‖f(x_1) – f(x_2)‖ ≤ 1
  - ‖x_1 – x_2‖ ≥ Θ(sD) implies ‖f(x_1) – f(x_2)‖ ≥ 10
- Building on (Johnson, Randrianarivony 2006): take a 1-net N of X; f is Lipschitz on N; extend f from N to a Lipschitz function on the whole of X

Open problems
- Extend to as general a class of metrics as possible
- Connection to linear sketches, i.e., sketches of the form x → Ax? Conjecture: sketches of size s and approximation D can be converted to linear sketches with f(s) measurements and approximation g(D)
- Spaces that admit no non-trivial sketches (s = Ω(d) for D = O(1)): is there anything besides ℓ_∞?
- Can one strengthen our theorem to "sketchability implies embeddability into ℓ_1"? Equivalent to an old open problem from functional analysis
- Sketches imply NNS; is there a reverse implication?