Published by Riya Gracey. Modified over 2 years ago.

1
Fast Johnson-Lindenstrauss Transform(s)
Nir Ailon; Edo Liberty, Bernard Chazelle
Bertinoro Workshop on Sublinear Algorithms, May 2011

2
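JL – Distribution Version
Find a random mapping $\Phi$ from $\mathbb{R}^n$ to $\mathbb{R}^k$ ($n$ big, $k$ small) such that for every $x \in \mathbb{R}^n$ with $\|x\|_2 = 1$, with probability at least $1 - \exp\{-k\varepsilon^2\}$:
$\|\Phi x\|_2 = 1 \pm O(\varepsilon)$ for small $\varepsilon > 0$
$k$ is tight for this probabilistic guarantee [Jayram, Woodruff 2011]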

3
JL – Metric Embedding Version
If you have $N$ vectors $x_1, \ldots, x_N \in \mathbb{R}^n$: set $k = O(\varepsilon^{-2} \log N)$.
By union bound, for all $i, j$: $\|\Phi x_i - \Phi x_j\| = (1 \pm O(\varepsilon))\,\|x_i - x_j\|$
This is a low-distortion metric embedding.
Target dimension $k$ almost tight [Alon 2003]

4
Solution: Johnson-Lindenstrauss (JL)
$\Phi$ = dense random $k \times n$ matrix
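The dense construction can be sketched in a few lines. This is a minimal illustration, assuming i.i.d. Gaussian entries with variance 1/k (one common choice; the slide only says "dense random matrix"). The names `dense_jl_matrix`, `apply`, and `norm` are hypothetical helpers, not taken from the talk.

```python
import math
import random

def dense_jl_matrix(k, n, rng):
    """k x n matrix with i.i.d. N(0, 1/k) entries (an assumed choice of distribution)."""
    scale = 1.0 / math.sqrt(k)
    return [[rng.gauss(0.0, 1.0) * scale for _ in range(n)] for _ in range(k)]

def apply(matrix, x):
    """Plain matrix-vector product: k * n multiply-adds."""
    return [sum(a * b for a, b in zip(row, x)) for row in matrix]

def norm(v):
    """Euclidean norm."""
    return math.sqrt(sum(a * a for a in v))

rng = random.Random(0)
n, k = 1000, 200

# A random unit vector in R^n.
x = [rng.gauss(0.0, 1.0) for _ in range(n)]
length = norm(x)
x = [a / length for a in x]

phi = dense_jl_matrix(k, n, rng)
y = apply(phi, x)  # y lives in R^k; its norm should be close to 1
```

Both the cost of `apply` and the number of random entries scale with the product of k and n, which is the cost the rest of the talk tries to beat.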

5
So what's the problem?
Running time: on the order of $kn$
Number of random bits: on the order of $kn$
Can we do better?

6
Fast JL [Ailon, Chazelle 2006]
$\Phi$ = Sparse · Hadamard · Diagonal
(the sparse matrix is $k \times n$; the Hadamard matrix is a Fourier transform)
Time = $O(k^3 + n \log n)$, Randomness = $O(k^3 \log n + n)$
Beats the JL $kn$ bound for $\log n < k < n^{1/2}$
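The $n \log n$ term comes from applying the Hadamard matrix with a fast transform rather than as a dense matrix. The sketch below covers only the Hadamard · Diagonal part and shows that it spreads the mass of a single spike evenly over all coordinates. The sparse factor is omitted, and the names `fwht` and `randomized_hadamard` are hypothetical.

```python
import math
import random

def fwht(v):
    """Unnormalized fast Walsh-Hadamard transform; len(v) must be a power of two."""
    v = list(v)
    n = len(v)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    h = 1
    while h < n:
        # Butterfly step: combine blocks of size h into blocks of size 2h.
        for start in range(0, n, 2 * h):
            for j in range(start, start + h):
                a, b = v[j], v[j + h]
                v[j], v[j + h] = a + b, a - b
        h *= 2
    return v

def randomized_hadamard(x, signs):
    """Compute H D x, with H normalized by 1/sqrt(n) and D = diag(signs)."""
    n = len(x)
    scale = 1.0 / math.sqrt(n)
    return [scale * c for c in fwht([s * a for s, a in zip(signs, x)])]

rng = random.Random(1)
n = 1024
signs = [rng.choice((-1.0, 1.0)) for _ in range(n)]

# Worst case for a sparse projection: all the mass on one coordinate.
spike = [1.0] + [0.0] * (n - 1)
flat = randomized_hadamard(spike, signs)  # every entry has magnitude 1/sqrt(n)
```

The transform uses log2(n) passes over the vector, so the whole step costs O(n log n) additions and n random sign bits.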

7
Improvement [Ailon, Liberty 2008]
$O(n \log n)$ time for $k < n^{1/2}$
$O(n)$ random bits

8
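Algorithm (works for $k = O(d^{1/2})$) [Ailon, Liberty 2007]
$\Phi = B \cdot D_1 \cdot H \cdot D_2 \cdot H \cdot D_3 \cdots$
$B$ = $k \times n$ error-correcting code matrix
$D_i$ = random $\pm 1$ diagonal matrix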

9
Assume $D_1 = \mathrm{diag}(\varepsilon_1, \ldots, \varepsilon_d)$.
$B D_1 x = \sum_i \varepsilon_i x_i B^{(i)}$: Rademacher r.v. in $k$ dimensions.
Tail of $Z = \|B D_1 x\|_2$ bounded using Talagrand: $\Pr[|Z - \mu| > t] \le \exp\{-t^2/\sigma^2\}$ (up to absolute constants), where $\sigma = \|B\,\mathrm{diag}(x)\|_2 \le \|B^T\|_{2 \to 4}\,\|x\|_4$ (Cauchy-Schwarz).
$B$ = $k \times n$ error-correcting code matrix:
Each element is $\pm 1/\sqrt{k}$.
Row set is a subset of rows from Hadamard, so $O(n \log n)$ runtime.
Columns are 4-wise independent, so $\|B^T\|_{2 \to 4} = O(1)$.
Best we could hope for: $\|x\|_4 = d^{-1/4} = k^{-1/2}$, which gives $\sigma \le O(1)\,k^{-1/2}$.
$H D_i x$: Rademacher r.v. in $d$ dimensions.
$Z = \|H D_i x\|_4$ bounded using Talagrand: $\Pr[|Z - \mu| > t] \le \exp\{-t^2/\sigma^2\}$ (up to absolute constants), where $\sigma \le \|H\|_{4/3 \to 4}\,\|x\|_4$ (Cauchy-Schwarz).
Use Hausdorff-Young and the assumption on $k$ to make progress at each $i$: $k = d^{1/2 - \delta}$.
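The Cauchy-Schwarz step behind the bound on $\|B\,\mathrm{diag}(x)\|_2$ can be spelled out as follows. This is a sketch of the standard argument, not copied from the slides; $B^{(i)}$ denotes the $i$-th column of $B$, and $\|\cdot\|_2$ on a matrix is read as the operator norm from $\ell_2$ to $\ell_2$.

```latex
\begin{align*}
\|B\,\mathrm{diag}(x)\|_2
  &= \sup_{\|y\|_2 = 1} \|y^T B\,\mathrm{diag}(x)\|_2
   = \sup_{\|y\|_2 = 1} \Big( \sum_i x_i^2 \,(y^T B^{(i)})^2 \Big)^{1/2} \\
  &\le \sup_{\|y\|_2 = 1} \Big( \sum_i x_i^4 \Big)^{1/4}
       \Big( \sum_i (y^T B^{(i)})^4 \Big)^{1/4}
   = \|x\|_4 \,\|B^T\|_{2 \to 4}.
\end{align*}
```

The inequality is Cauchy-Schwarz applied to the sequences $x_i^2$ and $(y^T B^{(i)})^2$, and the last equality uses $\|B^T y\|_4 = \big(\sum_i (y^T B^{(i)})^4\big)^{1/4}$.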

10
In the meantime…
Compressed sensing for sparse signal recovery
Find a $k \times n$ mapping $\Phi$ s.t. the equation $y = \Phi x$ can be efficiently solved exactly for $s$-sparse $x$.
R.I.P. property sufficient (Candès + Tao): $\|\Phi x\| \approx \|x\|$ for $s$-sparse $x$.
You also want $\Phi$ to be efficiently computable for the recovery algorithm.

11
Why J.L. ⇒ R.I.P.
Number of $s$-sparse $x$'s is about $\exp\{s \log n\}$ [Baraniuk].
Therefore $k \approx s \log n / \varepsilon^2$ measurements are enough, using distributional JL, to get $(1+\varepsilon)$-R.I.P. for $s$-sparse vectors.
But fast R.I.P. was known without restriction on $k$:
– Rudelson, Vershynin: Take $k \log^3 n$ randomly chosen rows from the Fourier transform
– No restriction of the form $k < n^{1/2}$
Does R.I.P. ⇒ J.L.?
– That would be a good way to kill the restriction on $k$!

12
Rudelson + Vershynin's R.I.P.
If $\Phi$ is a random choice of $k = s\,t \log^4 n$ rows from the Fourier (Hadamard) matrix, then with constant probability the matrix is $(1/\sqrt{t})$-R.I.P. for $s$-sparse vectors.

13
Rudelson + Vershynin's R.I.P. ⇒ (almost) metric J.L.
Analysis not black box
Had to extend nitty-gritty details

14
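The Transformation
$x \mapsto k^{-1/2}\,\Phi D x$, where $\Phi$ is $k$ random rows of the (unnormalized, $\pm 1$) $n \times n$ Hadamard matrix and $D$ is an $n \times n$ random $\pm 1$ diagonal matrix.
$k = O(\log^4 n \,\log N / \varepsilon^4)$, with no restriction on $k$.
The Analysis
Split $x$ (with $\|x\|_2 = 1$) into:
$x_H$: the $s = \log N / \varepsilon^2$ heaviest coordinates
$x_L$: the $n - s$ lightest coordinates, each bounded by $1/\sqrt{s} = \varepsilon/\sqrt{\log N}$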
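A minimal sketch of this transformation, written against its definition rather than for speed: it builds the Hadamard matrix explicitly (Sylvester construction) and assumes the $k$ rows are sampled without replacement. The names `hadamard` and `subsampled_hadamard_jl` are hypothetical, and a practical version would use a fast transform instead of the dense matrix.

```python
import math
import random

def hadamard(n):
    """Unnormalized n x n Hadamard matrix (Sylvester construction); n a power of two."""
    if n < 1 or n & (n - 1):
        raise ValueError("n must be a power of two")
    h = [[1]]
    while len(h) < n:
        # [[H, H], [H, -H]] doubles the size at each step.
        h = [row + row for row in h] + [row + [-a for a in row] for row in h]
    return h

def subsampled_hadamard_jl(x, k, rng):
    """Return k^{-1/2} * Phi * D * x, with Phi = k distinct random rows of H (assumed)."""
    n = len(x)
    h = hadamard(n)
    signs = [rng.choice((-1, 1)) for _ in range(n)]   # diagonal of D
    dx = [s * a for s, a in zip(signs, x)]
    rows = rng.sample(range(n), k)                    # row set defining Phi
    scale = 1.0 / math.sqrt(k)
    return [scale * sum(c * a for c, a in zip(h[r], dx)) for r in rows]

def norm(v):
    """Euclidean norm."""
    return math.sqrt(sum(a * a for a in v))

rng = random.Random(2)
n, k = 256, 64

# A random unit vector in R^n.
x = [rng.gauss(0.0, 1.0) for _ in range(n)]
length = norm(x)
x = [a / length for a in x]

y = subsampled_hadamard_jl(x, k, rng)      # k-dimensional sketch of x
full = subsampled_hadamard_jl(x, n, rng)   # keeping all n rows preserves the norm exactly
```

With all $n$ rows kept, the map is an orthogonal transform up to row order, which is why the $k^{-1/2}$ scaling is the right normalization for the unnormalized Hadamard matrix.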

15
The Analysis
($x_H$: the $s$ heaviest coordinates of $x$; $x_L$: the $n - s$ lightest.)
$\|k^{-1/2}\,\Phi D x_H\|_2 = \|x_H\|_2\,(1 \pm O(\varepsilon))$ directly from R.V. (Rudelson + Vershynin), since $x_H$ is $s$-sparse.
$\Phi D x_L$: Rademacher r.v.
$Z = \|k^{-1/2}\,\Phi D x_L\|_2$ concentrated using Talagrand with $\sigma = \|k^{-1/2}\,\Phi\,\mathrm{diag}(x_L)\|_{2 \to 2}$.

16
The Analysis
R.V. proved that, with constant probability over $\Phi$, uniformly for all vectors $x$ with $s$ ones and the rest zeros, $\|k^{-1/2}\,\Phi\,\mathrm{diag}(x)\|_{2 \to 2}$ is bounded (this is R.I.P.).
The two parameters that govern the bound are:
1. $\|x\|_\infty = 1$, and
2. For any vector $y$ s.t. $\|y\|_2 = 1$: $\|\mathrm{diag}(x)\,y\|_1 \le T$
What we have is:
1. $\|x_L\|_\infty \le \varepsilon/\sqrt{\log N}$
2. For any vector $y$ s.t. $\|y\|_2 = 1$: $\|\mathrm{diag}(x_L)\,y\|_1 \le 1$
… $= O(\log N / \varepsilon^2)$

17
More...
Krahmer and Ward (2010) prove RIP ⇒ JL black-box!
This fixes the $\varepsilon^{-4}$ problem and replaces it with the correct $\varepsilon^{-2}$!!
Proof technique: Kane, Nelson: sparse JL
Lots of work on derandomization
Can we get rid of polylog $n$? If we go via R.I.P. then we need at least one $\log n$ factor, which JL doesn't seem to need.
