
1
Fast Johnson-Lindenstrauss Transform(s)
Nir Ailon (joint work with Edo Liberty and Bernard Chazelle)
Bertinoro Workshop on Sublinear Algorithms, May 2011

2
JL – Distribution Version. Find a random mapping $\Phi$ from $\mathbb{R}^n$ to $\mathbb{R}^k$ ($n$ big, $k$ small) such that for every $x \in \mathbb{R}^n$ with $\|x\|_2 = 1$:

with probability $1 - \exp\{-k\varepsilon^2\}$, $\ \|\Phi x\|_2 = 1 \pm O(\varepsilon)$ $\ (0 < \varepsilon < 1/2)$.

This $k$ is tight for this probabilistic guarantee [Jayram, Woodruff 2011].
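A quick numerical sanity check of the statement (a minimal NumPy sketch; the Gaussian matrix, the sizes, and $\varepsilon$ are illustrative choices, and the bound's hidden constants are ignored):

```python
import numpy as np

rng = np.random.default_rng(0)
n, k, eps, trials = 500, 64, 0.2, 2000

x = rng.standard_normal(n)
x /= np.linalg.norm(x)                    # any fixed unit vector in R^n

fails = 0
for _ in range(trials):
    Phi = rng.standard_normal((k, n)) / np.sqrt(k)   # fresh random map each trial
    if abs(np.linalg.norm(Phi @ x) - 1) > eps:       # | ||Phi x||_2 - 1 | > eps
        fails += 1

print(f"empirical failure rate: {fails / trials:.3f}")
print(f"exp(-k eps^2) = {np.exp(-k * eps**2):.3f}  (bound, up to constants)")
```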

3
JL – Metric Embedding Version. If you have $N$ vectors $x_1, \dots, x_N \in \mathbb{R}^n$: set $k = O(\varepsilon^{-2}\log N)$; then by a union bound, for all $i, j$:

$\|\Phi x_i - \Phi x_j\|_2 = (1 \pm \varepsilon)\,\|x_i - x_j\|_2$,

a low-distortion metric embedding. The target dimension $k$ is almost tight [Alon 2003].

4
Solution: Johnson-Lindenstrauss (JL). Take $\Phi$ to be a dense random $k \times n$ matrix.
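A minimal sketch of the dense construction on $N$ points, checking the union-bound guarantee from the previous slide (the Gaussian entries and the constant 8 in $k$ are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)
n, N, eps = 1000, 50, 0.5
k = int(np.ceil(8 * np.log(N) / eps**2))   # k = O(eps^-2 log N); the 8 is a guess

X = rng.standard_normal((N, n))                  # N arbitrary points in R^n
Phi = rng.standard_normal((k, n)) / np.sqrt(k)   # dense k x n random matrix
Y = X @ Phi.T                                    # all N points embedded in R^k

worst = 0.0                                      # distortion over all N^2 pairs
for i in range(N):
    for j in range(i + 1, N):
        ratio = np.linalg.norm(Y[i] - Y[j]) / np.linalg.norm(X[i] - X[j])
        worst = max(worst, abs(ratio - 1))
print(f"k = {k}, worst pairwise distortion = {worst:.3f} (target eps = {eps})")
```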

5
So what's the problem? Running time $\Theta(kn)$. Number of random bits $\Theta(kn)$. Can we do better?

6
Fast JL [Ailon, Chazelle 2006]: $\Phi = S \cdot H \cdot D$ = Sparse $\cdot$ Hadamard (a Fourier-type transform) $\cdot$ Diagonal.

Time $= O(k^3 + n \log n)$, randomness $= O(k^3 \log n + n)$.

Beats the JL $\Theta(kn)$ bound for $\log n < k < n^{1/2}$.
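A sketch of the $S \cdot H \cdot D$ pipeline (the sparsity rate $q$ and the normalizations are illustrative assumptions, not the exact parameters of the paper):

```python
import numpy as np

def fwht(x):
    """Unnormalized fast Walsh-Hadamard transform, O(n log n); len(x) a power of 2."""
    x = x.copy()
    h = 1
    while h < len(x):
        for i in range(0, len(x), 2 * h):
            a, b = x[i:i + h].copy(), x[i + h:i + 2 * h].copy()
            x[i:i + h], x[i + h:i + 2 * h] = a + b, a - b
        h *= 2
    return x

def fjlt(x, k, rng):
    """x -> S H D x: sign flip, fast Hadamard mixing, then a sparse projection."""
    n = len(x)
    D = rng.choice([-1.0, 1.0], n)              # D: random +-1 diagonal
    z = fwht(D * x) / np.sqrt(n)                # H D x, normalized (an isometry)
    q = min(1.0, np.log(n) ** 2 / n)            # sparsity rate (illustrative; the
                                                # paper tunes q to k and eps)
    mask = rng.binomial(1, q, (k, n))           # S: each entry present w.p. q,
    S = mask * rng.standard_normal((k, n)) / np.sqrt(q * k)   # E||Sz||^2 = ||z||^2
    return S @ z

rng = np.random.default_rng(0)
x = rng.standard_normal(1024); x /= np.linalg.norm(x)
print(np.linalg.norm(fjlt(x, 64, rng)))         # should be close to 1
```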

7
Improvement [Ailon, Liberty 2008]: $O(n \log n)$ time for $k < n^{1/2}$, using $O(n)$ random bits.

8
Algorithm (works for $k = O(d^{1/2})$) [Ailon, Liberty 2007]: $\Phi = B \cdot D_1 \cdot H \cdot D_2 \cdot H \cdot D_3 \cdots$

Here $B$ is a $k \times n$ error-correcting-code matrix and each $D_i$ is a random $\pm 1$ diagonal matrix.
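A runnable stand-in for the construction (with the loud caveat that $B$ here is just $k$ sampled Hadamard rows scaled by $1/\sqrt{k}$, substituting for the paper's error-correcting-code matrix with 4-wise independent columns):

```python
import numpy as np
from scipy.linalg import hadamard

def ailon_liberty_sketch(x, k, rng, rounds=3):
    """x -> B D1 H D2 H D3 ... x (applied right to left); len(x) a power of 2."""
    n = x.size
    H = hadamard(n)                            # unnormalized +-1 Hadamard matrix
    for _ in range(rounds - 1):                # the inner H D_i mixing rounds
        D = rng.choice([-1.0, 1.0], n)
        x = (H @ (D * x)) / np.sqrt(n)         # normalized: each round is an isometry
    x = rng.choice([-1.0, 1.0], n) * x         # the outermost diagonal D_1
    rows = rng.choice(n, size=k, replace=False)
    return H[rows] @ x / np.sqrt(k)            # "B": entries +-1/sqrt(k), Hadamard rows

rng = np.random.default_rng(0)
x = rng.standard_normal(1024); x /= np.linalg.norm(x)
print(np.linalg.norm(ailon_liberty_sketch(x, 32, rng)))   # close to 1 for k = O(n^1/2)
```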

9
Analysis [Ailon, Liberty 2007] (recall $\Phi = B \cdot D_1 \cdot H \cdot D_2 \cdot H \cdot D_3 \cdots$, $B$ a $k \times n$ error-correcting-code matrix):

Assume $D_1 = \mathrm{diag}(\delta_1, \dots, \delta_d)$ with $\delta_i$ random signs. Then $BD_1x = \sum_i \delta_i x_i B^{(i)}$ is a Rademacher random vector in $k$ dimensions. The tail of $Z = \|BD_1x\|_2$ is bounded using Talagrand: $\Pr[|Z - \mu| > t] \le \exp\{-t^2/2\sigma^2\}$, where $\sigma = \|B\,\mathrm{diag}(x)\|_{2\to 2} \le \|B^t\|_{2\to 4}\,\|x\|_4$ (Cauchy-Schwarz).

Properties of $B$: each element is $\pm 1/\sqrt{k}$; its row set is a subset of the rows of a Hadamard matrix ($O(n \log n)$ runtime); its columns are 4-wise independent, which gives $\|B^t\|_{2\to 4} = O(1)$. The best we could hope for is $\|x\|_4 = d^{-1/4} = k^{-1/2}$, i.e. $\sigma = O(1)\,k^{-1/2}$.

For the inner rounds: $HD_ix$ is a Rademacher r.v. in $d$ dimensions, and $Z = \|HD_ix\|_4$ is bounded using Talagrand as well: $\Pr[|Z - \mu| > t] \le \exp\{-t^2/2\sigma^2\}$, with $\sigma \le \|H\|_{4/3\to 4}\,\|x\|_4$ (Cauchy-Schwarz). Use Hausdorff-Young and the assumption on $k$ to make progress at each $i$; this works up to $k = d^{1/2-\delta}$.
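Spelling out the Cauchy-Schwarz step behind the first $\sigma$ bound (a worked derivation in the notation above, added for completeness; it is not verbatim from the slide):

```latex
\sigma = \|B\,\mathrm{diag}(x)\|_{2\to 2}
       = \|\mathrm{diag}(x)\,B^{t}\|_{2\to 2}
       = \sup_{\|y\|_2 = 1}\Big(\sum_i x_i^2\,(B^{t}y)_i^2\Big)^{1/2}
       \le \|x\|_4 \sup_{\|y\|_2 = 1}\|B^{t}y\|_4
       = \|B^{t}\|_{2\to 4}\,\|x\|_4 .
```

With $\|B^t\|_{2\to 4} = O(1)$ and a fully mixed input ($\|x\|_4 = d^{-1/4} = k^{-1/2}$), this gives $\sigma = O(k^{-1/2})$: a sub-Gaussian tail at exactly the JL scale, which is what the inner $H D_i$ rounds are there to guarantee.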

10
In the meantime… compressed sensing for sparse signal recovery. Find a $k \times n$ mapping $\Phi$ such that the equation $y = \Phi x$ can be efficiently solved exactly for $s$-sparse $x$. The R.I.P. property is sufficient (Candès + Tao):

$\|\Phi x\|_2 = (1 \pm \varepsilon)\,\|x\|_2$ for all $s$-sparse $x$.

You also want $\Phi$ to be efficiently computable, for the recovery algorithm.
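A minimal end-to-end sketch: measure an $s$-sparse $x$ with a fast-JL-style $\Phi$ and recover it by $\ell_1$ minimization (basis pursuit as a linear program; all sizes are illustrative, and scipy's generic LP solver stands in for a real recovery algorithm):

```python
import numpy as np
from scipy.linalg import hadamard
from scipy.optimize import linprog

rng = np.random.default_rng(1)
n, k, s = 128, 40, 4

# k random rows of the normalized Hadamard matrix as the measurement map
Phi = hadamard(n)[rng.choice(n, k, replace=False)] / np.sqrt(k)

x = np.zeros(n)
x[rng.choice(n, s, replace=False)] = rng.standard_normal(s)   # s-sparse signal
y = Phi @ x                                                   # k < n measurements

# basis pursuit: min ||x||_1 s.t. Phi x = y, via the split x = u - v, u, v >= 0
c = np.ones(2 * n)
res = linprog(c, A_eq=np.hstack([Phi, -Phi]), b_eq=y,
              bounds=[(0, None)] * (2 * n))
x_hat = res.x[:n] - res.x[n:]
print("recovery error:", np.linalg.norm(x_hat - x))           # ~0 when RIP holds
```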

11
Why J.L. $\Rightarrow$ R.I.P.: the number of (essentially different) $s$-sparse $x$'s is $\exp\{O(s \log n)\}$ [Baraniuk et al.]. Therefore $k \approx s \log n / \varepsilon^2$ measurements are enough, using distributional JL plus a union bound, to get a $(1+\varepsilon)$-R.I.P. for $s$-sparse vectors.

But fast R.I.P. was known without restriction on $k$:
– Rudelson, Vershynin: take $k \log^3 n$ randomly chosen rows from the Fourier transform.
– No restriction of the form $k < n^{1/2}$.

Does R.I.P. $\Rightarrow$ J.L.?
– That would be a good way to kill the restriction on $k$!
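The counting behind the first claim, made explicit (a standard net argument; the constants are generic, not from the slide):

```latex
\#\{\text{supports}\} = \binom{n}{s} \le \Big(\frac{en}{s}\Big)^{s},
\qquad
\#\{\tfrac{1}{4}\text{-net points per support}\} \le 12^{s},
```

so a net over all $s$-sparse unit vectors has size $\exp\{O(s \log n)\}$, and distributional JL with failure probability $\exp\{-\Omega(k\varepsilon^2)\}$ survives a union bound over the whole net once $k = \Omega(s \log n / \varepsilon^2)$.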

12
Rudelson + Vershynin's R.I.P.: if $\Phi$ is a random choice of $k = s\,t\log^4 n$ rows from the Fourier (Hadamard) matrix, then with constant probability the matrix $\Phi$ is $(1/\sqrt{t})$-R.I.P. for $s$-sparse vectors.
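A small empirical probe of this statement (a heuristic check over randomly drawn supports, not a certificate: true R.I.P. is a supremum over all $\binom{n}{s}$ supports; sizes are illustrative):

```python
import numpy as np
from scipy.linalg import hadamard

rng = np.random.default_rng(2)
n, s, k = 256, 4, 120                      # k = s t log^4 n is huge at this scale,
                                           # so we just pick a moderate k
Phi = hadamard(n)[rng.choice(n, k, replace=False)] / np.sqrt(k)

worst = 0.0
for _ in range(2000):
    x = np.zeros(n)
    x[rng.choice(n, s, replace=False)] = rng.standard_normal(s)
    x /= np.linalg.norm(x)                 # random s-sparse unit vector
    worst = max(worst, abs(np.linalg.norm(Phi @ x) - 1))
print(f"empirical R.I.P. distortion over sampled supports: {worst:.3f}")
```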

13
Rudelson + Vershynin's R.I.P. $\Rightarrow$ (almost) metric J.L. But the analysis is not black box: we had to extend nitty-gritty details of their proof.

14
The Transformation: $x \mapsto k^{-1/2} A D x$, where $D$ is an $n \times n$ random $\pm 1$ diagonal and $A$ is $k$ rows of the unnormalized Hadamard matrix; $k = O(\log^4 n \cdot \log N / \varepsilon^4)$, with no restriction on $k$.

The Analysis: split $x$ into $x_H$, its $s = \log N/\varepsilon^2$ heaviest coordinates, and $x_L$, its $n - s$ lightest coordinates, each of magnitude at most $1/\sqrt{s} = \varepsilon/\sqrt{\log N}$.

15
The Analysis, continued (same transformation and split as above):

$\|k^{-1/2} A D x_H\|_2 = \|x_H\|_2\,(1 \pm O(\varepsilon))$ directly from R.V.'s R.I.P., since $x_H$ is $s$-sparse.

$A D x_L$ is a Rademacher random vector, so $Z = \|k^{-1/2} A D x_L\|_2$ is concentrated using Talagrand, with $\sigma = \|k^{-1/2} A\,\mathrm{diag}(x_L)\|_{2\to 2}$.

16
The Analysis, concluded. R.V. proved: with constant probability over the row sample $A$, uniformly for all vectors $x$ with $s$ ones and the rest zeros, $\|k^{-1/2} A\,\mathrm{diag}(x)\|_{2\to 2}$ is bounded (this is R.I.P.). The two parameters that govern the bound are:

1. $\|x\|_\infty = 1$, and
2. for any vector $y$ with $\|y\|_2 = 1$: $\|\mathrm{diag}(x)\,y\|_1 \le \sqrt{s}$.

What we have for $x_L$ is:

1. $\|x_L\|_\infty \le \varepsilon/\sqrt{\log N}$, and
2. for any vector $y$ with $\|y\|_2 = 1$: $\|\mathrm{diag}(x_L)\,y\|_1 \le 1$.

After rescaling by $\sqrt{s}$, $x_L$ matches the R.V. parameters of an $s$-sparse vector with $s = O(\log N/\varepsilon^2)$, so the same concentration bound applies.
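A numeric sketch of the split and the transform (sizes illustrative; the assertion checks the flatness bound on the light tail claimed above):

```python
import numpy as np
from scipy.linalg import hadamard

rng = np.random.default_rng(3)
n, N, eps, k = 1024, 100, 0.5, 200
s = int(np.ceil(np.log(N) / eps**2))       # number of "heavy" coordinates

x = rng.standard_normal(n); x /= np.linalg.norm(x)
heavy = np.argsort(-np.abs(x))[:s]
xH = np.zeros(n); xH[heavy] = x[heavy]     # s heaviest coordinates
xL = x - xH                                # the light tail
assert np.abs(xL).max() <= 1 / np.sqrt(s) + 1e-12   # light part is flat

D = rng.choice([-1.0, 1.0], n)             # random sign diagonal
A = hadamard(n)[rng.choice(n, k, replace=False)]    # k unnormalized Hadamard rows
for part, name in [(xH, "heavy"), (xL, "light"), (x, "full")]:
    z = A @ (D * part) / np.sqrt(k)        # k^{-1/2} A D applied to each part
    print(f"{name}: transformed norm {np.linalg.norm(z):.3f}, "
          f"original norm {np.linalg.norm(part):.3f}")
```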

17
More…
– Krahmer and Ward (2010) prove R.I.P. $\Rightarrow$ J.L., black-box! This fixes the $\varepsilon^{-4}$ problem and replaces it with the correct $\varepsilon^{-2}$!!
– Proof technique: Kane, Nelson: sparse JL.
– Lots of work on derandomization.
– Can we get rid of the polylog $n$? If we go via R.I.P. then we need at least one $\log n$ factor, which JL doesn't seem to need.
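The Krahmer-Ward recipe itself is one line: take any R.I.P. matrix and precompose it with a fresh random sign diagonal (a minimal sketch; random Hadamard rows stand in for the R.I.P. matrix):

```python
import numpy as np
from scipy.linalg import hadamard

rng = np.random.default_rng(4)
n, k = 256, 100

A = hadamard(n)[rng.choice(n, k, replace=False)] / np.sqrt(k)  # any R.I.P. matrix
D = rng.choice([-1.0, 1.0], n)                                 # fresh random signs
# Krahmer-Ward: x -> A (D * x) satisfies distributional JL, black-box from R.I.P.
x = rng.standard_normal(n); x /= np.linalg.norm(x)
print(np.linalg.norm(A @ (D * x)))                             # close to 1
```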
