Dimension Reduction using Rademacher Series on Dual BCH Codes. Nir Ailon, Edo Liberty.


1 Dimension Reduction using Rademacher Series on Dual BCH Codes
Nir Ailon, Edo Liberty

2 Dimension Reduction
Algorithmic technique:
–often as “black box”
Removes redundancy from data
A sampling method
Measure concentration phenomenon
(Existential) metric embedding
Recently: computational aspects
–time
–randomness

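3 Challenge
Find a random projection $\Phi$ from $\mathbb{R}^d$ to $\mathbb{R}^k$ ($d$ big, $k$ small) such that for every $x \in \mathbb{R}^d$ with $\|x\|_2 = 1$ and every $\varepsilon > 0$, except with probability $\exp\{-k\varepsilon^2\}$:
$\|\Phi x\|_2 = 1 \pm O(\varepsilon)$
[Diagram: $\Phi$ maps $x \in \mathbb{R}^d$ to $\Phi x \in \mathbb{R}^k$]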

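4 Usage
If you have $n$ vectors $x_1, \dots, x_n \in \mathbb{R}^d$: set $k = O(\varepsilon^{-2} \log n)$
By union bound: for all $i, j$, $\|\Phi x_i - \Phi x_j\|_2 \approx_{\varepsilon} \|x_i - x_j\|_2$
Low-distortion metric embedding
“Tight”
[Diagram: $\Phi$ maps $x_1, x_2, x_3 \in \mathbb{R}^d$ to $\Phi x_1, \Phi x_2, \Phi x_3 \in \mathbb{R}^k$]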
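The union-bound step can be spelled out as follows. This is a sketch, assuming each fixed pair fails with probability at most $e^{-ck\varepsilon^2}$; the absolute constants $c$ and $C$ are placeholders that the slide does not specify.

```latex
\begin{align*}
\Pr\Bigl[\exists\, i<j:\ \bigl|\,\|\Phi(x_i - x_j)\|_2 - \|x_i - x_j\|_2\,\bigr|
        > C\varepsilon\,\|x_i - x_j\|_2\Bigr]
  &\le \binom{n}{2}\, e^{-c k \varepsilon^2} \\
  &< \tfrac{1}{2}\, n^2\, e^{-c k \varepsilon^2} \\
  &\le \tfrac{1}{2}
  \qquad \text{whenever } k \ge \frac{2 \ln n}{c\,\varepsilon^2} = O(\varepsilon^{-2} \log n).
\end{align*}
```

The first line uses linearity, $\Phi x_i - \Phi x_j = \Phi(x_i - x_j)$, so the single-vector guarantee applies to each normalized difference.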

5 Solution: Johnson-Lindenstrauss (JL)
$\Phi$ = “dense random matrix” of size $k \times d$
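A minimal Python sketch of the dense approach, assuming i.i.d. random sign entries scaled by $k^{-1/2}$ (one common choice of dense random matrix; the slide does not fix the distribution). The helper names and the sizes `d = 64`, `k = 16` are illustrative only.

```python
import math
import random

def dense_sign_projection(k, d, rng):
    """Return a k x d matrix with i.i.d. entries +-1/sqrt(k) (uses k*d random bits)."""
    scale = 1.0 / math.sqrt(k)
    return [[rng.choice((-scale, scale)) for _ in range(d)] for _ in range(k)]

def apply_matrix(matrix, x):
    """Dense matrix-vector product; costs k*d multiplications."""
    return [sum(a * b for a, b in zip(row, x)) for row in matrix]

def l2_norm(v):
    return math.sqrt(sum(t * t for t in v))

rng = random.Random(0)
d, k = 64, 16

# A fixed unit vector x in R^d
x = [rng.gauss(0.0, 1.0) for _ in range(d)]
norm_x = l2_norm(x)
x = [t / norm_x for t in x]

# Average ||Phi x||^2 over independent draws of Phi; its expectation is ||x||^2 = 1
trials = 200
mean_sq = sum(
    l2_norm(apply_matrix(dense_sign_projection(k, d, rng), x)) ** 2
    for _ in range(trials)
) / trials
```

Both the running time and the number of random bits of one application scale with the product `k * d`.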

6 So what’s the problem?
Running time $\Omega(kd)$
Number of random bits $\Omega(kd)$
Can we do better?

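7 Fast JL
A, Chazelle 2006
$\Phi$ = Sparse $\cdot$ Hadamard $\cdot$ Diagonal
time = $O(k^3 + d \log d)$, randomness = $O(k^3 + d)$
Beats the JL $\Omega(kd)$ bound for $\log d < k < d^{1/3}$
[Diagram: $k \times d$ matrix $\Phi$ as a product of the three factors; the Hadamard factor is labeled “Fourier”]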
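The $d \log d$ term comes from applying the Hadamard factor with a fast transform. A minimal Python sketch, assuming the normalized Walsh-Hadamard matrix $H$ with entries $\pm d^{-1/2}$ and $d$ a power of two (the function name `fwht` is illustrative, not from the talk):

```python
import math

def fwht(x):
    """Normalized Walsh-Hadamard transform of a list whose length is a power of two.

    Runs log2(d) butterfly passes of d/2 additions and subtractions each.
    """
    d = len(x)
    if d == 0 or d & (d - 1):
        raise ValueError("length must be a power of two")
    y = list(x)
    h = 1
    while h < d:
        for start in range(0, d, 2 * h):
            for i in range(start, start + h):
                a, b = y[i], y[i + h]
                y[i], y[i + h] = a + b, a - b
        h *= 2
    scale = 1.0 / math.sqrt(d)
    return [v * scale for v in y]

x = [1.0, 2.0, 3.0, 4.0, 0.0, -1.0, 0.5, 2.5]
hx = fwht(x)
norm_in = math.sqrt(sum(v * v for v in x))
norm_out = math.sqrt(sum(v * v for v in hx))   # H is orthogonal, so the norm is preserved
```

With this normalization $H$ is symmetric and orthogonal, so applying `fwht` twice returns the input.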

8 This Talk
$O(d \log k)$ for $k < d^{1/2}$
Beats JL up to $k < d^{1/2}$
$O(d)$ random bits

9 Algorithm ($k = d^{1/2}$)
A, Liberty 2007
$\Phi$ = BCH $\cdot$ Diagonal $\cdot$ ...
[Diagram: $k \times d$ matrix $\Phi$; the BCH factor is labeled “Error Correcting Code”]

10 Error Correcting Codes
$B$ = $k \times d$ matrix with $k = d^{1/2}$
Columns closed under addition
Row set is a subset of the rows of the $d \times d$ Hadamard matrix
Entries: $0 \to k^{-1/2}$, $1 \to -k^{-1/2}$
$Bx$ computable in time $d \log d$ given $x$
Binary “dual-BCH code of designed distance 4”
4-wise independence

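11 Error Correcting Codes
Fact (easily from properties of dual BCH): $\|B^t\|_{2 \to 4} = O(1)$
For $y^t \in \mathbb{R}^k$ with $\|y\|_2 = 1$: $\|yB\|_4 = O(1)$
[Diagram: $x \in \mathbb{R}^d$ is mapped by $D$, ..., and then $B$ to $\Phi x \in \mathbb{R}^{k = \sqrt{d}}$]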

12 Rademacher Series on Error Correcting Codes
Look at the r.v. $BDx \in (\mathbb{R}^k, \ell_2)$
$BDx = \sum_i D_{ii} x_i B^{(i)} = \sum_i D_{ii} M^{(i)}$
$D_{ii} \in_R \{\pm 1\}$, $i = 1, \dots, d$
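The rewriting of $BDx$ as a signed sum of the vectors $M^{(i)} = x_i B^{(i)}$ can be checked numerically. A small Python sketch, assuming an arbitrary $\pm 1$ matrix as a stand-in for $B$ (it is not a dual BCH code matrix; the identity holds for any matrix):

```python
import random

rng = random.Random(1)
k, d = 3, 5

# Stand-in for B: an arbitrary k x d sign matrix (NOT an actual dual BCH code)
B = [[rng.choice((-1.0, 1.0)) for _ in range(d)] for _ in range(k)]
x = [rng.uniform(-1.0, 1.0) for _ in range(d)]
signs = [rng.choice((-1.0, 1.0)) for _ in range(d)]   # diagonal entries D_ii

# Left-hand side: B applied to Dx
dx = [s * xi for s, xi in zip(signs, x)]
lhs = [sum(B[r][i] * dx[i] for i in range(d)) for r in range(k)]

# Right-hand side: sum_i D_ii * M^(i), where M^(i) = x_i * (i-th column of B)
M_cols = [[x[i] * B[r][i] for r in range(k)] for i in range(d)]
rhs = [0.0] * k
for i in range(d):
    for r in range(k):
        rhs[r] += signs[i] * M_cols[i][r]

max_gap = max(abs(a - b) for a, b in zip(lhs, rhs))
```

Only the signs `D_ii` are random, which is what makes $BDx$ a Rademacher series with fixed vector coefficients.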

13 Talagrand’s Concentration Bound for Rademacher Series
$Z = \|\sum_i D_{ii} M^{(i)}\|_p$ (in our case $p = 2$)
$\Pr[\,|Z - \mathbb{E}Z| > \varepsilon\,] = O(\exp\{-\varepsilon^2 / 4\|M\|_{2 \to p}^2\})$

14 Rademacher Series on Error Correcting Codes
Look at the r.v. $BDx \in (\mathbb{R}^k, \ell_2)$
$Z = \|BDx\|_2 = \|\sum_i D_{ii} M^{(i)}\|_2$
$\|M\|_{2 \to 2} \le \|x\|_4 \|B^t\|_{2 \to 4}$ (Cauchy-Schwarz)
By ECC properties: $\|M\|_{2 \to 2} \le \|x\|_4 \cdot O(1)$
Trivial: $\mathbb{E}Z = \|x\|_2 = 1$

15 Rademacher Series on Error Correcting Codes
Look at the r.v. $BDx \in (\mathbb{R}^k, \ell_2)$
$Z = \|BDx\|_2 = \|\sum_i D_{ii} M^{(i)}\|_2$
$\|M\|_{2 \to 2} = O(\|x\|_4)$
$\mathbb{E}Z = 1$
$\Pr[\,|Z - \mathbb{E}Z| > \varepsilon\,] = O(\exp\{-\varepsilon^2 / 4\|M\|_{2 \to 2}^2\})$
$\Rightarrow \Pr[\,|Z - 1| > \varepsilon\,] = O(\exp\{-\varepsilon^2 / \|x\|_4^2\})$
How to get $\|x\|_4^2 = O(k^{-1}) = O(d^{-1/2})$?
Challenge: with probability at most $\exp\{-k\varepsilon^2\}$, deviation of more than $\varepsilon$

16 Controlling $\|x\|_4^2$
How to get $\|x\|_4^2 = O(k^{-1}) = O(d^{-1/2})$?
If you think about it for a second...
“Random” $x$ has $\|x\|_4^2 = O(d^{-1/2})$
But “random” $x$ is easy to reduce: just output the first $k$ dimensions
Are we asking for too much?
No: truly random $x$ has a strong bound on $\|x\|_p$ for all $p > 2$
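Two extreme unit vectors show the range of $\|x\|_4^2$. A short Python sketch, with `d = 1024` chosen only for illustration: a vector concentrated on one coordinate has $\|x\|_4^2 = 1$, while a perfectly flat vector attains $d^{-1/2}$.

```python
import math

def lp_norm(v, p):
    """The l_p norm of a list of floats."""
    return sum(abs(t) ** p for t in v) ** (1.0 / p)

d = 1024
spike = [1.0] + [0.0] * (d - 1)      # unit vector with all mass on one coordinate
flat = [1.0 / math.sqrt(d)] * d      # unit vector with mass spread evenly

spike_l4_sq = lp_norm(spike, 4) ** 2   # equals 1
flat_l4_sq = lp_norm(flat, 4) ** 2     # equals d**-0.5, i.e. 1/32 for d = 1024
```

The spike is the bad case for the bound $\exp\{-\varepsilon^2 / \|x\|_4^2\}$, which motivates preprocessing $x$ before applying $BD$.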

17 Controlling $\|x\|_4^2$
How to get $\|x\|_4^2 = O(k^{-1}) = O(d^{-1/2})$?
Can multiply $x$ by an orthogonal matrix
Try the matrix $HD$
$Z = \|HDx\|_4 = \|\sum_i D_{ii} x_i H^{(i)}\|_4 = \|\sum_i D_{ii} M^{(i)}\|_4$
By Talagrand: $\Pr[\,|Z - \mathbb{E}Z| > t\,] = O(\exp\{-t^2 / 4\|M\|_{2 \to 4}^2\})$
$\mathbb{E}Z = O(d^{-1/4})$ (trivial)
$\|M\|_{2 \to 4} \le \|H\|_{4/3 \to 4} \|x\|_4$ (Cauchy-Schwarz)
($HD$ used in [AC06] to control $\|HDx\|_\infty$)

18 Controlling $\|x\|_4^2$
How to get $\|x\|_4^2 = O(k^{-1}) = O(d^{-1/2})$?
$Z = \|\sum_i D_{ii} M^{(i)}\|_4$, $M^{(i)} = x_i H^{(i)}$
$\Pr[\,|Z - \mathbb{E}Z| > t\,] = O(\exp\{-t^2 / 4\|M\|_{2 \to 4}^2\})$ (Talagrand)
$\mathbb{E}Z = O(d^{-1/4})$ (trivial)
$\|M\|_{2 \to 4} \le \|H\|_{4/3 \to 4} \|x\|_4$ (Cauchy-Schwarz)
$\|H\|_{4/3 \to 4} \le d^{-1/4}$ (Hausdorff-Young)
$\Rightarrow \|M\|_{2 \to 4} \le d^{-1/4} \|x\|_4$
$\Rightarrow \Pr[\,\|HDx\|_4 > d^{-1/4} + t\,] = \exp\{-t^2 / d^{-1/2} \|x\|_4^2\}$
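The effect of $HD$ is easiest to see on the worst-case input. A Python sketch, assuming the normalized Walsh-Hadamard matrix for $H$ and $d = 256$ (both illustrative choices): a single-coordinate unit vector has $\|x\|_4 = 1$, and after $HD$ every entry has magnitude $d^{-1/2}$, so $\|HDx\|_4 = d^{-1/4}$ regardless of the random signs.

```python
import math
import random

def fwht(x):
    """Normalized Walsh-Hadamard transform (length must be a power of two)."""
    d = len(x)
    if d == 0 or d & (d - 1):
        raise ValueError("length must be a power of two")
    y = list(x)
    h = 1
    while h < d:
        for start in range(0, d, 2 * h):
            for i in range(start, start + h):
                a, b = y[i], y[i + h]
                y[i], y[i + h] = a + b, a - b
        h *= 2
    scale = 1.0 / math.sqrt(d)
    return [v * scale for v in y]

def l4_norm(v):
    return sum(abs(t) ** 4 for t in v) ** 0.25

rng = random.Random(2)
d = 256
x = [0.0] * d
x[7] = 1.0                                             # worst case: ||x||_4 = 1
signs = [rng.choice((-1.0, 1.0)) for _ in range(d)]    # diagonal of D
hdx = fwht([s * t for s, t in zip(signs, x)])          # H D x

before = l4_norm(x)      # 1.0
after = l4_norm(hdx)     # d**-0.25, i.e. 0.25 for d = 256
l2_after = math.sqrt(sum(t * t for t in hdx))          # still 1: HD is orthogonal
```

For general $x$ the improvement is only probabilistic, which is what the Talagrand bound on this slide quantifies.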

19 Controlling $\|x\|_4^2$
How to get $\|x\|_4^2 = O(d^{-1/2})$?
$\Pr[\,\|HDx\|_4 > d^{-1/4} + t\,] = \exp\{-t^2 / d^{-1/2} \|x\|_4^2\}$
Need some slack: $k = d^{1/2 - \delta}$
Max error probability for challenge: $\exp\{-k\}$
$k = t^2 / d^{-1/2} \|x\|_4^2 \;\Rightarrow\; t = k^{1/2} d^{-1/4} \|x\|_4 = \|x\|_4 d^{-\delta/2}$
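The choice of $t$ follows by matching the exponent of the tail bound to $k$ and substituting the slack $k = d^{1/2 - \delta}$. Worked out step by step (reading the slack parameter as $\delta > 0$):

```latex
\begin{align*}
k = \frac{t^2}{d^{-1/2}\,\|x\|_4^2}
  &\;\Longrightarrow\; t^2 = k\, d^{-1/2}\, \|x\|_4^2 \\
  &\;\Longrightarrow\; t = k^{1/2}\, d^{-1/4}\, \|x\|_4 \\
  &\;\Longrightarrow\; t = d^{1/4 - \delta/2}\, d^{-1/4}\, \|x\|_4
     = \|x\|_4\, d^{-\delta/2}
  \qquad \text{using } k = d^{1/2 - \delta}.
\end{align*}
```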

20 Controlling $\|x\|_4^2$
How to get $\|x\|_4^2 = O(d^{-1/2})$?
First round: $\|HDx\|_4 < d^{-1/4} + \|x\|_4 d^{-\delta/2}$
Second round: $\|HD''HD'x\|_4 < d^{-1/4} + d^{-1/4 - \delta/2} + \|x\|_4 d^{-\delta}$
$r = O(1/\delta)$'th round: $\|HD^{(r)} \cdots HD'x\|_4 < O(r)\, d^{-1/4}$
... with probability $1 - O(\exp\{-k\})$
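The round-by-round bounds follow the recursion $b_{j+1} = d^{-1/4} + b_j d^{-\delta/2}$ with $b_0 = \|x\|_4$. A Python sketch that iterates it, assuming $r = \lceil 1/(2\delta) \rceil$ rounds (one concrete choice consistent with $r = O(1/\delta)$; the slide does not give the constant) and illustrative values $d = 2^{16}$, $\delta = 1/4$:

```python
import math

def l4_bound_after_rounds(x_l4, d, delta, rounds):
    """Iterate b <- d**-0.25 + b * d**(-delta/2), starting from b = ||x||_4.

    Returns the list [b_0, b_1, ..., b_rounds] of successive upper bounds.
    """
    floor = d ** -0.25
    shrink = d ** (-delta / 2.0)
    b = x_l4
    history = [b]
    for _ in range(rounds):
        b = floor + b * shrink
        history.append(b)
    return history

d, delta = 2 ** 16, 0.25
rounds = math.ceil(1.0 / (2.0 * delta))        # assumed round count: 2 here
history = l4_bound_after_rounds(1.0, d, delta, rounds)

# Final bound measured in units of d**-0.25; about 2.25 for these values
final_ratio = history[-1] / d ** -0.25
```

After $r$ rounds the leftover term $\|x\|_4 d^{-r\delta/2}$ is at most $d^{-1/4}$, and the geometric terms contribute at most $r\, d^{-1/4}$, so the bound is at most $(r + 1)\, d^{-1/4}$.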

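21 Algorithm for $k = d^{1/2 - \delta}$
[Diagram: $x \in \mathbb{R}^d$ is mapped to $\Phi x \in \mathbb{R}^k$ by the factors $HD^{(1)}, \dots, HD^{(r)}$, $D$, and $B$]
Running time $O(d \log d)$
Randomness $O(d)$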

22 Open Problems
Go beyond $k = d^{1/2}$
Conjecture: can do $O(d \log d)$ for $k = d^{1 - \delta}$
Approximate linear $\ell_2$-regression: minimize $\|Ax - b\|_2$ given $A$, $b$ (overdetermined)
–State of the art for general inputs: $\tilde{O}$(linear time) for #{variables} < #{equations}$^{1/3}$
–Conjecture: can do $\tilde{O}$(linear time) always

