Published by Jennifer Egan. Modified over 4 years ago.

1
Reductions to the Noisy Parity Problem, aka New Results on Learning Parities, Halfspaces, Monomials, Mahjongg etc.
Vitaly Feldman, Parikshit Gopalan, Subhash Khot, Ashok K. Ponnuswami (Harvard, UW, Georgia Tech)

2
Uniform Distribution Learning
Examples: $(x, f(x))$, where $x \in \{0,1\}^n$ and $f: \{0,1\}^n \to \{+1,-1\}$.
Goal: Learn the function $f$ in poly(n) time.

3
Uniform Distribution Learning
Examples: $(x, f(x))$. Goal: Learn the function $f$ in poly(n) time.
Information-theoretically impossible in general. Will assume $f$ has nice structure, such as
1. Parity: $f(x) = (-1)^{\alpha \cdot x}$
2. Halfspace: $f(x) = \mathrm{sgn}(w \cdot x)$
3. k-junta: $f(x) = f(x_{i_1}, \dots, x_{i_k})$
4. Decision Tree
5. DNF

4
Uniform Distribution Learning
Examples: $(x, f(x))$. Goal: Learn the function $f$ in poly(n) time.
1. Parity: $n^{O(1)}$ (Gaussian elimination)
2. Halfspace: $n^{O(1)}$ (LP)
3. k-junta: $n^{0.7k}$ [MOS]
4. Decision Tree: $n^{\log n}$ (Fourier)
5. DNF: $n^{\log n}$ (Fourier)
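The Gaussian elimination entry for noise-free parities can be illustrated with a short sketch. It assumes the parity is written as $(-1)^{a \cdot x}$ for an unknown bit vector `a`; the function names `learn_parity` and `parity` are illustrative and do not come from the slides. Each example contributes one linear equation over GF(2), and the system is solved by row reduction.

```python
import random

def parity(a, x):
    """Return (-1)^(a.x) for bit sequences a and x."""
    return -1 if sum(ai & xi for ai, xi in zip(a, x)) % 2 else 1

def learn_parity(examples, n):
    """Find a bit vector a consistent with noise-free examples (x, y).

    Each example gives the equation a.x = c (mod 2), with c = 1 iff y == -1.
    Returns None if the system is inconsistent (for instance, noisy labels).
    """
    rows = [list(x) + [0 if y == 1 else 1] for x, y in examples]
    pivots = []  # pivots[r] is the pivot column of row r
    r = 0
    for col in range(n):
        # Look for a row at or below r with a 1 in this column.
        p = next((i for i in range(r, len(rows)) if rows[i][col]), None)
        if p is None:
            continue  # free variable
        rows[r], rows[p] = rows[p], rows[r]
        # Clear this column in every other row (XOR is addition mod 2).
        for i in range(len(rows)):
            if i != r and rows[i][col]:
                rows[i] = [u ^ v for u, v in zip(rows[i], rows[r])]
        pivots.append(col)
        r += 1
    # A zero row with right-hand side 1 means no parity fits the data.
    if any(row[n] for row in rows[r:]):
        return None
    a = [0] * n  # free variables are set to 0
    for i, col in enumerate(pivots):
        a[col] = rows[i][n]
    return a

if __name__ == "__main__":
    rng = random.Random(0)
    n = 12
    secret = [rng.randint(0, 1) for _ in range(n)]
    xs = [tuple(rng.randint(0, 1) for _ in range(n)) for _ in range(4 * n)]
    data = [(x, parity(secret, x)) for x in xs]
    hypothesis = learn_parity(data, n)
    print(all(parity(hypothesis, x) == y for x, y in data))
```

With a single flipped label the linear system typically becomes inconsistent, which is one way to see why this approach does not carry over to the noisy setting.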

5
Uniform Distribution Learning with Random Noise
Examples: $(x, (-1)^e \cdot f(x))$, where $x \in \{0,1\}^n$, $f: \{0,1\}^n \to \{+1,-1\}$, and $e = 1$ w.p. $\eta$, $e = 0$ w.p. $1 - \eta$.
Goal: Learn the function $f$ in poly(n) time.

6
Uniform Distribution Learning with Random Noise
Examples: $(x, (-1)^e \cdot f(x))$. Goal: Learn the function $f$ in poly(n) time.
1. Parity: Noisy Parity
2. Halfspace: $n^{O(1)}$ [BFKV]
3. k-junta: $n^k$ (Fourier)
4. Decision Tree: $n^{\log n}$ (Fourier)
5. DNF: $n^{\log n}$ (Fourier)

7
The Noisy Parity Problem
Examples: $(x, (-1)^e \cdot f(x))$.
Coding Theory: Decoding a random linear code from random noise.
Best Known Algorithm: $2^{n/\log n}$, Blum-Kalai-Wasserman [BKW].
Believed to be hard.
Variant: Noisy parity of size $k$. Brute force runs in time $O(n^k)$.
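The brute-force bound for the size-$k$ variant can be sketched as follows: enumerate every set of $k$ coordinates and keep the one whose parity agrees with the most labels. This is only an illustration under assumed names (`noisy_parity_examples`, `brute_force_parity`, and the noise-rate parameter `eta` are hypothetical), not the [BKW] algorithm.

```python
import itertools
import random

def noisy_parity_examples(support, n, eta, m, rng):
    """Draw m uniform examples of the parity on `support`, flipping each label w.p. eta."""
    out = []
    for _ in range(m):
        x = tuple(rng.randint(0, 1) for _ in range(n))
        y = -1 if sum(x[i] for i in support) % 2 else 1
        if rng.random() < eta:
            y = -y  # random classification noise
        out.append((x, y))
    return out

def brute_force_parity(examples, n, k):
    """Return the size-k coordinate set whose parity agrees with the most labels.

    Tries all C(n, k) subsets, i.e. O(n^k) candidates for constant k.
    """
    best, best_agree = None, -1
    for subset in itertools.combinations(range(n), k):
        agree = 0
        for x, y in examples:
            label = -1 if sum(x[i] for i in subset) % 2 else 1
            agree += (label == y)
        if agree > best_agree:
            best, best_agree = subset, agree
    return best

if __name__ == "__main__":
    rng = random.Random(1)
    data = noisy_parity_examples((1, 4, 7), n=10, eta=0.1, m=400, rng=rng)
    print(brute_force_parity(data, n=10, k=3))
```

The correct subset agrees with roughly a $1 - \eta$ fraction of the labels, while any other subset agrees with about half, so a moderate number of examples is enough to separate them when $\eta$ is bounded away from 1/2.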

8
Agnostic Learning under the Uniform Distribution
Examples: $(x, g(x))$, where $g(x)$ is a $\{-1,+1\}$ random variable.
Error of $f$: $\Pr_x[g(x) \neq f(x)]$.
Goal: Get an approx. to $g$ that is as good as $f$.

9
Agnostic Learning under the Uniform Distribution
Examples: $(x, g(x))$. Goal: Get an approx. to $g$ that is as good as $f$.
If the function $f$ is a
1. Parity: $2^{n/\log n}$ [FGKP]
2. Halfspace: $n^{O(1)}$ [KKMS]
3. k-junta: $n^k$ [KKMS]
4. Decision Tree: $n^{\log n}$ [KKMS]
5. DNF: $n^{\log n}$ [KKMS]

10
Agnostic Learning of Parities
Examples: $(x, g(x))$. Given $g$ which has a large Fourier coefficient, find it.
Coding Theory: Decoding a random linear code with adversarial noise.
If queries were allowed: Hadamard list decoding [GL, KM].
Basis of algorithms for Decision trees [KM], DNF [Jackson].
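To make "large Fourier coefficient" concrete: the coefficient of $g$ at a bit vector $a$ is $E_x[g(x)\chi_a(x)]$ with $\chi_a(x) = (-1)^{a \cdot x}$, and it equals $2\Pr_x[g(x) = \chi_a(x)] - 1$. So a parity that agrees with $g$ on noticeably more than half the points is a large coefficient. The sketch below estimates a single coefficient from labeled examples; the names `chi`, `estimate_coefficient` and `agreement` are illustrative. Finding which $a$ is large without trying all $2^n$ candidates is the hard part and is not addressed here.

```python
import itertools

def chi(a, x):
    """Parity character chi_a(x) = (-1)^(a.x) for bit tuples a and x."""
    return -1 if sum(ai & xi for ai, xi in zip(a, x)) % 2 else 1

def estimate_coefficient(examples, a):
    """Empirical mean of g(x) * chi_a(x) over labeled examples (x, g(x))."""
    return sum(y * chi(a, x) for x, y in examples) / len(examples)

def agreement(examples, a):
    """Fraction of examples whose label equals chi_a(x)."""
    return sum(1 for x, y in examples if y == chi(a, x)) / len(examples)

if __name__ == "__main__":
    n = 4
    cube = list(itertools.product((0, 1), repeat=n))
    a = (1, 0, 1, 0)
    # g equals chi_a except on two points, where the label is flipped.
    flipped = {cube[3], cube[9]}
    examples = [(x, -chi(a, x) if x in flipped else chi(a, x)) for x in cube]
    coeff = estimate_coefficient(examples, a)
    print(coeff, agreement(examples, a), (1 + coeff) / 2)
```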

11
Reductions between problems and models
Noise-free: $(x, f(x))$
Random: $(x, (-1)^e \cdot f(x))$
Agnostic: $(x, g(x))$

12
Reductions to Noisy Parity
Theorem [FGKP]: Learning Juntas, Decision Trees and DNFs reduce to learning noisy parities of size $k$.
Class: size of parity, error-rate
k-junta: $k$, $\tfrac{1}{2} - 2^{-k}$
Decision tree, DNF: $\log n$, $\tfrac{1}{2} - n^{-2}$

13
Uniform Distribution Learning
Examples: $(x, f(x))$. Goal: Learn the function $f$ in poly(n) time.
1. Parity: $n^{O(1)}$ (Gaussian elimination)
2. Halfspace: $n^{O(1)}$ (LP)
3. k-junta: $n^{0.7k}$ [MOS]
4. Decision Tree: $n^{\log n}$ (Fourier)
5. DNF: $n^{\log n}$ (Fourier)

14
Reductions to Noisy Parity
Theorem [FGKP]: Learning Juntas, Decision Trees and DNFs reduce to learning noisy parities of size $k$.
Class: size of parity, error-rate
k-junta: $k$, $\tfrac{1}{2} - 2^{-k}$
Decision tree, DNF: $\log n$, $\tfrac{1}{2} - n^{-2}$
Evidence in favor of noisy parity being hard?
Reduction holds even with random classification noise.

15
Uniform Distribution Learning with Random Noise
Examples: $(x, (-1)^e \cdot f(x))$. Goal: Learn the function $f$ in poly(n) time.
1. Parity: Noisy Parity
2. Halfspace: $n^{O(1)}$ [BFKV]
3. k-junta: $n^k$ (Fourier)
4. Decision Tree: $n^{\log n}$ (Fourier)
5. DNF: $n^{\log n}$ (Fourier)

16
Reductions to Noisy Parity
Theorem [FGKP]: Agnostically learning parity with a given error-rate reduces to learning noisy parity with a corresponding error-rate.
With BKW, this gives a $2^{n/\log n}$ agnostic learning algorithm.
Main Idea: A noisy parity algorithm can help find large Fourier coefficients from random examples.

17
Reductions between problems and models
Noise-free: $(x, f(x))$
Random: $(x, (-1)^e \cdot f(x))$
Agnostic: $(x, g(x))$
Probabilistic Oracle

18
Probabilistic Oracles
Given $h: \{0,1\}^n \to [-1,1]$, the oracle for $h$ returns $(x, b)$ with $x \in \{0,1\}^n$, $b \in \{-1,+1\}$, and $E[b \mid x] = h(x)$.

19
Simulating Noise-free Oracles
Let $f: \{0,1\}^n \to \{-1,1\}$. The oracle for $f$ returns $(x, b)$ with $E[b \mid x] = f(x) \in \{-1,1\}$, hence $b = f(x)$. This is exactly the noise-free example $(x, f(x))$.

20
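Simulating Random Noise
Given $f: \{0,1\}^n \to \{-1,1\}$ and $\eta = 0.1$, let $h(x) = 0.8 f(x)$. The oracle for $0.8f$ returns $(x, b)$ with $E[b \mid x] = 0.8 f(x)$. Hence $b = f(x)$ w.p. 0.9 and $b = -f(x)$ w.p. 0.1, which matches the random-noise example $(x, (-1)^e \cdot f(x))$.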
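The oracle and its noise-free and random-noise special cases can be sketched as below. The sampler returns $b = +1$ with probability $(1 + h(x))/2$, which gives $E[b \mid x] = h(x)$. The function name `probabilistic_oracle`, the uniform choice of $x$, and the example target `f` are assumptions made for illustration.

```python
import random

def probabilistic_oracle(h, n, rng):
    """Return (x, b) with x uniform in {0,1}^n, b in {-1,+1}, and E[b | x] = h(x).

    h must map bit tuples to values in [-1, 1].
    """
    x = tuple(rng.randint(0, 1) for _ in range(n))
    p_plus = (1 + h(x)) / 2  # Pr[b = +1 | x]
    b = 1 if rng.random() < p_plus else -1
    return x, b

def f(x):
    """Example target: the parity of bits 0 and 2 (chosen only for illustration)."""
    return -1 if (x[0] ^ x[2]) else 1

if __name__ == "__main__":
    rng = random.Random(0)
    n, m = 5, 20000

    # h = f: the label is always f(x), i.e. a noise-free example.
    clean = [probabilistic_oracle(f, n, rng) for _ in range(m)]
    print(all(b == f(x) for x, b in clean))

    # h = 0.8 f: the label equals f(x) with probability 0.9 (noise rate 0.1).
    noisy = [probabilistic_oracle(lambda x: 0.8 * f(x), n, rng) for _ in range(m)]
    print(sum(b == f(x) for x, b in noisy) / m)
```

In general, a noise rate $\eta$ corresponds to $h = (1 - 2\eta) f$, of which $0.8 f$ is the $\eta = 0.1$ case.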

21
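Simulating Adversarial Noise
Given $g(x)$ is a $\{-1,1\}$ r.v. and $\Pr_x[g(x) \neq f(x)] = \eta$. Let $h(x) = E[g(x)]$. The oracle for $h$ returns $(x, b)$ distributed as the agnostic example $(x, g(x))$.
Bound on error rate implies $E_x[|h(x) - f(x)|] \le 2\eta$.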
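The link between the error rate and the distance of $h$ from $f$ can be worked out pointwise. The symbol $p_x$ is notation introduced here for the derivation and does not appear on the slide; it denotes the probability, over the randomness of $g(x)$ at a fixed $x$, that $g(x)$ disagrees with $f(x)$.

```latex
\begin{align*}
p_x &:= \Pr[g(x) \neq f(x)] \quad \text{for fixed } x \\
h(x) &= E[g(x)] = (1 - p_x)\, f(x) + p_x \,(-f(x)) = (1 - 2 p_x)\, f(x) \\
|h(x) - f(x)| &= 2 p_x \, |f(x)| = 2 p_x \\
E_x\big[\,|h(x) - f(x)|\,\big] &= 2\, E_x[p_x] = 2 \Pr_x[g(x) \neq f(x)]
\end{align*}
```

So an error rate of at most $\eta$ translates into an $L_1$ distance of at most $2\eta$ between $h$ and $f$.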

22
Reductions between problems and models
Noise-free: $(x, f(x))$
Random: $(x, (-1)^e \cdot f(x))$
Agnostic: $(x, g(x))$
Probabilistic Oracle
