Approximate List-Decoding and Hardness Amplification Valentine Kabanets (SFU) joint work with Russell Impagliazzo and Ragesh Jaiswal (UCSD)



Error-Correcting Codes, Randomness, and Complexity
“Classical” complexity (pre-randomness): P vs NP, P vs NL, …
“Modern” complexity (randomness): cryptography, NP = PCP(log n, 1), BPP vs P, expanders/extractors/…
Use of classical error-correcting codes (Hadamard, Reed-Solomon, …)
Invention of new kinds of codes (locally testable, locally decodable, locally list-decodable, …)

Example: Derandomization
Idea: replace a truly random string with a computationally random (pseudorandom) string.
Goal: save on the number of random bits used.

Example: Derandomization
Computational randomness = computational hardness:
Hard-to-compute functions ⇒ pseudorandom generators (PRGs) ⇒ derandomization (BPP = P)
PRGs ⇒ hard-to-compute functions

Hardness of Boolean Functions
Worst-case hardness of f: every (resource-bounded) algorithm A computes f(x) incorrectly for at least one input x.
Average-case hardness of f: every (resource-bounded) algorithm A computes f(x) incorrectly for at least a δ fraction of inputs x.
PRGs require average-case hard functions.

Worst-Case to Average-Case
f(x_1) f(x_2) f(x_3) … f(x_N) → [Error-Correcting Encoding] → g(x_1) g(x_2) g(x_3) … g(x_M), where N = 2^n and M = 2^{O(n)}.

Correcting Errors
g(x_1) g(x_2) g(x_3) … g(x_M) → [Error-Correcting Decoding] → f(x_1) f(x_2) f(x_3) … f(x_N)
If we can compute g on “many” inputs, then we can compute f on all inputs.

A Closer Look
Implicit error-correcting decoding:
If h(x) = g(x) for “many” x (h ≈ g), and h is computable by a “small” circuit, then f is computable by a “small” circuit.
Use locally decodable error-correcting codes!

List-Decodable Codes
Implicit error-correcting list-decoding:
If h(x) = g(x) for a ½ + ε fraction of inputs (h ≈ g), and h is computable by a “small” circuit, then f is computable by a “small” circuit.
Use locally list-decodable error-correcting codes! [Sudan, Trevisan, Vadhan ’01] (algebraic polynomial-based codes)

Hardness Amplification
Yao’s XOR Lemma: if f: {0,1}^n → {0,1} is δ-hard for size s (i.e., any size-s circuit errs on a ≥ δ fraction of inputs), then f^{⊕k}(x_1,…,x_k) = f(x_1) ⊕ … ⊕ f(x_k) is (½ − ε)-hard for size s′ = s · poly(ε, δ), for ε ≈ 2^{−Ω(δk)}.
Proof: by contradiction. Suppose we have a small circuit computing f^{⊕k} on more than a ½ + ε fraction of inputs; we show how to build a new circuit computing f on a > 1 − δ fraction.

XOR-Based Code
Think of a binary message msg of N = 2^n bits as the truth table of a Boolean function f. The code of msg has length N^k, where code(x_1,…,x_k) = f(x_1) ⊕ … ⊕ f(x_k). This is very similar to a version of the Hadamard code…
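As a concrete illustration, here is a minimal Python sketch of this encoding, with f given as a truth table of length N and codeword positions indexed by k-tuples of inputs (the name xor_encode is illustrative, not from the talk):

```python
from itertools import product

def xor_encode(f_table, k):
    """k-XOR code of f: maps each k-tuple (x_1,...,x_k) of inputs
    to the bit f(x_1) XOR ... XOR f(x_k)."""
    code = {}
    for tup in product(range(len(f_table)), repeat=k):
        bit = 0
        for x in tup:
            bit ^= f_table[x]
        code[tup] = bit
    return code

# Toy message: truth table of f = parity of a 2-bit input (N = 4).
f = [0, 1, 1, 0]
code = xor_encode(f, 2)
assert code[(1, 2)] == f[1] ^ f[2]
```

Note the codeword has N^k positions, so this exhaustive encoder is only feasible for toy parameters; in the hardness-amplification setting the code is never written down explicitly.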

Hadamard Code
Given a binary msg of N bits, the Hadamard code of msg is a string of 2^N bits, where for an N-bit string r, the code at position r is Had(msg)_r = ⟨msg, r⟩ mod 2 (the inner product of msg and r, mod 2). Our XOR-code is essentially a truncated Hadamard code where we only consider N-bit strings r of Hamming weight k: f(x_1) ⊕ … ⊕ f(x_k) = ⟨msg, r⟩ mod 2, where r_i = 1 for i = x_1, …, x_k and r_i = 0 elsewhere.
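This correspondence is easy to check directly; a small sketch (hadamard_bit is an illustrative name, and the x's are assumed distinct so that r has weight exactly k):

```python
def hadamard_bit(msg, r):
    """Had(msg)_r = <msg, r> mod 2."""
    return sum(m & b for m, b in zip(msg, r)) % 2

# The k-XOR codeword at (x_1,...,x_k) equals the Hadamard bit at the
# weight-k indicator vector r, with r_i = 1 exactly for i in {x_1,...,x_k}.
msg = [0, 1, 1, 0]
x1, x2 = 1, 2
r = [1 if i in (x1, x2) else 0 for i in range(len(msg))]
assert hadamard_bit(msg, r) == msg[x1] ^ msg[x2]
```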

List-Decoding the Hadamard Code
Given a 2^N-bit string w, how many N-bit strings m_1, …, m_t are there such that Had(m_i) agrees with w in a ≥ ½ + ε fraction of positions? Answer: O(1/ε²) (easy to show using discrete Fourier analysis, or elementary probability theory). The famous Goldreich-Levin algorithm provides an efficient way of list-decoding the Hadamard code with optimal list size O(1/ε²).
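Goldreich-Levin produces this list in time poly(N, 1/ε); purely for intuition about what "list-decoding" means here, the following exhaustive sketch (not the Goldreich-Levin algorithm) finds the same list by brute force over all 2^N messages, so it is only runnable for tiny N:

```python
from itertools import product

def had_bit(msg, r):
    return sum(m & b for m, b in zip(msg, r)) % 2

def hadamard_list_decode(w, N, eps):
    """All N-bit messages whose Hadamard codeword agrees with the
    2^N-bit word w on at least a 1/2 + eps fraction of positions."""
    positions = list(product([0, 1], repeat=N))
    threshold = (0.5 + eps) * len(positions)
    return [
        msg for msg in product([0, 1], repeat=N)
        if sum(had_bit(msg, r) == w[i] for i, r in enumerate(positions)) >= threshold
    ]

# An uncorrupted codeword decodes to exactly its own message: any other
# message's codeword agrees on exactly half the positions.
secret = (0, 1, 1, 0)
w = [had_bit(secret, r) for r in product([0, 1], repeat=4)]
decoded = hadamard_list_decode(w, 4, 0.4)
assert decoded == [secret]
```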

List-Decoding the k-XOR Code
Given a string w, how many strings m_1, …, m_t are there such that each k-XOR codeword code(m_i) agrees with w in a ≥ ½ + ε fraction of positions? Answer: too many! (Any two messages that differ in a < 1/k fraction of bits have almost identical codewords.)

List-Decoding the k-XOR Code
Correct question: given a string w, how many k-XOR codewords code(msg_1), …, code(msg_t) are there such that (1) each code(msg_i) agrees with w in a ≥ ½ + ε fraction of positions, and (2) every pair msg_i, msg_j differ in at least a δ fraction of positions? Answer: 1/(4ε² − e^{−2δk}), which is O(1/ε²) for δ > log(1/ε)/k (as is the case for Yao’s XOR Lemma!).

The List Size
The proof of Yao’s XOR Lemma yields an approximate list-decoding algorithm for the XOR-code defined above. But the list size is 2^{poly(1/ε)} rather than the optimal poly(1/ε).

Our Result for the k-XOR Code
There is a randomized algorithm such that, for ε ≥ poly(1/k): given a circuit C that computes code(msg) in a ½ + ε fraction of positions, the algorithm outputs, with high probability, a list of poly(1/ε) circuits that contains a circuit agreeing with msg in a ≥ 1 − k^{−0.1} fraction of positions. The running time is poly(|C|, 1/ε).

Direct Product Lemma
If f: {0,1}^n → {0,1} is δ-hard for size s (i.e., any size-s circuit errs on a ≥ δ fraction of inputs), then f^k(x_1,…,x_k) = f(x_1) … f(x_k) is (1 − ε)-hard for size s′ = s · poly(ε, δ), for ε ≈ 2^{−Ω(δk)}. The XOR Lemma and the Direct Product Lemma are essentially equivalent, thanks to the Goldreich-Levin list-decoding algorithm for Hadamard codes. Hence, it is enough to list-decode the Direct Product Lemma.
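The difference between the two codes is just what is output per tuple: the XOR code emits the parity of the k values, while the direct product emits all k of them. A one-line sketch (direct_product is an illustrative name):

```python
def direct_product(f, xs):
    """f^k on a k-tuple: output all k values f(x_1), ..., f(x_k),
    rather than just their XOR."""
    return tuple(f(x) for x in xs)

f = lambda x: x % 2  # toy Boolean function on integers
assert direct_product(f, (1, 2, 3)) == (1, 0, 1)
```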

The Proof of the DP Lemma
[Impagliazzo & Wigderson]: give an efficient randomized algorithm LEARN that, given as input a circuit C ε-computing f^k (where f: {0,1}^n → {0,1}) and poly(n, 1/ε) pairs (x, f(x)) for independent uniform x’s, with high probability outputs a circuit C′ (1 − δ)-computing f. We need to know f(x) for poly(n, 1/ε) random x’s. Let’s choose the x’s at random, and then try all possibilities for the values of f on these x’s. This gives a list of 2^{poly(n, 1/ε)} circuits.

Reducing the List Size
Magic: we will use the circuit C ε-computing f^k to generate poly(n, 1/ε) pairs (x, f(x)) for independent uniform x’s, and then run LEARN on C and the generated pairs (x, f(x)). Well… we cannot do exactly that, but…

Imperfect Samples
We will use the circuit C ε-computing f^k to generate poly(n, 1/ε) pairs (x, b_x) for a distribution on x’s that is statistically close to uniform and such that for most x’s we have b_x = f(x). Then we run a generalization of LEARN on C and the generated pairs (x, b_x), where the generalized LEARN is tolerant of imperfect samples (x, b_x).

How to generate imperfect samples

Warm-Up
Given a circuit C ε-computing f^k, we want to generate (x, f(x)) where x is almost uniformly distributed. First attempt: pick a k-tuple (x_1,…,x_k) uniformly at random from the ε fraction of k-tuples where C is correct. Evaluate C(x_1,…,x_k) = b_1 … b_k. Pick a random i, 1 ≤ i ≤ k, and output (x_i, b_i).
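A minimal sketch of this first attempt, assuming (unrealistically, as the next slides discuss) that we can sample directly from the set S of tuples on which C is correct; the names warmup_sample, S, and C are illustrative:

```python
import random
from itertools import product

def warmup_sample(S, C, rng):
    """Pick a uniform k-tuple from S, evaluate C on it, and output
    a uniformly chosen coordinate together with C's bit for it."""
    x_tuple = rng.choice(S)          # uniform k-tuple from the "correct" set S
    bits = C(x_tuple)                # C's answers b_1 ... b_k
    i = rng.randrange(len(x_tuple))  # uniform index 1 <= i <= k
    return x_tuple[i], bits[i]

# Toy instance: f = parity of a 2-bit input (inputs encoded as 0..3);
# here C computes f^3 perfectly, so S is the set of all 3-tuples.
f = lambda x: bin(x).count("1") % 2
C = lambda xs: tuple(f(x) for x in xs)
S = list(product(range(4), repeat=3))
rng = random.Random(0)
x, b = warmup_sample(S, C, rng)
assert b == f(x)  # on tuples where C is correct, the pair is (x, f(x))
```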

A Sampling Lemma
Let S ⊆ {0,1}^{nk} be any set of density ε. Define a distribution D as follows: pick a k-tuple of n-bit strings (x_1,…,x_k) uniformly at random from S, pick uniformly an index 1 ≤ i ≤ k, and output x_i. Then the statistical distance between D and the uniform distribution is at most (log(k/ε)/k)^{1/2}, i.e., roughly 1/√k (ignoring log factors).
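For small parameters, the distribution D and its distance from uniform can be computed exactly by counting, which makes the lemma easy to experiment with (function names are illustrative; this checks the definitions, not the proof):

```python
from collections import Counter
from itertools import product

def coordinate_distribution(S, domain, k):
    """The Sampling Lemma's distribution D: a uniform tuple from S,
    then a uniform coordinate of it. Computed exactly by counting."""
    counts = Counter()
    for tup in S:
        for x in tup:
            counts[x] += 1
    total = len(S) * k
    return {x: counts[x] / total for x in domain}

def stat_distance_from_uniform(D, domain):
    u = 1 / len(domain)
    return 0.5 * sum(abs(D[x] - u) for x in domain)

domain = range(4)
k = 2
# Sanity check: if S is the set of ALL k-tuples (density 1), D is uniform.
S_full = list(product(domain, repeat=k))
D_full = coordinate_distribution(S_full, domain, k)
assert stat_distance_from_uniform(D_full, domain) < 1e-9
```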

Using the Sampling Lemma
If we could sample k-tuples on which C is correct, then we would have a pair (x, f(x)) for x roughly 1/√k-close to uniform. But we can’t! Instead, we run the previous sampling procedure with a random k-tuple (x_1,…,x_k) some poly(1/ε) number of times. With high probability, at least one pair will be of the form (x, f(x)) for x close to uniform.

Getting More Pairs
Given a circuit C ε-computing f^k, we can get √k pairs (x, f(x)), for x’s statistically close to uniform, by viewing the input k-tuple as a √k-tuple of √k-tuples, and applying the Sampling Lemma to that “meta-tuple”.
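The regrouping itself is purely mechanical; a sketch assuming for simplicity that k is a perfect square (as_meta_tuple is an illustrative name):

```python
import math

def as_meta_tuple(k_tuple):
    """View a k-tuple as a sqrt(k)-tuple of sqrt(k)-tuples, so the
    Sampling Lemma can treat each inner tuple as one 'coordinate'."""
    k = len(k_tuple)
    m = math.isqrt(k)
    assert m * m == k, "this sketch assumes k is a perfect square"
    return tuple(tuple(k_tuple[i * m:(i + 1) * m]) for i in range(m))

meta = as_meta_tuple(tuple(range(9)))
assert meta == ((0, 1, 2), (3, 4, 5), (6, 7, 8))
```

Picking a uniform inner tuple of the meta-tuple then yields √k inputs at once, each carrying its bit of C's answer.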

What Does It Give Us?
Given a circuit C ε-computing f^k, we can generate about √k samples (x, f(x)) (roughly speaking). We need about n/ε² samples (to run LEARN). What if n/ε² > √k?

Direct Product Amplification
Idea: given a circuit C ε-computing f^k, construct a new circuit C′ that ε′-computes f^{k′} for k′ = k^{3/2} and ε′ > ε². Iterate a constant number of times, and get a circuit poly(ε)-computing f^{poly(k)} for any poly(k). If ε = poly(1/k), we are done [since then n/ε² ≤ poly(k)].

Direct Product Amplification
We cannot achieve perfect DP amplification! Instead, we can create a circuit C′ such that, for at least an ε′ fraction of tuples (x_1,…,x_{k′}), C′(x_1,…,x_{k′}) agrees with f(x_1),…,f(x_{k′}) in “most” positions. Because of this imperfection, we can only get pairs of the form (x, b_x) where the x’s are almost uniform and b_x = f(x) for “most” x.

Putting Everything Together

C for f^k → [DP amplification] → C′ for f^{k^c} → [Sampling] → pairs (x, b_x) → [LEARN] → circuit (1 − δ)-computing f, with probability > poly(ε).
Repeat poly(1/ε) times to get a list containing a good circuit for f, w.h.p.

An application to uniform hardness amplification

Hardness Amplification in PH
Theorem: suppose there is a language L in P^{NP_||} that is 1/n^c-hard for BPP. Then there is a language L′ in P^{NP_||} that is (½ − n^{−d})-hard for BPP, for any constant d. Trevisan gives a weaker reduction (from 1/n^c-hardness to (½ − log^{−α} n)-hardness) but within NP. Since we use the non-monotone function XOR as an amplifier, we get outside NP.

Open Questions
Achieving optimal list-size decoding for arbitrary ε. Which monotone functions f yield efficiently list-decodable f-based error-correcting codes? Getting an analogue of the Goldreich-Levin algorithm for monotone f-based codes would yield better uniform hardness amplification in NP.

