Complexity Theory Lecture 11 Lecturer: Moni Naor

Recap
Last week: statistical zero-knowledge, an AM protocol for VC dimension, hardness and randomness.
This week: hardness and randomness, semi-random sources, extractors.

Derandomization
A major research question: how to make the construction of
– small sample spaces `resembling' a large one
– hitting sets
efficient.
Successful approach: randomness from hardness
– (cryptographic) pseudo-random generators
– complexity-oriented pseudo-random generators

Recall: Derandomization I
Theorem: any f ∈ BPP has a polynomial size circuit.
Simulating large sample spaces: want to find a small collection of strings on which the PTM behaves similarly to the large collection; if the PTM errs with probability at most δ, then it should err on at most δ + ε of the small collection. The collection should resemble the probability of success on ALL inputs.
Choose m random strings. For input x, event A_x is: more than (δ + ε) of the m strings fail the PTM. By a Chernoff bound, Pr[A_x] ≤ e^(−2ε²m) < 2^(−2n) for m = c·n/ε² with a suitable constant c. By a union bound,
Pr[∪_x A_x] ≤ ∑_x Pr[A_x] < 2^n · 2^(−2n) = 2^(−n) < 1,
so some collection of m strings is good (errs on at most δ + ε of the collection) simultaneously for every input.

Pseudo-random generators
Would like to stretch a short secret (seed) into a long one. The resulting long string should be usable in any case where a long string is needed; in particular, in cryptographic applications such as a one-time pad.
Important notion: indistinguishability. Two probability distributions that cannot be distinguished:
– statistical indistinguishability: distances between probability distributions
– new notion: computational indistinguishability

Computational Indistinguishability
Definition: two sequences of distributions {D_n} and {D'_n} on {0,1}^n are computationally indistinguishable if for every polynomial p(n), every probabilistic polynomial-time adversary A, and all sufficiently large n: if A receives input y ∈ {0,1}^n and tries to decide whether y was generated by D_n or D'_n, then
|Prob[A='0' | D_n] − Prob[A='0' | D'_n]| < 1/p(n).
This quantity is called A's advantage. Without the restriction to probabilistic polynomial-time tests, the notion is equivalent to the variation distance being negligible:
∑_{β ∈ {0,1}^n} |Prob[D_n = β] − Prob[D'_n = β]| < 1/p(n).
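To make the statistical notion concrete, here is a minimal Python sketch of the L1 quantity in the last formula (an illustrative assumption: the distributions are given explicitly as dictionaries from strings to probabilities, which is only feasible for small n):

    def l1_distance(D1, D2):
        # sum over beta of |Pr[D_n = beta] - Pr[D'_n = beta]|,
        # taken over the union of the two supports
        support = set(D1) | set(D2)
        return sum(abs(D1.get(b, 0.0) - D2.get(b, 0.0)) for b in support)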

Pseudo-random generators
Definition: a function g: {0,1}* → {0,1}* is said to be a (cryptographic) pseudo-random generator if
– it is polynomial time computable
– it stretches the input: |g(x)| > |x|; denote by ℓ(n) the length of the output on inputs of length n
– if the input (seed) is random, then the output is indistinguishable from random: for any probabilistic polynomial-time adversary A that receives input y of length ℓ(n) and tries to decide whether y = g(x) or y is a random string from {0,1}^ℓ(n), for any polynomial p(n) and sufficiently large n
|Prob[A=`rand' | y=g(x)] − Prob[A=`rand' | y ∈_R {0,1}^ℓ(n)]| < 1/p(n)
Want to use the output of a pseudo-random generator whenever long random strings are used.
"Anyone who considers arithmetical methods of producing random numbers is, of course, in a state of sin." (J. von Neumann)

Pseudo-random generators
(Same definition as on the previous slide.) Important issues:
– Why is the adversary bounded by polynomial time?
– Why is the indistinguishability not perfect?

Pseudo-Random Generators and Derandomization
A pseudo-random generator maps all possible strings of length k to strings of length n. Any input should see roughly the same fraction of accepts and rejects over the generator's outputs as over truly random strings. The result is a derandomization of a BPP algorithm by taking the majority.

Complexity of Derandomization
Need to go over all 2^k possible seed strings and compute the pseudo-random generator on each of them. The generator has to be secure against non-uniform distinguishers: the actual distinguisher is the combination of the algorithm and the input, and since we want it to work for all inputs we get non-uniformity.

Construction of pseudo-random generators: randomness from hardness
Idea: for any given one-way function there must be a hard decision problem hidden there. If it is balanced enough, it looks random. Such a problem is a hardcore predicate. Possibilities:
– last bit
– first bit
– inner product

Hardcore Predicate
Definition: let f: {0,1}* → {0,1}* be a function. We say that h: {0,1}* → {0,1} is a hardcore predicate for f if
– it is polynomial time computable
– for any probabilistic polynomial-time adversary A that receives input y = f(x) and tries to compute h(x), for any polynomial p(n) and sufficiently large n
|Prob[A(y) = h(x)] − 1/2| < 1/p(n),
where the probability is over the choice of y and the random coins of A.
Sources of hardcoreness:
– not enough information about x: not of interest for generating pseudo-randomness
– enough information about x, but it is hard to compute
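The inner-product possibility is easy to write down; by the Goldreich-Levin theorem it is a hardcore predicate for the padded function (x, r) → (f(x), r) whenever f is one-way. A minimal sketch, with bits represented as Python lists of 0/1:

    def inner_product(x_bits, r_bits):
        # h(x, r) = <x, r> mod 2
        assert len(x_bits) == len(r_bits)
        return sum(xi & ri for xi, ri in zip(x_bits, r_bits)) % 2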

Single bit expansion
Let f: {0,1}^n → {0,1}^n be a one-way permutation, and let h: {0,1}^n → {0,1} be a hardcore predicate for f.
Consider g: {0,1}^n → {0,1}^(n+1) where g(x) = (f(x), h(x)).
Claim: g is a pseudo-random generator.
Proof idea: a distinguisher for g can be used to guess h(x), since it must tell (f(x), h(x)) apart from (f(x), 1−h(x)).

From single bit expansion to many bit expansion
Iterate: on input x and a string r, the internal configuration runs through x, f(x), f^(2)(x), f^(3)(x), …, and the output is the sequence of hardcore bits h(x,r), h(f(x),r), h(f^(2)(x),r), …, h(f^(m−1)(x),r).
Can make r and f^(m)(x) public, but not any other internal state. Can make m as large as needed.
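A sketch of this iterated construction (the one-way permutation f and the hardcore predicate h are parameters here; any toy permutation on bit lists lets the code run, but pseudo-randomness of course requires a genuinely one-way f, with h for instance the inner product above):

    def iterated_generator(x_bits, r_bits, m, f, h):
        out = []
        state = list(x_bits)
        for _ in range(m):
            out.append(h(state, r_bits))  # hardcore bit of the current state
            state = f(state)              # internal configuration: x, f(x), f^(2)(x), ...
        # r and the final state f^(m)(x) may be made public;
        # the intermediate states may not
        return out, state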

Two important techniques for showing pseudo-randomness:
– Hybrid argument
– Next-bit prediction and pseudo-randomness

Hybrid argument
To prove that two distributions D and D' are indistinguishable: suggest a collection of distributions D = D_0, D_1, …, D_k = D'. If D and D' can be distinguished, then there is a pair D_i and D_i+1 that can be distinguished: advantage ε in distinguishing between D and D' means advantage ε/k between some D_i and D_i+1. Use a distinguisher for the pair D_i and D_i+1 to derive a contradiction.
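Written out, the averaging step is just the triangle inequality applied to a telescoping sum:

\[
\bigl|\Pr[A(D_0)=1]-\Pr[A(D_k)=1]\bigr|
= \Bigl|\sum_{i=0}^{k-1}\bigl(\Pr[A(D_i)=1]-\Pr[A(D_{i+1})=1]\bigr)\Bigr|
\le \sum_{i=0}^{k-1}\bigl|\Pr[A(D_i)=1]-\Pr[A(D_{i+1})=1]\bigr|,
\]

so if the left-hand side is at least ε, some term on the right is at least ε/k.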

Next-bit Test
Definition: a function g: {0,1}* → {0,1}* is said to pass the next-bit test if
– it is polynomial time computable
– it stretches the input: |g(x)| > |x|; denote by ℓ(n) the length of the output on inputs of length n
– if the input (seed) is random, then the output passes the next-bit test: for any prefix length 0 ≤ i < ℓ(n), for any probabilistic polynomial-time adversary A that is a predictor (receives the first i bits of y = g(x) and tries to guess the next bit), for any polynomial p(n) and sufficiently large n
|Prob[A(y_1, y_2, …, y_i) = y_i+1] − 1/2| < 1/p(n)
Theorem: a function g: {0,1}* → {0,1}* passes the next-bit test if and only if it is a pseudo-random generator.

Landmark results in the theory of cryptographic pseudo-randomness
Theorem: if pseudo-random generators stretching by a single bit exist, then pseudo-random generators stretching by any polynomial factor exist.
Theorem: if one-way permutations exist, then pseudo-random generators exist.
A more difficult theorem to prove:
Theorem [HILL] (Håstad, Impagliazzo, Levin, Luby): one-way functions exist iff pseudo-random generators exist.

Complexity Oriented Pseudo-Random Generators
Cryptography: only a crude upper bound on the time of the `user' (the distinguisher); the generator has less computational power than the distinguisher.
Derandomization: when derandomizing an algorithm you have a much better idea about the resources, in particular the run time; the generator may have more computational power, possibly from a higher complexity class.
Note the switch in quantifier order.

Ideas for getting better pseudo-random generators for derandomization
The generator need not be so efficient. When derandomizing
– a parallel algorithm, the generator may be more sequential. Example: to derandomize AC^0 circuits the generator can compute parities.
– a low-memory algorithm, the generator may use more space.
In particular, we can depart from the one-way function assumption (easy to compute in one direction, hard in the other).
The (in)distinguishing probability need not be so small: we are going to take a majority at the end.

Parameters of a complexity oriented pseudo-random generator
All parameters are functions of n:
– seed length t
– output length m
– running time n^c
– fooling circuits of size s
– error ε
Any circuit family {C_n} of size s(n) that tries to distinguish outputs of the generator from random strings in {0,1}^m(n) has at most ε(n) advantage.

Hardness Assumption: Unapproximable Functions
Definition: E = ∪_k DTIME(2^kn).
Definition: a family of functions f = {f_n}, f_n: {0,1}^n → {0,1}, is said to be s(n)-unapproximable if for every family of circuits {C_n} of size s(n):
Pr_x[C_n(x) = f_n(x)] ≤ ½ + 1/s(n).
Here s(n) is both the circuit size and the bound on the advantage; this is an average hardness notion.
Example: if g is a one-way permutation and h is a hardcore function strong against s(n)-adversaries, then f(y) = h(g^(−1)(y)) is s(n)-unapproximable.

One bit expansion
Assumption: f = {f_n} is s(n)-unapproximable for s(n) = 2^Ω(n), and is in E.
Claim: G = {G_n} with G_n(y) = y ∘ f_log n(y) is a single-bit-expansion generator family.
Proof: suppose not; then there exists a predictor that computes f_log n with probability better than ½ + 1/s(log n) on a random input.
Parameters: seed length t = log n, output length m = log n + 1, fooling circuits of size s = n^δ, running time n^c, error ε = 1/n^δ < 1/m.

Getting Many Bits Simultaneously
Try outputting many evaluations of f on various parts of the seed: let b_i(y) be a projection of y and consider
G(y) = f(b_1(y)) ∘ f(b_2(y)) ∘ … ∘ f(b_m(y)).
It seems that a predictor must evaluate f(b_i(y)) to predict the i-th bit. But a predictor might use correlations without having to compute f. It turns out that it suffices to decorrelate the b_i(y)'s in a pairwise manner.
Notation: if |y| = t and S ⊆ {1…t}, we denote by y|_S the sequence of bits of y whose index is in S.

Nearly-Disjoint Subsets
Definition: a collection of subsets S_1, S_2, …, S_m ⊆ {1…t} is an (h,a)-design if
– fixed size: for all i, |S_i| = h
– small intersection: for all i ≠ j, |S_i ∩ S_j| ≤ a
Parameters: (m, t, h, a); each set has size h and each pairwise intersection has size ≤ a.

Nearly-Disjoint Subsets
Lemma: for every ε > 0 and m < n one can construct in poly(n) time a collection of subsets S_1, S_2, …, S_m ⊆ {1…t} which is an (h,a)-design with parameters h = log n, a = ε log n, and t = O(log n).
The argument gives both a proof of existence and a sequential construction, via the method of conditional probabilities. The constant in the big O depends on ε.

Nearly-Disjoint Subsets
Proof: construct in a greedy manner; repeat m times:
– pick a random (h = log n)-subset of {1…t}
– set t = O(log n) so that the expected overlap with a fixed S_i is ½ ε log n, and the probability that the overlap with S_i is larger than ε log n is at most 1/m. This can be arranged by picking a single element independently from each of t' = log n buckets: for S_i the event A_i is that the intersection is larger than (½ε + ½ε)t' = εt', and by a Chernoff bound Pr[A_i] ≤ e^(−2(½ε)²t') < 2^(−log m) = 1/m.
– union bound: with positive probability some h-subset has the desired small overlap with all the S_i picked so far
– find the good h-subset by exhaustive search
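A sketch of this randomized greedy construction (rejection sampling stands in for the method of conditional probabilities, which derandomizes the choice; all names are illustrative):

    import random

    def greedy_design(m, h, a, t, tries=10000):
        # build S_1,...,S_m, subsets of {0..t-1} of size h,
        # with all pairwise intersections of size at most a
        sets = []
        for _ in range(m):
            for _attempt in range(tries):
                S = set(random.sample(range(t), h))
                if all(len(S & T) <= a for T in sets):  # small overlap with all previous
                    sets.append(S)
                    break
            else:
                raise RuntimeError("no good subset found; increase t")
        return sets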

Other constructions of designs: based on error-correcting codes; a simple construction is based on polynomials.

The NW generator (Nisan-Wigderson)
Need: f ∈ E that is s(n)-unapproximable for s(n) = 2^δn, and a collection S_1, …, S_m ⊆ {1…t} which is an (h,a)-design with h = log n, a = δ log n/3 and t = O(log n).
G_n(y) = f_log n(y|_S_1) ∘ f_log n(y|_S_2) ∘ … ∘ f_log n(y|_S_m)
(Each output bit applies f_log n to the bits of the seed y indexed by one subset S_i.)
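In code, the construction is a few lines (a sketch: f stands for f_log n evaluated on an h-bit string, and design_sets is the (h,a)-design from the previous slides):

    def nw_generator(y_bits, design_sets, f):
        # i-th output bit: the hard function applied to the seed bits indexed by S_i
        return [f(tuple(y_bits[j] for j in sorted(S))) for S in design_sets]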

The NW generator
Theorem: G = {G_n} is a complexity oriented pseudo-random generator with
– seed length t = O(log n)
– output length m = n^(δ/3)
– running time n^c for some constant c
– fooling circuits of size s = m
– error ε = 1/m

The NW generator
Proof: assume G = {G_n} does not ε-pass a statistical test C = {C_m} of size s:
|Pr_x[C(x) = 1] − Pr_y[C(G_n(y)) = 1]| > ε.
Just as in the next-bit test, using a hybrid argument one can transform this distinguisher into a predictor A of size s' = s + O(m):
Pr_y[A(G_n(y)_1…i−1) = G_n(y)_i] > ½ + ε/m.

Proof of the NW generator
Pr_y[A(G_n(y)_1…i−1) = G_n(y)_i] > ½ + ε/m
Fix the bits outside of S_i so as to preserve the advantage:
Pr_y'[A(G_n(α y' β)_1…i−1) = G_n(α y' β)_i] > ½ + ε/m,
where (α, β) is the assignment to {1…t}\S_i maximizing the advantage of A, and y' ranges over the assignments to the bits in S_i.
Recall G_n(y) = f_log n(y|_S_1) ∘ f_log n(y|_S_2) ∘ … ∘ f_log n(y|_S_m).

Proof of the NW generator (continued)
– G_n(α y' β)_i is exactly f_log n(y')
– for j ≠ i, as y' varies, (α y' β)|_S_j varies over only 2^a values, by the small intersection property. So to compute G_n(α y' β)_j one needs only a lookup table of size 2^a.
Hard-wire (up to) m−1 tables of 2^a values each to provide all of G_n(α y' β)_1…i−1.

The Circuit for computing f
From A together with the hardwired lookup tables 1, 2, 3, …, i−1 we get a small circuit that on input y' approximates f_log n(y').
Properties of the circuit:
– size: s + O(m) + (m−1)·2^a < s(log n) = n^δ
– advantage: ε/m = 1/m² > 1/s(log n) = n^(−δ)
This contradicts the unapproximability of f.

Extending the result
Theorem: if E contains 2^Ω(n)-unapproximable functions, then BPP = P.
The assumption is an average-case one, based on non-uniformity. Improvement:
Theorem: if E contains functions that require circuits of size 2^Ω(n) (in the worst case), then E contains 2^Ω(n)-unapproximable functions.
Corollary: if E requires exponential size circuits, then BPP = P.

Extracting Randomness from defective sources
Suppose that we have an imperfect source of randomness:
– a physical source, possibly biased and correlated
– a collection of events in a computer (/dev/rand)
– information leak
Can we
– extract good random bits from the source?
– use the source for various tasks requiring randomness, such as probabilistic algorithms?

Imperfect Sources
Biased coins: X_1, X_2, …, X_n, each bit independently chosen so that Pr[X_i = 1] = p. How to get unbiased coins?
Von Neumann's procedure: flip the coin twice. If it comes up `0' followed by `1', call the outcome `0'. If it comes up `1' followed by `0', call the outcome `1'. Otherwise (two `0's or two `1's occurred), repeat the process.
Claim: the procedure generates an unbiased result, no matter how the coin was biased; it works for all p simultaneously.
Two questions:
– Can we get a better rate of generating bits?
– What about more involved or insidious models?
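Von Neumann's procedure in a few lines of Python (a sketch; the input is a list of independent p-biased bits):

    def von_neumann(flips):
        out = []
        it = iter(flips)
        for b1, b2 in zip(it, it):          # examine the flips two at a time
            if (b1, b2) == (0, 1):
                out.append(0)
            elif (b1, b2) == (1, 0):
                out.append(1)
            # (0,0) and (1,1) are discarded and the process repeats
        return out

Since Pr[01] = Pr[10] = p(1−p), each emitted bit is unbiased for every p; a pair yields an output bit with probability 2p(1−p), i.e. an expected rate of p(1−p) output bits per input flip, which is what the "better rate" question asks to improve.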

Shannon Entropy
Let X be a random variable over alphabet Γ with distribution P. The Shannon entropy of X is
H(X) = − ∑_{x∈Γ} P(x) log P(x),
where we take 0 log 0 to be 0.
Interpretation: H(X) represents how much we can compress X under the best encoding.

Examples
If X = 0 (constant) then H(X) = 0; this is the only case where H(X) = 0, and in all other cases H(X) > 0.
If X ∈ {0,1} with Prob[X=0] = p and Prob[X=1] = 1−p, then H(X) = −p log p − (1−p) log (1−p) ≡ H(p).
If X ∈ {0,1}^n is uniformly distributed, then H(X) = − ∑_{x∈{0,1}^n} (1/2^n) log(1/2^n) = 2^n · (n/2^n) = n.

Properties of Entropy
Entropy is bounded: H(X) ≤ log |Γ|, with equality only if X is uniform over Γ.
For any function f: {0,1}* → {0,1}*, H(f(X)) ≤ H(X).
H(X) is an upper bound on the number of bits we can deterministically extract from X.

Does High Entropy Suffice for extraction?
If we have a source X on {0,1}^n where X has high entropy (say H(X) ≥ n/2), how many bits can we guarantee to extract? Consider:
– Pr[X = 0^n] = 1/2
– for any x ∈ 1{0,1}^(n−1), Pr[X = x] = 1/2^n
Then H(X) = n/2 + 1/2, but we cannot guarantee extracting more than a single bit, since the source outputs the fixed string 0^n half the time.

Another Notion: Min Entropy
Let X be a random variable over alphabet Γ with distribution P. The min entropy of X is
H_min(X) = − log max_{x∈Γ} P(x).
The min entropy is determined by the most likely value of X; min-entropy k implies that no string has probability mass more than 2^(−k).
Property: H_min(X) ≤ H(X). (Why?)
Would like to extract k bits from a min-entropy k source. This is possible, approximately, if we know the source and have unlimited computational power.
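The two entropy notions, and the "spike" source from two slides back, in a short sketch (distributions as dictionaries from strings to probabilities; n = 4 for concreteness):

    from math import log2

    def shannon_entropy(P):
        # H(X) = -sum_x P(x) log P(x), with 0 log 0 = 0
        return -sum(p * log2(p) for p in P.values() if p > 0)

    def min_entropy(P):
        # H_min(X) = -log max_x P(x)
        return -log2(max(P.values()))

    n = 4
    P = {'0' * n: 0.5}                          # Pr[X = 0^n] = 1/2
    for i in range(2 ** (n - 1)):
        P['1' + format(i, '03b')] = 2.0 ** -n   # Pr[X = 1x] = 1/2^n
    print(shannon_entropy(P), min_entropy(P))   # 2.5 (= n/2 + 1/2) and 1.0

So H is large while H_min = 1, matching the fact that only about one bit can be guaranteed.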

The Semi-Random model (Santha-Vazirani)
Definition: a source emitting a sequence of bits X_1, X_2, …, X_n is an SV source with bias δ if for all 1 ≤ i ≤ n and all b_1, b_2, …, b_i ∈ {0,1}^i we have
½ − δ ≤ Pr[X_i = b_i | X_1 = b_1, X_2 = b_2, …, X_i−1 = b_i−1] ≤ ½ + δ.
So the next bit has bias at most δ: a clear generalization of a biased coin.
Motivation: physical measurements where there are correlations with the history; distributed imperfect coin generation.
An SV source has high min entropy: for any string b_1, b_2, …, b_n,
Pr[X_1 = b_1, X_2 = b_2, …, X_n = b_n] ≤ (½ + δ)^n.

Impossibility of extracting a single bit from SV sources
Would like a procedure similar to von Neumann's for SV sources: a function f: {0,1}^n → {0,1} such that for any SV source with bias δ, the bit f(X_1, X_2, …, X_n) is more or less balanced.
Theorem: for all δ ∈ (0,1], all n and all functions f: {0,1}^n → {0,1}, there is an SV source of bias δ such that f(X_1, X_2, …, X_n) has bias at least δ.

Proof of impossibility of extraction: strong SV sources
Definition: a source emitting a sequence of bits X_1, X_2, …, X_n is a strong SV source with bias δ if for all 1 ≤ i ≤ n and all b_1, …, b_i−1, b_i+1, …, b_n ∈ {0,1}^(n−1), the bias of X_i given that X_1 = b_1, …, X_i−1 = b_i−1, X_i+1 = b_i+1, …, X_n = b_n is at most δ.
This is a restriction: every strong SV source is also an SV source. Here even the future does not help you to bias X_i too much.

Proof of impossibility of extraction: δ-imbalanced sources
Definition: a source emitting a sequence of n bits X = X_1, X_2, …, X_n is δ-imbalanced if for all x, y ∈ {0,1}^n we have Pr[X=x]/Pr[X=y] ≤ (1+δ)/(1−δ).
Lemma: every δ-imbalanced source is a strong SV source with bias δ.
Proof: for all 1 ≤ i ≤ n and b_1, …, b_i−1, b_i+1, …, b_n ∈ {0,1}^(n−1), the δ-imbalanced property implies that
(1−δ)/(1+δ) ≤ Pr[X_1 = b_1, …, X_i = 0, …, X_n = b_n] / Pr[X_1 = b_1, …, X_i = 1, …, X_n = b_n] ≤ (1+δ)/(1−δ),
which implies that the bias of X_i is at most δ.

Proof of impossibility of extraction
Lemma: for every function f: {0,1}^n → {0,1} there is an imbalanced source such that f(X_1, X_2, …, X_n) has bias at least δ.
Proof: there exists a set S ⊆ {0,1}^n of size 2^(n−1) on which f is constant, equal to the majority value b (take S ⊆ f^(−1)(b) of size exactly 2^(n−1)). Consider the source X:
– with probability ½ + δ pick a random element in S
– with probability ½ − δ pick a random element in {0,1}^n \ S
Then f(X) = b with probability ½ + δ, so f has bias δ. Each element of S has probability (1+2δ)/2^n and each element outside S has probability (1−2δ)/2^n, so X is 2δ-imbalanced; rescaling the parameter gives the statement.
Recall: a source is δ-imbalanced if for all x, y ∈ {0,1}^n, Pr[X=x]/Pr[X=y] ≤ (1+δ)/(1−δ).

Extractors
So if extraction from SV sources is impossible, should we simply give up? No: use randomness! Just make sure you are using much less randomness than you are getting out.

Extractor
Extractor: a universal procedure for purifying an imperfect source.
– The function Ext(x,y) should be efficiently computable.
– A truly random seed serves as a `catalyst'.
– Parameters: (n, k, m, t, ε).
The source string x ∈ {0,1}^n comes from a source with 2^k strings (min-entropy k); the truly random seed y has t bits; the output Ext(x,y) is m near-uniform bits.

Extractor: Definition
(k, ε)-extractor: for all random variables X with min-entropy k,
– the output fools all tests T:
|Pr_z[T(z) = 1] − Pr_{y ∈_R {0,1}^t, x ← X}[T(Ext(x, y)) = 1]| ≤ ε
– equivalently, the distributions Ext(X, U_t) and U_m are ε-close (L1 distance ≤ 2ε), where U_m is the uniform distribution on {0,1}^m.
Comparison to pseudo-random generators: the output of a PRG should fool all efficient tests; the output of an extractor should fool all tests.

Extractors: Applications
Using extractors: use the output in place of randomness in any application; this alters the probability of any outcome by at most ε.
Main motivation: use the output in place of randomness in an algorithm. How to get a truly random seed? Enumerate all seeds and take the majority.

Extractor as a Graph
View Ext as a bipartite graph: left vertices {0,1}^n, right vertices {0,1}^m, with each left vertex x connected to the 2^t strings {Ext(x,y)}.
Want every subset of size 2^k on the left to see almost all of the right-hand side with nearly equal probability: for each subset of size at least 2^k, the induced degree on each right-hand-side vertex should be roughly the same.

Extractors: desired parameters
                  good           optimal
  short seed      O(log n)       log n + O(1)
  long output     m = k^Ω(1)     m = k + t − O(1)
  many k's        k = n^Ω(1)     any k = k(n)
A seed of length O(log n) allows going over all seeds.

Extractors
A random construction for Ext achieves the optimal parameters, but we need explicit constructions: otherwise we cannot derandomize BPP. An optimal construction of extractors is still open.
Trevisan's extractor:
– idea: any string defines a function. A string C over Σ of length ℓ defines a function f_C: {1…ℓ} → Σ by f_C(i) = C[i].
– use the NW generator with the source string in place of the hard function.
From complexity to combinatorics!

Error-correcting codes
Error Correcting Code (ECC): C: Σ^n → Σ^ℓ. A message m ∈ Σ^n is mapped to a codeword C(m) ∈ Σ^ℓ. The received word R is C(m) with some positions corrupted. If there are not too many errors, we want to decode: D(R) = m.
Parameters of interest:
– rate: n/ℓ
– distance: d = min_{m ≠ m'} Δ(C(m), C(m'))

Distance and error correction
An error-correcting code C with minimum distance d can be uniquely decoded from fewer than d/2 errors: any ball of radius ⌊(d−1)/2⌋ in Σ^ℓ contains at most one codeword.

Distance, error correction and list decoding
Alternative to unique decoding: find a short list of messages (including the correct one). The hope is to get closer to d errors, instead of the fewer than d/2 errors of unique decoding.
Johnson Bound. Theorem: a binary code with minimum distance (½ − δ²)ℓ has at most O(1/δ²) codewords in any ball of radius (½ − δ)ℓ.

Trevisan Extractor
Tools:
– An error-correcting code C: {0,1}^n → {0,1}^ℓ with distance between codewords (½ − ¼m^(−4))ℓ. Important: in any ball of radius ½ − δ there are at most 1/δ² codewords, where δ = ½ m^(−2). Blocklength ℓ = poly(n), polynomial time encoding; decoding time does not matter.
– An (h,a)-design S_1, S_2, …, S_m ⊆ {1…t} where h = log ℓ, a = δ log n/3, t = O(log ℓ).
Construction:
Ext(x, y) = C(x)[y|_S_1] ∘ C(x)[y|_S_2] ∘ … ∘ C(x)[y|_S_m]

Trevisan Extractor
Ext(x, y) = C(x)[y|_S_1] ∘ C(x)[y|_S_2] ∘ … ∘ C(x)[y|_S_m]
Theorem: Ext is an extractor for min-entropy k = n^δ, with
– output length m = k^(1/3)
– seed length t = O(log ℓ) = O(log n)
– error ε ≤ 1/m
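The construction is mechanical once C and the design are fixed; a minimal sketch (the encoding C and the design are parameters here, and the representation of strings as Python lists is an illustrative assumption; each S_i indexes h = log ℓ seed bits):

    def trevisan_ext(x_bits, y_bits, design_sets, C):
        codeword = C(x_bits)                         # C(x), an l-bit string
        out = []
        for S in design_sets:
            bits = [y_bits[j] for j in sorted(S)]
            idx = int(''.join(map(str, bits)), 2)    # y|_{S_i} read as an index into C(x)
            out.append(codeword[idx])
        return out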

Proof of Trevisan Extractor
Assume X is a random variable over {0,1}^n with min-entropy k that fails to ε-pass a statistical test T:
|Pr_z[T(z) = 1] − Pr_{x ← X, y ∈_R {0,1}^t}[T(Ext(x, y)) = 1]| > ε.
By applying the usual hybrid argument, there is a predictor A and some 1 ≤ i ≤ m with
Pr_{x ← X, y ∈_R {0,1}^t}[A(Ext(x, y)_1…i−1) = Ext(x, y)_i] > ½ + ε/m.

The set for which A predicts well
Consider the set B of x's such that
Pr_{y ∈_R {0,1}^t}[A(Ext(x, y)_1…i−1) = Ext(x, y)_i] > ½ + ε/2m.
By averaging, Pr_x[x ∈ B] ≥ ε/2m. Since X has min-entropy k, there must be at least (ε/2m)·2^k different x in B.
The contradiction will be derived by exhibiting a succinct encoding for each x ∈ B.

…Proof of Trevisan Extractor
Here i, A and B are fixed. If you fix the bits outside of S_i to α and β and let y' vary over all possible assignments to the bits in S_i, then
Ext(x, y)_i = Ext(x, α y' β)_i = C(x)[(α y' β)|_S_i] = C(x)[y'],
which ranges over all the bits of C(x).
For every x ∈ B there is a short description of a string z close to C(x):
– fix the bits outside of S_i to α and β preserving the advantage:
Pr_y'[A(Ext(x, α y' β)_1…i−1) = C(x)[y']] > ½ + ε/(2m),
where (α, β) is the assignment to {1…t}\S_i maximizing the advantage of A
– for j ≠ i, as y' varies, (α y' β)|_S_j varies over only 2^a values
– so one can provide (i−1) tables of 2^a values each to supply Ext(x, α y' β)_1…i−1

Trevisan Extractor
This yields a short description of a string z agreeing with C(x): given the tables, running A over all y' ∈ {0,1}^(log ℓ) produces a string whose position y' equals C(x)[y'] with probability ½ + ε/(2m) over the choice of y'.

…Proof of Trevisan Extractor
Up to (m−1) tables of size 2^a describe a string z that has ½ + ε/(2m) agreement with C(x).
The number of codewords of C agreeing with z on ½ + ε/(2m) of the places is O(1/δ²) = O(m⁴), by the Johnson bound (a binary code with distance (½ − δ²)ℓ has at most O(1/δ²) codewords in any ball of radius (½ − δ)ℓ; recall that C has minimum distance (½ − ¼m^(−4))ℓ). So given z there are at most O(m⁴) corresponding x's.
The number of strings z with such a description is 2^((m−1)·2^a) = 2^(n^(2δ/3)) = 2^(k^(2/3)).
So the total number of x ∈ B is at most O(m⁴) · 2^(k^(2/3)) << 2^k · (ε/2m): a contradiction.

Conclusion
Given a source of n random bits with min-entropy k = n^Ω(1), it is possible to run any BPP algorithm using the source and obtain the correct answer with high probability, even though extracting even a single bit may be impossible.

Application: strong error reduction
L ∈ BPP if there is a p.p.t. TM M:
x ∈ L ⇒ Pr_y[M(x,y) accepts] ≥ 2/3
x ∉ L ⇒ Pr_y[M(x,y) rejects] ≥ 2/3
Want:
x ∈ L ⇒ Pr_y[M(x,y) accepts] ≥ 1 − 2^(−k)
x ∉ L ⇒ Pr_y[M(x,y) rejects] ≥ 1 − 2^(−k)
Already know: repeat O(k) times and take the majority. This uses n = O(k)·|y| random bits, of which 2^(n−k) can be bad strings.

Strong error reduction
Better: use an extractor Ext for min-entropy k = |y|³ = n^δ and ε < 1/6:
– pick a random w ∈_R {0,1}^n
– run M(x, Ext(w, z)) for all z ∈ {0,1}^t and take the majority of the answers
– call w "bad" if maj_z M(x, Ext(w, z)) is incorrect, i.e. if |Pr_z[M(x, Ext(w,z)) = b] − Pr_y[M(x,y) = b]| ≥ 1/6
– extractor property: there are at most 2^k bad w's (otherwise the uniform distribution on the bad w's would be a min-entropy k source whose extracted output fails the test M(x,·))
– so we use n random bits, and only 2^(n^δ) of the 2^n strings are bad
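A sketch of the procedure (M(x, r) returning the algorithm's boolean answer on random string r, and Ext with the stated parameters, are assumed given):

    def amplified(M, Ext, x, w_bits, t):
        # enumerate all 2^t seeds of the extractor and take a majority vote
        votes = 0
        for z in range(2 ** t):
            z_bits = [(z >> i) & 1 for i in range(t)]
            votes += 1 if M(x, Ext(w_bits, z_bits)) else -1
        return votes > 0

Since t = O(log n), the enumeration costs only a polynomial factor, and the answer is wrong only when the single sample w lands among the at most 2^(n^δ) bad strings.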

References
Theory of cryptographic pseudo-randomness developed by:
– Blum and Micali (next-bit test), 1982
– Yao (computational indistinguishability), 1982
The NW generator:
– Nisan and Wigderson, Hardness vs. Randomness, JCSS, 1994
– Some of the slides on the topic follow Chris Umans' course, Lecture 9