
1 Randomness Extractors & their Many Guises
Salil Vadhan, Harvard University
To be posted at http://eecs.harvard.edu/~salil

2 I. Motivation

3 Original Motivation [SV84,Vaz85,VV85,CG85,Vaz87,CW89,Zuc90,Zuc91]
– Randomization is pervasive in CS: algorithm design, cryptography, distributed computing, ...
– Typically one assumes a perfect random source: unbiased, independent random bits. Unrealistic?
– Can we use a "weak" random source, i.e. a source of biased & correlated bits? A more realistic model of physical sources.
– (Randomness) Extractors: convert a weak random source into an almost-perfect random source.

4 Applications of Extractors
– Derandomization of BPP [Sip88,GZ97,MV99,STV99]
– Derandomization of logspace algorithms [NZ93,INW94,RR99,GW02]
– Distributed & network algorithms [WZ95,Zuc97,RZ98,Ind02]
– Hardness of approximation [Zuc93,Uma99,MU01]
– Cryptography [CDHKS00,MW00,Lu02]
– Data structures [Ta02]

5 The Unifying Role of Extractors
Extractors can be viewed as types of: hash functions, expander graphs, samplers, pseudorandom generators, error-correcting codes.
⇒ They unify the theory of pseudorandomness.

6 This Tutorial
Is framed around connections between extractors & other objects. We'll use these to:
– Gain intuition for the definition.
– Describe a few applications.
– Hint at the constructions.
Many omissions. For further reading:
– N. Nisan and A. Ta-Shma. Extracting randomness: a survey and new constructions. Journal of Computer & System Sciences, 58(1):148–173, 1999.
– R. Shaltiel. Recent developments in explicit constructions of extractors. Bulletin of the EATCS, 77:67–95, June 2002.
– S. Vadhan. Course notes for CS225: Pseudorandomness. http://eecs.harvard.edu/~salil

7 Outline
I. Motivation
II. Extractors as extractors
III. Extractors as hash functions
IV. Extractors as expander graphs
V. Extractors as pseudorandom generators
VI. Extractors as error-correcting codes
VII. Concluding remarks & open problems

8 II. Extractors as Extractors

9 Weak Random Sources
What is a source of biased & correlated bits?
– A probability distribution X on {0,1}^n.
– Must contain some "randomness".
– Want: no independence assumptions ⇒ we get only one sample.
Measure of "randomness":
– Shannon entropy H(X) = Σ_x Pr[X=x]·log₂(1/Pr[X=x])? No good: e.g. a source that is constant half the time and uniform otherwise has Shannon entropy ≈ n/2, yet half the time it yields no usable randomness.
– Better [Zuckerman `90]: min-entropy.

10 Min-entropy
Def: X is a k-source if H∞(X) ≥ k, i.e. Pr[X = x] ≤ 2^(−k) for all x, where H∞(X) = min_x log₂(1/Pr[X=x]).
Examples:
– Unpredictable source [SV84]: ∀ i ∈ [n] and b₁,...,b_{i−1} ∈ {0,1}, δ ≤ Pr[X_i = 1 | X₁=b₁,...,X_{i−1}=b_{i−1}] ≤ 1−δ.
– Bit-fixing [CGH+85,BL85,LLS87,CW89]: some k coordinates of X uniform, the rest fixed (or even depending arbitrarily on the others).
– Flat k-source: uniform over a set S ⊆ {0,1}^n with |S| = 2^k.
Fact [CG85]: every k-source is a convex combination of flat k-sources.
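To ground the definitions, here is a small Python sketch (the function names and the toy distributions are my own, for illustration) computing both entropy measures, including the standard kind of counterexample showing why Shannon entropy is too weak:

```python
import math

def min_entropy(dist):
    """H_inf(X) = min_x log2(1/Pr[X=x]) = -log2(max_x Pr[X=x]),
    for a distribution given as a dict mapping outcomes to probabilities."""
    return -math.log2(max(dist.values()))

def shannon_entropy(dist):
    """H(X) = sum_x Pr[X=x] * log2(1/Pr[X=x])."""
    return sum(-p * math.log2(p) for p in dist.values() if p > 0)

# A flat 2-source on 4-bit strings: uniform over 2^2 of the 2^4 strings.
flat = {format(x, "04b"): 0.25 for x in range(4)}
assert min_entropy(flat) == 2.0

# Why Shannon entropy is "no good": a source on n = 10 bits that is
# all-zeros w.p. 1/2 and uniform otherwise has Shannon entropy > n/2,
# yet half the time it yields no usable randomness at all; its
# min-entropy is below 1 bit.
n = 10
mixed = {format(x, "010b"): 0.5 / 2**n for x in range(1, 2**n)}
mixed["0" * n] = 0.5 + 0.5 / 2**n
assert shannon_entropy(mixed) > n / 2   # plenty of Shannon entropy
assert min_entropy(mixed) < 1.0         # almost no min-entropy
```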

11 Extractors: 1st Attempt
A function Ext : {0,1}^n → {0,1}^m s.t. ∀ k-source X, Ext(X) is "close" to uniform.
Impossible! ∃ a set of 2^(n−1) inputs x on which the first bit of Ext(x) is constant ⇒ a flat (n−1)-source X that is bad for Ext.
[Diagram: Ext maps a k-source of length n to m almost-uniform bits.]

12 Extractors [Nisan & Zuckerman `93]
Def: A (k,ε)-extractor is Ext : {0,1}^n × {0,1}^d → {0,1}^m s.t. ∀ k-source X, Ext(X, U_d) is ε-close to U_m. The d truly random bits are the "seed".
Key point: the seed can be much shorter than the output.
Goals: minimize the seed length, maximize the output length.
[Diagram: Ext maps a k-source of length n, plus a d-bit seed, to m almost-uniform bits.]

13 Definitional Details
U_t = uniform distribution on {0,1}^t.
Measure of closeness: statistical difference (a.k.a. variation distance) Δ(X,Y) = max_T |Pr[X ∈ T] − Pr[Y ∈ T]|, where T ranges over "statistical tests" (a.k.a. "distinguishers").
Δ is a metric, takes values in [0,1], and is very well-behaved.
Def: X, Y are ε-close if Δ(X,Y) ≤ ε.
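A sketch of this definition in Python (helper names are mine). The max over tests T equals half the l1 distance, which is how one actually computes it:

```python
def statistical_distance(X, Y):
    """Delta(X, Y) = max_T |Pr[X in T] - Pr[Y in T]|
                   = (1/2) * sum_x |Pr[X=x] - Pr[Y=x]|,
    for distributions given as dicts mapping outcomes to probabilities
    (the maximizing test T is the set of outcomes where X outweighs Y)."""
    support = set(X) | set(Y)
    return 0.5 * sum(abs(X.get(s, 0.0) - Y.get(s, 0.0)) for s in support)

uniform2 = {s: 0.25 for s in ("00", "01", "10", "11")}
biased = {"00": 0.35, "01": 0.15, "10": 0.25, "11": 0.25}
assert abs(statistical_distance(uniform2, biased) - 0.1) < 1e-12
assert statistical_distance(uniform2, uniform2) == 0.0
```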

14 The Parameters
The min-entropy k:
– High min-entropy: k = n−a, a = o(n)
– Constant entropy rate: k = Ω(n)
– Middle (hardest) range: k = n^δ, 0 < δ < 1
– Low min-entropy: k = n^o(1)
The error ε:
– In this talk: ε = .01 (for simplicity)
– Very small ε is sometimes important.
The output length m:
– Certainly m ≤ k + d.
– Can this be achieved?

15 The Optimal Extractor
Thm [Sip88,RT97]: For every k ≤ n, ∃ a (k,ε)-extractor w/
– Seed length d = log(n−k) + O(1) (≈ log n except in the high min-entropy range)
– Output length m = k + d − O(1)
– "Extract almost all the min-entropy w/ a logarithmic seed."
Pf sketch: probabilistic method. Show that for a random Ext, Pr[Ext is not a (k,ε)-extractor] < 1.
– Use capital letters: N = 2^n, M = 2^m, K = 2^k, D = 2^d.
– For a fixed flat k-source X and test T ⊆ {0,1}^m, a Chernoff bound gives Pr[|Pr[Ext(X,U_d) ∈ T] − |T|/M| > ε] ≤ 2^(−Ω(ε²KD)).
– The # of choices of X and T is (N choose K)·2^M, so a union bound finishes for the stated d, m.

16 The Optimal Extractor
Thm: For every k ≤ n, there exists a (k,ε)-extractor w/
– Seed length d = log(n−k) + O(1)
– Output length m = k + d − O(1)
Thm [NZ93,RT97]: The above is tight up to additive constants.
For applications, we need explicit extractors:
– Ext(x,y) computable in time poly(n).
– A random extractor requires space ≥ 2^n even to store!
A long line of research has sought to approach the above bounds with explicit constructions.

17 Application: BPP w/ a Weak Source [Zuckerman `90,`91]
Given a randomized algorithm that, on input x, errs w.p. ≤ .2 (say), and given a single sample from a k-source:
– Use Ext to obtain, for each d-bit seed, m almost-uniform bits.
– Run the algorithm using all 2^d seeds & output the majority answer.
Only polynomial slowdown, provided d = O(log n) and Ext is explicit.
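A minimal sketch of this simulation (the algorithm and extractor below are contrived toys of my own, chosen so the demo is exact): enumerate every seed, run the algorithm on each extracted string, and take the majority vote.

```python
from collections import Counter

def run_with_weak_source(algorithm, ext, weak_sample, d):
    """Simulate a randomized algorithm given ONE sample from a weak source:
    run it on ext(weak_sample, y) for every seed y in {0,1}^d and output
    the majority answer.  Slowdown factor 2^d = poly(n) when d = O(log n)."""
    votes = Counter(algorithm(ext(weak_sample, y)) for y in range(2 ** d))
    return votes.most_common(1)[0][0]

# Toy demo: the "algorithm" errs only when its 4 random bits are all zero
# (error prob. 1/16 under uniform randomness), and ext(x, y) = x XOR y,
# which for each fixed x permutes the seeds -- so exactly one of the 16
# runs errs and the majority vote is always correct.
algorithm = lambda r: "accept" if r != 0 else "reject"
ext = lambda x, y: x ^ y
assert run_with_weak_source(algorithm, ext, weak_sample=0b1011, d=4) == "accept"
```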

18 III. Extractors as Hash Functions

19 Strong Extractors
Output looks random even after seeing the seed.
Def: Ext is a (k,ε) strong extractor if Ext′(x,y) = y ∘ Ext(x,y) is a (k,ε) extractor.
Equivalently: ∀ k-sources X, for a 1−ε′ fraction of seeds y ∈ {0,1}^d, Ext(X,y) is ε′-close to U_m (for an appropriate ε′).
Optimal: d = log(n−k) + O(1), m = k − O(1).
Strongness can be obtained explicitly at little cost [RSW00].

20 Extractors as Hash Functions
Think of Ext(·, y) as a family {h_y} of hash functions from {0,1}^n to {0,1}^m, indexed by the seed y.
A flat k-source is a set of size 2^k ≫ 2^m. For most y, h_y maps any set of size K = 2^k almost uniformly onto the range.

21 Extractors from Hash Functions
Leftover Hash Lemma [ILL89]: universal (i.e. pairwise independent) hash functions yield strong extractors.
– Output length: m = k − O(1)
– Seed length: d = O(n)
– Example: Ext(x,(a,b)) = first m bits of a·x + b in GF(2^n)
Almost pairwise independence [SZ94,GW94]:
– Seed length: d = O(log n + k)
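This example extractor can be implemented directly. The sketch below (assuming n = 8 with the AES field polynomial, my choice for illustration) computes Ext(x,(a,b)) = first m bits of a·x + b over GF(2^n), and then verifies exact pairwise independence of the hash family by brute force over all seeds:

```python
from collections import Counter

def gf_mul(a, b, poly=0x11B, n=8):
    """Multiplication in GF(2^n): carryless multiply, then reduce modulo
    the irreducible polynomial (default: x^8 + x^4 + x^3 + x + 1)."""
    r = 0
    for i in range(n):
        if (b >> i) & 1:
            r ^= a << i
    for i in range(2 * n - 2, n - 1, -1):
        if (r >> i) & 1:
            r ^= poly << (i - n)
    return r

def lhl_extract(x, seed, m=4, n=8):
    """Ext(x, (a, b)) = first m bits of a*x + b over GF(2^n), the
    pairwise-independent family behind the Leftover Hash Lemma.
    Seed length d = 2n; addition in GF(2^n) is XOR."""
    a, b = seed
    return (gf_mul(x, a) ^ b) >> (n - m)

assert gf_mul(0x57, 0x83) == 0xC1  # standard GF(2^8) test vector

# Pairwise independence, checked exactly: over all 2^{2n} seeds, the pair
# (h(x1), h(x2)) for distinct x1, x2 is exactly uniform on 2^m x 2^m values.
pairs = Counter((lhl_extract(0x03, (a, b)), lhl_extract(0x5A, (a, b)))
                for a in range(256) for b in range(256))
assert len(pairs) == 16 * 16 and set(pairs.values()) == {256}
```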

22 IIb. Extractors as Extractors

23 Composing Extractors
We have some nontrivial basic extractors. Idea: compose them to get better extractors. This was the original approach of [NZ93] & is still in use.

24 Increasing the Output [WZ93]
Intuition: if m₁ ≪ k, the source still has a lot of randomness left after extraction.
[Diagram: Ext₁ with a d₁-bit seed maps the k-source to an m₁-bit output; Ext₂ with a fresh d₂-bit seed maps the same source to an m₂-bit output.]

25 Increasing the Output Length [WZ93]
Prop: If Ext₁ is a (k,ε)-extractor & Ext₂ a (k−m₁−O(1),ε)-extractor, then Ext(x,(y₁,y₂)) = Ext₁(x,y₁) ∘ Ext₂(x,y₂) is a (k,3ε)-extractor.
Key Lemma: Let (X,Z) be (correlated) random variables with X a k-source and |Z| = s bits. Then w.p. ≥ 1−ε over z ← Z, X|_{Z=z} is a (k−s−log(1/ε))-source.
Compare w/ Shannon entropy: H(X|Z) ≥ H(X) − s, with no error probability and no log(1/ε) loss.

26 Proof of Key Lemma
Key Lemma: (X,Z) (correlated) random variables, X a k-source, |Z| = s ⇒ w.p. ≥ 1−ε over z ← Z, X|_{Z=z} is a (k−s−log(1/ε))-source.
Proof: Let BAD = {z : Pr[Z=z] ≤ ε·2^(−s)}. Then Pr[Z ∈ BAD] ≤ 2^s · ε·2^(−s) = ε. For every z ∉ BAD and every x:
Pr[X=x | Z=z] ≤ Pr[X=x]/Pr[Z=z] ≤ 2^(−k)/(ε·2^(−s)) = 2^(−(k−s−log(1/ε))).
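The lemma can also be checked exactly on a small example. In this sketch (function and variable names mine), X is uniform on 4 bits and Z leaks s = 1 bit, the parity; conditioned on either value of Z, the source retains exactly k − s = 3 bits of min-entropy, consistent with the lemma's bound up to its log(1/ε) slack:

```python
import math
from collections import defaultdict

def conditional_min_entropies(joint):
    """Given a joint distribution as a dict (x, z) -> prob, return
    {z: H_inf(X | Z=z)}."""
    pz = defaultdict(float)
    for (x, z), p in joint.items():
        pz[z] += p
    out = {}
    for z in pz:
        max_cond = max(p / pz[z] for (x, zz), p in joint.items() if zz == z)
        out[z] = -math.log2(max_cond)
    return out

# X uniform on {0,1}^4 (a 4-source); Z = parity of X (s = 1 bit observed).
joint = {(x, bin(x).count("1") % 2): 1 / 16 for x in range(16)}
cond = conditional_min_entropies(joint)
assert all(abs(h - 3.0) < 1e-9 for h in cond.values())  # k - s = 3 bits left
```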

27 Increasing the Output [WZ93]
Pf of Prop (write Z₁ = Ext₁(X,y₁), Z₂ = Ext₂(X,y₂)):
– Z₁ is ε-close to uniform (because Ext₁ is an extractor).
– W.p. ≥ 1−ε over z ← Z₁, X|_{Z₁=z} is a (k−m₁−O(1))-source (by the Key Lemma).
– For such z, Z₂|_{Z₁=z} is ε-close to uniform (because Ext₂ is an extractor).
⇒ (Z₁, Z₂) is 3ε-close to uniform.

28 An Application [NZ93]: Pseudorandom Bits vs. Small Space
Start w/ a stream of truly random bits, length n, observed by an adversary with small space s. Conditioned on the observer's state, the remaining bits form a (k−s−O(1))-source w.h.p. (by the Key Lemma).
⇒ Extracting from them with a short seed yields output that looks uniform to the observer.
Applications:
– derandomizing logspace algorithms [NZ93]
– cryptography in the bounded-storage model [Lu02]

29 Shortening the Seed
Idea: use the output of one extractor as the seed to another. Ext₂ may have a shorter seed (due to its shorter output).
Problem: Ext₁ is only guaranteed to work when its seed is independent of the source.
[Diagram: Ext₂, with its own d₂-bit seed, is applied to the source; its output serves as the d₁-bit seed for Ext₁ on the same source, producing the m₁-bit output.]

30 Block Sources [CG85]
Def: (X₁,X₂) is a (k₁,k₂) block source if
– X₁ is a k₁-source, and
– for every x₁, X₂|_{X₁=x₁} is a k₂-source.
Q: When does the seed-shortening idea work? When the source is a block source: apply Ext₂ (seed d₂) to X₂ and use its output as the seed for Ext₁ on X₁.
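The wiring of block-source extraction is simple; here is a sketch, with hypothetical toy functions standing in for real extractors Ext₁, Ext₂ (the point is only the data flow, not the extraction guarantee):

```python
def block_source_extract(ext1, ext2, x1, x2, y):
    """Seed-shortening composition for a block source (X1, X2):
    use the short seed y to extract from the SECOND block, then use that
    output as the (longer) seed for extracting from the FIRST block.
    This works because X2 has min-entropy even conditioned on X1, so
    ext2's output is close to uniform and almost independent of X1."""
    return ext1(x1, ext2(x2, y))

# Toy stand-ins (NOT real extractors), just to exercise the data flow:
ext1 = lambda x, s: x ^ s
ext2 = lambda x, s: x ^ s
assert block_source_extract(ext1, ext2, 0b0101, 0b0011, 0b0001) == 0b0111
```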

31 The [NZ93] Paradigm
An approach to constructing extractors:
1. Given a general source X.
2. Convert it to a block source (X₁,X₂):
– can use part of the seed for this
– may want many blocks (X₁, X₂, X₃, ...)
3. Apply block-source extraction (using known extractors, e.g. almost pairwise independence).
Still useful today: it "improves" extractors, e.g. [RSW00].
How to do Step 2??
– Get a block by randomly sampling bits from the source...
– Harder as the min-entropy gets lower.

32 Outline
I. Motivation ✓
II. Extractors as extractors ✓
III. Extractors as hash functions ✓
IV. Extractors as expander graphs
V. Extractors as pseudorandom generators
VI. Extractors as error-correcting codes
VII. Concluding remarks & open problems

33 IV. Extractors as Expander Graphs

34 Expander Graphs
Informally: sparse graphs w/ very strong connectivity.
Measures of expansion:
– Vertex expansion: every subset S of size ≤ αn has at least β·|S| neighbors, for constants α > 0, β > 1.
– Eigenvalues: the 2nd largest eigenvalue of the random walk on G is ≤ λ for a constant λ < 1.
(Equivalent for constant-degree graphs [Tan84,AM85,Alo86].)
Goals: minimize the degree, maximize the expansion.
Random graphs of degree 3 are expanders [Pin73], but explicit constructions of constant-degree expanders are much harder [Mar73,...,LPS86,Mar88].

35 Extractors & Expansion [NZ93]
View Ext as a bipartite graph on [N] ≡ {0,1}^n (left) and [M] ≡ {0,1}^m (right): connect x and y if Ext(x,r) = y for some r ∈ {0,1}^d.
– Short seed ⇒ low (left-)degree D = 2^d.
– Extraction ⇒ expansion: every set of K = 2^k left vertices has ≥ (1−ε)·M neighbors.

36 Extractors vs. Expander Graphs
Main differences:
1. Extractors are unbalanced, bipartite graphs.
2. Different expansion measures (extraction vs. e-value).
– Extractors give graphs which "beat the e-value bound" [NZ93,WZ95]
3. Extractors have polylog degree, expanders constant degree.
4. Extractors: expansion for sets of a fixed size.
Expanders: expansion for all sets up to some size.

38 Expansion Measures — Extraction
Extractors: start w/ min-entropy k, end ε-close to min-entropy m ⇒ measures how much min-entropy increases (or at least is not lost).
Eigenvalue: similar, but for "2-entropy" (and w/o the ε-close relaxation).

39 Expansion Measures — The Eigenvalue
Let G = a D-regular, N-vertex graph, and A = the transition matrix of the random walk on G = (adjacency matrix)/D.
Fact: A has 2nd largest e-value ≤ λ iff for every probability distribution X: ||AX − U_N||₂ ≤ λ·||X − U_N||₂.
Fact: the e-value measures how much a random step increases the 2-entropy.
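A sketch verifying this fact numerically (graph choice and helper names are mine): on the complete graph K₈, the walk matrix (J−I)/7 has second eigenvalue 1/7 in absolute value, and one walk step shrinks the l2 distance from uniform by exactly that factor:

```python
import math

def random_walk_step(neighbors, dist):
    """One step of the random walk on a D-regular graph, given as an
    adjacency list: new[v] = (1/D) * sum of dist[u] over edges (u, v)."""
    d = len(neighbors[0])
    new = [0.0] * len(neighbors)
    for u, nbrs in enumerate(neighbors):
        for v in nbrs:
            new[v] += dist[u] / d
    return new

def l2_dist_from_uniform(dist):
    n = len(dist)
    return math.sqrt(sum((p - 1 / n) ** 2 for p in dist))

# Complete graph K_8: every vertex adjacent to the other 7.
K8 = [[v for v in range(8) if v != u] for u in range(8)]
start = [1.0] + [0.0] * 7                 # point mass on vertex 0
after = random_walk_step(K8, start)
assert abs(sum(after) - 1.0) < 1e-12      # still a distribution
# Contraction by exactly lambda = 1/7:
assert abs(l2_dist_from_uniform(after) - l2_dist_from_uniform(start) / 7) < 1e-12
```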

41 The Degree
Constant-degree expanders are viewed as "difficult"; extractors are typically of nonconstant degree but "elementary".
– Optimal: d ≈ log(n−k) truly random bits.
– Typically: k = δn or k = n^δ ⇒ d = Θ(log n).
– Lower min-entropies viewed as hardest.
Contradictory views?
– Easiest extractors: highest min-entropy k = n−O(1) ⇒ d = O(1) ⇒ constant degree!
– Resolved in [RVW00]: high min-entropy extractors & constant-degree expanders come from the same, simple "zig-zag product" construction.

42 High Min-Entropy Extractors [GW94]
Observe: if we break a length-n, (n−a)-source into two blocks of lengths n₁ and n₂, we get (something close to) an (n₁−a, n₂−a−O(1)) block source (by the Key Lemma).
⇒ Do block-source extraction: apply Ext₂ (seed d₂) to the second block, and use its output as the d₁-bit seed for Ext₁ (output m₁) on the first.

43 Zig-Zag Product [RVW00]
Same picture, for a length-n source of min-entropy n−a.
Problem: block-source extraction loses ≈ a bits of min-entropy per block.
Solution: collect "buffers" which retain the unextracted min-entropy [RR99], and extract from the buffers at the end with a third extractor Ext₃ (seed d₃).
This composition is the zig-zag product.

45 Randomness Conductors [CRVW02]
Six parameters: n, m, d, k, ε, δ. A conductor Con : {0,1}^n × {0,1}^d → {0,1}^m guarantees: for every k′ ≤ k and every input k′-source, the output is ε-close to a (k′+δ)-source.
– m = k+δ: an extractor, but with a guarantee for smaller sets too.
– δ = d: a "lossless expander" [TUZ01]
– Equivalent to graphs with vertex expansion (1−ε)·degree!
– Explicitly: very unbalanced case w/ polylog degree [TUZ01]; nearly balanced case w/ constant degree [CRVW02].

46 V. Extractors as Pseudorandom Generators

47 Pseudorandom Generators [BM82,Y82,NW88]
PRG: generate many bits that "look random" from a short random seed (d bits in, m bits indistinguishable from uniform out).
Distributions X, Y are computationally indistinguishable if for all efficient T (circuits of size m): |Pr[T(X)=1] − Pr[T(Y)=1]| ≤ ε.

48 Hardness vs. Randomness
Any function of high circuit complexity ⇒ PRGs [BM82,Y82,NW88,BFNW93,IW97,...]. Current state-of-the-art: [SU01,Uma02].
Thm [IW97]: If E = DTIME(2^{O(ℓ)}) requires circuits of size 2^{Ω(ℓ)}, then P = BPP.
In general: f : {0,1}^ℓ → {0,1} of circuit complexity k ⇒ PRG_f : {0,1}^{O(ℓ)} → {0,1}^m with m = k^{Ω(1)}.

49 Extractors & PRGs Thm [Trevisan `99]: Any “sufficiently general” construction of PRGs from hard functions is also an extractor.

50 Extractors & PRGs
Compare the two pictures:
– PRG_f: a d-bit seed ⇒ m bits computationally indistinguishable from U_m.
– Ext: n bits w/ min-entropy k, plus a d-bit seed ⇒ m bits statistically close to U_m.

51 Extractors & PRGs
Step I: View extractors as "randomness multipliers" [NZ93,SZ94].

52 Extractors & PRGs
Step II: View the hard function as an input to the PRG: f : {0,1}^{log n} → {0,1} of circuit complexity k, given as an n-bit string.

53 Analysis (Intuition)
Take f : {0,1}^{log n} → {0,1} of circuit complexity k.
1. If f is drawn from a distribution of min-entropy k, then w.h.p. f has circuit complexity ≈ k − O(1) (even "description size" k − O(1)).
2. Statistical closeness ⇒ computational indistinguishability "relative to any distinguisher".
3. And (1) holds "relative to any distinguisher".

54 Analysis ("Formal")
1. Fix a statistical test T ⊆ {0,1}^m.
2. Suppose T distinguishes PRG_f(U_d) from U_m. The PRG correctness proof ⇒ a circuit C of size k′ s.t. C^T ≡ f (the "reconstruction paradigm").
3. But w.h.p., conditioned on C, f is still a (k−k′−O(1))-source (by the Key Lemma), hence undetermined if k′ ≪ k. Contradiction.

55 When Does This Work?
When the PRG has a "black-box" proof that works for any function f and any statistical test T, i.e. when the PRG construction "relativizes" [Mil99,KvM99]. Almost all PRG constructions are of this form.
Partial converse: If Ext is an explicit extractor and f has high description (Kolmogorov) complexity relative to T, then Ext(f, ·) is pseudorandom for T.

56 Consequences
Simple (and very good) extractor based on the NW PRG [Tre99] (w/ subsequent improvements [RRV99,ISW00,TUZ01]).
More importantly: new ways of thinking about both objects.
Benefits for extractors:
– Reconstruction paradigm: given T distinguishing Ext(x,U_d) from U_m, "reconstruct" x w/ short advice (used in [TZS01,SU01]).
– New techniques from the PRG literature.
Benefits for PRGs:
– Best known PRG construction [Uma02]; finer notion of optimality.
– Distinguishes information-theoretic vs. efficiency issues.
– To go beyond extractor limitations, must use special properties of the hard function or distinguisher (as in [IW98,Kab00,IKW01,TV02]).

57 VI. Extractors as Error-Correcting Codes

58 Error-Correcting Codes
ECC maps an n-bit message x to a D-bit codeword ECC(x).
Classically: large pairwise Hamming distance.
List decoding: every Hamming ball of relative radius ½−ε in {0,1}^D contains at most K codewords.
Many PRG [GL89,BFNW93,STV99,MV99] and extractor [Tre99,...,RSW00] constructions use codes.
[Ta-Shma & Zuckerman `01]: extractors are a generalization of list-decodable codes.

59 Strong 1-bit Ext's ↔ List-Decodable Codes
The correspondence: a strong extractor Ext : {0,1}^n × {0,1}^d → {0,1} with 1-bit output corresponds to the code ECC : {0,1}^n → {0,1}^D, D = 2^d, defined by ECC(x)_y = Ext(x,y): the codeword of x lists the extractor's output over all seeds y. (Strong extraction says y ∘ Ext(X,y) is ε-close to U_{d+1}.)
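A concrete instance of this correspondence (my choice of extractor, for illustration only): taking Ext(x,y) = ⟨x,y⟩ mod 2, the inner-product bit, the code ECC(x)_y = Ext(x,y) is exactly the Hadamard code, whose distinct codewords differ in exactly half their positions:

```python
def inner_product_bit(x, y):
    """<x, y> mod 2, viewing the integers x, y as bit vectors."""
    return bin(x & y).count("1") % 2

def hadamard_encode(x, n=4):
    """ECC(x)_y = Ext(x, y): the D = 2^n-bit codeword listing the 1-bit
    extractor's output over all seeds y.  With Ext = inner product mod 2,
    this is the Hadamard code."""
    return [inner_product_bit(x, y) for y in range(2 ** n)]

# Distinct messages give codewords at relative Hamming distance exactly 1/2
# (their XOR is the codeword of a nonzero message, which is balanced).
c1, c2 = hadamard_encode(0b0101), hadamard_encode(0b0011)
assert sum(a != b for a, b in zip(c1, c2)) == 8   # half of the 16 positions
```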

60 Strong 1-bit Ext's ↔ List-Decodable Codes
Claim: Ext a (k,ε) strong extractor ⇒ ECC has < 2^k codewords in any (½−ε)-ball.
Pf: Suppose 2^k codewords lie within distance ½−ε of some z ∈ {0,1}^D. Feed the extractor the uniform distribution on the corresponding messages (a flat k-source). Consider the statistical test T = {y ∘ σ : y ∈ {0,1}^d, σ = z_y}:
– Pr[extractor output ∈ T] > ½ + ε
– Pr[uniform distribution ∈ T] = ½. Contradiction.

61 Strong 1-bit Ext's ↔ List-Decodable Codes
Claim: ECC has < ε·2^k codewords in any (½−ε)-ball ⇒ Ext is a (k,2ε) strong extractor.
Pf: Suppose that on some k-source X, the output is 2ε-far from U_{d+1}.
⇒ ∃ a predictor P : {0,1}^d → {0,1} s.t. Pr_{X,Y}[P(Y) = Ext(X,Y)] > ½ + 2ε.
⇒ E_X[dist(P, ECC(X))] < ½ − 2ε.
But only ε·2^k codewords lie in the (½−ε)-ball around P, contradicting the min-entropy of X.

62 Extractors & Codes
Many-bit extractors ↔ list-decodable codes over large alphabets (size 2^m) [TZ01].
"Reconstruction proof" in the PRG view ↔ decoding algorithm in the code view.
Trevisan's extractor has an efficient decoding algorithm [TZ01], with several applications (data structures for set storage [Ta02], ...).
Idea [Ta-Shma, Zuckerman, & Safra `01]: exploit codes more directly in extractor constructions.

63 Extractors from Codes
Existing codes give extractors with short output. Q: How to get many bits?
Use the seed to select m positions in an encoding ECC(x) of the source.
– Independent positions: works, but the seed is too long.
– Dependent positions? [Tre99] gives one way.

64 Dependent Projections
Naive: output consecutive positions of ECC(x), starting at a position selected by the seed y.
Analysis attempt (reconstruction):
Goal: given T ⊆ {0,1}^m which distinguishes Ext(x,U_d) from U_m, reconstruct x with few (i.e. ≈ k) bits of advice.
From T, get a next-bit predictor P that predicts position i of ECC(x) from the preceding positions w.p. noticeably greater than ½ [Yao82].

65 The Reconstruction
Easy case: P is always correct.
– Advice: the first i ≤ m consecutive positions of ECC(x).
– Repeatedly applying P, reconstruct all of ECC(x) & hence x.
Harder case: P is correct on only a ½+ε fraction of positions.

66 Dealing with Errors
Q: How to deal with errors? Use the error-correcting properties of the code:
– It suffices to reconstruct "most" positions.
– ½−ε errors requires list decoding ⇒ additional advice.
– But one incorrect prediction can ruin everything!
Idea: error-correct with each prediction step.
– Need "consecutive" to be compatible w/ decoding, so advice can be reused.
Reed-Muller code: ECC(x) = a low-degree polynomial F^m → F.
– [Ta-Shma, Zuckerman, & Safra `01]: consecutive = along a line.
– [Shaltiel-Umans `01]: consecutive = according to a linear map which "generates" F^m \ {0}.
– [Umans `02]: a PRG from this.

67 VII. Concluding Remarks

68 Towards Optimality
Recall: for every k ≤ n, ∃ a (k,ε)-extractor w/ seed d = log(n−k)+O(1) & output m = k+d−O(1).
Thm [...,NZ93,WZ93,GW94,SZ94,SSZ95,Zuc96,Ta96,Ta98,Tre99,RRV99,ISW00,RSW00,RVW00,TUZ01,TZS01,SU01,LRVW02]: For every k ≤ n, ∃ an EXPLICIT (k,ε)-extractor w/
– Seed d = O(log(n−k)) & output m = .99k, or
– Seed d = O(log²(n−k)) & output m = k+d−O(1).
Not there yet! Optimize up to additive constants:
– In many apps, efficiency ∼ D = 2^d.
– Often the entropy loss k+d−m is what matters, rather than m itself.
– Dependence on the error ε.

69 Conclusions
The many guises of randomness extractors:
– extractors, hash functions, expanders, samplers, pseudorandom generators, error-correcting codes
– translating ideas between the views is very powerful!
– increases the impact of work on each object
The study of extractors:
– many applications
– many constructions: "information-theoretic" vs. "reconstruction paradigm"
– optimality important

70 Some Research Directions
– Exploit the connections further.
– Optimality up to additive constants.
– A single, self-contained construction for all ranges of parameters ([SU01] comes closest).
– Study randomness conductors.
– When can we have extractors with no seed?
– important for, e.g., cryptography w/ imperfect random sources
– sources with "independence" conditions [vN51,Eli72,Blu84,SV84,Vaz85,CG85,CGH+85,BBR85,BL85,LLS87,CDH+00]
– "efficiently samplable" sources [TV02]

71 Further Reading
– N. Nisan and A. Ta-Shma. Extracting randomness: a survey and new constructions. Journal of Computer & System Sciences, 58(1):148–173, 1999.
– R. Shaltiel. Recent developments in explicit constructions of extractors. Bulletin of the EATCS, 77:67–95, June 2002.
– S. Vadhan. Course notes for CS225: Pseudorandomness. http://eecs.harvard.edu/~salil
– ... and many papers.

