
1
On Necessary and Sufficient Cryptographic Assumptions: The Case of Memory Checking
Lecture 2: Authentication and Communication Complexity
Lecturer: Moni Naor, Weizmann Institute of Science
Web site of lectures:

2
Recap of Lecture 1
Key idea of cryptography: use computational intractability to your advantage
One-way functions are necessary and sufficient to solve the two-guard identification problem
–Notion of reduction between cryptographic primitives
Equivalent formulations of the existence of one-way functions:
–Amplification of weak one-way functions
–Distributionally one-way functions

3
Existence of One-Way Functions
The existence of one-way functions is equivalent to the existence of:
–Pseudo-random generators [HILL]
–Pseudo-random functions and permutations (block ciphers)
–Bit commitment (implies zero-knowledge)
–Signature schemes
–(Non-trivial) shared-key encryption
Goal of this talk: add two more items to the list:
–Sub-linear authentication
–Memory checking

4
Authentication
Verifying that a string has not been modified
–A central problem in cryptography
–Many variants
Relevant both in communication and in storage

5
The Authentication Problem: One-Time Version
Alice wants to send a message m ∈ {0,1}^n to Bob
They want to prevent Eve from interfering
–Bob should be sure that the message m′ he receives is equal to the message m Alice sent
[Figure: Alice sends m to Bob over a channel controlled by Eve]

6
Specification of the Problem
Alice and Bob communicate through a channel
Bob has an external register R ∈ {N} ∪ {0,1}^n (N = no message)
Eve completely controls the channel
Requirements:
–Completeness: if Alice wants to send m ∈ {0,1}^n and Eve does not interfere, Bob has value m in R
–Soundness: if Alice wants to send m and Eve does interfere, R is either N or m (but never some m′ ≠ m); if Alice does not want to send a message, R is N
Since this is a generalization of the identification problem, we must use shared secrets and probability or complexity
Probabilistic version: for any behavior of Eve and any message m ∈ {0,1}^n, the probability that Bob ends with R holding some m′ ∉ {m, N} is at most ε

7
Authentication Using Hash Functions
Suppose that
–H = {h | h: {0,1}^n → {0,1}^k} is a family of functions
–Alice and Bob share a random function h ∈ H
–To authenticate a message m ∈ {0,1}^n, Alice sends (m, h(m))
–When receiving (m′, z), Bob computes h(m′) and compares it to z
 If equal, he moves register R to m′
 If not equal, register R stays at N
What properties do we require from H?
–Hard to guess h(m): probability at most ε (but this alone is clearly not sufficient: consider a one-time pad)
–Hard to guess h(m′) even after seeing h(m): probability at most ε, and this should hold for any m′ ≠ m
–Short representation for h: log|H| must be small
–Easy to compute h(m) given h and m

8
Universal Hash Functions
Given that h: {0,1}^n → {0,1}^k for every h ∈ H, we know that ε ≥ 2^{-k}
A family where this is an equality is called universal₂
Definition: a family of functions H = {h | h: {0,1}^n → {0,1}^k} is called strongly universal₂, or pair-wise independent, if
–for all m₁ ≠ m₂ ∈ {0,1}^n and all y₁, y₂ ∈ {0,1}^k: Prob[h(m₁) = y₁ and h(m₂) = y₂] = 2^{-2k}, where the probability is over a randomly chosen h ∈ H
In particular, Prob[h(m₂) = y₂ | h(m₁) = y₁] = 2^{-k}
When a strongly universal₂ family is used in the protocol, Eve's probability of cheating is at most 2^{-k}

9
Constructing Universal Hash Functions
The linear polynomial construction: fix a finite field F of size at least the message space 2^n
–Could be either GF[2^n] or GF[p] for some prime p ≥ 2^n
The family H of functions h: F → F is defined as H = {h_{a,b}(m) = am + b | a, b ∈ F}
Claim: the family above is strongly universal₂
Proof: for every m₁ ≠ m₂ and every y₁, y₂ ∈ F there are unique a, b ∈ F such that am₁ + b = y₁ and am₂ + b = y₂
Size: each h ∈ H is represented by 2n bits
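The construction above can be sketched in a few lines; the Mersenne prime below is an illustrative choice of field (any prime p ≥ 2^n works), and `unique_member` exhibits the unique (a, b) from the proof of strong universality.

```python
import random

# Sketch of the linear-polynomial family h_{a,b}(m) = a*m + b over GF(p).
P = (1 << 61) - 1  # an illustrative Mersenne prime; messages are < p

def sample_hash():
    """Pick a uniformly random member h_{a,b} of the family."""
    a, b = random.randrange(P), random.randrange(P)
    return lambda m: (a * m + b) % P

def unique_member(m1, y1, m2, y2):
    """The unique (a, b) with a*m1 + b = y1 and a*m2 + b = y2 (m1 != m2);
    this uniqueness is exactly what makes the family strongly universal_2."""
    a = (y1 - y2) * pow(m1 - m2, -1, P) % P
    b = (y1 - a * m1) % P
    return a, b
```

Since the pair (y₁, y₂) determines (a, b) uniquely, every pair of tag values is equally likely over a random key, giving the 2^{-2k} bound.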

10
Constructing Universal Hash Functions
The inner-product construction: fix a finite field F of size at least the target space 2^k
–Could be either GF[2^k] or GF[p] for some prime p ≥ 2^k
Let n = ℓ·k
Treat each message m ∈ {0,1}^n as an (ℓ+1)-vector over F whose first entry is 1; denote it (m₀, m₁, …, m_ℓ)
The family H of functions h: F^{ℓ+1} → F is defined by all (ℓ+1)-vectors: H = {h_a(m) = Σ_{i=0}^{ℓ} a_i·m_i | a₀, a₁, …, a_ℓ ∈ F}
Claim: the family above is strongly universal₂
Proof: for every (m₀, m₁, …, m_ℓ) ≠ (m₀′, m₁′, …, m_ℓ′) and every y₁, y₂ ∈ F there is the same (non-zero) number of solutions a to Σ_{i=0}^{ℓ} a_i·m_i = y₁ and Σ_{i=0}^{ℓ} a_i·m_i′ = y₂
Size: each h ∈ H is represented by n + k bits
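A minimal sketch of the inner-product family; the prime p and the chunk size are illustrative choices, and the fixed leading 1 in the vector is the one prescribed by the construction.

```python
import random

# Sketch of the inner-product family over GF(p): split the message into
# chunks, prepend the constant 1, and dot with a random key vector.
P = 2**31 - 1
CHUNK = 30            # bits per chunk, so each chunk value is < p

def to_vector(bits: str):
    """(1, m_1, ..., m_l): the leading 1 is part of the construction."""
    return [1] + [int(bits[i:i + CHUNK], 2)
                  for i in range(0, len(bits), CHUNK)]

def h_a(a, m_vec):
    """h_a(m) = sum_i a_i * m_i mod p."""
    return sum(u * v for u, v in zip(a, m_vec)) % P

key = [random.randrange(P) for _ in range(3)]   # (l+1)-vector, here l = 2
tag = h_a(key, to_vector("1" * 60))             # two 30-bit chunks plus the 1
```

Note the trade-off against the linear-polynomial family: the key here is n + k bits rather than 2n, but the evaluation is a single pass over the message.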

11
Lower Bound on the Size of Strongly Universal Hash Families
Theorem: let H = {h | h: {0,1}^n → {0,1}} be a family of pair-wise independent functions. Then |H| is Ω(2^n)
–More precisely, a d-wise independent family must have |H| ∈ Ω(2^{n⌊d/2⌋})
Proof: see N. Alon and J. Spencer, The Probabilistic Method, Chapter 15 (derandomization), Proposition 2.3

12
An Almost Perfect Solution
By allowing ε to be slightly larger than 2^{-k} we can get much smaller families
Definition: a family of functions H = {h | h: {0,1}^n → {0,1}^k} is called δ-universal₂ if for all m₁ ≠ m₂ ∈ {0,1}^n we have Prob[h(m₁) = h(m₂)] ≤ δ

13
An Almost Perfect Solution
Idea: combine a family of δ-universal₂ functions H₁ = {h | h: {0,1}^n → {0,1}^k} with a strongly universal₂ family H₂ = {h | h: {0,1}^k → {0,1}^k}
Consider the family H where each h ∈ H, h: {0,1}^n → {0,1}^k, is defined by h₁ ∈ H₁ and h₂ ∈ H₂ via h(x) = h₂(h₁(x)); as before, Alice sends (m, h(m))
Claim: the probability of cheating is at most δ + 2^{-k}
Proof: when Eve sends (m′, y′) we must have m′ ≠ m, and Eve succeeds only if y′ = h(m′); this requires either
–a collision in h₁, i.e. h₁(m′) = h₁(m): probability at most δ, or
–no collision in h₁ but h₂(h₁(m′)) = y′: probability at most 2^{-k} by the strong universality of h₂
By the union bound, the total probability is at most δ + 2^{-k}
Size: each h ∈ H is represented by log|H₁| + log|H₂| bits

14
Constructing Almost Universal Hash Functions
The polynomial evaluation construction, {0,1}^n → {0,1}^k: fix a finite field F of size at least the target space 2^k
–Could be either GF[2^k] or GF[p] for some prime p ≥ 2^k
Let n = ℓ·k
Treat each (non-zero) message m ∈ {0,1}^n as a degree-(ℓ−1) polynomial over F; denote it P_m
The family H of functions h: F^ℓ → F is defined by all elements of F: H = {h_x(m) = P_m(x) | x ∈ F}
Claim: the family above is δ-universal₂ for δ = (ℓ−1)/2^k
Proof: two distinct degree-(ℓ−1) polynomials agree on at most ℓ−1 points
Size: each h ∈ H is represented by k bits
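The family can be sketched as follows; the prime 2^61 − 1 and the 60-bit coefficient packing are illustrative, and h_x evaluates the message polynomial at the key point x by Horner's rule.

```python
# Sketch of the polynomial-evaluation family: pack the message into the
# coefficients of a polynomial over GF(p) and evaluate it at the key x.
# Two distinct degree-(l-1) polynomials agree on at most l-1 points, so the
# collision probability is at most (l-1)/p.
P = 2**61 - 1
K = 60  # bits per coefficient, so each coefficient is < p

def coefficients(message: bytes):
    m = int.from_bytes(message, "big")
    coeffs = []
    while m:
        coeffs.append(m & ((1 << K) - 1))
        m >>= K
    return coeffs or [0]

def h_x(x: int, message: bytes) -> int:
    """Evaluate P_m at the key point x via Horner's rule."""
    acc = 0
    for c in reversed(coefficients(message)):
        acc = (acc * x + c) % P
    return acc
```

The key is a single field element (k bits), which is what makes this family so much smaller than a strongly universal₂ one.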

15
Parameters for Authentication
To authenticate an n-bit message with probability of error ε:
–Secret key length: Θ(log n + log 1/ε)
–Added tag length: Θ(log 1/ε)
The lower bound does not hold for interactive protocols

16
Authentication for Storage
A large file resides on a remote server
The verifier stores a small secret "representative" of the file
–A fingerprint
–When retrieving the file, it should identify corruption
The required size of the fingerprint is a well-understood problem

17
Sub-linear Authentication
What about sub-linear authentication?
–Do you have to read the whole file to figure out whether it has been corrupted?
–Encode the information you store (authenticators)
–How large a fingerprint do you need? How much of the encoding do you need to read?

18
Authenticators
How to authenticate a large and unreliable memory with a small and secret memory
Encoding algorithm E:
–Receives a vector x ∈ {0,1}^n and encodes it into a public encoding p_x and a small secret encoding s_x
–Space complexity of the secret encoding: s(n)
Decoding algorithm D:
–Receives a public encoding p and decodes it into a vector x ∈ {0,1}^n
Consistency verifier V:
–Receives a public encoding p_y and the secret encoding s_x, and verifies whether the decoder's output equals the encoder's input
–Makes few queries to the public encoding; query complexity: t(n)
The adversary sees (only) the public encoding and can change p_x into some p_y; V should accept if p_y still decodes to x and reject otherwise

19
Power of the Adversary
We have seen what access the adversary has to the system
Distinguish between computationally
–all-powerful and
–bounded adversaries

20
A Pretty Good Authenticator
Idea: encode x using a good error-correcting code C
–Actually erasures are more relevant
–As long as a certain fraction of the symbols of C(x) is available, x can be decoded
–Good example: Reed-Solomon code
Add to each symbol a tag F_k(a, i): a function of the secret information k ∈ {0,1}^s, the symbol a, and the location i
The verifier picks a random location i and reads the symbol a and the tag t
–Checks whether t = F_k(a, i) and rejects if not
The decoding process removes all symbols with inappropriate tags and uses the decoding procedure of C
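The scheme can be sketched as below. Two stand-ins are assumptions, not part of the construction: HMAC-SHA256 plays the role of the pseudo-random function F_k, and a trivial 3x repetition "code" plays the role of a real erasure code such as Reed-Solomon.

```python
import hashlib
import hmac
import random
import secrets

def F(k: bytes, symbol: int, i: int) -> bytes:
    """Stand-in for the PRF F_k(a, i): HMAC over (symbol, location)."""
    return hmac.new(k, f"{symbol}:{i}".encode(), hashlib.sha256).digest()

def encode(k: bytes, x: bytes):
    """Public encoding: each code symbol stored with its location-bound tag."""
    symbols = list(x) * 3                      # placeholder erasure code
    return [(a, F(k, a, i)) for i, a in enumerate(symbols)]

def spot_check(k: bytes, public) -> bool:
    """One verifier test: probe a random location and check its tag."""
    i = random.randrange(len(public))
    a, t = public[i]
    return hmac.compare_digest(t, F(k, a, i))
```

Because each tag binds the symbol to its location, an adversary who alters a ρ-fraction of the symbols is caught by each spot check with probability about ρ.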

21
How Good Is the Authenticator?
Suppose it is impossible to forge tags
If the adversary changes a ρ-fraction of the symbols
–the probability of being caught per test is ρ
If the code C can recover x from a (1 − ρ)-fraction of the symbols
–then the probability of a false "accept" is at most 1 − ρ per test
–it can be made smaller by repetition
How to make the tags unforgeable?
–Easy: the range of F_k(a, i) must be large enough to make guessing unlikely
–Need that for a random k ∈ {0,1}^s, any adversary that is given many tags {((a_j, i_j), F_k(a_j, i_j))}_j still finds it hard to guess F_k(a, i) for any (a, i) not in the list

22
Computational Indistinguishability
Definition: two sequences of distributions {D_n} and {D′_n} on {0,1}^n are computationally indistinguishable if for every polynomial p(n), for sufficiently large n, and for every probabilistic polynomial-time adversary A that receives input y ∈ {0,1}^n and tries to decide whether y was generated by D_n or D′_n:
|Prob[A = 0 | D_n] − Prob[A = 0 | D′_n]| < 1/p(n)
Without the restriction to probabilistic polynomial-time tests, this is equivalent to the variation distance being negligible:
Σ_{β ∈ {0,1}^n} |Prob[D_n = β] − Prob[D′_n = β]| < 1/p(n)

23
Pseudo-random Functions
Let {s(n), m(n), ℓ(n)} be a sequence of parameters and F: {0,1}^s × {0,1}^m → {0,1}^ℓ
–key × domain → range
Denote Y = F_k(X)
A family of functions Φ_n = {F_k | k ∈ {0,1}^s} is pseudo-random if it is
–efficiently computable (with random access) and…

24
Pseudo-random Functions (cont.)
Any polynomial-time tester A can choose adaptively
–X₁ and get Y₁ = F_k(X₁)
–X₂ and get Y₂ = F_k(X₂)
–…
–X_q and get Y_q = F_k(X_q)
Then A has to decide whether
–F_k ∈_R Φ_n or
–F_k ∈_R R_m = {F | F: {0,1}^m → {0,1}^ℓ} (not important for us)

25
Pseudo-random Functions (cont.)
For a function F chosen at random from either
(1) Φ_n = {F_k | k ∈ {0,1}^s} or
(2) R_m = {F | F: {0,1}^m → {0,1}^ℓ}
for all polynomial-time machines A that choose q locations and try to distinguish (1) from (2), and for all polynomials p(n):
|Prob[A = 1 | F ∈_R Φ_n] − Prob[A = 1 | F ∈_R R_m]| ≤ 1/p(n)

26
Equivalent/Non-equivalent Definitions
Instead of a distinguishing test, a prediction test: for X₁, X₂, …, X_q chosen by A, decide whether a given Y is
–Y = F_k(X) or
–Y ∈_R {0,1}^ℓ
Adaptive vs. non-adaptive queries
Unpredictability vs. pseudo-randomness
–Unpredictability is really what we need

27
Existence of Pseudo-random Functions and Authenticators
If one-way functions exist, so do pseudo-random generators
If pseudo-random generators exist, so do pseudo-random functions, and hence authenticators
Conclusion: if one-way functions exist, so do sub-linear authenticators with
–Secret memory: sufficient to store a key
–Query complexity: log n, or log n · log(1/ε) for probability of error ε

28
So Are We Done?
Two problems:
–Need a computationally bounded adversary and one-way functions
–Efficiency: evaluating a pseudo-random function might be too heavy a burden to add to every memory-fetch operation

29
Composing Universal Hash Functions: Concatenation
Let H, where each h ∈ H maps {0,1}^n → {0,1}^k, be a family of δ-universal₂ functions
Consider the family H′ where each h′ ∈ H′ maps {0,1}^{2n} → {0,1}^{2k} and h′(x₁, x₂) = (h(x₁), h(x₂)) for some h ∈ H
Claim: the family above is δ-universal₂
Proof: let (x₁, x₂) ≠ (x₁′, x₂′) be a pair of distinct inputs. If x₁ ≠ x₁′, a collision requires h(x₁) = h(x₁′); otherwise x₂ ≠ x₂′ and a collision requires h(x₂) = h(x₂′). In either case the probability is at most δ
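The combinator is one line; the point is that a single shared h handles both halves, so the key stays log|H| bits while the input length doubles. The inner family below (a modular multiply-add) is an illustrative stand-in for any δ-universal family.

```python
import random

# Sketch of the concatenation combinator for delta-universal families.
P = 2**31 - 1

def sample_inner():
    """An illustrative stand-in for a member of a delta-universal family."""
    a, b = random.randrange(1, P), random.randrange(P)
    return lambda x: (a * x + b) % P

def concat(h):
    """Lift h to pairs: h'(x1, x2) = (h(x1), h(x2)); delta is preserved."""
    return lambda x1, x2: (h(x1), h(x2))
```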

30
Composing Universal Hash Functions: Composition
Let H₁ = {h | h: {0,1}^{n₁} → {0,1}^{n₂}} and H₂ = {h | h: {0,1}^{n₂} → {0,1}^{n₃}} be families of δ-universal₂ functions
Consider the family H where each h ∈ H maps {0,1}^{n₁} → {0,1}^{n₃} and is defined by h₁ ∈ H₁ and h₂ ∈ H₂ via h(x) = h₂(h₁(x))
Claim: the family above is 2δ-universal₂
Proof: a collision must occur either at the first hash function or at the second. Each event happens with probability at most δ; apply the union bound

31
The Tree Construction
Let n = ℓ·k and let each h_i: {0,1}^{2k} → {0,1}^k be chosen independently from a δ-universal family H; hashing the message down a binary tree yields a family of functions {0,1}^n → {0,1}^k which is tδ-universal, where t is the number of levels in the tree
Size: each member is represented by t·log|H| bits
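A minimal sketch of the tree, assuming the number of blocks is a power of two. The per-level compressor (a modular multiply-add on the paired blocks, truncated to k bits) is only an illustrative stand-in for a δ-universal family on 2k-bit inputs.

```python
import random

# Sketch of the tree construction: pairs of k-bit blocks are repeatedly
# compressed with an independently chosen hash per level until one remains.
K = 16                 # block size in bits
P = 2**61 - 1          # prime larger than 2^(2K)

def sample_level():
    """Illustrative stand-in for one level's hash on a pair of blocks."""
    a, b = random.randrange(1, P), random.randrange(P)
    return lambda x, y: ((a * ((x << K) | y) + b) % P) % (1 << K)

def tree_hash(blocks, levels):
    """Hash a power-of-two-length list of blocks down to a single block."""
    for h in levels:                      # one independent h per tree level
        if len(blocks) == 1:
            break
        blocks = [h(blocks[i], blocks[i + 1])
                  for i in range(0, len(blocks), 2)]
    return blocks[0]
```

Each level can introduce a collision with probability at most δ, which is where the tδ bound comes from.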

32
Communication Complexity
Let f: X × Y → Z
The input is split between two participants: Alice holds x ∈ X and Bob holds y ∈ Y
They want to compute the outcome z = f(x, y) while exchanging as few bits as possible

33
A protocol is defined by the communication tree
[Figure: a binary tree with leaves z₀ … z₇; the transcript Alice: 0, Bob: 1, Alice: 0, Bob: 1 walks down to leaf z₅]

34
A Protocol
A protocol P over domain X × Y with range Z is a binary tree where
–each internal node v is labeled with either a_v: X → {0,1} or b_v: Y → {0,1}
–each leaf is labeled with an element z ∈ Z
The value of protocol P on input (x, y) is the label of the leaf reached by starting from the root and walking down the tree:
–at each internal node labeled a_v, walk left if a_v(x) = 0 and right if a_v(x) = 1
–at each internal node labeled b_v, walk left if b_v(y) = 0 and right if b_v(y) = 1
The cost of protocol P on input (x, y) is the length of the path taken on input (x, y)
The cost of protocol P is the maximum path length

35
Motivation for Studying Communication Complexity
–Originally: VLSI questions
–Connection with Turing machines
–Data structures and the cell-probe model
–Boolean circuit depth
–…
–New application: a lower bound for the authentication problem

36
Communication Complexity of a Function
For a function f: X × Y → Z, the (deterministic) communication complexity of f, denoted D(f), is the minimum cost over all protocols P that compute f
Observation: for any function f: X × Y → Z, D(f) ≤ log|X| + log|Z|
Example: let x, y ⊆ {1, …, n} and let f(x, y) = max{x ∪ y}. Then D(f) ≤ 2 log n

37
Median
Let x, y ⊆ {1, …, n} and let MED(x, y) be the median of the multiset x ∪ y
–If the size is even, the element ranked |x ∪ y|/2
Claim: D(MED) is O(log² n)
–Protocol idea: binary search on the value, with each party reporting how many of its elements lie above the current guess
Exercise: D(MED) is O(log n)
–Protocol idea: each party proposes a candidate; see which one is larger (no need to repeat bits)
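The O(log² n) protocol can be simulated directly, with a running count of the bits exchanged; the helper names and the exact bit accounting are illustrative.

```python
# Sketch of the O(log^2 n) median protocol: binary search on the value,
# with both parties announcing per round how many of their elements are at
# most the current guess (about log n bits each, over log n rounds).
def count_le(s, v):
    return sum(1 for e in s if e <= v)

def med_protocol(x, y, n):
    """Return (median of the multiset x + y, total bits exchanged)."""
    s = len(x) + len(y)
    r = (s + 1) // 2                     # rank of the median (s/2 if s even)
    lo, hi, bits = 1, n, 0
    while lo < hi:
        v = (lo + hi) // 2
        bits += 2 * s.bit_length()       # each party sends one count
        if count_le(x, v) + count_le(y, v) >= r:
            hi = v                       # median is at most v
        else:
            lo = v + 1                   # median is larger than v
    return lo, bits
```

With counts of O(log n) bits over O(log n) rounds, the total is O(log² n) bits, matching the claim.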

38
Combinatorial Rectangles
A combinatorial rectangle in X × Y is a subset R ⊆ X × Y such that R = A × B for some A ⊆ X and B ⊆ Y
Proposition: R ⊆ X × Y is a combinatorial rectangle iff (x₁, y₁) ∈ R and (x₂, y₂) ∈ R imply (x₁, y₂) ∈ R
For a protocol P and node v, let R_v be the set of inputs (x, y) reaching v
Claim: for any protocol P and node v, the set R_v is a combinatorial rectangle
Claim: given the transcript of an exchange between Alice and Bob (but not x and y), it is possible to determine z = f(x, y)

39
Fooling Sets
For f: X × Y → Z, a subset R ⊆ X × Y is f-monochromatic if f is fixed on R
Observation: any protocol P induces a partition of X × Y into f-monochromatic rectangles; the number of rectangles is the number of leaves of P
A set S ⊆ X × Y is a fooling set for f if there exists a z ∈ Z where
–for every (x, y) ∈ S, f(x, y) = z
–for every distinct (x₁, y₁), (x₂, y₂) ∈ S, either f(x₁, y₂) ≠ z or f(x₂, y₁) ≠ z
Property: no two elements of a fooling set S can be in the same monochromatic rectangle
Lemma: if f has a fooling set of size t, then D(f) ≥ log₂ t

40
Applications
Equality: Alice and Bob each hold x, y ∈ {0,1}^n and want to decide whether x = y or not
–Fooling set for Equality: S = {(w, w) | w ∈ {0,1}^n}
–Conclusion: D(Equality) ≥ n
Disjointness: let x, y ⊆ {1, …, n} and let
–DISJ(x, y) = 1 if |x ∩ y| ≥ 1 and
–DISJ(x, y) = 0 otherwise
–Fooling set for Disjointness: S = {(A, comp(A)) | A ⊆ {1, …, n}}
–Conclusion: D(DISJ) ≥ n
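The fooling-set property for Equality can be checked by brute force for tiny n; this sketch uses n = 3 and the diagonal set S from the slide.

```python
from itertools import product

# Brute-force check that S = {(w, w)} is a fooling set for Equality:
# every pair in S has value 1, and for any two distinct pairs one of the
# "crossed" inputs evaluates to 0, so no monochromatic rectangle can
# contain both.
n = 3
S = [(w, w) for w in product((0, 1), repeat=n)]
eq = lambda x, y: int(x == y)

assert all(eq(x, y) == 1 for (x, y) in S)
assert all(eq(x1, y2) == 0 or eq(x2, y1) == 0
           for (x1, y1) in S for (x2, y2) in S if (x1, y1) != (x2, y2))
```

Since |S| = 2^n, the lemma on the previous slide gives D(Equality) ≥ n.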

41
Rank Lower Bound
For f: X × Y → {0,1}, let M_f be the |X| × |Y| matrix whose (x, y) entry is f(x, y). The rank of f is the rank of M_f over the reals
Theorem: for any f: X × Y → {0,1}, D(f) ≥ log₂ rank(f)
Proof: for any protocol P, M_f = Σ_ℓ M_ℓ over the leaves ℓ, where M_ℓ is the matrix corresponding to the rectangle of leaf ℓ
Example: Equality. M_Equality is the identity matrix; its rank is 2^n

42
Inner Product
Let x, y ∈ {0,1}^n and IP(x, y) = Σ_{i=1}^{n} x_i·y_i mod 2
What is rank(M_IP)?
Let N = M_IP·M_IPᵀ. Then entry (x, y) of N is the number of z ∈ {0,1}^n with <x, z> = <y, z> = 1:
–2^{n-2} if x ≠ y
–2^{n-1} if x = y
–0 if x or y is 0
Hence rank(N) ≥ 2^n − 1. Since rank(M_IP·M_IPᵀ) ≤ min{rank(M_IP), rank(M_IPᵀ)}, we get rank(M_IP) ≥ 2^n − 1 and D(IP) ≥ n
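The rank claim can be checked numerically for small n: build M_IP and compute its rank over the rationals with textbook Gaussian elimination (exact arithmetic via fractions). It comes out to exactly 2^n − 1, the all-zeros row being the only dependency.

```python
from fractions import Fraction
from itertools import product

def ip_matrix(n):
    """M_IP for dimension n: entry (x, y) is <x, y> mod 2."""
    pts = list(product((0, 1), repeat=n))
    return [[Fraction(sum(a * b for a, b in zip(x, y)) % 2) for y in pts]
            for x in pts]

def rank(M):
    """Rank over the rationals via Gaussian elimination."""
    M = [row[:] for row in M]
    r = 0
    for c in range(len(M[0])):
        piv = next((i for i in range(r, len(M)) if M[i][c]), None)
        if piv is None:
            continue
        M[r], M[piv] = M[piv], M[r]
        for i in range(len(M)):
            if i != r and M[i][c]:
                f = M[i][c] / M[r][c]
                M[i] = [a - f * b for a, b in zip(M[i], M[r])]
        r += 1
    return r

assert rank(ip_matrix(3)) == 2**3 - 1
```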

43
Non-determinism and Covers
Can a protocol tree be very unbalanced?
–Technique for balancing: given a protocol with t leaves, there is a protocol with communication complexity O(log t)
Is the monochromatic-rectangle lower bound tight?
Consider, instead of a partition into monochromatic rectangles, a cover by monochromatic rectangles
–The rectangles need not be disjoint
For z ∈ Z, a z-cover handles only the inputs (x, y) where f(x, y) = z
This corresponds to non-deterministic complexity. A non-deterministic protocol for verifying that f(x, y) = z:
–Alice: guess a rectangle R intersecting row x and send its name to Bob
–Bob: verify that R intersects column y and tell Alice
–Accept only if Bob approves
Complexity: log(# of z-rectangles)

44
Probabilistic Protocols
[Figure: Alice (input x) and Bob (input y), each with their own random coins]

45
Probabilistic Communication Complexity
Alice and Bob each have, in addition to their inputs, access to random strings of arbitrary length, r_A and r_B respectively
A probabilistic protocol P over domain X × Y with range Z is a binary tree where
–each internal node v is labeled with either a_v(x, r_A) or b_v(y, r_B)
–each leaf is labeled with an element z ∈ Z
All probabilities are over the choice of r_A and r_B
–P computes f with zero error if for all (x, y): Pr[P(x, y) = f(x, y)] = 1
–P computes f with error ε if for all (x, y): Pr[P(x, y) = f(x, y)] ≥ 1 − ε
–For Boolean f, P computes f with one-sided error ε if for all (x, y) s.t. f(x, y) = 0: Pr[P(x, y) = 0] = 1, and for all (x, y) s.t. f(x, y) = 1: Pr[P(x, y) = 1] ≥ 1 − ε

46
Measuring Probabilistic Communication Complexity
For input (x, y), the cost of protocol P on (x, y) can be taken as either
–the worst-case depth, or
–the average depth over r_A and r_B
Cost of a protocol: the maximum cost over all inputs (x, y)
The appropriate measures of probabilistic communication complexity:
–R₀(f): minimum (over all protocols) of the average cost of a randomized protocol that computes f with zero error
–R_ε(f): minimum (over all protocols) of the worst-case cost of a randomized protocol that computes f with error ε
 Makes sense for 0 < ε < ½; set R(f) = R_{1/3}(f)
–R^1_ε(f): minimum (over all protocols) of the worst-case cost of a randomized protocol that computes f with one-sided error ε
 Makes sense for 0 < ε < 1; set R^1(f) = R^1_{1/2}(f)

47
Equality
Idea: pick a family of hash functions H = {h | h: {0,1}^n → {1, …, m}} such that for all x ≠ y and random h ∈_R H: Pr[h(x) = h(y)] ≤ ε
Protocol:
–Alice: pick a random h ∈_R H and send its name together with h(x)
–Bob: compare h(x) to h(y) and announce the result
This is a one-sided-error protocol with cost log|H| + log m
Constructing H:
–Fact: over any field, two distinct polynomials of degree d agree on at most d points
–Fix a prime q with n² ≤ q ≤ 2n² and map x to a polynomial W_x of degree d = n/log q over GF[q]
–H = {h_z | z ∈ GF[q]} with h_z(x) = W_x(z)
–ε = d/q = n/(q log q) ≤ 1/(n log n)
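A sketch of the protocol; the prime q = 10007 and the 13-bit coefficient packing are illustrative stand-ins for the exact parameters n² ≤ q ≤ 2n² of the slide.

```python
import random

# Sketch of the one-sided-error Equality protocol: Alice packs her bits
# into the coefficients of a polynomial over GF(q), evaluates it at a
# random point z, and sends (z, W_x(z)); Bob accepts iff his own
# evaluation at z matches.
Q = 10007            # an illustrative prime
CHUNK = 13           # bits per coefficient, 2^13 < Q

def poly(bits: str):
    return [int(bits[i:i + CHUNK], 2) for i in range(0, len(bits), CHUNK)]

def evaluate(coeffs, z):
    acc = 0
    for c in reversed(coeffs):
        acc = (acc * z + c) % Q
    return acc

def equality_protocol(x_bits: str, y_bits: str) -> bool:
    z = random.randrange(Q)               # Alice's random hash choice
    return evaluate(poly(x_bits), z) == evaluate(poly(y_bits), z)
```

If x = y the protocol always accepts; if x ≠ y the two distinct degree-d polynomials agree on at most d of the q evaluation points, so a false accept has probability at most d/q.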

48
Relationship Between the Measures
Error reduction (Boolean f):
–For one-sided error: k repetitions reduce the error from ε to ε^k, hence R^1_{ε^k}(f) ≤ k·R^1_ε(f)
–For two-sided error: k repetitions and taking the majority reduce the error, by a Chernoff bound: if Pr[x_i = 1] = ½ − ε then Pr[Σ_{i=1}^{k} x_i > k/2] ≤ e^{−2ε²k}
Derandomization: R(f) = R_{1/3}(f) ∈ Ω(log D(f))
–General idea: find a small collection of assignments to the random strings on which the protocol behaves similarly
–Problem: to work for all pairs of inputs one would need ~n repetitions
–Instead: jointly evaluate, for each leaf, the probability of reaching it on the given input: P_ℓ[x, y] = P_A[x | Bob follows the path] · P_B[y | Alice follows the path]
–Alice computes her factors and sends them with accuracy log R(f) bits; Bob completes the computation

49
Public Coins Model
What if Alice and Bob have access to a joint source of random bits?
–Possible view: a distribution over deterministic protocols
Let R^pub_ε(f) be the minimum cost of a public-coins protocol computing f correctly with probability at least 1 − ε on every input (x, y)
Example: R^pub_ε(Equality) = Θ(log 1/ε)
Theorem: for any Boolean f: R_{ε+δ}(f) ≤ R^pub_ε(f) + O(log n + log 1/δ)
Proof: choose t = 8n/δ² assignments to the public string…

50
Simulating Large Sample Spaces
Want to find, among all possible public random strings, a small collection on which the protocol behaves similarly on all inputs
–The collection should resemble the probability of success on ALL inputs
Choose t random strings
For input (x, y), let A_{x,y} be the event that more than an (ε + δ)-fraction of the t strings fail the protocol
Pr[A_{x,y}] ≤ e^{−2δ²t} < 2^{−2n}
Pr[∪_{x,y} A_{x,y}] ≤ Σ_{x,y} Pr[A_{x,y}] < 2^{2n} · 2^{−2n} = 1

51
Number of Rounds
So far we have not been concerned with the number of rounds
It is known that there are functions with a large gap in communication complexity between protocols with few rounds and with many rounds
What is the smallest number of rounds possible?

52
What happens if we flatten the tree?
[Figure: Alice (input x) sends m_A and Bob (input y) sends m_B to Carol, who outputs f(x, y)]

53
The Simultaneous Messages Model
Alice receives x and Bob receives y
They simultaneously send a message to a referee, Carol, who initially gets no input
Carol should compute f(x, y)
Several possible models:
–Deterministic: all lower bounds for deterministic protocols for f(x, y) apply here
–Shared (public) random coins: Equality has a good protocol
 Treat the public string as a hash function h: Alice sends h(x), Bob sends h(y), and Carol compares the outcomes
 The complexity can be as little as O(1) if a constant probability of error is acceptable
 Provided the random bits are chosen independently of the inputs

54
Simultaneous Messages Protocols
Suggested by Yao, 1979
[Figure: Alice (x ∈ {0,1}^n) sends m_A and Bob (y ∈ {0,1}^n) sends m_B to Carol, who outputs f(x, y)]

55
Simultaneous Messages Protocols
For the equality function there exists a protocol where |m_A| × |m_B| = O(n):
–Let C be a good error-correcting code
–Alice and Bob arrange C(x) and C(y) in an |m_A| × |m_B| rectangle
–Alice sends a random row; Bob sends a random column
–Carol compares the intersection
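The protocol can be sketched as below. A Reed-Solomon-style code (polynomial evaluation at q points) stands in for the "good code", and the field size and square layout are illustrative choices.

```python
import random

# Sketch of the simultaneous-messages Equality protocol: the codeword is
# arranged in a square, Alice sends a random row, Bob a random column, and
# Carol compares the single symbol at their intersection.
Q = 257    # prime; codeword = evaluations of the message polynomial
SIDE = 16  # 16 x 16 = 256 <= Q symbols arranged in a square

def codeword(msg: bytes):
    coeffs = list(msg)  # one byte per coefficient, each < Q
    def ev(z):
        acc = 0
        for c in reversed(coeffs):
            acc = (acc * z + c) % Q
        return acc
    return [ev(z) for z in range(SIDE * SIDE)]

def alice(x):
    """A random row of the codeword square, with its index."""
    i = random.randrange(SIDE)
    return i, codeword(x)[i * SIDE:(i + 1) * SIDE]

def bob(y):
    """A random column of the codeword square, with its index."""
    j = random.randrange(SIDE)
    return j, codeword(y)[j::SIDE]

def carol(msg_a, msg_b) -> bool:
    (i, row), (j, col) = msg_a, msg_b
    return row[j] == col[i]
```

If x = y, Carol always accepts; if x ≠ y, the codewords differ in most positions (two distinct low-degree polynomials agree on few points), so the random intersection exposes the difference with constant probability.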

56
Simultaneous Messages Protocols
Lower bounds for the equality function:
–|m_A| + |m_B| = Ω(√n) [Newman and Szegedy 1996]
–|m_A| × |m_B| = Ω(n) [Babai and Kimmel 1997]
Idea: for each x ∈ {0,1}^n find a "typical" multiset of messages T_x = {w₁, w₂, …, w_t} with t ∈ O(|m_B|)
–Each w_i is a message of the original protocol, |m_A| bits long
–Property: for each message m_B, the behavior on T_x approximates the real behavior: the average response of Carol on w₁, w₂, …, w_t (over random i and Carol's randomness) is close to her average response in the protocol (over the randomness of Alice and Carol)

57
Simultaneous Messages Protocols
How to find, for each x ∈ {0,1}^n, such a "typical" T_x of size t?
Claim: a random choice of the w_i's is good
–Proof by a Chernoff bound; need to "take care" of every m_B (2^{|m_B|} possibilities)
Claim: for x ≠ x′ we have T_x ≠ T_{x′}
–Otherwise the protocol would behave the same on y = x for both x and x′
–Let S_x be the set of m_B's for which the protocol mostly says 1 and W_x the set for which it mostly says 0; for y = x the distribution should be mostly on S_x
Conclusion: t · |m_A| ≥ n, which gives |m_A| × |m_B| = Ω(n) [Babai and Kimmel 1997]

58
General Issue
What do combinatorial lower bounds mean when complexity issues are involved?
What happens to the pigeon-hole principle when one-way functions (one-way hashing) are involved?
Does the simultaneous-messages lower bound hold when one-way functions exist?
–The issue is complicated by the model
–One can define a Consecutive Message Protocol model with iff results

59
And now for something completely different

60
Faculty Members in Cryptography and Complexity
Prof. Uri Feige, Prof. Oded Goldreich, Prof. Shafi Goldwasser, Prof. Moni Naor, Dr. Omer Reingold, Prof. Ran Raz, Prof. Adi Shamir
One of the most active groups in the world!
