Session 2: Secret key cryptography – stream ciphers – part 2.

The Berlekamp-Massey algorithm Computational complexity of the Berlekamp-Massey algorithm is quadratic in the length of the minimum LFSR capable of generating the intercepted sequence. Thus, if the linear complexity is very high, then the task of predicting the next bits of the sequence is too complex.

The Berlekamp-Massey algorithm Then, in order to prevent the cryptanalysis of a pseudorandom sequence generator, we must design it in such a way that its linear complexity is too high for the application of the Berlekamp-Massey algorithm.
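The algorithm itself is not reproduced on the slides. The following is a minimal sketch of the textbook binary Berlekamp-Massey procedure, with hypothetical function and variable names. It returns the linear complexity and the connection polynomial of the shortest LFSR that generates a given bit sequence; the two nested loops show where the quadratic cost comes from.

```python
def berlekamp_massey(bits):
    """Return (L, c) for a binary sequence: L is the linear complexity and
    c = [1, c1, ..., cL] the connection polynomial 1 + c1*x + ... + cL*x^L."""
    n = len(bits)
    c = [1] + [0] * n      # current connection polynomial
    b = [1] + [0] * n      # polynomial before the last length change
    L, m = 0, -1           # current length, index of the last length change
    for i in range(n):
        # Discrepancy between bits[i] and the bit predicted by the current LFSR.
        d = bits[i]
        for j in range(1, L + 1):
            d ^= c[j] & bits[i - j]
        if d:
            previous = c[:]
            shift = i - m
            for j in range(n + 1 - shift):
                c[j + shift] ^= b[j]
            if 2 * L <= i:          # the register has to grow
                L, m, b = i + 1 - L, i, previous
    return L, c[:L + 1]


# Control sequence {a_i} from the shrinking generator example in these slides.
a = [0, 1, 1, 1, 0, 0, 1, 0, 1, 1, 1, 0, 0, 1]
print(berlekamp_massey(a))  # (3, [1, 0, 1, 1]), i.e. 1 + x^2 + x^3
```

Once 2L consecutive bits of a sequence of linear complexity L are known, the recovered LFSR predicts every later bit, which is why the design goal is a linear complexity far too large for this attack.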

Pseudorandom sequence generators Based on LFSRs. The goals: preserve the good characteristics of the PN-sequences; increase the linear complexity. The key is the initial state. There are different families of generators.

Combinational generators Two families built on LFSRs: the non linear filter and the non linear combiner.

Non linear filters In general, it is difficult to calculate the value of the linear complexity of the resulting sequence. However, under some special conditions, it is possible to estimate the linear complexity of the resulting sequence.

Algebraic normal form It is the form of a Boolean function that uses only the operations $\oplus$ (sum modulo 2) and $\cdot$ (product modulo 2). In the ANF, the number of variables in the largest product is called the non linear order of the function. Example: The non linear order of the function $f(x_1,x_2,x_3)=x_1 \oplus x_2 x_3 \oplus x_1 x_3$ is 2.

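Algebraic normal form The ANF of a function can be determined from its truth table by means of the Möbius transform: the coefficient of the product indexed by u is $a_u = \bigoplus_{x \preceq u} f(x)$, where $x \preceq u$ means that x has a 1 only in positions where u has a 1.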

Algebraic normal form Example: n=3, u=001. Among the vectors x = 000, 001, ..., 111, those covered by u are 000 and 001.

Algebraic normal form Example: n=3, u=010. Among the vectors x = 000, 001, ..., 111, those covered by u are 000 and 010.

Algebraic normal form Example: n=3
x0 x1 x2 | f
 0  0  0 | 0
 0  0  1 | 1
 0  1  0 | 0
 0  1  1 | 1
 1  0  0 | 0
 1  0  1 | 1
 1  1  0 | 1
 1  1  1 | 0

Algebraic normal form The coefficients for each u (all sums are modulo 2):
$a_{000} = f(0,0,0) = 0$
$a_{001} = f(0,0,0)+f(0,0,1) = 0+1 = 1$
$a_{010} = f(0,0,0)+f(0,1,0) = 0+0 = 0$
$a_{011} = f(0,0,0)+f(0,0,1)+f(0,1,0)+f(0,1,1) = 0+1+0+1 = 0$
$a_{100} = f(0,0,0)+f(1,0,0) = 0+0 = 0$
$a_{101} = f(0,0,0)+f(0,0,1)+f(1,0,0)+f(1,0,1) = 0+1+0+1 = 0$
$a_{110} = f(0,0,0)+f(0,1,0)+f(1,0,0)+f(1,1,0) = 0+0+0+1 = 1$
$a_{111} = f(0,0,0)+f(0,0,1)+f(0,1,0)+f(0,1,1)+f(1,0,0)+f(1,0,1)+f(1,1,0)+f(1,1,1) = 0$

Algebraic normal form $f(x_0,x_1,x_2) = a_{001} x_2 + a_{110} x_0 x_1 = x_2 + x_0 x_1$
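The worked example can be checked mechanically. This is a small sketch with a hypothetical function name that applies the Möbius transform directly, by XOR-ing f(x) over every x covered by u. It assumes the truth table is indexed by the integer whose binary digits are x0 x1 x2, with x0 most significant.

```python
def anf_coefficients(truth_table, n):
    """truth_table[x] = f(x), x read as an n-bit integer (x0 most significant).
    Returns a with a[u] = XOR of f(x) over all x covered by u (x & u == x)."""
    coefficients = []
    for u in range(2 ** n):
        a_u = 0
        for x in range(2 ** n):
            if x & u == x:          # x has ones only where u has ones
                a_u ^= truth_table[x]
        coefficients.append(a_u)
    return coefficients


# Truth table of the example, rows 000 .. 111.
f = [0, 1, 0, 1, 0, 1, 1, 0]
a = anf_coefficients(f, 3)
terms = [format(u, "03b") for u in range(8) if a[u]]
print(terms)  # ['001', '110'] -> f = x2 + x0*x1
```

The direct double loop costs 4^n operations; a butterfly-style implementation would bring this down to n·2^n, but for n=3 the plain form follows the slides more closely.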

Non linear filters Theorem (Rueppel, 1984): Given an LFSR of length n and a filter function whose ANF has a unique term of maximum order k, that term being a product of equidistant phases, the linear complexity of the resulting sequence is bounded from below by $LC \ge \binom{n}{k}$.

Non linear filters Design principles: The feedback polynomial must be primitive. The filter function must have various terms of each order. $k \approx n/2$. Include a linear term in order to obtain good statistical properties of the resulting sequence (balanced filter function).

Non linear combiners In these generators, the keystream sequence is obtained by combining the output sequences of various LFSRs in a non linear manner. Example – it is possible to use a Boolean function (without memory).

Non linear combiners Two cryptographic principles by Shannon: Confusion – we must use complicated transformations – as many bits of the key as possible should be involved in obtaining a single bit of the keystream sequence (and the ciphertext). Diffusion – Every bit of the key must affect many bits of the keystream sequence (and the ciphertext).

Non linear combiners Possible flaws of non linear combiners (to be considered during the design): Bad statistical properties – e.g. too many zeros/ones in the output sequence. Correlation – The output sequence coincides too much with one or more internal sequences – this enables correlation attacks.

Non linear combiners Correlation attacks: It is possible to divide the task of the cryptanalyst into several less difficult tasks – “Divide and conquer”. In order to prevent the correlation attacks, the non linear function of the combiner must have, at the same time: as high non linear order as possible as high correlation immunity as possible. These two requirements are opposite – we must find a trade off between these two values.

Non linear combiners Correlation immunity: A Boolean function is correlation immune of order m if its output sequence is not correlated with any set of m or fewer input sequences. But the higher the correlation immunity, the lower the non linear order k. The trade off (N is the number of variables): $m+k \le N$; $1 \le k \le N$, $0 \le m \le N-1$.

Non linear combiners A Boolean function is balanced if it has an equal number of 0s and 1s in its truth table. The balanced correlation immune functions of order m are called m-resilient functions.

Non linear combiners Example: The sum modulo 2 of N variables has the maximum possible value of correlation immunity, N-1, but its non linear order is 1. If the combination function contains memory, then the trade off between the correlation immunity and non linearity is not needed – it is possible to maximize both values – a single bit of memory is enough (Rueppel, 1984).
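The claim about the sum modulo 2 can be verified by brute force. The sketch below is not from the slides: it assumes the Walsh-transform characterization due to Xiao and Massey, under which a function is correlation immune of order m exactly when all its Walsh coefficients of Hamming weight 1 to m are zero. The function names are hypothetical.

```python
from itertools import product


def walsh(f, n, w):
    """Walsh coefficient: sum over all x of (-1)^(f(x) XOR w.x)."""
    total = 0
    for x in product((0, 1), repeat=n):
        dot = sum(wi & xi for wi, xi in zip(w, x)) & 1
        total += (-1) ** (f(*x) ^ dot)
    return total


def correlation_immunity(f, n):
    """Largest m such that every Walsh coefficient of weight 1..m is zero."""
    m = 0
    for weight in range(1, n + 1):
        vectors = [w for w in product((0, 1), repeat=n) if sum(w) == weight]
        if any(walsh(f, n, w) != 0 for w in vectors):
            break
        m = weight
    return m


def xor3(x1, x2, x3):
    return x1 ^ x2 ^ x3


print(correlation_immunity(xor3, 3))  # 2, i.e. N-1 for N=3
```

With non linear order k=1 and m=2, the sum modulo 2 of three variables meets the bound m+k ≤ N with equality.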

Non linear combiners If F is a Boolean function of N periodic input sequences $a_1(t), a_2(t), \ldots, a_N(t)$, then the output sequence $b(t) = F(a_1(t), a_2(t), \ldots, a_N(t))$ is a linear combination of various products of sequences. These products are determined by the ANF of the function F.

Non linear combiners If in the ANF of the function F instead of the sum and product modulo 2 we use the sum and product of integers, the resulting function is denominated F* and for the linear complexity and the period of the output sequence of F the following holds:

Non linear combiners Example: If the characteristic polynomials of the input sequences are: All these polynomials are primitive!

Non linear combiners Example (cont.): Then

Non linear combiners The sum of N sequences in GF(q): The equality holds if the characteristic polynomials of the input sequences are mutually prime.

Non linear combiners The sum of N sequences in GF(q): Obviously, if the periods of the input sequences are mutually prime then

Non linear combiners Example: Primitive! The periods are Mersenne primes

Non linear combiners Product of N sequences in GF(q): Theorem (Golić, 1989) If $Per(a_i)$ are mutually prime, then Theorem (Lidl, Niederreiter) $Per(a_i)$ are mutually prime

Non linear combiners Example: Primitive! The periods are Mersenne primes

Non linear combiners The general case: Let be the Boolean function obtained by removing all the products from the function F except those of the maximum order. Let be the corresponding integer function.

Non linear combiners Theorem (Golić, 1989) F depends on all the N input variables. Per(a i ) are mutually prime. Then

Non linear combiners Example: If the characteristic polynomials of the input sequences are: Primitive, periods Mersenne primes

Non linear combiners Example (cont.)

Geffe’s generator F balanced – good statistical properties

Geffe’s generator The equivalent scheme

Geffe’s generator Example: polynomials – primitive, with periods that are Mersenne primes.

Geffe’s generator Problem: Correlation!
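The combining function is not preserved in this transcript. The sketch below assumes the usual textbook form of Geffe's function, in which x2 selects between x1 and x3, and counts over the whole truth table how often the output agrees with each input.

```python
from itertools import product


def geffe(x1, x2, x3):
    # Assumed textbook form: output x1 when x2 = 1, output x3 when x2 = 0.
    return (x1 & x2) ^ ((1 ^ x2) & x3)


inputs = list(product((0, 1), repeat=3))
outputs = [geffe(*x) for x in inputs]

balanced = sum(outputs) == len(outputs) // 2
agreement = [
    sum(out == x[i] for x, out in zip(inputs, outputs)) / len(inputs)
    for i in range(3)
]
print(balanced, agreement)  # True [0.75, 0.5, 0.75]
```

Under this assumption the function is balanced, but the output coincides with x1 and with x3 three times out of four, so each of those registers can be attacked separately by correlation.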

Correlation immune functions Is there a way to find a Boolean memoryless combiner that guarantees a high level of correlation immunity? This is a difficult problem and there is no final answer. However, some Boolean combiners are known to have a high level of correlation immunity.

Correlation immune functions One of the classes of such “good” functions – Latin squares. A Latin square is an n×n scheme of integers in which each element appears exactly once in each row and in each column.

Correlation immune functions Basic property of Latin squares: If we exchange two rows/columns of a Latin square, the obtained scheme is also a Latin square. This gives rise to a construction (one of the possible algorithms): We start from the table of addition of the additive group with n elements. We exchange some rows and columns of the table several times.

Correlation immune functions Example – a Latin square of order 4:
3 2 0 1
1 0 2 3
0 3 1 2
2 1 3 0

Correlation immune functions A Latin square of dimension n as a family of $\log_2 n$ Boolean functions (a vectorial Boolean function with $\log_2 n$ outputs): There are 2 address branches, $\log_2 n$ bits each. The output has $\log_2 n$ bits. Example (see previous slide): The address is 0110 (the two most significant bits address the row). The output is 10.
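The addressing can be sketched as a table lookup. The helper names below are hypothetical; rows and columns are assumed to be numbered from 0, as the example address implies.

```python
SQUARE = [
    [3, 2, 0, 1],
    [1, 0, 2, 3],
    [0, 3, 1, 2],
    [2, 1, 3, 0],
]


def is_latin(square):
    """Every symbol 0..n-1 appears exactly once in each row and each column."""
    n = len(square)
    symbols = set(range(n))
    rows_ok = all(set(row) == symbols for row in square)
    cols_ok = all({square[r][c] for r in range(n)} == symbols for c in range(n))
    return rows_ok and cols_ok


def lookup(square, address_bits):
    """First half of the address selects the row, second half the column."""
    half = len(address_bits) // 2
    row = int(address_bits[:half], 2)
    col = int(address_bits[half:], 2)
    return format(square[row][col], f"0{half}b")


print(is_latin(SQUARE), lookup(SQUARE, "0110"))  # True 10
```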

Correlation immune functions Basic correlation-related property of Latin squares: Each bit of output is correlated with a linear combination of inputs that are located in both address branches. Consequence: there is no way of analyzing the address branches individually – no divide and conquer.

Correlation immune functions

Decimation of sequences The principal characteristic: The output sequence of a subgenerator controls the clock sequence of one or more subgenerators.

Decimation of sequences Example 1: X=1,1,0,1,0,1,0,1; Y=0,1,0,0,1; Z=1,0,1,0,0. Example 2: X and Y are generated by LFSRs and the binary rate multiplier (BRM) is applied.
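The slides do not state the decimation rule, so the sketch below uses a rule inferred from Example 1: the generator of X is clocked once per output bit and once more whenever the current bit of Y is 1, that is, z_i = x at index i + (y_0 + ... + y_i). The function name is hypothetical.

```python
def brm(x, y):
    """Decimate x under control of y: skip one extra bit of x when y_i = 1."""
    z, pos = [], 0
    for yi in y:
        pos += yi               # extra clock pulse when the control bit is 1
        if pos >= len(x):
            break               # ran out of bits of x
        z.append(x[pos])
        pos += 1                # regular clock pulse
    return z


X = [1, 1, 0, 1, 0, 1, 0, 1]
Y = [0, 1, 0, 0, 1]
print(brm(X, Y))  # [1, 0, 1, 0, 0]
```

The same rule also reproduces the BRM line of the later "BRM vs. Shrinking" example, which supports the inference.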

Decimation of sequences Theorem (Chambers, Jennings, 1984): Let $R_1$, $R_2$ be primitive polynomials of degrees m and n, respectively, with periods $M=2^m-1$ and $N=2^n-1$. If all the prime factors of M divide N, then the output sequence has period $MN$ and linear complexity $nM$.

Decimation of sequences The requirements of the Theorem are satisfied if the lengths of both LFSRs are equal and the feedback polynomials are primitive.

Decimation of sequences Example: n=m=107, primitive polynomials. $LC = nM = 107(2^{107}-1)$, $Per = NM = (2^{107}-1)(2^{107}-1)$.

The shrinking generator (1993) A very simple binary sequence generator (Crypto’93). It consists of two LFSRs: LFSR1 and LFSR2. Based on the decimation rule P, LFSR1 (the control register) decimates the sequence generated by LFSR2. [Figure: block diagram with LFSR 1, LFSR 2, the decimation rule P and the clock.]

The shrinking generator If $a_i=0$, $b_i$ is discarded, otherwise $b_i$ is sent to the output. Thus the number of discarded bits from the sequence b depends on the lengths of runs of 0s in the sequence a.

The shrinking generator (an example) LFSRs:
LFSR1: $L_1=3$, $f_1(x)=1+x^2+x^3$, $IS_1=(1,0,0)$
LFSR2: $L_2=4$, $f_2(x)=1+x+x^4$, $IS_2=(1,0,0,0)$
Decimation rule P:
{a_i} = 0 1 1 1 0 0 1 0 1 1 1 0 0 1 …
{b_i} = 1 1 1 0 1 0 1 1 0 0 1 0 0 0 …
{c_j} = 1 1 0 1 0 0 1 0 …
The bits of {b_i} at the positions where $a_i=0$ (underlined on the original slide) are eliminated.
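The decimation rule is short enough to sketch directly. The code below takes the two LFSR output sequences as given, since the shift and tap conventions of the registers are not spelled out on the slide, and keeps b_i only where a_i = 1. The function name is hypothetical.

```python
def shrink(control, data):
    """Keep data[i] when control[i] == 1, discard it otherwise."""
    return [b for a, b in zip(control, data) if a == 1]


a = [0, 1, 1, 1, 0, 0, 1, 0, 1, 1, 1, 0, 0, 1]
b = [1, 1, 1, 0, 1, 0, 1, 1, 0, 0, 1, 0, 0, 0]
print(shrink(a, b))  # [1, 1, 0, 1, 0, 0, 1, 0]
```

Because the output rate depends on the control bits, a practical implementation would normally buffer the output; that detail is outside this sketch.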

Characteristics of the output sequence Period: Linear complexity: Number of 1’s: balanced sequence

Example – BRM vs. Shrinking
BRM: X=000100110101111…, Y=001110100111010…, Z=0010100111…
Shrinking: X=000100110101111…, Y=001110100111010…, Z=01011011

Statistical testing of PN generators The output sequence of a generator of pseudorandom sequences looks random, but it is not. Pseudorandom generators expand a truly random sequence (the key) to a much longer sequence, such that an adversary cannot distinguish between the pseudorandom sequence and a truly random sequence.

Statistical testing of PN generators In order to obtain a guarantee of the security of this type of generators various statistical tests are applied, especially designed for this purpose. The fact that a generator passes a set of statistical tests should be considered a necessary condition, although not a sufficient one, for the security of the generator.

Statistical testing of PN generators If the result X of an experiment can take any real value, then X is a continuous random variable. The probability density function f(x) of a continuous random variable X can be integrated and the following holds: $f(x) \ge 0$ for all $x \in \mathbb{R}$; $\int_{-\infty}^{\infty} f(x)\,dx = 1$; for all $a, b \in \mathbb{R}$, $P(a < X \le b) = \int_a^b f(x)\,dx$.

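Statistical testing of PN generators A continuous random variable X has a normal distribution with the mean $\mu$ and the variance $\sigma^2$ if its probability density function is $f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)$, $-\infty < x < \infty$. We say that X is $N(\mu,\sigma^2)$. If X is $N(0,1)$, then we say that X has a standard normal distribution.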

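Statistical testing of PN generators If the random variable X is $N(\mu,\sigma^2)$, then the variable $Z = (X-\mu)/\sigma$ is $N(0,1)$. Euler's gamma function: $\Gamma(t) = \int_0^{\infty} x^{t-1} e^{-x}\,dx$, $t > 0$.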

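Statistical testing of PN generators A continuous random variable X has a $\chi^2$ distribution with $\nu$ degrees of freedom if its probability density function is $f(x) = \frac{1}{\Gamma(\nu/2)\,2^{\nu/2}}\, x^{(\nu/2)-1} e^{-x/2}$ for $0 \le x < \infty$, and $f(x) = 0$ for $x < 0$.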

Statistical testing of PN generators A statistical hypothesis H is an affirmation about the distribution of one or more random variables. A hypothesis test is a procedure based on the observed values of the random variable that leads to the acceptance or rejection of the hypothesis H.

Statistical testing of PN generators The test only provides a measure of the strength of evidence given by the data against the hypothesis. The conclusion is probabilistic. The level of significance $\alpha$ of the test of the hypothesis H is the probability of rejecting the hypothesis H when it is true.

Statistical testing of PN generators The hypothesis to be tested is called the null hypothesis, $H_0$. The alternative hypothesis is denoted by $H_1$ or $H_a$. In cryptography: $H_0$ – the given generator is a random sequence generator.

Statistical testing of PN generators If $\alpha$ is too small, the test could accept non random sequences. If $\alpha$ is too high, the test could reject random sequences. In cryptography: $\alpha$ is between 0.001 and 0.05.

Statistical testing of PN generators A test: Determines a statistic for the sample of the output sequence. This statistic is compared with the expected value of a random sequence.

Statistical testing of PN generators How is the comparison carried out? The computed statistic – $X_0$ – follows a $\chi^2$ distribution with $\nu$ degrees of freedom. It is assumed that this statistic takes large values for non random sequences. In order to achieve the level of significance $\alpha$, a threshold $X_\alpha$ is chosen (by means of the corresponding table), such that $P(X_0 > X_\alpha) = \alpha$.

Statistical testing of PN generators How is the comparison carried out? (cont.) If the value of the statistic for the sample of the output sequence, $X_s$, satisfies $X_s > X_\alpha$, then the sequence fails the test. Basic tests for cryptographic use: Frequency test, serial test, poker test, runs test, autocorrelation test, etc.

Statistical testing of PN generators Frequency test Purpose: determine if the number of zeros and ones in a sequence s of length n is approximately the same. $n_0$ – number of zeros, $n_1$ – number of ones. The statistic: $X_1 = \frac{(n_0 - n_1)^2}{n}$

Statistical testing of PN generators Frequency test (cont.) The statistic follows a $\chi^2$ distribution with 1 degree of freedom. The approximation is good enough if $n \ge 10$.
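A minimal sketch of the frequency test follows, assuming the usual form of the statistic, (n0 - n1)^2 / n. The 160-bit sample and the function name are illustrative and not taken from the slides; the threshold 3.8415 is the tabulated chi-square value for 1 degree of freedom at a significance level of 0.05.

```python
def frequency_statistic(s):
    """(n0 - n1)^2 / n for a list of bits s."""
    n = len(s)
    n1 = sum(s)
    n0 = n - n1
    return (n0 - n1) ** 2 / n


CHI2_1DOF_ALPHA_005 = 3.8415  # table value: chi-square, 1 d.o.f., alpha = 0.05

block = "11100 01100 01000 10100 11101 11100 10010 01001".replace(" ", "")
sample = [int(ch) for ch in block * 4]      # 160 bits: 84 zeros, 76 ones

x1 = frequency_statistic(sample)
print(x1, x1 <= CHI2_1DOF_ALPHA_005)  # 0.4 True -> the sample passes
```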

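Statistical testing of PN generators Serial test Tries to determine if the number of occurrences of 00, 01, 10 and 11 as (overlapping) subsequences of s is approximately the same. The statistic: $X_2 = \frac{4}{n-1}\left(n_{00}^2 + n_{01}^2 + n_{10}^2 + n_{11}^2\right) - \frac{2}{n}\left(n_0^2 + n_1^2\right) + 1$. The statistic follows approximately a $\chi^2$ distribution with 2 degrees of freedom. The approximation is good enough if $n \ge 21$.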
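The serial test can be sketched as follows, assuming the usual form of the statistic, 4/(n-1) · (n00² + n01² + n10² + n11²) - 2/n · (n0² + n1²) + 1, with overlapping pairs. The sample sequence and function name are illustrative, not from the slides.

```python
def serial_statistic(s):
    """Serial (two-bit) test statistic over overlapping pairs of s."""
    n = len(s)
    n1 = sum(s)
    n0 = n - n1
    pairs = {(0, 0): 0, (0, 1): 0, (1, 0): 0, (1, 1): 0}
    for first, second in zip(s, s[1:]):     # the n-1 overlapping pairs
        pairs[(first, second)] += 1
    pair_term = sum(count * count for count in pairs.values())
    return 4 / (n - 1) * pair_term - 2 / n * (n0 * n0 + n1 * n1) + 1


block = "11100 01100 01000 10100 11101 11100 10010 01001".replace(" ", "")
sample = [int(ch) for ch in block * 4]      # 160 bits

x2 = serial_statistic(sample)
print(round(x2, 4))  # 0.6252, below the 5.9915 threshold (2 d.o.f., alpha = 0.05)
```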

Statistical testing of PN generators Poker test A positive integer m is considered such that $\lfloor n/m \rfloor \ge 5 \cdot 2^m$. The sequence s is divided into $k = \lfloor n/m \rfloor$ non-overlapping parts of size m. $n_i$ is the number of occurrences of the type i of the sequence of length m, $1 \le i \le 2^m$ (that is, i is the value of the integer whose binary representation is the sequence of length m). The test determines if every sequence of length m appears approximately the same number of times.

Statistical testing of PN generators Poker test (cont.) The statistic: $X_3 = \frac{2^m}{k}\left(\sum_{i=1}^{2^m} n_i^2\right) - k$. The statistic follows approximately a $\chi^2$ distribution with $2^m-1$ degrees of freedom.
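A short sketch of the poker test, assuming the usual form of the statistic, (2^m / k) · Σ n_i² - k, over k non-overlapping blocks of m bits. The function name and the two test sequences are illustrative. Block types are indexed here from 0 to 2^m - 1, which does not change the sum.

```python
def poker_statistic(s, m):
    """Poker test statistic for bit list s and block size m."""
    k = len(s) // m                         # number of complete blocks
    counts = [0] * (2 ** m)
    for j in range(k):
        block = s[j * m:(j + 1) * m]
        counts[int("".join(map(str, block)), 2)] += 1
    return 2 ** m * sum(c * c for c in counts) / k - k


# Every 3-bit pattern exactly five times: a perfectly uniform sample.
patterns = [[int(ch) for ch in format(i, "03b")] for i in range(8)]
uniform = [bit for _ in range(5) for p in patterns for bit in p]
constant = [0] * 120                        # a single pattern, 40 times

print(poker_statistic(uniform, 3), poker_statistic(constant, 3))  # 0.0 280.0
```

With m=3 the statistic is compared against a chi-square threshold with 7 degrees of freedom (14.0671 at a significance level of 0.05), so the constant sequence fails by a wide margin.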

Statistical testing of PN generators Runs test A run of length i – a subsequence of s formed by i consecutive zeros or i consecutive ones that are neither preceded nor followed by the same symbol. A run of zeros – gap A run of ones – block

Statistical testing of PN generators Runs test (cont.) Purpose: determine if the number of runs of different lengths in the sequence s is that expected in a random sequence. The expected number of gaps (or blocks) of length i in a random sequence of length n is $e_i = (n-i+3)/2^{i+2}$. k is taken to be the largest integer i for which $e_i \ge 5$. We denote by $B_i$ and $H_i$ the number of blocks and gaps of length i in s, for each i, $1 \le i \le k$.

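Statistical testing of PN generators Runs test (cont.) The statistic: $X_4 = \sum_{i=1}^{k} \frac{(B_i - e_i)^2}{e_i} + \sum_{i=1}^{k} \frac{(H_i - e_i)^2}{e_i}$. The statistic follows approximately a $\chi^2$ distribution with $2k-2$ degrees of freedom.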
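The runs test can be sketched as below, assuming the usual expected count e_i = (n - i + 3) / 2^(i+2) and the statistic that sums (B_i - e_i)²/e_i and (H_i - e_i)²/e_i for i = 1..k. The sample sequence and the function name are illustrative, not from the slides.

```python
from itertools import groupby


def runs_statistic(s):
    """Return (statistic, k) of the runs test for the bit list s."""
    n = len(s)

    def expected(i):
        return (n - i + 3) / 2 ** (i + 2)

    k = 0
    while expected(k + 1) >= 5:             # largest i with e_i >= 5
        k += 1
    blocks = [0] * (k + 1)                  # blocks[i]: runs of ones of length i
    gaps = [0] * (k + 1)                    # gaps[i]: runs of zeros of length i
    for bit, group in groupby(s):
        length = len(list(group))
        if length <= k:
            (blocks if bit == 1 else gaps)[length] += 1
    statistic = sum(
        (blocks[i] - expected(i)) ** 2 / expected(i)
        + (gaps[i] - expected(i)) ** 2 / expected(i)
        for i in range(1, k + 1)
    )
    return statistic, k


block = "11100 01100 01000 10100 11101 11100 10010 01001".replace(" ", "")
sample = [int(ch) for ch in block * 4]      # 160 bits

x4, k = runs_statistic(sample)
print(round(x4, 2), k)  # 31.79 3
```

For k=3 the reference distribution has 4 degrees of freedom, with threshold 9.4877 at a significance level of 0.05, so this periodic sample fails the runs test even though it passes the frequency and serial tests.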

Statistical testing of PN generators Autocorrelation test Checks the correlation between s and shifted versions of s. An integer d, $1 \le d \le \lfloor n/2 \rfloor$, is considered. The number of bits in s that are not equal to their d-shifts is $A(d) = \sum_{i=0}^{n-d-1} s_i \oplus s_{i+d}$.

Statistical testing of PN generators Autocorrelation test (cont.) The statistic: $X_5 = 2\left(A(d) - \frac{n-d}{2}\right) / \sqrt{n-d}$. The statistic follows approximately an $N(0,1)$ distribution. The approximation is good enough if $n-d \ge 10$.
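A sketch of the autocorrelation test, assuming A(d) counts the positions where s_i differs from s_(i+d) and the statistic is 2(A(d) - (n-d)/2) / sqrt(n-d). The sample sequence, the shift d=8 and the function name are illustrative choices, not from the slides.

```python
from math import sqrt


def autocorrelation_statistic(s, d):
    """Autocorrelation test statistic of the bit list s for shift d."""
    n = len(s)
    a_d = sum(s[i] ^ s[i + d] for i in range(n - d))   # disagreements
    return 2 * (a_d - (n - d) / 2) / sqrt(n - d)


block = "11100 01100 01000 10100 11101 11100 10010 01001".replace(" ", "")
sample = [int(ch) for ch in block * 4]      # 160 bits

x5 = autocorrelation_statistic(sample, 8)
print(round(x5, 2))  # 3.89
```

Because small and large values of A(d) are both suspicious, the test is two-sided: at a significance level of 0.05 the sample is rejected when the absolute value of the statistic exceeds 1.96, which is the case here.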
