Martin Novotný, Andy Rupp Ruhr University Bochum

1 Application of FPGA Design: Design Challenges for Implementing Realtime A5/1 Attack with Precomputation Tables
Martin Novotný, Andy Rupp, Ruhr University Bochum

2 Outline
- A5/1 cipher
- Time-Memory Trade-off Tables
  - Original Hellman Approach
  - Distinguished points
  - TMTO with multiple data
  - Rainbow tables
  - Thin-rainbow tables
- Architecture of the A5/1 TMTO engine – table generation
- Implementation results

4 A5/1 Cipher
- A5/1 encrypts GSM communication
- GSM communication is organized in frames; 1 frame = 114 bits in each direction
- A5/1 is a stream cipher: it produces the keystream KS, which is XORed with the plaintext P to form the ciphertext C: C = P ⊕ KS
[Diagram: A5/1 generates KS; P ⊕ KS = C]

5 Architecture of A5/1 Cipher
- 3 linear feedback shift registers (LFSRs)
- The LFSRs are irregularly clocked: a register is clocked iff its clocking bit (yellow) is equal to the majority of all 3 clocking bits
  ⇒ at least 2 registers are clocked in each cycle
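
The majority rule can be sketched in a few lines of Python. The register lengths, feedback taps and clocking-bit positions below follow the widely published description of A5/1; they are not stated on the slides and should be read as assumptions of this sketch, as are all function names.

```python
# Register lengths, feedback taps and clocking-bit positions as in the
# widely published A5/1 description (assumption: not given on the slides).
LENS = (19, 22, 23)
TAPS = ((13, 16, 17, 18), (20, 21), (7, 20, 21, 22))
CLOCK_BITS = (8, 10, 10)


def bit(value, pos):
    """Return bit `pos` of an integer."""
    return (value >> pos) & 1


def majority(a, b, c):
    """Majority of three bits."""
    return (a & b) | (a & c) | (b & c)


def clock_register(reg, length, taps):
    """Shift one LFSR by one position; the XOR of the taps enters at bit 0."""
    feedback = 0
    for tap in taps:
        feedback ^= bit(reg, tap)
    return ((reg << 1) & ((1 << length) - 1)) | feedback


def clock_irregular(regs):
    """Clock only the registers whose clocking bit equals the majority.

    Returns the new register contents and a flag per register telling
    whether it was clocked.
    """
    cbits = [bit(r, c) for r, c in zip(regs, CLOCK_BITS)]
    maj = majority(*cbits)
    new_regs, flags = [], []
    for reg, length, taps, cbit in zip(regs, LENS, TAPS, cbits):
        clocked = cbit == maj
        new_regs.append(clock_register(reg, length, taps) if clocked else reg)
        flags.append(clocked)
    return tuple(new_regs), tuple(flags)
```

Whatever the three clocking bits are, at least two of them agree with the majority, which is why at least 2 registers move in every cycle.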

6 Algorithm of A5/1
1. Reset all 3 registers.
2. (Initialization) Load 64 bits of key K + 22 bits of frame number FN into the 3 registers: K and FN are XORed bit by bit into the least significant bits while the registers are clocked regularly.
3. (Warm-up) Clock for 100 cycles and discard the output; the registers are clocked irregularly.
4. (Execution) Clock for 228 cycles, generating 228 keystream bits (114 for each direction).
5. Repeat for the next frame.
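
The four phases can be put together in a minimal Python sketch. Register lengths, taps and clocking bits again follow the commonly published A5/1 description (not the slides), and the bit order in which K and FN are fed in (least significant bit first) is an assumption of this sketch, so its output should not be compared against official test vectors without checking that convention.

```python
LENS = (19, 22, 23)                                # assumed register lengths
TAPS = ((13, 16, 17, 18), (20, 21), (7, 20, 21, 22))
CLOCK_BITS = (8, 10, 10)


def _shift(reg, idx, in_bit=0):
    """Clock register `idx` once; `in_bit` is XORed into the feedback."""
    feedback = in_bit
    for tap in TAPS[idx]:
        feedback ^= (reg >> tap) & 1
    return ((reg << 1) & ((1 << LENS[idx]) - 1)) | feedback


def _clock_all(regs, in_bit):
    """Regular clocking: all three registers move."""
    return [_shift(r, i, in_bit) for i, r in enumerate(regs)]


def _clock_majority(regs):
    """Irregular clocking: only registers agreeing with the majority move."""
    cbits = [(r >> CLOCK_BITS[i]) & 1 for i, r in enumerate(regs)]
    maj = 1 if sum(cbits) >= 2 else 0
    return [_shift(r, i) if cbits[i] == maj else r for i, r in enumerate(regs)]


def _output(regs):
    """Keystream bit: XOR of the most significant bits of the registers."""
    out = 0
    for i, r in enumerate(regs):
        out ^= (r >> (LENS[i] - 1)) & 1
    return out


def a51_frame_keystream(key, frame_number):
    """Return (114 bits, 114 bits) of keystream for one frame."""
    regs = [0, 0, 0]                               # reset
    for i in range(64):                            # initialization: key K
        regs = _clock_all(regs, (key >> i) & 1)
    for i in range(22):                            # initialization: frame number FN
        regs = _clock_all(regs, (frame_number >> i) & 1)
    for _ in range(100):                           # warm-up, output discarded
        regs = _clock_majority(regs)
    bits = []
    for _ in range(228):                           # execution
        regs = _clock_majority(regs)
        bits.append(_output(regs))
    return bits[:114], bits[114:]
```

Calling `a51_frame_keystream(key, fn)` with integer arguments returns the two 114-bit halves as lists of 0/1 values.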

7 Cryptanalysis of stream ciphers with known plaintext
- From the ciphertext C and the known plaintext P, compute the keystream KS: KS = P ⊕ C
- The keystream KS is a function of:
  - the key K: KS = f(K)
  - the internal state L: KS = g(L) (internal state = content of all registers)
- A5/1 steps: reset all 3 registers; (Initialization) load 64 bits of key K + 22 bits of frame number FN into the 3 registers; (Warm-up) clock for 100 cycles and discard the output; (Execution) clock for 228 cycles and generate the keystream bits (for each direction)
- We can skip Initialization and Warm-up!

8 Cryptanalysis of A5/1
- For the (known) keystream KS, find the internal state L
- Once L is found, track the A5/1 cipher back through the Warm-up phase and Initialization to get the key K
[Diagram: the internal state L marked in the A5/1 algorithm: Reset, Initialization, Warm-up, Execution]

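9 Cryptanalysis of A5/1
- The internal state L has 64 bits ⇒ we need (at least) 64 bits of keystream KS
- One A5/1 frame has 114 bits ⇒ we can make 114 − 64 + 1 = 51 overlapping samples KS_i
- It is sufficient to find any L_i
[Diagram: samples KS_0, KS_1, KS_2, KS_3 with the corresponding internal states L_0, L_1, L_2, L_3]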

11 Two extreme approaches
Brute force attack
- Check all combinations of a key K online
- time T = N = 2^k
- memory M = 1
Table lookup
- (For a given plaintext P) all key-ciphertext pairs {K_i, C_i} are precomputed and stored (sorted by C)
- Online phase: look up C in the table (and find K)
- time T = 1
- memory M = N = 2^k

12 Time-Memory Trade-Off (Hellman, 1980)
- A compromise between the two extreme approaches above
- Precomputation phase: for a given plaintext P, precompute (ideally all) key-ciphertext pairs {K_i, C_i}; store only some of them in the table
- Online phase: perform some computations, look up the table and find the key K
- time T = N^(2/3)
- memory M = N^(2/3)

13 Precomputation (offline) phase
- Idea: the encryption function E is a pseudo-random function, C = E_K(P)
- Pairs {K_i, C_i} are organized in chains: C_i is used to create the key K_{i+1} for the next step
- E is pseudo-random ⇒ we perform a pseudo-random walk in the keyspace
- The plaintext P is the same in every step
- R: reduction function (DES: C has 64 bits, K has 56 bits)
- f: step function, f(x) = R(E_x(P))
[Diagram: a chain from Start Point SP = 1234 to End Point EP = 8EC0; each step encrypts P under the current key and reduces the ciphertext to the next key]

14 Precomputation (offline) phase
Example chain (the plaintext P is the same in every step):
Start Point SP = K_1 = 1234 → K_2 = 7A3D → K_3 = 28DF → … → K_t = B05B → End Point EP = 8EC0
Each arrow is one application of the step function f(x) = R(E_x(P)): encrypt P under the current key (C_i = E_{K_i}(P)), then apply the reduction function R (DES: C has 64 bits, K has 56 bits) to obtain the next key.

15 Precomputation (offline) phase
Each arrow is one application of f:
1234 = SP_1 = k_{1,0} → k_{1,1} → k_{1,2} → … → k_{1,t-1} → k_{1,t} = EP_1 = 8EC0
1235 = SP_2 = k_{2,0} → k_{2,1} → k_{2,2} → … → k_{2,t-1} → k_{2,t} = EP_2 = 2A1B
1236 = SP_3 = k_{3,0} → k_{3,1} → k_{3,2} → … → k_{3,t-1} → k_{3,t} = EP_3 = 4D3C
…
9999 = SP_m = k_{m,0} → k_{m,1} → k_{m,2} → … → k_{m,t-1} → k_{m,t} = EP_m = 02E3
- m chains with a fixed length t are generated
- Only the pairs {SP_i, EP_i} are stored (sorted by EP) ⇒ reduced memory requirements

16-24 Online phase
Given C (and P), try to find K such that C = E_K(P):
1. Compute y_1 = R(C) and look it up among the stored End Points (y_1 = EP_i ?).
2. If there is no match, compute y_2 = f(y_1) and look up again; continue with y_3 = f(y_2), y_4 = f(y_3), …
3. When some y_n matches an End Point EP_i, take the corresponding Start Point SP_i and recompute the chain with f until the key K is reached, i.e. the point for which E_K(P) = C.
[Animation over slides 16-24: the lookup is repeated for y_1, y_2, y_3, y_4; the match at y_4 leads via SP_i to K]
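
The online search can be sketched as follows. The toy cipher E (built from SHA-256), the 16-bit keyspace and the function names are assumptions made for illustration only; the block repeats a small table generator so that it runs on its own.

```python
import hashlib

KEY_BITS = 16                      # toy keyspace (assumption)
MASK = (1 << KEY_BITS) - 1
P = b"known plaintext"


def E(key, plaintext):
    """Toy stand-in for a cipher with a 64-bit output."""
    digest = hashlib.sha256(key.to_bytes(2, "big") + plaintext).digest()
    return int.from_bytes(digest[:8], "big")


def R(ciphertext):
    return ciphertext & MASK


def f(key):
    return R(E(key, P))


def build_table(m, t):
    """m chains of length t, stored as {EP: SP} (first chain wins on merges)."""
    table = {}
    for sp in range(m):
        k = sp
        for _ in range(t):
            k = f(k)
        table.setdefault(k, sp)
    return table


def online_search(c, table, t):
    """Return a key K with E_K(P) = c, or None if the table does not cover it."""
    y = R(c)                                   # y_1
    for i in range(t):
        sp = table.get(y)                      # lookup: y = EP_i ?
        if sp is not None:
            k = sp
            for _ in range(t - 1 - i):         # walk from SP_i to the candidate
                k = f(k)
            if E(k, P) == c:                   # rule out false alarms
                return k
        y = f(y)                               # next y
    return None
```

The final check `E(k, P) == c` matters because R is not injective: an End Point match may come from a merged chain that does not contain the key (a false alarm), in which case the search simply continues.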

25 Birthday paradox problem
- m chains of fixed length t are generated
- R is not bijective ⇒ some k_{i,j} collide
- Collisions result in chain merges or in cycles within chains
- Matrix stopping rule: Hellman showed that it is not worthwhile to increase the number of chains m or the chain length t beyond the point at which m × t^2 = N (beyond it, the coverage of the keyspace grows only marginally)

26 Birthday paradox problem
- Matrix stopping rule: m × t^2 = N
- Recommendation: use r tables, each with a different reduction (re-randomization) function R
- Since also N = m × t × r, it follows that r = t
- Hellman recommends m = t = r = N^(1/3)
[Diagram: one table with rows SP_1 … EP_1, SP_2 … EP_2, SP_3 … EP_3, …, SP_200 … EP_200]

27 Hellman TMTO – Complexity
Precomputation phase
- Precomputation time PT = m × t × r = N (e.g. 2^60)
- Memory M = m × r = N^(2/3) (e.g. 2^40)
Online phase
- Memory M = N^(2/3)
- Online time T = t × r = t^2 = N^(2/3) (e.g. 2^40)
- Table accesses TA = T = N^(2/3) (e.g. 2^40)

28 Hellman TMTO – Complexity
Precomputation phase
- Precomputation time PT = m × t × r = N (e.g. 2^60)
- Memory M = m × r = N^(2/3) (e.g. 2^40)
Online phase
- Memory M = N^(2/3)
- Online time T = t × r = t^2 = N^(2/3) (e.g. 2^40)
- Table accesses TA = T = N^(2/3) (e.g. 2^40) ⇒ about 34 years (1 disk access ≈ 1 ms)
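
The figures on this slide can be rechecked with a few lines of arithmetic. The sketch assumes N = 2^60 and m = t = r = N^(1/3) as in the example, and a year of 365.25 days; the variable names are illustrative.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600

key_bits = 60
N = 2 ** key_bits
m = t = r = 2 ** (key_bits // 3)        # m = t = r = N^(1/3) = 2^20

precomputation_time = m * t * r         # PT = N = 2^60 step functions
memory = m * r                          # M = N^(2/3) = 2^40 stored pairs
online_time = t * r                     # T = N^(2/3) = 2^40 step functions
table_accesses = online_time            # TA = T

# One disk access of about 1 ms per table access:
years_of_disk_access = table_accesses * 1e-3 / SECONDS_PER_YEAR
print(f"TA = 2^{table_accesses.bit_length() - 1}, "
      f"about {years_of_disk_access:.1f} years of disk accesses")
```

The result is a little under 35 years, which matches the 34 years quoted on the slide and shows why the number of table accesses, not the computation, is the bottleneck of the original Hellman method.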

30 Distinguished points (DP) (Rivest, ????)
- A slight modification of the original Hellman method
- Goal: to reduce the number of table accesses TA (in Hellman, TA = N^(2/3))
- A distinguished point is a point with a certain property (e.g. its 20 most significant bits are equal to 0)

31 Distinguished Points (DP): Precomputation phase
- Chains are generated until a distinguished point (DP) is reached
  - if the chain exceeds the maximum length t_max, it is discarded and the next chain is generated
  - the chain is also discarded if the DP has been reached but the chain is shorter than t_min (to get better coverage)
- Triples {SP_j, EP_j, l_j} are stored, sorted by EP (l_j is the length of the chain)
Each arrow is one application of f:
1234 = SP_1 = k_{1,0} → k_{1,1} → … … … … … … … → k_{1,u} = EP_1 = 0EC0
1235 = SP_2 = k_{2,0} → k_{2,1} → … … → k_{2,v} = EP_2 = 0A1B
1236 = SP_3 = k_{3,0} → k_{3,1} → … … … … … → k_{3,w} = EP_3 = 043C
…
9999 = SP_m = k_{m,0} → k_{m,1} → … … … → k_{m,z} = EP_m = 02E3
- Chains have different lengths
- End Points are DPs
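
Chain generation with distinguished points can be sketched as below. The toy step function (SHA-256 truncated to 16 bits), the DP criterion of d = 4 leading zero bits and all names are assumptions chosen to keep the example small.

```python
import hashlib

KEY_BITS = 16                      # toy keyspace (assumption)
MASK = (1 << KEY_BITS) - 1
P = b"known plaintext"


def f(key):
    """Toy step function: 'encrypt' P under key, then reduce to 16 bits."""
    digest = hashlib.sha256(key.to_bytes(2, "big") + P).digest()
    return int.from_bytes(digest[:8], "big") & MASK


def is_dp(x, d=4):
    """Distinguished point: the d most significant bits are 0."""
    return x >> (KEY_BITS - d) == 0


def dp_chain(sp, t_min, t_max):
    """Walk from sp until a DP is hit; return (SP, EP, length) or None."""
    k = sp
    for length in range(1, t_max + 1):
        k = f(k)
        if is_dp(k):
            # Chains that are too short are discarded as well.
            return (sp, k, length) if length >= t_min else None
    return None                    # t_max exceeded: discard the chain


def build_dp_table(start_points, t_min, t_max):
    """Store {EP: (SP, length)}; the dict stands in for sorting by EP."""
    table = {}
    for sp in start_points:
        triple = dp_chain(sp, t_min, t_max)
        if triple is not None:
            _, ep, length = triple
            table.setdefault(ep, (sp, length))
    return table
```

A call such as `build_dp_table(range(200), t_min=4, t_max=64)` yields chains of varying length whose End Points all satisfy the DP criterion.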

32 Distinguished Points (DP): Online phase
- There is 1 distinguished point per chain, the End Point (End Point = Distinguished Point)
- Algorithm:
  - compute y_{i+1} = f(y_i) iteratively until a DP is reached (or the maximum length t_max is exceeded)
  - then perform the lookup (just once per table)
  - if t_max is exceeded, do not look up at all
- Advantages:
  - Table accesses TA = r = N^(1/3) (cf. TA = t × r = N^(2/3) in original Hellman)
  - Chain loops are not possible
[Diagram: C → R → y_1 → f → y_2 → y_3 → y_4 = 043C, a DP; lookup 043C = EP_i ?; recompute from SP_i with f to reach K]

34 TMTO with multiple data (Biryukov & Shamir, 2000)
- Important for stream ciphers: to reveal an internal state L_i of k bits, we need only k bits of keystream KS_i
- Having D data samples of the ciphertext C (or the keystream KS), we have D times more chances to find the key K (or the internal state L)
  ⇒ we calculate only r/D tables
  ⇒ we reduce the precomputation time PT and the memory M
- However, the online time T and the number of table accesses TA remain unchanged
[Diagram: samples KS_0, KS_1, KS_2, KS_3 with the corresponding internal states L_0, L_1, L_2, L_3]

35 TMTO with multiple data: A5/1
- 1 frame: 114 bits
- Internal state: 64 bits
  ⇒ 114 − 64 + 1 = 51 data samples from 1 frame (each sample has 64 bits), D = 51
- We calculate D times fewer tables (⇒ save memory, save time)
[Diagram: samples KS_0, KS_1, KS_2, KS_3 with the corresponding internal states L_0, L_1, L_2, L_3]
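
The 51 samples are simply all 64-bit windows of the 114-bit frame keystream. A minimal sketch (the function name and the list-of-bits representation are assumptions):

```python
def keystream_samples(frame_bits, state_bits=64):
    """Return every window of `state_bits` consecutive keystream bits."""
    count = len(frame_bits) - state_bits + 1       # 114 - 64 + 1 = 51
    return [frame_bits[i:i + state_bits] for i in range(count)]


# Example with a dummy 114-bit frame:
frame = [(i * 7) % 2 for i in range(114)]
samples = keystream_samples(frame)
print(len(samples))                                # number of samples D
```

Sample KS_i starts i bits into the frame, so it corresponds to the internal state L_i reached after i clock cycles of the Execution phase.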

37 Rainbow tables (Oechslin, 2003)
- Idea: use a different reduction/re-randomization function R_i in each step of chain generation, so the step functions are f_1, f_2, f_3, …, f_{t-1}, f_t
[Diagram: SP_j = k_{j,0} → f_1 → k_{j,1} → f_2 → k_{j,2} → … → k_{j,t-1} → f_t → k_{j,t} = EP_j, where f_i applies E (with plaintext P) followed by R_i]
- Online phase:
  - Compute y_1 = R_t(C), compare with the EPs; if no match, then
  - Compute y_2 = f_t(R_{t-1}(C)), compare with the EPs; if no match, then
  - Compute y_3 = f_t(f_{t-1}(R_{t-2}(C))), compare with the EPs; if no match, then …
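
A toy rainbow table makes the online formulas concrete. The SHA-256-based cipher stand-in, the reduction family R_i(c) = (c + i) mod 2^16 and the function names are assumptions of this sketch.

```python
import hashlib

KEY_BITS = 16                      # toy keyspace (assumption)
MASK = (1 << KEY_BITS) - 1
P = b"known plaintext"


def E(key, plaintext):
    """Toy stand-in for a cipher with a 64-bit output."""
    digest = hashlib.sha256(key.to_bytes(2, "big") + plaintext).digest()
    return int.from_bytes(digest[:8], "big")


def R(i, ciphertext):
    """i-th reduction function (a simple illustrative family)."""
    return (ciphertext + i) & MASK


def f(i, key):
    """i-th step function f_i(x) = R_i(E_x(P))."""
    return R(i, E(key, P))


def build_rainbow(m, t):
    """Chains k_0 = SP, k_i = f_i(k_{i-1}), i = 1..t; store {EP: SP}."""
    table = {}
    for sp in range(m):
        k = sp
        for i in range(1, t + 1):
            k = f(i, k)
        table.setdefault(k, sp)
    return table


def rainbow_search(c, table, t):
    """Try every chain position, starting from the last column."""
    for pos in range(t, 0, -1):            # guess: key is k_{pos-1}
        y = R(pos, c)                      # pos = t gives y_1 = R_t(C)
        for i in range(pos + 1, t + 1):
            y = f(i, y)                    # finish the chain with f_{pos+1}..f_t
        sp = table.get(y)
        if sp is not None:
            k = sp
            for i in range(1, pos):        # rebuild k_{pos-1} from SP
                k = f(i, k)
            if E(k, P) == c:               # rule out false alarms
                return k
    return None
```

Guessing the position from the end means the first attempts cost 0, 1, 2, … step functions, which is where the roughly halved online time compared with t separate Hellman tables comes from.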

38 Rainbow tables
- Just one table (or only a few tables) is generated: m = N^(2/3) (t reduction functions are used ⇒ the table can be t times longer), t = N^(1/3)
- Advantages:
  - chain loops are impossible
  - point collisions lead to chain merges only if the equal points appear in the same position of the chain
  - online time T is about ½ of the online time of original Hellman (for single data)
  - the number of table accesses is the same as for the Hellman+DP method (for single data)
- Disadvantages:
  - inferior to the Hellman+DP method in the case of multiple data (D > 1): the online time T and the number of table accesses TA are D times greater

40 Thin-rainbow tables
- A way to cope with rainbow tables when multiple data are available
- The sequence of S different reduction functions f_i is applied k times periodically to create a chain:
  f_1 f_2 f_3 … f_{S-1} f_S (1st) | f_1 f_2 f_3 … f_{S-1} f_S (2nd) | … | f_1 f_2 f_3 … f_{S-1} f_S (k-th)
- Chain length t = S × k

41 Thin-rainbow tables + DP (to reduce the number of table accesses TA)
- The DP criterion is checked after each f_S:
  f_1 f_2 f_3 … f_{S-1} f_S (1st, DP?) | f_1 f_2 f_3 … f_{S-1} f_S (2nd, DP?) | … | f_1 f_2 f_3 … f_{S-1} f_S (k-th, DP?)
- We store only chains for which k_min < k < k_max
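
A thin-rainbow chain with the DP check after every full sequence can be sketched as follows. The toy step functions (SHA-256 plus a reduction R_i(c) = (c + i) mod 2^16), the parameter values and the function names are assumptions for illustration.

```python
import hashlib

KEY_BITS = 16                      # toy state size (assumption)
MASK = (1 << KEY_BITS) - 1
P = b"known plaintext"


def f(i, x):
    """i-th step function: toy 'encryption' followed by reduction R_i."""
    digest = hashlib.sha256(x.to_bytes(2, "big") + P).digest()
    return (int.from_bytes(digest[:8], "big") + i) & MASK


def is_dp(x, d):
    """DP criterion: the d most significant bits are 0."""
    return x >> (KEY_BITS - d) == 0


def thin_rainbow_dp_chain(sp, S, d, k_min, k_max):
    """Apply f_1..f_S repeatedly; check the DP criterion only after f_S.

    Returns (SP, EP, k) if a DP is reached with k_min < k < k_max,
    otherwise None (the chain is discarded).
    """
    x = sp
    for k in range(1, k_max):              # k-th rainbow sequence
        for i in range(1, S + 1):
            x = f(i, x)
        if is_dp(x, d):
            return (sp, x, k) if k > k_min else None
    return None
```

Because the DP criterion is evaluated once per S step functions rather than after every step, one checker can be shared by many chain generators, which matches the hardware argument made for this variant.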

42 Candidates for implementation (in the case of multiple data, D > 1)
Hellman + DP
- DP criterion checked after each step function f
Thin-rainbow + DP
- DP criterion checked after f_S only
- simpler HW, better time/area product
Both have the same precomputation complexity
Both have comparable online time T and number of table accesses TA

45 Implementation choices
Pipeline? Array of small computing elements?

46 Slice – the FPGA's Basic Building Block
- A slice contains two look-up tables (LUTs), each followed by a flip-flop
- The look-up table implements combinational logic (any logic function of 4 variables)
- It is a 16x1 RAM (it holds the truth table)

47 LUT – configuration choices
The look-up table can be configured as:
- LUT (function generator)
- RAM 16x1
- SRL16 (up to 16-bit shift register)

48 Implementation choices
Pipeline?
- All A5/1 bits would have to be accessible in parallel
- max. 240 A5/1 units (64x FF/unit) (and no control unit, …)
Array of small computing elements?
- LFSRs can be implemented using SRL16 (1 LUT configured as an up to 16-bit shift register)
- max. 480 A5/1 units (8x SRL16 + 5x FF/unit) (enough FFs for the control unit, …)

49 A5/1 TMTO basic element Calculates one chain Two-stroke mode:
core #1 generates keystream, core #2 is loaded core #2 generates keystream, core #1 is loaded

50 A5/1 TMTO basic element
First, the start point SP_j is loaded to core #1

51 A5/1 TMTO basic element
… then one rainbow period f_1 f_2 f_3 … f_{S-1} f_S is performed …
In odd steps: core #1 generates keystream … that is re-randomized … and loaded to core #2 as a new internal state

52 A5/1 TMTO basic element
… then one rainbow period f_1 f_2 f_3 … f_{S-1} f_S is performed …
In even steps: core #2 generates keystream … that is re-randomized … and loaded to core #1 as a new internal state

53 A5/1 TMTO basic element After application of fS:
the result is shifted out to check the DP-criterion

54 A5/1 TMTO engine – table generation (in 1 FPGA)
- 234 TMTO elements ⇒ 234 chains computed in parallel in a Spartan FPGA
- The TMTO elements share the DP-checker

55-74 A5/1 TMTO engine – table generation (in 1 FPGA)
[Animation over slides 55-74, shown for three TMTO elements:
- Loading Startpoints: 1234, 1235, 1236 are loaded one after another
- 1st Rainbow Sequence: the three states are updated step by step (7A3D 802B 41C3, 27B3 4C81 05A1, …) and end at 1A56, 02BA, 8ACD
- Evaluation (DP checking): the results 1A56, 02BA, 8ACD are shifted through the shared DP-checker; 02BA is shown together with its start point 1235, and a new start point 1237 appears
- 2nd Rainbow Sequence …]

75 Data error detection
- Data MUST be correct
- Errors may appear during the data transfer via the COPA bus (120 FPGAs sharing the same bus)
- Hamming code (72, 64)
  - TED (triple error detection)
  - detects 99.19% of quadruple errors
  - (also detects all errors of 5, 6, 7, 9, 10, 11, … bits)
- If an error appears, the data are discarded
[Diagram: Hamming encoding → COPA bus]

77 Implementation results
COPACOBANA is able to perform up to 2^36 (~69 billion) step functions f_i per second:
- 234 TMTO elements/FPGA
- 120 FPGAs
- maximum frequency f_max = 156 MHz
- one step function takes 64 clock cycles
- 234 × 120 × 156·10^6 / 64 ≈ 2^36
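
The throughput estimate can be rechecked directly from the four numbers on the slide (the variable names are illustrative):

```python
import math

TMTO_ELEMENTS_PER_FPGA = 234
FPGAS = 120
F_MAX_HZ = 156 * 10**6             # 156 MHz
CYCLES_PER_STEP = 64               # one step function takes 64 clock cycles

steps_per_second = TMTO_ELEMENTS_PER_FPGA * FPGAS * F_MAX_HZ / CYCLES_PER_STEP
log2_steps = math.log2(steps_per_second)

print(f"{steps_per_second:.4g} step functions/s = 2^{log2_steps:.2f}")
```

The product comes out slightly below 2^36, so the rounded figure on the slide is an upper estimate at the maximum clock frequency.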

78 Parameter choices (# data samples: D = 64)
Columns: chains computed m | rainbow sequence S | DP criterion d [bits] | # seq. in chain k | precomp. time PT [days] | disk usage DU [TB] | online time OT [s] | table accesses TA | success ratio SR
2^41 2^15 5 [2^3, 2^6] 337.5 7.49 27.8 2^21 0.86
2^39 [2^3, 2^7] 95.4 3.25 36.3 0.67
2^40 2^14 [2^4, 2^7] 4.85 10.9 2^20 0.63
84.4 7.04 7.0 0.60
3.48 [2^4, 2^6] 5.06 8.5 0.55
2^37 6 [2^4, 2^8] 47.7 0.79 73.5 0.42

