


1 Improving Cryptographic Architectures by Adopting Efficient Adders in their Modular Multiplication Hardware
Adnan Gutub, Hassan Tahhan
Computer Engineering Department, King Fahd University of Petroleum & Minerals
Outline: Modular Multiplication, Interleaving, Montgomery, High-Radix, Comparison, Improvement, Adders, CLA, CSK, Comparison, Conclusion

2 Modular Multiplication Operation
In many public-key encryption schemes (e.g., RSA, ElGamal, and ECC), modular multiplication is a basic arithmetic operation that is heavily used.
Modular multiplication: C = A * B mod M, where A, B < M.
A secure system requires very large operand sizes, which makes the operation too expensive.
Straightforward method: multiplication, then modulus division.
[Figure: modular multiplier block with inputs A, B, M and output C]
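To make the cost of the straightforward method concrete, here is a minimal Python sketch. The function name and the sample operands are illustrative assumptions, not taken from the slides. It forms the full product first, so the intermediate value can be roughly twice as wide as the operands before the modulus division reduces it.

```python
def modmul_straightforward(a, b, m):
    """Multiply first, then reduce: C = A * B mod M, assuming A, B < M."""
    product = a * b      # intermediate may need up to 2n bits for n-bit operands
    return product % m   # modulus division over the full-width product


# Hypothetical 8-bit operands, chosen only for illustration.
a, b, m = 200, 150, 251
print(modmul_straightforward(a, b, m), (a * b).bit_length())
```

The second printed value is the bit length of the intermediate product, which is what a hardware implementation of this method would have to store and divide.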

3 Interleaving Multiplication and Reduction
In 1983, Blakley: P_i = 2 P_{i-1} + b_i A + q M
The literature contains several proposals to solve the magnitude comparison problem. Koc’s implementation is based on carry-save adders: partial products are represented as sum-carry pairs, and the 5 MSBs of the pair are tested for sign estimation.
P = 0
for i = n-1 to 0 {
    P = 2 * P
    if ( P ≥ M ) P = P – M
    if ( b_i = 1 ) {
        P = P + A
        if ( P ≥ M ) P = P – M
    }
}
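The interleaved loop above can be sketched in Python as follows. This is a plain-integer model under the assumption that A, B < M < 2^n. The function name is illustrative, and it uses full magnitude comparisons rather than the carry-save sign estimation used in Koc’s hardware.

```python
def interleaved_modmul(a, b, m, n):
    """Return a * b mod m by scanning the n bits of b from MSB to LSB.

    Assumes a, b < m < 2**n, so a single conditional subtraction
    is enough after each doubling and after each addition.
    """
    p = 0
    for i in range(n - 1, -1, -1):
        p = 2 * p              # shift the partial product
        if p >= m:
            p -= m             # reduce after doubling
        if (b >> i) & 1:
            p += a             # add the multiplicand when bit b_i is set
            if p >= m:
                p -= m         # reduce after adding
    return p


print(interleaved_modmul(7, 9, 11, 4))
```

Because P stays below M after every step, the intermediate values never exceed 2M, unlike the straightforward multiply-then-divide approach.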

4 Montgomery’s Method
In 1985, Montgomery: P_i = (P_{i-1} + b_i A + q M) / 2
No full magnitude comparison is required. The correction step can be easily removed. However, pre- and post-calculations are needed to obtain the required result. As in the interleaving method, implementations based on carry-save adders are the most effective solutions.
P’ = 0
for i = 0 to n-1 {
    P’ = P’ + a’_i * B’
    if ( p’_0 = 1 ) P’ = P’ + M
    P’ = P’ / 2
}
if ( P’ ≥ M ) P’ = P’ - M
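A minimal Python sketch of the radix-2 loop above, together with the pre- and post-calculations it needs. The function names are illustrative. The sketch assumes M is odd and M < 2^n, and it does the domain conversions with ordinary modular arithmetic purely for clarity.

```python
def montgomery_product(a_bar, b_bar, m, n):
    """Return a_bar * b_bar * 2**(-n) mod m for odd m, with inputs < m < 2**n."""
    p = 0
    for i in range(n):
        if (a_bar >> i) & 1:
            p += b_bar         # add B' when bit a'_i is set
        if p & 1:
            p += m             # make P' even so the halving is exact
        p >>= 1                # divide by 2
    if p >= m:
        p -= m                 # final correction step
    return p


def montgomery_modmul(a, b, m, n):
    """Return a * b mod m using the Montgomery product above."""
    r = 1 << n
    a_bar = (a * r) % m        # pre-calculation: move operands into Montgomery's domain
    b_bar = (b * r) % m
    c_bar = montgomery_product(a_bar, b_bar, m, n)
    return montgomery_product(c_bar, 1, m, n)   # post-calculation: back to the ordinary domain


print(montgomery_modmul(7, 9, 11, 4))
```

The loop only inspects the least significant bit of P’, which is why no magnitude comparison is needed inside the iterations.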

5 High-Radix Method
Speeds up the modular multiplier by requiring fewer cycles. Area and time will increase.
The reduction step will be the crucial operation. As the radix increases, it becomes more complex.
Walter shows that there is a direct trade-off between the required space and the overall computation time. The area-time (AT) factor is independent of the choice of the radix. The factor is expected to improve for radices that are not much larger than radix-2.
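To illustrate the trade-off, the sketch below generalizes the radix-2 Montgomery loop to radix 2^k. It is a hedged software model, not the hardware discussed in the slides. The function name and the parameter k are assumptions, and `pow(m, -1, r)` requires Python 3.8 or later. Each iteration consumes k bits of A, so the loop runs about n/k times, but the quotient digit q now requires a multiplication instead of a single-bit test.

```python
def montgomery_radix_2k(a, b, m, n, k):
    """Return a * b * 2**(-k * ceil(n / k)) mod m for odd m, with a, b < m < 2**n."""
    r = 1 << k
    m_prime = (-pow(m, -1, r)) % r       # -M^{-1} mod 2^k, precomputed once
    p = 0
    iterations = 0
    for i in range(0, n, k):
        a_digit = (a >> i) & (r - 1)     # next k-bit digit of A
        p += a_digit * b
        q = (p * m_prime) & (r - 1)      # quotient digit: the costlier reduction step
        p = (p + q * m) >> k             # exact division by 2^k
        iterations += 1
    if p >= m:
        p -= m
    return p, iterations


print(montgomery_radix_2k(7, 9, 11, 4, 2))
```

With k = 1 this reduces to the radix-2 loop, where q is simply the least significant bit of P.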

6 Comparison Between [6] and [18]
Equation
  Koc [6]: (S, C) = 2S + 2C + a_i B + qM, with q ∈ {1, 0, -1}
  Montgomery [18]: (S, C) = S + C + a_i B, then (S, C) = (S + C + s_0 M) / 2
Hardware
  [Figure: hardware block diagrams]
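The Montgomery sum-carry update can be modeled in Python as shown below. This is a sketch under stated assumptions: the helper names are illustrative, Python integers stand in for fixed-width registers, and the loop runs n iterations with a final correction rather than the n + 2 iterations of the implementation in [18].

```python
def carry_save_add(x, y, z):
    """3:2 compression: return (sum, carry) with sum + carry == x + y + z."""
    s = x ^ y ^ z                              # bitwise sum without carry propagation
    c = ((x & y) | (x & z) | (y & z)) << 1     # majority bits, shifted to their weight
    return s, c


def montgomery_carry_save(a, b, m, n):
    """Return a * b * 2**(-n) mod m for odd m, keeping P as a sum-carry pair."""
    s, c = 0, 0
    for i in range(n):
        a_i = (a >> i) & 1
        s, c = carry_save_add(s, c, a_i * b)   # (S, C) = S + C + a_i B
        s0 = s & 1                             # carry word has LSB 0, so s0 is the parity
        s, c = carry_save_add(s, c, s0 * m)    # (S, C) = S + C + s_0 M
        s, c = s >> 1, c >> 1                  # both words are even: exact halving
    p = s + c                                  # single full-length addition at the end
    if p >= m:
        p -= m
    return p


print(montgomery_carry_save(2, 1, 11, 4))
```

Only the last line of the loop-free tail performs a carry-propagating addition, which is where a fast binary adder matters.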

7 Comparison Between [6] and [18]: Algorithmic Analysis
Pre-calculations
  Koc [6]: The two’s complement of the modulus needs to be computed.
  Montgomery [18]: Transformation of operands into Montgomery’s domain.
Inter-calculations
  Koc [6]: n + 3 iterations
  Montgomery [18]: n + 2 iterations
Post-calculations
  Koc [6]: There is a correction step in addition to the final summation of the sum-carry pair.
  Montgomery [18]: The summation of the sum-carry pair needs to be transformed back to the ordinary domain.
Restrictions
  Koc [6]: If M is represented using n bits, then |M| ≥ 2^(n-1).
  Montgomery [18]: GCD(M, 2) = 1

8 Comparison Between [6] and [18]: Hardware and Synthesis Analysis
Hardware analysis
  Logic
    Koc [6]: Two (n+4)-bit carry-save adders plus 5-bit carry-lookahead logic
    Montgomery [18]: Two n-bit carry-save adders
  Registers
    Koc [6]: 6
    Montgomery [18]: 5
Synthesis analysis
  Clock period
    Koc [6]: 6.468 ns
    Montgomery [18]: 6.342 ns

9 Improvements on [6]
Pipelining: Due to data dependency, pipelining will not improve the throughput. However, the pipeline can be used to compute two separate operations simultaneously.

10 Improvements on [6]
Parallelism: The correction step at the end of the algorithm increases the algorithm complexity. At the hardware level, the correction step can be implemented using two options. By computing the two possible results in parallel, time will be saved.
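The idea of computing both possible results and selecting one can be sketched in Python as below. The function name is hypothetical, and the sketch assumes the sum-carry pair satisfies 0 ≤ S + C < 2M. In software the two candidates are evaluated one after the other; in hardware they would be produced by separate adders at the same time, with the sign of the second acting as the select signal.

```python
def final_result_parallel(s, c, m):
    """Pick between S + C and S + C - M, each formed independently of the other."""
    candidate = s + c          # result if no correction is needed
    corrected = s + c - m      # result if the correction is needed
    # The sign of `corrected` plays the role of a multiplexer select line.
    return corrected if corrected >= 0 else candidate


print(final_result_parallel(9, 6, 11))
```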

11 Binary Adders
The last stage in both algorithms performs a full-length addition on the carry-sum pair, which can be done in hardware with binary adders.
Statistics showed that 72% of the instructions perform additions in the data path of a prototypical RISC machine.
The carry-lookahead adder and the carry-skip adder were compared in terms of time, area, and power.

12 Carry-Lookahead Adder (CLA)
The total delay of the carry-lookahead adder is Θ(log n). There is a penalty paid for this gain: the area increases. The carry-lookahead adder requires Θ(n log n) area.
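The logarithmic delay can be illustrated with a bit-level Python model. This sketch assumes a parallel-prefix (Kogge-Stone-style) formulation of carry lookahead, which the slides do not specify. The function name is illustrative. Each pass of the `while` loop corresponds to one level of lookahead logic, and every level touches all n bit positions, which is consistent with the n log n area figure.

```python
def carry_lookahead_add(a, b, n):
    """Add two n-bit values; return (sum, number_of_prefix_levels)."""
    g = [(a >> i) & (b >> i) & 1 for i in range(n)]     # generate bits
    p = [((a >> i) ^ (b >> i)) & 1 for i in range(n)]   # propagate bits
    G, P = g[:], p[:]
    levels, d = 0, 1
    while d < n:
        # Combine each position with the group ending d positions below it.
        G = [G[i] | (P[i] & G[i - d]) if i >= d else G[i] for i in range(n)]
        P = [P[i] & P[i - d] if i >= d else P[i] for i in range(n)]
        d *= 2
        levels += 1
    carries = [0] + G[:-1]                              # carry into each bit (carry-in is 0)
    total = sum((p[i] ^ carries[i]) << i for i in range(n))
    total |= G[n - 1] << n                              # carry out of the top bit
    return total, levels


print(carry_lookahead_add(0b1011, 0b0110, 4))
```

Doubling the operand width adds only one more level to the loop, in contrast to a ripple-carry chain whose length grows linearly.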

13 Carry-Skip Adder (CSK)
The carry-skip adder has a simple and regular structure that requires an area on the order of Θ(n), which is hardly larger than the area required by the ripple-carry adder.
The time complexity of the carry-skip adder is bounded between Θ(n^(1/2)) and Θ(log n). An equal-block-size one-level carry-skip adder has a time complexity of Θ(n^(1/2)). However, a more optimized multi-level carry-skip adder has a time complexity of O(log n).
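A one-level, equal-block-size carry-skip adder can be modeled in Python as follows. Both functions are illustrative sketches. The delay formula in particular is a crude unit-delay assumption (one unit per rippled bit and one per skipped block), not a figure from the slides.

```python
def carry_skip_add(a, b, n, k):
    """Add two n-bit values using blocks of k bits with a skip path per block."""
    carry, total = 0, 0
    for lo in range(0, n, k):
        width = min(k, n - lo)
        mask = (1 << width) - 1
        x, y = (a >> lo) & mask, (b >> lo) & mask
        block_propagate = (x ^ y) == mask      # every bit in the block propagates
        s = x + y + carry                      # stands in for the block's ripple chain
        total |= (s & mask) << lo
        ripple_carry_out = s >> width
        # Skip multiplexer: bypass the ripple chain when the whole block propagates.
        carry = carry if block_propagate else ripple_carry_out
    return total | (carry << n)


def worst_case_delay(n, k):
    """Assumed model: ripple through first and last blocks, skip the rest."""
    return 2 * k + n // k - 2


print(carry_skip_add(0b101101, 0b010011, 6, 2), worst_case_delay(64, 8))
```

Under this assumed model the delay 2k + n/k is minimized when k is close to the square root of n/2, which gives the square-root growth quoted for the one-level adder.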

14 CLA versus CSK
Using 32-bit operands, a multi-level carry-skip adder was 14% faster, and its power dissipation was 58% of that of the carry-lookahead adder.
Using 64-bit operands, a one-level carry-skip adder was 38% slower, and its power consumption was 68% of that of the carry-lookahead adder.

15 Conclusion
This work studied the modular multiplication problem over large operand sizes.
Based on a survey, two implementations of modular multiplication algorithms were modeled using VHDL and synthesized.
A time-area analysis of both implementations showed that Koc’s implementation has the potential to be an effective solution in terms of time and hardware requirements. This implementation was improved further.
Carry-save adders give the maximum speedup in computing the partial products. However, a full-length addition on the sum-carry pair needs to be carried out at the last iteration through a dedicated binary adder.
Two binary adders were studied: the CLA and the CSK. Although the two adders can be of comparable speed, the CSK requires a smaller area and consumes much less power than the CLA.

16 Thank you
The End

