Improving Cryptographic Architectures by Adopting Efficient Adders in their Modular Multiplication Hardware

Presentation transcript:

Improving Cryptographic Architectures by Adopting Efficient Adders in their Modular Multiplication Hardware
Adnan Gutub, Hassan Tahhan
Computer Engineering Department, King Fahd University of Petroleum & Minerals
Outline: Modular Multiplication (M.), Interleaving, Montgomery, High-Radix, Comparison, Improvement, Adders, CLA, CSK, Comparison, Conclusion

Modular Multiplication Operation
In many public-key encryption schemes (e.g., RSA, ElGamal and ECC), modular multiplication is a basic arithmetic operation that is heavily used.
Modular multiplication: C = A * B mod M, where A, B < M.
A secure system needs a very large operand size, which makes the operation too expensive.
Straightforward method: multiplication, then modulus division.
[Figure: modular multiplication block with inputs A, B, M and output C]
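
The straightforward method can be sketched in a few lines of Python. This is only an illustration of "multiply, then divide"; the function name `modmul_straightforward` is an assumption made for this example and does not come from the slides.

```python
def modmul_straightforward(a: int, b: int, m: int) -> int:
    """Compute a * b mod m by one full multiplication and one division."""
    if not (0 <= a < m and 0 <= b < m):
        raise ValueError("operands must satisfy 0 <= a, b < m")
    product = a * b        # for n-bit operands this is up to 2n bits wide
    return product % m     # modulus division on the double-width product


print(modmul_straightforward(17, 23, 29))  # 17 * 23 = 391 = 13 * 29 + 14, prints 14
```

The double-width intermediate product and the full division are what make this approach costly in hardware for large operands.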

Interleaving Multiplication and Reduction
In 1983, Blakley: P_i = 2 P_{i-1} + b_i A + q M
The literature contains proposals to solve the magnitude comparison problem.
Koc's implementation is based on carry-save adders. Partial products are represented as sum-carry pairs. The 5 MSBs of the pair are tested for sign estimation.
Algorithm:
  P = 0
  for i = n-1 to 0 {
    P = 2 * P
    if ( P ≥ M ) P = P – M
    if ( b_i = 1 ) {
      P = P + A
      if ( P ≥ M ) P = P – M
    }
  }
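
The interleaved loop above can be written as runnable Python. This is a minimal software sketch, assuming the comparison in the pseudocode is "greater than or equal" and taking n as the bit length of M; the name `modmul_interleaved` is hypothetical.

```python
def modmul_interleaved(a: int, b: int, m: int) -> int:
    """Interleave shift-and-add multiplication with reduction modulo m."""
    if not (0 <= a < m and 0 <= b < m):
        raise ValueError("operands must satisfy 0 <= a, b < m")
    n = m.bit_length()              # assumed operand width
    p = 0
    for i in range(n - 1, -1, -1):  # scan b from MSB to LSB
        p = 2 * p                   # shift the partial product
        if p >= m:                  # p < 2m here, so one subtraction suffices
            p -= m
        if (b >> i) & 1:
            p += a                  # add the multiplicand for a set bit
            if p >= m:              # again p < 2m
                p -= m
    return p


print(modmul_interleaved(17, 23, 29))  # prints 14, same as (17 * 23) % 29
```

Every iteration needs full magnitude comparisons against M, which is the cost that the sign-estimation technique mentioned on the slide tries to avoid.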

Montgomery's Method
In 1985, Montgomery: P_i = (P_{i-1} + b_i A + q M) / 2
No full magnitude comparison is required. The correction step can be easily removed.
However, pre- and post-calculations are needed in order to obtain the required result.
As in the interleaving method, implementations based on carry-save adders are the most effective solutions.
Algorithm:
  P' = 0
  for i = 0 to n-1 {
    P' = P' + a'_i * B'
    if ( p'_0 = 1 ) P' = P' + M
    P' = P' / 2
  }
  if ( P' ≥ M ) P' = P' - M
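
A minimal Python sketch of the bit-serial Montgomery loop, together with the pre- and post-calculations the slide mentions. The function names and the choice R = 2^n with n equal to the bit length of M are assumptions for illustration; M must be odd.

```python
def montgomery_product(a_bar: int, b_bar: int, m: int, n: int) -> int:
    """Return a_bar * b_bar * 2**(-n) mod m for odd m and operands below m."""
    p = 0
    for i in range(n):           # scan a_bar from LSB to MSB
        if (a_bar >> i) & 1:
            p += b_bar
        if p & 1:                # make p even by adding the odd modulus
            p += m
        p >>= 1                  # exact division by 2
    if p >= m:                   # single final correction, since p < 2m
        p -= m
    return p


def modmul_montgomery(a: int, b: int, m: int) -> int:
    """a * b mod m via Montgomery's domain (pre- and post-calculation)."""
    if m % 2 == 0:
        raise ValueError("the modulus must be odd")
    n = m.bit_length()
    r = 1 << n
    a_bar = (a * r) % m                          # pre-calculation: into the domain
    b_bar = (b * r) % m
    p_bar = montgomery_product(a_bar, b_bar, m, n)
    return montgomery_product(p_bar, 1, m, n)    # post-calculation: back out


print(modmul_montgomery(17, 23, 29))  # prints 14
```

The loop itself only inspects the least significant bit of the partial product, so no magnitude comparison is needed until the single correction at the end.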

High-Radix Method
Speeds up the modular multiplier by requiring fewer cycles. Area and time will increase.
The reduction step will be the crucial operation. As the radix increases, it becomes more complex.
Walter shows that there is a direct trade-off between the required space and the overall computation time. The AT factor is independent of the choice of the radix. The factor is expected to improve for radices that are not much larger than radix-2.
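
To make the cycle-count trade-off concrete, here is a hedged sketch of a radix-2^k Montgomery loop. The slide does not say which high-radix scheme is meant; Montgomery reduction is used here only as one well-known instance, and the names `montgomery_product_radix`, `k` and `digits` are assumptions. It relies on Python 3.8+ for `pow(m, -1, base)`.

```python
def montgomery_product_radix(a_bar: int, b_bar: int, m: int, n: int, k: int):
    """Return (a_bar * b_bar * 2**(-k*digits) mod m, digits) for odd m.

    n is the operand width in bits; k bits of a_bar are consumed per iteration.
    """
    base = 1 << k
    m_neg_inv = (-pow(m, -1, base)) % base   # -m^(-1) mod 2^k, needs odd m
    digits = -(-n // k)                      # ceil(n / k) iterations
    p = 0
    for i in range(digits):
        a_digit = (a_bar >> (k * i)) & (base - 1)
        p += a_digit * b_bar
        # choose q so that the low k bits of p + q*m are zero
        q = ((p & (base - 1)) * m_neg_inv) & (base - 1)
        p = (p + q * m) >> k                 # exact division by 2^k
    if p >= m:                               # p < 2m, one correction suffices
        p -= m
    return p, digits


for k in (1, 2, 4, 8):
    _, iterations = montgomery_product_radix(0x1234, 0xBEEF, 0xF123, 16, k)
    print(k, iterations)   # iterations shrink from 16 to 2 as k grows
```

Fewer iterations come at the price of a digit-by-operand multiplication and a k-bit quotient computation in every step, which is where the extra area and per-step delay arise.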

Comparison Between [6] and [18]: Description
Equation:
  Koc [6]: (S,C) = 2S + 2C + a_i B + qM, with q ∈ {1, 0, -1}
  Montgomery [18]: (S,C) = S + C + a_i B, then (S,C) = (S + C + s_0 M) / 2
Hardware:
  [Figure: hardware diagrams of both designs]
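
The Montgomery recurrence on sum-carry pairs can be simulated with bitwise operations. This is a software sketch under stated assumptions, not the hardware of [18]: `carry_save_add` models a row of full adders (a 3:2 compressor), Python's unbounded integers stand in for fixed-width registers, and all names are hypothetical.

```python
def carry_save_add(x: int, y: int, z: int):
    """3:2 compression: return (sum, carry) with sum + carry == x + y + z."""
    s = x ^ y ^ z                              # per-bit sum without carries
    c = ((x & y) | (x & z) | (y & z)) << 1     # per-bit carries, moved up one place
    return s, c


def montgomery_product_csa(a_bar: int, b_bar: int, m: int, n: int) -> int:
    """Return a_bar * b_bar * 2**(-n) mod m, keeping P as a sum-carry pair."""
    s, c = 0, 0
    for i in range(n):
        a_i = (a_bar >> i) & 1
        s, c = carry_save_add(s, c, b_bar if a_i else 0)   # (S,C) = S + C + a_i B
        s0 = s & 1            # c is even after the shift, so s0 is the LSB of S + C
        s, c = carry_save_add(s, c, m if s0 else 0)        # add s_0 * M
        s, c = s >> 1, c >> 1                              # both words are even now
    p = s + c                 # the only full-length, carry-propagating addition
    if p >= m:
        p -= m
    return p


print(montgomery_product_csa(5, 9, 13, 4))  # prints 2, i.e. 5 * 9 * 16^(-1) mod 13
```

No carry is propagated inside the loop; the pair is resolved once at the end, which is why the choice of the final binary adder matters.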

Comparison Between [6] and [18]: Algorithmic Analysis
Pre-calculations:
  Koc [6]: the two's complement of the modulus needs to be computed
  Montgomery [18]: transformation of operands into Montgomery's domain
Inter-calculations:
  Koc [6]: n + 3 iterations
  Montgomery [18]: n + 2 iterations
Post-calculations:
  Koc [6]: there is a correction step in addition to the final summation of the sum-carry pair
  Montgomery [18]: the summation of the sum-carry pair needs to be transformed back to the ordinary domain
Restrictions:
  Koc [6]: if M is represented using n bits, then |M| ≥ 2^(n-1)
  Montgomery [18]: GCD(M, 2) = 1
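
The restrictions and the two's complement pre-calculation can be expressed as small checks. This sketch assumes the relation in the restriction on [6] is "greater than or equal" (the symbol is lost in the transcript) and that the two's complement is taken over a caller-chosen register width; all function names are hypothetical.

```python
from math import gcd


def montgomery_modulus_ok(m: int) -> bool:
    """Restriction for Montgomery's method: GCD(M, 2) = 1, i.e. M is odd."""
    return m > 1 and gcd(m, 2) == 1


def koc_modulus_ok(m: int, n: int) -> bool:
    """Assumed restriction for [6]: M uses all n bits, 2^(n-1) <= M < 2^n."""
    return (1 << (n - 1)) <= m < (1 << n)


def twos_complement(m: int, width: int) -> int:
    """Two's complement of m in a register of the given width (represents -m)."""
    return (-m) & ((1 << width) - 1)


print(montgomery_modulus_ok(13), koc_modulus_ok(13, 4), twos_complement(13, 8))
# prints: True True 243
```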

Comparison Between [6] and [18]: Hardware Analysis
Logic:
  Koc [6]: two (n+4)-bit carry-save adders plus 5-bit carry-lookahead logic
  Montgomery [18]: two n-bit carry-save adders
Registers:
  Koc [6]: 6
  Montgomery [18]: 5
Comparison Between [6] and [18]: Synthesis Analysis
Clock period:
  Koc [6]: 6.468 ns
  Montgomery [18]: [value missing] ns

Improvements on [6]
Pipelining: Due to data dependency, pipelining will not improve the throughput. However, the pipeline can be used to compute two separate operations simultaneously.

Improvements on [6]
Parallelism: The correction step at the end of the algorithm increases the algorithm complexity. At the hardware level, the correction step can be implemented using two options. By computing the two possible results in parallel, time will be saved.
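
The idea of computing both possible results and selecting one afterwards can be sketched as follows. Python evaluates the two candidates one after the other, so this only models the selection logic that two parallel adders would feed; the function name and the assumption that the uncorrected result lies in [0, 2M) are illustrative.

```python
def final_correction_parallel(s: int, c: int, m: int) -> int:
    """Resolve a sum-carry pair and apply the correction by selection.

    Assumes 0 <= s + c < 2*m, so at most one subtraction of m is needed.
    """
    uncorrected = s + c        # candidate 1: no correction needed
    corrected = s + c - m      # candidate 2: correction applied
    # The sign of candidate 2 acts as the select signal of a multiplexer.
    return uncorrected if corrected < 0 else corrected


print(final_correction_parallel(5, 3, 7))  # 8 >= 7, prints 1
print(final_correction_parallel(2, 3, 7))  # 5 < 7, prints 5
```

In hardware the two candidates would be produced by separate adders at the same time, trading area for the delay of a sequential compare-then-subtract.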

Binary Adders
The last stage in both algorithms does a full-length addition on the carry-sum pair, which can be performed in hardware through binary adders.
Statistics showed that 72% of the instructions perform additions in the data path of a prototypical RISC machine.
The carry-lookahead adder and the carry-skip adder were compared in terms of time, area and power.

Carry-Lookahead Adder (CLA)
The total delay of the carry-lookahead adder is Θ(log n).
There is a penalty paid for this gain: the area increases. The carry-lookahead adder requires Θ(n log n) area.
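
The logarithmic delay can be illustrated with a bit-vector simulation of lookahead carry computation. The slides do not specify the lookahead structure; the sketch below uses a Kogge-Stone style parallel-prefix network as one possible instance, and the name `cla_add` and its return values are assumptions.

```python
def cla_add(x: int, y: int, n: int, carry_in: int = 0):
    """Add two n-bit numbers using generate/propagate prefix computation.

    Returns (sum mod 2^n, carry_out, number of prefix levels).
    """
    mask = (1 << n) - 1
    p0 = (x ^ y) & mask          # per-bit propagate
    g = x & y & mask             # per-bit generate
    p = p0
    levels = 0
    d = 1
    while d < n:                 # each pass doubles the span of every group
        low = (1 << d) - 1       # bits below d already cover down to bit 0
        g = (g | (p & (g << d))) & mask
        p = p & ((p << d) | low) & mask
        d *= 2
        levels += 1
    # carry into bit i+1 is G[0..i] or (P[0..i] and carry_in)
    carries = ((g | (p if carry_in else 0)) << 1) | carry_in
    total = (p0 ^ carries) & mask
    carry_out = (carries >> n) & 1
    return total, carry_out, levels


print(cla_add(0b1011, 0b0110, 4))  # 11 + 6 = 17, prints (1, 1, 2)
```

The number of levels grows as the base-2 logarithm of n, while every level touches all n bit positions, matching the delay and area orders quoted on the slide.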

Carry-Skip Adder (CSK)
The carry-skip adder has a simple and regular structure that requires an area on the order of Θ(n), which is hardly larger than the area required by the ripple-carry adder.
The time complexity of the carry-skip adder is bounded between Θ(n^(1/2)) and Θ(log n). An equal-block-size one-level carry-skip adder will have a time complexity of Θ(n^(1/2)). However, a more optimized multi-level carry-skip adder will have a time complexity of O(log n).
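
A functional sketch of an equal-block-size, one-level carry-skip adder. It models only the logic (block propagate and the skip multiplexer), not the timing; the default block size of about the square root of n and the name `carry_skip_add` are assumptions, and `math.isqrt` requires Python 3.8+.

```python
import math


def carry_skip_add(x: int, y: int, n: int, block: int = 0, carry_in: int = 0):
    """Add two n-bit numbers block by block; return (sum mod 2^n, carry_out)."""
    if block <= 0:
        block = max(1, math.isqrt(n))       # equal blocks of about sqrt(n) bits
    total = 0
    carry = carry_in
    for start in range(0, n, block):
        width = min(block, n - start)
        bmask = (1 << width) - 1
        xb = (x >> start) & bmask
        yb = (y >> start) & bmask
        # block propagate: every bit position propagates an incoming carry
        block_propagate = (xb ^ yb) == bmask
        s = xb + yb + carry                 # stands in for the ripple chain
        total |= (s & bmask) << start
        ripple_carry = s >> width
        # skip multiplexer: bypass the ripple chain when the block propagates
        carry = carry if block_propagate else ripple_carry
    return total, carry


print(carry_skip_add(0b101101, 0b010011, 6))  # 45 + 19 = 64, prints (0, 1)
```

When a block propagates, its carry-out equals its carry-in, so the bypass returns the same value the ripple chain would produce, only sooner in hardware.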

CLA versus CSK
Using 32-bit operands, a multi-level carry-skip adder was 14% faster and its power dissipation was 58% of that of the carry-lookahead adder.
Using 64-bit operands, a one-level carry-skip adder was 38% slower and its power consumption was 68% of that of the carry-lookahead adder.

Conclusion
This work studied the modular multiplication problem over large operand sizes. Based on a survey, two implementations of modular multiplication algorithms were modeled using VHDL and synthesized.
A time-area analysis of both implementations showed that Koc's implementation has the potential to be an effective solution in terms of time and hardware requirements. This implementation was improved further.
Carry-save adders give the maximum speedup in computing the partial products. However, a full-length addition on the sum-carry pair needs to be carried out at the last iteration through a dedicated binary adder.
Two binary adders were studied: the CLA and the CSK. Although the two adders can be of comparable speed, the CSK requires a smaller area and consumes much less power than the CLA.

Thank you. The End.