IEEE ARITH 17 Cape Cod, 27th – 29th June 2005 Data Dependent Power Use in Multipliers Colin D. Walter David Samyde Work partly done at DICE, UCL, Louvain-la-Neuve, Belgium
Overview: Background & Aims; History; Cryptographic Context; Multiplier Models; Gate Switching Activity; Hamming & Booth Weight Multipliers; Lab Results; Conclusions.
Background: Power used by a multiplier is data dependent. Similarly, electromagnetic radiation (EMR) from a multiplier depends on the current state & new inputs. Inexpensive equipment can measure the variations. So secret data may leak during cryptographic use. The main leakage in smart cards is from buses. First order leakage depends on Hamming weight, which can be made constant. The multiplier is the next most leaky hardware component of a crypto co-processor.
Aims: There are hardware counter-measures, such as Faraday cages, and software blinding counter-measures. It is unclear if these are totally effective. So investigate which multiplier designs & arithmetic representations might reduce power/EMR variations. 1. Build a model to simulate power consumption. 2. Apply it to standard designs and compare them. 3. Develop “better” multipliers...
History: Occasional (public) refs in old patents: “To ensure that the data carrier consumes the same amount of current whether the requested operation is authorized or unauthorized, a bit is stored in the memory in either event.” [Abstract, US Patent , filed Aug 1978] Kocher et al (CRYPTO 1996, 1999): Timing and Power Attacks – the concepts made public. Walter (CHES 2001): How to extract a private RSA key from the power variation of a single decryption in the presence of standard software counter-measures. Flynn & Oberman (Wiley, 2001): “Advanced Computer Arithmetic Design”.
Cryptographic Context: Smartcard: 8- or 16-bit multipliers for RSA. Long integers A, B in modular products have ~2^7 digits. Each digit × digit multiplication a_i × b_j has ~2^7 cases with the same a_i (or b_j). Take the average power trace as b_j (resp. a_i) varies. (Generally, some average must be taken to eliminate noise.) Does the result characterise a_i or mask its value? Any revealed characteristics can be used to distinguish multipliers in the exponentiation algorithm, and hence determine the secret exponent.
Multiplier Model: Standard Add-and-Shift Multiplier: 3-to-2 full adders (counters) & 2-bit half adders. Wallace tree arrangement for adders/HAs. Build the model with input word length k as a parameter. For convenience, assume all gate switching (AND, XOR, etc.) consumes the same power. (Easy to drop this assumption.) Count gates switched for all initial states and all inputs. Draw graphs and look for distinguishing characteristics.
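The counting step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' simulator: it uses a plain row-by-row array of full adders rather than a Wallace tree, a zero-delay model (glitches are ignored), and unit cost per gate output that changes. The names `full_adder`, `gate_outputs`, `switch_count` and `average_switches` are hypothetical.

```python
from itertools import product


def full_adder(x, y, c):
    """3-to-2 full adder; returns (sum, carry, all internal gate outputs)."""
    h = x ^ y        # XOR gate
    s = h ^ c        # XOR gate (sum output)
    g = x & y        # AND gate
    t = h & c        # AND gate
    cout = g | t     # OR gate (carry output)
    return s, cout, (h, s, g, t, cout)


def gate_outputs(a, b, k):
    """Evaluate a k-bit add-and-shift multiplier.

    Returns (product, list of every gate output). The row-by-row array
    layout is an assumption; the slides use a Wallace tree.
    """
    gates = []
    acc = [0] * (2 * k)                      # running sum, LSB first
    for j in range(k):
        bj = (b >> j) & 1
        carry = 0
        for i in range(k):
            pp = ((a >> i) & 1) & bj         # AND gate: partial product bit
            s, carry, internal = full_adder(acc[i + j], pp, carry)
            gates.append(pp)
            gates.extend(internal)
            acc[i + j] = s
        acc[j + k] = carry
    value = sum(bit << n for n, bit in enumerate(acc))
    return value, gates


def switch_count(old, new, k):
    """Number of gate outputs that differ between two input pairs."""
    _, g_old = gate_outputs(*old, k)
    _, g_new = gate_outputs(*new, k)
    return sum(x != y for x, y in zip(g_old, g_new))


def average_switches(a, b, k):
    """Gate switchings for new inputs (a, b), averaged over all initial states."""
    states = list(product(range(1 << k), repeat=2))
    return sum(switch_count(s, (a, b), k) for s in states) / len(states)


if __name__ == "__main__":
    k = 3
    for a in range(1 << k):
        print(a, [round(average_switches(a, b, k), 1) for b in range(1 << k)])
```

The exhaustive loop visits every (initial state, input) pair, so its cost grows as 2^(4k); it is only practical for small k.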
Gate Switching Activity: Clearly, Hamming weight is leaked by knowledge of switch counts. (Hamming weight = number of 1 bits in a binary string.) [Figure: number of gate switchings, averaged over initial states, for a 3-bit multiplier. Axes: 1st argument digit, 2nd argument digit. Digits grouped by weight: 3, 2, 1, 0.]
Hamming Weight Multiplier: Similar results hold for exhaustive simulations as word size increases. Complexity is too great for 16-bit words or larger: O(2^(4k) k^2) for k-bit words. Need to build a Hamming weight multiplier where the inputs are Hamming weights and the output is average gate switching activity, with polynomial complexity if possible. Solution: For a k-bit multiplier & input a with HW(a) = h, send probability h/k of a bit 1 along each wire, and compute the probabilities of gates switching.
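A rough sketch of the probabilistic idea, assuming the same row-by-row full-adder array as a stand-in for the Wallace tree: each wire carries the probability of being 1, and gate inputs are treated as independent. Modelling the average over initial states as input bit probability 1/2 is an assumption made here, and all function names are hypothetical.

```python
def p_and(p, q):
    """P(x AND y = 1) for independent inputs."""
    return p * q


def p_xor(p, q):
    """P(x XOR y = 1) for independent inputs."""
    return p + q - 2 * p * q


def gate_probabilities(pa, pb, k):
    """Propagate bit probabilities through a k-bit add-and-shift array.

    pa, pb are the probabilities that a bit of each input is 1.
    Returns (probability of 1 at every gate output, output bit probabilities).
    """
    probs = []
    acc = [0.0] * (2 * k)
    for j in range(k):
        carry = 0.0
        for i in range(k):
            pp = p_and(pa, pb)            # partial product bit
            x = acc[i + j]
            h = p_xor(x, pp)
            s = p_xor(h, carry)           # sum output
            g = p_and(x, pp)
            t = p_and(h, carry)
            cout = g + t                  # OR of two mutually exclusive events
            probs.extend((pp, h, s, g, t, cout))
            acc[i + j], carry = s, cout
        acc[j + k] = carry
    return probs, acc


def expected_switches(ha, hb, k):
    """Expected gate switchings for inputs of Hamming weight ha and hb."""
    new, _ = gate_probabilities(ha / k, hb / k, k)
    old, _ = gate_probabilities(0.5, 0.5, k)   # assumed uniform initial state
    return sum(p0 * (1 - p1) + p1 * (1 - p0) for p0, p1 in zip(old, new))


def expected_output_weight(ha, hb, k):
    """Expected Hamming weight of the product."""
    _, acc = gate_probabilities(ha / k, hb / k, k)
    return sum(acc)


if __name__ == "__main__":
    k = 16
    for ha in (0, 4, 8, 12, 16):
        print(ha, [round(expected_switches(ha, hb, k), 1) for hb in (0, 8, 16)])
```

Each evaluation touches every gate once, so the cost is O(k^2) per pair of weights rather than exponential in k.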
Results: Gate switching in an 8-bit multiplier as a function of input Hamming weights. [Figure: gates switched plotted against HW(a) and HW(b).] Comparison of gate counts gives an excellent match between the Hamming weight multiplier and the binary multiplier, for all k. So the model can be used to predict gate activity in larger cases.
Evaluation: The model also accurately predicts the Hamming weight of the output. The 3-D graphs (actual vs model results) have the same features. [Figure: Hamming weight of the output for k = 16, HW(a×b) plotted against HW(a) and HW(b).]
Booth 2 Multiplier: A 2-bit Booth multiplier was built. One argument (the multiplier) is re-coded in base 4 using digits –2, –1, –0, +0, +1, +2. These multiples of the other input (the multiplicand) feed into a tree of compressors. Graphs show that gate switching (& leakage) depends on: i) the Hamming weight of the multiplicand; ii) the “Booth” weight of the multiplier. Booth weight is defined by summing: 0 for recoded digit +0 ( is added); 2 for recoded digit –0 ( is added, with correction); 1 for all other digits d (dM is added for multiplicand M).
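The Booth weight defined above can be illustrated with a short sketch. It assumes the common radix-4 Booth table in which the bit triple 111 recodes to –0 and 000 to +0, and it treats the multiplier as an unsigned k-bit digit (k even), which needs one extra recoded digit. The slides do not state which table the authors used, and the names `BOOTH_TABLE`, `booth_recode`, `booth_value` and `booth_weight` are hypothetical.

```python
# (b_{2i+1}, b_{2i}, b_{2i-1}) -> (sign, magnitude); assumed standard table
BOOTH_TABLE = {
    (0, 0, 0): (+1, 0), (0, 0, 1): (+1, 1),
    (0, 1, 0): (+1, 1), (0, 1, 1): (+1, 2),
    (1, 0, 0): (-1, 2), (1, 0, 1): (-1, 1),
    (1, 1, 0): (-1, 1), (1, 1, 1): (-1, 0),   # "minus zero"
}


def booth_recode(b, k):
    """Radix-4 Booth digits of an unsigned k-bit value, least significant first."""
    digits = []
    prev = 0                                  # implicit bit b_{-1} = 0
    for i in range(0, k + 1, 2):              # one extra digit for unsigned b
        b0 = (b >> i) & 1
        b1 = (b >> (i + 1)) & 1
        digits.append(BOOTH_TABLE[(b1, b0, prev)])
        prev = b1
    return digits


def booth_value(digits):
    """Integer represented by a list of (sign, magnitude) base-4 digits."""
    return sum(sign * mag * 4 ** n for n, (sign, mag) in enumerate(digits))


def booth_weight(digits):
    """0 for +0, 2 for -0, 1 for every other recoded digit."""
    weight = 0
    for sign, mag in digits:
        if mag != 0:
            weight += 1
        elif sign < 0:
            weight += 2
    return weight


if __name__ == "__main__":
    for b in (0, 7, 14, 255):
        d = booth_recode(b, 8)
        print(b, d, booth_weight(d))
```

Keeping the sign separate from the magnitude is what lets the sketch distinguish +0 from –0, which a plain integer digit could not do.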
Booth Weight Multiplier: Can a Hamming weight / Booth weight multiplier be built for the Booth multiplier, like the Hamming weight add-and-shift multiplier? This would predict gate switching from Hamming weight and Booth weight inputs without combinatorial explosion. The Add-and-Shift case assumed compressor input bits were independent. This was reasonably accurate. Addends and make this unreasonable for a Booth weight multiplier. Alignment of bits in 2M & shifted 1M also reduces independence. Solution not yet worked out.
Multiplier Comparison: Overall gate switching was less in the Booth multiplier than in the Add-and-Shift multiplier. Area is larger for the Booth multiplier with expected digit sizes. So leakage is less, but there is a silicon cost. More complex multipliers are unlikely in most smartcards.
Lab Results: The DICE lab at UCL was used to measure power variation and EMR in several multipliers. Only add-and-shift designs were available. EMR at a variety of frequencies yields much more discriminating leakage than a simple gate count, which approximated the power leakage data. So the models agreed with lab results, but the lab results might be used to extract further information.
Conclusions: Power use in standard multipliers is closely related to input Hamming (or re-coded) weights. Simplified polynomial-time models can give good accuracy for power use, so designs can be tested easily in the search for less leaky hardware. Some multiplier designs (such as one with 2-bit Booth re-coding) leak less information about Hamming weights than others (such as the standard Add-and-Shift multiplier).
IACR CHES, Aug – 1 Sept, Edinburgh, Scotland