# Jan M. Rabaey Low Power Design Essentials ©2008 Chapter 11 Ultra-Low Power/Voltage Design.


## 11.2 Chapter Outline

- Rationale
- Lower Bounds on Computational Energy
- Subthreshold Logic
- Moderate Inversion as a Trade-off
- Revisiting Logic Gate Topologies
- Summary

## 11.3 Rationale

- A continued increase in computational density must be combined with a decrease in energy per operation (EOP).
- Further scaling of the supply voltage is essential to accomplish that.
  - The only other option is to keep on reducing activity.
- Some key questions:
  - How far can the supply voltage be scaled?
  - What is the minimum energy per operation that can be obtained, theoretically and practically?
  - What to do about the threshold voltage and leakage?
  - How to practically design circuits that approach the minimum energy bounds?

## 11.4 Opportunities for Ultra-Low Voltage

- A number of applications are emerging that do not need high performance, only extremely low power dissipation.
- Examples:
  - Standby operation for mobile components
  - Implanted electronics and artificial senses
  - Smart objects, fabrics and e-textiles
- Need power levels below 1 mW (even µW in certain cases)

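## 11.5 Minimum Operational Voltage of Inverter

- Swanson, Meindl (April 1972)
- Further extended in Meindl (Oct 2000)

Limitation: the magnitude of the gain at the midpoint must stay above 1.

In the general expression for the minimum supply voltage (not preserved in this transcript), $C_{ox}$ is the gate capacitance, $C_d$ the diffusion capacitance, and $n$ the slope factor.

For an ideal MOSFET (60 mV/decade slope): $V_{DD,min} = 2\ln(2)\,kT/q \approx 36$ mV at 300 K, or $V_{DD,min} = 2\,(kT/q)\ln(1+n)$

[Ref: R. Swanson, JSSC’72; J. Meindl, JSSC’00] © IEEE 1972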

## 11.6 Subthreshold Modeling of CMOS Inverter

- From Chapter 2: the subthreshold current expressions (not preserved in this transcript). DIBL can be ignored at low voltages.

## 11.7 Subthreshold DC Model of CMOS Inverter

Assume NMOS and PMOS are fully symmetrical and all voltages are normalized to the thermal voltage $\phi_T = kT/q$ ($x_i = V_i/\phi_T$; $x_o = V_o/\phi_T$; $x_D = V_{DD}/\phi_T$).

The VTC of the inverter for NMOS and PMOS in subthreshold can be derived [Ref: E. Vittoz, CRC’05]. The VTC and gain expressions themselves are not preserved in this transcript.

For $|A_{Vmax}| = 1$: $x_D = 2\ln(n+1)$
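Since the slide's VTC expression did not survive extraction, the following is only a sketch of how such a transfer curve can be obtained. It assumes that both transistors follow the textbook weak-inversion current $I = I_0\,e^{x_{GS}/n}\,(1 - e^{-x_{DS}})$ with identical $I_0$ and $n$, and it equates the NMOS and PMOS currents. The function names `vtc` and `midpoint_gain` are illustrative and do not come from the source.

```python
import math


def vtc(x_i: float, x_d: float, n: float = 1.5) -> float:
    """Normalized output x_o of a symmetric subthreshold inverter.

    Assumed device model (not taken from the transcript):
        I = I0 * exp(x_gs / n) * (1 - exp(-x_ds))
    All voltages are normalized to the thermal voltage.
    """
    g = math.exp((x_d - 2.0 * x_i) / n)  # PMOS-to-NMOS gate-drive ratio
    a = g * math.exp(-x_d)
    # Equating both currents gives a*y^2 + (1 - g)*y - 1 = 0 with y = exp(x_o).
    root = math.sqrt((g - 1.0) ** 2 + 4.0 * a)
    if g >= 1.0:
        y = ((g - 1.0) + root) / (2.0 * a)
    else:
        # Algebraically identical form that avoids cancellation when g < 1.
        y = 2.0 / ((1.0 - g) + root)
    return math.log(y)


def midpoint_gain(x_d: float, n: float = 1.5, h: float = 1e-5) -> float:
    """Numerical slope of the VTC at x_i = x_d / 2 (central difference)."""
    mid = x_d / 2.0
    return (vtc(mid + h, x_d, n) - vtc(mid - h, x_d, n)) / (2.0 * h)


if __name__ == "__main__":
    n = 1.5
    for x_d in (2.0 * math.log(n + 1.0), 4.0, 8.0):
        print(f"x_D = {x_d:.2f}: midpoint gain = {midpoint_gain(x_d, n):.3f}")
```

Under these assumptions the midpoint gain works out to $-(e^{x_D/2} - 1)/n$, which has magnitude 1 exactly at $x_D = 2\ln(n+1)$, the condition quoted on the slide.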

## 11.8 Results from Analytical Model

[Figure: minimum supply voltage $x_d$ for a given maximum gain ($A_{max}$ = 1, 2, 4, 10) as a function of the slope factor $n$ (1 to 2).]

[Figure: normalized VTC ($x_o$ versus $x_i$) of the subthreshold inverter for $n = 1.5$ as a function of $V_{DD}$ ($x_D$ = 1, 2, 4, 6, 8).]

- $x_{dmin} = 2\ln(2.5) = 1.83$ for $n = 1.5$
- $x_d = 4$ sufficient for reliable operation

[Ref: E. Vittoz, CRC’05]
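The numbers on this slide can be reproduced with a few lines of arithmetic. The sketch below assumes that the maximum gain of the symmetric subthreshold inverter is $|A_{max}| = (e^{x_D/2} - 1)/n$, so that $x_D = 2\ln(1 + n\,|A_{max}|)$. This reduces to the slide's $2\ln(n+1)$ for unity gain, but the general form is an assumption, as are the helper names.

```python
import math

K_B = 1.380649e-23       # Boltzmann constant, J/K
Q_E = 1.602176634e-19    # elementary charge, C


def thermal_voltage(temperature_k: float = 300.0) -> float:
    """Thermal voltage kT/q in volts."""
    return K_B * temperature_k / Q_E


def min_supply_normalized(n: float, gain: float = 1.0) -> float:
    """Smallest x_D = V_DD / phi_T reaching the requested midpoint gain.

    Assumes |A_max| = (exp(x_D / 2) - 1) / n for a symmetric inverter.
    """
    return 2.0 * math.log(1.0 + n * gain)


if __name__ == "__main__":
    phi_t = thermal_voltage()
    for gain in (1, 2, 4, 10):
        x_d = min_supply_normalized(1.5, gain)
        print(f"A_max = {gain:2d}: x_D = {x_d:.2f}, "
              f"V_DD = {1e3 * x_d * phi_t:.1f} mV")
```

For $n = 1.5$ and unity gain this gives $x_D \approx 1.83$, or roughly 47 to 48 mV at 300 K depending on how the thermal voltage is rounded.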

## 11.9 Confirmed by Simulation (at 90 nm)

[Figure: minimum operational supply voltage $V_{DD,min}$ (mV) versus pn-ratio.]

- For $n = 1.5$, $V_{DD,min} = 1.83\,\phi_T = 48$ mV
- Observe: non-symmetry of the VTC increases $V_{DD,min}$

## 11.10 Also Holds for More Complex Gates

[Figure: minimum operational supply voltage of a 2-input NOR versus pn-ratio.]

- Degradation due to asymmetry

## 11.11 Minimum Energy per Operation

Predicted by von Neumann: $kT\ln(2)$ [Ref: J. Von Neumann, Ill’66]

- Moving one electron over $V_{DD,min}$:
  - $E_{min} = Q\,V_{DD}/2 = q \cdot 2(\ln 2)(kT/q)/2 = kT\ln(2)$
  - Also called the Von Neumann-Landauer-Shannon bound
  - At room temperature (300 K): $E_{min} = 0.29 \times 10^{-20}$ J
- Minimum-sized CMOS inverter at 90 nm operating at 1 V:
  - $E = C\,V_{DD}^2 = 0.8 \times 10^{-15}$ J, or 5 orders of magnitude larger!

How close can one get?
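The two energy figures on this slide can be checked directly. The sketch below evaluates $kT\ln 2$ at 300 K and compares it with the slide's 0.8 fJ for a minimum-sized 90 nm inverter at 1 V. The constant and function names are illustrative.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant, J/K


def landauer_limit(temperature_k: float = 300.0) -> float:
    """Minimum energy per binary operation, kT * ln(2), in joules."""
    return K_B * temperature_k * math.log(2.0)


if __name__ == "__main__":
    e_min = landauer_limit(300.0)
    e_inverter = 0.8e-15  # slide value: C * V_DD^2 at 90 nm, 1 V
    ratio = e_inverter / e_min
    print(f"E_min      = {e_min:.3e} J")
    print(f"E_inverter = {e_inverter:.3e} J")
    print(f"ratio      = {ratio:.2e} (about {math.log10(ratio):.1f} decades)")
```

The ratio comes out between $10^5$ and $10^6$, which is the "5 orders of magnitude" gap stated on the slide.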

## 11.12 Propagation Delay of Subthreshold Inverter

Normalizing $t_p$ to $\tau_0 = C\,\phi_T / I_0$ (for $V_{DD} \gg \phi_T$): the resulting delay expression is not preserved in this transcript.

[Figure: comparison between curve-fitted model and simulations (FO4, 90 nm); $t_p$ (nsec) versus $x_d$ from 3 to 10; fitted parameters $\tau_0 = 338$, $n = 1.36$.]
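The delay expression itself is missing from the transcript, so the following is a sketch under stated assumptions rather than the slide's formula. It uses the common first-order estimate $t_p \approx C\,V_{DD} / (2 I_{on})$ with a subthreshold on-current $I_{on} = I_0\,e^{x_D/n}$. After normalizing to $\tau_0$ this becomes $t_p/\tau_0 = k\,x_D\,e^{-x_D/n}$ with prefactor $k = 0.5$. The prefactor and the function name are assumptions; the exponential dependence on $x_D$ is the point of interest.

```python
import math


def normalized_delay(x_d: float, n: float = 1.36, prefactor: float = 0.5) -> float:
    """Delay t_p / tau_0 of a subthreshold inverter (assumed first-order model).

    t_p ~ prefactor * C * V_DD / I_on, with I_on = I0 * exp(x_d / n).
    """
    return prefactor * x_d * math.exp(-x_d / n)


if __name__ == "__main__":
    for x_d in range(3, 11):
        print(f"x_d = {x_d:2d}: t_p / tau_0 = {normalized_delay(x_d):.4f}")
    speedup = normalized_delay(4.0) / normalized_delay(8.0)
    print(f"doubling x_d from 4 to 8 speeds the gate up by {speedup:.1f}x")
```

Whatever the exact prefactor, each additional $n\,\phi_T$ of supply voltage buys roughly a factor $e$ in speed, which is why delay grows so steeply at the low end of the plotted range.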

## 11.13 Dynamic Behavior

[Figure: transient response, voltage (normalized to $4\phi_T$) versus time (normalized to $\tau_0$), for input rise times $t_r$ between 0 and $2\tau_0$.]

[Figure: $t_p$ as a function of $t_{rise}$ (both normalized to $\tau_0$) for $x_D = 4$.]

Also: short-circuit current can be ignored if the input rise time is smaller than $\tau_0$, or for balanced slopes at inputs and outputs.

## 11.14 Power Dissipation of Subthreshold Inverter

- $P_{dyn} = C\,V_{DD}^2\,f$ (nothing new)
- Short-circuit power can be ignored
- Leakage current equal to $I_0$ for $x_D \geq 4$ (ignores DIBL)
- Increases for smaller values of $x_D$ due to degeneration of logic levels

[Figure: $I_{Stat}/I_0$ versus $x_D$ (1 to 10) for $n = 1.5$; annotations: circuit fails, logic levels degenerate.]

## 11.15 Power-Delay Product and Energy-Delay

[Figure: normalized power-delay product (pdp) and energy-delay (ed) versus $x_d$ from 3 to 10, for activities $\alpha$ = 1, 0.5, 0.25, 0.1, 0.05, 0.01.]

For low activity ($\alpha \ll 1$), large $x_D$ advantageous!

## 11.16 Energy for a Given Throughput

Most important question: assuming $1/T = \alpha/(2 t_p)$, what minimizes the energy for a given task?

[Figure: energy (log scale) versus $x_d$ from 3 to 12 for $\alpha$ = 1, 0.1, 0.05, 0.01, 0.005, 0.001; dynamic power dominates at high $x_d$.]

- Energy is minimized by keeping $\alpha$ as high as possible and having computation occupy most of the time: use the minimum voltage that meets $T$.
- If $\alpha$ must be low because of topology (< 0.05), there exists an optimum voltage that minimizes the energy.
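The trade-off described here can be illustrated with a small model. This is a sketch under assumptions that are not in the transcript: dynamic energy $C V_{DD}^2/2$ per operation, leakage current $I_0$ flowing for the whole period $T = 2t_p/\alpha$, and $t_p = C V_{DD}/(2 I_0 e^{x_D/n})$. Normalized to $C\phi_T^2$ this gives $e(x_D) = x_D^2\,(1/2 + e^{-x_D/n}/\alpha)$. The function names and the search range are illustrative.

```python
import math


def energy(x_d: float, alpha: float, n: float = 1.5) -> float:
    """Energy per operation normalized to C * phi_T**2 (assumed model)."""
    dynamic = 0.5 * x_d ** 2
    leakage = x_d ** 2 * math.exp(-x_d / n) / alpha  # I0 * V_DD * T
    return dynamic + leakage


def best_supply(alpha: float, n: float = 1.5,
                lo: float = 4.0, hi: float = 12.0, steps: int = 801) -> float:
    """Grid search for the x_d in [lo, hi] that minimizes the energy."""
    grid = [lo + (hi - lo) * i / (steps - 1) for i in range(steps)]
    return min(grid, key=lambda x: energy(x, alpha, n))


if __name__ == "__main__":
    for alpha in (1.0, 0.1, 0.05, 0.01, 0.005, 0.001):
        x_opt = best_supply(alpha)
        print(f"alpha = {alpha:<6}: x_d = {x_opt:5.2f}, "
              f"energy = {energy(x_opt, alpha):8.1f}")
```

In this simplified model the energy rises monotonically with $x_D$ as long as $\alpha$ exceeds $e^{-3} \approx 0.05$, so the lowest usable voltage wins. For smaller $\alpha$ an interior optimum appears and moves to higher voltages as $\alpha$ drops. A different delay prefactor would shift that turning point, so the agreement with the slide's 0.05 should be read as indicative only.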

## 11.17 Example: Energy-Aware FFT

Architecture scales gracefully from 128 to 1024 point lengths, and supports 8b and 16b precision.

[Ref: A. Wang, ISSCC’04] © IEEE 2004

## 11.18 FFT Energy-Performance Curves

- The optimal $V_{DD}$ for the 1024-point, 16b FFT is estimated from switching and leakage models for a 0.18 µm process.

[Figure: energy-performance curves over threshold voltage ($V_{TH}$) and supply voltage ($V_{DD}$), with the optimal ($V_{DD}$, $V_{TH}$) marked.]

[Ref: A. Wang, ISSCC’04] © IEEE 2004

## 11.19 Subthreshold FFT

- 0.18 µm CMOS process
- $V_{DD}$ = 180 mV to 900 mV
- $f_{clock}$ = 164 Hz to 6 MHz
- At 0.35 V: Energy = 155 nJ/FFT; $f_{clock}$ = 10 kHz; W = 0.6 µW

[Figure: die photo, 2.1 mm × 2.6 mm, showing data memory, twiddle ROMs, butterfly datapath and control logic.]

[Figures: clock frequency versus $V_{DD}$ (mV), and measured and estimated energy (nJ) versus $V_{DD}$ (mV) for the 1024-point, 16-bit FFT.]

[Ref: A. Wang, ISSCC’04] © IEEE 2004

## 11.20 Challenges in Sub-Threshold Design

- Obviously only for very low speed design
- Analysis so far only for symmetrical gates: minimum operation voltage increases for non-symmetrical structures
- Careful selection and sizing of logic structures is necessary
  - Data dependencies may cause gates to fail
- Process variations further confound the problem
- Registers and memory are a major concern

## 11.21 Logic Sizing Considerations

- CMOS in subthreshold is “ratioed logic”
- Careful sizing of transistors is necessary to ensure adequate logic levels

[Figure: inverter with a minimum-sized $W_n$ (180 nm CMOS); the operational region lies between $W_p$(min) and $W_p$(max) (min size, max size), set by drive current versus leakage current.]

[Ref: A. Wang, ISSCC’04] © IEEE 2004

## 11.22 Logic Sizing Considerations: Impact of Process Variations

- Inverter sizing analysis and minimum supply voltage analysis must be performed at the process corners.
- Variations raise the minimum voltage the circuit can be run at.

[Figure: operational region bounded by $W_p$(max) and $W_p$(min) at the SF and FS process corners.]

[Ref: A. Wang, ISSCC’04] © IEEE 2004

## 11.23 The Impact of Data Dependencies

[Figure: schematics of two XOR implementations (XOR1 and XOR2) with inputs A, B and output Z, each with the simulated voltage level at Z (mV, 0 to 100) over time for the input vectors A=1 B=0, A=0 B=1, A=0 B=0 and A=1 B=1.]

[Ref: A. Wang, ISSCC’04] © IEEE 2004

## 11.24 The Impact of Data Dependencies

- XOR1: leakage through the parallel devices causes XOR1 to fail at 100 mV.
- XOR2: balanced number of devices reduces the effects of leakage and process variations.

Solid sub-threshold design requires symmetry for all input vectors.

[Figure: idle-current and drive-current paths in XOR1 and XOR2 for A=1, B=0, Z=1; XOR1 is left with only a weak drive current.]

[Ref: A. Wang, ISSCC’04] © IEEE 2004

## 11.25 The Sub-Threshold (Low Voltage) Memory Challenge

- Obstacles that limit functionality at low voltage
  - SNM
  - Write margin
  - Read current / bit-line leakage
  - Soft errors
  - Erratic behavior
- Read SNM is the worst challenge
- Variation aggravates the situation

[Figure: read SNM and hold SNM for a sub-$V_T$ 6T cell at 300 mV.]

## 11.26 Solutions to Enable Sub-$V_{TH}$ Memory

- Standard 6T way of doing business won’t work
- Voltage scaling versus transistor sizing
  - Current depends exponentially on voltages in sub-threshold
  - Use voltages (not sizing) to combat problems
- New bitcells
  - Buffer output to remove Read SNM
  - Lower BL leakage
- Complemented with architectural strategies
  - ECC, interleaving, SRAM refresh, redundancy

## 11.27 Sub-threshold SRAM Cell

- Buffered read allows separate Read and Write ports
- Removing Read SNM allows operation at lower $V_{DD}$ with the same stability at corners
- VVDD floats during write access, but feedback restores ‘1’ to $V_{DD}$
- Buffer reduces BL leakage: allows 256 cells/BL instead of 16 cells/BL
- Higher integration reduces area of peripheral circuits

[Figure: bitcell schematic with WL_WR, BL, BLB, Q, QB, VVDD, RBL and RWL; read-buffer annotations for QB=1 (QBB held near 1 by leakage) and QB=0 (QBB=1, leakage reduced by stack).]

[Ref: B. Calhoun, ISSCC’06] © IEEE 2006

## 11.28 Sub-threshold SRAM

Sub-$V_{TH}$ operation demonstrated in a 65 nm memory chip.

Chip functions without error to below 400 mV, holds without error to <250 mV:

- At 400 mV, 3.28 µW and 475 kHz at 27 °C
- Reads to 320 mV (27 °C) and 360 mV (85 °C)
- Write to 380 mV (27 °C) and 350 mV (85 °C)

[Figure: die photo of the 256 kb SRAM array with a 32 kb block marked.]

[Ref: B. Calhoun, ISSCC’06]

## 11.29 Example: Sub-Threshold Microprocessor

- Processor for sensor network applications
  - Simple 8-bit architecture to optimize energy efficiency
  - 3.5 pJ per instruction at 350 mV and 354 kHz operation
  - 10X less energy than previously reported
  - 11 nW at 160 mV (300 mV RBB)
  - 41 year operation on 1 g Li-ion battery

[Ref: S. Hanson, JSSC’07] © IEEE 2007

## 11.30 Prototype Implementation

[Figure: chip layout with 7 processors. Labeled blocks: 6 subliminal processors, processor memories, test memories, test module, level converter array, discrete adders, discrete cells / transistors, a large solar cell, and separate solar cells for the processor, the adders and the discretes.]

[Courtesy: D. Blaauw, Univ. Michigan]

## 11.31 Is Sub-threshold the Way to Go?

- Achieves lowest possible energy dissipation
- But … at a dramatic cost in performance

[Figure: $t_p$ (µs) versus $V_{DD}$ (V), for $V_{DD}$ from 0 to 1 V.]

## 11.32 In Addition: Huge Timing Variance

- Normalized timing variance increases dramatically with $V_{DD}$ reduction
- Design for yield means huge overhead at low voltages:
  - Worst-case design at 300 mV: > 200% overkill

[Figure: $\sigma/\mu$ (%) versus $V_{DD}$ (V), for $V_{DD}$ from 0 to 1 V.]

## 11.33 Increased Sensitivity to Variations

- Subthreshold circuits operate at low $I_{on}/I_{off}$ ratios, from about 1000 to less than 10 (at $x_D = 4$)
- Small variations in device parameters can have a large impact and threaten the circuit operation

[Figure: $I_{on}$ over $I_{off}$ versus $x_{DD}$ from 1 to 10.]

## 11.34 One Solution: Back Off a Bit …

- The performance cost of minimum energy is exponentially high.
- Operating slightly above the threshold voltage improves performance dramatically while having a small impact on energy.

The challenge: modeling in the moderate inversion region.

[Figure: optimal E-D trade-off curve, energy versus delay.]

## 11.35 Modeling Over All Regions of Interest

- The EKV model covers strong, moderate and weak inversion regions, with $k$ a fit factor and $I_S$ the specific current (model equation not preserved in this transcript).
- The inversion coefficient IC measures the degree of saturation and is related directly to $V_{DD}$ (relation not preserved in this transcript).

[Ref: C. Enz, Analog’95]

## 11.36 Relationship between $V_{DD}$ and IC

[Figure: $V_{DD}$ (0 to 1 V) versus IC ($10^{-3}$ to $10^{2}$, log scale) for 90 nm CMOS, spanning the weak, moderate and strong inversion regions.]

- IC = 1 equals $V_{DD} \sim V_{TH}$
- Threshold changes move curves up or down
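The transcript does not preserve the slide's relation between $V_{DD}$ and IC. As a hedged illustration, the sketch below uses the standard EKV interpolation $IC = \ln^2\!\left(1 + e^{(V_{DD} - V_{TH})/(2 n \phi_T)}\right)$ and its inverse. The numeric values for $V_{TH}$ and $n$ are placeholders chosen for illustration, not values from the source, and the function names are hypothetical.

```python
import math

PHI_T = 0.02585  # thermal voltage near 300 K, in volts


def inversion_coefficient(v_dd: float, v_th: float, n: float = 1.5) -> float:
    """IC from the EKV interpolation function (gate driven at V_DD)."""
    z = (v_dd - v_th) / (2.0 * n * PHI_T)
    return math.log1p(math.exp(z)) ** 2


def supply_for_ic(ic: float, v_th: float, n: float = 1.5) -> float:
    """Inverse relation: supply voltage that yields a given IC."""
    return v_th + 2.0 * n * PHI_T * math.log(math.expm1(math.sqrt(ic)))


if __name__ == "__main__":
    v_th = 0.30  # illustrative threshold voltage, an assumption
    for ic in (1e-3, 1e-2, 1e-1, 1.0, 10.0, 100.0):
        print(f"IC = {ic:7.3f}: V_DD = {supply_for_ic(ic, v_th):.3f} V")
```

With this interpolation IC = 1 lands a few tens of millivolts above $V_{TH}$, consistent with the slide's "IC = 1 equals $V_{DD} \sim V_{TH}$", and changing `v_th` shifts the whole curve up or down by the same amount.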

## 11.37 Provides Good Match over Most of the Range

[Figure: normalized $t_p$ versus IC (log-log), model versus simulation, from weak inversion to strong inversion.]

- Largest deviations in strong inversion: velocity saturation is not well handled by the simple model

## 11.38 Modeling Energy

[Figure: energy model curves for $\alpha$ = 1, 0.2, 0.02, 0.002; the energy equations are not preserved in this transcript.]

## 11.39 High Activity Scenario

[Figure: contours of equal energy and equal performance in the ($V_{TH}$, $V_{DD}$) plane, with $V_{TH}$ from 0.15 to 0.5 V and $V_{DD}$ from 0.1 to 1 V; the IC = 1 line and the minimum-energy point are marked (90 nm, $\alpha$ = 0.02).]

## 11.40 Low Activity Scenario

[Figure: contours of equal energy and equal performance in the ($V_{TH}$, $V_{DD}$) plane, with $V_{TH}$ from 0.15 to 0.5 V and $V_{DD}$ from 0.1 to 1 V; the IC = 1 line and the minimum-energy point are marked (90 nm, $\alpha$ = 0.002).]

## 11.41 Example: Adder

- Simple full-adder using NAND & INV only

## 11.42 Optimizing over Size, $V_{DD}$, $V_{TH}$ (Full Range)

- Delay and energy normalized to minimum delay and corresponding maximum energy
- Significant energy savings within strong inversion
- Relatively little energy savings going from moderate to weak
- Higher potential for energy savings when activity is lower

[Figure: normalized energy versus normalized delay, starting from the (min delay, max energy) point, for $\alpha$ = 0.1, 0.01, 0.001; IC marked along the curves; $V_{TH}$ ↑, $V_{DD}$ ↓ along the curves.]

[Ref: C. Marcu, UCB’06]

## 11.43 Sensitivity to Parameter Variations

[Ref: C. Marcu, UCB’06]

## 11.44 Moving the Minimum Energy Point

- Having the minimum energy point in the sub-threshold region is unfortunate
  - Sub-threshold energy savings are small and expensive
  - Further technology scaling not offering much relief
- Can it be moved upwards?
- Or equivalently… Can we lower the threshold?

Remember the stack effect …

## 11.45 Complex versus Simple Gates

- Example (from Chapter 4): Fan-in(2) versus Fan-in(4)

Complex gates improve the $I_{on}/I_{off}$ ratio!

## 11.46 Moving the Minimum Energy Point

[Figure: minimum energy point in the ($V_{TH}$, $V_{DD}$) plane for stack2, stack4 and stack6.]

## 11.47 Complex versus Simple Gates

[Figure: comparison for $\alpha$ = 0.1 and $\alpha$ = 0.001, annotated with the operating points ($V_{DD}$, $V_{TH}$) = (1 V, 0.1 V), (0.14 V, 0.25 V), (0.1 V, 0.22 V), (0.34 V, 0.43 V) and (0.29 V, 0.38 V).]

## 11.48 Controlling Leakage in PTL

[Figure: drivers, pass-transistor network, receivers.]

- No leakage through the logic path
- No $V_{DD}$ and GND connections in the logic path
- Leverage complexity

Confine leakage to well-defined and controllable paths.

[Ref: L. Alarcon, Jolpe’07]

## 11.49 Sense-Amplifier Based Pass-Transistor Logic (SAPTL)

[Figure: root node driver, pass-transistor network (stack) with data inputs, and sense amplifier with timing control and outputs.]

- Leakage path confined to root node driver and sense amplifier
- Sense amplifier to recover delay and voltage swing

[Ref: L. Alarcon, Jolpe’07]

## 11.50 Sense-Amplifier Based Pass-Transistor Logic (SAPTL)

[Figure: pass-transistor stack driven from the root input, with inputs A and B, feeding the sense amplifier; sense amplifier clocked by CK, with output Out.]

- Current steering
- Works with very low $I_{on}/I_{off}$
- Regular and balanced
- (Programmable)
- Outputs pre-charged to $V_{DD}$ during low CK cycle (pre-conditioning subsequent logic module)
- Latch retains value even after inputs are pulled low
- Low voltage operation (300 mV)

[Ref: L. Alarcon, Jolpe’07]

## 11.51 Energy-Delay Trade-off

90 nm CMOS; $V_{DD}$: 300 mV to 1 V; $V_{TH} \approx$ 300 mV

[Figure: energy (fJ) versus delay (FO4 @ 1 V), log-log, for static CMOS, TG-CMOS and SAPTL. Callouts: $V_{DD}$ = 450 mV SAPTL and $V_{DD}$ = 300 mV TG-CMOS; $V_{DD}$ = 900 mV SAPTL and $V_{DD}$ = 400 mV static CMOS; $V_{DD}$ = 1 V TG-CMOS and $V_{DD}$ = 550 mV static CMOS.]

- $V_{DD}$ scaling still works!
- Sweet-spot: < 10 fJ, > 2.5k FO4

[Ref: L. Alarcon, Jolpe’07]

## 11.52 Summary

- To continue scaling, a reduction in energy per operation is necessary.
- This is complicated by the perceived lower limit on the supply voltage.
- Design techniques such as circuits operating in weak or moderate inversion, combined with innovative logic styles, are essential if voltage scaling is to continue.
- Ultimately the deterministic Boolean model of computation may have to be abandoned.

## 11.53 References

Books and Book Chapters

- E. Vittoz, “Weak Inversion for Ultimate Low-Power Logic,” in C. Piguet, Ed., Low-Power Electronics Design, Ch. 16, CRC Press, 2005.
- A. Wang, A. Chandrakasan, Sub-Threshold Design for Ultra Low-Power Systems, Springer, 2006.

Articles

- L. Alarcon, T.T. Liu, M. Pierson, J. Rabaey, “Exploring Very Low-Energy Logic: A Case Study,” Journal of Low Power Electronics, Vol. 3, No. 3, December 2007.
- B. Calhoun and A. Chandrakasan, “A 256kb Sub-threshold SRAM in 65nm CMOS,” Digest of Technical Papers, ISSCC 2006, pp. 2592-2601, San Francisco, Feb. 2006.
- J. Chen et al., “An Ultra-Low-Power Memory with a Subthreshold Power Supply Voltage,” IEEE Journal of Solid-State Circuits, Vol. 41, No. 10, pp. 2344-2353, Oct. 2006.
- C. Enz, F. Krummenacher, and E. Vittoz, “An Analytical MOS Transistor Model Valid in All Regions of Operation and Dedicated to Low Voltage and Low-Current Applications,” Analog Integrated Circuits and Signal Processing, Vol. 8, pp. 83-114, July 1995.
- S. Hanson et al., “Exploring Variability and Performance in a Sub-200-mV Processor,” IEEE Journal of Solid-State Circuits, Vol. 43, No. 4, pp. 881-891, April 2008.
- R. Landauer, “Irreversibility and heat generation in the computing process,” IBM Journal of Research and Development, 5:183-191, 1961.
- C. Marcu, M. Mark, and J. Richmond, “Energy-Performance Optimization Considerations in All Regions of MOSFET Operation with Emphasis on IC=1,” Project Report EE241, UC Berkeley, Spring 2006.
- J.D. Meindl, J. Davis, “The fundamental limit on binary switching energy for tera scale integration (TSI),” IEEE Journal of Solid-State Circuits, Vol. 35, No. 10, pp. 1515-1516, Oct. 2000.
- M. Seok et al., “The Phoenix Processor: A 30 pW Platform for Sensor Applications,” Proceedings VLSI Symposium, Honolulu, June 2008.

## 11.54 References (cont’d)

- R. Swanson and J. Meindl, “Ion-Implanted Complementary MOS Transistors in Low-Voltage Circuits,” IEEE Journal of Solid-State Circuits, Vol. SC-7, pp. 146-153, April 1972.
- E. Vittoz and J. Fellrath, “CMOS Analog Integrated Circuits based on Weak-Inversion Operation,” IEEE Journal of Solid-State Circuits, Vol. SC-12, pp. 224-231, June 1977.
- J. von Neumann, “Theory of Self-Reproducing Automata,” A.W. Burks, Ed., Univ. Illinois Press, Urbana, 1966.
- A. Wang, A. Chandrakasan, “A 180mV FFT Processor Using Subthreshold Circuit Techniques,” Digest of Technical Papers, ISSCC 2004, pp. 292-293, San Francisco, Feb. 2004.
- K. Yano et al., “A 3.8 ns CMOS 16 × 16 Multiplier using Complementary Pass-Transistor Logic,” IEEE Journal of Solid-State Circuits, Vol. SC-25, No. 2, pp. 388-395, April 1990.
