Presentation is loading. Please wait.

Presentation is loading. Please wait.

CSE477 L12&13 Low Power.1Irwin&Vijay, PSU, 2002 CSE477 VLSI Digital Circuits Fall 2002 Lecture 12&13: Designing for Low Power Mary Jane Irwin ( www.cse.psu.edu/~mji.

Similar presentations


Presentation on theme: "CSE477 L12&13 Low Power.1Irwin&Vijay, PSU, 2002 CSE477 VLSI Digital Circuits Fall 2002 Lecture 12&13: Designing for Low Power Mary Jane Irwin ( www.cse.psu.edu/~mji."— Presentation transcript:

1 CSE477 L12&13 Low Power.1Irwin&Vijay, PSU, 2002 CSE477 VLSI Digital Circuits Fall 2002 Lecture 12&13: Designing for Low Power Mary Jane Irwin ( www.cse.psu.edu/~mji ) www.cse.psu.edu/~cg477www.cse.psu.edu/~mji www.cse.psu.edu/~cg477 [Adapted from Rabaey’s Digital Integrated Circuits, ©2002, J. Rabaey et al.]

2 CSE477 L12&13 Low Power.2Irwin&Vijay, PSU, 2002 Review: Designing Fast CMOS Gates  Transistor sizing  Progressive transistor sizing l fet closest to the output is smallest of series fets  Transistor ordering l put latest arriving signal closest to the output  Logic structure reordering l replace large fan-in gates with smaller fan-in gate network  Logical effort  Buffer (inverter) insertion l separate large fan-in from large C L with buffers l uses buffers so there are no more than four TGs in series

3 CSE477 L12&13 Low Power.3Irwin&Vijay, PSU, 2002 Why Power Matters  Packaging costs  Power supply rail design  Chip and system cooling costs  Noise immunity and system reliability  Battery life (in portable systems)  Environmental concerns l Office equipment accounted for 5% of total US commercial energy usage in 1993 l Energy Star compliant systems

4 CSE477 L12&13 Low Power.4Irwin&Vijay, PSU, 2002 Why worry about power? -- Power Dissipation P6 Pentium ® 486 386 286 8086 8085 8080 8008 4004 0.1 1 10 100 197119741978198519922000 Year Power (Watts) Lead microprocessors power continues to increase Power delivery and dissipation will be prohibitive Source: Borkar, De Intel 

5 CSE477 L12&13 Low Power.5Irwin&Vijay, PSU, 2002 Why worry about power? -- Chip Power Density 4004 8008 8080 8085 8086 286 386 486 Pentium® P6 1 10 100 1000 10000 19701980199020002010 Year Power Density (W/cm2) Hot Plate Nuclear Reactor Rocket Nozzle Sun’s Surface …chips might become hot… Source: Borkar, De Intel 

6 CSE477 L12&13 Low Power.6Irwin&Vijay, PSU, 2002 Chip Power Density Distribution  Power density is not uniformly distributed across the chip  Silicon is not a good heat conductor  Max junction temperature is determined by hot-spots l Impact on packaging, w.r.t. cooling Power Map On-Die Temperature

7 CSE477 L12&13 Low Power.7Irwin&Vijay, PSU, 2002 Problem Illustration

8 CSE477 L12&13 Low Power.8Irwin&Vijay, PSU, 2002 Why worry about power ? -- Battery Size/Weight Expected battery lifetime increase over the next 5 years: 30 to 40% From Rabaey, 1995 65707580859095 0 10 20 30 40 50 Rechargable Lithium Year Nickel-Cadmium Ni-Metal Hydride Nominal Capacity (W-hr/lb) Battery (40+ lbs)

9 CSE477 L12&13 Low Power.9Irwin&Vijay, PSU, 2002 Why worry about power? -- Standby Power  Drain leakage will increase as V T decreases to maintain noise margins and meet frequency demands, leading to excessive battery draining standby power consumption. 8KW 1.7KW 400W 88W 12W 0% 10% 20% 30% 40% 50% 20002002200420062008 Standby Power Source: Borkar, De Intel  Year20022005200820112014 Power supply V dd (V)1.51.20.90.70.6 Threshold V T (V)0.4 0.350.30.25 …and phones leaky!

10 CSE477 L12&13 Low Power.10Irwin&Vijay, PSU, 2002 Power and Energy Figures of Merit  Power consumption in Watts l determines battery life in hours  Peak power l determines power ground wiring designs l sets packaging limits l impacts signal noise margin and reliability analysis  Energy efficiency in Joules l rate at which power is consumed over time  Energy = power * delay l Joules = Watts * seconds l lower energy number means less power to perform a computation at the same frequency

11 CSE477 L12&13 Low Power.11Irwin&Vijay, PSU, 2002 Power versus Energy Watts time Power is height of curve Watts time Approach 1 Approach 2 Approach 1 Energy is area under curve Lower power design could simply be slower Two approaches require the same energy

12 CSE477 L12&13 Low Power.12Irwin&Vijay, PSU, 2002 PDP and EDP  Power-delay product (PDP) = P av * t p = (C L V DD 2 )/2 l PDP is the average energy consumed per switching event (Watts * sec = Joule) l lower power design could simply be a slower design l allows one to understand tradeoffs better energy-delay energy delay  Energy-delay product (EDP) = PDP * t p = P av * t p 2 l EDP is the average energy consumed multiplied by the computation time required l takes into account that one can trade increased delay for lower energy/operation (e.g., via supply voltage scaling that increases delay, but decreases energy consumption)

13 CSE477 L12&13 Low Power.13Irwin&Vijay, PSU, 2002 Understanding Tradeoffs Energy 1/Delay a b c d  Which design is the “best” (fastest, coolest, both) ? better

14 CSE477 L12&13 Low Power.14Irwin&Vijay, PSU, 2002 Understanding Tradeoffs Energy 1/Delay a b c d Lower EDP  Which design is the “best” (fastest, coolest, both) ? better

15 CSE477 L12&13 Low Power.15Irwin&Vijay, PSU, 2002 CMOS Energy & Power Equations E = C L V DD 2 P 0  1 + t sc V DD I peak P 0  1 + V DD I leakage P = C L V DD 2 f 0  1 + t sc V DD I peak f 0  1 + V DD I leakage Dynamic power Short-circuit power Leakage power f 0  1 = P 0  1 * f clock

16 CSE477 L12&13 Low Power.16Irwin&Vijay, PSU, 2002 Dynamic Power Consumption Energy/transition = C L * V DD 2 * P 0  1 P dyn = Energy/transition * f = C L * V DD 2 * P 0  1 * f P dyn = C EFF * V DD 2 * f where C EFF = P 0  1 C L Not a function of transistor sizes! Data dependent - a function of switching activity! VinVout CLCL Vdd f01f01

17 CSE477 L12&13 Low Power.17Irwin&Vijay, PSU, 2002 Pop Quiz  Consider a 0.25 micron chip, 500 MHz clock, average load cap of 15fF/gate (fanout of 4), 2.5V supply. l Dynamic Power consumption per gate is ??  With 1 million gates (assuming each transitions every clock) l Dynamic Power of entire chip = ??.

18 CSE477 L12&13 Low Power.18Irwin&Vijay, PSU, 2002 Lowering Dynamic Power P dyn = C L V DD 2 P 0  1 f Capacitance: Function of fan-out, wire length, transistor sizes Supply Voltage: Has been dropping with successive generations Clock frequency: Increasing… Activity factor: How often, on average, do wires switch?

19 CSE477 L12&13 Low Power.19Irwin&Vijay, PSU, 2002 Short Circuit Power Consumption Finite slope of the input signal causes a direct current path between V DD and GND for a short period of time during switching when both the NMOS and PMOS transistors are conducting. VinVout CLCL I sc

20 CSE477 L12&13 Low Power.20Irwin&Vijay, PSU, 2002 Short Circuit Currents Determinates  Duration and slope of the input signal, t sc  I peak determined by l the saturation current of the P and N transistors which depend on their sizes, process technology, temperature, etc. l strong function of the ratio between input and output slopes -a function of C L E sc = t sc V DD I peak P 0  1 P sc = t sc V DD I peak f 0  1

21 CSE477 L12&13 Low Power.21Irwin&Vijay, PSU, 2002 Impact of C L on P sc VinVout CLCL I sc  0 VinVout CLCL I sc  I max Large capacitive load Output fall time significantly larger than input rise time. Small capacitive load Output fall time substantially smaller than the input rise time.

22 CSE477 L12&13 Low Power.22Irwin&Vijay, PSU, 2002 I peak as a Function of C L I peak (A) time (sec) x 10 -10 x 10 -4 C L = 20 fF C L = 100 fF C L = 500 fF 500 psec input slope Short circuit dissipation is minimized by matching the rise/fall times of the input and output signals - slope engineering. When load capacitance is small, I peak is large.

23 CSE477 L12&13 Low Power.23Irwin&Vijay, PSU, 2002 P sc as a Function of Rise/Fall Times P normalized t sin /t sou t V DD = 3.3 V V DD = 2.5 V V DD = 1.5V normalized wrt zero input rise-time dissipation When load capacitance is small (t sin /t sout > 2 for V DD > 2V) the power is dominated by P sc If V DD < V Tn + |V Tp | then P sc is eliminated since both devices are never on at the same time. W/L p = 1.125  m/0.25  m W/L n = 0.375  m/0.25  m C L = 30 fF

24 CSE477 L12&13 Low Power.24Irwin&Vijay, PSU, 2002 Leakage (Static) Power Consumption Sub-threshold current is the dominant factor. All increase exponentially with temperature! V DD I leakage Vout Drain junction leakage Sub-threshold current Gate leakage

25 CSE477 L12&13 Low Power.25Irwin&Vijay, PSU, 2002 Leakage as a Function of V T 10 -2 10 -12 10 -7  Continued scaling of supply voltage and the subsequent scaling of threshold voltage will make subthreshold conduction a dominate component of power dissipation.  An 90mV/decade V T roll-off - so each 255mV increase in V T gives 3 orders of magnitude reduction in leakage (but adversely affects performance)

26 CSE477 L12&13 Low Power.26Irwin&Vijay, PSU, 2002 TSMC Processes Leakage and V T 80 0.25 V 13,000 920/400 0.08  m 24 Å 1.2 V CL013 HS 52 0.29 V 1,800 860/370 0.11  m 29 Å 1.5 V CL015 HS 42 Å T ox (effective) 43142230FET Perf. (GHz) 0.40 V0.73 V0.63 V0.42 VV Tn 3000.151.6020I off (leakage) (  A/  m) 780/360320/130500/180600/260I DSat (n/p) (  A/  m) 0.13  m0.18  m0.16  m L gate 2 V1.8 V V dd CL018 HS CL018 ULP CL018 LP CL018 G From MPR, 2000

27 CSE477 L12&13 Low Power.27Irwin&Vijay, PSU, 2002 Exponential Increase in Leakage Currents Temp(C) I leakage (nA/  m) From De,1999

28 CSE477 L12&13 Low Power.28Irwin&Vijay, PSU, 2002 Review: Energy & Power Equations E = C L V DD 2 P 0  1 + t sc V DD I peak P 0  1 + V DD I leakage P = C L V DD 2 f 0  1 + t sc V DD I peak f 0  1 + V DD I leakage Dynamic power (~90% today and decreasing relatively) Short-circuit power (~8% today and decreasing absolutely) Leakage power (~2% today and increasing) f 0  1 = P 0  1 * f clock

29 CSE477 L12&13 Low Power.29Irwin&Vijay, PSU, 2002 Power and Energy Design Space Constant Throughput/Latency Variable Throughput/Latency EnergyDesign TimeNon-active ModulesRun Time Active Logic Design Reduced V dd Sizing Multi-V dd Clock Gating DFS, DVS (Dynamic Freq, Voltage Scaling) Leakage+ Multi-V T Sleep Transistors Multi-V dd Variable V T + Variable V T

30 CSE477 L12&13 Low Power.30Irwin&Vijay, PSU, 2002 Dynamic Power as a Function of Device Size  Device sizing affects dynamic energy consumption l gain is largest for networks with large overall effective fan-outs (F = C L /C g,1 )  The optimal gate sizing factor (f) for dynamic energy is smaller than the one for performance, especially for large F’s l e.g., for F=20, f opt (energy) = 3.53 while f opt (performance) = 4.47  If energy is a concern avoid oversizing beyond the optimal 1234567 0 0.5 1 1.5 f normalized energy F=1 F=2 F=5 F=10 F=20 From Nikolic, UCB

31 CSE477 L12&13 Low Power.31Irwin&Vijay, PSU, 2002 Dynamic Power Consumption is Data Dependent ABOut 001 010 100 110 2-input NOR Gate With input signal probabilities P A=1 = 1/2 P B=1 = 1/2 Static transition probability P 0  1 = P out=0 x P out=1 = P 0 x (1-P 0 )  Switching activity, P 0  1, has two components l A static component – function of the logic topology l A dynamic component – function of the timing behavior (glitching) NOR static transition probability = 3/4 x 1/4 = 3/16

32 CSE477 L12&13 Low Power.32Irwin&Vijay, PSU, 2002 NOR Gate Transition Probabilities CLCL A B BA P 0  1 = P 0 x P 1 = (1-(1-P A )(1-P B )) (1-P A )(1-P B ) PAPA PBPB 0 101  Switching activity is a strong function of the input signal statistics l P A and P B are the probabilities that inputs A and B are one

33 CSE477 L12&13 Low Power.33Irwin&Vijay, PSU, 2002 Transition Probabilities for Some Basic Gates P 0  1 = P out=0 x P out=1 NOR(1 - (1 - P A )(1 - P B )) x (1 - P A )(1 - P B ) OR(1 - P A )(1 - P B ) x (1 - (1 - P A )(1 - P B )) NANDP A P B x (1 - P A P B ) AND(1 - P A P B ) x P A P B XOR(1 - (P A + P B - 2P A P B )) x (P A + P B - 2P A P B ) B A Z X 0.5 For Z: P 0  1 = For X: P 0  1 =

34 CSE477 L12&13 Low Power.34Irwin&Vijay, PSU, 2002 Transition Probabilities for Some Basic Gates P 0  1 = P out=0 x P out=1 NOR(1 - (1 - P A )(1 - P B )) x (1 - P A )(1 - P B ) OR(1 - P A )(1 - P B ) x (1 - (1 - P A )(1 - P B )) NANDP A P B x (1 - P A P B ) AND(1 - P A P B ) x P A P B XOR(1 - (P A + P B - 2P A P B )) x (P A + P B - 2P A P B ) B A Z X 0.5 For Z: P 0  1 = P 0 x P 1 = (1-P X P B ) P X P B For X: P 0  1 = P 0 x P 1 = (1-P A ) P A = 0.5 x 0.5 = 0.25 = (1 – (0.5 x 0.5)) x (0.5 x 0.5) = 3/16

35 CSE477 L12&13 Low Power.35Irwin&Vijay, PSU, 2002 Inter-signal Correlations  Determining switching activity is complicated by the fact that signals exhibit correlation in space and time l reconvergent fan-out B A Z X P(Z=1) = P(B=1) & P(A=1 | B=1) Reconvergent fan-out 0.5  Have to use conditional probabilities

36 CSE477 L12&13 Low Power.36Irwin&Vijay, PSU, 2002 Inter-signal Correlations B A Z X P(Z=1) = P(B=1) & P(A=1 | B=1) 0.5 (1-0.5)(1-0.5)x(1-(1-0.5)(1-0.5)) = 3/16 (1- 3/16 x 0.5) x (3/16 x 0.5) = 0.085 Reconvergent  Determining switching activity is complicated by the fact that signals exhibit correlation in space and time l reconvergent fan-out  Have to use conditional probabilities

37 CSE477 L12&13 Low Power.37Irwin&Vijay, PSU, 2002 Logic Restructuring Chain implementation has a lower overall switching activity than the tree implementation for random inputs Ignores glitching effects  Logic restructuring: changing the topology of a logic network to reduce transitions A B C D F A B C DZ F W X Y 0.5 (1-0.25)*0.25 = 3/16 0.5 7/64 15/256 3/16 15/256 AND: P 0  1 = P 0 x P 1 = (1 - P A P B ) x P A P B

38 CSE477 L12&13 Low Power.38Irwin&Vijay, PSU, 2002 Input Ordering A B C X F 0.5 0.2 0.1 B C A X F 0.2 0.1 0.5 Beneficial to postpone the introduction of signals with a high transition rate (signals with signal probability close to 0.5)

39 CSE477 L12&13 Low Power.39Irwin&Vijay, PSU, 2002 Input Ordering Beneficial to postpone the introduction of signals with a high transition rate (signals with signal probability close to 0.5) A B C X F 0.5 0.2 0.1 B C A X F 0.2 0.1 0.5 (1-0.5x0.2)x(0.5x0.2)=0.09(1-0.2x0.1)x(0.2x0.1)=0.0196

40 CSE477 L12&13 Low Power.40Irwin&Vijay, PSU, 2002 Glitching in Static CMOS Networks ABC X Z 101000 Unit Delay A B X Z C  Gates have a nonzero propagation delay resulting in spurious transitions or glitches (dynamic hazards) l glitch: node exhibits multiple transitions in a single cycle before settling to the correct logic value

41 CSE477 L12&13 Low Power.41Irwin&Vijay, PSU, 2002 Glitching in Static CMOS Networks ABC X Z 101000 Unit Delay A B X Z C  Gates have a nonzero propagation delay resulting in spurious transitions or glitches (dynamic hazards) l glitch: node exhibits multiple transitions in a single cycle before settling to the correct logic value

42 CSE477 L12&13 Low Power.42Irwin&Vijay, PSU, 2002 Glitching in an RCA S0 S1 S2S14 S15 Cin S0 S1 S2 S3 S4 S5 S10 S15

43 CSE477 L12&13 Low Power.43Irwin&Vijay, PSU, 2002 Balanced Delay Paths to Reduce Glitching So equalize the lengths of timing paths through logic F1F1 F2F2 F3F3 0 0 0 0 1 2 F1F1 F2F2 F3F3 0 0 0 0 1 1  Glitching is due to a mismatch in the path lengths in the logic network; if all input signals of a gate change simultaneously, no glitching occurs

44 CSE477 L12&13 Low Power.44Irwin&Vijay, PSU, 2002 Power and Energy Design Space Constant Throughput/Latency Variable Throughput/Latency EnergyDesign TimeNon-active ModulesRun Time Active Logic Design Reduced V dd Sizing Multi-V dd Clock Gating DFS, DVS (Dynamic Freq, Voltage Scaling) Leakage+ Multi-V T Sleep Transistors Multi-V dd Variable V T + Variable V T

45 CSE477 L12&13 Low Power.45Irwin&Vijay, PSU, 2002 Dynamic Power as a Function of V DD  Decreasing the V DD decreases dynamic energy consumption (quadratically)  But, increases gate delay (decreases performance) V DD (V) t p(normalized)  Determine the critical path(s) at design time and use high V DD for the transistors on those paths for speed. Use a lower V DD on the other gates, especially those that drive large capacitances (as this yields the largest energy benefits).

46 CSE477 L12&13 Low Power.46Irwin&Vijay, PSU, 2002 Multiple V DD Considerations  How many V DD ? – Two is becoming common l Many chips already have two supplies (one for core and one for I/O)  When combining multiple supplies, level converters are required whenever a module at the lower supply drives a gate at the higher supply (step-up) l If a gate supplied with V DDL drives a gate at V DDH, the PMOS never turns off -The cross-coupled PMOS transistors do the level conversion -The NMOS transistor operate on a reduced supply l Level converters are not needed for a step-down change in voltage l Overhead of level converters can be mitigated by doing conversions at register boundaries and embedding the level conversion inside the flipflop (see Figure 11.47) V DDH V in V out V DDL

47 CSE477 L12&13 Low Power.47Irwin&Vijay, PSU, 2002 Dual-Supply Inside a Logic Block  Minimum energy consumption is achieved if all logic paths are critical (have the same delay)  Clustered voltage-scaling l Each path starts with V DDH and switches to V DDL (gray logic gates) when delay slack is available l Level conversion is done in the flipflops at the end of the paths

48 CSE477 L12&13 Low Power.48Irwin&Vijay, PSU, 2002 Power and Energy Design Space Constant Throughput/Latency Variable Throughput/Latency EnergyDesign TimeNon-active ModulesRun Time Active Logic Design Reduced V dd Sizing Multi-V dd Clock Gating DFS, DVS (Dynamic Freq, Voltage Scaling) Leakage+ Multi-V T Sleep Transistors Multi-V dd Variable V T + Variable V T

49 CSE477 L12&13 Low Power.49Irwin&Vijay, PSU, 2002 Stack Effect  Leakage is a function of the circuit topology and the value of the inputs V T = V T0 +  (  |-2  F + V SB | -  |-2  F |) where V T0 is the threshold voltage at V SB = 0; V SB is the source- bulk (substrate) voltage;  is the body-effect coefficient AB B A Out VXVX ABVXVX I SUB 00V T ln(1+n)V GS =V BS = -V X 010V GS =V BS =0 10V DD -V T V GS =V BS =0 110V SG =V SB =0  Leakage is least when A = B = 0  Leakage reduction due to stacked transistors is called the stack effect

50 CSE477 L12&13 Low Power.50Irwin&Vijay, PSU, 2002 Short Channel Factors and Stack Effect  In short-channel devices, the subthreshold leakage current depends on V GS,V BS and V DS. The V T of a short-channel device decreases with increasing V DS due to DIBL (drain-induced barrier loading). l Typical values for DIBL are 20 to 150mV change in V T per voltage change in V DS so the stack effect is even more significant for short-channel devices. l V X reduces the drain-source voltage of the top nfet, increasing its V T and lowering its leakage  For our 0.25 micron technology, V X settles to ~100mV in steady state so V BS = -100mV and V DS = V DD -100mV which is 20 times smaller than the leakage of a device with V BS = 0mV and V DS = V DD

51 CSE477 L12&13 Low Power.51Irwin&Vijay, PSU, 2002 Leakage as a Function of Design Time V T  Reducing the V T increases the sub- threshold leakage current (exponentially) l 90mV reduction in V T increases leakage by an order of magnitude  But, reducing V T decreases gate delay (increases performance)  Determine the critical path(s) at design time and use low V T devices on the transistors on those paths for speed. Use a high V T on the other logic for leakage control. l A careful assignment of V T ’s can reduce the leakage by as much as 80%

52 CSE477 L12&13 Low Power.52Irwin&Vijay, PSU, 2002 Dual-Thresholds Inside a Logic Block  Minimum energy consumption is achieved if all logic paths are critical (have the same delay)  Use lower threshold on timing-critical paths l Assignment can be done on a per gate or transistor basis; no clustering of the logic is needed l No level converters are needed

53 CSE477 L12&13 Low Power.53Irwin&Vijay, PSU, 2002 Variable V T (ABB) at Run Time  V T = V T0 +  (  |-2  F + V SB | -  |-2  F |) V SB (V) V T (V)  A negative bias on V SB causes V T to increase  Adjusting the substrate bias at run time is called adaptive body-biasing (ABB) l Requires a dual well fab process  For an n-channel device, the substrate is normally tied to ground (V SB = 0)

54 CSE477 L12&13 Low Power.54Irwin&Vijay, PSU, 2002 Next Lecture and Reminders  Next lecture (after midterm) l Dynamic logic -Reading assignment – Rabaey, et al, 6.3  Reminders l HW3 due Oct 10 th (hand in to TA) l Class cancelled on Oct 10 th as make up for evening midterm l Class cancelled on Oct 15 th due to fall break l I will be out of town Oct 10 th through Oct 15 th and Oct 18 th through Oct 23 rd, so office hours during those periods are cancelled l There will be a guest lecturer on Oct 22 nd l Evening midterm exam scheduled -Wednesday, October 16 th from 8:15 to 10:15pm in 260 Willard -Only one midterm conflict filed for so far


Download ppt "CSE477 L12&13 Low Power.1Irwin&Vijay, PSU, 2002 CSE477 VLSI Digital Circuits Fall 2002 Lecture 12&13: Designing for Low Power Mary Jane Irwin ( www.cse.psu.edu/~mji."

Similar presentations


Ads by Google