Metrics
- Delay (sec): performance metric
- Energy (Joule): efficiency metric; effort to perform a task
- Power (Watt): energy consumed per unit time
- Power*Delay (Joule): mostly a technology parameter; measures the efficiency of performing an operation in a given technology
- Energy*Delay = Power*Delay^2 (Joule-sec): combined performance and energy metric; figure of merit of design style
- Other metrics: Energy-Delay^n (Joule-sec^n): increased weight on performance over energy
Where is Power Dissipated in CMOS?
- Active (dynamic) power: (dis)charging capacitors
- Short-circuit power: both pull-up and pull-down on during transition
- Static (leakage) power: transistors are imperfect switches
- Static currents: biasing currents
Active (or Dynamic) Power
- Key property of active power: it is proportional to f, with f the switching frequency
- Sources:
  - Charging and discharging capacitors
  - Temporary glitches (dynamic hazards)
  - Short-circuit currents
Charging Capacitors
- Applying a voltage step
[Figure: voltage step V charging capacitor C through resistor R]
- Value of R does not impact energy!
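The claim that R does not affect the energy can be checked numerically. The sketch below is illustrative only: it assumes an ideal voltage step driving a series RC circuit, and the component values are hypothetical. It integrates i(t)^2 * R over the charging transient and compares the result with C*V^2/2.

```python
import math

def step_charging_energy(r_ohm, c_farad, v_volt, steps=100_000, t_end_in_rc=20.0):
    """Energy dissipated in R while C charges from 0 to V after a voltage step.

    Integrates i(t)^2 * R with the midpoint rule, where
    i(t) = (V / R) * exp(-t / RC) is the charging current of an ideal RC circuit.
    """
    tau = r_ohm * c_farad
    dt = t_end_in_rc * tau / steps
    energy = 0.0
    for k in range(steps):
        t = (k + 0.5) * dt
        i = (v_volt / r_ohm) * math.exp(-t / tau)
        energy += i * i * r_ohm * dt
    return energy

# Hypothetical values: 1 pF load, 1 V step, three different resistances.
C_LOAD = 1e-12
V_STEP = 1.0
for r in (100.0, 1e3, 1e4):
    e_r = step_charging_energy(r, C_LOAD, V_STEP)
    print(f"R = {r:8.0f} ohm  E_R = {e_r:.6e} J  C*V^2/2 = {0.5 * C_LOAD * V_STEP**2:.6e} J")
```

Changing R only stretches or compresses the transient in time; the dissipated energy stays at C*V^2/2 in this idealized model.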
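Applied to Complementary CMOS Gate
[Figure: complementary CMOS gate between Vdd and ground, with a PMOS pull-up network and an NMOS pull-down network on inputs A1 to AN, driving Vout with load current iL into CL]
- One half of the energy drawn from the supply is consumed in the pull-up network and one half is stored on CL
- Charge from CL is dumped during the 1→0 transition
- Independent of resistance of charging/discharging network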
Circuits with Reduced Swing
- Energy consumed is proportional to output swing
Charging Capacitors - Revisited
- Driving from a constant current source
[Figure: current source I charging capacitor C through resistor R]
- Energy dissipated in resistor can be reduced by increasing charging time T (that is, decreasing I)
Charging Capacitors
- Using constant voltage or current driver?
- $E_{constant\_current} < E_{constant\_voltage}$ if $T > 2RC$
- Energy dissipated using constant current charging can be made arbitrarily small at the expense of delay: adiabatic charging
- Note: $t_p(RC) = 0.69\,RC$; $t_{0\to90\%}(RC) = 2.3\,RC$
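The T > 2RC crossover follows from a short calculation, sketched here under idealized assumptions (ideal sources, linear R and C; all numeric values are hypothetical). A constant current I = C*V/T flowing for time T dissipates I^2 * R * T = (RC/T) * C*V^2 in the resistor, whereas a voltage step dissipates C*V^2/2.

```python
def energy_constant_voltage(c_farad, v_volt):
    """Energy dissipated in R when C is charged to V by a voltage step."""
    return 0.5 * c_farad * v_volt ** 2

def energy_constant_current(r_ohm, c_farad, v_volt, t_charge):
    """Energy dissipated in R when C is charged to V in time t_charge
    by a constant current I = C * V / t_charge."""
    i = c_farad * v_volt / t_charge
    return i * i * r_ohm * t_charge  # equals (R*C / T) * C * V^2

# Hypothetical values: R = 1 kOhm, C = 1 pF, V = 1 V.
R, C, V = 1e3, 1e-12, 1.0
for multiple in (1, 2, 4, 20):
    t = multiple * R * C
    e_i = energy_constant_current(R, C, V, t)
    print(f"T = {multiple:2d} RC: E_current / E_voltage = {e_i / energy_constant_voltage(C, V):.3f}")
```

The ratio is 2RC/T, so it equals 1 at T = 2RC and keeps falling as the charging time grows, which is the adiabatic-charging trade-off of energy against delay.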
Charging Capacitors
- Driving using a sine wave (e.g. from resonant circuit)
[Figure: sinusoidal source v(t) charging capacitor C through resistor R]
- Energy dissipated in resistor can be made arbitrarily small if frequency $\omega \ll 1/RC$ (output signal in phase with input sinusoid)
Dynamic Power Consumption
- Power = Energy/transition × Transition rate
  $= C_L V_{DD}^2 \cdot f_{0\to1} = C_L V_{DD}^2 \cdot f \cdot P_{0\to1} = C_{switched} V_{DD}^2 \cdot f$
- Power dissipation is data dependent: it depends on the switching probability
- Switched capacitance $C_{switched} = P_{0\to1} C_L = \alpha C_L$ ($\alpha$ is called the switching activity)
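As a small worked example of the expression above, the following sketch evaluates the dynamic power for one node. The function name and the numeric values (10 fF, 1 V, 1 GHz, activity 3/16) are hypothetical and chosen only for illustration.

```python
def dynamic_power(c_load, v_dd, f_clk, p_0_to_1):
    """Dynamic power of one node: C_switched * VDD^2 * f,
    with C_switched = P(0->1) * C_L."""
    c_switched = p_0_to_1 * c_load
    return c_switched * v_dd ** 2 * f_clk

# Hypothetical node: 10 fF load, 1 V supply, 1 GHz clock, activity 3/16.
p = dynamic_power(c_load=10e-15, v_dd=1.0, f_clk=1e9, p_0_to_1=3 / 16)
print(f"P_dyn = {p * 1e6:.3f} uW")
```

Halving the activity or the capacitance halves the power, while halving VDD cuts it by a factor of four in this model.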
Impact of Logic Function
- Example: static 2-input NOR gate
- Assume signal probabilities $p_{A=1} = 1/2$, $p_{B=1} = 1/2$
[Table: NOR truth table with columns A, B, Out]
- Then transition probability $p_{0\to1} = p_{Out=0} \times p_{Out=1} = 3/4 \times 1/4 = 3/16$
- If inputs switch every cycle: $\alpha_{NOR} = 3/16$
- NAND gate yields similar result
Impact of Logic Function
- Example: static 2-input XOR gate
- Assume signal probabilities $p_{A=1} = 1/2$, $p_{B=1} = 1/2$
[Table: XOR truth table with columns A, B, Out]
- Then transition probability $p_{0\to1} = p_{Out=0} \times p_{Out=1} = 1/2 \times 1/2 = 1/4$
- If inputs switch in every cycle: $P_{0\to1} = 1/4$
- Note: this assumes inputs of 0 and 1 are equally likely. For dynamic gates, the activity depends only on the signal probability, while for the static case the transition probability depends on the previous state. Remember that for the static NOR gate $P_{0\to1} = 3/16$.
Transition Probabilities for Basic Gates
As a function of the input probabilities:

Gate   $p_{0\to1}$
AND    $(1 - p_A p_B)\, p_A p_B$
OR     $(1 - p_A)(1 - p_B)\,(1 - (1 - p_A)(1 - p_B))$
XOR    $(1 - (p_A + p_B - 2 p_A p_B))\,(p_A + p_B - 2 p_A p_B)$

Activity for static CMOS gates: $\alpha = p_0 p_1$
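The table entries can be reproduced by enumerating a gate's truth table. The sketch below assumes statistically independent inputs and uses hypothetical helper names. It computes the probability that the output is 1 and then the static activity p0 * p1.

```python
from itertools import product

def output_one_probability(gate, input_probs):
    """P(output = 1) for a gate with independent inputs.

    gate: function of 0/1 inputs returning 0 or 1
    input_probs: list with P(input = 1) for each input
    """
    p1 = 0.0
    for bits in product((0, 1), repeat=len(input_probs)):
        weight = 1.0
        for bit, p in zip(bits, input_probs):
            weight *= p if bit else 1.0 - p
        if gate(*bits):
            p1 += weight
    return p1

def static_activity(gate, input_probs):
    """Static CMOS activity alpha = p0 * p1."""
    p1 = output_one_probability(gate, input_probs)
    return (1.0 - p1) * p1

AND = lambda a, b: a & b
OR = lambda a, b: a | b
XOR = lambda a, b: a ^ b
NOR = lambda a, b: 1 - (a | b)

for name, gate in (("AND", AND), ("OR", OR), ("XOR", XOR), ("NOR", NOR)):
    print(name, static_activity(gate, [0.5, 0.5]))
```

With both input probabilities at 1/2 this gives 3/16 for AND, OR, and NOR, and 1/4 for XOR, matching the closed-form expressions.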
Activity as a Function of Topology
- XOR versus NAND/NOR
[Figure: activity of XOR and NAND/NOR gates compared]
- $\alpha_{NOR,NAND} = (2^N - 1)/2^{2N}$
- $\alpha_{XOR} = 1/4$
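To see how quickly the NAND/NOR activity falls with fan-in while the XOR activity stays constant, the two expressions can be tabulated. This sketch assumes N is the number of gate inputs and that every input is 1 with probability 1/2.

```python
def nand_nor_activity(n_inputs):
    """Static activity of an N-input NAND or NOR gate with equiprobable inputs."""
    return (2 ** n_inputs - 1) / 2 ** (2 * n_inputs)

XOR_ACTIVITY = 0.25  # independent of fan-in for equiprobable inputs

for n in range(2, 9):
    print(f"N = {n}: NAND/NOR = {nand_nor_activity(n):.5f}, XOR = {XOR_ACTIVITY:.5f}")
```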
How about Dynamic Logic?
[Figure: dynamic gate connected to VDD, with precharge and evaluate phases]
- Energy dissipated when effective output is zero! That is, $P_{0\to1} = P_0$
- Always larger than $P_0 P_1$!
- E.g. $P_{0\to1}(NAND) = 1/2^N$; $P_{0\to1}(NOR) = (2^N - 1)/2^N$
- Activity in dynamic circuits hence always higher than static
- But … capacitance most often smaller
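A quick comparison makes the "always larger" statement concrete. The sketch assumes equiprobable, independent inputs and uses hypothetical function names. The dynamic activity is P0, the static activity is P0 * (1 - P0).

```python
def nand_p0(n_inputs):
    """P(output = 0) of an N-input NAND: all inputs must be 1."""
    return 1 / 2 ** n_inputs

def nor_p0(n_inputs):
    """P(output = 0) of an N-input NOR: at least one input is 1."""
    return (2 ** n_inputs - 1) / 2 ** n_inputs

def dynamic_activity(p0):
    """Precharged gate: a 0->1 transition follows every evaluation to 0."""
    return p0

def static_activity(p0):
    """Static gate: needs a 0 followed by a 1."""
    return p0 * (1 - p0)

for n in (2, 3, 4):
    for name, p0 in (("NAND", nand_p0(n)), ("NOR", nor_p0(n))):
        print(f"{n}-input {name}: dynamic = {dynamic_activity(p0):.4f}, static = {static_activity(p0):.4f}")
```

Since P1 is at most 1, P0 is never smaller than P0 * P1. Whether the dynamic gate actually dissipates more power also depends on its capacitance, which is often smaller.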
Differential Logic?
[Figure: differential gate connected to VDD with complementary outputs Out and Out-bar]
- Static: activity is doubled
- Dynamic: transition probability is 1!
- Hence: power always increases
Evaluating Power Dissipation of Complex Logic
- Simple idea: start from inputs and propagate signal probabilities to outputs
[Figure: logic network annotated with signal probabilities P1]
- But:
  - Reconvergent fanout
  - Feedback and temporal/spatial correlations
Reconvergent Fanout (Spatial Correlation)
- Inputs to gate can be interdependent (correlated)
[Figure: two circuits, one without reconvergence and one with reconvergent fanout]
- No reconvergence: $P_Z = 1 - (1 - P_A) P_B$
- Reconvergent: $P_Z = 1 - (1 - P_A) P_A$? NO! $P_Z = 1$
- $P_Z$: probability that Z = 1
- Must use conditional probabilities: $P_Z = 1 - P_A \cdot P(X|A) = 1$, where $P(X|A)$ is the probability that X = 1 given that A = 1
- Becomes complex and intractable real fast
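The difference between naive propagation and the correct result can be reproduced by enumeration. The circuit is not fully recoverable from the slide text, so this sketch assumes, based on the formulas given, that X = NOT A and Z = NAND(X, A). The function names are hypothetical.

```python
def nand(x, y):
    return 1 - (x & y)

def naive_pz(p_a):
    """Propagate probabilities as if X and A were independent (incorrect here)."""
    p_x = 1 - p_a          # X = NOT A
    return 1 - p_x * p_a   # NAND of two 'independent' signals

def exact_pz(p_a):
    """Enumerate the single primary input A, so the correlation is respected."""
    total = 0.0
    for a in (0, 1):
        weight = p_a if a else 1 - p_a
        x = 1 - a          # X = NOT A, fully determined by A
        total += weight * nand(x, a)
    return total

print("naive:", naive_pz(0.5))   # treats X and A as independent
print("exact:", exact_pz(0.5))   # X and A are never both 1, so Z is always 1
```

Exact enumeration over primary inputs grows exponentially with the number of inputs, which is one reason the slide calls the general problem intractable.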
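Temporal Correlations
- Feedback
[Figure: logic block with register R in a feedback loop around signal X]
  - X is a function of itself → correlated in time
- Temporal correlation in input streams
[Figure: two input bit streams]
  - Both streams have the same signal probability (probability of being 1) but different switching statistics
- Activity estimation is the hardest part of power analysis
- Typically done through simulation with actual input vectors (see later)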
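Two bit streams with identical signal probability can switch at very different rates. The streams below are made-up examples, not taken from the slide figure. The sketch counts 0→1 transitions for each.

```python
def signal_probability(bits):
    """Fraction of samples that are 1."""
    return sum(bits) / len(bits)

def transitions_0_to_1(bits):
    """Number of 0->1 transitions between consecutive samples."""
    return sum(1 for prev, cur in zip(bits, bits[1:]) if prev == 0 and cur == 1)

# Hypothetical streams, 16 samples each.
alternating = [0, 1] * 8          # 0101...01
blocked = [0] * 8 + [1] * 8       # 00000000 11111111

for name, stream in (("alternating", alternating), ("blocked", blocked)):
    print(name, signal_probability(stream), transitions_0_to_1(stream))
```

Both streams have signal probability 0.5, yet one makes eight 0→1 transitions and the other only one, so signal probability alone does not determine activity.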
Glitching in Static CMOS
- Analysis so far did not include timing effects
[Figure: timing diagram for inputs ABC (101 to 000), node X, and output Z, with the glitch and the gate delay marked]
- Also known as dynamic hazards: “A single input change causing multiple changes in the output”
- The result is correct, but extra power is dissipated
Example: Chain of NAND Gates
[Figure: chain of NAND gates with outputs Out1 to Out8; node voltage (V) versus time (ps)]
What Causes Glitches?
[Figure: two gate networks with inputs A,B and C,D and nodes X, Y, Z, shown with their waveforms]
- Uneven arrival times of input signals of gate due to unbalanced delay paths
- Solution: balancing delay paths!
Short-Circuit Currents (also called crowbar currents)
- PMOS and NMOS simultaneously on during transition
- $P_{sc} \sim f$
Short-Circuit Currents
[Figure: inverter driving a large load and a small load CL, with the resulting short-circuit current; plot of Isc (A, ×10^-4) versus time (s) for CL = 20 fF, 100 fF, and 500 fF]
- Equalizing rise/fall times of input and output signals limits Psc to 10-15% of the dynamic dissipation
[Ref: H. Veendrick, JSSC’84]
Modeling Short-Circuit Power
- Can be modeled as capacitor
  - a, b: technology parameters
  - k: function of supply and threshold voltages, and transistor sizes
- Easily included in timing and power models
Gate Tunneling
[Figure: transistor biased at VDD and 0 V showing gate leakage currents IGD and IGS alongside ISUB; plot of Igate (A, ×10^-10) versus VDD (V) for 90 nm CMOS]
- Exponential function of supply voltage: $I_{GD} \sim e^{-T_{ox}} e^{V_{GD}}$, $I_{GS} \sim e^{-T_{ox}} e^{V_{GS}}$
- Independent of the sub-threshold leakage
- Modeled in BSIM4; also in BSIM3v3 (but not always included in foundry models)
- NMOS gate leakage usually worse than PMOS
Other sources of static power dissipation
- Diode (drain-substrate) reverse bias currents
[Figure: cross-section showing p+ and n+ diffusions in an n well and p substrate]
- Electron-hole pair generation in depletion region of reverse-biased diodes
- Diffusion of minority carriers through junction
- For sub-50nm technologies with highly-doped pn junctions, tunneling through narrow depletion region becomes an issue
- Strong function of temperature
- Much smaller than other leakage components in general
Other sources of static power dissipation
- Circuits with dc bias currents: sense amplifiers, voltage converters and regulators, sensors, mixed-signal components, etc.
- Should be turned off if not used, or standby current should be minimized
Summary of Power Dissipation Sources
- $\alpha$: switching activity
- $C_L$: load capacitance
- $C_{CS}$: short-circuit capacitance
- $V_{swing}$: voltage swing
- $f$: frequency
- $I_{DC}$: static current
- $I_{leak}$: leakage current
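The symbols above can be combined into a single power estimate. The slide's own equation did not survive extraction, so the expression below is an assumption, not a quotation: switched and short-circuit capacitance scaled by activity, supply voltage, and frequency, plus the static and leakage currents times the supply voltage. All numeric values are hypothetical.

```python
def total_power(alpha, c_load, v_swing, c_sc, v_dd, f, i_dc, i_leak):
    """Assumed summary model:
    P = alpha * (C_L * V_swing + C_CS) * VDD * f + (I_DC + I_leak) * VDD
    """
    active = alpha * (c_load * v_swing + c_sc) * v_dd * f
    static = (i_dc + i_leak) * v_dd
    return active + static

# Hypothetical block: 1 pF switched load, full-swing 1 V, 0.1 pF short-circuit
# capacitance, 1 GHz, activity 0.1, no dc bias current, 1 uA leakage.
p = total_power(alpha=0.1, c_load=1e-12, v_swing=1.0, c_sc=1e-13,
                v_dd=1.0, f=1e9, i_dc=0.0, i_leak=1e-6)
print(f"P_total = {p * 1e6:.1f} uW")
```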
The Traditional Design Philosophy
- Maximum performance is primary goal
  - Minimum delay at circuit level
- Architecture implements the required function with target throughput, latency
- Performance achieved through optimum sizing, logic mapping, architectural transformations
- Supplies, thresholds set to achieve maximum performance, subject to reliability constraints
CMOS Performance Optimization
- Sizing: optimal performance with equal fanout per stage
- Extendable to general logic cone through ‘logical effort’
- Equal effective fanouts ($g_i C_{i+1}/C_i$) per stage
- Example: memory decoder
[Ref: I. Sutherland, Morgan Kaufmann ’98]
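For the simplest case, a chain of inverters (logical effort g = 1 for every stage), equal fanout per stage means each stage is larger than the previous one by the same factor. The sketch below assumes that case only; the function name and capacitance values are hypothetical.

```python
def equal_fanout_sizing(c_in, c_load, n_stages):
    """Return the per-stage fanout and the input capacitance of each stage
    for an inverter chain sized with equal fanout per stage."""
    fanout = (c_load / c_in) ** (1.0 / n_stages)
    stage_caps = [c_in * fanout ** i for i in range(n_stages)]
    return fanout, stage_caps

# Hypothetical chain: unit input capacitance driving a load 64 times larger
# through 3 stages.
fanout, caps = equal_fanout_sizing(c_in=1.0, c_load=64.0, n_stages=3)
print(f"fanout per stage: {fanout:.3f}")
print("stage input capacitances:", [round(c, 3) for c in caps])
```

For general logic, the same idea applies to the effective fanout g_i * C_{i+1} / C_i rather than the plain capacitance ratio.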
Model not Appropriate Any Longer
- Traditional scaling model
- Maintaining the frequency scaling model
- While slowing down voltage scaling
The New Design Philosophy
- Maximum performance (in terms of propagation delay) is too power-hungry, and/or not even practically achievable
- Many (if not most) applications either can tolerate larger latency, or can live with lower than maximum clock-speeds
- Excess performance (as offered by technology) to be used for energy/power reduction
- Trading off speed for power
Relationship Between Power and Delay
[Figure: power (W) and delay (s) plotted over VDD and VTH (V), with operating points A and B marked]
- For a given activity level, power is reduced while delay is unchanged if both VDD and VTH are lowered, such as from A to B.
[Ref: T. Sakurai and T. Kuroda, numerous references]
The Energy-Delay Space
[Figure: VDD versus VTH plane showing equal performance curves, equal energy curves, and the energy minimum]
Energy-Delay Product as a Metric
[Figure: delay, energy, and energy-delay versus VDD; 90 nm technology, VTH approx. 0.35 V]
- Energy-delay exhibits minimum at approximately 2 VTH (typical unless leakage dominates)
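A rough numerical check of the "approximately 2 VTH" observation is possible with a simplified model. The model is an assumption and does not come from the slides: energy proportional to VDD^2, delay following an alpha-power law VDD / (VDD - VTH)^alpha with a hypothetical alpha = 1.5, and leakage ignored.

```python
def energy(v_dd):
    """Switching energy up to a constant factor (assumed E ~ VDD^2)."""
    return v_dd ** 2

def delay(v_dd, v_th, alpha):
    """Gate delay up to a constant factor (assumed alpha-power law)."""
    return v_dd / (v_dd - v_th) ** alpha

def edp_minimum(v_th, alpha, v_max=1.2, steps=10_000):
    """Grid search for the supply voltage that minimizes energy * delay."""
    best_v, best_edp = None, float("inf")
    for k in range(1, steps + 1):
        v = v_th + (v_max - v_th) * k / steps
        edp = energy(v) * delay(v, v_th, alpha)
        if edp < best_edp:
            best_v, best_edp = v, edp
    return best_v

V_TH = 0.35    # approximate threshold quoted on the slide
ALPHA = 1.5    # hypothetical velocity-saturation index
v_opt = edp_minimum(V_TH, ALPHA)
print(f"EDP minimum at VDD = {v_opt:.3f} V = {v_opt / V_TH:.2f} * VTH")
```

In this model the minimum sits at VDD = 3 VTH / (3 - alpha), so the location depends on the chosen alpha: 2 VTH for alpha = 1.5, and 3 VTH for a long-channel alpha of 2.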
Exploring the Energy-Delay Space
[Figure: energy versus delay, with bounds Emin, Emax, Dmin, Dmax, an unoptimized design, and the curve of Pareto-optimal designs]
- In energy-constrained world, design is trade-off process
  - Minimize energy for a given performance requirement
  - Maximize performance for given energy budget
[Ref: D. Markovic, JSSC’04]
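The notion of Pareto-optimal designs can be illustrated with a small filter over candidate design points. The (delay, energy) pairs below are invented for illustration. A design is kept only if no other design is at least as good in both delay and energy.

```python
def pareto_optimal(designs):
    """Return the designs not dominated by any other design.

    designs: list of (delay, energy) tuples; smaller is better for both.
    """
    front = []
    for d in designs:
        dominated = any(
            other != d and other[0] <= d[0] and other[1] <= d[1]
            for other in designs
        )
        if not dominated:
            front.append(d)
    return sorted(front)

# Hypothetical candidate designs as (delay, energy) in arbitrary units.
candidates = [(1.0, 9.0), (2.0, 4.0), (3.0, 5.0), (4.0, 1.0), (2.5, 4.0)]
print(pareto_optimal(candidates))
```

Here (3.0, 5.0) and (2.5, 4.0) are both dominated by (2.0, 4.0), so they play the role of unoptimized designs that can be improved in delay without spending more energy.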
Summary
- Power and energy are now primary design constraints
- Active power still dominating for most applications
  - Supply voltage, activity and capacitance the key parameters
- Leakage becomes major factor in sub-100nm technology nodes
  - Mostly impacted by supply and threshold voltages
- Design has become energy-delay trade-off exercise!
References
- D. Markovic, V. Stojanovic, B. Nikolic, M.A. Horowitz, R.W. Brodersen, “Methods for True Energy-Performance Optimization,” IEEE Journal of Solid-State Circuits, vol. 39, no. 8, pp , Aug
- J. Rabaey, A. Chandrakasan, B. Nikolic, “Digital Integrated Circuits: A Design Perspective,” 2nd ed., Prentice Hall, 2003.
- T. Sakurai, “Perspectives on power-aware electronics,” Digest of Technical Papers ISSCC, pp. 26-29, Feb. 2003.
- I. Sutherland, B. Sproull, and D. Harris, “Logical Effort,” Morgan Kaufmann, 1999.
- H. Veendrick, “Short-Circuit Dissipation of Static CMOS Circuitry and its Impact on the Design of Buffer Circuits,” IEEE Journal of Solid-State Circuits, vol. SC-19, no. 4, pp. 468–473, 1984.