VLSI Design Power Frank Sill Torres Department of Electronic Engineering, Federal University of Minas Gerais, Av. Antônio Carlos 6627, CEP: 31270-010,


1 VLSI Design Power. Frank Sill Torres, Department of Electronic Engineering, Federal University of Minas Gerais, Av. Antônio Carlos 6627, CEP: 31270-010, Belo Horizonte (MG), Brazil

2 Copyright Sill Torres, 2012. TRENDS

3 Trend: Performance
Source: Moore, ISSCC 2003

4 Trends: Power Dissipation
[Figure: SoC consumer portable power trend. Source: ITRS, 2010 Update]

5 Trends: Power Density
[Figure: power density trend, annotated with the reference levels "Hot Plate" and "Nuclear Reactor"] Source:

6 Problems of High Power Dissipation
Continuously increasing performance demands → increasing power dissipation of technical devices → today, power dissipation is a main problem.
High power dissipation leads to:
- High effort for cooling
- Increasing operational costs
- Reduced reliability
- Reduced time of operation
- Higher weight (batteries)
- Reduced mobility

7 Chip Power Density Distribution
- Power density is not uniformly distributed across the chip
- Silicon is not a good heat conductor
- Max junction temperature is determined by hot spots → impact on packaging and cooling
[Figures: Power Map; On-Die Temperature]

8 "The Internet is an Electricity Hog"
Energy for the internet in 2001 in Germany: 6.8 bn kWh = 1.4 % of total energy consumption
- 2.35 bn kWh for 17.3 million Internet PCs
- 1.91 bn kWh for servers
- 1.67 bn kWh for the network
- 0.87 bn kWh for UPS (uninterruptible power supplies)
Rate of growth (at the moment): 36 % per year
Prognosis: > 6 % of total energy consumption → > 3 medium nuclear power plants
World: 400 million PCs → 0.16 PW (P = peta = 10^15)
Source: Badische Zeitung, 2003

9 Dissipation in a Notebook
[Block diagram labels: Power supply (Battery, DC-DC converter); Processing (ASICs, Memory, programmable µPs or DSPs); Peripherals (Disk, Display); Communication (WLAN, Ethernet)]

10 Examples for Energy Dissipation
[Figures: energy dissipation in a notebook; energy dissipation in a PDA]

11 Battery Capacity
[Figure: generalized Moore's Law compared with the capacity of batteries]
- Capacity of batteries: 2 % to 6 % increase per year (up to year 2000)
- "Intel beats Varta"
Source: Timmernann, 2007

12 Current Progress
[Figure label: battery, 20 kg]
- Factor 4 in the last 10 years → still far too little

13 POWER CONSUMPTION IN CMOS

14 Metrics: Energy and Power
Energy
- Measured in joules or kWh
- "Measure of the ability of a system to do work or produce a change"
- "No activity is possible without energy."
Power
- Measured in watts or kW
- "Amount of energy required for a given unit of time."
- Average power: average amount of energy consumed per unit time; simplified to "power" in clear contexts
- Instantaneous power: energy consumed per unit time as the time unit goes to zero

15 Metrics: Energy and Power cont'd
Instantaneous electrical power P(t)
- $P(t) = v(t) \cdot i(t)$
- v(t): potential difference (or voltage drop) across the component
- i(t): current through the component
Electrical energy
- $E = P(t) \cdot t = v(t) \cdot i(t) \cdot t$ (for constant power; in general, E is the integral of P(t) over time)
Electrical energy in CMOS circuits
- Energy = Power * Delay
- Why?
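
The relation between instantaneous power and energy can be checked numerically. The following is a minimal sketch, not part of the original slides: it assumes a load capacitance charged from V_DD through a resistor R (a simple stand-in for a switching PMOS transistor), so that the supply current is i(t) = (V_DD / R) * exp(-t / RC). The function name `supply_energy` and all component values are hypothetical.

```python
import math

def supply_energy(v_dd, r, c, steps=20000, t_end_in_tau=20.0):
    """Integrate P(t) = v(t) * i(t) drawn from the supply (trapezoidal rule)."""
    tau = r * c
    dt = t_end_in_tau * tau / steps
    prev_p = v_dd * (v_dd / r)          # instantaneous power at t = 0
    energy = 0.0
    for k in range(1, steps + 1):
        t = k * dt
        p = v_dd * (v_dd / r) * math.exp(-t / tau)
        energy += 0.5 * (prev_p + p) * dt
        prev_p = p
    return energy

C_L = 10e-15    # hypothetical load: 10 fF
V_DD = 1.0      # hypothetical supply: 1.0 V
e_fast = supply_energy(V_DD, 1e3, C_L)    # strong driver, high peak power
e_slow = supply_energy(V_DD, 10e3, C_L)   # weak driver, low peak power
print(e_fast, e_slow, C_L * V_DD ** 2)
```

Under these assumptions both drivers draw approximately C_L * V_DD^2 from the supply: the stronger driver has the higher instantaneous power but finishes sooner, so the area under the power curve is the same.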

16 Consumption in CMOS
Water analogy:
- Voltage (volt, V) ↔ water pressure (bar)
- Current (ampere, A) ↔ water quantity per second (liter/s)
- Energy ↔ amount of water
Energy consumption is proportional to the capacitive load C_L!
[Figure: load capacitance C_L with logic levels 0 and 1]

17 Consumption in CMOS cont'd
Water analogy:
- Voltage (volt, V) ↔ water pressure (bar)
- Current (ampere, A) ↔ water quantity per second (liter/s)
- Energy ↔ amount of water
Energy for the calculation is consumed only at a 0→1 transition at the output.
[Figure: load capacitance C_L with logic levels 0 and 1]

18 Energy and Instantaneous Power
- INV1: high instantaneous power (bigger width), delay t_d1
- INV2: low instantaneous power, delay t_d2
→ Same energy (C_in ignored)
→ INV1 is faster
[Figure: inverters INV1 and INV2, each driving a load C_L]

19 Metrics: Energy and Power cont'd
[Figure: two plots of watts over time for Approach 1 and Approach 2; power is the height of the curve, energy is the area under the curve]
Energy = Power * time for calculation = Power * Delay

20 Metrics: Energy and Power cont'd
Energy dissipation
- Determines battery life in hours
- Sets packaging limits
Peak power
- Determines power/ground wiring designs
- Impacts signal noise margin and reliability analysis

21 Metrics: PDP and EDP
Power-Delay Product
- Power P, delay t_p
- Quality criterion: $PDP = P \cdot t_p$ [J]
- P and t_p have the same weight
- Two designs can have the same PDP, even if t_p = 1 year
Energy-Delay Product
- $EDP = PDP \cdot t_p = P \cdot t_p^2$
- Delay t_p has a higher weight
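
A short numeric sketch of why the EDP penalizes slow designs where the PDP does not. The two designs and their power and delay figures are hypothetical and chosen only so that both have the same PDP.

```python
def pdp(power_w, delay_s):
    """Power-delay product in joules."""
    return power_w * delay_s

def edp(power_w, delay_s):
    """Energy-delay product in joule-seconds: PDP * t_p = P * t_p^2."""
    return pdp(power_w, delay_s) * delay_s

design_a = (10e-3, 1e-9)   # hypothetical: 10 mW, 1 ns
design_b = (1e-3, 10e-9)   # hypothetical: 1 mW, 10 ns

pdp_a, pdp_b = pdp(*design_a), pdp(*design_b)
edp_a, edp_b = edp(*design_a), edp(*design_b)
print(pdp_a, pdp_b)   # equal energy per operation
print(edp_a, edp_b)   # design_b is ten times worse in EDP
```

Both designs spend the same energy per operation, but the ten times slower design has a ten times larger EDP.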

22 Energy and Power
- Average power is directly proportional to energy
- In the following: "power" means average power

23 Where Does Power Go in CMOS?
- Dynamic power consumption: charging and discharging capacitors
- Short-circuit currents: short-circuit path between supply rails during switching
- Leakage: leaking diodes and transistors

24 Dynamic Power Consumption
$P_{dyn} = C_L \cdot V_{DD}^2 \cdot P_{0 \to 1} \cdot f$
- $P_{0 \to 1}$: probability for a 0-to-1 switch of the output
- f: clock frequency
- α: activity, with $f_{0 \to 1} = \alpha \cdot f$
Data dependent: a function of switching activity!
[Figure: inverter with V_in, V_out, load C_L and supply V_DD]
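
The dynamic power formula can be evaluated directly. This is a small sketch with hypothetical values (10 fF load, 1 GHz clock, switching probability 0.25); only the formula itself comes from the slide.

```python
def dynamic_power(c_load, v_dd, p_switch, f_clk):
    """P_dyn = C_L * V_DD^2 * P_0->1 * f, in watts."""
    return c_load * v_dd ** 2 * p_switch * f_clk

# Hypothetical gate: 10 fF load, P_0->1 = 0.25, 1 GHz clock.
p_nominal = dynamic_power(10e-15, 1.0, 0.25, 1e9)
p_half_vdd = dynamic_power(10e-15, 0.5, 0.25, 1e9)
print(p_nominal, p_half_vdd, p_nominal / p_half_vdd)
```

Halving V_DD in this model cuts the dynamic power to a quarter, while halving C_L, the switching probability or f would only halve it.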

25 Dynamic Power Consumption
[Figure: load capacitance C_L charged from V_DD]

26 Transition Probabilities for CMOS Cells
Example: static 2-input NOR cell
Truth table of the NOR2 cell:
  A  B  Out
  0  0  1
  0  1  0
  1  0  0
  1  1  0
If A and B have the same input signal probability:
- $P_{A=1} = 1/2$, $P_{B=1} = 1/2$
Then:
- $P_{Out=0} = 3/4$, $P_{Out=1} = 1/4$
- $P_{0 \to 1} = P_{Out=0} \cdot P_{Out=1} = 3/4 \cdot 1/4 = 3/16$
- $C_{eff} = P_{0 \to 1} \cdot C_L = 3/16 \cdot C_L$

27 Transition Probabilities cont'd
A and B with different input signal probabilities:
- P_A and P_B: probability that the input is 1
- P_1: probability that the output is 1
- Switching activity in CMOS circuits: $P_{0 \to 1} = P_0 \cdot P_1$
For a 2-input NOR: $P_1 = (1 - P_A)(1 - P_B)$
Thus: $P_{0 \to 1} = (1 - P_1) \cdot P_1 = [1 - (1 - P_A)(1 - P_B)] \cdot [(1 - P_A)(1 - P_B)]$ (see next slide)
Table: $P_{0 \to 1} = P_{out=0} \cdot P_{out=1}$ per cell
  NOR   $(1 - (1 - P_A)(1 - P_B)) \cdot (1 - P_A)(1 - P_B)$
  OR    $(1 - P_A)(1 - P_B) \cdot (1 - (1 - P_A)(1 - P_B))$
  NAND  $P_A P_B \cdot (1 - P_A P_B)$
  AND   $(1 - P_A P_B) \cdot P_A P_B$
  XOR   $(1 - (P_A + P_B - 2 P_A P_B)) \cdot (P_A + P_B - 2 P_A P_B)$
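
The table of transition probabilities can be reproduced with a few lines of code. This sketch assumes statistically independent inputs, as the slide formulas do; the function names are illustrative only.

```python
from fractions import Fraction

def output_one_probability(gate, p_a, p_b):
    """Probability that the output of a 2-input gate is 1 (independent inputs)."""
    if gate == "AND":
        return p_a * p_b
    if gate == "NAND":
        return 1 - p_a * p_b
    if gate == "OR":
        return 1 - (1 - p_a) * (1 - p_b)
    if gate == "NOR":
        return (1 - p_a) * (1 - p_b)
    if gate == "XOR":
        return p_a + p_b - 2 * p_a * p_b
    raise ValueError("unknown gate: " + gate)

def transition_probability(gate, p_a, p_b):
    """P_0->1 = P_0 * P_1 = (1 - P_1) * P_1."""
    p_one = output_one_probability(gate, p_a, p_b)
    return (1 - p_one) * p_one

half = Fraction(1, 2)
for gate in ("NOR", "OR", "NAND", "AND", "XOR"):
    print(gate, transition_probability(gate, half, half))
```

With both input probabilities at 1/2, the NOR case gives 3/16 as in the worked example on the previous slide.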

28 Transition Probabilities cont'd
[Figure: transition probability of the NOR2 cell as a function of the input probabilities]
Probability of input signals → high influence on $P_{0 \to 1}$
Source: Timmernann, 2007

29 Short Circuit Power Consumption
Finite slope of the input signal
→ During switching: NMOS and PMOS transistors are both conducting for a short period of time (t_sc)
→ Direct current path between V_DD and GND
$P_{sc} = V_{DD} \cdot I_{sc} \cdot (P_{0 \to 1} + P_{1 \to 0})$
[Figure: inverter with V_in, V_out and load C_L; short-circuit current I_sc flows from V_DD to GND during t_sc]

30 Leakage Power Consumption
Most important leakage currents:
- Subthreshold leakage I_sub
- Gate oxide leakage I_gate
$P_{leak} = I_{leak} \cdot V_{DD} \approx (I_{sub} + I_{gate}) \cdot V_{DD}$
[Figures: inverter between V_DD and GND with load C_L, showing I_sub and I_gate; transistor cross-section with source, drain, gate, SiO2 and channel length L, showing I_gate and I_sub]

31 Power Equations in CMOS
$P = \alpha f C_L V_{DD}^2 + V_{DD} I_{peak} (P_{0 \to 1} + P_{1 \to 0}) + V_{DD} I_{leak}$
- First term: dynamic power (≈ % today and decreasing relatively)
- Second term: short-circuit power (≈ 10 % today and decreasing absolutely)
- Third term: leakage power (≈ 20 – 50 % today and increasing)
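
To see how the three terms combine, the equation can be evaluated for one set of numbers. This is only an illustrative sketch: every value below is a hypothetical, lumped chip-level figure and none of them comes from the slides.

```python
def power_components(alpha, f_clk, c_load, v_dd, i_peak, p_01, p_10, i_leak):
    """Return (dynamic, short-circuit, leakage) power in watts."""
    dynamic = alpha * f_clk * c_load * v_dd ** 2
    short_circuit = v_dd * i_peak * (p_01 + p_10)
    leakage = v_dd * i_leak
    return dynamic, short_circuit, leakage

# Hypothetical lumped values: 1 nF switched capacitance, 1 GHz, 1.0 V.
dyn, sc, leak = power_components(
    alpha=0.1, f_clk=1e9, c_load=1e-9, v_dd=1.0,
    i_peak=0.05, p_01=0.1, p_10=0.1, i_leak=0.04,
)
total = dyn + sc + leak
for name, value in (("dynamic", dyn), ("short-circuit", sc), ("leakage", leak)):
    print(name, value, round(100 * value / total, 1), "%")
```

For these assumed inputs the split is roughly two thirds dynamic, under a tenth short-circuit, and about a quarter leakage.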

32 LEAKAGE

33 Trends
[Figure: transistor roadmap across the 90 nm, 65 nm, 45 nm and 32 nm technology generations, from manufacturing through development to research; labels include strained silicon with SiGe S/D, high-k metal gate on Si substrate, tri-gate, III-V, carbon nanotube FET and nanowire, with gate dimensions from 50 nm down to 5 nm]

34 Trends cont'd
[Figure: dynamic power dissipation compared with power dissipation by leakage currents. Source: S. Borkar (Intel), '05]

35 Recap: Transistor Geometry
[Figure: MOS transistor with n+ regions, p-type body, polysilicon gate, gate length L, gate width W and SiO2 gate oxide of thickness t_ox]
- SiO2 gate oxide: good insulator, ε_ox = 3.9
- t_ox: thickness of the oxide layer
Source: Rabaey, "Digital Integrated Circuits", 1995

36 Subthreshold Leakage
Threshold voltage
- Transistor characteristic
- If the gate-source voltage V_gs is higher than V_th → channel under the gate → current between drain and source
- If V_gs is lower than V_th → (ideally) no current
Subthreshold leakage I_sub
- Leakage between drain and source when V_gs < V_th
- Based on: short channels, diffusion, thermionic emission
[Figure: transistor with source, drain and gate, showing I_sub]

37 Subthreshold Leakage cont'd
[Figure: log(drain current) versus gate voltage for an NMOS transistor, marking 0, V_th' (short-channel device) and V_th, the subthreshold current I_sub, and the region where the transistor is conducting]
Source: Agarwal, 2007

38 Drain Induced Barrier Lowering (DIBL)
- Electrons have to overcome a potential barrier to enter the channel
- Ideal: the potential barrier is controlled only by the gate voltage
[Figure: potential barrier (height of the curve) changed by the gate voltage, for V_gs < V_th and V_gs > V_th]

39 Drain Induced Barrier Lowering cont'd
- In short-channel transistors the potential barrier is also affected by the drain voltage
→ If V_ds = V_DD, transistors can start to conduct even if V_gs < V_th
[Figure: lowering of the potential barrier; short-channel transistor (L < 180 nm) compared with long-channel transistor (L > 2 µm)]

40 Temperature Dependence
- Based on thermionic emission: subthreshold leakage I_sub increases with temperature
[Figure: I_OFF at elevated temperature relative to I_sub at 25 °C; 130 nm → 6x, 70 nm → 16x]
Source: Chatterjee, Intel-labs

41 Gate Oxide Leakage I_gate
Tunneling effect
- An electromagnetic wave strikes a barrier → reflection + intrusion into the barrier
- If the barrier is thin enough → the wave partially penetrates the barrier (electrons tunnel through the barrier)
Gate oxide leakage I_gate
- In nanometer transistors, where T_ox < 2 nm → electrons tunnel through the gate oxide → leakage current

42 Copyright Sill Torres, Gate Oxide Thickness at 45 nm

43 Gate Oxide Leakage cont'd
Components of gate oxide leakage:
- Tunneling currents through the overlap regions (gate-drain I_gdo, gate-source I_gso)
- Tunneling currents into the channel (toward the drain I_gcd, toward the source I_gcs)
- Tunneling currents between gate and bulk (I_gb)

44 Further Leakage Components
- Reverse-bias pn-junction conduction I_pn
- Gate-induced drain leakage I_GIDL
- Drain-source punchthrough I_PT
- Hot carrier injection I_HCI
[Figure: transistor cross-section showing I_HCI, I_PT, I_GIDL and I_pn]

45 Leakage Dependencies
Leakage depends on:
- Gate width (I_sub, I_gate)
- Gate length (I_sub, I_gate)
- Gate oxide thickness (I_gate)
- Threshold voltage (I_sub)
- Temperature (I_sub)
- Input state (I_gate)

46 LOW POWER TECHNIQUES

47 Lowering Dynamic Power
Reducing V_DD has a quadratic effect!
- Has a negative effect on performance, especially as V_DD approaches 2 V_T
Lowering C_L
- Improves performance as well
- Keep transistors at minimum size
Reducing the switching activity, $f_{0 \to 1} = P_{0 \to 1} \cdot f$
- A function of signal statistics and clock rate
- Impacted by logic and architecture design decisions

48 Power & Delay Dependence on V_th
[Figure: power and delay as functions of V_th, without gate leakage. Source: Sakurai, '01]

49 Transistor Sizing for Power Minimization
- Larger sized devices: useful only when interconnects dominate
- Minimum sized devices: usually optimal for low power
[Figure: to keep performance, small W's combine lower capacitance with higher voltage, large W's combine higher capacitance with lower voltage]
Source: Timmernann, 2007

50 Logic Style and Power Consumption
- Voltage increases: power-delay product improves
- Best logic style minimizes power-delay for a given delay constraint
→ A new logic style can reduce power dissipation (if possible / available!)
Source: Jan M. Rabaey

51 Logic Restructuring
- Logic restructuring: changing the topology of a logic network to reduce transitions
- AND: $P_{0 \to 1} = P_0 \cdot P_1 = (1 - P_A P_B) \cdot P_A P_B$
- Chain implementation of a 4-input AND (inputs A, B, C, D, each with probability 0.5): stage activities (1 - 0.25) * 0.25 = 3/16, then 7/64, then 15/256
- Tree implementation: stage activities 3/16 and 3/16, then 15/256 at the output F
→ The chain implementation has a lower overall switching activity than the tree implementation for random inputs
→ BUT: this ignores glitching effects
Source: Jan M. Rabaey
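
The chain and tree activities can be recomputed from the AND formula. The sketch below assumes independent inputs with probability 1/2 and ignores glitching, as the slide does; the helper names are illustrative.

```python
from fractions import Fraction

def and_stage(p_a, p_b):
    """Return (probability that the output is 1, P_0->1) for a 2-input AND."""
    p_one = p_a * p_b
    return p_one, (1 - p_one) * p_one

def chain_activity(input_probs):
    """AND gates in series: ((A*B)*C)*D ..."""
    p_acc = input_probs[0]
    activities = []
    for p in input_probs[1:]:
        p_acc, activity = and_stage(p_acc, p)
        activities.append(activity)
    return activities

def tree_activity(p_a, p_b, p_c, p_d):
    """Balanced tree: (A*B)*(C*D)."""
    p_ab, act_ab = and_stage(p_a, p_b)
    p_cd, act_cd = and_stage(p_c, p_d)
    _, act_out = and_stage(p_ab, p_cd)
    return [act_ab, act_cd, act_out]

half = Fraction(1, 2)
chain = chain_activity([half] * 4)
tree = tree_activity(half, half, half, half)
print(chain, sum(chain))
print(tree, sum(tree))
```

Summed over all three gates, the chain comes to 91/256 and the tree to 111/256, which matches the slide's statement that the chain switches less for random inputs.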

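52 Input Ordering
- Beneficial: postponing the introduction of signals with a high transition rate (signals with signal probability close to 0.5)
- AND: $P_{0 \to 1} = (1 - P_A P_B) \cdot P_A P_B$
- A and B combined first (X = A AND B, then F = X AND C): activity at X = (1 - 0.5 x 0.2) * (0.5 x 0.2) = 0.09
- B and C combined first (X = B AND C, then F = X AND A): activity at X = (1 - 0.2 x 0.1) * (0.2 x 0.1) = 0.0196
Source: Jan M. Rabaey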
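
The two orderings can be compared numerically. This sketch takes the signal probabilities that appear in the slide's expressions (A = 0.5, B = 0.2, C = 0.1) and assumes independent inputs.

```python
def and_transition_probability(p_a, p_b):
    """P_0->1 = (1 - P_A * P_B) * P_A * P_B for a 2-input AND."""
    p_one = p_a * p_b
    return (1 - p_one) * p_one

p = {"A": 0.5, "B": 0.2, "C": 0.1}

# Internal node X when A and B are combined first.
x_ab_first = and_transition_probability(p["A"], p["B"])
# Internal node X when B and C are combined first (A is postponed).
x_bc_first = and_transition_probability(p["B"], p["C"])
print(x_ab_first, x_bc_first)
```

The output F computes A AND B AND C either way, so its activity is identical in both orderings; only the internal node X changes, and postponing the signal closest to 0.5 lowers its activity from about 0.09 to about 0.0196.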

53 Glitching
[Figure: circuit with inputs A, B, C, internal node X and output Z under a unit-delay model, with waveforms for ABC, X and Z]
Source: Jan M. Rabaey

54 Copyright Sill Torres, Example 1: Chain of NAND Cells V DD / 2 Source: Jan M. Rabaey

55 Copyright Sill Torres, Example 2: Adder Circuit V DD / 2 Source: Jan M. Rabaey

56 How to Cope with Glitching?
- Equalize the lengths of timing paths through the design
[Figure: two arrangements of blocks F1, F2, F3 feeding F]
Source: Jan M. Rabaey

57 Clock Gating
Power is reduced by two mechanisms:
- Clock net toggles less frequently, reducing f_eff
- Registers' internal clock buffering switches less often
[Figures: local gating (register with din, dout, en, clk); global gating (FSM, Execution Unit and Memory Control, with clk and enables enF, enE, enM)]
Source: Jan M. Rabaey

58 Clock Gating Insertion
Local clock gating: 3 methods
- Logic synthesizer finds and implements local gating opportunities
- RTL code explicitly specifies clock gating
- Clock gating cell explicitly instantiated in RTL
Global clock gating: 2 methods
- RTL code explicitly specifies clock gating
- Clock gating cell explicitly instantiated in RTL
Source: Jan M. Rabaey

59 Clock Gating VHDL Code
Conventional RTL code:
    -- always clock the register
    process (clk)
    begin
      if rising_edge(clk) then       -- form the flip-flop
        if enable = '1' then
          q <= din;
        end if;
      end if;
    end process;
Low-power clock-gated RTL code:
    -- only clock the register when enable is true
    gclk <= enable and clk;          -- gate the clock
    process (gclk)
    begin
      if rising_edge(gclk) then      -- form the flip-flop
        q <= din;
      end if;
    end process;
Instantiated clock gating cell:
    -- instantiate a clock gating cell from the target library
    I1: clkgx1 port map (en => enable, cp => clk, gclk_out => gclk);
    process (gclk)
    begin
      if rising_edge(gclk) then      -- form the flip-flop
        q <= din;
      end if;
    end process;
Source: Jan M. Rabaey

60 Clock Gating: Example
MPEG4 decoder:
- 90 % of flip-flops clock-gated
- 70 % power reduction by clock gating
[Figures: block diagram with DSP/HIF, DEU, MIF, VDE and 896 Kb SRAM; bar chart of power in mW without and with clock gating]
Source: M. Ohashi, Matsushita, 2002

61 Data Gating
Objective
- Reduce wasted operations → reduce f_eff
Example
- Multiplier whose inputs change every cycle and whose output conditionally feeds an ALU
Low-power version
- Inputs are prevented from rippling through the multiplier if the multiplier output is not selected
Source: Jan M. Rabaey

62 Data Gating Insertion
Two insertion methods
- Logic synthesizer finds and implements data gating opportunities
- RTL code explicitly specifies data gating
- Some opportunities cannot be found by synthesizers
Issues
- Extra logic in the data path slows timing
- Additional area due to gating cells
Source: Jan M. Rabaey

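63 Data Gating Verilog Code: Operand Isolation
Conventional code:
    assign muxout = sel ? A : A*B;              // build mux
Low-power code:
    // N is the operand width (assumed parameter). The multiplier result is
    // used only when sel = 0, so the operands are forced to zero when sel = 1.
    assign multinA = {N{~sel}} & A;             // build AND cells
    assign multinB = {N{~sel}} & B;             // build AND cells
    assign muxout  = sel ? A : multinA*multinB;
[Figure: multiplier X and mux with inputs A, B, sel and output muxout, without and with isolation cells]
Source: Jan M. Rabaey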
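
The effect of operand isolation can be estimated with a small behavioral model. This is a rough sketch and not a power simulation: it assumes 8-bit operands, random data every cycle, a multiplier result that is selected only when sel = 0, and it uses the number of toggling multiplier input bits as a stand-in for switching activity. All names and parameters are hypothetical.

```python
import random

WIDTH = 8  # assumed operand width in bits

def mux_conventional(sel, a, b):
    """muxout = sel ? A : A*B, multiplier inputs always follow A and B."""
    return a if sel else a * b

def mux_isolated(sel, a, b):
    """Same function, but multiplier inputs are forced to 0 when sel = 1."""
    mask = 0 if sel else (1 << WIDTH) - 1
    mult_in_a = a & mask
    mult_in_b = b & mask
    out = a if sel else mult_in_a * mult_in_b
    return out, (mult_in_a, mult_in_b)

def count_toggles(prev, cur):
    """Number of bits that differ between two values."""
    return bin(prev ^ cur).count("1")

def simulate(cycles=10000, p_sel=0.9, seed=0):
    rng = random.Random(seed)
    plain = gated = 0
    prev_plain = (0, 0)
    prev_gated = (0, 0)
    for _ in range(cycles):
        a = rng.randrange(1 << WIDTH)
        b = rng.randrange(1 << WIDTH)
        sel = 1 if rng.random() < p_sel else 0
        out, gated_in = mux_isolated(sel, a, b)
        assert out == mux_conventional(sel, a, b)  # isolation keeps the function
        plain += count_toggles(prev_plain[0], a) + count_toggles(prev_plain[1], b)
        gated += (count_toggles(prev_gated[0], gated_in[0])
                  + count_toggles(prev_gated[1], gated_in[1]))
        prev_plain = (a, b)
        prev_gated = gated_in
    return plain, gated

plain_toggles, gated_toggles = simulate()
print(plain_toggles, gated_toggles)
```

When the multiplier result is rarely selected (here sel = 1 in about 90 % of cycles), the isolated multiplier inputs toggle far less often, while the mux output stays identical to the conventional version.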

64 Influence of Threshold Voltage V_th
Threshold voltage V_th:
- Influence on subthreshold leakage I_sub
- Influence on delay of logic cells
[Figure labels: I_sub, Delay]

65 Influence of Gate Oxide Thickness T_ox
Gate oxide thickness T_ox:
- Influence on gate oxide leakage I_gate
- Influence on delay
[Figure labels: I_gate, Delay]

66 Recap: Data Paths
- Data propagate through different data paths between registers (flip-flops, FF)
- Paths mostly differ in propagation delay
- Frequency of the clock signal (CLK) depends on the path with the longest delay → critical path

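67 Recap: Slack
[Figure: timing diagram for cells G1 and G2 with signals A, B, C and Y over time: all inputs of G1 arrived; G1 ready with evaluation (after the delay of G1); all inputs of G2 arrived; the interval between the last two events is the slack for G1]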

68 Dual-V_th / Dual-T_ox
Two different cell types:
"LVT / LTO" cells
- Cells consist of low-V_th or low-T_ox transistors
- Low threshold voltage or thin gate oxide layer
- For critical paths
- High leakage / short delay
"HVT / HTO" cells
- Cells consist of high-V_th or high-T_ox transistors
- High threshold voltage or thick gate oxide layer
- For uncritical paths
- Low leakage / long delay
→ Leakage reduction at constant performance (no level converter necessary)

69 Copyright Sill Torres, Performance at different Dual-V th Measured at NAND2 BPTM 65nm Technology

70 Copyright Sill Torres, Leakage I sub at different Dual-V th Measured at NAND2 BPTM 65nm Technology

71 Copyright Sill Torres, Dual-V th / Dual-T ox Example Critical Path HVT- and/or HTO-Cells LVT- and/or LTO-Cells

72 Stack Effect
- Transistor stack: at least two transistors of the same type (NMOS or PMOS) in series
- Based on the behavior of internal nodes: the more transistors are non-conducting (off), the lower the leakage
Source: K. Roy

73 Sleep Transistors
- Idea: insertion of additional transistors between the logic block and the supply lines
- These transistors are controlled by a SLEEP signal
- If the circuit has nothing to do, the SLEEP signal is active → stack effect (an additional off transistor in series with the others)
- If the sleep transistors are high-V_th, the approach is also called Multi-Threshold CMOS (MTCMOS)
- Mostly only one transistor is inserted
[Figure: low-V_th logic cells between virtual V_dd and virtual V_ss, connected to V_dd and V_ss through sleep transistors]
Source: Kaijian Shi, Synopsys

74 Sleep Transistors: Realization
- Ring-style sleep transistor implementation
- Sleep transistors are placed around each VVDD island
[Figure: global VDD with VVDD1 and VVDD2 domains]
Source: Kaijian Shi, Synopsys

75 Sleep Transistors: Realization cont'd
- Grid-style sleep transistor implementation
- V_DD network across the chip; VVDD networks in each gating domain
- Sleep transistors are placed in a grid connecting V_DD and the VVDDs
[Figure: global VDD with VVDD1 and VVDD2 networks]
Source: Kaijian Shi, Synopsys

76 Sleep Transistors: Problems
- The sleep transistor can be modeled as a resistor R
- In active mode (cell is working):
  → Current I flows through the sleep transistor
  → Voltage V_x drops over the resistor
  → Output voltage is reduced to V_DD - V_x
- Increased delay (of following blocks)
- Current I is not a leakage current! I is a discharging current of the load capacitances
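
The voltage loss over the sleep transistor follows from Ohm's law. A minimal sketch with hypothetical values (50 Ω on-resistance, 2 mA active current, 1.0 V supply):

```python
def virtual_rail_voltage(v_dd, i_active, r_sleep):
    """Model the sleep transistor as a resistor: V_x = I * R."""
    v_x = i_active * r_sleep
    return v_dd - v_x, v_x

# Hypothetical: 1.0 V supply, 2 mA active current, 50 ohm sleep transistor.
v_out, v_x = virtual_rail_voltage(1.0, 2e-3, 50.0)
print(v_out, v_x)
```

With these assumed numbers the logic sees only about 0.9 V instead of 1.0 V, which is why sleep transistors have to be sized for the active current of the block they gate.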

77 Stackforcing
Simple method of using the stack effect:
- Increasing the stack by splitting transistors
- C_in stays constant
- Only one technology is needed
- Area is (almost) the same
- Drive strength (drain-source current) is reduced → delay goes up

78 Stackforcing cont'd
[Figure: normalized I_sub versus normalized delay, with the "No Stackforcing" reference point]
Source: Narendra, et al., ISLPED01

79 Copyright Sill Torres, Input Vector Control (IVC) Leakage of cell depends on input vector

80 Input Vector Control cont'd
- Every circuit has an input vector with minimum leakage
- Idea: if the design is in passive mode → SLEEP signal becomes active → sleep vector is applied

81 Pin Reordering
- Gate leakage in a stack depends on the input vector
- Logically equivalent input vectors (same number of '0's and '1's) → can result in different leakage
- If the input probability is known → reorder pins so that the most probable state has minimum gate leakage
[Figure data: BPTM, 65 nm technology]

82 Delay and Power versus VDD
- Dynamic power (and leakage) can be traded for delay
[Figure: delay t_d and dynamic power P_dyn as functions of V_DD]

83 Adaptive Dynamic Voltage/Frequency Scaling (DVS/DFS)
- Slow down the processor to fill idle time
- More delay → lower operating voltage
- Runtime scheduler determines the processor speed and selects the appropriate voltage
- Transition delay for frequencies < 150 µs
- Potential to realize 10x energy savings
- E.g.: Intel SpeedStep, AMD PowerNow, Transmeta LongRun
[Figure: alternating Active/Idle phases at 3.3 V compared with one stretched Active phase at 2.4 V]

84 Copyright Sill Torres, 2012 DVS/DFS with Transmeta LongRun Source: Transmeta

85 Multi-VDD
Objective
- Reduce dynamic power by reducing the V_DD^2 term
- Higher supply voltage used for speed-critical logic
- Lower supply voltage used for non-speed-critical logic
Example
- Memory V_DD = 1.2 V
- Logic V_DD = 1.0 V
- Logic dynamic power savings = 30 %
Source: Jan M. Rabaey
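
The 30 % figure follows from the quadratic dependence of dynamic power on V_DD. A quick check, assuming capacitance, activity and frequency stay unchanged when the logic supply drops from 1.2 V to 1.0 V:

```python
def dynamic_power_saving(v_high, v_low):
    """Relative dynamic power saving when V_DD drops from v_high to v_low."""
    return 1.0 - (v_low / v_high) ** 2

saving = dynamic_power_saving(1.2, 1.0)
print(round(100 * saving, 1), "%")
```

The result is about 30.6 %, consistent with the rounded 30 % on the slide.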

86 Multi-VDD Issues
Partitioning
- Which blocks and modules should use which voltages?
- Physical and logical hierarchies should match as much as possible
Voltages
- Voltages should be as low as possible to minimize C V_DD^2 f
- Voltages must be high enough to meet timing specs
Level shifters
- Needed (generally) to buffer signals crossing islands
- Added delays must be considered
Physical design
- Multiple V_DD rails must be considered during floorplanning
Timing verification
- Timing verification must be performed for all corner cases across voltage islands
Source: Jan M. Rabaey

87 Multi-VDD Flow
Determine which blocks run at which V_dd → Multi-voltage synthesis → Determine floor plan → Multi-voltage placement → Clock tree synthesis → Route → Verify timing
Source: Jan M. Rabaey

88 Power-orientated Programming
- Algorithms can differ in power dissipation
[Figure: switched capacitance (nF) for bubble.c, heap.c and quick.c, broken down into functional unit, pipeline registers, register file and others]
Source: Irwin,

