Progress Update: Energy-Performance Characterization of CMOS/MTJ Hybrid Circuits. Fengbo Ren, 05/28/2010.


1 Progress Update: Energy-Performance Characterization of CMOS/MTJ Hybrid Circuits. Fengbo Ren, 05/28/2010

2 Modern MTJ
• Bias voltage/current-controlled variable-resistance device
  – Low: R_P
  – High: R_AP
  – TMR = (R_AP - R_P) / R_P
• Spin-transfer-torque (STT) switching
  – Switching is controlled by the direction of the writing current.
  – The writing current density has to exceed thresholds.

3 Motivations for Hybrid Logic
• Significant application in MRAM design.
• Why logic?
  – CMOS-compatible
    ● Switching current: 200 uA to 2 mA
    ● 90nm transistor: 1 mA/um of gate width
  – Non-volatility, high stability
    ● Introducing the MTJ's non-volatility into CMOS may suppress leakage in active mode and reduce leakage in idle mode to a minimum.
  – 3D stacking
    ● Replacing CMOS with MTJs may increase density.

4 Questions?
• What architecture can best utilize the MTJ's non-volatility to improve energy efficiency?
• Can an MTJ/CMOS hybrid circuit have a better energy-delay trade-off than a CMOS circuit?
• How much leakage power can be saved by introducing MTJs into CMOS?
• Any overhead? How much is the switching power of an MTJ?
• What will be the trend of MTJ/CMOS hybrid circuits with technology scaling?

5 Logic-in-Memory MTJ (LIM-MTJ) Logic Style
• LIM-MTJ
  – Uses differential MTJs in Dynamic Current-mode Logic (DyCML)
    ● Outputs are evaluated based on the resistance difference of the pull-down networks through X-coupled PMOS.
    ● Claimed to have lower dynamic and static power than SCMOS.
Figure: Schematic of LIM-MTJ 1-bit full adder.

6 Energy-Performance Characterization
• LIM-MTJ vs. SCMOS and DyCML
  – LIM-MTJ has no energy-performance advantage compared to the equivalent CMOS implementation.
Figures: Schematic of SCMOS 1-bit full adder. Schematic of DyCML 1-bit full adder.

7 MTJ Switching Energy Analysis
• Switching energy
  – I_W = J_C∙A
    ● J_C is the critical current density.
    ● A is the junction area, A = π∙W∙L = K∙L^2, where L is the junction size.
  – R = δ/A
    ● δ is the resistance-area product, an intrinsic MTJ parameter; δ = 20 Ω∙um^2.
  – t is time.
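The energy expression itself did not survive in this transcript. A sketch consistent with the definitions on this slide, assuming the write energy is the Joule heating of a constant current I_W flowing through resistance R for a pulse of width t (the symbol E_sw is introduced here for illustration):

```latex
E_{sw} = I_W^{2}\, R\, t
       = (J_C A)^{2}\,\frac{\delta}{A}\, t
       = J_C^{2}\,\delta\, A\, t
       = J_C^{2}\,\delta\, K L^{2}\, t
```

Under this assumption the energy scales with the square of the critical current density, linearly with the resistance-area product, and with the square of the junction size.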

8 MTJ Switching Energy Analysis
• J_C is a function of the current pulse width.
  – Switching time is a function of current density.
    ● Δ is the thermal stability factor (Δ ≥ 40).
    ● t_0 is the intrinsic switching time, t_0 = 1 ns.
    ● J_C0 is the intrinsic critical current density, J_C0 = J_C at t = t_0.
  – Modern MTJs have been shown to have J_C0 = 2-7 MA/cm^2.

9 MTJ Switching Energy Analysis
• Switching energy
  – Function of switching time (t), given J_C0, δ, L, Δ
  – Reference MTJ
    ● J_C0 = 5 MA/cm^2, δ = 20 Ω∙um^2, L = 135 nm (W = 65 nm)
    ● R_P = 725 Ω, I_C = 1.4 mA at t = 1 ns
• Switching energy > 1 pJ
  – CMOS/MTJ hybrid logic circuits that require frequent MTJ switching are hardly energy efficient.
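The reference-MTJ numbers can be cross-checked with a few lines of Python. This is a minimal sketch, not the author's script: it uses the slide's area definition A = π∙W∙L and assumes the energy is estimated as I^2∙R∙t at t = t_0 = 1 ns. The function name and argument names are hypothetical.

```python
import math

def reference_mtj(jc0_ma_cm2=5.0, ra_ohm_um2=20.0, l_nm=135.0, w_nm=65.0, t_ns=1.0):
    """Return (R_P in ohm, I_W in A, E in J) for one write pulse."""
    # Junction area as defined on slide 7: A = pi * W * L, in um^2.
    area_um2 = math.pi * (w_nm * 1e-3) * (l_nm * 1e-3)
    # Unit conversions: 1 um^2 = 1e-8 cm^2 and 1 MA/cm^2 = 1e6 A/cm^2.
    i_w = jc0_ma_cm2 * 1e6 * area_um2 * 1e-8
    # Parallel-state resistance from the resistance-area product.
    r_p = ra_ohm_um2 / area_um2
    # Assumed Joule-heating estimate of the write energy: E = I^2 * R * t.
    energy = i_w ** 2 * r_p * (t_ns * 1e-9)
    return r_p, i_w, energy

r_p, i_w, energy = reference_mtj()
print(f"R_P = {r_p:.0f} ohm, I_W = {i_w * 1e3:.2f} mA, E = {energy * 1e12:.2f} pJ")
```

With the slide's parameters this gives roughly 725 Ω, 1.4 mA and about 1.4 pJ, in line with the quoted R_P, I_C and the "> 1 pJ" figure.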

10
• Switching energy with scaling
  – δ, L, J_C0
• fJ switching
  – δ ≤ 5 Ω∙um^2, J_C0 ≤ 0.6 MA/cm^2, and L ≤ 33 nm

11 LUT-based Logic
• Store the truth table in memory.
• Read out the logic value based on input selection.
  – Reconfigurable
  – Can implement any type of logic, e.g. FPGA
• Replace the storage cells with MTJs.
  – No MTJ switching during logic operation; the MTJs only need to be configured once.
  – Non-volatile, minimum standby power.
  – Instant boot-up.
Figure: Example of a 3-input LUT.
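The LUT idea above can be sketched in Python. This is an illustrative behavioral model only: each list entry stands in for one storage cell (an EDFF or an MTJ), and the helper names `make_lut`, `read_lut` and `full_adder` are hypothetical. It uses the same configuration the deck evaluates later, two 3-input LUTs forming a 1-bit full adder.

```python
def make_lut(func, n_inputs=3):
    """Configure a LUT: store one output bit per input combination."""
    table = []
    for index in range(2 ** n_inputs):
        # Unpack the index into input bits, most significant bit first.
        bits = [(index >> k) & 1 for k in reversed(range(n_inputs))]
        table.append(func(*bits))
    return table

def read_lut(table, *inputs):
    """Evaluate a LUT: the inputs only select which stored bit is read."""
    index = 0
    for bit in inputs:
        index = (index << 1) | bit
    return table[index]

# Two 3-input LUTs (8 stored bits each, 16 in total) form a 1-bit full adder.
SUM_LUT = make_lut(lambda a, b, cin: a ^ b ^ cin)
CARRY_LUT = make_lut(lambda a, b, cin: (a & b) | (cin & (a ^ b)))

def full_adder(a, b, cin):
    return read_lut(SUM_LUT, a, b, cin), read_lut(CARRY_LUT, a, b, cin)

print(full_adder(1, 1, 0))
```

The tables are written once in `make_lut` and only read afterwards, which mirrors the point that no storage cell needs to switch during logic operation.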

12 MTJ Reading Circuit
• Conventional current-mirror sense amplifier based reading circuit (SA)
  – Slow (2 stages)
  – Power hungry (DC current)
Figure labels: ∆V, VIP, VIN.

13 MTJ Reading Circuit
• X-coupled inverter based reading circuit (XSA)
  – Fast
    ● ∆V is generated and amplified at the same time.
  – Power efficient
    ● No DC current; only charging and discharging of capacitance.
Figure annotations: ∆V at evaluation phase; 1 MTJ and 1 R_ref accessed per read; amplified by X-coupled inverter.

14 Energy Performance Comparison

15 Instant Power

16 1 Bit Full Adder (CMOS_LUT)
• Transistor count
  – 16x EDFF
  – 4x MUX4
  – 2x MUX2
  – 672 transistors

17 1 Bit Full Adder (MTJ_LUT1)
• Transistor count
  – 16x READ1XMTJ
  – 4x MUX4
  – 2x MUX2
  – 2x WRTCKT
  – 448 transistors
  – 33% reduction
  – 16 MTJs

18 READ1XMTJ
• 15T + 1 MTJ
• Needs a writing circuit

19 1 Bit Full Adder (MTJ_LUT2)
• Transistor count
  – 2x READ8XMTJ
  – 1x 9-WORD DECODER
  – 2x MUX2
  – 1x INV
  – 1x WRTCKT
  – 174 transistors
  – 76% reduction
  – 16 MTJs

20 READ8XMTJ
• MTJs share the reading circuit.
• 1 MTJ + 1 R_ref are accessed per read.
• 1 MTJ is accessed per write.
• 23T + 8 MTJ

21 Simulation Setup
• Three LUT architectures are compared.
  – CMOS-LUT
  – MTJ-LUT1: MTJ reading circuit + MUX
  – MTJ-LUT2: Shared MTJ reading circuit + decoder
• Configured to implement a 1-bit full adder
  – 2 3-input LUTs
• ASU predictive technology model (PTM)
  – 90nm, 65nm (bulk)
  – 45nm, 32nm (SOI)
• MTJ characteristics
  – R_P = 700 Ω, R_AP = 1400 Ω, TMR = 100%, I_C(AP→P) = 223 uA, I_C(P→AP) = 500 uA
  – Verilog-A MTJ model from Richard.

22 Configuration Power
• CMOS-LUT
  – 1 GHz
• MTJ-LUT
  – 250 MHz
  – 750 uA writing current
  – About 3 ns writing time per MTJ
• MTJ-based LUTs have 10x higher configuration power.
  – Due to the switching energy of the 16 MTJs
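A rough feel for the configuration cost can be had from the writing current and time quoted here. The sketch below is an assumption-laden estimate, not a reproduction of the simulation: it treats each write as Joule heating (I^2∙R∙t) and brackets the MTJ resistance between the Rp = 700 and Rap = 1400 values from the simulation setup, taken to be in ohms. The function name is hypothetical.

```python
def write_energy_joules(i_write_a, r_ohm, t_write_s):
    """Assumed Joule-heating estimate of one MTJ write: E = I^2 * R * t."""
    return i_write_a ** 2 * r_ohm * t_write_s

I_WRITE = 750e-6   # writing current from the slide, in A
T_WRITE = 3e-9     # approximate writing time per MTJ from the slide, in s
R_LOW = 700.0      # Rp from the simulation setup, assumed to be in ohm
R_HIGH = 1400.0    # Rap from the simulation setup, assumed to be in ohm
N_MTJ = 16         # MTJs in the two 3-input LUTs

e_low = write_energy_joules(I_WRITE, R_LOW, T_WRITE)
e_high = write_energy_joules(I_WRITE, R_HIGH, T_WRITE)
print(f"per MTJ: {e_low * 1e12:.2f} to {e_high * 1e12:.2f} pJ")
print(f"16 MTJs: {N_MTJ * e_low * 1e12:.1f} to {N_MTJ * e_high * 1e12:.1f} pJ")
```

This gives on the order of 1 to 2.4 pJ per MTJ and roughly 19 to 38 pJ to configure all 16, which sits in the same range as the "several pJ" writing energy mentioned in the conclusions.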

23 Delay
• MTJ-based LUT2 has a 2.5x longer delay.

24 Leakage Power
• MTJ-LUT1 has slightly higher leakage power.
• MTJ-LUT2 has about 5x lower total leakage power:
  – 10x lower storage leakage (due to MTJ)
  – 2x lower logic leakage (from MUX to decoder)

25 Energy (Operation Frequency: 100 MHz)
• LUT2
  – 4x total energy saving at 32nm
    ● 1/10 leakage_storage, ½ leakage_logic, bigger dynamic_logic
    ● Dynamic_storage overhead decreases as technology scales down.

26 Energy (Operation Frequency: 250 MHz)
• LUT2
  – 3x total energy saving at 32nm
    ● 1/10 leakage_storage, ½ leakage_logic, ½ dynamic_logic
    ● Dynamic_storage overhead decreases as technology scales down.

27 Energy (Operation Frequency: 500 MHz)
• LUT2
  – 2x total energy saving at 32nm
    ● 1/10 leakage_storage, ½ leakage_logic, ½ dynamic_logic
    ● Dynamic_storage overhead decreases as technology scales down.
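One way to read the three frequency points is a first-order model of energy per operation. This is a sketch that is not taken from the slides; E_dyn (dynamic energy per operation), P_leak (leakage power) and f (operating frequency) are symbols assumed here for illustration:

```latex
E_{op}(f) \approx E_{dyn} + \frac{P_{leak}}{f}
```

Under this model the leakage term shrinks as f rises, so the leakage advantage of MTJ-LUT2 accounts for a smaller share of the total, which is consistent with the reported saving falling from 4x at 100 MHz to 2x at 500 MHz.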

28 Standby Power
• Dynamic sleep transistor
  – 50 mV voltage drop across the sleep transistor
• 5-20x reduction

Standby power (uW) by technology node:

Structure   90nm    65nm    45nm    32nm
CMOS-LUT    6.5     12.8    3.32    9.9
MTJ-LUT1    1.66    1.79    0.469   1.04
MTJ-LUT2    0.836   0.625   0.202   0.227

29 Conclusions
• What architecture can best utilize the MTJ's non-volatility to improve energy efficiency?
  – LUT-based logic, which requires no MTJ switching.
• Can an MTJ/CMOS hybrid circuit have a better energy-delay trade-off than a CMOS circuit?
  – Yes.
• How much leakage power can be saved by introducing MTJs into CMOS?
  – About 10x reduction.
• Any overhead? How much is the switching power of an MTJ?
  – Yes. MTJ reading energy is overhead. The writing energy of a modern MTJ is around several pJ.
• What will be the trend of MTJ/CMOS hybrid circuits with technology scaling?
  – They will play a significant role in suppressing leakage below 45 nm.

