Energy Reduction for STT-RAM Using Early Write Termination Ping Zhou, Bo Zhao, Jun Yang, *Youtao Zhang Electrical and Computer Engineering Department *Department.


1 Energy Reduction for STT-RAM Using Early Write Termination
Ping Zhou, Bo Zhao, Jun Yang, *Youtao Zhang
Electrical and Computer Engineering Department, *Department of Computer Science, University of Pittsburgh
ICCAD 2009

2 Introduction
Traditional SRAM cache
– Limited by density, leakage and scalability
STT-RAM cache?
– High density (~4x that of SRAM)
– High speed (same read speed as SRAM)
– Non-volatile
– No write endurance problem

3 STT-RAM: Cell
Magnetic Tunnel Junction (MTJ)
Relative magnetization direction
– Different resistances → Logic 0 or 1
Write: spin-polarized current
– Much less write current than conventional MRAM
[Figure labels: Reference Layer, MgO, Free Layer; High Resistance (Logic 1), Low Resistance (Logic 0)]

4 STT-RAM: Cell Array
Similar array structure to SRAM
Bidirectional write current
[Figure labels: write 0, write 1; MTJ, BL, SL, WL]

5 STT-RAM Cache: Challenge
High dynamic energy
– 6~14x more energy per write access [Dong et al. DAC 2008, Sun et al. HPCA 2009]
– Writes contribute >74% of total dynamic energy
[Chart label: 74.2%]
Need to reduce write energy in STT-RAM cache!

6 Opportunity
Many bits are unchanged in a write access
– Redundant bit-writes [Zhou et al. ISCA 2009]
[Chart: redundant bit-writes in 16MB STT-RAM cache; label: 88%]
How to exploit this opportunity?
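A redundant bit-write is one where the cell already holds the value being written. As a minimal sketch of how that fraction could be counted for a single write, assuming the old and new contents are available as integers (the function name and the `width` parameter are hypothetical, not taken from the slides):

```python
def redundant_bit_fraction(old: int, new: int, width: int = 64) -> float:
    """Fraction of bit positions whose stored value already equals the new value."""
    mask = (1 << width) - 1
    # XOR marks the positions that actually change.
    changed = bin((old ^ new) & mask).count("1")
    return (width - changed) / width


# Only bit 1 differs between 0b1010 and 0b1000, so 3 of 4 bit-writes are redundant.
print(redundant_bit_fraction(0b1010, 0b1000, width=4))
```

Averaging this quantity over every write in a trace would give a figure comparable to the one on the chart, though the slides do not say exactly how theirs was measured.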

7 Exploiting Redundant Bit-Writes
Need to know the old value…
Read & compare before write [Zhou et al. ISCA 2009]
Can we do better?

8 Observation
MTJ resistance changes abruptly at the end of the write cycle
– Cell still holds the old value at the early stage of the write cycle
Read is much faster than write
[Figure source: Y. Chen et al. ISQED 2008]
Possible to sense the old value at the early stage of the write cycle

9 Early Write Termination: Idea
On a write access…
– Start the write cycle as normal
– Sense the old value at an early stage
– Terminate the write cycle if the old value is the same as the new value
Does not require a preceding read & compare!
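The three steps above can be sketched as a purely behavioral model. This is not the circuit from the talk: it assumes termination is decided independently for each bit, and the names `WriteResult` and `ewt_write` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class WriteResult:
    stored: int      # value held by the cells after the write
    terminated: int  # bit-writes cut short (old value == new value)
    completed: int   # bit-writes that ran the full write pulse


def ewt_write(old: int, new: int, width: int = 8) -> WriteResult:
    """Behavioral model of a write with early termination, one decision per bit."""
    stored = 0
    terminated = 0
    completed = 0
    for i in range(width):
        old_bit = (old >> i) & 1
        new_bit = (new >> i) & 1
        # Early in the write pulse the cell still holds old_bit, so it can be sensed.
        if old_bit == new_bit:
            terminated += 1   # stop the pulse; the cell keeps its value
            bit = old_bit
        else:
            completed += 1    # let the pulse finish so the cell flips
            bit = new_bit
        stored |= bit << i
    return WriteResult(stored, terminated, completed)


result = ewt_write(0b1010, 0b1000, width=4)
print(result)
```

In this model the stored value always ends up equal to the new value, and no separate read pass appears before the loop, which is the point of contrast with read-and-compare-before-write.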

10 EWT Circuit
Conversion circuit
– Basic differential amplifier
– Input lower → Output higher
– Input higher → Output lower
[Figure labels: MTJ, pass, write 0, write 1, conversion, Vin0, Vin1, Vsense0, Vsense1, Vref0, Vref1, Rwire, Sense-Amp, New value, Terminate?, SL, BL, WL]

11 How Does EWT Work?
Example: write 0
Old Value | New Value | Vsense0 | SA output | Action
0         | 0         | higher  | 1         | Terminate
1         | 0         | lower   | 0         | Continue
[Figure labels: MTJ, pass, write 0, conversion, Vin0 (lower / higher), Vin1, Vsense0, Vsense1, Rwire, SL, BL, WL; timing label: 0.536ns]

12 Advantages of EWT
No performance penalty!
– Carried out within a write cycle
– No need to read & compare before a write
– Write access may finish early → Slight speedup
Low energy overhead (3.23%)
Low complexity
Easy to integrate with existing designs

13 MODELING STT-RAM AND EWT

14 Latency Modeling
Cell
– Derived from recent works [Dong et al. DAC 2008]
Peripheral
– Derived from CACTI [Thoziyoor et al. ISCA 2008, Dong et al. DAC 2008]

15 Dynamic Energy Modeling
Baseline: derived from recent works [Dong et al. DAC 2008]
EWT
– Read energy: same as baseline
– Write energy: variable
Write energy components:
– Peripheral (derived from CACTI)
– Extra energy introduced by EWT circuits (HSPICE)
– Cells: N_changed × E_changed + N_unchanged × E_unchanged, where E_changed is the energy of a cell change and E_unchanged is the energy of a terminated cell change
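The write-energy model can be sketched as a small function. Summing the peripheral term, the EWT circuit term and the two cell terms is a reading of the slide's labels rather than a formula stated on it, and every numeric value below is a placeholder in arbitrary units, not data from the talk.

```python
def ewt_write_energy(
    n_changed: int,
    n_unchanged: int,
    e_changed: float,
    e_unchanged: float,
    e_peripheral: float = 0.0,
    e_ewt_circuit: float = 0.0,
) -> float:
    """Energy of one write access under the assumed additive model."""
    cell_energy = n_changed * e_changed + n_unchanged * e_unchanged
    return e_peripheral + e_ewt_circuit + cell_energy


# Placeholder example: a 512-bit line where 64 bits change and 448 are terminated early.
# e_changed = 1.0 and e_unchanged = 0.25 are made-up values in arbitrary units.
with_ewt = ewt_write_energy(64, 448, e_changed=1.0, e_unchanged=0.25)
without_ewt = ewt_write_energy(512, 0, e_changed=1.0, e_unchanged=0.25)
print(with_ewt, without_ewt)
```

The model makes the dependence explicit: the saving grows with the share of unchanged bits and with the gap between E_changed and E_unchanged.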

16 Leakage Energy Modeling
STT-RAM is non-volatile
– Power-gate the idle banks
– Assume 1ns delay to “wake up”
– Used in both baseline and EWT

17 Experimental Setup
Simics-based simulator
– 4-core CMP, 1GHz
– 32KB private L1 cache
– 16MB shared L2 cache using STT-RAM, 16 banks
– 4GB main memory
– Enhanced cache model: STT-RAM & EWT

18 Results: Performance
[Chart: normalized cycles per instruction (CPI); label: 1% speedup]
Slight performance improvement

19 Results: Write Energy
[Chart: normalized write energy; label: 70% saving]
Up to 80% write energy reduction

20 Results: Dynamic Energy
[Chart: normalized dynamic energy, Base vs. EWT; label: 52% reduction]

21 Results: Total Energy
[Chart: normalized total energy; label: 33% reduction]

22 Results: Energy-Delay Product
[Chart: normalized ED²; label: 34% reduction]
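ED² is energy multiplied by the square of delay. As a rough consistency check, the sketch below combines the headline figures from the earlier result slides, reading "1% speedup" as a normalized delay of 0.99 and "33% reduction" as a normalized total energy of 0.67. Both readings are assumptions, and averages over benchmarks need not compose this way.

```python
def ed2(energy: float, delay: float) -> float:
    """Energy-delay-squared product."""
    return energy * delay ** 2


normalized_energy = 1 - 0.33  # assumed from the "33% reduction" label
normalized_delay = 1 - 0.01   # assumed from the "1% speedup" label

normalized_ed2 = ed2(normalized_energy, normalized_delay)
reduction = 1 - normalized_ed2
print(f"normalized ED^2 = {normalized_ed2:.3f}, reduction = {reduction:.1%}")
```

Under those assumptions the reduction comes out at roughly 34%, in line with the chart label.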

23 Conclusion
Address a key challenge to STT-RAM cache: dynamic energy
EWT: Exploit redundant bit-writes without performance penalty
– Low overhead and complexity
Modeling and evaluation
– Up to 80% write energy reduction
– 34% ED² reduction

24 THANK YOU!

