Dong Hyuk Woo Nak Hee Seong Hsien-Hsin S. Lee


1 Heterogeneous Die Stacking of SRAM Row Cache and 3-D DRAM: An Empirical Design Evaluation
Dong Hyuk Woo, Nak Hee Seong, Hsien-Hsin S. Lee
Electrical and Computer Engineering, Georgia Tech
Intel Labs
The 54th IEEE International Midwest Symposium on Circuits and Systems

2 Modern DRAM Design Challenges
Scaling challenge: less capacity, higher leakage, increasing manufacturing cost
Energy efficiency pressure: smart phone / tablet, cloud / exa-scale computing

3 Future Solutions
Heterogeneous stacking [Kawano et al., IEDM 2006]: dedicating a logic layer for I/O circuit; better performance, lower energy consumption
Homogeneous stacking [Kang et al., JSSC 2010]: increasing density without scaling the device

4 New Opportunity for Processor Architects
SPACE AVAILABLE (404)

5 SRAM Row Cache?

6 Motivation
"An Optimized 3D-Stacked Memory Architecture by Exploiting Excessive, High-Density TSV Bandwidth," D. H. Woo, N. H. Seong, D. L. Lewis, and H.-H. S. Lee, IEEE International Symposium on High-Performance Computer Architecture (HPCA-16), 2010.

7 Row Buffer Conflicts in a Multi-core
[Figure: address stream served by a conventional 3D DRAM with one row buffer entry per bank; labels: 3 row misses, 3 row hits, one cache line]
Row buffer hit rate ~ 50%
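The row-buffer behavior on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the authors' simulator: the `(bank, row)` stream below is hypothetical (the addresses in the transcript are not recoverable) and was chosen only so that it reproduces the 3 misses / 3 hits shown on the slide.

```python
def row_buffer_hits(stream):
    """Count hits and misses when each bank keeps exactly one open row."""
    open_row = {}            # bank id -> row currently held in that bank's row buffer
    hits = misses = 0
    for bank, row in stream:
        if open_row.get(bank) == row:
            hits += 1        # requested row is already in the row buffer
        else:
            misses += 1      # conflict: the open row must be replaced
            open_row[bank] = row
    return hits, misses


# Hypothetical stream: two cores alternating between rows 0xA and 0xB of bank 0.
stream = [(0, 0xA), (0, 0xA), (0, 0xB), (0, 0xB), (0, 0xA), (0, 0xA)]
hits, misses = row_buffer_hits(stream)
print(hits, misses, hits / len(stream))
```

With a single entry per bank, the return to row 0xA at the end of the stream misses again even though that row was opened moments earlier.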

8 Eliminating Redundant Array Lookup
[Figure: address stream served by a heterogeneous SRAM row cache + 3D DRAM; labels: row cache hits, 2 row misses, 4 cache hits, one row cache line]
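A rough sketch of why a row cache removes the redundant array lookups: once more than one row can be kept per bank, a row that was evicted from a single-entry row buffer stays available. The model below is an assumption-laden illustration (a tiny LRU cache for one bank, a hypothetical `RowCache` class, and a made-up row stream), not the design evaluated in the talk.

```python
from collections import OrderedDict


class RowCache:
    """Hypothetical LRU row cache for a single DRAM bank."""

    def __init__(self, entries):
        self.entries = entries          # number of rows the cache can hold
        self.rows = OrderedDict()       # row id -> True, ordered oldest to newest
        self.hits = 0
        self.misses = 0

    def access(self, row):
        if row in self.rows:
            self.rows.move_to_end(row)  # mark as most recently used
            self.hits += 1
            return True
        self.misses += 1                # DRAM array lookup needed
        self.rows[row] = True
        if len(self.rows) > self.entries:
            self.rows.popitem(last=False)  # evict the least recently used row
        return False


# Hypothetical stream of row ids, all targeting the same bank.
stream = [0xA, 0xA, 0xB, 0xB, 0xA, 0xA]
cache = RowCache(entries=2)
for row in stream:
    cache.access(row)
print(cache.hits, cache.misses)
```

On this stream the two-entry cache takes only the two compulsory misses, which matches the 2 misses / 4 hits labeled on the slide.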

9 SRAM Row Cache Stacking
Large set-associative SRAM row cache
Eliminating redundant DRAM look-ups caused by conflict misses in row buffers
High bandwidth, low energy communication through TSVs

10 Conventional DRAM Bank Structure
2-D transfer is still energy hungry!
Large area overhead of TSVs
[Figure: conventional DRAM bank structure with TSVs, one bank per die; not drawn to scale]

11 Folded, Scalable DRAM Bank Structure
Short transfer of large data (a row)
Long transfer of small data (a cache line)
[Figure: folded bank structure with 64x64 TSVs and 64x16 TSVs]
One half-row = 4Kb (64x64)
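The TSV figures on this slide can be checked with simple arithmetic. The sketch below assumes each TSV carries one bit per transfer, which is what the "One half-row = 4Kb (64x64)" label suggests but does not state outright; the function and variable names are illustrative only.

```python
def tsv_count(rows, cols):
    """Number of TSVs in a rows x cols array."""
    return rows * cols


half_row_tsvs = tsv_count(64, 64)   # the 64x64 array on the slide
small_tsvs = tsv_count(64, 16)      # the 64x16 array on the slide

# Assumption: one bit per TSV per transfer, so the 64x64 array moves a half-row at once.
half_row_bits = half_row_tsvs
full_row_bits = 2 * half_row_bits

print(half_row_tsvs, small_tsvs)    # TSV counts
print(half_row_bits // 1024, "Kb per half-row")
print(full_row_bits // 8, "bytes per full row")
```

Under that assumption a full row is 8Kb (1024 bytes), and the 64x16 array is a quarter of the width of the 64x64 array.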

12 Final Design: SRAM Row Cache + 3-D DRAM
One SRAM cache bank per DRAM bank

13 Performance Results
[Figures: overall speedup; row hit rate]

14 Energy Results
[Figure: relative DRAM lookup energy]

15 Energy Breakdown

16 Conclusion
3-D stacking → new light for architects
SRAM row cache for 3-D DRAM
Folded DRAM bank design
Optimize 2-D traffic
Significant energy savings

17 That’s All, Folks!
Georgia Tech ECE MARS Lab

18 BACKUP FOILS

19 Simulation Results

