Manjunath Shevgoor, Rajeev Balasubramonian, University of Utah


1 Addressing Service Interruptions in Memory with Thread-to-Rank Assignment
Manjunath Shevgoor, Rajeev Balasubramonian, University of Utah
Niladrish Chatterjee, NVIDIA
Jung-Sik Kim, Samsung Electronics
ISPASS 2016
4/18/2016 Addressing Service Interruptions in Memory with Thread to Rank Assignment

2 DRAM Refresh: Quick Recap
DRAM cell leaks through the access transistor
Leakage increases with temperature
DRAM cell must be refreshed every 64 ms
1/8K of the DRAM rank is refreshed every 7.8 µs
[Figure: DRAM cell with bit line, word line, and leakage path; the cell leaks more with temperature]

3 Refresh Timing Parameters
tREFI: 7.8 µs or 3.9 µs
tRFC: 640 ns (32 Gb)
[Timing diagram: one refresh every tREFI, each lasting tRFC; labels tRefresh and tRecovery]
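The numbers on slides 2 and 3 can be checked with a few lines of arithmetic. The sketch below derives the refresh interval from the 64 ms window and the 8K refresh commands, then estimates the fraction of time a rank is unavailable for tRFC = 640 ns. The function and constant names are illustrative, and treating the shorter 3.9 µs interval as exactly half the base interval is an assumption.

```python
# Values taken from the slides: 64 ms refresh window, 8K refresh commands
# per window, tRFC = 640 ns for a 32 Gb device.
REFRESH_WINDOW_MS = 64
REFRESH_COMMANDS = 8 * 1024
TRFC_NS = 640


def refresh_interval_ns(window_ms=REFRESH_WINDOW_MS, commands=REFRESH_COMMANDS):
    """Average spacing between refresh commands (tREFI), in nanoseconds."""
    return window_ms * 1_000_000 / commands


def refresh_busy_fraction(trfc_ns, trefi_ns):
    """Fraction of time a rank spends refreshing and cannot serve requests."""
    return trfc_ns / trefi_ns


trefi_ns = refresh_interval_ns()                              # 7812.5 ns, about 7.8 us
busy = refresh_busy_fraction(TRFC_NS, trefi_ns)               # 0.08192, about 8%
# Assumption: the 3.9 us case is modelled as half the base interval.
busy_short = refresh_busy_fraction(TRFC_NS, trefi_ns / 2)     # 0.16384, about 16%

print(f"tREFI = {trefi_ns:.1f} ns, busy = {busy:.1%}, busy at half interval = {busy_short:.1%}")
```

Under these assumptions a rank is blocked roughly 8% of the time at the 7.8 µs interval and roughly 16% at the 3.9 µs interval.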

4 tRFC Projections

5 Refresh Power in DRAM
Command    Current (mA)
Act        67
Read       125
Write
Refresh    245
Refresh determines memory peak power

6 Stagger refresh to reduce peak power
[Figure: 8-core CMP with two memory controllers (MC), Channel 1 and Channel 2, and Ranks 1 to 4]
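A minimal sketch of why staggering lowers peak power: if every rank starts its refresh at the same instant, all refresh currents add up, whereas spreading the start times across the refresh interval keeps the windows from overlapping. The even spacing of tREFI divided by the number of ranks, the function names, and the use of 245 mA as a per-rank refresh current are assumptions made for illustration, not the schedule used in the talk.

```python
TREFI_NS = 7812.5          # 64 ms / 8K refresh commands
TRFC_NS = 640.0            # refresh duration quoted for a 32 Gb device
REFRESH_CURRENT_MA = 245   # refresh current from the power table (assumed per rank)


def refresh_starts(num_ranks, staggered):
    """Start time of each rank's refresh within one tREFI period."""
    step = TREFI_NS / num_ranks if staggered else 0.0
    return [rank * step for rank in range(num_ranks)]


def max_concurrent_refreshes(starts, trfc_ns=TRFC_NS, trefi_ns=TREFI_NS):
    """Largest number of ranks refreshing at the same time.

    The schedule repeats every tREFI, and the maximum overlap of a set of
    intervals always occurs at the instant one of them begins, so only
    those instants are checked.
    """
    peak = 0
    for t in starts:
        active = sum(1 for s in starts if ((t - s) % trefi_ns) < trfc_ns)
        peak = max(peak, active)
    return peak


simultaneous_peak = max_concurrent_refreshes(refresh_starts(4, staggered=False))
staggered_peak = max_concurrent_refreshes(refresh_starts(4, staggered=True))

print("simultaneous:", simultaneous_peak * REFRESH_CURRENT_MA, "mA of refresh current at peak")
print("staggered:   ", staggered_peak * REFRESH_CURRENT_MA, "mA of refresh current at peak")
```

The sketch only counts refresh current; it ignores the activate, read, and write current drawn by ranks that are not refreshing.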

7 Effect of Staggered Refresh

8 Talk Outline
DRAM refresh background
Goal: Low peak power of staggered refresh, performance of simultaneous refresh
Analyzing stalls from refresh
Solution: Thread-to-rank assignment
Results

9 Each Staggered Refresh stalls many cores
[Figure: 8-core CMP with two memory controllers (MC), Channel 1 and Channel 2, and Ranks 1 to 4; stalled threads are marked]
Thread    Rank
T1        R2
T2        R3
T1        R1
T2        R2
T2        R1
T3        R1
Rank 1 Refreshing => 3 Threads Stalled
Rank 3 Refreshing => 3 Threads Stalled
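The stall count on this slide can be reproduced with a small sketch. It assumes that a thread stalls as soon as it has at least one outstanding request to the rank being refreshed, which is a simplification. The first request list uses the thread/rank pairs shown on the slide; the second list is a hypothetical case in which each thread touches only one rank.

```python
def stalled_threads(pending, refreshing_rank):
    """Threads with at least one pending request to the refreshing rank."""
    return {thread for thread, rank in pending if rank == refreshing_rank}


# Thread/rank pairs as listed on the slide: each thread's data is spread
# over several ranks.
spread = [("T1", "R2"), ("T2", "R3"), ("T1", "R1"),
          ("T2", "R2"), ("T2", "R1"), ("T3", "R1")]

# Hypothetical alternative: the same six requests, but each thread is
# confined to a single rank.
confined = [("T1", "R1"), ("T1", "R1"), ("T2", "R2"),
            ("T2", "R2"), ("T2", "R2"), ("T3", "R3")]

print(sorted(stalled_threads(spread, "R1")))    # three threads wait on Rank 1
print(sorted(stalled_threads(confined, "R1")))  # only the thread that owns Rank 1 waits
```

With the spread placement, a refresh of Rank 1 blocks T1, T2, and T3; with the confined placement it blocks only T1.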

10 Limit the Spread: Address Mapping

11 % Refreshes Affecting a Thread
Highest Performance Loss

12 37% increase in Execution Time
Highest Performance Loss

13 Rank Assigned Page Mapping
[Figure: Threads 1 to 8 on an 8-core CMP with two memory controllers (MC), Channel 1 and Channel 2, and Ranks 1 to 4]
Strict mapping of threads to ranks, e.g., as used for cache partitioning by Lin et al., HPCA 2008

14 Limit the Spread: Page Mapping
[Figure: Threads 1 to 8 on an 8-core CMP with two memory controllers (MC), Channel 1 and Channel 2, and Ranks 1 to 4]
Relaxed mapping of threads to ranks
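The difference between the strict and relaxed mappings can be sketched as a page-frame allocator. This is a minimal illustration, not the OS mechanism from the talk: the class name, the `thread_id % num_ranks` home-rank rule, and the "most free frames" fallback are all assumptions.

```python
class RankAwareAllocator:
    """Toy frame allocator that prefers a thread's home rank (hypothetical)."""

    def __init__(self, free_frames_per_rank, num_ranks):
        self.num_ranks = num_ranks
        self.free = {rank: free_frames_per_rank for rank in range(num_ranks)}

    def home_rank(self, thread_id):
        # Assumption: threads are spread over ranks round-robin, so with
        # 8 threads and 4 ranks each rank is shared by two threads.
        return thread_id % self.num_ranks

    def allocate(self, thread_id, strict):
        """Return the rank that supplies the frame, or None if none is available."""
        home = self.home_rank(thread_id)
        if self.free[home] > 0:
            self.free[home] -= 1
            return home
        if strict:
            # Strict mapping: never leave the home rank; the caller would
            # have to evict one of the pages already in that rank.
            return None
        # Relaxed (best-effort) mapping: spill to the rank with the most
        # free frames when the home rank is full.
        fallback = max(self.free, key=self.free.get)
        if self.free[fallback] == 0:
            return None
        self.free[fallback] -= 1
        return fallback


allocator = RankAwareAllocator(free_frames_per_rank=1, num_ranks=4)
first = allocator.allocate(thread_id=0, strict=True)     # home rank 0 has space
blocked = allocator.allocate(thread_id=4, strict=True)   # rank 0 is now full
spilled = allocator.allocate(thread_id=4, strict=False)  # relaxed mapping spills
print(first, blocked, spilled)
```

The relaxed variant trades some refresh isolation for the ability to use free memory in other ranks, which is the best-effort behaviour the conclusions slide refers to.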

15 Modified Clock Algorithm
[Figure: Baseline: a single list of pages (P) in memory with one hand. Modified: lists of pages in ranks 1 to 4 with multiple hands.]
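One way to read the figure is that the baseline clock algorithm sweeps a single hand over all page frames, while the modified version keeps a separate list and hand for each rank so that a victim can be chosen inside a specific rank. The sketch below follows that reading; the class names, the one-hand-per-rank interpretation, and the frame labels are assumptions, and each rank's list is assumed to be non-empty.

```python
class Clock:
    """Second-chance (clock) replacement over one circular list of frames."""

    def __init__(self, frames):
        self.frames = list(frames)
        self.referenced = {frame: False for frame in self.frames}
        self.hand = 0

    def touch(self, frame):
        """Record an access by setting the frame's reference bit."""
        self.referenced[frame] = True

    def evict(self):
        """Advance the hand to the first frame whose reference bit is clear."""
        while True:
            frame = self.frames[self.hand]
            self.hand = (self.hand + 1) % len(self.frames)
            if self.referenced[frame]:
                self.referenced[frame] = False  # give it a second chance
            else:
                return frame


class PerRankClock:
    """Assumed reading of the modified algorithm: one list and hand per rank."""

    def __init__(self, frames_by_rank):
        self.clocks = {rank: Clock(frames) for rank, frames in frames_by_rank.items()}

    def touch(self, rank, frame):
        self.clocks[rank].touch(frame)

    def evict_from(self, rank):
        """Pick a victim frame inside the given rank only."""
        return self.clocks[rank].evict()


memory = PerRankClock({1: ["a", "b"], 2: ["c", "d"]})
memory.touch(1, "a")
victim_rank1 = memory.evict_from(1)   # "a" was referenced, so "b" is chosen
victim_rank2 = memory.evict_from(2)   # nothing referenced, so "c" is chosen
print(victim_rank1, victim_rank2)
```

Keeping the hands independent means that reclaiming a frame for a thread assigned to one rank does not disturb the sweep state of the other ranks.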

16 Methodology
Simics + USIMM
8 RISC cores, UltraSPARC III ISA
3.2 GHz, 4-wide OoO, 64-entry RoB
32 KB I&D L1 caches, 4 cycles
4/8 MB shared L2 cache, 10 cycles
DRAM Specifications
2 Channels, 2 Ranks per Channel, 16 Banks per Rank
800 MHz DDR4 DRAM
SPEC 2006, NPB, CloudSuite, and PARSEC

17 Thread-to-rank Assignment
18% better than Staggered Refresh

18 Relaxing Rank Assignment

19 Comparisons to Prior Work

20 Conclusions
Exposes an important artifact in memory stalls
Service interruptions require a re-evaluation of data placement
RA (rank assignment) is a simple solution for an emerging problem
RA can also be leveraged to reduce the impact of NVM write drain
RA is a software solution that only requires best-effort page mapping
Outperforms hardware-only schemes

21 Thank You

