A Case for Refresh Pausing in DRAM Memory Systems. Prashant Nair, Chia-Chen Chou, Moinuddin Qureshi.


1 A Case for Refresh Pausing in DRAM Memory Systems. Prashant Nair, Chia-Chen Chou, Moinuddin Qureshi.

2 Introduction. Dynamic Random Access Memory (DRAM) is used as main memory. DRAM stores data as charge on a capacitor, and that charge leaks quickly, so DRAM cells lose their data. DRAM is a volatile memory.

3 Refresh: Restoring Data in DRAM. DRAM maintains data through Refresh operations, which restore the charge on the cells. JEDEC-specified DRAM retention time: 64ms (below 85°C), 32ms (above 85°C). The time between Refreshes must be ≤ the retention time. DRAM relies on Refresh for data integrity.

4 Refresh: A Growing Problem. Time spent in Refresh is proportional to the number of rows, so increasing memory capacity means more time spent in Refresh.
Chip density: 1Gb, 2Gb, 4Gb, 8Gb, 16Gb, 32Gb
Time spent in Refresh: 2.8%, 5.1%, 7.7%, 9%, ~18%, ~36%
The time for doing Refresh is increasing with chip density.

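5 Refresh Blocks Reads. Memory is unavailable for Read/Write during Refresh. With no Refresh, reads A and B are serviced one after the other. With a Refresh in progress, B must wait until the Refresh finishes before it is serviced (interference due to Refresh). Refresh blocks reads, which means higher read latency.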

6 Impact of Refresh. The impact of Refresh is significant, and increasing. Our goal: reduce the read latency impact of Refresh.

7 Outline. Introduction & Motivation; Refresh Operation: Background; Refresh Pausing; Evaluation; Alternative Proposals; Summary.

8 Refresh Operation. A DRAM bank is made up of rows (Row 1 through Row n). Refresh operates at row granularity.

9 Refresh Modes. Burst Mode: memory is unavailable until all rows finish refresh. Distributed Mode: 8K refresh pulses are spread over 64ms. Distributed mode reduces contention from Refresh.
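The spacing between pulses in distributed mode follows from the two numbers on this slide. A minimal sketch, assuming "8K" means 8192 pulses and that pulses are evenly spaced over the retention window; the function name is illustrative, not from the talk:

```python
def refresh_interval_us(retention_ms: float, pulses: int = 8192) -> float:
    """Average time between refresh pulses in distributed mode, in microseconds.

    Assumes `pulses` evenly spaced pulses per retention window
    ("8K" on the slide is read as 8192).
    """
    return retention_ms * 1000.0 / pulses


# 64ms retention (below 85 C) and 32ms retention (above 85 C)
interval_normal_us = refresh_interval_us(64)
interval_hot_us = refresh_interval_us(32)
print(interval_normal_us, interval_hot_us)
```

Halving the retention time at high temperature halves the spacing between pulses, which is why refresh contention is worse above 85°C.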

10 Refresh Bundle. Every pulse refreshes a 'bundle of rows'.
Chip size: rows in a Refresh bundle (per bank)
512Mb: 1
1Gb: 2
2Gb: 4
4Gb or 8Gb (twin 4Gb die): 8
Refresh bundles currently have up to 8 rows, and the number is increasing.
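One way to read the bundle sizes above: if every row in a bank must be refreshed once per retention window and the window holds a fixed number of pulses, each pulse has to cover rows-per-bank divided by pulses. The sketch below assumes 8192 pulses per window; the rows-per-bank figures are hypothetical inputs chosen for illustration and are not taken from the slides.

```python
REFRESH_PULSES = 8192  # assumption: "8K" pulses per retention window


def rows_per_bundle(rows_per_bank: int, pulses: int = REFRESH_PULSES) -> int:
    """Rows each pulse must refresh so every row is covered once per window."""
    # Ceiling division, with a floor of one row per pulse.
    return max(1, -(-rows_per_bank // pulses))


# Hypothetical rows-per-bank values (illustrative only).
ASSUMED_ROWS_PER_BANK = {"512Mb": 8192, "1Gb": 16384, "2Gb": 32768, "4Gb": 65536}

bundles = {chip: rows_per_bundle(rows) for chip, rows in ASSUMED_ROWS_PER_BANK.items()}
print(bundles)
```

Under these assumed row counts the bundle doubles each time the number of rows per bank doubles, matching the trend the slide describes.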

11 The Latency Wall of Refresh. T_RFC is the time to do refresh for every refresh pulse; memory is unavailable for T_RFC and available for the rest of the interval. T_RFC grows from 8Gb to 16Gb to 32Gb chips. Current 8Gb chips have a T_RFC of 350ns, much greater than the read latency. High T_RFC means a read waits a long time for refresh.
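The fraction of time a bank is unavailable can be estimated as T_RFC divided by the spacing between pulses. A rough sketch, assuming 8192 evenly spaced pulses per retention window and using the 350ns figure quoted for 8Gb chips; the function name is illustrative:

```python
def refresh_unavailable_fraction(t_rfc_ns: float, retention_ms: float,
                                 pulses: int = 8192) -> float:
    """Fraction of time memory is busy refreshing in distributed mode."""
    pulse_spacing_ns = retention_ms * 1e6 / pulses  # ms -> ns, per pulse
    return t_rfc_ns / pulse_spacing_ns


# 8Gb chip, T_RFC = 350ns, at both JEDEC retention times
frac_64ms = refresh_unavailable_fraction(350, 64)
frac_32ms = refresh_unavailable_fraction(350, 32)
print(f"{frac_64ms:.2%} at 64ms, {frac_32ms:.2%} at 32ms")
```

The estimate ignores any overlap between refresh and idle periods, so it is an upper bound on how often a read can collide with a refresh, not a measured slowdown.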

12 Outline. Introduction & Motivation; Refresh Operation: Background; Refresh Pausing; Evaluation; Alternative Proposals; Summary.

13 Refresh Pausing. Insight: make Refresh operations interruptible. Baseline system: Request B arrives during a Refresh and waits until the Refresh completes. Refresh Pausing: when Request B arrives, the Refresh is interrupted, B is serviced, and the Refresh then continues. Pausing Refresh reduces wait time for reads. Pausing at an arbitrary point can cause data loss.

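14 Refresh Pausing: When to Pause? A Refresh pulse covers a bundle of rows (4 rows in this example: a, b, c, d), each passing through the row buffer. Without Refresh Pausing, a read that arrives during the pulse is blocked until the whole bundle finishes. With Refresh Pausing, the Refresh pauses at a row boundary to service the read.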
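The benefit of pausing at a row boundary can be illustrated with a small wait-time model. This is a simplified sketch, not the authors' simulator: it assumes the rows of a bundle split T_RFC evenly and ignores any recovery time, and the function and parameter names are hypothetical.

```python
import math


def read_wait_ns(arrival_ns: float, t_rfc_ns: float,
                 rows_in_bundle: int, pausing: bool) -> float:
    """Wait for a read arriving `arrival_ns` after a refresh pulse began.

    Assumes each row takes an equal share of T_RFC (a simplification).
    """
    if not 0 <= arrival_ns < t_rfc_ns:
        return 0.0  # no refresh in progress, the read is not blocked
    if not pausing:
        return t_rfc_ns - arrival_ns  # wait for the whole bundle to finish
    per_row_ns = t_rfc_ns / rows_in_bundle
    # Wait only until the row currently being refreshed completes.
    next_boundary_ns = math.ceil(arrival_ns / per_row_ns) * per_row_ns
    return next_boundary_ns - arrival_ns


# Read arrives 100ns into a 350ns refresh of an 8-row bundle
print(read_wait_ns(100, 350, 8, pausing=False))  # 250.0
print(read_wait_ns(100, 350, 8, pausing=True))   # 31.25
```

In this model the worst-case wait with pausing is one row's share of T_RFC rather than the full T_RFC.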

15 Refresh Pausing: Interface Details. The Memory Controller generates a Refresh Enable (RE) signal to the DRAM. Pausing requires 'active low' detection of RE: RE going from 1 to 0 pauses the Refresh, and RE returning to 1 resumes it. Communication is one-way only.

16 Refresh Pausing: Track a Paused Row. In the DRAM's address generator, the Row Address Counter increments through the Refresh bundle addresses. A simple AND gate on the incrementer's enable (EN) input stops the increment. The active-low Refresh Enable (RE) acts as the 'Refresh Pause' signal.
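The counter gating described here can be modeled in a few lines. This is a behavioral sketch under the assumption that the incrementer's enable is the AND of a "row finished" strobe and RE; the class and signal names are illustrative, not taken from the talk.

```python
class RowAddressCounter:
    """Behavioral model of a refresh row counter gated by Refresh Enable."""

    def __init__(self, start_row: int = 0) -> None:
        self.row = start_row

    def tick(self, row_done: bool, refresh_enable: int) -> int:
        # EN = row_done AND RE: with RE low the counter holds its value,
        # so the paused row address is remembered for the resume.
        if row_done and refresh_enable:
            self.row += 1
        return self.row


counter = RowAddressCounter()
re_signal = [1, 1, 0, 0, 1]  # controller pauses for two steps, then resumes
trace = [counter.tick(row_done=True, refresh_enable=re) for re in re_signal]
print(trace)  # [1, 2, 2, 2, 3]
```

Because the counter simply holds while RE is low, no extra state is needed inside the DRAM to remember where the refresh stopped.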

17 Refresh Pausing: Memory Scheduler. The Memory Controller sits between the processor bus and the DRAM; it contains the Scheduler, a Read Queue, and a Write Queue, and drives Refresh Enable to the DRAM. The Scheduler schedules Read, Write, and Refresh; is responsible for pausing Refresh for a Read; and keeps track of the refresh time done before a Pause.

18 Forced Refresh. Pausing can delay Refresh. JEDEC allows a delay of up to 8 pending refreshes. If 8 refreshes are pending, the controller issues a 'Forced Refresh'. A Forced Refresh cannot be paused. Forced Refresh preserves data integrity.
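The pause and forced-refresh rules can be combined into a single scheduling decision. The following is a minimal sketch of one possible policy, assuming reads take priority over refresh until the limit of 8 pending refreshes is reached; writes are left out, and the function name and action labels are hypothetical.

```python
MAX_PENDING_REFRESHES = 8  # postponement limit quoted on the slide


def next_action(pending_refreshes: int, read_waiting: bool,
                refresh_in_progress: bool) -> str:
    """Pick the scheduler's next step (writes omitted for brevity)."""
    if pending_refreshes >= MAX_PENDING_REFRESHES:
        return "forced_refresh"  # must run to completion, cannot be paused
    if read_waiting:
        # Reads win: pause an ongoing refresh, otherwise just issue the read.
        return "pause_refresh_and_read" if refresh_in_progress else "read"
    if pending_refreshes > 0:
        return "refresh"  # start or resume a pausable refresh
    return "idle"


print(next_action(pending_refreshes=3, read_waiting=True, refresh_in_progress=True))
print(next_action(pending_refreshes=8, read_waiting=True, refresh_in_progress=False))
```

The forced-refresh check comes first so that data integrity always outranks read latency.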

19 Outline. Introduction & Motivation; Refresh Operation: Background; Refresh Pausing; Evaluation; Alternative Proposals; Summary.

20 Experimental Setup. Simulator: uSIMM from the Memory Scheduling Championship (MSC). Workloads: MSC Suite, COMMERCIAL (5), PARSEC (9), BIOBENCH (2), and SPEC (2).
Configuration:
Number of cores: 4
Last-level cache: 1MB
DRAM (DDR3): 8 chips/rank, 8Gb/chip
Channels, ranks, banks: 4, 2, 8
Refresh (baseline): Distributed (JEDEC)
Results are presented for temperature > 85°C (the paper also has < 85°C).

21 Results: Read Latency. Refresh Pausing gives ~7% read latency reduction for an 8Gb chip.

22 Results: Performance. Refresh Pausing gives ~5% performance improvement for an 8Gb chip.

23 Results: Impact of Chip Density. Refresh Pausing becomes more effective as chip density increases.

24 Outline. Introduction & Motivation; Refresh Operation: Background; Refresh Pausing; Evaluation; Alternative Proposals; Summary.

25 Elastic Refresh for Scheduling Refresh [MICRO'10]. Elastic Refresh waits for an idle period before issuing a refresh. It estimates the average inter-arrival time of memory requests. (Timeline figure: requests A and B with no refreshes, with refreshes, and with Elastic Refresh, which adds a wait; labeled durations of 3, 4, and 7 units.) The "Wait and Watch" policy can increase wait times.

26 Comparison with Elastic Refresh. Refresh Pausing outperforms Elastic Refresh.

27 DDR4 proposals: x2 and x4 modes. Fine Grained Refresh reduces contention from Refresh by shrinking the bundle size and issuing more bundles. Where DDR3 Distributed Mode uses T_REFI and T_RFC, DDR4 x2 mode uses T_REFI/2 and T_RFC/2. In x2 mode, T_REFI is reduced by a factor of 2 (in x4 mode, by 4). In x2 mode, T_RFC is reduced by a factor of 2 (in x4 mode, by 4).
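The scaling rule stated on this slide can be written out directly. The sketch follows the slide's rule that both parameters shrink by the mode factor; the baseline values (32ms retention, 8192 pulses, 350ns T_RFC) are example inputs, the function name is hypothetical, and real parts may not scale T_RFC this cleanly.

```python
def fine_grained_refresh(t_refi_ns: float, t_rfc_ns: float, mode: int):
    """Scale refresh timing for x1, x2, or x4 mode per the slide's rule."""
    if mode not in (1, 2, 4):
        raise ValueError("mode must be 1, 2, or 4")
    return t_refi_ns / mode, t_rfc_ns / mode


base_refi_ns = 32e6 / 8192  # assumed: 32ms window, 8192 pulses
base_rfc_ns = 350.0         # 8Gb figure quoted earlier in the talk

for mode in (1, 2, 4):
    refi, rfc = fine_grained_refresh(base_refi_ns, base_rfc_ns, mode)
    print(f"x{mode}: T_REFI={refi:.2f}ns T_RFC={rfc:.2f}ns busy={rfc / refi:.2%}")
```

Under this rule the fraction of time spent refreshing is unchanged; what shrinks is the longest stretch a read can be blocked by a single refresh.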

28 Comparison with DDR4. DDR4 modes (x2 and x4) are useful but not enough.

29 Outline. Introduction & Motivation; Refresh Operation: Background; Refresh Pausing; Evaluation; Alternative Proposals; Summary.

30 Summary. DRAM relies on Refresh for data integrity. Time for Refresh increases with chip density. Refresh blocks reads and increases read latency. Refresh Pausing makes Refresh interruptible. Pausing provides a 5% improvement for 8Gb chips, and the benefit increases with higher density. It is also applicable to DDR4 (fine-grained refresh).

31 Thank you.

32 Refresh + Read. Reads operate on a rank. Refreshes may also operate on the same rank. DRAMs serve only a single request at a time. (Diagram: the Scheduler sends Reads from the Read Queue and Refreshes to the same Rank.)

33 Refresh Row Bundle. T_RFC: time to refresh one bundle of rows. T_REC: current recovery time. T_REFI: time until the next bundle refresh. A larger refresh-row bundle implies a larger T_RFC.

34 DRAM Organization. DRAM is hierarchically organized as Channels, Ranks, and Banks. (Diagram: a Channel with Rank 1 and Rank 2, their Chips, Banks, and Rows, and a READ.)

35 Refresh Modes: Burst and Distributed Mode. Burst Mode: all rows in all banks refresh simultaneously. Distributed Mode: only a few rows in all banks refresh at a time; refresh is distributed in time.

36 Transactions in DRAMs. Three transactions are of concern: Reads, Writes, and Refreshes. Mismanagement of requests leads to collisions! A scheduler is needed to manage requests to DRAM.

37 Temperature Sensitivity of Refresh Pausing. Up to 22% increase in speedup for future chips. The savings from Refresh Pausing are higher when operating at high temperatures.

38 Auto and Self Refresh. Special Refresh modes for DRAMs. Auto Refresh: an internal counter issues pulses in distributed fashion (CBR and RAS only). Self Refresh: the DRAM is internally refreshed at a power-optimized rate (Activity == 0). Self Refresh modes are only used when DRAMs stay idle.

39 Mitigating Penalty. Pause a refresh bundle at row granularity. T_RPC = row cycle time + current recovery time. Current recovery time is small for individual rows, so refreshes can be made interruptible. a. Maximum Refresh penalty without pausing is T_RFC. b. Maximum Refresh penalty with pausing is T_RPC.

