1 Lecture 2: Memory Energy Topics: energy breakdowns, handling overfetch, LPDRAM, row buffer management, channel energy, refresh energy.

2 Power Wall
Many contributors to memory power (see the Micron power calculator):
 Overfetch
 Channel
 Buffer chips and SerDes
 Background power (output drivers)
 Leakage and refresh

3 Power Wall
Memory system contribution (see the HP power advisor):
IBM data, from a WETI 2012 talk by P. Bose

4 Overfetch
Overfetch is caused by multiple factors:
 Each array is large (fewer peripherals → more density)
 Involving more chips per access → more data transfer pin bandwidth
 More overfetch → more prefetch; helps apps with locality
 Involving more chips per access → less data loss when a chip fails → lower overhead for reliability
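The overfetch ratio can be made concrete with back-of-the-envelope arithmetic. The numbers below (a 1 KB row per chip, an 8-chip rank, a 64 B cache line) are typical x8 DDR3 assumptions for illustration, not values from the slide:

```python
def overfetch_ratio(row_kb_per_chip, chips_per_rank, line_bytes):
    """Bits activated per access divided by bits actually used."""
    row_bytes = row_kb_per_chip * 1024 * chips_per_rank
    return row_bytes / line_bytes

# An activate brings 8 KB into row buffers to serve one 64 B line:
print(overfetch_ratio(1, 8, 64))  # 128.0 -> only 1/128th of activated bits are used
```

With these assumed parameters, 127 out of every 128 activated bits are fetched but never transferred, which is why large arrays and wide ranks dominate activation energy.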

5 Re-Designing Arrays
Udipi et al., ISCA’10

6 Selective Bitline Activation
Additional logic per array so that only the relevant bitlines are read out
Essentially results in finer-grain partitioning of the DRAM arrays
Two papers in 2010: Udipi et al., ISCA’10; Cooper-Balis and Jacob, IEEE Micro

7 Rank Subsetting
Instead of using all chips in a rank to read out 64-bit words every cycle, form smaller parallel ranks
Increases data transfer time; reduces the size of the row buffer
But it lowers energy per row read and is compatible with modern DRAM chips
Increases the number of banks and hence promotes parallelism (reduces queuing delays)
Initial ideas proposed in Mini-Rank (MICRO 2008) and MC-DIMM (CAL 2008 and SC 2009)

8 DRAM Variants – LPDRAM and RLDRAM
LPDDR (low power) and RLDRAM (low latency)
Data from Chatterjee et al. (MICRO 2012)

9 LPDRAM
Low-power device operating at lower voltages and currents
Efficient low-power modes, fast exit from low-power modes
Lower bus frequencies
Typically used in mobile systems (not in DIMMs)

10 Heterogeneous Memory
Chatterjee et al., MICRO 2012
Implement a few DIMMs/channels with LPDRAM and a few DIMMs/channels with RLDRAM
Fetch critical data from RLDRAM and non-critical data from LPDRAM
Multiple ways to classify data as critical or not:
 identify hot (frequently accessed) pages
 the first word of a cache line is often critical
Every cache line request is broken into two requests

11 Row Buffer Management
Open-page policy: maximizes row buffer hits, minimizes energy
Close-page policy: helps performance when there is limited locality
Hybrid policies: close a row buffer after it has served its utility; many ways to predict utility: time, access counts, per-bank locality counters, etc.
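A minimal sketch of one such hybrid policy, using elapsed idle time as the utility predictor (the class and its threshold are illustrative, not from any specific paper): the row stays open to capture hits, then is precharged once it has sat idle long enough that another hit looks unlikely:

```python
class HybridRowPolicy:
    """Keep a bank's row open until it has been idle too long, then close it."""
    def __init__(self, max_idle_cycles=100):
        self.max_idle = max_idle_cycles
        self.open_row = None
        self.idle = 0

    def access(self, row):
        """Serve an access; return True on a row buffer hit."""
        hit = (row == self.open_row)
        self.open_row = row
        self.idle = 0
        return hit

    def tick(self):
        """Called every cycle: precharge once the idle threshold is reached."""
        if self.open_row is not None:
            self.idle += 1
            if self.idle >= self.max_idle:
                self.open_row = None   # close (precharge) the row
```

With a large threshold this degenerates to open-page, with a threshold of zero to close-page; replacing the timer with an access counter or a per-bank locality counter gives the other hybrid variants the slide lists.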

12 Micro-Pages
Sudan et al., ASPLOS’10
Organize data across banks to maximize locality in a row buffer
Key observation: most locality is restricted to a small portion of an OS page
Such hot micro-pages are identified with hardware counters and co-located on the same row
Requires hardware indirection to a page’s new location
Works well only if most activity is confined to a few micro-pages
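The profiling step can be sketched in a few lines. The 256 B micro-page size and the counting scheme below are assumptions for illustration; the hardware uses counters rather than a software trace, but the selection logic is the same:

```python
from collections import Counter

def hot_micropages(access_trace, micropage_bytes=256, top_k=4):
    """Count accesses per micro-page and return the hottest ones,
    which the scheme would co-locate in one DRAM row via indirection."""
    counts = Counter(addr // micropage_bytes for addr in access_trace)
    return [mp for mp, _ in counts.most_common(top_k)]

# Addresses 0, 10, 12 fall in micro-page 0; 300 falls in micro-page 1:
print(hot_micropages([0, 10, 300, 0, 12, 300, 0], top_k=1))  # [0]
```

Co-locating the returned micro-pages in one row concentrates the accesses that would otherwise be spread across several rows, raising the row buffer hit rate.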

13 MemScale
Deng et al., ASPLOS 2011
Performs DVFS on the memory controller and DFS on the channel
The frequencies depend on bandwidth utilization and the estimated energy/performance drop
Requires no change to DRAM chips and DIMMs (in modern systems, the channel/DIMM frequency is set at boot time)
Only saves energy on the processor, not on the channel and DIMM
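A MemScale-style frequency choice can be sketched as picking the lowest setting whose bandwidth still covers observed demand. The frequency steps and the 10% headroom below are assumptions for illustration, not values from the paper:

```python
def pick_channel_freq(utilization, freqs=(400, 533, 667, 800), headroom=0.9):
    """Lowest frequency whose derated bandwidth covers current demand.
    `utilization` is measured relative to the peak frequency's bandwidth."""
    peak = max(freqs)
    demand = utilization * peak        # demand expressed in frequency units
    for f in sorted(freqs):
        if demand <= f * headroom:     # keep 10% slack to bound the perf loss
            return f
    return peak

print(pick_channel_freq(0.4))  # 400 -> light load runs at the lowest step
print(pick_channel_freq(0.8))  # 800 -> heavy load stays at full speed
```

The real mechanism additionally uses an online performance model to bound slowdown per epoch, but the shape of the decision is this bandwidth-headroom check.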

14 Refresh
Every DRAM cell must be refreshed within a 64 ms window
A row read/write automatically refreshes that row
Each refresh command refreshes a number of rows; the memory system is unavailable during that time
A refresh command is issued by the memory controller once every 7.8 µs on average
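The 7.8 µs figure follows directly from standard DDRx arithmetic: 8192 auto-refresh commands must be spread evenly across the 64 ms retention window:

```python
def refresh_interval_us(retention_ms=64, commands_per_window=8192):
    """Average spacing between refresh commands (the tREFI parameter)."""
    return retention_ms * 1000 / commands_per_window

print(refresh_interval_us())  # 7.8125 -> the ~7.8 us quoted on the slide
```

Each of those commands covers a batch of rows (e.g., 8192 commands x 8 rows for a 64K-row bank), which is why the bank is unavailable for a multi-row burst at every tREFI.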

15 RAIDR
Liu et al., ISCA 2012
Process variation impacts the leakage rate of each cell
Groups of rows are classified into bins based on leakage rate
Each bin has its own refresh interval (a multiple of 64 ms), tracked with Bloom filters
Prior work:
 Smart Refresh: skip refresh for recently read rows
 Flikker: non-critical data is placed in rows that are refreshed less frequently
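The bin lookup above can be sketched as two Bloom-filter membership tests (the filter sizes, hash scheme, and 64/128/256 ms bins below are illustrative; a Bloom filter never yields a false negative, so a leaky row is never refreshed too slowly):

```python
class BloomFilter:
    """Tiny Bloom filter: no false negatives, small false-positive rate."""
    def __init__(self, size=4096, hashes=3):
        self.size, self.hashes = size, hashes
        self.bits = [False] * size

    def _indices(self, item):
        return [hash((i, item)) % self.size for i in range(self.hashes)]

    def add(self, item):
        for i in self._indices(item):
            self.bits[i] = True

    def __contains__(self, item):
        return all(self.bits[i] for i in self._indices(item))

def refresh_period_ms(row, weak_64ms, weak_128ms):
    """RAIDR-style lookup: leaky rows fall in the short-period bins;
    all remaining rows tolerate the longest refresh period."""
    if row in weak_64ms:
        return 64
    if row in weak_128ms:
        return 128
    return 256

weak_64ms, weak_128ms = BloomFilter(), BloomFilter()
weak_64ms.add(7)      # row 7 profiled as very leaky
weak_128ms.add(9)     # row 9 profiled as moderately leaky
print(refresh_period_ms(7, weak_64ms, weak_128ms))   # 64
```

A false positive only demotes a strong row to a shorter period (wasting a little energy, never correctness), which is what makes compact Bloom filters safe here.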
