Phase Change Memory What to wear out today? Chris Craik, Aapo Kyrola, Yoshihisa Abe.



Memory Technologies
Concerns
– Density
– Latency
– Energy
Off-chip technologies
– DRAM: moderately dense, but not very fast
– Flash: fairly dense, but near-disk slowness

Evaluation of Technologies

             | DRAM          | NAND Flash | NOR Flash
Density      |               |            |
Read Latency | 60 ns         | 25,000 ns  | 300 ns
Write Speed  | 1000 MB/s     | 2.4 MB/s   | 0.5 MB/s
Endurance    | Eff. infinite | 10^4       |
Retention?   | Refresh       | 10 years   |

Phase Change Memory
Bit recorded in "Phase Change Material"
– SET to 1 by heating to the crystallization point
– RESET to 0 by heating to the melting point
– Resistance indicates state

Phase Change Memory
Density
– 4x increase over DRAM
Latency
– 4x increase over DRAM (i.e. slower)
Energy
– No leakage
– Reads are worse (2x), writes much worse (40x)
Wear-out
– Limited number of writes (but better than Flash)
Non-volatile
– Data persists in memory

Evaluation of Technologies

             | DRAM          | NAND Flash | NOR Flash | PCM
Density      |               |            |           |
Read Latency | 60 ns         | 25,000 ns  | 300 ns    | ns
Write Speed  | 1000 MB/s     | 2.4 MB/s   | 0.5 MB/s  | 100 MB/s
Endurance    | Eff. infinite | 10^4       |           | 10^6 to 10^8
Retention?   | Refresh       | 10 years   |           |

Solutions to wearing & energy
Writes cause thermal expansion / contraction that wears the material, and they require a strong current. Unlike DRAM, however, PCM does not leak energy.
Partial writes = write only the bits that have changed (most written bits are redundant!)
a) Caches keep track of written bytes/words per cache line (Lee et al.)
– Trade-off: storage overhead vs. accuracy
b) When writing a row to memory, first read the old row and compare, then write only the modified bits (Zhou et al.)
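The read-compare-write idea in (b) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `partial_write` and the example row contents are hypothetical, and a real memory controller does this comparison in hardware rather than in software.

```python
from typing import Tuple


def partial_write(old_row: bytes, new_row: bytes) -> Tuple[bytes, int]:
    """Compare the stored row with the incoming row.

    Returns a per-byte mask of the bits that must be programmed and the
    number of bits that actually change. (Hypothetical helper, not the
    mechanism from the cited papers.)
    """
    if len(old_row) != len(new_row):
        raise ValueError("rows must have the same length")
    # XOR marks exactly the bit positions whose value differs.
    diff_mask = bytes(o ^ n for o, n in zip(old_row, new_row))
    bits_written = sum(bin(b).count("1") for b in diff_mask)
    return diff_mask, bits_written


# Made-up example data: only the lowest bit of the first byte changes.
old = bytes([0b1111_0000, 0b0000_0000, 0b1010_1010])
new = bytes([0b1111_0001, 0b0000_0000, 0b1010_1010])

mask, flipped = partial_write(old, new)
total_bits = 8 * len(new)
print(f"programmed {flipped} of {total_bits} bits")
```

In this toy case a full-row write would program all 24 bits, while the compare step shows that a single bit needs to change.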

Solutions to wearing & energy (cont.)
Buffer organisation (Lee et al.)
– DRAM uses one row buffer (2048B)
– Propose using up to 32 * 64B narrow buffers, each with own association
– Narrow buffers capture coalescing writes: temporal locality is more important than spatial locality
– 4 * 512B found to be the most effective area-neutral configuration
– Also helps decrease latency
Small DRAM buffer for PCM (Qureshi et al.)
– Combine the low latency of DRAM with the high capacity of PCM
– Similar to using a Flash cache for disk
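A rough way to see why several narrow buffers can coalesce more writes than one wide buffer is the toy model below. It is a sketch, not the design from Lee et al.: the class `NarrowBuffers`, the LRU eviction policy, and the address trace are all assumptions made for illustration, and it models writes only, so every buffered chunk is treated as dirty.

```python
from collections import OrderedDict


class NarrowBuffers:
    """Hypothetical model: N buffers, each holding one aligned chunk (LRU)."""

    def __init__(self, num_buffers: int, width_bytes: int):
        self.num_buffers = num_buffers
        self.width = width_bytes
        self.buffers = OrderedDict()  # chunk id -> None, oldest first
        self.array_writes = 0         # write-backs to the PCM array

    def write(self, address: int) -> None:
        chunk = address // self.width
        if chunk in self.buffers:
            # Hit: the write is coalesced in the buffer.
            self.buffers.move_to_end(chunk)
            return
        if len(self.buffers) >= self.num_buffers:
            # Miss with all buffers in use: evict the least recently used
            # chunk, which costs one write-back to the array.
            self.buffers.popitem(last=False)
            self.array_writes += 1
        self.buffers[chunk] = None

    def flush(self) -> None:
        # Write back everything that is still buffered.
        self.array_writes += len(self.buffers)
        self.buffers.clear()


def replay(num_buffers: int, width_bytes: int, trace) -> int:
    model = NarrowBuffers(num_buffers, width_bytes)
    for address in trace:
        model.write(address)
    model.flush()
    return model.array_writes


# Made-up trace that alternates between two distant addresses.
trace = [0, 4096, 0, 4096, 0, 4096]
single = replay(1, 2048, trace)  # one wide row buffer
narrow = replay(4, 512, trace)   # four narrow buffers, same total size
print(single, narrow)
```

On this trace the single wide buffer is evicted on every access, while the narrow buffers keep both hot chunks resident. Each narrow write-back also moves 512B instead of 2048B, which the counter does not show.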

Solutions to wearing & energy (cont.)
Wear leveling (Zhou et al.)
– Row shifting: even out writes among the cells in a row
  – Needs extra hardware
– Segment swapping: even out writes between pages
  – Implemented in the memory controller
Spatial locality is now a problem!
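Row shifting can be illustrated with a small Python model. It is a sketch under assumptions: the class `ShiftedRow`, the one-cell shift step, the shift interval, and the choice to charge one write per cell for each shift are all hypothetical and not taken from Zhou et al.

```python
class ShiftedRow:
    """Hypothetical row whose contents rotate by one cell every few writes."""

    def __init__(self, num_cells: int, shift_interval: int):
        self.cells = [0] * num_cells   # stored bits, by physical position
        self.wear = [0] * num_cells    # writes seen by each physical cell
        self.offset = 0                # current logical-to-physical rotation
        self.shift_interval = shift_interval
        self.writes_since_shift = 0

    def _physical(self, logical: int) -> int:
        return (logical + self.offset) % len(self.cells)

    def read(self, logical: int) -> int:
        return self.cells[self._physical(logical)]

    def write(self, logical: int, value: int) -> None:
        p = self._physical(logical)
        self.cells[p] = value
        self.wear[p] += 1
        self.writes_since_shift += 1
        if self.writes_since_shift >= self.shift_interval:
            self._shift()

    def _shift(self) -> None:
        # Move every stored bit up by one physical position (with wrap).
        self.cells = [self.cells[-1]] + self.cells[:-1]
        self.offset = (self.offset + 1) % len(self.cells)
        # Assumption: the shift rewrites the row, so charge each cell once.
        self.wear = [w + 1 for w in self.wear]
        self.writes_since_shift = 0


# Hammer one logical cell 80 times, with and without shifting.
shifted = ShiftedRow(num_cells=8, shift_interval=10)
fixed = ShiftedRow(num_cells=8, shift_interval=10**9)  # never shifts here
for i in range(80):
    shifted.write(0, i % 2)
    fixed.write(0, i % 2)

print(max(fixed.wear), max(shifted.wear))
```

Without shifting, a single physical cell absorbs all 80 writes. With shifting, the same logical cell lands on a different physical cell after every interval, so the worst-case wear drops even after counting the cost of the shifts themselves.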

PCM as On-chip Cache
Hybrid on-chip cache architecture consisting of multiple memory technologies
– PCM, SRAM, embedded DRAM (eDRAM), and Magnetic RAM (MRAM)
PCM is slow compared to SRAM etc.
– But high density, non-volatility etc. help
Use as a complement to faster memory technologies
– As "slow" L2 cache, as L3 cache etc.

Cache Structure Example
Use PCM as a huge L3 cache, with SRAM and eDRAM both as L2
– Faster and smaller SRAM region
– Slower and larger eDRAM region
Hybrid design: Core w/ L1, L2 SRAM (Fast: 256KB), L2 eDRAM (Slow: <4MB), L3 PCM (32MB)
Baseline with the same footprint: Core w/ L1, L2 SRAM (256KB), L3 SRAM (1MB)
Compared to the 3-level SRAM cache model
– 18% improvement in instructions per cycle
– Comparable power consumption, despite the additional layer of PCM and its large capacity
Various design possibilities
– PCM as "third" L2 cache etc.

Summary
PCM can be a viable approach towards next-generation memory architectures
– High density, non-volatility
– Various techniques to overcome its shortcomings: limited endurance, high-energy writes, latencies
– Could be used as main memory or in the on-chip cache hierarchy

Questions
How well do results obtained on benchmark apps translate to real usage?
Variance in the endurance of memory cells?
– Might some cells wear out very quickly?
Possibilities of PCM non-volatility
– Instant wake-up from hibernation etc.