© 2007 IBM Corporation
HPCA – 2010
Improving Read Performance of PCM via Write Cancellation and Write Pausing
Moinuddin Qureshi, Michele Franceschini, and Luis Lastras
IBM T. J. Watson Research Center, Yorktown Heights, NY

Slide 2: Introduction
More cores in system → more concurrency → larger working set.
DRAM-based memory systems are hitting power, cost, and scaling walls.
Phase Change Memory (PCM): emerging technology, projected to be more scalable, higher density, and power-efficient.

Slide 3: PCM Operation
Switching is done by heating, using electrical pulses.
RESET state: amorphous (high resistance).
SET state: crystalline (low resistance).
Read latency is 2x-4x that of DRAM. Write latency is much higher.
[Figure: temperature versus time for the RESET and SET pulses, with T_melt and T_cryst marked. Cell diagram showing the access device and the memory element, with labels for large and small current. Photo courtesy: Bipin Rajendran, IBM.]

Slide 4: Problem of Contention from Slow Writes
PCM writes are 4x-8x slower than reads.
Writes are not latency critical. Typical response: use large buffers and intelligent scheduling.
But once a write is scheduled to a bank, a later-arriving read has to wait.
A write request causes contention for reads → increased read latency.

Slide 6: Configuration: Hybrid Memory
Each bank has a separate RDQ and WRQ (32-entry).
Baseline uses read priority scheduling if the WRQ is less than 80% full.
If the WRQ is more than 80% full, it uses an oldest-first policy → "forced write" (rare, <0.1%).
[Figure: processor chip, DRAM cache, and PCM-based main memory; the diagram carries a "(256MB)" label.]
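The baseline policy on this slide can be sketched as a small per-bank selection function. This is only an illustration: the name `pick_next`, the `(arrival_time, request_id)` tuple layout, and the reading of "oldest-first" as "oldest request across both queues" are assumptions, not details given in the slides.

```python
from collections import deque

WRQ_CAPACITY = 32        # per-bank write queue size stated on the slide
FORCED_FRACTION = 0.80   # above this occupancy the bank enters forced mode

def pick_next(rdq, wrq):
    """Pick the next request for one bank (hypothetical helper).

    rdq and wrq are deques of (arrival_time, request_id), oldest at the left.
    Returns (kind, request), or None when both queues are empty.
    """
    if not rdq and not wrq:
        return None
    forced = len(wrq) > FORCED_FRACTION * WRQ_CAPACITY
    if forced:
        # Assumed reading of "oldest-first": serve whichever head is older.
        if rdq and rdq[0][0] < wrq[0][0]:
            return ("read", rdq.popleft())
        return ("forced_write", wrq.popleft())
    if rdq:
        # Read priority: reads bypass pending writes.
        return ("read", rdq.popleft())
    return ("write", wrq.popleft())
```

With a 32-entry WRQ, the 80% limit is crossed at 26 entries, which matches the "Forced (26+)" bucket used later in the deck.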

Slide 7: Problem
Writes significantly increase read latency (a problem only for asymmetric memories).
Read latency = 1K cycles. Write latency = 8K cycles (sensitivity in paper).
12 workloads, each with 8 benchmarks from SPEC06.
[Figure: effective read latency (cycles) and normalized execution time for four configurations: Baseline, No Read Priority, Write Latency=1K, Write Latency=0.]

Slide 9: Write Cancellation
Write Cancellation: "abort" an ongoing write to improve read latency.
A cancelled write leaves the line in a non-deterministic state, so a read request that matches it is serviced from the WRQ.
Perform write cancellation as soon as a read request arrives at a bank (as long as the write is not being done in forced mode).

Slide 10: Write Cancellation with Static Threshold
WCST: cancel a write request only if less than K% of its service is done.
Canceling a write request that is close to completion is wasteful and causes episodes of forced writes (low performance).
[Figure labels: 2365, (NeverCancel), (AlwaysCancel).]
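As a minimal sketch, the WCST rule reduces to one comparison. The function name and parameters below are hypothetical, and progress is assumed to be measured as elapsed cycles over the total write latency.

```python
def should_cancel_wcst(elapsed_cycles, write_latency_cycles, k_percent, forced=False):
    """Decide whether an in-flight write is cancelled for an arriving read.

    Hypothetical helper: cancels only when less than k_percent of the
    write's service time has elapsed, and never for a forced write.
    """
    if forced:
        return False
    percent_done = 100.0 * elapsed_cycles / write_latency_cycles
    return percent_done < k_percent
```

Under this formulation K=0 never cancels and K=100 cancels any write that has not yet finished, which lines up with the NeverCancel and AlwaysCancel labels on the slide.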

Slide 11: Adaptive Write Cancellation
The best threshold depends on the number of pending entries in the WRQ.
Fewer entries → higher threshold (best read latency).
More entries → lower threshold (reduces forced writes).
Write Cancellation with Adaptive Threshold (WCAT):
Threshold = 100 – (4 * NumEntriesInWRQ)
[Figure: threshold (0% to 100%) versus number of entries in the WRQ (ticks at 10, 20, 30), with a high-threshold region, a low-threshold region, and a forced-writes region.]
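The WCAT formula can be written out directly. In the sketch below the function names are hypothetical, and clamping the threshold at 0 is an assumption made so that occupancies above 25 entries do not produce a negative percentage.

```python
def wcat_threshold(num_entries_in_wrq):
    """Threshold (in percent) from the slide: 100 - 4 * NumEntriesInWRQ.

    The clamp at 0 is an assumption; the slide gives only the linear formula.
    """
    return max(0, 100 - 4 * num_entries_in_wrq)

def should_cancel_wcat(elapsed_cycles, write_latency_cycles, num_entries_in_wrq):
    """Cancel the in-flight write only if its progress is below the adaptive threshold."""
    percent_done = 100.0 * elapsed_cycles / write_latency_cycles
    return percent_done < wcat_threshold(num_entries_in_wrq)
```

For example, an empty WRQ gives a threshold of 100%, 10 pending entries give 60%, and 25 entries give 0%, so cancellation stops just before the 26-entry forced-write region.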

Slide 12: Adaptivity of WCAT

Num Entries in WRQ   Low (0-1)   Med (2-13)   High (14-25)   Forced (26+)
WCST (K=75%)         61.4%       29.8%        7.4%           1.43%
WCAT                 58.2%       35.4%        5.6%           0.72%

WCAT uses a higher threshold initially, when the WRQ is empty, but its lower threshold later reduces the episodes of forced writes.
We sampled all WRQs every 2M cycles to measure occupancy.

Slide 15: Iterative Write in PCM Devices
In Multi-Level Cells (MLC), the programming precision requirement increases linearly with the number of levels.
PCM cells respond differently to the same programming pulse.
Acknowledged solution to address this uncertainty: iterative writes.
Each iteration consists of the steps write-read-verify.
[Figure: flowchart of the Write, Read, Verify loop; "Not done" returns to Write, "Done" exits.]

Slide 16: Model for Iterative Writes
We develop an analytical model to capture the number of iterations, in terms of bits/cell, number of levels written in one shot, and learning.
The time required to write a line is the worst case over all cells in the line.
Average number of iterations: 8.3 (consistent with MLC literature).
MLC: 3 bits/cell.

Slide 17: Concept of Write Pausing
Iterative writes can be paused to service pending read requests.
Reads can be performed at the end of each iteration (potential pause point).
Better read latency with negligible write overhead.
We extend the iterative write algorithm of Nirschl et al. [IEDM'07] to support Write Pausing.
[Figure: write timeline of Iter 1 to Iter 4 with potential pause points between iterations; with pausing, read Rd X is serviced between Iter 2 and Iter 3.]
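The benefit of pausing can be illustrated with a toy timing model. Everything in this sketch is an assumption made for illustration: equal-length iterations, a read that arrives a given number of cycles after the write starts, and the function name itself. It is not the authors' simulator.

```python
def read_wait_cycles(arrival, num_iters, iter_cycles, pausing):
    """Cycles a read waits before the bank can start servicing it.

    arrival is measured in cycles from the start of the write. Without
    pausing the read waits for the whole write; with pausing it waits
    only until the next iteration boundary (the potential pause point).
    """
    write_end = num_iters * iter_cycles
    if arrival >= write_end:
        return 0  # the write has already finished
    if not pausing:
        return write_end - arrival
    # Round arrival up to the next multiple of iter_cycles (integer ceiling).
    next_boundary = -(-arrival // iter_cycles) * iter_cycles
    return next_boundary - arrival
```

With an illustrative write of 8 iterations of 1000 cycles each, a read arriving at cycle 1500 waits 6500 cycles without pausing and 500 cycles with pausing.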

Slide 20: Write Pausing + WCAT
Only one iteration is cancelled → "micro-cancellation" has low overhead.
[Figure: three write timelines of Iter 1 to Iter 4 with read Rd X; in the last timeline Iter 2 is marked as cancelled.]
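One way to picture micro-cancellation is to apply the adaptive threshold to the iteration in progress rather than to the whole write. The sketch below takes that reading as an assumption; the slides do not spell out how the threshold is applied per iteration, and the function name and return convention are hypothetical.

```python
def service_read_with_micro_cancel(arrival, iter_cycles, wrq_entries):
    """Return (read_wait_cycles, wasted_write_cycles) for a read arriving mid-write.

    Assumed policy: if the current iteration is less far along than the
    WCAT threshold, cancel just that iteration and serve the read at once;
    otherwise let the iteration finish and serve the read at the pause point.
    """
    elapsed_in_iter = arrival % iter_cycles
    if elapsed_in_iter == 0:
        return (0, 0)  # read arrives exactly at a pause point
    percent_done = 100.0 * elapsed_in_iter / iter_cycles
    threshold = max(0, 100 - 4 * wrq_entries)  # WCAT formula, clamped at 0 (assumption)
    if percent_done < threshold:
        # Only the work of this one iteration is lost and must be redone.
        return (0, elapsed_in_iter)
    return (iter_cycles - elapsed_in_iter, 0)
```

In this model the wasted work is bounded by a single iteration, which is the sense in which the overhead of micro-cancellation stays low.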

Slide 24: Summary
- Slow writes increase the effective read latency (2.3x)
- Write Cancellation: cancel an ongoing write to service a read
- Threshold-based write cancellation
- Adaptive Threshold: better performance, half the overhead
- Write Pausing exploits iterative writes to service pending reads
- Write Pausing + Micro Cancellation is close to optimal pause
- Effective read latency: from 2365 to 1330 cycles (1.45x speedup)
- We will need large write buffers to exploit the benefit of Pausing

Slide 27: Workloads and Figure of Merit
12 memory-intensive workloads from SPEC 2006:
- 6 rate-mode (eight copies of the same benchmark)
- 6 mix-mode (two copies of four benchmarks)
Key metric: Effective Read Latency
Tin = time at which the read request enters the RDQ
Tout = time at which the read request finishes service at memory
Effective Read Latency = Tout – Tin (average reported)
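The metric is a plain average of per-request differences. A minimal sketch, assuming each read is recorded as a hypothetical `(t_in, t_out)` pair in cycles:

```python
def effective_read_latency(requests):
    """Average of (t_out - t_in) over all read requests.

    requests is a sequence of (t_in, t_out) pairs in cycles; the pair
    layout is an assumption made for this example.
    """
    if not requests:
        raise ValueError("at least one read request is required")
    latencies = [t_out - t_in for t_in, t_out in requests]
    return sum(latencies) / len(latencies)
```

Because Tin is taken when the request enters the RDQ, any time spent queued behind an in-flight write is counted in the latency, which is why slow writes show up in this metric.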
