Accelerating DRAM Performance

1 Accelerating DRAM Performance
Bill Gervasi, Chairman, JEDEC Memory Parametrics

2 Simple, incremental steps
RAM Evolution: Mainstream Memories
[Roadmap chart: peak MB/s on a 64-bit module, rising in simple, incremental steps]
“DDR I”: DDR266 = 2100 MB/s, DDR333 = 2700 MB/s
“DDR II”: DDR400 = 3200 MB/s, DDR533 = 4300 MB/s, DDR667 = 5400 MB/s
The industry standards roadmap for main memories, shown by megabytes per second on a 64-bit module. At each step, designers have been able to carry the lessons of the previous generation into the next. Also in each case, a single controller has been able to support at least two generations simultaneously. DDR is the latest member of this continuum, doubling the peak performance of SDR, and the roadmap continues into the future with DDR II, providing 3.2 GB/s at a 400 MHz data rate. DDR II will be an easy migration from DDR I designs.
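The throughput figures on this roadmap follow from the per-pin data rate and the 64-bit (8-byte) module width. A minimal Python sketch of that arithmetic, assuming nominal data rates (266.67, 333.33, and so on) and treating the slide's MB/s figures as rounded labels; the function and table names are illustrative, not taken from the presentation:

```python
# Peak module bandwidth = data rate (mega-transfers/s) x bus width in bytes.
# Assumption: nominal data rates; the roadmap's MB/s labels are rounded.
BUS_WIDTH_BITS = 64  # module width stated on the slide


def peak_mb_per_s(data_rate_mts, bus_width_bits=BUS_WIDTH_BITS):
    """Return peak throughput in MB/s for a data rate given in MT/s."""
    return data_rate_mts * bus_width_bits / 8


# Assumed nominal data rate paired with the label shown on the roadmap.
ROADMAP = {
    "DDR266": (266.67, 2100),
    "DDR333": (333.33, 2700),
    "DDR400": (400.00, 3200),
    "DDR533": (533.33, 4300),
    "DDR667": (666.67, 5400),
}

for name, (rate, label) in ROADMAP.items():
    computed = peak_mb_per_s(rate)
    print(f"{name}: computed {computed:.0f} MB/s, slide label {label} MB/s")
```

The computed values land within 100 MB/s of each label, which is consistent with the labels being rounded names rather than exact measurements.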

3 Key to System Evolution
Never over-design!
Implement just enough new features to achieve incremental improvements
Drive the hell out of the volumes to get the enhancements for free

4 New Specifications
DDR I: DDR333 chips, PC2700 Unbuffered DIMM, PC2700 Registered DIMM, PC2700 SO-DIMM, PC2700 MicroDIMM
DDR II: DDR400 chips, DDR533 chips, DDR667 chips, DIMM outline

5 DDR333
333 MHz data rate per pin
Approved for both TSOP and FBGA
First introduction of FBGA into SDRAM family
One package-dependent timing consideration!
Most improvements from tighter DLL design
Purpose of the DLL is accurate delivery of data and strobes during read cycles

6 DLL Effects
[Timing diagram: CK, CK#, and DQS, with tDQSCK* marked]
tDQSCK*: DDR266 = 750 ps, DDR333 = 600 ps
Contributors: clock jitter, pulse width distortion, DQS pull in or push out from pattern effects, p-channel to n-channel variation

7 Data Capture Parameters
[Timing diagram, simplified view: DQS and data, with tDQSQ* and tQHS* marked]
DDR266 = 750 ps; DDR333 = 550 ps for TSOP, 600 ps for FBGA
DDR266 = 750 ps; DDR333 = 450 ps for TSOP, 400 ps for FBGA
Contributors: data pin skew, simultaneous switching output effects, output driver variation
Note that data valid window width is package independent!
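The package-independence note can be checked with simple arithmetic. The sketch below assumes the usual relationship, data valid window = tHP - tQHS - tDQSQ, with tHP taken as an ideal half clock period (no duty-cycle distortion). Because only the sum of the two skew terms matters, the check does not depend on which group of slide values belongs to which parameter. All names are illustrative:

```python
# Assumption: data valid window = tHP - tQHS - tDQSQ, where tHP is an ideal
# half clock period. One data bit is transferred per half period.
def half_period_ps(data_rate_mts):
    """Ideal half clock period in picoseconds for a data rate in MT/s."""
    return 1e6 / data_rate_mts


def data_valid_window_ps(data_rate_mts, skew_a_ps, skew_b_ps):
    """Window left after subtracting both skew terms from the half period."""
    return half_period_ps(data_rate_mts) - skew_a_ps - skew_b_ps


# DDR333 values from the slide, per package (the two skew parameters, in ps).
DDR333_SKEWS = {"TSOP": (550, 450), "FBGA": (600, 400)}

windows = {
    package: data_valid_window_ps(333.33, a, b)
    for package, (a, b) in DDR333_SKEWS.items()
}
for package, window in windows.items():
    print(f"DDR333 {package}: about {window:.0f} ps data valid window")
```

Both packages sum to 1000 ps of skew, so the remaining window is the same for TSOP and FBGA under this assumption.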

8 DDR II
[Diagram: DDR400, DDR533, DDR667, SS800]
Main System Memory: Modules, etc.
Embedded Applications: Point to Point

9 The DDR II Family
DDR II similarities to DDR I:
  Compatible RAS/CAS command set & protocol
DDR II differences from DDR I:
  DDR I = 2.5V, DDR II = 1.8V with calibration
  Prefetch 4
  Differential data strobes
  Improved command bus utilization:
    Write latency as a function of read latency
    Additive latency to help fill holes
  New FBGA package & memory modules
  Tighter package parasitics

10 DDR II Improves DDR I
Enables higher burst frequency
Makes better use of command slots
Lower voltage swing simplifies system concerns

11 DDR II Data Capture
[Timing diagram, simplified view: DQS and data, with tDQSQ* and tQHS* marked]
DDR400 = 450 ps, DDR533 = 400 ps
DDR400 = 350 ps, DDR533 = 300 ps
Improvements from a combination of packaging, process, and voltage levels

12 Preparing for DDR II
Transitional controllers will need 2.5V or 1.8V selectable I/Os
Allocate pins for differential data strobe; tie one pin to VREF for DDR I mode
Use fixed burst length = 4
Programmable write latency:
  WL = 1 for DDR I compliance
  WL = Read Latency - 1 for DDR II compliance
Optional: additive latency
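One way to picture a transitional controller is as a small table of mode-dependent settings. The sketch below is hypothetical: the `InterfaceSettings` type, the `settings_for` function, and the mode strings are invented for illustration, and only the values (2.5V or 1.8V I/O, strobe style, burst length 4, and the write latency rules) come from the slide. Read Latency is taken as additive latency plus CAS latency.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InterfaceSettings:
    """Hypothetical bundle of mode-dependent controller settings."""
    vddq_volts: float
    differential_dqs: bool  # False: one strobe pin tied to VREF (DDR I mode)
    burst_length: int
    write_latency: int


def settings_for(mode, cas_latency, additive_latency=0):
    """Return settings for "DDR1" or "DDR2" (mode names are assumptions)."""
    if mode == "DDR1":
        # DDR I compliance: 2.5V I/O, single-ended strobe, WL fixed at 1.
        return InterfaceSettings(2.5, False, 4, 1)
    if mode == "DDR2":
        # DDR II compliance: WL = Read Latency - 1, with RL = AL + CL.
        read_latency = additive_latency + cas_latency
        return InterfaceSettings(1.8, True, 4, read_latency - 1)
    raise ValueError(f"unknown mode: {mode!r}")


print(settings_for("DDR1", cas_latency=2))
print(settings_for("DDR2", cas_latency=3, additive_latency=1))
```

Keeping the burst length fixed at 4 in both modes means only the voltage, strobe wiring, and write latency differ between the two configurations in this sketch.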

13 1.8V I/O Voltage

14 1.8V Signaling
Level    SSTL_2   SSTL_18
VDDQ     2.5V     1.8V
VIHac    1.60V    1.15V
VIHdc    1.43V    1.03V
VREF     1.25V    0.90V
VILdc    1.07V    0.77V
VILac    0.90V    0.65V
VSS      0V       0V

15 I/O Calibration
Balance n- and p-channel driver strength
Protocol defined for initializing memory interface
Command tells the DRAM to hold signals in a state; the controller overdrives and adjusts drive strength to match
[Diagram labels: Controller, DRAM, Data, VTT, VREF]

16 Differential Data Strobe

17 Differential Data Strobe
Just as DDR added differential clock to SDR, DDR II adds differential data strobe to DDR I
Transition at the crosspoint of DQS and DQS#
Route these signals as a differential pair
Common mode noise rejection

18 Differential Data Strobe
[Waveforms: single-ended DQS measured against VREF, with DQS high time and DQS low time marked]
Normal balanced signal
Mismatched rise & fall signal: Error!

19 Differential Data Strobe
[Waveforms: DQS and DQS# shown with VREF, with DQS high time and DQS low time marked]
Normal balanced signal
Mismatched rise & fall signal: significantly reduced symmetry error

20 Prefetch 4

21 Moving to the Next Level
Today’s SDRAM architectures assume inexpensive DRAM core timing
DDR I (DDR200, DDR266, and DDR333) prefetches 2 data bits: increases performance without increasing core timing costs
DDR II (DDR400, DDR533, DDR667) prefetches 4 bits internally, but keeps DDR double-pumped I/O
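The effect of prefetch on core timing can be sketched numerically. Assuming the core must deliver one prefetch-wide access per group of transfers, the required core access rate is roughly the pin data rate divided by the prefetch depth. The function name and the use of the speed-grade number as the data rate are illustrative assumptions:

```python
def core_rate_mhz(data_rate_mts, prefetch):
    """Approximate core access rate needed to sustain the pin data rate."""
    return data_rate_mts / prefetch


# (name, data rate taken from the speed-grade number, prefetch depth)
PARTS = [
    ("DDR200", 200, 2),
    ("DDR266", 266, 2),
    ("DDR333", 333, 2),
    ("DDR400", 400, 4),
    ("DDR533", 533, 4),
    ("DDR667", 667, 4),
]

for name, rate, prefetch in PARTS:
    core = core_rate_mhz(rate, prefetch)
    print(f"{name}: prefetch {prefetch}, core rate about {core:.1f} MHz")
```

Under this assumption DDR667 with prefetch 4 asks about the same of the core as DDR333 with prefetch 2, which is the sense in which the data rate rises without raising core timing costs.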

22 Prefetch 2 Versus 4
[Timing diagram: CK, READ command, and data for prefetch 2, with core access time marked; annotations: "Costs $$$" and "Essentially free"]

23 So Why Not Prefetch 8 Now?
64 bit bus widths are most practical tradeoff
  Inexpensive 60 Ω motherboards
  8 bytes per data cycle
Dropping to 32 bits or 16 bits raises system costs
  Expensive 28 Ω motherboards
Prefetch 8: lots of wasted data & bandwidth
  Prefetch 8 means 64 bytes per access minimum on 64 bit bus
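The 64-byte figure is the prefetch depth multiplied by the bus width in bytes. A short sketch of that arithmetic, with an illustrative function name:

```python
def min_access_bytes(prefetch, bus_width_bits):
    """Smallest transfer per access: prefetch depth x bus width in bytes."""
    return prefetch * bus_width_bits // 8


for prefetch in (2, 4, 8):
    size = min_access_bytes(prefetch, 64)
    print(f"prefetch {prefetch} on a 64 bit bus: {size} bytes minimum per access")
```

On a 64 bit bus this gives 16, 32, and 64 bytes for prefetch 2, 4, and 8, so any access that needs less than 64 bytes leaves part of a prefetch 8 burst unused.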

24 Filling Command Slots

25 Additive Latency
Command slot availability is disrupted by CAS latency, even on seamless read bursts
Sometimes with odd CAS latencies, sometimes with even
These collisions can be avoided by shifting READs and WRITEs in the command stream
Additive latency shifts R & W commands earlier; it applies to both

26 Read Latency
In the past, data access from a READ command was simply CAS Latency
Read Latency now combines Additive Latency with CAS Latency, giving the ability to order commands better

27 Read & Additive Latencies
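[Timing diagram 1: CK, ACT, RD, data; data follows RD by the CAS Latency]
[Timing diagram 2: CK, ACT, RD, data; RD is issued earlier, and data follows it by Additive Latency plus CAS Latency]
RL = AL + CL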

28 Write Latency
Complex controllers had collisions between command slots and data bus availability
These are eliminated in DDR II by setting Write Latency = Read Latency - 1
Combined with Additive Latency, lots of flexibility in ordering commands

29 Write & Additive Latencies
[Timing diagram 1: CK, ACT, WR, data; Additive Latency = 0, WL = RL - 1]
[Timing diagram 2: CK, ACT, WR, data; WR is issued earlier, and data follows it by Additive Latency plus CL - 1]
WL = AL + CL - 1 = RL - 1
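The latency relations can be turned into a small cycle calculator. This is a sketch under simple assumptions: cycles are counted in whole clocks from an arbitrary cycle 0, and the function name and command strings are invented for illustration. It shows that moving a command earlier by the additive latency leaves the data timing unchanged.

```python
def data_start_cycle(command, issue_cycle, cas_latency, additive_latency=0):
    """Clock cycle on which data starts for a command issued at issue_cycle."""
    read_latency = additive_latency + cas_latency  # RL = AL + CL
    if command == "RD":
        return issue_cycle + read_latency
    if command == "WR":
        return issue_cycle + read_latency - 1  # WL = RL - 1
    raise ValueError(f"unknown command: {command!r}")


# Same data timing, with the READ issued two cycles earlier when AL = 2.
print(data_start_cycle("RD", issue_cycle=3, cas_latency=3, additive_latency=0))
print(data_start_cycle("RD", issue_cycle=1, cas_latency=3, additive_latency=2))
# A WRITE issued on the same cycle starts its data one cycle before a READ would.
print(data_start_cycle("WR", issue_cycle=3, cas_latency=3, additive_latency=0))
```

Because the write data offset always tracks the read data offset minus one, a controller in this sketch needs only one latency setting to place both kinds of data on the bus.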

30 Summary
DRAM evolution continues
  266 → 333 → 400 → 533 → 667 MHz data rates
  2100 → 5400 MB/s throughput per module
Each step is a simple incremental improvement, without adding cost!
DDR II family adds a few new features
  Lower voltage with I/O calibration
  Differential data strobes
  Command utilization improvements

31 Thank You
