Accelerating DRAM Performance Bill Gervasi Chairman, JEDEC Memory Parametrics.

1 Accelerating DRAM Performance Bill Gervasi Chairman, JEDEC Memory Parametrics

2 RAM Evolution
[Figure: mainstream memories advancing in simple, incremental steps, from “DDR I” (DDR266 at 2100MB/s, DDR333 at 2700MB/s) to three “DDR II” speed grades]
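
The MB/s figures attached to each speed grade follow from the per-pin data rate and the width of the module data bus. The sketch below shows that arithmetic, assuming a 64-bit (8-byte) bus as discussed on slide 23 and nominal data rates that are multiples of 33.33 MHz. The function and variable names are illustrative and do not come from the presentation.

```python
# Peak module throughput = per-pin data rate (MHz) x bus width (bytes).
# Assumptions: 64-bit module bus; nominal data rates are multiples of 100/3 MHz.
BUS_BYTES = 8

def peak_mb_per_s(data_rate_mhz: float, bus_bytes: int = BUS_BYTES) -> float:
    """Return the peak transfer rate in MB/s for one module."""
    return data_rate_mhz * bus_bytes

GRADES = {
    "DDR266": 800 / 3,
    "DDR333": 1000 / 3,
    "DDR400": 400.0,
    "DDR533": 1600 / 3,
    "DDR667": 2000 / 3,
}

for name, rate in GRADES.items():
    print(f"{name}: {peak_mb_per_s(rate):.0f} MB/s")
```

The computed values come out near 2133 and 2667 MB/s for DDR266 and DDR333, which the slides quote as the rounded figures 2100 and 2700.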

3 Key to System Evolution
Never over-design!
Implement just enough new features to achieve incremental improvements
Drive the hell out of the volumes to get the enhancements for free

4 New Specifications
DDR I
–DDR333 chips
–PC2700 MicroDIMM
–PC2700 SO-DIMM
–PC2700 Registered DIMM
–PC2700 Unbuffered DIMM
DDR II
–DDR400 chips
–DDR533 chips
–DDR667 chips
–DIMM outline

5 DDR333
333 MHz data rate per pin
Approved for both TSOP and FBGA
–First introduction of FBGA into SDRAM family
–One package-dependent timing consideration!
Most improvements from tighter DLL design
–Purpose of the DLL is accurate delivery of data and strobes during read cycles

6 DLL Effects
Clock jitter, pulse width distortion, DQS pull in or push out from pattern effects, p-channel to n-channel variation
[Timing diagram: CK and DQS, showing tDQSCK]
tDQSCK: DDR266 = 750 ps, DDR333 = 600 ps

7 Data Capture Parameters
Data pin skew, simultaneous switching output effects, output driver variation
Note that data valid window width is package independent!
[Timing diagram, simplified view: DQS relative to data, showing tQHS and tDQSQ]
tQHS: DDR266 = 750 ps, DDR333 = 550 ps for TSOP, 600 ps for FBGA
tDQSQ: DDR266 = 750 ps, DDR333 = 450 ps for TSOP, 400 ps for FBGA
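
The note that the data valid window is package independent can be checked with a little arithmetic. The sketch below assumes that the first group of DDR333 values on this slide belongs to tQHS and the second group to tDQSQ, and that the window is the ideal half clock period minus both terms. The half period ignores duty-cycle error, and the names are illustrative.

```python
# DDR333 timing losses in picoseconds, as read from the slide.
# Assumption: the first value group is tQHS and the second is tDQSQ.
T_QHS_PS = {"TSOP": 550, "FBGA": 600}
T_DQSQ_PS = {"TSOP": 450, "FBGA": 400}

# DDR333 runs a 166.67 MHz clock: 6000 ps period, 3000 ps ideal half period.
HALF_PERIOD_PS = 3000

def data_valid_window_ps(half_period_ps: int, package: str) -> int:
    """Window = half period - hold skew (tQHS) - DQS-to-DQ skew (tDQSQ)."""
    return half_period_ps - T_QHS_PS[package] - T_DQSQ_PS[package]

for package in ("TSOP", "FBGA"):
    print(package, data_valid_window_ps(HALF_PERIOD_PS, package), "ps")
```

Under these assumptions the two packages trade tQHS against tDQSQ but lose the same 1000 ps in total, so the window width is identical.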

8 DDR II
[Figure labels: DDR400, DDR533, DDR667, SS800; Main System Memory (modules, etc.); Embedded Applications (point to point)]

9 The DDR II Family
DDR II similarities to DDR I:
–Compatible RAS/CAS command set & protocol
DDR II differences from DDR I:
–DDR I = 2.5V, DDR II = 1.8V with calibration
–Prefetch 4
–Differential data strobes
–Improved command bus utilization: write latency as a function of read latency, additive latency to help fill holes
–New FBGA package & memory modules, with tighter package parasitics

10 DDR II Improves DDR I
Enables higher burst frequency
Makes better use of command slots
Lower voltage swing simplifies system concerns

11 DDR II Data Capture
Improvements from a combination of packaging, process, and voltage levels
[Timing diagram, simplified view: DQS relative to data, showing tQHS and tDQSQ]
tQHS: DDR400 = 450 ps, DDR533 = 400 ps
tDQSQ: DDR400 = 350 ps, DDR533 = 300 ps

12 Preparing for DDR II
Transitional controllers will need 2.5V or 1.8V selectable I/Os
Allocate pins for differential data strobe, tie one pin to VREF for DDR I mode
Use fixed burst length = 4
Programmable write latency
–WL = 1 for DDR I compliance
–WL = Read Latency – 1 for DDR II compliance
Optional: additive latency
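
The checklist above amounts to a small set of mode-dependent settings in a transitional controller. The sketch below collects them in one place. The class, field, and mode names are hypothetical and describe no real controller; only the values (I/O voltage, strobe handling, burst length, write latency rule) come from the slide.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class InterfaceSettings:
    """Hypothetical per-mode settings for a DDR I / DDR II controller."""
    io_voltage: float       # selectable I/O supply, in volts
    differential_dqs: bool  # False: one strobe pin is tied to VREF
    burst_length: int       # fixed at 4 in both modes
    write_latency: int      # in clocks

def settings_for(mode: str, read_latency: int) -> InterfaceSettings:
    """Return the settings the slide lists for the given mode."""
    if mode == "DDR1":
        return InterfaceSettings(2.5, False, 4, 1)
    if mode == "DDR2":
        return InterfaceSettings(1.8, True, 4, read_latency - 1)
    raise ValueError(f"unknown mode: {mode!r}")

print(settings_for("DDR1", read_latency=3))
print(settings_for("DDR2", read_latency=4))
```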

13 1.8V I/O Voltage

14 1.8V Signaling
[Figure: input level ladders for 2.5V SSTL_2 and for 1.8V signaling]
Level    SSTL_2 (2.5V)    1.8V signaling
VDDQ     2.5V             1.8V
VIHac    (not shown)      1.15V
VIHdc    1.43V            1.03V
VREF     1.25V            0.90V
VILdc    1.07V            0.77V
VILac    0.90V            0.65V
VSS      0V               0V
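
To make the 1.8V input levels concrete, the sketch below computes the ac and dc margins around VREF and classifies a static input voltage. It assumes the slide's 1.8V values pair up as VREF = 0.90V, VIHac = 1.15V, VIHdc = 1.03V, VILdc = 0.77V, and VILac = 0.65V. That pairing is inferred from the flattened diagram, and the function name is illustrative.

```python
# 1.8V signaling input levels in volts.
# Assumption: label/value pairing inferred from the slide's diagram.
VREF = 0.90
VIH_AC, VIH_DC = 1.15, 1.03
VIL_DC, VIL_AC = 0.77, 0.65

def dc_state(voltage: float) -> str:
    """Classify a settled input voltage against the dc thresholds."""
    if voltage >= VIH_DC:
        return "high"
    if voltage <= VIL_DC:
        return "low"
    return "undefined"

print("ac margin:", round(VIH_AC - VREF, 2), "V above,",
      round(VREF - VIL_AC, 2), "V below VREF")
print("dc margin:", round(VIH_DC - VREF, 2), "V above,",
      round(VREF - VIL_DC, 2), "V below VREF")
for v in (1.20, 0.90, 0.50):
    print(v, "->", dc_state(v))
```

Under this pairing the thresholds sit symmetrically around VREF, 0.25V away for ac levels and 0.13V away for dc levels.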

15 I/O Calibration
Balance n- and p-channel driver strength
Protocol defined for initializing memory interface
Command tells the DRAM to hold signals in a state, controller overdrives and adjusts drive strength to match
[Figure labels: VTT, VREF, Data, Controller, DRAM]

16 Differential Data Strobe

17 Just as DDR added differential clock to SDR, DDR II adds differential data strobe to DDR I
Transition at the crosspoint of DQS and its complement (DQS#)
Route these signals as a differential pair
–Common mode noise rejection

18 Differential Data Strobe
[Figure: DQS high time and DQS low time measured against VREF, for a normal balanced signal and for a signal with mismatched rise and fall. The mismatched case is marked "Error!"]

19 Differential Data Strobe
[Figure: the same comparison of a normal balanced signal and a mismatched rise and fall signal, with a second DQS trace added. Result: significantly reduced symmetry error]
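
A simple edge model shows why measuring at the crosspoint reduces the symmetry error. The sketch below assumes idealized linear edges, a VREF at mid-rail, and a strobe and complement that share the same rise and fall mismatch and start switching together. These are modeling assumptions made here, not details from the slides, and the function names are illustrative.

```python
def single_ended_high_time(half_period: float, t_rise: float, t_fall: float) -> float:
    """High time of DQS measured where it crosses a mid-rail VREF."""
    rise_cross = t_rise / 2                # rising edge starts at t = 0
    fall_cross = half_period + t_fall / 2  # falling edge starts at t = half_period
    return fall_cross - rise_cross

def differential_high_time(half_period: float, t_rise: float, t_fall: float) -> float:
    """High time measured at the crosspoint of DQS and its complement."""
    # A rising edge and a falling edge that start together cross at the
    # same offset, whichever of the two signals is the rising one.
    cross_offset = t_rise * t_fall / (t_rise + t_fall)
    return (half_period + cross_offset) - cross_offset

HALF_PERIOD_PS = 1875.0  # one bit time at a nominal 533 MHz data rate
T_RISE_PS, T_FALL_PS = 300.0, 500.0

single = single_ended_high_time(HALF_PERIOD_PS, T_RISE_PS, T_FALL_PS)
diff = differential_high_time(HALF_PERIOD_PS, T_RISE_PS, T_FALL_PS)
print("single-ended error:", single - HALF_PERIOD_PS, "ps")
print("differential error:", diff - HALF_PERIOD_PS, "ps")
```

In this idealized model the single-ended high time is stretched by half the difference between fall and rise times, while the differential measurement shifts both edges equally and the error cancels. Real signals would cancel only partially, which matches the slide's wording of a reduced rather than eliminated error.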

20 Prefetch 4

21 Moving to the Next Level
Today’s SDRAM architectures assume an inexpensive DRAM core timing
DDR I (DDR200, DDR266, and DDR333) prefetches 2 data bits: increase performance without increasing core timing costs
DDR II (DDR400, DDR533, DDR667) prefetches 4 bits internally, but keeps DDR double pumped I/O
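
One way to see why a deeper prefetch leaves core timing costs unchanged is to divide each data rate by its prefetch depth. The sketch below does this under the assumption that the core delivers one prefetch-wide word per core cycle. The function name is illustrative.

```python
def core_rate_mhz(data_rate_mhz: float, prefetch: int) -> float:
    """Core access rate implied by a per-pin data rate and prefetch depth."""
    return data_rate_mhz / prefetch

FAMILIES = {
    "DDR I (prefetch 2)": (2, [200, 266, 333]),
    "DDR II (prefetch 4)": (4, [400, 533, 667]),
}

for family, (prefetch, rates) in FAMILIES.items():
    for rate in rates:
        print(f"{family} DDR{rate}: core at {core_rate_mhz(rate, prefetch)} MHz")
```

Under that assumption both families keep the core in roughly the same 100 to 167 MHz range while the pin data rate doubles.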

22 Prefetch 2 Versus 4
[Timing diagram: CK and a READ command, comparing prefetch 2 with prefetch 4. Labels: core access time (costs $$$); essentially free data]

23 So Why Not Prefetch 8 Now?
64 bit bus widths are most practical tradeoff
–Inexpensive 60 Ω motherboards
–8 bytes per data cycle
Dropping to 32 bits or 16 bits raises system costs
–Expensive 28 Ω motherboards
Prefetch 8: lots of wasted data & bandwidth
–Prefetch 8 means 64 bytes per access minimum on 64 bit bus
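
The 64-byte figure follows from multiplying the prefetch depth by the bus width. A minimal sketch of that calculation, with an illustrative function name:

```python
def min_access_bytes(prefetch: int, bus_bits: int = 64) -> int:
    """Smallest transfer per access: one prefetch-deep burst across the bus."""
    return prefetch * bus_bits // 8

for prefetch in (2, 4, 8):
    print(f"prefetch {prefetch}: {min_access_bytes(prefetch)} bytes minimum")
```

On a 64 bit bus this gives 16, 32, and 64 bytes for prefetch 2, 4, and 8. Any access that needs less than the minimum leaves the rest of the burst as wasted bandwidth.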

24 Filling Command Slots

25 Additive Latency
Command slot availability is disrupted by CAS latency even on seamless read bursts
–Sometimes with odd CAS latencies, sometimes with even
–These collisions can be avoided by shifting READs and WRITEs in the command stream
Additive latency shifts R & W commands earlier – applies to both

26 Read Latency
In the past, data access from a READ command was simply CAS Latency
Combined with Additive Latency, ability to order commands better

27 Read & Additive Latencies
[Timing diagrams: CK with ACT and RD commands and the resulting data, showing CAS Latency alone and CAS Latency combined with Additive Latency]
RL = AL + CL

28 Write Latency
Complex controllers had collisions between command slots and data bus availability
These are eliminated in DDR II by setting Write Latency = Read Latency – 1
Combined with Additive Latency, lots of flexibility in ordering commands

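29 Write & Additive Latencies
[Timing diagrams: CK with ACT and WR commands and the resulting data, with Additive Latency = 0 and with Additive Latency applied]
With Additive Latency = 0: WL = RL – 1 = CL – 1
With Additive Latency: WL = AL + CL – 1 = RL – 1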
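
The latency relations on the read and write slides, RL = AL + CL and WL = RL – 1, can be written as two small functions. The sketch below tabulates them for an example CAS latency of 3 clocks, which is chosen here only for illustration. The function names are also illustrative.

```python
def read_latency(additive_latency: int, cas_latency: int) -> int:
    """RL = AL + CL, in clocks."""
    return additive_latency + cas_latency

def write_latency(additive_latency: int, cas_latency: int) -> int:
    """WL = RL - 1 = AL + CL - 1, in clocks."""
    return read_latency(additive_latency, cas_latency) - 1

CAS_LATENCY = 3  # example value, not taken from the slides
for al in (0, 1, 2):
    rl = read_latency(al, CAS_LATENCY)
    wl = write_latency(al, CAS_LATENCY)
    print(f"AL={al} CL={CAS_LATENCY}: RL={rl} WL={wl}")
```

With AL = 0 the write latency reduces to CL – 1, and each added clock of additive latency moves both read and write data later by the same amount.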

30 Summary
DRAM evolution continues
–266 → 333 → 400 → 533 → 667 MHz data rates
–2100 → 5400 MB/s throughput per module
Each step is a simple incremental improvement – without adding cost!
DDR II family adds a few new features
–Lower voltage with I/O calibration
–Differential data strobes
–Command utilization improvements

31 Thank You

