1 COMP 206: Computer Architecture and Implementation Montek Singh Mon., Nov. 18, 2002 Topic: Main Memory (DRAM) Organization – contd.



2 Achieving Higher Memory Bandwidth
[Figure: Fig. 5.27, HP3]

3 Improving Memory Chip Performance
Several techniques to get more bits/sec from a DRAM chip:
- Allow repeated accesses to the row buffer without another row access time
  - burst mode, fast page mode, EDO mode, ...
- Simplify the DRAM-CPU interface
  - add a clock to reduce the overhead of synchronizing with the controller
  - = synchronous DRAM (SDRAM)
- Transfer data on both rising and falling clock edges
  - double data rate (DDR)
Each of the above adds a small amount of logic to exploit the high internal DRAM bandwidth.

4 Conventional DRAM Architectures
- 16 Mb (16M × 1) chip: one 4096 × 4096 array of data bits
- 16 Mb (1M × 16) chip: 16 arrays of 1024 × 1024 data bits
- Interface is either the original asynchronous interface or one of the many recent minor modifications of it
  - RAS: Row Address Strobe
  - CAS: Column Address Strobe
  - DRAM asynchronously controlled by processor
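
The multiplexed row/column addressing of the 16M × 1 chip can be illustrated with a minimal sketch. It assumes a 24-bit cell address whose upper 12 bits select one of the 4096 rows (sent with RAS) and whose lower 12 bits select one of the 4096 columns (sent with CAS). The slide does not say which half carries the row, so that ordering and the name `split_address` are assumptions made for illustration.

```python
ROW_BITS = 12  # 4096 rows, from the 4096 x 4096 array on the slide
COL_BITS = 12  # 4096 columns


def split_address(addr):
    """Split a 24-bit cell address into (row, column).

    Assumption: the upper half is the row address and the lower half
    is the column address.
    """
    if not 0 <= addr < (1 << (ROW_BITS + COL_BITS)):
        raise ValueError("address out of range for a 16 Mb array")
    row = addr >> COL_BITS               # presented first, latched by RAS
    col = addr & ((1 << COL_BITS) - 1)   # presented second, latched by CAS
    return row, col
```

Under these assumptions, `split_address((5 << 12) | 7)` yields row 5 and column 7.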

5 Basic Mode of Operation
- Slowest mode
- Uses only a single row and column address
- Row access is slow (60-70 ns) compared to column access (5-10 ns)
- Leads to three techniques for DRAM speed improvement
  - Getting more bits out of DRAM on one access given timing constraints
  - Pipelining the various operations to minimize total time
  - Segmenting the data in such a way that some operations are eliminated for a given set of accesses
[Timing diagram: Address (Row, Column), RAS, CAS, Data]

6 Nibble (or Burst) Mode
- Several consecutive columns are accessed
- Only first column address is explicitly specified
- Rest are internally generated using a counter
Timing diagram (signals over time):
  RAS:     ------------------------------------
  CAS:     CAS CAS CAS CAS
  Address: RA CA
  Data:    D1 D2 D3 D4

7 Fast Page Mode
- Accesses arbitrary columns within same row
- Static column mode is similar
Timing diagram (signals over time):
  RAS:     ------------------------------------
  CAS:     CAS CAS CAS CAS
  Address: RA CA1 CA2 CA3 CA4
  Data:    D1 D2 D3 D4
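
The benefit of staying within one open row can be estimated with a rough model. The sketch below assumes a 60 ns row access and a 10 ns column access, two representative values picked from the ranges quoted on the Basic Mode slide. It ignores precharge, refresh, and any controller overhead, so the numbers only indicate the trend. The function names are hypothetical.

```python
T_ROW_NS = 60  # assumed row access time (slide range: 60-70 ns)
T_COL_NS = 10  # assumed column access time (slide range: 5-10 ns)


def basic_mode_ns(n_accesses):
    """Basic mode: every access pays for a row access and a column access."""
    return n_accesses * (T_ROW_NS + T_COL_NS)


def page_mode_ns(n_accesses):
    """Nibble or fast page mode: one row access, then one column access
    per word read from the same row."""
    return T_ROW_NS + n_accesses * T_COL_NS
```

With these assumed timings, four reads from the same row take 280 ns in basic mode and 100 ns in nibble or fast page mode.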

8 EDO Mode
- Arbitrary column addresses
- Pipelined
- EDO = Extended Data Out
- Has other modes like "burst EDO", which allows reading of a fixed number of bytes starting with each specified column address
Timing diagram (signals over time):
  RAS:     ------------------------------------
  CAS:     CAS CAS CAS CAS CAS CAS CAS
  Address: RA CA1 CA2 CA3 CA4 CA5 CA6 CA7
  Data:    D1 D2 D3 D4 D5 D6

9 Evolutionary DRAM Architectures
- SDRAM (Synchronous DRAM)
  - Interface retains a good part of conventional DRAM interface
    - addresses multiplexed in two halves
    - separate data pins
    - two control signals
  - All address, data, and control signals are synchronized with an external clock (100-150 MHz)
    - Allows decoupling of processor and memory
    - Allows pipelining a series of reads and writes
  - Peak speed per memory module: 800-1200 MB/sec
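
The quoted peak module speed is consistent with one transfer per clock over a 64-bit (8-byte) module data path. The 8-byte width is an assumption here, since the slide gives only the clock range and the resulting bandwidth. A minimal sketch of the arithmetic, with a hypothetical function name:

```python
def peak_mb_per_s(clock_mhz, bus_bytes=8, transfers_per_clock=1):
    """Peak bandwidth in MB/s = clock (MHz) * bytes per transfer * transfers per clock.

    bus_bytes=8 assumes a 64-bit module data path (not stated on the slide).
    transfers_per_clock=2 models double data rate (both clock edges).
    """
    return clock_mhz * bus_bytes * transfers_per_clock
```

Under that assumption, 100 MHz gives 800 MB/s and 150 MHz gives 1200 MB/s, matching the slide's range. Passing `transfers_per_clock=2` shows the doubling that DDR obtains at the same clock.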

10 Revolutionary DRAM Architectures
- Examples
  - RDRAM (Rambus DRAM)
  - MDRAM (MoSys DRAM)
- Salient features
  - Many smaller memory banks interleaved on one chip
  - "Protocol based" architecture
    - Narrow, fully multiplexed communication protocol
  - Example: RAMBUS (RDRAM, DRDRAM)
    - Each chip is more like a memory system than a component
    - Interleaved memory and a high-speed interface
    - Packet-switched bus (split transaction bus)
    - Chip can return a variable number of bytes from a single request, performs own reset, transfers on both clock edges
    - Narrow bus (1-2 data bytes)
      - Up to 3 transactions can be done concurrently
    - Internally, 72-bit wide bus with 5 ns cycle time
    - Up to 1600 Mbps peak bandwidth
    - Expensive!

11 Other types of Memory
- ROM = Read-only Memory
- Flash = ROM which can be written once in a while
  - Used in embedded systems, small microcontrollers
  - Offer IP protection, security

12 Memory Interleaving
- Goal: Try to take advantage of bandwidth of multiple DRAMs in memory system
- Memory address A is converted into (b,w) pair, where
  - b = bank index
  - w = word index within bank
- Logically a wide memory
  - Accesses to B banks staged over time to share internal resources such as memory bus
- Interleaving can be on
  - Low-order bits of address (cyclic)
    - b = A mod B, w = A div B
  - High-order bits of address (block)
  - Combination of the two (block-cyclic)
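
The two basic address mappings can be sketched as follows. The cyclic formula is the one given on the slide. The block formula is not written out there. It is the natural counterpart, using an assumed parameter W for the number of words per bank. Function names are hypothetical.

```python
def cyclic_map(addr, num_banks):
    """Low-order interleaving (from the slide): b = A mod B, w = A div B."""
    return addr % num_banks, addr // num_banks


def block_map(addr, words_per_bank):
    """High-order interleaving (assumed counterpart): b = A div W, w = A mod W."""
    return addr // words_per_bank, addr % words_per_bank
```

With 4 banks of 8 words each, addresses 0 through 3 land in banks 0, 1, 2, 3 under the cyclic mapping, so a sequential access stream keeps every bank busy. Under the block mapping the same four addresses all fall in bank 0.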

13 Low-order Bit Interleaving

14 Mixed Interleaving
- Memory address register is 6 bits wide
  - Most significant 2 bits give bank address
  - Next 3 bits give word address within bank
  - LSB gives (parity of) module within bank
- 6 = 000110₂ = (00, 011, 0) = (0, 3, 0)
- 41 = 101001₂ = (10, 100, 1) = (2, 4, 1)
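
The 6-bit field split described on this slide can be checked with a short sketch. The field widths come from the slide. The function name `decode_mixed` is hypothetical.

```python
def decode_mixed(addr):
    """Decode a 6-bit address into (bank, word, module).

    Bits 5-4: bank address, bits 3-1: word address within bank,
    bit 0: module within bank.
    """
    if not 0 <= addr < 64:
        raise ValueError("address must fit in 6 bits")
    bank = (addr >> 4) & 0b11
    word = (addr >> 1) & 0b111
    module = addr & 0b1
    return bank, word, module
```

This reproduces both worked examples: `decode_mixed(6)` gives (0, 3, 0) and `decode_mixed(41)` gives (2, 4, 1).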

