Lecture 11: External SRAM

UCSD ECE 111, Prof. Farinaz Koushanfar, Fall 2016. Some slides courtesy of MIT 6.111 (instructor: Chris Terman).

Memories: A practical primer

Memories in Verilog

Multi-port memories, a.k.a. register files

FIFO in action

FPGA memory implementation

LUT-based RAMs

LUT-based RAM module

Tools often build it for you

BRAM
Block RAM (BRAM) is a type of random access memory (a configurable memory module) that is embedded throughout an FPGA for data storage. Use BRAM to:
Transfer data between multiple clock domains.
Transfer data between an FPGA target and a host processor.
Transfer data between FPGA targets.
Store large data sets on an FPGA target more efficiently than RAM built from look-up tables.
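As a rough illustration of the kind of description that synthesis tools commonly map to block RAM, here is a minimal sketch of a single-port RAM with a registered (synchronous) read. The module name sync_ram and the parameters ADDR_W and DATA_W are assumptions chosen for the example, and whether a given tool actually infers BRAM from it depends on the tool and the device.

```verilog
// Minimal sketch of a single-port synchronous RAM.
// sync_ram, ADDR_W and DATA_W are illustrative names, not from the slides.
module sync_ram #(
    parameter ADDR_W = 10,   // 2^10 = 1024 words (assumed size)
    parameter DATA_W = 16
) (
    input  wire              clk,
    input  wire              we,     // write enable, active high
    input  wire [ADDR_W-1:0] addr,
    input  wire [DATA_W-1:0] din,
    output reg  [DATA_W-1:0] dout
);
    // Storage array: one DATA_W-bit word per address
    reg [DATA_W-1:0] mem [0:(1<<ADDR_W)-1];

    always @(posedge clk) begin
        if (we)
            mem[addr] <= din;
        // Registered read: dout changes only on a clock edge.
        // During a write, dout receives the old contents of the word.
        dout <= mem[addr];
    end
endmodule
```

The registered read is the important design choice here: a read that bypasses the clock generally cannot be mapped to block RAM and tends to end up in LUT-based RAM or flip-flops instead.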

Block Memories (BRAM)

Memory classification and metrics

Overview
Random Access Memory (RAM) is used for large-capacity storage.
A register file is faster and more flexible, but it is feasible only for small storage because of its large circuit size. A read or write operation to an asynchronous SRAM (static RAM) requires that the data, address, and control signals be asserted in a specific order and remain stable for a certain amount of time. The SRAM is therefore accessed through a memory controller, which ensures that these requirements are met.

Static RAM: Latch-based memory

Memory array architecture

Static SRAM cell (6T cell)

Using external memory devices

Basic Memory Controller
The memory controller provides a ‘synchronous wrap’ around the SRAM.

When the main system wants to access the memory, it places the address and the data (for a write operation) on the bus and activates mem and r/w. At the rising edge of the clock, all signals are sampled by the memory controller and the desired operation is performed accordingly.

Ports on the side of the main system:
mem: asserted (1) to initiate a memory operation
r/w: specifies whether the operation is a read (1) or a write (0)
addr: 18-bit address
data_f2m: 16-bit data to be written from the FPGA to the SRAM (called data_f2s on later slides)
data_m2f_r: 16-bit registered data retrieved from the SRAM to the FPGA (data_s2f_r on later slides)
data_m2f_ur: 16-bit unregistered data retrieved from the SRAM to the FPGA (data_s2f_ur on later slides)
ready: indicates whether the controller is ready for a new command

Block Diagram of a Memory Controller
The FSM follows the timing diagrams in Figures 11.2 and 11.3 to generate a proper control sequence.
Two data registers: one each for read and write operations.
A tri-state buffer controls the bidirectional data bus.
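The data path described on this slide might be sketched as follows. This is only an illustration: the signal names data_f2s, data_s2f_r, data_s2f_ur, tri_n, ad and dio follow the Safe Design and SRAM block diagram slides, while the module name and the two load strobes ld_addr_data and ld_rdata are hypothetical signals that the FSM is assumed to supply.

```verilog
// Sketch of the controller data path: address register, two data
// registers, and the tri-state buffer on the bidirectional bus.
// ld_addr_data and ld_rdata are hypothetical FSM strobes.
module sram_datapath (
    input  wire        clk,
    input  wire        ld_addr_data, // capture addr and write data
    input  wire        ld_rdata,     // capture read data from dio
    input  wire        tri_n,        // 0 = FPGA drives dio (active low)
    input  wire [17:0] addr,
    input  wire [15:0] data_f2s,
    output wire [15:0] data_s2f_r,   // registered read data
    output wire [15:0] data_s2f_ur,  // unregistered read data
    output wire [17:0] ad,           // SRAM address bus
    inout  wire [15:0] dio           // SRAM bidirectional data bus
);
    reg [17:0] addr_reg;
    reg [15:0] data_f2s_reg;
    reg [15:0] data_s2f_reg;

    always @(posedge clk) begin
        if (ld_addr_data) begin
            addr_reg     <= addr;
            data_f2s_reg <= data_f2s;
        end
        if (ld_rdata)
            data_s2f_reg <= dio;
    end

    assign ad = addr_reg;

    // Tri-state buffer: drive the bus only while tri_n is asserted (0),
    // otherwise release it so the SRAM can drive it during a read.
    assign dio = (tri_n == 1'b0) ? data_f2s_reg : 16'bz;

    assign data_s2f_r  = data_s2f_reg;
    assign data_s2f_ur = dio;
endmodule
```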

MCM6264C 8K x 8 SRAM

Functional Table of SRAM
Columns: Operation, ce_n, we_n, oe_n, lb_n, ub_n, dio (lower), dio (upper)
disabled: ce_n = 1, other controls - (don't care), dio = Z
read: dio = data_out
write: dio = data_in

Functional Table of SRAM
Default: the ce_n, lb_n, and ub_n signals are always activated (the SRAM is always enabled and uses both bytes of the data bus).
Simplified functional table with ce_n, lb_n, and ub_n set to their defaults:
Operation            we_n  oe_n  dio (16 bits)
output disabled      1     1     Z
read 16-bit word     1     0     data_out
write 16-bit word    0     -     data_in
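For simulation, the simplified functional behavior can be approximated with a small behavioral model. The sketch below is an idealized, zero-delay stand-in, not a model of any particular device: it ignores all timing parameters, assumes ce_n, lb_n and ub_n are tied to their active defaults, and uses the hypothetical module name sram_model_simple.

```verilog
// Idealized, zero-delay behavioral SRAM model for simulation only.
// Assumes ce_n, lb_n and ub_n are permanently active (0).
module sram_model_simple (
    input  wire [17:0] ad,    // 18-bit address
    input  wire        we_n,  // write enable, active low
    input  wire        oe_n,  // output enable, active low
    inout  wire [15:0] dio    // bidirectional 16-bit data bus
);
    reg [15:0] mem [0:(1<<18)-1];   // 256K words of 16 bits

    // Read: we_n = 1 and oe_n = 0 -> drive data_out onto dio.
    // Any other combination leaves the bus in high impedance (Z).
    assign dio = (we_n && !oe_n) ? mem[ad] : 16'bz;

    // Write: the data on dio is latched at the 0-to-1 edge of we_n.
    always @(posedge we_n)
        mem[ad] <= dio;
endmodule
```

Because the model has no delays, it can confirm that a controller produces the right sequence of operations but says nothing about timing margins. Note also that an initial X-to-1 transition on we_n at the start of a simulation counts as a rising edge for this block.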

Reading an Asynchronous SRAM

Address controlled reads

Writing to asynchronous SRAM

Sample memory interface logic

Tri-state data buses in Verilog

Synchronous SRAM memories

ZBT eliminates the wait state

Pipelining allows faster clock

Register File vs SRAM
A register file usually has one write port and multiple read ports, while an SRAM usually has a common read/write port.
The read and write ports of a register file can be accessed at the same time.
Writing to a register takes only one clock cycle.
Data at a register file's read ports is always available; the read operation involves no clock or additional control signals.
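The register file properties listed above can be illustrated with a short sketch: one clocked write port and two combinational read ports. The module name, port names, and sizes are assumptions made for the example.

```verilog
// Sketch of a register file with one write port and two read ports.
// Names and sizes are illustrative assumptions.
module reg_file #(
    parameter ADDR_W = 3,    // 2^3 = 8 registers (assumed)
    parameter DATA_W = 16
) (
    input  wire              clk,
    input  wire              wr_en,
    input  wire [ADDR_W-1:0] w_addr,
    input  wire [DATA_W-1:0] w_data,
    input  wire [ADDR_W-1:0] r_addr0,
    input  wire [ADDR_W-1:0] r_addr1,
    output wire [DATA_W-1:0] r_data0,
    output wire [DATA_W-1:0] r_data1
);
    reg [DATA_W-1:0] regs [0:(1<<ADDR_W)-1];

    // Write port: a write completes in a single clock cycle.
    always @(posedge clk)
        if (wr_en)
            regs[w_addr] <= w_data;

    // Read ports: purely combinational, so the data is always available
    // and both ports can be read while a write is in progress.
    assign r_data0 = regs[r_addr0];
    assign r_data1 = regs[r_addr1];
endmodule
```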

Register File vs SRAM
A register file is faster and more flexible. However, because of the circuit size of an FF, a register file is feasible only for small storage.

EEPROM: Electrically Erasable Programmable ROM

Interacting with Flash and EEPROM

Dynamic RAM (DRAM) Cell

Asynchronous DRAM operation

Addressing with memory maps

Memory devices (helpful knowledge)

You should understand why

Memory Controller FSM for SRAM
We will consider several design choices. First, we describe a safe design that provides large timing margins and does not impose any stringent timing constraints. Then we consider some aggressive designs, the challenges they bring, and some potential solutions.

Safe Design
Defaults: oe_n = 1; we_n = 1; tri_n = 1; ready = 0
The FSM is initially in the idle state and starts a memory operation when the mem signal is activated. The r/w signal determines whether it is a read or a write operation:
read: r1 state
write: w1 state

Safe Design: Read Operation
The memory address, addr, is sampled and stored in the raddr register at the idle-to-r1 transition.
The data is stored in the rs2f register at the transition from r2 to idle, and the oe_n signal is deactivated afterwards.
data_s2f_r is a registered output; it is available after the FSM exits the r2 state and holds its value until the next read cycle.
data_s2f_ur is an unregistered output connected directly to the SRAM's dio bus. Its data becomes valid one clock cycle earlier than data_s2f_r but is removed after the FSM enters the idle state.

Safe Design: Write Operation
The memory address, addr, and the data, data_f2s, are sampled and stored in the raddr and rf2s registers at the idle-to-w1 transition.
The we_n and tri_n signals are both activated in the w1 state. tri_n controls the data flow from the FPGA to the SRAM.
In the w2 state, we_n is deactivated but tri_n remains asserted, which ensures that the data is properly latched into the SRAM at the 0-to-1 edge of we_n.
At the end of the write cycle, the FSM returns to the idle state and tri_n is deactivated to remove the data from the dio bus.
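The control part of the safe design can be sketched as a five-state FSM. This is an illustration under several assumptions: the module name, the state encoding, and the load strobes ld_addr_data and ld_rdata (telling the data registers when to capture) are hypothetical, and ready is assumed to be 1 only in the idle state. The outputs are plain combinational functions of the state for brevity; a practical controller would typically register them so that we_n and oe_n are free of glitches.

```verilog
// Sketch of the safe-design control FSM (idle, r1, r2, w1, w2).
// ld_addr_data / ld_rdata are hypothetical strobes for the data path.
module sram_ctrl_fsm_safe (
    input  wire clk,
    input  wire reset,
    input  wire mem,           // 1 = start a memory operation
    input  wire rw,            // 1 = read, 0 = write
    output reg  ready,         // assumed: 1 only in the idle state
    output reg  we_n,
    output reg  oe_n,
    output reg  tri_n,
    output reg  ld_addr_data,  // capture addr (and write data)
    output reg  ld_rdata       // capture read data at the r2 -> idle edge
);
    localparam [2:0] IDLE = 3'd0, R1 = 3'd1, R2 = 3'd2,
                     W1 = 3'd3, W2 = 3'd4;

    reg [2:0] state, state_next;

    always @(posedge clk or posedge reset)
        if (reset) state <= IDLE;
        else       state <= state_next;

    always @* begin
        // Defaults from the slide: everything inactive, not ready
        ready = 1'b0; we_n = 1'b1; oe_n = 1'b1; tri_n = 1'b1;
        ld_addr_data = 1'b0; ld_rdata = 1'b0;
        state_next = state;
        case (state)
            IDLE: begin
                ready = 1'b1;
                if (mem) begin
                    ld_addr_data = 1'b1;          // sampled at the transition
                    state_next   = rw ? R1 : W1;
                end
            end
            R1: begin oe_n = 1'b0; state_next = R2; end
            R2: begin oe_n = 1'b0; ld_rdata = 1'b1; state_next = IDLE; end
            W1: begin we_n = 1'b0; tri_n = 1'b0; state_next = W2; end
            // we_n is released first; the data stays on the bus one more state
            W2: begin tri_n = 1'b0; state_next = IDLE; end
            default: state_next = IDLE;
        endcase
    end
endmodule
```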

Safe Design: Timing Analysis
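Assumptions: The FSM is controlled by a 50-MHz clock and thus stays in each state for 20 ns. The SRAM is an IS61LV25616A; the timing parameters cited on these slides are tAA = 10 ns, tPWE1 = 8 ns, and tHD = 0 ns, with read and write cycle times of 10 ns each.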

Read operation: During the read cycle, oe_n is asserted for two states, i.e., 40 ns, which provides a 30 ns margin over the 10 ns tAA. The data is stored in the data_s2f register when the FSM moves from the r2 state to the idle state. Although oe_n is deasserted at this transition, the data remains valid for a small interval because of the FPGA's pad delay and the tHZOE delay of the SRAM chip, so it can be sampled properly by the clock edge.

Write operation: During the write cycle, we_n is asserted in the w1 state, and the 20 ns interval exceeds the 8 ns tPWE1 requirement. The tri_n signal remains asserted in the w2 state and thus ensures that the data is still stable during the 0-to-1 transition of the we_n signal.

Performance: Both read and write operations take two clock cycles to complete. During a read operation, data_s2f_ur is available just before the rising edge of the second clock cycle and data_s2f_r is available right after the rising edge of the second clock cycle. Both read and write operations must return to the idle state after completion. The main system must therefore wait one more clock cycle before issuing a new memory operation, so a back-to-back memory access takes three clock cycles.

Timing Issues on Asynchronous SRAM
Deactivation of the we_n signal:
The 0-to-1 transition of we_n functions somewhat like the clock edge of an FF, at which the data is latched and stored in the internal memory element.
Even though the data hold time (tHD) is zero for this SRAM, deactivating we_n and removing the data at the same time is not a reliable approach because of variations in propagation delays.
The controller must ensure that we_n is deactivated before the data is removed from the bus.

Timing Issues on Asynchronous SRAM
Potential conflict on the data bus, dio:
The data bus is bidirectional and is used for both read and write operations.
A condition known as fighting occurs if both the controller and the SRAM place data on the bus at the same time.

Alternative Design I
Target: reduce the back-to-back operation overhead.
Instead of always returning to the idle state, the memory controller checks the mem signal at the end of the current memory operation (i.e., in the r2 or w2 state) and determines the next state.
It initiates a new memory operation immediately if there is a pending request.
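The change relative to a design that always returns to idle is confined to the next-state logic, which might look like the sketch below. Only the state register is shown; the control outputs and the sampling of the address and data at the end of r2 and w2 are omitted. The module name, the state encoding, and the helper function dispatch are hypothetical.

```verilog
// Sketch of the next-state logic for Alternative Design I.
// The r2 and w2 states check mem just as the idle state does.
module sram_fsm_alt1_next (
    input  wire       clk,
    input  wire       reset,
    input  wire       mem,    // 1 = a memory request is pending
    input  wire       rw,     // 1 = read, 0 = write
    output reg  [2:0] state
);
    localparam [2:0] IDLE = 3'd0, R1 = 3'd1, R2 = 3'd2,
                     W1 = 3'd3, W2 = 3'd4;

    // Hypothetical helper: choose the state that starts the next operation,
    // or idle when nothing is pending.
    function [2:0] dispatch;
        input mem_i;
        input rw_i;
        begin
            if (!mem_i)    dispatch = IDLE;
            else if (rw_i) dispatch = R1;
            else           dispatch = W1;
        end
    endfunction

    always @(posedge clk or posedge reset) begin
        if (reset)
            state <= IDLE;
        else begin
            case (state)
                // End of an operation (or idle): start the next one at once
                IDLE, R2, W2: state <= dispatch(mem, rw);
                R1:           state <= R2;
                W1:           state <= W2;
                default:      state <= IDLE;
            endcase
        end
    end
endmodule
```

With this transition structure, a pending request at the end of r2 or w2 skips the idle state, so a back-to-back access takes two clock cycles instead of three.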

Alternative Design I (state diagram)

Alternative Design I: Timing Analysis
Back-to-back memory operations may cause fighting on the data bus, for example when a write operation is performed immediately after a read operation:
Tri-state buffer of the SRAM: passing → high impedance
Tri-state buffer of the FPGA: high impedance → passing
If either of the tri-state buffers changes mode too slowly (delays tHZOE and tLZOE), both buffers may place data on the bus during a small interval and fighting occurs.
*The timing issues with the basic ‘safe’ design also apply to this design.

Alternative Design II
Target: perform a single memory operation in one clock cycle.
This is permitted by the timing parameters, since the read and write cycles of the SRAM are each 10 ns and each clock cycle is 20 ns.
The r2 and w2 states are removed.
A memory access takes one clock cycle to complete, and a back-to-back operation requires two clock cycles.

Alternative Design II

Alternative Design II: Timing Analysis
Read operation:
The address signal first propagates through the FPGA's I/O pads to the SRAM's address bus, and the retrieved data then propagates back through the I/O pads to the FPGA's internal logic.
Need to satisfy: SRAM address access time (tAA = 10 ns) + two pad delays (4 ns to 10 ns each) < one clock cycle (20 ns). With 4 ns pads the total is 18 ns and the constraint is met; with 10 ns pads it is 30 ns and the constraint is violated.
Write operation:
we_n must be deactivated before the data is removed for the data to be latched properly into the SRAM, which normal synthesis cannot guarantee.
Fine-tuning of the synthesis is required to achieve both.
*The timing issues with the basic ‘safe’ design also apply to this design.

Alternative Design III
Target: combine the features of the two preceding designs.
A memory access takes one clock cycle to complete, and a back-to-back operation also takes one clock cycle.
The we_n signal must be asserted for only a fraction of the clock period and cannot be shown in the state diagram. It is derived from the we_tmp signal shown in the w1 state.

Alternative Design III (state diagram)

Alternative Design III: Timing Analysis
The data is latched into the SRAM at the 0-to-1 transition of the we_n signal.
During back-to-back write operations, the state remains w1, so a we_n tied to the state would stay asserted (0) continuously and produce no latching edge.
One possible solution is to assert the signal only during the first half of the clock period, which is 10 ns (> tPWE1 = 8 ns):
assign we_n = we_tmp | ~clk;
This is not reliable because of potential glitches and delay variation. Better alternatives are discussed next.
*The timing issues with the previous two designs also apply to this design.

Advanced FPGA Features to Solve Timing Issues
An FSM cannot generate a control sequence that is "finer" than the period of its clock signal.
Some device-dependent (Spartan-3 in this case) and software-dependent ad hoc features give better control:
Digital Clock Manager (DCM): obtains a “finer” control sequence by using a faster clock.
Input/Output Block (IOB): minimizes the off-chip pad delay.

Digital Clock Manager (DCM)
A DCM can multiply or divide the frequency, or shift the phase, of an incoming clock to generate a new clock.
It is possible to drive the memory controller with a DCM-generated 200-MHz clock signal (period = 5 ns).
Example: To satisfy the 10 ns we_n requirement, one can expand the w1 state into two states and assert the we_n signal in both. The complete write operation now requires four states, but they amount to only 20 ns.

Input/Output Block (IOB)
An input/output block (IOB) of a Spartan-3 FPGA provides a programmable interface between an I/O pin and the device's internal logic.
To minimize the off-chip pad delay, the output registers of the memory controller can be placed in the FFs of the IOBs, and the driver can be configured with a proper slew rate.
An IOB contains a double data rate (DDR) register, which has two clocks and two inputs. Conceptually, the inputs are sampled independently by the two clocks and the sampled values are stored in the same register.

Input/Output Block (IOB)
Combining DDR and DCM to generate we_n:
clk180, generated by the DCM, is 180 degrees out of phase with clk.
A 1 is always loaded at the rising edge of the clk180 signal (the falling edge of the clk signal), which deactivates we_n during the second half of the clock period and generates a clean half-cycle signal.
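The conceptual behavior of the DDR register in this role can be captured in a short simulation-only model. This sketch is not the Spartan-3 IOB primitive and is not meant for synthesis: it simply lets two clocks write the same register, as the conceptual description says. The module name is hypothetical, and we_tmp is assumed to already hold the value intended for the upcoming clock period when the rising edge of clk arrives.

```verilog
// Simulation-only conceptual model of a DDR register producing we_n.
// Not the Xilinx IOB primitive; two always blocks writing one reg
// is acceptable in simulation but not synthesizable in general.
module ddr_we_n_model (
    input  wire clk,     // controller clock
    input  wire clk180,  // 180 degrees out of phase with clk (from the DCM)
    input  wire we_tmp,  // active-low write request, valid at posedge clk
    output reg  we_n
);
    initial we_n = 1'b1;            // start deactivated

    // First half of the period: we_n follows the request.
    always @(posedge clk)
        we_n <= we_tmp;

    // Second half of the period: a constant 1 is loaded, which creates
    // the 0-to-1 latching edge even during back-to-back writes.
    always @(posedge clk180)
        we_n <= 1'b1;
endmodule
```

Because both updates come from register outputs clocked at the pad, the resulting pulse avoids the glitch risk of gating we_tmp with the clock in ordinary logic.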

Extra slides…

Block Diagram of SRAM
Figure labels: 18-bit address bus; bidirectional 16-bit data bus; chip enable, write enable, output enable, lower byte enable, upper byte enable.

Block Diagram of SRAM
18-bit address bus, ad
Bidirectional 16-bit data bus, dio, divided into upper and lower bytes, which can be accessed individually
Five control signals (‘_n’ denotes active low):
ce_n (chip enable)
we_n (write enable)
oe_n (output enable)
lb_n (lower byte enable)
ub_n (upper byte enable)

Timing Diagrams for Read Operation
Timing diagram of an address-controlled read cycle (we_n = 1, oe_n = 0):
tRC (≈ tAA): read cycle time, the minimal elapsed time between two read operations
tAA: address access time, the time required to obtain stable output data after an address change
tOHA: output hold time, the time that the output data remains valid after the address changes (not to be confused with the hold time of an FF)
Find the specific values for the device in the data sheet.

Timing Diagrams for Read Operation
Timing diagram of an oe_n-controlled read cycle (we_n = 1):
tDOE: output enable access time, the time required to obtain valid data after oe_n is activated
tHZOE: output enable to high-Z time, the time for the tri-state buffer to enter the high-impedance state after oe_n is deactivated
tLZOE: output enable to low-Z time, the time for the tri-state buffer to leave the high-impedance state after oe_n is activated
Find the specific values for the device in the data sheet.

Timing Diagram for Write Operation
Timing diagram of a write cycle:
tWC: write cycle time, the minimal elapsed time between two write operations
tSA: address setup time, the minimal time that the address must be stable before we_n is activated
tHA: address hold time, the minimal time that the address must be stable after we_n is deactivated
tPWE1: we_n pulse width, the minimal time that we_n must be asserted
tSD: data setup time, the minimal time that the data must be stable before the latching edge (the edge at which we_n moves from 0 to 1)
tHD: data hold time, the minimal time that the data must be stable after the latching edge

Control Sequences for Read Cycle
we_n should be deactivated during the entire operation.
Place the address on the ad bus and activate the oe_n signal. These two signals must be stable for the entire operation.
Wait for at least tAA. The data from the SRAM becomes available after this interval.
Retrieve the data from dio and deactivate the oe_n signal.

Control Sequences for Write Cycle
Place the address on the ad bus and the data on the dio bus, and activate the we_n signal. These signals must be stable for the entire operation.
Wait for at least tPWE1.
Deactivate the we_n signal. The data is latched into the SRAM at the 0-to-1 transition edge.
Remove the data from the dio bus.
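The read and write control sequences can be exercised in simulation with a small testbench like the sketch below. Everything here is an assumption made for illustration: the testbench name, the 16-word zero-delay memory standing in for the SRAM, the drive flag that plays the role of the tri-state enable, and the delay constants, which reuse the 10 ns tAA and 8 ns tPWE1 figures quoted in the timing analysis.

```verilog
`timescale 1ns / 1ps
// Testbench sketch: one write followed by one read, following the
// control sequences step by step against an idealized SRAM stand-in.
module tb_sram_sequences;
    localparam T_AA   = 10;   // address access time (ns), from the slides
    localparam T_PWE1 = 8;    // minimal we_n pulse width (ns), from the slides

    reg  [17:0] ad;
    reg         we_n, oe_n;
    reg         drive;        // 1 = testbench drives dio
    reg  [15:0] wdata, rdata;
    wire [15:0] dio;

    // Idealized zero-delay memory (only 16 words are modelled)
    reg [15:0] mem [0:15];
    assign dio = drive ? wdata : 16'bz;                    // controller side
    assign dio = (we_n && !oe_n) ? mem[ad[3:0]] : 16'bz;   // SRAM side
    always @(posedge we_n) mem[ad[3:0]] <= dio;            // latching edge

    initial begin
        we_n = 1; oe_n = 1; drive = 0; ad = 0; wdata = 0;

        // Write cycle: address and data on the buses, activate we_n
        #20 ad = 18'h00005; wdata = 16'hBEEF; drive = 1; we_n = 0;
        #(T_PWE1) we_n = 1;   // wait at least tPWE1, then latch
        #5 drive = 0;         // remove the data only after we_n is inactive

        // Read cycle: we_n stays 1, address on the bus, activate oe_n
        #20 ad = 18'h00005; oe_n = 0;
        #(T_AA) rdata = dio;  // wait at least tAA, then retrieve the data
        oe_n = 1;

        $display("read back %h", rdata);
        $finish;
    end
endmodule
```

The separate step that releases drive after we_n has returned to 1 mirrors the requirement that the data be removed from the bus only after the latching edge.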
