1
UNIT V : SUBSYSTEM DESIGN
VLSI Design 01/03/2009 VLSI Design
2
Subsystem Design
Goals of Custom Subsystem Design
– Maximize performance
– Minimize area
– Fit together with other subsystems
Key idea: optimize across levels of abstraction
– Layout
– Circuit
– Logic
– Register-Transfer and Higher
3
Optimizing for Performance and/or Area
Layout level
– Microscopic changes: move wires, change wire sizing, add vias, reduce source/drain capacitance, etc.
– Macroscopic changes: cell placement, design of hierarchy
Circuit level
– Transistor sizing
– Advanced circuits (e.g., dynamic logic)
4
Optimizing for Performance and/or Area (cont'd)
Logic level
– Use specialized designs (e.g., shifters, ALUs)
– Flatten to reduce delay
– Restructure
Register-transfer level (and above)
– Place latches/flip-flops to maximize performance (retiming)
– Encode FSMs to minimize area/delay
– Perform computations in parallel with extra hardware if cost permits
– Pipeline logic to increase performance
5
Pipelining
Key idea: partition a combinational function with latches / flip-flops
– Each partition is called a stage
– Time between each result: one clock period
– Latency: number of clock cycles before the result appears (== number of stages)
6
Subsystem Design Principles
[Figure: pipelining system; labels Q1, Q2, Φ, i1]
7
Pipelining Comments
Impact on performance
– Increases operations per unit time
– Increases latency
– Adds overhead due to register setup times
Design concerns / limits
– Balance stage delays for best performance
– Structure of logic may limit number of stages
8
A pipeline with unbalanced stage delays
The longest stage delay determines the clock period.
9
Effect of Adding Pipeline Stages
10
Latency, Clock Period and Throughput
Latency: the time taken to process an input datum from input to output = N x Tp
Clock period = total processing time / N
Throughput = total processing / clock period
Example: total processing = 10,000 instructions
– Sequential: time = 10 usec, throughput = 1 G inst./sec
– Pipelined (N = 10 stages in parallel): clock period = 1 usec, latency = 10 usec, throughput = 10 G inst./sec
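The three definitions can be checked numerically. This is only a sketch: the function name `pipeline_metrics` is hypothetical, and it assumes perfectly balanced stages with no register overhead, which the earlier slides note is not the case in practice.

```python
def pipeline_metrics(total_work, total_time_s, n_stages):
    """Apply the slide's definitions to an ideally balanced N-stage pipeline."""
    clock_period = total_time_s / n_stages   # clock period = total processing time / N
    latency = n_stages * clock_period        # latency = N x Tp
    throughput = total_work / clock_period   # throughput = total processing / clock period
    return clock_period, latency, throughput


# Slide example: 10,000 instructions and 10 usec of total processing time.
sequential = pipeline_metrics(10_000, 10e-6, n_stages=1)
pipelined = pipeline_metrics(10_000, 10e-6, n_stages=10)
print("sequential (period, latency, throughput):", sequential)
print("pipelined  (period, latency, throughput):", pipelined)
```

Setting `n_stages=1` models the sequential case, so the same function gives both columns of the example.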
11
Overview
1. Adders
2. Shifters
3. High Density Memory Elements
12
1. Adders
– Half Adder
– Full Adder
– Ripple Carry Adder
– Carry Look-Ahead Adder
– Carry Skip Adder (Carry By-pass Adder)
– Carry Select Adder
13
Binary Adder: Half-Adder
Binary Addition – single bit addition
– sum of 2 binary numbers can be larger than either number
– need a “carry-out” to store the overflow
Half-Adder
– 2 inputs (x and y) and 2 outputs (sum and carry)
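A minimal behavioural sketch of the half-adder, assuming the usual definitions sum = x XOR y and carry = x AND y (the function name `half_adder` is illustrative only):

```python
def half_adder(x, y):
    """Return (sum, carry) for two input bits."""
    return x ^ y, x & y


# Print the full truth table: x, y, sum, carry.
for x in (0, 1):
    for y in (0, 1):
        s, c = half_adder(x, y)
        print(x, y, s, c)
```

The carry output has twice the weight of the sum output, so `2 * carry + sum` always equals `x + y`.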
14
Half-Adder Circuits
Most Basic Logic – NAND and NOR only circuits
15
Half-Adder Circuits
NAND Implementation
16
Half-Adder Circuits
NOR Implementation
17
Adder Design: Full Adder
Sum: si = ai XOR bi XOR ci
Carry: ci+1 = ai*bi + ai*ci + bi*ci
Truth table:
Ai Bi Ci | Si Ci+1
 0  0  0 |  0  0
 0  0  1 |  1  0
 0  1  0 |  1  0
 0  1  1 |  0  1
 1  0  0 |  1  0
 1  0  1 |  0  1
 1  1  0 |  0  1
 1  1  1 |  1  1
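The sum and carry equations can be exercised directly. The sketch below (function name `full_adder` is illustrative) evaluates both equations for all eight input combinations:

```python
def full_adder(a, b, c):
    """Return (sum, carry_out) for input bits a, b and carry-in c."""
    s = a ^ b ^ c                          # si = ai XOR bi XOR ci
    c_out = (a & b) | (a & c) | (b & c)    # ci+1 = ai*bi + ai*ci + bi*ci
    return s, c_out


for a in (0, 1):
    for b in (0, 1):
        for c in (0, 1):
            print(a, b, c, *full_adder(a, b, c))
```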
18
When adding more than one bit, must consider the carry of the previous bit
– full-adder has a “carry-in” input
19
The Binary Adder
20
Cascading Multi-bit Adders
Adding 2 binary (multi-bit) words
– adding two n-bit words produces an n-bit sum and a carry
– example: 4-bit addition
Carry Bits
– adding two n-bit words produces a carry-out, which is bit n+1 of the result
– can be used as carry-in for the next stage or as an overflow flag
Cascading Multi-bit Adders
– carry-out from a binary word adder can be passed to the next cell to add larger words
– example: 3 cascaded 4-bit binary adders for 12-bit addition
21
Ripple Carry Adder
To use single-bit full-adders to add multi-bit words
– must apply carry-out from each bit addition to the next bit addition
– essentially like adding 3 multi-bit words
• each ci is generated from the (i-1)th addition
Ripple-Carry Adder
– passes carry-out of each bit to carry-in of next bit
– for n-bit addition, requires n full-adders
22
Ripple Carry Adder
Ripple: constructed from n full adders
– Compact, but delay proportional to n
– May be tolerable when n = 8, but what about n = 32?
– Potential worst cases: A0 or B0 to S31; A0 or B0 to C32
[Figure: 4-bit ripple carry adder; inputs A3..A0, B3..B0, C0; carries C1..C4; sums S3..S0]
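A behavioural sketch of the ripple structure: n full adders in a chain, each taking its carry-in from the previous bit. The helper names (`to_bits`, `from_bits`, `ripple_carry_add`) are illustrative, and words are held as LSB-first bit lists.

```python
def full_adder(a, b, c):
    """One-bit full adder: returns (sum, carry_out)."""
    return a ^ b ^ c, (a & b) | (a & c) | (b & c)


def to_bits(value, n):
    """Integer -> list of n bits, least significant bit first."""
    return [(value >> i) & 1 for i in range(n)]


def from_bits(bits):
    """LSB-first bit list -> integer."""
    return sum(bit << i for i, bit in enumerate(bits))


def ripple_carry_add(a_bits, b_bits, c0=0):
    """Chain one full adder per bit; the carry ripples from bit 0 upward."""
    carry = c0
    sum_bits = []
    for a, b in zip(a_bits, b_bits):
        s, carry = full_adder(a, b, carry)
        sum_bits.append(s)
    return sum_bits, carry


s_bits, c4 = ripple_carry_add(to_bits(11, 4), to_bits(6, 4))
print(from_bits(s_bits), c4)
```

The loop is inherently sequential: bit i cannot finish until bit i-1 has produced its carry, which is the software analogue of the delay growing in proportion to n.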
23
Complementary Static CMOS Full Adder
28 Transistors
24
Adder/Subtractor using R-C Adders
Subtraction using 2’s complement
– 2’s complement of X: X2s = X' + 1 (X' is the bitwise complement of X)
• invert and add 1
– Subtraction via addition: Y - X = Y + X2s
R-C Adder/Subtractor Cell
– control line, add_sub: 0 = add, 1 = subtract
– XOR used to pass (add_sub=0) or invert (add_sub=1)
– setting the first carry-in, c0, to 1 adds the 1 for the 2’s complement
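A bit-level sketch of the adder/subtractor cell, assuming the stated control convention (add_sub = 0 adds, add_sub = 1 subtracts). With that convention the XOR gate must pass each bit of X when add_sub = 0 and invert it when add_sub = 1, and add_sub also drives the first carry-in. The function name and the default 4-bit width are assumptions for illustration.

```python
def adder_subtractor(y, x, add_sub, n=4):
    """Return (result mod 2**n, carry_out) for y + x (add_sub=0) or y - x (add_sub=1)."""
    carry = add_sub                     # c0 = 1 supplies the "+1" of the 2's complement
    result = 0
    for i in range(n):
        yi = (y >> i) & 1
        xi = ((x >> i) & 1) ^ add_sub   # XOR: pass xi when add_sub=0, invert when add_sub=1
        s = yi ^ xi ^ carry
        carry = (yi & xi) | (yi & carry) | (xi & carry)
        result |= s << i
    return result, carry


print(adder_subtractor(9, 3, add_sub=0))   # 9 + 3
print(adder_subtractor(9, 3, add_sub=1))   # 9 - 3
```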
25
Carry Look-Ahead Adder (CLA)
To avoid the linear growth of the carry delay, we use a Carry Look-Ahead Adder (CLA), in which the carries can be generated in parallel.
The carry of each bit is generated from the propagate and generate signals as well as the input carry.
26
Carry Look-Ahead Adder
27
Carry Look-Ahead Adder
28
Speeding up Carry - Carry Lookahead
Key idea: trade off delay against the amount of logic used
– Benefit: faster addition
– Cost: much more logic
Define two signals for each adder stage:
– Generate gi = ai*bi
– Propagate pi = ai + bi
Why use these names?
– Adder i will always generate a carry if ai, bi are both true
– Adder i will propagate a carry input if either or both of ai, bi are true
[Figure: adder stage 0; labels A0, B0, C0, C1, S0]
29
Express Sum and Carry as a function of P, G, D
Define 3 new variables which ONLY depend on A, B
– Generate (G) = AB
– Propagate (P) = A ⊕ B
– Delete (D) = A'B' (both inputs are 0)
Can also derive expressions for S and C based on D and P
Note that we will sometimes use an alternate definition: Propagate (P) = A + B
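One way to write the sum and carry in terms of these signals is S = P ⊕ Cin and Cout = G + P·Cin, with the complemented carry given by D + P·Cin'. These are the standard identities rather than expressions quoted from the slide, so the sketch below checks them exhaustively (the function name `sum_and_carry` is illustrative):

```python
from itertools import product


def sum_and_carry(a, b, c_in):
    g = a & b                               # generate
    p = a ^ b                               # propagate (XOR definition)
    d = (1 - a) & (1 - b)                   # delete: both inputs are 0
    s = p ^ c_in                            # S = P xor Cin
    c_out = g | (p & c_in)                  # Cout = G + P.Cin
    c_out_inverted = d | (p & (1 - c_in))   # complement of Cout, from D and P
    return s, c_out, c_out_inverted


for a, b, c_in in product((0, 1), repeat=3):
    s, c_out, c_out_inverted = sum_and_carry(a, b, c_in)
    assert a + b + c_in == s + 2 * c_out
    assert c_out_inverted == 1 - c_out
    # The alternate definition P = A + B gives the same carry (but not the same sum).
    assert c_out == (a & b) | ((a | b) & c_in)
print("all 8 input combinations agree")
```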
30
Carry Look ahead Trees
Can continue building the tree hierarchically.
31
Carry Look Ahead Adder
32
CLA in reduced CMOS
Reduce previous CLA logic circuit
– construct CMOS push-pull network for each term
– expanded carry terms
• c1 = g0 + c0•p0
• c2 = g1 + g0•p1 + c0•p0•p1
• c3 = g2 + g1•p2 + g0•p1•p2 + c0•p0•p1•p2
• c4 = g3 + g2•p3 + g1•p2•p3 + g0•p1•p2•p3 + c0•p0•p1•p2•p3
– nFET network for each carry term
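The expanded carry terms can be evaluated directly from g, p and c0, with no carry depending on another carry. A sketch for a 4-bit block follows; the function names are illustrative, and p is taken as ai + bi (the OR definition), which is one of the two propagate definitions used in these slides.

```python
def cla_carries(a, b, c0=0):
    """Return [c0, c1, c2, c3, c4] for 4-bit operands using the expanded terms."""
    g = [(a >> i) & (b >> i) & 1 for i in range(4)]     # gi = ai * bi
    p = [((a >> i) | (b >> i)) & 1 for i in range(4)]   # pi = ai + bi
    c1 = g[0] | (c0 & p[0])
    c2 = g[1] | (g[0] & p[1]) | (c0 & p[0] & p[1])
    c3 = g[2] | (g[1] & p[2]) | (g[0] & p[1] & p[2]) | (c0 & p[0] & p[1] & p[2])
    c4 = (g[3] | (g[2] & p[3]) | (g[1] & p[2] & p[3])
          | (g[0] & p[1] & p[2] & p[3]) | (c0 & p[0] & p[1] & p[2] & p[3]))
    return [c0, c1, c2, c3, c4]


def cla_add(a, b, c0=0):
    """4-bit add: sum bit i = ai xor bi xor ci, using the look-ahead carries."""
    c = cla_carries(a, b, c0)
    s = 0
    for i in range(4):
        s |= (((a >> i) ^ (b >> i) ^ c[i]) & 1) << i
    return s, c[4]


print(cla_add(0b1011, 0b0110))
```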
33
16b Adder Using 4b CLA Blocks
Create SUMs from outputs of this circuit
34
Manchester Carry Chain
Goal: speed up carry using circuit design
Key idea: use dynamic logic
– Conditionally discharge using generate signal
– Connect successive bits using propagate signal
– Worst-case discharge path runs through the entire chain
[Figure: carry chain stage; nodes Ci-1’ and Ci’]
35
Manchester Carry Generation
Create switch-logic network for carry
– define carry in terms of controls where only one control is active at a given time
– consider single-bit FA truth table
• generate gi = ai • bi
• propagate pi = ai ⊕ bi
• carry-kill ki = (ai + bi)'
– only one of these is active for each case
• if generate = 1 then ci+1 = 1
• if propagate = 1 then ci+1 = ci
• if carry-kill = 1 then ci+1 = 0
Carry-kill: generated to cover the case not covered by gi and pi, where the carry will be zero
– (ai,bi) = (0,0): carry-kill is active
– (ai,bi) = (1,1): generate is active
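A quick check that exactly one control is active for every input pair, assuming carry-kill is the complement of (ai + bi), i.e. active only when both inputs are 0. The function names are illustrative.

```python
def carry_controls(a, b):
    """Return (generate, propagate, kill) for one bit position."""
    g = a & b
    p = a ^ b
    k = 1 - (a | b)   # kill is active only for (a, b) = (0, 0)
    return g, p, k


def next_carry(a, b, c):
    """Select ci+1 from whichever single control is active."""
    g, p, k = carry_controls(a, b)
    if g:
        return 1
    if p:
        return c
    return 0          # k == 1


for a in (0, 1):
    for b in (0, 1):
        print((a, b), carry_controls(a, b))
```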
36
Carry-Skip Adder
Quickly generate a carry under certain conditions and skip the carry-generation block
• recall ci+1 = gi + ci•pi
• note generation of gi is more complex than generation of pi
– so, generate pi and check the ci•pi case; skip gi generation if ci•pi = 1
37
A carry-skip adder is designed to speed up a wide adder by aiding the propagation of a carry bit around a portion of the entire adder.
38
Carry/Skip Adder
Key idea: in some cases, carry out = carry in
– Bypass gate: "forward" carry signal
– Break forwarding into groups
[Figure: full adders (FA) arranged in three groups, each with a skip path; group propagate signals Pi-1, Pi, Pi+1]
39
Carry-Bypass Adder(Carry-Skip Adder)
40
Carry-Skip Adder (cont.)
t_adder = t_setup + M·t_carry + (N/M - 1)·t_bypass + (M - 1)·t_carry + t_sum
41
Carry-Skip Adder (cont.)
– t_setup: the fixed overhead time to create the generate and propagate signals
– t_carry: the propagation delay through a single bit. The worst-case carry-propagation delay through a single stage of M bits is approximately M times larger
– t_bypass: the propagation delay through the bypass multiplexer of a single stage
– t_sum: the time to generate the sum of the final stage
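To get a feel for the delay expression, the sketch below evaluates it for an N-bit adder split into stages of M bits. Two things here are assumptions and not taken from the slides: the unit delay values, and the ripple-carry estimate t_setup + (N - 1)·t_carry + t_sum used for comparison.

```python
def carry_skip_delay(n, m, t_setup, t_carry, t_bypass, t_sum):
    """Worst-case delay of an n-bit carry-skip adder built from m-bit stages."""
    if n % m != 0:
        raise ValueError("n must be a multiple of m")
    stages = n // m
    return (t_setup + m * t_carry + (stages - 1) * t_bypass
            + (m - 1) * t_carry + t_sum)


def ripple_delay(n, t_setup, t_carry, t_sum):
    """Assumed ripple-carry estimate for comparison (not from the slides)."""
    return t_setup + (n - 1) * t_carry + t_sum


# Hypothetical unit delays, chosen only to show the trend.
print(carry_skip_delay(32, 4, t_setup=1, t_carry=1, t_bypass=1, t_sum=1))
print(ripple_delay(32, t_setup=1, t_carry=1, t_sum=1))
```

With these unit delays the bypass version is roughly twice as fast at 32 bits; the actual crossover depends on the real gate delays.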
42
Carry Ripple versus Carry Bypass
43
Carry-Select Adder
– uses multiple adder blocks to increase speed
– takes a lot of chip area
44
Carry-Select Adder
45
Carry/Select Adder
Key idea: compute two sum values
– sum assuming carry = 0
– sum assuming carry = 1
Select the proper result based on carry in
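A behavioural sketch of the idea: each block computes both candidate results, and the incoming carry only selects between them. The block width `m`, the word width `n` and the function names are assumptions for illustration; a hardware block would use two real adders where this sketch uses integer addition.

```python
def block_add(a, b, c_in, width):
    """Add two width-bit blocks; return (sum mod 2**width, carry_out)."""
    total = a + b + c_in
    return total & ((1 << width) - 1), total >> width


def carry_select_add(a, b, n=16, m=4, c_in=0):
    """n-bit add built from m-bit blocks that precompute both carry cases."""
    mask = (1 << m) - 1
    result, carry = 0, c_in
    for pos in range(0, n, m):
        a_blk = (a >> pos) & mask
        b_blk = (b >> pos) & mask
        sum0, cout0 = block_add(a_blk, b_blk, 0, m)   # result assuming carry = 0
        sum1, cout1 = block_add(a_blk, b_blk, 1, m)   # result assuming carry = 1
        s, carry = (sum1, cout1) if carry else (sum0, cout0)   # the "multiplexer"
        result |= s << pos
    return result, carry


print(carry_select_add(1234, 4321))
print(carry_select_add(0xFFFF, 1))
```

Computing both candidates is what costs the extra chip area; the benefit is that only the select, not a full block addition, waits on the incoming carry.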
46
Carry-Select Adder
t_setup: the fixed overhead time to create the generate and propagate signals
47
Carry Select Adder: Critical Path
48
Carry-Save Adder
Parallel FA, 3 inputs and 2 outputs
– does not add carry-out to next bit (thus no ripple)
• carry is saved for use by other blocks
– useful for adding more than 2 numbers
49
Carry Save Addition
– A full adder sums 3 inputs and produces 2 outputs
• Carry output has twice the weight of the sum output
– N full adders in parallel are called a carry save adder
• Produce N sums and N carry outs
50
CSA Application
– To add k operands, use k-2 stages of CSAs
– Keep result in carry-save redundant form
– Final CPA (carry-propagate adder) computes the actual result
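A word-level sketch of carry-save addition, assuming non-negative integer operands. Each CSA stage is a row of independent full adders (bitwise operations here), the carry word is shifted left because it has twice the weight, and only the final step is an ordinary carry-propagate addition. Function names are illustrative.

```python
def carry_save_add(x, y, z):
    """One CSA stage: three words in, (sum word, carry word) out; no carry ripple."""
    sum_word = x ^ y ^ z
    carry_word = ((x & y) | (x & z) | (y & z)) << 1   # carry has twice the weight
    return sum_word, carry_word


def add_operands(operands):
    """Add k >= 2 operands with k-2 CSA stages and one final CPA."""
    if len(operands) < 2:
        raise ValueError("need at least two operands")
    s, c = operands[0], operands[1]   # redundant (sum, carry) form
    stages = 0
    for operand in operands[2:]:
        s, c = carry_save_add(s, c, operand)
        stages += 1
    return s + c, stages              # final carry-propagate addition


print(add_operands([3, 5, 7, 9]))
```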
52
Multipliers
53
Multipliers
54
Multipliers
55
Multipliers
56
Register-Based Multiplier
57
Array Multipliers
58
Array Multipliers
59
Array Multipliers
60
Booth Multiplier
61
Booth Multiplier
62
Booth Multiplier
Structure of a Booth multiplier
63
Wallace Tree Multiplier
64
Wallace Tree Multiplier
Wallace tree multiplication
65
Wallace Tree Multiplier
66
Serial Multiplication
Serial multiplier
1. Requires MN clock cycles to produce a product for an N-bit multiplier and an M-bit multiplicand
67
Serial Multiplication
Serial/parallel multiplier
1. Requires M+N clock cycles to produce a product for an N-bit multiplier and an M-bit multiplicand
2. The critical path consists of the adders
68
Registers
Basic Register Function
– store a byte of data
– implement data movement functions such as shift, rotate
– basis for other functions
• counter/timer
Basic Register Circuit
– cascade of DFF cells
– additional logic to multiplex multiple inputs/outputs
– typical I/O options
• parallel load
• load from left/right cell (shift)
• parallel output
69
Shifter Design
Why shift?
– Arithmetic operations
– Floating-point
– Bit field extraction
Shift register: one shift per clock cycle
Hardware shifters: implemented as combinational logic
– Single-bit shifters
– Barrel shifters
– Logarithmic shifters
70
Shift and Rotate Operations
• Rotate
– move each bit of data to an adjacent bit
– roll end bit to other end
• Shift
– load ‘0’ into the open end bit
Examples: 4b operations on data a3 a2 a1 a0
– Rotate Left: output = a2 a1 a0 a3
– Rotate Right: output = a0 a3 a2 a1
– Shift Left: output = a2 a1 a0 0
– Shift Right: output = 0 a3 a2 a1
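The four operations can be reproduced symbolically. In this sketch the data word is a list of labels written MSB first, exactly as in the examples; the function names are illustrative.

```python
def rotate_left(bits):
    """Every bit moves one place toward the MSB; the old MSB rolls to the LSB end."""
    return bits[1:] + bits[:1]


def rotate_right(bits):
    """Every bit moves one place toward the LSB; the old LSB rolls to the MSB end."""
    return bits[-1:] + bits[:-1]


def shift_left(bits, fill="0"):
    """Like rotate left, but a '0' is loaded into the open end."""
    return bits[1:] + [fill]


def shift_right(bits, fill="0"):
    """Like rotate right, but a '0' is loaded into the open end."""
    return [fill] + bits[:-1]


data = ["a3", "a2", "a1", "a0"]   # MSB first
for op in (rotate_left, rotate_right, shift_left, shift_right):
    print(op.__name__, " ".join(op(data)))
```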
71
Switch Shift/Rotate Circuits
Can use switch circuits to implement fast multi shift/rotate functions
– will not store/hold data since no FF is used, and is not synchronous
– Example: 4-bit Left Rotate Switching Array
• Rotate Left moves lower bits to higher bits
• only one select (Rol_x) is active at a time
72
Left-Rotate switching Array
73
Design of a 4X4 Barrel-Shifter
Any general purpose n-bit shifter should be able to shift incoming data by up to (n – 1) places in a right-shift or left-shift direction. If we now further specify that all shifts should be on an 'end-around' basis, so that any bit shifted out at one end of a data word will be shifted in at the other end of the word, then the problem of right shift or left shift is greatly eased. For a 4-bit word, a 1-bit shift right is equivalent to a 3-bit shift left, and a 2-bit shift right is equivalent to a 2-bit shift left. Thus we can achieve a capability to shift left or right by zero, one, two, or three places by designing a circuit which will shift right only (say) by zero, one, two, or three places.
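The equivalence between end-around right and left shifts is easy to confirm by brute force. A small sketch on 4-bit integers (function names are illustrative):

```python
def rotate_right(word, k, n=4):
    """End-around right shift of an n-bit word by k places."""
    k %= n
    mask = (1 << n) - 1
    return ((word >> k) | (word << (n - k))) & mask


def rotate_left(word, k, n=4):
    """End-around left shift of an n-bit word by k places."""
    k %= n
    mask = (1 << n) - 1
    return ((word << k) | (word >> (n - k))) & mask


# A k-place right shift equals an (n - k)-place left shift for every 4-bit word.
for word in range(16):
    for k in range(4):
        assert rotate_right(word, k) == rotate_left(word, (4 - k) % 4)
print("right shift by k == left shift by 4 - k for all 4-bit words")
```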
74
Design of a 4X4 Barrel-Shifter
Data could be loaded from the output of the ALU and shifting effected; then the outputs of each stage of the shift register would provide the required parallel output to be returned to the register array.
75
Design of a 4X4 Barrel-Shifter
In this case, the shifter must have:
– input from a four-line parallel data bus;
– four output lines for the shifted data;
– means of transferring input data to output lines with any shift from zero to three bits inclusive.
76
Design of a 4X4 Barrel-Shifter
Take advantage of the technology, for example the switch-like MOS pass transistor and transmission gate. Observe the strategy decided for the direction of data and control signal flow; the approach adopted should make this feasible. Remember that the overall strategy in this case is for data to flow horizontally and control signals vertically.
77
Design of a 4X4 Barrel-Shifter
A solution which meets these requirements emerges from the days of switch and relay contact based switching networks: the crossbar switch. Consider a direct MOS switch implementation of a 4 x 4 crossbar switch. The arrangement is quite general and may be readily expanded to accommodate n-bit inputs/outputs.
In fact, this arrangement is an overkill in that any input line can be connected to any or all output lines; if all switches are closed, then all inputs are connected to all outputs in one glorious short circuit. Furthermore, 16 control signals (sw00-sw15), one for each transistor switch, must be provided to drive the crossbar switch, and such complexity is highly undesirable.
An adaptation of this arrangement recognizes the fact that we can couple the switch gates together in groups of four (in this case) and also form four separate groups corresponding to shifts of zero, one, two and three bits. The arrangement is readily adapted so that the in-lines also run horizontally (to conform to the required strategy).
78
79
Design of a 4X4 Barrel-Shifter
The resulting arrangement is known as a barrel shifter; the circuit diagram shows a 4 x 4-bit barrel shifter. The interbus switches have their gate inputs connected in a staircase fashion in groups of four, and there are now four shift control inputs, which must be mutually exclusive in the active state. The structure of the barrel shifter is clearly one of high regularity and generality, and it may be readily represented in stick diagram form.
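A behavioural model of the switch array, as a sketch only. It assumes bit i of the input sits at list index i, that control line k selects an end-around right shift by k places, and that the function name `barrel_shift` is free to choose. It also enforces the rule that the control lines are mutually exclusive, since closing two switch groups at once would short inputs together.

```python
def barrel_shift(inputs, shift_lines):
    """inputs: n bits (index 0 = bit 0); shift_lines: one-hot list of n controls."""
    n = len(inputs)
    if len(shift_lines) != n or sum(shift_lines) != 1:
        raise ValueError("exactly one shift control line must be active")
    outputs = [0] * n
    for k, active in enumerate(shift_lines):
        if not active:
            continue
        for i in range(n):
            # Switch group k connects input i to output (i - k) mod n.
            outputs[(i - k) % n] = inputs[i]
    return outputs


word = [1, 0, 0, 0]                       # only bit 0 is set
print(barrel_shift(word, [1, 0, 0, 0]))   # shift by 0
print(barrel_shift(word, [0, 1, 0, 0]))   # shift right by 1, end-around
```

Only n control lines are needed, compared with the n x n individually driven switches of the plain crossbar.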
80
Barrel Shifter
Shifts m inputs into n outputs
– typically n = m or n = m/2
• Example: 8x4 barrel shifter
– outputs 1 of 4 combinations of 4 adjacent bits
81
Barrel Shifter
8x4 nMOS switch barrel shifter
82
The Barrel Shifter
Area Dominated by Wiring
83
4x4 barrel shifter
84
85
Design of a 4X4 Barrel-Shifter
The stick diagram clearly conveys regular topology and allows the choice of a standard cell from which complete barrel shifters of any size may be formed by replication of the standard cell. It should be noted that standard cell boundaries must be carefully chosen to allow for butting together side by side and top to bottom to retain the overall topology. Once the standard cell dimensions have been determined, then any n x n barrel shifter may be configured and its outline, or bounding box, arrived at by summing up the dimensions of the replicated standard cell.
86
87
88
Observations
The steps involved in the design of a system, and of a particular subsystem, may be set out as follows:
1) Set out a specification together with an architectural block diagram.
2) Suitably partition the architecture into subsystems which are, as far as possible, self-contained and which give as simple interconnection requirements as possible.
3) Set out a tentative floor plan showing the proposed physical disposition of subsystems on the chip.
4) Determine interconnection strategy.
5) Revise 2, 3 and 4 interactively as necessary.
6) Choose layers on which to run buses and the main control signals.
7) Take each subsystem in turn and conceive a regular architecture to conform to the strategy set out in 4. Set out circuit and/or logic diagrams as appropriate. Remember that MOS switch-based logic is such that both the logic 1 and logic 0 conditions of an output must be deliberately satisfied (not as in TTL logic, where if logic 1 conditions are satisfied then logic 0 conditions follow automatically).
8) Develop stick diagrams adopting suitable tactics to observe the overall strategy (4) and choice of layers (6). Determine suitable standard cell(s) from which the subsystem may be formed.
89
A Generic Digital Processor
90
Architecture of a CPU
[Figure: CPU block diagram; blocks Control, Memory, Register File, Data path; signals: function, Read/write, Flags (overflow, zero, etc.)]
– This is a very simple CPU
– The control unit is a state machine. It schedules the operations and synchronizes the different components
– Memory could be RAM, ROM, Content-Addressable Memory (CAM)
– Data path is the manipulator of data. It performs arithmetic and logic operations
– I/O is not shown
91
Arithmetic and Logic Unit (ALU)
Functions
– Arithmetic (add, sub, inc, dec)
– Logic (and, or, not, xor)
– Comparison (<, >, <=, >=, !=)
Control signals
– Function selection
– Operation mode (signed, unsigned)
Output
– Operation result (data)
– Flags (overflow, zero, negative)
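A behavioural sketch of such an ALU, covering the arithmetic and logic functions and the three flags. Several choices here are assumptions for illustration only: the string operation names, the default 8-bit width, and the convention that in unsigned mode the overflow flag reports a carry out (add) or a borrow (sub). Comparison functions are left out of the sketch.

```python
def alu(op, a, b=0, width=8, signed=False):
    """Return (result, flags) for a width-bit operation on a and b."""
    mask = (1 << width) - 1
    sign_bit = 1 << (width - 1)
    a &= mask
    b &= mask
    if op == "inc":
        op, b = "add", 1
    elif op == "dec":
        op, b = "sub", 1
    if op == "add":
        full = a + b
    elif op == "sub":
        full = a + ((~b) & mask) + 1          # subtract by adding the 2's complement
    elif op == "and":
        full = a & b
    elif op == "or":
        full = a | b
    elif op == "xor":
        full = a ^ b
    elif op == "not":
        full = (~a) & mask
    else:
        raise ValueError("unknown operation: " + op)
    result = full & mask
    carry = (full >> width) & 1
    overflow = False
    if op in ("add", "sub"):
        if signed:
            operand = b if op == "add" else (~b) & mask
            # Signed overflow: both addends share a sign that the result lacks.
            overflow = bool((a ^ result) & (operand ^ result) & sign_bit)
        else:
            overflow = bool(carry) if op == "add" else not carry
    flags = {"overflow": overflow,
             "zero": result == 0,
             "negative": bool(result & sign_bit)}
    return result, flags


print(alu("add", 200, 100))
print(alu("add", 127, 1, signed=True))
print(alu("sub", 5, 5))
```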
92
Simple ALU Example
[Figure: bit-sliced ALU, Bit 0 to Bit 3; labels Data in, Register, Adder, Shifter, Multiplexer, Control, Data Out] [© Prentice Hall]
– The inputs are stored in registers
– Operations are performed on the inputs
– Mux selects the functional unit (e.g., adder or shifter) whose output should be connected to “data out”
– Bit-sliced fashion: tile identical processing elements
– Adder is important: very common, and a lot of effort has been put into its optimization
– Data widths: 8, 12, 16, 32, 64, 128 bits
– Data type: integer, floating point
93
Building Blocks for Digital Architectures
Arithmetic unit
- Bit-sliced datapath (adder, multiplier, shifter, comparator, etc.)
Memory
- RAM, ROM, Buffers, Shift registers
Control
- Finite state machine (PLA, random logic)
- Counters
Interconnect
- Switches
- Arbiters
- Bus
94
An Intel Microprocessor
Itanium has 6 integer execution units like this
95
Bit-Sliced Design
96
Memory Classification
97
Memory Architecture - Decoders
98
Row/Column Memory Structure
99
Hierarchical Memory Structure
100
ROM Designs
Mask-Programmable - Set before fabrication
Field-Programmable - Fused connections
Electrically Programmable - Floating Gate Designs
101
Non Volatile Read/Write (NVRW) memories
– Same architecture as ROM structures
– A floating-gate transistor is used: similar to a traditional MOS transistor, except that an extra polysilicon strip (the floating gate) is inserted between the gate and the channel
– This allows the threshold voltage to be programmable
[Figure: Floating-Gate MOS Transistor (FAMOS); device cross-section (floating gate, gate, source, drain, n+ regions, p substrate, oxide thickness tox) and schematic symbol (G, S, D)]
102
Floating gate transistor programming
[Figure, three panels: (1) Avalanche injection (labels 20 V, 10 V, 5 V); (2) Removing programming voltage leaves charge trapped (labels 0 V, -5 V); (3) Programming results in higher VT (labels 5 V, -2.5 V)]
– Process is self-timing
– Effectively increases threshold voltage
– Floating gate is surrounded by an insulator material, which traps the electrons
103
Flash Electrically Erasable ROMs
[Figure: flash cell cross-section; control gate, floating gate, thin tunneling oxide, n+ source, n+ drain, p- substrate; erasure and programming paths]
To erase: ground the gate and apply 12 V at the source
104
Types of NV-RWM
EPROM (FAMOS)
– Program using avalanche hot-electron injection
– Erase using ultraviolet light
EEPROM (FLOTOX)
– Program & erase using Fowler-Nordheim tunneling
– Advantage: electrically erasable
Flash EPROM
– Program using avalanche hot-electron injection
– Erase using Fowler-Nordheim tunneling
105
Volatile Read-Write Memory (RAM)
Static
– Data stored as long as power is on
– Large (6 transistors/cell)
– Fast
– Differential
Dynamic
– Periodic refresh required
– Small (1-3 transistors/cell)
– Slower
– Single-ended
106
Static random access memory
107
6T SRAM Cell
– Cell size accounts for most of array size
• Reduce cell size at expense of complexity
– 6T SRAM cell
• Used in most commercial chips
• Data stored in cross-coupled inverters
– Read:
• Precharge bit, bit_b
• Raise wordline
– Write:
• Drive data onto bit, bit_b
108
109
SRAM cell
Six-transistor CMOS SRAM cell.
110
Reading
Assume the content of the memory is a 1, stored at Q (the _b suffix below denotes the complementary node or line). The read cycle is started by precharging both bit lines, BL and BL_b, to a logical 1, then asserting the word line WL, enabling both access transistors. The second step occurs when the values stored in Q and Q_b are transferred to the bit lines by leaving BL at its precharged value and discharging BL_b through M1 and M5 to a logical 0. On the BL side, the transistors M4 and M6 pull the bit line toward VDD, a logical 1. If the content of the memory was a 0, the opposite would happen: BL_b would be pulled toward 1 and BL toward 0.
111
Writing
A write cycle begins by applying the value to be written to the bit lines. To write a 0, we apply a 0 to the bit lines, i.e. set BL_b (the complementary bit line) to 1 and BL to 0. A 1 is written by inverting the values of the bit lines. WL is then asserted and the value that is to be stored is latched in. This works because the bit line input-drivers are designed to be much stronger than the relatively weak transistors in the cell itself, so that they can easily override the previous state of the cross-coupled inverters. Careful sizing of the transistors in an SRAM cell is needed to ensure proper operation.
112
Standby
If the word line is not asserted, the access transistors M5 and M6 disconnect the cell from the bit lines.
113
Memory Basics
RAM: Random Access Memory
– historically defined as memory array with individual bit access
– refers to memory with both Read and Write capabilities
ROM: Read Only Memory
– no capabilities for “online” memory Write operations
– Write typically requires high voltages or erasing by UV light
114
Static vs. Dynamic Memory
Volatility of Memory
– volatile memory loses data over time or when power is removed
• RAM is volatile
– non-volatile memory stores data even when power is removed
• ROM is non-volatile
Static vs. Dynamic Memory
– Static: holds data as long as power is applied (SRAM)
– Dynamic: must be refreshed periodically (DRAM)
115
DRAM Basics
DRAM = Dynamic Random Access Memory
– Dynamic: must be refreshed periodically
– Volatile: loses data when power is removed
Comparison to SRAM
– DRAM is smaller & less expensive per bit
– SRAM is faster
– DRAM requires more peripheral circuitry
116
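The One-Transistor Dynamic RAM
[Figure: 1T DRAM cell; access transistor M1, storage node X, storage capacitor Cs, word line WL, bit line BL, bit-line capacitance CBL; waveforms for WL, write “1” and read, with labels Vdd-Vt (node X), Vdd and Vdd/2 (BL), sensing]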
117
1T DRAM Cell
– single access nFET
– storage capacitor (referenced to VDD or Ground)
– control input: word line, WL
– data I/O: bit line
118
DRAM Operation
RAM data is held on the storage capacitor
– temporary
– due to leakage currents which drain charge
Charge Storage
– if Cs is charged to Vs, Qs = Cs Vs
• if Vs = 0, then Qs = 0: LOGIC 0
• if Vs = large, then Qs > 0: LOGIC 1
119
Write Operation
– turn on access transistor: WL = VDD
– apply voltage, Vd (high or low), to bit line
– Cs is charged (or discharged)
– if Vd = 0
• Vs = 0, Qs = 0, store logic 0
– if Vd = VDD
• Vs = VDD-Vtn, Qs = Cs (VDD-Vtn), logic 1
120
Hold Operation
– turn off access transistor: WL = 0
• charge held on Cs
121
Hold Time
During Hold, leakage currents will slowly discharge Cs
– due to leakage in the access transistor when it is OFF
– if IL is known, can determine discharge time
122
Hold Time, th
– max time voltage on Cs is high enough to be a logic 1
• = time to discharge from Vmax to V1
• if we estimate IL as a constant, th = Cs ΔVs / IL
– desire large hold time
• th increases with larger Cs and lower IL
• typical value, th = 50 μsec
– with IL = 1 nA, Cs = 50 fF, and ΔVs = 1 V
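With a constant leakage current, the hold time follows from I = C dV/dt as th = Cs·ΔVs / IL. The sketch below (the function name `hold_time` is illustrative) reproduces the typical value quoted above from the three given numbers:

```python
def hold_time(c_s, delta_v, i_leak):
    """Hold time in seconds, assuming a constant leakage current."""
    return c_s * delta_v / i_leak


# Values from the slide: Cs = 50 fF, delta Vs = 1 V, IL = 1 nA.
t_h = hold_time(c_s=50e-15, delta_v=1.0, i_leak=1e-9)
print("hold time = %.1f usec" % (t_h * 1e6))
```

Doubling Cs or halving IL doubles the hold time, which is the trend the slide describes.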
123
Refresh Rate
• DRAM is “Dynamic”: data is stored for only a short time
Refresh Operation
– to hold data as long as power is applied, data must be refreshed
– periodically read every cell
• amplify cell data
• rewrite data to cell
Refresh Rate, frefresh
– frequency at which cells must be refreshed to maintain data
– must include refresh circuitry in a DRAM circuit
124
Read Operation
– turn on access transistor
– charge on Cs is redistributed on the bit line capacitance, Cbit
– this will change the bit line voltage, Vbit
– which is amplified to read a 1 or 0
Charge Redistribution
– Read is destructive, so must refresh after read
– Leakage causes stored values to “disappear” → refresh periodically
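The bit line voltage after the access transistor turns on follows from charge conservation: the total charge Cs·Vs + Cbit·Vpre is shared over Cs + Cbit. In the sketch below only Cs = 50 fF comes from these slides; the bit line capacitance, stored voltage and precharge level are hypothetical values chosen to show why the result needs amplification.

```python
def bitline_voltage_after_read(c_s, v_s, c_bit, v_pre=0.0):
    """Final shared voltage once Cs and Cbit are connected (charge conservation)."""
    return (c_s * v_s + c_bit * v_pre) / (c_s + c_bit)


# Assumed values: Cbit = 500 fF, stored Vs = 2.5 V, bit line precharged to 0 V.
v_one = bitline_voltage_after_read(c_s=50e-15, v_s=2.5, c_bit=500e-15)
v_zero = bitline_voltage_after_read(c_s=50e-15, v_s=0.0, c_bit=500e-15)
print("stored 1 -> Vbit = %.3f V" % v_one)
print("stored 0 -> Vbit = %.3f V" % v_zero)
```

Because Cbit is much larger than Cs, the swing on the bit line is only a fraction of the stored voltage, and the cell's own voltage ends at the same small value, which is why the read is destructive.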
125
126
DRAM Physical Design
Physical design (layout) is CRITICAL in DRAM
– high density is required for commercial success
• must minimize area of the 1T DRAM cell
– typically only 30% of the chip is needed for peripherals (refresh, etc.)
• for DRAM in CMOS, must minimize area of storage capacitor
– but a large capacitor (> 40 fF) is good to increase hold time, th
127
How are DRAM cells manufactured?
Trench capacitor
128
129
The Three-Transistor Dynamic RAM
130
References
– John P. Uyemura, Introduction to VLSI Circuits and Systems, John Wiley & Sons Pvt Ltd
– Wayne Wolf, Modern VLSI Design, 3rd Edition, Pearson Education
– Neil H. E. Weste, David Harris, Ayan Banerjee, CMOS VLSI Design
131
For those who do ill to you, the best response is to return good to them.