UNIT V : SUBSYSTEM DESIGN


1 UNIT V : SUBSYSTEM DESIGN
VLSI Design 01/03/2009 VLSI Design

2 Subsystem Design
Goals of Custom Subsystem Design
– Maximize performance
– Minimize area
– Fit together with other subsystems
Key idea: optimize across levels of abstraction
– Layout
– Circuit
– Logic
– Register-Transfer and Higher

3 Optimizing for Performance and/or Area
Layout level
– Microscopic changes: move wires, change wire sizing, add vias, reduce source/drain capacitance, etc.
– Macroscopic changes: cell placement, design of hierarchy
Circuit level
– Transistor sizing
– Advanced circuits (e.g., dynamic logic)

4 Optimizing for Performance and/or Area (cont'd)
Logic level
– Use specialized designs (e.g., shifters, ALUs, etc.)
– Flatten to reduce delay
– Restructure
Register-transfer level (and above)
– Place latches/flip-flops to maximize performance (retiming)
– Encode FSMs to minimize area/delay
– Perform computations in parallel with extra hardware if cost permits
– Pipeline logic to increase performance

5 Pipelining
Key idea: partition a combinational function with latches / flip-flops
– Each partition is called a stage
– Time between each result: one clock period
– Latency: number of clock cycles before the result appears (== number of stages)

6 Subsystem Design Principles
[Figure: pipelining system with register outputs Q1 and Q2, input i1 and clock Φ]

7 Pipelining Comments
Impact on performance
– Increases operations per unit time
– Increases latency
– Added overhead due to register setup times
Design concerns / limits
– Balance stage delays for best performance
– Structure of logic may limit number of stages

8 A pipeline with unbalanced stage delays
The longest stage delay determines the clock period.

9 Effect of Adding Pipeline Stages

10 Latency, Clock Period and Throughput
Latency: the time taken to process one input datum from input to output = N x Tp (N = number of stages, Tp = clock period)
Clock period: Tp = total (unpipelined) processing time / N
Throughput: results per unit time = 1 / clock period
Example: 10,000 instructions, each needing 10 usec of processing
– Sequential: time per instruction = 10 usec, throughput = 0.1 M inst./sec
– Pipelined (N = 10 stages): clock period = 1 usec, latency = 10 usec, throughput = 1 M inst./sec
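
The relations on this slide can be checked numerically. The following is a minimal sketch, assuming an ideal pipeline with perfectly balanced stages and no register overhead; the function name `pipeline_metrics` and its parameters are illustrative and do not come from the slides.

```python
def pipeline_metrics(total_delay, n_stages, n_items):
    """Ideal pipeline model: equal stage delays, no register overhead (assumption)."""
    clock_period = total_delay / n_stages      # Tp = total processing time / N
    latency = n_stages * clock_period          # N x Tp
    throughput = 1.0 / clock_period            # one result per clock once the pipe is full
    # The first result appears after N cycles, then one more result every cycle.
    total_time = (n_stages + n_items - 1) * clock_period
    return clock_period, latency, throughput, total_time


# Example from the slide: 10 usec of logic split into 10 stages, 10,000 instructions.
tp, lat, thr, total = pipeline_metrics(total_delay=10e-6, n_stages=10, n_items=10_000)
print(f"Tp = {tp:.2e} s, latency = {lat:.2e} s, throughput = {thr:.2e} inst/s")
print(f"time for all instructions = {total:.4e} s")
```

Running the same workload without pipelining would take 10,000 x 10 usec, which is where the roughly tenfold gain comes from.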

11 Overview
1. Adders
2. Shifters
3. High Density Memory Elements

12 1. Adders
– Half Adder
– Full Adder
– Ripple Carry Adder
– Carry Look-Ahead Adder
– Carry Skip Adder (Carry Bypass Adder)
– Carry Select Adder

13 Binary Adder
Binary Addition
– single bit addition
– sum of 2 binary numbers can be larger than either number
– need a "carry-out" to store the overflow
Half-Adder
– 2 inputs (x and y) and 2 outputs (sum and carry)

14 Half-Adder Circuits
Most Basic Logic
– NAND-only and NOR-only circuits

15 Half-Adder Circuits NAND Implementation 01/03/2009 VLSI Design

16 Half-Adder Circuits NOR Implementation 01/03/2009 VLSI Design

17 Adder Design: Full Adder
Sum: si = ai XOR bi XOR ci
Carry: ci+1 = ai*bi + ai*ci + bi*ci
Truth table:
Ai Bi Ci | Si Ci+1
0  0  0  | 0  0
0  0  1  | 1  0
0  1  0  | 1  0
0  1  1  | 0  1
1  0  0  | 1  0
1  0  1  | 0  1
1  1  0  | 0  1
1  1  1  | 1  1
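
The two full-adder equations can be written directly as bitwise operations. This is a small behavioral sketch (the name `full_adder` is illustrative), not a gate-level or transistor-level model.

```python
def full_adder(a, b, c):
    """One-bit full adder: returns (sum, carry_out) for input bits a, b and carry-in c."""
    s = a ^ b ^ c                          # si = ai XOR bi XOR ci
    c_out = (a & b) | (a & c) | (b & c)    # ci+1 = ai*bi + ai*ci + bi*ci
    return s, c_out


# Enumerate the full truth table as rows of (a, b, c, sum, carry_out).
truth_table = [(a, b, c) + full_adder(a, b, c)
               for a in (0, 1) for b in (0, 1) for c in (0, 1)]
for row in truth_table:
    print(row)
```

Each row satisfies a + b + c = sum + 2 x carry_out, which is a quick way to sanity-check the table.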

18 When adding more than one bit, the carry of the previous bit must be considered
– full-adder has a "carry-in" input

19 The Binary Adder 01/03/2009 VLSI Design

20 Cascading Multi-bit Adders
Adding 2 binary (multi-bit) words
– adding two n-bit words produces an n-bit sum and a carry
– example: 4-bit addition
Carry Bits
– adding n bits produces a carry out as bit n+1
– can be used as carry-in for next stage or as an overflow flag
Cascading Multi-bit Adders
– carry-out from a binary word adder can be passed to next cell to add larger words
– example: 3 cascaded 4b binary adders for 12b addition

21 Ripple Carry Adder
To use single-bit full adders to add multi-bit words
– must apply carry-out from each bit addition to next bit addition
– essentially like adding 3 multi-bit words
• each ci is generated from the i-1 addition
Ripple-Carry Adder
– passes carry-out of each bit to carry-in of next bit
– for n-bit addition, requires n full adders

22 Ripple Carry Adder
Ripple: constructed from n full adders
– Compact, but delay proportional to n
– May be tolerable when n=8, BUT what about n=32?
– Potential worst cases (n=32): A0 or B0 to S31; A0 or B0 to C32
[Figure: 4-bit ripple-carry adder; stage i takes Ai, Bi and Ci and produces Si and Ci+1, for i = 0..3]
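
A ripple-carry adder is simply a chain of full adders in which each carry-out feeds the next carry-in. The sketch below is a behavioral model under that description; the helper names `to_bits` and `from_bits` are hypothetical conveniences, and bit lists are stored least significant bit first.

```python
def full_adder(a, b, c):
    """One-bit full adder: returns (sum, carry_out)."""
    return a ^ b ^ c, (a & b) | (a & c) | (b & c)


def ripple_carry_add(a_bits, b_bits, c0=0):
    """Add two equal-length bit lists (LSB first); the carry ripples through n full adders."""
    carry = c0
    sum_bits = []
    for a, b in zip(a_bits, b_bits):
        s, carry = full_adder(a, b, carry)   # carry-out of bit i is carry-in of bit i+1
        sum_bits.append(s)
    return sum_bits, carry


def to_bits(x, n):
    """Integer to list of n bits, LSB first."""
    return [(x >> i) & 1 for i in range(n)]


def from_bits(bits):
    """List of bits (LSB first) back to an integer."""
    return sum(bit << i for i, bit in enumerate(bits))


s, c4 = ripple_carry_add(to_bits(11, 4), to_bits(6, 4))
print(from_bits(s), c4)   # 4-bit sum and carry-out for 11 + 6
```

The loop makes the delay argument visible: the final carry cannot be known until every earlier stage has produced its carry, hence delay proportional to n.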

23 Complementary Static CMOS Full Adder
28 Transistors

24 Adder/Subtractor using R-C Adders
Subtraction using 2's complement
– 2's complement of X: X2s = X' + 1 (X' is the bitwise complement of X)
• invert and add 1
– Subtraction via addition: Y - X = Y + X2s
R-C Adder/Subtractor Cell
– control line, add_sub: 0 = add, 1 = subtract
– XOR used to pass (add_sub=0) or invert (add_sub=1) the X input
– setting the first carry-in, c0, to 1 adds the 1 for the 2's complement
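
The adder/subtractor idea can be sketched at the word level: XOR every bit of X with the add_sub control and feed the same control in as the first carry-in. The function name `adder_subtractor` and the default width of 4 bits are assumptions made for illustration.

```python
def adder_subtractor(y, x, add_sub, n=4):
    """n-bit adder/subtractor: add_sub = 0 computes y + x, add_sub = 1 computes y - x."""
    mask = (1 << n) - 1
    # XOR with all-ones inverts x (subtract); XOR with zero passes x unchanged (add).
    x_in = (x & mask) ^ (mask if add_sub else 0)
    # The control line doubles as the first carry-in c0, adding the 1 of the 2's complement.
    total = (y & mask) + x_in + add_sub
    return total & mask, (total >> n) & 1   # (n-bit result, carry-out)


print(adder_subtractor(9, 3, 0))   # 9 + 3
print(adder_subtractor(9, 3, 1))   # 9 - 3
print(adder_subtractor(3, 9, 1))   # 3 - 9, result is the 4-bit 2's complement of 6
```

Negative results come out in 2's-complement form, so 3 - 9 yields the 4-bit pattern for -6.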

25 Carry Look-Ahead Adder (CLA)
To avoid the linear growth of the carry delay, we use a Carry Look-Ahead Adder (CLA) in which the carries can be generated in parallel The Carry of each bit is generated from the propagate and the generate signals as well as the input carry 01/03/2009 VLSI Design

26 Carry Look-Ahead Adder

27 Carry Look-Ahead Adder

28 Speeding up Carry - Carry Lookahead
Key idea: trade off delay against the amount of logic used
– Benefit: faster addition
– Cost: much more logic
Define two signals for each adder stage:
– Generate gi = ai*bi
– Propagate pi = ai + bi
Why use these names?
– Adder i will always generate a carry if ai, bi are both true
– Adder i will propagate a carry input if either or both of ai, bi are true
[Figure: single adder stage with inputs A0, B0, C0 and outputs S0, C1]

29 Express Sum and Carry as a function of P, G, D
Define 3 new variables which ONLY depend on A, B
– Generate (G) = A•B
– Propagate (P) = A ⊕ B
– Delete (D) = A'•B' (where ' denotes complement)
Can also derive expressions for S and C based on D and P
Note that we will sometimes use an alternate definition: Propagate (P) = A + B

30 Carry Look ahead Trees Can continue building the tree hierarchically.

31 Carry Look Ahead Adder 01/03/2009 VLSI Design

32 CLA in reduced CMOS
Reduce previous CLA logic circuit
– construct CMOS push-pull network for each term
– expanded carry terms
• c1 = g0 + c0•p0
• c2 = g1 + g0•p1 + c0•p0•p1
• c3 = g2 + g1•p2 + g0•p1•p2 + c0•p0•p1•p2
• c4 = g3 + g2•p3 + g1•p2•p3 + g0•p1•p2•p3 + c0•p0•p1•p2•p3
– nFET network for each carry term
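
The expanded carry terms can be evaluated side by side, since each depends only on the g and p signals and on c0. The following is a behavioral sketch of a 4-bit carry look-ahead adder; the name `cla4` is illustrative, and the propagate signal uses the pi = ai + bi definition for the carries while the sum bits are formed with XOR.

```python
def cla4(a, b, c0=0):
    """4-bit carry look-ahead adder (behavioral sketch): returns (4-bit sum, c4)."""
    a_bits = [(a >> i) & 1 for i in range(4)]
    b_bits = [(b >> i) & 1 for i in range(4)]
    g = [x & y for x, y in zip(a_bits, b_bits)]   # generate  gi = ai*bi
    p = [x | y for x, y in zip(a_bits, b_bits)]   # propagate pi = ai + bi

    # Expanded carry terms: none of them waits for a previously computed carry.
    c1 = g[0] | (c0 & p[0])
    c2 = g[1] | (g[0] & p[1]) | (c0 & p[0] & p[1])
    c3 = g[2] | (g[1] & p[2]) | (g[0] & p[1] & p[2]) | (c0 & p[0] & p[1] & p[2])
    c4 = (g[3] | (g[2] & p[3]) | (g[1] & p[2] & p[3]) | (g[0] & p[1] & p[2] & p[3])
          | (c0 & p[0] & p[1] & p[2] & p[3]))

    carries = [c0, c1, c2, c3]
    total = 0
    for i in range(4):
        total |= (a_bits[i] ^ b_bits[i] ^ carries[i]) << i   # si = ai XOR bi XOR ci
    return total, c4


print(cla4(0b1011, 0b0110))
```

In hardware the benefit is that c1 to c4 are each a two-level expression, at the cost of gates with growing fan-in.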

33 16b Adder Using 4b CLA Blocks
Create SUMs from outputs of this circuit 01/03/2009 VLSI Design

34 Manchester Carry Chain
Goal: speed up carry using circuit design
Key idea: use dynamic logic
– Conditionally discharge using generate signal
– Connect successive bits using propagate signal
– Worst-case discharge path is through the entire chain
[Figure: one chain stage between complemented carries Ci-1' and Ci']

35 Manchester Carry Generation
Create switch-logic network for carry
– define carry in terms of controls where only one control is active at a given time
– consider single bit FA truth table
• generate gi = ai • bi
• propagate pi = ai ⊕ bi
• carry-kill ki = NOT(ai + bi)
only one of these is active for each case
– if generate = 1 then ci+1 = 1
– if propagate = 1 then ci+1 = ci
– if carry-kill = 1 then ci+1 = 0
Carry-kill: generated to cover cases not covered by gi and pi, where the carry will be zero
– (ai,bi)=(0,0): carry-kill is active, ci+1 = 0
– (ai,bi)=(1,1): generate is active, ci+1 = 1
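
The claim that exactly one of generate, propagate and carry-kill is active for every input pair is easy to verify exhaustively. The sketch below assumes the carry-kill signal is the NOR of the inputs, so that it is active only for (0,0); the function names are illustrative.

```python
def carry_controls(a, b):
    """Return the (generate, propagate, kill) controls for one bit position."""
    g = a & b            # generate: both inputs 1
    p = a ^ b            # propagate: exactly one input 1
    k = 1 - (a | b)      # carry-kill: both inputs 0 (NOR)
    return g, p, k


def next_carry(a, b, c):
    """Carry-out selected by whichever control is active."""
    g, p, k = carry_controls(a, b)
    if g:
        return 1         # generate forces the carry to 1
    if p:
        return c         # propagate passes the incoming carry
    return 0             # kill forces the carry to 0


for a in (0, 1):
    for b in (0, 1):
        print((a, b), carry_controls(a, b))
```

Because the three controls are mutually exclusive, each one can drive its own switch in the carry chain without contention.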

36 Carry-Skip Adder
Quickly generate a carry under certain conditions and skip the carry-generation block
• recall ci+1 = gi + ci•pi
• note generation of gi is more complex than generation of pi
– so, generate pi and check the ci•pi case; skip gi generation if ci•pi = 1

37 A carry-skip adder is designed to speed up a wide adder by aiding the propagation of a carry bit around a portion of the entire adder.

38 Carry/Skip Adder
Key idea: in some cases, carry out = carry in
– Bypass gate "forwards" the carry signal
– Break forwarding into groups
[Figure: three groups of three full adders (FA), each group with a skip path controlled by its propagate signal Pi-1, Pi, Pi+1]

39 Carry-Bypass Adder(Carry-Skip Adder)

40 Carry-Skip Adder (cont.)
tadder = tsetup + Mtcarry + (N/M-1)tbypass + (M-1)tcarry + tsum 01/03/2009 VLSI Design

41 Carry-Skip Adder (cont.): delay terms
tsetup: the fixed overhead time to create the generate and propagate signals
tcarry: the propagation delay through a single bit. The worst-case carry-propagation delay through a single stage of M bits is approximately M times larger
tbypass: the propagation delay through the bypass multiplexer of a single stage
tsum: the time to generate the sum of the final stage
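
To get a feel for the carry-skip delay expression, it can be evaluated for sample values. This sketch assumes an N-bit adder divided into N/M stages of M bits each, and compares against a ripple-carry estimate of tsetup + (N-1)tcarry + tsum; that ripple formula and the unit delays used below are assumptions for illustration, not values from the slides.

```python
def carry_skip_delay(n, m, t_setup, t_carry, t_bypass, t_sum):
    """tadder = tsetup + M*tcarry + (N/M - 1)*tbypass + (M - 1)*tcarry + tsum."""
    if n % m != 0:
        raise ValueError("n must be a multiple of the stage width m")
    return (t_setup + m * t_carry + (n // m - 1) * t_bypass
            + (m - 1) * t_carry + t_sum)


def ripple_delay(n, t_setup, t_carry, t_sum):
    """Assumed ripple-carry estimate: the carry passes through n - 1 bit positions."""
    return t_setup + (n - 1) * t_carry + t_sum


# Hypothetical unit delays, chosen only to compare the trends.
for n in (8, 16, 32, 64):
    skip = carry_skip_delay(n, 4, t_setup=1, t_carry=1, t_bypass=1, t_sum=1)
    ripple = ripple_delay(n, t_setup=1, t_carry=1, t_sum=1)
    print(n, skip, ripple)
```

Both delays grow linearly with N, but the bypass adder has the smaller slope (tbypass/M per bit instead of tcarry), so it wins for wider adders.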

42 Carry Ripple versus Carry Bypass

43 Carry-Select Adder
– uses multiple adder blocks to increase speed
– takes a lot of chip area

44 Carry-Select Adder 01/03/2009 VLSI Design

45 Carry/Select Adder
Key idea: compute two sum values
– sum assuming carry=0, sum assuming carry=1
– select the proper result based on the carry in
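
The select idea can be modeled block by block: both candidate sums exist before the real carry arrives, and the carry only steers a multiplexer. This is a behavioral sketch; the name `carry_select_add`, the 16-bit width and the 4-bit block size are illustrative assumptions.

```python
def carry_select_add(a, b, n=16, block=4, c_in=0):
    """n-bit carry-select adder model built from `block`-bit sections."""
    mask = (1 << block) - 1
    result, carry = 0, c_in
    for shift in range(0, n, block):
        a_blk = (a >> shift) & mask
        b_blk = (b >> shift) & mask
        sum0 = a_blk + b_blk          # candidate result assuming carry-in = 0
        sum1 = a_blk + b_blk + 1      # candidate result assuming carry-in = 1
        chosen = sum1 if carry else sum0   # multiplexer driven by the actual carry
        result |= (chosen & mask) << shift
        carry = chosen >> block            # carry-out of this block
    return result, carry


print(carry_select_add(0x1234, 0x0FCD))
print(carry_select_add(0xFFFF, 0x0001))
```

In hardware the two candidates per block are computed in parallel by duplicated adders, which is the source of the extra area.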

46 Carry-Select Adder tSetup : the fixed overhead time to create the generate and propagate signals 01/03/2009 VLSI Design

47 Carry Select Adder: Critical Path

48 Carry-Save Adder Parallel FA, 3 inputs and 2 outputs
– does not add carry-out to next bit (thus no ripple) • carry is saved for use by other blocks – useful for adding more than 2 numbers 01/03/2009 VLSI Design

49 Carry Save Addition
– A full adder sums 3 inputs and produces 2 outputs
– The carry output has twice the weight of the sum output
– N full adders in parallel are called a carry-save adder
– Produces N sums and N carry outs

50 CSA Application
– Use k-2 stages of CSAs (to add k operands)
– Keep result in carry-save redundant form
– Final CPA (carry-propagate adder) computes actual result
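
The multi-operand use of carry-save adders can be sketched with Python integers standing in for bit vectors. Each CSA stage reduces three operands to a sum word and a carry word without propagating carries; only the last step performs a normal addition. The names `carry_save_add` and `add_many` are illustrative, and the model assumes non-negative operands of unbounded width.

```python
def carry_save_add(x, y, z):
    """One CSA stage: bitwise full adders, with carries saved rather than rippled."""
    s = x ^ y ^ z                               # per-bit sum
    c = ((x & y) | (x & z) | (y & z)) << 1      # per-bit carry, twice the weight of the sum
    return s, c


def add_many(operands):
    """Add k >= 3 operands using k - 2 CSA stages followed by one carry-propagate add."""
    if len(operands) < 3:
        raise ValueError("need at least three operands")
    s, c = carry_save_add(operands[0], operands[1], operands[2])
    stages = 1
    for op in operands[3:]:
        s, c = carry_save_add(s, c, op)         # result stays in redundant (s, c) form
        stages += 1
    return s + c, stages                        # final CPA resolves the redundant form


print(add_many([5, 9, 12, 7]))
```

The invariant at every stage is that s + c equals the sum of the operands consumed so far.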

52 Multipliers 01/03/2009 VLSI Design

53 Multipliers 01/03/2009 VLSI Design

54 Multipliers 01/03/2009 VLSI Design

55 Multipliers 01/03/2009 VLSI Design

56 Register-Based Multiplier

57 Array Multipliers 01/03/2009 VLSI Design

58 Array Multipliers 01/03/2009 VLSI Design

59 Array Multipliers 01/03/2009 VLSI Design

60 Booth Multiplier 01/03/2009 VLSI Design

61 Booth Multiplier 01/03/2009 VLSI Design

62 Booth Multiplier
Structure of a Booth multiplier

63 Wallace Tree Multiplier

64 Wallace Tree Multiplier
Wallace tree multiplication 01/03/2009 VLSI Design

65 Wallace Tree Multiplier

66 Serial Multiplication
Serial multiplier
1. Requires MN clock cycles to produce a product for an N-bit multiplier and an M-bit multiplicand

67 Serial Multiplication
Serial/parallel multiplier
1. Requires M+N clock cycles to produce a product for an N-bit multiplier and an M-bit multiplicand
2. The critical path consists of the adders

68 Registers
Basic Register Function
– store a byte of data
– implement data movement functions such as shift, rotate
– basis for other functions
• counter/timer
Basic Register Circuit
– cascade of DFF cells
– additional logic to multiplex multiple inputs/outputs
– typical I/O options
• parallel load
• load from left/right cell (shift)
• parallel output

69 Shifter Design
Why shift?
– Arithmetic operations
– Floating-point
– Bit field extraction
Shift register: one shift per clock cycle
Hardware shifters: implemented as combinational logic
– Single-bit shifters
– Barrel shifters
– Logarithmic shifters

70 Shift and Rotate Operations
• Rotate
– move each bit of data to an adjacent bit
– roll end bit to other end
• Shift
– load '0' into the open end bit
Examples: 4b operations on data a3 a2 a1 a0
– Rotate Left: output = a2 a1 a0 a3
– Rotate Right: output = a0 a3 a2 a1
– Shift Left: output = a2 a1 a0 0
– Shift Right: output = 0 a3 a2 a1
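
The four single-bit operations can be reproduced on a list of symbolic bit names, written most significant bit first as on the slide. This is only an illustrative sketch; the function names are not from the source.

```python
def rotate_left(bits):
    """Move every bit one place toward the MSB; the old MSB rolls around to the LSB."""
    return bits[1:] + bits[:1]


def rotate_right(bits):
    """Move every bit one place toward the LSB; the old LSB rolls around to the MSB."""
    return bits[-1:] + bits[:-1]


def shift_left(bits):
    """Like rotate_left, but a '0' is loaded into the vacated LSB."""
    return bits[1:] + ['0']


def shift_right(bits):
    """Like rotate_right, but a '0' is loaded into the vacated MSB."""
    return ['0'] + bits[:-1]


word = ['a3', 'a2', 'a1', 'a0']   # MSB first
for op in (rotate_left, rotate_right, shift_left, shift_right):
    print(op.__name__, ' '.join(op(word)))
```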

71 Switch Shift/Rotate Circuits
Can use switch circuits to implement fast multi-bit shift/rotate functions
– will not store/hold data since no FF is used, and is not synchronous
– Example: 4-bit Left Rotate Switching Array
• Rotate Left moves lower bits to higher bits
• only one select (Rol_x) is active at a time

72 Left-Rotate switching Array

73 Design of a 4X4 Barrel-Shifter
Any general purpose n-bit shifter should be able to shift incoming data by up to (n - 1) places in a right-shift or left-shift direction. If we now further specify that all shifts should be on an 'end-around' basis, so that any bit shifted out at one end of a data word will be shifted in at the other end of the word, then the problem of right shift or left shift is greatly eased. For a 4-bit word, a 1-bit shift right is equivalent to a 3-bit shift left and a 2-bit shift right is equivalent to a 2-bit shift left. Thus we can achieve a capability to shift left or right by zero, one, two, or three places by designing a circuit which will shift right only (say) by zero, one, two, or three places.

74 Design of a 4X4 Barrel-Shifter
Data could be loaded from the output of the ALU and shifting effected; then the outputs of each stage of the shift register would provide the required parallel output to be returned to the register array. 01/03/2009 VLSI Design

75 Design of a 4X4 Barrel-Shifter
In this case, the shifter must have: input from a four-line parallel data bus; four output lines for the shifted data; means of transferring input data to output lines with any shift from zero to three bits inclusive. 01/03/2009 VLSI Design

76 Design of a 4X4 Barrel-Shifter
The natural switching elements are the switch-like MOS pass transistor and the transmission gate. A strategy must be decided for the direction of data and control signal flow, and the approach adopted should make this feasible. Remember that the overall strategy in this case is for data to flow horizontally and control signals vertically.

77 Design of a 4X4 Barrel-Shifter
A solution which meets these requirements emerges from the days of switch and relay contact based switching networks — the crossbar switch. Consider a direct MOS switch implementation of a 4 x 4 crossbar switch. The arrangement is quite general and may be readily expanded to accommodate n-bit inputs/outputs. In fact, this arrangement is an overkill in that any input line can be connected to any or all output lines — if all switches are closed, then all inputs are connected to all outputs in one glorious short circuit. Furthermore, 16 control signals (sw00- sw15), one for each transistor switch, must be provided to drive the crossbar switch, and such complexity is highly undesirable. An adaptation of this arrangement recognizes the fact that we can couple the switch gates together in groups of four (in this case) and also form four separate groups corresponding to shifts of zero, one, two and three bits. The arrangement is readily adapted so that the in-lines also run horizontally (to conform to the required strategy). 01/03/2009 VLSI Design

79 Design of a 4X4 Barrel-Shifter
The resulting arrangement is known as a barrel shifter (the 4 x 4-bit barrel shifter circuit diagram is not reproduced here). The interbus switches have their gate inputs connected in a staircase fashion in groups of four, and there are now four shift control inputs which must be mutually exclusive in the active state. The structure of the barrel shifter is clearly one of high regularity and generality and it may be readily represented in stick diagram form.
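
A behavioral model makes the role of the four mutually exclusive shift controls concrete. In this sketch each control line closes one diagonal group of switches, connecting every output bit to the input bit the chosen number of places away on an end-around basis. The function names and the convention that index 0 is the least significant bit are assumptions made for illustration.

```python
def barrel_shift_right(data, controls):
    """End-around right shift of an n-bit word selected by one-hot controls.

    data[i] is bit i (index 0 = LSB); controls[k] = 1 selects a shift of k places.
    """
    n = len(data)
    if len(controls) != n or sum(controls) != 1:
        raise ValueError("exactly one shift control must be active")
    out = []
    for i in range(n):
        bit = 0
        for k in range(n):
            # Switch (i, k) connects input bit (i + k) mod n to output bit i when closed.
            bit |= controls[k] & data[(i + k) % n]
        out.append(bit)
    return out


def rotate_left(data, j):
    """Reference end-around left shift by j places, used only for comparison."""
    n = len(data)
    return [data[(i - j) % n] for i in range(n)]


word = [1, 0, 0, 0]                                   # only bit 0 is set
print(barrel_shift_right(word, [0, 1, 0, 0]))         # right by 1: bit 0 wraps to bit 3
print(rotate_left(word, 3))                           # left by 3 gives the same word
```

The comparison with `rotate_left` reflects the equivalence noted earlier: with end-around shifts, a right-only shifter also covers every left shift.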

80 Barrel Shifter
Shifts m inputs into n outputs
– typically n = m or n = m/2
• Example: 8x4 barrel shifter
– outputs 1 of 4 combinations of 4 adjacent bits

81 Barrel Shifter 8x4 nMOS switch barrel shifter 01/03/2009 VLSI Design

82 The Barrel Shifter
Area Dominated by Wiring

83 4x4 barrel shifter 01/03/2009 VLSI Design

85 Design of a 4X4 Barrel-Shifter
The stick diagram clearly conveys regular topology and allows the choice of a standard cell from which complete barrel shifters of any size may be formed by replication of the standard cell. It should be noted that standard cell boundaries must be carefully chosen to allow for butting together side by side and top to bottom to retain the overall topology. Once the standard cell dimensions have been determined, then any n x n barrel shifter may be configured and its outline, or bounding box, arrived at by summing up the dimensions of the replicated standard cell.

88 Observations
The approach to the design of a system and of a particular subsystem may be set out as the following steps:
1) Set out a specification together with an architectural block diagram.
2) Suitably partition the architecture into subsystems which are, as far as possible, self-contained and which give as simple interconnection requirements as possible.
3) Set out a tentative floor plan showing the proposed physical disposition of subsystems on the chip.
4) Determine interconnection strategy.
5) Revise 2, 3 and 4 interactively as necessary.
6) Choose layers on which to run buses and the main control signals.
7) Take each subsystem in turn and conceive a regular architecture to conform to the strategy set out in 4. Set out circuit and/or logic diagrams as appropriate. Remember that MOS switch-based logic is such that both the logic 1 and logic 0 conditions of an output must be deliberately satisfied (not as in TTL logic, where if logic 1 conditions are satisfied then logic 0 conditions follow automatically).
8) Develop stick diagrams adopting suitable tactics to observe the overall strategy (4) and choice of layers (6). Determine suitable standard cell(s) from which the subsystem may be formed.

89 A Generic Digital Processor

90 Architecture of a CPU
[Figure: block diagram with Control, Memory, Register File and Data path blocks; the control unit drives function and read/write signals, and the data path returns flags (overflow, zero, etc.)]
– This is a very simple CPU
– The control unit is a state machine. It schedules the operations and synchronizes the different components
– Memory could be RAM, ROM, Content-Addressable Memory (CAM)
– The data path is the manipulator of data. It performs arithmetic and logic operations
– I/O is not shown

91 Arithmetic and Logic Unit (ALU)
Functions
– Arithmetic (add, sub, inc, dec)
– Logic (and, or, not, xor)
– Comparison (<, >, <=, >=, !=)
Control signals
– Function selection
– Operation mode (signed, unsigned)
Output
– Operation result (data)
– Flags (overflow, zero, negative)

92 Simple ALU Example
[Figure: bit-sliced datapath (Bit 3 to Bit 0) with Register, Adder, Shifter and Multiplexer columns between Data in and Data out, under a common Control; identical processing elements are tiled]
– The inputs are stored in registers
– Operations are performed on the inputs
– Mux selects the functional unit (e.g., adder or shifter) whose output should be connected to "data out"
– Bit-sliced fashion
– The adder is important: it is very common, and a lot of effort has been put into its optimization
– Data widths: 8, 12, 16, 32, 64, 128 bits
– Data type: integer, floating point
[© Prentice Hall]

93 Building Blocks for Digital Architectures
Arithmetic unit
- Bit-sliced datapath (adder, multiplier, shifter, comparator, etc.)
Memory
- RAM, ROM, Buffers, Shift registers
Control
- Finite state machine (PLA, random logic)
- Counters
Interconnect
- Switches
- Arbiters
- Bus

94 An Intel Microprocessor
Itanium has 6 integer execution units like this 01/03/2009 VLSI Design

95 Bit-Sliced Design 01/03/2009 VLSI Design

96 Memory Classification

97 Memory Architecture - Decoders

98 Row/Column Memory Structure

99 Hierarchical Memory Structure

100 ROM Designs
Mask-Programmable
- Set before fabrication
Field-Programmable
- Fused connections
Electrically Programmable
- Floating Gate Designs

101 Non Volatile Read/Write (NVRW) memories
– Same architecture as ROM structures
– A floating-gate transistor is used: similar to a traditional MOS transistor, except that an extra polysilicon strip is inserted between the gate and the channel to allow the threshold voltage to be programmable
[Figure: Floating-Gate MOS Transistor (FAMOS): device cross-section (source, drain, gate, floating gate, oxide thickness tox, n+ regions, p substrate) and schematic symbol (G, S, D)]

102 Floating gate transistor programming
[Figure: three programming steps with labelled terminal voltages: avalanche injection (20 V, 10 V, 5 V); removing the programming voltage leaves charge trapped (0 V, -5 V); programming results in higher VT (5 V, -2.5 V)]
– Process is self-timing
– Effectively increases threshold voltage
– Floating gate is surrounded by an insulator material, which traps the electrons

103 Flash Electrically Erasable ROMs
[Figure: flash cell cross-section with control gate, floating gate, thin tunneling oxide, n+ source, n+ drain and p-substrate; programming and erasure paths marked]
To erase: ground the gate and apply 12 V at the source

104 Types of NV-RWM
EPROM (FAMOS)
– Program using avalanche hot-electron injection
– Erase using ultraviolet light
EEPROM (FLOTOX)
– Program and erase using Fowler-Nordheim tunneling
– Advantage: electrically erasable
Flash EPROM
– Program using avalanche hot-electron injection
– Erase using Fowler-Nordheim tunneling

105 Volatile Read-Write Memory (RAM)
Static
– Data stored as long as power is on
– Large (6 transistors/cell)
– Fast
– Differential
Dynamic
– Periodic refresh required
– Small (1-3 transistors/cell)
– Slower
– Single-ended

106 Static random access memory

107 6T SRAM Cell
– Cell size accounts for most of array size
– Reduce cell size at expense of complexity
– 6T SRAM cell is used in most commercial chips
– Data stored in cross-coupled inverters
Read:
– Precharge bit, bit_b
– Raise wordline
Write:
– Drive data onto bit, bit_b

109 SRAM cell M3 six-transistor CMOS SRAM cell. 01/03/2009 VLSI Design

110 Reading
Assume that the content of the memory is a 1, stored at Q (Q_b and BL_b below denote the complements of Q and BL). The read cycle is started by precharging both bit lines to a logical 1, then asserting the word line WL, enabling both access transistors. The second step occurs when the values stored in Q and Q_b are transferred to the bit lines by leaving BL at its precharged value and discharging BL_b through M1 and M5 to a logical 0. On the BL side, the transistors M4 and M6 pull the bit line toward VDD, a logical 1. If the content of the memory was a 0, the opposite would happen and BL_b would be pulled toward 1 and BL toward 0.

111 Writing
A write cycle begins by applying the value to be written to the bit lines. To write a 0, we apply a 0 to the bit lines, i.e. we set BL_b (the complement bit line) to 1 and BL to 0. A 1 is written by inverting the values of the bit lines. WL is then asserted and the value that is to be stored is latched in. This works because the bit line input-drivers are designed to be much stronger than the relatively weak transistors in the cell itself, so that they can easily override the previous state of the cross-coupled inverters. Careful sizing of the transistors in an SRAM cell is needed to ensure proper operation.

112 Standby If the word line is not asserted, the access transistors M5 and M6 disconnect the cell from the bit lines. 01/03/2009 VLSI Design

113 Memory Basics
RAM: Random Access Memory
– historically defined as memory array with individual bit access
– refers to memory with both Read and Write capabilities
ROM: Read Only Memory
– no capabilities for "online" memory Write operations
– Write typically requires high voltages or erasing by UV light

114 Static vs. Dynamic Memory
Volatility of Memory
– volatile memory loses data over time or when power is removed
• RAM is volatile
– non-volatile memory stores data even when power is removed
• ROM is non-volatile
Static vs. Dynamic Memory
– Static: holds data as long as power is applied (SRAM)
– Dynamic: must be refreshed periodically (DRAM)

115 DRAM Basics
DRAM = Dynamic Random Access Memory
– Dynamic: must be refreshed periodically
– Volatile: loses data when power is removed
Comparison to SRAM
– DRAM is smaller and less expensive per bit
– SRAM is faster
– DRAM requires more peripheral circuitry

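116 The One-Transistor Dynamic RAM
[Figure: 1T DRAM cell with access transistor M1, storage node X, storage capacitor Cs, bit-line capacitance CBL, word line WL and bit line BL; waveforms show node X reaching Vdd-Vt when writing a "1" and BL at Vdd/2 for sensing during a read]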

117 1T DRAM Cell
– single access nFET
– storage capacitor (referenced to VDD or Ground)
– control input: word line, WL
– data I/O: bit line

118 DRAM Operation
RAM data is held on the storage capacitor
– temporary
– due to leakage currents which drain charge
Charge Storage
– if Cs is charged to Vs
– Qs = Cs Vs
• if Vs = 0, then Qs = 0: LOGIC 0
• if Vs = large, then Qs > 0: LOGIC 1

119 Write Operation
– turn on access transistor: WL = VDD
– apply voltage, Vd (high or low), to bit line
– Cs is charged (or discharged)
– if Vd = 0
• Vs = 0, Qs = 0, store logic 0
– if Vd = VDD
• Vs = VDD-Vtn, Qs = Cs (VDD-Vtn), store logic 1

120 Hold Operation
– turn off access transistor: WL = 0
• charge held on Cs

121 Hold Time
During Hold, leakage currents will slowly discharge Cs
– due to leakage in the access transistor when it is OFF
– if IL is known, can determine discharge time

122 Hold Time, th
– max time voltage on Cs is high enough to be a logic 1
• = time to discharge from Vmax to V1
– if we estimate IL as a constant, th = Cs ΔVs / IL
• desire large hold time
• th increases with larger Cs and lower IL
• typical value, th = 50 μsec
– with IL = 1 nA, Cs = 50 fF, and ΔVs = 1 V
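
The typical value quoted on this slide follows from treating the leakage current as constant, so that the capacitor voltage falls linearly and the hold time is Cs times the allowed voltage drop divided by IL. The sketch below simply evaluates that relation; the function name `hold_time` is illustrative.

```python
def hold_time(c_s, delta_v, i_leak):
    """Hold time for a constant leakage current: th = Cs * dVs / IL (SI units)."""
    return c_s * delta_v / i_leak


# Values from the slide: IL = 1 nA, Cs = 50 fF, dVs = 1 V.
th = hold_time(c_s=50e-15, delta_v=1.0, i_leak=1e-9)
print(f"th = {th * 1e6:.1f} usec")

# Doubling Cs or halving IL doubles the hold time.
print(hold_time(100e-15, 1.0, 1e-9) / th, hold_time(50e-15, 1.0, 0.5e-9) / th)
```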

123 Refresh Rate
• DRAM is "Dynamic": data is stored for only a short time
Refresh Operation
– to hold data as long as power is applied, data must be refreshed
– periodically read every cell
• amplify cell data
• rewrite data to cell
Refresh Rate, frefresh
– frequency at which cells must be refreshed to maintain data
– must include refresh circuitry in a DRAM circuit

124 Read Operation
– turn on access transistor
– charge on Cs is redistributed on the bit line capacitance, Cbit
– this will change the bit line voltage, Vbit
– which is amplified to read a 1 or 0
Charge Redistribution
– Read is destructive, so must refresh after read
– Leakage causes stored values to "disappear", so refresh periodically
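
The size of the bit-line voltage change during a read follows from charge conservation when Cs is connected to Cbit. The sketch below assumes an ideal switch and a bit line precharged to `v_pre` (0 V by default); the capacitor and voltage values are hypothetical and chosen only to show how small the read signal is.

```python
def bitline_voltage(c_s, v_s, c_bit, v_pre=0.0):
    """Final shared voltage after Cs (at v_s) is connected to Cbit (precharged to v_pre)."""
    # Total charge before sharing equals total charge after sharing.
    return (c_s * v_s + c_bit * v_pre) / (c_s + c_bit)


# Hypothetical values: Cs = 50 fF storing 1 V, Cbit = 450 fF precharged to 0 V.
v_f = bitline_voltage(c_s=50e-15, v_s=1.0, c_bit=450e-15)
print(f"bit-line voltage after read = {v_f:.3f} V")
```

Because Cbit is usually much larger than Cs, the stored level is attenuated heavily, which is why a sense amplifier is needed and why the cell must be rewritten after each read.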

126 DRAM Physical Design
Physical design (layout) is CRITICAL in DRAM
– high density is required for commercial success
• Must minimize area of the 1T DRAM cell
– typically only 30% of the chip is needed for peripherals (refresh, etc.)
• For DRAM in CMOS, must minimize area of storage capacitor
– but a large capacitor (> 40 fF) is good to increase hold time, th

127 How are DRAM cells manufactured?
Trench capacitor

129 The Three-Transistor Dynamic RAM

130 References
– John P. Uyemura, Introduction to VLSI Circuits and Systems, John Wiley & Sons Pvt Ltd
– Wayne Wolf, Modern VLSI Design, 3rd Edition, Pearson Education
– Neil H. E. Weste, David Harris, Ayan Banerjee, CMOS VLSI Design

131 For those who do ill to you ,the best response is to return good to them….
--- 01/03/2009 VLSI Design

