CMOS VLSI Design Chapter 12 Memory

1 CMOS VLSI Design Chapter 12 Memory
This work is protected by Canadian copyright laws and is provided solely for the use of instructors in teaching their courses and assessing student learning. Dissemination or sale of any part of this work (including on the Internet) will destroy the integrity of the work and is not permitted. The copyright holder grants permission to instructors who have adopted the textbook accompanying this work to post this material online only if the use of the website is restricted by access codes to students in the instructor's class that is using the textbook and provided the reproduced material bears this copyright notice.
Slides from David Harris, adapted by Duncan Elliott
Textbook: CMOS VLSI Design: A Circuits and Systems Perspective, 4th Edition, N. H. E. Weste & D. Harris
CH 12 Memory

2 Outline
Memory Arrays
SRAM Architecture
SRAM Cell
Decoders
Column Circuitry
Multiple Ports
CAMs

3 Memory Arrays

4 Array Architecture
$2^n$ words of $2^m$ bits each
If n >> m, fold by $2^k$ into fewer rows of more columns
Good regularity: easy to design
Very high density if good cells are used
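
The folding rule above can be checked with a few lines of arithmetic. This is a minimal sketch; the function name `fold_array` is an assumption made for illustration and does not come from the slides.

```python
def fold_array(n: int, m: int, k: int) -> tuple[int, int]:
    """Return (rows, columns) after folding 2**n words of 2**m bits by 2**k."""
    if not 0 <= k <= n:
        raise ValueError("k must satisfy 0 <= k <= n")
    rows = 2 ** (n - k)      # fewer rows
    columns = 2 ** (m + k)   # more columns: 2**k words share each row
    return rows, columns


# 2**11 words of 2**4 bits each, folded by 2**3
print(fold_array(11, 4, 3))  # (256, 128)
```

Folding leaves the total bit count unchanged; it only trades rows for columns to improve the aspect ratio.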

5 12T SRAM Cell
Basic building block: SRAM cell
Holds one bit of information, like a latch
Must be read and written
12-transistor (12T) SRAM cell
Uses a simple latch connected to bitline
46 x 75 λ unit cell

6 SRAM
Large area (6T with N-well)
Compatible with logic CMOS process
Higher static power
Volatile
Fast

7 6T SRAM Cell
Cell size accounts for most of array size
Reduce cell size at expense of complexity
6T SRAM cell
Used in most commercial chips
Data stored in cross-coupled inverters
Read: precharge bit, bit_b, then raise wordline
Write: drive data onto bit, bit_b, then raise wordline

8 SRAM Read
Precharge both bitlines high
Then turn on wordline
One of the two bitlines will be pulled down by the cell
Ex: A = 0, A_b = 1
bit discharges, bit_b stays high
But A bumps up slightly
Read stability: A must not flip
N1 >> N2

9 SRAM Write
Drive one bitline high, the other low
Then turn on wordline
Bitlines overpower cell with new value
Ex: A = 0, A_b = 1, bit = 1, bit_b = 0
Force A_b low, then A rises high
Writability: must overpower feedback inverter
N2 >> P1

10 SRAM Sizing
High bitlines must not overpower inverters during reads
But low bitlines must write new value into cell

11 SRAM Column Example: Read, Write

12 SRAM Layout
Cell size is critical: 26 x 45 λ (even smaller in industry)
Tile cells sharing VDD, GND, bitline contacts

14 Thin Cell SRAM
In nanometer CMOS:
Avoid bends in polysilicon and diffusion
Orient all transistors in one direction
A lithographically friendly, or thin cell, layout meets both requirements
Also reduces length and capacitance of bitlines
[R W Mann]

15 Commercial SRAMs
Five generations of Intel SRAM cell micrographs
Transition to thin cell at 65 nm
Steady scaling of cell area

16 Decoders
An n:$2^n$ decoder consists of $2^n$ n-input AND gates
One needed for each row of memory
Build AND from NAND or NOR gates
Static CMOS
Pseudo-nMOS
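
A behavioral sketch of the n:2^n decoder described above, with one n-input AND per row. The function name `decode` and the LSB-first bit ordering are assumptions made for illustration; this models logic only, not the static CMOS or pseudo-nMOS circuits.

```python
def decode(address_bits: list[int]) -> list[int]:
    """Return 2**n one-hot wordlines for n address bits (address_bits[0] is the LSB)."""
    n = len(address_bits)
    wordlines = []
    for row in range(2 ** n):
        # Each row's AND gate sees every address bit, either true or complemented,
        # depending on the corresponding bit of the row number.
        inputs = [
            bit if (row >> i) & 1 else 1 - bit
            for i, bit in enumerate(address_bits)
        ]
        wordlines.append(int(all(inputs)))
    return wordlines


print(decode([1, 0]))  # address 1 -> [0, 1, 0, 0]
```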

17 Decoder Layout
Decoders must be pitch-matched to SRAM cell
Requires very skinny gates

18 Large Decoders
For n > 4, NAND gates become slow
Break large gates into multiple smaller gates

19 Predecoding
Many of these gates are redundant
Factor out common gates into predecoder
Saves area
Same path effort
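
To illustrate factoring common gates into a predecoder, here is a behavioral sketch of a 4-bit decoder built from two 2:4 predecoders followed by sixteen 2-input ANDs. The 4-bit width, the grouping into bit pairs, and the names `predecode_pair` and `decode4` are assumptions chosen for the example.

```python
from itertools import product


def predecode_pair(a1: int, a0: int) -> list[int]:
    """2:4 predecoder: one-hot over the four combinations of two address bits."""
    return [int((a1, a0) == (hi, lo)) for hi, lo in product((0, 1), repeat=2)]


def decode4(a3: int, a2: int, a1: int, a0: int) -> list[int]:
    """4:16 decoder: each wordline ANDs one upper and one lower predecoded line."""
    upper = predecode_pair(a3, a2)  # shared by four wordlines each
    lower = predecode_pair(a1, a0)  # shared by four wordlines each
    return [u & l for u in upper for l in lower]


wordlines = decode4(1, 0, 1, 1)
print(wordlines.index(1))  # 11
```

In this sketch the sixteen 4-input gates of a direct decoder become eight 2-input predecode gates plus sixteen 2-input final gates, because each predecoded line is reused by several rows.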

22 Column Circuitry
Some circuitry is required for each column
Bitline conditioning
Sense amplifiers
Column decoding

23 Bitline Conditioning
Precharge bitlines high before reads
Equalize bitlines to minimize voltage difference when using sense amplifiers

24 Sense Amplifiers
Bitlines have many cells attached
Ex: 32-kbit SRAM has 128 rows x 256 cols
128 cells on each bitline
$t_{pd} \propto (C/I)\,\Delta V$
Even with shared diffusion contacts, 64C of diffusion capacitance (big C)
Discharged slowly through small transistors (small I)
Sense amplifiers are triggered on small voltage swing (reduce $\Delta V$)
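
A rough numeric sketch of why sensing a small swing helps. It assumes a constant cell current so that the proportionality becomes t = C·ΔV / I. The unit capacitance, cell current, and voltages below are hypothetical values picked for illustration, not figures from the slides.

```python
def bitline_delay(c_total: float, i_cell: float, delta_v: float) -> float:
    """First-order discharge time in seconds, assuming constant current."""
    return c_total * delta_v / i_cell


C_UNIT = 1e-15           # hypothetical diffusion capacitance per contact (1 fF)
C_BITLINE = 64 * C_UNIT  # 64C, from 128 cells with shared contacts
I_CELL = 50e-6           # hypothetical cell read current (50 uA)

full_swing = bitline_delay(C_BITLINE, I_CELL, delta_v=1.0)   # hypothetical 1.0 V swing
sense_swing = bitline_delay(C_BITLINE, I_CELL, delta_v=0.1)  # hypothetical 100 mV swing

print(f"full swing:  {full_swing * 1e9:.3f} ns")
print(f"sense swing: {sense_swing * 1e9:.3f} ns")
```

Under this model the delay scales linearly with the swing, so triggering the sense amplifier at one tenth of the swing cuts the bitline delay by the same factor.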

25 Differential Pair Amp
Differential pair requires no clock
But always dissipates static power

26 Clocked Sense Amp
Clocked sense amp saves power
Requires sense_clk after enough bitline swing
Isolation transistors cut off large bitline capacitance

27 Twisted Bitlines
Sense amplifiers also amplify noise
Coupling noise is severe in modern processes
Try to couple equally onto bit and bit_b
Done by twisting bitlines

28 Column Multiplexing
Recall that array may be folded for good aspect ratio
Ex: 2 kword x 16 folded into 256 rows x 128 columns
Must select 16 output bits from the 128 columns
Requires 16 8:1 column multiplexers
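
The example above can be sketched as an address split. The slide does not say which address bits go to the column multiplexers or how the bits of a word are interleaved across columns. The sketch below assumes the low 3 address bits drive the 8:1 select and that output bit j uses columns 8j through 8j+7. The name `locate` is hypothetical.

```python
WORDS, BITS = 2048, 16    # 2 kword x 16
ROWS, COLS = 256, 128     # folded array
MUX_WAYS = COLS // BITS   # 8:1 column multiplexers


def locate(address: int) -> tuple[int, list[int]]:
    """Return (row, columns) holding the 16 bits of the addressed word."""
    if not 0 <= address < WORDS:
        raise ValueError("address out of range")
    row, select = divmod(address, MUX_WAYS)  # high 8 bits pick the row, low 3 bits the mux input
    columns = [bit * MUX_WAYS + select for bit in range(BITS)]
    return row, columns


row, columns = locate(2047)
print(row, columns[0], columns[-1])  # 255 7 127
```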

29 Tree Decoder Mux
Column mux can use pass transistors
Use nMOS only, precharge outputs
One design is to use k series transistors for a $2^k$:1 mux
No external decoder logic needed
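
A behavioral sketch of the tree arrangement: each select bit controls one level of pass transistors, so the selected input reaches the output through k devices in series and no separate decoder is needed. The function name `tree_mux` and the LSB-first select ordering are assumptions. Precharge and nMOS threshold effects are not modeled.

```python
def tree_mux(inputs: list[int], select_bits: list[int]) -> int:
    """2**k:1 mux as k levels of 2:1 pass-transistor selection (select_bits[0] is the LSB)."""
    if len(inputs) != 2 ** len(select_bits):
        raise ValueError("need exactly 2**k inputs for k select bits")
    level = list(inputs)
    for s in select_bits:
        # One level of the tree: from each adjacent pair, pass the even (s=0)
        # or odd (s=1) member. The survivor has crossed one more series transistor.
        level = [level[2 * i + s] for i in range(len(level) // 2)]
    return level[0]


print(tree_mux(list(range(8)), [1, 1, 0]))  # select = 0b011 -> 3
```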

30 Single Pass-Gate Mux
Or eliminate series transistors with separate decoder

31 Ex: 2-way Muxed SRAM

32 Multiple Ports
We have considered single-ported SRAM
One read or one write on each cycle
Multiported SRAMs are needed for register files
Examples:
Multicycle MIPS must read two sources or write a result on some cycles
Pipelined MIPS must read two sources and write a third result each cycle
Superscalar MIPS must read and write many sources and results each cycle

33 Dual-Ported SRAM
Simple dual-ported SRAM
Two independent single-ended reads
Or one differential write
Do two reads and one write by time multiplexing
Read during ph1, write during ph2

34 Multi-Ported SRAM
Adding more access transistors hurts read stability
Multiported SRAM isolates reads from state node
Single-ended bitlines save area

35 Large SRAMs
Large SRAMs are split into subarrays for speed
Ex: UltraSparc 512 KB cache
4 128 KB subarrays
Each has 16 8 KB banks
256 rows x 256 cols / bank
60% subarray area efficiency
Also space for tags & control
[Shin05]

36 Serial Access Memories
Serial access memories do not use an address
Shift Registers
Tapped Delay Lines
Serial In Parallel Out (SIPO)
Parallel In Serial Out (PISO)
Queues (FIFO, LIFO)

37 Shift Register
Shift registers store and delay data
Simple design: cascade of registers
Watch your hold times!

38 Denser Shift Registers
Flip-flops aren't very area-efficient
For large shift registers, keep data in SRAM instead
Move read/write pointers through the RAM rather than moving the data
Initialize read address to first entry, write address to last
Increment both addresses on each cycle
Also good for crossing clock boundaries
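
The pointer scheme can be sketched as follows. The class name `RamShiftRegister` is hypothetical. The model reads before it writes within a cycle, which with these initial addresses gives a delay of depth - 1 cycles. A real design with registered outputs may differ by a cycle.

```python
class RamShiftRegister:
    """Delay line that moves read/write addresses instead of shifting data."""

    def __init__(self, depth: int):
        if depth < 2:
            raise ValueError("depth must be at least 2")
        self.ram = [0] * depth
        self.read_addr = 0           # first entry
        self.write_addr = depth - 1  # last entry

    def clock(self, data_in: int) -> int:
        data_out = self.ram[self.read_addr]
        self.ram[self.write_addr] = data_in
        # Increment both addresses each cycle, wrapping around the RAM
        depth = len(self.ram)
        self.read_addr = (self.read_addr + 1) % depth
        self.write_addr = (self.write_addr + 1) % depth
        return data_out


sr = RamShiftRegister(4)
print([sr.clock(x) for x in range(1, 9)])  # [0, 0, 0, 1, 2, 3, 4, 5]
```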

39 CAMs
Extension of ordinary memory (e.g. SRAM)
Read and write memory as usual
Also match to see which words contain a key

40 10T CAM Cell
Add four match transistors to 6T SRAM
56 x 43 λ unit cell

41 CAM Cell Operation
Read and write like ordinary SRAM
For matching:
Leave wordline low
Precharge matchlines
Place key on bitlines
Matchlines evaluate
Miss line:
Pseudo-nMOS or dynamic NOR of match lines
Goes high if no words match
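
A behavioral sketch of the match operation: each matchline starts precharged high and stays high only if no bit of the stored word differs from the key, and the miss line is the NOR of all matchlines. The function name `cam_match` is an assumption. Timing and the match transistors themselves are not modeled.

```python
def cam_match(words: list[int], key: int) -> tuple[list[int], int]:
    """Return (matchlines, miss) for a key applied to every stored word."""
    matchlines = []
    for word in words:
        mismatch_bits = word ^ key                # any 1 here would pull the matchline low
        matchlines.append(int(mismatch_bits == 0))
    miss = int(not any(matchlines))               # NOR of matchlines: high if no word matches
    return matchlines, miss


stored = [0b1010, 0b0111, 0b1010, 0b0001]
print(cam_match(stored, 0b1010))  # ([1, 0, 1, 0], 0)
print(cam_match(stored, 0b1111))  # ([0, 0, 0, 0], 1)
```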

43 DRAM
Compact area with DRAM process
Low static power
Volatile
Slow to mid speed

45 Stacked One-T DRAM [Asanovic]
1-T DRAM cell (figure labels): word, access transistor, bit, storage capacitor (FET gate, trench, stack), VREF
Stacked capacitor (figure labels): TiN top electrode (VREF), Ta2O5 dielectric, W bottom electrode, poly word line, access transistor

54 Read-Only Memories
Read-Only Memories are nonvolatile
Retain their contents when power is removed
Mask-programmed ROMs use one transistor per bit
Presence or absence determines 1 or 0

55 ROM Example
4-word x 6-bit ROM
Represented with dot diagram
Dots indicate 1's in ROM
Word 0: Word 1: Word 2: Word 3:
Looks like 6 4-input pseudo-nMOS NORs
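
A behavioral sketch of a 4-word x 6-bit ROM read as six 4-input NORs. The word contents in `DOTS` are made up for illustration, since the slide's values did not survive in this transcript. The sketch also assumes an inverter on each bitline so that a dot reads as 1. The name `read_rom` is hypothetical.

```python
# Hypothetical dot diagram: DOTS[word][column] == 1 means a transistor is present.
DOTS = [
    [1, 0, 0, 0, 1, 1],
    [0, 1, 1, 0, 0, 0],
    [1, 1, 0, 1, 0, 0],
    [0, 0, 1, 1, 1, 0],
]


def read_rom(dots: list[list[int]], word: int) -> list[int]:
    """Read one word by raising its wordline and evaluating each bitline NOR."""
    wordlines = [int(i == word) for i in range(len(dots))]
    outputs = []
    for col in range(len(dots[0])):
        # A present transistor on the active wordline pulls the bitline low.
        pulled_low = any(wl and row[col] for wl, row in zip(wordlines, dots))
        bitline = int(not pulled_low)  # NOR across the column's transistors
        outputs.append(1 - bitline)    # assumed output inverter: dot -> 1
    return outputs


print(read_rom(DOTS, 2))  # [1, 1, 0, 1, 0, 0]
```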

56 ROM Array Layout
Unit cell is 12 x 8 λ (about 1/10 size of SRAM)

57 Row Decoders
ROM row decoders must pitch-match with ROM
Only a single track per word!

58 Complete ROM Layout

60 Process cross section?

61 PROMs and EPROMs
Programmable ROMs
Build array with transistors at every site
Burn out fuses to disable unwanted transistors
Electrically Programmable ROMs
Use floating gate to turn off unwanted transistors
EPROM, EEPROM, Flash

62 Flash Programming
Charge on floating gate determines Vt
Logic 1: negative Vt
Logic 0: positive Vt
Cells erased to 1 by applying a high body voltage so that electrons tunnel off floating gate into substrate
Programmed to 0 by applying high gate voltage

63 NAND Flash
High density, low cost / bit
Programmed one page at a time
Erased one block at a time
Example:
4096-bit pages
16 pages / 8 KB block
Many blocks / memory
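
A small sketch of the page/block granularity using the example sizes above. The class `NandBlock` and its methods are hypothetical. The model treats programming as only able to move bits from 1 to 0, with a block erase as the only way back to 1.

```python
PAGE_BITS = 4096
PAGES_PER_BLOCK = 16
BLOCK_BYTES = PAGE_BITS * PAGES_PER_BLOCK // 8  # 512 B per page x 16 pages


class NandBlock:
    def __init__(self):
        # An erased block reads as all 1s
        self.pages = [[1] * PAGE_BITS for _ in range(PAGES_PER_BLOCK)]

    def program_page(self, index: int, data: list[int]) -> None:
        """Program one whole page; cells can only go from 1 to 0."""
        if len(data) != PAGE_BITS:
            raise ValueError("must program a full page")
        self.pages[index] = [old & new for old, new in zip(self.pages[index], data)]

    def erase(self) -> None:
        """Erase the whole block back to 1s; there is no per-page erase."""
        self.pages = [[1] * PAGE_BITS for _ in range(PAGES_PER_BLOCK)]


print(BLOCK_BYTES)  # 8192 bytes = 8 KB
```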

64 64 Gb NAND Flash
64K cells / page
4 bits / cell (multiple Vt)
64 cells / string
256 pages / block
2K blocks / plane
2 planes
[Trinh09]
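
The figures above can be cross-checked with a short calculation. It rests on one interpretation that the slide does not state: each page takes one bit from each of its 64K cells, so a row of 4-bit cells holds 4 pages.

```python
CELLS_PER_PAGE = 64 * 1024
BITS_PER_CELL = 4
CELLS_PER_STRING = 64
PAGES_PER_BLOCK = 256
BLOCKS_PER_PLANE = 2 * 1024
PLANES = 2

# Assumed interpretation: one page = one bit from each cell along a wordline,
# so each of the 64 wordlines in a string contributes BITS_PER_CELL pages.
pages_from_strings = CELLS_PER_STRING * BITS_PER_CELL

total_bits = CELLS_PER_PAGE * PAGES_PER_BLOCK * BLOCKS_PER_PLANE * PLANES

print(pages_from_strings)  # 256, consistent with 256 pages / block
print(total_bits // 2**30)  # 64 (Gbit)
```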

