Digital Integrated Circuits A Design Perspective

Presentation transcript:

Digital Integrated Circuits: A Design Perspective. Jan M. Rabaey, Anantha Chandrakasan, Borivoje Nikolic. Semiconductor Memories. December 20, 2002.

Chapter Overview: Memory Classification; Memory Architectures; The Memory Core; Periphery; Reliability; Case Studies.

Semiconductor Memory Classification
Read-Write Memory, Random Access: SRAM, DRAM
Read-Write Memory, Non-Random Access: FIFO, LIFO, Shift Register, CAM
Non-Volatile Read-Write Memory: EPROM, E²PROM, FLASH
Read-Only Memory: Mask-Programmed, Programmable (PROM)

Memory Timing: Definitions

Memory Architecture: Decoders. Intuitive architecture for an N x M memory (N words of M bits each): one select signal (S_0 to S_{N-1}) per word. Too many select signals: N words == N select signals. A decoder reduces the number of select signals to K = log2 N address bits (A_0 to A_{K-1}).
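
The decoder arithmetic on this slide can be sketched in a few lines of Python. This is only an illustration and not part of the original slides; the function names `address_bits` and `decode` are hypothetical.

```python
def address_bits(n_words: int) -> int:
    """Number of address bits K needed to select one of n_words words (K = ceil(log2 N))."""
    if n_words < 1:
        raise ValueError("n_words must be positive")
    # (N - 1).bit_length() equals ceil(log2 N) for N >= 1 without floating-point error.
    return (n_words - 1).bit_length()


def decode(address: int, k: int) -> list[int]:
    """Behavioral K-to-2^K decoder: exactly one select signal is 1."""
    if not 0 <= address < 2 ** k:
        raise ValueError("address out of range")
    return [1 if word == address else 0 for word in range(2 ** k)]


# A 1024-word memory needs 10 address lines instead of 1024 select lines.
print(address_bits(1024))
print(decode(2, 2))
```

The point of the sketch is the wiring count: the number of signals that must be routed into the memory grows with log2 N rather than N.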

Array-Structured Memory Architecture. Problem: ASPECT RATIO, or HEIGHT >> WIDTH. [Figure labels: sense amplifiers/drivers amplify swing to rail-to-rail amplitude; column decoder selects appropriate word.]

Hierarchical Memory Architecture Advantages: 1. Shorter wires within blocks 2. Block address activates only 1 block => power savings

Block Diagram of 4 Mbit SRAM [Hirose90]. [Figure labels: 128 K array blocks, Block 0 to Block 31; global row decoder, subglobal row decoders, local row decoder; predecoder and block selector; bit line load; transfer gate; column decoder; sense amplifier and write driver; X-, Y-, and Z-address buffers; CS, WE buffer; I/O buffer; x1/x4 controller; clock generator.]

Contents-Addressable Memory

Memory Timing: Approaches. DRAM timing: multiplexed addressing. SRAM timing: self-timed.

Read-Only Memory Cells. [Figure: three cell styles, Diode ROM, MOS ROM 1, and MOS ROM 2; labels: WL, BL, VDD, GND.]

MOS OR ROM. [Figure: array with word lines WL[0] to WL[3], bit lines BL[0] to BL[3], V_DD lines, and pull-down loads biased by V_bias.]

MOS NOR ROM. [Figure: array with word lines WL[0] to WL[3], bit lines BL[0] to BL[3], pull-up devices to V_DD, and GND lines.]

MOS NOR ROM Layout. Programming using the active layer only. Cell (9.5λ x 7λ). Layers shown: polysilicon, metal1, diffusion, metal1 on diffusion.

MOS NOR ROM Layout. Programming using the contact layer only. Cell (11λ x 7λ). Layers shown: polysilicon, metal1, diffusion, metal1 on diffusion.

MOS NAND ROM. All word lines high by default with exception of selected row. [Figure: pull-up devices to V_DD, bit lines BL[0] to BL[3], word lines WL[0] to WL[3].]
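
The difference between NOR and NAND ROM read-out can be sketched behaviorally. This is a simplified logic-level model, not taken from the slides: `has_transistor[r][c]` is a hypothetical programming matrix, and the functions ignore all electrical effects. In the NOR array a cell transistor pulls its bit line low when its word line is raised; in the NAND array the cell transistors of a column are in series, every word line except the selected one is high, and the bit line is pulled low only when the whole chain conducts.

```python
def read_nor_rom(has_transistor, row):
    """NOR ROM: selected word line high, all others low; bit lines pulled up by default."""
    n_rows, n_cols = len(has_transistor), len(has_transistor[0])
    wl = [1 if r == row else 0 for r in range(n_rows)]
    word = []
    for c in range(n_cols):
        # Any conducting cell transistor on this column pulls the bit line to GND.
        pulled_down = any(wl[r] and has_transistor[r][c] for r in range(n_rows))
        word.append(0 if pulled_down else 1)
    return word


def read_nand_rom(has_transistor, row):
    """NAND ROM: all word lines high except the selected row, which is low."""
    n_rows, n_cols = len(has_transistor), len(has_transistor[0])
    wl = [0 if r == row else 1 for r in range(n_rows)]
    word = []
    for c in range(n_cols):
        # A position without a transistor is a plain wire and always conducts.
        chain_conducts = all(wl[r] or not has_transistor[r][c] for r in range(n_rows))
        word.append(0 if chain_conducts else 1)
    return word


pattern = [[True, False],
           [False, True]]
print(read_nor_rom(pattern, 0))   # transistor present reads as 0
print(read_nand_rom(pattern, 0))  # transistor present reads as 1
```

Under this model the same programming matrix yields complementary data in the two organizations.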

MOS NAND ROM Layout. Programming using the metal-1 layer only. Cell (8λ x 7λ). No contact to VDD or GND necessary; drastically reduced cell size. Loss in performance compared to NOR ROM. Layers shown: polysilicon, diffusion, metal1 on diffusion.

NAND ROM Layout. Programming using implants only. Cell (5λ x 6λ). Layers shown: polysilicon, threshold-altering implant, metal1 on diffusion.

Equivalent Transient Model for MOS NOR ROM. Word line parasitics: wire capacitance and gate capacitance; wire resistance (polysilicon). Bit line parasitics: resistance not dominant (metal); drain and gate-drain capacitance. [Figure: model for NOR ROM with r_word and c_word on WL, C_bit on BL, and V_DD.]

Equivalent Transient Model for MOS NAND ROM. Word line parasitics: similar to NOR ROM. Bit line parasitics: resistance of cascaded transistors dominates; drain/source and complete gate capacitance. [Figure: model for NAND ROM with r_word and c_word on WL; r_bit, c_bit, and C_L on BL; and V_DD.]

Decreasing Word Line Delay

Precharged MOS NOR ROM. PMOS precharge device can be made as large as necessary, but clock driver becomes harder to design. [Figure: precharge devices to V_DD clocked by φ_pre; word lines WL[0] to WL[3]; bit lines BL[0] to BL[3]; GND lines.]

Non-Volatile Memories: The Floating-gate transistor (FAMOS). [Figure: device cross-section (source, drain, n+ regions, p substrate, oxide thickness t_ox) and schematic symbol.]

Floating-Gate Transistor Programming. [Figure: three panels with voltage labels. (1) Avalanche injection (20 V, 10 V, 5 V). (2) Removing programming voltage leaves charge trapped (0 V, -5 V). (3) Programming results in higher V_T (5 V, -2.5 V).]

A “Programmable-Threshold” Transistor

FLOTOX EEPROM. [Figure: FLOTOX transistor cross-section (floating gate, gate, source, drain, n+ regions, p substrate, oxide thicknesses 20-30 nm and 10 nm) and Fowler-Nordheim I-V characteristic (I versus V_GD, with -10 V and 10 V marked).]

EEPROM Cell. Absolute threshold control is hard. Unprogrammed transistor might be depletion => 2-transistor cell. [Figure labels: BL, WL, V_DD.]

Flash EEPROM. [Figure: cell cross-section with control gate, floating gate, thin tunneling oxide, n+ source (erasure), n+ drain (programming), p-substrate.] Many other options …

Cross-sections of NVM cells Flash EPROM Courtesy Intel

Basic Operations in a NOR Flash Memory― Erase

Basic Operations in a NOR Flash Memory― Write

Basic Operations in a NOR Flash Memory― Read

NAND Flash Memory. [Figure labels: word line (poly), unit cell, source line (diff. layer).] Courtesy Toshiba.

NAND Flash Memory. [Figure labels: word lines, select transistor, bit line contact, source line contact, active area, STI.] Courtesy Toshiba.

Characteristics of State-of-the-art NVM

Read-Write Memories (RAM)
STATIC (SRAM): data stored as long as supply is applied; large (6 transistors/cell); fast; differential.
DYNAMIC (DRAM): periodic refresh required; small (1-3 transistors/cell); slower; single-ended.

6-transistor CMOS SRAM Cell. [Figure labels: WL, V_DD, transistors M1 to M6, storage nodes Q and Q-bar, bit lines BL and BL-bar.]

CMOS SRAM Analysis (Read). [Figure: read equivalent circuit; labels: WL, V_DD, M1, M4, M5, M6, nodes Q and Q-bar, bit lines BL and BL-bar with capacitance C_bit.]

CMOS SRAM Analysis (Read). [Plot: voltage rise (V), axis from 0.2 to 1.2, versus cell ratio (CR), axis from 0.5 to 3.]

CMOS SRAM Analysis (Write). [Figure: write equivalent circuit; labels: BL = 1, Q, M4, M5, M6, V_DD, WL.]

CMOS SRAM Analysis (Write)

6T-SRAM Layout. [Figure labels: VDD, GND, Q, WL, BL, M1 to M6.]

Resistance-load SRAM Cell. Static power dissipation: want R_L large. Bit lines precharged to V_DD to address t_p problem. [Figure labels: WL, V_DD, load resistors R_L, nodes Q and Q-bar, transistors M1 to M4, bit lines BL and BL-bar.]

SRAM Characteristics

3-Transistor DRAM Cell. No constraints on device ratios. Reads are non-destructive. Value stored at node X when writing a “1” = V_WWL - V_Tn. [Figure labels: WWL, RWL, BL, node X, storage capacitance C_S, transistors M1 to M3; waveform levels V_DD, V_DD - V_T, and ΔV.]

3T-DRAM Layout. [Figure labels: BL2, BL1, GND, RWL, WWL, M3, M2, M1.]

1-Transistor DRAM Cell. Write: C_S is charged or discharged by asserting WL and BL. Read: charge redistribution takes place between bit line and storage capacitance: $\Delta V = V_{BL} - V_{PRE} = (V_{BIT} - V_{PRE}) \dfrac{C_S}{C_S + C_{BL}}$. Voltage swing is small; typically around 250 mV.
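
The charge-redistribution read-out can be checked numerically with a short sketch. The function name `dram_read_swing` and all component values below are illustrative assumptions, chosen only so that the result lands near the 250 mV figure quoted on the slide.

```python
def dram_read_swing(v_bit: float, v_pre: float, c_s: float, c_bl: float) -> float:
    """Bit-line voltage change after charge sharing between C_S and C_BL.

    v_bit: voltage stored on the cell capacitor before the read
    v_pre: bit-line precharge voltage
    c_s:   storage capacitance (F)
    c_bl:  bit-line capacitance (F)
    """
    return (v_bit - v_pre) * c_s / (c_s + c_bl)


# Assumed example values (not from the slides): 50 fF cell, 200 fF bit line,
# bit line precharged to 1.25 V, cell storing 2.5 V ("1") or 0 V ("0").
swing_one = dram_read_swing(2.5, 1.25, 50e-15, 200e-15)
swing_zero = dram_read_swing(0.0, 1.25, 50e-15, 200e-15)
print(f"reading 1: {swing_one * 1e3:+.0f} mV, reading 0: {swing_zero * 1e3:+.0f} mV")
```

Because C_BL is usually several times larger than C_S, only a fraction of the stored level appears on the bit line.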

DRAM Cell Observations 1T DRAM requires a sense amplifier for each bit line, due to charge redistribution read-out. DRAM memory cells are single ended in contrast to SRAM cells. The read-out of the 1T DRAM cell is destructive; read and refresh operations are necessary for correct operation. Unlike 3T cell, 1T cell requires presence of an extra capacitance that must be explicitly included in the design. When writing a “1” into a DRAM cell, a threshold voltage is lost. This charge loss can be circumvented by bootstrapping the word lines to a higher value than VDD
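
The threshold loss and its bootstrapping fix can be illustrated with a first-order sketch. It assumes an ideal NMOS pass transistor and ignores the body effect; the function name `stored_high_level` and the voltages are hypothetical.

```python
def stored_high_level(v_wl: float, v_tn: float, v_dd: float) -> float:
    """Highest voltage an NMOS access transistor can pass when writing a '1'.

    The cell node charges until the transistor turns off at v_wl - v_tn,
    and it can never exceed the bit-line level v_dd.
    """
    return min(v_dd, v_wl - v_tn)


V_DD, V_TN = 2.5, 0.5  # assumed example values

# Word line driven only to V_DD: one threshold voltage is lost.
print(stored_high_level(V_DD, V_TN, V_DD))
# Word line bootstrapped to V_DD + V_Tn: the full V_DD is stored.
print(stored_high_level(V_DD + V_TN, V_TN, V_DD))
```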

Sense Amp Operation. [Plot: V_BL versus t, starting at V_PRE; curves V(1) and V(0) with initial difference ΔV(1); markers: word line activated, sense amp activated.]

1-T DRAM Cell. Uses polysilicon-diffusion capacitance; expensive in area. [Figure: cross-section and layout; labels: capacitor, metal word line, poly, SiO2, field oxide, n+, inversion layer induced by plate bias, M1 word line, diffused bit line, polysilicon plate, polysilicon gate.]

SEM of poly-diffusion capacitor 1T-DRAM

Advanced 1T DRAM Cells: Trench Cell and Stacked-capacitor Cell. [Figure labels: word line, insulating layer, cell plate, capacitor dielectric layer, cell plate Si, transfer gate, isolation, refilling poly, capacitor insulator, storage electrode, storage node poly, Si substrate, 2nd field oxide.]

Static CAM Memory Cell. [Figure: array of CAM cells with bit lines, word lines, and a wired-NOR match line; cell detail with nodes int and S.]

CAM in Cache Memory. [Figure labels: CAM array, SRAM array, address decoder, hit logic, input drivers, sense amps / input drivers; signals: address, tag, hit, R/W, data.]
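
A behavioral sketch of a CAM lookup driving a cache hit signal follows. It models the wired-NOR match line only at the logic level (any mismatching bit discharges the line); the names `match_lines` and `cache_lookup` and the sample contents are hypothetical.

```python
def match_lines(stored_tags, search_tag):
    """One match-line value per stored word; True means the word matches."""
    lines = []
    for tag in stored_tags:
        # The match line starts high; any mismatching bit pulls it low (wired-NOR).
        pulled_low = any(s != k for s, k in zip(tag, search_tag))
        lines.append(not pulled_low)
    return lines


def cache_lookup(stored_tags, data_rows, search_tag):
    """Return (hit, data): the CAM match line selects the row of the data array."""
    lines = match_lines(stored_tags, search_tag)
    hit = any(lines)
    data = data_rows[lines.index(True)] if hit else None
    return hit, data


tags = [(0, 1, 1), (1, 0, 0)]
data = ["row A", "row B"]
print(cache_lookup(tags, data, (1, 0, 0)))
print(cache_lookup(tags, data, (1, 1, 1)))
```

In hardware all stored words are compared in parallel; the loop here is only a sequential stand-in for that comparison.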

Periphery: Decoders; Sense Amplifiers; Input/Output Buffers; Control / Timing Circuitry.

Row Decoders. Collection of 2^M complex logic gates, organized in regular and dense fashion. (N)AND decoder; NOR decoder.

Hierarchical Decoders. Multi-stage implementation improves performance. [Figure: NAND decoder using 2-input pre-decoders on address bits A_0 to A_3 and their complements, driving word lines WL_0, WL_1, …]

Dynamic Decoders. [Figure: 2-input NOR decoder and 2-input NAND decoder; labels: precharge devices, V_DD, GND, clock φ, inputs A_0 and A_1 with their complements, outputs WL_0 to WL_3.]

4-input pass-transistor based column decoder. Advantages: speed (t_pd does not add to overall memory access time); only one extra transistor in signal path. Disadvantage: large transistor count. [Figure: 2-input NOR decoder generating select signals S_0 to S_3 for pass transistors between bit lines BL_0 to BL_3 and data line D.]

4-to-1 tree based column decoder. Number of devices drastically reduced. Delay increases quadratically with # of sections; prohibitive for large decoders. Solutions: buffers; progressive sizing; combination of tree and pass transistor approaches. [Figure: bit lines BL_0 to BL_3 routed through pass transistors controlled by A_0, A_1 and their complements to data line D.]
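
Both claims on this slide can be illustrated with a small sketch. The device count follows from the tree structure (one pass transistor per branch at every level). The delay estimate uses the Elmore delay of a chain of identical RC sections, which is an assumed first-order model with hypothetical unit values, not a figure from the slides.

```python
def tree_decoder_devices(k: int) -> int:
    """Pass transistors in a 2^k-to-1 tree decoder: 2^k + 2^(k-1) + ... + 2."""
    return sum(2 ** level for level in range(1, k + 1))


def elmore_chain_delay(sections: int, r: float = 1.0, c: float = 1.0) -> float:
    """Elmore delay of `sections` identical series RC stages: r*c*n*(n+1)/2."""
    return r * c * sections * (sections + 1) / 2


for k in (2, 4, 8, 10):
    print(f"k={k:2d}  devices={tree_decoder_devices(k):5d}  "
          f"relative delay={elmore_chain_delay(k):5.1f}")
```

The device count equals 2(2^k - 1), while the series path through k pass transistors makes the delay grow roughly with k squared, which is why buffers or progressive sizing are suggested for large decoders.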

Decoder for circular shift-register. [Figure labels: V_DD, R, WL, φ.]

Sense Amplifiers. $t_p = \dfrac{C \cdot \Delta V}{I_{av}}$, with C large and I_av small: make ΔV as small as possible. Idea: use a sense amplifier (s.a.), with a small transition at the input, amplified at the output.
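
The benefit of sensing a small swing follows directly from the delay expression t_p = C · ΔV / I_av. The sketch below uses assumed example values (1 pF bit-line capacitance, 100 µA average cell current); the function name `discharge_time` is hypothetical.

```python
def discharge_time(c: float, delta_v: float, i_av: float) -> float:
    """Time for an average current i_av to change the voltage on c by delta_v."""
    return c * delta_v / i_av


C_BL = 1e-12   # assumed bit-line capacitance (F)
I_AV = 100e-6  # assumed average cell current (A)

full_swing = discharge_time(C_BL, 2.5, I_AV)    # wait for a rail-to-rail swing
small_swing = discharge_time(C_BL, 0.25, I_AV)  # hand over to a sense amplifier early
print(f"full swing: {full_swing * 1e9:.1f} ns, small swing: {small_swing * 1e9:.1f} ns")
```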

Differential Sense Amplifier. Directly applicable to SRAMs. [Figure labels: V_DD, transistors M1 to M5, inputs bit and bit-bar, node y, output Out, enable SE.]

Differential Sensing ― SRAM

Latch-Based Sense Amplifier (DRAM). Initialized in its meta-stable point with EQ. Once an adequate voltage gap is created, the sense amp is enabled with SE. Positive feedback quickly forces the output to a stable operating point. [Figure labels: EQ, BL, BL-bar, V_DD, SE, SE-bar.]

Charge-Redistribution Amplifier. [Figure: concept and transient response; labels: V_ref, V_L, V_S, M1, M2, M3, C_small, C_large.]

Charge-Redistribution Amplifier: EPROM. [Figure labels: V_DD, SE, load M4, Out, C_out, cascode device M3 (V_casc), C_col, column decoder M2 (WLC), C_BL, BL, EPROM array M1 (WL).]

Single-to-Differential Conversion How to make a good Vref?

Open bitline architecture with dummy cells. [Figure labels: EQ, left and right word lines (L, L_0, L_1, R, R_0, R_1), V_DD, SE, SE-bar, bit lines BLL and BLR, storage capacitors C_S, dummy cells.]

DRAM Read Process with Dummy Cell. [Plots: bit-line voltages BL and BL-bar, V (0 to 3) versus t (0 to 3 ns), for reading 0 and reading 1; control signals EQ, WL, and SE versus t (ns).]

Voltage Regulator. [Figure: circuit and equivalent model; labels: V_DD, M_drive, V_REF, V_DL, V_bias, amplifier with + and - inputs.]

Charge Pump

DRAM Timing

RDRAM Architecture. [Figure labels: bus, clocks, data, memory array, mux/demux network, column demux and packet decoder, row demux and packet decoder.]

Address Transition Detection. [Figure: address inputs A_0, A_1, …, A_{N-1}, each with a DELAY element of delay t_d, combined to generate ATD; supply V_DD.]

Reliability and Yield

Sensing Parameters in DRAM. From [Itoh01]. [Plot: C_D (fF), C_S (fF), Q_S (fC), V_DD (V), and V_smax (mV) versus memory capacity (bits / chip) from 4K to 64G, with $Q_S = C_S V_{DD} / 2$ and $V_{smax} = Q_S / (C_S + C_D)$.]

Noise Sources in 1T DRAM. [Figure labels: substrate, BL, adjacent BL, C_WBL, α-particles, WL, leakage, C_S, electrode, C_cross.]

Open Bit-line Architecture: Cross Coupling. [Figure labels: EQ, word lines WL_0, WL_1, WL_D, coupling capacitances C_WBL, bit lines BL and BL-bar with capacitance C_BL, sense amplifier, cell capacitors C.]

Folded-Bitline Architecture

Transposed-Bitline Architecture

Alpha-particles (or Neutrons). 1 particle ~ 1 million carriers. [Figure labels: WL, V_DD, BL, SiO2, n+, positive and negative carriers generated along the particle track.]

Yield Yield curves at different stages of process maturity (from [Veendrick92])

Redundancy. [Figure labels: memory array, redundant rows, redundant columns, row decoder, row address, column decoder, column address, fuse bank.]

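Error-Correcting Codes. Example: Hamming codes. [Figure residue: parity-check equations; with e.g. B3 wrong, the checks evaluate to 3, the position of the erroneous bit.]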
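
The Hamming example can be made concrete with a (7,4) code. The parity equations did not survive in the transcript, so the sketch below assumes the standard bit ordering P1 P2 B3 P4 B5 B6 B7, in which the syndrome read as a binary number gives the position of a single erroneous bit; the function names are hypothetical.

```python
def hamming74_encode(b3: int, b5: int, b6: int, b7: int) -> list[int]:
    """Return the codeword [P1, P2, B3, P4, B5, B6, B7] with even parity."""
    p1 = b3 ^ b5 ^ b7  # covers positions 1, 3, 5, 7
    p2 = b3 ^ b6 ^ b7  # covers positions 2, 3, 6, 7
    p4 = b5 ^ b6 ^ b7  # covers positions 4, 5, 6, 7
    return [p1, p2, b3, p4, b5, b6, b7]


def hamming74_syndrome(word: list[int]) -> int:
    """0 if all parity checks pass, otherwise the 1-based position of the bad bit."""
    p1, p2, b3, p4, b5, b6, b7 = word
    s1 = p1 ^ b3 ^ b5 ^ b7
    s2 = p2 ^ b3 ^ b6 ^ b7
    s4 = p4 ^ b5 ^ b6 ^ b7
    return s1 + 2 * s2 + 4 * s4


def hamming74_correct(word: list[int]) -> list[int]:
    """Flip the bit indicated by the syndrome (corrects any single-bit error)."""
    fixed = list(word)
    position = hamming74_syndrome(fixed)
    if position:
        fixed[position - 1] ^= 1
    return fixed


codeword = hamming74_encode(1, 0, 1, 1)
corrupted = list(codeword)
corrupted[2] ^= 1  # make B3 wrong
print(hamming74_syndrome(corrupted))
print(hamming74_correct(corrupted) == codeword)
```

With B3 corrupted, the first two checks fail and the third passes, so the syndrome is binary 011, matching the value 3 on the slide.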

Redundancy and Error Correction

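Sources of Power Dissipation in Memories. From [Itoh00]. $I_{DD} = \sum C_i \Delta V_i f + \sum I_{DCP}$. [Figure: chip between V_DD and V_SS; array of n rows by m columns with selected cells drawing m·i_act and non-selected cells drawing m(n - 1)·i_hld; row decoder nC_DE V_INT f; column decoder mC_DE V_INT f; periphery C_PT V_INT f; DC current I_DCP.]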

Data Retention in SRAM. SRAM leakage increases with technology scaling. [Plot: I_leakage (A), axis from 100n to 1.30u, versus VDD, axis from 0.00 to 1.80, for 0.13 µm CMOS and 0.18 µm CMOS; annotation: factor 7.]

Suppressing Leakage in SRAM. Two approaches: inserting extra resistance; reducing the supply voltage. [Figure labels: V_DD, V_DDL, V_DD,int, V_SS,int, sleep, low-threshold transistor, SRAM cells.]

Data Retention in DRAM From [Itoh00]

Case Studies: Programmable Logic Array; SRAM; Flash Memory.

PLA versus ROM. Programmable Logic Array: structured approach to random logic; “two-level logic implementation”: NOR-NOR (product of sums), NAND-NAND (sum of products). IDENTICAL TO ROM! Main difference: ROM: fully populated; PLA: one element per minterm. Note: importance of PLAs has drastically reduced: 1. slow; 2. better software techniques (multi-level logic synthesis). But …

Programmable Logic Array: Pseudo-NMOS PLA. [Figure: AND-plane and OR-plane; inputs X_0, X_1, X_2 and their complements; outputs f_0, f_1; V_DD and GND lines.]

Dynamic PLA. [Figure: AND-plane and OR-plane; inputs X_0, X_1, X_2 and their complements; outputs f_0, f_1; clocks φ_AND and φ_OR; V_DD and GND.]

Clock Signal Generation for self-timed dynamic PLA. (a) Clock signals: φ, φ_AND, φ_OR, with t_pre and t_eval marked. (b) Timing generation circuitry using a dummy AND row.

PLA Layout

4 Mbit SRAM Hierarchical Word-line Architecture

Bit-line Circuitry. [Figure labels: block select, bit-line load, ATD, BEQ, local WL, memory cell, B / T, CD, I/O line, I/O, sense amplifier.]

Sense Amplifier (and Waveforms). [Figure labels: I/O lines, address, data-cut, ATD, BEQ, SEQ, DATA, Vdd, GND, SA and SA-bar, I/O, block select, BS, De_i.]

1 Gbit Flash Memory From [Nakamura02]

Writing Flash Memory. From [Nakamura02]. [Plots: evolution of thresholds and final distribution; number of cells versus Vt of memory cells (0 V to 4 V), with the read level (4.5 V) marked.]

125 mm² 1 Gbit NAND Flash Memory. From [Nakamura02]. [Die photo labels: 32 word lines x 1024 blocks; 16896 bit lines; charge pump; 2 kB page buffer & cache; dimensions 10.7 mm and 11.7 mm.]

125 mm² 1 Gbit NAND Flash Memory. From [Nakamura02].
Technology: 0.13 µm p-sub CMOS triple-well; 1 poly, 1 polycide, 1 W, 2 Al
Cell size: 0.077 µm²
Chip size: 125.2 mm²
Organization: 2112 x 8b x 64 page x 1k block
Power supply: 2.7 V-3.6 V
Cycle time: 50 ns
Read time: 25 µs
Program time: 200 µs / page
Erase time: 2 ms / block
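
The organization figures can be cross-checked with a little arithmetic. The split of each 2112-byte page into 2048 data bytes plus 64 spare bytes is an assumption (a common NAND page layout) and is not stated on the slide; the constant names are hypothetical.

```python
BYTES_PER_PAGE = 2112     # from the slide: 2112 x 8b
PAGES_PER_BLOCK = 64      # from the slide: 64 pages per block
BLOCKS = 1024             # from the slide: 1k blocks
ASSUMED_DATA_BYTES = 2048 # assumption: the remaining 64 bytes per page are spare area

bits_per_page = BYTES_PER_PAGE * 8
total_bits = bits_per_page * PAGES_PER_BLOCK * BLOCKS
data_bits = ASSUMED_DATA_BYTES * 8 * PAGES_PER_BLOCK * BLOCKS

print(f"bits per page: {bits_per_page}")  # equals the 16896 bit lines on the die photo
print(f"total bits:    {total_bits}")
print(f"data bits:     {data_bits} (2**30 = {2 ** 30})")
```

Under that assumption the user-visible capacity is exactly 2^30 bits, consistent with the 1 Gbit label, and one page is as wide as the 16896 bit lines.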

Semiconductor Memory Trends (up to the 90’s) Memory Size as a function of time: x 4 every three years
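
The "x 4 every three years" trend corresponds to a simple exponential. A minimal sketch, with a hypothetical function name and an arbitrary starting point:

```python
def capacity_growth(years: float) -> float:
    """Multiplicative growth in memory size after `years`, at x4 every three years."""
    return 4.0 ** (years / 3.0)


# Starting from an assumed 1 Mbit part, project two and four generations ahead.
start_bits = 2 ** 20
for years in (3, 6, 12):
    print(years, int(start_bits * capacity_growth(years)))
```

Two generations (six years) give a factor of 16, which matches the 4K, 64K, 1M, 16M progression commonly used on capacity axes.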

Semiconductor Memory Trends (updated) From [Itoh01]

Trends in Memory Cell Area From [Itoh01]

Semiconductor Memory Trends Technology feature size for different SRAM generations