Alireza Shafaei, Yanzhi Wang, Xue Lin, and Massoud Pedram

Presentation transcript:

FinCACTI: Architectural Analysis and Modeling of Caches with Deeply-scaled FinFET Devices
Alireza Shafaei, Yanzhi Wang, Xue Lin, and Massoud Pedram
Department of Electrical Engineering, University of Southern California
http://atrak.usc.edu/

Outline
- Introduction
  - FinFET Devices
  - Robust SRAM Cell Design
  - CACTI Cache Modeling Tool
- FinCACTI (CACTI with FinFET support)
  - Technological Parameters
  - FinFET-based SRAM Cell Characteristics
  - Gate and Diffusion Capacitances
  - 8T SRAM Cell Support
- Simulation Results

Introduction
- Memory design in deeply-scaled CMOS technologies
  - Increased short-channel effects (SCE)
  - Higher sensitivity to device mismatches
- Cache memories based on the conventional 6T SRAM cell using planar CMOS devices may fail to function because of poor cell stability (read stability and write-ability)
- Solutions to enhance cell stability
  - Device level: use quasi-planar FinFET devices
  - Circuit level: introduce robust SRAM cell structures, e.g., 8T SRAM cells

FinFET Devices
- Improved gate control over the channel (and lower impact of the source and drain terminals)
  - Reduces SCE
  - Higher ON/OFF current ratio and improved energy efficiency
  - Superior physical scalability
  - Higher immunity to random variations and soft errors
- Technology of choice beyond the 10nm CMOS node
- FinFET geometries:
  - $L_{FIN}$: fin (gate) length
  - $T_{SI}$: fin width
  - $H_{FIN}$: fin height
  - $W_{min}$: effective channel width of a single fin ($W_{min} \approx 2 \times H_{FIN}$)
- FinFET-based SRAM cells

Robust SRAM Cells
- Conventional 6T SRAM cell
  - Read stability: the pull-down transistor must be stronger than the access transistor
  - Write-ability: the pull-up transistor must be weaker than the access transistor
  - $W_{M3} \le W_{M5} \le W_{M1}$
  - Vulnerable especially in technology nodes below 16nm, where process variations become a severe issue
- 8T SRAM cell
  - Decouples the storage node from the read bit-line (separate read path)
  - No constraint needed for read stability
  - Improved cell stability
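The 6T sizing rule above can be written as a simple check. This is only a sketch: it assumes M3 is a pull-up, M5 an access, and M1 a pull-down transistor (as the two bullets imply), and it expresses strength as a fin count, which is an illustrative choice rather than something the slide states. The function name is hypothetical.

```python
def satisfies_6t_sizing(pull_up_fins, access_fins, pull_down_fins):
    """Check W_M3 <= W_M5 <= W_M1, with widths expressed as fin counts.

    The inequality is a necessary sizing condition only; it does not
    guarantee an adequate static noise margin.
    """
    return pull_up_fins <= access_fins <= pull_down_fins


# Hypothetical fin counts in the order (pull-up M3, access M5, pull-down M1).
print(satisfies_6t_sizing(1, 1, 2))  # pull-down stronger than access
print(satisfies_6t_sizing(2, 1, 1))  # pull-up stronger than access
```

Passing this check says nothing about how much margin the cell has, which is why the deck characterizes SNM separately.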

Architecture-level Memory Modeling
- CACTI: a widely used delay, power, and area modeling tool for cache and memory systems
- CACTI 6.5
- N. Muralimanohar, R. Balasubramonian, and N. Jouppi, "Optimizing NUCA Organizations and Wiring Alternatives for Large Caches With CACTI 6.0," MICRO-40, 2007.

CACTI Shortcomings for Future Memory Designs
- Only supports planar CMOS devices for the following technology nodes
  - Metal pitch values: 90nm, 65nm, 45nm, 32nm, 22nm (with McPAT)
- Inaccurate technological parameters
  - Extracted from ITRS documents (transistor and wire parameter values are predictions and best expert opinions from the 2005 ITRS)
- Only supports conventional 6T SRAM cell designs
  - A 6T SRAM cell design optimized for a 130nm process is adopted for all technology nodes
  - The impact of Vdd scaling and device mismatches is ignored

Prior Work: CACTI-FinFET
- CACTI-FinFET
  - Process variation models
  - The name was later changed to CACTI-PVT
  - Exact quote: "For FinFETs in the deep submicron regime, satisfactory analytical models are still not available"
  - Lookup tables are used to store gate-level power/timing parameters
- Our approach (FinCACTI)
  - Develop and use analytical models for calculating gate-level parameters from technology-dependent device-level characteristics
  - Easier to add new CMOS technologies or new devices
- C.-Y. Lee and N. Jha, "CACTI-FinFET: An Integrated Delay and Power Modeling Framework for FinFET-based Caches under Process Variations," DAC, 2011.

FinCACTI
- Accurate technological parameters for deeply-scaled (7nm) FinFET devices from the Synopsys Technology Computer-Aided Design (TCAD) tool suite
  - ON/OFF currents of N- and P-type fins (for temperatures ranging from 300K to 400K)
- SPICE-compatible Verilog-A models, used to derive gate- and circuit-level parameters (e.g., the PMOS to NMOS size ratio and the stack effect factor) and to characterize FinFET-based SRAM cells (static noise margin and leakage power)
- Area and capacitance models for FinFET devices
- Layout area, power, and access delay calculations for FinFET-based 6T and 8T SRAM cells
- Architectural support for the 8T SRAM cell

Technological Parameters
- CACTI 6.5: ITRS predictions

    if (tech == 32) {
      SENSE_AMP_D = .03e-9; // s
      SENSE_AMP_P = 2.16e-15; // J
      //For 2013, MPU/ASIC stagger-contacted M1 half-pitch is 32 nm (so this is 32 nm
      //technology i.e. FEATURESIZE = 0.032). Using the SOI process numbers for
      //HP and LSTP.
      vdd[0] = 0.9;
      Lphy[0] = 0.013;
      Lelec[0] = 0.01013;
      t_ox[0] = 0.5e-3;
      v_th[0] = 0.21835;
      c_ox[0] = 4.11e-14;
      mobility_eff[0] = 361.84 * (1e-2 * 1e6 * 1e-2 * 1e6);
      Vdsat[0] = 5.09E-2;
      c_g_ideal[0] = 5.34e-16;
      c_fringe[0] = 0.04e-15;
      c_junc[0] = 1e-15;
      I_on_n[0] = 2211.7e-6;
      I_on_p[0] = I_on_n[0] / 2;
      nmos_effective_resistance_multiplier = 1.49;
      n_to_p_eff_curr_drv_ratio[0] = 2.41;
      gmp_to_gmn_multiplier[0] = 1.38;
      Rnchannelon[0] = nmos_effective_resistance_multiplier * vdd[0] / I_on_n[0];
      Rpchannelon[0] = n_to_p_eff_curr_drv_ratio[0] * Rnchannelon[0];
      I_off_n[0][0] = 1.52e-7;
      …
      I_off_n[0][100] = 6.1e-6;
    }
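To make the derived quantities in the CACTI 6.5 snippet concrete, the following sketch recomputes the three values that are not hard-coded (the P-type ON current and the two channel ON resistances) from the constants shown. The Python variable names are illustrative, and the per-micron units in the comments are an assumption about CACTI's conventions rather than something stated on the slide.

```python
# Constants copied from the 32nm block above.
vdd = 0.9                                    # V
i_on_n = 2211.7e-6                           # assumed A per micron of width
nmos_effective_resistance_multiplier = 1.49
n_to_p_eff_curr_drv_ratio = 2.41

# Derived exactly as in the snippet.
i_on_p = i_on_n / 2
r_n_channel_on = nmos_effective_resistance_multiplier * vdd / i_on_n
r_p_channel_on = n_to_p_eff_curr_drv_ratio * r_n_channel_on

print(f"I_on_p       = {i_on_p:.4e}")
print(f"Rnchannelon  = {r_n_channel_on:.2f}")   # assumed ohm * micron
print(f"Rpchannelon  = {r_p_channel_on:.2f}")   # assumed ohm * micron
```

Only Vdd, the ON current, and two fitted multipliers enter these resistances, which is why replacing ITRS projections with TCAD-extracted currents changes the delay model directly.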

Technological Parameters (cont'd)
- FinCACTI
  - Device-level parameters obtained from the Synopsys TCAD tool suite
  - Gate- and circuit-level parameters from Verilog-A-based SPICE simulations

7nm FinFET electrical parameters:

| Parameter | Value | Comment |
|---|---|---|
| Vdd (V) | 0.45 | Supply voltage |
| Vth (V) | 0.235 | Threshold voltage |
| ION,NMOS (A/µm) | 8.82e-04 | ON current of an N-type FinFET |
| ION,PMOS (A/µm) | 5.50e-04 | ON current of a P-type FinFET |
| IOFF,NMOS (A/µm) | 7.62e-08 | OFF current of an N-type FinFET |
| IOFF,PMOS (A/µm) | 1.16e-07 | OFF current of a P-type FinFET |
| Lphy (nm) | 7 | Physical gate length |
| Cg,ideal (F/µm) | 1.59e-16 | Ideal gate capacitance |
| PMOS to NMOS size ratio | 1.6 | |
| NAND2 stack effect factor | 0.4 | Stack effect of two N-type FinFETs |
| NAND3 stack effect factor | 0.2 | Stack effect of three N-type FinFETs |
| NOR2 stack effect factor | | Stack effect of two P-type FinFETs |

7nm FinFET geometry:

| Parameter name | Symbol | Value (nm) |
|---|---|---|
| Min gate length | LFIN | 7 |
| Fin width | TSI | 3.5 |
| Fin height | HFIN | 14 |
| Fin pitch | PFIN | 10.5 |
| Oxide thickness | Tox | 1.55 |

FinFET Layout: Single vs. Multiple Fins
- $P_{FIN}$: fin pitch, the minimum center-to-center distance between two adjacent parallel fins. It depends on the underlying FinFET technology.
- $N_{FIN}$: number of fins. For a FinFET with channel width $W$, $N_{FIN} = \lceil W / W_{min} \rceil$
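The fin-count relation can be sketched in a few lines. The version below assumes the ratio is rounded up to a whole number of fins (consistent with the deck's later remark that quantization can over-size a transistor) and uses the 7nm fin height of 14 nm from the parameter table; the function names are illustrative.

```python
import math

H_FIN_NM = 14.0                 # fin height from the 7nm parameter table
W_MIN_NM = 2.0 * H_FIN_NM       # W_min ~ 2 x H_FIN, width of a single fin


def fin_count(width_nm):
    """Number of fins needed for a requested channel width (rounded up)."""
    return math.ceil(width_nm / W_MIN_NM)


def effective_width_nm(width_nm):
    """Channel width actually obtained after quantization to whole fins."""
    return fin_count(width_nm) * W_MIN_NM


for requested in (28.0, 30.0, 56.0, 60.0):
    print(requested, fin_count(requested), effective_width_nm(requested))
```

A request for 30 nm already costs two fins here, so the delivered width (56 nm) is almost twice what was asked for.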

SRAM Cell Characteristics (SNM)
- SNM: static noise margin
- Butterfly curves: a common graphical representation of SNM
- 6T-n: a 6T SRAM cell whose pull-down transistors have n fins each
- The 6T-1 SRAM cell does not work properly in the 7nm technology because its pull-down transistor is too weak

| Cell | SNM (V) |
|---|---|
| 6T-2 | 0.0861 |
| 6T-3 | 0.0925 |
| 6T-4 | 0.0973 |
| 8T | 0.1776 |

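SRAM Cell Characteristics (Layout Area)

| Cell | Area (nm²) |
|---|---|
| 6T-1 | 6,615 |
| 6T-2 | 7,938 |
| 6T-3 | 9,261 |
| 6T-4 | 10,584 |
| 8T | |

Assuming very conservative design rules:
- $Y\text{-span} = 2 L_{FIN} + 14\lambda$
- $X\text{-span}_{6T\text{-}n} = 2(n-1) P_{FIN} + 30\lambda$
- $X\text{-span}_{8T} = 42\lambda$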
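The span formulas can be checked against the listed areas. The slide does not give a value for λ; the sketch below assumes λ = 3.5 nm (half of LFIN), chosen only because it reproduces all four 6T areas in the table. The 8T figure it prints is simply what the formula yields under that assumption, not a number taken from the slide.

```python
L_FIN_NM = 7.0      # from the 7nm parameter table
P_FIN_NM = 10.5     # from the 7nm parameter table
LAMBDA_NM = 3.5     # ASSUMPTION: not stated on the slide, inferred from the 6T areas


def y_span_nm():
    return 2 * L_FIN_NM + 14 * LAMBDA_NM


def x_span_6t_nm(n):
    """X-span of a 6T-n cell (n fins per pull-down transistor)."""
    return 2 * (n - 1) * P_FIN_NM + 30 * LAMBDA_NM


def x_span_8t_nm():
    return 42 * LAMBDA_NM


def area_6t_nm2(n):
    return x_span_6t_nm(n) * y_span_nm()


def area_8t_nm2():
    return x_span_8t_nm() * y_span_nm()


for n in range(1, 5):
    print(f"6T-{n}: {area_6t_nm2(n):.0f} nm^2")
print(f"8T (formula only, assumed lambda): {area_8t_nm2():.0f} nm^2")
```

Each extra pull-down fin widens the cell by 2·PFIN = 21 nm, so the area grows in equal steps of 21 nm × 63 nm = 1,323 nm².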

SRAM Cell Characteristics (Leakage Power)
- During the standby mode:
  - BL and BLB (or WBL and WBLB) are pre-charged to VDD
  - RBL is pre-discharged to 0
  - All word-lines are deactivated

| Cell | Pleak (nW) |
|---|---|
| 6T-1 | 0.67 |
| 6T-2 | 1.58 |
| 6T-4 | 1.92 |
| 8T | 1.32 |

Transistor Area
- Layouts of a transistor with channel width $W$ in planar CMOS and FinFET process technologies (figure panels: Planar CMOS, FinFET)
- The transistor's X-span is determined by contact-related design rules (similar for planar CMOS and FinFET) and the channel length ($L$).
- Channel width under the same layout footprint
  - $X\text{-span} = 31.5\,\text{nm}$, $Y\text{-span} = 21\,\text{nm}$, $L = L_{FIN} = 7\,\text{nm}$
  - CMOS: $W = 21\,\text{nm}$
  - FinFET ($H_{FIN} = 14\,\text{nm}$, $P_{FIN} = 10.5\,\text{nm}$): $\frac{W}{2 \times 14\,\text{nm}} \cdot 10.5\,\text{nm} = 21\,\text{nm} \Rightarrow W = 56\,\text{nm}$
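The same-footprint comparison can be generalized to other Y-spans with a short sketch. It assumes that a span of Y holds floor(Y / PFIN) fins, which matches the two-fin case worked on the slide but is otherwise an assumption; the function names are illustrative.

```python
def planar_width_nm(y_span_nm):
    """In planar CMOS the channel width equals the Y-span of the device."""
    return y_span_nm


def finfet_width_nm(y_span_nm, h_fin_nm=14.0, p_fin_nm=10.5):
    """Effective width of the fins that fit in the same Y-span.

    Assumes floor(Y-span / P_FIN) fins, each contributing 2 * H_FIN.
    """
    n_fin = int(y_span_nm // p_fin_nm)
    return n_fin * 2.0 * h_fin_nm


y = 21.0  # Y-span used on the slide, in nm
print(planar_width_nm(y), finfet_width_nm(y))
print(finfet_width_nm(y) / planar_width_nm(y))  # width gain for the same footprint
```

With the 7nm geometry the gain per fin pitch is 2·HFIN / PFIN = 28 / 10.5, so the FinFET offers roughly 2.7 times the channel width of a planar device in the same footprint.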

Gate and Diffusion Capacitances
- Width quantization property of FinFET devices
  - FinFET width can only take discrete values
  - The effective channel width ($W_{CH}$) may become larger than the required width (i.e., an over-sized transistor)

$$N_{FIN} = \left\lceil \frac{W}{W_{min}} \right\rceil$$
$$W_{CH} = N_{FIN} \cdot W_{min}$$
$$C_G(N_{FIN}) = \left(C_{g,ideal} + C_{ov} + C_{fr}\right) \cdot W_{CH}$$
$$C_D(N_{FIN}) = C_j \cdot A_D + C_{jsw} \cdot P_D + C_{jswg} \cdot W_{CH}$$
$$A_D = W_D \cdot T_{SI} \cdot N_{FIN}$$
$$P_D = 2 \cdot \left(W_D + T_{SI}\right) \cdot N_{FIN}$$

- $C_{g,ideal}$, $C_{ov}$, $C_{fr}$ denote ideal gate, overlap, and total fringing capacitances, respectively; $C_j$ is the unit-area drain junction capacitance; $C_{jsw}$ and $C_{jswg}$ are unit-length sidewall and gate sidewall junction capacitances, respectively; $W_D$ is the total drain width; $A_D$ and $P_D$ are the area and perimeter of the drain junction, respectively; $C_G$ and $C_D$ represent the total gate and drain capacitances, respectively.
- Junction parameters (BSIM-CMG 107.0.0): $C_j = 0.0005\ \text{F/m}^2$, $C_{jsw} = 5.0 \times 10^{-10}\ \text{F/m}$, $C_{jswg} = 0$
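A minimal sketch of the gate and drain capacitance model follows, in SI units. It assumes the grouping (Cg,ideal + Cov + Cfr) · WCH and PD = 2 · (WD + TSI) · NFIN for the flattened formulas. The overlap and fringing capacitances and the drain width WD are not given on the slides, so the values passed in the example are hypothetical placeholders.

```python
C_J = 0.0005        # F/m^2, unit-area drain junction capacitance (from the slide)
C_JSW = 5.0e-10     # F/m, sidewall junction capacitance (from the slide)
C_JSWG = 0.0        # F/m, gate sidewall junction capacitance (from the slide)
T_SI = 3.5e-9       # m, fin width (7nm parameter table)
W_MIN = 2 * 14e-9   # m, single-fin effective width, 2 x H_FIN


def gate_cap(n_fin, c_g_ideal, c_ov, c_fr):
    """Total gate capacitance in F; per-width inputs are in F/m."""
    w_ch = n_fin * W_MIN
    return (c_g_ideal + c_ov + c_fr) * w_ch


def drain_cap(n_fin, w_d):
    """Total drain capacitance in F for a drain of width w_d (m) per fin."""
    w_ch = n_fin * W_MIN
    a_d = w_d * T_SI * n_fin
    p_d = 2 * (w_d + T_SI) * n_fin
    return C_J * a_d + C_JSW * p_d + C_JSWG * w_ch


# 1.59e-16 F/um from the parameter table equals 1.59e-10 F/m.
C_G_IDEAL = 1.59e-10
# HYPOTHETICAL placeholders: overlap/fringe set to zero, drain width 20 nm.
print(gate_cap(1, C_G_IDEAL, c_ov=0.0, c_fr=0.0))
print(drain_cap(1, w_d=20e-9))
```

Both capacitances scale linearly with the fin count, so an over-sized (quantized) transistor pays for its extra fin in load as well as in area.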

8T SRAM Cell
- Modified row decoder
- Capacitances of read and write WLs, and read and write BLs, for a sub-array with $n$ rows and $m$ columns:

$$C_{RWL} = m \cdot \left(C_G(N_{FIN,M8}) + W_{Cell} \cdot C_W\right)$$
$$C_{WWL} = m \cdot \left(2 \cdot C_G(N_{FIN,M5}) + W_{Cell} \cdot C_W\right)$$
$$C_{RBL} = n \cdot \left(C_D(N_{FIN,M8})/2 + H_{Cell} \cdot C_W\right)$$
$$C_{WBL} = n \cdot \left(C_D(N_{FIN,M5})/2 + H_{Cell} \cdot C_W\right)$$

- The drain capacitance of each access transistor (M5, M6, and M8) is divided by two since each contact is shared between two vertically adjacent cells.
- $W_{Cell}$ and $H_{Cell}$ denote the width and height of the SRAM cell, respectively; $C_W$ represents the unit-length wire capacitance; $N_{FIN,Mi}$ is the number of fins in transistor $M_i$.
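The four line capacitances can be sketched as plain functions. This assumes each cell contributes its transistor capacitance plus one cell-sized wire segment, with the sum multiplied by the number of columns (word-lines) or rows (bit-lines); that grouping is a reading of the flattened formulas. All numeric inputs in the example are hypothetical placeholders.

```python
def read_wordline_cap(m, c_g_m8, w_cell, c_w):
    """C_RWL: one M8 gate plus one cell-wide wire segment per column."""
    return m * (c_g_m8 + w_cell * c_w)


def write_wordline_cap(m, c_g_m5, w_cell, c_w):
    """C_WWL: two access gates (M5 and M6, assumed equal) per column."""
    return m * (2 * c_g_m5 + w_cell * c_w)


def read_bitline_cap(n, c_d_m8, h_cell, c_w):
    """C_RBL: half of M8's drain (shared contact) plus wire per row."""
    return n * (c_d_m8 / 2 + h_cell * c_w)


def write_bitline_cap(n, c_d_m5, h_cell, c_w):
    """C_WBL: half of M5's drain (shared contact) plus wire per row."""
    return n * (c_d_m5 / 2 + h_cell * c_w)


# HYPOTHETICAL sub-array and device values, SI units.
rows, cols = 256, 128
c_gate, c_drain = 5e-18, 2.4e-17       # F per transistor
w_cell, h_cell = 150e-9, 63e-9         # m
c_wire = 2e-10                         # F/m
print(read_wordline_cap(cols, c_gate, w_cell, c_wire))
print(write_wordline_cap(cols, c_gate, w_cell, c_wire))
print(read_bitline_cap(rows, c_drain, h_cell, c_wire))
print(write_bitline_cap(rows, c_drain, h_cell, c_wire))
```

The read word-line carries one gate load per cell where the write word-line carries two, since the read word-line drives only M8 while the write word-line drives both M5 and M6.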

Simulation Setup
- For all simulations, a 4MB, 8-way set-associative L3 cache with the following configuration is assumed:

| Parameter | Value |
|---|---|
| Cache size | 4MB |
| Device type | HP |
| Block size | 64B |
| Associativity | 8 |
| Read/write ports | 1 |
| Bus width | 512 |
| Cache model | Uniform Cache Access |
| Number of banks | 4 |
| Temperature | 330K |
| Objective | Energy-Delay Product |

- Technological parameters of the 32nm (and 22nm) (½ metal pitch) planar CMOS processes are extracted from McPAT.
- Results of the 6T-1 cell under 7nm (gate length) FinFET are reported for comparison purposes.
- Supply voltages: 32nm: Vdd = 0.90V; 22nm: Vdd = 0.80V; 7nm: Vdd = 0.45V

Simulation Results (1)
- Feature size scaling
- Smaller footprint of FinFETs
- Vdd scaling
- Lower OFF current of FinFETs

Simulation Results (2)
- Capacitance scaling
- Higher ON current of FinFETs
- Smaller SRAM footprint in FinFETs
- Vdd scaling (for energy)

Simulation Results (3)

6T SRAM cell (6T-2):

| Technology | Access Time (ns) | Read Energy (nJ) | Leakage Power (mW) | Cache Area (mm²) |
|---|---|---|---|---|
| 32nm CMOS | 2.084 | 0.790 | 47.582 | 19.590 |
| 22nm CMOS | 1.744 | 0.447 | 59.829 | 9.240 |
| 16nm CMOS | 1.459 | 0.253 | 75.227 | 4.358 |
| 10nm CMOS | 1.221 | 0.143 | 94.588 | 2.056 |
| 7nm CMOS | 1.021 | 0.081 | 118.932 | 0.970 |
| 7nm FinFET | 0.569 | 0.048 | 19.873 | 0.826 |
| Scaling Factor | 0.84 | 0.57 | 1.26 | 0.47 |

8T SRAM cell:

| Technology | Access Time (ns) | Read Energy (nJ) | Leakage Power (mW) | Cache Area (mm²) |
|---|---|---|---|---|
| 32nm CMOS | 1.397 | 0.493 | 59.199 | 15.545 |
| 22nm CMOS | 1.164 | 0.278 | 76.135 | 7.345 |
| 16nm CMOS | 0.970 | 0.157 | 97.917 | 3.470 |
| 10nm CMOS | 0.809 | 0.089 | 125.930 | 1.640 |
| 7nm CMOS | 0.674 | 0.050 | 161.957 | 0.775 |
| 7nm FinFET | 0.498 | 0.043 | 23.187 | 0.714 |
| Scaling Factor | 0.83 | 0.56 | 1.29 | 0.47 |
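The "Scaling Factor" row appears to be the ratio of the 22nm CMOS results to the 32nm CMOS results, and the 16nm to 7nm CMOS rows look consistent with applying that factor repeatedly. The sketch below checks the first observation for the 6T-2 table and compares 7nm FinFET with the 7nm CMOS row; the interpretation is an inference from the numbers, not something the slide states.

```python
# (access time ns, read energy nJ, leakage mW, area mm^2) for the 6T-2 cache.
CMOS_32 = (2.084, 0.790, 47.582, 19.590)
CMOS_22 = (1.744, 0.447, 59.829, 9.240)
CMOS_7 = (1.021, 0.081, 118.932, 0.970)
FINFET_7 = (0.569, 0.048, 19.873, 0.826)


def ratios(new, old):
    """Element-wise new/old ratios, rounded to two decimals."""
    return [round(a / b, 2) for a, b in zip(new, old)]


scaling_factor = ratios(CMOS_22, CMOS_32)
finfet_vs_cmos = ratios(FINFET_7, CMOS_7)

print("22nm / 32nm CMOS:      ", scaling_factor)
print("7nm FinFET / 7nm CMOS: ", finfet_vs_cmos)
```

Under this reading, the largest gap between the 7nm FinFET cache and the 7nm CMOS row is in leakage power, with smaller reductions in access time, read energy, and area.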

Future Work
- XML interfaces for
  - Technological parameters
  - SRAM cell configuration
- Dual-Vdd support
  - Super- and near-threshold regimes
  - ON/OFF currents and sense-amplifier characteristics for the near-threshold regime
- Dual-gate controlled SRAM cells
  - SRAM cell layout area, ON/OFF currents of dual-gate FinFETs
- 14nm planar CMOS designed using TCAD tools
- Updated wire parameters
- Technical report and a web interface for FinCACTI