Slide 1: FinCACTI: Architectural Analysis and Modeling of Caches with Deeply-scaled FinFET Devices
Alireza Shafaei, Yanzhi Wang, Xue Lin, and Massoud Pedram
Department of Electrical Engineering, University of Southern California
Slide 3: Introduction
- Memory design in deeply-scaled CMOS technologies
  - Increased short channel effects (SCE)
  - Higher sensitivity to device mismatches
- Cache memories based on the conventional 6T SRAM cell using planar CMOS devices may fail to function because of poor cell stability (read stability and write-ability)
- Solutions to enhance the cell stability
  - Device-level: use quasi-planar FinFET devices
  - Circuit-level: introduce robust SRAM cell structures, e.g., 8T SRAM cells
Slide 4: FinFET-based SRAM cells
- FinFET devices
  - Improved gate control (and lower impact of source and drain terminals) over the channel
    - Reduces SCE
  - Higher ON/OFF current ratio and improved energy efficiency
  - Superior physical scalability
  - Higher immunity to random variations and soft errors
  - Technology-of-choice beyond the 10nm CMOS node
- FinFET geometries
  - LFIN: fin (gate) length
  - TSI: fin width
  - HFIN: fin height
  - Wmin: effective channel width of a single fin (Wmin ≈ 2 × HFIN)
Slide 5: Robust SRAM Cells
- Conventional 6T SRAM cell
  - Read stability: the pull-down transistor must be stronger than the access transistor
  - Write-ability: the pull-up transistor must be weaker than the access transistor
  - $W_{M3} \le W_{M5} \le W_{M1}$ (M3: pull-up, M5: access, M1: pull-down)
  - Vulnerable especially in technology nodes below 16nm, where process variations become a severe issue
- 8T SRAM cell
  - Decouples the storage node from the read bit-line
  - No constraint needed for read stability
  - Improved cell stability
  - Separate read path
Slide 6: Architecture-level Memory Modeling
- CACTI: a widely-used delay, power, and area modeling tool for cache and memory systems
- CACTI 6.5
Reference: N. Muralimanohar, R. Balasubramonian, and N. Jouppi, "Optimizing NUCA Organizations and Wiring Alternatives for Large Caches With CACTI 6.0," MICRO-40, 2007.
Slide 7: CACTI Shortcomings for Future Memory Designs
- Only supports planar CMOS devices for the following technology nodes
  - Metal pitch values: 90nm, 65nm, 45nm, 32nm, 22nm (with McPAT)
- Inaccurate technological parameters
  - Extracted from ITRS documents (transistor and wire parameter values are predictions and best expert opinions from 2005 ITRS)
- Only supports conventional 6T SRAM cell designs
  - A 6T SRAM cell design optimized for a 130nm process is adopted for all technology nodes
  - The impact of Vdd scaling and device mismatches is ignored
Slide 8: Prior Work: CACTI-FinFET
- CACTI-FinFET
  - Process variation models
  - The name was later changed to CACTI-PVT
  - Exact quote: "For FinFETs in the deep submicron regime, satisfactory analytical models are still not available"
  - Lookup tables used to store gate-level power/timing parameters
- Our approach (FinCACTI)
  - Develop and use analytical models for calculating gate-level parameters from technology-dependent device-level characteristics
  - Easier to add new CMOS technologies or new devices
Reference: C.-Y. Lee and N. Jha, "CACTI-FinFET: An Integrated Delay and Power Modeling Framework for FinFET-based Caches under Process Variations," DAC, 2011.
Slide 9: FinCACTI
- Accurate technological parameters for deeply-scaled (7nm) FinFET devices from the Synopsys Technology Computer-Aided Design (TCAD) tool suite
  - ON/OFF currents of N- and P-type fins (for temperatures ranging from 300K to 400K)
- SPICE-compatible Verilog-A models to derive gate- and circuit-level parameters (e.g., the PMOS to NMOS size ratio and the stack effect factor), and to characterize FinFET-based SRAM cells (static noise margin and leakage power)
- Area and capacitance models for FinFET devices
- Layout area, power, and access delay calculations for FinFET-based 6T and 8T SRAM cells
- Architectural support for the 8T SRAM cell
Slide 11: Technological Parameters (cont'd)
- FinCACTI
  - Device-level parameters obtained by the Synopsys TCAD tool suite
  - Gate- and circuit-level parameters from Verilog-A-based SPICE simulations

7nm FinFET
Parameter                    Value      Comment
Vdd (V)                      0.45       Supply voltage
Vth (V)                      0.235      Threshold voltage
ION,NMOS (A/µm)              8.82e-04   ON current of an N-type FinFET
ION,PMOS (A/µm)              5.50e-04   ON current of a P-type FinFET
IOFF,NMOS (A/µm)             7.62e-08   OFF current of an N-type FinFET
IOFF,PMOS (A/µm)             1.16e-07   OFF current of a P-type FinFET
Lphy (nm)                    7          Physical gate length
Cg,ideal (F/µm)              1.59e-16   Ideal gate capacitance
PMOS to NMOS size ratio      1.6
NAND2 stack effect factor    0.4        Stack effect of two N-type FinFETs
NAND3 stack effect factor    0.2        Stack effect of three N-type FinFETs
NOR2 stack effect factor     [missing]  Stack effect of two P-type FinFETs

Parameter name     Symbol   Value (nm)
Min gate length    LFIN     7
Fin width          TSI      3.5
Fin height         HFIN     14
Fin pitch          PFIN     10.5
Oxide thickness    Tox      1.55
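As a quick sanity check on the 7nm FinFET table, the sketch below computes the ON/OFF current ratio of each device type and the ON current of a single fin. It assumes the tabulated currents are normalized to the effective channel width, so that one fin contributes Wmin ≈ 2 × HFIN = 28 nm; the slides do not state the normalization, and the function names are illustrative.

```python
# Per-unit-width currents taken from the 7nm FinFET table (A/um).
ION = {"nmos": 8.82e-04, "pmos": 5.50e-04}
IOFF = {"nmos": 7.62e-08, "pmos": 1.16e-07}

H_FIN_NM = 14.0
# Assumption: currents are normalized to effective width, Wmin ~ 2 x HFIN.
W_MIN_UM = 2 * H_FIN_NM / 1000.0  # 0.028 um


def on_off_ratio(device):
    """ON/OFF current ratio of one device type ('nmos' or 'pmos')."""
    return ION[device] / IOFF[device]


def per_fin_on_current(device):
    """ON current (A) carried by a single fin under the Wmin assumption."""
    return ION[device] * W_MIN_UM


for dev in ("nmos", "pmos"):
    print(f"{dev}: ION/IOFF = {on_off_ratio(dev):.0f}, "
          f"ION per fin = {per_fin_on_current(dev):.3e} A")
```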
Slide 12: FinFET Layout: Single vs. Multiple Fins
- PFIN: fin pitch, or the minimum center-to-center distance between two adjacent parallel fins
  - Depends on the underlying FinFET technology
- NFIN: number of fins
  - For a FinFET with channel width of W: $N_{FIN} = \lceil W / W_{min} \rceil$
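A minimal sketch of the fin-count rule, assuming the required width is rounded up to a whole number of fins and using Wmin ≈ 2 × HFIN = 28 nm from the 7nm geometry. The function names are illustrative, not FinCACTI's.

```python
import math

H_FIN_NM = 14.0
W_MIN_NM = 2 * H_FIN_NM  # effective width of one fin, ~28 nm


def fin_count(required_width_nm, w_min_nm=W_MIN_NM):
    """Smallest number of fins whose total width covers the required width."""
    if required_width_nm <= 0:
        raise ValueError("required width must be positive")
    return math.ceil(required_width_nm / w_min_nm)


def effective_width_nm(required_width_nm, w_min_nm=W_MIN_NM):
    """Channel width actually obtained after rounding up to whole fins."""
    return fin_count(required_width_nm, w_min_nm) * w_min_nm


# A 50 nm requirement needs two fins and yields a 56 nm (over-sized) device.
print(fin_count(50.0), effective_width_nm(50.0))
```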
Slide 13: SRAM Cell Characteristics (SNM)
- SNM: Static Noise Margin
- Butterfly curves: common graphical representation of SNM
- 6T-n: a 6T SRAM cell whose pull-down transistors have n fins each
- The 6T-1 SRAM cell does not work properly in the 7nm technology because its pull-down transistor is too weak

Cell    SNM (V)
6T-2    0.0861
6T-3    0.0925
6T-4    0.0973
8T      0.1776
Slide 15: SRAM Cell Characteristics (Leakage Power)
- During the standby mode:
  - BL and BLB (or WBL and WBLB) are pre-charged to VDD
  - RBL is pre-discharged to 0
  - All word-lines are deactivated

Cell    Pleak (nW)
6T-1    0.67
6T-2    1.58
6T-4    1.92
8T      1.32
Slide 16: Transistor Area
- Layouts of a transistor with channel width of W in planar CMOS and FinFET process technologies [figure: planar CMOS and FinFET layouts]
- The transistor's X-span is determined by contact-related design rules (similar for planar CMOS and FinFET) and the channel length (L)
- Channel width under the same layout footprint
  - X-span = 31.5nm, Y-span = 21nm, L = LFIN = 7nm
  - CMOS: W = 21nm
  - FinFET (HFIN = 14nm, PFIN = 10.5nm): $\frac{W}{2 \times 14\,\mathrm{nm}} \cdot 10.5\,\mathrm{nm} = 21\,\mathrm{nm} \Rightarrow W = 56\,\mathrm{nm}$
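The same-footprint comparison can be reproduced with a few lines of arithmetic. This sketch assumes that a planar device's channel width equals the Y-span, and that a FinFET fits one fin per fin pitch within the Y-span, each contributing 2 × HFIN of width. The helper names are illustrative.

```python
def planar_width_nm(y_span_nm):
    """Planar CMOS: the channel width equals the Y-span of the layout."""
    return y_span_nm


def finfet_width_nm(y_span_nm, p_fin_nm, h_fin_nm):
    """FinFET: whole fins that fit in the Y-span, each ~2 x HFIN wide."""
    n_fin = int(y_span_nm // p_fin_nm)
    return n_fin * 2 * h_fin_nm


y_span = 21.0  # nm, from the slide
w_planar = planar_width_nm(y_span)
w_finfet = finfet_width_nm(y_span, p_fin_nm=10.5, h_fin_nm=14.0)
print(w_planar, w_finfet, round(w_finfet / w_planar, 2))
```

With the slide's numbers, two fins fit in the 21 nm Y-span, so the FinFET offers 56 nm of channel width against 21 nm for the planar device.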
Slide 17: Gate and Diffusion Capacitances
- Width quantization property of FinFET devices
  - FinFET width can only take discrete values
  - The effective channel width ($W_{CH}$) may become larger than the required width (i.e., an over-sized transistor)
- Equations
  - $N_{FIN} = \lceil W / W_{min} \rceil$
  - $W_{CH} = N_{FIN} \cdot W_{min}$
  - $C_G(N_{FIN}) = (C_{g,ideal} + C_{ov} + C_{fr}) \cdot W_{CH}$
  - $C_D(N_{FIN}) = C_j \cdot A_D + C_{jsw} \cdot P_D + C_{jswg} \cdot W_{CH}$
  - $A_D = W_D \cdot T_{SI} \cdot N_{FIN}$
  - $P_D = 2 \cdot (W_D + T_{SI}) \cdot N_{FIN}$
- Notation
  - $C_{g,ideal}$, $C_{ov}$, $C_{fr}$ denote ideal gate, overlap, and total fringing capacitances, respectively
  - $C_j$ is the unit area drain junction capacitance
  - $C_{jsw}$ and $C_{jswg}$ are unit length sidewall and gate sidewall junction capacitances, respectively
  - $W_D$ is the total drain width
  - $A_D$ and $P_D$ are the area and perimeter of the drain junction, respectively
  - $C_G$ and $C_D$ represent the total gate and drain capacitances, respectively
- Junction capacitance values (BSIM-CMG)
  - $C_j$ = [value missing] F/m^2
  - $C_{jsw}$ = 5.0e-10 F/m
  - $C_{jswg}$ = 0
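The capacitance model on this slide can be sketched as a single function. This is an illustration under stated assumptions, not FinCACTI's code: all quantities are in SI units, the fin count is rounded up with a small tolerance to absorb floating-point error, and the values used for W_D, C_ov, C_fr, and C_j in the example call are hypothetical placeholders because the slides do not give them.

```python
import math


def finfet_caps(w_required, w_min, t_si, w_d,
                c_g_ideal, c_ov, c_fr, c_j, c_jsw, c_jswg=0.0):
    """Return (N_FIN, W_CH, C_G, C_D) for a FinFET.

    Lengths in m; c_g_ideal, c_ov, c_fr, c_jsw, c_jswg in F/m; c_j in F/m^2.
    """
    # Width quantization: round up to whole fins (tolerance for float noise).
    n_fin = max(1, math.ceil(w_required / w_min - 1e-9))
    w_ch = n_fin * w_min
    c_g = (c_g_ideal + c_ov + c_fr) * w_ch
    a_d = w_d * t_si * n_fin              # drain junction area
    p_d = 2 * (w_d + t_si) * n_fin        # drain junction perimeter
    c_d = c_j * a_d + c_jsw * p_d + c_jswg * w_ch
    return n_fin, w_ch, c_g, c_d


# Example: C_g,ideal = 1.59e-16 F/um = 1.59e-10 F/m and C_jsw = 5.0e-10 F/m
# come from the slides; w_d, c_ov, c_fr, and c_j are hypothetical placeholders.
n_fin, w_ch, c_g, c_d = finfet_caps(
    w_required=50e-9, w_min=28e-9, t_si=3.5e-9, w_d=10e-9,
    c_g_ideal=1.59e-10, c_ov=0.0, c_fr=0.0, c_j=0.0, c_jsw=5.0e-10)
print(n_fin, w_ch, c_g, c_d)
```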
Slide 18: 8T SRAM Cell
- Modified row decoder
- Capacitances of read and write WLs, and read and write BLs, for a sub-array with n rows and m columns:
  - $C_{RWL} = m \cdot (C_G(N_{FIN,M8}) + W_{Cell} \cdot C_W)$
  - $C_{WWL} = m \cdot (2 \cdot C_G(N_{FIN,M5}) + W_{Cell} \cdot C_W)$
  - $C_{RBL} = n \cdot (C_D(N_{FIN,M8})/2 + H_{Cell} \cdot C_W)$
  - $C_{WBL} = n \cdot (C_D(N_{FIN,M5})/2 + H_{Cell} \cdot C_W)$
- The drain capacitance of each access transistor (M5, M6, and M8) is divided by two since each contact is shared between two vertically adjacent cells
- $W_{Cell}$ and $H_{Cell}$ denote the width and height of the SRAM cell, respectively; $C_W$ represents the unit length wire capacitance; $N_{FIN,Mi}$ is the number of fins in transistor $M_i$
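The four line capacitances can be written as one helper. This sketch assumes the per-cell term in each formula is multiplied as a whole by the number of columns (word-lines) or rows (bit-lines). The gate and drain capacitances of M5 and M8 are passed in as precomputed values, and every number in the example call is a made-up placeholder chosen only to make the arithmetic easy to follow.

```python
def line_capacitances_8t(n_rows, m_cols, c_g_m8, c_g_m5,
                         c_d_m8, c_d_m5, w_cell, h_cell, c_wire):
    """Word-line and bit-line capacitances of an n x m 8T sub-array."""
    return {
        # Read word-line: one read-access gate (M8) per cell.
        "RWL": m_cols * (c_g_m8 + w_cell * c_wire),
        # Write word-line: two write-access gates (M5, M6) per cell.
        "WWL": m_cols * (2 * c_g_m5 + w_cell * c_wire),
        # Bit-lines: drain contacts are shared by two vertically adjacent cells.
        "RBL": n_rows * (c_d_m8 / 2 + h_cell * c_wire),
        "WBL": n_rows * (c_d_m5 / 2 + h_cell * c_wire),
    }


# Placeholder values in arbitrary units, for illustration only.
caps = line_capacitances_8t(n_rows=2, m_cols=3, c_g_m8=1.0, c_g_m5=2.0,
                            c_d_m8=4.0, c_d_m5=6.0,
                            w_cell=10.0, h_cell=20.0, c_wire=0.5)
print(caps)
```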
Slide 19: Simulation Setup
- For all simulations, a 4MB, 8-way set-associative L3 cache with the following configuration is assumed
- Technological parameters of the 32nm (and 22nm) (½ metal pitch) planar CMOS process are extracted from McPAT
- Results of the 6T-1 cell under 7nm (gate length) FinFET are reported for comparison purposes

Parameter          Value
Cache size         4MB
Device type        HP
Block size         64B
Associativity      8
Read/write ports   1
Bus width          512
Cache model        Uniform Cache Access
Number of banks    4
Temperature        330K
Objective          Energy-Delay Product

Technology   Vdd
32nm         0.90V
22nm         0.80V
7nm          0.45V
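For orientation, the cache organization implied by this configuration can be derived with simple arithmetic. The sketch assumes 4MB means 4 × 2^20 bytes, that all sizes are powers of two, and that sets are split evenly across banks; the slides do not describe how FinCACTI partitions banks.

```python
def cache_geometry(cache_bytes, block_bytes, associativity, banks):
    """Derive block/set counts and address-field widths (power-of-two sizes)."""
    num_blocks = cache_bytes // block_bytes
    num_sets = num_blocks // associativity
    return {
        "blocks": num_blocks,
        "sets": num_sets,
        "sets_per_bank": num_sets // banks,        # assumes an even split
        "offset_bits": block_bytes.bit_length() - 1,  # log2 of block size
        "index_bits": num_sets.bit_length() - 1,      # log2 of set count
    }


geometry = cache_geometry(4 * 1024 * 1024, 64, 8, 4)
print(geometry)
```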
Slide 20: Simulation Results (1)
[Figure: result plots not recovered; annotations on the plots follow]
- Feature size scaling
- Smaller footprint of FinFETs
- Vdd scaling
- Lower OFF current of FinFETs
Slide 21: Simulation Results (2)
[Figure: result plots not recovered; annotations on the plots follow]
- Capacitance scaling
- Higher ON current of FinFETs
- Smaller SRAM footprint in FinFETs
- Vdd scaling (for energy)
Slide 22: Simulation Results (3)

                 Access Time (ns)   Read Energy (nJ)   Leakage Power (mW)   Cache Area (mm^2)
32nm CMOS        2.084              0.790              47.582               19.590
22nm CMOS        1.744              0.447              59.829               9.240
16nm CMOS        1.459              0.253              75.227               4.358
10nm CMOS        1.221              0.143              94.588               2.056
7nm CMOS         1.021              0.081              [missing]            0.970
7nm FinFET       0.569              0.048              19.873               0.826
Scaling Factor   0.84               0.57               1.26                 0.47
8T SRAM Cell

                 Access Time (ns)   Read Energy (nJ)   Leakage Power (mW)   Cache Area (mm^2)
32nm CMOS        1.397              0.493              59.199               15.545
22nm CMOS        1.164              0.278              76.135               7.345
16nm CMOS        0.970              0.157              97.917               3.470
10nm CMOS        0.809              0.089              [missing]            1.640
7nm CMOS         0.674              0.050              [missing]            0.775
7nm FinFET       0.498              0.043              23.187               0.714
Scaling Factor   0.83               0.56               1.29                 0.47
6T SRAM Cell (6T-2)
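The "Scaling Factor" row appears to match the ratio between the 22nm and 32nm CMOS results, and the CMOS rows below 22nm appear to follow from applying that ratio once per node. This is an observation from the numbers, not a method stated on the slides. The sketch below checks it against the first table on this slide; the function names are illustrative.

```python
# (32nm, 22nm) planar CMOS values from the first table on this slide.
FIRST_TABLE = {
    "access_time_ns": (2.084, 1.744),
    "read_energy_nj": (0.790, 0.447),
    "leakage_mw": (47.582, 59.829),
    "area_mm2": (19.590, 9.240),
}


def scaling_factor(value_32nm, value_22nm):
    """Per-node scaling factor inferred from the two simulated CMOS nodes."""
    return value_22nm / value_32nm


def project(value_22nm, factor, nodes_beyond_22nm):
    """Extrapolate from 22nm by applying the factor once per node."""
    return value_22nm * factor ** nodes_beyond_22nm


for name, (v32, v22) in FIRST_TABLE.items():
    f = scaling_factor(v32, v22)
    # One node beyond 22nm corresponds to the 16nm CMOS row.
    print(f"{name}: factor = {f:.2f}, projected 16nm = {project(v22, f, 1):.3f}")
```

Under this reading, the 7nm FinFET row is a simulated result, while the 16nm, 10nm, and 7nm CMOS rows are extrapolations used as a baseline for comparison.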
Slide 23: Future Work
- XML interfaces for
  - Technological parameters
  - SRAM cell configuration
- Dual-Vdd support
  - Super- and near-threshold regimes
  - ON/OFF currents and sense-amplifier characteristics for the near-threshold regime
- Dual-gate controlled SRAM cells
  - SRAM cell layout area, ON/OFF currents of dual-gate FinFETs
- 14nm planar CMOS designed using TCAD tools
- Updated wire parameters
- Technical report and a web interface for FinCACTI