
Csaba Andras-Moritz ECE 668 3D IC Technology and Emerging 3D Processors.


1 Csaba Andras-Moritz ECE 668 3D IC Technology and Emerging 3D Processors

2 Outline
- Motivation
- TSV-based 3D IC and Monolithic 3D IC
- Skybridge Fabric
- NP-dynamic Skybridge Fabric
- Skybridge-3D-CMOS Fabric

3 Motivation
- Device scaling challenges
  - V_DD and V_TH are not scaling linearly
  - Secondary effects
- System-level power/performance challenges
  - Interconnection bottleneck
  - Increasing RC
- Fabrication challenges
  - Lithography limitations
  - Doping control challenges
Figures: Performance trend [1]; Lithography challenge with scaling [1]
[1] J. Warnock, "Circuit Design Challenges at the 14nm Technology Node", pp. 464-467, DAC 2011

4 Monolithic 3D IC vs. TSV-based 3D IC
- TSV 3D ICs use a normal die process but need special packaging
- Monolithic 3D ICs use smaller vias and apply a sequential process for each die
- TSV 3D ICs have an easier process and higher throughput
- Monolithic 3D ICs have better RC reduction
Figures: Implementation of TSV-based 3D ICs [2] (via dimension: 5-10 μm); Block-level monolithic 3D [3] (via dimension: 50-100 nm)
[2] J. H. Lau, et al., "TSV manufacturing yield and hidden costs for 3D IC integration", pp. 1031-1042, ECTC, 2010
[3] S. Panth, et al., "High-density integration of functional modules using monolithic 3D-IC technology", pp. 681-686, ASP-DAC, 2013

5 Gate-level vs. Transistor-level Monolithic 3D IC
- Transistor-level monolithic 3D: fine-grained 3D IC with intra-cell benefits
  - Simplified process for each die because each die holds a single type of MOS
  - Lower cost per layer because fewer mask layers are needed
  - Uses existing commercial CAD tools for placement and routing
Figures: Gate-level monolithic 3D [4]; Transistor-level monolithic 3D
[4] S. Panth, et al., "Design challenges and solutions for ultra-high-density monolithic 3D ICs", pp. 1-2, S3S 2014

6 Overview of Transistor-Level Monolithic 3D IC
- An inter-layer dielectric avoids coupling between tiers
- Monolithic inter-layer vias connect the pull-up and pull-down networks
- Cell-to-cell routing uses the metal layers in the top tier

7 Monolithic 3D IC Bottom-up Sequential Process

8 Skybridge: 3D Integrated Framework
- True vertical integration
  - Addresses 3D device, circuit, connectivity, heat management and manufacturing requirements
  - Follows a template-based approach with uniform vertical nanowires
  - Aims for benefits in density, power and performance
Figure: Abstract view of the envisioned Skybridge fabric [5]
[5] M. Rahman, et al., "Fine-grained 3-D integrated circuit fabric using vertical nanowires", pp. 9.3.1-9.3.7, 3DIC 2015

9 Skybridge Fabric Components
- Fabric assembly by integration of core components
  - Specially architected for 3D
Figure: Core fabric components

10 Vertical Gate-all-Around Junctionless Transistor
- Single type of uniform V-GAA junctionless transistor as the active device
- Simple device structure
  - Uniform doping
  - Device formation by material deposition
Figure: Junctionless device structure and TCAD simulation results

11 Skybridge 3D Circuit Style
- Dynamic circuit style amenable to physical implementation
  - Uses only a single type of uniform n-type V-GAA junctionless transistor
- Supports compound and cascaded dynamic circuits with both dual-rail and single-rail inputs
Figure: Skybridge 3D circuit style. A) XOR schematic; B) HSPICE simulation; C) XOR layout [6]
[6] M. Rahman, et al., "Skybridge: 3-D Integrated Circuit Technology Alternative to CMOS", 2014; http://arxiv.org/abs/1404.0607
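To make the dynamic circuit style concrete, here is a minimal behavioural sketch of a textbook n-type dynamic NAND gate in Python. It is an assumption-laden illustration, not the Skybridge circuit itself: the class name `DynamicNand`, the two-phase `PRE`/`EVA` stepping and the ideal charge storage on the output node are all hypothetical simplifications.

```python
class DynamicNand:
    """Textbook behavioural model of an n-type dynamic NAND gate (illustrative only)."""

    def __init__(self):
        self.out = 0  # output node starts discharged

    def step(self, phase, inputs):
        if phase == "PRE":
            # Precharge: the output node is charged high regardless of the inputs.
            self.out = 1
        elif phase == "EVA":
            # Evaluate: the series n-type stack discharges the node only if
            # every input is high; otherwise the node keeps its charge.
            if all(inputs):
                self.out = 0
        else:
            raise ValueError(f"unknown phase: {phase!r}")
        return self.out


# Run one precharge/evaluate cycle per input combination.
gate = DynamicNand()
results = {}
for a in (0, 1):
    for b in (0, 1):
        gate.step("PRE", (a, b))
        results[(a, b)] = gate.step("EVA", (a, b))

for inputs, out in sorted(results.items()):
    print(inputs, "->", out)
```

Because the output can only fall during evaluation, a single transistor type is enough to realise the gate, which is the property the slide relies on.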

12 Skybridge Circuit Styles, and High Fan-In Options
- Various circuit options for optimization
- High fan-in support due to the dynamic circuit style
Figures: Compound vs. cascaded circuits with dual rail and single rail; Fan-in sensitivity analysis

13 Full Adder Implementation in Skybridge
- Follows the Skybridge circuit style
- Utilizes the fabric components
  - The 32 transistors of the full adder are accommodated in just 4 logic nanowires
Figures: Full adder schematic; Full-adder layout; HSPICE simulated waveforms

14 Volatile Memory in Skybridge Fabric
- Volatile memory with a single transistor type
  - No sizing/doping requirements as in SRAM
- Two cross-coupled dynamic NAND gates for storage
  - Addresses noise and leakage power concerns through circuit-level design
Figures: Skybridge RAM schematic; Simulated HSPICE waveforms

15 Noise Mitigation in Skybridge
- Intrinsic fabric features for noise mitigation
  - Engineered GND shielding approach
  - All signal routing through coaxial structures

16 Signal Pull-Up and Delay Mitigation
- Higher gate voltage on the "Precharge" transistors to boost current
- Long interconnect delay mitigation
  - Logic replication, dynamic buffer insertion, and CMOS-like inverters in long interconnect paths
Figure: Inverters in long interconnect path [7]
[7] S. Khasanvis, et al., "Architecting Connectivity for Fine-grained 3-D Vertically Integrated Circuits", NANOARCH, pp. 175-180, 2015

17 Arithmetic Circuit Design Examples in Skybridge
- Arithmetic circuit design examples with adders and a multiplier
- High fan-in circuit designs to evaluate scalability potential
Figures: Array multiplier design (block diagram); 8- and 16-bit CLA designs (block diagram)

18 Benchmarking Results (Skybridge vs. 2D-CMOS)
Benchmarking with respect to equivalent CMOS designs at 16nm
- WISP-4: 30x density and 3.5x performance/watt benefits
- High bit-width arithmetic circuits: the 16-bit CLA design achieves 60.5x density and 16.5x performance/watt benefits
- Analytical interconnect modeling results: 10x less interconnect length and 100x lower repeater count

Benchmarking of Arithmetic Circuits (SB = Skybridge)
                  Throughput (s^-1)   Power (μW)          Area (μm^2)
Circuit           CMOS      SB        CMOS      SB        CMOS      SB
4-Bit Multiplier  5.0e9     5.1e9     42.3      172       50        1.27
4-Bit CLA         9.9e9     10.4e9    235       19.4      18.7      0.76
8-Bit CLA         4.5e9     5.7e9     287       23.5      64.7      1.34
16-Bit CLA        2.4e9     3.7e9     297       27.8      130.2     2.15
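The quoted 16-bit CLA benefits can be cross-checked from the table values. The short Python sketch below assumes that "density benefit" means the ratio of areas and that "performance/watt" means throughput divided by power; both definitions are assumptions inferred from the column names, and the dictionary layout and function name are hypothetical.

```python
# 16-bit CLA row from the benchmarking table (throughput in 1/s, power in uW, area in um^2).
cla16 = {
    "cmos": {"throughput": 2.4e9, "power_uw": 297.0, "area_um2": 130.2},
    "sb": {"throughput": 3.7e9, "power_uw": 27.8, "area_um2": 2.15},
}


def perf_per_watt(design):
    """Operations per joule: throughput divided by power in watts."""
    return design["throughput"] / (design["power_uw"] * 1e-6)


# Density benefit taken as the area ratio (smaller area means higher density).
density_gain = cla16["cmos"]["area_um2"] / cla16["sb"]["area_um2"]
ppw_gain = perf_per_watt(cla16["sb"]) / perf_per_watt(cla16["cmos"])

print(f"density benefit:          {density_gain:.1f}x")
print(f"performance/watt benefit: {ppw_gain:.1f}x")
```

Under these definitions both ratios land within rounding of the 60.5x and 16.5x figures quoted on the slide.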

19 NP-Dynamic Skybridge and Skybridge-3D-CMOS Fabric
Integrated fabric approaches extending Skybridge 3-D concepts to incorporate both n-type and p-type transistors
- NP-Dynamic-Skybridge (NP-D-SB): an integrated framework to achieve NP-dynamic circuits in vertical nanowires
- SkyBridge-3D-CMOS (S3DC): an integrated framework to achieve static circuits in vertical nanowires

20 Fabric Components
- Fabric components designed specifically to incorporate both p- and n-type transistors
  - Vertical Si nanowire array with p- and n-doped regions as building blocks
  - Device engineering for designing both p- and n-type transistors
  - SB-ILC provides an Ohmic connection between doping regions
Figure: Fabric components [8]
[8] J. Shi, et al., "Architecting NP-Dynamic Skybridge", NANOARCH, pp. 169-174, 2015

21 NP-D-SB NOR and NAND Gate
- Support for elementary logic gates
  - NAND and NOR with vertically stacked transistors in a single nanowire
- Compact implementation
  - A 5-input NOR needs 5 nanowires in SB but only one nanowire in NP-D-SB
Figures: NOR gate; NAND gate

22 Compound Gates in NP-D-SB
- Improved logic diversity and flexibility
  - Skybridge is limited to AND-of-NANDs logic for compound gates
  - NP-dynamic SB has both OR-of-NORs and AND-of-NANDs gate logic
  - Diversity in logic expression helps to build compact circuits
Figures: OR-of-NORs gate logic; AND-of-NANDs gate logic
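As a small illustration of what the two compound forms can express, the Python sketch below checks two standard Boolean identities with dual-rail inputs: an AND of two NANDs yields XOR, and an OR of two NORs yields XNOR. This is only a truth-table exercise; the function names are hypothetical and nothing here models the actual transistor stacks in either fabric.

```python
from itertools import product


def nand(*bits):
    return int(not all(bits))


def nor(*bits):
    return int(not any(bits))


def and_of_nands(a, b):
    # Dual-rail inputs: both the true and complemented signals are available.
    return nand(a, b) & nand(1 - a, 1 - b)


def or_of_nors(a, b):
    return nor(a, b) | nor(1 - a, 1 - b)


for a, b in product((0, 1), repeat=2):
    assert and_of_nands(a, b) == a ^ b        # XOR
    assert or_of_nors(a, b) == 1 - (a ^ b)    # XNOR
    print(a, b, and_of_nands(a, b), or_of_nors(a, b))
```

Having both forms available means a function and its complement can each be built directly as one compound gate, which is one plausible reading of the "compact circuit" claim.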

23 Cascaded Gates in NP-D-SB
- Uses a uniform set of {PRE, EVA} clocks to control circuits
- No monotonicity problem when cascading n-type and p-type gates
Figures: Cascaded gates schematic; Cascaded gates layout; HSPICE simulation

24 Benchmarking Results (NP-D-SB vs. SB)
- Significant benefits for latency, power-latency product and density
  - 3x latency benefit over the Skybridge single-rail implementation
  - Over 2x density improvement over the Skybridge dual-rail implementation
  - At least 17% throughput/power benefit
- Throughput is worse because there are fewer pipeline stages
Figure: Benchmarking evaluation results

25 Cascaded Inverters in S3DC
- SB-CMOS follows the static CMOS circuit style
- Signal "In": routed between stages with a routing nanowire
- Signals "Int0", "Int1" and "Int2": routed between stages without routing nanowires
Figure: SB-CMOS circuit style

26 S3DC NAND Gate
- 3-input SB-3D-CMOS NAND: 3 nanowires for 3 parallel p-transistors
  - Multiple nanowires are shorted together by SB-ILC and bridges
Figures: 3-in NAND schematic; 3-in NAND physical layout; Layout legend

27 S3DC Full Adder
- SB-CMOS full adder implemented with 11 nanowires: 28 transistors in 0.06 μm^2, 28x denser than 16nm CMOS technology
Figures: 1-bit SB-CMOS full adder design; 1-bit full adder transistor-level schematic; 1-bit SB-CMOS full adder physical layout; Layout legend

28 S3DC 6T SRAM
- Cross-coupled inverters hold the value
- Pass transistors for write/read control
  - Independent read/write access
- Transistor strength customized with various voltage levels
Figures: SRAM schematic and physical layout; SRAM operations (write operation, read operation); Layout legend

29 Evaluation Results (S3DC vs. SB and 2D-CMOS)
- Compared with 16nm CMOS:
  - Much better power and area efficiency
  - Worse performance (501 ps vs. 201 ps latency)
- Compared with SB:
  - Better latency but lower throughput
  - Better power efficiency and less power consumption
  - Slightly smaller area (1.09 μm^2 vs. 1.27 μm^2)

Benchmarking Evaluation Results
Design          Latency (ps)  Throughput (Ops./sec.)  Power (μW)  Performance/Watt (Ops./J)  Area (μm^2)
SB-CMOS         501           2E+9                    10.1        1.98E+14                   1.09
16nm CMOS       201           4.97E+9                 172         2.89E+13                   50.1
SB (dual-rail)  524           5.09E+9                 41.3        1.23E+14                   1.27
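The columns of this table are related by simple arithmetic, which the Python sketch below spells out. It assumes performance/watt is throughput divided by power in watts, and that an unpipelined design completes one operation per latency period; the second assumption is mine, and the row names and tuple layout are hypothetical.

```python
# (latency in ps, throughput in ops/s, power in uW, reported ops/J) per design.
rows = {
    "SB-CMOS": (501, 2e9, 10.1, 1.98e14),
    "16nm CMOS": (201, 4.97e9, 172.0, 2.89e13),
    "SB (dual-rail)": (524, 5.09e9, 41.3, 1.23e14),
}

derived = {}
for name, (latency_ps, throughput, power_uw, reported) in rows.items():
    ops_per_joule = throughput / (power_uw * 1e-6)
    # Throughput an unpipelined design would reach: one result per latency period.
    unpipelined = 1.0 / (latency_ps * 1e-12)
    derived[name] = (ops_per_joule, unpipelined)
    print(f"{name:15s} ops/J={ops_per_joule:.3g} (reported {reported:.3g}), "
          f"1/latency={unpipelined:.3g} ops/s, reported throughput={throughput:.3g}")
```

For SB-CMOS and 16nm CMOS the reported throughput matches 1/latency, while the dual-rail Skybridge throughput is well above it, which is consistent with that design being pipelined.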

30 Modeling and Simulation of Thermal Profile in 3-D
- Fine-grained transistor-level modeling accounting for thermal conductivity at the nanoscale
- Thermal profiling of 3-D circuits with and without Skybridge heat extraction features for the worst-case static heat scenario
Figure: Thermal evaluation methodology [9]
[9] M. Rahman, et al., "Architecting 3-D Integrated Circuit Fabric with Intrinsic Thermal Management Features", NANOARCH, pp. 157-162, 2015

31 Thermal Modeling of V-GAA Junctionless Transistor
- Analytical calculation of thermal resistance for different FET regions
- Electrical equivalent representation for HSPICE simulations
Figures: Heat flow paths; Thermal resistance network for the device; Simulation results
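The electrical analogy behind this kind of model treats temperature difference like voltage, heat flow like current, and thermal resistance like electrical resistance. The Python sketch below shows the idea with a one-dimensional conduction formula, R = L / (k A), and series/parallel combination. Every number, region name and the network topology is a hypothetical placeholder, not a value or structure taken from the slides or from reference [9].

```python
def conduction_resistance(length_m, conductivity_w_per_mk, area_m2):
    """1-D conduction thermal resistance in K/W: R = L / (k * A)."""
    return length_m / (conductivity_w_per_mk * area_m2)


def series(*resistances):
    return sum(resistances)


def parallel(*resistances):
    return 1.0 / sum(1.0 / r for r in resistances)


def temperature_rise(power_w, r_th):
    """Thermal analogue of Ohm's law: delta-T = P * R_th."""
    return power_w * r_th


# Hypothetical regions of a vertical device (all values are placeholders).
area = (16e-9) ** 2
r_channel = conduction_resistance(16e-9, 10.0, area)
r_contact = conduction_resistance(16e-9, 100.0, area)
r_to_sink = 5e6      # assumed path from the contact to the heat sink, K/W
r_extraction = 1e6   # assumed extra heat-extraction path, K/W

baseline = series(r_channel, r_contact, r_to_sink)
with_extraction = series(r_channel, parallel(series(r_contact, r_to_sink), r_extraction))

power = 1e-6  # assumed dissipated power in watts
print("rise without extraction:", temperature_rise(power, baseline), "K")
print("rise with extraction:   ", temperature_rise(power, with_extraction), "K")
```

Adding a parallel path always lowers the equivalent resistance, so the modelled temperature rise drops when the extraction path is present.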

32 Thermal Simulations for 3-D Circuits
- Up to 85% average temperature reduction with heat extraction

33 Circuit Evaluation Methodology, and Design Rules
- Evaluation methodology accounting for material structures, device physics, circuit style, and 3-D parasitics
- Design rules derived from circuit requirements and manufacturing assumptions at 16nm
Figure: Evaluation methodology

Design Rules
Columns, in order: Width X (nm), Length Z (nm), Thickness Y (nm), Spacing (nm)
Bridge (X,Y,Z): 16n-58n, 16n, 16n-58n, 16n-37n
Transistor Channel (X,Y,X): 16n, 58n
Transistor Spacing (Z): -, -, -, 16n
Gate Electrode (Z): 29n, 16n, 11.5n, -
Contact (X,Y,Z): 26n, 16n, 39
Heat Junction (X,Y,Z): 22n, 16n, 6n, -
Coaxial (Si-M1) (X,Y): 37n, -, 4n (Si-M1)
Coaxial (M1-M2) (X,Y): 58n, -, 4n (M1-M2)

34 3D WISP-4 Microprocessor
- Fully functional 4-bit microprocessor (WISP-4) design
  - RISC architecture, 5 pipeline stages
  - Implemented in Skybridge, NP-dynamic-Skybridge and S3DC
Figure: WISP-4 architecture

35 WISP-4: Instruction Fetch and Decode
- Instruction fetch stage
  - 4-bit CLA for the program counter
  - 4:16 decoder to decode the ROM address
  - 16x9 ROM to store instructions
- Instruction decode stage
  - 3:8 decoder to decode the opcode
  - 2-bit buffers for buffering address and data
Figures: Instruction fetch; Instruction decode
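A functional sketch of the fetch stage may help tie the three blocks together. The Python below models a one-hot 4:16 decoder selecting one of sixteen 9-bit ROM words, with a 4-bit program counter that wraps at 16. The ROM contents are arbitrary placeholders, the function names are hypothetical, and the counter increment is modelled as plain modulo-16 addition rather than as the CLA hardware.

```python
def decoder_4to16(addr):
    """One-hot 4:16 decoder: exactly one of the 16 word lines is high."""
    if not 0 <= addr < 16:
        raise ValueError("address must fit in 4 bits")
    return [1 if i == addr else 0 for i in range(16)]


# Hypothetical ROM contents: sixteen 9-bit words (placeholder values only).
ROM = [(i * 37) & 0x1FF for i in range(16)]


def fetch(pc):
    """Return (instruction, next_pc) for a 4-bit program counter."""
    word_lines = decoder_4to16(pc)
    instruction = 0
    for line, word in zip(word_lines, ROM):
        if line:
            instruction |= word  # only the selected word drives the output
    next_pc = (pc + 1) & 0xF     # 4-bit counter wraps from 15 back to 0
    return instruction, next_pc


pc = 14
for _ in range(3):
    instruction, pc = fetch(pc)
    print(f"instruction={instruction:09b} next_pc={pc}")
```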

36 WISP-4: Register File
- Register access stage
  - Four 4-bit registers for operands
  - Two 4:1 multiplexers and one 2:1 multiplexer for operand selection
Figure: Register file

37 WISP-4: Arithmetic Logic Unit
- Execution stage
  - 4-bit CLA and multiplier for addition and multiplication
  - A buffer for data buffering
  - Two 2:1 multiplexers for result selection
Figure: Arithmetic logic unit
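For readers unfamiliar with carry-lookahead addition, the following Python sketch gives a bit-level functional model of a generic 4-bit CLA: generate and propagate signals are computed per bit, and each carry is expanded directly from them instead of rippling. It is the textbook formulation, not the WISP-4 netlist, and the function name `cla4` is hypothetical.

```python
def cla4(a, b, cin=0):
    """Generic 4-bit carry-lookahead adder; returns (sum, carry_out)."""
    a_bits = [(a >> i) & 1 for i in range(4)]
    b_bits = [(b >> i) & 1 for i in range(4)]
    g = [x & y for x, y in zip(a_bits, b_bits)]  # generate: this bit creates a carry
    p = [x ^ y for x, y in zip(a_bits, b_bits)]  # propagate: this bit passes a carry on

    carries = [cin]
    for i in range(4):
        # c[i+1] = g[i] | p[i]g[i-1] | p[i]p[i-1]g[i-2] | ... | p[i]...p[0]cin
        carry = g[i]
        chain = p[i]
        for j in range(i - 1, -1, -1):
            carry |= chain & g[j]
            chain &= p[j]
        carry |= chain & cin
        carries.append(carry)

    total = sum((p[i] ^ carries[i]) << i for i in range(4))
    return total, carries[4]


print(cla4(9, 8))     # 9 + 8 = 17, so the 4-bit sum is 1 with carry out 1
print(cla4(3, 4, 1))  # 3 + 4 + 1 = 8, no carry out
```

Each carry depends only on the inputs and `cin`, so in hardware all four can be computed in parallel, at the cost of wider gates. That is where high fan-in support becomes relevant.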

38 WISP-4 Benchmarking Results
- Compared with 2D CMOS:
  - About 30x to 60x density benefits
  - Up to 8x power efficiency benefits
  - Up to 2x benefits in throughput

WISP-4 Benchmarking Results (ratios relative to 2D CMOS in parentheses)
Design     Throughput (ops/sec)  Power (μW)    Power Efficiency (ops/Joule)  Density (mm^-2)
2D CMOS    4.31E+9               886           4.86E+12                      3.46E+3
Skybridge  5.1E+9 (1.19x)        301 (0.34x)   1.69E+13 (3.46x)              1.05E+5 (30x)
NP-D-SB    9.1E+9 (2.11x)        230 (0.26x)   3.96E+13 (8.15x)              1.96E+5 (56.6x)
S3DC       4.55E+9 (1.06x)       186 (0.21x)   2.45E+13 (5.04x)              9.43E+4 (27.3x)

