

1 A Deep Sub-Micron VLSI Design Flow using Layout Fabrics Sunil P. Khatri University of Colorado, Boulder Amit Mehrotra University of Illinois, Urbana-Champaign Robert K Brayton Alberto L Sangiovanni-Vincentelli University of California, Berkeley

2 Our VLSI Design Flow  Logic netlist → Logic Optimization → Optimized logic netlist → Technology Mapping → Placement → Routing → Layout

3 Motivation  Modern IC processes  Feature size well below 1 micron  Certain electrical effects increasingly important  Cross-talk  Electromigration  Self Heat  Statistical variations  Logic abstraction eroded  Existing design paradigms need to be rethought

4 Research Focus  The cross-talk issue  Tackled in an ad-hoc manner  Increases turn-around time  Verified cross-talk trends  Accurate 3-D capacitance extraction  Delay variation 2.47:1 (200 µm wires, 10X drivers, 0.1 µm technology)  [Figure: neighboring wires labeled a and v with capacitances C1 and C2]

5 Outline  Previous Approaches  New idea: The Fabric Approach  Fabric1 (in DAC-1999)  Standard-cell based design  Fabric3 (in ICCAD-2000)  Network of PLA based design  Further Tasks  Summary

6 Previous Approaches  [ALPHA 97] :  Metal layers 3 and 6 dedicated to power  Not viable in future processes  [Rubio 94]:  Functional analysis based on layout  Post-layout methods don’t scale  [Kirkpatrick 94, 96] :  Concept of digital sensitivity  Requires don’t-care and image computations

7 Solution: Layout Fabrics  Repeating dense wiring fabric (DWF) pattern at minimum pitch  We handle cross-talk by design  A new layout and design paradigm  [Figure: DWF wire pattern with tracks labeled V, S, and G]

8 Research Contribution  Verify cross-talk trends  Fabric1 [KMBSO99] (in DAC)  Incorporated into traditional design flow  Fabric3 [KBS00] (in ICCAD-00)  Network of PLAs  Detailed electrical characterization  Synthesis, wire removal algorithms  Both utilize DWF pattern  1.02:1 cross-talk delay variation

9 Layout Fabrics  Advantages  Pre-characterized parasitics  Uniform, low cross-coupling capacitance  40X lower, 2% delay variation  Uniform, low signal inductance  Automatic power and ground routing  Uniform, low power and ground resistance  Can effectively implement regular structures  Disadvantages  5% increase in total capacitance  Area penalty  Power increase

10 Capacitance in DWF  Experimental setup  “Strawman” process model, copper wires, low-K dielectric  Capacitances from 3-D field solver (space3d)  Simulated three wires in spice  0.1 micron process, Metal2 wires  Length 200 microns, 10x minimum drivers  Non-DWF  Delay variation 2.47:1  Signal integrity problems for fast slew rates  With DWF  40X reduction in cross-coupling capacitance  Delay variation 1.02:1, no signal integrity problem
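The link between cross-coupling capacitance and delay variation on this slide can be illustrated with a first-order Miller-effect estimate. This is a minimal sketch and not the SPICE setup from the talk: the function `delay_spread`, the two-neighbor assumption, and the capacitance values are all hypothetical, and the model ignores driver resistance and slew.

```python
def delay_spread(c_ground, c_couple, neighbors=2):
    """Ratio of worst-case to best-case effective load on a victim wire.

    Assumed first-order model: wire delay scales with effective capacitance.
    A neighbor switching with the victim contributes no coupling load
    (Miller factor 0); a neighbor switching against it contributes twice
    its coupling capacitance (Miller factor 2).
    """
    best = c_ground
    worst = c_ground + 2 * neighbors * c_couple
    return worst / best


# Hypothetical capacitances in arbitrary units, not values from the talk.
before = delay_spread(c_ground=1.0, c_couple=0.4)
# Same wire with 40X less cross-coupling capacitance.
after = delay_spread(c_ground=1.0, c_couple=0.4 / 40)
print(f"spread before: {before:.2f}:1, after: {after:.2f}:1")
```

Under this model the spread collapses toward 1:1 as coupling shrinks, which is the qualitative trend the slide reports; the exact 2.47:1 and 1.02:1 figures come from the authors' extraction and simulation, not from this estimate.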

11 Inductance in the DWF  Low and uniform in DWF  Current return path is at minimum spacing  In regular layout style, varies greatly  Problems reported for clock signals  Compared inductance of Metal8 trace  Verified using ASITIC  [Table: inductance (nH / micron)]

12 VDD/GND Resistance in DWF  Check resistance at various points in DWF  Compare with standard cell case  Varies greatly  Measured at end of row  L/W = 1000/8  [Table: VDD/GND resistance (ohms)]

13 Buffer Insertion in DWF  Easily performed  VDD and GND available all over routing area

14 Fabric1 - Introduction  DWF pattern utilized chip-wide  Library cells implemented in this pattern  Synthesis, placement and routing use standard cell methodology  [Figure: Std Cell and Fabric Cell layouts]

15 Fabric1 - Results

16 Fabric1 - Results

17 Fabric3 Programmable Logic Arrays  Network of Programmable Logic Arrays  Combine many logic nodes into a PLA  Routing area utilizes DWF pattern  PLA implements a multi-output function  example: f = a b + c ; g = a b + c  [Figure: PLA with AND plane and OR plane, inputs a, b, c and outputs f, g]
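A small software model may help show how an AND plane and an OR plane together implement a multi-output function. This is only a sketch: `eval_pla` and its data layout are hypothetical, the output f = a b + c is taken as transcribed on the slide, and the second output h = a b is an invented example added to show a product term shared between outputs.

```python
def eval_pla(and_plane, or_plane, inputs):
    """Evaluate a two-level PLA.

    and_plane: list of product terms, each a dict {input_name: required_value}
    or_plane:  dict {output_name: list of product-term indices that feed it}
    inputs:    dict {input_name: 0 or 1}
    """
    # AND plane: a product term is true when every literal in it matches.
    terms = [all(inputs[name] == value for name, value in cube.items())
             for cube in and_plane]
    # OR plane: an output is true when any connected product term is true.
    return {out: any(terms[i] for i in rows) for out, rows in or_plane.items()}


# Product term 0 is "a b", product term 1 is "c".
and_plane = [{"a": 1, "b": 1}, {"c": 1}]
# f = a b + c (from the slide); h = a b (hypothetical, shares term 0 with f).
or_plane = {"f": [0, 1], "h": [0]}

print(eval_pla(and_plane, or_plane, {"a": 1, "b": 1, "c": 0}))
```

Sharing term 0 between f and h is what makes a multi-output PLA denser than implementing each output separately.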

18 Fabric3 PLA Core Layout  [Figure: PLA core layout with signals a, b, f, g, and clk]

19 PLAs vs. Standard Cells  PLAs are dense and fast  [Comparison: PLA, Standard Cell]

20 PLA Characteristics  Why is the PLA area and delay so low?  Wiring localized within PLA  PLA core transistor sizes are minimum  No p-transistor to n-transistor diffusion spacing  “Gigahertz” chip utilized pre-charged PLAs  High performance  Quick implementation  Didn’t use a network of PLAs

21 Network of PLAs  PLAs are pre-charged  Inputs to all PLAs must settle before evaluation begins  [Figure: network of PLAs with signals a through g]

22 Network of PLAs  For correct operation:  PLA dependency graph must be acyclic  Evaluation of PLA i after completion of slowest PLA j in its “fanin”  Self-timed design style  Each PLA generates a completion signal  Overhead of one wordline, one output  Delay formula to find slowest PLA j
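The two correctness conditions on this slide (an acyclic dependency graph, and each PLA evaluating only after its slowest fanin PLA completes) can be sketched as a topological pass. This is an illustrative model, not the authors' delay formula: `completion_times`, the PLA names, and the delay values are hypothetical. It uses `graphlib.TopologicalSorter` from the Python standard library (3.9+), which raises `CycleError` when the graph has a cycle.

```python
from graphlib import TopologicalSorter, CycleError


def completion_times(fanin, delay):
    """Return the time at which each PLA asserts its completion signal.

    fanin: dict {pla: list of PLAs whose outputs it consumes}
    delay: dict {pla: evaluation delay of that PLA}
    Raises CycleError if the PLA dependency graph is cyclic.
    """
    done = {}
    # static_order yields every PLA after all of its fanin PLAs.
    for pla in TopologicalSorter(fanin).static_order():
        # Evaluation starts when the slowest fanin PLA has completed.
        start = max((done[p] for p in fanin.get(pla, ())), default=0.0)
        done[pla] = start + delay[pla]
    return done


# Hypothetical four-PLA network with made-up delays.
fanin = {"P1": [], "P2": [], "P3": ["P1", "P2"], "P4": ["P3"]}
delay = {"P1": 1.0, "P2": 2.5, "P3": 1.0, "P4": 0.5}
done = completion_times(fanin, delay)
print(done)
```

Here P3 waits for P2, the slower of its two fanins, which mirrors the role of the completion signal in the self-timed style described above.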

23 Decomposition  Algorithm collapses wiring into PLAs  Input: multi-level combinational network, W bound, H bound  Output: Correct network of PLAs  Our algorithm greedily grows a PLA until either bound is violated  Attempt to reduce wires by selecting fanouts for inclusion in the PLA being grown
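The greedy growth under W and H bounds can be sketched as follows. This is a simplified stand-in and not the algorithm from the talk: it grows each PLA along a topological order rather than selecting fanouts, and it uses an assumed cost model (width = external inputs plus one output per node, height = sum of per-node product terms with no sharing and no espresso minimization). The names `greedy_decompose`, `fits`, and the example network are hypothetical.

```python
from graphlib import TopologicalSorter


def fits(group, fanin, terms, w_bound, h_bound):
    """Check a candidate PLA against both bounds under the assumed cost model."""
    members = set(group)
    external = {s for n in group for s in fanin[n] if s not in members}
    width = len(external) + len(group)      # assumed: inputs + one output per node
    height = sum(terms[n] for n in group)   # assumed: no product-term sharing
    return width <= w_bound and height <= h_bound


def greedy_decompose(fanin, terms, w_bound, h_bound):
    """Group logic nodes into PLAs, closing a PLA when either bound would break.

    fanin: dict {node: list of fanin signals (logic nodes or primary inputs)}
    terms: dict {node: product-term count}, one entry per logic node
    """
    order = [n for n in TopologicalSorter(fanin).static_order() if n in terms]
    plas, current = [], []
    for node in order:
        trial = current + [node]
        if current and not fits(trial, fanin, terms, w_bound, h_bound):
            plas.append(current)   # bound violated: close the PLA being grown
            trial = [node]         # and start a new one with this node
        current = trial
    if current:
        plas.append(current)
    return plas


# Hypothetical network: a, b, c, d are primary inputs; n1..n4 are logic nodes.
fanin = {"n1": ["a", "b"], "n2": ["n1", "c"], "n3": ["c", "d"], "n4": ["n2", "n3"]}
terms = {"n1": 2, "n2": 2, "n3": 1, "n4": 2}
plas = greedy_decompose(fanin, terms, w_bound=5, h_bound=4)
print(plas)
```

Because each PLA is a contiguous run of a topological order, every wire goes from an earlier PLA to the same or a later one, so the resulting PLA dependency graph stays acyclic.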

24 Choice of W, H  Choice of W  Driven by synthesis constraints  Large W means larger runtimes  espresso and folding done in inner loop  Use W between 25 and 50  Choice of H  Driven by power considerations  Large H also affects synthesis runtimes  Used H between 15 and 40

25 Fabric3 - Decomposition  [Figure: step-by-step decomposition of a network with signals a through g into PLAs 1 and 2]

26 Place/Route Flow  PLA generation using Perl script  Layout generated on the fly  2 Layer experiments:  Placement using vpr  FPGA placement tool  All PLAs have approximately same size  Routing using wolfe  interface to TimberWolfSC and yacr  3-6 Layer experiments:  Placement using CADENCE qplace  Routing using CADENCE router

27 Fabric3 - Area Results

28 Fabric3 - Timing Results

29 Fabric3 - Results  Timing results essentially unchanged  For C3540, delay variation due to cross-talk is 3.45:1 (Stdcell) versus 1.07:1 (Fabric3)

30 Fabric3 layout (2 Layer)

31 Future Tasks  Better algorithms:  Better ways of decomposing original netlist  Refining the fabric:  Alternative denser fabrics  Encoding PLA inputs [Schmookler80]  Connecting gates to PLA outputs  Alternative implementation of logic blocks:  Different PLA styles  Alternative circuits

32 Summary  Layout fabrics to eliminate cross-talk in DSM VLSI design  New layout and design paradigm  Fix cross-talk by design  Highly regular and predictable  Network of PLA based design flow  PLA decomposition algorithms  Minimal area penalty  15% timing improvement

33 Thank you!!