Dec 1, 2003 Slide 1 Copyright, 1999 - 2003 © Zenasis Technologies, Inc. Flex-Cell Optimization A Paradigm Shift in High-Performance Cell-Based Design A.

Presentation transcript:

Dec 1, 2003 Slide 1 Copyright, © Zenasis Technologies, Inc. Flex-Cell Optimization: A Paradigm Shift in High-Performance Cell-Based Design

Slide 2: The Power-User Dilemma
[Chart: "Cost / TTM" versus "Speed, Power, Area". Labels as extracted, with some numbers lost: "Custom Team=400 3 GHz, 3 Years"; "Flex-Cell Opt Team= MHz 6 Months"; "FPGA"; "ASIC/COT Team= MHz 9 Months". Callouts: "Takes too long!" and "Results aren’t good enough!"]

Slide 3: The Timing Dilemma
Design team clock target: 350 MHz
Post-logic-synthesis / post-placement STA reports only 300 MHz. Problem!
Options
  – Design change
      Rewrite RTL: tapeout delay
  – Better technology
      Smaller geometry: tapeout delay and NRE cost
      Low-k technology: yield hit
  – Better tools
      Flex-Cell Optimization
      Custom-design benefits in std cell flow
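To make the size of the gap on this slide concrete, the short sketch below converts the two quoted frequencies (350 MHz target, 300 MHz achieved) into clock periods and a required speed-up. The function names are illustrative only and are not taken from the presentation.

```python
# Illustrative arithmetic only; the two frequencies are the ones quoted on the slide.

def period_ns(freq_mhz: float) -> float:
    """Clock period in nanoseconds for a frequency given in MHz."""
    return 1000.0 / freq_mhz


def timing_gap(target_mhz: float, achieved_mhz: float) -> dict:
    """Summarize how far the achieved frequency is from the target."""
    target = period_ns(target_mhz)
    achieved = period_ns(achieved_mhz)
    return {
        "target_period_ns": target,
        "achieved_period_ns": achieved,
        # Time that must be removed from the worst path to meet the target.
        "shortfall_ns": achieved - target,
        # Frequency increase needed, relative to what STA currently reports.
        "required_speedup_pct": 100.0 * (target_mhz / achieved_mhz - 1.0),
    }


gap = timing_gap(350.0, 300.0)
for name, value in gap.items():
    print(f"{name}: {value:.3f}")
```

Under these numbers the worst path has to lose roughly half a nanosecond, which corresponds to a frequency increase of a little under 17%.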

Slide 4: Root of the Problem
Various past studies, including a special session at DAC 2000
Std-cell-based design has “an order of magnitude” lower performance than custom, at the same process node
  – Architecture
  – Fixed cell library
  – Layout
Fixed cell library can account for as much as 25% of the performance shortfall

Slide 5: Rich vs Smart
Simply creating a “richer” cell library does not solve the problem
  – Too many cells hinder automated optimization
  – Missing design-specific context information
  – Well-known matching problems for larger cells
Custom-crafted cells, for a specific design, can inject large timing gains late in the design cycle
Compute-intensive process
  – Transistor netlist optimization
  – Cell layout creation
  – View generation

Slide 6: Flex-Cell Optimization: Concept
[Diagram: boxes labeled Logical Level, Physical Level and Transistor Level, with Flex-Cell Opt spanning them.]
Optimization at Gate, Transistor & Physical Levels

Slide 7: Prior Work
Manual custom-crafting of cells is well established
  – Tactical cells: every high-performance design project uses some
Automated transistor-level netlist creation/optimization
  – Fishburn, Dunlop (1985): TILOS, transistor sizing
  – Gavrilov et al. (1997): Library-less synthesis
  – Kaneko, Tian (1998): Concurrent cell generation and mapping of digital logic
  – Liu, Abraham (1999): Transistor-level synthesis of combinational logic

Slide 8: Flex-Cell Optimization Targets
Eliminate deficiency due to fixed cell library
  – Boost performance by 15% - 25%
Close aggressive timing in days
Retain proven existing cell-based design flow
Use high-yield process, still get performance
Minimal increase in die size or power
Get custom-design performance from a std-cell-based flow

Slide 9: Key Steps
[Flow diagram. Steps shown: post-synthesis netlist, STA, critical paths, cluster formation, flex-cell (custom-crafted) creation, gate-level optimization.]
[Example cluster with inputs a, b, c, d: 4 cells, 22 transistors, 9 wires before; 1 cell, 13 transistors, 6 wires after.]
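The first steps named on this slide, STA followed by critical-path identification, can be sketched as a longest-path computation over the gate-level netlist. The sketch below is a generic textbook formulation and is not the tool's own algorithm; the four-gate netlist, its gate names and its delays are invented for illustration.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical 4-gate cluster: node -> (gate delay in ns, fan-in nodes).
# Names and delay values are made up for this example.
GATES = {
    "g1": (0.08, ["a", "b"]),
    "g2": (0.07, ["c", "d"]),
    "g3": (0.10, ["g1", "c"]),
    "y": (0.09, ["g3", "g2"]),
}
PRIMARY_INPUTS = ["a", "b", "c", "d"]


def critical_path(gates, inputs):
    """Return (path, arrival time) of the slowest input-to-output path."""
    arrival = {pin: 0.0 for pin in inputs}  # inputs arrive at t = 0
    worst_fanin = {}                        # back-pointers for path recovery
    order = TopologicalSorter(
        {gate: fanin for gate, (_, fanin) in gates.items()}
    ).static_order()
    for node in order:
        if node in arrival:                 # primary input, nothing to compute
            continue
        delay, fanin = gates[node]
        latest = max(fanin, key=lambda n: arrival[n])
        arrival[node] = arrival[latest] + delay
        worst_fanin[node] = latest
    end = max(gates, key=lambda g: arrival[g])
    path = [end]
    while path[-1] in worst_fanin:          # walk back to a primary input
        path.append(worst_fanin[path[-1]])
    return list(reversed(path)), arrival[end]


path, delay = critical_path(GATES, PRIMARY_INPUTS)
print(" -> ".join(path), f"({delay:.2f} ns)")
```

The gates on the returned path are the natural candidates for cluster formation, since any delay removed from them shortens the worst arrival time directly.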

Slide 10: Flex-Cell Optimization with Physicals
Physically-aware STA
  – Placement aware
      Congestion
      Blockage
  – Multiple levels of accuracy for route info
      Steiner estimates
      Global route
      Detailed route**
Physically-driven optimization
  – Physically-aware clustering and mapping
  – Physically-aware gate-level optimizations
  – Low disturbance to existing placement
  – Incremental legalization of placement
  – Incremental re-computation of routes/estimates
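As a rough illustration of the cheapest accuracy level and of incremental re-computation, the sketch below uses half-perimeter wirelength (HPWL) as a stand-in for a Steiner estimate. HPWL is a lower bound on rectilinear Steiner tree length and matches it for two- and three-pin nets; it is a swapped-in simplification, not the estimator described in the talk. Cell names, coordinates and nets are hypothetical.

```python
# Hypothetical placement: cell -> (x, y), and nets -> connected cells.
positions = {"u1": (0, 0), "u2": (4, 3), "u3": (1, 5), "u4": (6, 1)}
nets = {"n1": ["u1", "u2", "u3"], "n2": ["u2", "u4"]}


def hpwl(pins):
    """Half-perimeter of the bounding box around a net's pins."""
    xs = [x for x, _ in pins]
    ys = [y for _, y in pins]
    return (max(xs) - min(xs)) + (max(ys) - min(ys))


def net_length(net):
    return hpwl([positions[cell] for cell in nets[net]])


# Full estimate, computed once up front.
lengths = {net: net_length(net) for net in nets}
before = dict(lengths)


def move_cell(cell, new_xy):
    """Move one cell and refresh only the nets that touch it."""
    positions[cell] = new_xy
    touched = [net for net, cells in nets.items() if cell in cells]
    for net in touched:
        lengths[net] = net_length(net)
    return touched


touched = move_cell("u4", (5, 3))
print("before:", before)
print("after: ", lengths, "recomputed:", touched)
```

Only the nets attached to the moved cell are re-estimated, which is the property that keeps placement disturbance and update cost low when a cluster is swapped for a single cell.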

Slide 11: Sample Flex-Cell
Gate-level cluster (inputs a, b, c, d; output y): 4 cells, 9 nets, 22 transistors, path depth = 3 levels
  Critical path a -> y: Rise = 0.26 ns; Fall = 0.31 ns
Custom-crafted flex-cell after Tx-level optimization: 1 cell, 7 nets, 13 transistors, path depth = 2 levels
  Critical path a -> y: Rise = 0.12 ns; Fall = 0.10 ns

                    Before     After
Rise (critical)     0.26 ns    0.12 ns
Fall (critical)     0.31 ns    0.10 ns
# Cells             4          1
# Transistors       22         13
Path depth          3          2
# Nets              9          7
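The before/after figures on this slide can be turned into relative reductions with a few lines of arithmetic. The numbers below are copied from the slide; the dictionary keys and the helper name are illustrative.

```python
# Values as given on the slide for the gate-level cluster and the flex-cell.
BEFORE = {"rise_ns": 0.26, "fall_ns": 0.31, "cells": 4,
          "transistors": 22, "path_depth": 3, "nets": 9}
AFTER = {"rise_ns": 0.12, "fall_ns": 0.10, "cells": 1,
         "transistors": 13, "path_depth": 2, "nets": 7}


def reduction_pct(before: float, after: float) -> float:
    """Percentage by which a metric dropped relative to its starting value."""
    return 100.0 * (before - after) / before


for metric in BEFORE:
    pct = reduction_pct(BEFORE[metric], AFTER[metric])
    print(f"{metric:12s} {BEFORE[metric]:>5} -> {AFTER[metric]:>5}  ({pct:.1f}% lower)")
```

On these figures the critical rise and fall delays both drop by more than half while the transistor count also falls, which is the slide's point that timing and area need not trade off against each other.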

Slide 12: Transistor-Level Optimization

Slide 13: Key Issues
Judicious mix of gate-level and transistor-level optimization
Judicious mix of discrete and continuous transistor sizing
Effective use of transistor-level restructuring
Fast and accurate transistor-level simulation
  – 50x to 100x faster than Spice
Accurate estimation of parasitics given transistor-level netlist
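One way to picture the mix of continuous and discrete sizing is a greedy, sensitivity-driven loop in the spirit of TILOS, followed by snapping to a grid of available sizes. The sketch below is a deliberately simplified gate-chain model with a unit-RC delay, not the transistor-level method used by the tool; the delay model, step size, load value and size grid are all assumptions.

```python
def chain_delay(widths, c_load, parasitic=1.0):
    """Chain delay in unit-RC terms: each stage costs parasitic + load/width."""
    total = 0.0
    for i, w in enumerate(widths):
        load = widths[i + 1] if i + 1 < len(widths) else c_load
        total += parasitic + load / w
    return total


def greedy_size(n_stages, c_load, step=1.1, max_iters=1000):
    """Continuous sizing: repeatedly upsize the stage with the best payoff."""
    widths = [1.0] * n_stages  # stage 0 stays at 1.0 (fixed input load)
    for _ in range(max_iters):
        base = chain_delay(widths, c_load)
        best_gain, best_i = 1e-12, None
        for i in range(1, n_stages):
            trial = widths.copy()
            trial[i] *= step
            # Delay saved per unit of added width (a sensitivity measure).
            gain = (base - chain_delay(trial, c_load)) / (trial[i] - widths[i])
            if gain > best_gain:
                best_gain, best_i = gain, i
        if best_i is None:     # no single upsizing step helps any more
            break
        widths[best_i] *= step
    return widths


def snap(widths, grid):
    """Discrete sizing: move each width to the nearest available size."""
    return [min(grid, key=lambda g: abs(g - w)) for w in widths]


C_LOAD = 64.0
GRID = [1, 2, 3, 4, 6, 8, 12, 16, 24, 32]  # hypothetical drive strengths

initial = chain_delay([1.0, 1.0, 1.0], C_LOAD)
continuous = greedy_size(3, C_LOAD)
discrete = snap(continuous, GRID)
d_cont = chain_delay(continuous, C_LOAD)
d_disc = chain_delay(discrete, C_LOAD)
print(f"initial delay     : {initial:.2f}")
print(f"continuous widths : {[round(w, 2) for w in continuous]}  delay {d_cont:.2f}")
print(f"discrete widths   : {discrete}  delay {d_disc:.2f}")
```

In this toy model the continuous pass finds widths close to the geometric progression that logical-effort reasoning predicts, and the snapping step shows how much of that gain survives when only a fixed set of sizes is available.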

Slide 14: Impact On a Sample Critical Path
[Chart: original critical path versus optimized path, with two segments labeled Flex-Cell and a "% Improvement" label. The only surviving value is 0.20.]

Slide 15: Results (ZenTime™)
38K+ instance design
16% performance boost
  – 297 MHz --> 344 MHz
Implemented in a 0.13 µm process
Added 132 flex-cells, 5,927 instances
Without increasing power or area

Slide 16: Impact on Global Timing
Initial frequency: 297 MHz
Final frequency: 344 MHz

Slide 17: Timing Optimization Results
[Chart: results with physicals (def, sdf, …) compared to results with wire loads. No values survive.]

Slide 18: I/O & Design Flow
[Flow diagram. Stages shown: Front-end Design (Constraints, Design, Library), Flex-Cell Opt, Flex-Cell Factory, Physical Synthesis, Back-end Design, Detailed Route, Extraction & Verification, GDSII.]
Blocks inside Flex-Cell Opt: Timing Interface, Timing, Clustering, Physical Gate-level Opt., Discrete Sizing, Cont. Sizing
Files shown:
  – library.lib, library.lef, library.cdl
  – netlist.v, netlist.def, netlist.sdf, netlist.set_load
  – constr.sdc, tech.bsim3
  – opt_netlist.v, opt_netlist.def
  – flex-cell.est.lib, flex-cell.est.lef, flex-cell.cdl

Slide 19: Automated Flex-Cell Generation Tool Suite and Flow
[Flow diagram. Inputs shown: sized spice netlists, cell architecture. View labels: Layout, Functional, Spice, Timing, Power, Noise/glitch, Reports. File types shown: gds, lef, ant. lef, eqn.v, mos.v, lumpedC.sp, distrRC.sp, .lib, .db, .tlf. One item is marked "??".]

Slide 20: Summary
New dimension in optimization of cell-based designs
Essential to find the “right balance” between gate-level and transistor-level optimization
Better design quality, higher runtime
Timing, Area, Power no longer a simple trade-off
  – Possible to improve more than one, simultaneously
Many challenges
  – Lots of research opportunities!!

Slide 21: The History of Methodology Shifts
[Diagram. Labels shown: Netlist schematic, Netlist optimization, Logic synthesis, Physical synthesis, Physical optimization, Flex-cell synthesis, Flex-cell optimization.]