Interconnect Optimizations

Presentation transcript:

Interconnect Optimizations

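A scaling primer
Ideal process scaling:
– Device geometries shrink by $s$ ($= 0.7\times$)
   Device delay shrinks by $s$
– Wire geometries shrink by $s$
   R/µm: $\rho/(ws \cdot hs) = r/s^2$
   Cc/µm: $(hs) \cdot \varepsilon/(Ss) = C_c$
   C/µm: similar
   R/µm doubles; C/µm and Cc/µm unchanged
[Figure: device (S, G, D) and wire cross-section with height h, width w, length l, and spacing S, shown before and after scaling to hs, ws, ls, Ss]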
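As a quick check of the "R doubles" claim, here is the arithmetic worked out for $s = 0.7$. It assumes $\rho$ is the wire resistivity, $\varepsilon$ the dielectric permittivity, and $r = \rho/(wh)$ the unscaled resistance per unit length; these symbol names are a reading of the slide, not something it states explicitly.

```latex
\[
\frac{R}{\mu\mathrm{m}}:\quad
\frac{\rho}{(ws)(hs)} = \frac{r}{s^{2}} = \frac{r}{0.49} \approx 2.04\,r,
\qquad
\frac{C_c}{\mu\mathrm{m}}:\quad
\frac{\varepsilon\,(hs)}{Ss} = \frac{\varepsilon h}{S} = C_c .
\]
```

The factor of $s$ cancels in the coupling capacitance because both the facing height and the spacing shrink together.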

Interconnect role
Short (local) interconnect
– Used to connect nearby cells
– Minimize wire C, i.e., use short min-width wires
Medium to long-distance (global) interconnect
– Size wires to trade off area vs. delay
– Increasing width: capacitance increases, resistance decreases
   Need to find an acceptable tradeoff (the wire sizing problem)
"Fat" wires
– Thicker cross-sections in higher metal layers
– Useful for reducing delays for global wires
– Inductance issues, sharing of limited resource

Cross-Section of A Chip

Block scaling
Block area often stays the same
– # cells, # nets double
– Wiring histogram shape invariant
Global interconnect lengths don't shrink
Local interconnect lengths shrink by $s$

Interconnect delay scaling
Delay of a wire of length $l$: $\tau_{int} = (rl)(cl) = rcl^2$ (first order)
Local interconnects: $\tau_{int}: (r/s^2)(c)(ls)^2 = rcl^2$
– Local interconnect delay unchanged (compare to faster devices)
Global interconnects: $\tau_{int}: (r/s^2)(c)(l)^2 = (rcl^2)/s^2$
– Global interconnect delay doubles: unsustainable!
Interconnect delay increasingly dominant
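The two scaling cases can be checked numerically. The sketch below uses normalized units (r = c = l = 1, an assumption made only for illustration) and the function names are hypothetical.

```python
def wire_delay(r, c, length):
    """First-order wire delay: (r*l)*(c*l) = r*c*l^2."""
    return r * c * length ** 2


def scaled_delays(r, c, length, s=0.7):
    """Return (local, global) wire delay after one ideal scaling step.

    Resistance per unit length becomes r/s^2 and capacitance per unit
    length is unchanged. Local wires shrink to length*s; global wires
    keep their length.
    """
    r_scaled = r / s ** 2
    local = wire_delay(r_scaled, c, length * s)
    global_ = wire_delay(r_scaled, c, length)
    return local, global_


base = wire_delay(1.0, 1.0, 1.0)  # normalized units (assumption)
local, global_ = scaled_delays(1.0, 1.0, 1.0)
print("local ratio :", local / base)
print("global ratio:", global_ / base)
```

The local ratio stays at 1 while the global ratio is 1/s², roughly 2 for s = 0.7.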

Buffer Insertion For Delay Reduction

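Analysis of Simple RC Circuit
[Figure: source $v_T(t)$ (input waveform) driving resistor R and capacitor C in series, with current $i(t)$ and capacitor voltage $v(t)$ as the state variable]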

Analysis of Simple RC Circuit
Step-input response:
– Match the initial state
– Output response for step input $v_0 u(t)$: $v(t) = v_0 (1 - e^{-t/RC}) u(t)$
[Figure: input step $v_0 u(t)$ and output waveform $v_0 (1 - e^{-t/RC}) u(t)$]
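The slide's equations did not survive in the transcript, so the following is a standard derivation of the stated step response rather than the slide's own working. It assumes a series RC circuit with the capacitor initially discharged, $v(0) = 0$.

```latex
\[
i(t) = C\,\frac{dv}{dt}, \qquad v_T(t) = R\,i(t) + v(t)
\;\;\Longrightarrow\;\;
RC\,\frac{dv}{dt} + v(t) = v_T(t).
\]
\[
v_T(t) = v_0\,u(t):\quad v(t) = v_0 + K e^{-t/RC} \quad (t \ge 0), \qquad
v(0) = 0 \;\Rightarrow\; K = -v_0,
\]
\[
v(t) = v_0\left(1 - e^{-t/RC}\right)u(t).
\]
```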

Delays of Simple RC Circuit
$v(t) = v_0 (1 - e^{-t/RC})$: waveform under step input $v_0 u(t)$
$v(t) = 0.5 v_0 \Rightarrow t = 0.69RC$
– i.e., delay = 0.69RC (50% delay)
$v(t) = 0.1 v_0 \Rightarrow t = 0.1RC$
$v(t) = 0.9 v_0 \Rightarrow t = 2.3RC$
– i.e., rise time = 2.2RC (if defined as time from 10% to 90% of Vdd)
Commonly used metric: $T_D = RC$ (= Elmore delay)
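The threshold times above follow from inverting the step response, t = -RC ln(1 - fraction). A minimal sketch, with RC normalized to 1 and a hypothetical helper name:

```python
import math


def time_to_fraction(fraction, rc=1.0):
    """Time at which v(t) = v0*(1 - exp(-t/RC)) reaches fraction*v0."""
    if not 0.0 <= fraction < 1.0:
        raise ValueError("fraction must be in [0, 1)")
    return -rc * math.log(1.0 - fraction)


t10 = time_to_fraction(0.1)
t50 = time_to_fraction(0.5)
t90 = time_to_fraction(0.9)
rise_time = t90 - t10  # 10% to 90% rise time

print("t10 = %.3f RC, t50 = %.3f RC, t90 = %.3f RC" % (t10, t50, t90))
print("rise time = %.3f RC" % rise_time)
```

With this waveform, t = RC (the Elmore delay) corresponds to the point where the output has reached 1 - 1/e, about 63%, of its final value.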

Elmore Delay

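Elmore Delay
Driver is modeled as R
Driver intrinsic gate delay t(B)
Delay = $\sum_{\text{all } R_i} \sum_{\text{all } C_j \text{ downstream from } R_i} R_i C_j$
Elmore delay at n2: R(B)*(C1+C2) + R(w)*C2
Elmore delay at n1: R(B)*(C1+C2)
[Figure: buffer B with resistance R(B) driving node n1 (capacitance C1), then wire resistance R(w) to node n2 (capacitance C2)]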
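The double sum can be evaluated in two passes over an RC tree: first accumulate downstream capacitance bottom-up, then add R times downstream C along each path. The sketch below is one way to do this; the parent-array representation, the function name, and the component values are assumptions for illustration, and the intrinsic delay t(B) is left out.

```python
def elmore_delays(parent, res, cap):
    """Elmore delay from the source to every node of an RC tree.

    parent[i] is the parent of node i (None for node 0, the ideal source).
    res[i] is the resistance of the edge parent[i] -> i.
    cap[i] is the capacitance attached at node i.
    Nodes must be numbered so that parent[i] < i.
    """
    n = len(parent)
    downstream = list(cap)
    for i in range(n - 1, 0, -1):  # bottom-up: total capacitance below each edge
        downstream[parent[i]] += downstream[i]
    delay = [0.0] * n
    for i in range(1, n):          # top-down: add R_i * (capacitance downstream of R_i)
        delay[i] = delay[parent[i]] + res[i] * downstream[i]
    return delay


# Two-node example: source -> R(B) -> n1 (C1) -> R(w) -> n2 (C2)
RB, Rw, C1, C2 = 100.0, 20.0, 1.0, 2.0  # illustrative values (assumption)
d = elmore_delays([None, 0, 1], [0.0, RB, Rw], [0.0, C1, C2])
print("delay at n1:", d[1])  # R(B)*(C1+C2)
print("delay at n2:", d[2])  # R(B)*(C1+C2) + R(w)*C2
```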

Elmore Delay
For a uniform wire of length x with unit wire capacitance c, unit wire resistance r, and load C:
– No matter how the wire is lumped, the Elmore delay is the same
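The lumping claim can be checked numerically. The sketch below assumes each wire segment is modeled as a π-section (half of the segment capacitance at each end), which is a common choice but is not stated in the transcript; the function name and values are hypothetical. Under that assumption every segmentation gives r·x·(c·x/2 + C).

```python
def lumped_wire_delay(r, c, x, load, n):
    """Elmore delay of a wire of length x split into n equal pi-segments.

    Each segment has resistance r*x/n and capacitance c*x/n, with half of
    the capacitance placed at each end; `load` is the sink capacitance C.
    """
    seg_r = r * x / n
    seg_c = c * x / n
    delay = 0.0
    for k in range(1, n + 1):
        # Downstream of the k-th resistor: half of its own segment's
        # capacitance, all later segments, and the load.
        downstream = seg_c / 2 + (n - k) * seg_c + load
        delay += seg_r * downstream
    return delay


r, c, x, load = 0.1, 0.2, 1000.0, 5.0  # illustrative values (assumption)
closed_form = r * x * (c * x / 2 + load)
for n in (1, 2, 10, 100):
    print(n, lumped_wire_delay(r, c, x, load, n), closed_form)
```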

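Delay for Buffer
Buffer parameters:
– Intrinsic buffer delay
– Driver resistance
– Input capacitance
[Figure: buffer between nodes u and v, with capacitance C at v and buffer input capacitance C(b) at u]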

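Buffers Reduce Wire Delay
$t_{unbuf} = R(cx + C) + rx(cx/2 + C)$
$t_{buf} = 2R(cx/2 + C) + rx(cx/4 + C) + t_b$
$t_{buf} - t_{unbuf} = RC + t_b - rcx^2/4$
[Figure: driver R and load C connected by a wire of length x, compared with the same wire split by a buffer into two halves of length x/2, each with resistance rx/2; plot of $\Delta t$ versus x]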
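Setting the difference to zero gives the wire length beyond which a single mid-point buffer pays off, x > 2·sqrt((RC + t_b)/(rc)). A small sketch that evaluates both delay expressions and this break-even length; the function names and numeric values are illustrative assumptions.

```python
import math


def t_unbuf(R, C, r, c, x):
    """Elmore delay of driver R, wire of length x, and load C."""
    return R * (c * x + C) + r * x * (c * x / 2 + C)


def t_buf(R, C, r, c, x, tb):
    """Same wire with one buffer (resistance R, input cap C, delay tb) at x/2."""
    return 2 * R * (c * x / 2 + C) + r * x * (c * x / 4 + C) + tb


def critical_length(R, C, r, c, tb):
    """Length above which the buffered wire is faster: RC + tb = r*c*x^2/4."""
    return 2 * math.sqrt((R * C + tb) / (r * c))


R, C, r, c, tb = 100.0, 2.0, 0.1, 0.2, 50.0  # illustrative values (assumption)
x = 1000.0
diff = t_buf(R, C, r, c, x, tb) - t_unbuf(R, C, r, c, x)
x_crit = critical_length(R, C, r, c, tb)
print("t_buf - t_unbuf at x = 1000:", diff)
print("break-even length:", x_crit)
```

A negative difference means the buffer helps; for wires shorter than the break-even length the buffer's own delay outweighs the savings.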

Combinational Logic Delay
Combinational logic delay <= clock period
[Figure: combinational logic between two clocked registers, with primary input and primary output]

Buffered global interconnects: Intuition
Interconnect delay = $r c l^2$
Now, interconnect delay = $\sum_i r c l_i^2 < r c l^2$ (where $l = \sum_i l_i$), since $\sum_i l_i^2 < (\sum_i l_i)^2$
(Of course, account for buffer delay also)
[Figure: wire of length l divided by buffers into segments l1, l2, l3, ..., ln]
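The inequality is easy to see with numbers. The sketch below compares the two wire-delay terms for a given segmentation; as the slide notes, buffer delay is deliberately left out here, and the function names and values are hypothetical.

```python
def unbuffered_delay(r, c, segments):
    """r*c*l^2 for the whole wire, where l is the sum of the segment lengths."""
    return r * c * sum(segments) ** 2


def buffered_wire_delay(r, c, segments):
    """Sum of r*c*l_i^2 over the segments; buffer delay is NOT included."""
    return r * c * sum(length * length for length in segments)


segments = [1.0, 2.0, 3.0, 4.0]  # illustrative lengths (assumption)
whole = unbuffered_delay(1.0, 1.0, segments)
split = buffered_wire_delay(1.0, 1.0, segments)
print("unbuffered:", whole)
print("buffered (wire part only):", split)
```

For n equal segments the wire term drops by a factor of n, which is why the total becomes linear in length once the number of buffers grows in proportion to the wire length.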

Optimal inter-buffer length
First order (lumped parasitic, Elmore delay) analysis
Assume N identical buffers with equal inter-buffer length $l$ ($L = Nl$)
For minimum delay, set $dT/dl = 0$ and solve for $l_{opt}$, where
– $R_d$: on resistance of inverter
– $C_g$: gate input capacitance
– $r$, $c$: resistance, capacitance per micron
[Figure: line of total length L with identical buffers spaced l apart]
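The slide's own expression is not preserved in the transcript. As a sketch of how such a result is obtained, the derivation below assumes each stage is a buffer with resistance $R_d$ driving a π-modeled wire of length $l$ into the next buffer's input capacitance $C_g$, and neglects intrinsic buffer delay. A different lumping of the wire changes the constant (for example, it can remove the factor of 2), so the result should be read as illustrative.

```latex
\[
T(l) = \frac{L}{l}\left[ R_d\,(c\,l + C_g) + r\,l\left(\frac{c\,l}{2} + C_g\right) \right]
     = L\left[ \frac{R_d C_g}{l} + R_d\,c + r\,C_g + \frac{r\,c\,l}{2} \right]
\]
\[
\frac{dT}{dl} = L\left[ -\frac{R_d C_g}{l^{2}} + \frac{r\,c}{2} \right] = 0
\quad\Longrightarrow\quad
l_{opt} = \sqrt{\frac{2\,R_d\,C_g}{r\,c}}
\]
```

Note that $l_{opt}$ depends only on the buffer and wire parameters, not on the total length $L$.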

Optimal interconnect delay
Substituting $l_{opt}$ back into the interconnect delay expression:
– Delay grows linearly with $L$ (instead of quadratically)
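The linear growth can be illustrated numerically. The sketch below assumes a specific stage model that is not taken from the slide: a buffer of resistance Rd driving a π-modeled wire of length l into a gate capacitance Cg, with no intrinsic buffer delay and the stage count L/l treated as continuous. Under that assumption the optimum is l_opt = sqrt(2·Rd·Cg/(r·c)); all names and values are hypothetical.

```python
import math


def stage_delay(Rd, Cg, r, c, l):
    """Elmore delay of one buffer plus wire segment under the assumed model."""
    return Rd * (c * l + Cg) + r * l * (c * l / 2 + Cg)


def buffered_delay(Rd, Cg, r, c, L, l):
    """Delay of a length-L line cut into L/l stages (L/l treated as continuous)."""
    return (L / l) * stage_delay(Rd, Cg, r, c, l)


def optimal_length(Rd, Cg, r, c):
    """Minimizer of buffered_delay with respect to l for the assumed model."""
    return math.sqrt(2 * Rd * Cg / (r * c))


Rd, Cg, r, c = 1000.0, 2.0, 0.1, 0.2  # illustrative values (assumption)
l_opt = optimal_length(Rd, Cg, r, c)
for L in (1000.0, 2000.0, 4000.0):
    print(L, buffered_delay(Rd, Cg, r, c, L, l_opt), r * c * L ** 2 / 2)
```

The second column doubles when L doubles, while the unbuffered wire term in the third column quadruples.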

Optimized interconnect delay scaling
– Rewrite the optimal interconnect delay expression
– With optimally sized buffers, use $dT/dh = 0$

Total buffer count
Ever-increasing fractions of total cell count will be buffers
– 70% in 32nm
[Figure: % cells used to buffer nets vs. technology node (through 65nm, 45nm, 32nm); series: clk-buf, buf, tot-buf]

ITRS projections