Presentation is loading. Please wait.

Presentation is loading. Please wait.

Interconnect Optimization for Deep-Submicron and Giga-Hertz ICs Lei He UCLA Computer Science Department Los Angeles, CA.

Similar presentations


Presentation on theme: "Interconnect Optimization for Deep-Submicron and Giga-Hertz ICs Lei He UCLA Computer Science Department Los Angeles, CA."— Presentation transcript:

1 Interconnect Optimization for Deep-Submicron and Giga-Hertz ICs Lei He http://cadlab.cs.ucla.edu/~helei UCLA Computer Science Department Los Angeles, CA 90095 Los Angeles, CA 90095

2 Outline n Background and overview n LR-based STIS optimization u LR -- local refinement u STIS -- simultaneous transistor and interconnect sizing n Conclusions and future works

3 Upcoming Design Challenges n Microprocessors used in server computers u 1998 -- 0.25um, 7.5M FETs, 450MHz u 2001 -- 0.18um, ~100M FETs, >1GHz F close to tape-out u 2005 -- 0.10um, ~200M FETS, ~3.5GHz F launch design in 2003 F begin developing design tools in 2001 F start research right now We are moving faster than Moore’s Law We are moving faster than Moore’s Law

4 Critical Issue: Interconnect Delay n Starting from 0.25um generation, circuit delay is dominated by interconnect delay n Efforts to control interconnect delay u Processing technology: Cu and low K dielectric u Design technology:interconnect-centric design

5 Layout Design: Device-Centric versus Interconnect-Centric Floorplaning Global Routing Detailed Routing Timing Driven Placement Delay Budgeting Device-Centric Device-Centric Floorplaning Interconnect Planning, and Interconnect Optimization Global Routing with Interconnect Optimization Detailed Routing with Variable Width and Spacing Timing Driven Placement Delay Budgeting Interconnect-Centric Interconnect-Centric

6 Topology Interconnect Optimization Device locations and constraints: Delay Power Signal integrity Skew... n Automatic solutions guided by accurate interconnect and device models Sizing Spacing n Other critical optimizations: buffer insertion, simultaneous device and interconnect sizing …

7 UCLA TRIO Package n Integrated system for interconnect design n Efficient polynomial-time optimal/near-optimal algorithms u Interconnect topology optimization u Optimal buffer insertion u Optimal wire sizing u Wire sizing and spacing considering Cx u Simultaneous device and interconnect sizing u Simultaneous topology generation with buffer insertion and wiresizing n Accurate interconnect models u 2 -1/2 D capacitance model u 2 -1/2 D inductance model u Elmore delay and higher-order delay models n Improve interconnect performance by up to 7x ! u Used in industry, e.g., Intel

8 Contributions to TRIO Package n Integrated system for interconnect design n Efficient polynomial-time optimal/near-optimal algorithms u Interconnect topology optimization u Optimal buffer insertion u Optimal wire sizing [ ICCAD’95, TODAES’96] u Wire sizing and spacing considering Cx [ICCAD’97, TCAD’99] u Simultaneous device and interconnect sizing [ ICCAD’96, ISPD’88,TCAD’99] u Simultaneous topology generation with buffer insertion and wiresizing n Accurate interconnect models u 2 -1/2 D capacitance model [DAC’97] (with Cadence) u 2 -1/2 D inductance model [CICC’99] (with HP Labs) u Elmore delay and higher-order delay models n Improve interconnect performance by up to 7x ! u Used in industry, e.g., Intel

9 Outline n Background and overview n LR-based STIS optimization u Motivation for LR-based optimization n Conclusions and future works

10 Discrete Wiresizing Optimization [Cong-Leung, ICCAD’93] n Given: A set of possible wire widths { W 1, W 2, …, W r } n Find: An optimal wire width assignment to minimize weighted sum of sink delays Wiresizing Optimization

11 Dominance Relation and Local Refinement n Dominance Relation u For all E j, w(E j )  w'(E j )  W dominates W’ (i.e., W  W’) W W’ W’ n Local refinement (LR) u LR for E 1 to find an optimal width for E 1, assuming widths for other wires are fixed with respect to current width assignment u Single-variable optimization can be solved efficiently

12 Dominance Property for Discrete Wiresizing [Cong-Leung, ICCAD’93] n If solution W dominates optimal solution W* W’ = local refinement of W Then, W’ dominates W* n If solution W is dominated by optimal solution W* W’ = local refinement of W Then, W’ is dominated by W* A highly efficient algorithm to compute A highly efficient algorithm to compute tight lower and upper bounds of optimal solution tight lower and upper bounds of optimal solution

13 Bound Computation based on Dominance Property n Lower bound computed starting with minimum widths u LR operations on all wires constitute a pass of bound computation u LR operations can be in an arbitrary order u New solution is wider, but is still dominated by the optimal solution n Upper bound is computed similarly, but beginning with maximum widths n We alternate lower and upper bound computations u Total number of passes is linearly bounded by size of solution space n Optimal solution is often achieved in experiments LR LR

14 Other Problems Solved by LR operation n Multi-source discrete wiresizing [Cong-He, ICCAD’95] u Bundled-LR is proposed to speed up LR by a factor of 100x n Continuous wiresizing [Chen-Wong, ISCAS’96] u Linear convergence is proved [Chu-Wong, TCAD’99] n Simultaneous buffer and wire sizing [Chen-Chang-Wong, DAC’96] u Lagrangian relaxation is proposed to minimize max delay F via a sequence of weighted delay minimizations u Extended to general gates and multiple nets [Chu-Chen-Wong, ICCAD’98] Why does LR operation work? Why does LR operation work?

15 Outline n Background and overview n LR-based STIS optimization u Motivation of LR-based optimization u Simple CH-program and application to STIS problem n Conclusions and future works

16 Simple CH-function [Cong-He, ICCAD’96, TCAD’99] n It includes the objective functions for a number of works u Discrete or continuous wire sizing [Cong-Leung, ICCAD’93][Cong-He, ICCAD’95][Chen-Wong,ISCAS’96] u Simultaneous device and wire sizing [Cong-Koh, ICCAD’94][Chen- Chang-Wong, DAC’96][Cong-Koh-Leung, ILPED’96][Chu-Chen-Wong, ICCAD’98] n is a simple CH-function u Variables x i and x j are positive, either continuous or discrete u Coefficients a pi and b qj are positive constants n Examples: u

17 Dominance Property for Simple CH-Program n Optimization problem to minimize a simple CH- function is a simple CH-program. n The dominance property holds for simple CH-program w.r.t. the LR operation. u If X dominates optimal solution X* X’ = local refinement of X Then, X’ dominates X* u If X is dominated by optimal solution X* X’=local refinement of X Then, X’ is dominated by X* n LR operation can be used for all simple CH-programs

18 Simple CH-function [Cong-He, ICCAD’96, TCAD’99] n It includes the objective functions for a number of works u Discrete or continuous wire sizing [Cong-Leung, ICCAD’93][Cong-He, ICCAD’95][Chen-Wong,ISCAS’96] u Simultaneous device and wire sizing [Cong-Koh, ICCAD’94][Chen- Chang-Wong, DAC’96][Cong-Koh-Leung, ILPED’96][Chu-Chen-Wong, ICCAD’98] n is a simple CH-function u Variables x i and x j are positive u Coefficients a pi and b qj are positive constants n Examples: u Unified and efficient solution Unified and efficient solution

19 General Formulation: STIS Simultaneous Transistor and Interconnect Sizing Given: Circuit netlist and initial layout design Given: Circuit netlist and initial layout design Determine: Discrete sizes for devices/wires Determine: Discrete sizes for devices/wires Minimize:  Delay +  Power +  Area Minimize:  Delay +  Power +  Area  It is the first publication to consider simultaneous device and wire sizing for complex gates and multiple paths

20 n u unit-width resistance u unit-width area capacitance u effective-fringing capacitance u discrete widths and variables for optimization STIS Objective for Delay Minimization n It is a simple CH-function under simple model assuming R 0, C 0 and C 1 are constants n STIS can be solved by computing lower and upper bounds via LR operations u Identical lower and upper bounds often achieved Res = R 0 /x Res = R 0 /x Cap = C 0 * x + (C f + C x ) Cap = C 0 * x + (C f + C x ) = C 0 * x + C 1 = C 0 * x + C 1

21 SPICE-Delay reduction of LR-Based STIS n STIS optimization versus manual optimization for clock net [Chien-et al.,ISCC’94]: u 1.2um process, 41518.2 um wire, 154 inverters n Two formulations for LR-based optimization u sgws simultaneous gate and wire sizing u stissimultaneous transistor and interconnect sizing manual sgws stis max delay (ns) 4.6324 4.34 (-6.2%)3.96 (-14.4) power(mW)60.85 46.1 (-24.3%)46.3 (-24.2%) clock skew (ps)470 130 (-3.6x)40 (-11.7x) n Runtime (wire segmenting: 10um) u LR-based sgws 1.18s, stis 0.88s u HSPICE simulation ~2100s in total

22 n u unit-width resistance u unit-width area capacitance u fringing capacitance u discrete widths and variables for optimization STIS Objective for Delay Minimization n It is a simple CH-function under simple model assuming R 0,C 0 and C 1 are constants Over-simplified for DSM (Deep Submicron) designs Over-simplified for DSM (Deep Submicron) designs

23 R 0 is far away from a Constant! n R 0 depends on size, input slope t t and output load c l u May differ by a factor of 2 size = 100x c l \ t t 0.05ns 0.10ns0.20ns 0.225pf 12200 12270 19180 0.425pf 8135 9719 12500 0.825pf8124866510250 size = 400x c l \ t t 0.05ns 0.10ns0.20ns 0.501pf 122001555019150 0.901pf 115601336017440 1.701pf8463968812470 effective-resistance R 0 for unit-width n-transistor n Using more accurate model like the table-based device model has the potential of further delay reduction. u But easy to be trapped at local optimum, and to be even worse than using simple model [Fishburn-Dunlop, ICCAD’85]

24 Neither C 0 nor C 1 is a Constant n Both depend on wire width and spacing u Especially C 1 = C f +C x is highly sensitive to spacing n C x accounts for >50% capacitance in DSM u Proper spacing may lead to extra delay reduction u But no existing algorithm for optimal spacing E1E1 E1E1 spacing spacing Spacing (nm) Spacing (nm)

25 STIS-DSM Problem to Consider DSM Effects n STIS-DSM problem u Find: device sizing, and wire sizing and spacing solution optimal w.r.t. accurate device model and multiple nets STIS-DSM STIS-DSM n Easier but less appealing formulation: single-net STIS-DSM u Find: device sizing, and wire sizing and spacing solution optimal w.r.t. accurate device model and a single-net u Assume: its neighboring wires are fixed Non-constant R 0, C 0 and C 1 Non-constant R 0, C 0 and C 1

26 Outline n Background and overview n LR-based STIS optimization u Motivation: LR-based wire sizing u Simple CH-program and application to STIS problem u Bounded CH-program and application to STIS-DSM problem n Conclusions and future works

27 Go beyond Simple CH-function n It is a simple CH-function if u a pi and b qj are positive constants n It is a bounded CH-function if u a pi and b qj are arbitrary functions of X u a pi and b qj are positive and bounded F and n Examples: u u Objective function for STIS-DSM problem

28 Extended-LR Operation n Extended-LR (ELR) operation is a relaxed LR operation u Replace a pi and b qj by its lower or upper bound during LR operation. u Make sure that the resulting lower or upper bound is always correct F but might be conservative. n Example: u ELR for a lower bound is u ELR for an upper bound is ?

29 n Theorem ( [Cong-He, ISPD’98, TCAD’99] ) u Dominance property holds for bounded CH-program with respect to ELR operation General Dominance Property

30 n Theorem ( [Cong-He, ISPD’98, TCAD’99] ) u Dominance property holds for bounded CH-program with respect to ELR operation General Dominance Property n To minimize u If X dominates optimal solution X* X’ = Extended-LR of X Then, X’ dominates X* u If X is dominated by optimal solution X* X’= Extended-LR of X Then, X’ is dominated by X*

31 n STIS-DSM can be solved as a bounded CH-program u Lower bound computed by ELR starting with minimum sizes u Upper bound computed by ELR starting with maximum sizes u Lower- and upper-bound computations are alternated to shrink solution space u Up-to-date lower and upper bounds of R 0, C 0 and C 1 are used F Uncertainty of R 0, C 0 and C 1 is reduced when the solution space is shrunk n There exists an optimal solution to the STIS-DSM problem between final lower and upper bounds u How large is the gap? Solution to STIS-DSM Problem

32 Gaps between Lower and Upper Bounds n Two nets under 0.18um technology: DCLK and 2cm line u STIS-DSM uses table-based device model and ELR operation u STIS uses simple device model and LR operation n We compare average lower-bound width / average gap DCLK STISSTIS-DSMSTISSTIS-DSM sgws 5.39/0.0713.0/1.912.50/0.0032.78/0.025 stis 17.2/1.5321.6/2.362.69/0.0172.82/0.030 2cm line STISSTIS-DSMSTISSTIS-DSM sgws 108/0.108112/0.04.98/0.0044.99/0.106 stis 126/0.97125/1.985.05/0.0325.11/0.091 TransistorsWires n Gap is almost negligible u About 1% of lower-bound width in most cases

33 n STIS-DSM versus STIS u STIS-DSM uses table-based device model and ELR operation u STIS uses simple device model and LR operation DCLK STIS STIS-DSM sgws 1.16 (0.0%) 1.08 (-6.8%) stis 1.13 (0.0%) 0.96 (-15.1%) 2cm line STIS STIS-DSM sgws 0.82 (0.0%) 0.81 (-0.4%) stis 0.75 (0.0%) 0.69 (-7.6%) n STIS-DSM achieves up to 15% extra reduction n Runtime is still impressive u Total optimization time~10 seconds Delay Reduction by Accurate Device Model

34 Delay Reduction by Wire Spacing n Multi-net STIS-DSM achieves up to 39% delay reduction è Single-net STIS-DSM has a significant delay reduction if we compare it with previous wire sizing formulations pitch-spacing delay runtime single-netmulti-net multi-net 1.10um 1.31 0.79 (-39%) 2.0s 1.65um 0.720.52 (-27%) 2.4s 2.20um 0.46 0.42 (-8.7%) 2.3s 2.75um 0.380.36 (-5.3%) 4.9s 3.30um 0.35 0.32 (-8.6%) 7.7s n Multi-net STIS-DSM versus single-net STIS-DSM u Test case: F 16-bit bus F each bit is 10mm-long with 500um per segment

35 Valid for general problems Valid for general problems Conclusions n Interconnect-centric design is the key to DSM and GHz IC designs n Interconnect optimization is able to effectively control interconnect delay u Problem formulations should consider DSM effects F e.g., LR-based optimization for STIS-DSM problem n More is needed to close the loop of interconnect- centric design u Interconnect planning u Interconnect optimization for inductance and noise u Interconnect verification, especially for pattern- dependent noise and delay u …...


Download ppt "Interconnect Optimization for Deep-Submicron and Giga-Hertz ICs Lei He UCLA Computer Science Department Los Angeles, CA."

Similar presentations


Ads by Google