Published by Marvin Hughston. Modified over 2 years ago.

1
Timing-Driven, Over-the-Block Rectilinear Steiner Tree Construction with Pre-Buffering and Slew Constraints
Yilin Zhang and David Z. Pan
ECE, Univ. of Texas at Austin
ISPD 2014

2
Outline
Background & Motivation
TOB-RSMT
›Problem Formulation
›TOB-RSMT Algorithms
Experimental Results
Conclusion

3
History of VLSI RSMTs
Wirelength-driven: BOI, BI1S, RV-based RST, FLUTE and GeoSteiner
Obstacle-avoiding RSMT (OA-RSMT)
›[Chow+, VLSI14] [Liu+, DAC12] [Li+, ICCAD08]
Over-the-block RSMT (OB-RSMT), proposed since 2012
›[Huang+, ICCAD12] [Zhang+, ICCAD12]
Minimum delay routing tree (MDRT): BA-Tree, etc.
RAT-driven RSMT: C-Tree, etc.

4
Limitations of Previous Timing-Driven RSTs
Nodes are clustered during a bottom-up method
›Such as BA-Tree and C-Tree
Clustering distance metric:
›spatial and slack
Hard to find accurate slack:
›Some segments are not fixed yet
›Segments are not buffered yet

5
Limitations in Dealing with Blocks
Completely neglecting blocks causes slew problems
›No over-the-block buffers are allowed
Obstacle avoiding
›More congestion outside the blocks
›Detours mean more WL and worse timing
[Figure label: detours]

6
Post-buffering Topology Tuning is Necessary
Buffering plays a big role in delay reduction
›Shielding effect; linear delay on long wires
›But buffers are always placed after wiring
Changing the topology after buffering is fruitful!
[Figure labels: D_SB unchanged, D_SA decreased, D_b2]

7
Our Contributions
Use pre-buffering to find practical slack for each node in the graph
Use over-the-block routing resources to improve WL, buffering cost and timing
Apply post-buffering tuning to improve timing on critical paths with little extra cost

8
Outline
Background & Motivation
TOB-RSMT
›Problem Formulation
›TOB-RSMT Algorithms
Experimental Results
Conclusion

9
Problem Formulation
N = {s_0, s_1, s_2, ..., s_n}: n sinks and source s_0
B = {b_1, b_2, ..., b_m}: non-overlapping rectilinear blocks in two-dimensional space R
A buffered tree T(V, E) connects all the pins in N to optimize WNS with the lowest buffering cost
›V is the set of nodes
›E is the set of horizontal and vertical edges
Slew rate at every point in T stays within constraints
›Slew mode buffering [Hu+, TCAD07]
No buffers are allowed over the blocks

10
Timing Models
Elmore Delay
Slew
›PERI model + Bakoglu’s metric
»(4% error [Kashyap+, ISPD03] [Bakoglu+, 90])
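The slide's own formulas did not survive extraction, so the following is only a minimal sketch of the two models it names, in their commonly stated textbook forms: Elmore delay on an RC tree, Bakoglu's step-slew estimate (ln 9 times the Elmore delay), and the PERI rule that combines input slew and step slew as a root-sum-of-squares. The array layout (`parent`, `res`, `cap`) and all function names are assumptions made for illustration, not taken from the talk.

```python
import math

LN9 = math.log(9.0)  # Bakoglu: 10%-90% step slew is about ln(9) * Elmore delay


def elmore_delays(parent, res, cap):
    """Elmore delay from the driver to every node of an RC tree.

    Assumed layout: node 0 is the root, parent[0] == -1, and every parent
    index is smaller than its child's index. res[0] is the driver
    resistance; res[i] (i > 0) is the resistance of edge parent[i] -> i.
    cap[i] is the capacitance lumped at node i.
    """
    n = len(parent)
    downstream = list(cap)
    for i in range(n - 1, 0, -1):      # children first: sum subtree capacitance
        downstream[parent[i]] += downstream[i]
    delay = [0.0] * n
    delay[0] = res[0] * downstream[0]  # driver sees the whole tree
    for i in range(1, n):              # parents are visited before children
        delay[i] = delay[parent[i]] + res[i] * downstream[i]
    return delay


def step_slew(elmore):
    """Bakoglu's metric: slew of the response to an ideal step input."""
    return LN9 * elmore


def ramp_slew(input_slew, elmore):
    """PERI: extend the step slew to a ramp input with the given slew."""
    return math.sqrt(input_slew ** 2 + step_slew(elmore) ** 2)
```

For a three-node chain with `parent = [-1, 0, 1]`, `elmore_delays` accumulates the downstream capacitance once and then sweeps from the root, so the cost is linear in the number of nodes.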

11
Overall Algorithm
Input: N & B
Initial timing-driven RST with pre-buffering
Find all over-the-block slew violations and fix them
Buffering
Tune the topology according to buffering information
Buffering
Return buffered T
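Read as a pipeline, the flow above could be wired together roughly as follows. This is only a sketch of the stage order as it appears in the flattened flowchart; every stage is passed in as a callable, and all names (`tob_rst_flow`, `build_initial_tree`, `fix_block_slew`, `insert_buffers`, `tune_topology`) are hypothetical placeholders rather than functions from the authors' implementation.

```python
def tob_rst_flow(pins, blocks, build_initial_tree, fix_block_slew,
                 insert_buffers, tune_topology):
    """Run the stages in the order listed on the slide.

    pins   -- the pin set N (source plus sinks)
    blocks -- the block set B
    The four callables are hypothetical stand-ins for the real stages.
    """
    tree = build_initial_tree(pins, blocks)  # timing-driven RST with pre-buffering
    tree = fix_block_slew(tree, blocks)      # repair over-the-block slew violations
    tree = insert_buffers(tree, blocks)      # first buffering pass
    tree = tune_topology(tree, blocks)       # tuning that uses buffer locations
    return insert_buffers(tree, blocks)      # re-buffer the tuned tree
```

Passing the stages as parameters keeps the sketch runnable with stubs and makes the double buffering pass explicit.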

12
Initial Tree Generation with Pre-Buffering
Iterative method
›Until it converges or oscillates between several states
Feed back real delay to each node to find slack (criticality)
›Critical sinks identified before topology construction are the truly critical ones
›Practical slack on each node
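The stopping rule (converge, or oscillate between several states) can be sketched as a fixed-point loop that remembers every slack vector it has seen. The callback `rebuild_and_time` is a hypothetical stand-in for "build the tree from the current slacks, buffer it, and measure the real slack per sink"; the quantization step `tol` and the iteration cap are also assumptions added here so the loop terminates on floating-point data.

```python
def iterate_until_stable(initial_slacks, rebuild_and_time, tol=1e-3, max_iters=50):
    """Repeat rebuild_and_time until the slacks repeat.

    Returns (slacks, status, iterations) where status is "converged" when
    the state equals the previous one, "oscillating" when it equals an
    older one, and "iteration limit" otherwise.
    """
    seen = {}                                   # quantized state -> iteration index
    slacks = tuple(initial_slacks)
    for it in range(max_iters):
        key = tuple(round(s / tol) for s in slacks)
        if key in seen:
            status = "converged" if seen[key] == it - 1 else "oscillating"
            return slacks, status, it
        seen[key] = it
        slacks = tuple(rebuild_and_time(slacks))  # feed real delays back
    return slacks, "iteration limit", max_iters
```

Storing every visited state, instead of only the previous one, is what lets the loop notice a cycle of length two or more.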

13
Initial Tree with Pre-Buffering Flow
[Lin+, TCAD11]

14
Initial Tree with Pre-Buffering Example
A simple model without buffering suggests D is critical
However, with buffering, D is not critical
Now, D is inserted far from the source with less WL

15
Buffering-Aware Over-the-Block TD-RST
TD-RST needs over-the-block routes
›Better WL, buffer resources and timing
›Replace obstacle-avoiding detours with shorter over-the-block connections
[Figure labels: 150ps, 100ps, 120ps, 110ps]

16
Difference from WL-driven BOB-RSMT
Move non-critical paths to save slew
Protect critical paths for timing
[Figure labels: Original WL driven, WL+slack]

17
Slew Constraints in Buffering-Aware TD-RST
The hard problem with over-the-block routing is slew
Each topology confines a set of inside trees
Use hypothetical buffers to check whether buffering is possible

18
Optimization Primitives
Three optimization primitives [Zhang+, ICCAD12]
›Parallel sliding
›Perpendicular sliding
›EP merging

19
Formulation of Buffering-Aware TD-RST
The formulation considers slack and WL together
[Equation not recoverable; surviving annotations: W ij; C d EP i t: delay increase for every sink downstream of EP i t; increase of TNS; increase of WL]

20
Buffer-location-based Tuning Benefits
Tuning the topology after buffering pays off!
Buffering resources are costly
Improving timing without adding buffers is tempting
›With a small WL increase
We propose a way to post-tune the topology based on buffer location information

21
Saturated/Un-saturated Buffers
Some buffers are “Saturated” and some are “Un-saturated”
›Saturated: the slew reaches the maximum
›Un-saturated: the slew does not reach the maximum
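The distinction can be made concrete with a small classifier. This is a sketch under assumptions: `stage_slew` is a hypothetical mapping from a buffer id to the worst slew observed in the stage that buffer drives, and `tol` is an added tolerance for deciding that the slew "reaches" the maximum.

```python
def classify_buffers(stage_slew, slew_max, tol=1e-9):
    """Split buffers into saturated and un-saturated by slew margin.

    Returns (saturated_ids, {unsaturated_id: remaining_margin}); any
    consistent time unit works for the inputs.
    """
    saturated, unsaturated = [], {}
    for buf, slew in stage_slew.items():
        margin = slew_max - slew   # how far the stage is from the slew limit
        if margin <= tol:
            saturated.append(buf)
        else:
            unsaturated[buf] = margin
    return saturated, unsaturated
```

With a 70 ps limit, `classify_buffers({"b1": 70.0, "b2": 55.0}, 70.0)` reports `b1` as saturated and `b2` as un-saturated with 15 ps of margin left.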

22
Buffer-location-based Tuning Study
Un-saturated buffer == opportunity
[Figure labels: WL increase, Delay to A improves]

23
Buffer-location-based Tuning Condition
Δslew = slew_max - slew_cur
L_max is the max allowed distance to relocate
›If neglecting buffer input cap, L_max = [formula not recoverable]
›If considering buffer input cap, L_max = [formula not recoverable]
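The closed forms for L_max are missing from this transcript, and the sketch below does not claim to reproduce them. It shows one way a bound of this kind can be obtained under stated assumptions: a driver of resistance `r_drv` feeds a uniform wire (per-unit-length `r_wire`, `c_wire`) ending in a load `c_load`, the input is an ideal step, and slew is estimated as ln 9 times the Elmore delay. Setting `c_load=0` loosely corresponds to neglecting the buffer input cap. All names and the model itself are illustrative choices.

```python
import math

LN9 = math.log(9.0)


def max_stage_length(slew_max, r_drv, r_wire, c_wire, c_load=0.0):
    """Longest wire whose far-end step slew stays within slew_max.

    Solves ln9 * (r_drv*(c_wire*L + c_load) + r_wire*L*(c_wire*L/2 + c_load))
    = slew_max for L >= 0, which is a quadratic in L.
    """
    a = 0.5 * r_wire * c_wire
    b = r_drv * c_wire + r_wire * c_load
    c0 = r_drv * c_load - slew_max / LN9
    if c0 >= 0.0:                 # the load alone already uses up the budget
        return 0.0
    return (-b + math.sqrt(b * b - 4.0 * a * c0)) / (2.0 * a)


def relocation_budget(cur_length, slew_max, r_drv, r_wire, c_wire, c_load=0.0):
    """Extra length available beyond the stage's current wire length."""
    limit = max_stage_length(slew_max, r_drv, r_wire, c_wire, c_load)
    return max(0.0, limit - cur_length)
```

Including a nonzero `c_load` always shortens the allowed length, which matches the intuition that accounting for the buffer input cap tightens the bound.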

24
Buffer-location-based Tuning Flow
[Flowchart boxes: Buffered T; Sort all sinks according to slack; For each neg slack sink n; satisfy L_max constraint? (Y/N); Tuning; n = n.parent; n at source? (Y/N); Continue; Buffering; Return buffered T]
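The flowchart's branches are not recoverable from the flattened text, so the following is one plausible reading rather than the authors' exact flow: visit negative-slack sinks from worst to best, walk from each sink toward the source until a node satisfies the L_max condition, tune there, and move on to the next sink if the source is reached first. The callbacks `within_lmax` and `retune`, and the choice to stop climbing after the first tuned node, are assumptions.

```python
def tune_by_buffer_location(sinks, slack, parent, source, within_lmax, retune):
    """Walk each negative-slack sink toward the source and tune once.

    slack  -- dict: sink -> slack value
    parent -- dict: node -> parent node in the buffered tree
    within_lmax(node) -> bool and retune(node) are hypothetical callbacks.
    Returns the list of nodes that were tuned, in visiting order.
    """
    tuned = []
    for sink in sorted(sinks, key=lambda s: slack[s]):  # most negative first
        if slack[sink] >= 0:
            break                       # remaining sinks already meet timing
        node = sink
        while node != source:
            if within_lmax(node):
                retune(node)
                tuned.append(node)
                break
            node = parent[node]         # n = n.parent
    return tuned
```

A final buffering pass on the tuned tree, as listed in the flowchart, would follow this loop.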

25
Outline
Background & Motivation
TOB-RSMT
›Problem Formulation
›TOB-RSMT Algorithms
Experimental Results
Conclusion

26
Experimental Setups
C++ programming language
Intel Core 3.0GHz Linux machine with 32GB memory
Gurobi Optimizer 5.10 for mathematical optimization
RC01-RC12 are the benchmarks [Feng+, ISPD06]
Two sizes of buffers: 450 ohms and 850 ohms, 3.8 fF and 1.9 fF
Interconnect RC from ITRS; slew constraint of 70ps

27
Experimental Setups
SD-OARST is the baseline [Lin+, TCAD11]
TOB-RST-1 is OA-RST with pre-buffering
TOB-RST-2 is over-the-block with pre-buffering
TOB-RST is over-the-block with pre-buffering and post-buffering tuning

28
Experimental Results
TOB-RST-1 to SD-OARST:
›similar WL (buffering cost)
›pre-buffering benefits the slack
TOB-RST-2 to TOB-RST-1:
›179ps WNS improvement on average
›buffering cost and WL reduced by 6% and 5%
TOB-RST to TOB-RST-2:
›70ps WNS improvement on average, with less than 1% more WL

29
Experimental Results

30
Outline
Background & Motivation
TOB-RSMT
›Problem Formulation
›TOB-RSMT Algorithms
Experimental Results
Conclusion

31
Conclusion
Timing-driven over-the-block rectilinear Steiner minimum tree
Use pre-buffering to find practical slack for each node
Use over-the-block routing resources to improve WL, buffering cost and timing
Apply post-buffering tuning to improve timing on critical paths with little extra cost
Significantly improves WNS for all benchmarks, with 2% less WL and 4% less buffering cost than SD-OARST

32
Acknowledgment
This work is supported in part by Oracle
Thanks to Dr. Salim Chowdhury, Dr. Rajendran Panda and Dr. Akshay Sharma from Oracle
Thank you! Questions?
