Fast and Accurate Rectilinear Steiner Minimal Tree Algorithm for VLSI Design
  Chris Chu, Iowa State University
  Yiu-Chung Wong, Rio Design Automation
RSMT Problem
  Rectilinear Steiner minimal tree (RSMT) problem: given pin positions, find a rectilinear Steiner tree with minimum wirelength (WL)
  NP-complete
  Optimal algorithms:
    Hwang, Richards, Winter [ADM 92]
    Warme, Winter, Zachariasen [AST 00]: GeoSteiner package
  Near-optimal algorithms:
    Griffith et al. [TCAD 94]: Batched Iterated 1-Steiner heuristic (BI1S)
    Mandoiu, Vazirani, Ganley [ICCAD-99]
  Low-complexity algorithms:
    Borah, Owens, Irwin [TCAD 94]: edge-based heuristic, O(n log n)
    Zhou [ISPD 03]: spanning graph based, O(n log n)
  Algorithms targeting low-degree nets (VLSI applications):
    Soukup [Proc. IEEE 81]: Single Trunk Steiner Tree (STST)
    Chen et al. [SLIP 02]: Refined Single Trunk Tree (RST-T)
Overview
  A fast and accurate algorithm targeting VLSI applications
  Based on the FLUTE (Fast LookUp Table Estimation) idea [ICCAD-04], with three new contributions
  The new algorithm is still called FLUTE
  Handles low-degree nets extremely well:
    Optimal and extremely efficient for nets of up to 9 pins
    Still very accurate for nets up to degree 100
  So FLUTE is especially suitable for VLSI applications. Over all 1.57 million nets in 18 IBM circuits [ISPD 98]:
    More accurate than the Batched Iterated 1-Steiner heuristic
    Almost as fast as minimum spanning tree construction
Review of FLUTE
  Lookup-table-based approach, originally proposed for wirelength estimation
  Given a net:
    1. Find the group index of the net
    2. Get the potentially optimal wirelength vectors (POWVs) from the LUT
    3. Find the segment lengths
    4. Find the WL for each POWV and return the best
  Example: group index 3142
    POWV (1,2,1,1,1,1): WL = HPWL + 2 = 22
    POWV (1,1,1,1,2,1): WL = HPWL + 6 = 26
    Return the smaller WL, 22
  [Figure: the example net drawn once per POWV; labels 3 2 1 4 and segment lengths 3 2 5 6]
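The four lookup steps above can be sketched in Python. This is a minimal illustration, not the FLUTE source: the function names, the dictionary used as a table, and the pin coordinates are assumptions. The coordinates are chosen so that the group index is 3142, the horizontal gaps are 3, 2, 5 and the middle vertical gap is 6; the two outer vertical gaps (2 and 2) are assumed so that HPWL = 20. The sketch also assumes that a POWV lists the horizontal coefficients first, then the vertical ones, and that all pin coordinates are distinct.

```python
def group_index(pins):
    """Rank pins by x (1-based), then list those ranks in ascending y."""
    by_x = sorted(range(len(pins)), key=lambda i: pins[i][0])
    x_rank = {pin: rank + 1 for rank, pin in enumerate(by_x)}
    by_y = sorted(range(len(pins)), key=lambda i: pins[i][1])
    return tuple(x_rank[i] for i in by_y)


def segment_lengths(pins):
    """Gaps between adjacent grid lines: horizontal gaps, then vertical gaps."""
    xs = sorted(x for x, _ in pins)
    ys = sorted(y for _, y in pins)
    h = [b - a for a, b in zip(xs, xs[1:])]
    v = [b - a for a, b in zip(ys, ys[1:])]
    return h + v


def lookup_wirelength(pins, lut):
    """Evaluate every POWV stored for the net's group and keep the best."""
    seg = segment_lengths(pins)
    powvs = lut[group_index(pins)]
    return min(sum(c * s for c, s in zip(powv, seg)) for powv in powvs)


# Hypothetical one-entry table holding the two POWVs from the slide.
lut = {(3, 1, 4, 2): [(1, 2, 1, 1, 1, 1), (1, 1, 1, 1, 2, 1)]}
pins = [(5, 0), (0, 2), (10, 8), (3, 10)]  # assumed coordinates

print(group_index(pins))             # (3, 1, 4, 2)
print(segment_lengths(pins))         # [3, 2, 5, 2, 6, 2]
print(lookup_wirelength(pins, lut))  # 22
```

The two POWVs evaluate to 22 and 26 on these segment lengths, which agrees with the HPWL + 2 and HPWL + 6 values on the slide.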
Statistics on POWV Table
  Boundary compaction technique to build the LUT
  Optimal up to degree 9
  Table size for all nets up to degree 9 is 2.75 MB
  MST-based algorithm to evaluate a net efficiently
  Impractical for high-degree nets
High-Degree Nets by Net Breaking
  Build the lookup table only up to degree D=9
  For nets up to degree D, use the lookup table
  For nets with degree > D, recursively break the net until degree <= D
  Original net breaking technique:
    Try to break a net both horizontally and vertically
    For each direction, select one pin at which to break the net
    Select the pin that minimizes the total HPWL of the two subnets
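The original net breaking rule can be sketched as follows. This is an illustrative reading of the slide, not the FLUTE source: the names `hpwl` and `best_break` are hypothetical, the breaking pin is assumed to belong to both subnets, and the two directions are modeled simply as sorting the pins by x (axis 0) or by y (axis 1).

```python
def hpwl(pins):
    """Half-perimeter wirelength of the bounding box of a pin set."""
    xs = [x for x, _ in pins]
    ys = [y for _, y in pins]
    return (max(xs) - min(xs)) + (max(ys) - min(ys))


def best_break(pins):
    """Return (cost, axis, pin) for the break with the least total HPWL.

    Only interior pins are tried, so each subnet has at least two pins
    and is strictly smaller than the original net.
    """
    best = None
    for axis in (0, 1):  # 0: order pins by x, 1: order pins by y
        ordered = sorted(pins, key=lambda p: p[axis])
        for r in range(1, len(ordered) - 1):
            # The breaking pin ordered[r] is shared by both subnets (assumption).
            cost = hpwl(ordered[:r + 1]) + hpwl(ordered[r:])
            if best is None or cost < best[0]:
                best = (cost, axis, ordered[r])
    return best


pins = [(0, 0), (4, 1), (5, 9), (6, 2), (10, 3)]  # made-up example net
print(best_break(pins))  # (19, 0, (4, 1))
```

On this made-up net the best break costs 19, equal to the HPWL of the whole net; breaking at (5, 9) in the x ordering would cost 26.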
Our Contributions
  1. Extension for RSMT construction
  2. Improved net breaking technique
       Optimal net breaking algorithm
       Net Breaking Heuristic #1
       Net Breaking Heuristic #2
       Net Breaking Heuristic #3
  3. Accuracy control scheme
RSMT Construction
  If degree <= D, store one routing topology for each POWV
  If degree > D, the Steiner trees of the two subnets are combined
  Redundant segments can be detected and removed
  [Figure: routing topologies for POWV (1,2,1,1,1,1) and POWV (1,1,1,1,2,1)]
Optimal Net Breaking Algorithm
  Condition: pins in opposite quadrants
  Theorem: the Steiner tree constructed by combining the two optimal sub-trees is optimal
  [Figure label: Steiner node]
Net Breaking Heuristic #1
  A score is computed for each direction and each pin
  Break the net in the way that gives the highest score
  [Figure: Subnet 1, Pin r, Subnet 2]
Net Breaking Heuristic #2
  A score is computed for each direction and each pin
  Break the net in the way that gives the highest score
  [Figure: Subnet 1, Pin r, Subnet 2]
Net Breaking Heuristic #3
  A score is computed for each direction and each pin
  Break the net in the way that gives the highest score
  [Figure: Center grid point, Pin r]
Accuracy Control Scheme
  Accuracy parameter A
  Break a net in the A ways with the highest scores
  Subnets are handled with accuracy max(A-1, 1)
  Runtime complexity = O(A! n log n)
  Default A=3
  [Figure: recursion tree labeled with accuracy values 3, 2, 1]
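The recursion behind the accuracy parameter can be sketched as below. This is a toy under stated assumptions, not the FLUTE implementation: the scores of the three heuristics are not recoverable from the slides, so candidate breaks are ranked by total subnet HPWL as a stand-in; the lookup table is replaced by plain HPWL with D=3, which is exact because the RSMT wirelength of a net with at most 3 pins equals its HPWL. All function names are hypothetical.

```python
def hpwl(pins):
    """Half-perimeter wirelength of the bounding box of a pin set."""
    xs = [x for x, _ in pins]
    ys = [y for _, y in pins]
    return (max(xs) - min(xs)) + (max(ys) - min(ys))


def ranked_breaks(pins):
    """All (subnet1, subnet2) splits, best first.

    Stand-in ranking: smaller total HPWL is treated as a higher score.
    """
    breaks = []
    for axis in (0, 1):
        ordered = sorted(pins, key=lambda p: p[axis])
        for r in range(1, len(ordered) - 1):
            left, right = ordered[:r + 1], ordered[r:]  # both keep pin r
            breaks.append((hpwl(left) + hpwl(right), left, right))
    breaks.sort(key=lambda b: b[0])
    return [(left, right) for _, left, right in breaks]


def estimate_wl(pins, A, D=3):
    """Wirelength estimate using the A best breaks at this level."""
    if len(pins) <= D:
        return hpwl(pins)  # stand-in for the table lookup; exact only for D <= 3
    sub_a = max(A - 1, 1)  # subnets are handled with reduced accuracy
    return min(
        estimate_wl(left, sub_a, D) + estimate_wl(right, sub_a, D)
        for left, right in ranked_breaks(pins)[:A]
    )


pins = [(0, 0), (4, 1), (5, 9), (6, 2), (10, 3), (2, 7), (8, 6)]
for a in (1, 2, 3):
    print(a, estimate_wl(pins, a))
```

In this sketch a larger A can only keep or lower the estimate, because the candidates explored with accuracy A are a subset of those explored with accuracy A+1.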
Experimental Setup
  Comparing five techniques:
    RMST: Prim's RMST algorithm, Prim [BSTJ 57]
    RST-T: Refined Single Trunk Tree, Chen et al. [SLIP 02]
    SPAN: Spanning graph based algorithm, Zhou [ISPD 03]
    BI1S: Batched Iterated 1-Steiner heuristic, Griffith et al. [TCAD 94]
    FLUTE with D=9 and A=3
  18 IBM circuits in the ISPD98 benchmark suite
  Placement by FastPlace [ISPD 04]
  Optimal solutions by GeoSteiner 3.1 (Warme et al.)
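For reference, the RMST baseline can be illustrated with a textbook O(n^2) Prim's algorithm under the rectilinear (Manhattan) metric. This is a generic sketch with a hypothetical function name, not the code used in the experiments.

```python
def rmst_length(pins):
    """Total length of a rectilinear minimum spanning tree (Prim, O(n^2))."""
    n = len(pins)
    if n < 2:
        return 0
    in_tree = [False] * n
    in_tree[0] = True
    # dist[i]: Manhattan distance from pin i to the nearest pin in the tree.
    dist = [abs(pins[0][0] - x) + abs(pins[0][1] - y) for x, y in pins]
    total = 0
    for _ in range(n - 1):
        # Attach the closest pin that is not yet in the tree.
        j = min((i for i in range(n) if not in_tree[i]), key=lambda i: dist[i])
        total += dist[j]
        in_tree[j] = True
        for i in range(n):
            if not in_tree[i]:
                d = abs(pins[j][0] - pins[i][0]) + abs(pins[j][1] - pins[i][1])
                if d < dist[i]:
                    dist[i] = d
    return total


cross = [(0, 0), (2, 0), (1, 1), (1, -1)]
print(rmst_length(cross))  # 6
```

For this cross-shaped net the spanning tree has length 6, while a Steiner tree with one Steiner node at (1, 0) has length 4, which shows why an RMST overestimates the Steiner wirelength.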
Benchmark Information
Accuracy Comparison
Runtime Comparison
  All experiments were carried out on a 750 MHz Sun Sparc-2 machine
  [Table: runtime per technique, with a normalized column]
Breakdown According to Net Degree
  All 1.57 million nets in 18 circuits
Accuracy vs. Runtime Tradeoff
  [Plot: error vs. runtime for FLUTE with D=9 and A=1 to A=7 (A=3 is the default); reference points: RMST (error 4.23%) and BI1S (runtime 8020s)]
Conclusion
  FLUTE:
    Rectilinear Steiner Minimal Tree algorithm
    Post-placement pre-routing wirelength estimation
  Very suitable for VLSI applications:
    Optimal up to degree 9
    Very accurate up to degree 100
    Very fast
    Nice tradeoff between accuracy and runtime
  Techniques introduced:
    Extension of the FLUTE idea to RSMT construction
    1 optimal algorithm + 3 heuristics for net breaking
    Scheme to trade off accuracy and runtime
Future Work
  Better technique to handle high-degree nets
  RSMT construction with obstacles
  Extension to timing-driven Steiner tree construction
  Source code available in the GSRC Bookshelf: http://vlsicad.eecs.umich.edu/BK/slots (Rectilinear Spanning and Steiner tree slot)
Thank You
Accuracy for Nets of Degree <=100
Runtime for Nets of Degree <=100
POWV Generation for Degree >= 7
  Need to include some extra topologies
  For degree 7 or more, if all pins are on the boundary, include the following topologies in addition to those generated by boundary compaction:
  [Figure: the extra topologies]
  Enumerate all POWVs for degree-7 nets
  Enumerate almost all POWVs for degree-8 nets