VADA Lab. SungKyunKwan Univ.
Net-based Delay Constraints
We can use a lumped-C or Elmore delay model to estimate the delay of each net.
The delay of a critical net can be constrained to stay below a target by adding a non-linear penalty for exceeding it.
The total constraint penalty is added to the primary optimization cost function.
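A minimal sketch of the net-based idea, assuming each net is modelled as an RC chain driven from one end. The function names, the quadratic form of the penalty, and the numbers are illustrative assumptions, not taken from the slides (units: ohms and pF, so delays come out in ps).

```python
def lumped_c_delay(r_driver, caps):
    """Lumped-C estimate: driver resistance times the total net capacitance."""
    return r_driver * sum(caps)


def elmore_delay(resistances, caps):
    """Elmore delay to the far end of an RC chain.

    resistances[i] is the series resistance feeding node i and caps[i] is the
    capacitance at node i; each resistance charges every capacitor downstream.
    """
    delay = 0.0
    downstream = sum(caps)
    for r, c in zip(resistances, caps):
        delay += r * downstream
        downstream -= c
    return delay


def net_penalty(delay, target, weight=1.0):
    """Quadratic (non-linear) penalty; zero when the net meets its target."""
    excess = max(0.0, delay - target)
    return weight * excess ** 2


# Hypothetical two-segment net: 100 ohm then 50 ohm, loads of 0.5 pF and 0.25 pF.
d = elmore_delay([100.0, 50.0], [0.5, 0.25])   # 100*0.75 + 50*0.25 = 87.5 ps
print(d, net_penalty(d, target=80.0))          # 87.5 56.25
```

The penalty term would be summed over all constrained nets and added to whatever primary cost (for example, wire length) the placer already minimizes.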
Net vs. Path Delays
We need a way to convert path delays into net delay requirements along the path.
Two basic methods: slack-based constraint generation and path-based layout.
What is Slack?
Slack is the time by which a given path's arrival time 'beats' its required time.
Negative slack means the signal missed the clock, i.e. a timing violation!
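The definition above can be sketched as a forward and a backward pass over a combinational timing graph. This is an illustrative sketch, not the slides' own example: the graph, the gate delays, the zero net delays, and the name compute_slacks are all assumptions.

```python
from collections import defaultdict


def compute_slacks(edges, gate_delay, clock_period):
    """Return (arrival, required, slack) per gate output of a combinational DAG.

    edges: list of (driver, sink) pairs; gate_delay: gate -> delay.
    Primary inputs switch at time 0; primary outputs are required at clock_period.
    Net delays are taken as zero here (an assumption of this sketch).
    """
    succs, preds = defaultdict(list), defaultdict(list)
    for u, v in edges:
        succs[u].append(v)
        preds[v].append(u)

    # Topological order (Kahn's algorithm); the list grows while we scan it.
    indeg = {n: len(preds[n]) for n in gate_delay}
    order = [n for n in gate_delay if indeg[n] == 0]
    for n in order:
        for s in succs[n]:
            indeg[s] -= 1
            if indeg[s] == 0:
                order.append(s)

    arrival = {}
    for n in order:  # forward pass: latest input arrival plus own delay
        arrival[n] = max((arrival[p] for p in preds[n]), default=0.0) + gate_delay[n]

    required = {}
    for n in reversed(order):  # backward pass: tightest downstream requirement
        required[n] = min(
            (required[s] - gate_delay[s] for s in succs[n]), default=clock_period
        )

    slack = {n: required[n] - arrival[n] for n in order}
    return arrival, required, slack


# Hypothetical circuit: gates a and b feed c, which feeds d.
delays = {"a": 2, "b": 3, "c": 1, "d": 2}
_, _, slack = compute_slacks([("a", "c"), ("b", "c"), ("c", "d")], delays, 7.0)
print(slack)  # {'a': 2.0, 'b': 1.0, 'c': 1.0, 'd': 1.0}
```

With a clock period of 5.0 instead of 7.0, the same graph gives a slack of -1.0 at gate d, i.e. a timing violation.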
Computing Slack: Example
(Figure: worked example labelled with gate delays.)
The Zero Slack Algorithm
Limitations of ZSA
ZSA guarantees that the path constraint is met if all net constraints are met, but it is overly conservative.
This lack of flexibility causes us to over-constrain the placement.
It would be better to optimize the path delay directly!
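A small illustration of why fixed per-net budgets are conservative. The equal split of the path slack, the net names, and the delay values are assumptions made for this sketch; they are not taken from the slides.

```python
def nets_meet_budgets(net_delays, net_budgets):
    """Net-based check: every net must individually stay within its budget."""
    return all(net_delays[n] <= net_budgets[n] for n in net_delays)


def path_meets_constraint(net_delays, path_nets, path_budget):
    """Path-based check: only the sum of net delays along the path matters."""
    return sum(net_delays[n] for n in path_nets) <= path_budget


# Hypothetical path through nets a, b, c with 6 units of slack to distribute.
path = ["a", "b", "c"]
budgets = {"a": 2.0, "b": 2.0, "c": 2.0}   # equal split of the path slack

# A placement in which net a is long but nets b and c are short.
delays = {"a": 3.0, "b": 1.0, "c": 1.5}

print(nets_meet_budgets(delays, budgets))                          # False
print(path_meets_constraint(delays, path, sum(budgets.values())))  # True
```

The placement is rejected by the net-based check even though the path itself meets timing, which is the over-constraint described above.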
Path-based Delay Constraints
We can write slack equations of the form Ta + Tb + Tc + Td < Tslack.
Then penalize the paths that exceed their slack, just as we penalized nets that exceeded their delays in ZSA.
The trade-off is performance! It may not be possible to monitor all nets.
Ideally, we would allow layout optimization to trade off delay on one net against the others on the path.
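A sketch of how slack equations of this form could feed a penalty term in the layout cost. The quadratic penalty, the data layout (one list of nets plus one slack value per equation), and the function name are assumptions for illustration only.

```python
def path_penalty(net_delays, slack_equations, weight=1.0):
    """Sum a quadratic penalty over every path whose net delays exceed its slack.

    slack_equations: list of (nets_on_path, t_slack) pairs, one per equation
    of the form Ta + Tb + ... < Tslack.
    """
    total = 0.0
    for nets, t_slack in slack_equations:
        excess = sum(net_delays[n] for n in nets) - t_slack
        if excess > 0.0:
            total += weight * excess ** 2
    return total


# Hypothetical net delays and two slack equations sharing nets a and d.
net_delays = {"a": 3.0, "b": 1.0, "c": 1.5, "d": 2.0}
equations = [
    (["a", "b", "c", "d"], 8.0),  # 7.5 < 8.0: satisfied, no penalty
    (["a", "d"], 4.0),            # 5.0 exceeds 4.0 by 1.0: penalty 1.0
]
print(path_penalty(net_delays, equations))  # 1.0
```

Because only the sum along each path is penalized, an optimizer using this term is free to lengthen one net as long as others on the same path shrink enough to compensate.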
TDP
IBM's TDP is a path-based, timing-driven partitioning, placement, and global routing system [DON90].
TDP Results
Results for a typical 30K-circuit chip:
40,000 slack equations!
Timing-driven partitioning took 39 CPU min.
Timing-driven placement took 56 CPU min (IBM 3090 200).
Issues in deep submicron timing verification
With decreasing device sizes and operating voltages, interconnect-related effects create regions of uncertainty during device switching. These effects include crosstalk, simultaneously switching outputs, and signal noise.
Specifically, at the deep-submicron level, device switching behavior is better described as a transition region than as a transition point, and this makes delays on the same chip less correlated.
Because the delay data in current ASIC cell libraries is based on the interval from one transition point to another, the libraries do not accurately model the uncertainties and the loss of delay correlation introduced by transition regions. This ambiguity can create timing variations that significantly lower production yields or lead to device failures.
The solution to this problem is to model these effects either in the ASIC libraries or in timing verification. The better solution is to handle them in timing verification using current libraries.
The two primary timing verification methods used today to eliminate these hazards are static timing analysis and Verilog gate-level timing simulation (at best- and worst-case process, voltage, and temperature conditions). Both methods use discrete delay values, and neither can effectively model regions of uncertainty in interconnect delays.
How do you use SUE for critical path optimization?
In datapath designs, a few simple directives from the designer can produce speed-optimized layouts. These directives are easily given and modified in SUE.
The placement of components on the schematic directs their relative placement in the preplacement file. Cells can also be hard-placed at specific row and column locations. Empty space can be indicated by a special spacer component.
SUE automatically generates the flat row and column placement and the predicted wire lengths from the compact, hierarchical schematic. The wire predictions can be used to drive a timing analyzer such as Pearl. SUE reads the output from Pearl and displays the critical paths directly on the schematic.
The placement can be modified to optimize the critical paths, or extra drivers can be added to the critical path, all in the schematic. You then run through the placement and timing iteration again. This iteration continues until the timing criteria are satisfied. The iteration loop is fast and visual.
When you are satisfied with the design performance, the placement file is passed to Silicon Ensemble from Cadence or to other commercially available placement and routing tools.
How do you use SUE for critical path optimization?
Further, static timing analysis generates unrealistically pessimistic delays that cause large numbers of false paths. Gate-level timing simulation assumes that delays are perfectly correlated across the die. In deep submicron ASICs, gate delays are no longer fully correlated because of signal transition regions. Thus, traditional timing simulators miss many timing hazards in deep submicron ASICs.
The solution is to use range-delay simulation, which accurately models the effects of ambiguous switching regions and delay correlations. A range-delay simulator models each delay as a range of values that represents the switching uncertainty.
The range-delay simulator can also apply histograms to eliminate false paths. In this technique, a histogram of five equal segments represents the probability distribution of submicron ASIC delays. The simulator then combines the histograms to produce realistic delay ranges across paths, thus avoiding the false paths that would result from using a single pessimistic delay value.
Using histogram-based range simulation, designers can quickly identify and debug timing hazards due to deep submicron effects prior to sign-off. Eliminating these hazards can greatly improve ASIC production yields and reduce time to volume.
Allen Wu is an ASIC programs applications engineer at Nextwave Design Automation (San Jose, CA).
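The text does not say how a range-delay simulator combines its histograms. One plausible reading, sketched here purely as an assumption, is a discrete convolution of two five-segment delay histograms for gates in series, treating the two delays as independent. All names and numbers below are hypothetical.

```python
from itertools import product


def bin_centers(lo, hi, bins=5):
    """Centers of `bins` equal-width segments spanning the delay range [lo, hi]."""
    width = (hi - lo) / bins
    return [lo + (i + 0.5) * width for i in range(bins)]


def combine_series(dist_a, dist_b):
    """Distribution of the sum of two independent delays (discrete convolution).

    Each distribution maps a delay value (bin center) to its probability.
    """
    out = {}
    for (da, pa), (db, pb) in product(dist_a.items(), dist_b.items()):
        key = round(da + db, 6)  # merge sums that differ only by float noise
        out[key] = out.get(key, 0.0) + pa * pb
    return out


def trimmed_range(dist, tail=0.05):
    """Delay range left after discarding up to `tail` probability from each end."""
    items = sorted(dist.items())
    lo_i, acc = 0, 0.0
    while acc + items[lo_i][1] <= tail:
        acc += items[lo_i][1]
        lo_i += 1
    hi_i, acc = len(items) - 1, 0.0
    while acc + items[hi_i][1] <= tail:
        acc += items[hi_i][1]
        hi_i -= 1
    return items[lo_i][0], items[hi_i][0]


# Hypothetical gate: delay between 1.0 and 2.0, most likely near the middle.
probs = [0.05, 0.2, 0.5, 0.2, 0.05]
gate = dict(zip(bin_centers(1.0, 2.0), probs))

two_gates = combine_series(gate, gate)
print(min(two_gates), max(two_gates))   # extreme bin-center sums for the path
print(trimmed_range(two_gates))         # narrower range covering the likely delays
```

In this example the extreme ends of the combined distribution carry very little probability, so the trimmed range is noticeably narrower than the range obtained by simply adding the worst-case values of each gate.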
The following figure shows the placement generated for a sample 8-bit ALU. A critical path is highlighted in red and blue. Timing for other nets is indicated in the menu, and new nets can be selected and highlighted.
References
Constraint Generation
[BEN91] J. Benkoski and A. Strojwas: The Role of Timing Verification in Layout Synthesis, DAC-91
[CHO90] U. Choudhury and A. Sangiovanni-Vincentelli: Constraint Generation for Routing Analog Circuits, DAC-90
[LUK91] W. Luk: A Fast Physical Constraint Generator for Timing Driven Layout, DAC-91
[NAI88] R. Nair, C. Berman, P. Hauge, and E. Yoffa: Generation of Performance Constraints for Layout, DAC-91
[SUT90] S. Sutanthavibul and E. Shragowitz: An Adaptive Timing-Driven Layout for High Speed VLSI, DAC-90
[SUT91] S. Sutanthavibul and E. Shragowitz: Dynamic Prediction of Critical Paths and Nets for Constructive Timing-Driven Placement, DAC-91
Timing-Driven Floorplanning and Partitioning
[LAP89] D. LaPotin and Y. Chen: Early Matching of System Requirements and Package Capabilities, ICCAD-89
[SHI92] M. Shin, E. Kuh, and R. Tsai: Performance-Driven System Partitioning on Multi-Chip Modules, DAC-92
[SUT89] S. Sutanthavibul, E. Shragowitz, and J. Rosen: An Analytical Approach to Floorplan Design and Optimization, IEEE Trans. on CAD, June 1991
Timing-Driven Global Routing
[BRA90] D. Brasen and M. Bushnell: MHERTZ: A New Optimization Algorithm for Floorplanning and Global Routing, DAC-90
[CON91] J. Cong et al.: Performance-Driven Global Routing for Cell Based ICs, ICCAD-91
[CON92] J. Cong et al.: Provably Good Performance-Driven Global Routing, ICCAD-90
[HUA93] J. Huang et al.: An Efficient Timing-Driven Global Routing Algorithm, DAC-93
[PAT85] A. Patel, N. Soong, and R. Korn: Hierarchical VLSI Routing, IEEE Trans. on CAD, April 1985