Andrew B. Kahng‡†, Mulong Luo†, Siddhartha Nath†

Name: Andrew B. Kahng‡†, Mulong Luo†, Siddhartha Nath†
Uploaded: 2017-12-13T00:15:51+00:00
Duration: PTM31S42
Channel: Edwin Stewart
Description: Andrew B. Kahng‡†, Mulong Luo†, Siddhartha Nath†

SI for Free: Machine Learning of Interconnect Coupling Delay and Transition Effects
Andrew B. Kahng‡†, Mulong Luo†, Siddhartha Nath†
‡† ECE and †CSE Departments, UC San Diego {abk, muluo,
Thank you for the introduction.

Outline
Motivation; Previous Work; Our Methodology and Accuracy Results; Design of Experiments and Robustness Results; Conclusions
My talk is structured as follows. First I give the motivation, then I summarize previous work. Next I describe our methodology and accuracy results, followed by our design of experiments and robustness results. Finally, I give the conclusions.

Non-SI to SI Calibration Use Case
Post-P&R database files: .db, .lib, .spef, .v, .sdc. A non-SI timer produces a non-SI timing report; calibration then converts it to an SI timing report. Save cost, save runtime, but still accurate.
Calibration: recipe to convert a non-SI timing report to an SI timing report.
The main motivation of our talk is whether we can create a recipe to convert a non-SI timing report to an SI timing report. This conversion recipe is what we refer to as calibration. Instead of using STA tools with SI capability, we can use STA tools without SI capability to generate timing reports from post-P&R databases. Then, with a recipe to convert from non-SI to SI, we can generate SI timing reports. In this way, we do not need STA tools with SI capability. That saves money and runtime, but still gives the accuracy of SI timing reports.

Non-SI vs. SI: How Bad is the Divergence?
Slack diverges by 81ps (clock period = 1.0ns). 81ps is ~4 stages of logic at 28nm FDSOI.
[Plot: path slack in SI mode (ns) on the x-axis versus path slack in non-SI mode (ns) on the y-axis, with the ideal-correlation line and the 81ps divergence marked.]
Let's examine how much the SI and non-SI reports of the STA tool differ. This plot shows path slack in SI mode on the x-axis and path slack in non-SI mode on the y-axis. The black line is the ideal correlation and the red circles are our experimental data. The divergence can be as large as 81ps in the critical path at a clock period of 1ns. 81ps is equivalent to about 4 stages of logic at 28nm.

Non-SI to SI Calibration is Difficult!
Multiple electrical, logic structure and layout parameters. Complex interactions between parameters. Black-box code in STA tools.
[Diagram: electrical, logic structure and layout parameters feed commercial STA tools, which produce SI timing reports: incremental delay, transition time, path delay.]
However, calibrating non-SI to SI is difficult. Several electrical, logic structure, and layout parameters affect these correlations, and there are complex interactions between the parameters. Besides, a commercial STA tool is generally a black box. We do not know exactly what it does internally during timing analysis. Thus it is hard to model its timing reports.

Example Challenge: Clock Period Dependency
∆path slack is 81ps at the signoff clock period of 1.0ns. Tightening the clock period to 0.82ns changes ∆path slack to 143ps!
[Plot annotations: 143ps at the tighter clock period; 81ps at the signoff clock period.]
During our modeling process we encountered several challenges. One example is the dependency on clock period. When the clock period is 1.0ns, the maximum difference in path slack between SI and non-SI modes in the critical path is only 81ps. But when we tighten the clock period to 0.82ns, the difference becomes 143ps. This is non-intuitive because an X ps change in clock period should change arrival times at all pins by the same amount.

Example Challenge: Ground Capacitance Dependency
Incremental transition time (DTran) increases but incremental delay due to SI (SI Incr Delay) decreases. This anti-correlation is non-obvious!
[Plot: incremental transition time and SI incremental delay as functions of ground capacitance; annotations 14ps and 15ps.]
Another challenge arises from the ground capacitance dependency. We show the incremental transition time and the incremental delay as functions of ground cap. We see that while incremental transition time increases with ground cap, incremental delay decreases. This anti-correlation is non-obvious, because in general we expect both transition time and delay to increase with capacitance.

Our Contributions
Identify multiple sources of timing divergence between non-SI and SI modes
Provide new insights in terms of modeling parameters required to calibrate non-SI to SI timing
Develop new models to calibrate non-SI to SI timing using machine learning-based techniques
Demonstrate accuracy and robustness of our models on a variety of testcases
Worst-case divergence of 5.2ps in incremental delay due to SI
Worst-case divergence of 8.2ps in SI-aware path delay
We overcome these challenges with our methodology. Our contributions are as follows. We identify multiple sources of timing divergence between non-SI and SI modes. Based on that, we provide new insight into the parameters required to calibrate non-SI to SI timing. Then we develop new models that accurately calculate SI timing from non-SI timing reports using machine learning-based techniques. We demonstrate the accuracy of our models on a variety of testcases.

Now I review previous work.

Review of Previous Works
Analytical SI-induced delay models
Sapatnekar2000: lumps coupling capacitance to ground capacitance using Miller coupling factors; uses an iterative algorithm to estimate crosstalk delay on nets
Xiao2000: derives a two-pole model for crosstalk noise computation using the iterative Newton-Raphson method
Correlation of STA tools
Thiel2004: correlates SPICE to PT timing reports
Kahng2013: proposes an offset-based correlation and wire delay estimation using linear regression to calibrate path slacks with PT
Han2014: develops machine learning models to correlate SI to SI and non-SI to non-SI timing between STA tools, and between STA and design implementation tools
Previous work can be classified into two categories. The first category is analytical modeling of SI-induced delay. This includes the Sapatnekar2000 and Xiao2000 works. For analytical models the runtime can be high, so in practice they are difficult to use for large designs. The second category is correlation of STA tools. That includes the Thiel2004, Kahng2013 and Han2014 works. The third one is worth mentioning because it calibrates non-SI to non-SI timing reports, or SI to SI timing reports. Starting from that, we calibrate timing reports of STA tools from non-SI mode to SI mode. Let's take a look at Han's work.

Miscorrelations of Han2014
Calibrate non-SI to non-SI or SI to SI: signoff timer to signoff timer; signoff timer to IC implementation tools
Divergence of 60ps when trying to calibrate non-SI to SI
We need new models to calibrate SI from non-SI!
[Plot: actual incremental delay in SI mode (ps) versus incremental delay in SI mode predicted from non-SI mode information (ps), with the ideal-correlation line and a 60ps divergence marked.]
Han2014 calibrates non-SI to non-SI or SI to SI. It can calibrate signoff timer to signoff timer, both with SI mode on or both with SI mode off, for example PrimeTime and ETS. It can also calibrate a signoff timer to IC implementation tools, for example SOCE and ICC. We reimplemented the methodology and used it to calibrate the non-SI timing report to the SI timing report of the same timer. The results show that the divergence in incremental delay can be as large as 60ps in the critical path. That is a large number for incremental delay. We need new models to calibrate SI from non-SI.

Let's look at our modeling methodology and accuracy results.

Identifying Modeling Parameters
Need to consider new electrical parameters: Rw is the resistance of an arc; Cc is the coupling capacitance of an arc; LE is the logical effort of a driver.
[Plots: incremental delay in SI mode (ns) versus LE, and incremental delay in SI mode (ns) versus Rw x Cc.]
We conducted studies to understand the parameters that affect SI timing. Here we show the sensitivity of incremental delay due to SI to electrical parameters. We need to consider new electrical parameters such as the product of Rw and Cc, as well as LE. Rw and Cc are the resistance and coupling capacitance of an arc, and LE denotes the logical effort of a driver. Because Rw, Cc and LE have a large impact on timing, we identify them as modeling parameters for incremental delay in SI mode.

List of Modeling Parameters
Several new electrical, logic structure, layout parameters
Parameters (with the types and sources shown on the slide): transition time in non-SI mode (electrical; non-SI timing reports); resistance of an arc (SPEF); coupling cap of an arc (electrical, layout); ratio of coupling to total capacitance; logical effort of driver (SPEF, Liberty); ratio of arc's stage to total # stages (logic structure); clock period (constraint; SDC); {min, max} x {rise, fall} delta arrival times between the worst aggressor and victim; toggle rate of victim net (operational, logic structure); path delay in non-SI mode.
Models, in order: incremental transition time due to SI; incremental delay due to SI; SI-aware path delay. Each uses electrical, logic structure, and layout parameters and constraints.
We identify the modeling parameters through studies of the kind shown on the previous slide. Here we list all the parameters used for modeling, with their respective types and sources. They include electrical, layout, and logic structure parameters, and constraints. They come from SPEF, non-SI timing reports, library and SDC files. Compared with previous work, we use several new parameters for the modeling. We develop three models based on our studies. We first model incremental transition time based on these parameters. Then we model incremental delay due to SI based on the transition time model and the parameters. Finally we model SI-aware path delay based on the incremental delay model and other parameters.

Comparison between Models
Han2014 models have worst-case path delay error of 87.3ps vs. 8.2ps error from our models
[Plots: predicted path delay (ps) versus actual path delay (ps) with the ideal-correlation line; worst-case error of 87.3ps for the Han2014 models and 8.2ps for our models.]
Here we compare the predictions of the Han2014 models with those of our models. Because Han's models do not consider SI incremental delay, the differences can be as large as 87.3ps. However, by adding the new electrical, logic structure, and layout parameters, the path delay prediction error reduces to only 8.2ps.

Modeling Flow
Timing Reports in SI Mode; Timing Reports in Non-SI Mode
Create Training, Validation and Testing Sets
ANN (2 Hidden Layers, 5-Fold Cross-Validation)
SVM (RBF Kernel, 5-Fold Cross-Validation)
HSM (Weighted Predictions from ANN and SVM)
Save Model and Exit
Linear regression cannot capture complex interactions between parameters. Non-linear techniques capture these interactions using hidden parameters.
Here we show our modeling flow. We first acquire timing reports for both SI and non-SI modes by running STA tools. Then we divide those reports into training, validation, and testing sets. We then use an ANN with 2 hidden layers and 5-fold cross-validation. We also use an SVM with an RBF kernel and 5-fold cross-validation. Then we use HSM to find weighted predictions from the ANN and SVM. Finally we save the model and exit. Because linear regression cannot capture complex interactions between parameters, we use non-linear modeling techniques such as ANN and SVM. Non-linear techniques can handle this by using hidden parameters.
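The two generic steps of this flow, 5-fold splitting and weighting the predictions of two models, can be sketched in plain Python. This is only an illustration under stated assumptions: the talk does not say how the HSM weights are computed, so the least-squares weight below is a stand-in, and the function names `kfold_indices`, `blend_weight` and `blend` are hypothetical.

```python
from typing import List, Sequence, Tuple


def kfold_indices(n: int, k: int = 5) -> List[Tuple[List[int], List[int]]]:
    """Split indices 0..n-1 into k (train, validation) pairs."""
    folds = [list(range(i, n, k)) for i in range(k)]  # interleaved folds
    splits = []
    for i in range(k):
        val = folds[i]
        train = [j for f in folds[:i] + folds[i + 1:] for j in f]
        splits.append((train, val))
    return splits


def blend_weight(pred_ann: Sequence[float], pred_svm: Sequence[float],
                 actual: Sequence[float]) -> float:
    """Weight w minimizing squared error of w*ann + (1-w)*svm, clipped to [0, 1].

    Assumption: a simple least-squares blend, not necessarily the HSM used in the talk.
    """
    num = sum((a - s) * (y - s) for a, s, y in zip(pred_ann, pred_svm, actual))
    den = sum((a - s) ** 2 for a, s in zip(pred_ann, pred_svm))
    if den == 0.0:  # both models agree everywhere, so any weight gives the same result
        return 0.5
    return min(1.0, max(0.0, num / den))


def blend(pred_ann: Sequence[float], pred_svm: Sequence[float], w: float) -> List[float]:
    """Weighted combination of the two models' predictions."""
    return [w * a + (1.0 - w) * s for a, s in zip(pred_ann, pred_svm)]
```

In such a sketch the weight would be fitted on validation-fold predictions and then applied to the test set, so that the blend is not tuned on the data used to report accuracy.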

Incremental Transition Time (Due to SI) Model
Incremental transition time (due to SI) = transition time considering SI - transition time w/o SI
$\Delta T_{si} = f(T_{si'}, R_w, C_c, r_{C_c,C_{tot}}, clkp, LE)$
We use six modeling parameters:
$\Delta T_{si}$: incremental transition time of an arc due to SI
$T_{si'}$: transition time of an arc in non-SI mode
$R_w$: resistance of an arc
$C_c$: coupling capacitance of an arc
$r_{C_c,C_{tot}}$: ratio of coupling to total capacitance
$clkp$: clock period
$LE$: logical effort of the driver of the net
Here we show how we model incremental transition time. We use six parameters: the transition time in non-SI mode, the resistance of an arc, the coupling capacitance of an arc, the ratio of coupling to total capacitance, the clock period, and the logical effort of the driver of the net.
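As a rough illustration of how the six inputs could be assembled into one feature vector per arc, here is a minimal sketch. The class name, field names and units are assumptions made for the example and do not come from the talk.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ArcFeatures:
    """Inputs of the incremental transition time model for one timing arc."""
    tran_non_si_ps: float   # transition time of the arc in non-SI mode
    r_wire_ohm: float       # Rw, resistance of the arc
    c_coupling_ff: float    # Cc, coupling capacitance of the arc
    c_total_ff: float       # Ctot, total capacitance of the arc
    clock_period_ns: float  # clkp, clock period
    logical_effort: float   # LE of the driver of the net

    def to_vector(self) -> List[float]:
        """Return the six model inputs; the ratio Cc/Ctot is derived here."""
        ratio = self.c_coupling_ff / self.c_total_ff if self.c_total_ff else 0.0
        return [self.tran_non_si_ps, self.r_wire_ohm, self.c_coupling_ff,
                ratio, self.clock_period_ns, self.logical_effort]
```

For example, `ArcFeatures(40.0, 120.0, 2.0, 8.0, 1.0, 1.0).to_vector()` yields a coupling ratio of 0.25 in the fourth position.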

Accuracy of Incremental Transition Time Prediction
Worst-case absolute error of 7.0ps (8.8%). Range of errors is 11.3ps. Average absolute error of 0.7ps (0.6%).
[Plot: predicted incremental transition time (ps) versus actual incremental transition time (ps), with the ideal-correlation line and the 7.0ps worst-case error marked.]
Here we show the predicted incremental transition time versus the actual incremental transition time. The black line indicates the ideal correlation. The red circles are our experimental data points. The correlation is close to ideal. The worst-case error is 7.0ps, which is 8.8 percent of the real transition time. The range of errors is 11.3ps and the average absolute error is 0.7ps.
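The three accuracy figures quoted for each model can be computed from paired predicted and actual values as in the sketch below. It assumes that "range of errors" means the largest signed error minus the smallest signed error; the talk does not define it, and the function name `error_stats` is hypothetical.

```python
from typing import Dict, Sequence


def error_stats(predicted: Sequence[float], actual: Sequence[float]) -> Dict[str, float]:
    """Worst-case absolute error, range of signed errors, and average absolute error."""
    errors = [p - a for p, a in zip(predicted, actual)]  # signed errors
    abs_errors = [abs(e) for e in errors]
    return {
        "worst_abs": max(abs_errors),
        "range": max(errors) - min(errors),  # assumed definition of "range of errors"
        "avg_abs": sum(abs_errors) / len(abs_errors),
    }
```

Under this definition a worst-case error of 7.0ps and a range of 11.3ps would be consistent with signed errors spanning both over- and under-prediction.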

Incremental Delay (Due to SI) Model
Incremental delay (due to SI) = delay considering SI - delay w/o SI
$\Delta D_{si} = f(D_{si'}, \Delta T_{si}, R_w, C_c, r_{C_c,C_{tot}}, r_{S,N_{stg}}, clkp, \Delta arr_{min,(r,f)}, \Delta arr_{max,(r,f)}, A_r, LE)$
We use 11 modeling parameters:
$\Delta D_{si}$: incremental delay of an arc due to SI
$D_{si'}$: incremental delay of an arc in non-SI mode
$\Delta T_{si}$: incremental transition time of an arc due to SI (predicted)
$R_w$: resistance of an arc
$C_c$: coupling capacitance of an arc
$r_{C_c,C_{tot}}$: ratio of coupling to total capacitance
$r_{S,N_{stg}}$: ratio of arc's stage to total # stages
$clkp$: clock period
$\Delta arr_{min,(r,f)}$: delta of min (rise, fall) arrival time between aggressor and victim
$\Delta arr_{max,(r,f)}$: delta of max (rise, fall) arrival time between aggressor and victim
$A_r$: toggle rate of net
$LE$: logical effort of the driver of the net
To model incremental delay we use eleven parameters. These include the incremental delay of an arc in non-SI mode, which can be acquired from the non-SI report, and the incremental transition time of an arc due to SI, which comes from the previous model, as well as other electrical, logic structure, and layout parameters and constraints.

Accuracy of Incremental Delay Prediction
Worst-case absolute error of 5.2ps (15.7%). Range of errors is 9.8ps. Average absolute error of 1.2ps (1.1%).
[Plot: predicted SI incr delay (ps) versus actual SI incr delay (ps), with the ideal-correlation line and the 5.2ps worst-case error marked.]
Here we show the predicted SI incremental delay versus the actual SI incremental delay. The worst-case absolute error is 5.2ps and the range of errors is 9.8ps. The average absolute error is 1.2ps.

SI-Aware Path Delay Model
$\Delta P_{si} = f(P_{si'}, \sum_{i=1}^{N_{stg}} \Delta D_{si}, N_{stg})$
We use three modeling parameters:
$\Delta P_{si}$: difference in path delays in SI and non-SI modes
$P_{si'}$: non-SI path delay across all timing arcs
$\sum_{i=1}^{N_{stg}} \Delta D_{si}$: sum of incremental delay due to SI (predicted) across all stages in a path
$N_{stg}$: number of stages in a timing path
For path delay we use three parameters: the non-SI path delay, the sum of incremental delays predicted by the previous model, and the number of stages in a timing path.
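The chaining of models can be illustrated with a small helper that builds the three path-level inputs from the per-stage predictions of the incremental delay model. The function name and the picosecond units are assumptions for the example.

```python
from typing import List, Sequence


def path_delay_features(non_si_path_delay_ps: float,
                        predicted_incr_delays_ps: Sequence[float]) -> List[float]:
    """Build the three inputs of the SI-aware path delay model.

    predicted_incr_delays_ps holds one predicted SI incremental delay per stage,
    so its length is the number of stages in the path.
    """
    n_stages = len(predicted_incr_delays_ps)
    return [non_si_path_delay_ps, sum(predicted_incr_delays_ps), n_stages]
```

For a three-stage path with a non-SI delay of 500ps and predicted incremental delays of 1.5ps, 2.5ps and 0ps, the inputs would be 500.0, 4.0 and 3.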

Accuracy of Path Delay Prediction
Worst-case absolute error of 8.2ps (6.9%). Average absolute error of 1.7ps (1.4%).
[Plot: predicted path delay (ps) versus actual path delay (ps), with the ideal-correlation line and the 8.2ps worst-case error marked.]
We also show the path delay prediction in Experiment 1. The worst-case absolute error is 8.2ps and the average absolute error is 1.7ps.

Let's see the design of experiments and robustness results.

Testcases
We use real open-source designs and artificial testcases
Technology: 28nm foundry FDSOI. Total data points: 188K.
Fields: testcase type, testcase name, source, signoff clock period (ns), #instances at post-synthesis
CPU, OST2 (1-core), Oracle (formerly Sun), 2.2, 350K
GPU, THEIA, OpenCores, 2.0, 125K
Modem, Viterbi, 1.0, 97K
Encoder, JPEG, 0.8, 62K
Crypto, AES, 13K
Stack, FIFO, Designware, 0.75, 6.5K
Artificial, ART, UCSD, >= 100
We use real open-source designs of various classes, and artificial testcases, from different sources and signed off at different clock periods. For all our implementations we use a 28nm foundry FDSOI library. The sources of the testcases are listed here, and in total 188K data points are used.

Artificial Testcases
Clock periods: tight (200ps less than signoff period) and loose (200ps more than signoff period)
#Stages in artificial testcase: {15, 20, 25, 30}
Miller coupling factor (MCF): {2, 1, 0}×
RC scaling factors: {0.5, 1.0, 2.0}×
Driver sizes in artificial testcase: {X6, X16, X24, X32}
Here I show our artificial testcases. Each consists of flip-flops, inverters and the wires between the buffers. The SI delay is due to the coupling cap, while the non-SI delay is mainly due to the ground cap. We test the artificial testcases at different clock periods, numbers of stages, Miller coupling factors, RC scaling factors and driver sizes.
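The parameter grid above can be enumerated with a few lines of Python. This sketch assumes a full-factorial sweep, which the talk does not confirm; the function name and dictionary keys are hypothetical.

```python
from itertools import product
from typing import Dict, List

STAGES = [15, 20, 25, 30]             # number of stages
MCF = [2, 1, 0]                       # Miller coupling factors
RC_SCALE = [0.5, 1.0, 2.0]            # RC scaling factors
DRIVERS = ["X6", "X16", "X24", "X32"]  # driver sizes
CLOCK_OFFSET_PS = [-200, 200]         # tight and loose, relative to signoff period


def artificial_configs() -> List[Dict[str, object]]:
    """Enumerate every combination of the swept parameters (full factorial, assumed)."""
    keys = ("stages", "mcf", "rc_scale", "driver", "clock_offset_ps")
    grid = product(STAGES, MCF, RC_SCALE, DRIVERS, CLOCK_OFFSET_PS)
    return [dict(zip(keys, combo)) for combo in grid]
```

A full-factorial sweep of these values gives 4 x 3 x 3 x 4 x 2 = 288 configurations; a real experiment may well use only a subset.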

STA Tool Flows
Read databases of timing libraries
Read and link design (post-P&R netlist)
Read constraints (.sdc) and parasitics (.spef)
In non-SI mode, use MCF to add coupling cap to ground cap
In SI mode, set flags to not reselect critical path for SI analysis, select clock nets and delay analysis mode as edge-aligned
Perform path-based timing analysis of top-1K paths
Obtain detailed timing reports
Here we show the STA tool flows used to generate the data. We first read the databases of timing libraries, read and link the design, and then read the SDC and SPEF files. For non-SI mode, we simply lump the coupling cap in the SPEF to ground cap using the MCF. For SI mode, we set flags to prevent reselection of the critical path so that the same path is analyzed, and we select clock nets and set the delay analysis mode to edge-aligned. Then we perform path-based timing analysis of the top-1K paths and obtain detailed timing reports that contain all the modeling information we need.
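The non-SI lumping step amounts to scaling each coupling capacitance by the Miller coupling factor and adding it to the ground capacitance. A minimal sketch, with a hypothetical function name and femtofarad units assumed for illustration:

```python
from typing import Sequence


def lump_coupling(c_ground_ff: float, c_couplings_ff: Sequence[float], mcf: float) -> float:
    """Return the effective ground capacitance of a net in non-SI mode.

    Each coupling capacitance to a neighbor is multiplied by the Miller
    coupling factor (MCF) and added to the net's ground capacitance.
    """
    if mcf < 0:
        raise ValueError("Miller coupling factor must be non-negative")
    return c_ground_ff + mcf * sum(c_couplings_ff)
```

With an MCF of 0 the coupling is ignored, with 1 it is treated as plain ground capacitance, and with 2 it is counted twice, which corresponds to the three MCF settings used for the artificial testcases.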

Robustness of Models
New implementation of JPEG has different clock period, #stages, utilization. Worst-case absolute error of 7.9ps (12.3%). Average absolute error of 1.6ps (2.6%).
[Plot: predicted SI incr delay (ps) versus actual SI incr delay (ps), with the ideal-correlation line and the 7.9ps worst-case error marked.]
Here we show the robustness results of our models. We implement JPEG with a different clock period and utilization. This changes the number of stages in the critical path, Rw, Cc, Ctot, driver sizes and other electrical, layout and logic structure parameters, and we want to see how our models perform on this implementation. The x-axis is the actual SI incr delay and the y-axis is the predicted SI incr delay. The worst-case absolute error is 7.9ps, which is 12.3% of the actual SI incr delay, while the average absolute error is 1.6ps. This indicates that our models can predict other implementations, which suggests that they are not overfitted.

Let's come to the conclusions of this work.

Conclusions
Calibration of non-SI to SI enables cost and runtime savings for SoC design teams
We analyze electrical, logic structure and layout parameters that cause timing divergence between non-SI and SI modes
We develop machine learning-based models to accurately calibrate non-SI to SI timing
Our models have a worst-case error of 8.2ps in SI-aware path delay in a 28nm foundry FDSOI technology
Ongoing: correlate graph-based and path-based timing analysis; integrate our models with an academic timer
Calibration of non-SI to SI enables cost and runtime savings for SoC design. That is the motivation of our work. Because of this, we analyze electrical, logic structure, and layout parameters that differ between non-SI and SI modes, in order to model SI timing from non-SI timing reports. With these parameters, we develop machine learning-based models to accurately calibrate non-SI to SI timing. We have achieved a worst-case error of 8.2ps in a 28nm foundry FDSOI technology. Our ongoing work uses the same machine learning methodology to correlate graph-based timing analysis, which is fast but inaccurate, with path-based timing analysis, which is slow but accurate. We are also trying to integrate our models into an academic timer such as the UCSD timer. We thank Dr. Tuck-Boon Chan of Qualcomm Inc. and Ms. Nancy MacDonald of Broadcom Corp. for their valuable suggestions during this work. That is the end of my presentation. Thank you for your attention!

BACKUP

Han2014 Modeling Parameters
Transition time; wire delay
$d_w$: wire delay
$d_{tr,c,o}$: cell output transition time
$R_w$: wire resistance
$C_w$, $C_{eff}$, $C_{coup}$: wire, effective, and coupling capacitance

Why Bother About SI vs. Non-SI Calibration?
Calibration: conversion of a non-SI timing report to an SI timing report
Many tools perform static timing analysis (STA) in both signal integrity (SI) mode and non-SI mode
Cost differences: an STA tool with SI mode is expensive; an STA tool without SI mode is cheap
Runtime differences: for a design with 110K instances, the exhaustive path-based analysis runtime of SI is 3× the runtime of non-SI
Question: should a design team buy SI licenses or cheaper non-SI licenses? Can we calibrate non-SI to SI to reduce cost and runtime?
