UC San Diego / VLSI CAD Laboratory: High-Dimensional Metamodeling for Prediction of Clock Tree Synthesis Outcomes. Andrew B. Kahng, Bill Lin and Siddhartha Nath.

Presentation transcript:

-1- UC San Diego / VLSI CAD Laboratory
High-Dimensional Metamodeling for Prediction of Clock Tree Synthesis Outcomes
Andrew B. Kahng, Bill Lin and Siddhartha Nath
VLSI CAD LABORATORY, UC San Diego

-2- Outline
Challenges
Testcase generation
Design of experiments
New estimation technique
Prediction methodologies
Validation of our methodologies
Conclusions

-3- Challenge: High Dimensionality
Why is CTS prediction hard?
[Figure: a CTS instance is defined by testcases, layout contexts, and tools & knobs; the outcomes (power, skew, delay, wirelength) are unknown]
CTS prediction is difficult due to inherent high dimensionality

-4- Challenge: Sensitivity
Delay varies by up to 43% with clock entry point locations
Delay varies by up to 45% with core aspect ratio
[Figure: clock entry point locations]
CTS outcomes are sensitive to instance parameters

-5- Challenge: Multicollinearity
[Figure: estimation errors for increasing D]
Estimation errors increase at high dimensions

-6- Challenge: Realistic Instances
[Figure: ISPD 2010 CTS Benchmark 01, showing sinks (x, y), a rectangular core, and placement blockage]
Simple testcases and layout contexts do not reflect real-world CTS instances

-7- Contributions
Generate realistic testcases with real-world CTS structures
Study and identify appropriate modeling parameters
Propose hierarchical hybrid surrogate modeling (HHSM), a divide-and-conquer approach to overcome parameter collinearity issues
Develop prediction methodologies for practical use models
– Which tool should be used?
– How should the tool be driven?
– How wrong can the model guidance be?
Validate methodologies on a new CTS instance

-8- Related Works
Testcases
– Tsay90: CTS testcases r1 - r5 with sink (x, y) coordinates
– ISPD 2010: placement blockage; inverters/buffers in clock hierarchy
Prediction
– Kahng02: CUBIST to estimate clock skew, insertion delay
– Kahng13: MARS, RBF, KG, HSM to estimate several clock metrics; uniform placement of sinks, no combinational logic
Gaps in testcases and layout contexts

-10- Example of Our CTS Testcase
Real-world clock structures
– Clock-gating cells (CGCs)
– Clock dividers
– Glitch-free clock MUX
Multiple levels in the clock tree hierarchy (K_6 vs. K_2)
Generators, runscripts to be published
[Figure: clock tree from the clock root pin clk through CGCs, a glitch-free MUX (mux_en[0]), and dividers DIV-4, DIV-8 and DIV-24 to sink groups K_1 to K_6, gated by cg_en[0] to cg_en[6]]

-11- Example of Our CTS Instance Nonuniform sink placement

-13- Modeling Parameters
Microarchitectural
– M_sinks: number of sinks
Floorplan context
– M_core, M_AR: core area and aspect ratio
– M_CEP: clock entry point
– M_block: placement and routing blockage, as % of core area
Tool constraints
– M_skew, M_delay: max skew and insertion delay
– M_buftran, M_sinktran: max buffer and sink transition time
– M_FO: max fanout
– M_bufsize, M_wire: max buffer size and wire width
Nonuniformity measure
– M_DCT: nonuniformity in sink placement

-14- Modeling Flow
[Flow diagram: Testcase Verilog RTL → Synthesis (DC) → Gate-level netlist → Generate placed DEF (using floorplan parameters) → CTS instance (with CTS tool parameters) → CTS + CT route (ToolA, ToolB) → Extract CTS metrics → Metamodeling (with µArch and nonuniformity parameters) → Fitted models for metrics]

-15- Metamodeling Techniques
Accurate because they derive surrogate models from actual post-CTS data
Our techniques
– Hybrid Surrogate Modeling (HSM) [Kahng13]
– Multivariate Adaptive Regression Splines (MARS) [Friedman91]
– Radial Basis Function (RBF) [Buhmann03]
– Kriging (KG) [Matheron78]

-17- Multicollinearity
If parameters are linear combinations of each other
– Example: M_AR, M_buftran, M_sinktran, M_wire
– Matrix of parameters is ill-conditioned
– Large variance in regression coefficients
– Hard to determine relationship between parameters and output
– Large errors between actual and predicted outputs as D increases
Previous works [Kahng13] report large estimation errors (≥ 30%) when D ≥ 10
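The effect described on this slide can be reproduced on synthetic data. The sketch below is illustrative only: the three parameters, the noise levels, and the helper `coef_spread` are assumptions made for the example, not data or code from the presentation. It shows that adding a parameter that is nearly a linear combination of the others inflates both the condition number of the parameter matrix and the variance of the fitted regression coefficients.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200

# Two independent parameters, plus a third that is almost a linear
# combination of them (synthetic stand-ins, not CTS parameters).
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
x3 = 2.0 * x1 - x2 + 1e-6 * rng.normal(size=n)

X_indep = np.column_stack([x1, x2])
X_collinear = np.column_stack([x1, x2, x3])

# Condition number of the parameter matrix: large means ill-conditioned.
cond_indep = np.linalg.cond(X_indep)
cond_collinear = np.linalg.cond(X_collinear)

def coef_spread(X, trials=50, noise=0.1):
    """Largest std. dev. of least-squares coefficients over noisy refits of y = x1 + x2."""
    coefs = []
    for _ in range(trials):
        y = x1 + x2 + noise * rng.normal(size=n)
        beta, *_ = np.linalg.lstsq(X, y, rcond=None)
        coefs.append(beta)
    return np.std(np.array(coefs), axis=0).max()

spread_indep = coef_spread(X_indep)
spread_collinear = coef_spread(X_collinear)

print(f"condition number: {cond_indep:.2f} vs {cond_collinear:.2e}")
print(f"coefficient spread: {spread_indep:.4f} vs {spread_collinear:.2e}")
```

The predictions of the collinear fit can still be reasonable on the training data; it is the individual coefficients that become unreliable, which is why the slide notes that the relationship between parameters and output is hard to determine.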

-18- Our Solution: HHSM
Hierarchical Hybrid Surrogate Modeling
Divide the parameters (D) into two sets
– One set of k parameters has low collinearity
– Other set of D – k parameters may have high collinearity
– Derive HSM surrogate models for each set
– Combine using weights from least-squares regression, where w_1 and w_2 are the weights
– w_1: k parameters with low collinearity
– w_2: D – k parameters with high collinearity
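The combining equation on this slide was an image and is not in the transcript. One form consistent with the weight definitions would be a weighted sum, ŷ = w_1·ŷ_low + w_2·ŷ_high, and the sketch below assumes exactly that. It is not the authors' implementation: the sub-models here are plain linear least-squares fits standing in for HSM, and `fit_linear`, `fit_hhsm`, `make_data` and the synthetic data are all hypothetical names and values chosen for illustration.

```python
import numpy as np

def fit_linear(X, y):
    """Stand-in sub-model: least squares with an intercept (NOT the slide's HSM)."""
    A = np.column_stack([np.ones(len(X)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    return lambda Z: np.column_stack([np.ones(len(Z)), Z]) @ beta

def fit_hhsm(X, y, low_idx, high_idx):
    """Fit one sub-model per parameter set, then combine with least-squares weights."""
    model_low = fit_linear(X[:, low_idx], y)
    model_high = fit_linear(X[:, high_idx], y)
    # Columns of P are the two sub-model predictions on the training data.
    P = np.column_stack([model_low(X[:, low_idx]), model_high(X[:, high_idx])])
    w, *_ = np.linalg.lstsq(P, y, rcond=None)  # weights w_1, w_2

    def predict(Z):
        Pz = np.column_stack([model_low(Z[:, low_idx]), model_high(Z[:, high_idx])])
        return Pz @ w

    return predict, w

def make_data(n, rng):
    """Synthetic data: columns 0-1 are independent, columns 2-3 are nearly collinear."""
    x0, x1, x2 = rng.normal(size=(3, n))
    x3 = x2 + 1e-3 * rng.normal(size=n)
    X = np.column_stack([x0, x1, x2, x3])
    y = x0 + 2.0 * x1 + 3.0 * x2 + 0.1 * rng.normal(size=n)
    return X, y

rng = np.random.default_rng(0)
X_train, y_train = make_data(300, rng)
X_test, y_test = make_data(100, rng)

predict, w = fit_hhsm(X_train, y_train, low_idx=[0, 1], high_idx=[2, 3])
rmse = np.sqrt(np.mean((predict(X_test) - y_test) ** 2))
print("weights:", w, "test RMSE:", rmse)
```

How the parameters are split into the low- and high-collinearity sets is not described in the transcript; here the split is simply given as index lists.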

-19- HHSM Accuracy
[Table: estimation error versus D; surviving entries: ≤ 2% and ≤ 13%]

-21- Use Models For Prediction
Develop methodologies to answer three questions
– Q1: Which tool should be used?
– Q2: How should the tool be driven?
– Q3: How wrong can the model guidance be?

-22- Q1: Which Tool Should Be Used?
Methodology
– Determine the better tool using models
– Compare with actual post-CTS data
[Table: Incorrect Tool Prediction %, with columns D, Skew, Power, Delay, Wirelength]
Errors increase for 8 ≤ D ≤ 11
Errors saturate for D ≥ 12
Worst-case prediction error = 6.13%
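The Q1 methodology can be sketched as a comparison of two selections: the tool the models prefer and the tool the post-CTS data prefers. The function name, the "lower is better" convention, and every number below are assumptions made for illustration; none of them come from the presentation's table.

```python
import numpy as np

def incorrect_tool_prediction_pct(pred_a, pred_b, actual_a, actual_b):
    """Percent of instances where the models pick a different tool than post-CTS data.

    Assumes a lower metric value is better (e.g. skew or power).
    """
    pred_a, pred_b = np.asarray(pred_a, float), np.asarray(pred_b, float)
    actual_a, actual_b = np.asarray(actual_a, float), np.asarray(actual_b, float)
    model_picks_a = pred_a <= pred_b        # tool chosen from model predictions
    actual_picks_a = actual_a <= actual_b   # tool that is actually better
    return 100.0 * np.mean(model_picks_a != actual_picks_a)

# Hypothetical metric values for four CTS instances.
pred_a, pred_b = [10, 20, 30, 40], [12, 18, 35, 39]
actual_a, actual_b = [11, 21, 29, 38], [12, 19, 33, 41]

pct = incorrect_tool_prediction_pct(pred_a, pred_b, actual_a, actual_b)
print(pct)  # 25.0: the models pick the wrong tool on the fourth instance only
```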

-23- Q2: How Should The Tool Be Driven?
Methodology
– Determine the smallest and largest values of parameters that deliver desired outcome
[Table: Parameter subspaces for tools. Columns: Max Skew (ps), Max Delay (ns), Max Buffer Transition (ps), each for ToolA and ToolB; rows indexed by target Skew (ps). For a 5 ps target all entries are N; the surviving fragments of the other rows are ranges with an unbounded upper end, such as 1.5 - X and 300 - X]
N – infeasible
X – unbounded
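One way to read the Q2 methodology is as a sweep: evaluate the fitted model over candidate values of a tool constraint and report the smallest and largest values whose predicted outcome meets the target. The sketch below follows that reading. The function `feasible_range`, the rule that the upper end is reported as "X" when the largest swept value is still feasible, and all numbers are assumptions for illustration, not the authors' procedure or data.

```python
def feasible_range(param_values, predicted_outcomes, target):
    """Return (smallest, largest) parameter values whose predicted outcome <= target.

    Returns "N" if no swept value is feasible. The upper end is reported as "X"
    (unbounded) when the largest swept value is still feasible; this mapping is
    an assumption of this sketch.
    """
    feasible = [p for p, y in zip(param_values, predicted_outcomes) if y <= target]
    if not feasible:
        return "N"
    lo, hi = min(feasible), max(feasible)
    return (lo, "X" if hi == max(param_values) else hi)

# Hypothetical sweep of a max-skew constraint (ps) and model-predicted skew (ps).
max_skew_sweep = [10, 20, 30, 40, 50]
pred_skew_tool_a = [12, 22, 35, 48, 60]
pred_skew_tool_b = [8, 15, 22, 27, 29]

range_a = feasible_range(max_skew_sweep, pred_skew_tool_a, target=30)
range_b = feasible_range(max_skew_sweep, pred_skew_tool_b, target=30)
range_tight = feasible_range(max_skew_sweep, pred_skew_tool_a, target=5)
print(range_a, range_b, range_tight)  # (10, 20) (10, 'X') N
```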

-24- Q3: How Wrong Can The Guidance Be?
Methodology
– Compare model and actual outcomes of tools
– If model is wrong,
[Table: Wrong guidance % and suboptimality % for Power, by D, for ToolA and ToolB (column labels as extracted: SVM, SUB)]
Suboptimality ≤ 10%
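The transcript does not define suboptimality. A plausible definition, assumed here, is the relative gap between the actual outcome of the tool the model recommends and the actual outcome of the better tool. The function name and the numbers are hypothetical.

```python
import numpy as np

def suboptimality_pct(pred_a, pred_b, actual_a, actual_b):
    """Per-instance suboptimality (%) of following the model's tool choice.

    Assumes lower is better, and defines suboptimality as
    100 * (actual of chosen tool - actual of best tool) / actual of best tool.
    """
    pred_a, pred_b = np.asarray(pred_a, float), np.asarray(pred_b, float)
    actual_a, actual_b = np.asarray(actual_a, float), np.asarray(actual_b, float)
    chosen = np.where(pred_a <= pred_b, actual_a, actual_b)  # outcome of recommended tool
    best = np.minimum(actual_a, actual_b)                    # outcome of the better tool
    return 100.0 * (chosen - best) / best

# Hypothetical power values for four CTS instances.
pred_a, pred_b = [10, 20, 30, 40], [12, 18, 35, 39]
actual_a, actual_b = [11, 21, 29, 38], [12, 19, 33, 41]

sub = suboptimality_pct(pred_a, pred_b, actual_a, actual_b)
wrong_guidance_pct = 100.0 * np.mean(sub > 0)
print(wrong_guidance_pct, sub.max())
```

With these numbers the guidance is wrong on one of four instances, and following it costs about 7.9% (41 instead of 38) on that instance.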

-26- Validation on “New” CTS Instance
How well do our prediction methodologies generalize?
Goals
– Apply methodologies to a new CTS instance
– Obtain skew target ≤ 30 ps
Determine parameter values from subspace results of Q2
[Table: columns Max Skew (ps), Max Delay (ns), Max Buffer Transition (ps), CTS Tool, Post-CTS Skew (ps), Number of CTS runs; rows for ToolA and ToolB]
Generalizes with small overhead
Few CTS runs to deliver the desired outcome

-28- Conclusions
Study high-D CTS prediction with appropriate modeling parameters
Generate testcases with real-world CTS structures
Propose HHSM to limit error to ≤ 13% even with multicollinearity
Develop methodologies for practical use models
Ongoing work
– Learning techniques to cure high-D multicollinearity
– Methodologies to characterize EDA tools
– Apply methodologies to reduce time and cost for IC implementation

-29- Acknowledgments Work supported by NSF, MARCO/DARPA, SRC and Qualcomm Inc.

-30- Thank You!

-31- Backup

-32- Brief Background on Metamodeling
General form of estimation
[Equation; its labeled terms are the predicted response, the deterministic response, the random noise function, and the regression coefficients]

-33- Regression Function: MARS
[Equation], where
– I_i: number of interactions in the i-th basis function
– b_ji: ±1
– x_v: v-th parameter
– t_ji: knot location
Knot = value of parameter where line segment changes slope
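The equation on this slide was an image and is not in the transcript. The standard MARS form from [Friedman91] is consistent with the symbol definitions listed here and is likely what was shown. In the reconstruction below, the intercept c_0, the coefficients c_i, the number of basis functions M, and the index map v(j,i) are notation assumed for this sketch and are not confirmed by the slide.

```latex
\hat{y}(\mathbf{x}) \;=\; c_0 \;+\; \sum_{i=1}^{M} c_i \prod_{j=1}^{I_i}
  \max\!\bigl(0,\; b_{ji}\,\bigl(x_{v(j,i)} - t_{ji}\bigr)\bigr)
```

Each factor is a hinge function: it is zero on one side of the knot t_ji and linear on the other, with b_ji = ±1 choosing the side. This matches the slide's description of a knot as the parameter value where the line segment changes slope.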

-34- Regression Function: RBF
[Equation], where
– a_j: coefficients of the kernel function
– K(.): kernel function
– µ_j: centroid
– r_j: scaling factors
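The RBF equation itself is not in the transcript. A common form that uses the listed symbols is ŷ(x) = Σ_j a_j · K(‖x − µ_j‖ / r_j). The sketch below assumes that form with a Gaussian kernel, one centroid per training point, and a single shared scaling factor r instead of per-centroid r_j. These are simplifying assumptions for illustration, and `rbf_fit` and `rbf_predict` are hypothetical helper names.

```python
import numpy as np

def rbf_fit(centers, y, r):
    """Solve for coefficients a_j so that the model interpolates y at the centroids."""
    d = np.linalg.norm(centers[:, None, :] - centers[None, :, :], axis=2)
    K = np.exp(-(d / r) ** 2)  # Gaussian kernel matrix K(||mu_i - mu_j|| / r)
    return np.linalg.solve(K, y)

def rbf_predict(X, centers, a, r):
    """Evaluate y_hat(x) = sum_j a_j * K(||x - mu_j|| / r) for each row of X."""
    d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
    return np.exp(-(d / r) ** 2) @ a

# One-dimensional toy example: six centroids on [0, 1], target sin(2*pi*x).
centers = np.linspace(0.0, 1.0, 6)[:, None]
y = np.sin(2.0 * np.pi * centers[:, 0])
r = 0.3

a = rbf_fit(centers, y, r)
pred_at_centers = rbf_predict(centers, centers, a, r)
pred_between = rbf_predict(np.array([[0.25]]), centers, a, r)
print(pred_between)
```

The scaling factor controls how far each centroid's influence reaches. Very large values make the kernel matrix ill-conditioned, so r is kept modest relative to the centroid spacing here.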

-35- Regression Function: KG

-36- Hybrid Surrogate Modeling (HSM)
Variant of Weighted Surrogate Modeling, but uses least-squares regression to determine weights
[Equation], where w_1, w_2, w_3 are the weights of the predicted responses of the surrogate models
– w_1: MARS
– w_2: RBF
– w_3: KG
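The HSM equation is missing from the transcript. Given the weight definitions, a weighted sum of the three surrogate predictions is the natural reading. The notation below (the subscripted ŷ terms, the training responses y_n and the sample count N) is assumed for this sketch rather than taken from the slide.

```latex
\hat{y}_{\mathrm{HSM}}(\mathbf{x}) \;=\;
  w_1\,\hat{y}_{\mathrm{MARS}}(\mathbf{x}) \;+\;
  w_2\,\hat{y}_{\mathrm{RBF}}(\mathbf{x}) \;+\;
  w_3\,\hat{y}_{\mathrm{KG}}(\mathbf{x}),
\qquad
(w_1, w_2, w_3) \;=\; \arg\min_{w}\; \sum_{n=1}^{N}
  \Bigl( y_n - \hat{y}_{\mathrm{HSM}}(\mathbf{x}_n) \Bigr)^{2}
```

Under this reading, the weights are obtained by an ordinary least-squares fit of the actual responses against the three model predictions.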