ISPD 2007: Repeater Insertion for Concurrent Setup and Hold Time Violations with Power-Delay Trade-Off. Salim Chowdhury (Sun Microsystems), John Lillis (University of Illinois at Chicago).

Presentation transcript:

ISPD 2007: Repeater Insertion for Concurrent Setup and Hold Time Violations with Power-Delay Trade-Off
Salim Chowdhury (Sun Microsystems), John Lillis (University of Illinois at Chicago)

Outline
- Motivation
- Modelling late & early modes concurrently
- Identifying sub-optimal solutions in a list
- The merging problem
- Power-Delay Trade-Off
- Interaction between late & early modes (examples)
- Conclusions, limitations, and future directions
- Acknowledgements

Motivation
Traditional flow:
- Max Mode Optimization
- Logic Drops
- More Max Mode Optimization
- Close to Tape Out: Min Mode Analysis and Fixes
Challenges:
- Resizing bits in banks?
- Repeater reposition?
- Room for more repeaters?
- Don't aggravate critical paths!

Basic Algorithm in the Late Mode
- Try repeater sizes to generate solutions: (c, q) pairs
- Identify and prune sub-optimal solutions; at fanout points, avoid sub-optimal combinations
- Select the solution with the highest q at the driver
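The late-mode pruning step can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes each candidate is a (c, q) pair of load capacitance and required arrival time, and that a pair is sub-optimal when another pair has no larger c and no smaller q. The function names and the driver model q - R*c in `select_at_driver` are assumptions made for the example.

```python
from typing import List, Tuple

Solution = Tuple[float, float]  # (c, q): load capacitance, required arrival time


def prune_late_mode(solutions: List[Solution]) -> List[Solution]:
    """Keep only the (c, q) pairs that no other pair dominates.

    A pair is treated as sub-optimal when some other pair has c no larger
    and q no smaller (assumed dominance rule for this sketch).
    """
    # Sort by ascending c; for equal c, visit the larger q first.
    ordered = sorted(solutions, key=lambda s: (s[0], -s[1]))
    kept: List[Solution] = []
    best_q = float("-inf")
    for c, q in ordered:
        # Everything already kept has c <= this c, so this pair survives
        # only if it offers a strictly larger q.
        if q > best_q:
            kept.append((c, q))
            best_q = q
    return kept


def select_at_driver(solutions: List[Solution], driver_resistance: float) -> Solution:
    """Pick the pair with the highest q at the driver.

    Assumes a simple linear driver model: q_driver = q - R_driver * c.
    """
    return max(solutions, key=lambda s: s[1] - driver_resistance * s[0])


candidates = [(10, 100), (12, 95), (15, 120), (15, 110), (20, 120)]
survivors = prune_late_mode(candidates)
```

With these illustrative numbers, `survivors` holds two pairs, and which of them `select_at_driver` returns depends on the assumed driver resistance: a weak driver favours the low-capacitance pair.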

Concurrent Min-Max Model
- Solution: (c_w, q_w, c_b, q_b)
- Objective function: late-mode q_w
- Constraint: early-mode arrival time at the driver, q_bd
- q_b close to q_bd is better; higher c helps to achieve q_b closer to q_bd (Note: initially q_b ≥ q_bd)
- Late mode: s_1 => s_2 if (c_w2 > c_w1) and (q_w2 < q_w1)
- Early mode: (c_b2, q_b2) => (c_b1, q_b1) if (c_b1 < c_b2) and (q_b1 ≥ q_b2)
- s_2 is sub-optimal compared to s_1 (s_1 => s_2) if s_2 is sub-optimal in late mode and s_2 is sub-optimal in early mode

Pruning a List of Solutions
Four rules, with c_w2 ≥ c_w1 in all cases:
- Case I: q_b1 > q_bd and q_b2 ≤ q_bd
  s_2 cannot be pruned
  Prune s_1 if (c_w1 = c_w2) and (q_w1 ≤ q_w2)
- Case II: q_b1 > q_bd and q_b2 > q_bd
  Prune s_2 if (q_w2 ≤ q_w1), (c_b2 ≤ c_b1) and (q_b2 ≥ q_b1)
  Prune s_1 if (q_w1 ≤ q_w2), (c_w1 = c_w2), (c_b1 ≤ c_b2), (q_b1 ≥ q_b2)
- Case III: q_b1 ≤ q_bd and q_b2 ≤ q_bd
  Prune s_2 if (q_w2 ≤ q_w1)
  Prune s_1 if (c_w1 = c_w2) and (q_w1 ≤ q_w2)
- Case IV: q_b1 ≤ q_bd and q_b2 > q_bd
  Prune s_2 if (q_w2 ≤ q_w1)
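The four cases can be written as one pairwise test. The sketch below rests on two assumptions: that the prune rules pair with Cases I to IV in the order they are listed on the slide, and that the comparison symbols lost in the transcript are ≤ and ≥ in the directions implied by the solution 9 / solution 14 example later in the deck. The names `Sol` and `prune_pair` and the value of `q_bd` are hypothetical.

```python
from typing import NamedTuple, Optional


class Sol(NamedTuple):
    cw: float  # late-mode capacitance
    qw: float  # late-mode required time
    cb: float  # early-mode capacitance
    qb: float  # early-mode time, compared against q_bd at the driver


def prune_pair(s1: Sol, s2: Sol, q_bd: float) -> Optional[str]:
    """Return "s1" or "s2" if that solution can be pruned, else None.

    Requires s1.cw <= s2.cw, as stated for all four cases.
    """
    if s1.cw > s2.cw:
        raise ValueError("expected s1.cw <= s2.cw")
    s1_above = s1.qb > q_bd
    s2_above = s2.qb > q_bd
    same_cw = s1.cw == s2.cw

    if s1_above and not s2_above:  # Case I: s2 cannot be pruned
        return "s1" if same_cw and s1.qw <= s2.qw else None

    if s1_above and s2_above:  # Case II: compare all four fields
        if s2.qw <= s1.qw and s2.cb <= s1.cb and s2.qb >= s1.qb:
            return "s2"
        if s1.qw <= s2.qw and same_cw and s1.cb <= s2.cb and s1.qb >= s2.qb:
            return "s1"
        return None

    if not s1_above and not s2_above:  # Case III: late-mode comparison only
        if s2.qw <= s1.qw:
            return "s2"
        return "s1" if same_cw and s1.qw <= s2.qw else None

    # Case IV: s1 at or below q_bd, s2 above it
    return "s2" if s2.qw <= s1.qw else None


# Values for solutions 9 and 14 come from the deck; q_bd = 50 is hypothetical.
s9 = Sol(cw=19, qw=109, cb=10, qb=56)
s14 = Sol(cw=23, qw=109, cb=9, qb=57)
result = prune_pair(s9, s14, q_bd=50)
```

Under these assumptions the call falls into Case II and reports that solution 14 can be pruned, which matches the "Delete" action the deck records for solution 14.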

Identifying sub-optimal solutions
[Table columns: Solution | c_w | q_w | c_b | q_b | Dominating Sol]

Complexity Reduction in Pruning
s   G              Pair          SO   Action
1   Ø              None          -    Insert
2   {1}            (1,2)         -    Insert
3   {1,2}          (2,3)         3    Delete
4   {1,2}          None          -    Insert
5   {1,2,4}        (4,5)         -    Insert
6   {1,2,5,4}      (5,6) (4,6)   -    Insert
7   {1,2,6,5,4}    (6,7)         7    Delete
8   {1,2,6,5,4}    (6,8) (5,8)   8    Delete

Further Reduction in Comparison Set
Set G can be stored in a 2-way binary tree:
- 1st branch: q_w
- 2nd branch: q_b
s    G                            Pair     SO   Action
14   {1,2,6,5,4,11,9,12,10,13}    (9,14)   14   Delete
How to quickly identify the dominating solution 9 in group G?

Example Binary Tree
Solution 14: c_w = 23, q_w = 109, c_b = 9, q_b = 57
Dominating solution 9: c_w = 19, q_w = 109, c_b = 10, q_b = 56
G = {1,2,6,5,4,11,9,12,10,13}
  branch on q_w: {1,2,6,5,4,11} | {9,12,10,13}
    branch on q_b: {9} | {12,10,13}
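One way to read the two-level tree is as a partition of G first on q_w and then on q_b, so that a new solution is compared only against the leaves that could contain a dominator. The sketch below illustrates that reading and is not the authors' data structure. Only the values for solutions 9 and 14 come from the deck; every other solution value, both split thresholds, and all function names are hypothetical.

```python
from typing import Dict, List, NamedTuple, Tuple


class Sol(NamedTuple):
    cw: float
    qw: float
    cb: float
    qb: float


Index = Dict[Tuple[bool, bool], List[int]]


def build_index(group: Dict[int, Sol], qw_split: float, qb_split: float) -> Index:
    """Partition solution ids on q_w first, then on q_b."""
    index: Index = {(a, b): [] for a in (False, True) for b in (False, True)}
    for sid, s in group.items():
        # Leaf key: (is q_w on the high side?, is q_b on the low side?)
        index[(s.qw >= qw_split, s.qb <= qb_split)].append(sid)
    return index


def candidate_dominators(index: Index, s: Sol, qw_split: float, qb_split: float) -> List[int]:
    """Ids that could dominate s, i.e. that may have q_w >= s.qw and q_b <= s.qb."""
    # If s is already on the high-q_w side, low-q_w leaves cannot dominate it.
    qw_sides = (True,) if s.qw >= qw_split else (False, True)
    # If s is already on the low-q_b side, high-q_b leaves cannot dominate it.
    qb_sides = (True,) if s.qb <= qb_split else (False, True)
    return [sid for a in qw_sides for b in qb_sides for sid in index[(a, b)]]


group = {
    1: Sol(8, 90, 4, 66),     # hypothetical
    2: Sol(10, 95, 5, 64),    # hypothetical
    4: Sol(14, 101, 7, 61),   # hypothetical
    9: Sol(19, 109, 10, 56),  # from the deck
    10: Sol(21, 112, 8, 60),  # hypothetical
    12: Sol(24, 115, 9, 59),  # hypothetical
    13: Sol(26, 118, 9, 58),  # hypothetical
}
s14 = Sol(23, 109, 9, 57)     # from the deck
index = build_index(group, qw_split=109, qb_split=57)
to_check = candidate_dominators(index, s14, qw_split=109, qb_split=57)
```

With the hypothetical thresholds chosen here, solution 14 only needs to be compared against the single leaf containing solution 9 instead of all of G.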

Merging Multiple Branches

Late Mode Merging LS = {(1,1)->(1X{2:3}), (2,2)->(2,3)}

Early Mode Merging ES = {(2,2)->(2X{1,3}), (1,2)->(1X{1,3})}

Identifying Non-Suboptimal Combinations
LS = {(1,1)->(1X{2:3}), (2,2)->(2,3)}
ES = {(2,2)->(2X{1,3}), (1,2)->(1X{1,3})}
- Sub-optimal combinations: (2,3)
- Non-suboptimal combinations: (1,2), (1,3), (2,1), and (1,1)
- Looking for a more efficient technique
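The example can be checked with a few lines of code. The sketch assumes a particular reading of the notation: (i,j) pairs solution i of the first branch with solution j of the second, an entry such as (1,1)->(1X{2:3}) means combination (1,1) dominates (1,2) and (1,3) in that mode, and a combination is sub-optimal only when the same combination dominates it in both the late set LS and the early set ES. The dictionary layout and function name are hypothetical.

```python
from itertools import product
from typing import Dict, Set, Tuple

Combo = Tuple[int, int]

# Dominating combination -> combinations it dominates, per mode.
LS: Dict[Combo, Set[Combo]] = {
    (1, 1): {(1, 2), (1, 3)},
    (2, 2): {(2, 3)},
}
ES: Dict[Combo, Set[Combo]] = {
    (2, 2): {(2, 1), (2, 3)},
    (1, 2): {(1, 1), (1, 3)},
}


def suboptimal_combinations(ls: Dict[Combo, Set[Combo]],
                            es: Dict[Combo, Set[Combo]]) -> Set[Combo]:
    """Combinations dominated by one and the same combination in both modes."""
    return {
        combo
        for dominator, dominated in ls.items()
        for combo in dominated
        if combo in es.get(dominator, set())
    }


pruned = suboptimal_combinations(LS, ES)
# Two solutions on the first branch, three on the second (assumed sizes).
all_combos = set(product((1, 2), (1, 2, 3)))
survivors = all_combos - pruned
```

Under this reading only (2,3) is pruned: (1,3) is dominated in both modes, but by different combinations, so it survives. The dominating combination (2,2) also remains in `survivors`.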

Delay-Power Trade-Off How to avoid the flat region?

Techniques for Trade-Off
John Lillis (ICCAD-95): prune a solution if inferior in both p and q
Algorithm highlights:
- Put solutions into power bins
- Intra-bin pruning: linear
- Inter-bin pruning: more than linear
- Merging: all bin-pairs
- All trade-offs are explicitly computed and retained
- Final selections at the driver
Issues:
- Large # of bins (esp. if slew dependent)
- Number of bin-pairs can be O(n^2)
- Large solution space => run time

Implicit Power-Delay Trade-Off
Desired trade-off is captured in a parameter: λ = Δdelay/Δpower
- For example, if a 0.01 ps delay reduction for a power dissipation of 1 µW is acceptable, then λ = 0.01 ps/µW
- q_a = q - λ*P (P = power/area); λ*P is a "penalty"
Features:
- Trade-off is implicit
- Controlled solution space and run time
- Pruning a list: if (c_2 ≥ c_1) and (q_2a < q_1a), then (c_1, q_1a) => (c_2, q_2a)
- Facilitates min-power solution (test nets)
- Merging: much more detailed; could not be included in this paper
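The penalty-adjusted pruning can be sketched in a few lines. This is an illustration under stated assumptions rather than the paper's algorithm: the trade-off parameter is called `lam` here (its symbol is not legible in the transcript), each candidate carries its own power P in µW, and candidates with equal adjusted q are also dropped, which is slightly stricter than the rule on the slide. All numbers are made up.

```python
from typing import List, Tuple

Candidate = Tuple[float, float, float]  # (c in fF, q in ps, P in uW)


def prune_with_penalty(candidates: List[Candidate], lam: float) -> List[Candidate]:
    """Prune on (c, q_a) where q_a = q - lam * P.

    Returns the surviving candidates as (c, q_a, P) tuples.
    """
    adjusted = [(c, q - lam * p, p) for c, q, p in candidates]
    # Ascending c; for equal c, visit the larger adjusted q first.
    adjusted.sort(key=lambda s: (s[0], -s[1]))
    kept: List[Candidate] = []
    best_qa = float("-inf")
    for c, qa, p in adjusted:
        # A candidate with larger c must buy a strictly better adjusted q.
        if qa > best_qa:
            kept.append((c, qa, p))
            best_qa = qa
    return kept


candidates = [(10, 100.0, 0.0), (12, 101.0, 200.0), (15, 105.0, 300.0)]
no_penalty = prune_with_penalty(candidates, lam=0.0)
with_penalty = prune_with_penalty(candidates, lam=0.01)
```

With no penalty all three candidates survive, because each extra repeater buys some delay. At 0.01 ps/µW the middle candidate gains 1 ps for 200 µW, so its adjusted q falls below the first candidate's and it is pruned. The trade-off therefore never has to be stored explicitly per power bin.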

Too Little to Too Much Price
[Table columns: Penalty | #net buffered | #repeaters]

Interaction Between Late & Early Modes
Case   q_b1 (ps)   q_b2 (ps)   Repeaters at Locations
I                              x at 3 and 7
II                             x at 2 and 8; b_mt at 5
III                            x at 2, 16x at 8 and b_mt at 5
IV                             x at 2, 16x at 6 and 8, b_mt at 5

Conclusion & Future Directions
- A new model for the repeater insertion problem
- Early-mode timing requirement is a constraint
- Helps avoid aggressive late-mode optimization creating new early-mode violations
- Should speed up design turn-around time by avoiding ECOs to satisfy early-mode violations
- Techniques for satisfying maximum and minimum slew values, accurate timing that considers these slew values, and avoiding the flat region in the power-delay curve

Limitations & Future Directions
Limitations:
- Run time complexity
- Slack budgeting
Future research topics include:
- Combine gate sizing and repeater insertion
- Cone based
- Graph based
- Better merging techniques
- Controlling variations: process, voltage and temperature
- Hierarchical
We welcome collaboration with academia

Acknowledgements
- Reviewers for detailed feedback
- Rob Mains for review & encouragement
- Aman Joshi and Sun Management for support
- Program Committee and the Organizers

Thank You

Satisfying Min-Max Slews