A Useful Skew Tree Framework for Inserting Large Safety Margins Rickard Ewetz and Cheng-Kok Koh School of Electrical and Computer Engineering, Purdue University.

Presentation transcript:

A Useful Skew Tree Framework for Inserting Large Safety Margins
Rickard Ewetz and Cheng-Kok Koh
School of Electrical and Computer Engineering, Purdue University
ISPD 2015

Lowering the Point of Divergence and Safety Margins
(Figure: nodes labeled A, B, C, D)

Skew Constraints
(Figure: flip-flops FF_i and FF_j connected through combinational logic)
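
As a hedged reminder of what a skew constraint between FF_i and FF_j usually looks like, the textbook setup and hold conditions can be written as below. The notation is an assumption made for this note and is not taken from the slide: t_i and t_j are clock arrival times, T is the clock period, t^{CQ} is the clock-to-Q delay, D^{max}_{ij} and D^{min}_{ij} are the longest and shortest combinational delays, t^{S} and t^{H} are setup and hold times, and M is a safety margin.

```latex
% Setup (long path) and hold (short path) constraints, standard form
\begin{align}
  t_i + t^{CQ}_i + D^{\max}_{ij} + t^{S}_j &\le t_j + T  && \text{(setup)} \\
  t_i + t^{CQ}_i + D^{\min}_{ij} &\ge t_j + t^{H}_j      && \text{(hold)}
\end{align}
% Rewritten as bounds on the skew t_i - t_j
\begin{align}
  l_{ij} \;\le\; t_i - t_j \;\le\; u_{ij},
  \qquad
  u_{ij} = T - t^{CQ}_i - D^{\max}_{ij} - t^{S}_j,
  \qquad
  l_{ij} = t^{H}_j - t^{CQ}_i - D^{\min}_{ij}
\end{align}
% A uniform safety margin M tightens both bounds
\begin{align}
  l_{ij} + M \;\le\; t_i - t_j \;\le\; u_{ij} - M
\end{align}
```

Under this reading, each bound becomes one weighted edge of a constraint graph, and inserting a margin M lowers every edge weight by M.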

Safety Margins
(Figure: nodes labeled A, B, C, D)

SCG (skew constraint graph)
(Figure: flip-flops a, b, c, d and the SCG on nodes a, b, c, d)
Insert safety margin M_user = 20
Negative cycle => no feasible arrival times!
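
A minimal sketch of the feasibility check implied here: lower every SCG edge weight by the margin and look for a negative cycle with Bellman-Ford. The function name, the edge-list representation, and the example weights are hypothetical; the slide's actual graph did not survive.

```python
def has_negative_cycle(nodes, edges, margin=0):
    """Return True if the SCG has a negative cycle after lowering every
    edge weight by `margin`. Edges are (u, v, weight) tuples."""
    # Starting every node at 0 acts like a virtual source with
    # zero-weight edges to all nodes, so every cycle is reachable.
    dist = {n: 0 for n in nodes}
    for _ in range(len(nodes)):
        changed = False
        for u, v, w in edges:
            cand = dist[u] + (w - margin)
            if cand < dist[v]:
                dist[v] = cand
                changed = True
        if not changed:
            return False  # converged: arrival times exist
    return True  # still relaxing after |V| rounds: negative cycle


# Hypothetical four-node SCG (weights in ps), not the slide's data.
nodes = ["a", "b", "c", "d"]
edges = [("a", "b", 40), ("b", "a", 30),
         ("b", "c", 10), ("c", "b", 20),
         ("c", "d", 50), ("d", "c", 35)]

print(has_negative_cycle(nodes, edges, margin=15))
print(has_negative_cycle(nodes, edges, margin=20))
```

In this made-up graph the cycle between b and c has total weight 30, so a margin of 15 leaves it at zero while a margin of 20 drives it negative.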

Cycles of Skew Constraints
(Figure labels: "= 0"; "SCG with = 0")
Maximum uniform safety margin
[9] J. Fishburn. Clock skew optimization. IEEE Transactions on Computers, pages 945–951.
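
One way to read the maximum uniform safety margin: a margin M is feasible only if every cycle keeps non-negative weight after each of its edges loses M, so the largest feasible M equals the minimum mean weight over all cycles. The sketch below computes that value with Karp's minimum mean cycle algorithm. This is a swapped-in standard technique, not necessarily how [9] or the authors compute it, and the graph is hypothetical.

```python
def max_uniform_margin(nodes, edges):
    """Minimum cycle mean of the SCG (Karp's algorithm).
    Edges are (u, v, weight) tuples. Returns inf if the graph is acyclic."""
    n = len(nodes)
    inf = float("inf")
    # F[k][v]: minimum weight of a walk with exactly k edges ending at v,
    # starting anywhere (equivalent to a virtual zero-weight source).
    F = [{v: 0.0 for v in nodes}]
    for _ in range(n):
        prev = F[-1]
        cur = {v: inf for v in nodes}
        for u, v, w in edges:
            if prev[u] + w < cur[v]:
                cur[v] = prev[u] + w
        F.append(cur)
    best = inf
    for v in nodes:
        if F[n][v] == inf:
            continue  # no n-edge walk ends here
        worst = max((F[n][v] - F[k][v]) / (n - k)
                    for k in range(n) if F[k][v] < inf)
        best = min(best, worst)
    return best


# Hypothetical four-node SCG (weights in ps), not the slide's data.
nodes = ["a", "b", "c", "d"]
edges = [("a", "b", 40), ("b", "a", 30),
         ("b", "c", 10), ("c", "b", 20),
         ("c", "d", 50), ("d", "c", 35)]

print(max_uniform_margin(nodes, edges))
```

For this graph the tightest cycle is b to c and back (weight 30 over 2 edges), so the largest uniform margin is 15 ps.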

Greedy-UST/DME
(Figure: flip-flops a, b, c, d; clock source; clock tree over sinks a, b, c, d)
FSR_ab = [-d_ab, d_ba]
[17] C.-W. A. Tsao and C.-K. Koh. UST/DME: a clock tree router for general skew constraints. ACM TODAES, pages 359–379.
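
A small sketch of how the ranges FSR_ab = [-d_ab, d_ba] could be tabulated, assuming d_xy denotes the shortest-path distance from x to y in the SCG (the slide does not define it). It uses Floyd-Warshall on a hypothetical graph and assumes there are no negative cycles.

```python
def feasible_skew_ranges(nodes, edges):
    """All-pairs shortest paths (Floyd-Warshall), then
    FSR_ab = (-d_ab, d_ba) for every ordered pair a != b.
    Assumes the SCG has no negative cycle."""
    inf = float("inf")
    d = {u: {v: (0 if u == v else inf) for v in nodes} for u in nodes}
    for u, v, w in edges:
        d[u][v] = min(d[u][v], w)
    for k in nodes:
        for i in nodes:
            for j in nodes:
                if d[i][k] + d[k][j] < d[i][j]:
                    d[i][j] = d[i][k] + d[k][j]
    return {(a, b): (-d[a][b], d[b][a])
            for a in nodes for b in nodes if a != b}


# Hypothetical four-node SCG (weights in ps), not the slide's data.
nodes = ["a", "b", "c", "d"]
edges = [("a", "b", 40), ("b", "a", 30),
         ("b", "c", 10), ("c", "b", 20),
         ("c", "d", 50), ("d", "c", 35)]

fsr = feasible_skew_ranges(nodes, edges)
print(fsr[("a", "b")])
print(fsr[("b", "c")])
```

In this example the range for (b, c) is 30 ps wide. Lowering both of its edges by a margin M shrinks the width by 2M, which is why a large enough margin can leave a pair with no feasible skew at all.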

Insertion of Safety Margins in [17]
Uniform safety margins M_user
(Plot: yield (%) versus M_user (ps); M = 15 ps marked)
[17] C.-W. A. Tsao and C.-K. Koh. UST/DME: a clock tree router for general skew constraints. ACM TODAES, pages 359–379.

Insertion of Safety Margins in [11]
(Figure: flip-flops a, b, c, d; clock source; clock tree over sinks a, b, c, d)
FSR_ab = [-d_ab, d_ba]
(Figure labels: M = 15 ps; bc = ∅; 20)
[11] W.-C. D. Lam and C.-K. Koh. Process variation robust clock tree routing. ASP-DAC '05, pages 606–611.

Proposal
– Safety margin M_user > max uniform M
– Lower point of divergence
– Few constraints that limit the magnitude of M!

Flow: UST-LSM Framework
(Flowchart with Input, Pre-synthesis, Synthesis, and Output; boxes and branch labels as extracted)
– Input
– Decrease SCG edge weights with M_user
– Detection of negative cycles
  – Found one cycle in SCG: reduction of safety margin from edges of negative cycles, until the cycle is non-negative
  – No cycles in SCG: continue
– Create clusters from negative cycles
– Construct trees from cluster 2 to K
– Construct clock tree from cluster 1 and the trees from cluster 2 to K
– Output
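
The margin-reduction loop in this flow can be sketched roughly as follows. Several details are assumptions made for illustration, not the authors' method: the deficit of a negative cycle is spread evenly over its edges, the SCG is taken to be feasible at zero margin, and each detected cycle is simply recorded as a node set (the slide does not give the clustering rule). All names and the example graph are hypothetical.

```python
from fractions import Fraction


def find_negative_cycle(nodes, edges, margin):
    """Bellman-Ford with predecessor edges. Returns the edge indices of
    one negative cycle under weights w - margin[idx], or None."""
    dist = {v: 0 for v in nodes}
    pred = {v: None for v in nodes}  # index of the edge that last relaxed v
    last = None
    for _ in range(len(nodes)):
        last = None
        for idx, (u, v, w) in enumerate(edges):
            cand = dist[u] + w - margin[idx]
            if cand < dist[v]:
                dist[v], pred[v], last = cand, idx, v
        if last is None:
            return None
    for _ in range(len(nodes)):  # step back far enough to land on the cycle
        last = edges[pred[last]][0]
    cycle, v = [], last
    while True:
        idx = pred[v]
        cycle.append(idx)
        v = edges[idx][0]
        if v == last:
            return cycle


def presynthesis(nodes, edges, m_user):
    """Start with m_user on every edge; while a negative cycle exists,
    lower the margin on its edges until the cycle weight is exactly zero."""
    margin = [Fraction(m_user)] * len(edges)  # exact arithmetic, no drift
    cycles = []
    while True:
        cycle = find_negative_cycle(nodes, edges, margin)
        if cycle is None:
            return margin, cycles
        deficit = -sum(edges[i][2] - margin[i] for i in cycle)
        cut = deficit / len(cycle)  # assumption: spread the cut evenly
        for i in cycle:
            margin[i] -= cut
        cycles.append(sorted({edges[i][0] for i in cycle}))


# Hypothetical four-node SCG (weights in ps), not the slide's data.
nodes = ["a", "b", "c", "d"]
edges = [("a", "b", 40), ("b", "a", 30),
         ("b", "c", 10), ("c", "b", 20),
         ("c", "d", 50), ("d", "c", 35)]

margin, cycles = presynthesis(nodes, edges, m_user=20)
print([int(m) for m in margin], cycles)
```

Because margins only decrease, a repaired cycle never turns negative again, so the loop ends after a finite number of cycles. In the example only the edges between b and c lose margin; every other constraint keeps the full 20 ps.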

(Figure: clusters C_1 and C_2)

Evaluation of the UST-LSM Framework
Columns: Name; Clock period (ns); Number of nets; Number of cells; Number of sequential elements; Number of skew constraints
Benchmarks: scaled_s1423, scaled_s5378, scaled_s, msp, fpu, ecg
[8] R. Ewetz and C.-K. Koh. Benchmark circuits for clock scheduling and synthesis

Monte Carlo Framework
Adopted from the ISPD 2010 contest [15]
Variations
– Supply voltage (15%)
– Wire widths (10%)
– Temperature (30%)
– Channel length (10%)
Spatial correlations
– Quad tree model [1]
Stage-by-stage with slew propagation [19]
[15] C. Sze. ISPD 2010 high performance clock network synthesis contest: Benchmark suite and results. ISPD '10, pages 143–143.
[1] A. Agarwal, D. Blaauw, and V. Zolotov. Statistical timing analysis for intra-die process variations with spatial correlations. ICCAD '03, pages 900–907.
[19] M. Zhao, K. Gala, V. Zolotov, Y. Fu, R. Panda, R. Ramkumar, and B. Agrawal. Worst case clock skew under power supply variations. TAU '02, pages 22–28, 2002.

Evaluation metrics
Metrics:
– Yield (skew + transition time)
– 95%-slack
– Capacitive cost
– Run-time
Designs with loose and tight skew constraints
– Loose if no negative cycles with M_user = 100 ps
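
A rough sketch of how the first two metrics might be computed from Monte Carlo output. The definitions are assumptions, since the slide does not spell them out: each sample is summarized by its worst slack, a sample passes when that slack is non-negative, and the 95%-slack is taken to be the slack that 95% of samples meet or exceed. The function name and the sample data are made up.

```python
def yield_and_slack95(worst_slacks):
    """worst_slacks: one worst-case slack (ps) per Monte Carlo sample.
    Returns (yield in percent, assumed 95%-slack)."""
    n = len(worst_slacks)
    passed = sum(1 for s in worst_slacks if s >= 0)
    yield_pct = 100.0 * passed / n
    # Assumed definition: the slack met or exceeded by 95% of samples.
    ordered = sorted(worst_slacks, reverse=True)
    k = (95 * n + 99) // 100  # integer ceil(0.95 * n)
    return yield_pct, ordered[k - 1]


# Made-up data: 19 passing samples and one failing sample.
samples = list(range(1, 20)) + [-5]
print(yield_and_slack95(samples))
```

A positive 95%-slack under this reading means at least 95% of the simulated chips meet timing with room to spare.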

Designs with Loose Skew Constraints
Columns: Safety margin M_user (ps); Yield; 95%-slack; Cap (fF); Run-time
Rows: msp (ZST, No margin); fpu (ZST, No margin)
Similar results for scaled_s1423 and scaled_s5378

Designs with Tight Skew Constraints
Columns: Safety margin M_user (ps); Clustering; C_2 (num); Max stages; C_3 (num); C_4 (num)
Rows (each with clustering yes and no):
– scaled_s15850, M+20=47
– scaled_s15850, M+50=77
– ecg, M+15=30
– ecg, M+25=40

Tight Skew Constraints
(Plot: yield (%) versus M_user (ps))

Tight Skew Constraints
Columns: BM; Safety margin M_user (ps); Yield (%); 95%-slack; Cap (fF); Run-time
Rows:
– scaled_s15850: ZST; 0; M=27; M+10=37; M+20=47; M+30=
– ecg: ZST; 0; M=15; M+5=20; M+10=25; M+15=30; M+20=35; M+25=

Illustration on scaled_s15850

M_user = M+0 = 27

M_user = M+10 = 37

Summary
Combine the lowering of the point of divergence with insertion of large safety margins!
Questions

Reducing the Cost
Columns: BM; Safety margin M_user (ps); Yield (%); Cap (fF) ISPD2015; New Cap
Rows:
– scaled_s15850: ZST; 0; M=27; M+10=37; M+20=47; M+30=
– ecg: ZST; 0; M=15; M+5=20; M+10=25; M+15=30; M+20=