Presentation is loading. Please wait.

Presentation is loading. Please wait.

CSE241 L4 System.1Kahng & Cichy, UCSD ©2003 CSE241A VLSI Digital Circuits Winter 2003 Lecture 04: Static Timing Analysis.

Similar presentations


Presentation on theme: "CSE241 L4 System.1Kahng & Cichy, UCSD ©2003 CSE241A VLSI Digital Circuits Winter 2003 Lecture 04: Static Timing Analysis."— Presentation transcript:

1 CSE241 L4 System.1Kahng & Cichy, UCSD ©2003 CSE241A VLSI Digital Circuits Winter 2003 Lecture 04: Static Timing Analysis

2 CSE241 L4 System.2Kahng & Cichy, UCSD ©2003  All emails and announcements are archived on the class web page  Remember: tomorrow (Friday, January 17) noon- 1:20pm (EBUI 3327) prepares you for the synthesis lecture on Tuesday (see syllabus on web for expected content)  HW questions 4-6 due today  Lab #1 report due Monday; turn-in instructions and script announced last night  This class: introduction to static timing analysis  Goals:  How does STA work?  What are use models for STA?  (What kinds of things are done with STA results?) Logistics

3 CSE241 L4 System.3Kahng & Cichy, UCSD ©2003  Determine fastest permissible clock speed (e.g. 100MHz) by determining delay (including set-up and hold time) of longest path from register to register (e.g. 10ns)  Largely eliminates need for gate-level simulation to verify delay of the circuit Goal of Static Timing Verification clk Combinational logic clk Combinational logic clk Combinational logic Courtesy K. Keutzer et al. UCB

4 CSE241 L4 System.4Kahng & Cichy, UCSD ©2003 Cycle Time - Critical Path Delay  Cycle time (T) cannot be smaller than longest path delay (T max )  Longest (critical) path delay is a function of:  Total gate, wire delays l logic levels clock Q1 Q2 T clock1 T clock2 critical path, ~5 logic levels T clock1 data cycle time Courtesy K. Keutzer et al. UCB

5 CSE241 L4 System.5Kahng & Cichy, UCSD ©2003 Cycle Time - Setup Time  For FFs to correctly capture data, must be stable for:  Setup time (T setup ) before clock arrives clock Q1 Q2 T clock1 T clock2 critical path, ~5 logic levels T clock1 data setup time Courtesy K. Keutzer et al. UCB

6 CSE241 L4 System.6Kahng & Cichy, UCSD ©2003 Cycle Time – Clock Skew clock Q1 Q2 T clock1 T clock2 T clock1 T clock2 Q2 data clock skew Q2 6  If clock network has unbalanced delay – clock skew  Cycle time is also a function of clock skew (T skew ) critical path, ~5 logic levels Courtesy K. Keutzer et al. UCB

7 CSE241 L4 System.7Kahng & Cichy, UCSD ©2003 Cycle Time – Flip-Flop Delay (Clock to Q)  Cycle time is also a function of propagation delay of FF (T clk-to-Q )  T clk-to-Q : time from arrival of clock signal till change at FF output) clock Q1 Q2 T clock1 T clock2 T clock1 T clock2 Q2 clock-to-Q data Q2 critical path, ~5 logic levels Courtesy K. Keutzer et al. UCB

8 CSE241 L4 System.8Kahng & Cichy, UCSD ©2003 Min Path Delay - Hold Time  For FFs to correctly latch data, data must be stable during:  Hold time (T hold ) after clock arrives  Determined by delay of shortest path in circuit (T min ) and clock skew (T skew ) clock Q1 Q2 T clock1 T clock2 short path, ~3 logic levels T clock1 data hold time Courtesy K. Keutzer et al. UCB

9 CSE241 L4 System.9Kahng & Cichy, UCSD ©2003 Setup, Hold, Cycle Times set-up time – D stable before clock cycle time Example of a single phase clock hold time – D stable after clock When signal may change Courtesy K. Keutzer et al. UCB

10 CSE241 L4 System.10Kahng & Cichy, UCSD ©2003 Approach – Reduce to Combinational clk Combinational logic clk Combinational logic clk Combinational logic original circuit extracted block Combinational logic Courtesy K. Keutzer et al. UCB

11 CSE241 L4 System.11Kahng & Cichy, UCSD ©2003 Combinational Block  Arrival time in green A C B f 2 2 2 1 0 1 0.20.10 X Y Z W.15.05  Interconnect delay in red  Gate delay in blue  What’s the right mathematical object to use to represent this physical object? Courtesy K. Keutzer et al. UCB

12 CSE241 L4 System.12Kahng & Cichy, UCSD ©2003 Problem Formulation - 1 C B f X Y W 0.05.1 1.2 0 0 1 A.15.20 A C B f 2 2 2 1 0 1 0.10 X Y Z W.15.05 1 2 2 2 Z  Use a labeled directed graph  G =  Vertices represent gates, primary inputs and primary outputs  Edges represent wires  Labels represent delays  HW #7: In this graph model, both nodes and edges have delays. How would you modify this model so that only edges have delay, yet the same timing analysis remains possible? Courtesy K. Keutzer et al. UCB

13 CSE241 L4 System.13Kahng & Cichy, UCSD ©2003 Problem Formulation – Actual Arrival Time  Actual arrival time A(v) for a node v is latest time that signal can arrive at node v X Y A(Z) Z A(X) A(Y) Courtesy K. Keutzer et al. UCB

14 CSE241 L4 System.14Kahng & Cichy, UCSD ©2003 Problem Formulation - 2 C B f X Y W 0.5.1 1.2 0 0 1 A.15.20 A C B f 2 2 2 1 0 1 0.10 X Y Z W.15.5.05 1 2 2 2 Z  Use a labeled directed graph  G =  Enumerate all paths - choose the longest? Courtesy K. Keutzer et al. UCB

15 CSE241 L4 System.15Kahng & Cichy, UCSD ©2003 Problem Formulation - 3 C B f X Y W 0.5.1 1.2 0 0.1 A.15.20 1 2 2 2 Z Compute longest path in a graph G = // delay is set of labels, Origin is the super-source of the DAG Forward-prop(W){ for each vertex v in W for each edge from v Final-delay(w) = max(Final-delay(w), delay(v) + delay(w) + delay( )) if all incoming edges of w have been traversed, add w to W } Longest path(G){ Forward_prop(Origin) } 0 0 0 0 0 Origin (Kirkpatrick 1966, IBM JRD) Courtesy K. Keutzer et al. UCB

16 CSE241 L4 System.16Kahng & Cichy, UCSD ©2003 Algorithm Execution C B f X Y W 0.5.1 1.2 0 0.1 A.15.20 1 2 2 2 Z 0 0 0 0 0 Origin 0 Compute longest path in a graph G = // delay is set of labels, Origin is the super-source of the DAG Forward-prop(W){ for each vertex v in W for each edge from v Final-delay(w) = max(Final-delay(w), delay(v) + delay(w) + delay( )) if all incoming edges of w have been traversed, add w to W } Longest path(G){ Forward_prop(Origin) } Courtesy K. Keutzer et al. UCB

17 CSE241 L4 System.17Kahng & Cichy, UCSD ©2003 Algorithm Execution C B f X Y W 0.5.1 1.2 0 0.1 A.15.20 1 2 2 2 Z 0 0 0 0 0 Origin 0 0 Compute longest path in a graph G = // delay is set of labels, Origin is the super-source of the DAG Forward-prop(W){ for each vertex v in W for each edge from v Final-delay(w) = max(Final-delay(w), delay(v) + delay(w) + delay( )) if all incoming edges of w have been traversed, add w to W } Longest path(G){ Forward_prop(Origin) } Courtesy K. Keutzer et al. UCB

18 CSE241 L4 System.18Kahng & Cichy, UCSD ©2003 Algorithm Execution C B f X Y W 0.5.1 1.2 0 0.1 A.15.20 1 2 2 2 Z 0 0 0.1 0 Origin 0 0 0 Compute longest path in a graph G = // delay is set of labels, Origin is the super-source of the DAG Forward-prop(W){ for each vertex v in W for each edge from v Final-delay(w) = max(Final-delay(w), delay(v) + delay(w) + delay( )) if all incoming edges of w have been traversed, add w to W } Longest path(G){ Forward_prop(Origin) } Courtesy K. Keutzer et al. UCB

19 CSE241 L4 System.19Kahng & Cichy, UCSD ©2003 Algorithm Execution C B f X Y W 0.5.1 1.2 0 0.1 A.15.20 1 2 2 2 Z 0 0 0.1 0 Origin 0 0 0 2.0 Compute longest path in a graph G = // delay is set of labels, Origin is the super-source of the DAG Forward-prop(W){ for each vertex v in W for each edge from v Final-delay(w) = max(Final-delay(w), delay(v) + delay(w) + delay( )) if all incoming edges of w have been traversed, add w to W } Longest path(G){ Forward_prop(Origin) } Courtesy K. Keutzer et al. UCB

20 CSE241 L4 System.20Kahng & Cichy, UCSD ©2003 Algorithm Execution C B f X Y W 0.5.1 1.2 0 0.1 A.15.20 1 2 2 2 Z 0 0 0.1 0 Origin 1.1 Compute longest path in a graph G = // delay is set of labels, Origin is the super-source of the DAG Forward-prop(W){ for each vertex v in W for each edge from v Final-delay(w) = max(Final-delay(w), delay(v) + delay(w) + delay( )) if all incoming edges of w have been traversed, add w to W } Longest path(G){ Forward_prop(Origin) } Courtesy K. Keutzer et al. UCB

21 CSE241 L4 System.21Kahng & Cichy, UCSD ©2003 Algorithm Execution C B f X Y W 0.5.1 1.2 0 0.1 A.15.20 1 2 2 2 Z 0 0 0.1 0 Origin 1.1 2.2 Compute longest path in a graph G = // delay is set of labels, Origin is the super-source of the DAG Forward-prop(W){ for each vertex v in W for each edge from v Final-delay(w) = max(Final-delay(w), delay(v) + delay(w) + delay( )) if all incoming edges of w have been traversed, add w to W } Longest path(G){ Forward_prop(Origin) } Courtesy K. Keutzer et al. UCB

22 CSE241 L4 System.22Kahng & Cichy, UCSD ©2003 Algorithm Execution C B f X Y W 0.5.1 1.2 0 0.1 A.15.20 1 2 2 Z 0 0 0.1 0 Origin 1.1 2 3 Compute longest path in a graph G = // delay is set of labels, Origin is the super-source of the DAG Forward-prop(W){ for each vertex v in W for each edge from v Final-delay(w) = max(Final-delay(w), delay(v) + delay(w) + delay( )) if all incoming edges of w have been traversed, add w to W } Longest path(G){ Forward_prop(Origin) } Courtesy K. Keutzer et al. UCB

23 CSE241 L4 System.23Kahng & Cichy, UCSD ©2003 Algorithm Execution C B f X Y W 0.5.1 1.2 0 0 1 A.15.20 1 2 3.6 2 Z 0 0 0 0 0 Origin 1.1 2 3 Compute longest path in a graph G = // delay is set of labels, Origin is the super-source of the DAG Forward-prop(W){ for each vertex v in W for each edge from v Final-delay(w) = max(Final-delay(w), delay(v) + delay(w) + delay( )) if all incoming edges of w have been traversed, add w to W } Longest path(G){ Forward_prop(Origin) } Courtesy K. Keutzer et al. UCB

24 CSE241 L4 System.24Kahng & Cichy, UCSD ©2003 Algorithm Execution C B f X Y W 0.5.1 1.2 0 0 1 A.15.20 1 2 3.6 2 Z 0 0 0 0 0 Origin 1.1 2 5.8 3 Compute longest path in a graph G = // delay is set of labels, Origin is the super-source of the DAG Forward-prop(W){ for each vertex v in W for each edge from v Final-delay(w) = max(Final-delay(w), delay(v) + delay(w) + delay( )) if all incoming edges of w have been traversed, add w to W } Longest path(G){ Forward_prop(Origin) } Courtesy K. Keutzer et al. UCB

25 CSE241 L4 System.25Kahng & Cichy, UCSD ©2003 Algorithm Execution C B f X Y W 0.5.1 1.2 0 0 1 A.15.20 1 2 3.6 2 Z 0 0 0 0 0 Origin 1.1 2 5.8 3 Compute longest path in a graph G = // delay is set of labels, Origin is the super-source of the DAG Forward-prop(W){ for each vertex v in W for each edge from v Final-delay(w) = max(Final-delay(w), delay(v) + delay(w) + delay( )) if all incoming edges of w have been traversed, add w to W } Longest path(G){ Forward_prop(Origin) } Courtesy K. Keutzer et al. UCB

26 CSE241 L4 System.26Kahng & Cichy, UCSD ©2003 Algorithm Execution C B f X Y W 0.5.1 1.2 0 0 1 A.15.20 1 2 3.6 2 Z 0 0 0 0 0 Origin 1.1 2 5.8 3 5.95 Compute longest path in a graph G = // delay is set of labels, Origin is the super-source of the DAG Forward-prop(W){ for each vertex v in W for each edge from v Final-delay(w) = max(Final-delay(w), delay(v) + delay(w) + delay( )) if all incoming edges of w have been traversed, add w to W } Longest path(G){ Forward_prop(Origin) } Courtesy K. Keutzer et al. UCB

27 CSE241 L4 System.27Kahng & Cichy, UCSD ©2003 Critical Path (sub-graph) C B f X Y W 0.5.1 1.2 0 0 1 A.15.20 1 2 3.6 2 Z 0 0 0 0 0 Origin 1.1 2 5.8 3 5.95 Compute longest path in a graph G = // delay is set of labels, Origin is the super-source of the DAG Forward-prop(W){ for each vertex v in W for each edge from v Final-delay(w) = max(Final-delay(w), delay(v) + delay(w) + delay( )) if all incoming edges of w have been traversed, add w to W } Longest path(G){ Forward_prop(Origin) } Courtesy K. Keutzer et al. UCB

28 CSE241 L4 System.28Kahng & Cichy, UCSD ©2003 Timing for Optimization: Extra Requirements  Longest-path algorithm computes arrival times at each node  If we have constraints, need to propagate slack to each node l A measure of how much timing margin exists at each node l Can optimize a particular branch  Can trade slack for power, area, robustness clock Courtesy K. Keutzer et al. UCB

29 CSE241 L4 System.29Kahng & Cichy, UCSD ©2003 Required Arrival Time  Required arrival time R(v) is the time before which a signal must arrive to avoid a timing violation  Then recursively X Y R(Z) Z R(X) R(Y) Courtesy K. Keutzer et al. UCB

30 CSE241 L4 System.30Kahng & Cichy, UCSD ©2003 Required Time Propagation: Example C B f X Y W 0.5.1 1.2 0 0 1 A.15.20 1 2 3.6 2 Z  Assume required time at output R(f) = 5.80  Propagate required times backwards 0 O 0 0 0 Origin 1.1 2 5.8 3 5.95 5.65 3.45 0.95 0.45 -0.15 1.45 5.80 Courtesy K. Keutzer et al. UCB

31 CSE241 L4 System.31Kahng & Cichy, UCSD ©2003 Timing Slack  From arrival and required time can compute slack. For each node v:  Slack reflects criticality of a node  Positive slack l Node is not on critical path. Timing constraints met.  Zero slack l Node is on critical path. Timing constraints are barely met.  Negative slack l There is a timing violation  Slack distribution is key for timing optimization! Courtesy K. Keutzer et al. UCB

32 CSE241 L4 System.32Kahng & Cichy, UCSD ©2003 Timing Slack Computation: Example  Compute slack at each node C B f X Y W A Z Origin -0.15 0.45 -0.15 0.45 -0.15 1.45 -0.15 Courtesy K. Keutzer et al. UCB

33 CSE241 L4 System.33Kahng & Cichy, UCSD ©2003 clk Combinational logic clk Combinational logic clk Combinational logic  Computing longest path delay Full path enumeration - potentially exponential Longest path algorithm on DAG (Kirkpatrick 1966, IBM JRD) (O(v+e) or O(g + p))  Currently applied to even the largest (>10M gate) circuits  Two challenges  Asynchronous subcircuits  False paths Approach of Static Timing Verification Courtesy K. Keutzer et al. UCB

34 CSE241 L4 System.34Kahng & Cichy, UCSD ©2003 Enhancements of STA  Incremental timing analysis  Nanometer-scale process effects – variation (  probabilistic timing analysis)  Interference – crosstalk  Multiple inputs switching  Conservatism of delay propagation  HW #8: Suppose you change the size of one (combinational) gate in your design, thus invalidating the previous timing analysis. How much work must be done to regain a correct timing analysis? Courtesy K. Keutzer et al. UCB

35 CSE241 L4 System.35Kahng & Cichy, UCSD ©2003 Timing Correction  Driven by STA l “Incremental performance analysis backplane”  Fix electrical violations l Resize cells l Buffer nets l Copy (clone) cells  Fix timing problems l Local transforms (bag of tricks) l Path-based transforms DAC-2002, Physical Chip Implementation

36 CSE241 L4 System.36Kahng & Cichy, UCSD ©2003 Local Synthesis Transforms  Resize cells  Buffer or clone to reduce load on critical nets  Decompose large cells  Swap connections on commutative pins or among equivalent nets  Move critical signals forward  Pad early paths  Area recovery DAC-2002, Physical Chip Implementation

37 CSE241 L4 System.37Kahng & Cichy, UCSD ©2003 Transform Example Delay = 4 ….. Double Inverter Removal ….. Delay = 2 DAC-2002, Physical Chip Implementation

38 CSE241 L4 System.38Kahng & Cichy, UCSD ©2003 Resizing b a d e f 0.2 0.3 ? b a A 0.035 b a C 0.026 DAC-2002, Physical Chip Implementation

39 CSE241 L4 System.39Kahng & Cichy, UCSD ©2003 Cloning b a d e f g h 0.2 ? b a d e f g h A B DAC-2002, Physical Chip Implementation

40 CSE241 L4 System.40Kahng & Cichy, UCSD ©2003 Buffering b a d e f g h 0.2 ? b a d e f g h 0.1 0.2 B B DAC-2002, Physical Chip Implementation

41 CSE241 L4 System.41Kahng & Cichy, UCSD ©2003 Redesign Fan-in Tree a c d b e Arr(b)=3 Arr(c)=1 Arr(d)=0 Arr(a)=4 Arr(e)=6 1 1 1 c d e Arr(e)=5 1 1 b 1 a DAC-2002, Physical Chip Implementation

42 CSE241 L4 System.42Kahng & Cichy, UCSD ©2003 Redesign Fan-out Tree 1 1 1 3 1 1 1 Longest Path = 5 1 1 1 3 1 2 Longest Path = 4 Slowdown of buffer due to load DAC-2002, Physical Chip Implementation

43 CSE241 L4 System.43Kahng & Cichy, UCSD ©2003 Decomposition DAC-2002, Physical Chip Implementation

44 CSE241 L4 System.44Kahng & Cichy, UCSD ©2003 Swap Commutative Pins 2 c a b 2 1 0 1 1 1 3 a c b 2 1 0 1 1 2 1 5 Simple sorting on arrival times and delay works DAC-2002, Physical Chip Implementation


Download ppt "CSE241 L4 System.1Kahng & Cichy, UCSD ©2003 CSE241A VLSI Digital Circuits Winter 2003 Lecture 04: Static Timing Analysis."

Similar presentations


Ads by Google