Sequential Timing Optimization

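Long path timing constraints
Data must not reach the destination FF too late:
$s_i + d(i,j) + T_{\text{setup}} \le s_j + P$
[Figure: FF $i$ (clock arrival time $s_i$) drives FF $j$ (clock arrival time $s_j$) through combinational logic; $d(i,j)$ is the maximum delay $d_{\max}(i,j)$, and $T_{\text{setup}}$ is the setup time at FF $j$]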

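Short path timing constraints
An FF should not receive more than one data set per clock period:
$s_i + d_{\min}(i,j) \ge s_j + T_{\text{hold}}$
[Figure: FF $i$ (clock arrival time $s_i$) drives FF $j$ (clock arrival time $s_j$) through combinational logic with minimum delay $d_{\min}(i,j)$; $T_{\text{hold}}$ is the hold time at FF $j$]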
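The two inequalities above can be checked mechanically for a given set of skews. The following is a minimal sketch, not taken from the slides: the function name `check_skews`, the tuple layout of `paths`, and the three-FF pipeline with its delays are all hypothetical values chosen for illustration.

```python
def check_skews(skews, paths, period, t_setup, t_hold):
    """Return the list of violated constraints for the given clock skews.

    skews: dict FF name -> clock arrival time s_i
    paths: list of (i, j, d_min, d_max), one per combinational path FF i -> FF j
    """
    violations = []
    for i, j, d_min, d_max in paths:
        # Long path (setup): s_i + d_max(i,j) + T_setup <= s_j + P
        if skews[i] + d_max + t_setup > skews[j] + period:
            violations.append(("setup", i, j))
        # Short path (hold): s_i + d_min(i,j) >= s_j + T_hold
        if skews[i] + d_min < skews[j] + t_hold:
            violations.append(("hold", i, j))
    return violations


# Hypothetical pipeline: a slow block between FF1 and FF2, a fast one after it.
paths = [("FF1", "FF2", 2.0, 3.0), ("FF2", "FF3", 0.5, 1.0)]
zero_skew = {"FF1": 0.0, "FF2": 0.0, "FF3": 0.0}
skewed = {"FF1": 0.0, "FF2": 1.0, "FF3": 0.0}

print(check_skews(zero_skew, paths, 2.0, 0.0, 0.0))  # setup violation on FF1 -> FF2
print(check_skews(skewed, paths, 2.0, 0.0, 0.0))     # no violations
```

In this made-up example, delaying the clock at FF2 by one time unit lets the slow block borrow time from the fast one, so a period of 2 becomes feasible.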

Clock skew optimization
Another approach to sequential timing optimization: deliberately change the arrival times of the clock at the various memory elements in a circuit to allow cycle borrowing.
–For zero skew, the delay from the clock source to every FF is $T$
–Positive skew of $\delta$ at FF $k$: change the delay from the clock source to FF $k$ to $T + \delta$
–Negative skew of $\delta$ at FF $k$: change the delay from the clock source to FF $k$ to $T - \delta$
Problem statement: set the skews for optimized performance

Sequential timing optimization
Two “true” sequential timing optimization methods:
–Retiming: moving latches around in a design
–Clock skew optimization: deliberately changing clock arrival times so that the circuit is not truly “synchronous”
[Figure: two versions of a clocked pipeline of FFs around Comb Block 1 and Comb Block 2; one version includes a Delay element on a clock line]

Finding the optimal clock period using skews
Represented by the optimization problem below; solve for $P$ and the optimal skews:
minimize $P$
subject to (for all pairs of FFs $(i,j)$ connected by a combinational path)
$s_i + d_{\min}(i,j) \ge s_j + T_{\text{hold}}$
$s_i + d_{\max}(i,j) + T_{\text{setup}} \le s_j + P$
If $d_{\max}(i,j)$ and $d_{\min}(i,j)$ are constant, this is a linear program in the variables $s_i$ and $P$.

Graph-based approaches
For a constant clock period $P$, the linear program becomes a system of difference constraints of the form $s_p - s_q \le \text{constant}$.
As before, perform a binary search on $P$. For each value of $P$:
–Build an equivalent constraint graph
–The shortest paths in the constraint graph give a set of skews for that value of $P$
–If $P$ is infeasible, the graph contains a negative cycle, which is detected during the shortest-path calculations
[Figure: constraint-graph edge between nodes $i$ and $j$ with weight $f(P)$]
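The binary search with a shortest-path feasibility check can be sketched as follows. This is an illustrative sketch under stated assumptions rather than the presenter's implementation: the names `feasible_skews` and `min_period`, the two-FF loop and its delays are hypothetical, and the upper bound on $P$ assumes that zero skew is feasible there (that is, $d_{\min} \ge T_{\text{hold}}$ on every path).

```python
def feasible_skews(ffs, paths, period, t_setup=0.0, t_hold=0.0):
    """Return skews satisfying all constraints at this period, or None.

    paths: list of (i, j, d_min, d_max), one per combinational path FF i -> FF j.
    """
    edges = []  # (q, p, c) is a constraint-graph edge q -> p encoding s_p - s_q <= c
    for i, j, d_min, d_max in paths:
        edges.append((j, i, period - d_max - t_setup))  # setup: s_i - s_j <= P - d_max - T_setup
        edges.append((i, j, d_min - t_hold))            # hold:  s_j - s_i <= d_min - T_hold
    # Starting every FF at 0 acts like a virtual source with 0-weight edges to all FFs.
    s = {f: 0.0 for f in ffs}
    for _ in range(len(ffs)):
        changed = False
        for q, p, c in edges:
            if s[q] + c < s[p] - 1e-12:  # small tolerance for floating point
                s[p] = s[q] + c
                changed = True
        if not changed:
            return s  # no relaxation possible: every constraint holds
    return None  # still relaxing after |V| passes: negative cycle, P infeasible


def min_period(ffs, paths, t_setup=0.0, t_hold=0.0, tol=1e-6):
    lo = 0.0
    # Assumed upper bound: the zero-skew period, feasible when d_min >= T_hold everywhere.
    hi = max(d_max for _, _, _, d_max in paths) + t_setup
    if feasible_skews(ffs, paths, hi, t_setup, t_hold) is None:
        raise ValueError("upper bound is infeasible; check the hold constraints")
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if feasible_skews(ffs, paths, mid, t_setup, t_hold) is None:
            lo = mid  # negative cycle: period too small
        else:
            hi = mid
    return hi, feasible_skews(ffs, paths, hi, t_setup, t_hold)


# Hypothetical two-FF loop: a slow path FF1 -> FF2 and a fast path back.
paths = [("FF1", "FF2", 2.0, 3.0), ("FF2", "FF1", 1.0, 1.0)]
period, skews = min_period(["FF1", "FF2"], paths)
print(round(period, 3), skews)
```

For this made-up loop the zero-skew period is 3, while the search converges to a period of about 2 with the clock at FF2 arriving one time unit after the clock at FF1.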

Retiming
Assume unit gate delays and no setup times.
Initial circuit: $P = 3$. Retimed circuit: $P = 2$.
[Figure: initial and retimed versions of a clocked circuit with FFs, Comb Block 1 and Comb Block 2]

Retiming: Definition
Relocation of flip-flops (FFs) and latches, usually to achieve a lower clock period.
The latency of all paths in the circuit must be maintained, i.e., the number of FF stages on any input-output path must remain unchanged.

Graph Notation of Circuit
$w(e_{uv})$ = # latencies between $u$ and $v$
$r(u)$ = # latencies moved across gate $u$
$r(PI) = r(PO) = 0$: merge them both into a “host” node $h$ with $r(h) = 0$
Retimed edge weight: $w_r(e_{uv}) = w(e_{uv}) + r(v) - r(u)$
[Figure: edge $e_{uv}$ from gate $u$ (delay $d(u)$) to gate $v$ (delay $d(v)$), with examples of how $w(e_{uv})$ changes for given values of $r(u)$ and $r(v)$]

For a path from $v_1$ to $v_k$
Consider a path of vertices $v_1, v_2, \ldots, v_k$ with edge weights $w_{12}, w_{23}, \ldots, w_{(k-1,k)}$.
–Define $w(v_1 \text{ to } v_k) = w_{12} + w_{23} + \ldots + w_{(k-1,k)}$
–After retiming, the intermediate $r$ terms cancel in pairs:
$w_r(v_1 \text{ to } v_k) = w^r_{12} + w^r_{23} + \ldots + w^r_{(k-1,k)}$
$= [w_{12} + r(2) - r(1)] + [w_{23} + r(3) - r(2)] + \ldots + [w_{(k-1,k)} + r(k) - r(k-1)]$
$= w(v_1 \text{ to } v_k) + r(k) - r(1)$
–For a cycle, $v_1 = v_k$, which implies that $w_r = w$ for a cycle
–In other words, retiming leaves the # latencies unchanged on any cycle
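The telescoping identity is easy to confirm numerically. The sketch below uses a hypothetical three-vertex cycle and an arbitrary retiming vector; the helper names `retime_weights` and `path_weight` are made up for this illustration.

```python
def retime_weights(edges, r):
    """Apply w_r(e_uv) = w(e_uv) + r(v) - r(u) to every edge.

    edges: dict (u, v) -> w(e_uv); r: dict vertex -> r(vertex)
    """
    return {(u, v): w + r[v] - r[u] for (u, v), w in edges.items()}


def path_weight(edges, path):
    """Sum of edge weights along a list of consecutive vertices."""
    return sum(edges[(a, b)] for a, b in zip(path, path[1:]))


# Hypothetical graph: a single cycle a -> b -> c -> a carrying 3 latencies.
edges = {("a", "b"): 1, ("b", "c"): 0, ("c", "a"): 2}
r = {"a": 0, "b": -1, "c": 1}
retimed = retime_weights(edges, r)

path = ["a", "b", "c"]
cycle = ["a", "b", "c", "a"]
print(path_weight(edges, path), path_weight(retimed, path))    # differs by r(c) - r(a)
print(path_weight(edges, cycle), path_weight(retimed, cycle))  # unchanged on the cycle
```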

Constraints for retiming
Non-negativity constraints (cannot have negative latencies)
–$w_r$ on each edge must be non-negative
–For any edge from vertex $u$ to vertex $v$, $w_r(u,v) = w(u,v) + r(v) - r(u) \ge 0$, i.e., $r(u) - r(v) \le w(u,v)$
Period constraints (need a latency if path delay > period)
–(or more precisely, if path delay + $T_{\text{setup}}$ > period)
–For any path from vertex $v_1$ to vertex $v_k$, under clock period $P$, $w_r(v_1 \text{ to } v_k) = w(v_1 \text{ to } v_k) + r(v_k) - r(v_1) \ge 1$ if $\text{delay}(v_1 \text{ to } v_k) > P$, i.e., $r(v_1) - r(v_k) \le w(v_1 \text{ to } v_k) - 1$ if $\text{delay}(v_1 \text{ to } v_k) > P$

Example
Circuit graph:
–Vertex weights = gate delays
–Edge weights = # latencies
Non-negativity constraints
1. $r(h) - r(G1) \le 0$
2. $r(G1) - r(G2) \le 0$
3. $r(G2) - r(G3) \le 0$
4. $r(G3) - r(G4) \le 1$
5. $r(G4) - r(h) \le 0$
Period constraints for $P = 2$
6. $r(h) - r(G3) \le -1$
7. $r(G1) - r(G3) \le -1$
8. $r(G2) - r(G4) \le 0$
9. $r(G2) - r(h) \le 0$
[Figure: clocked circuit with gates G1, G2, G3, G4 in Comb Block 1 and Comb Block 2, and the corresponding circuit graph on vertices h, G1, G2, G3, G4 with its vertex and edge weights]

Graph-based approaches
The constraints form a system of difference constraints $r(u) - r(v) \le c$, which has an equivalent constraint graph.
–The shortest paths in the constraint graph give a set of valid $r$ values for a given value of $P$ (note that the period constraints change for different values of $P$)
–If $P$ is infeasible, the graph contains a negative cycle, which is detected during the shortest-path calculations
[Figure: the constraint $r(u) - r(v) \le c$ drawn as an edge from $v$ to $u$ with weight $c$]

Corresponding shortest path problem
Find the shortest paths from the host to get
–$r(h) = 0$
–$r(G1) = 0$
–$r(G2) = 0$
–$r(G3) = 1$
–$r(G4) = 0$
This gives the retimed solution.
[Figure: retimed circuit graph on vertices h, G1, G2, G3, G4 with updated edge weights, and the corresponding retimed circuit]
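One way to sanity-check these $r$ values is to recompute the clock period before and after retiming. The sketch below rests on assumptions not stated in the slides: the edge weights are read off the non-negativity constraints of the example, all gates have unit delay, and the host is split into separate `PI` and `PO` vertices (both with $r = 0$) so that no combinational path is traced from the outputs back to the inputs. The function name `clock_period` is hypothetical.

```python
def clock_period(edges, delay, r):
    """Longest register-free path delay after retiming by r.

    edges: dict (u, v) -> w(e_uv); delay: dict vertex -> gate delay.
    Assumes the register-free (zero-weight) subgraph is acyclic.
    """
    w_r = {(u, v): w + r[v] - r[u] for (u, v), w in edges.items()}
    if min(w_r.values()) < 0:
        raise ValueError("illegal retiming: negative edge weight")
    arrival = {}

    def visit(v):
        # Arrival at v's output = d(v) + latest arrival over register-free fan-in edges.
        if v not in arrival:
            preds = [u for (u, x), w in w_r.items() if x == v and w == 0]
            arrival[v] = delay[v] + max((visit(u) for u in preds), default=0)
        return arrival[v]

    return max(visit(v) for v in delay)


# Assumed reading of the example: PI -> G1 -> G2 -> G3 -FF-> G4 -> PO, unit gate delays.
edges = {("PI", "G1"): 0, ("G1", "G2"): 0, ("G2", "G3"): 0,
         ("G3", "G4"): 1, ("G4", "PO"): 0}
delay = {"PI": 0, "G1": 1, "G2": 1, "G3": 1, "G4": 1, "PO": 0}

r_initial = {v: 0 for v in delay}
r_retimed = dict(r_initial, G3=1)  # r values from the shortest-path solution

print(clock_period(edges, delay, r_initial))  # period of the initial circuit
print(clock_period(edges, delay, r_retimed))  # period after retiming
```

Under these assumptions the initial period is 3 and the retimed period is 2, matching the figures quoted for the example; setting $r(G3) = 1$ moves the FF from the edge G3 → G4 to the edge G2 → G3.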

Overall scheme for minimum period retiming
Objective: find a retiming that minimizes the clock period (the assignment of $r$ values may not be unique, due to slack in the shortest path graph!)
–Binary search over $P \in [0, P_{\text{unretimed}}]$
–$P_{\text{unretimed}}$ = period of the unretimed circuit = upper bound on the optimal $P$
–Range in some iteration of the search = $[P_{\min}, P_{\max}]$
–Build the shortest path graph with the non-negativity constraints (independent of $P$)
–At each value of $P$:
  1. Add the period constraints to the shortest path graph (related to the W, D matrices discussed in class; not described here)
  2. Solve the shortest path problem
  3. If a negative cycle is found, set $P_{\min} = P$; else set $P_{\max} = P$
–Iterate until the range of $P$ is sufficiently small

Finding shortest paths
Dijkstra’s algorithm
–$O(V \log V + E)$ for a graph with $V$ vertices and $E$ edges
–Applicable only if all edge weights are non-negative
–The latter condition does not hold in our case!
Bellman-Ford algorithm
–$O(VE)$ for a graph with $V$ vertices and $E$ edges
–Outline:
    for I = 1 to V - 1
        for each edge (u,v) ∈ E
            update the neighbor’s weight as r(v) = min[r(u) + d(u,v), r(v)]
    for each edge (u,v) ∈ E
        if r(u) + d(u,v) < r(v) then a negative cycle exists
Basic idea: in iteration I, update the lowest cost paths that use at most I edges. After V - 1 iterations, if any update is still required, a negative cycle exists.
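The Bellman-Ford outline can be applied directly to the nine constraints of the retiming example. This is a minimal sketch: the function name `solve_difference_constraints` and the tuple encoding of constraints are hypothetical, and the code assumes every vertex is reachable from the source in the constraint graph (true for the example).

```python
def solve_difference_constraints(vertices, constraints, source):
    """Solve constraints r(u) - r(v) <= c by Bellman-Ford from `source`.

    constraints: list of (u, v, c), each meaning r(u) - r(v) <= c.
    Returns a dict of r values, or None if a negative cycle makes them infeasible.
    """
    # Each constraint r(u) - r(v) <= c becomes an edge v -> u with weight c.
    edges = [(v, u, c) for u, v, c in constraints]
    r = {x: float("inf") for x in vertices}
    r[source] = 0
    for _ in range(len(vertices) - 1):
        for v, u, c in edges:
            if r[v] + c < r[u]:
                r[u] = r[v] + c
    # Any edge that can still be relaxed lies on or after a negative cycle.
    for v, u, c in edges:
        if r[v] + c < r[u]:
            return None
    return r


vertices = ["h", "G1", "G2", "G3", "G4"]
constraints = [
    ("h", "G1", 0), ("G1", "G2", 0), ("G2", "G3", 0),   # non-negativity 1-3
    ("G3", "G4", 1), ("G4", "h", 0),                     # non-negativity 4-5
    ("h", "G3", -1), ("G1", "G3", -1),                   # period constraints 6-7 (P = 2)
    ("G2", "G4", 0), ("G2", "h", 0),                     # period constraints 8-9 (P = 2)
]
print(solve_difference_constraints(vertices, constraints, "h"))
```

Run on these constraints, the sketch yields $r(G3) = 1$ and zero for every other vertex. Adding a contradictory constraint such as $r(G3) - r(h) \le 0$ (a made-up addition, used only to trigger the check) creates a negative cycle with constraint 6 and makes the function return `None`.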

“Relaxation” algorithm for retiming
Perform a binary search on the clock period $P$ as before. At each value of $P$, check feasibility as follows:
–Repeat V - 1 times (where V = # vertices)
  1. Set r(u) = 0 for each vertex
  2. Perform timing analysis to find the clock period of the circuit
  3. For any vertex u whose arrival time (combinational path delay up to and including u) exceeds P, r(u)++
  4. If no such vertex exists, P is feasible
  5. Else, retime the circuit using these values of r; update the circuit and go to step 1
–If the clock period > P after V - 1 iterations, then P is infeasible
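A compact version of this feasibility check is sketched below. It differs from the steps as listed in one labeled respect: instead of rewriting the circuit and resetting $r$ each round, it accumulates $r$ against the original edge weights, which should yield the same retimed circuit. The function names `arrival_times` and `feas`, and the four-gate ring with two latencies, are hypothetical; the code also assumes every cycle carries at least one latency, so the register-free subgraph stays acyclic.

```python
def arrival_times(edges, delay, r):
    """Arrival time at each vertex output over register-free (zero-weight) paths."""
    w_r = {(u, v): w + r[v] - r[u] for (u, v), w in edges.items()}
    arrival = {}

    def visit(v):
        if v not in arrival:
            preds = [u for (u, x), w in w_r.items() if x == v and w == 0]
            arrival[v] = delay[v] + max((visit(u) for u in preds), default=0)
        return arrival[v]

    for v in delay:
        visit(v)
    return arrival


def feas(edges, delay, period):
    """Return retiming values r achieving `period`, or None if none is found."""
    r = {v: 0 for v in delay}
    for _ in range(len(delay) - 1):
        arrival = arrival_times(edges, delay, r)
        late = [v for v in delay if arrival[v] > period]
        if not late:
            return r  # every arrival time meets the period
        for v in late:
            r[v] += 1  # move one latency onto the fan-in edges of each late vertex
    arrival = arrival_times(edges, delay, r)
    return r if max(arrival.values()) <= period else None


# Hypothetical ring a -> b -> c -> d -> a, unit delays, both latencies on edge a -> b.
edges = {("a", "b"): 2, ("b", "c"): 0, ("c", "d"): 0, ("d", "a"): 0}
delay = {"a": 1, "b": 1, "c": 1, "d": 1}
print(feas(edges, delay, 2))  # a retiming exists
print(feas(edges, delay, 1))  # None: 4 units of delay share 2 latencies on the ring
```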

The retiming-skew relationship
Skew and retiming both borrow one unit of time from Comb Block 2 and lend it to Comb Block 1.
Magnitude of optimal skew = amount of delay that the FF has to move across.
This can be generalized into another approach to retiming.
[Figure: the same pipeline of FFs, Comb Block 1 and Comb Block 2 shown twice: once under “Skew”, with Delay = 1 on a clock line, and once under “Retiming”, with an FF relocated]

Can move from skews to retiming
Moving a flip-flop across a gate G:
–left → right ⇔ increasing its skew by delay(G)
–right → left ⇔ reducing its skew by delay(G)
More generally, an FF with old skew $s$ that moves across a delay $d$ gets new skew $s + d$. For FFs with skews $s_1, s_2, s_3, s_4$ replaced by FF $j$ and FF $k$:
$s_j = \max_{1 \le i \le 4} (s_i + MAX(i,j))$
$s_k = \max_{1 \le i \le 4} (s_i + MAX(i,k))$
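The max-rule for the new skews can be written as a small helper. This sketch reads $MAX(i,j)$ as the maximum combinational delay from input FF $i$ to output FF $j$, which is an interpretation of the slide rather than something it states; the function name `output_skews` and all numeric values are hypothetical, and input/output pairs with no connecting path are simply left out of the delay table.

```python
def output_skews(input_skews, max_delay):
    """Skews of FFs moved from the inputs of a block to its outputs.

    input_skews: dict input FF -> skew s_i
    max_delay: dict (i, j) -> MAX(i, j), assumed to be the maximum delay from i to j
    """
    outputs = {j for _, j in max_delay}
    return {
        j: max(input_skews[i] + d for (i, out), d in max_delay.items() if out == j)
        for j in outputs
    }


# Hypothetical block with four input FFs and two output positions j and k.
s = {1: 0.0, 2: 0.5, 3: -0.5, 4: 1.0}
max_delay = {(1, "j"): 2.0, (2, "j"): 1.0, (3, "j"): 3.0, (4, "j"): 1.0,
             (2, "k"): 2.0, (4, "k"): 2.0}
print(output_skews(s, max_delay))

# Single-gate special case: old skew s, gate delay d, new skew s + d.
print(output_skews({"in": 0.25}, {("in", "out"): 1.0}))
```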

Another approach to retiming
Two-phase approach
–Phase A: Find the optimal skews (complexity depends on the number of FFs, not the number of gates)
–Phase B: Relocate FFs to retime the circuit (since most FF movements are seen to be local in practice, this does not take too long)
–Not provably better than the earlier approach in terms of complexity, but works very well in practice
