1 An O(bn^2) Time Algorithm for Optimal Buffer Insertion with b Buffer Types
Authors: Zhuo Li and Weiping Shi
Presenter: Sunil Khatri
Department of Electrical Engineering, Texas A&M University, College Station, Texas 77845, USA

2 DATE 2005, Munich, 03/10/2005
Outline: Introduction; O(b^2 n^2) Algorithm; New O(bn^2) Algorithm; Experimental Results; Extension; Conclusion

3 Introduction
Buffer insertion and sizing is one of the most effective methods for reducing interconnect delay.
Saxena et al. [TCAD 2004]

4 Introduction (cont.)
Modern libraries contain hundreds of different buffers with different characteristics: polarity, input capacitance, driving resistance, intrinsic delay, noise margin, power, area, etc.
Buffer library size has a quadratic effect on running time in traditional algorithms.
With such a large number of buffers and buffer types, fast algorithms for buffer insertion are crucial for timing closure.

5 Problem Formulation
Given: a routing tree, n possible buffer positions, sink capacitances and required arrival times (RAT), a buffer library, and wire resistance and capacitance.
Delay model: Elmore delay for interconnect and a linear delay model for buffers.
[Figure: routing tree with nodes s_0 to s_4, labeled with the source, the sinks, the possible buffer positions, and the buffer library]

6 Maximum Slack Problem
Find: where to insert buffers so that the slack at the source, Q(s_0), is maximized:
$Q(s_0) = \min_{i > 0} \{ \mathrm{RAT}(s_i) - \mathrm{delay}(s_0, s_i) \}$
Without buffers, Q(s_0) = -50 ps.
[Figure: routing tree with nodes s_0 to s_4]

7 Maximum Slack Problem (cont.)
With 2 buffers inserted, Q(s_0) = 100 ps.
[Figure: routing tree with nodes s_0 to s_4]
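The slack definition on the Maximum Slack Problem slides can be sketched in a few lines. This is only an illustration: the function name `source_slack`, the dictionary layout, and all numbers are assumptions made here, not values from the slides.

```python
def source_slack(rat, delay):
    """Return Q(s_0): the minimum over sinks of RAT(s_i) - delay(s_0, s_i)."""
    return min(rat[s] - delay[s] for s in rat)


# Hypothetical required arrival times and source-to-sink delays, in picoseconds.
rat = {"s1": 400.0, "s2": 350.0}
delay = {"s1": 420.0, "s2": 400.0}

q = source_slack(rat, delay)  # the sink with the smallest margin decides the slack
```

A negative result means at least one sink misses its required arrival time, which is the situation buffer insertion tries to repair.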

8 Previous Research
Maximum slack:
van Ginneken [ISCAS 90]: O(n^2) time and space, where n is the number of buffer positions.
Lillis, Cheng and Lin [TCAS 96]: O(b^2 n^2) time and space for b buffer types.
Shi and Li [DAC 03]: O(n log n) time for 2-pin nets, O(n log^2 n) time for multi-pin nets, and O(n log n) space.
Minimum buffer cost (area, power, etc.):
Lillis, Cheng and Lin [TCAS 96]: pseudo-polynomial time algorithm.
Shi, Li and Alpert [ASPDAC 04]: buffer cost minimization is NP-hard if b is a variable.

9 Outline: Introduction; O(b^2 n^2) Algorithm; New O(bn^2) Algorithm; Experimental Results; Extension; Conclusion

10 Dynamic Programming
Each candidate solution of a sub-tree is represented by a (Q, C) pair, where Q is slack and C is downstream capacitance.
For any two candidates A_1 and A_2 of the same sub-tree, if $Q(A_1) \le Q(A_2)$ and $C(A_1) \ge C(A_2)$, then A_1 is redundant.
O(b^2 n^2) time dynamic programming algorithm (Lillis-Cheng-Lin):
For b buffer types, the number of candidates is at most bn + 1.
For a wire, update the (Q, C) value of every candidate in O(bn) time.
For a buffer position, add b new candidates in O(b^2 n) time.
For a branch point, merge two sets of candidates in O(bn_1 + bn_2) time.
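The redundancy rule for (Q, C) pairs can be illustrated with a small pruning routine. This is a sketch under assumptions: candidates are plain tuples, the name `prune_redundant` is hypothetical, and the sort-then-scan approach is one simple way to apply the rule, not necessarily how the authors implement it.

```python
def prune_redundant(cands):
    """Keep only non-redundant (Q, C) pairs, in decreasing Q and decreasing C order."""
    # Sort by decreasing Q; on equal Q, put the smaller C first so it wins.
    ordered = sorted(cands, key=lambda qc: (-qc[0], qc[1]))
    kept = []
    for q, c in ordered:
        # Every kept candidate has slack >= q, so (q, c) is useful only if
        # its capacitance is strictly smaller than the smallest kept so far.
        if not kept or c < kept[-1][1]:
            kept.append((q, c))
    return kept


# Hypothetical input: (18, 6) and (5, 2) are dominated and should disappear.
pruned = prune_redundant([(21, 5), (19, 4), (18, 6), (15, 3), (7, 2), (6, 1), (5, 2)])
```

The surviving list is strictly decreasing in both Q and C, which is the ordering the later slides rely on.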

11 Dynamic Programming
Each candidate solution of a sub-tree is represented by a (Q, C) pair, where Q is slack and C is downstream capacitance.
For any two candidates A_1 and A_2 of the same sub-tree, if $Q(A_1) \le Q(A_2)$ and $C(A_1) \ge C(A_2)$, then A_1 is redundant.
O(bn^2) time dynamic programming algorithm (this paper):
For b buffer types, the number of candidates is at most bn + 1.
For a wire, update the (Q, C) value of every candidate in O(bn) time.
For a buffer position, add b new candidates in O(bn) time.
For a branch point, merge two sets of candidates in O(bn_1 + bn_2) time.
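The linear-time branch merge can be sketched as a two-pointer walk. The slides do not spell out the merge rule, so this example assumes the usual one for this kind of dynamic program: the merged slack is the minimum of the two branch slacks and the merged capacitance is their sum. The name `merge_branches` and the sample lists are hypothetical.

```python
def merge_branches(left, right):
    """Merge two non-redundant (Q, C) lists, each in decreasing Q and C order."""
    i, j = len(left) - 1, len(right) - 1  # start at the smallest Q of each list
    merged = []
    while i >= 0 and j >= 0:
        (ql, cl), (qr, cr) = left[i], right[j]
        # Assumed merge rule: slack is the worse of the two, capacitances add.
        merged.append((min(ql, qr), cl + cr))
        # Move past the branch that limits the slack (both on a tie).
        if ql <= qr:
            i -= 1
        if qr <= ql:
            j -= 1
    merged.reverse()  # restore decreasing Q and C order
    return merged


result = merge_branches([(10, 4), (6, 2)], [(8, 3), (5, 1)])
```

Each step advances at least one pointer, so the walk touches every entry at most once, which matches the O(bn_1 + bn_2) bound quoted on the slide.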

12 Data Structure: Linked List
Use a linked list to store non-redundant candidates, sorted in decreasing Q and decreasing C order.
Each entry also contains the list of buffer positions.
[Figure: linked list (Q_1, C_1), (Q_2, C_2), (Q_3, C_3), with arrows labeled "Less Capacitance" and "Better Slack"]

13 Best Candidates
For each buffer B_i, R(B_i) is the buffer driver resistance, C(B_i) is the buffer input capacitance, and t(B_i) is the buffer intrinsic delay.
Label buffers in non-decreasing order of resistance: $R(B_1) \le R(B_2) \le \dots \le R(B_b)$.
For each buffer type B_i:
Define the best candidate $\alpha_i$ as the candidate that maximizes slack among all candidates after B_i is inserted. The new slack is $Q(\alpha_i) - R(B_i) C(\alpha_i) - t(B_i)$.
Define the new candidate $\beta_i$ as the candidate formed by $\alpha_i$ with buffer type B_i.
How to find all best candidates quickly is the key question addressed in this paper.

14 Example
Three buffer types: R(B_1) = 1, C(B_1), t(B_1); R(B_2) = 3, C(B_2), t(B_2); R(B_3) = 5, C(B_3), t(B_3).
Candidates (Q, C): (21, 5), (19, 4), (15, 3), (7, 2), (6, 1).
Insert B_1: (16 - t(B_1), C(B_1)), (15 - t(B_1), C(B_1)), (12 - t(B_1), C(B_1)), (5 - t(B_1), C(B_1)), (5 - t(B_1), C(B_1)).
Best candidate for B_1 is $\alpha_1$ = (21, 5), and the new candidate is $\beta_1$ = (16 - t(B_1), C(B_1)).
Insert B_2: (6 - t(B_2), C(B_2)), (7 - t(B_2), C(B_2)), (6 - t(B_2), C(B_2)), (1 - t(B_2), C(B_2)), (3 - t(B_2), C(B_2)).
Best candidate for B_2 is $\alpha_2$ = (19, 4), and the new candidate is $\beta_2$ = (7 - t(B_2), C(B_2)).
Insert B_3: (-4 - t(B_3), C(B_3)), (-1 - t(B_3), C(B_3)), (0 - t(B_3), C(B_3)), (-3 - t(B_3), C(B_3)), (1 - t(B_3), C(B_3)).
Best candidate for B_3 is $\alpha_3$ = (6, 1), and the new candidate is $\beta_3$ = (1 - t(B_3), C(B_3)).
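The numbers in this example can be checked with a brute-force search that tries every candidate for every buffer type, which is the O(b^2 n) step the new algorithm avoids. The helper name `best_candidate` is hypothetical; the intrinsic delay t(B_i) is left out because it shifts every candidate by the same amount and does not change which one is best.

```python
def best_candidate(cands, r):
    """Return the (Q, C) pair that maximizes Q - r*C for driver resistance r."""
    return max(cands, key=lambda qc: qc[0] - r * qc[1])


cands = [(21, 5), (19, 4), (15, 3), (7, 2), (6, 1)]

for r in (1, 3, 5):
    q, c = best_candidate(cands, r)
    print(r, (q, c), q - r * c)
# prints:
# 1 (21, 5) 16
# 3 (19, 4) 7
# 5 (6, 1) 1
```

The capacitance of the best candidate drops from 5 to 4 to 1 as the resistance grows, which is the ordering that the later slides state as Lemma 1.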

15 Outline: Introduction; O(b^2 n^2) Algorithm; New O(bn^2) Algorithm; Experimental Results; Extension; Conclusion

16 (Q, C) Plane
The non-redundant (Q, C) list is a monotonically decreasing sequence.
As resistance is added, Q values change.
[Figure: candidates A_1 (21, 5), A_2 (19, 4), A_3 (15, 3), A_4 (7, 2), A_5 (6, 1) plotted in the (Q, C) plane]

17 R(B_1) = 1, Q = Q - R(B_1)*C
A_1 (21 - 5, 5)

18 R(B_2) = 3, Q = Q - R(B_2)*C
A_1 (21 - 15, 5)

19 R(B_3) = 5, Q = Q - R(B_3)*C
A_1 (21 - 25, 5)

20 As R Increases, Q Decreases

21 Best Q Values Move to Left
Best Q for each R

22 Best Candidates are in Decreasing Order of C
Lemma 1: $C(\alpha_1) \ge C(\alpha_2) \ge \dots \ge C(\alpha_b)$.
This ordering alone is not enough for an O(bn) algorithm to find all best candidates: a global search is still needed.
[Figure: best candidates $\alpha_1$, $\alpha_2$, $\alpha_3$ marked in the (Q, C) plane]

23 Convex Pruning
Convex pruning prunes candidates like A_4.
[Figure: candidates A_1 to A_5 in the (Q, C) plane, with A_4 marked "Pruned"]

24 Before Convex Pruning
[Figure: candidate list labeled "Non-Convex"]

25 After Convex Pruning
[Figure: best candidates $\alpha_1$, $\alpha_2$, $\alpha_3$ marked on the pruned list]

26 Convex Hull
After convex pruning, the remaining list is a convex hull.
Lemma 3: Best candidates must be on the convex hull. A candidate is on the convex hull if and only if there exists a resistance R such that, when R is added, this candidate gives the maximum Q.
Lemma 4: On the convex hull, if A_i gives the maximum Q among its neighboring candidates, then A_i gives the maximum Q among all candidates.
The slope $(Q_i - Q_j)/(C_i - C_j)$ between candidates A_i and A_j is the added resistance at which the two give equal slack; with more resistance than that, the candidate with the smaller C has the better slack.
On the convex hull, slopes are in sorted order.
Local optimal $\Rightarrow$ global optimal.

27 Local Optimal $\Rightarrow$ Global Optimal
For any R(B_i), if A_2 gives better slack than A_1 and A_3, then A_2 is the best candidate for B_i.
[Figure: candidates A_1, A_2, A_3, A_5 on the convex hull]
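The slope condition can be made concrete with the five candidates from the earlier example. In this sketch the function name `crossover_resistances` is hypothetical, and the list is assumed to be in strictly decreasing C order so that no denominator is zero.

```python
def crossover_resistances(cands):
    """Slopes (Q_i - Q_j)/(C_i - C_j) between neighbors of a list in decreasing C order."""
    return [(q1 - q2) / (c1 - c2) for (q1, c1), (q2, c2) in zip(cands, cands[1:])]


before = [(21, 5), (19, 4), (15, 3), (7, 2), (6, 1)]
after = [(21, 5), (19, 4), (15, 3), (6, 1)]  # A_4 = (7, 2) removed

print(crossover_resistances(before))  # [2.0, 4.0, 8.0, 1.0]
print(crossover_resistances(after))   # [2.0, 4.0, 4.5]
```

Before pruning the slopes are out of order around A_4, so A_4 can never be the best candidate for any resistance. After pruning the slopes increase along the list, and comparing a candidate with its two neighbors is enough to decide whether it is the best one.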

28-31 Find Convex Hull: Graham's Scan
Since the points are sorted, Graham's scan can perform convex pruning in linear time.
[Figure sequence: successive steps of the scan in the (Q, C) plane]
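A stack-based scan in the style of Graham's scan is one way to carry out convex pruning on an already sorted list. The sketch below assumes a non-redundant list in strictly decreasing C order; the name `convex_prune` and the choice to drop collinear middle points are assumptions made for this illustration.

```python
def convex_prune(cands):
    """Scan (Q, C) pairs in strictly decreasing C order and keep the convex hull."""
    hull = []
    for q, c in cands:
        while len(hull) >= 2:
            (q1, c1), (q2, c2) = hull[-2], hull[-1]
            # The last hull point stays only if the slope into it is smaller
            # than the slope out of it. Cross-multiplied to avoid division;
            # both capacitance differences are positive for this ordering.
            if (q1 - q2) * (c2 - c) < (q2 - q) * (c1 - c2):
                break
            hull.pop()
        hull.append((q, c))
    return hull


hull = convex_prune([(21, 5), (19, 4), (15, 3), (7, 2), (6, 1)])
```

Every candidate is pushed once and popped at most once, so the scan is linear in the list length. On the example list it removes only (7, 2), the candidate labeled A_4 on the Convex Pruning slide.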

32 New Subroutine for Adding Buffer
At each buffer position, given the (Q, C) list N in decreasing C order and the buffer library, where $R(B_1) \le R(B_2) \le \dots \le R(B_b)$:
Generate a new (Q, C) list A_1, A_2, ... from N with convex pruning.
Generate the new candidates $\beta_1$, $\beta_2$, ... with the following loop (O(bn) together with the pruning step):
  Initialize j = 1, then for i = 1 to b do
    while A_{j+1} exists and gives slack at least as good as A_j under R(B_i) do j = j + 1
    Generate the new candidate $\beta_i$ for buffer B_i:
      $Q(\beta_i) = Q(A_j) - R(B_i) C(A_j) - t(B_i)$
      $C(\beta_i) = C(B_i)$
Sort the $\beta_i$ in non-increasing C order (O(b log b)).
Insert the $\beta_i$ into the original list N (O(bn)).
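The pruning step and the pointer loop of this subroutine can be sketched together as follows. This is an illustration under assumptions, not the authors' code: the function names are hypothetical, buffers are given as (R, C_in, t) tuples in non-decreasing R order, the buffer input capacitances and intrinsic delays are made-up numbers, and the final sort and insertion into N are left out.

```python
def convex_prune(cands):
    """Keep the convex hull of (Q, C) pairs given in strictly decreasing C order."""
    hull = []
    for q, c in cands:
        while len(hull) >= 2:
            (q1, c1), (q2, c2) = hull[-2], hull[-1]
            if (q1 - q2) * (c2 - c) < (q2 - q) * (c1 - c2):
                break
            hull.pop()
        hull.append((q, c))
    return hull


def new_buffer_candidates(cands, buffers):
    """Return one new (Q, C) candidate per buffer type.

    cands:   non-redundant (Q, C) list in decreasing C order.
    buffers: (R, C_in, t) tuples in non-decreasing R order (assumed layout).
    """
    hull = convex_prune(cands)
    j = 0  # index of the current best hull candidate; it never moves back
    new = []
    for r, c_in, t in buffers:
        # Advance while the next hull point is at least as good under resistance r.
        while (j + 1 < len(hull)
               and hull[j + 1][0] - r * hull[j + 1][1] >= hull[j][0] - r * hull[j][1]):
            j += 1
        q, c = hull[j]
        new.append((q - r * c - t, c_in))
    return new


# Hypothetical library: resistances follow the slide example, C_in and t are invented.
buffers = [(1, 3, 2), (3, 2, 1), (5, 1, 1)]
new = new_buffer_candidates([(21, 5), (19, 4), (15, 3), (7, 2), (6, 1)], buffers)
```

Because the index j only moves forward, the loop costs time proportional to the hull length plus b, rather than the hull length times b. With these inputs the selected hull points are (21, 5), (19, 4), and (6, 1), the same best candidates a brute-force search over the example finds.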

33 O(bn^2) Algorithm
Dynamic programming:
For b buffer types, the number of candidates is at most bn + 1.
For a wire, update the (Q, C) value of every candidate in O(bn) time.
For a buffer position, add b new candidates in O(bn) time.
For a branch point, merge two sets of candidates in O(bn_1 + bn_2) time.
Total complexity is O(bn^2).

34 Outline: Introduction; O(b^2 n^2) Algorithm; New O(bn^2) Algorithm; Experimental Results; Extension; Conclusion

35 Speedup over O(b^2 n^2) Algorithm
net1: 337 sinks; net2: 1944 sinks; net3: 2676 sinks

36 Speedup vs. Buffer Positions
Buffer library size: 64

37 Outline: Introduction; O(b^2 n^2) Algorithm; New O(bn^2) Algorithm; Experimental Results; Extension and Conclusion

38 Extension to Min Buffer Cost
Buffer cost is associated with area and power.
Find a solution that satisfies the slack requirement and, at the same time, has minimum buffer cost.
Each candidate solution is represented by a (Q, C, W) triple, where Q is slack, C is capacitance, and W is buffer cost.
The problem is NP-hard in the worst case.
Our algorithm can reduce the cost of adding a buffer from O(bN) to O(N), where N is the number of non-redundant candidates.

39 Conclusion
New O(bn^2) algorithm for optimal buffer insertion with b buffer types.
Best candidates must be in decreasing order of C.
Best candidates must be on the convex hull.
Local optimal $\Rightarrow$ global optimal.
Applicable to cost minimization and inverting buffer types.

40 Thank You!

