A Faster Approximation Scheme for Timing Driven Minimum Cost Layer Assignment Shiyan Hu*, Zhuo Li**, and Charles J. Alpert** *Dept of ECE, Michigan Technological.

1 A Faster Approximation Scheme for Timing Driven Minimum Cost Layer Assignment Shiyan Hu*, Zhuo Li**, and Charles J. Alpert** *Dept of ECE, Michigan Technological University **IBM Austin Research Lab

2 Outline: Introduction; Problem Formulation; Linear time dynamic programming; Bound independent oracle search; The Algorithm; Experimental Results; Conclusion

3 Layer Assignment. [Figure: wire layers labeled 1X, 2X and 4X.] In 45nm technology, layer assignment is critical for timing and buffer area optimization.

4 Wire RC and Delay. A wire in a higher layer has much smaller delay.

5 Impact to Buffering. A buffer can drive a longer distance in a higher layer. Timing is improved. Fewer buffers are needed.

6 Impact to Routing/Buffering. [Figure with two blocks labeled IP.]

7 Problem Formulation. Find a minimal cost layer assignment such that the timing constraint is satisfied. Given: a buffered Steiner tree with n wire segments; a timing constraint; m wire layers with RC parameters and cost. A layer refers to a pair of horizontal and vertical layers with similar RC characteristics. Between any buffers, one layer is used. [Figure labels: same layer; can be different layers.] In the early design stage, when the buffering effect is considered, wire shaping is not important [Alpert TCAD'01]. In the post-routing stage, wire shaping could improve timing, reduce vias, reduce coupling, and so forth.

8 Fully Polynomial Time Approximation Scheme (FPTAS). Provably good: within $(1+\epsilon)$ of the optimal cost for any $\epsilon > 0$. Runs in time polynomial in $n$ (segments), $m$ (layers) and $1/\epsilon$. Ultimate solution for an NP-hard problem in theory. Highly practical. [Figure: wire layers labeled 1X, 2X and 4X.]

9 Previous Work in ICCAD'08. It depends on $M$, the ratio between the upper and lower bounds of the cost of the optimal layer assignment, and uses a DP of $O(mn^3/\epsilon^2)$ time (an iterative DP with incremental $W$). The new FPTAS uses a bound independent oracle query, its DP needs one run for all $W$, and it runs in $O(mn^2/\epsilon)$ time.

10 The Rough Picture. $W^*$: the cost of the optimal solution. Flow: make a guess on $W^*$ and check it; if the guess is good (close to $W^*$), return the solution; if it is not good, make another guess. Key 1: efficient checking. Key 2: smart guess.

11 Key 1: Efficient Checking. Benefit of a guess: only maintain the solutions with cost no greater than the guessed cost. This accelerates the DP.

12 The Oracle. Oracle($x$): the checker, able to decide whether $x > W^*$ or not, without knowing $W^*$, and answering efficiently. Flow: set up upper and lower bounds of the cost $W^*$; guess $x$ within the bounds; query Oracle($x$); update the bounds.

13 Construction of Oracle($x$). Only interested in whether there is a solution with cost up to $x$ satisfying the timing constraint. Scale and round each wire cost, then perform DP on the scaled problem with cost bound $n/\epsilon$. Time is polynomial in $n/\epsilon$.

14 Scaling and Rounding. [Figure: wire cost axis with marks at $0$, $x\epsilon/n$, $2x\epsilon/n$, $3x\epsilon/n$, $4x\epsilon/n$.] Wire cost is an integer after scaling and rounding, with upper bound $n/\epsilon$, so the total number of solutions in the DP is bounded. Rounding error at each wire is at most $x\epsilon/n$; total rounding error is at most $x\epsilon$. Larger $x$: larger error, fewer distinct costs, and faster. Smaller $x$: smaller error, more distinct costs, and slower. Rounding is the reason for the acceleration.
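
As a rough illustration of the scaling step on this slide, the sketch below divides each wire cost by the rounding factor xε/n and rounds down to an integer. The function name `scale_and_round` and the sample costs are hypothetical, and rounding down (rather than up) is an assumption, since the slide only bounds the per-wire error.

```python
import math

def scale_and_round(costs, x, eps):
    """Scale each wire cost by x*eps/n and round down to an integer.

    Rounding down is an assumption; the slide only states that the
    per-wire rounding error is bounded by x*eps/n.
    """
    n = len(costs)
    factor = x * eps / n  # rounding factor x*eps/n from the slide
    scaled = [math.floor(c / factor) for c in costs]
    return scaled, factor

# Hypothetical wire costs for n = 4 segments, with guess x = 20 and eps = 0.5.
costs = [3.0, 7.5, 1.2, 6.0]
scaled, factor = scale_and_round(costs, x=20.0, eps=0.5)

# Error introduced per wire when the integer costs are scaled back.
errors = [c - s * factor for c, s in zip(costs, scaled)]
total_error = sum(errors)
```

With these numbers the rounding factor is 2.5, each per-wire error stays below it, and the total error stays below xε = 10.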

15 Dynamic Programming Results. The DP runs with all $w$ being integers $\le n/\epsilon$. Yes, there is a solution satisfying the timing constraint: with cost rounding back, the solution has cost at most $(n/\epsilon)(x\epsilon/n) + x\epsilon = (1+\epsilon)x$, so $W^* < (1+\epsilon)x$. No, there is no such solution: with cost rounding back, any solution has cost at least $(n/\epsilon)(x\epsilon/n) = x$, so $W^* \ge x$.
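
The yes/no test can be sketched on a deliberately simplified model. The assumptions here are not from the slides: delays along the segments simply add up, there are no buffers or tree branches, and each segment picks one (delay, cost) option per layer. The name `oracle` and the sample data are hypothetical.

```python
import math

def oracle(segments, timing_budget, x, eps):
    """Return True if some assignment meets timing with scaled cost <= n/eps.

    segments: one list per segment of (delay, cost) pairs, one pair per layer.
    Additive delays are a simplifying assumption of this sketch.
    """
    n = len(segments)
    factor = x * eps / n          # rounding factor x*eps/n
    bound = math.floor(n / eps)   # cost budget of the scaled problem
    inf = float("inf")
    # best[w] = smallest total delay reachable with scaled cost exactly w
    best = [inf] * (bound + 1)
    best[0] = 0.0
    for options in segments:
        nxt = [inf] * (bound + 1)
        for w, d in enumerate(best):
            if d == inf:
                continue
            for delay, cost in options:
                w2 = w + math.floor(cost / factor)
                if w2 <= bound:  # drop anything over the guessed budget
                    nxt[w2] = min(nxt[w2], d + delay)
        best = nxt
    return min(best) <= timing_budget

# Hypothetical instance: 3 segments, 2 layers each, timing budget 10.
# Its true optimum W* is 9 (found by enumerating all 8 assignments).
segments = [[(5, 1), (2, 4)], [(6, 1), (3, 3)], [(4, 2), (1, 5)]]
answer_low = oracle(segments, 10, x=4.0, eps=0.5)    # (1+eps)*x = 6 < W*
answer_high = oracle(segments, 10, x=12.0, eps=0.5)  # x = 12 >= W*
```

In this sketch a "yes" implies W* ≤ (1+ε)x and a "no" implies W* > x, which matches the two cases on the slide up to the strictness of the inequalities.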

16 Solution Characterization. To model the effect on upstream, a candidate solution is associated with: $v$, a node; $Q$, the required arrival time; $W$, the cumulative wire cost.

17 Cost ($W$)-Bounded Dynamic Programming (DP). Start from the sinks. Candidate solutions are generated and propagated toward the source. Two operations: subtree processing; solution update at buffer. Solution pruning.

18 Subtree Processing. Three paths: $p_a$: $a \to u$; $p_b$: $b \to u$; $p_c$: $c \to u$. Wires are in the same layer $l$. $Q_u(l) = \min\{Q_a - d(p_a, l),\ Q_b - d(p_b, l),\ Q_c - d(p_c, l)\}$. $W_u(l) = W_a + W_b + W_c + w(T, l)$. [Figure: node $u$ with solution $(Q_u, W_u)$ and downstream solutions $(Q_a, W_a)$, $(Q_b, W_b)$, $(Q_c, W_c)$.]
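
The two formulas on this slide can be evaluated directly. In the minimal sketch below, the lookup tables `path_delay` and `tree_cost` stand in for $d(p, l)$ and $w(T, l)$; their values, and the function name `process_subtree`, are hypothetical.

```python
def process_subtree(children, path_delay, tree_cost, layer):
    """Combine one (Q, W) solution per child into a solution at node u.

    children:   dict child name -> (Q, W)
    path_delay: dict (child name, layer) -> delay d(p, l) of the path to u
    tree_cost:  dict layer -> wire cost w(T, l) of the subtree on that layer
    """
    # Q_u(l) = min over children of Q_child - d(p_child, l)
    q_u = min(q - path_delay[(name, layer)] for name, (q, _w) in children.items())
    # W_u(l) = sum of child costs + w(T, l)
    w_u = sum(w for _q, w in children.values()) + tree_cost[layer]
    return q_u, w_u

# Hypothetical data: three children a, b, c and two layers 0 and 1.
children = {"a": (10.0, 2), "b": (12.0, 3), "c": (9.0, 1)}
path_delay = {("a", 0): 3.0, ("b", 0): 4.0, ("c", 0): 1.0,
              ("a", 1): 1.0, ("b", 1): 2.0, ("c", 1): 0.5}
tree_cost = {0: 2, 1: 5}

candidates = {l: process_subtree(children, path_delay, tree_cost, l)
              for l in tree_cost}
```

Here the faster layer 1 gives a better required arrival time (8.5 versus 7.0) at a higher cumulative cost (11 versus 8), which is the trade-off the DP has to track.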

19 Exponential # of Solutions. $W$ ($= n/\epsilon$) solutions at each downstream buffer. Naïve merging takes $O(W^k)$ time with $k$ branches. [Figure: $k$ branches $a$, $b$, $c$ with solutions $(Q_{a,1}, W_{a,1}), \ldots, (Q_{a,4}, W_{a,4})$, and likewise for $b$ and $c$, merged at $u$.] For two solutions at a node with the same $W$, the one with smaller $Q$ is dominated. Try to generate only non-dominated solutions, since most of the $O(W^k)$ solutions are dominated.
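
The dominance rule stated here (same $W$, smaller $Q$ loses) can be sketched as a single pass that keeps the largest $Q$ per cost. The function name `prune` and the sample solutions are hypothetical.

```python
def prune(solutions):
    """Keep, for each cost W, only the solution with the largest Q.

    solutions: iterable of (Q, W) pairs at one node.
    Returns the survivors as (Q, W) pairs sorted by W.
    """
    best = {}
    for q, w in solutions:
        if w not in best or q > best[w]:
            best[w] = q
    return [(best[w], w) for w in sorted(best)]

# Hypothetical candidate list with two ties on W.
pruned = prune([(5, 1), (7, 1), (6, 2), (4, 2), (9, 3)])
```

Because costs are integers bounded by $W$ after scaling, at most $W$ solutions survive per node in this sketch.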

20 Multi-Way Merging. If the best $Q$ for cost $w$ is obtained by merging $Q(a_1^{i_1}), Q(a_2^{i_2}), \ldots, Q(a_k^{i_k})$, where $i_1 + i_2 + \cdots + i_k = w$, then the best $Q$ for cost $w+1$ is obtained by $\max_{1 \le r \le k} \min\{Q(a_1^{i_1}), Q(a_2^{i_2}), \ldots, Q(a_r^{i_r+1}), \ldots, Q(a_k^{i_k})\}$.
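
One step of this update rule can be sketched as follows. The layout `branch_q[r][i]` (best $Q$ of branch $r$ at cost $i$), the function name `advance_one`, and the sample table are assumptions for illustration. The sketch applies the rule as stated and is not tuned to the $O(mkW)$ bound; it recomputes each minimum from scratch.

```python
def advance_one(branch_q, idx):
    """Apply the slide's rule once: from indices summing to w, get cost w+1.

    branch_q[r][i]: best Q of branch r at cost i (hypothetical layout).
    idx:            current index per branch, summing to w.
    Returns (best Q for cost w+1 under the rule, the chosen new indices).
    """
    best_q, best_idx = None, None
    for r in range(len(idx)):
        if idx[r] + 1 >= len(branch_q[r]):
            continue  # branch r has no solution at the next cost
        trial = list(idx)
        trial[r] += 1  # spend the extra unit of cost on branch r
        q = min(branch_q[j][trial[j]] for j in range(len(trial)))
        if best_q is None or q > best_q:
            best_q, best_idx = q, trial
    return best_q, best_idx

# Hypothetical three-branch table; current solution uses indices (1, 1, 0), w = 2.
branch_q = [[1, 4, 6], [2, 3, 8], [5, 5, 7]]
q_next, idx_next = advance_one(branch_q, [1, 1, 0])
```

In this example the current merged $Q$ is min(4, 3, 5) = 3, and the extra unit of cost is best spent on the second branch, the one that currently limits $Q$.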

21 Four-Branch Example. Solution (w=8, Q=9) is shown. To compute Solution (w=9, Q):

22 Four-Branch Example, Case 1. Candidate Solution (w=9, Q=8)

23 Four-Branch Example, Case 2. Candidate Solution (w=9, Q=4)

24 Four-Branch Example, Case 3. Candidate Solution (w=9, Q=5)

25 Four-Branch Example, Case 4. Candidate Solution (w=9, Q=7)

26 Linear Time Multi-Way Merging. Lemma: given a subtree with $m$ layers, $k$ branches and $W$ non-dominated solutions at each downstream buffer, one can merge them in $O(mkW)$ time.

27 Solution Update at Buffer. After merging, there is one non-dominated solution per layer per cost, $O(mW)$ solutions in total. For each cost, find the largest $Q$ over all layers after the buffer and propagate it. [Figure: node $u$ with solution $(Q_u, W_u)$ and downstream solutions $(Q_a, W_a)$, $(Q_b, W_b)$, $(Q_c, W_c)$.]
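
A minimal sketch of this per-cost selection follows. The slides do not give the buffer delay model, so `after_buffer` is a hypothetical callback that maps a $Q$ on a given layer to the $Q$ seen upstream of the buffer; the constant one-unit delay in the example is made up.

```python
def update_at_buffer(per_layer, after_buffer):
    """For each cost, keep the largest Q over all layers after the buffer.

    per_layer:    dict layer -> dict cost -> Q (one solution per layer per cost)
    after_buffer: hypothetical callback (Q, layer) -> Q upstream of the buffer
    Returns dict cost -> best Q to propagate.
    """
    best = {}
    for layer, by_cost in per_layer.items():
        for w, q in by_cost.items():
            q_buf = after_buffer(q, layer)
            if w not in best or q_buf > best[w]:
                best[w] = q_buf
    return best

# Hypothetical merged solutions on two layers; the buffer costs 1.0 time unit.
per_layer = {0: {2: 7.0, 3: 8.0}, 1: {3: 9.5, 4: 10.0}}
propagated = update_at_buffer(per_layer, lambda q, layer: q - 1.0)
```

The pass touches each of the $O(mW)$ merged solutions once, in line with the solution count on the slide.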

28 Cost-Bounded DP. Lemma: given a tree with $n$ wire segments and $m$ layers, the optimal layer assignment subject to cost budget $W = n/\epsilon$ can be computed in $O(mnW) = O(mn^2/\epsilon)$ time.

29 Key 2: Bound Independent Guess. $U$ ($L$): upper (lower) bound on $W^*$. Naive binary search style approach: set $U$ and $L$ on $W^*$; query Oracle($x$) with $x = (U+L)/2$ and $W = n/\epsilon$; if $W^* < (1+\epsilon)x$, set $U = (1+\epsilon)x$; if $W^* \ge x$, set $L = x$. Runtime depends on the initial bounds $U$ and $L$.

30 Adapt $\epsilon$. Rounding factor $x\epsilon/n$ for cost. Larger $\epsilon$: faster, with rough estimation. Smaller $\epsilon$: slower, with accurate estimation. Adapt $\epsilon$ and relate it to $U$ and $L$.

31 U/L Related Scale & Round. [Figure: wire cost axis starting at 0, with labels $U/L$ and $x\epsilon/n$.]

32 Conceptually. Begin with a large $\epsilon'$ and progressively reduce it according to $U/L$ as $x$ approaches $W^*$. Set $\epsilon'$ as a geometric sequence ..., 8, 4, 2, 1, 1/2, ..., $\epsilon$. One run of DP takes about $O(n/\epsilon')$ time. Total runtime is $O(\cdots + n/8 + n/4 + n/2 + \cdots + n/\epsilon) = O(n/\epsilon)$, independent of the number of iterations.
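
The runtime bound on this slide is a geometric series. As a simplified derivation, assume the sequence of $\epsilon'$ values exactly halves at every step until it reaches $\epsilon$, and that one DP run at $\epsilon'$ costs $n/\epsilon'$. Indexing the runs backward from the last one, with $j = 0$ for the final run at $\epsilon' = \epsilon$ and $j \ge 1$ for the earlier run at $\epsilon' = 2^{j}\epsilon$:

```latex
\[
  \sum_{j \ge 0} \frac{n}{2^{j}\epsilon}
  \;=\; \frac{n}{\epsilon} \sum_{j \ge 0} 2^{-j}
  \;\le\; \frac{2n}{\epsilon}
  \;=\; O\!\left(\frac{n}{\epsilon}\right)
\]
```

The last run dominates, so the bound does not grow with the number of earlier, coarser runs.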

33 Oracle Query Till $U/L < 2$

34 When $U/L < 2$. Scale and round each cost by $L\epsilon/n$. Run DP with $W = 2n/\epsilon$. There is at least one feasible solution; otherwise there would be no solution with cost $(2n/\epsilon)(L\epsilon/n) = 2L \ge U$. Pick the min cost solution satisfying timing at the driver. Runs in $O(mn^2/\epsilon)$ time.

35 FPTAS for Layer Assignment. Theorem: a $(1+\epsilon)$ approximation to the timing constrained minimum cost layer assignment problem can be computed in $O(mn^2/\epsilon)$ time for any $\epsilon > 0$.

36 The Algorithmic Flow. Set $U$ and $L$ of $W^*$. Loop: adapt $\epsilon = [U/L - 1]^{1/2}$; set $x = [UL/(1+\epsilon)]^{1/2}$; query Oracle($x$); update $U$ or $L$. When $U/L < 2$, compute the final solution.
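
The loop in this flow can be sketched with a mock oracle. Two things in the sketch are assumptions rather than the authors' code. First, it takes ε = sqrt(U/L) − 1, one reading of the adaptive formula; with that choice each query shrinks U/L to (U/L)^(3/4). Second, `mock_oracle` cheats by knowing a hypothetical optimum `W_STAR`, where the real oracle would run the cost-bounded DP.

```python
import math

def search_bounds(oracle, lower, upper):
    """Shrink [lower, upper] around W* until upper / lower < 2.

    oracle(x, eps) must return True only if W* < (1 + eps) * x
    and False only if W* >= x, as on the oracle slides.
    """
    queries = 0
    while upper / lower >= 2:
        eps = math.sqrt(upper / lower) - 1  # assumed adaptive epsilon
        x = math.sqrt(upper * lower / (1 + eps))
        if oracle(x, eps):
            upper = (1 + eps) * x   # "yes": W* < (1 + eps) * x
        else:
            lower = x               # "no": W* >= x
        queries += 1
    return lower, upper, queries

W_STAR = 37.0  # hypothetical optimum, known only to the mock oracle

def mock_oracle(x, eps):
    # Stand-in for the scaled DP: "yes" exactly when a cost-x solution exists.
    return W_STAR <= x

lower, upper, queries = search_bounds(mock_oracle, 1.0, 1000.0)
```

Once the loop exits, the final DP with costs scaled by Lε/n and budget 2n/ε would be run, which is not part of this sketch.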

37 Experiments. Experimental setup: 1000 industrial nets. Compared to dynamic programming and the previous FPTAS [ICCAD'08].

38 Cost Ratio Compared to DP. [Plot: wire cost ratio versus approximation ratio $\epsilon$.]

39 Speedup Compared to DP. [Plot: speedup versus approximation ratio $\epsilon$.]

40 Observations. FPTAS always achieves the theoretical guarantee. Larger $\epsilon$ leads to more speedup. 3.9x faster with 2.2% additional wire area compared to DP. Up to 6.5x faster than DP. On average about 2x faster than the previous FPTAS.

41 Conclusion. Propose a $(1+\epsilon)$ approximation for timing constrained layer assignment for any $\epsilon > 0$, running in $O(mn^2/\epsilon)$ time: linear time DP running in $O(mnW)$ time; bound independent oracle query; up to 6.5x faster than DP and 2x faster than the previous FPTAS; a few percent additional wire area compared to DP, as guaranteed theoretically.

42 Thanks

