Delay Optimization using SOP Balancing


1 Delay Optimization using SOP Balancing
Alan Mishchenko and Robert Brayton (UC Berkeley), Stephen Jang (Agate Logic Inc.), Victor Kravets (IBM)

2 Outline
  Delay optimization is an important problem
  AIG is used for synthesis and mapping
  Minimizing AIG level leads to delay reduction
  Contribution: an algorithm for AIG level minimization
    Very simple idea, remarkable consequences!
  Experimental results
  Conclusions

3 Delay Models
  AIG level count
    Useful to measure delay before mapping
  Library delay model
    Useful to measure delay after mapping
  Both metrics are approximate
    The real delay should be measured after P&R

4 AIG Definition and Examples
  AIG is a Boolean network composed of two-input ANDs and inverters
  Structure 1: F(a,b,c,d) = ab + d(ac’+bc)
    6 nodes, 4 levels
  Structure 2: F(a,b,c,d) = ac’(b’d’)’ + bc(a’d’)’ = ac’(b+d) + bc(a+d)
    7 nodes, 3 levels
  [Figures: Karnaugh map over cd/ab and AIG drawing over inputs a, b, c, d for each structure]
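
The two structures on this slide trade one extra node for one fewer level while computing the same function. A minimal sketch that checks this by exhaustive truth-table comparison follows; the function names are illustrative and not part of the original slides.

```python
from itertools import product

def f_structure1(a, b, c, d):
    # F = ab + d(ac' + bc): the 6-node, 4-level structure
    return (a and b) or (d and ((a and not c) or (b and c)))

def f_structure2(a, b, c, d):
    # F = ac'(b + d) + bc(a + d): the 7-node, 3-level structure
    return (a and not c and (b or d)) or (b and c and (a or d))

def equivalent(f, g, num_inputs):
    """Compare two Boolean functions on every input assignment."""
    return all(
        bool(f(*bits)) == bool(g(*bits))
        for bits in product([False, True], repeat=num_inputs)
    )

print(equivalent(f_structure1, f_structure2, 4))
```

Both expressions expand to ab + ac'd + bcd, so the check succeeds on all 16 assignments.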

5 Three Simple Tricks
  Structural hashing
    Makes sure AIG is stored in a compact form
    Is applied during AIG construction
    Propagates constants
    Makes each node structurally unique
  Complemented edges
    Represents inverters as attributes on the edges
    Leads to fast, uniform manipulation
    Does not use memory for inverters
    Increases logic sharing using DeMorgan’s rule
  Memory allocation
    Uses fixed amount of memory for each node
    Can be done by a simple custom memory manager
    Even dynamic fanout manipulation is supported!
    Allocates memory for nodes in a topological order
    Optimized for traversal in the same topological order
    Small static memory footprint for many applications
    Computes fanout information on demand
  [Figure: the same logic over inputs a, b, c, d drawn without hashing and with hashing]
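
The first two tricks can be sketched in a few lines. This is a toy model under stated assumptions, not ABC's data structure: the class name `Aig`, the literal encoding (2 * node id + complement bit, with node 0 as constant false), and all method names are hypothetical choices made for illustration.

```python
class Aig:
    """Toy AIG with structural hashing and complemented edges.

    A literal is 2 * node_id + complement_bit. Node 0 is the constant,
    so literal 0 is constant false and literal 1 is constant true.
    """

    def __init__(self):
        self.fanins = [None]   # fanin literal pair per node (None for constant/inputs)
        self.table = {}        # (lit0, lit1) -> node id, the structural hash table

    def new_input(self):
        self.fanins.append(None)
        return 2 * (len(self.fanins) - 1)

    @staticmethod
    def negate(lit):
        # An inverter is just the low bit of the edge, so it costs no node.
        return lit ^ 1

    def and_(self, x, y):
        if x > y:
            x, y = y, x            # canonical fanin order
        if x == 0:
            return 0               # 0 & y = 0
        if x == 1:
            return y               # 1 & y = y
        if x == y:
            return x               # y & y = y
        if x == (y ^ 1):
            return 0               # y & y' = 0
        node = self.table.get((x, y))
        if node is None:           # create the node only if it is new
            self.fanins.append((x, y))
            node = len(self.fanins) - 1
            self.table[(x, y)] = node
        return 2 * node

    def or_(self, x, y):
        # DeMorgan: x + y = (x' y')'
        return self.negate(self.and_(self.negate(x), self.negate(y)))

    def num_ands(self):
        return len(self.table)


aig = Aig()
a, b = aig.new_input(), aig.new_input()
n1 = aig.and_(a, b)
n2 = aig.and_(b, a)                            # hashes to the same node
n3 = aig.or_(aig.negate(a), aig.negate(b))     # a' + b' reuses the node for ab
print(n1 == n2, n3 == aig.negate(n1), aig.num_ands())
```

Because the lookup happens inside `and_`, hashing and constant propagation are applied during construction, and the DeMorgan sharing in the last call falls out of representing OR as an AND with complemented edges.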

6 AIG: A Unifying Representation
  An underlying data structure for various computations
    Representing both local and global functions
    Used in rewriting, resubstitution, simulation, SAT sweeping, induction, etc.
  A unifying representation for the whole flow
    Synthesis, mapping, verification pass around AIGs
    Stores multiple structures for mapping (‘AIG with choices’)
  The main functional representation in ABC
    Foundation of ‘contemporary’ logic synthesis
    Source of ‘signature features’ (speed, scalability, etc.)

7 AND-balancing
  Step 1: Covering
    Cover AIG with non-overlapping multi-input ANDs of largest size without duplication
  Step 2: Tree decomposition
    In a topological order, perform tree decomposition of multi-input ANDs to reduce delay

8 Step 1: Covering
  Cover AIG with non-overlapping multi-input ANDs of largest size without logic duplication (unique result)
  [Figure: covering of an example AIG with signals a through g]

9 Step 2: Tree Decomposition
  In a topological order, decompose each multi-input AND using the arrival times of its fanins, creating a tree of two-input ANDs that minimizes the AIG level (non-unique result)
  [Figure: tree decomposition of the covered example AIG with signals a through g]
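
One common way to perform this decomposition is a Huffman-style greedy: repeatedly combine the two earliest-arriving inputs. The sketch below assumes unit delay per two-input AND and uses hypothetical function names; it illustrates the idea and is not taken from ABC.

```python
import heapq

def balanced_arrival(arrivals):
    """Output level of a multi-input AND built as a tree of 2-input ANDs.

    arrivals: non-empty list of fanin arrival times (levels).
    Greedy rule: always combine the two earliest-arriving signals.
    """
    heap = list(arrivals)
    heapq.heapify(heap)
    while len(heap) > 1:
        t0 = heapq.heappop(heap)
        t1 = heapq.heappop(heap)
        heapq.heappush(heap, max(t0, t1) + 1)
    return heap[0]

def chain_arrival(arrivals):
    """Output level when the inputs are ANDed left to right in a chain."""
    t = arrivals[0]
    for a in arrivals[1:]:
        t = max(t, a) + 1
    return t

late_first = [3, 0, 0, 0]   # one late input followed by three early ones
print(chain_arrival(late_first), balanced_arrival(late_first))
```

The result is non-unique in structure because ties between equal arrival times can be broken either way without changing the output level.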

10 AND-balancing
  Step 1: Covering
  Step 2: Tree decomposition
  [Figure: example AIG with signals a through h carried through both steps; delay is 5 levels before balancing and 3 levels after]

11 SOP-balancing
  For each node, in a topological order
    Compute several k-input cuts
    For each cut
      Compute truth table
      Compute SOP
      Perform delay-optimal tree balancing of the SOP
    Select the cut with the smallest output arrival time
    If there is an improvement in the output arrival time, update AIG structure

12 SOP-balancing (Illustration)
  Example: F = abc + def + g
  [Figure: original AIG structure with output arrival time D, and Cut1, Cut2, Cut3 with output arrival times D1, D2, D3, over inputs a, b, c, d, e, f, g]
  SOP-balancing = AND-balancing for each cube and for the sum.
  Given a set of cuts at a node (Cut1, Cut2, etc.), choose the cut whose balanced SOP has the smallest output arrival time Di among the arrival times of all cuts (D1, D2, etc.).
  If Di < D, replace the original AIG structure with the AIG structure of the balanced SOP.
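
The evaluation of one cut and the choice among cuts can be sketched as follows. This is a simplified model under stated assumptions: an SOP is a list of cubes, each cube a list of input names; literal polarity is ignored because complemented edges make inverters free; an OR costs one level, like an AND. The names `balance`, `sop_balance`, and `select_cut` are hypothetical.

```python
import heapq

def balance(arrivals):
    """Level of a balanced tree of 2-input gates over the given arrival times."""
    heap = list(arrivals)
    heapq.heapify(heap)
    while len(heap) > 1:
        t0 = heapq.heappop(heap)
        t1 = heapq.heappop(heap)
        heapq.heappush(heap, max(t0, t1) + 1)
    return heap[0]

def sop_balance(cubes, arrival):
    """Return (delay, area) of the balanced SOP.

    delay: output arrival time in levels of 2-input ANDs.
    area:  number of 2-input ANDs in the tree decomposition.
    """
    cube_times = [balance([arrival[x] for x in cube]) for cube in cubes]
    delay = balance(cube_times)                      # balance the sum
    area = sum(len(cube) - 1 for cube in cubes) + len(cubes) - 1
    return delay, area

def select_cut(cut_delays, current_delay):
    """Index of the cut with the smallest arrival time, or None if no gain."""
    best = min(range(len(cut_delays)), key=cut_delays.__getitem__)
    return best if cut_delays[best] < current_delay else None

# F = abc + def + g with all inputs arriving at level 0
cubes = [["a", "b", "c"], ["d", "e", "f"], ["g"]]
arrival = {name: 0 for name in "abcdefg"}
print(sop_balance(cubes, arrival))
```

In this model a late-arriving g is absorbed for free: raising its arrival time to 3 leaves the output at level 4, because the two three-literal cubes are combined first.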

13 Why Does SOP-balancing Reduce Delay More Than AND-balancing?
  AND-balancing is limited to multi-input ANDs
  SOP-balancing looks at larger functions
  In many cases, AND-balancing cannot reduce delay while SOP-balancing can
  Example
    Factored form: F = ab + c(d + ef), 4 levels
    SOP: F = ab + cd + cef, 3 levels
  [Figure: AIG drawings of both forms]
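
The level counts in this example can be reproduced with a small sketch. It assumes all inputs arrive at level 0, treats OR as costing one level like AND, and represents expressions as nested tuples; the representation and function names are illustrative only.

```python
import heapq
from itertools import product

def balance(arrivals):
    """Level of a balanced tree of 2-input gates over the given arrival times."""
    heap = list(arrivals)
    heapq.heapify(heap)
    while len(heap) > 1:
        t0 = heapq.heappop(heap)
        t1 = heapq.heappop(heap)
        heapq.heappush(heap, max(t0, t1) + 1)
    return heap[0]

def level(expr):
    """Level of an expression whose multi-input operators are balanced."""
    if isinstance(expr, str):
        return 0                       # primary input
    _op, *children = expr
    return balance([level(c) for c in children])

def evaluate(expr, env):
    if isinstance(expr, str):
        return env[expr]
    op, *children = expr
    values = [evaluate(c, env) for c in children]
    return all(values) if op == "and" else any(values)

# F = ab + c(d + ef): operators alternate, so no AND has more than two inputs
factored = ("or", ("and", "a", "b"),
                  ("and", "c", ("or", "d", ("and", "e", "f"))))
# F = ab + cd + cef: a three-input OR over cubes that can each be balanced
sop = ("or", ("and", "a", "b"), ("and", "c", "d"), ("and", "c", "e", "f"))

names = "abcdef"
same = all(
    evaluate(factored, dict(zip(names, bits))) == evaluate(sop, dict(zip(names, bits)))
    for bits in product([False, True], repeat=len(names))
)
print(level(factored), level(sop), same)
```

The factored form offers AND-balancing nothing to regroup, since every gate already has two inputs, while flattening to an SOP exposes a three-input sum whose cubes arrive early enough to save a level.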

14 Implementation
  The proposed algorithm is implemented by customizing the priority-cut-based technology mapper if in ABC
  Command is if -g [-K <num>] [-C <num>]
    -g uses SOP-balancing for cut evaluation
    -K <num> specifies the cut size
    -C <num> specifies the number of cuts considered at a node
  Cost functions used to prioritize the cuts
    Delay: the arrival time of the output
      Measured using the number of levels of 2-input ANDs
    Area: the size of the tree decomposition of the SOP
      Measured using the number of 2-input ANDs
  A. Mishchenko, S. Cho, S. Chatterjee, and R. Brayton, "Combinational and sequential mapping with priority cuts", Proc. ICCAD '07, pp

15 Experimental Setup (St. Cells)
  Used a suite of industrial designs
    Removed two outliers with very big delay improvement
  Used MCNC standard-cell library from SIS distribution
  Performed 3 runs (^n means the parenthesized script is repeated n times)
    Reference: (st; dch; map)^4
    Run 1: (st; if -K 6 -g -C 8)(st; dch; map)^4
    Run 2: (st; if -K 6 -g -C 8)^2(st; dch; map)^6
  Runtime impact
    The runtime of if -g is close to the runtime of one round of mapping (about 60 sec for a design with 500K AIG nodes)

16 Experiments: Industrial Designs

17 Experiments: Example

18 Results after P&R

19 Experimental Setup (4-LUTs)
  Used a set of academic benchmarks from previous work
  Performed 4-LUT mapping (unit-delay, unit-area)
  Used 8-input cuts (with 8 cuts per node) during SOP balancing
  Performed 3 experimental runs (^n denotes n repetitions)
    Baseline: (st; if -K 4)
    Choices: Baseline + (st; dch; if -K 4)^5
    SOP balancing: Baseline + (st; dch; if -K 4)^2; (st; if -g -C 8 -K 8); (st; dch; if -K 4)^3

20 Experiments: Mapping into 4-LUTs

21 Conclusion
  Introduced AIGs
  Illustrated AND-balancing
    A known way to reduce AIG level (command “balance” in ABC)
  Introduced SOP-balancing
    A new way to reduce AIG level (command “if -g; st” in ABC)
  Performed experiments on industrial benchmarks
    Delay reduction after mapping correlates with AIG level reduction
    For standard cells (before placement)
      Achieved 30% delay reduction with 2.4% area increase
      Achieved 41% delay reduction with 3.9% area increase
    For standard cells (after placement)
      Achieved 20% improvement in FOM and 5% area reduction
    For 4-LUT FPGA mapping (before placement)
      Achieved 16% delay reduction with 9% area increase
  Future work
    Try with a more realistic gate library
    Try on highly optimized ASIC designs

