ECE 667 Synthesis and Verification of Digital Systems: Technology Mapping for FPGAs (D. Chen, J. Cong, DAOMap)


1 ECE 667 Synthesis & Verification - FPGA Mapping
ECE 667 Synthesis and Verification of Digital Systems
Technology Mapping for FPGAs
Reference: D. Chen, J. Cong, "DAOMap: A Depth-optimal Area Optimization Mapping Algorithm for FPGA Designs," ICCAD 2004

2 FPGA Mapping (LUT-based)
How is it different from ASIC (standard-cell) mapping?
–Structural in nature, simpler
–Any function with k inputs can be mapped into a k-LUT
–Typically implemented by cut mapping
FPGA architecture: k-LUT. Example: F = x1'x2' + x1x2
x1 x2 | F
0  0  | 1
0  1  | 0
1  0  | 0
1  1  | 1
[Figure: 2-input LUT with inputs x1 and x2, output F, and 0/1 programming bits P]
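The claim that any k-input function fits in a k-LUT follows from the LUT being a stored truth table. A minimal Python sketch of that idea is given here; the function names `program_lut` and `eval_lut` and the choice of x1 as the most significant address bit are illustrative assumptions, not part of the slides.

```python
from itertools import product

def program_lut(func, k):
    """Return the 2**k programming bits of a k-LUT that implements func.

    Bits are listed in truth-table order: (0,...,0) first, (1,...,1) last.
    """
    return [int(bool(func(*bits))) for bits in product((0, 1), repeat=k)]

def eval_lut(programming_bits, inputs):
    """Read the LUT: the inputs form an address, first input is the MSB."""
    address = 0
    for x in inputs:
        address = (address << 1) | x
    return programming_bits[address]

# F = x1'x2' + x1x2, the example function from the slide.
xnor_bits = program_lut(lambda x1, x2: (not x1 and not x2) or (x1 and x2), 2)
print(xnor_bits)                    # programming bits in truth-table order
print(eval_lut(xnor_bits, (1, 1)))  # value of F for x1 = 1, x2 = 1
```

Because the table has one entry per input combination, the same two functions work for any Boolean function of k inputs; only the 2**k programming bits change.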

3 FPGA Mapping: example
A possible mapping onto 3-LUTs; each block has at most 3 inputs
[Figure: example network with nodes a, b, c, d, e, f, g, h]

4 Definitions
DAG: Boolean network
Cone C_v: sub-network rooted on node v
k-feasible cone: |input(C_v)| ≤ k
Fanin cone F_v: the largest C_v
k-feasible cut: the input set of a k-feasible cone C_v
Unit delay model:
–Each LUT contributes one unit of delay
Cut rooted on node C: cut with output C
[Figure: node v with inputs a, b, c, d, e, showing the fanin cone F_v reaching back to the PIs and a 3-feasible cone C_v; delay of 2]

5 Problem Formulation
Delay-optimal Area Optimization problem
–Given: a Boolean network; an integer k (LUT size)
–Goal: cover the network with k-feasible cones (k-LUTs) such that
  mapping depth (delay) is minimum
  area (number of LUTs) is minimized
Area minimization is an NP-hard problem
A two-step process
–Cut enumeration + evaluation (delay, area)
–Cut selection to minimize delay
–Possible iteration to remap nodes on non-critical paths (area recovery)
–Takes node duplication into consideration

6 Cut Enumeration
Process nodes in topological order from PIs to POs
Combine sub-cuts of the fanin nodes to create a new cut
If the size of the cut exceeds k (LUT size), discard the cut
[Figure: network with nodes a, b, c, d, w, x, y, z, showing two sub-cuts merged into a new cut]
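The three enumeration steps can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not DAOMap's implementation: the network encoding (`fanins` dictionary plus a topological `order`), the function name `enumerate_cuts`, and the small example network are all hypothetical.

```python
from itertools import product

def enumerate_cuts(fanins, order, k):
    """Enumerate all k-feasible cuts of every node.

    fanins: node -> list of fanin nodes (PIs are absent or map to []).
    order:  all nodes in topological order, PIs first.
    Returns node -> set of cuts, each cut a frozenset of leaf nodes.
    """
    cuts = {}
    for v in order:
        result = {frozenset([v])}            # trivial cut: the node itself
        ins = fanins.get(v, [])
        if ins:
            # Pick one sub-cut per fanin and merge their leaves.
            for combo in product(*(cuts[u] for u in ins)):
                merged = frozenset().union(*combo)
                if len(merged) <= k:         # discard cuts larger than the LUT
                    result.add(merged)
        cuts[v] = result
    return cuts

# Hypothetical network: x = f(a, b), y = f(b, c), z = f(x, y).
fanins = {"x": ["a", "b"], "y": ["b", "c"], "z": ["x", "y"]}
order = ["a", "b", "c", "x", "y", "z"]
cuts3 = enumerate_cuts(fanins, order, k=3)
cuts2 = enumerate_cuts(fanins, order, k=2)
print(sorted(sorted(c) for c in cuts3["z"]))
```

With k = 3 the root z keeps the cut {a, b, c}, which covers the whole network in one LUT; with k = 2 every merged cut with three leaves is discarded.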

7 Delay Propagation (k = 3)
Delay is computed using dynamic programming
The longest best delay on the POs is the optimal mapping delay
[Figure: network with nodes a, b, c, d, e, f, g, w, x, y, z, annotated with the delay of each candidate cut and the optimal delay (1 or 2) at each node]
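Under the unit delay model, the dynamic program can be sketched as below: the delay of a cut is one plus the largest best delay among its leaves, and the best delay of a node is the minimum over its cuts. The function name `propagate_delay`, the data layout, and the example cut sets are assumptions made for illustration.

```python
def propagate_delay(cuts, order, primary_inputs):
    """Compute the best (minimum) mapping delay of every node.

    cuts:  node -> iterable of non-trivial cuts (frozensets of leaves).
    order: all nodes in topological order, PIs first.
    Returns (delay, best_cut) dictionaries.
    """
    delay, best_cut = {}, {}
    for v in order:
        if v in primary_inputs:
            delay[v] = 0                     # PIs arrive at time 0
            continue
        for cut in cuts[v]:
            d = 1 + max(delay[u] for u in cut)   # one LUT level on top of leaves
            if v not in delay or d < delay[v]:
                delay[v] = d
                best_cut[v] = cut
    return delay, best_cut

# Hypothetical 3-feasible cuts for x = f(a, b), y = f(b, c), z = f(x, y).
cuts = {
    "x": [frozenset("ab")],
    "y": [frozenset("bc")],
    "z": [frozenset("xy"), frozenset("xbc"), frozenset("aby"), frozenset("abc")],
}
order = ["a", "b", "c", "x", "y", "z"]
delay, best_cut = propagate_delay(cuts, order, primary_inputs={"a", "b", "c"})
mapping_delay = max(delay[po] for po in ["z"])   # longest best delay over POs
print(delay, mapping_delay)
```

In this example three of the four cuts at z give delay 2, while the cut {a, b, c} gives delay 1, so the optimal mapping delay is 1.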

8 Area Estimation
Tries to estimate area considering the fanout effect
A_C = \sum_{i \in input(C)} [A_i / f(i)] + U_C
–A_i: estimated area of the fanin cone of signal i
–f(i): fanout number of input i
–U_C: area of the cut itself
Can underestimate area due to node duplication
[Figure: nodes m, n, o, p, q, r, s, t, u with cuts C, C_t, and C_u; labels f(p) = 2, A_p, A_s / 2]
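The estimate A_C = \sum_i [A_i / f(i)] + U_C translates directly into code. The sketch below assumes U_C = 1 (one LUT per cut) and uses made-up area and fanout values; the name `cut_area` is hypothetical.

```python
def cut_area(cut, node_area, fanout, unit_area=1.0):
    """Estimated area of a cut: shared input areas plus the cut's own area.

    Each input's cone area A_i is divided by its fanout f(i), so a cone
    feeding several cuts is charged only partially to each of them.
    """
    return sum(node_area[i] / fanout[i] for i in cut) + unit_area

# Made-up values: inputs p and s each drive two fanouts.
node_area = {"p": 2.0, "s": 1.0}
fanout = {"p": 2, "s": 2}
area_c = cut_area(frozenset(["p", "s"]), node_area, fanout)
print(area_c)   # 2.0/2 + 1.0/2 + 1.0
```

Dividing by the fanout is what makes the estimate optimistic: if the mapper later duplicates p inside another LUT, the full cost of its cone is paid more than once, which is the underestimation the slide warns about.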

9 Duplication Cost Adjustment
Considers potential node duplications
Check the sub-cuts for multiple fanouts
Area adjusted by addition of duplication cost
Duplication cost parameters:
–N_Cf: number of nodes contained in subcut C_f
–I_C: cut size of C
–f_i: fanout number of the subcut
[Figure: new cut C with I_C = 4 built from subcuts C_f1 and C_f2; C_f2 has N_Cf2 = 1 and multiple fanouts; nodes m, n, o, p, q, r, s]

10 Cost (Area) Function of a Cut
Some key parameters
–I_C: cut size of C
–N_C: number of nodes covered by C
–f(v): fanout number of the root node v
–P_f: duplication cost
[Figure: node v with fanin1 and fanin2, inputs a, b, c, d, e, and cuts C_1, C_2, C_3]

11 Cut Selection
Once cuts are generated, traverse the network from POs to PIs and select cuts that map into LUTs
Select cuts such that timing is met and the area is minimized
Iterative cut selection procedure
–Local cost adjustment: input sharing, slack distribution, cut probing
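The PO-to-PI traversal can be sketched as follows. This minimal Python example assumes one cut has already been chosen per node (`chosen_cut`, a hypothetical input); DAOMap's actual choice depends on the cost adjustments listed above, which are not modeled here.

```python
def select_cover(chosen_cut, primary_outputs, primary_inputs):
    """Walk from POs toward PIs and collect the LUTs that are really needed.

    chosen_cut: node -> the cut (frozenset of leaves) selected for that node.
    Returns node -> cut for every node that becomes the root of a LUT.
    """
    mapped = {}
    stack = list(primary_outputs)
    while stack:
        v = stack.pop()
        if v in primary_inputs or v in mapped:
            continue                  # PIs need no LUT; skip nodes already mapped
        mapped[v] = chosen_cut[v]     # v becomes the root of one LUT
        stack.extend(chosen_cut[v])   # the leaves of its cut must be implemented too
    return mapped

# Hypothetical network x = f(a, b), y = f(b, c), z = f(x, y) with chosen cuts.
chosen_cut = {
    "x": frozenset("ab"),
    "y": frozenset("bc"),
    "z": frozenset("xbc"),            # z absorbs y, so y needs no LUT of its own
}
luts = select_cover(chosen_cut, ["z"], {"a", "b", "c"})
print(sorted(luts), len(luts))
```

Only nodes reached as cut leaves become LUT roots, so the area of the mapping is the number of entries in the returned dictionary.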

12 Local Cost Adjustment: Slack Distribution
Slack_C = Req_v - 1 - \max_{i \in input(C)} (Arr_i)
–Req_v: required time of the root v of cut C
–\max (Arr_i): largest arrival time among the inputs of C
If Slack_C < 0, C is not a timing-feasible cut
The larger Slack_C is, the better C is in terms of the slack distribution effect
[Figure: network with nodes a, b, c, d, w, x, y, z; the root d is labeled with its required time Req_d]
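The slack formula can be evaluated per cut as in the following sketch. The arrival and required times are made-up numbers, and the function names `cut_slack` and `timing_feasible` are hypothetical.

```python
def cut_slack(cut, required_root, arrival):
    """Slack_C = Req_v - 1 - max(Arr_i) over the inputs i of the cut.

    The constant 1 is the unit delay of the LUT that implements the cut.
    """
    return required_root - 1 - max(arrival[i] for i in cut)

def timing_feasible(cut, required_root, arrival):
    """A cut is timing-feasible when its slack is not negative."""
    return cut_slack(cut, required_root, arrival) >= 0

# Made-up timing data: PIs arrive at 0, internal nodes x and y at 1.
arrival = {"a": 0, "b": 0, "c": 0, "x": 1, "y": 1}
slack_xy = cut_slack(frozenset("xy"), required_root=2, arrival=arrival)
slack_abc = cut_slack(frozenset("abc"), required_root=2, arrival=arrival)
print(slack_xy, slack_abc)
```

Here the cut {a, b, c} leaves more slack than {x, y}, so it is the better cut with respect to slack distribution; tightening the required time to 1 would make {x, y} infeasible.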

13 Algorithm Recap
Generation of k-feasible cuts
Area propagation under timing constraints
–Optimal area at a node is the minimum area among the cuts that give minimum delay
More accurate representation of the cost function for a cut
Global duplication cost adjustment
Cut selection involving local cost adjustment

