Un/DoPack: Re-Clustering of Large System-on-Chip Designs with Interconnect Variation for Low-Cost FPGAs


1 Un/DoPack: Re-Clustering of Large System-on-Chip Designs with Interconnect Variation for Low-Cost FPGAs
Marvin Tom*, Xilinx Inc., San Jose, CA, USA (marvin.tom@xilinx.com)
David Leong, University of British Columbia, Vancouver, BC, Canada (davel@ece.ubc.ca)
Guy Lemieux, University of British Columbia, Vancouver, BC, Canada (lemieux@ece.ubc.ca)
*Work performed at University of British Columbia

2 Overview
Introduction, Goals and Motivation
–Reduce channel width, lower cost, make circuits “routable”
Benchmark Circuits
–Varying amount of interconnect variation
Un/DoPack CAD Tool
–Iterative channel width reduction by whitespace insertion
Results
Conclusion

4 Mesh-Based FPGA Architecture
9 logic blocks, 4 wires per channel: 3*4=12 total horizontal tracks
16 logic blocks, 4 wires per channel: 4*4=16 total horizontal tracks
Larger FPGAs have more “aggregate” interconnect
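The track arithmetic on this slide can be sketched in a few lines. This is only an illustration of the slide's own counting (one horizontal channel per row of logic blocks); the function name `total_horizontal_tracks` is made up for the example.

```python
import math


def total_horizontal_tracks(num_logic_blocks, wires_per_channel):
    """Aggregate horizontal tracks of a square mesh, counted as on the slide:
    one horizontal channel per row of logic blocks."""
    side = math.isqrt(num_logic_blocks)
    if side * side != num_logic_blocks:
        raise ValueError("expected a square array of logic blocks")
    return side * wires_per_channel


print(total_horizontal_tracks(9, 4))   # 3 rows * 4 wires = 12
print(total_horizontal_tracks(16, 4))  # 4 rows * 4 wires = 16
```

With the channel width held fixed, the total grows with the side length of the array, which is the sense in which a larger device offers more aggregate interconnect.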

5 Motivation: Area of FPGA Devices
Total Layout AREA = SIZE of Layout Tile * Number of Layout Tiles
[Figure: MCNC circuits mapped onto an FPGA]

6 Motivation: Channel Width Demand
Logic Range: User buys bigger device.
Interconnect Range: User has no choice!
Devices built for worst-case channel width (fixed width)
Interconnect dominates area (>70%)
[Figure: MCNC circuits mapped onto an FPGA]

7 Goal: Reduce Channel Width
Altera Cyclone: channel width constraint of 80 routing tracks
Constrained FPGA: channel width constraint of 60 routing tracks
–Smaller area, lower cost for low-channel-width circuits
But { apex4, elliptic, frisc, ex1010, spla, pdc } are unroutable…
Can we make them routable in a Constrained FPGA?

8 Possible Solution
Trade off logic utilization for channel width
–User can always buy more logic… (not more wires)
[Figure: FPGA 1 and FPGA 2, trading CLB count for channel width]
What about area?

9 Features and Costs of Two FPGA Families
Altera Device   LEs      Memory      Mult.   Routing   Cost
Cyclone 1C12    12,060   239,616     0       80        $56
Stratix 1S10    10,570   920,448     48      232       $190
Cyclone 1C20    20,060   294,912     0       80        $100
Stratix 1S20    18,460   1,669,249   80      232       $350
Sample Benchmark Circuit
–10,000 LEs
–150 Routing Tracks
–No Multipliers
–100 K Memory
Sample Benchmark Circuit
–20,000 LEs
–75 Routing Tracks
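One way to read this slide is as a device-selection problem: pick the cheapest part whose resources cover the circuit. The sketch below encodes the table and does that lookup. Several things are assumptions, not statements from the slide: the memory column is treated as bits, "100 K Memory" is taken as 100,000 bits, and the second sample circuit is given the same memory and multiplier needs as the first because the slide lists only its LEs and routing tracks. `cheapest_device` is a hypothetical helper name.

```python
DEVICES = [
    # (name, LEs, memory, multipliers, routing tracks, cost in USD)
    ("Cyclone 1C12", 12_060, 239_616, 0, 80, 56),
    ("Stratix 1S10", 10_570, 920_448, 48, 232, 190),
    ("Cyclone 1C20", 20_060, 294_912, 0, 80, 100),
    ("Stratix 1S20", 18_460, 1_669_249, 80, 232, 350),
]


def cheapest_device(les, tracks, memory=100_000, multipliers=0):
    """Return the lowest-cost device that meets every requirement, or None."""
    fits = [
        d for d in DEVICES
        if d[1] >= les and d[2] >= memory and d[3] >= multipliers and d[4] >= tracks
    ]
    return min(fits, key=lambda d: d[5], default=None)


# First sample circuit: 10,000 LEs needing 150 routing tracks.
print(cheapest_device(10_000, 150))
# Second sample circuit: 20,000 LEs needing 75 routing tracks.
print(cheapest_device(20_000, 75))
```

Under these assumptions the first circuit is forced onto the Stratix 1S10 ($190) by its track demand alone, while the second fits the Cyclone 1C20 ($100): twice the logic, about half the cost.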

11 GNL Circuit Benchmark Suite
Create benchmark circuits with variation
–SoC: Randomly integrate/stitch together “IP Blocks”
–IP Blocks have varied interconnect needs
Generate Netlist (GNL)
–Stroobandt @ Ghent University
–Synthetic benchmark generator
GNL circuits generated hierarchically
–Root → # I/Os, # IP blocks
–Second Level → 20 IP blocks, # LEs, Rent parameter

12 Rent Linear Interpolation
7 benchmark circuits
Average Rent = 0.62, Stdev Rent = 0 → 0.12
240/120 primary inputs/outputs
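The slide does not spell out how the Rent parameters are assigned to the 20 IP blocks, so the following is only a guess at one scheme consistent with its numbers: spread the per-block Rent parameters evenly around the average of 0.62 and scale the spread to hit a target standard deviation, with the seven targets stepped evenly from 0 to 0.12. The function name, the even spacing, and the use of population standard deviation are all assumptions.

```python
from statistics import mean, pstdev


def rent_parameters(n_blocks=20, avg=0.62, stdev=0.0):
    """Return n_blocks Rent parameters, evenly spaced around avg,
    scaled so their population standard deviation equals stdev."""
    # Evenly spaced offsets from -1 to +1, symmetric about zero.
    offsets = [2 * i / (n_blocks - 1) - 1 for i in range(n_blocks)]
    scale = stdev / pstdev(offsets)
    return [avg + scale * o for o in offsets]


# Seven target standard deviations, stepped evenly from 0 to 0.12.
targets = [0.12 * i / 6 for i in range(7)]
suite = [rent_parameters(stdev=s) for s in targets]

for s, rents in zip(targets, suite):
    print(f"stdev target {s:.2f}: min {min(rents):.3f}, max {max(rents):.3f}")
```

With a target of 0 every block gets Rent 0.62; as the target grows, the blocks fan out around the same mean, which is what lets the suite vary interconnect demand without changing the average.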

14 Un/DoPack Flow
Iterative non-uniform cluster depopulation tool
Step 1: Traditional SIS/VPR
Step 2: UnPack
–Congestion Calculator
Step 3: DoPack
–Incremental Re-Cluster
Step 4,5: Fast Place/Route
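The five steps read as a loop that repeats until the circuit routes within the channel width constraint. The skeleton below is a sketch of that control flow only. It is not the authors' tool: the stages are passed in as plain callables, and every name (`un_do_pack`, `pack`, `place_route`, `unpack`, `dopack`, the `"channel_width"` key) is hypothetical.

```python
def un_do_pack(netlist, target_width, pack, place_route, unpack, dopack,
               max_iters=50):
    """Iterate depopulation until the routed channel width meets target_width.

    pack(netlist) -> clustering                  (Step 1, initial clustering)
    place_route(clustering) -> dict with key "channel_width"  (Steps 1, 4, 5)
    unpack(result) -> depopulation targets       (Step 2, congestion calculator)
    dopack(clustering, targets) -> clustering    (Step 3, incremental re-cluster)
    """
    clustering = pack(netlist)
    result = place_route(clustering)
    for iteration in range(max_iters):
        if result["channel_width"] <= target_width:
            return clustering, result, iteration
        targets = unpack(result)
        clustering = dopack(clustering, targets)
        result = place_route(clustering)
    raise RuntimeError("channel width target not reached")


# Toy stand-ins: the "clustering" is just a CLB count, and each extra CLB
# is pretended to lower the required channel width by 5 tracks.
clbs, routed, iters = un_do_pack(
    netlist=None,
    target_width=80,
    pack=lambda n: 10,
    place_route=lambda c: {"channel_width": 100 - 5 * (c - 10)},
    unpack=lambda r: 1,
    dopack=lambda c, t: c + t,
)
print(clbs, routed["channel_width"], iters)
```

The toy model is deliberately crude; its only purpose is to show the trade the flow makes, spending extra CLBs each iteration to bring the channel width down.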

15 Un/DoPack Flow: SIS/VPR
Step 1: Traditional SIS/VPR

18 Un/DoPack Flow: UnPack
Step 2: UnPack
–Congestion Calculator

19 Un/DoPack Flow: UnPack
Step 2: UnPack
–Generate Congestion Map
–CLB Label = Largest channel-width (CW) occupancy in the 4 adjacent channels

20 Un/DoPack Flow: UnPack
Step 2: UnPack
–Depop Center = CLB with the largest label
[Figure: M x M array]
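The labelling rule and the choice of depopulation center can be sketched as follows. The data layout is an assumption made for the example: `h_occ[r][c]` holds the occupancy of the horizontal channel segment above row `r` at column `c` (so it has M+1 rows), and `v_occ[r][c]` holds the vertical segment to the left of column `c` at row `r` (so it has M+1 columns). Ties are broken by taking the first CLB in row-major order, which the slides do not specify.

```python
def clb_labels(h_occ, v_occ):
    """Label each CLB with the largest occupancy among its 4 adjacent channels."""
    m = len(v_occ)
    return [
        [
            max(h_occ[r][c], h_occ[r + 1][c], v_occ[r][c], v_occ[r][c + 1])
            for c in range(m)
        ]
        for r in range(m)
    ]


def depop_center(labels):
    """Return (row, col) of the CLB with the largest label."""
    cells = [(r, c) for r in range(len(labels)) for c in range(len(labels[r]))]
    return max(cells, key=lambda rc: labels[rc[0]][rc[1]])


# 2 x 2 array: 3 rows of horizontal segments, 3 columns of vertical segments.
h_occ = [[1, 2], [3, 4], [2, 9]]
v_occ = [[0, 5, 1], [2, 2, 3]]

labels = clb_labels(h_occ, v_occ)
print(labels)                # [[5, 5], [3, 9]]
print(depop_center(labels))  # (1, 1)
```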

21 Un/DoPack Flow: UnPack
Step 2: UnPack
–Option 1, Coarse Grain:
Depop Radius = M/4
Depop Amt: 1 new row/col in array
[Figure: M x M array]

22 Un/DoPack Flow: UnPack
Step 2: UnPack
–Option 2, Fine Grain:
Depop Radius = M/4, M/5, M/6, M/8
Depop Amt: 1 new row/col in region
[Figure: M x M array]
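The two options differ in how large a neighbourhood around the depopulation center is loosened and by how much. The sketch below selects the CLBs inside a radius of M/divisor and computes the number of CLB slots gained when the whole array grows by one row and one column, which is one reading of the coarse-grain amount. The square (Chebyshev-distance) region, the integer division, and both function names are assumptions; the slides give only the radius and the amount.

```python
def depop_region(m, center, divisor=4):
    """CLBs of an m x m array within radius m // divisor of center,
    using a square region around the center."""
    radius = m // divisor
    r0, c0 = center
    return [
        (r, c)
        for r in range(m)
        for c in range(m)
        if max(abs(r - r0), abs(c - c0)) <= radius
    ]


def coarse_extra_clbs(m):
    """CLB slots gained when an m x m array grows to (m + 1) x (m + 1)."""
    return (m + 1) ** 2 - m ** 2


region = depop_region(8, center=(4, 4))  # radius 8 // 4 = 2
print(len(region))           # 5 x 5 square = 25 CLBs
print(coarse_extra_clbs(8))  # 81 - 64 = 17

# Fine-grain radii for a 40 x 40 array with divisors 4, 5, 6, 8.
print([40 // d for d in (4, 5, 6, 8)])  # [10, 8, 6, 5]
```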

23 Un/DoPack Flow: DoPack
Step 3: DoPack
–Incremental Re-Cluster

25 Un/DoPack Flow: Fast P&R
Step 4,5: Fast Place/Route
Fast Placement
–UBC Incremental Placer (under development)
–VPR -fast
Fast Router
–Use illegal PathFinder solution from first iterations: unsuccessful so far
–Use full routed solution: slow but reliable

27 Un/DoPack: Baseline Flow
UnPack: Coarse-grained congestion calculator
DoPack: iRAC replica
Fast Place: UBC Incremental Placer
Fast Route: None
FPGA Architecture:
–LUT size (k) = 6
–Cluster size (N) = 16
–Inputs per cluster (I) = 51
–Wires of length (L) = 4
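The value I = 51 is consistent with a rule of thumb often used in FPGA architecture studies, I = (k/2)(N+1). The slide does not say how 51 was chosen, so treat the check below as an observation, not as the authors' derivation; `cluster_inputs` is a made-up name.

```python
def cluster_inputs(k, n):
    """Rule-of-thumb number of cluster inputs: I = (k / 2) * (N + 1)."""
    return k * (n + 1) // 2


print(cluster_inputs(6, 16))  # 6 * 17 / 2 = 51
```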

28 Area of GNL Benchmarks

29 Interconnect Variation: Impact on FPGA Architecture Design
High Variation Circuits Require Wide Channel Width

30 Critical Path of GNL Benchmarks

31 Un/DoPack Congestion Map
[Figure: congestion map before and after Un/DoPack]

32 Multi-Region Un-Pack
Depopulate multiple regions at once
–Depopulate each region separately
–Smaller radius = M/10
Handle overlapping regions

33 Normalized Area

34 Normalized Critical Path

35 Run-Time Comparisons

36 Conclusion
Un/DoPack: FPGA CAD flow
–Find “local” congestion → depopulate → reduced interconnect demand
FPGA benchmark circuit “suite”
–Stdev: Used to vary interconnect demand
Discoveries…
–“Non-uniform” depopulation limits area inflation
–“Interconnect variation” important for area inflation and FPGA architecture design
–“Routing closure” achieved by re-clustering and incremental place & route
UNROUTABLE circuits made ROUTABLE → buy an FPGA with MORE LOGIC!!!

37 End of Talk

