
1 Congestion-Driven Re-Clustering for Low-cost FPGAs MASc Examination Darius Chiu Supervisor: Dr. Guy Lemieux University of British Columbia Department of Electrical and Computer Engineering Vancouver, BC, Canada

2 Outline Motivation and background Algorithm Results Conclusion Future Work

3 Example: Unroutable Situation: Run circuit through VPR Circuit is unroutable at specified target channel-width

4 Example: Unroutable Situation: Run circuit through VPR Circuit is unroutable at specified target channel-width Only localized area is actually unroutable Routing congestion happens locally

5 Motivation Goal: Must meet hard channel-width constraint –Number of routing tracks is fixed at manufacture time –Must meet channel-width constraint everywhere on the FPGA Presented with an unroutable circuit –Increase available interconnect (use larger device) More interconnect everywhere = more expensive FPGA device –Decrease local interconnect demand Create more aggregate interconnect for congested regions only

6 Reclustering Congested Regions Find congested regions and reduce routing demand –Increase CLB usage to spread interconnect usage –Controlled tradeoff between CLB and interconnect usage –Cost savings: can use the same FPGA, just need to recluster

7 Un/DoPack CAD Flow Previous work by Marvin Tom [ICCAD2006] Target a channel width constraint Spread regional logic to reduce local routing demand –Identify congested local regions –Iteratively re-cluster, re-place, re-route –Whitespace insertion: re-cluster with reduced cluster size Leave uncongested regions alone

8 Background: Un/DoPack Cluster Place Route Re-cluster Identify Place Route Target CW Met? NO VPR Un/DoPack YES
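The loop in the flow diagram above can be sketched as follows. This is a minimal illustration, not the Un/DoPack implementation: the five callables (`cluster`, `place`, `route`, `identify`, `recluster`) and the `max_iters` guard are hypothetical stand-ins for the real CAD steps.

```python
def undopack_flow(cluster, place, route, identify, recluster,
                  target_cw, max_iters=20):
    """Repeat identify / re-cluster / place / route until the routed
    channel width meets target_cw (or max_iters is reached).

    All callables are hypothetical placeholders for the CAD tools:
      cluster()                 -> clustering
      place(clustering)         -> placement
      route(placement)          -> (channel_width, congestion_map)
      identify(congestion_map)  -> congested regions
      recluster(clustering, regions) -> new clustering
    """
    # Initial VPR-style pass: cluster, place, route.
    clustering = cluster()
    placement = place(clustering)
    channel_width, congestion = route(placement)

    iterations = 0
    # "Target CW met?" decision: NO -> run another Un/DoPack iteration.
    while channel_width > target_cw and iterations < max_iters:
        regions = identify(congestion)
        clustering = recluster(clustering, regions)
        placement = place(clustering)
        channel_width, congestion = route(placement)
        iterations += 1

    return channel_width, iterations
```

The iteration cap is an added safeguard for the sketch; the slides do not say how the flow terminates if the target is never met.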

9 Contributions Improve Runtime/Area tradeoff of Un/DoPack Flow –Simultaneous area and runtime savings Use congestion information to perform better reclustering –New approach to selecting congested regions –Use of interconnect-demand model to determine how much to spread logic Findings –Up to 5x runtime speedup versus Baseline –Up to 25% area savings versus Baseline

10 Contributions Recently accepted to FPT 2009 –D. Chiu, G. Lemieux, S. Wilton, “Congestion-Driven Re-Clustering for Low-cost FPGAs”

11 Previous Depopulation Schemes Single versus Multiregion: Region Selection: –Select all CLBs in area centered on most congested CLB (Single Region) –Select all CLBs in area centered on most congested CLB, not already chosen (Multiregion) Whitespace insertion: –Baseline: insert CLB in 1 row and 1 column in FPGA –Fine-grained : insert CLB in 1 row and 1 column in region Excellent area tradeoff, but slow –Multiregion: insert CLBs proportional to congestion Good runtime performance

12 Algorithm Region Selection: Try to select regions more intelligently –Capture congested regions instead of just CLBs Whitespace Insertion: Try to estimate appropriate cluster size –Use interconnect demand model to predict outcome for depopulation

13 Un/DoPack Cluster Place Route Re-cluster Identify Place Route Target CW Met? NO VPR Un/DoPack YES Region selection Whitespace Insertion

14 Benchmark Circuits Metacircuits designed to emulate large SOC circuits –Cluster size 16 –Built using benchmark generator GNL –Large circuit composed of smaller subcircuits (SoC style) –Each subcircuit emulates the interconnect complexity (Rent parameter) of individual MCNC circuits –The standard deviation of the rent parameter is varied to create benchmark suite

15 Region Selection Find congested regions –Post Routing congestion information Center region on most congested CLB

16 Region Selection Use congestion values to generate direction to move region

17 Region Selection Binary Search –Find region with highest average congestion

18 Region Selection Mark Next Region Sort by average congestion and depopulate in sorted order
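The region-selection steps above (center on the most congested CLB, move the region toward higher congestion, keep the region with the highest average congestion) can be sketched as below. The slides do not give the exact search procedure, so this sketch assumes a fixed-size square window and a step-halving hill climb; `grid` is a hypothetical per-CLB congestion map stored as a list of rows.

```python
def window_mean(grid, top, left, size):
    """Average congestion of the size x size window at (top, left)."""
    total = sum(sum(row[left:left + size]) for row in grid[top:top + size])
    return total / (size * size)


def clamp(value, low, high):
    return max(low, min(value, high))


def select_region(grid, size):
    """Return (top, left, mean) of a locally best window.

    Assumption: a step-halving hill climb stands in for the
    'binary search' named on the slide.
    """
    n_rows, n_cols = len(grid), len(grid[0])

    # Step 1: center the window on the most congested CLB.
    r0, c0 = max(((r, c) for r in range(n_rows) for c in range(n_cols)),
                 key=lambda rc: grid[rc[0]][rc[1]])
    top = clamp(r0 - size // 2, 0, n_rows - size)
    left = clamp(c0 - size // 2, 0, n_cols - size)
    best = window_mean(grid, top, left, size)

    # Step 2: shift toward higher average congestion, halving the step
    # whenever no direction improves the average.
    step = max(1, size // 2)
    while step >= 1:
        improved = False
        for dr, dc in ((-step, 0), (step, 0), (0, -step), (0, step)):
            t = clamp(top + dr, 0, n_rows - size)
            l = clamp(left + dc, 0, n_cols - size)
            mean = window_mean(grid, t, l, size)
            if mean > best:
                top, left, best, improved = t, l, mean, True
        if not improved:
            step //= 2
    return top, left, best
```

A hill climb like this can stop at a local maximum. Marking the chosen region and repeating the search on the remaining CLBs would produce the list of regions that is then sorted by average congestion.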

19 Budgeted Multiregion Un/DoPack (BMR) Multiple region approach Grow number of CLBs according to budget at each iteration –Number of CLBs in a row and column of the FPGA Each region grows equal to 1 row and 1 column of region
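One way to read the BMR budget is sketched below. It assumes that the per-iteration budget equals the CLBs gained by adding one row and one column to the whole FPGA, that each region costs the CLBs gained by adding one row and one column to that region, and that regions are served in order of average congestion. The exact accounting is not given on the slide; the region dictionaries and their keys are hypothetical.

```python
def row_plus_column(width, height):
    """CLBs gained when a width x height array grows by 1 row and 1 column."""
    return (width + 1) * (height + 1) - width * height  # width + height + 1


def pick_regions_within_budget(fpga_w, fpga_h, regions):
    """Choose regions, most congested first, until the budget runs out.

    regions: list of dicts with hypothetical keys
             'name', 'w', 'h', 'avg_congestion'.
    """
    budget = row_plus_column(fpga_w, fpga_h)
    chosen = []
    for region in sorted(regions, key=lambda r: r["avg_congestion"],
                         reverse=True):
        cost = row_plus_column(region["w"], region["h"])
        if cost > budget:
            break  # not enough budget left for this region
        chosen.append(region["name"])
        budget -= cost
    return chosen
```

Under these assumptions a 20 x 20 FPGA has a budget of 41 CLBs per iteration, enough for three 5 x 5 regions at 11 CLBs each.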

20 Adding Whitespace Congestion-Model Driven –Use interconnect demand information to estimate how much whitespace to add –Interconnect Model Estimate post clustering channel width for region

21 Regional Interconnect Adapt demand model for regions of CLBs instead of whole FPGA –Most wiring is from inside the region –Cannot affect wiring across region directly through depopulation

22 Regional Interconnect Assume external interconnect demand stays fixed Solve for internal interconnect demand region interconnect demand = Internal demand + external demand
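The decomposition above can be worked through with a small sketch. It assumes the region demand and the external demand are already known in routing tracks; how the external share is measured is not stated on the slide, so both inputs are hypothetical.

```python
def internal_demand_target(region_demand, external_demand, target_cw):
    """Split region demand into internal + external and return
    (current internal demand, internal demand allowed by target_cw).

    Assumption: external demand stays fixed under depopulation, so the
    whole reduction must come from the internal part.
    """
    internal_now = region_demand - external_demand
    internal_target = target_cw - external_demand
    if internal_target <= 0:
        # External wiring alone already uses the whole channel;
        # depopulating this region cannot meet the target.
        raise ValueError("target channel width not reachable by depopulation")
    return internal_now, internal_target
```

For example, a region that needs 60 tracks, 20 of them external, must bring its internal demand from 40 down to 30 to fit a 50-track channel.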

23 Interconnect-Demand Model (equation not recovered from the slide) Source: W. Fang and J. Rose. “Modeling routing demand for early-stage FPGA architecture development”

24 Interconnect-Demand Model Use congestion map to determine equation constants –Calibrate equation separately for each region Solve for lambda that gives desired channel width –Re-cluster region with lower cluster size until lambda target is met
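The last step, re-clustering with a lower cluster size until the lambda target is met, can be sketched as a simple search. The demand model itself is not reproduced here: `lambda_for_size` is a hypothetical callable that returns the lambda obtained when the region is re-clustered at a given cluster size, and the sketch assumes that a lambda at or below the target is acceptable.

```python
def choose_cluster_size(current_size, lambda_target, lambda_for_size,
                        min_size=1):
    """Lower the cluster size until lambda meets the target.

    lambda_for_size(size) is a hypothetical stand-in for re-clustering
    the region at `size` and evaluating the calibrated model.
    """
    for size in range(current_size, min_size - 1, -1):
        if lambda_for_size(size) <= lambda_target:
            return size
    # No size meets the target; fall back to the smallest allowed.
    return min_size
```

Starting the search at the current cluster size means a region that already meets its target is left unchanged.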

25 Congestion-Model Multiregion Un/DoPack (CMR) Same region selection method as BMR No constraint on new CLBs in each iteration Whitespace insertion using model

26 Results Typical results –Stdev004 CMR Speedup comparable to Multiregion Un/DoPack BMR Slightly faster than Baseline

27 Results Typical results –Stdev004 BMR area better than Multiregion CMR area better than Multiregion

28 Runtime / Area Tradeoff Previous Multiregion Approach (Fast) Previous Fine-Grained Approach (Good Area) Speed-Area Tradeoff

29 Runtime / Area Tradeoff BMR Improved runtime Good area performance

30 Runtime / Area Tradeoff CMR Better runtime Good area

31 Critical Path

32 Congestion Driven Placement We can further improve area performance using a congestion driven placer

33 Conclusions Use congestion information to perform better re-clustering –Up to 5x runtime speedup versus baseline –Up to 25% area savings versus baseline Improve Runtime/Area tradeoff of Un/DoPack Flow –Simultaneous area and runtime savings

34 Future Work Consider effect of neighboring Regions Other congestion-driven tools –Fast Placement

35 Questions?

36 Outline Motivation Background Multiregion approach Congestion-Driven Whitespace insertion

37 Related Work Un/DoPack [1] –Reduce interconnect usage to meet target channel width constraint Congestion Driven Clustering –iRAC, ISPL –Single-Pass Clustering

