Published by Wendy Parsons. Modified over 8 years ago.
1
Congestion-Driven Re-Clustering for Low-cost FPGAs
MASc Examination
Darius Chiu
Supervisor: Dr. Guy Lemieux
University of British Columbia, Department of Electrical and Computer Engineering, Vancouver, BC, Canada
2
Outline: Motivation and background, Algorithm, Results, Conclusion, Future Work
3
Example: Unroutable
Situation:
– Run circuit through VPR
– Circuit is unroutable at specified target channel-width
4
Example: Unroutable
Situation:
– Run circuit through VPR
– Circuit is unroutable at specified target channel-width
– Only a localized area is actually unroutable
– Routing congestion happens locally
5
Motivation
Goal: Must meet hard channel-width constraint
– Number of routing tracks is fixed at manufacture time
– Must meet channel-width constraint everywhere on the FPGA
Presented with an unroutable circuit
– Increase available interconnect (use larger device)
  – More interconnect everywhere = more expensive FPGA device
– Decrease local interconnect demand
  – Create more aggregate interconnect for congested regions only
6
Reclustering Congested Regions
Find congested regions and reduce routing demand
– Increase CLB usage to spread interconnect usage
– Controlled tradeoff between CLB and interconnect usage
– Cost savings: can use the same FPGA, just need to recluster
7
Un/DoPack CAD Flow
Previous work by Marvin Tom [ICCAD2006]
Target a channel width constraint
Spread regional logic to reduce local routing demand
– Identify congested local regions
– Iteratively recluster, re-place, reroute
– Whitespace insertion: recluster with reduced cluster size
Leave uncongested regions alone
8
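Background: Un/DoPack
[Flow diagram: the VPR stage (Cluster, Place, Route) feeds the decision "Target CW Met?"; YES exits, NO enters the Un/DoPack stage (Identify, Re-cluster, Place, Route) and returns to the decision.]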
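The loop in this flow can be sketched in a few lines. This is a minimal illustration, not the Un/DoPack implementation: `undopack_flow`, its parameters, and the `recluster_place_route` callable are hypothetical names, and the callable stands in for one full identify, re-cluster, place, and route pass that reports the resulting channel width.

```python
from typing import Callable, Tuple


def undopack_flow(
    initial_width: int,
    target_width: int,
    recluster_place_route: Callable[[int], int],
    max_iterations: int = 50,
) -> Tuple[bool, int, int]:
    """Iterate until the routed channel width meets the target.

    initial_width: channel width needed after the first cluster/place/route.
    recluster_place_route: hypothetical stand-in for one Un/DoPack pass;
        takes the current width and returns the width after the pass.
    Returns (target_met, final_width, iterations_used).
    """
    width = initial_width
    iterations = 0
    # "Target CW met?" is the loop condition; NO means another pass.
    while width > target_width and iterations < max_iterations:
        iterations += 1
        width = recluster_place_route(width)
    return width <= target_width, width, iterations


if __name__ == "__main__":
    # Toy pass that lowers the required width by 2 tracks each time (invented).
    print(undopack_flow(60, 50, lambda w: w - 2))
```

The iteration cap is an addition for safety in the sketch; the slides do not say how the flow handles a circuit that never converges.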
9
Contributions
Improve Runtime/Area tradeoff of Un/DoPack Flow
– Simultaneous area and runtime savings
Use congestion information to perform better reclustering
– New approach to selecting congested regions
– Use of interconnect-demand model to determine how much to spread logic
Findings
– Up to 5x runtime speedup versus Baseline
– Up to 25% area savings versus Baseline
10
Contributions
Recently accepted to FPT 2009
– D. Chiu, G. Lemieux, S. Wilton, “Congestion-Driven Re-Clustering for Low-cost FPGAs”
11
Previous Depopulation Schemes
Single versus Multiregion:
Region Selection:
– Select all CLBs in area centered on most congested CLB (Single Region)
– Select all CLBs in area centered on most congested CLB, not already chosen (Multiregion)
Whitespace insertion:
– Baseline: insert CLB in 1 row and 1 column in FPGA
– Fine-grained: insert CLB in 1 row and 1 column in region
  – Excellent area tradeoff, but slow
– Multiregion: insert CLBs proportional to congestion
  – Good runtime performance
12
Algorithm
Region Selection: Try to select regions more intelligently
– Capture congested regions instead of just CLBs
Whitespace Insertion: Try to estimate appropriate cluster size
– Use interconnect demand model to predict outcome for depopulation
13
Un/DoPack
[Flow diagram repeated from the Background: Un/DoPack slide (Cluster, Place, Route; Target CW Met?; Identify, Re-cluster, Place, Route), with two added labels: Region selection, Whitespace Insertion.]
14
Benchmark Circuits
Metacircuits designed to emulate large SoC circuits
– Cluster size 16
– Built using benchmark generator GNL
– Large circuit composed of smaller subcircuits (SoC style)
– Each subcircuit emulates the interconnect complexity (Rent parameter) of individual MCNC circuits
– The standard deviation of the Rent parameter is varied to create the benchmark suite
15
Region Selection
Find congested regions
– Post-routing congestion information
Center region on most congested CLB
16
Region Selection
Use congestion values to generate direction to move region
17
Region Selection
Binary Search
– Find region with highest average congestion
18
Region Selection
Mark Next Region
Sort by average congestion and depopulate in sorted order
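The region-selection steps on the last four slides can be sketched as follows. This is a simplified illustration under stated assumptions, not the thesis algorithm: the congestion map is a plain 2D list, regions are fixed-size squares, and the move step is a greedy one-cell shift toward higher average congestion rather than the binary search named on the slide. All function names are hypothetical.

```python
from typing import List, Optional, Set, Tuple

Grid = List[List[float]]
Region = Tuple[int, int, int, int]  # (row0, col0, row1, col1), inclusive


def clamp_region(r: int, c: int, size: int, rows: int, cols: int) -> Region:
    """Square window of side `size` with top-left near (r, c), kept on the grid."""
    r0 = max(0, min(r, rows - size))
    c0 = max(0, min(c, cols - size))
    return (r0, c0, r0 + size - 1, c0 + size - 1)


def average(grid: Grid, reg: Region) -> float:
    """Average congestion over all CLBs in the region."""
    r0, c0, r1, c1 = reg
    cells = [grid[r][c] for r in range(r0, r1 + 1) for c in range(c0, c1 + 1)]
    return sum(cells) / len(cells)


def select_region(grid: Grid, size: int, used: Set[Tuple[int, int]]) -> Optional[Region]:
    """Center on the most congested unmarked CLB, then shift toward congestion."""
    rows, cols = len(grid), len(grid[0])
    free = [(grid[r][c], r, c) for r in range(rows) for c in range(cols)
            if (r, c) not in used]
    if not free:
        return None
    _, sr, sc = max(free)  # seed: most congested CLB not yet marked
    reg = clamp_region(sr - size // 2, sc - size // 2, size, rows, cols)
    improved = True
    while improved:
        improved = False
        for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            cand = clamp_region(reg[0] + dr, reg[1] + dc, size, rows, cols)
            # Simplification: the window must keep covering the seed CLB.
            has_seed = cand[0] <= sr <= cand[2] and cand[1] <= sc <= cand[3]
            if has_seed and average(grid, cand) > average(grid, reg):
                reg, improved = cand, True
    return reg


def select_regions(grid: Grid, size: int, count: int) -> List[Region]:
    """Pick up to `count` regions, marking CLBs as chosen, sorted by congestion."""
    used: Set[Tuple[int, int]] = set()
    regions: List[Region] = []
    for _ in range(count):
        reg = select_region(grid, size, used)
        if reg is None:
            break
        regions.append(reg)
        r0, c0, r1, c1 = reg
        used.update((r, c) for r in range(r0, r1 + 1) for c in range(c0, c1 + 1))
    return sorted(regions, key=lambda reg: average(grid, reg), reverse=True)
```

In this sketch a later region may overlap an earlier one; only its seed CLB is guaranteed to be unmarked. How the actual flow treats overlap is not stated in the slides.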
19
Budgeted Multiregion Un/DoPack (BMR)
Multiple region approach
Grow number of CLBs according to budget at each iteration
– Budget: number of CLBs in a row and column of the FPGA
Each region grows by the equivalent of 1 row and 1 column of the region
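One way to read the budget rule is sketched below. The interpretation is an assumption: growing a w-by-h array by one row and one column is taken to cost (w+1)(h+1) - wh CLBs, the same formula sets the per-iteration budget from the FPGA dimensions, and regions are served in sorted order until the budget runs out. The function names are hypothetical.

```python
from typing import List, Tuple


def row_plus_column(width: int, height: int) -> int:
    """CLBs added when a width-by-height array grows by one row and one column."""
    return (width + 1) * (height + 1) - width * height


def allocate_budget(
    fpga_dims: Tuple[int, int],
    regions: List[Tuple[int, int]],
) -> Tuple[List[int], int]:
    """Grant each region its growth until the iteration budget is exhausted.

    regions: (width, height) pairs, assumed sorted most congested first.
    Returns (CLBs granted per served region, unused budget).
    """
    budget = row_plus_column(*fpga_dims)
    grants: List[int] = []
    for w, h in regions:
        cost = row_plus_column(w, h)
        if cost > budget:
            break  # assumption: stop at the first region that does not fit
        grants.append(cost)
        budget -= cost
    return grants, budget


if __name__ == "__main__":
    # 20x20 FPGA with four candidate regions (toy numbers).
    print(allocate_budget((20, 20), [(4, 4), (4, 4), (6, 6), (8, 8)]))
```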
20
Adding Whitespace
Congestion-Model Driven
– Use interconnect demand information to estimate how much whitespace to add
– Interconnect Model
  – Estimate post-clustering channel width for region
21
Regional Interconnect
Adapt demand model for regions of CLBs instead of whole FPGA
– Most wiring is from inside the region
– Cannot affect wiring across region directly through depopulation
22
Regional Interconnect
Assume external interconnect demand stays fixed
Solve for internal interconnect demand
Region interconnect demand = internal demand + external demand
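The decomposition on this slide can be written out as a short worked relation. The symbols below are names chosen here for illustration and do not appear in the slides: W_region is the measured demand in the region, W_int and W_ext are its internal and external parts, and W_target is the channel-width constraint.

```latex
\begin{align*}
W_{\text{region}} &= W_{\text{int}} + W_{\text{ext}} \\
W_{\text{int}}^{\text{target}} &= W_{\text{target}} - W_{\text{ext}}
  = W_{\text{target}} - \left( W_{\text{region}} - W_{\text{int}} \right)
\end{align*}
```

With made-up numbers: if a region shows a demand of 60 tracks, of which 45 are internal, then the external part is 15. Holding that fixed, a target of 50 tracks means depopulation has to bring internal demand down to 35.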
23
Interconnect-Demand Model
[Equation not preserved in this transcript]
Source: W. Fang and J. Rose, “Modeling routing demand for early-stage FPGA architecture development”
24
Interconnect-Demand Model
Use congestion map to determine equation constants
– Calibrate equation separately for each region
Solve for lambda that gives desired channel width
– Re-cluster region with lower cluster size until lambda target is met
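The search for a cluster size can be sketched as a simple loop. This is an illustration only: the calibrated demand model is not reproduced in this transcript, so `predict_width` is a hypothetical callable standing in for it, and the sketch compares predicted channel width against the target directly instead of solving for lambda first. The default maximum of 16 matches the cluster size given for the benchmark circuits.

```python
from typing import Callable, Optional


def choose_cluster_size(
    predict_width: Callable[[int], float],
    target_width: float,
    max_cluster_size: int = 16,
    min_cluster_size: int = 1,
) -> Optional[int]:
    """Return the largest cluster size whose predicted width meets the target.

    predict_width: hypothetical per-region model mapping a cluster size to an
        estimated post-clustering channel width.
    Returns None if even the smallest cluster size is predicted to fail.
    """
    # Step the cluster size down, as in "re-cluster with lower cluster size".
    for size in range(max_cluster_size, min_cluster_size - 1, -1):
        if predict_width(size) <= target_width:
            return size
    return None


if __name__ == "__main__":
    # Invented linear model for demonstration; not the Fang and Rose equation.
    toy_model = lambda n: 30 + 2 * n
    print(choose_cluster_size(toy_model, 50))
```

Picking the largest size that satisfies the prediction keeps the added whitespace, and so the area cost, as small as the model allows.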
25
Congestion-Model Multiregion Un/DoPack (CMR)
Same region selection method as BMR
No constraint on new CLBs in each iteration
Whitespace insertion using model
26
Results
Typical results
– Stdev004
CMR: speedup comparable to Multiregion Un/DoPack
BMR: slightly faster than Baseline
27
Results
Typical results
– Stdev004
BMR: area better than Multiregion
CMR: area better than Multiregion
28
Runtime / Area Tradeoff
[Plot: Speed-Area Tradeoff. Labels: Previous Multiregion Approach (Fast); Previous Fine-Grained Approach (Good Area).]
29
Runtime / Area Tradeoff
BMR
– Improved runtime
– Good area performance
30
Runtime / Area Tradeoff
CMR
– Better runtime
– Good area
31
Critical Path
32
Congestion-Driven Placement
We can further improve area performance using a congestion-driven placer
33
Conclusions
Use congestion information to perform better re-clustering
– Up to 5x runtime speedup versus baseline
– Up to 25% area savings versus baseline
Improve Runtime/Area tradeoff of Un/DoPack Flow
– Simultaneous area and runtime savings
34
Future Work
Consider effect of neighboring regions
Other congestion-driven tools
– Fast Placement
35
Questions?
36
Outline: Motivation, Background, Multiregion approach, Congestion-Driven Whitespace insertion
37
Related Work
Un/DoPack [1]
– Reduce interconnect usage to meet target channel width constraint
Congestion Driven Clustering
– iRAC, ISPL
– Single-Pass Clustering