Routing Architectures


1 Routing Architectures

2 Global vs. Detailed Routing View
Global (macroscopic) view:
- Relative position of routing channels in relation to the positioning of logic blocks
- How each channel connects to other channels
- Number of wires in each channel
Detailed (microscopic) view:
- Lengths of the wires
- Number and pattern of switches between and among wires and logic block pins
- (Recently) single-driver vs. multiple-driver wires
  - This gives rise to wires that send signals in a specific direction

3 Connection Blocks and Switch Blocks
[Figure: array of logic blocks (LB), each adjacent to connection blocks and switch blocks]

4 Global Routing Architecture
Global routing architectures:
1. Hierarchical
2. Island-style

5 Global Routing Architecture
Hierarchical architecture:
- Connections between logic blocks within a group can be made using wire segments at the lowest level of the routing hierarchy.
- Connections between logic blocks in distant groups require traversing one or more levels of the routing hierarchy.
  - Generally, routing channels are widest at the levels furthest from the logic blocks.

7 Altera Flex10K/Apex II

9 Hierarchical Routing Architecture
Advantages:
- More predictable inter-logic-block delay following design placement
  - If interconnect delay is not significant, the delay is almost equal for all connections.
- Superior performance for some logic designs
Disadvantages:
- Each level of the hierarchy presents a hard boundary that, once traversed, usually incurs a significant delay penalty.
  - This holds even if two logic blocks are physically close together but far apart with respect to the hierarchy.
- In newer technologies, wire delay is significant.
  -> Most recent commercial FPGAs use only one level of hierarchy to create a flat, island-style global routing architecture.

10 Island-Style Architecture
Island-style:
- Logic blocks are arranged in a two-dimensional mesh with routing resources evenly distributed.
- Routing channels run on all four sides of the logic blocks.
- W: number of wires contained in a channel
  - Preset during fabrication
  - One of the key choices made by the architect

11 Island-Style Global Routing

12 Island-Style Architectures
Commercial island-style FPGAs (most new devices):
- Lattice LatticeXP
- Xilinx Virtex-4 and Virtex-5
Advantage:
- Routing wires of different lengths are in close physical proximity to logic blocks
  -> Efficient connections for a variety of design net lengths

13 Connection Blocks and Switch Blocks
[Figure: array of logic blocks (LB), each adjacent to connection blocks and switch blocks]

14 Connection Block
- F_c,in: input connection block flexibility
- F_c,out: output connection block flexibility

15 Switch Block
Switch blocks:
- Form connections between wire segments at the intersection of a horizontal and a vertical channel.
F_s (switch block flexibility):
- Number of possible connections a wire segment can make to other wire segments
  - F_s = 3

16 Disjoint vs. Wilton Switch Blocks
Disjoint:
- A wire entering a disjoint switch block can only connect to other wires with the same numerical designation.
  -> Potential source-destination routes in the FPGA are isolated into distinct routing domains.
  -> This limits routing flexibility.

17 Disjoint vs. Wilton Switch Blocks
Wilton:
- Uses the same number of routing switches
- Overcomes the domain issue by allowing a change of domain
  - A greater diversity of routing paths from a net source to a destination is possible.
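
The domain issue can be illustrated with a small sketch. It assumes an unbounded array of identical switch blocks in which a wire leaving one block on track t enters the next block on the same track. `disjoint_targets` follows the disjoint rule from the slides. `shifting_targets` is a hypothetical domain-changing rule invented for illustration (it is not the exact Wilton mapping); it also keeps F_s = 3 but moves to the next track on a turn.

```python
from collections import deque

SIDES = ("N", "E", "S", "W")


def opposite(side):
    """Return the side facing the given side."""
    return SIDES[(SIDES.index(side) + 2) % 4]


def disjoint_targets(side, track, width):
    """Disjoint rule: connect to the same track number on the other three sides."""
    return [(other, track) for other in SIDES if other != side]


def shifting_targets(side, track, width):
    """Hypothetical domain-changing rule (NOT the exact Wilton pattern):
    going straight keeps the track, turning moves to the next track."""
    targets = []
    for other in SIDES:
        if other == side:
            continue
        new_track = track if other == opposite(side) else (track + 1) % width
        targets.append((other, new_track))
    return targets


def reachable_tracks(targets_fn, start_track, width):
    """Track numbers reachable from start_track after passing any number of switch blocks."""
    start = ("W", start_track)  # the wire enters the first block from the west
    seen = {start}
    queue = deque([start])
    while queue:
        side, track = queue.popleft()
        for out_side, out_track in targets_fn(side, track, width):
            # Leaving through out_side means entering the next block from the opposite side.
            state = (opposite(out_side), out_track)
            if state not in seen:
                seen.add(state)
                queue.append(state)
    return sorted({track for _, track in seen})


print(reachable_tracks(disjoint_targets, 0, 4))   # stays in one routing domain
print(reachable_tracks(shifting_targets, 0, 4))   # can reach every track
```

Both rules give each incoming wire three possible connections, so the switch count is the same; only the set of reachable tracks differs.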

18 Multiple Block Length Wire Segments
A wire of length L runs for L logic blocks.
[Figure: wire segments of length L = 1, L = 3, and L = 4 spanning logic elements (LE), joined by switches]

19 Multiple Block Length Wire Segments
Example:
- 40% of tracks: length 1
- 40% of tracks: length 2
- 20% of tracks: length 4
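
As a rough illustration of how such a mix translates into track counts, the sketch below splits a channel of W tracks according to the percentages in the example. The function name `allocate_tracks`, the largest-remainder rounding, and the channel widths used are assumptions made for this sketch; the slides do not specify them.

```python
def allocate_tracks(channel_width, percent_by_length):
    """Split channel_width tracks among segment lengths.

    percent_by_length maps segment length -> integer percentage of tracks.
    Rounding uses the largest-remainder method (an assumption of this sketch).
    """
    if sum(percent_by_length.values()) != 100:
        raise ValueError("percentages must sum to 100")

    # Whole tracks each length is guaranteed to get.
    counts = {
        length: channel_width * pct // 100
        for length, pct in percent_by_length.items()
    }

    # Hand out the tracks lost to rounding, largest fractional part first.
    leftover = channel_width - sum(counts.values())
    by_remainder = sorted(
        percent_by_length,
        key=lambda length: (channel_width * percent_by_length[length]) % 100,
        reverse=True,
    )
    for length in by_remainder[:leftover]:
        counts[length] += 1
    return counts


mix = {1: 40, 2: 40, 4: 20}
print(allocate_tracks(20, mix))  # {1: 8, 2: 8, 4: 4}
```

For a width that does not divide evenly, such as W = 7, the rounding step still returns counts that sum to W.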

20 Xilinx Virtex

21 Xilinx Virtex
- Long lines: bidirectional connections that span the full width or height of the chip.
- Hex lines: can be driven only from an end, but can be accessed from the middle or from the other end.
- Double lines: can be driven only from an end, but can be accessed from the middle or from the other end.
- Direct lines: connect neighboring blocks (horizontally, vertically, and diagonally).
- Fast connect lines: local connections inside a CLB from the outputs of one LUT to the inputs of another LUT.

22 Routing Switches (1999-2002)
Bidirectional switches:
- Pass transistors (PT):
  - Less area
  - Faster for short wiring paths (passing through a small number of switches)
- Buffers:
  - Faster for connections passing through many switches
- Mixed PT and tri-state buffers:
  -> Better delay characteristics with the same area
Betz and Rose (1999):
- 50%-50% mix: fastest routing architecture
Many FPGAs are based on these architectures.

23 Unidirectional Switches
Bidirectional wire segments:
- Can be driven by switch blocks on both ends.
- [Lemieux04]: Once programmed, 50% of the switches are left inactive (unused).
- Extra sinks -> more capacitance -> more delay
Directional wire segments:
- [Lemieux04]: Halves the required tri-state buffers per switch

24 Unidirectional Switch Options
1. Directional tri-state (dir-tri):
- Each wire segment is driven by:
  1. Adjacent wire segments
  2. One or more LB output pins (via a PT)

25 Unidirectional Switch Options
2. Single-driver:
- A switch multiplexer selects inputs from both wire segment and logic block sources.
- Each wire segment can be driven by a non-tri-state buffer
  - Improved drive strength

26 Single Driver
Disadvantage:
- Increase in the number of required wire segments per channel
Experiments:
- Roughly the same number of tracks per channel is needed to achieve the same routability.
Experiments (2004):
- 100% single-driver always gave the best results for area and delay.

27 More Recent Routing Improvements

28 Double-Length Lines

29 Programmable Switch Matrix (PSM)

30 Field Programmable Gate Array (FPGA)

31 More Recent Routing Improvements
HARP: Hardwired Routing Patterns [Sivaswamy05]:
- Junction patterns: L, T, +

32 HARP Research Procedure
1. Routing requirement analysis:
- A number of circuits were placed and routed on traditional FPGA architectures.
- The routing patterns that formed in the switch boxes were analyzed.
2. HARP architecture generation:
- Based on the frequencies of the patterns, HARP patterns were instantiated and replaced some switches.
3. Placement and routing with HARPs:
- Circuits were placed and routed on the new HARP architecture.
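
The frequency analysis in steps 1 and 2 can be sketched as a simple counting pass. This is only an illustration: the function `select_hardwired_patterns`, the `min_share` threshold, and the toy list of observed patterns are all hypothetical and are not taken from the HARP work.

```python
from collections import Counter


def select_hardwired_patterns(observed_patterns, min_share):
    """Return the junction patterns whose share of all observations
    is at least min_share, most frequent first."""
    counts = Counter(observed_patterns)
    total = sum(counts.values())
    return [
        pattern
        for pattern, count in counts.most_common()
        if count / total >= min_share
    ]


# Toy input: one entry per switch box used in a routed design (made-up data).
observed = ["L", "T", "L", "+", "L", "T"]
print(select_hardwired_patterns(observed, min_share=0.3))  # ['L', 'T']
```

In this toy run the "+" pattern falls below the threshold, so only the L and T junctions would be candidates for hard-wiring.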

33 HARP Patterns
Some hard-wired patterns

34 Routing Graphs
- In Virtex-5, diagonal wires are used

35 Power Consumption
Power consumption:
- Dynamic power: $P_d = k \cdot C_L \cdot V_{dd}^2 \cdot f$
- Leakage power:
  - Many of the interconnect resources within the FPGA are not actively used.
  - Components:
    1. Source-to-drain subthreshold leakage
    2. Gate-to-source gate oxide leakage
- 60%-70% of FPGA dynamic and static power consumption is located in the programmable interconnect.
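
Because the supply voltage enters the dynamic power expression squared, lowering it has a large effect. The sketch below evaluates the formula for two supply voltages; all numeric values are illustrative assumptions, and the slides do not define the constant k further.

```python
def dynamic_power(k, load_capacitance, vdd, frequency):
    """Dynamic power P_d = k * C_L * V_dd^2 * f (units: F, V, Hz -> W)."""
    return k * load_capacitance * vdd ** 2 * frequency


# Illustrative values only (not taken from the slides).
k = 0.5            # constant from the formula
c_load = 1e-12     # 1 pF load
freq = 100e6       # 100 MHz

p_full = dynamic_power(k, c_load, 1.2, freq)
p_reduced = dynamic_power(k, c_load, 0.9, freq)

# Power scales with the square of the supply ratio: (0.9 / 1.2)^2 = 0.5625.
print(p_reduced / p_full)
```

Dropping the supply from 1.2 V to 0.9 V in this example removes a little under half of the dynamic power while k, C_L, and f stay the same.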

36 Leakage Power
Power consumption:
- Leakage power:
  - From 130 nm to 65 nm: 18% to 54% [ICCAD2003]
[Figure: leakage power and dynamic power (Watts, 0 to 250) versus technology node (180, 130, 90, 65 nm)]

37 Sub-threshold Current
- At V_gs = V_t (and slightly below it), I_ds > 0 (subthreshold region)
  - V_gs must be reduced further to cut off the channel
- I_sub ≈ 10^-10 A at V_gs = 0
- I_sub ≈ 10^-5 A at V_gs = V_t
  - The current increases exponentially with V_gs
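
The two data points above span five decades of current. Assuming a purely exponential dependence between them (a simplification) and a hypothetical threshold voltage of 0.4 V, which is not given in the slides, the current at intermediate gate voltages can be sketched as follows.

```python
import math


def subthreshold_current(vgs, vt, i_at_zero=1e-10, i_at_vt=1e-5):
    """Exponential interpolation of I_sub between V_gs = 0 and V_gs = V_t.

    The two end-point currents come from the slide; the pure exponential
    in between is an assumption of this sketch.
    """
    decades = math.log10(i_at_vt / i_at_zero)   # about 5 decades
    return i_at_zero * 10 ** (decades * vgs / vt)


VT = 0.4  # hypothetical threshold voltage in volts

# Gate voltage change needed for a 10x change in current under these assumptions.
volts_per_decade = VT / math.log10(1e-5 / 1e-10)
print(volts_per_decade)                 # roughly 0.08 V per decade
print(subthreshold_current(0.2, VT))    # halfway point, between the two end values
```

Under these assumptions every reduction of about 80 mV in V_gs cuts the current by a factor of ten, which is why the channel is not fully off at V_gs = V_t.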

38 Gate Oxide Leakage
Gate oxide leakage:
- Result of electron tunneling as the transistor gate oxide is thinned.
- Leakage current increases exponentially with oxide thinning.

39 Leakage Power
Reasons for the increase in leakage:
- V_dd is reduced with CMOS technology scaling
- V_th must be lowered to recover switching speed
- Subthreshold leakage current increases exponentially with decreasing V_th
- Oxide thinning continues
[Pakbaznia DAC06]

40 Power Reduction
Power reduction techniques in general:
- Remove V_DD from unused transistors:
  -> Both components of leakage are eliminated
- Reduce V_DD for some transistors:
  -> Dynamic power is reduced substantially

41 Power Reduction in FPGA
Power reduction techniques in FPGAs:
- Drive each routing buffer from two separate supplies [Li04]:
  - A full-rail V_DD (V_DDH)
  - A reduced V_DD (V_DDL)
- Three cases:
  1. High-performance (M1 active)
  2. Reduced performance (M2 active)
  3. Sleep mode (both shut off)

42 Power Reduction in FPGA
Experiments:
- 88% of interconnect buffers could be placed into sleep mode
- 85% of active routing buffers could be driven with V_DDL without increasing circuit delay (for 100 nm)
  -> 80% overall reduction in interconnect leakage
  -> 38% reduction in interconnect dynamic power
Multi-supply voltage (MSV):
- High voltage on critical paths to maintain performance
- Low voltage on non-critical paths to reduce power

43 Power Reduction in FPGA
Problem:
- Requires the chip-wide distribution of multiple V_DD values
Solution:
- V_DD-selection approach for routing buffers [Anderson04]
- Three cases:
  - Both transistors on: V_VD = V_DD -> buffer in high-performance mode
  - Only MNX on: V_VD = V_DD - V_t -> weak supply -> low-power mode
  - Both off: sleep mode
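
The three cases can be written as a small lookup on the two supply transistors. This is a behavioral sketch only: the function name `buffer_mode` and the example voltages are assumptions, and the combination "MPX on, MNX off" is rejected because the slides do not describe it.

```python
def buffer_mode(mpx_on, mnx_on, vdd, vt):
    """Map the state of the supply transistors MPX and MNX to the buffer mode
    and the resulting virtual supply V_VD (None when the buffer is asleep)."""
    if mpx_on and mnx_on:
        return ("high-performance", vdd)
    if mnx_on:
        # Only MNX conducts, so the virtual supply sits one threshold below V_DD.
        return ("low-power", vdd - vt)
    if not mpx_on:
        return ("sleep", None)
    # MPX on with MNX off is not covered by the slides.
    raise ValueError("combination not described in the source")


# Example values are illustrative: V_DD = 1.0 V, V_t = 0.25 V.
print(buffer_mode(True, True, 1.0, 0.25))    # ('high-performance', 1.0)
print(buffer_mode(False, True, 1.0, 0.25))   # ('low-power', 0.75)
print(buffer_mode(False, False, 1.0, 0.25))  # ('sleep', None)
```

Since both transistors sit inside each routing switch, the mode is selected locally and only a single V_DD has to be distributed across the chip.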

44 Power Reduction in FPGA
Experiments:
- 75% of routing resources could tolerate a slowdown of 50% (70 nm process)
- Leakage power reduction (including the effects of the new transistors): about 35%
  - When operating in low-power mode (i.e., MPX off, MNX on)
- Leakage power reduction: up to 61%
  - When operating in sleep mode (i.e., MPX and MNX off)
- Dynamic power reduction: 28%
  - When operating in low-power mode
- Area increase: about 10%

45 MTCMOS in FPGAs
Dual-threshold technique:
- Use low-threshold transistors along critical paths -> optimize performance
- Use high-threshold transistors along non-critical paths -> minimize leakage
[Gayasen04], [Li04]
- High-V_t transistors implement the configuration SRAM bits
  - These bits are not subsequently read, so performance is not an issue.

46 Power Reduction in FPGA
MTCMOS:
- Redundant SRAM bits to control unused paths in routing buffers [Rahman04]
Traditional: SRAM bits are minimized by having one bank of SRAM cells feed all multiplexers in a given level.
-> Multiple PTs are activated on unused paths
[Rahman04]: Only one PT in the first stage passes the output value
-> Remaining transistors can be shut off

47 Power Reduction in FPGA
Disadvantages:
- Extra SRAM -> leakage current
  - Can use high-V_t, low-power cells (not timing-critical)
- More area
  - Experiments on a 30-to-1 multiplexer:
    - Doubling the number of SRAM bits -> leakage power decreases by 2x
    - Interconnect area increase: 30%-50%

48 Power Reduction in FPGA
Body biasing for unused interconnect transistors:
- Adaptive V_t
- Reduces sub-threshold leakage
Disadvantages:
- Needs a multi-well process -> more area
- Needs a circuit to control the bias voltages
- Experiments [Rahman04]:
  - Area increase: 1.6X to 2X
  - Leakage current reduction: 1.7X to 2.5X

49 Power Reduction in FPGA
In commercial FPGAs:
- High-V_t transistors for configuration SRAM bits
- Thicker oxides to reduce the leakage of devices that are not performance-critical
  - Xilinx, "Power consumption in 65 nm FPGAs," Xilinx White Paper WP246 (v1.2), http://www.xilinx.com/support/documentation/white_papers/wp246.pdf, February 2007.
  - Altera Corporation, "Stratix III FPGAs vs. Xilinx Virtex-5 devices: Architecture and performance comparison," Altera White Paper WP-01007-2.1, http://www.altera.com/literature/wp/wp-01007.pdf, October 2007.

50 References
[Kuon07] I. Kuon, R. Tessier, "FPGA Architecture: Survey and Challenges," Foundations and Trends in Electronic Design Automation, Vol. 2, No. 2, pp. 135–253, 2007.
[Xilinx] www.xilinx.com
[Altera] www.altera.com
[Lemieux04] G. Lemieux, E. Lee, M. Tom, and A. Yu, "Directional and single-driver wires in FPGA interconnect," in Proceedings: International Conference on Field-Programmable Technology, pp. 41–48, December 2004.
[Wang05] G. Wang, S. Sivaswamy, C. Ababei, K. Bazargan, R. Kastner, and E. Bozorgzadeh, "Statistical Analysis and Design of HARP Routing Pattern FPGAs," Transactions on Computer-Aided Design of Integrated Circuits and Systems (TCAD), Vol. 25, No. 10, pp. 2088–2102, 2006.
[Li04] F. Li, Y. Lin, and L. He, "Vdd programmability to reduce FPGA interconnect power," in IEEE/ACM International Conference on Computer Aided Design, 2004.
[Anderson04] J. Anderson and F. Najm, "A novel low-power FPGA routing switch," in Proceedings of the IEEE Custom Integrated Circuits Conference, pp. 719–722, October 2004.

51 References
[Gayasen04] A. Gayasen, K. Lee, N. Vijaykrishnan, M. Kandemir, M. J. Irwin, and T. Tuan, "A dual-VDD low power FPGA architecture," in Proceedings of the International Conference on Field-Programmable Logic and Applications, pp. 145–157, August 2004.
[Rahman04] A. Rahman and V. Polavarapuv, "Evaluation of low-leakage design techniques for field programmable gate arrays," in Proceedings: ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, pp. 23–30, February 2004.

