Rent’s Rule Based Switching Requirements
Prof. André DeHon, California Institute of Technology
March 2001



Questions
Conventional GA/ASIC/VLSI:
– How much wiring do I need to support my logic?
– How does this scale with larger designs?
For reconfigurable devices (FPGAs, PSoCs), also:
– How much switching do I need to support my logic?
– How does this scale with larger designs?

Answers
First question (wiring):
– answered with a Rent’s Rule characterization
– the subject of prior talks
Second question (switches):
– can also be approached in terms of Rent’s Rule
– that is what this talk is about

Why?
With the silicon capacity available today, we find that we:
– can build large, high-performance, spatial computing organizations
– need flexibility in our large system chips
– build large FPGAs, spatially configurable devices, programmable SoC designs, and single-chip multiprocessors

Why?
Components with spatial flexibility (FPGAs, PSoCs, multiprocessors) need efficient, switchable interconnect.

Outline
Need
Problem
Review:
– the general case is expensive
– Rent’s Rule as a measure of locality
– impact on wiring
Impact on switching:
– practical issues
– design space
Summary

Problem
Given: a graph of operators
– gates, PEs, memories, …
– today: 100 PEs, or 100,000 FPGA 4-LUTs
Goal: implement “any” graph on a programmable substrate
– provide flexibility
– while maintaining an efficient, compact implementation

Challenge
“Obvious” direct solutions:
– are prohibitively expensive
– scale poorly
E.g. the crossbar:
– O(N^2) area and delay
– density and performance decrease as we scale upward
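For concreteness, the crossbar’s quadratic growth can be sketched in a few lines (an illustrative sketch; the function name is ours, not from the talk):

```python
def crossbar_switches(n):
    """A full n x n crossbar places one programmable switch
    at every input/output crossing: n * n switches."""
    return n * n

# Doubling the endpoints quadruples the switch count,
# so density falls as designs scale up.
assert crossbar_switches(200) == 4 * crossbar_switches(100)
```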

Multistage Networks
Can reduce switch requirements, at the cost of additional series switch latency.
E.g. the Beneš network:
– implements any permutation
– O(N·log(N)) switches, O(log(N)) delay
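A quick way to see the O(N·log(N)) claim (a sketch assuming the standard recursive Beneš construction on a power-of-two number of endpoints):

```python
from math import log2

def benes_switches(n):
    """Switch count for a Benes network on n = 2^k endpoints:
    2*log2(n) - 1 stages, each containing n/2 2x2 switches."""
    assert n >= 2 and n & (n - 1) == 0, "n must be a power of two"
    stages = 2 * int(log2(n)) - 1
    return stages * (n // 2)

# O(n log n) switches vs the crossbar's O(n^2):
print(benes_switches(1024))  # 19 stages * 512 switches = 9728
print(1024 * 1024)           # flat crossbar: 1048576
```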

Multistage Wiring
Wiring area in 2D VLSI is still O(N^2):
– the bisection width of the Beneš network (and of all flat MINs) is O(N)
– O(N) wires cross the middle of the chip; with a constant number of layers, this implies O(N) chip width
– the same holds in the other dimension
– so the chip is O(N) × O(N), i.e. O(N^2) wiring area

With “Flat” Networks
Density diminishes as designs grow:
– O(N·log(N)) switches for N nodes
– O(N^2) wiring for N nodes

Locality Structure
Is this the problem we really need to solve? Or is there additional structure in our (typical) designs that allows us to get away with less?

Rent’s Rule
Characterization: IO = c·N^p
This says:
– typical graphs are not random
– when we have freedom of placement, we can contain some connections within a local region
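Numerically, the rule predicts sublinear IO growth whenever p < 1 (a sketch; the constants c = 4.0 and p = 0.67 are illustrative, design-dependent values, not universal ones):

```python
def rent_io(n, c=4.0, p=0.67):
    """Rent's Rule: IO = c * N^p, the expected number of wires
    crossing the boundary of a region containing n operators.
    c and p here are illustrative values only."""
    return c * n ** p

# Quadrupling the region size far less than quadruples its IO;
# this gap is exactly the locality the rule captures:
growth = rent_io(400) / rent_io(100)
assert growth < 4  # 4 would mean p = 1, i.e. no locality
```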

Rent’s Rule and Locality
Rent’s Rule and the IO count capture locality:
– local consumption
– local fanout

Locality Measure
One view of Rent’s Rule: it quantifies the locality in a design. A smaller p means:
– more locality
– less interconnect

Traditional Use
Use the Rent’s Rule characterization, IO = c·N^p, to understand wire growth:
– top bisections will be Θ(N^p)
– 2D wiring area: Θ(N^p) × Θ(N^p) = Θ(N^2p)
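The wiring-area consequence can be checked directly (a sketch using the same symbols as the slide; function names are ours):

```python
def bisection_wires(n, p):
    """Top-level bisection wires: Theta(n^p)."""
    return n ** p

def wiring_area_2d(n, p):
    """2D wiring area: Theta(n^p) wide by Theta(n^p) tall = Theta(n^(2p))."""
    return bisection_wires(n, p) ** 2

# For p = 0.5 the wiring area is linear in n; for p = 1
# (random graphs) it degenerates to the crossbar's n^2:
assert abs(wiring_area_2d(1024, 0.5) - 1024.0) < 1e-6
assert abs(wiring_area_2d(1024, 1.0) - 1024.0 ** 2) < 1e-6
```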

We Know
– how we avoid O(N^2) wire growth for “typical” designs
– how to characterize locality
– how we exploit that locality to reduce wire growth
– the wire growth implied by a characterized design

Switching
How can we use the locality captured by Rent’s Rule to reduce switching requirements? And by how much?

Observation
The locality that saved us wiring also saves us switching: IO = c·N^p

Consider
A crossbar split to exploit wiring locality:
– split into two halves
– an N/2 × N/2 crossbar within each half
– an N/2 × (N/2)^p crossbar connecting each half to the bisection wires

Recurse
Repeat at each level, forming a tree.

Result
If we use a crossbar at each tree node:
– O(N^2p) wiring area for p > 0.5, directly from the bisection
– O(N^2p) switches:
  – the top switch box has O(N^2p) switches
  – each level down costs 2 × (1/2^p)^2 = 2^(1−2p) times the previous level
  – for p > 0.5 this coefficient is less than 1, so the geometric series sums to O(1) times the top level
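The geometric-series argument can be verified numerically (a sketch; per-level counts are taken up to constant factors):

```python
from math import log2

def tree_switch_total(n, p):
    """Sum per-level crossbar switch counts down the tree.
    The top level costs ~n^(2p); each level down multiplies the
    level total by 2 * (1/2^p)^2 = 2^(1 - 2*p), which is < 1
    whenever p > 0.5, so the series sums to O(1) top levels."""
    total, level = 0.0, n ** (2 * p)
    for _ in range(int(log2(n))):
        total += level
        level *= 2 ** (1 - 2 * p)
    return total

# For p = 0.67 the whole tree costs only a constant multiple of
# the top-level switch box (bounded by the infinite-series sum):
n, p = 1024, 0.67
assert tree_switch_total(n, p) < n ** (2 * p) / (1 - 2 ** (1 - 2 * p))
```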

Good News
– asymptotically optimal
– even without switches, the area is O(N^2p), so adding O(N^2p) switches does not change the asymptotic growth

Bad News
Switch area >> wire-crossing area:
– at a 6λ wire pitch, a wire crossing costs 36λ^2
– a typical (passive) switch is much larger: passives alone give about a 70× area difference, and it gets worse once we rebuffer or latch signals
– switches are limited to the substrate, whereas wiring can use additional metal layers

Additional Structure
This motivates us to look beyond crossbars:
– we can depopulate crossbars on the up-down connections without loss of functionality
– we can replace crossbars with multistage networks

N-choose-M
Up-down connections only require concentration: choose M things out of N
– not the full freedom of placement; i.e. the order of the subset is irrelevant
Consequence: we can save a constant factor of roughly 2^p/(2^p − 1):
– (N/2)^p × N^p vs. (N^p − (N/2)^p + 1) × (N/2)^p
Similarly, for left-right connections the order is not important, which reduces switches.
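The constant-factor savings can be computed from p alone (a sketch of the ratio quoted above; the function name is ours):

```python
def concentrator_savings(p):
    """Approximate switch savings from replacing a full crossbar
    stage with a concentrator (choose M of N, order irrelevant):
    roughly 2^p / (2^p - 1)."""
    return 2 ** p / (2 ** p - 1)

# Lower p (more locality) means bigger savings; at p = 1 the
# factor bottoms out at 2:
assert concentrator_savings(0.5) > concentrator_savings(0.67) > concentrator_savings(1.0)
print(round(concentrator_savings(0.67), 2))  # ~2.69
```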

Beneš Switching
Flat multistage networks reduced switches from N^2 to N·log(N). We can likewise replace the crossbars in the tree with Beneš switching networks.

Beneš Switching
Implications of Beneš switching:
– still requires O(W^2) wiring per tree node, for a total of O(N^2p) wiring
– now O(W·log(W)) switches per tree node, which converges to O(N) total switches
– O(log^2(N)) switches in the path across the network; strictly speaking this is dominated by wire delay of ~O(N^p), but the constants make it of little practical interest except for very large networks

Linear Switch Population
Can further reduce switches:
– connect each lower channel to O(1) channels in each tree node
– end up with O(W) switches per tree node

Linear Consequences: Good News
Linear switch population gives:
– O(log(N)) switches in the path
– O(N^2p) wire area
– O(N) switches
– more practical than the Beneš case

Linear Consequences: Bad News
No guarantee we can use all the wires:
– as shown, the mapping ratio is at least greater than 1
– there are likely cases where even a constant factor does not suffice; we expect no worse than logarithmic, but a tight lower bound for any linear arrangement remains open
Finding routes is harder:
– no longer linear-time and deterministic
– exactly how hard remains open

Mapping Ratio
The mapping ratio says:
– if I have W physical channels, I may only be able to use W/mr wires for a particular design’s connection pattern
– to accommodate any design, every channel must satisfy: physical wires ≥ mr × logical wires

Area Comparison
Both cases: p = 0.67, N = 1024
– M-choose-N: perfect mapping
– Linear: MR = 2

Area Comparison
M-choose-N with perfect mapping vs. linear with MR = 2:
Since a switch >> a wire, we may be able to tolerate MR > 1:
– it reduces switches
– yielding a net area savings

Multi-Layer Metal?
The preceding assumed a fixed number of wire layers. In practice:
– wire layers increase with shrinking technology
– wire layers increase with chip capacity
– wire-layer growth is roughly O(log(N))

Multi-Layer
A natural response to Θ(N^2p) wiring area:
– given N^p wires in the bisection, rather than accept N^p width
– use N^(p−0.5) layers to accommodate them in N^0.5 width
– wiring then takes Θ(N) 2D area, with N^(p−0.5) wire layers
– for p = 0.5, log(N) layers accommodate the wiring
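The layer count follows from simple width accounting (a sketch for p > 0.5; the function name is ours):

```python
def layers_for_bisection(n, p):
    """Fit n^p bisection wires into a chip-edge width of n^0.5
    by stacking n^(p - 0.5) wiring layers."""
    assert p >= 0.5
    return n ** (p - 0.5)

# Width * layers recovers the full n^p bisection, while the
# 2D footprint stays Theta(n):
n, p = 1024, 0.67
assert abs(layers_for_bisection(n, p) * n ** 0.5 - n ** p) < 1e-6
```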

Linear + Multilayer
Multilayer metal says the wiring can be done in Θ(N) 2D area. Switches, however, require 2D substrate area:
– more than O(N) switches would make switches dominate
– linear and Beneš populations both have O(N) switches
So there is a possibility of achieving O(N) area with multilayer metal and a linear switch population.

Butterfly Fat-Tree Layout

Fold Sequence

Compact, Multilayer BFT Layout

Fold and Squash Result
We can lay out the BFT:
– in O(N) 2D area
– with O(log(N)) wiring layers

Summary
Rent’s Rule characterizes the locality in a design. Exploiting that locality reduces both wiring and switching requirements.
Naïve switch populations match wires at O(N^2p), but:
– switch area >> wire area
– switches prevent exploiting multiple layers of metal
We can achieve O(N) switches, and plausibly O(N) area with sufficient metal layers.

Additional Information

Consider
A crossbar split to exploit wiring locality:
– split into two halves
– an N/2 × N/2 crossbar within each half
– an N/2 × (N/2)^p crossbar connecting each half to the bisection wires
Switch count: 2 × (N^2/4 + N^(p+1)/2^(p+1)) = N^2/2 + N^(p+1)/2^p < N^2
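These counts are easy to check numerically (a sketch using the same formulas as the slide; function names are ours):

```python
def full_crossbar(n):
    """One switch per crossing in a flat n x n crossbar."""
    return n ** 2

def split_crossbar(n, p):
    """Two n/2 x n/2 crossbars plus two n/2 x (n/2)^p
    bisection-connection crossbars:
    2 * (n^2/4 + (n/2)^(p+1)) = n^2/2 + n^(p+1)/2^p."""
    return 2 * ((n / 2) ** 2 + (n / 2) ** (p + 1))

# For any p < 1 the single split already beats the flat crossbar;
# recursing extends the savings toward O(n^(2p)). At p = 1 (no
# locality) the split saves nothing:
n, p = 1024, 0.67
assert split_crossbar(n, p) < full_crossbar(n)
```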