CS294-6 Reconfigurable Computing Day 8 September 17, 1998 Interconnect Requirements.

Slides:



Advertisements
Similar presentations
4/22/ Clock Network Synthesis Prof. Shiyan Hu Office: EREC 731.
Advertisements

The Microprocessor is no more General Purpose. Design Gap.
Caltech CS184a Fall DeHon1 CS184a: Computer Architecture (Structures and Organization) Day6: October 11, 2000 Instruction Taxonomy VLSI Scaling.
Balancing Interconnect and Computation in a Reconfigurable Array Dr. André DeHon BRASS Project University of California at Berkeley Why you don’t really.
Caltech CS184a Fall DeHon1 CS184a: Computer Architecture (Structures and Organization) Day17: November 20, 2000 Time Multiplexing.
Penn ESE535 Spring DeHon 1 ESE535: Electronic Design Automation Day 11: March 4, 2008 Placement (Intro, Constructive)
1 Lecture 23: Interconnection Networks Topics: communication latency, centralized and decentralized switches (Appendix E)
Penn ESE535 Spring DeHon 1 ESE535: Electronic Design Automation Day 9: February 20, 2008 Partitioning (Intro, KLFM)
Penn ESE Spring DeHon 1 ESE (ESE534): Computer Organization Day 15: March 12, 2007 Interconnect 3: Richness.
EDA (CS286.5b) Day 5 Partitioning: Intro + KLFM. Today Partitioning –why important –practical attack –variations and issues.
DeHon March 2001 Rent’s Rule Based Switching Requirements Prof. André DeHon California Institute of Technology.
CS294-6 Reconfigurable Computing Day 10 September 24, 1998 Interconnect Richness.
Penn ESE Fall DeHon 1 ESE (ESE534): Computer Organization Day 19: March 26, 2007 Retime 1: Transformations.
Penn ESE Spring DeHon 1 ESE (ESE534): Computer Organization Day 13: February 26, 2007 Interconnect 1: Requirements.
CS294-6 Reconfigurable Computing Day 13 October 6, 1998 Interconnect Partitioning.
EDA (CS286.5b) Day 3 Clustering (LUT Map and Delay) N.B. no lecture Thursday.
JR.S00 1 Lecture 13: (Re)configurable Computing Prof. Jan Rabaey Computer Science 252, Spring 2000 The major contributions of Andre Dehon to this slide.
CS294-6 Reconfigurable Computing Day 12 October 1, 1998 Interconnect Population.
1 Lecture 24: Interconnection Networks Topics: communication latency, centralized and decentralized switches (Sections 8.1 – 8.5)
CS294-6 Reconfigurable Computing Day 15 October 13, 1998 LUT Mapping.
EDA (CS286.5b) Day 19 Covering and Retiming. “Final” Like Assignment #1 –longer –more breadth –focus since assignment #2 –…but ideas are cummulative –open.
CS294-6 Reconfigurable Computing Day 19 October 27, 1998 Multicontext.
Penn ESE Spring DeHon 1 ESE (ESE534): Computer Organization Day 16: March 14, 2007 Interconnect 4: Switching.
Balancing Interconnect and Computation in a Reconfigurable Array Dr. André DeHon BRASS Project University of California at Berkeley Why you don’t really.
1 Lecture 25: Interconnection Networks Topics: communication latency, centralized and decentralized switches, routing, deadlocks (Appendix E) Review session,
Penn ESE Spring DeHon 1 ESE (ESE534): Computer Organization Day 14: February 28, 2007 Interconnect 2: Wiring Requirements and Implications.
CprE / ComS 583 Reconfigurable Computing Prof. Joseph Zambreno Department of Electrical and Computer Engineering Iowa State University Lecture #3 – FPGA.
Caltech CS184 Winter DeHon 1 CS184a: Computer Architecture (Structure and Organization) Day 13: February 4, 2005 Interconnect 1: Requirements.
Modern VLSI Design 4e: Chapter 4 Copyright  2008 Wayne Wolf Topics n Interconnect design. n Crosstalk. n Power optimization.
Caltech CS184 Winter DeHon 1 CS184a: Computer Architecture (Structure and Organization) Day 15: February 12, 2003 Interconnect 5: Meshes.
ESE Spring DeHon 1 ESE534: Computer Organization Day 19: April 7, 2014 Interconnect 5: Meshes.
Penn ESE535 Spring DeHon 1 ESE535: Electronic Design Automation Day 2: January 21, 2015 Heterogeneous Multicontext Computing Array Work Preclass.
CBSSS 2002: DeHon Costs André DeHon Wednesday, June 19, 2002.
Modern VLSI Design 2e: Chapter 7 Copyright  1998 Prentice Hall PTR Topics n Block placement. n Global routing. n Switchbox routing.
Caltech CS184a Fall DeHon1 CS184a: Computer Architecture (Structures and Organization) Day11: October 30, 2000 Interconnect Requirements.
Penn ESE535 Spring DeHon 1 ESE535: Electronic Design Automation Day 24: April 18, 2011 Covering and Retiming.
Modern VLSI Design 3e: Chapter 4 Copyright  1998, 2002 Prentice Hall PTR Topics n Interconnect design. n Crosstalk. n Power optimization.
CALTECH CS137 Winter DeHon CS137: Electronic Design Automation Day 6: January 23, 2002 Partitioning (Intro, KLFM)
Lecture 13: Logic Emulation October 25, 2004 ECE 697F Reconfigurable Computing Lecture 13 Logic Emulation.
Caltech CS184 Winter DeHon 1 CS184a: Computer Architecture (Structure and Organization) Day 14: February 10, 2003 Interconnect 4: Switching.
Topics Architecture of FPGA: Logic elements. Interconnect. Pins.
CBSSS 2002: DeHon Interconnect André DeHon Thursday, June 20, 2002.
Penn ESE534 Spring DeHon 1 ESE534: Computer Organization Day 18: April 2, 2014 Interconnect 4: Switching.
Caltech CS184 Winter DeHon 1 CS184a: Computer Architecture (Structure and Organization) Day 16: February 14, 2005 Interconnect 4: Switching.
CALTECH CS137 Winter DeHon CS137: Electronic Design Automation Day 9: February 9, 2004 Partitioning (Intro, KLFM)
FPGA-Based System Design: Chapter 1 Copyright  2004 Prentice Hall PTR Moore’s Law n Gordon Moore: co-founder of Intel. n Predicted that number of transistors.
CALTECH CS137 Spring DeHon 1 CS137: Electronic Design Automation Day 5: April 12, 2004 Covering and Retiming.
Caltech CS184 Winter DeHon 1 CS184a: Computer Architecture (Structure and Organization) Day 13: February 6, 2003 Interconnect 3: Richness.
Penn ESE535 Spring DeHon 1 ESE535: Electronic Design Automation Day 6: January 30, 2013 Partitioning (Intro, KLFM)
Penn ESE534 Spring DeHon 1 ESE534 Computer Organization Day 9: February 13, 2012 Interconnect Introduction.
ESE Spring DeHon 1 ESE534: Computer Organization Day 18: March 26, 2012 Interconnect 5: Meshes (and MoT)
Caltech CS184a Fall DeHon1 CS184a: Computer Architecture (Structures and Organization) Day14: November 10, 2000 Switching.
Caltech CS184 Winter DeHon 1 CS184a: Computer Architecture (Structure and Organization) Day 12: February 5, 2003 Interconnect 2: Wiring Requirements.
Penn ESE534 Spring DeHon 1 ESE534: Computer Organization Day 17: March 29, 2010 Interconnect 2: Wiring Requirements and Implications.
Caltech CS184a Fall DeHon1 CS184a: Computer Architecture (Structures and Organization) Day12: November 1, 2000 Interconnect Requirements and Richness.
Penn ESE535 Spring DeHon 1 ESE535: Electronic Design Automation Day 25: April 17, 2013 Covering and Retiming.
ESE534: Computer Organization
ESE534: Computer Organization
Lecture 23: Interconnection Networks
ESE534 Computer Organization
CS137: Electronic Design Automation
CS137: Electronic Design Automation
Defect Tolerance for Nanocomputer Architecture
ESE534: Computer Organization
ESE534: Computer Organization
ESE534: Computer Organization
CS184a: Computer Architecture (Structures and Organization)
CprE / ComS 583 Reconfigurable Computing
ESE535: Electronic Design Automation
CS184a: Computer Architecture (Structure and Organization)
Presentation transcript:

CS294-6 Reconfigurable Computing Day 8 September 17, 1998 Interconnect Requirements

Today (hold off on finishing up word serialization) Interconnect Requirements –Area –Delay –Growth –Structure

Dominant Area

Dominant Time

Dominant Power XC4003A data from Eric Kusse (UCB MS 1997)

For Spatial Architectures Interconnect dominant –area –power –time …so need to understand in order to optimize architectures

Interconnect Problem –Thousands of independent (bit) operators producing results true of FPGAs today …true for *LIW, multi-uP, etc. in future –Each taking as inputs the results of other (bit) processing elements –Interconnect is late bound don’t know until after fabrication

Design Issues Flexibility -- route “anything” –(w/in reason?) Area -- wires, switches Delay -- switches in path, stubs, wire length Power -- switch, wire capacitance Routability -- computational difficulty finding routes

Bisection Bandwidth Partition design into two equal size halves Minimize wires (nets) with ends in both halves Number of wires crossing is bisection bandwidth

First Attempt Any operator may consume output from any other operator Try a crossbar?

Crossbar Flexibility (++) –routes everything (guaranteed) Delay (Power) (-) –wire length O(kn) –parasitic stubs: kn+n –series switch: 1 –O(kn) Area (-) –Bisection bandwidth n –kn 2 switches –O(n 2 )

Crossbar Too expensive –Switch Area = k*n 2 *2.5K 2 –Switch Area/LUT = k*n* 2.5K 2 –n=1024, k=4 => 100M 2 What can we do?

Avoiding Crossbar Costs Typical architecture trick: –exploit expected problem structure

Avoiding Crossbar Costs Typical architecture trick: –exploit expected problem structure We have freedom in operator placement Designs have spatial locality =>place connected components “close” together –don’t need full interconnect?

Exploit Locality Wires expensive Local interconnect cheap Try a mesh?

Mesh Analysis Can we place everything close?

Mesh “Closeness” Try placing “everything” close

Mesh Analysis Flexibility - ? –Ok w/ large w Delay (Power) –Series switches 1--  n –Wire length w--  n –Stubs O(w)--O(w  n) Area –Bisection BW -- w  n –Switches -- O(nw) –O(w 2 n)

Mesh Plausible …but What’s w …and how does it grow?

Characterize Locality Want to exploit locality How much locality do we have? Impact on resources required?

Bisection Bandwidth Bisect design Bisection bandwidth of design –=> lower bound on network bisection bandwidth Design with more locality –=> lower bisection bandwidth Enough? N/2 cutsize

Characterizing Locality Single cut not capture locality within halves Cut again –=> recursive bisection

Regularizing Growth How do bisection bandwidths shrink (grow) at different levels of bisection hierarchy? Basic assumption: Geometric –1 –1/  –1/  2

Geometric Growth (F,  )-bifurcator –F bandwidth at root –geometric regression  at each level

Good Model? Log-log plot ==> straight lines represent geometric growth

Rent’s Rule Long standing empirical relationship –IO = C*N P –0  P  1.0 –compare (F,  )-bifurcator  = 2 P Captures notion of locality –some signals generated and consumed locally –reconvergent fanout

Rent’s Rule Typically consider –0.5  P  0.75 “High-Speed” Logic P=0.67 Memory (P~ ) Example (i10) –max C=7, P=0.68 –avg C=5, P=0.72

What tell us about design? Recursive bandwidth requirements in network

What tell us about design? Recursive bandwidth requirements in network N.B. necessary but not sufficient condition on network design – I.e. design must also be able to use the wires

What tell us about design? Interconnect lengths –Intuition if p>0.5, everything cannot be nearest neighbor as p grows, so wire distances

What tell us about design? Interconnect lengths –IO=(n 2 ) P cross distance n –dIO/dn end at exactly distance n –E(l)=Integral 0 to n=  N of n*(dIO/dn)/n 2 assume iid sources –E(l)=O(N (p-0.5) ) p>0.5

What Tell us about design? IO  N P Bisection BW  N P side length  N P –N if p<0.5 Area  N 2p –p>0.5 N.B. 2D VLSI world has “natural” Rent of P=0.5 (area vs. perimeter)

Rent’s Rule Caveats Modern “systems” on a chip -- likely to contain subcomponents of varying Rent complexity Less I/O at certain “natural” boundaries System close –(Rent’s Rule apply to workstation, PC, PDA?)

Area/Wire Length Bad news –Area ~ O(N 2p ) –Avg. Wire Length ~ O(N (p-0.5) ) Can designers/CAD control p (locality) once appreciate its effects? I.e. maybe this cost changes design style/criteria so we mitigate effects?

What Rent didn’t tell us Bisection bandwidth purely geometrical No constraint for delay –I.e. a partition may leave critical path weaving between halves

Critical Path and Bisection Minimum cut may cross critical path multiple times. Minimizing long wires in critical path => increase cut size.

Rent Weakness Not account for path topology ? Can we define a “Temporal” Rent which takes into consideration? –Promising research topic

Summary Interconnect Dominant –power, delay, area Can’t afford full crossbar Need to exploit locality Can’t have everything close Rent’s rule characterize locality => Area growth O(N 2p )