Enabling Technologies for System-on-Chip Development. Reconfigurable Computing Architectures and Methodologies for System-on-Chip. Monday, November 19, 10: :00 hrs. Reiner Hartenstein, University of Kaiserslautern. November 19-20, 2001, Tampere, Finland.
© 2001, University of Kaiserslautern. Downloadable “handout”. Viewgraphs downloadable from: /staff/hartenstein/lot/Tampere01.ppt. Paper downloadable from: /staff/hartenstein/lot/Tampere01.pdf. If you use part of it, please quote me and me to:
Conferences on Reconfigurable Logic. Topic adoption by congresses: ASP-DAC, DAC, DATE, ISCAS, SPIE, .... FCCM, FPGA (founded 1992), and FPL (founded 1991 at Oxford, UK). FPL 2002, The International Conference on Field-Programmable Logic and Applications: La Grande Motte (Montpellier, France), Sept. 2-4; paper submission deadline: 15th March 2002. Laboratoire d'Informatique, de Robotique et de Microélectronique de Montpellier.
>> Introduction (outline: Introduction; FPGA boom (fine grain); Coarse Grain Architectures (coarse grain); Programming rDPAs; Conclusions & Future Developments)
The Impact of Reconfigurable Logic: A Fundamental Paradigm Shift in Silicon Application. Reconfigurable platforms bring a new dimension to digital system development and have a strong impact on SoC design. There is a large, rapidly growing user base of HDL-savvy designers with FPGA experience. Flexibility supports turn-around times of minutes instead of months for real-time in-system debugging, profiling, verification, tuning, field maintenance, and field upgrades. A New Business Model (in-field debugging and upgrading, ...). [Figure: revenue per month over time in months, comparing an ASIC product with a reconfigurable product receiving downloaded updates; source: Kean]
The History of Paradigm Shifts. Makimoto's Wave (published in 1989): “Mainstream Silicon Application is switching every 10 Years”. “The Programmable System-on-a-Chip is the next wave”. [Figure: wave alternating between standard and custom: TTL; LSI, MSI; µproc., memory; ASICs, accel's; reconfigurable; with the 1st Design Crisis and the 2nd Design Crisis marked]
What's the Next Wave? Tredennick's Paradigm Shifts: hardwired (algorithm: fixed, resources: fixed); procedural programming (algorithm: variable, resources: fixed); structural programming (algorithm: variable, resources: variable). Hartenstein's Curve: FPGAs until 2007; from 2007 rDPAs? A 4th wave? No further wave!? [Figure: wave between custom and standard, extended beyond 2007]
The Impact of Makimoto's Paradigm Shifts (Dr. Makimoto: FPL 2000 keynote). µproc., memory: procedural personalization via RAM-based Machine Paradigm (Software Industry's Secret of Success). ASICs, accel's: personalization (CAD) before fabrication. Reconfigurable: structural personalization, RAM-based, before run time. Repeat Success Story by new Machine Paradigm! [Figure: wave between standard and custom: TTL; LSI, MSI; µproc., memory; ASICs, accel's; reconfigurable]
>> FPGA boom (outline: Introduction; FPGA boom; Coarse Grain Architectures (rDPAs); Programming rDPAs; Conclusions & Future Developments)
What is an FPGA? Configurable logic blocks (CLBs) embedded in a reconfigurable interconnect fabric. [Figure: Xilinx XC4000E; L = Logic Block, S = Switch Box; single-length lines, double-length lines, longlines]
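To make "configurable logic block" concrete: in SRAM-based FPGAs a logic block is commonly built around small look-up tables (LUTs) whose stored bits select the logic function. The Python sketch below illustrates only that idea. It is not a model of any particular Xilinx device, and the name `make_lut` and the input bit ordering are assumptions made for this example.

```python
def make_lut(config_bits):
    """Return a k-input logic function defined by 2**k configuration bits.

    Hypothetical helper: the configuration bits play the role of the
    RAM cells that personalize a logic block.
    """
    k = (len(config_bits) - 1).bit_length()
    if len(config_bits) != 2 ** k:
        raise ValueError("number of configuration bits must be a power of two")

    def lut(*inputs):
        if len(inputs) != k:
            raise ValueError(f"expected {k} inputs")
        index = 0
        for bit in inputs:  # first input is the most significant address bit
            index = (index << 1) | (1 if bit else 0)
        return config_bits[index]

    return lut


# The same "hardware" becomes XOR or AND purely by changing stored bits.
xor2 = make_lut([0, 1, 1, 0])
and2 = make_lut([0, 0, 0, 1])
```

Reconfiguring means loading a different bit list; no structural change to `lut` itself is needed.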
Top 4 FPGA / PLD Manufacturers 2000: Xilinx 42%, Altera 37%, Lattice 15%, Actel 6%; total: $3.7 billion [Dataquest]; > $7 billion by. Fastest growing semiconductor market segment, killing the ASIC market. FPGAs are going into every type of application, also SoC, and will soon reach 50 million system gates per chip. Improved design flow & libraries: "pre-fabricated" components and IP reuse for PLDs; PLD vendors provide libraries (soft IPs, Configware) to support their products.
Away from complex design flow: from HDL to HLL [S. Guccione]. Complex flow: Schematics / HDL -> Netlister -> Netlist -> Place and Route -> Bitstream, next to User Code -> Compiler -> Executable. Simplified flow: HLL -> Compiler. Use the CPU for configuration management, supporting .... dynamically reconfigurable (RTR). Embedded CPU: Configware / Software Co-design is commonplace.
Configware as the Key Enabler. Design productivity and quality by configware libraries (soft IP cores) from various application areas. Top FPGA vendors: currently the key innovators (Xilinx AllianceCORE & Reference Design Alliance et al.). Growing number of independent configware houses (soft IP core vendors) and design services. Emerging separate EDA software market (comparable to the compiler / OS market in computers): Cadence, Mentor, Synopsys just jumped in.
>> Coarse Grain Architectures (outline: Introduction; FPGA boom; Coarse Grain Architectures (rDPAs); Programming rDPAs; Conclusions & Future Developments). For a detailed overview see the proceedings.
Why coarse-grained? Reconfigurability Overhead. [Figure: FPGA fabric (L = logic block, S = switch box) contrasting the area used by the application with the resources needed for reconfigurability, partly for configuration code storage; “hidden RAM” not shown]
Commercial rDPAs. XPU family (IP cores; XPU128): PACT corp., Munich. Flexible array: MorphICs. CALISTO: Silicon Spice*. CS2000 family: Chameleon Systems. MECA family: Malleable*. FIPSOC: SIDSA. ACM: Quicksilver Tech. CHESS array: Elixent. *) bought
KressArray Family: generic fabrics, a few examples. Select the function repertory. Select the Nearest Neighbour (NN) interconnect (an example): select mode, number, and width of NNports; more NNports: rich routing resources. rDPU usage: route-through and function, or route-through only. Examples of 2nd-level interconnect: laid out over the rDPU cell, no separate routing areas!
KressArray Mapping Example: SNN filter; array size: 10 x 16 = 160 rDPUs. [Figure legend: route-through only; not used; backbus connect]
It's a General Paradigm Shift! Using FPGAs (fine grain reconfigurable): just logic synthesis on a strange platform. Coarse grain rDPAs (Reconfigurable Computing): a fundamental paradigm shift: replace concurrent processes by much more efficient parallelism: stream-based DPAs 1) and rDPAs, with converging design flows 2). 1) systolic array* [1980], KressArray** [1995]. 2) chip-on-a-day* [2000] [Brodersen]. *) hardwired, **) reconfigurable. Kress: a generalization of systolic array synthesis: super systolic synthesis. Terms: DPU: datapath unit; DPA: datapath array; rDPU: reconfigurable DPU; rDPA: reconfigurable DPA.
Concurrent Computing. [Figure: several CPUs, each a DPU with its own instruction sequencer, connected by bus(es) or a switch box]. Extremely inefficient: control flow overhead; instruction fetch / interpretation overhead; address computation overhead (may be massive); massive bottleneck phenomena at run time.
Stream-based Computing: (r)DPA, for both reconfigurable and hardwired [Brodersen]. DPU: transport-triggered execution, driven by data streams from / to memory or from / to a peripheral interface. No instruction sequencer inside! Avoids run-time overhead and bottleneck phenomena. rDPA: drastically reduced reconfigurability overhead.
>> Programming rDPAs (outline: Introduction; FPGA boom; Coarse Grain Architectures (rDPAs); Programming rDPAs; Conclusions & Future Developments)
Systolic Stream-based Computing System. Systolic Array [H. T. Kung, 1980]: a DPA (Data Path Array). The Mathematician's Synthesis Method: equations are turned into a DPU architecture by linear projection or algebraic mapping; no routing! Linear pipelines and uniform arrays only. Computing in space (placement) versus computing in time (systolic arrays etc.; migration by re-timing and other transformations): this dichotomy is completely ignored by our CS curricula. [Figure: DPU computing y + a * x, fed by data streams x1, x2, ... and y1, y2, ...]
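The cell operation shown on the slide (y + a * x) is enough to build a linear pipeline. As a minimal sketch, assuming an FIR filter as the application and a simplified variant in which each x sample is broadcast to all cells while the partial sums y move from neighbour to neighbour, the behaviour can be simulated in Python. The function name `systolic_fir` and the choice of FIR filtering are illustrative assumptions, not taken from the slides.

```python
def systolic_fir(h, xs):
    """Simulate a linear array of cells, each computing y_out = y_in + a * x.

    Assumptions for this sketch: cell k holds the fixed coefficient
    a = h[K-1-k]; every x sample is broadcast to all cells in one tick;
    partial sums shift one cell to the right per tick.
    """
    k_cells = len(h)
    coeffs = h[::-1]
    regs = [0] * k_cells                  # one y register per cell
    outputs = []
    for x in xs:
        y_in = [0] + regs[:-1]            # y arriving from the left neighbour
        regs = [y + a * x for y, a in zip(y_in, coeffs)]
        outputs.append(regs[-1])          # rightmost cell emits a finished sum
    return outputs


def direct_fir(h, xs):
    """Reference: y[t] = sum_j h[j] * x[t-j], computed sequentially in time."""
    return [sum(h[j] * xs[t - j] for j in range(len(h)) if t - j >= 0)
            for t in range(len(xs))]
```

Both functions compute the same sequence; the first spreads the multiply-add operations over cells in space, the second iterates them in time.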
General Stream-based Computing System: heterogeneous DPA or rDPA. Kress DPSS [1995]: the Mapper takes an expression tree and performs simultaneous placement & routing by simulated annealing, yielding a free-form pipe network of DPU architectures; the Scheduler handles the data streams. The same mapper for both reconfigurable and hardwired. [Figure: DPUs with operators such as +, *, -, sh, xf]
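Simulated annealing as a mapping technique can be illustrated with a small placement-only sketch in Python. This is not the DPSS algorithm: routing is omitted, the cost function (total Manhattan wire length), the cooling schedule, and all names (`anneal_placement`, `wire_length`, the node labels) are assumptions chosen for this example.

```python
import math
import random


def wire_length(placement, nets):
    """Total Manhattan distance over all (source, sink) connections."""
    return sum(abs(placement[a][0] - placement[b][0]) +
               abs(placement[a][1] - placement[b][1]) for a, b in nets)


def anneal_placement(nodes, nets, rows, cols,
                     steps=2000, t0=5.0, alpha=0.995, seed=0):
    """Place operator nodes on a rows x cols array of cells (sketch only)."""
    rng = random.Random(seed)
    cells = [(r, c) for r in range(rows) for c in range(cols)]
    if len(nodes) > len(cells):
        raise ValueError("array too small for the expression graph")
    start = cells[:]
    rng.shuffle(start)
    placement = dict(zip(nodes, start))        # random initial placement
    cost = initial_cost = wire_length(placement, nets)
    best, best_cost = dict(placement), cost
    temperature = t0
    for _ in range(steps):
        candidate = dict(placement)
        node = rng.choice(nodes)
        target = rng.choice(cells)
        occupant = next((n for n, p in candidate.items() if p == target), None)
        if occupant is not None:               # swap with the current occupant
            candidate[occupant] = candidate[node]
        candidate[node] = target
        delta = wire_length(candidate, nets) - cost
        # Always accept improvements; accept worse moves with a
        # probability that shrinks as the temperature falls.
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            placement, cost = candidate, cost + delta
            if cost < best_cost:
                best, best_cost = dict(placement), cost
        temperature *= alpha
    return best, initial_cost, best_cost


# Hypothetical expression graph for out = y + a * x.
nodes = ["x", "a", "mul", "y", "add", "out"]
nets = [("x", "mul"), ("a", "mul"), ("mul", "add"), ("y", "add"), ("add", "out")]
best, initial_cost, best_cost = anneal_placement(nodes, nets, rows=4, cols=4)
```

Accepting some worse moves early on is what lets the search escape poor local arrangements; a real mapper would add routing congestion and timing terms to the cost.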
KressArray Xplorer (Platform Space Explorer for the KressArray design space). Inputs: User; Application Set; DPSS Source Input. Components: Architecture & Mapping Editor; KressArray DPSS; Datastream Generator; Datapath Generator; HDL Generator; Simulator; Delay & Power Estimator; Statistics; Improvement Proposal Generator; intermediate form.
Memory Communication Architecture: a hot research topic in embedded systems. Storage context transformations [Catthoor, Herz, Kougia, Soudris]. Synthesizable Memory Communication Architecture: startups provide memory IP or generators. Optimized Parallel Memory Controller: sequencers, memory ports. GAG generic sequencer methodology available [Herz]. [Figure: an example by Nageldinger's KressArray Xplorer; legend: application; not used]
for a Stream-based Soft Machine. [Figure: Compiler; Scheduler; Sequencers (data stream generator); Memory (data memory): memory bank, ...; rDPA]
Xputer: The Soft Machine Paradigm. Computer (“von Neumann”): Compiler, Memory, hardwired Datapath, Sequencer; state register: program counter; tightly coupled by compact instruction code; does not support soft data paths. Reconfigurable Computer: “von Neumann” is the wrong Machine Paradigm. Xputer (University of Kaiserslautern): Compiler, Scheduler, Memory, multiple sequencers, reconfigurable Datapath Array; state register: data counter; loosely coupled by decision data bits only; also for hardwired [Brodersen]. Fundamentals available (course on Wednesday).
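The contrast between a program counter and a data counter can be sketched in Python. The example below assumes a simple row-by-row scan pattern over a block of memory; the function names (`data_sequencer`, `read_stream`, `dpu_multiply_accumulate`) and the scan pattern are hypothetical and are not the actual Xputer sequencer design.

```python
def data_sequencer(base, rows, cols, row_stride):
    """Data counter: generate the addresses of a rows x cols block, row by row."""
    for r in range(rows):
        for c in range(cols):
            yield base + r * row_stride + c


def read_stream(memory, addresses):
    """Turn an address stream into a data stream read from memory."""
    for address in addresses:
        yield memory[address]


def dpu_multiply_accumulate(stream, a):
    """Data-driven DPU: fires once per arriving operand, no instruction fetch."""
    y = 0
    for x in stream:
        y = y + a * x
        yield y


memory = list(range(100))                       # stand-in for a data memory bank
addresses = list(data_sequencer(base=10, rows=2, cols=3, row_stride=10))
results = list(dpu_multiply_accumulate(read_stream(memory, addresses), a=2))
```

The datapath function contains no control flow of its own beyond consuming the stream; all sequencing lives in the address generator, which is the point of the paradigm.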
Co-Compilation: Jürgen Becker's Co-DE-X Co-Compiler [ASP-DAC'95]. Hardware / Software Co-Design turns to Configware / Software Co-Design. High level programming language source: X-C. Partitioning compiler: Analyzer / Profiler and Partitioner, using Resource Parameters, supporting different platforms. Software running on the Processor (Computer Machine Paradigm): GNU C compiler. Configware running on the Xputer (“Soft” Machine Paradigm; Reconfigurable Accelerators): X-C compiler and KressArray DPSS. The two sides are connected by an interface.
Loop Transformation Examples (resource parameter driven Co-Compilation). Sequential processes: loop 1-16 body endloop. Strip mining: fork; loop 1-8 body endloop; loop 9-16 body endloop; join. Host: loop 1-8 trigger endloop; reconf. array: loop 1-8 body endloop. Loop unrolling, host: loop 1-4 trigger endloop; loop 1-2 trigger endloop.
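The two transformations named on this slide can be shown on a toy loop in Python. This is a minimal sketch: the loop body, the strip length of 8, and the unrolling factor of 2 are assumptions, and the strips run one after another here, whereas in co-compilation they could be forked onto separate resources.

```python
def body(i):
    """Hypothetical loop body."""
    return i * i


def sequential(n=16):
    # loop 1-n body endloop
    return [body(i) for i in range(1, n + 1)]


def strip_mined(n=16, strip=8):
    # Split the iteration space into strips (1-8, 9-16, ...).
    strips = [range(start, min(start + strip, n + 1))
              for start in range(1, n + 1, strip)]
    partial = [[body(i) for i in s] for s in strips]   # fork: one task per strip
    return [value for part in partial for value in part]  # join


def unrolled_by_two(n=16):
    # Two copies of the body per iteration: half as many loop-control steps.
    results = []
    i = 1
    while i + 1 <= n:
        results.append(body(i))
        results.append(body(i + 1))
        i += 2
    if i <= n:                       # leftover iteration when n is odd
        results.append(body(i))
    return results
```

Which strip length or unrolling factor is chosen would depend on the resource parameters of the target array, for example how many body instances fit on it at once.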
>> Conclusions (outline: Introduction; FPGA boom; Coarse Grain Architectures (rDPAs); Programming rDPAs; Conclusions & Future Developments)
FPGA CPUs. [Figure: FPGA containing a soft CPU core and Memory; HLL Compiler]
core | architecture | platform
MicroBlaze, 125 MHz, 70 D-MIPS | 32 bit st'd RISC, 32 reg. by 32 LUT RAM-based reg. | Xilinx, up to 100 on one FPGA
Nios | 16-bit instr. set | Altera Mercury
Nios, 50 MHz | 32-bit instr. set | Altera, 22 D-MIPS
Nios | 8 bit | Altera Mercury
gr | bit |
gr | bit |
My80 | i8080A | FLEX10K30 or EPF6016
DSPuva16 | 16 bit DSP | Spartan-II
Leon, 25 MHz | SPARC |
ARM7 clone | ARM |
uP | bit CISC, 32 reg. | 200 XC4000E CLBs
REGIS | 8 bit instr. | 2 Xilinx 3020 LCA
Reliance-1 | 12 bit DSP | Lattice 4 isp30256, 4 isp1016
1Popcorn-1 | 8 bit CISC | Altera, Lattice, Xilinx
Acorn-1 | | 1 Flex 10K20
YARD-1A | 16-bit RISC | old Xilinx FPGA Board
xr16 | RISC integer C | SpartanXL
Academic FPGA CPUs: UCSC (1990!); Mälardalen University, Eskilstuna, Sweden; Chalmers University, Göteborg, Sweden; Cornell University; Hiroshima City University, Japan; Tokai University, Japan; Universidad de Valladolid, Spain; Washington University, St. Louis; Gray Research; Georgia Tech; Michigan State; Virginia Tech; New Mexico Tech; UC Riverside.
Soft rDPA? Rapid technology progress: 50 million system gates soon. Slower clock: compensated by more parallelism. Even large rDPAs as a soft IP become feasible. By >2005: don't care about area efficiency? FPGAs for relocatable configware code? Compatibility at configuration code level? [Figure: FPGA holding Memory, soft CPU, miscellaneous, and soft DPU arrays; HLL Compiler]
Main problems to be solved. Most successful µprocessor: object code compatibility; widely accepted OS & tools; most software written for it. Dominant FPGA vendor needs: configware object code compatibility; widely accepted “OS” & tools; most configware written for it. FPGA-based de facto standards: de facto standard configware libraries; configware code compatibility by a de facto standard RC platform family; scalable FPGA architectures supporting relocatable configuration code. Important: relocatable code, scalable memory. Education: the dichotomy of computing in space versus computing in time (systolic arrays etc.) and FPGA awareness must become widely spread; curricular innovations are urgently needed; compilers to avoid needing HDL-savvy users.
However, current CS Education ... is based on the Submarine Model. Hardware invisible: under the surface. [Figure: Algorithm; Software: procedural high level Programming Language, Assembly Language; Hardware below the surface]. Brain usage: procedural-only. Software Faculty Colleagues shy away from the Paradigm Shift: their Brain hurts? Can't be: this Half has been amputated. This model disables...
Hardware and Software as Alternatives. Algorithm -> partitioning -> Software only; Software & Hardw/Configw; Hardw/Configw only. Software: procedural. Hardware, Configware: structural. Brain Usage: both Hemispheres.
The Dominance of the Submarine Model ... indicates that our CS education system produces zillions of mentally disabled Persons: (procedural) structurally disabled ... completely disabled to cope with solutions other than software only. It's time to attack the software faculty dictatorship. Get involved!
Thank you for listening.