Reconfigurable HPC Reconfigurable HPC part 4 miscellaneous Reiner Hartenstein TU Kaiserslautern May 14, 2004, TU Tallinn, Estonia.

1 Reconfigurable HPC, part 4: miscellaneous. Reiner Hartenstein, TU Kaiserslautern. May 14, 2004, TU Tallinn, Estonia.

2 © 2004, reiner@hartenstein.de http://hartenstein.de TU Kaiserslautern. Time to Market: A Fundamental Paradigm Shift in Silicon Application. [Figure: revenue per month vs. time in months (axis ticks 1, 10, 20, 30), comparing an ASIC product with a reconfigurable product that receives Update 1 and Update 2 by download.] [Tom Kean]

3 Makimoto's 3rd wave. Reconfigurability: the next revolution. EDA industry paradigm switching every 7 years [Keutzer / Newton]: 1978 Transistor entry: Applicon, Calma, CV...; 1985 Schematics entry: Daisy, Mentor, Valid...; 1992 Synthesis: Cadence, Synopsys...; 1999 (Co-)Compilation & data-stream-based (r)DPAs [Hartenstein]; 2006 Paradigm shift. [Figure: McKinsey curve with mainstream and tornado phases; Richard Newton.] 82% of designers hate their tools [Keutzer / Newton].

4 Software to Configware Migration. This talk illustrates the performance benefit that may be obtained from Reconfigurable Computing (RC). Stressing the coarse-grain RC point of view, it hardly mentions FPGAs (but coarse grain can always be mapped onto FPGAs). Software to Configware migration is the most important source of speed-up. Hardware is just frozen Configware.

5 rGA-based design: delivered directly to the customer, completely configured; omits emulation; avoids specific silicon …. [Figure: number of design starts; N. Tredennick, Gilder Technology Report, 2003]

6 Mega-rGAs. [Figure: system gates per rGA chip (log scale, 100 to 10 000 000) by year, 1984 to 2004; labeled devices include XC 4085XL, Virtex, XC 40250XV, Virtex II and a planned device. Xilinx data.]

7 © 2002, reiner@hartenstein.de http://kressarray.de University of Kaiserslautern. Embedded hardw. CPU & memory cores on chip. [Diagram labels: HLL, Compiler; CPU core, FPGA core, memory core.] [à la S. Guccione]

8 Xilinx Virtex-II Pro FPGA architecture: entire system on a single chip, all you need on board. [Diagram labels: FPGA fabric based on Virtex-II architecture; on-chip memory controller; PowerPC 405 RISC CPU (PPC405) cores; embedded RAM; Rocket IO.] Source: Ivo Bolsens, Xilinx

9 What About PLD Cores on ASICs? (Embedded FPGA fabric.) What's Wrong with This Picture? 1. Still have to make the chip. 2. Need two sets of software to build it: –the ASIC flow –the PLD flow. 3. Have no idea what to connect the PLD pins to –chances are, you are going to get it wrong! [Jonathan Rose]

10 What's Right with This Picture! 1. Pre-fabricated. 2. One CAD tool flow! 3. Can connect anything to anything: PLDs are built for general connectivity. [Diagram labels: embedded CPU; serial link, analog, "etc."] [Jonathan Rose]

11 >> rGAs << Outline: rGAs; Placement & Routing; Soft Processors; History of Frameworks; RTR; Support by rGA vendors; EDA; Future directions; Conclusions. http://www.uni-kl.de

12 Different morphware platforms: reconfigurable logic blocks (fine-grain reconfigurable); reconfigurable datapath arrays (coarse-grain reconfigurable); reconfigurable interconnect blocks; reconfigurable interconnect fabrics.

13 rGA with island architecture (detail). Interconnect fabrics. [Diagram labels: switch box, connect box, reconfigurable logic block.]

14 Switch box. [Diagram labels: switch point, switch box.]

15 Connect box. [Diagram label: connect point.]

16 Connect point activated. [Diagram label: connect point (enlarged).]

17 Switch boxes activated. [Diagram labels: 3 switch points; the 4th switch point; the 5th switch point; switch point, switch box.]

18 Result

19 Routing completed for 1 net (A to B). 1979: Silvar-Lisco (Silicon Valley Research Corp.) offers CALM-P. 20 transistors + 20 flip-flops.

20 >> Placement & Routing <<

21 Routing: a long-distance net from A to B, passing through. A path may be used for only one signal at a time ... (cf. the Bridges of Königsberg).

22 Routing congestion. C and D are not reachable: C cannot be connected with D, so C and D need another placement. rLBs are not 100% usable.

23 Leonhard Euler. Euler's problem of the bridges of Königsberg (1736) is such a network: find a route that crosses each bridge exactly once. Also an optimization: none of the bridges is unused.

24 L. Euler: Solutio problematis ad geometriam situs pertinentis; Commentarii Academiae Scientiarum Imperialis Petropolitanae 8 (1736), pp. 128-140. [Diagram: graph with nodes and edges; nodes are Left Bank, Right Bank, Kneiphof Island, Other Island.]
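Euler's argument can be checked mechanically. The sketch below assumes the usual textbook layout of the seven bridges (the edge list is not given on the slide) and uses the node names from slide 24. A walk that crosses every bridge exactly once exists only if the graph is connected and has zero or two nodes of odd degree.

```python
from collections import defaultdict

# Assumed textbook bridge layout; the slide only names the four land masses.
BRIDGES = [
    ("Kneiphof Island", "Left Bank"),
    ("Kneiphof Island", "Left Bank"),
    ("Kneiphof Island", "Right Bank"),
    ("Kneiphof Island", "Right Bank"),
    ("Kneiphof Island", "Other Island"),
    ("Other Island", "Left Bank"),
    ("Other Island", "Right Bank"),
]


def node_degrees(edges):
    """Count how many bridge ends touch each land mass."""
    degrees = defaultdict(int)
    for a, b in edges:
        degrees[a] += 1
        degrees[b] += 1
    return dict(degrees)


def is_connected(edges):
    """Depth-first search over the undirected multigraph."""
    adjacent = defaultdict(set)
    for a, b in edges:
        adjacent[a].add(b)
        adjacent[b].add(a)
    if not adjacent:
        return True
    start = next(iter(adjacent))
    seen, stack = {start}, [start]
    while stack:
        for neighbour in adjacent[stack.pop()]:
            if neighbour not in seen:
                seen.add(neighbour)
                stack.append(neighbour)
    return len(seen) == len(adjacent)


def has_euler_walk(edges):
    """True if some walk uses every edge exactly once."""
    odd = sum(1 for d in node_degrees(edges).values() if d % 2)
    return is_connected(edges) and odd in (0, 2)


print(node_degrees(BRIDGES))
print(has_euler_walk(BRIDGES))
```

With this layout all four nodes have odd degree, so no such walk exists, which is Euler's 1736 result.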

25 Crossbar switch. 1913: J. N. Reynolds's crossbar switch; 1915: patent granted. 1919: Betulander's crossbar switch; 1926: first public telephone switching application, in Sweden. 1964: NASA telemetrics crossbar array.

26 Crossbar: complete? One bar connects 2 pins. Size of a full (complete) switch: n x n / 2. Examples: n = 4 gives n x n/2 = 8; n = 100 gives n x n/2 = 5000. [Figure: number of crossbar chips needed, full vs. partial, with crossbar chips in a row.] Crossbar chips are available from Aptix, Texas Instruments and others.
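The size estimate on slide 26 is easy to reproduce. The sketch below uses hypothetical function names. The first function is the slide's n x n / 2 rule; the second is the exact count of distinct pin pairs, n(n-1)/2, added only for comparison.

```python
def full_crossbar_size(n):
    """Slide 26 estimate: switches in a full crossbar for n pins (floored for odd n)."""
    return n * n // 2


def exact_pin_pairs(n):
    """Exact number of distinct pin pairs; not on the slide, shown for comparison."""
    return n * (n - 1) // 2


for pins in (4, 100):
    print(pins, full_crossbar_size(pins), exact_pin_pairs(pins))
```

Both counts grow quadratically with the pin count, which is why the slide asks how many crossbar chips a full switch would need.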

27 Routing congestion: example with detour. Direct connection impossible; detour connection by routing through an rLB configured with the identity function. rGA routing resources: logic gates and/or pass transistors.
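A toy model helps make detours and congestion concrete. The sketch below is not the router of any rGA tool: it assumes a small grid of routing cells ("." free, "#" occupied) and uses a breadth-first, Lee-style search. The grid encoding and the names `route` and `occupy` are made up for illustration.

```python
from collections import deque


def route(grid, start, goal):
    """Breadth-first search for a shortest path of free cells, or None."""
    rows, cols = len(grid), len(grid[0])
    previous = {start: None}
    queue = deque([start])
    while queue:
        cell = queue.popleft()
        if cell == goal:
            path = []
            while cell is not None:
                path.append(cell)
                cell = previous[cell]
            return path[::-1]
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and grid[nr][nc] == "." and (nr, nc) not in previous:
                previous[(nr, nc)] = cell
                queue.append((nr, nc))
    return None


def occupy(grid, path):
    """Mark the cells of a routed net as used: one signal per path at a time."""
    cells = set(path)
    return [
        "".join("#" if (r, c) in cells else ch for c, ch in enumerate(row))
        for r, row in enumerate(grid)
    ]


# Detour: the direct connection along row 0 is blocked.
DETOUR = [".#.",
          ".#.",
          "..."]
print(route(DETOUR, (0, 0), (0, 2)))

# Congestion: once the first net is routed, the second net is unreachable.
FREE = ["...",
        "...",
        "..."]
first_net = route(FREE, (1, 0), (1, 2))
print(route(occupy(FREE, first_net), (0, 1), (2, 1)))
```

In the second case the only remedy within this model is a different placement of the endpoints, as slide 22 notes.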

28 Crossbar-based architectures. 1993: PADDI-II (Jan Rabaey). 1990: UC Berkeley (Jan Rabaey), 16 bit. 1997: Pleiades (mesh & crossbar), 32 bit.

29 PADDI-II Architecture

30 >> Soft Processors <<

31 FPGA CPUs in teaching and academic research. UCSC: 1990! Mälardalen University, Eskilstuna, Sweden; Chalmers University, Göteborg, Sweden; Cornell University; Gray Research; Georgia Tech; Hiroshima City University, Japan; Michigan State; Universidad de Valladolid, Spain; Virginia Tech; Washington University, St. Louis; New Mexico Tech; UC Riverside; Tokai University, Japan.

32 Some soft CPU core examples
core | architecture | platform
MicroBlaze, 125 MHz, 70 D-MIPS | 32-bit standard RISC, 32 reg. by 32 LUT RAM-based reg. | Xilinx, up to 100 on one FPGA
Nios | 16-bit instr. set | Altera Mercury
Nios, 50 MHz | 32-bit instr. set | Altera, 22 D-MIPS
Nios | 8 bit | Altera Mercury
gr1040 | 16-bit | -
gr1050 | 32-bit | -
My80 | i8080A | FLEX10K30 or EPF6016
DSPuva16 | 16-bit DSP | Spartan-II
Leon, 25 MHz | SPARC | -
ARM7 clone | ARM | -
uP1232, 8-bit | CISC, 32 reg. | 200 XC4000E CLBs
REGIS | 8-bit instr. + ext. ROM | 2 XILINX 3020 LCA
Reliance-1 | 12-bit DSP | Lattice 4 isp30256, 4 isp1016
1Popcorn-1 | 8-bit CISC | Altera, Lattice, Xilinx
Acorn-1 | - | 1 Flex 10K20
YARD-1A | 16-bit RISC, 2 opd. instr. | old Xilinx FPGA board
xr16 | RISC, integer C | SpartanXL

33 Some soft CPU core examples (German version of slide 32: the same table with the columns in reverse order, platform / architecture / core). Annotations: Configware! (not hardware). Retro-emulation.

34 It's a Paradigm Shift! Using FPGAs (fine-grain reconfigurable) has mainly been classical logic synthesis on a "strange hardware" platform. Coarse-grain reconfigurable arrays (rDPAs), i.e. Reconfigurable Computing, however, mean a really fundamental paradigm shift. This is still ignored by CS and EE curricula and by almost all R&D scenes.

35 Why the speed-up ... although the FPGA clock is slower by a factor of 3 or even more? (Most know-how comes from the "high-level synthesis" discipline.) Moving the operator to the data stream (before run time). Support operations: no clock nor memory cycles. Decisions without memory cycles nor clock cycles. Most "data fetch" without memory cycles.

36 >> History of Frameworks <<

37 Goal: away from complex design flow. [Diagram labels: Schematics/HDL, Netlister, Netlist, Place and Route, Bitstream; HLL, Compiler.] [à la S. Guccione]

38 Overcome traditional separate design flow. [Diagram labels: User Code, Compiler, Executable; Schematics/HDL, Netlister, Netlist, Place and Route, Bitstream; HLL, Compiler.] [à la S. Guccione]

39 Overcome traditional separate co-processing design flow -> JBits design flow. [Diagram labels: User Java Code, Java Compiler, JBits API, Executable; User Code, Compiler, Executable; Schematics/HDL, Netlister, Netlist, Place and Route, Bitstream.] [à la S. Guccione]

40 New directions in application development. Automatically partitioning compilers for designer productivity, like CoDe-X (Jürgen Becker, Univ. of Karlsruhe). Support for Run-Time Reconfiguration (RTR), a key enabler of error handling and fault correction by partially re-routing the FPGA at run time, as well as remote patching for upgrading, remote debugging, and remote repair by reconfiguration, even over the internet.

41 >> RTR <<

42 CPU use for configuration management. An on-board microprocessor CPU is available anyhow, even along with a little RTOS; use this CPU for configuration management. [Diagram labels: HLL, Compiler, RTR System Design.]

43 Hard CPU & memory core on the same chip. [Diagram labels: CPU core, FPGA core, memory core; HLL, Compiler; RTR System Design.]

44 Converging factors for RTR. Converging factors make RTR-based system design viable: 1) Million-gate FPGA devices and co-processing with standard microprocessors are commonplace, allowing direct implementation of complex algorithms in FPGAs. This alone has already revolutionized FPGA design. 2) New tools like the Xilinx JBits software tool suite directly support co-processing and RTR. [Diagram labels: User Java Code, Java Compiler, JBits API, Executable.]

45 RTR divides an application into a series of sequentially executed stages, each mapped as a separate execution module. Excellent example: the Xtrem platform by PACT AG, Munich. Without RTR, all configurable platforms are just ASIC emulators. Tools that directly support development and debugging of RTR applications will also heavily influence future system organization.

46 >> Support by rGA vendors <<

47 Support by FPGA vendors. –Xilinx: software by Xilinx, configware (soft IP cores), hardware. –Altera: software, configware, hardware.

48 Xilinx. Fabless FPGA semiconductor vendor, San Jose, CA, founded 1984. Key patents on FPGAs (expiring in a few years). Fortune 2001: No. 14 among the best companies to work for (Intel: no. 42, HP: no. 64, TI: no. 65). DARPA grant (Nov. '99) to develop JBits API tools for internet-reconfigurable / upgradable logic (with VT). Less brilliant in the early/mid 90s (president Curt Wozniak): by 1995 market share was down from 84% to 62% [Dataquest]. As designs got larger, Xilinx lost its advantage (bug fixes did not require burning new chips); meanwhile, weeks of expensive debug time are needed.

49 Software by Xilinx. Full design flow from Cadence, Mentor, and Synopsys. Xilinx Software AllianceEDA Program: –Alliance Series Development System –Foundation Series Development Systems –Xilinx Foundation Series ISE (Integrated Synthesis Environment) –free WebPOWERED SW w. WebFitter & WebPACK-ISE –StateCAD XE and HDL Bencher –Foundation Base Express –Foundation ISE Base Express. More: ModelSim Xilinx Edition (ModelSim XE) | Forge Compiler | Modular Design | Chipscope ILA | The Xilinx System Generator | XPower | JBits SDK | The Xilinx XtremeDSP Initiative | MathWorks / Xilinx Alliance | The Wind River / Xilinx alliance

50 Configware (soft IP products). For libraries, creation and reuse of configware. To search for IPs see: List of all available IP. The AllianceCORE program is a cooperation between Xilinx and third-party core developers. The Xilinx Reference Design Alliance Program. The Xilinx University Program. LogiCORE soft IP, with LogiCORE PCI Interface. Consultants.

51 Xilinx hardware. Virtex, Virtex-II: first with 1 million system gates. –Virtex-E series: > 3 million system gates. –Virtex-EM: on a copper process, with additional on-chip memory for network switch applications. –The Virtex XCV3200E: > 3 million gates, 0.15-micron technology. Spartan, Spartan-XL, Spartan-II: –for low-cost, high-volume applications as ASIC replacements –multiple I/O standards, on-chip block RAM, digital delay lock loops –eliminate phase lock loops, FIFOs, I/O translators, system bus drivers. XC4000XV, XC4000XL/XLA, CPLD: low-cost families –rapid development, longer system life, robust field upgradability –support In-System Programming (ISP), in-board debugging –test during manufacturing, field upgrades, full JTAG-compliant interface. CoolRunner: low power, high speed/density, standby mode. Military & Aerospace: QPRO high-reliability, QML certified. Configuration storage devices.

52 Altera. Altera was founded in June 1983. EDA: synthesis, place & route, and verification. Quartus II: APEX, Excalibur, Mercury, FLEX 6000 families. MAX+PLUS II: FLEX, ACEX & MAX families. Flow with Quartus II: Mentor Graphics, Synopsys, and Synplicity deliver design software to support Altera SOPC solutions. Mentor: the only EDA vendor with a complete design environment for APEX II, incl. IP, design capture, simulation, synthesis, and h/s co-verification. Configware: Altera offers over a hundred IP cores. Third-party IP core design services and consultants.

53 Altera hardware. Newer families: APEX 20KE, APEX 20KC, APEX II, MAX 7000B, ACEX 1K, Excalibur, Mercury. –APEX EP20K1500E (0.18-µ), up to 2.4 million system gates –APEX II (all-copper 0.13-µ) for data path applications, supports many I/O standards, 1-Gbps True-LVDS performance –wQ2001, an ARM-based Excalibur device. Altera mainstream: MAX 7000A, 3000A; FLEX 6000, 10KA, 10KE; APEX 20K families. Mature and other: Classic, MAX 7000, 7000S, 9000; FLEX 8000, 10K families.

54 >> EDA <<

55 >> EDA << EDA as the key enabler (major EDA vendors): Altera; Cadence; Mentor Graphics; Synopsys; Xilinx. Changing EDA tools market.

56 EDA as the key enabler (major EDA vendors). Select by EDA quality / productivity, not by FPGA architecture. EDA often has massive software quality problems. Customer's highest priority: an EDA center of excellence –collecting EDA expertise and EDA user experience –to assemble the best possible tool environments –for optimum support of design teams –to cope with interoperability problems –to keep track of the EDA scene as a rapidly moving target. Being fabless, FPGA vendors spend their most qualified manpower on the development of EDA, IP cores, applications, and support. Xilinx and Altera are morphing into EDA companies.

57 Cadence. FPGA Designer: top-down FPGA design system with high-level mapping, architecture-specific optimization, and Verilog, VHDL, and schematic-level design entry. Verilog and VHDL feed Synergy (logic synthesis) and FPGA Designer. FPGAs are simulated by themselves using Cadence's Verilog-XL or Leapfrog VHDL simulators, and simulated with the rest of the system design using the Logic Workbench board/system verification environment. Libraries for the leading FPGA manufacturers.

58 Mentor Graphics. System design and verification; PCB design and analysis; IC design and verification. Shifts the ASIC design flow to FPGAs (Altera, Xilinx): –by FPGA Advantage with IP support –by ModuleWare –by Xilinx CORE Generator and Altera MegaWizard integration.

59 Synopsys. FPGA Compiler II; version of ASIC Design Compiler Ultra; Block Level Incremental Synthesis (BLIS); ASIC / FPGA migration. Vendors: Actel, Altera, Atmel, Cypress, Lattice, Lucent, Quicklogic, Triscend, Xilinx.

60 New directions in application development (repeats slide 40).

61 Converging factors for RTR (repeats slide 44).

62 RTR divides an application into a series of sequentially executed stages, each implemented as a separate execution module. Partial RTR partitions these stages into finer-grain sub-modules to be swapped in as needed. Without RTR, all configurable platforms are just ASIC emulators. RTR needs a new kind of application development environment: tools that directly support development and debugging of RTR applications are essential for the advancement of configurable computing and will also heavily influence future system organization. Xilinx, VT, and BYU work on run-time kernels, run-time support, RTR debugging tools and other associated tools. Benefits: smaller, faster circuits; simplified hardware interfacing; fewer IOBs; smaller, cheaper packages; simplified software interfaces.
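The effect of swapping sub-modules in "as needed" can be illustrated with a small counting model. This is only a sketch under stated assumptions: the fabric is modelled as a fixed number of module slots, and least-recently-used replacement is assumed, which the slide does not specify. The module names and the function name are hypothetical.

```python
from collections import OrderedDict


def count_reconfigurations(schedule, slots):
    """Count module loads for a sequence of stages on a fabric with `slots` slots.

    Assumes least-recently-used replacement (an illustrative policy only).
    """
    if slots < 1:
        raise ValueError("the fabric needs at least one module slot")
    resident = OrderedDict()  # module name -> True, oldest use first
    loads = 0
    for module in schedule:
        if module in resident:
            resident.move_to_end(module)      # already configured: reuse it
        else:
            loads += 1                        # a reconfiguration is needed
            if len(resident) >= slots:
                resident.popitem(last=False)  # evict the least recently used
            resident[module] = True
    return loads


STAGES = ["fir", "fft", "fir", "fft", "viterbi", "fir"]
for slots in (1, 2, 3):
    print(slots, count_reconfigurations(STAGES, slots))
```

With one slot every stage change costs a reconfiguration; more slots trade fabric area for fewer loads.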

63 Run-time mapping. Run-time reconfigurable are: the Xilinx VIRTEX FPGA family; the RAs that are part of Chameleon CS2000 series systems. Using such devices changes many of the basic assumptions in the HW/SW co-design process: host/RL interaction is dynamic and needs a tiny OS like eBIOS, also to organize RL reconfiguration under host control. The typical goal is minimization of reconfiguration latency (especially important in communication processors), i.e. hiding configuration loading latency, and scheduling to find the 'best' schedule for eBIOS calls (C side).
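Hiding configuration loading latency can be quantified with a simple timing model. The sketch below assumes that the next stage's configuration can be loaded while the current stage executes (for instance through partial reconfiguration or a second configuration context); the stage times are made-up numbers and `total_time` is a hypothetical helper, not an eBIOS call.

```python
def total_time(stages, overlap):
    """Total run time for stages given as (config_time, exec_time) pairs.

    overlap=False: each stage is configured, then executed.
    overlap=True: the next configuration is loaded during the current execution.
    """
    if not stages:
        return 0
    if not overlap:
        return sum(config + execute for config, execute in stages)
    total = stages[0][0]  # the first configuration cannot be hidden
    for index, (_, execute) in enumerate(stages):
        last = index + 1 == len(stages)
        next_config = 0 if last else stages[index + 1][0]
        # The fabric is busy until both the execution and the prefetch finish.
        total += max(execute, next_config)
    return total


STAGE_TIMES = [(4, 10), (6, 3), (2, 8)]  # illustrative time units
print(total_time(STAGE_TIMES, overlap=False))
print(total_time(STAGE_TIMES, overlap=True))
```

In this model a configuration is fully hidden whenever it is no longer than the execution it overlaps, so a scheduler would try to pair long executions with long configuration loads.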

64 >> Future directions <<

65 Soft CPU: new job for compilers. [Diagram labels: HLL, Compiler; soft CPU, FPGA, memory core.]

66 Soft rDPA feasible? [Diagram label: rDPU array.] [à la S. Guccione]

67 Array I/O examples. [Diagram: rDPU array with I/O as data streams, or from / to embedded memory banks.] [à la S. Guccione]

68 HLL 2 Soft Array. [Diagram labels: HLL, Compiler; soft CPU, soft DPU array, memory, miscellaneous.] [à la S. Guccione]

69 HLL 2 "flex" rDPA. [Diagram labels: HLL, Compiler; CPU, rDPU array, memory, miscellaneous.] [à la S. Guccione]

70 >> HLLs <<

71 HLLs for Hardware Design vs. System Design vs. RTR System Design. [Diagram labels: HLL, Compiler, System Design; HLL, Compiler, RTR System Design.] [à la S. Guccione]

72 HLLs for Hardware Design vs. System Design vs. RTR System Design. [Diagram labels: HLL, Compiler, System Design; HLL, Compiler, RTR System Design; HLL, Compiler.] [à la S. Guccione]

73 CPU and memory on chip. [Diagram labels: CPU core, FPGA core, memory core; HLL, Compiler; RTR System Design.] [à la S. Guccione]

74 JBits environment. [Diagram labels: User Code, JBits API, RTP Core Library, JRoute API, BoardScope Debugger, XHWIF, TCP/IP, Device Simulator.] [à la S. Guccione]

75 HLLs for Hardware Design vs. System Design vs. RTR System Design. [Diagram labels: HLL, Compiler; HLL, Compiler, System Design.] [à la S. Guccione]

76 Embedded System Design. [Diagram labels: HLL, Compiler, CPU core, FPGA core, memory core; HLL, Compiler, soft CPU, FPGA, memory core.] [à la S. Guccione]

77 >> Conclusions <<

78 Missing the next revolution. Ignoring reconfigurable computing when teaching computing fundamentals within our CS curricula is one of the biggest mistakes in the history of information technology application, causing the waste of billions of dollars.

79 "EDA industry shifts into CS mentality" [Wojciech Maly]: microprogramming to replace FSM design; hardware languages replace EE-type schematics; EDA software and its interfacing languages; newer system-level languages like SystemC etc.; small and large module re-use; hierarchical organization of designs, EDA, et al. ...

80 "EDA industry shifts into CS mentality" [Wojciech Maly]. Which language to select?

81 Roadmap. Old CS lab course philosophy: given an application, implement it by a program. New CS freshman lab course environment: given an application, a) implement it by writing a program; b) implement it as a morphware prototype; c) partition it into P and Q: c.1) implement P by software, c.2) implement Q by morphware, c.3) implement the P / Q communication interface.

82 All enabling technologies are available: the anti machine and all its architectural resources; parallel memory IP cores and generators; anything else needed; languages & (co-)compilation techniques; morphware vendors like PACT ...; literature from the last 30 years.

83 END

84 The dichotomy of models. Note for von Neumann: the state register is with the CPU. Note for the anti machine: the state register is with the memory bank (state registers are within the memory banks).

85 Machine paradigms ("instruction fetch"); also hardwired implementations*. *) e.g. BEE project, Prof. Brodersen

86 Benefit from RAM-based platforms & the 2nd paradigm. 1) A RAM-based platform is needed for flexibility and programmability, and to avoid the need for specific silicon (mask cost: currently 2 million $, rapidly growing). 2) A simple 2nd machine paradigm is needed as a common model: to avoid the need for circuit expertise, and to educate zillions of programmers.

87 Design Space Exploration Systems

