Reconfigurable Supercomputing: What are the Problems? What are the Solutions? Reiner Hartenstein TU Kaiserslautern Dagstuhl, Germany, April 2 - 7, 2006.


1 Reconfigurable Supercomputing: What are the Problems? What are the Solutions? Reiner Hartenstein TU Kaiserslautern Dagstuhl, Germany, April 2 - 7, 2006 Dynamically Reconfigurable Architectures

2 © 2006, reiner@hartenstein.de http://hartenstein.de TU Kaiserslautern 2
ISC2006 BoF Session: Title and Abstract
Is Reconfigurable Computing the Next Generation Supercomputing?
Advances in reconfigurable computing, particularly FPGA (field-programmable gate array) technology, have reached a performance level where they rival and exceed the performance of general purpose processors for the right applications. FPGAs have gotten cheaper thanks to smaller geometries, multimillion gate counts and volume market leverage from ASIC preproduction and other conventional uses. The potential benefit from the widespread incorporation of FPGA technology into high-performance applications is high, provided present day barriers to their incorporation can be overcome. This session will focus on defining the anticipated market changes, anticipated roles of FPGA technology in high-performance computing (from accelerators to hybrid architectures), characterizing present day barriers to the incorporation of FPGA technology (such as identifying the right applications), and partnering efforts required (tools, benchmarks, standards, etc.) to speed the adoption of reconfigurable technology in high-performance supercomputing.
Keywords: Reconfigurable computing, FPGA Accelerators, Supercomputing
Date and Time: This BoF session is part of the conference program and will take place within a 45 minute-slot on Wednesday 28. June 2006 from 18:00 - 19:30.
BoF Organizers: John Abott, Chief Analyst, The 451 Group, USA. Dr. Joshua Harr, CTO, Linux Networx, USA. As CTO for Linux Networx, Dr. Joshua Harr has the responsibility of laying the technical roadmap for the company and is leading the team developing cluster management tools. Josh's experience with parallel processing, distributed computing, large server farms, and Linux clustering began when he built an eight-node cluster system out of used components while in college. An industry expert, Josh has been called upon to consult with businesses and lecture in college classrooms. He earned a Ph.D. in computational chemistry and a bachelor's degree in molecular biology from BYU. Dr. Eric Stahlberg, Organizing founder OpenFPGA, Ohio Supercomputer Center (OSC), USA.
The Supercomputing Paradox: rapidly growing listed Teraflops; often limited sustained Teraflops; almost stalled application implementation progress; increasing number of processors running in parallel; COTS processor decreasing cost; very high total cost of the Tera(?)flops; promising technology, poor results; scientists waiting for affordable compute capacity.

3 Dangerously telling this to the supercomputing people: You … used the wrong roadmap for the past 20 years!

4 Progress stalled

5 3 Reconfigurable Computing Paradoxes: the high performance paradox, the low power paradox, the Reconfigurable Computing education paradox

6 The Pervasiveness of RC. [Figure: Google hit counts. Series labeled "# of hits by Google": 162,000; 127,000; 158,000; 113,000; 171,000; 194,000. Series labeled "# of hits by Google search 'FPGA and …'": 1,620,000; 915,000; 398,000; 272,000; 647,000; 1,490,000.]

7 Going into every application area: almost 10 million hits

8 The Reconfigurable Computing Education Paradox: its runaway, accelerating pervasiveness, despite all these educational deficits. Curricula still ignore these extremely hot new challenges. In addition to the hardware / software chasm, we now also have the hardware / configware / software chasm.

9 Computing Curricula 2004 (1): within about 500 pages, the term "reconfigurable" is not found, nor are its synonyms.

10 Obsolete: von Neumann's monopoly inside curricula is obsolete.

11 von Neumann is not the common model. [Diagram labels: von Neumann instruction-stream-based machine (CPU with program counter and DPU, RAM memory, von Neumann bottleneck). Mainframe age: CPU (instruction-stream-based, software) with co-processors (hardware). Microprocessor age: CPU (instruction-stream-based, software) with accelerator (data-stream-based, hardware, morphware); the tail is wagging the dog.] vN paradigm dominance? Dual paradigm.

12 The new model is reality in modern FPGA bestsellers: FPGA fabrics, together with several microprocessors, several memory banks, and other IP cores, on the same COTS microchip.

13 Bill Gates, in a speech at a summit meeting of US state governors: "American high schools are obsolete." "The high schools of today teach kids about today's computers like on a 50-year-old mainframe." "Without re-design for the needs of the 21st century, we will keep limiting - even ruining - the lives of millions of Americans every year."

14 Carved out of stone. The most important cultural revolution since the invention of text characters: it is not the mainframe, it is the microchip!

15 RC education needed: http://fpl.org/RCeducation/ 35 submissions from Australia, Brazil, India, the USA, and throughout Europe. Jürgen Becker, Jörg Henkel, R. Hartenstein.

16 Reconfigurable Computing Paradoxes: the high performance paradox, the low power paradox, the Reconfigurable Computing education paradox

17 The FPGA Low Power Paradox. The awful technology of FPGAs: FPGAs run at lower clock frequencies, draw much more power ("very power-hungry" [Rick Kornfeld, personal communication]), and are more expensive. Yet migrating from supercomputer to FPGA reduces the electricity bill by an order of magnitude or more.

18 Telling this to the low power design people? You … used the wrong roadmap for the past 15 years: use FPGAs! ISLPED, Oct 4 – 6, Tegernsee. PATMOS, Sep 13 – 15, Montpellier. 1991: Kaiserslautern, Germany; 1992: Paris, France; 1993: Montpellier, France.

19 Reconfigurable Computing Paradoxes: the high performance paradox, the low power paradox, the Reconfigurable Computing education paradox

20 The High Performance Paradox. The awful technology of FPGAs: FPGAs run at lower clock frequencies and are more expensive. Effective integration density is worse than the Gordon Moore curve by a factor of more than 10,000. 85% of all designers hate their tools.

21 Fine-grained RC: DeHon's 1st Law [1996: Ph.D., MIT]. [Figure: transistors / microchip (10^0 to 10^9) versus year (1980 to 2010). Curves: Gordon Moore curve (microprocessor), FPGA physical density, FPGA logical density, FPGA routed density. Gap labels: reconfigurability overhead, wiring overhead, routing congestion; overhead >> 10 000.] Immense area inefficiency.

22 Coarse-grained RC: Hartenstein's Law [1996: ISIS, Austin, TX]. [Figure: transistors / microchip (10^0 to 10^9) versus year (1980 to 2010). Curves: Gordon Moore curve, rDPA physical, rDPA logical, and, for comparison, FPGA routed (>> 10 000 below).] Area efficiency very close to Moore's law, e.g. the KressArray family.

23 Claassen's Law. [Figure: MOPS / milliWatt (0.001 to 1000, logarithmic) versus feature size (2, 1, 0.5, 0.25, 0.13, 0.1, 0.07 µ). Labels: standard microprocessor, DSP, instruction set processors, FPGAs (fine-grained reconf.), hardwired.]

24 Claassen's Law with Hartenstein's Amendment. [Figure: MOPS / milliWatt (0.001 to 1000, logarithmic) versus feature size (2, 1, 0.5, 0.25, 0.13, 0.1, 0.07 µ). Labels: standard microprocessor, DSP, instruction set processors, FPGAs (fine-grained reconf.), hardwired.] Hartenstein's Amendment: hardwired and coarse-grained reconf. (rDPA).

25 Selection of published speed-up factors (http://xputers.informatik.uni-kl.de/faq-pages/fqa.html)
[Figure: speed-up factors (10^0 to 10^9, with a mark at 100 000) versus year (1980 to 2010), plotted against microprocessor relative performance from the 8080 to the P4. Curve labels: 7% / yr, 50% / yr, Memory, X 2 / yr.]
Application areas: DSP and wireless; image processing, pattern matching, multimedia; bioinformatics; astrophysics; crypto.
Factors: Los Alamos traffic simulation: 47; real-time face detection: 6000; video-rate stereo vision: 900; pattern recognition: 730; SPIHT wavelet-based image compression: 457; Smith-Waterman pattern matching: 288; BLAST: 52; protein identification: 40; molecular dynamics simulation: 88; Reed-Solomon decoding: 2400; Viterbi decoding: 400; FFT: 100; MAC: 1000; GRAPE (astrophysics): 20.
MoM Xputer architecture: Grid-based DRC (no FPGA: DPLA on MoM by TU-KL): 2000; 2-D FIR filter (no FPGA: DPLA by TU-KL): 39.4; Lee Routing (DPLA by TU-KL): 160; Grid-based DRC ("fair comparison"): 15000.

26 DeHon's 2nd Law [IEEE COMPUTER, 2000]. [Figure: computational density (1 to 1000, logarithmic) versus feature size (2, 1, 0.5, 0.25, 0.13, 0.1, 0.07 µ). Curves: RISC, FPGA.]

27 The three RC Paradoxes: poor technology, brilliant results, poor tools, very poor education

28 Why supercomputing / HPC failed. Instruction-stream-based: memory-cycle-hungry (instruction fetch overhead, address computation overhead, sequencing overhead, and other overhead). The wrong way, how the data are moved around, because of the interconnect network architecture. The law of More:

29 Moving data around inside the Earth Simulator: 5120 processors, 5000 pins each. ES 20: TFLOPS. Crossbar weight: 220 t, 3000 km of cable.

30 Data are moved around by software, i.e. by memory-cycle-hungry instruction streams which fully hit the memory wall: extremely unbalanced. P&R: move the locality of operation, not the data! [Figure: CPU, stolen from Bob Colwell.]

31 An Archetype Common Model needed. Progress stalled by the software/configware chasm. A useful simple archetype is not widely accepted. An archetype common model should provide guidance for organizing efficient solutions, make the project manageable, and allow sharing lessons between applications and between disciplines. Further labels: support from the configware industry; undergraduate education.

32 The new paradigm: how the data are traveling. No, not by instruction execution. Transport-triggered (an old hat): pipeline, or chaining; systolic array; asynchronous (via handshake); wavefront array.

33 H. T. Kung, systolic arrays: "data streams". [Figure: a DPA (pipe network) with skewed input data streams entering and output data streams leaving; each stream is drawn over the axes time and port #.] Def.: data streams (flowware). Flowware defines which data item appears at which time at which port. Source and sink?
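
The definition above (which data item, at which time, at which port) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not code from the talk: the names `Slot`, `skewed_streams`, and `items_at` are hypothetical, and the skew of one time step per port is the usual textbook arrangement for systolic arrays rather than something the slide specifies.

```python
from typing import NamedTuple


class Slot(NamedTuple):
    """One entry of a data stream schedule (hypothetical representation)."""
    time: int    # clock step at which the item enters the array
    port: int    # number of the port it enters through
    item: float  # the data item itself


def skewed_streams(rows):
    """Feed row i through port i, delayed by i steps (assumed textbook skew)."""
    return [Slot(time=i + j, port=i, item=x)
            for i, row in enumerate(rows)
            for j, x in enumerate(row)]


def items_at(schedule, time):
    """Return {port: item} for everything entering the array at one step."""
    return {s.port: s.item for s in schedule if s.time == time}


schedule = skewed_streams([[1, 2, 3], [4, 5, 6]])
for t in range(4):
    print(t, items_at(schedule, t))
```

Nothing in the schedule says where the items come from or go to; that is the "source and sink" question the slide raises.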

34 Data streams source and sink: "Not my job!"

35 [Figure: input and output "data streams" around the array.] Distributed memory. ASM: Auto-Sequencing Memory, consisting of RAM and a GAG. ASMs are implemented by distributed on-chip memory.
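
A minimal sketch of the idea, assuming a GAG behaves like a small configurable counter that emits a 2-D scan pattern of addresses. The slide does not describe the actual GAG parameters, so `address_stream`, its arguments, and the `AutoSequencingMemory` class are hypothetical names chosen for illustration. The point is that the memory sequences itself: addresses come from the generator, not from an instruction stream.

```python
def address_stream(base, width, x_count, y_count, x_step=1, y_step=1):
    """Yield linear addresses for a 2-D scan over a row-major memory.

    base:    address of the first item
    width:   number of words per memory row
    x_count, y_count: number of scan positions in each direction
    x_step, y_step:   stride in each direction
    (All parameters are assumptions made for this sketch.)
    """
    for y in range(0, y_count * y_step, y_step):
        for x in range(0, x_count * x_step, x_step):
            yield base + y * width + x


class AutoSequencingMemory:
    """RAM block paired with an address generator (illustrative only)."""

    def __init__(self, ram, generator):
        self.ram = ram
        self.generator = generator

    def stream(self):
        """Produce the data stream: one RAM word per generated address."""
        for address in self.generator:
            yield self.ram[address]


ram = list(range(100, 116))  # 16 words, read as a 4 x 4 block
asm = AutoSequencingMemory(ram, address_stream(base=5, width=4,
                                               x_count=2, y_count=2))
print(list(asm.stream()))  # words at addresses 5, 6, 9, 10
```

Changing only the generator parameters yields a different scan (for example a strided one) without touching the consumer of the stream.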

36 © 2006, reiner@hartenstein.de http://hartenstein.de TU Kaiserslautern 36 How the data are moved DMA, vN move processor [Jack Lipovski, EUROMiCRO, Nice, 1975] Henk Corporaal coins the term “transport-triggered” Application-specific distributed memory [Catthoor et al.] ASM use GAG generic address generator [TU-KL publ.: Tokyo 1989 + NH journal] by the way: GAG st…. by TI [TI patent 1995] MoM: GAG-based storage scheme methodology [Herz*] *) [see Michael Herz et al.: ICECS 2002 (Dubrovnik)]
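The slide names the generic address generator (GAG) as the sequencer that lets an auto-sequencing memory (ASM) deliver a data stream by itself. As a rough illustration only, the following Python sketch models a hypothetical address generator that walks a rectangular scan window over a linearly addressed memory. The function names and the parameters (`base`, `row_stride`, `width`, `height`, `step`) are assumptions made for this sketch; they are not the GAG's actual parameter set, which the slide does not give.

```python
from typing import Iterable, Iterator, List


def scan_addresses(base: int, row_stride: int, width: int, height: int,
                   step: int = 1) -> Iterator[int]:
    """Yield the addresses of a row-by-row scan over a 2-D window.

    Hypothetical model: the parameters are illustrative, not taken from the GAG papers.
    """
    for row in range(height):
        for col in range(0, width, step):
            yield base + row * row_stride + col


def stream_from_memory(memory: List[int], addresses: Iterable[int]) -> Iterator[int]:
    """Model an auto-sequencing memory: emit data words in generated address order."""
    for addr in addresses:
        yield memory[addr]


memory = list(range(100, 116))  # 16 words, viewed as a 4 x 4 array
window = list(scan_addresses(base=5, row_stride=4, width=2, height=2))
data = list(stream_from_memory(memory, window))
```

In this model the address sequence is fixed by a handful of parameters set up beforehand, so the memory can emit its data stream without any per-word instruction.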

37 © 2006, reiner@hartenstein.de http://hartenstein.de TU Kaiserslautern 37 The dual paradigm approach von Neumann paradigm Kress-Kung paradigm Software Engineering Configware Engineering ASM CPU

38 © 2006, reiner@hartenstein.de http://hartenstein.de TU Kaiserslautern 38 Mathematical Synthesis Methods algebraic methods, i.e., linear projections: yield only uniform arrays with linear pipes; only for applications with regular data dependencies

39 © 2006, reiner@hartenstein.de http://hartenstein.de TU Kaiserslautern 39 Coarse-grained reconfigurable arrays are a generalization of the systolic array: discard algebraic synthesis methods and use optimization algorithms instead, for example simulated annealing [Rainer Kress]. The achievement: non-linear and non-uniform pipes, and even wilder pipe structures, are now possible; now reconfigurability really makes sense.
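To make the idea of placement by optimization concrete, here is a minimal sketch of simulated annealing that places the operators of a small dataflow graph onto the cells of a grid and tries to shorten the total wire length. It is not the DPSS or KressArray Xplorer algorithm; the cost function, the move set, the cooling schedule and all names (`wire_length`, `anneal`, the example graph) are assumptions chosen for illustration.

```python
import math
import random
from typing import Dict, List, Tuple

Cell = Tuple[int, int]


def wire_length(placement: Dict[str, Cell], edges: List[Tuple[str, str]]) -> int:
    """Total Manhattan distance over all producer -> consumer connections."""
    return sum(abs(placement[a][0] - placement[b][0]) +
               abs(placement[a][1] - placement[b][1]) for a, b in edges)


def anneal(nodes: List[str], edges: List[Tuple[str, str]], rows: int, cols: int,
           seed: int = 0, start_temp: float = 5.0, cooling: float = 0.995,
           steps: int = 4000) -> Tuple[Dict[str, Cell], int, int]:
    """Return (best placement, initial cost, best cost). Illustrative only."""
    rng = random.Random(seed)
    cells = [(r, c) for r in range(rows) for c in range(cols)]
    if len(nodes) > len(cells):
        raise ValueError("array too small for this graph")
    shuffled = cells[:]
    rng.shuffle(shuffled)
    placement = dict(zip(nodes, shuffled))      # random start, one node per cell
    initial_cost = cost = wire_length(placement, edges)
    best, best_cost = dict(placement), cost
    temp = start_temp
    for _ in range(steps):
        # Move: put a random node on a random cell, swapping if it is occupied.
        node = rng.choice(nodes)
        target = rng.choice(cells)
        occupant = next((n for n, c in placement.items() if c == target), None)
        old = placement[node]
        placement[node] = target
        if occupant is not None and occupant != node:
            placement[occupant] = old
        new_cost = wire_length(placement, edges)
        delta = new_cost - cost
        # Always accept improvements; accept worse moves with Boltzmann probability.
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            cost = new_cost
            if cost < best_cost:
                best, best_cost = dict(placement), cost
        else:
            placement[node] = old               # undo the rejected move
            if occupant is not None and occupant != node:
                placement[occupant] = target
        temp *= cooling
    return best, initial_cost, best_cost


nodes = ["in", "mul", "add", "sub", "out"]
edges = [("in", "mul"), ("in", "add"), ("mul", "sub"), ("add", "sub"), ("sub", "out")]
best, initial_cost, best_cost = anneal(nodes, edges, rows=3, cols=3)
```

Because worse moves are sometimes accepted early on, the search is not tied to regular, uniform structures, which is the property the slide contrasts with algebraic synthesis.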

40 © 2006, reiner@hartenstein.de http://hartenstein.de TU Kaiserslautern 40 array size: 10 x 16 = 160 rDPUs Coarse grain is about computing, not logic rout thru only not used backbus connect SNN filter on KressArray (mainly a pipe network) [Ulrich Nageldinger] Example: mapping onto rDPA by DPSS: based on simulated annealing rDPU, 32 bit no CPU tool: KressArray Xplorer: diss. Ulrich Nageldinger (downloadable)

41 © 2006, reiner@hartenstein.de http://hartenstein.de TU Kaiserslautern 41 Software / Configware Co-Compilation Resource Parameters supporting different platforms Analyzer / Profiler SW code SW compiler paradigm “vN" machine CW Code CW compiler anti machine paradigm Partitioner C language source FW Code simulated annealing [Juergen Becker’s CoDe-X, 1996]

42 © 2006, reiner@hartenstein.de http://hartenstein.de TU Kaiserslautern 42 Software / Configware Co-Compilation Resource Parameters supporting different platforms Analyzer / Profiler SW code SW compiler paradigm “vN" machine CW Code CW compiler anti machine paradigm Partitioner C language source FW Code simulated annealing For thesis see book exhibit rack at library entrance [Juergen Becker’s CoDe-X, 1996]

43 © 2006, reiner@hartenstein.de http://hartenstein.de TU Kaiserslautern 43 Distributed Memory Parallelism Capability ASM array size example: 10 x 16 NN ports interconnect layer ASM backbus connect layers …

44 © 2006, reiner@hartenstein.de http://hartenstein.de TU Kaiserslautern 44 Applications for coarse-grained arrays (on-chip distributed memory for intermediate results) Multi-standard world HDTV receiver with steady I/O data streams at constant speed: Wide variety of multimedia applications Wide variety of real-time applications Many other applications

45 © 2006, reiner@hartenstein.de http://hartenstein.de TU Kaiserslautern 45 The wrong mind set.... “but you can't implement decisions!” (remark of a high-ranking industrial research head – discussion after a talk by Ulrich Nageldinger – RAW Orlando)

46 © 2006, reiner@hartenstein.de http://hartenstein.de TU Kaiserslautern 46 a tiny section of the pipe network S +

47 © 2006, reiner@hartenstein.de http://hartenstein.de TU Kaiserslautern 47 The wrong mind set.... “but you can't implement decisions!” [Figure: section of a very large pipe network: decision; labels +, A, B, R, C, =1, =0] not knowing this solution: symptom of the hardware / software chasm and the configware / software chasm

48 © 2006, reiner@hartenstein.de http://hartenstein.de TU Kaiserslautern 48 introducing hardware description languages (in the mid-seventies): “The decision box becomes a (de)multiplexer” This is so simple: why did it take decades to find out? The wrong mind set – the wrong road map!

49 © 2006, reiner@hartenstein.de http://hartenstein.de TU Kaiserslautern 49 hypothetical branching example to illustrate software-to-configware migration
S = R + (if C then A else B endif);
simple conservative CPU example (C = 1):
step                            memory cycles   nanoseconds
if C then read A:
  read instruction              1               100
  instruction decoding          -               -
  read operand*                 1               100
  operate & reg. transfers      -               -
if not C then read B:
  read instruction              1               100
  instruction decoding          -               -
add & store:
  read instruction              1               100
  instruction decoding          -               -
  operate & reg. transfers      -               -
  store result                  1               100
total                           5               500
*) if no intermediate storage in register file
section of a major pipe network on rDPU (figure labels: S, +, A, B, R, C, =1): clock 200 MHz (5 nanosec), no memory cycles: speed-up factor = 100
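The arithmetic behind the speed-up factor, and the idea that the decision becomes a multiplexer feeding the adder, can be checked with a short Python sketch. The timing constants are the ones given on the slide; the function names `mux` and `pipe_section` are hypothetical, and the assumption that the rDPU section delivers one result per 5 ns clock cycle is the slide's own.

```python
def mux(select: int, if_one: int, if_zero: int) -> int:
    """2-to-1 multiplexer: both inputs are present, `select` picks one of them."""
    return if_one if select == 1 else if_zero


def pipe_section(r: int, a: int, b: int, c: int) -> int:
    """S = R + (if C then A else B endif), as one adder fed by one multiplexer."""
    return r + mux(c, a, b)


# Timing figures taken from the slide.
MEMORY_CYCLE_NS = 100   # one memory cycle on the simple conservative CPU
CPU_MEMORY_CYCLES = 5   # total memory cycles counted for the C = 1 path
RDPU_CLOCK_NS = 5       # 200 MHz clock, no memory cycles on the rDPU

cpu_time_ns = CPU_MEMORY_CYCLES * MEMORY_CYCLE_NS   # 5 * 100 = 500 ns
speed_up = cpu_time_ns // RDPU_CLOCK_NS             # 500 / 5 = 100
```

Both branches are wired permanently; the value of C only selects which operand reaches the adder, so no instruction has to be fetched to take the decision.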

50 © 2006, reiner@hartenstein.de http://hartenstein.de TU Kaiserslautern 50 why the RC paradigm shift is so important Move the stool or the grand piano? by Software by Configware

51 © 2006, reiner@hartenstein.de http://hartenstein.de TU Kaiserslautern 51 the data-stream-based approach has no von Neumann bottleneck … understand only this parallelism solution: the instruction-stream-based approach von Neumann bottlenecks... cannot cope with this one

52 © 2006, reiner@hartenstein.de http://hartenstein.de TU Kaiserslautern 52 What means Reconfigurable Computing? microprogramming? switching the multiplexers? concurrency of 64 or 256 CPUs on a single chip? routing ALU result to a register? it means using the Kress/Kung machine paradigm !

53 © 2006, reiner@hartenstein.de http://hartenstein.de TU Kaiserslautern 53 vN paradigm losing its dominance http://bwrc.eecs.berkeley.edu/Research/RAMP/people.htm RAMP project proposes: Run LINUX on FPGAs

54 © 2006, reiner@hartenstein.de http://hartenstein.de TU Kaiserslautern 54 Cray XD1 vN paradigm losing its dominance Xilinx inside ! Xilinx FPGA

55 © 2006, reiner@hartenstein.de http://hartenstein.de TU Kaiserslautern 55 Recommended Pentium successor Discard most caches Have 64* cores with clever interconnect for: concurrent processes, for multithreading, and, Kung-Kress rDPA array The Desk-top Supercomputer!

56 © 2006, reiner@hartenstein.de http://hartenstein.de TU Kaiserslautern 56 What means Reconfigurable Computing ? The key issue: which is the underlying paradigm? Operation not based on instruction-streams at run time No instruction fetch at run time machine paradigm is data stream-based: Kress-Kung Undergraduate education needs a dual paradigm approach: symbiosis of von Neumann / Kress-Kung
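The distinction between configuration time and run time can be loosely illustrated in Python. In this sketch the chain of operators is fixed once by a hypothetical `configure` function, and afterwards only data items move through it. This is an analogy only: Python itself executes on a von Neumann machine, so the sketch shows the structure of the idea, not its performance, and every name in it is an assumption.

```python
from typing import Callable, Iterable, Iterator, List

Stage = Callable[[int], int]


def configure(stages: List[Stage]) -> Callable[[Iterable[int]], Iterator[int]]:
    """Configuration time: fix the chain of operators once, before any data arrives."""
    fixed = tuple(stages)  # frozen copy; the pipe structure no longer changes

    def run(stream: Iterable[int]) -> Iterator[int]:
        # Run time: only data moves through the already configured pipe.
        for item in stream:
            for stage in fixed:
                item = stage(item)
            yield item

    return run


pipe = configure([lambda x: x * 3, lambda x: x + 1])
results = list(pipe([1, 2, 3, 4]))
```

The same configured `pipe` can then be reused for any number of input streams without being set up again.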

57 © 2006, reiner@hartenstein.de http://hartenstein.de TU Kaiserslautern 57 thank you

58 © 2006, reiner@hartenstein.de http://hartenstein.de TU Kaiserslautern 58 END

60 © 2006, reiner@hartenstein.de http://hartenstein.de TU Kaiserslautern 60 Backup for Discussion:

61 © 2006, reiner@hartenstein.de http://hartenstein.de TU Kaiserslautern 61 Term to be used for “soft hardware”: accelware adaptware adjustware altware alterware arrangeware changeware conformware doughware fabricsware fabrixware fitware flexware formware FPware gateware gateroutware hpcware LUTware matchware modiware morphware® morfware mouldware muxware parware paraware passware pathware patchware performware perfware perware pipeware platformware railware rangeware RCware ressourceware routware routeware routingware RTware shapeware shuntware shuntingware speedware speedupware suiteware switchware switchingware streamware structware transferware transware variware varyware warpware xferware xware send your proposal to: unfortunately “Morphware” is trademarked

62 © 2006, reiner@hartenstein.de http://hartenstein.de TU Kaiserslautern 62 Compilation: Software vs. Configware source program software compiler software code Software Engineering configware code mapper configware compiler scheduler flowware code source “program” Configware Engineering placement & routing data C, FORTRAN, MATLAB

63 © 2006, reiner@hartenstein.de http://hartenstein.de TU Kaiserslautern 63 Co-Compilation software compiler software code Software / Configware Co-Compiler configware code mapper configware compiler scheduler flowware code data C, FORTRAN, MATLAB automatic SW / CW partitioner

64 © 2006, reiner@hartenstein.de http://hartenstein.de TU Kaiserslautern 64 Why use Reconfigurable Computing Exploit spatial parallelism, and.. … high bandwidth and low latency memory access Ride the technology curve avoiding specific silicon Adapt to change: standards, trends, ….. Reduce risk Adapt to application / deployment requirements instead of spec. hardware? instead of software? … and fine-grained parallelism when useful

65 © 2006, reiner@hartenstein.de http://hartenstein.de TU Kaiserslautern 65 Computing Curricula 2004 (2) # CE Configware Engineering missing volume: CE missing

66 © 2006, reiner@hartenstein.de http://hartenstein.de TU Kaiserslautern 66 Computing Curricula 2004 (3) 2.2.1.

67 © 2006, reiner@hartenstein.de http://hartenstein.de TU Kaiserslautern 67 Computing Curricula 2004 (4) 2.2.1. … how it should be CONFIGWARE MORPHWARE morphware and configware added

68 © 2006, reiner@hartenstein.de http://hartenstein.de TU Kaiserslautern 68

