ICECS 2002 IEEE 9th International Conference on Electronics, Circuits and Systems Trends in Reconfigurable Logic and Reconfigurable Computing (invited.


1 ICECS 2002, IEEE 9th International Conference on Electronics, Circuits and Systems, Dubrovnik, Croatia, September 15-18, 2002. Trends in Reconfigurable Logic and Reconfigurable Computing (invited paper). Reiner Hartenstein, University of Kaiserslautern. Viewgraph downloading, see:

2 © 2002, University of Kaiserslautern. >> Outline: The Computer Architecture Crisis; The Impact of Reconfigurable Platforms; The Dichotomy of Models; Parallelism; Conclusions

3 Flagship example: the annual IEEE ISCA conference series. Resignation? Taken over by the opposition: Interconnect Fabrics; vN Parallelism; the dataflow machine is dead; 98.5 % vN (statistics: David Padua, John Hennessy, et al.); Reconfigurable Computing

4 Dead Supercomputer Society [Gordon Bell, keynote at ISCA 2000]: ACRI, Alliant, American Supercomputer, Ametek, Applied Dynamics, Astronautics, BBN, CDC, Convex, Cray Computer, Cray Research, Culler-Harris, Culler Scientific, Cydrome, Dana/Ardent/Stellar/Stardent, DAPP, Denelcor, Elexsi, ETA Systems, Evans and Sutherland Computer, Floating Point Systems, Galaxy YH-1, Goodyear Aerospace MPP, Gould NPL, Guiltech, ICL, Intel Scientific Computers, International Parallel Machines, Kendall Square Research, Key Computer Laboratories, MasPar, Meiko, Multiflow, Myrias, Numerix, Prisma, Tera, Thinking Machines, Saxpy, Scientific Computer Systems (SCS), Soviet Supercomputers, Supertek, Supercomputer Systems, Suprenum, Vitesse Electronics

5 CS: young? dynamic? ... but the von Neumann paradigm is still the dominant doctrine ... microelectronics is ignored (except for the falling cost of computational effort) ... still pushing the basic models from the times of mainframe dinosaurs after >10 technology generations (e.g. 8th: P5 (Pentium); 9th: P6 (Pentium Pro / Pentium II); 10th: Pentium III). The vN microprocessor is a Methuselah, the steam engine of the silicon age. It is time to go silicon-oriented. Computing sciences are ultra-conservative ... to avoid saying: senile. A re-orientation is overdue

6 >> Paradigm Shifts. Outline: The Computer Architecture Crisis; The Impact of Reconfigurable Platforms; The Dichotomy of Models; Parallelism; Conclusions

7 Better to go for reconfigurable platforms. [Dataquest] PLD market forecast: > $7 billion; fastest growing segment of the semiconductor market. IP reuse and silicon reuse. FPGAs are going into every type of application

8 Why coarse grain?

9 Throughput vs. Efficiency. [Figure: MOPS / mW versus feature size (µ) for hardwired, rDPAs (reconfigurable computing)*, FPGAs (reconfigurable logic), instruction set processors, DSP, and standard microprocessor; T. Claasen et al.: ISSCC 1999.] [Figure: area used by application vs. resources needed for reconfigurability: 1 bit CLB compared with wiring by abutment, 32 bit example.] *) R. Hartenstein: ISIS 1997

10 Throughput vs. Flexibility. [Figure: flexibility vs. throughput (MOPS / mW over feature size, µ) for hardwired, rDPAs (reconfigurable computing)*, FPGAs (reconfigurable logic), von Neumann instruction set processors, DSP, and standard microprocessor; T. Claasen et al.: ISSCC 1999.] The anti machine goes far beyond bridging the gap. *) R. Hartenstein: ISIS 1997

11 Terminology
DPU: data path unit
rDPU: reconfigurable DPU
DPA: data path array (DPU array)
rDPA: reconfigurable DPA
RA: reconfigurable array
ISP: instruction set processor
AM: anti machine
AMP: data stream processor*
rAMP: reconfigurable AMP
FPGA: field-programmable gate array
FPL: field-programmable logic
PLD: programmable logic device
CPLD: complex PLD
*) no "dataflow machine"
Digital system platforms (platform category; programming source; machine paradigm):
hardware; (not programmable); none
ISP; software; von Neumann
morphware; configware; FPGA: none
data stream processor (AMP); streamware; anti machine
reconfigurable AMP (rAMP); streamware & configware; anti machine
Categories of morphware (use; granularity (path width); (re)configurable blocks), consensus is near:
reconfigurable logic; fine grain (~1 bit); CLBs
reconfigurable computing; coarse grain (e.g. 32 bits); rDPUs (e.g. ALU-like)
reconfigurable computing; multi granular, by slice bundling; rDPU slices (e.g. 4 bits)

12 Paradigm Shifts: Nick Tredennick's view. Instruction-stream-based computing: algorithms variable, resources fixed. Reconfigurable computing: algorithms variable, resources variable (programmable)

13 Compilation for the (r)DPA of the anti machine: Data Path Array (DPA), streamware

14 Why fine grain? No specific silicon: low production volume (aerospace, automotive, military, industrial controllers, et al.); the spare part problem; design flow

15 Evolution of FPGA and its design flow [à la S. Guccione]. Software flow: User Code, Compiler, Executable. FPGA flow: Schematics / HDL, Netlister, Netlist, Place and Route, Bitstream; later: HLL, Compiler. Platforms: CPU core, FPGA core and Memory core with interfaces (HLL, Compiler); soft CPU; CPU core, FPGA core, Memory core and rDPA core with interfaces; soft rDPA as soon as Giga FPGA is available

16 Some soft CPU core examples (core: architecture; platform)
DSPuva16: 16 bit DSP; Spartan-II
My80: i8080A; FLEX10K30 or EPF6016
gr[...]: 32-bit
gr1040: [...] bit
Nios: 8 bit; Altera Mercury
Nios (50 MHz): 32-bit instr. set; Altera, 22 D-MIPS
Nios: 16-bit instr. set; Altera Mercury
MicroBlaze (125 MHz, 70 D-MIPS): 32 bit standard RISC, 32 reg. by 32 LUT RAM-based reg.; Xilinx, up to 100 on one FPGA
xr16: RISC integer C; SpartanXL
YARD-1A: 16-bit RISC, 2 opd. instr.; old Xilinx FPGA board
Acorn-1: 1 Flex 10K20
Popcorn-1: 8 bit CISC; Altera, Lattice, Xilinx
Reliance-1: [...] bit DSP; Lattice 4 isp30256, 4 isp[...]
REGIS: 8 bits instr. + ext. ROM; 2 XILINX 3020 LCA
uP[...]: CISC, 32 reg.; 200 XC4000E CLBs
ARM7 clone: [...] bit ARM
Leon (25 MHz): SPARC

17 Soft CPUs in academic teaching: UCSC (1990!), Mälardalen University, Chalmers University, Cornell University, Gray Research, Georgia Tech, Hiroshima City Univ., Michigan State, Univ. de Valladolid, Virginia Tech, Washington U. St. Louis, New Mexico Tech, UC Riverside, Tokai University

18 ASIC emulation / Rapid Prototyping: to replace simulation. Quickturn (Cadence), IKOS (Synopsys), Celaro (Mentor). Hours of compilation run: inefficient since netlist-based. ASIC emulators will soon become obsolete through RTR: in-circuit execution debugging instead of emulation. New business model: upgradable morphware is the product. Emulation for solving the spare part problem in many areas

19 The microelectronics spare part problem. The original fab line no longer exists; ICs do not survive storage time. Demand: several decades of availability, e.g. car price: ~25% electronics. [Figure: IC physical life expectancy (years), demand (years of availability) and IC market volume versus feature size (µ); Hartenstein 2002]

20 The microelectronics spare part problem: key problem in many application areas: medical, aerospace, automotive, other transportation, military, industrial equipment controllers, et al. [Figure: IC physical life expectancy (years), demand (years of availability) and IC market volume versus feature size (µ); Hartenstein 2002]

21 >> The Dichotomy of Models. Outline: The Computer Architecture Crisis; The Impact of Reconfigurable Platforms; The Dichotomy of Models; Parallelism; Conclusions

22 Matter & Antimatter. The world of matter, machine paradigm: the atom (+), electron spinning (-). The world of antimatter, machine paradigm: the anti atom (-), positron spinning (+)

23 Matter & Antimatter of Informatics. Machine paradigm (von Neumann): instruction stream spinning around the CPU (+). Anti machine paradigm: data stream spinning around the DPU (-); nothing central!

24 Computing paradigms and methodologies: instruction-stream-based vs. data-stream-based
1946: machine paradigm (von Neumann)
1980: data streams (Kung, Leiserson)
1989: anti machine paradigm introduced
1990: anti machine implementation methodology
1990: rDPU (Rabaey)
1994: anti machine high level programming language
1995: super systolic rDPA (Kress)
1996+: SCCC (LANL), SCORE, ASPRC, Bee (UCB), : configware / software partitioning compiler (Becker)
2000: generator for rDPA with high memory bandwidth
(tutorials and courses available on all this)

25 Nasty Matter: the CPU (+): data path, instruction sequencer, RAM. Address computation overhead; instruction fetch overhead; central von Neumann bottleneck; extremely power hungry and area inefficient; performance problems. Reconfigurable? The wrong machine paradigm

26 Matter vs. Antimatter: CPU vs. DPU. CPU (+): data path with instruction sequencer. DPU (-), the Data Path Unit: data path only, with data streams
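The contrast on this slide can be made concrete with a small Python sketch. It is an illustration only, not code from the talk: the function names (`run_instruction_stream`, `configure_datapath`, `run_data_stream`) and the two-operation toy instruction set are assumptions chosen for the example. The first function fetches and decodes an instruction for every operation at run time, sequenced by a program counter. The second binds the operations into a fixed data path once, so that at run time only a data counter steps through the stream.

```python
def run_instruction_stream(program, data):
    """CPU-style (illustrative): a program counter fetches one instruction per step."""
    results = []
    fetches = 0
    for x in data:
        acc = x
        pc = 0  # program counter
        while pc < len(program):
            op, operand = program[pc]  # instruction fetch at run time
            fetches += 1
            if op == "mul":
                acc *= operand
            elif op == "add":
                acc += operand
            else:
                raise ValueError(f"unknown op: {op}")
            pc += 1
        results.append(acc)
    return results, fetches


def configure_datapath(program):
    """DPU-style (illustrative): bind all operations once, before run time."""
    stages = []
    for op, operand in program:
        if op == "mul":
            stages.append(lambda v, k=operand: v * k)
        elif op == "add":
            stages.append(lambda v, k=operand: v + k)
        else:
            raise ValueError(f"unknown op: {op}")

    def dpu(v):
        # No decoding here: the data path is already fixed.
        for stage in stages:
            v = stage(v)
        return v

    return dpu


def run_data_stream(dpu, data):
    """Only a data counter sequences the stream through the configured DPU."""
    results = []
    dc = 0  # data counter
    while dc < len(data):
        results.append(dpu(data[dc]))
        dc += 1
    return results


if __name__ == "__main__":
    program = [("mul", 3), ("add", 1)]  # toy example: y = 3 * x + 1
    stream = [1, 2, 3]
    print(run_instruction_stream(program, stream))
    print(run_data_stream(configure_datapath(program), stream))
```

Both variants compute the same values; the difference the sketch tries to show is that the instruction-stream version performs `len(program)` fetches for every data item, while the configured data path performs none at run time.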

27 Heavy anti atoms: DPA = DPU array (-). DPA streamware: data streams spinning around
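As a rough illustration of a DPU array with data streaming through it, the following Python sketch models a linear pipeline of DPUs clocked in lockstep. This is a toy model under stated assumptions, not the KressArray or any real rDPA: the name `run_dpa`, the one-register-per-DPU layout, and the use of `None` as an empty pipeline slot are all hypothetical choices for the example.

```python
def run_dpa(stage_ops, stream):
    """Stream data through a linear array of DPUs, one clock tick per loop pass.

    stage_ops: one single-argument function per DPU (its configured operation).
    stream:    input data items; None is reserved to mean "empty slot".
    """
    n = len(stage_ops)
    if n == 0:
        return list(stream)
    regs = [None] * n  # output register of each DPU
    results = []
    # Feed the stream, then n empty slots to flush the pipeline.
    for x in list(stream) + [None] * n:
        # The value held by the last DPU leaves the array on this tick.
        if regs[-1] is not None:
            results.append(regs[-1])
        # Update from the last DPU back to the first, so that every DPU
        # consumes the value its neighbour produced on the previous tick.
        for i in range(n - 1, 0, -1):
            regs[i] = None if regs[i - 1] is None else stage_ops[i](regs[i - 1])
        regs[0] = None if x is None else stage_ops[0](x)
    return results


if __name__ == "__main__":
    # Three DPUs computing ((x + 1) * 2) - 3 on a stream of inputs.
    ops = [lambda v: v + 1, lambda v: v * 2, lambda v: v - 3]
    print(run_dpa(ops, [1, 2, 3]))
```

In this model, once the pipeline is filled, all DPUs work on different stream items during the same tick and one result leaves the array per tick, without any instruction being fetched.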

28 RAM-based CPU (+): data path, instruction sequencer, RAM. + simple machine paradigm + scalability + relocatability + compatibility = secret of success of the software industry

29 Success Factors (property: instruction stream based; data stream based, reconfigurable fine grain (FPGA); data stream based, reconfigurable coarse grain; data stream based, hardwired)
RAM-based: yes; yes; yes; (hardwired)
machine paradigm: yes; no; available
compatibility: yes; limited; feasible
scalability: yes; no; good*; (hardwired)
code relocatability: yes; no; good*; (hardwired)
*) if KressArray used
**) mapping coarse grain onto FPGA: good**, feasible**, available**
Success of the software industry: what is missing for the configware industry: FPGA compatibility, fully scalable FPGA, relocatable configuration code. rDPUs and rDPAs do much better than FPGAs

30 >>> Problems with Concurrency. Outline: The Computer Architecture Crisis; The Impact of Reconfigurable Platforms; The Dichotomy of Models; Parallelism; Conclusions

31 Parallelism by Concurrency: independent instruction streams. Several CPUs (each: data path plus instruction sequencer) connected by bus(es) or switch box. Difficult coordination; massive run time overhead

32 Data-stream-based Parallelism. See my other talk at ICECS 2002, IEEE 9th International Conference on Electronics, Circuits and Systems: Memory Organisation for Datastream-based Reconfigurable Computing (invited paper). Michael Herz, Agilent Technologies; Reiner Hartenstein, University of Kaiserslautern; Miguel Miranda, Erik Brockmeyer, Francky Catthoor, IMEC, Leuven. Dubrovnik, Croatia, September 15-18, 2002

33 >> The Dominance of Embedded Systems. Outline: The Computer Architecture Crisis; The Impact of Reconfigurable Platforms; The Dichotomy of Models; Parallelism; Conclusions

34 Summary of the Anti Machine Paradigm: anti language primitives are almost the same (slightly extended); anti machine execution potential is dramatically more powerful; provides drastically more flexibility; not always replacing von Neumann

35 Conclusions: the anti machine is the way to go for massive parallelism and for data-intensive applications; reconfigurable anti machine for high performance with short product life cycles and unstable standards; reconfigurable for low-cost, low-volume production; Giga FPGAs highly promising - only by a new design flow: configware could repeat the success of the software industry; spare part problem: needs new infrastructures

36 >>> Thank you for your patience

37 >>> END

38 >>> Appendix (for discussion)

39 >> Problems to be solved: Configware Market; FPGA Market; Embedded Systems (Co-Design); Hardwired IP Cores on Board; Run-Time Reconfiguration (RTR); Rapid Prototyping & ASIC Emulation; Evolvable Hardware (EH); Academic Expertise; ASICs dead; Soft CPU; HLLs; Problems to be solved

40 EDA industry shift into CS mentality [Wojciech Maly]: patches instead of engineering; innovation stalled many years ago; 85% of users hate their tools; netlist-based: does not care about efficiency, does not care about transistor density

41 [Jonathan Rose] FPGAs give you: Instant Fabrication (get to market fast; fix 'em quick); Zero NRE Charges (low risk; low cost at good volume)

42 Machine Paradigms ("instruction fetch"); also hardwired implementations*. *) e.g. Bee project, Prof. Brodersen

43 The Crisis of Computing Sciences: Computing Sciences are in a severe crisis. Computing curricula are obsolete because of strictly enforced "procedural-only" blinders. Computer Architecture and related areas have lost leadership in digital system implementation. CS ignores that > 90% of µprocessors are in embedded systems: 10 times more programmers will write embedded applications than computer software by 2010. A promising disruptive therapy is introduced by the new approaches coming with Reconfigurable Computing

44 Programming Language Paradigms: very easy to learn; multiple GAGs

45 Conclusion: all knowledge needed is available: languages; machine paradigm; compilation techniques; anti architectural resources; sequencing methodology (hw & sw); hw / sw partitioning methodology; parallel memory IP core and module generator vendors. Courses / embedded tutorials: DATE, Munich, 2001; ASP-DAC, Yokohama, 2001; SBCCI, Brasilia, 2001. Full day: Univ. Montpellier, 1998; Nokia / Univ. Tampere, Finland, 2002; CNRS, Paris, France, 2002. Keynotes 2001 / 2002; invited talks 2001 / 2002. Anything else needed?

46 Ubiquitous embedded systems: 20 billion µprocessors (2001), > 90% in embedded systems; 10 times more programmers will write embedded applications than computer software by 2010. That's where our graduates will go. Embedded systems means: hardware / software co-design; configware / software co-design; hardware / configware / software co-design

47 The Situation in Computing Sciences: Computing Sciences are in a severe crisis; new fundamentals and R&D directions are inevitable (my mission: getting you involved). All knowledge needed is readily available, even from Computing Sciences; silicon application and EDA provide useful concepts. Reconfigurable Computing has the remedy

48 The edu gap has dramatic consequences: key R&D scenes are drying out or dying because of a lack of qualified researchers; the embedded system design crisis gets worse because of a lack of qualified designers; many innovative products cannot be sold because of a lack of qualified customers; the edu gap is widening dramatically because of a lack of qualified educators

49 Super Pipe Networks. The key is mapping, rather than architecture. *) KressArray [ASP-DAC-1995]

50 It's an alternative culture ... now the area is going mainstream: a rapidly widening audience of non-specialists gets interested ... severe communication gaps due to educational deficits, not only among users: many hardware and EDA experts still ask: isn't it just logic design on a strange platform? It is time to clarify and popularize fundamental aspects, and to explain that it is a fundamentally different culture

51 Jürgen Becker's Co-DE-X Co-Compiler. Input: X-C (the C language extended by MoPL) and resource parameters. Stages: Analyzer / Profiler; Partitioner; loop transformations. Host software path: GNU C compiler, Computer machine paradigm. Configware path: X-C compiler, DPSS, KressArray, Xputer machine paradigm. Supporting different platforms; supporting platform-based design

52 Impact of Makimoto's wave. [Figure: Makimoto's wave alternating between standard and custom: TTL; µproc., memory; LSI, MSI; ASICs, accel's; reconfigurable.] Procedural personalization via RAM-based machine paradigm: Software Industry's Secret of Success. Personalization (CAD) before fabrication. Structural personalization, RAM-based, before run time: repeat the success story by a new machine paradigm! Configware Industry

53 Computer ("von Neumann"): compiler, RAM, instructions, hardwired datapath, sequencer with program counter (state register); tightly coupled by compact instruction code; does not support soft data paths. Computer: the wrong machine paradigm. Xputer (anti machine), the soft machine paradigm: compiler and scheduler, RAM, (multiple) sequencer with data counters, reconfigurable datapath array; "instructions" become a data stream spec; loosely coupled by decision data bits only; reconfigurable, also for hardwired. There are some differences

54 Reconfigurable semiconductor market. Top 4 PLD manufacturers 2000 [Dataquest]: Xilinx 42%, Altera 37%, Lattice 15%, Actel 6%; total: $3.7 billion; forecast: > $7 billion. Configware market: PLD vendors and their alliances provide libraries of "soft IPs"; fastest growing semiconductor market segment. Fine-grained: cLBs, rLBs (configurable logic blocks). Coarse-grained: rDPUs (configurable functional blocks): PACT AG, Munich, Germany; Quicksilver, San Jose

55 Semiconductor Revolutions. Makimoto's Wave: "Mainstream Silicon Application is switching every 10 Years"; "The Programmable System-on-a-Chip is the next wave". Published in 1989. [Figure: wave alternating between standard and custom: TTL; µproc., memory; LSI, MSI; ASICs, accel's; reconfigurable.] Tredennick's Paradigm Shifts: hardwired: algorithm fixed, resources fixed; procedural programming: algorithm variable, resources fixed; structural programming: algorithm variable, resources variable. Machine paradigms shown: vN machine paradigm; anti machine paradigm

56 Impact of Makimoto's wave. [Figure: Makimoto's wave alternating between standard and custom: TTL; µproc., memory; LSI, MSI; ASICs, accel's; reconfigurable.] Procedural personalization via RAM-based machine paradigm: Software Industry's Secret of Success

57 Impact of Makimoto's wave. [Figure: Makimoto's wave alternating between standard and custom: TTL; µproc., memory; LSI, MSI; ASICs, accel's; reconfigurable.] Structural personalization, RAM-based, before run time: repeat the success story by a new machine paradigm! Configware Industry: qualified people are not available

58 Impact of Data-stream-based... [Figure: Makimoto's wave alternating between standard and custom: TTL; µproc., memory; LSI, MSI; ASICs, accel's; reconfigurable.] Structural personalization, hardwired, before fabrication: repeat the success story by a new machine paradigm! Embedded Hardware / Configware Industry: qualified people are not available

59 Rapidly growing CS education gap: our computing curricula are obsolete; the introduction is strictly "procedural-only"; vN-only use of terms like "computer organisation", "computer structures", "computer architecture"; graduates are not prepared for the real world (most applications are for embedded systems: >90% by 2010); our graduates are unable to compete with EE graduates; only a few % of the curricula need to be changed; my mission: getting you involved

60 Binding Time vs. Computing Domain. [Figure: binding time (set-up of communication channels) vs. programming domain. Binding times: at run time (microprocessor, parallel computer); at loading time and at compile time (Reconfigurable Computing); later fabrication step (ASICs); before fabrication (full custom ICs). Programming domains: time domain (procedural); time & space (hybrid); space domain (structural). Also shown: array processor, systolic arrays, supersystolic arrays]

61 Why Coarse Grain instead of FPGA? [Figure: transistors / chip for memory, microprocessor, FPGA physical, FPGA logical, FPGA routed, supersystolic physical and supersystolic logical; sources: Proc. ISSCC, ICSPAT, DAC, DSPWorld.] Drastically smaller configuration memory; much faster loading; reconfigurability overhead reduced by up to ~ 1000; many more benefits

62 What are the differences? vN* computing: computing in time; instruction fetch at run time; procedural programming; instruction scheduling. Reconfigurable Computing: computing in space and time; "instruction" fetch at compile time; structural programming; data scheduling, i.e. data-stream-based. Also hardwired implementations**: "instruction" fetch before fabrication. *) vN stands for "von Neumann". **) e.g. Bee project, Prof. Brodersen

63 Basics of Binding Time: time of "instruction fetch": run time (microprocessor, parallel computer); loading time and compile time (Reconfigurable Computing). "Instruction" generalized: including complex expressions and other datapaths. Strong impact on the machine paradigm!

