Aiming at the Natural Equilibrium of Planet Earth Requires to Reinvent Computing Reiner Hartenstein IEEE fellow 1 (ISCAS-2011)

1 Aiming at the Natural Equilibrium of Planet Earth Requires to Reinvent Computing Reiner Hartenstein IEEE fellow 1 (ISCAS-2011)

2 © 2010, TU Kaiserslautern 2011, (Preface) Without Computers? 2 Lufthansa anno 1960 (Business Information System)

3 © 2010, TU Kaiserslautern (Preface) Rebooting the World. The World Economic Forum: Rebooting the World for the New Realities. Smart graphic interfaces replacing UN, UNESCO and other bureaucracies by direct interaction with the population (mass collaboration)

4 © 2010, TU Kaiserslautern 2011, Preface. Enormous Trouble in Computing: – Long-term Programming Crisis – Keynotes and Panel Discussions booming – Excessive Power Consumption

5 © 2010, TU Kaiserslautern 2011, Outline (1) Energy consumption of Computers Toward Exascale Computing The von Neumann Syndrome We need to Reinvent Computing Conclusions 5

6 © 2010, TU Kaiserslautern Beyond peak oil 6 „6 more Saudi Arabias needed for demand predicted for 2030“ 80% crude oil coming from [Fatih Birol, Chief Economist IEA]. https://www.theoildrum.com/

7 © 2010, TU Kaiserslautern 2011, 7 7 Saudi Arabia © 2011

8 © 2010, TU Kaiserslautern 2011, How many more Saudi Arabias needed? Rio de Janeiro 8

9 © 2010, TU Kaiserslautern 2011, Power Consumption of the Internet. Power consumption by the internet: x30 until 2030 if trends continue. G. Fettweis, E. Zimmermann: ICT Energy Consumption - Trends and Challenges; WPMC'08, Lapland, Finland, 8–11 Sep [Randy Katz: IEEE Spectrum, Febr. 2009]. More than 6 Saudi Arabias! Google Data Center at Columbia River. Soon 8 billion smart wireless devices

10 © 2010, TU Kaiserslautern 2011, More Google Data Centers 10 Google causing 2% electricity consumption worldwide ? [datacenterknowledge.com] ©

11 © 2010, TU Kaiserslautern 2011, Electricity Bill: a Key Issue „The possibility of computer equipment power consumption spiraling out of control could have serious consequences for the overall affordability of computing.” Patent for water-based data centers Cost of a Google data center dominated only by monthly power bill [L. A. Barroso, Google] 11 FERC Google going to sell electricity Already in 2005, Google’s electricity bill higher than value of its equipment.

12 © 2010, TU Kaiserslautern 2011, 12 The World's largest Data Center [datacenterknowledge.com]

13 © 2010, TU Kaiserslautern 2011, Microsoft Data Center at Quincy 13 [datacenterknowledge.com]

14 © 2010, TU Kaiserslautern 2011, About 2000 datacenters world-wide 14 [datacenterknowledge.com]

15 © 2010, TU Kaiserslautern 2011, Outline (2) Energy consumption of Computers Toward Exascale Computing The von Neumann syndrome We need to Reinvent Computing Conclusions 15

16 © 2010, TU Kaiserslautern 2011, Multicore: Break-through or Breakdown? [Chart: relative performance vs. year; x86 parallelism, von-Neumann-only: much slower than Moore‘s law; begin of the multicore era; performance growth needed] „forcing a historic transition to a parallel programming model yet to be invented“ David Callahan, Microsoft distinguished engineer

17 © 2010, TU Kaiserslautern 2011, „Intel has thrown a Hail Mary Pass“ Dave Patterson

18 © 2010, TU Kaiserslautern 2011, 18 John Hennessy „ … I would be panicking …“

19 © 2010, TU Kaiserslautern Exascale affordable? Exa-scale (10^18 computations/second) expected by 2018 [several sources]. Power estimated (single supercomputer): 250 MW – 10 GW (2x NY City: 16 million people)
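The power range quoted on this slide can be turned into an energy budget per operation. The following is a minimal sketch, assuming the machine sustains exactly 10^18 operations per second at the quoted power; the function name is hypothetical and only serves this illustration.

```python
# Energy per operation implied by the slide's figures (illustrative only).
EXA_OPS_PER_SECOND = 1e18  # 10^18 computations/second, as stated on the slide


def joules_per_operation(power_watts, ops_per_second=EXA_OPS_PER_SECOND):
    """Energy spent per operation when the full rate is sustained at this power."""
    return power_watts / ops_per_second


low = joules_per_operation(250e6)   # lower estimate: 250 MW
high = joules_per_operation(10e9)   # upper estimate: 10 GW

print(f"{low * 1e12:.0f} pJ per operation at 250 MW")
print(f"{high * 1e9:.0f} nJ per operation at 10 GW")
```

Under these assumptions the two estimates differ by a factor of 40 in energy per operation.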

20 © 2010, TU Kaiserslautern 20 In my opinion, the largest supercomputers at any time, including the first exaflops, should not be thought of as computers. … [ Andrew Jones, vice president Numerical Algorithms Group] Supercomputers: no Computers?

21 © 2010, TU Kaiserslautern 21 …Their usage patterns and scientific impact are closer to major research facilities such as CERN, ITER, or Hubble. [ Andrew Jones, vice president Numerical Algorithms Group] no reason to solve the power problem ? Supercomputers as Scientific Instruments

22 © 2010, TU Kaiserslautern 2011, CERN (1) 22

23 © 2010, TU Kaiserslautern 2011, CERN (2) 23

24 © 2010, TU Kaiserslautern 2011, Hubble 24

25 © 2010, TU Kaiserslautern 2011, Learning how to go Exascale CACHES st International Workshop on Characterizing Applications for Heterogeneous Exascale Systems June 4th, 2011, held in conjunction with ICS' th International Conference on Supercomputing May 31 - June 4, 2011, Loews Ventana Canyon Resort, Tucson, Arizona 25

26 © 2010, TU Kaiserslautern 2011, Outline (3) Energy consumption of Computers Toward Exascale Computing The von Neumann syndrome We need to Reinvent Computing Conclusions 26

27 © 2010, TU Kaiserslautern 2011, Potential of RC: Reconfigurable Computing offers an overwhelming reduction of electricity consumption as well as massive speed-up factors … explained by the von Neumann Syndrome

28 © 2010, TU Kaiserslautern 2011, Speed-up factors by avoiding the von Neumann paradigm are not new. [Chart: speedup factor per application; doubled every 4 months; © 2011] DSP and wireless: FFT 100, Reed-Solomon Decoding 2400, Viterbi Decoding, MAC. Bioinformatics: molecular dynamics simulation 88, BLAST 52, protein identification 40, Smith-Waterman pattern matching 288. Astrophysics: GRAPE 20. Image processing, Pattern matching, Multimedia: SPIHT wavelet-based image compression 457, real-time face detection 6000, video-rate stereo vision 900, pattern recognition 730. Others: 3000 (CT imaging), 8723 (DNA seq.), crypto DES breaking (?). Further chart labels: >15000; PISA project

29 © 2010, TU Kaiserslautern 2011, Energy saving factors: ~10% of speedup. Power save factors obtained. [Chart: speedup factor per application; doubles every 4 months; © 2011] DSP and wireless: FFT 100, Reed-Solomon Decoding 2400, Viterbi Decoding, MAC. Bioinformatics: molecular dynamics simulation 88, BLAST 52, protein identification 40, Smith-Waterman pattern matching 288. Astrophysics: GRAPE 20. Image processing, Pattern matching, Multimedia: SPIHT wavelet-based image compression 457, real-time face detection 6000, video-rate stereo vision 900, pattern recognition 730. Others: 3000 (CT imaging), 8723 (DNA seq.), crypto DES breaking

30 © 2010, TU Kaiserslautern 2011, RC*: Demonstrating the intensive Impact. SGI Altix 4700 with RC 100 RASC compared to Beowulf cluster [Tarek El-Ghazawi et al.: IEEE COMPUTER, Febr. 2008]. Table columns: Application; Speed-up factor; Savings (Power, Cost, Size). Rows: DNA and Protein sequencing; DES breaking. Surviving entries: 12 %, 9 %. Much less equipment needed, massively saving energy. Taxonomy of HPRC design flows. *) RC = Reconfigurable Computing

31 © 2010, TU Kaiserslautern Drastically less Equipment needed. For instance: a hangar full of racks replaced by a single rack (or ½ rack) without air conditioning

32 © 2010, TU Kaiserslautern 2011, The Reconfigurability Paradox: Routing congestion; Lower clock speed; Reconfigurability overhead; Wiring overhead. Orders of magnitude better performance by a massively worse technology?

33 © 2010, TU Kaiserslautern More power for creating foam than to accelerate the vessel? Because of The von Neumann Syndrome © 2011

34 © 2010, TU Kaiserslautern 2011, von Neumann Syndrome. Lambert M. Surhone, Mariam T. Tennoe, Susan F. Henssonow (ed.): Von Neumann Syndrome; Betascript Publishing 2011

35 © 2010, TU Kaiserslautern 2011, von Neumann Model Critics. “The von Neumann Syndrome”: [C.V. “RAM” Ramamoorthy 2007; UC Berkeley]. Nathan’s Law: Software is a gas. It expands to fill all its containers... (Nathan Myhrvold, Microsoft Ex-CTO) „even fills the internet“ and the clouds. Table (year; system; MLOC in millions): 2001 Windows XP; MAC OS X; SAP NetWeaver 238. Critique of von Neumann is not new: E. Dijkstra 1968; J. Backus 1978; Arvind, 1983; Peter G. Neumann; L. Savain. Software Disaster Reports: N. N. 1995: THE STANDISH GROUP REPORT; Robert N. Charette 2005: Why Software Fails; IEEE Spectrum; Anthony Berglas 2008: Why it is Important that Software Projects Fail. Further label: incompetent programmers

36 © 2010, TU Kaiserslautern 2011, All hardware but ALU is overhead: x20 inefficiency. x20 inefficiency: just one of several overhead layers [R. Hameed et al.: Understanding Sources of Inefficiency in General-Purpose Chips; 37th ISCA, June 19-23, 2010, St. Malo, France] “GP Processors are inefficient” (data cache)

37 © 2010, TU Kaiserslautern 2011, „The Memory Wall“ (coined by Sally McKee). Patterson’s Law: Processor-Memory Performance Gap grows 50% / year (processor 60%/year, memory 7%/year); 2008: >1000. The overwhelming problem is data moving complexity, not processor performance [Dr. Djordje Maric* (ETH Zurich)], and complex multi-MLOC instruction movement
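The 50%/year gap follows from compounding the two growth rates named on the slide. The sketch below assumes both rates stay constant every year, which is a simplification; the function names are hypothetical.

```python
import math

# Yearly improvement rates taken from the slide.
CPU_GROWTH = 0.60  # processor performance: +60% per year
MEM_GROWTH = 0.07  # memory performance: +7% per year


def gap_growth_per_year(cpu=CPU_GROWTH, mem=MEM_GROWTH):
    """Factor by which the processor-memory gap widens each year."""
    return (1 + cpu) / (1 + mem)


def years_to_reach(gap_factor, cpu=CPU_GROWTH, mem=MEM_GROWTH):
    """Years of constant compounding until the gap reaches gap_factor."""
    return math.log(gap_factor) / math.log(gap_growth_per_year(cpu, mem))


print(f"gap widens by {100 * (gap_growth_per_year() - 1):.1f}% per year")
print(f"a gap of 1000 takes about {years_to_reach(1000):.1f} years")
```

With these rates the ratio 1.60 / 1.07 is just under 1.5, and a gap above 1000 needs a bit more than 17 years of compounding.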

38 © 2010, TU Kaiserslautern 2011, Through-Silicon-Via (TSV) 38 reduce power consumption by 75% [Wally Rh., Micro News 2/28/2011 ] SIP multiple dice PoP Package on Package PiP Package in Package TSV Through silicon via reducing the memory wall?

39 © 2010, TU Kaiserslautern Massive Overhead Phenomena. Overhead on a von Neumann machine, each caused by the instruction stream: instruction fetch; state address computation; data address computation; data meet PU + other overh.; i/o to/from off-chip RAM; Inter PU communication; message passing overhead; transactional memory overh.; multithreading overhead etc. Chart labels: proportionate to the number of processors; overproportionate to the number of processors

40 © 2010, TU Kaiserslautern von Neumann overhead vs. Reconfigurable Computing overhead. Table (overhead: von Neumann machine / datastream machine): instruction fetch: instruction stream / none*; state address computation: instruction stream / none*; data address computation: instruction stream / none*; data meet PU + other overh.: instruction stream / none*; i/o to/from off-chip RAM: instruction stream / none*; Inter PU communication: instruction stream / none*; message passing overhead: instruction stream / none*; transactional memory overh.: instruction stream / none*; multithreading overhead etc.: instruction stream / none*. *) configured before run time: no instruction fetch at run time

41 © 2010, TU Kaiserslautern 2011, Outline (4) Energy consumption of Computers Toward Exascale Computing The von Neumann Syndrome We need to Reinvent Computing Conclusions 41

42 © 2010, TU Kaiserslautern 2011, Putting Old Ideas Into Practice 42 Software Engineering SEN vol. 24 no. 3, May 1999 The biggest payoff will come from putting old ideas into practice (POIIP) and teaching people how to apply them properly. [David Parnas]

43 © 2010, TU Kaiserslautern 2011, Mike Flynn‘s Taxonomy 43 M. J. Flynn: “ Very high-speed computing systems”; Proc. IEEE, Vol. 54, No. 12, pp. 1901–1909, Dec., 1966.

44 © 2010, TU Kaiserslautern 2011, Diana‘s extended Taxonomy (I: instruction stream, D: data stream). 4 x SISD: rSI: I can be reconfigured at run time, e.g. RISP; rSD: can exchange data memory or datapath; rSIrSD: both possible. 4 x SIMD: rSI: I can be reconfigured at run time, e.g. RISP; rMD: SIMD processors can exchange their data memories or reconfigure their datapaths; rSIrMD: can reconfigure both, D and I, at run time. 4 x MIMD: rMI: MPSoCs w. reconfigurable I; rMD: MPSoCs w. reconfigurable D; rMIrMD: supports both. [D. Göhringer, M. Hübner, T. Perschke, J. Becker: “A Taxonomy of Reconfigurable Single/Multi-Processor Systems-on-Chip”; International Journal of Reconfigurable Computing, Hindawi, Special Issue: Selected Papers from ReCoSoC 2008] © 2011

45 © 2010, TU Kaiserslautern „But you can‘t implement decisions!“ POIIP: Software to Configware Migration. S = R + (if C then A else B endif); section of a very large pipe network: the decision box (C: 0 / 1) turns into a (de)multiplexer with data inputs A and B and control input C, feeding an adder together with R. [C. G. Bell et al: IEEE Trans-C21/5, May 1972; W. A. Clark: 1967 SJCC, AFIPS Conf. Proc.]
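The migration of a decision box into a multiplexer can be modelled in software. The sketch below is only an illustration of the slide's expression S = R + (if C then A else B endif); the function names are hypothetical, and the arithmetic selection stands in for a hardware multiplexer in which both data inputs are always present and C merely routes one of them.

```python
def s_software(r, c, a, b):
    """Control-flow version: the decision is a branch taken at run time."""
    if c:
        selected = a
    else:
        selected = b
    return r + selected


def mux(select, in_one, in_zero):
    """2-to-1 multiplexer model: select (0 or 1) routes one of two inputs."""
    sel = int(bool(select))
    return sel * in_one + (1 - sel) * in_zero


def s_configware(r, c, a, b):
    """Datapath version: a multiplexer feeds the adder, no branch in the flow."""
    return r + mux(c, a, b)


for c in (0, 1):
    print(c, s_software(10, c, 3, 4), s_configware(10, c, 3, 4))
```

Both versions compute the same value for every input; the difference is that the second one describes a fixed structure through which data flows.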

46 © 2010, TU Kaiserslautern POIIP: Loop to Pipe Mapping. Loop body becomes an rDPU ((reconfigurable) DataPath Unit); loop becomes an rDPU pipeline; complex loop body becomes a complex rDPU or a pipe network inside the rDPU; nested loops become a complex pipe network. Example pipe network (Source: MIT StreamIT): FMDemod, Split, LPF 1, LPF 2, LPF 3, HPF 1, HPF 2, HPF 3, Gather, Adder, Speaker. Further labels: CPU, Memory, transport-triggered

47 © 2010, TU Kaiserslautern POIIP: Loop to Pipe Mapping on „platform FPGAs“. Loop body becomes an rDPU ((reconfigurable) DataPath Unit); loop becomes an rDPU pipeline; complex loop body becomes a complex rDPU or a pipe network inside the rDPU; nested loops become a complex pipe network. Example pipe network (Source: MIT StreamIT): FMDemod, Split, LPF 1, LPF 2, LPF 3, HPF 1, HPF 2, HPF 3, Gather, Adder, Speaker. Further labels: CPU, Memory, transport-triggered
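The idea of mapping a loop onto a pipe can be imitated in software with chained generators. This is a sketch under stated assumptions: the two stages (`double`, `add_one`) are hypothetical stand-ins for the operations of a loop body, and a Python generator chain only models the structure of a pipeline, not the parallel execution of rDPUs.

```python
def loop_version(samples):
    """Time domain: one loop, the whole body runs once per sample."""
    out = []
    for x in samples:
        y = x * 2      # first operation of the loop body
        z = y + 1      # second operation of the loop body
        out.append(z)
    return out


def double(stream):
    """Pipe stage modelling the first operation of the loop body."""
    for x in stream:
        yield x * 2


def add_one(stream):
    """Pipe stage modelling the second operation of the loop body."""
    for y in stream:
        yield y + 1


def pipe_version(samples):
    """Space domain model: data streams through a fixed chain of stages."""
    return list(add_one(double(iter(samples))))


print(loop_version([1, 2, 3]))
print(pipe_version([1, 2, 3]))
```

Each statement of the loop body becomes its own stage, and the loop itself disappears into the stream that flows through the chain.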

48 © 2010, TU Kaiserslautern Imperative Language Twins: Software Languages (von Neumann Languages) and Flowware Languages. [COMP-EURO ’89] Anti-machine: MoPL: [FPL‘94, Prague]. Slide labels: very easy to learn; multiple GAGs; much more powerful; much more simple; more simple parallelism solution

49 © 2010, TU Kaiserslautern 2011, A Heliocentric CS Model needed. PE (Program Engineering): The Generalization of Software Engineering. SE (Software Engineering): instruction streams; CPU. FE (Flowware Engineering): data streams*; asM (auto-sequencing Memory). CE (Configware Engineering): structures; pipe network model; rDPU (reconfigurable DataPath Unit); rDPA (reconfigurable DataPath Array). *) do not confuse with „dataflow“!

50 © 2010, TU Kaiserslautern A Clean Terminology, please. Table (program source: compilation result): Software: instruction streams; Flowware: data streams; Configware: datapath structures configured

51 © 2010, TU Kaiserslautern 2011, Outline (5) Energy consumption of Computers Toward Exascale Computing The von Neumann Syndrome We need to Reinvent Computing Conclusions 51

52 © 2010, TU Kaiserslautern 2011, Absurdly incomprehensible abstractions are the problem in „standard“ languages. We need model-based abstractions at algorithmic level & Locality Awareness needed!! [For architecture design & debug] Concurrency models can operate at component architecture level rather than programming languages. [E. A. Lee] [E. A. Lee: Are new languages necessary for multicore? 2007] [E. A. Lee. The problem with threads. Computer, 2006.]

53 © 2010, TU Kaiserslautern Higher Abstraction Levels. Nick Tredennick: Efforts to extend standards-based, serial programming languages with features to describe parallel constructs are likely to fail. What’s more likely to succeed are languages that raise the level of abstraction in algorithm description. Mauricio Ayala-Rincón: Term Rewriting Systems (TRS) may raise the abstraction level up to math formulae. TRS: powerful for better language design and design space exploration

54 © 2010, TU Kaiserslautern Conclusions: Twin Paradigm skills & basic hardware knowledge are essential qualifications for programmers. We urgently need a fundamental CS Education and Research Revolution for dual-rail-thinking. Since we have to re-write software anyway, we should do it twin-paradigm. We need a tool flow & education efforts supporting a twin-paradigm approach and locality awareness

55 © 2010, TU Kaiserslautern We need „une Levée en Masse“ (a mass mobilization)

56 © 2010, TU Kaiserslautern 2011, 56 Don‘t worry ! Thank You very much ! too many panels and keynotes?

57 © 2010, TU Kaiserslautern 2011, END 57

58 © 2010, TU Kaiserslautern time to space mapping: time domain (procedure domain) to space domain (structure domain). Program loop: n time steps, 1 CPU (time algorithm); pipeline: 1 time step, n DPUs (space algorithm). Bubble Sort: n x k time steps, 1 „conditional swap“ unit (time algorithm); Shuffle Sort: k time steps, n „conditional swap“ units (space/time algorithm). [Diagram: conditional swap units with inputs x, y]

59 © 2010, TU Kaiserslautern Example: „Shuffle Sort“. Architecture instead of synchro: direct time to space mapping (an array of conditional swap units) causes accessing conflicts; modification: with shuffle function. Better Architecture instead of complex synchronisation: half the number of Blocks + up and down of data (shuffle function) – no von Neumann-syndrome!
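The contrast between one conditional swap unit used over many time steps and many units used over few time steps can be sketched as follows. Note the assumption: the space-domain variant below is the well-known odd-even transposition sort, used here as a stand-in; it is not claimed to be the „Shuffle Sort“ architecture of the slides, and the step counts are those of this sketch only.

```python
def conditional_swap(x, y):
    """One swap unit: returns the pair in ascending order."""
    return (x, y) if x <= y else (y, x)


def bubble_sort_time_domain(values):
    """Single swap unit reused sequentially; every use costs one time step."""
    data = list(values)
    n = len(data)
    steps = 0
    for _ in range(n - 1):
        for i in range(n - 1):
            data[i], data[i + 1] = conditional_swap(data[i], data[i + 1])
            steps += 1
    return data, steps


def odd_even_transposition_space_domain(values):
    """Row of swap units; all units of one phase count as a single time step."""
    data = list(values)
    n = len(data)
    steps = 0
    for phase in range(n):
        # Pairs within one phase are disjoint, so hardware could swap them at once.
        for i in range(phase % 2, n - 1, 2):
            data[i], data[i + 1] = conditional_swap(data[i], data[i + 1])
        steps += 1
    return data, steps


print(bubble_sort_time_domain([5, 2, 9, 1, 7, 3]))
print(odd_even_transposition_space_domain([5, 2, 9, 1, 7, 3]))
```

In this model the sequential version needs (n - 1)^2 time steps with one swap unit, while the parallel version needs n time steps with up to n/2 swap units active per step.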

60 © 2010, TU Kaiserslautern Understanding Complex Hetero Systems 60 Layers of Abstraction and Automatic Parallelization hide critical sources of, and limits to efficient parallel execution Efficient Distribution of Tasks being memory limited Internode Communications reduces Computational Efficiency We must change how programmers think essential: awareness of locality, Focusing on memory mapping issues and transfer modes to detect overhead and bottlenecks Understanding streams through complex fabrics needed [Ed Lee]

61 © 2010, TU Kaiserslautern 2011, Vertical Disintegration X courtesy Manfred Glesner

62 © 2010, TU Kaiserslautern 2011, Market Complexity 62 Source: Gartner

63 © 2010, TU Kaiserslautern 2011, Taxonomy of Twin Paradigm Programming Flows (HPRC) 63 E. El-Araby et al.: Comparative Analysis of High Level Programming for Reconfigurable Computers: Methodology And Empirical Study; Proc. SPL2007, Mar del Plata, Argentina, Febr [courtesy Richard Newton] „The nroff of EDA“ [ R. N.]

64 © 2010, TU Kaiserslautern 2011, HLL programming models 64

65 © 2010, TU Kaiserslautern Some hardware description languages 65 DeFacto Galadriel & Nenya MATCH

66 © 2010, TU Kaiserslautern Some programming languages 66

67 © 2010, TU Kaiserslautern Some languages for parallelism 67

68 © 2010, TU Kaiserslautern More Languages Some functional languages Some datastream languages 68

69 © 2010, TU Kaiserslautern Why Computers are important 69 R. Rajkumar, I. Lee, L. Sha, J. Stankovic: Cyber-Physical Systems: The Next Computing Revolution; DAC 2010

70 © 2010, TU Kaiserslautern 2011, Science alone ? see the claims by Andrew Jones, … 70

71 © 2010, TU Kaiserslautern 2011, Mobile Communication. Table rows: Worldwide radio base station sites* (millions); Average power consumption per site (kW); Total power consumption of all sites (TW); Total global RAN energy consumption (TWh); total # of subscriptions expected (billions) 69; Broadband subscriptions expected (billions) 2; Video streams (%) 6690; Share of mobile data in total mobile traffic (%). *) all standards. [A. Fehske, J. Malmodin, G. Biczók, G. Fettweis: The Global Footprint of Mobile Communications – The Ecological and Economic Perspective; IEEE Communications Magazine, Aug 2011] Data transmission speed grows by a factor of ten every five years (cellular, local + personal area networks). Technologies to reduce energy consumption are a key enabler

72 © 2010, TU Kaiserslautern 2011, Undersea Cable. Google: 9,620 km submarine cable Japan-US; 1st use Febr 21, 2011. Five fiber pairs deliver up to 4.8 Terabits per second (Tbps); multiple (e.g. 5) pairs of fibers: each pair has one fiber in each direction. Wavelength-division multiplexing dramatically increases fiber capacity. >100 kilometers between repeaters; repeater laser power consumption <25 W; <1000 repeaters: <25 kW. Power consumption of fabrication and cable layer ships much higher
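The upper bound on repeater power quoted on this slide is a simple product, and the per-pair capacity a simple quotient. A minimal sketch, assuming the slide's bounds (fewer than 1000 repeaters, less than 25 W each) are taken at their limits and that capacity is shared evenly across the five fiber pairs; the function names are hypothetical.

```python
def total_repeater_power_kw(repeaters, watts_each):
    """Total repeater power in kW for a given repeater count and per-unit power."""
    return repeaters * watts_each / 1000


def capacity_per_pair_tbps(total_tbps, fiber_pairs):
    """Capacity per fiber pair, assuming an even split across pairs."""
    return total_tbps / fiber_pairs


# Bounds from the slide: <1000 repeaters at <25 W each.
print(total_repeater_power_kw(1000, 25), "kW upper bound for all repeaters")
print(f"{capacity_per_pair_tbps(4.8, 5):.2f} Tbps per fiber pair")
```

Even at its upper bound, the whole repeater chain draws a small fraction of the power figures quoted for data centers earlier in the deck.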

