Published by Mateo Clerkin
Modified about 1 year ago
Aiming at the Natural Equilibrium of Planet Earth Requires to Reinvent Computing. Reiner Hartenstein, IEEE Fellow (ISCAS 2011)
© 2010, TU Kaiserslautern 2011, (Preface) Without Computers? Lufthansa anno 1960 (Business Information System)
(Preface) Rebooting the World. The World Economic Forum: „Rebooting the World for the New Realities“: smart graphic interfaces replacing UN, UNESCO and other bureaucracies by direct interaction with the population (mass collaboration)
Preface: Enormous Trouble in Computing: – Long-term Programming Crisis – Keynotes and Panel Discussions booming – Excessive Power Consumption
Outline (1): Energy consumption of Computers; Toward Exascale Computing; The von Neumann Syndrome; We need to Reinvent Computing; Conclusions
Beyond peak oil: „6 more Saudi Arabias needed for demand predicted for 2030“ 80% crude oil coming from [Fatih Birol, Chief Economist IEA]. https://www.theoildrum.com/
Saudi Arabia
How many more Saudi Arabias needed? Rio de Janeiro
Power Consumption of the Internet: Power consumption by the internet: x30 until 2030 if trends continue [G. Fettweis, E. Zimmermann: ICT Energy Consumption - Trends and Challenges; WPMC'08, Lapland, Finland, 8–11 Sep] [Randy Katz: IEEE Spectrum, Febr. 2009]. More than 6 Saudi Arabias! Google Data Center at Columbia River. Soon 8 billion smart wireless devices.
More Google Data Centers: Google causing 2% of electricity consumption worldwide? [datacenterknowledge.com]
Electricity Bill: a Key Issue. „The possibility of computer equipment power consumption spiraling out of control could have serious consequences for the overall affordability of computing.“ Patent for water-based data centers. Cost of a Google data center dominated only by monthly power bill [L. A. Barroso, Google]. FERC: Google going to sell electricity. Already in 2005, Google’s electricity bill was higher than the value of its equipment.
The World's largest Data Center [datacenterknowledge.com]
Microsoft Data Center at Quincy [datacenterknowledge.com]
About 2000 datacenters world-wide [datacenterknowledge.com]
Outline (2): Energy consumption of Computers; Toward Exascale Computing; The von Neumann Syndrome; We need to Reinvent Computing; Conclusions
Multicore: Break-through or Breakdown? [Figure: relative performance vs. year; labels: x86, parallelism, von-Neumann-only, much slower than Moore‘s law, begin of the multicore era, performance growth needed] „forcing a historic transition to a parallel programming model yet to be invented“ (David Callahan, Microsoft distinguished engineer)
„Intel has thrown a Hail Mary Pass“ (Dave Patterson)
John Hennessy: „… I would be panicking …“
Exa-scale (10^18 computations/second) expected by 2018 [several sources]. Power estimated (single supercomputer): 250 MW – 10 GW (2x NY City: 16 million people). Exascale affordable?
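To make the exascale power range more tangible, the two estimates can be converted into an energy budget per computation. The sketch below is only illustrative arithmetic: it assumes a sustained rate of exactly 10^18 operations per second, and the function name is hypothetical. Under that assumption, 250 MW corresponds to about 250 pJ per operation and 10 GW to about 10 nJ per operation.

```python
# Assumption: "computations" are single operations, sustained at 10^18 per second.
EXA_OPS_PER_SECOND = 1e18


def joules_per_operation(power_watts: float,
                         ops_per_second: float = EXA_OPS_PER_SECOND) -> float:
    """Energy spent per operation for a machine drawing power_watts."""
    return power_watts / ops_per_second


low_estimate = joules_per_operation(250e6)   # 250 MW, lower bound from the slide
high_estimate = joules_per_operation(10e9)   # 10 GW, upper bound from the slide

print(f"250 MW -> {low_estimate * 1e12:.0f} pJ per operation")
print(f"10 GW  -> {high_estimate * 1e9:.0f} nJ per operation")
```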
Supercomputers: no Computers? „In my opinion, the largest supercomputers at any time, including the first exaflops, should not be thought of as computers. …“ [Andrew Jones, vice president, Numerical Algorithms Group]
Supercomputers as Scientific Instruments: „… Their usage patterns and scientific impact are closer to major research facilities such as CERN, ITER, or Hubble.“ [Andrew Jones, vice president, Numerical Algorithms Group] No reason to solve the power problem?
CERN (1)
CERN (2)
Hubble
Learning how to go Exascale: CACHES, 1st International Workshop on Characterizing Applications for Heterogeneous Exascale Systems, June 4th, 2011, held in conjunction with ICS'11, International Conference on Supercomputing, May 31 - June 4, 2011, Loews Ventana Canyon Resort, Tucson, Arizona
Outline (3): Energy consumption of Computers; Toward Exascale Computing; The von Neumann Syndrome; We need to Reinvent Computing; Conclusions
Potential of RC: Reconfigurable Computing offers an overwhelming reduction of electricity consumption as well as massive speed-up factors … explained by the von Neumann Syndrome
Speed-up factors are not new (obtained by avoiding the von Neumann paradigm). [Figure: speedup factor by application, up to 1,000,000; speedup doubled ev. 4 months]
DSP and wireless: FFT 100; Reed-Solomon Decoding 2400; Viterbi Decoding; MAC
Bioinformatics: molecular dynamics simulation 88; BLAST 52; protein identification 40; Smith-Waterman pattern matching 288
Astrophysics: GRAPE 20
Image processing, Pattern matching, Multimedia: SPIHT wavelet-based image compression 457; real-time face detection 6000; video-rate stereo vision 900; pattern recognition 730
Others: CT imaging 3000; DNA seq. 8723; crypto: DES breaking; PISA project >15000
Energy saving factors: ~10% of speedup. [Figure: the same speedup-factor chart, annotated with the power save factors obtained]
RC*: Demonstrating the intensive Impact. SGI Altix 4700 with RC 100 RASC compared to Beowulf cluster [Tarek El-Ghazawi et al.: IEEE COMPUTER, Febr. 2008]. [Table: Application | Speed-up factor | Savings: Power, Cost, Size; rows: DNA and Protein sequencing, DES breaking] Much less equipment needed, massively saving energy. Taxonomy of HPRC design flows: 12 %, 9 %. *) RC = Reconfigurable Computing
Drastically less Equipment needed. For instance: a hangar full of racks replaced by a single rack without air conditioning, or ½ rack
The Reconfigurability Paradox: Routing congestion; Lower clock speed; Reconfigurability overhead; Wiring overhead. Orders of magnitude better performance by a massively worse technology?
More power for creating foam than to accelerate the vessel? Because of the von Neumann Syndrome
von Neumann Syndrome: Lambert M. Surhone, Mariam T. Tennoe, Susan F. Henssonow (ed.): Von Neumann Syndrome; ßetascript publishing 2011
von Neumann Model Critics. Critique of von Neumann is not new: E. Dijkstra 1968; J. Backus 1978; Arvind, 1983; Peter G. Neumann; L. Savain. “The von Neumann Syndrome” [C.V. “RAM” Ramamoorthy 2007; UC Berkeley]. Nathan’s Law: Software is a gas. It expands to fill all its containers... (Nathan Myhrvold, Microsoft Ex-CTO) „even fills the internet“ and the clouds. [Table: year | system | MLOC (millions); rows: 2001 Windows XP; MAC OS X; SAP NetWeaver 238] Incompetent programmers. Software Disaster Reports: N. N. 1995: THE STANDISH GROUP REPORT; Robert N. Charette 2005: Why Software Fails; IEEE Spectrum; Anthony Berglas 2008: Why it is Important that Software Projects Fail
All hardware but ALU is overhead: x20 inefficiency. “GP Processors are inefficient” (data cache). x20 inefficiency: just one of several overhead layers [R. Hameed et al.: Understanding Sources of Inefficiency in General-Purpose Chips; 37th ISCA, June 19-23, 2010, St. Malo, France]
„The Memory Wall“ (coined by Sally McKee). Patterson’s Law: Processor-Memory Performance Gap (grows 50% / year): processor performance 60%/yr. vs. memory 7%/year; gap >1000 in 2008. The overwhelming problem is data moving complexity, not processor performance [Dr. Djordje Maric (ETH Zurich)], and complex multi-MLOC instruction movement
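The "grows 50% / year" figure follows from the two growth rates on the slide. The sketch below reproduces that arithmetic, assuming (as the usual reading of this chart) that 60%/yr refers to processor performance and 7%/yr to memory performance, and that both compound annually. The function names are hypothetical. Under these assumptions the gap widens by a factor of roughly 1.5 per year, so a factor of 1000 is reached after about 17 years of compounding.

```python
import math

# Assumed reading of the slide: annual compound growth rates.
PROCESSOR_GROWTH = 0.60  # processor performance: +60% per year
MEMORY_GROWTH = 0.07     # memory performance: +7% per year


def gap_growth_per_year(processor_growth: float, memory_growth: float) -> float:
    """Factor by which the processor-memory gap widens each year."""
    return (1.0 + processor_growth) / (1.0 + memory_growth)


def years_to_reach(gap_factor: float, yearly_ratio: float) -> float:
    """Years of compounding needed until the gap reaches gap_factor."""
    return math.log(gap_factor) / math.log(yearly_ratio)


ratio = gap_growth_per_year(PROCESSOR_GROWTH, MEMORY_GROWTH)
years = years_to_reach(1000.0, ratio)

print(f"gap grows by {100 * (ratio - 1):.1f}% per year")
print(f"a gap of 1000x needs about {years:.1f} years of compounding")
```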
Through-Silicon-Via (TSV): reducing the memory wall? Reduce power consumption by 75% [Wally Rh., Micro News 2/28/2011]. SIP: multiple dice; PoP: Package on Package; PiP: Package in Package; TSV: Through silicon via
Massive Overhead Phenomena (proportionate or overproportionate to the number of processors):
overhead | von Neumann machine
instruction fetch | instruction stream
state address computation | instruction stream
data address computation | instruction stream
data meet PU + other overh. | instruction stream
i/o to/from off-chip RAM | instruction stream
Inter PU communication | instruction stream
message passing overhead | instruction stream
transactional memory overh. | instruction stream
multithreading overhead etc. | instruction stream
von Neumann overhead vs. Reconfigurable Computing:
overhead | von Neumann machine | datastream machine
instruction fetch | instruction stream | none*
state address computation | instruction stream | none*
data address computation | instruction stream | none*
data meet PU + other overh. | instruction stream | none*
i/o to/from off-chip RAM | instruction stream | none*
Inter PU communication | instruction stream | none*
message passing overhead | instruction stream | none*
transactional memory overh. | instruction stream | none*
multithreading overhead etc. | instruction stream | none*
*) configured before run time: no instruction fetch at run time
Outline (4): Energy consumption of Computers; Toward Exascale Computing; The von Neumann Syndrome; We need to Reinvent Computing; Conclusions
Putting Old Ideas Into Practice: „The biggest payoff will come from putting old ideas into practice (POIIP) and teaching people how to apply them properly.“ [David Parnas; Software Engineering Notes (SEN) vol. 24 no. 3, May 1999]
Mike Flynn‘s Taxonomy: M. J. Flynn: “Very high-speed computing systems”; Proc. IEEE, Vol. 54, No. 12, pp. 1901–1909, Dec. 1966.
Diana‘s extended Taxonomy [D. Göhringer, M. Hübner, T. Perschke, J. Becker: “A Taxonomy of Reconfigurable Single/Multi-Processor Systems-on-Chip”; International Journal of Reconfigurable Computing, Hindawi, Special Issue: Selected Papers from ReCoSoC 2008]. I: instruction stream; D: data stream. 4 x SISD: rSI: I can be reconfigured at run time, e.g. RISP; rSD: can exchange data memory or datapath; rSIrSD: both possible. 4 x SIMD: rSI: I can be reconfigured at run time, e.g. RISP; rMD: SIMD processors can exchange their data memories or reconfigure their datapaths; rSIrMD: can reconfigure both, D and I, at run time. 4 x MIMD: rMI: MPSoCs w. reconfigurable I; rMD: MPSoCs w. reconfigurable D; rMIrMD: supports both
„But you can‘t implement decisions!“ POIIP: Software to Configware Migration: the decision box turns into a (de)multiplexer. Example: S = R + (if C then A else B endif); [Figure: section of a very large pipe network; a multiplexer controlled by C (0/1) selects A or B and feeds an adder together with R] [C. G. Bell et al: IEEE Trans-C21/5, May 1972] [W. A. Clark: 1967 SJCC, AFIPS Conf. Proc.]
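The migration of a decision box into a multiplexer can be mimicked in software as a rough analogy. The sketch below is not the author's tool flow: the function names are hypothetical, C is assumed to be a 0/1 signal, and the multiplexer is modeled by plain arithmetic so that the datapath version contains no branch. Both inputs A and B are always present, and C only selects which one reaches the adder.

```python
def control_flow_version(r: int, c: int, a: int, b: int) -> int:
    """Time domain: S = R + (if C then A else B), using a branch."""
    if c:
        return r + a
    return r + b


def mux(c: int, a: int, b: int) -> int:
    """2-to-1 multiplexer model; c must be 0 or 1. No branch is taken."""
    return c * a + (1 - c) * b


def datapath_version(r: int, c: int, a: int, b: int) -> int:
    """Space domain: the multiplexer output feeds the adder directly."""
    return r + mux(c, a, b)


for c in (0, 1):
    print(c, control_flow_version(10, c, 3, 7), datapath_version(10, c, 3, 7))
```

Both versions give the same result for either value of C, which is the point of the slide: the decision becomes a piece of structure rather than a step in an instruction stream.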
POIIP: Loop to Pipe Mapping (on „platform FPGAs“). rDPU: (reconfigurable) DataPath Unit. Loop: pipeline of rDPUs, one loop body per rDPU. Complex loop body: complex rDPU or pipe network inside rDPU. Nested loops: complex pipe network. [Figure: CPU and memory next to a transport-triggered pipe network: FMDemod, Split, LPF 1, LPF 2, LPF 3, HPF 1, HPF 2, HPF 3, Gather, Adder, Speaker. Source: MIT StreamIT]
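A small simulation can illustrate what loop-to-pipe mapping buys. This is a minimal sketch under stated assumptions, not the StreamIT or rDPU tool flow: each stage function stands in for one rDPU, every stage takes one clock step, at least one stage is given, and the names `run_loop` and `run_pipeline` are hypothetical. The loop version uses one processing unit for every stage of every item; the pipeline version keeps all stages busy at once.

```python
EMPTY = object()  # marks a pipeline register that currently holds no data


def run_loop(stages, inputs):
    """Time domain: one unit applies every stage to every item in turn."""
    outputs, steps = [], 0
    for item in inputs:
        for stage in stages:
            item = stage(item)
            steps += 1
        outputs.append(item)
    return outputs, steps


def run_pipeline(stages, inputs):
    """Space domain: one unit per stage, all units work in each clock step."""
    registers = [EMPTY] * len(stages)  # output register of each stage
    pending = list(inputs)
    outputs, steps = [], 0
    while pending or any(r is not EMPTY for r in registers):
        if registers[-1] is not EMPTY:
            outputs.append(registers[-1])  # last stage delivers its result
        # Update from the last stage backwards so data moves one stage per step.
        for i in range(len(stages) - 1, 0, -1):
            previous = registers[i - 1]
            registers[i] = stages[i](previous) if previous is not EMPTY else EMPTY
        registers[0] = stages[0](pending.pop(0)) if pending else EMPTY
        steps += 1
    return outputs, steps


stages = [lambda x: x * 2, lambda x: x + 1, lambda x: x * x]
samples = [1, 2, 3, 4, 5]
loop_out, loop_steps = run_loop(stages, samples)
pipe_out, pipe_steps = run_pipeline(stages, samples)
print(loop_out, loop_steps)
print(pipe_out, pipe_steps)
```

In this model the loop needs n x k steps for n items and k stages, while the pipeline needs n + k steps, because one step is spent filling each stage and one on delivering the last result.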
Imperative Language Twins: Software Languages (von Neumann languages) vs. Flowware Languages (anti-machine), e.g. MoPL [COMP-EURO ’89] [FPL‘94, Prague]. Flowware languages: very easy to learn; multiple GAGs: much more powerful; much simpler; simpler parallelism solution
A Heliocentric CS Model needed. Program Engineering (PE): the generalization of Software Engineering. Software Engineering (SE): instruction streams, CPU. Flowware Engineering (FE): data streams*, auto-sequencing Memory (asM). Configware Engineering (CE): structures, pipe network model, reconfigurable DataPath Unit (rDPU), reconfigurable DataPath Array (rDPA). *) do not confuse with „dataflow“!
A Clean Terminology, please. [Table: program source | compilation result; Software | instruction streams; Flowware | data streams; Configware | datapath structures configured]
Outline (5): Energy consumption of Computers; Toward Exascale Computing; The von Neumann Syndrome; We need to Reinvent Computing; Conclusions
Absurdly incomprehensible abstractions are the problem in „standard“ languages. We need model-based abstractions at algorithmic level & Locality Awareness! [For architecture design & debug] „Concurrency models can operate at component architecture level rather than programming languages.“ [E. A. Lee] [E. A. Lee: Are new languages necessary for multicore? 2007] [E. A. Lee: The problem with threads. Computer, 2006.]
Higher Abstraction Levels. Nick Tredennick: „Efforts to extend standards-based, serial programming languages with features to describe parallel constructs are likely to fail. What’s more likely to succeed are languages that raise the level of abstraction in algorithm description.“ Mauricio Ayala-Rincón: Term Rewriting Systems (TRS) may raise the abstraction level up to math formulae. TRS: powerful for better language design and design space exploration
Conclusions: Twin-paradigm skills & basic hardware knowledge are essential qualifications for programmers. We urgently need a fundamental CS Education and Research Revolution for dual-rail thinking. Since we have to rewrite software anyway, we should do it twin-paradigm. We need a tool flow & education efforts supporting a twin-paradigm approach and locality awareness
We need „une Levée en Masse“
Don‘t worry! Thank you very much! Too many panels and keynotes?
END
Time to space mapping: time domain (procedure domain) vs. space domain (structure domain). Program loop: n time steps, 1 CPU; pipeline: 1 time step, n DPUs. Bubble Sort (time algorithm): n x k time steps, 1 „conditional swap“ unit; Shuffle Sort (space/time algorithm): k time steps, n „conditional swap“ units. [Figure: conditional swap unit with inputs x, y; chain of conditional swap units]
Example: Architecture instead of synchro: „Shuffle Sort“. Direct time to space mapping of the conditional swap network leads to accessing conflicts. Modification with shuffle function: half the number of blocks + up and down of data (shuffle function). Better architecture instead of complex synchronisation: no von Neumann syndrome!
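The trade of time steps for conditional-swap units can be sketched in software. The slides do not spell out the Shuffle Sort algorithm, so the sketch below substitutes odd-even transposition sort, a standard sorting network that likewise alternates between two sets of neighbouring pairs; treat it as an assumed stand-in, not as the author's design. Within one phase all swaps touch disjoint pairs, so in hardware each could run on its own unit in a single time step.

```python
def conditional_swap(x, y):
    """One 'conditional swap' unit: returns the pair in ascending order."""
    return (x, y) if x <= y else (y, x)


def bubble_sort(values):
    """Time algorithm: a single conditional swap unit, used step after step."""
    data = list(values)
    steps = 0
    for _ in range(len(data) - 1):
        for i in range(len(data) - 1):
            data[i], data[i + 1] = conditional_swap(data[i], data[i + 1])
            steps += 1
    return data, steps


def odd_even_transposition_sort(values):
    """Space/time algorithm: each phase counts as one parallel time step."""
    data = list(values)
    steps = 0
    for phase in range(len(data)):
        # Even phases handle pairs (0,1), (2,3), ...; odd phases (1,2), (3,4), ...
        # The pairs within a phase are disjoint, so the swaps are independent.
        for i in range(phase % 2, len(data) - 1, 2):
            data[i], data[i + 1] = conditional_swap(data[i], data[i + 1])
        steps += 1
    return data, steps


unsorted = [5, 2, 9, 1, 5, 6]
serial_result, serial_steps = bubble_sort(unsorted)
parallel_result, parallel_steps = odd_even_transposition_sort(unsorted)
print(serial_result, serial_steps)
print(parallel_result, parallel_steps)
```

Alternating between the two pair sets plays a role comparable to the "up and down of data" on the slide: roughly half as many swap units are needed as in a naive one-unit-per-comparison layout, and no two units ever access the same element in the same step.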
Understanding Complex Hetero Systems: Layers of abstraction and automatic parallelization hide critical sources of, and limits to, efficient parallel execution. Efficient distribution of tasks is memory limited. Internode communication reduces computational efficiency. We must change how programmers think. Essential: awareness of locality, focusing on memory mapping issues and transfer modes to detect overhead and bottlenecks. Understanding streams through complex fabrics needed [Ed Lee]
Vertical Disintegration (courtesy Manfred Glesner)
Market Complexity. Source: Gartner
Taxonomy of Twin Paradigm Programming Flows (HPRC): E. El-Araby et al.: Comparative Analysis of High Level Programming for Reconfigurable Computers: Methodology And Empirical Study; Proc. SPL2007, Mar del Plata, Argentina, Febr. „The nroff of EDA“ [R. N.] [courtesy Richard Newton]
HLL programming models
Some hardware description languages: DeFacto, Galadriel & Nenya, MATCH
Some programming languages
Some languages for parallelism
More Languages: Some functional languages; Some datastream languages
Why Computers are important: R. Rajkumar, I. Lee, L. Sha, J. Stankovic: Cyber-Physical Systems: The Next Computing Revolution; DAC 2010
Science alone? See the claims by Andrew Jones, …
Mobile Communication [Table: Worldwide radio base station sites* (millions); Average power consumption per site (kW); Total power consumption of all sites (TW); Total global RAN energy consumption (TWh); total # of subscriptions expected (billions) 69; Broadband subscriptions expected (billions) 2; Video streams (%) 6690; Share of mobile data in total mobile traffic (%). *) all standards] Data transmission speed grows by a factor of ten every five years (cellular, local + personal area networks). Technologies to reduce energy consumption are a key enabler. [A. Fehske, J. Malmodin, G. Biczók, G. Fettweis: The Global Footprint of Mobile Communications – The Ecological and Economic Perspective; IEEE Communications Magazine, Aug 2011]
Undersea Cable: Google: 9,620 km submarine cable Japan-US; 1st use Febr 21, 2011. Five fiber pairs deliver up to 4.8 Terabits per second (Tbps); with multiple (e.g. 5) pairs of fibers, each pair has one fiber in each direction. Wavelength-division multiplexing dramatically increases fiber capacity. >100 kilometers between repeaters; repeater laser power consumption <25 W; <1000 repeaters: <25 kW. Power consumption of fabrication and cable layer ships is much higher
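The repeater figures can be combined into a rough upper bound on transmission energy. The sketch below only multiplies and divides the numbers given on the slide; the per-bit figure is a derived illustration that does not appear in the talk, and it assumes the cable runs at its full 4.8 Tbps. Under those assumptions the repeaters draw at most 25 kW in total, which is on the order of 5 nJ per transmitted bit.

```python
# Upper bounds taken from the slide.
MAX_REPEATERS = 1000              # "<1000 repeaters"
MAX_WATTS_PER_REPEATER = 25.0     # "repeater laser power consumption <25 W"
CAPACITY_BITS_PER_SECOND = 4.8e12 # "up to 4.8 Terabits per second"

total_repeater_power_w = MAX_REPEATERS * MAX_WATTS_PER_REPEATER

# Assumption: the cable is used at full capacity, so this is a best case per bit.
joules_per_bit = total_repeater_power_w / CAPACITY_BITS_PER_SECOND

print(f"total repeater power: under {total_repeater_power_w / 1e3:.0f} kW")
print(f"repeater energy per bit: under {joules_per_bit * 1e9:.1f} nJ")
```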