Computing for the Near and Long Term Haldun Hadimioglu Spring 2010 CS/EE 1012.


1 Computing for the Near and Long Term Haldun Hadimioglu Spring 2010 CS/EE 1012

2 Spring 2010 CS/EE1012 Introduction to Computer Engineering Page 2
Outline
- What has happened?
- Designing chips
- Near future directions
- Long term directions
- Conclusions
[Figure: Intel eight-core Xeon die with 2.3 billion transistors]
[Figure: Cray Jaguar supercomputer, the fastest computer in the world]

3 What has Happened?
- Moore’s Law has been holding since the 1960s
- It will continue to hold
  - Perhaps at a slower rate of doubling every three years
- Smaller transistors are susceptible to alpha particles!
- We will have very small transistors!
- More transistors will be defective!

4 Intel's Past Microprocessor Roadmap
- Intel eight-core Xeon processor (>26MB cache) ,300,000,000
- Intel 1.01 TFLOP, 100 million transistor, 62-Watt, 80-core die, each core at 3.16GHz
- Intel eight-core Xeon 7500 die with 2.3 billion transistors

5 Power Density was Increasing Exponentially!
[Figure: power density in Watts/cm² for the i386, i486, Pentium®, Pentium® Pro, Pentium® II, Pentium® III and Pentium® 4, compared with a hot plate, a nuclear reactor and a rocket nozzle]
- Power was doubling every 4 years
Courtesy: “New Microarchitecture Challenges in the Coming Generations of CMOS Process Technologies”, Fred Pollack, Intel Corp., Micro32 conference keynote. Courtesy Avi Mendelson, Intel.

6 Microprocessor speed
- Every two years the speed of microprocessors doubles
  - The processor speed increases 50% a year!
  - But memory speed increases 10% a year!
- Microprocessor speed for an application depends on
  - Number of operations in the application (lower is better)
    - The quality of the code
  - Number of parallel operations performed (higher is better)
    - Do more operations in parallel
  - How fast each operation is performed (higher is better)
    - Because of Moore’s Law: transistors are smaller and wires are shorter
    - Clock frequency is increased
- Until 2005, increasing the clock frequency was the main way to increase the speed
  - Power consumption (heat generation) increases with the frequency
  - The chip has to be cooled by using a heat sink, a fan, or a liquid
- Since 2005, power consumption has changed the way speed is increased
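The two growth rates on this slide compound, which is why the processor-memory gap widens so quickly. A minimal Python sketch, assuming both start from the same relative speed of 1.0 and grow at exactly the quoted annual rates (the function names are illustrative, not from the slides):

```python
def relative_speed(annual_growth, years):
    """Speed relative to year 0 after compounding a fixed annual growth rate."""
    return (1.0 + annual_growth) ** years


def speed_gap(years, cpu_growth=0.50, mem_growth=0.10):
    """Ratio of processor speed to memory speed after the given number of years.

    The default rates are the 50% and 10% per year quoted on the slide.
    """
    return relative_speed(cpu_growth, years) / relative_speed(mem_growth, years)


for years in (2, 5, 10):
    print(f"after {years:2d} years: "
          f"processor x{relative_speed(0.50, years):.2f}, "
          f"memory x{relative_speed(0.10, years):.2f}, "
          f"gap x{speed_gap(years):.2f}")
```

At 50% a year the processor is 2.25 times faster after two years, which is roughly the "doubles every two years" figure; over ten years the gap to memory grows to more than a factor of twenty.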

7 Multi-Core Microprocessors
- Since 2005 microprocessor speed increase depends on
  - Number of operations in the code (the quality of the code)
  - Number of parallel operations performed
- Dual-core microprocessors with reduced frequency consume less power (generate less heat)
- Two/Four/Eight cores perform more operations in parallel
  - The speed increase continues into the future with more cores on chip
- Clock frequency
- Number of cores per chip doubles every two years
- The memory can become a bottleneck
  - The memory speed increases 10% a year
  - More cores increase the demand on the memory
  - The memory wall problem
- Parallel programming has to be improved dramatically
  - Parallel programming wall
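The claim that a dual-core chip at reduced frequency consumes less power can be illustrated with the usual first-order model of CMOS dynamic power, P ∝ V² · f per core. The sketch below is an assumption-laden illustration, not something taken from the slides: it assumes supply voltage can be lowered in proportion to frequency, ignores static (leakage) power, and assumes the work is perfectly parallel. The 0.75 scaling factor is an arbitrary example value.

```python
def relative_power(cores, freq_scale, volt_scale):
    """Dynamic power relative to one core at full frequency and voltage.

    Assumes the first-order model P ~ V^2 * f per core (leakage ignored).
    """
    return cores * volt_scale ** 2 * freq_scale


def relative_throughput(cores, freq_scale):
    """Peak throughput relative to one full-speed core.

    Assumes perfectly parallel work, which real programs rarely achieve.
    """
    return cores * freq_scale


# Hypothetical example: two cores, each at 75% frequency and 75% voltage.
power = relative_power(cores=2, freq_scale=0.75, volt_scale=0.75)
speed = relative_throughput(cores=2, freq_scale=0.75)
print(f"dual-core power: {power:.3f}x, peak throughput: {speed:.2f}x")
```

Under these assumptions the dual-core configuration draws less power than the single full-speed core while offering more peak throughput, but only if the software actually keeps both cores busy, which is the parallel programming wall the slide refers to.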

8 Designing Chips
- We have been using hardware description languages (HDLs) to design chips
  - We write an HDL program to design a chip!
  - Just like we draw a schematic to design a chip
- Why an HDL program, why not schematics?
  - Real life circuits are too complex to be designed by schematics
- There are two popular HDLs today
  - VHDL
  - Verilog HDL
- Knowing one HDL language helps one learn another HDL language faster

9 Why HDLs?
- Software: statements are executed sequentially
  - The sequence of statements is significant, since they are executed in that order
  - Java, C++, C, Ada, Pascal, Fortran, …
- Hardware: events happen concurrently
  - A software language cannot be used for describing and simulating hardware
  - Concurrent software languages cannot be used either
    - Because we do not have powerful tools
- Programs in C/C++ etc. will be used to design chips in the future
  - It is already done for C and C++ programs in limited cases
    - First they are converted to HDL programs and then to hardware

10 Full Adder VHDL Program
Data-flow description of the Full Adder circuit:
[Figure: Full Adder block with inputs ki, mi, ci and outputs si, co]
$s_i = \bar{k}_i \bar{m}_i c_i + \bar{k}_i m_i \bar{c}_i + k_i \bar{m}_i \bar{c}_i + k_i m_i c_i$
$c_o = k_i m_i + k_i c_i + m_i c_i$
[Figure: IBM dual-core BlueGene/L microprocessor die & its chip, © IBM]
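The full adder's data-flow equations can be checked exhaustively against integer addition. The following is a Python sketch, not the VHDL program shown on the slide (which is not part of this transcript). It writes the sum in the standard sum-of-products form with the complemented inputs spelled out, and the carry as the three-term expression from the slide; the function name `full_adder` is an illustrative choice.

```python
from itertools import product


def full_adder(k, m, c):
    """Return (s, co) for one-bit inputs k, m and carry-in c (each 0 or 1)."""
    nk, nm, nc = 1 - k, 1 - m, 1 - c  # complemented inputs
    # Sum: true when an odd number of inputs are 1 (four minterms).
    s = (nk & nm & c) | (nk & m & nc) | (k & nm & nc) | (k & m & c)
    # Carry-out: true when at least two inputs are 1.
    co = (k & m) | (k & c) | (m & c)
    return s, co


# Check all eight input combinations against ordinary addition.
for k, m, c in product((0, 1), repeat=3):
    s, co = full_adder(k, m, c)
    assert 2 * co + s == k + m + c
    print(k, m, c, "->", "s =", s, "co =", co)
```

Because both equations are evaluated independently from the same inputs, the model mirrors the data-flow style, in which all signal assignments are concurrent.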

11 VHDL Details: 3-to-8 Decoder

12 3-to-8 Decoder VHDL Program
Entity Part:
[Figure: 3-to-8 decoder (DCD) V74x138 with inputs A0, A1, A2, G1, G2A_L, G2B_L and outputs Y_L0 to Y_L7]

13 3-to-8 Decoder VHDL Program
- All statements happen concurrently
Architecture Part:
[Figure: 3-to-8 decoder (DCD) V74x138 with inputs A0, A1, A2, G1, G2A_L, G2B_L and outputs Y_L0 to Y_L7]
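The VHDL entity and architecture from these slides are not included in the transcript. As a stand-in, here is a small Python behavioral model of a 74x138-style 3-to-8 decoder, using the port names from the figure. It assumes the standard 74x138 behavior: the decoder is enabled when G1 is high and both G2A_L and G2B_L are low, and the outputs are active low. The function name `v74x138` and the argument order are illustrative choices.

```python
def v74x138(g1, g2a_l, g2b_l, a2, a1, a0):
    """Return the active-low outputs [Y_L0, ..., Y_L7] as a list of 0/1 values."""
    # Enabled only when G1 = 1 and both active-low enables are asserted (0).
    enabled = g1 == 1 and g2a_l == 0 and g2b_l == 0
    # A2 is the most significant select bit, A0 the least significant.
    selected = (a2 << 2) | (a1 << 1) | a0
    # Exactly one output goes low when enabled; all stay high otherwise.
    return [0 if enabled and i == selected else 1 for i in range(8)]


# Enabled, select input 5 (A2 A1 A0 = 1 0 1): only Y_L5 is low.
print(v74x138(1, 0, 0, 1, 0, 1))
# Disabled (G1 = 0): every output stays high.
print(v74x138(0, 0, 0, 1, 0, 1))
```

Each output depends only on the current inputs, which matches the slide's point that all statements in the architecture happen concurrently.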

14 Near Future Directions
- Double the number of cores every two years
- Make sure to handle errors due to
  - Alpha particles
  - Defective transistors
- Make sure to improve
  - Parallel programming
- Make sure to handle
  - Memory wall
  - Power wall

15 Near Future Directions
HPC Wire, December 4, 2009

September 1, The IBM Power7 chips are implemented in a 45 nanometer copper/SOI process and have 1.2 billion transistors with eight cores on a single die. The Power7 core has 32KB of L1 instruction cache and 32KB of L1 data cache. Each core sports simultaneous multithreading that delivers four virtual threads per core, and has 256KB of L2 cache tightly coupled to it. The chip also has 32MB of embedded DRAM that acts as a shared L3 cache, with 4MB segments affiliated with each of the eight cores. The Power7 chip has two dual-channel DDR3 memory controllers implemented on the chip, which deliver 100 GB/sec of sustained bandwidth per chip.

November 27, 2009: Intel Unveils 48-Core Research Chip. On Wednesday Intel shifted its Tera-scale Computing Research Program into second gear by demonstrating a 48-core x86 processor. The company is intending to use the new chip as a research platform for the purpose of lighting a fire under many-core computing. According to Intel, the new chip boasts 1.3 billion transistors and is built on 45nm CMOS technology. Its distinction is that it contains the largest number of Intel Architecture (IA) cores ever assembled on a single microprocessor. As such, it represents the sequel to Intel's 2007 "Polaris" 80-core prototype that was based on simple floating point units. While the latter chip was said to reach 2 teraflops, the company is not talking about performance for the 48-core version.

16 Intel & IBM Vision for Next 5-8 Years
- Intel: from Intel Technology Journal, November 2005
- IBM: from “Scalable High Performance Main Memory System Using PCM Technology”, Moinuddin K. Qureshi et al., ISCA 2009

17 Near Future Directions: Next 5-8 Years
- Applications
  - Intel: Recognition, Mining, Synthesis as platform 2015 Workload Model (on massively parallel core chips)
  - IBM: Presence information, knowing where and things are and how to best match them, people are sensorized
  - Microsoft: Intention machine, computer predicts user intentions and delivers useful information
  - CMU: Computational thinking, computer science based approach to solving problems, designing systems, understanding human behavior
- Traditional computing will continue
  - A C/C++/Java program for an application becomes Software
    - A compiler generates the machine language program file
- A new type of computing
  - A C/C++/Java program for an application becomes Hardware
    - A hardware compiler generates the transistor circuit
    - The result is a custom chip

18 Near Future Directions: New Computing Types?
- Any other new possibility?
  - A C/C++/Java program for an application becomes Hardware
    - A CAD tool generates the bit file to reconfigure the FPGA
- An FPGA chip is a hardware programmable chip
  - The chip emulates the circuit designed
  - The bit file configures the chip
  - The CS 2204 Digital Logic Lab uses FPGAs!
- There can be more opportunities with FPGA chips!
  - FPGAs are increasingly used in commercial products!
  - FPGAs are becoming cost competitive with microprocessors
  - FPGAs are becoming speed competitive with custom chips
  - FPGAs are used for applications where speed and programmability matter
- Latest FPGAs also have microprocessor cores
  - They can run software as well
  - The application can be divided into software and hardware

19 Near Future Directions: New Computing Types
- A C/C++/Java program becomes part software and part hardware
  - An FPGA with cores and reconfigurable areas runs applications
    - Software is run by processor cores and hardware is in the reconfigurable area
  - When such an FPGA runs an application, some operations are in hardware and simultaneously some operations are in software
  - Software tools (compilers) and CAD tools must merge
- Reconfigurable areas & cores allow recovering from errors due to
  - Alpha particles
  - Defective transistors
- Processor core to run software
- Reconfigurable area to do operations in hardware
- These FPGAs are available now but we need much better tools

20 Near Future Directions: Hybrid Switching Elements
- CMOL: a circuitry composed of CMOS and nanodevices
- A closer look at FPGA-like reconfigurable logic circuits
- Interface between CMOS and nanodevices
- Two CMOS cells and a nanodevice
- A larger view of FPGA-like reconfigurable logic circuits
Figures from: Konstantin K. Likharev

21 Near Future Directions: Possible New Structures
- Microelectromechanical systems (MEMS) with computing elements
  - Microembedded systems
    - Smart Dust at UC Berkeley
  - Microbiolab on a chip
    - Sometimes referred to as a biochip!
- Other structures that can be used for a number of different applications with or without computing elements
  - Microcameras
  - Microsensors
  - Micromirrors
  - Micromotors
  - Microlenses
- An all-optical computing chip with
  - Micromirrors
  - Microlenses
- Bio MEMS
  - The Biochip Group at Mesa+, University of Twente, Holland

22 Near Future Directions: Year 2020
- SEMATECH: consortium of semiconductor manufacturers from America, Asia and Europe
- SEMATECH predictions for year 2020 (from its 2009 Update of the International Technology Roadmap for Semiconductors, ITRS, study):
  - Clock speed: 12 GHz
  - Number of transistors on a microprocessor chip: 35 billion
  - 32Gbit DRAM chips
  - Process length: 14 nm
- Make sure to handle errors due to
  - Alpha particles
  - Defective transistors

23 Long Term Directions: Possible New Structures
- Nanotechnology
- Programmable materials
- NEMS
- Bio NEMS
- Nano medicine
  - Drug delivery
  - Smart diagnosis
- Nanocomputing
  - 1 Watt supercomputer
  - Quantum computing
  - Molecular computing
- Molecular self assembly
- Testing of molecular structures
- Adaptive molecular structures
- Merger of bio and non-bio structures
- Synthetic biology
[Figure: IBM Blue Gene/L molecular dynamics demo]

24 Long Term Directions: 2020 and Beyond
- Many interconnected varying-size computing elements using each other’s results autonomously
  - Ubiquitous computing with little human intervention
  - Cloud computing to nano computing
  - Personal agents
  - Intelligent spaces
  - Nano medicine
    - Targeted drug delivery
- We need
  - Self-healing, adaptive, self managing, trustworthy, dependable hardware and software
  - Efficient parallel processing
  - New computational models
  - New programming languages
  - Hardware and software reliability

25 Long Term Directions: 2020 and Beyond
- Will hardware and software be developed separately like today?
- How will software be developed for nano systems?
  - Quantum software?
  - Molecular software?
  - Biosoftware?
- How will hardware be developed for nano systems?
  - VHDL or Verilog HDL or C or C++ or ?
- Developing tools is critical
[Figure: Iron atoms on copper with electron movement]
[Figure: Simulation of protein molecules folding on a supercomputer]

26 Long Term Directions: 2020 and Beyond
- By 2019 a $1000 computer will match the processing power of the human brain
  - Raymond Kurzweil, KurzweilAI.net, 9/1/1999
  - His keynote speech at the Supercomputing Conference (SC06) in November 2006
    - The title of his talk is “The Coming Merger of Biological and Non-Biological Intelligence”
    - Singularity point?
- Brain downloads possible by 2050
  - Ian Pearson, Head of British Telecom’s futurology unit, CNN.com, 5/23/2005
- Computers will be used as virtual brain extensions?
- Direct brain - Internet link?

27 Long Term Directions
Hans Moravec, 1998
- Many ethical issues will be facing you!
- Being prepared will help!

28 Conclusions
- Digital Logic evolution will continue:
  - Faster, cheaper, smaller, lighter, less power consuming, higher reliability digital products
- Due to converging research in various areas:
  - Mathematics
  - Computer Science
  - Computer Engineering
  - Electrical Engineering
  - Mechanical Engineering
  - Physics
  - Chemistry
  - Material Science
  - Biology
  - ?
- There will be many ethical issues
  - Try to prepare!
  - Try to be informed!

