Heterogeneous Computing Dr. Jason D. Bakos. Heterogeneous Computing 2 “Traditional” Parallel/Multi-Processing Large-scale parallel platforms: –Individual.

Slides:



Advertisements
Similar presentations
Multiprocessors— Large vs. Small Scale Multiprocessors— Large vs. Small Scale.
Advertisements

Lecture 6: Multicore Systems
GPU System Architecture Alan Gray EPCC The University of Edinburgh.
GPGPU Introduction Alan Gray EPCC The University of Edinburgh.
Appendix A — 1 FIGURE A.2.2 Contemporary PCs with Intel and AMD CPUs. See Chapter 6 for an explanation of the components and interconnects in this figure.
HPCC Mid-Morning Break High Performance Computing on a GPU cluster Dirk Colbry, Ph.D. Research Specialist Institute for Cyber Enabled Discovery.
GRAPHICS AND COMPUTING GPUS Jehan-François Pâris
Chapter 8 Hardware Conventional Computer Hardware Architecture.
Heterogeneous Computing at USC Dept. of Computer Science and Engineering University of South Carolina Dr. Jason D. Bakos Assistant Professor Heterogeneous.
Heterogeneous Computing: New Directions for Efficient and Scalable High-Performance Computing Dr. Jason D. Bakos.
Heterogeneous Computing: New Directions for Efficient and Scalable High-Performance Computing CSCE 791 Dr. Jason D. Bakos.
Heterogeneous Computing at USC Dept. of Computer Science and Engineering University of South Carolina Dr. Jason D. Bakos Assistant Professor Heterogeneous.
Seven Minute Madness: Special-Purpose Parallel Architectures Dr. Jason D. Bakos.
Some Thoughts on Technology and Strategies for Petaflops.
Define Embedded Systems Small (?) Application Specific Computer Systems.
CUDA Programming Lei Zhou, Yafeng Yin, Yanzhi Ren, Hong Man, Yingying Chen.
Weekly Report Start learning GPU Ph.D. Student: Leo Lee date: Sep. 18, 2009.
FPGA Acceleration of Phylogeny Reconstruction for Whole Genome Data Jason D. Bakos Panormitis E. Elenis Jijun Tang Dept. of Computer Science and Engineering.
Seven Minute Madness: Reconfigurable Computing Dr. Jason D. Bakos.
Trends in the Infrastructure of Computing: Processing, Storage, Bandwidth CSCE 190: Computing in the Modern World Dr. Jason D. Bakos.
ECE 526 – Network Processing Systems Design
Seven Minute Madness: Reconfigurable Computing Dr. Jason D. Bakos.
University of Michigan Electrical Engineering and Computer Science Amir Hormati, Mehrzad Samadi, Mark Woh, Trevor Mudge, and Scott Mahlke Sponge: Portable.
Introduction What is GPU? It is a processor optimized for 2D/3D graphics, video, visual computing, and display. It is highly parallel, highly multithreaded.
GPGPU overview. Graphics Processing Unit (GPU) GPU is the chip in computer video cards, PS3, Xbox, etc – Designed to realize the 3D graphics pipeline.
1 A survey on Reconfigurable Computing for Signal Processing Applications Anne Pratoomtong Spring2002.
Presenter MaxAcademy Lecture Series – V1.0, September 2011 Introduction and Motivation.
Introduction to Reconfigurable Computing Greg Stitt ECE Department University of Florida.
EKT303/4 PRINCIPLES OF PRINCIPLES OF COMPUTER ARCHITECTURE (PoCA)
Networking Virtualization Using FPGAs Russell Tessier, Deepak Unnikrishnan, Dong Yin, and Lixin Gao Reconfigurable Computing Group Department of Electrical.
Motivation “Every three minutes a woman is diagnosed with Breast cancer” (American Cancer Society, “Detailed Guide: Breast Cancer,” 2006) Explore the use.
© Fujitsu Laboratories of Europe 2009 HPC and Chaste: Towards Real-Time Simulation 24 March
GPU-accelerated Evaluation Platform for High Fidelity Networking Modeling 11 December 2007 Alex Donkers Joost Schutte.
David Luebke NVIDIA Research GPU Computing: The Democratization of Parallel Computing.
BY: ALI AJORIAN ISFAHAN UNIVERSITY OF TECHNOLOGY 2012 GPU Architecture 1.
Heterogeneous Computing: New Directions for Efficient and Scalable High-Performance Computing CSCE 791 Dr. Jason D. Bakos.
Extracted directly from:
GPUs and Accelerators Jonathan Coens Lawrence Tan Yanlin Li.
COMPUTER SCIENCE &ENGINEERING Compiled code acceleration on FPGAs W. Najjar, B.Buyukkurt, Z.Guo, J. Villareal, J. Cortes, A. Mitra Computer Science & Engineering.
Trends in the Infrastructure of Computing: Processing, Storage, Bandwidth CSCE 190: Computing in the Modern World Dr. Jason D. Bakos.
Publication: Ra Inta, David J. Bowman, and Susan M. Scott. Int. J. Reconfig. Comput. 2012, Article 2 (January 2012), 1 pages. DOI= /2012/ Naveen.
High Performance Computing Processors Felix Noble Mirayma V. Rodriguez Agnes Velez Electric and Computer Engineer Department August 25, 2004.
1 © 2012 The MathWorks, Inc. Parallel computing with MATLAB.
Programming Concepts in GPU Computing Dušan Gajić, University of Niš Programming Concepts in GPU Computing Dušan B. Gajić CIITLab, Dept. of Computer Science.
NVIDIA Tesla GPU Zhuting Xue EE126. GPU Graphics Processing Unit The "brain" of graphics, which determines the quality of performance of the graphics.
Reminder Lab 0 Xilinx ISE tutorial Research Send me an if interested Looking for those interested in RC with skills in compilers/languages/synthesis,
VTU – IISc Workshop Compiler, Architecture and HPC Research in Heterogeneous Multi-Core Era R. Govindarajan CSA & SERC, IISc
Introduction to Reconfigurable Computing Greg Stitt ECE Department University of Florida.
Hardware Acceleration Using GPUs M Anirudh Guide: Prof. Sachin Patkar VLSI Consortium April 4, 2008.
Summary Background –Why do we need parallel processing? Moore’s law. Applications. Introduction in algorithms and applications –Methodology to develop.
Next Generation Operating Systems Zeljko Susnjar, Cisco CTG June 2015.
Seven Minute Madness: Heterogeneous Computing Dr. Jason D. Bakos.
Introduction What is GPU? It is a processor optimized for 2D/3D graphics, video, visual computing, and display. It is highly parallel, highly multithreaded.
GPUs: Overview of Architecture and Programming Options Lee Barford firstname dot lastname at gmail dot com.
Compiler and Runtime Support for Enabling Generalized Reduction Computations on Heterogeneous Parallel Configurations Vignesh Ravi, Wenjing Ma, David Chiu.
Trends in the Infrastructure of Computing
1)Leverage raw computational power of GPU  Magnitude performance gains possible.
Current Research Overview Jeremy Espenshade 09/04/08.
GPGPU introduction. Why is GPU in the picture Seeking exa-scale computing platform Minimize power per operation. – Power is directly correlated to the.
Trends in the Infrastructure of Computing: Processing, Storage, Bandwidth CSCE 190: Computing in the Modern World Dr. Jason D. Bakos.
Fermi National Accelerator Laboratory & Thomas Jefferson National Accelerator Facility SciDAC LQCD Software The Department of Energy (DOE) Office of Science.
Seven Minute Madness: Heterogeneous Computing Dr. Jason D. Bakos.
Heterogeneous Processing KYLE ADAMSKI. Overview What is heterogeneous processing? Why it is necessary Issues with heterogeneity CPU’s vs. GPU’s Heterogeneous.
Real-Time Ray Tracing Stefan Popov.
CSCE 190: Computing in the Modern World Dr. Jason D. Bakos
Characteristics of Reconfigurable Hardware
Graphics Processing Unit
Multicore and GPU Programming
Multicore and GPU Programming
Presentation transcript:

Heterogeneous Computing Dr. Jason D. Bakos

Heterogeneous Computing 2 “Traditional” Parallel/Multi-Processing Large-scale parallel platforms: –Individual computers connected with a high-speed interconnect –Programs are dispatched from a head node Upper bound for speedup is n, where n = # processors –How much parallelism in program? –System, network overheads?

Heterogeneous Computing 3 Heterogeneous Computing Computer system made up of interconnected parallel processors of mixed types Generally includes: –One or more general-purpose processor(s) Acts as the “host” –One or more special-purpose processor(s) Acts as the “co-processors” Advantage: –Accelerates each general-purpose processor by a factor of 10 to 1000 –One co-processor add-in board can replace 10s to 1000s of processors in a traditional parallel computer

Heterogeneous Computing 4 Heterogeneous Execution Model initialization 0.5% of run time “hot” loop 99% of run time clean up 0.5% of run time instructions executed over time 49% of code 1% of code co-processor

Heterogeneous Computing 5 Heterogeneous Computing: Performance Move “bottleneck” computations from software to FPGA –Use FPGA as co-processor Example: –Application requires a week of CPU time –One computation consumes 99% of execution time Kernel speedup Application speedup Execution time hours hours hours hours hours

Heterogeneous Computing 6 High-Performance Reconfigurable Computing Heterogeneous computing with reconfigurable logic, i.e. FPGAs

Heterogeneous Computing 7 High-Performance Reconfigurable Computing Advantage of HPRC: –Cost FPGA card => ~ $15K 128-processor cluster => ~ $150K + maintenance + cooling + electricity + recycling Challenges: –Programming the FPGA –Identifying kernels –Optimizing accelerator design

Heterogeneous Computing 8 Programming FPGAs

Heterogeneous Computing 9 Heterogeneous Computing with GPUs Graphics Processor Unit (GPU) –Contains hundreds of small processor cores grouped hierarchically –Has high bandwidth to on-board memory and to host memory –Became “programmable” about two years ago –Gained hardware double precision about one year ago Examples: IBM Cell, nVidia GeForce, AMD FireStream Advantage over FPGAs: –Easier to program –Less expensive (gamers drove high volumes, decreasing cost) Drawbacks: –Can only do floating point fast (computations that map well to shaders)?

Heterogeneous Computing 10 Heterogeneous Computing with GPUs

Heterogeneous Computing 11 Heterogeneous computing is mainstream: IBM Roadrunner Los Alamos, fastest computer in the world 6,480 AMD Opteron (dual core) CPUs 12,960 PowerXCell 8i GPUs Each blade contains 2 Operons and 4 Cells 296 racks 1.71 petaflops peak (1.7 billion million fp operations per second) 2.35 MW (not including cooling) –Lake Murray hydroelectric plant produces ~150 MW (peak) –Lake Murray coal plant (McMeekin Station) produces ~300 MW (peak) –Catawba Nuclear Station near Rock Hill produces 2258 MW

Heterogeneous Computing 12 Open Questions What characteristics make a particular program amenable to acceleration? Are FPGA-based co-processors better suited for some types of computations than GPU-based co-processors? How can we develop efficient and effective methodologies for adapting general-purpose codes to heterogeneous systems?

Heterogeneous Computing 13 Our Group Past projects: –Custom FPGA accelerators: computational biology linear algebra –Multi-FPGA interconnection networks: interface abstractions adaptive routing algorithms on-chip router designs Current projects: –Design tools Dynamic code analysis Semi-automatic accelerator generation –Assessing GPU accelerators versus FPGA accelerators for various computations –Accelerators for solving large systems of ODEs