GPGPU Introduction Alan Gray EPCC The University of Edinburgh.

Slides:

Advertisements

Similar presentations

Accelerators for HPC: Programming Models Accelerators for HPC: StreamIt on GPU High Performance Applications on Heterogeneous Windows Clusters

Advertisements

Vectors, SIMD Extensions and GPUs COMP 4611 Tutorial 11 Nov. 26,

COMPUTER GRAPHICS CS 482 – FALL 2014 NOVEMBER 10, 2014 GRAPHICS HARDWARE GRAPHICS PROCESSING UNITS PARALLELISM.

Multiprocessors— Large vs. Small Scale Multiprocessors— Large vs. Small Scale.

Lecture 38: Chapter 7: Multiprocessors Today’s topic –Vector processors –GPUs –An example 1.

Lecture 6: Multicore Systems

GPU System Architecture Alan Gray EPCC The University of Edinburgh.

Exploiting Graphics Processors for High- performance IP Lookup in Software Routers Author: Jin Zhao, Xinya Zhang, Xin Wang, Yangdong Deng, Xiaoming Fu.

HPCC Mid-Morning Break High Performance Computing on a GPU cluster Dirk Colbry, Ph.D. Research Specialist Institute for Cyber Enabled Discovery.

GPU Programming and CUDA Sathish Vadhiyar Parallel Programming.

GPUs on Clouds Andrew J. Younge Indiana University (USC / Information Sciences Institute) UNCLASSIFIED: 08/03/2012.

Alternative Processors 5/22/20151 John Gustafson CEO, Massively Parallel Technologies (Former CTO, ClearSpeed)

1 ITCS 6/8010 CUDA Programming, UNC-Charlotte, B. Wilkinson, Jan 19, 2011 Emergence of GPU systems and clusters for general purpose High Performance Computing.

CUDA Programming Lei Zhou, Yafeng Yin, Yanzhi Ren, Hong Man, Yingying Chen.

University of Michigan Electrical Engineering and Computer Science Amir Hormati, Mehrzad Samadi, Mark Woh, Trevor Mudge, and Scott Mahlke Sponge: Portable.

Panda: MapReduce Framework on GPU’s and CPU’s

GPU Graphics Processing Unit. Graphics Pipeline Scene Transformations Lighting & Shading ViewingTransformations Rasterization GPUs evolved as hardware.

GPGPU overview. Graphics Processing Unit (GPU) GPU is the chip in computer video cards, PS3, Xbox, etc – Designed to realize the 3D graphics pipeline.

GPGPU platforms GP - General Purpose computation using GPU

Accelerating SQL Database Operations on a GPU with CUDA Peter Bakkum & Kevin Skadron The University of Virginia GPGPU-3 Presentation March 14, 2010.

GPU Programming EPCC The University of Edinburgh.

An Introduction to Programming with CUDA Paul Richmond

GPU Programming with CUDA – Accelerated Architectures Mike Griffiths

Computer System Architectures Computer System Software

CuMAPz: A Tool to Analyze Memory Access Patterns in CUDA

GPU Programming David Monismith Based on notes taken from the Udacity Parallel Programming Course.

1 The Performance Potential for Single Application Heterogeneous Systems Henry Wong* and Tor M. Aamodt § *University of Toronto § University of British.

Computer Graphics Graphics Hardware

BY: ALI AJORIAN ISFAHAN UNIVERSITY OF TECHNOLOGY 2012 GPU Architecture 1.

Extracted directly from:

1b.1 Types of Parallel Computers Two principal approaches: Shared memory multiprocessor Distributed memory multicomputer ITCS 4/5145 Parallel Programming,

By Arun Bhandari Course: HPC Date: 01/28/12. GPU (Graphics Processing Unit) High performance many core processors Only used to accelerate certain parts.

© Janice Regan, CMPT 300, May CMPT 300 Introduction to Operating Systems Principles of I/0 hardware.

GPU Programming with CUDA – Optimisation Mike Griffiths

© David Kirk/NVIDIA and Wen-mei W. Hwu, 1 Programming Massively Parallel Processors Lecture Slides for Chapter 1: Introduction.

General Purpose Computing on Graphics Processing Units: Optimization Strategy Henry Au Space and Naval Warfare Center Pacific 09/12/12.

GPU in HPC Scott A. Friedman ATS Research Computing Technologies.

CUDA All material not from online sources/textbook copyright © Travis Desell, 2012.

Programming Concepts in GPU Computing Dušan Gajić, University of Niš Programming Concepts in GPU Computing Dušan B. Gajić CIITLab, Dept. of Computer Science.

Use of GPUs in ALICE (and elsewhere) Thorsten Kollegger TDOC-PG | CERN |

GPU Architecture and Programming

GPU Programming and CUDA Sathish Vadhiyar Parallel Programming.

CS662 Computer Graphics Game Technologies Jim X. Chen, Ph.D. Computer Science Department George Mason University.

Some key aspects of NVIDIA GPUs and CUDA. Silicon Usage.

GPUs: Overview of Architecture and Programming Options Lee Barford firstname dot lastname at gmail dot com.

OpenCL Programming James Perry EPCC The University of Edinburgh.

By Dirk Hekhuis Advisors Dr. Greg Wolffe Dr. Christian Trefftz.

May 8, 2007Farid Harhad and Alaa Shams CS7080 Overview of the GPU Architecture CS7080 Final Class Project Supervised by: Dr. Elias Khalaf By: Farid Harhad.

CENTRAL PROCESSING UNIT. CPU Does the actual processing in the computer. A single chip called a microprocessor. Composed of an arithmetic and logic unit.

GPUs – Graphics Processing Units Applications in Graphics Processing and Beyond COSC 3P93 – Parallel ComputingMatt Peskett.

Introduction to CUDA CAP 4730 Spring 2012 Tushar Athawale.

Computer Architecture Lecture 24 Parallel Processing Ralph Grishman November 2015 NYU.

GPGPU introduction. Why is GPU in the picture Seeking exa-scale computing platform Minimize power per operation. – Power is directly correlated to the.

GPU Performance Optimisation Alan Gray EPCC The University of Edinburgh.

Processor Level Parallelism 2. How We Got Here Developments in PC CPUs.

Constructing a system with multiple computers or processors 1 ITCS 4/5145 Parallel Programming, UNC-Charlotte, B. Wilkinson. Jan 13, 2016.

Heterogeneous Processing KYLE ADAMSKI. Overview What is heterogeneous processing? Why it is necessary Issues with heterogeneity CPU’s vs. GPU’s Heterogeneous.

Matthew Royle Supervisor: Prof Shaun Bangay.  How do we implement OpenCL for CPUs  Differences in parallel architectures  Is our CPU implementation.

GPGPU Programming with CUDA Leandro Avila - University of Northern Iowa Mentor: Dr. Paul Gray Computer Science Department University of Northern Iowa.

Computer Graphics Graphics Hardware

Auburn University COMP8330/7330/7336 Advanced Parallel and Distributed Computing Parallel Hardware Dr. Xiao Qin Auburn.

A Level Computing – a2 Component 2 1A, 1B, 1C, 1D, 1E.

Graphics Processing Unit

Presented by: Isaac Martin

Computer Graphics Graphics Hardware

Introduction to CUDA.

6- General Purpose GPU Programming

CSE 502: Computer Architecture

Multicore and GPU Programming

Presentation transcript:

GPGPU Introduction Alan Gray EPCC The University of Edinburgh

Introduction Central Processing Unit (CPU) of a computer system must be able to perform a wide variety of tasks efficiently. Until (relatively) recently, most CPUs comprised of 1 sophisticated compute core (for arithmetic), plus complex arrangement of controllers, memory caches, etc Increases in CPU performance were achieved through increases in the clock frequency of the core. –This has now reached it’s limit mainly due to power requirements Today, processor cores are not getting any faster, but instead we are getting increasing numbers of cores per chip. –Plus other forms of parallelism such as SSE,AVX vector instruction support Harder for applications to exploit such technology advances. 2Alan Gray

Introduction Meanwhile…. In recent years computer gaming industry has driven development of a different type of chip: the Graphics Processing Unit (GPU) Silicon largely dedicated to high numbers (hundreds) of simplistic cores, –at the expense of controllers, caches, sophistication etc GPUs work in tandem with the CPU (communicating over PCIe), and are responsible for generating the graphical output display –Computing pixel values Inherently parallel - each core computes a certain set of pixels Architecture has evolved for this purpose 3Alan Gray

Introduction GPU performance has been increasing much more rapidly than CPU Can we use GPUs for general purpose computation? –Yes (with some effort). 4Alan Gray

GPGPU GPGPU: General Purpose computation on Graphics Processing Units. GPU acts as an “accelerator” to the CPU (heterogeneous system) –Most lines of code are executed on the CPU (serial computing) –Key computational kernels are executed on the GPU (stream computing) –Taking advantage of the large number of cores and high graphics memory bandwidth –AIM: code performs better than use of CPU alone. GPUs now firmly established in HPC industry –Can augment each node of parallel system with GPUs 5Alan Gray

GPGPU: Stream Computing Data set decomposed into a stream of elements A single computational function (kernel) operates on each element –“thread” defined as execution of kernel on one data element Multiple cores can process multiple elements in parallel –i.e. many threads running in parallel Suitable for data-parallel problems 6Alan Gray

Programming Considerations Standard (CPU) code will not run on a GPU unless it is adapted Programmer must –decompose problem onto the hardware in a specific way (e.g. using a hierarchical thread/grid model in CUDA) –Manage data transfers between the separate CPU and GPU memory spaces. –Traditional language (C, C++, Fortran etc) not enough, need extensions, directives, or new language. Once code is ported to GPU, optimization work is usually required to tailor it to the hardware and achieve good performance Many researchers are now successfully exploiting GPUs –Across a wide range of application areas 7Alan Gray