Graphic Processing Units Presentation by John Manning.


Contents
- What is a GPU?
- History
- Hardware
- Software
- Current Trends

What is a GPU?
What does Graphics Processing Unit (GPU) mean? A GPU is a single-chip processor primarily used to manage and boost the performance of video and graphics. GPU features include:
- 2D or 3D graphics
- Digital output to display monitors
- Application support for high-intensity graphics software
- Polygon rendering

What is a GPU? (cont.)
- These features are designed to lessen the work of the CPU and produce faster video and graphics.
- A GPU is not only used in a PC on a video card or motherboard; it is also used in mobile phones, display adapters, workstations and game consoles.
- Also known as a VPU (Visual Processing Unit).

History
1999 – Nvidia releases the GeForce 256.
“This GPU model could process 10 million polygons per second and had more than 22 million transistors. The GeForce 256 was a single-chip processor with integrated transform, drawing and BitBLT support, lighting effects, triangle setup/clipping and rendering engines.”

History (cont.)
- In the timeframe, computer scientists and domain scientists from various fields started using GPUs to accelerate a range of scientific applications. This was the advent of the movement called GPGPU, or General-Purpose computation on GPU.
- While users achieved unprecedented performance (over 100x compared to CPUs in some cases), the challenge was that GPGPU required the use of graphics programming APIs like OpenGL and Cg to program the GPU. This limited accessibility to the tremendous capabilities of GPUs for science.

CPU vs. GPU and Moore’s Law
Transistor count:
- CPU – Intel Pentium 4 – 42 million
- GPU – Nvidia NV15 – 25 million
- CPU – Cell Processor – 241 million
- GPU – Nvidia G80 – 681 million
- CPU – 62-core Xeon Phi – 5 billion ($ )
- GPU – Nvidia Titan – 7.1 billion ($1000)

Hardware

Hardware (cont.)

Rendering Pipeline

Software
C++ AMP
“C++ Accelerated Massive Parallelism (AMP) accelerates execution of C++ code by taking advantage of data-parallel hardware such as a graphics processing unit on a discrete graphics card. By using C++ AMP, you can code multi-dimensional data algorithms so that execution can be accelerated by using parallelism on heterogeneous hardware. The C++ AMP programming model includes multidimensional arrays, indexing, memory transfer, tiling and a mathematical function library. You can use C++ AMP language extensions to control how data is moved from the CPU to the GPU and back, so that you can improve performance.”
us/library/windows/hardware/ff569246(v=vs.85).aspx

Software (cont.)

    #include <iostream>

    // Plain C++ version: add two arrays element by element on the CPU.
    void StandardMethod() {
        int aCPP[] = {1, 2, 3, 4, 5};
        int bCPP[] = {6, 7, 8, 9, 10};
        int sumCPP[5];

        for (int idx = 0; idx < 5; idx++) {
            sumCPP[idx] = aCPP[idx] + bCPP[idx];
        }

        for (int idx = 0; idx < 5; idx++) {
            std::cout << sumCPP[idx] << "\n";
        }
    }

Software (cont.)

    #include <amp.h>
    #include <iostream>
    using namespace concurrency;

    const int size = 5;

    void CppAmpMethod() {
        int aCPP[] = {1, 2, 3, 4, 5};
        int bCPP[] = {6, 7, 8, 9, 10};
        int sumCPP[size];

        // Create C++ AMP objects.
        array_view<const int, 1> a(size, aCPP);
        array_view<const int, 1> b(size, bCPP);
        array_view<int, 1> sum(size, sumCPP);
        sum.discard_data();

        parallel_for_each(
            // Define the compute domain, which is the set of threads that are created.
            sum.extent,
            // Define the code to run on each thread on the accelerator.
            [=](index<1> idx) restrict(amp) {
                sum[idx] = a[idx] + b[idx];
            }
        );

        // Print the results. The expected output is "7, 9, 11, 13, 15".
        for (int i = 0; i < size; i++) {
            std::cout << sum[i] << "\n";
        }
    }

Software (cont.)
CUDA – Compute Unified Device Architecture
- Created by Nvidia
- Parallel programming platform
With millions of CUDA-enabled GPUs sold to date, software developers, scientists and researchers are finding broad-ranging uses for GPU computing with CUDA. Here are a few examples:
- Identify hidden plaque in arteries: Heart attacks are the leading cause of death worldwide. Harvard Engineering, Harvard Medical School and Brigham & Women's Hospital have teamed up to use GPUs to simulate blood flow and identify hidden arterial plaque without invasive imaging techniques or exploratory surgery.
- Analyze air traffic flow: The National Airspace System manages the nationwide coordination of air traffic flow. Computer models help identify new ways to alleviate congestion and keep airplane traffic moving efficiently. Using the computational power of GPUs, a team at NASA obtained a large performance gain, reducing analysis time from ten minutes to three seconds.
- Visualize molecules: A molecular simulation called NAMD (nanoscale molecular dynamics) gets a large performance boost with GPUs. The speed-up is a result of the parallel architecture of GPUs, which enables NAMD developers to port compute-intensive portions of the application to the GPU using the CUDA Toolkit.

Software (cont.)
- Physics engines commonly employ GPUs to handle the massive computations required for video.
- DirectX – a collection of APIs (application programming interfaces) for handling tasks related to media.

Current Trends
Nvidia Titan
- 2,688 CUDA cores
- 7.1 billion transistors
- 837 MHz clock rate
- GDDR5 memory interface
- 384-bit memory interface width
- Memory bandwidth GB/sec
- 4.5 teraflops single precision
- 1.3 teraflops double precision

Current Trends
AMD Radeon HD 7970
- Up to 925 MHz engine clock
- 3 GB GDDR5 memory
- 1375 MHz memory clock (5.5 Gbps GDDR5)
- 264 GB/s memory bandwidth
- 3.79 TFLOPS single precision
- 947 GFLOPS double precision
- 2048 cores

Conclusion
- GPUs have been around for less than 15 years but are already breaking new ground in computation technology.
- GPUs contain more transistors than CPUs and follow Moore’s Law more closely.
- A GPU core is closer to an ALU in a CPU than to an actual processor core.
- C++ AMP and CUDA allow parallel processing for quickly doing massive number crunching.

Sources
- -processing-unit-gpu
- Msdn.microsoft.com/en- us/library/vstudio/hh aspx
- En.wikipedia.org/wiki/transistor_count
- fermi-gpu-architecture-revealed
- us/library/windows/hardware/ff569246%28v=vs. 85%29.aspx