Graphic Processing Units Presentation by John Manning.


1 Graphic Processing Units Presentation by John Manning

2 Contents
What is a GPU?
History
Hardware
Software
Current Trends

3 What is a GPU?
What does Graphics Processing Unit (GPU) mean?
A GPU is a single-chip processor primarily used to manage and boost the performance of video and graphics. GPU features include:
2D or 3D graphics
Digital output to display monitors
Application support for high-intensity graphics software
Rendering polygons

4 What is a GPU? (cont.)
These features are designed to lessen the work of the CPU and produce faster video and graphics.
A GPU is not only used in a PC on a video card or motherboard; it is also used in mobile phones, display adapters, workstations and game consoles.
Also known as a VPU (Visual Processing Unit).
http://www.youtube.com/watch?v=-IiRzmfs5aw

5 History
1999 – Nvidia releases the GeForce 256
“This GPU model could process 10 million polygons per second and had more than 22 million transistors. The GeForce 256 was a single-chip processor with integrated transform, drawing and BitBLT support, lighting effects, triangle setup/clipping and rendering engines.”

6 History (cont.)
In the 1999-2000 timeframe, computer scientists and domain scientists from various fields started using GPUs to accelerate a range of scientific applications. This was the advent of the movement called GPGPU, or General-Purpose computation on GPU.
While users achieved unprecedented performance (over 100x compared to CPUs in some cases), the challenge was that GPGPU required the use of graphics programming APIs like OpenGL and Cg to program the GPU. This limited accessibility to the tremendous capabilities of GPUs for science.

7 CPU vs. GPU and Moore’s Law
Transistor count
2000
CPU – Intel Pentium 4 – 42 million
GPU – Nvidia NV15 – 25 million
2006
CPU – Cell Processor – 241 million
GPU – Nvidia G80 – 681 million
2012
CPU – 62-core Xeon Phi – 5 billion ($2500-3000)
GPU – Nvidia Titan – 7.1 billion ($1000)
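As a rough check on the Moore's Law comparison, the sketch below estimates the doubling period implied by the 2000 and 2012 transistor counts listed on this slide. It assumes steady exponential growth between those two data points. The function names are made up for this illustration, and two data points per processor type are far too few to support a firm conclusion.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper (not from the presentation): years per doubling,
// assuming steady exponential growth from start_count to end_count.
double doubling_period_years(double start_count, double end_count, double years) {
    assert(start_count > 0.0 && end_count > start_count && years > 0.0);
    const double doublings = std::log2(end_count / start_count);
    return years / doublings;
}

// Figures taken from the slide: 2000 vs. 2012 transistor counts.
double cpu_doubling_period() { return doubling_period_years(42e6, 5e9, 12.0); }
double gpu_doubling_period() { return doubling_period_years(25e6, 7.1e9, 12.0); }
```

With the slide's figures this works out to roughly 1.74 years per doubling for the CPUs and roughly 1.47 years for the GPUs.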

8 Hardware

9 Hardware (cont.)

10 Rendering Pipeline

11 Software
C++ AMP
“C++ Accelerated Massive Parallelism (AMP) accelerates execution of C++ code by taking advantage of data-parallel hardware such as a graphics processing unit on a discrete graphics card. By using C++ AMP, you can code multi-dimensional data algorithms so that execution can be accelerated by using parallelism on heterogeneous hardware. The C++ AMP programming model includes multidimensional arrays, indexing, memory transfer, tiling and a mathematical function library. You can use C++ AMP language extensions to control how data is moved from the CPU to the GPU and back, so that you can improve performance.”
http://msdn.microsoft.com/en-us/library/windows/hardware/ff569246(v=vs.85).aspx

12 Software (cont.)
    #include <iostream>

    // Plain C++ version: add two arrays element by element on the CPU.
    void StandardMethod() {
        int aCPP[] = {1, 2, 3, 4, 5};
        int bCPP[] = {6, 7, 8, 9, 10};
        int sumCPP[5];

        for (int idx = 0; idx < 5; idx++) {
            sumCPP[idx] = aCPP[idx] + bCPP[idx];
        }

        for (int idx = 0; idx < 5; idx++) {
            std::cout << sumCPP[idx] << "\n";
        }
    }

13 Software (cont.)
    #include <amp.h>
    #include <iostream>
    using namespace concurrency;

    const int size = 5;

    void CppAmpMethod() {
        int aCPP[] = {1, 2, 3, 4, 5};
        int bCPP[] = {6, 7, 8, 9, 10};
        int sumCPP[size];

        // Create C++ AMP objects.
        array_view<const int, 1> a(size, aCPP);
        array_view<const int, 1> b(size, bCPP);
        array_view<int, 1> sum(size, sumCPP);
        sum.discard_data();

        parallel_for_each(
            // Define the compute domain, which is the set of threads that are created.
            sum.extent,
            // Define the code to run on each thread on the accelerator.
            [=](index<1> idx) restrict(amp) {
                sum[idx] = a[idx] + b[idx];
            }
        );

        // Print the results. The expected output is "7, 9, 11, 13, 15".
        for (int i = 0; i < size; i++) {
            std::cout << sum[i] << "\n";
        }
    }
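C++ AMP needs a Microsoft compiler and amp.h, so for readers without that toolchain, here is a portable sketch of the same shape in standard C++: a compute domain (the number of indices) plus a function that is run once per index. The helper `for_each_index` is hypothetical and runs the indices one after another on the CPU. It only mimics the structure of `parallel_for_each` and provides no parallelism or acceleration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for parallel_for_each: calls kernel(i) for every
// index in [0, extent). Sequential on the CPU, so no speed-up is implied.
template <typename Kernel>
void for_each_index(std::size_t extent, Kernel kernel) {
    for (std::size_t i = 0; i < extent; ++i) {
        kernel(i);
    }
}

// Element-wise sum written in the "one index per invocation" style.
std::vector<int> add_elementwise(const std::vector<int>& a,
                                 const std::vector<int>& b) {
    assert(a.size() == b.size());
    std::vector<int> sum(a.size());
    // Each invocation touches only its own index, which is the property
    // that lets real data-parallel hardware run the invocations at once.
    for_each_index(sum.size(), [&](std::size_t idx) {
        sum[idx] = a[idx] + b[idx];
    });
    return sum;
}
```

Calling `add_elementwise({1, 2, 3, 4, 5}, {6, 7, 8, 9, 10})` yields 7, 9, 11, 13, 15, the same values the slide's example is expected to print.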

14 Software (cont.)
CUDA – Compute Unified Device Architecture
Created by Nvidia
Parallel programming platform
With millions of CUDA-enabled GPUs sold to date, software developers, scientists and researchers are finding broad-ranging uses for GPU computing with CUDA. Here are a few examples:
Identify hidden plaque in arteries: Heart attacks are the leading cause of death worldwide. Harvard Engineering, Harvard Medical School and Brigham & Women's Hospital have teamed up to use GPUs to simulate blood flow and identify hidden arterial plaque without invasive imaging techniques or exploratory surgery.
Analyze air traffic flow: The National Airspace System manages the nationwide coordination of air traffic flow. Computer models help identify new ways to alleviate congestion and keep airplane traffic moving efficiently. Using the computational power of GPUs, a team at NASA obtained a large performance gain, reducing analysis time from ten minutes to three seconds.
Visualize molecules: A molecular simulation called NAMD (nanoscale molecular dynamics) gets a large performance boost with GPUs. The speed-up is a result of the parallel architecture of GPUs, which enables NAMD developers to port compute-intensive portions of the application to the GPU using the CUDA Toolkit.

15 Software (cont.)
Physics engines commonly employ GPUs to handle the massive computations required for video.
http://www.youtube.com/watch?v=143k1fqPukk
DirectX – a collection of APIs (application programming interfaces) for handling media-related tasks
http://www.youtube.com/watch?v=4G9anRoYGko

16 Current Trends
Nvidia Titan
2,688 CUDA cores
7.1 billion transistors
837 MHz clock rate
GDDR5 memory interface
384-bit memory interface width
Memory bandwidth 288.4 GB/sec
4.5 teraflops single precision
1.3 teraflops double precision

17 Current Trends
AMD Radeon HD 7970
Up to 925 MHz engine clock
3 GB GDDR5 memory
1375 MHz memory clock (5.5 Gbps GDDR5)
264 GB/s memory bandwidth
3.79 TFLOPS single precision
947 GFLOPS double precision
2048 cores
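The throughput and bandwidth figures on the Titan and Radeon HD 7970 slides can be roughly reproduced from the core counts and clock rates. The sketch below assumes that each core completes one fused multiply-add (counted as 2 floating-point operations) per clock, and that peak bandwidth is bus width times per-pin data rate. Neither formula is stated in the presentation. The 384-bit bus width for the Radeon HD 7970 is also not on the slide and is supplied here as an assumption, and the function names are illustrative.

```cpp
#include <cassert>

// Peak single-precision throughput in TFLOPS.
// ASSUMPTION: one fused multiply-add (2 FLOPs) per core per clock.
double peak_sp_tflops(int cores, double clock_mhz) {
    assert(cores > 0 && clock_mhz > 0.0);
    const double flops_per_core_per_clock = 2.0;
    return cores * flops_per_core_per_clock * clock_mhz * 1e6 / 1e12;
}

// Peak memory bandwidth in GB/s.
// ASSUMPTION: bandwidth = (bus width in bytes) * (per-pin data rate in Gbps).
double peak_bandwidth_gbs(int bus_width_bits, double data_rate_gbps) {
    assert(bus_width_bits > 0 && data_rate_gbps > 0.0);
    return bus_width_bits / 8.0 * data_rate_gbps;
}
```

Under these assumptions, `peak_sp_tflops(2688, 837.0)` comes to about 4.50 and `peak_sp_tflops(2048, 925.0)` to about 3.79, in line with the quoted 4.5 and 3.79 TFLOPS. `peak_bandwidth_gbs(384, 5.5)` gives 264 GB/s, matching the Radeon figure if the bus is indeed 384 bits wide.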

18 Conclusion
GPUs have been around for less than 15 years but are already breaking new ground in computing technology.
High-end GPUs now contain more transistors than CPUs and have followed Moore’s Law more closely.
A GPU core is closer to an ALU in a CPU than to a full processor core.
C++ AMP and CUDA enable parallel processing for fast, large-scale number crunching.

19 Sources
www.technopedia.com/definition/24682/graphics-processing-unit-gpu
Msdn.microsoft.com/en-us/library/vstudio/hh265136.aspx
www.nvidia.com/object/cuda_home_new.html
En.wikipedia.org/wiki/transistor_count
http://techreport.com/review/17670/nvidia-fermi-gpu-architecture-revealed
http://msdn.microsoft.com/en-us/library/windows/hardware/ff569246%28v=vs.85%29.aspx

