GP2: General Purpose Computation using Graphics Processors

GP2: General Purpose Computation using Graphics Processors. Dinesh Manocha & Avneesh Sud, http://gamma.cs.unc.edu/GPGP, Spring 2007, Department of Computer Science, UNC Chapel Hill

Instructors: Dinesh Manocha (dm@cs.unc.edu, 962-1749); Avneesh Sud (sud@cs.unc.edu, 962-1849)

Class Schedule: Current time slot: 2:00 – 3:15pm, Mon/Wed, SN011. Office hours: TBD. Class mailing list: gpgp@cs.unc.edu (??)

GPGP: What kind of course is it? Is it a graphics course? Is it a systems course? Is it an applications course? It is all of them!

Is this the right course for me? No strict prerequisites. The course borrows concepts from: computer graphics; linear algebra; numerical computations; architectures (CPUs & GPUs); parallel programming (data-parallel programming). Applications: geometric computations; database computations; scientific computing and physical simulation; computer vision; …

Modern Commodity Processors. [Diagram labels: CPU (2 x 3 GHz); 2 x 4 MB cache; system memory (4 GB); HyperTransport (20 GB/s); PCI-E bus (4 GB/s); GPU (1.3 GHz); video memory (768 MB).] Modern computer architectures consist of two kinds of processors, CPUs and GPUs, to handle these datasets. We glance quickly over the issues with CPUs and later explain the advantages of GPUs.

GPUs of Today! The GPU on commodity video cards has evolved into an extremely flexible and powerful processor: programmability; precision; power

GPGP: The GPU on commodity video cards has evolved into an extremely flexible and powerful processor (programmability; precision; power). This course will address how to harness that power for general-purpose computation (non-rasterization): algorithmic issues; programming and systems; applications

GeForce 7900 – 302M Transistors (2005)

GeForce 7900 – 302M Transistors (OUT OF DATE)

GeForce 8800 – 600M Transistors (2006)

Graphics Processing Units (GPUs): commodity processor for graphics applications; massively parallel vector processors; high memory bandwidth; low memory latency pipeline; programmable; high growth rate; power-efficient

GPU: Commodity Processor Laptops Consoles Cell phones PSP Desktops

GPU: Commodity Processor Laptops Consoles Cell phones ???? SuperComputers PSP Desktops

GPU: Commodity Processor Laptops Consoles Cell phones ???? iPhone PSP Desktops

Graphics Processing Units (GPUs): commodity processor for graphics applications; massively parallel vector processors (10-20x more operations per second than CPUs); high memory bandwidth; pipeline that better hides memory latency; programmable; high growth rate; power-efficient

Parallelism on GPUs: Graphics FLOPS: GPU – 1.3 TFLOPS; CPU – 25.6 GFLOPS
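
As a quick check on the two figures above, the ratio of the peak rates can be worked out directly. This is only a sketch using the numbers on the slide; the function and constant names are illustrative, and a peak-rate ratio says nothing about sustained application performance.

```python
def peak_ratio(gpu_flops: float, cpu_flops: float) -> float:
    """Return how many times larger the GPU peak rate is than the CPU peak rate."""
    return gpu_flops / cpu_flops

GPU_PEAK = 1.3e12   # 1.3 TFLOPS, taken from the slide
CPU_PEAK = 25.6e9   # 25.6 GFLOPS, taken from the slide

ratio = peak_ratio(GPU_PEAK, CPU_PEAK)
print(f"GPU peak is about {ratio:.1f}x the CPU peak")
```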

Quad SLI: 1.3 Billion transistors Jan’2006

Graphics Processing Units (GPUs): commodity processor for graphics applications; massively parallel vector processors; high memory bandwidth (10x more memory bandwidth than CPUs); pipeline that better hides latency; programmable; high growth rate; power-efficient

CPU vs. GPU Memory Hierarchy: Broad Level Comparison. [Diagram, built up over four slides. CPU side: Core 1 and Core 2 with registers and L1 Dcaches, an L2 cache (small, 4 MB), and DDR2 RAM (low bandwidth, 8 GB/s). GPU side: FP units with registers, an L1 cache, an L2 cache (very small), and GDDR4 RAM (high bandwidth, 86 GB/s). The L1 caches are annotated "write back" and "write through".]
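
To give the two bandwidth figures (86 GB/s and 8 GB/s) some scale, the sketch below estimates how long one full pass over a buffer would take at each rate. The buffer size reuses the 768 MB video memory figure from the earlier slide purely as a hypothetical working set; the function name is illustrative, and real transfers would not reach peak bandwidth.

```python
def stream_time_ms(num_bytes: float, bandwidth_gb_per_s: float) -> float:
    """Milliseconds needed to move num_bytes at the given bandwidth (1 GB = 1e9 bytes)."""
    return num_bytes / (bandwidth_gb_per_s * 1e9) * 1e3

BUFFER_BYTES = 768 * 1024**2   # hypothetical working set: 768 MB

for name, bandwidth in [("GDDR4 RAM", 86.0), ("DDR2 RAM", 8.0)]:
    print(f"{name}: {stream_time_ms(BUFFER_BYTES, bandwidth):.1f} ms per full pass")
```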

Graphics Processing Units (GPUs): commodity processor for graphics applications; massively parallel vector processors; high memory bandwidth; pipeline that better hides latency; programmable; high growth rate; power-efficient

GFLOPS for GPUs & CPUs [chart; labels: Graphics-Flops, Giga-Flops]

Graphics Processing Units (GPUs): commodity processor for graphics applications; massively parallel vector processors; high memory bandwidth; pipeline that better hides latency; programmable; high growth rate; power-efficient (high throughput per watt)

Computational Power of GPUs: Why are GPUs getting faster so fast? Arithmetic intensity: the specialized nature of GPUs makes it easier to spend additional transistors on computation rather than cache. Economics: the multi-billion-dollar video game market is the killer application that pays for innovation.
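
The slide does not define arithmetic intensity; a common working definition, assumed here, is floating-point operations per byte of memory traffic. The sketch below compares two textbook kernels under that assumption, with idealized traffic counts (every value moved exactly once) and 4-byte values. All function names are illustrative.

```python
def arithmetic_intensity(flops: float, bytes_moved: float) -> float:
    """Operations performed per byte of memory traffic (assumed definition)."""
    return flops / bytes_moved

def saxpy_intensity(n: int, bytes_per_value: int = 4) -> float:
    # y[i] = a * x[i] + y[i]: 2 flops per element; read x, read y, write y.
    return arithmetic_intensity(2 * n, 3 * n * bytes_per_value)

def matmul_intensity(n: int, bytes_per_value: int = 4) -> float:
    # C = A * B for n x n matrices: 2 * n^3 flops; ideally read A and B, write C once.
    return arithmetic_intensity(2 * n**3, 3 * n * n * bytes_per_value)

print(f"SAXPY: {saxpy_intensity(1_000_000):.3f} flops/byte")
print(f"1024 x 1024 matrix multiply: {matmul_intensity(1024):.1f} flops/byte")
```

Under these assumptions a kernel like SAXPY is limited by memory bandwidth, while dense matrix multiply leaves far more room to keep the ALUs busy.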

GPUs and Computer Architecture: Current research in computer architecture is looking at: streaming computation; flexible polymorphous computing systems; multi-core architecture; heterogeneous architecture. More on these topics in the future.

GPUs and Computer Architecture: Current research in computer architecture is looking at: streaming computation; flexible polymorphous computing systems; multi-core architecture; heterogeneous architecture. GPU-like architectures have a lot in common with all these research trends! We plan to touch on many of these topics as part of the course!

Is There a Future of GPGPU? http://www.informationweek.com/news/showArticle.jhtml?articleID=196800208: One of the Five Disruptive Technologies for 2007; http://www.wired.com/news/technology/computers/0,72090-0.html?tw=wn_index_9: SuperComputing’s Next Revolution

Capabilities of Current GPUs: Modern GPUs are deeply programmable: programmable pixel, vertex, and video engines; solidifying high-level language support. Modern GPUs support 32-bit floating-point precision: great development in the last few years; 64-bit arithmetic may be coming soon; almost IEEE FP compliant.

The Potential of GPGP: The power and flexibility of GPUs make them an attractive platform for general-purpose computation. Example applications range from in-game physics simulation and geometric applications to conventional computational science. Goal: make the inexpensive power of the GPU available to developers as a sort of computational coprocessor. Check out http://www.gpgpu.org

GPGP: Challenges: GPUs are designed for and driven by video games. The programming model is unusual and tied to computer graphics. The programming environment is tightly constrained. Underlying architectures are: inherently parallel; rapidly evolving (even in the basic feature set!); largely secret. No clear standards (besides DirectX, imposed by MSFT). Can’t simply “port” code written for the CPU! Is there a formal class of problems that can be solved using current GPUs?

Importance of Data Parallelism: GPUs are designed for the graphics and gaming industry: highly parallel tasks. GPUs process independent vertices and fragments: temporary registers are zeroed; no shared or static data; no read-modify-write buffers. Data-parallel processing: GPU architecture is ALU-heavy; multiple vertex and pixel pipelines, multiple ALUs per pipe; hide memory latency (with more computation).
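
The constraints listed above (independent elements, no shared data, no read-modify-write) can be mimicked on the CPU to show what a data-parallel kernel looks like. This is a minimal sketch in plain Python, not GPU code: `run_kernel` and `blur3` are hypothetical names, the input is frozen so a kernel can only gather from it, and each call produces exactly one output value.

```python
from typing import Callable, List, Sequence

def run_kernel(kernel: Callable[[int, Sequence[float]], float],
               src: Sequence[float]) -> List[float]:
    """Apply kernel independently at every index; each call yields only its own output."""
    frozen = tuple(src)  # read-only input: no read-modify-write on shared data
    return [kernel(i, frozen) for i in range(len(frozen))]

def blur3(i: int, src: Sequence[float]) -> float:
    """Gather the two neighbors (clamped at the borders) and average with the center."""
    left = src[max(i - 1, 0)]
    right = src[min(i + 1, len(src) - 1)]
    return (left + src[i] + right) / 3.0

result = run_kernel(blur3, [0.0, 3.0, 6.0, 9.0])
print(result)
```

Because no call depends on another call's output, the loop inside `run_kernel` could be executed in any order or all at once, which is the property the slide emphasizes.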

GPGPU Applications: geometric computations; database computations; scientific computing and physical simulation; signal processing; computer vision. Efficient when the computation domain is a uniform grid.

Geometric Computations: Distance computations: data-parallel computation. Demo (2D)
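
A 2D distance computation is data-parallel because every grid cell can find its nearest site without looking at any other cell's result. The sketch below is a brute-force CPU illustration of that idea, not the method used in the demo; `distance_field` and the point-site representation are assumptions made for the example.

```python
import math

def distance_field(width, height, sites):
    """Return grid[y][x] = Euclidean distance from cell (x, y) to the nearest site."""
    return [[min(math.hypot(x - sx, y - sy) for sx, sy in sites)
             for x in range(width)]
            for y in range(height)]

field = distance_field(4, 3, [(0, 0), (3, 2)])
for row in field:
    print(" ".join(f"{d:.2f}" for d in row))
```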

Geometric Computations Distance computations

Geometric Computations: Collision Detection and Proximity Computations. GPU: a culling co-processor. [Diagram: N objects pass through Stage 1 culling on the GPU (GPU-based culling with overlap tests for collision, distance-based culling for distance), producing a potentially colliding set or potential neighbor set; exact tests then run on the CPU.]
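
The two-stage structure (a cheap culling pass that produces a potentially colliding set, followed by exact tests on the survivors) can be sketched on the CPU. This example uses circles with axis-aligned bounding boxes as a stand-in for the culling stage; it is not the GPU-based culling the slide refers to, and every function name here is hypothetical.

```python
from itertools import combinations

def aabb(circle):
    """Axis-aligned bounding box (xmin, ymin, xmax, ymax) of a circle (x, y, r)."""
    x, y, r = circle
    return (x - r, y - r, x + r, y + r)

def boxes_overlap(a, b):
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]

def potentially_colliding_set(circles):
    """Culling stage: keep only pairs whose bounding boxes overlap."""
    boxes = [aabb(c) for c in circles]
    return [(i, j) for i, j in combinations(range(len(circles)), 2)
            if boxes_overlap(boxes[i], boxes[j])]

def exact_test(c1, c2):
    """Exact stage: two circles collide if their centers are within r1 + r2."""
    (x1, y1, r1), (x2, y2, r2) = c1, c2
    return (x1 - x2) ** 2 + (y1 - y2) ** 2 <= (r1 + r2) ** 2

def colliding_pairs(circles):
    return [(i, j) for i, j in potentially_colliding_set(circles)
            if exact_test(circles[i], circles[j])]

circles = [(0.0, 0.0, 1.0), (1.5, 0.0, 1.0), (1.8, 1.8, 1.0), (10.0, 10.0, 1.0)]
print(potentially_colliding_set(circles))  # survivors of culling
print(colliding_pairs(circles))            # pairs confirmed by the exact test
```

In this data the distant fourth circle is culled immediately, and one surviving pair is rejected only by the exact test, which is the division of labor the diagram describes.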

Geometric Computations Collision Detection

Geometric Computations Proximity Computations

Database Computations

Physical Simulation: Solving PDEs; numerical methods; linear algebra. Reaction-Diffusion Demo; Fluid Demo
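
Grid-based PDE solvers map well to data-parallel hardware because each cell is updated from the previous values of its neighbors only. The sketch below uses Jacobi relaxation for Laplace's equation as a simple stand-in; it is not the reaction-diffusion or fluid solver from the demos, and `jacobi_step` and `solve` are illustrative names.

```python
def jacobi_step(grid):
    """One Jacobi sweep: every interior cell becomes the average of its four
    neighbors in the old grid. Boundary cells stay fixed; the input is not modified."""
    height, width = len(grid), len(grid[0])
    new = [row[:] for row in grid]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            new[y][x] = 0.25 * (grid[y - 1][x] + grid[y + 1][x] +
                                grid[y][x - 1] + grid[y][x + 1])
    return new

def solve(grid, sweeps):
    """Repeat Jacobi sweeps, reading one grid and writing a fresh one each time."""
    for _ in range(sweeps):
        grid = jacobi_step(grid)
    return grid

# 5 x 5 grid: boundary held at 1.0, interior starts at 0.0.
start = [[1.0 if x in (0, 4) or y in (0, 4) else 0.0 for x in range(5)]
         for y in range(5)]
steady = solve(start, 200)
print(f"center value after 200 sweeps: {steady[2][2]:.6f}")
```

Reading from one grid and writing to another mirrors the "no read-modify-write" constraint: within a sweep, the interior cells could be updated in any order.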

Signal Processing: FFT, DCT, video processing. DCT demo; video filtering demo
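
Transforms such as the DCT fit the data-parallel model because each output coefficient is an independent weighted sum over the input. The sketch below is a naive, unnormalized O(n^2) DCT-II written for clarity; it is an assumption-laden illustration rather than the algorithm behind the demo, and a practical implementation would use a fast factorization.

```python
import math

def dct2(signal):
    """Naive unnormalized DCT-II: X[k] = sum_i x[i] * cos(pi / n * (i + 0.5) * k).
    Each coefficient X[k] is computed independently of the others."""
    n = len(signal)
    return [sum(x * math.cos(math.pi / n * (i + 0.5) * k)
                for i, x in enumerate(signal))
            for k in range(n)]

coeffs = dct2([1.0, 1.0, 1.0, 1.0])
print([round(c, 6) for c in coeffs])
```

For a constant input, only the k = 0 coefficient is nonzero, which makes a convenient sanity check.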

Computer Vision Realtime feature tracker (KLT)

Goals of this Course: A detailed introduction to general-purpose computing on graphics hardware. Emphasis includes: core computational building blocks; strategies and tools for programming GPUs; covering many applications and exploring new applications; highlighting major research issues.

Course Organization: Survey lectures: instructors, other faculty, senior graduate students; breadth and depth coverage. Student presentations.

Course Contents: Overview of GPUs: architecture and features. Models of computation for GPU-based algorithms. System issues: cache and data management; languages and compilers. Numerical and scientific computations: linear algebra computations; optimization, FFT, rigid body simulation, fluid dynamics. Geometric computations: proximity computations; distance fields; motion planning and navigation. Database computations: database queries (predicates, booleans, aggregates); streaming databases and data mining; sorting & searching. GPU clusters: parallel computing environments for GPUs. Rendering: ray tracing, photon mapping; shadows.
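
The database topics listed above (predicates and aggregates) illustrate two recurring data-parallel patterns: an independent per-record test, and a reduction that combines values pairwise over roughly log2(n) passes. The sketch below shows both on the CPU; the column data, the threshold, and the function names are all hypothetical, and this is not the course's or any library's implementation.

```python
def evaluate_predicate(column, predicate):
    """Per-record predicate evaluation: one independent test per record (1 or 0)."""
    return [1 if predicate(value) else 0 for value in column]

def tree_reduce(values, op, identity):
    """Pairwise reduction: each pass combines independent neighboring pairs,
    halving the data until one value remains."""
    data = list(values)
    if not data:
        return identity
    while len(data) > 1:
        if len(data) % 2:
            data.append(identity)  # pad odd-length passes with the identity
        data = [op(data[i], data[i + 1]) for i in range(0, len(data), 2)]
    return data[0]

def add(a, b):
    return a + b

salaries = [30, 55, 42, 70, 65]                        # hypothetical column
mask = evaluate_predicate(salaries, lambda s: s >= 50)  # hypothetical predicate
count = tree_reduce(mask, add, 0)                       # COUNT of matching records
total = tree_reduce([s * m for s, m in zip(salaries, mask)], add, 0)  # SUM over matches
print(mask, count, total)
```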

Student Load: Stay awake in classes! One class lecture. Read a lot of papers. 2-3 small assignments. A MAJOR COURSE PROJECT WITH RESEARCH COMPONENT

Course Projects: Work by yourself or as part of a small team. Develop new algorithms for simulation, geometric problems, or database computations. Formal model for GPU algorithms or GPU hacking. Issues in developing GPU clusters for scientific computation. Look into new architecture and parallel programming trends.

Possible Course Projects: Check the WWW site http://gamma.cs.unc.edu/GPGP/#projects