David Angulo Rubio, FAMU CIS Graduate Student


1 David Angulo Rubio, FAMU CIS Graduate Student

2 Introduction
  • GPUs (Graphics Processing Units) on video cards have evolved rapidly in recent years. They have become extremely powerful and flexible:
      - Programmability
      - Precision
      - Power
  • GPGPU computing is an emerging field whose objective is to harness GPUs for general-purpose computation.

3 GPU Performance Trends

4 Motivation: Flexible and Precise
  • Modern GPUs are deeply programmable
      - Programmable pixel, vertex, and video engines
      - Solidifying high-level language support
  • Modern GPUs support high precision
      - 32-bit floating point throughout the pipeline
      - High enough for many (not all) applications
      - Newest GPUs have 64-bit support


6 Stream Programming Abstraction
  • Streams
      - Collection of data records
      - All data is expressed in streams
  • Kernels
      - Inputs/outputs are streams
      - Perform computation on streams
      - Can be chained together
  [Figure: a kernel with streams as input and output]
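The stream and kernel vocabulary on this slide can be sketched on the CPU in a few lines of Python. This is only an illustration of the abstraction, not GPU code: the names `scale_kernel`, `offset_kernel`, and `chain` are hypothetical and do not come from the presentation.

```python
from typing import Callable, Iterable, Iterator

Record = float
Kernel = Callable[[Iterable[Record]], Iterator[Record]]


def scale_kernel(stream: Iterable[Record]) -> Iterator[Record]:
    """Kernel: consumes a stream and produces a stream, one record at a time."""
    for record in stream:
        yield record * 2.0


def offset_kernel(stream: Iterable[Record]) -> Iterator[Record]:
    """Second kernel with the same stream-in, stream-out shape."""
    for record in stream:
        yield record + 1.0


def chain(*kernels: Kernel) -> Kernel:
    """Chain kernels so the output stream of one feeds the next."""
    def pipeline(stream: Iterable[Record]) -> Iterable[Record]:
        for kernel in kernels:
            stream = kernel(stream)
        return stream
    return pipeline


input_stream = [1.0, 2.0, 3.0]
pipeline = chain(scale_kernel, offset_kernel)
output_stream = list(pipeline(input_stream))
print(output_stream)  # [3.0, 5.0, 7.0]
```

Each kernel handles a record without looking at any other record, which is the property that lets a GPU process many records at the same time.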

7 Stream Programming Abstraction
  [Figure: dolphin triangle mesh]

8 Stream Programming Abstraction
  • Benchmark Funnel: In this simulation, a cloth falls into a funnel and passes through it under the pressure of a ball. This model has 47K vertices, 92K triangles, and a lot of self-collisions. Our novel GPU-based CCD algorithm takes 4.4 ms and 10 ms per frame to compute all the collisions on an NVIDIA GeForce GTX 480 and an NVIDIA GeForce GTX 285, respectively.

9 Stream Programming Abstraction

10 Why Streams
  • Ample computation by exposing parallelism
      - Streams expose data parallelism: multiple stream elements can be processed in parallel
      - Pipeline (task) parallelism: multiple tasks can be processed in parallel
      - Kernels yield high arithmetic intensity
  • Efficient communication
      - Producer-consumer locality
      - Predictable memory access pattern
      - Optimize for throughput of all elements, not latency of one
      - Processing many elements at once allows latency hiding
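A small worked example may help with "arithmetic intensity" and "producer-consumer locality". It takes arithmetic intensity to mean operations performed per word moved to or from memory, which is the usual definition but is not spelled out on the slide. The per-element operation and word counts are made-up illustrative numbers, and `arithmetic_intensity` is a hypothetical helper.

```python
def arithmetic_intensity(ops_per_element: int, words_per_element: int) -> float:
    """Operations performed per word moved to or from memory."""
    return ops_per_element / words_per_element


# Assumed toy workload: two kernels, each doing 1 operation per element.
# Run separately, each kernel reads 1 word and writes 1 word per element,
# so the intermediate stream makes a round trip through memory.
separate = arithmetic_intensity(ops_per_element=1 + 1, words_per_element=2 + 2)

# Chained with producer-consumer locality, the intermediate value never
# leaves the processor: 1 read, 2 operations, 1 write per element.
chained = arithmetic_intensity(ops_per_element=2, words_per_element=2)

print(separate, chained)  # 0.5 1.0
```

Under these assumptions, keeping the intermediate stream local doubles the work done per word of memory traffic.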

11 CPU-GPU Analogies
  • Stream / data array = Texture
  • Memory read = Texture sample

12 Structuring a GPU Program
  • CPU assembles input data.
  • CPU transfers data to the GPU (GPU "main memory" or "device memory").
  • CPU calls the GPU program (or set of kernels). The GPU runs out of GPU main memory.
  • When the GPU finishes, the CPU copies the results back into CPU memory.
      - Recent interfaces allow these steps to overlap.
  • What lessons can we draw from this sequence of operations?
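The four steps above can be walked through with a mock device written in plain Python. `MockDevice` and its `upload`, `run`, and `download` methods are hypothetical stand-ins invented for this sketch. They are not a real GPU API. They only model the fact that the device has its own memory and that data must be copied in both directions.

```python
class MockDevice:
    """Hypothetical stand-in for a GPU with memory separate from the host."""

    def __init__(self):
        self.memory = {}

    def upload(self, name, host_data):
        # Copy host -> device; the device keeps its own copy.
        self.memory[name] = list(host_data)

    def run(self, kernel, src, dst):
        # The kernel reads and writes device memory only.
        self.memory[dst] = [kernel(x) for x in self.memory[src]]

    def download(self, name):
        # Copy device -> host.
        return list(self.memory[name])


def square(x):
    """Toy kernel applied to every element."""
    return x * x


host_input = [1, 2, 3, 4]                 # 1. CPU assembles input data
device = MockDevice()
device.upload("in", host_input)           # 2. CPU transfers data to the device
device.run(square, src="in", dst="out")   # 3. CPU launches the kernel
host_output = device.download("out")      # 4. CPU copies the results back
print(host_output)  # [1, 4, 9, 16]
```

One lesson the sequence suggests is that steps 2 and 4 are pure data movement, so the computation in step 3 needs to be large enough to justify them.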




16 Kernels
  • CPU-GPU analogy: kernel / loop body / algorithm step = fragment program
  • You write one program. It runs on every vertex/fragment.
  [Figure label: ADVECT]
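The "loop body = fragment program" analogy can be illustrated by writing the same computation twice in Python: once as a CPU loop and once as a per-element function that receives only its own index. The three-point blur is an arbitrary example chosen for this sketch (it is not the advection step from the figure), and `blur_kernel`, `cpu_loop`, and `gpu_style` are hypothetical names.

```python
def blur_kernel(i, src):
    """Compute ONE output element from its index, like a fragment program."""
    left = src[max(i - 1, 0)]                # clamped read, akin to a texture sample
    right = src[min(i + 1, len(src) - 1)]
    return (left + src[i] + right) / 3.0


def cpu_loop(src):
    """CPU version: the loop owns the iteration and contains the body inline."""
    out = []
    for i in range(len(src)):
        left = src[max(i - 1, 0)]
        right = src[min(i + 1, len(src) - 1)]
        out.append((left + src[i] + right) / 3.0)
    return out


def gpu_style(src):
    """GPU-style version: the same kernel is invoked once per output element.

    The calls are independent of one another, so a GPU could run them in parallel.
    """
    return [blur_kernel(i, src) for i in range(len(src))]


data = [0.0, 3.0, 6.0, 9.0]
assert cpu_loop(data) == gpu_style(data)
print(gpu_style(data))  # [1.0, 3.0, 6.0, 8.0]
```

The kernel never writes anywhere except its own output slot, which is what makes "one program, run on every element" safe to parallelize.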

17 Conclusion
  • Can we apply these techniques to more general problems?
  • GPUs should excel at tasks that require:
      - Ample computation
      - Regular computation
      - Efficient communication
