The programmable pipeline. Lecture 10. Slides courtesy of Dr. Suresh Venkatasubramanian.


Programmable pipeline
[Diagram: 3D Application or Game -> 3D API (OpenGL or Direct3D) -> CPU-GPU boundary (AGP/PCIe) -> GPU Front End -> Primitive Assembly -> Rasterization and Interpolation -> Raster Operations -> Frame Buffer. The Programmable Vertex Processor sits between the GPU Front End and Primitive Assembly; the Programmable Fragment Processor sits between Rasterization and Interpolation and Raster Operations. Labeled data flows: 3D API Commands, GPU Command & Data Stream, Vertex Index Stream, Pre-transformed Vertices, Transformed Vertices, Assembled Primitives, Pixel Location Stream, Pre-transformed Fragments, Transformed Fragments, Pixel Updates.]

How to program a shader?

How does Cg run? Your application must call the Cg Runtime to invoke the Cg programs and pass the appropriate parameters.

The vertex pipeline
Input: per-vertex attributes (position, color, texture coordinates).
Input: uniform and constant parameters.
- Matrices can be passed to a vertex program.
- Lighting/material parameters can also be passed.

The vertex pipeline
Operations:
- Math/swizzle ops
- Matrix operators
- Flow control (as before)
[nv3x] No access to textures.
Output:
- Modified vertices (position, color)
- Vertex data is transmitted to primitive assembly.
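As a rough illustration of the inputs, operations, and outputs listed above, here is a minimal CPU-side sketch in Python of what a vertex program does. It is not Cg and not how the hardware runs it; `vertex_program`, `translate_x`, and the sample vertices are hypothetical names chosen for this example, and `mul` only mimics the column-vector form `mul(M, v)`.

```python
def mul(matrix, vec):
    """Row-major 4x4 matrix times a 4-component column vector, in the spirit of mul(M, v)."""
    return tuple(sum(m * x for m, x in zip(row, vec)) for row in matrix)


def vertex_program(position, color, model_view_proj):
    """Runs once per vertex: it sees that vertex's attributes plus uniform parameters.

    Returns the modified vertex (position, color) that would go on to primitive assembly.
    """
    return mul(model_view_proj, position), color


# Hypothetical uniform parameter: a matrix that translates by (2, 0, 0).
translate_x = (
    (1.0, 0.0, 0.0, 2.0),
    (0.0, 1.0, 0.0, 0.0),
    (0.0, 0.0, 1.0, 0.0),
    (0.0, 0.0, 0.0, 1.0),
)

# Hypothetical input stream: (position XYZW, color RGBA) per vertex.
vertices = [
    ((0.0, 0.0, 0.0, 1.0), (1.0, 0.0, 0.0, 1.0)),
    ((1.0, 1.0, 0.0, 1.0), (0.0, 1.0, 0.0, 1.0)),
]

transformed = [vertex_program(p, c, translate_x) for p, c in vertices]
```

Each call is independent of every other vertex, which is what lets the GPU process vertices in parallel.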

Vertex programs are useful
- We can replace the entire geometry transformation portion of the fixed-function pipeline.
- Vertex programs are used to change vertex coordinates (move objects around).
- There are far fewer vertices than fragments: shifting operations to vertex programs improves overall pipeline performance.
- Much of shader processing happens at the vertex level.
- We have access to the original scene geometry.

The fragment pipeline
Input: fragment attributes, interpolated from vertex information.
- Color: RGBA
- Position: XYZW
- Texture coordinates: XY[Z]-, repeated for each coordinate set
Input: texture image.
- Each element of a texture is a 4D vector (XYZW).
- Textures can be "square" or rectangular (power-of-two or not).
- 32 bits = float, 16 bits = half.

The fragment pipeline
Math ops: USE THEM!
- cos(x) / log2(x) / pow(x, y)
- dot(a, b)
- mul(v, M)
- sqrt(x) / rsqrt(x)
- cross(u, v)
Using built-in ops is more efficient than writing your own.
Swizzling/masking: an easy way to move data around.
    v1 = (4,-2,5,3); // Initialize
    v2 = v1.yx;      // v2 = (-2,4)
    s = v1.w;        // s = 3
    v3 = s.rrr;      // v3 = (3,3,3)
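To make the swizzle semantics concrete, the following Python sketch emulates them on the CPU. The `Vec` class is a hypothetical stand-in written for this example, not part of Cg or any library; it assumes the usual `xyzw` / `rgba` component names and wraps the scalar in a one-component `Vec` to imitate `s.rrr`.

```python
class Vec:
    """Tiny stand-in for a Cg float vector that supports xyzw / rgba swizzles."""

    _INDEX = {"x": 0, "y": 1, "z": 2, "w": 3, "r": 0, "g": 1, "b": 2, "a": 3}

    def __init__(self, *components):
        self.c = tuple(float(v) for v in components)

    def __getattr__(self, name):
        # Only called for names that are not real attributes, e.g. "yx" or "rrr".
        if any(ch not in self._INDEX for ch in name):
            raise AttributeError(name)
        try:
            picked = tuple(self.c[self._INDEX[ch]] for ch in name)
        except IndexError:
            raise AttributeError(name)  # e.g. asking a 2-vector for .z
        # A single-letter swizzle yields a scalar, longer ones yield a new vector.
        return picked[0] if len(picked) == 1 else Vec(*picked)

    def __eq__(self, other):
        return isinstance(other, Vec) and self.c == other.c

    def __repr__(self):
        return "Vec" + repr(self.c)


v1 = Vec(4, -2, 5, 3)   # initialize
v2 = v1.yx              # reorder: (-2, 4)
s = v1.w                # scalar: 3
v3 = Vec(s).rrr         # replicate the scalar into three components: (3, 3, 3)
```

On the GPU a swizzle is part of the instruction's operand selection, which is why the slide calls it an easy way to move data around; the dictionary lookups here only model the result.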

The fragment pipeline
    float4 v = tex2D(IMG, float2(x,y))
Texture access is like an array lookup at coordinates (x, y).
The value in v can be used to perform another lookup! This is called a dependent read.
Texture reads (and dependent reads) are expensive, and the limits on them differ between GPUs. Use them wisely!
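A dependent read can be pictured as one array lookup feeding the coordinates of a second. The Python sketch below emulates that on the CPU with nested lists; `INDEX_TEX`, `PALETTE`, and the integer texel coordinates are assumptions made for this example (real `tex2D` takes normalized float coordinates and may filter).

```python
def tex2d(image, x, y):
    """Nearest-texel lookup with integer coordinates; no filtering, no wrapping."""
    return image[y][x]


# Hypothetical 2x2 index texture: the first two channels of each texel hold the
# (x, y) coordinates of a texel in PALETTE.
INDEX_TEX = [
    [(1, 1, 0, 0), (0, 1, 0, 0)],
    [(1, 0, 0, 0), (0, 0, 0, 0)],
]

# Hypothetical 2x2 RGBA palette texture.
PALETTE = [
    [(1.0, 0.0, 0.0, 1.0), (0.0, 1.0, 0.0, 1.0)],
    [(0.0, 0.0, 1.0, 1.0), (1.0, 1.0, 1.0, 1.0)],
]


def fragment_program(x, y):
    v = tex2d(INDEX_TEX, x, y)         # first texture read
    return tex2d(PALETTE, v[0], v[1])  # dependent read: coordinates come from v


corner = fragment_program(0, 0)
```

The second fetch cannot be issued until the first has returned, which is one reason dependent reads are treated as a scarce resource.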

The fragment pipeline
Control flow:
( )?a:b operator.
if-then-else conditional
- [nv3x] Both branches are executed, and the condition code is used to decide which value is written to the output register.
- [nv40] True conditionals
for-loops and do-while
- [nv3x] Limited to what can be unrolled (i.e., no variable loop limits)
- [nv40] True looping.
WARNING: Even though nv40 has true flow control, performance will suffer if there is no coherence, that is, if nearby fragments take different branches.
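The difference between the two if-then-else behaviors described above can be sketched in Python by logging which branch bodies actually run. This is a simplified model, not hardware-accurate; `expensive`, `cheap`, `predicated_select`, and `true_branch` are hypothetical names for this example.

```python
def expensive(x, log):
    log.append("expensive")
    return x * x


def cheap(x, log):
    log.append("cheap")
    return -x


def predicated_select(cond, x, log):
    """nv3x-style model: both branches are evaluated, the condition picks the result."""
    a = expensive(x, log)
    b = cheap(x, log)
    return a if cond else b


def true_branch(cond, x, log):
    """nv40-style model: only the branch that is taken is evaluated."""
    if cond:
        return expensive(x, log)
    return cheap(x, log)


log_predicated, log_branch = [], []
r1 = predicated_select(False, 3.0, log_predicated)
r2 = true_branch(False, 3.0, log_branch)
```

Both calls return the same value, but the predicated version pays for the expensive branch even though its result is discarded.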

The fragment pipeline
What comes after fragment programs?
- Depth/stencil tests happen after the fragment program.
- Blending and aggregation happen as usual.
- Early z-culling: fragments that would have failed the depth test are killed before the fragment program executes.
- Optimization point: avoid work in the fragment program if possible.
[Diagram: Raster Operations -> Frame Buffer]
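The effect of early z-culling can be sketched by counting how often the fragment program runs with and without it. The Python model below is a simplification under stated assumptions: a one-row depth buffer, a less-than depth test, and a fragment program that does not change the fragment's depth. `render` and the sample fragments are hypothetical.

```python
def render(fragments, width, early_z):
    """Process (x, depth, material) fragments; return final colors and shader runs."""
    depth = [1.0] * width   # depth buffer cleared to the far plane
    color = [None] * width
    shaded = 0
    for x, z, material in fragments:
        if early_z and not z < depth[x]:
            continue        # killed before the fragment program runs
        shaded += 1
        out = ("shaded", material)  # stand-in for an expensive fragment program
        if z < depth[x]:    # regular depth test, after the fragment program
            depth[x] = z
            color[x] = out
    return color, shaded


fragments = [(0, 0.2, "near"), (0, 0.8, "far"), (1, 0.5, "mid")]
late_color, late_runs = render(fragments, width=2, early_z=False)
early_color, early_runs = render(fragments, width=2, early_z=True)
```

The image is identical in both cases; only the number of fragment program executions changes, and the saving depends on draw order (here the near fragment arrives first).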

Getting data back I: Readbacks
[Diagram: 3D API (OpenGL or Direct3D), GPU Front End, Vertex Processor, Primitive Assembly, Rasterization and Interpolation, Fragment Processor, Raster Operations, Frame Buffer]
Readbacks transfer data from the frame buffer to the CPU.
- They are very general (any buffer can be transferred).
- Partial buffers can be transferred.
- Drawback: they are slow. Reverse data transfer across the PCI/AGP bus is very slow (PCIe is expected to be a lot better).
- Drawback: data mismatch. Readbacks return image data, but the CPU expects vertex data (or has to load the image into a texture).

Getting data back II: Copy-to-texture
[Diagram: GPU Front End, Vertex Processor, Primitive Assembly, Rasterization and Interpolation, Fragment Processor, Raster Operations, Frame Buffer]
Copy-to-texture transfers data from the frame buffer to a texture.
- The transfer does not cross the GPU-CPU boundary.
- Partial buffers can be transferred.
- Drawback: not very flexible. Depth and stencil buffers cannot be transferred in this way, and copy-to-texture is still somewhat slow.
- Drawback: loss of precision in the copy.

Getting data back III: Render-to-texture
[Diagram: GPU Front End, Vertex Processor, Primitive Assembly, Rasterization and Interpolation, Fragment Processor, Raster Operations]
Render-to-texture renders directly into a texture.
- The transfer does not cross the GPU-CPU boundary.
- Fastest way to transfer data to the fragment processor.
- Drawback: only works with depth and color buffers (not stencil).
Render-to-texture is the best method for reading data back after a computation.

Using Render-to-texture
Using the render-texture extension is tricky.
- You have to set up a pbuffer context, bind an appropriate texture to it, and then render to this context.
- Then you have to change context and read the bound texture.
- You cannot write to a texture and read it simultaneously.
Mark Harris (NVIDIA) has written a RenderTexture class that wraps all of this. The tutorial will have more details on this.
RenderTextures are your friend!
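Because a texture cannot be read and written in the same pass, multi-pass computations are commonly arranged as "ping-pong" rendering: two textures swap the read and write roles after every pass. The slides do not name this technique; the Python sketch below is a CPU-side illustration of the idea only, with lists standing in for textures and a hypothetical `blur_pass` standing in for a fragment program.

```python
def blur_pass(src, dst):
    """One 'render pass': read only from src, write only to dst."""
    n = len(src)
    for i in range(n):
        left = src[max(i - 1, 0)]        # clamp at the edges
        right = src[min(i + 1, n - 1)]
        dst[i] = (left + src[i] + right) / 3.0


def run_passes(initial, passes):
    read_tex = list(initial)             # texture bound for reading
    write_tex = [0.0] * len(initial)     # texture bound as render target
    for _ in range(passes):
        blur_pass(read_tex, write_tex)
        # Swap roles: the result of this pass is the input of the next.
        read_tex, write_tex = write_tex, read_tex
    return read_tex


one_pass = run_passes([0.0, 3.0, 0.0], passes=1)
```

Writing `dst[i]` in place into `src` would let later texels see already-updated neighbors, which is the hazard the read/write restriction avoids.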

Sending data back to vertex program
Solution:
[Pass 1] Render all vertices to be stored in a texture.
[Pass 2] Compute the force field in a fragment program.
[Pass 3] Update the texture containing vertex coordinates in a fragment program, using the force field.
[Pass 4] Retrieve vertex data from the texture. How?
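The four passes can be walked through on the CPU to see how data moves between them. This Python sketch is only a data-flow illustration: the lists stand in for textures, the force (a pull toward the origin) and the time step `dt` are hypothetical, and pass 4 is a plain copy because the slide leaves the retrieval mechanism open at this point.

```python
def pass1_store_vertices(vertices):
    """Pass 1: write each vertex into one texel of a 'position texture'."""
    return [list(v) for v in vertices]


def pass2_force_field(position_tex):
    """Pass 2: compute a force per texel (hypothetical: pull toward the origin)."""
    return [[-c for c in texel] for texel in position_tex]


def pass3_update(position_tex, force_tex, dt):
    """Pass 3: write updated positions into a new texture using the force field."""
    return [
        [p + dt * f for p, f in zip(pos, force)]
        for pos, force in zip(position_tex, force_tex)
    ]


def pass4_retrieve(position_tex):
    """Pass 4: turn texels back into vertex data (mechanism left open here)."""
    return [tuple(texel) for texel in position_tex]


vertices = [(1.0, 2.0, 0.0), (-4.0, 0.0, 2.0)]
positions = pass1_store_vertices(vertices)
forces = pass2_force_field(positions)
updated = pass4_retrieve(pass3_update(positions, forces, dt=0.5))
```

Pass 3 writes into a fresh list rather than updating `positions` in place, mirroring the rule that a texture cannot be read and written at the same time.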

Vertex/Pixel Buffer Objects
V/P buffer objects are ways to transfer data between the framebuffer or vertex arrays and GPU memory.
Conceptually, V/PBOs are like CPU memory, but on the GPU.
- Can use glReadPixels to read into a PBO.
- Can create a vertex array from a VBO.
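The key idea is that the same block of memory can be filled as pixels and then interpreted as vertices. The Python sketch below imitates that with a flat `array` of 32-bit floats; it makes no OpenGL calls, and `read_pixels_into_buffer`, `as_vertex_array`, and the sample frame buffer are hypothetical stand-ins for the real operations.

```python
from array import array

# Hypothetical 2x1 RGBA float frame buffer: each pixel stores one vertex (x, y, z, w).
framebuffer = [[(0.0, 1.0, 2.0, 1.0), (3.0, 4.0, 5.0, 1.0)]]


def read_pixels_into_buffer(fb):
    """Stand-in for reading the frame buffer into a buffer object: pack pixels flat."""
    buf = array("f")  # contiguous 32-bit floats
    for row in fb:
        for pixel in row:
            buf.extend(pixel)
    return buf


def as_vertex_array(buf, components=4):
    """Interpret the same storage as vertices by slicing a view over the buffer."""
    view = memoryview(buf)
    return [
        view[i:i + components].tolist()
        for i in range(0, len(view), components)
    ]


buffer_object = read_pixels_into_buffer(framebuffer)
vertex_array = as_vertex_array(buffer_object)
```

In this model nothing is copied between the "pixel" and "vertex" interpretations; on the GPU the equivalent benefit is that the data never has to cross back to the CPU.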

Solution!
[Diagram: GPU Front End, Programmable Vertex Processor, Primitive Assembly, Rasterization and Interpolation, Programmable Fragment Processor, Raster Operations, with "texture" and "VBO/PBO" shown as the data paths.]

NV40: Vertex programs can read textures
[Diagram: GPU Front End, Programmable Vertex Processor, Primitive Assembly, Rasterization and Interpolation, Programmable Fragment Processor, Raster Operations, with "texture" shown as the data path.]

Summary of memory flow
[Diagrams: Readback (CPU, vertex program, fragment program, frame buffer); Copy-to-Texture (CPU, vertex program, fragment program, frame buffer); Render-to-Texture (CPU, vertex program, fragment program).]

Summary of memory flow
[Diagrams: VBO/PBO transfer (vertex program, fragment program); nv40 texture reference in vertex program (vertex program, fragment program).]