Evolution of the Programmable Graphics Pipeline. Patrick Cozzi, University of Pennsylvania, CIS Spring 2011
Administrivia: Tip: google “cis 565”. Slides are posted before each class. Tentative assignment dates are on the website. The 1st assignment is handed out today: write concisely; it is due at the start of class, one week from today. Google group in progress. FYI: GDC Early Registration - 01/24.
Survey Results: 15/23 – graphics experience. Most students have usable video cards. Lerk – don’t be scared. I want to be a Toys R Us kid too.
Survey Results: Class interests: pure architecture; game rendering; physical simulations; animation; vision algorithms; image/video processing; …
Course Roadmap: Graphics Pipeline (GLSL); GPGPU (GLSL), briefly; GPU Computing (CUDA, OpenCL); Choose your own adventure: student presentation and final project. Goal: prepare you for your presentation and project.
Agenda: Why program the GPU?; Graphics review; Evolution of the programmable graphics pipeline (understand the past).
Why Program the GPU? Graph from:
Why Program the GPU? Compute: Intel Core i7 – 4 cores – 100 GFLOP; NVIDIA GTX280 – 240 cores – 1 TFLOP. Memory bandwidth: system memory – 60 GB/s; NVIDIA GT200 – 150 GB/s. Install base: over 200 million NVIDIA G80s shipped. Numbers from Programming Massively Parallel Processors.
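To put the quoted figures side by side, here is a small Python sketch that works out the ratios. The numbers are the ones on the slide; the variable and function names are only illustrative, and peak figures like these say nothing about what a real program achieves.

```python
# Figures as quoted on the slide (attributed there to
# "Programming Massively Parallel Processors"). Names are illustrative.
CPU = {"name": "Intel Core i7", "cores": 4, "gflops": 100.0}
GPU = {"name": "NVIDIA GTX280", "cores": 240, "gflops": 1000.0}  # 1 TFLOP
SYSTEM_MEMORY_GBPS = 60.0
GT200_MEMORY_GBPS = 150.0


def ratio(gpu_value, cpu_value):
    """How many times larger the GPU figure is than the CPU figure."""
    return gpu_value / cpu_value


compute_ratio = ratio(GPU["gflops"], CPU["gflops"])
bandwidth_ratio = ratio(GT200_MEMORY_GBPS, SYSTEM_MEMORY_GBPS)

# Peak throughput per core: the GPU wins on core count, not per-core speed.
cpu_gflops_per_core = CPU["gflops"] / CPU["cores"]
gpu_gflops_per_core = GPU["gflops"] / GPU["cores"]

print(f"peak compute ratio:   {compute_ratio:.1f}x")
print(f"peak bandwidth ratio: {bandwidth_ratio:.1f}x")
print(f"GFLOP per core: CPU {cpu_gflops_per_core:.2f}, GPU {gpu_gflops_per_core:.2f}")
```

On these numbers each GPU core is slower than each CPU core; the advantage comes from having many of them.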
NVIDIA GPU Evolution Slide from David Luebke:
Graphics Review: Modeling, Rendering, Animation
Graphics Review: Modeling. Polygons vs triangles; how do you store a triangle mesh?; implicit surfaces; height maps; …
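One common answer to "how do you store a triangle mesh?" is an indexed triangle list: a list of unique vertex positions plus a list of indices, three per triangle. The sketch below is a minimal illustration, not necessarily the representation used in the lecture; the quad data and the helper name `triangles` are made up for the example.

```python
# A unit quad in the XY plane stored as an indexed triangle list.
positions = [
    (0.0, 0.0, 0.0),  # vertex 0
    (1.0, 0.0, 0.0),  # vertex 1
    (1.0, 1.0, 0.0),  # vertex 2
    (0.0, 1.0, 0.0),  # vertex 3
]
# Three indices per triangle; vertices 0 and 2 are shared by both triangles.
indices = [0, 1, 2, 0, 2, 3]


def triangles(positions, indices):
    """Yield each triangle as a tuple of three vertex positions."""
    for i in range(0, len(indices), 3):
        a, b, c = indices[i:i + 3]
        yield positions[a], positions[b], positions[c]


tris = list(triangles(positions, indices))

# Without indices, every triangle would store its own three vertices.
unindexed_vertex_count = len(indices)
indexed_vertex_count = len(positions)
print(len(tris), "triangles,", indexed_vertex_count, "stored vertices instead of",
      unindexed_vertex_count)
```

Sharing vertices through indices saves memory and lets the hardware reuse transformed vertices, which matters more as meshes grow.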
Triangles Image courtesy of A K Peters, Ltd.
Triangles Image courtesy of A K Peters, Ltd. Imagery from NASA Visible Earth: visibleearth.nasa.gov.
Implicit Surfaces Images from GPU Gems 3:
Height Maps Image courtesy of A K Peters, Ltd.
Graphics Review: Rendering. Goal: assign color to pixels. Two parts: (1) visible surfaces: what is in front of what for a given view; (2) shading: simulate the interaction of material and light to produce a pixel color.
Rasterization: What about ray tracing?
Visible Surfaces Image courtesy of A K Peters, Ltd.
Visible Surfaces: Z-buffer / depth buffer; fragment vs pixel. Image courtesy of A K Peters, Ltd.
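The depth buffer idea can be sketched in a few lines of Python. Several fragments may land on the same pixel, and only the nearest one survives. This is a simplified software illustration, assuming a "less than" depth test where smaller depth means closer to the viewer; the buffer size, fragment tuples, and function name are hypothetical.

```python
import math

WIDTH, HEIGHT = 4, 4


def render(fragments, width=WIDTH, height=HEIGHT, clear_color="black"):
    """Resolve visibility per pixel with a depth buffer.

    Each fragment is (x, y, depth, color). Assumes smaller depth is closer.
    """
    depth = [[math.inf] * width for _ in range(height)]
    color = [[clear_color] * width for _ in range(height)]
    for x, y, z, c in fragments:
        if z < depth[y][x]:  # depth test: keep the fragment only if it is closer
            depth[y][x] = z
            color[y][x] = c
    return color, depth


# Three fragments compete for pixel (1, 1); one fragment covers pixel (2, 2).
fragments = [
    (1, 1, 0.8, "red"),
    (1, 1, 0.3, "green"),
    (1, 1, 0.5, "blue"),
    (2, 2, 0.9, "red"),
]
color_buffer, depth_buffer = render(fragments)
print(color_buffer[1][1], depth_buffer[1][1])
```

A fragment is a candidate for a pixel; the pixel is whatever remains in the buffer after all fragments have been tested, which is why the submission order of the fragments does not change the result here.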
Shading Images courtesy of A K Peters, Ltd.
Shading Image from GPU Gems 3:
Graphics Pipeline: Vertex Transforms → Primitive Assembly → Rasterization and Interpolation → Raster Operations (Scissor Test, Stencil Test, Depth Test, Blending) → Frame Buffer
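As an illustration of the "Rasterization and Interpolation" stage, the sketch below uses barycentric weights to decide whether a sample point is inside a 2D triangle and to interpolate a per-vertex attribute (here an RGB color). It is one standard formulation, not a description of any particular GPU; the function names and the sample triangle are assumptions made for the example.

```python
def edge(a, b, p):
    """Twice the signed area of triangle (a, b, p)."""
    return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])


def interpolate(tri, attrs, p):
    """Interpolate per-vertex attributes at point p, or return None if p is outside."""
    v0, v1, v2 = tri
    area = edge(v0, v1, v2)
    if area == 0:
        return None  # degenerate triangle covers no samples
    # Barycentric weights: each is the sub-triangle area opposite a vertex.
    w0 = edge(v1, v2, p) / area
    w1 = edge(v2, v0, p) / area
    w2 = edge(v0, v1, p) / area
    if w0 < 0 or w1 < 0 or w2 < 0:
        return None  # sample is not covered by the triangle
    return tuple(w0 * a0 + w1 * a1 + w2 * a2 for a0, a1, a2 in zip(*attrs))


triangle = [(0.0, 0.0), (4.0, 0.0), (0.0, 4.0)]
colors = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]  # red, green, blue

inside = interpolate(triangle, colors, (1.0, 1.0))
outside = interpolate(triangle, colors, (5.0, 5.0))
print(inside, outside)
```

A rasterizer would run this test for every sample in the triangle's bounding box, producing one fragment per covered sample.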
Graphics Pipeline Images courtesy of A K Peters, Ltd.
Graphics Review: Animation. Move the camera and/or agents, and re-render the scene in less than 16.6 ms (60 fps).
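The 16.6 ms figure is the per-frame time budget at 60 frames per second. A short sketch of that arithmetic follows; the helper names and the sample frame times are hypothetical.

```python
def frame_budget_ms(fps):
    """Time available to produce one frame, in milliseconds."""
    return 1000.0 / fps


def meets_budget(frame_times_ms, fps=60):
    """True if every measured frame time fits within the budget for `fps`."""
    budget = frame_budget_ms(fps)
    return all(t <= budget for t in frame_times_ms)


print(f"60 fps budget: {frame_budget_ms(60):.2f} ms")
print(meets_budget([12.0, 15.5, 16.0]))   # all frames fit at 60 fps
print(meets_budget([12.0, 18.0]))         # one frame is too slow for 60 fps
print(meets_budget([18.0], fps=30))       # but it fits a 30 fps budget
```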
Evolution of the Programmable Graphics Pipeline: Pre-GPU; fixed-function GPU; programmable GPU; unified shader processors
Early 90s – Pre GPU Slide from Mike Houston:
Why GPUs? Exploit parallelism: pipeline-parallel; data-parallel; CPU and GPU executing in parallel. Hardware: texture filtering, MAD (multiply-add), etc.
Generation I: 3dfx Voodoo (1996). Did not do vertex transformations: these were done on the CPU. Did do texture mapping and z-buffering. Diagram (stages split between CPU and GPU, connected by PCI): Vertex Transforms, Primitive Assembly, Rasterization and Interpolation, Raster Operations, Frame Buffer. Image from “7 years of Graphics”. Slide adapted from Suresh Venkatasubramanian and Joe Kider.
Aside: Mario Kart 64. High fragment load / low vertex load. Image from:
Aside: Mario Kart Wii. High fragment load / low vertex load? Image from:
Generation II: GeForce/Radeon 7500 (1999). Main innovation: shifting the transformation and lighting calculations to the GPU. Allowed multi-texturing: bump maps, light maps, and others. Faster AGP bus instead of PCI. Diagram (GPU, fed over AGP): Vertex Transforms, Primitive Assembly, Rasterization and Interpolation, Raster Operations, Frame Buffer. Image from “7 years of Graphics”. Slide from Suresh Venkatasubramanian and Joe Kider.
Generation III: GeForce3/Radeon 8500 (2001). For the first time, allowed a limited amount of programmability in the vertex pipeline. Also allowed volume texturing and multi-sampling (for antialiasing). Diagram (GPU, fed over AGP): Vertex Transforms (small vertex shaders), Primitive Assembly, Rasterization and Interpolation, Raster Operations, Frame Buffer. Image from “7 years of Graphics”. Slide from Suresh Venkatasubramanian and Joe Kider.
Generation IV: Radeon 9700/GeForce FX (2002). This is the first generation of fully programmable graphics cards. Different versions have different resource limits on fragment/vertex programs. Diagram (fed over AGP): Vertex Transforms (programmable vertex shader), Primitive Assembly, Rasterization and Interpolation, Programmable Fragment Processor (with Texture Memory), Raster Operations. Image from “7 years of Graphics”. Slide from Suresh Venkatasubramanian and Joe Kider.
Generation IV.V: GeForce6/X800 (2004). Simultaneous rendering to multiple buffers; true conditionals and loops; PCIe bus; vertex texture fetch. Diagram (fed over PCIe): Vertex Transforms (programmable vertex shader), Primitive Assembly, Rasterization and Interpolation, Programmable Fragment Processor (with Texture Memory), Raster Operations, Frame Buffer. Slide adapted from Suresh Venkatasubramanian and Joe Kider.
NVIDIA NV40 Architecture: 6 vertex shader units; 16 fragment shader units; vertex texture fetch. Image from GPU Gems 2:
Generation V: GeForce8800/HD2900 (2006). Ground-up GPU redesign; support for Direct3D 10 / OpenGL 3; geometry shaders; stream out / transform feedback; unified shader processors; support for general GPU programming. Diagram stages (fed over PCIe): Input Assembler, Programmable Vertex Shader, Programmable Geometry Shader, Raster Operations, Programmable Pixel (Fragment) Shader, Output Merger. Slide adapted from Suresh Venkatasubramanian and Joe Kider.
D3D 10 Pipeline Image from David Blythe:
Geometry Shaders: Point Sprites
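A geometry shader can take a single input point and emit a small quad around it, which is the essence of point sprites. The Python sketch below mimics that expansion on the CPU in 2D, purely as an illustration; the function name, the triangle-strip vertex order, and the texture-coordinate layout are assumptions for the example, not shader code from the lecture.

```python
def expand_point_sprite(center, size):
    """Expand one point into a 4-vertex triangle strip (position, texcoord).

    Strip order: lower-left, lower-right, upper-left, upper-right.
    """
    cx, cy = center
    h = size / 2.0
    return [
        ((cx - h, cy - h), (0.0, 0.0)),
        ((cx + h, cy - h), (1.0, 0.0)),
        ((cx - h, cy + h), (0.0, 1.0)),
        ((cx + h, cy + h), (1.0, 1.0)),
    ]


# One input point in, four output vertices (two triangles) out.
points = [(0.0, 0.0), (10.0, 5.0)]
sprites = [expand_point_sprite(p, 2.0) for p in points]
for sprite in sprites:
    print(sprite)
```

Doing the expansion on the GPU means the application only uploads one vertex per sprite instead of four.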
Geometry Shaders Image from David Blythe:
NVIDIA G80 Architecture Slide from David Luebke:
Why Unify Shader Processors? Slide from David Luebke:
Unified Shader Processors Slide from David Luebke:
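A common argument for unified shader processors is load balancing: with separate vertex and fragment units, whichever stage has less work sits idle while the other is the bottleneck. The sketch below is a deliberately idealized model (work divides perfectly, no scheduling overhead) and not a measurement of real hardware. The 6 + 16 unit split echoes the NV40 counts from the earlier slide; the workload numbers are invented.

```python
def fixed_split_time(vertex_work, fragment_work, vertex_units, fragment_units):
    """Time when each unit type can only run its own kind of work."""
    return max(vertex_work / vertex_units, fragment_work / fragment_units)


def unified_time(vertex_work, fragment_work, total_units):
    """Time when any unit can run either kind of work (idealized)."""
    return (vertex_work + fragment_work) / total_units


VERTEX_UNITS, FRAGMENT_UNITS = 6, 16
TOTAL_UNITS = VERTEX_UNITS + FRAGMENT_UNITS

# Hypothetical workloads in arbitrary work units.
vertex_heavy = (1200.0, 120.0)    # lots of geometry, few pixels
fragment_heavy = (120.0, 1200.0)  # little geometry, lots of pixels

for name, (vw, fw) in [("vertex-heavy", vertex_heavy),
                       ("fragment-heavy", fragment_heavy)]:
    fixed = fixed_split_time(vw, fw, VERTEX_UNITS, FRAGMENT_UNITS)
    unified = unified_time(vw, fw, TOTAL_UNITS)
    print(f"{name}: fixed split {fixed:.1f}, unified {unified:.1f}")
```

In this model the unified design is never slower than the fixed split, and the gap is largest when the workload does not match the fixed ratio of units.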
Terminology
Shader Model | Direct3D | OpenGL | Video card example
2 | 9 | 2.x | NVIDIA GeForce 6800, ATI Radeon X
 | x | 3.x | NVIDIA GeForce 8800, ATI Radeon HD
 | x | 4.x | NVIDIA GeForce GTX 480, ATI Radeon HD 5870
Shader Capabilities Table courtesy of A K Peters, Ltd.
Evolution of the Programmable Graphics Pipeline Slide from Mike Houston:
Evolution of the Programmable Graphics Pipeline: Not covered today: SM 5 / D3D 11 / GL 4; tessellation shaders (*cough* student presentation *cough*). Later this semester: NVIDIA Fermi: dual warp scheduler; configurable L1 / shared memory; double precision; …
New Tool: AMD System Monitor. Released 01/04/2011.