David Luebke 1 1/25/2016 Programmable Graphics Hardware.

Presentation transcript:

David Luebke 1 1/25/2016 Programmable Graphics Hardware

David Luebke 2 1/25/2016 Admin ● Handout: Cg in two pages

David Luebke 3 1/25/2016 Acknowledgement & Aside ● The bulk of this lecture comes from slides from Bill Mark’s SIGGRAPH 2002 course talk on NVIDIA’s programmable graphics technology ● For this reason, and because the lab is outfitted with GeForce 3 cards, we will focus on NVIDIA tech

David Luebke 4 1/25/2016 Outline ● Programmable graphics ■ NVIDIA’s next-generation technology: GeForceFX (code name NV30) ● Programming programmable graphics ■ NVIDIA’s Cg language

David Luebke 5 1/25/2016 GPU Programming Model [Diagram: Application (CPU) -> Vertex Processor -> Assembly & Rasterization -> Fragment Processor -> Framebuffer Operations -> Framebuffer (all on the GPU); Textures feed the Fragment Processor]

David Luebke 6 1/25/2016 32-bit IEEE floating-point throughout pipeline ● Framebuffer ● Textures ● Fragment processor ● Vertex processor ● Interpolants

David Luebke 7 1/25/2016 Hardware supports several other data types ● Fragment processor also supports: ■ 16-bit “half” floating point ■ 12-bit fixed point ■ These may be faster than 32-bit on some HW ● Framebuffer/textures also support: ■ Large variety of fixed-point formats ■ E.g., classical 8-bit per component ■ These formats use less memory bandwidth than FP32
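To make the precision trade-off between these data types concrete, here is a minimal Python sketch that pushes one value through each representation and records the rounding error. The FP32 and FP16 round trips use the standard `struct` module. The 12-bit fixed-point layout (signed, 10 fractional bits, range [-2, 2)) is an assumption about the hardware format, and `quantize_fx12` is a hypothetical helper named only for this illustration.

```python
import struct


def round_trip_fp32(x: float) -> float:
    """Store x as a 32-bit IEEE float and read it back."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def round_trip_fp16(x: float) -> float:
    """Store x as a 16-bit IEEE "half" float and read it back."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def quantize_fx12(x: float) -> float:
    """Quantize x to 12-bit fixed point.

    Assumed layout (not stated on the slide): signed, 10 fractional bits,
    representable range [-2, 2), values outside the range are clamped.
    """
    steps = round(x * 1024)
    steps = max(-2048, min(2047, steps))
    return steps / 1024


value = 0.1
errors = {
    "fp32": abs(round_trip_fp32(value) - value),
    "fp16": abs(round_trip_fp16(value) - value),
    "fx12": abs(quantize_fx12(value) - value),
}
```

For this input the error grows as the format shrinks, which is the cost paid for the potential speed and bandwidth savings listed on the slide.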

David Luebke 8 1/25/2016 Vertex processor capabilities ● 4-vector FP32 operations, as in GeForce3/4 ● True data-dependent control flow ■ Conditional branch instruction ■ Subroutine calls, up to 4 deep ■ Jump table (for switch statements) ● Condition codes ● New arithmetic instructions (e.g. COS) ● User clip-plane support

David Luebke 9 1/25/2016 Vertex processor has high resource limits ● 256 instructions per program (effectively much higher w/branching) ● 16 temporary 4-vector registers ● 256 “uniform” parameter registers ● 2 address registers (4-vector) ● 6 clip-distance outputs

David Luebke 10 1/25/2016 Fragment processor has clean instruction set ● General and orthogonal instructions ● Much better than previous generation ● Same syntax as vertex processor: MUL R0, R1.xyz, R2.yxw; ● Full set of arithmetic instructions: RCP, RSQ, COS, EXP, …
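The swizzle suffixes in `MUL R0, R1.xyz, R2.yxw;` reorder register components before the component-wise multiply. The Python sketch below models only that reordering and the multiply; `swizzle` and `mul` are hypothetical helper names, and how the hardware treats the fourth component or write masks is not modeled.

```python
# Map component letters to indices in a 4-vector register.
COMPONENTS = {"x": 0, "y": 1, "z": 2, "w": 3}


def swizzle(register, pattern):
    """Return the register components picked out by a pattern such as "yxw"."""
    return tuple(register[COMPONENTS[letter]] for letter in pattern)


def mul(a, b):
    """Component-wise multiply of two equally long vectors."""
    return tuple(p * q for p, q in zip(a, b))


r1 = (1.0, 2.0, 3.0, 4.0)
r2 = (10.0, 20.0, 30.0, 40.0)

# Rough equivalent of: MUL R0, R1.xyz, R2.yxw;
r0 = mul(swizzle(r1, "xyz"), swizzle(r2, "yxw"))
```

Here `R1.xyz` is (1, 2, 3) and `R2.yxw` is (20, 10, 40), so each output component is the product of the matching pair.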

David Luebke 11 1/25/2016 Fragment processor has flexible texture mapping ● Texture reads are just another instruction (TEX, TXP, or TXD) ● Allows computed texture coordinates, nested to arbitrary depth ● Allows multiple uses of a single texture unit ● Optional LOD control – specify filter extent ● Think of it as… A memory-read instruction, with optional user-controlled filtering
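A dependent texture read, where the result of one lookup becomes the coordinates of the next, can be sketched in Python by treating a texture as a plain 2D array. This is an illustration only: `tex2d_nearest` is a hypothetical stand-in for a texture instruction, it uses nearest-neighbor sampling with no filtering or LOD control, and the two tiny textures are made up for the example.

```python
def tex2d_nearest(texture, u, v):
    """Read a texel at normalized coordinates (u, v) using nearest sampling."""
    height = len(texture)
    width = len(texture[0])
    x = max(0, min(int(u * width), width - 1))
    y = max(0, min(int(v * height), height - 1))
    return texture[y][x]


# First texture stores coordinates, second texture stores colors.
offset_map = [
    [(0.75, 0.25), (0.25, 0.75)],
    [(0.25, 0.25), (0.75, 0.75)],
]
color_map = [
    ["red", "green"],
    ["blue", "white"],
]

# The first read is an ordinary lookup; its result is used as the
# coordinates of the second read (the "computed texture coordinates").
u2, v2 = tex2d_nearest(offset_map, 0.1, 0.1)
result = tex2d_nearest(color_map, u2, v2)
```

Because a texture read is just another instruction, such lookups can be chained further by feeding `result`-style values into more reads.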

David Luebke 12 1/25/2016 Additional fragment processor capabilities ● Read access to window-space position ● Read/write access to fragment Z ● Built-in derivative instructions ■ Partial derivatives w.r.t. screen-space x or y ■ Useful for anti-aliasing ● Conditional fragment-kill instruction ● FP32, FP16, and fixed-point data
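One way to picture the derivative instructions is as finite differences between neighboring pixels. The Python sketch below assumes differencing inside a 2x2 pixel quad, which is an assumption about the mechanism rather than something the slide states; `ddx`, `ddy`, `filter_width`, and `texcoord_u` are names chosen for this illustration.

```python
def ddx(f, x, y):
    """Approximate df/dx by differencing the two columns of the 2x2 quad."""
    x0 = x - (x % 2)  # left column of the quad containing pixel x
    return f(x0 + 1, y) - f(x0, y)


def ddy(f, x, y):
    """Approximate df/dy by differencing the two rows of the 2x2 quad."""
    y0 = y - (y % 2)  # top row of the quad containing pixel y
    return f(x, y0 + 1) - f(x, y0)


def filter_width(f, x, y):
    """Estimate how fast f changes per pixel, e.g. to size an anti-aliasing filter."""
    return max(abs(ddx(f, x, y)), abs(ddy(f, x, y)))


def texcoord_u(x, y):
    """Hypothetical interpolated quantity that varies linearly across the screen."""
    return 0.25 * x + 0.5 * y


width = filter_width(texcoord_u, 4, 7)
```

A shader can use such an estimate to widen its filter where the quantity changes quickly from pixel to pixel, which is the anti-aliasing use mentioned on the slide.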

David Luebke 13 1/25/2016 Fragment processor limitations ● No branching ■ But, can do a lot with condition codes ● No indexed reads from registers ■ Use texture reads instead ● No memory writes
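The usual way to live without branches is to evaluate both alternatives and then select one result based on a condition. The Python sketch below shows that pattern with an arithmetic mask; it illustrates the idea only and does not model the actual condition-code instructions. `select` and `shade` are hypothetical names, and the lighting constants are made up.

```python
def select(condition, if_true, if_false):
    """Pick one of two already-computed values without branching on them."""
    mask = float(bool(condition))  # 1.0 when the condition holds, else 0.0
    return mask * if_true + (1.0 - mask) * if_false


def shade(n_dot_l):
    """Toy diffuse term: lit surfaces get ambient plus diffuse, others ambient only."""
    lit = 0.2 + 0.8 * n_dot_l  # computed for every fragment
    unlit = 0.2                # also computed for every fragment
    return select(n_dot_l > 0.0, lit, unlit)
```

Both sides are always computed, so the cost is the sum of the two paths. That is the price of having no branch instruction.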

David Luebke 14 1/25/2016 Fragment processor has high resource limits ● 1024 instructions ● 512 constants or uniform parameters ■ Each constant counts as one instruction ● 16 texture units ■ Reuse as many times as desired ● 8 FP32 x 4 perspective-correct inputs ● 128-bit framebuffer “color” output (use as 4 x FP32, 8 x FP16, etc…)
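The 128-bit output can be read as a fixed 16-byte budget that is split either into four FP32 values or eight FP16 values. The Python sketch below checks that arithmetic with the standard `struct` module; the little-endian byte order and the helper names `pack_4xfp32` and `pack_8xfp16` are assumptions made for the illustration.

```python
import struct


def pack_4xfp32(values):
    """Pack four floats as 4 x FP32 (16 bytes, i.e. 128 bits)."""
    if len(values) != 4:
        raise ValueError("expected exactly 4 values")
    return struct.pack("<4f", *values)


def pack_8xfp16(values):
    """Pack eight floats as 8 x FP16 (also 16 bytes, i.e. 128 bits)."""
    if len(values) != 8:
        raise ValueError("expected exactly 8 values")
    return struct.pack("<8e", *values)


wide = pack_4xfp32([1.0, 0.5, 0.25, 1.0])
narrow = pack_8xfp16([1.0, 0.5, 0.25, 1.0, 2.0, 4.0, 8.0, 16.0])
```

Both layouts fill the same 16 bytes, so a program can trade precision per value for the number of values it writes per fragment.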

David Luebke 15 1/25/2016 NV30 CineFX Technology Summary ● FP32 throughout pipeline ● Clean instruction sets ● True branching in vertex processor ● Dependent texture in fragment processor ● High resource limits [Diagram: Application -> Vertex Processor -> Assembly & Rasterization -> Fragment Processor -> Framebuffer Operations -> Framebuffer; Textures feed the Fragment Processor]

David Luebke 16 1/25/2016 Programming in assembly is painful
Assembly: … FRC R2.y, C11.w; ADD R3.x, C11.w, -R2.y; MOV H4.y, R2.y; ADD H4.x, -H4.y, C4.w; MUL R3.xy, R3.xyww, C11.xyww; ADD R3.xy, R3.xyww, C11.z; TEX H5, R3, TEX2, 2D; ADD R3.x, R3.x, C11.x; TEX H6, R3, TEX2, 2D; …
Cg: … L2weight = timeval - floor(timeval); L1weight = 1.0 - L2weight; ocoord1 = floor(timeval)/ /128.0; ocoord2 = ocoord /64.0; L1offset = f2tex2D(tex2, float2(ocoord1, 1.0/128.0)); L2offset = f2tex2D(tex2, float2(ocoord2, 1.0/128.0)); …
● Easier to read and modify ● Cross-platform ● Combine pieces ● etc.
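The first two high-level statements on this slide compute a pair of blend weights from the fractional part of `timeval`, which is what the `FRC` instruction at the start of the assembly listing produces. A minimal Python rendering of just those two statements follows. The `ocoord` statements are incomplete in this transcript and are deliberately left out, and `blend_weights` is a name chosen for the illustration.

```python
import math


def blend_weights(timeval):
    """Return (L1weight, L2weight) as computed on the slide."""
    l2_weight = timeval - math.floor(timeval)  # fractional part of timeval
    l1_weight = 1.0 - l2_weight                # the two weights sum to 1
    return l1_weight, l2_weight


l1, l2 = blend_weights(2.25)
```

For `timeval = 2.25` the weights are 0.75 and 0.25, a cross-fade that is one quarter of the way from the first sample to the second.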

David Luebke 17 1/25/2016 Quick Demo

David Luebke 18 1/25/2016 Cg – C for Graphics ● Cg is a GPU programming language ● Designed by NVIDIA and Microsoft ● Compilers available in beta versions from both companies

David Luebke 19 1/25/2016 Design goals for Cg ● Enable algorithms to be expressed… ■ Clearly, and ■ Efficiently ● Provide interface continuity ■ Focus on DX9-generation HW and beyond ■ But provide support for DX8-class HW too ■ Support both OpenGL and Direct3D ● Allow easy, incremental adoption

David Luebke 20 1/25/2016 Easy adoption for applications ● Avoid owning the application’s data ■ No scene graph ■ No buffering of vertex data ● Compiler sits on top of existing APIs ■ User can examine assembly-code output ■ Can compile either at run time or at application-development time ● Allow partial adoption ■ E.g., use a Cg vertex program with an assembly fragment program ● Support current hardware

David Luebke 21 1/25/2016 Some points in the design space ● CPU languages ■ C – close to the hardware; general purpose ■ C++, Java, lisp – require memory management ■ RenderMan – specialized for shading ● Real-time shading languages ■ Stanford shading language ■ Creative Labs shading language

David Luebke 22 1/25/2016 Design strategy ● Start with C (and a bit of C++) ■ Minimizes number of decisions ■ Gives you known mistakes instead of unknown ones ● Allow subsetting of the language ● Add features desired for GPUs ■ To support GPU programming model ■ To enable high performance ● Tweak to make it fit together well

David Luebke 23 1/25/2016 How are current GPUs different from CPUs? 1. The GPU is a stream processor ■ Multiple programmable processing units ■ Connected by data flows [Diagram: Application -> Vertex Processor -> Assembly & Rasterization -> Fragment Processor -> Framebuffer Operations -> Framebuffer; Textures feed the Fragment Processor]
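The stream-processor view, with separate processing units connected by data flows, can be sketched in Python with generators: each stage consumes the stream produced by the previous one. This is a toy model under loud assumptions. The stage functions are hypothetical, "rasterization" emits one fragment per point rather than filling triangles, and nothing here runs in parallel as real hardware would.

```python
def vertex_stage(vertices, offset):
    """Vertex processor: transform each incoming vertex independently."""
    for x, y in vertices:
        yield (x + offset[0], y + offset[1])


def rasterize(points):
    """Assembly & rasterization (toy): snap each point to one pixel fragment."""
    for x, y in points:
        yield (int(round(x)), int(round(y)))


def fragment_stage(fragments, color):
    """Fragment processor: compute a color for each fragment independently."""
    for position in fragments:
        yield position, color


def framebuffer_ops(shaded, width, height):
    """Framebuffer operations: keep fragments that land inside the framebuffer."""
    framebuffer = {}
    for (x, y), color in shaded:
        if 0 <= x < width and 0 <= y < height:
            framebuffer[(x, y)] = color
    return framebuffer


# The application feeds vertices in; data then flows stage to stage.
vertices = [(0.2, 0.2), (1.6, 0.4), (5.0, 5.0)]
stream = fragment_stage(rasterize(vertex_stage(vertices, (1.0, 1.0))), "white")
image = framebuffer_ops(stream, 4, 4)
```

No stage reads or writes shared state belonging to another stage. Each one only sees the items flowing past it, which is what lets the hardware run many such units side by side.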