Status – Week 281 Victor Moya

Objectives
- Research on future GPUs for 3D graphics.
- Simulate current and future 3D graphics hardware.
- Finish (someday) the PhD ;).

Problems
- Information.
- Choice of the simulation target:
  - Current GPUs.
  - Near-future GPUs.
  - Absolutely new GPU designs.
- The future is hard to predict.
- But GPUs change very fast.
- Fierce competition between ATI and NVidia. Matrox and 3DLabs follow (3DLabs can rule the workstation market). SIS and VIA as OEM.

Status
- Designing a hardware 3D graphics pipeline:
  - Command processors.
  - Vertex Shader.
  - Divide by w, Clip, Culling and Triangle Setup.
  - Rasterization.
  - Pixel shaders.
  - Antialiasing.
- Designing the simulator.

3D Graphics Pipeline

Geometry
- Vertex operations:
  - (1) Transform coordinates and normal:
    - Model => World.
    - World => Eye.
  - (2) Normalize the length of the normal.
  - (3) Compute vertex lighting.
  - (4) Transform texture coordinates.
  - (5) Transform coordinates to clip coordinates (projection).
  - (8) Divide coordinates by w.
  - (9) Apply affine viewport transform (x, y, z).
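The coordinate steps above (1, 5, 8 and 9) can be sketched as follows. This is a minimal illustration only: the function names, the row-major matrix layout, the `(x0, y0, width, height)` viewport tuple and the [0, 1] depth range are assumptions made for the example, not something the slides specify. Normal transformation, lighting and texture coordinates (steps 2 to 4) are left out.

```python
from typing import List, Sequence, Tuple

Vec4 = List[float]
Mat4 = List[List[float]]  # row-major 4x4 matrix (assumed layout)


def mat_vec(m: Mat4, v: Sequence[float]) -> Vec4:
    """Multiply a 4x4 matrix by a 4-component column vector."""
    return [sum(m[r][c] * v[c] for c in range(4)) for r in range(4)]


def transform_vertex(position: Sequence[float],
                     model_to_world: Mat4,
                     world_to_eye: Mat4,
                     projection: Mat4,
                     viewport: Tuple[float, float, float, float]):
    """Return (clip coordinates, window coordinates) for one vertex."""
    world = mat_vec(model_to_world, position)    # (1) Model => World
    eye = mat_vec(world_to_eye, world)           # (1) World => Eye
    clip = mat_vec(projection, eye)              # (5) projection to clip space
    w = clip[3]
    ndc = [clip[0] / w, clip[1] / w, clip[2] / w]  # (8) divide by w
    x0, y0, width, height = viewport
    window = (                                   # (9) affine viewport transform
        x0 + (ndc[0] + 1.0) * 0.5 * width,
        y0 + (ndc[1] + 1.0) * 0.5 * height,
        (ndc[2] + 1.0) * 0.5,                    # depth mapped to [0, 1] (assumed)
    )
    return clip, window


IDENTITY: Mat4 = [[1.0 if r == c else 0.0 for c in range(4)] for r in range(4)]
```

In a real pipeline the divide by w happens after clipping, which is why the slide numbers it (8) rather than (6).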

Geometry
- Primitive operations:
  - (6) Primitive assembly.
  - (7) Clipping.
  - (10) Backface cull: eliminate back-facing triangles.
- Primitive generation: new pipeline stage (ATI TruForm).
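One common way to implement the backface cull in step (10) is to test the sign of the triangle's area in window coordinates. The sketch below assumes counter-clockwise front faces in a y-up coordinate system; the function names and the `front_is_ccw` flag are hypothetical, and the slides do not say which test the simulated hardware uses.

```python
def signed_area2(a, b, c):
    """Twice the signed area of triangle (a, b, c); positive if counter-clockwise."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (c[0] - a[0]) * (b[1] - a[1])


def is_back_facing(a, b, c, front_is_ccw=True):
    """True if the triangle should be culled (zero-area triangles are culled too)."""
    area = signed_area2(a, b, c)
    return area <= 0 if front_is_ccw else area >= 0
```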

Vertex Shader
- VS 1.0, 1.1 and 1.2 (current technology) for Direct3D 8 and 8.1.
- OpenGL extensions: ARB_vertex_program (finally in OpenGL v1.4), NV_vertex_program1_1 (NVidia), EXT_vertex_shader (ATI).
- No branching.
- Single-cycle execution latency (?).
- A single instruction issued each cycle.
- Simple in-order pipeline (?).

Vertex Shader
- 16 input registers (read only).
- 15 output registers (write only).
- 12 temporary registers (read/write).
- 96 constant registers (read only or read/write?).
- 256 instructions max.

Vertex Shader

Opcode  Inputs               Output (vector or    Operation
        (scalar or vector)   replicated scalar)
ARL     s                    address register     address register load
MOV     v                    v                    move
MUL     v,v                  v                    multiply
ADD     v,v                  v                    add
MAD     v,v,v                v                    multiply and add
RCP     s                    ssss                 reciprocal
RSQ     s                    ssss                 reciprocal square root
DP3     v,v                  ssss                 3-component dot product
DP4     v,v                  ssss                 4-component dot product
DST     v,v                  v                    distance vector
MIN     v,v                  v                    minimum
MAX     v,v                  v                    maximum
SLT     v,v                  v                    set on less than
SGE     v,v                  v                    set on greater than or equal
EXP     s                    v                    exponential base 2
LOG     s                    v                    logarithm base 2
LIT     v                    v                    light coefficients
DPH     v,v                  ssss                 homogeneous dot product
RCC     s                    ssss                 reciprocal clamped
SUB     v,v                  v                    subtract
ABS     v                    v                    absolute value
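To make the "vector" versus "replicated scalar" output column concrete, here is a rough software model of a handful of these opcodes operating on 4-component registers. It is an illustrative sketch, not the behaviour of any particular GPU: registers are plain Python lists, special cases such as division by zero or negative RSQ inputs are ignored, and the lower-case function names are made up for the example.

```python
import math


def mov(a):
    return list(a)


def mul(a, b):
    return [x * y for x, y in zip(a, b)]


def add(a, b):
    return [x + y for x, y in zip(a, b)]


def mad(a, b, c):
    """Multiply and add: a * b + c, component-wise."""
    return [x * y + z for x, y, z in zip(a, b, c)]


def dp3(a, b):
    """3-component dot product, result replicated to all four components (ssss)."""
    s = sum(a[i] * b[i] for i in range(3))
    return [s] * 4


def dp4(a, b):
    """4-component dot product, result replicated to all four components (ssss)."""
    s = sum(a[i] * b[i] for i in range(4))
    return [s] * 4


def rcp(s):
    """Reciprocal of a scalar, replicated (assumes s != 0)."""
    return [1.0 / s] * 4


def rsq(s):
    """Reciprocal square root of a scalar, replicated (assumes s > 0)."""
    return [1.0 / math.sqrt(s)] * 4


def slt(a, b):
    """Set on less than: 1.0 where a < b, otherwise 0.0."""
    return [1.0 if x < y else 0.0 for x, y in zip(a, b)]


def sge(a, b):
    """Set on greater than or equal: 1.0 where a >= b, otherwise 0.0."""
    return [1.0 if x >= y else 0.0 for x, y in zip(a, b)]


def transform_position(rows, position):
    """Four DP4 instructions, one per matrix row, as a position transform."""
    return [dp4(row, position)[0] for row in rows]
```

Because there is no branching in this instruction set, SLT and SGE are the usual way to express conditionals: their 0.0/1.0 results are multiplied into the values being selected.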

Clipping
- Clip geometry primitives against the view frustum (6 planes).
- Clip geometry primitives against the user clip planes.
- Techniques used:
  - Guard-Band Clipping.
  - Homogeneous rasterization avoids clipping in the geometry stage.

Guard-Band Clipping
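The idea behind guard-band clipping can be sketched as a three-way triangle classification in clip space: triangles entirely outside one viewport edge are rejected, triangles that stay inside an enlarged guard band are sent to the rasterizer unclipped (the scissor discards the off-screen fragments), and only triangles that cross the guard band are actually clipped. The sketch below is an assumption-laden illustration: it handles only the x and y planes, assumes w > 0 for every vertex (so near-plane clipping is not covered), and the guard-band factor of 8 is an arbitrary placeholder.

```python
def classify_triangle(verts, guard=8.0):
    """Classify a clip-space triangle as 'reject', 'rasterize' or 'clip'.

    verts: three (x, y, z, w) tuples with w > 0 (assumed).
    guard: hypothetical guard-band size as a multiple of the viewport extent.
    """
    # Trivial reject: all three vertices lie outside the same viewport plane.
    for axis in (0, 1):
        if all(v[axis] > v[3] for v in verts):
            return "reject"
        if all(v[axis] < -v[3] for v in verts):
            return "reject"

    # Inside the guard band: no geometric clipping needed.
    inside_guard = all(
        abs(v[0]) <= guard * v[3] and abs(v[1]) <= guard * v[3] for v in verts
    )
    return "rasterize" if inside_guard else "clip"
```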

Homogeneous coordinates
- “Triangle Scan Conversion using 2D Homogeneous Coordinates”, Olano and Greer.
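As a rough illustration of the approach in that paper, the sketch below builds the three edge functions directly from the vertices' homogeneous (x, y, w) coordinates by inverting a 3x3 matrix, so no divide by w and no geometric clipping are needed before rasterization. It is a simplified reading, not the paper's full algorithm: the function names are hypothetical, and scissoring, fill rules and incremental evaluation are omitted.

```python
def inverse3(m):
    """Inverse of a 3x3 matrix given as three rows."""
    a, b, c = m[0]
    d, e, f = m[1]
    g, h, i = m[2]
    det = a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)
    if det == 0:
        raise ValueError("degenerate triangle (zero determinant)")
    return [
        [(e * i - f * h) / det, (c * h - b * i) / det, (b * f - c * e) / det],
        [(f * g - d * i) / det, (a * i - c * g) / det, (c * d - a * f) / det],
        [(d * h - e * g) / det, (b * g - a * h) / det, (a * e - b * d) / det],
    ]


def edge_functions(verts):
    """verts: three (x, y, w) tuples. Returns three (a, b, c) coefficient triples.

    Edge function k is 1 at vertex k and 0 at the other two, so its
    coefficients are column k of the inverse of the vertex matrix.
    """
    inv = inverse3([list(v) for v in verts])
    return [(inv[0][k], inv[1][k], inv[2][k]) for k in range(3)]


def evaluate(edges, px, py):
    """Evaluate the three edge functions at pixel position (px, py)."""
    return [a * px + b * py + c for a, b, c in edges]


def covers(edges, px, py):
    """A pixel is inside the triangle when all three edge functions are positive."""
    return all(e > 0 for e in evaluate(edges, px, py))


def interpolate(edges, px, py, values):
    """Perspective-correct interpolation of one per-vertex attribute."""
    e = evaluate(edges, px, py)
    return sum(v * ek for v, ek in zip(values, e)) / sum(e)
```

The sum of the three edge functions is the interpolated 1/w, which is why dividing by it yields perspective-correct attribute values.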

Rasterization
- Setup (per-triangle).
- Sampling (triangle = {fragments}).
- Interpolation (interpolate colors and coordinates).

Rasterization
- Converts primitives to fragments.
- Primitive: point, line, polygon, …
- Fragment: transient data structure:
    short x, y;
    long depth;
    short r, g, b, a;
- Fragment selection.
- Parameter assignment (color, depth...).
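The setup, sampling and interpolation phases can be sketched with a simple bounding-box rasterizer that emits one `Fragment` per covered pixel centre. This is a sketch under several assumptions: the `Fragment` fields mirror the structure on the slide but use Python ints and floats rather than short/long, interpolation is linear in screen space (no perspective correction), and there is no fill rule, so pixels exactly on a shared edge would be produced by both neighbouring triangles.

```python
import math
from dataclasses import dataclass


@dataclass
class Fragment:
    x: int
    y: int
    depth: float
    r: float
    g: float
    b: float
    a: float


def edge(a, b, p):
    """Signed edge function: twice the signed area of (a, b, p)."""
    return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])


def rasterize(v0, v1, v2):
    """Each vertex is (x, y, depth, (r, g, b, a)). Returns a list of Fragments."""
    # Setup (per triangle): total area and bounding box.
    area = edge(v0, v1, v2)
    if area == 0:
        return []
    xs = (v0[0], v1[0], v2[0])
    ys = (v0[1], v1[1], v2[1])
    min_x, max_x = math.floor(min(xs)), math.ceil(max(xs))
    min_y, max_y = math.floor(min(ys)), math.ceil(max(ys))

    fragments = []
    for y in range(min_y, max_y + 1):
        for x in range(min_x, max_x + 1):
            # Sampling: test the pixel centre against the three edges.
            p = (x + 0.5, y + 0.5)
            w0 = edge(v1, v2, p) / area
            w1 = edge(v2, v0, p) / area
            w2 = edge(v0, v1, p) / area
            if w0 < 0 or w1 < 0 or w2 < 0:
                continue
            # Interpolation: barycentric blend of depth and color.
            depth = w0 * v0[2] + w1 * v1[2] + w2 * v2[2]
            color = [w0 * v0[3][i] + w1 * v1[3][i] + w2 * v2[3][i] for i in range(4)]
            fragments.append(Fragment(x, y, depth, *color))
    return fragments
```

Dividing by the signed area normalizes the weights, so the same test works for both clockwise and counter-clockwise triangles.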

Programmable Pipeline

Vertex Program

NV_vertex_program2
- ARL (new support for four-component A0 and A1 instead of just A0.x)
- ARR (similar to ARL, but rounds instead of truncating before storing the integer result in an address register)
- BRA, CAL, RET (branching instructions)
- COS, SIN (high-precision trigonometric functions)
- FLR, FRC (floor and fraction of floating-point values)
- EX2, LG2 (high-precision exponentiation and logarithm functions)
- ARA (adds pairs of components of an address register; useful for looping and other operations)
- SEQ, SFL, SGT, SLE, SNE, STR (“set on” instructions similar to SLT, SGE)
- SSG (“set sign” operation; generates a vector holding –1.0 for negative operand components, 0 for zero-value components, and +1.0 for positive components)
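A few of these component-wise instructions are easy to model in software. The sketch below covers FLR, FRC, SSG and two of the "set on" comparisons as described in the list above; the lower-case function names are invented for the example, and special cases such as NaN or infinity are not handled.

```python
import math


def flr(v):
    """FLR: component-wise floor."""
    return [float(math.floor(c)) for c in v]


def frc(v):
    """FRC: component-wise fraction, c - floor(c), always in [0, 1)."""
    return [c - math.floor(c) for c in v]


def ssg(v):
    """SSG: -1.0 for negative components, 0.0 for zero, +1.0 for positive."""
    return [float((c > 0) - (c < 0)) for c in v]


def seq(a, b):
    """SEQ: 1.0 where a == b, otherwise 0.0."""
    return [1.0 if x == y else 0.0 for x, y in zip(a, b)]


def sgt(a, b):
    """SGT: 1.0 where a > b, otherwise 0.0."""
    return [1.0 if x > y else 0.0 for x, y in zip(a, b)]
```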

NV_vertex_program2 Overview
1. Condition codes
2. Branching & subroutines
3. Even faster performance
4. Nineteen new instructions
5. New source modifiers
6. Clip plane support
7. More registers & instructions

NV_vertex_program2 Resource Limits
- 256 vertex program parameters
  - Up from 96
- 16 temporary registers
  - Up from 12
- Two 4-component address registers
  - Up from one single-component address register
- 256 static instructions per program
  - Up from 128
- Given branching, the number of dynamic instructions that can execute before termination is limited, to avoid infinite loops
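The difference between the static and dynamic instruction limits can be illustrated with a toy interpreter. Everything here is hypothetical except the 256 static-instruction limit quoted above: the two-opcode program format, the `run` function and the `max_dynamic` parameter are invented for the example, and the actual hardware cap is not stated in this transcript.

```python
MAX_STATIC_INSTRUCTIONS = 256  # static limit quoted on the slide


def run(program, max_dynamic):
    """Execute a toy program of ("ADD", n) and ("BRA", target) instructions.

    Returns (accumulator, executed_count, hit_dynamic_cap).
    max_dynamic is a hypothetical cap on executed instructions.
    """
    if len(program) > MAX_STATIC_INSTRUCTIONS:
        raise ValueError("program exceeds the static instruction limit")
    acc, pc, executed = 0, 0, 0
    while pc < len(program):
        if executed >= max_dynamic:
            return acc, executed, True   # forced termination
        op, arg = program[pc]
        executed += 1
        if op == "ADD":
            acc += arg
            pc += 1
        elif op == "BRA":
            pc = arg                     # unconditional branch
        else:
            raise ValueError("unknown opcode: " + op)
    return acc, executed, False
```

A two-instruction program that branches back to itself is well within the static limit yet never ends on its own, which is the case the dynamic cap exists for.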

NV_vertex_program2 Source Modifiers
- Source operand absolute value
  - Example: MOV R0, |R1|;
- In addition to source negation & swizzling
  - Example: MAD R0, -|R1|.yzwy, |R2|, -R3.w;
- Swizzle, negate, & absolute value operations are “free” source modifiers
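The three source modifiers can be modelled as a small function applied to each operand before the instruction itself runs. This is an illustrative sketch: the function name and keyword arguments are made up, and a single-letter swizzle such as `w` is treated as that component replicated four times.

```python
COMPONENT_INDEX = {"x": 0, "y": 1, "z": 2, "w": 3}


def apply_source_modifiers(reg, swizzle="xyzw", absolute=False, negate=False):
    """Return the operand value seen by an instruction after its source modifiers."""
    if len(swizzle) == 1:
        swizzle = swizzle * 4            # scalar swizzle replicates one component
    out = [reg[COMPONENT_INDEX[c]] for c in swizzle]
    if absolute:
        out = [abs(c) for c in out]
    if negate:
        out = [-c for c in out]
    return out


def mad_example(r1, r2, r3):
    """Model of: MAD R0, -|R1|.yzwy, |R2|, -R3.w;"""
    a = apply_source_modifiers(r1, "yzwy", absolute=True, negate=True)
    b = apply_source_modifiers(r2, absolute=True)
    c = apply_source_modifiers(r3, "w", negate=True)
    return [x * y + z for x, y, z in zip(a, b, c)]
```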

NV_vertex_program2 Condition Codes (1)
- Condition code state
  - 4-component register stores condition code values
  - Four possible values:
    - LT – less than zero
    - EQ – equal to zero
    - GT – greater than zero
    - UN – unordered, for comparisons involving NaN
- Most instructions optionally update condition code state
  - Indicated with “C” suffix: DP4C, MOVC, etc.
  - “CC” pseudo-register used to just update condition codes

NV_vertex_program2 Condition Codes (2)
- Optional condition-code-based destination masking
  - Example: MOV R1.xy(NE.z), R0;
  - Copy R0's components to R1’s X & Y components except when the condition code’s Z component is EQ
- Condition code rules: EQ, equal; GE, greater or equal; GT, greater than; LE, less or equal; LT, less than; NE, not equal; FL, false; and TR, true
- Note that the condition code masking rule can swizzle condition code components
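Condition code update and condition-code-based write masking can be modelled as below. The sketch follows the two slides above: each component of a result maps to LT, EQ, GT or UN, and a masked write only touches a destination component when the rule passes for the selected condition code component. Treating UN as passing the NE rule follows the slide's wording ("except when ... is EQ") but is otherwise an assumption, as are the function and table names.

```python
import math

COMPONENT_INDEX = {"x": 0, "y": 1, "z": 2, "w": 3}

# Which condition code values satisfy each masking rule.
RULES = {
    "EQ": lambda cc: cc == "EQ",
    "GE": lambda cc: cc in ("GT", "EQ"),
    "GT": lambda cc: cc == "GT",
    "LE": lambda cc: cc in ("LT", "EQ"),
    "LT": lambda cc: cc == "LT",
    "NE": lambda cc: cc != "EQ",   # assumption: UN counts as "not equal"
    "FL": lambda cc: False,
    "TR": lambda cc: True,
}


def condition_code(value):
    """Condition code produced by one result component."""
    if math.isnan(value):
        return "UN"
    if value < 0:
        return "LT"
    if value > 0:
        return "GT"
    return "EQ"


def update_cc(result):
    """Model of a "C"-suffixed instruction updating the 4-component CC register."""
    return [condition_code(c) for c in result]


def masked_mov(dest, src, write_mask, rule, cc_reg, cc_swizzle="xyzw"):
    """Model of: MOV dest.<write_mask>(<rule>.<cc_swizzle>), src;"""
    if len(cc_swizzle) == 1:
        cc_swizzle = cc_swizzle * 4
    out = list(dest)
    for i, comp in enumerate("xyzw"):
        cc = cc_reg[COMPONENT_INDEX[cc_swizzle[i]]]
        if comp in write_mask and RULES[rule](cc):
            out[i] = src[i]
    return out
```

For the slide's example, `masked_mov(r1, r0, "xy", "NE", cc, "z")` copies R0's X and Y into R1 unless the Z condition code is EQ.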

ATI R300. Vertex Shader.

3DLabs P10. Pipeline.

Matrox Parhelia. Pipeline.