Z-Buffer Optimizations Patrick Cozzi Analytical Graphics, Inc.

Overview
- Z-Buffer Review
- Hardware: Early-Z
- Software: Front-to-Back Sorting
- Hardware: Double-Speed Z-Only
- Software: Early-Z Pass
- Software: Deferred Shading
- Hardware: Buffer Compression
- Hardware: Fast Clear
- Hardware: Z-Cull
- Future: Programmable Culling Unit

Z-Buffer Review
- Also called the depth buffer
- Fragment vs. pixel
- Alternatives: painter's algorithm, ray casting, etc.

Z-Buffer History
- "Brute-force approach"
- "Ridiculously expensive"
- Sutherland, Sproull, and Schumacker, "A Characterization of Ten Hidden-Surface Algorithms", 1974

Z-Buffer Quiz
- 10 triangles cover a pixel. Rendering these in random order with a Z-buffer, what is the average number of times the pixel's z-value is written?
- See Subtle Tools Slides: erich.realtimerendering.com

Z-Buffer Quiz
- 1st triangle writes depth
- 2nd triangle has 1/2 chance of writing depth
- 3rd triangle has 1/3 chance of writing depth
- 1 + 1/2 + 1/3 + … + 1/10 = …
- See Subtle Tools Slides: erich.realtimerendering.com
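
The sum in the quiz is the 10th harmonic number. A minimal Python sketch, assuming a "less than" depth test and distinct depths, evaluates it exactly and cross-checks it against random draw orders. The function names are illustrative and do not come from the slides.

```python
from fractions import Fraction
import random

def expected_depth_writes(n):
    """Exact expected z-writes for n triangles in random order: 1 + 1/2 + ... + 1/n."""
    return sum(Fraction(1, k) for k in range(1, n + 1))

def simulate_depth_writes(n, trials=20000, seed=0):
    """Average z-writes over random orderings of n fragments on one pixel."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        nearest = float("inf")
        for _ in range(n):
            z = rng.random()
            if z < nearest:      # depth test passes, so depth is written
                nearest = z
                total += 1
    return total / trials

print(float(expected_depth_writes(10)))   # exact value is 7381/2520, roughly 2.93
print(simulate_depth_writes(10))          # should land close to the exact value
```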

Z-Buffer Quiz
- Harmonic series

# Triangles | # Depth Writes
12,367      | 10

- See Subtle Tools Slides: erich.realtimerendering.com
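
The table's point is that expected depth writes grow only logarithmically with the number of overlapping triangles. A small sketch, with a hypothetical helper name, finds how many randomly ordered triangles are needed before the expected number of writes reaches a target:

```python
def triangles_for_depth_writes(target):
    """Smallest n whose harmonic number 1 + 1/2 + ... + 1/n reaches `target`."""
    total, n = 0.0, 0
    while total < target:
        n += 1
        total += 1.0 / n
    return n

for writes in (2, 3, 10):
    print(writes, triangles_for_depth_writes(writes))
```

For a target of 10 this agrees with the 12,367 triangles listed in the table.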

Z-Test in the Pipeline
- When is the Z-Test?
- Fragment Shader -> Z-Test, or Z-Test -> Fragment Shader

Early-Z
- Avoid expensive fragment shaders
- Reduce bandwidth to frame buffer
  - Writes, not reads
- Pipeline: Z-Test -> Fragment Shader

Early-Z
- Automatically enabled on GeForce (8?) unless
  - Fragment shader discards or writes depth
  - Depth writes and alpha-test are enabled
- Fine-grained, as opposed to Z-Cull
- ATI: "Top of the Pipe Z Reject"
- See NVIDIA GPU Programming Guide for exact details
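
To make the saving concrete, here is a rough single-pixel model of late versus early depth testing. It assumes opaque fragments, a "less than" depth test, and a shader that neither discards nor writes depth. `shade_counts` is an illustrative name, not a hardware or API term.

```python
def shade_counts(fragment_depths):
    """Return (late_z_shader_runs, early_z_shader_runs) for one pixel.

    fragment_depths lists the fragments hitting the pixel in draw order,
    with smaller values closer to the viewer.
    """
    late = len(fragment_depths)   # late Z: every fragment is shaded, then tested
    early = 0
    nearest = float("inf")
    for z in fragment_depths:
        if z < nearest:           # early Z: test first, shade only survivors
            nearest = z
            early += 1
    return late, early

print(shade_counts([0.9, 0.5, 0.7, 0.2]))   # back-to-front-ish order
print(shade_counts([0.2, 0.5, 0.7, 0.9]))   # front-to-back order
```

The second call shows why draw order matters: when the nearest fragment arrives first, Early-Z rejects every later fragment before shading.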

Front-to-Back Sorting
- Utilize Early-Z for opaque objects
- Old hardware still has fewer z-buffer writes
- CPU overhead: need efficient sorting
  - Bucket sort
  - Octree
- Conflicts with state sorting
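
A bucket sort keeps the CPU cost linear because Early-Z only needs an approximate front-to-back order. The sketch below assumes each opaque object is a `(name, view_depth)` pair with depths roughly inside `[near, far)`. That layout and the bucket count are assumptions made for illustration.

```python
def bucket_sort_front_to_back(objects, near, far, num_buckets=16):
    """Approximately order (name, view_depth) pairs front to back in O(n)."""
    buckets = [[] for _ in range(num_buckets)]
    scale = num_buckets / (far - near)
    for obj in objects:
        _, depth = obj
        index = int((depth - near) * scale)
        index = min(max(index, 0), num_buckets - 1)   # clamp out-of-range depths
        buckets[index].append(obj)
    # Objects within a bucket keep submission order; exact order is not required.
    return [obj for bucket in buckets for obj in bucket]

scene = [("far_hill", 90.0), ("player", 5.0), ("tree", 40.0)]
print(bucket_sort_front_to_back(scene, near=0.0, far=100.0))
```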

Double-Speed Z-Only
- GeForce FX and later render at double speed when writing only depth or stencil
- Enabled when
  - Color writes are disabled
  - Fragment shader does not discard or write depth
  - Alpha-test is disabled
- See NVIDIA GPU Programming Guide for exact details

Early-Z Pass
- Software technique to utilize Early-Z and Double-Speed Z-Only
- Two passes
  - Render depth only ("lay down depth"): Double-Speed Z-Only
  - Render with full shaders: Early-Z (and Z-Cull)
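
The two passes can be modeled in software as follows. This is a sketch under stated assumptions: fragments are `(x, y, depth, color)` tuples, the first pass uses a "less than" test, and the second pass uses a "less or equal" test with depth writes off, which is a common configuration but not one the slides spell out.

```python
def render_with_depth_prepass(fragments, width, height):
    """Return (color_buffer, shader_runs) for fragments in arbitrary draw order."""
    depth = [[float("inf")] * width for _ in range(height)]
    color = [[None] * width for _ in range(height)]

    # Pass 1: lay down depth only; no shading happens here.
    for x, y, z, _ in fragments:
        if z < depth[y][x]:
            depth[y][x] = z

    # Pass 2: full shading; only fragments matching the stored depth survive.
    shader_runs = 0
    for x, y, z, c in fragments:
        if z <= depth[y][x]:
            color[y][x] = c      # stand-in for an expensive fragment shader
            shader_runs += 1
    return color, shader_runs

frags = [(0, 0, 0.8, "red"), (0, 0, 0.3, "green"), (0, 0, 0.5, "blue"), (1, 0, 0.4, "white")]
print(render_with_depth_prepass(frags, width=2, height=1))
```

Each covered pixel is shaded once regardless of draw order, at the cost of submitting the geometry twice.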

Deferred Shading
- Similar to Early-Z Pass
  - 1st pass: visibility tests
  - 2nd pass: shading
- Different from Early-Z Pass
  - Geometry is only transformed once

Deferred Shading
- 1st pass: render geometry into G-Buffers
  - Fragment colors, normals, depth, edge weight
- Images from Tabula Rasa. See Resources.

Deferred Shading
- 2nd pass
  - Shading == post-processing effects
  - Render full-screen quads that read from G-Buffers
  - Objects are no longer needed

Deferred Shading
- Light accumulation result
- Image from Tabula Rasa. See Resources.

Deferred Shading
- Eliminates shading fragments that fail Z-Test
- Increases video memory requirement
- How does it affect bandwidth?
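
A toy version of the two deferred passes is sketched below. It assumes fragments are `(x, y, depth, albedo, normal)` tuples with a scalar albedo and a single directional light using Lambert shading. The G-buffer layout is hypothetical and much simpler than the Tabula Rasa one.

```python
def geometry_pass(fragments, width, height):
    """1st pass: keep only the nearest fragment's attributes per pixel."""
    gbuffer = [[None] * width for _ in range(height)]
    for x, y, z, albedo, normal in fragments:
        stored = gbuffer[y][x]
        if stored is None or z < stored["depth"]:
            gbuffer[y][x] = {"depth": z, "albedo": albedo, "normal": normal}
    return gbuffer

def shading_pass(gbuffer, light_dir):
    """2nd pass: one lighting evaluation per covered pixel; geometry is not needed."""
    image = []
    for row in gbuffer:
        out = []
        for texel in row:
            if texel is None:
                out.append(0.0)                      # background
                continue
            n_dot_l = sum(n * l for n, l in zip(texel["normal"], light_dir))
            out.append(texel["albedo"] * max(n_dot_l, 0.0))
        image.append(out)
    return image

frags = [(0, 0, 0.9, 1.0, (0, 0, 1)), (0, 0, 0.2, 0.5, (0, 0, 1))]
print(shading_pass(geometry_pass(frags, 2, 1), light_dir=(0, 0, 1)))
```

The hidden fragment at depth 0.9 is never shaded, while the G-buffer stores several attributes per pixel, which is where the extra video memory goes.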

Buffer Compression
- Reduce depth buffer bandwidth
- Generally does not reduce memory usage of actual depth buffer
- Same architecture applies to other buffers, e.g. color and stencil

Buffer Compression
- Tile Table: status for an n x n tile of depths, e.g. n = 8
  - [state, z_min, z_max]
  - state is either compressed, uncompressed, or cleared
  - Example: [uncompressed, 0.1, 0.8]

Buffer Compression
- Diagram: Rasterizer, Tile Table, Compress, Decompress, Compressed Z-Buffer
- Diagram labels: updated z-values; updated z_max; n x n uncompressed z values; [z_min, z_max]

Buffer Compression
- Depth buffer write
  - Rasterizer modifies copy of uncompressed tile
  - Tile is losslessly compressed (if possible) and sent to actual depth buffer
  - Update Tile Table
    - z_min and z_max
    - status: compressed or uncompressed

Buffer Compression
- Depth buffer read
  - Tile status
    - Uncompressed: send tile
    - Compressed: decompress and send tile
    - Cleared: see Fast Clear
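
The read and write paths can be sketched with a single tile-table entry. The compression scheme here is a deliberately trivial stand-in (a tile whose depths are all equal is stored as one value), chosen only to make the state machine visible. It is not the scheme any particular GPU uses, and the class layout is hypothetical.

```python
class DepthTile:
    """One tile-table entry [state, z_min, z_max] plus its backing storage."""

    def __init__(self, n=8, clear_value=1.0):
        self.n = n
        self.clear_value = clear_value
        self.state = "cleared"
        self.z_min = self.z_max = clear_value
        self.payload = None

    def write(self, depths):
        """Depth buffer write: compress losslessly if possible, update the table."""
        self.z_min, self.z_max = min(depths), max(depths)
        if self.z_min == self.z_max:             # stand-in scheme: constant tile
            self.state, self.payload = "compressed", depths[0]
        else:
            self.state, self.payload = "uncompressed", list(depths)

    def read(self):
        """Depth buffer read: dispatch on the tile status."""
        if self.state == "cleared":
            return [self.clear_value] * (self.n * self.n)
        if self.state == "compressed":
            return [self.payload] * (self.n * self.n)   # decompress
        return list(self.payload)

tile = DepthTile(n=2)
tile.write([0.1, 0.8, 0.3, 0.4])
print(tile.state, tile.z_min, tile.z_max, tile.read())
```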

Fast Clear
- Don't touch depth buffer
- glClear sets state of each tile to cleared
- When the rasterizer reads a cleared tile
  - A tile filled with GL_DEPTH_CLEAR_VALUE is sent
  - Depth buffer is not accessed
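
The cost argument can be illustrated with a small model that counts accesses to depth memory. Everything here (class name, tile size, the `memory_reads` counter) is an assumption made for illustration. The model only mirrors the behavior listed on the slide.

```python
class FastClearDepthBuffer:
    def __init__(self, num_tiles, tile_size=64, clear_value=1.0):
        self.clear_value = clear_value
        self.tile_size = tile_size
        self.states = ["cleared"] * num_tiles
        self.memory = [[clear_value] * tile_size for _ in range(num_tiles)]
        self.memory_reads = 0

    def clear(self):
        # Only the small tile table is rewritten; depth memory is left stale.
        self.states = ["cleared"] * len(self.states)

    def write_tile(self, index, depths):
        self.memory[index] = list(depths)
        self.states[index] = "uncompressed"

    def read_tile(self, index):
        if self.states[index] == "cleared":
            # Synthesize a tile of the clear value without touching memory.
            return [self.clear_value] * self.tile_size
        self.memory_reads += 1
        return list(self.memory[index])

buf = FastClearDepthBuffer(num_tiles=2, tile_size=4)
buf.write_tile(0, [0.2] * 4)
buf.clear()
print(buf.read_tile(0), buf.memory_reads)
```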

Fast Clear
- Use glClear
  - Not full-screen quads
  - No "one frame positive, one frame negative" trick
- Clear stencil together with depth

Z-Cull
- Cull blocks of fragments before shading
- Coarse-grained, as opposed to Early-Z
- Pipeline: Z-Cull -> Fragment Shader
- Cull when z_triangle_min > tile's z_max

Z-Cull
- Z_max-culling
  - Rasterizer fetches z_max for each tile it processes
  - Compute z_triangle_min for a triangle
  - Culled if z_triangle_min > z_max
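
The z_max test reduces to one comparison per tile. A minimal sketch, assuming a "less than" depth test and using the nearest vertex depth as a conservative z_triangle_min (the slides do not say how the bound is computed):

```python
def zmax_cull(triangle_vertex_depths, tile_z_max):
    """True if the whole triangle lies behind everything stored in the tile."""
    z_triangle_min = min(triangle_vertex_depths)   # conservative lower bound
    return z_triangle_min > tile_z_max

print(zmax_cull([0.90, 0.95, 0.85], tile_z_max=0.8))   # fully hidden
print(zmax_cull([0.50, 0.95, 0.85], tile_z_max=0.8))   # may be visible
```

A culled tile skips both the per-pixel depth reads and the fragment shader for every fragment of the triangle inside it.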

Z-Cull
- Z_min-culling
  - Support different depth tests
  - Avoid depth buffer reads
  - If triangle is in front of tile, depth tests for each pixel are unnecessary
  - Accept when z_triangle_max < tile's z_min
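
The z_min case is the mirror image: a triangle entirely in front of a tile passes trivially, so the per-pixel depth reads can be skipped. The sketch assumes a "less than" depth test, the farthest vertex depth as a conservative z_triangle_max, and a hypothetical 64 pixels per tile.

```python
def zmin_accept(triangle_vertex_depths, tile_z_min):
    """True if the whole triangle is in front of everything stored in the tile."""
    z_triangle_max = max(triangle_vertex_depths)   # conservative upper bound
    return z_triangle_max < tile_z_min

def depth_reads_needed(triangle_vertex_depths, tile_z_mins, pixels_per_tile=64):
    """Count per-pixel depth reads over the tiles a triangle fully covers."""
    reads = 0
    for tile_z_min in tile_z_mins:
        if not zmin_accept(triangle_vertex_depths, tile_z_min):
            reads += pixels_per_tile
    return reads

print(depth_reads_needed([0.10, 0.20, 0.15], tile_z_mins=[0.5, 0.12]))
```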

Z-Cull
- Automatically enabled on GeForce (6?) cards unless
  - glClear isn't used
  - Fragment shader writes depth (or discards?)
  - Direction of depth test is changed
- ATI recommends avoiding = and != depth compares and stencil fail and stencil depth fail operations
- Less efficient when depth varies a lot within a few pixels
- See NVIDIA GPU Programming Guide for exact details

Programmable Culling Unit
- Cull before fragment shader even if the shader writes depth or discards
- Run part of shader over an entire tile to determine lower bound z value
- Hasselgren and Akenine-Möller, "PCU: The Programmable Culling Unit," 2007

Summary
- What was once "ridiculously expensive" is now the primary visible surface algorithm for rasterization

Resources Sections and 18.3

Resources
- developer.nvidia.com/object/gpu_programming_guide.html
- GeForce 8 Guide: sections 3.4.9, 3.6, and 4.8
- GeForce 7 Guide: section 3.6

Resources
- ATI Radeon HyperZ Technology, Steve Morein

Resources
- Performance Optimization Techniques for ATI Graphics Hardware with DirectX® 9.0, Guennadi Riguer
- Sections 6.5 and 8

Resources
- developer.nvidia.com/object/gpu_gems_home.html
- Chapter 28: Graphics Pipeline Performance

Resources
- developer.nvidia.com/object/gpu-gems-3.html
- Chapter 19: Deferred Shading in Tabula Rasa