Improving k-buffer methods via Occupancy Maps Andreas A. Vasilakis and Georgios Papaioannou Dept. of Informatics, Athens University of Economics & Business,

Presentation transcript:

Improving k-buffer methods via Occupancy Maps Andreas A. Vasilakis and Georgios Papaioannou Dept. of Informatics, Athens University of Economics & Business, Greece

Multi-fragment Visibility Determination [Problem]: generation of more than one out-of-order fragment per pixel. [Goal]: 1. peel() 2. sort() 3. resolve() [ray casting]

Screen-space Applications (1) Photorealistic Rendering [global illumination][transparency]

Screen-space Applications (2) Visualization & Processing [flow] [molecular] [solid] [hair]

Multi-fragment Rendering Solutions (1) A-buffer: Store [all] fragments, then Sort them

Multi-fragment Rendering Solutions (2) A-buffer: Store [all] fragments, then Sort them. Limitations: 1. Memory: a. wasteful allocation, b. potential overflow. 2. Performance: a. local memory cache overflow & latency issues.
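The store-then-sort idea can be illustrated with a small CPU-side sketch. This is not the authors' shader code: the function names `build_a_buffer` and `resolve_pixel`, the grayscale fragments, and the front-to-back "over" compositing in the resolve step are all assumptions chosen for illustration.

```python
from collections import defaultdict


def build_a_buffer(fragments):
    """Store every incoming fragment per pixel, in arrival order."""
    a_buffer = defaultdict(list)
    for pixel, depth, color, alpha in fragments:
        a_buffer[pixel].append((depth, color, alpha))
    return a_buffer


def resolve_pixel(pixel_fragments):
    """Sort one pixel's fragments by depth, then composite front to back."""
    out = 0.0
    transmittance = 1.0
    for depth, color, alpha in sorted(pixel_fragments):
        out += transmittance * alpha * color
        transmittance *= 1.0 - alpha
    return out


# Three out-of-order fragments landing on the same pixel (hypothetical data).
fragments = [
    ((0, 0), 0.7, 1.0, 0.5),
    ((0, 0), 0.2, 0.0, 0.5),
    ((0, 0), 0.4, 0.5, 0.5),
]
a_buffer = build_a_buffer(fragments)
result = resolve_pixel(a_buffer[(0, 0)])
```

The per-pixel list grows with every fragment, which is where the memory limitations listed on the slide come from.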

Multi-fragment Rendering Solutions (3) k-buffer: Store & Sort [k-closest] fragments k=4

Multi-fragment Rendering Solutions (4) k-buffer: Store & Sort [k-closest] fragments. Limitations: 1. [Bavoil07, Bavoil08]: a. RMW hazards, b. geometry pre-sorting, c. upper-bounded k. 2. [Liu10, Maule13]: a. extra geometry pass, b. depth precision conversion. 3. [Salvi13]: a. extreme fragment congestion, b. modern hardware. k=4
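A sequential sketch of the store-and-sort k-buffer update follows. It assumes a single thread per pixel, so it deliberately ignores the read-modify-write hazards named in the limitations; `k_buffer_insert` and `build_k_buffer` are hypothetical names, not taken from any of the cited methods.

```python
import bisect


def k_buffer_insert(k_buffer, depth, k):
    """Insert a depth into a sorted list, keeping only the k nearest."""
    bisect.insort(k_buffer, depth)
    if len(k_buffer) > k:
        k_buffer.pop()  # drop the farthest entry


def build_k_buffer(depths, k):
    """Feed fragments in arrival order and return the k closest, sorted."""
    k_buffer = []
    for depth in depths:
        k_buffer_insert(k_buffer, depth, k)
    return k_buffer


closest = build_k_buffer([0.9, 0.1, 0.5, 0.3, 0.7, 0.2], k=4)
```

On a GPU many fragments of the same pixel execute this update concurrently, which is why the listed methods need pre-sorting, atomic operations, or pixel synchronization.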

Multi-fragment Rendering Solutions (5) k-buffer: Store [all] fragments then Select & Sort [k-closest] ones k=4

Multi-fragment Rendering Solutions (6) k-buffer: Store [all] fragments then Select & Sort [k-closest] ones. Limitations: 1. [Salvi11, Yu12]: a. A-buffer construction. k=4

Multi-fragment Rendering Solutions (7) k+-buffer [Vasilakis14, Vasilakis15]: Store [k-closest] fragments, then Sort them. Fragment Culling Mechanism: Concurrently discards an incoming fragment that is farther than all currently maintained fragments (using the max element). k=4

Multi-fragment Rendering Solutions (8) k+-buffer [Vasilakis14, Vasilakis15]: Store [k-closest] fragments, then Sort them. Fragment Culling Mechanism: Concurrently discards an incoming fragment that is farther than all currently maintained fragments (using the max element). Limitations: 1. Depends on the fragment arrival order. 2. Requires the k-buffer to be initially filled. 3. Fragment elimination is performed inside the pixel shader (not hardware-accelerated). k=4
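The max-element test and its dependence on arrival order can be seen in a short sequential sketch. It assumes a max-heap as the per-pixel container (built here with `heapq` on negated depths); the function name `k_plus_buffer` and the culled-fragment counter are illustrative additions, not the published implementation.

```python
import heapq


def k_plus_buffer(depths, k):
    """Keep the k nearest depths; count fragments rejected by the max test."""
    heap = []  # max-heap emulated by storing negated depths
    culled = 0
    for depth in depths:
        if len(heap) < k:
            heapq.heappush(heap, -depth)  # buffer not yet filled: always store
        elif depth >= -heap[0]:
            culled += 1  # farther than the current max: discard
        else:
            heapq.heapreplace(heap, -depth)  # evict the max, store the new one
    return sorted(-d for d in heap), culled


near_to_far = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6]
kept_a, culled_a = k_plus_buffer(near_to_far, k=2)
kept_b, culled_b = k_plus_buffer(list(reversed(near_to_far)), k=2)
```

Both orders keep the same two fragments, but the near-to-far stream culls four fragments while the far-to-near stream culls none, so every fragment in the second case pays for a buffer update.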

Novel Fragment Culling (1) Ideal k-buffer solution: Find the [k-th fragment], then Cull the [farther] ones. k=4 Fragment Culling Mechanism: 1. Perform an extra geometry pass to compute the k-th fragment. 2. Construct the k-buffer with depth testing enabled. Free from all previous limitations!
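As a rough illustration of the ideal two-pass scheme, the sketch below assumes the exact k-th depth is somehow available after the first pass. The helpers `kth_depth` and `depth_test_pass` are hypothetical, and the first pass is simulated with `heapq.nsmallest` rather than anything a GPU would actually run.

```python
import heapq


def kth_depth(depths, k):
    """Pass 1 (idealized): depth of the k-th nearest fragment of a pixel."""
    nearest = heapq.nsmallest(k, depths)
    return nearest[-1]  # farthest of the available fragments if fewer than k


def depth_test_pass(depths, threshold):
    """Pass 2: a depth test against the threshold rejects farther fragments."""
    return [d for d in depths if d <= threshold]


depths = [0.9, 0.1, 0.5, 0.3, 0.7, 0.2]
threshold = kth_depth(depths, k=4)
survivors = depth_test_pass(depths, threshold)
```

Only the k nearest fragments survive, regardless of arrival order, and the rejection is a plain depth comparison that fixed-function depth testing can perform.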

Novel Fragment Culling (2) Approximate Solution: Exploit [fragment occupancy maps]. Fragment Culling Mechanism: Perform early-z culling with the k_a-th fragment, the nearest fragment at or beyond the actual k-th (k_a ≥ k). [Figure labels: Convex Hull, Occupancy bitmap, Bounding Box]

Novel Fragment Culling (3) Approximate Solution: Exploit [fragment occupancy maps]. Fragment Culling Mechanism: Perform early-z culling with the k_a-th fragment, the nearest fragment at or beyond the actual k-th (k_a ≥ k). Algorithm: 1. The depth range is divided into B uniform consecutive subintervals (buckets). [Figure labels: Convex Hull, Bounding Box]

Novel Fragment Culling (4) Approximate Solution: Exploit [fragment occupancy maps]. Fragment Culling Mechanism: Perform early-z culling with the k_a-th fragment, the nearest fragment at or beyond the actual k-th (k_a ≥ k). Algorithm: 1. The depth range is divided into B uniform consecutive subintervals (buckets). 2. The occupancy bitmap indicates fragment presence in each bucket. [Figure label: Occupancy bitmap]

Novel Fragment Culling (5) Approximate Solution: Exploit [fragment occupancy maps]. Fragment Culling Mechanism: Perform early-z culling with the k_a-th fragment, the nearest fragment at or beyond the actual k-th (k_a ≥ k). Algorithm: 1. The depth range is divided into B uniform consecutive subintervals (buckets). 2. The occupancy bitmap indicates fragment presence in each bucket. 3. Accumulate 1s until the count reaches k → () time. [Figure label: Occupancy bitmap]

Novel Fragment Culling (6) Approximate Solution: Exploit [fragment occupancy maps]. Fragment Culling Mechanism: Perform early-z culling with the k_a-th fragment, the nearest fragment at or beyond the actual k-th (k_a ≥ k). Algorithm: 1. The depth range is divided into B uniform consecutive subintervals (buckets). 2. The occupancy bitmap indicates fragment presence in each bucket. 3. Accumulate 1s until the count reaches k → () time. 4. Discard fragments with depth value > that of the k_a-th fragment. [Figure label: Occupancy bitmap]
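The four algorithm steps can be sketched for a single pixel as follows. This is a CPU-side illustration under stated assumptions: the depth range is given by hypothetical `z_near` and `z_far` values, the bitmap is a Python integer, and culling compares bucket indices, which is equivalent to comparing against the far boundary of the selected bucket. All function names are invented for this sketch.

```python
def bucket_of(depth, z_near, z_far, num_buckets):
    """Step 1: map a depth to one of B uniform consecutive subintervals."""
    t = (depth - z_near) / (z_far - z_near)
    return min(num_buckets - 1, max(0, int(t * num_buckets)))


def build_occupancy(depths, z_near, z_far, num_buckets):
    """Step 2: set one bit per bucket that contains at least one fragment."""
    bitmap = 0
    for depth in depths:
        bitmap |= 1 << bucket_of(depth, z_near, z_far, num_buckets)
    return bitmap


def culling_bucket(bitmap, k, num_buckets):
    """Step 3: accumulate 1s from the near side until k buckets are counted."""
    count = 0
    for b in range(num_buckets):
        count += (bitmap >> b) & 1
        if count == k:
            return b
    return num_buckets - 1  # fewer than k occupied buckets: cull nothing


def cull(depths, k, z_near, z_far, num_buckets):
    """Step 4: discard fragments that fall beyond the selected bucket."""
    bitmap = build_occupancy(depths, z_near, z_far, num_buckets)
    limit = culling_bucket(bitmap, k, num_buckets)
    return [d for d in depths
            if bucket_of(d, z_near, z_far, num_buckets) <= limit]


depths = [0.05, 0.07, 0.30, 0.55, 0.80, 0.95]
bitmap = build_occupancy(depths, 0.0, 1.0, 8)
survivors_k2 = cull(depths, 2, 0.0, 1.0, 8)
survivors_k4 = cull(depths, 4, 0.0, 1.0, 8)
```

Each set bit stands for at least one fragment, so the first k occupied buckets contain at least k fragments. The culling is therefore conservative: the survivors always include the k closest fragments, and their count is k_a ≥ k (3 survivors for k = 2 and 5 for k = 4 in this example).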

Results using our culling mechanism at k-buffer: Store & Sort [k-closest] fragments. 1. [Bavoil08] (alleviates RMW hazards): Quality (↑). 2. [Liu10, Maule13, Salvi13] (reduced fragment racing): Performance (↑). k=4

Results Quality Comparison: [Bavoil08] error 29.5% 0.6% [Vasilakis14]

Results using our culling mechanism at k-buffer: Store [all] fragments then Select & Sort [k-closest] ones. 1. [Salvi11, Yu12] (reduction of stored fragments): Memory (↓), Performance (↑). k=4

Results using our culling mechanism at k+-buffer: Store [k-closest] fragments, then Sort them. 1. [Vasilakis14, Vasilakis15] (reduced fragment racing): Performance (↑). k=4

Results Fragment Culling Comparison: 98.28% 63.66% layers [Vasilakis14] our culling k = 8

Results Performance Comparison: Impact of k

Results Performance Comparison: Impact of buckets (= 32∙d) k = 8
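One plausible reading of "buckets (= 32∙d)" is that the occupancy bitmap is stored as d 32-bit words per pixel; that reading is an assumption, as is everything in the sketch below. Under it, locating the k-th occupied bucket can skip whole words using a population count and only scan bits inside the word that contains the answer. The name `kth_set_bit` is hypothetical.

```python
def kth_set_bit(words, k):
    """Index of the k-th set bit (k >= 1) in a list of 32-bit words, or None."""
    seen = 0
    for i, word in enumerate(words):
        ones = bin(word).count("1")  # population count of this word
        if seen + ones >= k:
            # The k-th set bit lies inside this word: scan its bits.
            for bit in range(32):
                seen += (word >> bit) & 1
                if seen == k:
                    return 32 * i + bit
        seen += ones  # skip the whole word
    return None  # fewer than k bits are set


# d = 3 words, i.e. 96 buckets, with buckets 0, 2 and 65 occupied.
words = [0b101, 0, 0b10]
first = kth_set_bit(words, 1)
third = kth_set_bit(words, 3)
missing = kth_set_bit(words, 4)
```

A larger d gives finer buckets and a tighter culling threshold, at the cost of more words to store and scan per pixel.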

Conclusions Efficient fragment culling: Exploits fragment occupancy maps to approximate the k-th fragment.

Conclusions Efficient fragment culling: Exploits fragment occupancy maps to approximate the k-th fragment. Limitations: Works well only when the generated per-pixel fragments ≫. Fragment rejection process (speed up) is highly dependent on the occupancy map resolution.

The end Shader Code Available: Acknowledgements: This research has been co-financed by the European Union (European Social Fund - ESF) and Greek national funds through the Operational Program "Education and Lifelong Learning" of the National Strategic Reference Framework (NSRF) - Research Funding Program: ARISTEIA II-GLIDE (grant no.3712).

References
[Bavoil2007] Multi-fragment effects on the GPU using the k-buffer (I3D)
[Bavoil2008] Deferred rendering using a stencil routed k-buffer (ShaderX6)
[Liu2010] FreePipe: a programmable parallel rendering architecture for efficient multi-fragment effects (I3D)
[Salvi2011] Adaptive transparency (I3D)
[Yu2012] A framework for rendering complex scattering effects on hair (I3D)
[Maule2013] Hybrid transparency (I3D)
[Salvi2013] Pixel synchronization: Solving old graphics problems with new data structures (SIGGRAPH)
[Vasilakis2014] k+-buffer: Fragment synchronized k-buffer (I3D)
[Vasilakis2015] k+-buffer: An efficient, memory-friendly and dynamic k-buffer framework (TVCG)