1 3D Graphics Processor Architecture Victor Moya

2 PhD Project. Research on architecture improvements for future Graphics Processing Units (GPUs). Design and implement a GPU simulator for 3D graphics. Goal: real-time radiosity on GPU.

3 Outline: Rendering. Global Illumination. Ray Tracing. Radiosity. Status.

4 Outline: Rendering. Global Illumination. Ray Tracing. Radiosity. Status.

5 Rendering. Display a database of 3D objects on a screen, or render it to a picture file or a movie file. Rendering methods for 3D graphics: rasterization, Reyes, raytracing and radiosity.

6 Rasterization. Project 3D polygons onto a view plane. Rasterize those polygons into fragments. Shade the generated fragments. Apply and combine textures to calculate the fragment color. Objective: real-time rendering that looks as realistic as possible while avoiding simulation of physical light behavior. With the help of vertex and fragment shaders, rasterization can render realistic images.
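
The project and rasterize steps above can be sketched in a few lines. This is a minimal illustration, not the pipeline of any particular GPU: the function names, the view plane spanning [-1, 1] and the counter-clockwise winding are all assumptions made for the example, and shading and texturing are left out.

```python
def edge(ax, ay, bx, by, px, py):
    """Signed area test: positive when p lies to the left of the edge a->b."""
    return (bx - ax) * (py - ay) - (by - ay) * (px - ax)

def project(v, near=1.0):
    """Perspective-project a camera-space point (x, y, z) onto the plane z = near."""
    x, y, z = v
    return (near * x / z, near * y / z)

def rasterize(tri, width, height):
    """Return the pixel coordinates (fragments) covered by a projected triangle.

    Assumes the view plane spans [-1, 1] in x and y and that the
    triangle is given in counter-clockwise order.
    """
    (x0, y0), (x1, y1), (x2, y2) = [project(v) for v in tri]
    fragments = []
    for j in range(height):
        for i in range(width):
            # Sample at the pixel centre, mapped to [-1, 1].
            px = (i + 0.5) / width * 2.0 - 1.0
            py = (j + 0.5) / height * 2.0 - 1.0
            w0 = edge(x1, y1, x2, y2, px, py)
            w1 = edge(x2, y2, x0, y0, px, py)
            w2 = edge(x0, y0, x1, y1, px, py)
            if w0 >= 0 and w1 >= 0 and w2 >= 0:
                fragments.append((i, j))
    return fragments

# Hypothetical triangle two units in front of the camera, drawn on an 8x8 grid.
triangle = [(-1.0, -1.0, 2.0), (1.0, -1.0, 2.0), (0.0, 1.0, 2.0)]
frags = rasterize(triangle, 8, 8)
```

Every pixel is tested independently of the others, which is why this stage maps well to parallel hardware.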

7 [Figure: polygons between the projection (near) plane and the far plane are projected onto the projection plane and rasterized.]

8 Reyes. Reyes, the architecture behind Pixar's RenderMan renderer, is a rendering architecture designed for realistic offline rendering. The 3D objects are reduced to a number of micropolygons. Micropolygon: a polygon smaller than a pixel. The micropolygons are then shaded and later sampled and written to the framebuffer. Can be combined with raytracing, radiosity or other global illumination techniques.

9 [Figure: Reyes pipeline. Labels: Model, Dice, Shade, Sample, Visibility/Filter, Image.]

10 Raytracing. Cast a ray from the camera (through the framebuffer pixels) to the objects in the scene. Secondary rays may be created as reflections and refractions of the primary rays or of other secondary rays. Rays may be sent from the light sources to create caustic light effects. Good simulation of reflection and transparency (refraction).
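
As an illustration of a primary ray and the secondary reflection ray it spawns, here is a minimal sketch using a sphere as the only scene object. The helper names and the scene values are assumptions for the example; a real raytracer would also handle refraction, shadow rays and rays that start inside an object.

```python
import math

def sub(a, b): return tuple(x - y for x, y in zip(a, b))
def dot(a, b): return sum(x * y for x, y in zip(a, b))
def scale(a, s): return tuple(x * s for x in a)
def normalize(a): return scale(a, 1.0 / math.sqrt(dot(a, a)))

def hit_sphere(origin, direction, center, radius):
    """Return the nearest ray parameter t in front of the origin, or None.

    direction must be unit length. Only the nearer root is considered,
    so a ray starting inside the sphere is reported as a miss.
    """
    oc = sub(origin, center)
    b = dot(oc, direction)
    c = dot(oc, oc) - radius * radius
    disc = b * b - c
    if disc < 0.0:
        return None
    t = -b - math.sqrt(disc)
    return t if t > 1e-6 else None

def reflect(direction, normal):
    """Mirror a direction about a unit surface normal: r = d - 2 (d.n) n."""
    return sub(direction, scale(normal, 2.0 * dot(direction, normal)))

# Primary (eye) ray looking down +z at a unit sphere five units away.
origin, direction = (0.0, 0.0, 0.0), (0.0, 0.0, 1.0)
center, radius = (0.0, 0.0, 5.0), 1.0
t = hit_sphere(origin, direction, center, radius)
point = tuple(o + t * d for o, d in zip(origin, direction))
normal = normalize(sub(point, center))
bounce = reflect(direction, normal)   # direction of the secondary ray
```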

12 Radiosity. Simulates the physical behavior of light. Define the emission, reflection, refraction, absorption and scattering properties of the scene surfaces. Mathematical formulation: a system of linear equations. Iteratively build an approximation to the illumination solution. Used to implement global illumination: diffuse lighting and indirect lighting.
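
The slide does not write the system out. The usual textbook form for diffuse surfaces is given here as a reference; the symbols are assumptions taken from that formulation, not from the presentation: B_i is the radiosity of patch i, E_i its emission, \rho_i its diffuse reflectance and F_{ij} the form factor between patches i and j.

```latex
B_i = E_i + \rho_i \sum_{j=1}^{n} F_{ij}\, B_j , \qquad i = 1, \dots, n
\quad\Longleftrightarrow\quad
\bigl(I - \operatorname{diag}(\rho)\, F\bigr)\, B = E

% One pass of the iterative approximation:
B^{(k+1)} = E + \operatorname{diag}(\rho)\, F\, B^{(k)}, \qquad B^{(0)} = E
```

Each pass adds one more bounce of indirect light to the solution.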

13 [Figure: radiosity solution after the 1st, 2nd, 3rd, 4th and 16th pass.]

14 Rasterization vs Reyes
Rasterization: real-time; implemented on current hardware; optimized for a small number of large polygons; global illumination emulated using shaders, shadow maps and stencil shadows.
Reyes: off-line rendering; implemented by software renderers; optimized for a large number of polygons; global illumination through shaders, raytracing and radiosity.

15 Rasterization vs Raytracing
Rasterization: real-time; implemented on current hardware; optimized for a small number of large polygons; global illumination emulated using shaders, shadow maps and stencil shadows.
Raytracing: off-line rendering; implemented by software renderers, with some hardware implementations; optimized for large numbers of small polygons; global illumination through Whitted raytracing, photon mapping, Monte Carlo methods and path tracing.

16 Rasterization vs Radiosity
Rasterization: real-time; implemented on current hardware; optimized for a small number of large polygons; global illumination emulated using shaders, shadow maps and stencil shadows.
Radiosity: off-line rendering; implemented by software renderers; optimized for large numbers of small polygons; global illumination is inherent to the algorithm.

17 Outline: Rendering. Global Illumination. Ray Tracing. Radiosity. Status.

18 Global Illumination. Illumination and lighting depend on all the objects and lights in the scene. BRDF: function that defines how light is reflected or refracted over a surface. Soft shadows: umbra and penumbra effects. Physically correct reflections and refractions. Indirect illumination: color blending and caustics.

19 Why Global Illumination. Realism. A single algorithm for the full illumination problem: direct illumination, indirect illumination and shadows.

20 Global Illumination vs Real-Time. Full-scene global illumination algorithms are expensive. Introduce a level of detail (LOD) for the illumination algorithm. Not all scenes may require a full global illumination implementation. Not all parts of a scene may require a full global illumination implementation. Combine normal rasterization algorithms and techniques with global illumination techniques. Reyes architecture.

21 Outline: Rendering. Global Illumination. Raytracing. Radiosity. Status.

22 Raytracing. Highly parallel task. Raytracing algorithms: raycasting, shadow casting, Whitted raytracing, photon mapping, Monte Carlo methods and path tracing.

23 Rays. Types: eye rays, shadow rays, reflected rays and refracted rays. Raytracing recursion depth: static or adaptive.
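
The difference between static and adaptive recursion depth can be shown with a toy model. In this sketch every surface emits 1.0 and mirrors a fixed fraction of incoming light; `trace`, `max_depth` and `min_weight` are names invented for the illustration. Static control stops at a fixed depth, while adaptive control also stops once the ray's accumulated contribution is too small to matter.

```python
def trace(depth, weight, reflectivity, max_depth=8, min_weight=None):
    """Return (radiance, deepest level reached) for a ray bouncing between mirrors.

    Static control: recursion stops at max_depth.
    Adaptive control: if min_weight is given, recursion also stops as soon as
    the next bounce would contribute less than min_weight to the final pixel.
    """
    local = 1.0                              # light emitted by the surface hit
    if depth >= max_depth:
        return local, depth
    next_weight = weight * reflectivity      # contribution of the next bounce
    if min_weight is not None and next_weight < min_weight:
        return local, depth
    reflected, deepest = trace(depth + 1, next_weight, reflectivity,
                               max_depth, min_weight)
    return local + reflectivity * reflected, deepest

static_radiance, static_depth = trace(0, 1.0, 0.5)
adaptive_radiance, adaptive_depth = trace(0, 1.0, 0.5, min_weight=0.1)
```

With a reflectivity of 0.5 the adaptive version stops five levels earlier than the static one, at the cost of a small error in the result.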

24 Raytracing on Current GPUs. Limitations: Integer arithmetic and addressing are not supported in current shader models. No generalized output buffers for fragment shader programs. No branching, looping or function calls. No stream buffer or conditional stream support. Underutilization of the vertex shaders (1 quad per pass), which represent ~30% of the computing resources in current GPUs.

25 Outline: Rendering. Global Illumination. Raytracing. Radiosity. Status.

26 Radiosity. Light energy per unit surface leaving any surface in the scene. Highly parallel. The scene is divided into patches. Form factor: fraction of light that reaches a surface i from a surface j.
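
For two small, mutually visible patches the form factor has a simple closed-form approximation, sketched below. The formula is the standard point-to-point approximation, not something stated in the slides, and the function name and arguments are assumptions. Textbook notation usually defines F_ij as the fraction of energy leaving patch i that arrives at patch j; the two conventions are related through the patch areas.

```python
import math

def point_form_factor(p_i, n_i, p_j, n_j, area_j):
    """Approximate form factor from a small patch i to a small patch j.

    F ~= cos(theta_i) * cos(theta_j) * A_j / (pi * r^2)
    p_* are patch centres, n_* unit normals, area_j the area of patch j.
    Returns 0 when either patch faces away from the other.
    Occlusion by other geometry is ignored in this sketch.
    """
    d = tuple(b - a for a, b in zip(p_i, p_j))       # vector from i to j
    r2 = sum(c * c for c in d)
    r = math.sqrt(r2)
    cos_i = sum(n * c for n, c in zip(n_i, d)) / r
    cos_j = -sum(n * c for n, c in zip(n_j, d)) / r
    if cos_i <= 0.0 or cos_j <= 0.0:
        return 0.0
    return cos_i * cos_j * area_j / (math.pi * r2)

# Two patches facing each other, two units apart.
f = point_form_factor((0, 0, 0), (0, 0, 1), (0, 0, 2), (0, 0, -1), 0.01)
```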

27 Radiosity implementations. Light maps and volumes: static (offline) radiosity, used with rasterization as textures. Cube maps and spherical harmonics: fast implementation on current hardware. Photon mapping: implemented using raytracing. System of linear equations: direct matrix solution or approximation; iterative solution.
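
The iterative solution of the linear system can be sketched with a plain Jacobi-style iteration. This is one possible scheme, chosen for brevity and not necessarily the one the author evaluated; the function name, the two-patch scene and the default of 16 passes are assumptions.

```python
def solve_radiosity(emission, reflectance, form_factors, passes=16):
    """Iterate B = E + rho * (F B), starting from B = E.

    emission[i], reflectance[i]: per-patch emission and diffuse reflectance.
    form_factors[i][j]: form factor between patch i and patch j.
    Each pass adds one more bounce of indirect light.
    """
    n = len(emission)
    b = list(emission)
    for _ in range(passes):
        b = [emission[i] + reflectance[i] *
             sum(form_factors[i][j] * b[j] for j in range(n))
             for i in range(n)]
    return b

# Hypothetical scene: patch 0 is a light, patch 1 only receives light.
radiosity = solve_radiosity(
    emission=[1.0, 0.0],
    reflectance=[0.5, 0.5],
    form_factors=[[0.0, 0.5], [0.5, 0.0]],
)
```

All patches are updated from the previous pass only, so every entry of a pass can be computed in parallel.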

28 Photon Mapping on Current GPUs. Limitations: No integer ALU and addressing modes. No support for large 1D texture addressing (CPU loads). No scatter capability at the shaders (CPU stores).
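
The scatter limitation matters because storing a photon means writing to an address computed at run time. The sketch below contrasts a scatter write with the gather-only emulation a shader without scatter is forced into. Both functions are illustrative and their names are assumptions.

```python
def scatter(values, indices, size):
    """Scatter: each input writes to a computed address (a CPU-style store)."""
    out = [0.0] * size
    for v, i in zip(values, indices):
        out[i] += v
    return out

def gather_emulation(values, indices, size):
    """Gather-only emulation: each output cell scans every input.

    A fragment program can only write to its own output location, so without
    scatter every cell has to search for the inputs that belong to it.
    """
    return [sum(v for v, i in zip(values, indices) if i == k)
            for k in range(size)]

# Three photons deposited into a four-cell grid.
energies, cells = [1.0, 2.0, 3.0], [2, 0, 2]
direct = scatter(energies, cells, 4)
emulated = gather_emulation(energies, cells, 4)
```

Both produce the same grid, but the emulation does work proportional to cells times photons instead of photons alone.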

29 Outline: Rendering. Global Illumination. Ray Tracing. Radiosity. Status.

30 Research Topics. Evaluate radiosity on Atila. Propose software and hardware changes to make radiosity real-time.

31 Immediate changes. Unify the shader model: a single shader model for vertex and fragment shaders. Generalize the shader model: integer operations; branches and function calls; looping; memory load (different from texture load); texture write and memory store (scatter).

32 New architecture proposals. Reconfigurable shader architecture. Streaming. Deferred rendering. Embedded DRAM. Virtualization.

33 Reconfigurable Architecture. Static: variable rendering configuration for each algorithm (2:6:16, 0:8:16, 0:0:24). Dynamic: work balancing; streaming between shader units.

34 [Figure: shader unit configurations. Labels: Surface Shaders, Vertex Shaders, Fragment Shaders, Ray Shaders.]

35 Interconnection Network Dynamically reconfigurable shader network.

36 Streaming. Streaming on-chip buffers between shader units. Conditional streams. Any shader can: stream in from memory; stream out to memory; stream in from another shader; stream out to another shader.
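
A software analogy may help picture conditional streams: a kernel consumes records from an input stream and may or may not emit a record for each one, to a destination chosen at run time. The sketch below models on-chip buffers as queues. The kernels, stream names and record layouts are all hypothetical and say nothing about how the proposed hardware would implement this.

```python
from collections import deque

def run_kernel(kernel, stream_in, streams_out):
    """Drain stream_in through kernel, routing each result to a named stream.

    kernel(record) returns (stream_name, record) to emit, or None to emit
    nothing; that data-dependent emit is what makes the output conditional.
    """
    while stream_in:
        result = kernel(stream_in.popleft())
        if result is not None:
            name, record = result
            streams_out[name].append(record)

def intersect(ray):
    """Hypothetical ray kernel: only rays that hit something go on to shading."""
    ray_id, hit_distance = ray
    if hit_distance is None:
        return None                       # missed everything: emit nothing
    return ("shade", (ray_id, hit_distance))

def shade(hit):
    """Hypothetical shading kernel: streams a colour record out to memory."""
    ray_id, hit_distance = hit
    return ("memory", (ray_id, 1.0 / (1.0 + hit_distance)))

rays = deque([(0, 2.0), (1, None), (2, 0.5)])     # streamed in from memory
buffers = {"shade": deque(), "memory": deque()}   # shader-to-shader and memory
run_kernel(intersect, rays, buffers)              # shader to shader
run_kernel(shade, buffers["shade"], buffers)      # shader to memory
```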

37 [Figure: Interconnection Network. Labels: Interconnection Network, MC.]

38 Deferred rendering. Store the whole scene in local video memory before rendering. Rasterization: after the geometry stage; reduces the overdraw overhead. Raytracing: before any processing; build the acceleration structure for dynamic scenes.

39 [Figure. Labels: Interconnection Network, MC.]

40 Embedded DRAM. Large on-chip embedded DRAM memory buffer. Store stream buffers between shaders. Store the framebuffer and Z buffer for rasterization: reduced overhead from overdraw; with deferred tiled rendering, fast supersampling antialiasing with low external bandwidth. Store acceleration structures for raytracing and radiosity.

41 [Figure. Labels: Interconnection Network, L2, GPU Memory, System Memory, eDRAM.]

42 Virtualization. Virtualize GPU resources: number of shader processors; size of on-chip stream buffers. Virtualize shader resources: scratch RAM for register spills; instruction cache rather than instruction memory, for unlimited program length.

43 Virtualization. Virtualize GPU memory. Memory hierarchy: on-chip caches (L1 or L2); on-chip embedded DRAM buffers; GPU video memory (L2 or L3); system memory; disk.

44 Specific vs Programmable. Rasterization-specific hardware: hierarchical Z buffer; Z and stencil buffer; rasterizer (triangle setup, fragment generation, interpolation).

45 Specific vs Programmable. Future GPUs may replace specific-purpose units with programmable units (shaders). Example: triangle setup using the homogeneous coordinates setup algorithm [Olano & Greer] can be efficiently implemented by current shaders. Only use specific-purpose units for those tasks that are more efficient on specific hardware.

46 Specific purpose hardware. Acceleration hardware for ray-triangle intersection. Acceleration hardware for scene traversal. Acceleration hardware for photon maps.

47 General Purpose GPUs. A generalized GPU can be used as a highly parallel coprocessor for highly parallelizable computation tasks. Simulations: collision; fluid simulation. FFT. Matrix solving. Grid computing.

