RADEON™ 9700 Architecture and 3D Performance

1 RADEON™ 9700 Architecture and 3D Performance
Gordon Elder ATI Technologies - Confidential 19 April 2019

2 RADEON™ 9700 What is the RADEON™ 9700?
Programmability (SMARTSHADER™ 2.0)
  First Full Floating Point Graphics Pipeline
  Enables Compilation of High Level Shading Languages
Performance
  High Bandwidth
  Parallelism
  Efficiency
Image Quality (SMOOTHVISION™ 2.0)
  Multisample Antialiasing
  Anisotropic Texture Filtering

3 Image Generation with Image Mapping 1st Generation Programmability
Idea: Texture Mapping, Blinn and Newell 1976
Implementation: SGI VGXT 1990
  Hardwired Vertex Processing
  Hardwired Fragment Processing with a Single Texture
Result: Environment Mapping and other effects
Blinn, J. F. and Newell, M. E. Texture and reflection in computer generated images. Communications of the ACM Vol. 19, No. 10 (October 1976),

4 Image Generation with Texture Composition 2nd Generation Programmability
Idea: Shade trees, R. Cook 1984
Implementation: RADEON™
  Limited Vertex Programmability
  Limited Fragment Processing
  Multiple Textures
  Fixed Point Data
  Short Programs
Result: Current generation of effects.
Robert L. Cook. Shade Trees. Computer Graphics Vol. 18, No. 3, (July 1984),

5 Image Generation with General Purpose Floating Point Math & Texturing 3rd Generation Programmability
Idea: RenderMan®, Pixar 1987
Implementation: ATI RADEON™
  Advanced Vertex Programmability
  Advanced Fragment Programmability
  Floating Point Data
  Rich Instruction Set
  Large Instruction Store
Result: Enabling Cinematic Rendering
  Compiling RenderMan®, Maya, etc.
William T. Reeves, David H. Salesin, Robert L. Cook. Rendering Antialiased Shadows with Depth Maps. Computer Graphics Vol. 21, No. 4, (July 1987),

6 SMARTSHADER™ 2.0 Next-generation programmable shader technology
Enabling cinema-quality effects in real time
First complete DirectX® 9.0 feature support
  2.0 Vertex and Pixel Shaders
  Floating Point Pixel Pipelines
  128-bit Floating Point Texture and Frame Buffer Formats
  Two-Sided Stencil Shadow Acceleration
  High Precision 32-bpp (10:10:10:2) Display Mode
  Higher Order Surface Enhancements
Full feature set also available for OpenGL®
  OpenGL® Shading Language Support
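To make the 32-bpp (10:10:10:2) display mode concrete, here is a minimal sketch of packing one pixel into a 32-bit word with 10 bits each for red, green and blue and 2 bits for alpha. The slides only give the bit widths; the channel order used here (alpha in the top bits, then red, green, blue) and the function names are assumptions made for illustration.

```python
def pack_2_10_10_10(r, g, b, a):
    """Pack normalized [0, 1] channels into one 32-bit word.

    Assumed layout (not stated in the slides): alpha in bits 30-31,
    red in bits 20-29, green in bits 10-19, blue in bits 0-9.
    """
    def quantize(value, bits):
        # Clamp to [0, 1], then round to the nearest representable level.
        levels = (1 << bits) - 1
        clamped = min(max(value, 0.0), 1.0)
        return int(clamped * levels + 0.5)

    return (
        (quantize(a, 2) << 30)
        | (quantize(r, 10) << 20)
        | (quantize(g, 10) << 10)
        | quantize(b, 10)
    )


def unpack_2_10_10_10(word):
    """Inverse of pack_2_10_10_10; returns (r, g, b, a) as floats in [0, 1]."""
    r = ((word >> 20) & 0x3FF) / 1023.0
    g = ((word >> 10) & 0x3FF) / 1023.0
    b = (word & 0x3FF) / 1023.0
    a = ((word >> 30) & 0x3) / 3.0
    return r, g, b, a
```

Each color channel gets 1024 levels instead of the 256 levels of an 8:8:8:8 format, at the cost of leaving only 4 alpha levels.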

7 Vertex Shaders (SMARTSHADER™ 2.0)
Flow Control
  Loops, jumps and subroutines
  Allow re-use of certain parts of the shader code
  Avoids repetition and saves instructions
More Instructions, More Complex Effects
  Up to 65,280 instructions per pass
  Vertex shaders can be much more complex than they were in DX8

8 Pixel Shaders (SMARTSHADER™ 2.0)
More Complex Shaders by an Order of Magnitude
  Up to 160 instructions per pass
    32 address ops, 64 color ops, 64 alpha ops
    Compared with 12 instructions total in DX8.0
Multi-pass rendering support
  High precision 128-bit floating point data formats for storing intermediate results between passes
  Shaders can now effectively be thousands of instructions long; performance is the only limitation
24-bit per component floating point precision for all pixel shader operations, necessary for cinema-quality effects
Allows shaders written in any present or future language to run on hardware with SMARTSHADER™ 2.0
  Even high level languages like RenderMan® can now be compiled to run on RADEON™ 9700 in real time
Pixel shaders can also implement complex image processing algorithms
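The instruction figures on this slide can be checked with a little arithmetic. The sketch below adds up the per-pass budget, compares it with the DX8.0 figure, and estimates how many passes a longer shader would need. The `passes_needed` helper is hypothetical and assumes a shader splits evenly at the 160-instruction limit with no overhead for writing and re-reading intermediate results, which a real compiler would not get for free.

```python
import math

# Per-pass limits quoted on the slide.
ADDRESS_OPS = 32
COLOR_OPS = 64
ALPHA_OPS = 64
DX8_TOTAL_INSTRUCTIONS = 12

INSTRUCTIONS_PER_PASS = ADDRESS_OPS + COLOR_OPS + ALPHA_OPS  # 160

# Ratio behind the "order of magnitude" claim (roughly 13x).
RATIO_VS_DX8 = INSTRUCTIONS_PER_PASS / DX8_TOTAL_INSTRUCTIONS


def passes_needed(total_instructions, per_pass=INSTRUCTIONS_PER_PASS):
    """Hypothetical estimate of passes for a shader of the given length.

    Assumes an even split at the per-pass limit and ignores the extra
    instructions needed to save and restore intermediate results.
    """
    if total_instructions <= 0:
        return 0
    return math.ceil(total_instructions / per_pass)
```

Under these assumptions a 1,000-instruction shader needs 7 passes, which is why the 128-bit floating point intermediate formats matter: precision must survive each round trip through memory.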

9 RADEON™ 9700 Performance
Key design elements for best performance: High Bandwidth, Parallelism, & Efficiency
High Bandwidth
  AGP 8x provides 2 GB/sec transfers to or from the CPU or system memory.
  310 MHz 256-bit DDR Memory Interface provides GB/sec access to the Frame Buffer
  Internal 256-bit data busses for Color, Texture and Z
Parallelism
  4 Vertex Engines running at 325MHz provide Mtriangles/sec (4 clocks per vertex per engine)
  8 Pixels/Clock Rasterization Architecture running at 325MHz provides a peak fill rate of 2.6 Gpix/sec
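The peak figures on this slide follow from clock rate times work per clock. The sketch below derives them from the numbers given (310 MHz, 256-bit DDR; 4 vertex engines at 4 clocks per vertex; 8 pixels per clock at 325 MHz). The function names are illustrative, decimal units (1 GB = 10^9 bytes) are assumed, and reading the vertex rate as a triangle rate assumes about one new vertex per triangle, as in long strips.

```python
def ddr_bandwidth_bytes_per_sec(clock_hz, bus_bits):
    """Peak bandwidth of a DDR interface.

    DDR transfers data on both clock edges, so there are two
    bus-wide transfers per clock cycle.
    """
    return clock_hz * 2 * (bus_bits // 8)


def peak_fill_rate(clock_hz, pixels_per_clock):
    """Peak pixels per second for a rasterizer of the given width."""
    return clock_hz * pixels_per_clock


def peak_vertex_rate(clock_hz, engines, clocks_per_vertex):
    """Peak vertices per second across all vertex engines."""
    return clock_hz * engines // clocks_per_vertex


MEMORY_BANDWIDTH = ddr_bandwidth_bytes_per_sec(310_000_000, 256)  # bytes/sec
FILL_RATE = peak_fill_rate(325_000_000, 8)                        # pixels/sec
VERTEX_RATE = peak_vertex_rate(325_000_000, 4, 4)                 # vertices/sec
```

With these inputs the memory interface works out to 19.84 GB/sec, the fill rate to the 2.6 Gpix/sec quoted on the slide, and the vertex rate to 325 million vertices per second. All three are theoretical peaks rather than measured throughput.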

10 RADEON™ 9700 Performance (cont.)
Efficiency
  Graphics systems tend to be memory bandwidth limited, and the RADEON™ 9700 is no exception, so it is important to use the bandwidth efficiently.
  Hierarchical and Early Z checking allows pixels to be rejected before the pixel shader. This is very important when shader programs are long.
  Color, Texture and Z caches reduce memory bandwidth utilization. They benefit from spatial and temporal locality.
  Lossless Color and Z data compression reduces memory bandwidth utilization.
  Compressed Textures can be used to reduce memory bandwidth utilization.
  Fast Color and Z clears eliminate the need to access memory for clears.
HyperZ III
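The benefit of rejecting pixels before the pixel shader can be illustrated with a toy software model. The sketch below counts how many times a stand-in "shader" runs with and without an early depth test. It is not a description of the HyperZ III hardware: the `render` function and its fragment format are assumptions, and the hierarchical (tile-level) part of the test is not modeled.

```python
def render(fragments, width, height, early_z=True):
    """Count shader invocations for a stream of (x, y, z) fragments.

    Smaller z is closer to the viewer. With early_z enabled, a fragment
    that is not closer than the stored depth is rejected before shading.
    Without it, every fragment is shaded and the depth test happens last.
    Returns (shader_invocations, final_depth_buffer).
    """
    depth = [[float("inf")] * width for _ in range(height)]
    shaded = 0
    for x, y, z in fragments:
        if early_z and z >= depth[y][x]:
            continue  # rejected before the (expensive) shader runs
        shaded += 1   # stand-in for executing the pixel shader
        if z < depth[y][x]:
            depth[y][x] = z  # fragment is visible, update the depth buffer
    return shaded, depth


def full_screen_layers(depths, width, height):
    """Build fragments for full-screen quads drawn in the given depth order."""
    return [(x, y, z) for z in depths for y in range(height) for x in range(width)]
```

For three full-screen layers on a 4x4 target drawn front to back, the early test shades 16 fragments instead of 48. Drawn back to front, it saves nothing, which is why draw order still matters even with an early Z check.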

11 RADEON™ 9700 Performance (cont.)
One more interesting thing...
Scalability
  The RADEON™ 9700 Architecture is capable of scaling up to 256 simultaneous units

12 Image Quality (SMOOTHVISION™ 2.0)
Performance matters too
  Pixel antialiasing and anisotropic texture filtering improve image quality only if they are enabled.
  Just going to higher resolutions isn’t the answer for improved image quality.
    Artifacts due to poor texture sampling remain.
    Dynamic antialiasing artifacts are still very visible.
  Sufficient performance for high resolution display, high quality texture filtering, and antialiasing is needed.
  The RADEON™ 9700 was architected to do all three simultaneously.

13 Anti-Aliasing (SMOOTHVISION™ 2.0)
Non-Grid Programmable Multi-Sampling
  2, 4, or 6 samples per pixel
  Sample positions provide the maximum quality per sample
  Lossless Z and Color compression minimizes bandwidth cost of higher sample counts.
Per Sample Gamma Correction
  Takes gamma into account when blending samples
  Creates smoother edge transitions
[Figure: edge gradient, input and output, standard vs. gamma corrected]
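Taking gamma into account when blending samples can be sketched as follows: convert each gamma-encoded sample to linear light, average, then re-encode. This is a generic illustration rather than the hardware's actual resolve path. The simple power-law curve with an exponent of 2.2 and both function names are assumptions.

```python
def resolve_naive(samples):
    """Average gamma-encoded sample values directly (no gamma correction)."""
    return sum(samples) / len(samples)


def resolve_gamma_corrected(samples, gamma=2.2):
    """Average samples in linear light, then re-encode for display.

    Samples are gamma-encoded intensities in [0, 1]. The power-law
    curve and the default exponent of 2.2 are assumptions.
    """
    linear = [s ** gamma for s in samples]
    mean_linear = sum(linear) / len(linear)
    return mean_linear ** (1.0 / gamma)
```

For an edge pixel with half its samples black (0.0) and half white (1.0), the naive average is 0.5, which displays darker than half intensity on a gamma 2.2 screen. The gamma-corrected resolve gives about 0.73, which displays as the intended half intensity and makes the edge gradient look more even.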

14 Anisotropic Filtering (SMOOTHVISION™ 2.0)
Improved Adaptive Algorithm
  Up to 16 Trilinear Samples (128-tap)
  Calculates optimal number of samples for each polygon
  Delivers full image quality benefit while conserving memory bandwidth
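The "128-tap" figure follows from 16 trilinear samples at 8 texel reads each (4 bilinear texels on each of 2 mip levels). The sketch below shows that arithmetic and one simple way an adaptive scheme could choose a sample count. The slides do not describe the actual heuristic, so the ratio-based rule in `aniso_sample_count` is an assumption borrowed from the common textbook approach.

```python
import math

TEXELS_PER_TRILINEAR_SAMPLE = 8  # 4 bilinear texels on each of 2 mip levels
MAX_SAMPLES = 16                 # upper limit quoted on the slide


def aniso_sample_count(major_axis, minor_axis, max_samples=MAX_SAMPLES):
    """Pick a sample count from the pixel footprint's anisotropy ratio.

    major_axis and minor_axis are the lengths of the pixel's footprint
    in texture space. Hypothetical rule: one sample per unit of
    major/minor ratio, rounded up and clamped to [1, max_samples].
    """
    ratio = major_axis / max(minor_axis, 1e-8)
    return max(1, min(max_samples, math.ceil(ratio)))


def taps(samples):
    """Texel reads needed for the given number of trilinear samples."""
    return samples * TEXELS_PER_TRILINEAR_SAMPLE
```

A surface facing the viewer (ratio near 1) takes a single sample and 8 taps, while a steeply angled floor reaches the full 16 samples and 128 taps. Spending samples only where the footprint is stretched is what conserves memory bandwidth.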

15 RADEON™ 9700 Demos

16 Conclusion

