ATI Stream Computing ATI Radeon™ HD 2900 Series GPU Hardware Overview Micah Villmow May 30, 2008.

1 ATI Stream Computing ATI Radeon™ HD 2900 Series GPU Hardware Overview Micah Villmow May 30, 2008

2 | ATI Stream Computing Update | Confidential | ATI Stream Computing – ATI Radeon™ HD 2900 Series GPU Hardware Overview
ATI Radeon™ HD 2900 Series GPU Hardware Overview
- Graphics View
- Compute View
- ATI Radeon™ HD 2900 Series GPU Hardware
- ATI Radeon™ HD 2400/2600 Series GPU Hardware

3 ATI Radeon™ HD 2900 Series GPU - Graphics Overview
- Created for graphics
- Not optimal for compute
- Various functions have specific use cases
- Overhead caused by graphics pipeline
- Graphics APIs do not allow very direct control

4 ATI Radeon™ HD 2900 Series GPU – Compute Overview
- Hides non-compute items:
  - Geometry Shader
  - Tessellation Unit
  - Vertex Shader
  - Vertex Cache
  - Z/Stencil Cache
  - Etc.
- Exposes only what is required

5 ATI Radeon™ HD 2900 Series GPU Hardware
- ALU Hardware
  - Streaming Core
  - Thread Processor
  - Flow Control
  - Thread Creation
  - ALU Scheduling
- Memory Hardware
  - Memory Controller
  - Texture Unit
  - Texture Unit Scheduling
  - Tiling
  - Render Backends

6 ALU Hardware – Thread Processors
- 5 Streaming Cores (SCs)
- Four thin SCs [X, Y, Z, W]
- One fat SC [T]
- Branch execution unit
- Single cycle dispatch
- Four cycle latency
- 16 Threads/Cycle

    00 ALU: ADDR(32) CNT(5)
          0  x: MOV R1.x, 0.0f
             y: MOV R1.y, 0.0f
             z: MOV R1.z, 0.0f
             w: MOV R1.w, 0.0f
             t: MOV R0.x, 0.0f

7 ALU Hardware – Flow Control
- Predication to mask state updates
- Writes only occur when mask not set

    01 LOOP_DX10 i0 FAIL_JUMP_ADDR(5) VALID_PIX
    02 ALU_BREAK: ADDR(37) CNT(2) KCACHE0(CB0:0-15)
          1  y: SETE_INT R0.y, R0.x, KC0[1].x
          2  x: PREDE_INT ____, R0.y, 0.0f UPDATE_EXEC_MASK UPDATE_PRED
    03 ALU: ADDR(39) CNT(5) KCACHE0(CB0:0-15)
          3  x: ADD R1.x, R1.x, KC0[0].x
             y: ADD R1.y, R1.y, KC0[0].y
             z: ADD R1.z, R1.z, KC0[0].z
             w: ADD R1.w, R1.w, KC0[0].w
             t: ADD_INT R0.x, R0.x, 1
    04 ENDLOOP i0 PASS_JUMP_ADDR(2)
    05 EXP_DONE: PIX0, R1
    END_OF_PROGRAM
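One way to read the listing above is as a counted loop in which each thread leaves the loop once its counter (R0.x) equals a limit (KC0[1].x), while the remaining threads keep accumulating into R1. The Python sketch below models that idea with an explicit per-thread active mask. It is an illustrative model only, not the hardware's mechanism: the function name `run_predicated_loop` and the per-thread `limits` list are assumptions made for the example (in the listing the limit is a constant shared by all threads), and only one component of R1 is tracked.

```python
def run_predicated_loop(limits, step):
    """Model a SIMD loop where an execution mask gates register writes.

    limits: hypothetical per-thread iteration limits (stand-in for KC0[1].x).
    step:   value added each iteration (stand-in for one component of KC0[0]).
    """
    count = len(limits)
    counter = [0] * count      # models R0.x, initialised to zero
    acc = [0.0] * count        # models one component of R1
    active = [True] * count    # models the per-thread execution mask

    while any(active):
        # ALU_BREAK clause: compare counter to limit and update the mask.
        for t in range(count):
            if active[t] and counter[t] == limits[t]:
                active[t] = False
        # ALU clause: only threads that are still active write their results.
        for t in range(count):
            if active[t]:
                acc[t] += step
                counter[t] += 1
    return acc


if __name__ == "__main__":
    print(run_predicated_loop([0, 1, 3], 2.0))
```

All threads step through the loop together; a thread that has finished simply stops writing, which is why divergent trip counts cost as many passes as the longest-running thread needs.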

8 ALU Hardware – DPP Array
- 4 SIMD Engines (SEs)
- 4 Quads/SE
- 4 Thread Processors (TPs)/Quad
- 5 Streaming Cores/TP
- 320 Streaming Cores
- 2 Wavefronts/SE
- 512 Threads concurrently processed
- 256 Registers per SIMD
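The totals on this slide follow from the per-level counts. The short Python check below reproduces them; the constant names are chosen for the example, and the figure of 16 quads per wavefront is taken from the thread-creation slide later in the deck.

```python
# Per-level counts as stated on the slides.
SIMD_ENGINES = 4
QUADS_PER_SIMD = 4
TPS_PER_QUAD = 4
CORES_PER_TP = 5
WAVEFRONTS_PER_SIMD = 2
QUADS_PER_WAVEFRONT = 16   # from the thread-creation slide
THREADS_PER_QUAD = 4

# Derived totals.
tps_per_simd = QUADS_PER_SIMD * TPS_PER_QUAD                       # 16
streaming_cores = SIMD_ENGINES * tps_per_simd * CORES_PER_TP       # 320
threads_per_wavefront = QUADS_PER_WAVEFRONT * THREADS_PER_QUAD     # 64
threads_in_flight = (SIMD_ENGINES * WAVEFRONTS_PER_SIMD
                     * threads_per_wavefront)                      # 512
# With 16 thread processors per SIMD, one instruction needs this many
# cycles to cover a whole wavefront.
cycles_per_wavefront_instruction = threads_per_wavefront // tps_per_simd

if __name__ == "__main__":
    print(streaming_cores, threads_per_wavefront,
          threads_in_flight, cycles_per_wavefront_instruction)
```

The last value, four cycles per wavefront instruction, is consistent with the "16 Threads/Cycle" and "Four cycle latency" figures on the thread-processor slide.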

9 ALU Hardware – Wavefront Execution
[Figure: timeline for cycles 0 through 7 with two columns, Even Wavefront and Odd Wavefront. IL instructions shown: imul r22, r22, r10 and and r22, r22, r11. The pattern repeats for every ALU instruction.]
- 1 square represents a quad (4 sequential threads)
- 4 quads execute per cycle on a SIMD
- Two wavefronts (WFs) execute in parallel
- Even/odd WFs interleave quads every other cycle
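The interleaving described on this slide can be sketched as a simple issue schedule. The Python model below is a simplification under stated assumptions: the function name `issue_schedule` is made up for the example, quads are assumed to issue in index order, and the even wavefront is assumed to own the even cycles. The original figure may differ in those details.

```python
def issue_schedule(cycles=8, quads_per_cycle=4, quads_per_wavefront=16):
    """Return (cycle, wavefront, quad indices) for each cycle on one SIMD.

    Assumes even cycles issue quads of the even wavefront and odd cycles
    issue quads of the odd wavefront, four quads at a time.
    """
    steps = quads_per_wavefront // quads_per_cycle   # 4 issue steps per WF
    schedule = []
    for cycle in range(cycles):
        wavefront = "even" if cycle % 2 == 0 else "odd"
        first = ((cycle // 2) % steps) * quads_per_cycle
        quads = list(range(first, first + quads_per_cycle))
        schedule.append((cycle, wavefront, quads))
    return schedule


if __name__ == "__main__":
    for cycle, wavefront, quads in issue_schedule():
        print(f"cycle {cycle}: {wavefront:4s} wavefront, quads {quads}")
```

In this model each wavefront finishes one instruction every eight cycles, and the two wavefronts together keep the SIMD occupied on every cycle.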

10 ALU Hardware – Thread Creation
- Stamps out 16 quads per wavefront in preset order
- Dispatched to SEs in round robin fashion by Ultra-Threaded Dispatch Processor
- Affects memory access performance
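To make the grouping concrete, the sketch below maps a thread index to its wavefront, quad, and SIMD engine. It assumes threads are numbered linearly and that wavefront k goes to engine k mod 4; the slide only says "preset order" and "round robin", so the actual ordering used by the hardware may differ. The function name `locate_thread` is hypothetical.

```python
THREADS_PER_QUAD = 4
QUADS_PER_WAVEFRONT = 16
THREADS_PER_WAVEFRONT = THREADS_PER_QUAD * QUADS_PER_WAVEFRONT  # 64


def locate_thread(thread_id, simd_engines=4):
    """Map a linear thread index to wavefront, SIMD engine, quad, and lane.

    Assumes linear thread numbering and round-robin wavefront dispatch.
    """
    wavefront, index_in_wavefront = divmod(thread_id, THREADS_PER_WAVEFRONT)
    quad, lane_in_quad = divmod(index_in_wavefront, THREADS_PER_QUAD)
    return {
        "wavefront": wavefront,
        "simd_engine": wavefront % simd_engines,
        "quad": quad,
        "lane_in_quad": lane_in_quad,
    }


if __name__ == "__main__":
    print(locate_thread(0))
    print(locate_thread(261))
```

Because neighbouring threads land in the same quad and wavefront, the order in which threads are stamped out determines which memory addresses are requested together, which is why the slide notes an effect on memory access performance.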

11 ALU Hardware - Wavefront Scheduling
[Figure: two scheduling timelines, captioned "SIMD Engine is 100% busy" and "SIMD Engine has stalls"]

12 Memory Hardware – Memory Controller
- Fully distributed memory interface
- Stacked I/O pad design
- Runs independently of compute and texture units
- Highlights:
  - Over 100 GB/s memory bandwidth
  - Achieved via last generation technology
  - Eight 64-bit memory channels
  - Kilobit ring bus
  - Lower frequencies required
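The bandwidth and channel figures can be related with a little arithmetic. The Python sketch below derives the total bus width from the eight 64-bit channels and computes the transfer rate that 100 GB/s would imply. The helper names are made up for the example, 1 GB is taken as 10^9 bytes, and no actual memory clock is assumed because the slide does not state one.

```python
CHANNELS = 8
CHANNEL_WIDTH_BITS = 64

bus_width_bits = CHANNELS * CHANNEL_WIDTH_BITS   # 512-bit interface
bytes_per_transfer = bus_width_bits // 8         # 64 bytes per transfer


def bandwidth_gb_per_s(transfers_per_second):
    """Peak bandwidth in GB/s (10^9 bytes) for a given transfer rate."""
    return bytes_per_transfer * transfers_per_second / 1e9


def required_transfer_rate(target_gb_per_s):
    """Transfers per second needed to reach a target bandwidth."""
    return target_gb_per_s * 1e9 / bytes_per_transfer


if __name__ == "__main__":
    rate = required_transfer_rate(100)
    print(f"{bus_width_bits}-bit bus needs {rate / 1e9:.4f} GT/s for 100 GB/s")
```

A wide interface reaches a given bandwidth at a lower transfer rate than a narrow one, which matches the slide's "Lower frequencies required" point.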

13 Memory Hardware – Texture Unit
- Four 32KB four-way associative L1 caches
- L1 cache size is 4x8KB per SIMD Engine
- Data is split across all four 8KB L1 caches
- L1 cache line is 256 bytes or 4 quads of data
- 256KB unified cache over all SIMDs
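The cache figures on this slide can be cross-checked with simple arithmetic. The Python sketch below uses only the slide's numbers plus the four-threads-per-quad figure from the wavefront-execution slide; the constant names are chosen for the example.

```python
SIMD_ENGINES = 4
L1_BANKS_PER_SIMD = 4
L1_BANK_BYTES = 8 * 1024
CACHE_LINE_BYTES = 256
QUADS_PER_LINE = 4
THREADS_PER_QUAD = 4

l1_bytes_per_simd = L1_BANKS_PER_SIMD * L1_BANK_BYTES      # 32 KB
l1_bytes_total = SIMD_ENGINES * l1_bytes_per_simd          # 128 KB
lines_per_bank = L1_BANK_BYTES // CACHE_LINE_BYTES         # 32 lines
threads_per_line = QUADS_PER_LINE * THREADS_PER_QUAD       # 16 threads
bytes_per_thread = CACHE_LINE_BYTES // threads_per_line    # 16 bytes

if __name__ == "__main__":
    print(l1_bytes_per_simd, l1_bytes_total, lines_per_bank,
          threads_per_line, bytes_per_thread)
```

Sixteen bytes per thread is consistent with one four-component 32-bit value per thread, although the slide does not state the element format.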

14 Memory Hardware – TEX Scheduling
- Run independently of ALU units
- Run on core/engine clocks
- Process multiple wavefronts sequentially to hide latency
- Transfers data from cache to registers
- Latency is predictable for L1 cache hits

15 Memory Hardware - Tiling
- Multiple tiling formats
- Micro-tiling and macro-tiling
- CAL tiled format is micro-tiled, macro-tiled
  - Quad based hierarchical Z pattern
- CAL linear format is micro-tiled, macro-linear
  - Tiled quad based linear format
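A hierarchical Z pattern is commonly built by interleaving the bits of the x and y coordinates (Z-order, or Morton order). The Python sketch below shows that generic technique next to a plain row-major index for contrast. It is not the exact CAL or hardware tile layout, which the slide does not specify; the function names are hypothetical.

```python
def z_order_index(x, y, bits=16):
    """Interleave the bits of x and y (x in even bit positions)."""
    index = 0
    for bit in range(bits):
        index |= ((x >> bit) & 1) << (2 * bit)
        index |= ((y >> bit) & 1) << (2 * bit + 1)
    return index


def linear_index(x, y, width):
    """Row-major index, shown for comparison."""
    return y * width + x


if __name__ == "__main__":
    # Print a 4x4 block in Z-order; each 2x2 group holds consecutive indices.
    for y in range(4):
        print(" ".join(f"{z_order_index(x, y):2d}" for x in range(4)))
```

In Z-order every aligned 2x2 block of elements occupies four consecutive positions, so a quad of neighbouring threads touches one contiguous region, whereas row-major order places vertical neighbours a full row apart.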

16 Memory Hardware – Backends
- Also called ROPs (Raster Operators)
- Outputs data to memory via color registers
- Maximum 8 outputs
- 4 Backend units
- 256B output width
- 32KB write cache/unit
- 32 Pixels/Clk
[Diagram labels: Memory Controller, DPP Array]

17 ATI Radeon™ HD 2400 / 2600 Series GPUs
- ATI Radeon™ HD 2400 Series GPU
  - 40 Stream Processors
  - 2 SIMD Engines
  - 4 Thread Processors/SIMD
  - 1 Texture Unit
  - 1 Render Backend
- ATI Radeon™ HD 2600 Series GPU
  - 120 Stream Processors
  - 3 SIMD Engines
  - 8 Thread Processors/SIMD
  - 2 Texture Units
  - 1 Render Backend
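The stream processor counts for the two smaller parts follow the same product as on the HD 2900: SIMD engines times thread processors per SIMD times streaming cores per thread processor. The sketch below checks the slide's numbers. The `GpuConfig` class is a hypothetical helper, and it assumes the five-cores-per-thread-processor figure from the thread-processor slide also applies to these parts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GpuConfig:
    name: str
    simd_engines: int
    thread_processors_per_simd: int
    # Assumption: five streaming cores per thread processor, as on the HD 2900.
    cores_per_thread_processor: int = 5

    @property
    def stream_processors(self):
        return (self.simd_engines
                * self.thread_processors_per_simd
                * self.cores_per_thread_processor)


HD2400 = GpuConfig("ATI Radeon HD 2400", simd_engines=2,
                   thread_processors_per_simd=4)
HD2600 = GpuConfig("ATI Radeon HD 2600", simd_engines=3,
                   thread_processors_per_simd=8)

if __name__ == "__main__":
    for gpu in (HD2400, HD2600):
        print(f"{gpu.name}: {gpu.stream_processors} stream processors")
```

Both products match the totals listed on the slide (40 and 120), which supports the assumption about cores per thread processor.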

18 Disclaimer & Attribution

DISCLAIMER
The information presented in this document is for informational purposes only and may contain technical inaccuracies, omissions and typographical errors. The information contained herein is subject to change and may be rendered inaccurate for many reasons, including but not limited to product and roadmap changes, component and motherboard version changes, new model and/or product releases, product differences between differing manufacturers, software changes, BIOS flashes, firmware upgrades, or the like. AMD assumes no obligation to update or otherwise correct or revise this information. However, AMD reserves the right to revise this information and to make changes from time to time to the content hereof without obligation of AMD to notify any person of such revisions or changes.

AMD MAKES NO REPRESENTATIONS OR WARRANTIES WITH RESPECT TO THE CONTENTS HEREOF AND ASSUMES NO RESPONSIBILITY FOR ANY INACCURACIES, ERRORS OR OMISSIONS THAT MAY APPEAR IN THIS INFORMATION. AMD SPECIFICALLY DISCLAIMS ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR ANY PARTICULAR PURPOSE. IN NO EVENT WILL AMD BE LIABLE TO ANY PERSON FOR ANY DIRECT, INDIRECT, SPECIAL OR OTHER CONSEQUENTIAL DAMAGES ARISING FROM THE USE OF ANY INFORMATION CONTAINED HEREIN, EVEN IF AMD IS EXPRESSLY ADVISED OF THE POSSIBILITY OF SUCH DAMAGES.

ATTRIBUTION
© 2010 Advanced Micro Devices, Inc. All rights reserved. AMD, the AMD Arrow logo, ATI, the ATI Logo, Radeon, and combinations thereof are trademarks of Advanced Micro Devices, Inc. Other names are for informational purposes only and may be trademarks of their respective owners.

