Presentation on theme: "Is There a Real Difference between DSPs and GPUs?"— Presentation transcript:
1. Is There a Real Difference between DSPs and GPUs? by Stephanie Mitchell and Tim Knudtson
2. Main Topics
- Examples Used in this Presentation
- D.S.P. Processor
- Features of the D.S.P. Processor
- D.S.P. Architecture
- D.S.P. Programming
- G.P.U. Processor
- Features of the G.P.U. Processor
- G.P.U. Architecture
- G.P.U. Programming
- Conclusions
3. Examples Used in this Presentation
Information is given for the following processors:
- Digital Signal Processor (DSP): TigerSHARC
- Graphics Processor (GPU): Nvidia GeForce 6 Series
4. D.S.P. Processor
A digital signal processor (DSP) is a specialized microprocessor designed specifically for digital signal processing, generally in real time.
Programmable digital signal processors (DSPs) are tuned to efficiently execute the computationally intensive loops that typically characterize digital signal processing algorithms (e.g. FIR and IIR filters).
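As an illustration of the kind of loop the slide above refers to, here is a minimal sketch of a direct-form FIR filter. The function name `fir_filter` and the zero-padding of samples before the start of the input are assumptions made for this example; a real DSP would run the inner loop with dedicated multiply-accumulate and looping hardware rather than in software like this.

```python
def fir_filter(samples, coeffs):
    """Direct-form FIR filter: y[n] = sum over k of coeffs[k] * x[n - k].

    Samples before the start of the input are treated as zero
    (an assumption made for this sketch).
    """
    output = []
    for n in range(len(samples)):
        acc = 0.0
        # Inner loop: one multiply-accumulate per filter tap.
        for k, c in enumerate(coeffs):
            if n - k >= 0:
                acc += c * samples[n - k]
        output.append(acc)
    return output


if __name__ == "__main__":
    # Two-tap moving average over a short ramp.
    print(fir_filter([1, 2, 3, 4], [0.5, 0.5]))
```

The inner loop does one multiply and one add per tap, which is the pattern DSP hardware is built to execute once per cycle.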
5. Features of the D.S.P. Processor
- Designed for real-time processing
- Optimum performance with streaming data
- Separate program and data memories (Harvard architecture)
- Special instructions for SIMD operations
- No hardware support for multitasking
- The ability to act as a direct memory access device when used in a host environment
6. D.S.P. Architecture
Memory architecture
DSPs often use special memory architectures that are able to fetch multiple data and/or instructions at the same time:
- Harvard architecture
- Modified von Neumann architecture
- Use of direct memory access
- Memory-address calculation unit
7. D.S.P. Architecture … continued
Data operations
- Saturation arithmetic: operations that would overflow saturate at the maximum (or minimum) value the register can hold rather than wrapping around (maximum + 1 stays at maximum instead of wrapping to minimum, as it does in many general-purpose CPUs).
- Fixed-point arithmetic is often used to speed up arithmetic processing.
- Single-cycle operations increase the benefits of pipelining.
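The difference between saturating and wrapping arithmetic described above can be sketched as follows. The 16-bit register width, the Q15 fixed-point format, and all function names are assumptions chosen for illustration; the slides do not name a specific word size or number format.

```python
INT16_MIN = -32768
INT16_MAX = 32767


def saturating_add16(a, b):
    """Add two 16-bit values, clamping the result to the 16-bit range."""
    return max(INT16_MIN, min(INT16_MAX, a + b))


def wrapping_add16(a, b):
    """Add two 16-bit values with two's-complement wrap-around."""
    return (a + b + 32768) % 65536 - 32768


def q15_mul(a, b):
    """Multiply two Q15 fixed-point values (1.0 is represented as 32768).

    The product is shifted back by 15 bits and saturated, because
    -1.0 * -1.0 would otherwise exceed the largest Q15 value.
    """
    return max(INT16_MIN, min(INT16_MAX, (a * b) >> 15))


if __name__ == "__main__":
    print(saturating_add16(INT16_MAX, 1))  # stays at the maximum
    print(wrapping_add16(INT16_MAX, 1))    # wraps to the minimum
    print(q15_mul(16384, 16384))           # 0.5 * 0.5 in Q15
```

For audio or sensor data, clipping at the maximum is usually a far smaller error than a sign flip from wrap-around, which is one reason saturation is attractive in signal processing.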
8. D.S.P. Programming
- Floating-point unit integrated directly into the data path
- Special looping hardware: low-overhead or zero-overhead looping capability
- Multiply-accumulate (MAC) operations, which are well suited to matrix-style computations such as convolution for filtering, dot products, and polynomial evaluation
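To show why a single MAC primitive covers the uses listed above, the sketch below builds a dot product and a polynomial evaluation (Horner's rule) from one helper. The helper `mac` and the other function names are hypothetical; on a DSP the same step would be a single hardware instruction rather than a function call.

```python
def mac(acc, a, b):
    """One multiply-accumulate step: acc + a * b."""
    return acc + a * b


def dot_product(xs, ys):
    """Dot product as a chain of MAC steps."""
    acc = 0
    for x, y in zip(xs, ys):
        acc = mac(acc, x, y)
    return acc


def horner(coeffs, x):
    """Evaluate a polynomial with Horner's rule.

    coeffs are ordered from the highest-degree term down to the constant.
    Each step computes c + acc * x, which is again one MAC.
    """
    acc = 0
    for c in coeffs:
        acc = mac(c, acc, x)
    return acc


if __name__ == "__main__":
    print(dot_product([1, 2, 3], [4, 5, 6]))
    print(horner([2, 0, 1], 3))  # 2*x**2 + 1 evaluated at x = 3
```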
9. D.S.P. Programming … continued
- Instructions to increase parallelism: SIMD, VLIW, superscalar architecture
- Specialized instructions for modulo addressing in ring buffers and a bit-reversed addressing mode for FFT cross-referencing
- Digital signal processors sometimes use time-stationary encoding to simplify hardware and increase coding efficiency
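The two addressing modes mentioned above can be emulated in software as a rough sketch. The class `RingBuffer` and the function `bit_reverse` are illustrative names, not part of any DSP toolchain; on real hardware the address unit performs these index calculations without extra instructions.

```python
def bit_reverse(index, bits):
    """Reverse the lowest `bits` bits of index (the reordering a radix-2 FFT uses)."""
    result = 0
    for _ in range(bits):
        result = (result << 1) | (index & 1)
        index >>= 1
    return result


class RingBuffer:
    """Fixed-size circular buffer using modulo addressing."""

    def __init__(self, size):
        self.data = [0] * size
        self.head = 0  # index of the next write position

    def push(self, value):
        self.data[self.head] = value
        # Modulo addressing: the write index wraps back to 0 at the end.
        self.head = (self.head + 1) % len(self.data)

    def latest(self, k):
        """Return the k-th most recent sample (k = 0 is the newest)."""
        return self.data[(self.head - 1 - k) % len(self.data)]


if __name__ == "__main__":
    print([bit_reverse(i, 3) for i in range(8)])
    buf = RingBuffer(4)
    for sample in range(1, 7):
        buf.push(sample)
    print(buf.latest(0), buf.latest(3))
```

A ring buffer like this is how a filter keeps its most recent input samples without ever shifting data in memory.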
10. G.P.U. Processor
A Graphics Processing Unit or GPU (also occasionally called a Visual Processing Unit or VPU) is a dedicated graphics rendering device for a personal computer, workstation, or game console.
The GPU is the main processing unit on every graphics card used in computers or game consoles.
11. Features of the G.P.U. Processor
- GPU architecture offers a large degree of parallelism.
- It supports Single Instruction, Multiple Data (SIMD) execution.
- Most GPUs have two different types of processing units:
  - Vertex processor (or vertex shader): responsible for mathematical operations
  - Pixel (or fragment) processor: responsible for texturing operations
- A third stage handles detailed processing and may vary from one architecture to another.
12. G.P.U. Architecture
Processing unit
- Focus on floating-point math
- fp32 and fp16 precision support for intermediate calculations
- 6 four-wide fp32 vector operations in the vertex shaders, plus 1 scalar multifunction op
- 16 four-wide fp32 vector operations in the fragment processors, plus 16 four-wide fp32 MULs
- Dedicated fp16 normalization hardware
13. G.P.U. Architecture … continued
Memory
- Uses dedicated but standard memory architectures (e.g. DRAM)
- Multiple small, independent memory partitions for improved latency
- Memory is used to store buffers and, optionally, textures
- In low-end systems (e.g. Intel 855GM), system memory is shared as graphics memory
14. G.P.U. Architecture … continued
Cache
- Texture caches (2 levels)
  - Shared between vertex processors and fragment processors
  - Cache processed/filtered textures
- Vertex caches
  - Cache processed and unprocessed vertices
  - Improve computation and fetch performance
- Z and buffer cache and write queues
15. G.P.U. Programming
Optimization
- Texture caches (2 levels)
- Super-scalability resulting in high parallelism
- SIMD (single instruction, multiple data) structure
- RISC (reduced instruction set computer) architecture
- Neither a board design nor an extra high-speed data link is necessary
- A programmable pipeline (shading and lighting calculations programmed by the user)
Running non-graphical applications on GPUs is known as GPGPU, or General-Purpose computation on GPUs.
16. Is There a Real Difference between DSPs and GPUs? Conclusions
The answer to the title of this presentation, "Is There a Real Difference between DSPs and GPUs?":
There is no ‘real’ difference, simply because these two technologies are always in competition with one another and both architectures offer a large degree of parallelism at a relatively low cost.
But …
17. Conclusions … continued
Their pipelines have different units.
The GPU is a specialist in gaming graphics, so it has:
- Vertex Unit: transforms primitives from the global 3D coordinate system into a 2D coordinate system
- Rasterizer Unit: converts primitives into square fragments
- Fragment Unit: computes the final color for each fragment (e.g. by texturing)
- Composing Unit: combines fragments with the current rendering
The DSP is a specialist in digital signal processing, so it has:
- Data ALU: performs multiply/accumulate and other ALU operations
- Address Generation Unit (AGU): performs memory operand address calculation
- Program Control Pipeline (PCP) Unit: performs all other instructions (branches, loops, bit tests, etc.)
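As a rough illustration of the Vertex Unit step listed above (3D coordinates into 2D), here is a minimal perspective projection. The pinhole-camera model, the `focal_length` parameter, and the function name are assumptions for this sketch; a real vertex unit applies full 4x4 matrix transforms rather than this simplified divide.

```python
def project_vertex(point, focal_length=1.0):
    """Project a 3D point onto a 2D image plane by perspective divide.

    Assumes a camera at the origin looking along +z, so only points
    with z > 0 are visible.
    """
    x, y, z = point
    if z <= 0:
        raise ValueError("point must be in front of the camera (z > 0)")
    return (focal_length * x / z, focal_length * y / z)


if __name__ == "__main__":
    # A point twice as far away appears half as far from the image center.
    print(project_vertex((2.0, 4.0, 2.0)))
    print(project_vertex((2.0, 4.0, 4.0)))
```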
18. References
- P. Trancoso and M. Charalambous. Exploring Graphics Processor Performance for General Purpose Applications. Nicosia, Cyprus.
- M. Takefman and P. Chow. A Streamlined DSP Microprocessor Architecture. Toronto, Canada.
- M. Saghir, P. Chow, and C. Lee. Application-Driven Design of DSP Architectures and Compilers. Toronto, Canada.
- D. Geer. Taking the Graphics Processor Beyond Graphics. Published by the IEEE Computer Society. September 2005.