Is There a Real Difference between DSPs and GPUs?


Is There a Real Difference between DSPs and GPUs? by Stephanie Mitchell and Tim Knudtson

Main Topics
- Examples Used in this Presentation
- D.S.P. Processor
- Features of the D.S.P. Processor
- D.S.P. Architecture
- D.S.P. Programming
- G.P.U. Processor
- Features of the G.P.U. Processor
- G.P.U. Architecture
- G.P.U. Programming
- Conclusions

Examples Used in this Presentation
Information is given for the following processors:
- Digital Signal Processor (DSP): TigerSHARC
- Graphics Processor (GPU): Nvidia GeForce Series 6

D.S.P. Processor
A digital signal processor (DSP) is a specialized microprocessor designed specifically for digital signal processing, generally in real time. Programmable digital signal processors (DSPs) are tuned to efficiently execute the computationally intensive loops that typically characterize digital signal processing algorithms (e.g., FIR and IIR filters).

Features of the D.S.P. Processor
- Designed for real-time processing
- Optimum performance with streaming data
- Separate program and data memories (Harvard architecture)
- Special instructions for SIMD operations
- No hardware support for multitasking
- The ability to act as a direct memory access device if in a host environment

D.S.P. Architecture
Memory architecture
DSPs often use special memory architectures that are able to fetch multiple data and/or instructions at the same time:
- Harvard architecture
- Modified von Neumann architecture
- Use of direct memory access
- Memory-address calculation unit

D.S.P. Architecture … continued
Data operations
- Saturation arithmetic: operations that produce overflows clamp at the maximum (or minimum) value that the register can hold rather than wrapping around. Maximum + 1 does not wrap to minimum as in many general-purpose CPUs; it stays at maximum.
- Fixed-point arithmetic is often used to speed up arithmetic processing.
- Single-cycle operations increase the benefits of pipelining.
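The saturation and fixed-point behaviour described on this slide can be sketched in portable C. This is only an illustration: the function names `sat_add16` and `q15_mul` are hypothetical, the Q15 format (16-bit values scaled by 2^15) is an assumed example of a fixed-point format, and a real DSP would perform each of these in a single hardware instruction rather than in software.

```c
#include <stdint.h>

/* Saturating 16-bit add (hypothetical helper): the result clamps at
 * INT16_MAX or INT16_MIN instead of wrapping around. */
int16_t sat_add16(int16_t a, int16_t b)
{
    int32_t sum = (int32_t)a + (int32_t)b;  /* widen so the sum itself cannot overflow */
    if (sum > INT16_MAX) return INT16_MAX;
    if (sum < INT16_MIN) return INT16_MIN;
    return (int16_t)sum;
}

/* Q15 fixed-point multiply (assumed format: value = raw / 32768).
 * Relies on arithmetic right shift of negative values, which is
 * implementation-defined in C but what common compilers provide. */
int16_t q15_mul(int16_t a, int16_t b)
{
    int32_t prod = (int32_t)a * (int32_t)b;  /* Q30 intermediate */
    prod += 1 << 14;                          /* round to nearest */
    prod >>= 15;                              /* scale back to Q15 */
    if (prod > INT16_MAX) return INT16_MAX;   /* only (-1.0 * -1.0) saturates */
    return (int16_t)prod;
}
```

For example, `sat_add16(32767, 1)` stays at 32767, and `q15_mul(16384, 16384)` (0.5 times 0.5) gives 8192 (0.25).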

D.S.P. Programming
- Floating-point unit integrated directly into the data path
- Special looping hardware: low-overhead or zero-overhead looping capability
- Multiply-accumulate (MAC) operations, which are good for all kinds of matrix operations, such as convolution for filtering, dot product, or even polynomial evaluation
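As a rough sketch of why MAC matters, one output sample of an FIR filter is just a dot product, that is, one multiply-accumulate per tap. The function name `fir_mac` and the 16-bit sample type are assumptions for illustration; the 64-bit accumulator stands in for the wide accumulator with guard bits that DSPs typically provide.

```c
#include <stddef.h>
#include <stdint.h>

/* One FIR output sample as a chain of multiply-accumulates:
 * acc = h[0]*x[0] + h[1]*x[1] + ... + h[taps-1]*x[taps-1].
 * On a DSP, each loop iteration would map to a single MAC instruction
 * and the loop itself to zero-overhead looping hardware. */
int64_t fir_mac(const int16_t *h, const int16_t *x, size_t taps)
{
    int64_t acc = 0;  /* wide accumulator so partial sums do not overflow */
    for (size_t k = 0; k < taps; ++k) {
        acc += (int32_t)h[k] * (int32_t)x[k];
    }
    return acc;
}
```

With coefficients {1, 2, 3} and samples {4, 5, 6}, the result is 1*4 + 2*5 + 3*6 = 32.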

D.S.P. Programming … continued
- Instructions to increase parallelism: SIMD, VLIW, superscalar architecture
- Specialized instructions for modulo addressing in ring buffers and a bit-reversed addressing mode for FFT cross-referencing
- Digital signal processors sometimes use time-stationary encoding to simplify hardware and increase coding efficiency
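The two addressing modes named on this slide can be emulated in software, which shows what the DSP address unit does for free. This is a minimal sketch: `RING_SIZE`, `ring_next`, and `bit_reverse` are hypothetical names, and the ring size is assumed to be a power of two so that wrap-around reduces to a bit mask.

```c
#include <stdint.h>

#define RING_SIZE 8u  /* assumed power-of-two buffer length */

/* Modulo addressing: advance an index and wrap it back to the start
 * of the ring buffer without an explicit branch. */
uint32_t ring_next(uint32_t index)
{
    return (index + 1u) & (RING_SIZE - 1u);
}

/* Bit-reversed addressing: reverse the lowest `bits` bits of `value`.
 * A radix-2 FFT of length 2^bits reads or writes its data in this order. */
uint32_t bit_reverse(uint32_t value, unsigned bits)
{
    uint32_t reversed = 0;
    for (unsigned i = 0; i < bits; ++i) {
        reversed = (reversed << 1) | (value & 1u);  /* move lowest bit to the other end */
        value >>= 1;
    }
    return reversed;
}
```

For an 8-point FFT (3 index bits), index 1 (binary 001) maps to 4 (binary 100), and index 3 (011) maps to 6 (110).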

G.P.U. Processor
A Graphics Processing Unit, or GPU (also occasionally called a Visual Processing Unit, or VPU), is a dedicated graphics rendering device for a personal computer, workstation, or game console. The GPU is the main processing unit on every graphics card used in computers and game consoles.

Features of the G.P.U. Processor
- GPU architecture offers a large degree of parallelism.
- It supports Single Instruction, Multiple Data (SIMD).
- Most GPUs have two different types of processing units:
  - Vertex processor (or vertex shader): responsible for mathematical operations on vertices
  - Pixel (or fragment) processor: responsible for texturing operations
- A third stage handles detailed processing and may differ from one architecture to another.

G.P.U. Architecture
Processing Unit
- Focus on floating-point math
- fp32 and fp16 precision support for intermediate calculations
- 6 four-wide fp32 vector MADs in the vertex shaders, plus 1 scalar multifunction op
- 16 four-wide fp32 vector MADs in the fragment processors, plus 16 four-wide fp32 MULs
- Dedicated fp16 normalization hardware
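To make "four-wide fp32 vector" concrete, the sketch below models one four-wide operation in plain C. The choice of a multiply-add as the operation, the `Float4` type, and the `mad4` name are assumptions for illustration; on the GPU all four lanes are computed by one instruction, not by a loop.

```c
/* Four fp32 lanes, e.g. an (x, y, z, w) position or an RGBA color.
 * Hypothetical type for illustration. */
typedef struct {
    float v[4];
} Float4;

/* Four-wide multiply-add: result[i] = a[i] * b[i] + c[i] for each lane.
 * The loop only emulates what the hardware does in a single step. */
Float4 mad4(Float4 a, Float4 b, Float4 c)
{
    Float4 result;
    for (int i = 0; i < 4; ++i) {
        result.v[i] = a.v[i] * b.v[i] + c.v[i];
    }
    return result;
}
```

For instance, with a = (1, 2, 3, 4), b = (2, 2, 2, 2), and c = (1, 1, 1, 1), the result is (3, 5, 7, 9).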

G.P.U. Architecture … continued
Memory
- Uses dedicated but standard memory architectures (e.g., DRAM)
- Multiple small, independent memory partitions for improved latency
- Memory is used to store buffers and, optionally, textures
- In low-end systems (e.g., Intel 855GM), system memory is shared as the graphics memory

G.P.U. Architecture … continued
Cache
- Texture caches (2 level)
  - Shared between vertex processors and fragment processors
  - Cache processed/filtered textures
- Vertex caches
  - Cache processed and unprocessed vertices
  - Improve computation and fetch performance
- Z and buffer cache and write queues

G.P.U. Programming
Optimization
- Texture caches (2 level)
- Super-scalability, resulting in high parallelism
- SIMD (single instruction, multiple data) structure
- RISC (reduced instruction set computer) architecture
- Neither a special board design nor an extra high-speed data link is necessary
- A programmable pipeline (shading and lighting calculations are programmed by the user)
- Running non-graphical applications on GPUs is known as GPGPU, or General-Purpose Computation on GPUs.

Conclusions
The answer to the title of this presentation, "Is There a Real Difference between DSPs and GPUs?": there is no "real" difference, simply because these two technologies are always in competition with one another and both architectures offer a large degree of parallelism at a relatively low cost. But …

Conclusions … continued
Their pipelines have different units.
The GPU is a specialist in gaming graphics, so:
- Vertex Unit = transforms primitives from a global 3D coordinate system into a 2D coordinate system
- Rasterizer Unit = primitives are converted into square fragments
- Fragment Unit = the final color for each fragment is computed (e.g., from textures)
- Composing Unit = fragments are combined with the current rendering
The DSP is a specialist in digital signal processing, so:
- Data ALU Unit = performs multiply/accumulate and other ALU operations
- AGU Unit = performs memory operand address calculation
- Program Control Pipeline (PCP) Unit = performs all other instructions (branches, loops, bit tests, etc.)
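As a rough illustration of the Vertex Unit's job, the sketch below projects one 3D vertex onto a 2D image plane. It assumes a simple pinhole camera at the origin looking along +z with focal length `f`, and a vertex in front of the camera (z > 0); the names `Vec3`, `Vec2`, and `project_vertex` are hypothetical, and a real vertex unit applies full 4x4 model, view, and projection matrices instead.

```c
/* Hypothetical types for illustration. */
typedef struct { float x, y, z; } Vec3;
typedef struct { float x, y; } Vec2;

/* Perspective projection under the assumed pinhole model:
 * points farther away (larger z) land closer to the image center.
 * The caller must ensure v.z > 0. */
Vec2 project_vertex(Vec3 v, float f)
{
    Vec2 p;
    p.x = f * v.x / v.z;
    p.y = f * v.y / v.z;
    return p;
}
```

For example, the vertex (2, 4, 2) with f = 1 projects to (1, 2).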
