1
**CUDA Library and Demo**

Yafeng Yin, Lei Zhou, Hong Man 07/21/2010

2
**Outline**

- Basic CUDA computation library: GPULib, CUBLAS, CUFFT
- Advanced CUDA computation library: CULA/MAGMA, VSIPL
- CUDA FIR Demo (UMD)
- Discussion and future work

3
**Basic lib - GPULib**

GPULib provides a library of mathematical functions:
- addition, subtraction, multiplication, and division
- unary functions, including sin(), cos(), gamma(), and exp()
- interpolation, array reshaping, array slicing, and reduction operations

4
**Basic lib - CUBLAS**

BLAS: Basic Linear Algebra Subprograms

CUBLAS provides a set of functions for basic vector and matrix operations, such as matrix-vector copy, sort, dot product, and Euclidean norm.

Real data:
- Level 1 (vector-vector, O(N))
- Level 2 (matrix-vector, O(N^2))
- Level 3 (matrix-matrix, O(N^3))

Complex data:
- Level 1
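
As a rough illustration of why Level 1 routines are O(N), the sketch below writes the dot product and the Euclidean norm as plain Python loops. It is a CPU-side reference for the arithmetic only, not CUBLAS code; the helper names `dot` and `nrm2` are chosen here for illustration (the corresponding single-precision CUBLAS routines are `cublasSdot()` and `cublasSnrm2()`).

```python
import math


def dot(x, y):
    """Dot product: one multiply-add per element, so the cost is O(N)."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    return sum(a * b for a, b in zip(x, y))


def nrm2(x):
    """Euclidean norm sqrt(x . x): again a single O(N) pass over the data."""
    return math.sqrt(dot(x, x))


x = [3.0, 4.0]
y = [1.0, 2.0]
print(dot(x, y))  # 3*1 + 4*2 = 11.0
print(nrm2(x))    # sqrt(9 + 16) = 5.0
```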

5
**CUBLAS - Level 2 functions**

- cublasSgbmv(), cublasSgemv(): y = alpha * op(A) * x + beta * y
- cublasSger(): A = alpha * x * y^T + A
- cublasSsbmv(), cublasSspmv(), cublasSsymv(): y = alpha * A * x + beta * y
- cublasSspr(), cublasSsyr(): A = alpha * x * x^T + A
- cublasSspr2(), cublasSsyr2(): A = alpha * x * y^T + alpha * y * x^T + A
- cublasStbmv(): x = op(A) * x
- cublasStbsv(): solves op(A) * x = b, output x
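
To make two of the formulas above concrete, here is a minimal pure-Python sketch of the matrix-vector update y = alpha * op(A) * x + beta * y and the rank-1 update A = alpha * x * y^T + A. It is a reference for the arithmetic, not the CUBLAS interface: the function names `gemv` and `ger` and their argument order are assumptions made for this example, matrices are stored as lists of rows (CUBLAS itself uses column-major storage), and results are returned rather than written in place.

```python
def gemv(trans, alpha, A, x, beta, y):
    """Return alpha * op(A) * x + beta * y, with op(A) = A ('N') or A^T ('T')."""
    rows, cols = len(A), len(A[0])
    if trans == "N":
        out_len, in_len = rows, cols
        elem = lambda i, j: A[i][j]
    elif trans == "T":
        out_len, in_len = cols, rows
        elem = lambda i, j: A[j][i]
    else:
        raise ValueError("trans must be 'N' or 'T'")
    if len(x) != in_len or len(y) != out_len:
        raise ValueError("vector lengths do not match op(A)")
    return [
        alpha * sum(elem(i, j) * x[j] for j in range(in_len)) + beta * y[i]
        for i in range(out_len)
    ]


def ger(alpha, x, y, A):
    """Return the rank-1 update alpha * x * y^T + A."""
    return [
        [alpha * x[i] * y[j] + A[i][j] for j in range(len(y))]
        for i in range(len(x))
    ]


A = [[1.0, 2.0], [3.0, 4.0]]
print(gemv("N", 2.0, A, [1.0, 1.0], 1.0, [10.0, 20.0]))  # [16.0, 34.0]
print(gemv("T", 2.0, A, [1.0, 1.0], 1.0, [10.0, 20.0]))  # [18.0, 32.0]
print(ger(1.0, [1.0, 2.0], [3.0, 4.0], [[0.0, 0.0], [0.0, 0.0]]))  # [[3.0, 4.0], [6.0, 8.0]]
```

Each output element needs a pass over one row or column of A, which is where the O(N^2) cost of Level 2 routines comes from.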

6
**Basic lib - CUFFT**

CUFFT is the CUDA FFT library.
- Provides a simple interface for computing parallel FFTs on an NVIDIA GPU
- Allows users to leverage the floating-point power and parallelism of the GPU without having to develop a GPU-based FFT implementation
- cufftPlan1d(), cufftPlan2d(), cufftPlan3d(): create a 1D, 2D, or 3D FFT plan configuration for a specified signal size
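
For readers who want to see what a 1D transform computes, the following sketch evaluates the discrete Fourier transform directly in pure Python. It is an O(N^2) reference, not an FFT and not CUFFT code; the function name `dft` is an assumption for this example. Like CUFFT, it applies no normalization, so a forward transform followed by an inverse transform returns the input scaled by N.

```python
import cmath


def dft(signal, inverse=False):
    """Unnormalized DFT: X[k] = sum_t x[t] * exp(-+2*pi*i*k*t/N)."""
    n = len(signal)
    sign = 1 if inverse else -1
    return [
        sum(signal[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


x = [1.0, 2.0, 3.0, 4.0]
spectrum = dft(x)
roundtrip = dft(spectrum, inverse=True)

# The zero-frequency bin is the sum of the samples.
print(abs(spectrum[0] - 10.0) < 1e-9)
# Forward + inverse without normalization gives N * x.
print(all(abs(r - len(x) * v) < 1e-9 for r, v in zip(roundtrip, x)))
```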

7
**Advanced lib - CULA and MAGMA**

- CULA (GPU Accelerated Linear Algebra): provides LAPACK (Linear Algebra PACKage) functions on CUDA GPUs
- MAGMA (Matrix Algebra on GPU and Multicore Architectures): aims to develop a dense linear algebra library similar to LAPACK, but for heterogeneous/hybrid architectures and "Multicore+GPU" systems

8
**Advanced lib - CULA functions**

- Linear equation routines: solve a general system of linear equations AX = B
- Orthogonal factorizations: LQ, RQ factorization
- Least squares routines
- Symmetric and non-symmetric eigenvalue routines
- Singular value decomposition (SVD) routines
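
The general linear solve AX = B can be sketched on the CPU as Gaussian elimination with partial pivoting, the same approach LAPACK's general solver takes (LU factorization with row interchanges). The code below is a small pure-Python illustration for a single right-hand side; the function name `solve` is hypothetical and this is not the CULA interface.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    # Work on an augmented copy [A | b] so the inputs are left untouched.
    M = [[float(v) for v in A[i]] + [float(b[i])] for i in range(n)]
    for k in range(n):
        # Pick the row with the largest entry in column k as the pivot row.
        p = max(range(k, n), key=lambda r: abs(M[r][k]))
        if M[p][k] == 0.0:
            raise ValueError("matrix is singular")
        M[k], M[p] = M[p], M[k]
        # Eliminate column k from the rows below the pivot.
        for r in range(k + 1, n):
            f = M[r][k] / M[k][k]
            for c in range(k, n + 1):
                M[r][c] -= f * M[k][c]
    # Back substitution on the resulting upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


# The zero in the top-left corner forces a row swap.
print(solve([[0.0, 2.0], [1.0, 1.0]], [4.0, 3.0]))  # [1.0, 2.0]
```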

9
**Advanced lib - MAGMA**

LAPACK on CUDA GPUs:
- LU, QR, and Cholesky factorizations in both real and complex arithmetic (single and double)
- Linear solvers based on LU, QR, and Cholesky in real arithmetic (single and double)
- Mixed-precision iterative refinement solvers based on LU, QR, and Cholesky in real arithmetic
- Reduction to upper Hessenberg form in real arithmetic (single and double)
- MAGMA BLAS in real arithmetic (single and double)
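
Of the factorizations listed, Cholesky is the shortest to write down, so it serves as an example here. The sketch below factors a symmetric positive definite matrix A into L * L^T in pure Python. It is a textbook CPU version for illustration only, with a hypothetical function name `cholesky`; it says nothing about how MAGMA splits the work between CPU and GPU.

```python
import math


def cholesky(A):
    """Return lower-triangular L such that A = L * L^T (A symmetric positive definite)."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for j in range(n):
        # Diagonal entry: what is left of A[j][j] after removing earlier columns.
        d = A[j][j] - sum(L[j][k] ** 2 for k in range(j))
        if d <= 0.0:
            raise ValueError("matrix is not positive definite")
        L[j][j] = math.sqrt(d)
        # Entries below the diagonal in column j.
        for i in range(j + 1, n):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = (A[i][j] - s) / L[j][j]
    return L


print(cholesky([[4.0, 2.0], [2.0, 5.0]]))  # [[2.0, 0.0], [1.0, 2.0]]
```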

10
**Advanced lib - VSIPL**

VSIPL: Vector Signal Image Processing Library
- Generalized matrix product
- Fast FIR filtering
- Correlation
- Fast Fourier Transform
- QR decomposition
- Random number generation
- Elementwise arithmetic, logical, and comparison operators
- Linear algebra procedures
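
Since FIR filtering is also the subject of the demo later in the deck, a direct-form FIR filter is sketched here in pure Python. This is a sequential CPU reference under stated assumptions: the name `fir_filter` is hypothetical, samples before the start of the signal are treated as zero, and the direct sum shown is not necessarily how a "fast" FIR routine in VSIPL is implemented.

```python
def fir_filter(taps, signal):
    """Direct-form FIR: y[n] = sum_k taps[k] * signal[n - k], zero before the start."""
    out = []
    for n in range(len(signal)):
        acc = 0.0
        for k, h in enumerate(taps):
            if n - k >= 0:
                acc += h * signal[n - k]
        out.append(acc)
    return out


# Two-tap moving average.
print(fir_filter([0.5, 0.5], [2.0, 4.0, 6.0, 8.0]))  # [1.0, 3.0, 5.0, 7.0]
```

Each output sample depends only on the inputs and the taps, not on other outputs, so the outer loop is the part that can be spread across GPU threads.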

11
**CUDA library Summary**

Basic vector or matrix computation (GPULib, CUBLAS, CUFFT):
- vector or matrix addition, subtraction, multiplication, and division
- sin(), cos(), sort, dot product

Libraries that can be used for signal processing (CULA/MAGMA, VSIPL):
- LU, QR, and Cholesky factorizations
- Singular value decomposition (SVD)

12
**CUDA Demo (FIR)**

- GPU: NVIDIA GeForce 8600 GT
- CPU: Intel Duo CPU, 2.33 GHz
- Software: Visual Studio 2005

13
**CUDA Demo (FIR)**

Timing table, CPU + GPU vs. CPU only, for output counts of 1000, 10000, and 100000. Column headers: Output No., GPU Run, Memory, Time (msec), Total Time.

14
**CUDA Demo (FIR)**

15
**Discussion and future work**

- How to connect CUDA to the SSP re-hosting demo
- How to convert the sequentially executed code in the signal processing system to CUDA code
- How to translate the XML code to CUDA code to generate the CUDA input

16
**Reference**

CUDA Zone: http://www.nvidia.com/object/cuda_home_new.html
