Acceleration of software package "R" using GPU's Sachinthaka Abeywardana.

1 Acceleration of software package "R" using GPU's Sachinthaka Abeywardana

2 CSIRO. Introduction to Graphics Processing Units (GPU)

3 Introduction to GPU contd.

4 Introduction to R and BLAS
- R: Statistical Package, Graphics
- BLAS (Basic Linear Algebra Subprograms): Vector-Vector Addition/Multiplication etc., Vector-Matrix Addition/Multiplication etc., Matrix-Matrix Addition/Multiplication etc.
- LAPACK (Linear Algebra Package)

5 What has been done in this project
Aim: Replace Rblas.dll with a faster BLAS library.
[Diagram: R, LAPACK, BLAS; a new BLAS replaces the existing BLAS]

6 How New Rblas.dll was created
[Diagram: Rblas.dll; CUBLAS library; C program wrapper; FORTRAN call; Initialise call]
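The slide names a C program wrapper, a FORTRAN call, an initialise call and the CUBLAS library. One plausible reading is that the wrapper exposes the FORTRAN-style BLAS entry points R expects and forwards them to the GPU library after a one-time initialisation. The sketch below illustrates only that calling convention, under stated assumptions: `sketch_dgemm_`, `backend_init` and `backend_gemm` are hypothetical names, the backend is a plain C loop standing in for the GPU call so that the example runs without CUDA, and only the no-transpose case is handled. It is not the author's actual wrapper.

```c
#include <assert.h>

/* Hypothetical stand-in for the GPU routine. A real wrapper would copy the
   matrices to the device and call the CUBLAS GEMM routine here instead. */
static void backend_gemm(int m, int n, int k, double alpha,
                         const double *a, int lda,
                         const double *b, int ldb,
                         double beta, double *c, int ldc)
{
    for (int j = 0; j < n; ++j) {
        for (int i = 0; i < m; ++i) {
            double acc = 0.0;
            for (int p = 0; p < k; ++p) {
                /* Column-major storage, as FORTRAN BLAS expects. */
                acc += a[i + p * lda] * b[p + j * ldb];
            }
            /* BLAS convention: when beta is zero, C is not read. */
            c[i + j * ldc] = (beta == 0.0)
                ? alpha * acc
                : alpha * acc + beta * c[i + j * ldc];
        }
    }
}

/* Hypothetical one-time setup, mirroring the "Initialise call" on the slide. */
static int backend_ready = 0;
static void backend_init(void)
{
    if (!backend_ready) {
        /* A real wrapper would initialise the GPU library here. */
        backend_ready = 1;
    }
}

/* FORTRAN-callable entry point: every argument is passed by reference and
   the symbol carries a trailing underscore. Computes C = alpha*A*B + beta*C. */
void sketch_dgemm_(const char *transa, const char *transb,
                   const int *m, const int *n, const int *k,
                   const double *alpha, const double *a, const int *lda,
                   const double *b, const int *ldb,
                   const double *beta, double *c, const int *ldc)
{
    /* This sketch supports only the no-transpose case. */
    assert(*transa == 'N' || *transa == 'n');
    assert(*transb == 'N' || *transb == 'n');
    backend_init();
    backend_gemm(*m, *n, *k, *alpha, a, *lda, b, *ldb, *beta, c, *ldc);
}
```

A caller passes pointers for every scalar, for example `sketch_dgemm_("N", "N", &n, &n, &n, &one, a, &n, b, &n, &zero, c, &n)` for two n x n matrices stored column by column.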

7 Results for 1000 x 1000 Matrices

| Operation | CPU Average (s) | GPU Average (s), Single Precision | GPU Average (s), Double Precision |
| --- | --- | --- | --- |
| 3.2 * A %*% B + 4.1 * A (3.2 A x B + 4.1 B) | 1.9335 | 0.2375 | 0.123 |
| A%*%B (Matrix A x matrix B) | 1.8855 | 0.176 | 0.092 |
| t(A)%*%B (Transpose matrix A x Matrix B) | 1.9135 | 0.207 | 0.089 |
| solve(A) (Invert Matrix A) | 2.227 | 4.69 | 5.288 |

8 Improvements

| Operation | Single Precision (%) | Double Precision (%) |
| --- | --- | --- |
| 3.2 * A %*% B + 4.1 * A | 814.1052632 | 1571.95122 |
| A%*%B | 1071.306818 | 2049.456522 |
| t(A)%*%B | 924.3961353 | 2150 |
| solve(A) | -210.597216 | -237.4494836 |
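The improvement percentages appear consistent with the timings on the previous slide if a speed-up is reported as CPU time divided by GPU time, times 100, and a slow-down as the negated ratio of GPU time to CPU time, times 100. The slides do not state this formula, so the helper below is an inferred sketch; `improvement_percent` is a hypothetical name.

```c
#include <assert.h>

/* Hypothetical helper: signed improvement percentage inferred from the
   figures on the slides. Positive when the GPU is faster, negative when
   it is slower. */
double improvement_percent(double cpu_seconds, double gpu_seconds)
{
    assert(cpu_seconds > 0.0 && gpu_seconds > 0.0);
    if (gpu_seconds <= cpu_seconds) {
        /* GPU faster: how many times faster, as a percentage. */
        return 100.0 * cpu_seconds / gpu_seconds;
    }
    /* GPU slower: how many times slower, as a negative percentage. */
    return -100.0 * gpu_seconds / cpu_seconds;
}
```

With the slide's timings, `improvement_percent(1.8855, 0.176)` gives roughly 1071.3 for A%*%B in single precision, and `improvement_percent(2.227, 4.69)` gives roughly -210.6 for solve(A).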

9 Who to Blame
A. Simply random?
B. Me???
C. Stupid Computer?
D. Memory allocation.

10 Nvidia GPU Architecture

11 Nvidia GPU Architecture contd.

12 Nvidia GPU Architecture contd.



15 Comparison with ATLAS Rblas
- Improvement on multiplication, A%*%B: 319%
- Improvement on inverting matrix, solve(A): 281%
(source: …)
Limitations of ATLAS: latest version is for Pentium 4 only

16 Limitations of this Project
- Specific card
- Cost: GeForce GTX 280, $582 (Source: …)
- Precision? RMS of 6.350072e-06 for inverting a 1024 x 1024 matrix for the single precision cards.
- IEEE 754 deviations
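The slide does not say how the RMS figure was obtained. A common approach, assumed here, is to compare each element of the single-precision result with a double-precision reference. The sketch below computes the mean of the squared differences; the RMS is the square root of that value. `mean_squared_diff` is a hypothetical name, and the inputs would be the two inverted matrices flattened into arrays.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: mean squared difference between a single-precision
   result and a double-precision reference of the same length.
   The RMS reported on the slide would be the square root of this value. */
double mean_squared_diff(const float *single_result,
                         const double *reference, size_t n)
{
    assert(n > 0);
    double sum = 0.0;
    for (size_t i = 0; i < n; ++i) {
        /* Promote the float to double before subtracting. */
        double d = (double)single_result[i] - reference[i];
        sum += d * d;
    }
    return sum / (double)n;
}
```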

17 Where can I get this from

18 Where to from now?
- Implementation of more BLAS functions
- Getting rid of overhead
- Adjusting LAPACK
- Double precision to Single Precision and Single to Double Conversion
- Parallel Extensions (CPU)

19 Thank You
Luke Domanski, Dadong Wang, Pascal Valotton, Glenn Stone, Robert Dunne, CMIS/CSIRO staff

