Accelerated Linear Algebra Libraries James Wynne III NCCS User Assistance.


2 Accelerated Linear Algebra Libraries
A collection of functions that perform mathematical operations on matrices.
The designers have rewritten the standard LAPACK functions to use GPU accelerators, which speeds up execution on large matrices.

3 MAGMA HOST INTERFACE Accelerated Linear Algebra Libraries

4 MAGMA - Host
MAGMA stands for Matrix Algebra on GPU and Multicore Architectures.
Developed by the Innovative Computing Laboratory at the University of Tennessee.
The host interface allows easy porting from CPU libraries (like LAPACK) to MAGMA's accelerated library.
– Automatically manages data allocation and transfer between the CPU (host) and the GPU (device)

5 MAGMA - Host
Fortran:
– To call MAGMA functions from Fortran, an interface block must be written for each MAGMA function that is called. These interfaces are defined in the file magma.f90
– Example:
    module magma
      Interface
        Integer Function magma_sgesv(n,nrhs,…) &
            BIND(C,name="magma_sgesv")
          use iso_c_binding
          implicit none
          integer(c_int), value :: n
          integer(c_int), value :: nrhs
          …
        end Function
      end Interface
    end module

6 MAGMA - Host
Pseudo-code for a simple SGESV operation in MAGMA
    Program SGESV
      !Include the module that hosts your interface
      use magma
      use iso_c_binding
      !Define your arrays and variables
      Real(C_FLOAT) :: A(3,3), b(3)
      Integer(C_INT) :: piv(3), ok, status
      !Fill your `A` and `b` arrays then call
      !MAGMA_SGESV
      status = magma_sgesv(3,1,A,3,piv,b,3,ok)
      !Loop through and write(*,*) the contents of
      !array `b`
    end Program

9 MAGMA - Host
Before compiling, make sure the MAGMA module, the CUDA toolkit, and the GNU programming environment are loaded.
– magma.f90: Contains the Interface block module
– sgesv.f90: Contains the Fortran source code
To compile:
    $ module swap PrgEnv-pgi PrgEnv-gnu
    $ module load cudatoolkit magma
    $ ftn magma.f90 -lcuda -lmagma -lmagmablas sgesv.f90

10 MAGMA - Host
C:
In C source code, no interface block is needed, unlike in Fortran.
Simply #include <magma.h> in your code.
When declaring variables to use with MAGMA functions, use magma_int_t instead of C's int type.
Matrices for MAGMA's SGESV are of type float.
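One detail worth keeping in mind when filling matrices in C: MAGMA follows the LAPACK convention of column-major storage, whereas a C two-dimensional array is laid out row by row. The following is a minimal sketch of column-major indexing in plain C, with no MAGMA calls. The names cm_idx and row_to_column_major are hypothetical helpers introduced here for illustration only.

```c
#include <assert.h>
#include <stddef.h>

/* Offset of element (row i, column j) in a column-major matrix with
   leading dimension lda. Hypothetical helper, not part of MAGMA. */
static size_t cm_idx(size_t i, size_t j, size_t lda)
{
    return i + j * lda;
}

/* Copy an n x n row-major C array into column-major storage, the layout
   that LAPACK-style routines expect. Hypothetical helper. */
static void row_to_column_major(size_t n, const float *row_major,
                                float *col_major)
{
    assert(row_major != NULL && col_major != NULL);
    for (size_t i = 0; i < n; ++i)
        for (size_t j = 0; j < n; ++j)
            col_major[cm_idx(i, j, n)] = row_major[i * n + j];
}
```

If the matrix is filled directly through cm_idx, no conversion step is needed before handing the buffer to the solver.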

11 MAGMA - Host
Example code for C:
    #include <magma.h>
    int main() {
      //Define arrays and variables for MAGMA
      //(MAGMA expects column-major storage, as in LAPACK)
      float b[3], A[3][3];
      magma_int_t m = 3, n = 1, piv[3], info;
      //Fill matrices A and b and call magma_sgesv()
      magma_sgesv(m,n,&A[0][0],m,piv,b,m,&info);
      //Loop through and print out returned array b
      return 0;
    }
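To make clear what the SGESV call above computes, here is a plain-C reference sketch that solves A*X = B by LU factorization with partial pivoting, using the same argument order (n, nrhs, A, lda, ipiv, B, ldb, info). It is not MAGMA's implementation and runs entirely on the CPU. The name ref_sgesv is hypothetical, and as a simplification the sketch stops at the first zero pivot.

```c
#include <assert.h>

/* Absolute value for float, kept local so no math library is needed. */
static float absf(float x)
{
    return x < 0.0f ? -x : x;
}

/* Hypothetical reference routine: solve A*X = B in column-major storage.
   A is overwritten by its LU factors, B by the solution X, and ipiv holds
   1-based pivot rows. info is 0 on success, or k if pivot k is zero. */
static void ref_sgesv(int n, int nrhs, float *A, int lda,
                      int *ipiv, float *B, int ldb, int *info)
{
    assert(n >= 0 && nrhs >= 0 && lda >= n && ldb >= n);
    *info = 0;
    for (int k = 0; k < n; ++k) {
        /* Pick the largest entry in column k, rows k..n-1, as pivot. */
        int p = k;
        for (int i = k + 1; i < n; ++i)
            if (absf(A[i + k * lda]) > absf(A[p + k * lda]))
                p = i;
        ipiv[k] = p + 1;
        if (A[p + k * lda] == 0.0f) {
            *info = k + 1;              /* exactly singular */
            return;
        }
        if (p != k) {                   /* swap rows k and p in A and B */
            for (int j = 0; j < n; ++j) {
                float t = A[k + j * lda];
                A[k + j * lda] = A[p + j * lda];
                A[p + j * lda] = t;
            }
            for (int j = 0; j < nrhs; ++j) {
                float t = B[k + j * ldb];
                B[k + j * ldb] = B[p + j * ldb];
                B[p + j * ldb] = t;
            }
        }
        /* Eliminate below the pivot; multipliers are stored in A. */
        for (int i = k + 1; i < n; ++i) {
            float mult = A[i + k * lda] / A[k + k * lda];
            A[i + k * lda] = mult;
            for (int j = k + 1; j < n; ++j)
                A[i + j * lda] -= mult * A[k + j * lda];
            for (int j = 0; j < nrhs; ++j)
                B[i + j * ldb] -= mult * B[k + j * ldb];
        }
    }
    /* Back substitution with the upper-triangular factor. */
    for (int j = 0; j < nrhs; ++j)
        for (int i = n - 1; i >= 0; --i) {
            float s = B[i + j * ldb];
            for (int c = i + 1; c < n; ++c)
                s -= A[i + c * lda] * B[c + j * ldb];
            B[i + j * ldb] = s / A[i + i * lda];
        }
}
```

A routine like this is handy for checking the result of an accelerated solve on a small matrix, since both take the same arguments and overwrite b with the solution.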

15 MAGMA - Host
Before compiling, make sure the MAGMA module, the CUDA toolkit, and the GNU programming environment are loaded.
To compile:
    $ module swap PrgEnv-pgi PrgEnv-gnu
    $ module load cudatoolkit magma
    $ cc -lcuda -lmagma -lmagmablas sgesv.c

16 MAGMA DEVICE INTERFACE Accelerated Linear Algebra Libraries

17 MAGMA - Device
The MAGMA device interface allows direct control over how the GPU (device) is managed:
– Memory allocation and transfer
– Keeping matrices on the device

18 MAGMA - Device
Fortran:
– To call MAGMA device functions from Fortran, an interface block must be written for each MAGMA function that is called. This is not required in C source code
– Device functions are suffixed with _gpu
– CUDA Fortran is used to manage memory on the device
– All interface blocks need to be defined in a module (module magma). If the module lives in a separate file, that file must have the extension .cuf, just like the source file

19 MAGMA - Device
Example:
    module magma
      Interface
        Integer Function magma_sgesv_gpu(n,nrhs,dA…) &
            BIND(C,name="magma_sgesv_gpu")
          use iso_c_binding
          use cudafor
          implicit none
          integer(c_int), value :: n
          integer(c_int), value :: nrhs
          real(c_float), device :: dA(*)
          …
        end Function
      end Interface
      …

20 MAGMA - Device
The MAGMA initialize function is also needed (defined in the same interface block module):
      …
      Interface
        Integer Function magma_init() &
            BIND(C,name="magma_init")
          use iso_c_binding
          implicit none
        end Function
      end Interface
    end module

21 MAGMA - Device
Pseudo-code for a simple SGESV operation in MAGMA
    Program SGESV
      !Include the module that hosts your interface
      use magma
      use cudafor
      use iso_c_binding
      !Define your arrays and variables
      !(`dA` and `dB` are the device copies of `A` and `b`)
      Real(C_FLOAT) :: A(3,3), b(3)
      Real(C_FLOAT), device :: dA(3,3)
      Real(C_FLOAT), device :: dB(3)
      Integer(C_INT) :: piv(3), ok, status

24 MAGMA - Device
Pseudo-code for a simple SGESV operation in MAGMA
      !Fill your `A` and `b` arrays then initialize
      !MAGMA
      status = magma_init()
      !Copy filled host arrays `A` and `b` to `dA`
      !and `dB` using CUDA Fortran
      dA = A
      dB = b
      !Call the device function
      status = magma_sgesv_gpu(3,1,dA,3,piv,dB,3,ok)
      !Copy results back to CPU
      b = dB
      !Loop through and write(*,*) the contents of
      !array `b`
    end Program

28 MAGMA - Device
Before compiling, make sure the MAGMA module, the CUDA toolkit, and the PGI programming environment are loaded.
– magma.cuf: Contains the module of Interface blocks
– sgesv.cuf: Contains the Fortran source
To compile:
    $ module load cudatoolkit magma
    $ ftn magma.cuf -lcuda -lmagma -lmagmablas sgesv.cuf

29 MAGMA - Device
C:
In C device source code, no interface block is needed, unlike in Fortran.
Simply #include <magma.h> in your code.
When declaring variables to use with MAGMA functions, use magma_int_t instead of C's int type.
Matrices for MAGMA's SGESV are of type float.
Before running any MAGMA device code, magma_init() must be called.

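30 MAGMA - Device
C:
To interact with the device (allocate matrices, transfer data, etc.), use the built-in MAGMA functions. The prefix letter selects the precision: s for float (as needed by SGESV), d for double.
– Allocate on the device: magma_smalloc() (magma_dmalloc() for double)
– Copy matrix to device: magma_ssetmatrix() (magma_dsetmatrix() for double)
– Copy matrix to host: magma_sgetmatrix() (magma_dgetmatrix() for double)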
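The set-matrix and get-matrix calls take the block size (m, n) plus one leading dimension for the source and one for the destination. As a host-only illustration of what those arguments mean, the sketch below copies an m x n column-major block one column at a time. copy_matrix_ld is a hypothetical helper written for this note. It is not a MAGMA routine and moves no data to a GPU.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Host-only analogue of a set/get-matrix copy (hypothetical helper):
   copy an m x n column-major block from src, whose columns are ldsrc
   elements apart, to dst, whose columns are lddst elements apart. */
static void copy_matrix_ld(size_t m, size_t n,
                           const float *src, size_t ldsrc,
                           float *dst, size_t lddst)
{
    /* A leading dimension can never be smaller than the row count. */
    assert(ldsrc >= m && lddst >= m);
    for (size_t j = 0; j < n; ++j)
        memcpy(dst + j * lddst, src + j * ldsrc, m * sizeof(float));
}
```

Because the two leading dimensions are independent, the destination may be padded (for example, for alignment) without changing the logical matrix. In the slides' example both are simply m.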

31 MAGMA - Device
Example code for C:
    #include <magma.h>
    int main() {
      //Define arrays and variables for MAGMA
      float b[3], A[3][3];
      float *A_d, *b_d; //Device pointers
      magma_int_t m = 3, n = 1, piv[3], info;
      //Initialize MAGMA before any device call
      magma_init();
      //Fill matrices A and b and allocate device
      //matrices (s-prefixed routines, since the data is float)
      magma_smalloc(&A_d, m*m);
      magma_smalloc(&b_d, m);

34 MAGMA - Device
Example code for C:
      //Transfer matrices to device
      magma_ssetmatrix(m,m,&A[0][0],m,A_d,m);
      magma_ssetmatrix(m,n,b,m,b_d,m);
      //Call the device sgesv function
      magma_sgesv_gpu(m,n,A_d,m,piv,b_d,m,&info);
      //Copy back computed matrix
      magma_sgetmatrix(m,n,b_d,m,b,m);
      //Loop through and print out returned array b
      //Release device memory and shut MAGMA down
      magma_free(A_d);
      magma_free(b_d);
      magma_finalize();
      return 0;
    }
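The example passes an info argument but never inspects it. Assuming MAGMA follows the LAPACK convention for SGESV (0 for success, a negative value -i for an illegal i-th argument, a positive value i when U(i,i) is exactly zero), a check could look like the sketch below. describe_gesv_info is a hypothetical helper, not part of MAGMA, and it takes a long so that it does not depend on how magma_int_t is defined.

```c
#include <assert.h>
#include <stdio.h>

/* Turn an SGESV-style info code into a short message, following the
   LAPACK convention. Hypothetical helper, not part of MAGMA.
   Returns 0 if the solve succeeded and 1 otherwise. */
static int describe_gesv_info(long info, char *msg, size_t len)
{
    assert(msg != NULL && len > 0);
    if (info == 0) {
        snprintf(msg, len, "success");
        return 0;
    }
    if (info < 0)
        snprintf(msg, len, "argument %ld had an illegal value", -info);
    else
        snprintf(msg, len, "U(%ld,%ld) is exactly zero; matrix is singular",
                 info, info);
    return 1;
}
```

Calling it right after the solve, before copying or printing b, avoids reporting a solution that was never computed.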

37 MAGMA - Device
Before compiling, make sure the MAGMA module, the CUDA toolkit, and the GNU programming environment are loaded.
To compile:
    $ module swap PrgEnv-pgi PrgEnv-gnu
    $ module load cudatoolkit magma
    $ cc -lcuda -lmagma -lmagmablas sgesv_gpu.c