1 Introduction to MPI programming Morris Law, SCID May 18/25, 2013

2 What is Message Passing Interface (MPI)?  Portable standard for communication between processes  Processes communicate by exchanging messages  Each process is a separate program  All data is private to the process that owns it

3 Multi-core programming  Currently, most CPUs have multiple cores that can be utilized easily by compiling with OpenMP support  Programmers no longer need to rewrite sequential code; adding directives instructs the compiler to parallelize the code with OpenMP.

4 OpenMP example
/*
 * Sample program to test runtime of simple matrix multiply
 * with and without OpenMP on gcc-4.3.3-tdm1 (mingw)
 * compile with gcc -fopenmp
 * (c) 2009, Rajorshi Biswas
 */
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <assert.h>
#include <omp.h>

int main(int argc, char **argv)
{
    int i, j, k;
    int n;
    double temp;
    double start, end;

    printf("Enter dimension ('N' for 'NxN' matrix) (100-2000): ");
    scanf("%d", &n);
    assert(n >= 100 && n <= 2000);

    int **arr1 = malloc(sizeof(int*) * n);
    int **arr2 = malloc(sizeof(int*) * n);
    int **arr3 = malloc(sizeof(int*) * n);
    for (i = 0; i < n; ++i) {
        arr1[i] = malloc(sizeof(int) * n);
        arr2[i] = malloc(sizeof(int) * n);
        arr3[i] = malloc(sizeof(int) * n);
    }

    printf("Populating array with random values...\n");
    srand(time(NULL));
    for (i = 0; i < n; ++i) {
        for (j = 0; j < n; ++j) {
            arr1[i][j] = rand() % n;
            arr2[i][j] = rand() % n;
        }
    }
    printf("Completed array init.\n");

    printf("Crunching without OMP...");
    fflush(stdout);
    start = omp_get_wtime();
    for (i = 0; i < n; ++i) {
        for (j = 0; j < n; ++j) {
            temp = 0;
            for (k = 0; k < n; ++k)
                temp += arr1[i][k] * arr2[k][j];
            arr3[i][j] = temp;
        }
    }
    end = omp_get_wtime();
    printf(" took %f seconds.\n", end - start);

    printf("Crunching with OMP...");
    fflush(stdout);
    start = omp_get_wtime();
    #pragma omp parallel for private(i, j, k, temp)
    for (i = 0; i < n; ++i) {
        for (j = 0; j < n; ++j) {
            temp = 0;
            for (k = 0; k < n; ++k)
                temp += arr1[i][k] * arr2[k][j];
            arr3[i][j] = temp;
        }
    }
    end = omp_get_wtime();
    printf(" took %f seconds.\n", end - start);

    return 0;
}

5 Compiling for OpenMP support
GCC: gcc -fopenmp -o foo foo.c / gfortran -fopenmp -o foo foo.f
Intel Compiler: icc -openmp -o foo foo.c / ifort -openmp -o foo foo.f
PGI Compiler: pgcc -mp -o foo foo.c / pgf90 -mp -o foo foo.f

6 What is Message Passing Interface (MPI)?  This is a library, not a language!  Different compilers may be used, but all processes must link against the same MPI library, e.g. MPICH, LAM/MPI, Open MPI, etc.  There are two versions now, MPI-1 and MPI-2  Use a standard sequential language: Fortran, C, C++, etc.

7 Basic Idea of Message Passing Interface (MPI)  MPI Environment Initialize, manage, and terminate communication among processes  Communication between processes Point-to-point communication, e.g. send, receive Collective communication, e.g. broadcast, gather  Complicated data structures Communicate data such as matrices and blocks of memory effectively

8 Is MPI Large or Small?  MPI is large More than one hundred functions But function count is not necessarily a measure of complexity  MPI is small Many parallel programs can be written with just 6 basic functions (MPI_Init, MPI_Finalize, MPI_Comm_size, MPI_Comm_rank, MPI_Send, MPI_Recv)  MPI is just right One can access flexibility when it is required One need not master all MPI functions

9 When to Use MPI?  You need a portable parallel program  You are writing a parallel library  You care about performance  You have a problem that can be solved in parallel ways

10 F77/F90, C/C++ MPI library calls  Fortran 77/90 uses subroutines CALL is used to invoke the library call Nothing is returned, the error code variable is the last argument All variables are passed by reference  C/C++ uses functions Just the name is used to invoke the library call The function returns an integer value (an error code) Variables are passed by value, unless otherwise specified

11 Types of Communication  Point-to-Point Communication communication involving only two processes  Collective Communication communication that involves a group of processes
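To make the distinction concrete, here is a minimal C sketch (an illustration added to this transcript, not one of the original slides; it assumes at least 2 processes were started): rank 0 first sends an integer to rank 1 point-to-point, then broadcasts it to the whole group collectively.

#include <stdio.h>
#include "mpi.h"

int main(int argc, char *argv[])
{
    int rank, value = 0;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* Point-to-point: exactly two processes participate. */
    if (rank == 0) {
        value = 42;
        MPI_Send(&value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
    } else if (rank == 1) {
        MPI_Recv(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    }

    /* Collective: every process in the communicator participates. */
    MPI_Bcast(&value, 1, MPI_INT, 0, MPI_COMM_WORLD);
    printf("Rank %d now has value %d\n", rank, value);

    MPI_Finalize();
    return 0;
}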

12 Implementation of MPI

13 Getting started with MPI  Create a file called "machines"  The content of "machines" (8 nodes):
compute-0-0
compute-0-1
compute-0-2
…
compute-0-7

14 MPI Commands  mpicc/mpif77/mpif90 - compile an MPI program mpicc -o foo foo.c mpif77 -o foo foo.f mpif90 -o foo foo.f90  mpirun - start the execution of MPI programs mpirun -v -np 2 -machinefile machines foo

15 Basic MPI Functions

16 MPI Environment  Initialize initialize environment  Finalize terminate environment  Communicator create default communication group for all processes  Version establish version of MPI

17 MPI Environment  Total processes spawn total processes  Rank/Process ID assign identifier to each process  Timing Functions MPI_Wtime, MPI_Wtick

18 MPI_INIT  Initializes the MPI environment  Assigns all spawned processes to MPI_COMM_WORLD, the default communicator  C int MPI_Init(int *argc, char ***argv) Input Parameters  argc - pointer to the number of arguments  argv - pointer to the argument vector  Fortran CALL MPI_INIT(error_code) INTEGER error_code - variable that gets set to an error code

19 MPI_FINALIZE  Terminates the MPI environment  C int MPI_Finalize()  Fortran CALL MPI_FINALIZE(error_code) INTEGER error_code - variable that gets set to an error code

20 MPI_ABORT  This routine makes a "best attempt" to abort all tasks in the group of comm.  Usually used in error handling.  C int MPI_Abort(MPI_Comm comm, int errorcode) Input Parameters  comm - communicator of tasks to abort  errorcode - error code to return to the invoking environment  Fortran CALL MPI_ABORT(COMM, ERRORCODE, IERROR) INTEGER COMM, ERRORCODE, IERROR
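A short sketch of typical usage (added to this transcript; the failing allocation and its size are a made-up example): a process that hits an unrecoverable error calls MPI_Abort so the whole job stops rather than hanging.

#include <stdio.h>
#include <stdlib.h>
#include "mpi.h"

int main(int argc, char *argv[])
{
    MPI_Init(&argc, &argv);

    /* Hypothetical workspace size, for illustration only. */
    double *work = malloc(200000000 * sizeof(double));
    if (work == NULL) {
        fprintf(stderr, "allocation failed, aborting all tasks\n");
        MPI_Abort(MPI_COMM_WORLD, 1);  /* best attempt to terminate every rank */
    }

    /* ... normal computation would go here ... */

    free(work);
    MPI_Finalize();
    return 0;
}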

21 MPI_GET_VERSION  Get the version of the MPI standard supported by the implementation in use  C int MPI_Get_version(int *version, int *subversion) Output Parameters  version - version of MPI  subversion - subversion of MPI  Fortran CALL MPI_GET_VERSION(version, subversion, error_code) INTEGER error_code - variable that gets set to an error code

22 MPI_COMM_SIZE  This finds the number of processes in a communication group  C int MPI_Comm_size(MPI_Comm comm, int *size) Input Parameter  comm - communicator (handle) Output Parameter  size - number of processes in the group of comm (integer)  Fortran CALL MPI_COMM_SIZE(comm, size, error_code) INTEGER error_code - variable that gets set to an error code  Using MPI_COMM_WORLD as comm will return the total number of processes started

23 MPI_COMM_RANK  This gives the rank/identification number of a process in a communication group  C int MPI_Comm_rank(MPI_Comm comm, int *rank) Input Parameter  comm - communicator (handle) Output Parameter  rank - rank/id number of the process that made the call (integer)  Fortran CALL MPI_COMM_RANK(comm, rank, error_code) INTEGER error_code - variable that gets set to an error code  Using MPI_COMM_WORLD as comm will return the rank of the process in relation to all processes that were started

24 Timing Functions – MPI_WTIME  MPI_Wtime() - returns a floating point number of seconds, representing elapsed wall-clock time.  C double MPI_Wtime(void)  Fortran DOUBLE PRECISION MPI_WTIME()  The times returned are local to the node/process that made the call.

25 Timing Functions – MPI_WTICK  MPI_Wtick() - returns a double precision number of seconds between successive clock ticks.  C double MPI_Wtick(void)  Fortran DOUBLE PRECISION MPI_WTICK()  The times returned are local to the node/process that made the call.
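As a small illustration (added to this transcript, not an original slide), the two timing calls can bracket a local computation; the loop body and bounds below are arbitrary placeholders:

#include <stdio.h>
#include "mpi.h"

int main(int argc, char *argv[])
{
    int rank;
    long i;
    double t0, t1, sum = 0.0;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    t0 = MPI_Wtime();                  /* wall-clock time before the work */
    for (i = 0; i < 10000000; i++)
        sum += 1.0 / (i + 1.0);
    t1 = MPI_Wtime();                  /* wall-clock time after the work */

    /* Times are local to this rank; MPI_Wtick reports the clock resolution. */
    printf("Rank %d: loop took %f s (tick = %g s), sum = %f\n",
           rank, t1 - t0, MPI_Wtick(), sum);

    MPI_Finalize();
    return 0;
}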

26 Hello World 1  Echo the MPI version  MPI Functions Used MPI_Init MPI_Get_version MPI_Finalize

27 Hello World 1 (C)
#include <stdio.h>
#include "mpi.h"

int main(int argc, char *argv[])
{
    int version, subversion;
    MPI_Init(&argc, &argv);
    MPI_Get_version(&version, &subversion);
    printf("Hello world!\n");
    printf("Your MPI Version is: %d.%d\n", version, subversion);
    MPI_Finalize();
    return 0;
}

28 Hello World 1 (Fortran)
      program main
      include 'mpif.h'
      integer ierr, version, subversion
      call MPI_INIT(ierr)
      call MPI_GET_VERSION(version, subversion, ierr)
      print *, 'Hello world!'
      print *, 'Your MPI Version is: ', version, '.', subversion
      call MPI_FINALIZE(ierr)
      end

29 Hello World 2  Echo the process rank and the total number of process in the group  MPI Functions Used MPI_Init MPI_Comm_rank MPI_Comm_size MPI_Finalize

30 Hello World 2 (C)
#include <stdio.h>
#include "mpi.h"

int main(int argc, char *argv[])
{
    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    printf("Hello world! I am %d of %d\n", rank, size);
    MPI_Finalize();
    return 0;
}

31 Hello World 2 (Fortran)
      program main
      include 'mpif.h'
      integer rank, size, ierr
      call MPI_INIT(ierr)
      call MPI_COMM_RANK(MPI_COMM_WORLD, rank, ierr)
      call MPI_COMM_SIZE(MPI_COMM_WORLD, size, ierr)
      print *, 'Hello world! I am ', rank, ' of ', size
      call MPI_FINALIZE(ierr)
      end

32 MPI C Datatypes
MPI Datatype            C Datatype
MPI_CHAR                signed char
MPI_SHORT               signed short int
MPI_INT                 signed int
MPI_LONG                signed long int
MPI_UNSIGNED_CHAR       unsigned char
MPI_UNSIGNED_SHORT      unsigned short int

33 MPI C Datatypes
MPI Datatype            C Datatype
MPI_UNSIGNED            unsigned int
MPI_UNSIGNED_LONG       unsigned long int
MPI_FLOAT               float
MPI_DOUBLE              double
MPI_LONG_DOUBLE         long double
MPI_BYTE                (no C equivalent)
MPI_PACKED              (no C equivalent)

34 MPI Fortran Datatypes
MPI Datatype            Fortran Datatype
MPI_INTEGER             INTEGER
MPI_REAL                REAL
MPI_DOUBLE_PRECISION    DOUBLE PRECISION
MPI_COMPLEX             COMPLEX
MPI_LOGICAL             LOGICAL
MPI_CHARACTER           CHARACTER
MPI_BYTE                (no Fortran equivalent)
MPI_PACKED              (no Fortran equivalent)
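A brief sketch of why these tables matter (added to this transcript, not an original slide; assumes at least 2 processes): the datatype argument in a send or receive must match the declared type of the buffer, so an int array is described with MPI_INT and a double with MPI_DOUBLE.

#include <stdio.h>
#include "mpi.h"

int main(int argc, char *argv[])
{
    int rank;
    int counts[4] = {1, 2, 3, 4};
    double avg = 2.5;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {
        MPI_Send(counts, 4, MPI_INT, 1, 0, MPI_COMM_WORLD);   /* int buffer -> MPI_INT */
        MPI_Send(&avg, 1, MPI_DOUBLE, 1, 1, MPI_COMM_WORLD);  /* double buffer -> MPI_DOUBLE */
    } else if (rank == 1) {
        MPI_Recv(counts, 4, MPI_INT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        MPI_Recv(&avg, 1, MPI_DOUBLE, 0, 1, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        printf("rank 1 received counts[3] = %d and avg = %f\n", counts[3], avg);
    }

    MPI_Finalize();
    return 0;
}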

35 Parallelization example 1: serial-pi.c
#include <stdio.h>

static long num_steps = 10000000;
double step;

int main()
{
    int i;
    double x, pi, sum = 0.0;
    step = 1.0 / (double) num_steps;
    for (i = 0; i < num_steps; i++) {
        x = (i + 0.5) * step;
        sum = sum + 4.0 / (1.0 + x * x);
    }
    pi = step * sum;
    printf("Est Pi= %f\n", pi);
    return 0;
}
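The numerical idea behind the loop (a note added for context): pi equals the integral of 4/(1+x^2) over [0,1], and the code approximates it with the midpoint rule, summing the integrand at x = (i+0.5)*step over num_steps subintervals of width step.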

36 Parallelizing serial-pi.c into mpi-pi.c, Step 1: Adding MPI environment
#include "mpi.h"
#include <stdio.h>

static long num_steps = 10000000;
double step;

int main(int argc, char *argv[])
{
    int i;
    double x, pi, sum = 0.0;
    MPI_Init(&argc, &argv);
    step = 1.0 / (double) num_steps;
    for (i = 0; i < num_steps; i++) {
        x = (i + 0.5) * step;
        sum = sum + 4.0 / (1.0 + x * x);
    }
    pi = step * sum;
    printf("Est Pi= %f\n", pi);
    MPI_Finalize();
    return 0;
}

37 Parallelizing serial-pi.c into mpi-pi.c, Step 2: Adding variables to print ranks
#include "mpi.h"
#include <stdio.h>

static long num_steps = 10000000;
double step;

int main(int argc, char *argv[])
{
    int i;
    double x, pi, sum = 0.0;
    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    step = 1.0 / (double) num_steps;
    for (i = 0; i < num_steps; i++) {
        x = (i + 0.5) * step;
        sum = sum + 4.0 / (1.0 + x * x);
    }
    pi = step * sum;
    printf("Est Pi= %f, Processor %d of %d\n", pi, rank, size);
    MPI_Finalize();
    return 0;
}

38 Parallelizing serial-pi.c into mpi-pi.c, Step 3: divide the workload
#include "mpi.h"
#include <stdio.h>

static long num_steps = 10000000;
double step;

int main(int argc, char *argv[])
{
    int i;
    double x, mypi, pi, sum = 0.0;
    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    step = 1.0 / (double) num_steps;
    for (i = rank; i < num_steps; i += size) {
        x = (i + 0.5) * step;
        sum = sum + 4.0 / (1.0 + x * x);
    }
    mypi = step * sum;
    printf("Est Pi= %f, Processor %d of %d\n", mypi, rank, size);
    MPI_Finalize();
    return 0;
}
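A worked note on the decomposition (added for clarity, not an original slide): with size = 4 processes, rank 0 handles i = 0, 4, 8, ..., rank 1 handles i = 1, 5, 9, ..., and so on. This round-robin (cyclic) split covers every iteration exactly once, and each rank accumulates its own partial sum.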

39 Parallelizing serial-pi.c into mpi-pi.c, Step 4: collect partial results
#include "mpi.h"
#include <stdio.h>

static long num_steps = 10000000;
double step;

int main(int argc, char *argv[])
{
    int i;
    double x, mypi, pi, sum = 0.0;
    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    step = 1.0 / (double) num_steps;
    for (i = rank; i < num_steps; i += size) {
        x = (i + 0.5) * step;
        sum = sum + 4.0 / (1.0 + x * x);
    }
    mypi = step * sum;
    MPI_Reduce(&mypi, &pi, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);
    if (rank == 0)
        printf("Est Pi= %f, \n", pi);
    MPI_Finalize();
    return 0;
}
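For reference, the collection step uses the standard reduction call: MPI_Reduce(&mypi, &pi, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD) combines one MPI_DOUBLE (mypi) from every rank with the MPI_SUM operation and stores the total in pi on the root rank 0. Only the root's pi is defined after the call, which is why only rank 0 prints the result.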

40 Compile and run mpi program $ mpicc -o mpi-pi mpi-pi.c $ mpirun -np 4 -machinefile machines mpi-pi

41 Parallelization example 2: serial-mc-pi.c
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <math.h>

int main(int argc, char *argv[])
{
    long in, i, n;
    double x, y, q;
    time_t now;

    in = 0;
    srand(time(&now));
    printf("Input no of samples : ");
    scanf("%ld", &n);
    for (i = 0; i < n; i++) {
        x = rand() / (RAND_MAX + 1.0);
        y = rand() / (RAND_MAX + 1.0);
        if ((x * x + y * y) < 1) {
            in++;
        }
    }
    q = ((double) 4.0) * in / n;
    printf("pi = %.20lf\n", q);
    printf("rmse = %.20lf\n", sqrt(((double) q * (4 - q)) / n));
    return 0;
}
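A note on the method (added for context): the points (x, y) are uniform in the unit square, and the fraction landing inside the quarter circle x^2 + y^2 < 1 estimates its area, pi/4; multiplying the fraction in/n by 4 therefore estimates pi.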

42 Parallelization example 2: mpi-mc-pi.c
#include "mpi.h"
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <math.h>

int main(int argc, char *argv[])
{
    long in, i, n;
    double x, y, q, Q;
    time_t now;
    int rank, size;

    MPI_Init(&argc, &argv);
    in = 0;
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    srand(time(&now) + rank);          /* different seed on each rank */
    if (rank == 0) {
        printf("Input no of samples : ");
        scanf("%ld", &n);
    }
    MPI_Bcast(&n, 1, MPI_LONG, 0, MPI_COMM_WORLD);
    for (i = 0; i < n; i++) {
        x = rand() / (RAND_MAX + 1.0);
        y = rand() / (RAND_MAX + 1.0);
        if ((x * x + y * y) < 1) {
            in++;
        }
    }
    q = ((double) 4.0) * in / n;
    /* Sum the per-rank estimates on rank 0, then average them. */
    MPI_Reduce(&q, &Q, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);
    Q = Q / size;
    if (rank == 0) {
        printf("pi = %.20lf\n", Q);
        printf("rmse = %.20lf\n", sqrt(((double) Q * (4 - Q)) / n / size));
    }
    MPI_Finalize();
    return 0;
}

43 Compile and run mpi-mc-pi $ mpicc -o mpi-mc-pi mpi-mc-pi.c $ mpirun -np 4 -machinefile machines mpi-mc-pi

44 The End

