Message Passing Interface (MPI) 2 Amit Majumdar Scientific Computing Applications Group San Diego Supercomputer Center Tim Kaiser (now at Colorado School of Mines)

2 MPI 2 Lecture Overview
Review
 – 6 Basic MPI Calls
 – Data Types
 – Wildcards
 – Using Status
Probing
Asynchronous Communication

3 Review: Six Basic MPI Calls
MPI_Init – Initialize MPI
MPI_Comm_rank – Get the processor rank
MPI_Comm_size – Get the number of processors
MPI_Send – Send data to another processor
MPI_Recv – Get data from another processor
MPI_Finalize – Finish MPI

4 Simple Send and Receive Program

#include <stdio.h>
#include "mpi.h"

int main(int argc, char *argv[])
{
    int myid, numprocs, tag, source, destination, count, buffer;
    MPI_Status status;

    MPI_Init(&argc, &argv);
    MPI_Comm_size(MPI_COMM_WORLD, &numprocs);
    MPI_Comm_rank(MPI_COMM_WORLD, &myid);
    tag = 1234;
    source = 0;
    destination = 1;
    count = 1;
    if (myid == source) {
        buffer = 5678;
        MPI_Send(&buffer, count, MPI_INT, destination, tag, MPI_COMM_WORLD);
        printf("processor %d sent %d\n", myid, buffer);
    }
    if (myid == destination) {
        MPI_Recv(&buffer, count, MPI_INT, source, tag, MPI_COMM_WORLD, &status);
        printf("processor %d got %d\n", myid, buffer);
    }
    MPI_Finalize();
    return 0;
}
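(With a typical MPI installation this program is built with the MPI compiler wrapper and launched on two processes, for example mpicc send_recv.c -o send_recv followed by mpirun -np 2 ./send_recv; the source file name here is only an illustration, and mpiexec -n 2 works the same way.)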

5 MPI Data Types
MPI has many different predefined data types
 – All your favorite C data types are included
MPI data types can be used in any communication operation

6 MPI Predefined Data Types in C

C MPI Type            C Type
MPI_CHAR              signed char
MPI_SHORT             signed short int
MPI_INT               signed int
MPI_LONG              signed long int
MPI_UNSIGNED_CHAR     unsigned char
MPI_UNSIGNED_SHORT    unsigned short int
MPI_UNSIGNED          unsigned int
MPI_UNSIGNED_LONG     unsigned long int
MPI_FLOAT             float
MPI_DOUBLE            double
MPI_LONG_DOUBLE       long double
MPI_BYTE              (no C equivalent)
MPI_PACKED            (no C equivalent)

7 MPI Predefined Data Types in Fortran

Fortran MPI Type        Fortran Type
MPI_REAL                REAL
MPI_INTEGER             INTEGER
MPI_CHARACTER           CHARACTER
MPI_COMPLEX             COMPLEX
MPI_DOUBLE_PRECISION    DOUBLE PRECISION
MPI_LOGICAL             LOGICAL
MPI_PACKED              (no Fortran equivalent)
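For example (an illustrative fragment, not from the original slides), the MPI data type passed to a call must match the C type of the buffer; the array name vals and the tag value are arbitrary choices here:

    double vals[10];
    /* ... fill vals, then send 10 doubles to rank 1 with tag 7 */
    MPI_Send(vals, 10, MPI_DOUBLE, 1, 7, MPI_COMM_WORLD);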

8 Wildcards
Wildcards enable the programmer to avoid having to specify a tag and/or source.
Example: MPI_ANY_SOURCE and MPI_ANY_TAG are wildcards.
The status structure is used to get the actual wildcard values.

    MPI_Status status;
    int buffer[5];
    int error;
    error = MPI_Recv(&buffer[0], 5, MPI_INT, MPI_ANY_SOURCE,
                     MPI_ANY_TAG, MPI_COMM_WORLD, &status);

9 Status
The status parameter returns additional information for some MPI routines
 – additional error status information
 – additional information with wildcard parameters
C declaration: a predefined struct
    MPI_Status status;
Fortran declaration: an array is used instead
    INTEGER STATUS(MPI_STATUS_SIZE)

10 Accessing Status Information
The tag of a received message
 – C: status.MPI_TAG
 – Fortran: STATUS(MPI_TAG)
The source of a received message
 – C: status.MPI_SOURCE
 – Fortran: STATUS(MPI_SOURCE)
The error code of the MPI call
 – C: status.MPI_ERROR
 – Fortran: STATUS(MPI_ERROR)
Other uses...
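As a small illustration (not part of the original slides), the program below receives with both wildcards and then inspects the status fields and the actual element count; the tag value and message length are arbitrary, and it must be run on at least two processes:

    #include <stdio.h>
    #include "mpi.h"

    int main(int argc, char *argv[])
    {
        int myid, data[5] = {1, 2, 3, 4, 5};
        MPI_Status status;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &myid);
        if (myid == 0) {
            /* send 5 integers to rank 1 with tag 42 */
            MPI_Send(data, 5, MPI_INT, 1, 42, MPI_COMM_WORLD);
        } else if (myid == 1) {
            int count;
            /* accept a message from any source with any tag */
            MPI_Recv(data, 5, MPI_INT, MPI_ANY_SOURCE, MPI_ANY_TAG,
                     MPI_COMM_WORLD, &status);
            MPI_Get_count(&status, MPI_INT, &count);
            printf("got %d ints from rank %d with tag %d\n",
                   count, status.MPI_SOURCE, status.MPI_TAG);
        }
        MPI_Finalize();
        return 0;
    }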

11 MPI_Probe
MPI_Probe allows incoming messages to be checked without actually receiving them
 – The user can then decide how to receive the data
 – Useful when different action needs to be taken depending on the "who, what, and how much" information of the message

12 MPI_Probe
C: MPI_Probe(source, tag, COMM, &status)
Fortran: MPI_Probe(source, tag, COMM, status, ierror)
Parameters
 – source: source rank, or MPI_ANY_SOURCE
 – tag: tag value, or MPI_ANY_TAG
 – COMM: communicator
 – status: status object

13 MPI_Probe Example

#include <stdio.h>
#include "mpi.h"

/* Program shows how to use probe and get_count
   to find the size of an incoming message */
int main(int argc, char *argv[])
{
    int myid, numprocs;
    MPI_Status status;
    int i, mytag, ierr, icount;

    MPI_Init(&argc, &argv);
    MPI_Comm_size(MPI_COMM_WORLD, &numprocs);
    MPI_Comm_rank(MPI_COMM_WORLD, &myid);
    /* print out my rank and this run's PE size */
    printf("Hello from %d\n", myid);
    printf("Numprocs is %d\n", numprocs);

14 MPI_Probe Example (cont.)

    mytag = 123;
    i = 0;
    icount = 0;
    if (myid == 0) {
        i = 100;
        icount = 1;
        ierr = MPI_Send(&i, icount, MPI_INT, 1, mytag, MPI_COMM_WORLD);
    }
    if (myid == 1) {
        ierr = MPI_Probe(0, mytag, MPI_COMM_WORLD, &status);
        ierr = MPI_Get_count(&status, MPI_INT, &icount);
        printf("getting %d\n", icount);
        ierr = MPI_Recv(&i, icount, MPI_INT, 0, mytag, MPI_COMM_WORLD, &status);
        printf("i= %d\n", i);
    }
    MPI_Finalize();
    return 0;
}

15 MPI_Barrier
Blocks the caller until all members in the communicator have called it.
Used as a synchronization tool.
C: MPI_Barrier(COMM)
Fortran: call MPI_Barrier(COMM, ierror)
Parameter
 – COMM: communicator (often MPI_COMM_WORLD)
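One common use is to synchronize all ranks around a timed region; the sketch below (not from the original slides) pairs MPI_Barrier with MPI_Wtime, and t0 and elapsed are just illustrative names:

    double t0, elapsed;
    MPI_Barrier(MPI_COMM_WORLD);   /* make sure every rank starts together */
    t0 = MPI_Wtime();
    /* ... work being timed ... */
    MPI_Barrier(MPI_COMM_WORLD);   /* wait for the slowest rank to finish */
    elapsed = MPI_Wtime() - t0;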

16 Asynchronous (Non-blocking) Communication
Asynchronous send: the send call returns immediately; the send actually occurs later
Asynchronous receive: the receive call returns immediately; when the received data is needed, call a wait subroutine
Asynchronous communication is used to overlap communication with computation
Can help prevent deadlock (but beware!)
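The idea can be sketched as follows (not from the original slides; it assumes MPI is already initialized, that source and tag are set up as in the earlier examples, and the loop stands in for real computation that does not touch the incoming buffer):

    int data;
    double local = 0.0;
    MPI_Request request;
    MPI_Status status;

    MPI_Irecv(&data, 1, MPI_INT, source, tag, MPI_COMM_WORLD, &request);
    for (int k = 0; k < 1000000; k++)   /* computation overlapped with the transfer */
        local += k * 0.5;
    MPI_Wait(&request, &status);        /* only now is it safe to read data */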

17 Asynchronous Send with MPI_Isend
C:
    MPI_Request request;
    MPI_Isend(&buffer, count, datatype, dest, tag, COMM, &request)
Fortran:
    Integer request
    MPI_Isend(buffer, count, datatype, dest, tag, COMM, request, ierror)
request is a new output parameter
Don't change the data until the communication is complete

18 Asynchronous Receive with MPI_Irecv
C:
    MPI_Request request;
    int MPI_Irecv(&buf, count, datatype, source, tag, COMM, &request)
Fortran:
    Integer request
    MPI_Irecv(buffer, count, datatype, source, tag, COMM, request, ierror)
Parameter changes
 – request: communication request
 – the status parameter is missing
Don't use the data until the communication is complete

19 MPI_Wait Used to Complete Communication
The request from Isend or Irecv is the input
 – The completion of a send operation indicates that the sender is now free to update the data in the send buffer
 – The completion of a receive operation indicates that the receive buffer contains the received message
MPI_Wait blocks until the message specified by "request" completes

20 MPI_Wait Usage
C:
    MPI_Request request;
    MPI_Status status;
    MPI_Wait(&request, &status)
Fortran:
    Integer request
    Integer status(MPI_STATUS_SIZE)
    MPI_Wait(request, status, ierror)
MPI_Wait blocks until the message specified by "request" completes

21 MPI_Test
Similar to MPI_Wait, but does not block
The value of flag signifies whether the operation specified by "request" has completed
C:
    int flag;
    int MPI_Test(&request, &flag, &status)
Fortran:
    Logical flag
    MPI_Test(request, flag, status, ierror)

22 Non-Blocking Send Example

      call MPI_Isend(buffer, count, datatype, dest, tag, comm, request, ierr)
 10   continue
      ... do other work ...
      call MPI_Test(request, flag, status, ierr)
      if (.not. flag) goto 10
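A C sketch of the same polling pattern (not from the original slides; it assumes an MPI_Request named request has already been filled in by an earlier MPI_Isend, as in the Fortran example above):

    int flag = 0;
    MPI_Status status;
    while (!flag) {
        /* ... do other work ... */
        MPI_Test(&request, &flag, &status);
    }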

23 Asynchronous Send and Receive
Write a parallel program to send and receive data using MPI_Isend and MPI_Irecv
 – Initialize MPI
 – Have processor 0 send an integer to processor 1
 – Have processor 1 receive an integer from processor 0
 – Both processors check on message completion
 – Quit MPI

24

#include <stdio.h>
#include "mpi.h"

/************************************************************
  This is a simple isend/irecv program in MPI
 ************************************************************/
int main(int argc, char *argv[])
{
    int myid, numprocs;
    int tag, source, destination, count;
    int buffer;
    MPI_Status status;
    MPI_Request request;

25

    MPI_Init(&argc, &argv);
    MPI_Comm_size(MPI_COMM_WORLD, &numprocs);
    MPI_Comm_rank(MPI_COMM_WORLD, &myid);
    tag = 1234;
    source = 0;
    destination = 1;
    count = 1;
    request = MPI_REQUEST_NULL;

26

    if (myid == source) {
        buffer = 5678;
        MPI_Isend(&buffer, count, MPI_INT, destination, tag,
                  MPI_COMM_WORLD, &request);
    }
    if (myid == destination) {
        MPI_Irecv(&buffer, count, MPI_INT, source, tag,
                  MPI_COMM_WORLD, &request);
    }
    MPI_Wait(&request, &status);
    if (myid == source) {
        printf("processor %d sent %d\n", myid, buffer);
    }
    if (myid == destination) {
        printf("processor %d got %d\n", myid, buffer);
    }
    MPI_Finalize();
    return 0;
}
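(A side note on this program: if it is run on more than two processes, ranks other than the source and destination call MPI_Wait on a request that is still MPI_REQUEST_NULL; that is legal in MPI and returns immediately with an empty status.)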

Point to Point Communications in MPI
Basic operations of point-to-point (PtoP) communication and issues of deadlock
Several steps are involved in PtoP communication
Sending process
 – Data is copied to the user buffer by the user
 – User calls one of the MPI send routines
 – System copies the data from the user buffer to the system buffer
 – System sends the data from the system buffer to the destination processor

Point to Point Communications in MPI
Receiving process
 – User calls one of the MPI receive subroutines
 – System receives the data from the source process and copies it to the system buffer
 – System copies the data from the system buffer to the user buffer
 – User uses the data in the user buffer

[Diagram: on process 0, the send routine copies the data from sendbuf (user mode) to a system buffer (kernel mode) and sends it to the destination; once copied, sendbuf can be reused. On process 1, the receive routine places the incoming data in a system buffer and copies it to recvbuf; only then does recvbuf contain valid data.]

Unidirectional communication
Blocking send and blocking receive

    if (myrank == 0) then
        call MPI_Send(...)
    elseif (myrank == 1) then
        call MPI_Recv(...)
    endif

Non-blocking send and blocking receive

    if (myrank == 0) then
        call MPI_Isend(...)
        ...
        call MPI_Wait(...)
    elseif (myrank == 1) then
        call MPI_Recv(...)
    endif

Unidirectional communication (cont.)
Blocking send and non-blocking receive

    if (myrank == 0) then
        call MPI_Send(...)
    elseif (myrank == 1) then
        call MPI_Irecv(...)
        ...
        call MPI_Wait(...)
    endif

Non-blocking send and non-blocking receive

    if (myrank == 0) then
        call MPI_Isend(...)
        ...
        call MPI_Wait(...)
    elseif (myrank == 1) then
        call MPI_Irecv(...)
        ...
        call MPI_Wait(...)
    endif

Bidirectional communication
Need to be careful about deadlock when two processes exchange data with each other
Deadlock can occur due to an incorrect order of send and recv, or due to the limited size of the system buffer

[Diagram: rank 0 and rank 1 each have a sendbuf and a recvbuf; each process sends from its sendbuf to the other process, which receives into its recvbuf.]

Bidirectional communication
Case 1: both processes call send first, then recv

    if (myrank == 0) then
        call MPI_Send(...)
        call MPI_Recv(...)
    elseif (myrank == 1) then
        call MPI_Send(...)
        call MPI_Recv(...)
    endif

No deadlock as long as the system buffer is larger than the send buffer
Deadlock if the system buffer is smaller than the send buffer, since a blocking send cannot return until its data has been buffered or matched by a receive
Replacing MPI_Send with MPI_Isend and an immediate MPI_Wait behaves the same way
Moral: there may be errors in the code that only show up for larger problem sizes

Bidirectional communication
The following is free from deadlock:

    if (myrank == 0) then
        call MPI_Isend(...)
        call MPI_Recv(...)
        call MPI_Wait(...)
    elseif (myrank == 1) then
        call MPI_Isend(...)
        call MPI_Recv(...)
        call MPI_Wait(...)
    endif

Bidirectional communication
Case 2: both processes call recv first, then send

    if (myrank == 0) then
        call MPI_Recv(...)
        call MPI_Send(...)
    elseif (myrank == 1) then
        call MPI_Recv(...)
        call MPI_Send(...)
    endif

The above will always lead to deadlock (even if you replace MPI_Send with MPI_Isend and MPI_Wait)

Bidirectional communication
The following code can be safely executed:

    if (myrank == 0) then
        call MPI_Irecv(...)
        call MPI_Send(...)
        call MPI_Wait(...)
    elseif (myrank == 1) then
        call MPI_Irecv(...)
        call MPI_Send(...)
        call MPI_Wait(...)
    endif
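In C, the same deadlock-free exchange looks roughly like this (a sketch, not from the original slides; it assumes two ranks, and other, sendbuf, recvbuf, count, and tag are illustrative names set up elsewhere):

    int other = (myrank == 0) ? 1 : 0;   /* rank of the partner process */
    MPI_Request request;
    MPI_Status status;

    MPI_Irecv(recvbuf, count, MPI_INT, other, tag, MPI_COMM_WORLD, &request);
    MPI_Send(sendbuf, count, MPI_INT, other, tag, MPI_COMM_WORLD);
    MPI_Wait(&request, &status);         /* recvbuf is valid after this */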

Bidirectional communication
Case 3: one process calls send and recv in this order, and the other calls them in the opposite order

    if (myrank == 0) then
        call MPI_Send(...)
        call MPI_Recv(...)
    elseif (myrank == 1) then
        call MPI_Recv(...)
        call MPI_Send(...)
    endif

The above is always safe
You can replace both send and recv on both processes with Isend and Irecv