1 Lesson 2: Point-to-point semantics; Embarrassingly Parallel Examples

2 send/recv

What we learned last class:

  MPI_Send(void *buf, int count, MPI_Datatype type, int dest, int tag, MPI_Comm comm)
  MPI_Recv(void *buf, int count, MPI_Datatype type, int src, int tag, MPI_Comm comm, MPI_Status *stat)

*stat is a C struct returned with at least the following fields:
  stat.MPI_SOURCE
  stat.MPI_TAG
  stat.MPI_ERROR
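As a refresher, here is a minimal, self-contained sketch of these two calls (not from the slides; the tag value 99 is arbitrary). Run with at least two processes, e.g. mpirun -np 2 ./a.out:

  #include <mpi.h>
  #include <stdio.h>

  int main(int argc, char **argv) {
      int rank;
      double x = 0.0;
      MPI_Status stat;

      MPI_Init(&argc, &argv);
      MPI_Comm_rank(MPI_COMM_WORLD, &rank);

      if (rank == 0) {
          x = 3.14;
          MPI_Send(&x, 1, MPI_DOUBLE, 1, 99, MPI_COMM_WORLD);        /* to rank 1 */
      } else if (rank == 1) {
          MPI_Recv(&x, 1, MPI_DOUBLE, 0, 99, MPI_COMM_WORLD, &stat); /* from rank 0 */
          printf("rank 1 got %f from rank %d (tag %d)\n",
                 x, stat.MPI_SOURCE, stat.MPI_TAG);
      }

      MPI_Finalize();
      return 0;
  }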

3 Blocking vs. non-blocking

The send/recv functions on the previous slide are referred to as blocking point-to-point communication.
MPI also has non-blocking send/recv functions, MPI_Isend and MPI_Irecv, that will be studied next class; a brief preview appears below.
The semantics of the two are very different: you must understand the rules carefully to write safe programs.
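As a hedged preview of next class, a fragment (assuming recvbuf, count, src, tag, and comm are defined as on the previous slide):

  MPI_Request req;
  MPI_Status stat;

  /* Non-blocking receive: the call returns immediately and the
     receive proceeds in the background. */
  MPI_Irecv(recvbuf, count, MPI_DOUBLE, src, tag, comm, &req);

  /* ... do other work here, but do not touch recvbuf yet ... */

  MPI_Wait(&req, &stat);   /* blocks until the receive has completed */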

4 Blocking recv

Semantics of a blocking receive:
  A blocking receive can be started whether or not a matching send has been posted.
  A blocking receive returns only after its receive buffer contains the newly received message.
  A blocking receive can complete before the matching send has completed (but only after it has started).

5 Blocking send

Semantics of a blocking send:
  Can start whether or not a matching recv has been posted.
  Returns only after the message data in the send buffer is safe to be overwritten.
  This can mean that the data was either buffered or sent directly to the receiving process.
  Which happens is up to the implementation.
  This has very strong implications for writing safe programs.

6 Examples

  MPI_Comm_rank(MPI_COMM_WORLD, &rank);
  if (rank == 0) {
      MPI_Send(sendbuf, count, MPI_DOUBLE, 1, tag, comm);
      MPI_Recv(recvbuf, count, MPI_DOUBLE, 1, tag, comm, &stat);
  } else if (rank == 1) {
      MPI_Recv(recvbuf, count, MPI_DOUBLE, 0, tag, comm, &stat);
      MPI_Send(sendbuf, count, MPI_DOUBLE, 0, tag, comm);
  }

Is this program safe? Why or why not? Yes, this is safe even if no buffer space is available: rank 0's send is matched by the receive rank 1 posts immediately, and then the roles reverse, so there is no circular wait.

7 Examples

  MPI_Comm_rank(MPI_COMM_WORLD, &rank);
  if (rank == 0) {
      MPI_Recv(recvbuf, count, MPI_DOUBLE, 1, tag, comm, &stat);
      MPI_Send(sendbuf, count, MPI_DOUBLE, 1, tag, comm);
  } else if (rank == 1) {
      MPI_Recv(recvbuf, count, MPI_DOUBLE, 0, tag, comm, &stat);
      MPI_Send(sendbuf, count, MPI_DOUBLE, 0, tag, comm);
  }

Is this program safe? Why or why not? No, this will always deadlock: both ranks block in MPI_Recv, and neither ever reaches its send.

8 Examples

  MPI_Comm_rank(MPI_COMM_WORLD, &rank);
  if (rank == 0) {
      MPI_Send(sendbuf, count, MPI_DOUBLE, 1, tag, comm);
      MPI_Recv(recvbuf, count, MPI_DOUBLE, 1, tag, comm, &stat);
  } else if (rank == 1) {
      MPI_Send(sendbuf, count, MPI_DOUBLE, 0, tag, comm);
      MPI_Recv(recvbuf, count, MPI_DOUBLE, 0, tag, comm, &stat);
  }

Is this program safe? Why or why not? Often, but not always! It depends on buffer space: if the implementation cannot buffer either message, both sends block waiting for receives that are never posted.
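One portable way to make this exchange safe regardless of buffer space (a sketch, not from the slides) is MPI_Sendrecv, which performs the send and the receive as a single operation and lets the implementation order the transfers:

  /* Each rank sends to and receives from its partner in one call,
     so the exchange cannot deadlock. */
  int partner = (rank == 0) ? 1 : 0;
  MPI_Sendrecv(sendbuf, count, MPI_DOUBLE, partner, tag,
               recvbuf, count, MPI_DOUBLE, partner, tag,
               comm, &stat);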

9 Message order

Messages in MPI are said to be non-overtaking: two messages sent from one process to the same destination are guaranteed to be matched in the order they were sent.
However, nothing is guaranteed about the relative order of messages sent from different processes, regardless of when the sends were initiated.

10 Illustration of message ordering

[Diagram: P0 and P2 each send a sequence of tagged messages (e.g. dest = 1, tag = 1; dest = 1, tag = 4), while P1 posts receives with wildcards (src = *, tag = *). Messages from each individual sender match in the order sent, but the interleaving of P0's and P2's messages at P1 is not determined.]

11 Another example

  int rank;
  MPI_Comm_rank(MPI_COMM_WORLD, &rank);
  if (rank == 0) {
      MPI_Send(buf1, count, MPI_FLOAT, 2, tag, comm);
      MPI_Send(buf2, count, MPI_FLOAT, 1, tag, comm);
  } else if (rank == 1) {
      MPI_Recv(buf2, count, MPI_FLOAT, 0, tag, comm, &stat);
      MPI_Send(buf2, count, MPI_FLOAT, 2, tag, comm);
  } else if (rank == 2) {
      MPI_Recv(buf1, count, MPI_FLOAT, MPI_ANY_SOURCE, tag, comm, &stat);
      MPI_Recv(buf2, count, MPI_FLOAT, MPI_ANY_SOURCE, tag, comm, &stat);
  }

12 Illustration of previous code

[Diagram: rank 0 sends to rank 2 directly and, via rank 1, indirectly; rank 2 posts two wildcard receives.] Which message will arrive at rank 2 first? Impossible to say! The non-overtaking guarantee applies only per sender, and these messages come from different senders.

13 Progress

If a pair of matching send/recv operations has been initiated, at least one of the two will complete, regardless of any other actions in the system:
  the send will complete, unless the recv is satisfied by another message;
  the recv will complete, unless the message sent is consumed by another matching recv.

14 Fairness

MPI makes no guarantee of fairness.
If MPI_ANY_SOURCE is used, a sent message may repeatedly be overtaken by other messages (from different processes) that match the same receive.

15 Send modes

To this point, we have studied blocking send routines using standard mode.
In standard mode, the implementation determines whether buffering occurs.
This has major implications for writing safe programs.

16 Other send modes

MPI includes three other send modes that give the user explicit control over buffering: buffered, synchronous, and ready modes.
The corresponding MPI functions are:
  MPI_Bsend
  MPI_Ssend
  MPI_Rsend

17 MPI_Bsend

Buffered send: allows the user to explicitly create buffer space and attach it to send operations.

  MPI_Bsend(void *buf, int count, MPI_Datatype type, int dest, int tag, MPI_Comm comm)
    Note: these are the same arguments as the standard send.
  MPI_Buffer_attach(void *buf, int size)
    Creates buffer space to be used with Bsend.
  MPI_Buffer_detach(void *buf, int *size)
    Note: in the detach case the void* argument is really a pointer to the buffer's address, so that the address of the detached buffer can be returned.
    Note: this call blocks until all messages currently in the buffer have been safely sent.
  Note: it is up to the user to properly manage the buffer and ensure space is available for any Bsend call.
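A minimal usage sketch (not from the slides; data, count, dest, and tag are assumed to be defined elsewhere). The attached buffer must include MPI_BSEND_OVERHEAD bytes of bookkeeping space per message on top of the data itself:

  #include <stdlib.h>

  int size = count * sizeof(double) + MPI_BSEND_OVERHEAD;
  char *buf = malloc(size);

  MPI_Buffer_attach(buf, size);                  /* hand the buffer to MPI */
  MPI_Bsend(data, count, MPI_DOUBLE, dest, tag, MPI_COMM_WORLD);
  MPI_Buffer_detach(&buf, &size);                /* blocks until buffered sends drain */
  free(buf);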

18 MPI_Ssend

Synchronous send:
  Completion never depends on buffering.
  Couples the send and receive operations: the send cannot complete until the matching receive has been posted and has started receiving the message on the remote process.
  Very good for testing the buffer safety of a program: if a program deadlocks when every MPI_Send is replaced by MPI_Ssend, it was relying on buffering.

19 MPI_Rsend

Ready send:
  The matching receive must already be posted when the send starts; otherwise the program is erroneous.
  Can be implemented to avoid handshake overhead when the program is known to meet this condition.
  Rarely used, and dangerous.

20 Implementation observations

MPI_Send could legally be implemented as MPI_Ssend, but this would be surprising and undesirable.
MPI_Rsend could be implemented as MPI_Ssend, but this would eliminate any performance advantage.
Standard mode (MPI_Send) is the mode most likely to be implemented efficiently.

21 Embarrassingly parallel examples

  Mandelbrot set
  Monte Carlo methods
  Image manipulation

22 Embarrassingly parallel

Also referred to as naturally parallel.
Each processor works on its own sub-chunk of the data independently.
Little or no communication is required.

23 Mandelbrot set

Creates pretty and interesting fractal images with a simple iterative algorithm:

  z_{k+1} = z_k * z_k + c

Both z and c are complex numbers. For each point c we iterate this formula until either
  a specified maximum number of iterations has occurred, or
  the magnitude of z surpasses 2.
In the former case the point is taken to be in the Mandelbrot set; in the latter case it is not.
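A sketch of the per-point computation using C99 complex arithmetic (max_iter is a tunable parameter; the function name is ours, not from the slides):

  #include <complex.h>

  /* Returns the iteration at which |z| exceeded 2, or max_iter
     if the point is assumed to be in the set. */
  int mandel_iter(double complex c, int max_iter) {
      double complex z = 0.0;
      for (int k = 0; k < max_iter; k++) {
          z = z * z + c;            /* z_{k+1} = z_k * z_k + c */
          if (cabs(z) > 2.0)
              return k;             /* escaped: not in the set */
      }
      return max_iter;              /* assumed in the set */
  }

When rendering, the returned iteration count is typically mapped to a pixel color.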

24 Parallelizing the Mandelbrot set

What are the major defining features of the problem?
  Each point is computed completely independently of every other point.
  Load balancing issues: how do we keep all processors busy?
Strategies for parallelization? One simple option is sketched below.
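A sketch under stated assumptions: mandel_iter is the per-point routine above, and pixel_to_c, image, WIDTH, HEIGHT, and MAX_ITER are hypothetical names. A cyclic (interleaved) distribution of rows is chosen because adjacent rows have similar cost, so interleaving spreads the expensive regions near the set across processes:

  int rank, nprocs;
  MPI_Comm_rank(MPI_COMM_WORLD, &rank);
  MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

  /* Rank r computes rows r, r + nprocs, r + 2*nprocs, ... */
  for (int row = rank; row < HEIGHT; row += nprocs)
      for (int col = 0; col < WIDTH; col++)
          image[row][col] = mandel_iter(pixel_to_c(row, col), MAX_ITER);

A dynamic master-worker scheme, where a master hands out rows on demand, balances load even better at the cost of some communication.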

25 Mandelbrot set: simple example

See mandelbrot.c and mandelbrot_par.c for simple serial and parallel implementations.
Think about how load balancing could be better handled.

26 Monte Carlo methods

A generic description of a class of methods that use random sampling to estimate values of integrals, etc.
A simple example is to estimate the value of pi.

27 Using Monte Carlo to estimate pi

[Figure: a circle of radius 1 inscribed in a square.] The ratio of the area of the circle to the area of the square is pi/4. The fraction of randomly selected points that lie in the circle estimates this ratio; hence pi is approximately 4 times that fraction.
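A self-contained serial sketch (not from the slides; rand() is a poor generator, used here only for illustration):

  #include <stdio.h>
  #include <stdlib.h>

  /* Sample the unit square [0,1) x [0,1); the quarter circle
     x^2 + y^2 <= 1 covers pi/4 of its area. */
  int main(void) {
      long n = 10000000, hits = 0;
      srand(12345);
      for (long i = 0; i < n; i++) {
          double x = (double)rand() / RAND_MAX;
          double y = (double)rand() / RAND_MAX;
          if (x * x + y * y <= 1.0)
              hits++;
      }
      printf("pi ~= %f\n", 4.0 * (double)hits / (double)n);
      return 0;
  }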

28 Parallelizing Monte Carlo

What are the general features of the algorithm?
  Each sample is independent of the others.
  Memory is not an issue: a master-slave architecture?
  Getting independent random numbers in parallel is an issue. How can this be done? One naive approach is sketched below.
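A naive parallel fragment (a sketch: per-rank seeding with srand is simplistic and does not guarantee independent streams; real codes use a parallel RNG library such as SPRNG). Each rank counts its own hits and the counts are combined with MPI_Reduce:

  /* Fragment: assumes MPI_Init has been called and n_per_rank is set. */
  int rank, nprocs;
  long hits = 0, total = 0;
  MPI_Comm_rank(MPI_COMM_WORLD, &rank);
  MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

  srand(12345 + rank);              /* naive per-rank seed (assumption) */
  for (long i = 0; i < n_per_rank; i++) {
      double x = (double)rand() / RAND_MAX;
      double y = (double)rand() / RAND_MAX;
      if (x * x + y * y <= 1.0)
          hits++;
  }

  MPI_Reduce(&hits, &total, 1, MPI_LONG, MPI_SUM, 0, MPI_COMM_WORLD);
  if (rank == 0)
      printf("pi ~= %f\n", 4.0 * (double)total / ((double)n_per_rank * nprocs));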

29 Image transformation

Simple image operations such as rotating, shifting, and scaling parallelize very easily (see Wilkinson pg. 81).
However, image smoothing is a step more involved, since the computation of each output pixel depends on neighboring pixels, so processes must exchange data along the borders of their sub-images.

