Lesson 2: Point-to-Point Semantics and Embarrassingly Parallel Examples

send/recv
What we learned last class:
- MPI_Send(void *buf, int count, MPI_Datatype type, int dest, int tag, MPI_Comm comm)
- MPI_Recv(void *buf, int count, MPI_Datatype type, int src, int tag, MPI_Comm comm, MPI_Status *stat)
- stat is a C struct (passed by address) returned with at least the following fields:
  stat.MPI_SOURCE
  stat.MPI_TAG
  stat.MPI_ERROR
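
A minimal, self-contained sketch (assuming at least two processes; the buffer name, tag value, and message contents are illustrative, not from the slides) showing the blocking send/recv calls above and how the status fields can be read on the receiving side:

    #include <stdio.h>
    #include <mpi.h>

    int main(int argc, char **argv)
    {
        int rank, value = 42;
        MPI_Status stat;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        if (rank == 0) {
            /* send one int to rank 1 with tag 7 */
            MPI_Send(&value, 1, MPI_INT, 1, 7, MPI_COMM_WORLD);
        } else if (rank == 1) {
            /* receive one int from rank 0; the status records source and tag */
            MPI_Recv(&value, 1, MPI_INT, 0, 7, MPI_COMM_WORLD, &stat);
            printf("got %d from rank %d with tag %d\n",
                   value, stat.MPI_SOURCE, stat.MPI_TAG);
        }

        MPI_Finalize();
        return 0;
    }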

Blocking vs. non-blocking
- The send/recv functions on the previous slide are referred to as blocking point-to-point communication
- MPI also has non-blocking send/recv functions, MPI_Isend and MPI_Irecv, that will be studied next class
- The semantics of the two are very different – you must understand the rules carefully to write safe programs

Blocking recv
Semantics of blocking recv:
- A blocking receive can be started whether or not a matching send has been posted
- A blocking receive returns only after its receive buffer contains the newly received message
- A blocking receive can complete before the matching send has completed (but only after it has started)

Blocking send
Semantics of blocking send:
- Can start whether or not a matching recv has been posted
- Returns only after the message in the data envelope is safe to be overwritten
- This can mean that the data was either buffered or sent directly to the receiving process
- Which happens is up to the implementation
- This has very strong implications for writing safe programs

Examples

    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    if (rank == 0) {
        MPI_Send(sendbuf, count, MPI_DOUBLE, 1, tag, comm);
        MPI_Recv(recvbuf, count, MPI_DOUBLE, 1, tag, comm, &stat);
    } else if (rank == 1) {
        MPI_Recv(recvbuf, count, MPI_DOUBLE, 0, tag, comm, &stat);
        MPI_Send(sendbuf, count, MPI_DOUBLE, 0, tag, comm);
    }

Is this program safe? Why or why not?
Yes, this is safe even if no buffer space is available! Rank 0's send is matched by the receive rank 1 posts first, so there is no circular wait.

Examples

    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    if (rank == 0) {
        MPI_Recv(recvbuf, count, MPI_DOUBLE, 1, tag, comm, &stat);
        MPI_Send(sendbuf, count, MPI_DOUBLE, 1, tag, comm);
    } else if (rank == 1) {
        MPI_Recv(recvbuf, count, MPI_DOUBLE, 0, tag, comm, &stat);
        MPI_Send(sendbuf, count, MPI_DOUBLE, 0, tag, comm);
    }

Is this program safe? Why or why not?
No, this will always deadlock! Both processes block in MPI_Recv waiting for a message the other has not yet sent.

Examples

    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    if (rank == 0) {
        MPI_Send(sendbuf, count, MPI_DOUBLE, 1, tag, comm);
        MPI_Recv(recvbuf, count, MPI_DOUBLE, 1, tag, comm, &stat);
    } else if (rank == 1) {
        MPI_Send(sendbuf, count, MPI_DOUBLE, 0, tag, comm);
        MPI_Recv(recvbuf, count, MPI_DOUBLE, 0, tag, comm, &stat);
    }

Is this program safe? Why or why not?
Often, but not always! It depends on buffer space: if the implementation cannot buffer either message, both sends block and neither receive is ever posted.

Message order
- Messages in MPI are said to be non-overtaking
- That is, messages sent from one process to another process are guaranteed to arrive in the same order they were sent
- However, nothing is guaranteed about the relative order of messages sent from different processes, regardless of when each send was initiated

Illustration of message ordering
[Figure: P0 and P2 each post a sequence of sends with various dest/tag values while P1 posts receives with wildcard source and/or tag; non-overtaking constrains the matching order only among messages from the same sender.]

Another example

    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    if (rank == 0) {
        MPI_Send(buf1, count, MPI_FLOAT, 2, tag, comm);
        MPI_Send(buf2, count, MPI_FLOAT, 1, tag, comm);
    } else if (rank == 1) {
        MPI_Recv(buf2, count, MPI_FLOAT, 0, tag, comm, &stat);
        MPI_Send(buf2, count, MPI_FLOAT, 2, tag, comm);
    } else if (rank == 2) {
        MPI_Recv(buf1, count, MPI_FLOAT, MPI_ANY_SOURCE, tag, comm, &stat);
        MPI_Recv(buf2, count, MPI_FLOAT, MPI_ANY_SOURCE, tag, comm, &stat);
    }

Illustration of previous code
[Figure: rank 0 sends to rank 2 and to rank 1; rank 1 forwards its message to rank 2, which posts two MPI_ANY_SOURCE receives.]
Which message will arrive at rank 2 first? Impossible to say!

Progress
- If a pair of matching send/recv operations has been initiated, at least one of the two will complete, regardless of any other actions in the system
- The send will complete, unless the recv is satisfied by another message
- The recv will complete, unless the message sent is consumed by another matching recv

Fairness
- MPI makes no guarantee of fairness
- If MPI_ANY_SOURCE is used, a sent message may repeatedly be overtaken by other messages (from different processes) that match the same receive

Send modes
- To this point, we have studied blocking send routines using standard mode
- In standard mode, the implementation determines whether buffering occurs
- This has major implications for writing safe programs

Other send modes
- MPI includes three other send modes that give the user explicit control over buffering
- These are: buffered, synchronous, and ready modes
- Corresponding MPI functions:
  MPI_Bsend
  MPI_Ssend
  MPI_Rsend

MPI_Bsend
Buffered send: allows the user to explicitly create buffer space and attach the buffer to send operations:
- MPI_Bsend(void *buf, int count, MPI_Datatype type, int dest, int tag, MPI_Comm comm)
  Note: the arguments are the same as for the standard send
- MPI_Buffer_attach(void *buf, int size);
  Creates buffer space to be used with Bsend
- MPI_Buffer_detach(void *buf, int *size);
  Note: in the detach case the void* argument is really a pointer to the buffer address, so the address of the attached buffer can be returned
  Note: the detach call blocks until all messages in the buffer have been safely sent
- Note: it is up to the user to properly manage the buffer and ensure space is available for any Bsend call
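
A minimal usage sketch (the helper name, buffer sizing, and message parameters are illustrative assumptions, not from the slides): attach a buffer, issue MPI_Bsend, then detach before the buffer is freed.

    #include <stdlib.h>
    #include <mpi.h>

    /* Called on the sending rank; assumes 'dest' and 'tag' are valid. */
    void buffered_send_example(double *sendbuf, int count, int dest, int tag)
    {
        /* Reserve space for one message plus MPI's bookkeeping overhead. */
        int size = count * sizeof(double) + MPI_BSEND_OVERHEAD;
        void *buffer = malloc(size);

        MPI_Buffer_attach(buffer, size);
        MPI_Bsend(sendbuf, count, MPI_DOUBLE, dest, tag, MPI_COMM_WORLD);

        /* Detach blocks until the buffered message has been sent, then
           returns the buffer address and size so it can be freed. */
        MPI_Buffer_detach(&buffer, &size);
        free(buffer);
    }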

MPI_Ssend
Synchronous send
- Ensures that no buffering is used
- Couples the send and receive operations – the send cannot complete until the matching receive has been posted on the remote process and has started to receive the message
- Very good for testing the buffer safety of a program
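
One way to use this for testing (a sketch reusing the variable names from the earlier exchange examples, not a complete program): replace MPI_Send with MPI_Ssend in the "often, but not always safe" exchange; if the program then hangs, it was relying on buffering.

    /* Same exchange as the earlier example, but with MPI_Ssend: both ranks
       block in the synchronous send waiting for the other's receive to be
       posted, so this version always deadlocks. */
    if (rank == 0) {
        MPI_Ssend(sendbuf, count, MPI_DOUBLE, 1, tag, comm);
        MPI_Recv(recvbuf, count, MPI_DOUBLE, 1, tag, comm, &stat);
    } else if (rank == 1) {
        MPI_Ssend(sendbuf, count, MPI_DOUBLE, 0, tag, comm);
        MPI_Recv(recvbuf, count, MPI_DOUBLE, 0, tag, comm, &stat);
    }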

MPI_Rsend
Ready send
- The matching receive must already be posted before the send starts, otherwise the program is erroneous
- Can be implemented to avoid handshake overhead when the program is known to meet this condition
- Not very typical, and dangerous

Implementation observations
- MPI_Send could be implemented as MPI_Ssend, but this would be unusual and undesirable
- MPI_Rsend could be implemented as MPI_Ssend, but this would eliminate any performance enhancement
- Standard mode (MPI_Send) is the most likely to be efficiently implemented

Embarrassingly parallel examples
- Mandelbrot set
- Monte Carlo methods
- Image manipulation

Embarrassingly parallel
- Also referred to as naturally parallel
- Each processor works on its own sub-chunk of the data independently
- Little or no communication is required

Mandelbrot set
- Creates pretty and interesting fractal images with a simple recurrence:
  z_{k+1} = z_k * z_k + c
- Both z and c are complex numbers
- For each point c we iterate this formula (starting from z_0 = 0) until either:
  a specified maximum number of iterations has occurred, or
  the magnitude of z surpasses 2
- In the former case the point is taken to be in the Mandelbrot set; in the latter case it is not
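
A small sketch of the per-point computation (the function and parameter names are illustrative; the course's mandelbrot.c may differ): iterate the recurrence and return how many steps were taken before |z| exceeded 2.

    /* Iterate z <- z*z + c for the point c = (cre, cim).
       Returns the iteration count; a value equal to max_iter means the
       point is considered inside the Mandelbrot set. */
    int mandel_point(double cre, double cim, int max_iter)
    {
        double zre = 0.0, zim = 0.0;
        int k;
        for (k = 0; k < max_iter; k++) {
            double new_re = zre * zre - zim * zim + cre;  /* real part of z*z + c */
            double new_im = 2.0 * zre * zim + cim;        /* imag part of z*z + c */
            zre = new_re;
            zim = new_im;
            if (zre * zre + zim * zim > 4.0)              /* |z| > 2 => diverges */
                break;
        }
        return k;
    }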

Parallelizing the Mandelbrot set
What are the major defining features of the problem?
- Each point is computed completely independently of every other point
- Load balancing issues – how to keep the processors busy
Strategies for parallelization? (one possibility is sketched below)
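
One possible static strategy (a sketch using the mandel_point helper above; it is not the course's mandelbrot_par.c): deal rows out cyclically so that expensive and cheap regions of the image tend to be spread across the ranks.

    /* Cyclic row distribution: rank r computes rows r, r+nprocs, r+2*nprocs, ...
       Rows near the set boundary are costly, so interleaving rows gives each
       rank a mix of cheap and expensive work.  'counts' is a full
       width*height array; each rank fills only the rows it owns. */
    void compute_my_rows(int rank, int nprocs, int width, int height,
                         int max_iter, int *counts)
    {
        for (int row = rank; row < height; row += nprocs) {
            for (int col = 0; col < width; col++) {
                double cre = -2.0 + 3.0 * col / (width - 1);   /* map to [-2, 1]     */
                double cim = -1.5 + 3.0 * row / (height - 1);  /* map to [-1.5, 1.5] */
                counts[row * width + col] = mandel_point(cre, cim, max_iter);
            }
        }
    }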

Mandelbrot set: simple example
- See mandelbrot.c and mandelbrot_par.c for simple serial and parallel implementations
- Think about how load balancing could be handled better

Monte Carlo methods
- Generic description of a class of methods that uses random sampling to estimate values of integrals, etc.
- A simple example is to estimate the value of pi

Using Monte Carlo to estimate pi
[Figure: a circle inscribed in a square of side 1, with randomly scattered points.]
- The ratio of the area of the circle to the area of the square is pi/4
- The fraction of randomly selected points that lie in the circle estimates this ratio of areas, hence pi/4
- What is the value of pi? (see the sketch below)
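
A serial sketch of the estimator (the sample count and use of the C library rand() are illustrative assumptions): draw points uniformly in the unit square and count how many fall inside the quarter circle of radius 1.

    #include <stdio.h>
    #include <stdlib.h>

    int main(void)
    {
        long n = 1000000, hits = 0;

        srand(12345);                                   /* fixed seed, just for the sketch */
        for (long i = 0; i < n; i++) {
            double x = (double)rand() / RAND_MAX;       /* uniform in [0,1] */
            double y = (double)rand() / RAND_MAX;
            if (x * x + y * y <= 1.0)                   /* inside quarter circle */
                hits++;
        }
        /* hits/n estimates pi/4 */
        printf("pi is approximately %f\n", 4.0 * (double)hits / n);
        return 0;
    }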

Parallelizing Monte Carlo
What are the general features of the algorithm?
- Each sample is independent of the others
- Memory is not an issue – a master/slave architecture?
- Getting independent random numbers in parallel is an issue. How can this be done? (one simple approach is sketched below)
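
A possible point-to-point sketch (a fragment assuming rank, nprocs, and n_local are set up as in the earlier examples; the naive rank-dependent seeding is a placeholder, not a statistically rigorous parallel RNG): every rank samples independently and the workers send their hit counts to rank 0, which combines them.

    /* Each rank draws n_local samples with a rank-dependent seed, then the
       workers send their hit counts to rank 0 with blocking sends. */
    long local_hits = 0;
    srand(12345 + rank);                 /* naive per-rank seeding (placeholder) */
    for (long i = 0; i < n_local; i++) {
        double x = (double)rand() / RAND_MAX;
        double y = (double)rand() / RAND_MAX;
        if (x * x + y * y <= 1.0)
            local_hits++;
    }

    if (rank == 0) {
        long total = local_hits, remote;
        MPI_Status stat;
        for (int p = 1; p < nprocs; p++) {   /* collect one count per worker */
            MPI_Recv(&remote, 1, MPI_LONG, p, 0, MPI_COMM_WORLD, &stat);
            total += remote;
        }
        printf("pi is approximately %f\n", 4.0 * (double)total / (n_local * nprocs));
    } else {
        MPI_Send(&local_hits, 1, MPI_LONG, 0, 0, MPI_COMM_WORLD);
    }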

Image transformation
- Simple image operations such as rotating, shifting, and scaling parallelize very easily (see Wilkinson pg. 81)
- However, image smoothing is just a step more involved, since the computation for each output pixel depends on neighboring pixels, so processes must exchange data along the boundaries of their sub-images (see the sketch below)
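
A sketch of that extra step (a row-block decomposition with one-row halos; the array layout, names, and the even/odd ordering of blocking calls are assumptions in the spirit of the earlier "safe" example): each rank exchanges its boundary rows with its neighbours before smoothing.

    /* Exchange one halo row with each neighbour before applying the smoothing
       stencil.  'local' holds my block of my_rows rows plus one ghost row
       above and below; 'width' is the row length.  Even ranks send first and
       odd ranks receive first, so every blocking call is matched (no deadlock).
       Boundary ranks use MPI_PROC_NULL, which makes those calls no-ops. */
    void exchange_halos(double *local, int my_rows, int width,
                        int rank, int nprocs)
    {
        MPI_Status stat;
        int up   = (rank > 0)          ? rank - 1 : MPI_PROC_NULL;
        int down = (rank < nprocs - 1) ? rank + 1 : MPI_PROC_NULL;
        double *first_row = &local[1 * width];             /* my first real row */
        double *last_row  = &local[my_rows * width];       /* my last real row  */
        double *top_ghost = &local[0];                     /* ghost row above   */
        double *bot_ghost = &local[(my_rows + 1) * width]; /* ghost row below   */

        if (rank % 2 == 0) {
            MPI_Send(first_row, width, MPI_DOUBLE, up,   0, MPI_COMM_WORLD);
            MPI_Recv(top_ghost, width, MPI_DOUBLE, up,   0, MPI_COMM_WORLD, &stat);
            MPI_Send(last_row,  width, MPI_DOUBLE, down, 0, MPI_COMM_WORLD);
            MPI_Recv(bot_ghost, width, MPI_DOUBLE, down, 0, MPI_COMM_WORLD, &stat);
        } else {
            MPI_Recv(bot_ghost, width, MPI_DOUBLE, down, 0, MPI_COMM_WORLD, &stat);
            MPI_Send(last_row,  width, MPI_DOUBLE, down, 0, MPI_COMM_WORLD);
            MPI_Recv(top_ghost, width, MPI_DOUBLE, up,   0, MPI_COMM_WORLD, &stat);
            MPI_Send(first_row, width, MPI_DOUBLE, up,   0, MPI_COMM_WORLD);
        }
    }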