Basics of Message-passing (presentation transcript)

1 Basics of Message-passing
Mechanics of message-passing
–A means of creating separate processes on different computers
–A way to send and receive messages
Single program multiple data (SPMD) model
–Logic for multiple processes merged into one program
–Control statements separate the blocks of logic for each processor
–A compiled program is stored on each processor
–All executables are started together statically
–Example: MPI
Multiple program multiple data (MPMD) model
–Each processor has a separate master program
–The master program spawns child processes dynamically
–Example: PVM

2 Point-to-point Communication
General syntax (generic forms; the actual MPI formats appear later)
–Send(data, destination, message tag)
–Receive(data, source, message tag)
Synchronous
–Completes after the data is safely transferred
–No copying between message buffers
Asynchronous
–Completes when transmission begins
–Local buffers are free for application use
Figure: process 1 executes send(&x, 2) to send x to process 2; process 2 executes recv(&y, 1) to receive it into y.

3 Synchronized sends and receives
Figure (b): recv() occurs before send()

4 Point-to-point MPI Calls
Synchronous send – completes when the data is successfully received
Buffered send – completes after the data is copied to a user-supplied buffer
–Becomes synchronous if no buffers are available
Ready send – synchronous; a matching receive must precede the send
–Completion occurs when the remote processor receives the data
Receive – completes when the data becomes available
Standard send
–If a receive is posted, completes once the data is on its way (asynchronous)
–If no receive is posted, completes when the data is buffered by MPI
–Becomes synchronous if no buffers are available
Blocking – the call returns when the operation completes
Non-blocking – the call returns immediately
–The application is responsible for polling or waiting for completion
–Allows more parallel processing
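A minimal sketch contrasting two of these send modes, assuming rank 0 sends a single int to rank 1; the function name and tags are illustrative only, and the buffered mode is shown separately on the next slide.

    #include <mpi.h>

    void send_mode_examples(int myrank) {
        int x = 42;
        MPI_Status stat;
        if (myrank == 0) {
            MPI_Ssend(&x, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);  /* synchronous mode send */
            MPI_Send (&x, 1, MPI_INT, 1, 1, MPI_COMM_WORLD);  /* standard mode send */
        } else if (myrank == 1) {
            MPI_Recv(&x, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, &stat);
            MPI_Recv(&x, 1, MPI_INT, 0, 1, MPI_COMM_WORLD, &stat);
        }
    }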

5 Buffered Send Example
Applications supply a data buffer area using MPI_Buffer_attach() to hold the data during transmission
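A minimal sketch of a buffered send, assuming rank 0 sends one int to rank 1; the buffer size includes MPI_BSEND_OVERHEAD as the MPI standard requires.

    #include <mpi.h>
    #include <stdlib.h>

    void buffered_send_example(int myrank) {
        int x = 7;
        if (myrank == 0) {
            int size = sizeof(int) + MPI_BSEND_OVERHEAD;      /* room for data plus bookkeeping */
            char *buf = malloc(size);
            MPI_Buffer_attach(buf, size);                     /* supply the user buffer */
            MPI_Bsend(&x, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);  /* copies x into buf, then returns */
            MPI_Buffer_detach(&buf, &size);                   /* blocks until buffered sends finish */
            free(buf);
        } else if (myrank == 1) {
            MPI_Status stat;
            MPI_Recv(&x, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, &stat);
        }
    }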

6 Message Tags
Differentiate between types of messages
The message tag is carried within the message
Wild-card receive operations
–MPI_ANY_TAG: matches any message tag
–MPI_ANY_SOURCE: matches messages from any sender
Example: send a message with tag 5 from buffer x to buffer y in process 2
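A minimal sketch of the tag example above, assuming x and y are single ints and rank 0 is the sender; the receiver uses the wild cards so that any tag and source would match.

    #include <mpi.h>

    void tag_example(int myrank) {
        int x = 1, y;
        MPI_Status stat;
        if (myrank == 0) {
            MPI_Send(&x, 1, MPI_INT, 2, 5, MPI_COMM_WORLD);   /* tag 5, destination process 2 */
        } else if (myrank == 2) {
            /* Wild cards: accept any tag from any sender */
            MPI_Recv(&y, 1, MPI_INT, MPI_ANY_SOURCE, MPI_ANY_TAG, MPI_COMM_WORLD, &stat);
            /* stat.MPI_TAG and stat.MPI_SOURCE report what actually arrived */
        }
    }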

7 Collective Communication
Routes a message to a communicator (group of processors)
Broadcast (MPI_Bcast()): broadcast or multicast data to the processors in a group
Scatter (MPI_Scatter()): send part of an array to separate processes
Gather (MPI_Gather()): collect array elements from separate processes
All-to-all (MPI_Alltoall()): a combination of gather and scatter
Reduce (MPI_Reduce()): combine values from all processes into a single value
Reduce-scatter (MPI_Reduce_scatter()): reduce, then scatter the result
Scan (MPI_Scan()): perform a prefix reduction on data across all processors
Barrier (MPI_Barrier()): pause until all processors reach the barrier call
Advantages
–MPI can use the processor hierarchy to improve efficiency
–Less programming is needed for collective operations

8 Broadcast
Broadcast – sending the same message to all processes
Multicast – sending the same message to a defined group of processes
Figure: processes 0 through p−1 each call bcast(); the root's buffer buf is copied into data on every other process.
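A minimal MPI_Bcast() sketch, assuming every process calls it with root 0 and a four-element array; the array size is illustrative.

    #include <mpi.h>

    void broadcast_example(int myrank) {
        int data[4];
        if (myrank == 0) {                        /* only the root fills the buffer */
            for (int i = 0; i < 4; i++) data[i] = i;
        }
        /* Every process calls MPI_Bcast with the same root; afterwards
           data[] holds the root's values on every rank */
        MPI_Bcast(data, 4, MPI_INT, 0, MPI_COMM_WORLD);
    }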

9 Scatter
Distributes each element of an array to a separate process
The contents of the ith location of the array are transmitted to process i
Figure: processes 0 through p−1 each call scatter(); process 0's buf is divided among the processes' data buffers.
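A minimal MPI_Scatter() sketch, assuming p processes and one int per process; element i of the root's array ends up on rank i. The buffer size and values are illustrative.

    #include <mpi.h>

    void scatter_example(int myrank, int p) {
        int sendbuf[64];     /* assumed large enough; used only on the root */
        int myvalue;
        if (myrank == 0) {
            for (int i = 0; i < p; i++) sendbuf[i] = i * i;   /* element i goes to rank i */
        }
        MPI_Scatter(sendbuf, 1, MPI_INT, &myvalue, 1, MPI_INT, 0, MPI_COMM_WORLD);
        /* Now myvalue == myrank * myrank on every process */
    }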

10 Gather
One process collects individual values from a set of processes
Figure: processes 0 through p−1 each call gather(); the processes' data buffers are concatenated into buf on process 0.

11 Reduce
Performs a distributed calculation
Example: perform addition over a distributed array
Figure: processes 0 through p−1 each call reduce(); the data values are combined (here with +) into buf on process 0.
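A minimal MPI_Reduce() sketch for the addition example, assuming each rank contributes one int; the sum arrives on rank 0 only.

    #include <mpi.h>
    #include <stdio.h>

    void reduce_example(int myrank) {
        int local = myrank + 1;   /* each process contributes one value */
        int sum = 0;
        /* Combine all local values with MPI_SUM; only rank 0 receives the result */
        MPI_Reduce(&local, &sum, 1, MPI_INT, MPI_SUM, 0, MPI_COMM_WORLD);
        if (myrank == 0) printf("sum = %d\n", sum);
    }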

12 PVM (Parallel Virtual Machine)
Developed at Oak Ridge National Laboratory; freely distributed
A host process controls the environment
The parent process spawns other processes
Daemon processes handle the message passing
Sending a message: get a buffer, pack the data, send; the receiver unpacks
Non-blocking send; blocking or non-blocking receive
Wild cards for message tags and the source process
Sample program: page 52
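A minimal sketch of the get-buffer / pack / send / unpack sequence in PVM, assuming the parent already knows the child's task id (for example from pvm_spawn()); the tag value 99 is illustrative.

    #include <pvm3.h>

    void pvm_send_example(int child_tid) {
        int x = 42;
        pvm_initsend(PvmDataDefault);   /* get and initialize a send buffer */
        pvm_pkint(&x, 1, 1);            /* pack one int into the buffer */
        pvm_send(child_tid, 99);        /* send with message tag 99 */
    }

    void pvm_recv_example(void) {
        int x;
        pvm_recv(-1, 99);               /* blocking receive; -1 is the wild-card source */
        pvm_upkint(&x, 1, 1);           /* unpack the int from the receive buffer */
    }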

13 mpij and mpiJava
Overview
–mpiJava is a wrapper sitting on top of mpich or LAM/MPI
–mpij is a native Java implementation of MPI
Documentation
–mpiJava: http://www.hpjava.org/mpiJava.html
–mpij uses the same API as mpiJava
Java Grande consortium (http://www.javagrande.org)
–Sponsors conferences and encourages Java for parallel programming
–Maintains Java-based paradigms (mpiJava, HPJava, and mpiJ)
Other Java-based implementations
–JavaMPI is another, less popular MPI Java wrapper

14 SPMD Computation

    int main(int argc, char *argv[])
    {
        int myrank;
        MPI_Init(&argc, &argv);
        ...
        MPI_Comm_rank(MPI_COMM_WORLD, &myrank);
        if (myrank == 0)
            master();
        else
            slave();
        ...
        MPI_Finalize();
    }

The master process executes master(); the slave processes execute slave().
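As a usage note, a program like this is typically launched as an SPMD job with a command such as the following (the launcher name and flag syntax vary by MPI implementation; four processes is an assumed example):

    mpirun -np 4 ./spmd_program

All four copies run the same executable; the rank test selects master() on rank 0 and slave() everywhere else.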

15 Unsafe Message Passing
Delivery order is a function of timing among processors

16 Communicators
A communicator is a collection of processes
Communicators allow:
–Collective communication to groups of processors
–A mechanism to identify processors for point-to-point transfers
The default communicator is MPI_COMM_WORLD
–A unique rank corresponds to each executing process
–The rank is an integer from 0 to p − 1, where p is the number of executing processes
Applications can create subset communicators
–Each processor has a unique rank in each sub-communicator
–The rank is an integer from 0 to g − 1, where g is the number of processes in the group
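A minimal sketch of creating a subset communicator with MPI_Comm_split(), assuming processes are grouped by whether their world rank is even or odd; the grouping criterion is purely illustrative.

    #include <mpi.h>

    void communicator_example(void) {
        int worldrank, grouprank;
        MPI_Comm subcomm;
        MPI_Comm_rank(MPI_COMM_WORLD, &worldrank);
        /* Processes passing the same "color" end up in the same sub-communicator */
        MPI_Comm_split(MPI_COMM_WORLD, worldrank % 2, worldrank, &subcomm);
        MPI_Comm_rank(subcomm, &grouprank);   /* rank runs 0..g-1 within the group */
        MPI_Comm_free(&subcomm);
    }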

17 Point-to-point Message Transfer

    int x;
    int myrank;
    MPI_Status stat;                     /* a status structure, not an uninitialized pointer */
    MPI_Comm_rank(MPI_COMM_WORLD, &myrank);
    if (myrank == 0) {
        MPI_Send(&x, 1, MPI_INT, 1, 99, MPI_COMM_WORLD);
    } else if (myrank == 1) {
        MPI_Recv(&x, 1, MPI_INT, 0, 99, MPI_COMM_WORLD, &stat);
    }

18 Non-blocking Point-to-point Transfer

    int x;
    int myrank;
    MPI_Request req;                     /* request and status objects, not uninitialized pointers */
    MPI_Status stat;
    MPI_Comm_rank(MPI_COMM_WORLD, &myrank);
    if (myrank == 0) {
        MPI_Isend(&x, 1, MPI_INT, 1, 99, MPI_COMM_WORLD, &req);
        doSomeProcessing();              /* overlap computation with communication */
        MPI_Wait(&req, &stat);           /* x must not be reused before the wait completes */
    } else if (myrank == 1) {
        MPI_Recv(&x, 1, MPI_INT, 0, 99, MPI_COMM_WORLD, &stat);
    }

MPI_Isend() and MPI_Irecv() return immediately
MPI_Wait() returns after the operation completes
MPI_Test() sets its flag argument to non-zero if the operation is complete
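A minimal polling sketch with MPI_Test(), assuming the request req from the MPI_Isend() above; note that the completion flag is an output argument, not the return value.

    int flag = 0;
    MPI_Status teststat;
    while (!flag) {
        MPI_Test(&req, &flag, &teststat);   /* flag becomes non-zero once the send completes */
        doSomeProcessing();                 /* keep working while the message is in flight */
    }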

19 Collective Communication Example
Processor 0 gathers 10 items from each process in the group
The master processor allocates memory to hold the gathered data
The remote processors initialize their data arrays
All processors execute the MPI_Gather() function

    int data[10];                        /* data to gather from each process */
    int *buf = NULL;
    int i, myrank, grp_size;
    MPI_Comm_rank(MPI_COMM_WORLD, &myrank);
    if (myrank == 0) {
        MPI_Comm_size(MPI_COMM_WORLD, &grp_size);
        buf = (int *)malloc(grp_size * 10 * sizeof(int));   /* room for 10 ints per process */
    } else {
        for (i = 0; i < 10; i++)
            data[i] = myrank;
    }
    /* Note: the receive count is the number of elements per process (10), not the total */
    MPI_Gather(data, 10, MPI_INT, buf, 10, MPI_INT, 0, MPI_COMM_WORLD);

20 Calculate Parallel Run Time
Sequential execution time: t_s
–t_s = number of compute steps of the best sequential algorithm (Big Oh)
Parallel execution time: t_p = t_comp + t_comm
–Communication overhead: t_comm = m(t_startup + n*t_data), where
   t_startup is the message latency = time to send a message with no data
   t_data is the transmission time for one data element
   n is the number of data elements and m is the number of messages
–Computation overhead: t_comp = f(n, p)
Assumptions
–All processors are homogeneous and run at the same speed
–t_p is the worst-case execution time over all processors
–t_startup, t_data, and t_comp are measured in computational step units so they can be added together
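As a worked example with assumed (hypothetical) values: if t_startup = 100 and t_data = 1 in computational step units, then sending m = 4 messages of n = 1000 elements each costs t_comm = 4(100 + 1000*1) = 4400 step units, which is added to t_comp to estimate t_p.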

21 Estimating Scalability
Notes
–p and n respectively indicate the number of processors and data elements
–The formulae on the slide help estimate scalability with respect to p and n

22 Parallel Visualization Tools
Observe program behavior using a space-time diagram (or process-time diagram)

23 Parallel Program Development
Cautions
–Parallel programming is harder than sequential programming
–Some algorithms don't lend themselves to running in parallel
Advised steps of development
–Step 1: Program and test as much as possible sequentially
–Step 2: Code the parallel version
–Step 3: Run in parallel on one processor with a few threads
–Step 4: Add more threads as confidence grows
–Step 5: Run in parallel with a small number of processors
–Step 6: Add more processes as confidence grows
Tools (see the timing sketch below)
–Parallel debuggers can help
–Insert assertion error checks within the code
–Instrument the code (add print statements)
–Timing: time(), gettimeofday(), clock(), MPI_Wtime()
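A minimal timing sketch using MPI_Wtime(), one of the calls listed above; do_parallel_work() is a hypothetical placeholder for the section being timed.

    #include <mpi.h>
    #include <stdio.h>

    void timing_example(void) {
        double start = MPI_Wtime();      /* wall-clock time in seconds */
        do_parallel_work();              /* hypothetical section to be timed */
        double elapsed = MPI_Wtime() - start;
        printf("elapsed = %f seconds\n", elapsed);
    }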

