
1 Distributed Processing Systems: InterProcess Communication (Message Passing)
오 상 규
Graduate School of Information & Communications, Sogang University
Email: sgoh@macroimpact.com

2 What is Message Passing?
- Data transfer + synchronization: message passing requires the cooperation of both sender and receiver.
[Diagram: over time, Process 0 asks "May I send?", Process 1 answers "Yes!", and then DATA flows from Process 0 to Process 1.]

3 Characteristics of Message Passing
- Multiple threads of control: consists of multiple processes, each of which has its own control flow and may execute different code; supports MIMD or SPMD parallelism.
- Asynchronous parallelism: a message-passing program executes asynchronously; barriers and blocking communication are needed for synchronization.
- Separate address spaces: data variables in one process are not visible to other processes; special library routines (e.g., send/receive) are needed to interact with other processes.
- Explicit interactions: the programmer must resolve all interaction issues, such as communication and synchronization.
- Explicit allocation: data must be explicitly allocated by the user.

4 Message Passing Libraries
- Proprietary software
  - CMMD: message-passing library used on the Thinking Machines CM-5.
  - Express: programming environment by Parasoft Corporation for message passing and parallel I/O.
  - NX: microkernel system developed for Intel MPPs (e.g., Paragon); replaced by a new kernel called PUMA.
- Public-domain software
  - p4: a set of macros and subroutines for programming both shared-memory and message-passing systems.
  - PARMACS: message-passing package derived from p4 and mainly used in Europe.
- PVM and MPI
  - MPI: a standard specification for a library of message-passing functions, developed by the MPI Forum.
  - PVM: self-contained, public-domain software system to run parallel applications on a network of heterogeneous workstations.

5 Classification of Message Passing Libraries
- Application domain
  - General purpose: p4, PVM, MPI, Express, PARMACS, etc.; ISIS, Horus, Totem, and Transis for reliable group communication.
  - Application specific: BLACS (for linear algebra), TCGMSG (for chemistry), etc.
- Programming model
  - Computation model: data parallel or functional parallel.
  - Communication model: RPC, message passing, or shared memory.
- Underlying implementation philosophy
  - Sockets for portability.
  - High-performance communication middleware (e.g., Active Messages or Fast Messages) to achieve high performance.
- Portability
  - Machine-specific libraries such as CMMD (CM-5) and NX/2 (Intel parallel computers).
- Heterogeneity

6 High-Performance Message-Passing Schemes
- HW-Based Approach (e.g., Nectar, PAPER, SHRIMP, ParaStation, Memory Channel)
- SW-Based Approach
  - Multithreading (e.g., TPVM, LPVM, Chant)
  - High-Performance API
    - Standard (e.g., Fast Sockets)
    - Proprietary (e.g., U-Net, Active Messages (AM), Fast Messages (FM))
  - Middleware (e.g., MPI-Nexus, Panda-PVM)
- Hybrid Approach (e.g., MPI-FM, PVM-ATM)

7 Communication Modes in Message Passing
Three modes: synchronous message passing, blocking send/receive, and non-blocking send/receive, illustrated with this program pair:

Process P:                     Process Q:
    M = 10;                        S = -100;
L1: send M to Q;               L1: receive S from P;
L2: M = 20;                    L2: X = S + 1;
                                   goto L1;

Variable M is often called the send message buffer, and S is called the receive message buffer.

8 Three Communication Modes
- Synchronous message passing
  - P has to wait until Q executes the corresponding receive.
  - Send does not return until M is both sent and received.
  - No additional buffer is needed.
  - X evaluates to 11.
- Blocking send/receive (see the sketch below)
  - Send is executed when a process reaches it, without waiting for a corresponding receive.
  - Send does not return until the message is sent, so the message variable M can then be safely overwritten. The message may be temporarily buffered in the sending node, somewhere in the network, or in the receiving node.
  - Receive is executed when a process reaches it, without waiting for a corresponding send, and does not return until the message is received.
  - X evaluates to 11.
- Non-blocking send/receive
  - Send is executed when a process reaches it, without waiting for a corresponding receive, and returns immediately after notifying the system; it is unsafe to overwrite M.
  - Receive is also executed without waiting for a corresponding send, and returns immediately regardless of message arrival.
  - X can be 11, 21, or -99.
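A minimal sketch of the P/Q program in MPI (the library introduced on slide 10), assuming exactly two processes and omitting Q's loop; with blocking semantics, X is always 11:

#include "mpi.h"
#include <stdio.h>

int main ( int argc, char *argv[] )
{
    int rank, M, S, X ;
    MPI_Status status ;

    MPI_Init ( &argc, &argv ) ;
    MPI_Comm_rank ( MPI_COMM_WORLD, &rank ) ;

    if ( rank == 0 ) {                /* process P */
        M = 10 ;
        MPI_Send ( &M, 1, MPI_INT, 1, 0, MPI_COMM_WORLD ) ;         /* L1 */
        M = 20 ;                      /* L2: safe once MPI_Send returns */
    }
    else if ( rank == 1 ) {           /* process Q */
        S = -100 ;
        MPI_Recv ( &S, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, &status ) ; /* L1 */
        X = S + 1 ;                   /* L2: X == 11 */
        printf ( "X = %d\n", X ) ;
    }
    MPI_Finalize ( ) ;
    return 0 ;
}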

9 Comparison of Communication Modes

Communication Event                           | Synchronous                   | Blocking    | Non-Blocking
----------------------------------------------|-------------------------------|-------------|-----------------------
Send start condition                          | Both send and receive reached | Send reached| Send reached
Return of send indicates                      | Message received              | Message sent| Message send initiated
Semantics                                     | Clean                         | In-between  | Error-prone
Buffering message                             | Not needed                    | Needed      | Needed
Status checking                               | Not needed                    | Not needed  | Needed
Wait overhead                                 | Highest                       | In-between  | Lowest
Overlapping communications and computations   | No                            | Yes         | Yes

10 What is MPI?
- MPI: Message Passing Interface
  - Developed in 1993-1994 by the MPI Forum.
- A message-passing library specification
  - Can be used from C, FORTRAN, and C++ programs (comprises 129 functions and macros).
  - Not a language or compiler specification.
  - Not a specific implementation or product.
- A standard for programming parallel computers, clusters, and heterogeneous networks.

11 Reasons for Using MPI
- Standardization: the only message-passing library that can be considered a standard.
- Portability: no need to modify your source code when you port your application to a different platform.
- Performance: vendor implementations can exploit native hardware features to optimize performance.
- Availability: a variety of implementations are available.

12 Communicator
- A subset of processes forming a "communication universe", composed of:
  - Group: an ordered collection of processes; each process is assigned a unique rank (a non-negative integer process ID).
  - Context: a system-defined tag attached to a group.
- Purposes:
  - Identifying process subsets during development of modular programs.
  - Ensuring that messages intended for different purposes are not confused.
[Diagram: a communicator enclosing PROCESS 0, PROCESS 1, ..., PROCESS n.]

13 Types of Communicators
- Intra-communicators (see the sketch below)
  - A collection of processes that can send messages to each other and engage in collective communication operations.
  - e.g., MPI_COMM_WORLD (the default).
- Inter-communicators
  - Used for sending messages between processes belonging to disjoint intra-communicators.
  - e.g., a newly created intra-communicator can be linked to the original intra-communicator by an inter-communicator.
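A minimal sketch using MPI_Comm_split, which splits an existing communicator into disjoint intra-communicators by a color value (even vs. odd ranks here); each process is assigned a fresh rank inside its sub-communicator:

#include "mpi.h"
#include <stdio.h>

int main ( int argc, char *argv[] )
{
    int world_rank, sub_rank ;
    MPI_Comm subcomm ;

    MPI_Init ( &argc, &argv ) ;
    MPI_Comm_rank ( MPI_COMM_WORLD, &world_rank ) ;

    /* color selects the group; key orders the ranks within it */
    MPI_Comm_split ( MPI_COMM_WORLD, world_rank % 2, world_rank, &subcomm ) ;
    MPI_Comm_rank ( subcomm, &sub_rank ) ;

    printf ( "world rank %d -> sub rank %d\n", world_rank, sub_rank ) ;

    MPI_Comm_free ( &subcomm ) ;
    MPI_Finalize ( ) ;
    return 0 ;
}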

14 MPI Communication Model
- Point-to-point communication operations
  - Send a message from one named process to another.
  - Used to implement local and unstructured communications.
- Collective communication operations
  - Perform commonly used global operations such as summation and broadcast.

15 MPI Data Types

MPI DATA TYPE        | C DATA TYPE
---------------------|--------------------
MPI_CHAR             | signed char
MPI_SHORT            | signed short int
MPI_INT              | signed int
MPI_LONG             | signed long int
MPI_UNSIGNED_CHAR    | unsigned char
MPI_UNSIGNED_SHORT   | unsigned short int
MPI_UNSIGNED         | unsigned int
MPI_UNSIGNED_LONG    | unsigned long int
MPI_FLOAT            | float
MPI_DOUBLE           | double
MPI_LONG_DOUBLE      | long double
MPI_BYTE             | (no C equivalent)
MPI_PACKED           | (no C equivalent)

16 MPI Basic Functions
- MPI_INIT(int *argc, char ***argv): initiate an MPI computation.
- MPI_FINALIZE(): terminate a computation.
- MPI_COMM_SIZE(IN comm, OUT size): determine the number of processes.
  - MPI_Comm comm: communicator handle
  - int size: number of processes in the group of comm
- MPI_COMM_RANK(IN comm, OUT pid): determine my process identifier.
  - MPI_Comm comm: communicator handle
  - int pid: process id in the group of comm
Cf. IN: call by value; OUT: as return value; INOUT: call by reference.

17 Simple MPI Example

#include "mpi.h"                                  /* MPI header */

main ( int argc, char *argv[] )
{
    ...                      /* No MPI functions called before this */
    Ierr = MPI_Init ( &argc, &argv ) ;            /* initialize */
    ...
    MPI_Comm_size ( MPI_COMM_WORLD, &np ) ;       /* number of processes */
    MPI_Comm_rank ( MPI_COMM_WORLD, &myid ) ;     /* my process id */
    ...
    if ( myid != 0 )
        MPI_Send ( buff, 300, MPI_FLOAT, 0, 0, MPI_COMM_WORLD ) ;
    else
        MPI_Recv ( buff, 300, MPI_FLOAT, srcid, 0, MPI_COMM_WORLD, &status ) ;
    ...
    MPI_Finalize ( ) ;                            /* shutdown */
    /* No MPI functions called after this */
}

18 MPI Message
MPI message = data + envelope.
- Envelope: sender rank, receiver rank, a user-specified tag, and the communicator.
  - The tag is used to distinguish messages received from a single process.
- Mechanisms for grouping data items:
  - the count parameter
  - derived datatypes
  - MPI_Pack / MPI_Unpack (see the sketch below)
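A minimal sketch of the MPI_Pack / MPI_Unpack mechanism, assuming two processes and a 64-byte staging buffer: an int and a double are bundled into a single MPI_PACKED message and unpacked in the same order on the receiving side:

#include "mpi.h"
#include <stdio.h>

int main ( int argc, char *argv[] )
{
    char buf[64] ;
    int pos = 0, rank, n ;
    double x ;
    MPI_Status status ;

    MPI_Init ( &argc, &argv ) ;
    MPI_Comm_rank ( MPI_COMM_WORLD, &rank ) ;

    if ( rank == 0 ) {
        n = 42 ; x = 3.14 ;
        /* pack both items into buf; pos advances past each item */
        MPI_Pack ( &n, 1, MPI_INT, buf, sizeof(buf), &pos, MPI_COMM_WORLD ) ;
        MPI_Pack ( &x, 1, MPI_DOUBLE, buf, sizeof(buf), &pos, MPI_COMM_WORLD ) ;
        MPI_Send ( buf, pos, MPI_PACKED, 1, 0, MPI_COMM_WORLD ) ;
    }
    else if ( rank == 1 ) {
        MPI_Recv ( buf, sizeof(buf), MPI_PACKED, 0, 0, MPI_COMM_WORLD, &status ) ;
        /* unpack in the same order the items were packed */
        MPI_Unpack ( buf, sizeof(buf), &pos, &n, 1, MPI_INT, MPI_COMM_WORLD ) ;
        MPI_Unpack ( buf, sizeof(buf), &pos, &x, 1, MPI_DOUBLE, MPI_COMM_WORLD ) ;
        printf ( "received n=%d x=%f\n", n, x ) ;
    }
    MPI_Finalize ( ) ;
    return 0 ;
}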

19 MPI Point-to-Point Communication
[Diagram: within a communicator, PROCESS A calls SEND() and PROCESS B calls RECV(); the communication may be blocking or non-blocking.]

20 MPI Send/Receive Function Prototypes
- MPI_SEND(IN msg, IN count, IN datatype, IN dest, IN tag, IN comm): send a message.
  - void *msg: address of send buffer
  - int count: number of elements to send (>= 0)
  - MPI_Datatype datatype: data type of send buffer elements
  - int dest: process id of destination process
  - int tag: message tag
  - MPI_Comm comm: communicator handle
- MPI_RECV(OUT msg, IN count, IN datatype, IN source, IN tag, IN comm, OUT status): receive a message.
  - void *msg: address of receive buffer
  - int count: number of elements to receive (>= 0)
  - MPI_Datatype datatype: data type of receive buffer elements
  - int source: process id of source process, or MPI_ANY_SOURCE
  - int tag: message tag, or MPI_ANY_TAG
  - MPI_Comm comm: communicator handle
  - MPI_Status *status: status object

21 Blocking Message Passing Example

#include "mpi.h"
#include <stdio.h>

main ( int argc, char *argv[] )
{
    int numtasks, rank, dest, source, rc, tag = 1 ;
    char inmsg, outmsg = 'x' ;
    MPI_Status Stat ;

    MPI_Init ( &argc, &argv ) ;
    MPI_Comm_size ( MPI_COMM_WORLD, &numtasks ) ;
    MPI_Comm_rank ( MPI_COMM_WORLD, &rank ) ;

    if ( rank == 0 ) {
        dest = 1 ; source = 1 ;
        rc = MPI_Send ( &outmsg, 1, MPI_CHAR, dest, tag, MPI_COMM_WORLD ) ;
        rc = MPI_Recv ( &inmsg, 1, MPI_CHAR, source, tag, MPI_COMM_WORLD, &Stat ) ;
    }
    else if ( rank == 1 ) {
        dest = 0 ; source = 0 ;
        rc = MPI_Recv ( &inmsg, 1, MPI_CHAR, source, tag, MPI_COMM_WORLD, &Stat ) ;
        rc = MPI_Send ( &outmsg, 1, MPI_CHAR, dest, tag, MPI_COMM_WORLD ) ;
    }
    MPI_Finalize ( ) ;
}

22 Non-Blocking Message Passing Example

#include "mpi.h"
#include <stdio.h>

main ( int argc, char *argv[] )
{
    int numtasks, rank, next, prev, buf[2], tag1 = 1, tag2 = 2 ;
    MPI_Request reqs[4] ;
    MPI_Status stats[4] ;

    MPI_Init ( &argc, &argv ) ;
    MPI_Comm_size ( MPI_COMM_WORLD, &numtasks ) ;
    MPI_Comm_rank ( MPI_COMM_WORLD, &rank ) ;

    /* neighbors on a ring */
    prev = rank - 1 ;
    next = rank + 1 ;
    if ( rank == 0 ) prev = numtasks - 1 ;
    if ( rank == ( numtasks - 1 ) ) next = 0 ;

    MPI_Irecv ( &buf[0], 1, MPI_INT, prev, tag1, MPI_COMM_WORLD, &reqs[0] ) ;
    MPI_Irecv ( &buf[1], 1, MPI_INT, next, tag2, MPI_COMM_WORLD, &reqs[1] ) ;
    MPI_Isend ( &rank, 1, MPI_INT, prev, tag2, MPI_COMM_WORLD, &reqs[2] ) ;
    MPI_Isend ( &rank, 1, MPI_INT, next, tag1, MPI_COMM_WORLD, &reqs[3] ) ;

    MPI_Waitall ( 4, reqs, stats ) ;   /* all four requests must complete */
    MPI_Finalize ( ) ;
}

23 MPI Collective Communication
- A communication pattern that involves all the processes in a communicator.
- Tree-structured communication: with p processes, this procedure distributes the input data in log2(p) stages rather than p-1 stages, which is a huge saving when p is large. A hand-rolled version is sketched below.
[Diagram: a binomial broadcast tree over processes P0-P7.]
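A minimal hand-rolled version of this tree distribution, assuming the data starts at rank 0: at stage s, every rank below 2^s forwards the value to rank + 2^s, so all p processes hold it after log2(p) stages. MPI_Bcast (next slide) performs the same pattern internally:

#include "mpi.h"
#include <stdio.h>

int main ( int argc, char *argv[] )
{
    int rank, p, value = 0, step ;
    MPI_Status status ;

    MPI_Init ( &argc, &argv ) ;
    MPI_Comm_rank ( MPI_COMM_WORLD, &rank ) ;
    MPI_Comm_size ( MPI_COMM_WORLD, &p ) ;

    if ( rank == 0 ) value = 99 ;     /* only the root has the data */

    for ( step = 1 ; step < p ; step *= 2 ) {
        if ( rank < step && rank + step < p )        /* has data: forward it */
            MPI_Send ( &value, 1, MPI_INT, rank + step, 0, MPI_COMM_WORLD ) ;
        else if ( rank >= step && rank < 2 * step )  /* receives at this stage */
            MPI_Recv ( &value, 1, MPI_INT, rank - step, 0, MPI_COMM_WORLD, &status ) ;
    }

    printf ( "rank %d has value %d\n", rank, value ) ;
    MPI_Finalize ( ) ;
    return 0 ;
}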

24 Barrier and Broadcast Operations
- MPI_BARRIER(IN comm): synchronizes all processes.
- MPI_BCAST(INOUT inbuf, IN incnt, IN intype, IN root, IN comm): sends data from one process to all processes (usage sketched below).
[Diagram: MPI_BCAST copies data item A0 from the root to every process.]
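A minimal usage sketch, assuming rank 0 holds the value to distribute; note that every rank makes the same MPI_Bcast call:

#include "mpi.h"
#include <stdio.h>

int main ( int argc, char *argv[] )
{
    int rank, value = 0 ;

    MPI_Init ( &argc, &argv ) ;
    MPI_Comm_rank ( MPI_COMM_WORLD, &rank ) ;

    if ( rank == 0 )
        value = 99 ;                  /* only the root has the data initially */

    /* after this returns, every rank holds 99 */
    MPI_Bcast ( &value, 1, MPI_INT, 0, MPI_COMM_WORLD ) ;

    MPI_Barrier ( MPI_COMM_WORLD ) ;  /* no rank proceeds until all arrive */
    printf ( "rank %d has value %d\n", rank, value ) ;

    MPI_Finalize ( ) ;
    return 0 ;
}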

25 Gather and Scatter Operations
- MPI_GATHER(IN inbuf, IN incnt, IN intype, OUT outbuf, IN outcnt, IN outtype, IN root, IN comm): gathers data from all processes to one process (sketched below).
[Diagram: MPI_GATHER collects items A0, A1, ... from all processes into the root.]
- MPI_SCATTER(IN inbuf, IN incnt, IN intype, OUT outbuf, IN outcnt, IN outtype, IN root, IN comm): scatters data from one process to all processes (see the example on slide 28).
[Diagram: MPI_SCATTER distributes items A0, A1, A2, ... from the root, one item per process.]
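A minimal gather sketch, assuming at most 64 processes: each rank contributes its own rank number, and the root collects the contributions in rank order:

#include "mpi.h"
#include <stdio.h>

int main ( int argc, char *argv[] )
{
    int rank, np, i ;
    int recvbuf[64] ;   /* assumes at most 64 processes */

    MPI_Init ( &argc, &argv ) ;
    MPI_Comm_rank ( MPI_COMM_WORLD, &rank ) ;
    MPI_Comm_size ( MPI_COMM_WORLD, &np ) ;

    /* one int from every rank lands in recvbuf on rank 0, in rank order */
    MPI_Gather ( &rank, 1, MPI_INT, recvbuf, 1, MPI_INT, 0, MPI_COMM_WORLD ) ;

    if ( rank == 0 )
        for ( i = 0 ; i < np ; i++ )
            printf ( "recvbuf[%d] = %d\n", i, recvbuf[i] ) ;

    MPI_Finalize ( ) ;
    return 0 ;
}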

26 Reduce Operation (1)
- MPI_REDUCE(IN inbuf, OUT outbuf, IN cnt, IN type, IN op, IN root, IN comm): combines the values into the output buffer of the single root process using the specified operation.
Example (MPI_SUM, root = 1): initial data (2,4), (5,7), (0,3), (6,2) on processes 0-3; after the call, process 1 holds (13,16).

27 Reduce Operation (2)
- MPI_ALLREDUCE(IN inbuf, OUT outbuf, IN cnt, IN type, IN op, IN comm): combines the values into the output buffers of all processes using the specified operation (both reduce operations are sketched below).
Example (MPI_MIN): initial data (2,4), (5,7), (0,3), (6,2) on processes 0-3; after the call, every process holds (0,2).
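A minimal sketch reproducing both examples, assuming it is run with 4 processes (the data table is indexed modulo 4, so other process counts still run, just with different results):

#include "mpi.h"
#include <stdio.h>

int main ( int argc, char *argv[] )
{
    int data[4][2] = { {2,4}, {5,7}, {0,3}, {6,2} } ;  /* initial data per rank */
    int in[2], sum[2], min[2], rank ;

    MPI_Init ( &argc, &argv ) ;
    MPI_Comm_rank ( MPI_COMM_WORLD, &rank ) ;

    in[0] = data[rank % 4][0] ;
    in[1] = data[rank % 4][1] ;

    /* only rank 1 receives the result: sum == (13,16) with 4 ranks */
    MPI_Reduce ( in, sum, 2, MPI_INT, MPI_SUM, 1, MPI_COMM_WORLD ) ;
    if ( rank == 1 )
        printf ( "sum = %d %d\n", sum[0], sum[1] ) ;

    /* every rank receives the result: min == (0,2) with 4 ranks */
    MPI_Allreduce ( in, min, 2, MPI_INT, MPI_MIN, MPI_COMM_WORLD ) ;
    printf ( "rank %d: min = %d %d\n", rank, min[0], min[1] ) ;

    MPI_Finalize ( ) ;
    return 0 ;
}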

28 Collective Communication Example

#include "mpi.h"
#include <stdio.h>
#define SIZE 4

main ( int argc, char *argv[] )
{
    int numtasks, rank, sendcount, recvcount, source ;
    float sbuf[SIZE][SIZE] = {
        {  1.0,  2.0,  3.0,  4.0 },
        {  5.0,  6.0,  7.0,  8.0 },
        {  9.0, 10.0, 11.0, 12.0 },
        { 13.0, 14.0, 15.0, 16.0 } } ;
    float rbuf[SIZE] ;

    MPI_Init ( &argc, &argv ) ;
    MPI_Comm_rank ( MPI_COMM_WORLD, &rank ) ;
    MPI_Comm_size ( MPI_COMM_WORLD, &numtasks ) ;

    if ( numtasks == SIZE ) {
        source = 1 ; sendcount = SIZE ; recvcount = SIZE ;
        MPI_Scatter ( sbuf, sendcount, MPI_FLOAT, rbuf, recvcount,
                      MPI_FLOAT, source, MPI_COMM_WORLD ) ;
        printf ( "rank=%d results : %f %f %f %f \n",
                 rank, rbuf[0], rbuf[1], rbuf[2], rbuf[3] ) ;
    }
    else
        printf ( "Must specify %d processors. Terminating. \n", SIZE ) ;

    MPI_Finalize ( ) ;
}

29 MPI Implementation (1)
- MPICH
  - Freely available implementation of the MPI standard, designed to be both portable and efficient.
  - Developed at Argonne National Laboratory and Mississippi State University.
  - To compile the C source program prog.c:
      % cc -o prog prog.c -I/usr/local/mpi/include -L/usr/local/mpi/lib -lmpi
  - To run the program with 4 processes:
      % mpirun -np 4 prog

30 MPI Implementation (2)
- LAM
  - Available from the Ohio Supercomputer Center; runs on heterogeneous networks of Sun, DEC, SGI, and HP workstations.
- CHIMP-MPI
  - Available from the Edinburgh Parallel Computing Centre; runs on Sun, DEC, SGI, IBM, and HP workstations, the Meiko Computing Surface machines, and the Fujitsu AP-1000.

31 MPICH Interaction Architecture
Layers of MPICH, from top to bottom:
- MPI (Message Passing Interface): machine-independent layer.
- ADI (Abstract Device Interface): machine-specific layer.
  - Provides efficient communication primitives.
  - Optimizations apply to the ADI and the higher layers of MPICH.
- Underlying messaging layers: AM (Active Messages), FM (Fast Messages), etc.

32 MPI-2 (1)
- Enhanced MPI
  - Discussed in 1995 by the MPI Forum; a draft was made available in 1996.
- New functionality
  - Dynamic processes: extensions that remove the static process model of MPI (e.g., MPI_SPAWN).
  - One-sided communications: includes shared-memory operations (put/get) and remote accumulate operations (e.g., MPI_PUT; sketched below).
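A minimal one-sided sketch, assuming MPI-2's fence synchronization style: every rank deposits its rank number directly into a window of memory exposed by rank 0, with no receive call involved:

#include "mpi.h"
#include <stdio.h>
#include <stdlib.h>

int main ( int argc, char *argv[] )
{
    int rank, np, i ;
    int *winbuf = NULL ;
    MPI_Win win ;

    MPI_Init ( &argc, &argv ) ;
    MPI_Comm_rank ( MPI_COMM_WORLD, &rank ) ;
    MPI_Comm_size ( MPI_COMM_WORLD, &np ) ;

    if ( rank == 0 )
        winbuf = malloc ( np * sizeof(int) ) ;

    /* rank 0 exposes np ints; the other ranks expose an empty window */
    MPI_Win_create ( winbuf, rank == 0 ? np * sizeof(int) : 0,
                     sizeof(int), MPI_INFO_NULL, MPI_COMM_WORLD, &win ) ;

    MPI_Win_fence ( 0, win ) ;                  /* open the access epoch */
    MPI_Put ( &rank, 1, MPI_INT, 0, rank, 1, MPI_INT, win ) ;
    MPI_Win_fence ( 0, win ) ;                  /* all puts complete here */

    if ( rank == 0 )
        for ( i = 0 ; i < np ; i++ )
            printf ( "winbuf[%d] = %d\n", i, winbuf[i] ) ;

    MPI_Win_free ( &win ) ;
    free ( winbuf ) ;
    MPI_Finalize ( ) ;
    return 0 ;
}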

33 MPI-2 (2)
  - Parallel I/O: MPI support for parallel I/O (MPI-IO). I/O can also be modeled as message passing: writing to a file is like sending a message, and reading from a file is like receiving one.
  - Extended collective operations: allows non-blocking collective operations and the application of collective operations to inter-communicators.
  - External interfaces: defines routines that allow developers to layer tools such as debuggers and profilers on top of MPI.
  - Additional language bindings: describes C++ bindings and discusses FORTRAN-90 issues.

