4. Distributed Programming

4.1 Introduction
1. Introduction
1) Concepts
Sequential programming: statements are executed one by one.
Concurrent programming: a program is composed of several processes (threads) that are executed in a time-shared manner (or in parallel).
Parallel programming: one statement is executed by many processors (SIMD).

SIMD (figure): a master processor drives processing elements P1/M1, P2/M2, ..., Pn/Mn, which together compute the matrix product A x B = C, where A = (aij), B = (bij) and C = (cij) are n x n matrices.

SIMD execution of Task 1 (the first row of C), where c1,1 = a1,1*b1,1 + a1,2*b2,1 + ... + a1,n*bn,1:

            P1            P2            ...   Pn
  Task 1    c1,1          c1,2          ...   c1,n
  step 1    c1,1 = 0      c1,2 = 0      ...   c1,n = 0
  step 2    a1,1*b1,1     a1,1*b1,2     ...   a1,1*b1,n
  step 3    c1,1 +=       c1,2 +=       ...   c1,n +=
  step 4    a1,2*b2,1     a1,2*b2,2     ...   a1,2*b2,n
  ...       ...           ...           ...   ...
  step      a1,n*bn,1     a1,n*bn,2     ...   a1,n*bn,n
  step      c1,1 +=       c1,2 +=       ...   c1,n +=
  Task 2    c2,1          c2,2          ...   c2,n

Sequential:
      DO 10 i=1,n
      DO 10 j=1,n
      C(i,j) = 0
      DO 10 k=1,n
      C(i,j) = C(i,j) + A(i,k)*B(k,j)
 10   CONTINUE

Parallel (the statements over j are executed by n processors at once):
      DO 10 i=1,n
      (j=1..n) C(i,j) = 0
      DO 10 k=1,n
      (j=1..n) C(i,j) = C(i,j) + A(i,k)*B(k,j)
 10   CONTINUE

Distributed programming: a distributed program consists of a number of sequential processes which execute on different nodes and cooperate by communication.
(figure) Nodes P1/M1, P2/M2, ..., Pn/Mn connected by an interconnection network (a LAN or an MPP).

Mimic the vector computation (figure): to compute A x B = C on n nodes, node Pj holds the j-th column of B (b1j, b2j, ..., bnj) together with the rows of A it needs, and computes the j-th column of C (c1j, c2j, ..., cnj).

2) Language
Two kinds of distributed programming languages:
- extend an existing sequential language by adding new functions for communication, parallelism (distribution) and non-determinacy
- design a new distributed programming language

3) Two ways to get a parallel and distributed program:
- use a distributed programming language to write the parallel and distributed program directly
- translate an existing sequential program into a parallel and distributed program with a parallel compiler

2. Example
Solve the simultaneous algebraic equations AX = B by relaxation; write AX - B = 0.

a11*x1 + a12*x2 + ... + a1n*xn - b1 = 0
        ...        ...        ...
an1*x1 + an2*x2 + ... + ann*xn - bn = 0

n = 100000, number of nodes = 10

Let X0 be an initial approximation of X. Then A X0 - B = R0, where R0 is the 0th residual. Let ri0 be the i-th component of R0.
Take the largest |ri0| and adjust the i-th component xi so that
    A X1 - B = R1,  where ri1 = 0 and X1 = X0 + (0, ..., xi, ..., 0)T      (1)
Repeat the process on (1) until Rk ≈ 0.

Example (n = 3):
    2x1 -  x2 +  x3 + 1 = 0
    ... + 2x2 + ...  - 6 = 0
    -x1 +  x2 + 4x3 - 3 = 0

Start with X0 = (0, 0, 0)T, so R0 = (1, -6, -3)T.
The largest |ri0| is |r20| = 6, so relax x2:  2(0 + x2) - 6 = 0, hence x2 = 3,
giving X1 = (0, 3, 0)T and R1 = (-2, 0, 0)T.
Now the largest residual is 2, so relax x1:  2(0 + x1) - 3 + 0 + 1 = 0, hence x1 = 1.

The 100000 equations are distributed over the 10 nodes, 10000 rows per node:

node 1:   a1,1*x1 + a1,2*x2 + ... + a1,n*xn - b1 = 0
          ...
node 2:   ...
          a20000,1*x1 + a20000,2*x2 + ... + a20000,n*xn - b20000 = 0
...
node 10:  a90001,1*x1 + a90001,2*x2 + ... + a90001,n*xn - b90001 = 0
          ...
          a100000,1*x1 + a100000,2*x2 + ... + a100000,n*xn - b100000 = 0

The algorithm on the 10 nodes (numbered 1, 2, 3, ..., 10):
1) Each node calculates its residuals and finds its largest |ri0|
2) Each node sends its largest |ri0| to node 1
3) Node 1 receives the n-1 values, finds max{|ri0|, |rj0|, ...} and tells the corresponding node to compute xi
4) That node computes xi and broadcasts it; the other nodes receive it
5) Go back to step 1 if xi > ε; stop if xi < ε

P1                                            P2 ... P10 (shown for P2)
L: compute r1, r2, ..., r10000;               L: compute r10001, r10002, ..., r20000;
   R = max(r1, r2, ..., r10000);                 R = max(r10001, r10002, ..., r20000);
   recv( , T[1], P[1]);                          send(P1, R, i);
   recv( , T[2], P[2]);
   ...
   recv( , T[9], P[9]);
   m = max(R, T[1], T[2], ..., T[9]);
   Pm is the process that sent m;
   broadcast(m, Pm);                             recv( , m, Pm);

   (continuation, identical in every process)
   if (Pm == i) { compute xi; broadcast xi; }
   else recv( , xi);
   if (xi > ε) goto L;
   print X;

(figure) Process structures: a coordinator with several slave processes, or a set of peer worker processes.

Questions: How do we write one and the same program for all nodes? How do we write receive statements that can accept the Rs in arbitrary order?

3. Distributed system and distributed computing
There is no master in a distributed system. There may be a coordinator in a distributed computation; the coordinator can be placed at any node in the system.
(figure) node 1 holds r1j, r2j, r3j, ..., r10000j; node 2 holds r10001j, r10002j, ..., r20000j; ...; node 10 holds r90001j, r90002j, ..., r100000j.

4.2 CSP (Communicating Sequential Processes)
CSP is a theoretical model which was very influential.
1. Parallelism
1) Process
   name :: variables; commands;
2) Parallel command
   [ process || process || ... || process ]
Example:
   [ Phil(i:0..4) :: PHI || table :: TAB ]

2. Process communication
1) Input command
   <process name> ? <variable>
2) Output command
   <process name> ! <expression>
Input and output commands are synchronous.
Example:
   P :: ... Q ! 16 ...        Q :: ... P ? t ...

3. Non-determinacy
1) Alternative command
   [ B1 → S1 □ B2 → S2 □ ... □ Bi → Si ]
2) Repetitive command
   *[ B1 → S1 □ B2 → S2 □ ... □ Bi → Si ]

<alternative command> ::= [ <guarded command> { □ <guarded command> } ]
<repetitive command>  ::= *[ <guarded command> { □ <guarded command> } ]
<guarded command>     ::= <guard> → <command list>
<guard>               ::= <Boolean exp.> { ; <Boolean exp.> } [ ; <input command> ]

Example 1:
   [ x >= y → m := x  □  y >= x → m := y ];
Example 2:
   i := 0;
   *[ i < n; prime(i) → print(i); i := i+1  □  i < n; not prime(i) → i := i+1 ];
Example 3:
   n := 2;
   *[ n > 0; P2 ? R → n := n-1  □  n > 0; P3 ? R → n := n-1 ];
(the ';' inside a guard means "and")

An alternative command fails when all guards fail; otherwise one guarded command whose guard is successfully executable is selected arbitrarily and executed. A repetitive command specifies as many iterations as possible of its constituent alternative command; it terminates when all guards fail.

4.3 MPI
4.3.1 Introduction
1. Two kinds of applications for distributed systems
   - resource sharing (distributed applications)
   - parallel computation
2. Two kinds of tools
   - CORBA, DCOM, ..., .NET (for resource sharing)
   - PVM, Linda, MPI (for parallel computation)
3. What is MPI?

MPI is an acronym for Message Passing Interface. It is a message-passing library specification.

An MPI program in C:
   #include "mpi.h"
   C statements;
   ...
   MPI_Init(...);
   MPI_Send(...);
   MPI_Recv(...);
   ...

The MPI specification defines 125 functions; 6 of them are basic.

The specification uses a language-independent notation; arguments are marked as IN, OUT or INOUT.

Running an MPI program (figure): a daemon runs on each participating machine; the user processes are linked with mpi.lib and started through the mpi service.

4.3.2 MPI
1. Communicator
A communicator specifies the communication context for a communication operation. Each communication context provides a separate "communication universe". The communicator also specifies the set of processes that share this communication context.
Predefined communicator: MPI_COMM_WORLD

Basic functions:
   MPI_Init(argc, argv)
   MPI_Finalize()
   MPI_Comm_size(comm, size)
   MPI_Comm_rank(comm, rank)
2. Point-to-point communication
   MPI_Send(message, length, type, dest, tag, comm)
   MPI_Recv(message, length, type, source, tag, comm, status)
(dest/source, tag and comm form the message envelope)
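Below is a minimal, self-contained C sketch that uses only these six basic functions; the message text and the tag value 99 are illustrative, not from the slides.

   #include <stdio.h>
   #include <string.h>
   #include "mpi.h"

   /* Every non-root process sends a greeting to process 0,
      which receives and prints the messages in rank order. */
   int main(int argc, char *argv[])
   {
       int rank, size;
       char msg[64];
       MPI_Status status;

       MPI_Init(&argc, &argv);
       MPI_Comm_rank(MPI_COMM_WORLD, &rank);
       MPI_Comm_size(MPI_COMM_WORLD, &size);

       if (rank != 0) {
           sprintf(msg, "greetings from process %d", rank);
           MPI_Send(msg, strlen(msg) + 1, MPI_CHAR, 0, 99, MPI_COMM_WORLD);
       } else {
           int i;
           for (i = 1; i < size; i++) {
               MPI_Recv(msg, 64, MPI_CHAR, i, 99, MPI_COMM_WORLD, &status);
               printf("%s\n", msg);
           }
       }

       MPI_Finalize();
       return 0;
   }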

3. Collective communication
MPI_Barrier(comm)
   MPI_Barrier blocks the caller until all group members have called it.
MPI_Bcast(buf, count, datatype, root, comm)
   MPI_Bcast broadcasts a message from the process with rank root to all processes of the group, itself included. It is called by all members of the group using the same arguments for comm and root.
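A small hedged sketch of MPI_Bcast; the broadcast value n and the choice of rank 0 as root are illustrative.

   #include <stdio.h>
   #include "mpi.h"

   /* Every process calls MPI_Bcast with the same root; after the call
      all of them hold the value the root supplied. */
   int main(int argc, char *argv[])
   {
       int rank, n = 0;

       MPI_Init(&argc, &argv);
       MPI_Comm_rank(MPI_COMM_WORLD, &rank);

       if (rank == 0)
           n = 100000;                /* e.g. the number of equations */

       MPI_Bcast(&n, 1, MPI_INT, 0, MPI_COMM_WORLD);
       printf("process %d: n = %d\n", rank, n);

       MPI_Finalize();
       return 0;
   }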

MPI_Gather(sendbuf, sendcount, sendtype, recvbuf, recvcount, recvtype, root, comm)
   Each process (root included) sends the contents of its send buffer to the root process. The root process receives the messages and stores them in rank order.
MPI_Scatter(sendbuf, sendcount, sendtype, recvbuf, recvcount, recvtype, root, comm)
   MPI_Scatter is the inverse operation of MPI_Gather.
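The slides give a Gather example further below; as a complement, here is a hedged MPI_Scatter sketch. The function name scatter_example and the block size of 100 ints are assumptions for illustration.

   #include <stdlib.h>
   #include "mpi.h"

   /* The root scatters gsize blocks of 100 ints; process i receives block i. */
   void scatter_example(int root, MPI_Comm comm)
   {
       int gsize, myrank;
       int *sendbuf = NULL;
       int recvbuf[100];

       MPI_Comm_size(comm, &gsize);
       MPI_Comm_rank(comm, &myrank);

       if (myrank == root) {
           /* only the root needs the full send buffer */
           sendbuf = (int *) malloc(gsize * 100 * sizeof(int));
           /* ... fill sendbuf ... */
       }

       MPI_Scatter(sendbuf, 100, MPI_INT, recvbuf, 100, MPI_INT, root, comm);

       if (myrank == root)
           free(sendbuf);
   }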

MPI_Reduce(sendbuf, recvbuf, count, datatype, op, root, comm)
   MPI_Reduce combines the elements provided in the input buffer (sendbuf) of each process in the group, using the operation op, and returns the combined value in the output buffer (recvbuf) of the process with rank root. The routine is called by all group members using the same arguments for count, datatype, op, root and comm.

Predefined reduce operations:
   MPI_SUM      sum
   MPI_PROD     product
   MPI_MAXLOC   max value and location
   MPI_MINLOC   min value and location
If each process supplies a value and its rank, then an MPI_MAXLOC reduction returns the maximum value and the rank of the first process with that value.
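A minimal hedged sketch of MPI_Reduce with MPI_SUM; the local value is illustrative (the slides' own example with MPI_MAXLOC follows later).

   #include <stdio.h>
   #include "mpi.h"

   /* Each process contributes one partial value; the root receives the sum. */
   int main(int argc, char *argv[])
   {
       int rank;
       double local, total = 0.0;

       MPI_Init(&argc, &argv);
       MPI_Comm_rank(MPI_COMM_WORLD, &rank);

       local = rank + 1.0;            /* e.g. a partial sum computed locally */
       MPI_Reduce(&local, &total, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);

       if (rank == 0)
           printf("sum = %f\n", total);

       MPI_Finalize();
       return 0;
   }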

Examples
1) Gather 100 ints from every process in the group to the root:

   MPI_Comm comm;
   int root, myrank, *rbuf, gsize, sendbuf[100];
   ...
   MPI_Comm_size(comm, &gsize);
   rbuf = (int *) malloc(gsize * 100 * sizeof(int));
   MPI_Gather(sendbuf, 100, MPI_INT, rbuf, 100, MPI_INT, root, comm);

(figure) rbuf at the root holds gsize blocks of 100 ints, one block per process. But does every process really need to allocate rbuf?

Only the root needs to allocate rbuf:

   MPI_Comm comm;
   int root, myrank, *rbuf, gsize, sendbuf[100];
   ...
   MPI_Comm_rank(comm, &myrank);
   if (myrank == root) {
       MPI_Comm_size(comm, &gsize);
       rbuf = (int *) malloc(gsize * 100 * sizeof(int));   /* in C++: rbuf = new int[gsize*100] */
   }
   MPI_Gather(sendbuf, 100, MPI_INT, rbuf, 100, MPI_INT, root, comm);

2) Write the framework for solving the simultaneous algebraic equations AX = B by relaxation:

   struct { float val; int rank; } in, out;
   float xi;
   ...
   R = max(r1, r2, ..., r10000);          /* largest local residual */
   MPI_Comm_rank(MPI_COMM_WORLD, &myrank);
   in.val = fabs(R);  in.rank = myrank;
   MPI_Reduce(&in, &out, 1, MPI_FLOAT_INT, MPI_MAXLOC, root, comm);
   MPI_Bcast(&out, 1, MPI_FLOAT_INT, root, comm);
   if (out.rank == myrank) { compute xi; }
   MPI_Bcast(&xi, 1, MPI_FLOAT, out.rank, comm);
   ...

The MPI_Reduce followed by MPI_Bcast can be replaced by a single call:
   MPI_Allreduce(&in, &out, 1, MPI_FLOAT_INT, MPI_MAXLOC, comm);

3. Collective communication (continued)
allgather (gather-to-all): can be thought of as gather, but all processes receive the result.
alltoall: an extension of allgather to the case where each process sends distinct data to each of the receivers. The j-th block sent from process i is received by process j and is placed in the i-th block of recvbuf.

(figure) Data movement in gather, allgather and alltoall among processes 0 ... 4.
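A small hedged C sketch illustrating both calls with a block size of one int per process; the concrete values are illustrative.

   #include <stdio.h>
   #include <stdlib.h>
   #include "mpi.h"

   /* After MPI_Allgather every process holds all contributions; after
      MPI_Alltoall, block j sent by process i ends up as block i in the
      receive buffer of process j. */
   int main(int argc, char *argv[])
   {
       int rank, size, i;

       MPI_Init(&argc, &argv);
       MPI_Comm_rank(MPI_COMM_WORLD, &rank);
       MPI_Comm_size(MPI_COMM_WORLD, &size);

       int *sendbuf = (int *) malloc(size * sizeof(int));
       int *recvbuf = (int *) malloc(size * sizeof(int));

       /* allgather: one value per process, everybody gets the whole vector */
       int myval = rank * 10;
       MPI_Allgather(&myval, 1, MPI_INT, recvbuf, 1, MPI_INT, MPI_COMM_WORLD);

       /* alltoall: process 'rank' prepares a distinct value for every receiver */
       for (i = 0; i < size; i++)
           sendbuf[i] = rank * 100 + i;
       MPI_Alltoall(sendbuf, 1, MPI_INT, recvbuf, 1, MPI_INT, MPI_COMM_WORLD);

       /* now recvbuf[i] == i*100 + rank */
       printf("process %d: recvbuf[0] = %d\n", rank, recvbuf[0]);

       free(sendbuf);
       free(recvbuf);
       MPI_Finalize();
       return 0;
   }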

4.3.3 About implementation
How can the Gather operation be implemented (inside mpi.lib)?
   MPI_Gather(sbuf, slen, stype, rbuf, rlen, rtype, 0, comm)
A first attempt at process 0:
   for (i = 0; i < 5; i++)
      { recv(rbuf + i*rlen*sizeof(rtype), rlen, rtype, i, comm, status); }
(figure) rbuf at the root: one block of rlen elements per process. Is this implementation right?

A call MPI_Gather(sbuf, slen, stype, rbuf, rlen, rtype, 0, comm) can be implemented as:

Process i (i = 1..4):
   send(sbuf, slen, stype, 0, comm);

Process 0:
   copy sbuf into block 0 of rbuf;
   for (i = 1; i < 5; i++)
      {  recv(rbuf + 1*rlen*sizeof(rtype), rlen, rtype, 1, comm, status)
       □ recv(rbuf + 2*rlen*sizeof(rtype), rlen, rtype, 2, comm, status)
       □ recv(rbuf + 3*rlen*sizeof(rtype), rlen, rtype, 3, comm, status)
       □ recv(rbuf + 4*rlen*sizeof(rtype), rlen, rtype, 4, comm, status) }

Each iteration of the loop non-deterministically completes whichever of the four receives has a message available (an alternative command in CSP terms), so the senders may be served in any order.
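In real MPI code this non-deterministic choice is usually written with MPI_ANY_SOURCE: the receive accepts a message from whichever sender is ready, and status.MPI_SOURCE tells us which block of rbuf the data belongs in. A hedged sketch follows; the function name gather_any_order, the tag 0 and the int data type are assumptions, not part of the slides.

   #include <stdlib.h>
   #include <string.h>
   #include "mpi.h"

   /* Gather at the root, accepting the senders' messages in arrival order. */
   void gather_any_order(int *sbuf, int *rbuf, int rlen, int nprocs, MPI_Comm comm)
   {
       int i;
       int *tmp = (int *) malloc(rlen * sizeof(int));
       MPI_Status status;

       memcpy(rbuf, sbuf, rlen * sizeof(int));     /* the root's own block (rank 0) */

       for (i = 1; i < nprocs; i++) {
           MPI_Recv(tmp, rlen, MPI_INT, MPI_ANY_SOURCE, 0, comm, &status);
           memcpy(rbuf + status.MPI_SOURCE * rlen, tmp, rlen * sizeof(int));
       }
       free(tmp);
   }

This also answers the question raised earlier: one and the same receive statement serves all senders, in whatever order their messages arrive.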

4.3.4 Running an MPI program
1. Download WMPI
   ftp://cs.nju.edu.cn/incoming/jin/distributed/wmpi_v1.2.3.zip
2. Build up the MPI environment
   Unzip wmpi_v1.2.3.zip and run setup
3. Write your MPI program
   The program should #include "mpi.h"

4. Set up the hosts file
Set c:\windows\hosts to contain the IP addresses and names of all participating computers. For example:
   IP1 name1
   IP2 name2
   ...
   IP8 name8
Note: set the hosts file on every computer that participates in the computation. On Windows NT (or XP), the hosts file is at c:\winnt\system32\drivers\etc\hosts.

5. Edit your .pg file
The .pg file is important. From the .pg file, WMPI knows:
   - how many computers participate in your MPI program
   - how many processes on each computer participate in your MPI program
   - the absolute path of your MPI program on each computer

Notes:
   - Only one computer needs the .pg file
   - The name of the .pg file should be the same as your MPI program name; it is better to put the two files in the same directory
   - Each process runs the same MPI program

Sample .pg file:
   local 2
   202.119.36.63 2 d:\wmpi\sort.exe
   202.119.36.66 1 c:\temp\sort.exe

1st column --- IP address of the computer
2nd column --- number of processes on that computer
3rd column --- absolute path of the MPI program
local --- refers to the local computer, the one that holds the .pg file

6. Run the daemon
All computers listed in hosts must run the daemon. On Windows:
   ...\wmpi\system\daemon\wmpi_daemon
7. Compile and run your MPI program
Use a C++ compiler to compile your program, then run it.
8. MPI documents and examples
You are advised to read the documents and examples before you run your first program.

Assignments
1. Finding the median of data held on several computers
2. Distributed sorting
3. Given an even number, find the pairs of prime numbers whose sum is that even number
4. Numerical integration
5. Matrix multiplication
6. Design and implement your own distributed and parallel program
Requirement: you must use collective communication.

Summary
1. Concepts about programming
   - sequential
   - parallel
   - distributed
2. Features of distributed computing
   - communication and cooperation
   - parallelism
   - non-determinacy
   - no master (though there may be a coordinator)

3. CSP
   parallel : ||
   output/input : ! ?
   alternative/repetitive : [ ... ], *[ ... ]
4. MPI
   communicator : parallelism
   point-to-point communication
   collective communication : non-determinacy