Message-Passing Computing Dr. Tim McGuire Sam Houston State University ACET 2002 Corpus Christi, TX.

1 Message-Passing Computing Dr. Tim McGuire Sam Houston State University ACET 2002 Corpus Christi, TX

2 Motivation
- So, you attended my 2000 talk in Austin* (or read a similar article), built a Beowulf cluster from castoff computers, and now you’re wondering what you can do with it, right?
- Well, that’s the motivation for this talk
* T. McGuire, “Building a Low-Cost Supercomputer,” ACET2000, Austin, Texas, September 2000.

3 The Target Machine
- For the purpose of this talk, we will look at Beowulf clusters
- All the techniques we discuss can also be extended to a network of workstations
- The differences are that a Beowulf cluster uses:
  - dedicated processors (rather than scavenging cycles from idle workstations)
  - a private system area network (enclosed SAN rather than exposed LAN)

4 How Does One Program a Beowulf?
- The short answer is message passing, a technique originally developed for distributed computing
- The Beowulf architecture makes message passing more efficient, because it doesn't have to compete with other traffic on the net
- Other techniques are being explored; Java is a popular topic at this time

5 A Typical Uniprocessor System
- Consists of a processor executing a program stored in main memory

6 Types of Parallel Computers
- Two principal types:
  - Shared memory multiprocessor
  - Distributed memory multicomputer

7 Shared Memory Multiprocessor System
- A natural way to extend the single-processor model: multiple processors are connected to multiple memory modules, such that each processor can access any memory module. This is the so-called shared memory configuration.

8 Shared Memory Multiprocessor System
- Any memory location can be accessed by any of the processors.
- A single address space exists, meaning that each memory location is given a unique address within a single range of addresses.
- Generally, shared memory programming is more convenient, although it does require the programmer to control access to shared data (using critical sections, etc.)

9 Message-Passing Multicomputer
- Complete computers connected through an interconnection network:

10 Message Passing Software
- PVM (Parallel Virtual Machine) was the first widely used API
- Developed at Oak Ridge National Laboratory (late 1980s)
- Very widely used (free)
- Berkeley NOW (network of workstations) project
- Has task scheduling and other advanced features

11 More Recent Message Passing Work
- MPI (Message-Passing Interface)
  - Standard for message-passing libraries
  - Defines routines but not implementation
  - Has adequate features for most parallel applications
  - Version 1 released in 1994 with 120+ routines defined
  - Version 2 now available
- Both PVM and MPI provide a set of user-level libraries for message passing with normal programming languages (C, C++, Fortran)

12 Basics of Message-Passing
Programming using user-level message-passing libraries:
- Two primary mechanisms needed:
  1. A method of creating separate processes for execution on different computers
  2. A method of sending and receiving messages

13 Single Program Multiple Data (SPMD) Model
- Different processes are merged into one program. Within the program, control statements select different parts for each processor to execute. All executables are started together (static process creation).
- This is the basic MPI model.
[Figure: one source file is compiled to suit each processor, giving executables for Processor 0 through Processor n-1]

14 Basic “point-to-point” Send and Receive Routines
- Passing a message between processes using send() and recv() library calls:

15 MPI (Message Passing Interface)
- Standard developed by a group of academics and industrial partners to foster more widespread use and portability
- Defines routines, not implementation
- Several free implementations exist

16 A Simple MPI Example
- The first C program most of us saw was the “Hello, World!” program in K&R
- We’ll look at a variant that makes some use of multiple processes to have each process send a greeting to another process
- We will assume we have p processes identified by their rank 0, 1, ..., p-1

17 First MPI Program
    /* From Peter Pacheco, University of San Francisco */
    #include <stdio.h>
    #include <string.h>
    #include "mpi.h"

    int main(int argc, char *argv[])
    {
        int my_rank;          /* rank of process */
        int p;                /* number of processes */
        int source;           /* rank of sender */
        int dest;             /* rank of receiver */
        int tag = 0;          /* tag for messages */
        char message[100];    /* storage for message */
        MPI_Status status;    /* return status for receive */

        /* Start up MPI */
        MPI_Init(&argc, &argv);

18 First MPI Program, Cont’d
        /* Find out process rank */
        MPI_Comm_rank(MPI_COMM_WORLD, &my_rank);

        /* Find out number of processes */
        MPI_Comm_size(MPI_COMM_WORLD, &p);

        if (my_rank != 0) {
            /* Create message */
            sprintf(message, "Greetings from process %d!", my_rank);
            dest = 0;
            /* Use strlen+1 so that '\0' gets transmitted */
            MPI_Send(message, strlen(message)+1, MPI_CHAR, dest, tag, MPI_COMM_WORLD);
        } else { /* my_rank == 0 */
            for (source = 1; source < p; source++) {
                MPI_Recv(message, 100, MPI_CHAR, source, tag, MPI_COMM_WORLD, &status);
                printf("%s\n", message);
            } /* end for */
        } /* end if */

19 First MPI Program, Cont’d
        /* Shut down MPI */
        MPI_Finalize();
        return 0;
    } /* main */
- The details of compilation and execution depend on the system you’re using
- On Bubbawulf:
    gcc -o greetings greetings.c -lmpi
- To run with two processes:
    mpirun -np 2 greetings

20 Running the First Program
- When the program is compiled and run with 4 processes, the output should be:
    Greetings from process 1!
    Greetings from process 2!
    Greetings from process 3!
- This is an example of a special type of MIMD programming called SPMD (single-program, multiple-data) programming
- Different processes execute different statements by branching within the program based on their process ranks
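
Branching on rank is also how an SPMD program usually divides data among its processes. The following is a minimal sketch of one common convention, a block decomposition. The helper name block_range and its parameters are illustrative assumptions and are not part of MPI or of the original talk.

```c
#include <stddef.h>

/* Hypothetical helper (not an MPI routine): compute the half-open range
 * [*lo, *hi) of the n work items that belong to process `rank` when the
 * items are split as evenly as possible among p processes.
 * The first n % p processes receive one extra item. */
void block_range(int rank, int p, size_t n, size_t *lo, size_t *hi)
{
    size_t base  = n / (size_t)p;   /* items every process gets */
    size_t extra = n % (size_t)p;   /* leftover items to spread out */
    size_t r     = (size_t)rank;

    *lo = r * base + (r < extra ? r : extra);
    *hi = *lo + base + (r < extra ? 1 : 0);
}
```

In an MPI program, each process would call this with the values it obtained from MPI_Comm_rank() and MPI_Comm_size() and then loop only over its own range.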

21 MPI
- The program consists entirely of C statements
- MPI is simply a library of definitions and functions (C or Fortran)

22 General MPI Programs
- Every MPI program contains the directive #include "mpi.h", which includes the definitions and declarations necessary for compiling an MPI program
- MPI uses a consistent scheme for identifiers: all begin with "MPI_"
- MPI uses communicators (collections of processes that can send messages to each other); MPI_COMM_WORLD is the default
- Often 1 process per processor, but not necessarily

23 MPI Program Skeleton
    ...
    #include "mpi.h"
    ...
    int main(int argc, char* argv[])
    {
        ...
        /* No MPI functions called before this */
        MPI_Init(&argc, &argv);    /* initialize MPI system */
        ...
        /* No MPI functions called after this */
        MPI_Finalize();            /* clean up MPI memory, etc. */
        ...
    } /* main */

24 Essential MPI Functions
- MPI_Comm_size()
  - Used to find out how many processes are involved in the execution of a program
- MPI_Comm_rank() lets a process find out its rank
  - Essential since we are using SPMD
- MPI_Send() and MPI_Recv() are used to accomplish the actual message passing

25 The Killer App
- Every paradigm shift in computing needs a motivation
- The typical applications for parallel and distributed processing are not as accessible to the general undergraduate
  - Large matrix operations, etc.
- I propose a simple yet interesting application, using synchronous computations

26 Cellular Automata
- The problem space is divided into cells.
- Each cell can be in one of a finite number of states.
- Cells are affected by their neighbors according to certain rules, and all cells are affected simultaneously in a “generation.”
- The rules are re-applied in subsequent generations so that cells evolve, or change state, from generation to generation.

27 Heat Distribution Problem
- An area has known temperatures along each of its edges. Find the temperature distribution within.
- Divide the area into a fine mesh of points, $h_{i,j}$. The temperature at an inside point is taken to be the average of the temperatures of the four neighboring points. It is convenient to describe the edges by points.
- The temperature of each point is found by iterating the equation
  $h_{i,j} = \frac{h_{i-1,j} + h_{i+1,j} + h_{i,j-1} + h_{i,j+1}}{4}, \quad 0 < i < n,\ 0 < j < n$
  for a fixed number of iterations or until the difference between iterations is less than some very small amount.
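
Before parallelizing, it can help to see the iteration as ordinary sequential C. The sketch below is one possible serial version, not code from the talk. The grid size N, the function names, and the use of two arrays (so that every point is updated from the previous iteration's values) are assumptions made for illustration. Here N corresponds to n+1 in the equation, so the fixed edge points sit at indices 0 and N-1.

```c
#include <string.h>

#define N 6   /* illustrative grid size: N x N points, edges included */

/* One sweep: each interior point of `next` becomes the average of its four
 * neighbours in `cur`. Edge points are copied unchanged.
 * Returns the largest absolute change made at any point. */
double jacobi_sweep(double cur[N][N], double next[N][N])
{
    double max_diff = 0.0;

    memcpy(next, cur, sizeof(double) * N * N);   /* carries the edges over */
    for (int i = 1; i < N - 1; i++) {
        for (int j = 1; j < N - 1; j++) {
            next[i][j] = 0.25 * (cur[i - 1][j] + cur[i + 1][j] +
                                 cur[i][j - 1] + cur[i][j + 1]);
            double d = next[i][j] - cur[i][j];
            if (d < 0.0) d = -d;
            if (d > max_diff) max_diff = d;
        }
    }
    return max_diff;
}

/* Repeat sweeps until the largest change drops below `tol` or `limit`
 * sweeps have been performed. The result is left in h.
 * Returns the number of sweeps performed. */
int solve_heat(double h[N][N], double tol, int limit)
{
    double tmp[N][N];
    int sweeps = 0;

    while (sweeps < limit) {
        double diff = jacobi_sweep(h, tmp);
        memcpy(h, tmp, sizeof tmp);
        sweeps++;
        if (diff < tol) break;
    }
    return sweeps;
}
```

A caller would set the edge rows and columns of h to the known temperatures, give the interior an initial guess, and call solve_heat(h, tol, limit).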

28 Heat Distribution Problem

29 Parallel Code
    w = x = y = z = initial temp;
    for (iteration = 0; iteration < limit; iteration++) {
        g = 0.25 * (w + x + y + z);
        send(&g, P(i-1,j));   /* non-blocking sends */
        send(&g, P(i+1,j));
        send(&g, P(i,j-1));
        send(&g, P(i,j+1));
        recv(&w, P(i-1,j));   /* synchronous recvs */
        recv(&x, P(i+1,j));
        recv(&y, P(i,j-1));
        recv(&z, P(i,j+1));
    }
- It is important to use send()s that do not block while waiting for the recv()s; otherwise the processes would deadlock, each waiting for a recv() before moving on. The recv()s must be synchronous and wait for the send()s.
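
The pseudocode addresses processes by grid position, P(i,j), whereas MPI identifies a process by a single rank. A minimal sketch of one way to translate between the two follows. It assumes the processes are arranged row by row on a rows x cols grid; the helper name neighbor_rank and the convention of returning -1 for a missing neighbor are illustrative choices, not part of MPI or of the talk.

```c
/* Hypothetical helper: rank of the process at grid position (i+di, j+dj),
 * where (i, j) is the position of `rank` on a rows x cols grid stored in
 * row-major order. Returns -1 when that position lies outside the grid,
 * i.e. when the process sits on an edge and has no neighbour there. */
int neighbor_rank(int rank, int rows, int cols, int di, int dj)
{
    int i = rank / cols + di;   /* row of the neighbour */
    int j = rank % cols + dj;   /* column of the neighbour */

    if (i < 0 || i >= rows || j < 0 || j >= cols)
        return -1;
    return i * cols + j;
}
```

For example, neighbor_rank(rank, rows, cols, -1, 0) would give the destination for the send to P(i-1,j). MPI also provides Cartesian topology routines (MPI_Cart_create and MPI_Cart_shift) that serve a similar purpose.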

30 The Game of Life
- The most famous cellular automaton is the “Game of Life” devised by John Conway (Scientific American, October 1970)
- Also a good assignment for graphical output, if available
- Board game: a theoretically infinite two-dimensional array of cells.
- Each cell can hold one “organism” and has eight neighboring cells, including those diagonally adjacent. Initially, some cells are occupied.

31 The Rules of Life
1. Every organism with two or three neighboring organisms survives for the next generation.
2. Every organism with four or more neighbors dies from overpopulation.
3. Every organism with one neighbor or none dies from isolation.
4. Each empty cell adjacent to exactly three occupied neighbors will give birth to an organism.
- These rules were derived by Conway “after a long period of experimentation.”
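
The four rules reduce to a small function of a cell's current state and its neighbor count. The sketch below is one way to write that in C; the function name next_state and its parameter names are illustrative and do not come from the talk.

```c
/* Apply the four rules to a single cell.
 * alive:     1 if the cell currently holds an organism, 0 if it is empty
 * neighbors: number of occupied cells among the eight neighbours (0..8)
 * Returns the cell's state in the next generation (1 or 0). */
int next_state(int alive, int neighbors)
{
    if (alive) {
        /* Rule 1: survive with two or three neighbours.
         * Rules 2 and 3: otherwise die (four or more, or one or none). */
        return (neighbors == 2 || neighbors == 3) ? 1 : 0;
    }
    /* Rule 4: an empty cell with exactly three neighbours gives birth. */
    return (neighbors == 3) ? 1 : 0;
}
```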

32 How to Solve Life
- Each block can be represented as a process
- Initialization can be done by giving some blocks one organism and other blocks none. This can be done randomly or using a heuristic approach.

33 Outline of the Code
    do {
        iteration++;
        current_neighbors = 0;
        send(current value (0 or 1) to all neighbors);
        recv(current values of all neighbors);
        current_neighbors = sum of received values;
        if (current_neighbors >= 4)
            organism = 0;   /* Dead from overcrowding */
        else if (current_neighbors <= 1)
            organism = 0;   /* Dead from isolation */
        else if (current_neighbors == 3)
            organism = 1;   /* New organism created */
    } while (!converged() && (iteration < limit));
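
For comparison, here is a minimal sequential sketch of one generation on a small board, with the neighbor sum computed directly from an array instead of from received messages. The board dimensions, the function names, and the decision to treat cells beyond the border as empty are assumptions for illustration (the board in the talk is theoretically infinite). The thresholds follow the Rules of Life: four or more neighbors is overcrowding, one or none is isolation.

```c
#define ROWS 5   /* illustrative board size */
#define COLS 5

/* Count the occupied cells among the (up to) eight neighbours of (r, c).
 * Positions outside the board are treated as empty. */
static int count_neighbors(int board[ROWS][COLS], int r, int c)
{
    int count = 0;
    for (int dr = -1; dr <= 1; dr++) {
        for (int dc = -1; dc <= 1; dc++) {
            int rr = r + dr, cc = c + dc;
            if ((dr != 0 || dc != 0) &&
                rr >= 0 && rr < ROWS && cc >= 0 && cc < COLS)
                count += board[rr][cc];
        }
    }
    return count;
}

/* Compute one generation from `cur` into `next`. Reading only from `cur`
 * makes all cells change "simultaneously".
 * Returns the number of cells whose state changed (0 means converged). */
int life_step(int cur[ROWS][COLS], int next[ROWS][COLS])
{
    int changed = 0;
    for (int r = 0; r < ROWS; r++) {
        for (int c = 0; c < COLS; c++) {
            int current_neighbors = count_neighbors(cur, r, c);
            int organism = cur[r][c];

            if (current_neighbors >= 4)
                organism = 0;            /* dead from overcrowding */
            else if (current_neighbors <= 1)
                organism = 0;            /* dead from isolation */
            else if (current_neighbors == 3)
                organism = 1;            /* new organism created */

            next[r][c] = organism;
            if (organism != cur[r][c]) changed++;
        }
    }
    return changed;
}
```

A driver would call life_step() repeatedly, swapping the two boards after each call, until it returns 0 or an iteration limit is reached.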

34 Some Other Fun Examples
- Foxes and Rabbits
  - Rabbits move around happily (reproducing) while foxes eat any rabbits they come across
  - Also based on a 2-D board
- Sharks and Fishes
  - Ocean modeled as a 3-D array of cells
  - Each cell holds one fish or one shark

35 Serious Applications for Cellular Automata
- Diffusion of gases
- Airflow across an airplane wing
- Erosion/movement of sand at a beach
- Biological growth

36 IEEE Task Force on Cluster Computing
- Aims to foster the use and development of clusters
- Has been in operation since 1999
- Main home page:

37 Conclusions
- Cluster computing can be effectively taught at the undergraduate level
- Excellent and fun examples of applications exist

38 Quote: Gill wrote in 1958 (quoting papers back to 1953): “ … There is therefore nothing new in the basic idea of parallel programming, but only its application to computers. The author cannot believe that there will be any insuperable difficulty in extending it to computers. It is not to be expected that the necessary programming techniques will be worked out overnight. Much experimenting remains to be done. After all, the techniques that are commonly used in programming today were only won at the cost of considerable toil several years ago. In fact the advent of parallel programming may do something to revive the pioneering spirit in programming which seems at the present to be degenerating into a rather dull and routine occupation.” Gill, S. (1958), “Parallel Programming,” The Computer Journal (British) Vol. 1, pp
