
1 MPI and OpenMP

2 How to Get and Install MPI
mpich2-doc-install.pdf   // installation guide
mpich2-doc-user.pdf      // user guide

3 Outline Message-passing model Message Passing Interface (MPI)
Coding MPI programs Compiling MPI programs Running MPI programs Benchmarking MPI programs OpenMP

4 Message-passing Model

5 Processes
The number of processes is specified at start-up time and remains constant throughout execution of the program. All processes execute the same program; each has a unique ID number. Each process alternately performs computations and communicates.

6 Circuit Satisfiability
[Figure: a combinational circuit with 16 inputs; the input assignment shown is not satisfied]

7 Solution Method Circuit satisfiability is NP-complete
No known algorithm solves it in polynomial time. We seek all solutions and find them through exhaustive search: 16 inputs → 65,536 combinations to test.
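As an illustration of how a single combination number can be tested, one common sketch (hypothetical here; the course's actual check_circuit may differ) extracts the 16 input bits from the integer and evaluates the circuit on them:

/* Extract bit i of integer n (1 if that bit is set, 0 otherwise) */
#define EXTRACT_BIT(n,i) (((n) >> (i)) & 1)

void check_circuit (int id, int z) {
   int v[16];                      /* the 16 boolean inputs */
   int i;
   for (i = 0; i < 16; i++)
      v[i] = EXTRACT_BIT (z, i);   /* decode combination number z */
   /* ... evaluate the circuit on v[0..15]; if it is satisfied, print id and z ... */
}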

8 Partitioning: Functional Decomposition
Embarrassingly parallel: No channels between tasks

9 Agglomeration and Mapping
Properties of the parallel algorithm: fixed number of tasks; no communication between tasks; time needed per task is variable. Consult the mapping strategy decision tree: map tasks to processors in a cyclic fashion.

10 Cyclic (interleaved) Allocation
Assume p processes; each process gets every pth piece of work.
Example: 5 processes and 12 pieces of work
P0: 0, 5, 10
P1: 1, 6, 11
P2: 2, 7
P3: 3, 8
P4: 4, 9

11 Summary of Program Design
The program will consider all 65,536 combinations of the 16 boolean inputs. Combinations are allocated in cyclic fashion to processes. Each process examines each of its combinations; if it finds a satisfiable combination, it prints it.
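Put together, a minimal sketch of such a program might look like the following (assuming a check_circuit function that tests one combination and prints it if it satisfies the circuit; the individual pieces are explained on the slides below):

#include <mpi.h>
#include <stdio.h>

void check_circuit (int id, int z);   /* assumed: tests combination z, prints it if satisfiable */

int main (int argc, char *argv[]) {
   int i;
   int id;   /* process rank */
   int p;    /* number of processes */

   MPI_Init (&argc, &argv);
   MPI_Comm_rank (MPI_COMM_WORLD, &id);
   MPI_Comm_size (MPI_COMM_WORLD, &p);

   /* cyclic allocation: process id tests combinations id, id+p, id+2p, ... */
   for (i = id; i < 65536; i += p)
      check_circuit (id, i);

   MPI_Finalize ();
   return 0;
}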

12 Include Files
#include <mpi.h>     /* MPI header file */
#include <stdio.h>   /* Standard I/O header file */

13 Local Variables
int main (int argc, char *argv[]) {
   int i;
   int id;   /* Process rank */
   int p;    /* Number of processes */
   void check_circuit (int, int);
Include argc and argv: they are needed to initialize MPI. One copy of every variable exists for each process running this program.

14 Initialize MPI
MPI_Init (&argc, &argv);
First MPI function called by each process. Not necessarily the first executable statement. Allows the system to do any necessary setup.

15 Communicators
Communicator: opaque object that provides a message-passing environment for processes.
MPI_COMM_WORLD: the default communicator; includes all processes.
Possible to create new communicators (done in Chapters 8 and 9).

16 Communicator
[Figure: the communicator MPI_COMM_WORLD, containing six processes with ranks 0 through 5]

17 Determine Number of Processes
MPI_Comm_size (MPI_COMM_WORLD, &p);
First argument is the communicator; the number of processes is returned through the second argument.

18 Determine Process Rank
MPI_Comm_rank (MPI_COMM_WORLD, &id);
First argument is the communicator; the process rank (in the range 0, 1, …, p-1) is returned through the second argument.

19 Replication of Automatic Variables
[Figure: six processes, each with its own copy of the automatic variables id (values 0 through 5) and p (value 6)]

20 What about External Variables?
int total;
int main (int argc, char *argv[]) {
   int i;
   int id;
   int p;
Where is the variable total stored?

21 Cyclic Allocation of Work
for (i = id; i < 65536; i += p)
   check_circuit (id, i);
Parallelism is outside function check_circuit; it can be an ordinary, sequential function.

22 Shutting Down MPI
MPI_Finalize();
Call after all other MPI library calls. Allows the system to free up MPI resources.

23 Our Call to MPI_Reduce()
MPI_Reduce (&count, &global_count, 1, MPI_INT, MPI_SUM, 0, MPI_COMM_WORLD);
Only process 0 will get the result:
if (!id)
   printf ("There are %d different solutions\n", global_count);
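For reference, the meaning of each argument (count here is assumed to be the process's local tally of satisfying combinations):

MPI_Reduce (&count,           /* send buffer: this process's local count */
            &global_count,    /* receive buffer: meaningful only on the root */
            1,                /* number of elements to reduce */
            MPI_INT,          /* datatype of the elements */
            MPI_SUM,          /* reduction operation */
            0,                /* rank of the root (receiving) process */
            MPI_COMM_WORLD);  /* communicator */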

24 Benchmarking the Program
MPI_Barrier → barrier synchronization
MPI_Wtick → timer resolution
MPI_Wtime → current time
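A common pattern for using these three calls to time a section of an MPI program (a sketch; the variable names are illustrative and id is assumed to hold the process rank):

double elapsed;
MPI_Barrier (MPI_COMM_WORLD);   /* synchronize so all processes start the timer together */
elapsed = -MPI_Wtime();         /* record the start time */
/* ... the computation being benchmarked ... */
elapsed += MPI_Wtime();         /* elapsed now holds the wall-clock time in seconds */
if (!id)
   printf ("Elapsed time: %10.6f seconds (timer resolution %g)\n", elapsed, MPI_Wtick());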

25 How to form a Ring?
To form a ring of systems, first execute the following command on one machine:
~]$ mpd &
[1] 9152
This makes MPI_system1 the master system of the ring. Its host name and port can be checked with:
~]$ mpdtrace -l
MPI_system1_32958 ( )

26 How to form a Ring? (cont’d)
Then run the following command in a terminal on every other system:
~]$ mpd -h MPI_system1 -p <port> &
Here -h gives the host name of the master system and -p gives the port number of the master system.

27 How to kill a Ring?
To kill the ring, run the following command on the master system:
~]$ mpdallexit

28 Compiling MPI Programs
To compile and execute the above program, first compile sat1.c by executing the following command on the master system:
~]$ mpicc -o sat1.out sat1.c
Here mpicc is the MPI compiler command, sat1.c is the source file, and sat1.out is the output executable.

29 Running MPI Programs
To run the resulting executable, type the following command on the master system:
~]$ mpiexec -n 1 ./sat1.out
The -n flag gives the number of processes to start (here 1).

30 Benchmarking Results

31 OpenMP
OpenMP: an application programming interface (API) for parallel programming on multiprocessors. It consists of compiler directives and a library of support functions, and works in conjunction with Fortran, C, or C++.

32 Shared-memory Model
Processors interact and synchronize with each other through shared variables.

33 Fork/Join Parallelism
Initially only the master thread is active; it executes the sequential code.
Fork: the master thread creates or awakens additional threads to execute parallel code.
Join: at the end of the parallel code, the created threads die or are suspended.

34 Fork/Join Parallelism

35 Shared-memory Model vs. Message-passing Model
Shared-memory model: the number of active threads is 1 at the start and finish of the program and changes dynamically during execution.
Message-passing model: all processes are active throughout execution of the program.

36 Parallel for Loops
C programs often express data-parallel operations as for loops:
for (i = first; i < size; i += prime)
   marked[i] = 1;
OpenMP makes it easy to indicate when the iterations of a loop may execute in parallel; the compiler takes care of generating code that forks/joins threads and allocates the iterations to threads.

37 Pragmas Pragma: a compiler directive in C or C++
Stands for “pragmatic information”: a way for the programmer to communicate with the compiler. The compiler is free to ignore pragmas. Syntax: #pragma omp <rest of pragma>

38 Parallel for Pragma
Format:
#pragma omp parallel for
for (i = 0; i < N; i++)
   a[i] = b[i] + c[i];
The compiler must be able to verify that the run-time system will have the information it needs to schedule loop iterations.

39 Function omp_get_num_procs
int omp_get_num_procs (void)
Returns the number of physical processors available for use by the parallel program.

40 Function omp_set_num_threads
void omp_set_num_threads (int t)
Uses the parameter value to set the number of threads to be active in parallel sections of code. May be called at multiple points in a program.
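A small sketch of how the two functions might be used together (the one-thread-per-processor choice is only an illustration):

#include <omp.h>
#include <stdio.h>

int main (void) {
   int i;
   double a[1000];

   /* illustrative choice: one thread per physical processor */
   omp_set_num_threads (omp_get_num_procs());

   #pragma omp parallel for
   for (i = 0; i < 1000; i++)
      a[i] = 2.0 * i;           /* iterations divided among the threads */

   printf ("a[999] = %f, up to %d threads available\n", a[999], omp_get_max_threads());
   return 0;
}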

41 Comparison
Characteristic                          OpenMP   MPI
Suitable for multiprocessors            Yes      Yes
Suitable for multicomputers             No       Yes
Supports incremental parallelization    Yes      No
Minimal extra code                      Yes      No
Explicit control of memory hierarchy    No       Yes

42 C+MPI vs. C+MPI+OpenMP
[Figure: two program organizations compared side by side: "C + MPI" and "C + MPI + OpenMP"]
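As an illustration of what "C + MPI + OpenMP" means in practice (a sketch, not code from these slides): MPI distributes work across processes, while an OpenMP pragma splits each process's share of the loop across its threads.

#include <mpi.h>
#include <stdio.h>

int main (int argc, char *argv[]) {
   int i, id, p;
   double local_sum = 0.0, global_sum;

   MPI_Init (&argc, &argv);
   MPI_Comm_rank (MPI_COMM_WORLD, &id);
   MPI_Comm_size (MPI_COMM_WORLD, &p);

   /* each MPI process takes a cyclic share of the iterations;
      OpenMP threads divide that share within the process */
   #pragma omp parallel for reduction(+:local_sum)
   for (i = id; i < 1000000; i += p)
      local_sum += 1.0 / (i + 1);

   MPI_Reduce (&local_sum, &global_sum, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);
   if (!id) printf ("Sum = %f\n", global_sum);
   MPI_Finalize ();
   return 0;
}

Such a program would be compiled with something like mpicc -fopenmp (assuming a gcc-based mpicc).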

43 Benchmarking Results

44 Example Programs
Files               Description
omp_hello.c         Hello world
omp_workshare1.c    Loop work-sharing
omp_workshare2.c    Sections work-sharing
omp_reduction.c     Combined parallel loop reduction
omp_orphan.c        Orphaned parallel loop reduction
omp_mm.c            Matrix multiply
omp_getEnvInfo.c    Get and print environment information

45 How to run an OpenMP program
$ export OMP_NUM_THREADS=2
$ gcc -fopenmp filename.c
$ ./a.out
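For example, filename.c could be a minimal hello-world like this sketch (the actual omp_hello.c from the example list may differ):

#include <omp.h>
#include <stdio.h>

int main (void) {
   /* fork a team of threads; each prints its own ID */
   #pragma omp parallel
   {
      printf ("Hello world from thread %d of %d\n",
              omp_get_thread_num(), omp_get_num_threads());
   }   /* implicit join: only the master thread continues from here */
   return 0;
}

With OMP_NUM_THREADS=2 this prints two hello lines.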

46 For more information

