
1 ICOM 5995: Performance Instrumentation and Visualization for High Performance Computer Systems
Lecture 7
October 16, 2002
Nayda G. Santiago

2 Overview
Message Passing and Shared Memory
References:
Designing and Building Parallel Programs, by Ian Foster (textbook), Chapters 1, 2, and 8.
Maui HPC Center site

3 Parallel Programming Paradigms
Message Passing - the user makes calls to libraries to explicitly share information between processors.
Data Parallel - data partitioning determines the parallelism; a Single Instruction Multiple Data (SIMD) approach.
Shared Memory - multiple processes sharing a common memory space.

4 Parallel Programming Paradigms
Threads - a single process having multiple (concurrent) execution paths.
Combined Models - composed of two or more of the above.
Note: these models are machine/architecture independent; any of the models can be implemented on any hardware given appropriate operating system support. An effective implementation is one which closely matches its target hardware and provides the user ease in programming.

5 Message Passing
Message Passing Model
Set of processes using only local memory.
Processes communicate by sending and receiving messages.
Data transfer requires cooperative operations to be performed by each process (a send operation must have a matching receive), as in the sketch below.
Programming with message passing is done by linking with and making calls to libraries which manage the data exchange between processors. Message passing libraries are available for most modern programming languages.
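
A minimal C sketch of the cooperative send/receive pairing, written with MPI (MPI itself is introduced on the later slides; the ranks and the tag value are illustrative):

    /* Rank 0 sends one integer; rank 1 posts the matching receive. */
    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv) {
        int rank, value = 42;
        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        if (rank == 0) {
            /* Send 1 int to rank 1 with message tag 0. */
            MPI_Send(&value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
        } else if (rank == 1) {
            /* The receive must match the send's source, tag, and communicator. */
            MPI_Recv(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            printf("rank 1 received %d\n", value);
        }
        MPI_Finalize();
        return 0;
    }

Each process keeps its own copy of value in local memory; the only way data moves between processes is the explicit send/receive pair.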

6 Shared Memory
Shared Memory Model
Processes access the same memory space.
Execution starts as a single thread, which creates additional threads to start a parallel region.
Data transfer: read/write from/to shared memory.
Programming with shared memory is done through compiler directives and by linking with and making calls to libraries which manage threads. Shared memory libraries are available for Fortran and C/C++.
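
A minimal C sketch of the shared-memory idea, using OpenMP directives (OpenMP is introduced on the later slides; the fixed array size is an illustrative assumption):

    /* Each thread writes its ID into a shared array; no messages are exchanged. */
    #include <omp.h>
    #include <stdio.h>

    int main(void) {
        int ids[64] = {0};     /* shared by all threads; assumes at most 64 threads */
        int nthreads = 0;

        #pragma omp parallel   /* fork: create a team of threads */
        {
            int tid = omp_get_thread_num();   /* private to each thread */
            ids[tid] = tid;                   /* data transfer = a plain write to shared memory */
            #pragma omp single
            nthreads = omp_get_num_threads();
        }                      /* join: back to a single thread */

        for (int i = 0; i < nthreads; i++)
            printf("slot %d written by thread %d\n", i, ids[i]);
        return 0;
    }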

7 Implementation of Message Passing: MPI
A standard, portable message-passing library definition developed in 1993 by a group of parallel computer vendors, software writers, and application scientists.
Available for Fortran, C/C++, and Java programs.
Available on a wide variety of parallel machines.
Target platform is a distributed memory system.

8 Implementation of Message Passing: MPI
All inter-task communication is by message passing.
All parallelism is explicit: the programmer is responsible for identifying the parallelism in the program and implementing it with MPI constructs.
Programming model is SPMD (Single Program Multiple Data).
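
A sketch of the SPMD style in C: a single source file is run by every process, and the rank decides which part of the work each process does (the cyclic split of the loop and the use of MPI_Reduce are illustrative choices):

    /* SPMD: every rank runs this same program; rank and size decide who does what. */
    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv) {
        int rank, size;
        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        /* Explicit data decomposition: each rank sums its own slice of 1..100. */
        int n = 100, local = 0, total = 0;
        for (int i = rank + 1; i <= n; i += size)
            local += i;

        /* Explicit communication: combine the partial sums on rank 0. */
        MPI_Reduce(&local, &total, 1, MPI_INT, MPI_SUM, 0, MPI_COMM_WORLD);
        if (rank == 0)
            printf("sum 1..%d = %d\n", n, total);

        MPI_Finalize();
        return 0;
    }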

9 Implementation of Shared Memory: OpenMP
OpenMP is a specification for a set of compiler directives, library routines, and environment variables that can be used to specify shared memory parallelism.
Jointly defined by a group of major computer hardware, software, and application vendors.
OpenMP is designed for Fortran, C, and C++.
Target platform is a shared memory system.

10 Implementation of Shared Memory: OpenMP
Fork-join parallelism: the master thread spawns a team of threads as needed.
Mostly used to parallelize loops.
Threads communicate by sharing variables.
Threads are created with pragmas or compiler directives.
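
A sketch of the typical loop-level use in C (the array size and the reduction clause are illustrative; the code must be built with the compiler's OpenMP flag, e.g. -fopenmp for GCC):

    /* Fork-join: a team of threads is spawned just for the annotated loop. */
    #include <stdio.h>

    #define N 1000000

    int main(void) {
        static double a[N];
        double sum = 0.0;

        for (int i = 0; i < N; i++)
            a[i] = i * 0.5;

        /* The directive splits the iterations across the team; the reduction
           clause gives each thread a private partial sum and combines them
           at the join. */
        #pragma omp parallel for reduction(+:sum)
        for (int i = 0; i < N; i++)
            sum += a[i];

        printf("sum = %f\n", sum);
        return 0;
    }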

11 Evaluating Paradigms
Performance - can the paradigm express the details that lead to an efficient execution?
Mapping Independence - does the expression abstract away the implementation?
Modularity - does it allow simultaneous code development?
Determinism - does the result depend on asynchronous behavior? (affects debugging)

12 Programmer Requirements
The programmer must consider the architecture and the software available when designing the algorithm.
A different high-level algorithm may be necessary depending on the underlying hardware support.

13 Steps to create a parallel program
If you start with a serial program:
Debug the serial code completely.
Identify the parts of the program that can execute concurrently; this requires a thorough understanding of the algorithm.
Exploit the inherent parallelism. This may require restructuring the program or the algorithm, and may even require a new algorithm.

14 Steps to create a parallel program
Decompose the program: functional parallelism, data parallelism, or a combination of both.
Code development:
The code may be determined/influenced by the machine architecture.
Choose a programming paradigm.
Determine the communication pattern.
Add code to accomplish the tasks and communications.

15 Steps to create a parallel program
Compile, test, debug.
Optimization:
Measure performance.
Locate problem areas.
Improve problem areas.

16 Communication
Message passing: communication is explicitly programmed; the programmer must understand and code the communication.
Data parallel: compilers and the run-time system do all communication behind the scenes; the programmer does not need to understand the communication (though this does not necessarily give the best performance).

