1 Parallel Computing—Higher-level concepts of MPI

2 MPI—Presentation Outline
- Communicators, Groups, and Contexts
- Collective Communication
- Derived Datatypes
- Virtual Topologies

3 Communicators, Groups, and Contexts
MPI provides a higher-level abstraction for creating parallel libraries:
- a safe communication space
- group scope for collective operations
- process naming
Communicators + groups provide:
- process naming (ranks instead of IP address + port)
- group scope for collective operations
Contexts provide:
- safe communication

4 What are communicators?
A data structure that contains a group (and thus processes).
Why is it useful?
- Process naming: ranks are the names application programmers use, which is easier than IP addresses + ports.
- It supports group (collective) communication as well as point-to-point communication.
There are two types of communicators:
- Intracommunicators: communication within a group.
- Intercommunicators: communication between two groups (which must be disjoint).
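As a minimal illustration of rank-based naming, here is a complete program in the mpiJava/MPJ Express style used throughout these slides (the class name is illustrative):

import mpi.*;

public class RankExample {
    public static void main(String[] args) throws Exception {
        MPI.Init(args);
        int rank = MPI.COMM_WORLD.Rank();   // this process's "name" within the communicator
        int size = MPI.COMM_WORLD.Size();   // total number of processes in the communicator
        System.out.println("Process " + rank + " of " + size);
        MPI.Finalize();
    }
}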

5 What are contexts?
A unique integer, effectively an additional tag on the messages.
Each communicator has a distinct context that provides a safe communication universe:
- A context is agreed upon by all processes when a communicator is built.
Intracommunicators have two contexts:
- one for point-to-point communication
- one for collective communication
Intercommunicators also have two contexts (explained in the coming slides).

6 Intracommunicators
- Contain one group.
- Allow point-to-point and collective communication between processes within this group.
- Communicators can only be built from existing communicators; MPI.COMM_WORLD is the first intracommunicator to start with.
- Creating an intracommunicator is a collective operation: all processes in the existing communicator must call it for the call to succeed.
- Intracommunicators can have process topologies: Cartesian or graph.

7 Creating new intracommunicators
MPI.Init(args);
int[] incl1 = {0, 3};                              // world ranks to include in the new group
Group grp1 = MPI.COMM_WORLD.Group();               // the group underlying MPI.COMM_WORLD
Group grp2 = grp1.Incl(incl1);                     // subgroup containing ranks 0 and 3
Intracomm newComm = MPI.COMM_WORLD.Create(grp2);   // collective over MPI.COMM_WORLD

8 How do processes agree on a context for a new intracommunicator?
- Each process keeps a static context variable that is incremented whenever an Intracomm is created.
- Each process increments this variable and sends its value to all the other processes.
- The maximum of these integers is agreed upon as the context.
- The existing communicator's context is used for sending the "context agreement" messages.
What about MPI.COMM_WORLD? It is safe anyway, because it is the first intracommunicator and there is no chance of conflicts.
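Conceptually, this agreement step is a max-reduction over every process's candidate value. The sketch below only illustrates that idea at user level inside an initialized MPI program; it is not the library's internal code, and myNextContext is a hypothetical local counter:

int[] candidate = { myNextContext };   // hypothetical per-process candidate for the new context
int[] agreed = new int[1];
// Take the maximum of all candidates; every process gets the same answer back.
MPI.COMM_WORLD.Allreduce(candidate, 0, agreed, 0, 1, MPI.INT, MPI.MAX);
// agreed[0] can now serve as the context of the newly created communicator.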

9 Intercommunicators
- Contain two groups: a local group (the local process is in this group) and a remote group.
- The two groups must be disjoint.
- Only point-to-point communication is allowed.
- Intercommunicators cannot have process topologies.
Next slide: how to create intercommunicators.

10 Creating intercommunicators
MPI.Init(args);
int[] incl2 = {0, 2, 4, 6};                        // the even world ranks
int[] incl3 = {1, 3, 5, 7};                        // the odd world ranks
Group grp1 = MPI.COMM_WORLD.Group();
int rank = MPI.COMM_WORLD.Rank();
Group grp2 = grp1.Incl(incl2);
Group grp3 = grp1.Incl(incl3);
Intracomm comm1 = MPI.COMM_WORLD.Create(grp2);     // local communicator of the even group
Intracomm comm2 = MPI.COMM_WORLD.Create(grp3);     // local communicator of the odd group
Intercomm icomm = null;
if (rank == 0 || rank == 2 || rank == 4 || rank == 6) {
    // peer comm = MPI.COMM_WORLD, local leader = rank 0 of comm1, remote leader = world rank 1, tag = 56
    icomm = MPI.COMM_WORLD.Create_intercomm(comm1, 0, 1, 56);
} else {
    // local leader = rank 0 of comm2 (world rank 1), remote leader = world rank 0, tag = 56
    icomm = MPI.COMM_WORLD.Create_intercomm(comm2, 0, 0, 56);
}

11 Creating intercomms …
The arguments to the Create_intercomm method (called on the peer communicator):
- the local communicator (which contains the current process)
- local_leader (the leader's rank within the local communicator)
- remote_leader (the remote leader's rank within the peer communicator)
- a tag for the messages sent during the selection of contexts
But the groups are disjoint, so how can they communicate? That is where a peer communicator is required: at least the local_leader and remote_leader must be members of this peer communicator. In the figure on slide 13, MPI.COMM_WORLD is the peer communicator, and processes 0 and 1 (ranks relative to MPI.COMM_WORLD) are the leaders of their respective groups.
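Continuing the slide-10 example, a hypothetical exchange across icomm could look like the sketch below. The key point is that in intercommunicator point-to-point calls the destination and source ranks are interpreted relative to the remote group; the message contents and tag are illustrative.

int[] msg = { 42 };
if (rank == 0) {                              // world rank 0 = rank 0 of the even (local) group
    icomm.Send(msg, 0, 1, MPI.INT, 0, 77);    // to rank 0 of the odd (remote) group, i.e. world rank 1
} else if (rank == 1) {                       // world rank 1 = rank 0 of the odd group
    icomm.Recv(msg, 0, 1, MPI.INT, 0, 77);    // from rank 0 of the even (remote) group, i.e. world rank 0
}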

12 Selecting contexts for intercomms
An intercommunicator has two contexts:
- send_context (used for sending messages)
- recv_context (used for receiving messages)
In an intercommunicator, processes in the local group can only send messages to the remote group.
How is the context agreed upon?
- Each group decides its own context.
- The leaders (local and remote) exchange the contexts agreed upon.
- The greater of the two is selected as the context.

13 [Figure: the eight processes of COMM_WORLD (ranks 0–7) partitioned into two disjoint groups, Group 1 (the even ranks) and Group 2 (the odd ranks), for the intercommunicator example.]

14 MPI—Presentation Outline
- Point-to-Point Communication
- Communicators, Groups, and Contexts
- Collective Communication
- Derived Datatypes
- Virtual Topologies

15 Collective communications
Provided as a convenience for application developers:
- they save significant development time
- efficient algorithms may be used
- they are stable (tested)
- they are built on top of point-to-point communications
These operations include Broadcast, Barrier, Reduce, Allreduce, Alltoall, Scatter, Gather, Allgather, and Scan, plus the "v" variants that allow displacements between the data (Scatterv, Gatherv, and so on).

16 [Image from the MPI standard document illustrating broadcast, scatter, gather, allgather, and alltoall.]
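A small illustration of the scatter/gather pattern from the figure, as a sketch in the mpiJava/MPJ Express style inside an initialized MPI program. The chunk size, the tag-free argument layout, and the assumption that every process gets an equal share are illustrative:

int rank = MPI.COMM_WORLD.Rank();
int size = MPI.COMM_WORLD.Size();
int chunk = 4;                                    // elements handed to each process (assumed)
int[] sendbuf = new int[chunk * size];            // only significant at the root
int[] recvbuf = new int[chunk];
if (rank == 0) {
    for (int i = 0; i < sendbuf.length; i++) sendbuf[i] = i;
}
// The root (rank 0) hands each process one chunk of the array ...
MPI.COMM_WORLD.Scatter(sendbuf, 0, chunk, MPI.INT, recvbuf, 0, chunk, MPI.INT, 0);
for (int i = 0; i < chunk; i++) recvbuf[i] *= 2;  // each process works on its own chunk
// ... and collects the processed chunks back, in rank order.
MPI.COMM_WORLD.Gather(recvbuf, 0, chunk, MPI.INT, sendbuf, 0, chunk, MPI.INT, 0);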

17 Reduce collective operations
Predefined reduction operations:
- MPI.PROD, MPI.SUM, MPI.MIN, MPI.MAX
- MPI.LAND, MPI.BAND, MPI.LOR, MPI.BOR, MPI.LXOR, MPI.BXOR
- MPI.MINLOC, MPI.MAXLOC
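For example, a global sum with MPI.SUM might look like this minimal sketch in the mpiJava/MPJ Express style (the contributed values and the choice of rank 0 as root are illustrative):

int rank = MPI.COMM_WORLD.Rank();
int[] local = { rank + 1 };      // each process contributes one value
int[] global = new int[1];
// Combine every process's value with MPI.SUM; only the root (rank 0) receives the result.
MPI.COMM_WORLD.Reduce(local, 0, global, 0, 1, MPI.INT, MPI.SUM, 0);
if (rank == 0) {
    System.out.println("Sum over all ranks = " + global[0]);
}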

18 A Typical Barrier() Implementation
- Eight processes, so they form only one group.
- Each process exchanges an integer four times.
- Communications overlap well.
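One common way to build a barrier purely from point-to-point messages is the dissemination algorithm sketched below. This is an illustrative sketch, not necessarily the exact scheme in the figure (dissemination needs ceil(log2 N) rounds, i.e. three exchanges for eight processes); the tag value and the Isend/Recv pairing are assumptions in the mpiJava/MPJ Express style:

int size = MPI.COMM_WORLD.Size();
int rank = MPI.COMM_WORLD.Rank();
int[] token = {1};
int[] dummy = new int[1];
for (int step = 1; step < size; step <<= 1) {
    int dest = (rank + step) % size;              // partner to notify in this round
    int src = (rank - step + size) % size;        // partner to wait for in this round
    Request req = MPI.COMM_WORLD.Isend(token, 0, 1, MPI.INT, dest, 99);
    MPI.COMM_WORLD.Recv(dummy, 0, 1, MPI.INT, src, 99);
    req.Wait();
}
// After the loop every process has, directly or indirectly, heard from every other process,
// so no process leaves the "barrier" before all of them have entered it.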

19 Intracomm.Bcast( … )
Sends data from one process to all the other processes.
Code from adlib, a communication library for HPJava. The current implementation is based on an n-ary tree:
- generated dynamically
- cost: O(log2 N)
- limitation: broadcasts only from rank = 0
MPICH 1.2.5 uses a linear algorithm with cost O(N); MPICH2 has much improved algorithms. LAM/MPI uses n-ary trees, again with the limitation of broadcasting only from rank = 0.
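Regardless of the algorithm underneath, usage is a single collective call on the intracommunicator. A minimal sketch in the mpiJava/MPJ Express style (root 0 and the buffer contents are illustrative):

int rank = MPI.COMM_WORLD.Rank();
int[] data = new int[4];
if (rank == 0) {
    for (int i = 0; i < data.length; i++) data[i] = i * i;   // only the root fills the buffer
}
// Every process makes the same call; afterwards all of them hold the root's copy of data.
MPI.COMM_WORLD.Bcast(data, 0, data.length, MPI.INT, 0);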

20 A Typical Broadcast Implementation

21 MPI—Presentation Outline
- Point-to-Point Communication
- Communicators, Groups, and Contexts
- Collective Communication
- Derived Datatypes
- Virtual Topologies

22 MPI Datatypes
What kind (type) of data can be sent using MPI messaging? Basically two types:
- basic (primitive) datatypes
- derived datatypes

23 MPI Basic Datatypes
MPI_CHAR, MPI_SHORT, MPI_INT, MPI_LONG, MPI_UNSIGNED_CHAR, MPI_UNSIGNED_SHORT, MPI_UNSIGNED_LONG, MPI_UNSIGNED, MPI_FLOAT, MPI_DOUBLE, MPI_LONG_DOUBLE, MPI_BYTE

24 Derived Datatypes
Besides the basic datatypes, it is possible to communicate heterogeneous and non-contiguous data using derived datatypes:
- Contiguous
- Indexed
- Vector
- Struct
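As a hedged sketch of the idea, the lines below build a contiguous derived type describing one row of a 4x4 integer matrix and send a single row with it. The factory-method form (an instance method on the element type followed by Commit()) is assumed from the mpiJava 1.2 style and should be checked against the binding in use; the destination rank and tag are illustrative.

Datatype rowType = MPI.INT.Contiguous(4);          // assumed mpiJava-style constructor: 4 contiguous ints
rowType.Commit();                                  // commit before use in communication
int[] matrix = new int[4 * 4];                     // 4x4 matrix stored row-major
// Send the second row (element offset 4) as one element of the derived type.
MPI.COMM_WORLD.Send(matrix, 4, 1, rowType, 1, 10);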

25 MPI—Presentation Outline
- Point-to-Point Communication
- Communicators, Groups, and Contexts
- Collective Communication
- Derived Datatypes
- Virtual Topologies

26 Virtual topologies
- Used to arrange the processes in a geometric shape.
- Virtual topologies have no necessary connection with the physical layout of the machines, although the implementation may exploit the underlying machine architecture.
- A virtual topology can be attached to the processes of an intracommunicator.
MPI provides:
- Cartesian topology
- Graph topology

27 Cartesian topology: mapping four processes onto a 2x2 grid
Each process is assigned a coordinate:
- rank 0: (0,0)
- rank 1: (1,0)
- rank 2: (0,1)
- rank 3: (1,1)
Uses:
- calculate a rank from a known grid position
- calculate grid positions from ranks
- easier to locate the ranks of neighbours
Applications often have communication patterns with lots of messaging between immediate neighbours, which this supports directly (a code sketch follows slide 28).

28 Periods in a Cartesian topology
- Axis 1 (the y-axis) is periodic: processes in the top and bottom rows have valid neighbours towards the top and bottom respectively, because the grid wraps around.
- Axis 0 (the x-axis) is non-periodic: processes in the rightmost and leftmost columns have undefined neighbours towards the right and left respectively.
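A sketch of building the 2x2 grid and looking up neighbours, in the mpiJava/MPJ Express style. The Create_cart, Coords, and Shift bindings are assumed to mirror MPI_Cart_create, MPI_Cart_coords, and MPI_Cart_shift, so the exact signatures and the ShiftParms field names should be checked against the library actually in use:

int[] dims = {2, 2};                               // 2 x 2 process grid
boolean[] periods = {false, true};                 // axis 0 non-periodic, axis 1 periodic
Cartcomm grid = MPI.COMM_WORLD.Create_cart(dims, periods, true);   // true: allow rank reordering
int[] myCoords = grid.Coords(grid.Rank());         // this process's position in the grid
ShiftParms up = grid.Shift(1, 1);                  // along the periodic axis: always a valid neighbour
ShiftParms right = grid.Shift(0, 1);               // along the non-periodic axis: undefined at the edge
// up.rank_dest and right.rank_dest give neighbour ranks to use as Send/Recv partners;
// on the non-periodic axis, edge processes get a "no neighbour" value that must be checked.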

29 Graph topology

30 Doing Matrix Multiplication using MPI
Just to give you an idea of how MPI-based applications are designed …

31 [Figure: the resultant matrix expressed as the product of the two input matrices; basically how it works!]

32 Matrix Multiplication M x N
..
int rank = MPI.COMM_WORLD.Rank();
int size = MPI.COMM_WORLD.Size();
if (master_mpi_process) {
    initialize matrices M and N
    for (int i = 1; i < size; i++) {
        send rows of matrix M to process i
    }
    broadcast matrix N to all non-zero processes
    for (int i = 1; i < size; i++) {
        receive rows of the resultant matrix from process i
    }
    ..
    print results
    ..
} else {
    receive rows of matrix M
    call broadcast to receive matrix N
    compute the matrix product for the local sub-matrix (done in parallel)
    send the resultant rows back to the master process
}
..
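Below is a hedged, runnable sketch of this scheme in the mpiJava/MPJ Express style used in the earlier slides. The class name, matrix size, tags, row-major 1-D storage, and the requirement that the program run with at least two processes and that ROWS divide evenly by the number of workers are illustrative assumptions, not part of the original slide.

import mpi.*;

public class MatMul {
    static final int ROWS = 4;                         // assumed square matrix size

    public static void main(String[] args) throws Exception {
        MPI.Init(args);
        int rank = MPI.COMM_WORLD.Rank();
        int size = MPI.COMM_WORLD.Size();
        int rowsPerWorker = ROWS / (size - 1);         // assumes ROWS % (size - 1) == 0

        double[] m = new double[ROWS * ROWS];          // matrix M, row-major
        double[] n = new double[ROWS * ROWS];          // matrix N, row-major
        double[] result = new double[ROWS * ROWS];     // resultant matrix, gathered at the master

        if (rank == 0) {                               // master process
            // initialize M and N (N is the identity here, so the result should equal M)
            for (int i = 0; i < ROWS * ROWS; i++) { m[i] = i; n[i] = (i / ROWS == i % ROWS) ? 1 : 0; }
            for (int w = 1; w < size; w++)             // send each worker its block of rows of M
                MPI.COMM_WORLD.Send(m, (w - 1) * rowsPerWorker * ROWS,
                                    rowsPerWorker * ROWS, MPI.DOUBLE, w, 1);
            MPI.COMM_WORLD.Bcast(n, 0, ROWS * ROWS, MPI.DOUBLE, 0);        // broadcast N
            for (int w = 1; w < size; w++)             // collect computed rows back, in worker order
                MPI.COMM_WORLD.Recv(result, (w - 1) * rowsPerWorker * ROWS,
                                    rowsPerWorker * ROWS, MPI.DOUBLE, w, 2);
            System.out.println("result[0][0] = " + result[0]);
        } else {                                       // worker process
            double[] myRows = new double[rowsPerWorker * ROWS];
            double[] myOut = new double[rowsPerWorker * ROWS];
            MPI.COMM_WORLD.Recv(myRows, 0, rowsPerWorker * ROWS, MPI.DOUBLE, 0, 1);
            MPI.COMM_WORLD.Bcast(n, 0, ROWS * ROWS, MPI.DOUBLE, 0);        // receive N via the broadcast
            for (int i = 0; i < rowsPerWorker; i++)    // multiply the local rows of M by N
                for (int j = 0; j < ROWS; j++)
                    for (int k = 0; k < ROWS; k++)
                        myOut[i * ROWS + j] += myRows[i * ROWS + k] * n[k * ROWS + j];
            MPI.COMM_WORLD.Send(myOut, 0, rowsPerWorker * ROWS, MPI.DOUBLE, 0, 2);
        }
        MPI.Finalize();
    }
}

Compared with the pseudocode, the only structural point worth noting is that each worker posts its row Recv before the broadcast, matching the order in which the master issues its sends, so the point-to-point transfers and the collective cannot block each other.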