CS 140: Models of parallel programming: Distributed memory and MPI.


Technology Trends: Microprocessor Capacity
Moore's Law: the number of transistors per chip doubles every 1.5 years. Microprocessors keep getting smaller, denser, and more powerful. Gordon Moore (Intel co-founder) predicted in 1965 that the transistor density of semiconductor chips would double roughly every 18 months.

Trends in processor clock speed
Triton's clock speed is still only 2600 MHz in 2015!

4-core Intel Sandy Bridge (Triton uses an 8-core version), 2600 MHz clock speed

Generic Parallel Machine Architecture
Key architecture question: Where and how fast are the interconnects?
Key algorithm question: Where is the data?
[Diagram: several processors, each with its own cache, L2 cache, L3 cache, and memory (a storage hierarchy), joined by potential interconnects at each level.]

Triton memory hierarchy: I (Chip level)
[Diagram: 8 processor cores on one chip, each with its own cache and L2 cache, sharing an 8MB L3 cache (AMD Opteron 8-core Magny-Cours, similar to Triton's Intel Sandy Bridge). The chip sits in a socket, connected to the rest of the node.]

Triton memory hierarchy II (Node level)
[Diagram: two chips per node, each chip containing processors with L1/L2 caches and its own 20 MB L3 cache, all sharing 64GB of node memory.]

Triton memory hierarchy III (System level)
324 nodes (64GB each), message-passing communication, no shared memory between nodes.

Some models of parallel computation (computational model: languages)
Shared memory: Cilk, OpenMP, Pthreads, …
SPMD / Message passing: MPI
SIMD / Data parallel: CUDA, Matlab, OpenCL, …
PGAS / Partitioned global address space: UPC, CAF, Titanium
Loosely coupled: Map/Reduce, Hadoop, …
Hybrids: ???

Parallel programming languages
Many have been invented; there is *much* less consensus on what the best languages are than in the sequential world. We could have a whole course on them; we'll look at just a few.
Languages you'll use in homework:
C with MPI (very widely used, very old-fashioned)
Cilk Plus (a newer upstart)
You will choose a language for the final project.

Message-passing programming model
Architecture: Each processor has its own memory and cache but cannot directly access another processor's memory.
Language: MPI ("Message-Passing Interface"), a least common denominator based on 1980s technology. Links to documentation are on the course home page.
SPMD = "Single Program, Multiple Data"
[Diagram: processors P0, P1, ..., Pn, each with its own memory and network interface (NI), connected by an interconnect.]

Hello, world in MPI

#include <stdio.h>
#include "mpi.h"

int main( int argc, char *argv[] )
{
    int rank, size;
    MPI_Init( &argc, &argv );
    MPI_Comm_size( MPI_COMM_WORLD, &size );
    MPI_Comm_rank( MPI_COMM_WORLD, &rank );
    printf( "Hello world from process %d of %d\n", rank, size );
    MPI_Finalize();
    return 0;
}
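As a rough guide (exact commands depend on the local MPI installation, and shared clusters usually launch jobs through a batch scheduler), a program like this is compiled with the MPI wrapper compiler and started with an MPI launcher, roughly:

mpicc hello.c -o hello
mpirun -np 4 ./hello

All 4 processes run the same executable (SPMD); each prints a line with its own rank.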

MPI in nine routines (all you really need)
MPI_Init: Initialize
MPI_Finalize: Finalize
MPI_Comm_size: How many processes?
MPI_Comm_rank: Which process am I?
MPI_Wtime: Timer
MPI_Send: Send data to one proc
MPI_Recv: Receive data from one proc
MPI_Bcast: Broadcast data to all procs
MPI_Reduce: Combine data from all procs
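Most of these routines fit together in a few lines. The following minimal sketch (the summation problem, the variable names, and the choice of n = 1,000,000 are illustrative, not from the course) broadcasts a problem size from process 0, has every process compute a partial sum, reduces the partial sums back onto process 0, and times the whole thing with MPI_Wtime:

#include <stdio.h>
#include "mpi.h"

int main( int argc, char *argv[] )
{
    int rank, size, n = 0;
    long long local = 0, total = 0;
    MPI_Init( &argc, &argv );
    MPI_Comm_size( MPI_COMM_WORLD, &size );
    MPI_Comm_rank( MPI_COMM_WORLD, &rank );

    if (rank == 0) n = 1000000;                       /* only rank 0 knows the problem size */
    double t0 = MPI_Wtime();
    MPI_Bcast( &n, 1, MPI_INT, 0, MPI_COMM_WORLD );   /* now every process knows n */

    for (int i = rank; i < n; i += size)              /* each process sums its own slice of 0..n-1 */
        local += i;

    MPI_Reduce( &local, &total, 1, MPI_LONG_LONG, MPI_SUM, 0, MPI_COMM_WORLD );
    double t1 = MPI_Wtime();

    if (rank == 0)
        printf( "sum = %lld, elapsed = %f seconds\n", total, t1 - t0 );
    MPI_Finalize();
    return 0;
}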

Ten more MPI routines (sometimes useful)
More collective ops (like Bcast and Reduce): MPI_Alltoall, MPI_Alltoallv, MPI_Scatter, MPI_Gather
Non-blocking send and receive: MPI_Isend, MPI_Irecv, MPI_Wait, MPI_Test, MPI_Probe, MPI_Iprobe
Synchronization: MPI_Barrier
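As one illustration of the non-blocking calls (the ring pattern below is just an example, not something from the slides), each process posts a receive from its left neighbor and a send to its right neighbor, could do useful work while the messages are in flight, and then waits for both to complete; the barrier at the end is there only to show MPI_Barrier:

#include <stdio.h>
#include "mpi.h"

int main( int argc, char *argv[] )
{
    int rank, size;
    MPI_Init( &argc, &argv );
    MPI_Comm_size( MPI_COMM_WORLD, &size );
    MPI_Comm_rank( MPI_COMM_WORLD, &rank );

    int right = (rank + 1) % size;            /* neighbor to send to */
    int left  = (rank + size - 1) % size;     /* neighbor to receive from */
    int sendval = rank, recvval = -1;
    MPI_Request reqs[2];

    MPI_Irecv( &recvval, 1, MPI_INT, left,  0, MPI_COMM_WORLD, &reqs[0] );
    MPI_Isend( &sendval, 1, MPI_INT, right, 0, MPI_COMM_WORLD, &reqs[1] );
    /* useful computation could overlap with communication here */
    MPI_Wait( &reqs[0], MPI_STATUS_IGNORE );  /* recvval is not safe to read until this returns */
    MPI_Wait( &reqs[1], MPI_STATUS_IGNORE );  /* sendval is not safe to reuse until this returns */

    MPI_Barrier( MPI_COMM_WORLD );            /* purely to illustrate MPI_Barrier */
    printf( "process %d received %d from process %d\n", rank, recvval, left );

    MPI_Finalize();
    return 0;
}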

Example: Send an integer x from proc 0 to proc 1

MPI_Comm_rank(MPI_COMM_WORLD, &myrank);  /* get rank */
int msgtag = 1;
if (myrank == 0) {
    int x = 17;
    MPI_Send(&x, 1, MPI_INT, 1, msgtag, MPI_COMM_WORLD);
} else if (myrank == 1) {
    int x;
    MPI_Recv(&x, 1, MPI_INT, 0, msgtag, MPI_COMM_WORLD, &status);
}
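The fragment above assumes the usual setup; one way to complete it is the following sketch, where the declarations of myrank and status, the printf, and the MPI_Init/MPI_Finalize calls are the parts the slide leaves implicit (it needs at least two processes to run):

#include <stdio.h>
#include "mpi.h"

int main( int argc, char *argv[] )
{
    int myrank;
    MPI_Status status;
    MPI_Init( &argc, &argv );
    MPI_Comm_rank( MPI_COMM_WORLD, &myrank );   /* get rank */
    int msgtag = 1;
    if (myrank == 0) {
        int x = 17;
        MPI_Send( &x, 1, MPI_INT, 1, msgtag, MPI_COMM_WORLD );
    } else if (myrank == 1) {
        int x;
        MPI_Recv( &x, 1, MPI_INT, 0, msgtag, MPI_COMM_WORLD, &status );
        printf( "process 1 received x = %d\n", x );
    }
    MPI_Finalize();
    return 0;
}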

Some MPI Concepts: Communicator
A set of processes that are allowed to communicate among themselves; kind of like a "radio channel".
Default communicator: MPI_COMM_WORLD
A library can use its own communicator, separated from that of a user program.
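Communicators other than MPI_COMM_WORLD are created by the program. As a small illustrative sketch (the even/odd split and the name subcomm are made up for the example), MPI_Comm_split carves MPI_COMM_WORLD into disjoint sub-communicators, and ranks are renumbered within each one:

#include <stdio.h>
#include "mpi.h"

int main( int argc, char *argv[] )
{
    int worldrank, subrank;
    MPI_Comm subcomm;                          /* "subcomm" is just an illustrative name */
    MPI_Init( &argc, &argv );
    MPI_Comm_rank( MPI_COMM_WORLD, &worldrank );

    int color = worldrank % 2;                 /* processes with the same color share a communicator */
    MPI_Comm_split( MPI_COMM_WORLD, color, worldrank, &subcomm );
    MPI_Comm_rank( subcomm, &subrank );        /* ranks start over at 0 inside each sub-communicator */

    printf( "world rank %d is rank %d in its sub-communicator\n", worldrank, subrank );

    MPI_Comm_free( &subcomm );
    MPI_Finalize();
    return 0;
}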

Some MPI Concepts: Data Type
What kind of data is being sent/received?
Mostly just names for C data types: MPI_INT, MPI_CHAR, MPI_DOUBLE, etc.
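For example (a hedged fragment, assuming the same myrank setup as in the earlier examples), the count says how many items to send and the datatype says what each item is, so a whole array goes in one message:

double a[100];
/* ... rank 0 fills in a ... */
if (myrank == 0) {
    MPI_Send( a, 100, MPI_DOUBLE, 1, 0, MPI_COMM_WORLD );   /* 100 items, each an MPI_DOUBLE */
} else if (myrank == 1) {
    MPI_Recv( a, 100, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE );
}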

Some MPI Concepts: Message Tag
An arbitrary (integer) label for a message.
The tag of a Send must match the tag of the Recv.
Useful for error checking & debugging.

Parameters of blocking send
MPI_Send(buf, count, datatype, dest, tag, comm)
buf: address of send buffer
count: number of items to send
datatype: datatype of each item
dest: rank of destination process
tag: message tag
comm: communicator

Parameters of blocking receive
MPI_Recv(buf, count, datatype, src, tag, comm, status)
buf: address of receive buffer
count: maximum number of items to receive
datatype: datatype of each item
src: rank of source process
tag: message tag
comm: communicator
status: status after operation
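The status argument is how a receiver finds out what it actually got. As a hedged fragment (assuming the usual setup, with some process having sent this one an int), the wildcards MPI_ANY_SOURCE and MPI_ANY_TAG accept any matching message, and the status fields then report where it came from:

MPI_Status status;
int x, count;
MPI_Recv( &x, 1, MPI_INT, MPI_ANY_SOURCE, MPI_ANY_TAG, MPI_COMM_WORLD, &status );
printf( "message came from rank %d with tag %d\n", status.MPI_SOURCE, status.MPI_TAG );
MPI_Get_count( &status, MPI_INT, &count );   /* how many items actually arrived */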

Example: Send an integer x from proc 0 to proc 1

MPI_Comm_rank(MPI_COMM_WORLD, &myrank);  /* get rank */
int msgtag = 1;
if (myrank == 0) {
    int x = 17;
    MPI_Send(&x, 1, MPI_INT, 1, msgtag, MPI_COMM_WORLD);
} else if (myrank == 1) {
    int x;
    MPI_Recv(&x, 1, MPI_INT, 0, msgtag, MPI_COMM_WORLD, &status);
}