Programming Parallel Hardware using MPJ Express By A. Shafi.

Why Java?
- Portability
- A popular language in colleges and the software industry:
  - Large pool of software developers
  - A useful educational tool
- Higher programming abstractions, including OO features
- Improved compile-time and runtime checking of the code
- Automatic garbage collection
- Support for multithreading
- Rich collection of support libraries

MPJ Express
- MPJ Express is an MPI-like library that supports execution of parallel Java applications
- Three existing approaches to Java messaging:
  - Pure Java (sockets based)
  - Java Native Interface (JNI)
  - Remote Method Invocation (RMI)
- Motivation for a new Java messaging system:
  - Maintain compatibility with Java threads by providing thread-safety
  - Handle the conflicting goals of high performance and portability
  - Require no changes to the standard JVM

MPJ Express configuration

MPJ Express Design

Installing MPJ Express
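The installation slide is a screenshot in the original deck; the steps it shows are roughly the following (a sketch; the archive name, version, and paths are assumptions):

```shell
# 1. Download and unpack an MPJ Express release (version is an assumption)
tar -xzf mpj-v0_38.tar.gz

# 2. Point MPJ_HOME at the unpacked directory and add its bin/ to PATH
export MPJ_HOME=$HOME/mpj-v0_38
export PATH=$MPJ_HOME/bin:$PATH

# 3. Compile applications against the library's mpj.jar
javac -cp .:$MPJ_HOME/lib/mpj.jar HelloWorld.java
```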

HelloWorld program
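The HelloWorld listing is an image in the original slides. A minimal sketch of an MPJ Express hello-world, using the mpiJava-style `mpi.MPI` API that MPJ Express implements, looks like this (it needs the MPJ Express runtime to execute):

```java
import mpi.MPI;

public class HelloWorld {
    public static void main(String[] args) throws Exception {
        MPI.Init(args);                    // start the MPJ Express runtime
        int rank = MPI.COMM_WORLD.Rank();  // this process's id
        int size = MPI.COMM_WORLD.Size();  // total number of processes
        System.out.println("Hello from process " + rank + " of " + size);
        MPI.Finalize();                    // shut down cleanly
    }
}
```

Launched with, e.g., `mpjrun.sh -np 4 HelloWorld`, each of the four processes prints one line.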


An Embarrassingly Parallel Toy Example
(figure: a master process distributing independent tasks to Worker 0, Worker 1, Worker 2, and Worker 3)
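The slides' ToyExample source is not included in the transcript; the following is a guessed sketch of such a master/worker program (the task values and the squaring "work" are invented for illustration), matching the figure's one-master, four-worker layout:

```java
import mpi.MPI;

public class ToyExample {
    public static void main(String[] args) throws Exception {
        MPI.Init(args);
        int rank = MPI.COMM_WORLD.Rank();
        int size = MPI.COMM_WORLD.Size();   // run with -np 5: 1 master + 4 workers

        if (rank == 0) {
            // Master: hand one integer task to each worker ...
            for (int w = 1; w < size; w++) {
                int[] task = { w * 10 };
                MPI.COMM_WORLD.Send(task, 0, 1, MPI.INT, w, 0);
            }
            // ... then collect the results back.
            for (int w = 1; w < size; w++) {
                int[] result = new int[1];
                MPI.COMM_WORLD.Recv(result, 0, 1, MPI.INT, w, 0);
                System.out.println("master got " + result[0] + " from worker " + w);
            }
        } else {
            // Worker: receive a task, do the (independent) work, send it back.
            int[] task = new int[1];
            MPI.COMM_WORLD.Recv(task, 0, 1, MPI.INT, 0, 0);
            task[0] = task[0] * task[0];
            MPI.COMM_WORLD.Send(task, 0, 1, MPI.INT, 0, 0);
        }
        MPI.Finalize();
    }
}
```

Because the tasks are independent, no worker ever communicates with another worker, which is what makes the example embarrassingly parallel.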

Documentation: class Comm

Documentation: Send() method

Documentation: Recv() method
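The documentation slides are screenshots; as a reference sketch, the mpiJava-style signatures on class `mpi.Comm` that MPJ Express implements look like this:

```java
// Blocking send: transmits `count` elements of array `buf`, starting at
// `offset`, to process `dest`, labelled with `tag`.
public void Send(Object buf, int offset, int count,
                 Datatype datatype, int dest, int tag) throws MPIException;

// Blocking receive: fills `buf` and returns a Status object describing the
// matched message (its source, tag, and received element count).
public Status Recv(Object buf, int offset, int count,
                   Datatype datatype, int source, int tag) throws MPIException;
```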

mpjrun.sh -np 5 ToyExample

MPJ Express (0.38) is started in the multicore configuration

Point-to-point Communication
Non-blocking methods return a Request object:
- Wait()  // waits until the communication completes
- Test()  // tests if the communication has finished

                Standard   Synchronous   Ready      Buffered
Blocking        Send()     Ssend()       Rsend()    Bsend()
                Recv()
Non-blocking    Isend()    Issend()      Irsend()   Ibsend()
                Irecv()
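The non-blocking methods in the table can be sketched as follows (a hypothetical two-process example; the buffer contents and tag are invented): rank 0 posts an Isend, rank 1 posts an Irecv, both are free to compute, and each calls Wait() on the returned Request before touching the buffer again.

```java
import mpi.MPI;
import mpi.Request;

public class NonBlockingDemo {
    public static void main(String[] args) throws Exception {
        MPI.Init(args);
        int rank = MPI.COMM_WORLD.Rank();
        int[] buf = new int[4];

        if (rank == 0) {
            for (int i = 0; i < buf.length; i++) buf[i] = i;
            Request req = MPI.COMM_WORLD.Isend(buf, 0, buf.length, MPI.INT, 1, 99);
            // ... computation that overlaps the communication goes here ...
            req.Wait();   // only after this is it safe to reuse buf
        } else if (rank == 1) {
            Request req = MPI.COMM_WORLD.Irecv(buf, 0, buf.length, MPI.INT, 0, 99);
            // ... computation that does not read buf ...
            req.Wait();   // buf is guaranteed filled once Wait() returns
        }
        MPI.Finalize();
    }
}
```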

(figure: timelines comparing the two modes; with blocking Send()/Recv() the sender's and receiver's CPUs wait for the transfer, while with non-blocking Isend()/Irecv() each CPU does computation and only waits inside the later Wait() call)

Thread-safe Communication
- Thread-safe MPI libraries allow communication from multiple user threads inside a single process
- Such an implementation requires fine-grain locking:
  - Incorrect implementations can deadlock

Levels of thread-safety in MPI libraries:
- MPI_THREAD_SINGLE: only one thread will execute
- MPI_THREAD_FUNNELED: the process may be multi-threaded, but only the main thread will make MPI calls
- MPI_THREAD_SERIALIZED: the process may be multi-threaded, but only one thread will make MPI calls at a time
- MPI_THREAD_MULTIPLE: multiple threads may call MPI with no restrictions
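As a plain-Java illustration of the concurrency the MPI_THREAD_MULTIPLE level demands (this uses only standard Java, no MPJ Express types; the class and method names are invented), several "user threads" post messages through one shared, internally-locked structure, the way multiple threads would issue sends through one thread-safe library:

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class ThreadSafeDemo {
    // Each of nThreads user threads posts perThread messages into one shared
    // queue; LinkedBlockingQueue supplies the fine-grain locking that a
    // thread-safe MPI library must provide internally. Returns the number of
    // messages that arrived.
    static int deliverAll(int nThreads, int perThread) throws InterruptedException {
        BlockingQueue<String> sendQueue = new LinkedBlockingQueue<>();
        Thread[] users = new Thread[nThreads];
        for (int t = 0; t < nThreads; t++) {
            final int id = t;
            users[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++)
                    sendQueue.add("msg from thread " + id);
            });
            users[t].start();
        }
        for (Thread u : users) u.join();
        return sendQueue.size();
    }

    public static void main(String[] args) throws InterruptedException {
        // All messages arrive exactly once: no loss, no duplication.
        System.out.println(deliverAll(4, 100));   // prints 400
    }
}
```

With an unsynchronized collection in place of the BlockingQueue, messages could be lost or the structure corrupted, which is exactly the class of bug the fine-grain locking in a thread-safe MPI library prevents.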

Implementation of point-to-point communication
- The various blocking and non-blocking communication primitives are implemented using two protocols: Eager and Rendezvous
  - Typically, Eager is used for small messages, which are pushed to the receiver immediately, and Rendezvous for large messages, where the sender first waits until the receiver is ready
- (figure: Send dispatching to either the Eager or the Rendezvous protocol)

Performance Evaluation of Point-to-Point Communication
- Normally, ping-pong benchmarks are used to measure:
  - Latency: how long does it take to send N bytes from sender to receiver?
  - Throughput: how much bandwidth is achieved?
- Latency is a useful measure for studying the performance of "small" messages
- Throughput is a useful measure for studying the performance of "large" messages
(figure: Node A and Node B bouncing a message back and forth; the round-trip time, RTT, is measured)
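A ping-pong benchmark of the kind described can be sketched as follows (a hypothetical program, not the one used for the slides' measurements; the message size and repetition count are arbitrary): rank 0 sends to rank 1, which echoes the message back, and half the averaged round-trip time approximates the one-way latency.

```java
import mpi.MPI;

public class PingPong {
    public static void main(String[] args) throws Exception {
        MPI.Init(args);
        int rank = MPI.COMM_WORLD.Rank();
        final int REPS = 1000;
        byte[] msg = new byte[1024];        // message size under test

        long start = System.nanoTime();
        for (int i = 0; i < REPS; i++) {
            if (rank == 0) {                // ping ...
                MPI.COMM_WORLD.Send(msg, 0, msg.length, MPI.BYTE, 1, 0);
                MPI.COMM_WORLD.Recv(msg, 0, msg.length, MPI.BYTE, 1, 0);
            } else if (rank == 1) {         // ... pong
                MPI.COMM_WORLD.Recv(msg, 0, msg.length, MPI.BYTE, 0, 0);
                MPI.COMM_WORLD.Send(msg, 0, msg.length, MPI.BYTE, 0, 0);
            }
        }
        if (rank == 0) {
            double rttNs = (System.nanoTime() - start) / (double) REPS;
            System.out.printf("size=%d bytes, one-way latency ~ %.1f us%n",
                              msg.length, rttNs / 2 / 1000.0);
        }
        MPI.Finalize();
    }
}
```

Sweeping the message size from a few bytes to several megabytes yields the latency curve for small messages and the throughput curve for large ones.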

Latency Comparison

Throughput Comparison

Latency Comparison on a multicore machine

Throughput Comparison on a multicore machine

Collective communications
- Provided as a convenience for application developers:
  - Save significant development time
  - Efficient algorithms may be used
  - Stable (tested)
- Built on top of point-to-point communications

(image from the MPI standard document illustrating the collective operations)

Reduce collective operations
Predefined reduction operations:
MPI.PROD, MPI.SUM, MPI.MIN, MPI.MAX, MPI.LAND, MPI.BAND,
MPI.LOR, MPI.BOR, MPI.LXOR, MPI.BXOR, MPI.MINLOC, MPI.MAXLOC
(figure: contributions from all processes combined into a single result at the root)
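Using one of the operations above, an MPI.SUM reduction can be sketched like this (a hypothetical example; each process simply contributes its own rank): the root ends up with the sum of all ranks, i.e. size*(size-1)/2.

```java
import mpi.MPI;

public class ReduceDemo {
    public static void main(String[] args) throws Exception {
        MPI.Init(args);
        int rank = MPI.COMM_WORLD.Rank();

        int[] send = { rank };       // each process contributes its rank
        int[] recv = new int[1];
        // Reduce(sendbuf, sendoffset, recvbuf, recvoffset, count, type, op, root)
        MPI.COMM_WORLD.Reduce(send, 0, recv, 0, 1, MPI.INT, MPI.SUM, 0);

        if (rank == 0)               // only the root holds the combined result
            System.out.println("sum of ranks = " + recv[0]);
        MPI.Finalize();
    }
}
```

Swapping MPI.SUM for MPI.MAX, MPI.PROD, etc. changes only the combining operation, not the communication pattern.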

Documentation: Send() method

Toy Example with Collectives
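The collective version of the toy example is shown as a listing in the slides; the transcript does not include it, so the following is a guessed sketch (task values and the squaring "work" are invented): Scatter replaces the master's loop of Sends, and Gather replaces its loop of Recvs.

```java
import mpi.MPI;

public class ToyCollectives {
    public static void main(String[] args) throws Exception {
        MPI.Init(args);
        int rank = MPI.COMM_WORLD.Rank();
        int size = MPI.COMM_WORLD.Size();

        int[] tasks = new int[size];
        if (rank == 0)                       // root prepares one task per process
            for (int i = 0; i < size; i++) tasks[i] = i * 10;

        int[] mine = new int[1];
        // Scatter: one element of tasks goes to each process (root included)
        MPI.COMM_WORLD.Scatter(tasks, 0, 1, MPI.INT, mine, 0, 1, MPI.INT, 0);

        mine[0] = mine[0] * mine[0];         // the per-process "work"

        int[] results = new int[size];
        // Gather: each process's result is collected back at the root
        MPI.COMM_WORLD.Gather(mine, 0, 1, MPI.INT, results, 0, 1, MPI.INT, 0);

        if (rank == 0)
            for (int i = 0; i < size; i++)
                System.out.println("result[" + i + "] = " + results[i]);
        MPI.Finalize();
    }
}
```

Besides being shorter, this version lets the library choose an efficient distribution algorithm, which is the convenience argument made on the collectives slide.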