04 June Thoughts on a Java Reference Implementation for MPJ Mark Baker *, Bryan Carpenter  * University of Portsmouth  Florida.

Presentation transcript:

Thoughts on a Java Reference Implementation for MPJ. Mark Baker (University of Portsmouth), Bryan Carpenter (Florida State University). IPDPS, Cancun, Mexico, 5th May.

Contents: Introduction; Some design decisions; An overview of the architecture; Process creation and monitoring; The MPJ daemon; Handling aborts and failures; MPJ device; Conclusions and future work.

Introduction: The Message-Passing Working Group of the Java Grande Forum was formed in late 1998 in response to the appearance of several prototype Java bindings for MPI-like libraries. An initial draft of a common API specification was distributed at Supercomputing '98. Since then the working group has met in San Francisco and Syracuse. The present API is now called MPJ.

Introduction: There is no complete implementation of the draft specification. mpiJava is moving towards the “standard”: the new version (1.2) of the software supports direct communication of objects via object serialization, and version 1.3 of mpiJava will implement the new API. The mpiJava wrappers rely on the availability of a platform-dependent native MPI implementation for the target computer.

Introduction: While this is a reasonable basis in many cases, the approach has some disadvantages. The two-stage installation procedure (get and build a native MPI, then install and match the Java wrappers) is tedious and off-putting to new users. On several occasions we saw conflicts between the JVM environment and the native MPI runtime behaviour. The situation has improved, and mpiJava now runs on various combinations of JVM and MPI implementation. Even so, the strategy conflicts with the ethos of Java, where write-once-run-anywhere software is the order of the day.

MPJ, the Next Generation of Message Passing in Java: An MPJ reference implementation could be built as (1) Java wrappers to a native MPI implementation; (2) pure Java; or (3) principally Java, with a few simple native methods to optimize operations (like marshalling arrays of primitive elements) that are difficult to do efficiently in Java. We are aiming at pure Java, to provide an implementation of MPJ that is maximally portable and that, we hope, requires the minimum amount of support effort.

Benefits of a pure Java implementation of MPJ: Highly portable; assumes only a Java development environment. Performance: moderate. May need JNI inserts for marshalling arrays. Network speed limited by Java sockets. Good for education/evaluation. Vendors provide wrappers to native MPI for ultimate performance?
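To make the marshalling concern concrete, the following is a minimal pure-Java sketch of packing an `int` array into the byte vector that a socket would carry, and unpacking it again. It uses `java.nio.ByteBuffer`, which the slides do not mention; the class and method names (`ArrayMarshal`, `pack`, `unpack`) are hypothetical and are not part of MPJ or mpiJava.

```java
import java.nio.ByteBuffer;

/** Hypothetical helper: marshals int arrays to and from byte vectors. */
final class ArrayMarshal {
    /** Packs the values into a new byte array, 4 bytes per int, big-endian. */
    static byte[] pack(int[] values) {
        ByteBuffer buf = ByteBuffer.allocate(values.length * Integer.BYTES);
        buf.asIntBuffer().put(values);   // bulk copy through an int view of the buffer
        return buf.array();
    }

    /** Reverses pack(): reads bytes.length / 4 ints from the byte array. */
    static int[] unpack(byte[] bytes) {
        int[] values = new int[bytes.length / Integer.BYTES];
        ByteBuffer.wrap(bytes).asIntBuffer().get(values);
        return values;
    }
}
```

A sender would call `ArrayMarshal.pack(data)` before writing to the socket, and the receiver would call `ArrayMarshal.unpack(bytes)` on what it read. Whether this is fast enough to avoid the JNI inserts the slide anticipates would have to be measured.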

Design Criteria for the MPJ Environment: Need an infrastructure to support groups of distributed processes: resource discovery, communications, failure handling, and spawning processes on hosts.

Resource discovery: Technically, Jini discovery and lookup seem an obvious choice. Daemons register with lookup services. A “hosts file” may still guide the search for hosts, if preferred.

Communication base: Java bindings to VIA may become an option some day. For now, sockets are the only portable option. RMI is surely too slow.

Handling “Partial Failures”: An MPJ job can terminate unexpectedly when a network connection breaks, the host system goes down, the JVM running a remote MPJ task halts for some other reason (e.g., occurrence of a Java exception), or the program that initiated the MPJ job is killed. When any particular MPJ job terminates unexpectedly, concurrent tasks associated with other MPJ jobs should be unaffected, even if they were initiated by the same daemon, and all processes associated with the failed job must shut down cleanly within some (preferably short) interval of time.

Handling “Partial Failures”: A usable MPJ implementation must deal with unexpected process termination or network failure without leaving orphan processes or leaking other resources. We could reinvent protocols to deal with these situations, but Jini provides a ready-made framework (or, at least, a set of concepts).

Handling failures with Jini: If any slave dies, the client generates a Jini distributed event, MPIAbort; all slaves are notified and all processes are killed. In case of other failures (network failure, death of client, death of controlling daemon, …), the client's leases on the slaves expire after a fixed time, and the processes are killed.
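The lease idea can be illustrated without any Jini classes. The sketch below is a plain-Java stand-in, not the Jini leasing API: a daemon-side table records when each slave's lease runs out, the client keeps renewing while it is alive, and a periodic sweep reports the slaves whose leases have lapsed so that their processes can be killed. The names `LeaseTable`, `grantOrRenew` and `reapExpired` are hypothetical, and the clock is passed in explicitly to keep the example deterministic.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

/** Hypothetical daemon-side lease table (a stand-in for Jini leasing). */
final class LeaseTable {
    private final Map<String, Long> expiry = new HashMap<>(); // slave id -> expiry time (ms)
    private final long durationMillis;

    LeaseTable(long durationMillis) {
        this.durationMillis = durationMillis;
    }

    /** Called when a lease is first granted and on every renewal from the client. */
    synchronized void grantOrRenew(String slaveId, long nowMillis) {
        expiry.put(slaveId, nowMillis + durationMillis);
    }

    /** Removes and returns the slaves whose leases have run out by nowMillis. */
    synchronized List<String> reapExpired(long nowMillis) {
        List<String> dead = new ArrayList<>();
        Iterator<Map.Entry<String, Long>> it = expiry.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Long> entry = it.next();
            if (entry.getValue() <= nowMillis) {
                dead.add(entry.getKey());   // the caller would kill this slave's process
                it.remove();
            }
        }
        return dead;
    }
}
```

If the client or the network fails, renewals stop arriving and the next sweep returns every slave of that job, which bounds how long orphan processes can survive to roughly one lease duration.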

Integration of Jini and MPI: Provides a natural Java framework for parallel computing, combining the powerful fault tolerance and dynamic characteristics of Jini with the proven parallel computing functionality and performance of MPI.

MPJ - Implementation: In the initial reference implementation we will use Jini technology to facilitate location of remote MPJ daemons and to provide a framework for the required fault-tolerance. This choice rests on our guess that in the medium-to-long-term Jini will be a ubiquitous component in Java installations. Hence using the Jini paradigms from the start should eventually help inter-working and compatibility between our software and other systems.

Acquiring compute slaves through Jini (diagram not reproduced in the transcript).

MPJ: We envisage that a user will download a jar-file of MPJ library classes onto machines that may host parallel jobs, and install a daemon on those machines, technically by registering an activatable object with an rmid daemon. Parallel Java codes are compiled on one host. An mpjrun program invoked on that host transparently loads the user's class files into JVMs created on remote hosts by the MPJ daemons, and the parallel job starts.

MPJ - Implementation: In the short-to-medium-term, before Jini software is widely installed, we might have to provide a “lite” version of MPJ that is unbundled from Jini. Designing for Jini protocols should, nevertheless, have a beneficial influence on overall robustness and maintainability. Use of Jini implies use of RMI for various management functions.

Diagram (labels only): mpjrun myproggy -np 4; MPJ daemon; rmid; HTTP server; Slave 1, Slave 2, Slave 3, Slave 4; Host.

MPJ - Implementation: Some assumptions that have a bearing on the organization of the MPJ daemon: stdout (and stderr) streams from all tasks in an MPJ job are merged non-deterministically and copied to the stdout of the process that initiates the job. No guarantees are made about other IO operations; these are system dependent. Rudimentary support for global checkpointing and restarting of interrupted jobs may be quite useful, although we do not assume that checkpointing would happen without explicit invocation in the user-level code, or that restarting would happen automatically.
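The stdout assumption can be sketched as one copier thread per task stream, all writing whole lines to a shared sink. This is only an illustration of non-deterministic merging under that assumption, not the daemon's actual design; `OutputMerger` and `merge` are hypothetical names, and in a real daemon the input streams would presumably come from the spawned task processes (for example via `Process.getInputStream()`).

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.PrintStream;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: merges the output of several tasks into one stream. */
final class OutputMerger {
    /** Copies every task stream to sink, line by line, and returns when all reach EOF. */
    static void merge(List<? extends InputStream> taskOutputs, PrintStream sink) {
        List<Thread> copiers = new ArrayList<>();
        for (InputStream in : taskOutputs) {
            Thread copier = new Thread(() -> copyLines(in, sink));
            copier.start();
            copiers.add(copier);
        }
        for (Thread copier : copiers) {
            try {
                copier.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt and stop waiting
                return;
            }
        }
    }

    private static void copyLines(InputStream in, PrintStream sink) {
        try (BufferedReader reader = new BufferedReader(new InputStreamReader(in))) {
            String line;
            while ((line = reader.readLine()) != null) {
                synchronized (sink) {   // lines interleave across tasks but are never split
                    sink.println(line);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Lines from a single task keep their relative order, while the interleaving between tasks depends on thread scheduling, which matches the "merged non-deterministically" wording.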

MPJ - Implementation: The role of the MPJ daemons and their associated infrastructure is to provide an environment consisting of a group of processes with the user code loaded and running in a reliable way. The process group is reliable in the sense that no partial failures should be visible to higher levels of the MPJ implementation or the user code. We will use Jini leasing to provide fault tolerance, although clearly no software technology can guarantee the absence of total failures, where the whole MPJ job dies at essentially the same time.

MPJ - Implementation: Once a reliable cocoon of user processes has been created through negotiation with the daemons, we have to establish connectivity. In the reference implementation this will be based on Java sockets. Recently there has been interest in producing Java bindings to VIA; eventually this may provide a better platform on which to implement MPI, but for now sockets are the only realistic, portable option.

MPJ - Implementation: Between the socket API and the MPJ API there will be an intermediate “MPJ device” level, modelled on the Abstract Device Interface (ADI) of MPICH. Although the role is slightly different here (we do not really anticipate a need for multiple platform-specific implementations), this still seems like a good layer of abstraction to have in our design. The API is actually not modelled in detail on the MPICH device, but the level of operations is similar (based on isend/irecv/waitany calls).
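To give a feel for the level of operations meant here, the following is a hypothetical sketch of a device offering isend, irecv and waitany over byte vectors, matched by context and tag. It is not the MPJ device API, which the slides do not specify: the signatures are invented for illustration, there is no notion of source or destination rank, and messages are delivered inside one JVM rather than over sockets.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;

/** Hypothetical handle for a non-blocking operation. */
final class Request {
    private byte[] data;
    private boolean done;

    synchronized void complete(byte[] payload) { data = payload; done = true; }
    synchronized boolean isDone() { return done; }
    synchronized byte[] data() { return data; }
}

/** Hypothetical in-memory device: delivers messages within a single JVM. */
final class LoopbackDevice {
    private final Map<Long, Queue<byte[]>> arrived = new HashMap<>();  // messages with no receive yet
    private final Map<Long, Queue<Request>> posted = new HashMap<>();  // receives with no message yet

    private static long key(int context, int tag) {
        return ((long) context << 32) | (tag & 0xffffffffL);
    }

    /** Starts a send; it is buffered, so the returned request is already complete. */
    synchronized Request isend(byte[] buf, int context, int tag) {
        byte[] copy = buf.clone();
        Queue<Request> waiting = posted.get(key(context, tag));
        if (waiting != null && !waiting.isEmpty()) {
            waiting.remove().complete(copy);            // match the oldest posted receive
        } else {
            arrived.computeIfAbsent(key(context, tag), k -> new ArrayDeque<>()).add(copy);
        }
        Request sent = new Request();
        sent.complete(copy);
        notifyAll();                                    // wake threads blocked in waitany
        return sent;
    }

    /** Posts a receive; it completes when a message with the same context and tag arrives. */
    synchronized Request irecv(int context, int tag) {
        Request recv = new Request();
        Queue<byte[]> queue = arrived.get(key(context, tag));
        if (queue != null && !queue.isEmpty()) {
            recv.complete(queue.remove());
        } else {
            posted.computeIfAbsent(key(context, tag), k -> new ArrayDeque<>()).add(recv);
        }
        return recv;
    }

    /** Blocks until one request is complete and returns its index, or -1 if interrupted. */
    synchronized int waitany(Request[] requests) {
        while (true) {
            for (int i = 0; i < requests.length; i++) {
                if (requests[i].isDone()) return i;
            }
            try {
                wait();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return -1;
            }
        }
    }
}
```

A caller would post receives with `irecv`, send with `isend`, and pass the outstanding requests to `waitany` to learn which completed first. A socket-based device would replace the in-memory queues with per-connection input handler threads while keeping the same three operations.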

Layers of an MPJ Reference Implementation:
High Level MPI: collective operations; process topologies.
Base Level MPI: all pt-to-pt modes; groups; communicators; datatypes.
MPJ Device Level: isend, irecv, waitany, …; physical PIDs; contexts and tags; byte vector data.
Java Socket and Thread API: all-to-all TCP connect; input handler threads; synchronised methods, wait, notify, …
Process Creation and Monitoring: MPJ daemon; lookup, leasing (Jini); exec java MPJLoader; serializable objects.

MPJ - Conclusions: This is an ongoing effort (NSF proposal plus volunteer help). A collaboration with other Java message-passing system developers is defining the exact MPJ interface. Work at the moment centres on developing the low-level MPJ device and exploring the functionality of Jini.