Evaluating the Performance of MPI Java in FutureGrid
Nigel Pugh², Tori Wilbon², Saliya Ekanayake¹
¹Indiana University, ²Elizabeth City State University

Presentation transcript:

Abstract
The Message Passing Interface (MPI) has been popular for developing tightly coupled parallel applications in the high performance computing (HPC) domain. The majority of such applications are based on C, C++, or Fortran. Recent advances in big data, however, have brought attention to Java. Effort has also been put into Java's support for HPC with MPI flavors such as OpenMPI Java and FastMPJ. We evaluate these against native C-based MPI on a set of standard micro-benchmarks from Ohio State University. The results show a promising future for Java and MPI in HPC applications.

Introduction
This study serves as a proof of concept that Java-based MPI communication can perform close to native implementations. We evaluate two MPI Java versions, OpenMPI [1] Java and FastMPJ [2], against native OpenMPI. FastMPJ is a pure Java implementation, whereas OpenMPI Java is a Java binding for the native OpenMPI. Our evaluations are based on benchmarking MPI kernel operations following the standard Ohio State University (OSU) Micro-Benchmark suite [3]. A detailed study extending this work is available at [4].

Evaluation
MPI collective primitives are evaluated for the patterns 1x1x8, 1x2x8, 1x4x8, and 1x8x8, where the digits denote threads per process, processes per node, and nodes, respectively. Send and receive operations are timed for pattern 1x1x2 with different pairs of nodes. Averaged values of these tests are presented in the graphs. Message sizes range from 0 bytes to 1 megabyte. We use FutureGrid [5] as the HPC environment: 128 nodes, each with 2 Intel Xeon X5550 CPUs at 2.66 GHz (4 cores each, 8 cores per node), 24 GB of node memory, and 20 Gbps InfiniBand. A minimal sketch of a timed all-reduce kernel in the style of these benchmarks appears at the end of this transcript.

Performance Results
(Figures: MPI Allreduce with floating point sum, MPI Allgather, MPI Broadcast, MPI Send and Receive. Graphs are not reproduced in this transcript.)

Acknowledgements
We would like to express our gratitude to Dr. Geoffrey Fox for giving us the opportunity to work in his lab this summer. We would also like to thank the School of Informatics at Indiana University Bloomington and the IU-SROC Director, Dr. Lamara Warren. We are equally thankful to Dr. Guillermo López Taboada for providing us with the commercially available FastMPJ. This material is based upon work supported in part by the National Science Foundation under Grant No.

References
[1] Gabriel, E., Fagg, G., Bosilca, G., Angskun, T., Dongarra, J., Squyres, J., Sahay, V., Kambadur, P., Barrett, B., Lumsdaine, A., Castain, R., Daniel, D., Graham, R., and Woodall, T. Open MPI: Goals, Concept, and Design of a Next Generation MPI Implementation. Springer Berlin Heidelberg.
[2] Expósito, R., Ramos, S., Taboada, G., Touriño, J., and Doallo, R. FastMPJ: a scalable and efficient Java message-passing library. Cluster Computing (2014).
[3] Network-Based Computing Laboratory (NBCL), The Ohio State University. OMB (OSU Micro-Benchmarks).
[4] Ekanayake, S. Evaluation of Java Message Passing in High Performance Data Analytics.
[5] von Laszewski, G., Fox, G. C., Wang, F., Younge, A. J., Kulshrestha, Pike, G. G., Smith, W., Voeckler, J., Figueiredo, R. J., Fortes, J., Keahey, K., and Deelman, E. Design of the FutureGrid Experiment Management Framework. Proceedings of Gateway Computing Environments 2010 (GCE2010) at SC10, IEEE, 2010.

Primary Contact
Saliya Ekanayake, Indiana University; Dr. Geoffrey Fox, Indiana University.
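
The OSU-style kernels evaluated above share a common structure: warm-up rounds, a barrier, then a timed loop over one MPI operation, averaged per iteration across a sweep of message sizes. The sketch below illustrates that pattern for the all-reduce test with a floating point sum. It is a minimal illustration, not the benchmark code itself, and it assumes the OpenMPI Java bindings (the mpi.jar shipped with Open MPI); FastMPJ exposes the same operations through its mpiJava 1.2-style API with capitalized method names such as Allreduce and Rank. The class name, iteration counts, and size sweep are illustrative choices, not taken from the poster.

// AllreduceBench.java: sketch of an OSU-style all-reduce latency test
// using the Open MPI Java bindings (package mpi, mpi.jar).
import mpi.*;

public class AllreduceBench {
    public static void main(String[] args) throws MPIException {
        MPI.Init(args);
        int rank = MPI.COMM_WORLD.getRank();

        int warmup = 100;
        int iterations = 1000;

        // Sweep message sizes from 8 bytes (one double) up to 1 MB, doubling each step.
        for (int count = 1; count * 8 <= (1 << 20); count *= 2) {
            double[] sendBuf = new double[count];
            double[] recvBuf = new double[count];

            // Warm-up rounds are excluded from the timing.
            for (int i = 0; i < warmup; i++) {
                MPI.COMM_WORLD.allReduce(sendBuf, recvBuf, count, MPI.DOUBLE, MPI.SUM);
            }

            // Synchronize all ranks, then time the collective and average per call.
            MPI.COMM_WORLD.barrier();
            long start = System.nanoTime();
            for (int i = 0; i < iterations; i++) {
                MPI.COMM_WORLD.allReduce(sendBuf, recvBuf, count, MPI.DOUBLE, MPI.SUM);
            }
            double avgMicros = (System.nanoTime() - start) / (iterations * 1000.0);

            if (rank == 0) {
                System.out.printf("%8d bytes  %10.2f us%n", count * 8, avgMicros);
            }
        }
        MPI.Finalize();
    }
}

With Open MPI such a class is typically compiled against mpi.jar (for example via the mpijavac wrapper) and launched through mpirun, e.g. mpirun -np 8 java AllreduceBench for a single-node eight-process run corresponding to the 1x8 process layout used in the collective tests.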