A Common Application Platform (CAP) for SURAgrid - Mahantesh Halappanavar, John-Paul Robinson, Enis Afgan, Mary Fran Yafchak, and Purushotham Bangalore.

Presentation transcript:

1 A Common Application Platform (CAP) for SURAgrid. Mahantesh Halappanavar, John-Paul Robinson, Enis Afgan, Mary Fran Yafchak, and Purushotham Bangalore. SURAgrid All-Hands Meeting, 27 September 2007, Washington, D.C.

2 Introduction
Problem Statement: "How to quickly grid-enable scientific applications to exploit SURAgrid resources?"
Goals for grid-enabling applications:
- Dynamic resource discovery
- Collective resource utilization
- Simple job management and accounting
- Minimal programming effort

3 Identifying Patterns
- Algorithm Structures
- Support Structures
- Relationships

4 Basic Process
Problems → Algorithms → Programs → Implementation
Algorithm Structures take problems to algorithms; Supporting Structures take algorithms to programs; Programming Environments take programs to implementation.

5 Algorithm Structures
How to organize? (Linear and Recursive)
1. Organize by Tasks: Task Parallelism; Divide and Conquer
2. Organize by Data Decomposition: Geometric Decomposition (see the sketch below); Recursive Data
3. Organize by Flow of Data: Pipeline; Event-Based Coordination
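To make the data-decomposition branch concrete, here is a minimal sketch in C (an illustration, not from the slides): a 1-D domain of n cells is split into contiguous blocks, one per task, using the standard block-partition formula. All names are illustrative.

```c
#include <stdio.h>

/* Geometric decomposition: split n cells into nparts contiguous blocks.
 * Returns the half-open range [lo, hi) owned by block `part`; any
 * remainder cells are spread one per block over the first blocks. */
static void block_range(int n, int nparts, int part, int *lo, int *hi) {
    int base = n / nparts, rem = n % nparts;
    *lo = part * base + (part < rem ? part : rem);
    *hi = *lo + base + (part < rem ? 1 : 0);
}

int main(void) {
    int n = 10, nparts = 4;
    for (int p = 0; p < nparts; p++) {
        int lo, hi;
        block_range(n, nparts, p, &lo, &hi);
        printf("task %d owns cells [%d, %d)\n", p, lo, hi);
    }
    return 0;
}
```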

6 Support Structures
Program Structures:
1. SPMD
2. Master/Worker
3. Loop Parallelism (sketch below)
4. Fork/Join
Data Structures:
1. Shared Data
2. Shared Queue
3. Distributed Array
(Problems → Algorithms → Programs → Implementation)
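As a concrete instance of the Loop Parallelism program structure, a minimal OpenMP sketch in C (mine, not from the talk): a data-parallel loop over a shared array whose iterations are divided among a team of threads.

```c
#include <omp.h>
#include <stdio.h>

#define N 1000000

int main(void) {
    static double a[N], b[N];
    for (int i = 0; i < N; i++) b[i] = (double)i;

    /* Loop Parallelism: the runtime forks a team of threads and
     * splits the iterations of this loop among them. */
    #pragma omp parallel for
    for (int i = 0; i < N; i++)
        a[i] = 2.0 * b[i];

    printf("a[N-1] = %f\n", a[N - 1]);
    return 0;
}
```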

7 Relationships: AS and SS
Relationship between Supporting Structures (SS) patterns and Algorithm Structure (AS) patterns (more stars = stronger affinity):

SS \ AS          | Task Parallelism | Divide & Conquer | Geometric Decomposition | Recursive Data | Pipeline | Event-Based Coordination
SPMD             | ****             | ***              | ****                    | **             | ***      | **
Loop Parallelism | ****             | **               | ***                     |                |          |
Master/Worker    | ****             | **               | *                       | *              | *        | *
Fork/Join        | **               | ****             | **                      |                | ****     |

8 Relationships: SS and PE
Relationship between Supporting Structures (SS) patterns and Programming Environments (PE):

SS \ PE          | OpenMP | MPI  | Java
SPMD             | ***    | **** | **
Loop Parallelism | ****   | *    | ***
Master-Worker    | **     | **   | *
Fork-Join        | **     | *    | ****

9 Important Observation
"This (SPMD) pattern is by far the most commonly used pattern for structuring parallel programs. It is particularly relevant for MPI programmers and problems using the Task Parallelism and Geometric Decomposition patterns. It has also proved effective for problems using the Divide and Conquer and Recursive Data patterns."
Single Program, Multiple Data (SPMD): this is the most common way to organize a parallel program, especially on MIMD computers. The idea is that a single program is written and loaded onto each node of a parallel computer. Each copy of the single program runs independently (aside from coordination events), so the instruction streams executed on each node can be completely different. The specific path through the code is in part selected by the node ID.
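A minimal SPMD sketch in C with MPI (illustrative, not from the slides): one executable runs on every node, and each copy selects its path through the code from its rank (node ID).

```c
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv) {
    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);  /* node ID */
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    /* SPMD: same program everywhere, but the path through the
     * code is selected by the rank. */
    if (rank == 0) {
        printf("rank 0 of %d: coordinating\n", size);
    } else {
        printf("rank %d of %d: computing my share\n", rank, size);
    }

    MPI_Finalize();
    return 0;
}
```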

10 The Computing Continuum
OpenMP → MPI → Java

11 Scientific Applications
What are they? How are they built?

12 Scientific Applications
- Life Sciences: Biology, Chemistry, ...
- Engineering: Aerospace, Civil, Mechanical, Environmental
- Physics: QCD, Black Holes, ...
- ...
The SCaLeS Reports:

13 Building Blocks
- Parallel programming and communication: OpenMP, MPI, BLACS, ...
- Partitioning: Metis/ParMetis, Zoltan, Chaco, ...
- Dense linear algebra: BLAS and LAPACK (example below), ...
- Parallel solvers: ScaLAPACK, MUMPS, SuperLU, ...
- Solver toolkits: PETSc, Aztec, ...
"A core requirement of many engineering and scientific applications is the need to solve linear and non-linear systems of equations, eigensystems and other related problems." - The Trilinos Project
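As a small illustration of one of these building blocks in use (my sketch, assuming the LAPACKE C interface to LAPACK is installed), solving a dense linear system Ax = b with LAPACK's dgesv:

```c
#include <stdio.h>
#include <lapacke.h>

int main(void) {
    /* Solve A x = b for a 3x3 system, row-major storage. */
    double A[9] = { 4, 1, 0,
                    1, 3, 1,
                    0, 1, 2 };
    double b[3] = { 5, 6, 4 };   /* overwritten with the solution x */
    lapack_int ipiv[3];

    lapack_int info = LAPACKE_dgesv(LAPACK_ROW_MAJOR, 3, 1, A, 3, ipiv, b, 1);
    if (info != 0) {
        fprintf(stderr, "dgesv failed: info = %d\n", (int)info);
        return 1;
    }
    printf("x = (%f, %f, %f)\n", b[0], b[1], b[2]);
    return 0;
}
```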

14 ScaLAPACK
Software hierarchy:
- ScaLAPACK (global): linear systems, least squares, singular value decomposition, eigenvalues
- PBLAS (global): Parallel BLAS
- LAPACK (local): linear systems, least squares, singular value decomposition, eigenvalues
- BLAS (local): Level 1/2/3 operations
- BLACS (platform specific): communication routines targeting linear algebra operations
- MPI/PVM/... (platform specific): communication layer (message passing)
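A minimal sketch of the bottom of this stack (illustrative; assumes the C BLACS bindings distributed with ScaLAPACK, whose header availability varies by installation, hence the explicit extern declarations): setting up the 2-D process grid on which ScaLAPACK distributes matrices.

```c
#include <stdio.h>

/* C BLACS bindings shipped with ScaLAPACK; declared explicitly
 * because installations differ in which headers they provide. */
extern void Cblacs_pinfo(int *mypnum, int *nprocs);
extern void Cblacs_get(int context, int request, int *value);
extern void Cblacs_gridinit(int *context, const char *order, int nprow, int npcol);
extern void Cblacs_gridinfo(int context, int *nprow, int *npcol, int *myrow, int *mycol);
extern void Cblacs_gridexit(int context);
extern void Cblacs_exit(int status);

int main(void) {
    int me, nprocs, ctxt, nprow = 2, npcol = 2, myrow, mycol;

    Cblacs_pinfo(&me, &nprocs);                   /* who am I, how many */
    Cblacs_get(-1, 0, &ctxt);                     /* default system context */
    Cblacs_gridinit(&ctxt, "Row", nprow, npcol);  /* 2x2 process grid */
    Cblacs_gridinfo(ctxt, &nprow, &npcol, &myrow, &mycol);

    printf("process %d of %d at grid position (%d,%d)\n",
           me, nprocs, myrow, mycol);

    Cblacs_gridexit(ctxt);
    Cblacs_exit(0);
    return 0;
}
```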

15 PETSc: Portable, Extensible Toolkit for Scientific Computation
Layered architecture, from bottom to top:
- Computation and communication kernels: MPI, MPI-IO, BLAS, LAPACK
- Profiling interface
- Object-oriented components: matrices, vectors, index sets; grid management; visualization interface
- Solvers: linear solvers (preconditioners + Krylov methods); nonlinear solvers and unconstrained minimization; ODE integrators
- PDE application codes
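A minimal PETSc usage sketch in C (mine, not from the slides; follows the standard KSP API of recent PETSc releases): assemble a small tridiagonal system and solve it with a Krylov method that can be chosen at run time.

```c
#include <petscksp.h>

int main(int argc, char **argv) {
    Mat A; Vec x, b; KSP ksp;
    PetscInt i, n = 10;

    PetscInitialize(&argc, &argv, NULL, NULL);

    /* Assemble a tridiagonal matrix A (1-D Laplacian). */
    MatCreate(PETSC_COMM_WORLD, &A);
    MatSetSizes(A, PETSC_DECIDE, PETSC_DECIDE, n, n);
    MatSetFromOptions(A);
    MatSetUp(A);
    for (i = 0; i < n; i++) {
        MatSetValue(A, i, i, 2.0, INSERT_VALUES);
        if (i > 0)     MatSetValue(A, i, i - 1, -1.0, INSERT_VALUES);
        if (i < n - 1) MatSetValue(A, i, i + 1, -1.0, INSERT_VALUES);
    }
    MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY);
    MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY);

    /* Right-hand side b = 1; x receives the solution. */
    MatCreateVecs(A, &x, &b);
    VecSet(b, 1.0);

    /* Solve A x = b; solver and preconditioner are configurable at
     * run time, e.g. -ksp_type cg -pc_type jacobi. */
    KSPCreate(PETSC_COMM_WORLD, &ksp);
    KSPSetOperators(ksp, A, A);
    KSPSetFromOptions(ksp);
    KSPSolve(ksp, b, x);

    KSPDestroy(&ksp); MatDestroy(&A); VecDestroy(&x); VecDestroy(&b);
    PetscFinalize();
    return 0;
}
```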

16 Observation
"Explicit message passing will remain the dominant programming model for the foreseeable future because of the huge investment in application codes." - Jim Tomkins, Bob Ballance, and Sue Kelly, ASC PI Meeting, Nevada, February 2007

17 Common Application Platform
- Basic Architecture
- Building Blocks
- Conclusions

18 Basic Architecture
[Architecture diagram: users reach SURAgrid either through the SURAgrid Portal (browser) or via the command line (GSISSH, etc.); a MetaScheduler (MS) dispatches jobs to the participating sites (Site1, Site2, Site3, ...).]

19 Building Blocks
MPICH-G2: a grid-enabled implementation of MPI built on the Globus Toolkit, allowing a single MPI job to span multiple grid sites.

20 Conclusions
- Power shift! The onus is now on system administrators.
- Load balancing:
  - Minimize communication costs
  - Limits: ScaLAPACK (latency < 500 ms)
  - Dynamic redistribution of work
- Heterogeneous environment:
  - Issues with floating-point operations