Parallel Programming on Computational Grids. Outline Grids Application-level tools for grids Parallel programming on grids Case study: Ibis.

Presentation transcript:

Parallel Programming on Computational Grids

Outline Grids Application-level tools for grids Parallel programming on grids Case study: Ibis

Grids Seamless integration of geographically distributed computers, databases, instruments –The name is an analogy with power grids Highly active research area –Open Grid Forum –Globus middleware –Many European projects, e.g.: Gridlab: Grid Application Toolkit and Testbed DEISA: Distributed European Infrastructure for Supercomputing Applications XtreemOS: Linux-based OS for grids –VL-e (Virtual laboratory for e-Science) project –….

Why Grids? New distributed applications that use data or instruments across multiple administrative domains and that need much CPU power –Computer-enhanced instruments –Collaborative engineering –Browsing of remote datasets –Use of remote software –Data-intensive computing –Very large-scale simulation –Large-scale parameter studies

Web, Grids and e-Science Web is about exchanging information Grid is about sharing resources –Computers, data bases, instruments e-Science supports experimental science by providing a virtual laboratory on top of Grids –Support for visualization, workflows, data management, security, authentication, high-performance computing

The big picture [Figure: three applications, each with its own "Management of comm. & computing" and a "Potential generic part", on top of a Virtual Laboratory layer (application-oriented services) and a Grids layer (harness distributed resources)]

The data explosion e-Science experiments generate much data, which is often distributed and needs much (parallel) processing –High-resolution imaging: ~ 1 GByte per measurement –Bio-informatics queries: 500 GByte per database –Satellite world imagery: ~ 5 TByte/year –Current particle physics: 1 PByte per year –LHC physics (2010?): PByte per year

Grid programming The goal of a grid is to make resource sharing very easy (transparent) Practice: grid programming is very difficult –Finding resources, running applications, dealing with heterogeneity and security, etc. Grid middleware (Globus) makes this somewhat easier, but is still low-level and changes frequently Need application-level tools

Application-level tools Paper (on Blackboard): –Blueprint for a New Computing Infrastructure (2nd Edition, editors I. Foster and C. Kesselman); Chapter 24: Application-Level Tools (Bal, Casanova, Dongarra, Matsuoka) Builds on grid software infrastructure Isolates users from dynamics of the grid hardware infrastructure Generic (broad classes of applications) Easy-to-use

Taxonomy of application-level tools Grid programming models –RPC –Task parallelism –Message passing –Java programming Grid application execution environments –Parameter sweeps –Workflow –Portals

Remote Procedure Call (RPC) GridRPC: specialize RPC-style (client/server) programming for grids –Allows coarse-grain task parallelism & remote access –Extended with resource discovery, scheduling, etc. Example: NetSolve –Solves numerical problems on a grid Current development: use web technology (WSDL, SOAP) for grid services Web and grid technology are merging

Task parallelism Many systems for task parallelism (master-worker, replicated workers) exist for the grid Examples –MW (master-worker) –Satin: divide&conquer (hierarchical master-worker)
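
The master-worker pattern named on this slide can be sketched on a single machine as follows. This is only an illustration under stated assumptions: a local thread pool stands in for the grid workers, and the class name `MasterWorkerSketch` and the toy `work` function are hypothetical, not taken from MW or Satin.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical illustration: a master hands independent tasks to a pool of workers.
class MasterWorkerSketch {

    // Stand-in for real work; here a task just squares its input.
    static long work(int task) {
        return (long) task * task;
    }

    // The master submits every task to the pool, then gathers and sums the results.
    static long runMaster(int numTasks, int numWorkers) {
        ExecutorService workers = Executors.newFixedThreadPool(numWorkers);
        try {
            List<Future<Long>> pending = new ArrayList<>();
            for (int t = 1; t <= numTasks; t++) {
                final int task = t;
                Callable<Long> job = () -> work(task);
                pending.add(workers.submit(job));
            }
            long sum = 0;
            for (Future<Long> result : pending) {
                sum += result.get(); // blocks until that task has finished
            }
            return sum;
        } catch (Exception e) {
            throw new RuntimeException("a worker failed", e);
        } finally {
            workers.shutdown();
        }
    }
}
```

Calling `MasterWorkerSketch.runMaster(10, 4)` distributes ten tasks over four workers. On a real grid the workers would be remote processes and the master would also have to cope with workers that join, leave, or crash.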

Message passing Several MPI implementations exist for the grid PACX MPI (Stuttgart): –Runs on heterogeneous systems MagPIe (Thilo Kielmann): –Optimizes MPI’s collective communication for hierarchical wide-area systems MPICH-G2: –Similar to PACX and MagPIe, implemented on Globus

Java programming Java uses bytecode and is very portable –"Write once, run anywhere" Can be used to handle heterogeneity Many systems now have Java interfaces: –Globus (Globus Java Commodity Grid) –MPI (MPIJava, MPJ, …) –Gridlab Application Toolkit (Java GAT) Ibis and ProActive are Java-centric grid programming systems

Parameter sweep applications Computations that are mostly independent –E.g. run same simulation many times with different parameters Can tolerate high network latencies, can easily be made fault-tolerant Many systems use this type of trivial parallelism to harness idle desktops –APST, Entropia, XtremWeb
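
A parameter sweep is easy to make fault-tolerant because a failed run can simply be repeated without affecting the others. The sketch below shows that idea in one JVM. The `simulate` function, its injected first-attempt failure, and the retry limit are all hypothetical and do not come from APST, Entropia, or XtremWeb.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.stream.IntStream;

// Hypothetical illustration of a fault-tolerant parameter sweep.
class ParameterSweepSketch {

    // Stand-in simulation. To mimic an unreliable resource, every third
    // parameter fails on its first attempt and succeeds on a retry.
    static double simulate(int param, int attempt) {
        if (param % 3 == 0 && attempt == 0) {
            throw new IllegalStateException("simulated resource failure");
        }
        return param * 0.5;
    }

    // Runs are independent, so rerunning a failed one is always safe.
    static double runWithRetry(int param, int maxAttempts) {
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            try {
                return simulate(param, attempt);
            } catch (IllegalStateException e) {
                // Ignore the failure and try again.
            }
        }
        throw new IllegalStateException("run " + param + " failed " + maxAttempts + " times");
    }

    // Executes the sweep in parallel; the order of completion does not matter.
    static Map<Integer, Double> sweep(int numParams) {
        Map<Integer, Double> results = new ConcurrentHashMap<>();
        IntStream.range(0, numParams).parallel()
                 .forEach(p -> results.put(p, runWithRetry(p, 3)));
        return results;
    }
}
```

Because no run depends on another, the same structure tolerates high latency: a slow or lost run delays only its own result.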

Workflow applications Link and compose diverse software tools and data formats –Connect modules and data-filters Results in coarse-grain, dataflow-like parallelism that can be run efficiently on a grid Several workflow management systems exist –E.g. Virtual Lab Amsterdam (predecessor of VL-e)

Portals Graphical interfaces to the grid Often application-specific Also portals for resource brokering, file transfers, etc.

Outline Grids Application-level tools for grids Parallel programming on grids Case study: Ibis

Distributed supercomputing Parallel processing on geographically distributed computing systems (grids) Examples: ( ), RSA-155, Entropia, Cactus Mostly limited to trivially parallel applications Questions: –Can we generalize this to more HPC applications? –What high-level programming support is needed?

Grids usually are hierarchical –Collections of clusters, supercomputers –Fast local links, slow wide-area links Can optimize algorithms to exploit this hierarchy –Minimize wide-area communication Wide-area bandwidth is increasing –DAS-3 has 10 Gb/s dedicated optical links between sites –Wide-area latency remains high (limited by speed-of-light) Speedups on a grid?
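
Why latency rather than bandwidth becomes the limit can be seen from a back-of-the-envelope model. The sketch below is an assumption-laden illustration, not a measurement: it takes the propagation speed in optical fiber to be roughly 200,000 km/s and ignores routing, switching, and protocol overhead. The class and method names are hypothetical.

```java
// Back-of-the-envelope model with assumed numbers; not measurements from DAS-3.
class WideAreaCost {

    // Assumption: light travels through optical fiber at roughly 200,000 km/s.
    static final double FIBER_KM_PER_SECOND = 200_000.0;

    // Lower bound on one-way latency (ms) over a fiber of the given length.
    static double minLatencyMs(double distanceKm) {
        return distanceKm / FIBER_KM_PER_SECOND * 1000.0;
    }

    // Time (ms) to push a payload onto a link of the given bandwidth.
    static double transmitMs(long payloadBytes, double gigabitsPerSecond) {
        return payloadBytes * 8.0 / (gigabitsPerSecond * 1e9) * 1000.0;
    }

    // One-way time (ms) for a single message: latency plus transmission.
    static double messageMs(double distanceKm, long payloadBytes, double gigabitsPerSecond) {
        return minLatencyMs(distanceKm) + transmitMs(payloadBytes, gigabitsPerSecond);
    }
}
```

For a hypothetical 10,000 km link at 10 Gb/s, a 1 MByte message spends about 50 ms in flight but under 1 ms being transmitted. Adding bandwidth shrinks only the second term, which is why the optimizations that follow focus on sending fewer wide-area messages and overlapping them with computation.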

Example: N-body simulation Much wide-area communication –Each node needs info about remote bodies [Figure: CPU 1 and CPU 2 in Amsterdam, CPU 1 and CPU 2 in Delft]

Trivial optimization [Figure: Amsterdam and Delft clusters, each with CPU 1 and CPU 2]

Wide-area optimizations Message combining on wide-area links Latency hiding on wide-area links Collective operations for wide-area systems –Broadcast, reduction, all-to-all exchange Load balancing Conclusions: –Many applications can be optimized to run efficiently on a hierarchical wide-area system –Need better programming support
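
Message combining, the first optimization listed above, can be sketched as a buffer that collects small messages per destination cluster and ships each batch as one wide-area message. This is a minimal single-process illustration. The class `MessageCombiner`, its batch-size policy, and the cluster name used in the usage note are hypothetical and do not reflect how MagPIe or Ibis implement it.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of message combining on wide-area links.
class MessageCombiner {
    private final int batchSize;
    private final Map<String, List<String>> buffers = new HashMap<>();
    private int wideAreaSends = 0;

    MessageCombiner(int batchSize) {
        this.batchSize = batchSize;
    }

    // Buffers a message for a remote cluster; ships the batch once it is full.
    void send(String cluster, String message) {
        List<String> buffer = buffers.computeIfAbsent(cluster, c -> new ArrayList<>());
        buffer.add(message);
        if (buffer.size() >= batchSize) {
            flush(cluster);
        }
    }

    // Ships whatever is buffered for one cluster as a single wide-area message.
    void flush(String cluster) {
        List<String> buffer = buffers.remove(cluster);
        if (buffer != null && !buffer.isEmpty()) {
            wideAreaSends++; // a real system would transmit the whole batch here
        }
    }

    // Ships all partially filled batches, e.g. at the end of an iteration.
    void flushAll() {
        for (String cluster : new ArrayList<>(buffers.keySet())) {
            flush(cluster);
        }
    }

    int wideAreaSends() {
        return wideAreaSends;
    }
}
```

With a batch size of 4, ten messages to "Delft" followed by `flushAll()` cost three wide-area sends instead of ten. The price is extra delay for messages that wait in the buffer, so the batch size trades latency against the number of wide-area messages.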

Outline Grids Application-level tools for grids Parallel programming on grids Case study: Ibis al., AGRIDM’03 (Workshop on Adaptive Grid Middleware, New Paper: –Real-World Distributed Computing with Ibis, Sept. 2003

The Ibis system High-level & efficient programming support for distributed supercomputing Use Java-centric approach + JVM technology –Inherently more portable than native compilation Goal: drastically simplify programming and deployment of high performance distributed applications Target: –Large-scale distributed systems, including clusters, grids, desktop grids, clouds, mobile devices …. –Possibly all at the same time for 1 application

Real-world distributed systems

World wide testbed

Problem How to write (high-performance) applications for real-world distributed systems? How to deal with: –Performance: efficiency on wide-area system –Heterogeneity: different systems & APIs –Malleability: resources come and go –Fault tolerance: crashes –Connectivity: firewalls, NAT, etc.

Our approach Study fundamental underlying problems … hand-in-hand with realistic applications … integrate solutions in one system: Ibis [Figure labels: Distributed Systems, User]

Applications Scientific applications –Imaging (VU Medical Center, AMOLF) –Bioinformatics (sequence analysis) –Astronomy (data analysis challenge) Multimedia content analysis Games and model checking Semantic web (distributed reasoning)

Multimedia content analysis Automatically extract information from images & video –E.g., video archive, surveillance cameras Extract feature vectors from images –Describe properties (color, shape) –Data-parallel task on a cluster Compute on consecutive images –Task-parallelism on a grid

Example: object recognition ● Analyze video stream from camera to learn and recognize every-day objects ● Representative for more serious applications ● Same algorithms used for surveillance cameras ● London Underground  > years of processing for >> ’s CCTV cameras

Games and Model Checking Can solve entire Awari game on wide-area DAS-3 (889 B positions) –Needs 10G private optical network Distributed model checking has very similar communication pattern –Search huge state spaces, random work distribution, bulk asynchronous transfers Can efficiently run DeVinE model checker on wide-area DAS-3, use up to 1 TB memory

Distributed reasoning MaRVIN (Frank van Harmelen et al., VU): –A distributed platform for massive RDF inferencing (deductive closure) –"a brain the size of a planet" Uses Ibis to run on heterogeneous systems (clusters, desktop grids) Used for Billion Triple track of Semantic Web Challenge 2008 –Inputs 800M RDF triples, derives 29B triples

Awards SCALE 2008 (CCGrid’08) DACH 2008 – BS DACH FT AAAI-VC 2007 ISWC 2008 Multimedia Computing Astronomy Semantic Web (van Harmelen et al.) (Cluster/Grid’08)

Ibis Philosophy Real-world distributed applications should be developed and compiled on a local workstation, and simply be launched from there

Ibis Approach Virtual Machines (Java) deal with heterogeneity Provide range of programming abstractions Designed for dynamic/faulty environments Easy deployment through middleware-independent programming interfaces Modular and flexible: can replace Ibis components by external ones

Ibis Design Applications need functionality for –Programming (as in programming languages) –Deployment (as in operating systems) [Figure labels: Programming: logical, likes math; Deployment: practical, visual (GUI)]

Ibis System

Ibis brains

Programming system

Programming models Message passing (IPL, RMI, MPJ) Satin: Fault-tolerant, malleable divide-and-conquer system Jorus: Transparent library with multimedia operations Maestro: Self-optimizing fault-tolerant dataflow framework

Satin: a parallel divide-and-conquer system on top of Ibis Divide-and-conquer is inherently hierarchical More general than master/worker Satin: Cilk-like primitives (spawn/sync) in Java

Example (single-threaded Java)

interface FibInter {
    public int fib(int n);
}

class Fib implements FibInter {
    // Plain recursive Fibonacci; runs in a single thread.
    public int fib(int n) {
        if (n < 2) return n;
        return fib(n - 1) + fib(n - 2);
    }
}

Example Java + divide&conquer

interface FibInter extends ibis.satin.Spawnable {
    public int fib(int n);
}

class Fib extends ibis.satin.SatinObject implements FibInter {
    public int fib(int n) {
        if (n < 2) return n;
        int x = fib(n - 1);  // spawned, because FibInter extends Spawnable
        int y = fib(n - 2);  // spawned
        sync();              // wait until x and y are available
        return x + y;
    }
}
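
The Satin example cannot be compiled without the Ibis libraries. For readers who want to experiment with the same spawn/sync structure, the sketch below uses the fork/join framework from the standard Java library as an analogy: `fork()` plays the role of a spawned call and `join()` the role of `sync()`. It is not Satin. It runs inside one JVM and offers none of Satin's wide-area load balancing, malleability, or fault tolerance. The class name `ForkJoinFib` is hypothetical.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

// Divide-and-conquer Fibonacci with java.util.concurrent, as an analogy to Satin.
class ForkJoinFib extends RecursiveTask<Integer> {
    private final int n;

    ForkJoinFib(int n) {
        this.n = n;
    }

    @Override
    protected Integer compute() {
        if (n < 2) {
            return n;
        }
        ForkJoinFib left = new ForkJoinFib(n - 1);
        left.fork();                               // analogous to a spawned call
        int y = new ForkJoinFib(n - 2).compute();  // computed by the current thread
        int x = left.join();                       // analogous to sync()
        return x + y;
    }

    // Convenience entry point that runs the task on the shared common pool.
    static int fib(int n) {
        return ForkJoinPool.commonPool().invoke(new ForkJoinFib(n));
    }
}
```

As in the Satin version, the recursion creates a tree of tasks, and idle threads steal work from busy ones. Satin applies the same idea across clusters rather than across the cores of one machine.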

IPL (Ibis Portability Layer) Java-centric “run-anywhere” library Point-to-point, multicast, streaming Simple model for tracking resources –Join-Elect-Leave –Supports malleability & fault-tolerance
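
The Join-Elect-Leave model can be illustrated with a small in-memory registry. This is a sketch of the idea only and does not use the IPL API: the class `MembershipSketch`, its method signatures, and the rule that an election is reopened when its winner leaves are assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical in-memory registry illustrating Join-Elect-Leave; not the IPL API.
class MembershipSketch {
    private final Set<String> members = new LinkedHashSet<>();
    private final Map<String, String> elections = new HashMap<>();

    // A process announces itself to the pool.
    synchronized void join(String id) {
        members.add(id);
    }

    // The first candidate to ask wins; later callers learn who the winner is.
    synchronized String elect(String election, String candidate) {
        if (!members.contains(candidate)) {
            throw new IllegalArgumentException(candidate + " has not joined");
        }
        return elections.computeIfAbsent(election, e -> candidate);
    }

    // A process leaves (or is declared dead); elections it had won are reopened.
    synchronized void leave(String id) {
        members.remove(id);
        elections.values().removeIf(winner -> winner.equals(id));
    }

    synchronized int size() {
        return members.size();
    }
}
```

An application built on such a registry can pick a master with `elect("master", myId)`, and if that master leaves, the next caller of `elect` takes over. That is one way a small membership model can support both malleability and fault tolerance.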

SmartSockets library Detects connectivity problems Tries to solve them automatically With as little help from the user as possible Integrates existing and several new solutions Reverse connection setup, STUN, TCP splicing, SSH tunneling, smart addressing, etc. Uses network of hubs as a side channel

Ibis Deployment system

IbisDeploy GUI

JavaGAT GAT: Grid Application Toolkit –Makes grid applications independent of the underlying grid infrastructure Used by applications to access grid services –File copying, resource discovery, job submission & monitoring, user authentication Successor API is currently being standardized

Grid Applications with GAT [Figure: a grid application calls the GAT API (File.copy(...), submitJob(...)); the GAT Engine covers remote files, monitoring, info service and resource management, and uses intelligent dispatching to select among GridLab, Globus (gridftp), Unicore, SSH, P2P, Local and Koala]
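
The dispatching idea behind the GAT engine, where one call is served by whichever middleware happens to be available, can be sketched as a list of adapters tried in order. Everything here is hypothetical: `CopyAdapter`, `DispatchSketch`, and the `fake` helper are illustrative names and are not the JavaGAT API.

```java
import java.util.List;

// Hypothetical adapter interface; this is not the JavaGAT API.
interface CopyAdapter {
    String name();
    void copy(String source, String destination) throws Exception;
}

class DispatchSketch {

    // Builds a fake adapter that either succeeds or fails, for illustration only.
    static CopyAdapter fake(String name, boolean reachable) {
        return new CopyAdapter() {
            public String name() {
                return name;
            }
            public void copy(String source, String destination) throws Exception {
                if (!reachable) {
                    throw new Exception(name + " is not available");
                }
            }
        };
    }

    // Tries each adapter in turn and returns the name of the first that succeeds.
    static String copy(List<CopyAdapter> adapters, String source, String destination) {
        for (CopyAdapter adapter : adapters) {
            try {
                adapter.copy(source, destination);
                return adapter.name();
            } catch (Exception e) {
                // This middleware failed; fall through to the next adapter.
            }
        }
        throw new IllegalStateException("no adapter could copy " + source);
    }
}
```

The application only ever calls `DispatchSketch.copy(...)`, so it stays independent of the underlying grid infrastructure. Supporting another middleware means adding an adapter, not changing the application.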

Zorilla: Java P2P supercomputing middleware

Ibis demo (movie)

Object recognition Client Broker Servers Ibis (Java) Runs simultaneously on clusters (DAS-3, Japan, Australia), Desktop Grid, Amazon EC2 Cloud Connectivity problems solved automatically by Ibis SmartSockets

Ibis movie (part 1)

Performance on 1 DAS-3 cluster Relative speedups of Java/Ibis and C++/MPI –Using TCP or Myricom’s MX protocol Sequential performance Java: 88% of C++

DAS-3

Speedup (wide-area) Homogeneous wide-area systems (DAS-3): –Frame rate increases linearly with #clusters World-wide experiment: –24 frames per second (640 x 480 resolution) –Speed limited by camera, not computing infrastructure

Smart Phones GSM + PC + GPS + camera + networks + …. Will become ubiquitous (like GSMs) Our goal: study distributed applications running on (multiple) smart phones & other resources

Example: eyeDentify Implemented Ibis on Android –Google’s open-source Java-based platform Object recognition (eyeDentify) on a G1 smartphone Deploys computation server on DAS-3 cluster Launched from IbisDeploy/eyeDentify client on phone

Summary Parallel computing on Grids (distributed supercomputing) is a challenging and promising research area Many grid programming environments exist Ibis: a Java-centric Grid programming environment Extends to the mobile world