FESR Consorzio COMETA - Progetto PI2S2 Porting a program to run on the Grid Marcello Iacono Manno Consorzio COMETA


1 Porting a program to run on the Grid
Marcello Iacono Manno, Consorzio COMETA, marcello.iacono@ct.infn.it
FESR Consorzio COMETA - Progetto PI2S2, www.consorzio-cometa.it
Grid Tutorial for the Laboratori Nazionali del Sud, 26 February 2008

2 Outline
Catania, Grid Tutorial for the Laboratori Nazionali del Sud, 26 February 2008
– Overview
– Relevant Issues
– Application classification
– Data Management
– Computing Schema

3 The four pillars of Grid Computing

4 Overview
Why run a program on the Grid?
– Faster (hundreds of processors)
– Greater (TBs of storage capacity)
– Cheaper (currently free, but NOT forever)
Relevant Issues
– Platform: UNIX (Linux) / Windows
– Software: compiler / library availability
– Interactivity: command line / graphical user interface
– Legal issues: license / data security

5 Unix (Linux) / Windows (1/2)
The Grid “native” environment is open source … but heterogeneity is a Grid feature … so:
– legacy middleware is based on Scientific Linux (SLC) 3.0.8 (similar to Red Hat 6 / 7) with a 2.4.x kernel running on an i686 (32-bit) architecture
– migration to SLC 4 with a 2.6 Linux kernel on an x86_64 (64-bit) architecture is under way
– migration is complete for applications:
   • COMETA (Consortium) is a “production” infrastructure supporting the gLite 3.0 (edg flavor) middleware version
   • Worker Nodes have full 64-bit (4-core) processors
   • PI2S2 (Project) User Interface is 64-bit SLC4.4 (build machine)
… about Windows applications:
– a strong demand from the industry / business world

6 Unix (Linux) / Windows (2/2)
Presently on gLite
– Grid users interact via the gLite middleware from Linux-based User Interfaces via CLI
– (almost) all gLite resources are Linux-based
This implies
– Grid users need to be trained
– Only Linux-based applications can be deployed onto the Grid
Porting of gLite to Windows
– User Interface
– Computing Element (farm)
New users can use the Grid

7 Compiler / Library Availability
On a COMETA WN
– the PGI (Portland Group Inc.) 7.0-6 compilers (64-bit target) in /opt/share/pgi: Cc, CC, f77, f90
– Java Runtime Environment
Libraries
– Static compilation: the executable “contains” all the libraries
   • Large file (slows down data transfer)
   • No need for “external” calls (faster execution)
– Dynamic libraries: calls to external libraries
   • Small executable file (easy to transfer)
   • Less robust solution
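
The “less robust” side of dynamic linking usually shows up as a shared library that exists on the build machine but not on the Worker Node. A minimal sketch of a pre-flight check, assuming a Linux host with the standard `ldd` tool on the PATH; the function names are illustrative and not part of the gLite middleware:

```python
import subprocess


def missing_libraries(ldd_output: str) -> list[str]:
    """Return the shared libraries that `ldd` reports as 'not found'."""
    missing = []
    for line in ldd_output.splitlines():
        # An unresolved entry typically looks like: "\tlibfoo.so.1 => not found"
        name, sep, target = line.strip().partition(" => ")
        if sep and target.strip().startswith("not found"):
            missing.append(name)
    return missing


def check_executable(path: str) -> list[str]:
    """Run `ldd` on an executable and list its unresolved libraries.

    Assumption: `ldd` is available. For a statically linked file `ldd`
    reports that it is not a dynamic executable, so the result is empty.
    """
    result = subprocess.run(
        ["ldd", path], capture_output=True, text=True, check=False
    )
    return missing_libraries(result.stdout)
```

Running `check_executable` inside the job, before the application starts, turns a cryptic loader failure into an explicit list of what is missing on that node.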

8 Static Installation
Two ways to install the software:
• Static “public” installation as a stable extension of the m/ware
   Advantage: speeds up job execution
      o less work to do at run time
   Disadvantages:
      o compatibility problems with middleware and/or other applications
      o modifying and updating the application is more complex
      o requires Software Manager (SWM) role privileges
   Usage:
      o huge and/or stable SW packages
      o only an MPI binary executable is available (difficult to modify)

9 Dynamic Installation
• Dynamic “private” installation during job pre-processing
   Advantages:
      o more robust and flexible jobs
      o easier modifying and updating
      o no SWM privileges required
   Disadvantage:
      o slows down job execution (more work to do during job execution)
   Usage:
      o small and/or frequently modified SW packages
      o MPI launching script (easily modifiable)
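
A dynamic “private” installation amounts to a pre-processing step that unpacks the software in the job's working directory and then starts it. A minimal sketch, assuming the package travels with the job as a tar archive; the function name and the archive layout are hypothetical:

```python
import subprocess
import tarfile
from pathlib import Path


def install_and_run(tarball: str, workdir: str, command: list[str]) -> int:
    """Unpack a software tarball into workdir, then run command from there."""
    target = Path(workdir)
    target.mkdir(parents=True, exist_ok=True)
    with tarfile.open(tarball) as archive:
        # Only unpack archives you built yourself: extractall trusts member paths.
        archive.extractall(target)
    # The command is resolved relative to the freshly unpacked directory.
    completed = subprocess.run(command, cwd=target, check=False)
    return completed.returncode
```

The unpacking time is paid at every job, which is the slowdown listed among the disadvantages; in exchange, updating the application only means shipping a new archive.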

10 Interactivity & Graphical User Interface
Interactivity
– The “standard” use of the Grid is the “batch” mode
– Interactive jobs are also available but …
– Grid integration of “foreign” GUIs requires modifications to the application code (discouraged: both difficult and “dangerous”)
– Data inspection is required by long-running jobs (> 1 day)
   • by a watchdog script (user application unchanged)
   • checkpointable jobs (not available on COMETA; they require a library called from “inside” the application)
   • customized solutions (see Computing Schema)
Graphical User Interface
– The Genius portal is the “natural” GUI for the gLite m/ware
   • many functions already implemented (e.g. authentication)
   • development is faster due to standardization
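
The watchdog approach can be sketched as a wrapper that starts the user application unchanged and inspects its output file at regular intervals. This is an illustration only: the function name, the `report` hook and the choice of file size as the inspected quantity are assumptions, and a real job would more likely copy a snapshot of the file somewhere the user can read it.

```python
import subprocess
import time
from pathlib import Path


def run_with_watchdog(command, output_file, interval=60.0, report=print):
    """Run command unchanged; every `interval` seconds report the output size."""
    process = subprocess.Popen(command)
    while True:
        try:
            # Returns the exit code as soon as the application terminates.
            return process.wait(timeout=interval)
        except subprocess.TimeoutExpired:
            # Still running: inspect the data produced so far.
            path = Path(output_file)
            size = path.stat().st_size if path.exists() else 0
            stamp = time.strftime("%H:%M:%S")
            report(f"{stamp} {output_file}: {size} bytes")
```

Because the application is launched as a child process, no change to its code is needed, which is the point of this option compared with checkpointing.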

11 License Issues & Data Security
License Issues
– In the future the Consortium will act as a service provider, selling computing and storage resources on demand
– The accounting system is under construction
– Licensed software can already be installed at site level
– A Grid License Server is being built
Data Security
– The Data Catalog and replicas enhance the robustness of the data system
– Security is based on X.509 certificates
– Virtual Organization membership, groups and roles are recognized, and Access Control Lists (ACLs) are supported
– Cryptography is also available (GFAL library)

12 Classification: Simulation
Characteristics
– Jobs are CPU-intensive
– Large number of independent jobs
– Run by few (expert) users
– Small input; large output
Needs
– Batch-system services
– Minimal data management for storage of results
Examples: LHC Monte Carlo simulation, Fusion
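
A large number of independent jobs is typically prepared by generating one job description per parameter value. A minimal sketch that writes gLite JDL text for a seed sweep; the script name `simulate.sh`, its `--seed` option and the file naming scheme are hypothetical, and only the basic JDL attributes are used:

```python
def make_jdl(executable: str, arguments: str, index: int) -> str:
    """Build the JDL text for one independent job of the sweep."""
    stdout, stderr = f"job_{index}.out", f"job_{index}.err"
    lines = [
        f'Executable = "{executable}";',
        f'Arguments = "{arguments}";',
        f'StdOutput = "{stdout}";',
        f'StdError = "{stderr}";',
        # The executable is shipped with the job; the logs come back with it.
        f'InputSandbox = {{"{executable}"}};',
        f'OutputSandbox = {{"{stdout}", "{stderr}"}};',
    ]
    return "\n".join(lines) + "\n"


def make_sweep(executable: str, seeds: list[int]) -> dict[str, str]:
    """Map a JDL file name to its content, one entry per random seed."""
    return {
        f"job_{i}.jdl": make_jdl(executable, f"--seed {seed}", i)
        for i, seed in enumerate(seeds)
    }
```

For example, `make_sweep("simulate.sh", [11, 22, 33])` yields three JDL texts that differ only in the seed and in the names of their log files, ready to be written to disk and submitted one by one.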

13 Classification: Bulk Processing
Characteristics
– Widely-distributed input data
– Significant amount of input and output data
Needs
– Job management tools (workload management)
– Meta-data services
– More sophisticated data management
Examples: HEP processing of raw data, analysis, Earth observation data processing

14 Classification: Responsive Apps (I)
Characteristics
– Small amounts of input and output data
– Not CPU-intensive
– Short response time (few minutes)
Needs
– Configuration which allows “immediate” execution (QoS)
– Services must treat jobs with minimum latency
Examples: prototyping new applications, monitoring grid operations

15 Classification: Responsive Apps (II)
Characteristics
– Rapid response: a human waiting for the result!
– Many small but CPU-intensive tasks
– User is not aware of “grid”!
Needs
– Interfacing (data & computing) with non-grid application or portal
– User and rights management between front-end and grid
Examples: applications that use the Grid as a backend infrastructure (gMOD, gLibrary, Hadrontherapy, GATE, Interactive Analysis of Medical Images, Volcano Sonification)

16 WORKFLOW
Characteristics
– Use of grid and non-grid services
– Complex set of algorithms for data analysis
– Complex dependencies between individual tasks
Needs
– Tools for managing the workflow itself
– Standard interfaces for services (e.g. web services)
Examples: flood prediction
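
Managing dependencies between tasks comes down to ordering a directed acyclic graph. A minimal sketch using `graphlib` from the Python standard library (Python 3.9 or later); it is not a gLite tool, and any task names passed to it are the caller's own:

```python
from graphlib import TopologicalSorter


def execution_batches(dependencies: dict[str, set[str]]) -> list[list[str]]:
    """Group tasks into batches; tasks in one batch do not depend on each other.

    `dependencies` maps each task to the set of tasks it must wait for.
    """
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    batches = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # everything runnable right now
        batches.append(ready)
        sorter.done(*ready)  # mark the batch finished to unlock successors
    return batches
```

Each batch can be submitted as a set of independent jobs, and the next batch starts once the previous one has completed.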

17 PARALLEL JOBS
Characteristics
– Many interdependent, communicating tasks
– Many CPUs needed simultaneously
– Use of MPI libraries
Needs
– Configuration of resources for flexible use of MPI
– Pre-installation of optimized MPI libraries
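
Requesting many CPUs simultaneously is expressed in the job description. A sketch that builds such a JDL text, assuming the `JobType = "MPICH"` and `NodeNumber` attributes of the gLite 3.0 generation of the middleware (check the attributes accepted by the local installation); the launching script and binary names are whatever the caller supplies:

```python
def make_mpi_jdl(launcher: str, binary: str, nodes: int) -> str:
    """Build a JDL text asking for `nodes` CPUs for an MPI job.

    `launcher` is the MPI launching script and `binary` the MPI program;
    both are shipped in the input sandbox.
    """
    if nodes < 1:
        raise ValueError("nodes must be a positive integer")
    lines = [
        'JobType = "MPICH";',  # assumption: MPICH job type of gLite 3.0
        f"NodeNumber = {nodes};",
        f'Executable = "{launcher}";',
        f'Arguments = "{binary} {nodes}";',
        'StdOutput = "mpi.out";',
        'StdError = "mpi.err";',
        f'InputSandbox = {{"{launcher}", "{binary}"}};',
        'OutputSandbox = {"mpi.out", "mpi.err"};',
    ]
    return "\n".join(lines) + "\n"
```

Keeping the MPI command line inside a launching script, rather than in the binary, is what makes the job easy to modify without recompiling.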

18 DATA MANAGEMENT
Data distribution must be carefully planned:
– When huge amounts of data are involved
– When a single file is large (1 GB)
– When data transfer impacts computing performance
– When security issues are relevant (data integrity)
Metadata
– May help in data management (updating)
– Are useful to add more information to raw data
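
Data integrity after a transfer can be checked by comparing checksums of the original file and of its copy. A minimal sketch with the standard `hashlib` module; the file is read in chunks so that a large file (1 GB or more) never has to fit in memory. The function names and the SHA-256 default are choices made for this illustration:

```python
import hashlib


def file_checksum(path, algorithm="sha256", chunk_size=1024 * 1024):
    """Return the hex digest of a file, reading it one chunk at a time."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        # iter() with a sentinel stops at the empty bytes object at end of file.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_copy(original, copy):
    """True if both files have the same content according to the checksum."""
    return file_checksum(original) == file_checksum(copy)
```

The digest of the original can also be stored as metadata next to the file entry, so that any replica can be verified without access to the source.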

19 COMPUTING SCHEMA
Computing Schema
– The way the computation is actually performed on the Grid
– May be critical for Grid effectiveness
– Requires cooperation between Grid and application experts
Guidelines
– Leave the application unchanged as far as possible
– Adapt the Grid (especially extending the m/ware by customization scripts)
– Clearly separate application from Grid “domains”
– If massive exploitation is needed, reach it step by step

20 Questions…

