
1 Consorzio COMETA: MPI use case
Giuseppe Andronico, INFN Sez. CT & Consorzio COMETA
Workshop "Grids vs. Clouds", Beijing, 18.05.2011
www.consorzio-cometa.it (FESR)

2 Outline
- Consorzio COMETA
- Specific solutions
- CFD & MPI on the Consorzio COMETA infrastructure
- Some results
- Conclusions

3 The Italian e-Infrastructure (interoperable through common communication protocols)
- The Network
- The Grid

4 Objectives of an e-Infrastructure in Sicily
- Create a Virtual Laboratory in Sicily, both for scientific and industrial applications, built on top of a Grid infrastructure
- Connect the Sicilian e-Infrastructure to those already existing in Italy, in Europe, and in the rest of the world, improving scientific collaboration and increasing the "competitiveness" of e-Science and e-Industry "made in Sicily"
- Disseminate the "grid paradigm" through the organization of dedicated events and training courses
- Trigger/foster the creation of spin-offs in the ICT area in order to reduce the "brain drain" of brilliant young people to other parts of Italy and beyond

5 COMETA Consortium and its e-Infrastructure
- 2000 cores for HPC
- 250 TB for storage

6 The COMETA Infrastructure (Catania)
- ~15 M€ in 3 years
- >350 people involved
- ~2500 CPUs
- ~250 TB

7 Hardware
- ~2000 cores, AMD Opteron 2218 rev. F
- 2 GB of RAM per core
- Commercial LRMS (LSF)
- Infiniband-4X (for MPI applications)
- ~250+ TB of storage
- Distributed parallel filesystem (GPFS)
- gLite 3.2 as Grid middleware everywhere: a deliberate investment in a "de facto" standard

8 Catania Computing Room
- Full area: ~200 m²
- Area #1: 10 racks, 40 kW UPS/PDU
- Area #2: 13 racks, 80 kW UPS/PDU, 80 kW air conditioning with ~100 kW external chiller
(3D model of the Catania Data Center shown on the slide)

9 Infiniband-4x
- The Infiniband-4x network layer allows latency times of a few µs, ideal for HPC
- Noticeable performance gains with >40 cores
- Currently being planned: a 1000-core site inside the Catania University Campus (250 m link) connecting INFN-CT, UNICT-MATH, UNICT-ENGIN and INFN-LNS

10 Scheduling Policy
Usual policy:
- Queues and their maximum durations: short (15 min), long (12 hrs), "infinite" (21 days) jobs
- Decreasing priority from short to infinite
- All resources must be available before a job starts
- Too restrictive for HPC jobs: long waiting times
HPC: RESERVATION (resources are assigned to the incoming HPC job as soon as they become available) and CO-ALLOCATION (the reserved resources are lent to SHORT jobs while the HPC job is still collecting its resources)
EMERGENCY: PRE-EMPTION (the current job is interrupted and then restored after the completion of the incoming job)
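The reservation and co-allocation policy above can be sketched in a few lines. This is a toy model, not LSF's actual algorithm: the 8-core cluster, job names and dictionary shapes are all invented for illustration.

```python
# Toy sketch of the reservation + co-allocation idea: cores freed up are
# reserved for the incoming HPC job; while the reservation is still
# filling, the reserved cores are lent to short jobs (backfill).

def schedule(free_cores, hpc_job, short_jobs):
    """Reserve cores for an HPC job as they become available; lend the
    reserved-but-idle cores to short jobs until the HPC job can start."""
    reserved = min(free_cores, hpc_job["cores"])
    if reserved < hpc_job["cores"]:  # HPC job cannot start yet
        backfilled = []
        lendable = reserved  # reserved cores not yet used by the HPC job
        for job in short_jobs:
            if job["cores"] <= lendable:
                backfilled.append(job["name"])
                lendable -= job["cores"]
        return {"hpc_started": False, "backfilled": backfilled}
    return {"hpc_started": True, "backfilled": []}

result = schedule(
    free_cores=4,
    hpc_job={"name": "fluent-run", "cores": 8},
    short_jobs=[{"name": "s1", "cores": 2}, {"name": "s2", "cores": 4}],
)
print(result)  # {'hpc_started': False, 'backfilled': ['s1']}
```

The point of co-allocation is visible in the example: the HPC job still lacks 4 cores, but the 4 already-reserved cores are not left idle; the short job that fits runs in the meantime.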

11 GridFlex

12 Watchdog
A tool to monitor job status at runtime. It is made of scripts: some must be sent along with the job, others are used by the user to retrieve information.
- Components involved: WMS, CE, WN, AMGA server
- JDL: job + watchdog files (watchdog.conf, watchdog.ctrl, watchdog.sh)
- Job status is queried using getinfo.ctrl and getcontent.sh
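A minimal sketch of what such a watchdog does: the job-side script periodically records status, and a user-side query reads it back. The real COMETA watchdog is a set of shell scripts reporting to an AMGA server; the file-based log here is an invented stand-in for that channel.

```python
# Job-side: append timestamped status records that a 'getinfo'-style
# query on the user side can later read back. (File-based logging is a
# stand-in for the AMGA server used by the real watchdog.)

import json
import time

def record_status(log_path, job_id, status):
    """Append one status record for the running job."""
    entry = {"job": job_id, "status": status, "ts": time.time()}
    with open(log_path, "a") as log:
        log.write(json.dumps(entry) + "\n")

def read_last_status(log_path):
    """User-side query: return the most recent status record, if any."""
    with open(log_path) as log:
        lines = log.read().splitlines()
    return json.loads(lines[-1]) if lines else None

record_status("watchdog.log", "job-001", "Running")
record_status("watchdog.log", "job-001", "Done")
print(read_last_status("watchdog.log")["status"])  # Done
```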

13 Application distribution & Fluent
- Many fields impacted, in both academic and business areas
- More than 300 users from Italy and abroad

14 COMETA workflow (SaaS)
1. COMETA staff, application developer(s) and application user(s) sat together to verify application portability
2. If step 1 was OK, a JDL was developed to run the application on the grid in the simplest case
3. A more general JDL was then developed to cover broader application usage
4. In some cases a portal was developed to simplify the interaction between users and the grid environment
At this point a user interested in the application connected to the portal, filled in the data required to run the application, pushed a button to start it, monitored the application status, and retrieved the results at the end.
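The portal step above essentially turns form input into a job description. A hedged sketch, assuming a minimal gLite-style JDL: the attribute names below (Executable, Arguments, StdOutput, ...) are standard JDL, but the template itself is an invented example, not the actual JDL used by COMETA.

```python
# Render a minimal JDL description from the values a portal form would
# collect. The wrapper script name and sandbox layout are hypothetical.

def make_jdl(executable, arguments, cpus):
    """Build a simple JDL text for a parallel job requesting `cpus` cores."""
    return "\n".join([
        "[",
        f'  Executable = "{executable}";',
        f'  Arguments = "{arguments}";',
        f"  CpuNumber = {cpus};",
        '  StdOutput = "std.out";',
        '  StdError = "std.err";',
        '  OutputSandbox = {"std.out", "std.err"};',
        "]",
    ])

print(make_jdl("run_fluent.sh", "case1.cas", 128))
```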

15 Introduction: FLUENT & OpenFOAM
- FLUENT & OpenFOAM are software packages for CFD simulations
- Two different approaches to the same computational field (mainly used for flow modelling and heat and mass transfer simulations)
- Fluent is a commercial product: several libraries included, ready for many architectures
- OpenFOAM is open-source software, easily adaptable by the user

16 Fluent
- Fluent: ANSYS software for Computational Fluid Dynamics (parallel computations on fluid, heat and mass flows)
- COMETA has been a leading customer in Italy, with >150 licenses
- Applications ranged from car design to the Marmore Falls simulation, through the study of heat dispersion in engines and refrigerators

17 Fluent & OpenFOAM
- PDE solvers require long CPU times and large memory
- A commonly used architecture is MPI/MPI2 with low-latency network layers (InfiniBand) and dedicated compilers (PGI C++/Fortran, Intel, GCC)
- Both packages run on the COMETA Infrastructure
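A toy illustration of why PDE solvers eat CPU time: explicit time-stepping of a 1D heat equation. Real Fluent/OpenFOAM runs do this in 3D over millions of cells for many more steps, which is exactly the work that MPI domain decomposition splits across cores; the grid size, diffusivity and step count here are invented.

```python
# One explicit finite-difference step of du/dt = alpha * d2u/dx2
# (unit dx and dt, boundary values held fixed). alpha = 0.1 keeps the
# scheme stable (alpha <= 0.5).

def heat_step(u, alpha=0.1):
    """Advance the temperature profile `u` by one time step."""
    return [u[0]] + [
        u[i] + alpha * (u[i - 1] - 2 * u[i] + u[i + 1])
        for i in range(1, len(u) - 1)
    ] + [u[-1]]

# A hot spot in the middle of a cold rod diffuses toward the boundaries.
u = [0.0] * 5 + [100.0] + [0.0] * 5
for _ in range(200):
    u = heat_step(u)
print(u)  # the initial spike has flattened out
```

Even this tiny example needs hundreds of sweeps over the grid to evolve the solution; scaling the grid to 3D multiplies the cost by orders of magnitude, hence the week-to-month runtimes quoted later in the talk.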

18 Fluent use case
- COMETA users interested in running Fluent have problems that require MPI clusters of 128 or 256 nodes for weeks or months: big enough to warrant dedicated hardware, not big enough to move to supercomputers
- Virtualization cannot be used: the VMs do NOT support Infiniband or any other low-latency interconnect
- Currently it is not possible to use cloud computing, as it is understood today, to serve such users
- Moreover, a 5% performance loss in a month-long run means 1.5 extra days; not a lot, but cloud computing is billed by the hour
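The overhead arithmetic on the slide can be checked directly (the 30-day run length is assumed as the meaning of "a month"):

```python
# A 5% slowdown on a 30-day run adds 1.5 days, i.e. 36 extra hours of
# hourly-billed compute time.
run_days = 30
slowdown = 0.05
extra_days = run_days * slowdown
extra_billed_hours = extra_days * 24
print(extra_days, extra_billed_hours)  # 1.5 36.0
```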

19 Fluent use case (IaaS)
- In this case, using the grid made it possible to integrate MPI support into the infrastructure
- Compilers, software and licensing were cleanly integrated too
- Expert users are required: a Fluent application is obtained by writing C/C++ code and compiling it against the Fluent library
- With the COMETA solution, users can request an MPI pool of up to about 600 cores

20 Marmore Falls Simulation

21 Fluent Video (1/2)

22 Fluent Video (2/2)

23 Other Applications
COMETA hosts several MPI codes:
- FLASH: a 3D astrophysical hydrodynamics code for supercomputers, used in current astrophysical research
- ABINIT: an ab-initio molecular-physics parallel code for molecular clustering
- CLUSTAL-W: a bioinformatics code for molecular affinity
- TEPHRA: a civil-defense code for forecasting ash-cloud evolution (Etna/Eyjafjallajökull)

24 Performance

25 Support

26 Conclusions
- We showed how SaaS and IaaS were implemented on the Consorzio COMETA infrastructure
- We also showed that there are still cases in which it is better to rely on grid computing
- Grid computing can be used to provide a service that is simple to use and offers good QoS

27 MPI on the web
- https://edms.cern.ch/file/454439/LCG-2-UserGuide.pdf
- http://oscinfo.osc.edu/training/
- http://www.netlib.org/mpi/index.html
- http://www-unix.mcs.anl.gov/mpi/learning.html
- http://www.ncsa.uiuc.edu/UserInfo/Training

28 Questions…

