1 EGEE-III INFSO-RI-222667 Enabling Grids for E-sciencE www.eu-egee.org Application Porting Support Group Demonstration at EGEE’08 Conference Istanbul, 22-26 September 2008

2 The Application Porting team
MTA SZTAKI, Budapest
–Grid Application Support Center (GASuC): http://www.lpds.sztaki.hu/gasuc
INFN, Catania
–GILDA Team: https://gilda.ct.infn.it/
UCM, Madrid
–Distributed Systems Architecture Research Group: http://asds.dacya.ucm.es/doku.php?id=start
CSIC, Santander
–Institute of Physics of Cantabria: http://grid.ifca.es/
ASGC, Taipei
–Academia Sinica Grid Computing: http://grid.sinica.edu.tw/

3 The Application Porting team
Locations: Budapest, Paris, Madrid, Catania, Taipei, Melbourne

4 Support cycle and services
Apply online at www.lpds.sztaki.hu/gasuc
Interviews
Personalized training
Porting specifications
Problem analysis
Writing publications and case studies
Prototyping grid applications
Fine tuning applications on production grids

5 Application analysis
Application Description Form at www.lpds.sztaki.hu/gasuc

6 Dissemination and outreach

7 Main tools and technologies
gLite command line tools and scripts
–Interfacing with the infrastructure
P-GRADE Portal
–Workflows and parameter studies
–Application specific interfaces
GEMLCA service
–Exposing legacy components as Grid services
GridWay
–Metascheduling on clusters and grids
–Programming abstractions
GILDA services
–gLite infrastructure for training and prototyping
–Training modules and services

8 Success stories
Solving the Schrödinger equation for triatomic systems using a time-independent method
–Department of Chemistry, University of Perugia and MTA SZTAKI

9 Success stories
E-marketplace Model Integrated with Logistics
–International Business School and MTA SZTAKI

10 Success stories
2.5 Dimensional Frequency Domain Electromagnetic Numerical Modelling
–University of Miskolc and MTA SZTAKI

11 Success stories
Hybrid pellet code
–KFKI Research Institute for Particle and Nuclear Physics and MTA SZTAKI

12 Success stories
Application specific grid portals
–Hiding grids and grid applications from non-technical end users

13 Current applications at MTA SZTAKI
RWavePR - Solving the Schrödinger equation for triatomic systems using a time-dependent method
–Application specific grid portal for parameter study simulation
SIMRI 3D - 3 dimensional MRI simulator
–Application specific grid portal for MPI code
MPI-FD-FDTD - Numerical modeling of electromagnetic field distribution in human tissues
–Simulating signals propagating inside the human body using a large number of jobs and output files
Universality Classes Explorer - Explaining classes in nonequilibrium systems
–Describing collective behavior in statistical physics using a volunteer desktop grid

14 Demo applications
„ABC” computational chemistry code
–P-GRADE portal
–MTA SZTAKI, Hungary
„CD-HIT” computational chemistry code
–GridWay
–UCM, Spain
Planck Applications
–Command line scripting
–IFCA, Spain

15 Porting the „ABC” computational chemistry code to EGEE
Case study from MTA SZTAKI

16 Main facts about the ABC code
Department of Chemistry, University of Perugia
Solution of the Schrödinger equation for triatomic systems using the time-dependent (RWavePR) or time-independent (ABC) method
A single execution can take between 5 and 10 hours
Sequential Fortran 90
Binary: 400 KB
One input: ~1 KB
One output: ~500 KB
Many simulations at the same time
User-friendly graphical interface; easy to change the scale and input of the application

17 ABC Grid system specification
Customized user interface
Business layer
–Read user input, generate application input
–Start and manage ABC jobs
–Collect result from grid
–Visualize result for user
Grid layer
–gLite command line tools
–EGEE Grid services (gLite WMS, LFC, VOMS, …)
EGEE Grid

18 ABC Grid system implementation
ABC specific Gridsphere portlet
P-GRADE Portal
ABC parameter study workflow
ABC parameter study job
Fault tolerant grid execution and file transfer layer
Compchem VO
EGEE Grid

19 ABC application workflow
Generator job: generates the input files for the ABC jobs
ABC job: executed once for each parameter value provided by the end user
Collector job: collects and archives all output files into a single TAR file
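The three-stage workflow on this slide can be sketched in Python as a minimal local illustration. This is not the P-GRADE implementation: the function names, the `energy = ...` input format and the file names are hypothetical, and `run_abc_job` is a stand-in for running the real Fortran binary on a grid node.

```python
import tarfile
import tempfile
from pathlib import Path


def generate_inputs(workdir: Path, parameters: list) -> list:
    """Generator job: write one small input file per parameter value."""
    inputs = []
    for i, value in enumerate(parameters):
        path = workdir / f"abc_{i}.in"
        path.write_text(f"energy = {value}\n")  # hypothetical input format
        inputs.append(path)
    return inputs


def run_abc_job(input_file: Path) -> Path:
    """Stand-in for one ABC job; the real job runs the ABC binary remotely."""
    output = input_file.with_suffix(".out")
    output.write_text(f"result for {input_file.read_text().strip()}\n")
    return output


def collect_outputs(outputs: list, archive: Path) -> Path:
    """Collector job: pack every output file into a single TAR archive."""
    with tarfile.open(archive, "w") as tar:
        for out in outputs:
            tar.add(out, arcname=out.name)
    return archive


def run_parameter_study(workdir: Path, parameters: list) -> Path:
    inputs = generate_inputs(workdir, parameters)
    # The ABC jobs are independent of each other, which is what lets
    # the grid execute them in parallel; here they simply run in a loop.
    outputs = [run_abc_job(p) for p in inputs]
    return collect_outputs(outputs, workdir / "abc_results.tar")


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        archive = run_parameter_study(Path(tmp), [0.1, 0.2, 0.3])
        with tarfile.open(archive) as tar:
            print(sorted(tar.getnames()))
```

Because each ABC job reads only its own input file, the middle stage has no ordering constraints, which is the property a parameter study relies on.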

20 ABC user input interfaces
In this form end users can define actual values for the ABC application
Changeable parameters of the ABC grid application

21 Fine tuning the ABC application
Execution of 4 ABC parameter study jobs
–on a local machine (P4, 3.4 GHz, 1 GByte)
–on 4 broker-selected clusters of EGEE
Better speed-up can be achieved with more parameter jobs.

22 User manual of the ported application in PPT format

23 End user portlet for ported application

24 Conference paper from the porting of the ABC grid application

25 Training exercise from ported ABC code
Demonstrating parameter study concept
Demonstrating workflow concept
Usage of gLite CE, SE, WMS, VOMS
Tutorial for P-GRADE Portal
Customized for GILDA

26 Porting the „CD-HIT” computational chemistry code to EGEE
Case study from UCM

27 The CD-HIT Application: Application Description
“Cluster Database at High Identity with Tolerance”
Protein (and also DNA) clustering
–Compares protein DB entries
–Eliminates redundancies
Examples:
–Used in UniProt for generating UniRef data sets. UniProt is the world's most comprehensive catalog of information on proteins. The CD-HIT program is used to generate the UniRef reference data sets, UniRef90 and UniRef50.
–CD-HIT is also used at the PDB to treat redundant sequences

28 The CD-HIT Application: Application Description
Our case: widely used in the Spanish National Oncology Research Center (CNIO)
Input DB now: 4,186,284 proteins / 1.7 GB
Infeasible to execute on a single machine
–Memory requirements
–Total execution time

29 The CD-HIT Application: CD-HIT Parallel
Execute cd-hit in parallel mode
Idea: divide the input database to compare each division in parallel
–Divide the input db
–Repeat: cluster the first division (cd-hit), then compare the others against this one (cd-hit-2d)
–Merge results
Speed up the process and deal with larger databases
Computational characteristics
–Variable degree of parallelism
–Grain must be adjusted
[Figure: CD-HIT Parallel. DB is divided (Div.) into A, B, C, D; successive steps produce A90, B90, C90, D90 via the intermediate sets B-A, C-A, D-A, C-AB, D-AB, D-ABC; Merge yields DB90.]
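The divide, cluster, compare and merge loop described on this slide can be sketched with toy stand-ins. This is only an illustration of the control flow: `cluster` and `filter_against` are hypothetical replacements for cd-hit and cd-hit-2d, and `difflib.SequenceMatcher` is a crude substitute for CD-HIT's sequence identity measure.

```python
from difflib import SequenceMatcher


def similar(a: str, b: str, threshold: float) -> bool:
    """Toy stand-in for a sequence identity test (not CD-HIT's measure)."""
    return SequenceMatcher(None, a, b).ratio() >= threshold


def cluster(division, threshold):
    """Stand-in for cd-hit: greedily keep one representative per cluster."""
    reps = []
    for seq in sorted(division, key=len, reverse=True):
        if not any(similar(seq, r, threshold) for r in reps):
            reps.append(seq)
    return reps


def filter_against(division, reps, threshold):
    """Stand-in for cd-hit-2d: drop sequences already covered by reps."""
    return [s for s in division
            if not any(similar(s, r, threshold) for r in reps)]


def divided_cluster(db, n_divisions, threshold=0.8):
    # Divide the input database.
    divisions = [db[i::n_divisions] for i in range(n_divisions)]
    merged = []
    while divisions:
        # Cluster the first division.
        reps = cluster(divisions[0], threshold)
        # Compare every other division against it. Each comparison is
        # independent, so on a grid these would be separate parallel jobs.
        divisions = [filter_against(d, reps, threshold)
                     for d in divisions[1:]]
        # Merge results.
        merged.extend(reps)
    return merged


if __name__ == "__main__":
    db = ["AAAAAAAAAA", "AAAAAAAAAA", "CCCCCCCCCC",
          "AAAAAAAAAT", "GGGGGGGGGG", "CCCCCCCCCC"]
    print(sorted(divided_cluster(db, n_divisions=2)))
```

The number of independent comparisons shrinks by one on every pass through the loop, which matches the "variable degree of parallelism" noted on the slide; the division count is the knob for adjusting the grain.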

30 Grid Infrastructure
Experiment Resources (during Grid porting process)
–http://www.gridimadrid.org/ (Research and small scale)
–http://www.eu-egee.org/ (Production and large scale)

31 The GridWay Metascheduler

32 The GridWay Metascheduler: GridWay Internals
[Figure: GridWay internals]
DRMAA library, CLI
GridWay Core: Request Manager, Dispatch Manager, Scheduler, Job Pool, Host Pool
Execution Manager
–Job Submission, Job Monitoring, Job Control, Job Migration
–Grid Execution Services: pre-WS GRAM, WS GRAM
Transfer Manager
–Job Preparation, Job Termination, Job Migration
–Grid File Transfer Services: GridFTP, RFT
Information Manager
–Resource Discovery, Resource Monitoring
–Grid Information Services: MDS2 GLUE MDS4

33 The GridWay Metascheduler: Deployment Example
[Figure labels: Users; DRMAA interface; GridWay; VO Schedulers; EGEE Resource Broker (EGEE RB); gLite services: BDII, GRAM, GridFTP; SGE Cluster; PBS Cluster; Biomed; Other V.O.; Other App.]

34 The GridWay Metascheduler: What is DRMAA?
Distributed Resource Management Application API
–http://www.drmaa.org/
Open Grid Forum Standard
Homogeneous interface to different Distributed Resource Managers (DRM):
–SGE
–Condor
–PBS/Torque
–GridWay (C, JAVA, Perl, Ruby, Python)

35 The GridWay Metascheduler: Development Process
Community – Open Source Project. Globus Development Philosophy
Development Infrastructure (thanks to Globus Project!)
–Mailing Lists
–Bugzilla
–CVS
You are very welcome to contribute:
–Reporting Bugs (gridway-user@globus.org)
–Making feature requests for the next GridWay release (gridway-user@globus.org)
–Contributing your own developments (bug fixes, new features, documentation)
Detailed Roadmap:
–GridWay Campaigns at bugzilla.mcs.anl.gov/globus/query.cgi
–www-unix.mcs.anl.gov/~bacon/cgi-bin/big-roadmap.cgi#Gridway

36 Grid Porting
Merge sequential tasks to reduce overhead
Provide a uniform interface (DRMAA) to interact with different DRMS
Some file manipulation still needed
[Figure labels: cd-hit-div and merge tasks over the intermediate sets (C-A, D-A, C-AB, D-AB, D-ABC) and results (B90, C90, D90); PBS, DRMS, local cluster; GridWay, Grid]

37 Grid Porting
Optimization Heuristics
More information on Globus GridWay: http://www.gridway.org/

38 Porting of Planck Applications to EGEE
Case study from CSIC-IFCA Santander

39 Concept of porting
Marcos López-Caniego, IFCA Santander, Spain
Compact Source Detection (Point Sources and SZ Clusters)
Scheme:
1. Prepare Intel-compatible static executable of the code
2. Run script to generate necessary macros to be submitted
3. Submit job to the node:
   1. Retrieve maps/patches from the storage element
   2. Move/Rename files
   3. Execute application
   4. Compress outputs and copy-register them in the storage element
4. Retrieve results from the storage element to the local disk
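Step 3 of the scheme can be sketched as a small Python helper that assembles the shell lines a worker-node job would run. The slides do not name the commands used; this sketch assumes the gLite data-management tools `lcg-cp` (copy from grid storage) and `lcg-cr` (copy and register). The VO name, storage element host, logical file names, the `input.fits` and `output` names, and the executable name are all hypothetical.

```python
def build_job_script(vo, storage_element, lfn_dir, patch, executable):
    """Return the shell lines for one job, in execution order.

    All argument values are assumed to be shell-safe; nothing is quoted.
    """
    archive = f"{patch}_results.tar.gz"
    return [
        # 3.1 Retrieve the patch from the storage element by logical name.
        f"lcg-cp --vo {vo} lfn:{lfn_dir}/{patch}.fits file:$PWD/{patch}.fits",
        # 3.2 Rename it to the (hypothetical) name the application expects.
        f"mv {patch}.fits input.fits",
        # 3.3 Execute the statically linked application.
        f"./{executable} input.fits",
        # 3.4 Compress the outputs (assumed to land in ./output) ...
        f"tar czf {archive} output",
        # ... then copy and register the archive on the storage element.
        f"lcg-cr --vo {vo} -d {storage_element} "
        f"-l lfn:{lfn_dir}/{archive} file:$PWD/{archive}",
    ]


if __name__ == "__main__":
    lines = build_job_script(
        vo="planck",                      # hypothetical VO name
        storage_element="se.example.org",  # hypothetical storage element
        lfn_dir="/grid/planck/demo",       # hypothetical catalogue path
        patch="patch_001",
        executable="detect_sources",       # hypothetical binary name
    )
    print("\n".join(lines))
```

Generating one such script per sky patch corresponds to step 2 of the scheme (the "macros to be submitted"); step 4 would then be a single `lcg-cp` per result archive back to the local disk.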

40 Demo libraries
These are the libraries used by the application:
HEALPix: pixelization scheme from Gorski et al. that produces a subdivision of a spherical surface in which each pixel covers the same surface area as every other pixel.

41 Demo libraries
CFITSIO: simple high-level routines for reading and writing FITS files, developed and maintained by NASA HEASARC (High Energy Astrophysics Science Archive Research Center).
CPACK: a package of routines produced at CPAC (Cambridge Planck Analysis Software), used in particular for tessellation and projection of regions of the sphere into flat patches.

42 Demo: Detection of Point Sources
In this approach to detect point sources we work on flat patches of the sky

43 Demo: Detection of Point Sources
+ Filtering (MHW2)

44 Application Porting Support Group
Contact and further information: www.lpds.sztaki.hu/gasuc

