
1 EGEE-II INFSO-RI-031688 Enabling Grids for E-sciencE www.eu-egee.org
EGEE and gLite are registered trademarks
Experience with the deployment of applications on the grid
Vincent Breton, LPC, CNRS-IN2P3
Credit for the slides: V. Floros, C. Loomis, M. Hofmann, N. Jacq, V. Kasam, J. Montagnat

2 ACGRID School, Nov. 7 2007 Enabling Grids for E-sciencE EGEE-II INFSO-RI-031688
Who am I?
Vincent Breton, research associate at CNRS
–Email: breton@clermont.in2p3.fr
–Phone: + 33 6 86 32 57 51
In 2001, I created a research group in my laboratory on grid-enabled biomedical applications
–Web site: http://clrpcsv.in2p3.fr
The PCSV team has been continuously attempting to deploy scientifically relevant applications on grid infrastructures
–FP5: DataGrid
–FP6: EGEE, Embrace, BioinfoGRID, Share
This talk is given from a user perspective

3 Introduction
A few words on EGEE applications
A few principles
Application development and deployment

4 Growing Usage
Recent level: ~15k CPUs in continuous use
Usage doubled over the last year

5 Scientific Disciplines
Disciplines: 10
Sub-disciplines: 36
See growth and diversification of applications.
Only reported applications are counted, so the numbers are an underestimate!
Discipline                  6/2006  2/2007
Astronomy & Astrophysics         2       8
Computational Chemistry          6      27
Earth Science                   16      16
Fusion                           2       3
High-Energy Physics              9      11
Life Sciences                   23      39
Others                           4      14
Total                           62     118
Condensed Matter Physics, Comp. Fluid Dynamics, Computer Science/Tools, Civil Protection, Finance
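The growth claimed on this slide can be checked with a little arithmetic. The sketch below is illustrative only: the counts are read from the flattened table, and the Earth Science pair (16, 16) is an assumption, chosen because it is the value that makes both columns sum to the reported totals of 62 and 118. The function names are hypothetical.

```python
# Reported application counts per discipline: (6/2006, 2/2007).
# Values are read from the slide's flattened table. The Earth Science
# pair is an ASSUMPTION that reproduces the reported column totals.
COUNTS = {
    "Astronomy & Astrophysics": (2, 8),
    "Computational Chemistry": (6, 27),
    "Earth Science": (16, 16),  # assumed, see note above
    "Fusion": (2, 3),
    "High-Energy Physics": (9, 11),
    "Life Sciences": (23, 39),
    "Others": (4, 14),
}

# Totals as printed on the slide.
REPORTED_TOTALS = (62, 118)


def column_totals(table):
    """Sum each column of the table and return (before, after)."""
    before = sum(b for b, _ in table.values())
    after = sum(a for _, a in table.values())
    return before, after


def growth_factors(table):
    """Return the after/before ratio for every discipline."""
    return {name: after / before for name, (before, after) in table.items()}


if __name__ == "__main__":
    # Sanity check: the rows should add up to the slide's totals.
    assert column_totals(COUNTS) == REPORTED_TOTALS

    for name, factor in growth_factors(COUNTS).items():
        print(f"{name:26s} x{factor:.2f}")

    overall = REPORTED_TOTALS[1] / REPORTED_TOTALS[0]
    print(f"{'Total':26s} x{overall:.2f}")
```

Under these assumptions, the number of reported applications roughly doubles in eight months, with Computational Chemistry and Astronomy & Astrophysics growing fastest in relative terms.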

6 Usage by Scientific Discipline
Wide (natural) differences in total CPU utilization.
Evidence of broad adoption of grid technology.

7 Resources by Discipline
Utilization depends on having available resources.
See good coverage of scientific disciplines for computing and storage resources.
–Sites often have more than one CE or SE defined.
–The table counts resources; it says nothing about their size!
Thanks!
Discipline  # CEs  # SEs
HEP           292    299
LS            113    123
CC             25     41
AA             57     83
Fusion         19     21
ES             42     65
Others        143    149
Unknown       288    327
Infra.        282    306
Total         366    334

8 Active VOs
Number of “active” VOs growing steadily!
–Turnover: different VOs active in the last 6 / 12 / 24 months = 83 / 92 / 102
–Total VOs: 104 registered, 258 visible

9 Summary of Use
Large, growing overall utilization.
Long-term, habitual use of infrastructure.
Broad adoption by many diverse communities.

10 High-Energy Physics
Coordinator: Massimo Lamanna
Use by major (& future) HEP laboratories
–LHC
–Tevatron
–Hera
–SLAC
–ILC
–…

11 High-Energy Physics
CMS Analysis in Summer 2007
* > 600k jobs/month, 20k jobs/day
* 89% grid success rate
* substantial Tier2 contribution
ATLAS collected and distributed first detector data (cosmic rays) to Tier2s
Dashboard: serving 4 of 4 LHC experiments
Interest from other communities:
* Pilot for Life Sciences (vlemed VO)
* Interest from Diligent
Contribution to monitor users’ jobs: Grid reliability + more information to final users

12 Life Sciences
Coordinators:
–Drug Discovery: Vincent Breton
–Medical Imaging: Johan Montagnat
–Bioinformatics: Christophe Blanchet

13 Life Sciences
AMGA server used for medical metadata management
–Fine-grained access control using grid credentials
–Secured data transfers
Medical data management environments exploiting AMGA
–gLite Medical Data Manager (CNRS, I3S lab)
–Medical image management web portal (CNRS, LPC)
–Alzheimer's patient data analysis environment (Biolab, U. Genova)
–Health-e-child Data Management System (Health-e-child project)
Future evolution: distributed metadata repositories
[Figure: Image + metadata (patient name, radiology center, image IDs, ...), AMGA Server, LFN, other medical metadata]

14 Earth Sciences
Coordinator: Monique Petitdidier
Extremely varied range of applications in this sector.

15 Earth Sciences
Sharing Algorithms
–Algorithms: GEOCLUSTER, ELMER, CODESA-3D, 3DSEM_UNSTRUCT
–Grids and VOs: VO - EGEODE, VO - ESR, EUMEDGrid, EELA
–Partners: CGG-Veritas, CSC - Finland, CRS4 - Italy, IPGP - France
Exploring Large Data Sets
–Geoscope (http://geoscope.ipgp.jussieu.fr), IPGP-France
–28 seismological stations and data center
–25 years of data
–Processing of the whole data set on EGEE
–Impact on seismological data center design

16 Computational Chemistry
Coordinator: Mariusz Sterzel

17 Computational Chemistry
Commercial software availability
–Gaussian, Turbomole
Parallel (MPI) execution
–DL_POLY, NAMD, Turbomole
Nanotubes studies
GEMS: ab initio and molecular dynamics modeling of chemical reactions
Other interested disciplines:
* Biology
* Pharmacology
* Solid state physics

18 Astronomy & Astrophysics
Coordinator: Claudio Vuerli

19 Astronomy & Astrophysics
Requirements/Wishes
–Moving huge amounts of data over the grid.
–On-the-fly deployment of code on the grid.
–Saving intermediate data directly on SEs.
–Deployment of dedicated libraries.
–Deployment of visualization tools.
–Make EGEE G-DSE compliant.
–Integration in grid portals (Genius, EnginFrame, etc.).
–More effort to train people.
Typical application that benefits from distributed computing techniques
–International collaboration with a high level of interaction.
–Usage of local clusters, EGEE infrastructure and DEISA facilities.

20 Fusion
Coordinator: Francisco Castejon

21 Business Collaboration
Gaussian
–http://www.gaussian.com/
–Predicts the energies, vibrational frequencies, … of molecular systems.
–VO-based licensing model, currently in use in the gaussian VO.
MathWorks
–http://www.mathworks.com/
–Integrate MATLAB & Distributed Computing Engine with EGEE.
–Both client and server are licensed in this model.
Interactive Supercomputing
–http://www.interactivesupercomputing.com/
–Similar to DCE; used from multiple clients (MATLAB, Python, R)
–Server licensed, some clients licensed

22 A few principles for achieving scientific production on grids
Basic principles we used to achieve scientific production on grids
–Principle n°1: the bottom-up approach
–Principle n°2: the grid risk
–Principle n°3: the natural choice
–Principle n°4: the minimum effort
These principles are not relevant to people who are doing research on grids

23 Principle n°1: the bottom-up approach
There are at least three complementary approaches to doing science with grids
–Top-down (à la MyGRID): start from end users and integrate grid services as appropriate
–Integrated approach (à la BIRN): develop at all levels
–Bottom-up (our approach): start from the services made available by the grid infrastructures
Our philosophy: identify and deploy the science that can be done with the services available
–This requires understanding both the needs of the user communities and the services available on the grids
Consequence: don’t ever wait for the next-generation middleware
–It will not deliver on its promises!

24 Principle n°2: the grid risk
Most scientific applications today do not require grids
–Most data-crunching applications require only cluster computing
–Most data applications do not require grids
Gridifying them opens up new perspectives
–Example: virtual screening
Remember the pioneers building planes
–The first planes were by no means efficient vehicles for traveling
–It took years and a war to offer a transport service using planes
Look at your grid application as a prototype for the future
–Grid operating systems are going to evolve in the coming years
Consequence: be ready to face skepticism

25 Principle n°3: the natural choice
To achieve scientific production, we needed help
–Developing our own middleware or building our own grid infrastructure was very expensive and out of reach
–Learning how to use a middleware was already expensive
–Being alone would have been a very heavy burden
Looking at principle n°1, we had to make compromises
–User support and accessibility at the price of the reduced functionality provided by EGEE middleware services
Consequence: look around you and make sure to choose the technology for which you will get the strongest support

26 Principle n°4: minimum effort
Keep in mind there are two steps
–Application development requires services
–Application deployment requires an infrastructure offering the services used to develop the application
At the development stage, it is tempting to use services not yet available on the infrastructure
–Maintaining additional services comes at a very significant additional cost
To achieve scientific production, it is important to stick to the released middleware
–As a consequence, put as much pressure as possible to get the services you need into the middleware release
Consequence: think carefully in terms of the middleware and the infrastructure you will need

27 Application development
Gridification of the application
Key steps
–Need to carefully analyze the application
§Where do I need the grid?
§What are the steps where the grid is needed?
–Need to carefully consider the services offered by the grid
§Different strategies can be considered
§Avoid using services which are not widely deployed (MPI)
Vincent Bloch will use the WISDOM example to show how to develop an application on the grid

28 Application deployment
Deployment of the gridified application on the grid
Key questions
–On which VO should I deploy my application?
§Do I need specific software? (LHC computing environment)
§Do I need specific resources (2 GB RAM, MPI)?
§Do I need a large amount of resources?
§Do I need specific data?
–Do I need a lot of storage space for my application input and output?
§Need to plan it in advance to avoid overloading the network and/or the User Interface
§Use of grid Storage Elements
The hands-on session will give you the opportunity to better understand the issues
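The key questions above amount to matching an application's requirements against what the sites of a candidate VO offer. The following is a minimal sketch of that reasoning, not a query against any real grid information system: the `Site` and `Requirements` records, the site names and all the numbers are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Site:
    """Hypothetical description of one computing site in a VO."""
    name: str
    ram_mb: int         # memory available per job slot
    supports_mpi: bool  # whether MPI jobs can run there
    free_cpus: int      # job slots currently available


@dataclass(frozen=True)
class Requirements:
    """Hypothetical description of what the application needs."""
    min_ram_mb: int = 0
    needs_mpi: bool = False
    min_cpus: int = 1


def matching_sites(sites, req):
    """Return the sites that satisfy every requirement."""
    return [
        site
        for site in sites
        if site.ram_mb >= req.min_ram_mb
        and (site.supports_mpi or not req.needs_mpi)
        and site.free_cpus >= req.min_cpus
    ]


# Made-up sites for illustration only.
SITES = [
    Site("site-a", ram_mb=1024, supports_mpi=False, free_cpus=200),
    Site("site-b", ram_mb=2048, supports_mpi=True, free_cpus=50),
    Site("site-c", ram_mb=4096, supports_mpi=False, free_cpus=10),
]

if __name__ == "__main__":
    # An application needing 2 GB of RAM and MPI, as in the slide's example.
    req = Requirements(min_ram_mb=2048, needs_mpi=True)
    for site in matching_sites(SITES, req):
        print(site.name)
```

In this toy setup, asking for specific resources shrinks the pool of usable sites quickly, which is why the slide advises settling these questions before choosing a VO.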

