INFSO-RI-508833 Enabling Grids for E-sciencE www.eu-egee.org User communities and applications David Fergusson 28th February.


1 INFSO-RI-508833 Enabling Grids for E-sciencE www.eu-egee.org User communities and applications David Fergusson 28th February

2 What is the EGEE community?
– Researchers in eScience (applications NA4)
– eResearch
– European community
– World grid community
– Industry (industry forum)
What is not the EGEE community?

3 eScience/eResearch
EGEE’s initial focus is on specific scientific communities
– High Energy Physics (Large Hadron Collider)
– Biomedical
– Geology
– Chemistry
– Astrophysics
Collaborating with other EU projects in other areas
– For example, digital libraries (DILIGENT)

4 Applications in EGEE
Production service supporting multiple VOs with different requirements
– Data
  • Volume
  • Location: distributed?
  • Write once or update?
  • Metadata archives?
  • Controlled or open access?
– Computation
  • High throughput (~ current LCG)
  • High performance, supercomputing
– Number of sites, scientists, …
Establish a viable general process to bring other scientific communities on board

5 An EGEE community
EGEE communities are based around the idea of Virtual Organisations (VOs). A Virtual Organisation:
– Owns shared computing resources
– Authorises and authenticates its members' access to resources
– Manages its own resources

6 EGEE: adding a VO
EGEE has a formal procedure for adding selected new user communities (Virtual Organisations):
– Negotiation with one of the Regional Operations Centres
– Seek a balance between the resources contributed by a VO and those that it consumes
– Resource allocation will be made at the VO level
– Many resources need to be available to multiple VOs: shared use of resources is fundamental to a Grid

7 The role of the pilot applications: HEP and Biomedicine
Initial area of focus, to establish a strong user base on which to build a broad EGEE user community
Provide early feedback to the infrastructure activities on their experience with application deployment and VO management
Act as guinea pigs and provide early feedback to the middleware developers on their experience with new services

8 EGEE pilot application: Large Hadron Collider
[Image labels: Mont Blanc (4810 m); Downtown Geneva]
Data challenge:
– 10 Petabytes/year of data
– 20 million CDs each year
Simulation, reconstruction, analysis:
– LHC data handling requires computing power equivalent to ~100,000 of today's fastest PC processors
Operational challenges:
– Reliable and scalable through a project lifetime of decades
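As a rough consistency check on the two data-volume figures quoted on this slide, the sketch below divides the yearly data volume by the number of CDs. It assumes decimal units (1 PB = 10^15 bytes, 1 MB = 10^6 bytes); the variable names are illustrative and not from the presentation.

```python
# Back-of-envelope check of the slide's figures (decimal units are an assumption).
PETABYTE = 10**15  # bytes
MEGABYTE = 10**6   # bytes

data_per_year_bytes = 10 * PETABYTE  # "10 Petabytes/year of data" (from the slide)
cds_per_year = 20_000_000            # "20 million CDs each year" (from the slide)

# Capacity each CD would need for the two figures to agree.
implied_cd_capacity_mb = data_per_year_bytes / cds_per_year / MEGABYTE
print(f"Implied capacity per CD: {implied_cd_capacity_mb:.0f} MB")
```

The result is 500 MB per disc, the same order as a standard CD (roughly 650 to 700 MB), so the slide's comparison reads as a rounded illustration rather than an exact conversion.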

9 The characteristics of pilot HEP applications
Very large scale from project day 1
Virtual Organizations were already set up at project day 1
Very centralized: jobs are sent in a very organized way
Multi-grid: data challenges are deployed on several grids
– ALICE: LCG, Alien
– ATLAS: LCG, US Grid2003, Nordugrid
– CMS: LCG, US Grid2003
– LHCb: LCG, Dirac

10 The Large Hadron Collider (http://www.cern.ch)
[Figure labels: LHC, SPS, CERN; scale ~9 km]

11 The LHC Experiments

12 Overview of experiences with LHC data challenges
There was continual evolution throughout 2004, with LCG and the experiments gaining more experience in the development and use of an expanding LCG grid
All experiments had excellent relations with LCG-EIS support: a model for the future support of VOs
Global job efficiencies ranged from 60-80% as experience developed; they must reach 90+% for user analysis, which calls for new middleware developments and tighter operational procedures
Sources of problems and losses
– Site configuration, management and stability
– Data management (especially metadata handling)
– Difficulty monitoring running jobs and the causes of failure
D0 in early 2005 showed that one can run with good efficiency with a set of well controlled sites

13 EGEE pilot application: BioMedical
– Bioinformatics (gene/proteome database distribution)
– Medical applications (screening, epidemiology, image database distribution, etc.)
– Interactive applications (human supervision or simulation)
– Security/privacy constraints
Heterogeneous data formats - Frequent data updates - Complex data sets - Long term archiving
http://egee-na4.ct.infn.it/biomed/applications.html

14 The characteristics of biomedical pilot applications
Prototype level at project day 1
VO was created after the project kicked off
Very decentralized: application developers use the grid at their own pace
Very demanding on services
– Compute intensive applications
– Applications requiring large numbers of short jobs
– Need for interactivity or guaranteed response time
Resources were focused on the deployment of large scale applications on LCG-2
– Integration of the Biomed VO used to identify issues relevant to all VOs to be deployed during the EGEE lifetime
– Decentralized usage of the infrastructure highlights different weaknesses from the more centralized HEP data challenges

15 Status of Biomedical VO
15 resource centres
  • 17 CEs (>750 CPUs)
  • 16 SEs
4 RBs: CNAF, IFAE, LAPP, UPV
RLS, VO LDAP Server: CC-IN2P3
[Map labels: PADOVA, BARI; legend: 4 RBs, 1 RLS, 1 LDAP Server]

16 Biomedical VO: production jobs on EGEE

17 Biomedical applications
– 3 batch-oriented applications ported on LCG2
  • SiMRI3D: medical image simulation
  • xmipp_MLRefine: molecular structure analysis
  • GATE: radiotherapy planning
– 3 high throughput applications ported on LCG2
  • CDSS: clinical decision support system
  • GPS@: bioinformatics portal (multiple short jobs)
  • gPTM3D: radiology images analysis (interactivity)
– New applications to join in the near future
  • Especially in the field of drug discovery

18 EGEE pilot application: BioMedical
– Bioinformatics (gene/proteome database distribution)
– Medical applications (screening, epidemiology, image database distribution, etc.)
– Interactive applications (human supervision or simulation)
– Security/privacy constraints
Heterogeneous data formats - Frequent data updates - Complex data sets - Long term archiving
BioMed applications deployed
– GATE - Geant4 Application for Tomographic Emission
– GPS@ - genomic web portal
– CDSS - Clinical Decision Support System

19 12 Biomed applications
– GATE: Geant4 Application for Tomographic Emission (LPC)
– Docking platform for tropical diseases: grid-enabled docking platform for in silico drug discovery (LPC)
– CDSS: Clinical Decision Support System (UPV)
– GPS@: Grid genomic web portal (IBCP)
– SiMRI 3D: Magnetic Resonance Image simulator (CREATIS)
– gPTM 3D: Interactive radiological image visualization and processing tool (LRI)
– xmipp_ML_refine: Macromolecular 3D structure analysis (CNB)
– xmipp_multiple_CTFs: Electron microscopy images CTF calculation (CNB)
– GridGRAMM: Molecular Docking web (CNB)
– GROCK: Mass screenings of molecular interaction (CNB)
– Mammogrid: Mammograms analysis (EU project)
– SPLATCHE: Genome evolution modeling (U. Berne/WHO)

20 ...and more to come
SPLATCHE
– First application being migrated from GILDA to biomed VO
Pharmacokinetics in MRI (UPV)
– MRI registration for contrast agent diffusion study
Some progress on biological sequences analysis (M. Lexa)...

21 BLAST: comparing DNA or protein sequences
BLAST is the first step in analysing new sequences: it compares DNA or protein sequences with others stored in personal or public databases.
Ideal as a grid application.
– Requires resources to store databases and run algorithms
– Can compare one or several sequences against a database in parallel
– Large user community
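The point that several sequences can be compared against a database in parallel can be illustrated with a minimal sketch: a set of query sequences is split into batches, and each batch could then be submitted as an independent grid job. This is not BLAST itself and not EGEE middleware; `split_into_batches`, the batch size, and the toy sequences are all hypothetical names and values chosen for illustration.

```python
from typing import Iterator, List


def split_into_batches(sequences: List[str], batch_size: int) -> Iterator[List[str]]:
    """Yield consecutive batches of query sequences.

    Each batch is independent of the others, so it could be handed to a
    separate job that runs the sequence comparison against the database.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(sequences), batch_size):
        yield sequences[start:start + batch_size]


# Toy query set (hypothetical data, not real biological input).
queries = ["ATGCGT", "GGCATA", "TTAGGC", "CCGATA", "ATATAT"]
jobs = list(split_into_batches(queries, batch_size=2))

for job_id, batch in enumerate(jobs):
    print(f"job {job_id}: {len(batch)} sequence(s)")
```

Because no batch depends on another, the jobs can run on different sites at the same time, which is what makes this kind of workload a natural fit for a grid.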

22 Bio-medicine applications
Medical images [diagram labels: Exam, image, patient, key, ACL, ...; Metadata; Similar images; Low score images]
1. Query the medical image database and retrieve a patient image
2. Compute similarity measures over the database images (submit 1 job per image)
3. Retrieve most similar cases
Bio-informatics
– Phylogenetics
– Search for primers
– Statistical genetics
– Bio-informatics web portal
– Parasitology
– Data-mining on DNA chips
– Geometrical protein comparison
Medical imaging
– MR image simulation
– Medical data and metadata management
– Mammography analysis
– Simulation platform for PET/SPECT
[Legend: Applications deployed; Applications tested; Applications under preparation]
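The three-step medical image workflow on this slide (retrieve a patient image, score it against every database image, return the most similar cases) can be sketched as follows. The slide does not say which similarity measure is used; cosine similarity over small feature vectors is an assumption made here purely for illustration, and the function names, image identifiers, and data are hypothetical.

```python
import math
from typing import Dict, List, Tuple


def similarity(a: List[float], b: List[float]) -> float:
    """Cosine similarity: a stand-in measure, not the one used by the application."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def most_similar(query: List[float],
                 database: Dict[str, List[float]],
                 top_k: int) -> List[Tuple[str, float]]:
    """Score the query against every image, then keep the top_k matches."""
    # On a grid, each (query, image) comparison could be one independent job.
    scores = [(image_id, similarity(query, features))
              for image_id, features in database.items()]
    scores.sort(key=lambda pair: pair[1], reverse=True)
    return scores[:top_k]


# Toy feature vectors standing in for database images (hypothetical data).
database = {"img_a": [1.0, 0.0], "img_b": [0.9, 0.1], "img_c": [0.0, 1.0]}
query = [1.0, 0.0]  # features of the retrieved patient image

ranked = most_similar(query, database, top_k=2)
for image_id, score in ranked:
    print(image_id, round(score, 3))
```

Scoring is the expensive step and each comparison is independent, which matches the slide's "1 job per image" submission pattern; only the final ranking needs all the scores together.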

23 Bio-medicine applications

24 Bio-medicine applications

25 Bio-medicine applications

26 gPTM3D: Grid-Enabling Interactive Medical Analysis
[Diagram labels: Interaction; Render, Explore, Analyse, Interpret, Acquire]

27 Use case: planning percutaneous nephrolithotomy

28 Evolution of biomedical applications
Growing interest of the biomedical community
– Partners involved proposing new applications
– New application proposals (in various health-related areas)
– Enlargement of the biomedical community (drug discovery)
Growing scale of the applications
– Progressive migration from prototypes to pre-production services for some applications
– Increase in scale (volume of data and number of CPU hours)
Towards pre-production
– Several initiatives to build user-friendly portals and interfaces to existing applications in order to open them to an end-user community

29 A look at the future: the HealthGrid vision
[Diagram labels: INDIVIDUALISED HEALTHCARE; MOLECULAR MEDICINE; Databases; Association; Modelling; Computation; HealthGRID; Computational recommendation; Patient related data; levels: Public Health, Patient, Tissue/organ, Cell, Molecule]
In this context "Health" does not involve only clinical practice but covers the whole range of information, from the molecular level (genetic and proteomic information) through cells and tissues to the individual and finally the population level (social healthcare).

30 Earth Sciences in EGEE
Research
– Earth observations by satellite
  • ESA (IT), KNMI (NL), IPSL (FR), UTV (IT), RIVM (NL), SRON (NL)
– Climate
  • DKRZ (GE), IPSL (FR)
– Solid Earth Physics
  • IPGP (FR)
– Hydrology
  • Neuchâtel University (CH)
Industry
– CGG: Geophysics Company (FR)

31 Climate Applications in EGEE
Model: atmosphere, ocean, hydrology, atmospheric and marine chemistry, …
Goal: comparison of model outputs from different runs and/or institutes
  • Large volume of data (TB) from different model outputs, and experimental data
  • Runs made on supercomputers => link the EGEE infrastructure with supercomputer Grids (DEISA)
Example: for the IPCC Assessment reports, many experiments are performed with different models (different spatial resolution, different time-step, different "physics", ...) and at various sites. The generated data need to be compared in a comprehensive and "unified" way.

32 Geophysics Applications
Seismic processing generic platform:
- Based on Geocluster, an industrial application, to be a starter of the core member VO
- Includes several standard tools for signal processing, simulation and inversion
- Open: any user can write new algorithms in new modules (shared or not)
- Free for academic research
- Controlled by license keys (opportunity to explore license issues at a grid level)
- Initial partners: F, CH, UK, Russia, Norway

33 Flood simulation
[Diagram labels: Sample: Vah river; Geographical Information Systems; Computer vision; Results: flow + water depths]

34 Computational Chemistry: molecular simulator
[Flowchart]
SURFACE: construction of the Potential Energy Surface
DYNAMICS: calculation of dynamical properties
PROPERTIES: calculation of averaged quantities
Good results? no / yes -> end
[Example system: Ar - Benzene]
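The flowchart on this slide describes an iterate-until-good loop over three stages. A minimal sketch of that control flow is given below, assuming that a "no" answer sends the workflow back to the SURFACE stage (the transcript does not preserve where the "no" arrow points). The stage functions are placeholders passed in by the caller; none of the names come from the actual simulator.

```python
from typing import Any, Callable, Tuple


def run_simulator(build_surface: Callable[[int], Any],
                  run_dynamics: Callable[[Any], Any],
                  compute_properties: Callable[[Any], Any],
                  is_good: Callable[[Any], bool],
                  max_rounds: int = 10) -> Tuple[int, Any]:
    """Repeat SURFACE -> DYNAMICS -> PROPERTIES until the results are accepted."""
    for round_number in range(1, max_rounds + 1):
        surface = build_surface(round_number)         # SURFACE stage
        trajectories = run_dynamics(surface)          # DYNAMICS stage
        properties = compute_properties(trajectories) # PROPERTIES stage
        if is_good(properties):                       # "Good results?" -> yes -> end
            return round_number, properties
    raise RuntimeError("no acceptable result within max_rounds")


# Toy stand-ins for the three stages (hypothetical; no chemistry involved).
result = run_simulator(
    build_surface=lambda n: float(n),          # pretend refinement level
    run_dynamics=lambda s: [s, s + 1.0],       # pretend trajectory values
    compute_properties=lambda t: sum(t) / len(t),
    is_good=lambda p: p >= 3.5,
)
print(result)
```

The `max_rounds` cap is an addition for safety in the sketch; the slide itself shows no iteration limit.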

35 The MAGIC telescope
Largest Imaging Air Cherenkov Telescope (17 m mirror dish)
Located on the Canary Island of La Palma (2200 m asl)
Lowest energy threshold ever obtained with a Cherenkov telescope
Aim: detect γ-ray sources in the unexplored energy range 30 (10) -> 300 GeV

36 The MAGIC Physics Program
– AGNs
– SNRs
– Cold Dark Matter
– Pulsars
– GRBs
– Tests of Quantum Gravity effects
– Cosmological γ-Ray Horizon
– Origin of Cosmic Rays

37 Feedback to LCG-2 middleware developers and infrastructure
From HEP applications
– Experiment Integration Support group and Grid Applications Group produced documents summarizing problems encountered in use of LCG-2
From Biomed applications
– Very significant exchanges related to the set-up of the biomed VO and the deployment of relevant services
– Request to use MPI

38 Engineering applications

39 Engineering applications

40 Grid Applications: art
Books are being scanned in at 767 MB per page
– 1/2 Terabyte for Gutenberg Bible
Paintings are being scanned in at 30 GB each in the EU CRISATEL Project
Museo Virtual de Artes El Pais (MUVA): http://www3.diarioelpais.com/muva/

41 Who else can benefit from EGEE?
EGEE Generic Applications Advisory Panel:
– For new applications
EU projects: MammoGrid, Diligent, SEE-GRID, …
Expression of interest: Planck/Gaia (astroparticle), SimDat (drug discovery)
http://agenda.cern.ch/age?a042351
Next meeting at EGEE conference (November)

42 New communities identification
Through training, dissemination and outreach, communities already using advanced computing and keen to use the EGEE infrastructure are identified
These communities are encouraged to prepare a document describing their interest in using EGEE
A scientific advisory panel (EGAAP) assesses the interested communities and chooses those that seem most mature to deploy their applications on EGEE

43 GILDA, an infrastructure for dissemination and demonstration
Goals
– Demonstration of grid operation for tutorials and outreach
– Initial deployment of new applications for testing purposes
Key features
– Initiative of the INFN Grid Project using LCG-2 middleware
– On request, anyone can quickly receive a grid certificate and a VO membership allowing them to use the infrastructure for 2 weeks
– Certificate expires after two weeks but can be renewed
– Use of a friendly interface: Genius grid portal
Very important for the first steps of new user communities onto the grid infrastructure

44 GILDA numbers
– 14 sites in 2 continents
– >1200 certificates issued, 10% renewed at least once
– >35 tutorials and demos performed in 10 months
– >25 jobs/day on average
– Job success rate above 96%
– >320,000 hits on the web site from tens of different countries
– >200 copies of the UI live CD distributed in the world

45 NA4 Applications and GILDA
7 Virtual Organizations supported:
– Biomed
– Earth Science Academy (ESR)
– Earth Science Industry (CGG)
– Astroparticle Physics (MAGIC)
– Computational Chemistry (GEMS)
– Grid Search Engines (GRACE)
– Astrophysics (PLANCK)
Development of complete interfaces with GENIUS for 3 Biomed applications: GATE, hadronTherapy, and Friction/Arlecore
Development of complete interfaces with GENIUS for 4 generic applications: EGEODE (CGG), MAGIC, GEMS, and CODESA-3D (ESR) (see demos!)
Development of complete interfaces with GENIUS for 16 demonstrative applications available on the GILDA Grid Demonstrator (https://grid-demo.ct.infn.it)

46 Summary
EGEE and grids: not just physics
For communities to benefit, they need to know what grids can do for them (dissemination)
Many communities are beginning to adopt the grid
EGEE has a mechanism for assisting communities onto the grid

47 Practical URLs
homepages.nesc.ac.uk/~gcw
grid-demo.ct.infn.it

