Plateforme de Calcul pour les Sciences du Vivant Addressing neglected and emerging diseases on the grid: new results and prospects.


1 Plateforme de Calcul pour les Sciences du Vivant http://clrpcsv.in2p3.fr
Addressing neglected and emerging diseases on the grid: new results and prospects
Vincent Breton, LPC Clermont-Ferrand, IN2P3/CNRS
Credits: V. Kasam, D. Kim

2 Plateforme de Calcul pour les Sciences du Vivant V. Breton CC-IN2P3, 6/12/07
Grids in the life sciences: what is at stake and why they matter
The stakes
– The data avalanche has overturned research strategies in molecular biology
– Medicine must evolve toward an exact science that exploits all available data, from genomics to epidemiology
What grids contribute
– The grid already provides the centuries of CPU cycles required for massive computations
– The grid already provides secure management services to store and copy biological and medical data
– The grid will eventually offer the collaborative environment for integrating and sharing data within research communities

3 How are grids used in the life sciences today?
To deploy very large-scale computations
– Bioinformatics example: refinement of PDB structures (Embrace – EGEE, LPC Clermont-Fd)
– Medical imaging example: MRI simulator (EGEE, CREATIS)
– Drug discovery example: WISDOM (AuverGrid – EGEE – BioinfoGRID, LPC Clermont-Fd)
For interactive analysis of ever larger data sets
– Bioinformatics example: GPS@ portal (EGEE, IBCP)
– Medical imaging example: GPTM3D (AGIR – EGEE, LRI - LAL)
To pool new services and expertise
– Regional-scale bioinformatics example: LifeGrid (AuverGrid)
– Medical imaging example: Bronze Standard (AGIR – EGEE, I3S)
– Drug discovery example: WISDOM (AuverGrid – EGEE, LPC Clermont-Fd)

4 Bioinformatics: recalculating protein 3D structures in PDB
The PDB database gathers publicly available 3D protein structures
– Full of bugs
Project: redo the structures by recalculating the diffraction patterns
Credit: G. Vriend, CMBI
PDB files: 42,752
X-ray structures: 36,124
Successfully recalculated: ~36,000
Improved R-free: 12,500/17,000
CPU time estimate: 21.7 CPU years
Real time estimate: 1 month on the Embrace Virtual Organization on EGEE
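The two time estimates on this slide imply how many processors were busy on average. A minimal sketch of that arithmetic, assuming "1 month" means 1/12 of a year and ignoring queueing and scheduling overhead (the function name is illustrative, not part of any grid tool):

```python
def average_concurrent_cpus(cpu_years: float, wall_clock_years: float) -> float:
    """Average number of CPUs kept busy to deliver cpu_years within wall_clock_years."""
    if wall_clock_years <= 0:
        raise ValueError("wall_clock_years must be positive")
    return cpu_years / wall_clock_years


# Figures from the slide: 21.7 CPU years delivered in about one month.
cpus = average_concurrent_cpus(21.7, 1 / 12)
print(f"about {cpus:.0f} CPUs busy on average")
```

Under these assumptions the recalculation kept roughly 260 processors busy for the whole month.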

5 Table of contents
Rationale for in silico drug discovery on tropical / neglected diseases
WISDOM, what is being accomplished
From grid computing to knowledge management
– The issues
– The requirements
– The first steps
Grid-enabled virtual screening on avian flu: short status

6 Table of contents
Rationale for in silico drug discovery on neglected diseases
WISDOM, what is being accomplished
From grid computing to knowledge management
– The issues
– The requirements
– The first attempts
Grid-enabled virtual screening on avian flu: short status

7 Burden of diseases in the developing world, 2002

8 Major need to improve information collection, analysis and accessibility
Find new drugs and vaccines
– Improve collection of epidemiological data for research
– Speed up the drug discovery process
– Improve the deployment of clinical trials in plagued areas
Improve disease monitoring and drug delivery
– Monitor the impact of policies and programs
– Monitor drug delivery and vector control
– Improve epidemic warning and monitoring systems
Improve the ability of African countries to undertake health innovation
– Strengthen the integration of African life science research laboratories in the world community
– Provide access to resources
– Provide access to services

9 Grid added value
Grids offer unprecedented opportunities for resource sharing and collaboration
Grids open exciting perspectives to handle the information flows needed to fight neglected diseases
– Deployment of services for healthcare and research centers in endemic regions
– Deployment of infrastructures (federation of databases) to collect bio/chemical/medical data and improve disease monitoring
– Cross-organizational collaboration space to share data and resources
Challenges
– Grid technology must provide the services for data and knowledge management
– IT expertise and willingness to share information are needed from the participating healthcare centers
– Infrastructure capacity building in Africa

10 Table of contents
Rationale for in silico drug discovery on neglected diseases
WISDOM, what is being accomplished
From grid computing to knowledge management
– The issues
– The requirements
– The first steps
Grid-enabled virtual screening on avian flu: short status

11 Introduction to WISDOM
WISDOM stands for World-wide In Silico Docking On Malaria
Goal: find new drugs for tropical / neglected / emerging diseases
– Tropical diseases lack R&D
– Need to develop collaborative knowledge spaces to pool efforts and results
Method: use of grid technology
– First step: grid-enabled virtual screening (cheaper and faster than in vitro tests)
– Next steps: data integration

12 Grid-enabled virtual docking
Millions of potential drugs to test against interesting proteins!
High Throughput Screening: 1-10 $/compound, several hours. Too costly for neglected diseases!
Molecular docking (FlexX, Autodock): ~1 to 15 minutes. Cheap and fast!
– Targets: PDB (3D structures)
– Compounds: ZINC (4.3M), Chembridge (500,000)
Data challenge on EGEE: ~2 to 30 days on ~5000 computers
Pipeline: selection of the best hits → hits → screening using assays performed on living cells → leads → clinical testing → drug
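The figures on this slide can be combined into a rough wall-clock estimate for one docking campaign. The sketch below is an idealised calculation, assuming perfect load balancing, no job failures and no data-transfer time; the function and its parameters are illustrative and do not describe the WISDOM production environment.

```python
def docking_wall_days(n_compounds: int, minutes_per_docking: float,
                      n_computers: int, n_targets: int = 1) -> float:
    """Idealised wall-clock days to dock every compound against every target."""
    if n_computers <= 0:
        raise ValueError("n_computers must be positive")
    total_cpu_minutes = n_compounds * n_targets * minutes_per_docking
    return total_cpu_minutes / n_computers / (60 * 24)


ZINC_COMPOUNDS = 4_300_000  # library size quoted on the slide
COMPUTERS = 5000            # approximate number of computers quoted on the slide

fast = docking_wall_days(ZINC_COMPOUNDS, 1, COMPUTERS)   # 1 minute per docking
slow = docking_wall_days(ZINC_COMPOUNDS, 15, COMPUTERS)  # 15 minutes per docking
print(f"one target: {fast:.1f} to {slow:.1f} days")
```

For a single target this gives between about 0.6 and 9 days. Several targets or docking configurations, plus real-world overhead, would stretch that toward the longer durations quoted on the slide; `n_targets` scales the estimate linearly.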

13 Virtual screening workflow
Molecular docking (FLEXX, AUTODOCK) → molecular dynamics (AMBER) → re-ranking (MMPBSA-GBSA) → complex visualization (CHIMERA) → in vitro tests (wet laboratory)
Compounds remaining along the workflow: millions → 5000 → 180 → 30
[Figure: ligand bound to the catalytic aspartic residues, with 2 and 4 hydrogen bonds]
Credit: D. Kim
Supporting projects: EGEE, Embrace, BioinfoGRID
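The narrowing from millions of compounds to 30 can be pictured as a sequence of "score, then keep the best k" steps. The sketch below is only an illustration of that funnel shape: the scores are random stand-ins, the convention that lower is better is an assumption, and the slide does not say that each selection was a strict top-k cut (expert inspection may well have been involved).

```python
import heapq
import random


def keep_best(candidates, score, keep):
    """Return the `keep` candidates with the lowest score (assumed: lower is better)."""
    return heapq.nsmallest(keep, candidates, key=score)


def run_funnel(compounds, stages):
    """Apply (score_function, keep) stages in order; record survivors per stage."""
    survivors = list(compounds)
    history = []
    for score, keep in stages:
        survivors = keep_best(survivors, score, keep)
        history.append(len(survivors))
    return survivors, history


# Hypothetical data: 100,000 compound ids with random scores per stage.
rng = random.Random(0)
compounds = list(range(100_000))
docking_score = {c: rng.random() for c in compounds}
dynamics_score = {c: rng.random() for c in compounds}
rerank_score = {c: rng.random() for c in compounds}

stages = [
    (docking_score.get, 5000),   # after molecular docking
    (dynamics_score.get, 180),   # after molecular dynamics
    (rerank_score.get, 30),      # after re-ranking, sent to in vitro tests
]
final, history = run_funnel(compounds, stages)
print(history)  # [5000, 180, 30]
```

The stage sizes are the ones shown on the slide; each later stage only re-scores the survivors of the previous one, which is what keeps the expensive steps affordable.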

14 The grid added value
– Univ. Los Andes (Venezuela): biological targets, malaria biology
– LPC Clermont-Ferrand (F): biomedical grid
– SCAI Fraunhofer (D): knowledge extraction, chemoinformatics
– Univ. Modena (It): biological targets, molecular dynamics
– ITB CNR (It): bioinformatics, molecular modelling
– Univ. Pretoria / CSIR (RSA): bioinformatics, malaria biology
– Academia Sinica (Taiwan): grid user interface, biological targets, in vitro testing
– HealthGrid (Int): biomedical grid, dissemination
– CEA, Acamba project (F): biological targets, chemogenomics
– Chonnam Nat. Univ. (Korea): in vitro testing
– Mahidol Univ. (Thailand): biochemistry, in vitro testing
– KISTI (Korea): grid technology
The grid provides the centuries of CPU cycles required on demand
The grid provides the reliable and secure data management services to store and replicate the biochemical inputs and outputs
The grid offers a collaborative environment for the sharing of data in the research community on avian flu and malaria

15 WISDOM data challenges
Dates | Target(s) | CPU consumed (EGEE, AuverGrid) | Data produced | Specific features | Status
Summer 2005 | Malaria: plasmepsins | 80 years | 1 TB | First data challenge | In vitro tests, in vivo tests
Spring 2006 | Avian flu: neuraminidase N1 | 100 years* | 800 GB* | Only 45 days needed for preparation | In vitro tests
Winter 2006 | Malaria: GST, DHFR, tubulin | 400 years | 1.6 TB | > 100,000 dockings/hr | Under analysis
Fall 2007 | Avian flu: neuraminidase N1 | Estimated 100 CPU years* | Estimated 800 GB* | Joint deployment on CNGrid | Data challenge under way
Winter 2007 | Malaria: DHPS | To be estimated | | Joint deployment on desktop grid | In preparation
*: use of DIANE/GANGA and WISDOM production environments
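To give a sense of the cumulative scale, the three completed data challenges can be summed. A small sketch using only the figures listed on this slide; the `DataChallenge` record is an illustrative structure, and the estimated or pending campaigns are deliberately left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataChallenge:
    period: str
    target: str
    cpu_years: float
    data_tb: float


# Completed campaigns only (800 GB written as 0.8 TB).
completed = [
    DataChallenge("Summer 2005", "Malaria: plasmepsins", 80, 1.0),
    DataChallenge("Spring 2006", "Avian flu: neuraminidase N1", 100, 0.8),
    DataChallenge("Winter 2006", "Malaria: GST, DHFR, tubulin", 400, 1.6),
]

total_cpu_years = sum(dc.cpu_years for dc in completed)
total_data_tb = sum(dc.data_tb for dc in completed)
print(f"{total_cpu_years:.0f} CPU years, {total_data_tb:.1f} TB")
```

This gives 580 CPU years and about 3.4 TB of output for the first three campaigns.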

16 Malaria: in vitro test results (I/II)
Plasmepsin assay
– Plasmepsin activation: rPMII + assay buffer (pH 4.5) + 1 μl inhibitor (100 nM) → 37 °C, 30 min induction
– FRET assay
  1) Activated enzyme (75 ng) + 3 μM substrate peptide (50 μl final volume) → 30 min incubation at room temperature
  2) Measurement using a fluorescence microplate reader (excitation 405 nm, emission 510 nm) → detection of fluorescence spectra or UV spectra
I | FI | Rank
NI | 3418 |
PA | 1059 |
1 | 2200 | 23
2 | 2074 | 22
3 | 2981 | 27
4 | 1439 | 16
5 | 2485 | 26
6 | 1297 | 11
7 | 1230 | 9
8 | 1402 | 13
9 | 3534 | 30
10 | 3430 | 28
11 | 1531 | 17
12 | 1808 | 21
13 | 2209 | 24
14 | 1025 | 2
15 | 1760 | 20
16 | 1214 | 8
17 | 1046 | 4
18 | 1173 | 7
19 | 1059 | 5
20 | 1026 | 3
21 | 1016 | 1
22 | 2322 | 25
23 | 1414 | 14
24 | 1642 | 19
25 | 1253 | 10
26 | 1341 | 12
27 | 1074 | 6
28 | 1426 | 15
29 | 1631 | 18
30 | 3499 | 29
I – inhibitors; NI – w/o inhibitor; PA – w/ pepstatin A (reference)
* Red-valued inhibitors contain their own fluorescence.
→ Similar or better inhibition: 6/30 compounds [21, 14, 20, 17, 19, (27)]
Credit: D. Kim, Y. Kim et al., Chonnam Univ.
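The Rank column follows from sorting the FI values in ascending order, and the shortlist corresponds to compounds whose FI does not exceed that of the pepstatin A reference. The sketch below reproduces both from the values on the slide. It assumes that a lower FI means stronger inhibition, which is consistent with the published ranks but is not stated on the slide, and it ignores the slide's caveat about compounds with their own fluorescence.

```python
# FI readings per inhibitor number, transcribed from the slide's table.
FI = {
    1: 2200, 2: 2074, 3: 2981, 4: 1439, 5: 2485, 6: 1297, 7: 1230, 8: 1402,
    9: 3534, 10: 3430, 11: 1531, 12: 1808, 13: 2209, 14: 1025, 15: 1760,
    16: 1214, 17: 1046, 18: 1173, 19: 1059, 20: 1026, 21: 1016, 22: 2322,
    23: 1414, 24: 1642, 25: 1253, 26: 1341, 27: 1074, 28: 1426, 29: 1631,
    30: 3499,
}
FI_REFERENCE = 1059     # PA: with pepstatin A
FI_NO_INHIBITOR = 3418  # NI: without inhibitor

# Rank 1 is the lowest reading; ties (none here) would break on inhibitor number.
ranked = sorted(FI, key=lambda inhibitor: (FI[inhibitor], inhibitor))
rank = {inhibitor: position for position, inhibitor in enumerate(ranked, start=1)}

# Compounds at least as good as the reference under the lower-is-better assumption.
as_good_as_reference = [i for i in ranked if FI[i] <= FI_REFERENCE]
print(as_good_as_reference)  # [21, 14, 20, 17, 19]
```

Compound 27 (FI 1074) falls just above the reference value, which matches its appearance in parentheses in the slide's list of six.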

17 Malaria: in vitro test results (II/II)
→ No detection of hemoglobin degradation fragments except for inhibitors 5, 9, 13, 21, 23, 24, 29
[Gel image lanes: H, C, P and inhibitors 1 to 30]
P – pepstatin A; C – w/o inhibitor; H – hemoglobin only
1. Recombinant plasmepsin II (75 ng) + assay buffer (pH 4.5) + 1 μl inhibitor → 37 °C, 30 min incubation (pre-activation) (500 μM)
2. rPMII-inhibitor complex + human hemoglobin (10 μg) → 37 °C, 3 hr incubation
3. Digestion was terminated by addition of SDS-PAGE loading dye
4. SDS-PAGE analysis on a 15% polyacrylamide gel
→ Further in vivo tests currently under way
Credit: D. Kim, Y. Kim et al., Chonnam Univ.

18 Involving new computing infrastructures
Joint deployment on e-infrastructures not running gLite
– Next malaria Data Challenge on Thai Grid
Joint deployment on desktop grids
– Next malaria DC will use EGEE, its related project infrastructures and a volunteer computing grid, Africa@home
Volunteer computing vs EGEE-like grids
– Very large computing resources
– Poor security and data management

19 Table of contents
Rationale for in silico drug discovery on neglected diseases
WISDOM, what is being accomplished
From distributed computing to knowledge management
– The issues
– The requirements
– The first steps
Grid-enabled virtual screening on avian flu: short status

20 What is the goal?
– The capture, analysis and interpretation of knowledge generated regarding the physiology and pathophysiology related to disease stage or toxicological targets
– The capture, analysis and interpretation of knowledge generated for one potential drug candidate from discovery, non-clinical and clinical development all the way to lifecycle management
Source: Innovative Medicines Initiative (IMI) Strategic Research Agenda

21 What are the requirements?
– Capacity to search, query, extract, integrate and share data in a scientifically and semantically consistent manner across heterogeneous sources (public and proprietary) ranging from chemical structures and “omics” to clinical trial data
– Capacity to integrate and share scientific tools (e.g., modelling, simulation) as modules in a generic framework and apply them to relevant dynamic data sets
– Expressive data representation and exchange standards
– Dynamic and customizable configuration of applications
– Encapsulation of validated physiological models, when applicable
– Flexible, secure (covering all aspects of data protection encountered in a biomedical context), and scalable IT infrastructure
Source: Innovative Medicines Initiative (IMI) Strategic Research Agenda

22 Grids are the answer, provided technical challenges are overcome
Distributed data integration and computing
– Security
– Performance
– Usability
Standards
– Need for reference implementations of standard grid services
– Lack of connection between medical informatics standards and grid standards (e.g. grid-enabled DICOM)
– Lack of standard open source ontologies in medical informatics
Grid deployment in medical research centres
– Easy installation of secure grid nodes
– Friendly user interface
SHARE roadmap: http://share.healthgrid.org

23 The technology to build knowledge grids is not yet mature
SHARE roadmap: http://share.healthgrid.org

24 How to progress?
Grid technology does not yet allow knowledge grids to be built
– The computing and data grid pillars are not strong enough yet
On the other hand, a number of services are now available
– Distributed computing
– Secured data management
– Distributed data repositories
WISDOM steps
– Step 1: distributed virtual screening
– Step 2: improved security features for virtual screening input and output data
– Step 3: distributed data repositories

25 WISDOM Secured Production Environment

26 Building a grid of data repositories
– Chemical data: focussed compound libraries
– Genomics data on parasite and targets
– Epidemiological data: patient information
– Geographical data: maps
[Diagram labels: Private / Public, repeated for each repository]

27 From distributed data repositories to a chemogenomic space
Chemogenomic knowledge space
Goals:
– comparison of protein sequences
– high throughput reconstruction of molecular phylogeny
– representation of biological processes, particularly metabolic pathways
– integration of genomic data, biological representations and functional profiling after drug treatments
– determination and prediction of protein structures
– virtual docking with drug candidate structures
Source: Birkholtz L.-M. et al., Malaria Journal, 2006

28 Table of contents
Rationale for in silico drug discovery on neglected diseases
WISDOM, what is being accomplished
From distributed computing to knowledge management
– The issues
– The requirements
– The first steps
Grid-enabled virtual screening on avian flu: short status

29 Searching for new drugs against avian flu
Objectives:
– Study the impact of mutations of neuraminidase N1 on the efficacy of current drugs (Tamiflu)
– Identify new active molecules
Method:
– Computer calculation of the binding probabilities of molecules on the mutated neuraminidase
Experimental results
– 20% of the 300 molecules selected in silico and tested in vitro are more active than Tamiflu
→ Major improvement factor in the results of in vitro tests
[Figure labels: N1, H5; worldwide distribution of the resources mobilized, for a total of 100 CPU years]
Credit: Y-T Wu – D. Kim

30 Avian flu: in vitro test results
Neuraminidase from Influenza A and B virus: 62/308 compounds, 20.1% (higher inhibition activity compared to Tamiflu)
Neuraminidase from pathogenic bacteria: 0/169 compounds, 0%
Credit: D. Kim, Y. Kim et al., Chonnam Univ.
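The percentages on this slide are plain hit rates. A short check of the arithmetic, with an illustrative helper name:

```python
def hit_rate_percent(hits: int, tested: int) -> float:
    """Share of tested compounds that counted as hits, in percent."""
    if tested <= 0:
        raise ValueError("tested must be positive")
    return 100 * hits / tested


print(f"{hit_rate_percent(62, 308):.1f}%")  # 20.1%
print(f"{hit_rate_percent(0, 169):.1f}%")   # 0.0%
```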

31 Prospects on emerging diseases
Ambitious CNRS science policy in Asia
Grids offer an infrastructure for developing collaborations
– Integration of Asian teams into large international projects (LHC)
– Integration of Asian teams into virtual organizations (biomedical)
Many actions undertaken in 2007
– Grid school (ACGRID, Vietnam, Nov. 2007)
– Participation in International Associated Laboratories (LIA) → China, Korea, Japan, Vietnam
Project for an avian flu surveillance and warning system
– Collaboration between CNRS (IN2P3, EDD) and Asian laboratories (Vietnam, China, Korea, …)
– GDRi being set up (infectious diseases – biodiversity)
[Photo: ACGRID school, Nov. 2007, Hanoi]

32 Conclusion
Grids open new perspectives for supporting R&D on tropical and emerging diseases
Grids can already contribute significantly to in silico drug discovery
– WISDOM large-scale virtual screening routinely deployed on grid infrastructures
Technology is not yet ready to build a distributed knowledge space
– Our approach: progressive integration of data and knowledge management tools in distributed data repositories
– Focus on malaria and avian flu

33 Credits
Development of the WISDOM environment
– ASGC: Yu-Hsuan Chen, Li-Yung Ho, Hurng-Chun Lee
– ITB-CNR: G. Trombetti
– CNRS-IN2P3: V. Bloch, M. Diarena, J. Salzemann
– HealthGrid: B. Grenier, N. Spalinger, N. Verhaeghe
Biochemical preparation and analysis
– ASGC: Y-T Wu
– Chonnam National University: D. Kim et al.
– CNRS-IN2P3: A. Da Costa, V. Kasam
– ITB-CNR: L. Milanesi et al.
Projects supporting WISDOM
– Projects providing human resources: BioinfoGRID, EGEE, Embrace
– Projects providing computing resources: AuverGRID, EELA, EGEE, EUMedGRID, EUChinaGRID, TWGrid

34 Conclusion
WISDOM data challenges are now routinely deployed on EGEE and its related infrastructures
– In vitro results on malaria and avian flu are very exciting
– Requests to dock targets on malaria, TB, AIDS and schistosomiasis
– Need to find resources to run it as a service to the research communities on tropical diseases
Joint deployments are now being tested
– Other grid infrastructures: CNGrid, Thai Grid
– Volunteer grids: Africa@home, World Community Grid
The next step: data integration
– Federation of bird flu data repositories proposed in EUChinaGRID-II
WISDOM is an open collaboration
– New partners are welcome
– SourceForge repository of software
– No IP claimed on the molecules selected in silico

35 Distributed data repositories
– Chemical data: focussed compound libraries
– Genomics data on parasite and targets
– Epidemiological data: patient information
– Geographical data: maps
[Diagram labels: Private / Public repositories; Portal A, Services A (Country A); Portal B, Services B (Country B); Portal C, Services C (Country C)]

