INFSO-RI-508833 Enabling Grids for E-sciencE www.eu-egee.org Using Grid Computing to Accelerate Structure-Based Design Against Influenza A Neuraminidases.

1 INFSO-RI-508833 Enabling Grids for E-sciencE www.eu-egee.org Using Grid Computing to Accelerate Structure-Based Design Against Influenza A Neuraminidases Hurng-Chun Lee, Li-Yung Ho, and Ying-Ta Wu* ywu@gate.sinica.edu.tw *Genomics Research Center Academia Sinica, Taiwan EGEE User Forum CERN, 01-03.03.2006

2 Outline: Influenza A Pandemic. [Figure labels: H5N1; H1N1, H2N2, H3N2, H1N1; H9N2, H7N7, H5N1; NA, HA; 2005, 2006.] 92 deaths / 170 cases as of Feb 26, 2006 (http://www.who.int/csr/disease/avian_influenza)

3 Neuraminidases cleave host receptors and help release new virions

4 Neuraminidase and Inhibitors. Structure-Based Drug Design. [Figure labels: Zanamivir R=guanidine; Oseltamivir R=H R’=amine; R’; binding pocket.]

5 Drug-resistant variants and Point Mutation
Columns: Mutation, N1, N2 (column assignment of the drug names was lost in extraction)
– R292K: oseltamivir, Zanamivir
– H274Y(F): oseltamivir
– N294S: oseltamivir?, oseltamivir
– E119V: oseltamivir?, oseltamivir
– E119(G;A;D): oseltamivir?, Zanamivir
Legend (markers lost in extraction): predicted mutation site by structure overlay and sequence alignment; reported mutation site

6 Docking Engine: AutoDock 3.0.5
1. Prepare the Target Protein
– add polar hydrogen atoms
– assign charges to atoms
– decide range of binding site
2. Run AutoGrid
3. Prepare the Ligand
– assign charges to atoms
– decide flexible bonds (run AutoTors)
4. Run AutoDock
5. Evaluate Results and Rank by Score
AutoDock, AutoGrid, AutoTors: Garrett M. Morris, David S. Goodsell, Ruth Huey, William E. Hart, Scott Halliday, Rik Belew, Arthur J. Olson
Morris et al. (1998), J. Computational Chemistry, 19: 1639-1662.
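Steps 2 and 4 of the workflow above can be sketched as a small driver script. This is a minimal sketch, not the authors' code: it assumes the AutoDock 3 executables are installed as `autogrid3` and `autodock3` and take a parameter file with `-p` and a log file with `-l`. The file names `macro.gpf` and `ligand.dpf` are hypothetical, and the preparation steps (1 and 3) are assumed to be done already.

```python
import subprocess


def build_commands(macro_gpf, ligand_dpf):
    """Return the AutoGrid and AutoDock command lines for one docking task."""
    # Log files share the base name of their parameter file (an assumption).
    glg = macro_gpf.rsplit(".", 1)[0] + ".glg"
    dlg = ligand_dpf.rsplit(".", 1)[0] + ".dlg"
    return [
        ["autogrid3", "-p", macro_gpf, "-l", glg],   # step 2: grid maps
        ["autodock3", "-p", ligand_dpf, "-l", dlg],  # step 4: docking
    ]


def run_docking(macro_gpf, ligand_dpf, dry_run=True):
    """Run (or, with dry_run=True, only print) the two commands in order."""
    commands = build_commands(macro_gpf, ligand_dpf)
    for cmd in commands:
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)  # stop on the first failure
    return commands


# Hypothetical input file names, printed rather than executed.
commands = run_docking("macro.gpf", "ligand.dpf")
```

With `dry_run=True` the script only prints the command lines, so it can be checked on a machine without AutoDock installed.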

7 Application Characteristic
Virtual screening based on molecular docking is the most time-consuming part of the structure-based drug design workflow
Number of docking tasks = N x M
– N: number of ligands
– M: number of target structures
CPU-bound application with a huge amount of output and no communication between tasks
Task complexity is unpredictable
– difficult to split the tasks with a trivial domain decomposition method
The pitiful …
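The task count N x M above is simply the size of the Cartesian product of ligands and target structures. A minimal sketch, with hypothetical ligand and target names and example sizes of 100 ligands and 5 target structures:

```python
from itertools import product


def make_tasks(ligands, targets):
    """Pair every ligand with every target structure: one docking task each."""
    return list(product(ligands, targets))


# Hypothetical identifiers; only the counts matter here.
ligands = ["ligand_%03d" % i for i in range(100)]     # N = 100
targets = ["conformation_%d" % j for j in range(5)]   # M = 5

tasks = make_tasks(ligands, targets)
print(len(tasks))  # N x M = 500
```

Because the tasks are independent, the list can be handed to any scheduler in any order.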

8 Issues of Grid applications
Because the Grid is loosely coupled, distributing application jobs on it is not trivial
– extra work is needed for efficient job handling and result gathering
– effort is also needed to handle transient network or site problems
– complexity should be hidden, and the end-user interface should be application oriented
The significant Grid system overhead means the Grid only benefits jobs with long computing times
– not suitable for pilot jobs used for decision making

9 What is DIANE?
DIANE = Distributed Analysis Environment
A lightweight framework for parallel scientific applications in the master-worker model
– ideal for applications without communication between parallel tasks (e.g. most bioinformatics applications that analyze huge numbers of independent datasets)
The framework takes care of all synchronization, communication and workflow management details on behalf of the application
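The master-worker model described above can be illustrated with a few lines of standard-library Python. This is a generic sketch and does not use or reproduce the DIANE API: the function names, the thread-based workers, and the `fake_docking` stand-in are all assumptions made for illustration.

```python
import queue
import threading


def worker(task_queue, results, lock, run_task):
    """Pull tasks until the queue is empty; no worker talks to another."""
    while True:
        try:
            task = task_queue.get_nowait()
        except queue.Empty:
            return
        outcome = run_task(task)
        with lock:
            results[task] = outcome


def run_master(tasks, run_task, n_workers=4):
    """The master fills the queue, starts workers, and gathers the results."""
    task_queue = queue.Queue()
    for task in tasks:
        task_queue.put(task)
    results, lock = {}, threading.Lock()
    threads = [
        threading.Thread(target=worker,
                         args=(task_queue, results, lock, run_task))
        for _ in range(n_workers)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


def fake_docking(task):
    """Hypothetical stand-in for one docking run."""
    ligand, target = task
    return len(ligand) + len(target)


demo_tasks = [("lig%d" % i, "conf%d" % j) for i in range(4) for j in range(2)]
results = run_master(demo_tasks, fake_docking)
```

Workers pull the next task whenever they become free, so faster workers naturally take more tasks; this suits workloads whose per-task cost is unpredictable.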

10 Distributing AutoDock tasks on the Grid using DIANE

11 DIANE/AutoDock
A generic framework into which applications can easily plug in
autodock.job (application-specific job attributes and job-level failure recovery definition):
    # -*- python -*-
    Application = 'Autodock'
    JobInitData = {'macro_repos' :'/home/hclee/diane_demo/autodock/macro',
                   'ligand_repos':'/home/hclee/diane_demo/autodock/ligand',
                   'ftprotocol':'gass',
                   'output_prefix':'autodock_test'
                  }
    ## The input files will be staged in to workers
    InputFiles = []
    ## The definition of failure recovery
    def failRecovery(self):
        print '*'*30
        for t in self.master.tasks.failed():
            print "ignoring failed task:",t
            t.ignore()
        print '*'*30
        return 1
Intuitive job execution command (possible to mix heterogeneous computing backends):
    % diane.startjob –-job autodock.job –ganga –w 32@lcg,32@pbs

12 DIANE/AutoDock – integrated user interface

13 Performance Evaluation
Test case
– 5 targets: 1 protein, 5 conformations
– ligands: 100 small compounds (with 7 positives)
– 500 docking tasks in total
Test environment
– DIANE backend handler: SSH
– Hardware: traditional PC cluster with NFS (2 x Intel Xeon 2.8 GHz + 2 GB memory per node)
– Grid: LCG

14 Test Results: DIANE/AutoDock framework on Cluster. Duration time: total elapsed time of a DIANE job. Each DIANE job contains 500 tasks (5 protein conformations x 100 compounds).

15 Test Results: DIANE/AutoDock framework on Cluster. Handling docking jobs on a traditional PC cluster. [Figure annotations: good load balance; a DIANE/AutoDock task.]

16 DIANE/AutoDock framework on LCG-GRID [figure annotation: terminated]

17 Without redundant scheduling

18 With redundant scheduling: job was reassigned to other nodes
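The slide does not say how the reassignment is implemented. One possible reading is that a task whose node fails is put back in the queue and handed to a different node. The sketch below simulates that idea only; the round-robin policy, the `is_healthy` check, the node names, and the retry limit are all hypothetical and are not taken from DIANE.

```python
from collections import deque


def schedule_with_reassignment(tasks, workers, is_healthy, max_attempts=3):
    """Assign tasks round-robin; requeue a task when its worker is unhealthy."""
    pending = deque((task, 0) for task in tasks)
    assignment, failed = {}, []
    turn = 0
    while pending:
        task, attempts = pending.popleft()
        node = workers[turn % len(workers)]
        turn += 1
        if is_healthy(node):
            assignment[task] = node
        elif attempts + 1 < max_attempts:
            # Put the task back so a later turn can give it to another node.
            pending.append((task, attempts + 1))
        else:
            failed.append(task)  # give up after max_attempts tries
    return assignment, failed


# Hypothetical nodes; "node-b" plays the role of a terminated worker.
workers = ["node-a", "node-b", "node-c"]
assignment, failed = schedule_with_reassignment(
    range(6), workers, lambda node: node != "node-b"
)
```

In this simulation every task that first lands on the failed node ends up on a healthy one, so no task is lost.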

19 Compound library enrichment
AutoDock parameters:
– translation step = 2.0 Å
– quaternion step = 20 degrees
– torsion step = 20 degrees
– number of energy evaluations = 1.5 × 10^6
– max. number of generations = 2.7 × 10^4
– number of runs = 10
[Figure: red = positives.] All positives were docked within RMSD < 1.5 Å
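The parameters above map onto keywords of an AutoDock docking parameter file (DPF). A minimal sketch that renders just those six lines, assuming the AutoDock 3 keyword names `tstep`, `qstep`, `dstep`, `ga_num_evals`, `ga_num_generations` and `ga_run`; a real DPF needs many more entries (maps, ligand file, search settings) that the slide does not list.

```python
# Values from the slide; the keyword names are assumed to be the
# AutoDock 3 DPF keywords for these settings.
PARAMS = [
    ("tstep", "2.0"),                 # translation step, in Angstrom
    ("qstep", "20.0"),                # quaternion step, in degrees
    ("dstep", "20.0"),                # torsion step, in degrees
    ("ga_num_evals", "1500000"),      # 1.5 x 10^6 energy evaluations
    ("ga_num_generations", "27000"),  # 2.7 x 10^4 generations
    ("ga_run", "10"),                 # number of docking runs
]


def render_dpf_fragment(params):
    """Render keyword/value pairs as one DPF line each."""
    return "\n".join("%s %s" % (key, value) for key, value in params)


fragment = render_dpf_fragment(PARAMS)
print(fragment)
```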

20 Probe effects due to minor changes in the target’s binding sites

21 Summary
Modeling compound-protein complexes can be sped up by distributing molecular docking processes on the Grid.
With the DIANE framework, distributing molecular docking tasks on the Grid is easy to implement and gives the end user an intuitive interface.
The DIANE framework also provides the functionality to tune the system to tackle the issues that arise when distributing molecular docking tasks on the loosely coupled Grid.
This simple test case demonstrated that huge compound databases can be effectively enriched by executing docking tasks on the Grid.
However, more resources are required to build a real HTP docking service for the life science community.

22 Acknowledgements: Li-Yung Ho, Hurng-Chun Lee, Hsing-Yen Chen, Dr. Simon Lin, Jakub Moscicki, Dr. Massimo Lamanna. Support from the Genomics Research Center, Academia Sinica, and the National Science Council, Taiwan, is highly appreciated. LCG-ARDA, CERN.

23 Interacting Complexes: a key step to structure-based inhibitor design (PDB 1F8B)

