1 VS Explorer – Analyzing large scale docking experiments
ChemAxon 2005 User Group Meeting
Marc Zimmermann, Martin Hofmann

2 Page 2 Marc Zimmermann, 2005 ChemAxon UGM05
Selection of Potential Drugs
- 28 million compounds currently known
- Drug company biologists screen up to 1 million compounds against a target using ultra-high-throughput technology
- Chemists select compounds for follow-up
- Chemists work on these compounds, developing new, more potent compounds
- Pharmacologists test compounds for pharmacokinetic and toxicological profiles
- 1-2 compounds are selected as potential drugs

3 High Volume Screening Analysis – the Methods
[Diagram labels: Screening (vHTS: similarity, docking; HTS), Clustering (active, inactive), Assembling, Filtering, Modeling]
Virtual Screening – computational or in silico analog of biological screening
  o Score, rank, and/or filter a set of structures using one or more computational procedures
  o Helps to decide:
      - Which compounds to screen
      - Which libraries to synthesize
      - Which compounds to purchase from an external source

4 High Volume Screening Analysis – the Tools at SCAI
[Diagram labels, steps: Screening, Clustering, Assembling, Filtering, Modeling]
[Diagram labels, tools: HTSview, VS Explorer, DB Annotator, FTrees, FlexX, GRID Layer, ProMiner, TopNet]

5 Linking Chemistry and Biology
[Diagram labels: DBAnnotator, DB, ProMiner, TopNet, FTrees, Pharm CDB bio QSAR TMDB, VS Explorer, Data Mining, ChEBI, KEGG, Brenda, EC, MeSH, Text-documents, Reports, Expert]

6 Computational Aspects of Drug Discovery: Virtual Screening
- Enable scientists to quickly and easily find compounds binding to a particular target protein
  o growth in the number of targets
  o growth in 3D structure determination (PDB database)
  o growth in computing power
  o growth in the prediction quality of protein-compound interactions
- Experimental screening is very expensive: not feasible for academic groups or small companies
- Aim: Active molecules / Tested molecules

7 In silico drug discovery process (EGEE, Swissgrid, …)
Grids for neglected diseases and diseases of the developing world
The grid impact:
- Computing and storage resources for genomics research and in silico drug discovery
- Cross-organizational collaboration space to progress research work
- Federation of patient databases for clinical trials and epidemiology in developing countries
Support to local centres in plagued areas (genomics research, clinical trials and vector control)
[Diagram labels: Clermont-Ferrand, SCAI Fraunhofer, Swiss Biogrid consortium, Local research centres in plagued areas]

8 Structure-Based Virtual Screening
Protein-Ligand Docking
  o Aims to predict 3D structures when a molecule docks to a protein
      - Need a way to explore the space of possible protein-ligand geometries (poses)
      - Need to score or rank the poses
  o Problem: many degrees of freedom (rotation, conformation, solvent effects)
[Diagram labels: Ligand database, Target Protein, Molecular docking, Ligand docked into protein's active site]

9 Grid VS Results Browser
- Quick overview on very large log-files
- Sorting and merging of files
- Storing and retrieval in databases
- Similarity searches and property predictions
- Interface to R statistics box
- Prototype is under construction

Sample docking log:
concat('ZINC', lpad(p.sub_id_fk,8,'0')) | target | ligand | conformations || score || time
ZINC | 1cet | ZINC | 172 || || 3.25
ZINC | 1cet | ZINC | 203 || || 3.84 s
ZINC | 1cet | ZINC | 241 || || s
ZINC | 1cet | ZINC | 399 || || 7.41 s
ZINC | 1cet | ZINC | 272 || || 2.44 s
ZINC | 1cet | ZINC | 259 || || s
ZINC | 1cet | ZINC | 82 || || s
ZINC | 1cet | ZINC | 256 || || 3.76 s
ZINC | 1cet | ZINC | 447 || || s
ZINC | 1cet | ZINC | 418 || || 7.43 s

Sample CSV file:
"Smiles";"Data"
"c1(N2CCC(CC2)C(OCC)=O)sc3c(ccc(Cl)c3)n1";MAC ;02;101.66;
"C(=O)(Nc(cc1)ccc1Cl)N(CCCN2c(c(Cl)cc3C(F)(F)F)nc3)CC2";MAC ;02;101.14;
"n1(CC(CNCCNc2nccc(n2)C(F)(F)F)O)c3c(cc1)cccc3";MAC ;02;101.64;97.32
"[N+](=O)([O-])c(ccc1N(CCCN2C(=S)Nc3ccc(cc3Cl)Cl)CC2)cn1";MAC ;02;100.09;
"[N+](=O)([O-])c(ccc1N(CCCN2C(=S)Nc3ccc(cc3Br)F)CC2)cn1";MAC ;02;108.98;97.02
"C(F)(F)(F)c1ccnc(NCCNC(=O)c2ccco2)n1";MAC ;02;110.19;
"C(F)(F)(F)c1ccnc(NCCNC(c2ccccc2)=O)n1";MAC ;02;107.42;98.46
"C(NCc1ccco1)(=S)Nc(cccn2)c2";MAC ;02;103.86;97.98
"C(F)(F)(F)c1ccnc(NCCNC(=S)Nc(cccn2)c2)n1";MAC ;02;107.77;98.6
"C(=O)(c1cccs1)N(CCCN2CC(O)COc(ccc3C(C)=O)cc3)CC2";MAC ;02;107.41;
"C(F)(F)(F)c1ccnc(NCC=C)n1";MAC ;02;105.78;
"N1(CCNc2ncccc2C(F)(F)F)C(=O)CC3(CCCC3)C1=O";MAC ;02;105.26;
"N1(CCCNc(c(Cl)cc2C(F)(F)F)nc2)C(=O)CC3(CCCC3)C1=O";MAC ;02;102;

Sample SDF fragment:
M END > MAC > 03 > >
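The first column header of the docking log is an SQL expression that builds a ZINC identifier by left-padding a numeric key to eight digits. A minimal Python sketch of the same formatting follows; the function name and the sample key are illustrative assumptions, not taken from the slides.

```python
def zinc_id(sub_id: int, width: int = 8) -> str:
    """Mimic concat('ZINC', lpad(sub_id, 8, '0')) for a non-negative integer key."""
    digits = str(sub_id)
    # SQL LPAD pads on the left and truncates strings longer than the target
    # width to their leftmost characters; the slice mirrors that behaviour.
    return "ZINC" + digits.rjust(width, "0")[:width]


print(zinc_id(172))
```

Formatting the key once at query time, as the slide's header suggests, keeps the stored key numeric while the browser shows the familiar ZINC label.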

10 Rapid prototyping using ChemAxon Libraries
[Diagram labels: GUI (Swing), File I/O, DB connect, Table Module, Chem Module]
- 100% Pure Java (JRE)
  o Swing
  o JTable
- Using ChemAxon (MarvinBeans) for the chemical stuff
- OJDBC for database connection to Oracle

11 Molecule Rendering
- From spreadsheets to molecular spreadsheets
  o Overloading cellRenderer with Marvin from
- Switch SMILES / Structure on / off

12 File Import / Export
- Implemented as a thread
- Comma Separated Files
  o CSV Parser
  o Preview Window
  o Tag missing Values
- SDF Molecular Files
  o SDF Properties Names as Row-Keys
  o Import Coordinates
  o Based on MolImporter from
- Preview
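The CSV import step with tagging of missing values can be sketched as follows. This is a Python illustration, not the tool's Java parser; the `NA` marker and the function name are assumptions, and the two sample rows are taken from the CSV excerpt on slide 9.

```python
import csv
import io

MISSING = "NA"  # hypothetical tag; the slides do not say which marker the tool uses


def read_tagged(text, delimiter=";"):
    """Parse delimited text and replace empty or absent cells with MISSING."""
    reader = csv.reader(io.StringIO(text), delimiter=delimiter, quotechar='"')
    rows = [row for row in reader if row]
    width = max(len(row) for row in rows)
    tagged = []
    for row in rows:
        # Pad short rows so every row has the same number of columns.
        padded = row + [""] * (width - len(row))
        tagged.append([cell.strip() if cell.strip() else MISSING for cell in padded])
    return tagged


sample = (
    '"C(F)(F)(F)c1ccnc(NCCNC(c2ccccc2)=O)n1";MAC ;02;107.42;98.46\n'
    '"C(F)(F)(F)c1ccnc(NCC=C)n1";MAC ;02;105.78;\n'
)
for row in read_tagged(sample):
    print(row)
```

Tagging gaps explicitly at import time means later sorting and statistics can treat them uniformly instead of failing on empty strings.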

13 Smart Indexing for large Collections
- Large index storing file pointers or database keys
- Java TableModel only stores the full information for a limited number of elements (cache)
[Diagram labels: Index, FilePointer]
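The indexing idea on this slide, one file pointer per record plus a bounded cache of fully parsed records, can be sketched in a few lines. The sketch is in Python rather than the Java TableModel the slide refers to; the class name, the least-recently-used eviction policy, and the demo file are assumptions made for illustration.

```python
import os
import tempfile
from collections import OrderedDict


class IndexedFile:
    """One byte offset per line stays in memory; parsed rows live in a small LRU cache."""

    def __init__(self, path, cache_size=100):
        self.path = path
        self.cache_size = cache_size
        self.cache = OrderedDict()  # row number -> parsed row
        self.offsets = []           # row number -> byte offset (the "file pointer")
        with open(path, "rb") as fh:
            while True:
                pos = fh.tell()
                if not fh.readline():
                    break
                self.offsets.append(pos)

    def __len__(self):
        return len(self.offsets)

    def row(self, i):
        if i in self.cache:
            self.cache.move_to_end(i)  # mark as most recently used
            return self.cache[i]
        with open(self.path, "rb") as fh:
            fh.seek(self.offsets[i])
            parsed = fh.readline().decode("utf-8").rstrip("\r\n").split(";")
        self.cache[i] = parsed
        if len(self.cache) > self.cache_size:
            self.cache.popitem(last=False)  # evict the least recently used row
        return parsed


# Demo on a throwaway file with 1000 rows and a cache of 10 rows.
with tempfile.NamedTemporaryFile("w", suffix=".csv", delete=False) as tmp:
    for n in range(1000):
        tmp.write(f"row{n};{n * 0.5}\n")
table = IndexedFile(tmp.name, cache_size=10)
middle = table.row(500)
for i in range(20):
    table.row(i)
print(len(table), middle, list(table.cache))
os.remove(tmp.name)
```

Memory use grows with one integer per record rather than with the full record, which is what makes very large log files browsable.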

14 Interactive Focus on Data
- Large index storing file pointers or database keys
- Java TableModel only stores the full information for a limited number of elements
- EventHandler for scrolling triggers reload from external memory (e.g. a cursor for RDB)
- Update of the TableModel
[Diagram labels: Index, FilePointer]
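For the database case, the scroll-triggered reload can be sketched as a model that holds only the visible window and re-queries on every scroll. This Python sketch uses an in-memory SQLite database as a stand-in for the Oracle connection named in the slides; the table name `results`, its columns, and the `scroll_to` method are hypothetical.

```python
import sqlite3


class WindowedTable:
    """Holds only the rows of the visible window; scrolling re-queries the database."""

    def __init__(self, conn, window=50):
        self.conn = conn
        self.window = window
        self.first = 0
        self.rows = []
        self.scroll_to(0)

    def scroll_to(self, first_row):
        # Would be called by the scroll event handler of the table widget.
        self.first = first_row
        cur = self.conn.execute(
            "SELECT ligand, score FROM results ORDER BY rowid LIMIT ? OFFSET ?",
            (self.window, first_row),
        )
        self.rows = cur.fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE results (ligand TEXT, score REAL)")
conn.executemany(
    "INSERT INTO results VALUES (?, ?)",
    [(f"lig{n}", -float(n % 17)) for n in range(10_000)],
)
view = WindowedTable(conn, window=50)
view.scroll_to(4200)
print(len(view.rows), view.rows[0])
```

LIMIT/OFFSET paging is simple but rescans skipped rows on each query; a server-side cursor or keyset paging would be the usual refinement for very deep scrolling.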

15 Column Sorting
- EventHandler starting a sorting thread
- Re-sorting of the index for flat files
- New database query: + ORDER BY columnLabel
- Coming next:
  o Implementation of efficient online sorting algorithms in order to reduce file access
  o Merging of two tables
[Diagram labels: Index, sort(List), Object, FilePointer]
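Re-sorting the index for a flat file means reordering the file pointers by a column value while the file itself stays untouched. The Python sketch below does this in a single pass with an in-memory sort of (key, offset) pairs; it is not the "efficient online sorting" the slide lists as future work, and the function names and demo data are assumptions.

```python
import os
import tempfile


def sorted_offsets(path, column, delimiter=";"):
    """Return the byte offsets of all lines, ordered by the numeric value in `column`."""
    keyed = []
    with open(path, "rb") as fh:
        while True:
            pos = fh.tell()
            line = fh.readline()
            if not line:
                break
            fields = line.decode("utf-8").rstrip("\r\n").split(delimiter)
            keyed.append((float(fields[column]), pos))
    keyed.sort()  # only keys and offsets are held in memory, never whole records
    return [pos for _, pos in keyed]


def read_line(path, offset):
    """Fetch one record through its file pointer."""
    with open(path, "rb") as fh:
        fh.seek(offset)
        return fh.readline().decode("utf-8").rstrip("\r\n")


with tempfile.NamedTemporaryFile("w", suffix=".csv", delete=False) as tmp:
    tmp.write("a;3.5\nb;1.0\nc;2.25\n")
order = sorted_offsets(tmp.name, column=1)
names = [read_line(tmp.name, pos).split(";")[0] for pos in order]
print(names)
os.remove(tmp.name)
```

For database-backed tables the slide's approach is simpler still: reissue the query with an ORDER BY clause and let the database do the sorting.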

16 DB Annotator: Semantics for databases
- Semantic annotation of relational data
  o Linking databases and ontologies
  o Using the VS Explorer as plugin
[Diagram labels: Ontology browser, VS Explorer]

17 DHFR Assay for E. coli
- Folate -> DHF -> THF -> synthesis of thymidine
- Important for cell growth
- DHFR inhibitor: Trimethoprim
[Structures shown: DHF, Trimethoprim]
Bioorg Med Chem Lett Aug 4; 13(15): High throughput screening identifies novel inhibitors of Escherichia coli dihydrofolate reductase that are competitive with dihydrofolate. Zolli-Juran M, Cechetto JD, Hartlen R, Daigle DM, Brown ED.

18 Docking with FlexX (1)
- PDB structure 1RA2
- Cocrystallized DHFR and NADP
- FlexX places water particles
(1) Rarey M, Kramer B, Lengauer T and Klebe G, J Mol Biol 1996, 261(3):
15th Symposium on QSAR 2004; Poster: Drilling into a HTS data set of E. coli. Zimmermann M, Tresch A, Maass A, Hofmann M

19 In silico Screening Workflow
[Diagram labels: HTS, 2D Similarity Analysis, Fragment Analysis, Classification, MD Simulation, QSAR, Training Set, Test Set, Docking, Candidates, Activity Region, active, inactive]

20 1CET – Lactate Dehydrogenase of Plasmodium falciparum
Malaria target:
  o Chloroquine binds in the cofactor binding site of Plasmodium falciparum lactate dehydrogenase
  o PDB structure: 1CET
  o Ligand: Chloro-Quinolin
  o Test ligands: Ambinter data set from ZINC

21 1CET vs Compounds on 200 Nodes: Global Statistics
- Done: 100%
- Rescheduled: 46
- Running on nodes: 2296 h (about 96 days)
  o Autodock.pl: 2288 h
  o Total transfer: 8 h
- Submission script: 36 h
- Time gain of 64 (instead of 200)
- Ideal: 11.5 h
- Grid time: 205.5 h
  o Scheduled: 179 h
  o Ready: 78 min
  o Waiting: 78 min
  o Submitted: 24 h
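The figures on this slide are mutually consistent under one reading, worked through below in Python. The interpretation is an assumption: "time gain" is taken as node hours divided by the 36 h of the submission script, "ideal" as node hours divided evenly over 200 nodes, and grid time as the sum of the four listed states.

```python
autodock_hours = 2288
transfer_hours = 8
node_hours = autodock_hours + transfer_hours  # 2296 h spent on the nodes
nodes = 200
wall_hours = 36                               # duration of the submission script

days = node_hours / 24                        # sequential run time in days
ideal_hours = node_hours / nodes              # perfect 200-fold parallelism
time_gain = node_hours / wall_hours           # achieved speed-up
efficiency = time_gain / nodes                # fraction of the ideal speed-up

# Scheduled + ready + waiting + submitted, with minutes converted to hours.
grid_hours = 179 + 78 / 60 + 78 / 60 + 24

print(f"{days:.1f} days, ideal {ideal_hours:.2f} h, gain {time_gain:.1f}, "
      f"efficiency {efficiency:.0%}, grid time {grid_hours:.1f} h")
```

Under this reading the run reached roughly a third of the ideal speed-up, and the summed grid time lands within rounding of the 205.5 h quoted on the slide.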

22 Planning Next Steps
- 2M compounds vs. 1 protein target
  o Input: 13 GB
  o Output: 2 TB (dlg), 0.5 TB (pdb)
  o 12 CPU-years
  o Ideal: 3 days with 1350 CPUs
  o Reality: grid clusters with users, queues, errors…
- Challenges for our application?
  o 100% of results obtained
  o Minimal process time
  o Grid resource consumption (storage, CPU)
  o User interface for the application
  o …
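The planning numbers can be checked with a short calculation. This Python sketch assumes that the slide's compute figure means 12 CPU-years, that a year has 365 days, and that storage sizes are decimal (1 TB = 10^12 bytes); none of these conventions is stated in the slides.

```python
compounds = 2_000_000
cpu_years = 12
cpus = 1350

# Wall-clock time if the work were spread perfectly over all CPUs.
ideal_days = cpu_years * 365 / cpus

# Average CPU time and data volume per docked compound.
cpu_seconds_per_compound = cpu_years * 365 * 24 * 3600 / compounds
input_kb_per_compound = 13e9 / compounds / 1e3
dlg_mb_per_compound = 2e12 / compounds / 1e6
pdb_mb_per_compound = 0.5e12 / compounds / 1e6

print(f"ideal: {ideal_days:.1f} days")
print(f"CPU time per compound: {cpu_seconds_per_compound:.0f} s")
print(f"per compound: {input_kb_per_compound:.1f} kB in, "
      f"{dlg_mb_per_compound:.2f} MB dlg, {pdb_mb_per_compound:.2f} MB pdb out")
```

The ideal wall-clock time comes out slightly above three days, matching the slide, and the per-compound output volume shows why storage is listed among the challenges.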

