Drug Discovery Grid -- A Real Grid Application. Zhang Wenju, Shen Jianhua. Shanghai Institute of Materia Medica, CAS; Shanghai Jiaotong University; Jiangnan Institute of Computing; The University of Hong Kong.
Background: Large-scale High-throughput Virtual Screening. In silico: the computational analysis of chemical databases to identify compounds appropriate for a given biological receptor. In vitro: identification of new compounds showing some activity against a target biological receptor, and the progressive optimization of these leads to yield a compound with improved potency and physicochemical properties in vitro. In vivo: eventually, improved efficacy, pharmacokinetics, and toxicological profiles in vivo.
Process of Drug Discovery and Design [pipeline diagram]. Stages: Random Screening (10,000 ~ 20,000 compounds), Leads and Opt., Drug Candidate, Pre-clinic, Clinic (phase I, II, III), Market. Stage durations shown: 2-3 years, 3-4 years, 2-3 years. Supporting method: Computer-Aided Drug Design. Time: years. Money: several billion dollars.
DDGrid overview: The Drug Discovery Grid project aims to build a collaboration platform for drug discovery using state-of-the-art grid computing technology. The project targets large-scale, computation- and data-intensive scientific applications in medicinal chemistry and molecular biology, using grid middleware developed by our team. A database of over one million compounds with 3-D structures and physicochemical properties is also provided to help identify potential drug candidates. Users can also build and maintain their own customized ligand databases and share them on the grid platform.
DDGrid Architecture [diagram: User, Internet, Global Server, Internet, Slave Servers]. User side: resource monitoring, job submission and monitoring, input and parameter setup, and result viewing and download, all through the Web Portal.
DDGrid Architecture [diagram: User, Internet, Global Server, Internet, Slave Server]. Global Server functions: user interface; resource management; job submission and monitoring; key and certificate management; result analysis; global scheduling; visualisation; distributed CDB.
DDGrid Architecture [diagram: User, Internet, Slave Servers]. Slave Server functions: local job management; local resource management; local CDB management; data encryption and decryption; local result assimilation.
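DDGrid Workflow: 1. The user submits a job to the Global Server (monitoring, work pool, resource management, result assimilation) and gets back a job ID and, later, the result. 2. The Global Server dispatches jobs to the Slave Servers (local resource management, monitoring, local work pool, result assimilation). 3. Each Slave Server dispatches jobs to its Computational Clients (docking). 4. Clients return results and request new jobs from their Slave Server; Slave Servers return results and request new jobs from the Global Server. Message format: xml.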
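The dispatch and return loop on this slide can be illustrated with a small in-process simulation. This is only a sketch under stated assumptions, not the DDGrid middleware: the names `WorkPool`, `SlaveServer`, `dock`, and `screen` are hypothetical, the docking score is a placeholder, and the real system exchanges messages over the network rather than through shared objects.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class WorkPool:
    """FIFO pool of pending jobs plus the results assimilated so far."""
    pending: deque = field(default_factory=deque)
    results: dict = field(default_factory=dict)

    def request_job(self):
        """Hand out the next job, or None when no work is available."""
        return self.pending.popleft() if self.pending else None

    def assimilate(self, job_id, result):
        """Record the result returned for one job."""
        self.results[job_id] = result


def dock(ligand):
    """Hypothetical stand-in for a docking run; the score is a placeholder."""
    return -float(len(ligand))


class SlaveServer:
    """Pulls batches from the global pool and feeds its own local pool."""

    def __init__(self, global_pool, batch_size=3):
        self.global_pool = global_pool
        self.local = WorkPool()
        self.batch_size = batch_size

    def step(self):
        """Fetch one batch, dock it, return results; False when idle."""
        for _ in range(self.batch_size):
            job = self.global_pool.request_job()
            if job is None:
                break
            self.local.pending.append(job)
        if not self.local.pending:
            return False
        # Computational clients: take a job, dock, return the result.
        while (job := self.local.request_job()) is not None:
            job_id, ligand = job
            self.local.assimilate(job_id, dock(ligand))
        # Return locally assimilated results to the global server.
        for job_id, result in self.local.results.items():
            self.global_pool.assimilate(job_id, result)
        self.local.results.clear()
        return True


def screen(ligands, n_slaves=2):
    """Run the whole screening job and return {job_id: score}."""
    pool = WorkPool(pending=deque(enumerate(ligands)))
    slaves = [SlaveServer(pool) for _ in range(n_slaves)]
    # Round-robin over the slaves until nobody can obtain work.
    while any([slave.step() for slave in slaves]):
        pass
    return pool.results
```

Calling `screen(["CCO", "c1ccccc1", "N"])` returns one score per submitted ligand, keyed by its position in the input list. The pull model (workers ask for more work when they finish) is what lets slow and fast sites share one pool without a fixed up-front partition.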
DDGrid security: 1. PKI-based security. 2. All sites involved must hold a certificate issued by our CA. 3. All deployed databases and results are encrypted. 4. All message passing is SSL/TLS-enabled.
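Points 2 and 4 together amount to mutually authenticated TLS: a server accepts a connection only if the peer presents a certificate signed by the project CA. The sketch below shows how such a policy could be expressed with Python's standard `ssl` module. It is an illustration, not the DDGrid implementation; the function name `make_server_context` and the file-path parameters are assumptions, and the TLS 1.2 floor is a present-day choice rather than something the slides specify.

```python
import ssl


def make_server_context(ca_file=None, cert_file=None, key_file=None):
    """Build a TLS server context that insists on a client certificate.

    All three paths are hypothetical placeholders: ca_file would be the
    project CA certificate, cert_file/key_file the site's own credentials.
    """
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    # Reject any peer that does not present a certificate we can verify.
    ctx.verify_mode = ssl.CERT_REQUIRED
    if ca_file:
        # Trust only certificates issued by this CA.
        ctx.load_verify_locations(cafile=ca_file)
    if cert_file:
        # Present this site's own certificate to connecting peers.
        ctx.load_cert_chain(certfile=cert_file, keyfile=key_file)
    return ctx
```

In use, a site would pass its real CA and certificate paths and wrap its listening socket with `ctx.wrap_socket(sock, server_side=True)`. Encryption of databases and results at rest (point 3) is a separate mechanism and is not covered here.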
DDGrid Web Portal
Test Case 1: Virtual screening of 20,000 compounds
Data set (CDB): Specs
Involved sites:
  Shanghai Inst. of M. M. (SIMM): Alpha Cluster (32 CPU)
  Beijing Mol. Ltd.: Sunway Cluster (224 CPU)
  The Univ. of Hong Kong: Gideon Cluster (16 CPU)
  Shanghai SuperComp. Centre: Dawning 4000A
  Dalian Univ. of Tech.: Dawning 4000A
  London e-Science Centre: Mars Cluster
Time consumed: 5946 sec (approx. 99 min)
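The timing figure can be turned into an aggregate throughput with a few lines of arithmetic. The sketch assumes that all 20,000 compounds were docked within the reported 5946 seconds; per-CPU rates are not computed because the slide gives CPU counts for only three of the six sites.

```python
compounds = 20_000   # size of the screened set, from the slide
elapsed_s = 5946     # wall-clock time, from the slide

minutes = elapsed_s / 60                     # wall-clock time in minutes
throughput = compounds / elapsed_s           # compounds per second, grid-wide
seconds_per_compound = elapsed_s / compounds

print(f"{minutes:.1f} min, {throughput:.2f} compounds/s, "
      f"{seconds_per_compound:.3f} s per compound")
```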
Visualisation of Docking Result
DDGrid message passing i686-pc-linux-gnu … …
DDGrid message passing No work available Ddg sss … … … …
DDGrid Resources: Computational and Data Resources Integration. Resources aggregated: SIMM Sunway 32A Cluster; Beijing Molecule Inc. Sunway 256P Cluster; HKU Gideon 300 Cluster; SSC Dawning 4000A; LeSC Mars Cluster (test only); Singapore Poly-tech Univ.; Dalian Univ. of Technology; Shanghai Jiaotong Univ. Heterogeneous resources: OS: IRIX, Digital Unix, Linux (IA32, x86_64); CPU: R12000, Alpha, Pentium, AMD.
DDGrid Resources: DDGrid Apps. 1. Docking pre-process software: Combimark. 2. Docking software: 1) Dock (UCSF); 2) gsDock (SIMM). 3. CDB build and maintenance software: Combilib. 4. AutoDock. 5. AutoGrid. 6. Visualisation. 7. Security-related tools. [Workflow diagram labels: start, Input File, Fixed CDB, CDB Gen., CDB Para., Pre-process, Dock, Drug-like Analysis, New CDB, Experiment, end.]
DDGrid Resources: Chemical Databases (CDB). Each ligand record in a chemical database represents the 3D structural information of one compound. The number of compounds in a CDB can be in the order of tens of thousands, and the database size can range from tens of megabytes to gigabytes or even terabytes. 1. Static databases purchased from commercial chemical companies: Available Chemical Directory (ACD), Chinese natural product database (CNPD), SPECS database, chemical ADME/T database, etc. 2. Dynamic databases built by users themselves and deployed automatically.
Deployed commercial CDB (approx. 700,000 compounds)
Name of Database: Description
  Specs: Provides about 230,000 compounds.
  CMC-3D: Provides 3D models and important biochemical properties (including drug class, logP, and pKa values) for over 8,400 pharmaceutical compounds.
  ACD-3D: Provides 200,000 commercially available compounds in 3D.
  NCI-3D: 213,000 compounds with 2D information from the National Cancer Institute.
  CNPD: 12,000 Chinese natural products with chemical structures.
  TCMD: 9,127 compounds and 3,922 herbs.
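As a quick consistency check, the per-database counts can be summed and compared with the headline figure. The numbers below are the rounded values quoted on the slide ("about", "over"), so the total is itself approximate; the dictionary name `deployed` is just a label for this sketch.

```python
# Compound counts as quoted on the slide (rounded figures).
deployed = {
    "Specs": 230_000,
    "CMC-3D": 8_400,
    "ACD-3D": 200_000,
    "NCI-3D": 213_000,
    "CNPD": 12_000,
    "TCMD": 9_127,
}

total = sum(deployed.values())
print(f"{total:,} compounds across {len(deployed)} databases")
```

The sum comes to 672,527, which is in line with the slide's "approx. 700,000".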
Vendor compound collections (approx. 3,300,000 compounds)
Vendor: Num. of Mol.
  ACB-Eurochem: 98603
  Ambinter: 533866
  Asinex: 293385
  ChemBridge: 562624
  ChemDiv: 361859
  ComGenex: 38590
  Enamine: 533111
  IBScreen: 452728
  InterChim: 288882
  KeyOrganics: 22294
  Life Chemicals: 44762
  Maybridge: 53042
  Nanosyn: 68317
  Peakdale: 9632
  Ryan Scientific: 64205
  Sigma-Aldrich: 49022
Also listed (counts not shown): National Cancer Institute, Otava, Pharmeks, PubChem, Specs, TimTec.
CDB example CNPD-China Natural Products Database
CDB example: CNPD. CNPD is the first and only comprehensive source of chemical, structural and bibliographic data on all known natural products in China. It serves as an information source for chemical, physical and biological properties and for literature, which makes it useful to scientists in the pharmaceutical industry. CNPD can be searched in flexible ways: by structure, sub-structure, name, molecular formula, molecular weight, CAS registry number, category, etc. Traditional Chinese Medicine (TCM) applications are pre-indexed in CNPD to provide hints for lead compound discovery.
CDB example CNPD
CDB example: TCMD (Traditional Chinese Medicine Database). TCMD is a bibliographical database of approximately 20,000 records with abstracts of TCM articles. Relevant articles are selected from journals published in Mainland China, Taiwan, and Hong Kong (most are in Chinese); English abstracts are written for the selected articles, and other pertinent information is translated into English.
CDB example TCMD
DDGrid applications in reality: SIMM carried out anti-SARS and anti-diabetes drug research using the DDGrid. 1. Anti-SARS drug research. 2. Anti-diabetes drug research.
Research on anti-SARS medicine: Virtual screening of the Comprehensive Medicinal Chemistry-3D (CMC-3D) database, which contains 7,900 compounds, found that cinanserin has a distinct anti-SARS effect. Collaborating laboratories: Department of Virology, Bernhard-Nocht-Institute for Tropical Medicine, Germany; Research Department, Cantonal Hospital St Gallen, Switzerland. Their feedback: "Basically your inhibitor turned out to be the best compound we have tested so far!" A domestic patent and a PCT patent have been applied for.
Research on anti-diabetes medicine: Found an anti-diabetes lead better than rosiglitazone by targeting PPAR through virtual screening, optimization design, synthesis, and biological and pharmacological testing. CADD process: 800, ,
Research on anti-diabetes medicine [screening funnel diagram]. Labels, in slide order: 2.4 m; 10 t; protein testing; 400 t; 85; composite design; virtual screening; 48 synthesis; 8 cell testing; 4 animal testing; 1 comprehensive evaluation; 48 with KD < 1 M; 22 with KD < 0.1 M; KD < 100 M; protein testing; manual screening.
New anti-diabetes drug: Current Progress. 1. Applied for a domestic patent and a PCT patent. 2. Safety testing and pre-clinical research.
What does the DDGrid provide? 1. Drug design collaboration platform: large-scale virtual screening platform; sharing of large CDBs. 2. Computational resource sharing: SIMM/SSC/HKU/Mol. Ltd/SJTU/DUT. 3. Data resource sharing: pre-deployed commercial CDBs (ACD/CNPD …); sharing of self-made CDBs. 4. Medicinal chemistry text and structure search. 5. Customization and extension.