1 Bridging Bioinformatics and Chem(o)informatics. Gary Wiggins, School of Informatics, Indiana University; Yan He (SLIS MLS Student); Meredith Saba (SLIS MLS Student)
2 Provocative Thought: “While much bioscience is published with the knowledge that machines will be expected to understand at least part of it, almost all chemistry is published purely for humans to read.” Murray-Rust et al. Org. Biomol. Chem. 2004, 2, 3201.
3 Overview of the Talk: Review of ACS CINF 2004 Papers; Review of Relevant Articles; Public Chemistry Databases and Data Repositories with Bioinformatics Info/Links; Overview of Web Services; NIH-funded Projects Underway or Planned at Indiana University
4 “The Bigger Picture: Linking Bioinformatics to Cheminformatics”. American Chemical Society Division of Chemical Information (CINF) Symposium, Anaheim, Spring 2004; all-day session with 16 papers
5 Problems from ACS CINF 2004: Both technical and people factors hinder knowledge exchange between biology and chemistry (Lipinski). People problems per Chris Lipinski: Metadata capture is complicated by people issues, particularly those between chemists and biologists; Discipline-based disconnects occur distressingly often and are frequently overlooked as a cause of lost productivity.
6 Interdisciplinary Collaborations: Biology and Chemistry [What’s] “... important for these collaborations is, not only do you have to accept the other guy’s paradigm or at least live with it; you have to be willing to accept the other guy’s foibles or your perception of the other guy’s foibles (and recognize the opposite of this). We each have our own approaches to how we do science, and it’s just different cultures.”--Thom Kauffman interview in ACS LiveWire, March 2005, 7.3.
7 Some Questions from the ACS CINF 2004 Symposium: “Find all proteins related to protein A (i.e., within a given path length of A) in a protein interaction graph, and retrieve related assay results and compound structures.” “Find all pathways where compound X inhibits or slows a reaction, and retrieve Gene Ontology classifications for all proteins involved in the reaction.”
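The first question amounts to a bounded breadth-first search over a protein interaction graph. The sketch below is only an illustration under stated assumptions: the graph is held as a plain adjacency dictionary, and the protein names and the function name proteins_within are hypothetical, not taken from the symposium papers.

```python
from collections import deque

def proteins_within(graph, start, max_path_length):
    """Return {protein: distance} for every protein reachable from start
    in at most max_path_length interaction steps (start itself excluded)."""
    distances = {start: 0}
    queue = deque([start])
    while queue:
        protein = queue.popleft()
        if distances[protein] == max_path_length:
            continue  # do not expand beyond the allowed path length
        for neighbour in graph.get(protein, ()):
            if neighbour not in distances:
                distances[neighbour] = distances[protein] + 1
                queue.append(neighbour)
    del distances[start]
    return distances

# Hypothetical undirected interaction graph stored as adjacency sets.
interactions = {
    "A": {"B", "C"},
    "B": {"A", "D"},
    "C": {"A"},
    "D": {"B", "E"},
    "E": {"D"},
}
related = proteins_within(interactions, "A", 2)
```

The returned proteins would then serve as keys for the second half of the question, looking up assay results and compound structures in whatever store holds them.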
8 Problems from ACS CINF 2004: Commercial vs. public data; Batch mode data processing possible in biology, but primitive in chemistry; Primary HTS data has a very high noise factor; Data format standardization problem; Chemoinformatics and bioinformatics use completely different data formats and analysis tools; Chemical and protein sequence information has been largely analyzed separately
9 Solutions from ACS CINF 2004: Linking biological and chemical information in computational approaches to predict biological activity, ADME profiles, and adverse drug reactions (ADR); Energetics of binding for more accurate and sensitive chemical representation of DNA-protein interactions; A discovery informatics platform that facilitates archival, sharing, integration, and exploration of synthetic methods and biological activity data
10 Solutions from ACS CINF 2004: Data pipelining approach makes it possible to apply bioinformatics and chemoinformatics data and analyses together. Visualizations are the best way for people to understand data.
11 Solutions from ACS CINF 2004: Cabinet (Chemical And Biological Information NETwork, formerly Fedora) servers include: Metabolic pathway network chart (Empath); Protein-Ligand Association Network (Planet); Enzyme Commission Codebook (EC Book); Traditional Chinese Medicines (TCM); World Drug Index (WDI), and others. Built on the Daylight HTTP toolkit
12 Overview of the Talk (next section: Review of Relevant Articles): Review of ACS CINF 2004 Papers; Review of Relevant Articles; Public Chemistry Databases and Data Repositories with Bioinformatics Info/Links; Overview of Web Services; NIH-funded Projects Underway or Planned at Indiana University
13 What is Chemoinformatics? (Brown): “…the essence of chemoinformatics is integration and focus rather than its components, which are independent disciplines.” Supporting disciplines: Chemical information; Computational chemistry; Chemometrics
15 Toolkits as Integrators (Brown): Companies such as Daylight, Advanced Visual Systems, OpenEye, and SciTegic provide integration systems for: Statistical methods; Text mining; Computational chemistry; Visualization
16 Genego’s MetaDrug Product: Toxicogenomics platform for the prediction of human drug metabolism and toxicity of novel compounds; Enables the visualization of pre-clinical and clinical high-throughput data in the context of the complete biological system; Integrates chemical, biological, and protein function data
17 BioWisdom: Examination of vast amounts of available information using its Sofia KnowledgeScan methodology; SRS data integration platform
18 Lessons from Hip Hop (Salamone): Mashup technique: bring together disparate informatics, biological, chemical, and imaging information when conducting research. Example of an integration tool: iSpecies.org. A search for a species returns a page with NCBI genomics information, Yahoo images of the species, and articles culled from Google Scholar
20 Chemogenomics and Chemoproteomics (Gagna): Chemogenomics (def.): The description of all potential drugs that can be used against all possible target sites, OR the actions of target-specific chemical ligands and how they are used to globally examine genes. Chemoproteomics (def.): Uses chemistry to characterize protein structure and functions. They are “. . . a form of chemical biology brought up to date in the area of genome and proteome analysis.”
21 New Interdisciplinary Journals: ACS Chemical Biology (ACS); ChemBioChem: A European Journal of Chemical Biology (Wiley/VCH); Chemical Biology and Drug Design (Blackwell); JBIC: Journal of Biological and Inorganic Chemistry (Springer); Journal of Biochemical and Molecular Toxicology (Wiley); Molecular Biosystems (RSC); Nature Chemical Biology (Nature Publishing); Organic & Biomolecular Chemistry (RSC)
22 Open Source Software (Geldenhuys): Log P calculator from Interactive Analysis; University of Utah’s Computational Science and Engineering Online (can submit jobs for molecular mechanics, quantum chemical calculations, and biomolecular interfaces for viewing PDB files); Virtual Computational Chemistry Laboratory
23 The Blue Obelisk (Guha): Several open chemistry and chemoinformatics projects that have pooled forces to enhance interoperability. Maintain: Chemoinformatics Algorithms Dictionary; Data Repository for standardized data for chemical properties and other facts (e.g., mass)
24 BlueObelisk.org: Working collaboratively on projects such as: Chemistry Development Kit (CDK); JChemPaint; Jmol; JUMBO; NMRShiftDB; Octet; Open Babel; QSAR; World Wide Molecular Matrix (WWMM)
25 Barriers to the Use of Open Source Software: Unix command line; Problem: Lack of known standards and datasets of compounds for validation, e.g., in docking programs
26 Lessons from the Human Genome Project (Austin): Keys to success in the HGP were: Comprehensiveness; Commitment to open access to the sequence as a research tool without encumbrance. Proposed tools for a “genome functionation toolbox”: Whole-genome transcriptome and proteome characterization; Development of small inhibitory RNAs (siRNAs) and knockout mice for every gene; Small molecules and the druggable genome
28 ChEBI, Chemical Entities of Biological Interest: Dictionary of molecular entities focused on small chemical compounds; Features an ontological classification, showing the relationships between molecular entities or classes of entities and their parents and/or children
30 The IUPAC International Chemical Identifier (InChI): Open source, non-proprietary, public-domain identifier for chemicals; String of characters that uniquely represents a molecular substance; Independent of the way the chemical structure is drawn; Enables reliable structure recognition and easy linking of diverse data compilations; Accepts as input MOLfiles (or SDfiles) and CML files; Download the program to your computer at:
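Because an InChI is a plain character string built from slash-separated layers, it can be taken apart with ordinary string handling. The sketch below is a simplified illustration, not the IUPAC software: the function name split_inchi is hypothetical, the example string is the present-day standard InChI for ethanol (the "1S" prefix postdates this talk), and the parser assumes a formula layer is present and that each layer prefix occurs only once.

```python
def split_inchi(inchi):
    """Split an InChI string into version, formula, and one-letter-prefixed layers.

    Simplifying assumptions: the second field is the chemical formula, and a
    repeated layer prefix would overwrite the earlier one.
    """
    prefix = "InChI="
    if not inchi.startswith(prefix):
        raise ValueError("not an InChI string")
    version, formula, *layers = inchi[len(prefix):].split("/")
    # Each remaining layer starts with a one-letter tag (c = connectivity, h = hydrogens, ...).
    return {
        "version": version,
        "formula": formula,
        "layers": {layer[0]: layer[1:] for layer in layers if layer},
    }

ethanol = split_inchi("InChI=1S/C2H6O/c1-2-3/h3H,2H2,1H3")
```

Since the same string results however the structure was drawn, the string itself, or the formula plus connectivity layer, can serve as a join key when linking records across data compilations.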
34 Vioxx PubChem Link to External Sources of Information
35 The Elsevier MDL/NIH Link via PubChem and DiscoveryGate: Cross-indexes PubChem to the Compound Index hosted on Elsevier MDL’s DiscoveryGate platform; MDL added 5 million structures from PubChem to their index, resulting in over 14 million unique chemical structures; Links go both ways; Can move from biological data in PubChem to bioactivity, chemical sourcing, synthetic methodology, and EHS data in DiscoveryGate sources
36 Elsevier MDL’s xPharm: Comprehensive set of records linking: Agents (compounds) (2300); Targets (600); Disorders (450); Principles that govern their interactions (180). Answers questions such as: What targets are associated with control of blood pressure? What adverse effects are associated with monoamine oxidase inhibitors?
37 Text Datamining (Banville): “In the pharmaceutical field, it is ideally the marriage of biological and chemical information that needs to be the ultimate focus of text data mining applications.” Problems: Lack of universal publication standards for identifying each unique chemical entity; Selective indexing policies of A&I services; Need to understand how chemical structures link to biological processes
38 Chemical Datamining Software: SureChem; CLiDE (recognizes structures, reactions, and text); OSCAR (“OSCAR1” to check experimental data); CSR (Chemical Structure Reconstruction); MDL DocSearch (combines MDL’s Isentris platform and EMC’s Documentum)
39 Overview of the Talk: Review of ACS CINF 2004 Papers; Review of Relevant Articles; Public Chemistry Databases and Data Repositories with Bioinformatics Info/Links; Overview of Web Services; NIH-funded Projects Underway or Planned at Indiana University
40 Themes from SwissProt’s 20th Anniversary Conference, “In silico Analysis of Proteins”: Knowledgebases, databases and other information resources for proteins; Sequence searches and alignments; Protein sequence analysis; Protein structure prediction, analysis and visualization; Proteomics data analysis
41 Chemoinformatics Databases (Jónsdóttir): Lists databases relevant to drug discovery and development, including: General databases; DBs for screening compounds; DBs for medicinal agents; DBs with ADMET properties; DBs with physico-chemical properties. Curiously does not mention Chemical Abstracts
42 Databases with Protein and Ligand Information (Jónsdóttir): Protein Data Bank; Target Registration Database; Relibase (uses structural info to analyze protein-ligand interactions; Relibase+ for protein-protein interaction searching); Cambridge Structural Database; KEGG LIGAND DB for enzyme reactions
43 Other Databases with Protein and Ligand Information: SitesBase (a database of known ligand binding sites within the PDB); Binding MOAD; sc-PDB (Kellenberger)
46 Other Databases with Protein-Protein Interaction Data (Jónsdóttir): YPD, Yeast Proteome Database (for proteins from S. cerevisiae); Human Protein Reference Database; BIND, Biomolecular Interaction Network Database (ceased as of 11/16/2005?)
47 International Molecular Exchange (IMEx) Consortium, http://imex: BIND (http://www.blueprint.org), The Blueprint Initiative Asia Pte. Ltd, Singapore, and The Blueprint Initiative North America, Toronto, Canada; DIP (http://dip.doe-mbi.ucla.edu), UCLA-DOE Institute for Genomics & Proteomics; IntAct (http://www.ebi.ac.uk/intact), EMBL–European Bioinformatics Institute, Hinxton, UK; MINT (http://mint.bio.uniroma2.it/mint/), University of Rome “Tor Vergata”, Rome, Italy; MPact (http://mips.gsf.de/genre/proj/mpact), MIPS / Institute for Bioinformatics, Munich, Germany.
48 Protein Sites from IU I533 Students and others: LigandDepot (integrated source for small molecules); PSIPRED Protein Structure Prediction Server; DSSP (a database of secondary structure assignments, and much more, for all protein entries in the PDB); Dr. Predrag Radivojac’s I690 class on Structural Bioinformatics
49 Protein Secondary Structure Prediction Methods: Neural Network; Rule Based; Other Machine Learning; Homology Based
50 Protein Secondary Structure Prediction Software: PredictProtein; Chou-Fasman; NN Predict
51 Structure-Based Docking Methods: Scans many small molecules and “docks” them to a site of interest on a protein structure; Predicts free energy of binding; Filters thousands of compounds relatively quickly; Top hits can be used for more rigorous computational/experimental characterization and optimization
52 Structure-Based Docking Methods: Accelrys’s Insight (built on DOCK); FlexX; Glide; GOLD
53 Useful Structure Databases: ModBase; Dali Database (Fold classification; based on PDB); Protein Structure Analysis, Comparison, &/or Classification [Guide]
54 SCOP, Structural Classification of Proteins: Curated database of structural and evolutionary relationships; All known protein folds (v. 1.69, July 2005); 70,859 domains organized into 2,845 families, 1,539 superfamilies, and 945 folds; Detailed information about close relatives; Links to coordinates, images of structures, interactive viewers, and literature references
55 SCOP Search Options: Homology search yields a list of structures with significant levels of sequence similarity; Keyword search matches words in SCOP and PDB
56 CATH Protein Structure Classification: Like SCOP, structured hierarchically by: Class (determined by secondary structure); Architecture (overall shape, e.g., barrel, sandwich, roll, etc.), with no equivalent in SCOP; Topology (grouped into fold families based on overall shape and connectivity of secondary structures); Homologous Superfamily (domains thought to share a common ancestor). As of January 2005, had 43,229 domains classified into 1,467 superfamilies and 5,107 sequence families; a protein family database (CATH-PFDB) contained a total of 616,470 domain sequences classified into 23,876 sequence families
57 CATH Search Options: Can browse or search the classification by CATH code; CATH codes can be used to search other databases, e.g., DHS, Gene3D, and Impala
58 Gasteiger’s Biochemical Pathways Database: Database of biochemical pathways that represents chemical structures and reactions on the atomic level; Gives access to each atom and bond of the substrates of enzyme reactions; Allows the study of transition state hypotheses of enzyme reactions; Analysis of the physicochemical effects operating at the reaction site allows a classification of enzyme reactions that goes beyond the traditional EC code for enzymes; 1533 biochemical molecules and 2175 reactions
59 A Gene Expression Database for NCI60 (Scherf): Published in Nature Genetics, 2000; First study to integrate gene expression with molecular pharmacology databases; Gene expression profiles for NCI60 assessed using microarray technology; Gene-drug relationships investigated by how the gene transcription levels vary with respect to drug activities
60 Correlation Matrix Between Drug Activity and Gene Expression
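Each entry of such a matrix is one correlation between a gene's expression profile and a drug's activity profile, both measured over the same cell lines. The sketch below assumes Pearson correlation as the measure and uses made-up four-value profiles in place of the 60 real cell lines; the function names and data are hypothetical.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric profiles."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

def correlation_matrix(gene_profiles, drug_profiles):
    """Build {gene: {drug: r}} with one correlation per gene-drug pair."""
    return {
        gene: {drug: pearson(expr, act) for drug, act in drug_profiles.items()}
        for gene, expr in gene_profiles.items()
    }

# Made-up profiles over four hypothetical cell lines.
genes = {"gene_1": [1.0, 2.0, 3.0, 4.0], "gene_2": [4.0, 3.0, 2.0, 1.0]}
drugs = {"drug_x": [2.0, 4.0, 6.0, 8.0]}
matrix = correlation_matrix(genes, drugs)
```

Strongly positive or negative entries flag gene-drug pairs worth a closer look; a constant profile would make the denominator zero and needs separate handling in real data.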
61 Other Relevant Databases/Servers: Each year Nucleic Acids Research publishes a Database Issue in January and a Web Server Issue in July (See refs in Bibliography section). Examples from the most recent issues: Databases: KEGG, PDB, PINT, MutDB, GLIDA, DrugBank. Servers: BASys, BRIDGEP, SCRATCH, Glyprot, I2I-SiteEng, PatchDock, SPACE, SymmDock, DeNovoID
62 Overview of the Talk (next section: Overview of Web Services): Review of ACS CINF 2004 Papers; Review of Relevant Articles; Public Chemistry Databases and Data Repositories with Bioinformatics Info/Links; Overview of Web Services; NIH-funded Projects Underway or Planned at Indiana University
63 Web Services Overview: What are “Web Services”? A distributed invocation system built on Grid computing; Independent of platform and programming language; Built on existing Web standards; A service oriented architecture with: Interfaces based on Internet protocols; Messages in XML (except for binary data attachments)
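To make "messages in XML" concrete, the sketch below builds a SOAP 1.1 request envelope with Python's standard library. Only the SOAP envelope namespace is real; the operation name findSimilar, its parameters, and the service namespace http://example.org/chem are hypothetical placeholders, not an actual chemistry service.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace

def build_request(operation, params, service_ns):
    """Serialize one operation call as a SOAP envelope (returned as a string)."""
    ET.register_namespace("soap", SOAP_NS)
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    call = ET.SubElement(body, f"{{{service_ns}}}{operation}")
    for name, value in params.items():
        # Each parameter becomes a child element of the operation element.
        ET.SubElement(call, f"{{{service_ns}}}{name}").text = str(value)
    return ET.tostring(envelope, encoding="unicode")

# Hypothetical operation and namespace, for illustration only.
message = build_request(
    "findSimilar", {"smiles": "CCO", "cutoff": 0.8}, "http://example.org/chem"
)
```

Because the message is plain XML sent over an Internet protocol, a client in any language on any platform can produce or consume it, which is the independence the slide refers to.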
64 Service-Oriented Architecture: From Curcin et al. DDT, 2005, 10(12), 867
65 Web Services for Chemistry: Problems: Performance and scalability; Proprietary data; Competition from high-performance desktop applications -- Geoff Hutchison, it’s a puzzle blog. ALSO: Lack of a substantial body of trustworthy Open Access databases; Non-standard chemical data formats (over 40 in regular use and requiring normalization to one another)
66 Overview of the Talk: Review of ACS CINF 2004 Papers; Review of Relevant Articles; Public Chemistry Databases and Data Repositories with Bioinformatics Info/Links; Overview of Web Services; NIH-funded Projects Underway or Planned at Indiana University
67 Indiana University Planned Projects (http://www.chembiogrid.org): Design of a Grid-based distributed data architecture; Development of tools for HTS data analysis and virtual screening; Database for quantum mechanical simulation data; Chemical prototype projects: Novel routes to enzymatic reaction mechanisms; Mechanism-based drug design; Data-inquiry-based development of new methods in natural product synthesis
68 Web Services for Chemistry at IU. Interaction Layer. Purpose: Interactive software for creative access and exploitation of information by humans. Technologies: Microsoft .NET Smart Clients, portlets, Java applets, and browser clients, visualization technologies. Aggregation Layer. Purpose: Workflows and data schemas customized for particular domains, applications and users. Technologies: BPEL, Taverna and other workflow modeling tools, aggregate web services. Web service layer. Purpose: Comprehensive data and computation provision including storage, calculation, semantics and meta-data exposed as web services. Technologies: Apache web services, SOAP wrappers, WSDL, UDDI, XML, Microsoft .NET
69 NCI Developmental Therapeutics Program (DTP): Downloadable data: In vitro 60 cell line results; in vitro anti-HIV results; Yeast assay; 200,000+ chemical structures; molecular targets; microarray data. Or search the database at:
70 IU Database of NIH DTP Data: Contains over 200,000 chemical structures tested in 60 cellular assays from different human tumor cell lines; Also includes microarray assay profiles for the untreated cell lines (~14,000 datapoints); A local PostgreSQL database containing the data that is exposed as a web service; Using workflows and complex SQL queries, we can do advanced data mining that exploits the chemical, biological and genomic information for particular audiences (chemists, biologists, etc.)
71 Mining the NIH DTP database: ~14,000 gene expression values; 60 cell lines; Cell lines can be clustered based on gene expression similarity; ~200,000 compounds; Compounds can be clustered based on similarity of profile across cell lines, or by chemical structure fingerprint similarity
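Clustering or searching by fingerprint similarity needs a pairwise similarity measure first. The slides do not say which one was used; the sketch below assumes the Tanimoto coefficient, a common choice for chemical fingerprints, with each fingerprint represented as a set of "on" bit positions. The compound names and bit positions are made up.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient of two fingerprints given as sets of 'on' bit positions."""
    if not fp_a and not fp_b:
        return 1.0  # convention chosen here for two empty fingerprints
    shared = len(fp_a & fp_b)
    return shared / (len(fp_a) + len(fp_b) - shared)

def most_similar(query, library, threshold):
    """Return (name, score) pairs scoring at or above threshold, best first."""
    scored = [(name, tanimoto(query, fp)) for name, fp in library.items()]
    hits = [pair for pair in scored if pair[1] >= threshold]
    return sorted(hits, key=lambda pair: pair[1], reverse=True)

# Made-up fingerprints for three hypothetical compounds.
library = {"cmpd_1": {1, 4, 7, 9}, "cmpd_2": {1, 4, 8}, "cmpd_3": {2, 3}}
hits = most_similar({1, 4, 7}, library, 0.5)
```

The same pairwise scores can feed a clustering step, and the same pattern applies to activity profiles across cell lines once a suitable profile similarity replaces the Tanimoto function.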
72 Use of Taverna at IU: A protein implicated in tumor growth is supplied to the docking program (in this case HSP90 taken from the PDB 1Y4 complex). A 2D structure is supplied for input into the similarity search (in this case, the extracted bound ligand from the PDB 1Y4 complex). The workflow employs our local NIH DTP database service to search 200,000 compounds tested in human tumor cellular assays for similar structures to the ligand. Client portlets are used to browse these structures. Similar structures are filtered for druggability and are automatically passed to the OpenEye FRED docking program for docking into the target protein. Once docking is complete, the user visualizes the high-scoring docked structures in a portlet using the Jmol applet. Correlation of docking results and “biological fingerprints” across the human tumor cell lines can help identify potential mechanisms of action of DTP compounds.
73 Taverna Workflow [screenshot]: Workflow definition; Available web services (WSDL); Visual depiction of workflow
76 Pre-Closing Quote: “There is not going to be a ‘voila’ moment at the computer terminal. Instead, there is systematic use of wide-ranging computational tools to facilitate and enhance the drug discovery process.” Jorgensen. Science, March 19, 2004, 303, 1814.
77 Closing quote: “The future of chemistry depends on the automated analysis of chemical knowledge, combining disparate data sources in a single resource, such as the World-Wide Molecular Matrix, which can be analysed using computational techniques to assess and build on these data.” Townsend et al. Org. Biomol. Chem. 2004, 2, 3299.
78 Post-closing quote: zzzzzCAS. “In an industry first, Chemical Abstracts Service (CAS) has unveiled a revolutionary new literature searching tool which will permit scientists to search and retrieve the world’s chemical literature—including patents and obscure technical reports—in their sleep.” --Author unknown
79 Acknowledgements: Randy Arnold; Xiao Dong; Sean Mooney; Peter Murray-Rust; David J. Wild; I533 Chemical Informatics Seminar Students; Elsevier Science
80 Bibliography: Articles, Books, and Conference Papers
“The Bigger Picture: Linking Bioinformatics to Cheminformatics” [CINF Symposium] Abstracts [1-16], 227th ACS National Meeting, Anaheim, CA, March 28-April 1, 2004
Austin, C.P. “The completed human genome: implications for chemical biology.” Current Opinion in Chemical Biology 2003, 7,
Bajorath, Jürgen, ed. Chemoinformatics: concepts, methods, and tools for drug discovery. Totowa, N.J.: Humana Press, c2004. (Methods in molecular biology; v. 275)
Banville, Debra L. “Mining chemical structural information from the drug literature.” Drug Discovery Today January 2006, 11(1/2),
Brown F. “Editorial opinion: chemoinformatics - a ten year update.” Current Opinion in Drug Discovery and Development 2005 May; 8(3):
81 Bibliography: Articles (cont’d)
Coles, Simon J.; Day, Nick E.; Murray-Rust, Peter; Rzepa, Henry S.; Zhang, Yong. “Enhancement of the chemical semantic web through InChIfication.” Organic & Biomolecular Chemistry 2005, 3,
Curcin, Vera; Ghanem, Moustafa; Guo, Yike. “Web services in the life sciences.” Drug Discovery Today 2005, 10(12),
Gagna CE, Winokur D, Clark Lambert W. “Cell biology, chemogenomics and chemoproteomics.” Cell Biol Int. 2004; 28(11):
Geldenhuys, W.J.; Gaasch, K.E.; Watson, M.; Allen, D.D.; Van Der Schyf, C.J. “Optimizing the use of open-source software applications in drug discovery.” Drug Discovery Today February 2006, 11(3/4),
Guha, R.; Howard, M.T.; Hutchison, G.R.; Murray-Rust, P.; Rzepa, H.; Steinbeck, C.; Wegner, J.; Willighagen, E.L. “The Blue Obelisk—Interoperability in chemical informatics.” Journal of Chemical Information and Modeling 2006 Web Release Date: 22-Feb-2006; DOI: /ci050400b
82 Bibliography: Articles (cont’d)
Jónsdóttir, S.O.; Jorgensen, F.S.; Brunak, S. “Prediction methods and databases within chemoinformatics: emphasis on drugs and drug candidates.” Bioinformatics 2005 May 15; 21(10):
Jorgensen, William L. “The many roles of computation in drug discovery.” Science March 19, 2004, 303,
Kauffman, Thom. “Profile.” [interview] LiveWire, March 2005, 7.3;
Murray-Rust, Peter S.; Mitchell, John B.O.; Rzepa, Henry S. “Communication and re-use of chemical information in bioscience.” BMC Bioinformatics 2005, 6, 180.
Murray-Rust, Peter; Mitchell, John B.O.; Rzepa, Henry S. “Chemistry in bioinformatics.” BMC Bioinformatics 2005, 6,
Povolna, Vera; Dixon, Scott; Weininger, David. “Cabinet—Chemical and Biological Informatics NETwork.” in: Oprea, Tudor I., ed. Chemoinformatics in Drug Discovery. Weinheim: Wiley-VCH, 2004,
83 Bibliography: Articles (cont’d)
Salamone, Salvatore. “Hip Hop offers lessons on life sciences data integration.” Bio-IT World February 2006, 36.
Scherf, Uwe; Ross, Douglas T.; Waltham, Mark; Smith, Lawrence H.; Lee, Jae K.; Tanabe, Lorraine; Kohn, Kurt W.; Reinhold, William C.; Myers, Timothy G.; Andrews, Darren T.; Scudiero, Dominic A.; Eisen, Michael B.; Sausville, Edward A.; Pommier, Yves; Botstein, David; Brown, Patrick O.; Weinstein, John N. “A gene expression database for the molecular pharmacology of cancer.” Nature Genetics 2000, 24,
Souchelnytskyi, S. “Bridging proteomics and systems biology: What are the roads to be traveled?” Proteomics 2005 (November), 5(16),
Tetko, Igor V. “Computing chemistry on the web.” Drug Discovery Today November 2005, 10(22),
84 Bibliography: Articles (cont’d)
Zimmermann, Marc; Thi, Le Thuy Bui; Hofmann, Martin. “Combating illiteracy in chemistry: Towards computer-based chemical structure reconstruction.” ERCIM News January 2005, 60,
Zimmermann, Marc; Fluck, Juliane; Thi, Le Thuy Bui; Kolarik, Corinna; Kumpf, Kai; Hofmann, Martin. “Information extraction in the life sciences: Perspectives for medicinal chemistry, pharmacology and toxicology.” Current Topics in Medicinal Chemistry 2005, 5(8),
85 Bibliography: Databases
Andreeva, A.; Howorth, D.; Brenner, S.E.; Hubbard, T.J.P.; Chothia, C.; Murzin, A.G. “SCOP database in 2004: refinements integrate structure and sequence family data.” Nucleic Acids Research 2004, 32, Database issue D226-D229 doi: /nar/gkh039
Chen J, Swamidass SJ, Dou Y, Bruand J, Baldi P. “ChemDB: a public database of small molecules and related chemoinformatics resources.” Bioinformatics Nov 15; 21(22):
Dunkel, M.; Fullbeck, M.; Neumann, S.; Preissner, R. “SuperNatural: a searchable database of available natural compounds.” Nucleic Acids Research 2006, 34, Database issue D678-D683 doi: /nar/gkj132
Gold, Nicola D.; Jackson, Richard M. “A searchable database for comparing protein-ligand binding site for the analysis of structure-function relationships.” Journal of Chemical Information and Modeling 2006, 46(2),
86 Bibliography: Databases (cont’d)
Kanehisa, M.; Goto, S.; Hattori, M.; Aoki-Kinoshita, F.; Itoh, M.; Kawashima, S.; Katayama, T.; Araki, M.; Hirakawa, M. “From genomics to chemical genomics: new developments in KEGG.” Nucleic Acids Research 2006, 34, Database issue D354-D357. doi: 10.1093/nar/gkj102.
Kellenberger, Esther; Muller, Pascal; Schalon, Claire; Bret, Guillaume; Foata, Nicolas; Rognan, Didier. “sc-PDB: An annotated database of druggable binding sites from the Protein Data Bank.” Journal of Chemical Information and Modeling 2006, 46(2),
Irwin, J.J.; Shoichet, B.K. “ZINC—A free database of commercially available compounds for virtual screening.” Journal of Chemical Information and Modeling 2005, 45,
Kouranov, A.; Xie, L.; de la Cruz, J.; Chen, L.; Westbrook, J.; Bourne, P.E.; Berman, H.M. “The RCSB PDB information portal for structural genomics.” Nucleic Acids Research 2006, 34, Database issue D302-D305 doi: 10.1093/nar/gkj120
Kumar, M.D.S.; Gromiha, M.M. “PINT: Protein-protein interactions thermodynamic database.” Nucleic Acids Research 2006, 34, Database issue D195-D198 doi: /nar/gkj017
87 Bibliography: Databases (cont’d)
Lo Conte, L.; Brenner, S.E.; Hubbard, T.J.P.; Chothia, C.; Murzin, A.G. “SCOP database in 2002: refinements accommodate structural genomics.” Nucleic Acids Research 2002, 30(1):
Murzin, A.G.; Brenner, S.E.; Hubbard, T.; Chothia, C. “SCOP: A structural classification of proteins database for the investigation of sequences and structures.” Journal of Molecular Biology 1995, 247,
Okuno, Y.; Yang, J.; Taneishi, K.; Yabuuchi, H.; Tsujimoto, G. “GLIDA: GPCR-ligand database for chemical genomic drug discovery.” Nucleic Acids Research 2006, 34, Database issue D673-D677 doi: /nar/gkj028.
Pearl F, Todd A, Sillitoe I, Dibley M, Redfern O, Lewis T, Bennett C, Marsden R, Grant A, Lee D, Akpor A, Maibaum M, Harrison A, Dallman T, Reeves G, Diboun I, Addou S, Lise S, Johnston C, Sillero A, Thornton J, Orengo C. “The CATH Domain Structure Database and related resources Gene3D and DHS provide comprehensive domain family information for genome analysis.” Nucleic Acids Research 2005, 33, Database issue D247-D251.
88 Bibliography: Databases (cont’d)
Wheeler, D.L. et al. “Database resources of the National Center for Biotechnology Information.” Nucleic Acids Research 2006, 34, Database issue D173-D180 doi: /nar/gkj158
Wishart DS, Knox C, Guo AC, Shrivastava S, Hassanali M, Stothard P, Chang Z, Woolsey, Jennifer. “DrugBank: a comprehensive resource for in silico drug discovery and exploration.” Nucleic Acids Res Jan 1;34(Database issue): D
89 Biotech Validation Suite for Protein Structures: Send the server a PDB file; Server provides a comprehensive check of the protein, including: Atomic volume analysis; Full geometric analysis; NMR restraint data
90 Knowledge-Driven Bioinformatics Enhanced with Chemistry
91 ToxTree: An in silico toxicology prediction suite; Based on the CDK toolkit; Built on CML; Released as OpenSource under the GPL; Standalone PC software; User Manual:
92 Tools for Genomic and Proteomic Scientists vis-à-vis Cell Biology (Gagna et al.): Tools to fully exploit the techniques in cellular biology; Light microscopy for high resolution images; Fractionation of cells into basic components via ultracentrifugation; Analysis of individual cells through flow cytometry; LCM, normal and diseased TMAs (tissue microarrays), quantitative computer image analysis, cell micromanipulation, and high-throughput microscopy
93 InChI Generation on the Web: The following websites provide the facility to generate InChIs: ACD/Labs' freely available structure-drawing program ChemSketch includes the facility to generate InChIs from drawn structures. pubchem.ncbi.nlm.nih.gov/edit/: PubChem Server Side Structure Editor v1.8 includes a facility for generating InChIs as you draw the structure.
94 Advances in Macromolecular Crystallography by CCG: More protein structures available now; Use of 3D info in bioinformatics makes functional inferences more dependable; CCG Structural Family Database distributed with MOE; Includes fold detection methodology to ID structurally similar proteins; Simultaneous sequence and structural alignment of large collections of proteins; 3D structural family analysis for insight into conserved geometry, water molecules, salt bridges, hydrogen bonds, hydrophobic contacts, and disulfide bonds
95 CCG’s Cheminformatics Offerings: MOE Molecular Database; Molecular Descriptors calculated and used for classification, clustering, filtering, and predictive model construction; QSAR/QSPR Predictive Modeling; Diversity and Similarity Searching; High Throughput Conformational Search; 3D Pharmacophore Search
96 Components of the Semantic Web for Chemistry: XML – eXtensible Markup Language; RDF – Resource Description Framework; RSS – Rich Site Summary; Dublin Core – allows metadata-based newsfeeds; OWL – for ontologies; BPEL4WS – for workflow and web services. Murray-Rust et al. Org. Biomol. Chem. 2004, 2,
97 Web Services Integration Projects: Biosciences: myGrid; BIOPIPE; BioMOBY
98 BIOT 2006: Major themes, areas and suggested topics include: Bio-molecular and Phylogenetic Databases; Molecular Evolution and Phylogenetic analysis; Drug Delivery Systems; Bio-Ontology and Data Mining; Sequence Search and Alignment; Microarray Analysis; System Biology; Pathway analysis; Identification and Classification of Genes; Protein Structure Prediction and Molecular Simulation; Functional Genomics; Proteomics; Tertiary structure prediction; Drug Docking; Gene Expression Analysis; Biomedical Imaging
99 Proteomics: What is it? Proteomics is the study of protein expression, regulation, modification, and function in living systems for understanding how living systems use proteins. Using a variety of techniques, proteomics can be used to study how proteins interact within a system, or how proteins change due to applied stresses. Requires advanced measurement techniques, especially separations and mass spectrometry
100 Proteomics Needs Informatics for: Locating peaks in 2 or more dimensions; MS/MS spectra interpretation; Protein/Peptide quantification; Peptide detectability; Experimental data; Biological information: enzyme or pathway regulation, disease susceptibility, drug efficacy
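Peak location is the most algorithmic item on that list. As a deliberately simplified sketch, the function below finds peaks in one dimension only, as strict local maxima above an intensity threshold; real proteomics data adds further dimensions, noise filtering, and peak-shape models. The function name, threshold, and intensity values are hypothetical.

```python
def find_peaks(intensities, min_height):
    """Return indices of strict local maxima whose intensity is at least min_height.

    Flat-topped peaks (equal neighbouring values) are not reported by this sketch.
    """
    peaks = []
    for i in range(1, len(intensities) - 1):
        value = intensities[i]
        if value >= min_height and intensities[i - 1] < value > intensities[i + 1]:
            peaks.append(i)
    return peaks

# Made-up one-dimensional intensity trace.
spectrum = [0, 2, 9, 3, 1, 4, 12, 12, 5, 0, 6, 1]
peak_indices = find_peaks(spectrum, 5)
```

Extending the idea to two dimensions (for example retention time by mass-to-charge) means comparing each point with all of its neighbours in the grid rather than just the two on either side.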