CCDC Tools for Mining Structural Databases, or: Building Solid Foundations for a Structure Based Design Campaign


1 www.ccdc.cam.ac.uk CCDC Tools for Mining Structural Databases Or – Building Solid Foundations for a Structure Based Design Campaign John Liebeschuetz, Peter Carlqvist, Simon Bowden Cambridge Crystallographic Data Centre 12 Union Rd., Cambridge, UK

2 Assessment and Comparison of Ligand-Protein Structural Models
For the Crystallographer
- What is wrong with my model?
- What interesting features or differences with related structures can I highlight in my publication?
For the Molecular Modeller
- What is wrong with the Crystallographer's model?
- What interesting features or differences with related structures can I use to inform my structure-based drug design campaign?
- Are there non-homologous structures with similar features that I need to watch out for?

3 Why can't I take a structure from the PDB and just use it?
Validation of ligand structures bound to proteins
- 15% of 100 recent PDB entries have ligand geometries that are almost certainly in significant error (in-house analysis using Relibase+/Mogul)
(Figure panels: Pre 2000, 2006)

4 How much ligand strain is accommodated by the protein?
- Accepted view: many ligands adopt strained conformations when bound to proteins; some (60%) do not even bind in a local-minimum conformation. (Perola & Charifson, J. Med. Chem. 2004, 47, 2499-2510)
- Alternative view: ligands usually (but not always) bind in a local minimum. Many 'strained' structures found in the PDB are imperfectly refined. (OpenEye, B. Kelley and G. Warren, EuroCYP)

5 CCDC Tools that can help you
Relibase/Relibase+: web-based database system for searching, retrieving and analysing 3D structures of protein-ligand complexes in the Brookhaven Protein Data Bank (PDB)
- Relibase is freely available for academics
- Relibase+ has extra features (some of these will be used in this workshop)
The Cambridge Structural Database System: database of > 400,000 small-molecule crystallographic structures, and associated query software
- Mogul and IsoStar knowledge bases of molecular geometry and intermolecular interactions
- Directly linked access from Relibase+

6 The Workshop
Part 1: Validation of models and structural analysis
- Analysing a protein structure for errors and interesting features
- Comparing a structure with structures related by homology or by functionality
Part 2: Probing the Protein-Ligand Interface
- Substructure searching in Relibase/Relibase+
- Comparing the interactions of different ligands with the same target
- Validating an unusual interaction using substructure searching in Relibase+

7 Relibase+
- Web-based database system for searching, retrieving and analysing 3D structures of protein-ligand complexes in the Brookhaven Protein Data Bank (PDB)
- Successor to ReLiBase, developed by Manfred Hendlich et al. (Merck, Marburg U.): M. Hendlich, Acta Cryst. D54, 1178-1182, 1998
Relibase: free on WWW for academics
- http://relibase.ccdc.cam.ac.uk/
- http://relibase.rutgers.edu/

8 Relibase+: Basic Functionality
- Keyword searching
- FASTA protein sequence searching
- 2D substructure searching
- 3D protein-ligand interaction searching
- Protein-protein interaction searching
- Similarity searching for ligands
- SMILES substructure matching
- Automatic superposition of related binding sites to compare ligand binding modes, water positions, etc.
- 3D visualisation with AstexViewer and ReliView (Hermes)

9 Relibase+: Advanced Functionality
- Functionality for generation and search of proprietary databases of protein-ligand complexes alongside the PDB
- Links to the Mogul and IsoStar modules of the CSDS for geometry validation
- Additional modules: Crystal packing, WaterBase, CavBase
- Detailed analysis of superimposed binding sites
- Enhanced treatment of hitlists
- Reliscript: command-line access via a Python-based toolkit
- Coming soon: SecBase, including Turn Classification

10 CavBase
- Detects unexpected similarities amongst protein cavities (e.g. active sites) that share little or no sequence homology
- Similarity is judged by matching 3D property descriptors (pseudocentres) that encode the shape and chemical characteristics of each cavity
- No sequence information is used, so similar cavities can be detected even if they have no obvious secondary-structure relationship
- Developed by S. Schmitt et al., J. Mol. Biol. (2002)

11 Cambridge Structural Database
- Repository for the world's small organic and metal-organic crystal structures (up to 500 non-H atoms)
- Experimentally determined 3D structures via X-ray and neutron diffraction methods
- 2007 release contains 423,798 entries
  - approximately 32,000 entries added per year
- Derived from around 1200 published sources
  - official depository for >80 major journals
  - majority of data directly deposited electronically (CIF)
- Increasing number of Private Communications

12 How much Data is Available?
(Figure: Growth of the CSD, 1970-2006, with predicted growth to 2010)
- 419,768 entries in June 2007
- >500,000 entries predicted during 2009

13 CSD Information content
Crystal structure data: atomic coordinates, unit cell, space-group symmetry (fully validated)

14 CSD Information content: Bibliographic and Chemical Information
- Bibliographic and chemical text and properties (all searchable). Example entry:
  - Compound: 4-Oxonicotinamide-1-(1'-beta-D-2',3',5'-tri-O-acetyl-ribofuranoside)
  - Source: Rothmannia longiflora
  - Colour: pale yellow
  - Habit: acicular
  - Polymorph: Form IV
  - Formula: C17 H20 N2 O9
  - Reference: G. Bringmann, M. Ochse, K. Wolf, J. Kraus, K. Peters, E-M. Peters, M. Herderich, L. Ake, F. Tayman, Phytochemistry 51 (1999), p271
  - R-factor: 0.0506
- Chemical diagram and chemical connectivity to enable 2D and 3D searching for substructures, pharmacophores and intermolecular interactions
- Cross-referencing between entries

15 Cambridge Structural Database System
- Cambridge Structural Database
- PreQuest: database production
- ConQuest: database search
- Mercury: graphical display, packing analysis
- VISTA: statistical analysis
- Knowledge bases:
  - Mogul: library of molecular geometry
  - IsoStar: library of intermolecular interactions

16 Mogul: A Knowledge Base of Molecular Geometries
Bruno et al., J. Chem. Inf. Comput. Sci., 44, 2133-2144, 2004

17 Mogul: Rapid access to CSD information
- Incorporates pre-computed libraries of bond lengths, valence angles and torsion angles, derived entirely from the CSD
- Sketch or import a molecule, then click on a feature of interest to view its distribution, mean values and statistics
- Very fast search speeds, with hyperlinks to the CSD to view specific structures
- Complete geometry: retrieve distributions for all bonds, angles and torsions in the molecule
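To make the idea of checking a feature against its distribution concrete, here is a minimal sketch that flags a bond length lying far from the mean of a reference sample. It is not Mogul's code or API: the function names, the threshold of 2.0 and the reference values are all hypothetical, and a simple z-score like this only makes sense for roughly unimodal features such as bond lengths and valence angles, not for torsions.

```python
from statistics import mean, stdev


def z_score(observed, reference_values):
    """Number of sample standard deviations between observed and the reference mean."""
    mu = mean(reference_values)
    sigma = stdev(reference_values)
    return (observed - mu) / sigma


def is_unusual(observed, reference_values, threshold=2.0):
    """Flag a value whose absolute z-score exceeds the (assumed) threshold."""
    return abs(z_score(observed, reference_values)) > threshold


# Hypothetical C-C single-bond lengths in Angstrom, standing in for a
# distribution retrieved from a knowledge base.
reference = [1.52, 1.53, 1.54, 1.53, 1.52, 1.54, 1.53, 1.55, 1.51, 1.53]

typical_bond_flagged = is_unusual(1.53, reference)
stretched_bond_flagged = is_unusual(1.65, reference)
```

A ligand model with many flagged bonds or angles would be a candidate for re-refinement before it is used in modelling.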

18 IsoStar: A Knowledge Base of Intermolecular Interactions
- Experimental data from:
  - Cambridge Structural Database
  - Protein Data Bank (protein-ligand complexes only)
- Theoretical potential energy minima (DMA, IMPT)
- Interaction distributions displayed immediately as scatterplots or contour surfaces
- >20,000 CSD scatterplots, >5,500 PDB scatterplots, 1,500 energy minima

19 IsoStar Methodology
- Search the CSD or PDB for structures containing the desired contact
- Superimpose the hits and display them as scatterplots
Example: central group -CONH2, contact group NH
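The superposition step can be illustrated with a simplified sketch: each hit's contact atom is re-expressed in a coordinate frame built from three atoms of its central group, so that all hits share one frame and can be plotted together. This is an assumption-laden stand-in, not IsoStar's actual procedure (a least-squares fit over the central-group atoms would be the more usual choice), and the function names and coordinates are hypothetical.

```python
import math


def sub(u, v):
    return tuple(a - b for a, b in zip(u, v))


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def unit(u):
    norm = math.sqrt(dot(u, u))
    return tuple(a / norm for a in u)


def to_local_frame(origin, on_x_axis, in_xy_plane, point):
    """Express point in the frame defined by three central-group atoms.

    origin becomes (0, 0, 0), on_x_axis lies on the +x axis and
    in_xy_plane lies in the xy plane.
    """
    x = unit(sub(on_x_axis, origin))
    z = unit(cross(x, sub(in_xy_plane, origin)))
    y = cross(z, x)
    rel = sub(point, origin)
    return (dot(rel, x), dot(rel, y), dot(rel, z))


# Two hypothetical hits showing the same contact geometry; the second is
# the first rotated by 90 degrees about z and translated by (5, 5, 5).
hit_1 = ((0, 0, 0), (1, 0, 0), (0, 1, 0), (1, 2, 3))
hit_2 = ((5, 5, 5), (5, 6, 5), (4, 5, 5), (3, 6, 8))

scatter = [to_local_frame(*hit) for hit in (hit_1, hit_2)]
```

Both hits map to the same point in the shared frame, which is what lets contacts from many crystal structures accumulate into a single scatterplot around the central group.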

20 Density Maps
Distributions can also be represented as density maps

21 The Workshop
Part 1: Validation of models and structural analysis
- Analysing a protein structure for errors and interesting features
- Comparing a structure with structures related by homology or by functionality
Part 2: Probing the Protein-Ligand Interface
- Substructure searching in Relibase/Relibase+
- Comparing the interactions of different ligands with the same target
- Validating an unusual interaction using substructure searching in Relibase+

22 How to access the workshop
- Webpage: http://relibase.ccdc.cam.ac.uk/
- Email address: demo@ccdc.cam.ac.uk
- Password: s1mple

23 www.ccdc.cam.ac.uk

24 Cavity Detection
Based on the LIGSITE program: M. Hendlich et al., J. Mol. Graph. (1997)
(Figure label: PROTEIN)
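As a rough illustration of grid-based cavity detection in the spirit of LIGSITE, the sketch below marks solvent grid cells that are enclosed by protein on both sides along several scan directions. It is a simplification under stated assumptions: only the three grid axes are scanned (the published method also scans the cubic diagonals), the protein is given as a pre-computed set of occupied cells, and the function name, grid size and threshold are hypothetical.

```python
def buried_cells(protein, size, min_events=2):
    """Return {cell: events} for solvent cells enclosed by protein.

    protein: set of (i, j, k) grid cells occupied by protein atoms.
    size: edge length of the cubic grid, in cells.
    An 'event' is a scan axis along which protein is found on both
    sides of the solvent cell (protein-solvent-protein).
    """
    axes = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]

    def hits_protein(cell, step):
        # Walk from cell in direction step until protein or the grid edge.
        i, j, k = (c + s for c, s in zip(cell, step))
        while 0 <= i < size and 0 <= j < size and 0 <= k < size:
            if (i, j, k) in protein:
                return True
            i, j, k = i + step[0], j + step[1], k + step[2]
        return False

    result = {}
    for i in range(size):
        for j in range(size):
            for k in range(size):
                cell = (i, j, k)
                if cell in protein:
                    continue
                events = sum(
                    1 for a in axes
                    if hits_protein(cell, a)
                    and hits_protein(cell, (-a[0], -a[1], -a[2]))
                )
                if events >= min_events:
                    result[cell] = events
    return result


# Hypothetical toy protein: a 3x3x3 block with a hollow centre,
# placed inside a 5x5x5 grid.
block = {(i, j, k) for i in range(1, 4) for j in range(1, 4) for k in range(1, 4)}
toy_protein = block - {(2, 2, 2)}
pockets = buried_cells(toy_protein, size=5)
```

Raising `min_events` keeps only the most deeply buried cells; lowering it also admits shallow grooves on the surface.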

25 The pseudo-centre concept: Coding Molecular Recognition into Simple Descriptors
Pseudo-centre types: donor, acceptor, aliphatic, pi/aromatic

26 Cavity Protein 3D Property Description

27 Similarity Search

28 Similarity Search: clique detection (Bron-Kerbosch algorithm)

29 Similarity Search: clique detection (Bron-Kerbosch algorithm)
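The clique-detection step can be sketched as follows, assuming the usual product-graph formulation: each node pairs a pseudo-centre from cavity A with a same-type pseudo-centre from cavity B, two nodes are connected when the corresponding distances in both cavities agree within a tolerance, and the largest clique is the largest mutually consistent match. This is an illustrative sketch, not CavBase code: it uses the basic Bron-Kerbosch recursion without pivoting, and the function names, tolerance and coordinates are hypothetical.

```python
import math
from itertools import combinations


def bron_kerbosch(r, p, x, adj, cliques):
    """Collect all maximal cliques (basic Bron-Kerbosch, no pivoting)."""
    if not p and not x:
        cliques.append(set(r))
        return
    for v in list(p):
        bron_kerbosch(r | {v}, p & adj[v], x & adj[v], adj, cliques)
        p = p - {v}
        x = x | {v}


def product_graph(cav_a, cav_b, tol=0.5):
    """Nodes are (i, j) pairs of same-type pseudo-centres; edges link
    pairs whose inter-centre distances agree within tol in both cavities."""
    nodes = [(i, j)
             for i, (type_a, _) in enumerate(cav_a)
             for j, (type_b, _) in enumerate(cav_b)
             if type_a == type_b]
    adj = {n: set() for n in nodes}
    for (i, j), (k, l) in combinations(nodes, 2):
        if i == k or j == l:
            continue  # one pseudo-centre cannot be matched twice
        dist_a = math.dist(cav_a[i][1], cav_a[k][1])
        dist_b = math.dist(cav_b[j][1], cav_b[l][1])
        if abs(dist_a - dist_b) <= tol:
            adj[(i, j)].add((k, l))
            adj[(k, l)].add((i, j))
    return adj


def best_match(cav_a, cav_b, tol=0.5):
    """Largest set of mutually consistent pseudo-centre pairings."""
    adj = product_graph(cav_a, cav_b, tol)
    cliques = []
    bron_kerbosch(set(), set(adj), set(), adj, cliques)
    return max(cliques, key=len)


# Hypothetical cavities: B contains a translated copy of A plus one
# extra donor far away from the rest.
cavity_a = [("donor", (0, 0, 0)), ("acceptor", (3, 0, 0)), ("aromatic", (0, 4, 0))]
cavity_b = [("donor", (10, 10, 10)), ("acceptor", (13, 10, 10)),
            ("aromatic", (10, 14, 10)), ("donor", (20, 20, 20))]

match = best_match(cavity_a, cavity_b)
```

Because only distances between pseudo-centres are compared, the match is independent of how either cavity is positioned or oriented, and no sequence information enters at any point.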

30 Similarity Analysis
Scoring based on matching pseudo-centres and the associated surface patches

31 An Example: 1OXO/1F2D
- Overlay of PLP ligands
- Matching pseudo-centres and surface patches shown

32 Crystal Packing
- Important, e.g. when docking ligands
- Concanavalin A (1cjp): binding site in Relibase+

33 1mtw: reference ligand, no packing
Reference in green, first-rank solution atom-coloured

34 1mtw, Packing Included
(Figure labels: reference ligand, no packing; including neighbouring chains; GOLD's first-rank solution)

