Coordinate handling and exploitation An overview of coordinate functionality in CCP4 suite Coordinate functionality in REFMAC group of programs (A. Vaguine)

Presentation transcript:

Coordinate handling and exploitation
- An overview of coordinate functionality in CCP4 suite
- Coordinate functionality in REFMAC group of programs (A. Vaguine)
- New CCP4 project “Protein Interfaces” (E. Krissinel)

Coordinate support in CCP4
- Old FORTRAN coordinate-related applications not using RWBrook (42%): own coordinate functions
- Refmac group of programs: own coordinate functions
- Old FORTRAN coordinate-related applications using RWBrook (33%): RWBrook emulator, MMDB (C++ Coordinate Library)
- New C & C++ coordinate-related applications (a few): Clipper, Molecular Graphics, Coot, SSM, MMDB (C++ Coordinate Library)
- DNA group: own coordinate functions
- Other: own coordinate functions

CCP4 Coordinate Library (MMDB)
Diagram labels: PDB file, mmCIF file, binary file; Manager (interface API); one or more C++ classes; Header, Cryst, Sequence, Model, Chain, Residue, Atom.
- C++ class hierarchy
- PDB/mmCIF support
- Database features
- ~600 interface functions
- Emulates RWBrook
- Wealth of retrieval, selection, transformation and edit tools
- User-defined data
- Built-in high-level functionality (contacts, alignment, superposition etc.)
- Monomer database
- SWIG interface
- Stable and documented
E. Krissinel et al. (2004) Acta Cryst. D
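The Model, Chain, Residue and Atom classes named on this slide form a nested hierarchy. A minimal Python sketch of such a hierarchy follows. Every class and function name in it is a hypothetical illustration and not the MMDB API (MMDB itself is a C++ library); the selection helper only mimics the idea of retrieving atoms by chain and atom name.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Atom:
    name: str                          # atom name, e.g. "CA"
    xyz: Tuple[float, float, float]    # orthogonal coordinates


@dataclass
class Residue:
    name: str                          # residue name, e.g. "GLY"
    seq_num: int
    atoms: List[Atom] = field(default_factory=list)


@dataclass
class Chain:
    chain_id: str
    residues: List[Residue] = field(default_factory=list)


@dataclass
class Model:
    serial: int
    chains: List[Chain] = field(default_factory=list)


def select_atoms(model: Model, chain_id: str, atom_name: str) -> List[Atom]:
    """Hypothetical selection helper: atoms with a given name in one chain."""
    return [
        atom
        for chain in model.chains if chain.chain_id == chain_id
        for residue in chain.residues
        for atom in residue.atoms if atom.name == atom_name
    ]


def build_demo_model() -> Model:
    """Build a tiny two-chain model with invented coordinates."""
    gly = Residue("GLY", 1, [Atom("N", (0.0, 0.0, 0.0)), Atom("CA", (1.5, 0.0, 0.0))])
    ala = Residue("ALA", 2, [Atom("N", (2.0, 1.2, 0.0)), Atom("CA", (3.4, 1.3, 0.0))])
    ser = Residue("SER", 1, [Atom("CA", (9.0, 9.0, 9.0))])
    return Model(1, [Chain("A", [gly, ala]), Chain("B", [ser])])
```

Calling `select_atoms(build_demo_model(), "A", "CA")` walks the hierarchy top-down and returns the two C-alpha atoms of chain A.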

General remarks
- Approximately 40% of the CCP4 suite now uses a common set of coordinate functions provided by MMDB. This should help greatly in maintenance and adaptation to possible format changes.
- Converting older FORTRAN applications that do not use RWBrook to MMDB means, in most cases, a complete rewrite. This does not seem to be necessary at the moment.
- All on-going developments in FORTRAN seem to be using their own coordinate functions and libraries.
- MMDB delivers its full power only through the C++ interface. Most of the MMDB functionality cannot be expressed in traditional FORTRAN terms.
- Should we encourage new coordinate developments in C/C++ using MMDB, and so shift away from FORTRAN thinking?
- New coordinate-related CCP4 projects (MG, Coot, SSM and Protein Interfaces) are all based on MMDB, and that seems to be an advantage for the projects.

PIAS
Protein Interactions, Assemblies and Searches
E. Krissinel
CCP4 - EBI/MSD project

PIAS Project goals
Develop a tool and a publicly available interactive service to aid the solution of different tasks that involve structural and chemical analysis of protein interactions, such as
- prediction of oligomeric states
- analysis of structure-function relationship
- analysis and prediction of protein interactions
- search for interface homologues
- active site recognition and analysis
- protein surface analysis
- structure specificity analysis
- other
Project started in 2004.

PIAS Project overview
Legend: provisional parts, subject to progress and feasibility
- PIAS database
- Interactive Web server
- Crystal interfaces
- Interface calculations, analysis, scoring & biological significance
- Interfaces & structure similarity searches
- Interface fingerprinting
- Applied studies (e.g. discovery of multispecific proteins)
- Active site recognition
- Docking
- Procedures for CCP4 MG
- Prediction of interfaces
- Prediction of oligomeric states (PQS-3)
- Interfaces & surface similarity searches

PIAS Project schedule
- PIAS database
- Crystal interfaces
- Interface calculations, analysis, scoring & biological significance
- Interfaces & structure similarity searches
- Interface fingerprinting
- Applied studies (e.g. discovery of multispecific proteins)
- Active site recognition
- Docking
- Procedures for CCP4 MG
- Prediction of interfaces
- Prediction of oligomeric states (PQS-3)
- Interfaces & surface similarity searches

PIAS database
An interface is defined as the area that becomes inaccessible to solvent upon complex formation.
The database contains interfaces between polypeptides found in all PDB entries: all crystal contacts for X-ray entries and chain contacts for NMR entries. It also contains predicted protein assemblies.
Databased properties for interfaces:
- Interface area per residue (+ selection of interfacing atoms and residues)
- Number of atoms and residues involved
- Solvation energy gain (per residue) and P-value of hydrophobic patches
- List of potential hydrogen bonds and salt bridges
- Complexation significance score
Databased properties for interfacing structures:
- Size, weight
- Solvent accessible area per residue (+ selection of surface atoms and residues)
- Solvation energy per residue
- SSM data for structure search
- Structure and sequence alignment
Databased properties for assemblies:
- Composition, chemical formula
- List of engaged interfaces
- Transformation matrices
- Solvation energy gain
- Solvent accessible and buried surface area
- Dissociation pattern and barrier
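The interface definition on this slide (area that becomes inaccessible to solvent upon complex formation) can be turned into a short calculation. The Python sketch below assumes that per-atom solvent-accessible areas have already been computed by some other tool, once for the isolated chain and once for the complex. The function names, the key layout and the numbers are hypothetical and are not taken from PIAS.

```python
from collections import defaultdict


def buried_area_per_residue(asa_free, asa_complex):
    """Sum, per residue, the accessible area lost on complex formation.

    Both arguments map (chain, residue_number, atom_name) to an accessible
    area in square angstroms and must share the same keys. Residues that
    lose no area are left out, so the keys of the result are the
    interfacing residues.
    """
    buried = defaultdict(float)
    for atom_key, free_area in asa_free.items():
        chain, residue_number, _atom_name = atom_key
        loss = free_area - asa_complex[atom_key]
        if loss > 0.0:
            buried[(chain, residue_number)] += loss
    return dict(buried)


def interface_area(buried):
    """Total area buried on one side of the interface."""
    return sum(buried.values())


# Invented example values for four atoms of chain A.
ASA_FREE = {
    ("A", 10, "CB"): 30.0,
    ("A", 10, "CG"): 12.5,
    ("A", 11, "O"): 8.0,
    ("A", 50, "N"): 5.0,
}
ASA_COMPLEX = {
    ("A", 10, "CB"): 10.0,
    ("A", 10, "CG"): 12.5,
    ("A", 11, "O"): 0.0,
    ("A", 50, "N"): 5.0,
}
```

With these values, residues 10 and 11 of chain A are selected as interfacing and residue 50 is not, because its accessible area is unchanged in the complex.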

Prediction of oligomeric states (PQS-3)
Existing tools for the calculation of quaternary structures:
- PQS (PQS-1), MSD (Kim Henrick). Method: recursive splitting of the largest complexes allowed by crystal symmetry. The termination criterion is derived from the individual statistical scores of crystal contacts. The results are not curated.
- PITA (PQS-2), Thornton group, EBI (Hannes Ponstingl). Method: progressive build-up by addition of monomeric chains that suit the selection criteria. The results are partly curated.

Prediction of oligomeric states (PQS-3)
Graph-chemical approach:
- The crystal is represented as a periodic graph of monomers (a “supermolecule”).
- All possible assemblies that obey the symmetry criteria are recursively enumerated as subgraphs covering the whole crystal.
- Only sets of chemically stable assemblies are left as an answer.
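The enumeration idea can be illustrated on a toy example. The Python sketch below is not the PIAS algorithm. It assumes a small finite graph instead of a periodic one, it models the symmetry criterion by engaging or breaking every contact of a given interface type together, and it uses an invented stability rule (an engaged interface type must have a negative binding energy, a broken one must not). All names and numbers are hypothetical.

```python
from itertools import combinations


def components(nodes, edges):
    """Connected components of an undirected graph (simple union-find)."""
    parent = {n: n for n in nodes}

    def find(n):
        while parent[n] != n:
            parent[n] = parent[parent[n]]  # path halving
            n = parent[n]
        return n

    for a, b in edges:
        parent[find(a)] = find(b)

    groups = {}
    for n in nodes:
        groups.setdefault(find(n), set()).add(n)
    return sorted((frozenset(g) for g in groups.values()), key=sorted)


def enumerate_assembly_sets(nodes, contacts):
    """Yield (engaged_types, assemblies) for every subset of interface types.

    contacts maps an interface type to the monomer pairs it connects.
    Each set of assemblies covers all monomers of the toy crystal.
    """
    types = sorted(contacts)
    for k in range(len(types) + 1):
        for engaged in combinations(types, k):
            edges = [pair for t in engaged for pair in contacts[t]]
            yield engaged, components(nodes, edges)


def stable_assembly_sets(nodes, contacts, binding_energy):
    """Keep only the sets that pass the toy stability rule."""
    kept = []
    for engaged, assemblies in enumerate_assembly_sets(nodes, contacts):
        engaged_ok = all(binding_energy[t] < 0 for t in engaged)
        broken_ok = all(binding_energy[t] >= 0 for t in contacts if t not in engaged)
        if engaged_ok and broken_ok:
            kept.append((engaged, assemblies))
    return kept


# Invented toy crystal: four monomers, two interface types.
NODES = ["A", "B", "C", "D"]
CONTACTS = {"tight": [("A", "B"), ("C", "D")], "weak": [("B", "C")]}
ENERGY = {"tight": -10.0, "weak": 3.0}
```

For this toy input the enumeration produces four candidate sets, and the stability rule leaves one answer: two dimers, A-B and C-D, held by the "tight" interface.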

Prediction of oligomeric states (PQS-3)
Success rate obtained on a benchmark set of 212 structures (H. Ponstingl):

Method          Success rate   Notes
PQS MSD         78%            not optimised on the benchmark set
PITA software   84%            optimised with 18 parameters
PIAS            89%            optimised with 8 parameters, underfit

Early results outside the benchmark set indicate that PIAS performs somewhat better; however, the actual differences may be less significant.
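One way to see why the differences may be less significant is to estimate the sampling error of a success rate measured on 212 structures. The Python sketch below uses the binomial standard error and treats the two rates being compared as independent samples. Both are simplifying assumptions made here for illustration, not part of the original analysis, and they ignore the fact that two of the methods were optimised on the same benchmark.

```python
import math

BENCHMARK_SIZE = 212
SUCCESS_RATES = {"PQS": 0.78, "PITA": 0.84, "PIAS": 0.89}


def binomial_standard_error(p, n):
    """Standard error of a proportion p estimated from n trials."""
    return math.sqrt(p * (1.0 - p) / n)


def difference_in_standard_errors(p1, p2, n):
    """Gap between two rates, in units of its combined standard error."""
    combined = math.sqrt(
        binomial_standard_error(p1, n) ** 2 + binomial_standard_error(p2, n) ** 2
    )
    return abs(p1 - p2) / combined


if __name__ == "__main__":
    for name, rate in SUCCESS_RATES.items():
        se = binomial_standard_error(rate, BENCHMARK_SIZE)
        print(f"{name}: {rate:.0%} +/- {se:.1%}")
```

Under these assumptions each rate carries an uncertainty of roughly two to three percentage points, so the five-point gap between PITA and PIAS is only about one and a half combined standard errors.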

Prediction of oligomeric states
PQS may be predicted only up to a certain level of confidence. It seems that 85-90% of correct predictions may be reached. Main reasons why a 100% success rate can never be achieved:
- theoretical models for protein affinity and entropy change upon complexation are primitive
- coordinate (experimental) data are of limited accuracy
- there is no feasible way to take conformation changes into account
- experimental data on multimeric states are very limited and not always reliable, so calibration of parameters is difficult
- assemblies may exist in some environments and dissociate in others, so a definitive answer is simply not there

Interfaces and structure similarity searches
Searching the PIAS database for structurally similar interfaces and interfaces between similar structures
Questions to answer:
- What interfaces are formed by structures similar to the given one(s) in PDB?
- What are the interface partners of a given structure in PDB?
- What is the relation between sequence and biological (complexation) significance of the interface (function)?
- What PQS may be formed by structures similar to the given one(s), and how may the PQS depend on the sequence?
- Is a given structure interaction-specific and/or multispecific?

PIAS web server
A preliminary version of the MSD protein interaction service is set up at [URL missing]. The version includes:
- Calculations for uploaded files or database retrievals on PDB Id code of
  - Solvent-accessible surface area
  - Crystal contacts / interfaces
  - Protein interface parameters and scoring
    - Interface area
    - Solvation energy gain
    - Hydrogen bonds and salt bridges
    - Hydrophobic P-value
    - Biological relevance score
    - Selection of interfacing residues and atoms
  - Protein Quaternary Structures
- Interface and structure searches in protein interface database derived from PDB
- Visualisation of the structures, interfaces and PQS

PIAS web server

3gcb hexamer
Dissociation of 3gcb hexamer

Concluding remarks
The PIAS software is almost ready for first release. It may be released in two months' time after
- catching up with on-line help and documentation
- minor cleaning and re-design of output pages
- enhancement of structural search options
- further entropy calibration to increase accuracy of PQS prediction
Further work will concentrate on
- surface calculation and analysis
- surface / active sites searches
- possibly docking
- additions to and improvements of existing functions (based on users’ feedback and own needs)