

Protein Structural Domain Assignment with a Delaunay Tessellation Derived Lattice

Todd J. Taylor, Iosif I. Vaisman

Abstract: A method of protein structural domain assignment using an Ising/Potts-like model on a lattice derived from the Delaunay tessellation of a protein structure is described. The method is very simple and agrees well with previously published methods.

Protein structures have been analyzed with a technique from computational geometry known as Delaunay tessellation (DT). Each amino acid is abstracted to a point (here its Cα atom), and the points are joined by edges to form a set of non-overlapping, irregular, space-filling tetrahedra. Each tetrahedron has the empty-sphere property: the sphere on whose surface all four of its vertices lie contains no vertex of any other tetrahedron.

The union of the surface faces of the tessellated protein forms the convex hull of the Cα point set. Surface irregularities are 'paved over' by long edges (20 Å or more), which create contacts between residue pairs that are too far apart to be 'true' neighbors. It is therefore sometimes expedient to impose an edge-length cutoff in the DT analysis.

Figure: Cα Delaunay tessellation of phosphoglycerate kinase (16pk) with no edge cutoff and with a 10 Å cutoff.
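The edge-length cutoff can be illustrated with a minimal Python sketch. It assumes the tessellation has already been computed elsewhere and is available as a list of tetrahedra (4-tuples of residue indices). The function name `delaunay_neighbors`, the toy coordinates, and the hand-written tetrahedron list are all hypothetical and are not taken from the authors' code.

```python
from itertools import combinations
from math import dist


def delaunay_neighbors(coords, tetrahedra, cutoff=None):
    """Build a neighbor map from tetrahedra, optionally dropping long edges.

    coords     : list of (x, y, z) points, one per residue (e.g. C-alpha atoms)
    tetrahedra : list of 4-tuples of indices into coords
    cutoff     : maximum edge length in angstroms, or None to keep every edge
    """
    neighbors = {i: set() for i in range(len(coords))}
    for tet in tetrahedra:
        # Each tetrahedron contributes its six edges.
        for i, j in combinations(tet, 2):
            if cutoff is None or dist(coords[i], coords[j]) <= cutoff:
                neighbors[i].add(j)
                neighbors[j].add(i)
    return neighbors


# Toy input (hypothetical): four tightly packed points plus one distant point.
coords = [
    (0.0, 0.0, 0.0),
    (3.8, 0.0, 0.0),
    (0.0, 3.8, 0.0),
    (0.0, 0.0, 3.8),
    (20.0, 6.0, 6.0),
]
tetrahedra = [(0, 1, 2, 3), (1, 2, 3, 4)]

print(delaunay_neighbors(coords, tetrahedra))               # all edges kept
print(delaunay_neighbors(coords, tetrahedra, cutoff=10.0))  # long edges removed
```

With the 10 Å cutoff, the distant point loses all of its neighbors, which is the kind of spurious long-range contact the cutoff is meant to remove.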

Protein domain assignment and DePot

Structural domains: Wetlaufer (1973). Definition: continuous segment(s) of the main chain that form a compact, stable structure with a hydrophobic core and that could potentially fold and function independently from the rest of the structure.

Delaunay-Potts: The sequence of domain labels is $S = \{s_1, s_2, \dots, s_N\}$, initialized to the residue numbers. Labels are updated according to

$$s_i^{t+1} = s_i^{t} + U\Big[\sum_{j} J(s_i^{t}, s_j^{t})\Big], \qquad i = 1, \dots, N,$$

where $j$ varies over the Delaunay neighbors of $i$, $U(x) = x/|x|$, and

$$J(s_i^{t}, s_j^{t}) = \begin{cases} 1 & \text{if } s_j > s_i \text{ and } d_{ij} \le r \\ -1 & \text{if } s_j < s_i \text{ and } d_{ij} \le r \\ 0 & \text{if } d_{ij} > r \end{cases}$$

Here $d_{ij}$ is the distance between residues $i$ and $j$, and $r$ is the cutoff distance, typically [value missing] Å.

Pick a residue at random and update it immediately (asynchronous updating). Iterate until the shape of the domain label profile meets the ending 'stairstep' criteria.

Smooth in a window around residue $i$, replacing the label at $i$ with the median in the window.

Post-processing fine-tunes the assignment: no domain is smaller than 40 residues, and no domain boundary cuts a beta sheet.
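The update rule and the median smoothing step can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' implementation: U(0) is taken to be 0, residues with equal labels contribute 0 to the sum, a fixed number of sweeps stands in for the 'stairstep' stopping criterion (which is not specified here), and `statistics.median_low` is used so that the smoothed value is always an existing label. All function names are hypothetical.

```python
import random
from math import dist
from statistics import median_low


def sign(x):
    """U(x) = x/|x|, with U(0) taken as 0 (an assumption)."""
    return (x > 0) - (x < 0)


def update_residue(i, labels, coords, neighbors, r):
    """Apply one asynchronous update to residue i, modifying labels in place."""
    total = 0
    for j in neighbors[i]:
        # J is +1 or -1 for neighbors within the cutoff r, and 0 otherwise.
        if dist(coords[i], coords[j]) <= r:
            total += sign(labels[j] - labels[i])
    labels[i] += sign(total)


def run_sweeps(labels, coords, neighbors, r, n_sweeps, seed=0):
    """Pick residues at random and update each one immediately.

    A fixed sweep count replaces the 'stairstep' stopping test (assumption).
    """
    rng = random.Random(seed)
    n = len(labels)
    for _ in range(n_sweeps * n):
        update_residue(rng.randrange(n), labels, coords, neighbors, r)
    return labels


def median_smooth(labels, half_window):
    """Replace each label with the (low) median of a window centered on it."""
    smoothed = []
    for i in range(len(labels)):
        lo = max(0, i - half_window)
        hi = min(len(labels), i + half_window + 1)
        smoothed.append(median_low(labels[lo:hi]))
    return smoothed


# Toy input (hypothetical): residues 0 and 1 are close, residue 2 is far away.
coords = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (50.0, 0.0, 0.0)]
neighbors = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}}
labels = [1, 2, 3]  # initialized to residue numbers
print(run_sweeps(labels, coords, neighbors, r=10.0, n_sweeps=5))
print(median_smooth([1, 1, 9, 1, 1], half_window=1))
```

Because each update moves a label by at most one step toward the majority of its in-range neighbors, spatially compact groups of residues drift toward a common label while distant groups stay separate.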

Figure: Schematic of the Delaunay-Potts (DePot) procedure (regions labeled domain 1 and domain 2).

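Example assignments and evolution of domain labels

Entries ending in a comma are truncated in this transcript; blank entries and the remaining domain columns were not preserved.

2lao (columns: domain1, domain2)

Method          domain1
Expert          1-90,
DALI            1-89,
CATH            1-90,
PDP             1-90,
DomainParser2   1-89,
DEE             1-89,
DDBASE          5-91,
Islam           1-88,
SCOP            1-238
DOMS            1-90,
DePot           1-91,

avhA (columns: domain1, domain2, domain3, domain4)

Method          domain1
Expert
DALI            3-86,
CATH
PDP             3-140,
DomainParser2   3-89,
DEE
DDBASE
Islam
SCOP            3-320
DOMS
DePot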

Performance on combined Jones, Taylor, and Veretnik test set with respect to expert assignment

Table (values not preserved): columns are same #, overlap, VI, and Rand; rows are DOMS, SCOP, DePot, Islam, DDBASE, DEE, Domain Parser, PDP, CATH, and DALI.

DePot, along with several other methods, was tested on a set of 100 structures drawn from three previously published domain assignment papers. Similarity to the expert assignments was measured with the overlap score, which has been used before in the literature, and with two other scoring schemes from the clustering literature that had not previously been applied to domain assignment.
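The two clustering-based scores can be sketched as below, assuming that "Rand" denotes the Rand index and "VI" the variation of information, and that each domain assignment is given as one domain label per residue. The function names are hypothetical, and the overlap score is omitted because its definition is not given here.

```python
from collections import Counter
from itertools import combinations
from math import log


def rand_index(a, b):
    """Fraction of residue pairs on which two assignments agree.

    A pair agrees when both assignments put the two residues in the same
    domain, or both put them in different domains. Needs at least 2 residues.
    """
    if len(a) != len(b):
        raise ValueError("assignments must cover the same residues")
    pairs = list(combinations(range(len(a)), 2))
    agree = sum((a[i] == a[j]) == (b[i] == b[j]) for i, j in pairs)
    return agree / len(pairs)


def variation_of_information(a, b):
    """VI = H(A|B) + H(B|A) in nats; 0 means identical partitions."""
    if len(a) != len(b):
        raise ValueError("assignments must cover the same residues")
    n = len(a)
    count_a = Counter(a)
    count_b = Counter(b)
    count_ab = Counter(zip(a, b))
    vi = 0.0
    for (x, y), n_xy in count_ab.items():
        p_xy = n_xy / n
        vi -= p_xy * (log(p_xy / (count_a[x] / n)) + log(p_xy / (count_b[y] / n)))
    return vi


# Toy per-residue domain labels (hypothetical data).
expert = [1, 1, 2, 2]
method = [1, 1, 1, 2]
print(rand_index(expert, method))  # 0.5
print(variation_of_information(expert, method))
```

Both scores depend only on how residues are grouped, not on the label values, so renumbering the domains of one assignment does not change the result.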

Selected references

[1] Singh RK, Tropsha A, Vaisman II (1996) Delaunay tessellation of proteins: four body nearest-neighbor propensities of amino acid residues. J Comput Biol 3(2):
[2] Taylor TJ, Vaisman II (2006) Protein structural domain assignment with a Delaunay tessellation derived lattice. Proceedings of the 3rd International Symposium on Voronoi Diagrams in Science and Engineering.
[3] Taylor WR (1999) Protein structural domain identification. Protein Eng 12:
[4] Veretnik S, Bourne PE, Alexandrov NN, Shindyalov IN (2004) Toward consistent assignment of protein domains in proteins. J Mol Biol 339:
[5] Holland TA, Veretnik S, Shindyalov IN, Bourne PE (2006) Partitioning protein structures into domains: why is it so difficult? J Mol Biol 361(3):
[6] Jones S, Stewart M, Michie A, Swindells MB, Orengo C, Thornton JM (1998) Domain assignment for protein structures using a consensus approach: characterization and analysis. Protein Sci 7:
[7] Okabe A (2000) Spatial tessellations: concepts and applications of Voronoi diagrams. Wiley.

Assignment server

Acknowledgements

W.R. Taylor for the DOMS method and code. Stella Veretnik for discussions regarding her work with domain assignment. NSF for funding.