Protein Classification II CISC889: Bioinformatics Gang Situ 04/11/2002 Parts of this lecture borrowed from lecture given by Dr. Altman.


2 Outline
1. Terminology
2. Classes of protein structures
3. Why do we need to align structures?
4. Viewing protein structures
5. How to recognize structural similarities
6. Algorithms
7. Summary

3 Terminology
Tertiary structure: three-dimensional structure
Class: similar 2° structure (all α, all β, α+β, α/β)
Fold: major structural similarity; similar arrangement of 2° structure
Superfamily (topology): probable common ancestry
Family: clear evolutionary relationship; sequence similarity > 25%
Individual protein

4 Class of Protein Structure
1. Class α
2. Class β
3. Class α/β
4. Class α+β
5. Multidomain proteins
6. Membrane and cell surface proteins
7. And more ...

5 Structure of α class proteins

6 Structure of β class proteins

7 Structure of α/β class proteins

8 Structure of α+β class proteins

9 Structure of membrane proteins

10 What is structure alignment?
In a structure alignment, the three-dimensional structure of one protein domain is superimposed on that of a second protein domain so as to minimize the root-mean-square (RMS) deviation between corresponding atoms.
Purpose: to discover structural similarity.
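The superposition step can be sketched in a few lines. The example below is a minimal illustration, assuming the correspondences between the two point sets are already known and that NumPy is available; the function name `superimpose_rmsd` is hypothetical, and the rotation is found with the Kabsch method, which is one standard choice rather than anything prescribed by the lecture.

```python
import numpy as np

def superimpose_rmsd(P, Q):
    """Minimal RMSD between paired point sets P and Q (each N x 3).

    Row i of P is assumed to correspond to row i of Q.
    """
    P = np.asarray(P, dtype=float)
    Q = np.asarray(Q, dtype=float)
    # Remove translation by centering both sets on their centroids.
    Pc = P - P.mean(axis=0)
    Qc = Q - Q.mean(axis=0)
    # Kabsch: the optimal rotation comes from the SVD of the covariance matrix.
    H = Pc.T @ Qc
    U, _, Vt = np.linalg.svd(H)
    # Flip one axis if needed so the result is a proper rotation, not a reflection.
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    # Rotate P onto Q and measure the remaining deviation.
    diff = (R @ Pc.T).T - Qc
    return float(np.sqrt((diff ** 2).sum() / len(P)))
```

Two copies of the same structure that differ only by a rotation and a translation should give an RMSD of essentially zero; real protein pairs give a positive value that grows as the structures diverge.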

11 Why Align Structures
1. For homologous proteins (similar ancestry), structural alignment provides the "gold standard" for sequence alignment and elucidates the common ancestry of the proteins.
2. For nonhomologous proteins, it allows us to identify common substructures of interest.
3. It allows us to classify proteins into clusters based on structural similarity.

12 Evaluating Structural Alignments
Criteria to be considered:
1. Number of amino acid correspondences created
2. RMSD of corresponding amino acids
3. Percent identity in aligned residues
4. Number of gaps introduced
5. Size of the two proteins
6. Conservation of known active site environments
...
There are no universally agreed-upon criteria. As usual, it depends on what the alignment is being used for.
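Several of these criteria are simple to compute once an alignment is written as two gapped strings. The sketch below is illustrative only: the function name `alignment_stats`, the use of `-` as the gap character, and the choice to count each run of gap characters as one gap are all assumptions, not conventions taken from the lecture.

```python
def alignment_stats(a, b, gap="-"):
    """Summarize an alignment given as two equal-length gapped strings."""
    if len(a) != len(b):
        raise ValueError("aligned strings must have equal length")
    # Columns where both sequences have a residue are correspondences.
    pairs = [(x, y) for x, y in zip(a, b) if x != gap and y != gap]
    identical = sum(1 for x, y in pairs if x == y)

    def gap_runs(s):
        # Count each contiguous run of gap characters once.
        return sum(1 for i, c in enumerate(s)
                   if c == gap and (i == 0 or s[i - 1] != gap))

    return {
        "correspondences": len(pairs),
        "percent_identity": 100.0 * identical / len(pairs) if pairs else 0.0,
        "gaps": gap_runs(a) + gap_runs(b),
    }
```

For example, `alignment_stats("ACD-EFG", "AC-WEYG")` reports five correspondences, of which four are identical, and two gaps. RMSD is left out here because it needs coordinates rather than sequence letters.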

13 Methods
[Flowchart: (1) protein sequence; (2) database similarity search; (3) align with known structure?; (4) protein family analysis; (5) relationship to known structure?; (6) structural analysis; (7) predicted structure?; (8) tertiary comparative modeling; predicted tertiary structure; (9) tertiary structure analysis. Decision boxes branch on yes/no.]

14 Viewing Protein Structures
Chime: a Web browser plug-in to display and manipulate structures inside a Web page.
Cn3D: provides viewing of three-dimensional structures from Entrez and MMDB. Runs on Windows, MacOS, and Unix; simultaneously displays structural and sequence alignments; can show multiple superimposed images from NMR studies.
Mage (see Richardson and Richardson 1994): standard molecular viewing features with animation and kaleidoscope effects.
RasMol: most commonly used viewer for Windows, MacOS, UNIX, and VMS operating systems. Performs many functions.
Swiss 3D viewer, Spdbv (Guex and Peitsch 1997): protein models can be built by structural alignments; calculates atomic angles and distances; supports threading and energy minimization; interacts with the Swiss Model server.

15 Protein Structure Classification Databases
SCOP -- structural classification of proteins
FSSP -- fold classification and multiple structure alignments
CATH -- structural classification of proteins
MMDB -- structure neighbors identified by the VAST program
SARF

16 Alignment of Protein Structures
Difference: sequence vs. structural similarity. Indicator of evolutionary relationship?
Structures are more difficult to align:
1. A similar structure may form by many different foldings of the amino acid Cα backbone.
2. Although the local environments of many molecules in two proteins may be similar, there may also be some local differences.

17 How to recognize structural similarities
1. By eye
2. Algorithmically
   Point-based methods use properties of points (distances) to establish correspondences.
   Secondary structure-based methods use vectors representing secondary structures to establish correspondences.

18 Align Structures by Secondary Structures

19 Three prototypical methods
1. STRUCTAL uses dynamic programming iteratively to refine an arbitrary starting alignment.
2. DALI uses distance matrices to find similar patterns of distances, indicating correspondences.
3. LOCK uses vectors associated with secondary structures to do a quick screen for similar structures.

20 STRUCTAL
Uses dynamic programming iteratively to refine an arbitrary starting alignment.
Steps:
1. Start with any set of correspondences between the two structures (sequence alignment, secondary structure alignment, by eye, random).
2. Compute a score matrix by scoring all pairs of points based on their distance after superimposing the structures on the current correspondences.
3. Trace back through the score matrix to find a new set of correspondences that maximizes the score (standard DP).
4. Iterate steps 2 and 3 until the score no longer changes.
Note: this is a heuristic with no guarantee of success; the result depends on the quality of the starting alignment.
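The dynamic programming step (step 3) can be sketched on its own. The code below is a generic global-alignment DP over a precomputed score matrix with a linear gap penalty; it is a simplified illustration, not STRUCTAL's actual implementation, and the name `dp_correspondences` and the `gap` parameter are assumptions made for the example.

```python
def dp_correspondences(score, gap=1.0):
    """Find the order-preserving correspondences that maximize total score.

    score[i][j] is the score for pairing point i of structure A with
    point j of structure B; each skipped point costs `gap`.
    """
    n, m = len(score), len(score[0])
    # best[i][j]: best total for the first i points of A and first j of B.
    best = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        best[i][0] = best[i - 1][0] - gap
    for j in range(1, m + 1):
        best[0][j] = best[0][j - 1] - gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            best[i][j] = max(best[i - 1][j - 1] + score[i - 1][j - 1],
                             best[i - 1][j] - gap,
                             best[i][j - 1] - gap)
    # Trace back from the corner to recover the chosen pairs.
    pairs, i, j = [], n, m
    while i > 0 and j > 0:
        if best[i][j] == best[i - 1][j - 1] + score[i - 1][j - 1]:
            pairs.append((i - 1, j - 1))
            i, j = i - 1, j - 1
        elif best[i][j] == best[i - 1][j] - gap:
            i -= 1
        else:
            j -= 1
    pairs.reverse()
    return pairs, best[n][m]
```

In an iterative scheme of the kind described above, the returned pairs would be used to re-superimpose the structures, the score matrix would be recomputed from the new distances, and the DP would be run again until the total score stops changing.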

21 Scoring in STRUCTAL
We need a score that is maximal when the alignment is good (good distances are small). We may also want to include other computable attributes of the points.
In the scoring function, M is the maximum score desired, d is the measured value (of distance or some other attribute), and d0 is the value at which the score is 0. Values of d between 0 and d0 get some "credit", but values greater than d0 are penalized.
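The formula itself is not preserved in this transcript. Purely as an illustrative assumption, a linear function has the properties described: it equals M at d = 0, reaches 0 at d = d0, and goes negative beyond d0. The function names and the default values `max_score=20.0` and `d0=5.0` below are hypothetical and are not claimed to be STRUCTAL's.

```python
import math

def linear_score(d, max_score=20.0, d0=5.0):
    """Score that is max_score at d = 0, zero at d = d0, negative beyond d0."""
    return max_score * (1.0 - d / d0)

def score_matrix(A, B, max_score=20.0, d0=5.0):
    """Score every pair of points (one from A, one from B) by their distance.

    A and B are sequences of (x, y, z) coordinates that are assumed to be
    already superimposed in a common frame.
    """
    return [[linear_score(math.dist(p, q), max_score, d0) for q in B]
            for p in A]
```

A matrix built this way has the shape expected by a dynamic programming step: entry (i, j) rewards pairing points that lie close together and penalizes pairing points that lie far apart.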

22 Distance Matrix
Similar to a dot matrix; used to identify the atoms that lie closest together within a structure.
If two proteins have a similar structure, the distance-matrix plots of these structures will be superimposable.
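A distance matrix is straightforward to compute from coordinates. The sketch below assumes one representative point per residue (for instance the Cα atom) given as (x, y, z) tuples; the function name `distance_matrix` is chosen for the example.

```python
import math

def distance_matrix(coords):
    """Return the matrix of pairwise Euclidean distances within one structure."""
    return [[math.dist(p, q) for q in coords] for p in coords]
```

Because only internal distances are stored, the matrix does not change when the structure is rotated or translated, which is why two structures can be compared through their matrices without superimposing them first.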

23 DALI
Uses distance matrices to find similar patterns of distances, indicating correspondences.
Steps:
1. Systematically search the two distance matrices for pairs of segments with a similar pattern of distances. This provides pairs of similar segments.
2. Assemble the pairs into larger sets, to maximize the number of atoms and minimize the RMS distance between them. The assembly step is done in a randomized (Monte Carlo) fashion, since the search space is too large to enumerate.
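Step 1 can be illustrated with a brute-force search. The sketch below compares every small block of one distance matrix with every small block of the other and keeps the pairs whose distances agree within a tolerance. It is a deliberately simplified stand-in: the function names, the block `width`, the tolerance `tol`, and the mean-absolute-difference measure are all assumptions for the example and are not DALI's actual similarity score.

```python
import math

def distance_matrix(coords):
    """Pairwise Euclidean distances within one structure."""
    return [[math.dist(p, q) for q in coords] for p in coords]

def similar_fragment_pairs(DA, DB, width=3, tol=0.5):
    """Find pairs of fragment pairs with similar internal distance patterns.

    A hit ((i, j), (k, l), diff) means that the distances between fragments
    starting at i and j in structure A resemble those between fragments
    starting at k and l in structure B.
    """
    def block(D, i, j):
        # Distances from each residue of fragment i to each residue of fragment j.
        return [D[i + a][j + b] for a in range(width) for b in range(width)]

    hits = []
    nA, nB = len(DA), len(DB)
    for i in range(nA - width + 1):
        for j in range(i + width, nA - width + 1):  # non-overlapping fragments
            block_a = block(DA, i, j)
            for k in range(nB - width + 1):
                for l in range(k + width, nB - width + 1):
                    block_b = block(DB, k, l)
                    diff = sum(abs(x - y)
                               for x, y in zip(block_a, block_b)) / len(block_a)
                    if diff <= tol:
                        hits.append(((i, j), (k, l), diff))
    return hits
```

The nested loops make this sketch quartic in protein length, so it is only practical for toy inputs. The hits it returns are the raw material for the assembly step, which has to choose a mutually consistent subset.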

24 DALI

25 DALI

26 DALI

27 Fast Structural Search based on Secondary Structure Analysis
Steps for LOCK:
1. Define local secondary structures.
2. Find an initial superposition by using DP (and the score functions shown) to align secondary structure vectors.
3. Use a greedy algorithm to find nearest neighbors and minimize RMSD.
4. Prune the atoms to get a core with minimal RMSD.
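The greedy nearest-neighbor idea in step 3 can be sketched as follows. This is a minimal illustration rather than LOCK's actual procedure: it assumes the two structures are already roughly superimposed, and the function names and the `cutoff` distance are hypothetical.

```python
import math

def greedy_nearest_pairs(A, B, cutoff=3.0):
    """Greedily pair points of A with points of B, closest pairs first.

    Each point is used at most once, and pairs farther apart than
    `cutoff` are never formed.
    """
    candidates = sorted(
        (math.dist(p, q), i, j)
        for i, p in enumerate(A)
        for j, q in enumerate(B)
        if math.dist(p, q) <= cutoff
    )
    used_a, used_b, pairs = set(), set(), []
    for d, i, j in candidates:
        if i not in used_a and j not in used_b:
            used_a.add(i)
            used_b.add(j)
            pairs.append((i, j, d))
    pairs.sort()
    return pairs

def rmsd_of_pairs(pairs):
    """RMSD over the distances of the chosen pairs."""
    return math.sqrt(sum(d * d for _, _, d in pairs) / len(pairs))
```

In a refinement loop, the chosen pairs would be used to recompute the superposition and the pairing would be repeated; dropping the worst-fitting pairs afterwards corresponds to the pruning in step 4.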

28 Summary
1. Structural alignment is a key activity, combinatorially expensive, used for:
   Gold standard for alignments
   Elucidating evolutionary relationships
   Creating classifications of protein structure
2. Multiple methods exist, often based on a basic DP approach, including:
   Analysis of distances
   Analysis of vectors
   Combinations of both

29 Summary
1. STRUCTAL -- dynamic programming using a distance metric
2. DALI -- analysis of distance maps
3. LOCK -- analysis of secondary structure vectors, followed by refinement with distances