Protein Classification II CISC889: Bioinformatics Gang Situ 04/11/2002 Parts of this lecture borrowed from lecture given by Dr. Altman.


1 Protein Classification II CISC889: Bioinformatics Gang Situ 04/11/2002 Parts of this lecture borrowed from lecture given by Dr. Altman

2 Outline
1. Terminology
2. Classes of protein structures
3. Why do we need to align structures
4. Viewing protein structures
5. How to recognize structural similarities
6. Algorithms
7. Summary

3 Terminology
Tertiary structure: three-dimensional
Class: similar 2° structure (all α, all β, α + β, α/β)
Fold: major structural similarity; similar arrangement of 2° structures
Superfamily (topology): probable common ancestry
Family: clear evolutionary relationship; sequence similarity > 25%
Individual protein

4 Classes of Protein Structure
1. Class α
2. Class β
3. Class α/β
4. Class α + β
5. Multidomain proteins
6. Membrane and cell surface proteins
7. And more …

5 Structure of α class proteins

6 Structure of β class proteins

7 Structure of α/β class proteins

8 Structure of α + β class proteins

9 Structure of membrane proteins

10 What is structure alignment?
In performing a structural alignment, the three-dimensional structure of one protein domain is superimposed upon that of a second protein domain so as to achieve a minimal RMS deviation between corresponding atoms.
Purpose: to discover structural similarity.
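The superposition described above can be sketched in a few lines. This is a minimal illustration, not the method of any particular program: it assumes the residue correspondences are already known (row i of one coordinate array matches row i of the other), uses the Kabsch procedure (SVD of the covariance matrix) to find the best rigid rotation, and the function name superimposed_rmsd is hypothetical.

```python
import numpy as np

def superimposed_rmsd(P, Q):
    """RMSD between two (N, 3) coordinate sets after optimal rigid superposition.

    Assumes row i of P corresponds to row i of Q (e.g. matched C-alpha atoms).
    """
    P = np.asarray(P, dtype=float)
    Q = np.asarray(Q, dtype=float)
    # Remove translation by centering both point sets on their centroids.
    Pc = P - P.mean(axis=0)
    Qc = Q - Q.mean(axis=0)
    # Kabsch: the SVD of the covariance matrix yields the optimal rotation.
    U, _, Vt = np.linalg.svd(Pc.T @ Qc)
    # Flip one axis if needed so the result is a proper rotation, not a reflection.
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(U @ Vt))])
    R = U @ D @ Vt
    diff = Pc @ R - Qc
    return float(np.sqrt((diff ** 2).sum() / len(P)))
```

Because translation and rotation are removed before the deviation is measured, a rigidly moved copy of a structure gives an RMSD of essentially zero, while genuinely different shapes do not.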

11 Why Align Structures
1. For homologous proteins (similar ancestry), this provides the "gold standard" for sequence alignment and elucidates the common ancestry of the proteins.
2. For nonhomologous proteins, it allows us to identify common substructures of interest.
3. It allows us to classify proteins into clusters, based on structural similarity.

12 Evaluating Structural Alignments
Factors to be considered:
1. Number of amino acid correspondences created
2. RMSD of corresponding amino acids
3. Percent identity in aligned residues
4. Number of gaps introduced
5. Size of the two proteins
6. Conservation of known active site environments
…
There are no universally agreed-upon criteria. As usual, it depends on what you are using the alignment to do.
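Several of the listed quantities are easy to compute once an alignment exists. The sketch below is illustrative only: it assumes the alignment is given as a sorted list of (i, j) residue index pairs, that the two coordinate sets have already been superimposed, and that a "gap" means any skipped residue run between consecutive pairs. The function name alignment_summary and the returned keys are hypothetical.

```python
import math

def alignment_summary(pairs, seq_a, seq_b, coords_a, coords_b):
    """Summarize a structural alignment.

    pairs: sorted list of (i, j) correspondences between residues of A and B.
    coords_a, coords_b: per-residue (x, y, z) tuples, assumed already superimposed.
    """
    n = len(pairs)
    # Sum of squared distances between corresponding residues.
    sq = sum(math.dist(coords_a[i], coords_b[j]) ** 2 for i, j in pairs)
    # Aligned positions where both chains carry the same amino acid.
    identical = sum(seq_a[i] == seq_b[j] for i, j in pairs)
    # Count one gap each time consecutive pairs skip residues in either chain.
    gaps = sum((i2 - i1 > 1) + (j2 - j1 > 1)
               for (i1, j1), (i2, j2) in zip(pairs, pairs[1:]))
    shorter = min(len(seq_a), len(seq_b))
    return {
        "n_aligned": n,
        "rmsd": math.sqrt(sq / n) if n else float("nan"),
        "percent_identity": 100.0 * identical / n if n else 0.0,
        "n_gaps": gaps,
        # Fraction of the smaller protein covered, to account for protein size.
        "coverage": n / shorter if shorter else 0.0,
    }
```

Reporting the numbers side by side, rather than folding them into one score, matches the point that no single criterion is universally agreed upon.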

13 Methods (flowchart; boxes listed by the numbers shown on the slide)
1. Protein sequence
2. Database similarity search
3. Align with known structure?
4. Protein family analysis
5. Relationship to known structure?
6. Structural analysis
7. Predicted structure?
8. Tertiary comparative modeling
9. Tertiary structure analysis
Output: predicted tertiary structure. Decision boxes branch on Yes/No.

14 Viewing Protein Structures
Chime: http://www.umass.edu/microbio/chime/
A Web browser plug-in to display and manipulate structures inside a Web page.
Cn3D: http://www.ncbi.nlm.nih.gov/Structure/
Provides viewing of three-dimensional structures from Entrez and MMDB. Cn3D runs on Windows, MacOS, and Unix; simultaneously displays structural and sequence alignments; can show multiple superimposed images from NMR studies.
Mage: http://kinemage.biochem.duke.edu/ (see Richardson and Richardson 1994)
Standard molecular viewing features with animation and kaleidoscope effects.
Rasmol: http://www.umass.edu/microbio/rasmol/
Most commonly used viewer for Windows, MacOS, UNIX, and VMS operating systems. Performs many functions.
Swiss 3D viewer, Spdbv: http://www.expasy.ch/spdbv/mainpage.html (Guex and Peitsch 1997)
Protein models can be built by structural alignments; calculates atomic angles and distances; supports threading and energy minimization; interacts with the Swiss Model server.

15 Protein Structure Classification Databases
SCOP: structural classification of proteins
FSSP: fold classification and multiple structure alignments
CATH: structural classification of proteins
MMDB: structure comparisons by the VAST program
SARF

16 Alignment of Protein Structures
Difference: sequence vs. structural similarity. Indicator of evolutionary relationship?
Structures are more difficult to align:
1. Similar structures may form by many different foldings of the amino acid Cα backbone.
2. Although the local environments of many molecules in two proteins may be similar, there may also be some local differences.

17 How to recognize structural similarities
1. By eye
2. Algorithmically
Point-based methods use properties of points (distances) to establish correspondences.
Secondary structure-based methods use vectors representing secondary structures to establish correspondences.

18 Align Structures by Secondary Structures

19 Three prototypical methods
1. STRUCTAL uses dynamic programming iteratively to refine an arbitrary starting alignment.
2. DALI uses distance matrices to find similar patterns of distances, indicating correspondences.
3. LOCK uses vectors associated with secondary structures to do a quick screen for similar structures.

20 STRUCTAL
Uses dynamic programming iteratively to refine an arbitrary starting alignment.
STEPS:
1. Start with any set of correspondences between two structures (sequence alignment, secondary structure alignment, by eye, random).
2. Compute a score matrix by computing a score between all pairs of points based on their distance.
3. Trace back through the score matrix to find a new set of correspondences that maximizes the score (standard DP).
4. Iterate steps 2 and 3 until the score doesn't change.
Note: heuristic, no guarantees of success; depends on quality of starting structure.
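The four steps can be sketched as a short loop. This is a simplified illustration under stated assumptions, not the STRUCTAL program itself: the structures are superimposed on the current correspondences before distances are measured (implied by step 2), the score form 2M/(1 + (d/d0)^2) - M and the values of M, d0 and the gap penalty are assumed for illustration, and all function names are hypothetical.

```python
import numpy as np

def fit_transform(A, B):
    """Rotation R and translation t minimizing |A @ R + t - B| (Kabsch)."""
    ca, cb = A.mean(axis=0), B.mean(axis=0)
    U, _, Vt = np.linalg.svd((A - ca).T @ (B - cb))
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(U @ Vt))])  # avoid reflections
    R = U @ D @ Vt
    return R, cb - ca @ R

def dp_align(S, gap):
    """Global dynamic programming over score matrix S with a linear gap penalty."""
    n, m = S.shape
    F = np.zeros((n + 1, m + 1))
    F[:, 0] = -gap * np.arange(n + 1)
    F[0, :] = -gap * np.arange(m + 1)
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            F[i, j] = max(F[i - 1, j - 1] + S[i - 1, j - 1],
                          F[i - 1, j] - gap,
                          F[i, j - 1] - gap)
    # Trace back from the corner to recover the matched pairs.
    pairs, i, j = [], n, m
    while i > 0 and j > 0:
        if F[i, j] == F[i - 1, j - 1] + S[i - 1, j - 1]:
            pairs.append((i - 1, j - 1))
            i, j = i - 1, j - 1
        elif F[i, j] == F[i - 1, j] - gap:
            i -= 1
        else:
            j -= 1
    return float(F[n, m]), pairs[::-1]

def structal_like(A, B, pairs, M=20.0, d0=5.0, gap=10.0, max_iter=50):
    """Iteratively refine correspondences `pairs` between coordinate sets A and B."""
    A, B = np.asarray(A, float), np.asarray(B, float)
    best = -np.inf
    for _ in range(max_iter):
        if len(pairs) < 3:            # too few points to define a superposition
            break
        ia, ib = zip(*pairs)
        R, t = fit_transform(A[list(ia)], B[list(ib)])
        A_fit = A @ R + t
        # Step 2: score every pair of points from their distance (assumed form).
        d = np.linalg.norm(A_fit[:, None, :] - B[None, :, :], axis=2)
        S = 2.0 * M / (1.0 + (d / d0) ** 2) - M
        # Step 3: DP trace-back gives a new set of correspondences.
        score, new_pairs = dp_align(S, gap)
        if score <= best + 1e-9:      # Step 4: stop when the score stops improving
            break
        best, pairs = score, new_pairs
    return best, pairs
```

As the slide notes, the result depends on the starting correspondences; running the loop from several different starts and keeping the best score is one common way to reduce that sensitivity.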

21 Scoring in STRUCTAL
Need to find a score that is maximal when the alignment is good (good distances are small). May also want to include other computable attributes of the points.
Here M is the maximum score desired, d is the measured value (of distance or some other attribute), and d0 is the value at which the score is 0. Values of d below d0 get some "credit", while values above d0 are penalized.
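The formula itself did not survive in this transcript. One form that fits the description (score M at d = 0, score 0 at d = d0, negative beyond d0) is S(d) = 2M / (1 + (d/d0)^2) - M. The sketch below uses that form as an assumption; it may not be the exact expression shown on the slide, and the default values of M and d0 are illustrative.

```python
def distance_score(d, M=20.0, d0=5.0):
    """Assumed score form: M at d = 0, zero at d = d0, approaching -M for large d."""
    return 2.0 * M / (1.0 + (d / d0) ** 2) - M

# Small distances earn credit, d0 is the break-even point, larger values are penalized.
for d in (0.0, 2.5, 5.0, 10.0):
    print(f"d = {d:4.1f}  score = {distance_score(d):6.2f}")
```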

22 Distance Matrix
Similar to a dot matrix; used to identify the atoms that lie most closely together within a structure.
If two proteins have a similar structure, the graphs (distance matrices) of these structures will be superimposable.
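A distance matrix is straightforward to build from coordinates. The sketch below assumes one point per residue (for example the Cα atom) supplied as an (N, 3) array; the function name distance_matrix is hypothetical.

```python
import numpy as np

def distance_matrix(coords):
    """Symmetric (N, N) matrix of pairwise Euclidean distances between points."""
    X = np.asarray(coords, dtype=float)
    diff = X[:, None, :] - X[None, :, :]   # all pairwise coordinate differences
    return np.sqrt((diff ** 2).sum(axis=2))
```

Because only internal distances are stored, the matrix is unchanged when the structure is rotated or translated, which is why two structures can be compared through their matrices without superimposing them first.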

23 DALI
Uses distance matrices to find similar patterns of distances, indicating correspondences.
STEPS:
1. Systematically look through the two distance matrices to find pairs of segments with a similar pattern of distances. This provides pairs of similar segments.
2. Assemble the pairs into larger sets, to maximize the number of atoms and minimize the RMS distance between them. The assembly step is done in a random fashion, since the search space is too large.
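Step 1 can be illustrated with a brute-force search over small blocks of the two distance matrices. This is only a sketch of the idea, not DALI's actual procedure: the window length w, the tolerance tol, the mean-absolute-difference test and both function names are assumptions, and the randomized assembly of step 2 is not shown.

```python
import numpy as np

def distance_matrix(coords):
    """Pairwise Euclidean distances between points of one structure."""
    X = np.asarray(coords, dtype=float)
    return np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)

def similar_segment_pairs(DA, DB, w=4, tol=0.5):
    """Find segment pairs with similar inter-segment distance patterns.

    Returns a list of ((i, j), (k, l)): segments starting at i and j in
    structure A whose w x w distance block resembles that of the segments
    starting at k and l in structure B.
    """
    hits = []
    na, nb = len(DA), len(DB)
    for i in range(na - w + 1):
        for j in range(i + w, na - w + 1):          # non-overlapping segments in A
            block_a = DA[i:i + w, j:j + w]
            for k in range(nb - w + 1):
                for l in range(k + w, nb - w + 1):  # non-overlapping segments in B
                    block_b = DB[k:k + w, l:l + w]
                    if np.abs(block_a - block_b).mean() < tol:
                        hits.append(((i, j), (k, l)))
    return hits
```

The quadruple loop is far too slow for real proteins; it is written this way only to make the "compare patterns of distances" idea explicit.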

24 DALI

25 DALI

26 DALI

27 Fast Structural Search based on Secondary Structure Analysis
Steps for LOCK:
1. Define local secondary structures.
2. Find an initial superposition by using DP (and the score functions shown) to align secondary structure vectors.
3. Use a greedy algorithm to find nearest neighbors and minimize RMSD.
4. Prune the atoms to get a core with minimal RMSD.
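The idea of representing secondary structure elements as vectors can be illustrated as follows. This is a sketch under assumptions: each element is reduced to the best-fit line through its Cα coordinates, and two elements are compared by the angle between their directions. The function names are hypothetical, and the score functions referred to on the slide are not reproduced here.

```python
import numpy as np

def sse_vector(ca_coords):
    """Unit direction vector of a secondary structure element.

    Fits a least-squares line through the C-alpha coordinates (first right
    singular vector) and orients it from the first residue to the last.
    """
    X = np.asarray(ca_coords, dtype=float)
    centered = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(centered)
    axis = Vt[0]
    if np.dot(axis, X[-1] - X[0]) < 0:   # keep N-to-C orientation
        axis = -axis
    return axis

def angle_between(u, v):
    """Angle in degrees between two vectors."""
    cosang = np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))
    return float(np.degrees(np.arccos(np.clip(cosang, -1.0, 1.0))))
```

Comparing a handful of vectors per protein is much cheaper than comparing every atom, which is what makes a vector-based first pass suitable for a quick screen before the atom-level refinement of steps 3 and 4.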

28 Summary
1. Structural alignment is a key activity, combinatorially expensive, used for:
Gold standard for alignments
Elucidating evolutionary relationships
Creating classifications of protein structure
2. Multiple methods exist, often based on a basic DP approach, including:
Analysis of distances
Analysis of vectors
Combinations of both

29 Summary
1. STRUCTAL: dynamic programming using a distance metric
2. DALI: analysis of distance maps
3. LOCK: analysis of secondary structure vectors, followed by refinement with distances

