Appendix: Automated Methods for Structure Comparison Basic problem: how are any two given structures to be automatically compared in a meaningful way?


1 Appendix: Automated Methods for Structure Comparison
Basic problem: how are any two given structures to be automatically compared in a meaningful way? How are distant relationships to be recognized?

program   method
DALI      distance matrix comparison (basis for FSSP structural classification)
SSAP      dynamic programming (used in CATH to classify topologies)
VAST      convert secondary structures to vectors and align vectors

2 Structure comparison is pretty easy when two proteins are very similar. When two proteins are so similar that the sequences can be reliably aligned, say >35% identical, structure comparison can proceed from the sequence alignment:
1. Align the sequences:
sequence 1: YIREV-GKL
sequence 2: YITQVRNKA
2. Superpose the structures to minimize the RMSD for equivalent residue pairs in the alignment.
Note: the structures shown on this slide do not correspond to the sequences above.
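The bookkeeping behind these two steps can be sketched as follows. This is a minimal illustration, not any particular program's method: the function names are hypothetical, percent identity is computed over gap-free columns only (one of several conventions), and the RMSD function assumes the coordinates have already been superposed, since the superposition itself is not shown here.

```python
import math

def equivalent_pairs(aln1, aln2):
    """Return (i, j) residue index pairs for alignment columns without a gap."""
    pairs = []
    i = j = 0  # running residue indices in the ungapped sequences
    for a, b in zip(aln1, aln2):
        if a != "-" and b != "-":
            pairs.append((i, j))
        if a != "-":
            i += 1
        if b != "-":
            j += 1
    return pairs

def percent_identity(aln1, aln2):
    """Percent of gap-free columns in which both sequences have the same residue."""
    cols = [(a, b) for a, b in zip(aln1, aln2) if a != "-" and b != "-"]
    same = sum(1 for a, b in cols if a == b)
    return 100.0 * same / len(cols)

def rmsd(coords1, coords2, pairs):
    """RMSD over equivalent residue pairs (coordinates assumed already superposed)."""
    sq = [math.dist(coords1[i], coords2[j]) ** 2 for i, j in pairs]
    return math.sqrt(sum(sq) / len(sq))

# The toy alignment from the slide.
aln1 = "YIREV-GKL"
aln2 = "YITQVRNKA"
pairs = equivalent_pairs(aln1, aln2)
identity = percent_identity(aln1, aln2)
```

The gap in sequence 1 shifts the later pairs by one position, so residue 5 of sequence 1 (G) is paired with residue 6 of sequence 2 (N), counting from zero.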

3 It is harder when the proteins are very different. If one cannot align the sequences reliably, how does one establish which residues, if any, play equivalent structural roles in the two proteins? The answer is to attempt to align the structures directly, in such a way that structural equivalences in the two proteins are revealed. We will discuss how the distance-matrix-based algorithm of DALI solves this problem.

4 Distance Matrices: a 2D representation of 3D structure. Plot the sequence against itself and identify pairs of residues that are close to each other in space; usually the distance between alpha carbons is used. Close residue pairs show up as the dark parts of the matrix.
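As a rough sketch of this idea, the code below builds the matrix from a list of alpha-carbon coordinates and marks the "dark" cells. The function names, the toy coordinates, and the 8 Å cutoff are assumptions made for illustration; the slides do not specify a cutoff.

```python
import math

def distance_matrix(ca_coords):
    """Symmetric matrix of pairwise alpha-carbon distances."""
    n = len(ca_coords)
    return [[math.dist(ca_coords[i], ca_coords[j]) for j in range(n)]
            for i in range(n)]

def contact_map(dmat, cutoff=8.0):
    """True where two residues are closer than the cutoff (the 'dark' cells)."""
    return [[d < cutoff for d in row] for row in dmat]

# Toy coordinates (made up): four residues on a straight line, 3.8 apart.
ca = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
dmat = distance_matrix(ca)
contacts = contact_map(dmat)
```

The diagonal is always zero and the matrix is symmetric, which is why plots of such matrices often show only one triangle.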

5 Distance matrices

6 Different substructures, such as secondary or supersecondary structures, give rise to distinct patterns in the matrix, e.g. antiparallel vs. parallel beta-sheets. In principle, one could recognize structural similarity in two proteins by comparing patterns in their distance matrices, but it’s not that simple.

7 Problem: two structures with the same topology may differ in the precise location of secondary structure elements along the sequence, i.e. loop lengths may differ. Same fold, different matrices.

8 Or two structures with a common architecture may differ in connectivity (topology). Both examples are three-stranded antiparallel beta-sheets: how might we compare their distance matrices to reveal this similarity?

9 DALI algorithm: it is not useful to compare entire matrices. Instead, chop the distance matrices into all possible submatrices of 6x6 amino acids, and compare this set of submatrices for pattern similarities rather than comparing the entire matrices.
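The chopping step can be sketched as a simple enumeration of every 6x6 block of a distance matrix. The function name and the toy matrix are hypothetical, and whether a real implementation keeps every block or filters some out is not stated in the slides.

```python
def submatrices(dmat, size=6):
    """Yield (i, j, block) for every size x size block of a square matrix."""
    n = len(dmat)
    for i in range(n - size + 1):
        for j in range(n - size + 1):
            block = [row[j:j + size] for row in dmat[i:i + size]]
            yield i, j, block

# Toy 10 x 10 matrix using |i - j| as a stand-in for real distances.
n = 10
toy = [[abs(i - j) for j in range(n)] for i in range(n)]
blocks = list(submatrices(toy))
```

A chain of n residues gives (n - 5) squared blocks, so the set grows quadratically with chain length.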

10 1. Identify a pair of matching submatrices within the two matrices, and make an initial sequence alignment from this match.

11 2. Identify a second pair that overlaps the first (contains one common structural element).

12 3. Combine the overlapping pairs into an overall alignment of structurally equivalent sequence regions.

13 4. Rearrange and “collapse” the matrix according to the aligned regions of the sequence. Now the common structural elements are aligned, as are the structurally equivalent residues in the sequence!
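Step 1 of the procedure above might look roughly like the brute-force sketch below. It is not DALI's actual scoring: the mean absolute difference between blocks is a stand-in similarity measure chosen for simplicity, the exhaustive search ignores the optimizations a real program would need, and all names and the random toy coordinates are hypothetical.

```python
import math
import random

def distance_matrix(coords):
    n = len(coords)
    return [[math.dist(coords[i], coords[j]) for j in range(n)] for i in range(n)]

def block(dmat, i, j, size):
    return [row[j:j + size] for row in dmat[i:i + size]]

def block_difference(a, b):
    """Mean absolute difference between two equally sized blocks (0 = identical)."""
    diffs = [abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb)]
    return sum(diffs) / len(diffs)

def best_match(dmat_a, dmat_b, size=6):
    """Exhaustively find the most similar pair of submatrices.

    Returns (difference, (ia, ja), (ib, jb)).
    """
    best = None
    na, nb = len(dmat_a) - size + 1, len(dmat_b) - size + 1
    for ia in range(na):
        for ja in range(na):
            a = block(dmat_a, ia, ja, size)
            for ib in range(nb):
                for jb in range(nb):
                    d = block_difference(a, block(dmat_b, ib, jb, size))
                    if best is None or d < best[0]:
                        best = (d, (ia, ja), (ib, jb))
    return best

def seed_alignment(match, size=6):
    """Residue equivalences implied by one matched pair of submatrices."""
    _, (ia, ja), (ib, jb) = match
    pairs = {(ia + k, ib + k) for k in range(size)}
    pairs |= {(ja + k, jb + k) for k in range(size)}
    return sorted(pairs)

def random_point(rng):
    return (rng.uniform(0, 20), rng.uniform(0, 20), rng.uniform(0, 20))

# Toy data (made up): structure B is structure A plus two extra N-terminal residues.
rng = random.Random(0)
coords_a = [random_point(rng) for _ in range(10)]
coords_b = [random_point(rng) for _ in range(2)] + coords_a

match = best_match(distance_matrix(coords_a), distance_matrix(coords_b))
pairs = seed_alignment(match)
```

Because B contains A shifted by two residues, the best match has a difference of zero and the seed alignment pairs residue k of A with residue k + 2 of B. Steps 2 to 4 would then extend such seeds with further overlapping matches.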

14 All together now...

15 The Power of DALI: DALI is quite powerful because it can recognize architectural similarities even when topologies are different. It is also flexible because it can be made more topologically restrictive (i.e. no swapping of chain segments allowed) to focus on closer relationships.

16 FSSP uses DALI alignments to classify structures
all PDB entries (8320)
-> eliminate similar sequences
representative set of structures (947)
-> divide into domains
representative set of domains (1484)
-> align domains with DALI!
group domains into fold types (clusters of similar structures) and make a set of representatives of each fold (540)

17 Judging DALI alignments
Z-score: how much better than average the alignment is, i.e. how many standard deviations its score lies above the mean of a distribution of alignments of random pairs of proteins. >16 very close, 8-16 pretty close, <8 not so close.
RMSD: root mean square deviation of alpha carbons for the matching portion of the structures.
LALI: length of alignment (the recognizably matching portion of the structures).
LSEQ2: total length of the sequence being matched.
%IDE: % sequence identity between the two sequences.
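The Z-score definition above can be written out directly. This is only a sketch of the generic statistic: the background scores are hypothetical numbers, the population standard deviation is an assumed choice, and how DALI actually builds its background distribution is not described in the slides.

```python
import statistics

def z_score(score, background_scores):
    """Standard deviations by which a score exceeds the mean of a background set."""
    mean = statistics.mean(background_scores)
    sd = statistics.pstdev(background_scores)
    return (score - mean) / sd

def describe(z):
    """Map a Z-score to the rough categories given on the slide."""
    if z > 16:
        return "very close"
    if z >= 8:
        return "pretty close"
    return "not so close"

# Hypothetical numbers, for illustration only.
background = [8, 8, 12, 12]   # scores of alignments between random protein pairs
z = z_score(30, background)   # (30 - 10) / 2
```

With a background mean of 10 and a standard deviation of 2, a score of 30 gives a Z-score of 10, which falls in the "pretty close" band.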

18 If you go into FSSP and search for a particular structure, you’ll get an output of its best DALI alignments with other structures:

STRID2  Z     RMSD  LALI  LSEQ2  %IDE  PROTEIN
1plc    24.4  0.0   99    99     100   Plastocyanin (cu2+, ph 6.0)
2pcy    23.4  0.2   99    99     100   Apo-plastocyanin (pH 6.0)
1bqk    12.1  2.0   89    124    29    pseudoazurin
1aac    11.0  1.9   84    104    24    amicyanin
1ibzA   9.1   2.5   83    111    19    nitrosocyanin
1qhqA   8.3   2.4   87    139    29    auracyanin
1rcy    8.2   2.5   90    151    17    rusticyanin biological_unit
1qniA   7.7   2.2   78    572    19    nitrous-oxide reductase
1kcw    7.1   2.4   81    1017   17    ceruloplasmin biological_unit
2cuaA   7.0   2.2   80    122    15    cua fragment
1nwpA   6.7   3.1   85    128    24    azurin

