Download presentation
Presentation is loading. Please wait.
Published byMarylou Martin Modified over 9 years ago
1
Three-Stage Prediction of Protein Beta-Sheets Using Neural Networks, Alignments, and Graph Algorithms Jianlin Cheng and Pierre Baldi Institute for Genomics and Bioinformatics School of Information and Computer Sciences University of California Irvine
2
Importance of Predicting Beta- Sheet Structure Ab-initio Structure Prediction Fold Recognition Model Refinement Protein Design Protein Folding Rendered in Protein Explorer beta-sheet helix Coil
3
An Example of Beta-Sheet Architecture Structure of Protein 1VJG Beta Sheets Level 1 4 5 2 1 3 6 7
4
An Example of Beta-Sheet Architecture Structure of Protein 1VJG Beta Sheets Strand Strand Pair Strand Alignment Pairing Direction Level 1 Level 2 Antiparallel Parallel 4 5 2 1 3 6 7
5
An Example of Beta-Sheet Architecture Structure of Protein 1VJG Beta Sheets Strand Strand Pair Strand Alignment Pairing Direction Beta Residue Residue Pair Level 1 Level 2 Level 3 Antiparallel Parallel 4 5 2 1 3 6 7 H-bond
6
Previous Work Statistical potential approach for strand alignment (Hubbard, 1994; Zhu and Braun, 1999) Statistical potentials to improve beta-sheet secondary structure prediction (Asogawa,1997) Information theory approach for strand alignment (Steward and Thornton, 2000) Neural networks for beta-residue pairs (Baldi, et al., 2000)
7
Three-Stage Prediction of Beta-Sheets Stage 1 Predict beta-residue pairing probabilities using 2D-Recursive Neural Networks ( 2D- RNN, Baldi and Pollastri, 2003 ) Stage 2 Use beta-residue pairing probabilities to align beta-strands Stage 3 Predict beta-strand pairs and beta-sheet architecture using graph algorithms
8
Dataset and Statistics Num Protein Chains916 Beta Residues48,996 Beta Residue Pairs31,638 Beta Strands10,745 Beta Strand Pairs8,172 Beta Sheets2,533 Extract proteins with high resolution from Protein Data Bank (Berman et al., 2000) Use DSSP (Kabsch and Sander, 1983) to assign intra-chain beta- sheet structure Use UniqueProt (Mika and Rost, 2003) to reduce redundancy Use PSI-BLAST (Altschul et al., 1997) to generate profiles Statistics
9
Stage 1: Prediction of Beta-Residue Pairings Using 2D-RNN Input Matrix I (m×m) 2D-RNN O = f(I) Target / Output Matrix (m×m) (i,j) 20 profiles3 SS2 SA T ij : 0/1 O ij : Pairing Prob. (i,j) I ij X i -2 X i -1 X i X i +1 X i +2 X j -2 X j -1 X j X j +1 X j +2 |X i – X j | X i or X j is the position of beta-residue i or j in the sequence
10
An Example (Target) Protein 1VJG Beta-Residue Pairing Map (Target Matrix) 1234567
11
An Example (Target) Protein 1VJG Beta-Residue Pairing Map (Target Matrix) 1234567 Antiparallel Parallel
12
An Example (Prediction)
13
Stage 2: Beta-Strand Alignment Use output probability matrix as scoring matrix Dynamic programming Disallow gaps and use the simplified search algorithm 1m n1 1m 1n Antiparallel Parallel Total number of alignments = 2(m+n-1)
14
Strand Alignment and Pairing Matrix The alignment score is the sum of the pairing probabilities of the aligned residues The best alignment is the alignment with the maximum score Strand Pairing Matrix Strand Pairing Matrix of 1VJG
15
Stage 3: Prediction of Beta-Strand Pairings and Beta-Sheet Architecture (Constraints) (a) Seven strands of protein 1VJG in sequence order (b) Beta-sheet topology of protein 1VJG
16
Stage 3: Prediction of Beta-Strand Pairings and Beta-Sheet Architecture (Constraints) (a) Seven strands of protein 1VJG in sequence order (b) Beta-sheet topology of protein 1VJG Protein: 1B7G Rendered in Rasmol 3 partners
17
Minimum Spanning Tree Like Algorithm Strand Pairing Graph (SPG) (a) Complete SPG Strand Pairing Matrix
18
Minimum Spanning Tree Like Algorithm Strand Pairing Graph (SPG) Goal: Find a set of connected subgraphs that maximize the sum of the alignment scores and satisfy the constraints Algorithm: Minimum Spanning Tree Like Algorithm (a) Complete SPG (b) True Weighted SPG Strand Pairing Matrix
19
An Example of MST Like Algorithm 0 1.30.94.370.02.040.02.031.90.10.05.74.04 0.02.03.02.200 1 2 3 4 5 6 7 123456 7 45 Strand Pairing Matrix of 1VJG Step 1: Pair strand 4 and 5
20
An Example of MST Like Algorithm 0 1.30.94.370.02.040.02.031.90.10.05.74.04 0.02.03.02.200 1 2 3 4 5 6 7 123456 7 45 21 Strand Pairing Matrix of 1VJG N Step 2: Pair strand 1 and 2
21
An Example of MST Like Algorithm 0 1.30.94.370.02.040.02.031.90.10.05.74.04 0.02.03.02.200 1 2 3 4 5 6 7 123456 7 45 21 3 Strand Pairing Matrix of 1VJG N Step 3: Pair strand 1 and 3
22
An Example of MST Like Algorithm 0 1.30.94.370.02.040.02.031.90.10.05.74.04 0.02.03.02.200 1 2 3 4 5 6 7 123456 7 45 21 3 6 Strand Pairing Matrix of 1VJG N Step 4: Pair strand 3 and 6
23
An Example of MST Like Algorithm 0 1.30.94.370.02.040.02.031.90.10.05.74.04 0.02.03.02.200 1 2 3 4 5 6 7 123456 7 45 21 3 6 7 Strand Pairing Matrix of 1VJG N C Step 5: Pair strand 6 and 7
24
A New Fold Example (Last CASP) 1S12 (94 residues) 12345 101.71.05.29.33 20.06.41.12 30.22.04 40.53 50 CEEEEEECCEEEECCCCCCCCHHHHHHHHHHHHHHHHHHHHHHHEHHCCCCEEEEHHHHHHHHHHHHHHHHHHHHHHHHHCCCCEEEEEEECCC Predicted: 1-2, 2-4, 3-4, 4-5 CEEEEECCCEEEEECCCCCHHHHHHHHHHHHHHHHHHHHCCCEEEEEECCEEEEEECCCCHHHHHHHHHHHHHHHHHHHHCCCCEEEEECCCCCC True: 1-2, 2-4, 3-4, 1-5 True secondary structure Predicted secondary structure by SSpro (Pollastri, et al., 2002) 1 2 4 3 5 Rendered in Rasmol Strand Pairing Matrix 1s12 Beta Sheet Topology
25
Beta-Residue Pairing Results MethodSpecificity/ Sensitivity Ratio of Improvement ROC Area TPR at 5% FPR BetaPairing41%17.80.8658% CMAPpro (Pollastri and Baldi, 2002) 27%11.70.8042% The accuracy of random algorithm is 2.3%. ROC Plot
26
Strand Pairing Results Naïve algorithm of pairing all adjacent strands Specificity = 42% Sensitivity = 50% All strand pairs are local strand pairs. MST like algorithm Specificity = 53% Sensitivity = 59% >20% correctly predicted strand pairs are non-local strand pairs.
27
Strand Alignment Results Paring Direction Alignment Accuracy93%72% On the correctly predicted strand pairs Pairing Direction Alignment Accuracy84%66% On all native strand pairs The accuracy of pairing direction is 15% higher than that of the base-line algorithm. The alignment accuracy is significantly higher than previous methods.
28
Future Work and Applications Allow a cycle to handle beta-barrel, allow gaps in alignment for beta bulge, add more inputs (Punta and Rost, 2005) for beta residue pairing prediction Applications Contact map Fold recognition Ab-initio structure prediction Model refinement Web server and dataset (SCRATCH suite) http://www.ics.uci.edu/~baldig/betasheet.html
29
Acknowledgement Pierre Baldi, Arlo Randall, Michael Sweredoski NIH grant (LM-07443-01) NSF grant (EIA-0321390)
Similar presentations
© 2024 SlidePlayer.com Inc.
All rights reserved.