Presentation is loading. Please wait.

Presentation is loading. Please wait.

Three-Stage Prediction of Protein Beta-Sheets Using Neural Networks, Alignments, and Graph Algorithms Jianlin Cheng and Pierre Baldi Institute for Genomics.

Similar presentations


Presentation on theme: "Three-Stage Prediction of Protein Beta-Sheets Using Neural Networks, Alignments, and Graph Algorithms Jianlin Cheng and Pierre Baldi Institute for Genomics."— Presentation transcript:

1 Three-Stage Prediction of Protein Beta-Sheets Using Neural Networks, Alignments, and Graph Algorithms Jianlin Cheng and Pierre Baldi Institute for Genomics and Bioinformatics School of Information and Computer Sciences University of California Irvine

2 Importance of Predicting Beta- Sheet Structure Ab-initio Structure Prediction Fold Recognition Model Refinement Protein Design Protein Folding Rendered in Protein Explorer beta-sheet helix Coil

3 An Example of Beta-Sheet Architecture Structure of Protein 1VJG Beta Sheets Level 1 4 5 2 1 3 6 7

4 An Example of Beta-Sheet Architecture Structure of Protein 1VJG Beta Sheets Strand Strand Pair Strand Alignment Pairing Direction Level 1 Level 2 Antiparallel Parallel 4 5 2 1 3 6 7

5 An Example of Beta-Sheet Architecture Structure of Protein 1VJG Beta Sheets Strand Strand Pair Strand Alignment Pairing Direction Beta Residue Residue Pair Level 1 Level 2 Level 3 Antiparallel Parallel 4 5 2 1 3 6 7 H-bond

6 Previous Work Statistical potential approach for strand alignment (Hubbard, 1994; Zhu and Braun, 1999) Statistical potentials to improve beta-sheet secondary structure prediction (Asogawa,1997) Information theory approach for strand alignment (Steward and Thornton, 2000) Neural networks for beta-residue pairs (Baldi, et al., 2000)

7 Three-Stage Prediction of Beta-Sheets Stage 1 Predict beta-residue pairing probabilities using 2D-Recursive Neural Networks ( 2D- RNN, Baldi and Pollastri, 2003 ) Stage 2 Use beta-residue pairing probabilities to align beta-strands Stage 3 Predict beta-strand pairs and beta-sheet architecture using graph algorithms

8 Dataset and Statistics Num Protein Chains916 Beta Residues48,996 Beta Residue Pairs31,638 Beta Strands10,745 Beta Strand Pairs8,172 Beta Sheets2,533 Extract proteins with high resolution from Protein Data Bank (Berman et al., 2000) Use DSSP (Kabsch and Sander, 1983) to assign intra-chain beta- sheet structure Use UniqueProt (Mika and Rost, 2003) to reduce redundancy Use PSI-BLAST (Altschul et al., 1997) to generate profiles Statistics

9 Stage 1: Prediction of Beta-Residue Pairings Using 2D-RNN Input Matrix I (m×m) 2D-RNN O = f(I) Target / Output Matrix (m×m) (i,j) 20 profiles3 SS2 SA T ij : 0/1 O ij : Pairing Prob. (i,j) I ij X i -2 X i -1 X i X i +1 X i +2 X j -2 X j -1 X j X j +1 X j +2 |X i – X j | X i or X j is the position of beta-residue i or j in the sequence

10 An Example (Target) Protein 1VJG Beta-Residue Pairing Map (Target Matrix) 1234567

11 An Example (Target) Protein 1VJG Beta-Residue Pairing Map (Target Matrix) 1234567 Antiparallel Parallel

12 An Example (Prediction)

13 Stage 2: Beta-Strand Alignment Use output probability matrix as scoring matrix Dynamic programming Disallow gaps and use the simplified search algorithm 1m n1 1m 1n Antiparallel Parallel Total number of alignments = 2(m+n-1)

14 Strand Alignment and Pairing Matrix The alignment score is the sum of the pairing probabilities of the aligned residues The best alignment is the alignment with the maximum score Strand Pairing Matrix Strand Pairing Matrix of 1VJG

15 Stage 3: Prediction of Beta-Strand Pairings and Beta-Sheet Architecture (Constraints) (a) Seven strands of protein 1VJG in sequence order (b) Beta-sheet topology of protein 1VJG

16 Stage 3: Prediction of Beta-Strand Pairings and Beta-Sheet Architecture (Constraints) (a) Seven strands of protein 1VJG in sequence order (b) Beta-sheet topology of protein 1VJG Protein: 1B7G Rendered in Rasmol 3 partners

17 Minimum Spanning Tree Like Algorithm Strand Pairing Graph (SPG) (a) Complete SPG Strand Pairing Matrix

18 Minimum Spanning Tree Like Algorithm Strand Pairing Graph (SPG) Goal: Find a set of connected subgraphs that maximize the sum of the alignment scores and satisfy the constraints Algorithm: Minimum Spanning Tree Like Algorithm (a) Complete SPG (b) True Weighted SPG Strand Pairing Matrix

19 An Example of MST Like Algorithm 0 1.30.94.370.02.040.02.031.90.10.05.74.04 0.02.03.02.200 1 2 3 4 5 6 7 123456 7 45 Strand Pairing Matrix of 1VJG Step 1: Pair strand 4 and 5

20 An Example of MST Like Algorithm 0 1.30.94.370.02.040.02.031.90.10.05.74.04 0.02.03.02.200 1 2 3 4 5 6 7 123456 7 45 21 Strand Pairing Matrix of 1VJG N Step 2: Pair strand 1 and 2

21 An Example of MST Like Algorithm 0 1.30.94.370.02.040.02.031.90.10.05.74.04 0.02.03.02.200 1 2 3 4 5 6 7 123456 7 45 21 3 Strand Pairing Matrix of 1VJG N Step 3: Pair strand 1 and 3

22 An Example of MST Like Algorithm 0 1.30.94.370.02.040.02.031.90.10.05.74.04 0.02.03.02.200 1 2 3 4 5 6 7 123456 7 45 21 3 6 Strand Pairing Matrix of 1VJG N Step 4: Pair strand 3 and 6

23 An Example of MST Like Algorithm 0 1.30.94.370.02.040.02.031.90.10.05.74.04 0.02.03.02.200 1 2 3 4 5 6 7 123456 7 45 21 3 6 7 Strand Pairing Matrix of 1VJG N C Step 5: Pair strand 6 and 7

24 A New Fold Example (Last CASP) 1S12 (94 residues) 12345 101.71.05.29.33 20.06.41.12 30.22.04 40.53 50 CEEEEEECCEEEECCCCCCCCHHHHHHHHHHHHHHHHHHHHHHHEHHCCCCEEEEHHHHHHHHHHHHHHHHHHHHHHHHHCCCCEEEEEEECCC Predicted: 1-2, 2-4, 3-4, 4-5 CEEEEECCCEEEEECCCCCHHHHHHHHHHHHHHHHHHHHCCCEEEEEECCEEEEEECCCCHHHHHHHHHHHHHHHHHHHHCCCCEEEEECCCCCC True: 1-2, 2-4, 3-4, 1-5 True secondary structure Predicted secondary structure by SSpro (Pollastri, et al., 2002) 1 2 4 3 5 Rendered in Rasmol Strand Pairing Matrix 1s12 Beta Sheet Topology

25 Beta-Residue Pairing Results MethodSpecificity/ Sensitivity Ratio of Improvement ROC Area TPR at 5% FPR BetaPairing41%17.80.8658% CMAPpro (Pollastri and Baldi, 2002) 27%11.70.8042% The accuracy of random algorithm is 2.3%. ROC Plot

26 Strand Pairing Results Naïve algorithm of pairing all adjacent strands Specificity = 42% Sensitivity = 50% All strand pairs are local strand pairs. MST like algorithm Specificity = 53% Sensitivity = 59% >20% correctly predicted strand pairs are non-local strand pairs.

27 Strand Alignment Results Paring Direction Alignment Accuracy93%72% On the correctly predicted strand pairs Pairing Direction Alignment Accuracy84%66% On all native strand pairs The accuracy of pairing direction is 15% higher than that of the base-line algorithm. The alignment accuracy is significantly higher than previous methods.

28 Future Work and Applications Allow a cycle to handle beta-barrel, allow gaps in alignment for beta bulge, add more inputs (Punta and Rost, 2005) for beta residue pairing prediction Applications Contact map Fold recognition Ab-initio structure prediction Model refinement Web server and dataset (SCRATCH suite) http://www.ics.uci.edu/~baldig/betasheet.html

29 Acknowledgement Pierre Baldi, Arlo Randall, Michael Sweredoski NIH grant (LM-07443-01) NSF grant (EIA-0321390)


Download ppt "Three-Stage Prediction of Protein Beta-Sheets Using Neural Networks, Alignments, and Graph Algorithms Jianlin Cheng and Pierre Baldi Institute for Genomics."

Similar presentations


Ads by Google