Madhavi Ganapathiraju Graduate student Carnegie Mellon University


1 TMpro & Comparison of Algorithms for "Protein Stability Prediction Upon Mutations"
Madhavi Ganapathiraju, Graduate student, Carnegie Mellon University

2 Overview
TMpro evaluations on PDBTM, TMPDB and MPtopo are complete
Additional inputs to TMpro are being studied
  Yule values (not successful)
  Evolutionary profile (promising)
TMpro website has been completed
Evaluation of algorithms to predict protein stability changes upon mutations

3 Part 1: TMpro

4 TMpro Evaluations
Columns: Method | Segment level: Qok, F score, Recall, Precision | Residue level: Q2 | Misclassified as soluble
MPtopo (101 TM proteins)
  2a TMHMM: 66 91 89 94 84 5
  2b TMpro NN: 60 93 92 79
PDBTM (191 TM proteins)
  3a: 68 90 13
  3b: 57 81 2
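The TMHMM row for MPtopo above is consistent with the F score being the harmonic mean of segment recall and precision (recall 89 and precision 94 give about 91). A minimal sketch under that assumption, and under the further assumption that Q2 is the percentage of residues whose two-state label is predicted correctly; the function names and the toy label strings are illustrative and not taken from the slides.

```python
def f_score(recall, precision):
    """Harmonic mean of recall and precision (same units as the inputs)."""
    if recall + precision == 0:
        return 0.0
    return 2 * recall * precision / (recall + precision)


def q2(observed, predicted):
    """Percentage of residues whose two-state label (e.g. 'M' or 'n') matches."""
    if len(observed) != len(predicted):
        raise ValueError("label strings must have equal length")
    matches = sum(o == p for o, p in zip(observed, predicted))
    return 100.0 * matches / len(observed)


# TMHMM on MPtopo: recall 89, precision 94
print(round(f_score(89, 94)))      # 91
# Toy example: 7 of 8 residues agree
print(q2("nnMMMMnn", "nnMMMnnn"))  # 87.5
```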

5 TMpro web-server is fully functional!
Competition for TMpro logo
Prize: See your logo on the web!

6 Attempts to overcome confusion with globular soluble helices (1)
Yule value features to be added
Yule value features that discriminate amino acid neighbor propensities between TM and non-TM helices were computed earlier
Tried adding these features as input to the NN predictor, but could not achieve a quantitative improvement
I will discuss this in the future when I have results to present
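The slide does not say how the Yule values were computed. As a rough illustration only, assuming "Yule value" refers to Yule's Q coefficient of association on a 2x2 contingency table, a feature of this kind could be sketched as follows; the counts and the table layout are hypothetical.

```python
def yule_q(a, b, c, d):
    """Yule's Q for the 2x2 table [[a, b], [c, d]]; ranges from -1 to 1."""
    num = a * d - b * c
    den = a * d + b * c
    if den == 0:
        raise ValueError("Q is undefined when a*d + b*c == 0")
    return num / den


# Hypothetical counts. Rows: a given neighbor pair present / absent.
# Columns: TM helix / non-TM helix.
print(yule_q(30, 10, 10, 30))  # 0.8, pair is associated with TM helices
print(yule_q(10, 10, 10, 10))  # 0.0, no association
```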

7 Attempts to overcome confusion with globular soluble helices (2)
Evolutionary profile information
It is known that the evolutionary profile of a protein can greatly improve prediction accuracy
TMpro can predict TM segments without requiring a profile
  Useful when sequence alignments cannot be obtained from known proteins
But where a profile is known, we would like to use that additional information

8 Profile generation
Those of you who have worked with evolutionary analysis before, please give feedback
Get multiple sequence alignments
Compute position-specific scoring matrix (PSSM) for each protein
  21 rows (20 amino acids, and 1 row for gaps)
Profile is generated for each protein in the training and test sets
PSSM(i,j) = log(C(i,j) / total counts at position j)
            log(C(i,j) / unigram count of i in the protein)
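The profile computation can be sketched as follows, using the first normalization given above, log(C(i,j) / total counts at position j). This is a minimal sketch and not the code used for the talk: the pseudocount (added to avoid log(0)), the row ordering, and the choice to count any non-amino-acid symbol as a gap are all assumptions.

```python
import math

ALPHABET = "ACDEFGHIKLMNPQRSTVWY-"  # 20 amino acids + 1 gap row = 21 rows


def pssm(alignment, pseudocount=1.0):
    """Return a 21 x L matrix of log(C(i,j) / total counts at position j).

    The pseudocount is an assumption added here to avoid log(0).
    Symbols outside ALPHABET (e.g. '.') are counted as gaps.
    """
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    matrix = [[0.0] * length for _ in ALPHABET]
    for j in range(length):
        counts = {symbol: pseudocount for symbol in ALPHABET}
        for seq in alignment:
            symbol = seq[j].upper()
            counts[symbol if symbol in counts else "-"] += 1
        total = sum(counts.values())
        for i, symbol in enumerate(ALPHABET):
            matrix[i][j] = math.log(counts[symbol] / total)
    return matrix


# Toy alignment of three sequences, five columns
profile = pssm(["MKV-L", "MRV.L", "MKIAL"])
print(len(profile), len(profile[0]))  # 21 5
```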

9 Doubts
What labels to assign to gaps?
We have labels for training sequences
But when the original sequence has gaps after alignment, how should the labels at the gap positions be interpreted?
Alignment excerpts (label row followed by aligned sequences):
--n------n----n------nnn-----n------n M----- 2a D------E----L------KLS-----R------K H 2A65_A AAC YP_ E------S----F------G.K T
-M------M------M------M M M MM 2a A------V------L------W T A AI 2A65_A AAC YP_ S------C IL
Even TM regions have gaps, as shown above

10 Doubts
What to do with missing segment info for some sequences
When nothing is shown (gap/alignment) for some sequences, I am counting those positions as gaps
XP_ L K KAP----RSNQV.-..FVAGTMGLASAVGA.AT 86
AAW A..A KNP----NTTRNV-..FMVGALGALGASSV.ST 136
CAB N.RP.-A..VIGSARFAYMAWTRVA 83
XP_ SKRA.-A.FVLSGGRFIYASLLRLL 130
AAA SKRA.-A.FVLTGGRFVYASLVRLL 126

11 Using profile for prediction
Studied independently of TMpro
Neural network with 21 input, 21 hidden and 1 output neurons
[Plot: x-axis: residue number; y-axis: nonmembrane = 0, membrane = 1; curves: predicted output, experimentally observed locations of TM helices]
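A forward pass through a network of this shape (21 inputs, 21 hidden neurons, 1 output) might look like the sketch below. The slide gives only the layer sizes, so the sigmoid activations, the use of a single 21-value profile column per residue, and the random untrained weights are all assumptions made for illustration.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def init_layer(n_in, n_out, rng):
    """Small random weights; each row holds n_in weights followed by a bias."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(n_in + 1)]
            for _ in range(n_out)]


def forward(column, hidden, output):
    """Map one 21-value profile column to a score in (0, 1)."""
    # zip stops after the n_in weights, so the trailing bias is added separately
    h = [sigmoid(sum(w * x for w, x in zip(row, column)) + row[-1])
         for row in hidden]
    out_row = output[0]
    return sigmoid(sum(w * x for w, x in zip(out_row, h)) + out_row[-1])


rng = random.Random(0)            # fixed seed so the sketch is repeatable
hidden = init_layer(21, 21, rng)  # 21 inputs -> 21 hidden neurons
output = init_layer(21, 1, rng)   # 21 hidden -> 1 output neuron
score = forward([0.0] * 21, hidden, output)
print(0.0 < score < 1.0)          # True
```

A score near 1 would be read as membrane and a score near 0 as nonmembrane, matching the 0/1 labels on the plot.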

12 Another output

13 NN architecture needs to be modified
But instead I post-processed the neural network output
Computed wavelet transform
Mexican hat wavelet, scale = 10
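The post-processing step can be sketched as a wavelet transform of the per-residue network output at a single scale. This is a minimal pure-Python sketch, not the code behind the plots: the zero padding at the sequence ends, the kernel truncation at five scales, and the idealized input signal are assumptions.

```python
import math


def mexican_hat(t):
    """Mexican hat (Ricker) wavelet with unit width."""
    norm = 2.0 / (math.sqrt(3.0) * math.pi ** 0.25)
    return norm * (1.0 - t * t) * math.exp(-t * t / 2.0)


def wavelet_transform(signal, scale=10, half_width=None):
    """Wavelet coefficients of `signal` at one scale (zero-padded at the ends)."""
    if half_width is None:
        half_width = 5 * scale  # the wavelet is negligible beyond ~5 scales
    kernel = [mexican_hat(k / scale) / math.sqrt(scale)
              for k in range(-half_width, half_width + 1)]
    n = len(signal)
    coeffs = []
    for center in range(n):
        total = 0.0
        for offset, weight in enumerate(kernel):
            idx = center + offset - half_width
            if 0 <= idx < n:
                total += signal[idx] * weight
        coeffs.append(total)
    return coeffs


# Idealized NN output with one 20-residue membrane segment (residues 40-59)
nn_output = [0.0] * 40 + [1.0] * 20 + [0.0] * 40
coeffs = wavelet_transform(nn_output, scale=10)
peak = max(range(len(coeffs)), key=coeffs.__getitem__)
print(peak)  # index of the largest coefficient, at the middle of the segment
```

With scale = 10 the positive lobe of the wavelet spans about 20 residues, which is roughly the length of a TM helix, so the coefficients peak at the center of a helix-sized segment and turn negative just outside it.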

14 Some more wavelet outputs
Note that these are from the training data itself; overall performance is yet to be checked

15 Part 2: Stability upon Mutations

16 Evaluation of predictions of protein stability changes upon mutations
Effects of mutations on 2 TM proteins are available in our group
  The two proteins are rhodopsin and bacteriorhodopsin
  Data available for
    How much misfolding occurs
    How stability of the protein is affected
There are algorithms that can also predict these changes
We assessed how accurate or reliable the prediction methods are by comparing their results with our experimental data

17 3 Prediction algorithms
I-Mutant 2.0
  Support vector machine
  Features: amino acid neighbors in a 9 Å sphere, temperature, pH, relative solvent accessible surface area
DFIRE
  Knowledge-based statistical potentials
FoldX
  Statistical mechanics; accounts for various energy terms

18 Authors’ claims in 3 papers

19 Our results
Rhodopsin (PDB: 1U19)
Bacteriorhodopsin (PDB: 1QM8)

20 Bias in # of mutations that increase/decrease stability
Database bias affects apparent accuracies of algorithms
I-Mutant, for example, predicts a decrease in stability for a majority of the mutations
Whether the experimentally studied mutations preserve the natural bias toward stability-decreasing mutations affects the apparent accuracy of the prediction algorithms
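The effect of this bias can be seen with a trivial predictor that always answers "decrease": its apparent accuracy equals the fraction of stability-decreasing mutations in the test set. The label lists below are hypothetical and are not the rhodopsin or bacteriorhodopsin data.

```python
def accuracy(predicted, observed):
    """Fraction of mutations whose predicted sign of change matches experiment."""
    if len(predicted) != len(observed):
        raise ValueError("prediction and observation lists must match in length")
    hits = sum(p == o for p, o in zip(predicted, observed))
    return hits / len(observed)


# Hypothetical experimental labels: '-' = stability decreases, '+' = increases
balanced = ["-"] * 5 + ["+"] * 5
skewed = ["-"] * 9 + ["+"] * 1

# A trivial predictor that always answers "decrease"
always_decrease = ["-"] * 10

print(accuracy(always_decrease, balanced))  # 0.5
print(accuracy(always_decrease, skewed))    # 0.9
```

The same predictor looks far better on the skewed set, which is why the composition of the experimental data set matters when accuracies are compared.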

21 Correlation with known data
Reported correlations for these methods are quite large (>0.7)
On the data compared here the correlations are quite low
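A comparison of this kind can be sketched as below, assuming "correlation" means the Pearson correlation coefficient between predicted and experimentally measured stability changes. The slides do not state which coefficient was used, and the numbers in the example are made up to exercise the function.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant list")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical predicted vs. measured stability changes
predicted = [-1.2, -0.4, 0.3, -2.0, 0.8]
measured = [-0.9, 0.5, -0.2, -1.1, 0.1]
print(round(pearson(predicted, measured), 2))
```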

22 Notes
Local installations of blast and netblast are on cologne:
/usr1/blast /
/usr1/netblast /
Java SDK on Cologne:
/usr1/j2sdk1.4.2_11/

23 Acknowledgements
Judith Klein-Seetharaman
Christopher Jon Jursa
Pitt Information Sciences (for developing web interface)

