
1 Protein threading algorithms
1. GenTHREADER. Jones, D. T. JMB (1999) 287, 797-815
2. Protein Fold Recognition by Prediction-based Threading. Rost, B., Schneider, R. & Sander, C. JMB (1997) 270, 471-480
Presented by Jian Qiu

2 Why do we need protein threading?
• To detect remote homologues
• Genome annotation
Structures are better conserved than sequences. Remote homologues with low sequence similarity may share significant structural similarity.
• To predict protein structure based on a structural template
Protein A shares structural similarity with protein B. We could model the structure of protein A using the structure of protein B as a starting point.

3 A successful example by GenTHREADER
• ORF MG276 from Mycoplasma genitalium was predicted to share structural similarity with 1HGX.
• MG276 shares low sequence similarity (10% sequence identity) with 1HGX.
Supporting evidence:
• MG276 is annotated as an adenine phosphoribosyltransferase, based on high sequence similarity to the Escherichia coli protein; 1HGX is a hypoxanthine-guanine-xanthine phosphoribosyltransferase from the protozoan parasite Tritrichomonas foetus.
• Four functionally important residues in 1HGX are conserved in MG276.
• The secondary structure prediction for ORF MG276 agrees very well with the observed secondary structure of 1HGX.

4 Structure of 1HGX

5 Functional residue conservation between 1HGX and MG276

6 GenTHREADER Protocol
Sequence alignment
• For each template structure in the fold library, related sequences were collected using the program BLASTP.
• A multiple sequence alignment of these sequences was generated with a simplified version of MULTAL.
• The optimal alignment between the target sequence and the sequence profile of each template structure is found by dynamic programming.
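The last step of the protocol can be illustrated with a minimal global alignment of a target sequence against a template profile. This is only a sketch: the profile representation (one residue-to-score dictionary per template column), the linear gap penalty, the default mismatch score, and the name `align_to_profile` are all assumptions made for illustration, not GenTHREADER's actual scoring scheme.

```python
from typing import Dict, List, Tuple

# Hypothetical profile format: one {residue: score} map per template column.
Profile = List[Dict[str, float]]


def align_to_profile(
    target: str,
    profile: Profile,
    gap: float = -2.0,       # assumed linear gap penalty
    default: float = -1.0,   # assumed score for residues absent from a column
) -> Tuple[float, List[Tuple[int, int]]]:
    """Global alignment of a target sequence to a profile by dynamic programming.

    Returns the optimal score and the aligned (target_index, column_index) pairs.
    """
    n, m = len(target), len(profile)
    score = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap

    # Fill the table: each cell is the best of match, gap in profile, gap in target.
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = profile[j - 1].get(target[i - 1], default)
            score[i][j] = max(
                score[i - 1][j - 1] + s,
                score[i - 1][j] + gap,
                score[i][j - 1] + gap,
            )

    # Trace back from the bottom-right corner to recover the aligned pairs.
    pairs: List[Tuple[int, int]] = []
    i, j = n, m
    while i > 0 and j > 0:
        s = profile[j - 1].get(target[i - 1], default)
        if score[i][j] == score[i - 1][j - 1] + s:
            pairs.append((i - 1, j - 1))
            i, j = i - 1, j - 1
        elif score[i][j] == score[i - 1][j] + gap:
            i -= 1
        else:
            j -= 1
    pairs.reverse()
    return score[n][m], pairs
```

With a toy three-column profile that rewards A, C and D in turn, aligning the target "AD" skips the middle column and pairs A with column 0 and D with column 2.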

7 Threading Potentials
Pairwise potential (the pairwise model family):
k: sequence separation
s: distance interval
m_ab: number of pairs ab observed with sequence separation k
σ: weight given to each observation
f_k(s): frequency of occurrence of all residue pairs
f_k^ab(s): frequency of occurrence of residue pair ab
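The equation on this slide did not survive in the transcript. The symbols listed match the Sippl-style potential of mean force, ΔE_k^ab(s) = RT ln[1 + m_ab σ] − RT ln[1 + m_ab σ f_k^ab(s) / f_k(s)], so the sketch below assumes that form. The default values for σ and RT and the function name `pairwise_energy` are illustrative assumptions, not values taken from the slides.

```python
import math


def pairwise_energy(
    f_ab: float,          # f_k^ab(s): frequency of pair ab at separation k, distance bin s
    f_all: float,         # f_k(s): frequency of all residue pairs in the same bin
    m_ab: int,            # number of ab pairs observed at sequence separation k
    sigma: float = 0.02,  # assumed weight given to each observation
    rt: float = 0.582,    # assumed RT in kcal/mol (about 293 K)
) -> float:
    """Assumed Sippl-style pairwise energy for one residue pair and distance bin."""
    if f_all <= 0:
        raise ValueError("f_all must be positive")
    weight = m_ab * sigma
    return rt * math.log(1 + weight) - rt * math.log(1 + weight * f_ab / f_all)
```

Under this form a pair seen more often than the average pair in a distance bin gets a negative (favourable) energy, and the energy shrinks toward zero when few observations are available (small m_ab).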

8 Solvation potential (the profile model family):
r: the degree of residue burial, i.e. the number of other Cβ atoms located within 10 Å of the residue's Cβ atom
f_a(r): frequency of occurrence of residue a with burial r
f(r): frequency of occurrence of all residues with burial r
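The solvation equation is also missing from the transcript. A form consistent with the symbols listed is ΔE_solv^a(r) = −RT ln[f_a(r) / f(r)], and the sketch below assumes it. It also shows how the burial count r could be computed from coordinates using the 10 Å cutoff stated on the slide. The function names and the RT default are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Coord = Tuple[float, float, float]


def burial_counts(coords: Sequence[Coord], cutoff: float = 10.0) -> List[int]:
    """For each residue, count the other atoms lying within `cutoff` angstroms.

    `coords` holds one representative side-chain atom position per residue.
    """
    counts = []
    for i, a in enumerate(coords):
        near = sum(
            1 for j, b in enumerate(coords)
            if j != i and math.dist(a, b) <= cutoff
        )
        counts.append(near)
    return counts


def solvation_energy(f_a_r: float, f_r: float, rt: float = 0.582) -> float:
    """Assumed solvation energy for residue type a at burial level r."""
    if f_a_r <= 0 or f_r <= 0:
        raise ValueError("frequencies must be positive")
    return -rt * math.log(f_a_r / f_r)
```

A residue type that occurs at a given burial level more often than residues in general receives a negative energy at that level.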

9 Variables considered to predict the relationship
• Pairwise energy score
• Solvation energy score
• Sequence alignment score
• Sequence alignment length
• Length of the structure
• Length of the target sequence

10 Artificial Neural Network A node
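The figure for this slide is not part of the transcript. As a generic illustration of a single node, the sketch below assumes the common formulation: a weighted sum of the inputs plus a bias, passed through a logistic activation. The function name, the weights, and the choice of activation are assumptions, not details confirmed by the slides.

```python
import math
from typing import Sequence


def node_output(inputs: Sequence[float], weights: Sequence[float], bias: float) -> float:
    """Output of one network node: logistic function of the weighted input sum."""
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1.0 / (1.0 + math.exp(-z))
```

In a network like the one described here, the inputs to the first layer would be the six variables listed on slide 9, scaled to comparable ranges.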

11 Neural network architecture in GenTHREADER

12 The effects of sequence alignment score and pairwise potential on the Network output

13 Confidence level with different network scores: Low, Medium (80%), High (99%), Certain (100%)

14 Genome analysis of Mycoplasma genitalium All the 468 ORFs were analyzed within one day.

15 Distribution of protein folds in M. genitalium

16 PHD: Predict 1D structure from sequence
Sequence → MaxHom → multiple sequence alignment → PHDsec, PHDacc
Secondary structure: H (helix), E (strand), L (rest)
Solvent accessibility: Buried (<15%), Exposed (>=15%)

17 Threading Protocol

18 Similarity matrix in dynamic programming
• Purely structure similarity matrix: six states (combinations of three secondary structure states and two solvent accessibility states)
• Purely sequence similarity matrix: McLachlan or Blosum62
• Combination of structure and sequence similarity matrices: M_ij = α × M_ij(1D structure) + (100 − α) × M_ij(sequence)
α = 0: sequence alignment only
α = 100: 1D structure alignment only
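The combined matrix is a percentage-weighted sum of the two similarity matrices. A minimal sketch follows, in which the mixing weight is called `alpha` (a name chosen here for illustration) and the matrices are plain nested lists.

```python
from typing import List

Matrix = List[List[float]]


def combine_matrices(m_struct: Matrix, m_seq: Matrix, alpha: float) -> Matrix:
    """Return alpha * M_struct + (100 - alpha) * M_seq, element by element.

    alpha is a percentage: 0 uses the sequence matrix only,
    100 uses the 1D structure matrix only.
    """
    if not 0 <= alpha <= 100:
        raise ValueError("alpha must lie between 0 and 100")
    if len(m_struct) != len(m_seq) or any(
        len(rs) != len(rq) for rs, rq in zip(m_struct, m_seq)
    ):
        raise ValueError("matrices must have the same shape")
    return [
        [alpha * s + (100 - alpha) * q for s, q in zip(rs, rq)]
        for rs, rq in zip(m_struct, m_seq)
    ]
```

Because the weights are percentages rather than fractions, the combined scores are 100 times larger than a normalized mixture. This rescales every cell equally, so it does not change which alignment is optimal as long as gap penalties are scaled to match.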

19 Performance of the algorithm

20 Results on the 11 targets of CASP1
• Correctly detected the remote homologues at first rank in four cases.
• Average percentage of correctly aligned residues: 21%.
• Average shift: nine residues.
Best performing methods in CASP1:
• Expert-driven usage of THREADER by David Jones and colleagues detected five out of nine proteins correctly at first rank.
• The best alignments of the potential-based threading method by Manfred Sippl and colleagues were clearly better than the best ones of this algorithm.

