Protein Secondary Structure Prediction


1 Protein Secondary Structure Prediction

3 Specific databases of protein sequences and structures
Swiss-Prot
PIR
TrEMBL (translated from DNA)
PDB (three-dimensional structures)

4 Protein Structure
Primary: amino acid sequence
Secondary: alpha helices, beta sheets, and loops
Tertiary: packing of secondary elements
Quaternary: packing of several polypeptide chains

5 Symbols for the 20 amino acids
A  ala  alanine          M  met  methionine
C  cys  cysteine         N  asn  asparagine
D  asp  aspartic acid    P  pro  proline
E  glu  glutamic acid    Q  gln  glutamine
F  phe  phenylalanine    R  arg  arginine
G  gly  glycine          S  ser  serine
H  his  histidine        T  thr  threonine
I  ile  isoleucine       V  val  valine
K  lys  lysine           W  trp  tryptophan
L  leu  leucine          Y  tyr  tyrosine
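The symbols listed above map naturally onto a lookup table. The following is a minimal sketch in Python; the names `AMINO_ACIDS` and `to_three_letter` are made up for illustration and do not come from the slides.

```python
# One-letter symbol -> (three-letter symbol, full name), transcribed from the slide.
AMINO_ACIDS = {
    "A": ("Ala", "alanine"),       "M": ("Met", "methionine"),
    "C": ("Cys", "cysteine"),      "N": ("Asn", "asparagine"),
    "D": ("Asp", "aspartic acid"), "P": ("Pro", "proline"),
    "E": ("Glu", "glutamic acid"), "Q": ("Gln", "glutamine"),
    "F": ("Phe", "phenylalanine"), "R": ("Arg", "arginine"),
    "G": ("Gly", "glycine"),       "S": ("Ser", "serine"),
    "H": ("His", "histidine"),     "T": ("Thr", "threonine"),
    "I": ("Ile", "isoleucine"),    "V": ("Val", "valine"),
    "K": ("Lys", "lysine"),        "W": ("Trp", "tryptophan"),
    "L": ("Leu", "leucine"),       "Y": ("Tyr", "tyrosine"),
}


def to_three_letter(sequence):
    """Convert a one-letter sequence such as 'MKV' to 'Met-Lys-Val'."""
    return "-".join(AMINO_ACIDS[residue][0] for residue in sequence.upper())


print(to_three_letter("MKV"))
```

An unknown symbol raises `KeyError`, which is usually preferable to silently skipping a residue.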

6 The 20 Amino Acids

7 Grouping amino acids by physicochemical properties

8 Myoglobin – the first high resolution protein structure
Solved in 1958 by John Kendrew and colleagues at Cambridge University.
Kendrew and Max Perutz shared the 1962 Nobel Prize in Chemistry.
“Perhaps the most remarkable features of the molecule are its complexity and its lack of symmetry. The arrangement seems to be almost totally lacking in the kind of regularities which one instinctively anticipates.”

9 Alpha Helices
Right-handed spiral
5 to 40 amino acids (10 on average)
3.6 amino acids per turn
Some amino acids are more frequent than others in helices.

10 Beta Sheets
Parallel: strands run in the same direction (C to N)
Anti-parallel: strands run in opposite directions
Each strand has 5-10 amino acids (6 on average)
Some amino acids are more frequent than others

11 Loop Regions
All other protein regions
Irregular shape and size
Connect the secondary structure elements

12 Structure Presentation
Ribbon diagram (figure labels: alpha helix, beta sheet)

13 Structure Presentation
TOPS cartoon: beta strands are triangles and alpha helices are circles. The peptide chain runs from the N terminus to the C terminus.

14 Structure Prediction: Motivation
Hundreds of thousands of gene sequences translated to proteins (GenBank, Swiss-Prot, PIR)
Only a small fraction have solved structures (PDB)
Goal: predict protein structure based on sequence information

15 Structure Prediction: Motivation
Understand protein function
  Locate binding sites
Broaden homology
  Detect similar function where sequence differs
Explain disease
  See effect of amino acid changes
  Design suitable compensatory drugs

16 Prediction Approaches
Primary (sequence) to secondary structure
  Sequence characteristics
Secondary to tertiary structure
  Fold recognition
  Threading against known structures
Primary to tertiary structure
  Ab initio modelling

17 Can we predict the secondary structure from sequence?
Secondary structures have an amphiphilic nature: one face is polar and the other is non-polar.
(Figure: an α-helix and a β-sheet, each with a polar face and a non-polar face.)
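One way to make the amphiphilic pattern measurable is a hydrophobic-moment-style calculation, a technique brought in here for illustration rather than taken from the slides. The sketch below places each residue at a fixed angle around the axis and sums +1 for non-polar and -1 for polar residues as vectors. Several things are assumptions: the two-class `NON_POLAR` set is a deliberate simplification, the 100° step follows from 3.6 residues per turn (360 / 3.6), the 180° step is an idealised alternation for a strand, and both test sequences are made up.

```python
import math

# Assumption: a crude two-class split; real hydrophobicity scales are continuous.
NON_POLAR = set("AVLIMFWC")


def hydrophobic_moment(sequence, degrees_per_residue):
    """Length of the vector sum of +1 (non-polar) / -1 (polar) residues placed at
    successive angles around the axis, divided by the sequence length."""
    x = y = 0.0
    for index, residue in enumerate(sequence):
        sign = 1.0 if residue in NON_POLAR else -1.0
        angle = math.radians(index * degrees_per_residue)
        x += sign * math.cos(angle)
        y += sign * math.sin(angle)
    return math.hypot(x, y) / len(sequence)


HELIX_STEP = 360 / 3.6   # 100 degrees per residue
STRAND_STEP = 180        # idealised: side chains alternate between the two faces

helix_like = "LKKLLKKLLKLLKKLLKK"   # made-up sequence with a ~3.6-residue repeat
strand_like = "VKVKVKVK"            # made-up sequence with a 2-residue repeat

for name, seq in [("helix-like", helix_like), ("strand-like", strand_like)]:
    print(name,
          round(hydrophobic_moment(seq, HELIX_STEP), 2),
          round(hydrophobic_moment(seq, STRAND_STEP), 2))
```

A sequence whose non-polar residues recur every 3 to 4 positions scores higher at the helical step, while a strictly alternating sequence scores higher at the strand step. This periodicity is the signal that makes sequence-based prediction plausible at all.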

18 Secondary Structure Prediction
Why is it complex?
A huge space of possible structures
  Assume a 100 aa chain with only 2 possible conformations for each residue
  2^100 ≈ 10^30 different conformations for the chain as a whole
Inferring secondary structure from sequence is problematic:
  Similar sequences may result in different structures (mutations, different environments).
  Different sequences may result in similar structures (the globin fold).
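The size of the conformational space quoted on this slide can be checked directly. This is a small arithmetic sketch; the variable names are illustrative only.

```python
import math

residues = 100            # chain length assumed on the slide
states_per_residue = 2    # conformations per residue assumed on the slide

conformations = states_per_residue ** residues
exponent = residues * math.log10(states_per_residue)

print(conformations)           # exact integer value of 2^100
print(f"about 10^{exponent:.1f}")
```

Because 100 × log10(2) is slightly above 30, the slide's figure of roughly 10^30 holds to the nearest order of magnitude.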

19 Secondary Structure Prediction Methods
Chou-Fasman / GOR methods
  Based on amino acid frequencies
  No more than 60% accurate
Artificial Neural Network (ANN) methods
  PHDsec and PSIpred
  Use multiple sequences
  Secondary structure based on family
  Best accuracy now ~78%
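To illustrate what "based on amino acid frequencies" can mean, the sketch below computes a frequency-based propensity for each residue and uses a sliding-window average to assign a state. It is a simplified illustration, not the full Chou-Fasman nucleation and extension rules and not GOR. The `TRAINING` data are invented toy examples, and the window size and the threshold of 1.0 are assumptions.

```python
from collections import Counter

# Hypothetical toy data: (sequence, per-residue state); H = helix, E = strand, C = coil.
TRAINING = [
    ("AEELLKKGSVTVYV", "HHHHHHHCCEEEEE"),
    ("MAELKEAGPTVKVY", "HHHHHHHCCEEEEE"),
]


def propensities(examples, state):
    """Propensity = (fraction of a residue type found in `state`)
    divided by (fraction of all residues found in `state`)."""
    total, in_state = Counter(), Counter()
    for sequence, structure in examples:
        for residue, label in zip(sequence, structure):
            total[residue] += 1
            if label == state:
                in_state[residue] += 1
    state_fraction = sum(in_state.values()) / sum(total.values())
    return {aa: (in_state[aa] / total[aa]) / state_fraction for aa in total}


def predict(sequence, helix_p, strand_p, window=5):
    """Assign H, E or C per residue from window-averaged propensities."""
    half = window // 2
    result = []
    for i in range(len(sequence)):
        segment = sequence[max(0, i - half): i + half + 1]
        # Residues never seen in training get a neutral propensity of 1.0.
        helix = sum(helix_p.get(aa, 1.0) for aa in segment) / len(segment)
        strand = sum(strand_p.get(aa, 1.0) for aa in segment) / len(segment)
        if helix > 1.0 and helix >= strand:
            result.append("H")
        elif strand > 1.0:
            result.append("E")
        else:
            result.append("C")
    return "".join(result)


helix_p = propensities(TRAINING, "H")
strand_p = propensities(TRAINING, "E")
query = "AEELKKVTVYV"
print(query)
print(predict(query, helix_p, strand_p))
```

A propensity above 1 means the residue occurs in that state more often than average. Because each residue is scored largely on its own, methods of this kind miss the family-level information that the neural network methods exploit.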

20 PHDsec and PSIpred
PHDsec
  Rost & Sander, 1993
  Based on sequence family alignments
PSIpred
  Jones, 1999
  Based on a Position-Specific Scoring Matrix generated by PSI-BLAST
Both consider long-range interactions

21 Brain Neurons
Outgoing signal determined by incoming signals
Connected together in networks
Learn from experience

22 SS prediction using ANN
Inputs for one position (the amino acid at that position): A C D E F G H I K L M N P Q R S T V W Y .

23 Position-Specific Scoring Matrix
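A position-specific scoring matrix holds one score per amino acid for every column of an alignment. The sketch below builds a simple log-odds matrix from a toy alignment. It is only an illustration of the idea: the uniform background frequency, the pseudocount of 1, and the alignment itself are assumptions, and PSI-BLAST's actual procedure (sequence weighting, real background frequencies, iteration) is considerably more involved.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def build_pssm(alignment, pseudocount=1.0):
    """Return one dict per alignment column mapping amino acid -> log2-odds score.
    Assumes equal-length rows and a uniform background frequency of 1/20."""
    background = 1.0 / len(AMINO_ACIDS)
    pssm = []
    for col in range(len(alignment[0])):
        # Gap characters (anything outside the 20-letter alphabet) are skipped.
        column = [row[col] for row in alignment if row[col] in AMINO_ACIDS]
        denominator = len(column) + pseudocount * len(AMINO_ACIDS)
        scores = {}
        for aa in AMINO_ACIDS:
            frequency = (column.count(aa) + pseudocount) / denominator
            scores[aa] = math.log2(frequency / background)
        pssm.append(scores)
    return pssm


# Made-up alignment of four short sequences; '-' marks a gap.
toy_alignment = ["MKVLA", "MRVLA", "MKILG", "MKVL-"]
pssm = build_pssm(toy_alignment)

for position, scores in enumerate(pssm, start=1):
    best = max(scores, key=scores.get)
    print(position, best, round(scores[best], 2))
```

Positive scores mark residues seen more often than the background expects at that position, and negative scores mark residues that are rare there. This per-column profile, rather than a single residue, is what a profile-based predictor receives as input.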

24 PHDsec Neural Net
Inputs for one position (the amino acid at that position): A C D E F G H I K L M N P Q R S T V W Y .
Hidden layer
Outputs: H = helix, E = strand, C = coil
Confidence: 0 = low, 9 = high
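The input, hidden, and output layers described on this slide can be sketched as a small feed-forward network. This is not the PHDsec implementation, and several details are assumptions made for illustration: the window of 13 residues, the use of "." for positions beyond the sequence ends, the hidden layer size, the softmax outputs, and the confidence formula (the gap between the two largest outputs scaled to 0-9). The weights are random and untrained, so the predictions are meaningless; only the data flow is shown.

```python
import math
import random

ALPHABET = "ACDEFGHIKLMNPQRSTVWY."   # 20 amino acids plus '.', as on the slide
STATES = "HEC"                       # helix, strand, coil


def encode_window(sequence, center, window=13):
    """One-hot encode the residues around `center`: 21 inputs per position.
    Assumption: '.' stands in for positions beyond either end of the sequence."""
    half = window // 2
    vector = []
    for pos in range(center - half, center + half + 1):
        symbol = sequence[pos] if 0 <= pos < len(sequence) else "."
        vector.extend(1.0 if symbol == letter else 0.0 for letter in ALPHABET)
    return vector


def make_layer(n_inputs, n_outputs, rng):
    """Random, untrained weights: one row of input weights per output unit."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(n_inputs)] for _ in range(n_outputs)]


def forward(vector, hidden_weights, output_weights):
    """Sigmoid hidden layer followed by a softmax over the three states."""
    hidden = [1.0 / (1.0 + math.exp(-sum(w * x for w, x in zip(row, vector))))
              for row in hidden_weights]
    raw = [sum(w * h for w, h in zip(row, hidden)) for row in output_weights]
    peak = max(raw)
    exps = [math.exp(value - peak) for value in raw]
    total = sum(exps)
    return [value / total for value in exps]


def predict(sequence, hidden_weights, output_weights, window=13):
    """Return a state string and a 0-9 confidence per residue."""
    states, confidence = [], []
    for i in range(len(sequence)):
        outputs = forward(encode_window(sequence, i, window), hidden_weights, output_weights)
        ranked = sorted(outputs, reverse=True)
        states.append(STATES[outputs.index(ranked[0])])
        # Assumed confidence: difference between the two largest outputs, scaled to 0-9.
        confidence.append(min(9, int(10 * (ranked[0] - ranked[1]))))
    return "".join(states), confidence


rng = random.Random(0)
window = 13
hidden_weights = make_layer(window * len(ALPHABET), 10, rng)
output_weights = make_layer(10, len(STATES), rng)

query = "MKVLAAGIVGEELLK"   # made-up sequence
states, confidence = predict(query, hidden_weights, output_weights, window)
print(query)
print(states)
print("".join(str(c) for c in confidence))
```

In a real predictor the weights are fitted to proteins of known structure, and the one-hot inputs are replaced by profile values taken from a sequence family alignment or a position-specific scoring matrix.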

25 Secondary structure prediction
AGADIR - An algorithm to predict the helical content of peptides
APSSP - Advanced Protein Secondary Structure Prediction Server
GOR - Garnier et al, 1996
HNN - Hierarchical Neural Network method (Guermeur, 1997)
Jpred - A consensus method for protein secondary structure prediction at University of Dundee
JUFO - Protein secondary structure prediction from sequence (neural network)
nnPredict - University of California at San Francisco (UCSF)
PredictProtein - PHDsec, PHDacc, PHDhtm, PHDtopology, PHDthreader, MaxHom, EvalSec from Columbia University
Prof - Cascaded Multiple Classifiers for Secondary Structure Prediction
PSA - BioMolecular Engineering Research Center (BMERC) / Boston
PSIpred - Various protein structure prediction methods at Brunel University
SOPMA - Geourjon and Deléage, 1995
SSpro - Secondary structure prediction using bidirectional recurrent neural networks at University of California
DLP - Domain linker prediction at RIKEN

