1 Chapter 7 Protein and RNA Structure Prediction, Department of Computer Science and Information Engineering, National Chi Nan University, 黃光璿, 2004/05/24.

1 1 Chapter 7 Protein and RNA Structure Prediction, Department of Computer Science and Information Engineering, National Chi Nan University, 黃光璿, 2004/05/24

2 2 Proteins: built from a repertoire of 20 amino acids

4 4 7.1 Amino Acids

5 5 Amino acid: central carbon, amino group (NH2), carboxyl group (COOH), hydrogen (H), side chain (R)

6 6 Isomers

9 9 Fig. 7.2

13 13 pH, pKa, and pI. pH = -log10 [H+]. pKa: the pH at which about half of the amino acid residues have dissociated (released H+). pI: the pH at the isoelectric point of the protein, where it carries no net charge.
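
The definitions on this slide can be checked numerically. The following is a minimal sketch, assuming the standard Henderson-Hasselbalch relation for a single acidic group; the function names and the example pKa of 4.0 are illustrative and not taken from the slides.

```python
import math

def ph_from_concentration(h_molar: float) -> float:
    """pH = -log10 [H+], with [H+] given in mol/L."""
    return -math.log10(h_molar)

def fraction_dissociated(ph: float, pka: float) -> float:
    """Fraction of an acidic group that has released its H+ at a given pH
    (Henderson-Hasselbalch relation for one ionizable group)."""
    return 1.0 / (1.0 + 10.0 ** (pka - ph))

if __name__ == "__main__":
    # [H+] = 1e-7 mol/L corresponds to neutral pH.
    print(ph_from_concentration(1e-7))
    # Hypothetical group with pKa 4.0: half dissociated when pH equals pKa.
    print(fraction_dissociated(4.0, 4.0))
    print(fraction_dissociated(7.0, 4.0))
```

When pH equals pKa the exponent is zero, so the fraction is exactly one half, which matches the slide's description of pKa.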

14 14 7.2 Polypeptide Composition

16 16 7.3 Secondary Structure

17 17 7.3.1 Backbone Flexibility

18 18 Conformation of Polypeptide Chain

19 19 Ramachandran Plot. Atom colors: N blue, C black, O red, H white

20 20 Secondary Structure: Alpha helix

21 21 Beta sheet

23 23 Beta turn

24 24 Loop

25 25 7.3.2 Accuracy of Prediction. Computational methods: neural networks; discrete-state models; hidden Markov models; nearest-neighbor classification; evolutionary computation

26 26 PHD and Predator: structure prediction algorithms with accuracies in the range of 70% to 75%
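
The slide does not say how accuracy is measured. A common choice for secondary structure is the per-residue three-state accuracy (often called Q3), and the sketch below assumes that measure. The state letters H (helix), E (strand), C (coil) and the example strings are illustrative only.

```python
def q3_accuracy(predicted: str, observed: str) -> float:
    """Fraction of residues whose predicted state matches the observed state."""
    if len(predicted) != len(observed) or not observed:
        raise ValueError("strings must be non-empty and of equal length")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return matches / len(observed)

if __name__ == "__main__":
    observed = "HHHHCCEEEE"   # hypothetical reference assignment
    predicted = "HHHCCCEEEH"  # hypothetical prediction
    print(q3_accuracy(predicted, observed))
```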

27 27 7.3.3 Chou-Fasman Method

28 28 Identifying Alpha Helices 1. Find all regions where four out of six contiguous residues have P(a) > 100. 2. Extend each region in both directions until four contiguous residues with P(a) < 100 are reached. 3. If ΣP(a) > ΣP(b) and the stretch is longer than 5 residues, identify it as a helix.
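
The three helix rules can be sketched as follows. This is one possible reading, not a reference implementation: overlapping nucleation windows are merged, extension stops when the next (up to) four residues all score below 100, and near a sequence end fewer than four residues are accepted as a stopping run. The propensity tables use made-up symbols and values ("a" helix former, "b" sheet former, "x" breaker) so that no real Chou-Fasman parameters are implied.

```python
# Toy propensities on an abstract three-letter alphabet (hypothetical values).
TOY_P_A = {"a": 140, "b": 90, "x": 60}
TOY_P_B = {"a": 80, "b": 150, "x": 60}

def merge(spans):
    """Merge overlapping or touching half-open spans (lo, hi)."""
    merged = []
    for lo, hi in sorted(spans):
        if merged and lo <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], hi))
        else:
            merged.append((lo, hi))
    return merged

def helix_regions(seq, p_a, p_b):
    """Return half-open (start, end) spans predicted as helix."""
    n = len(seq)
    strong = [p_a[r] > 100 for r in seq]
    weak = [p_a[r] < 100 for r in seq]
    # Rule 1: windows of six residues with at least four strong formers.
    nuclei = [(s, s + 6) for s in range(n - 5) if sum(strong[s:s + 6]) >= 4]
    extended = []
    for lo, hi in merge(nuclei):
        # Rule 2: extend until the next four residues are all weak.
        while hi < n and not all(weak[hi:hi + 4]):
            hi += 1
        while lo > 0 and not all(weak[max(0, lo - 4):lo]):
            lo -= 1
        extended.append((lo, hi))
    accepted = []
    for lo, hi in merge(extended):
        # Rule 3: compare summed propensities and require length > 5.
        sum_a = sum(p_a[r] for r in seq[lo:hi])
        sum_b = sum(p_b[r] for r in seq[lo:hi])
        if sum_a > sum_b and hi - lo > 5:
            accepted.append((lo, hi))
    return accepted

if __name__ == "__main__":
    print(helix_regions("xxxxaaaaaaxxxx", TOY_P_A, TOY_P_B))
```

Note that the "four out of six" rule lets a nucleus include up to two flanking weak residues, so the reported span can be slightly wider than the run of strong formers.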

29 29 Identifying Beta Sheets 1. Find all regions where four out of six contiguous residues have P(b) > 100. 2. Extend each region in both directions until four contiguous residues with P(b) < 100 are reached. 3. If ΣP(b) > ΣP(a) and the average value of P(b) over the stretch is > 100, identify it as a sheet.

30 30 Resolving Overlapping Regions. Where a helix region and a sheet region overlap, identify the overlap as helix if ΣP(a) > ΣP(b) and as sheet if ΣP(b) > ΣP(a), with both sums taken over the overlapping region.

31 31 Identifying Turns 1. Let P(t) = f(i) × f(i+1) × f(i+2) × f(i+3) for each position i. 2. Identify positions i through i+3 as a turn if (a) P(t) > 0.000075; (b) the average of P(turn) over the four residues is > 100; and (c) ΣP(a) < ΣP(turn) > ΣP(b) over the four residues.
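
A minimal sketch of the turn test follows. The third condition is assumed to be the usual statement of the rule, ΣP(a) < ΣP(turn) > ΣP(b). All tables are toy values on a made-up two-letter alphabet ("t" turn former, "x" other), and the same positional frequency table is reused for all four positions purely for brevity.

```python
# Hypothetical tables; real Chou-Fasman parameters are not reproduced here.
TOY_FREQ = {"t": 0.15, "x": 0.02}
TOY_F = [TOY_FREQ] * 4            # f(i), f(i+1), f(i+2), f(i+3)
TOY_P_TURN = {"t": 150, "x": 60}
TOY_P_A = {"t": 60, "x": 110}
TOY_P_B = {"t": 60, "x": 110}

def predict_turns(seq, f, p_turn, p_a, p_b, cutoff=0.000075):
    """Return start positions i whose residues i..i+3 pass all three tests."""
    turns = []
    for i in range(len(seq) - 3):
        quad = seq[i:i + 4]
        # P(t): product of the positional frequencies of the four residues.
        p_t = 1.0
        for offset, residue in enumerate(quad):
            p_t *= f[offset][residue]
        sum_turn = sum(p_turn[r] for r in quad)
        sum_a = sum(p_a[r] for r in quad)
        sum_b = sum(p_b[r] for r in quad)
        if p_t > cutoff and sum_turn / 4 > 100 and sum_a < sum_turn > sum_b:
            turns.append(i)
    return turns

if __name__ == "__main__":
    print(predict_turns("xxttttxx", TOY_F, TOY_P_TURN, TOY_P_A, TOY_P_B))
```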

32 32 7.3.4 GOR Method: prediction based on a window of 17 residues
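
The slide gives only the window size. The sketch below assumes the general GOR idea: the state of the central residue is chosen by summing, over the 17 window positions (8 on each side), a score for seeing each residue at each offset. The `info` table layout, the padding character, and the toy scores are hypothetical.

```python
def windows(seq, half_width=8, pad="-"):
    """One window of 2 * half_width + 1 residues centered on each position."""
    padded = pad * half_width + seq + pad * half_width
    return [padded[i:i + 2 * half_width + 1] for i in range(len(seq))]

def gor_predict(seq, info, states="HEC", half_width=8):
    """info[state][offset][residue] is a score; missing entries count as 0."""
    prediction = []
    for window in windows(seq, half_width):
        scores = {}
        for state in states:
            total = 0.0
            for k, residue in enumerate(window):
                offset = k - half_width  # -8 .. +8 for the default window
                total += info.get(state, {}).get(offset, {}).get(residue, 0.0)
            scores[state] = total
        # Ties resolve to the earliest state in `states`.
        prediction.append(max(states, key=scores.get))
    return "".join(prediction)

# Toy scores (hypothetical): A favors helix, also via its neighbors.
TOY_INFO = {
    "H": {0: {"A": 1.0}, -1: {"A": 0.5}, 1: {"A": 0.5}},
    "E": {0: {"V": 1.0}},
    "C": {0: {"G": 1.0}},
}

if __name__ == "__main__":
    print(gor_predict("AAVVGG", TOY_INFO))
```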

33 33 7.4 Tertiary and Quaternary Structure

34 34 Tertiary Structure: the chain folds into a three-dimensional shape

35 35 Quaternary Structure: several tertiary structures combine into a functional macromolecule. Example: human hemoglobin

36 36 Driving Forces for Folding: electrostatic forces; hydrogen bonds; van der Waals forces; disulfide bonds; solvent interactions

37 37 7.4.1 Hydrophobicity. Hydrophobic collapse: folded proteins tend to keep polar and charged residues on the surface. Membrane-integral proteins are an exception.

38 38 Sickle-cell anemia: human hemoglobin consists of 2 alpha and 2 beta globins; a charged glutamic acid residue is replaced by a hydrophobic valine residue.

39 39 7.4.2 Disulfide Bonds

42 42 7.4.3 Active Structures vs Most Stable Structures Natural selection favors proteins that are both active and robust.

43 43 Levinthal Paradox (1968): for 100 residues, each assuming 3 different conformations, there are 3^100 ≈ 5 × 10^47 possible conformations. Even if one trial takes only 10^-13 s, trying them all would take far longer than the age of the universe. Proteins fold by progressive stabilization of intermediates rather than by random search.
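
The arithmetic behind the paradox can be reproduced directly. This sketch uses only the numbers on the slide (100 residues, 3 conformations each, 10^-13 s per trial); the function name and the 365.25-day year are illustrative choices.

```python
def levinthal_search_time(residues=100, states=3, seconds_per_trial=1e-13):
    """Return (conformations, seconds, years) for an exhaustive search."""
    conformations = states ** residues
    seconds = conformations * seconds_per_trial
    years = seconds / (365.25 * 24 * 3600)
    return conformations, seconds, years

if __name__ == "__main__":
    count, seconds, years = levinthal_search_time()
    print(f"{count:.2e} conformations")
    print(f"{seconds:.2e} s, about {years:.2e} years")
```

The result is on the order of 10^27 years, which is why random search cannot explain folding on biological timescales.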

44 44 7.5 Algorithms for Modeling Protein Folding: Lattice Models; Off-Lattice Models

45 45 7.5.1 Lattice Models. Reduce the search space and make computing tractable. Goal: find the conformation with minimum free energy.

46 46 HP model (hydrophobic-polar model): scoring is based on hydrophobic contacts; the goal is to maximize the H-to-H contacts. Fig. 7.8
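
The HP scoring rule can be sketched on a 2D square lattice. This is an illustrative version, assuming a contact is a pair of H residues on neighboring lattice sites that are not consecutive in the chain; the function name and the example folds are not from the slides.

```python
def hp_contacts(sequence, coords):
    """Count H-H contacts for a chain placed on a 2D square lattice.

    sequence: string of 'H' and 'P'; coords: list of (x, y) per residue.
    """
    if len(sequence) != len(coords):
        raise ValueError("one coordinate per residue is required")
    if len(set(coords)) != len(coords):
        raise ValueError("conformation is not self-avoiding")
    for (x1, y1), (x2, y2) in zip(coords, coords[1:]):
        if abs(x1 - x2) + abs(y1 - y2) != 1:
            raise ValueError("consecutive residues must be lattice neighbors")
    position = {c: i for i, c in enumerate(coords)}
    contacts = 0
    for i, (x, y) in enumerate(coords):
        if sequence[i] != "H":
            continue
        # Look only right and up so each neighboring pair is counted once.
        for neighbor in ((x + 1, y), (x, y + 1)):
            j = position.get(neighbor)
            if j is not None and sequence[j] == "H" and abs(i - j) > 1:
                contacts += 1
    return contacts

if __name__ == "__main__":
    folded = [(0, 0), (1, 0), (1, 1), (0, 1)]    # chain bent into a square
    extended = [(0, 0), (1, 0), (2, 0), (3, 0)]  # straight chain
    print(hp_contacts("HPPH", folded), hp_contacts("HPPH", extended))
```

A search procedure would enumerate or sample self-avoiding conformations and keep the one with the most contacts.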

48 48 7.5.2 Off-Lattice Models. Use RMSD (root mean square deviation) to measure the accuracy. Determine Φ and Ψ in the allowable region of the Ramachandran plot.
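
RMSD between a model and a reference structure can be computed as below. This minimal sketch assumes the two coordinate sets are already superimposed and list the same atoms in the same order; optimal superposition is a separate step that is not shown.

```python
import math

def rmsd(coords_a, coords_b):
    """Root mean square deviation between two equal-length lists of (x, y, z)."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    total = 0.0
    for (x1, y1, z1), (x2, y2, z2) in zip(coords_a, coords_b):
        total += (x1 - x2) ** 2 + (y1 - y2) ** 2 + (z1 - z2) ** 2
    return math.sqrt(total / len(coords_a))

if __name__ == "__main__":
    reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
    model = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]  # every atom shifted by 1 in z
    print(rmsd(reference, model))
```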

49 49 7.5.3 Energy Functions and Optimization. Problems: the exact forces that drive the folding process are not well understood, and modeling them exactly is too computationally expensive.

50 50 Summary: model representation; scoring function; search (optimization). Example: Folding@Home (V. Pande, Stanford)

51 51 7.6 Structure Prediction. Very high accuracy: < 3.0 Å

52 52 7.6.1 Comparative Modeling. Also called homology modeling. Relies on the robustness of the folding code.

53 53 1. Identify a set of protein structures related to the target protein. 2. Align the sequence of the target with the sequence of the template. 3. Construct the model. 4. Model the loop. 5. Model the side chains. 6. Evaluate the model.

54 54 7.6.2 Threading. Given a conformation and a protein sequence, measure how favorable the sequence is in that conformation.
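
One way to picture threading is to describe the conformation by its residue contacts and score a sequence by what ends up in contact. The sketch below assumes a deliberately crude energy, -1 for each hydrophobic-hydrophobic contact and 0 otherwise; the hydrophobic set, the contact list, and the sequences are all hypothetical, and real threading programs use richer potentials.

```python
# Simplified hydrophobic set (an assumption; classifications vary).
HYDROPHOBIC = set("AVLIMFW")

def thread_score(sequence, contacts, hydrophobic=HYDROPHOBIC):
    """Score a sequence placed on a conformation; lower is more favorable.

    contacts: list of (i, j) index pairs that are spatially close in the
    conformation.
    """
    score = 0
    for i, j in contacts:
        if sequence[i] in hydrophobic and sequence[j] in hydrophobic:
            score -= 1
    return score

if __name__ == "__main__":
    contacts = [(0, 3), (1, 4), (0, 5)]  # hypothetical six-residue fold
    for candidate in ("LKVIAF", "KKDEKK"):
        print(candidate, thread_score(candidate, contacts))
```

Comparing scores across many known conformations for one sequence, or many sequences for one conformation, is what turns this scoring step into fold recognition.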

55 55 7.7 Predicting RNA Secondary Structures

56 56 Nearest Neighbor Energy Rules: Zuker's Mfold program
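
Mfold minimizes free energy using nearest-neighbor energy rules, which requires experimentally derived parameter tables that are not reproduced here. As a simpler stand-in, the sketch below uses Nussinov-style dynamic programming, which only maximizes the number of base pairs. It is not Zuker's algorithm. The allowed pairs (including G-U wobble) and the minimum hairpin loop of 3 unpaired bases are assumptions.

```python
ALLOWED = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}

def max_base_pairs(rna, min_loop=3):
    """Maximum number of nested base pairs (Nussinov-style recursion)."""
    n = len(rna)
    if n == 0:
        return 0
    # best[i][j] = maximum pairs within rna[i..j]; entries with i > j stay 0.
    best = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            score = best[i][j - 1]  # case: base j is unpaired
            for k in range(i, j - min_loop):  # case: base j pairs with base k
                if (rna[k], rna[j]) in ALLOWED:
                    left = best[i][k - 1] if k > i else 0
                    score = max(score, left + 1 + best[k + 1][j - 1])
            best[i][j] = score
    return best[0][n - 1]

if __name__ == "__main__":
    print(max_base_pairs("GGGAAAUCC"))
```

An energy-based method replaces the "+ 1" per pair with stacking and loop energies and minimizes the total instead of maximizing a count.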

57 57 Why study RNA secondary structures? To understand gene regulation and the expression of protein products.

58 58 References and figure sources: 1. Dan E. Krane and Michael L. Raymer, Fundamental Concepts of Bioinformatics, Benjamin/Cummings, 2003. 2. J. M. Berg, J. L. Tymoczko, and L. Stryer, Biochemistry, Fifth Edition, 2001.

