PHD, Predator: secondary structure prediction algorithms with accuracies in the range of 70% to 75%.
Identifying Alpha Helices
1. Find all regions where four out of six contiguous residues have P(a) > 100.
2. Extend the regions in both directions until four contiguous residues with P(a) < 100 are reached.
3. If ΣP(a) > ΣP(b) and the stretch is longer than 5 residues, the region is identified as a helix.
Identifying Beta Sheets
1. Find all regions where four out of six contiguous residues have P(b) > 100.
2. Extend the regions in both directions until four contiguous residues with P(b) < 100 are reached.
3. If ΣP(b) > ΣP(a) and the average value of P(b) over the stretch > 100, the region is identified as a sheet.
Resolving Overlapping Regions: an overlapping region is identified as a helix if ΣP(a) > ΣP(b) over the overlap, and as a sheet if ΣP(b) > ΣP(a).
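The helix, sheet, and overlap rules on the preceding slides can be sketched together in Python. This is a minimal sketch under stated assumptions, not a reference implementation: the propensity values are illustrative and cover only a few residues, the nucleation threshold is taken to be 100, the extension step is assumed to stop when a four-residue window averages below 100, ties in an overlap go to the sheet, and all function names are hypothetical.

```python
# Illustrative, approximate propensities (100 = neutral). Only a few residues
# are listed here; a real table covers all 20 amino acids.
P_ALPHA = {"A": 142, "E": 151, "L": 121, "G": 57, "V": 106, "I": 108, "Y": 69}
P_BETA = {"A": 83, "E": 37, "L": 130, "G": 75, "V": 170, "I": 160, "Y": 147}


def mean(values):
    return sum(values) / len(values)


def total(seq, table, start, end):
    """Sum of propensities over the half-open region [start, end)."""
    return sum(table[r] for r in seq[start:end])


def candidate_regions(seq, table, window=6, min_hits=4, run=4, threshold=100):
    """Steps 1 and 2: nucleate, extend, and merge into half-open regions."""
    n = len(seq)
    found = []
    for i in range(n - window + 1):
        # Step 1: at least min_hits residues in the window exceed the threshold.
        if sum(table[r] > threshold for r in seq[i:i + window]) < min_hits:
            continue
        start, end = i, i + window
        # Step 2 (assumed reading): grow each edge while the four-residue
        # window at that edge still averages at or above the threshold.
        while start > 0 and mean([table[r] for r in seq[start - 1:start - 1 + run]]) >= threshold:
            start -= 1
        while end < n and mean([table[r] for r in seq[end + 1 - run:end + 1]]) >= threshold:
            end += 1
        found.append((start, end))
    merged = []
    for start, end in sorted(found):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(end, merged[-1][1]))
        else:
            merged.append((start, end))
    return merged


def find_helices(seq):
    # Step 3 for helices: sum P(a) > sum P(b) and more than 5 residues.
    return [(s, e) for s, e in candidate_regions(seq, P_ALPHA)
            if total(seq, P_ALPHA, s, e) > total(seq, P_BETA, s, e) and e - s > 5]


def find_sheets(seq):
    # Step 3 for sheets: sum P(b) > sum P(a) and average P(b) > 100.
    return [(s, e) for s, e in candidate_regions(seq, P_BETA)
            if total(seq, P_BETA, s, e) > total(seq, P_ALPHA, s, e)
            and total(seq, P_BETA, s, e) / (e - s) > 100]


def predict(seq):
    """Label residues H (helix), E (sheet), or - and resolve overlaps."""
    labels = ["-"] * len(seq)
    helices, sheets = find_helices(seq), find_sheets(seq)
    for s, e in helices:
        labels[s:e] = "H" * (e - s)
    for s, e in sheets:
        for i in range(s, e):
            if labels[i] == "-":
                labels[i] = "E"
    for hs, he in helices:
        for ss, se in sheets:
            lo, hi = max(hs, ss), min(he, se)
            if lo < hi:
                # Overlap rule: compare the two sums over the overlap only.
                helix_wins = total(seq, P_ALPHA, lo, hi) > total(seq, P_BETA, lo, hi)
                labels[lo:hi] = ("H" if helix_wins else "E") * (hi - lo)
    return "".join(labels)
```

With these illustrative tables, `predict("GGGGAEAEAEGGGG")` returns `--HHHHHHHHHH--` and `predict("GGGGVIVIVIGGGG")` returns `--EEEEEEEEEE--`. The nucleation window deliberately admits a couple of flanking glycines, which is one reason practical implementations often trim region ends afterwards.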
Identifying Turns
1. Let P(t) = f(i) × f(i+1) × f(i+2) × f(i+3) for each position i.
2. Identify position i as a turn if all of the following hold:
   a. P(t) > 0.000075;
   b. the average of P(turn) over the four residues > 100;
   c. ΣP(a) < ΣP(turn) > ΣP(b) over the four residues.
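The turn test can be sketched as a single predicate over a four-residue window. This is a hedged illustration: the tables below are small, illustrative stand-ins (five residues only), the cutoff of 0.000075 is the value usually quoted for this method and is exposed as a parameter, and the names `F`, `is_turn`, and `find_turns` are hypothetical.

```python
# Illustrative values only; not a complete or authoritative parameter set.
# F[res] holds the frequency of the residue at turn positions i, i+1, i+2, i+3.
F = {
    "N": (0.161, 0.083, 0.191, 0.091),
    "P": (0.102, 0.301, 0.034, 0.068),
    "G": (0.102, 0.085, 0.190, 0.152),
    "D": (0.147, 0.110, 0.179, 0.081),
    "A": (0.060, 0.076, 0.035, 0.058),
}
P_TURN = {"N": 156, "P": 152, "G": 156, "D": 146, "A": 66}
P_ALPHA = {"N": 67, "P": 57, "G": 57, "D": 101, "A": 142}
P_BETA = {"N": 89, "P": 55, "G": 75, "D": 54, "A": 83}


def is_turn(seq, i, cutoff=0.000075):
    """Apply the three turn conditions to the residues at i .. i+3."""
    quad = seq[i:i + 4]
    if len(quad) < 4:
        return False
    # P(t): product of the position-specific frequencies.
    p_t = 1.0
    for offset, res in enumerate(quad):
        p_t *= F[res][offset]
    turn_sum = sum(P_TURN[r] for r in quad)
    alpha_sum = sum(P_ALPHA[r] for r in quad)
    beta_sum = sum(P_BETA[r] for r in quad)
    return (p_t > cutoff
            and turn_sum / 4 > 100
            and turn_sum > alpha_sum
            and turn_sum > beta_sum)


def find_turns(seq):
    """Return every start position whose four-residue window passes the test."""
    return [i for i in range(len(seq) - 3) if is_turn(seq, i)]
```

With these illustrative numbers, `is_turn("NPGD", 0)` is true, since the product of frequencies is about 0.00075 and the turn propensities dominate both other sums.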
Active Structures vs Most Stable Structures: natural selection favors proteins that are both active and robust.
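Levinthal Paradox: in a protein of 100 residues, each of which can assume 3 different conformations, there are 3^100 ≈ 5 × 10^47 possible structures. Suppose it takes 10^-13 s for one trial; an exhaustive search would then take about 1.6 × 10^27 years. Proteins fold by progressive stabilization of intermediates rather than by random search.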
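The arithmetic behind the paradox is easy to check. The sketch below assumes 100 residues (the count that reproduces the figure of roughly 5 × 10^47) and a time of 10^-13 s per trial, which is an assumed textbook-style value; both are plain variables so they can be changed.

```python
# Assumed inputs: 100 residues, 3 conformations each, 1e-13 s per trial.
residues = 100
conformations_per_residue = 3
seconds_per_trial = 1e-13

# Number of distinct conformations if every residue varies independently.
total_conformations = conformations_per_residue ** residues

# Time to try every conformation once, in seconds and in years.
search_seconds = total_conformations * seconds_per_trial
seconds_per_year = 365.25 * 24 * 3600
search_years = search_seconds / seconds_per_year

print(f"conformations: {total_conformations:.2e}")
print(f"search time:   {search_years:.2e} years")
```

Under these assumptions the search time comes out on the order of 10^27 years, which is the point of the paradox: real proteins cannot be sampling conformations at random.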
Algorithms for Modeling Protein Folding: lattice models and off-lattice models.
Lattice Models: reduce the search space and make the computation tractable. The goal is to find the conformation that minimizes free energy.
HP Model (hydrophobic-polar model): scoring is based on hydrophobic contacts; the goal is to maximize the number of H-to-H contacts (Fig. 7.8).
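A small sketch of HP scoring on a 2D square lattice may make this concrete. It assumes a conformation is written as a string of moves (U, D, L, R) from the origin, counts only H-H pairs that are lattice neighbours but not chain neighbours, and finds the best fold by brute-force enumeration, which is only feasible for very short chains. The function names are hypothetical.

```python
from itertools import product

MOVES = {"U": (0, 1), "D": (0, -1), "L": (-1, 0), "R": (1, 0)}


def fold(moves):
    """Convert a move sequence to coordinates; None if the chain collides."""
    pos = (0, 0)
    coords = [pos]
    for m in moves:
        dx, dy = MOVES[m]
        pos = (pos[0] + dx, pos[1] + dy)
        if pos in coords:
            return None  # not a self-avoiding walk
        coords.append(pos)
    return coords


def hh_contacts(seq, coords):
    """Count H-H pairs adjacent on the lattice but not adjacent in the chain."""
    count = 0
    for i in range(len(seq)):
        for j in range(i + 2, len(seq)):
            if seq[i] == "H" and seq[j] == "H":
                dist = abs(coords[i][0] - coords[j][0]) + abs(coords[i][1] - coords[j][1])
                if dist == 1:
                    count += 1
    return count


def best_fold(seq):
    """Exhaustively search all move strings and keep the highest score."""
    best_score, best_moves = -1, None
    for moves in product("UDLR", repeat=len(seq) - 1):
        coords = fold(moves)
        if coords is None:
            continue
        score = hh_contacts(seq, coords)
        if score > best_score:
            best_score, best_moves = score, "".join(moves)
    return best_score, best_moves
```

For the sequence `HPPH`, the U-shaped fold `URD` brings the two H residues together for one contact, while the straight chain `UUU` scores zero. The enumeration grows as 4^(n-1), which is why practical methods replace it with heuristic search.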
Off-Lattice Models: use RMSD (root mean square deviation) to measure accuracy. Determine Φ and Ψ within the allowable regions of the Ramachandran plot.
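RMSD between a predicted and a reference structure can be computed as below. This sketch assumes the two coordinate lists hold the same atoms in the same order and have already been superimposed; it does not perform the optimal superposition that is normally done first. The function name is hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean square deviation between two equal-length lists of (x, y, z)."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        squared += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(squared / len(coords_a))
```

For example, two atoms each displaced by 2 Å along z give an RMSD of 2.0 Å, and identical structures give 0.0.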
Energy Functions and Optimization Problems: the exact forces that drive the folding process are not well understood, and searching for the minimum-energy conformation is too computationally expensive.
Summary: model representation, scoring function, search (optimization). (V. Pande, Stanford)
Structure Prediction: very high accuracy, < 3.0 Å.
Comparative Modeling: also called homology modeling. It relies on the robustness of the folding code.
1. Identify a set of protein structures related to the target protein.
2. Align the sequence of the target with the sequence of the template.
3. Construct the model.
4. Model the loops.
5. Model the side chains.
6. Evaluate the model.
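Once the target has been aligned to a template (step 2), a quick sanity check is the percent identity of the alignment. The sketch below assumes the two sequences are already aligned, use `-` for gaps, and that identity is measured over columns where neither sequence has a gap, which is one of several conventions in use. The function name is hypothetical.

```python
def percent_identity(aligned_target, aligned_template):
    """Percent of gap-free alignment columns where the residues match."""
    if len(aligned_target) != len(aligned_template):
        raise ValueError("aligned sequences must have the same length")
    matches = 0
    compared = 0
    for a, b in zip(aligned_target, aligned_template):
        if a == "-" or b == "-":
            continue  # skip columns containing a gap
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0
```

For the alignment `ACD-F` against `ACEGF`, three of the four gap-free columns match, giving 75.0.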
Threading: given a conformation and a protein sequence, measure how favorable it is for the sequence to adopt that conformation.
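As a toy illustration of such a favorability measure, the sketch below scores a sequence against a structure described only by whether each position is buried (`B`) or exposed (`E`). The scoring rule (+1 for a hydrophobic residue in a buried position or a polar residue in an exposed one, -1 otherwise), the hydrophobic set, and the function name are all hypothetical simplifications; real threading methods use much richer scoring terms.

```python
# Hypothetical, simplified hydrophobic set used only for this illustration.
HYDROPHOBIC = set("AVLIMFWC")


def threading_score(sequence, environments):
    """Score how well a sequence fits a string of B (buried) / E (exposed) sites."""
    if len(sequence) != len(environments):
        raise ValueError("sequence and environments must have the same length")
    score = 0
    for residue, env in zip(sequence, environments):
        hydrophobic = residue in HYDROPHOBIC
        if (hydrophobic and env == "B") or (not hydrophobic and env == "E"):
            score += 1  # residue matches its environment
        else:
            score -= 1  # residue is in an unfavorable environment
    return score
```

Here `threading_score("LKVE", "BEBE")` gives 4 and `threading_score("LKVE", "EBEB")` gives -4, so the first structure would be ranked as the better fit for that sequence.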
Predicting RNA Secondary Structures
Nearest Neighbor Energy Rules: Zuker’s Mfold program.
Why study RNA secondary structures? To understand gene regulation and the expression of protein products.
References and Image Sources
1. Dan E. Krane and Michael L. Raymer, Fundamental Concepts of Bioinformatics, Benjamin/Cummings.
2. J. M. Berg, J. L. Tymoczko, and L. Stryer, Biochemistry, Fifth Edition.