Protein Structure Prediction. Historical Perspective Protein Folding: From the Levinthal Paradox to Structure Prediction, Barry Honig, 1999 A personal.

1 Protein Structure Prediction

2 Historical Perspective
Protein Folding: From the Levinthal Paradox to Structure Prediction, Barry Honig, 1999
A personal perspective on advances and developments in protein folding over the last 40 years

3 Levinthal Paradox
Cyrus Levinthal, Columbia University, 1968
Observed that a protein does not have enough time to search its entire conformational space at random
Resolution: proteins must fold through some directed process
The goal is to understand the dynamics of this process
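The scale of the paradox can be sketched with a back-of-the-envelope calculation. The slide gives no numbers, so the chain length, the number of conformations per residue, and the sampling rate below are all assumed, illustrative values.

```python
# Back-of-the-envelope Levinthal estimate.
# All three inputs are assumed, illustrative values; none comes from the slides.
RESIDUES = 100              # assumed chain length
STATES_PER_RESIDUE = 3      # assumed backbone conformations per residue
SAMPLES_PER_SECOND = 1e13   # assumed conformational sampling rate

SECONDS_PER_YEAR = 365.25 * 24 * 3600


def exhaustive_search_years(residues, states, rate):
    """Years needed to visit every conformation once at the given rate."""
    conformations = states ** residues
    return conformations / rate / SECONDS_PER_YEAR


years = exhaustive_search_years(RESIDUES, STATES_PER_RESIDUE, SAMPLES_PER_SECOND)
print(f"conformations: {STATES_PER_RESIDUE ** RESIDUES:.2e}")
print(f"exhaustive search time: {years:.2e} years")
```

Under these assumptions the search time exceeds the age of the universe by many orders of magnitude, which is why a random search cannot be how proteins fold.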

4 Old vs. New Views
Old:
• Hierarchical view of protein folding
• Secondary structures form, then interact to form tertiary structures
• General order of events
New:
• Statistical ensembles of states
• Potential energy landscape
• Folding “funnel”
The two views are not all that different; the most important ideas were theorized many years ago

5 Secondary Structures
The consensus view is that secondary structure formation is the earliest part of the folding process
Numerous studies indicate that local sequence codes for local structure
• Sequences that are helical in a folded protein tend to be helical in isolation
Prediction algorithms for secondary structure elements (SSEs) were about 70% correct as of 1993
The remaining failures suggest that tertiary interactions play some role in stabilizing SSEs
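A figure such as "about 70% correct" is usually a per-residue accuracy. The slide does not say how it was measured, so the sketch below assumes a three-state scheme (H = helix, E = strand, C = coil). The function name and the example strings are hypothetical.

```python
def three_state_accuracy(predicted, observed):
    """Fraction of residues whose predicted state (H, E or C) matches the observed one."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed strings must have equal length")
    if not observed:
        raise ValueError("empty input")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return matches / len(observed)


# Hypothetical 10-residue example: two residues are mispredicted.
observed = "CHHHHHCEEC"
predicted = "CHHHHCCEEE"
print(three_state_accuracy(predicted, observed))  # 8 of 10 residues agree
```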

6 However…
It is not clear which sequence elements code for overall topology
One factor is the existence of hydrophobic faces on the surface of SSEs
Predicting the topology of SSEs remains challenging, even when the protein class is known

7 Atomic level calculations
Molecular calculations have had a great impact on our understanding of protein folding
Harold Scheraga, 1968
Shneior Lifson, 1969
Martin Karplus’s laboratory, ~1979
Early calculations had trouble dealing with solvent effects

8 Secondary Structure
Many of the essential elements of protein energetics can be derived from looking at SSE formation
Early experimental work:
Ingwall et al., 1968
Baldwin et al., 1989: worked on stabilizing shorter helices
Dyson and Wright, 1991: demonstrated that even short peptides in solution can be partially structured

9 Results
Yang and Honig, 1995
Alpha-helices are stabilized by hydrophobic interactions and close packing; hydrogen bonding has little effect
Beta-sheets are stabilized by non-polar interactions between residues on adjacent strands
The work supports the idea that SSEs are coded for locally in the sequence

10 Folding Pathways
SSEs can change conformation in the presence of a relatively small number of tertiary interactions
The free-energy differences between alpha-helix, beta-sheet, and coil are not great
Individual helices can be changed into beta-sheets by changing just a few amino acids
This suggests that proteins have a “structural plasticity” which allows for changes in conformation
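One way to see why small free-energy differences matter is to convert them into equilibrium populations with a Boltzmann weighting. This is a minimal sketch. The free energies below are hypothetical placeholders, not values from the talk.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol*K)


def boltzmann_populations(free_energies, temperature=298.0):
    """Equilibrium fraction of each state, given free energies in kcal/mol."""
    rt = R_KCAL * temperature
    lowest = min(free_energies.values())
    # Shift by the lowest energy so the largest weight is exactly 1.
    weights = {s: math.exp(-(g - lowest) / rt) for s, g in free_energies.items()}
    total = sum(weights.values())
    return {s: w / total for s, w in weights.items()}


# Hypothetical free energies (kcal/mol) for one short segment.
segment = {"helix": 0.0, "sheet": 0.5, "coil": 1.0}
for state, fraction in boltzmann_populations(segment).items():
    print(f"{state}: {fraction:.2f}")
```

With gaps of around 1 kcal/mol or less, every state keeps a sizeable population. A few tertiary contacts, or a few amino acid changes, would therefore be enough to tip the balance.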

11 Folding Pathways
Early in the folding process, many different combinations of SSEs have very similar stabilities
In the end, it is the tertiary interactions that drive the chain toward the native topology
Early in folding, SSEs “flicker”; they are eventually stabilized by tertiary interactions and converge to the native state
This suggests that multiple folding pathways exist, all of which can lead to the same end result once stabilized

12 Structure Prediction
Recently, the field has split into two problems:
• Protein prediction problem: trying to predict the end result of folding, relying heavily on comparison between known and unknown structures
• Protein folding problem: trying to understand the folding path that leads to the end result, typically by MD simulations or energy calculations
The author's contention is that both areas will need to be used together to fully understand protein folding

13 PrISM
Yang and Honig, 1999
Software suite that integrates prediction based on simulations with known information about structures
• Sequence analysis
• Structure-based sequence alignment
• Fast structure-structure superposition using a structural domain database
• Multiple structure alignment
• Fold recognition and homology model building
Used to make predictions for all 43 targets of the CASP3 conference (more on CASP later)

14 Conclusions
Much of the current understanding of protein folding was theorized long ago
Vague and speculative ideas have been replaced by carefully defined theoretical concepts and rigorous experimental observations

15 Conclusions
The polypeptide backbone is the most important determinant of structure
SSEs are “meta-stable”; the statement that sequence determines structure is not wholly accurate
A more accurate statement is that sequence chooses from a limited set of available SSEs and determines how they are ordered in space

16 Conclusions
Free-energy differences between alternate conformations are not large, which may provide a basis for rapid evolutionary change

17 CASP
A decade of CASP: progress, bottlenecks and prognosis in protein structure prediction, John Moult
CASP = Critical Assessment of Structure Prediction
First held in 1994, and every 2 years afterwards
Teams make structure predictions from sequences alone

18 CASP
Two categories of predictors
• Automated
  Automatic servers must complete their analysis within 48 hours
  Shows what is possible through computer analysis alone
• Non-automated
  Groups spend considerable time and effort on each target
  Use both computational techniques and human analysis

19 CASP
CASP6, 2004
• 200 prediction teams from 24 countries
• Over 30,000 predictions for 64 protein targets collected and evaluated
• Conference held afterwards to discuss results, with many teams presenting individual results and methodologies
• Helps to steer future work

20 Modeling classes
Comparative modeling based on a clear sequence relationship
Modeling based on more distant evolutionary relationships
Modeling based on non-homologous fold relationships
Template-free modeling

21 Comparative modeling based on a clear sequence relationship
There is an easily detectable sequence relationship between the target protein and one or more known protein structures, typically found through BLAST
The model is copied from the template; however:
• Target and template sequences must be aligned
• In general, reliably building regions not present in the template is still a challenge
• Sidechain accuracy is poor
Refinement remains a challenge
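The alignment step can be illustrated with a minimal Needleman-Wunsch global alignment. This is a simplified sketch and not what BLAST does: BLAST uses heuristic local alignment with substitution matrices, while the identity scoring, linear gap penalty, and example sequences here are assumptions chosen for brevity.

```python
def global_align(target, template, match=1, mismatch=-1, gap=-2):
    """Needleman-Wunsch global alignment with identity scoring and a linear gap penalty."""
    n, m = len(target), len(template)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            pair = match if target[i - 1] == template[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + pair,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)

    # Trace back from the bottom-right corner to recover one optimal alignment.
    aligned_target, aligned_template = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            pair = match if target[i - 1] == template[j - 1] else mismatch
            if score[i][j] == score[i - 1][j - 1] + pair:
                aligned_target.append(target[i - 1])
                aligned_template.append(template[j - 1])
                i, j = i - 1, j - 1
                continue
        if i > 0 and score[i][j] == score[i - 1][j] + gap:
            aligned_target.append(target[i - 1])   # target residue with no template equivalent
            aligned_template.append("-")
            i -= 1
        else:
            aligned_target.append("-")             # template residue missing from the target
            aligned_template.append(template[j - 1])
            j -= 1
    return "".join(reversed(aligned_target)), "".join(reversed(aligned_template)), score[n][m]


# Hypothetical sequences: the target has one residue the template lacks.
print(global_align("ACDEFG", "ACEFG"))
```

Gap columns in the output mark residues with no template equivalent. These are the regions the slide flags as hard to build reliably.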

22 Comparative modeling based on a clear sequence relationship
Progress in MD is needed for refinement
Models are useful for identifying which members of a protein family have similar functions and which differ

23 Modeling based on more distant evolutionary relationships
Makes use of PSI-BLAST and hidden Markov models
Compile a profile for the sequence, then compare this profile to other known profiles
Allows structures to be predicted even when the sequences are not close
Between CASP4 and CASP5, the use of metaservers to find consensus structures led to improved accuracy
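The idea of a profile can be sketched as a table of per-position log-odds scores built from aligned family members. This toy version scores a single sequence against a profile rather than comparing two profiles, and it ignores gaps. It is not the PSI-BLAST or HMM procedure. The uniform background, the pseudocount, and the example family are assumptions.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def build_profile(aligned_sequences, pseudocount=1.0):
    """Per-column log-odds scores from equal-length aligned sequences."""
    length = len(aligned_sequences[0])
    if any(len(seq) != length for seq in aligned_sequences):
        raise ValueError("all aligned sequences must have the same length")
    background = 1.0 / len(AMINO_ACIDS)  # assumed uniform background frequency
    profile = []
    for column in zip(*aligned_sequences):
        counts = Counter(c for c in column if c in AMINO_ACIDS)  # gap characters are skipped
        total = sum(counts.values()) + pseudocount * len(AMINO_ACIDS)
        profile.append({aa: math.log((counts[aa] + pseudocount) / total / background)
                        for aa in AMINO_ACIDS})
    return profile


def score_sequence(profile, sequence):
    """Sum of column scores for an ungapped sequence of the same length as the profile."""
    if len(sequence) != len(profile):
        raise ValueError("sequence length must match profile length")
    return sum(column[aa] for column, aa in zip(profile, sequence))


# Hypothetical family of three aligned sequences.
family = ["ACDE", "ACDE", "ACDF"]
profile = build_profile(family)
print(score_sequence(profile, "ACDF"), score_sequence(profile, "WWWW"))
```

A sequence that resembles the family at each position scores higher than an unrelated one, even when it is not identical to any single member. That is what lets profile methods reach more distant relatives.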

24 Modeling based on more distant evolutionary relationships
Limitations:
• The correct template may not be identified
• Aligning the target sequence to the template is not trivial
• A significant fraction of residues will have no structural equivalent in the template; modeling of these regions is hit or miss
• Although target and template regions are similar, they are not identical, and the greater the difference, the higher the error
Details are thus not accurate, but the overall structure can be useful
Further improvement will require working together with template-free methodologies

25 Modeling based on more distant evolutionary relationships

26 Modeling based on non-homologous fold relationships
Protein “threading”
In recent CASP experiments, these methods have not been competitive with template-free models

27 Template-free Modeling
For sequences where no template is available
Historically, physics-based approaches were used
Newer methods focus on substructures
• While we have not seen all folds, we have probably seen nearly all substructures
Make use of substructure relationships
• From a few residues through SSEs to super-secondary structures

28 Template-free Modeling
A range of possible conformations is considered
The most successful package has been ROSETTA
For proteins of fewer than ~100 residues, these methods produce one or several approximately correct structures (4-6 Å RMSD for C-alpha atoms)
Selecting the most accurate structures from all the candidates is still unsolved; clustering is typically used at present
Development of atomic models is crucial to further progress
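The two quantities on this slide, C-alpha RMSD and clustering-based selection, can be sketched as follows. This is an assumed, simplified scheme and not ROSETTA's own procedure: the RMSD is computed on coordinates taken to be already superposed, and the "cluster centre" is simply the model with the most neighbours inside a cutoff. The coordinates and the cutoff are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """RMSD between two equal-length lists of (x, y, z) points; no superposition is done."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = sum((xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
                for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b))
    return math.sqrt(total / len(coords_a))


def largest_cluster_center(models, cutoff):
    """Index of the model with the most other models within `cutoff` RMSD of it."""
    best_index, best_count = 0, -1
    for i, model in enumerate(models):
        count = sum(1 for j, other in enumerate(models)
                    if j != i and rmsd(model, other) <= cutoff)
        if count > best_count:
            best_index, best_count = i, count
    return best_index


# Hypothetical three-atom "models": three similar candidates and one outlier.
base = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
models = [
    base,
    [(x, y + 0.1, z) for x, y, z in base],
    [(x, y - 0.1, z) for x, y, z in base],
    [(x, y + 5.0, z) for x, y, z in base],
]
print(largest_cluster_center(models, cutoff=0.15))
```

The reasoning behind this kind of selection is that conformations reached repeatedly by independent runs are more likely to lie near a broad energy minimum. A real pipeline would superpose the structures before measuring RMSD.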

29 Template-free Modeling

30 CASP Progress

