1 Protein Modeling Protein Structure Prediction

2 3D Protein Structure [Figure: amino-acid sequence (ALA, LEU, PRO, VAL, ARG, …) with Cα atoms, backbone and sidechain labeled; the corresponding 3D structure is marked as unknown (???)]

3 The Protein Folding Problem We know that the function of a protein is determined in large part by its 3D shape (fold, conformation). Can we predict the 3D shape of a protein given only its amino-acid sequence?

4 Motivation We want to identify the function of the genes we find, and what different mutations/alleles do. One gene = one protein (sort of), so function of protein = function of gene. Function can be determined in many ways (gene expression, knockouts, etc.), but these take time and are prone to mistakes. Goal: if we can determine the structure of every protein, learning their functions isn’t too far away.

5 Thornton et al 2000 (Nature)

6

7 Protein Architecture Proteins are polymers consisting of amino acids linked by peptide bonds. Each amino acid consists of a central carbon atom, an amino group (NH2), a carboxyl group (COOH), and a side chain. Differences in side chains distinguish different amino acids.

8 3D Protein Structure [Figure labels: backbone, sidechain, C-alpha]

9 Peptide Bonds [Figure labels: amino group, carboxyl group, side chain, α carbon (common reference point for coordinates of a structure)]

10 Amino Acid Side Chains side chains vary in: shape, size, charge, polarity

11 Levels of Description Protein structure is often described at four different scales: primary structure, secondary structure, tertiary structure, and quaternary structure.

12 Levels of Description

13 Secondary Structure Secondary structure refers to certain common repeating structures; it is a “local” description of structure. Two common secondary structures are α helices and β strands/sheets. A third category, called coil or loop, refers to everything else.

14 α Helices [Figure labels: α carbon, hydrogen bond, individual amino acid]

15 β Sheets

16 Ribbon Diagram Showing Secondary Structures

17 Levels of Description

18 What Determines Conformation? In general, the amino-acid sequence of a protein determines the 3D shape of a protein [Anfinsen et al., 1950s], but there are some exceptions: all proteins can be denatured; some proteins are inherently disordered (i.e. lack a regular structure); some proteins get folding help from chaperones; there are various mechanisms through which the conformation of a protein can be changed in vivo; post-translational modifications such as phosphorylation; prions; etc.

19 What Determines Conformation? What physical properties of the protein determine its fold? Rigidity of the protein backbone; interactions among amino acids, including electrostatic interactions, van der Waals forces, volume constraints, and hydrogen and disulfide bonds; interactions of amino acids with water.
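
The slide names electrostatic and van der Waals interactions without giving a formula. As a hedged illustration only, one common way molecular-mechanics force fields write these two pairwise terms is a Lennard-Jones potential plus Coulomb's law. The symbols below are assumptions and are not taken from the slides: r_ij is the distance between atoms i and j, ε_ij and σ_ij are Lennard-Jones parameters, q_i and q_j are partial charges, and ε_0 is the vacuum permittivity.

```latex
E_{\text{nonbonded}} = \sum_{i<j} \left[
  4\varepsilon_{ij} \left( \left(\frac{\sigma_{ij}}{r_{ij}}\right)^{12}
                         - \left(\frac{\sigma_{ij}}{r_{ij}}\right)^{6} \right)
  + \frac{q_i q_j}{4\pi\varepsilon_0 \, r_{ij}}
\right]
```

The steep r^-12 term models the volume constraint (atoms cannot overlap), the r^-6 term models van der Waals attraction, and the last term is the electrostatic interaction.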

20 Determining Protein Structures Protein structures can be determined experimentally (in many cases) by x-ray crystallography or nuclear magnetic resonance (NMR).

21 Myoglobin From www.inst.bnl.gov/GasDetectorLab/x-rays/SRI94.htm

22 Myoglobin S.E.V. Phillips. "Structure and refinement of oxymyoglobin at 1.6 Å resolution.", J. Mol. Biol. 1980, 142, 531.

23 X-ray Crystallography [Figure labels: protein crystal, collection plate, x-ray beam, diffraction pattern, electron density map (“3D picture”)]

24 Electron Density Map Interpretation GIVEN: 3D Electron Density Map … …

25 Electron Density Map Interpretation FIND: All-atom Protein Model … …

26 NMR Nuclear Magnetic Resonance Spectroscopy. Unlike X-ray crystallography, it cannot handle large proteins. It exploits the chemical environment to return distances between atoms; knowledge of these restraints can be used to identify the positions of the atoms that produce the peaks.

27 Wüthrich K. "Protein structure determination in solution by NMR spectroscopy." J Biol Chem. 1990 December 25;265(36):22059-62

28 Experimental Methods Very expensive and time-consuming; computational methods can help with time. Many protein structures still cannot be determined in this manner.

29 More motivation There is a large sequence-structure gap: ≈300K protein sequences in the SwissProt database versus ≈50K protein structures in the PDB. Key question: can we predict structures by computational means instead?

30 Approaches to Protein Structure Prediction Prediction in 1D: secondary structure; solvent accessibility (which residues are exposed to water, which are buried); transmembrane helices (which residues span membranes). Prediction in 2D: inter-residue/strand contacts. Prediction in 3D: homology modeling; fold recognition (e.g. via threading); ab initio prediction (e.g. via molecular dynamics).
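
To make the 2D target concrete, here is a minimal sketch of what an inter-residue contact map is when a structure is already known. It assumes contacts are defined by Cα atoms lying within a distance cutoff; the 8 Å default, the function name `contact_map`, and the toy coordinates are illustrative assumptions, not values from the slides.

```python
import math


def contact_map(ca_coords, cutoff=8.0):
    """Return a symmetric boolean matrix that is True where two
    C-alpha atoms lie within `cutoff` angstroms of each other."""
    n = len(ca_coords)
    contacts = [[False] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(ca_coords[i], ca_coords[j]) <= cutoff:
                contacts[i][j] = contacts[j][i] = True
    return contacts


# Hypothetical C-alpha coordinates (angstroms) for a four-residue toy chain.
toy_coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (20.0, 0.0, 0.0)]
toy_map = contact_map(toy_coords)
```

A 2D predictor tries to estimate a matrix like `toy_map` from sequence alone, without access to the coordinates.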

31 Prediction in 1D, 2D and 3D [Figure panels: known secondary structure (E = beta strand) and solvent accessibility; predicted secondary structure and solvent accessibility] Figure from B. Rost, “Protein Structure in 1D, 2D, and 3D”, The Encyclopaedia of Computational Chemistry, 1998

32 2D Prediction Approaches Use secondary structure predictions to predict short-range contacts (e.g. hydrogen bonds in α helices); use secondary structure predictions to predict β strand alignments; use correlated mutations to predict contacts.
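
The slide does not say how correlated mutations are measured. One simple possibility, sketched here as an assumption rather than as the method the slides have in mind, is the mutual information between two columns of a multiple sequence alignment: columns that mutate together score high and are candidate contacts. The function name and the toy alignment are hypothetical.

```python
import math
from collections import Counter


def column_mutual_information(msa, i, j):
    """Mutual information (in bits) between columns i and j of an
    alignment given as a list of equal-length strings."""
    n = len(msa)
    col_i = Counter(seq[i] for seq in msa)
    col_j = Counter(seq[j] for seq in msa)
    pairs = Counter((seq[i], seq[j]) for seq in msa)
    mi = 0.0
    for (a, b), count in pairs.items():
        p_ab = count / n
        mi += p_ab * math.log2(p_ab / ((col_i[a] / n) * (col_j[b] / n)))
    return mi


# Toy alignment: columns 0 and 1 change together, column 2 never changes.
toy_msa = ["AKLE", "AKLE", "GRLD", "GRLD"]
mi_coupled = column_mutual_information(toy_msa, 0, 1)
mi_conserved = column_mutual_information(toy_msa, 0, 2)
```

In the toy alignment the co-varying pair scores higher than the pair involving a fully conserved column, which carries no covariation signal.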

33 Prediction in 3D Homology modeling. Given: a query sequence Q and a database of protein structures. Do: find a protein P that has high sequence similarity to Q; return P’s structure as an approximation to Q’s structure. Fold recognition (threading). Given: a query sequence Q and a database of known folds. Do: find a fold F such that Q can be aligned with F in a highly compatible manner; return F as an approximation to Q’s structure.
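
The template-search step of homology modeling can be sketched as follows. This is a minimal illustration that assumes the query and every database sequence are already aligned to the same length (a real pipeline would align them first) and that uses percent identity as the similarity measure. The names `percent_identity`, `best_template`, and the toy database are hypothetical.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of compared (non-gap) positions at which two
    equal-length aligned sequences carry the same residue."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue  # skip gap positions
        compared += 1
        matches += a == b
    return 100.0 * matches / compared if compared else 0.0


def best_template(query, templates):
    """Return (name, identity) for the database entry most similar to the query."""
    scored = ((name, percent_identity(query, seq)) for name, seq in templates.items())
    return max(scored, key=lambda item: item[1])


# Hypothetical database of sequences with known structures.
toy_templates = {"P1": "MKTAYIAK", "P2": "MKTWYIAK", "P3": "GGGGGGGG"}
name, identity = best_template("MKTAYLAK", toy_templates)
```

The structure of the returned entry would then serve as the approximation to the query's structure.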

34 Prediction in 3D Fragment assembly (Rosetta). Given: a query sequence Q and a database of structure fragments. Do: find a set of fragments that Q can be aligned with in a highly compatible manner; return the combined fragments as an approximation. Molecular dynamics. Given: a query sequence Q. Do: use laws of physics to simulate folding of Q.
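
"Use laws of physics to simulate" means numerically integrating Newton's equations of motion. The toy sketch below does this for a single particle on a harmonic "bond" in one dimension with the velocity Verlet scheme. It is nowhere near a protein simulation; the force law, units, and parameter values are all assumptions chosen for illustration.

```python
def velocity_verlet(x, v, force, mass, dt, steps):
    """Integrate one particle in 1D; return the list of (x, v) states."""
    trajectory = [(x, v)]
    f = force(x)
    for _ in range(steps):
        v_half = v + 0.5 * dt * f / mass   # half-step velocity update
        x = x + dt * v_half                # full-step position update
        f = force(x)                       # force at the new position
        v = v_half + 0.5 * dt * f / mass   # second half-step velocity update
        trajectory.append((x, v))
    return trajectory


# Hypothetical harmonic bond in arbitrary units: F = -K * (x - X0).
K, X0, MASS = 1.0, 0.0, 1.0


def bond_force(x):
    return -K * (x - X0)


def total_energy(x, v):
    """Kinetic plus potential energy of the toy system."""
    return 0.5 * MASS * v * v + 0.5 * K * (x - X0) ** 2


traj = velocity_verlet(x=1.0, v=0.0, force=bond_force, mass=MASS, dt=0.01, steps=1000)
```

A real molecular dynamics run applies the same update to every atom in three dimensions, with forces derived from a full force field.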

35

36 Homology Modeling [Figure: pairwise sequence identity axis from 0% to 100% with marks at 20% and 30%; regions labeled probably unrelated, remote homologs, and homologs] Most pairs of proteins with similar structure are remote homologs (< 25% sequence identity). Homology modeling usually doesn’t work for remote homologs; most pairs of proteins with < 25% sequence identity are unrelated.

37 Homology-based Prediction Raw model, loop modeling, side chain placement, refinement

38 The SCOP Database Structural Classification Of Proteins. FAMILY: proteins that are >30% similar, or >15% similar and have similar known structure/function. SUPERFAMILY: proteins whose families have some sequence and function/structure similarity, suggesting a common evolutionary origin. COMMON FOLD: superfamilies that have the same secondary structures in the same arrangement, probably resulting from physics and chemistry.
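
The FAMILY rule as stated on this slide can be written as a small predicate. This sketch encodes only the slide's thresholds; the function and parameter names are hypothetical, and how "similarity" is measured is left to the caller.

```python
def same_scop_family(percent_similarity, similar_structure_or_function):
    """Apply the slide's FAMILY rule: >30% similar, or >15% similar
    together with similar known structure/function."""
    if percent_similarity > 30:
        return True
    return percent_similarity > 15 and similar_structure_or_function


examples = [
    same_scop_family(35, False),
    same_scop_family(20, True),
    same_scop_family(20, False),
    same_scop_family(10, True),
]
```

The superfamily and fold levels depend on judgments about evolutionary origin and structural arrangement, so they are not reduced to a threshold here.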

39 Examples of Fold Classes

40 Threading

41 Ab initio Prediction – ROSETTA 1. PSI-BLAST: homology search; discard sequences with >25% homology. 2. PHD. For each length-3 and each length-9 sequence fragment, get 25 structure fragments that match “well”.
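
The fragment-picking step can be sketched as: slide a window of length 3 or 9 along the query, and for each window keep the 25 best-matching entries from a fragment library. The slide does not define matching “well”, so the score below (number of identical residues) is a stand-in assumption, and the library format and all names are hypothetical.

```python
def sequence_windows(sequence, length):
    """Yield (start, fragment) for every window of the given length."""
    for start in range(len(sequence) - length + 1):
        yield start, sequence[start:start + length]


def top_fragments(window, library, k=25):
    """Return the k same-length library entries that best match `window`.
    The score is a toy stand-in: the count of identical residues."""
    def score(entry):
        return sum(a == b for a, b in zip(window, entry["sequence"]))

    candidates = [e for e in library if len(e["sequence"]) == len(window)]
    return sorted(candidates, key=score, reverse=True)[:k]


# Hypothetical fragment library; real entries would carry backbone geometry.
toy_library = [
    {"sequence": "MKT", "structure_id": "frag-a"},
    {"sequence": "MKV", "structure_id": "frag-b"},
    {"sequence": "GGG", "structure_id": "frag-c"},
]
windows = list(sequence_windows("MKTAY", 3))
best = top_fragments("MKT", toy_library, k=2)
```

Running `top_fragments` once per window, for both window lengths, yields the candidate fragment sets that an assembly procedure would then combine.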

42 ai.stanford.edu/~serafim/CS262_2006/Slides

43 Ab initio Prediction – CASP results

44 Summary of current state of the art

45 Open Ended Ab initio prediction is the goal, but we are far from it. Sidechain prediction, contact map prediction, search space reduction, parallelization (GPUs), surface accessibility.

46 Other areas Protein-protein interaction, drug design, protein engineering, ligand docking/inhibition, function prediction

