Solving NMR structures II: hydrogen bond restraints, chemical shift information, structure determination methods, and evaluating/describing NMR structures



Amide hydrogen exchange
Amide protons undergo acid- and base-catalyzed exchange with solvent protons at a rate which ranges from the second to the minute time scale, depending upon conditions.
If a protein is placed in D2O, the amide signals due to 1H nuclei will disappear over time due to this exchange.
Figure: poly-D,L-alanine exchange rate in D2O at 20 °C. The minimum with respect to pH arises because exchange is both acid and base catalyzed. Englander & Mayne, Ann. Rev. Biophys. Biomol. Struct. (1992) 21, 243.

Amide hydrogen exchange
When amide protons are involved in hydrogen bonds in a folded protein, they are protected from exchange with solvent and exchange more slowly.
faster: N-H --> N-D
slower: N-H....O=C --> N-D....O=C

Protection factors
The protection factor P for a given amide proton is the intrinsic rate of exchange expected for that amide proton in an unfolded protein under a given set of solvent conditions, divided by the observed rate of exchange in the native state:
$P = k_{ex}(U) / k_{ex}(N)$
where U is the unfolded state and N is the native state.
Intrinsic rates of exchange vary with the amino acid sequence and the solvent conditions in a predictable manner; there are published tables and websites where you can look them up or calculate them.
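The definition of P can be sketched in a few lines of Python. The function name and the two example rates are hypothetical, chosen only to illustrate the arithmetic; real intrinsic rates would come from the published tables mentioned above.

```python
def protection_factor(k_unfolded: float, k_native: float) -> float:
    """Return P = k_ex(U) / k_ex(N); both rates must be in the same units."""
    if k_unfolded <= 0 or k_native <= 0:
        raise ValueError("exchange rates must be positive")
    return k_unfolded / k_native


# Hypothetical rates in s^-1, used only for illustration.
k_intrinsic = 0.05    # expected rate for this amide in the unfolded state
k_observed = 1.0e-5   # measured rate for the same amide in the native state

P = protection_factor(k_intrinsic, k_observed)
print(f"P = {P:.0f}")
```

A larger P means the amide exchanges much more slowly in the native state than it would in an unfolded chain.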

Measuring exchange rates
15N-1H HSQC spectra are a good way to monitor exchange because individual well-resolved peaks are visible for most amides.
One might record a spectrum in H2O, freeze dry the sample, resuspend in D2O, and record a series of spectra in D2O to monitor the rate at which each amide proton signal disappears.
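One way to turn such a series of spectra into a rate is to fit the decay of each peak intensity. The sketch below assumes a single-exponential decay, I(t) = I0 exp(-k t), which the slides do not state explicitly, and uses synthetic intensities; the function name and all numbers are hypothetical.

```python
import math


def fit_exchange_rate(times, intensities):
    """Least-squares fit of ln(I) = ln(I0) - k*t; returns (k, I0)."""
    if len(times) != len(intensities) or len(times) < 2:
        raise ValueError("need at least two matched time/intensity points")
    logs = [math.log(i) for i in intensities]
    n = len(times)
    t_mean = sum(times) / n
    y_mean = sum(logs) / n
    sxx = sum((t - t_mean) ** 2 for t in times)
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in zip(times, logs))
    slope = sxy / sxx
    k = -slope                               # decay constant
    i0 = math.exp(y_mean - slope * t_mean)   # extrapolated intensity at t = 0
    return k, i0


# Synthetic peak intensities for one amide (hypothetical values).
times_min = [5, 30, 60, 120, 240]            # minutes after resuspension in D2O
true_k = 0.01                                # min^-1, used to build the data
peak = [100.0 * math.exp(-true_k * t) for t in times_min]

k_fit, i0_fit = fit_exchange_rate(times_min, peak)
half_life = math.log(2) / k_fit
print(f"k = {k_fit:.4f} per min, half-life = {half_life:.1f} min")
```

With noisy real data a weighted or nonlinear fit may be preferable; the log-linear fit is used here only because it needs nothing beyond the standard library.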

Figure: 15N-1H HSQC of Arc repressor 5 min after resuspension in D2O (pH 4.7). Note that unprotected amides, such as those of the unstructured N-terminus (residues 1-7), have already exchanged, because intrinsic half-lives at this pH are in the vicinity of 10 seconds to a minute. Burgering et al. Biopolymers (1995) 35, 217.
Figure: exchange rates measured from a series of 15N-1H HSQC spectra. Note that protected amides are in secondary structure elements.

HX (hydrogen exchange) of Syrian hamster prion protein. Liu et al. Biochemistry (1999) 38, 5362.
Figure legend: blue spheres: P > 4650; green spheres: 370 < P < 4650; red spheres: P < 370.
Note that while all protected amides are hydrogen-bonded, not all hydrogen-bonded amides are equally well protected.
Rates differ both within secondary structure elements (buried positions near the center are usually most protected) and between different secondary structure elements.
Surface positions and the ends of helices are less well protected.
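The color coding in the figure legend amounts to binning protection factors at two thresholds. A minimal sketch, using the thresholds from the legend; how values exactly equal to 370 or 4650 are binned is an assumption, since the legend does not say.

```python
def protection_class(p: float) -> str:
    """Bin a protection factor using the thresholds from the figure legend."""
    if p > 4650:
        return "blue"    # most strongly protected
    if p > 370:
        return "green"   # intermediate protection
    return "red"         # weakly protected (assumed to include P == 370)


# Hypothetical protection factors keyed by residue number.
factors = {101: 12000.0, 102: 900.0, 103: 50.0}
classes = {res: protection_class(p) for res, p in factors.items()}
print(classes)
```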

Hydrogen bond restraints
Amide protons showing significant protection are inferred to be involved in hydrogen bonds, but it is not clear from these data alone what the identity of the hydrogen bond acceptor group is.
However, if additional information is available, one can infer what the acceptor is. For instance, d(i-3,i) and d(i-4,i) nOes such as are characteristic of α-helices imply that the protected amide of residue i is hydrogen bonded to the carbonyl of residue i-4.
Hydrogen bond restraints are usually expressed as a pair of distance restraints, one on the HN-O distance and one on the N-O distance, each given in Å. A pair of restraints like this ensures reasonably good geometry for the hydrogen bond.
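As an illustration of how a protected helical amide becomes a pair of distance restraints, here is a minimal sketch. The `DistanceRestraint` type, the function name, and the atom labels are hypothetical, and the distance bounds passed in are placeholders for illustration only, not the values from the original slide.

```python
from typing import NamedTuple


class DistanceRestraint(NamedTuple):
    atom_a: tuple   # (residue number, atom name)
    atom_b: tuple   # (residue number, atom name)
    lower: float    # lower bound in Å
    upper: float    # upper bound in Å


def helix_hbond_restraints(protected_residues, hn_o_bounds, n_o_bounds, offset=4):
    """For each protected amide i, restrain HN(i)-O(i-offset) and N(i)-O(i-offset)."""
    restraints = []
    for i in protected_residues:
        acceptor = i - offset
        if acceptor < 1:
            continue  # no residue i-4 exists near the N-terminus
        restraints.append(DistanceRestraint((i, "HN"), (acceptor, "O"), *hn_o_bounds))
        restraints.append(DistanceRestraint((i, "N"), (acceptor, "O"), *n_o_bounds))
    return restraints


# Placeholder bounds in Å (assumed for this example only).
hbonds = helix_hbond_restraints([3, 12, 13], hn_o_bounds=(1.5, 2.5), n_o_bounds=(2.5, 3.5))
for r in hbonds:
    print(r)
```

Restraining both distances constrains the N-H...O angle as well as the separation, which is why a pair is used rather than a single restraint.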

Chemical shifts and structure
Chemical shifts depend upon local electron distributions, bond hybridization states, proximity to polar groups, nearby aromatic rings, and local magnetic anisotropies.
In other words, chemical shifts depend upon the structural environment and thus can provide information about structure.
Observation of relationships between chemical shifts and protein secondary structure has led to the development of some useful empirical rules.

Hα chemical shifts and secondary structure
Hα chemical shifts are generally lower for α-helices than for β-sheets.
Figure: distributions of Hα chemical shifts observed in sheets (lighter bars) and helices (darker bars). The black bar in each distribution is the median.
Hα chemical shifts in α-helices are on average 0.39 ppm below "random coil" values, while β-sheet values are 0.37 ppm above random coil values. Wishart, Sykes & Richards J Mol Biol (1991) 222, 311.

Chemical shift index (CSI)
Trends like these led to the development of the concept of the chemical shift index* as a tool for assigning secondary structure using chemical shift values.
One starts with a table of reference values for each amino-acid type, which is essentially a table of "random coil" Hα values.
CSIs are then assigned as follows:

exp'tl Hα shift rel. to reference    assigned CSI
within ± 0.1 ppm                      0
>0.1 ppm lower                        -1
>0.1 ppm higher                       +1

*Wishart, Sykes & Richards Biochemistry (1992) 31
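The assignment table above maps directly onto a small function. This is a sketch only; the reference shift in the usage example is a made-up number, not an entry from the published random coil table.

```python
def chemical_shift_index(observed_ppm: float, reference_ppm: float,
                         tolerance: float = 0.1) -> int:
    """Return -1, 0 or +1 from the Hα shift relative to its reference value."""
    delta = observed_ppm - reference_ppm
    if delta > tolerance:
        return 1     # more than 0.1 ppm higher than reference
    if delta < -tolerance:
        return -1    # more than 0.1 ppm lower than reference
    return 0         # within ± 0.1 ppm


# Hypothetical reference value for one residue type (illustration only).
reference = 4.35
for observed in (4.00, 4.40, 4.80):
    print(observed, chemical_shift_index(observed, reference))
```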

Chemical shift indices
One can then plot CSI vs. sequence and assign secondary structure as follows:
Any "dense" grouping of four or more "-1's", uninterrupted by "1's", is assigned as a helix, while any "dense" grouping of three or more "1's", uninterrupted by "-1's", is assigned as a sheet.
A "dense" grouping means at least 70% nonzero CSIs.
Other regions are assigned as "coil".
This simple technique assigns secondary structure with 90-95% accuracy.
Similar useful relationships exist for 13Cα, 13Cβ, and 13C carbonyl (C=O) shifts.
Figure: CSI plotted against residue number.
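The grouping rule can be sketched as follows. This is a simplified reading of the rule, not the published algorithm: each stretch free of the opposite sign is trimmed to its first and last matching index and then tested once for count and density, so two dense clusters separated by a long run of zeros would be missed. The one-letter labels H, E and C are a convention assumed here.

```python
def assign_secondary_structure(csi, min_density=0.7):
    """Label each residue H (helix), E (sheet) or C (coil) from a CSI list."""
    labels = ["C"] * len(csi)
    # (target index, minimum number of matching indices, label)
    for target, min_count, label in ((-1, 4, "H"), (1, 3, "E")):
        start = 0
        while start < len(csi):
            # Extend a stretch until an index of the opposite sign interrupts it.
            end = start
            while end < len(csi) and csi[end] != -target:
                end += 1
            hits = [k for k in range(start, end) if csi[k] == target]
            if len(hits) >= min_count:
                first, last = hits[0], hits[-1]
                # Density: fraction of nonzero (matching) indices in the group.
                if len(hits) / (last - first + 1) >= min_density:
                    for k in range(first, last + 1):
                        labels[k] = label
            start = end + 1
    return "".join(labels)


example = [0, -1, -1, 0, -1, -1, 0, 1, 1, 1, 0, 0]
print(assign_secondary_structure(example))
```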

Figure: deviations of Cα and Cβ chemical shifts from random coil values when in either α-helix or β-sheet conformation.
For α-helices, Cα shifts are higher than normal, whereas Cβ shifts are lower than normal.
Note in your reading of the apo-calmodulin paper that such carbon shifts are used in characterizing the structure of the interdomain linker.

Chemical shifts and structural restraints
So CSI is a useful technique for identifying secondary structures from chemical shift data.
Are explicit restraints on structure, whether distances or angles or what have you, derivable from chemical shifts, analogous to our nOe-derived distance restraints, our coupling-constant-derived dihedral angle restraints and our hydrogen-exchange-derived hydrogen bond restraints?
We'll come back to this question a little later. For now, let's consider how NMR structures are generated from nOe-derived distance restraints and J-coupling-derived dihedral angle restraints.

Calculating NMR structures
So we've talked some about getting qualitative structural information from NMR; for instance, certain secondary structures have characteristic nOes, J-couplings and chemical shifts associated with them.
We've also talked about the concept of explicit distance, dihedral angle or hydrogen bond restraints from nOe data, J-coupling data, etc.
How might we use such restraints to actually calculate a detailed, quantitative three-dimensional structure at a high level of accuracy and precision?

In NMR we don't get a single structure
The very first thing to recognize is that our input restraints do not uniquely define a structure at infinitely high precision (resolution) and accuracy. We can never have enough restraints, determined at high enough accuracy and precision, to do that!
Rather, a set of many closely related structures will be compatible with these restraints. How closely related these compatible structures are will depend on how good and complete our data are!
The goal of NMR structure determination is therefore to produce a group of possible structures which is a fair representation of this compatible set.

The NMR Ensemble
Repeat the structure calculation many times to generate an ensemble of structures consistent with the restraints.
Ideally, the ensemble is representative of the permissible structures: the RMSD between ensemble members accurately reflects the extent of structural variation permitted by the restraints.
Figure: ensemble of 25 structures for Syrian hamster prion protein. Liu et al. Biochemistry (1999) 38, 5362.

Random initial structures
To get the most unbiased, representative ensemble, it is wise to start the calculations from a set of randomly generated starting structures.

Calculating the structures: methods
distance geometry (DG)
restrained molecular dynamics (rMD)
simulated annealing (SA)
hybrid methods

DG: Distance geometry
In distance geometry, one uses the nOe-derived distance restraints to generate a distance matrix, from which one then calculates a structure.
Structures calculated from distance geometry usually have the correct overall fold but poor local geometry (e.g. improper bond angles and distances).
Hence distance geometry must be combined with some extensive energy minimization method to generate good structures.

rMD: Restrained molecular dynamics
Molecular dynamics involves computing the potential energy V as a function of the atomic coordinates. Usually this is defined as the sum of a number of terms:
$V_{total} = V_{bond} + V_{angle} + V_{dihedr} + V_{vdW} + V_{coulomb} + V_{NMR}$
The first five terms here are "real" energy terms corresponding to such forces as van der Waals and electrostatic repulsions and attractions, and the cost of deforming bond lengths and angles. These come from some standard molecular force field like CHARMM or AMBER.
The NMR restraints are incorporated into the $V_{NMR}$ term, which is a "pseudoenergy" or "pseudopotential" term included to represent the cost of violating the restraints.

Pseudo-energy potentials for rMD
Generate fake energy potentials representing the cost of violating the distance or angle restraints. Here's an example of a distance restraint potential:
$$
V_{NOE} =
\begin{cases}
K_{NOE}\,(r_{ij} - r_{ij}^{l})^{2} & \text{if } r_{ij} < r_{ij}^{l} \\
0 & \text{if } r_{ij}^{l} \le r_{ij} \le r_{ij}^{u} \\
K_{NOE}\,(r_{ij} - r_{ij}^{u})^{2} & \text{if } r_{ij} > r_{ij}^{u}
\end{cases}
$$
where $r_{ij}^{l}$ and $r_{ij}^{u}$ are the lower and upper bounds of our distance restraint, and $K_{NOE}$ is some chosen force constant, typically ~250 kcal mol^-1 nm^-2.
So it's somewhat permissible to violate restraints, but it raises V.
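The flat-bottomed distance potential described on this slide can be written as a short function. The default force constant is the typical value quoted above; the restraint bounds and distances in the usage example are hypothetical.

```python
def noe_pseudo_energy(r: float, lower: float, upper: float,
                      k_noe: float = 250.0) -> float:
    """Flat-bottomed harmonic penalty in kcal/mol; distances in nm."""
    if lower > upper:
        raise ValueError("lower bound must not exceed upper bound")
    if r < lower:
        return k_noe * (r - lower) ** 2
    if r > upper:
        return k_noe * (r - upper) ** 2
    return 0.0  # restraint satisfied, no penalty


# Hypothetical restraints: (current distance, lower bound, upper bound) in nm.
restraints = [(0.30, 0.18, 0.50), (0.55, 0.18, 0.50), (0.15, 0.18, 0.35)]
v_noe_total = sum(noe_pseudo_energy(r, lo, hi) for r, lo, hi in restraints)
print(f"V_NOE = {v_noe_total:.3f} kcal/mol")
```

Because the penalty grows quadratically outside the bounds and is zero inside them, small violations cost little while large ones are strongly disfavored.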

SA: Simulated annealing
SA is very similar to rMD and uses similar potentials, but it raises the temperature of the system and then cools it slowly in order not to get trapped in local energy minima.
SA is very efficient at locating the global minimum of the target function.
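The heat-then-cool idea can be illustrated on a toy problem. The sketch below is a generic Metropolis-style annealer on a made-up one-dimensional target function, not the molecular dynamics based protocol used in NMR structure calculation packages; every name and parameter in it is hypothetical.

```python
import math
import random


def target(x: float) -> float:
    """Toy one-dimensional target with several local minima (illustrative only)."""
    return 0.1 * x * x + math.sin(3.0 * x)


def simulated_annealing(x0, t_start=5.0, t_end=0.01, cooling=0.99,
                        steps_per_t=20, step_size=0.5, seed=0):
    """Start hot, cool geometrically, and return the best (x, energy) seen."""
    rng = random.Random(seed)
    x, e = x0, target(x0)
    best_x, best_e = x, e
    t = t_start
    while t > t_end:
        for _ in range(steps_per_t):
            x_new = x + rng.uniform(-step_size, step_size)
            e_new = target(x_new)
            # Metropolis criterion: always accept downhill moves; accept
            # uphill moves with probability exp(-dE / T).
            if e_new <= e or rng.random() < math.exp(-(e_new - e) / t):
                x, e = x_new, e_new
                if e < best_e:
                    best_x, best_e = x, e
        t *= cooling  # slow cooling
    return best_x, best_e


x_start = 4.0
x_best, e_best = simulated_annealing(x_start)
print(f"start energy {target(x_start):.3f}, best energy found {e_best:.3f}")
```

At high temperature uphill moves are accepted often, which lets the search leave local minima; as the temperature falls the search settles into a low-energy region.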

Ambiguous restraints
It is often not possible to tell which atoms are involved in a NOESY crosspeak, either because of a lack of stereospecific assignments or because multiple protons have the same chemical shift.
It is possible to resolve many of these ambiguities iteratively during the calculation process.
One can generate an initial ensemble with only unambiguous restraints, and then use this ensemble to resolve ambiguities. For example, if two atoms are never closer than, say, 9 Å in any ensemble structure, one can rule out an nOe between them.
One can also make stereospecific assignments iteratively using what are called floating chirality methods.
There are now automatic routines for iterative assignment, such as the program ARIA.
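The distance-based filtering step can be sketched as follows. This is a minimal illustration of the idea only, not how ARIA works; the function name, atom labels and distances are hypothetical, and the 9 Å cutoff is the example value from the slide.

```python
def resolve_ambiguous_noe(candidates, ensemble_distances, cutoff=9.0):
    """Keep candidate atom pairs that come closer than `cutoff` Å in at
    least one structure of the preliminary ensemble."""
    kept = []
    for pair in candidates:
        if min(ensemble_distances[pair]) < cutoff:
            kept.append(pair)
    return kept


# One crosspeak with three possible assignments (hypothetical atom labels),
# and the distance in Å for each pair in each preliminary structure.
candidates = [("HA 12", "HB 45"), ("HA 12", "HB 71"), ("HA 30", "HB 45")]
distances = {
    ("HA 12", "HB 45"): [3.8, 4.1, 3.5],
    ("HA 12", "HB 71"): [14.2, 12.9, 15.0],
    ("HA 30", "HB 45"): [9.5, 10.1, 11.3],
}
print(resolve_ambiguous_noe(candidates, distances))
```

If exactly one candidate survives, the crosspeak can be treated as unambiguous in the next round of calculation.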

Criteria for accepting structures
It is typical to generate 50 or more structures, but not all will converge to a final structure consistent with the restraints.
Therefore one uses acceptance criteria for including calculated structures in the ensemble, such as:
– no more than 1 nOe distance restraint violation greater than 0.4 Å
– no dihedral angle restraint violations greater than 5°
– no gross violations of reasonable molecular geometry
Sometimes structures are rejected on other grounds as well, such as having multiple residues with backbone angles in disallowed regions of Ramachandran space, or simply having high potential energy in rMD simulations.
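The first two example criteria can be expressed as a simple filter. This sketch uses the thresholds from the list above as defaults and leaves out the molecular geometry check; the function name and the violation lists are hypothetical.

```python
def accept_structure(noe_violations, dihedral_violations,
                     max_large_noe=1, noe_limit=0.4, dihedral_limit=5.0):
    """Apply the example acceptance criteria to one calculated structure.

    noe_violations: distance restraint violations in Å
    dihedral_violations: dihedral angle restraint violations in degrees
    """
    large_noe = sum(1 for v in noe_violations if v > noe_limit)
    large_dihedral = sum(1 for v in dihedral_violations if v > dihedral_limit)
    return large_noe <= max_large_noe and large_dihedral == 0


# Hypothetical violation lists for three calculated structures.
calculated = {
    "model_01": ([0.1, 0.5, 0.2], [1.0, 4.9]),
    "model_02": ([0.5, 0.6], []),
    "model_03": ([0.1], [6.0]),
}
accepted = [name for name, (noe, dih) in calculated.items()
            if accept_structure(noe, dih)]
print(accepted)
```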

Precision of NMR Structures (Resolution)
Precision is judged by the RMSD of the ensemble of accepted structures.
RMSDs for both backbone (Cα, N, and carbonyl C) and all heavy atoms (i.e. everything except hydrogen) are typically reported, e.g. backbone 0.6 Å, heavy atoms 1.4 Å.
Sometimes only the more ordered regions are included in the reported RMSD; e.g. for a 58-residue protein you will see RMSD (residues 5-58) if residues 1-4 are completely disordered.

Reporting RMSD
There are two major ways of calculating the RMSD of the ensemble:
– pairwise: compute RMSDs for all possible pairs of structures in the ensemble, and calculate the mean of these RMSDs
– from mean: calculate a mean structure from the ensemble and measure the RMSD of each ensemble structure from it, then calculate the mean of these RMSDs
Pairwise will generally give a slightly higher number, so be aware that these two ways of reporting RMSD are not completely equal. Usually the Materials and Methods, or a footnote somewhere in the paper, will indicate which is being used.
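The two ways of reporting RMSD can be compared on a toy ensemble. The sketch below assumes the structures have already been superimposed and contain the same atoms in the same order; the coordinates are made up, and no fitting step is included.

```python
import math
from itertools import combinations


def rmsd(a, b):
    """RMSD between two structures given as lists of (x, y, z) tuples."""
    sq = sum((p - q) ** 2
             for atom_a, atom_b in zip(a, b)
             for p, q in zip(atom_a, atom_b))
    return math.sqrt(sq / len(a))


def mean_structure(ensemble):
    """Coordinate-wise average over all structures in the ensemble."""
    n = len(ensemble)
    return [tuple(sum(s[i][d] for s in ensemble) / n for d in range(3))
            for i in range(len(ensemble[0]))]


def pairwise_rmsd(ensemble):
    """Mean RMSD over all possible pairs of structures."""
    pairs = list(combinations(ensemble, 2))
    return sum(rmsd(a, b) for a, b in pairs) / len(pairs)


def rmsd_from_mean(ensemble):
    """Mean RMSD of each structure from the mean structure."""
    mean = mean_structure(ensemble)
    return sum(rmsd(s, mean) for s in ensemble) / len(ensemble)


# Toy ensemble: three two-atom "structures" in Å (hypothetical coordinates).
ensemble = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.2, 0.0, 0.0), (1.2, 0.0, 0.0)],
    [(-0.2, 0.0, 0.0), (0.8, 0.0, 0.0)],
]
print(f"pairwise: {pairwise_rmsd(ensemble):.3f} Å")
print(f"from mean: {rmsd_from_mean(ensemble):.3f} Å")
```

On this toy ensemble the pairwise value comes out larger than the from-mean value, in line with the note on the slide.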

"Minimized average"
A minimized average is just that: a mean structure is calculated from the ensemble and then subjected to energy minimization to restore reasonable geometry, which is often lost in the calculation of a mean.
This is NMR's way of generating a single representative structure from the data. It is much easier to visualize structural features from a minimized average than from the ensemble.
For highly disordered regions a minimized average will not be informative and may even be misleading. Such regions are sometimes left out of the minimized average.
Sometimes when an NMR structure is deposited in the PDB, there will be separate entries for both the ensemble and the minimized average. It is nice when people do this.

What do we need to get a high-resolution NMR structure?
Usually ~15-20 nOe distance restraints per residue, but the total number is not as important as how many long-range restraints you have, meaning long-range in the sequence: |i-j| > 5, where i and j are the two residues involved.
Good NMR structures usually have ≥ ~3.5 long-range distance restraints per residue in the structured regions.
To get a very good quality structure, it is usually also necessary to have some stereospecific assignments, e.g. β hydrogens; Leu, Val methyls.
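Counting long-range restraints per residue is simple bookkeeping. A minimal sketch using the |i-j| > 5 definition from the slide; the function name and the restraint list are hypothetical.

```python
def long_range_per_residue(restraints, n_structured_residues, min_separation=5):
    """Number of restraints with |i - j| > min_separation, per structured residue.

    restraints: iterable of (i, j) residue-number pairs, one per nOe restraint
    """
    long_range = [(i, j) for i, j in restraints if abs(i - j) > min_separation]
    return len(long_range) / n_structured_residues


# Hypothetical restraint list for a 10-residue structured region.
restraint_pairs = [(3, 4), (3, 10), (5, 40), (12, 18), (20, 25)]
print(long_range_per_residue(restraint_pairs, n_structured_residues=10))
```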

Assessing Structure Quality
NMR spectroscopists usually run their ensemble through the program PROCHECK-NMR to assess its quality.
A high-resolution structure will have backbone RMSD ≤ ~0.8 Å and heavy atom RMSD ≤ ~1.5 Å.
It will have low RMS deviation from the restraints.
It will have good stereochemical quality:
– ideally >90% of residues in core (most favorable) regions of the Ramachandran plot
– very few "unusual" side chain angles and rotamers (as judged by those commonly found in crystal structures)
– low deviations from idealized covalent geometry

Structural Statistics Tables
list of restraints, number and type
precision of structure (RMSD)
agreement of ensemble structures with restraints (RMS)
calculated energies
sometimes also listings of Ramachandran statistics, deviations from ideal covalent geometry, etc.