Multiwavelength Anomalous Dispersion, Density modification, Molecular replacement, etc. Protein Structure Determination Lecture 9.


1 Multiwavelength Anomalous Dispersion, Density modification, Molecular replacement, etc. Protein Structure Determination Lecture 9

2 Selenomethionine MAD Selenomethionine is the amino acid methionine with the sulfur replaced by a selenium. Selenium is a heavy atom that also has the property of "anomalous scattering" at some wavelengths, and not at others. Proteins grown in selenomethionine (SeMet) will incorporate the Se atoms. These crystals can be solved by Multiwavelength Anomalous Dispersion (MAD). Three data sets are collected on a single crystal (sometimes at the same time!): F_PH at λ1, and F+_PH and F−_PH at λ2.

3 Anomalous dispersion [Figure: a heavy atom with free electrons and bound electrons.] Inner electrons scatter with a phase shift relative to the phase of the free electrons. An anomalous scatterer at the origin scatters with phase φ. Figure labels: ∆F_r, real part of the anomalous scattering; ∆F_i, imaginary part of the anomalous scattering; φ, phase shift of the anomalous scattering.

4 Reminder: Friedel's Law Friedel mates F(h k l) and F(-h -k -l) have the same amplitude and phases of opposite sign, α and −α. Notation: F+(h k l) = F(h k l), F−(h k l) = F(-h -k -l).

5 Anomalous dispersion violates Friedel's Law [Figure: F(h k l) and F(-h -k -l), with phases α and −α, plus the anomalous contributions ∆F+(h k l) and ∆F−(h k l).] The + and − contributions shown are for just a heavy atom with anomalous scattering.
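
A small numerical check can make the last two slides concrete. The sketch below is only an illustration: the atom positions, the scattering factors and the size of the imaginary (anomalous) term are made-up values, and `structure_factor` is a helper written for this example.

```python
import cmath
import math

def structure_factor(hkl, atoms):
    """F(hkl) = sum_j f_j * exp(2*pi*i*(h*x + k*y + l*z)); atoms are (f, (x, y, z))."""
    h, k, l = hkl
    return sum(f * cmath.exp(2j * math.pi * (h * x + k * y + l * z))
               for f, (x, y, z) in atoms)

# Hypothetical toy cell: three light atoms with real scattering factors.
light_atoms = [(6.0, (0.10, 0.20, 0.30)),
               (7.0, (0.40, 0.15, 0.65)),
               (8.0, (0.75, 0.55, 0.20))]
heavy_site = (0.25, 0.70, 0.40)
heavy_normal = (30.0, heavy_site)            # real f: no anomalous scattering
heavy_anomalous = (30.0 + 4.0j, heavy_site)  # imaginary part: illustrative anomalous term

hkl, minus_hkl = (1, 2, 3), (-1, -2, -3)

# Without anomalous scattering, F(-h) is the complex conjugate of F(h).
f_plus = structure_factor(hkl, light_atoms + [heavy_normal])
f_minus = structure_factor(minus_hkl, light_atoms + [heavy_normal])
friedel_violation = abs(f_minus - f_plus.conjugate())

# With anomalous scattering, the Friedel mates no longer have equal amplitudes.
a_plus = structure_factor(hkl, light_atoms + [heavy_anomalous])
a_minus = structure_factor(minus_hkl, light_atoms + [heavy_anomalous])
bijvoet_difference = abs(a_plus) - abs(a_minus)

print(f"no anomalous: |F+| = {abs(f_plus):.3f}, |F-| = {abs(f_minus):.3f}")
print(f"anomalous:    |F+| = {abs(a_plus):.3f}, |F-| = {abs(a_minus):.3f}")
```

The amplitude difference between the mates is the signal that the anomalous-dispersion experiment measures.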

6 MAD experiment Protein is grown in selenomethionine (SeMet), or another anomalous scatterer is present, such as iron. Three data sets are collected on 1 crystal at multiple (usually 2) wavelengths. λ1 is a wavelength where Se does not absorb: normal diffraction. λ2 is a wavelength where Se does absorb: anomalous ∆F is added. Friedel mates collected at the non-anomalous wavelength, F+_λ1(h k l) and F−_λ1(h k l), have the same amplitude and are merged. Friedel mates collected at the anomalous dispersion wavelength, F+_λ2(h k l) and F−_λ2(h k l), are different and are not merged.

7 Friedel mates with anomalous have different amplitudes Figure labels: F+_λ2(h k l), F−_λ2(h k l), F+_λ1(h k l), F−_λ1(h k l)

8 Friedel mates with anomalous have different amplitudes Figure labels: F+_λ2(h k l), F−_λ2(h k l)*, F_λ1(h k l)

9 Harker diagram with anomalous Figure labels: F_λ1(h k l), F−_λ2(h k l)*

10 Harker diagram with anomalous Figure label: F+_λ2(h k l)

11 Harker diagram with anomalous Figure labels: F_λ1(h k l), F+_λ2(h k l), F−_λ2(h k l)*

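12 MAD phasing using vector math
For the − Friedel mate (complex conjugate taken):
$$F^{-}_{\lambda_2}(hkl)^{*} = F_{\lambda_1}(hkl) + \Delta F^{-}(hkl)$$
$$|F^{-}_{\lambda_2}(hkl)|\,e^{-i\alpha_{2}^{-}} = |F_{\lambda_1}(hkl)|\,e^{i\alpha_1} + |\Delta F^{-}(hkl)|\,e^{i(\alpha_h - \phi_h)}$$
$$|F^{-}_{\lambda_2}(hkl)|\cos(\alpha_{2}^{-}) = |F_{\lambda_1}(hkl)|\cos(\alpha_1) + |\Delta F^{-}(hkl)|\cos(\alpha_h - \phi_h)$$
$$-|F^{-}_{\lambda_2}(hkl)|\sin(\alpha_{2}^{-}) = |F_{\lambda_1}(hkl)|\sin(\alpha_1) + |\Delta F^{-}(hkl)|\sin(\alpha_h - \phi_h)$$
For the + Friedel mate:
$$F^{+}_{\lambda_2}(hkl) = F_{\lambda_1}(hkl) + \Delta F^{+}(hkl)$$
$$|F^{+}_{\lambda_2}(hkl)|\,e^{i\alpha_{2}^{+}} = |F_{\lambda_1}(hkl)|\,e^{i\alpha_1} + |\Delta F^{+}(hkl)|\,e^{i(\alpha_h + \phi_h)}$$
$$|F^{+}_{\lambda_2}(hkl)|\cos(\alpha_{2}^{+}) = |F_{\lambda_1}(hkl)|\cos(\alpha_1) + |\Delta F^{+}(hkl)|\cos(\alpha_h + \phi_h)$$
$$|F^{+}_{\lambda_2}(hkl)|\sin(\alpha_{2}^{+}) = |F_{\lambda_1}(hkl)|\sin(\alpha_1) + |\Delta F^{+}(hkl)|\sin(\alpha_h + \phi_h)$$
Here α_h is the heavy-atom phase and φ_h is the phase shift of the anomalous scattering. Four equations in three unknowns: α_1, α_2−, α_2+.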
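
One way to see that the two Friedel mates pin down the native phase is to scan it numerically. The sketch below assumes the anomalous contributions ∆F+ and ∆F− are already known as complex numbers (for example from located Se sites). The reflection is synthetic, and `solve_alpha1` is a hypothetical helper, not a routine from any phasing package.

```python
import cmath
import math

def solve_alpha1(f1_amp, f2_plus_amp, f2_minus_amp, delta_plus, delta_minus, step_deg=1):
    """Grid-search the phase alpha_1 (degrees) that best satisfies
    |F1*e^{i a} + dF+| = |F2+|  and  |F1*e^{i a} + dF-| = |F2-|."""
    best_alpha, best_misfit = None, float("inf")
    for deg in range(0, 360, step_deg):
        f1 = cmath.rect(f1_amp, math.radians(deg))
        misfit = ((abs(f1 + delta_plus) - f2_plus_amp) ** 2
                  + (abs(f1 + delta_minus) - f2_minus_amp) ** 2)
        if misfit < best_misfit:
            best_alpha, best_misfit = deg, misfit
    return best_alpha, best_misfit

# Synthetic reflection (all values are made up for illustration).
true_alpha1 = 40            # degrees, the phase we pretend not to know
alpha_h, phi_h = 110, 30    # heavy-atom phase and anomalous phase shift, degrees
f1_true = cmath.rect(10.0, math.radians(true_alpha1))
delta_plus = cmath.rect(1.5, math.radians(alpha_h + phi_h))
delta_minus = cmath.rect(1.5, math.radians(alpha_h - phi_h))

# The "measured" amplitudes; |F2-*| equals |F2-|, so the conjugate drops out.
f2_plus_amp = abs(f1_true + delta_plus)
f2_minus_amp = abs(f1_true + delta_minus)

recovered, misfit = solve_alpha1(10.0, f2_plus_amp, f2_minus_amp, delta_plus, delta_minus)
print(f"recovered alpha_1 = {recovered} degrees")
```

Each amplitude alone leaves two candidate phases (the two circle intersections of the Harker diagram); only the true phase satisfies both, which is why the grid search has a single zero-misfit solution here.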

13 Phase probability distribution Radii are F_p and k*F_ph. Widths are σ_p and k*σ_ph. The red areas are the places in Argand space where both F_p and F_ph − F_h can be.

14 Most probable versus best phase The most probable phase is not necessarily the “best” for computing the first e-density map. Figure label: weighted average = best phase. Shaded regions are possible F_p and F_ph solutions.

15 Figure of merit Figure of merit “m” is a measure of how good the phases are. C is the “center of mass” of a ring of phase probabilities (probability is the “mass”). Assume the radius of the ring is 1; m is then the distance of C from the center of the ring. If the probabilities are sharply distributed, m≈1. If they are distributed widely, m is smaller. $F_{best}(hkl) = m\,|F(hkl)|\,e^{i\alpha_{best}}$
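
The centroid construction is easy to try numerically. This is a minimal sketch, assuming the phase probabilities are given on a regular grid of angles; the two test distributions (one sharp peak, two equal peaks) are invented for illustration.

```python
import cmath
import math

def figure_of_merit(phases_deg, probabilities):
    """Return (m, alpha_best): length and direction (degrees) of the probability centroid."""
    total = sum(probabilities)
    centroid = sum(p * cmath.rect(1.0, math.radians(a))
                   for a, p in zip(phases_deg, probabilities)) / total
    return abs(centroid), math.degrees(cmath.phase(centroid))

def f_best(amplitude, m, alpha_best_deg):
    """Figure-of-merit weighted structure factor, m * |F| * exp(i * alpha_best)."""
    return cmath.rect(amplitude * m, math.radians(alpha_best_deg))

grid = list(range(0, 360, 5))
# One sharp peak at 60 degrees.
sharp = [math.exp(-((a - 60) / 5.0) ** 2) for a in grid]
# Two equally likely peaks, at 60 and 180 degrees.
bimodal = [math.exp(-((a - 60) / 5.0) ** 2) + math.exp(-((a - 180) / 5.0) ** 2)
           for a in grid]

m_sharp, best_sharp = figure_of_merit(grid, sharp)
m_bimodal, best_bimodal = figure_of_merit(grid, bimodal)
print(f"sharp:   m = {m_sharp:.3f}, alpha_best = {best_sharp:.1f}")
print(f"bimodal: m = {m_bimodal:.3f}, alpha_best = {best_bimodal:.1f}")
```

In the bimodal case the best phase falls between the two peaks, where the probability itself is low, and m drops to about one half: the map coefficient is down-weighted because the phase is uncertain.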

16 Centric reflections If the crystal has centrosymmetric symmetry, all reflections are centric, having phase = 0° or 180°. If the crystal has 2-fold, 4-fold or 6-fold rotational symmetry, then the reflections in the 0-plane are centric (because the projection of the density is centrosymmetric). For centric reflections: |F_ph| = |F_p| ± |F_h|. This means the amplitude |F_h| is exact* for centric reflections. *Assuming perfect scaling.

17 F's that are symmetry mates AND Friedel mates: centric reflections Draw any set of Bragg planes parallel to the 2-fold. Project the density onto a line. Notice: The projected density is centrosymmetric. Therefore, the phase can only be 0° or 180°. Example: F(0 k l) and F(0 -k -l) if a is a 2-fold.

18 Heavy atom phasing methods SIR = single isomorphous replacement, without anomalous. Fourier transform uses: Figure-of-merit weighted amplitudes, alpha-best phases and centric reflections. MIR = multiple isomorphous replacement, without anomalous. Same Fourier terms, but Figure-of-merit is generally better than for SIR.

19 Heavy atom phasing methods SAD = single wavelength anomalous dispersion. Phases from F+ and F- from one crystal. Fourier transform uses: Figure-of-merit weighted amplitudes, alpha-best phases and centric reflections. MAD = multi-wavelength anomalous dispersion. Phases from three datasets from one crystal at 2 wavelengths (or more): F+, F- at the anomalous wavelength, and F at the non-anomalous wavelength.

20 In class exercise: phase error
F_P = 5.00, σ = 0.5
F_PH1 = 5.50, σ = 0.8; F_H1 = 2.23, α_H1 = -63.4°
F_PH2 = 4.50, σ = 0.9; F_H2 = 0.50, α_H2 = -164°
(1) Draw three circles separated by vectors F_H1 and F_H2. (2) Draw circular “error bars” of width 2σ. (3) Draw circle plot of F_P phase probabilities. (4) Estimate the centroid c of probability. (5) What is the Figure of Merit, m?

21 Is the initial map good enough? (1) The map is calculated using α_best. (2) The map is contoured and displayed using {InsightII, MIDAS, XtalView, FRODO, O,...} (3) A “trace” is attempted.

22 Model building e-density cages (1σ contours) displayed using InsightII

23 Information used to build the first model: Sequence and Stereochemistry...plus assorted disulfide and ligand information. Models are built initially by identifying characteristic sidechains (by their shape) then tracing forward and backward along the backbone density until all amino acids are in place. Alpha-carbons can be placed by hand, and numbered, then an automated program will add the other atoms (MaxSprout).

24 Tracing an electron density map Class exercise: sequence: AGDLLEHEIFGMPPAGGA. Can you locate the density above in the sequence?

25 DeepView
In a web browser on Linux go to: eds.bmc.uu.se/eds/
Enter PDB code: 1dfn
Download, Maps: Format-->CCP4, Type-->2mFo-DFc, Generate Map, Download linked file.
In a Linux terminal window: gunzip 1dfn.ccp4.gz
In a web browser window on Linux go to: www.rcsb.org
Search for: 1dfn
Click on Save as: 1dfn.pdb
In a Linux terminal window type (make an alias!): /afs/rpi.edu/dept/bio/pub/SPDBV/bin/spdbv.Linux
Open PDB file: 1dfn.pdb
Open electron density map (CCP4): 1dfn.ccp4
Rotate using mouse.

26 R-factor: How good is the model? Calculate F_calc’s based on the model, then compute the R-factor. Depending on the space group, an R-factor of ~55% would be attainable by scaled random data, so the R-factor must be < ~50%. Note: It is possible to get a high R-factor for a correct model. What kind of mistake would do this?
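
For reference, the usual definition is R = Σ| |F_obs| − k|F_calc| | / Σ|F_obs|, summed over reflections. The sketch below assumes the simplest possible scale factor, k = Σ|F_obs| / Σ|F_calc|; real programs use more elaborate scaling, and the amplitudes are invented.

```python
def r_factor(f_obs, f_calc):
    """R = sum(| |Fo| - k*|Fc| |) / sum(|Fo|), with k = sum(|Fo|) / sum(|Fc|)."""
    scale = sum(f_obs) / sum(f_calc)
    numerator = sum(abs(fo - scale * fc) for fo, fc in zip(f_obs, f_calc))
    return numerator / sum(f_obs)

# Hypothetical amplitudes for four reflections.
f_obs = [10.0, 20.0, 30.0, 40.0]
f_calc = [12.0, 18.0, 33.0, 37.0]
r = r_factor(f_obs, f_calc)
print(f"R = {r:.2f}")
```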

27 What can you do if the phases are not good enough? 1. Collect more heavy atom derivative data. 2. Try density modification techniques. Density modification cycle: initial phases --> Fo’s and (new) phases --> map --> modified map --> Fc’s and new phases --> back to Fo’s and (new) phases.

28 Density modification techniques Solvent Flattening: Make the water part of the map flat. (1) Draw an envelope around the protein part. (2) Set the solvent ρ to a constant value and back transform.

29 Solvent flattening Requires that the protein part can be distinguished from the solvent part. BC Wang’s method: Smooth the map using a 10Å Gaussian. Then take the top X% of the map, where X is calculated from the crystal density.
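
The envelope-and-flatten cycle can be sketched on a one-dimensional toy “map”. Everything here is an assumption made for illustration: the 1D cell, the 60% solvent fraction, the smoothing width in grid points rather than Å, and the use of the map's own amplitudes as a stand-in for the measured |F_o|. It follows the smooth-then-threshold recipe of the slide, not the details of any published program.

```python
import cmath
import math

N = 100                      # grid points along a hypothetical 1D unit cell
PROTEIN = range(20, 60)      # true protein region, used only to build the toy map
SOLVENT_FRACTION = 0.60      # assumed known from the crystal density

# Toy map: bumpy, high density in the protein; a low ripple in the solvent.
rho = [1.0 + 0.5 * math.sin(i) if i in PROTEIN else 0.2 * math.sin(3 * i)
       for i in range(N)]

def smooth(density, sigma, reach):
    """Periodic Gaussian smoothing (the cell repeats, so indices wrap around)."""
    offsets = range(-reach, reach + 1)
    weights = [math.exp(-0.5 * (k / sigma) ** 2) for k in offsets]
    norm, n = sum(weights), len(density)
    return [sum(w * density[(i + k) % n] for k, w in zip(offsets, weights)) / norm
            for i in range(n)]

def protein_mask(smoothed, solvent_fraction):
    """Envelope: the top (1 - solvent_fraction) of the smoothed map."""
    n_protein = round(len(smoothed) * (1.0 - solvent_fraction))
    ranked = sorted(range(len(smoothed)), key=lambda i: smoothed[i], reverse=True)
    return set(ranked[:n_protein])

def flatten(density, mask):
    """Replace every solvent point by the mean solvent density."""
    solvent = [v for i, v in enumerate(density) if i not in mask]
    level = sum(solvent) / len(solvent)
    return [v if i in mask else level for i, v in enumerate(density)]

def dft(density):
    n = len(density)
    return [sum(v * cmath.exp(-2j * math.pi * h * x / n) for x, v in enumerate(density))
            for h in range(n)]

mask = protein_mask(smooth(rho, sigma=4.0, reach=12), SOLVENT_FRACTION)
flat = flatten(rho, mask)

# Back transform: phases from the modified map, amplitudes from the "observed" data.
f_obs_amplitudes = [abs(f) for f in dft(rho)]     # stands in for measured |Fo|
new_phases = [cmath.phase(f) for f in dft(flat)]
f_new = [cmath.rect(a, p) for a, p in zip(f_obs_amplitudes, new_phases)]
```

In a real cycle, `f_new` would be transformed back into a map and the procedure repeated until the phases stop changing.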

30 Skeletonization (1) Calculate map. (2) Skeletonize it (draw ridge lines) (3) Prune skeleton so that it is “protein-like” (4) Back transform the skeleton to get new phases. Protein-like means: (a) no cycles, (b) no islands

31 Non-crystallographic symmetry If there are two molecules in the ASU, there is a matrix M and vector v that rotate and translate one onto the other: M r_1 + v = r_2. (1) Using the Patterson Correlation Function, find M and v. (2) Calculate initial map. (3) Set ρ(r_1) and ρ(r_2) to (ρ(r_1) + ρ(r_2))/2. (4) Back transform to get new phases.

32 What does a good map look like? Before computers, maps were contoured on stacked pieces of plexiglass. A “Richards box” was used to build the model. Figure labels: half-silvered mirror, plexiglass stack, brass parts model.

33 Low-resolution At 4-6Å resolution, alpha helices look like sausages.

34 Medium resolution ~3Å data is good enough to see the backbone with space in between.

35 The program BONES traces the density automatically, if the phases are good.

36 BONES models need to be manually connected and sidechains attached. MaxSPROUT converts a fully connected trace to an all-atom model.

37 Errors in the phases make some connections ambiguous.

38 Contouring at two density cutoffs sometimes helps

39 Holes in rings are a good thing Seeing a hole in a tyrosine or phenylalanine ring is universally accepted as proof of good phases. You need at least 2Å data.

40 Can you see in stereo? Try this at home. In 3D, the density is much easier to trace.

41 New rendering programs “CONSCRIPT: A program for generating electron density isosurfaces for presentation in protein crystallography.” M. C. Lawrence, P. D. Bourke

42 Great map: holes in rings

43 Superior map: Atomicity Rarely is the data this good. 2 holes in Trp. All atoms separated.

44 Only small molecule structures are this good Atoms are separated down to several contours. Proteins are never this well-ordered. But this is what the density really looks like.

45 Refinement The gradient of the R-factor with respect to each atomic position may be calculated. Each atom is moved down-hill along the gradient. “Restraints” may be imposed.

46 What is a restraint? A restraint is a function of the coordinates that is lowest when the coordinates are “ideal”, and which increases as the coordinates become less ideal. Stereochemical restraints: bond lengths, bond angles, torsion angles; also planar groups and B’s.
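
Downhill refinement with a restraint can be sketched in a few lines. This toy replaces the R-factor with a simple squared misfit to two “observed” positions in one dimension, and uses a harmonic bond-length restraint. The targets, the ideal length, the weight and the step size are all assumed values chosen for illustration.

```python
def energy(x1, x2, targets, ideal_length, weight):
    """Data misfit plus a harmonic restraint on the distance x2 - x1."""
    data_term = (x1 - targets[0]) ** 2 + (x2 - targets[1]) ** 2
    restraint = weight * ((x2 - x1) - ideal_length) ** 2
    return data_term + restraint

def gradient(x1, x2, targets, ideal_length, weight):
    stretch = (x2 - x1) - ideal_length
    g1 = 2 * (x1 - targets[0]) - 2 * weight * stretch
    g2 = 2 * (x2 - targets[1]) + 2 * weight * stretch
    return g1, g2

def refine(x1, x2, targets, ideal_length, weight, step=0.01, n_steps=2000):
    """Steepest descent: move each atom downhill along the gradient."""
    for _ in range(n_steps):
        g1, g2 = gradient(x1, x2, targets, ideal_length, weight)
        x1 -= step * g1
        x2 -= step * g2
    return x1, x2

targets = (0.0, 1.7)     # where the data would like the atoms to be (hypothetical)
ideal_length = 1.5       # ideal bond length (hypothetical)
weight = 10.0            # restraint weight relative to the data term

start = (0.3, 1.2)
x1, x2 = refine(*start, targets, ideal_length, weight)
print(f"refined bond length = {x2 - x1:.4f}")
```

The refined distance settles between the data value (1.7) and the ideal value (1.5), much closer to the ideal because the restraint weight is large. That is the sense in which a heavily weighted restraint behaves almost like a constraint.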

47 Calculated phases, observed amplitudes = hybrid F's F_c’s are calculated from the atomic coordinates. A new electron density map calculated from the F_c's would only reproduce the model (of course!). Instead we use the observed amplitudes |F_o| and the model phases α_c in a hybrid back transform. Hybrid maps show places where the current model is wrong and needs to be changed.

48 Difference map: F_o-F_c amplitudes The F_o “native” map ρ(F_o) differs from the F_c map ρ(F_c) in places where the model is wrong. So we take the difference. In the difference map: Missing atoms? ρ(F_o-F_c) > 0.0. Wrongly placed atoms? ρ(F_o-F_c) < 0.0. Correctly modeled atoms? ρ(F_o-F_c) = 0.0. Q: Subtracting densities (real space) is the same as subtracting amplitudes (reciprocal space) and transforming. T or F?

49 2Fo-Fc amplitudes The F_o map plus the difference map is: equal to F_o where the differences are zero (the atoms are correct); less than F_o where the model has wrong atoms; greater than F_o where the model is missing atoms. F_o + (F_o - F_c) = 2F_o - F_c
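
The identity F_o + (F_o − F_c) = 2F_o − F_c can be checked on a toy 1D “crystal”, provided every map is computed with the same model phases α_c. The grid size, atom positions and heights below are invented, and `dft` / `back_transform` are naive transforms written only for this sketch.

```python
import cmath
import math

def dft(density):
    n = len(density)
    return [sum(v * cmath.exp(-2j * math.pi * h * x / n) for x, v in enumerate(density))
            for h in range(n)]

def back_transform(coefficients):
    n = len(coefficients)
    return [sum(c * cmath.exp(2j * math.pi * h * x / n)
                for h, c in enumerate(coefficients)).real / n
            for x in range(n)]

N = 32
true_density = [0.0] * N
for position, height in [(5, 6.0), (12, 8.0), (20, 7.0), (26, 3.0)]:
    true_density[position] = height
model_density = list(true_density)
model_density[26] = 0.0          # the model is missing the atom at grid point 26

f_obs = [abs(f) for f in dft(true_density)]    # observed: amplitudes only
f_calc = dft(model_density)                    # calculated: amplitudes and phases
amp_calc = [abs(f) for f in f_calc]
phase_calc = [cmath.phase(f) for f in f_calc]

def hybrid_map(amplitudes):
    """Back transform the given amplitudes combined with the model phases."""
    return back_transform([cmath.rect(a, p) for a, p in zip(amplitudes, phase_calc)])

fo_map = hybrid_map(f_obs)
fc_map = hybrid_map(amp_calc)
difference_map = hybrid_map([fo - fc for fo, fc in zip(f_obs, amp_calc)])
two_fo_fc_map = hybrid_map([2 * fo - fc for fo, fc in zip(f_obs, amp_calc)])

print(f"difference density at the missing atom: {difference_map[26]:.2f}")
```

Because the transform is linear, the 2F_o − F_c map equals the F_o map plus the difference map point by point, and the F_c map simply reproduces the model.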

50 The “free R-factor”: cross-validation The free R-factor is the test set residual, calculated the same as the R-factor, but on the “test set”. Free R-factor asks “how well does your model predict the data it hasn’t seen?” Note: the only difference is which hkl are used to calculate.
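
A minimal sketch of the bookkeeping follows: set a random subset of reflections aside, then compute the same residual on each subset. The 5% test fraction, the fixed random seed and the synthetic amplitudes are assumptions for illustration, and the amplitudes are taken to be already on a common scale.

```python
import random

def r_factor(f_obs, f_calc):
    """R = sum(| |Fo| - |Fc| |) / sum(|Fo|), amplitudes assumed already scaled."""
    return sum(abs(fo - fc) for fo, fc in zip(f_obs, f_calc)) / sum(f_obs)

def split_work_test(n_reflections, test_fraction=0.05, seed=0):
    """Randomly assign reflection indices to a working set and a test set."""
    rng = random.Random(seed)
    indices = list(range(n_reflections))
    rng.shuffle(indices)
    n_test = max(1, round(n_reflections * test_fraction))
    return sorted(indices[n_test:]), sorted(indices[:n_test])

def r_work_and_free(f_obs, f_calc, work, test):
    """Same formula on both sets; only the choice of hkl differs."""
    def pick(values, idx):
        return [values[i] for i in idx]
    return (r_factor(pick(f_obs, work), pick(f_calc, work)),
            r_factor(pick(f_obs, test), pick(f_calc, test)))

# Synthetic data: every calculated amplitude is off by 10%.
f_obs = [10.0 + i for i in range(100)]
f_calc = [fo * (1.1 if i % 2 else 0.9) for i, fo in enumerate(f_obs)]

work, test = split_work_test(len(f_obs))
r_work, r_free = r_work_and_free(f_obs, f_calc, work, test)
print(f"R_work = {r_work:.3f}, R_free = {r_free:.3f}")
```

The test reflections must be excluded from refinement itself; in this toy nothing is refined, so both values come out the same.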

51 Why cross-validate? If you have three points, you can fit them to a quadratic equation (3 parameters) with zero residual, but is it right? Figure labels: observed data; calculated; R-factor = 0.000!!

52 Fitting and overfitting Fit is correct if additional data, not used in fitting the curve, fall on the curve. Low residual in the “test set” justifies the fit. residual≠0

53 cross-validation = Measuring the residual on data (the “test set”) that were not used to create the model. The residual on test data is likely to be small if the ratio of data to parameters is large. Figure label: a line has 2 parameters.
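
The three-point example is easy to reproduce. In this sketch the data points are invented (roughly on the line y = x), the parabola is built by Lagrange interpolation, and the line by ordinary least squares.

```python
def quadratic_through(points):
    """The unique parabola through three points (Lagrange interpolation)."""
    (x0, y0), (x1, y1), (x2, y2) = points
    def predict(x):
        return (y0 * (x - x1) * (x - x2) / ((x0 - x1) * (x0 - x2))
                + y1 * (x - x0) * (x - x2) / ((x1 - x0) * (x1 - x2))
                + y2 * (x - x0) * (x - x1) / ((x2 - x0) * (x2 - x1)))
    return predict

def least_squares_line(points):
    """Best-fit straight line: 2 parameters."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in points)
             / sum((x - mean_x) ** 2 for x, _ in points))
    intercept = mean_y - slope * mean_x
    return lambda x: slope * x + intercept

def residual(model, points):
    return sum(abs(y - model(x)) for x, y in points)

train = [(0.0, 0.1), (1.0, 0.9), (2.0, 2.2)]   # data used for fitting
test = [(3.0, 3.0)]                            # data the models have not seen

parabola = quadratic_through(train)
line = least_squares_line(train)
print(f"parabola: train {residual(parabola, train):.3f}, test {residual(parabola, test):.3f}")
print(f"line:     train {residual(line, train):.3f}, test {residual(line, test):.3f}")
```

The parabola has zero residual on the three fitted points but misses the fourth point by more than the line does; only the test-set residual reveals this.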

54 parameters versus data Example from Drenth, Ch 13: Papain crystal structure has 25,000 reflections. Papain has 2000 non-H atoms times 4 parameters each (x, y, z, B) equals 8000 parameters data/parameters = 25,000/8000 ≈ 3 <-- this is too small!

55 restraints are “data” Restraint types: bond lengths, bond angles, torsion angles, planar groups, van der Waals. Bond lengths, angles, etc. are “measurements” that must be fit by the model. The true “residual” should include deviations from ideal bond lengths, angles, etc. In practice, the residual in restraints (e.g. deviations from ideal bond lengths, angles) is very low. This means that restraints are essentially “constraints”.

56 constraints reduce the number of parameters Bond lengths, bond angles, and planar groups may be fixed to their ideal values during refinement (“Torsion angle refinement”). Using constraints, Ser has 3 parameters, Phe 4, and Arg 6. There are an average 3.5 torsion angles per residue. Papain has ~700 torsion angle parameters, so data/parameters = 25,000/700 ≈ 35.

57 radius of convergence = How far away from the truth can the starting model be, and still find the truth? [Figure: total residual plotted against parameter space.] Radius of convergence depends on data & method. More data = fewer false (local) minima. Better method = one that can overcome local minima.

58 Molecular dynamics w/ Xray refinement MD samples conformational space while maintaining good geometry (low residual in restraints). E = (residual of restraints) + (R-factor) dE/dx i is calculated for each atom i, then we move i downhill. Random vectors added, proportional to temperature T. The simulated annealing MD method: (1) start the simulation “hot” (2) “cool” slowly, trapping structure in lowest minimum. “X-plor” Axel Brünger et al
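
The “start hot, cool slowly” idea can be tried on a one-dimensional energy with a false minimum. This is not molecular dynamics and not X-plor: it is a simplified stand-in in which each step moves downhill along dE/dx and adds a random kick whose variance is proportional to the temperature. The energy function, the cooling schedule and the seed are all assumptions.

```python
import math
import random

def energy(x):
    """Toy residual with a local minimum near x = +1.35 and a deeper one near x = -1.47."""
    return x ** 4 - 4 * x ** 2 + x

def gradient(x):
    return 4 * x ** 3 - 8 * x + 1

def minimize(x, step=0.01, n_steps=5000):
    """Plain downhill refinement: stops in the nearest minimum."""
    for _ in range(n_steps):
        x -= step * gradient(x)
    return x

def anneal(x, t_start=3.0, t_end=0.001, step=0.01, n_steps=20000, seed=0):
    """Downhill moves plus random kicks that shrink as the temperature drops."""
    rng = random.Random(seed)
    for i in range(n_steps):
        temperature = t_start * (t_end / t_start) ** (i / (n_steps - 1))
        kick = rng.gauss(0.0, math.sqrt(2.0 * step * temperature))
        x += -step * gradient(x) + kick
    return x

x_downhill = minimize(1.0)
x_annealed = anneal(1.0)
print(f"downhill only: x = {x_downhill:.3f}, E = {energy(x_downhill):.3f}")
print(f"annealed:      x = {x_annealed:.3f}, E = {energy(x_annealed):.3f}")
```

Started at x = 1.0, plain minimization can only reach the nearby false minimum. The annealed run can cross the barrier while hot, although a single run is not guaranteed to end in the deepest well.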

59 Phase bias, and how to fix it. The model biases the phases. The effect of phase bias is local to the errors. To correct a part of the model, we must first remove that part. An “OMIT MAP” is calculated. The phases for an omit map are derived from a partial model, where some small part has been omitted.

60 Omit maps This residue has been removed before calculating F c. 2F o -F c density = F o + (F o - F c ) = The native map plus the difference map.

61 Two inhibitor peptides bound to thrombin. The inhibitors were omitted from the F_c calculation. (stereo images) FÉTHIÈRE et al, Protein Science (1996), 5: 1174-1183.

62 The final model Other data commonly reported: total unique reflections, completeness, free R-factor

63 From crystal to data native data: F_p. Indexed film --> Intensity, $I_p(hkl) = |F_p(hkl)|^2$. I is relative: bigger crystal, higher I; better crystal, higher I; longer exposure, higher I; more intense X-rays, higher I. Because there is no absolute scale, F_p and F_ph are on different scales. Internal scaling.

64 From data to Patterson map native data: F p heavy atom data: F ph Calculate difference Patterson Find the best scale factor, w Calculate F diff = w*|F ph | – |F p |
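
One simple choice for the scale factor is the least-squares value, the w that minimizes Σ(w|F_ph| − |F_p|)². The sketch below assumes that choice and a single overall scale; real scaling programs also allow for resolution-dependent terms, and the amplitudes here are invented.

```python
def best_scale(f_p, f_ph):
    """Least-squares w minimizing sum((w*|Fph| - |Fp|)^2)."""
    return (sum(p * ph for p, ph in zip(f_p, f_ph))
            / sum(ph * ph for ph in f_ph))

def difference_amplitudes(f_p, f_ph):
    """Return (w, F_diff) with F_diff = w*|Fph| - |Fp| for each reflection."""
    w = best_scale(f_p, f_ph)
    return w, [w * ph - p for p, ph in zip(f_p, f_ph)]

# Hypothetical native and derivative amplitudes on different scales.
f_p = [10.0, 20.0, 30.0]
f_ph = [21.0, 39.0, 61.0]
w, f_diff = difference_amplitudes(f_p, f_ph)
print(f"w = {w:.4f}, F_diff = {[round(d, 3) for d in f_diff]}")
```

The squared differences, F_diff², are the coefficients of the difference Patterson map.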

65 From data to phases native data: F p heavy atom data: F ph Calculate difference Patterson Find heavy atom peaks on Harker sections Solve for heavy atom positions using symmetry Calculate heavy atom vectors Estimate phases

66 From data to model Collect native data: F_p --> Collect heavy atom data: F_ph --> Estimate phases --> Calculate ρ --> Is the map traceable? If yes: trace the map, then refine. If no: density modification, then recalculate ρ.

