Anomalous Differences: Bijvoet differences, (hkl) vs (-h-k-l)
Dispersive Differences: λ1 (hkl) vs λ2 (hkl), from merged (hkl)’s
X-ray Anomalous Scattering
Dependency on f’ and f”
Anomalous differences are proportional to 2f”.
Dispersive differences are largely dependent on the Δf’ between the wavelengths.
Se Absorption Edge
[Figure: f’ and f” (electrons) vs X-ray energy (eV), with the peak, inflection point, and high remote marked]
Anomalous differences
[Figure: f’ and f” (electrons) vs X-ray energy (eV), with f” highlighted]
Hg Anomalous signal: 2.5x stronger signal than Se
[Figure: f’ and f” (electrons) vs X-ray energy (eV)]
Dispersive differences
[Figure: f’ and f” (electrons) vs X-ray energy (eV), with f” and f’ highlighted]
Crick and Magdoff Equation (1956)
Perturbation due to f”: $\left(\frac{N_A}{2N_T}\right)^{1/2} \frac{2 f''_A}{Z_{eff}}$
Perturbation due to f’: $\left(\frac{N_A}{2N_T}\right)^{1/2} \frac{f'_A(\lambda_i) - f'_A(\lambda_j)}{Z_{eff}}$
Where
N_A = number of anomalous scatterers, with anomalous scattering factors f’ and f”
N_T = total number of atoms in the structure
Z_eff = effective normal scattering power for all atoms (6.7e for protein atoms at 2θ = 0)
Since perturbations to f’ and f” are orthogonal, the net expected signal is the root mean square of these two quantities.
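The two perturbation terms above can be evaluated with a few lines of Python. This is a minimal sketch, not a reference implementation: the function name `crick_magdoff` and its argument names are made up for illustration, the dispersive term is returned as a magnitude (an assumption, so that the order of the two wavelengths does not matter), and the numbers in the usage line are arbitrary inputs, not values for any protein in these slides.

```python
import math


def crick_magdoff(n_anom, n_total, f_double_prime, f_prime_i, f_prime_j, z_eff=6.7):
    """Estimate the fractional anomalous and dispersive signals.

    n_anom         -- N_A, number of anomalous scatterers
    n_total        -- N_T, total number of atoms in the structure
    f_double_prime -- f'' of the anomalous scatterer (electrons)
    f_prime_i/j    -- f' of the anomalous scatterer at wavelengths i and j
    z_eff          -- effective normal scattering power (6.7 e for protein atoms)
    """
    scale = math.sqrt(n_anom / (2.0 * n_total))
    anomalous = scale * (2.0 * f_double_prime / z_eff)
    # Magnitude of the f' difference (assumption: the sign is not of interest here).
    dispersive = scale * (abs(f_prime_i - f_prime_j) / z_eff)
    return anomalous, dispersive


# Arbitrary illustrative inputs: 2 scatterers in 100 atoms.
anomalous, dispersive = crick_magdoff(2, 100, 6.7, -10.0, -3.3)
print(f"anomalous {anomalous:.3f}, dispersive {dispersive:.3f}")
```

For a real estimate, N_T has to be counted or approximated from the sequence, and f’ and f” taken from a fluorescence scan or tabulated values at the wavelengths actually collected.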
Crick and Magdoff Equation
“Pessimistic” means 60% occupancy and 60% of the optimal f’ and f”
Met8p: space group C2, 6 SeMet x 3 molecules (273 aa), 2.8 Å, 6.7%
COPOX C2 crystal form: 32 Se in ASU (8 SeMet x 4 molecules, 333 aa), 2.8 Å, 9.8%
COPOX P3 crystal form: 16 Se (8 SeMet x 2 molecules, 333 aa), 3.5 Å, 10%
To determine the quality of data required to see the achievable signal, evaluate the required intensity over background.
For counting statistics, I = variance(I) = σ²(I), or σ(I) = sqrt(I).
If we want the signal to be larger than 2σ and the anomalous signal is 0.03*I (3%), then we want:
–0.03*I > 2σ(I)
–0.03*I > 2*sqrt(I)
–sqrt(I) > 2/0.03
–I > 4444
So each measured intensity must be at least about 4500 counts above background.
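The same arithmetic generalises to any signal fraction and any multiple of σ. The sketch below assumes pure counting statistics (σ = sqrt(I)) as in the derivation above; the function name `min_counts` is invented for illustration.

```python
def min_counts(signal_fraction, n_sigma=2.0):
    """Smallest intensity (counts above background) for which
    signal_fraction * I exceeds n_sigma * sqrt(I)."""
    if signal_fraction <= 0:
        raise ValueError("signal_fraction must be positive")
    # signal_fraction * I > n_sigma * sqrt(I)  =>  I > (n_sigma / signal_fraction)**2
    return (n_sigma / signal_fraction) ** 2


print(round(min_counts(0.03)))  # 3% signal at the 2-sigma level
```

Because the requirement scales as the inverse square of the signal, halving the expected signal quadruples the counts needed per reflection.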
...or look at Rmerge
If the signal is 3% you need Rmerge < 0.03.
Ethan Merritt’s site suggests that you in fact require Rmerge < signal in the resolution shell where you are comparing the signal.
–This is not what I have just shown in real-life cases, which have both worked and not worked.
Analysis: Anomalous and Dispersive differences
Differences should not be higher than the theoretical maximum.
Trend in differences should follow wavelength expectations:
–Peak with highest anomalous signal
–Inflection vs Remote to have biggest dispersive difference
A good Se data set will have anomalous differences around 6-8%.
CNS-Analyse.inp (copox)
column 1: bin number
column 5: average resolution in bin
column 6: / (signed difference)
column 7: / (absolute diff)
column 8: sqrt( )/sqrt(1/2( ^2+ ^2))
column 9: fraction of theoretically complete data
#bin | resolution range | #refl |
CNS analyse.inp “analyse.matrix”
[Table: rows and columns f_p, f_r, f_i. Diagonal entries (0.###) are anomalous differences; off-diagonal entries (0.xxx) are dispersive differences.]
Overall values for resolution range Å.
sqrt( )/sqrt(1/2( ^2+ ^2)) = rms(ΔF_i,k) / rms(|F_i|+|F_k|) = rms(ΔF_i,k) / rms((|F_i|+|F_k|)/2)
A Local Comparison
[Tables: analyse.matrix for FliG/C, Met8p, Copox, and Capsid, each with rows and columns f_p, f_r, f_i. Only the last entry of the Capsid f_i row survives: 0.0832]
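Experimental vs. calculated
To obtain a usable signal, the data must be measured with a significantly better (lower) noise level.
[Tables: analyse.matrix for Copox (experimental) and Calc (calculated), each with rows and columns f_p, f_r, f_i. Only the last entry of the Calc f_i row survives: 0.045]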
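Friedel differences vs Centric differences
One way to analyze the noise in the data is to compare the merging statistics of the centric reflections to the Bijvoet differences.
Centric reflections are reflections whose Friedel mate (-h,-k,-l) is related to (h,k,l) through the space group’s point symmetry (Laue symmetry), so any difference measured between the two is noise rather than anomalous signal.
For a two-fold axis along b: (h,k,l) = (-h,+k,-l), so the h0l reflections are centric.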
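The centric test can be written down directly: a reflection is centric when some point-group operation maps it onto its own Friedel mate. The sketch below covers only the single two-fold axis along b used in the example above; the names `TWOFOLD_B` and `is_centric` are illustrative, and other point groups would need their own list of operations.

```python
def friedel_mate(hkl):
    """Return the Friedel mate (-h, -k, -l) of a reflection."""
    h, k, l = hkl
    return (-h, -k, -l)


# Index transformations for point group 2 with the two-fold along b
# (assumption: only this one symmetry is modelled here).
TWOFOLD_B = [
    lambda h, k, l: (h, k, l),     # identity
    lambda h, k, l: (-h, k, -l),   # two-fold along b
]


def is_centric(hkl, ops=TWOFOLD_B):
    """True if a symmetry operation maps hkl onto its Friedel mate."""
    mate = friedel_mate(hkl)
    return any(op(*hkl) == mate for op in ops)


print(is_centric((1, 0, 2)), is_centric((1, 1, 2)))
```

Sorting reflections this way lets the centric set serve as a noise baseline against which the Bijvoet differences of the acentric set are judged.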
[Table: merging statistics by resolution shell and overall. Columns: N, 1/resol^2, Dmin, Nmeas, %poss, Cm%poss, Mlplcty, AnomCmpl, AnomFrc, Rmeas, Rmeas0, (Rsym), PCV, PCV0]
"Improved R-factors for diffraction data analysis in macromolecular crystallography", Kay Diederichs & P. Andrew Karplus, Nature Structural Biology, 4, 269 (1997)
Wavelength Correlation
How well do the anomalous pairs correlate between the different wavelengths?
–Good data should have an overall correlation between
–The resolution of the data is only really good down to a correlation of 0.3.
–Diagonal: self correlation (by definition = 1)
–Off-diagonal: overall correlation
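One way to compute such a correlation is a plain Pearson coefficient over the anomalous differences of reflections measured at both wavelengths. This is a sketch under that assumption; the exact statistic reported by any particular program may differ, and the function name and the input lists are invented for illustration.

```python
import math


def anomalous_correlation(delta_a, delta_b):
    """Pearson correlation between anomalous differences (F+ minus F-)
    of the same reflections measured at two wavelengths."""
    if len(delta_a) != len(delta_b) or len(delta_a) < 2:
        raise ValueError("need two equally long lists with at least 2 values")
    n = len(delta_a)
    mean_a = sum(delta_a) / n
    mean_b = sum(delta_b) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(delta_a, delta_b))
    var_a = sum((a - mean_a) ** 2 for a in delta_a)
    var_b = sum((b - mean_b) ** 2 for b in delta_b)
    if var_a == 0 or var_b == 0:
        raise ValueError("differences are constant; correlation undefined")
    return cov / math.sqrt(var_a * var_b)


# Made-up differences for four reflections at two wavelengths.
peak = [12.0, -8.0, 5.0, -3.0]
remote = [7.0, -5.0, 2.0, -1.0]
print(round(anomalous_correlation(peak, remote), 3))
```

Evaluating this per resolution shell rather than overall gives the correlation-versus-resolution curve used to decide where the anomalous signal stops being useful.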
Mannose Binding Protein – 4Yb
In these examples the wavelength correlation is in the off-diagonal.
[Table: rows and columns Inf1, Peak, Inf2, High]
Data has high correlation beyond 1.8Å.
Anomalous differences are stronger than Se differences, as Yb is a stronger scatterer.
Trend is correct here, with the peak wavelength giving the strongest anomalous signal.
Very high correlations in anomalous signal across wavelengths.
Protein “X” – 9Se
In these examples the correlation is in the off-diagonal.
[Table: rows and columns Inf, Peak, High, Low]
Correlation shows that the data is only good to 4.2Å, but the data (and these numbers) are to 2.5Å.
Also note that the peak wavelength doesn’t have the highest anomalous signal. The numbers are in the right range though.
(Off-diagonal numbers are wavelength correlations.)
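[Tables: wavelength correlation matrices (diagonal 1.000) and correlation vs resolution, with an overall "All" row, for Met8p (f_p, f_r, f_i) and Capsid (f_h, f_i, f_p)]
[Tables: wavelength correlation matrices (diagonal 1.000) and correlation vs resolution, with an overall "All" row, for Copox and FliG/C (f_p, f_r, f_e)]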
Correlation of anomalous differences at different wavelengths.
SOLVE nicely puts all three wavelengths in a little table vs resolution. SOLVE suggests that little contribution to phasing will happen below a correlation of 0.5, even though you will probably use data to a correlation of 0.3.
[Tables: SOLVE output "CORRELATION FOR WAVELENGTH PAIRS" for Met8p, COPOX, and FliG/C- 2/3. Columns: DMIN, 1 VS 2, 1 VS 3, 2 VS 3, with an ALL row. The only surviving value is the COPOX ALL entry: 0.23]
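The two thresholds mentioned above (0.5 for a real contribution to phasing, 0.3 as the practical data limit) can be turned into a resolution cutoff once a correlation-vs-resolution table is in hand. The sketch below is an assumption about how one might do that by hand, not part of SOLVE; the function name and the shell values are hypothetical.

```python
def resolution_cutoff(shells, threshold=0.3):
    """Return the dmin of the last shell before the correlation first
    drops below the threshold, or None if the first shell already fails.

    shells -- list of (dmin_in_angstrom, correlation), ordered from low
              to high resolution (decreasing dmin)
    """
    cutoff = None
    for dmin, corr in shells:
        if corr < threshold:
            break  # stop at the first failing shell, ignore later rebounds
        cutoff = dmin
    return cutoff


# Hypothetical shells, not taken from any data set in these slides.
shells = [(6.0, 0.80), (4.0, 0.60), (3.5, 0.40), (3.0, 0.20), (2.8, 0.35)]
print(resolution_cutoff(shells, 0.5), resolution_cutoff(shells, 0.3))
```

Stopping at the first failing shell is a deliberate choice: an isolated high value in an outer shell is more likely noise than recovered signal.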
Summary
Know how much signal you need.
–Rule of thumb is 1 Se per 17 kDa
–Two Mets were incorporated into U3S to solve the structure.
Know the quality of the data you expect to collect.
–COPOX data might not be solvable by SeMet. Might have to return to heavier atoms.
Analyze the data to decide on what wavelengths to use, and to what resolution.
Rmerge update
"Improved R-factors for diffraction data analysis in macromolecular crystallography" (1997), Diederichs and Karplus, NSB 4(4), p269
A discussion of the effect of redundancy on R-factors.
Can also use these R-factors to compare centric merging statistics with overall merging statistics. If there is a good anomalous signal then the overall merging statistics will be higher than the centric stats.
Incorporated in SCALA.
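The redundancy effect can be seen by computing both statistics on the same observations. The sketch below uses the standard definitions: Rmerge sums |I_i - <I>| over all observations, and Rmeas weights each reflection's term by sqrt(n/(n-1)), where n is its multiplicity. The function name and the sample intensities are made up, and skipping singly measured reflections is a choice made here, not necessarily what SCALA does.

```python
import math


def r_merge_and_r_meas(groups):
    """Return (Rmerge, Rmeas) for a list of reflections, each given as a
    list of repeated intensity measurements."""
    num_merge = 0.0
    num_meas = 0.0
    denom = 0.0
    for obs in groups:
        n = len(obs)
        if n < 2:
            continue  # a single measurement has no spread to evaluate
        mean = sum(obs) / n
        dev = sum(abs(i - mean) for i in obs)
        num_merge += dev
        num_meas += math.sqrt(n / (n - 1)) * dev  # multiplicity correction
        denom += sum(obs)
    if denom == 0:
        raise ValueError("no multiply measured reflections with nonzero intensity")
    return num_merge / denom, num_meas / denom


# Made-up intensities: one reflection measured twice, one measured four times.
r_merge, r_meas = r_merge_and_r_meas([[10.0, 12.0], [100.0, 104.0, 98.0, 98.0]])
print(f"Rmerge {r_merge:.4f}, Rmeas {r_meas:.4f}")
```

Running the same function separately on centric reflections and on all reflections gives the comparison described above.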