THE PHASE PROBLEM Electron Density

Presentation on theme: "THE PHASE PROBLEM Electron Density"— Presentation transcript:

1 THE PHASE PROBLEM Electron Density
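The structure factor is defined as:
$F_{\text{unit cell}}(\mathbf{S}) = \int \rho(\mathbf{r}) \exp(2\pi i\, \mathbf{r}\cdot\mathbf{S})\, d\mathbf{r}$
Using the inverse Fourier transform:
$\rho(\mathbf{r}) = \int F(\mathbf{S}) \exp(-2\pi i\, \mathbf{r}\cdot\mathbf{S})\, d\mathbf{S}$
In practice you make a discrete inverse Fourier transform:
$\rho(\mathbf{r}) = \sum_{hkl} F_{hkl} \exp(-2\pi i\, \mathbf{r}\cdot\mathbf{S}_{hkl})$
But you measure the X-ray diffraction intensities $I_{hkl} \propto |F_{hkl}|^2$, which give the amplitude of each $F_{hkl}$ but not its phase.
The fact that you cannot measure $F_{hkl}$ directly is called THE PHASE PROBLEM.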
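
To make the phase problem concrete, here is a minimal one-dimensional sketch. The three-atom "crystal", its scattering factors and the cut-off `H_MAX` are made-up assumptions, not values from the slides. It builds the density once from the full complex structure factors and once from amplitudes alone.

```python
import cmath

# Hypothetical 1D "crystal": (scattering factor, fractional coordinate) per atom.
atoms = [(6.0, 0.10), (8.0, 0.35), (7.0, 0.70)]
H_MAX = 10                                  # use reflections h = -10 .. 10
GRID = [k / 100 for k in range(100)]        # sample points across the unit cell

def structure_factor(h):
    """F_h = sum_j f_j exp(2*pi*i*h*x_j), the discrete form of the integral."""
    return sum(f * cmath.exp(2j * cmath.pi * h * x) for f, x in atoms)

def density(coeffs, x):
    """rho(x) = sum_h F_h exp(-2*pi*i*h*x); the sum is real up to rounding."""
    return sum(F * cmath.exp(-2j * cmath.pi * h * x) for h, F in coeffs.items()).real

true_F = {h: structure_factor(h) for h in range(-H_MAX, H_MAX + 1)}
# What the experiment gives: amplitudes only (square root of intensity).
amplitudes_only = {h: abs(F) for h, F in true_F.items()}

x_with_phases = max(GRID, key=lambda x: density(true_F, x))
x_without_phases = max(GRID, key=lambda x: density(amplitudes_only, x))
print("highest peak with phases:   ", x_with_phases)
print("highest peak without phases:", x_without_phases)
```

With the phases the strongest peak sits on the heaviest atom. With amplitudes only, every term adds in phase at x = 0, so the strongest peak moves to the origin and the atomic positions are lost.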

2 Solutions to the Phase Problem
Direct methods:
- Based upon systematic relations between certain reflections.
- Need high-resolution data & relatively small systems.
- Overwhelmingly the most popular method for small-molecule structures.
Molecular replacement:
- Find a molecule of known structure which is close enough to your protein of interest to provide a good first guess.
- Becoming more popular as the spectrum of possible structures is filled up.
Heavy atom methods:
- Soak an atom which is a strong scatterer (eg. Hg, Fe, Pb, I, Se ...) into your crystal.
- Replace the methionines in your protein with selenomethionine derivatives.
- Use Multiple Isomorphous Replacement.
- Use Multiple or Single Anomalous Diffraction.
- An old & powerful method for finding phases.

3 Heavy atom methods
Must search around for a heavy atom which binds within your crystal & doesn't destroy the crystal lattice.
- Can be extremely frustrating! Many soaking, freezing & diffraction experiments.
- The heavy atom must bind in an ordered way to the protein.
Suppose you have a protein with structure factor $F_P$. For every reflection measured, the addition of the heavy atom adds a term $F_H$ to the scattering: $F_{PH} = F_P + F_H$.
(Figure: vector diagram of $F_P$, $F_H$ & $F_{PH}$.)

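4 Heavy atom methods
Assuming that $F_H$ has a phase angle $\varphi_H$, the structure factor amplitude will be perturbed by
$\Delta F_{PH} = |F_{PH}| - |F_P| \approx |F_H| \cos(\varphi_P - \varphi_H)$
which holds to first order when $|F_H|$ is small compared with $|F_P|$.
(Figure: vector diagram of $F_P$, $F_H$ & $F_{PH}$ with the angle $\varphi_{PH} - \varphi_H$ marked.)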

5 Heavy atom methods
If you can find the location of your heavy atom $\mathbf{r}_H$, then you can calculate the heavy atom structure factor
$F_H = f_H \exp(2\pi i\, \mathbf{r}_H\cdot\mathbf{S}_{hkl})$
just like any other atom within the protein. If you have already measured $|F_P|$ and $|F_{PH}|$ you can recover a constraint on the phases for the protein.
(Figure: vector diagram of $F_P$, $F_H$ & $F_{PH}$.)

6 Additional constraints
A single heavy-atom derivative gives you two possible phases for every reflection (h,k,l), either of which may be correct.
- For the approach to work the crystals must be isomorphous (ie. same a, b, c, α, β, γ & space group).
- In order to determine which phase is correct you need to find additional derivatives.
- Called Multiple Isomorphous Replacement.
(Figure: vector diagram of $F_P$, $F_H$ & $F_{PH}$.)

7 Additional constraints
If you have a second constraint:
- Better to draw the diagram with $F_P$ centred at the origin rather than $F_H$ & $F_{PH}$. This then represents $F_P = F_{PH} - F_H$.
- Should recover one place where three circles intersect. This is your solution for the phases.
(Figure: circles for $|F_P|$ and for each derivative. One pair of intersections is the possible solution for the light green & light blue measurements, another pair for the dark green & dark blue measurements.)
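
The circle construction can be sketched numerically. The example below is an illustration only: the structure factors `fp_true`, `fh1` and `fh2` are invented, and the cosine rule for the triangle $F_{PH} = F_P + F_H$ is used to obtain the two phases allowed by each derivative.

```python
import cmath
import math

def phase_candidates(fp_amp, fph_amp, fh):
    """Two phases of F_P allowed by |F_P|, |F_PH| and a known complex F_H.

    From |F_PH|^2 = |F_P|^2 + |F_H|^2 + 2 |F_P| |F_H| cos(phi_P - phi_H).
    """
    cos_term = (fph_amp ** 2 - fp_amp ** 2 - abs(fh) ** 2) / (2 * fp_amp * abs(fh))
    cos_term = max(-1.0, min(1.0, cos_term))      # guard against rounding
    delta = math.acos(cos_term)
    phi_h = cmath.phase(fh)
    return (phi_h + delta, phi_h - delta)

def angle_diff(a, b):
    """Smallest absolute difference between two angles in radians."""
    return abs(cmath.phase(cmath.exp(1j * (a - b))))

# Hypothetical reflection: the "true" protein structure factor and two
# heavy-atom contributions (in practice F_P's phase is the unknown).
fp_true = cmath.rect(100.0, math.radians(40.0))
fh1 = cmath.rect(15.0, math.radians(110.0))
fh2 = cmath.rect(12.0, math.radians(-30.0))

# Only amplitudes of F_P and F_PH are "measured".
cands1 = phase_candidates(abs(fp_true), abs(fp_true + fh1), fh1)
cands2 = phase_candidates(abs(fp_true), abs(fp_true + fh2), fh2)

# The phase shared by both derivatives is where the three circles meet.
best_pair = min(((a, b) for a in cands1 for b in cands2),
                key=lambda pair: angle_diff(*pair))
phi_p = best_pair[0]
print("derivative 1 allows:", [round(math.degrees(p), 1) for p in cands1])
print("derivative 2 allows:", [round(math.degrees(p), 1) for p in cands2])
print("common phase (deg): ", round(math.degrees(phi_p), 1))
```

Each derivative on its own leaves a two-fold ambiguity. Only the phase common to both candidate pairs matches the phase used to generate the data.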

8 Combining phase information
In practice there are errors with respect to each heavy atom experiment.
- Therefore you recover a probability distribution for a phase, rather than an absolute phase.
- Best to take a probability-weighted sum to determine the experimental phases.
- m is the length of the probability-weighted experimental phase vector & is called the "figure of merit".
- Ideally m = 1, but in practice m < 1.
(Figures: two phase probability distributions $P(\Phi_{hkl})$ plotted against $\Phi_{hkl}$, each with its figure of merit m marked.)
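
A small sketch of the figure of merit, taken here as the length of the probability-weighted mean of $\exp(i\varphi)$. The peak shape in `peak` (a von Mises-like bump) and the step of multiplying distributions from two derivatives are assumptions made for illustration, not something the slides specify.

```python
import cmath
import math

PHASES = [math.radians(d) for d in range(0, 360, 5)]   # phase grid, 5 degree steps

def peak(center_deg, sharpness=8.0):
    """Hypothetical single-peaked, unnormalised phase probability."""
    c = math.radians(center_deg)
    return lambda phi: math.exp(sharpness * math.cos(phi - c))

def figure_of_merit(prob):
    """Return (m, best_phase) for a phase probability function."""
    weights = [prob(phi) for phi in PHASES]
    centroid = sum(w * cmath.exp(1j * phi)
                   for w, phi in zip(weights, PHASES)) / sum(weights)
    return abs(centroid), cmath.phase(centroid)

# One derivative: two equally likely phases, at 40 and 180 degrees.
p_a, p_b = peak(40.0), peak(180.0)
m_single, phi_single = figure_of_merit(lambda phi: p_a(phi) + p_b(phi))

# A second derivative that favours 40 degrees; combining independent
# experiments by multiplying their distributions is an assumption here.
p_second = peak(40.0)
m_multi, phi_multi = figure_of_merit(
    lambda phi: (p_a(phi) + p_b(phi)) * p_second(phi))

print("one derivative:  m = %.2f" % m_single)
print("two derivatives: m = %.2f" % m_multi)
```

The bimodal distribution gives a low m because the two candidate phases partly cancel. Once the second experiment removes the ambiguity, m moves towards 1.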

9 Finding Heavy atom Positions
The previous treatment relied on finding the heavy atom positions. In practice this is usually solvable using the Patterson function:
$P(u,v,w) = \sum_{hkl} |F_{hkl}|^2 \exp(2\pi i\, \mathbf{u}\cdot\mathbf{S})$
In this case you can calculate it without any phase information, directly from the measured intensities.
Note that $|F_{hkl}|^2 = F_{hkl} \times F_{hkl}^*$, hence
$P(u,v,w) = \rho(\mathbf{r}) \ast \rho(-\mathbf{r}) = \int \rho(\mathbf{r})\, \rho(\mathbf{r}+\mathbf{u})\, d\mathbf{r}$
where we use the convolution theorem for Fourier transforms. $\rho(-\mathbf{r})$ represents the real electron density inverted through the origin.
(Figure: the Patterson function, plotted against u, for a unit cell containing two heavy atoms.)

10 Patterson Function
Since $P(u,v,w) = \rho(\mathbf{r}) \ast \rho(-\mathbf{r}) = \int \rho(\mathbf{r})\, \rho(\mathbf{r}+\mathbf{u})\, d\mathbf{r}$
this means that the Patterson map gives you:
- A central peak for u = 0, since the density sits upon itself for all atoms (including the protein).
- An additional peak whenever the shift u places one heavy atom upon another, ie. at every vector between two heavy atoms.
It rapidly becomes very complicated but can be solved (this used to be done by inspection) if you have a small number of heavy atoms.
(Figure: Patterson map plotted against u.)
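
As an illustration, the following one-dimensional sketch computes a Patterson function from intensities alone. The two heavy-atom positions (0.20 and 0.50) and the scattering factor of 80 are assumed example values.

```python
import cmath
import math

# Hypothetical 1D cell with two heavy atoms: (scattering factor, fractional x).
heavy_atoms = [(80.0, 0.20), (80.0, 0.50)]
H_MAX = 10

def intensity(h):
    """|F_h|^2, the only quantity the diffraction experiment provides."""
    F = sum(f * cmath.exp(2j * cmath.pi * h * x) for f, x in heavy_atoms)
    return abs(F) ** 2

intensities = {h: intensity(h) for h in range(-H_MAX, H_MAX + 1)}

def patterson(u):
    """P(u) = sum_h |F_h|^2 exp(2*pi*i*h*u).

    Because I_h = I_-h the imaginary parts cancel, leaving a cosine sum.
    """
    return sum(I * math.cos(2 * math.pi * h * u) for h, I in intensities.items())

grid = [k / 100 for k in range(100)]
origin_peak = max(grid, key=patterson)

# Strongest peak away from the origin peak.
off_origin = [u for u in grid if 0.10 <= u <= 0.90]
vector_peak = max(off_origin, key=patterson)

print("largest peak at u =", origin_peak)
print("largest non-origin peak at u =", vector_peak)
```

The non-origin peak falls at u = 0.3 or at the equivalent u = 0.7 (that is, -0.3), the separation of the two atoms. The map gives the vector between the atoms but not their absolute positions.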

11 Difference Patterson Function
Since you are looking for the scattering from the heavy atoms & the protein adds only background, it's more convenient to calculate the difference Patterson:
$P(u,v,w) = \sum_{hkl} (|F_{PH}|^2 - |F_P|^2) \exp(2\pi i\, \mathbf{u}\cdot\mathbf{S})$
This difference Patterson map gives you the vectors between the heavy atoms directly.
- Somewhat easier to interpret than the Patterson for $F_{PH}$ itself since it has less background from protein-protein distances.

12 Solving the Difference Patterson
If you succeed in obtaining a derivative & calculate a Patterson function then you must solve it to proceed.
- The Patterson function gives you a set of constraints in 3D, which are the vectors between heavy atoms.
- There exist algorithms for finding unique solutions which work for a reasonable number of heavy atoms, usually ten or fewer, although I think the record may be more than twenty.
- The number of non-origin peaks is N(N-1) for N heavy atoms.
(Figure: a Patterson map is inverted to give the heavy atom positions.)
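
The N(N-1) count can be checked by listing the vectors between sites. The three sites below are hypothetical, with coordinates given in hundredths of the unit cell so that the arithmetic stays exact.

```python
from itertools import permutations

# Hypothetical heavy-atom sites, coordinates in hundredths of the unit cell.
sites = [(10, 20, 30), (45, 5, 60), (70, 80, 15)]

def patterson_vectors(atoms, cell=100):
    """All interatomic vectors a - b (a != b), wrapped back into the unit cell."""
    return {tuple((p - q) % cell for p, q in zip(a, b))
            for a, b in permutations(atoms, 2)}

vectors = patterson_vectors(sites)
n = len(sites)
print(len(vectors), "non-origin peaks; N(N-1) =", n * (n - 1))
```

For sites in general positions every ordered pair gives a distinct vector. Sites in special arrangements can make peaks overlap, in which case fewer distinct peaks are seen.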

13 Magnitude of intensity changes with heavy atoms
May ask how one (or a few) heavy atoms within a protein of eg. 50 kDa could be seen?
- eg. Hg has 80 e-.
- A protein is a sea of eg. 4,000 non-hydrogen atoms with $f_{\text{average}} \approx 7$ e- (ie. 28,000 e- in total).
- How can you see 80 e- / 28,000 e- = 0.3 %?
On average $|F_{\text{protein}}| \approx f_{\text{average}} \sqrt{N}$ (N is the number of non-hydrogen atoms; this is a random walk result), hence
$I_{\text{protein}} \propto |F_{\text{protein}}|^2 \approx (f_{\text{average}})^2 N$
But for the heavy atom $I_H \propto F_H^2$.
Now $|F_{\text{protein}} + F_H| \approx |f_{\text{average}}| \sqrt{N} + |F_H|$
Hence $I_{PH} \approx |f_{\text{average}} \sqrt{N} + F_H|^2$

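14 $\Delta I_{PH} \approx 2 (f_{\text{average}} \sqrt{N}) F_H + F_H^2$
$I_{PH} \approx |f_{\text{average}} \sqrt{N} + F_H|^2 \approx (f_{\text{average}} \sqrt{N})^2 + 2 (f_{\text{average}} \sqrt{N}) F_H + F_H^2$
Therefore
$\Delta I_{PH} \approx (f_{\text{average}} \sqrt{N})^2 + 2 (f_{\text{average}} \sqrt{N}) F_H + F_H^2 - (f_{\text{average}} \sqrt{N})^2 = 2 (f_{\text{average}} \sqrt{N}) F_H + F_H^2$
The first term can be evaluated for eg. 4,000 non-hydrogen atoms: $2 (f_{\text{average}} \sqrt{N}) F_H = 2 \times 7 \times \sqrt{4000} \times 80 = 70{,}835$
The second term: $F_H^2 = 80^2 = 6{,}400$
And $I_{\text{protein}} \propto |F_{\text{protein}}|^2 \approx (f_{\text{average}})^2 N = 196{,}000$
Hence in this case, taking the dominant first term, $\Delta I_{PH} / I_{\text{protein}} \approx 71/196 = 36\,\%$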
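
The arithmetic above can be reproduced in a few lines. The variable names are illustrative; the numbers are the ones quoted on the slide.

```python
import math

f_average = 7.0     # mean scattering factor of a non-hydrogen atom (electrons)
n_atoms = 4000      # non-hydrogen atoms in the protein
f_heavy = 80.0      # one fully occupied Hg site (electrons)

f_protein = f_average * math.sqrt(n_atoms)   # random-walk estimate of |F_protein|
i_protein = f_protein ** 2                   # (f_average)^2 * N
cross_term = 2 * f_protein * f_heavy         # 2 (f_average sqrt(N)) F_H
heavy_term = f_heavy ** 2                    # F_H^2

print("cross term:   %.0f" % cross_term)
print("heavy term:   %.0f" % heavy_term)
print("I_protein:    %.0f" % i_protein)
print("cross term / I_protein:  %.0f %%" % (100 * cross_term / i_protein))
print("both terms / I_protein:  %.0f %%" % (100 * (cross_term + heavy_term) / i_protein))
```

The cross term alone gives the 36 % quoted above; keeping the $F_H^2$ term as well raises the estimate to about 39 %. Either way the change is far larger than the naive 0.3 % electron-count ratio, because the heavy atom adds coherently to the amplitude.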

15 Average intensity change if one heavy atom is bound to a protein
Protein weight    100 % occupancy    50 % occupancy
14,000            51 %               25 %
28,000            36 %               18 %
56,000            --                 12 %
112,000           --                 9 %
224,000           13 %               6 %
448,000           --                 4 %
(-- marks entries missing from this transcript.)
In practice a change of 14 % makes for a good phasing measurement.

16 Lack of Closure
In practice when you have found the phases experimentally there is some mismatch:
- The mismatch is called the lack of closure & is given the symbol ε.
(Figures: in the ideal case the vectors $F_P$, $F_H$ & $F_{PH}$ close to form a triangle; in reality a gap ε remains.)

17 Phasing Power = $\sqrt{\sum |F_H|^2 / \sum \varepsilon^2}$
From the mismatch you can estimate the phasing power:
Phasing Power = $\sqrt{\sum |F_H|^2 / \sum \varepsilon^2}$
- A phasing power of 4 is excellent & is rare.
- A value between 1 & 2 is acceptable & means that the scattering of the heavy atom is larger than the lack of closure.
(Figure: vector diagram of $F_P$, $F_H$ & $F_{PH}$ with the lack of closure ε.)
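
A minimal sketch of the calculation follows. It takes the lack of closure for each reflection as $\varepsilon = |F_{PH}|_{obs} - |F_P + F_H|$, and the three reflections are invented numbers chosen so that the sums are easy to check by hand.

```python
import math

# Hypothetical reflections:
# (F_P with its current phase estimate, calculated F_H, observed |F_PH|)
reflections = [
    (complex(30, 0), complex(0, 40), 65.0),
    (complex(60, 0), complex(20, 0), 68.0),
    (complex(0, 50), complex(12, -45), 33.0),
]

def lack_of_closure(fp, fh, fph_obs):
    """epsilon: observed |F_PH| minus the amplitude of the vector sum F_P + F_H."""
    return fph_obs - abs(fp + fh)

def phasing_power(data):
    """sqrt( sum |F_H|^2 / sum epsilon^2 ) over all reflections."""
    sum_fh2 = sum(abs(fh) ** 2 for _, fh, _ in data)
    sum_eps2 = sum(lack_of_closure(fp, fh, obs) ** 2 for fp, fh, obs in data)
    return math.sqrt(sum_fh2 / sum_eps2)

closures = [lack_of_closure(*r) for r in reflections]
pp = phasing_power(reflections)
print("lack of closure per reflection:", [round(e, 2) for e in closures])
print("phasing power: %.2f" % pp)
```

Here the closures are 15, -12 and 20, so the phasing power is $\sqrt{4169/769}$, a little above 2.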

18 First Map
Once you have recovered the experimental phases you can make a Fourier transform. At that point, if the electron density is good enough, you can build a structural model into the density. If this goes well you can then make iterative rounds of structural refinement, phase improvement & more model building until you recover a satisfactory model.

19 Anomalous scattering
An assumption to date was that the X-ray scattering factor of an atom, $f_j$, was a real number:
- In physics this is equivalent to assuming that all electrons can be treated as free electrons & therefore contribute a phase change of π to the scattered X-ray.
When you go close to the atomic energy levels of certain atoms you can no longer assume that this holds.
- X-rays scatter with an anomalous scattering term near an absorption edge.
(Figures: a transition between K & L shell electrons; a photoelectron ejected from the K shell.)

20 Atomic absorption coefficient
For example copper has a K-absorption edge at 1.38 Å due to the photoelectric effect. There are also transitions from K to L at 1.43 Å. If you have Cu in your protein & you record near 1.43 Å you will have anomalous scattering from this atom.
(Figure: X-ray absorption plotted against X-ray wavelength, 1.0 to 2.0 Å.)

21 Anomalous scattering
To describe the anomalous scattering (which is wavelength dependent) we modify the atomic scattering factor for the particular heavy atom:
$f_{\text{anom}} = f + \Delta f + i f'' = f + f' + i f''$
Two different symbols ($\Delta f$ & $f'$) are used for the second term depending on what book you look at.
(Figure: Argand diagram showing f, f' & f'' adding up to $f_{\text{anom}}$.)

22 Consequence for Friedel pairs
Earlier we showed that the reflection (h,k,l) & its opposite (-h,-k,-l) have the same intensity since
$F_{-h-k-l} = \sum_j f_j(\mathbf{S}_{hkl}) \exp(2\pi i\, \mathbf{r}_j\cdot\mathbf{S}_{-h-k-l}) = (F_{hkl})^*$
& therefore $I_{hkl} = I_{-h-k-l}$.
- These are called Friedel pairs & have the same intensity.
When you have anomalous scattering this no longer holds, since the heavy atom's scattering factor is then complex.
(Figures: vector diagrams of $F_{hkl}$, $F_{-h-k-l}$, $F_H$, $F_H^*$, $F_{PH}(hkl)$ & $F_{PH}(-h-k-l)$, drawn with no anomalous scattering and with anomalous scattering.)
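
The breakdown of Friedel's law can be seen in a short one-dimensional sketch. The light-atom positions, the heavy-atom position and the values of f, f' and f'' below are illustrative assumptions, not tabulated scattering factors.

```python
import cmath

# Hypothetical 1D structure: light atoms with real scattering factors ...
light_atoms = [(6.0, 0.10), (8.0, 0.35), (7.0, 0.70)]
# ... plus one heavy atom whose factor becomes f + f' + i f'' near its edge.
HEAVY_X = 0.52
F0, F_PRIME, F_DOUBLE_PRIME = 34.0, -8.0, 4.5   # illustrative values only

def structure_factor(h, anomalous):
    """F_h for the whole cell, with or without the anomalous terms."""
    f_heavy = complex(F0 + F_PRIME, F_DOUBLE_PRIME) if anomalous else complex(F0, 0.0)
    total = sum(f * cmath.exp(2j * cmath.pi * h * x) for f, x in light_atoms)
    return total + f_heavy * cmath.exp(2j * cmath.pi * h * HEAVY_X)

def friedel_difference(h, anomalous):
    """|F(h)| - |F(-h)| for one Friedel pair."""
    return abs(structure_factor(h, anomalous)) - abs(structure_factor(-h, anomalous))

for h in (1, 2, 3):
    print(h,
          "no anomalous: %+.3f" % friedel_difference(h, False),
          "with anomalous: %+.3f" % friedel_difference(h, True))
```

With real scattering factors F(-h) is the complex conjugate of F(h) and the difference vanishes. The imaginary f'' term does not change sign under conjugation, so the two amplitudes separate.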

23 When you have anomalous scattering this no longer holds:
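Frequently draw the picture with $F_{PH}(-h-k-l)$ reflected in the real axis. When scaling data it is possible to measure these differences, $|F_{PH}(hkl)| - |F_{PH}(-h-k-l)|$, due to the anomalous signal.
(Figures: vector diagram of $F_{hkl}$, $F_H$, $F_{PH}(hkl)$ & the reflected $F_{PH}(-h-k-l)$ with the contributions f', f'' & -f''; Argand diagram of f, f' & f'' adding up to $f_{\text{anom}}$.)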

24 Δ|F|ano Patterson map
If we make a measurement and are careful to measure all the Friedel pairs we can define:
$\Delta|F_{\text{ano}}| = \{|F_{PH}(hkl)| - |F_{PH}(-h-k-l)|\} \times f'/2f''$
where the scaling factor f'/2f'' is put in for technical normalisation reasons. If we then calculate the Patterson map using
$P_{\text{ano}}(\mathbf{u}) = \sum_{hkl} \Delta|F_{\text{ano}}|^2 \exp(2\pi i\, \mathbf{u}\cdot\mathbf{S})$
this gives us the inter-atomic distance constraints for the anomalous scatterers.
- This is powerful since you only need one set of observations & not two, so the noise level is low even if $\Delta|F_{\text{ano}}|$ is small.

25 Multiple Anomalous Diffraction
It is possible to accurately tune synchrotron radiation.
- It is therefore possible to collect diffraction data from the same crystal at different wavelengths.
Near an X-ray absorption edge the real & imaginary components f' & f'' change rapidly.
(Figures: f' (electrons) and f'' (electrons) plotted against X-ray energy (keV).)

26 Multiple Anomalous Diffraction
As such it is possible to record three diffraction data sets from a single crystal:
- At the peak, where f'' has its maximum.
- At the inflection point, where f' has its minimum (ie. most negative).
- Far removed from the above two wavelengths.
This is called Multiple Anomalous Diffraction (MAD) since you use three wavelengths to get your data.
- You can solve the structure from a single crystal (ie. don't need additional derivatives or another native data set).
- No problems with crystals being non-isomorphous.
The textbook has a detailed description of the algebra of solving structures using MAD, but for this course you can assume it's (more or less) the same as having one native plus two derivative data sets.

27 SAD & SIR
In addition to MIR & MAD there is Single Isomorphous Replacement (SIR) or Single Anomalous Diffraction (SAD).
- The same as MIR or MAD but with one data set rather than several.
In SIR, if your data is good, you take the phase from one heavy atom to be the sum of the two possible phases, each with 50 % probability.
- Then go ahead & calculate a map.
- The wrong choice of phases adds noise.
- Relies on the data being good enough to overcome the additional noise.
SAD is like SIR but with the variation that you also use the anomalous signal to give additional information.
(Figure: vector diagram of $F_P$, $F_H$ & $F_{PH}$.)

28 Molecular Replacement
If you believe that your protein may have a structure similar to that of another protein of known structure you can use that information.
- Solving the structure by making a good guess!
- The problem is to place the possible structure correctly within the unit cell. You then calculate a first electron density map using phases calculated from the known structure.
- Also useful if you have sub-domains of known structure.

29 The Rotation Function
Once you choose a molecular replacement model you first need to determine its orientation.
- Done by comparing the experimental Patterson map with one calculated from your candidate structure.
- The Patterson map is sensitive to the orientation of the molecule but not its position within the unit cell.

30 Cross Rotation Function
An overlap function R of the experimental Patterson P(u) with the rotated version of the candidate model Patterson $P'_r(\mathbf{u}_r)$ is defined as:
$R(\alpha,\beta,\gamma) = \int P(\mathbf{u}) \times P'_r(\mathbf{u}_r)\, d\mathbf{u}$
where α, β & γ are normally the Euler angles describing the rotation. R(α,β,γ) is maximal when the rotation is correct & is not affected by where in the unit cell the candidate structural model is placed.
(Figures: before rotation the correlation is poor because the overlap is poor; once rotated you can recover perfect overlap, ie. R(α,β,γ) is a maximum.)
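
A two-dimensional toy version of the rotation search is sketched below. It is not how production programs implement the rotation function: the four-atom model, the single rotation angle (instead of three Euler angles), the Gaussian overlap score and its width `sigma` are all simplifying assumptions.

```python
import math
from itertools import permutations

def rotate(p, deg):
    """Rotate a 2D point about the origin by deg degrees."""
    a = math.radians(deg)
    return (p[0] * math.cos(a) - p[1] * math.sin(a),
            p[0] * math.sin(a) + p[1] * math.cos(a))

def place(atoms, deg, shift):
    """Rotate a molecule and then translate it."""
    return [(x + shift[0], y + shift[1]) for x, y in (rotate(p, deg) for p in atoms)]

def patterson_vectors(atoms):
    """Intramolecular vectors a - b; these are independent of any translation."""
    return [(a[0] - b[0], a[1] - b[1]) for a, b in permutations(atoms, 2)]

def overlap(vectors_obs, vectors_model, deg, sigma=0.3):
    """Gaussian-weighted overlap of observed vectors with rotated model vectors."""
    score = 0.0
    for v in vectors_model:
        rv = rotate(v, deg)
        for w in vectors_obs:
            d2 = (rv[0] - w[0]) ** 2 + (rv[1] - w[1]) ** 2
            score += math.exp(-d2 / (2 * sigma ** 2))
    return score

model = [(0.0, 0.0), (3.0, 0.0), (1.0, 4.0), (6.0, 1.0)]   # hypothetical search model
crystal = place(model, 40.0, (3.0, -2.0))                  # "unknown" placement
p_obs, p_model = patterson_vectors(crystal), patterson_vectors(model)

# A Patterson is centrosymmetric, so in 2D only 0 to 180 degrees is unique.
best_angle = max(range(0, 180, 5), key=lambda deg: overlap(p_obs, p_model, deg))
print("best rotation (deg):", best_angle)

# Moving the molecule elsewhere in the cell leaves the score unchanged.
elsewhere = patterson_vectors(place(model, 40.0, (-7.5, 11.0)))
print(abs(overlap(elsewhere, p_model, 40) - overlap(p_obs, p_model, 40)) < 1e-9)
```

The search recovers the orientation without knowing the translation, which is why the rotation and translation searches can be run one after the other.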

31 Translation function
Once you have found the optimal rotational orientation you next need to find the correct translation.
- In this case the Patterson function is useless since it's insensitive to translation.
- What one does is move the molecule around in the unit cell, calculate the theoretical $F_{hkl}^{calc}$ values & compare these with the experimental $F_{hkl}^{obs}$.

32 R-factor & Correlation Coefficient
Two numbers are optimised:
The R-factor
$R = \sum_{hkl} \big| |F_{obs}| - k |F_{calc}| \big| \,/\, \sum_{hkl} |F_{obs}|$
which should be minimised, ie. the calculated structure factors are as close as possible to the experimental structure factors.
The standard linear correlation coefficient
$C = \sum_{hkl} (|F_{obs}|^2 - \langle |F_{obs}|^2 \rangle)(|F_{calc}|^2 - \langle |F_{calc}|^2 \rangle) \times \{ \sum_{hkl} (|F_{obs}|^2 - \langle |F_{obs}|^2 \rangle)^2 \times \sum_{hkl} (|F_{calc}|^2 - \langle |F_{calc}|^2 \rangle)^2 \}^{-1/2}$
which should be maximised, ie. when $F_{obs}$ is much greater than the average, $F_{calc}$ should also be greater than the average etc.
Useful values are C > 30 % & R < 55 %.
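
Both scores can be tried out on a toy translation search. Everything in the sketch is an assumption made for illustration: a one-dimensional cell with an inversion centre (so that amplitudes depend on where the molecule sits), a three-atom molecule, "observed" data generated from an offset of 0.17, and a scale factor k chosen as the ratio of summed amplitudes.

```python
import math

def amplitudes(t, molecule, h_max=8):
    """|F_h| for a 1D cell holding the molecule at offset t plus its inverted copy."""
    result = []
    for h in range(1, h_max + 1):
        # Atoms at +(t + x) and -(t + x): sine parts cancel, a cosine sum remains.
        F = sum(2 * f * math.cos(2 * math.pi * h * (t + x)) for f, x in molecule)
        result.append(abs(F))
    return result

def r_factor(f_obs, f_calc):
    """R = sum | |Fobs| - k |Fcalc| | / sum |Fobs|, with a simple overall scale k."""
    k = sum(f_obs) / sum(f_calc)
    return sum(abs(o - k * c) for o, c in zip(f_obs, f_calc)) / sum(f_obs)

def correlation(f_obs, f_calc):
    """Linear correlation coefficient between |Fobs|^2 and |Fcalc|^2."""
    io = [o * o for o in f_obs]
    ic = [c * c for c in f_calc]
    mo, mc = sum(io) / len(io), sum(ic) / len(ic)
    num = sum((a - mo) * (b - mc) for a, b in zip(io, ic))
    den = math.sqrt(sum((a - mo) ** 2 for a in io) * sum((b - mc) ** 2 for b in ic))
    return num / den

molecule = [(6.0, 0.00), (8.0, 0.07), (7.0, 0.19)]   # hypothetical, already oriented
f_obs = amplitudes(0.17, molecule)                   # stands in for measured data

# Offsets t and t + 0.5 are equivalent origin choices here, so scan half the cell.
trials = [k / 100 for k in range(50)]
best_t = min(trials, key=lambda t: r_factor(f_obs, amplitudes(t, molecule)))
f_best = amplitudes(best_t, molecule)
print("best offset:", best_t)
print("R = %.3f, C = %.3f" % (r_factor(f_obs, f_best), correlation(f_obs, f_best)))
print("R at a wrong offset: %.3f" % r_factor(f_obs, amplitudes(0.30, molecule)))
```

With error-free toy data the correct offset gives R = 0 and C = 1. Real data never reach these values, which is why thresholds such as those quoted above are used instead.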

33 Direct methods
If you have a small number of atoms & very good resolution you may recover a structure from a native data set alone.
The concept is that there are phase relations between different Bragg reflections:
$\varphi(\mathbf{h}_1) + \varphi(\mathbf{h}_2) + \varphi(-\mathbf{h}_1 - \mathbf{h}_2) = 0$
Geometrical arguments can be used to show that this holds when atoms sit on the lattice planes but not in between lattice planes (chapter 11 of the textbook).
In practice this assumption holds approximately for very strong reflections, which are strong since most of the atoms are scattering in phase & therefore lie on the same lattice planes.
- This assumption itself breaks down for large proteins.

34 Shake & Bake
Direct methods are very successful for small molecule crystallography.
- In practice you pick a few phases & derive the rest from the triplet relations for a limited number of strong reflections.
- For molecules with > 150 non-hydrogen atoms the unit cell is so evenly filled with atoms that the phase triplet relation doesn't work.
An algorithm has been written to extend this up to about 1000 non-hydrogen atoms, but it requires about 1.2 Å data.
- The principle is that the phase triplet is no longer set to zero but obeys a probability distribution.
- You then shake the phase angles in reciprocal space & bake out the low density regions in real space.

35 Protein Data Bank Entries
This month the number of entries in the protein data bank will surpass 30,000. Current deposit rate is approaching 6,000 per year or 20 per day. - Not all are unique! X-ray crystallography is responsible for approx 75 % - Phasing methods have been very successful!

