Presentation on theme: "Overview of the Phase Problem"— Presentation transcript:
1 Overview of the Phase Problem. Workflow: Protein → Crystal → Data → Phases → Structure. Remember: we can measure reflection intensities; we can calculate structure factor amplitudes from the intensities; we can calculate the structure factors from atomic positions; we need phase information to generate the image.
2 What is the Phase Problem? X-ray diffraction experiment: all phase information is lost. [Figure: atomic positions x, y, z in real space and structure factors Fhkl in reciprocal space.] In the X-ray diffraction experiment, photons are reflected from the crystal lattice planes in different directions, giving rise to the diffraction pattern. Using a variety of detectors (film, image plates, CCD area detectors) we can estimate intensities, but we lose any information about the relative phases of the different reflections.
3 Phases. Let's define a phase, φj, for an individual atom. For an atom at xj = 0.40, yj = 0.25, zj = 0.10 and the (213) plane: φj = 2π[2·(0.40) + 1·(0.25) + 3·(0.10)] = 2π(1.35). For k = 0 (a 2D case) and the (201) plane: φj = 2π[2·(0.40) + 1·(0.10)] = 2π(0.90). Now to understand what this means…
4 (201) Phases. [Figure: the (201) planes drawn in the a-c plane with points A through I; successive planes are labeled 0°, 360° (2π), 720° (4π) and 1080° (6π); atom D lies at (0.4, y, 0.1).] φD = 2π[2·(0.40) + 1·(0.10)] = 2π(0.90)
5 In General for Any Atom (x, y, z). [Figure: atom j at x, y, z shown relative to successive hkl planes separated by dhkl and labeled with phases 2π, 4π and 6π.] Remember: we express any position in the cell as (1) fractional coordinates, pxyz = xja + yjb + zjc, and (2) the sum of integral multiples of the reciprocal axes, hkl = ha* + kb* + lc*.
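The phase arithmetic for a single atom can be checked numerically. The sketch below assumes the standard fractional-coordinate expression φj = 2π(hxj + kyj + lzj), which is what the worked numbers for the example atom follow; the function name `atom_phase` is made up for illustration.

```python
import math

def atom_phase(hkl, xyz):
    """Phase in radians of an atom at fractional coordinates xyz for plane hkl."""
    h, k, l = hkl
    x, y, z = xyz
    return 2.0 * math.pi * (h * x + k * y + l * z)

atom = (0.40, 0.25, 0.10)                 # example atom from the Phases slide
phi_213 = atom_phase((2, 1, 3), atom)     # three-dimensional case
phi_201 = atom_phase((2, 0, 1), atom)     # two-dimensional case, k = 0

print(phi_213 / (2.0 * math.pi))          # phase in cycles, about 1.35
print(math.degrees(phi_201))              # phase in degrees, about 324
```

A phase of 0.90 cycles places the atom nine tenths of the way from one (201) plane to the next.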
7 Why Do We Need the Phase? [Diagram: structure factor and electron density linked by a Fourier transform and an inverse Fourier transform.] In order to reconstruct the molecular image (electron density) from its diffraction pattern, both the intensity and the phase, which can assume any value from 0 to 2π, of each of the thousands of measured reflections must be known.
8 Importance of Phases. [Figure: four images built from the Fourier amplitudes and phases of pictures of Hauptman and Karle: Hauptman amplitudes with Hauptman phases; Karle amplitudes with Karle phases; Karle amplitudes with Hauptman phases; Hauptman amplitudes with Karle phases.] Phases dominate the image! Phase estimates need to be accurate.
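The amplitude and phase swap can be tried on any pair of images. The sketch below is an illustration under stated assumptions: two random arrays stand in for the two pictures, NumPy's FFT stands in for the diffraction transform, and the names `image_a`, `image_b` and `correlation` are invented here.

```python
import numpy as np

rng = np.random.default_rng(0)
image_a = rng.normal(size=(64, 64))   # stand-in for the picture donating amplitudes
image_b = rng.normal(size=(64, 64))   # stand-in for the picture donating phases

fa = np.fft.fft2(image_a)
fb = np.fft.fft2(image_b)

# Combine the amplitudes of A with the phases of B and transform back.
hybrid = np.fft.ifft2(np.abs(fa) * np.exp(1j * np.angle(fb))).real

def correlation(u, v):
    """Pearson correlation between two images."""
    return float(np.corrcoef(u.ravel(), v.ravel())[0, 1])

corr_with_phase_source = correlation(hybrid, image_b)
corr_with_amplitude_source = correlation(hybrid, image_a)
print(corr_with_phase_source, corr_with_amplitude_source)
```

The hybrid correlates strongly with the image that supplied the phases and hardly at all with the one that supplied the amplitudes.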
9 Understanding the Phase Problem. The phase problem can be best understood from a simple mathematical construct. The structure factors (Fhkl) are treated in diffraction theory as complex quantities, i.e., they consist of a real part (Ahkl) and an imaginary part (Bhkl). If the phases, φhkl, were available, the values of Ahkl and Bhkl could be calculated from very simple trigonometry: Ahkl = |Fhkl| cos(φhkl) and Bhkl = |Fhkl| sin(φhkl). This leads to the relationship (Ahkl)² + (Bhkl)² = |Fhkl|² = Ihkl.
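A small numerical example of the trigonometry above, using Python's complex numbers. The amplitude and phase are arbitrary illustrative values and `components` is a made-up helper name.

```python
import cmath
import math

def components(amplitude, phase_deg):
    """Real (A) and imaginary (B) parts of a structure factor."""
    f = cmath.rect(amplitude, math.radians(phase_deg))
    return f.real, f.imag

A, B = components(10.0, 120.0)   # |F| = 10 electrons, phase = 120 degrees
intensity = A**2 + B**2          # equals |F|^2; the phase has dropped out

print(A, B, intensity)           # A is negative: the phase is in the second quadrant
```

Because only A² + B² is measured, every phase angle is equally consistent with the observed intensity.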
10 Argand Diagram. (Ahkl)² + (Bhkl)² = |Fhkl|² = Ihkl. The above relationships are often illustrated using an Argand diagram (right). From the Argand diagram, it is obvious that Ahkl and Bhkl may be either positive or negative, depending on the value of the phase angle, φhkl. Note: the units of Ahkl, Bhkl and Fhkl are electrons.
11 The Structure Factor. [Figure: atomic scattering factors, f0 plotted against sinΘ/λ.] Here fj is the atomic scattering factor. The scattering factor for each atom type in the structure is evaluated at the correct sinΘ/λ. That value is the scattering ability of that atom. Remember: we now have an atomic scattering vector with a magnitude f0 and direction φj.
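12 The Structure Factor. Sum of all individual atom contributions. [Figure: Argand diagram with real and imaginary axes showing the individual atom vectors fj adding up to the resultant Fhkl, with components Ahkl and Bhkl.]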
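The vector sum of atomic contributions can be sketched in a few lines. This assumes the standard expression Fhkl = Σ fj exp[2πi(hxj + kyj + lzj)], which is not quoted from the slide. The atom list is hypothetical, and each scattering factor is held constant instead of being evaluated at the correct sinΘ/λ.

```python
import cmath
import math

def structure_factor(hkl, atoms):
    """Sum the atomic scattering vectors; atoms is a list of (f, (x, y, z))."""
    h, k, l = hkl
    total = 0j
    for f, (x, y, z) in atoms:
        total += f * cmath.exp(2j * math.pi * (h * x + k * y + l * z))
    return total

# Hypothetical two-atom cell: (scattering factor in electrons, fractional xyz)
atoms = [(6.0, (0.40, 0.25, 0.10)),
         (8.0, (0.10, 0.70, 0.35))]

F_213 = structure_factor((2, 1, 3), atoms)
F_000 = structure_factor((0, 0, 0), atoms)

print(abs(F_213), math.degrees(cmath.phase(F_213)))  # resultant amplitude and phase
print(F_000.real)                                    # all atoms scatter in phase at 000
```

The real and imaginary parts of the returned value are the A and B components of the Argand diagram.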
13 Electron Density. Remember that the electron density (the image of the molecule) is the Fourier transform of the structure factor Fhkl. Thus ρ(x, y, z) = (1/V) Σhkl Fhkl exp[−2πi(hx + ky + lz)]. Here V is the volume of the unit cell. In practice, the electron density for one three-dimensional unit cell is calculated by starting at x, y, z = 0, 0, 0 and stepping incrementally along each axis, summing the terms of the Fourier series for all hkl (as limited by the resolution of the data) at each point in space.
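The stepping-and-summing procedure can be illustrated in one dimension. The sketch below assumes a hypothetical two-atom cell with constant scattering powers, a cell "volume" of 1, and a resolution limit of |h| ≤ 10. It uses the usual sign convention: exp(+2πihx) in the structure factor and exp(−2πihx) in the density.

```python
import numpy as np

atoms = [(8.0, 0.25), (6.0, 0.60)]   # (scattering power, fractional x); hypothetical
h = np.arange(-10, 11)               # reflections allowed by the resolution limit
V = 1.0                              # unit cell "volume" for this 1D toy

# Structure factors calculated from the atomic positions
F = np.array([sum(f * np.exp(2j * np.pi * hh * x) for f, x in atoms) for hh in h])

# Step along the axis and sum the Fourier series at every grid point
grid = np.arange(100) / 100
rho = np.array([(F * np.exp(-2j * np.pi * h * x)).sum() for x in grid]).real / V

peak_x = grid[np.argmax(rho)]
print(peak_x)                        # position of the heavier atom
```

With amplitudes and phases both available, the density peaks fall on the atomic positions; truncating the series at lower |h| broadens the peaks.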
14 Solving the Phase Problem. Small molecules: Direct Methods; Patterson Methods; Molecular Replacement. Macromolecules: Multiple Isomorphous Replacement (MIR); Multiwavelength Anomalous Dispersion (MAD); Single Isomorphous Replacement (SIR); Single Wavelength Anomalous Scattering (SAS); Direct Methods (special cases).
15 Solving the Phase Problem. SMALL MOLECULES: The use of Direct Methods has essentially solved the phase problem for well-diffracting small molecule crystals. MACROMOLECULES: Today, anomalous scattering techniques such as MAD or SAS are the most common techniques used for de novo structure determination of macromolecules. Both techniques require the presence of one or more anomalous scatterers in the crystal.
16 SIR and SAS Methods. Need a heavy atom (lots of electrons) or an anomalous scatterer (large anomalous scattering signal) in the crystal. SIR: heavy atoms are usually soaked in. SAS: anomalous scatterers are usually engineered in as selenomethionine labels; they can also be soaked in. SIR: collect a native and a derivative data set (2 sets total). SAS: collect one highly redundant data set and keep anomalous pairs separate during processing. SAS: may want to choose a scatterer or wavelength that enhances the anomalous signal. Must find the heavy atoms or anomalous scatterers: can use Patterson analysis or direct methods. Must resolve the bimodal ambiguity: use solvent flattening or a similar technique.
17 Heavy Atom Derivatives. Heavy atom derivatives MUST be isomorphous. Heavy atom derivatives are generally prepared by soaking crystals in dilute (mM) solutions of heavy atom salts (see Table II below for some examples). Crystal cracking is generally a good indication that the heavy atom is interacting with the crystal lattice, and suggests that a good derivative can be obtained by soaking the crystal in a more dilute solution. Once derivative data have been collected, the merging R factor (Rmerge) between the native and derivative data sets can be used to check for heavy atom incorporation and isomorphism. Rmerge values for isomorphous derivatives range from 0.05 to 0.15. Values below 0.05 indicate that there is little heavy atom incorporation. Values above 0.15 indicate a lack of isomorphism between the two crystals.
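The native-versus-derivative check can be sketched as follows. The slide does not give the formula, so this assumes one common definition, R = Σ| |FPH| − |FP| | / Σ|FP|, over reflections common to both (already scaled) data sets. The function names and the four amplitudes are invented for illustration; the 0.05 and 0.15 cut-offs are the ones quoted above.

```python
def merging_r(f_native, f_derivative):
    """Fractional amplitude difference between native and derivative data (assumed definition)."""
    numerator = sum(abs(fph - fp) for fp, fph in zip(f_native, f_derivative))
    return numerator / sum(f_native)

def interpret(r):
    """Apply the rule-of-thumb cut-offs quoted on the slide."""
    if r < 0.05:
        return "little heavy atom incorporation"
    if r > 0.15:
        return "lack of isomorphism"
    return "plausible isomorphous derivative"

f_native = [100.0, 200.0, 150.0, 50.0]       # hypothetical |FP| values
f_derivative = [110.0, 180.0, 165.0, 55.0]   # hypothetical |FPH| values

r = merging_r(f_native, f_derivative)
print(r, interpret(r))
```

In practice the two data sets must be placed on a common scale before the comparison means anything.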
18 Finding the Heavy Atoms or Anomalous Scatterers. The Patterson function: an |F|² Fourier transform with φ = 0; a vector map (u, v, w instead of x, y, z); maps all inter-atomic vectors; gives N² vectors (where N = number of atoms). The difference Patterson map: for SIR, |ΔF|² = |Fnat − Fder|²; for SAS, |ΔF|² = |Fhkl − F-h-k-l|². The Patterson map is centrosymmetric: peaks appear at u, v, w and −u, −v, −w. Peak height is proportional to ZiZj. Peak positions u, v, w give the heavy atom x, y, z (Harker analysis). The origin (0, 0, 0) maps the vector of each atom to itself. Figure from Glusker, Lewis and Rossi.
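The properties listed for the Patterson function can be seen in a one-dimensional toy. This is a sketch only: two point atoms on a 64-point grid are assumed, and NumPy's FFT stands in for the crystallographic Fourier summation.

```python
import numpy as np

n = 64
rho = np.zeros(n)
rho[10] = 3.0            # point atom of weight 3 at grid position 10
rho[35] = 2.0            # point atom of weight 2 at grid position 35

F = np.fft.fft(rho)
intensity = np.abs(F) ** 2                 # all that the experiment measures

# Fourier transform of |F|^2 with every phase set to zero
patterson = np.fft.ifft(intensity).real

peaks = np.flatnonzero(patterson > 1e-6)
print(peaks.tolist())                      # origin plus the vector +25 and its inverse
print(patterson[peaks])                    # heights: sum of squares, then products of weights
```

The peaks sit at the interatomic vector (35 − 10 = 25) and at its centrosymmetric mate (64 − 25 = 39). Their height is the product of the two atomic weights, and the origin peak collects every atom's vector to itself.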
19 Harker Analysis. Patterson symmetry = space group symmetry minus translations. Example: space group P21, with symmetry operators x, y, z and −x, 1/2+y, −z. Differences between symmetry-related positions: (x, y, z) − (x, y, z) = 0, 0, 0; (x, y, z) − (−x, 1/2+y, −z) = 2x, −1/2, 2z; (−x, 1/2+y, −z) − (x, y, z) = −2x, 1/2, −2z; (−x, 1/2+y, −z) − (−x, 1/2+y, −z) = 0, 0, 0. The Harker section v = 1/2 is where to look for heavy atom vectors, at ±2x, 1/2, ±2z. Automated programs such as SOLVE, SHELXD and BNP are available.
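The P21 Harker vectors can be generated mechanically from the two symmetry operators. The sketch below uses exact fractions so that the v = 1/2 section comes out exactly; the heavy atom site and the function names are hypothetical.

```python
from fractions import Fraction as Fr

def p21_positions(x, y, z):
    """The two symmetry-equivalent positions of P21: x,y,z and -x,1/2+y,-z."""
    return [(x, y, z), (-x, y + Fr(1, 2), -z)]

def harker_vectors(position):
    """Difference vectors between distinct symmetry mates, wrapped into the unit cell."""
    mates = p21_positions(*position)
    vectors = set()
    for i, a in enumerate(mates):
        for j, b in enumerate(mates):
            if i != j:
                vectors.add(tuple((ai - bi) % 1 for ai, bi in zip(a, b)))
    return vectors

heavy_atom = (Fr(1, 10), Fr(3, 10), Fr(1, 5))   # hypothetical site: x=0.1, y=0.3, z=0.2
vectors = harker_vectors(heavy_atom)

for u, v, w in sorted(vectors):
    print(u, v, w)                              # both vectors lie on v = 1/2
```

Reading a peak at u = 2x, w = 2z on the v = 1/2 section back to x and z recovers the site. The y coordinate is not fixed by this section, because the origin along the screw axis is arbitrary in P21.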
21 The Phase Triangle Relationship. [Figure from Glusker, Lewis and Rossi: phase triangle construction with points O, L, M, N and Q; ΔOLM = ΔOLN.] FPH = FP + FH, so we need the value of FH. FP, FPH, FH and −FH are vectors (have direction). FP: obtained from native data. FPH: obtained from derivative or anomalous data. FH: obtained from Patterson analysis.
22 The Phase Triangle Relationship. [Figure from Glusker, Lewis and Rossi: phase triangle with points O, L, M, N and Q.] In simplest terms, isomorphous replacement finds the orientation of the phase triangle from the orientation of one of its sides. It turns out, however, that there are two possible ways to orient the triangle if we fix the orientation of one of its sides.
23 Single Isomorphous Replacement. [Figure from Glusker, Lewis and Rossi.] Note: FP = protein; FH = heavy atom; FP1 = heavy atom derivative. The center of the FP1 circle is placed at the end of the vector −FH1. X1 = φtrue or φfalse; X2 = φtrue or φfalse. The situation of two possible SIR phases is called the "phase ambiguity" problem, since we obtain both a true and a false phase for each reflection. Both phase solutions are equally probable, i.e. the phase probability distribution is bimodal.
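The two intersection points of the phasing circles can be computed directly. The sketch below assumes the vector relation FPH = FP + FH and applies the law of cosines to the resulting triangle, giving φP = φH ± arccos[(|FPH|² − |FP|² − |FH|²) / (2|FP||FH|)]. The amplitudes, phases and function name are illustrative only.

```python
import cmath
import math

def sir_phase_choices(fp_amp, fph_amp, fh):
    """Two candidate protein phases (radians) from |FP|, |FPH| and the complex FH."""
    fh_amp, fh_phase = abs(fh), cmath.phase(fh)
    cos_term = (fph_amp**2 - fp_amp**2 - fh_amp**2) / (2.0 * fp_amp * fh_amp)
    cos_term = max(-1.0, min(1.0, cos_term))   # guard against rounding or noisy data
    delta = math.acos(cos_term)
    return fh_phase + delta, fh_phase - delta

# Synthetic test case: pretend the true protein phase is 70 degrees
fp_true = cmath.rect(100.0, math.radians(70.0))
fh = cmath.rect(30.0, math.radians(20.0))      # heavy atom vector, known from the Patterson
fph_amp = abs(fp_true + fh)                    # only the amplitude is measured

choices = sir_phase_choices(abs(fp_true), fph_amp, fh)
print([round(math.degrees(c), 3) for c in choices])
```

One candidate reproduces the true phase and the other is its mirror image about the direction of FH. Nothing in the single-derivative data says which is which.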
24 Resolving the Phase Ambiguity. [Figure from Glusker, Lewis and Rossi.] Note: FP = protein; FH = heavy atom; FP1 = heavy atom derivative. The center of the FP1 circle is placed at the end of the vector −FH1. X1 = φtrue or φfalse; X2 = φtrue or φfalse. Add more information: add another derivative (Multiple Isomorphous Replacement); use a density modification technique (solvent flattening); add anomalous data (SIR with anomalous scattering).
25 Multiple Isomorphous Replacement. [Figure from Glusker, Lewis and Rossi.] Note: FP = protein; FH1 = heavy atom #1; FH2 = heavy atom #2; FP1 = heavy atom derivative; FP2 = heavy atom derivative. The centers of the FP1 and FP2 circles are placed at the ends of the vectors −FH1 and −FH2, respectively. X1 = φtrue; X2 = φfalse; X3 = φfalse. Exact overlap at X1 is dependent on data accuracy and on heavy atom (HA) accuracy; the mismatch is called lack of closure. We still get two solutions, one true and one false, for each reflection from the second derivative. The true solutions should be consistent between the two derivatives, while the false solutions should show a random variation.
26 Solvent Flattening. Similar to noise filtering. Resolves the SIR or SAS phase ambiguity (B.C. Wang, 1985). Electron density can't be negative. Use an iterative process to enhance the true phase! [Figure from Glusker, Lewis and Rossi.]
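One cycle of the iterative process can be sketched on a one-dimensional toy. This is not the ISIR/ISAS algorithm: it is a bare-bones illustration that assumes the solvent mask is already known, replaces solvent density by its mean, truncates negative density, and reads new phases off the back-transform. All names and numbers are invented.

```python
import numpy as np

def solvent_flatten_cycle(f_obs, phases, solvent_mask):
    """One density-modification cycle: map -> flatten solvent -> remove negatives -> new phases."""
    rho = np.fft.ifftn(f_obs * np.exp(1j * phases)).real
    rho_mod = rho.copy()
    rho_mod[solvent_mask] = rho[solvent_mask].mean()   # solvent should be featureless
    rho_mod = np.clip(rho_mod, 0.0, None)              # electron density can't be negative
    new_phases = np.angle(np.fft.fftn(rho_mod))
    return rho_mod, new_phases

# Toy 1D "crystal": protein occupies grid points 20..39, the rest is solvent
rng = np.random.default_rng(1)
true_rho = np.zeros(64)
true_rho[20:40] = rng.uniform(1.0, 3.0, size=20)
solvent_mask = true_rho == 0

f_true = np.fft.fft(true_rho)
f_obs = np.abs(f_true)                                          # measured amplitudes
start_phases = np.angle(f_true) + rng.normal(scale=0.5, size=64)  # poor starting phases

rho_mod, new_phases = solvent_flatten_cycle(f_obs, start_phases, solvent_mask)
print(rho_mod.min(), np.ptp(rho_mod[solvent_mask]))
```

In a real program the new phases would be recombined with the observed amplitudes and the experimental phase probabilities, and the cycle repeated until the phases stop changing.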
27 The solvent flattening process was made practical by the introduction of the ISIR/ISAS program suite (Wang, 1985), and other phasing programs such as DM and PHASES are based on this approach.
28 Handedness Can be Determined by Solvent Flattening
29 Does the Correct Hand Make a Difference? YES! The wrong hand will give the mirror image!
30 Anomalous Dispersion Methods. All elements display an anomalous dispersion (AD) effect in X-ray diffraction. For light elements such as C, N and O, AD effects are negligible. For heavier elements, especially when the X-ray wavelength approaches an atomic absorption edge of the element, these AD effects can be very large. The scattering power of an atom exhibiting AD effects is fAD = fn + f′ + if″, where fn is the normal scattering power of the atom in the absence of AD effects; f′ arises from the AD effect and is a real term (positive or negative) added to fn; f″ is an imaginary term which also arises from the AD effect; f″ is always positive and 90° ahead of (fn + f′) in phase angle. The values of f′ and f″ are highly dependent on the wavelength of the X-radiation. In the absence of AD effects, Ihkl = I-h-k-l (Friedel's Law). With AD effects, Ihkl ≠ I-h-k-l (Friedel's Law breaks down). Accurate measurement of Friedel pair differences can be used to extract starting phases if the AD effect is large enough.
31 Breakdown of Friedel's Law. (Fhkl, left) Fn represents the total scattering by "normal" atoms without AD effects, f′ represents the sum of the normal and real AD scattering values (fn + f′), f″ is the imaginary AD component and appears 90° (at a right angle) ahead of the f′ vector, and the total scattering is the vector F+++. (F-h-k-l, right) F−n is the inverse of Fn (at −h, −k, −l) and its f′ is the inverse of the f′ on the left; the f″ vector is once again 90° ahead of f′. The resultant vector, F−−− in this case, is obviously shorter than the F+++ vector.
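The breakdown of Friedel's Law is easy to reproduce numerically by giving one atom a complex scattering factor. The two-atom cell and the f′ and f″ values in the sketch below are invented for illustration and are not tabulated values for any element.

```python
import cmath
import math

def structure_factor(hkl, atoms):
    """Sum f * exp(2*pi*i*(hx + ky + lz)); f may be complex for an anomalous scatterer."""
    h, k, l = hkl
    return sum(f * cmath.exp(2j * math.pi * (h * x + k * y + l * z))
               for f, (x, y, z) in atoms)

light = (6.0, (0.10, 0.20, 0.30))                         # "normal" atom
heavy_normal = (34.0, (0.40, 0.25, 0.10))                 # heavy atom, AD ignored
heavy_anomalous = (34.0 - 1.0 + 3.0j, (0.40, 0.25, 0.10)) # fn + f' + i f'' (made-up values)

plus, minus = (2, 1, 3), (-2, -1, -3)

for label, heavy in (("no AD", heavy_normal), ("with AD", heavy_anomalous)):
    f_plus = abs(structure_factor(plus, [light, heavy]))
    f_minus = abs(structure_factor(minus, [light, heavy]))
    print(label, f_plus, f_minus, f_plus - f_minus)
```

Without the imaginary term the Friedel mates have identical amplitudes. With it, the two amplitudes differ by a small fraction of |F|.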
32 Collecting Anomalous Scattering Data. Anomalous scatterers, such as selenium, are generally incorporated into the protein during expression of the protein or are soaked into the crystals in a manner similar to preparing a heavy atom derivative. Bromine, iodine, xenon and traditional heavy atom compounds are also good anomalous scatterers. The anomalous signal, the difference between |F+++| and |F−−−|, is generally about one order of magnitude smaller than that between |FPH(hkl)| and |FP(hkl)|. Thus, the signal-to-noise (S/n) level in the data plays a critical role in the success of anomalous scattering experiments, i.e. the higher the S/n in the data, the greater the probability of producing an interpretable electron density map. The anomalous signal can be optimized by data collection at or near the absorption edge of the anomalous scatterer. This requires a tunable X-ray source such as a synchrotron. The S/n of the data can also be increased by collecting redundant data. The two common anomalous scattering experiments are Multiwavelength Anomalous Dispersion (MAD) and single wavelength anomalous scattering/diffraction (SAS or SAD). The SAS technique is becoming more popular since it does not require a tunable X-ray source.
35 Multiwavelength Anomalous Dispersion. [Figure from Glusker, Lewis and Rossi.] Note: FP = protein; FH1 = heavy atom; F+PH = F+++; F−PH = F−−−; F+H″ = f″+++; F−H″ = f″−−−. The centers of the F+PH and F−PH circles are placed at the ends of the vectors −F+H″ and −F−H″, respectively. In the MAD experiment a strong anomalous scatterer is introduced into the crystal and data are recorded at several wavelengths (peak, inflection and remote) near the X-ray absorption edge of the anomalous scatterer. The phase ambiguity is resolved in a manner similar to the use of multiple derivatives in the MIR technique.
36 Single Wavelength Anomalous Scattering. The SAS method, which combines the use of SAS data and solvent flattening to resolve the phase ambiguity, was first introduced in the ISAS program (Wang, 1985). The technique is very similar to resolving the phase ambiguity in SIR data. The SAS method does not require a tunable source, and successful structure determination can be carried out using a home X-ray source on crystals containing anomalous scatterers with sufficiently large f″, such as iron, copper, iodine, xenon and many heavy atom salts. The ultimate goal of the SAS method is the use of S-SAS to phase protein data, since most proteins contain sulfur. However, sulfur has a very weak anomalous scattering signal, with f″ = 0.56 e- for Cu X-rays. The S-SAS method requires careful data collection and crystals that diffract to 2 Å resolution. A high symmetry space group (more internal symmetry equivalents) increases the chance of success. The use of soft X-rays such as Cr Kα X-rays doubles the sulfur signal (f″ = 1.14 e-). There are over 20 S-SAS structures in the Protein Data Bank.
37 What is the Limit of the SAS Method? f″ = 0.56 e- using Cu Kα X-rays
38 Molecular Replacement. Molecular replacement has proven effective for solving macromolecular crystal structures based upon the knowledge of homologous structures. The method is straightforward and reduces the time and effort required for structure determination because there is no need to prepare heavy atom derivatives and collect their data. Model building is also simplified, since little or no chain tracing is required. The 3-dimensional structure of the search model must be very close (< 1.7 Å r.m.s.d.) to that of the unknown structure for the technique to work. Sequence homology between the model and the unknown protein is helpful but not strictly required. Success has been observed using search models having as low as 17% sequence similarity. Several computer programs, such as AmoRe, X-PLOR/CNS and PHASER, are available for MR calculations.
39 Molecular Replacement. Use a model of the protein to estimate phases. Must be a structural homologue (RMSD < 1.7 Å). Two step process: 1. find the orientation of the model (red ==> black); 2. find the location of the orientated model (black ==> blue). Figure source: px.cryst.bbk.ac.uk/03/sample/molrep.htm
40 Molecular Replacement. Use a model of the protein to estimate phases. Need to determine the model's orientation in the crystal's unit cell. Use a Patterson rotation search (α, β, γ) in the zyz convention: the coordinate system is rotated by an angle α around the original z axis, then by an angle β around the new y axis, and then by an angle γ around the final z axis.
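The three rotation angles of the zyz convention can be turned into a single rotation matrix. The sketch below writes the successive rotations about the moving axes as the product Rz(first) · Ry(second) · Rz(third), which is the active form of the sequence. Individual programs differ in sign and active-versus-passive conventions, so treat this as an illustration and not as the definition used by any particular MR package.

```python
import numpy as np

def rot_z(angle_deg):
    """Rotation about the z axis."""
    c, s = np.cos(np.radians(angle_deg)), np.sin(np.radians(angle_deg))
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

def rot_y(angle_deg):
    """Rotation about the y axis."""
    c, s = np.cos(np.radians(angle_deg)), np.sin(np.radians(angle_deg))
    return np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])

def euler_zyz(alpha, beta, gamma):
    """z rotation, then rotation about the new y, then about the final z."""
    return rot_z(alpha) @ rot_y(beta) @ rot_z(gamma)

R = euler_zyz(30.0, 45.0, 60.0)
model = np.array([[1.0, 0.0, 0.0],      # hypothetical model coordinates, one atom per row
                  [0.0, 2.0, 0.0],
                  [0.0, 0.0, 3.0]])
rotated = model @ R.T                   # apply the rotation to every atom

print(np.round(R, 3))
```

A rotation search evaluates the overlap of model and observed Pattersons on a grid of such angle triplets and keeps the orientation with the highest score.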
41 Molecular Replacement. Use a model of the protein to estimate phases. Need to determine the orientated model's location in the crystal's unit cell. Use an R-factor search: the orientated model is stepped through the unit cell using small increments in x, y and z (e.g. x => x + step). The point where R is lowest represents the correct location. Other, faster methods are available, e.g. in PHASER.
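A brute-force R-factor translation search can be sketched in one dimension. The toy below assumes a cell with an inversion centre at the origin, since without symmetry the amplitudes would not depend on the shift. It also assumes a hypothetical three-atom model with unit scattering factors and noise-free "observed" amplitudes, and the function names are made up.

```python
import numpy as np

H = np.arange(1, 16)                       # reflections h = 1..15
MODEL = np.array([0.00, 0.07, 0.19])       # correctly orientated model, fractional x

def amplitudes(shift):
    """|F(h)| for the shifted model plus its inversion mates at -x."""
    x = MODEL + shift
    return np.abs(2.0 * np.cos(2.0 * np.pi * np.outer(H, x)).sum(axis=1))

def r_factor(f_obs, f_calc):
    """R = sum | |Fobs| - |Fcalc| | / sum |Fobs|."""
    return np.abs(f_obs - f_calc).sum() / f_obs.sum()

true_shift = 0.17
f_obs = amplitudes(true_shift)             # stands in for the measured data

# Step the model through half the cell; a shift of 1/2 is an equivalent origin here
trial_shifts = np.arange(0, 50) / 100
r_values = np.array([r_factor(f_obs, amplitudes(s)) for s in trial_shifts])
best_shift = trial_shifts[np.argmin(r_values)]

print(best_shift, r_values.min())
```

With real data the minimum R is well above zero, and the search has to cover x, y and z on a grid fine enough not to step over the minimum.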