THE PHASE PROBLEM Electron Density

Presentation transcript:

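THE PHASE PROBLEM. Electron Density. The structure factor is defined as: $F_{\text{unit cell}}(\mathbf{S}) = \int \rho(\mathbf{r}) \exp(2\pi i\,\mathbf{r}\cdot\mathbf{S})\,d\mathbf{r}$. Using the inverse Fourier transform: $\rho(\mathbf{r}) = \int F(\mathbf{S}) \exp(-2\pi i\,\mathbf{r}\cdot\mathbf{S})\,d\mathbf{S}$. In practice you make a discrete inverse Fourier transform: $\rho(\mathbf{r}) = \sum_{hkl} F_{hkl} \exp(-2\pi i\,\mathbf{r}\cdot\mathbf{S}_{hkl})$. But you measure only the X-ray diffraction intensities $I_{hkl} \propto |F_{hkl}|^2$. The fact that you cannot measure the complex $F_{hkl}$ (amplitude and phase) directly is called THE PHASE PROBLEM.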
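The loss of phase information can be illustrated with a minimal one-dimensional sketch. The atom list, the index cutoff `H_MAX` and the 100-point grid are hypothetical choices made for illustration only; the sign conventions follow the formulas above.

```python
import cmath
import math

# Hypothetical 1D "crystal": (scattering factor f_j, fractional coordinate x_j).
ATOMS = [(6.0, 0.10), (8.0, 0.35), (16.0, 0.70)]
H_MAX = 15  # highest index h included in the Fourier sums


def structure_factor(h, atoms=ATOMS):
    """F_h = sum_j f_j exp(2 pi i h x_j)."""
    return sum(f * cmath.exp(2j * math.pi * h * x) for f, x in atoms)


def density(x, coeffs):
    """rho(x) = sum_h F_h exp(-2 pi i h x); the real part is returned."""
    return sum((F * cmath.exp(-2j * math.pi * h * x)).real
               for h, F in coeffs.items())


# Full complex structure factors (not available from experiment).
true_F = {h: structure_factor(h) for h in range(-H_MAX, H_MAX + 1)}

# What the experiment delivers: intensities only, so the phase is unknown.
intensities = {h: abs(F) ** 2 for h, F in true_F.items()}
zero_phase_F = {h: complex(math.sqrt(i), 0.0) for h, i in intensities.items()}

grid = [i / 100 for i in range(100)]
peak_true = max(grid, key=lambda x: density(x, true_F))
peak_zero_phase = max(grid, key=lambda x: density(x, zero_phase_F))

print("highest peak with true phases:", peak_true)
print("highest peak with all phases set to zero:", peak_zero_phase)
```

With the true phases the strongest peak sits on the heaviest atom; with the measured amplitudes and all phases set to zero it moves to the origin, so the amplitudes alone do not locate the atoms.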

Solutions to the Phase Problem.
Direct methods: based upon systematic relations between certain reflections. They need high-resolution data and relatively small systems, and are overwhelmingly the most popular approach for small-molecule structures.
Molecular replacement: find a molecule of known structure that is close enough to your protein of interest to provide a good first guess. This is becoming more popular as the spectrum of possible structures is filled up.
Heavy atom methods: an old and powerful way of finding phases. Soak an atom that is a strong scatterer (e.g. Hg, Fe, Pb, I, Se) into your crystal, or replace the methionines in your protein with selenomethionine. Then use Multiple Isomorphous Replacement, or multi-wavelength or single-wavelength anomalous diffraction.

Heavy atom methods. You must search for a heavy atom that binds within your crystal and doesn't destroy the crystal lattice. This can be extremely frustrating, with many soaking, freezing and diffraction experiments. The heavy atom must bind in an ordered way to the protein. Suppose you have a protein with structure factor $F_P$. For every reflection measured, the heavy atom adds a term $F_H$ to the scattering, so that $F_{PH} = F_P + F_H$.

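Heavy atom methods. Assuming that $F_H$ has a phase angle $\phi_H$ and $F_P$ a phase angle $\phi_P$, the structure factor amplitude will be perturbed by $\Delta F_{PH} = |F_{PH}| - |F_P| \approx |F_H| \cos(\phi_P - \phi_H)$, which holds when $|F_H|$ is small compared with $|F_P|$. [Figure: vector triangle of $F_P$, $F_H$ and $F_{PH}$, with the angle between $F_{PH}$ and $F_H$ marked.]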
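As a quick numerical check of this relation, the sketch below compares the exact amplitude change with the cosine approximation. The amplitudes and phases are hypothetical values picked so that $|F_H|$ is much smaller than $|F_P|$.

```python
import cmath
import math


def delta_f_exact(amp_p, phi_p, amp_h, phi_h):
    """|F_PH| - |F_P| computed from the full vector sum F_PH = F_P + F_H."""
    f_p = cmath.rect(amp_p, phi_p)
    f_h = cmath.rect(amp_h, phi_h)
    return abs(f_p + f_h) - abs(f_p)


def delta_f_approx(amp_h, phi_p, phi_h):
    """First-order estimate |F_H| cos(phi_P - phi_H)."""
    return amp_h * math.cos(phi_p - phi_h)


# Hypothetical reflection: |F_P| = 100 at 40 degrees, |F_H| = 5 at 100 degrees.
phi_p = math.radians(40.0)
phi_h = math.radians(100.0)
exact = delta_f_exact(100.0, phi_p, 5.0, phi_h)
approx = delta_f_approx(5.0, phi_p, phi_h)

print("exact change in amplitude:", round(exact, 3))
print("cosine approximation:     ", round(approx, 3))
```

The two values agree to within a few percent of $|F_H|$ here; the agreement degrades as the heavy-atom contribution grows relative to the protein.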

Heavy atom methods. If you can find the location of your heavy atom, $\mathbf{r}_H$, then you can calculate the heavy atom structure factor $F_H = f_H \exp(2\pi i\,\mathbf{r}_H\cdot\mathbf{S}_{hkl})$ just like any other atom within the protein. If you have already measured $|F_P|$ and $|F_{PH}|$ you can recover a constraint on the phases for the protein. [Figure: vector triangle of $F_P$, $F_H$ and $F_{PH}$.]

Additional constraints. A single heavy-atom derivative gives you two possible phases for every reflection (h,k,l), either of which may be correct. For the approach to work the crystals must be isomorphous (i.e. same a, b, c, α, β, γ and space group). In order to determine which phase is correct you need to find additional derivatives. This is called Multiple Isomorphous Replacement. [Figure: phase diagram showing $F_P$, $F_{PH}$ and $F_H$.]
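The two-fold ambiguity follows from the triangle $F_{PH} = F_P + F_H$: the law of cosines fixes only the cosine of $\phi_P - \phi_H$. A minimal sketch, with a hypothetical function name and simulated error-free "measurements":

```python
import cmath
import math


def sir_phase_candidates(amp_p, amp_ph, f_h):
    """Return the two protein phases (radians) allowed by one derivative.

    From |F_PH|^2 = |F_P|^2 + |F_H|^2 + 2 |F_P| |F_H| cos(phi_P - phi_H).
    """
    amp_h = abs(f_h)
    phi_h = cmath.phase(f_h)
    cos_delta = (amp_ph ** 2 - amp_p ** 2 - amp_h ** 2) / (2.0 * amp_p * amp_h)
    cos_delta = max(-1.0, min(1.0, cos_delta))  # guard against rounding
    delta = math.acos(cos_delta)
    return phi_h + delta, phi_h - delta


# Simulated data: the true protein phase (40 degrees) is hidden in practice.
true_f_p = cmath.rect(100.0, math.radians(40.0))
f_h = cmath.rect(20.0, math.radians(100.0))   # known once the heavy atom is located
amp_ph = abs(true_f_p + f_h)                  # "measured" derivative amplitude

candidates = sir_phase_candidates(abs(true_f_p), amp_ph, f_h)
print([round(math.degrees(c), 1) for c in candidates])
```

One of the two candidates is the true phase; the other is its mirror image about the direction of $F_H$.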

Additional constraints. If you have a second derivative you have a second constraint. It is better to draw the diagram with $F_P$ centred at the origin rather than $F_H$ and $F_{PH}$; this then represents $F_P = F_{PH} - F_H$. You should recover one place where the three circles intersect; this is your solution for the phase. [Figure: intersecting circles, with the possible solutions for the light green and light blue measurements and for the dark green and dark blue measurements marked.]
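With two derivatives, each gives its own pair of candidate phases and the common one is kept. The sketch below assumes error-free simulated data and hypothetical heavy-atom structure factors; real data would need the probabilistic treatment rather than an exact match.

```python
import cmath
import math


def phase_candidates(amp_p, amp_ph, f_h):
    """Two protein phases (radians) consistent with one derivative."""
    amp_h = abs(f_h)
    cos_delta = (amp_ph ** 2 - amp_p ** 2 - amp_h ** 2) / (2.0 * amp_p * amp_h)
    delta = math.acos(max(-1.0, min(1.0, cos_delta)))
    return cmath.phase(f_h) + delta, cmath.phase(f_h) - delta


def angular_distance(a, b):
    """Smallest absolute difference between two angles, in radians."""
    return abs(cmath.phase(cmath.rect(1.0, a - b)))


def mir_phase(amp_p, derivative_1, derivative_2):
    """Pick the phase on which the two derivatives agree best."""
    first = phase_candidates(amp_p, *derivative_1)
    second = phase_candidates(amp_p, *derivative_2)
    a, b = min(((a, b) for a in first for b in second),
               key=lambda pair: angular_distance(*pair))
    # Average the agreeing pair as unit vectors to avoid wrap-around issues.
    return cmath.phase(cmath.rect(1.0, a) + cmath.rect(1.0, b))


# Simulated reflection with a hidden true phase of 40 degrees.
f_p = cmath.rect(100.0, math.radians(40.0))
f_h1 = cmath.rect(20.0, math.radians(100.0))
f_h2 = cmath.rect(15.0, math.radians(-30.0))
phase = mir_phase(abs(f_p), (abs(f_p + f_h1), f_h1), (abs(f_p + f_h2), f_h2))
print(round(math.degrees(phase), 1))
```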

Combining phase information. In practice there are errors associated with each heavy atom experiment. Therefore you recover a probability distribution for a phase, rather than an absolute phase. It is best to take a weighted sum of the probabilities to determine the experimental phases. $m$ is the length of the probability-weighted mean phase vector and is called the "figure of merit". Ideally $m = 1$, but in practice $m < 1$. [Figure: phase probability distributions $P(\phi_{hkl})$ and the resulting figure of merit $m$.]
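A minimal sketch of the figure of merit, assuming the phase probability distribution is given as a list of sampled `(phase, probability)` pairs (a hypothetical representation chosen for illustration): the best phase is the direction of the probability-weighted centroid of unit vectors, and $m$ is its length.

```python
import cmath
import math


def best_phase_and_fom(samples):
    """samples: iterable of (phase in radians, probability or weight).

    Returns (centroid phase, figure of merit m), where m is the length of
    the probability-weighted mean of the unit vectors exp(i * phase).
    """
    samples = list(samples)
    total = sum(p for _, p in samples)
    centroid = sum(p * cmath.exp(1j * phi) for phi, p in samples) / total
    return cmath.phase(centroid), abs(centroid)


# A sharp, single-peaked distribution: m is close to 1.
sharp = [(math.radians(40.0 + d), math.exp(-d * d / 50.0)) for d in range(-30, 31)]
phase_sharp, m_sharp = best_phase_and_fom(sharp)

# Two equally likely phases, 120 degrees apart (as for a single derivative).
bimodal = [(math.radians(40.0), 0.5), (math.radians(160.0), 0.5)]
phase_bi, m_bi = best_phase_and_fom(bimodal)

print(round(math.degrees(phase_sharp), 1), round(m_sharp, 3))
print(round(math.degrees(phase_bi), 1), round(m_bi, 3))
```

For the two-peak case the centroid phase lies midway between the two candidates and $m$ drops to $\cos 60^\circ = 0.5$, which is how an ambiguous phase is down-weighted in the map.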

Finding Heavy Atom Positions. The previous treatment relied on finding the heavy atom positions. In practice this is usually solvable using the Patterson function: $P(u,v,w) = \sum_{hkl} |F_{hkl}|^2 \exp(2\pi i\,\mathbf{u}\cdot\mathbf{S}_{hkl})$. You can calculate it without any phase information, directly from the measured intensities. Note that $|F_{hkl}|^2 = F_{hkl} \times F_{hkl}^*$, hence $P(u,v,w) = \rho(\mathbf{r}) * \rho(-\mathbf{r}) = \int \rho(\mathbf{r})\,\rho(\mathbf{r}+\mathbf{u})\,d\mathbf{r}$, where we use the convolution theorem for Fourier transforms and $\rho(-\mathbf{r})$ is the electron density inverted through the origin. [Figure: the Patterson function for two heavy atoms in a unit cell, with the inter-atomic vector $\mathbf{u}$ marked.]

Patterson Function. Since $P(u,v,w) = \int \rho(\mathbf{r})\,\rho(\mathbf{r}+\mathbf{u})\,d\mathbf{r}$, the Patterson map gives you: a central peak at $\mathbf{u} = 0$, since the density sits upon itself for all atoms (including the protein); and an additional peak whenever $\mathbf{u}$ equals the vector between two heavy atoms, so that one heavy atom sits upon another. It rapidly becomes very complicated but can be solved (this used to be done by inspection) if you have a small number of heavy atoms.
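The Patterson synthesis can be sketched in one dimension with two point atoms. The coordinates, scattering factors and index cutoff are hypothetical; only the intensities enter the sum, and because $|F_h|^2 = |F_{-h}|^2$ the exponential reduces to a cosine.

```python
import cmath
import math

# Two hypothetical heavy atoms in a 1D cell: (scattering factor, fractional x).
ATOMS = [(80.0, 0.10), (80.0, 0.40)]
H_MAX = 20


def structure_factor(h):
    return sum(f * cmath.exp(2j * math.pi * h * x) for f, x in ATOMS)


# Only the intensities are kept, as in the experiment.
intensities = {h: abs(structure_factor(h)) ** 2 for h in range(-H_MAX, H_MAX + 1)}


def patterson(u):
    """P(u) = sum_h |F_h|^2 cos(2 pi h u); no phases are needed."""
    return sum(i * math.cos(2.0 * math.pi * h * u) for h, i in intensities.items())


# Search away from the origin peak, over half the cell (P(u) = P(-u)).
grid = [i / 100 for i in range(5, 51)]
peak = max(grid, key=patterson)
print("strongest non-origin Patterson peak at u =", peak)
```

The peak appears at the separation between the two atoms (0.40 - 0.10 = 0.30), not at either atomic position, which is why the map must still be solved for coordinates.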

Difference Patterson Function. Since you are looking for the scattering from the heavy atoms, and the protein adds only background, it's more convenient to calculate the difference Patterson: $P(u,v,w) = \sum_{hkl} (|F_{PH}|^2 - |F_P|^2) \exp(2\pi i\,\mathbf{u}\cdot\mathbf{S}_{hkl})$. This difference Patterson map gives you the vectors between the heavy atoms directly. It is somewhat easier to interpret than the Patterson for $F_{PH}$ itself since it has less background from protein-protein distances.

Solving the Difference Patterson. If you succeed in obtaining a derivative and recover a Patterson function, you must then solve it to proceed. The Patterson function gives you a set of constraints in 3D, which are the vectors between heavy atoms. There exist algorithms for finding unique solutions which work for a reasonable number of heavy atoms, usually ten or fewer, although I think the record may be more than twenty. The number of non-origin peaks is $N(N-1)$ for $N$ heavy atoms.

Magnitude of intensity changes with heavy atoms. You may ask how one (or a few) heavy atoms within a protein of e.g. 50 kDa could be seen. For example, Hg has 80 e-, whereas a protein is a sea of e.g. 4,000 non-hydrogen atoms with $f_{average} \approx 7$ e- (i.e. 28,000 e- in total). How can you see 80 e-/28,000 e- = 0.3 %? On average $F_{protein} \approx f_{average} \times \sqrt{N}$ ($N$ is the number of non-hydrogen atoms; this is the random-walk result), hence $I_{protein} \propto |F_{protein}|^2 \approx (f_{average})^2 \times N$. For the heavy atom, $I_H \propto F_H^2$. Now $|F_{protein} + F_H| \approx |f_{average}| \times \sqrt{N} + |F_H|$.

Hence $I_{PH} \approx |f_{average}\sqrt{N} + F_H|^2 \approx (f_{average}\sqrt{N})^2 + 2 (f_{average}\sqrt{N}) F_H + F_H^2$. Therefore $\Delta I_{PH} \approx (f_{average}\sqrt{N})^2 + 2 (f_{average}\sqrt{N}) F_H + F_H^2 - (f_{average}\sqrt{N})^2 = 2 (f_{average}\sqrt{N}) F_H + F_H^2$. The first term can be evaluated for e.g. 4,000 non-hydrogen atoms: $2 (f_{average}\sqrt{N}) F_H = 2 \times 7 \times \sqrt{4000} \times 80 = 70{,}835$. The second term is $F_H^2 = 80^2 = 6{,}400$. And $I_{protein} \propto |F_{protein}|^2 \approx (f_{average})^2 \times N = 196{,}000$. Hence in this case, taking the dominant first term, $\Delta I_{PH}/I_{protein} \approx 71/196 = 36$ %.
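The arithmetic above can be reproduced in a few lines. The variable names are illustrative; the numbers (7 e- average, 4,000 atoms, 80 e- for Hg) are those used in the text.

```python
import math

f_average = 7.0    # average scattering factor of a non-hydrogen atom (electrons)
n_atoms = 4000     # non-hydrogen atoms in the protein
f_heavy = 80.0     # electrons in the heavy atom (Hg)

f_protein = f_average * math.sqrt(n_atoms)   # random-walk estimate of |F_protein|
i_protein = f_protein ** 2                   # proportional to the native intensity

cross_term = 2.0 * f_protein * f_heavy       # 2 (f_average sqrt(N)) F_H
square_term = f_heavy ** 2                   # F_H^2

ratio_first_term = cross_term / i_protein
ratio_both_terms = (cross_term + square_term) / i_protein

print("cross term:        ", round(cross_term))
print("F_H squared:       ", round(square_term))
print("I_protein:         ", round(i_protein))
print("relative change (first term only): %.0f %%" % (100 * ratio_first_term))
print("relative change (both terms):      %.0f %%" % (100 * ratio_both_terms))
```

The cross term dominates because it scales with $\sqrt{N}$ times $F_H$; this is why 80 electrons among 28,000 can still change intensities by tens of percent.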

Average intensity change if one heavy atom is bound to a protein.

Protein weight    100 % occupancy    50 % occupancy
14,000            51 %               25 %
28,000            36 %               18 %
56,000            -                  12 %
112,000           -                  9 %
224,000           13 %               6 %
448,000           -                  4 %

In practice a change of 14 % is a good phasing measurement.

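Lack of Closure. In practice, when you have found the phases experimentally there is some mismatch. The mismatch is called the lack of closure and is given the symbol $\varepsilon$. [Figure: phase triangles for the ideal case, where $F_P + F_H$ closes onto $F_{PH}$, and for reality, where a gap $\varepsilon$ remains.]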

Phasing Power. From the mismatch you can estimate the phasing power: Phasing Power $= \sqrt{\sum |F_H|^2 / \sum \varepsilon^2}$. A phasing power of 4 is excellent and is rare. A value between 1 and 2 is acceptable and means that the scattering of the heavy atom is larger than the lack of closure. [Figure: phase triangle of $F_P$, $F_H$ and $F_{PH}$ with the lack of closure $\varepsilon$.]
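A minimal sketch of the phasing power formula, assuming the heavy-atom amplitudes and the lack-of-closure values are already available per reflection (the numbers below are made up for illustration, not taken from any data set):

```python
import math


def phasing_power(heavy_atom_amplitudes, lack_of_closure):
    """sqrt( sum |F_H|^2 / sum epsilon^2 ) over the same set of reflections."""
    if len(heavy_atom_amplitudes) != len(lack_of_closure):
        raise ValueError("need one lack-of-closure value per reflection")
    numerator = sum(a * a for a in heavy_atom_amplitudes)
    denominator = sum(e * e for e in lack_of_closure)
    return math.sqrt(numerator / denominator)


# Hypothetical values for two reflections.
power = phasing_power([3.0, 4.0], [1.5, 2.0])
print(power)
```

A value above 1 means the heavy-atom signal exceeds the closure error on average, matching the interpretation given in the text.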

First Map. Once you have recovered the experimental phases you can make a Fourier transform. At that point, if the electron density is good enough, you can build a structural model into the density. If this goes well you can then make iterative rounds of structural refinement, phase improvement, and more model building until you recover a satisfactory model.

Anomalous scattering. An assumption to date was that the atomic scattering factor $f_j$ for X-rays scattering from an atom was a real number. In physics this is equivalent to assuming that all electrons can be treated as free electrons, and therefore contribute a phase change of $\pi$ to the scattered X-ray. When you go close to the atomic energy levels of certain atoms you can no longer assume that this holds: X-rays scatter with an anomalous scattering term near an absorption edge. [Figure: a transition between K and L shell electrons, and a photoelectron ejected from the K shell.]

Atomic absorption coefficient. For example, copper has a K-absorption edge at 1.38 Å due to the photoelectric effect. There are also transitions from K to L at 1.43 Å. If you have Cu in your protein and you record near 1.43 Å you will have anomalous scattering from this atom. [Figure: X-ray absorption plotted against X-ray wavelength from 1.0 to 2.0 Å.]

Anomalous scattering. To describe the anomalous scattering (which is wavelength dependent) we modify the atomic scattering factor for the particular heavy atom: $f_{anom} = f + \Delta f + i f'' = f + f' + i f''$. Two different symbols ($\Delta f$ and $f'$) are used for the second term depending on what book you look at. [Figure: $f$, $f'$ and $f''$ added as vectors in the complex plane to give $f_{anom}$.]

Consequence for Friedel pairs. Earlier we showed that the reflection (h,k,l) and its opposite (-h,-k,-l) have the same intensity, since $F_{-h-k-l} = \sum_j f_j(\mathbf{S}_{hkl}) \exp(2\pi i\,\mathbf{r}_j\cdot\mathbf{S}_{-h-k-l}) = (F_{hkl})^*$ and therefore $I_{hkl} = I_{-h-k-l}$. These are called Friedel pairs. When you have anomalous scattering this no longer holds. [Figure: vector diagrams of $F_{PH}(hkl)$ and $F_{PH}(-h-k-l)$ built from $F_{hkl}$, $F_{-h-k-l}$, $F_H$ and $F_H^*$, without and with anomalous scattering.]
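The breakdown of Friedel's law can be checked numerically. In this sketch the atom coordinates, the reflection and the anomalous terms ($f' = -8$, $f'' = 10$) are hypothetical numbers chosen only to make the effect visible.

```python
import cmath
import math


def structure_factor(hkl, atoms):
    """F_hkl = sum_j f_j exp(2 pi i (h x_j + k y_j + l z_j)); f_j may be complex."""
    h, k, l = hkl
    return sum(f * cmath.exp(2j * math.pi * (h * x + k * y + l * z))
               for f, (x, y, z) in atoms)


def friedel_intensities(hkl, atoms):
    """Intensities of (h,k,l) and (-h,-k,-l)."""
    minus = tuple(-i for i in hkl)
    return (abs(structure_factor(hkl, atoms)) ** 2,
            abs(structure_factor(minus, atoms)) ** 2)


light_atoms = [(6.0, (0.10, 0.20, 0.30)), (8.0, (0.40, 0.15, 0.70))]
heavy_site = (0.25, 0.60, 0.05)

# Real scattering factor for the heavy atom: Friedel's law holds.
normal = light_atoms + [(80.0, heavy_site)]
# Complex scattering factor f + f' + i f'' near an absorption edge.
anomalous = light_atoms + [(80.0 - 8.0 + 10.0j, heavy_site)]

i_plus, i_minus = friedel_intensities((1, 2, 3), normal)
a_plus, a_minus = friedel_intensities((1, 2, 3), anomalous)

print("no anomalous scattering:  ", round(i_plus, 3), round(i_minus, 3))
print("with anomalous scattering:", round(a_plus, 3), round(a_minus, 3))
```

Only the imaginary part $f''$ produces the difference; the light atoms with real scattering factors still contribute conjugate terms to the two reflections.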

When you have anomalous scattering this no longer holds. The picture is frequently drawn with $F_{PH}(-h-k-l)$ reflected in the real axis. When scaling data it is possible to measure the differences $|F_{PH}(hkl)| - |F_{PH}(-h-k-l)|$ due to the anomalous signal. [Figure: $F_{hkl}$, $F_H$, $F_{PH}(hkl)$ and the reflected $F_{PH}(-h-k-l)$, with the $f'$, $f''$ and $-f''$ contributions marked.]

$\Delta|F|_{ano}$ Patterson map. If we make a measurement and are careful to measure all the Friedel pairs we can define: $\Delta|F_{ano}| = \{|F_{PH}(hkl)| - |F_{PH}(-h-k-l)|\} \times f'/2f''$, where the scaling factor $f'/2f''$ is put in for technical normalisation reasons. If we then calculate the Patterson map using $P_{ano}(\mathbf{u}) = \sum_{hkl} \Delta|F_{ano}|^2 \exp(2\pi i\,\mathbf{u}\cdot\mathbf{S}_{hkl})$, this gives us the inter-atomic vector constraints for the anomalous scatterers. This is powerful since you need only one set of observations and not two, so the noise level is low even if $\Delta|F_{ano}|$ is small.

Multi-wavelength Anomalous Diffraction. It is possible to tune synchrotron radiation accurately, so diffraction data can be collected from the same crystal at different wavelengths. Near an X-ray absorption edge the real and imaginary components $f'$ and $f''$ change rapidly. [Figure: $f'$ and $f''$ (in electrons) plotted against X-ray energy from 12.4 to 12.8 keV.]

Multi-wavelength Anomalous Diffraction. As such it is possible to record three diffraction data sets from a single crystal: at the peak, where $f''$ has its maximum; at the inflection point, where $f'$ has its minimum (i.e. is most negative); and at a remote wavelength, far removed from the other two. This is called Multi-wavelength Anomalous Diffraction (MAD) since you use three wavelengths to get your data. You can solve the structure from a single crystal (i.e. you don't need additional derivatives or another native data set), so there are no problems with crystals failing to be isomorphous. The textbook has a detailed description of the algebra of solving structures using MAD, but for this course you can assume it's (more or less) the same as having one native plus two derivative data sets.

SAD & SIR. In addition to MIR and MAD there are Single Isomorphous Replacement (SIR) and Single-wavelength Anomalous Diffraction (SAD). These are the same as MIR or MAD but with one data set rather than several. In SIR, if your data are good, you take your phase from one heavy atom to be the sum of the two possible phases, each weighted with 50 % probability, and then go ahead and calculate a map. The wrong choice of phase adds noise, so the method relies on the data being good enough to overcome that additional noise. SAD is like SIR, with the variation that you also use the anomalous signal to give additional information. [Figure: phase diagram showing $F_P$, $F_{PH}$ and $F_H$.]

Molecular Replacement. If you believe that your protein may have a structure similar to that of another protein of known structure you can use that information: solving the structure by making a good guess! The problem is to place the candidate structure correctly within the unit cell; you then calculate a first electron density map using phases calculated from the known structure. This is also useful if you have sub-domains of known structure.

The Rotation Function. Once you choose a molecular replacement model you first need to determine its orientation. This is done by comparing the experimental Patterson map with one calculated from your candidate structure. The Patterson map is sensitive to the orientation of the molecule but not its position within the unit cell.

Cross Rotation Function. An overlap function $R$ between the experimental Patterson $P(\mathbf{u})$ and the rotated version of the candidate model Patterson $P'_r(\mathbf{u}_r)$ is defined as: $R(\alpha,\beta,\gamma) = \int P(\mathbf{u}) \times P'_r(\mathbf{u}_r)\,d\mathbf{u}$, where $\alpha$, $\beta$ and $\gamma$ are normally the Euler angles describing the rotation. $R(\alpha,\beta,\gamma)$ is maximal when the rotation angle is correct and is not affected by where in the unit cell the candidate structural model is placed. [Figure: two Patterson maps that correlate poorly because the overlap is imperfect; once rotated, perfect overlap is recovered, i.e. $R(\alpha,\beta,\gamma)$ is a maximum.]
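A toy two-dimensional version of the rotation search is sketched below. It is a simplification under stated assumptions, not a production rotation function: the Patterson maps are replaced by their sets of inter-atomic vectors, the overlap integral by a sum of Gaussians of hypothetical width `sigma`, and the three Euler angles by a single in-plane angle scanned in 1 degree steps.

```python
import math


def rotate(point, angle_deg):
    """Rotate a 2D point about the origin by angle_deg (counter-clockwise)."""
    a = math.radians(angle_deg)
    x, y = point
    return (math.cos(a) * x - math.sin(a) * y, math.sin(a) * x + math.cos(a) * y)


def patterson_vectors(atoms):
    """All non-origin inter-atomic vectors; N atoms give N(N-1) of them."""
    return [(xa - xb, ya - yb)
            for i, (xa, ya) in enumerate(atoms)
            for j, (xb, yb) in enumerate(atoms) if i != j]


def overlap(target_vectors, model_vectors, angle_deg, sigma=0.05):
    """Gaussian-smoothed overlap of the target with the rotated model vectors."""
    score = 0.0
    for v in model_vectors:
        rx, ry = rotate(v, angle_deg)
        for tx, ty in target_vectors:
            score += math.exp(-((rx - tx) ** 2 + (ry - ty) ** 2) / (2 * sigma ** 2))
    return score


# Hypothetical search model, and the same molecule rotated by 40 degrees and
# translated inside the "crystal"; the translation cancels in the vectors.
model_atoms = [(0.0, 0.0), (1.0, 0.0), (0.3, 0.8)]
crystal_atoms = [(x + 2.0, y - 1.5) for x, y in (rotate(a, 40.0) for a in model_atoms)]

target = patterson_vectors(crystal_atoms)
model = patterson_vectors(model_atoms)

# Patterson vectors come in +/- pairs, so angles 0..179 cover the search.
best_angle = max(range(180), key=lambda angle: overlap(target, model, angle))
print("best rotation angle:", best_angle)
```

The search recovers the rotation even though the molecule was also translated, which illustrates why orientation can be determined before position.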

Translation function. Once you have found the optimal rotational orientation you next need to find the correct translation. In this case the Patterson function is useless since it's insensitive to translation. What one does is move the molecule around in the unit cell, calculate the theoretical $F_{hkl}^{calc}$ values, and compare these with the experimental $F_{hkl}^{obs}$.

R-factor & Correlation Coefficient. Two numbers are optimised. The R-factor, $R = \sum_{hkl} \big||F_{obs}| - k|F_{calc}|\big| \,/\, \sum_{hkl} |F_{obs}|$, should be minimised, i.e. the calculated structure factors are as close as possible to the experimental structure factors. The standard linear correlation coefficient, $C = \sum_{hkl} (|F_{obs}|^2 - \langle|F_{obs}|^2\rangle)(|F_{calc}|^2 - \langle|F_{calc}|^2\rangle) \times \{\sum_{hkl} (|F_{obs}|^2 - \langle|F_{obs}|^2\rangle)^2 \times \sum_{hkl} (|F_{calc}|^2 - \langle|F_{calc}|^2\rangle)^2\}^{-1/2}$, should be maximised, i.e. when $F_{obs}$ is much greater than the average, $F_{calc}$ should also be greater than the average, etc. Useful values are C > 30 % and R < 55 %.
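Both scores can be sketched directly from the formulas. The choice of scale factor $k = \sum|F_{obs}| / \sum|F_{calc}|$ is an assumption made here for illustration (the text does not say how $k$ is obtained), and the amplitude lists are invented.

```python
import math


def r_factor(f_obs, f_calc):
    """R = sum | |F_obs| - k |F_calc| | / sum |F_obs|, with an assumed scale k."""
    k = sum(f_obs) / sum(f_calc)  # assumption: scale by the ratio of sums
    return sum(abs(o - k * c) for o, c in zip(f_obs, f_calc)) / sum(f_obs)


def correlation_coefficient(f_obs, f_calc):
    """Linear correlation between |F_obs|^2 and |F_calc|^2."""
    i_obs = [o * o for o in f_obs]
    i_calc = [c * c for c in f_calc]
    mean_obs = sum(i_obs) / len(i_obs)
    mean_calc = sum(i_calc) / len(i_calc)
    numerator = sum((o - mean_obs) * (c - mean_calc) for o, c in zip(i_obs, i_calc))
    denominator = math.sqrt(sum((o - mean_obs) ** 2 for o in i_obs)
                            * sum((c - mean_calc) ** 2 for c in i_calc))
    return numerator / denominator


observed = [10.0, 20.0, 30.0, 40.0]
well_placed = [5.0, 10.0, 15.0, 20.0]     # same pattern, different scale
badly_placed = [20.0, 5.0, 15.0, 10.0]    # unrelated pattern

print(r_factor(observed, well_placed), correlation_coefficient(observed, well_placed))
print(r_factor(observed, badly_placed), correlation_coefficient(observed, badly_placed))
```

In a translation search these two numbers would be evaluated for each trial position of the model, keeping the position with the lowest R and highest C.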

Direct methods. If you have a small number of atoms and very good resolution you may recover a structure from a native data set. The concept is that there are phase relations between different Bragg reflections: $\phi(\mathbf{h}_1) + \phi(\mathbf{h}_2) + \phi(-\mathbf{h}_1 - \mathbf{h}_2) \approx 0$. Geometrical arguments can be used to show that this holds when atoms sit on the lattice planes but not in between lattice planes (chapter 11 of the textbook). In practice this assumption holds approximately for very strong reflections, which are strong since most of the atoms are scattering in phase and are therefore on the same lattice plane. This assumption itself breaks down for large proteins.
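Two properties of the phase triplet can be checked with a small one-dimensional sketch (hypothetical atom positions, equal point atoms): for a single atom the triplet sum is exactly zero, and for several atoms its value does not depend on the choice of origin, which is what makes it usable before any phases are known. The sketch does not demonstrate the statistical tendency towards zero for strong reflections.

```python
import cmath
import math


def phase(h, positions):
    """Phase of F_h for equal point atoms at the given fractional positions."""
    return cmath.phase(sum(cmath.exp(2j * math.pi * h * x) for x in positions))


def wrap(angle):
    """Map an angle to the interval (-pi, pi]."""
    return cmath.phase(cmath.exp(1j * angle))


def triplet(h1, h2, positions):
    """phi(h1) + phi(h2) + phi(-h1 - h2), wrapped to (-pi, pi]."""
    return wrap(phase(h1, positions) + phase(h2, positions)
                + phase(-h1 - h2, positions))


single_atom = [0.37]
structure = [0.10, 0.35, 0.70]
shifted = [(x + 0.20) % 1.0 for x in structure]   # same structure, new origin

print("single atom triplet:", round(triplet(2, 3, single_atom), 6))
print("triplet, original origin:", round(triplet(2, 3, structure), 6))
print("triplet, shifted origin: ", round(triplet(2, 3, shifted), 6))
print("phi(2) changes with origin:",
      round(phase(2, structure), 3), round(phase(2, shifted), 3))
```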

Shake & Bake. Direct methods are very successful for small-molecule crystallography. In practice you pick a few phases and derive the rest from the triplet relations for a limited number of strong reflections. For molecules with more than 150 non-hydrogen atoms the unit cell is so evenly filled with atoms that the phase triplet relation doesn't work. An algorithm has been written to extend this up to about 1,000 non-hydrogen atoms, but it requires about 1.2 Å data. The principle is that the phase triplet is no longer set to zero but obeys a probability distribution. You then shake the phase angles in reciprocal space and bake out the low-density regions in real space.

Protein Data Bank Entries. This month the number of entries in the Protein Data Bank will surpass 30,000. The current deposition rate is approaching 6,000 per year, or 20 per day. Not all are unique! X-ray crystallography is responsible for approximately 75 %. Phasing methods have been very successful!