Refinement with REFMAC


Refinement with REFMAC
Garib N Murshudov
York Structural Laboratory, Chemistry Department, University of York

Contents
- Refinement program: Refmac
- Simple refinement: selection of weights
- Automatic twin refinement: R factor warnings
- Low resolution refinement tools

What can REFMAC do?
- Simple maximum likelihood restrained refinement
- Twin refinement
- Phased refinement (with Hendrickson-Lattman coefficients)
- SAD/SIRAS refinement
- Structure idealisation
- Library for more than 10000 ligands (from the next version)
- Covalent links between ligands and between ligand and protein
- Rigid body refinement
- Low resolution: local NCS, restraints to external structures, jelly body
- TLS refinement
- Map sharpening
- Occupancy refinement
- etc.

Simple refinement


“Optimisation” of weights

“Optimisation” of weights
After refinement the final statistics are:

                 Initial   Final
R factor         0.2783    0.1831
R free           0.2668    0.2030
Rms BondLength   0.0284    0.0327
Rms BondAngle    4.5704    2.3083
Rms ChirVolume   0.1696    0.1645

The RMSD of bond lengths is too large.

“Optimisation” of weights
If the RMSD of bond lengths is too large (>0.022) or too small (<0.01), you may want to change the weights. This can be done with the weight matrix option on the interface. Look at the log file: Refmac prints out the weights it is currently using.

Weight matrix 4.4438701
Actual weight 10.000000 is applied to the X-ray term

If the RMSD is too large, try half of the currently used weight matrix value (here around 2.2).
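The rule of thumb above can be written as a small helper. This is only a sketch: the function name `suggest_weight_matrix` is hypothetical, the halving step comes from the slide, and the doubling step for over-tight geometry is an assumed counterpart that the slide does not state.

```python
def suggest_weight_matrix(current_weight, rms_bond,
                          too_large=0.022, too_tight=0.01):
    """Suggest a new weight matrix value from the final bond-length RMSD.

    Thresholds are the ones quoted on the slide. Halving follows the slide;
    doubling is an assumption made here for the opposite case.
    """
    if rms_bond > too_large:
        # Geometry too loose: reduce the weight on the X-ray term.
        return current_weight / 2.0
    if rms_bond < too_tight:
        # Geometry too tight (assumed rule): increase the X-ray weight.
        return current_weight * 2.0
    # RMSD is in the acceptable range: keep the current weight.
    return current_weight


if __name__ == "__main__":
    # Values from the log excerpt and the statistics table above.
    print(suggest_weight_matrix(4.4438701, 0.0327))
```

With the values from the example (weight matrix 4.4438701, bond RMSD 0.0327) the helper suggests a value of about 2.2, matching the slide.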

“Optimisation” of weights Change weight matrix

“Optimisation” of weights
With the new weights the RMSD is reasonable.

                 Initial   Final
R factor         0.2783    0.1876
R free           0.2668    0.2052
Rms BondLength   0.0284    0.0201
Rms BondAngle    4.5704    1.6554
Rms ChirVolume   0.1696    0.1063

Twin refinement

Merohedral and pseudo-merohedral twinning

Crystal symmetry     P3           P2                  P2
Constraint           -            β = 90º             -
Lattice symmetry*    P622         P222                P2
Possible twinning    merohedral   pseudo-merohedral   -

* rotations only

Figure labels: Domain 1, Domain 2, Twinning operator.

The crystal lattice is invariant with respect to the twinning operator. The crystal is NOT invariant with respect to the twinning operator.

Twin refinement (it also works with older versions)

Twin refinement
Twin refinement in REFMAC is carried out in several stages.

Stage 1: Identify potential twin operators. This is done by analysis of the lattice and crystal symmetry. In this case the space group is P31 and there are four potential twin operators:

Potential twin domain 1 with operator:  H,  K,  L, metric score 0.000
Potential twin domain 2 with operator: -K, -H, -L, metric score 0.000
Potential twin domain 3 with operator: -H, -K,  L, metric score 0.000
Potential twin domain 4 with operator:  K,  H, -L, metric score 0.000
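To make the operator notation concrete, the sketch below parses operator strings as they appear in the log and applies them to a reflection index. The helper names `parse_operator` and `apply_operator` are hypothetical and not part of REFMAC; the parser assumes unit coefficients only, which covers the operators listed here.

```python
import re


def parse_operator(text):
    """Turn a string such as 'K, -H-K, L' into three rows of integer
    coefficients acting on (H, K, L). Only coefficients of +1/-1 are handled."""
    rows = []
    for component in text.replace(" ", "").upper().split(","):
        row = {"H": 0, "K": 0, "L": 0}
        for sign, axis in re.findall(r"([+-]?)([HKL])", component):
            row[axis] += -1 if sign == "-" else 1
        rows.append((row["H"], row["K"], row["L"]))
    return rows


def apply_operator(rows, hkl):
    """Apply the parsed operator to a reflection index (h, k, l)."""
    return tuple(sum(c * x for c, x in zip(row, hkl)) for row in rows)


if __name__ == "__main__":
    # The four potential twin operators reported in stage 1.
    for op in ["H, K, L", "-K, -H, -L", "-H, -K, L", "K, H, -L"]:
        print(op, "->", apply_operator(parse_operator(op), (1, 2, 3)))
```

For the reflection (1, 2, 3), the second operator -K, -H, -L gives (-2, -1, -3).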

Twin refinement: Group/subgroup

Twin refinement
Stage 2: Filter using the agreement between “twin” related reflections (using Rmerge).

Filtering out small twin domains, step 1
Twin ops with Rm > 0.44 will be removed
SymOp= -K,-H,-L:R_m=0.248:twin is probable
SymOp= -H,-K, L:R_m=0.237:twin is probable
SymOp=  K, H,-L:R_m=0.027:twin or higher symm

At this stage REFMAC may suggest that the space group could be higher.
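The filtering step can be illustrated with a toy calculation. The sketch assumes Rm = Σ|I(h) − I(Th)| / Σ(I(h) + I(Th)) over twin-related pairs; REFMAC's exact definition is not given on the slide, so treat this formula, the function name `twin_rmerge` and the toy intensities as assumptions. Only the 0.44 cut-off is taken from the log. Real data would also need mapping of each twin mate into the asymmetric unit, which is omitted here.

```python
def twin_rmerge(intensities, operator):
    """Agreement between reflections related by a potential twin operator.

    intensities: dict mapping (h, k, l) to an observed intensity.
    operator:    function mapping (h, k, l) to the twin-related index.
    Returns None when no twin-related pair is present in the data.
    """
    num = 0.0
    den = 0.0
    for hkl, i_obs in intensities.items():
        mate = operator(hkl)
        if mate == hkl or mate not in intensities:
            continue  # no independent twin mate measured
        i_mate = intensities[mate]
        num += abs(i_obs - i_mate)
        den += i_obs + i_mate
    return num / den if den > 0 else None


def twin_op(hkl):
    """The operator -K, -H, -L from the log."""
    h, k, l = hkl
    return (-k, -h, -l)


if __name__ == "__main__":
    # Invented intensities, for illustration only.
    data = {(1, 2, 3): 100.0, (-2, -1, -3): 90.0,
            (2, 0, 1): 50.0, (0, -2, -1): 70.0}
    rm = twin_rmerge(data, twin_op)
    print(f"R_m = {rm:.3f}, kept = {rm <= 0.44}")
```

Each pair is visited from both sides when the operator is its own inverse, which doubles numerator and denominator alike and leaves the ratio unchanged.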

Twin refinement: Effect of twin on Rmerge
Rmerge values without experimental error:
- No twinning: 50%
- Along non-twinned axes, when another axis is the twin axis: 37.5%
Figure labels: Non twin, Twin.
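The two percentages can be checked with a short simulation. This is a sketch under stated assumptions: intensities follow acentric Wilson statistics (exponential with mean 1), there is no experimental error, Rmerge is taken as Σ|I1 − I2| / Σ(I1 + I2), and the 37.5% case is read as comparing reflections of a perfectly twinned crystal across an operator that is not the twin operator. None of this is REFMAC code.

```python
import random


def rmerge(pairs):
    """R = sum|a - b| / sum(a + b) over pairs of intensities."""
    num = sum(abs(a - b) for a, b in pairs)
    den = sum(a + b for a, b in pairs)
    return num / den


def simulate(n=200_000, seed=0):
    """Return (R for unrelated untwinned intensities,
               R for unrelated perfectly twinned intensities)."""
    rng = random.Random(seed)

    def wilson():
        # Acentric Wilson distribution: exponential with mean 1.
        return rng.expovariate(1.0)

    def perfect_twin():
        # Perfect twin (fraction 0.5): average of two independent intensities.
        return 0.5 * (wilson() + wilson())

    untwinned = rmerge([(wilson(), wilson()) for _ in range(n)])
    twinned = rmerge([(perfect_twin(), perfect_twin()) for _ in range(n)])
    return untwinned, twinned


if __name__ == "__main__":
    r_untwinned, r_twinned = simulate()
    print(f"untwinned: {r_untwinned:.3f}, perfect twin: {r_twinned:.3f}")
```

Under these assumptions the simulated values come out close to 0.50 and 0.375, in line with the slide.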

Twin refinement
Stage 3: Estimate twin fractions and remove small twin domains.

Filtering out small twin domains, step 2
Twin domains with fraction < 7.00000003E-02 are removed
Twin operators with estimated twin fractions
Twin op:  H,  K,  L: Fr = 0.391; Eq ops:  K, -H-K,  L; -H-K,  H,  L
Twin op: -K, -H, -L: Fr = 0.112; Eq ops: -H,  H+K, -L;  H+K, -K, -L
Twin op: -H, -K,  L: Fr = 0.108; Eq ops: -K,  H+K,  L;  H+K, -H,  L
Twin op:  K,  H, -L: Fr = 0.390; Eq ops:  H, -H-K, -L; -H-K,  K, -L

Twin refinement
Stage 4: Perform twin refinement with all surviving twin operators (in this example all four operators survive):

Twin fractions = 0.3773 0.1246 0.1206 0.3775

R factors look very good:

                 Initial   Final
R factor         0.1912    0.1566
R free           0.1796    0.2047
Rms BondLength   0.0088    0.0235
Rms BondAngle    1.4825    2.1812
Rms ChirVolume   0.1077    0.1336
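As an illustration of what the twin fractions mean, the sketch below uses the standard twinning model, in which the measured intensity is the fraction-weighted sum of the intensities of the twin-related reflections. How REFMAC implements this internally is not described on the slide; the function name and the calculated intensities are invented for the example, while the operators and fractions are those reported above.

```python
def twinned_intensity(hkl, calc_intensity, operators, fractions):
    """Standard twin model: I_twin(h) = sum_k fraction_k * I_calc(T_k h)."""
    return sum(f * calc_intensity[op(hkl)]
               for op, f in zip(operators, fractions))


# The four operators from stage 1, written as functions on (h, k, l).
OPERATORS = [
    lambda hkl: (hkl[0], hkl[1], hkl[2]),      #  H,  K,  L
    lambda hkl: (-hkl[1], -hkl[0], -hkl[2]),   # -K, -H, -L
    lambda hkl: (-hkl[0], -hkl[1], hkl[2]),    # -H, -K,  L
    lambda hkl: (hkl[1], hkl[0], -hkl[2]),     #  K,  H, -L
]

# Refined twin fractions from the slide.
FRACTIONS = [0.3773, 0.1246, 0.1206, 0.3775]

if __name__ == "__main__":
    # Invented calculated intensities for one set of twin-related reflections.
    calc = {(1, 2, 3): 100.0, (-2, -1, -3): 50.0,
            (-1, -2, 3): 20.0, (2, 1, -3): 10.0}
    print(twinned_intensity((1, 2, 3), calc, OPERATORS, FRACTIONS))
```

The fractions on the slide sum to 1, as the model requires.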

R factors from non-twinned refinement

                 Initial   Final
R factor         0.3103    0.2779
R free           0.3184    0.3496
Rms BondLength   0.0088    0.0129
Rms BondAngle    1.4825    1.5648
Rms ChirVolume   0.1077    0.1034

Twin refinement: R factors – be careful
Colour key:
- Cyan: perfect twin, twin modelled
- Black: no twin, twin not modelled
- Red: perfect twin, twin not modelled
- Blue: no twin, perfect twin modelled
The R factor drop can be as large as 15% without any improvement in the atomic model.

Twin refinement: Alternative indexing
If a crystal can be twinned, there may be more than one way of indexing hkl. The different indexings are related by symmetry operators of the lattice that are not symmetry operators of the crystal. The best way of dealing with the indexing “problem” is to use the program pointless by Phil Evans. You can give either a reference mtz file or a reference structure. All subsequent data will then be indexed in a consistent manner.

Low resolution refinement

Low resolution refinement tools
- Jelly body (implicit normal modes) refinement
- NCS: local and global restraints
- NCS constraints
- Restraints to reference structures
- Regularised map sharpening
- Long range B value restraints based on Kullback-Leibler distances

Murshudov GN, Skubak P, Lebedev AA, Pannu NS, Steiner RA, Nicholls RA, Winn MD, Long F, Vagin AA, “REFMAC5 for the Refinement of Macromolecular Crystal Structures”, Acta Cryst. D67, 355-367

External (reference structure) restraints
Restraints to external structures are generated by the program ProSmart, which:
1) Aligns structures in the presence of conformational changes. Sequence is not used.
2) Generates restraints for aligned atoms.
3) Identifies secondary structures (at the moment helix and strand, but the approach is general and can be extended to any motif).
4) Generates restraints for secondary structures.

Note 1: ProSmart has been written by Rob Nicholls and is currently available from him. It will be distributed by CCP4 (hopefully from the next release).
Note 2: Robust estimator functions are used for restraints, i.e. if the difference between target and model is very large, its contribution is downweighted.

ProSmart should align similar proteins well, but should also be able to align dissimilar proteins, so that a meaningful score can be given in all cases.

Restraints to current distances
A term is added to the target function. The summation is over all atom pairs in the same chain and within a given distance (default 4.2 Å). dcurrent is recalculated at every cycle. This function does not contribute to gradients; it only contributes to the second derivative matrix. It is equivalent to adding springs between atom pairs, so during refinement inter-atomic distances do not change very much. If all pairs were used and the weights were very large, it would be equivalent to rigid body refinement. It could be called “implicit normal modes”, “soft” body or “jelly” body refinement.
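The behaviour described above can be illustrated with a toy calculation. The formula itself is not preserved in this transcript, so the sketch assumes a harmonic form, Σ w (d − dcurrent)² with w = 1/σ², which has the stated property of being zero, with zero gradient, at the current coordinates. The function names, the weighting and the coordinates are all hypothetical; only the 4.2 Å cut-off, the same-chain condition and the sigma of 0.01 come from the slides.

```python
import math
from itertools import combinations


def jelly_pairs(coords, chains, dmax=4.2):
    """Index pairs of atoms in the same chain and closer than dmax (angstrom)."""
    return [(i, j) for i, j in combinations(range(len(coords)), 2)
            if chains[i] == chains[j]
            and math.dist(coords[i], coords[j]) <= dmax]


def jelly_penalty(coords, current, pairs, sigma=0.01):
    """Assumed harmonic term: sum of w * (d - d_current)^2, with w = 1/sigma^2."""
    w = 1.0 / sigma ** 2
    return sum(
        w * (math.dist(coords[i], coords[j])
             - math.dist(current[i], current[j])) ** 2
        for i, j in pairs)


if __name__ == "__main__":
    # Invented coordinates: three atoms in chain A, one distant atom in chain B.
    current = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
    chains = "AAAB"
    pairs = jelly_pairs(current, chains)
    print(pairs)                                  # pairs restrained this cycle
    print(jelly_penalty(current, current, pairs))  # zero at current positions

    # Moving one atom by 0.1 A stretches one "spring" and compresses another.
    moved = list(current)
    moved[1] = (1.6, 0.0, 0.0)
    print(jelly_penalty(moved, current, pairs))
```

Because the term vanishes at the current positions, it does not pull atoms anywhere; it only resists shifts that change the restrained distances, which is why it enters through the second derivative matrix.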

External (reference structure) restraints
The program will be available from CCP4. Currently, if you want to try it, you should ask Rob Nicholls at ran105@york.ac.uk. Once you have downloaded it, you can run it using this command:

prosmart -p1 refined_structure.pdb -p2 reference_structure.pdb

It will generate much useful information, including restraints to the reference structure.

Auto NCS: local and global
1) Align all chains with all chains using the Needleman-Wunsch method.
2) If the alignment score is higher than a predefined value (e.g. 80%), consider the chains similar.
3) Find the local RMS; if the average local RMS is less than a predefined value, consider the chains aligned.
4) Find the correspondence between atoms.
5) For global restraints (i.e. restraints based on the RMS between atoms of aligned chains), identify domains.
6) For local NCS, make the list of corresponding interatomic distances (removing bond- and angle-related atom pairs).
7) Design weights.
8) The list of interatomic distance pairs is recalculated at every cycle.
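The first two steps can be sketched with a textbook Needleman-Wunsch global alignment. This is not REFMAC's implementation: the scoring scheme (match +1, mismatch -1, gap -1), the use of sequence identity as the "alignment score", and the example sequences are assumptions chosen for brevity. Only the 80% example threshold comes from the slide.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Global alignment of two sequences; returns the two gapped strings."""
    n, m = len(a), len(b)

    def sub(i, j):
        return match if a[i - 1] == b[j - 1] else mismatch

    # Fill the dynamic programming table.
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(score[i - 1][j - 1] + sub(i, j),
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)

    # Trace back from the bottom-right corner.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def identity(aligned_a, aligned_b):
    """Fraction of alignment columns with identical residues."""
    same = sum(x == y and x != "-" for x, y in zip(aligned_a, aligned_b))
    return same / max(len(aligned_a), 1)


if __name__ == "__main__":
    # Invented chain sequences; the second lacks one residue.
    ali_a, ali_b = needleman_wunsch("MKTAYIAKQR", "MKTAYAKQR")
    print(ali_a)
    print(ali_b)
    print("similar:", identity(ali_a, ali_b) >= 0.8)
```

Chains passing the threshold would then go on to the local RMS check and atom matching described in the later steps.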

Add external keywords file in refmac interface Browse files

Add external keywords file in refmac interface Select keywords file

Add external keywords file in refmac interface

Instructions you may want to play with

# Jelly body
ridge dist sigma 0.01
ridge dist dmax 4.2
# NCS
ncsr local
# To control restraints to reference structures.
# Restraints are generated by ProSmart.
external dmax 4.2
external weight scale 4
external cut 10

Low resolution refinement: Some results

         Initial   Simple   Jelly    NCS local   Jelly/NCS   Reference structure
R        0.3605    0.2218   0.2533   0.2232      0.2535      0.2557
Rfree    0.3563    0.3116   0.2961   0.3124      0.2955      0.2907

With the current version you may need to run the refinement several times to get the parameters right. In this case the maximum radius for reference structure restraints was 4.0, the maximum radius for local NCS was 4.2, restraints were excluded if the deviation between the reference distance and the current distance was more than 10 sigma, and the sigmas for reference structures were 0.07. At lower resolution (5-7 Å) the radius may need to be 5.5 and the sigma 0.02.

Conclusions
- Automatic weighting works well for a large class of cases; however, you may need to change the weights.
- Twin refinement is automatic, but R factors are poor indicators.
- Use of available information may improve low resolution refinement.

Acknowledgment

York              Leiden
Alexei Vagin      Pavol Skubak
Andrey Lebedev    Raj Pannu
Rob Nicholls
Fei Long

CCP4, YSBL people

REFMAC is available from CCP4 or from York’s ftp site: www.ysbl.york.ac.uk/refmac/latest_refmac.html
Balbes and other programs: www.ysbl.york.ac.uk/refmac/YSBLPrograms/index.jsp
This and other presentations can be found on: www.ysbl.york.ac.uk/refmac/Presentations/