Reusing phenix.refine for powder data? Ralf W. Grosse-Kunstleve Computational Crystallography Initiative Lawrence Berkeley National Laboratory Workshop.


1 Reusing phenix.refine for powder data? Ralf W. Grosse-Kunstleve Computational Crystallography Initiative Lawrence Berkeley National Laboratory Workshop on developments and directions of powder diffraction on proteins, June 22/23, 2007

2 My two lives
Life 1 (PhD project):
- Zeolite structure determination from powder data using extracted intensities
Life 2:
- Contributions to Xplor/CNS
  - Single-crystal protein crystallography
  - About 80% of all PDB entries refined with Xplor/CNS
- Phenix project
  - Fresh start after losing a legal battle

3 Phenix Collaboration
Funding: NIH Program Project (NIGMS, PSI), Director: Paul Adams
Computational Crystallography Initiative (LBNL): CCI APPS
- Paul Adams, Ralf Grosse-Kunstleve, Pavel Afonine
- Nigel Moriarty, Nicholas Sauter, Peter Zwart
Los Alamos National Lab (LANL): SOLVE / RESOLVE
- Tom Terwilliger, Li-Wei Hung
Cambridge University: PHASER
- Randy Read, Airlie McCoy
Texas A&M University: TEXTAL
- Tom Ioerger, Jim Sacchettini, Erik McKee
Duke University: MolProbity / REDUCE
- Jane Richardson, David Richardson, Ian Davis

4 Spectrum of phenix components
- Automated analysis of data quality: phenix.xtriage
- Rapid substructure determination: phenix.hyss
- Phasing: maximum likelihood (SOLVE, PHASER for SAD)
- Density modification: statistical density modification (RESOLVE)
- Automated model building: pattern matching methods (RESOLVE or TEXTAL)
- Structure refinement: phenix.refine (likelihood, annealing, TLS)
- Advanced automation: AutoSol (hkl to map)
- Ligand building and fitting: eLBOW, AutoLigand
- Validation and hydrogens: MolProbity + Reduce

5 phenix.refine
- Group ADP refinement
- Rigid body refinement
- Restrained refinement (xyz, iso/aniso ADP)
- Automatic water picking
- Bond density
- Unrestrained refinement
- FFT or direct summation
- Hydrogens
- Automatic NCS restraints
- Simulated annealing
- Occupancies (individual, group)
- TLS refinement
- Twinned data
- X-ray, neutron, joint X-ray + neutron refinement

6 Refinement flowchart
- Input data and model processing (PDB model; any data format: CNS, Shelx, MTZ, …)
- Refinement strategy selection
- Repeated several times:
  - Bulk-solvent, anisotropic scaling, twinning parameters refinement
  - Ordered solvent (add / remove)
  - Target weights calculation
  - Coordinate refinement (rigid body, individual; minimization or simulated annealing)
  - ADP refinement (TLS, group, individual iso / aniso)
  - Occupancy refinement (individual, group)
- Output: refined model, various maps, structure factors, complete statistics (files for COOT, O, PyMol)

7 Designed to be very easy to use
Refinement of individual coordinates and B-factors:
  % phenix.refine model.pdb data.hkl
Same as above plus water picking:
  % phenix.refine model.pdb data.hkl ordered_solvent=true
Run with parameter file:
  % phenix.refine model.pdb data.hkl parameter_file

  refinement.main {
    high_resolution = 2.0
    simulated_annealing = True
    ordered_solvent = True
    number_of_macro_cycles = 5
  }
  refinement.refine.adp {
    tls = chain A
    tls = chain B
  }
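The parameter-file layout on this slide can also be generated from a script. The following is a minimal Python sketch, assuming only the scope and parameter names shown on the slide; the helper names `format_params` and `build_command` are hypothetical and are not part of Phenix. It only builds the text and the argument list and does not run anything.

```python
from typing import List, Optional, Sequence, Tuple

# One scope = (scope name, ordered list of (key, value) assignments).
# A list of pairs is used instead of a dict because the slide repeats
# the key "tls" inside refinement.refine.adp.
Scope = Tuple[str, Sequence[Tuple[str, object]]]


def format_params(scopes: Sequence[Scope]) -> str:
    """Render scopes in the 'name { key = value }' layout shown on the slide."""
    lines: List[str] = []
    for name, assignments in scopes:
        lines.append(f"{name} {{")
        for key, value in assignments:
            lines.append(f"  {key} = {value}")
        lines.append("}")
    return "\n".join(lines) + "\n"


def build_command(model: str, data: str,
                  params_file: Optional[str] = None,
                  overrides: Sequence[str] = ()) -> List[str]:
    """Assemble the argument list: model, data, optional parameter file,
    then any key=value overrides given directly on the command line."""
    cmd = ["phenix.refine", model, data]
    if params_file is not None:
        cmd.append(params_file)
    cmd.extend(overrides)
    return cmd


# The parameter values shown on the slide.
SLIDE_PARAMS: List[Scope] = [
    ("refinement.main", [
        ("high_resolution", 2.0),
        ("simulated_annealing", True),
        ("ordered_solvent", True),
        ("number_of_macro_cycles", 5),
    ]),
    ("refinement.refine.adp", [
        ("tls", "chain A"),
        ("tls", "chain B"),
    ]),
]
```

Writing `format_params(SLIDE_PARAMS)` to a file (any file name is an arbitrary choice) and passing that name as `params_file` reproduces the third invocation on the slide.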

8 How to best make ends meet?
GSAS & proteins
- Extending a small-molecule powder program to deal with proteins
- Advantage: program designed for the field
  - Community used to inputs, outputs, idiosyncrasies
- Disadvantage: some approaches suitable for small molecules don’t scale
  - Direct-summation structure factor calculation
  - Neighborhood calculations (nonbonded interactions, a.k.a. anti-bumping restraints)
phenix.refine
- Extending a single-crystal protein program to deal with powders
- Advantage: program designed to deal with large structures
  - Protein, RNA/DNA restraint libraries, optimized algorithms
- Disadvantage: new data formats, differences in terminology
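To see why direct summation does not scale, it helps to write it out. The sketch below is a deliberately simplified illustration, not code from GSAS or cctbx: it assumes space group P1, fractional coordinates, a constant scattering factor per atom, and no displacement parameters. The function names are hypothetical.

```python
import cmath
import math
from typing import List, Sequence, Tuple

Site = Tuple[Tuple[float, float, float], float]  # (fractional xyz, scattering factor)


def structure_factor(hkl: Tuple[int, int, int], sites: Sequence[Site]) -> complex:
    """F(hkl) = sum_j f_j * exp(2*pi*i * (h*x_j + k*y_j + l*z_j)).

    Simplifying assumptions: P1, constant f_j, no ADPs.
    """
    h, k, l = hkl
    total = 0j
    for (x, y, z), f in sites:  # one term per atom
        phase = 2.0 * math.pi * (h * x + k * y + l * z)
        total += f * cmath.exp(1j * phase)
    return total


def direct_summation(miller_indices: Sequence[Tuple[int, int, int]],
                     sites: Sequence[Site]) -> List[complex]:
    """Evaluate every reflection against every atom."""
    return [structure_factor(hkl, sites) for hkl in miller_indices]
```

The nested loops cost (number of reflections) × (number of atoms) operations. Both counts grow with the size of the structure, which is why refinement programs aimed at proteins usually offer an FFT-based calculation as an alternative.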

9 Two main challenges
Challenge 1:
- Input/output of powder-specific format
  - Fundamentally trivial but potentially tedious
  - New command?
    - No interference with existing, non-trivial algorithms for automatic recognition, processing, and consolidation of already very heterogeneous inputs
  - Extend the existing input algorithms?
    - Nicer, but requires higher degree of collaboration
Challenge 2:
- Development of a powder-specific target function
  - Based on extracted intensities or primary pattern + pre-fitted profile parameters?
  - Maximum likelihood with or without cross-validation?
  - Will probably require some refactoring of the refinement engine
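As an illustration of the first option under Challenge 2 (extracted intensities), the following Python sketch shows what distinguishes a powder target from a single-crystal one: reflections that overlap in the pattern are compared as a group, not individually. This is a plain least-squares residual chosen for brevity, not a maximum-likelihood target and not anything implemented in phenix.refine; the function name and data layout are assumptions.

```python
from typing import List, Sequence, Tuple

# Each group lists the reflections contributing to one extracted intensity,
# as (index into f_calc_sq, multiplicity) pairs.
Group = Sequence[Tuple[int, int]]


def powder_ls_target(groups: Sequence[Group],
                     i_obs: Sequence[float],
                     f_calc_sq: Sequence[float]) -> Tuple[float, float]:
    """Return (normalized least-squares residual, overall scale factor).

    i_calc for a group is the multiplicity-weighted sum of |F_calc|^2 over
    all reflections overlapping in that group.
    """
    i_calc: List[float] = [
        sum(mult * f_calc_sq[idx] for idx, mult in group) for group in groups
    ]
    # Analytical least-squares scale between observed and calculated sums.
    numerator = sum(o * c for o, c in zip(i_obs, i_calc))
    denominator = sum(c * c for c in i_calc)
    scale = numerator / denominator if denominator > 0 else 0.0
    residual = sum((o - scale * c) ** 2 for o, c in zip(i_obs, i_calc))
    norm = sum(o * o for o in i_obs)
    return (residual / norm if norm > 0 else 0.0), scale
```

A target working on the primary pattern with pre-fitted profile parameters, the other option on the slide, would instead sum profile-weighted contributions at each measured step of the pattern.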

10 Modular design
Application level
- phenix wizards (data in, structure out)
- phenix.refine
- phenix.hyss (hybrid substructure search)
- Visible source
Library level
- cctbx project, organized in modules: libtbx, scitbx, cctbx, iotbx, mmtbx
- cctbx is intended to cover small-molecule work
  - But nothing yet specific to powders
- Unrestricted open source

11 Existing target functions
- Least-squares (variety)
- Maximum likelihood on amplitudes
- Maximum likelihood with experimental phases
- Least-squares twin target
- SAD-specific maximum likelihood target implemented in Phaser
  - Reusing target from external application!
Dirty laundry
- Severe code duplication in implementation of twin target
  - Needs to be consolidated
- Some friction integrating the Phaser ML-SAD target
  - Phaser target relatively slow: we need better bookkeeping to avoid repeated calculations with exactly the same input
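The bookkeeping mentioned in the last point amounts to caching: if the target is requested again with exactly the same input, return the stored result and skip the recomputation. A minimal Python sketch using the standard library follows. The function `expensive_target` and its formula are placeholders, not the Phaser ML-SAD target, and how phenix.refine actually handles this is not stated on the slide.

```python
from functools import lru_cache
from typing import Sequence, Tuple


@lru_cache(maxsize=128)
def expensive_target(f_calc: Tuple[float, ...], scale: float) -> float:
    """Placeholder for a slow target evaluation (hypothetical formula).

    Arguments must be hashable, hence the tuple; the cache key is the
    exact input, so only bit-identical inputs reuse a stored result.
    """
    return sum((scale * f) ** 2 for f in f_calc)


def target(f_calc: Sequence[float], scale: float) -> float:
    """Convert the input to a hashable form and go through the cache."""
    return expensive_target(tuple(f_calc), scale)
```

`expensive_target.cache_info()` reports hits and misses, which makes it easy to check how often identical inputs really occur before investing in more elaborate bookkeeping.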

12 Precedent for reusing cctbx?
- cctbx used heavily by all phenix collaborators
- Phaser uses cctbx -> cctbx supported by CCP4 6.0 and up
- smtbx: small-molecule toolbox
  - Group at Durham University, U.K., collaborating with David Watkin at Oxford University, U.K.
  - Long-term goal: highly integrated single-crystal structure determination (direct methods), automatic model building and refinement
  - Initial focus: iterative model building and refinement
  - Initial approach: reuse + adjust cctbx core libraries directly, combined with copying sub-modules to smtbx where they are modified
  - Long term: consolidate duplications as much as possible
    - half the code = half the bugs, reuse of optimizations

13 Summary of ideas
- Implement powder-specific target function(s) that plug into the refinement engine in the open source cctbx libraries
  - Can be done stand-alone using ad-hoc input/output methods
  - Collaborate in making the necessary adjustments to the existing libraries
- Figure out the best way to handle input/output at the application level
  - Learn and re-evaluate as we go
- If the powder field joins in there will be the potential for direct cross-fertilization between three specializations in crystallography
  - Single-crystal protein
  - Single-crystal small-molecule
  - Powder diffraction protein
  - More? (powder diffraction small-molecule) cctbx libraries are very general
- Ever increasing integration is the secret behind the stunning successes in the development of computing technology
  - Can we make this idea work in crystallography?

14 Availability
Phenix incl. Graphical User Interface
- http://www.phenix-online.org/
- Freely available to academic (non-profit) groups
Core libraries (cctbx)
- http://cctbx.sourceforge.net/
- Freely available to all

15 Acknowledgments
Phenix developers
- P.D. Adams
- P. Afonine
- T.R. Ioerger
- A.J. McCoy
- E.W. McKee
- N.W. Moriarty
- R.J. Read
- N.K. Sauter
- J.N. Smith
- L.C. Storoni
- T.C. Terwilliger
- P.H. Zwart
Funding:
- LBNL (DE-AC03-76SF00098)
- NIH/NIGMS (1P01GM063210)
- PHENIX Industrial Consortium

