Bulk Model Construction and Molecular Replacement in CCP4 Automation
Ronan Keegan, Norman Stein, Martyn Winn


Overview
Brute-force search for the best model for Molecular Replacement (MR) on a target structure.
Python script that uses HPC resources; it can also run on a single machine.
Two main parts:
– Model generation using a variety of methods.
– Feeding a selection of the best models into an MR program.
User input requirements: target sequence and associated MTZ file.


Process Target Information
Calculate the molecular weight.
Estimate the number of molecules in the asymmetric unit (a.s.u.).
Parse the MTZ file for any relevant parameters.
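The first two steps on this slide can be illustrated with a minimal sketch. It is not taken from the program itself: it assumes standard average residue masses, a Matthews-coefficient estimate with a typical target value of about 2.5 Å³/Da, and that the cell dimensions and the number of symmetry operators have already been read from the MTZ file. All function names and default values are hypothetical.

```python
import math

# Average residue masses in Da (residue = amino acid minus one water).
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER_MASS = 18.01528


def molecular_weight(sequence):
    """Approximate molecular weight (Da) of a one-letter protein sequence.

    Raises KeyError for non-standard residue codes.
    """
    seq = "".join(sequence.split()).upper()
    return sum(RESIDUE_MASS[aa] for aa in seq) + WATER_MASS


def cell_volume(a, b, c, alpha, beta, gamma):
    """Unit cell volume in cubic Angstroms (lengths in Angstroms, angles in degrees)."""
    ca, cb, cg = (math.cos(math.radians(x)) for x in (alpha, beta, gamma))
    return a * b * c * math.sqrt(1 - ca * ca - cb * cb - cg * cg + 2 * ca * cb * cg)


def estimate_copies(volume, n_symops, mol_weight, target_vm=2.5, max_copies=20):
    """Pick the number of molecules per a.s.u. whose Matthews coefficient
    Vm = V / (n_symops * n * MW) lies closest to target_vm (an assumed value)."""
    return min(
        range(1, max_copies + 1),
        key=lambda n: abs(volume / (n_symops * n * mol_weight) - target_vm),
    )
```

For example, `estimate_copies(cell_volume(50, 60, 70, 90, 90, 90), 4, molecular_weight(seq))` would give an estimate for a target sequence `seq` in an orthorhombic cell with four symmetry operators.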

Searching for Homologous Structures
Using the target sequence, the program consults services based at the EBI for homologous structures, based on sequence matching (OCA).
The top match from the sequence-based search is then used for a secondary-structure-based search using the MSDfold/SSM web service.
Using the results of the above searches, the program also consults PQS at the EBI for any related multimeric structures.
As an additional option, the top hits from the search can be aligned using Superpose to construct an ensemble of models for use at the Molecular Replacement stage.
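The order of the searches described on this slide can be sketched as below. This is only an illustration of the control flow: the three search functions are passed in by the caller and are hypothetical stand-ins for the OCA, MSDfold/SSM and PQS lookups, whose real interfaces are not shown in the slides. The `ensemble_size` parameter is likewise an assumption.

```python
def collect_search_models(target_sequence, sequence_search, fold_search,
                          multimer_lookup, ensemble_size=3):
    """Gather candidate search models in the order the slide describes.

    sequence_search(sequence) -> list of hit identifiers, best first
    fold_search(hit)          -> list of structurally similar hit identifiers
    multimer_lookup(hit)      -> list of related multimer identifiers
    All three callables are hypothetical placeholders for the web services.
    """
    seq_hits = list(sequence_search(target_sequence))
    if not seq_hits:
        return {"hits": [], "multimers": [], "ensemble": []}

    # The top sequence match seeds the secondary-structure-based search.
    fold_hits = list(fold_search(seq_hits[0]))

    # Merge both hit lists, keeping the first occurrence of each identifier.
    hits = []
    for hit in seq_hits + fold_hits:
        if hit not in hits:
            hits.append(hit)

    # Look up related multimeric structures for every hit found so far.
    multimers = [m for hit in hits for m in multimer_lookup(hit)]

    # The top hits are the candidates for superposition into an ensemble.
    return {"hits": hits, "multimers": multimers, "ensemble": hits[:ensemble_size]}
```

Passing the services in as functions keeps the sketch independent of any particular web service and makes it easy to try with canned results.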

Model Construction
Once the search stage has been completed, all of the associated PDB structure files are retrieved. These are then manipulated in several different ways to create a range of possible models:
– 1) PDB clipping (Pdbcur, Pdbset, Coord_format):
  Waters and hydrogens are removed.
  Anomalies in the structure file, such as empty fields, are corrected (e.g. missing chain identifiers).
  The most probable conformations are selected.
  Individual chains are extracted.
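The clipping step above is carried out in the pipeline with CCP4 programs (Pdbcur, Pdbset, Coord_format). As a rough illustration of what such a clean-up involves, here is a pure-Python sketch that works directly on fixed-column PDB records. It makes simplifying assumptions that are not from the slides: alternate conformations are resolved by keeping only altLoc blank or "A" rather than by occupancy, hydrogens are detected with a simple heuristic, and a blank chain identifier is replaced by a default letter. The function names are hypothetical.

```python
from collections import defaultdict

WATER_NAMES = {"HOH", "WAT", "H2O", "DOD"}


def is_hydrogen(line):
    """Heuristic hydrogen test: use the element column if present,
    otherwise fall back to the atom name."""
    element = line[76:78].strip().upper()
    if element:
        return element in ("H", "D")
    return line[12:16].strip().lstrip("0123456789").startswith("H")


def clip_pdb(lines, default_chain="A"):
    """Return {chain_id: [cleaned ATOM/HETATM records]} for a PDB file.

    Removes waters and hydrogens, keeps a single alternate conformation
    (altLoc blank or 'A', an assumed simplification), fills in missing
    chain identifiers and splits the model into individual chains.
    """
    chains = defaultdict(list)
    for raw in lines:
        if not raw.startswith(("ATOM", "HETATM")):
            continue
        line = raw.rstrip("\n").ljust(80)
        if line[17:20].strip() in WATER_NAMES:   # columns 18-20: residue name
            continue
        if is_hydrogen(line):
            continue
        if line[16] not in (" ", "A"):           # column 17: altLoc
            continue
        chain_id = line[21] if line[21] != " " else default_chain  # column 22
        cleaned = line[:16] + " " + line[17:21] + chain_id + line[22:]
        chains[chain_id].append(cleaned.rstrip())
    return dict(chains)
```

Each value in the returned dictionary can then be written out as a separate single-chain model file.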

Model Construction
– 2) Molrep
  Uses its own sequence alignment to prune the side chains.
  Side chains are stripped to their lowest common parts.
– 3) Chainsaw (Norman Stein)
  Input sequence alignment is used to strip side chains.
  More severe pruning than Molrep: “mixed model”.
  Can be given many possible alignments to create different models from the same structure.
  Can use sophisticated sequence alignment methods such as PSI-Blast and FFAS.
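The idea of alignment-driven side-chain pruning can be sketched as follows. This is not the actual algorithm of Molrep or Chainsaw; it assumes one simple rule for illustration: residues identical in the alignment are kept whole, mismatched residues are cut back to the backbone plus CB, and model residues aligned to a gap in the target are deleted. The data layout (a list of one-letter code and atom-name pairs) is also hypothetical.

```python
BACKBONE = {"N", "CA", "C", "O"}


def prune_by_alignment(target_aln, model_aln, model_residues):
    """Prune a model's side chains according to a pairwise alignment.

    target_aln, model_aln: gapped one-letter sequences of equal length.
    model_residues: list of (one_letter_code, [atom names]) in model order.
    Returns the pruned list of (one_letter_code, [kept atom names]).
    """
    if len(target_aln) != len(model_aln):
        raise ValueError("aligned sequences must have equal length")

    residues = iter(model_residues)
    pruned = []
    for t, m in zip(target_aln, model_aln):
        if m == "-":
            continue                      # target insertion: no model atoms here
        code, atoms = next(residues)
        if code != m:
            raise ValueError("alignment does not match model residues")
        if t == "-":
            continue                      # model residue absent from target: delete
        if t == m:
            kept = list(atoms)            # conserved: keep the full side chain
        else:
            kept = [a for a in atoms if a in BACKBONE or a == "CB"]
        pruned.append((code, kept))
    return pruned
```

Running the same structure through this function with several different alignments yields several different models, which is the point made on the slide about supplying many possible alignments.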

Molecular Replacement
A cluster or HPC resource spawns multiple MR jobs, each taking one of the constructed models along with the target structure data.
Phaser, Amore or Molrep can be used to do the MR. Phaser is used for the ensemble of top hits.
If and when the MR program fits the model structure to the target data, the resulting PDB file is processed using Refmac to assess whether it is likely to refine.
Results are then provided to the user for all of the top-scoring models.
The user can retrieve the refined structures along with any of the associated log files.
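On a single machine, the "one job per model" pattern described here can be sketched with a local worker pool. The sketch below is an assumption-laden illustration, not the program's scheduler: `run_one_job` is a hypothetical caller-supplied function that would wrap the chosen MR program and the Refmac check, and ranking by a free R-factor field named `r_free` is an assumed scoring choice.

```python
from concurrent.futures import ThreadPoolExecutor


def run_mr_jobs(models, run_one_job, max_workers=4):
    """Run one MR job per search model in parallel and rank the outcomes.

    run_one_job(model) is a hypothetical callable that returns a dict with
    at least 'model' and 'r_free' keys, or None if no solution was found.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        outcomes = list(pool.map(run_one_job, models))

    # Drop failed jobs and list the most promising (lowest r_free) first.
    solved = [o for o in outcomes if o is not None]
    return sorted(solved, key=lambda o: o["r_free"])
```

Threads are sufficient here because each job would spend its time waiting on an external program; on a cluster the same loop would submit to a batch queue instead.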

e-HTPX
Jobs can be submitted via the e-HTPX portal to the Daresbury e-HTPX computational resources (cluster or Condor pool) or, if the user has a Grid certificate, to the UK National Grid resources.
Users can monitor the job results as they are produced via a web page hosted on the e-HTPX server machine, and they are notified when their job is complete.
Refined structure files are made available to the user for download upon completion.
First external user as of a couple of days ago!

JCSG Targets
N.B. good homologues available.
Currently working through more challenging examples …

Other points
The program can also be run on a single machine in a scaled-down fashion.
It can be run from the command line.
It is easy to swap out Phaser and run Amore, Molrep or another MR program instead.
Modularised: model construction can be run on its own.
Other model-generating methods can easily be inserted.

Future Plans
Make it smarter and quicker.
Use better sequence alignment methods such as PSI-Blast and FFAS.
Use Norman’s Chainsaw program as an extra model creation method.
Incorporate Norman’s Amore wrapper.
Integrate it into Graeme’s XIA project: make use of scheduler code wrappers and provide a Model Generation module for XIA-MR.