From crystals to pdb: building a high throughput crystallography pipeline for structural genomics

Chiu HJ¹, Wolf G¹, West W², van den Bedem H¹, Miller MD¹, Zhang Z¹, Morse A², Wang X², Xu Q¹, Levin I¹, von Delft F³, Elsliger MA³, Godzik A², Grzechnik SK² and Deacon AM¹

¹ Stanford Synchrotron Radiation Laboratory, 2575 Sand Hill Road, Menlo Park, CA 94025
² University of California, San Diego, 9500 Gilman Drive, La Jolla, CA 92093
³ The Scripps Research Institute, 10550 N. Torrey Pines Rd., La Jolla, CA 92037

The Structure Determination Core (SDC) of the Joint Center for Structural Genomics (JCSG) is dedicated to developing technologies that streamline every step of the structure determination process, from crystals to PDB-ready atomic coordinates. Over the last year the JCSG production capacity has increased dramatically. SDC has screened more than 7000 crystals from 192 protein targets. A total of 232 datasets from 106 targets have been collected and 90 structures have been solved. To handle the rapidly growing flow of experimental data, we have developed a set of crystallographic and database tools to both track and streamline our workflow.

Crystal cassettes are shipped to SDC from the Crystallomics Core. All relevant crystal information is captured in the central JCSG database and is downloaded in a “Beamline Report”. Crystals are screened automatically using the Stanford Auto-Mounter and Blu-Ice software. The visual and diffraction properties of each crystal are recorded. A computer program, DISTIL, is under development to automatically analyze diffraction images and provide an objective screening evaluation for each crystal. The best crystals for each target are flagged for data collection. A computer program, Xsolve, is used for automatic crystallographic data processing and structure solution. A model building tool that provides crystallographers with the best possible initial model for refinement is under development.
The results of the analysis are uploaded to a Structure Solution Tracking System. A Refinement Tracking System requests weekly updates and collects all the data necessary for a peer-review Quality Control step before the coordinates are deposited in the Protein Data Bank.

The Joint Center for Structural Genomics
Mission: To establish a robust and scalable protein structure determination pipeline that will form the foundation for a large-scale, cost-effective production center for structural genomics.

Structural Genomics of Thermotoga maritima
T. maritima genome: a system to test the pipeline
  Small bacterial genome
  1877 gene products
  Proteins should express well in E. coli
  Proteins from a thermophile may be more stable
  Process entire genome
  Establish trends in process, e.g. crystallization

Category                      Number       %
Nucleic acid binding             170     9.2
DNA binding                      109     5.9
DNA repair                        11     0.5
DNA replication factor             3     0.1
Transcription factor              37     1.9
RNA binding                       43     2.3
Structural
Ribosomal protein                 52     2.8
Translation factor                12     0.6
Motor                              5     0.2
Enzyme                           600    32.4
Peptidase                         27     1.5
Protein Kinase                    17     0.9
Protein Phosphatase                8     0.4
Signal transducer                 32     1.7
Cell adhesion                      1     0.0
Structural Protein                61     3.3
Transporter                      202    10.9
Ion channel                        3     0.2
Ligand Binding or carrier        255    13.8
Electron transporter              52     2.8
Unknown or unclassified          713    38.5
Total                           1877    100%

[Figure: JCSG pipeline. Stages shown: Target Selection, HT Expression, HT Purification, HT Crystallization, HT Imaging, HT Data Collection, HT Structure Determination, Structure Validation & Deposition, Autosubmission of electronic publication, PDB. Development labels shown: 1st Generation Prototype, 1st Generation Hardware, 2nd Generation, 3rd Generation Software, 6th Generation Software.]
Data flow parallels the experimental pipeline, harvesting ~300 parameters from 19 stages.

HT Pipeline Processes, Bottlenecks and Leaks
[Figure: pipeline stages shown: target selection, cloning, expression, purification, crystallization, imaging, harvesting, bl xtal mounting, xtal screening, data collection, phasing, tracing, struc. refinement, struc. validation, annotation, publication.]

All relevant crystal information is captured in the central JCSG database in the form of a Beamline Report.
[Figure: Beamline Report. Fields shown: Target ID, Beamline, Crystallization condition, Visual properties, Diffraction properties (Resolution, Spot quality, Diffraction strength).]

Robust and automated crystal screening
Initial design to production
  Large-scale capacity
  Shipping, storage and screening
  Used by JCSG since June 2002
  Implemented on all SSRL beamlines
  Cassette kits distributed to PX user groups
Integration with BLU-ICE
  Automated sample mounting
  Automated sample alignment
  Automated diffraction images

Increased screening capacity during SSRL shutdown
  Leverage existing infrastructure
  X-ray MicroMax-002 generator installed June 2003
  SSRL automated screening system used
  >4200 crystals screened in 9 months
  All data uploaded to JCSG DB

Screening, collection and structure solution
  Work closely with BIC on implementation and debugging
  Still more features needed to handle expanding production

Structure solution tracking
  Local SDC “dataset” database
  Active crystal report

Xsolve: automation of structure determination
2004 developments
  Improve success rate: better autoindexing, determine optimal resolution for scaling sweeps
  More general: handle crystallographic details: re-indexing screw axes, merging sweeps
  More robust operation: catch timeouts, core dumps, infinite loops, etc.
  Implement parallelization: develop tools to monitor and control processing on a Linux cluster
  New program support: HKL2000, SHARP, SHELXD (not completely tested)
[Figure: Xsolve flow. Programs and stages shown: Mosflm (Autoindex, Integrate), Scala (Scale), Solve, Resolve (Trace). Solve trials shown: P422 with 1 mol, P422 with 2 mols, P4122 with 1 mol, P4122 with 2 mols, P4222 with 1 mol, P4222 with 2 mols.]
[Figure: processing stages shown: Autoindex, Integrate, Scale, Solve, Trace.]
Main goals
  Handle majority cases
  Organize data and workflow
  Ease information flow to JCSG DB
  Allow integration of new programs
  Use parallel execution of jobs

Refinement Tracking System

Automation of protein model completion: an inverse kinematics approach
Automatically Build Backbone Fragments:
  Build candidate closing conformations using IK techniques (robotics)
  Rank according to electron density fit and conformational likelihood
  Subject top-ranking candidates to real-space, torsion angle SA refinement
Results:
  Closed missing fragments of up to 12 residues in length to within 0.6 Å all-atom RMSD in a 2.8 Å model
Manually Finalizing Model:
  Labor intensive, time consuming
  Existing aids are highly interactive
Lotan et al., submitted
van den Bedem et al., in preparation

JCSG production statistics (August 10, 2004)
Total crystals screened at SDC    10778
  Unique targets represented      356
  TM/non-TM targets               299/57
Datasets collected                394 (288 TM, 106 non-TM)
  Unique targets represented      194
  TM/non-TM targets               146/48
Structures solved                 155 (94 MAD; 51 MR; 3 SAD; 7 NMR) (125 TM; 30 non-TM)

Can be searched by:
  Shipment ID
  Dewar
  Target ID
  Cassette/puck

Installation of a Microsource X-ray generator at 9-2

JCSG production statistics (August 10, 2004)
More to come…
  22 targets: data collected, not yet solved
  92 targets: diffraction better than 3.5 Å, not yet solved

Growing reliance on the JCSG DB
  500 crystals and 8 structures per month
  20 cassettes (2000 crystals) inventory
  30-40 structures in refinement
  2.0 TB of diffraction images
  0.5 TB of processing files
  >100,000 diffraction images
  Average resolution of structures in PDB: 2.0 Å
  Average protein chain length: 260 aa
  Average number of residues in asu: 480 aa

TSRI Administrative Core
Ian Wilson, Peter Kuhn, Marc Elsliger, Frank von Delft, Tina Montgomery, Gye Won Han, Rong Chen, Angela Walker

UCSD Bioinformatics Core
John Wooley, Adam Godzik, Susan Taylor, Slawomir Grzechnik, Bill West, Andrew Morse, Jie Quyang, Xianhong Wang, Jaume Canaves, Lukasz Jaroszewski, Robert Schwarzenbacher, Marc Robinson Rechavi, Chris Edwards, Olga Kirillova, Ray Bean, Josie Alaoen

Stanford/SSRL Structure Determination Core
Keith Hodgson, Ashley Deacon, Britt Hedman, Guenter Wolf, Mitch Miller, Henry van den Bedem, Qingping Xu, Herbert Axelrod, Christopher Rife, Inna Levin, R. Paul Phizackerley, Amanda Prado, John Kovarik, Ross Floyd, Irimpan Mathews, Michael Solits, Aina Cohen, Paul Ellis

GNF & TSRI Crystallomics Core
Ray Stevens, Scott Lesley, Rebbeca Page, Carina Grittini, Glen Spraggon, Andreas Kreusch, Michael DiDonato, Daniel McMullan, Heath Klock, Polat Abdubek, Eileen Ambing, Tanya Biorac, Joanna C. Hale, Justin Haugen, Mike Hornsby, Eric Koesema, Edward Nigoghossian, Kevin Quijano, Megan Wemmer, Aprilfawn White, Juli Vincent, Jeff Velasquez, Kin Moy, Vandana Sridhar, Bernard Collins, Thomas Clayton

Scientific Advisory Board
Carl-Ivar Brändén, Karolinska Inst., Stockholm (retired 2003)
Elbert Branscomb, DOE Joint Genome Inst., Walnut Creek
Stephen Cusack, EMBL – Outstation Grenoble
Leroy Hood, Inst. for Systems Biology, Seattle
John Kuriyan, U.C. Berkeley
Erkki Ruoslahti, The Burnham Institute
James Wells, Sunesis Pharmaceuticals, Inc.
Charles Cantor, Sequenom, Inc.
Todd Yeates, UCLA-DOE, Inst. for Genomics and Proteomics
James Paulson, Consortium for Functional Glycomics, The Scripps Research Institute

Exploratory Projects
Kurt Wüthrich (NMR), Linda Columbus, Touraj Etezady-Esfarjani, Wolfgang Peti, Virgil Woods (DXMS)

Acknowledgements
NIH Protein Structure Initiative Grant P50 GM62411
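
Illustrative sketch: parallel Solve trials
The Xsolve slides earlier in this transcript list Solve trials over three candidate space groups (P422, P4122, P4222) with one or two molecules, and name parallel execution and robustness to crashes among the goals. The Python sketch below illustrates that general pattern only. It is not Xsolve code and makes no claim about how Xsolve is implemented: every function name, the result dictionaries and the scores returned by dummy_solve are hypothetical stand-ins, and a real pipeline would launch the external crystallographic programs instead of looking up a number.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product

# Candidate space groups and molecule counts as listed on the Xsolve slide.
SPACE_GROUPS = ["P422", "P4122", "P4222"]
MOLECULE_COUNTS = [1, 2]


def enumerate_trials(space_groups, molecule_counts):
    """Return one trial description per (space group, molecule count) pair."""
    return [
        {"space_group": sg, "n_mol": n}
        for sg, n in product(space_groups, molecule_counts)
    ]


def run_trial(trial, solve):
    """Run one trial and record a failure instead of letting it propagate."""
    try:
        score = solve(trial["space_group"], trial["n_mol"])
        return {**trial, "score": score, "error": None}
    except Exception as exc:  # a failed trial must not stop the batch
        return {**trial, "score": None, "error": str(exc)}


def run_all(trials, solve, max_workers=4):
    """Run all trials concurrently; results keep the order of `trials`."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(lambda t: run_trial(t, solve), trials))


def best_trial(results):
    """Pick the highest-scoring successful trial, or None if all failed."""
    ok = [r for r in results if r["score"] is not None]
    return max(ok, key=lambda r: r["score"]) if ok else None


def dummy_solve(space_group, n_mol):
    """Hypothetical placeholder for a real structure-solution job.

    The scores are invented for illustration; one combination raises an
    exception to imitate a crashed job.
    """
    if space_group == "P4222" and n_mol == 2:
        raise RuntimeError("simulated crash")
    base = {"P422": 10.0, "P4122": 42.0, "P4222": 15.0}
    return base[space_group] + n_mol


if __name__ == "__main__":
    trials = enumerate_trials(SPACE_GROUPS, MOLECULE_COUNTS)
    results = run_all(trials, dummy_solve)
    for r in results:
        print(r)
    print("best:", best_trial(results))
```

Capturing the exception inside run_trial keeps one failed job from discarding the results of the others. Timeouts and process-level crashes such as core dumps would need subprocess-level handling, which this sketch leaves out.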

