HTP Construct Optimization using Bioinformatics Coupled with Amide Hydrogen Deuterium Exchange (DXMS) and HTP NMR screening Yuanpeng (Janet) Huang Northeast.

Presentation transcript:

HTP Construct Optimization using Bioinformatics Coupled with Amide Hydrogen Deuterium Exchange (DXMS) and HTP NMR Screening. Yuanpeng (Janet) Huang, Northeast Structural Genomics Consortium (NESG), Rutgers University

PSI-1 NESG-RUTGERS HUMAN PROTEIN PRODUCTION in E. coli ( ). Total PDB/Cloned Targets = 1.4%. Cloned Target: 905; Analytical Expressed & Soluble: 215; Purification: 224; Aggregation: 33; Crystal Trial: 63; NMR Screening: 51; PDB (X-ray, NMR): 6, 7.

PSI-2 BIOMEDICAL THEME: Human Cancer Protein Interaction Network (HCPIN). Huang, et al. (2008) Targeting the human cancer pathway protein interaction network by structural genomics. Molecular & Cellular Proteomics 7:

EFFORTS TO IMPROVE PSI-2 HUMAN PROTEIN PRODUCTIVITY. Target Selection: select proteins validated by SwissProt; exclude proteins annotated or predicted to be secreted or transmembrane (TM); gene synthesis and RT-PCR. Construct Design and Optimization: identify disordered regions by DisMeta prediction; DXMS analysis.

SG of extracellular & membrane-bound HCPIN targets -- Chiang, Rossi, Gurla, Montelione, & Anderson, in preparation.

EFFORTS TO IMPROVE PSI-2 HUMAN PROTEIN PRODUCTIVITY. Target Selection: select proteins validated by SwissProt; exclude proteins annotated or predicted to be secreted or transmembrane (TM); gene synthesis and RT-PCR. Construct Optimization: identify disordered regions by DisMeta prediction; DXMS analysis.

SOME PARTIALLY DISORDERED PROTEIN STRUCTURES SOLVED BY NESG. Disordered regions interfere with structure determination efforts. Identify disordered regions with: DisMeta (Disorder Prediction MetaServer); DXMS (1H/2H exchange mass spectrometry).

www-nmr.cabm.rutgers.edu/bioinformatics/disorder

[Figure: Disorder Prediction MetaServer output for SyR11. Panels: Secondary Structure Prediction; Disorder Prediction Server Results; Summary of Disorder Predictions (residue 50 marked). Comparison: full length, truncated, difference.]
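A minimal sketch of how a metaserver summary of this kind could be computed. The slide does not say how DisMeta combines its predictors, so the per-residue vote below is an assumption, and the predictor names, the 'D'/'-' string encoding, and the function names are hypothetical.

```python
from typing import Dict, List, Tuple

def consensus_disorder(predictions: Dict[str, str], min_votes: int) -> str:
    """Per-residue vote: 'D' where at least min_votes predictors call disorder.

    Each prediction is a string with one character per residue,
    'D' for disordered and '-' for ordered (hypothetical encoding).
    """
    lengths = {len(p) for p in predictions.values()}
    if len(lengths) != 1:
        raise ValueError("all predictions must cover the same sequence length")
    n = lengths.pop()
    mask = []
    for i in range(n):
        votes = sum(1 for p in predictions.values() if p[i] == "D")
        mask.append("D" if votes >= min_votes else "-")
    return "".join(mask)

def disordered_segments(mask: str, min_len: int = 1) -> List[Tuple[int, int]]:
    """Return 1-based inclusive (start, end) runs of 'D' of at least min_len."""
    segments = []
    start = None
    for i, c in enumerate(mask, start=1):
        if c == "D" and start is None:
            start = i
        elif c != "D" and start is not None:
            if i - start >= min_len:
                segments.append((start, i - 1))
            start = None
    if start is not None and len(mask) - start + 1 >= min_len:
        segments.append((start, len(mask)))
    return segments

# Toy input with three made-up predictors over an 8-residue sequence.
toy = {"pred_a": "DDD---DD", "pred_b": "DD----DD", "pred_c": "D-----D-"}
mask = consensus_disorder(toy, min_votes=2)
segments = disordered_segments(mask)
```

The resulting segments are the kind of input a construct designer would use to decide where a terminus can be trimmed.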

H/D Exchange MS: Concept. Workflow labels: H2O, D2O, Quench (pH ~2.5, -80°C) (on ice), Digest, LC-MS. The peptide mass shift, read from the centroid of each peak, depends on the D2O exposure duration. Sharma, et al. Construct optimization for protein NMR structure analysis using amide hydrogen/deuterium exchange mass spectrometry, Proteins 2009 (in press). H. Zheng
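The centroid comparison can be sketched as follows. This is an illustration under stated assumptions, not the analysis used in the cited work: the intensity-weighted centroid is the standard definition, but the function names, the toy peak lists, and the normalization by a count of exchangeable amides (with no back-exchange correction) are hypothetical.

```python
from typing import Sequence

def centroid(mz: Sequence[float], intensity: Sequence[float]) -> float:
    """Intensity-weighted mean m/z of an isotopic envelope."""
    total = sum(intensity)
    if total <= 0:
        raise ValueError("total intensity must be positive")
    return sum(m * i for m, i in zip(mz, intensity)) / total

def deuterium_uptake(centroid_d2o: float, centroid_h2o: float, charge: int) -> float:
    """Mass shift in Da: the m/z shift multiplied by the peptide charge."""
    return (centroid_d2o - centroid_h2o) * charge

def percent_deuteration(uptake: float, exchangeable_amides: int) -> float:
    """Uptake as a percentage of the exchangeable backbone amides.

    Assumes each exchanged amide adds about 1 Da and ignores back-exchange.
    """
    return 100.0 * uptake / exchangeable_amides

# Toy envelopes for one doubly charged peptide (made-up numbers).
c_h2o = centroid([500.0, 500.5, 501.0], [1.0, 2.0, 1.0])
c_d2o = centroid([501.0, 501.5, 502.0], [1.0, 2.0, 1.0])
uptake = deuterium_uptake(c_d2o, c_h2o, charge=2)
level = percent_deuteration(uptake, exchangeable_amides=8)
```

Repeating the calculation at several D2O exposure times gives one uptake curve per peptide; peptides that exchange quickly are candidates for disordered regions.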

[Figure: WR33 DXMS Analysis. Legend: <= 25%, > 25%, > 50%, > 60%, > 70%, > 80%. Structures shown with N and C termini labeled: C. elegans WR33 (NESG; R. Tejero, J. Aramini); mouse homologue (Kobayashi N., et al.); human homologue HR387 (NESG; J. Aramini).]

PROTEIN PRODUCTION PROTOCOL. Target -> HTP Construct Design (DisMeta Prediction) -> Multiple Alternative Constructs -> Protein Production -> NMR Screening / Xtal Screening; Construct Optimization by DXMS/DisMeta.

HTP Human Protein Construct Design. 1. Select target domains: PDB hit regions (<80% seq. id) and PFAM domains. 2. Find multiple target regions (DisMeta prediction). 3. Propose alternative constructs. Total number of constructs (domain) ≈ # of target regions × # of alternative constructs. Diagram labels: pdb hits (>80%), pdb hits (<80%), PFAM.

Propose Alternative Constructs. Identify multiple target regions (TRs) for each target domain: all TRs longer than 50 aa that cover at least 80% of the target domain. Adjust each TR for disordered and helix/strand regions. For each TR (S, E), propose 1-4 alternative constructs from (S-5, S) × (E, E+5); remove those that intersect a helix/strand; adjust N/C ends (-2:2). Total number of constructs (domain) ≈ # of target regions × # of alternative constructs. Diagram labels: target region, target domain, disordered, helix.
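The (S-5, S) × (E, E+5) enumeration can be sketched as below. This is one reading of the slide rather than the NESG implementation: it assumes that "intersect with helix/strand" means a construct terminus would cut through a predicted helix or strand, it clips candidates to the sequence bounds, and it leaves out the final (-2:2) end adjustment. All names are hypothetical.

```python
from itertools import product
from typing import List, Tuple

Interval = Tuple[int, int]  # 1-based inclusive residue range

def start_cuts(start: int, elements: List[Interval]) -> bool:
    """True if starting here drops the first residue(s) of a helix/strand."""
    return any(a < start <= b for a, b in elements)

def end_cuts(end: int, elements: List[Interval]) -> bool:
    """True if ending here drops the last residue(s) of a helix/strand."""
    return any(a <= end < b for a, b in elements)

def propose_constructs(tr: Interval, seq_len: int,
                       elements: List[Interval], pad: int = 5) -> List[Interval]:
    """Enumerate (S-pad, S) x (E, E+pad) and drop boundary-cutting candidates."""
    s, e = tr
    starts = sorted({max(1, s - pad), s})
    ends = sorted({e, min(seq_len, e + pad)})
    constructs = []
    for start, end in product(starts, ends):
        if start_cuts(start, elements) or end_cuts(end, elements):
            continue
        constructs.append((start, end))
    return constructs

# Toy target region 20-100 in a 150-residue protein (made-up numbers).
unrestricted = propose_constructs((20, 100), 150, elements=[])
restricted = propose_constructs((20, 100), 150, elements=[(12, 18), (102, 110)])
```

With no nearby secondary structure all four candidates survive; with a helix on either side only the unpadded construct remains, which is why each target region yields between one and four constructs.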


NMR SALVAGE PROTOCOL

ER C tag C tag C tag C tag C tag

ER Full length Full length Salvage using DisMeta predictions Failed Competition BjR38

ER553 SaR32 VpR68 LkR15 Salvage using DX-MS results – 90 Micro-probe

PROGRESS ON PSI-2 RUTGERS HUMAN PROTEIN PRODUCTION ( PRESENT). Total PDB/Cloned Targets (Constructs) = 3% (1.5%). Cloned Target (Construct): 367 (734); Analytical Expressed & Soluble: 216; Purification: 132; Aggregation: 74; Crystal Trial: 49; NMR Screening (Good, Promising): 55 (11, 12); PDB (X-ray, NMR): 7, 4.
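The headline percentages appear to be PDB depositions (X-ray plus NMR) divided by the number of cloned targets or constructs. That formula is an inference from the slide numbers, not something the slides state; the short check below reproduces the quoted figures for both PSI-1 and PSI-2 under that assumption.

```python
def success_rate(pdb_xray: int, pdb_nmr: int, cloned: int) -> float:
    """PDB depositions as a percentage of cloned targets (or constructs)."""
    return 100.0 * (pdb_xray + pdb_nmr) / cloned

# Counts copied from the PSI-1 and PSI-2 progress slides.
psi1_targets = success_rate(6, 7, 905)      # about 1.4%
psi2_targets = success_rate(7, 4, 367)      # about 3%
psi2_constructs = success_rate(7, 4, 734)   # about 1.5%
```

On a per-target basis the PSI-2 rate is roughly double the PSI-1 rate, while on a per-construct basis the two are similar, since PSI-2 cloned about two constructs per target.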


HTP robotic NMR micro cryoprobe screening using microgram quantities of protein. Setup & Run: protein samples in microtubes are assessed and scored prior to loading the automatic sample changer; B600 with samples loaded for data collection; 1D proton spectrum with water suppression to assess signal-to-noise and foldedness of the protein; 2D NH-HSQC spectrum to evaluate the feasibility of structure determination. Data Archival: archival of the raw data, along with the spectral images, quality scores and stability, into the SPINE database. Tools: Virtual 96-well SPINE-based Tools; Bruker Icon-NMR. GVT Swapna

HTP Buffer Optimization. Proteins with good HSQC: precipitation (button testing). Robotic screening using 12 buffers varying pH, NaCl, arginine, acetonitrile, Zn, Ca, and detergent, among others. Outcomes: clear, cloudy, precipitated.

3D Structure Determination using microgram quantities of protein. 1 mm micro probe and 1.7 mm micro cryoprobe: twice the time using 1/20th of the sample. It is now routinely used in the NESG HTP structure production pipeline for low-yield eukaryotic proteins. Aramini et al. (2007) Microgram-scale protein structure determination by NMR. Nature Methods 4:491-3

SUMMARY. The current construct optimization protocol focuses on identifying disordered regions. DisMeta: very fast and no-cost; an HTP construct design protocol has been developed together with DisMeta. DXMS: more reliable, providing experimental evidence on disordered regions; useful for identifying disordered regions when the prediction is not satisfactory; automated analysis of DXMS data is under development. Human protein production in E. coli is improved. NMR micro cryoprobe: efficient and cost-effective HTP robotic NMR screening; NMR structure determination becomes feasible for human proteins with low expression and low solubility.

ACKNOWLEDGEMENTS. Salvage by DXMS: Will Buchwald, Asli Ertekin, Seema Sharma, Haiyan Zheng, Peter Lobel. Bioinformatics: John Everett, Jessica Locke, Binchen Mao, Sai Tong. Protein Production: Thomas Acton, Li-Chung Ma, Ritu Shastry, GVT Swapna, Rong Xiao, Li Zhao. Gaetano T. Montelione