Proteomics Informatics – Protein characterization: post-translational modifications and protein-protein interactions (Week 10)

Slides:



Advertisements
Similar presentations
Genomes and Proteomes genome: complete set of genetic information in organism gene sequence contains recipe for making proteins (genotype) proteome: complete.
Advertisements

Protein Quantitation II: Multiple Reaction Monitoring
Proteomics Informatics – Protein characterization I: post-translational modifications (Week 10)
UC Mass Spectrometry Facility & Protein Characterization for Proteomics Core Proteomics Capabilities: Examples of Protein ID and Analysis of Modified Proteins.
MN-B-C 2 Analysis of High Dimensional (-omics) Data Kay Hofmann – Protein Evolution Group Week 5: Proteomics.
Novel labeling technologies on proteins
Quantification of low-abundance proteins in complexes and in total cell lysates by mass spectrometry Bastienne Jaccard and Manfredo Quadroni Université.
Data Processing Algorithms for Analysis of High Resolution MSMS Spectra of Peptides with Complex Patterns of Posttranslational Modifications Shenheng Guan.
PROTEOMICS LECTURE. Genomics DNA (Gene) Functional Genomics TranscriptomicsRNA Proteomics PROTEIN Metabolomics METABOLITE Transcription Translation Enzymatic.
20-30% of a trypsinised proteome are constituted of peptides with Mw≥3000 (TReP) Identification of large peptides by shotgun MS is not efficient Isolation.
Lawrence Hunter, Ph.D. Director, Computational Bioscience Program University of Colorado School of Medicine
Proteomics Informatics – Protein identification II: search engines and protein sequence databases (Week 5)
Proteomics Informatics Workshop Part I: Protein Identification
Previous Lecture: Regression and Correlation
Proteomics Informatics (BMSC-GA 4437) Course Director David Fenyö Contact information
Peptide Ion Fragmentation Technologies CID = collision induced dissociation CAD = collision activated disassociation.
Analysis of tandem mass spectra - II Prof. William Stafford Noble GENOME 541 Intro to Computational Molecular Biology.
Proteomics Informatics (BMSC-GA 4437) Course Director David Fenyö Contact information
Proteomics Informatics Workshop Part III: Protein Quantitation
Fa 05CSE182 CSE182-L9 Mass Spectrometry Quantitation and other applications.
Proteomics Informatics Workshop Part II: Protein Characterization David Fenyö February 18, 2011 Top-down/bottom-up proteomics Post-translational modifications.
Proteome.
Proteomics Informatics – Protein Characterization II:
A highly abbreviated introduction to proteomics
-The methods section of the course covers chapters 21 and 22, not chapters 20 and 21 -Paper discussion on Tuesday - assignment due at the start of class.
Production of polypeptides, Da, and middle-down analysis by LC-MSMS Catherine Fenselau 1, Joseph Cannon 1, Nathan Edwards 2, Karen Lohnes 1,
Abstract An important goal of the AfCS Protein Chemistry Laboratory is the analysis of ligand-induced changes in protein phosphorylation. Stable Isotope.
Phosphoproteomics and motif mining Martin Miller Ph.d. student CBS DTU
Proteomics Informatics – Protein Characterization II: Protein Interactions (Week 11)
Common parameters At the beginning one need to set up the parameters.
Outline Selection of candidate proteins for the multiplex analysis of DBS via targeted proteomics The currently employed strategies for the selection.
Proteome and interactome Bioinformatics.
Quantification of Membrane and Membrane- Bound Proteins in Normal and Malignant Breast Cancer Cells Isolated from the Same Patient with Primary Breast.
Lecture 9. Functional Genomics at the Protein Level: Proteomics.
Genome of the week - Enterococcus faecalis E. faecalis - urinary tract infections, bacteremia, endocarditis. Organism sequenced is vancomycin resistant.
Genomics II: The Proteome Using high-throughput methods to identify proteins and to understand their function.
Proteomics What is it? How is it done? Are there different kinds? Why would you want to do it (what can it tell you)?
CSE182 CSE182-L11 Protein sequencing and Mass Spectrometry.
Multiple flavors of mass analyzers Single MS (peptide fingerprinting): Identifies m/z of peptide only Peptide id’d by comparison to database, of predicted.
Overview of Mass Spectrometry
Research questions Regulation of eukaryotic signal transduction: focussing on protein phosphorylation and protein-protein interactions (in the context.
Isotope Labeled Internal Standards in Skyline
Proteomics Informatics (BMSC-GA 4437) Instructor David Fenyö Contact information
Salamanca, March 16th 2010 Participants: Laboratori de Proteomica-HUVH Servicio de Proteómica-CNB-CSIC Participants: Laboratori de Proteomica-HUVH Servicio.
ISOMATCH-web For automatic matching of isotope peak distributions ■ Automatic matching of a raw spectrum (ASCII format) to theoretical isotopic distributions.
Oct 2011 SDMBT1 Lecture 11 Some quantitation methods with LC-MS a.ICAT b.iTRAQ c.Proteolytic 18 O labelling d.SILAC e.AQUA f.Label Free quantitation.
Deducing protein composition from complex protein preparations by MALDI without peptide separation.. TP #419 Kenneth C. Parker SimulTof Corporation, Sudbury,
Proteomics Informatics (BMSC-GA 4437) Course Directors David Fenyö Kelly Ruggles Beatrix Ueberheide Contact information
Protein quantitation I: Overview (Week 5). Fractionation Digestion LC-MS Lysis MS Sample i Protein j Peptide k Proteomic Bioinformatics – Quantitation.
Quantitation using Pseudo-Isobaric Tags (QuPIT) and Quantitation using Pseudo-isobaric Amino acids in Cell culture (QuPAC) Parimal Samir Andrew J. Link.
Ho-Tak Lau, Hyong Won Suh, Martin Golkowski, and Shao-En Ong
Molecular techniques for the study of the interaction between……..
Protein/Peptide Quantification
Proteomics Informatics David Fenyő
A perspective on proteomics in cell biology
Rat mesangial α-endosulfine
Proteomics Informatics –
Volume 6, Issue 6, Pages (December 2000)
2D-LC-MS/MS analysis of tryptic digest of HEK293-SUMO3 cells (2 μg inj
Example of MS/MS spectrum of peptide FPTLTGFNR (hypothetical protein with signal peptide EAK88888; N77) from a protein digestion mixture prepared by labeling.
Serine ADP-Ribosylation Depends on HPF1
Targeted Proteomic Study of the Cyclin-Cdk Module
Volume 64, Issue 3, Pages (November 2016)
Methods for the Elucidation of Protein-Small Molecule Interactions
Tryptic phosphopeptides of 32P-labeled IGFBP-5 from T47D cells separated by HPLC and sequenced by tandem MS.a, tandem MS spectrum of the triply charged.
Volume 61, Issue 2, Pages (January 2016)
Proteomics Informatics David Fenyő
Identification of Post Translational Modifications
Volume 43, Issue 3, Pages (August 2011)
Presentation transcript:

Proteomics Informatics – Protein characterization: post-translational modifications and protein-protein interactions (Week 10)

Top down / bottom up Top down Bottom up mass/charge intensity

Top down Bottom up Charge distribution mass/charge intensity mass/charge intensity

Top down Bottom up Isotope distribution mass/charge intensity mass/charge intensity

Fragmentation Top downBottom up Fragmentation

Correlations between modifications Top down Bottom up

Alternative Splicing Top down Bottom up Exon 123

Top down Kellie et al., Molecular BioSystems 2010 Protein mass spectra Fragment mass spectra

Protein Complexes A B A C D Digestion Mass spectrometry

Sowa et al., Cell 2009 Protein Complexes – specific/non-specific binding

Choi et al., Nature Methods 2010

Tackett et al. JPR 2005 Protein Complexes – specific/non-specific binding

Analysis of Non-Covalent Protein Complexes Taverner et al., Acc Chem Res 2008

Non-Covalent Protein Complexes Schreiber et al., Nature 2011

More / better quality interactions Affinity Capture Optimization Screen + Cell extraction Lysate clearance/ Batch Binding Binding/Washing/Eluting SDS-PAGE Filtration LaCava, Hakhverdyan, Domanski, Rout

Over 20 different extraction and washing conditions ~ 10 years or art. (41 pullouts are shown) Molecular Architecture of the NPC Actual model Alber F. et al. Nature (450) Alber F. et al. Nature (450)

Cloning nanobodies for GFP pullouts Atypical heavy chain-only IgG antibody produced in camelid family – retain high affinity for antigen without light chain Aimed to clone individual single-domain VHH antibodies against GFP – only ~15 kDa, can be recombinantly expressed, used as bait for pullouts, etc. To identify full repertoire, will identify GFP binders through combination of high-throughput DNA sequencing and mass spectrometry VHH clone for recombinant expression

Cloning llamabodies for GFP pullouts Llama GFP immunization Lymphocyte total RNA Crude serum VHH amplicon 454 DNA sequencing RT / Nested PCR IgG fractionation & GFP affinity purification VHH DNA sequence library GFP-specific VHH fraction LC-MS/MS GFP-specific VHH clones Bone marrow aspiration Serum bleed bp No. of Reads Read length (bp) VHVH VHH Fridy, Li, Keegan, Chait, Rout

CDR3: 100.0% (14/14); combined CDR: 100.0% (33/33); DNA count: 10 MAQVQLVESGGGLVQAGGSLRLSCVASGRTFSGYAMGWFRQTPGREREAVAAITWSAHSTYYSDSVKDRFTISIDNTRNTGYLQMNSLKPEDTAVYYCTVRHGTWFTTSRYWTDWGQGTQVTVS CDR3: 100.0% (14/14); combined CDR: 72.7% (24/33; DNA count: 1 MADVQLVESGGGLVQSGGSRTLSCAASGRVLATYHLGWFRQSPGREREAVAAITWSAHSTYYSDSVKGRFTISIDNARNTGYLQMNSLKPEDTAVYYCTVRHGTWFTVSRYWTDWGQGTQVTVS CDR3: 100.0% (14/14); combined CDR: 72.7% (24/33); DNA count: 1 MAQVQLVESGGALVQAGASLSVSCAASGGTISKYNMAWFRRAPGREREAVAAITWSAHSTYYSDSVKDRFTISIDNTRNTGYLQMNSLKPEDTAVYYCTVRHGTWFTTSRYWTDWGQGTQVTVS CDR3: 100.0% (14/14); combined CDR: 42.4% (14/33); DNA count: 1 MAQVQLEESGGGLVQAGDSLTLSCSASGRTFTNYAMAWSRQAPGKERELLAAIDAAGGATYYSDSVKGRFTISIDNTRNTGYLQMNSLKPEDTAVYYCTVRHGTWFTTSRYWTDWGQGTQVTVS CDR3: 100.0% (14/14); combined CDR: 42.4% (14/33); DNA count: 1 MAQVQLVESGGGRVQAGGSLTLSCVGSEGIFWNHVMGWFRQSPGKDREFVARISKIGGTTNYADSVKGRFTISIDNTRNTGYLQMNSLKPEDTAVYYCTVRHGTWFTTSRYWTDWGQGTQVTVS CDR1CDR2 CDR3 Underlined regions are covered by MS Rank sequences according to: CDR3 coverage;Overall coverage; Combined CDR coverage;DNA counts; Identifying full-length sequences from peptides

Sequence diversity of 26 verified anti-GFP nanobodies Of ~200 positive sequence hits, 44 high confidence clones were synthesized and tested for expression and GFP binding: 26 were confirmed GFP binders. Sequences have characteristic conserved VHH residues, but significant diversity in CDR regions. FR1 CDR1FR2 CDR2CDR3FR3FR4

HIV-1 gp120 Lipid Bilayer gp41 MA CA NC PR IN RT RNA Particle Genome env rev vpu tat nef 3’ LTR 5’ LTR vif gag pol vpr CAMANC p6 PRRTIN gp41gp120 9,200 nucleotides

Genetic-Proteomic Approach Tagged Viral Protein Tag Protein Complex SDS-PAGE * Mass Spectrometry

I-Dirt for Specific Interaction 3xFLAG Tagged HIV-1WT HIV-1 Infection LightHeavy ( 13 C labeled Lys, Arg) 1:1 Mix Immunoisolation MS I-DIRT = Isotopic Differentiation of Interactions as Random or Targeted LysArg (+6 daltons) Modified from Tackett AJ et al., J Proteome Res. (2005) 4,

IDIRT and Reverse IDIRT Env-3xFLAG Vif-3xFLAG Luo, Jacobs, Greco, Cristae, Muesing, Chait, Rout

Protein Exchange Vif-3F Heavy labeled Vif-3F lysate IP in heavy labeled Vif-3F lysate Vif-3F Light labeled wt lysate Incubation with light labeled wt lysate Vif-3F 15min Vif-3F 5min Stable Interactor Vif-3F Interactor with fast exchange 60min

Env Time Course SILAC Differentially labeled infection harvested at early or late stage of infection Distinguish proteins that interact with Env at early or late stage during infection Early during infectionLate during infection Light Heavy ( 13 C labeled Lys, Arg) 1:1 Mix Immunoisolation MS Early interactor Late interactor

M/Z Peptides Fragments Fragmentation Proteolytic Peptides Enzymatic Digestion Protein Complex Chemical Cross-Linking MS MS/MS Isolation Cross-Linked Protein Complex Interaction Partners by Chemical Cross-Linking

M/Z Peptides Fragments Fragmentation Proteolytic Peptides Enzymatic Digestion Protein Complex Chemical Cross-Linking MS MS/MS Isolation Cross-Linked Protein Complex Interaction Sites by Chemical Cross-Linking

Cross-linking protein n peptides with reactive groups (n-1)n/2 potential ways to cross-link peptides pairwise + many additional uninformative forms Protein A + IgG heavy chain 990 possible peptide pairs Yeast NPC ˜ 10 6 possible peptide pairs

Protein Crosslinking by Formaldehyde ~1% w/v Fal 20 – 60 min ~0.3% w/v Fal 5 – 20 min 1/100 the volume LaCava

Protein Crosslinking by Formaldehyde RED: triplicate experiments, FAl treated grindate BLACK: duplicated experiments, FAl treated cells (then ground) SCORE: Log Ion Current / Log protein abundance Akgöl, LaCava, Rout

Cross-linking Mass spectrometers have a limited dynamic range and it therefore important to limit the number of possible reactions not to dilute the cross-linked peptides. For identification of a cross-linked peptide pair, both peptides have to be sufficiently long and required to give informative fragmentation. High mass accuracy MS/MS is recommended because the spectrum will be a mixture of fragment ions from two peptides. Because the cross-linked peptides are often large, CAD is not ideal, but instead ETD is recommended.

Phosphopeptide identification m precursor = 2000 Da  m precursor = 1 Da  m fragment = 0.5 Da Phosphorylation Localization of modifications

Localization (d min =3) m precursor = 2000 Da  m precursor = 1 Da  m fragment = 0.5 Da Phosphorylation d min >=3 for 47% of human tryptic peptides Localization of modifications

Localization (d min =2) m precursor = 2000 Da  m precursor = 1 Da  m fragment = 0.5 Da Phosphorylation d min =2 for 33% of human tryptic peptides Localization of modifications

Localization (d min =1) m precursor = 2000 Da  m precursor = 1 Da  m fragment = 0.5 Da Phosphorylation d min =1 for 20% of human tryptic peptides Localization of modifications

Localization (d=1*) m precursor = 2000 Da  m precursor = 1 Da  m fragment = 0.5 Da Phosphorylation Localization of modifications

Peptide with two possible modification sites Localization of modifications

Peptide with two possible modification sites MS/MS spectrum m/z Intensity Localization of modifications

Peptide with two possible modification sites MS/MS spectrum m/z Intensity Matching Localization of modifications

Peptide with two possible modification sites MS/MS spectrum m/z Intensity Matching Which assignment does the data support? 1, 1 or 2, or 1 and 2? Localization of modifications

AAYYQK Visualization of evidence for localization AAYYQK

Visualization of evidence for localization

Estimation of global false localization rate using decoy sites By counting how many times the phosphorylation is localized to amino acids that can not be phosphorylated we can estimate the false localization rate as a function of amino acid frequency. Amino acid frequency False localization frequency Y

How much can we trust a single localization assignment? If we can generate the distribution of scores for assignment 1 when 2 is the correct assignment, it is possible to estimate the probability of obtaining a certain score by chance for a given peptide sequence and MS/MS spectrum assignment.

Is it a mixture or not? If we can generate the distribution of scores for assignment 2 when 1 is the correct assignment, it is possible to estimate the probability of obtaining a certain score by chance for a given peptide sequence and MS/MS spectrum assignment.

1 and or 2 Ø Localization of modifications

Proteomics Informatics – Protein characterization: post-translational modifications and protein-protein interactions (Week 10)