A highly abbreviated introduction to proteomics

A typical shotgun proteomics experiment: collect tens of thousands of MS/MS spectra; can identify >1,000 proteins from cell lysate.

Orbi video: http://apps.thermoscientific.com/media/SID/LSMS/Video/webinar/orbitrap_elite/animation/

Shotgun proteomics identifies proteins from the fragmentation mass spectra of their constituent peptides. Figure panels: peptide fragmentation (b & y ions); actual peptide tandem (MS/MS) mass spectrum; idealized peptide tandem (MS/MS) mass spectrum from database; idealized peptide tandem (MS/MS) mass spectrum with PTM (phosphoserine). Marcotte (2007) Nature Biotechnology 25:755-757
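
The "idealized" spectrum on this slide can be sketched by summing residue masses from each end of the peptide. This is a minimal illustration, not the search engine's actual scoring: it assumes singly charged b and y ions only, standard monoisotopic residue masses, and a hypothetical `mods` argument for a phosphoserine shift.

```python
# Monoisotopic residue masses (Da) for the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
PROTON = 1.007276    # mass of a proton
WATER = 18.010565    # H2O retained by y ions
PHOSPHO = 79.966331  # HPO3 added by phosphorylation

def fragment_ions(peptide, mods=None):
    """Return (b_ions, y_ions) as lists of singly charged m/z values.

    mods is a hypothetical helper argument: it maps a 0-based residue
    index to a mass shift, e.g. {2: PHOSPHO} for a phosphoserine.
    """
    mods = mods or {}
    masses = [RESIDUE_MASS[aa] + mods.get(i, 0.0)
              for i, aa in enumerate(peptide)]
    b_ions, y_ions = [], []
    running = 0.0
    for m in masses[:-1]:            # b1 .. b(n-1), from the N-terminus
        running += m
        b_ions.append(running + PROTON)
    running = 0.0
    for m in reversed(masses[1:]):   # y1 .. y(n-1), from the C-terminus
        running += m
        y_ions.append(running + WATER + PROTON)
    return b_ions, y_ions
```

Calling `fragment_ions("GAS", {2: PHOSPHO})` shifts every y ion containing the serine by about 79.97 Da while leaving b1 and b2 unchanged, which is how a PTM shows up in the idealized spectrum.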

One common strategy for relative quantification = using isotopically labeled samples (e.g. 15N vs. 14N, 13C vs. 12C, etc.)
SILAC = stable isotope labeling with amino acids in cell culture
ICAT = isotope-coded affinity tags on cysteines
iTRAQ = isobaric labels on primary amines (peptide N-termini and lysine side chains); same total mass, different isotope distribution
AQUA = absolute quantification by spiking in isotopically shifted peptide standards for proteins of interest
Mallick & Kuster (2010) Nature Biotechnology 28:695-709
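
As a rough sketch of how labeled samples give relative quantities: each peptide is seen as a light/heavy pair, and a protein-level ratio can be summarized from its peptides. The median-of-log2-ratios summary and the input layout below are assumptions for illustration; the slide does not specify a particular procedure.

```python
import math
from statistics import median

def protein_log2_ratios(peptide_pairs):
    """Summarize heavy/light ratios per protein.

    peptide_pairs: iterable of (protein, light_intensity, heavy_intensity),
    a hypothetical input layout. Returns {protein: median log2(heavy/light)}.
    """
    per_protein = {}
    for protein, light, heavy in peptide_pairs:
        if light <= 0 or heavy <= 0:
            continue  # skip pairs where one channel was not detected
        per_protein.setdefault(protein, []).append(math.log2(heavy / light))
    return {p: median(vals) for p, vals in per_protein.items()}
```

Using the median rather than the mean keeps a single outlying peptide from dominating the protein-level estimate.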

Mass spectrometry strategies for measuring absolute protein abundances for hundreds to thousands of proteins. Adapted from Vogel & Marcotte (2009) Nature Biotechnology 27:825-826

& the current state-of-the-art … Each 100-200K peptides, from ~10,000 proteins spanning ~7 orders of magnitude in abundance

A highly abbreviated introduction to large-scale protein interaction screens

ATP synthase shown three ways: X-ray structure, schematic version, network representation (subunits α, β, γ, δ, ε, a, b2, c12). Total set = protein complex. Sum of direct + indirect interactions.
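
One way to read "sum of direct + indirect interactions" in the network picture: a complex is everything reachable from a member by following direct contacts. The sketch below makes that assumption explicit; the protein names in the usage note are hypothetical placeholders, not ATP synthase contacts.

```python
from collections import deque

def complex_members(direct_contacts, seed):
    """Return all proteins connected to `seed` through chains of direct contacts."""
    neighbors = {}
    for a, b in direct_contacts:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    seen = {seed}
    queue = deque([seed])
    while queue:                      # breadth-first walk over the contact graph
        current = queue.popleft()
        for nxt in neighbors.get(current, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen
```

With contacts `[("p1", "p2"), ("p2", "p3"), ("p4", "p5")]`, starting from `p1` yields `p1`, `p2`, and `p3`: `p3` is an indirect partner of `p1` but still part of the same assembly.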

High-throughput yeast two-hybrid. The bait is fused to a DNA-binding domain (DBD), which binds the operator or upstream activating sequence; the prey is fused to a transcription activation domain (Act). When bait and prey interact, Act is brought to the core transcription machinery and the reporter gene is transcribed.

High-throughput yeast two-hybrid: haploid yeast cells expressing activation domain-prey fusion proteins; diploid yeast probed with DNA-binding domain-Pcf11 bait fusion protein.

High-throughput complex mapping by mass spectrometry: tagged bait, affinity column, SDS-PAGE, then trypsin digest and identification of peptides by mass spectrometry (co-purifying proteins 1 to 6 in the figure).

493 bait proteins, 3,617 “interactions”

A variant: tandem affinity purification (TAP). The bait carries two tags (Tag1, Tag2): affinity column 1, + protease, affinity column 2, SDS-PAGE, then trypsin digest and identification of peptides by mass spectrometry (co-purifying proteins 1 to 6 in the figure).

Estimating accuracy with a well-determined reference set of interactions
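
A minimal sketch of this kind of benchmark, under stated assumptions: interactions are unordered protein pairs, and a predicted pair is only judged if both proteins appear somewhere in the reference set (otherwise the reference says nothing about it). The function name and this scoring convention are illustrative, not taken from the slides.

```python
def accuracy_vs_reference(predicted, reference):
    """Return (precision, recall) of predicted pairs against a reference set."""
    def normalize(pairs):
        # Unordered pairs, self-interactions dropped.
        return {frozenset(p) for p in pairs if p[0] != p[1]}

    pred = normalize(predicted)
    ref = normalize(reference)
    ref_proteins = set().union(*ref) if ref else set()
    # Only score predictions the reference is able to confirm or refute.
    judgeable = {p for p in pred if p <= ref_proteins}
    true_pos = judgeable & ref
    precision = len(true_pos) / len(judgeable) if judgeable else 0.0
    recall = len(true_pos) / len(ref) if ref else 0.0
    return precision, recall
```

Precision estimates how many reported interactions are real; recall estimates how much of the known map the screen recovers.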

Where we were, more or less, until recently in terms of PPI maps

The current state-of-the-art in animal PPI maps: ~3,500 affinity purification experiments; ~11K interactions / ~2.3K proteins; spans 556 complexes. Still daunting for the human proteome. Guruharsha et al. (2011) Cell 147, 690–703

Finding stable protein assemblies by native separations and quantitative mass spec: >2,000 biochemical fractions, including replicates; >9,000 hours mass spec machine time. Havugimana, Hart, et al., Cell (2012)

The profiles cover > ½ the experimentally verified proteome, & proteins within the same stable complexes co-elute. Havugimana, Hart, et al., Cell (2012)

Turning separations into complexes
1) One separation, #13 of many: ~5600 proteins, ~120 fractions. Example: co-separation of the exocyst complex (exoc1, exoc2, exoc3, exoc4, exoc5, exoc6, exoc7, exoc8) across fractions 59 to 64.
2) Pairwise protein correlations
2b) External data: co-expression, shared protein domains, much more (HumanNet); other AP-MS datasets (Guruharsha 2011, Malovannaya 2011)
3) Inferred interactions: high correlation, so more likely in the same complex. Machine learning (SVM, ensemble methods)
4) Inferred complexes (cluster)
Hurdle is false positives: since you’re searching the entire space of possible shared complex memberships, where true co-memberships are extremely sparse, high scorers are dominated by false positives. Must use external data to
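
The pairwise-correlation step can be sketched as a Pearson correlation between elution profiles, keeping only highly correlated pairs as candidates. This is a toy version under assumptions: the 0.9 threshold and the intensity values in the usage note are hypothetical, and the slides describe a machine-learning step with external data that this sketch leaves out.

```python
import math
from itertools import combinations

def pearson(x, y):
    """Pearson correlation of two equal-length profiles (0.0 if either is flat)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0
    return cov / (sd_x * sd_y)

def correlated_pairs(profiles, threshold=0.9):
    """profiles: {protein: [intensity per fraction]}. Returns (p, q, r) candidates."""
    hits = []
    for p, q in combinations(sorted(profiles), 2):
        r = pearson(profiles[p], profiles[q])
        if r >= threshold:
            hits.append((p, q, r))
    return hits
```

For example, profiles `{"exoc1": [0, 1, 5, 9, 4, 0], "exoc2": [0, 2, 10, 18, 8, 0], "other": [9, 4, 1, 0, 0, 0]}` yield a single candidate pair, exoc1 and exoc2, because they peak in the same fractions. With thousands of proteins the number of pairs grows quadratically, which is why correlation alone produces many false positives.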

Guiding and testing the reconstruction with known complexes. Havugimana, Hart, et al., Cell (2012)

A reference map of human protein complexes: 13,998 high-confidence physical interactions / 3,011 proteins. Defines >600 complexes: >100 heterodimers, >500 with ≥3 components. Havugimana, Hart, et al., Cell (2012)

In yeast, phenotypes reflect biological modules, e.g., lethality is tied not to the protein, but to the molecular machine. Figure examples: small nucleolar ribonucleoprotein complex; SAGA transcription factor/chromatin remodeling complex; TAFIID complex; protein phosphatase 2A complex. Legend: essential gene, nonessential gene. Hart, Lee, & Marcotte, BMC Bioinformatics 8:236 (2007)

The human protein complexes are also strongly enriched for genes linked to the same diseases and phenotypes. Havugimana, Hart, et al., Cell (2012)
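
One common way to put a number on "enriched" is a hypergeometric tail probability: given how many genes in the whole map are linked to a disease, how surprising is it that a complex contains this many of them? The slides do not say which statistic was used, so this is only an assumed illustration with hypothetical parameter names.

```python
from math import comb

def hypergeom_enrichment_p(population, annotated, complex_size, overlap):
    """P(at least `overlap` annotated genes in a random set of `complex_size`).

    population: genes in the map; annotated: genes linked to the disease;
    complex_size: genes in the complex; overlap: disease genes in the complex.
    """
    total = comb(population, complex_size)
    upper = min(annotated, complex_size)
    tail = sum(
        comb(annotated, k) * comb(population - annotated, complex_size - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total
```

A small value means the overlap is unlikely by chance; when many complexes and diseases are tested, the resulting p-values need a multiple-testing correction.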

The complexes are strongly enriched for genes linked to the same diseases, e.g., as for Cornelia de Lange Syndrome. Now confirmed by Deardorff et al., Am. J. Hum. Genet. 90, 1014–1027. Image credits: prweb.com; Dermatology Online Journal 7(2): 8

Our current state of the art animal complex map. Cuihong Wan, Blake Borgeson, w/ Andrew Emili’s lab

Our current state of the art animal complex map. Extending the map: now 7 animals, >65 separations, nearly 7,000 mass spec experiments. Figure labels: >3,500 fractions, ~12,000 proteins; >3,000 fractions, ~9,000 proteins. Cuihong Wan, Blake Borgeson, w/ Andrew Emili’s lab