Lecture 1.11 High Throughput Methods in Proteomics
David Wishart
University of Alberta, Edmonton, AB
Lecture 1.12 Proteomics
Proteomics employs an incredibly diverse range of technologies including:
–molecular biology
–chromatography
–electrophoresis
–mass spectrometry
–X-ray crystallography
–NMR spectroscopy
–microscopy
–computational biology
Lecture 1.13 Proteomics Tools
Molecular Biology Tools
Separation & Display Tools
Protein Identification Tools
Protein Structure Tools
Lecture 1.14 Molecular Biology Tools
Northern/Southern Blotting
Differential Display
RNAi (RNA interference)
Serial Analysis of Gene Expression (SAGE)
DNA Microarrays or Gene Chips
Yeast two-hybrid analysis
Immuno-precipitation/pull-down
GFP Tagging & Microscopy
Lecture 1.15 SAGE
Principle is to convert every mRNA molecule into a short (10-14 base), unique tag
Equivalent to reducing all the people in a city to a telephone book of surnames
After the tags are created, they are concatenated into a long “list”
The list is read with a DNA sequencer and compared to a database to identify genes or proteins and their frequency
Lecture 1.16 SAGE Tools
Lecture 1.17 SAGE
Convert mRNA to dsDNA
Digest with NlaIII
Split into 2 aliquots
Attach linkers
Lecture 1.18 SAGE
Linkers carry PCR primer and tagging endonuclease (TE) sites
Cut with the TE BsmFI
Mix both aliquots
Blunt-end ligate to make a “ditag”
Concatenate & sequence
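The tagging and counting steps above can be mimicked in silico. The following is a minimal sketch, assuming the tag is the 10 bases immediately 3' of the last CATG (NlaIII) site in each transcript; the function names and the example transcripts are hypothetical and only illustrate the idea.

```python
from collections import Counter

ANCHOR_SITE = "CATG"  # NlaIII recognition sequence
TAG_LENGTH = 10       # assumed tag length (the slides give 10-14 bases)


def extract_tag(cdna, tag_length=TAG_LENGTH):
    """Return the tag downstream of the 3'-most NlaIII site, or None."""
    seq = cdna.upper()
    pos = seq.rfind(ANCHOR_SITE)
    if pos == -1:
        return None  # transcript has no NlaIII site, so it yields no tag
    start = pos + len(ANCHOR_SITE)
    tag = seq[start:start + tag_length]
    # Discard tags that run off the end of the transcript.
    return tag if len(tag) == tag_length else None


def count_tags(cdnas):
    """Count how often each tag occurs across a collection of transcripts."""
    tags = (extract_tag(c) for c in cdnas)
    return Counter(t for t in tags if t is not None)


# Hypothetical transcripts: the first two share a tag, the third has no site.
transcripts = [
    "GGCATGAAACCCGGGTTTAA",
    "TTCATGAAACCCGGGTCCCC",
    "ACGTACGTACGT",
]
tag_counts = count_tags(transcripts)
```

In a real experiment the tag counts would then be matched against a tag-to-gene database; that lookup is omitted here.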
Lecture 1.19 SAGE of Yeast Chromosome
Lecture DNA Microarrays
Principle is to analyze gene (mRNA) or protein expression through large-scale, non-radioactive Northern (RNA) or Southern (DNA) hybridization analysis
The brighter the spot, the more DNA
Microarrays are like Velcro: chips made of DNA fragments attached to a substrate
Requires a robotic arraying device and a fluorescence microarray reader
Lecture Gene Chip Tools
Lecture DNA Microarrays
Lecture DNA Microarray
Lecture Microarrays & Spot Colour
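Spot brightness and colour are usually turned into numbers before analysis. The sketch below assumes a two-channel experiment (sample and reference intensities per spot), which the slides do not spell out; the function names, the intensity floor and the 2-fold threshold are all hypothetical choices.

```python
import math


def log2_ratio(sample, reference, floor=1.0):
    """Log2 of sample/reference intensity, with a floor to avoid log(0)."""
    s = max(sample, floor)
    r = max(reference, floor)
    return math.log2(s / r)


def call_spot(sample, reference, threshold=1.0):
    """Label a spot as up, down or unchanged (threshold 1.0 = 2-fold)."""
    ratio = log2_ratio(sample, reference)
    if ratio >= threshold:
        return "up"
    if ratio <= -threshold:
        return "down"
    return "unchanged"


# Hypothetical spot intensities: (sample channel, reference channel)
spots = {"geneA": (800.0, 200.0), "geneB": (100.0, 400.0), "geneC": (300.0, 300.0)}
calls = {name: call_spot(s, r) for name, (s, r) in spots.items()}
```

Real pipelines also subtract background and normalize between channels; those steps are left out of this sketch.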
Lecture Microarray Analysis Examples
Brain 67,679
Heart 9,400
Liver 37,807
Colon 4,832
Prostate 7,971
Skin 3,043
Bone 4,832
Lung 20,224
[Figure panels: Brain, Lung, Liver, Liver Tumor]
Lecture Microarray Software
Lecture Yeast Two-Hybrid Analysis
Yeast two-hybrid experiments yield information on protein-protein interactions
X and Y are two proteins of interest, one fused to the GAL4 binding domain and the other to the GAL4 activation domain
If X & Y interact, then the reporter gene is expressed
Lecture Invitrogen Yeast 2-Hybrid
[Figure labels: LexA, X, Y, B42, lacZ]
Lecture Example of 2-Hybrid Analysis
Uetz P. et al., “A Comprehensive Analysis of Protein-Protein Interactions in Saccharomyces cerevisiae” Nature 403: (2000)
High Throughput Yeast 2 Hybrid Analysis
957 putative interactions
1004 of 6000 predicted proteins involved
Lecture Example of 2-Hybrid Analysis
Rain JC. et al., “The protein-protein interaction map of Helicobacter pylori” Nature 409: (2001)
High Throughput Yeast 2 Hybrid Analysis
261 H. pylori proteins scanned against genome
>1200 putative interactions identified
Connects >45% of the H. pylori proteome
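Results like these are naturally stored as an interaction network. A minimal sketch, assuming each hit is recorded as a (bait, prey) pair; the protein names and proteome size below are hypothetical, and `coverage` is simply the fraction of the proteome appearing in at least one interaction (for the yeast figures quoted above, 1004/6000 is about 17%).

```python
from collections import defaultdict


def build_network(pairs):
    """Build an undirected interaction map from (bait, prey) pairs."""
    network = defaultdict(set)
    for bait, prey in pairs:
        network[bait].add(prey)
        network[prey].add(bait)
    return dict(network)


def coverage(network, proteome_size):
    """Fraction of the proteome that takes part in at least one interaction."""
    return len(network) / proteome_size


# Hypothetical two-hybrid hits; ("D", "D") is a self-interaction.
hits = [("A", "B"), ("B", "C"), ("D", "D")]
network = build_network(hits)
fraction = coverage(network, proteome_size=8)
```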
Lecture Another Way?
Ho Y, Gruhler A, et al. Systematic identification of protein complexes in Saccharomyces cerevisiae by mass spectrometry. Nature 415: (2002)
High Throughput Mass Spectral Protein Complex Identification (HMS-PCI)
10% of yeast proteins used as “bait”
3617 associated proteins identified
3-fold higher sensitivity than yeast 2-hybrid
Lecture Affinity Pull-down
Lecture Molecular Biology Tools
Northern/Southern Blotting
Differential Display
RNAi (RNA interference)
Serial Analysis of Gene Expression (SAGE)
DNA Microarrays or Gene Chips
Yeast two-hybrid analysis
Immuno-precipitation/pull-down
GFP Tagging & Microscopy
Lecture Yeast Protein Localization Huh, K et al., Nature, 425: (2003)
Lecture Yeast Proteome Localized
Used 6234 yeast strains expressing full-length, chromosomally tagged green fluorescent protein (GFP) fusion proteins
Measured localization by fluorescence microscopy
Localized 75% of the yeast proteome into 22 distinct subcellular localization categories
Provided localization information for 70% of previously unlocalized proteins
Lecture Different Cellular Zones
Lecture GFP Tagging the Yeast Proteome
Lecture Fluorescence Microscopy
[Figure panels: Nucleus, Nuclear Periphery, Endoplasmic Reticulum, Bud Neck, Mitochondria, Lipid Particles]
Lecture Confirmation by Co-localization (GFP/RFP merging)
Lecture Results
Lecture Proteomics Tools
Molecular Biology Tools
Separation & Display Tools
Protein Identification Tools
Protein Structure Tools
Lecture Separation & Display Tools
1D Slab Gel Electrophoresis
2D Gel Electrophoresis
Capillary Electrophoresis
HPLC (SEC, IEC, RP, Affinity, etc.)
Protein Chips
Lecture SDS PAGE
Lecture SDS PAGE Tools
Lecture Isoelectric Focusing (IEF)
Lecture Isoelectric Focusing
Separation on basis of pI, not Mw
Requires much higher voltages
Requires a much longer period of time
IPG (Immobilized pH Gradient)
Typically done in strips or tubes (to facilitate 2D gel work)
Uses ampholytes to establish pH gradient
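Because IEF separates on pI, it helps to see how a pI can be estimated from sequence. The sketch below sums Henderson-Hasselbalch charges for the ionizable groups and bisects for the pH of zero net charge. The pKa values are approximate textbook numbers chosen for illustration (an assumption, not taken from the slides), and the function names are hypothetical.

```python
# Approximate side-chain and terminal pKa values (assumed, textbook-style).
POSITIVE_PKA = {"K": 10.5, "R": 12.5, "H": 6.0}
NEGATIVE_PKA = {"D": 3.9, "E": 4.1, "C": 8.3, "Y": 10.1}
N_TERM_PKA = 9.0
C_TERM_PKA = 2.0


def net_charge(sequence, ph):
    """Estimated net charge of a peptide at a given pH."""
    charge = 1.0 / (1.0 + 10 ** (ph - N_TERM_PKA))    # N-terminus (+)
    charge -= 1.0 / (1.0 + 10 ** (C_TERM_PKA - ph))   # C-terminus (-)
    for residue in sequence.upper():
        if residue in POSITIVE_PKA:
            charge += 1.0 / (1.0 + 10 ** (ph - POSITIVE_PKA[residue]))
        elif residue in NEGATIVE_PKA:
            charge -= 1.0 / (1.0 + 10 ** (NEGATIVE_PKA[residue] - ph))
    return charge


def isoelectric_point(sequence, tolerance=0.001):
    """Bisect between pH 0 and 14 for the pH where the net charge is zero."""
    low, high = 0.0, 14.0
    while high - low > tolerance:
        mid = (low + high) / 2.0
        if net_charge(sequence, mid) > 0:
            low = mid    # still positive, so the pI lies at higher pH
        else:
            high = mid
    return (low + high) / 2.0


# Hypothetical peptides: a basic one and an acidic one.
pi_basic = isoelectric_point("KKKK")
pi_acidic = isoelectric_point("DDDD")
```

Bisection works here because the estimated net charge falls steadily as pH rises. Different pKa tables shift the result by a few tenths of a pH unit.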
Lecture 2D Gel Principles SDS PAGE IEF
Lecture Advantages and Disadvantages (2D Gels)
Advantages
–Provides a hard-copy record of separation
–Allows facile quantitation
–Separation of up to 9000 different proteins
–Highly reproducible
–Gives info on Mw, pI and post-translational modifications
–Inexpensive
Disadvantages
–Limited pI range (4-8)
–Proteins >150 kD not seen in 2D gels
–Difficult to see membrane proteins (>30% of all proteins)
–Only detects high abundance proteins (top 30% typically)
–Time consuming
Lecture 2D Gel Software
Lecture Capillary Electrophoresis
Lecture Capillary Electrophoresis
Capillary Zone Electrophoresis (CZE)
–Separates on basis of m/z ratio
Capillary Gel Electrophoresis (CGE)
–Separates by MW and m/z ratio
Capillary Isoelectric Focusing (CIEF)
–Separates on basis of pI
2-Dimensional Electrophoresis (2D-CE)
–Separates using tandem CE methods
Lecture Chromatography
Size Exclusion (size)
Reverse Phase (hydrophobicity)
Ion Exchange (charge)
Normal Phase (TLC)
Affinity (ligand)
HIC (hydrophobicity)
2D Chromatography
Lecture Ciphergen Protein Chips
Lecture Ciphergen Protein Chips
Hydrophobic (C8) Arrays
Hydrophilic (SiO2) Arrays
Anion exchange Arrays
Cation exchange Arrays
Immobilized Metal Affinity (NTA, nitrilotriacetic acid) Arrays
Epoxy Surface (amine and thiol binding) Arrays
Lecture Ciphergen Protein Chips Normal Tumor
Lecture Protein Arrays
Lecture Different Kinds of Protein Arrays
Antibody Array
Antigen Array
Ligand Array
Detection by: SELDI MS, fluorescence, SPR, electrochemical, radioactivity, microcantilever
Lecture Protein (Antigen) Chips
[Figure labels: His6, GST, ORF, Nickel coating]
H Zhu, J Klemic, S Chang, P Bertone, A Casamayor, K Klemic, D Smith, M Gerstein, M Reed, & M Snyder (2000). Analysis of yeast protein kinases using protein chips. Nature Genetics 26:
Lecture Protein (Antigen) Chips Nickel coating
Lecture Arraying Process
Lecture Probe with anti-GST Mab Nickel coating
Lecture Anti-GST Probe
Lecture Probe with Cy3-labeled Calmodulin Nickel coating
Lecture “Functional” Protein Array Nickel coating
Lecture Proteomics Tools
Molecular Biology Tools
Separation & Display Tools
Protein Identification Tools
Protein Structure Tools
Lecture Microsequencing Electro-blotting
Lecture Edman Sequencing
Lecture Microsequencing
Generates sequence info from N terminus
Commonly done on low picomole amounts of protein (5-50 ng)
Newer techniques allow sequencing at the femtomole level (100 pg)
Up to 20 residues can be read
Allows unambiguous protein ID for 8+ AA
Relatively slow, modestly expensive
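The mass and mole figures above are linked by the protein's molecular weight: nanograms divided by kilodaltons gives picomoles. A small sketch, assuming a hypothetical 10 kDa protein (the slides do not state which molecular weight they have in mind):

```python
def ng_to_pmol(mass_ng, mw_kda):
    """Convert a protein mass in ng to pmol (1 ng of a 1 kDa protein = 1 pmol)."""
    return mass_ng / mw_kda


MW_KDA = 10.0  # hypothetical protein size

low_pmol = ng_to_pmol(5.0, MW_KDA)         # lower end of the 5-50 ng range
high_pmol = ng_to_pmol(50.0, MW_KDA)       # upper end of the 5-50 ng range
fmol = ng_to_pmol(0.1, MW_KDA) * 1000.0    # 100 pg expressed in fmol
```

For a larger protein the same mass corresponds to proportionally fewer moles, so the quoted ranges are only order-of-magnitude guides.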
Lecture Protein ID by MS and 2D gel
Lecture Protein ID by MS and 2D gel
Requires gel spots to be cut out (tedious)
Ideal for high throughput (up to 500 samples per day)
Allows modifications to be detected
MS allows protein identification by:
–Intact protein molecular weight
–Peptide fingerprint molecular weights
–Sequencing through MS/MS
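Identification by peptide fingerprint molecular weights can be sketched as an in-silico digest followed by mass matching. The example below assumes trypsin (cleaves after K or R, but not before P), neutral monoisotopic peptide masses and a simple count-the-matches score; the function names, the tolerance and the two-protein database are hypothetical, and real tools such as those listed in these slides use more elaborate scoring.

```python
# Monoisotopic residue masses in Da.
MONO_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # added once per peptide for the free termini


def tryptic_digest(sequence):
    """Cleave after K or R unless the next residue is P (no missed cleavages)."""
    seq = sequence.upper()
    peptides, start = [], 0
    for i, residue in enumerate(seq):
        next_residue = seq[i + 1] if i + 1 < len(seq) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(seq[start:i + 1])
            start = i + 1
    if start < len(seq):
        peptides.append(seq[start:])  # C-terminal peptide
    return peptides


def peptide_mass(peptide):
    """Neutral monoisotopic mass of a peptide."""
    return sum(MONO_MASS[r] for r in peptide) + WATER


def count_matches(observed, sequence, tolerance=0.2):
    """Number of observed masses within tolerance of any theoretical peptide."""
    theoretical = [peptide_mass(p) for p in tryptic_digest(sequence)]
    return sum(
        1 for m in observed if any(abs(m - t) <= tolerance for t in theoretical)
    )


def rank_candidates(observed, database, tolerance=0.2):
    """Rank database proteins by how many observed masses they explain."""
    scores = {
        name: count_matches(observed, seq, tolerance)
        for name, seq in database.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical database and a "measured" fingerprint derived from protA.
database = {"protA": "AAKGGRPLLKMM", "protB": "WWWKYYYR"}
observed = [peptide_mass(p) for p in tryptic_digest(database["protA"])]
ranking = rank_candidates(observed, database)
```

Measured spectra usually report protonated ions rather than neutral masses, so a real comparison would first convert between the two.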
Lecture Protein ID Protocol
Lecture Typical Results
401 spots identified
279 gene products
Confirmed by SAGE, Northern or Southern
Confirmed by amino acid composition
Confirmed by amino acid sequencing
Confirmed by MW & pI
Lecture MS Analysis Software
Protein Prospector
MS-Fit
Mowse
PeptideSearch
PROWL
Lecture Proteomics Tools
Molecular Biology Tools
Separation & Display Tools
Protein Identification Tools
Protein Structure Tools
Lecture Protein Structure Initiative 30 seq 35,000 proteins 10,000 subset 30% ID or 30 seq Solve by 2010 $20,000/Structure
Lecture Structure Determination NMR X-ray
Lecture X-ray Crystallography
[Figure label: FT]
Lecture NMR Spectroscopy
[Figure label: FT]
Lecture Structure Determination
Lecture Bottlenecks
X-ray
–Producing enough protein for trials
–Crystallization time and effort
–Crystal quality, stability and size control
–Finding isomorphous derivatives
–Chain tracing & checking
NMR
–Producing enough labeled protein for collection
–Sample “conditioning”
–Size of protein
–Assignment process is slow and error prone
–Measuring NOEs is slow and error prone
Lecture Protein Expression
Lecture Robotic Crystallization
Lecture Synchrotron Light Source
Lecture MAD & X-ray Crystallography
MAD (Multiwavelength Anomalous Dispersion)
Requires synchrotron beam lines
Requires protein with multiple scattering centres (selenomethionine labeled)
Allows rapid phasing
Proteins can now be “solved” in just 1-2 days
Lecture High Throughput NMR
Higher magnetic fields (from 400 MHz to 900 MHz)
Higher dimensionality (from 2D to 3D to 4D)
New pulse sequences (TROSY, CBCANNH)
Improved sensitivity
New parameters (dipolar coupling, cross relaxation)
Lecture Automated Structure Generation
Lecture NMR & Structural Proteomics Proc. Natl. Acad. Sci. USA, Vol. 99, , 2002
Lecture Auto-comparative Modeling
ACDEFGHIKLMNPQRST--FGHQWERT-----TYREWYEGHADS
ASDEYAHLRILDPQRSTVAYAYE--KSFAPPGSFKWEYEAHADS
MCDEYAHIRLMNPERSTVAGGHQWERT----GSFKEWYAAHADD
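Comparative modeling starts from how similar a target is to a template, so a percent-identity calculation on the aligned sequences above is a natural first step. A minimal sketch, assuming identity is measured over columns where neither sequence has a gap (other conventions divide by the full alignment length or the shorter sequence); the function name is hypothetical.

```python
def percent_identity(a, b):
    """Percent identical residues over columns where neither sequence is gapped."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have the same length")
    aligned = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not aligned:
        return 0.0
    matches = sum(1 for x, y in aligned if x == y)
    return 100.0 * matches / len(aligned)


# The three aligned sequences shown on the slide.
alignment = [
    "ACDEFGHIKLMNPQRST--FGHQWERT-----TYREWYEGHADS",
    "ASDEYAHLRILDPQRSTVAYAYE--KSFAPPGSFKWEYEAHADS",
    "MCDEYAHIRLMNPERSTVAGGHQWERT----GSFKEWYAAHADD",
]

# Pairwise identities, keyed by the (row, row) indices of the alignment.
pairwise = {
    (i, j): percent_identity(alignment[i], alignment[j])
    for i in range(len(alignment))
    for j in range(i + 1, len(alignment))
}
```

A pair scoring above a chosen cutoff would then be a candidate template-target pair; the cutoff itself is a modeling decision and is not fixed by this sketch.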
Lecture The Goal