EECS 730 Introduction to Bioinformatics
Introduction to Proteomics
Luke Huan, Electrical Engineering and Computer Science
9/14/2015

Proteome: protein complement of a genome
The proteome is the time- and cell-specific protein complement of the genome. It encompasses all proteins expressed in a cell at one time, including isoforms and post-translationally modified forms.

Proteome: contrast to the genome
The genome is constant for one cell, identical for all cells of an organism, and does not change very much within a species. The proteome is very dynamic with time and in response to external factors, and differs substantially between cell types.
Variable: in different cell and tissue types in the same organism; in different growth and developmental stages of the organism.
Dynamic: depends on the response of the genome to environmental factors such as disease state, drug challenge, growth conditions, and stress.

Introduction to proteomics
Proteomics is the study of total protein complements (proteomes), e.g. from a given tissue or cell type. The proteome is dynamic, changing to reflect the environment that the cell is in.
Definitions:
Classical: restricted to large-scale analysis of gene products, involving only proteins.
Inclusive: combines protein studies with analyses that have genetic components, such as mRNA analysis, genomics, and yeast two-hybrid.
Examples of important proteomic questions: 1) What proteins are present? 2) What other proteins does a particular protein interact with (networks)? 3) What does a particular protein look like (structure)?

Genomics vs. proteomics
Genomics has provided spectacular amounts of data, but most of it remains uninterpretable at our current level of understanding. In some ways, genomics raises more questions than it answers. The emerging field of proteomics promises to answer some of those questions by systematically studying all of the proteins encoded by the genome.

1 gene = 1 protein?
One gene is no longer equal to one protein. In fact, the definition of a gene is itself debatable (ORF, promoter, pseudogene, gene product, etc.). So 1 gene = how many proteins? There are only about 30,000 genes in the human genome, yet there are more than 100,000 proteins in the human proteome. Cataloguing the human proteome actually requires much more than 100K proteins: 30,000 genes × a myriad of modifications >> 100K protein forms.
Modifications include alternative RNA splicing, chemical modifications, and cleavage. Chemical modifications include phosphorylation, acetylation, glycosylation, and many more.
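The multiplication on this slide can be made concrete with a back-of-the-envelope sketch. Only the 30,000-gene figure comes from the slide; the number of splice isoforms per gene and the number of modification sites are made-up illustrative values, and the function name `count_protein_forms` is hypothetical. The sketch assumes every modification site is independently either modified or not, which overstates what a real cell produces.

```python
def count_protein_forms(n_genes, isoforms_per_gene, binary_sites_per_isoform):
    """Upper bound on distinct protein forms.

    Assumes each modification site is independently modified or unmodified,
    so one isoform with k sites yields 2**k forms (an illustrative
    simplification, not a measured quantity).
    """
    forms_per_isoform = 2 ** binary_sites_per_isoform
    return n_genes * isoforms_per_gene * forms_per_isoform


# 30,000 genes is the slide's figure; 3 isoforms and 2 sites are invented.
total = count_protein_forms(30_000, isoforms_per_gene=3, binary_sites_per_isoform=2)
print(total)  # 360000
```

Even these small assumed values already exceed the 100K mark quoted above.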

Why proteomics?
Annotation of genomes, i.e. functional annotation: genome + proteome = annotation.
Proteomics addresses protein function, protein post-translational modification, protein localization and compartmentalization, protein-protein interactions, and protein expression studies.
Differential gene expression is not the answer.

Microarray data doesn't correlate perfectly with protein expression levels
Analysis of mRNA transcripts with microarrays has provided dynamic information about which genes are expressed in cells under a given set of experimental conditions, yielding clues as to which proteins are involved in certain pathways and disease states. However, differences in the half-lives of RNA and proteins, as well as post-translational modifications important to protein function, prevent mRNA profiles from being perfectly correlated with the cells' actual protein profiles.

Introduction to proteomics
The composition of the proteome depends on cell type, developmental phase, and conditions. Proteome analyses still struggle to resolve the "basic proteome" of different cells and tissues, or its limited changes under changing conditions or during processes. Current methods can only "see" the most abundant proteins.

Types of proteomics
Protein expression: quantitative study of protein expression between samples that differ by some variable.
Structural proteomics: the goal is to map out the 3-D structure of proteins and protein complexes.
Functional proteomics: the study of protein-protein interactions, 3-D structures, cellular localization, and PTMs in order to understand the physiological function of the whole proteome.

Large-scale protein analysis
2D protein gels; yeast two-hybrid; the Rosetta Stone approach; pathways.

2D protein electrophoresis and mass spectrometry

Two-dimensional protein gels: first dimension (isoelectric focusing)
Ampholytes are electrophoresed to establish a pH gradient; a pre-made strip can be used instead. Proteins migrate to their isoelectric point (pI) and then stop, because their net charge is zero. The pI range is typically 4-9 (5-8 is most common).
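The point at which a protein stops during isoelectric focusing can be approximated from its sequence. The sketch below is a minimal illustration, not a validated tool: it sums Henderson-Hasselbalch charges over ionizable groups and bisects for the pH where the net charge is zero. The pKa values are assumed, rounded textbook-style numbers (published tables differ slightly), and the function names are hypothetical.

```python
# Assumed, rounded pKa values; real tools use carefully tuned tables.
PKA_POSITIVE = {"K": 10.5, "R": 12.5, "H": 6.0}
PKA_NEGATIVE = {"D": 3.9, "E": 4.1, "C": 8.3, "Y": 10.1}
PKA_N_TERM = 9.0
PKA_C_TERM = 2.0


def net_charge(sequence, ph):
    """Net charge of a peptide at the given pH (Henderson-Hasselbalch)."""
    positive = 1.0 / (1.0 + 10 ** (ph - PKA_N_TERM))
    negative = 1.0 / (1.0 + 10 ** (PKA_C_TERM - ph))
    for residue in sequence:
        if residue in PKA_POSITIVE:
            positive += 1.0 / (1.0 + 10 ** (ph - PKA_POSITIVE[residue]))
        elif residue in PKA_NEGATIVE:
            negative += 1.0 / (1.0 + 10 ** (PKA_NEGATIVE[residue] - ph))
    return positive - negative


def isoelectric_point(sequence, low=0.0, high=14.0, tolerance=1e-4):
    """Bisect for the pH at which the net charge crosses zero.

    Net charge decreases monotonically with pH, so bisection is safe.
    """
    while high - low > tolerance:
        mid = (low + high) / 2.0
        if net_charge(sequence, mid) > 0:
            low = mid
        else:
            high = mid
    return (low + high) / 2.0


# A lysine-rich peptide focuses at a higher pH than an aspartate-rich one.
print(isoelectric_point("KKKK") > isoelectric_point("DDDD"))  # True
```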

Two-dimensional protein gels: second dimension (SDS-PAGE)
Proteins are electrophoresed through an acrylamide matrix. Proteins are charged and migrate through an electric field with velocity
$$v = \frac{E q}{d \cdot 6 \pi r \eta}$$
Conditions are denaturing. This step can resolve hundreds to thousands of proteins.
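The migration relation on this slide can be evaluated directly. The sketch below reads it as v = Eq / (d · 6πrη), taking E as the applied voltage, q as the net charge, d as the distance between the electrodes, r as the radius of the protein, and η as the viscosity of the medium; that reading of the symbols is an assumption, since the slide does not define them. The input numbers are arbitrary and only illustrate how velocity scales.

```python
import math


def migration_velocity(voltage, charge, electrode_distance, radius, viscosity):
    """Evaluate v = E*q / (d * 6*pi*r*eta) in consistent units."""
    drag = 6.0 * math.pi * radius * viscosity
    return (voltage * charge) / (electrode_distance * drag)


# Arbitrary illustrative inputs: doubling the radius halves the velocity.
v_small = migration_velocity(100.0, 1.0, 0.1, radius=1.0, viscosity=1.0)
v_large = migration_velocity(100.0, 1.0, 0.1, radius=2.0, viscosity=1.0)
print(math.isclose(v_small / v_large, 2.0))  # True
```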

Proteins identified on 2D gels (IEF/SDS-PAGE)
Protein mass analysis is performed by MALDI-TOF (matrix-assisted laser desorption/ionization time-of-flight) mass spectrometry. It is typically done at core facilities and often detects post-translational modifications.

Evaluation of 2D gels (IEF/SDS-PAGE)
Advantages: hundreds to thousands of proteins can be visualized; identification of protein spots has improved.
Disadvantages: only a limited number of samples can be processed; mostly abundant proteins are visualized; the method is technically difficult and labor-intensive, so it is not really "high-throughput".

Yeast two-hybrid (Y2H)
Aim: identify pairs of proteins that physically interact.
Solution: use the transcription mechanism of the cell.

Yeast two-hybrid: principles
Recap of biology, protein vs. domain: a protein is composed of modules, or domains. Domains are individually folded units within the same protein chain. The presence of multiple domains in a protein allows the protein to perform different functions.
The central dogma of biology: DNA → (transcription) → RNA → (translation) → protein.
[Figure: protein p1 with domains d1, d2, d3; protein p2 with domains d4, d5.]

Yeast two-hybrid: principles
Normal transcription requires both the DNA-binding domain (BD) and the activation domain (AD) of a transcriptional activator (TA).
Transcriptional activator (TA): a protein that is required to activate transcription.
DNA-binding domain (BD): binds to DNA.
Activation domain (AD): activates transcription of the DNA.

Yeast two-hybrid: principles
The binding domain and the activation domain do not necessarily have to be on the same protein. In fact, a protein with a DNA-binding domain can activate transcription when simply bound to another protein containing an activation domain. This principle forms the basis of the yeast two-hybrid technique.

Yeast two-hybrid: principles
Major components of a yeast two-hybrid experiment:
Bait protein (X): the protein of interest, with a DNA-binding domain attached to its N-terminus.
Prey protein (Y): its potential binding partner, fused to an activation domain.
Reporter gene (R): a gene whose protein product can be easily detected and measured.
If protein X interacts with protein Y, then X and Y form a functional transcriptional activator and the reporter gene is transcribed. The amount of reporter produced is used as a measure of the interaction between X and Y.

Yeast two-hybrid transcription
The yeast two-hybrid technique measures protein-protein interactions by measuring transcription of a reporter gene. If protein X and protein Y interact, their DNA-binding domain and activation domain combine to form a functional transcriptional activator (TA). The TA then activates transcription of the reporter gene that is paired with its promoter.

Yeast two-hybrid screens
Screen a library of proteins for potential binding partners. Interacting proteins are identified in a pairwise fashion. The approach is feasible at a large (genome) scale.
[Figure: bait-prey model. The bait protein is fused to the binding domain and the prey protein to the activation domain, upstream of the reporter gene.]
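The bait-prey logic of a screen can be sketched as a tiny simulation: the reporter switches on only when the bait (carrying the binding domain) and a prey (carrying the activation domain) physically interact. Everything here is hypothetical, including the protein names and the set of "true" interactions; a real screen of course does not know that set in advance, and the sketch ignores false positives and false negatives.

```python
def reporter_active(bait, prey, true_interactions):
    """Reporter is transcribed only if bait and prey physically interact,
    bringing the binding domain and the activation domain together."""
    return frozenset((bait, prey)) in true_interactions


def screen(bait, prey_library, true_interactions):
    """Test one bait against every prey in the library, pairwise,
    and return the preys that switch the reporter on."""
    return [prey for prey in prey_library
            if reporter_active(bait, prey, true_interactions)]


# Hypothetical proteins and interactions, for illustration only.
true_interactions = {frozenset(("X", "Y")), frozenset(("X", "Z"))}
library = ["A", "Y", "Z"]
hits = screen("X", library, true_interactions)
print(hits)  # ['Y', 'Z']
```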

Figure legend: red = cellular role and subcellular localization of the interacting proteins are identical; blue = localizations are identical; green = cellular roles are identical.

Y2H
Y2H identifies proteins that are physically associated in vivo, using the yeast S. cerevisiae as a host.
Disadvantage: the fused proteins must be able to fold correctly and exist as stable proteins inside the yeast cells.
Advantage: yeast is closer to higher eukaryotes than in vitro experiments or systems based on bacterial hosts.
Weak and transient interactions, often the most interesting ones in signaling cascades, are more readily detected in two-hybrid because the reporter gene strategy results in significant amplification. There is always a trade-off between the identification of weak interactions and the number of false positives.

Low overlap among independent experiments
Yeast two-hybrid data contain many false positives and false negatives. Two sets of independent experiments are compared: Ito et al., PNAS 1999, and Uetz et al., Nature 2000.
[Figure: Venn diagrams of the proteins and of the interactions reported by Uetz et al. and Ito et al.; the overlap labels read <4% and <23%.]
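Overlap between two interaction screens can be computed by treating each interaction as an unordered pair and comparing the two sets. The sketch below uses toy data, not the published Uetz or Ito results, and it reports overlap as a fraction of the union (the Jaccard index); the percentages on the slide may have been computed with a different denominator.

```python
def normalise(pairs):
    """Treat (A, B) and (B, A) as the same undirected interaction."""
    return {frozenset(pair) for pair in pairs}


def overlap_fraction(set_a, set_b):
    """Shared items as a fraction of the union (Jaccard index)."""
    union = set_a | set_b
    return len(set_a & set_b) / len(union) if union else 0.0


# Toy data for illustration only.
screen_1 = normalise([("P1", "P2"), ("P2", "P3"), ("P4", "P5")])
screen_2 = normalise([("P2", "P1"), ("P3", "P6"), ("P7", "P8")])

# Proteins seen in each screen: the union of all interaction members.
proteins_1 = set().union(*screen_1)
proteins_2 = set().union(*screen_2)

print(overlap_fraction(screen_1, screen_2))      # 0.2
print(overlap_fraction(proteins_1, proteins_2))  # 0.375
```

In this toy case, as in the slide, the two screens share more proteins than interactions.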

False positives
The bait has transcription activation activity of its own (the bait works by itself).
Proteins that normally never see each other (e.g. because of time or space constraints) are expressed together and may be sticky.
Proteins are expressed at high levels, which promotes promiscuous interactions.
Another protein bridges the two interacting partners.

False negatives
Proteins are toxic when expressed in yeast, or when expressed and targeted to the yeast nucleus.
Proteins proteolyse essential yeast proteins, or proteins essential for the system, such as the DNA-binding domain or the activation domain.
Proteins do not get into the nucleus (especially membrane proteins).
Proteins are not modified correctly in the heterologous environment.

Final remark on Y2H
Although a screen often produces many new hypotheses, they still need to be validated by other techniques. There is reason enough to remain sceptical about two-hybrid screens, but the most convincing argument in their favor is the number of candidate interactions they yield and the speed with which they do so.
Two-hybrid screens are referred to as functional screens: interacting proteins may give a functional hint if at least one of the partners has a known function in a well-understood signaling pathway.

Analysis of protein complexes
Aim: identification of complexes and their subunits.
Solution: a two-step method. First isolate only the relevant complexes, then identify the units of each complex.

Affinity chromatography/mass spec
Major methods: high-throughput mass spectrometric protein complex identification (HMSPCI) and tandem affinity purification (TAP).
These again use the bait-prey model. The method is very sensitive and identifies multi-protein complexes, which is not really possible with yeast two-hybrid.

Methods
1. Attach tags to bait proteins. DNA encoding the tagged proteins is introduced into cells, the cells express the modified proteins, and the proteins form complexes with other proteins in vivo.
Caveats: cells have to express the modified protein properly; the tag can interfere with protein folding and function; an overexpressed protein may be toxic to the cell (Kumar and Snyder, 2002).

Methods
2. Bait proteins and associated proteins are precipitated on an affinity column. The tag sticks to the column along with the protein complex; other proteins are eluted first, then the tagged protein is eluted.
3. Resolve the proteins on an SDS-PAGE gel, which separates them by molecular weight (SDS gives proteins a roughly uniform charge-to-mass ratio).
4. Cut out the protein bands. Proteins of the same size will be in the same band.
5. Digest the protein bands with trypsin, which cuts the proteins into peptide segments.
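The trypsin step can be mimicked in silico. The sketch below applies the usual simplified rule (cleave after lysine or arginine unless the next residue is proline) and ignores missed cleavages; the function name and the example sequence are hypothetical.

```python
def trypsin_digest(sequence):
    """Cleave after K or R, except when the next residue is P.

    Simplified rule; missed cleavages and other exceptions are ignored.
    """
    peptides = []
    start = 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        # Keep the C-terminal remainder that does not end in K or R.
        peptides.append(sequence[start:])
    return peptides


# Made-up sequence: the R followed by P is not cleaved.
print(trypsin_digest("MAKRPLLKGR"))  # ['MAK', 'RPLLK', 'GR']
```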

Methods
Mass spectrometry is used to analyze protein composition:
6. Samples are vaporized and ionized.
7. Ions enter the mass analyzer and are separated by mass-to-charge ratio.
8. Ions are detected and a signal is generated.
9. The signal is compared with a database to identify the proteins in the complex.
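The database comparison in the last step can be sketched as peptide mass matching: compute theoretical peptide masses for each candidate protein and count how many observed masses fall within a tolerance. This is a minimal sketch under stated assumptions: it uses neutral monoisotopic masses rather than the m/z of charged ions, a flat match count rather than a statistical score, and a hypothetical two-protein "database" of already digested peptides.

```python
# Standard monoisotopic residue masses in daltons.
MONOISOTOPIC = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056


def peptide_mass(peptide):
    """Neutral monoisotopic mass: residue masses plus one water."""
    return sum(MONOISOTOPIC[aa] for aa in peptide) + WATER


def count_matches(observed_masses, peptides, tolerance=0.01):
    """Number of observed masses explained by any theoretical peptide."""
    theoretical = [peptide_mass(p) for p in peptides]
    return sum(
        any(abs(obs - theo) <= tolerance for theo in theoretical)
        for obs in observed_masses
    )


def best_match(observed_masses, database, tolerance=0.01):
    """Name of the database protein that explains the most observed masses."""
    return max(
        database,
        key=lambda name: count_matches(observed_masses, database[name], tolerance),
    )


# Hypothetical database of pre-digested proteins and a simulated spectrum.
database = {"protein_A": ["MAK", "GR"], "protein_B": ["SLK", "AGR"]}
observed = [peptide_mass("MAK"), peptide_mass("GR")]
print(best_match(observed, database))  # protein_A
```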

Affinity chromatography/mass spec
Data on complexes are deposited in databases.

Affinity chromatography/mass spec
False positives: sticky proteins.
[Figure labels: bait protein, GST.]

Affinity chromatography/mass spec
False negatives:
The bait must be properly localized and in its native condition.
The affinity tag may interfere with function.
Transient protein interactions may be missed.
Highly specific physiological conditions may be required.
There is a bias against hydrophobic proteins and small proteins.
[Figure labels: bait protein, GST.]

The Rosetta Stone approach
Marcotte et al. (1999) and other groups hypothesized that some pairs of interacting proteins are encoded by two separate genes in many genomes but are occasionally fused into a single gene. By scanning many genomes for examples of such "fused genes", several thousand protein-protein interaction predictions have been made.

The Rosetta Stone approach
[Figure, page 256: yeast topoisomerase II aligned with E. coli gyrase B and E. coli gyrase A.]
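The fused-gene idea can be sketched in a few lines, assuming the hard part (deciding which proteins are homologous) has already been done. In this hypothetical setup each protein in the target genome is labeled with one homology family, each protein in another genome is labeled with the set of families it contains, and a link is predicted whenever two different target families co-occur in one protein elsewhere. The family labels are invented; the gyrase and topoisomerase pairing follows the figure on this slide.

```python
from itertools import combinations


def rosetta_stone_predictions(target_proteins, other_genome_proteins):
    """Predict links between target proteins whose families are fused elsewhere.

    target_proteins: {protein name: homology family}
    other_genome_proteins: {protein name: set of families in that protein}
    Returns a set of (protein_a, protein_b, fused_protein) triples.
    """
    predictions = set()
    for fused_name, families in other_genome_proteins.items():
        for (a, fam_a), (b, fam_b) in combinations(sorted(target_proteins.items()), 2):
            if fam_a != fam_b and fam_a in families and fam_b in families:
                predictions.add((a, b, fused_name))
    return predictions


# Family labels are hypothetical placeholders for precomputed homology.
e_coli = {"gyrase A": "gyrA-like", "gyrase B": "gyrB-like", "other": "unrelated"}
yeast = {"topoisomerase II": {"gyrA-like", "gyrB-like"}}

print(rosetta_stone_predictions(e_coli, yeast))
# {('gyrase A', 'gyrase B', 'topoisomerase II')}
```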

Function prediction from interaction
It is possible to deduce the functions of a protein from the functions of its interaction partners. This is a difficult task because interactions occur both within and across functional classes.
Available methods based on protein interaction: the neighbor counting method; methods based on χ²-statistics; Markov random fields; simulated annealing.
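Of the methods listed, neighbor counting is the simplest to illustrate: an unannotated protein is assigned the function that occurs most often among its annotated interaction partners. The sketch below is one plain reading of that idea, with hypothetical proteins, interactions, and function labels; it does not correct for how common each function is overall, which is what the χ²-based methods address.

```python
from collections import Counter


def predict_function(protein, interactions, annotations, top_k=1):
    """Rank functions by how many interaction partners carry them."""
    counts = Counter()
    for a, b in interactions:
        if protein in (a, b):
            partner = b if a == protein else a
            # Partners without annotation contribute nothing.
            counts.update(annotations.get(partner, ()))
    return [function for function, _ in counts.most_common(top_k)]


# Hypothetical network: Q is the protein of unknown function.
interactions = [("Q", "P1"), ("Q", "P2"), ("P3", "Q"), ("P1", "P2")]
annotations = {
    "P1": {"transport"},
    "P2": {"transport", "signaling"},
    "P3": {"metabolism"},
}
print(predict_function("Q", interactions, annotations))  # ['transport']
```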