Chapter 8 Proteomics Using high-throughput methods to identify proteins and to understand their function This chapter describes a variety of technologies.


1 Chapter 8 Proteomics Using high-throughput methods to identify proteins and to understand their function This chapter describes a variety of technologies used to identify proteins and to understand their function. © 2005 Prentice Hall Inc. / A Pearson Education Company / Upper Saddle River, New Jersey 07458

2 Contents
Definition of proteomics; proteomics technologies (2-D gel electrophoresis, mass spectrometry, protein chips, yeast two-hybrid method); biochemical genomics; using proteomics to uncover transcriptional networks.
The topics covered in this chapter include a definition of proteomics, technologies used in proteomics, biochemical genomics, and the use of proteomics to uncover networks of transcriptional regulation in yeast.

3 What is proteomics?
An organism’s proteome: a catalog of all proteins, expressed throughout life and under all conditions. The goals of proteomics: to catalog all proteins, to understand their functions, and to understand how they interact with each other.
An organism’s proteome is composed of the entire range of proteins expressed throughout the organism’s life under all biologically relevant conditions. The goal of proteomics is to catalog these proteins and to understand their functions and interactions. We note that the term “structural proteomics” is sometimes applied to high-throughput projects characterizing the 3-D structure of proteins. Instead of “structural proteomics,” we use the term “structural genomics,” which is discussed in detail in another chapter.

4 The challenges of proteomics
Splice variants create an enormous diversity of proteins: ~25,000 genes in humans give rise to 200,000 to 2,000,000 different proteins, and splice variants may have very diverse functions. Proteins expressed in an organism will vary according to age, health, tissue, and environmental stimuli. Proteomics requires a broader range of technologies than genomics.
While genomics has greatly facilitated proteomics projects, characterizing a proteome is considerably more complex than sequencing a genome. At the most basic level, there are far more proteins than genes in a eukaryotic organism. For example, humans possess approximately 25,000 genes, but are estimated to have between 200,000 and 2 million unique proteins, or roughly 8 to 80 proteins per gene. Many of these proteins are produced by alternative splicing. These splice variants are likely to have nonoverlapping functions. In addition, the exact proteins that are expressed at any given moment depend on a person’s age, health, tissue, and environmental stimuli. To complicate matters further, the diverse chemical properties of proteins make it difficult to develop a “one size fits all” approach to characterizing the proteome. Instead, a wide variety of technologies is necessary.

5 Diversity of function in splice variants
Example: the calcitonin gene. Gene variant #1 encodes calcitonin, which increases calcium uptake in bones. Gene variant #2 encodes calcitonin gene-related peptide, which causes blood vessels to dilate.
An example of the complex relationship between genes and expressed proteins can be seen in the calcitonin gene. One protein (calcitonin) produced by this gene is responsible for increasing calcium uptake in bones, while a second splice variant (calcitonin gene-related peptide) causes blood vessels to dilate; hence, two completely different functions are regulated by proteins originating from a single gene.

6 Posttranslational modifications
Posttranslational modifications are defined as any changes to the covalent bonds of a protein after it has been fully translated. Two categories: proteolytic cleavage (fragmenting the protein) and addition of chemical groups to one or more amino acids on the protein.
Protein function may also be altered by posttranslational modifications, that is, by any changes to the covalent bonds of a protein after it has been fully translated. These changes can be broken into two broad categories: proteolytic cleavage (i.e., fragmenting the protein) and the addition of chemical groups to one or more amino acids on the protein.

7 Chemical modifications
Phosphorylation: activation and inactivation of enzymes; acetylation: protein stability, used in histones; methylation: regulation of gene expression; acylation: membrane tethering, targeting; glycosylation: cell–cell recognition, signaling; GPI anchor: membrane tethering; hydroxyproline: protein stability, ligand interactions; sulfation: protein–protein and ligand interactions; disulfide-bond formation: protein stability; deamidation: protein–protein and ligand interactions; pyroglutamic acid: protein stability; ubiquitination: destruction signal; nitration of tyrosine: inflammation.
There are numerous types of posttranslational modifications, a subset of which is shown on the slide with a brief description of each. Phosphorylation is among the most important because it provides a reversible mechanism for activating and inactivating enzymes by attaching phosphate groups to tyrosines, threonines, and serines. A detailed description of the chemistry involved in each of these posttranslational modifications is beyond the scope of this chapter. The list given here is meant to show the functional diversity of posttranslational modification and its importance.

8 Practical applications
Comparison of protein expression in diseased and normal tissues: likely to reveal new drug targets (today ~500 drug targets; estimates of possible drug targets: 10,000–20,000). Protein expression signatures associated with drug toxicity: to make clinical trials more efficient and drug treatments more effective.
A variety of practical applications is expected to come out of proteomics. Comparisons of protein expression in diseased and normal tissues may lead to the development of better diagnostics or the discovery of new drug targets. At present, there are roughly 500 drug targets, but estimates suggest that as many as 20,000 may exist. Identifying protein expression patterns associated with drug toxicity may also make clinical trials more efficient and drug treatments more effective.

9 Technologies for proteomics
2-D (two-dimensional) gel electrophoresis: separates proteins in a mixture on the basis of their molecular weight and charge. Mass spectrometry: reveals the identity of proteins with the help of computer software that can uniquely identify individual proteins. Protein chips: a wide variety of identification methods based on structure, biochemical activity, and interactions with other proteins. Yeast two-hybrid method: determines how proteins interact with each other. Biochemical genomics (enzymatic): screens gene products for biochemical activity.
Many technologies are being exploited to meet the challenges of proteomics. Among the most widely used techniques are two-dimensional (2-D) gel electrophoresis, mass spectrometry, protein chips, the yeast two-hybrid method, and biochemical genomics. 2-D gel electrophoresis separates proteins in a mixture, based on their molecular weight and charge. Mass spectrometry in combination with computer software can uniquely identify individual proteins. Protein chips are a relatively new technology that can be used to identify proteins on the basis of their structure, biochemical activity, and interactions with other proteins. The yeast two-hybrid technique uses an in vivo approach to reveal protein–protein interactions. Biochemical genomics allows researchers to assay gene products for biochemical activity.

10 2-D gel electrophoresis
Polyacrylamide gel; voltage across both axes; pH gradient along first axis neutralizes charged proteins at different places; pH constant on a second axis, where proteins are separated by weight; x–y position of proteins on stained gel uniquely identifies the proteins.
[Figure: example 2-D gel, with axes labeled acidic to basic and high MW to low MW]
2-D gel electrophoresis is one of the oldest proteomics technologies. In this approach, proteins are usually first separated by their charge in a tube of polyacrylamide with a pH gradient going from end to end. When a protein reaches the pH at which its net charge is neutralized (its isoelectric point), it no longer moves along the applied electric field. Once proteins have been separated on the basis of charge, the tube is transferred onto a second gel with constant pH. Applying an electric field across the second gel will separate the proteins on the basis of their molecular weight. The end result is that each protein will have a unique x–y position on the 2-D gel. Samples of proteins identified by their position on the gel can be removed for further experimental analysis. An example of a 2-D gel is shown in the slide.

11 Differential in gel electrophoresis
Label protein samples from control and experimental tissues (fluorescent dye #1 for the control, fluorescent dye #2 for the experimental sample); mix the protein samples together; identify identical proteins from different samples by dye color.
[Figure: gel images, with benzoic acid (Cy3) and without benzoic acid (Cy5)]
Differential in gel electrophoresis is a recent development that has allowed researchers to compare proteomic profiles in two different samples more accurately. To understand this technology, consider two populations of E. coli, one grown in the presence of benzoic acid and the other grown in its absence. The proteins in one sample are labeled with one fluorescent dye (Cy3 in this case), and the proteins in the second sample are labeled with another dye (Cy5). The two dyes are matched for charge and mass so that they will affect proteins migrating in a gel in the same way. Protein samples derived under the two conditions are then mixed together and loaded onto a single 2-D gel. After the gel has been run, it is exposed to light of one wavelength in order to excite the Cy3 dye and light of another wavelength in order to excite the Cy5 dye. Examples of the results are shown in the slide. Images captured in this way can be further processed by software to estimate differences in the expression of individual proteins between the two conditions. On the aggregate level, the images can be subtracted or overlaid to compare overall patterns of protein expression.
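
The software step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the spot identifiers and intensities are made up, the two-fold threshold is an arbitrary choice, and real image-analysis packages also handle spot detection and normalization, which are skipped here.

```python
import math

def log2_ratios(spots):
    """spots maps a spot id to (Cy3 intensity, Cy5 intensity).

    Following the example in the notes, Cy3 labels the sample grown
    with benzoic acid and Cy5 the sample grown without it, so a
    positive value means the protein is more abundant with benzoic acid.
    """
    return {spot: math.log2(cy3 / cy5) for spot, (cy3, cy5) in spots.items()}

def changed_spots(spots, threshold=1.0):
    """Spot ids whose abundance differs by at least 2**threshold fold."""
    ratios = log2_ratios(spots)
    return sorted(spot for spot, r in ratios.items() if abs(r) >= threshold)

# Hypothetical intensities read from the two fluorescence images.
spots = {"s1": (400.0, 100.0), "s2": (100.0, 100.0), "s3": (50.0, 200.0)}
ratios = log2_ratios(spots)
changed = changed_spots(spots)
```

Working with log ratios keeps increases and decreases symmetric: a four-fold increase and a four-fold decrease come out as +2 and -2.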

12 Caveats associated with 2-D gels
Poor performance of 2-D gels for the following: very large proteins; very small proteins; less abundant proteins; membrane-bound proteins (presumably, the most promising drug targets).
Despite the long history of 2-D gel electrophoresis, there are several caveats associated with this technique. First, it does not work well with very large or very small proteins, and low-abundance proteins are difficult to detect with this technique. Also, membrane-bound proteins cannot be characterized using 2-D gels. Unfortunately, the most promising drug targets belong to this class of proteins.

13 Mass spectrometry
Measures mass-to-charge ratio. Components of a mass spectrometer: ion source, mass analyzer, ion detector, data acquisition unit.
Mass spectrometers are devices that measure the mass-to-charge ratios of ions. These ions might be very simple or as complex as peptides. Four components make up every mass spectrometer: an ion source, a mass analyzer, an ion detector, and a data acquisition unit. Because mass spectrometers are only able to analyze ions, the sample must first be ionized in the ion source. The mass analyzer typically consists of some combination of magnetic or electric fields that can be manipulated by the experimenter to determine the mass-to-charge ratio of an ion of interest. The ion detector measures the presence of ions, and the data acquisition unit allows experimental measurements to be analyzed by computer. The picture in the slide shows a researcher using a mass spectrometer.

14 Ion sources used for proteomics
Proteomics requires specialized ion sources. Electrospray ionization (ESI): used with capillary electrophoresis and liquid chromatography. Matrix-assisted laser desorption/ionization (MALDI): extracts ions from the sample surface.
[Figures: ESI; MALDI]
Proteomics experiments require methods that can ionize peptides without too much degradation. Two prevalent techniques are electrospray ionization (ESI) and matrix-assisted laser desorption/ionization (MALDI). ESI is commonly used with some kind of separation technology, such as liquid chromatography or capillary electrophoresis, that reduces the complexity of samples fed into the mass spectrometer. Capillary electrophoresis allows the separation of minuscule amounts (10-15 to M) of protein within minutes. It works much like gel electrophoresis except that proteins are separated along very narrow quartz tubes with very high voltages across their ends. The protein that comes out of the tube is ionized when it is passed through a very fine needle and an electric field. As the solvent containing the proteins evaporates, only charged peptide ions remain, which are then further selected through a sampling cone, as shown in the figure in the slide. ESI is well suited to proteomics experiments because it is relatively nondestructive to proteins. MALDI differs from ESI in that it does not require the sample to be separated into its constituent proteins first. A biological tissue can be embedded in a “matrix” that consists of a chemical which is highly absorbent of UV light. When a pulse of laser light is fired at the matrix, the matrix becomes extremely hot in a very short period of time, causing neighboring proteins to vaporize and ionize such that they can be passed to the mass analyzer. A schematic of the MALDI process is shown in the slide.

15 Mass analyzers used for proteomics
Ion trap: captures ions on the basis of mass-to-charge ratio; often used with ESI. Time of flight (TOF): time for an accelerated ion to reach the detector indicates mass-to-charge ratio; frequently used with MALDI. Also other possibilities.
[Figures: ion trap; time of flight, with detector]
Time-of-flight and ion-trap mass analyzers are often used in proteomics. Ion-trap mass analyzers are frequently used with electrospray ionization. Ion traps work a little bit like radios in that they can be tuned to accept only ions with a particular mass-to-charge ratio. This is done by manipulating electric and magnetic fields around the ions. Once ions of a particular mass-to-charge ratio are captured, they can be fragmented and used to generate a mass spectrum. The figure in the slide illustrates the concept of an ion trap. The time-of-flight mass analyzer is used with MALDI and takes a different approach to measuring the mass-to-charge ratio. All ions coming from the ion source are imparted with the same kinetic energy, which is equal to (1/2)mv^2, where m is mass and v is velocity. Because different ions will have different masses, they will travel from the ion source to the detector at different speeds. Their arrival times at the detector can thereby be measured to reveal their mass-to-charge ratio. These arrival times (or times of flight) can then be used to generate mass spectra. A schematic of a time-of-flight mass analyzer is shown in the slide. There are many other types of mass analyzers that are also used for proteomics. The examples given here are used for the purpose of illustrating how a mass analyzer may work.
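
The time-of-flight relationship can be worked through numerically. An ion of mass m and charge z accelerated through a potential V gains kinetic energy zeV = (1/2)mv^2, so its velocity is v = sqrt(2zeV/m) and its flight time over a drift length L is t = L * sqrt(m / (2zeV)). The Python sketch below assumes an idealized field-free drift tube; the 20 kV accelerating voltage and 1 m tube length are illustrative values, not figures from the slides.

```python
import math

E_CHARGE = 1.602176634e-19    # elementary charge in coulombs
DALTON = 1.66053906660e-27    # one dalton in kilograms

def flight_time(mass_da, charge, voltage=20_000.0, length_m=1.0):
    """Seconds for an ion to cross an idealized field-free drift tube."""
    kinetic_energy = charge * E_CHARGE * voltage          # joules
    velocity = math.sqrt(2.0 * kinetic_energy / (mass_da * DALTON))
    return length_m / velocity

def mz_from_time(time_s, voltage=20_000.0, length_m=1.0):
    """Invert the relation: m/z (daltons per elementary charge)."""
    return 2.0 * E_CHARGE * voltage * (time_s / length_m) ** 2 / DALTON

t_small = flight_time(1000.0, 1)   # a 1 kDa singly charged peptide ion
t_large = flight_time(4000.0, 1)   # four times the mass: twice the time
```

Because flight time scales with the square root of m/z, an ion with twice the mass and twice the charge arrives at the same moment as the lighter singly charged ion, which is why the instrument reports a mass-to-charge ratio rather than a mass.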

16 A mass spectrum
The mass spectrum shown in this slide comes from the analysis of a mixture of five proteins that were digested with the protease trypsin. The mass-to-charge ratio is plotted along the x-axis, and the signal intensity for individual ions is plotted along the y-axis, which is normalized to the highest peak. Different peaks represent the presence of individual ionized peptides. In the next slide, we explain how this kind of data can be used to identify proteins.
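
Normalizing the y-axis to the highest peak is a one-line calculation. The Python sketch below shows it on made-up peak values (the m/z and intensity numbers are hypothetical, not read from the slide's spectrum).

```python
def normalize_spectrum(peaks):
    """peaks is a list of (m/z, raw intensity) pairs.

    Returns (m/z, intensity as a percentage of the highest peak).
    """
    base = max(intensity for _, intensity in peaks)
    return [(mz, 100.0 * intensity / base) for mz, intensity in peaks]

raw_peaks = [(842.5, 1200.0), (1045.6, 4800.0), (1296.7, 2400.0)]
relative = normalize_spectrum(raw_peaks)
```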

17 Identifying proteins with mass spectrometry
Preparation of protein sample: extraction from a 2-D gel; digestion by proteases (e.g., trypsin). Mass spectrometer measures the mass-to-charge ratio of peptide fragments. Identified peptides are compared with a database: software is used to generate a theoretical peptide mass fingerprint (PMF) for all proteins in the database; a match of the experimental readout to a database PMF allows researchers to identify the protein.
A mass-spectrometry experiment can begin with extraction of a protein of interest from a 2-D gel. Typically, proteins are too large to be analyzed directly by mass spectrometers, so they must first be broken down into smaller, more manageable peptides. This is done by digesting the protein with a protease, such as trypsin, that cleaves the protein between specific amino acids (trypsin cuts on the carboxyl side of lysine and arginine residues). The mass spectrum generated is processed by a computer that attempts to identify the protein likely to be represented by the spectrum. Theoretical peptide mass fingerprints (i.e., mass spectra) are calculated for all proteins in a database. This is done by first identifying the trypsin cleavage sites in all proteins in the database and then calculating the mass of the peptides that would result from cleavage with trypsin. These calculated fragments are then compared with the fragments obtained from the mass-spectrometry experiment. A close match allows researchers to identify the protein represented by the experimental mass spectrum.
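
The database search described above can be sketched in Python. This is a minimal illustration, not the algorithm of any particular search engine: it assumes complete digestion with no missed cleavages, uses the common rule that trypsin cuts after K or R unless the next residue is P, ignores modifications, and scores a candidate simply by counting matched masses. The two database sequences and the 0.2 Da tolerance are hypothetical.

```python
# Monoisotopic residue masses in daltons.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056

def trypsin_digest(sequence):
    """Cut after K or R, except when the next residue is P."""
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])
    return peptides

def peptide_mass(peptide):
    """Neutral monoisotopic mass: residues plus one water."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER

def fingerprint(sequence):
    """Theoretical peptide mass fingerprint of one protein."""
    return sorted(peptide_mass(p) for p in trypsin_digest(sequence))

def count_matches(experimental, theoretical, tol=0.2):
    """Number of measured masses within tol of a theoretical mass."""
    return sum(any(abs(e - t) <= tol for t in theoretical)
               for e in experimental)

def identify(experimental, database, tol=0.2):
    """Return (best-matching protein name, number of matched masses)."""
    scores = {name: count_matches(experimental, fingerprint(seq), tol)
              for name, seq in database.items()}
    best = max(scores, key=scores.get)
    return best, scores[best]

# Hypothetical two-protein database and a simulated measurement.
database = {"protA": "AGKRPLKMW", "protB": "GGSKVVTRYF"}
measured = [m + 0.05 for m in fingerprint("AGKRPLKMW")]
best_name, best_score = identify(measured, database)
```

Production search tools typically score matches statistically and allow for missed cleavages and modified residues; counting hits is only the simplest possible scoring rule.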

18 Stable-isotope protein labeling
Stable isotopes are used to label proteins under different conditions. Variety of labeling methods: enzymatic; metabolic; via chemical reaction. Relative abundance of labeled and nonlabeled proteins is measured in the mass spectrum.
Stable-isotope (i.e., nonradioactive) protein labeling is a technique used to quantify differences in protein expression under various experimental conditions. Because different isotopes of an element have different masses, they can have a measurable effect on the mass-to-charge ratio of an ionic peptide, which leads to shifts in peak position in the mass spectrum. Biologists can take advantage of this fact to measure the abundance of individual proteins under different experimental conditions. This can be done by labeling a protein with a particular isotope by using an enzymatic reaction, by incorporating isotopes into metabolites (e.g., an amino acid containing the isotope), or via a chemical reaction. For example, bacteria grown under different experimental conditions can be differentiated by growing one sample in normal medium and growing another sample in broth containing amino acids that have incorporated the less common carbon isotope 13C. Proteins from the two bacteria samples are mixed together and digested by trypsin in preparation for analysis by mass spectrometry. The mass spectrum of the labeled and unlabeled proteins reveals the relative abundance of the protein of interest under the two different experimental conditions. The figure in the slide summarizes the procedure used in stable-isotope protein labeling.
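
The peak shift and the abundance comparison can be illustrated with a short Python sketch. Each 13C atom adds about 1.00335 Da relative to 12C, so a peptide carrying n labeled carbons at charge z appears n * 1.00335 / z higher in m/z. The peak list below is made up, and the assumption that the peptide contains one fully labeled six-carbon amino acid is for illustration only.

```python
C13_MINUS_C12 = 1.0033548   # mass difference between 13C and 12C, in Da

def heavy_mz(light_mz, labeled_carbons, charge=1):
    """Expected m/z of the 13C-labeled form of a peptide."""
    return light_mz + labeled_carbons * C13_MINUS_C12 / charge

def abundance_ratio(peaks, light_mz, labeled_carbons, charge=1, tol=0.05):
    """Heavy-to-light intensity ratio for one peptide.

    peaks is a list of (m/z, intensity); intensities within tol of
    each expected position are summed. Raises ZeroDivisionError if no
    light peak is found.
    """
    def intensity_near(target):
        return sum(i for mz, i in peaks if abs(mz - target) <= tol)

    light = intensity_near(light_mz)
    heavy = intensity_near(heavy_mz(light_mz, labeled_carbons, charge))
    return heavy / light

# Hypothetical peak pair for a peptide with six labeled carbons.
peaks = [(800.40, 1000.0), (806.42, 2500.0)]
shifted = heavy_mz(800.40, 6)
ratio = abundance_ratio(peaks, 800.40, 6)
```

In this made-up example the labeled culture contributes 2.5 times as much of the peptide as the unlabeled one.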

19 Data from a MALDI experiment
Distributions of individual proteins in a slice of rat brain: tissue is coated with UV-absorbing matrix; MALDI ion source with laser sampling tissue every 180 µm; mass-spectrum peaks reveal individual proteins; image processing for false-color images.
[Figure: sections of rat brain imaged by mass spectrometry]
In addition to studying isolated proteins, MALDI ion sources can be used to sample many different proteins simultaneously from an intact tissue sample. The picture in the slide shows results from a proteomics experiment using MALDI to measure the presence of different proteins across a region of rat brain. In the upper left-hand corner is an optical image of a transverse section of the brain before it was coated with matrix. The other pictures are false-color images showing the variations in the concentration of individual proteins across the brain section. Individual proteins were differentiated by the position of the highest peaks in their mass spectra, shown by the numbers associated with each image. Variations in the pattern of protein expression indicate that proteins can be highly localized within a single tissue type.

20 Limitations of mass spectrometry
Not very good at identifying minute quantities of protein; trouble dealing with phosphorylated proteins; doesn’t provide concentrations of proteins; improved software eliminating human-mediated analysis is necessary for high-throughput projects; only able to identify hundreds of proteins in a single day.
Mass spectrometry has rapidly become a workhorse of the proteomics community; however, several challenges remain. For example, minute quantities of proteins are not amenable to analysis by mass spectrometry. In addition, some proteins with posttranslational modifications can have altered fragmentation patterns that make their mass spectra difficult to interpret. This is the case for phosphorylated proteins, although several techniques have been developed recently to get around this problem. A major drawback of current mass spectrometry is that there is no straightforward way to determine the concentration of a protein. For many proteins, such as those involved in signal transduction pathways or in transcriptional regulation, knowledge of their concentrations in a cell would greatly aid in understanding how they function. The goal of proteomics is the large-scale analysis of proteins. To meet this challenge, innovations in analysis software will be necessary to eliminate the need for mass-spectrum analysis by humans. At present, the most advanced proteomics laboratories using mass spectrometry are only able to identify hundreds of proteins in a single day.

21 Protein chips
Thousands of proteins analyzed simultaneously. Wide variety of assays: antibody–antigen; enzyme–substrate; protein–small molecule; protein–nucleic acid; protein–protein; protein–lipid.
[Figure: yeast proteins detected using antibodies]
Protein chips are similar to nucleic acid microarrays in that they are able to simultaneously detect thousands of different molecules; however, the diverse chemistry of proteins requires more varied methods for detecting proteins and measuring their activity. To date, protein chips have been designed to detect the presence of proteins by using antibodies; to detect protein–protein, protein–nucleic acid, protein–small molecule, and protein–lipid interactions; and to measure enzyme–substrate reactions. The image in the slide comes from a protein-chip experiment that uses antibodies to detect yeast proteins. Each dot in the array represents a different protein. The next slide explains how protein chips are manufactured. Note that the terms “protein microarray” and “peptide microarray” are sometimes used in place of “protein chip.”

22 Fabricating protein chips
Fabricating protein chips: a physical array that can hold proteins, isolate them from each other, and prevent them from becoming denatured. Protein substrates: minipads of polyacrylamide or agarose gel; glass; nanowells. Proteins are deposited on the chip surface by robots.
[Figure: nanowells; labels: UV, PDMS (polydimethylsiloxane), quartz mask, glass slide]
The first challenge of fabricating a protein chip is to construct a physical array that can hold proteins, isolate them from each other, and prevent them from becoming denatured. This can be accomplished by depositing miniature pads of polyacrylamide or agarose gel on a glass surface, attaching proteins directly to glass slides, or constructing tiny wells called nanowells (shown in slide) that can hold nanoliters of solution containing a particular protein. In the case of miniature gel pads, acrylamide is trapped between a glass slide and a quartz plate that has a mask drawn on it. The quartz plate is then exposed to ultraviolet light, which causes the acrylamide in regions not covered by the mask to polymerize. The glass slide is then rinsed with water, which washes away the unpolymerized acrylamide, leaving the researcher with a set of gel pads that can be as small as 25 × 25 × 20 microns. Proteins (or other molecules) can then be deposited directly onto the gel pads. Attaching proteins to glass directly can be accomplished by treating a glass surface with a reagent that covalently attaches to proteins. The protein-chip substrates described here should be considered only a sampling, as new techniques are constantly being developed. For all the methods described thus far, proteins are deposited using traditional technology or by robots similar to those used in the manufacture of nucleic acid microarrays. The next slide focuses on strategies for attaching proteins (or other molecules) to the chip’s surface.

23 Protein attachment strategies
Diffusion (pads): protein suspended in random orientation, but presumably active. Adsorption/absorption: some proteins inactive. Covalent attachment. Affinity: orientation of protein precisely controlled.
[Figure: diffusion, adsorption/absorption, covalent, and affinity (antibody) attachment]
Proteins (or other molecules) can be attached to protein chips by a variety of methods. If the chip uses gel pads, the protein is freely suspended inside the gel and is presumed to maintain an active state regardless of its orientation. This is not the case for chips that use adsorption, absorption, or covalent attachment. A protein can stick to an adsorbing surface such that its active site is hidden from a probing molecule that is applied to the chip. Similarly, strategies that covalently attach a protein (usually via the N terminus) to the chip surface may interfere with the active site of the protein. In contrast, affinity strategies allow the orientation of proteins to be more precisely controlled. This control can be accomplished, for example, by antibodies that are attached to the chip’s surface, but bind the protein in such a way that its activity is not compromised.

24 Classes of capture molecules
Capture molecule: agent that interacts with molecules applied to the chip to carry out some kind of assay. Different capture molecules must be used to study different interactions. Examples: antibodies (or antigens) for detection; proteins for protein–protein interaction; enzyme–substrate for biochemical function. Two categories: analytical microarrays and functional protein chips.
[Figure: antigen–antibody, protein–protein, aptamers (short peptides), enzyme–substrate, receptor–ligand]
The term “capture molecule” refers to the agent that interacts with molecules applied to the chip to carry out some kind of assay. Several different types of capture molecules have been used to study proteins. The resulting arrays can be divided into two broad categories: analytical microarrays and functional protein chips. Analytical microarrays use attached antibodies or antigens to detect specific proteins in complex mixtures that are applied to the chip. Conversely, functional protein chips use vast arrays of different proteins to probe protein function. For example, protein–protein interactions can be studied by applying specific proteins to protein chips. Biochemical function can be probed by adding substrates to a protein chip. Protein–lipid interactions can be examined by adding lipids in the form of liposomes to protein chips, and from a practical standpoint, putative drugs can be added to protein chips to identify their targets. Similarly, a ligand chip can be constructed and proteins can be applied to it. Beyond the world of proteins, researchers have been able to successfully demonstrate a working carbohydrate chip.

25 Reading out results
Fluorescence: most common method; fluorescent probe or tag; can be read out using standard nucleic acid microarray technology. Surface-enhanced laser desorption/ionization (SELDI): laser ionizes proteins captured by the chip; mass spectrometer analyzes peptide fragments. Atomic-force microscopy: detects changes in the chip surface due to captured proteins.
Methods for reading out the results of a protein-chip experiment vary, but fluorescence continues to be the most common, because of its sensitivity and ease of use. The chip is probed directly with a fluorescently tagged protein or other molecule. Fluorescence on the chip can then be measured with preexisting instruments used to read out nucleic acid microarrays. An alternative is termed SELDI (surface-enhanced laser desorption/ionization), which is similar to the MALDI method described in the previous section on mass spectrometry. With SELDI, a laser is used to very precisely ionize proteins captured on the chip, which can then be identified using a mass spectrometer. Atomic-force microscopy has been used to detect proteins captured on antibody chips by measuring microscopic changes in the surface of the chip. Atomic-force microscopes work by essentially dragging a very small “stick” over a bumpy surface. In this case, the stick is able to measure deflections on the order of angstroms.

26 Difficulties in designing protein chips
A unique process is necessary for constructing each probe element. It is challenging to produce and purify each protein on the chip. Proteins can be hydrophobic or hydrophilic, and it is difficult to design a chip that can detect both. A protein’s function may be dependent on posttranslational modification or an interaction with another biological molecule. Challenging, but constantly improving with new technological advancements.
Unlike nucleic acid microarrays, which rely on hybridization with nucleic acid probes, protein chips must deal with the diverse chemistry of proteins, which introduces certain engineering challenges. For example, a protein chip that uses antibodies to identify different proteins will necessitate the production of a specific antibody for each protein. This process can be laborious, because the production of specific antibodies can be quite difficult. A similar problem arises when making a protein chip consisting of many different proteins attached to the chip. Each protein has to be produced and purified, which is a much greater challenge than building oligonucleotides for DNA microarrays. Another problem lies in designing a chip that can detect both hydrophobic and hydrophilic proteins at the same time. Of course, there is the more general problem plaguing proteomics that a protein’s function may be dependent on posttranslational modification or an interaction with another biological molecule that is altered or missing from the chip. Nonetheless, protein chips hold great promise, and the coming years will doubtless see many innovations.

27 Regulation of transcription
[Figure: regulation of transcription; recoverable labels: UE, TATA box]

28 Yeast two-hybrid method
Goal: Determine how proteins interact with each other Method Use yeast transcription factors Gene expression requires the following: A DNA-binding domain An activation domain A basic transcription apparatus Attach protein1 to DNA-binding domain (bait) Attach protein2 to activation domain (prey) Reporter gene expressed only if protein1 and protein2 interact with each other The yeast two-hybrid method exploits the gene transcription machinery of yeast to study interactions between proteins in vivo. Gene transcription in S. cerevisiae, as in most eukaryotic cells, requires a DNA-binding domain, an activation domain, and the basic transcription apparatus. A protein of interest (call it protein1) is attached to the DNA-binding domain of a well-characterized transcription factor such as Gal4. This fusion is called the “bait.” Protein2 is attached to the activation domain from the same transcription factor. The protein2-activation domain fusion is called the “prey.” If protein1 and protein2 interact with each other, the DNA-binding and activation domains are brought together and a reporter gene is expressed. The yeast two-hybrid method is shown schematically in the next slide.

29 A schematic of the yeast two-hybrid method
The figure in the slide shows in greater detail how the yeast two-hybrid method works. Two sets of yeast colonies are grown. In this example, the first set of colonies consists of all yeast open reading frames (ORFs) attached to the DNA-binding domain. The second set of colonies consists of all ORFs attached to the activation domain. Yeast from the two sets of colonies are mated with each other to produce every possible bait–prey combination in the offspring. Cells expressing the reporter gene are selected, and their ORFs are identified by DNA sequencing to reveal the precise protein–protein interaction. This method has been successfully applied on a genome-wide scale to map protein–protein interactions in the yeast Saccharomyces cerevisiae, as shown in the next slide.
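
The mating scheme described above can be sketched as an all-against-all loop. This is only an illustrative model, not the researchers' pipeline: the function name `two_hybrid_screen`, the `interacts` callback, and the ORF labels are hypothetical, and real screens involve selection and sequencing steps that are not modeled here.

```python
from itertools import product

def two_hybrid_screen(orfs, interacts):
    """Return (bait, prey) pairs whose offspring would express the reporter.

    `interacts` is a hypothetical stand-in for the biology: it returns True
    when the bait and prey proteins bind each other.
    """
    hits = []
    # Mating the two colony sets yields every bait/prey combination.
    for bait, prey in product(orfs, repeat=2):
        # The reporter is expressed only if the interaction brings the
        # DNA-binding domain (on the bait) and the activation domain
        # (on the prey) together.
        if interacts(bait, prey):
            hits.append((bait, prey))
    return hits

# Toy data (hypothetical): ORF_A binds ORF_B, and ORF_C binds itself.
known = {frozenset(("ORF_A", "ORF_B")), frozenset(("ORF_C",))}
orfs = ["ORF_A", "ORF_B", "ORF_C", "ORF_D"]
hits = two_hybrid_screen(orfs, lambda b, p: frozenset((b, p)) in known)
print(hits)
```

In this toy model an interaction is found in both orientations (A as bait with B as prey, and the reverse), which is one reason a real screen clones each ORF as both bait and prey.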

30 Results from a yeast two-hybrid experiment
Goal: To characterize protein–protein interactions among 6,144 yeast ORFs 5,345 were successfully cloned into yeast as both bait and prey Identity of ORFs determined by DNA sequencing in hybrid yeast 692 protein–protein interaction pairs Interactions involved 817 ORFs interactome on a genome-wide scale The term “interactome” refers to all possible combinations of protein–protein interactions in a given organism. Two groups of researchers recently used the yeast two-hybrid method to describe the yeast interactome. They were able to successfully clone 5,345 of the 6,144 ORFs into yeast strains as both bait and prey. When the yeast were mated and selected for expression of the reporter genes, the researchers found 692 protein–protein interaction pairs involving 817 ORFs. In the same study, the researchers used a protein chip to study a subset of the ORFs used in the two-hybrid experiments. Interestingly, the protein chip revealed interactions that were absent in the two-hybrid results, suggesting that this approach might be an even more sensitive technique for studying the interactome. Some of the caveats associated with the two-hybrid method are discussed in the next slide.
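
To put the numbers on this slide in proportion, the short calculation below derives a few ratios from them. The variable names are illustrative, and the count of bait/prey combinations assumes that every cloned bait was tested against every cloned prey, which the slide does not state explicitly.

```python
# Figures quoted on the slide
orfs_total = 6144      # yeast ORFs targeted
orfs_cloned = 5345     # ORFs cloned as both bait and prey
pairs_found = 692      # interaction pairs detected
orfs_in_pairs = 817    # ORFs taking part in at least one detected pair

# Share of ORFs that could be cloned in both orientations
cloned_fraction = orfs_cloned / orfs_total

# Assumption: every bait is mated against every prey (ordered pairs)
bait_prey_combinations = orfs_cloned ** 2

# Share of cloned ORFs that showed at least one interaction
interacting_fraction = orfs_in_pairs / orfs_cloned

print(f"cloned: {cloned_fraction:.1%}")
print(f"bait/prey combinations: {bait_prey_combinations:,}")
print(f"ORFs with a detected interaction: {interacting_fraction:.1%}")
```

Under that assumption, the 692 detected pairs are a very small fraction of the combinations screened, which is consistent with the sensitivity caveats raised for the method.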

31 Caveats associated with the yeast two-hybrid method
There is evidence that other methods may be more sensitive (protein chip) Some inaccuracy reported when compared against known protein–protein interactions False positives False negatives As the previous slide has indicated, the two-hybrid method can be used to study the interactome on a genome-wide scale. However, this technique does have some shortcomings. One is that it may be less sensitive than other methods. For example, a direct comparison of a protein chip with the two-hybrid method showed that as many as 42% of the proteins tested on a chip showed interactions with other proteins, while only 8% of proteins studied with the yeast two-hybrid technique did. In addition, there is confirmed evidence of false positives and false negatives resulting from the two-hybrid method. False negatives are especially a problem when the proteins being studied are membrane bound or require posttranslational modification for normal functioning.

32 Subcellular localization of the yeast proteome
Complete genome sequences allow each ORF to be precisely tagged with a reporter molecule Tagged ORF proteins indicate subcellular localization Useful for the following: Correlating to regulatory modules Verifying data on protein–protein interactions Annotating genome sequence Knowing how proteins interact with each other is useful in determining their function and understanding their role in various biochemical pathways. Similarly, physically localizing proteins at the subcellular level reveals protein involvement in different organelles. It also provides a method for correlating subcellular localization to transcriptional regulation and assists in the annotation of genome sequences. Because proteins that do not localize to the same subcellular region are unlikely to interact with each other, localization data can also be used to verify previous data on protein–protein interactions. The sequencing of entire genomes allows researchers to systematically determine subcellular locations for every ORF. In 2003, a group from the University of California, San Francisco, became the first to localize an organism’s full proteome.

33 Attaching a GFP tag to an ORF
Marker gene: HIS free medium PCR product GFP HIS3MX6 Homologous recombination Chromosome ORF1 ORF2 As in the case of protein–protein interactions, the yeast genome was used to determine the subcellular location of products of all ORFs. This was done by inserting the sequence for the green fluorescent protein (GFP) together with a marker gene (HIS3MX6) that makes it possible to select transformed yeast strains in histidine-free medium. Oligonucleotide primers corresponding to each ORF were used to generate ORF-specific sequences flanking the GFP and selectable marker. This approach permits the sequence to be incorporated into the chromosomal DNA via homologous recombination. The fusion protein resulting from this insertion has the GFP attached at the C-terminus of the ORF protein. Inside the living cell, the precise location of the protein can be determined by localizing the protein’s fluorescence. Fusion protein NH2 protein GFP COOH

34 Location of proteins revealed
75% of yeast proteome localized > 40% of proteins in cytoplasm 67% of proteins were previously un-localized Localizations correlate with transcriptional modules cytoplasm nucleus The image in this slide shows a tagged protein that was localized to the nucleus. Note the contrast between the glowing nucleus and the much fainter cytoplasm. This study was able to successfully localize 75% of the entire yeast proteome, with greater than 40% of the proteins being found in the cytoplasm, and other proteins being localized in 21 distinct subcellular regions, including the nucleus, mitochondria, and the endoplasmic reticulum. The researchers were also able to show that subcellular localizations frequently correlated with transcriptional modules. By comparing their data with previous work on protein–protein interactions, they were also able to show that proteins localized in different regions had different probabilities of interacting with each other. For example, proteins in the cytoplasm were only 1.3 times more likely to interact with each other than expected by chance, while microtubule-bound proteins were 56 times more likely to do so. A protein localized to the nucleus
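
A "times more likely than chance" figure of this kind can be read as a fold enrichment: the observed share of interactions that fall within one compartment divided by the share expected if partners were paired at random. The sketch below shows one plausible way to compute such a ratio. It is an assumption for illustration, not the statistic the study necessarily used, and the function name and toy data are hypothetical.

```python
from math import comb

def colocalization_enrichment(location, interactions, compartment):
    """Fold enrichment of interactions within one compartment (illustrative).

    location:      dict mapping protein name -> compartment name
    interactions:  list of (protein_a, protein_b) pairs
    """
    n_total = len(location)
    n_in = sum(1 for loc in location.values() if loc == compartment)
    # Chance expectation: share of all protein pairs lying in the compartment
    expected = comb(n_in, 2) / comb(n_total, 2)
    # Observation: share of interacting pairs lying in the compartment
    both_in = sum(
        1 for a, b in interactions
        if location[a] == compartment and location[b] == compartment
    )
    observed = both_in / len(interactions)
    return observed / expected

# Toy data (hypothetical): two microtubule proteins, four cytoplasmic ones
location = {
    "P1": "microtubule", "P2": "microtubule",
    "P3": "cytoplasm", "P4": "cytoplasm", "P5": "cytoplasm", "P6": "cytoplasm",
}
interactions = [("P1", "P2"), ("P3", "P4"), ("P5", "P6")]

print(colocalization_enrichment(location, interactions, "microtubule"))
print(colocalization_enrichment(location, interactions, "cytoplasm"))
```

The toy data reproduce the qualitative pattern in the notes: a small compartment whose members interact gives a large ratio, while a large compartment gives a ratio close to 1. The sketch does not guard against compartments with fewer than two proteins, where the expected share would be zero.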

35 Pregenomics biochemical assays
Methods used to find genes responsible for specific biochemical activity before the inception of genomics Laboriously purify responsible protein Often expensive and time consuming Expression cloning Introduce cDNA pools into cells Look for biochemical activity in those cells Caveat: Often difficult to detect biochemical activity in cell’s biochemical “background” Prior to genomics, researchers searching for genes responsible for a specific biochemical activity were limited to techniques that lacked sensitivity or were laborious and expensive. For example, a biologist could attempt to purify a protein responsible for a particular biochemical reaction. If the protein were isolated in sufficient quantities, it could be sequenced and used to construct a probe that would hopefully hybridize to the cognate gene. However, this approach often proved too difficult or expensive for many proteins. Alternatively, an expression-cloning approach could be taken. In this method, pools of cDNA are introduced into cells. If the cells express the biochemical behavior of interest, the cDNA pools are subdivided further until the gene of interest is cloned. One disadvantage of expression cloning is that cDNA-induced biochemical activity can be masked by the biochemical background of the host cell, making it impossible to detect, and thereby greatly reducing the sensitivity of this technique. Biochemical genomics overcomes the hurdles of traditional methods by greatly simplifying biochemical assays and by providing greater sensitivity than expression cloning.

36 Biochemical genomics
Genome of an organism is already known Approach Construct plasmids for all ORFs Attach ORFs to sequence that will facilitate purification Transform cells Isolate ORF products Test for biochemical activity The identification of all ORFs in fully sequenced genomes makes biochemical genomics possible. The approach is straightforward: Plasmids for all ORFs in a genome are constructed and inserted into host cells. This procedure can be done with a fully sequenced genome because specific PCR primers can be designed for each individual ORF. The ORF is usually fused with a tag that will allow the ORF gene product to be purified later. Isolated ORF products are then pooled together and tested for various biochemical activities. An example of this process is shown on the next slide.

37 Biochemical genomics in yeast
6,144 ORF yeast strains made ORFs fused to glutathione S-transferase (GST) for purification purposes Biochemical assay revealed three new biochemical reactions associated with yeast ORFs tRNA ligase pre-tRNA In 1996, the genome of baker’s yeast (S. cerevisiae) was fully sequenced, allowing all yeast ORFs to be identified. This accomplishment permitted researchers to make 6,144 yeast strains, each containing a unique ORF fused to glutathione S-transferase, a tag that makes purification of the ORF gene products possible. Sixty-four pools, each containing 96 different purified ORF products, were then tested for various biochemical activities. The picture at the bottom of the slide shows how tRNA ligase activity was detected using biochemical genomics. Radioactively labeled fragments of tRNA were added to all 64 ORF pools, which were then assayed for biochemical activity. Lane 35 is the only lane that shows ligated tRNA, indicating that a tRNA ligase must exist in this pool. While the function of the ORF expressing tRNA ligase activity was already known, researchers were able to use biochemical genomics to reveal three new biochemical reactions associated with ORFs. Ligated tRNA tRNA halves Abc Used as screening pools
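
The pooling arithmetic works out exactly: 64 pools of 96 products cover all 6,144 strains, so one positive lane narrows the search from 6,144 candidates to 96. The sketch below illustrates that bookkeeping. It assumes, purely for illustration, that strains are numbered from 0 and assigned to pools in consecutive blocks; the slide does not say how strains were actually assigned, and the function names are hypothetical.

```python
POOLS = 64       # number of screening pools (from the slide)
PER_POOL = 96    # purified ORF products per pool (from the slide)

def pool_of(strain_index):
    """Pool number (1..64) holding a strain, assuming consecutive blocks."""
    return strain_index // PER_POOL + 1

def strains_in_pool(pool):
    """Strain indices (0-based) that a positive pool would point to."""
    start = (pool - 1) * PER_POOL
    return range(start, start + PER_POOL)

# Every strain is covered exactly once
assert POOLS * PER_POOL == 6144

# A positive signal in lane 35 leaves 96 candidate strains to test singly
candidates = strains_in_pool(35)
print(len(candidates), candidates[0], candidates[-1])
```

After a pool tests positive, the 96 candidates can be assayed individually, so under this scheme at most 64 + 96 assays are needed to trace one activity to one ORF.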

38 Microfluidics
Proteomics requires greater automation Microfluidics: a “lab on a chip” Microvalves and pumps allow control of nanoliter amounts Can control biochemical reactions A microfluidics chip Microfluidics is a recent innovation that holds great promise for developing high-throughput methods in proteomics. It capitalizes on fabrication technology for microchips to create complex networks of piping, microvalves, and pumps that constitute a “lab on a chip.” The valves and pumps allow experimenters to precisely control the flow of nanoliters of solution into individual chambers where biochemical reactions can be performed. Microfluidics plumbing is constructed from a class of chemical compounds called elastomers. These compounds are soft, stretchy materials that can form tight seals to prevent leakage. They have the added benefits of being cheaper than silicon (the material used to make traditional microchips) and of having less stringent fabrication requirements. The image on the slide shows the layout of a working microfluidics chip with 256 individual chambers that can be individually filled with solution from two sources and purged from the chip without mixing with contents from other chambers. The different colors in the figure signify different dyes that are pumped through the chip. This microchip has 2,056 microvalves. The next slide illustrates how solution flow is controlled in this particular chip.

39 Microfluidics in action
loading compartmentalization purging mixing 500 µm The yellow and blue dyes represent two different samples. The loading of the samples into the chip is shown in the upper left of the slide. The samples are segregated from each other during compartmentalization and later mixed together. Mixed solutions can be selectively examined by purging individual chambers, as shown in the lower right portion of the slide. All actions are accomplished through pneumatic pressure. A “control” layer of pipes and valves sits on top of a “flow” layer, where the chambers are located. Junctions between the flow and control layers form valves that can be closed by applying pneumatic pressure through the appropriate pipes in the control layer. Picture the valves as flaps that open when pressure is applied in one direction and close when it is applied in the opposite direction.

40 Finding transcription-factor targets
All yeast transcription factors were used to make yeast strains Use chromatin immunoprecipitation to select factors attached to promoter regions on DNA (ChIP) DNA fragments used on microarray to identify transcription-factor targets (ChIP on chip) antibodies Genomic DNA Previous sections of this chapter have focused on how proteomics is used to study protein–protein interactions. Here, we discuss how proteomics has revealed patterns of gene regulation. Transcription factors are proteins that regulate gene expression. Following the sequencing of the yeast genome, a catalog of all yeast transcription factors was created. A group from the Massachusetts Institute of Technology used this information to make yeast strains expressing each transcription factor. The researchers were able to successfully express 106 of the 141 known transcription factors. Using chromatin immunoprecipitation (ChIP), they were able to selectively isolate the regions of DNA where transcription factors were bound. This is accomplished by first cross-linking the transcription factor to the DNA in vivo, using formaldehyde. All of the DNA is then sheared, and antibodies specific to the expressed transcription factor are used to precipitate out DNA regions that were bound to the transcription factor. The DNA extracted from this precipitate is then used for a microarray experiment to identify the different promoter regions that had bound to a particular transcription factor. A schematic describing the experimental paradigm used to isolate the DNA targets of transcription factors is shown in the slide. Using this method, the MIT group was able to discover many targets of transcription factors in yeast. By applying computer algorithms to these data, the researchers were further able to reveal a wide variety of regulation network motifs used by yeast. Some of these motifs are shown in the next slide. Over 106 TFs of 141 are active

41 Network motifs for transcriptional regulation
Autoregulation Multicomponent loop Multi-input motif TF Motif This slide shows a subset of the motifs for transcriptional regulation discovered in yeast. The light-blue circles represent transcription factors, and the red rectangles indicate the DNA promoter regions. The solid arrows show transcription factors binding to promoters, and the dashed arrows link transcription-factor genes to their respective transcription factors. Autoregulation is a pattern of gene regulation where a transcription factor binds to the promoter of its own gene and activates expression of itself. A multicomponent loop can consist of as few as two transcription factors and two genes. In the slide, the Rox1 transcription factor regulates the YAP6 transcription-factor gene, which produces Yap6 protein, which in turn regulates the Rox1 transcription-factor gene. The multi-input motif shows a situation where individual transcription factors can regulate multiple overlapping genes. At the bottom of the slide is an example of a regulator chain showing how a chain of transcription factors and genes works. Regulator chain
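
The motifs described above can be found by simple searches over a directed graph in which an edge runs from a transcription factor to each gene whose promoter it binds. The sketch below is a minimal illustration, not the algorithm used by the MIT group. It simplifies by treating a transcription factor and its gene as one node, and apart from the Rox1/Yap6 loop taken from the slide, all names (`TF_A`, `TF_B`, `gene1`, `gene2`) are hypothetical.

```python
def find_autoregulation(net):
    """Transcription factors that bind the promoter of their own gene."""
    return sorted(tf for tf, targets in net.items() if tf in targets)

def find_two_component_loops(net):
    """Pairs (a, b) where a regulates b and b regulates a."""
    loops = []
    for a, targets in net.items():
        for b in targets:
            # a < b reports each loop once and skips self-loops
            if a < b and a in net.get(b, set()):
                loops.append((a, b))
    return sorted(loops)

def find_multi_input(net):
    """Groups of two or more genes bound by the same set of two or more factors."""
    regulators = {}
    for tf, targets in net.items():
        for gene in targets:
            regulators.setdefault(gene, set()).add(tf)
    groups = {}
    for gene, tfs in regulators.items():
        if len(tfs) >= 2:
            groups.setdefault(frozenset(tfs), []).append(gene)
    return {tfs: sorted(genes) for tfs, genes in groups.items() if len(genes) >= 2}

# Toy network: factor -> set of targets it binds
network = {
    "ROX1": {"YAP6"},                    # loop taken from the slide
    "YAP6": {"ROX1"},
    "TF_A": {"TF_A", "gene1", "gene2"},  # hypothetical autoregulator
    "TF_B": {"gene1", "gene2"},          # hypothetical co-regulator
}

print(find_autoregulation(network))
print(find_two_component_loops(network))
print(find_multi_input(network))
```

In a real analysis the edges would come from the ChIP-on-chip binding calls, and longer loops and regulator chains would need a path search rather than the pairwise check shown here.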

42 Studying proteins with posttranslational modifications
Example: tyrosine phosphorylation Traditional method Radioactive labeling with 32P followed by gel electrophoresis or chromatography Problems using mass spectrometry are being overcome to allow high-throughput analysis Better purification techniques Mass spectrometers more capable of detecting phosphorylated peptides The study of proteins with posttranslational modifications is just beginning. Traditional methods can yield good results in this area. For example, 2-D gel electrophoresis has been very successful in separating proteins of varying degrees of phosphorylation. Phosphorylated proteins can be detected by labeling them with radioactive 32P or by using an antibody that can selectively detect a phosphorylated protein. Unfortunately, a number of technical hurdles exist for using high-throughput methods like mass spectrometry to analyze posttranslational modifications. For instance, certain posttranslational modifications are unstable and will be lost before a protein is broken down into peptide fragments during the ionization step in mass spectrometry. Sulfation and phosphorylation of serine and threonine are examples of unstable modifications. In contrast, phosphorylation of tyrosine and arginine methylation appear to be stable. Yet even when dealing with stable modifications, there are problems in analyzing phosphorylated peptides in peptide mixtures. This problem has been overcome to some degree by purifying phosphorylated proteins prior to analysis by mass spectrometry. Despite these challenges, recent advancements have allowed researchers to study posttranslational modifications in a number of cases, thus lending an optimistic outlook to this very important area of proteomics.

43 HUPO
Human Proteome Organization (HUPO) was established in 2002 Mission Consolidate proteomics organizations in different countries into single worldwide body Scientific and educational programs to help spread proteomics knowledge and technology Coordination of public proteomics initiatives Examples of current initiatives Human Liver Proteome Project Human Plasma Proteome Project The Human Proteome Organization (HUPO) was established in 2002 to help coordinate international efforts in proteomics. Its goals include the consolidation of regional proteomics organizations into a single international body, the dissemination of scientific and educational programs to assist in the development of proteomics technology and knowledge, and the coordination of public proteomics initiatives. Two initiatives currently underway are the Human Liver Proteome Project and the Human Plasma Proteome Project.

44 Future prospects
The next decade may see the complete deciphering of the proteome of yeast More initiatives, like the Human Liver Proteome Project, are underway Better understanding of disease The next decade may see the almost complete deciphering of the yeast proteome. New initiatives like the Human Liver Proteome Project are likely to have a tremendous medical impact. Ultimately, proteomics will give us a much clearer picture of the mechanics of disease than even genomics can, because proteins are the functional players in biology.

45 Summary I
Goals of proteomics Identify and ascribe function to proteins under all biologically plausible conditions Proteomics methods 2-D gel electrophoresis for separating proteins on the basis of charge and molecular weight Mass spectrometry for identifying proteins by measuring the mass-to-charge ratio of their ionized peptide fragments Protein chips to identify proteins, to detect protein–protein interactions, to perform biochemical assays, and to study drug–target interactions In this chapter, we first described the goals of the field of proteomics, which are to identify and ascribe function to proteins under all biologically plausible conditions. We then described different proteomics methods, including 2-D gel electrophoresis for separating proteins based on charge and molecular weight; mass spectrometry for identifying proteins by measuring the mass-to-charge ratio of their ionized peptide fragments; and protein chips, which can be used to identify proteins, to detect protein–protein interactions, to perform biochemical assays, and to study drug–target interactions.

46 Summary II
Proteomics methods (continued) Yeast two-hybrid method for studying protein–protein interactions Biochemical genomics for high-throughput assays Some accomplishments of proteomics Example: yeast Yeast two-hybrid method reveals interactome Transcriptional regulatory networks deduced Biochemical genomics uncovers new ORF functions Subcellular localization of proteins Additional proteomics methods we described include the yeast two-hybrid method for studying protein–protein interactions and biochemical genomics for genome-wide biochemical assays. We also described some recent accomplishments of proteomics, including description of the yeast interactome as identified by the yeast two-hybrid method, transcriptional regulatory networks deduced from chromatin immunoprecipitation experiments, the definition of new functions for ORFs by biochemical genomics approaches, and the subcellular localization of the proteome in yeast.

47 Summary III
Future prospects Better technology for studying posttranslational modifications ~10 years for completion of yeast proteome? Hopes for the near future include better proteomics methods for studying posttranslational modifications and possibly the complete characterization of the yeast proteome.

