Biol 729 – Proteome Bioinformatics Dr M. J. Fisher - Protein: Protein Interactions.


1 Biol 729 – Proteome Bioinformatics Dr M. J. Fisher - Protein: Protein Interactions.

2 In order to fully understand what proteins do, we need to know something about the other proteins that they interact with – their partners. Most proteins need a partner that changes their conformation or activity, either allosterically or by covalent modification. Two proteomic methods for identifying protein-protein interactions are DNA sequence analysis and yeast two-hybrid (Y2H) analysis. Interaction between a 7-TM receptor and a heterotrimeric (αβγ) G-protein.

3 Comparative genomic analysis can be used to predict protein-protein interactions. Suppose species A has a protein with two domains (1 and 2) and species B has two proteins, one (1’) containing domain 1 and the other (2’) containing domain 2. Then it is possible to predict that, in species B, proteins 1’ and 2’ may interact to generate the same function as seen in the single ‘dual domain’ protein in species A. This approach has been used to predict many protein-protein interactions in yeast and bacteria. This sort of in silico analysis can give valuable insights into protein-protein interaction, but it is limited to the specific situation where a protein encoded by a single gene in one species is replaced by two or more interacting proteins in others.
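The domain-fusion reasoning above can be sketched in a few lines of Python. This is only an illustrative sketch: the protein names, the domain labels and the function `predict_from_fusions` are hypothetical, and a real analysis would start from sequence-based domain assignments rather than hand-written sets.

```python
from itertools import combinations

def predict_from_fusions(fused_proteins, target_proteins):
    """Predict interacting pairs in one species from fused proteins in another.

    fused_proteins:  dict of protein name -> set of domains (species A)
    target_proteins: dict of protein name -> set of domains (species B)
    Returns a list of (protein_1, protein_2, supporting_fused_protein).
    """
    predictions = []
    for p1, p2 in combinations(sorted(target_proteins), 2):
        d1, d2 = target_proteins[p1], target_proteins[p2]
        for fused in sorted(fused_proteins):
            domains = fused_proteins[fused]
            # Each protein must contribute a domain the other lacks,
            # and both domains must occur together in the fused protein.
            only_1 = (d1 - d2) & domains
            only_2 = (d2 - d1) & domains
            if only_1 and only_2:
                predictions.append((p1, p2, fused))
    return predictions

# Toy data mirroring the slide: one dual-domain protein in species A,
# two single-domain proteins (plus an unrelated one) in species B.
species_a = {"A_fused": {"domain1", "domain2"}}
species_b = {
    "B_1prime": {"domain1"},
    "B_2prime": {"domain2"},
    "B_other": {"domain3"},
}

predictions = predict_from_fusions(species_a, species_b)
print(predictions)
```

Only the pair carrying domains 1 and 2 is reported; the unrelated protein has no fused counterpart and is ignored, which reflects the limitation noted on the slide.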

4 Measuring protein-protein interactions. The yeast two-hybrid (Y2H) method uses a protein of interest as bait in order to discover interacting (or ‘prey’) proteins. A transcription factor is cut into two pieces – the DNA binding domain (DBD) and the activation domain (AD) which stimulates RNA polymerase to begin transcription. Fused to the DBD is the bait protein (a). Fused to the AD is a prey ORF – which can be any known or unknown protein (b). Neither the DBD-bait fusion (DBD-B) nor any of the AD-prey fusions (AD-Ps) can, on their own, initiate transcription. When the bait and prey proteins are produced in the same cell, they may interact and transcription can be initiated.

5 i.e. the two domains of the transcription factor (DBD and AD) do NOT need to be part of a single protein – if they are brought together (as a consequence of the bait:prey interaction) then transcription will occur. A typical reporter gene is His3 (which is required for the production of the amino acid histidine). Without His3 activity, cells cannot grow unless histidine is added to the growth medium. The chimeric proteins (i.e. DBD-B and AD-P) are made as translational fusions in yeast. Plasmids are made: one in which a bait coding sequence is fused to the DBD coding sequence, and others in which prey cDNA sequences are fused to the AD coding sequence. Plasmids are then transformed into a suitable yeast strain which allows expression of the individual chimeric proteins.

6 An alternative is to use a lacZ reporter gene. The resulting enzyme, β-galactosidase, generates a blue colour, indicating activity and therefore bait:prey interaction. Any ORF can be tested with Y2H, which means that a proteome-wide survey can be performed rapidly by transforming a cDNA library into cells that contain bait plasmids. In this way, every protein in a proteome can be tested individually for its potential to interact with bait. lacZ phenotype - blue colonies

7 The Y2H method is not perfect – if the prey ORF were the His3 ORF then the cell would grow in the absence of interaction with the bait. As a control, cells containing only prey must be tested on media lacking histidine. Also, not all proteins work well inside a cell nucleus and there may be false-positive or false-negative results due to improper protein folding. The greatest benefit of Y2H is that yeast cells can express genes from almost any species, which means this is a powerful proteomics method for Drosophila, C. elegans, Arabidopsis, zebrafish, mice, humans and also yeast. A lacZ phenotype on both plates (i.e. a false-positive), is boxed in red. The negative control (i.e. no bait) is boxed in black.
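The logic of the prey-only control can be written out as a small decision rule. The sketch below is an assumption about how one might record colony results, not part of any published Y2H pipeline; the function name `score_prey`, the prey labels and the observations are all hypothetical.

```python
def score_prey(grows_with_bait, grows_without_bait):
    """Interpret reporter activity for one prey clone.

    grows_with_bait:    reporter active when bait and prey are co-expressed
    grows_without_bait: reporter active in the prey-only control
    """
    if grows_without_bait:
        # Reporter is on even without bait, so a positive result with bait
        # cannot be attributed to a bait:prey interaction.
        return "false positive (prey alone activates reporter)"
    if grows_with_bait:
        return "candidate interaction"
    return "no interaction detected"

# Hypothetical observations: (with bait, prey-only control).
observations = {
    "prey_1": (True, False),
    "prey_2": (True, True),
    "prey_3": (False, False),
}

calls = {name: score_prey(*result) for name, result in observations.items()}
for name, call in calls.items():
    print(name, "->", call)
```

Note that "no interaction detected" is not proof of absence: as the slide points out, misfolding or poor nuclear localisation can produce false negatives.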

8 Automated Y2H Screening. Y2H screening can be done at ‘high throughput’ while maintaining quality. The protocol is as follows: (1) combine the bait with the prey library via an optimised mating protocol, using microtitre plates and laboratory robotics; (2) select the ‘positives’ via a quantitative signal; (3) PCR-amplify the library inserts and analyse them by sequencing and bioinformatics.

9 Biochemical evidence for interactions between bait and prey proteins must follow up Y2H-based indications of interactions. This evidence may include: Affinity chromatography – link the bait to a gel matrix and use this to specifically bind (and purify) interacting prey from complex protein mixtures. Gel overlay assays – similar to Western blots, except that instead of using a specific antibody to probe the membrane, the bait protein is used. The prey/bait interaction is then visualised using an antibody to the bait protein.

10 Co-immunoprecipitation – use an anti-bait antibody to immunoprecipitate bait protein from a complex mixture of proteins – prey proteins that bind bait are likely to be co-precipitated.

11 Protein Interaction Databases. Using Y2H approaches, databases of protein interactions have been created. The application of automated high-throughput Y2H analysis has led to a dramatic increase in the number of protein interactions in these databases. However, it is likely that only a small fraction of the total number of interactions has, as yet, been identified. Parallel efforts in yeast and Drosophila show little overlap in datasets. False positives continue to be a problem, given that the number of interactions that need to be tested by biochemical techniques is overwhelming.
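One simple way to put a number on the overlap between two screens is the Jaccard index (shared interactions divided by all interactions seen in either screen). The sketch below uses invented protein pairs purely for illustration; the function names are hypothetical and the measure is a suggestion, not one named in the lecture.

```python
def normalise(pairs):
    """Treat (A, B) and (B, A) as the same undirected interaction."""
    return {frozenset(pair) for pair in pairs}

def jaccard(pairs_a, pairs_b):
    """Shared interactions divided by all interactions in either dataset."""
    a, b = normalise(pairs_a), normalise(pairs_b)
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)

# Toy datasets: only the A-B interaction is found by both screens.
screen_1 = [("A", "B"), ("B", "C"), ("C", "D")]
screen_2 = [("B", "A"), ("D", "E"), ("E", "F")]

overlap = jaccard(screen_1, screen_2)
print(overlap)
```

Storing each interaction as a `frozenset` makes the comparison independent of the order in which bait and prey happen to be listed.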

12 Computational approaches for verification of Y2H data. 50% or more of the high-throughput Y2H data are likely to be false positives, i.e. the results are ‘noisy’. Computational methods designed to test the quality of ‘interaction maps’ have been developed. The basis for most of these strategies is to test for correlation between interaction data and other properties of the proteins, protein networks or the corresponding genes. Interactions that are evolutionarily conserved have a higher probability of being biologically relevant than those detected in only a single organism. Similarly, if two proteins implicated in an interaction have paralogues that also interact, the interaction is more likely to be genuine. Genes whose encoded proteins interact may also be more likely than random gene pairs to be transcriptionally co-regulated.
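The co-regulation check can be illustrated with a Pearson correlation between expression profiles of the two genes in each reported pair. This is a minimal sketch under stated assumptions: the gene names and expression values are invented toy data, and how a correlation should be turned into a confidence call is left open.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

# Toy expression profiles across four conditions (hypothetical values).
expression = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.0],
    "geneC": [4.0, 3.0, 2.0, 1.0],
}

# Interactions reported by a hypothetical Y2H screen.
interactions = [("geneA", "geneB"), ("geneA", "geneC")]

support = {
    pair: pearson(expression[pair[0]], expression[pair[1]])
    for pair in interactions
}
for pair, r in support.items():
    print(pair, round(r, 3))
```

A strongly positive correlation adds support to an interaction; a weak or negative one does not disprove it, since many genuine partners are not co-regulated.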

13 Connectivity of protein networks can also be used, i.e. if protein A interacts with proteins B and C, then the finding that B and C also interact forms a closed loop of three proteins and gives a measure of interaction reliability. Such a group of proteins may form a module reflecting a discrete biological activity. Such modules are often evolutionarily conserved. Another approach is to evaluate the functional activities of interacting proteins. Given that a set of interacting proteins is likely to work in the same biological process, common functional annotations for such proteins support the relevance of the interaction. Other comparisons can be made between interaction data and the available set of protein structures or protein domains.
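The closed-loop idea can be made concrete by counting, for each reported interaction, how many proteins interact with both partners. The sketch below is illustrative only; the edge list is invented and `triangle_support` is a hypothetical helper name.

```python
from collections import defaultdict

def build_adjacency(edges):
    """Map each protein to the set of its interaction partners."""
    adjacency = defaultdict(set)
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)
    return adjacency

def triangle_support(edges):
    """For each edge, count third proteins that close a loop of three."""
    adjacency = build_adjacency(edges)
    return {(a, b): len(adjacency[a] & adjacency[b]) for a, b in edges}

# Toy network: A, B and C form a closed loop; D hangs off C.
edges = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D")]

support = triangle_support(edges)
for edge, count in support.items():
    print(edge, "closed loops:", count)
```

Here the three edges of the A-B-C loop each receive one supporting triangle, while the C-D interaction receives none and would need other evidence.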

14 http://dip.doe-mbi.ucla.edu/

15 The DIP can be searched in a number of different ways. Here the database has been searched with a protein sequence. The search results show, in addition to Protein Name/Description, a Node and a Links link.

16 The Node link gives information about that protein as a ‘node’ in an interacting network. A graphical view of the node is available – using the graph link.

17 A list of the interacting proteins is available from the ‘Links’ link in the search result.

18 Network graphs. A network of interacting proteins can be drawn as a graph, with proteins as the nodes. The lines between nodes are called edges or arcs. The number of edges touching a node is called the degree of the node.
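As a small illustration of these terms, the degree of every node can be computed directly from a list of edges. The protein names below are placeholders, not entries from any database.

```python
from collections import Counter

def node_degrees(edges):
    """Count how many edges touch each node of an undirected graph."""
    degrees = Counter()
    for a, b in edges:
        degrees[a] += 1
        degrees[b] += 1
    return degrees

# Toy interaction network with four proteins and four edges.
edges = [("P1", "P2"), ("P1", "P3"), ("P1", "P4"), ("P2", "P3")]

degrees = node_degrees(edges)
for protein, degree in sorted(degrees.items()):
    print(protein, degree)
```

In this toy network P1 has degree 3, P2 and P3 have degree 2, and P4 has degree 1.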

19 The C. elegans interactome – a network of 2898 nodes connected by 5460 edges. The terms core and non-core refer to the ‘confidence’ of the interaction in HT-Y2H screens; interologs are conserved interactions as found by in silico searches and scaffolds are interactions revealed by other partial (i.e. relating to specific biological processes [e.g. protein degradation] ) interactome maps of C. elegans.
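As a quick worked check on the figures quoted for this network: every edge touches two nodes, so the mean degree is twice the number of edges divided by the number of nodes. The variable names are just for illustration.

```python
# Figures quoted on the slide for the C. elegans interactome map.
nodes = 2898
edges = 5460

# Each edge adds one to the degree of two nodes.
mean_degree = 2 * edges / nodes
print(round(mean_degree, 2))
```

This gives a mean of roughly 3.8 interaction partners per protein in the map, although the degree distribution itself is far from uniform.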

20 A highly interconnected sub-network around two C. elegans proteins – that are components of a conserved network of transcription factors.

21 http://www.droidb.org/Index.jsp

22 http://string.embl.de/

23

24 Interactome Properties. A feature of protein networks that emerges from large-scale approaches is that the number of links per protein is highly non-uniform, ranging from a few hubs with many connections to the great majority of proteins with only a few connections, i.e. there is a ‘scale-free’ degree distribution. Any two proteins can be connected by a path with only a few links (a characteristic of the World Wide Web!). This is the ‘small-world’ property. The evolution of this topology can be explained by the preferential attachment of new nodes to ones that already have many links, in a process related to gene duplication, i.e. highly connected proteins are more likely to interact with a protein that is duplicated. Thus, highly connected proteins gain even more links. Another aspect of protein networks is their robustness, with random loss of proteins mostly affecting the many proteins with only a few partners rather than the small number of hubs. In yeast, deletion of genes encoding highly connected proteins is three times more likely to result in a lethal phenotype than deletion of other genes.
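Preferential attachment can be simulated in a few lines. The sketch below is a generic toy model, assuming each new node makes exactly one link and chooses its partner with probability proportional to that partner's current degree; it is not a model of gene duplication itself, and `grow_network` is a hypothetical name.

```python
import random
from collections import Counter

def grow_network(n_nodes, seed=0):
    """Grow a network in which new nodes prefer well-connected partners."""
    rng = random.Random(seed)
    edges = [(0, 1)]
    # Each node appears in this list once per edge it touches, so a uniform
    # draw from the list picks a node with probability proportional to degree.
    endpoints = [0, 1]
    for new_node in range(2, n_nodes):
        partner = rng.choice(endpoints)
        edges.append((new_node, partner))
        endpoints.extend([new_node, partner])
    return edges

edges = grow_network(500)
degree = Counter(node for edge in edges for node in edge)

# Tabulate how many nodes have each degree.
distribution = Counter(degree.values())
for k in sorted(distribution):
    print("degree", k, "nodes", distribution[k])
```

Running this typically shows most nodes with one or two links and a handful of highly connected hubs, the qualitative pattern described on the slide.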

25 Summary. In order to fully understand what proteins do, we need to know something about the other proteins that they interact with. Comparative genomic analysis can be used to predict protein-protein interactions. Experimentally, the yeast two-hybrid (Y2H) method uses a protein of interest as bait in order to discover interacting (or ‘prey’) proteins. Y2H screening can be done at ‘high throughput’ while maintaining quality. Biochemical evidence (affinity chromatography, gel-overlay assays, co-immunoprecipitations) for interactions between bait and prey proteins must follow up Y2H-based indications of interactions. Computational approaches are used for verification of Y2H data. Protein-protein interactions (the interactome) can be represented by network graphs.

26 References. Uetz, P. et al. (2000) ‘A comprehensive analysis of protein-protein interactions in Saccharomyces cerevisiae’. Nature 403, 623-627. Fields, S. (2005) ‘High-throughput two-hybrid analysis – the promise and the peril’. FEBS Journal 272, 5391-5399. Li, S. et al. (2004) ‘A map of the interactome network of the metazoan C. elegans’. Science 303, 540-543.

