Part II Protein interactions and networks Peer Bork EMBL & MDC Heidelberg & Berlin Proteome analysis in.

1 Part II Protein interactions and networks bork@embl.de http://www.bork.embl-heidelberg.de/ Peer Bork EMBL & MDC Heidelberg & Berlin Proteome analysis in silico

2 II. Protein network analysis: Genomic context analysis: Interaction predictions; Building and destroying interaction networks; STRING: a framework for network analysis; Towards spatial and temporal network aspects

3 Genomic context methods to predict protein interactions. Korbel et al., Nat. Biotechn. 04; Morett et al., Nat. Biotechn. 03; Dandekar et al., TIBS 98; Overbeek et al., PNAS 99; Pellegrini et al., PNAS 99; Enright et al., Nature 99; Marcotte et al., Science 99

4 Prediction of analogous enzymes by anti-correlation of gene occurrences
Species   A  B  C  D
Gene a    -  +  -  -
Gene b    +  -  +  -
Application: thiamine-PP biosynthesis. Collaboration with Enrique Morett et al., Mexico. Morett et al., Nature Biotech. 21(03)790
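The anti-correlation idea on this slide can be illustrated with a minimal sketch. It assumes each gene is described by a presence/absence profile over the same ordered list of species; the function names and the scoring (fraction of species in which exactly one of the two genes occurs) are illustrative assumptions, not the published method of Morett et al.

```python
def anticorrelation(profile_a, profile_b):
    """Fraction of species in which exactly one of the two genes is present."""
    if len(profile_a) != len(profile_b):
        raise ValueError("profiles must cover the same species")
    differing = sum(a != b for a, b in zip(profile_a, profile_b))
    return differing / len(profile_a)


def never_cooccur(profile_a, profile_b):
    """True if no species carries both genes (a hint at analogous enzymes)."""
    return not any(a and b for a, b in zip(profile_a, profile_b))


# Presence (True) / absence (False) in species A, B, C, D, as in the slide's table.
gene_a = [False, True, False, False]
gene_b = [True, False, True, False]

score = anticorrelation(gene_a, gene_b)      # species A, B and C differ; D lacks both
exclusive = never_cooccur(gene_a, gene_b)
print(score, exclusive)
```

Two genes that never co-occur but together cover most species that have the pathway are candidates for performing the same step with unrelated proteins.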

5 Gene neighbourhood conservation at evolutionary time scales. Conservation of divergently transcribed gene pairs reveals functional constraints

6 The more conserved divergently transcribed neighboring genes are, the higher their level of co-expression. The resulting prediction method can reliably predict associations between >2500 pairs of genes, ca. 650 of which are supported by other methods. Korbel, Jensen, von Mering, Bork, Nat. Biotechnol. 2004, July

7 Transcriptional regulators comprise the majority of conserved divergently transcribed gene pairs. They are all self-regulatory!

8 Coverage: Homology vs. context (80% accuracy level, taken from STRING COG mode) Huynen, Snel, von Mering and Bork. Curr.Opin.Cell.Biol. 15(03)191

9 II. Protein network analysis: Genomic context analysis: Interaction predictions; Building and destroying interaction networks; STRING: a framework for network analysis; Towards spatial and temporal network aspects. Highlighted: Building and destroying interaction networks

10 Three context methods to predict functional interactions: Phylogenetic co-occurrence; Conserved neighborhood; Gene fusion events. Combined and quantified in STRING, allowing the study of networks. Von Mering et al., NAR 31(03)258
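The slide says the three context methods are combined and quantified in STRING but does not give the formula. As a hedged sketch, one common way to merge per-method confidence scores is to treat them as independent probabilities and compute the chance that at least one method is right. The function name and the example scores below are assumptions for illustration only.

```python
import math


def combined_score(scores):
    """Combine independent confidence scores s_i as 1 - prod(1 - s_i)."""
    scores = list(scores)
    for s in scores:
        if not 0.0 <= s <= 1.0:
            raise ValueError("each score must lie in [0, 1]")
    return 1.0 - math.prod(1.0 - s for s in scores)


# Hypothetical scores for one gene pair from the three context methods.
evidence = {"co-occurrence": 0.5, "neighborhood": 0.5, "fusion": 0.0}
print(combined_score(evidence.values()))
```

Under this scheme a pair supported weakly by several methods can outrank a pair supported moderately by only one, which is the point of integrating the evidence types.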

11 Biochemical pathways vs functional modules. [Figure: pathway representation compared with functional modules from comparative genomics; examples: purine biosynthesis, histidine biosynthesis] www.string.embl-heidelberg.de

12 Giant component of gene context network. The more conservation (red), the higher the number of connections. High local connectivity (c=0.6), hence a lot of substructure
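Assuming that c on this slide denotes the average clustering coefficient (the slide does not define it), the quantity can be sketched as follows for an undirected network stored as an adjacency dictionary. The toy graph is hypothetical.

```python
from itertools import combinations


def local_clustering(adj, node):
    """Fraction of a node's neighbor pairs that are themselves connected."""
    neighbors = adj[node]
    k = len(neighbors)
    if k < 2:
        return 0.0  # fewer than two neighbors: no pairs to close
    links = sum(1 for u, v in combinations(neighbors, 2) if v in adj[u])
    return 2 * links / (k * (k - 1))


def average_clustering(adj):
    """Mean of the local clustering coefficients over all nodes."""
    return sum(local_clustering(adj, n) for n in adj) / len(adj)


# Hypothetical toy network: a triangle a-b-c with a pendant node d on c.
network = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
}
print(average_clustering(network))
```

A value near 0.6 would mean that, on average, most neighbors of a gene are also linked to each other, which is consistent with the slide's remark about substructure.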

13 Biochemical pathways vs functional modules. [Figure: pathway representation compared with functional modules from comparative genomics, obtained by unsupervised clustering; examples: purine biosynthesis, histidine biosynthesis] Coverage: >70%. Specificity: ca. 90%. Von Mering et al., PNAS 100 (2003) 15428

14 Biological discoveries
- Functional assignment of >3000 hypothetical proteins
- ‘Target’ for transcription regulators, transporters etc.
- Pathway links (CoA and nucleotide biosynth.)
- Missing enzymes in known pathways
- Potentially novel pathways/processes/complexes
- Independent modules within known pathways

15 Synergies between homology and context based methods. [Figure: STRING annotations for two queries. Query protein: known transcriptional regulator PyrR; labels: Uracil Permease, Uncharacterized, Pyrimidine biosynthesis, known. Query protein: putative transcriptional regulator, uncharacterized; labels: Riboflavin biosynthesis, Uncharacterized response regulator, novel] Doerks et al., TIG, 2004

16 Biological discoveries
- Functional assignment of >3000 hypothetical proteins
- ‘Target’ for transcription regulators, transporters etc.
- Pathway links (CoA and nucleotide biosynth.)
- Missing enzymes in known pathways
- Potentially novel pathways/processes/complexes
- Independent modules within known pathways

17 Functional modules in E. coli. About 650 modules predicted (120 metabolic). About 140 modules dominated by ‘hypotheticals’. (Only modules with >3 nodes shown.) Functional Categories (COG): Information Processing: Translation, Transcription, DNA; Cellular Processes: Transport, Motility, Signalling; Metabolism: Anabolism, Catabolism, Energy; Unassigned/Uncharacterized, or multiple assignments. [Figure: example module with YeaG, YcgB, YeaH, YfbU; YeaH: predicted Integrin I domain; predicted ATPase domain: YeaG]

18 II. Protein network analysis: Genomic context analysis: Interaction predictions; Building and destroying interaction networks; STRING: a framework for network analysis; Towards spatial and temporal network aspects. Highlighted: Building and destroying interaction networks; STRING: a framework for network analysis

19 Functional associations between proteins: 80,000 from large-scale approaches in yeast
- yeast two-hybrid (Uetz et al., 2000; Ito et al., 2000, 2001): 5125
- complex purification (analysis by mass spectrometry): 18027 (TAP), 33014 (HMS-PCI)
- mRNA synexpression (cell-cycle + Rosetta data): ~15000
- in silico predictions (neighborhood, fusion, co-occurrence): ~7000 (~9000 new)
- genetic interactions (synthetic lethality; Tong et al., 2001 + MIPS): 886
- small-scale interactions (MIPS+YPD): ~11000

20 Counting functional associations: binary interactions vs. groups of interacting proteins. [Figure: network of LPD1, ARC1, CDC3, CDC10, SHS1, CIN2, CDC12, CDC11, SPR28, GIN4; legend: TAP purification, two-hybrid interaction, HMS-PCI purification, annotated member of septin complex]
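The slide contrasts binary interactions with groups of co-purified proteins but does not say how a group is converted into pairwise counts. Two conventions often used for this are sketched below as assumptions: a "spoke" count (bait to each prey) and a "matrix" count (every pair in the group). The choice of CDC3 as bait and the prey list are hypothetical; only the protein names come from the slide.

```python
from itertools import combinations


def spoke_pairs(bait, preys):
    """Count only bait-prey pairs of one purification."""
    return {frozenset((bait, prey)) for prey in preys if prey != bait}


def matrix_pairs(bait, preys):
    """Count every pair among all proteins found in one purification."""
    members = {bait, *preys}
    return {frozenset(pair) for pair in combinations(sorted(members), 2)}


# Hypothetical purification: which septin was the bait is an assumption.
bait = "CDC3"
preys = ["CDC10", "CDC11", "CDC12"]

print(len(spoke_pairs(bait, preys)), len(matrix_pairs(bait, preys)))
```

For a purification with n proteins the spoke count grows as n - 1 and the matrix count as n(n - 1)/2, so the convention strongly affects totals reported for complex-purification data.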

21 Distribution of interacting proteins (TAP complexes). [Figure: matrix of interaction density (actual interactions per 1000 possible pairs; color scale 0 to 10) between functional categories, axis codes E G M P T B F O A R D C U: energy production, amino acid metabolism, other metabolism, translation, transcription, transcriptional control, protein fate, cellular organization, transport and sensing, stress and defense, genome maintenance, cellular fate/organization, uncharacterized]
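The interaction density named on this slide (actual interactions per 1000 possible pairs) can be sketched as below. The sketch assumes each protein belongs to exactly one functional category and that interactions are unique unordered pairs; the protein identifiers are hypothetical.

```python
from collections import Counter


def interaction_density(category_of, interactions, cat_a, cat_b):
    """Observed interactions per 1000 possible protein pairs between two categories."""
    sizes = Counter(category_of.values())
    n_a, n_b = sizes[cat_a], sizes[cat_b]
    # Within one category count unordered pairs; across categories count all combinations.
    possible = n_a * (n_a - 1) // 2 if cat_a == cat_b else n_a * n_b
    if possible == 0:
        return 0.0
    wanted = {cat_a, cat_b}
    observed = sum(
        1 for p, q in interactions if {category_of[p], category_of[q]} == wanted
    )
    return 1000 * observed / possible


# Hypothetical toy data: three proteins in category E, two in category T.
category_of = {"e1": "E", "e2": "E", "e3": "E", "t1": "T", "t2": "T"}
interactions = [("e1", "e2"), ("e1", "t1")]

print(interaction_density(category_of, interactions, "E", "E"))
print(interaction_density(category_of, interactions, "E", "T"))
```

Normalizing by the number of possible pairs makes large and small categories comparable, so a dense diagonal in such a matrix indicates that interactions concentrate within functional categories.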

22 Reference interactions. [Figure: two interaction-density matrices over the functional categories E G M P T B F O A R D C U] Manually annotated protein complexes (MIPS / YPD): 10907 interactions. High-throughput interaction data (overlap of 2+ methods): 2455 interactions

23 Protein interaction datasets. [Figure: six interaction-density matrices over the functional categories E G M P T B F O A R D C U]
- purified complexes (TAP): 18027 interactions
- purified complexes (HMS-PCI): 33014 interactions
- genomic associations: 7446 interactions
- synthetic lethals: 886 interactions
- yeast two-hybrid: 5125 interactions
- mRNA synexpression: 16496 interactions

24 Benchmarking high-throughput interaction data (update to 89 species). A probabilistic approach for function prediction. [Figure: accuracy (fraction of data confirmed by reference set, %; log scale) against coverage (fraction of reference set covered by data, %; log scale), both axes from 0.1 to 100; raw data, filtered data and parameter choices shown for purified complexes TAP, purified complexes HMS-PCI, yeast two-hybrid, mRNA synexpression, genomic associations, synthetic lethality, and combined evidence (two methods, three methods)] Von Mering, C., Krause, R., Snel, B., Oliver, S.G., Fields, S. and Bork, P., Nature 417 (2002) 399
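The two benchmark axes defined on this slide translate directly into set arithmetic. The sketch below follows those definitions literally (accuracy: fraction of data confirmed by the reference set; coverage: fraction of the reference set covered by the data) and ignores any filtering the original study may have applied; the protein identifiers are hypothetical.

```python
def as_pairs(edges):
    """Normalize interactions to unordered pairs so (a, b) equals (b, a)."""
    return {frozenset(edge) for edge in edges}


def benchmark(data, reference):
    """Return (accuracy %, coverage %) of a dataset against a reference set."""
    data, reference = as_pairs(data), as_pairs(reference)
    if not data or not reference:
        raise ValueError("both sets must be non-empty")
    confirmed = data & reference
    accuracy = 100 * len(confirmed) / len(data)
    coverage = 100 * len(confirmed) / len(reference)
    return accuracy, coverage


# Hypothetical toy sets: two of four predicted pairs appear among five reference pairs.
data = [("P1", "P2"), ("P3", "P2"), ("P1", "P4"), ("P5", "P6")]
reference = [("P2", "P1"), ("P2", "P3"), ("P3", "P4"), ("P4", "P5"), ("P1", "P6")]

print(benchmark(data, reference))
```

Requiring support from two or three methods shrinks the data set, which typically raises accuracy at the cost of coverage; that trade-off is what the plot on this slide displays.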

25 STRING: known and predicted functional links. Query: "Please show me the functional context of these proteins": ATP1, QCR2

26 STRING: known and predicted functional links. [Figure: network around QCR2 with the groups Ubiquinol-Cyt.C reductase and ATP synthase; evidence types: High-throughput Experiments; Literature Co-occurrence; Phylogenetic Profiles; Conserved Neighborhood; Known Pathways/Complexes; Co-expression]

27 II. Protein network analysis: Genomic context analysis: Interaction predictions; Building and destroying interaction networks; STRING: a framework for network analysis; Towards spatial and temporal network aspects. Highlighted: STRING: a framework for network analysis; Towards spatial and temporal network aspects

28 EMBL’s Structural and Computational Biology unit: From molecules to organisms. [Figure labels: Protein/DNA, Complex, Subcellular structure, Cell, Organism; NMR, Xray, EM, 3D tomography, Computational Biology, Synchrotrons, Core facilities; Cell Biology, Gene expression, Developmental Biology; cell diagram: Endosomes, Peroxisomes, Mitochondria, Golgi, ER, Microtubules, Nucleus. In red: other EMBL units]

29 From interactions to 3D protein complexes: Large scale modeling and EM mapping. TAP (Cellzome) x300. Steps: Characterise the domains; Predict which subunits interact; Build assembly. (Exosome case study: Aloy et al., EMBO Rep, 2002.) [Figure labels: Interface of 3D structure of interaction; Parameters: Side-chain to side-chain, Side-chain to main-chain; Have we seen any of these domains interacting before?; rules; constraints; Compatible?; EM screen]

30 Structure-based assembly of protein complexes: From functional associations to three dimensional assemblies (3D). Analysis of 101 yeast complexes and their interactions. Aloy, P., Boettcher B., Ceulemans, H., Leutwein, C., Mellwig, C., Fischer, S., Gavin, A.-C., Bork, P., Superti-Furga, G., Serrano, L. and Russell, R.B., Science 303 (2004) 2026

31 Dynamic complex formation during the 90 min yeast cell cycle (4D)
- Multiple arrays reveal 600 periodically expressed genes
- Projection to interaction data identifies novel assemblies
- Details on the time dependent formation in some assemblies revealed
- Some unknown proteins detected in well-studied cell cycle assemblies
Color: periodically expressed proteins. Lichtenberg, Larsen et al.

32 Losses/Gains of Functional Associations. [Figure legend: M. pneumoniae & M. genitalium; M. pneumoniae only] (Linked by conserved neighborhood or fused proteins, combined score >0.95)

33 Comparison of the interaction networks in three mollicutes: differential analysis. [Table: gene present (+) in M. pneumoniae, M. pulmonis and U. parvum for: urease enzyme complex; ABC-type phosphate transport system (incl. regulator); fructose-specific phosphotransferase system (plus assoc. enzymes); ribose/xylose sugar-transport; glycerol metabolism]
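A differential analysis of two association networks can be sketched with set operations. The example assumes that proteins of both genomes have already been mapped to shared orthologous-group identifiers and reuses the combined score cutoff of 0.95 quoted on the previous slide; the identifiers, scores and function names are hypothetical.

```python
def confident(associations, threshold=0.95):
    """Keep unordered pairs whose combined score exceeds the threshold."""
    return {frozenset(pair) for pair, score in associations.items() if score > threshold}


def differential(assoc_a, assoc_b, threshold=0.95):
    """Split confident associations into shared and genome-specific sets."""
    a, b = confident(assoc_a, threshold), confident(assoc_b, threshold)
    return {"shared": a & b, "only_a": a - b, "only_b": b - a}


# Hypothetical scored associations between orthologous groups in two genomes.
genome_a = {("OG1", "OG2"): 0.99, ("OG3", "OG4"): 0.97, ("OG5", "OG6"): 0.60}
genome_b = {("OG2", "OG1"): 0.98, ("OG5", "OG6"): 0.96}

result = differential(genome_a, genome_b)
print(result)
```

An association found in only one genome can reflect either a missing gene or a lost linkage between genes that are both still present, so the genome-specific sets are a starting point for inspection rather than a final answer.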

34 Modification of functional modules at evolutionary time scales: TCA cycle. Huynen et al., TIM 1999

35 Summary (network analysis)
- Gene context methods already reach ca. 90% specificity / 70% sensitivity in predicting functional modules in prokaryotes
- Gene context and other concepts for interaction prediction not only complement homology approaches, but are about to offer more functional information than BLAST et al.
- In eukaryotes, accurate prediction of networks and modules is still difficult, and heterogeneous experimental data have to be integrated
- Spatial and temporal aspects of protein networks have great potential, although data are still limited

37 Credits. Topics: Context methods; Functional modules; STRING; Networks in 3D; Network in 4D. Ulrik de Lichtenberg, Soren Brunak (CBS); Christos Ouzounis et al. (EBI); Rob Russell, Patrick Aloy, Bettina Boettcher (EMBL); Cellzome AG; Berend Snel, Martijn Huynen (Nejm); Enrique Morett et al. (Mex); Chicken international sequencing and analysis consortium; + all other group members + many experim. collaborators

