1
FunCoup: reconstructing protein networks in the worm and other animals
Andrey Alexeyenko, Erik Sonnhammer
Stockholm Bioinformatics Center
2
C. elegans computed interactomes
3
FunCoup is a data integration framework to discover functional coupling in eukaryotic proteomes with data from model organisms.
[Diagram: worm proteins A and B joined by an unknown link ("?"); "Find orthologs*" in mouse, human, fly and yeast; high-throughput evidence]
4
FunCoup
- Each piece of data is evaluated
- Data FROM many eukaryotes (7)
- Practical maximum of data sources (>60)
- Predicted networks FOR a number of eukaryotes (8)
- Organism-specific, efficient and robust Bayesian frameworks
- Orthology-based information transfer and phylogenetic profiling
- Networks predicted for different types of functional coupling (metabolic, signaling, etc.)
5
C. elegans’ benefit from the model species data integration:
[Diagram labels: Li & Vidal’s set, 5535 pairs; IntAct (Oct. 2007), 4517 pairs; 6841; other C. elegans data; 36000 predicted C. elegans pairs]
6
Data sources in FunCoup
Species: H. sapiens, M. musculus, R. norvegicus, D. melanogaster, C. elegans, S. cerevisiae, A. thaliana
Types: protein-protein interactions, protein domain associations, protein-DNA interactions, mRNA expression, protein expression, miRNA targeting, sub-cellular co-localization, phylogenetic profiling
7
Multilateral data transfer
[Diagram: Human, Ciona, Worm, Mouse, Rat, Fly, Yeast and Arabidopsis connected through FunCoup]
Data from the same species is an important but not indispensable component of the framework. Hence, a network can be constructed for an organism with no experimental datasets at all.
8
InParanoid
[Diagram: Proteome A and Proteome B; reciprocally best hits ~ seed orthologs; inparalogs]
Maido Remm, Christian E. V. Storm and Erik L. L. Sonnhammer. Automatic clustering of orthologs and in-paralogs from pairwise species comparisons. Journal of Molecular Biology 314(5), 14 December 2001, pages 1041-1052.
9
How does orthology work?
[Plot: log overlap between KEGG pathways and complexes (Gavin et al., 2006)]
10
Comparing networks Rat Human Mouse
11
Conclusions
FunCoup:
- is a flexible, exhaustive, and robust framework to infer confident functional links
- enables practical web access to candidate interactions in both small and global-scale network context
- is open towards better data quality and coverage
http://FunCoup.sbc.su.se
12
Acknowledgements: Carsten Daub, Kristoffer Forslund, Anna Henricson, Olof Karlberg, Martin Klammer, Mats Lindskog, Kevin O’Brien, Tomas Ohlson, Sanjit Rupra, Gabriel Östlund, Sean Hooper. All previous interaction network developers.
13
Talk outline
- Other network resources
- Why FunCoup
- Orthology and InParanoid
- Implementation
- Applications and future development
14
FunCoup is a naïve Bayesian network (NBN)
Bayesian inference, with C = "genes A and B are functionally coupled" and E = "genes A and B are co-expressed":
P(C|E) = P(C) * P(E|C) / P(E)
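As a rough numeric illustration of the Bayes rule on this slide, the sketch below computes P(C|E) from a prior and two conditional probabilities. All numbers are invented for illustration and are not FunCoup estimates; P(E) is expanded with the law of total probability, which the slide does not spell out.

```python
def posterior(prior_c, p_e_given_c, p_e_given_not_c):
    """Return P(C|E) by Bayes' rule.

    P(E) is expanded as P(E|C) * P(C) + P(E|not C) * P(not C).
    """
    p_e = p_e_given_c * prior_c + p_e_given_not_c * (1.0 - prior_c)
    return prior_c * p_e_given_c / p_e


# Hypothetical values: 1% of gene pairs are coupled; co-expression is seen
# in 30% of coupled pairs and in 3% of uncoupled pairs.
p = posterior(prior_c=0.01, p_e_given_c=0.30, p_e_given_not_c=0.03)
print(f"P(C|E) = {p:.4f}")
```

With these made-up inputs the evidence raises the belief in coupling well above the 1% prior, yet the posterior stays small because the prior is small.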
15
Problem: In situations with multiple inparalogs, how to deal with alternative evidence?
Solution: Treat ALL inparalogs equally, and choose the BEST value.
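One way to read "treat all inparalogs equally and choose the best value" is sketched below: when genes A and B each map to several inparalogs in another species, every mapped pair is considered and the strongest evidence score is kept. The function name, the data layout and the assumption that "best" means "largest" are all made up for this illustration.

```python
from itertools import product


def best_transferred_value(inparalogs_a, inparalogs_b, evidence):
    """Return the best evidence value over all inparalog pairs, or None.

    inparalogs_a / inparalogs_b: genes in the source species that are
    orthologous to target genes A and B (all treated equally).
    evidence: dict mapping frozenset({gene1, gene2}) -> score.
    """
    values = [
        evidence[frozenset((x, y))]
        for x, y in product(inparalogs_a, inparalogs_b)
        if x != y and frozenset((x, y)) in evidence
    ]
    return max(values) if values else None


# Hypothetical mouse co-expression scores for inparalogs of worm genes A and B.
evidence = {
    frozenset(("mA1", "mB1")): 0.42,
    frozenset(("mA2", "mB1")): 0.77,
}
best = best_transferred_value(["mA1", "mA2"], ["mB1"], evidence)
```

Returning `None` when no inparalog pair has data keeps "no evidence" distinct from "weak evidence".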
16
Problem: Absolute probabilities of FC are intractable. The full Bayesian network is impossible.
Solution: Naïve Bayesian network. Calculate a belief change instead (likelihood ratios, LR). Assume NO data dependency.
[Diagram: the full network would need pairwise conditional probabilities P(B|A), P(A|B), P(B|C), P(C|B), P(B|D), P(D|B), P(A|C), P(C|A), P(D|C), P(C|D), P(A|D), P(D|A); the naïve network attaches only P(E|+) / P(E|-) terms to the link A-B]
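Under the naïve independence assumption, the likelihood ratios of the individual data sets multiply, so their logarithms add. The following is a minimal sketch of that combination. The probabilities are invented, and the slide does not state whether FunCoup uses exactly this scoring form.

```python
import math


def log_likelihood_ratio(p_e_given_pos, p_e_given_neg):
    """Log of P(E|+) / P(E|-) for one piece of evidence."""
    return math.log(p_e_given_pos / p_e_given_neg)


def combined_log_lr(evidence_probs):
    """Sum per-data-set log LRs, assuming the data sets are independent."""
    return sum(log_likelihood_ratio(p, n) for p, n in evidence_probs)


# Hypothetical (P(E|+), P(E|-)) pairs for three data sets observed for A-B.
observed = [(0.30, 0.03), (0.10, 0.05), (0.02, 0.04)]
score = combined_log_lr(observed)        # > 0: belief in coupling increased
odds_factor = math.exp(score)            # factor applied to the prior odds
```

A positive summed score means the evidence, taken together, shifts belief towards functional coupling; the third data set here argues against it (LR below 1) but is outweighed by the first two.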
17
Problem: How to establish optimal bridges between species?
Solution: Via groups of orthologs that emerged from speciation.
[Diagram legend: gene evolution; functional link]
18
Homologs
[Diagram: Proteome A and Proteome B]
Homologs: proteins with similar sequence and, thus, common origin
19
An InParanoid cluster of orthologs Inparalogs
20
Problem: Some LR are weak and arise due to non-representative sampling.
Solution: Enforce confidence check and remove insignificant nodes.
[Diagram: P(E|+) / P(E|-) terms on the link A-B, filtered with a χ² test]
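A confidence check of this kind could, for example, compare how positive and negative training pairs fall inside versus outside an evidence bin with a chi-square test on the 2x2 table, and discard the bin's LR if the difference is not significant. The sketch below is one such test under assumed counts and an assumed 0.05 threshold; the slide does not specify the exact table or cut-off used in FunCoup.

```python
import math


def chi2_2x2(pos_in, pos_out, neg_in, neg_out):
    """Pearson chi-square statistic and p-value (1 d.o.f.) for a 2x2 table."""
    table = [[pos_in, pos_out], [neg_in, neg_out]]
    total = pos_in + pos_out + neg_in + neg_out
    row = [sum(r) for r in table]
    col = [table[0][j] + table[1][j] for j in range(2)]
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row[i] * col[j] / total
            chi2 += (table[i][j] - expected) ** 2 / expected
    # Survival function of the chi-square distribution with 1 d.o.f.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


def keep_bin(pos_in, pos_out, neg_in, neg_out, alpha=0.05):
    """Keep an LR only if the positive/negative difference is significant."""
    _, p_value = chi2_2x2(pos_in, pos_out, neg_in, neg_out)
    return p_value < alpha


strong = keep_bin(60, 40, 30, 70)   # clear enrichment of positives
weak = keep_bin(3, 97, 2, 98)       # tiny counts in the bin
```

In the second call the raw LR is 1.5, but it rests on five pairs in total, so the test rejects it as indistinguishable from sampling noise.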
21
Reciprocally best hits
[Diagram: Proteome A and Proteome B]
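A reciprocal best hit is a pair of proteins, one from each proteome, that are each other's highest-scoring match. The sketch below finds such pairs from two tables of similarity scores. The scores and protein names are made up, and it covers only the seed-pair step, not the later addition of inparalogs.

```python
def best_hits(scores):
    """Map each query to its top-scoring target.

    scores: dict mapping (query, target) -> similarity score.
    """
    best = {}
    for (query, target), score in scores.items():
        if query not in best or score > best[query][1]:
            best[query] = (target, score)
    return {q: t for q, (t, _) in best.items()}


def reciprocal_best_hits(scores_a_vs_b, scores_b_vs_a):
    """Return sorted (a, b) pairs that are each other's best hit."""
    a_to_b = best_hits(scores_a_vs_b)
    b_to_a = best_hits(scores_b_vs_a)
    return sorted((a, b) for a, b in a_to_b.items() if b_to_a.get(b) == a)


# Hypothetical similarity scores between proteome A and proteome B.
a_vs_b = {("a1", "b1"): 900, ("a1", "b2"): 300,
          ("a2", "b1"): 850, ("a3", "b3"): 500}
b_vs_a = {("b1", "a1"): 880, ("b1", "a2"): 860,
          ("b2", "a1"): 310, ("b3", "a3"): 490}
rbh = reciprocal_best_hits(a_vs_b, b_vs_a)
```

Here `a2` has `b1` as its best hit, but `b1` prefers `a1`, so `(a2, b1)` is not reciprocal and is left out.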
22
Problem: Definitions and notions of FC vary.
Solution: Multinet. Decide which types of FC are needed (provide as positive training sets) and perform the previous steps customized.
[Diagram: alternative link types between A and B (A <> B, A | B, A || B), each scored with P(E|+) / P(E|-)]
23
Multinet presents several link types in parallel
Proteins of the Parkinson’s disease pathway (KEGG #05020)
[Figure legend: physical protein-protein interaction; “signaling” link; metabolic “non-signaling” link]
24
The limits of data integration
25
FunCoup’s web interface Hooper S., Bork P. Medusa: a simple tool for interaction graph analysis. Bioinformatics. 2005 Dec 15;21(24):4432-3. Epub 2005 Sep 27. http://FunCoup.sbc.su.se
26
Reconstructing the “regulatory blueprint”* in C. intestinalis
[Figure legend: proteins of the “Regulatory Blueprint for a Chordate Embryo” [*]; 18 links mentioned in [*] AND found by FunCoup; links found by FunCoup (about 140). The remaining 202 links from [*], which FunCoup did not find, are not shown.]
*Imai KS, Levine M, Satoh N, Satou Y (2006) Regulatory blueprint for a chordate embryo. Science, 312:1183-7.
27
[Diagram legend: orthologs; functional link; inparalogs. Species: C. elegans, D. melanogaster, human, S. cerevisiae]
Alexeyenko A, Lindberg J, Pérez-Bercoff Å, Sonnhammer ELL. Overview and comparison of ortholog databases. Drug Discovery Today: Technologies (2006) 3(2), 137-143.
28
Problem: Distribution areas informative of FC may vary.
Solution: Find them individually for each data set and FC class, accounting for the joint “feature – class” distribution.
[Plot: positive (+) and negative (-) examples along a Pearson r axis from 0 to 1]
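One way to find the informative regions of a feature separately for each data set is to bin the feature and estimate a likelihood ratio per bin from the positive and negative training pairs. The sketch below does this for Pearson r values; the bin edges, the pseudocount and the sample values are assumptions for illustration, not FunCoup's actual discretization.

```python
import math
from bisect import bisect_right


def binned_log_lr(pos_values, neg_values, edges, pseudocount=1.0):
    """Per-bin log(P(bin|+) / P(bin|-)) with additive smoothing.

    edges: sorted inner bin boundaries; len(edges) + 1 bins result.
    """
    n_bins = len(edges) + 1
    pos_counts = [pseudocount] * n_bins
    neg_counts = [pseudocount] * n_bins
    for v in pos_values:
        pos_counts[bisect_right(edges, v)] += 1
    for v in neg_values:
        neg_counts[bisect_right(edges, v)] += 1
    pos_total, neg_total = sum(pos_counts), sum(neg_counts)
    return [
        math.log((p / pos_total) / (n / neg_total))
        for p, n in zip(pos_counts, neg_counts)
    ]


# Hypothetical Pearson r values for coupled (+) and uncoupled (-) gene pairs.
pos_r = [0.9, 0.85, 0.8, 0.7, 0.75, 0.4, 0.95]
neg_r = [0.1, 0.2, 0.15, 0.3, 0.35, 0.5, 0.05, 0.6]
log_lrs = binned_log_lr(pos_r, neg_r, edges=[0.33, 0.66])
```

Because the bins are scored from the training sets of one data set and one FC class at a time, a different data set can end up with its informative region elsewhere on the axis.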
29
Validation
Jack-knife procedure:
- Take “positive” and “negative” sets.
- Split each randomly 50:50.
- Use the first halves to train the algorithm and the second halves to test the performance.
- Repeat a number of times.
Analysis of variance (ANOVA):
- Introduce features A, B, C into the workflow of FunCoup (e.g., using PCA, selecting nodes of the BN by relevance, ways of using ortholog data, etc.).
- Run FunCoup with all possible combinations of absence/presence of A, B, C to produce a balanced and orthogonal ANOVA design with replicates.
- Study the effects of A, B, C and their combinations (AxB, BxC, ..., AxBxC) to see whether each influences the performance significantly, as if all other effects did not exist.
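The repeated 50:50 validation loop and the presence/absence design described on this slide can be sketched as follows. The `train_and_score` callback, the number of repeats and the feature names are placeholders; the slide does not say how performance was measured.

```python
import random
from itertools import product


def split_half(items, rng):
    """Shuffle a copy of items and split it 50:50 into (train, test)."""
    shuffled = list(items)
    rng.shuffle(shuffled)
    mid = len(shuffled) // 2
    return shuffled[:mid], shuffled[mid:]


def jackknife(positives, negatives, train_and_score, repeats=10, seed=0):
    """Repeat random 50:50 splits; return one performance value per repeat."""
    rng = random.Random(seed)
    results = []
    for _ in range(repeats):
        pos_train, pos_test = split_half(positives, rng)
        neg_train, neg_test = split_half(negatives, rng)
        results.append(train_and_score(pos_train, neg_train, pos_test, neg_test))
    return results


def full_factorial(features):
    """All presence/absence combinations of the given features."""
    return [dict(zip(features, flags))
            for flags in product([False, True], repeat=len(features))]


# Placeholder scorer: reports only the test-set size, standing in for a
# real train/evaluate step.
def dummy_score(pos_train, neg_train, pos_test, neg_test):
    return len(pos_test) + len(neg_test)


scores = jackknife(range(100), range(200), dummy_score, repeats=5)
design = full_factorial(["A", "B", "C"])   # 2**3 = 8 settings per replicate
```

Running the validation loop once per entry of `design` gives every feature combination the same number of replicates, which is what makes the design balanced.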