Expression signatures as biomarkers: solving combinatorial problems with gene networks Andrey Alexeyenko Department of Medical Epidemiology and Biostatistics,

1 Expression signatures as biomarkers: solving combinatorial problems with gene networks. Andrey Alexeyenko, Department of Medical Epidemiology and Biostatistics, Karolinska Institute

2 FunCoup is a data integration framework to discover functional coupling in eukaryotic proteomes with data from model organisms. (Figure: are mouse proteins A and B functionally coupled? Find orthologs in human, fly, rat, and yeast, and collect high-throughput evidence.) Reference: Andrey Alexeyenko and Erik L.L. Sonnhammer. Global networks of functional coupling in eukaryotes from comprehensive data integration. Genome Research. Published in Advance February 25, 2009.

3 FunCoup: Each piece of data is evaluated. Data FROM many eukaryotes (7). Practical maximum of data sources (>50). Predicted networks FOR a number of eukaryotes (10…). Organism-specific, efficient, and robust Bayesian frameworks. Orthology-based information transfer and phylogenetic profiling. Networks predicted for different types of functional coupling (metabolic, signaling, etc.). http://FunCoup.sbc.su.se
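
The slide only says that each data source is evaluated inside a Bayesian framework. One common way to do this is naive Bayesian integration, where every evidence type contributes a log-likelihood ratio and the contributions are summed. The sketch below illustrates that idea only; the function names, the probabilities, and the prior are hypothetical and are not taken from FunCoup.

```python
import math

def log_likelihood_ratio(p_given_coupled, p_given_uncoupled):
    """Log of how much more likely the observed evidence is for coupled pairs."""
    return math.log(p_given_coupled / p_given_uncoupled)

def integrate_evidence(evidence):
    """Sum log-likelihood ratios over evidence types (naive independence assumption)."""
    return sum(log_likelihood_ratio(a, b) for a, b in evidence)

def posterior_probability(total_llr, prior):
    """Turn a summed log-likelihood ratio and a prior into a posterior probability."""
    prior_odds = prior / (1.0 - prior)
    posterior_odds = prior_odds * math.exp(total_llr)
    return posterior_odds / (1.0 + posterior_odds)

# Hypothetical evidence for one gene pair:
# (P(observation | coupled), P(observation | not coupled)) per data source.
evidence = [
    (0.30, 0.10),  # e.g. co-expression supports coupling
    (0.20, 0.10),  # e.g. protein interaction data supports coupling
    (0.05, 0.10),  # e.g. one source speaks against coupling
]

score = integrate_evidence(evidence)
posterior = posterior_probability(score, prior=0.01)  # prior is an assumption
print(f"summed log-likelihood ratio: {score:.3f}")
print(f"posterior probability of coupling: {posterior:.4f}")
```

A source that argues against coupling contributes a negative term, so weak or contradictory data lowers the final score instead of being ignored.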

4 TGFβ cancer pathway cross-talk. FunCoup was queried for any links between members of the TGFβ pathway (left blue circle) and habituées of known cancer pathways (members of at least 7 out of 18 groups; right blue circle). MAPK1 and MAPK3 belonged to both categories. http://FunCoup.sbc.su.se

5 FunCoup: recapitulation of known cancer pathways. Figure 5 from: The Cancer Genome Atlas Research Network. Comprehensive genomic characterization defines human glioblastoma genes and core pathways. Nature. 2008 Sep 4. [Epub ahead of print]. The same genes were submitted to FunCoup. No TCGA data were used. Outgoing links are not shown.

6 Single molecular markers are (often) far from perfect. Combinations (signatures) should perform better. The problem: how to select optimal combinations? Outcome, optimal treatment, severity/urgency, etc.

7 Biomarker discovery in network context. The idea: construct multi-gene predictors with regard to network context; reduce the computational complexity; make marker sets biologically sound. Accounting for network context means taking either (a) network neighbors or (b) genes at remote network positions.
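
The two options on this slide, network neighbors versus genes at remote network positions, can be illustrated with shortest-path distances on a small graph. This is a minimal sketch on a hypothetical toy network; the gene labels, the edges, and the distance cutoff of 3 are assumptions, not values from the talk.

```python
from collections import deque

def shortest_distances(graph, seed):
    """Breadth-first search: number of links from the seed to every reachable gene."""
    dist = {seed: 0}
    queue = deque([seed])
    while queue:
        node = queue.popleft()
        for neighbor in graph.get(node, ()):
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist

def network_neighbors(graph, seed):
    """Option (a): genes directly linked to the seed gene."""
    return sorted(graph.get(seed, ()))

def remote_genes(graph, seed, min_distance=3):
    """Option (b): genes at least min_distance links away, or not connected at all."""
    dist = shortest_distances(graph, seed)
    return sorted(
        gene for gene in graph
        if gene != seed and dist.get(gene, float("inf")) >= min_distance
    )

# Hypothetical undirected toy network stored as adjacency sets.
toy_network = {
    "G1": {"G2", "G3"},
    "G2": {"G1", "G4"},
    "G3": {"G1"},
    "G4": {"G2", "G5"},
    "G5": {"G4"},
    "G6": set(),  # isolated gene, treated as remote from everything
}

neighbors = network_neighbors(toy_network, "G1")
remote = remote_genes(toy_network, "G1", min_distance=3)
print("neighbors of G1:", neighbors)
print("remote from G1:", remote)
```

Restricting candidate combinations to one of these two groups is what reduces the search space compared with testing arbitrary gene sets.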

8 “Rotterdam” dataset (Wang et al., 2005): 286 patients. Expression: ~22000 probes. Clinical data: estrogen receptor status: +/–; lymph node status: all –; relapse: yes/no and time (days).
Procedure
Individual probe p-values (~22000): estrogen receptor-specific ability to predict relapse.
Select most significant probes (1000): candidate members for marker signatures.
Compile set of probes: N probes at a time (e.g. N=20 or N=50).
1. Split data: 75% to train, 25% to test.
2. Produce a linear regression equation (weight terms step-wise, reward for performance, penalize for complexity) on the train sub-set.
3. Apply the equation to the test set to predict outcome (relapse yes/no).
4. Record the specificity/sensitivity (Type I/II error rates) as ROC curve.
Repeat m times.
$\mathrm{RELAPSE} = \gamma_1 g_1 + \gamma_2 g_2 + \gamma_3 g_3 + \dots + \gamma_N g_N$
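
Steps 1 to 4 can be sketched roughly as follows. This is a simplified illustration on synthetic data, not the procedure actually used: ordinary least squares stands in for the step-wise weighting with a complexity penalty, an intercept is added, and the 0.5 decision threshold, the data, and all function names are assumptions.

```python
import numpy as np

def split_indices(n_samples, rng, test_fraction=0.25):
    """Step 1: random split into train (75%) and test (25%) indices."""
    order = rng.permutation(n_samples)
    n_test = int(round(n_samples * test_fraction))
    return order[n_test:], order[:n_test]

def fit_linear(X, y):
    """Step 2 (simplified): least-squares weights; coef[0] is an added intercept."""
    design = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef

def predict_score(coef, X):
    """Step 3: apply the fitted equation to unseen samples."""
    return coef[0] + X @ coef[1:]

def sensitivity_specificity(y_true, y_pred):
    """Step 4: one point of the ROC curve for a single decision threshold."""
    tp = np.sum((y_true == 1) & (y_pred == 1))
    fn = np.sum((y_true == 1) & (y_pred == 0))
    tn = np.sum((y_true == 0) & (y_pred == 0))
    fp = np.sum((y_true == 0) & (y_pred == 1))
    sens = tp / (tp + fn) if (tp + fn) else float("nan")
    spec = tn / (tn + fp) if (tn + fp) else float("nan")
    return float(sens), float(spec)

def evaluate_signature(X, y, m=20, threshold=0.5, seed=0):
    """Repeat the split / fit / predict / record cycle m times."""
    rng = np.random.default_rng(seed)
    results = []
    for _ in range(m):
        train, test = split_indices(len(y), rng)
        coef = fit_linear(X[train], y[train])
        predicted = (predict_score(coef, X[test]) >= threshold).astype(int)
        results.append(sensitivity_specificity(y[test], predicted))
    return results

# Synthetic stand-in for 286 patients and one N=20 probe set (not real data).
data_rng = np.random.default_rng(1)
expression = data_rng.normal(size=(286, 20))
noise = data_rng.normal(scale=0.5, size=286)
relapse = (expression[:, 0] + expression[:, 1] + noise > 0).astype(int)

results = evaluate_signature(expression, relapse, m=20)
mean_sens, mean_spec = np.mean(results, axis=0)
print(f"mean sensitivity {mean_sens:.2f}, mean specificity {mean_spec:.2f}")
```

Sweeping `threshold` over a range of values instead of fixing it at 0.5 would trace out the full ROC curve for each split.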

9 Procedure
Select most significant probes (1000): candidate members for marker signatures.
Compile set of probes: N probes at a time (e.g. N=20 or N=50).
1. Split data: 75% to train, 25% to test.
2. Produce a linear regression equation (weight terms step-wise, reward for performance, penalize for complexity) on the train sub-set.
3. Apply the equation to the test set to predict outcome (relapse yes/no).
4. Record the specificity/sensitivity (Type I/II error rates) as ROC curve.
Repeat m times.
$\mathrm{RELAPSE} = \gamma_1 g_1 + \gamma_2 g_2 + \gamma_3 g_3 + \dots + \gamma_N g_N$
Test X randomly retrieved sets. Take the best ones. Account for the network context.
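
The "test X randomly retrieved sets, take the best ones" step might look like the sketch below. It runs on synthetic data, scores each random probe set by hold-out accuracy of a plain least-squares fit, and does not model the network context; the scoring rule, the parameter defaults, and the function names are all assumptions.

```python
import numpy as np

def holdout_accuracy(X, y, train, test):
    """Fit least squares on the train rows, report accuracy on the test rows."""
    design = np.column_stack([np.ones(len(train)), X[train]])
    coef, *_ = np.linalg.lstsq(design, y[train], rcond=None)
    score = coef[0] + X[test] @ coef[1:]
    return float(np.mean((score >= 0.5).astype(int) == y[test]))

def random_signature_search(X, y, n_probes=20, n_sets=50, keep=5, seed=0):
    """Draw n_sets random probe sets of size n_probes and keep the best scorers."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(y))
    n_test = len(y) // 4                      # 25% held out for testing
    test, train = order[:n_test], order[n_test:]
    scored = []
    for _ in range(n_sets):
        probes = np.sort(rng.choice(X.shape[1], size=n_probes, replace=False))
        accuracy = holdout_accuracy(X[:, probes], y, train, test)
        scored.append((accuracy, probes.tolist()))
    scored.sort(key=lambda item: item[0], reverse=True)
    return scored[:keep]

# Synthetic stand-in: 120 samples, 200 candidate probes (not real data).
data_rng = np.random.default_rng(2)
expression = data_rng.normal(size=(120, 200))
relapse = (expression[:, 0] + data_rng.normal(size=120) > 0).astype(int)

best = random_signature_search(expression, relapse, n_probes=20, n_sets=50, keep=5)
for accuracy, probes in best:
    print(f"accuracy {accuracy:.2f} with probes {probes[:5]}...")
```

Because the best sets are chosen on the same hold-out samples that score them, their accuracy is optimistic; a set picked this way would still need checking on independent data.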

10 Candidate signature in the network Biomarker candidates

11 Ready signature in the network: $\mathrm{RELAPSE} = \gamma_1\,\mathrm{EIF3S9} + \gamma_2\,\mathrm{CRHR1} + \gamma_3\,\mathrm{LYN} + \dots + \gamma_N\,\mathrm{KCNA5}$

12 Testing “top”, “free”, and “network” approaches (figure panel: Top)

13 Signature involves genes mutated in cancer

14 Cancer individuality: each tumor is unique in its molecular state and set of mutated/disordered genes. (Figure: tumour tcga-02-0114-01a-01w)

15 Partial correlations: a way to get rid of spurious links. (Figure: example links with correlations 0.7, 0.6, and 0.4)
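
A first-order partial correlation shows how a link can turn out to be spurious once a third gene is accounted for. The sketch below uses the three values from the slide, but which value belongs to which pair of genes is an assumption (A–B = 0.7, B–C = 0.6, A–C = 0.4), as are the gene labels.

```python
import math

def partial_correlation(r_xy, r_xz, r_yz):
    """Correlation between x and y after removing the linear effect of z."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

# Assumed assignment of the slide's values to a three-gene triangle.
r_ab, r_bc, r_ac = 0.7, 0.6, 0.4

ac_given_b = partial_correlation(r_ac, r_ab, r_bc)  # A-C controlling for B
ab_given_c = partial_correlation(r_ab, r_ac, r_bc)  # A-B controlling for C
bc_given_a = partial_correlation(r_bc, r_ab, r_ac)  # B-C controlling for A

print(f"A-C | B: {ac_given_b:.3f}")
print(f"A-B | C: {ab_given_c:.3f}")
print(f"B-C | A: {bc_given_a:.3f}")
```

Under this assignment the A–C correlation almost vanishes once B is controlled for, because 0.7 × 0.6 = 0.42 already explains the observed 0.4, while the A–B and B–C links remain clearly positive.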

16 Cancer individuality via network view. Functional coupling: transcription ↔ transcription; transcription ↔ methylation; methylation ↔ methylation; mutation → methylation; mutation → transcription; mutation ↔ mutation; + mutated gene.

17 FunCoup is a framework for biomarker discovery: Markers can be discovered and presented in the network dimension. Choice of data types to incorporate is unlimited, from metabolite profiling to patient phenotypes. Useful features: web-based resource ready for further expansion and for presenting new research results in an interactome perspective; cross-species network comparison of human and model organisms; efficient query system to retrieve network environments of interest. http://FunCoup.sbc.su.se

18 Thank you for your attention!

19 Decomposing biological context. (Figure panels: Common, Developmental, Dioxin-enabled; r_PLC = 0.88, r_PLC = 0.95, r_PLC = 0.76.) ANOVA (Analysis Of VAriance): look at F-ratios: signal of interest / residual (“error”) variance.
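
The F-ratio on this slide is the variance attributed to the signal of interest divided by the residual variance. The sketch below computes it for the simplest case, a one-way layout with a single factor; the analysis in the talk likely involved more than one factor, and the expression values here are hypothetical.

```python
import numpy as np

def f_ratio(groups):
    """One-way ANOVA F: between-group mean square over within-group mean square."""
    groups = [np.asarray(g, dtype=float) for g in groups]
    all_values = np.concatenate(groups)
    grand_mean = all_values.mean()
    k, n = len(groups), len(all_values)
    # Signal of interest: spread of the group means around the grand mean.
    ss_between = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)
    # Residual ("error") variance: spread of the values inside each group.
    ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups)
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n - k)
    return float(ms_between / ms_within)

# Hypothetical expression of one gene in control vs. dioxin-treated samples.
control = [1.0, 2.0, 3.0]
treated = [4.0, 5.0, 6.0]

f_value = f_ratio([control, treated])
print(f"F = {f_value:.2f}")
```

A large F means the treatment effect is big relative to the scatter among replicates; a value near 1 means the factor explains no more variance than noise would.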

20 Accounting for edge features: dioxin-enabled vs. dioxin-sensitive links. Reference: Andrey Alexeyenko, Deena M Wassenberg, Edward K Lobenhofer, Jerry Yen, Erik LL Sonnhammer, Elwood Linney, Joel N Meyer. Transcriptional response to dioxin in the interactome of developing zebrafish. Submitted.


