EnrichNet: network-based gene set enrichment analysis Presenter: Lu Liu
The problem: Functional Interpretation Identify and assess functional associations between an experimentally derived gene/protein set and well-known gene/protein sets
Agenda: Related research; The method; The evaluation; The results; The conclusion
Related Research: Over-representation analysis (ORA); Gene set enrichment analysis (GSEA); Modular enrichment analysis (MEA)
Limitations: ORA tends to have low discriminative power; functional information from the interaction network is disregarded; genes/proteins with missing annotations are ignored; tissue-specific gene/protein set association is often infeasible
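For reference, ORA is commonly implemented as a one-sided hypergeometric test on the overlap between the target set and a pathway. The sketch below assumes that formulation; the function name and parameters are hypothetical and are not taken from the slides or from any specific ORA tool.

```python
from math import comb


def ora_p_value(universe_size, pathway_size, target_size, overlap):
    """One-sided hypergeometric p-value: P(X >= overlap).

    universe_size: number of annotated genes in the background (assumed input)
    pathway_size:  number of background genes in the pathway
    target_size:   number of genes in the experimental target set
    overlap:       number of target genes that fall in the pathway
    """
    max_overlap = min(pathway_size, target_size)
    total = comb(universe_size, target_size)
    # Sum the probability of every overlap at least as large as the observed one.
    tail = sum(
        comb(pathway_size, k) * comb(universe_size - pathway_size, target_size - k)
        for k in range(overlap, max_overlap + 1)
    )
    return tail / total


# Usage: 5 of 100 target genes fall in a 100-gene pathway, 20000-gene background.
p = ora_p_value(20000, 100, 100, 5)
```

Because the test looks only at the overlap count, a pathway that shares no genes with the target set cannot score well, however close the two sets are in the interaction network. This is the limitation that motivates a network-based score.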
General workflow. Input: gene/protein list (>=10), a database of interest (KEGG etc.). Processing: gene mapping, scoring the distance with RWR, comparing scores with a background model. Output: a pathway/process ranking table, visualization of sub-networks.
Input
Output
The method [Diagram: Input gene set and Pathway 1 ... Pathway N -> RWR -> Pathway 1 ... Pathway N]
Algorithm for distance score
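The distance scoring step is based on a random walk with restart (RWR). As a rough illustration, the sketch below runs the standard RWR iteration p_next = (1 - r) * W * p + r * p0 on a small graph, where W is the column-normalised adjacency matrix and p0 puts equal mass on the seed (target gene) nodes. The function name, the default restart probability and the convergence settings are assumptions of this sketch, not values stated on the slides.

```python
def random_walk_with_restart(adjacency, seeds, restart=0.9, tol=1e-10, max_iter=1000):
    """Return steady-state visiting probabilities for every node.

    adjacency: square list-of-lists, adjacency[i][j] > 0 if nodes i and j interact
    seeds:     indices of the mapped target genes (restart nodes)
    restart:   restart probability r (assumed default, tune as needed)
    """
    n = len(adjacency)
    # Column sums are used to turn the adjacency matrix into transition probabilities.
    col_sums = [sum(adjacency[i][j] for i in range(n)) for j in range(n)]
    p0 = [0.0] * n
    for s in seeds:
        p0[s] = 1.0 / len(seeds)
    p = p0[:]
    for _ in range(max_iter):
        nxt = []
        for i in range(n):
            # Probability flowing into node i from its neighbours.
            # Nodes with no edges (column sum 0) pass nothing on in this sketch.
            flow = sum(
                adjacency[i][j] * p[j] / col_sums[j]
                for j in range(n)
                if col_sums[j] > 0
            )
            nxt.append((1.0 - restart) * flow + restart * p0[i])
        delta = sum(abs(a - b) for a, b in zip(nxt, p))
        p = nxt
        if delta < tol:
            break
    return p


# Usage: a 4-node path graph 0-1-2-3 with node 0 as the only seed.
path_graph = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
scores = random_walk_with_restart(path_graph, seeds=[0])
```

Nodes closer to the seeds receive higher visiting probability, so the scores of a pathway's nodes indicate how near that pathway lies to the input gene set in the network.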
Relate scores to a background model: distance scores are discretized into equal-sized bins; quantify each pathway's deviation from the average
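The binning and comparison step can be illustrated as follows. This is only a sketch of the general idea (bin each pathway's distances, average the histograms over all pathways, then sum the weighted differences); it should not be read as the exact Xd-score formula. It assumes non-negative distances where smaller means closer, and the linearly decreasing bin weights and all names are hypothetical.

```python
def bin_fractions(distances, n_bins, upper):
    """Fraction of distances falling into each of n_bins equal-width bins on [0, upper]."""
    counts = [0] * n_bins
    width = upper / n_bins
    for d in distances:
        idx = min(int(d / width), n_bins - 1)  # the maximum value goes into the last bin
        counts[idx] += 1
    return [c / len(distances) for c in counts]


def deviation_scores(pathway_distances, n_bins=5):
    """Weighted deviation of each pathway's histogram from the average histogram.

    pathway_distances: dict mapping pathway name -> non-empty list of distances
    """
    upper = max(max(d) for d in pathway_distances.values())
    if upper <= 0:
        raise ValueError("distances must contain at least one positive value")
    hists = {
        name: bin_fractions(d, n_bins, upper)
        for name, d in pathway_distances.items()
    }
    # Background model: the average histogram over all pathways.
    background = [
        sum(h[i] for h in hists.values()) / len(hists) for i in range(n_bins)
    ]
    # Hypothetical weights: bins holding small distances count more.
    weights = [n_bins - i for i in range(n_bins)]
    return {
        name: sum(w * (h_i - b_i) for w, h_i, b_i in zip(weights, h, background))
        for name, h in hists.items()
    }


# Usage: one pathway close to the target set, one far from it.
example = deviation_scores({"near": [0.1, 0.2, 0.1], "far": [0.9, 1.0, 0.8]})
```

A pathway whose nodes concentrate in the low-distance bins more than the average pathway receives a positive score; one concentrated in the high-distance bins receives a negative score.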
Evaluation method: compare with ORA on 5 datasets and 2 reference gene sets from the literature. 1. Select the 100 most differentially expressed genes (DEGs); 2. Compute the association scores of EnrichNet and ORA; 3. Compute a running-sum statistic for all gene sets. The consensus of the GSEA-derived (SAM-GS, GAGE) pathway rankings serves as the external benchmark pathway ranking.
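The slide does not spell out how the running-sum statistic is defined. As one plausible reading, the sketch below uses an unweighted, Kolmogorov-Smirnov-style running sum of the kind used in GSEA: walk down a ranked list, step up on items that belong to a reference ("hit") set, step down otherwise, and report the largest deviation from zero. What is ranked and what counts as a hit are assumptions here, as are all names.

```python
def running_sum_max(ranked_items, hits):
    """Maximum-deviation running sum over a ranked list.

    ranked_items: items ordered from best to worst by the method under test
    hits:         items considered relevant by the external benchmark (assumed input)
    """
    hit_set = set(hits)
    n_hits = sum(1 for item in ranked_items if item in hit_set)
    n_miss = len(ranked_items) - n_hits
    if n_hits == 0 or n_miss == 0:
        raise ValueError("need at least one hit and one miss in the ranked list")
    up = 1.0 / n_hits      # step size for a hit
    down = 1.0 / n_miss    # step size for a miss
    running = 0.0
    best = 0.0
    for item in ranked_items:
        running += up if item in hit_set else -down
        if abs(running) > abs(best):
            best = running
    return best


# Usage: benchmark hits ranked at the top give the maximum score of 1.0.
top_heavy = running_sum_max(["a", "b", "c", "d"], hits=["a", "b"])
bottom_heavy = running_sum_max(["a", "b", "c", "d"], hits=["c", "d"])
```

A value near 1 means the method ranks the benchmark's pathways near the top; a value near -1 means it ranks them near the bottom.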
The results: EnrichNet vs ORA
The results: Xd-score vs Q-value
The results: comparative validation
Protein–protein interaction sub-networks (largest connected components) for target and reference set pairs with small overlap, predicted to be functionally associated by EnrichNet: (a) gastric cancer mutated genes (blue) and genes/proteins from the BioCarta pathway ‘Role of Erk5 in Neuronal Survival’ (magenta, the shared genes are shown in green); (b) bladder cancer mutated genes (blue) and genes/proteins from Gene Ontology term ‘Tyrosine phosphorylation of Stat3’ (GO: , magenta; the only shared gene NF2 is shown in green).
Protein–protein interaction sub-network (largest connected component) for the PD gene set (blue) and genes/proteins from GO term ‘Regulation of interleukin-6 biosynthetic process’ (magenta, GO: ; the only shared gene IL1B is shown in green).
The results: tissue specificity. EnrichNet does not require additional gene expression measurement data. Brain tissue: Xd-scores over-represented. Non-brain tissue: center of the Xd-score distribution significantly lower.
The conclusion: EnrichNet sometimes has more discriminative power when the target set and the pathway set have a large overlap. EnrichNet can identify novel functional associations through direct and indirect molecular interactions when the target set and the pathway set have little overlap.