Gene Co-expression Network Analysis BMI 730 Kun Huang Department of Biomedical Informatics Ohio State University.


1 Gene Co-expression Network Analysis BMI 730 Kun Huang Department of Biomedical Informatics Ohio State University

2 Announcement
- No class this Wednesday.
- Change of schedule: the miRNA lecture is moved to a later time.
- More time for the project: only the last class is used for presentations.
- Today:
  - A lecture more relevant to the projects
  - Discuss possible class projects
  - Decide on the groups
- Decide on the project topic by next Monday; a meeting with me later this week is recommended.

4 http://www.rithme.eu/img/storage_cost.gif

5 Gene Expression Microarray

6 Gene Networks/Pathways Regulatory network Metabolic pathways Signaling pathways Protein-protein interaction networks Gene interaction networks Co-expression network

7 Networks/Pathways Resources www.pathguide.org KEGG HPRD MIMI BIND …

8 Networks/Pathways in Research Genes don’t act alone One gene – one disease model is not sufficient Need to understand how genes coordinate and work together as a system

9 Networks/Pathways How to build the network? Manual curation – e.g., IPA Automatic inference from literature – e.g., NLP based method Inference from data – e.g., co-expression network Integration from multiple resources – e.g., STRING database (http://string.embl.de/)

11 Networks/Pathways: How to use the network?
- Functional inference
- Identify new candidates for further investigation
- Dynamical simulation
- Other types of inference

12 MicroRNA (miRNA)
[Figure, panels a-c: Myc, E2F (E2F1, E2F2, E2F3), and the mir-17-92 cluster (17-5p, 17-3p, 18a, 19a, 20a, 19b, 92-1).]
Reviewed by: Coller et al. (2008), PLoS Genet 3(8): e146. Figures from Dr. Baltz Agula.

13 Gene Co-Expression HMMR siRNA

16 Gene Co-Expression Network
Expansion:
- Negative correlation
- Multiple breast cancer datasets
- More anchor genes
- …
Is there a way to find all highly correlated genes in multiple datasets? Do these genes form clusters?

17 Gene Co-Expression Network
Step 1: Compute pairwise Pearson correlation coefficient (PCC) values.
Step 2: Weighted or unweighted?
- Unweighted: need to select a cutoff on PCC
- Weighted: need to consider a transformation of the data
- Keep the scale-free topology
Step 3: Identify "dense" networks (subgraphs) in the overall graph.
- Hierarchical clustering
- Graph mining
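
Steps 1 and 2 can be sketched in a few lines of Python. This is a minimal illustration, not the code used for the lecture: the toy expression values, the function names, and the power transform |r|^beta for the weighted case (with beta = 6) are all assumptions made here for the example.

```python
import math
from itertools import combinations

def pearson(x, y):
    """Pearson correlation coefficient (PCC) of two equal-length profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # constant profile: treated as uncorrelated (a choice made here)
    return sxy / math.sqrt(sxx * syy)

def coexpression_edges(expr, cutoff=0.75):
    """Step 1 plus the unweighted Step 2: keep gene pairs with PCC above the cutoff."""
    edges = {}
    for g1, g2 in combinations(sorted(expr), 2):
        r = pearson(expr[g1], expr[g2])
        if r > cutoff:
            edges[(g1, g2)] = r
    return edges

def soft_weight(r, beta=6):
    """Weighted alternative: power transform |r|**beta (beta is a hypothetical choice)."""
    return abs(r) ** beta

# Toy data: gene name -> expression values across five samples (made up).
expr = {
    "geneA": [1.0, 2.0, 3.0, 4.0, 5.0],
    "geneB": [2.1, 3.9, 6.2, 8.0, 9.9],
    "geneC": [5.0, 3.0, 4.0, 1.0, 2.0],
}
edges = coexpression_edges(expr)
```

With a positive cutoff only geneA and geneB are linked here; geneC is negatively correlated with both, so including negative correlation would require thresholding on |PCC| instead.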

18 Graph Mining
Definition of "dense":
- Ratio of connectivity: for a subgraph with K nodes and L edges, r = L / (K(K-1)/2), i.e., the fraction of all possible node pairs that are connected.
- k-core: a subgraph in which every node is connected to at least k other nodes within the subgraph.
Identifying all "dense" networks is usually NP-hard (the decision versions, such as finding a clique of a given size, are NP-complete).
- Heuristic or approximate algorithms are used, e.g., a greedy algorithm.
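
Both definitions of "dense" are easy to compute for a given subgraph; the hard part is searching over all subgraphs. The sketch below, with function names and a toy edge list chosen here for illustration, computes the connectivity ratio and finds a k-core by repeatedly deleting nodes whose degree is below k. It assumes an undirected graph without self-loops.

```python
def density(nodes, edges):
    """Connectivity ratio r = L / (K(K-1)/2) for the subgraph induced by nodes."""
    nodes = set(nodes)
    k = len(nodes)
    if k < 2:
        return 0.0
    n_edges = sum(1 for u, v in edges if u in nodes and v in nodes)
    return n_edges / (k * (k - 1) / 2)

def k_core(edges, k):
    """Nodes of the k-core, found by peeling off nodes with fewer than k neighbours."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    changed = True
    while changed:
        changed = False
        for node in list(adj):
            if len(adj[node]) < k:
                for nb in adj.pop(node):
                    if nb in adj:
                        adj[nb].discard(node)
                changed = True
    return set(adj)

# Toy graph: a triangle a-b-c with a pendant node d attached to c.
toy_edges = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "d")]
triangle_ratio = density({"a", "b", "c"}, toy_edges)
core2 = k_core(toy_edges, 2)
```

In the toy graph the triangle has ratio 1.0 and is also the 2-core, while the full four-node graph has ratio 4/6.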

19 Frequent network mining
CODENSE:
- Originally applied to yeast microarray data, later expanded to cancers
- Used for functional annotation

20 Data selection and correlation
Selected 23 datasets from Gene Expression Omnibus (GEO):
- Search term "human metastatic cancer"
- Contain both control and tumor samples, with more than 8 samples
- Only primary biopsies
Correlation: PCC > 0.75 (very high similarity)
For CODENSE:
- Edge support in at least 4 datasets
- Connectivity ratio r > 40% (L > 0.4 ∙ K(K-1)/2)
- Number of nodes > 20
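
The edge-support and subgraph filters above can be illustrated as follows. This is only a simplified sketch of the filtering criteria, not the CODENSE algorithm itself, which also has to search for the dense subgraphs; the function names and toy edge lists are assumptions made for the example.

```python
from collections import Counter

def frequent_edges(edge_sets, min_support=4):
    """Keep edges seen in at least min_support datasets (edges are undirected)."""
    counts = Counter()
    for edges in edge_sets:
        # Normalise (u, v) / (v, u) and count each edge once per dataset.
        counts.update({tuple(sorted(e)) for e in edges})
    return {e for e, c in counts.items() if c >= min_support}

def passes_filters(nodes, edges, min_ratio=0.40, min_nodes=20):
    """Subgraph filters: more than min_nodes nodes and L > min_ratio * K(K-1)/2."""
    nodes = set(nodes)
    k = len(nodes)
    n_edges = sum(1 for u, v in edges if u in nodes and v in nodes)
    return k > min_nodes and n_edges > min_ratio * k * (k - 1) / 2

# Toy input: one co-expression edge list per dataset (gene names are made up).
datasets = [
    [("g1", "g2"), ("g2", "g3")],
    [("g2", "g1"), ("g2", "g3")],
    [("g1", "g2"), ("g3", "g2")],
    [("g1", "g2")],
    [("g1", "g3")],
]
support = frequent_edges(datasets)
```

Only the g1-g2 edge appears in four of the five toy datasets, so it is the only edge that survives the support filter.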

21 Results from CODENSE
- 44 networks are identified
- Number of nodes: 21 ~ 74 (average 44)
- Connectivity: 0.41 ~ 0.78

22 Finding New Functions Relation to BRCA1

23 Comparing ER- and ER+ breast cancer patients
- Estrogen receptor (ER) status is one of the key biomarkers for breast cancer prognosis (ER- indicates poor prognosis).
- Select a dataset (GSE2034, Wang et al.) from GEO containing 286 samples (77 ER-, 209 ER+).
- Compare the ER- group with the ER+ group and select the network that is most perturbed.
- The network containing HMMR is the most perturbed: more than half of its genes are differentially regulated.
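
One way to score how "perturbed" a network is between the two groups is the fraction of its genes that are differentially expressed. The slide does not say which test was used, so the sketch below assumes a Welch t-statistic with a hypothetical cutoff of |t| > 2; the gene names and values are toy data.

```python
import math
from statistics import mean, variance

def welch_t(a, b):
    """Welch t-statistic for two samples with possibly unequal variances."""
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / se if se > 0 else 0.0

def perturbed_fraction(network_genes, expr_neg, expr_pos, t_cutoff=2.0):
    """Fraction of network genes with |t| above the (assumed) cutoff."""
    hits = [g for g in network_genes
            if abs(welch_t(expr_neg[g], expr_pos[g])) > t_cutoff]
    return len(hits) / len(network_genes)

# Toy expression values per gene in ER- and ER+ samples (made up).
expr_neg = {"gene1": [5.0, 6.0, 7.0], "gene2": [1.0, 2.0, 3.0]}
expr_pos = {"gene1": [1.0, 2.0, 3.0], "gene2": [1.0, 2.0, 3.0]}
fraction = perturbed_fraction(["gene1", "gene2"], expr_neg, expr_pos)
```

Ranking every candidate network by this fraction and picking the largest mirrors the selection described above; a real analysis would use p-values with multiple-testing correction rather than a fixed t cutoff.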

24 Select gene signature from a network to predict survival
Use the genes in this network as features to cluster patients in the Rosetta data (295 breast cancer patients) into two groups, then compare survival between the two groups. Log-rank test: p < 1e-8.
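
For reference, the two-group log-rank test can be written out directly. The sketch below is a plain-Python illustration on made-up follow-up times, not the analysis behind the reported p-value; the function name and the data are assumptions. Each patient has a follow-up time and an event flag (1 = death observed, 0 = censored).

```python
import math

def logrank(times1, events1, times2, events2):
    """Two-group log-rank test; returns (chi-square statistic, p-value, 1 d.f.)."""
    data = [(t, e, 0) for t, e in zip(times1, events1)]
    data += [(t, e, 1) for t, e in zip(times2, events2)]
    event_times = sorted({t for t, e, _ in data if e})
    obs_minus_exp = 0.0
    var = 0.0
    for t in event_times:
        n = sum(1 for u, _, _ in data if u >= t)                 # at risk, both groups
        n1 = sum(1 for u, _, g in data if u >= t and g == 0)     # at risk, group 1
        d = sum(1 for u, e, _ in data if u == t and e)           # events at t
        d1 = sum(1 for u, e, g in data if u == t and e and g == 0)
        obs_minus_exp += d1 - d * n1 / n
        if n > 1:
            var += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    if var == 0:
        return 0.0, 1.0
    chi2 = obs_minus_exp ** 2 / var
    # Survival function of a chi-square variable with 1 degree of freedom.
    return chi2, math.erfc(math.sqrt(chi2 / 2))

# Toy data: group 1 dies early, group 2 dies late; all events observed.
early = [1, 2, 3, 4, 5]
late = [10, 11, 12, 13, 14]
chi2, p_value = logrank(early, [1] * 5, late, [1] * 5)
```

A small p-value indicates that the survival curves of the two patient clusters differ more than expected by chance.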

25 Possible Project Topics:
1. Compare the gene expression profiles between a tumor and its microenvironment: differential expression, gene co-expression network, and tissue-tissue expression network.
2. Similarly, compare the co-expression networks between different types of tissues.
3. Herpes virus and cancer: predict human gene targets for herpes virus microRNAs.
4. Gene expression "stalling" prediction using a "stalling index" from ChIP-seq data for RNA polymerase II.
5. TF binding motif prediction using a graph-theoretical method.
6. MicroRNA co-expression network to predict microRNA transcription regulation.
7. Your own research problem …

