“An Extension of Weighted Gene Co-Expression Network Analysis to Include Signed Interactions” Michael Mason Department of Statistics, UCLA.


1 “An Extension of Weighted Gene Co-Expression Network Analysis to Include Signed Interactions” Michael Mason Department of Statistics, UCLA

2 Contents Here we consider the application of a generalized WGCNA that keeps track of the sign of the co-expression information. Standard unsigned networks are based on s_ij = |cor(x_i, x_j)|. Here we focus on signed networks based on s_ij = (1 + cor(x_i, x_j)) / 2.

3 General Framework of Network Construction Step 1: Define a Gene Co-expression Similarity Step 2: Define a Family of Adjacency Functions Step 3: Determine the AF Parameters Step 4: Define a Measure of Node Dissimilarity Step 5: Identify Network Modules (Clustering) Step 6: Find Biologically Interesting Modules Step 7: Find Key Genes in Interesting Modules

4 Adjacency Functions: Hard and Soft Thresholding A network can be represented by an adjacency matrix, A = [a_ij], that encodes how a pair of nodes is connected. –A is a symmetric matrix with entries in [0,1]. –For unweighted networks, hard thresholding is applied to the similarity matrix S to yield A: if s_ij > τ, then a_ij = 1, else a_ij = 0. –For weighted networks, soft thresholding is applied with 0 < a_ij < 1 and a_ij = s_ij^β. –Both types of adjacency functions can be applied to unsigned and signed co-expression similarity measures. In this analysis we employ soft thresholding.
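The two adjacency functions on this slide can be sketched in a few lines. This is a minimal illustration only: the function names and the 3-gene similarity matrix are hypothetical, and the values of τ and β are chosen arbitrarily.

```python
def hard_threshold(similarity, tau):
    """Unweighted adjacency: a_ij = 1 if s_ij > tau, else 0."""
    return [[1 if s > tau else 0 for s in row] for row in similarity]


def soft_threshold(similarity, beta):
    """Weighted adjacency: a_ij = s_ij ** beta."""
    return [[s ** beta for s in row] for row in similarity]


# Hypothetical symmetric similarity matrix with entries in [0, 1].
S = [
    [1.0, 0.9, 0.5],
    [0.9, 1.0, 0.2],
    [0.5, 0.2, 1.0],
]

A_hard = hard_threshold(S, tau=0.7)   # keeps only the 0.9 link (and the diagonal)
A_soft = soft_threshold(S, beta=2)    # shrinks weak similarities toward 0
```

Soft thresholding keeps every pair connected but down-weights weak similarities, which is why no information is discarded at a cutoff.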

5 Defining a co-expression similarity measure that keeps track of the sign Unsigned networks are based on the absolute value of the correlation, |cor(x_i, x_j)|. Signed networks preserve the sign information from the correlation cor(x_i, x_j).
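A small sketch of the two similarity measures follows. The unsigned measure is the absolute correlation, as stated on the slide. For the signed measure the code assumes the transform (1 + cor) / 2, which maps correlations from [-1, 1] to [0, 1]; the slide's own formula did not survive in the transcript, so treat this choice as an assumption. All function names and the expression profiles are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def unsigned_similarity(x, y):
    """Absolute correlation: the sign is discarded."""
    return abs(pearson(x, y))


def signed_similarity(x, y):
    """Assumed signed transform (1 + cor) / 2: the sign is preserved."""
    return (1 + pearson(x, y)) / 2


# Two perfectly anti-correlated hypothetical profiles.
gene_a = [1.0, 2.0, 3.0, 4.0]
gene_b = [4.0, 3.0, 2.0, 1.0]

s_unsigned = unsigned_similarity(gene_a, gene_b)  # strongly connected
s_signed = signed_similarity(gene_a, gene_b)      # not connected
```

Under these assumptions, a pair of perfectly anti-correlated genes is maximally similar in the unsigned network and maximally dissimilar in the signed one.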

6 Generalized Connectivity A gene’s connectivity (also known as degree) equals the row sum of the adjacency matrix. Intuitively for unweighted networks this is the number of direct neighbors a gene has. For our signed networks, the connectivity of the i-th gene measures the extent of positive correlations with the other genes in the network.
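The row-sum definition of connectivity can be sketched as below. Excluding the diagonal entry (a gene's adjacency with itself) is an assumption made here for illustration; the function name and the adjacency matrix are hypothetical.

```python
def connectivity(adjacency):
    """Connectivity k_i: row sum of the adjacency matrix, skipping a_ii."""
    return [
        sum(a for j, a in enumerate(row) if j != i)
        for i, row in enumerate(adjacency)
    ]


# Hypothetical weighted adjacency matrix for three genes.
A = [
    [1.0, 0.81, 0.25],
    [0.81, 1.0, 0.04],
    [0.25, 0.04, 1.0],
]

k = connectivity(A)  # one connectivity value per gene
```

For a 0/1 adjacency matrix the same function simply counts each gene's direct neighbors.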

7 For high powers of beta, signed weighted networks exhibit approximate scale free topology Scale free topology refers to the frequency distribution of the connectivity k: p(k) ~ k^(-λ), where p(k) = proportion of nodes that have connectivity k.

8 How to check Scale Free Topology? Idea: log-transform p(k) and k and look at the scatter plot. The R^2 of a linear model fit of log p(k) on log k can be used to quantify scale free topology. In our cancer and mouse embryonic stem cell applications, we find R^2 = 0.97 and 0.94 for β = 12 and 22, respectively.
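The check described on this slide can be sketched as follows: bin the connectivities, take the proportion of nodes per bin as p(k), then regress log p(k) on log k and report R^2. The equal-width binning, the number of bins, and the function names are assumptions made for illustration, not details taken from the talk.

```python
import math


def degree_distribution(connectivities, n_bins=10):
    """Bin connectivities into equal-width bins (requires max > min).

    Returns the mean connectivity and the proportion of nodes for each
    non-empty bin.
    """
    lo, hi = min(connectivities), max(connectivities)
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    totals = [0.0] * n_bins
    for k in connectivities:
        b = min(int((k - lo) / width), n_bins - 1)
        counts[b] += 1
        totals[b] += k
    n = len(connectivities)
    ks = [t / c for t, c in zip(totals, counts) if c > 0]
    ps = [c / n for c in counts if c > 0]
    return ks, ps


def log_log_fit(ks, ps):
    """Least-squares fit of log10 p(k) on log10 k; returns (slope, R^2).

    All ks and ps must be positive.
    """
    xs = [math.log10(k) for k in ks]
    ys = [math.log10(p) for p in ps]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx, sxy ** 2 / (sxx * syy)


# Exact power law, proportional to k^-2 (normalization only shifts the intercept).
ks_exact = [1, 2, 4, 8]
ps_exact = [k ** -2 for k in ks_exact]
slope, r2 = log_log_fit(ks_exact, ps_exact)
```

On an exact power law the fitted slope estimates -λ and R^2 approaches 1; on real connectivities, `degree_distribution` would supply the `ks` and `ps` passed to `log_log_fit`.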

9 The scale free topology criterion for choosing the parameter values of an adjacency function. A) Consider only those parameter values that result in approximate scale free topology. B) Select the parameters that result in the highest mean number of connections. Criterion A is motivated by the finding that many biological networks (including gene co-expression networks, protein-protein interaction networks and cellular networks) exhibit a scale free topology. Criterion B leads to high power for detecting modules (clusters of genes) and hub genes.
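The two criteria can be read as a filter followed by a maximization. The sketch below assumes a cutoff on R^2 to decide what counts as "approximate" scale free topology; the cutoff value, the function name, and the candidate table are all hypothetical and are not results from the talk.

```python
def choose_power(candidates, r2_cutoff=0.8):
    """Pick a soft-thresholding power from (beta, r2, mean_connectivity) rows.

    Criterion A: keep only rows whose scale-free fit R^2 reaches the cutoff.
    Criterion B: among those, return the beta with the highest mean connectivity.
    Returns None if no candidate satisfies criterion A.
    """
    admissible = [row for row in candidates if row[1] >= r2_cutoff]
    if not admissible:
        return None
    return max(admissible, key=lambda row: row[2])[0]


# Made-up illustration values: (beta, R^2, mean connectivity).
table = [
    (2, 0.35, 120.0),
    (4, 0.70, 45.0),
    (6, 0.86, 18.0),
    (8, 0.93, 9.0),
]

best_beta = choose_power(table)
```

Because mean connectivity typically falls as β grows, this rule tends to select the smallest power that still passes the scale-free cutoff.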

10 Trade-off between criterion A and criterion B when varying the power β in signed cancer network

11 Trade-off between criterion A and criterion B when varying the power β in signed mouse embryonic stem cell network

12 How to measure distance in a network? Biological Answer: look at shared neighbors with the topological overlap matrix. –Intuition: if 2 people share the same friends they are close in a social network –In an unsigned network negatively correlated genes are treated as friends while in the signed network they are treated as enemies. –Two genes have high topological overlap if they share (positively correlated) friends

13 Topological Overlap leads to a network distance measure (Ravasz et al. 2002). It was generalized to the case of weighted networks in Zhang and Horvath (2005).
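A minimal sketch of the weighted topological overlap, using the form TOM_ij = (sum_u a_iu a_uj + a_ij) / (min(k_i, k_j) + 1 - a_ij), with dissimilarity 1 - TOM_ij. The function names are hypothetical, and setting the diagonal to 1 is a convention assumed here.

```python
def topological_overlap(adjacency):
    """Weighted topological overlap matrix for a symmetric adjacency in [0, 1]."""
    n = len(adjacency)
    # Connectivity k_i: row sum excluding the diagonal.
    k = [sum(adjacency[i][u] for u in range(n) if u != i) for i in range(n)]
    tom = [[1.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            # Strength of connections to neighbors shared by i and j.
            shared = sum(
                adjacency[i][u] * adjacency[u][j]
                for u in range(n)
                if u != i and u != j
            )
            a_ij = adjacency[i][j]
            tom[i][j] = (shared + a_ij) / (min(k[i], k[j]) + 1 - a_ij)
    return tom


def tom_dissimilarity(adjacency):
    """Network distance: 1 - TOM."""
    return [[1 - t for t in row] for row in topological_overlap(adjacency)]


# Three nodes that are all connected with the same adjacency a.
a = 0.5
A = [
    [1.0, a, a],
    [a, 1.0, a],
    [a, a, 1.0],
]
tom = topological_overlap(A)
```

In this three-node case every off-diagonal overlap equals a, since (a*a + a) / (a + 1) = a.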

14 Simple TOM example In this simple example TOM_{1,2} reduces to a. If cor(x_1, x_u) and cor(x_u, x_2) both equal -1, then in an unsigned network TOM_{1,2} = 1, while in a signed network TOM_{1,2} = 0.

15 Application: comparing Signed to Unsigned Networks using brain cancer data described in Horvath S, Zhang B, Carlson M, Lu KV, Zhu S, Felciano RM, Laurance MF, Zhao W, Shu, Q, Lee Y, Scheck AC, Liau LM, Wu H, Geschwind DH, Febbo PG, Kornblum HI, Cloughesy TF, Nelson SF, Mischel PS (2006) "Analysis of Oncogenic Signaling Networks in Glioblastoma Identifies ASPM as a Novel Molecular Target", PNAS | November 14, 2006 | vol. 103 | no. 46 | 17402-17407

16 Preservation of Modules between Unsigned and Signed Methods in Brain Cancer [Figure panels: Unsigned Network, Signed Network] Message: no difference between signed and unsigned analysis

17 Analysis of Networks in Mouse ESC data described in Ivanova et al

18 Preservation of Large Modules between Unsigned and Signed Methods in Mouse embryonic stem cells. Signed network exhibits 4 additional modules. [Figure panels: Unsigned Network, Signed Network]

19 Gene significance Definition Differential gene expression test between control versus knockout –Control: mouse microarray samples treated with empty virus –Knockout: microarray samples treated with an Oct4 RNAi (Oct4 is of major biological importance in ES pluripotency) Individual gene significance = t-test statistic –Note that the t-test keeps track of the sign Goal: To relate gene significance to intramodular connectivity
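A signed gene significance of this kind can be sketched with a two-sample t-statistic. The slide does not say which t-test variant was used, so the Welch (unequal variance) form below is an assumption, as are the direction of the difference (first group minus second), the function name, and the expression values.

```python
import math
import statistics


def welch_t(group_a, group_b):
    """Welch two-sample t-statistic for (mean of group_a - mean of group_b).

    Each group needs at least two samples and nonzero overall variance.
    """
    mean_a = statistics.mean(group_a)
    mean_b = statistics.mean(group_b)
    var_a = statistics.variance(group_a)  # sample variance (n - 1 denominator)
    var_b = statistics.variance(group_b)
    standard_error = math.sqrt(var_a / len(group_a) + var_b / len(group_b))
    return (mean_a - mean_b) / standard_error


# Hypothetical expression values of one gene in each condition.
control = [8.0, 8.2, 7.8]
knockout = [6.0, 6.2, 5.8]

gene_significance = welch_t(control, knockout)
```

Swapping the two groups flips the sign of the statistic, which is the sense in which the t-test keeps track of the direction of the expression change.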

20 Absolute Mean Significance Increases Once New Modules are Found via Signed WGCNA [Figure panels: Unsigned, Signed] Message: signed networks allowed us to split large modules into smaller, biologically more significant modules

21 Behind the Scenes: Brown Module is Hidden within Turquoise [Figure panels: Unsigned, Signed]

22 Signed WGCNA shows influence of known pluripotency transcription factors Once the TFs are separated into their own module, both their connectivity and their relative gene significance increase.

23 Brown Module Shows Oct4 is a highly connected hub and it is highly significant in this module. This module could not have been detected in an unsigned network. Note that the signed intramodular connectivity is a biologically important screening variable. Biological importance of the module is verified by 2-fold enrichment of Oct4 and Nanog binding.

24 Conclusion Signed weighted gene co-expression network analysis is a robust extension of unsigned WGCNA, preserving large modules while finding new and biologically interesting modules, thus facilitating a systems-level understanding of gene and/or protein interactions.

25 Acknowledgement Biostatistics/Bioinformatics Steve Horvath Qing Zhou Peter Langfelder

