Pan-cancer analysis of prognostic genes


Jordan Anaya, Omnes Res, omnesres.com, @omnesresnetwork

Abstract

In this study I have used publicly available clinical and RNA-seq data from the TCGA to investigate each gene's correlation with survival for 16 different cancers, which included 6,495 patients. For the measure of correlation I used multivariate Cox regression, with gene expression, grade, sex, and age as covariates. To improve performance of the models, gene expression was inverse normal transformed. Cancers showed large differences in the number of significantly correlated genes, which could not be explained by sample size or number of events. However, even cancers with low signal to noise displayed meaningful expression patterns of protective and harmful genes, and gene set enrichments with MSigDB. The most significant protective and harmful genes were not shared across cancers, but these genes were enriched in gene sets that were shared across certain groups of cancers. These groups of cancers were independently recapitulated by unsupervised clustering of Cox coefficients for both individual genes and gene programs. This is the first time comprehensive lists of prognostic genes have been made publicly available, and the first time cancers have been compared using a measure of correlation to survival, which contains more information than expression, including hidden information such as treatment response.

Cancers display a range of p-value distributions

While some cancers, such as LGG, show a large number of genes with p-values below 0.05, other cancers, such as STAD, have a nearly flat distribution of p-values. See Table 1 for more details.

Clustering of cancers using gene programs

Using established gene programs from Hoadley et al., I found the average normalized Cox coefficient for each cancer across each program. The programs and cancers were then clustered with hierarchical clustering.
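The averaging step described above can be sketched as follows. This is a minimal illustration, not the code used in the study: the function name `program_scores`, the dictionary layout, the gene and program names, and all numeric values are assumptions made for the example, and the coefficients are assumed to be normalized already.

```python
from statistics import mean


def program_scores(cox_coefs, programs):
    """Average each cancer's Cox coefficients over the genes in each program.

    cox_coefs: {cancer: {gene: normalized Cox coefficient}}
    programs:  {program name: [genes]}
    Genes without a coefficient for a cancer are skipped; a program with
    no measured genes gets None.
    """
    scores = {}
    for cancer, coefs in cox_coefs.items():
        scores[cancer] = {}
        for name, genes in programs.items():
            present = [coefs[g] for g in genes if g in coefs]
            scores[cancer][name] = mean(present) if present else None
    return scores


# Hypothetical toy values, not taken from the study.
cox_coefs = {
    "LGG": {"GENE_A": 0.5, "GENE_B": 0.25, "GENE_C": -0.5},
    "STAD": {"GENE_A": 0.0, "GENE_B": 0.5, "GENE_C": -0.25},
}
programs = {
    "program_1": ["GENE_A", "GENE_B"],
    "program_2": ["GENE_C", "GENE_D"],  # GENE_D is unmeasured and skipped
}

scores = program_scores(cox_coefs, programs)
for cancer, row in scores.items():
    print(cancer, row)
```

The resulting cancer-by-program table is the kind of matrix that a hierarchical clustering routine (for example `scipy.cluster.hierarchy.linkage`) could then take as input; which linkage and distance were used in the study is not stated here.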
Overall, the same cancers that shared gene sets and clustered together by individual gene Cox coefficients again clustered together.

Conclusions

- RNA-seq can find meaningful survival correlations across 16 different cancers.
- Cancers show a wide range in the number of genes that meet an FDR cutoff, which should be used to inform p-values found for individual genes.
- Cancers do not share prognostic genes, but do share gene sets.
- Cancers can be clustered with Cox coefficients of individual genes and gene programs.
- The same groupings of cancers (for example LIHC/LUAD/KIRP and COAD/GBM/LUSC) were found with three independent methods, indicating that using prognostic information to compare cancers can find unappreciated commonalities among cancers.

Clustering of cancers using gene Cox coefficients

Harmful and protective genes display opposite expression patterns

I clustered patients with the 100 most significant protective genes and the 100 most significant harmful genes. In general, protective genes showed very similar expression patterns across patients, and the same trend was seen for harmful genes. This has important implications for identifying a small gene set to predict patient survival: thousands of combinations of genes would give very similar predictions, which makes the identification of any one set of genes of questionable utility. It is important to note that although the genes for STAD did not pass an FDR cutoff, they still showed meaningful expression patterns.

The prognostic genes in any cancer can cluster patients into two statistically different groups

These Kaplan-Meier plots were made with the clusters from above. Not surprisingly, the LGG clusters were highly statistically different. More surprisingly, the clusters from STAD were also highly statistically different, indicating that even the cancers with low signal to noise contain useful biological information. As a result, all cancers were included in subsequent analyses.
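A comparison of two patient clusters like the one described above can be sketched with a Kaplan-Meier estimator and a two-group log-rank test. This is an illustrative sketch only: the source does not say which test or software produced the plots, so the log-rank test is an assumption, as are the function names and the toy follow-up data.

```python
import math
from collections import Counter


def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct event time.

    times:  follow-up time per patient
    events: 1 if death was observed at that time, 0 if censored
    """
    deaths = Counter(t for t, e in zip(times, events) if e)
    curve, surv = [], 1.0
    for t in sorted(deaths):
        at_risk = sum(1 for u in times if u >= t)
        surv *= 1 - deaths[t] / at_risk
        curve.append((t, surv))
    return curve


def logrank(times_a, events_a, times_b, events_b):
    """Two-group log-rank test; returns (chi-square statistic, p-value).

    Inputs are lists. Assumes at least one event time with more than one
    patient at risk in both groups, otherwise the variance is zero.
    """
    pooled = zip(times_a + times_b, events_a + events_b)
    event_times = sorted({t for t, e in pooled if e})
    obs_minus_exp, variance = 0.0, 0.0
    for t in event_times:
        n_a = sum(1 for u in times_a if u >= t)  # at risk in group A
        n_b = sum(1 for u in times_b if u >= t)  # at risk in group B
        d_a = sum(1 for u, e in zip(times_a, events_a) if e and u == t)
        d_b = sum(1 for u, e in zip(times_b, events_b) if e and u == t)
        n, d = n_a + n_b, d_a + d_b
        obs_minus_exp += d_a - d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)
    chi2 = obs_minus_exp ** 2 / variance
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical follow-up times and event flags for two clusters.
cluster_1_times, cluster_1_events = [1, 2, 3], [1, 1, 1]
cluster_2_times, cluster_2_events = [4, 5, 6], [1, 1, 1]

curve = kaplan_meier([2, 4, 4, 6], [1, 1, 0, 1])
chi2, p_value = logrank(cluster_1_times, cluster_1_events,
                        cluster_2_times, cluster_2_events)
print(curve)
print(chi2, p_value)
```

In practice a survival library would normally be used instead of hand-written estimators; the point of the sketch is only to show what "statistically different" clusters means in terms of observed versus expected deaths at each event time.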
Overlaps of prognostic genes

The overlaps of the 100 most significant prognostic genes in each cancer.

Characteristics of datasets and patients included in this study [table not reproduced in this transcript]

Overlaps of gene sets

Using the 250 most significant harmful genes in each cancer, I found the 100 most enriched gene sets with MSigDB. These are the overlaps of those 100 gene sets. Some cancers, such as LIHC, LUAD, and KIRP, share large numbers of gene sets.

Cancers were clustered using normalized Cox coefficients of prognostic genes. In general, cancers that shared gene sets clustered together.

References

Hoadley, K. A. et al. Multiplatform analysis of 12 cancer types reveals molecular classification within and across tissues of origin. Cell 158, 929-944 (2014).

