Working with enriched gene sets in R
Peter Svensson, Micheline Giphart-Gassler, Harry Vrieling

1 Working with enriched gene sets in R
Peter Svensson, Micheline Giphart-Gassler, Harry Vrieling

2 P-values of genes
Starting with a vector of p-values from
– t.test(irradiated, control)
– wilcox.test(irradiated, control)
– lm(formula, data)
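A minimal sketch of how such a vector of p-values could be produced with one test per gene. The simulated matrices (genes in rows, four replicate arrays per condition), the number of genes, and the shifted block of 20 genes are all illustrative assumptions, not the authors' data.

```r
# Simulated data: genes in rows, replicate arrays in columns (assumption).
set.seed(1)
n_genes    <- 200
control    <- matrix(rnorm(n_genes * 4), nrow = n_genes)
irradiated <- matrix(rnorm(n_genes * 4), nrow = n_genes)
irradiated[1:20, ] <- irradiated[1:20, ] + 2   # 20 genes truly changed

# One two-sided Welch t-test per gene; keep only the p-value.
pvals_t <- sapply(seq_len(n_genes), function(i)
  t.test(irradiated[i, ], control[i, ])$p.value)

# Rank-based alternative; with 4 vs 4 replicates its p-values are coarse.
pvals_w <- sapply(seq_len(n_genes), function(i)
  wilcox.test(irradiated[i, ], control[i, ])$p.value)
```

Either vector can then be passed to the steps on the following slides.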

3 Distribution of p-values two-tailed

4 Distribution of p-values one-tailed

5 Distribution of p-values
Proportion of unchanged genes, π0
library(qvalue) (Storey & Tibshirani 2001)
qvalue(pvals)$pi0
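To show the idea behind the proportion of unchanged genes without depending on the qvalue package, here is a base-R sketch of a fixed-threshold estimate. The function name `estimate_pi0`, the cut-off `lambda = 0.5`, and the simulated p-values are assumptions for illustration; this is a simplification and not the package's own procedure, which by default considers a range of cut-offs.

```r
# Fixed-lambda estimate of pi0: p-values above lambda are assumed to come
# almost entirely from unchanged genes, whose p-values are uniform on [0, 1].
estimate_pi0 <- function(pvals, lambda = 0.5) {
  min(1, mean(pvals > lambda) / (1 - lambda))
}

# Hypothetical mixture: 800 unchanged genes and 200 changed genes.
set.seed(2)
pvals <- c(runif(800), rbeta(200, 0.2, 5))
pi0 <- estimate_pi0(pvals)
```

A value of `pi0` well below 1 suggests that a sizeable share of genes changed.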

6 Annotation
Annotation of the genes is available from Bioconductor
– MetaData for commercial arrays
– AnnBuilder for home-made arrays
– Unigene name, code, symbol, Entrez Gene, GO terms, KEGG pathways, PubMed ids...

7 Gene Set Enrichment Analysis
Mootha et al., Nat Genet. 2003, 34:267
Use gene sets defined by GO terms, KEGG terms, names containing 'kinase', or genes that cluster together
Make a vector of
– all not in group: -sqrt(G/(N-G))
– all in group: sqrt((N-G)/G)

8 Running sum
The sum of the values in the vector is 0
Plot the running sum: the peak is at p=0.1
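The step vector and its running sum can be sketched as follows, with N the number of genes and G the number of genes in the set. The function name `running_sum`, the ordering of genes by rank, and the simulated membership vector are assumptions for illustration.

```r
# in_set: logical vector, one entry per gene, ordered by gene rank
# (for example by increasing p-value); TRUE marks gene-set members.
running_sum <- function(in_set) {
  N <- length(in_set)
  G <- sum(in_set)
  steps <- ifelse(in_set, sqrt((N - G) / G), -sqrt(G / (N - G)))
  cumsum(steps)
}

# Hypothetical ranking in which set members are concentrated near the top.
set.seed(3)
in_set <- c(runif(100) < 0.4, runif(900) < 0.05)
rs <- running_sum(in_set)
es <- max(rs)   # height of the peak, used as the enrichment score
```

Because the G positive steps and the N-G negative steps cancel, the last element of `rs` is 0 up to rounding. Calling `plot(rs, type = "l")` draws the curve.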

9 GSEA
The enrichment score can be used to determine the importance of a gene set.
A permutation technique is used to get its significance.
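A sketch of one way to attach a permutation p-value to an enrichment score. The slide does not say what is permuted; this version shuffles gene-set membership over the ranked list, which is an assumption. Permuting the sample labels and recomputing the ranking is the other common choice. The function names and the example vector are hypothetical.

```r
# Enrichment score: peak of the running sum over the ranked gene list.
enrichment_score <- function(in_set) {
  N <- length(in_set)
  G <- sum(in_set)
  max(cumsum(ifelse(in_set, sqrt((N - G) / G), -sqrt(G / (N - G)))))
}

permutation_pvalue <- function(in_set, n_perm = 1000) {
  observed <- enrichment_score(in_set)
  # Null distribution: the same number of members at random ranks.
  null_es <- replicate(n_perm, enrichment_score(sample(in_set)))
  # Add one to numerator and denominator so the p-value is never exactly 0.
  (sum(null_es >= observed) + 1) / (n_perm + 1)
}

# Extreme hypothetical case: all 30 set members occupy the top ranks.
set.seed(4)
enriched   <- c(rep(TRUE, 30), rep(FALSE, 970))
p_enriched <- permutation_pvalue(enriched)
```

The smallest attainable p-value is 1/(n_perm + 1), so `n_perm` limits the resolution.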

10 Hypergeometric probability
Used in dChip and DAVID.
Input is
– # genes in the gene set (n), # genes on array (n+m)
– # selected genes in the gene set (x), # selected genes (N)
dhyper() gives the density
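A sketch of turning these four counts into an enrichment p-value, that is, the probability of seeing x or more selected genes in the set by chance. It uses `phyper()`, the cumulative counterpart of `dhyper()`. The function name `enrichment_pvalue` and the example counts are hypothetical.

```r
# Slide notation: n = genes in the gene set, n + m = genes on the array,
# N = selected genes, x = selected genes that are in the gene set.
# Note that R names these arguments differently: phyper(q, m, n, k).
enrichment_pvalue <- function(x, n, m, N) {
  # P(X >= x): lower.tail = FALSE gives P(X > q), hence q = x - 1.
  phyper(x - 1, n, m, N, lower.tail = FALSE)
}

# Hypothetical counts: 12000 genes on the array, 150 in the set,
# 400 selected, 15 of the selected genes in the set.
p <- enrichment_pvalue(x = 15, n = 150, m = 12000 - 150, N = 400)
```

The same value is obtained by summing `dhyper()` over all counts from x up to the size of the gene set.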

11 Selecting genes
Have to set a threshold, p0, for the p-values: genes with p < p0 are selected
p0 = 0.001 is not informative
p0 = 0.1 is at the maximum of the peak
dissect(pvals)
– (BMC Bioinformatics, to appear)

12 Will get a p-value
Tested 4000 GO terms, so correction for multiple testing is needed
p.adjust(pvals, "fdr")
Look at significant terms, p<0.001
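A short sketch of the correction step. The 4000 simulated term p-values and the term names are hypothetical; `p.adjust()` is the base-R function named on the slide.

```r
# Hypothetical p-values for 4000 GO terms: mostly null, 50 truly enriched.
set.seed(5)
term_pvals <- c(runif(3950), rbeta(50, 0.5, 200))
names(term_pvals) <- sprintf("term%04d", seq_along(term_pvals))

# "fdr" is an alias for the Benjamini-Hochberg method ("BH").
adjusted <- p.adjust(term_pvals, method = "fdr")

# Terms that remain below the slide's cut-off after adjustment.
significant <- sort(adjusted[adjusted < 0.001])
```

Adjusted values are never smaller than the raw p-values, so fewer terms pass the same cut-off.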

13 Cisplatin data
Mouse embryonic stem cells exposed to various doses (low, medium and high).
Harvested at 0<t<24
Low doses, early time points
– Few genes changed
– Few pathways changed
Indications of what will come




17 Preprocessing
For internal use at
Not updated
Code for working with widgets, defining a MIAME-compliant object, AffyBatch (exprSet), doing tests, building linear models, correlation tests, GSEA
Being updated together with Agata Meglicz. It will be improved soon.

18 Demonstration
cdf = "hgu133a"
source("gsea.R")
gsea()
dissectGUI()
