Working with enriched gene sets in R. Peter Svensson, Micheline Giphart-Gassler, Harry Vrieling.


1 Working with enriched gene sets in R. Peter Svensson, Micheline Giphart-Gassler, Harry Vrieling

2 P-values of genes. Start with a vector of p-values, one per gene, from – t.test(irradiated, control) – wilcox.test(irradiated, control) – lm(formula, data)
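As a minimal sketch of how such a vector of p-values might be built, the block below runs the three tests gene by gene on simulated data. The matrix `expr`, the factor `group`, and the sample sizes are assumptions made for illustration; they are not from the slides.

```r
# Simulated expression matrix (assumed layout): rows are genes, columns are samples.
set.seed(1)
n_genes <- 200
expr <- matrix(rnorm(n_genes * 8), nrow = n_genes)
group <- factor(rep(c("control", "irradiated"), each = 4))

# Shift the first 20 genes in the irradiated samples so that some genes truly change.
expr[1:20, group == "irradiated"] <- expr[1:20, group == "irradiated"] + 2

# One p-value per gene from a two-sample (Welch) t-test.
pvals_t <- apply(expr, 1, function(y) {
  t.test(y[group == "irradiated"], y[group == "control"])$p.value
})

# The same comparison with the rank-based Wilcoxon test.
pvals_w <- apply(expr, 1, function(y) {
  wilcox.test(y[group == "irradiated"], y[group == "control"])$p.value
})

# The same comparison as a linear model; the coefficient name
# "groupirradiated" follows from the factor levels defined above.
pvals_lm <- apply(expr, 1, function(y) {
  fit <- lm(y ~ group)
  summary(fit)$coefficients["groupirradiated", "Pr(>|t|)"]
})
```

With two groups, the `lm()` p-value matches a t-test with `var.equal = TRUE`, whereas `t.test()` defaults to the Welch version, so `pvals_t` and `pvals_lm` will differ slightly.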

3 Distribution of p-values two-tailed

4 Distribution of p-values one-tailed

5 Distribution of p-values. Proportion of unchanged genes, π0: library(qvalue) (Storey & Tibshirani 2001); qvalue(pvals)$pi0
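To show what π0 measures without requiring the qvalue package, here is a base-R sketch of a fixed-threshold estimate: p-values of unchanged genes are uniform, so the fraction above a cutoff `lambda`, rescaled by `1 - lambda`, approximates π0. The function `estimate_pi0`, the cutoff 0.5, and the simulated data are assumptions for illustration; the qvalue package estimates π0 over a range of cutoffs and is not reproduced here.

```r
# Fixed-lambda estimate of pi0 (a simplification, not the qvalue package's code).
estimate_pi0 <- function(pvals, lambda = 0.5) {
  # Unchanged genes have uniform p-values, so about (1 - lambda) of them exceed lambda.
  min(1, mean(pvals > lambda) / (1 - lambda))
}

set.seed(2)
# Simulated data: 800 unchanged genes (uniform p-values)
# and 200 changed genes (p-values concentrated near 0).
pvals <- c(runif(800), rbeta(200, 0.2, 5))
pi0_hat <- estimate_pi0(pvals)
```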

6 Annotation. Annotation of the genes is available from Bioconductor – MetaData packages for commercial arrays – AnnBuilder for home-made arrays – UniGene name, code, symbol, Entrez Gene, GO terms, KEGG pathways, PubMed IDs...

7 Gene Set Enrichment Analysis (Mootha et al., Nat Genet. 2003, 34:267). Use gene sets defined by GO terms, KEGG terms, names containing 'kinase', or genes that cluster together. With N genes in total and G genes in the set, make a vector with one value per gene: – all not in group: -sqrt(G/(N-G)) – all in group: sqrt((N-G)/G)

8 Running sum. The values in the vector sum to 0. Plot the running sum over the genes ranked by p-value: here the peak is at p = 0.1
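The running sum can be sketched in a few lines of base R. The function `running_sum`, the simulated p-values, and the gene set made of the first 10 genes are assumptions for illustration. The step sizes are those of Mootha et al.: sqrt((N-G)/G) for a gene in the set and -sqrt(G/(N-G)) otherwise.

```r
# in_set: logical vector over the N genes, already ranked by p-value (smallest first).
running_sum <- function(in_set) {
  N <- length(in_set)
  G <- sum(in_set)
  step <- ifelse(in_set, sqrt((N - G) / G), -sqrt(G / (N - G)))
  cumsum(step)
}

set.seed(3)
pvals <- runif(100)
in_set <- seq_along(pvals) %in% 1:10     # hypothetical gene set: the first 10 genes
pvals[in_set] <- pvals[in_set] / 20      # make set members tend to rank early

ord <- order(pvals)
rs <- running_sum(in_set[ord])
es <- max(rs)                            # enrichment score: maximum of the running sum
p_at_peak <- pvals[ord][which.max(rs)]   # p-value at which the peak occurs

# plot(rs, type = "l") draws the curve.
```

Because the G positive steps and the N-G negative steps cancel exactly, the last element of `rs` is 0 up to rounding error.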

9 GSEA. The enrichment score (the maximum of the running sum) can be used to determine the importance of a gene set. A permutation technique gives its significance.
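One possible permutation scheme is sketched below. Note the assumption: it shuffles gene-set membership across the ranked genes, which is a simpler stand-in for the sample-label permutation used by Mootha et al. The function names and the number of permutations are hypothetical choices.

```r
# Enrichment score: maximum of the running sum over the ranked genes.
enrichment_score <- function(in_set) {
  N <- length(in_set)
  G <- sum(in_set)
  step <- ifelse(in_set, sqrt((N - G) / G), -sqrt(G / (N - G)))
  max(cumsum(step))
}

# Permutation p-value: how often does a randomly placed gene set of the
# same size score at least as high as the observed one?
permutation_pvalue <- function(in_set_ranked, n_perm = 1000) {
  observed <- enrichment_score(in_set_ranked)
  permuted <- replicate(n_perm, enrichment_score(sample(in_set_ranked)))
  # Add 1 to numerator and denominator so the p-value is never exactly 0.
  (sum(permuted >= observed) + 1) / (n_perm + 1)
}

set.seed(4)
# Hypothetical extreme case: all 10 set members rank at the very top of 100 genes.
in_set_ranked <- c(rep(TRUE, 10), rep(FALSE, 90))
p_perm <- permutation_pvalue(in_set_ranked, n_perm = 500)
```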

10 Hypergeometric probability. Used in dChip and DAVID. Input is – number of genes in the gene set (n), number of genes on the array (n+m) – number of selected genes in the gene set (x), number of selected genes (N). dhyper() gives the density
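A small worked example with made-up counts may help here. The numbers are assumptions for illustration. Note that base R's `dhyper(x, m, n, k)` names its arguments differently from the slide: the slide's n (genes in the set) goes in R's second position and the slide's m (genes not in the set) in the third.

```r
n <- 40    # genes in the gene set
m <- 960   # genes on the array that are not in the gene set
N <- 100   # selected genes
x <- 12    # selected genes that are in the gene set

# Density: probability of exactly x set members among the N selected genes.
p_exact <- dhyper(x, n, m, N)

# Enrichment p-value: probability of x or more set members by chance.
# phyper() with lower.tail = FALSE gives P(X > x - 1) = P(X >= x).
p_enrich <- phyper(x - 1, n, m, N, lower.tail = FALSE)
```

The upper-tail probability from `phyper()` is usually what is reported as the enrichment p-value, since the density alone only covers the single outcome x.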

11 Selecting genes. Have to set a threshold, p0, for the p-values: genes with p < p0 are selected. p0 = 0.001 is not informative; p0 = 0.1 is at the maximum of the peak. dissect(pvals) – (BMC Bioinformatics, to appear)

12 Will get a p-value. 4000 GO terms were tested, so correction for multiple testing is needed: p.adjust(pvals, "fdr"). Look at significant terms, p < 0.001
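The correction step can be sketched as follows. The 4000 simulated p-values are an assumption for illustration, as is applying the 0.001 cutoff to the adjusted values rather than the raw ones.

```r
set.seed(5)
# Hypothetical per-term p-values: 3950 null terms and 50 strongly enriched terms.
pvals <- c(runif(3950), runif(50, 0, 1e-6))

# "fdr" is an alias for the Benjamini-Hochberg method in p.adjust().
padj <- p.adjust(pvals, "fdr")

# Indices of terms that remain significant after correction.
significant <- which(padj < 0.001)
```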

13 Cisplatin data. Mouse embryonic stem cells exposed to various doses (low, medium and high). Harvested at 0 < t < 24. Low doses, early time points: – Few genes changed – Few pathways changed. Indications of what will come

17 Preprocessing. For internal use at www.medgencentre.nl/pla. Not updated. Code for working with widgets, defining a MIAME-compliant object, AffyBatch (exprSet), doing tests, building linear models, correlation tests, GSEA. Being updated together with Agata Meglicz. It will be improved soon.

18 Demonstration. cdf = "hgu133a"; source("gsea.R"); gsea(); dissectGUI()

