Using 2-way ANOVA to dissect the immune response to hookworm infection in mouse lung Eric Olson


1 Using 2-way ANOVA to dissect the immune response to hookworm infection in mouse lung Eric Olson eric@genesifter.net

2 Using 2-way ANOVA to dissect the immune response to hookworm infection in mouse lung. Outline: General microarray data analysis workflow; From raw data to biological significance; Comparison statistics; Two-way ANOVA; GeneSifter Overview; The Gene Expression Omnibus (GEO); Microarray analysis of gene expression following hookworm infection; Data overview; Dissection of the immune response using 2-way ANOVA

3 The Microarray Data Analysis Process. Experimental Design: number of groups, factors, replicates. Data management: data, sample annotation, gene annotation, databases. Differential Expression: comparison statistics, correction for multiple testing, clustering. Biological significance: individual genes, biological themes. Platform Selection: one-color, two-color, platform comparisons. System access: ease of use, accessibility. Making data public and using public data: MIAME, journals, GEO, meta-analysis.

5 Experiment Design. Type of experiment: two groups (normal vs. cancer; control vs. treated); three or more groups, single factor (time series; dose response; multiple treatment); four or more groups, multiple factors (time series with control and treated cells). The type of experiment and the number of groups and factors will determine the statistical methods needed to detect differential expression. Replicates: the more the better, but at least 3; biological better than technical. Rigorous statistical inferences cannot be made with a sample size of one. The more replicates, the stronger the inference. References: Pavlidis P, Li Q, Noble WS. The effect of replication on gene expression microarray experiments. Bioinformatics. 2003 Sep 1;19(13):1620-7. Kerr K. Experimental Design and Other Issues in Microarray Studies. http://ra.microslu.washington.edu/learning/documents/KerrNAS.pdf

6 Differential Expression. The fundamental goal of microarray experiments is to identify genes that are differentially expressed in the conditions being studied. Comparison statistics can be used to help identify differentially expressed genes, and cluster analysis can be used to identify patterns of gene expression and to segregate a subset of genes based on these patterns. Statistical significance: (1) Fold change: does not address the reproducibility of the observed difference and cannot be used to determine statistical significance. (2) Comparison statistics: 2 groups (t-test, Welch's t-test, Wilcoxon Rank Sum); 3 or more groups, single factor (one-way ANOVA, Kruskal-Wallis); 4 or more groups, multiple factors (two-way ANOVA). Comparison tests require replicates and use the variability within the replicates to assign a confidence level as to whether the gene is differentially expressed. Supporting material: Draghici S. (2002) Statistical intelligence: effective analysis of high-density microarray data. Drug Discov Today, 7(11 Suppl): S55-63.

7 t-test for comparison of two groups. Calculate the t statistic: $t = \dfrac{\text{difference between groups}}{\text{difference within groups}} = \dfrac{\bar{x}_1 - \bar{x}_2}{\sqrt{s_1^2/n_1 + s_2^2/n_2}}$, where $\bar{x}_i$ = mean of group $i$, $s_i^2$ = variance of group $i$, and $n_i$ = size of sample $i$. Determine the confidence level for t (the probability that t could occur by chance) using $df = n_1 + n_2 - 2$. The larger the difference between the groups and the lower the variance, the bigger t will be and the lower p will be.
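The t statistic on this slide can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not GeneSifter's implementation; the function name `t_statistic` and the sample values are assumptions made up for the example.

```python
import math
import statistics


def t_statistic(group1, group2):
    """Return (t, df) using the formula from the slide.

    t  = (mean1 - mean2) / sqrt(s1^2/n1 + s2^2/n2)
    df = n1 + n2 - 2
    """
    n1, n2 = len(group1), len(group2)
    mean_diff = statistics.mean(group1) - statistics.mean(group2)
    # statistics.variance is the sample variance (n - 1 denominator).
    std_err = math.sqrt(
        statistics.variance(group1) / n1 + statistics.variance(group2) / n2
    )
    return mean_diff / std_err, n1 + n2 - 2


# Hypothetical signal values for one gene, 4 replicates per group.
control = [1.0, 2.0, 3.0, 4.0]
treated = [3.0, 4.0, 5.0, 6.0]

t, df = t_statistic(treated, control)
print(f"t = {t:.3f}, df = {df}")
```

The p-value would then be looked up from the t distribution with `df` degrees of freedom, which is left out here to keep the sketch dependency-free.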

8 Differential Expression: fold change vs. p-value. 2 groups, 4 replicates each; mean signal, standard deviation, fold change and p-value calculated. Gene 1: fold change = 5.3, p = 0.19. Gene 2: fold change = 5.3, p = 0.03.
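To illustrate why two genes with the same fold change can differ in significance, the sketch below compares a noisy gene and a tightly replicated gene. The replicate values are invented for illustration (they are not the data behind the slide), and the helper names are assumptions.

```python
import math
import statistics


def fold_change(treated, control):
    """Ratio of group means."""
    return statistics.mean(treated) / statistics.mean(control)


def t_statistic(group1, group2):
    """t = difference between group means / standard error of the difference."""
    std_err = math.sqrt(
        statistics.variance(group1) / len(group1)
        + statistics.variance(group2) / len(group2)
    )
    return (statistics.mean(group1) - statistics.mean(group2)) / std_err


# Hypothetical data: both genes have group means of 100 and 530.
noisy_control = [50.0, 150.0, 20.0, 180.0]
noisy_treated = [200.0, 900.0, 100.0, 920.0]
tight_control = [95.0, 105.0, 98.0, 102.0]
tight_treated = [520.0, 540.0, 525.0, 535.0]

fc_noisy = fold_change(noisy_treated, noisy_control)
fc_tight = fold_change(tight_treated, tight_control)
t_noisy = t_statistic(noisy_treated, noisy_control)
t_tight = t_statistic(tight_treated, tight_control)

print(f"noisy gene: fold change = {fc_noisy:.1f}, t = {t_noisy:.2f}")
print(f"tight gene: fold change = {fc_tight:.1f}, t = {t_tight:.2f}")
```

Both genes show the same fold change, but the tightly replicated gene has a much larger t statistic and therefore a much smaller p-value.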

9 Analysis of Variance (ANOVA). Like the t-test, identifies genes with large differences between groups and small differences within groups. For use with 3 or more groups. One-way and two-way: one-way examines the effects of one factor on gene expression; two-way can examine the effects of two factors on gene expression as well as the interaction of the two factors. References: Pavlidis P. Using ANOVA for gene selection from microarray studies of the nervous system. Methods. 2003 Dec;31(4):282-9. Glantz S. Primer of Biostatistics. 5th Edition. McGraw-Hill. Glantz S, Slinker B. Primer of Regression and Analysis of Variance. McGraw-Hill.
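The "large differences between groups, small differences within groups" idea corresponds to the F ratio of a one-way ANOVA. A minimal standard-library sketch follows; the function name and the example values are assumptions for illustration only.

```python
import statistics


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of replicate groups."""
    all_values = [v for g in groups for v in g]
    grand_mean = statistics.mean(all_values)

    # Variation of group means around the grand mean.
    ss_between = sum(
        len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups
    )
    # Variation of replicates around their own group mean.
    ss_within = sum((v - statistics.mean(g)) ** 2 for g in groups for v in g)

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within


# Hypothetical expression values for one gene in 3 groups, 3 replicates each.
groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
f_ratio, df_between, df_within = one_way_anova_f(groups)
print(f"F = {f_ratio:.2f} on ({df_between}, {df_within}) degrees of freedom")
```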

10 Two-way ANOVA Example. Triple treatment in Huntington's Disease model (R6/2 mice, GSE857, Affymetrix U74Av2). Groups: WT -, WT +, R6/2 -, R6/2 +. Design table: Treatment (-, +) by Disease (WT, R6/2); cell entries: 3. Gene expression patterns illustrated: disease effect; treatment effect; interaction; disease and treatment effect (no interaction).
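The three effects named on this slide (two main effects and their interaction) can be sketched as F ratios from a balanced two-way ANOVA. This is a textbook sums-of-squares calculation assuming equal replicates in every cell, not the GeneSifter implementation; the function name, dictionary keys and data are hypothetical.

```python
import statistics


def two_way_anova_f(cells):
    """F ratios for a balanced two-way design.

    cells[i][j] is the list of replicates for level i of factor A
    and level j of factor B (same number of replicates in every cell).
    """
    a, b, r = len(cells), len(cells[0]), len(cells[0][0])
    grand = statistics.mean([v for row in cells for cell in row for v in cell])
    cell_mean = [[statistics.mean(cell) for cell in row] for row in cells]
    mean_a = [statistics.mean([v for cell in row for v in cell]) for row in cells]
    mean_b = [
        statistics.mean([v for row in cells for v in row[j]]) for j in range(b)
    ]

    ss_a = b * r * sum((m - grand) ** 2 for m in mean_a)
    ss_b = a * r * sum((m - grand) ** 2 for m in mean_b)
    ss_ab = r * sum(
        (cell_mean[i][j] - mean_a[i] - mean_b[j] + grand) ** 2
        for i in range(a)
        for j in range(b)
    )
    ss_error = sum(
        (v - cell_mean[i][j]) ** 2
        for i in range(a)
        for j in range(b)
        for v in cells[i][j]
    )
    ms_error = ss_error / (a * b * (r - 1))

    return {
        "factor_a": (ss_a / (a - 1)) / ms_error,
        "factor_b": (ss_b / (b - 1)) / ms_error,
        "interaction": (ss_ab / ((a - 1) * (b - 1))) / ms_error,
    }


# Hypothetical gene: rows = disease (WT, R6/2), columns = treatment (-, +).
# Only the treated R6/2 group responds, which produces an interaction.
cells = [
    [[1.0, 2.0, 3.0], [1.0, 2.0, 3.0]],
    [[1.0, 2.0, 3.0], [5.0, 6.0, 7.0]],
]
f_ratios = two_way_anova_f(cells)
print(f_ratios)
```

Each F ratio would then be converted to a p-value per gene, so a gene can be filtered on the disease effect, the treatment effect, or the interaction separately.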

11 Two-way ANOVA compared to t-test. Triple treatment in Huntington's Disease model (R6/2 mice, GSE857, Affymetrix U74Av2). Disease differences detected: t-test, 274; two-way ANOVA, 791. Design table: Treatment (-, +) by Disease (WT, R6/2); cell entries: 3, 3. Reference: Pavlidis P, Noble WS. Analysis of strain and regional variation in gene expression in mouse brain. Genome Biol. 2001;2(10):RESEARCH0042.

12 Analysis Workflow Examples. 2 groups (apoE -/- aorta vs. wt aorta): t-test, BH (FDR), up regulated and down regulated gene lists. 5 groups, single factor (Drosophila Innate Immune Response Time Series): one-way ANOVA, BH (FDR), clustering, gene lists. 12 groups, two factors (Immune response to hookworms in mouse lung): two-way ANOVA, BH (FDR), clustering, gene lists. Each workflow ends with: individual genes of interest; biological themes (pathways, molecular functions, etc.).
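Each workflow above applies the Benjamini and Hochberg (BH) false discovery rate correction after the comparison test. A minimal sketch of the standard BH adjustment is shown here; the function name and the example p-values are assumptions, and this is not taken from GeneSifter itself.

```python
def benjamini_hochberg(p_values):
    """Return BH-adjusted p-values in the same order as the input."""
    m = len(p_values)
    # Indices of the p-values from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw p-values for four genes.
raw_p = [0.01, 0.04, 0.03, 0.20]
adj_p = benjamini_hochberg(raw_p)
for raw, adj in zip(raw_p, adj_p):
    print(f"raw p = {raw:.2f} -> BH adjusted p = {adj:.4f}")
```

Genes whose adjusted p-value falls below the chosen FDR threshold would go into the gene lists used for clustering and biological interpretation.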

14 GeneSifter: Microarray Data Analysis. Accessibility: web-based, secure. Data management: data annotation (MIAME); multiple upload tools (CodeLink, Affymetrix, Illumina, Agilent, custom). Differential Expression, powerful and accessible tools for determining statistical significance: R based statistics (Bioconductor); comparison tests (t-test, Welch's t-test, Wilcoxon Rank Sum test, one-way ANOVA, two-way ANOVA); correction for multiple testing (Bonferroni, Holm, Westfall and Young maxT, Benjamini and Hochberg); unsupervised clustering (PAM, CLARA, hierarchical clustering); silhouettes.

15 Integrated tools for determining Biological Significance One Click Gene Summary™ Ontology Report Pathway Report Search by ontology terms Search by KEGG terms or Chromosome

16 The GeneSifter Data Center. Free resource: training, research, publishing. 6 areas: Cardiovascular, Cancer, Endocrinology, Neuroscience, Immunology, Oral Biology. Access to: data, analysis summary, tutorials, WebEx.

17 The GeneSifter Data Center www.genesifter.net/dc

18 The Gene Expression Omnibus (GEO). Gene expression data repository (mostly microarrays); over 3000 data sets; all array platforms represented; searchable by platform, species and experiment annotation; downloadable data. See also: Using the Gene Expression Omnibus (http://www.microarraysuccess.org/newsletter).

20 Project Analysis : Two-way ANOVA Scott lab, Johns Hopkins University (Bloomberg School of Public Health ) Affymetrix Mouse 430 2.0 Wild type and SCID mice Control and 5 time points after infection CEL files available (loaded and MAS5 processed in GeneSifter) Alex Loukas, and Paul Prociv. Immune Responses in Hookworm Infections. Clinical Microbiology Reviews, October 2001, p. 689-703, Vol. 14, No. 4

22 Project Analysis : Two-way ANOVA. Factor One: Strain (2 levels: SCID, WT). Factor Two: Time after infection (6 levels: con, 2, 3, 4, 8, 12 dpi). Gene expression patterns illustrated for WT and SCID over time: strain effect, time effect, interaction.

23 Project Analysis : Two-way ANOVA

24 Identify factors; indicate the number of levels for each; identify the levels for each factor.

25 Project Analysis : Two-way ANOVA. Assign levels for each factor to cells; include a fold-change cutoff if desired; select the effect to filter on first (you can switch later).

26 Two-way ANOVA : Strain Effects

27 Biological Significance: Gene Annotation Sources. UniGene: organizes GenBank sequences into a non-redundant set of gene-oriented clusters. Gene titles are assigned to the clusters and these titles are commonly used by researchers to refer to that particular gene. LocusLink (Entrez Gene): provides a single query interface to curated sequence and descriptive information, including function, about genes. Gene Ontologies: the Gene Ontology™ Consortium provides controlled vocabularies for the description of the molecular function, biological process and cellular component of gene products, which can be used by databases such as Entrez Gene. KEGG: the Kyoto Encyclopedia of Genes and Genomes provides information about both regulatory and metabolic pathways for genes. Reference Sequences: the NCBI Reference Sequence project (RefSeq) provides reference sequences for both the mRNA and protein products of included genes. GeneSifter maintains its own copies of these databases and updates them automatically.

28 One-Click Gene Summary

29 Two-way ANOVA : Strain Effects

30 Ontology Report

31 Ontology Report : z-score. R = total number of genes meeting selection criteria; N = total number of genes measured; r = number of genes meeting selection criteria with the specified GO term; n = total number of genes measured with the specified GO term. Reference: Doniger SW, Salomonis N, Dahlquist KD, Vranizan K, Lawlor SC, Conklin BR. MAPPFinder: using Gene Ontology and GenMAPP to create a global gene-expression profile from microarray data. Genome Biology 2003, 4:R7.
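The z-score formula itself did not survive in this transcript. The sketch below uses the form given in the cited MAPPFinder paper as best recalled here, z = (r - nR/N) / sqrt(n (R/N)(1 - R/N)(1 - (n - 1)/(N - 1))), so treat the exact expression as an assumption to check against the reference. The function name and the example counts are hypothetical.

```python
import math


def go_z_score(r, n, R, N):
    """z-score for over- or under-representation of a GO term.

    r: genes meeting the selection criteria with the GO term
    n: genes measured with the GO term
    R: genes meeting the selection criteria
    N: genes measured
    """
    expected = n * R / N
    # Hypergeometric variance (sampling without replacement).
    variance = n * (R / N) * (1 - R / N) * (1 - (n - 1) / (N - 1))
    return (r - expected) / math.sqrt(variance)


# Hypothetical counts: 500 of 10,000 genes selected; the GO term annotates
# 100 measured genes, 20 of which are in the selected list (5 expected).
z = go_z_score(r=20, n=100, R=500, N=10000)
print(f"z = {z:.2f}")
```

A large positive z-score means the term appears in the gene list more often than expected by chance; a negative value means it is under-represented.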

32 Z-score Report

33 KEGG Report

34 Two-way ANOVA : Strain Effects

35 Strain effects - Visualization Visualization of 517 genes (strain effect p < 0.001)

36 Strain effects - Partitioning. Segregation of expression patterns using k-medoids clustering.

37 Strain effects - Partitioning. Silhouette widths are used to find the "best" number of clusters. Mean silhouette width by number of clusters k: k = 2, 0.71; k = 4, 0.41; k = 6, 0.25. Reference: Dudoit S, Fridlyand J. A prediction-based resampling method for estimating the number of clusters in a dataset. Genome Biol. 2002 Jun 25;3(7):RESEARCH0036. Epub 2002 Jun 25.
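The mean silhouette width used to compare cluster numbers can be sketched as follows. This is a minimal illustration on one-dimensional points with absolute difference as the distance; real expression profiles would need a multi-dimensional distance. The function name and data are assumptions.

```python
import statistics


def mean_silhouette_width(points, labels):
    """Mean silhouette width for 1-D points with the given cluster labels."""
    widths = []
    clusters = set(labels)
    for i, (p, label) in enumerate(zip(points, labels)):
        # a: mean distance to the other members of the point's own cluster.
        own = [
            abs(p - q)
            for j, (q, other) in enumerate(zip(points, labels))
            if other == label and j != i
        ]
        if not own:
            widths.append(0.0)  # common convention for singleton clusters
            continue
        a = statistics.mean(own)
        # b: mean distance to the nearest other cluster.
        b = min(
            statistics.mean(
                [abs(p - q) for q, other in zip(points, labels) if other == c]
            )
            for c in clusters
            if c != label
        )
        widths.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return statistics.mean(widths)


# Hypothetical data: two well-separated pairs of points.
points = [1.0, 2.0, 10.0, 11.0]
good = mean_silhouette_width(points, [0, 0, 1, 1])
bad = mean_silhouette_width(points, [0, 1, 0, 1])
print(f"good partition: {good:.2f}, bad partition: {bad:.2f}")
```

Values near 1 indicate compact, well-separated clusters, so the number of clusters with the highest mean width is preferred.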

38 Strain : Cluster 1

39 Strain : Cluster 2

40 Two-way ANOVA : Time Effects

41

42 Time : Cluster 1

43 Time : Cluster 2

44 Two-way ANOVA : Interaction

45

46 Interaction : Cluster 3

47 Interaction : Cluster 2

48 Two-way ANOVA : Summary. Immune response to hookworms in mouse lung: 12 groups (3 biological replicates), 2 factors (Strain and Time), ~39,000 genes. Two-way ANOVA effects: Interaction, Strain, Time. Gene counts shown: 56 genes, 517 genes, 1054 genes. Pattern selection: hierarchical clustering, PAM (Interaction). Z-scores, biological process: Transcription (4), Circadian Rhythm (3). Biological process: Immune response (8), Chitin catabolism (4).

49 Strain effects, time effects and interaction

50 GeneSifter Workflow Examples. 2 groups (apoE -/- aorta vs. wt aorta): t-test, BH (FDR), up regulated and down regulated gene lists. 5 groups, single factor (Drosophila Innate Immune Response Time Series): one-way ANOVA, BH (FDR), clustering, gene lists. 12 groups, two factors (Immune response to hookworms in mouse lung): two-way ANOVA, BH (FDR), clustering, gene lists. Each workflow ends with: individual genes of interest; biological themes (pathways, molecular functions, etc.).

51 Resources: Monthly Webinar Series. 8/10/06: Microarray analysis of gene expression in Huntington's Disease peripheral blood, a platform comparison. Archived: Using 2-way ANOVA to dissect gene expression following myocardial infarction in mice. Archived: Using 2-way ANOVA to dissect the immune response to hookworm infection in mouse lung. Archived: The microarray data analysis process, from raw data to biological significance. Archived: Microarray analysis of gene expression in androgen-independent prostate cancer. Archived: Microarray analysis of gene expression in male germ cell tumors.

52 Thank You. Eric Olson, eric@genesifter.net. www.genesifter.net: trial account, tutorials, sample data and Data Center.

