A Quantitative Overview to Gene Expression Profiling in Animal Genetics Armidale Animal Breeding Summer Course, UNE, Feb. 2006 Analysis of (cDNA) Microarray.

Slides:



Advertisements
Similar presentations
Mixed Designs: Between and Within Psy 420 Ainsworth.
Advertisements

Optimal designs for one and two-colour microarrays using mixed models
FACTORIAL ANOVA Overview of Factorial ANOVA Factorial Designs Types of Effects Assumptions Analyzing the Variance Regression Equation Fixed and Random.
1 Chapter 4 Experiments with Blocking Factors The Randomized Complete Block Design Nuisance factor: a design factor that probably has an effect.
Chapter 4 Randomized Blocks, Latin Squares, and Related Designs
M. Kathleen Kerr “Design Considerations for Efficient and Effective Microarray Studies” Biometrics 59, ; December 2003 Biostatistics Article Oncology.
Statistical tests for differential expression in cDNA microarray experiments (2): ANOVA Xiangqin Cui and Gary A. Churchill Genome Biology 2003, 4:210 Presented.
Microarray technology and analysis of gene expression data Hillevi Lindroos.
Some Terms Y =  o +  1 X Regression of Y on X Regress Y on X X called independent variable or predictor variable or covariate or factor Which factors.
FACTORIAL ANOVA.
© 2010 Pearson Prentice Hall. All rights reserved Least Squares Regression Models.
Sandrine Dudoit1 Microarray Experimental Design and Analysis Sandrine Dudoit jointly with Yee Hwa Yang Division of Biostatistics, UC Berkeley
Lecture 23: Tues., Dec. 2 Today: Thursday:
10 Hypothesis Testing. 10 Hypothesis Testing Statistical hypothesis testing The expression level of a gene in a given condition is measured several.
Differentially expressed genes
‘Gene Shaving’ as a method for identifying distinct sets of genes with similar expression patterns Tim Randolph & Garth Tan Presentation for Stat 593E.
Experimental Design Terminology  An Experimental Unit is the entity on which measurement or an observation is made. For example, subjects are experimental.
ViaLogy Lien Chung Jim Breaux, Ph.D. SoCalBSI 2004 “ Improvements to Microarray Analytical Methods and Development of Differential Expression Toolkit ”
(4) Within-Array Normalization PNAS, vol. 101, no. 5, Feb Jianqing Fan, Paul Tam, George Vande Woude, and Yi Ren.
A Quantitative Overview to Gene Expression Profiling in Animal Genetics Armidale Animal Breeding Summer Course, UNE, Feb Affymetrix GeneChips Oligonucleotide.
CDNA Microarrays Neil Lawrence. Schedule Today: Introduction and Background 18 th AprilIntroduction and Background 25 th AprilcDNA Mircoarrays 2 nd MayNo.
Multiple testing in high- throughput biology Petter Mostad.
Practical Issues in Microarray Data Analysis Mark Reimers National Cancer Institute Bethesda Maryland.
One-Factor Experiments Andy Wang CIS 5930 Computer Systems Performance Analysis.
A Quantitative Overview to Gene Expression Profiling in Animal Genetics Armidale Animal Breeding Summer Course, UNE, Feb MPSS Massively Parallel.
 Combines linear regression and ANOVA  Can be used to compare g treatments, after controlling for quantitative factor believed to be related to response.
Statistics: Examples and Exercises Fall 2010 Module 1 Day 7.
Panu Somervuo, March 19, cDNA microarrays.
A Quantitative Overview to Gene Expression Profiling in Animal Genetics Armidale Animal Breeding Summer Course, UNE, Feb Sensitivity A Simple Method.
23-1 Analysis of Covariance (Chapter 16) A procedure for comparing treatment means that incorporates information on a quantitative explanatory variable,
Applying statistical tests to microarray data. Introduction to filtering Recall- Filtering is the process of deciding which genes in a microarray experiment.
Joint analysis of three seemingly independent microarray experiments via multivariate mixed-model equations with null residual covariance structure Antonio.
ANOVA (Analysis of Variance) by Aziza Munir
IAP workshop, Ghent, Sept. 18 th, 2008 Mixed model analysis to discover cis- regulatory haplotypes in A. Thaliana Fanghong Zhang*, Stijn Vansteelandt*,
Maximum Likelihood Estimator of Proportion Let {s 1,s 2,…,s n } be a set of independent outcomes from a Bernoulli experiment with unknown probability.
Bioinformatics Expression profiling and functional genomics Part II: Differential expression Ad 27/11/2006.
A A R H U S U N I V E R S I T E T Faculty of Agricultural Sciences Introduction to analysis of microarray data David Edwards.
The Analysis of Microarray data using Mixed Models David Baird Peter Johnstone & Theresa Wilson AgResearch.
Inference for Regression Simple Linear Regression IPS Chapter 10.1 © 2009 W.H. Freeman and Company.
Statistical Principles of Experimental Design Chris Holmes Thanks to Dov Stekel.
A Quantitative Overview to Gene Expression Profiling in Animal Genetics Antonio (Toni) Reverter Bioinformatics Group CSIRO Livestock Industries Queensland.
1 Analysis of Variance & One Factor Designs Y= DEPENDENT VARIABLE (“yield”) (“response variable”) (“quality indicator”) X = INDEPENDENT VARIABLE (A possibly.
Introduction to Statistical Analysis of Gene Expression Data Feng Hong Beespace meeting April 20, 2005.
Statistical Methods for Identifying Differentially Expressed Genes in Replicated cDNA Microarray Experiments Presented by Nan Lin 13 October 2002.
1 Analysis Considerations in Industrial Split-Plot Experiments When the Responses are Non-Normal Timothy J. Robinson University of Wyoming Raymond H. Myers.
Statistics for Differential Expression Naomi Altman Oct. 06.
A Quantitative Overview to Gene Expression Profiling in Animal Genetics Armidale Animal Breeding Summer Course, UNE, Feb Analysis of (cDNA) Microarray.
A Quantitative Overview to Gene Expression Profiling in Animal Genetics Armidale Animal Breeding Summer Course, UNE, Feb Co-Expression Reverter et.
A Quantitative Overview to Gene Expression Profiling in Animal Genetics Armidale Animal Breeding Summer Course, UNE, Feb GP3xCLI GenePix Post-Processing.
Reverter et al., XV AAABG, Melbourne – July 2003 Intensities versus intensity ratios in the analysis of cDNA microarray data Toni Reverter (
Analyzing Expression Data: Clustering and Stats Chapter 16.
1 Estimation of Gene-Specific Variance 2/17/2011 Copyright © 2011 Dan Nettleton.
A Quantitative Overview to Gene Expression Profiling in Animal Genetics Armidale Animal Breeding Summer Course, UNE, Feb Analysis of (cDNA) Microarray.
Hybridization Design for 2-Channel Microarray Experiments Naomi S. Altman, Pennsylvania State University), NSF_RCN.
Methods for Dummies Second level Analysis (for fMRI) Chris Hardy, Alex Fellows Expert: Guillaume Flandin.
Empirical Bayes Analysis of Variance Component Models for Microarray Data S. Feng, 1 R.Wolfinger, 2 T.Chu, 2 G.Gibson, 3 L.McGraw 4 1. Department of Statistics,
The Mixed Effects Model - Introduction In many situations, one of the factors of interest will have its levels chosen because they are of specific interest.
A Quantitative Overview to Gene Expression Profiling in Animal Genetics Armidale Animal Breeding Summer Course, UNE, Feb Analysis of (cDNA) Microarray.
A Quantitative Overview to Gene Expression Profiling in Animal Genetics Armidale Animal Breeding Summer Course, UNE, Feb Final Remarks Genetical.
Distinguishing active from non active genes: Main principle: DNA hybridization -DNA hybridizes due to base pairing using H-bonds -A/T and C/G and A/U possible.
Statistical Analysis for Expression Experiments Heather Adams BeeSpace Doctoral Forum Thursday May 21, 2009.
Biostatistics Case Studies Peter D. Christenson Biostatistician Session 3: Missing Data in Longitudinal Studies.
A Quantitative Overview to Gene Expression Profiling in Animal Genetics Armidale Animal Breeding Summer Course, UNE, Feb Analysis of (cDNA) Microarray.
ANOVA and Multiple Comparison Tests
Microarray Data Analysis Xuming He Department of Statistics University of Illinois at Urbana-Champaign.
12 Inferential Analysis.
One-Factor Experiments
Chapter 10 – Part II Analysis of Variance
Presentation transcript:

A Quantitative Overview to Gene Expression Profiling in Animal Genetics Armidale Animal Breeding Summer Course, UNE, Feb Analysis of (cDNA) Microarray Data: Part IV. Mixed-Model Equations I

A Quantitative Overview to Gene Expression Profiling in Animal Genetics Armidale Animal Breeding Summer Course, UNE, Feb Kerr & Churchill, 2001 PNAS 98: Mixed-Model Equations Setting the scene (1/3):

A Quantitative Overview to Gene Expression Profiling in Animal Genetics Armidale Animal Breeding Summer Course, UNE, Feb Wolfinger et al, 2001 J Comp Biol 8:625 Mixed-Model Equations Setting the scene (2/3): Two-Step Mixed-Model Source: G Rosa Assumes most genes are not DE. Otherwise some important effects are lost. One for each gene!

A Quantitative Overview to Gene Expression Profiling in Animal Genetics Armidale Animal Breeding Summer Course, UNE, Feb Wolfinger et al, 2001 J Comp Biol 8:625 Mixed-Model Equations Setting the scene (2/3): Two-Step Mixed-Model Source: G Rosa Step 1. Global Normalisation Step 1. Gene Models

A Quantitative Overview to Gene Expression Profiling in Animal Genetics Armidale Animal Breeding Summer Course, UNE, Feb Wolfinger et al, 2001 J Comp Biol 8:625 Mixed-Model Equations Setting the scene (2/3): Two-Step Mixed-Model Source: G Rosa 2003.

A Quantitative Overview to Gene Expression Profiling in Animal Genetics Armidale Animal Breeding Summer Course, UNE, Feb Hoeschele & Li, 2005 Biostatistics 6:183 Mixed-Model Equations Setting the scene (3/3): Joint versus Gene-Specific Mixed-Models 1.Flexibility on how the (co)variance structure is modelled and different formulations can be compared, eg. Homo- vs Hetero-geneous within- gene variance) 2.Allows the evaluation of gene- specific treatment contrasts that include main effects 3.Allows model evaluation and residual analysis 1.Low Power (fewer degrees of freedom), but exact if: 2.Genes have the same number of probes in each array; 3.Residual variance must be homogeneous across genes; and 4.Large number of genes.

A Quantitative Overview to Gene Expression Profiling in Animal Genetics Armidale Animal Breeding Summer Course, UNE, Feb Mixed-Model Equations Examples: Joint vs Gene-Specific Mixed-Models Joint Gene-Specific (Two-Step) Variety has been fitted as an additional fixed effect AI vs IVF vs NT Embrios

A Quantitative Overview to Gene Expression Profiling in Animal Genetics Armidale Animal Breeding Summer Course, UNE, Feb Mixed-Model Equations Illustration: AB 1 2 x 5 (to generate entire data) This could represent a row (or column) of the second array. 2 Arrays, 5 Genes/Array, Genes spotted five times, 2 Treatments (A, B)

A Quantitative Overview to Gene Expression Profiling in Animal Genetics Armidale Animal Breeding Summer Course, UNE, Feb Mixed-Model Equations Illustration: AB 1 2 x 5 (to generate entire data) 2 Arrays, 5 Genes/Array, Genes spotted five times, 2 Treatments (A, B) DATA test; INFILE ‘h:\UNE_data\Test\c1c2’; INPUT array gene dye $ treat $ intens; RUN; PROC GLM; CLASS array dye gene treat; MODEL intens = array dye gene*treat; RANDOM gene*treat; LSMEANS array dye gene*treat / pdiff; RUN; SAS Code: Y = Array + Dye + Gene*Treatment + Error Model: Random

A Quantitative Overview to Gene Expression Profiling in Animal Genetics Armidale Animal Breeding Summer Course, UNE, Feb Mixed-Model Equations Illustration: AB 1 2

A Quantitative Overview to Gene Expression Profiling in Animal Genetics Armidale Animal Breeding Summer Course, UNE, Feb Mixed-Model Equations Illustration: AB 1 2

A Quantitative Overview to Gene Expression Profiling in Animal Genetics Armidale Animal Breeding Summer Course, UNE, Feb Mixed-Model Equations The Diets experiment: y ijkgtr = C i Component of Design (Reference, All-Pairs) + A j Array slide (1, 2, …, 14) + D k Dye (Red, Green) + T t Treatment (Diets: High, Medium, Low) + (AD) jk Array * Dye + G g Main effect of Gene + (AG) jg Gene * Array + (DG) kg Gene * Dye + (TG) tg Gene * Treatment +  ijkgtr Random Error Byrne et al, 2005 J Anim Sci 83:1-12 Random effects Log 2 Intensities

A Quantitative Overview to Gene Expression Profiling in Animal Genetics Armidale Animal Breeding Summer Course, UNE, Feb Mixed-Model Equations The Diets experiment: Byrne et al, 2005 J Anim Sci 83: The terms C, A, D, T, and AD account for all effects that are not gene specific, have no biological significance, and the fitting of which aims at normalizing the data by accounting for systematic effects. 2.The random gene effect G contains the average level of gene expression (averaged over the other factors). 3.The random gene  array in (AG) models the effects for each spot and it serves to account for the spot-to-spot variability inherent in spotted microarray data. It allows us to extract appropriate information about the treatments and obviates the need to form ratios (Wolfinger et al., 2001). 4.The random gene  dye in (DG) models the gene specific dye effects occurring when some genes exhibit higher fluorescent signal when labeled with one dye or the other, regardless of the treatment (Kerr et al., 2002). 5.The effect of interest was the random interaction between genes and diet treatments, (TG) because it captured differences from overall averages that were attributable to specific combination of diet and gene.

A Quantitative Overview to Gene Expression Profiling in Animal Genetics Armidale Animal Breeding Summer Course, UNE, Feb Mixed-Model Equations The Diets experiment: y ijkgtr = C i + A j + D k + T t + (AD) jk + G g Main effect of Gene  Between Genes + (AG) jg Gene * Array  B/w G within Array + (DG) kg Gene * Dye  B/w G within Dye + (TG) tg Gene * Treatment  B/w G within Trt +  ijkgtr Random Error  Within Gene Byrne et al, 2005 J Anim Sci 83:1-12 Log 2 Intensities Decomposition of Total Variance

A Quantitative Overview to Gene Expression Profiling in Animal Genetics Armidale Animal Breeding Summer Course, UNE, Feb Mixed-Model Equations The Diets experiment: y ijkgtr = C i + A j + D k + T t + (AD) jk + G g + (AG) jg + (DG) kg + (TG) tg +  ijkgtr Log 2 Intensities Byrne et al, 2005 J Anim Sci 83:1-12

A Quantitative Overview to Gene Expression Profiling in Animal Genetics Armidale Animal Breeding Summer Course, UNE, Feb Mixed-Model Equations The Diets experiment: Byrne et al, 2005 J Anim Sci 83:1-12 ModelVariance component a LogL 22 2g2g  2 jg Gene * Array  2 kg Gene * Dye  2 tg Gene * Diet Full ,732 R ,449 R ,933 R ,444 REML Estimates Full %Total Variance

A Quantitative Overview to Gene Expression Profiling in Animal Genetics Armidale Animal Breeding Summer Course, UNE, Feb Mixed-Model Equations The Diets experiment: Byrne et al, 2005 J Anim Sci 83:1-12 Residual Plots

A Quantitative Overview to Gene Expression Profiling in Animal Genetics Armidale Animal Breeding Summer Course, UNE, Feb Mixed-Model Equations The Diets experiment: Byrne et al, J Anim Sci 83:1-12 Measures of (possible) Differential Expression HIGH vs LOW Contrast Alternatively:

A Quantitative Overview to Gene Expression Profiling in Animal Genetics Armidale Animal Breeding Summer Course, UNE, Feb Mixed-Model Equations The Diets experiment: Byrne et al, J Anim Sci 83:1-12 Full %Total Variance (SD = 0.454) ± 2 SD ± 3 SD dgdg ± 2 SD± 3 SD < > % Total5.81.5

A Quantitative Overview to Gene Expression Profiling in Animal Genetics Armidale Animal Breeding Summer Course, UNE, Feb Mixed-Model Equations The Diets experiment: Byrne et al, J Anim Sci 83:1-12 Histogram of p-values for a Test of d g assuming Normality (Using 100 bins at 0.01 interval) We estimate true Null p-values per bin If we set our cutoff for significance at c = 0.01, we could estimate FDR to be 73.43/331 =

A Quantitative Overview to Gene Expression Profiling in Animal Genetics Armidale Animal Breeding Summer Course, UNE, Feb Mixed-Model Equations Concluding Remarks on DE Genes: 1.The assumption of Normality of d g is questionable 2.Hence, resorting to “tabled values” (ie. 2, 3, SD from the mean) could be suboptimal 3.Instead, using the proportion of the total variation that is attributed to the Gene by Treatment (diet) interaction could be safer option. 4.We let the mixed-model tell us how many genes are likely to be DE 5.CLAIM: The proportion of the total variation that is attributed to the Gene by Treatment (diet) interaction allows us to control the FDR 6.CLAIM: The proportion of the total variation that is attributed to the Gene by Treatment (diet) interaction provides a lower bound for the mixing proportion in the extreme cluster(s) in a model based clustering via mixtures of distributions.