1. The multiphase framework for microarray experiments

The framework is based on Brien et al. (2011). We define a phase to be the period of time during which a set of units is engaged in producing a particular outcome. The outcome can be material for processing in the next phase, values for response variables, or both. Only the final phase need have a response variable, and one phase might overlap another. A multiphase experiment consists of two or more such phases. Generally, multiphase experiments randomize (randomly allocate) the outcomes from one phase to the next phase.
Physical phases in a microarray experiment

One list of phases, and their outcomes, that commonly occurs:

Phase -> Outcome
Selection or production of biological material -> Organisms/Tissues
Sample acquisition & storage -> Samples
RNA extraction -> Extracts
Labelling -> Aliquots
Array production or purchase -> Arrays
Hybridization, incl. post-washing -> Hybridized arrays
Scanning -> Measurements

Potentially there is a designed experiment at every phase (Speed & Yang with Smyth, 2008).

Design:
Two-phase design (McIntyre, 1955; Kerr, 2003): the 1st phase can be an experiment or an observational (epidemiological) study; the 2nd phase is a laboratory phase (Brien et al., 2011).
Multiple randomizations (Brien & Bailey, 2006) in a seven-phase process.
The components of the framework

I. Identify the set of phases involved.
II. Determine the design for each phase.
III. Produce the factor-allocation description of the experiment.
IV. Formulate the full mixed model.
V. Derive the ANOVA table and use it to investigate the design, and obtain a mixed model of convenience.
II. Design for a phase: how to allocate one phase to the next

The basic principles used in single-phase experiments are applied in each phase, along with others outlined by Brien et al. (2011):

a) Randomization to avoid bias. Is it OK to process material from the same organism first in every phase, during operator or equipment warm-up in a phase? Randomization makes the experiment robust to such biases.
b) Minimizing variability. Focusing on experimental design, an obvious way is to remove batch effects for batches built into the processes:
   o e.g. different acquisition times, processing days, operators, batches of reagent or sets of simultaneously processed specimens.
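Principles (a) and (b) can be combined by randomizing the processing order within blocks. The following is a minimal Python sketch under stated assumptions, not a procedure taken from Brien et al. (2011): the function name `randomized_order`, the extract labels and the two processing days are all hypothetical.

```python
import random

def randomized_order(units, block_of, seed=None):
    """Return a processing order: blocks in random order,
    units in random order within each block."""
    rng = random.Random(seed)
    blocks = {}
    for unit in units:
        blocks.setdefault(block_of[unit], []).append(unit)
    labels = sorted(blocks)
    rng.shuffle(labels)            # randomize the order of the blocks
    order = []
    for label in labels:
        members = list(blocks[label])
        rng.shuffle(members)       # randomize the order within a block
        order.extend(members)
    return order

# Hypothetical example: 8 extracts, 4 processed on each of 2 days.
extracts = [f"extract{i}" for i in range(1, 9)]
day_of = {e: "day1" if i < 4 else "day2" for i, e in enumerate(extracts)}
order = randomized_order(extracts, day_of, seed=2011)
```

Keeping each day's units together lets day-to-day differences be removed as a block effect, while the shuffles guard against systematic effects such as warm-up.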
3. A sources-of-variability Affymetrix microarray experiment

A modified version of a study to examine the variability of labelling and hybridization described by Zakharkin et al. (2005). It involved 8 rats.

I. List of phases

Phase -> Outcome (per rat)
Production of biological material -> Rat
Sample acquisition & storage -> Mammary gland sample
RNA extraction -> Extract
Labelling -> 2 Aliquots (extract halved for labelling)
Array purchase -> 4 Affymetrix arrays
Hybridization, incl. post-washing -> 4 Hybridized arrays (each aliquot halved for hybridization)
Scanning -> Measurements
II. Assignment of factors for the sources-of-variability experiment

Suppose that:
o Sampling of the rats is in a random order.
o For each rat, the RNA is extracted as soon as the tissue is obtained; i.e. the RNA-extraction order from samples is not random.
o In the labelling phase, the 2 half-extracts from an extract will be labelled consecutively.
o In the hybridization phase, where washing occurs in batches of 16, one half-aliquot from every aliquot will be hybridized in one batch, and the remaining half-aliquots in a second batch. This means that all rats are included in each batch and that the estimate of hybridization variability includes batch variability.
o In the scanning phase, the arrays will be scanned in a completely random order. If scanning is to be batched, Brien et al. (2011) would suggest that batches in this phase line up with those in the hybridization phase.

Halving extracts and aliquots introduces technical replication, an example of laboratory replication (Brien et al., 2011).
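The hybridization and scanning assignments above can be sketched in Python as follows. This is an illustration only: the function name `build_layout` and the dictionary keys are hypothetical, and two details are assumptions not stated in the text, namely that the choice of which half of an aliquot goes to which batch is random and that the hybridization position within a batch is random.

```python
import random
from itertools import product

def build_layout(n_rats=8, seed=None):
    """One row per half-aliquot (one array each):
    8 rats x 2 half-extracts x 2 aliquot halves = 32 rows."""
    rng = random.Random(seed)
    rows = []
    for rat, half_extract in product(range(1, n_rats + 1), (1, 2)):
        # Assumption: which half of an aliquot goes to which batch is random.
        batches = [1, 2]
        rng.shuffle(batches)
        for aliquot_half, batch in zip((1, 2), batches):
            rows.append({"rat": rat, "half_extract": half_extract,
                         "aliquot_half": aliquot_half, "batch": batch})
    # Assumption: random hybridization position within each batch of 16.
    for batch in (1, 2):
        members = [r for r in rows if r["batch"] == batch]
        positions = list(range(1, len(members) + 1))
        rng.shuffle(positions)
        for row, position in zip(members, positions):
            row["hybridization"] = position
    # Scanning: a completely random order over all 32 arrays.
    scans = list(range(1, len(rows) + 1))
    rng.shuffle(scans)
    for row, scan in zip(rows, scans):
        row["scan"] = scan
    return rows

layout = build_layout(seed=1)
```

Each batch then holds 16 half-aliquots, one from every aliquot, so every rat appears in both batches.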
III. Factor-allocation description (Brien et al., 2011)

Factor-allocation diagrams have a panel for each set of objects (here, the set of outcomes of a phase); a panel lists the factors indexing that set. A solid arrow indicates randomization; a dashed arrow indicates systematic allocation.

Panels in the diagram (set of objects: factors):
8 rats: 8 Rats
8 samples: 8 Samplings
16 half-extracts: 8 Extractions; 2 HalfExtracts in E
32 half-aliquots: 8 Occasions; 2 Labellings in O; 2 AliquotHalves in O, L
32 hybridized arrays: 2 Batches; 16 Hybridizations in B
32 arrays: 32 Chips
32 scans: 32 Scans
Other label in the diagram: 2 A 1
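The panels can be recorded as a small data structure and checked for internal consistency. The sketch below is hypothetical (the names `panels` and `panel_is_consistent` are not from Brien et al., 2011) and assumes that the number of levels given for a nested factor is its number of levels within each combination of the factors that nest it, so that the levels in a panel multiply to the number of objects.

```python
from math import prod

# Each panel: (set of objects, number of objects, {factor: number of levels}).
panels = [
    ("rats", 8, {"Rats": 8}),
    ("samples", 8, {"Samplings": 8}),
    ("half-extracts", 16, {"Extractions": 8, "HalfExtracts": 2}),
    ("half-aliquots", 32, {"Occasions": 8, "Labellings": 2, "AliquotHalves": 2}),
    ("hybridized arrays", 32, {"Batches": 2, "Hybridizations": 16}),
    ("arrays", 32, {"Chips": 32}),
    ("scans", 32, {"Scans": 32}),
]

def panel_is_consistent(n_objects, factors):
    """True when the factor levels multiply to the number of objects."""
    return prod(factors.values()) == n_objects

checks = {name: panel_is_consistent(n, factors) for name, n, factors in panels}
```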
IV. Mixed model for experiment

To get the mixed model, use Brien & Demétrio's (2009) method:
o In each panel, form terms as all combinations of the factors, subject to nesting restrictions;
o Add each term from each panel to either the fixed or the random model.

Mixed model (fixed terms | random terms, with ∧ joining the factors in a term):
Grand mean | Rats + Samplings + Extractions + Extractions∧HalfExtracts + Occasions + Occasions∧Labellings + Occasions∧Labellings∧AliquotHalves + Batches + Batches∧Hybridizations + Chips + Scans
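The term-forming step can be sketched in Python. This is an illustrative reading of "all combinations of the factors, subject to nesting restrictions", not code from Brien & Demétrio (2009): a combination is kept only if every factor in it is accompanied by all the factors that nest it. The function name `panel_terms` and the dictionary encoding of the panels are hypothetical, and the factors in a term are joined here with "∧".

```python
from itertools import combinations

def panel_terms(factors):
    """factors maps each factor to the tuple of factors that nest it.
    Returns the combinations in which every nested factor appears
    together with all of its nesting factors."""
    names = list(factors)
    terms = []
    for size in range(1, len(names) + 1):
        for subset in combinations(names, size):
            if all(set(factors[f]) <= set(subset) for f in subset):
                terms.append("∧".join(subset))
    return terms

# Panels of the example, one dict per panel.
panels = [
    {"Rats": ()},
    {"Samplings": ()},
    {"Extractions": (), "HalfExtracts": ("Extractions",)},
    {"Occasions": (), "Labellings": ("Occasions",),
     "AliquotHalves": ("Occasions", "Labellings")},
    {"Batches": (), "Hybridizations": ("Batches",)},
    {"Chips": ()},
    {"Scans": ()},
]

# In this experiment every term is placed in the random model;
# the fixed model contains only the grand mean.
random_terms = [term for panel in panels for term in panel_terms(panel)]
```

For the seven panels this yields the eleven random terms of the mixed model above.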
V. Mixed model of convenience

Given confounding, can measure variability from:
i. Rats + Samplings + Extractions + Occasions;
ii. Extractions∧HalfExtracts + Occasions∧Labellings;
iii. Batches;
iv. Occasions∧Labellings∧AliquotHalves + Batches∧Hybridizations + Chips + Scans.

Mixed model of convenience for fitting:
Grand mean | Rats + Extractions∧HalfExtracts + Batches + Batches∧Hybridizations.

This model does not include all sources of variation; each fitted term stands for the whole group of confounded terms to which it belongs. The ANOVA table is useful for identifying the confounding and hence for choosing this model.
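A degrees-of-freedom skeleton for the model of convenience can be worked out by simple counting. The sketch below is not a table from the source; it assumes an orthogonal design in which the 32 arrays are the observational units, half-extracts are nested within rats, and the Batches contrast lies between the two halves of each aliquot, as in the hybridization assignment described earlier.

```python
n_rats = 8            # rats (also samples, extractions, occasions)
n_half_extracts = 16  # half-extracts (one aliquot is labelled from each)
n_batches = 2         # hybridization batches
n_arrays = 32         # half-aliquots = hybridized arrays = scans

df = {
    "Rats": n_rats - 1,                                   # between rats
    "Extractions∧HalfExtracts": n_half_extracts - n_rats, # half-extracts within rats
    "Batches": n_batches - 1,                             # between batches
}
# What remains after the grand mean and the terms above.
df["Batches∧Hybridizations"] = (n_arrays - 1) - sum(df.values())

total_df = sum(df.values())
```

Under these assumptions the four fitted sources have 7, 8, 1 and 15 degrees of freedom, accounting for the 31 available after the grand mean.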
Summary

Microarray experiments are multiphase:
o One might employ an experimental design in every phase to randomize and block the processing order in that phase.
o The factor-allocation description can be used to formulate the analysis for an experiment; this analysis includes terms from every phase.
o The multiphase framework is flexible in that it can easily be adapted to another set of phases.
References

Brien, C.J. and Bailey, R.A. (2006) Multiple randomizations (with discussion). J. Roy. Statist. Soc., Ser. B, 68, 571–609.
Brien, C.J. and Demétrio, C.G.B. (2009) Formulating mixed models for experiments, including longitudinal experiments. J. Agr. Biol. Env. Stat., 14,
Brien, C.J., Harch, B.D., Correll, R.L. and Bailey, R.A. (2011) Multiphase experiments with at least one later laboratory phase. I. Orthogonal designs. J. Agr. Biol. Env. Stat., 16(3):
Kerr, M.K. (2003) Design considerations for efficient and effective microarray studies. Biometrics, 59(4),
McIntyre, G.A. (1955) Design and analysis of two phase experiments. Biometrics, 11,
Speed, T.P. and Yang, J.Y.H. with Smyth, G. (2008) Experimental design in genomics, proteomics and metabolomics: an overview. Advanced Topics in Design of Experiments. Workshop held at INI, Cambridge, U.K.
Zakharkin, S.O., et al. (2005) Sources of variation in Affymetrix microarray experiments. BMC Bioinformatics, 6,

Web address for Multitiered experiments site: