Formulating mixed models for experiments, including longitudinal experiments (accepted for publication in JABES) Chris Brien 1 & Clarice Demétrio 2 1 University.

Presentation transcript:

Formulating mixed models for experiments, including longitudinal experiments (accepted for publication in JABES) Chris Brien 1 & Clarice Demétrio 2 1 University of South Australia, 2 ESALQ, Universidade de São Paulo Web address for Multitiered experiments site:

Outline
1. Preliminaries
2. A longitudinal Randomized Complete Block Design (RCBD)
3. Why randomization-based models?
4. A three-phase example
5. Concluding comments

1) Three-stage method
- Intratier Random and Intratier Fixed models: essentially models equivalent to a randomization model.
- Homogeneous Random and Fixed models: terms are added to the intratier models, and others are shifted between the intratier random and intratier fixed models.
- General Random and General Fixed models: perhaps reparameterize terms in the homogeneous models, particularly for a longitudinal experiment, and omit aliased terms from the random model.
  - This may yield a model of convenience, not the full mixed model.
  - Demonstrated by example (motivated by Piepho et al., 2004; an extension of Brien and Bailey, 2006, Section 7).

Tiers
- Randomization-based tiers are the foundation of the method.
- View randomization as the assignment of one set of objects to another, e.g. treatments to plots.
- A tier is a set of factors indexing a set of objects.
- Tiers would not be needed if all experiments were two-tiered, as only two sets of factors would be needed:
  - block, unit or unrandomized factors;
  - treatment or randomized factors.
- "Tiers" is a general term for these sets.
- For a two-tiered factorial RCBD:
  - the units or unrandomized tier might be {Blocks, Plots}, and
  - the treatments or randomized tier {A, B}.

Notation
Factor relationships:
- A*B: factors A and B are crossed.
- A/B: factor B is nested within A.
Generalized factor:
- A∧B is the ab-level factor formed from the combinations of A, with a levels, and B, with b levels.
Symbolic mixed model:
- Fixed terms | random terms, e.g. (A*B | Blocks/Runs).
- A*B = A + B + A∧B
- Blocks/Runs = Blocks + Blocks∧Runs
Functions on generalized factors:
- gf(.): the generalized factor formed from all factors in the argument.
- uc(.): some, possibly structured, form of unequal correlation between levels of the generalized factors.
- td(.): systematic trend across levels of the generalized factors.
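The crossing and nesting rules in the notation can be checked mechanically. The following is a minimal Python sketch, not taken from the slides: the helper names (`factor`, `cross`, `nest`, `show`) are hypothetical, and a model is assumed to be a set of terms, each term being the set of factor names making up a generalized factor.

```python
from itertools import product


def factor(name):
    """A model consisting of a single one-factor term (hypothetical helper)."""
    return {frozenset([name])}


def cross(x, y):
    """X*Y: all terms of X, all terms of Y, and every combination of one term from each."""
    return x | y | {a | b for a, b in product(x, y)}


def nest(x, y):
    """X/Y: all terms of X, plus each term of Y combined with every factor in X."""
    outer = frozenset().union(*x)
    return x | {outer | b for b in y}


def show(model):
    """Render a model as a sum of generalized factors.

    Factors inside a term are printed alphabetically, so the order within a
    term may differ from the order used on the slides.
    """
    terms = sorted(model, key=lambda t: (len(t), sorted(t)))
    return " + ".join("∧".join(sorted(t)) for t in terms)


if __name__ == "__main__":
    print(show(cross(factor("A"), factor("B"))))
    print(show(nest(factor("Blocks"), factor("Runs"))))
    # (Block / Plot) * Lay, the structure used for the longitudinal RCBD
    print(show(cross(nest(factor("Block"), factor("Plot")), factor("Lay"))))
```

Representing each term as a set of factor names makes A∧B and B∧A the same term, which matches how generalized factors are used in the formulas.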

2) A longitudinal RCBD (Piepho et al., 2004, Example 1)
- A field experiment comparing 3 different tillage methods.
- Laid out according to an RCBD with 4 blocks.
- On each plot, one water collector is installed in each of 4 layers and the amount of nitrogen leaching is measured.
Diagram: 3 treatments (3 Tillage); 48 layer-plots (4 Blocks, 3 Plots in B, 4 Lay).
Intratier Random and Intratier Fixed models:
- The unrandomized tier is {Block, Plot, Lay}.
- The randomized tier is {Tillage}.
- The only longitudinal factor is Lay.
- Intratier Random: (Block / Plot) * Lay = Block + Lay + Block∧Lay + Block∧Plot + Block∧Plot∧Lay
- Intratier Fixed: Tillage
So this is not the "Split-plot-in-Time" analysis with Random: Block / Plot / Subplot; Fixed: Tillage * Lay. But what are the Subplots?
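The two tiers of this experiment can be made concrete by generating the layout. The sketch below is illustrative only: the tillage labels, the function names and the seed are assumptions, not details from Piepho et al. (2004). It randomizes Tillage to Plots within each Block and then expands each plot into its 4 layers; Lay is not randomized.

```python
import random
from itertools import product

BLOCKS, PLOTS, LAYERS = 4, 3, 4
TILLAGES = ["T1", "T2", "T3"]  # hypothetical labels for the 3 tillage methods


def randomize_rcbd(seed=0):
    """Assign each tillage method once to the plots of every block (RCBD)."""
    rng = random.Random(seed)
    plan = {}
    for block in range(1, BLOCKS + 1):
        order = TILLAGES[:]
        rng.shuffle(order)  # separate randomization within each block
        for plot, tillage in enumerate(order, start=1):
            plan[(block, plot)] = tillage
    return plan


def layer_plots(plan):
    """Expand every plot into its layers; all layers of a plot share its tillage."""
    return [
        {"Block": b, "Plot": p, "Lay": lay, "Tillage": plan[(b, p)]}
        for (b, p), lay in product(sorted(plan), range(1, LAYERS + 1))
    ]


plan = randomize_rcbd()
units = layer_plots(plan)

if __name__ == "__main__":
    print(len(units), "layer-plots")
    for unit in units[:4]:
        print(unit)
```

The unrandomized factors Block, Plot and Lay index the 48 layer-plots, while Tillage is attached to them only through the randomization to Block∧Plot combinations.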

A longitudinal RCBD (cont'd)
Homogeneous Random and Fixed models: terms are added to the intratier models, and others are shifted from the intratier random to the intratier fixed model and vice versa.
- Take the fixed factors to be Block, Tillage and Lay, and the random factor to be Plot.
- Terms in the Intratier Random model that involve only Block and Lay are shifted to the fixed model.
- Tillage∧Lay is of interest, so the fixed model includes Tillage * Lay.
Homogeneous models:
- Homogeneous Random: Block∧Plot + Block∧Plot∧Lay = (Block∧Plot) / Lay
- Homogeneous Fixed: Block + Lay + Block∧Lay + Tillage + Tillage∧Lay = (Block + Tillage) * Lay
Intratier models, for comparison:
- Intratier Random: (Block / Plot) * Lay = Block + Lay + Block∧Lay + Block∧Plot + Block∧Plot∧Lay
- Intratier Fixed: Tillage
- Have all possible terms given the randomization.
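The shift from the intratier to the homogeneous models can be written as a small set operation. This is a sketch under one stated assumption: a term is treated as fixed when every factor in it is a fixed factor, and as random otherwise. The helper names are hypothetical.

```python
def term(*factors):
    """A generalized factor, represented as the set of its factor names."""
    return frozenset(factors)


# Intratier models for the longitudinal RCBD, as listed on the slide.
intratier_random = {
    term("Block"),
    term("Lay"),
    term("Block", "Lay"),
    term("Block", "Plot"),
    term("Block", "Plot", "Lay"),
}
intratier_fixed = {term("Tillage")}

FIXED_FACTORS = {"Block", "Tillage", "Lay"}  # Plot is the only random factor


def homogeneous(random_terms, fixed_terms, fixed_factors, added_fixed=()):
    """Reclassify terms: fixed if built only from fixed factors (assumed rule)."""
    all_terms = set(random_terms) | set(fixed_terms) | set(added_fixed)
    fixed = {t for t in all_terms if t <= fixed_factors}
    return all_terms - fixed, fixed


# Tillage∧Lay is the term added because the interaction is of interest.
hom_random, hom_fixed = homogeneous(
    intratier_random, intratier_fixed, FIXED_FACTORS,
    added_fixed=[term("Tillage", "Lay")],
)

if __name__ == "__main__":
    print("Random:", sorted(sorted(t) for t in hom_random))
    print("Fixed: ", sorted(sorted(t) for t in hom_fixed))
```

Under that rule the random model keeps exactly the two terms containing Plot, and the fixed model gains Block, Lay and Block∧Lay from the intratier random model.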

A longitudinal RCBD (cont'd)
General Random and General Fixed models: reparameterize terms in the homogeneous models and omit aliased terms from the random model.
- For longitudinal experiments, form longitudinal error terms: (subject term)∧gf(longitudinal factors).
- A subject term for a longitudinal factor is a generalized factor whose levels are the units on which the successive observations are taken.
- uc allows unequal correlation between longitudinal factor levels, and using gf allows arbitrary uc between these factors.
Homogeneous models, as the starting point:
- Homogeneous Random: Block∧Plot + Block∧Plot∧Lay = (Block∧Plot) / Lay
- Homogeneous Fixed: Block + Lay + Block∧Lay + Tillage + Tillage∧Lay = (Block + Tillage) * Lay
For this experiment:
- The subject term for Lay is Block∧Plot.
- Unequal correlation is expected between observations with different levels of Lay and the same level of Block∧Plot.
- There are no aliased random terms.
- General random: (Block∧Plot) / uc(gf(Lay)) = (Block∧Plot) / uc(Lay)
- Trends for Lay are of interest, but not for the qualitative factor Tillage nor for Block.
- General fixed: (Block + Tillage) * td(Lay)
- Mixed model: (Block + Tillage) * td(Lay) | (Block∧Plot) / uc(Lay)
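To see what a term such as (Block∧Plot) / uc(Lay) implies for the covariance between observations, consider the sketch below. It is only one possible reading: the slides leave the form of uc open, and a first-order autoregressive correlation between layers is assumed here purely for illustration. The function names and parameter values are hypothetical.

```python
def ar1_corr(n_levels, rho):
    """Correlation matrix with entries rho**|i - j| (assumed form of uc)."""
    return [[rho ** abs(i - j) for j in range(n_levels)] for i in range(n_levels)]


def covariance(unit_a, unit_b, plot_var, lay_var, lay_corr):
    """Covariance of two observations indexed by (Block, Plot, Lay).

    Observations on different subjects (Block∧Plot combinations) are taken to
    be uncorrelated; within a subject the covariance depends on the two layers.
    """
    block_a, plot_a, lay_a = unit_a
    block_b, plot_b, lay_b = unit_b
    if (block_a, plot_a) != (block_b, plot_b):
        return 0.0
    return plot_var + lay_var * lay_corr[lay_a - 1][lay_b - 1]


LAY_CORR = ar1_corr(4, rho=0.5)  # 4 layers; rho is an arbitrary illustrative value

if __name__ == "__main__":
    same_plot = covariance((1, 1, 1), (1, 1, 2), 1.0, 2.0, LAY_CORR)
    far_layers = covariance((1, 1, 1), (1, 1, 4), 1.0, 2.0, LAY_CORR)
    other_plot = covariance((1, 1, 1), (1, 2, 1), 1.0, 2.0, LAY_CORR)
    print(same_plot, far_layers, other_plot)
```

Replacing `ar1_corr` with any other valid correlation matrix changes the form of uc without changing the subject term.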

3) Why randomization-based models?
- It is common to form models by writing down a list of terms, sometimes drawing on models for related experiments, and designating each term as fixed or random, e.g. Split-plot-in-Time for the longitudinal RCBD.
- Here, models are derived from tiers: factors indexing sets of objects.
  - This ensures that all the terms taken into account in the randomization are included in the analysis, and that the incorporation of any other terms is intentional.
  - Such models are called randomization-based, in that the randomization is used in determining the terms in the model.
- We strongly recommend against using Rule 5 in Piepho et al. (2003), as done by Littell et al. (2006, Sec. 4.2).
  - Rule 5 involves substituting randomized factors for unrandomized factors.
  - It leads to a misidentification of sources of variation, including the possible omission of the experimental units (EUs) as sources.

RCBD and Rule 5
- Mixed model equivalent to the randomization model for an RCBD: Treatments | Blocks + Blocks∧Plots.
- Rule 5 modifies this to: Treatments | Blocks + Blocks∧Treatments.
- Of course, the latter is more economical, as Plots is no longer needed.
- However, the latter does not include Blocks∧Plots, whose levels are the EUs.
- Clearly, the levels of Blocks∧Treatments are not EUs, as Treatments are not applied to its levels.
- Blocks∧Plots and Blocks∧Treatments are two different sources of variability: inherent variability vs block-treatment interaction.
- This "trick" is confusing, unnecessary and not always possible.
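The sketch below illustrates why the substitution is available in an RCBD, and one way it can fail. The second design, with each treatment applied twice per block, is a hypothetical illustration and not an example from the slides. Function names, treatment labels and the seed are also assumptions.

```python
import random

BLOCK, PLOT, TRT = 0, 1, 2  # column positions in each (block, plot, treatment) row


def blocked_design(n_blocks, treatments, reps_per_block=1, seed=1):
    """Randomize treatments to plots within blocks; returns (block, plot, treatment) rows."""
    rng = random.Random(seed)
    rows = []
    for block in range(1, n_blocks + 1):
        allocation = list(treatments) * reps_per_block
        rng.shuffle(allocation)
        rows += [(block, plot, trt) for plot, trt in enumerate(allocation, start=1)]
    return rows


def n_levels(rows, *columns):
    """Number of distinct levels of the generalized factor formed from the columns."""
    return len({tuple(row[c] for c in columns) for row in rows})


rcbd = blocked_design(4, ["A", "B", "C"])
replicated = blocked_design(4, ["A", "B", "C"], reps_per_block=2)

if __name__ == "__main__":
    # RCBD: one plot per treatment per block, so the two factors have equally many levels.
    print(n_levels(rcbd, BLOCK, PLOT), n_levels(rcbd, BLOCK, TRT))
    # Two plots per treatment per block: Blocks∧Treatments no longer indexes the plots.
    print(n_levels(replicated, BLOCK, PLOT), n_levels(replicated, BLOCK, TRT))
```

Even where the level counts agree, the two generalized factors describe different sources of variation, which is the point made on the slide.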

4) A three-phase example (Pereira, 1969)
Experiment to investigate differences between pulp produced from different Eucalypt trees.
Chip phase:
- 3 lots of 5 trees from each of 4 areas were processed into wood chips.
- Each area differed in i) kind of trees (2 species) and ii) age (5 and 7 years).
- For each of the 12 lots, chips from the 5 trees were combined and 4 batches selected.
Pulp phase:
- Batches were cooked to produce pulp, and 6 samples were obtained from each cooking.
Measurement phase:
- Each batch was processed in one of 48 Runs of a laboratory refiner, with its 6 samples randomly placed on 6 positions in a pan in the refiner.
- For each run, 6 times of refinement (30, 60, 90, 120, 150 and 180 minutes) were randomized to the 6 positions in the pan.
- After the allotted time, a sample was taken from the pan and its degree of refinement measured.
Tiers shown in the diagram:
- 6 times: 6 Times
- 288 positions: 48 Runs, 6 Positions in R
- 48 batches: 2 Kinds, 2 Ages, 3 Lots in K, A, 4 Batches in K, A, L
- 288 samples: 48 Cookings, 6 Samples in C
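The unit counts in the four tiers can be reproduced by building the measurement-phase layout. This is a sketch, not the original randomization: the random allocation of batches to runs is an assumption (the slide says only that each batch was processed in one run), each batch is taken to correspond to one cooking, and the names and seed are hypothetical.

```python
import random
from itertools import product

KINDS, AGES, LOTS, BATCHES = 2, 2, 3, 4
POSITIONS = 6
TIMES = [30, 60, 90, 120, 150, 180]  # minutes of refinement

# One batch (and hence one cooking) per Kinds x Ages x Lots x Batches combination.
batches = list(product(range(1, KINDS + 1), range(1, AGES + 1),
                       range(1, LOTS + 1), range(1, BATCHES + 1)))


def randomize_measurement_phase(batches, seed=0):
    """Return one record per position: which sample and which time it receives."""
    rng = random.Random(seed)
    run_order = list(batches)
    rng.shuffle(run_order)  # assumed: batches allocated to runs at random
    plan = []
    for run, batch in enumerate(run_order, start=1):
        samples = list(range(1, POSITIONS + 1))
        times = TIMES[:]
        rng.shuffle(samples)  # the 6 samples of the batch to the 6 positions
        rng.shuffle(times)    # the 6 times to the 6 positions
        for position in range(1, POSITIONS + 1):
            plan.append({"Run": run, "Position": position, "Batch": batch,
                         "Sample": samples[position - 1], "Time": times[position - 1]})
    return plan


plan = randomize_measurement_phase(batches)

if __name__ == "__main__":
    print(len(batches), "batches;", len(plan), "positions")
```

Each record carries factors from all the tiers, which is what allows the terms from each tier to be written down before deciding which of them are aliased.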

Profile plot of data
Shows:
a) curvature in the trend over time;
b) some trend variability;
c) variance heterogeneity, in particular between the Ages.

Formulated and fitted mixed models
Using the 3-stage process, the following model of convenience is formulated from the 4 tiers:
- General random: Runs / Positions + (Kinds∧Ages∧Lots) / uc(Times) + (Kinds∧Ages∧Lots∧Batches) / uc(Times)
- General fixed: Kinds * Ages * td(Times)
This model:
- does not contain Cookings/Samples because of aliasing;
- has variance components for Runs, Positions, Lots and Batches;
- allows for some form of unequal correlation between Times;
- includes trends over Times.
The full fitted model, obtained using ASReml-R (Butler et al., 2007), has:
- for variance, a) unstructured, heterogeneous covariance between Times arising from Runs, Batches and Cookings, which differs for Ages, and b) a component for Lots variability;
- for time, a trend whose intercepts and curvature (characterized by cubic smoothing splines (Verbyla et al., 1999)) differ for Ages and whose slopes differ for Kinds (details in Brien & Demétrio, 2009).

Formulated and fitted mixed models
Using the 3-stage process, the following model of convenience is formulated from the 4 tiers:
- General random: Runs / Positions + (Kinds∧Ages∧Lots) / uc(Times) + (Kinds∧Ages∧Lots∧Batches) / uc(Times)
- General fixed: Kinds * Ages * td(Times)
- Does not contain Cookings/Samples.
Random trends over time were examined using cubic smoothing splines (Verbyla et al., 1999).
The full fitted model, obtained using ASReml-R (Butler et al., 2007), is:
- Ages + Kinds*lin(xTime) | spl(xTime)/Ages + Kinds∧Ages∧Lots + (Runs + Cookings + (Kinds∧at(Ages)∧Lots∧Batches))∧ush(Times)
  - xTime indicates that the actual values of the levels of Times are used;
  - the at operator indicates that the parameters differ for the Ages;
  - ush indicates an unstructured covariance matrix.
- For time, a trend whose intercept and curvature differ for Ages and whose slope differs for Kinds.
- For variance, unstructured, heterogeneous covariance between Times arising from Batches, Cookings and Runs, which differs for Ages, and a component for Lots variability (details in Brien & Demétrio, 2009).

Predicted degree of refinement
[Figure: predicted curves. Annotations: same Age (differ in slope); different Age (differ in intercept and curvature).]

5) Concluding comments
- Formulate a randomization-based mixed model:
  - to ensure that all terms appropriate, given the randomization, are included;
  - to make explicit where the model deviates from a randomization model.
- The approach is based on dividing the factors in an experiment into tiers.
- To obtain a fit, a model of convenience is often used:
  - when random sources are aliased, terms for all but one are omitted to obtain a fit;
  - but they are re-included in the fitted model if the retained term is in the fitted model.
- All 11 examples from Piepho et al. (2004) are in Brien and Demétrio (2009).

References
Brien, C.J. and Bailey, R.A. (2006) Multiple randomizations (with discussion). J. Roy. Statist. Soc., Ser. B, 68, 571–609.
Brien, C.J. and Demétrio, C.G.B. (2009) Formulating mixed models for experiments, including longitudinal experiments. Accepted for publication in JABES.
Butler, D., Cullis, B.R., Gilmour, A.R. and Gogel, B.J. (2007) Analysis of mixed models for S language environments: ASReml-R reference manual. DPI Publications, Brisbane.
Littell, R., Milliken, G., Stroup, W., Wolfinger, R. and Schabenberger, O. (2006) SAS for Mixed Models. 2nd edn. SAS Press, Cary.
Pereira, R.A.G. (1969) Estudo Comparativo das Propriedades Físico-Mecânicas da Celulose Sulfato de Madeira de Eucalyptus saligna Smith, Eucalyptus alba Reinw e Eucalyptus grandis Hill ex Maiden. Escola Superior de Agricultura 'Luiz de Queiroz', University of São Paulo, Piracicaba, Brasil.
Piepho, H.P., Büchse, A. and Emrich, K. (2003) A hitchhiker's guide to mixed models for randomized experiments. Journal of Agronomy and Crop Science, 189, 310–322.
Piepho, H.P., Büchse, A. and Richter, C. (2004) A mixed modelling approach for randomized experiments with repeated measures. Journal of Agronomy and Crop Science, 190, 230–247.
Verbyla, A.P., Cullis, B.R., Kenward, M.G. and Welham, S.J. (1999) The analysis of designed experiments and longitudinal data by using smoothing splines (with discussion). Applied Statistics, 48, 269–