Combined Analysis of Experiments
Basic Research
  – Researcher makes hypothesis and conducts a single experiment to test it
  – The hypothesis is modified and another experiment is conducted
  – Combined analysis of experiments is seldom required
  – Experiments may be repeated to
      Provide greater precision (increased replication)
      Validate results from initial experiment
Applied Research
  – Recommendations to producers must be based on multiple locations and seasons that represent target environments (soil types, weather patterns)
Multilocational trials
Often called MET = multi-environment trials
How do treatment effects change in response to differences in soil and weather throughout a region?
  – What is the range of responses that can be expected?
Detect and quantify interactions of treatments and locations and interactions of treatments and seasons in the recommendation domain
Combined estimates are valid only if locations are randomly chosen within target area
  – Experiments often carried out on experiment stations
  – Generally use sites that are most accessible or convenient
  – Can still analyze the data, but consider possible bias due to restricted site selection when making interpretations
Preliminary Analysis
Complete ANOVA for each experiment
  – Do we have good data from each site?
  – Examine residual plots for validity of ANOVA assumptions, outliers
Examine experimental errors from different locations for heterogeneity
  – Perform F Max test or Levene’s test for homogeneity of variance
  – If homogeneous, perform a combined analysis across sites
  – If heterogeneous, may need to use a transformation or break sites into homogeneous groups and analyze separately
  – Differences in means across sites are often greater than treatment effects
  – Does not prevent a combined analysis, but may contribute to error heterogeneity if there are associations between means and variances
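The F Max check on the site error variances can be sketched in a few lines. This is a minimal illustration, assuming each site has already been analyzed separately and that every error mean square has the same degrees of freedom (Hartley's test requires this). The function name `hartley_f_max` and the site values are hypothetical, not data from the slides.

```python
def hartley_f_max(error_mean_squares):
    """Return Hartley's F max: the largest error MS divided by the smallest."""
    if len(error_mean_squares) < 2:
        raise ValueError("need error mean squares from at least two sites")
    if min(error_mean_squares) <= 0:
        raise ValueError("error mean squares must be positive")
    return max(error_mean_squares) / min(error_mean_squares)


# Hypothetical error mean squares from three single-site ANOVAs
site_error_ms = {"Site A": 2.0, "Site B": 3.0, "Site C": 5.0}

f_max = hartley_f_max(list(site_error_ms.values()))
print(f"F max = {f_max:.2f}")
```

The resulting ratio is compared with a tabulated F max critical value for the number of sites and the error df per site. A ratio above the critical value suggests heterogeneous errors.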
MET Linear Model (for an RBD)
$$Y_{ijk} = \mu + \alpha_i + \beta_{j(i)} + \tau_k + (\alpha\tau)_{ik} + \varepsilon_{ijk}$$
  $\mu$ = mean effect
  $\alpha_i$ = i th location effect
  $\beta_{j(i)}$ = j th block effect within the i th location
  $\tau_k$ = k th treatment effect
  $(\alpha\tau)_{ik}$ = interaction of the k th treatment in the i th location
  $\varepsilon_{ijk}$ = pooled error
Environments = Locations = Sites
Blocks are nested in locations
  – SS for blocks is pooled across locations
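To make the partition implied by this model concrete, the sketch below computes the sums of squares and degrees of freedom for a balanced RBD repeated at several locations. It is an illustration under stated assumptions: the data are balanced, `y[i][j][k]` holds the observation for location i, block j, treatment k, and the function name `combined_rbd_ss` and the yields are hypothetical.

```python
def combined_rbd_ss(y):
    """Sums of squares for a balanced RBD repeated at several locations.

    y[i][j][k] is the observation for location i, block j, treatment k.
    Returns {source: (df, SS)}.
    """
    l, r, t = len(y), len(y[0]), len(y[0][0])
    n = l * r * t
    grand = sum(v for loc in y for blk in loc for v in blk)
    cf = grand ** 2 / n  # correction factor

    loc_tot = [sum(v for blk in loc for v in blk) for loc in y]
    blk_tot = [sum(blk) for loc in y for blk in loc]
    trt_tot = [sum(y[i][j][k] for i in range(l) for j in range(r))
               for k in range(t)]
    lt_tot = [sum(y[i][j][k] for j in range(r))
              for i in range(l) for k in range(t)]

    ss_total = sum(v ** 2 for loc in y for blk in loc for v in blk) - cf
    ss_loc = sum(x ** 2 for x in loc_tot) / (r * t) - cf
    # Blocks are nested in locations, so the location SS is removed
    ss_blk = sum(x ** 2 for x in blk_tot) / t - cf - ss_loc
    ss_trt = sum(x ** 2 for x in trt_tot) / (l * r) - cf
    ss_lt = sum(x ** 2 for x in lt_tot) / r - cf - ss_loc - ss_trt
    ss_err = ss_total - ss_loc - ss_blk - ss_trt - ss_lt

    return {
        "Location": (l - 1, ss_loc),
        "Blocks in Loc.": (l * (r - 1), ss_blk),
        "Treatment": (t - 1, ss_trt),
        "Loc. x Treatment": ((l - 1) * (t - 1), ss_lt),
        "Pooled Error": (l * (r - 1) * (t - 1), ss_err),
        "Total": (n - 1, ss_total),
    }


# Hypothetical yields: 2 locations x 2 blocks x 3 treatments
yields = [
    [[10.0, 12.0, 14.0], [11.0, 13.0, 15.0]],
    [[20.0, 19.0, 18.0], [22.0, 21.0, 23.0]],
]
for source, (df, ss) in combined_rbd_ss(yields).items():
    print(f"{source:<18}{df:>4}{ss:>10.3f}")
```

Dividing each SS by its df gives the mean squares (M1 to M5) used in the F tests.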
Treatment x Environment Interaction
Obtain a preliminary estimate of interaction of treatment with environment or season
Will we be able to make general recommendations about the treatments or should they be specific for each site?
  – Error degrees of freedom are pooled across sites, so it is relatively easy to detect interactions
  – Consider the relative magnitude of variation due to the treatments compared to the interaction MS
  – Are there rank changes in treatments across environments (crossover interactions)?
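A quick descriptive screen for rank changes can be sketched as follows. It only compares the ordering of treatment means between environments and is not a significance test for crossover interaction. The function names and the means are hypothetical.

```python
def treatment_ranks(means):
    """Rank treatments within one environment (1 = highest mean).

    Ties are broken by the order in which treatments appear in `means`.
    """
    order = sorted(means, key=means.get, reverse=True)
    return {trt: rank for rank, trt in enumerate(order, start=1)}


def has_rank_change(env_means):
    """True if any environment ranks the treatments differently from the first."""
    rankings = [treatment_ranks(m) for m in env_means.values()]
    return any(r != rankings[0] for r in rankings[1:])


# Hypothetical treatment means at two sites
env_means = {
    "Site 1": {"A": 5.1, "B": 4.2, "C": 3.9},
    "Site 2": {"A": 4.0, "B": 4.6, "C": 3.5},
}
print(has_rank_change(env_means))  # A and B swap places between the sites
```

Whether an observed rank change matters still depends on the size of the interaction MS relative to the treatment MS and the pooled error.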
Treatments and locations are random

Source             df            SS       MS   Expected MS
Location           l-1           SSL      M1
Blocks in Loc.     l(r-1)        SSB(L)   M2
Treatment          t-1           SST      M3
Loc. x Treatment   (l-1)(t-1)    SSLT     M4
Pooled Error       l(r-1)(t-1)   SSE      M5

F for Locations = (M1+M5)/(M2+M4)
Satterthwaite’s approximate df (each mean square is divided by its own df from the table)
$$N_1' = \frac{(M_1+M_5)^2}{\dfrac{M_1^2}{l-1} + \dfrac{M_5^2}{l(r-1)(t-1)}}$$
$$N_2' = \frac{(M_2+M_4)^2}{\dfrac{M_2^2}{l(r-1)} + \dfrac{M_4^2}{(l-1)(t-1)}}$$
F for Treatments = M3/M4
F for Loc. x Treatments = M4/M5
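The quasi-F for locations and its approximate degrees of freedom can be sketched as below. This is an illustration, assuming the general Satterthwaite form (sum of mean squares, squared, divided by the sum of each squared mean square over its own df) with the df taken from the ANOVA table. The function names and mean squares are hypothetical.

```python
def satterthwaite_df(components):
    """Approximate df for a sum of mean squares.

    components is a list of (mean_square, df) pairs.
    """
    total = sum(ms for ms, _ in components)
    return total ** 2 / sum(ms ** 2 / df for ms, df in components)


def location_quasi_f(ms, l, r, t):
    """Quasi-F for locations when treatments and locations are both random.

    ms maps "M1", "M2", "M4", "M5" to the mean squares in the ANOVA table.
    Returns (F, numerator df, denominator df).
    """
    df = {
        "M1": l - 1,                    # Location
        "M2": l * (r - 1),              # Blocks in Loc.
        "M4": (l - 1) * (t - 1),        # Loc. x Treatment
        "M5": l * (r - 1) * (t - 1),    # Pooled Error
    }
    f = (ms["M1"] + ms["M5"]) / (ms["M2"] + ms["M4"])
    n1 = satterthwaite_df([(ms["M1"], df["M1"]), (ms["M5"], df["M5"])])
    n2 = satterthwaite_df([(ms["M2"], df["M2"]), (ms["M4"], df["M4"])])
    return f, n1, n2


# Hypothetical mean squares for l = 3 locations, r = 4 blocks, t = 5 treatments
mean_squares = {"M1": 40.0, "M2": 6.0, "M4": 4.0, "M5": 2.0}
f, n1, n2 = location_quasi_f(mean_squares, l=3, r=4, t=5)
print(f"F' = {f:.2f} on {n1:.2f} and {n2:.2f} df")
```

The approximate df are generally not integers. They are used as they are, or rounded down, when looking up the F distribution.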
Treatments and locations are fixed

Source             df            SS       MS   Expected MS
Location           l-1           SSL      M1
Blocks in Loc.     l(r-1)        SSB(L)   M2
Treatment          t-1           SST      M3
Loc. x Treatment   (l-1)(t-1)    SSLT     M4
Pooled Error       l(r-1)(t-1)   SSE      M5

F for Locations = M1/M2
F for Treatments = M3/M5
F for Loc. x Treatments = M4/M5
Fixed Locations constitute the entire population of environments OR represent specific environmental conditions (rainfall, elevation, etc.)
Treatments are fixed, Locations are random

Source             df            SS       MS   Expected MS
Location           l-1           SSL      M1
Blocks in Loc.     l(r-1)        SSB(L)   M2
Treatment          t-1           SST      M3
Loc. x Treatment   (l-1)(t-1)    SSLT     M4
Pooled Error       l(r-1)(t-1)   SSE      M5

F for Locations = M1/M2
F for Treatments = M3/M4
F for Loc. x Treatments = M4/M5
SAS uses slightly different rules for determining Expected MS
  – Under the SAS rules there is no direct test for Locations for this model
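The choice of denominator under the three sets of assumptions (both random, both fixed, treatments fixed with locations random) can be collected in one lookup. The sketch below only encodes the F tests listed on these slides. The dictionary layout, the function name `f_ratios`, and the mean squares are hypothetical.

```python
# Denominator mean square for each F test, by model, as listed in the tables.
# None marks the all-random case, where Locations has no single denominator
# and the quasi-F (M1 + M5) / (M2 + M4) is used instead.
DENOMINATORS = {
    "random": {"Location": None, "Treatment": "M4", "Loc. x Treatment": "M5"},
    "fixed": {"Location": "M2", "Treatment": "M5", "Loc. x Treatment": "M5"},
    "mixed": {"Location": "M2", "Treatment": "M4", "Loc. x Treatment": "M5"},
}
NUMERATORS = {"Location": "M1", "Treatment": "M3", "Loc. x Treatment": "M4"}


def f_ratios(ms, model):
    """Return the F ratio for each source under the chosen model.

    model is "random", "fixed", or "mixed" (treatments fixed, locations random).
    """
    ratios = {}
    for source, numerator in NUMERATORS.items():
        denominator = DENOMINATORS[model][source]
        if denominator is None:
            ratios[source] = (ms["M1"] + ms["M5"]) / (ms["M2"] + ms["M4"])
        else:
            ratios[source] = ms[numerator] / ms[denominator]
    return ratios


# Hypothetical mean squares
mean_squares = {"M1": 40.0, "M2": 6.0, "M3": 24.0, "M4": 4.0, "M5": 2.0}
for model in ("random", "fixed", "mixed"):
    print(model, f_ratios(mean_squares, model))
```

With the same mean squares, the treatment F changes from M3/M5 to M3/M4 once locations are treated as random, which is why the random or fixed decision should be made before looking at the results.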
SAS Expected Mean Squares
Varieties fixed, Locations random

    PROC GLM;
      CLASS Location Rep Variety;
      MODEL Yield = Location Rep(Location) Variety Location*Variety;
      RANDOM Location Rep(Location) Location*Variety / TEST;
    RUN;

Source     Type III Expected Mean Square
Location   Var(Error) + 3 Var(Location*Variety) + 7 Var(Rep(Location)) + 21 Var(Location)

Dependent Variable: Yield
Source     DF       Type III SS   Mean Square   F Value   Pr > F
Location   1        0.505125      0.505125      0.20      0.6745
Error      5.8098   15.027788     2.586644
Error: MS(Rep(Location)) + MS(Location*Variety) - MS(Error)
Treatments are fixed, Years are random

Source              df            SS       MS   Expected MS
Years               l-1           SSY      M1
Blocks in Years     l(r-1)        SSB(Y)   M2
Treatment           t-1           SST      M3
Years x Treatment   (l-1)(t-1)    SSYT     M4
Pooled Error        l(r-1)(t-1)   SSE      M5

F for Years = M1/M2
F for Treatments = M3/M4
F for Years x Treatments = M4/M5
Locations and Years in the same trial
Can analyze as a factorial
Can determine the magnitude of the interactions between treatments and environments
  – TxY, TxL, TxYxL
For a simpler interpretation, consider all year and location combinations as “sites” and use one of the models presented for multilocational trials

Source                      df
Years                       y-1
Locations                   l-1
Years x Locations           (y-1)(l-1)
Block(Years x Locations)    yl(r-1)
Combined Lab or Greenhouse Study (CRD)

Source              df           SS     MS   Expected MS
Trial               l-1          SSL    M1
Treatment           t-1          SST    M2
Trial x Treatment   (l-1)(t-1)   SSLT   M3
Pooled Error        lt(r-1)      SSE    M4

Assume Treatments are fixed, Trials are random
A “trial” is a repetition of a replicated experiment
If there are no interactions, consider pooling SSLT and SSE
  – Use a conservative P value to pool (e.g. >0.25 or >0.5)
F for Trials = M1/M4 (SAS would say M1/M3)
F for Treatments = M2/M3
F for Trials x Treatments = M3/M4
Preliminary ANOVA
Assumptions for this example:
  – Locations and blocks are random
  – Treatments are fixed
If Loc. x Treatment interactions are significant, must be cautious in interpreting main effects combined across all locations

Source             df            SS       MS   F
Total              lrt-1         SSTot
Location           l-1           SSL      M1   M1/M2
Blocks in Loc.     l(r-1)        SSB(L)   M2
Treatment          t-1           SST      M3   M3/M4
Loc. x Treatment   (l-1)(t-1)    SSLT     M4   M4/M5
Pooled Error       l(r-1)(t-1)   SSE      M5
Genotype by Environment Interactions (GEI)
When the relative performance of varieties differs from one location or year to another…
  – how do you make selections?
  – how do you make recommendations to farmers?
Genotype x Environment Interactions (GEI)
How much does GEI contribute to variation among varieties or breeding lines?
P = G + E + GE
    P is phenotype of an individual
    G is genotype
    E is environment
    GE is the interaction
DeLacey et al., 1990 – summary of results from many crops and locations
    70-20-10 rule (E : GE : G)
    20% of the observed variation among genotypes is due to interaction of genotype and environment
Stability
Many approaches for examining GEI have been suggested since the 1960s
Characterization of GEI is closely related to the concept of stability. “Stability” has been interpreted in different ways.
  – Static – performance of a genotype does not change under different environmental conditions (relevant for disease resistance, quality factors)
  – Dynamic – genotype performance is affected by the environment, but its relative performance is consistent across environments. It responds to environmental factors in a predictable way.
Measures of stability
CV of individual genotypes across locations
Regression of genotypes on an environmental index
  – Eberhart and Russell, 1966
Ecovalence
  – Wricke, 1962
Superiority measure of cultivars
  – Lin and Binns, 1988
Many others…
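The four measures listed above can be computed from a genotype by environment table of means. The sketch below is a minimal illustration under stated assumptions: every genotype is observed in every environment, the environmental index is the environment mean minus the grand mean, and only the regression slope is computed (not the deviations from regression). The function name `stability_measures` and the data are hypothetical.

```python
from statistics import mean, stdev


def stability_measures(y):
    """Stability statistics from y[genotype][environment] = mean performance."""
    genotypes = list(y)
    envs = list(y[genotypes[0]])
    n_env = len(envs)

    grand = mean([y[g][e] for g in genotypes for e in envs])
    env_mean = {e: mean([y[g][e] for g in genotypes]) for e in envs}
    env_index = {e: env_mean[e] - grand for e in envs}  # sums to zero
    env_max = {e: max(y[g][e] for g in genotypes) for e in envs}
    sum_index_sq = sum(v ** 2 for v in env_index.values())

    results = {}
    for g in genotypes:
        values = [y[g][e] for e in envs]
        g_mean = mean(values)
        results[g] = {
            # CV of the genotype across environments, in percent
            "cv_percent": 100 * stdev(values) / g_mean,
            # Slope of the regression on the environmental index
            "slope": sum(y[g][e] * env_index[e] for e in envs) / sum_index_sq,
            # Ecovalence: the genotype's share of the G x E sum of squares
            "ecovalence": sum(
                (y[g][e] - g_mean - env_mean[e] + grand) ** 2 for e in envs
            ),
            # Superiority: mean squared distance from the best genotype
            # in each environment, divided by 2
            "superiority": sum(
                (y[g][e] - env_max[e]) ** 2 for e in envs
            ) / (2 * n_env),
        }
    return results


# Hypothetical genotype means in three environments
table = {
    "G1": {"E1": 4.0, "E2": 5.0, "E3": 6.0},
    "G2": {"E1": 3.0, "E2": 5.0, "E3": 7.0},
    "G3": {"E1": 5.0, "E2": 5.0, "E3": 5.0},
}
for genotype, stats in stability_measures(table).items():
    print(genotype, {name: round(value, 3) for name, value in stats.items()})
```

In this made-up table G3 is stable in the static sense (CV of 0, slope of 0), while G1 is stable in the dynamic sense (slope of 1, ecovalence of 0), so the measures can rank the same genotypes differently.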
Analysis of GEI – other approaches
Rank sum index (nonparametric approach)
Cluster analysis
Factor analysis
Principal component analysis
AMMI
Pattern analysis
Analysis of crossovers
Partial Least Squares Regression
Factorial Regression