Presentation transcript: Design and Analysis of Augmented Designs in Screening Trials

Slide 1: Design and Analysis of Augmented Designs in Screening Trials
Kathleen Yeater, USDA-ARS-SPA
3rd Curators Workshop, February 3, 2010

Slide 2: 5 Basic Steps of an Experiment
1. Research Planning
2. Experimental Design
3. Summarize Observations
4. Analysis - Statistical Inference
5. Document / Present study results
Notes: Why start with this? This is how the talk is organized.

Slide 3: Research Planning - What is the Question?
- Is the focus on development?
- Are you trying to find something better?
- Is it discovery research?
- Not necessarily a specific hypothesis
Notes: This is why the word "screening" is in the title.

Slide 4: Remember the Basics, the 3 R's?
- Replication
    - Valid estimation of error variance
- Reduction of variation among plots
    - Controls (reduces) error variance
    - Use blocking to control heterogeneity present in the experiment; block at the scale of variability
- Randomization
    - Unbiased estimates of means and variances
Notes: Minimum blocking means fewer blocks, with blocking at the scale of variability. Cause and effect.

Slide 5: What if replication is ...?
- Impractical, prohibitively expensive, impossible
- Not enough material (seed, -icides)
- Not enough space
- Not enough time
- Too many entries

Slide 6: What leads to unreplicated design in field trials?
- 3 R's: making cause and effect statements
- In screening: making a cut based on good/bad in testing
- What is the Research Question again?
- Design for the experiment under consideration; DO NOT experiment for the design
Notes: With the principles of experimental design (the 3 R's), we are trying to make cause and effect statements. In screening, we are trying to make first-cut diagnoses, as in Early Generation Variety Testing.

Slide 7: Augmented Design
- Introduced in various publications of W.T. Federer
- Developed for plant breeding research
- Genotypes, yield, disease, insecticides, herbicides, and fertilizers are all excellent subject variables in a screening trial
Notes: With screening designs, the focus is on development, on trying to find something better; it is discovery research. We identified our Research Question in the Research Planning phase. Now we move on to step two, which is to identify an appropriate design to implement the experiment.

Slide 8: Augmented Design as Experimental Design
- Utilizes experimental design principles for the arrangement of checks
- New treatments (n) are not replicated; checks are replicated as points of reference
- Usually want 4 to 6 checks
- n can be large

Slide 9: Augmented Design - Implementation
I. Select any experimental design for the check(s)
II. Enlarge the blocks, or increase the number of rows and/or columns, to accommodate the new test entries (treatments, n)
III. Randomly distribute the new test entries among blocks/rows/columns

Slide 10: Design Set-up - RCB
[Figure: RCB field layout with treatments A, B, C, D and a moisture gradient]
The block effect now removes the moisture effect, giving fair comparisons among treatments.

Slide 11: Design Set-up - Augmented RCB
[Figure: augmented RCB field layout along the moisture gradient; each block holds checks A, B, C, D plus numbered, unreplicated test entries]
Now we've expanded the block out to include the unreplicated treatments.

Slide 12: Advantages of Augmented Design
- More than one check included (4 to 6 is optimal)
- Allows for an estimate of experimental error and is efficient
- Less physical space needed

Slide 13: How to select a check
- Checks are units of experimentation [varieties/genotypes/cultivars/entries] with known ranges of the measurement characteristics that you want to evaluate
- What are good checks for your objective?
    - Yield
    - CHO content
    - Seed characteristics
    - A quantitative measurement that holds constant

Slide 14: Variability of Checks
- The test entries' (n) range of measurement will be 5.0 to 40.0.
- The mean of the overall test entries is ~12.6.
- What do you think about these checks? Did I do a good job of selecting appropriate checks?

Slide 15: Consistent checks that cover the range of our test data
- The test entries' (n) range of measurement will be 5.0 to 40.0.
- The mean of the overall test entries is ~12.6.

Slide 16: Augmented RCBD
- Goal: Screen 300 new entries for response X. This is an Early Generation Screening Trial.
- 4 additional genotypes are CHECK entries (A, B, C, D)
- 6 blocks (field plots, time placement for lab assay, location in growth chamber)
- Randomize the location of A, B, C, D within each block (replicate the check genotypes within block for spatial variation)
- 300/6 = 50 new entries randomly selected and placed within each block
- Give each genotype a random number assignment from 1 to 300; then numbers 1-50 are used in Block 1, numbers 51-100 in Block 2, etc.
Notes: Discuss how any design is selectable here. Incomplete block and split plot/block designs are easily applied. The key is that any experimental design can be augmented to accommodate a set of new treatments that each appear only once.
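
The randomization described on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure used in the talk: the function name `augmented_rcbd`, the seed, and the choice to place each check exactly once per block are all assumptions made for the example.

```python
import random


def augmented_rcbd(n_entries=300, checks=("A", "B", "C", "D"), n_blocks=6, seed=2010):
    """Return one list of plot labels per block (hypothetical helper).

    Assumes test entries divide evenly among blocks and that every
    check appears exactly once in every block.
    """
    if n_entries % n_blocks != 0:
        raise ValueError("this sketch assumes entries divide evenly among blocks")
    rng = random.Random(seed)

    # Randomly order the test entries, then deal them out block by block.
    entries = list(range(1, n_entries + 1))
    rng.shuffle(entries)
    per_block = n_entries // n_blocks

    layout = []
    for b in range(n_blocks):
        plots = [str(e) for e in entries[b * per_block:(b + 1) * per_block]]
        plots.extend(checks)  # add the replicated checks to this block
        rng.shuffle(plots)    # randomize plot positions within the block
        layout.append(plots)
    return layout


layout = augmented_rcbd()
for i, block in enumerate(layout, start=1):
    print(f"Block {i}: {len(block)} plots, first five: {block[:5]}")
```

Each block ends up with 50 unreplicated test entries plus the 4 checks, which matches the 300/6 = 50 arithmetic on the slide.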

Slide 17: How many repeats of each check for optimal design?
- Design Resources Server, Indian Agricultural Statistics Research Institute (ICAR), New Delhi, India
- Online Design Generation-I: Augmented Design
Notes: As far as my research tells me, the reference by Parsad et al. has not been peer-reviewed.

Slide 19: Cells filled out, enter block sizes
Welcome to construction of Augmented Designs. Use this to generate an augmented design. Fill in the number of test treatments, control treatments, number of blocks, etc.
    Number of Test treatments (w):                        300
    Number of Control Treatments (u):                     4
    Number of Blocks (b):                                 6
    Number of replication of control:                     2
    Optimum Total Number of Experimental Units required:  348
Block sizes: Block 1: 58, Block 2: 58, Block 3: 58, Block 4: 58, Block 5: 58, Block 6: 58
Total Number of Experimental Units = 348; Assigned so far = 348; Remaining = 0.
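
The totals on this form can be reproduced with simple arithmetic. The sketch below assumes that "Number of replication of control" means each control appears that many times within every block; that reading is an assumption, chosen because it reproduces the 348 units and block size of 58 shown on the slide. The function names are hypothetical.

```python
def total_units(n_test, n_checks, n_blocks, check_reps_per_block):
    """Total plots: each test entry once, plus all check plots (assumed layout)."""
    check_plots = n_checks * n_blocks * check_reps_per_block
    return n_test + check_plots


def block_size(n_test, n_checks, n_blocks, check_reps_per_block):
    """Plots per block, assuming test entries divide evenly among blocks."""
    if n_test % n_blocks != 0:
        raise ValueError("this sketch assumes entries divide evenly among blocks")
    return n_test // n_blocks + n_checks * check_reps_per_block


print(total_units(300, 4, 6, 2))  # 348
print(block_size(300, 4, 6, 2))   # 58
```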

Slide 21: RCB Model - Augmented
Y = μ + check + block + test entry + error
- Checks are fixed effects (source of experimental error)
- Block, test entry, and error are random effects
- Recall:
    - Fixed effects = parameter estimation (mean and experimental error)
    - Random effects = sources of variability
Notes: Just like you can't replicate a block, we're also not replicating a treatment. These are random effects, so we get variance estimates. Treatments are random because 1) they represent a random selection from the population, and 2) another time we might have a different sample. Select an appropriate model to account for the variation present in the data from the experiment.
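
The model on this slide can also be written in symbols. This is a sketch only: the subscripts, the variance symbols, and the normality assumptions are not in the original slides and are added here in the form usually assumed for a linear mixed model.

```latex
y_{ijk} = \mu + c_i + b_j + t_k + e_{ijk},
\qquad
b_j \sim N(0, \sigma_b^2), \quad
t_k \sim N(0, \sigma_t^2), \quad
e_{ijk} \sim N(0, \sigma_e^2)
```

Here $c_i$ is the fixed effect of check class $i$, $b_j$ is the random effect of block $j$, $t_k$ is the random effect of entry $k$, and $e_{ijk}$ is the residual error.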
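
Slide 22: Data Structure - Summarize Observations
    data data_set;
      input BLOCK ENTRY $ CHECK $ Response;
      datalines;
    1 99_3 C ...
    1 99_1 A ...
    2 99_3 C ...
    ... 99_2 B ...
    6 99_4 D ...
    ;
(Only fragments of the data rows survive in this transcript; "..." marks values that were not preserved.)
Notes: Switch Genotypes to Entry or Entries in previous slides. We need dummy coding to estimate the fixed and random effects.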

Slide 23: Analysis of Augmented RCB
    proc mixed;
      class CHECK BLOCK ENTRY;
      model response = CHECK / solution;   /* fixed effect: checks */
      random BLOCK ENTRY / solution;       /* random effects: blocks and entries */
      lsmeans CHECK;
    run;
Notes: Remember that CHECK is labeled 0 for the test entries in the data.

Slide 24: Covariance Parameters
Covariance Parameter Estimates
    Cov Parm    Estimate
    BLOCK       variance component of block
    ENTRY       variance component of entry
    Residual    error variance
Notes: For this response, the variance estimate is greatest for ENTRY. This is what you want: the entries show the greatest variability.

Slide 25: LSMEANS - Checks
Least Squares Means
    Effect    CHECK    Estimate    Standard Error
    (one row per CHECK level, including A, B, C, and D; numeric values not preserved in this transcript)
Notes: Can you all see where the entries lie within the checks? These checks are more appropriate.

Slide 26: The SOLUTION option in the MODEL statement presents estimates of the fixed effect parameters
Solution for Fixed Effects
    Effect    CHECK    Estimate    Standard Error
    (rows: Intercept, then one row per CHECK level, including A, B, C, and D; numeric values not preserved in this transcript)
Intercept = grand mean

Slide 27: Estimated BLUPs (Best Linear Unbiased Predictors)
- Random effects: estimate the variance
- Estimate "realized values of random variables" (test entries)
- Augmented designs: use the SOLUTION option in the RANDOM statement
    random BLOCK ENTRY / solution;
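
To see why a BLUP differs from a raw observation, it can help to look at a deliberately simplified case. The sketch below is not what PROC MIXED computes for the full model: it assumes a single observation per entry, no block effect, and known variance components, in which case the BLUP of an entry effect is the deviation from the fixed part shrunk by the ratio of entry variance to total variance. The function name and all numeric values are hypothetical.

```python
def shrinkage_blup(y, fixed_part, var_entry, var_error):
    """BLUP of one entry effect in the simplified model y = fixed_part + t + e.

    Assumes one observation per entry and known variance components.
    """
    weight = var_entry / (var_entry + var_error)  # between 0 and 1
    return weight * (y - fixed_part)


# Hypothetical values: an entry observed at 20.0 against a fixed part of 12.0.
print(shrinkage_blup(20.0, 12.0, var_entry=9.0, var_error=3.0))  # 6.0
print(shrinkage_blup(20.0, 12.0, var_entry=3.0, var_error=9.0))  # 2.0
```

When the entry variance dominates, as the covariance parameter slide hopes for, the prediction stays close to the observed deviation; when error variance dominates, it is pulled strongly toward zero.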

Slide 28: Solution for Random Effects
    Effect    ENTRY    BLOCK    Estimate    Std Err Pred    DF    t Value    Pr > |t|
    (rows: three BLOCK rows followed by ENTRY rows, several with Pr > |t| of <.0001; other values not preserved in this transcript)
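
Slide 29: "Realized values"
- Rearrange the estimates of the entries from highest to lowest:
    proc sort data=data_set;
      by DESCENDING estimate;
    run;
- Add the Intercept to the Estimate values to calculate predicted adjusted mean values:
    data data_set;
      set data_set;
      /* INTERCEPT is a placeholder for the Intercept estimate from the
         Solution for Fixed Effects; the value is not preserved here */
      pred_adjmean = estimate + INTERCEPT;
    run;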
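
The same two steps, sorting the entry estimates and adding the intercept, can be sketched in Python. The entry labels, BLUP values, and intercept below are hypothetical numbers invented for illustration; they are not results from the talk, and `rank_entries` is a made-up helper name.

```python
def rank_entries(blups, intercept):
    """Return (entry, estimate, predicted adjusted mean) rows, highest estimate first."""
    rows = [(entry, est, est + intercept) for entry, est in blups.items()]
    rows.sort(key=lambda row: row[1], reverse=True)
    return rows


# Hypothetical BLUPs for four test entries and a hypothetical intercept.
blups = {"17": 1.5, "204": -0.75, "88": 4.25, "3": 0.0}
intercept = 12.5

ranked = rank_entries(blups, intercept)
for entry, estimate, pred_adjmean in ranked:
    print(f"ENTRY {entry:>4}  estimate={estimate:6.2f}  pred_adjmean={pred_adjmean:6.2f}")
```

Because the intercept is a constant, adding it changes the scale of the values but not the ranking of the entries.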

Slide 30: Predicted Adjusted Mean Values
    Obs    Effect    BLOCK    ENTRY    Estimate    StdErr Pred    DF    tValue    Probt    pred_adjmean
    (rows: ENTRY rows sorted by descending Estimate; numeric values not preserved in this transcript)
Notes: Discuss how to "look" at the data. Our focus is to find something "better", i.e. "higher up the list". The focus is not really on doing any multiple comparisons; there is no need. The ranking of the predicted adjusted means allows you to visualize which "treatments" are worth pushing forward for further research. You need to be more willing to accept Type I or Type II errors, because they will probably happen.
This is a linear model based on the checks, and these numbers are linear model predictions. The ranks and the ordering are the information that helps you move forward; they show the potential of the test entry.
The effect of the test entry is random, so conclusions drawn pertain only to the response of the fixed effects (checks). These are conclusions about the levels at hand: narrow space inference.

Slide 31: Augmented Designs - Recap
- Screening: discovery driven
- Select 4 to 6 meaningful checks
- Select an appropriate experimental design and increase rows and columns to include unreplicated test entries in each block
- RCB is the simplest case; split-plots and factorials are also possibilities (can look at interactions and autocorrelations)
- Mixed model analyses
- Phase II begins: select entries for a pilot study to elicit a better estimate of the true response; generate hypotheses

Slide 32: Augmented Designs - References
To get started:
- Google!
- Federer et al (2001) Agron. J. 93:
- Federer, W.T. (2005) Agron. J. 97:
- Burgueño and Crossa (2000) SAS Macro for Analysing Unreplicated Designs
- CIMMYT CRIL (crop research informatics laboratory)
- IASRI, Augmented Design tool
- IRRISTAT