Statistical Design of Experiments Training for AOCS Journal Editors


1 Statistical Design of Experiments Training for AOCS Journal Editors
Frank Rossi, Associate Director Statistics, Kraft Foods May 3, 2015

2 Cake Baking Experiment
Objective
The cake formula has been determined.
Need to determine the cooking time and temperature levels that produce the best cake.
Temperature can be varied from 300 to 450 degrees F.
Time can be varied from 30 to 50 minutes.
Previous research has shown that a baked cake with an internal temperature of 200 degrees is ideal.

3 Cake Baking Experiment
Step 1
Bake some cakes where cooking time and temperature are systematically varied across the specified ranges. Measure the internal temperature on each one. The time and temperature combinations form a square.
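A minimal sketch of laying out such a square grid of run combinations; the grid spacing and the use of Python/pandas are illustrative assumptions, not taken from the slides.

```python
# Minimal sketch: lay out a square grid of cooking time and temperature
# combinations for the first round of cakes. The grid spacing is illustrative.
from itertools import product

import pandas as pd

temperatures = [300, 375, 450]   # degrees F, spanning the stated range
times = [30, 40, 50]             # minutes, spanning the stated range

runs = pd.DataFrame(list(product(temperatures, times)),
                    columns=["temp_F", "time_min"])
print(runs)  # 9 factorial combinations forming a square in the design space
```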

4 Cake Baking Experiment
Design with the data collected
What have we learned so far?
Somewhere around here seems likely to deliver. How do we get there exactly?

5 Cake Baking Experiment
Step 2
Bake some more cakes to better define the cooking time and temperature space. Measure the internal temperature on these too. The added points form a circle.

6 Cake Baking Experiment
Complete data set
These points are sufficient to build a model to predict the internal temperature.
Why did I run two more at the center?
Predictions are valid only within the circle!
(Figure: design plot; the values shown at the center points are 191, 189 and 186.)

7 Cake Baking Experiment
What does the model look like?
A surface plot indicates the time and temperature combinations that are expected to deliver the desired internal temperature of 200 degrees.
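A minimal sketch of how such a surface can be fit and queried. The design follows a two-factor central composite layout, but the response values and the statsmodels tooling are hypothetical placeholders, not the data or software from the presentation.

```python
# Minimal sketch: fit a quadratic response surface for internal temperature as a
# function of oven temperature and time, then scan a grid for settings predicted
# to hit 200 degrees. The response values below are hypothetical placeholders.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

data = pd.DataFrame({
    "temp":     [300, 300, 450, 450, 375, 375, 375, 269, 481, 375, 375],
    "time":     [30,  50,  30,  50,  40,  40,  40,  40,  40,  26,  54],
    "internal": [169, 185, 186, 205, 191, 189, 186, 165, 207, 170, 201],  # hypothetical
})

# Full quadratic model: main effects, interaction and squared terms
model = smf.ols("internal ~ temp + time + temp:time + I(temp**2) + I(time**2)",
                data=data).fit()

grid = pd.DataFrame(
    [(t, m) for t in np.linspace(300, 450, 31) for m in np.linspace(30, 50, 21)],
    columns=["temp", "time"],
)
grid["pred"] = model.predict(grid)
print(grid.loc[(grid["pred"] - 200).abs() < 1])  # combinations predicted near 200 degrees
```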

8 Cake Baking Experiment
Some Thoughts on the Cake Baking Experiments
We don't particularly care about the individual experimental runs; we use the set of them to understand the effects of time and cooking temperature on internal temperature.
Since the design is created so that we can independently quantify the effects of time and cooking temperature in the most efficient manner, we need all of the experimental runs to accomplish this.
We use the repeatability of replicate runs as a ruler to assess the significance of the effects of cooking time and temperature.
By randomizing the order of experimental runs we make sure that the effects of any uncontrollable factors are not mixed up with the effects of cooking time and temperature.

9 Statistically Designed Experiments
How are they different?
In a statistically designed experiment, factors (often ingredient levels and/or processing conditions) are systematically varied so that their effects can be quantified in an efficient way.
Statistical analysis of designed studies focuses on the development of models that relate how factors affect the responses.
These models can provide a wide variety of information, such as:
an ordering of important formulation/processing variables
an area of interest for further study
optimal product formulations (subject to cost and production constraints, if applicable)
specifications

10 Experimental Design Types
There are several types of statistically designed experiments, each focusing on a different objective.
Screening designs are used to reduce the number of factors under investigation and to identify factor ranges for further study.
Response Surface designs are used to develop a model of the space defined by the factor ranges (optimization).
Mixture designs have factor levels that add to a fixed total; the objective can be screening or optimization.
Robust (Taguchi) designs are used to determine the factor settings that reduce variability due to factors that are costly, difficult or impossible to control.
Step 2 in my cake baking was a response surface design.

11 Screening Design
Most often have 2 levels for each factor.
Designs are factorial combinations – the complete set or a fraction.
Example: 8 factors are studied in a 16 run Fractional Factorial design.
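A minimal sketch of how a 16-run fractional factorial for 8 two-level factors can be constructed. The generator columns used here (E=BCD, F=ACD, G=ABC, H=ABD) are one standard choice for a resolution IV 2^(8-4) design, not necessarily the one shown on the slide.

```python
# Minimal sketch: build a 16-run fractional factorial for 8 two-level factors.
# A full 2^4 factorial in A-D supplies the 16 runs; the remaining factors are
# generated from interaction columns (one standard set of generators).
from itertools import product

import pandas as pd

base = pd.DataFrame(list(product([-1, 1], repeat=4)), columns=list("ABCD"))

design = base.copy()
design["E"] = base["B"] * base["C"] * base["D"]
design["F"] = base["A"] * base["C"] * base["D"]
design["G"] = base["A"] * base["B"] * base["C"]
design["H"] = base["A"] * base["B"] * base["D"]

print(design)   # 16 runs, 8 columns of +/-1 factor settings
```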

12 Screening Design
Step 1 in my cake baking design was a screening design.
Independently vary only 2 levels of each factor.

13 Response Surface Designs
Most often have a small number of factors varied over 3-5 levels.
Common design types have all continuous factors:
Central Composite
Box-Behnken
Custom designs can be created to include discontinuous factors:
D-optimal design
Example: five factors are investigated in a 28 run Central Composite design.
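A minimal sketch of how a central composite design is assembled from factorial, axial and center points in coded units; the rotatable axial distance and the three center points are illustrative choices, not values taken from the slides.

```python
# Minimal sketch: assemble a central composite design for k factors in coded
# units (factorial cube points, axial "star" points at +/- alpha, center points).
from itertools import product

import numpy as np
import pandas as pd

k = 3
alpha = (2 ** k) ** 0.25          # rotatable axial distance (illustrative choice)
cube = np.array(list(product([-1, 1], repeat=k)), dtype=float)
axial = np.vstack([alpha * sign * np.eye(k)[i]
                   for i in range(k) for sign in (-1, 1)])
center = np.zeros((3, k))          # replicated center points

design = pd.DataFrame(np.vstack([cube, axial, center]),
                      columns=[f"x{i+1}" for i in range(k)])
print(design)  # 8 cube + 6 axial + 3 center = 17 runs in coded units
```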

14 Response Surface Designs
Step 2 in my cake baking was a response surface design.
There are 5 levels for each factor, more than the 3 levels that are the minimum required to fit a response surface model.
Covering the design space more densely is sometimes done.

15 Mixture Designs
Factors sum to a fixed total.
Common design types: Simplex Lattice.
Example: mixture design varying tequila, triple sec, and lime juice to determine the optimal Margarita.
(Figure: ternary plot with axes for tequila, triple sec and lime juice.)
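A minimal sketch of enumerating simplex-lattice blends for a three-component mixture; the lattice degree and the component names mapped to the cocktail example are illustrative assumptions.

```python
# Minimal sketch: enumerate a {3, m} simplex-lattice design, i.e. all blends of
# three components whose proportions are multiples of 1/m and sum to 1.
from itertools import product

import pandas as pd

def simplex_lattice(n_components: int, degree: int) -> pd.DataFrame:
    points = [
        tuple(level / degree for level in combo)
        for combo in product(range(degree + 1), repeat=n_components)
        if sum(combo) == degree
    ]
    cols = [f"x{i+1}" for i in range(n_components)]
    return pd.DataFrame(points, columns=cols)

# Tequila, triple sec and lime juice proportions on a degree-4 lattice
blends = simplex_lattice(3, 4).rename(
    columns={"x1": "tequila", "x2": "triple_sec", "x3": "lime_juice"})
print(blends)  # 15 candidate blends, each row summing to 1
```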

16 Robust Designs
Objective is robustness to "noise": variation that is costly, difficult or impossible to control.
Example: make a macaroni and cheese product robust to consumer preparation variation.
Prep 1 - overheating the sauce and under-draining the pasta (thin consistency)
Prep 2 - under-heating the sauce and over-draining the pasta (thick consistency)
Both the average and the variation across the noise conditions are analyzed.
Screening designs are typically used.

17 Analysis of Experimental Designs
Models relate design factors to responses, expressed in the form of an equation.
Example – simple model for a 3 factor experiment:
The X's represent the factor settings; the betas are coefficients that are estimated from the data from the experiment.
A small beta for a specific factor indicates that the factor has a small effect on the response; a large beta indicates a large effect.
The sign of the beta indicates the direction of the effect.
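The equation itself is an image that is not reproduced in this transcript; a main-effects model for a 3 factor experiment, as described above, has the form:

```latex
Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_3 + \varepsilon
```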

18 Experimental Design Models
Models can be very simple or much more complex.
Simple model - main effects model.
More complex model - main effects and pairwise interactions model.
Even more complex model - full quadratic model.
(The three equations are sketched below.)
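The equations are images that did not survive into this transcript; in the notation above they have the standard forms:

```latex
% Main effects only
Y = \beta_0 + \sum_{i} \beta_i X_i + \varepsilon

% Main effects and pairwise interactions
Y = \beta_0 + \sum_{i} \beta_i X_i + \sum_{i<j} \beta_{ij} X_i X_j + \varepsilon

% Full quadratic
Y = \beta_0 + \sum_{i} \beta_i X_i + \sum_{i<j} \beta_{ij} X_i X_j + \sum_{i} \beta_{ii} X_i^2 + \varepsilon
```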

19 Objectives, Designs and Models
Designs are linked to project objectives; models are linked to design types.
Screening designs are based on simpler models – main effects models or ones containing some or all possible factor interaction pairs.
Response surface designs are based on more complex models – full quadratic models or variants of these.
More data are needed to fit more complex models.
The key to models that fit well and are easy to explain and understand is a design that makes the factor main effects and interactions independent of each other.

20 Independence
Varying factors independently means that we can separate their effects in the analysis model.
A poor design: we cannot separate the effect of factor 1 and factor 2 – we say that these factors are confounded.
Run  Factor 1  Factor 2
1    50        0.5
2    60        1.0
3    70        1.5

21 Independence
What does this mean? A better design: we can now separate the effects of factor 1 and factor 2 – they are no longer confounded.
Run  Factor 1  Factor 2
1    50        0.5
2    50        1.5
3    70        0.5
4    70        1.5
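A minimal sketch of checking whether two design columns are confounded, using the two small designs above (as reconstructed here); the numpy correlation check is an illustrative diagnostic, not a method named on the slides.

```python
# Minimal sketch: check whether two design factors can be separated by looking
# at the correlation between their columns. A correlation of +/-1 means the
# factors are completely confounded; 0 means they are orthogonal (independent).
import numpy as np

poor = np.array([[50, 0.5],
                 [60, 1.0],
                 [70, 1.5]])
better = np.array([[50, 0.5],
                   [50, 1.5],
                   [70, 0.5],
                   [70, 1.5]])

for name, design in [("poor", poor), ("better", better)]:
    r = np.corrcoef(design[:, 0], design[:, 1])[0, 1]
    print(f"{name} design: correlation between factor 1 and factor 2 = {r:.2f}")
# poor design: 1.00 (confounded); better design: 0.00 (independent)
```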

22 Independence
A full factorial design (all possible factor level combinations) guarantees that we will be able to separate out the main effects and pairwise interactions of the factors.
In some instances a fraction of the set of all possible factor level combinations will allow separation of all main effects and pairwise interactions.
We can also use a fraction if fitting a simpler model is what is desired.
If even just one design run is missing, the design may be compromised, as the independence of the factors will likely be lost!

23 Design Resolution
A design is characterized by its resolution – how complicated a model can we fit to the response data?
Resolution 3 designs – main effects are not confounded with other main effects but are confounded with pair-wise interactions.
Resolution 4 designs – main effects are not confounded with other main effects or pair-wise interactions; pair-wise interactions are confounded with each other.
Resolution 5 designs – there is no confounding between main effects, between pair-wise interactions, or between main effects and pair-wise interactions.

24 Design Strategy
Trade-off between available resources and information.
Designs that provide the most information require the most runs; designs with smaller numbers of runs may not provide as much information as we would like.
Example: to investigate 6 factors you could use an 8 run resolution 3 design, a 16 run resolution 4 design, or a 32 run resolution 5 design.

25 Replication
Replicating design points informs us about the reliability of the results.
Would you make the same conclusions about the experimental results for both of these studies?
Replication variability is a ruler that we use to judge the significance of the size of the factor effects.
(Figure: two time-by-temperature data plots; in the first the factor effects are much larger than the variability in the repeated runs, in the second they are not.)
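A minimal sketch of the "ruler" idea: compare the size of an estimated effect to the spread of replicated runs. The replicate values echo the center-point readings shown earlier; the effect estimate and the crude 3-sigma comparison are hypothetical illustrations, not the slide's analysis.

```python
# Minimal sketch of the "ruler" idea: compare an estimated factor effect to the
# run-to-run variability seen in replicated points. Numbers are illustrative.
import numpy as np

center_replicates = np.array([191.0, 189.0, 186.0])   # repeated runs at one setting
replicate_sd = center_replicates.std(ddof=1)           # our "ruler"

effect_estimate = 20.0   # e.g. hypothetical change in response from low to high temperature
print(f"replicate SD = {replicate_sd:.1f}, effect = {effect_estimate:.1f}")
print("effect is large relative to noise"            # crude comparison for illustration
      if abs(effect_estimate) > 3 * replicate_sd
      else "effect is not clearly larger than noise")
```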

26 Randomization
The order in which we make the experimental runs can be important.
Factors external to the design may influence the results, for example:
several raw material batches may be needed to complete the design runs
there may be a time sequence effect from the start to the end of the experiment
Randomization helps keep the effects of any external factors separate from the factors in the experiment.
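A minimal sketch of randomizing the execution order of a planned design; the use of numpy and the fixed seed are illustrative choices.

```python
# Minimal sketch: generate a randomized run order for a planned design,
# e.g. the 16-run screening design built earlier. The seed is only so the
# example is reproducible.
import numpy as np

n_runs = 16
rng = np.random.default_rng(seed=2015)
run_order = rng.permutation(n_runs) + 1   # 1-based randomized execution order
print(run_order)
# Execute the design rows in this shuffled order so that drifts or batch
# effects are not systematically aligned with any factor column.
```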

27 Randomization
Example: 3 factor 8 run full factorial experiment.
4 runs can be made from each batch of raw material.
Randomized order: the effect of raw material batch is spread out across the factor main effects and interactions.
(Table: randomized run order showing run, batch and the +/-1 settings of Factors 1-3.)

28 Blocking
Sometimes randomization isn't enough!
Certain factors external to the design may have such strong effects that they need to be considered in the experimental design rather than dealt with through randomization.
Including these factors as blocks means confounding the block levels with higher order factor interactions.
More runs may be needed to accommodate a blocking factor.

29 Blocking
Example: 3 factor 8 run full factorial experiment.
4 runs can be made from each batch of raw material, which we know alone will have a large effect.
Batch is confounded with the three-way interaction of Factor 1, Factor 2 and Factor 3 (see the sketch below).
(Table: run, batch assignment and the +/-1 settings of Factors 1-3.)
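A minimal sketch of the blocking scheme described on the slide, in coded units; the pandas construction is an illustrative assumption.

```python
# Minimal sketch: split a 2^3 full factorial into two blocks (raw material
# batches) by confounding the batch with the three-way interaction, as
# described on the slide.
from itertools import product

import pandas as pd

design = pd.DataFrame(list(product([-1, 1], repeat=3)),
                      columns=["Factor1", "Factor2", "Factor3"])
design["Batch"] = design["Factor1"] * design["Factor2"] * design["Factor3"]
# Batch = -1 runs come from one raw material batch, Batch = +1 from the other;
# the batch effect is then aliased only with the three-way interaction.
print(design.sort_values("Batch"))   # two blocks of 4 runs each
```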

30 Model Fitting
Regression analysis is used to fit the design model to the response data.
Example – model fitting results from a 3 factor central composite design.
Factors: steam pressure, pump pressure and line speed. Response: viscosity.
The model for this design is:
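The equation is not reproduced in this transcript; for a 3 factor central composite design it is presumably the full quadratic model, which has the form:

```latex
Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_3
    + \beta_{12} X_1 X_2 + \beta_{13} X_1 X_3 + \beta_{23} X_2 X_3
    + \beta_{11} X_1^2 + \beta_{22} X_2^2 + \beta_{33} X_3^2 + \varepsilon
```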

31 Model Fitting
Regression model results. The output annotations point out:
the percentage of response variability explained by the model (R-squared)
the significance test for the model – p-values < .05 are often considered significant
the total response variability and the response variability explained by the model

32 Model Fitting
Regression model results. The output annotations point out the estimates of the individual model coefficients and the significance test for each; p-values < .05 are often considered significant.

33 Model Fitting Pairwise interactions represented in a matrix plot
Significant interaction between factor 1 and factor 2

34 Model Assessment
Typical assessment tools used for other linear regression models apply here:
residuals appear to be normally distributed
no patterns in residual plots
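A minimal sketch of the usual residual diagnostics, continuing the hypothetical `model` fit in the earlier sketch; matplotlib and scipy are assumed tools, not the presentation's software.

```python
# Minimal sketch: residual diagnostics for the fitted model from the earlier
# sketch (assumes `model` is the statsmodels result object created there).
import matplotlib.pyplot as plt
import scipy.stats as stats

resid = model.resid
fitted = model.fittedvalues

fig, axes = plt.subplots(1, 2, figsize=(8, 3))
stats.probplot(resid, dist="norm", plot=axes[0])     # normal probability plot
axes[0].set_title("Normal probability plot of residuals")
axes[1].scatter(fitted, resid)                       # look for patterns vs fitted values
axes[1].axhline(0, linestyle="--")
axes[1].set_xlabel("Fitted values")
axes[1].set_ylabel("Residuals")
axes[1].set_title("Residuals vs fitted")
plt.tight_layout()
plt.show()
```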

35 Model Refinement
Is this a critical next step?
Often regression models are refined to remove terms that are not statistically significant.
This is not as critical for designed experiments – why?
Designs are created so that the factors are independent of each other.
The conclusions of factor significance will not change very much.
Model predictions will be similar for the full and refined models.
So is it necessary? Maybe preferred.
Sometimes software makes working with multiple responses difficult, as they may have different significant effects.

36 Model Refinement
Analysis results for the full and refined models (output shown side by side):
estimates are the same or very nearly so
p-values are directionally similar

37 Model Verification
How do we know we can depend on a model's predictions? This is a critical step!
Example: I want to use my viscosity model to identify factor level combinations that achieve a target of 4300.
There are many possibilities! (Shown for line speed = 22, 31 and 40.)

38 Model Verification
It is critical that factor level combinations of interest be verified!
Factor level combinations should be run, responses measured and then compared to the model predictions to verify the model's accuracy.
Model verification is helped when some of the original runs from the experiment are repeated; this verifies that the new runs match the original ones.
Maybe the differences are not due to the model but to the second execution!
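A minimal sketch of the verification step, continuing the hypothetical `model` fit from the earlier sketch; the verification settings and measured values are illustrative placeholders.

```python
# Minimal sketch: verify model predictions at new factor settings by comparing
# predicted to measured responses (assumes `model` from the earlier sketch).
import pandas as pd

verification_runs = pd.DataFrame({
    "steam": [0.5, -0.5],
    "pump":  [0.0,  0.8],
    "speed": [1.0, -1.0],
})
verification_runs["predicted"] = model.predict(verification_runs)
verification_runs["measured"] = [4390.0, 4185.0]     # hypothetical new measurements
verification_runs["difference"] = (verification_runs["measured"]
                                   - verification_runs["predicted"])
print(verification_runs)
# Large, systematic differences would indicate the model (or its second
# execution) cannot be relied on in this part of the design space.
```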

39 How to Judge a Paper With a DOE
Things to consider about the design:
Does the design match the objective (or stated conclusions)?
You cannot optimize with a screening design.
Need at least three levels for each continuous factor for optimization.
Need to be able to fit at minimum a full quadratic model for optimization.
How was the design executed?
Randomization or a restricted randomization would be best. If not, is there a compelling reason? This should be mentioned or discussed in the paper's text.
Is there replication? This helps demonstrate the validity of the results.

40 How to Judge a Paper With a DOE
Things to consider about the analysis:
How much of the analysis is shared in the paper?
Response data
ANOVA or regression output (p-values)
Visualization of the factor effects
Is it clear that the authors understand the output in the analysis? Is there a focus on just one or two things in the output (like R²)?
Is there any assessment of the models and how well they fit the data?
Residual analyses
Plot of observed by predicted
Have the models been refined (insignificant terms removed from the model)?
Is there any verification of the models' ability to predict? Are predictions made within the design space?

41 Assessments of Individual Papers
Scored as: Good (G), Fair (F), Poor (P), None (N)
Papers assessed (Examples 1-7): S. Demirkol et al. (2006); S.E. Lumor and C.C. Akoh (2005); C.F. Torres et al. (2002); J.R. Orives et al. (2014); two JAOCS papers; N. Anarjan et al. (2014).
Criteria: Appropriate Design, Replication, Randomization, Output Presentation, Output Discussion, Model Assessment, Model Validation.

42 Example 1 (2006) OPTIMIZATION OF ENZYMATIC ALCOHOLYSIS OF SOYBEAN OIL
Design – a face-centered cube (central composite design) with 3 center-point replicates is appropriate for optimization.
Observed response data, predictions and residuals are included in the paper.
No discussion of design execution (randomization).

43 Example 1 (2006) OPTIMIZATION OF ENZYMATIC ALCOHOLYSIS OF SOYBEAN OIL
Statistical analysis results include model coefficients and p-values.
The authors have refined the model by removing statistically insignificant terms.
The model results are visualized in contour plots.

44 Example 1 (2006) OPTIMIZATION OF ENZYMATIC ALCOHOLYSIS OF SOYBEAN OIL
Only the barest assessment of the fit of the model: a plot of observed by predicted (the residuals are the differences between the points and the line).
No validation of the predicted optimum, and it is outside the design space: the optimum level for the enzyme/oil weight ratio is 0.09, while the range of this factor in the design is 0.10 to 0.20.

45 Assessments of Individual Papers
Scored as: Good (G), Fair (F), Poor (P), None (N)
Running score table, updated for Example 1 – grades shown: Appropriate Design G, Randomization N, Model Assessment P.

46 Example 2 (2005) INCORPORATION OF STEARIC ACID INTO PO:PKO
Design – a three factor central composite design (in a sphere) with 9 (OMG!) center-point replicates is appropriate for optimization.
Two design runs were "excluded from model fitting", explained as outliers. These runs are critical for fitting the model!
No mention of randomization.

47 Example 2 (2005) INCORPORATION OF STEARIC ACID INTO PO:PKO
Statistical analysis results include model coefficients and p-values.
The authors have refined the model by removing statistically insignificant terms.
The model results are visualized in contour plots.

48 Example 2 (2005) INCORPORATION OF STEARIC ACID INTO PO:PKO
Model assessment: normal probability plot of residuals and observed by predicted plot

49 Example 2 (2005) INCORPORATION OF STEARIC ACID INTO PO:PKO
Model validation with 5 additional runs

50 Assessments of Individual Papers
Scored as: Good (G), Fair (F), Poor (P), None (N)
Running score table, updated through Example 2 – grades shown: Appropriate Design G, P; Randomization N; Model Assessment F.

51 Example 3 (2002) OPTIMIZATION OF THE ACIDOLYSIS OF FISH OIL WITH CLA
Design: 2² full factorial designs with center-points at each of 3 levels of a third factor – this is not appropriate for a true optimization since curvature cannot be fit for two of the factors.
Replication and randomization are included in the design.

52 Example 3 (2002) OPTIMIZATION OF THE ACIDOLYSIS OF FISH OIL WITH CLA
Statistical analysis results include model coefficients and p-values.
There is no evidence that any model assessment was performed.

53 Example 3 (2002) OPTIMIZATION OF THE ACIDOLYSIS OF FISH OIL WITH CLA
There is a detailed discussion of the modeling results, and they are visualized in a number of informative plots.
The authors' use and description of desirability functions to assess multiple responses demonstrates understanding of their modeling.
There was no validation of the optimized conditions.

54 Assessments of Individual Papers
Scored as: Good (G), Fair (F), Poor (P), None (N)
Running score table, updated through Example 3 – grades shown: Appropriate Design G, G/P, P; Randomization N; Model Assessment F.

55 Example 4 (2014) Experimental Design Applied for Cost and Efficiency of Antioxidants in Biodiesel
Design - a simplex centroid design is appropriate for mixture optimization.
Replication of the center-point is included in the design.
No mention of randomization.

56 Example 4 (2014) Experimental Design Applied for Cost and Efficiency of Antioxidants in Biodiesel
Only very basic analysis results are presented in the paper.
No mention of either model assessment or refinement.

57 Example 4 (2014) Experimental Design Applied for Cost and Efficiency of Antioxidants in Biodiesel
Visualization of the response surface in a three-dimensional plot is more sexy than informative (a contour plot would be more useful).
The authors do a nice job of describing in the text the factor effects and their implications for both the response of most interest and cost.

58 Example 4 (2014) Experimental Design Applied for Cost and Efficiency of Antioxidants in Biodiesel
The authors use multi-response optimizer functionality in their software to determine a cost-effective blend.
There is no validation of the blend, though there is a reference that the prediction is close to the observed response at a nearby design point.

59 Assessments of Individual Papers
Scored as: Good (G), Fair (F), Poor (P), None (N)
Running score table, updated through Example 4 – grades shown: Appropriate Design G, G/P, P; Replication G+, F; Randomization N.

60 Example 5 JAOCS Methanolysis of Hevea brasiliensis oil by SO3H-MCM-41 catalyst: Box-Behnken experimental design
Design - a Box-Behnken design limits the area where the model can be used to predict (no corner points in the 4-dimensional hypercube).
There is no mention of randomization.

61 Example 5 JAOCS Methanolysis of Hevea brasiliensis oil by SO3H-MCM-41 catalyst: Box-Behnken experimental design
There are no statistical analysis results presented in the paper, only the regression equation.
The equation looks to be in coded units for the factors (the coefficients are similar in scale though the factors themselves are not).
There is no evidence of a statistical assessment of the significance of the factor effects.

62 Example 5 JAOCS Methanolysis of Hevea brasiliensis oil by SO3H-MCM-41 catalyst: Box-Behnken experimental design
Only the barest assessment of the fit of the model: a plot of observed by predicted (the residuals are the differences between the points and the line).

63 Example 5 JAOCS Methanolysis of Hevea brasiliensis oil by SO3H-MCM-41 catalyst: Box-Behnken experimental design
The discussion of the factor effects is based only on a visual evaluation of 3-dimensional plots of the response surface.
Main effect and interaction plots would make it easier for readers to understand the nature of the factors and their interactions.
The optimum is validated with three replicated runs.

64 Assessments of Individual Papers
Scored as: Good (G), Fair (F), Poor (P), None (N)
Running score table, updated through Example 5 – grades shown: Appropriate Design G, G/P, P, F; Replication G+; Randomization N.

65 Example 6 JAOCS TRANSESTERIFICATION OF SANITATION WASTE FOR BIODIESEL PRODUCTION USING RESPONSE SURFACE METHODOLOGY
The 2 factor Central Composite Design (the authors call it a 3x3 factorial, which is also correct) is appropriate for optimization.
There is no mention of randomization.

66 Example 6 JAOCS TRANSESTERIFICATION OF SANITATION WASTE FOR BIODIESEL PRODUCTION USING RESPONSE SURFACE METHODOLOGY
ANOVA results for all responses are provided in the paper.
It is evident that the authors have refined the models by removing non-significant terms, and they provide the final prediction equations.
There is no assessment of the fit of the models (not even R²!), so we do not know how accurately the equations will predict.

67 Example 6 JAOCS TRANSESTERIFICATION OF SANITATION WASTE FOR BIODIESEL PRODUCTION USING RESPONSE SURFACE METHODOLOGY
The identified optimum values are all design points, so it appears that the developed models were not used to determine these.
There appears to be an error in this one, since 40 is outside the design space, but there is a data point at Temp = 60 that corresponds to this value (and the text neglects to mention that the responses at the replicate points are 91.9 and 92.2).

68 Example 6 JAOCS TRANSESTERIFICATION OF SANITATION WASTE FOR BIODIESEL PRODUCTION USING RESPONSE SURFACE METHODOLOGY
Visualization of the response surface in a three-dimensional plot is more sexy than informative (a contour plot would be more useful).
The discussion of the factor effects is based only on a visual evaluation of 3-dimensional plots of the response surface.
Main effect and interaction plots would make it easier for readers to understand the nature of the factors and their interactions.

69 Assessments of Individual Papers
Scored as: Good (G), Fair (F), Poor (P), None (N)
Running score table, updated through Example 6 – grades shown: Appropriate Design G, G/P, P, F; Replication G+; Randomization N.

70 Example 7 (2014) Optimization of Mixing Parameters for α-Tocopherol Nanodispersions Prepared Using Solvent Displacement Method
A 3 factor Central Composite Design in a sphere is appropriate for optimization.
Though not explicitly mentioned in the paper, the design execution appears to be randomized.
Observed and predicted response values are included in the table.

71 Example 7 (2014) Optimization of Mixing Parameters for α-Tocopherol Nanodispersions Prepared Using Solvent Displacement Method
ANOVA results for all responses are provided in the paper.
It is evident that the authors have refined the models by removing non-significant terms, and they provide the final prediction equations.

72 Example 7 (2014) Optimization of Mixing Parameters for α-Tocopherol Nanodispersions Prepared Using Solvent Displacement Method
There is discussion of model assessment through R², a lack-of-fit test and observed-versus-predicted plots.

73 Example 7 (2014) Optimization of Mixing Parameters for α-Tocopherol Nanodispersions Prepared Using Solvent Displacement Method
Contour plots and a multi-response optimization plot help the readers understand how the optimum factor levels were determined.
The predicted optimum was validated with an additional experimental run.

74 Assessments of Individual Papers
Scored as: Good (G), Fair (F), Poor (P), None (N)
Final score table for all seven examples – grades shown: Appropriate Design G, G/P, P, F; Replication G+; Randomization N.

