1 DESIGN OF EXPERIMENTS by R. C. Baker How to gain 20 years of experience in one short week!
2 Role of DOE in Process Improvement DOE is a formal mathematical method for systematically planning and conducting scientific studies that change experimental variables together in order to determine their effect on a given response. DOE makes controlled changes to input variables in order to gain the maximum amount of information on cause-and-effect relationships with a minimum sample size.
3 Role of DOE in Process Improvement DOE is more efficient than the standard approach of changing “one variable at a time” in order to observe the variable’s impact on a given response. DOE generates information on the effect various factors have on a response variable and in some cases may be able to determine optimal settings for those factors.
4 Role of DOE in Process Improvement DOE encourages “brainstorming” activities associated with discussing key factors that may affect a given response and allows the experimenter to identify the “key” factors for future studies. DOE is readily supported by numerous statistical software packages available on the market.
5 BASIC STEPS IN DOE Four elements associated with DOE: 1. The design of the experiment, 2. The collection of the data, 3. The statistical analysis of the data, and 4. The conclusions reached and recommendations made as a result of the experiment.
6 TERMINOLOGY Replication – repetition of a basic experiment without changing any factor settings. It allows the experimenter to estimate the experimental error (noise) in the system, which is used to determine whether observed differences in the data are “real” or “just noise”, and to obtain more statistical power (ability to identify small effects).
7 TERMINOLOGY Randomization – a statistical tool used to minimize potential uncontrollable biases in the experiment by randomly assigning material, people, the order in which experimental trials are conducted, or any other factor not under the control of the experimenter. It results in “averaging out” the effects of the extraneous factors that may be present in order to minimize the risk of these factors affecting the experimental results.
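Randomizing the run order can be sketched in a few lines of Python. This is only an illustration: the function name `randomized_run_order`, the two-factor treatment list, and the seed are assumptions, not part of the original slides.

```python
import random

def randomized_run_order(treatments, replicates=1, seed=None):
    """Return every treatment combination `replicates` times, in random order."""
    rng = random.Random(seed)  # a fixed seed makes the plan reproducible
    runs = [t for t in treatments for _ in range(replicates)]
    rng.shuffle(runs)
    return runs

# Hypothetical two-factor experiment, each factor at a low and a high level
treatments = [("low", "low"), ("high", "low"), ("low", "high"), ("high", "high")]
plan = randomized_run_order(treatments, replicates=2, seed=1)
for run_number, (a, b) in enumerate(plan, start=1):
    print(run_number, a, b)
```

Running the trials in the printed order, rather than in the tidy order of the design table, is what spreads any drift over time across all treatment combinations.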
8 TERMINOLOGY Blocking – technique used to increase the precision of an experiment by breaking the experiment into homogeneous segments (blocks) in order to control any potential block to block variability (multiple lots of raw material, several shifts, several machines, several inspectors). Any effects on the experimental results as a result of the blocking factor will be identified and minimized.
9 TERMINOLOGY Confounding – a concept that basically means that multiple effects are tied together into one parent effect and cannot be separated. For example: 1. Two people flipping two different coins would result in the effect of the person and the effect of the coin being confounded. 2. As experiments get large, higher order interactions (discussed later) are confounded with lower order interactions or main effects.
10 TERMINOLOGY Factors – experimental factors or independent variables (continuous or discrete) an investigator manipulates to capture any changes in the output of the process. Other factors of concern are those that are uncontrollable and those which are controllable but held constant during the experimental runs.
11 TERMINOLOGY Responses – dependent variable measured to describe the output of the process. Treatment Combinations (run) – experimental trial where all factors are set at a specified level.
12 TERMINOLOGY Fixed Effects Model - If the treatment levels are specifically chosen by the experimenter, then conclusions reached will only apply to those levels. Random Effects Model – If the treatment levels are randomly chosen from a population of many possible treatment levels, then conclusions reached can be extended to all treatment levels in the population.
13 PLANNING A DOE Everyone involved in the experiment should have a clear idea in advance of exactly what is to be studied, the objectives of the experiment, the questions one hopes to answer and the results anticipated
14 PLANNING A DOE Select a response/dependent variable (variables) that will provide information about the problem under study and the proposed measurement method for this response variable, including an understanding of the measurement system variability
15 PLANNING A DOE Select the independent variables/factors (quantitative or qualitative) to be investigated in the experiment, the number of levels for each factor, and the levels of each factor chosen either specifically (fixed effects model) or randomly (random effects model).
16 PLANNING A DOE Choose an appropriate experimental design (relatively simple design and analysis methods are almost always best) that will allow your experimental questions to be answered once the data is collected and analyzed, keeping in mind tradeoffs between statistical power and economic efficiency. At this point it is generally useful to simulate the study by generating and analyzing artificial data to ensure that the experimental questions can be answered as a result of conducting your experiment.
17 PLANNING A DOE Perform the experiment (collect data), paying particular attention to such things as randomization and measurement system accuracy, while maintaining as uniform an experimental environment as possible. How the data are to be collected is a critical stage in DOE.
18 PLANNING A DOE Analyze the data using the appropriate statistical model, ensuring that attention is paid to checking the model accuracy by validating the underlying assumptions associated with the model. Be liberal in the utilization of all tools, including graphical techniques, available in the statistical software package to ensure that a maximum amount of information is generated.
19 PLANNING A DOE Based on the results of the analysis, draw conclusions/inferences about the results, interpret the physical meaning of these results, determine the practical significance of the findings, and make recommendations for a course of action including further experiments
20 SIMPLE COMPARATIVE EXPERIMENTS Single Mean Hypothesis Test Difference in Means Hypothesis Test with Equal Variances Difference in Means Hypothesis Test with Unequal Variances Difference in Variances Hypothesis Test Paired Difference in Mean Hypothesis Test One Way Analysis of Variance
21 CRITICAL ISSUES ASSOCIATED WITH SIMPLE COMPARATIVE EXPERIMENTS How Large a Sample Should We Take? Why Does the Sample Size Matter Anyway? What Kind of Protection Do We Have Associated with Rejecting “Good” Stuff? What Kind of Protection Do We Have Associated with Accepting “Bad” Stuff?
22 Single Mean Hypothesis Test After a production run of 12 oz. bottles, concern is expressed about the possibility that the average fill is too low. $H_0: \mu = 12$, $H_a: \mu \neq 12$, level of significance $\alpha = 0.05$, sample size $n = 9$. SPEC FOR THE MEAN: $12 \pm 0.1$
23 Single Mean Hypothesis Test Sample mean = 11.9 Sample standard deviation = 0.15 Sample size = 9 Computed t statistic = -2.0 P-Value = 0.0805162 CONCLUSION: Since P-Value > 0.05, you fail to reject the null hypothesis and ship the product.
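The t statistic and P-Value on this slide can be checked from the summary statistics alone. The sketch below uses only the Python standard library and gets the two-sided p-value by numerically integrating the Student t density. The helper names `t_pdf` and `two_sided_p` are assumptions for illustration, and a statistics package would normally do this step.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def two_sided_p(t, df, steps=2000):
    """Two-sided p-value, integrating the density from 0 to |t| by Simpson's rule."""
    b = abs(t)
    h = b / steps
    total = t_pdf(0, df) + t_pdf(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_area = 2 * (h / 3) * total  # P(-|t| < T < |t|)
    return 1 - central_area

# Summary statistics from the bottle-fill example
mean, sd, n, mu0 = 11.9, 0.15, 9, 12.0
t_stat = (mean - mu0) / (sd / math.sqrt(n))
p_value = two_sided_p(t_stat, df=n - 1)
print(round(t_stat, 3), round(p_value, 4))
```

The statistic is the observed shortfall (0.1 oz) divided by the standard error of the mean (0.15 / 3 = 0.05), which is where the value of -2.0 comes from.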
24 Single Mean Hypothesis Test Power Curve
25 Single Mean Hypothesis Test Power Curve - Different Sample Sizes
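The power curves themselves are not reproduced in this transcript. As a rough sketch of how power changes with sample size, the code below uses a normal approximation that treats the standard deviation as known, so it somewhat overstates the power of the actual t test. Using the bottle example's 0.15 as sigma and a 0.1 oz shift as the difference to detect are assumptions made for illustration.

```python
import math
from statistics import NormalDist

def approx_power(delta, sigma, n, alpha=0.05):
    """Approximate power of a two-sided test of a mean (normal approximation)."""
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    shift = abs(delta) * math.sqrt(n) / sigma  # true shift in standard-error units
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)

# Power to detect a 0.1 oz shift in mean fill at several sample sizes
for n in (9, 16, 25, 36):
    print(n, round(approx_power(delta=0.1, sigma=0.15, n=n), 3))
```

With these assumed inputs, 9 bottles give only about a coin-flip chance of detecting a 0.1 oz shift, which is one way to read "why does the sample size matter anyway?".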
26 DIFFERENCE IN MEANS - EQUAL VARIANCES $H_0: \mu_1 = \mu_2$, $H_a: \mu_1 \neq \mu_2$, level of significance $\alpha = 0.05$, sample sizes both = 15. Assumption: $\sigma_1 = \sigma_2$. Sample means = 11.8 and 12.1 Sample standard deviations = 0.1 and 0.2 Sample sizes = 15 and 15
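A minimal sketch of the pooled (equal-variance) t statistic for these summary statistics follows. The function name `pooled_t` is an assumption, and the critical value is the usual tabled two-sided 5% value for 28 degrees of freedom rather than something computed here.

```python
import math

def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Pooled (equal-variance) two-sample t statistic and its degrees of freedom."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / df
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / se, df

t_stat, df = pooled_t(11.8, 0.1, 15, 12.1, 0.2, 15)
T_CRIT_28 = 2.048  # tabled two-sided 5% critical value for 28 degrees of freedom
print(round(t_stat, 3), df, abs(t_stat) > T_CRIT_28)
```

Under the equal-variance assumption, a difference of 0.3 is several standard errors wide, so the test would reject the hypothesis of equal means at the 5% level.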
27 DIFFERENCE IN MEANS - EQUAL VARIANCES Can you detect this difference?
28 DIFFERENCE IN MEANS - EQUAL VARIANCES
29 DIFFERENCE IN MEANS - UNEQUAL VARIANCES Same as the “Equal Variance” case except the variances are not assumed equal. How do you know if it is reasonable to assume that variances are equal OR unequal?
30 DIFFERENCE IN VARIANCE HYPOTHESIS TEST Same example as Difference in Means: Sample standard deviations = 0.1 and 0.2 Sample sizes = 15 and 15. Null hypothesis: ratio of variances = 1.0 Alternative: not equal Computed F statistic = 0.25 P-Value = 0.0140071 Reject the null hypothesis for alpha = 0.05.
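The F ratio and its two-sided p-value can be reproduced with the standard library by integrating the F density numerically. The helpers `f_pdf` and `f_cdf` are illustrative assumptions, not a library API. Because the test rejects equal variances, the sketch also computes the Welch-Satterthwaite degrees of freedom that an unequal-variance t test would use. That quantity is not shown on the slides.

```python
import math

def f_pdf(x, d1, d2):
    """Density of the F distribution with (d1, d2) degrees of freedom."""
    log_beta = math.lgamma(d1 / 2) + math.lgamma(d2 / 2) - math.lgamma((d1 + d2) / 2)
    const = math.exp(-log_beta) * (d1 / d2) ** (d1 / 2)
    return const * x ** (d1 / 2 - 1) * (1 + d1 * x / d2) ** (-(d1 + d2) / 2)

def f_cdf(x, d1, d2, steps=2000):
    """P(F <= x) by Simpson's rule; adequate when d1 > 2 (finite density at 0)."""
    h = x / steps
    total = f_pdf(0.0, d1, d2) + f_pdf(x, d1, d2)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f_pdf(i * h, d1, d2)
    return (h / 3) * total

sd1, sd2, n1, n2 = 0.1, 0.2, 15, 15
f_stat = sd1**2 / sd2**2  # ratio of sample variances
lower = f_cdf(f_stat, n1 - 1, n2 - 1)
p_value = 2 * min(lower, 1 - lower)  # two-sided p-value

# Welch-Satterthwaite degrees of freedom for the unequal-variance t test
v1, v2 = sd1**2 / n1, sd2**2 / n2
welch_df = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
print(round(f_stat, 2), round(p_value, 4), round(welch_df, 1))
```

The Welch degrees of freedom come out below the pooled test's 28, which is the price paid for not assuming equal variances.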
31 DIFFERENCE IN VARIANCE HYPOTHESIS TEST Can you detect this difference?
32 DIFFERENCE IN VARIANCE HYPOTHESIS TEST - POWER CURVE
33 PAIRED DIFFERENCE IN MEANS HYPOTHESIS TEST Two different inspectors each measure 10 parts on the same piece of test equipment. Null hypothesis: DIFFERENCE IN MEANS = 0.0 Alternative: not equal Computed t statistic = -1.22702 P-Value = 0.250944 Do not reject the null hypothesis for alpha = 0.05.
34 PAIRED DIFFERENCE IN MEANS HYPOTHESIS TEST - POWER CURVE
35 ONE WAY ANALYSIS OF VARIANCE Used to test the hypothesis that the means of several populations are equal. Example: A production line has 7 fill needles and you wish to assess whether or not the average fill is the same for all 7 needles. Experiment: sample 20 fills from each of the 7 needles and test at the 5% level of significance. $H_0: \mu_1 = \mu_2 = \cdots = \mu_7$
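The F statistic in a one-way ANOVA table compares the variation between group means with the variation within groups. The sketch below shows that calculation on a small made-up data set (three needles, three fills each). These numbers are illustrative only and are not the slide's data, which is not reproduced in this transcript.

```python
def one_way_anova(groups):
    """Return (F statistic, df between, df within) for a list of samples."""
    all_values = [y for g in groups for y in g]
    grand_mean = sum(all_values) / len(all_values)
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((y - m) ** 2 for g, m in zip(groups, means) for y in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Made-up fill volumes (oz) for three needles, NOT the slide's data
fills = [
    [11.9, 12.0, 12.1],
    [12.0, 12.1, 12.2],
    [12.3, 12.4, 12.5],
]
f_stat, df_b, df_w = one_way_anova(fills)
print(round(f_stat, 2), df_b, df_w)
```

For the slide's experiment (7 needles, 20 fills each) the same function would return 6 and 133 degrees of freedom, and the F statistic would be compared with the corresponding tabled critical value.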
36 RESULTS: ANALYSIS OF VARIANCE TABLE
37 SINCE NEEDLE MEANS ARE NOT ALL EQUAL, WHICH ONES ARE DIFFERENT? Multiple Range Tests for 7 Needles
38 VISUAL COMPARISON OF 7 NEEDLES
39 FACTORIAL ($2^k$) DESIGNS Experiments involving several factors (k = # of factors) where it is necessary to study the joint effect of these factors on a specific response. Each of the factors is set at two levels (a “low” level and a “high” level), which may be qualitative (machine A/machine B, fan on/fan off) or quantitative (temperature 80°/temperature 90°, line speed 4000 per hour/line speed 5000 per hour).
40 FACTORIAL ($2^k$) DESIGNS Factors are assumed to be fixed (fixed effects model). Designs are completely randomized (experimental trials are run in a random order, etc.). The usual normality assumptions are satisfied.
41 FACTORIAL ($2^k$) DESIGNS Particularly useful in the early stages of experimental work when you are likely to have many factors being investigated and you want to minimize the number of treatment combinations (sample size) but, at the same time, study all k factors in a complete factorial arrangement (the experiment collects data at all possible combinations of factor levels).
42 FACTORIAL ($2^k$) DESIGNS As k gets large, the sample size increases exponentially. If the experiment is replicated, the number of runs increases again.
43 FACTORIAL ($2^k$) DESIGNS (k = 2) Two factors set at two levels (normally referred to as low and high) would result in the following design where each level of factor A is paired with each level of factor B.
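The design table for this slide is not reproduced in the transcript. A full two-level factorial can be generated as below, using the common coding of -1 for low and +1 for high. The function name `full_factorial` and the factor labels are assumptions for illustration.

```python
from itertools import product

def full_factorial(factor_names):
    """All 2^k combinations of coded levels -1 (low) and +1 (high)."""
    runs = []
    for levels in product((-1, 1), repeat=len(factor_names)):
        runs.append(dict(zip(factor_names, levels)))
    return runs

design = full_factorial(["A", "B"])
for run in design:
    print(run)

# Run counts grow as 2^k, and replication multiplies them again
for k in range(2, 6):
    print(k, 2**k, "runs unreplicated;", 2 * 2**k, "with 2 replicates")
```

Each level of factor A appears once with each level of factor B, which is what makes the arrangement a complete factorial.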
44 FACTORIAL ($2^k$) DESIGNS (k = 2) Estimating main effects associated with changing the level of each factor from low to high. This is the estimated effect on the response variable associated with changing factor A or B from its low to its high value.
45 FACTORIAL ($2^k$) DESIGNS (k = 2): GRAPHICAL OUTPUT Neither Factor A nor Factor B has an effect on the response variable.
46 FACTORIAL ($2^k$) DESIGNS (k = 2): GRAPHICAL OUTPUT Factor A has an effect on the response variable, but Factor B does not.
47 FACTORIAL ($2^k$) DESIGNS (k = 2): GRAPHICAL OUTPUT Factor A and Factor B have an effect on the response variable.
48 FACTORIAL ($2^k$) DESIGNS (k = 2): GRAPHICAL OUTPUT Factor B has an effect on the response variable, but only if factor A is set at the “High” level. This is called interaction and it basically means that the effect one factor has on a response depends on the levels at which the other factors are set. Interactions can be major problems in a DOE if you fail to account for them when designing your experiment.
49 EXAMPLE: FACTORIAL ($2^k$) DESIGNS (k = 2) A microbiologist is interested in the effect of two different culture mediums [medium 1 (low) and medium 2 (high)] and two different times [10 hours (low) and 20 hours (high)] on the growth rate of a particular CFU [Bugs].
50 EXAMPLE: FACTORIAL ($2^k$) DESIGNS (k = 2) Since two factors are of interest, k = 2, and we would need the following four runs resulting in
51 EXAMPLE: FACTORIAL ($2^k$) DESIGNS (k = 2) Estimates for the medium and time effects are: Medium effect = [(15 + 39)/2] – [(17 + 38)/2] = -0.5; Time effect = [(38 + 39)/2] – [(17 + 15)/2] = 22.5
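The same effect estimates can be computed programmatically. The run table is not reproduced in this transcript, so the four responses below are inferred from the effect formulas above. The helper name `effect` is an assumption, and the interaction estimate at the end is an extra calculation not shown on the slide.

```python
# Responses keyed by coded (medium, time) levels; values inferred from the
# effect formulas above, since the run table itself is not reproduced here.
growth = {(-1, -1): 17, (1, -1): 15, (-1, 1): 38, (1, 1): 39}

def effect(responses, contrast):
    """Average response where contrast is +1 minus average where it is -1."""
    high = [y for x, y in responses.items() if contrast(x) == 1]
    low = [y for x, y in responses.items() if contrast(x) == -1]
    return sum(high) / len(high) - sum(low) / len(low)

medium_effect = effect(growth, lambda x: x[0])
time_effect = effect(growth, lambda x: x[1])
interaction = effect(growth, lambda x: x[0] * x[1])  # product of the two columns
print(medium_effect, time_effect, interaction)
```

The first two numbers match the hand calculation, while the third is small compared with the time effect for these data.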
52 EXAMPLE: FACTORIAL ($2^k$) DESIGNS (k = 2)
53 EXAMPLE: FACTORIAL ($2^k$) DESIGNS (k = 2) A statistical analysis using the appropriate statistical model would result in the following information. Factor A (medium) and Factor B (time)
54 EXAMPLE: CONCLUSIONS In statistical language, one would conclude that factor A (medium) is not statistically significant at a 5% level of significance since the p-value is greater than 5% (0.05), but factor B (time) is statistically significant at a 5% level of significance since this p-value is less than 5%.
55 EXAMPLE: CONCLUSIONS In layman's terms, this means that we have no evidence that would allow us to conclude that the medium used has an effect on the growth rate, although it may well have an effect (in which case our conclusion would be incorrect).
56 EXAMPLE: CONCLUSIONS Additionally, we have evidence that would allow us to conclude that time does have an effect on the growth rate, although it may well not have an effect (in which case our conclusion would be incorrect).
57 EXAMPLE: CONCLUSIONS In general we control the likelihood of reaching these incorrect conclusions by the selection of the level of significance for the test and the amount of data collected (sample size).
58 $2^k$ DESIGNS (k > 2) As the number of factors increases, the number of runs needed for a complete factorial experiment increases dramatically. The following $2^k$ design layouts depict the number of runs needed for values of k from 2 to 5. For example, when k = 5, it will take $2^5 = 32$ experimental runs for the complete factorial experiment.
59 Interactions for $2^k$ Designs (k = 3) Interactions between various factors can be estimated for the designs above by multiplying the appropriate columns together and then subtracting the average response for the lows from the average response for the highs.
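The column-multiplication rule can be sketched for k = 3 as follows. The responses in `y` are hypothetical values chosen only to make the arithmetic visible, and the helper names `column` and `effect` are assumptions.

```python
from itertools import product

# Coded 2^3 design: one row per run, columns A, B, C
runs = list(product((-1, 1), repeat=3))

def column(*factors):
    """Elementwise product of the chosen factor columns (0=A, 1=B, 2=C)."""
    col = []
    for run in runs:
        sign = 1
        for f in factors:
            sign *= run[f]
        col.append(sign)
    return col

def effect(col, y):
    """Average response at +1 minus average response at -1."""
    highs = [v for s, v in zip(col, y) if s == 1]
    lows = [v for s, v in zip(col, y) if s == -1]
    return sum(highs) / len(highs) - sum(lows) / len(lows)

# Hypothetical responses, one per run, purely for illustration
y = [10, 12, 11, 13, 14, 20, 15, 21]
print("AB column:", column(0, 1))
print("AB effect:", effect(column(0, 1), y))
print("AC effect:", effect(column(0, 2), y))
```

An interaction column built this way is +1 in half of the runs and -1 in the other half, so its effect is estimated exactly like a main effect.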
60 Interactions for $2^k$ Designs (k = 3)
61 $2^k$ DESIGNS (k > 2) Once the effects for all factors and interactions are determined, you are able to develop a prediction model to estimate the response for specific values of the factors. In general, we will do this with statistical software, but for these designs, you can do it by hand calculations if you wish.
62 $2^k$ DESIGNS (k > 2) For example, if there are no significant interactions present, you can estimate a response by the following formula. (for quantitative factors only)
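The formula from this slide is not reproduced in the transcript. A commonly used form for two-level designs without interactions, given here as an assumption and not as the author's exact expression, codes each quantitative factor so that low is -1 and high is +1 and uses half of each estimated effect $E_i$ as its coefficient:

```latex
\hat{y} = \bar{y} + \sum_{i=1}^{k} \frac{E_i}{2}\, x_i ,
\qquad
x_i = \frac{X_i - \left(X_{i,\mathrm{high}} + X_{i,\mathrm{low}}\right)/2}
           {\left(X_{i,\mathrm{high}} - X_{i,\mathrm{low}}\right)/2}
```

Here $\bar{y}$ is the average of all responses, $X_i$ is the factor setting in its natural units, and $x_i$ is the same setting in coded units. The coefficient is half the effect because moving from low to high changes $x_i$ by two units.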
63 ONE FACTOR EXAMPLE
64 ONE FACTOR EXAMPLE The output shows the results of fitting a general linear model to describe the relationship between GRADE and #HRS STUDY. The equation of the fitted general model is GRADE = 29.3 + 3.1* (#HRS STUDY) The fitted orthogonal model is GRADE = 75 + 15 * (SCALED # HRS)
65 Two Level Screening Designs Suppose that your brainstorming session resulted in 7 factors that various people think “might” have an effect on a response. A full factorial design would require $2^7 = 128$ experimental runs without replication. The purpose of screening designs is to reduce the number of factors down to the “major” role players with a minimal number of experimental runs. One way to do this is to use the $2^3$ full factorial design and use the interaction columns for the additional factors.
66 Note that: * Any factor d effect is now confounded with the a*b interaction * Any factor e effect is now confounded with the a*c interaction * etc. * What is the d*e interaction confounded with?
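One way to explore the question is to multiply the sign columns directly. The sketch assumes the assignments stated above (d on the a*b column, e on the a*c column) and uses a hypothetical helper `times`.

```python
from itertools import product

runs = list(product((-1, 1), repeat=3))  # base 2^3 design in a, b, c
a = [r[0] for r in runs]
b = [r[1] for r in runs]
c = [r[2] for r in runs]

def times(u, v):
    """Elementwise product of two sign columns."""
    return [x * y for x, y in zip(u, v)]

d = times(a, b)  # factor d is assigned to the a*b column
e = times(a, c)  # factor e is assigned to the a*c column

de = times(d, e)
print("d*e column:", de)
print("b*c column:", times(b, c))
print("identical:", de == times(b, c))
```

Because a*a is +1 in every run, d*e = (a*b)(a*c) reduces to b*c, so the d*e interaction cannot be separated from the b*c interaction, nor from any factor that is later assigned to the b*c column.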
67 Problems that Interactions Cause! Interactions – If interactions exist and you fail to account for this, you may reach erroneous conclusions. Suppose that you plan an experiment with four runs and three factors resulting in the following data:
68 Problems that Interactions Cause! Factor A Effect = 0 Factor B Effect = 0 In this example, if you were assuming that “smaller is better” then it appears to make no difference where you set factors A and B. If you were to set factor A at the low value and factor B at the low value, your response variable would be larger than desired. In this case there is a factor A interaction with factor B.
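The data table for this example is not reproduced in the transcript. The sketch below uses made-up responses with the same pattern (both main effects are zero, yet the response depends strongly on the combination of settings) to show how an interaction can hide behind zero main effects.

```python
# Hypothetical responses keyed by coded (A, B) levels; the slide's data table
# is not reproduced in this transcript, so these values are made up.
y = {(-1, -1): 20, (1, -1): 10, (-1, 1): 10, (1, 1): 20}

def effect(responses, contrast):
    """Average response where contrast is +1 minus average where it is -1."""
    high = [v for x, v in responses.items() if contrast(x) == 1]
    low = [v for x, v in responses.items() if contrast(x) == -1]
    return sum(high) / len(high) - sum(low) / len(low)

a_effect = effect(y, lambda x: x[0])
b_effect = effect(y, lambda x: x[1])
ab_effect = effect(y, lambda x: x[0] * x[1])
print(a_effect, b_effect, ab_effect)
```

With “smaller is better”, these made-up data favor setting exactly one factor high. Reading only the two main effects would suggest the settings do not matter.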
69 Problems that Interactions Cause!
70 Resolution of a Design Resolution III Designs – No main effects are aliased with any other main effect, BUT some (or all) main effects are aliased with two factor interactions. Resolution IV Designs – No main effects are aliased with any other main effect OR two factor interaction, BUT two factor interactions may be aliased with other two factor interactions. Resolution V Designs – No main effect OR two factor interaction is aliased with any other main effect or two factor interaction, BUT two factor interactions are aliased with three factor interactions.
71 Common Screening Designs Fractional Factorial Designs – the total number of experimental runs must be a power of 2 (4, 8, 16, 32, 64, …). If you believe first order interactions are small compared to main effects, then you could choose a resolution III design. Just remember that if you have major interactions, it can mess up your screening experiment.
72 Common Screening Designs Plackett-Burman Designs – Two level, resolution III designs used to study up to n-1 factors in n experimental runs, where n is a multiple of 4 ( # of runs will be 4, 8, 12, 16, …). Since n may be quite large, you can study a large number of factors with moderately small sample sizes. (n = 100 means you can study 99 factors with 100 runs)
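As an illustration, the 12-run Plackett-Burman design (up to 11 factors) can be built from cyclic shifts of a single generating row plus one all-low run. The generating row below is the one commonly tabulated for n = 12. It is not given in the slides, so the sketch checks the resulting columns for balance and orthogonality rather than taking the row on trust.

```python
# Commonly tabulated generating row for the 12-run Plackett-Burman design
GENERATOR_12 = [1, 1, -1, 1, 1, 1, -1, -1, -1, 1, -1]

def plackett_burman_12():
    """Build 12 runs x 11 factor columns from cyclic shifts plus an all-low run."""
    n = len(GENERATOR_12)
    rows = [GENERATOR_12[-shift:] + GENERATOR_12[:-shift] for shift in range(n)]
    rows.append([-1] * n)
    return rows

design = plackett_burman_12()

def dot(i, j):
    """Inner product of factor columns i and j; 0 means they are orthogonal."""
    return sum(row[i] * row[j] for row in design)

balanced = all(sum(row[j] for row in design) == 0 for j in range(11))
orthogonal = all(dot(i, j) == 0 for i in range(11) for j in range(i + 1, 11))
print(len(design), "runs,", len(design[0]), "factors")
print("balanced:", balanced, "orthogonal:", orthogonal)
```

Orthogonal columns let each main effect be estimated independently of the others. As a resolution III design, it still leaves main effects aliased with two factor interactions.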
73 Other Design Issues You may want to collect data at center points to estimate non-linear responses. More than two levels of a factor is no problem (multi-level factorial). What do you do if you want to build a non-linear model to “optimize” the response (hit a target, maximize, or minimize)? This is called response surface modeling.
77 CLASSROOM EXERCISE STUDENT IN-CLASS EXPERIMENT: Collect data for experiment to determine factor settings (two factors) to hit a target response (spot on wall). Factor A – height of shaker (low and high) Factor B – location of shaker (close to hand and close to wall) Design experiment – would suggest several replications
78 CLASSROOM EXERCISE Conduct Experiment – student holds 3 foot “pin the tail on the donkey” stick and attempts to hit the target. An observer will assist to mark the hit on the target. Collect data – students take data home for week and come back with what you would recommend AND why. YOU TELL THE CLASS HOW TO PLAY THE GAME TO “WIN”.