
1 Chapter 11 Analysis of Variance General Objective: The quantity of information contained in a sample is affected by various factors. This chapter introduces three different experimental designs, two of which are direct extensions of the unpaired and paired designs. A new technique called analysis of variance is used to determine how the different experimental factors affect the average response.

2 Specific Topics
1. The analysis of variance
2. The completely randomized design
3. Tukey’s method of paired comparisons
4. The randomized block design
5. Factorial experiments

3 11.1 The Design of an Experiment
- The way that a sample is selected is called the sampling plan or experimental design and determines the amount of information in the sample.
- Some research involves an observational study, in which the researcher does not actually produce the data but only observes the characteristics of data that already exist.
- The sampling plan is a plan for collecting the data.
- Experimentation: The researcher deliberately imposes one or more experimental conditions on the experimental units in order to determine their effect on the response.

4 Definitions:
- An experimental unit is the object on which a measurement (or measurements) is taken.
- A factor is an independent variable whose values are controlled and varied by the experimenter, e.g., drug dosage.
- A level is the intensity setting of a factor, e.g., 5 ml of Drug A.
- A treatment is a specific combination of factor levels, e.g., 5 ml of Drug A and 10 ml of Drug B.
- The response is the variable being measured by the experimenter, e.g., degree of pain relief.

5 Example 11.1 A group of people is randomly divided into an experimental and a control group. The control group is given an aptitude test after having eaten a full breakfast. The experimental group is given the same test without having eaten any breakfast. What are the factors, levels, and treatments in this experiment? Solution The experimental units are the people on which the response (test score) is measured. The factor of interest could be described as “meal” and has two levels: “breakfast” and “no breakfast.” Since this is the only factor controlled by the experimenter, the two levels—“breakfast” and “no breakfast”—also represent the treatments of interest in the experiment.

6 Example 11.2 Suppose that the experimenter in Example 11.1 began by randomly selecting 20 men and 20 women for the experiment. These two groups were then randomly divided into 10 each for the experimental and control groups. What are the factors, levels, and treatments in this experiment? Solution Now there are two factors of interest to the experimenter, and each factor has two levels:
- “Gender” at two levels: men and women
- “Meal” at two levels: breakfast and no breakfast

7 11.2 What Is an Analysis of Variance?
- Responses exhibit variability.
- In an analysis of variance (ANOVA), the total variation in the response measurements is divided into portions that may be attributed to various factors, e.g., the amount of variation due to Drug A and the amount due to Drug B.
- The ANOVA formalizes the comparison of the variability of the measurements within experimental groups with the variability between the groups.

8 You can better understand the logic underlying an analysis of variance by looking at the simple experiment below. Consider two sets of samples randomly selected from populations 1 and 2, where both sets share the same pair of population means (Figure 11.1).
- Is it easier to detect the difference in the two means when you look at set A or set B?
- You will probably agree that set A shows the difference much more clearly. In set A, the variability of the measurements within the groups is much smaller.
- In set B, there is more variability within the groups, despite the identical difference in the means.
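This effect can be demonstrated numerically. Below is a minimal sketch with simulated (hypothetical) data and scipy, an assumed tool choice: both sets share the same pair of means, but the set with the smaller within-group spread yields a far larger F statistic and a smaller p-value.

```python
# A minimal sketch (hypothetical data): the same difference in means is
# easier to detect when the within-group variability is small.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
mu1, mu2 = 10.0, 12.0            # identical pair of means for both sets

# Set A: small within-group spread; Set B: large within-group spread
for label, sigma in [("Set A", 0.5), ("Set B", 3.0)]:
    sample1 = rng.normal(mu1, sigma, size=10)
    sample2 = rng.normal(mu2, sigma, size=10)
    f_stat, p_value = stats.f_oneway(sample1, sample2)
    print(f"{label}: F = {f_stat:.2f}, p-value = {p_value:.4f}")
```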

9 11.3 The Assumptions for an Analysis of Variance
- Analysis of variance procedures are fairly robust when sample sizes are equal and when the data are fairly mound-shaped.
Assumptions of ANOVA Test and Estimation Procedures:
- The observations within each population are normally distributed with a common variance σ².
- Assumptions regarding the sampling procedures are specified for each design.

10 ANOVA for three different experimental designs:
1. Independent random sampling from several populations—an extension of the unpaired t test.
2. An extension of the paired-difference or matched pairs design—random assignment of treatments within matched sets of observations.
3. A design that allows you to judge the effect of two experimental factors on the response.

11 11.4 The Completely Randomized Design: A One-Way Classification
- A completely randomized design is one in which random samples are selected independently from each of k populations.
Example 11.3 A researcher is interested in the effects of five types of insecticide for use in controlling the boll weevil in cotton fields. Explain how to implement a completely randomized design to investigate the effects of the five insecticides on crop yield. Solution The only way to generate the equivalent of five random samples from the hypothetical populations corresponding to the five insecticides is to use random assignment.

12 A fixed number of cotton plants are chosen for treatment, and each is assigned a random number. Suppose that each sample is to have an equal number of measurements. Using a randomization device, you can assign the first n plants chosen to receive insecticide 1, the second n plants to receive insecticide 2, and so on, until all five treatments have been assigned.
- The completely randomized design has these characteristics:
- It involves only one factor, and so is designated a one-way classification, e.g., dosage level for Drug A only.
- There are k levels corresponding to the k populations, which are also the treatments for this design, e.g., five possible dosage levels—5, 10, 15, 20, and 25 ml of Drug A.
- The question of interest: are the k population means all the same, or is at least one different from the others?
- The ANOVA compares the population means simultaneously rather than pair by pair as with Student’s t.

13 11.5 The Analysis of Variance for a Completely Randomized Design Suppose you want to compare k population means based on independent random samples of size n1, n2, …, nk from normal populations with a common variance σ². The populations have the same shape, but potentially different locations.

14 Partitioning the Total Variation in an Experiment Let x_ij be the jth measurement (j = 1, 2, …, n_i) in the ith sample. The total sum of squares is
Total SS = Σ(x_ij − x̄)² = Σx_ij² − CM
If we let G represent the grand total of all n observations, then the correction for the mean is
CM = G²/n
This Total SS is partitioned into two components. The first component, called the sum of squares for treatments (SST), measures the variation among the k sample means:
SST = Σ(T_i²/n_i) − CM
where T_i is the total of the observations for treatment i.

15 The second component, called the sum of squares for error (SSE), is used to measure the pooled variation within the k samples:
SSE = (n_1 − 1)s_1² + (n_2 − 1)s_2² + … + (n_k − 1)s_k²
We can show algebraically that, in the analysis of variance,
Total SS = SST + SSE
Therefore, you need to calculate only two of the three sums of squares. Each of the three sources of variation, when divided by its appropriate degrees of freedom, provides an estimate of the variation in the experiment. Since Total SS involves n squared observations, its degrees of freedom are df = (n − 1).

16 Similarly, the sum of squares for treatments involves k squared totals, and its degrees of freedom are df = (k − 1). Finally, the sum of squares for error, a direct extension of the pooled estimate in Chapter 10, has
df = (n − 1) − (k − 1) = n − k
Notice that the degrees of freedom for the three sources of variation are additive—that is,
df(total) = df(treatments) + df(error)
The three sources of variation and their respective degrees of freedom are combined to form the mean squares as MS = SS/df. The total variation in the experiment is then displayed in an ANOVA table. See Example 11.4 for such a table.

17 ANOVA Table for k Independent Random Samples: Completely Randomized Design

Source      df       SS        MS                  F
Treatments  k − 1    SST       MST = SST/(k − 1)   MST/MSE
Error       n − k    SSE       MSE = SSE/(n − k)
Total       n − 1    Total SS
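The table can be computed directly from the shortcut formulas above. The sketch below uses made-up samples for k = 3 treatments; only the formulas stated in the text are assumed.

```python
# Sketch of the Total SS = SST + SSE partition for k = 3 hypothetical samples.
import numpy as np

samples = [np.array([9.0, 12.0, 10.0, 8.0, 11.0]),    # treatment 1
           np.array([14.0, 16.0, 13.0, 17.0, 15.0]),  # treatment 2
           np.array([11.0, 14.0, 13.0, 12.0, 15.0])]  # treatment 3

n = sum(len(s) for s in samples)           # total number of observations
k = len(samples)
G = sum(s.sum() for s in samples)          # grand total of all observations
CM = G**2 / n                              # correction for the mean

total_ss = sum((s**2).sum() for s in samples) - CM
SST = sum(s.sum()**2 / len(s) for s in samples) - CM  # among-treatment SS
SSE = total_ss - SST                                   # within-sample SS

MST = SST / (k - 1)
MSE = SSE / (n - k)
F = MST / MSE
print(f"SST={SST:.3f}  SSE={SSE:.3f}  Total SS={total_ss:.3f}  F={F:.3f}")
```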

18 Figure 11.3

19 Testing the Equality of the Treatment Means The mean squares in the analysis of variance table can be used to test the null hypothesis
H0: μ1 = μ2 = … = μk
The alternative hypothesis is Ha: At least one of the means is different from the others, using the following theoretical argument:
F Test for Comparing k Population Means
1. Null hypothesis: H0: μ1 = μ2 = … = μk
2. Alternative hypothesis: Ha: One or more pairs of population means differ
3. Test statistic: F = MST/MSE, where F is based on df1 = (k − 1) and df2 = (n − k)

20 4. Rejection region: Reject H0 if F > Fα, where Fα lies in the upper tail of the F distribution (with df1 = k − 1 and df2 = n − k), or if p-value < α
Assumptions:
- The samples are randomly and independently selected from their respective populations.
- The populations are normally distributed with means μ1, μ2, …, μk and equal variances, σ1² = σ2² = … = σk² = σ².

21 Remember that σ² is the common variance for all k populations. The quantity MSE = SSE/(n − k) is a pooled estimate of σ², a weighted average of all k sample variances, whether or not H0 is true. If H0 is true, then the variation in the sample means, measured by MST = SST/(k − 1), also provides an unbiased estimate of σ². However, if H0 is false and the population means are different, then MST—which measures the variance in the sample means—is unusually large, as shown in Figure 11.4. The test statistic F = MST/MSE tends to be larger than usual if H0 is false. Hence, you can reject H0 for large values of F, using a right-tailed statistical test. When H0 is true, this test statistic has an F distribution with df1 = (k − 1) and df2 = (n − k) degrees of freedom, and right-tailed critical values of the F distribution can be used.
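For the rejection step, a sketch of how the critical value Fα and the p-value might be obtained with scipy; the F value, k, and n below are hypothetical.

```python
# Sketch: compare F to the upper-tail critical value F_alpha, or
# equivalently compute the p-value (df1 = k - 1, df2 = n - k).
from scipy import stats

F, k, n, alpha = 5.00, 3, 15, 0.05         # hypothetical values
df1, df2 = k - 1, n - k
F_crit = stats.f.ppf(1 - alpha, df1, df2)  # upper-tail critical value
p_value = stats.f.sf(F, df1, df2)          # right-tail area above F
print(f"F_crit = {F_crit:.2f}, p-value = {p_value:.4f}")
# Reject H0 if F > F_crit, i.e., if p_value < alpha
```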

22 Figure 11.4 Sample means drawn from identical versus different populations

23 Example 11.6 The researcher in Example 11.4 believes that students who have no breakfast will have significantly shorter attention spans but that there may be no difference between those who eat a light or full breakfast. Find a 95% confidence interval for the average attention span for students who eat no breakfast, as well as a 95% confidence interval for the difference in the average attention spans for light versus full breakfast eaters. Solution With s² = MSE = 5.9333, so that s = 2.436, and with df = (n − k) = 12, you can calculate the two confidence intervals:

24
- For no breakfast:
x̄1 ± t.025 (s/√n1) = 9.4 ± 2.179(2.436/√5) = 9.4 ± 2.37
or between 7.03 and 11.77 minutes.
- For a light versus a full breakfast:
(x̄2 − x̄3) ± t.025 s √(1/n2 + 1/n3) = 1.0 ± 2.179(2.436)√(2/5) = 1.0 ± 3.36
a difference of between −2.36 and 4.36 minutes.
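A sketch reproducing these two intervals, assuming equal group sizes of n_i = 5 (so that df = 15 − 3 = 12) and the sample means implied by the quoted endpoints (9.4 minutes for no breakfast; a difference of 1.0 minute for light versus full):

```python
# Sketch of the two 95% confidence intervals from Example 11.6.
from math import sqrt
from scipy import stats

MSE, df, n_i = 5.9333, 12, 5
s = sqrt(MSE)
t = stats.t.ppf(0.975, df)                    # t_.025 with 12 df = 2.179

half1 = t * s / sqrt(n_i)                     # single-mean interval
print(f"no breakfast: 9.4 +/- {half1:.2f}")   # 7.03 to 11.77 minutes

half2 = t * s * sqrt(1/n_i + 1/n_i)           # difference of two means
print(f"light - full: 1.0 +/- {half2:.2f}")   # -2.36 to 4.36 minutes
```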

25 You can see that the second confidence interval does not indicate a difference in the average attention spans for students who ate light versus full breakfasts, as the researcher suspected. If the researcher, because of prior beliefs, wishes to test the other two possible pairs of means—none versus light breakfast, and none versus full breakfast—the methods given in Section 10.6 should be used for testing all three pairs.

26 Figure 11.6 Boxplots for Example 11.6

27 How Do I Know Whether My Calculations Are Accurate?
1. When calculating sums of squares, be certain to carry at least six significant figures before performing subtractions.
2. Remember, sums of squares can never be negative.
3. Always check your ANOVA table to make certain that the degrees of freedom sum to the total degrees of freedom (n − 1) and that the sums of squares sum to Total SS.

28 11.6 Ranking Population Means
- If two means differ by more than the yardstick ω, you conclude that the pair of population means differ.
- The studentized range is the difference between the smallest and the largest means in a set of k sample means. Tukey’s method of paired comparisons keeps the probability of declaring that a difference exists between at least one pair in a set of k treatment means, when no difference exists, at most equal to α.
- Tukey’s method assumes that the sample means are independent and based on samples of equal size.

29 Yardstick for Making Comparisons:
ω = q_α(k, df) (s/√n_t)
where
k = number of treatments
s² = MSE = estimator of the common variance σ², with df = number of degrees of freedom for s²
n_t = common sample size—that is, the number of observations in each of the k treatment means
q_α(k, df) = tabulated value from Tables 11(a) and 11(b) in Appendix I, for α = .05 and .01, respectively, and for various combinations of k and df
Rule: Two population means are judged to differ if the corresponding sample means differ by ω or more.

30 As noted in the display, the values of q_α(k, df) are listed in Tables 11(a) and 11(b) in Appendix I for α = .05 and .01, respectively. A portion of Table 11(a) in Appendix I is reproduced in Table 11.2. To illustrate the use of Tables 11(a) and 11(b), suppose you want to make pairwise comparisons of k = 5 means with α = .05 for an analysis of variance, where s² possesses 9 df. The tabulated value for k = 5, df = 9, and α = .05, shaded in Table 11.2, is q.05(5, 9) = 4.76.
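Where Table 11 is not at hand, the same q value can be obtained from scipy's studentized range distribution; a sketch follows, with s² and n_t chosen arbitrarily for illustration.

```python
# Sketch of Tukey's yardstick. scipy's studentized_range distribution
# stands in for Table 11: q_alpha(k, df) is its upper-tail alpha point.
from math import sqrt
from scipy.stats import studentized_range

k, df, n_t, alpha = 5, 9, 4, 0.05             # hypothetical design
s2 = 2.5                                      # s2 = MSE, made-up value
q = studentized_range.ppf(1 - alpha, k, df)   # about 4.76 for k=5, df=9
omega = q * sqrt(s2 / n_t)                    # the yardstick
print(f"q = {q:.2f}, omega = {omega:.2f}")
# Declare a pair of means different if |mean_i - mean_j| >= omega
```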

31 11.7 The Randomized Block Design: A Two-Way Classification
- The randomized block design is a direct extension of the matched pairs or paired-difference design.
- k treatment means are compared.
- The design uses blocks of k experimental units that are relatively similar or homogeneous, with one unit within each block randomly assigned to each treatment. If the design involves k treatments within each of b blocks, then the total number of observations is n = bk.
- The purpose of blocking is to remove or isolate the block-to-block variability that might hide the effect of the treatments.

32 11.8 The Analysis of Variance for a Randomized Block Design
- The randomized block design identifies two factors—treatments and blocks. See Example 11.8.
Partitioning the Total Variation in the Experiment Let x_ij be the response when the ith treatment (i = 1, 2, …, k) is applied in the jth block (j = 1, 2, …, b). The total variation in the n = bk observations is
Total SS = Σ(x_ij − x̄)² = Σx_ij² − CM
This is partitioned into three (rather than two) parts in such a way that
Total SS = SSB + SST + SSE

33
- SSB (sum of squares for blocks) measures the variation among the block means.
- SST (sum of squares for treatments) measures the variation among the treatment means.
- SSE (sum of squares for error) measures the variation of the differences among the treatment observations within blocks, which measures the experimental error.
Calculating the Sums of Squares for a Randomized Block Design, k Treatments in b Blocks
CM = G²/n
where G = Σx_ij = total of all n = bk observations

34 Total SS = (sum of squares of all x-values) − CM
SST = (ΣT_i²)/b − CM
SSB = (ΣB_j²)/k − CM
SSE = Total SS − SST − SSB
with
T_i = total of all observations receiving treatment i, i = 1, 2, …, k
B_j = total of all observations in block j, j = 1, 2, …, b

35
- Each of the three sources of variation, when divided by the appropriate degrees of freedom, provides an estimate of the variation in the experiment. Since Total SS involves n = bk squared observations, its degrees of freedom are df = (n − 1). Similarly, SST involves k squared totals, and its degrees of freedom are df = (k − 1), while SSB involves b squared totals and has (b − 1) degrees of freedom. Finally, since the degrees of freedom are additive, the remaining degrees of freedom associated with SSE can be shown algebraically to be df = (b − 1)(k − 1). The four sources of variation and their respective degrees of freedom are combined to form the mean squares as MS = SS/df.
- The total variation in the experiment is then displayed in an analysis of variance table as:

36 ANOVA Table for a Randomized Block Design, k Treatments and b Blocks

Source      df              SS        MS                          F
Treatments  k − 1           SST       MST = SST/(k − 1)           MST/MSE
Blocks      b − 1           SSB       MSB = SSB/(b − 1)           MSB/MSE
Error       (b − 1)(k − 1)  SSE       MSE = SSE/[(b − 1)(k − 1)]
Total       n − 1           Total SS
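A sketch of the full set of randomized block computations, using a small hypothetical data matrix with k = 3 treatments (rows) and b = 4 blocks (columns); only the formulas from the preceding slides are used.

```python
# Sketch of the randomized block partition Total SS = SST + SSB + SSE.
import numpy as np

x = np.array([[9.0, 11.0, 8.0, 10.0],     # treatment 1 across 4 blocks
              [12.0, 14.0, 11.0, 13.0],   # treatment 2
              [10.0, 13.0, 9.0, 12.0]])   # treatment 3
k, b = x.shape
n = b * k
G = x.sum()                                # grand total
CM = G**2 / n                              # correction for the mean

total_ss = (x**2).sum() - CM
SST = (x.sum(axis=1)**2).sum() / b - CM    # from treatment totals T_i
SSB = (x.sum(axis=0)**2).sum() / k - CM    # from block totals B_j
SSE = total_ss - SST - SSB

MST, MSB = SST / (k - 1), SSB / (b - 1)
MSE = SSE / ((b - 1) * (k - 1))
print(f"F(treatments) = {MST/MSE:.2f}, F(blocks) = {MSB/MSE:.2f}")
```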

37 Testing the Equality of the Treatment and Block Means The mean squares in the analysis of variance table can be used to test the null hypotheses
H0: No difference among the k treatment means
H0: No difference among the b block means
versus the alternative hypothesis
Ha: At least one of the means is different from the others
using a theoretical argument similar to the one we used for the completely randomized design. Remember that σ² is the common variance for the observations in all bk block–treatment combinations. The quantity MSE = SSE/(b − 1)(k − 1) is an unbiased estimate of σ² whether or not H0 is true.

38 The two mean squares, MST and MSB, estimate σ² only if H0 is true and tend to be unusually large if H0 is false and either the treatment or block means are different.
- The test statistics F = MST/MSE and F = MSB/MSE are used to test the equality of treatment and block means, respectively.
- Both statistics tend to be larger than usual if H0 is false. Hence, you can reject H0 for large values of F, using right-tailed critical values of the F distribution with appropriate degrees of freedom.

39 Tests for a Randomized Block Design:
For comparing treatment means:
1. Null hypothesis: H0: The treatment means are equal
2. Alternative hypothesis: Ha: At least two of the treatment means differ
3. Test statistic: F = MST/MSE, where F is based on df1 = (k − 1) and df2 = (b − 1)(k − 1)
4. Rejection region: Reject H0 if F > Fα, where Fα lies in the upper tail of the F distribution, or when the p-value < α
For comparing block means:
1. Null hypothesis: H0: The block means are equal
2. Alternative hypothesis: Ha: At least two of the block means differ
3. Test statistic: F = MSB/MSE, where F is based on df1 = (b − 1) and df2 = (b − 1)(k − 1)
4. Rejection region: Reject H0 if F > Fα, where Fα lies in the upper tail of the F distribution, or when the p-value < α

40 Comparing Treatment and Block Means: Use s² = MSE with df = (b − 1)(k − 1) to estimate σ² in comparing treatment and block means. (See Example 11.10.)
- Tukey’s yardstick for comparing block means:
ω = q_α(b, df) (s/√k)
- Tukey’s yardstick for comparing treatment means:
ω = q_α(k, df) (s/√b)
- (1 − α)100% confidence interval for the difference in two block means:
(B̄_i − B̄_j) ± t_{α/2} s √(2/k)
where B̄_i is the average of all observations in block i

41 -(1-  )100% confidence interval for the difference in two treatment means: where is the average of all observations in treatment i. Note: The values from Table 11 in Appendix I, t  /2 from Table 4 in Appendix I, and s 2  MSE all depend on df  (b  1)(k  1) degrees of freedom. © 1998 Brooks/Cole Publishing/ITP

42
- Some Cautionary Comments on Blocking:
- A randomized block design should not be used when treatments and blocks both correspond to experimental factors of interest to the researcher, e.g., dosage levels of two drugs.
- Remember that blocking may not always be beneficial.
- Remember that you cannot construct confidence intervals for individual treatment means unless it is reasonable to assume that the b blocks have been randomly selected from a population of blocks.

43 11.9 The a  b Factorial Experiment: A Two-Way Classification n You may be interested in two factors and also the interaction or relationship between the two factors, e.g., two drugs and their interactions. n Consider two different examples of interaction on responses in this situation. © 1998 Brooks/Cole Publishing/ITP

44 Example 11.11 Suppose that the two supervisors are each observed on three randomly selected days for each of the three different shifts. The average outputs for the three shifts are shown in Table 11.4 for each of the supervisors. Look at the relationship between the two factors in the line chart for these means, shown in Figure 11.10. Notice that supervisor 2 always produces a higher output, regardless of the shift. The two factors behave independently; that is, the output is always about 100 units higher for supervisor 2, no matter which shift you look at.

Table 11.4
Supervisor  Day  Swing  Night
1           487  498    550
2           602  602    637

45 Figure 11.10 Interaction plot for means in Table 11.4

46 Now consider another set of data for the same situation, shown in Table 11.5. There is a definite difference in the results, depending on which shift you look at, and the interaction can be seen in the crossed lines of the chart in Figure 11.11.

Table 11.5
Supervisor  Day  Swing  Night
1           602  498    450
2           487  602    657
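One quick numerical check for interaction is whether the supervisor effect is roughly constant across shifts; a sketch using the two tables above:

```python
# Sketch: in Table 11.4 the supervisor effect is nearly constant across
# shifts (parallel lines); in Table 11.5 it changes sign (interaction).
table_11_4 = {"sup1": [487, 498, 550], "sup2": [602, 602, 637]}
table_11_5 = {"sup1": [602, 498, 450], "sup2": [487, 602, 657]}

for name, t in [("Table 11.4", table_11_4), ("Table 11.5", table_11_5)]:
    diffs = [s2 - s1 for s1, s2 in zip(t["sup1"], t["sup2"])]
    print(name, "supervisor 2 minus supervisor 1, by shift:", diffs)
# Table 11.4 -> [115, 104, 87]: nearly parallel lines, little interaction
# Table 11.5 -> [-115, 104, 207]: crossed lines, strong interaction
```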

47 Figure 11.11 Interaction plot for means in Table 11.5

48 In a factorial experiment with a levels of the first factor and b levels of the second, there are a × b possible combinations of the factor levels, and these combinations form the treatments. For example, four dosages of Drug A and five dosages of Drug B result in 4 × 5 = 20 treatments.
- If you obtain two observations for each of the factor combinations of a complete factorial experiment, e.g., two subjects took 5 ml of Drug A and 10 ml of Drug B, then you have two replications of the experiment.

49 11.10 The Analysis of Variance for an a × b Factorial Experiment
- An analysis of variance for a two-factor experiment replicated r times follows the same pattern as the previous designs.
- If the letters A and B are used to identify the two factors, the total variation in the experiment is partitioned into four parts in such a way that
Total SS = SSA + SSB + SS(AB) + SSE
where
SSA (sum of squares for factor A) measures the variation among the factor A means.
SSB (sum of squares for factor B) measures the variation among the factor B means.
SS(AB) (sum of squares for interaction) measures the variation among the different combinations of factor levels.

50 SSE (sum of squares for error) measures the variation of the differences among the observations within each combination of factor levels—the experimental error.
- The sums of squares SSA and SSB are often called main effect sums of squares, to distinguish them from the interaction sum of squares.
- You can assume that there are:
- a levels of factor A
- b levels of factor B
- r replications for each of the ab factor combinations
- a total of n = abr observations.

51
- Finally, the equality of means for various levels of the factor combinations (the interaction effect) and for the levels of both main effects, A and B, can be tested using the ANOVA F tests, as shown below:
Tests for a Factorial Experiment:
For interaction:
1. Null hypothesis: H0: Factors A and B do not interact
2. Alternative hypothesis: Ha: Factors A and B interact
3. Test statistic: F = MS(AB)/MSE, where F is based on df1 = (a − 1)(b − 1) and df2 = ab(r − 1)
4. Rejection region: Reject H0 when F > Fα, where Fα lies in the upper tail of the F distribution, or when p-value < α

52 For main effects, factor A:
1. Null hypothesis: H0: There are no differences among the factor A means
2. Alternative hypothesis: Ha: At least two of the factor A means differ
3. Test statistic: F = MSA/MSE, where F is based on df1 = (a − 1) and df2 = ab(r − 1)
4. Rejection region: Reject H0 when F > Fα, or when p-value < α
For main effects, factor B:
1. Null hypothesis: H0: There are no differences among the factor B means
2. Alternative hypothesis: Ha: At least two of the factor B means differ
3. Test statistic: F = MSB/MSE, where F is based on df1 = (b − 1) and df2 = ab(r − 1)
4. Rejection region: Reject H0 when F > Fα, or when p-value < α
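In practice, these three F tests are rarely computed by hand. The sketch below uses statsmodels, an assumed tooling choice rather than the textbook's hand calculation, on hypothetical data with a = 2 supervisors, b = 3 shifts, and r = 3 replications.

```python
# Sketch of the a x b factorial F tests via statsmodels on made-up data.
import pandas as pd
import statsmodels.api as sm
from statsmodels.formula.api import ols

# Hypothetical data: a = 2 levels of A, b = 3 levels of B, r = 3 reps
df = pd.DataFrame({
    "A": ["s1"] * 9 + ["s2"] * 9,
    "B": (["day"] * 3 + ["swing"] * 3 + ["night"] * 3) * 2,
    "y": [602, 599, 605, 498, 495, 501, 450, 446, 454,
          487, 490, 484, 602, 605, 599, 657, 660, 654],
})

model = ols("y ~ C(A) * C(B)", data=df).fit()
print(sm.stats.anova_lm(model))  # rows: C(A), C(B), C(A):C(B), Residual
```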

53 If the interaction effect is significant, the differences in the treatment means can be further studied by looking at the a × b factor-level combinations.
- If the interaction effect is not significant, then the significance of the main effect means should be investigated.
- Use the overall F test and then Tukey’s method for paired comparisons and/or specific confidence intervals. Remember that these analysis of variance procedures always use s² = MSE as the best estimator of σ², with degrees of freedom df = ab(r − 1).

54 11.11 Revisiting the Analysis of Variance Assumptions
- Residual Plots
- Once the effects of all sources of variation have been removed, the “leftover” variability in each observation is called the residual for that data point.
- These residuals represent experimental error, the basic variability in the experiment.
- The residuals should have an approximately normal distribution with a mean of 0 and the same variance for each treatment group.

55
- Options for plotting residuals:
- The normal probability plot of residuals is a graph that plots the residuals for each observation against the expected value of that residual had it come from a normal distribution. If the residuals are approximately normal, the plot will closely resemble a straight line sloping upward at a 45-degree angle.
- The plot of residuals versus fit (or residuals versus variables) is a graph that plots the residuals against the fitted values predicted by the model. This plot should show a random scatter around the horizontal “zero error line”; a constant spread indicates no violation of the constant variance assumption.
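Both plots are straightforward to produce; a sketch with matplotlib and scipy follows, using stand-in fitted values and simulated residuals (hypothetical numbers) in place of a real fitted design.

```python
# Sketch of both residual checks: normal probability plot and
# residuals versus fit, using stand-in (simulated) residuals.
import numpy as np
import matplotlib.pyplot as plt
from scipy import stats

rng = np.random.default_rng(2)
fitted = np.repeat([9.4, 14.0, 13.0], 5)            # hypothetical fits
residuals = rng.normal(0.0, 2.4, size=fitted.size)  # stand-in residuals

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(8, 3))
stats.probplot(residuals, plot=ax1)   # normal probability plot
ax2.scatter(fitted, residuals)        # residuals versus fit
ax2.axhline(0.0, linestyle="--")      # the "zero error line"
ax1.set_title("Normal probability plot")
ax2.set_title("Residuals versus fit")
plt.show()
```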

56 11.12 A Brief Summary
- Design variables are factors whose effect you want to control, e.g., pain.
- Treatment variables are factors whose effect you want to investigate, e.g., drug dosage.
- In some experiments, the levels of the variables are measured, rather than controlled or preselected ahead of time: how much pain relief is experienced as a result of drug dosage?
- Results may also be analyzed using linear or multiple regression analysis. (See Chapters 12 and 13.)

57 Key Concepts and Formulas
I. Experimental Designs
1. Experimental units, factors, levels, treatments, response variables.
2. Assumptions: Observations within each treatment group must be normally distributed with a common variance σ².
3. One-way classification—completely randomized design: Independent random samples are selected from each of k populations.
4. Two-way classification—randomized block design: k treatments are compared within b relatively homogeneous groups of experimental units called blocks.
5. Two-way classification—a × b factorial experiment: Two factors, A and B, are compared at several levels. Each factor-level combination is replicated r times to allow for the investigation of an interaction between the two factors.

58 II. Analysis of Variance
1. The total variation in the experiment is divided into variation (sums of squares) explained by the various experimental factors and variation due to experimental error (unexplained).
2. If there is an effect due to a particular factor, its mean square (MS = SS/df) is usually large and F = MS(factor)/MSE is large.
3. Test statistics for the various experimental factors are based on F statistics, with appropriate degrees of freedom (df2 = error degrees of freedom).

59 III. Interpreting an Analysis of Variance
1. For the completely randomized and randomized block designs, each factor is tested for significance.
2. For the factorial experiment, first test for a significant interaction. If the interaction is significant, the main effects need not be tested. The nature of the differences in the factor-level combinations should be further examined.
3. If a significant difference in the population means is found, Tukey’s method of pairwise comparisons or a similar method can be used to further identify the nature of the difference.
4. If you have a special interest in one population mean or the difference between two population means, you can use a confidence interval estimate. (For a randomized block design, confidence intervals do not provide estimates for single population means.)

60 IV. Checking the Analysis of Variance Assumptions
1. To check for normality, use the normal probability plot for the residuals. The residuals should exhibit a straight-line pattern, increasing at a 45° angle.
2. To check for equality of variance, use the residuals versus fit plot. The plot should exhibit a random scatter, with the same vertical spread around the horizontal “zero error line.”

