Slide 1 © 2008 Thomson South-Western. All Rights Reserved. Slides by JOHN LOUCKS & Updated by SPIROS VELIANITIS.

Slide 2: Chapter 13 Experimental Design and Analysis of Variance
- Introduction to Experimental Design and Analysis of Variance
- Analysis of Variance and the Completely Randomized Design
- Multiple Comparison Procedures

Slide 3: An Introduction to Experimental Design and Analysis of Variance
- Statistical studies can be classified as either experimental or observational.
- In an experimental study, one or more factors are controlled so that data can be obtained about how the factors influence the variables of interest.
- In an observational study, no attempt is made to control the factors.
- Cause-and-effect relationships are easier to establish in experimental studies than in observational studies.
- Analysis of variance (ANOVA) can be used to analyze the data obtained from experimental or observational studies.

Slide 4: An Introduction to Experimental Design and Analysis of Variance
- In this chapter three types of experimental designs are introduced:
  - a completely randomized design
  - a randomized block design
  - a factorial experiment

Slide 5: An Introduction to Experimental Design and Analysis of Variance
- A factor is a variable that the experimenter has selected for investigation (the independent variable).
- A treatment is a level of a factor.
- Experimental units are the objects of interest in the experiment.
- A completely randomized design is an experimental design in which the treatments are randomly assigned to the experimental units.
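The random assignment that defines a completely randomized design can be sketched in a few lines of Python. This is only an illustration: the unit names, the treatment labels "A", "B", "C", the equal group sizes, and the function name are all assumptions made for the example, not part of the slides.

```python
import random


def completely_randomized_assignment(units, treatments, seed=None):
    """Randomly assign treatments to experimental units in equal-sized groups."""
    if len(units) % len(treatments) != 0:
        raise ValueError("number of units must be a multiple of the number of treatments")
    rng = random.Random(seed)          # seeded so the assignment is reproducible
    shuffled = list(units)
    rng.shuffle(shuffled)              # the random order decides who gets which treatment
    per_group = len(shuffled) // len(treatments)
    assignment = {}
    for i, treatment in enumerate(treatments):
        for unit in shuffled[i * per_group:(i + 1) * per_group]:
            assignment[unit] = treatment
    return assignment


# Hypothetical example: 15 units, 3 treatments, 5 units per treatment.
units = [f"unit_{i}" for i in range(1, 16)]
treatments = ["A", "B", "C"]
assignment = completely_randomized_assignment(units, treatments, seed=13)
```

Shuffling the units first and then slicing them into groups keeps the group sizes balanced while leaving the choice of which unit receives which treatment to chance.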

Slide 6: Analysis of Variance: A Conceptual Overview
- Analysis of Variance (ANOVA) can be used to test for the equality of three or more population means.
- Data obtained from observational or experimental studies can be used for the analysis.
- We want to use the sample results to test the following hypotheses:
  $H_0: \mu_1 = \mu_2 = \mu_3 = \cdots = \mu_k$
  $H_a$: Not all population means are equal

Slide 7: Analysis of Variance: A Conceptual Overview
  $H_0: \mu_1 = \mu_2 = \mu_3 = \cdots = \mu_k$
  $H_a$: Not all population means are equal
- If $H_0$ is rejected, we cannot conclude that all population means are different.
- Rejecting $H_0$ means that at least two population means have different values.

Slide 8: Analysis of Variance: A Conceptual Overview
- Assumptions for Analysis of Variance
  - For each population, the response (dependent) variable is normally distributed.
  - The variance of the response variable, denoted $\sigma^2$, is the same for all of the populations.
  - The observations must be independent.

Slide 9: Analysis of Variance: A Conceptual Overview
- Sampling Distribution of $\bar{x}$ Given $H_0$ is True
  [Figure: a single sampling distribution of $\bar{x}$]
  Sample means are close together because there is only one sampling distribution when $H_0$ is true.

Slide 10: Analysis of Variance: A Conceptual Overview
- Sampling Distribution of $\bar{x}$ Given $H_0$ is False
  [Figure: three sampling distributions, labeled for populations 1, 2, and 3]
  Sample means come from different sampling distributions and are not as close together when $H_0$ is false.

Slide 11: Analysis of Variance
- Between-Treatments Estimate of Population Variance
- Within-Treatments Estimate of Population Variance
- Comparing the Variance Estimates: The F Test
- ANOVA Table

Slide 12: Between-Treatments Estimate of Population Variance $\sigma^2$
- The estimate of $\sigma^2$ based on the variation of the sample means is called the mean square due to treatments and is denoted by MSTR.
  $MSTR = \dfrac{SSTR}{k - 1} = \dfrac{\sum_{j=1}^{k} n_j (\bar{x}_j - \bar{\bar{x}})^2}{k - 1}$
- The numerator is called the sum of squares due to treatments (SSTR).
- The denominator is the degrees of freedom associated with SSTR.

Slide 13: Within-Treatments Estimate of Population Variance $\sigma^2$
- The estimate of $\sigma^2$ based on the variation of the sample observations within each sample is called the mean square error and is denoted by MSE.
  $MSE = \dfrac{SSE}{n_T - k} = \dfrac{\sum_{j=1}^{k} (n_j - 1) s_j^2}{n_T - k}$
- The numerator is called the sum of squares due to error (SSE).
- The denominator is the degrees of freedom associated with SSE.

Slide 14: Comparing the Variance Estimates: The F Test
- If the null hypothesis is true and the ANOVA assumptions are valid, the sampling distribution of MSTR/MSE is an F distribution with MSTR d.f. equal to $k - 1$ and MSE d.f. equal to $n_T - k$.
- If the means of the k populations are not equal, the value of MSTR/MSE will be inflated because MSTR overestimates $\sigma^2$.
- Hence, we will reject $H_0$ if the resulting value of MSTR/MSE appears to be too large to have been selected at random from the appropriate F distribution.

Slide 15: Comparing the Variance Estimates: The F Test
- Sampling Distribution of MSTR/MSE
  [Figure: sampling distribution of MSTR/MSE. Values below the critical value $F_\alpha$ fall in the "Do Not Reject $H_0$" region; values above it, in the upper tail of area $\alpha$, fall in the "Reject $H_0$" region.]

Slide 16: Test for the Equality of k Population Means
- Hypotheses
  $H_0: \mu_1 = \mu_2 = \mu_3 = \cdots = \mu_k$
  $H_a$: Not all population means are equal
- Test Statistic
  $F = MSTR/MSE$

Slide 17: Test for the Equality of k Population Means
- Rejection Rule
  p-value Approach: Reject $H_0$ if p-value $< \alpha$
  Critical Value Approach: Reject $H_0$ if $F > F_\alpha$
  where the value of $F_\alpha$ is based on an F distribution with $k - 1$ numerator d.f. and $n_T - k$ denominator d.f.
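The test statistic can be computed directly from raw samples. The sketch below is a minimal standard-library version, not a library API: the function name and the three tiny samples are made up for illustration, and the p-value is left out because it needs an F distribution routine.

```python
def one_way_anova(samples):
    """Return the one-way ANOVA quantities for a list of samples (lists of numbers)."""
    k = len(samples)                                  # number of treatments
    n_total = sum(len(s) for s in samples)            # n_T
    grand_mean = sum(sum(s) for s in samples) / n_total
    means = [sum(s) / len(s) for s in samples]
    # Between-treatments variation: squared distance of each sample mean from the grand mean
    sstr = sum(len(s) * (m - grand_mean) ** 2 for s, m in zip(samples, means))
    # Within-treatments variation: squared distance of each observation from its own sample mean
    sse = sum((x - m) ** 2 for s, m in zip(samples, means) for x in s)
    mstr = sstr / (k - 1)
    mse = sse / (n_total - k)
    return {
        "SSTR": sstr, "SSE": sse, "SST": sstr + sse,
        "df_treatments": k - 1, "df_error": n_total - k,
        "MSTR": mstr, "MSE": mse, "F": mstr / mse,
    }


# Hypothetical data: three samples whose means are 2, 3, and 4.
toy = one_way_anova([[1, 2, 3], [2, 3, 4], [3, 4, 5]])
```

For the toy data SSTR = 6 and SSE = 6, so MSTR = 3, MSE = 1, and F = 3 with 2 and 6 degrees of freedom. The resulting F would then be compared with $F_\alpha$ from a table, or converted to a p-value with statistical software.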

Slide 18: ANOVA Table
- The ANOVA table decomposes the total variation into two components (sources): a between-group component (treatments) and a within-group component (error). The F-ratio, which in this case equals 15.6234, is the ratio of the between-group estimate to the within-group estimate. Since the p-value of the F test is less than 0.05, there is a statistically significant difference between the means from one level to another at the 95.0% confidence level.

Slide 19: Testing for the Equality of k Population Means: A Completely Randomized Experimental Design
- Example: AutoShine, Inc.
  AutoShine, Inc. is considering marketing a long-lasting car wax. Three different waxes (Type 1, Type 2, and Type 3) have been developed.
  In order to test the durability of these waxes, 5 new cars were waxed with Type 1, 5 with Type 2, and 5 with Type 3. Each car was then repeatedly run through an automatic carwash until the wax coating showed signs of deterioration.

Slide 20: Testing for the Equality of k Population Means: A Completely Randomized Experimental Design
- Example: AutoShine, Inc.
  The number of times each car went through the carwash before its wax deteriorated is shown on the next slide. AutoShine, Inc. must decide which wax to market. Are the three waxes equally effective?
  Factor: Car wax
  Treatments: Type 1, Type 2, Type 3
  Experimental units: Cars
  Response variable: Number of washes

Slide 21: Testing for the Equality of k Population Means: A Completely Randomized Experimental Design

  Observation       Wax Type 1   Wax Type 2   Wax Type 3
  1                 27           33           29
  2                 30           28           28
  3                 29           31           30
  4                 28           30           32
  5                 31           30           31
  Sample Mean       29.0         30.4         30.0
  Sample Variance   2.5          3.3          2.5

Slide 22: Testing for the Equality of k Population Means: A Completely Randomized Experimental Design
- Hypotheses
  $H_0: \mu_1 = \mu_2 = \mu_3$
  $H_a$: Not all the means are equal
  where:
  $\mu_1$ = mean number of washes using Type 1 wax
  $\mu_2$ = mean number of washes using Type 2 wax
  $\mu_3$ = mean number of washes using Type 3 wax

Slide 23: Testing for the Equality of k Population Means: A Completely Randomized Experimental Design
- Test Statistic
  $F = MSTR/MSE = 2.60/2.77 = .939$
- Conclusion
  The p-value of 0.42 is greater than $\alpha = 0.10$ (equivalently, $F = .939$ is below the critical value $F_{.10} = 2.81$ with 2 and 12 d.f.). Therefore, we cannot reject $H_0$.
  There is insufficient evidence to conclude that the mean numbers of washes for the three wax types are not all the same.

Slide 24: Testing for the Equality of k Population Means: A Completely Randomized Experimental Design
- ANOVA Table

  Source of Variation   Sum of Squares   Degrees of Freedom   Mean Square   F      p-Value
  Treatments            5.2              2                    2.60          .939   .42
  Error                 33.2             12                   2.77
  Total                 38.4             14
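The entries in the AutoShine ANOVA table can be checked by hand from the sample means and variances on slide 21. The arithmetic below is a worked sketch assuming the usual formulas $SSTR = \sum n_j(\bar{x}_j - \bar{\bar{x}})^2$ and $SSE = \sum (n_j - 1)s_j^2$ with $n_j = 5$ cars per wax:

```latex
\begin{align*}
\bar{\bar{x}} &= \frac{29.0 + 30.4 + 30.0}{3} = 29.8 \\
SSTR &= 5(29.0 - 29.8)^2 + 5(30.4 - 29.8)^2 + 5(30.0 - 29.8)^2
      = 5(0.64 + 0.36 + 0.04) = 5.2 \\
SSE  &= 4(2.5) + 4(3.3) + 4(2.5) = 33.2 \\
MSTR &= \frac{5.2}{3 - 1} = 2.60, \qquad
MSE   = \frac{33.2}{15 - 3} \approx 2.77 \\
F    &= \frac{MSTR}{MSE} \approx 0.94
\end{align*}
```

Averaging the three sample means gives the grand mean only because the sample sizes are equal; with unequal sizes the means would have to be weighted by $n_j$.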

Slide 25: Testing for the Equality of k Population Means: An Observational Study
- Example: Reed Manufacturing
  Janet Reed would like to know if there is any significant difference in the mean number of hours worked per week for the department managers at her three manufacturing plants (in Buffalo, Pittsburgh, and Detroit).
  An F test will be conducted using $\alpha = .05$.

Slide 26: Testing for the Equality of k Population Means: An Observational Study
- Example: Reed Manufacturing
  A simple random sample of five managers from each of the three plants was taken, and the number of hours worked by each manager in the previous week is shown on the next slide.
  Factor: Manufacturing plant
  Treatments: Buffalo, Pittsburgh, Detroit
  Experimental units: Managers
  Response variable: Number of hours worked

Slide 27: Testing for the Equality of k Population Means: An Observational Study

  Observation       Plant 1 Buffalo   Plant 2 Pittsburgh   Plant 3 Detroit
  1                 48                73                   51
  2                 54                63                   63
  3                 57                66                   61
  4                 54                64                   54
  5                 62                74                   56
  Sample Mean       55                68                   57
  Sample Variance   26.0              26.5                 24.5

Slide 28: Testing for the Equality of k Population Means: An Observational Study
- p-Value and Critical Value Approaches
  Develop the hypotheses.
  $H_0: \mu_1 = \mu_2 = \mu_3$
  $H_a$: Not all the means are equal
  where:
  $\mu_1$ = mean number of hours worked per week by the managers at Plant 1
  $\mu_2$ = mean number of hours worked per week by the managers at Plant 2
  $\mu_3$ = mean number of hours worked per week by the managers at Plant 3

Slide 29: Testing for the Equality of k Population Means: An Observational Study
- ANOVA Table

  Source of Variation   Sum of Squares   Degrees of Freedom   Mean Square   F      p-Value
  Treatment             490              2                    245           9.55   .0033
  Error                 308              12                   25.667
  Total                 798              14

  The p-value is less than .05, so we reject $H_0$. We have sufficient evidence to conclude that the mean number of hours worked per week by department managers is not the same at all 3 plants.
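The Reed Manufacturing table can be rebuilt from the summary statistics on slide 27 alone, without the raw observations. The following is a minimal sketch under the standard one-way ANOVA formulas; the function name `anova_from_summary` is made up for this example.

```python
def anova_from_summary(sizes, means, variances):
    """One-way ANOVA quantities from per-sample sizes, means, and sample variances."""
    k = len(sizes)
    n_total = sum(sizes)
    # Grand mean is the size-weighted average of the sample means
    grand_mean = sum(n * m for n, m in zip(sizes, means)) / n_total
    sstr = sum(n * (m - grand_mean) ** 2 for n, m in zip(sizes, means))
    sse = sum((n - 1) * s2 for n, s2 in zip(sizes, variances))
    mstr = sstr / (k - 1)
    mse = sse / (n_total - k)
    return {"SSTR": sstr, "SSE": sse, "SST": sstr + sse,
            "MSTR": mstr, "MSE": mse, "F": mstr / mse}


# Buffalo, Pittsburgh, Detroit: five managers each (values from slide 27)
reed = anova_from_summary(
    sizes=[5, 5, 5],
    means=[55, 68, 57],
    variances=[26.0, 26.5, 24.5],
)
```

This reproduces SSTR = 490, SSE = 308, and F of about 9.55. The p-value of .0033 reported on the slide requires an F distribution with 2 and 12 degrees of freedom, which this standard-library sketch does not compute.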

Slide 30: Multiple Comparison Procedures
- Suppose that analysis of variance has provided statistical evidence to reject the null hypothesis of equal population means.
- Fisher’s least significant difference (LSD) procedure can be used to determine where the differences occur.

Slide 32: Fisher’s LSD Procedure
- Rejection Rule
  p-value Approach: Reject $H_0$ if p-value $< \alpha$
  Critical Value Approach: Reject $H_0$ if $t < -t_{\alpha/2}$ or $t > t_{\alpha/2}$
  where the value of $t_{\alpha/2}$ is based on a t distribution with $n_T - k$ degrees of freedom.

Slide 33: Fisher’s LSD Procedure Based on the Test Statistic $\bar{x}_i - \bar{x}_j$
- Example: Reed Manufacturing
  Recall that Janet Reed wants to know if there is any significant difference in the mean number of hours worked per week for the department managers at her three manufacturing plants.
  Analysis of variance has provided statistical evidence to reject the null hypothesis of equal population means. Fisher’s least significant difference (LSD) procedure can be used to determine where the differences occur.

Slide 34: Fisher’s LSD Procedure Based on the Test Statistic $\bar{x}_i - \bar{x}_j$
- LSD for Plants 1 and 2
  Hypotheses (A): $H_0: \mu_1 = \mu_2$, $H_a: \mu_1 \neq \mu_2$
  Rejection Rule: Reject $H_0$ if $|\bar{x}_1 - \bar{x}_2| > 6.98$
  Test Statistic: $|\bar{x}_1 - \bar{x}_2| = |55 - 68| = 13$
  Conclusion: The mean number of hours worked at Plant 1 is not equal to the mean number worked at Plant 2.

Slide 35: Fisher’s LSD Procedure Based on the Test Statistic $\bar{x}_i - \bar{x}_j$
- LSD for Plants 1 and 3
  Hypotheses (B): $H_0: \mu_1 = \mu_3$, $H_a: \mu_1 \neq \mu_3$
  Rejection Rule: Reject $H_0$ if $|\bar{x}_1 - \bar{x}_3| > 6.98$
  Test Statistic: $|\bar{x}_1 - \bar{x}_3| = |55 - 57| = 2$
  Conclusion: There is no significant difference between the mean number of hours worked at Plant 1 and the mean number of hours worked at Plant 3.

Slide 36: Fisher’s LSD Procedure Based on the Test Statistic $\bar{x}_i - \bar{x}_j$
- LSD for Plants 2 and 3
  Hypotheses (C): $H_0: \mu_2 = \mu_3$, $H_a: \mu_2 \neq \mu_3$
  Rejection Rule: Reject $H_0$ if $|\bar{x}_2 - \bar{x}_3| > 6.98$
  Test Statistic: $|\bar{x}_2 - \bar{x}_3| = |68 - 57| = 11$
  Conclusion: The mean number of hours worked at Plant 2 is not equal to the mean number worked at Plant 3.
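The cutoff of 6.98 used in the three comparisons can be reproduced with a short calculation. This sketch assumes the usual LSD formula, $LSD = t_{\alpha/2}\sqrt{MSE\,(1/n_i + 1/n_j)}$, with $\alpha = .05$ and the tabled value $t_{.025} = 2.179$ for 12 degrees of freedom; the formula and the t value are not shown in the slide text and are supplied here as assumptions.

```python
import math
from itertools import combinations

mse = 308 / 12        # mean square error from the Reed Manufacturing ANOVA table
n_per_plant = 5       # managers sampled at each plant
t_crit = 2.179        # assumed t-table value: t_{.025} with n_T - k = 12 d.f.

# Least significant difference for two samples of equal size
lsd = t_crit * math.sqrt(mse * (1 / n_per_plant + 1 / n_per_plant))

sample_means = {"Plant 1": 55, "Plant 2": 68, "Plant 3": 57}

# Compare every pair of plants against the LSD cutoff
results = {}
for (name_i, mean_i), (name_j, mean_j) in combinations(sample_means.items(), 2):
    difference = abs(mean_i - mean_j)
    results[(name_i, name_j)] = (difference, difference > lsd)

for pair, (difference, reject) in results.items():
    print(pair, difference, "reject H0" if reject else "do not reject H0")
```

Because all three samples have the same size, a single LSD value serves every pair. With unequal sample sizes the cutoff would have to be recomputed for each comparison.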

Slide 37: Type I Error Rates
- The comparison-wise Type I error rate $\alpha$ indicates the level of significance associated with a single pairwise comparison.
- The experiment-wise Type I error rate $\alpha_{EW}$ is the probability of making a Type I error on at least one of the $k(k-1)/2$ pairwise comparisons.
- Treating the comparisons as independent, $\alpha_{EW} = 1 - (1 - \alpha)^{k(k-1)/2}$.
- The experiment-wise Type I error rate gets larger for problems with more populations (larger $k$).
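A quick numerical illustration of how the experiment-wise rate grows with the number of populations. The sketch counts the pairwise comparisons as $k(k-1)/2$, the number of distinct pairs among $k$ populations, and treats the comparisons as independent; both choices are assumptions of this example, and the function name is hypothetical.

```python
def experimentwise_error_rate(alpha, k):
    """Approximate experiment-wise Type I error rate for all pairwise comparisons."""
    comparisons = k * (k - 1) // 2       # number of distinct pairs among k populations
    return 1 - (1 - alpha) ** comparisons


rate_3 = experimentwise_error_rate(0.05, 3)   # 3 pairwise comparisons
rate_5 = experimentwise_error_rate(0.05, 5)   # 10 pairwise comparisons
```

With $\alpha = .05$, three populations give a rate of about .143, and five populations give about .401, so the chance of at least one false rejection rises quickly even though each individual comparison is run at the 5% level.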

Slide 39: Randomized Block Design
- Experimental units are the objects of interest in the experiment.
- A completely randomized design is an experimental design in which the treatments are randomly assigned to the experimental units.
- If the experimental units are heterogeneous, blocking can be used to form homogeneous groups, resulting in a randomized block design (a single sample in which each block, or element, receives every treatment).

Slide 40: Randomized Block Design
- ANOVA Procedure
  For a randomized block design the sum of squares total (SST) is partitioned into three groups: sum of squares due to treatments, sum of squares due to blocks, and sum of squares due to error.
  $SST = SSTR + SSBL + SSE$
  The total degrees of freedom, $n_T - 1$, are partitioned such that $k - 1$ degrees of freedom go to treatments, $b - 1$ go to blocks, and $(k - 1)(b - 1)$ go to the error term.

Slide 41: Randomized Block Design
- ANOVA Table

  Source of Variation   Sum of Squares   Degrees of Freedom   Mean Square                 F          p-Value
  Treatments            SSTR             k - 1                MSTR = SSTR/(k - 1)         MSTR/MSE
  Blocks                SSBL             b - 1
  Error                 SSE              (k - 1)(b - 1)       MSE = SSE/[(k - 1)(b - 1)]
  Total                 SST              n_T - 1

Slide 42: Randomized Block Design
- Example: Crescent Oil Co.
  Crescent Oil has developed three new blends of gasoline and must decide which blend or blends to produce and distribute. A study of the miles per gallon ratings of the three blends is being conducted to determine if the mean ratings are the same for the three blends.

Slide 43: Randomized Block Design
- Example: Crescent Oil Co.
  Five automobiles have been tested using each of the three gasoline blends, and the miles per gallon ratings are shown on the next slide.
  Factor: Gasoline blend
  Treatments: Blend X, Blend Y, Blend Z
  Blocks: Automobiles
  Response variable: Miles per gallon

Slide 44: Randomized Block Design

                        Type of Gasoline (Treatment)
  Automobile (Block)    Blend X   Blend Y   Blend Z   Block Means
  1                     31        30        30        30.333
  2                     30        29        29        29.333
  3                     29        29        28        28.667
  4                     33        31        29        31.000
  5                     26        25        26        25.667
  Treatment Means       29.8      28.8      28.4

Slide 45: Randomized Block Design
- ANOVA Table

  Source of Variation   Sum of Squares   Degrees of Freedom   Mean Square   F      p-Value
  Treatments            5.20             2                    2.60          3.82   .07
  Blocks                51.33            4                    12.80
  Error                 5.47             8                    .68
  Total                 62.00            14
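The partition SST = SSTR + SSBL + SSE for the Crescent Oil data can be traced step by step. The code below is a standard-library sketch using the miles-per-gallon ratings from slide 44; the variable names are chosen for this example only.

```python
# Miles per gallon for automobiles 1..5 (blocks) under each gasoline blend (treatment)
mpg = {
    "Blend X": [31, 30, 29, 33, 26],
    "Blend Y": [30, 29, 29, 31, 25],
    "Blend Z": [30, 29, 28, 29, 26],
}

k = len(mpg)                              # number of treatments
b = len(next(iter(mpg.values())))         # number of blocks
grand_mean = sum(sum(v) for v in mpg.values()) / (k * b)

treatment_means = {name: sum(v) / b for name, v in mpg.items()}
block_means = [sum(v[i] for v in mpg.values()) / k for i in range(b)]

sst = sum((x - grand_mean) ** 2 for v in mpg.values() for x in v)
sstr = b * sum((m - grand_mean) ** 2 for m in treatment_means.values())
ssbl = k * sum((m - grand_mean) ** 2 for m in block_means)
sse = sst - sstr - ssbl                   # error is what remains after treatments and blocks

mstr = sstr / (k - 1)
mse = sse / ((k - 1) * (b - 1))
f_stat = mstr / mse
```

The sums of squares match the table (62.00, 5.20, 51.33, 5.47). Carrying full precision gives F of about 3.80; the 3.82 in the table corresponds to dividing the rounded mean squares, 2.60/.68. Likewise, 51.33/4 is about 12.83, slightly above the 12.80 shown for blocks.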

Slide 46: Factorial Experiments
- In some experiments we want to draw conclusions about more than one variable or factor.
- Factorial experiments and their corresponding ANOVA computations are valuable designs when simultaneous conclusions about two or more factors are required.
- For example, for a levels of factor A and b levels of factor B, the experiment will involve collecting data on ab treatment combinations.
- The term factorial is used because the experimental conditions include all possible combinations of the factors.
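Enumerating the treatment combinations of a factorial experiment is a Cartesian product of the factor levels. In this small sketch the level labels (two levels of A, three levels of B) are hypothetical:

```python
from itertools import product

factor_a = ["A1", "A2"]            # a = 2 levels of factor A (hypothetical labels)
factor_b = ["B1", "B2", "B3"]      # b = 3 levels of factor B (hypothetical labels)

# Every level of A paired with every level of B: a * b treatment combinations
treatment_combinations = list(product(factor_a, factor_b))

for combination in treatment_combinations:
    print(combination)
```

With a = 2 and b = 3 the list holds ab = 6 combinations, and each one would be run on its own set of experimental units.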

