Presentation is loading. Please wait.

Presentation is loading. Please wait.

1 1 Slide 統計學 Spring 2004 授課教師:統計系余清祥 日期: 2004 年 3 月 30 日 第八週:變異數分析與實驗設計.

Similar presentations


Presentation on theme: "1 1 Slide 統計學 Spring 2004 授課教師:統計系余清祥 日期: 2004 年 3 月 30 日 第八週:變異數分析與實驗設計."— Presentation transcript:

1 1 1 Slide 統計學 Spring 2004 授課教師:統計系余清祥 日期: 2004 年 3 月 30 日 第八週:變異數分析與實驗設計

2 2 2 Slide Chapter 13 Analysis of Variance and Experimental Design n An Introduction to Analysis of Variance n Analysis of Variance: Testing for the Equality of k Population Means k Population Means n Multiple Comparison Procedures n An Introduction to Experimental Design n Completely Randomized Designs n Randomized Block Design

3 3 3 Slide n Analysis of Variance (ANOVA) can be used to test for the equality of three or more population means using data obtained from observational or experimental studies. n We want to use the sample results to test the following hypotheses.  H 0 :  1  =  2  =  3  = ... =  k   H a : Not all population means are equal n If H 0 is rejected, we cannot conclude that all population means are different. n Rejecting H 0 means that at least two population means have different values. An Introduction to Analysis of Variance

4 4 4 Slide Assumptions for Analysis of Variance n For each population, the response variable is normally distributed. The variance of the response variable, denoted  2, is the same for all of the populations. The variance of the response variable, denoted  2, is the same for all of the populations. n The observations must be independent.

5 5 5 Slide Analysis of Variance: Testing for the Equality of K Population Means n Between-Samples Estimate of Population Variance n Within-Samples Estimate of Population Variance n Comparing the Variance Estimates: The F Test n The ANOVA Table

6 6 6 Slide Between-Samples Estimate of Population Variance A between-samples estimate of  2 is called the mean square between (MSB). A between-samples estimate of  2 is called the mean square between (MSB). n The numerator of MSB is called the sum of squares between (SSB). n The denominator of MSB represents the degrees of freedom associated with SSB. = = _ _

7 7 7 Slide Within-Samples Estimate of Population Variance The estimate of  2 based on the variation of the sample observations within each sample is called the mean square within (MSW). The estimate of  2 based on the variation of the sample observations within each sample is called the mean square within (MSW). n The numerator of MSW is called the sum of squares within (SSW). n The denominator of MSW represents the degrees of freedom associated with SSW.

8 8 8 Slide Comparing the Variance Estimates: The F Test n If the null hypothesis is true and the ANOVA assumptions are valid, the sampling distribution of MSB/MSW is an F distribution with MSB d.f. equal to k - 1 and MSW d.f. equal to n T - k. If the means of the k populations are not equal, the value of MSB/MSW will be inflated because MSB overestimates  2. If the means of the k populations are not equal, the value of MSB/MSW will be inflated because MSB overestimates  2. n Hence, we will reject H 0 if the resulting value of MSB/MSW appears to be too large to have been selected at random from the appropriate F distribution.

9 9 9 Slide Test for the Equality of k Population Means n Hypotheses H 0 :  1  =  2  =  3  = ... =  k  H 0 :  1  =  2  =  3  = ... =  k   H a : Not all population means are equal n Test Statistic F = MSB/MSW n Rejection Rule Reject H 0 if F > F  Reject H 0 if F > F  where the value of F  is based on an F distribution with k - 1 numerator degrees of freedom and n T - 1 denominator degrees of freedom.

10 10 Slide The figure below shows the rejection region associated with a level of significance equal to  where F  denotes the critical value. The figure below shows the rejection region associated with a level of significance equal to  where F  denotes the critical value. Sampling Distribution of MSTR/MSE Do Not Reject H 0 Reject H 0 MSTR/MSE Critical Value FF FF

11 11 Slide The ANOVA Table Source of Sum of Degrees of Mean Variation Squares Freedom Squares F TreatmentSSTR k - 1 MSTR MSTR/MSE Error SSE n T - k MSE Total SST n T - 1 SST divided by its degrees of freedom n T - 1 is simply the overall sample variance that would be obtained if we treated the entire n T observations as one data set.

12 12 Slide Example: Reed Manufacturing n Analysis of Variance J. R. Reed would like to know if the mean number of hours worked per week is the same for the department managers at her three manufacturing plants (Buffalo, Pittsburgh, and Detroit). A simple random sample of 5 managers from each of the three plants was taken and the number of hours worked by each manager for the previous week is shown on the next slide.

13 13 Slide n Analysis of Variance Plant 1Plant 2Plant 3 ObservationBuffalo Pittsburgh Detroit ObservationBuffalo Pittsburgh Detroit 1 48 73 51 1 48 73 51 2 54 63 63 2 54 63 63 3 57 66 61 3 57 66 61 4 54 64 54 4 54 64 54 5 62 74 56 5 62 74 56 Sample Mean 55 68 57 Sample Mean 55 68 57 Sample Variance 26.0 26.5 24.5 Sample Variance 26.0 26.5 24.5 Example: Reed Manufacturing

14 14 Slide n Analysis of Variance Hypotheses Hypotheses H 0 :  1  =  2  =  3  H a : Not all the means are equal where: where:  1 = mean number of hours worked per week by the managers at Plant 1  2 = mean number of hours worked per week by the managers at Plant 2  2 = mean number of hours worked per week by the managers at Plant 2  3 = mean number of hours worked per week by the managers at Plant 3 Example: Reed Manufacturing

15 15 Slide n Analysis of Variance Mean Square Between Mean Square Between Since the sample sizes are all equal Since the sample sizes are all equal x = (55 + 68 + 57)/3 = 60 x = (55 + 68 + 57)/3 = 60 SSB = 5(55 - 60) 2 + 5(68 - 60) 2 + 5(57 - 60) 2 = 490 SSB = 5(55 - 60) 2 + 5(68 - 60) 2 + 5(57 - 60) 2 = 490 MSB = 490/(3 - 1) = 245 MSB = 490/(3 - 1) = 245 Mean Square Within Mean Square Within SSW = 4(26.0) + 4(26.5) + 4(24.5) = 308 SSW = 4(26.0) + 4(26.5) + 4(24.5) = 308 MSW = 308/(15 - 3) = 25.667 MSW = 308/(15 - 3) = 25.667 = = Example: Reed Manufacturing

16 16 Slide n Analysis of Variance F - Test F - Test If H 0 is true, the ratio MSB/MSW should be near 1 If H 0 is true, the ratio MSB/MSW should be near 1 since both MSB and MSW are estimating  2. If H a since both MSB and MSW are estimating  2. If H a is true, the ratio should be significantly larger than is true, the ratio should be significantly larger than 1 since MSB tends to overestimate  2. 1 since MSB tends to overestimate  2. Rejection Rule Rejection Rule Assuming  =.05, F.05 = 3.89 (2 d.f. numerator, Assuming  =.05, F.05 = 3.89 (2 d.f. numerator, 12 d.f. denominator). Reject H 0 if F > 3.89 12 d.f. denominator). Reject H 0 if F > 3.89 Example: Reed Manufacturing

17 17 Slide Example: Reed Manufacturing n Analysis of Variance Test Statistic Test Statistic F = MSB/MSW = 245/25.667 = 9.55 F = MSB/MSW = 245/25.667 = 9.55 Conclusion Conclusion F = 9.55 > F.05 = 3.89, so we reject H 0. The mean F = 9.55 > F.05 = 3.89, so we reject H 0. The mean number of hours worked per week by department number of hours worked per week by department managers is not the same at each plant. managers is not the same at each plant.

18 18 Slide n Analysis of Variance ANOVA Table ANOVA Table Source of Sum of Degrees of Mean Source of Sum of Degrees of Mean Variation Squares Freedom Square F Variation Squares Freedom Square F Treatments 490 2 245 9.55 Treatments 490 2 245 9.55 Error 308 12 25.667 Error 308 12 25.667 Total 798 14 Total 798 14 Example: Reed Manufacturing

19 19 Slide Multiple Comparison Procedures Suppose that analysis of variance has provided statistical evidence to reject the null hypothesis of equal population means. Fisher’s least significance difference (LSD) procedure can be used to determine where the differences occur.

20 20 Slide Fisher’s LSD Procedure n Hypotheses H 0 :  i =  j H 0 :  i =  j H a :  i  j H a :  i  j n Test Statistic n Rejection Rule Reject H 0 if t t a /2 Reject H 0 if t t a /2 where the value of t a /2 is based on a t distribution with n T - k degrees of freedom.

21 21 Slide n Hypotheses H 0 :  i =  j H 0 :  i =  j H a :  i  j H a :  i  j n Test Statistic x i - x j x i - x j n Rejection Rule Reject H 0 if | x i - x j | > LSD Reject H 0 if | x i - x j | > LSDwhere Fisher’s LSD Procedure Based on the Test Statistic x i - x j __ _ _ _ _

22 22 Slide n Fisher’s LSD Assuming  =.05, Hypotheses (A) H 0 :  1 =  2 Hypotheses (A) H 0 :  1 =  2 H a :  1  2 H a :  1  2 Test Statistic Test Statistic | x 1 - x 2 | = |55 - 68| = 13 | x 1 - x 2 | = |55 - 68| = 13 Conclusion Conclusion The mean number of hours worked at Plant 1 is not equal to the mean number worked at Plant 2. __ Example: Reed Manufacturing

23 23 Slide n Fisher’s LSD Hypotheses (B) Hypotheses (B) H 0 :  1 =  3 H 0 :  1 =  3 H a :  1  3 H a :  1  3 Test Statistic Test Statistic | x 1 - x 3 | = |55 - 57| = 2 | x 1 - x 3 | = |55 - 57| = 2 Conclusion Conclusion There is no significant difference between the mean number of hours worked at Plant 1 and the mean number of hours worked at Plant 3. __ Example: Reed Manufacturing

24 24 Slide n Fisher’s LSD Hypotheses (C) Hypotheses (C) H 0 :  2 =  3 H 0 :  2 =  3 H a :  2  3 H a :  2  3 Test Statistic Test Statistic | x 2 - x 3 | = |68 - 57| = 11 | x 2 - x 3 | = |68 - 57| = 11 Conclusion Conclusion The mean number of hours worked at Plant 2 is not equal to the mean number worked at Plant 3. __ Example: Reed Manufacturing

25 25 Slide An Introduction to Experimental Design n Statistical studies can be classified as being either experimental or observational. n In an experimental study, one or more factors are controlled so that data can be obtained about how the factors influence the variables of interest. n In an observational study, no attempt is made to control the factors. n Cause-and-effect relationships are easier to establish in experimental studies than in observational studies.

26 26 Slide An Introduction to Experimental Design n A factor is a variable that the experimenter has selected for investigation. n A treatment is a level of a factor. n Experimental units are the objects of interest in the experiment. n A completely randomized design is an experimental design in which the treatments are randomly assigned to the experimental units. n If the experimental units are heterogeneous, blocking can be used to form homogeneous groups, resulting in a randomized block design.

27 27 Slide Completely Randomized Designs n Between-Treatments Estimate of Population Variance n Within-Treatments Estimate of Population Variance n Comparing the Variance Estimates: The F Test n The ANOVA Table n Pairwise Comparisons

28 28 Slide In the context of experimental design, the between- samples estimate of  2 is referred to as the mean square due to treatments (MSTR). In the context of experimental design, the between- samples estimate of  2 is referred to as the mean square due to treatments (MSTR). n It is the same as what we previously called mean square between (MSB). n The formula for MSTR is n The numerator is called the sum of squares due to treatments (SSTR). n The denominator k - 1 represents the degrees of freedom associated with SSTR. Between-Treatments Estimate of Population Variance = _

29 29 Slide Within-Treatments Estimate of Population Variance The second estimate of  2, the within-samples estimate, is referred to as the mean square due to error (MSE). The second estimate of  2, the within-samples estimate, is referred to as the mean square due to error (MSE). n It is the same as what we previously called mean square within (MSW). n The formula for MSE is n The numerator is called the sum of squares due to error (SSE). n The denominator n T - k represents the degrees of freedom associated with SSE.

30 30 Slide ANOVA Table for a Completely Randomized Design Source of Sum of Degrees of Mean Source of Sum of Degrees of Mean Variation Squares Freedom Squares F Variation Squares Freedom Squares F Treatments SSTR k - 1 Treatments SSTR k - 1 Error SSE n T - k Error SSE n T - k Total SST n T - 1 Total SST n T - 1

31 31 Slide Example: Home Products, Inc. Home Products, Inc. is considering marketing a long- lasting car wax. Three different waxes (Type 1, Type 2, and Type 3) have been developed. In order to test the durability of these waxes, 5 new cars were waxed with Type 1, 5 with Type 2, and 5 with Type 3. Each car was then repeatedly run through an automatic carwash until the wax coating showed signs of deterioration. The number of times each car went through the carwash is shown on the next slide. Home Products, Inc. must decide which wax to market. Are the three waxes equally effective?

32 32 Slide Example: Home Products, Inc. Wax Wax Wax Wax Wax Wax ObservationType 1 Type 2 Type 3 ObservationType 1 Type 2 Type 3 1 48 73 51 1 48 73 51 2 54 63 63 2 54 63 63 3 57 66 61 3 57 66 61 4 54 64 54 4 54 64 54 5 62 74 56 5 62 74 56 Sample Mean 55 68 57 Sample Mean 55 68 57 Sample Variance 26.0 26.5 24.5 Sample Variance 26.0 26.5 24.5

33 33 Slide n Completely Randomized Design Hypotheses Hypotheses H 0 :  1  =  2  =  3  H a : Not all the means are equal where: where:  1 = mean number of washes for Type 1 wax  2 = mean number of washes for Type 2 wax  2 = mean number of washes for Type 2 wax  3 = mean number of washes for Type 3 wax Example: Home Products, Inc.

34 34 Slide n Completely Randomized Design Mean Square Between Treatments Mean Square Between Treatments Since the sample sizes are all equal x = (x 1 + x 2 + x 3 )/3 = (55 + 68 + 57)/3 = 60 x = (x 1 + x 2 + x 3 )/3 = (55 + 68 + 57)/3 = 60 SSTR = 5(55 - 60) 2 + 5(68 - 60) 2 + 5(57 - 60) 2 = 490 MSTR = 490/(3 - 1) = 245 MSTR = 490/(3 - 1) = 245 Mean Square Error Mean Square Error SSE = 4(26.0) + 4(26.5) + 4(24.5) = 308 SSE = 4(26.0) + 4(26.5) + 4(24.5) = 308 MSE = 308/(15 - 3) = 25.667 = ___ Example: Home Products, Inc.

35 35 Slide n Completely Randomized Design Rejection Rule Rejection Rule Assuming  =.05, F.05 = 3.89 (2 d.f. numerator and 12 d.f. denominator). Reject H 0 if F > 3.89. Test Statistic Test Statistic F = MSTR/MSE = 245/25.667 = 9.55 F = MSTR/MSE = 245/25.667 = 9.55 Conclusion Conclusion Since F = 9.55 > F.05 = 3.89, we reject H 0. The mean number of carwashes are not the same for all three waxes. Example: Home Products, Inc.

36 36 Slide n Completely Randomized Design ANOVA Table ANOVA Table Source of Sum of Degrees of Mean Source of Sum of Degrees of Mean Variation Squares Freedom Squares F Variation Squares Freedom Squares F Treatments 490 2 245 9.55 Treatments 490 2 245 9.55 Error 308 12 25.667 Error 308 12 25.667 Total 798 14 Total 798 14 Example: Home Products, Inc.

37 37 Slide Randomized Block Design n The ANOVA Procedure n Computations and Conclusions

38 38 Slide n The ANOVA procedure for the randomized block design requires us to partition the sum of squares total (SST) into three groups: sum of squares due to treatments, sum of squares due to blocks, and sum of squares due to error. n The formula for this partitioning is SST = SSTR + SSBL + SSE n The total degrees of freedom, n T - 1, are partitioned such that k - 1 degrees of freedom go to treatments, b - 1 go to blocks, and ( k - 1)( b - 1) go to the error term. The ANOVA Procedure

39 39 Slide ANOVA Table for a Randomized Block Design Source of Sum of Degrees of Mean Variation Squares Freedom Squares F Treatments SSTR k - 1 Blocks SSBL b - 1 Error SSE ( k - 1)( b - 1) Total SST n T - 1

40 40 Slide Example: Eastern Oil Co. Eastern Oil has developed three new blends of gasoline and must decide which blend or blends to produce and distribute. A study of the miles per gallon ratings of the three blends is being conducted to determine if the mean ratings are the same for the three blends. Five automobiles have been tested using each of the three gasoline blends and the miles per gallon ratings are shown on the next slide.

41 41 Slide Example: Eastern Oil Co. Automobile Type of Gasoline (Treatment) Blocks Automobile Type of Gasoline (Treatment) Blocks (Block) Blend X Blend Y Blend Z Means (Block) Blend X Blend Y Blend Z Means 131 3030 30.333 131 3030 30.333 230 2929 29.333 230 2929 29.333 329 2928 28.667 329 2928 28.667 433 3129 31.000 433 3129 31.000 526 2526 25.667 526 2526 25.667Treatment Means 29.8 28.8 28.4 Means 29.8 28.8 28.4

42 42 Slide Example: Eastern Oil Co. n Randomized Block Design Mean Square Due to Treatments Mean Square Due to Treatments The overall sample mean is 29. Thus, SSTR = 5[(29.8 - 29) 2 + (28.8 - 29) 2 + (28.4 - 29) 2 ] = 5.2 MSTR = 5.2/(3 - 1) = 2.6 MSTR = 5.2/(3 - 1) = 2.6 Mean Square Due to Blocks Mean Square Due to Blocks SSBL = 3[(30.333 - 29) 2 +... + (25.667 - 29) 2 ] = 51.33 MSBL = 51.33/(5 - 1) = 12.8 Mean Square Due to Error Mean Square Due to Error SSE = 62 - 5.2 - 51.33 = 5.47 MSE = 5.47/[(3 - 1)(5 - 1)] =.68 MSE = 5.47/[(3 - 1)(5 - 1)] =.68

43 43 Slide Example: Eastern Oil Co. n Randomized Block Design Rejection Rule Rejection Rule Assuming  =.05, F.05 = 4.46 (2 d.f. numerator and 8 d.f. denominator). Reject H 0 if F > 4.46. Test Statistic Test Statistic F = MSTR/MSE = 2.6/.68 = 3.82 F = MSTR/MSE = 2.6/.68 = 3.82 Conclusion Conclusion Since 3.82 < 4.46, we cannot reject H 0. There is not sufficient evidence to conclude that the miles per gallon ratings differ for the three gasoline blends.

44 44 Slide End of Chapter 13


Download ppt "1 1 Slide 統計學 Spring 2004 授課教師:統計系余清祥 日期: 2004 年 3 月 30 日 第八週:變異數分析與實驗設計."

Similar presentations


Ads by Google