
1 Chapter 13: Analysis of Variance

2 Chapter Outline
• An introduction to experimental design and analysis of variance
• Analysis of variance and the completely randomized design

3 An Introduction to Experimental Design and Analysis of Variance
• Statistical studies can be classified as either experimental or observational.
• In an experimental study, one or more factors are controlled so that data can be obtained about how the factors influence the variables of interest.
• In an observational study, no attempt is made to control the factors.
• Cause-and-effect relationships are easier to establish in experimental studies than in observational studies.
• Analysis of variance (ANOVA) can be used to analyze the data obtained from experimental or observational studies.

4 An Introduction to Experimental Design and Analysis of Variance
• Three types of experimental designs are introduced:
  • A completely randomized design
  • A randomized block design
  • A factorial experiment

5 An Introduction to Experimental Design and Analysis of Variance
• A factor is a variable that the experimenter has selected for investigation.
• A treatment is a level of a factor.
  • For example, if location is a factor, then a treatment of location can be New York, Chicago, or Seattle.
• Experimental units are the objects of interest in the experiment.
• A completely randomized design is an experimental design in which the treatments are randomly assigned to the experimental units.

6 Analysis of Variance: A Conceptual Overview
• Analysis of variance (ANOVA) can be used to test for the equality of three or more population means.
• Data obtained from observational or experimental studies can be used for the analysis.
• We want to use the sample results to test the following hypotheses:
$H_0$: $\mu_1 = \mu_2 = \mu_3 = \dots = \mu_k$
$H_a$: Not all population means are equal

7 Analysis of Variance: A Conceptual Overview
$H_0$: $\mu_1 = \mu_2 = \mu_3 = \dots = \mu_k$
$H_a$: Not all population means are equal
• If $H_0$ is rejected, we cannot conclude that all population means are different.
• Rejecting $H_0$ means that at least two population means have different values.

8 Analysis of Variance: A Conceptual Overview
Assumptions for Analysis of Variance
• For each population, the response (dependent) variable is normally distributed.
• The variance of the response variable, denoted $\sigma^2$, is the same for all of the populations.
• The observations must be independent.

9 Analysis of Variance: A Conceptual Overview
Sampling Distribution of $\bar{x}$ Given $H_0$ Is True
[Figure: a single sampling distribution of $\bar{x}$ centered at $\mu$]
• If $H_0$ is true, all the populations have the same mean. It is also assumed that all the populations have the same variance. Therefore, all the sample means are drawn from the same sampling distribution.
• As a result, the sample means tend to be close to one another.
• Sample means are likely to be close to the same population mean if $H_0$ is true.

10 Analysis of Variance: A Conceptual Overview
Sampling Distribution of $\bar{x}$ Given $H_0$ Is False
[Figure: three separate sampling distributions of $\bar{x}$, centered at $\mu_3$, $\mu_1$, and $\mu_2$]
• When $H_0$ is false, sample means are drawn from different populations.
• As a result, sample means tend NOT to be close together. Instead, they tend to be close to their own population means.

11 Analysis of Variance
• Between-treatments estimate of population variance
• Within-treatments estimate of population variance
• Comparing the variance estimates: the F test
• ANOVA table

12 Between-Treatments Estimate of Population Variance $\sigma^2$
• The estimate of $\sigma^2$ based on the variation of the sample means is called the mean square due to treatments and is denoted by MSTR.
$$\text{MSTR} = \frac{\text{SSTR}}{k - 1} = \frac{\sum_{j=1}^{k} n_j (\bar{x}_j - \bar{\bar{x}})^2}{k - 1}$$
• The numerator is called the sum of squares due to treatments (SSTR).
• The denominator is the degrees of freedom associated with SSTR.

13 Between-Treatments Estimate of Population Variance $\sigma^2$
$$\text{SSTR} = \sum_{j=1}^{k} n_j (\bar{x}_j - \bar{\bar{x}})^2$$
where:
• $k$ is the number of treatments (number of samples)
• $n_j$ is the number of observations in treatment $j$
• $\bar{x}_j$ is the sample mean of treatment $j$
• $\bar{\bar{x}}$ is the overall mean, i.e. the average value of ALL the observations from all the treatments
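The between-treatments computation can be sketched in Python as follows. This is a minimal illustration that assumes the standard definition of SSTR (each sample mean's squared distance from the overall mean, weighted by the sample size); the function name `between_treatments` and the three short samples are hypothetical and do not come from the slides.

```python
from statistics import mean

def between_treatments(samples):
    """Return (SSTR, MSTR) for a list of samples, one list per treatment."""
    k = len(samples)                                    # number of treatments
    all_obs = [x for sample in samples for x in sample]
    grand_mean = mean(all_obs)                          # overall mean of ALL observations
    # SSTR: squared distance of each sample mean from the overall mean,
    # weighted by that sample's size n_j
    sstr = sum(len(s) * (mean(s) - grand_mean) ** 2 for s in samples)
    mstr = sstr / (k - 1)                               # divide by the d.f. k - 1
    return sstr, mstr

# Hypothetical data: three treatments with unequal sample sizes
samples = [[4, 6, 5], [8, 10], [1, 2, 3, 2]]
sstr, mstr = between_treatments(samples)
print(f"SSTR = {sstr:.3f}, MSTR = {mstr:.3f}")
```

Because the overall mean is computed from all observations rather than by averaging the sample means, the sketch also works when the sample sizes differ.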

14 Within-Treatments Estimate of Population Variance $\sigma^2$
• The estimate of $\sigma^2$ based on the variation of the sample observations within each sample is called the mean square due to error and is denoted by MSE.
$$\text{MSE} = \frac{\text{SSE}}{n_T - k} = \frac{\sum_{j=1}^{k} (n_j - 1) s_j^2}{n_T - k}$$
• The numerator is called the sum of squares due to error (SSE).
• The denominator is the degrees of freedom associated with SSE.

15 Within-Treatments Estimate of Population Variance $\sigma^2$
$$\text{SSE} = \sum_{j=1}^{k} (n_j - 1) s_j^2$$
where:
• $k$ is the number of treatments (number of samples)
• $n_j$ is the number of observations in treatment $j$
• $s_j^2$ is the sample variance of treatment $j$
• $n_T$ is the total number of ALL the observations from all the treatments
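A matching sketch for the within-treatments estimate is shown below. It assumes SSE is built from the sample variances, each weighted by its degrees of freedom $n_j - 1$; the function name `within_treatments` and the data are hypothetical.

```python
from statistics import variance

def within_treatments(samples):
    """Return (SSE, MSE) for a list of samples, one list per treatment."""
    k = len(samples)                                # number of treatments
    n_total = sum(len(s) for s in samples)          # n_T, total number of observations
    # SSE: each sample variance s_j^2 weighted by its d.f. (n_j - 1)
    sse = sum((len(s) - 1) * variance(s) for s in samples)
    mse = sse / (n_total - k)                       # divide by the d.f. n_T - k
    return sse, mse

# Hypothetical data: three treatments with unequal sample sizes
samples = [[4, 6, 5], [8, 10], [1, 2, 3, 2]]
sse, mse = within_treatments(samples)
print(f"SSE = {sse:.3f}, MSE = {mse:.3f}")
```

Note that `statistics.variance` is the sample variance (divisor $n_j - 1$), which is what the MSE formula expects.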

16 Comparing the Variance Estimates: The F Test
• Because the within-treatments estimate (MSE) of $\sigma^2$ involves only sample variances, all of which are unbiased estimates of the population variance (by assumption, all the population variances are the same), MSE is a good estimate of the population variance regardless of whether $H_0$ is true.
• The between-treatments estimate (MSTR), which uses the sample means, is a good estimate of $\sigma^2$ if $H_0$ is true, since all the sample means are then drawn from the same population.
• When $H_0$ is false, the sample means are drawn from different populations (with different $\mu$). MSTR will therefore overestimate $\sigma^2$, since the sample means will not be close together.

17 Comparing the Variance Estimates: The F Test
• If the null hypothesis is true and the ANOVA assumptions are valid, the sampling distribution of MSTR/MSE is an F distribution with numerator (MSTR) degrees of freedom (d.f.) equal to $k - 1$ and denominator (MSE) d.f. equal to $n_T - k$.
• If $H_0$ is true, MSTR/MSE should be close to 1, since both are good estimates of $\sigma^2$.
• If $H_0$ is false, i.e. if the means of the $k$ populations are not equal, the ratio MSTR/MSE tends to be larger than 1, since MSTR overestimates $\sigma^2$.
• Hence, we reject $H_0$ if the value of MSTR/MSE proves too large to have resulted by chance from the appropriate F distribution.
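The behavior of the ratio can be illustrated with a small simulation. The sketch below is not from the slides: the population means, the standard deviation of 5, the sample size of 5, and the number of repetitions are all hypothetical choices, and the helper names `f_ratio` and `simulate` are made up for illustration. It draws normal samples with a common variance, once with equal population means and once with unequal ones, and averages MSTR/MSE over many repetitions.

```python
import random
from statistics import mean, variance

def f_ratio(samples):
    """MSTR/MSE for a list of samples, one list per treatment."""
    k = len(samples)
    n_total = sum(len(s) for s in samples)
    grand_mean = mean([x for s in samples for x in s])
    mstr = sum(len(s) * (mean(s) - grand_mean) ** 2 for s in samples) / (k - 1)
    mse = sum((len(s) - 1) * variance(s) for s in samples) / (n_total - k)
    return mstr / mse

def simulate(pop_means, sigma=5.0, n=5, reps=2000, seed=0):
    """Average F ratio over repeated normal samples of size n per population."""
    rng = random.Random(seed)                      # seeded for repeatability
    ratios = []
    for _ in range(reps):
        samples = [[rng.gauss(mu, sigma) for _ in range(n)] for mu in pop_means]
        ratios.append(f_ratio(samples))
    return mean(ratios)

avg_f_null = simulate([60, 60, 60])   # H0 true: equal population means (hypothetical)
avg_f_alt = simulate([55, 60, 65])    # H0 false: unequal population means (hypothetical)
print(f"average MSTR/MSE when H0 is true:  {avg_f_null:.2f}")
print(f"average MSTR/MSE when H0 is false: {avg_f_alt:.2f}")
```

Under these assumptions the average ratio stays in the neighborhood of 1 when the means are equal and is noticeably inflated when they are not, which is the pattern the F test exploits.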

18 Comparing the Variance Estimates: The F Test
Sampling Distribution of MSTR/MSE
[Figure: sampling distribution of MSTR/MSE. Do not reject $H_0$ to the left of the critical value $F_\alpha$; reject $H_0$ in the upper tail of area $\alpha$ to its right.]

19 ANOVA Table

Source of Variation   Sum of Squares   Degrees of Freedom   Mean Square            F          p-Value
Treatments            SSTR             k - 1                MSTR = SSTR/(k - 1)    MSTR/MSE
Error                 SSE              n_T - k              MSE = SSE/(n_T - k)
Total                 SST              n_T - 1

• SST is partitioned into SSTR and SSE.
• SST's degrees of freedom (d.f.) are partitioned into SSTR's d.f. and SSE's d.f.

20 ANOVA Table
• SST divided by its degrees of freedom, $n_T - 1$, is the overall sample variance that would be obtained if we treated the entire set of observations as one data set.
• With the entire data set as one sample, the formula for computing the total sum of squares, SST, is:
$$\text{SST} = \sum_{j=1}^{k} \sum_{i=1}^{n_j} (x_{ij} - \bar{\bar{x}})^2 = \text{SSTR} + \text{SSE}$$
where $x_{ij}$ is observation $i$ in treatment $j$ and $\bar{\bar{x}}$ is the overall mean.

21 ANOVA Table
• ANOVA can be viewed as the process of partitioning the total sum of squares and the degrees of freedom into their corresponding sources: treatments and error.
• Dividing the sum of squares by the appropriate degrees of freedom provides the variance estimates.
• The F value (MSTR/MSE) is used to test the hypothesis of equal population means.
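The partition can be checked numerically. In the sketch below, each sum of squares is computed directly from its own definition (assumed here to be the usual ones: squared deviations from the overall mean for SST, from the sample's own mean for SSE), and the code then confirms that both the sums of squares and the degrees of freedom add up. The function name `anova_sums_of_squares` and the data are hypothetical.

```python
from statistics import mean

def anova_sums_of_squares(samples):
    """Return (SST, SSTR, SSE), each computed directly from its definition."""
    all_obs = [x for s in samples for x in s]
    grand_mean = mean(all_obs)
    # SST: every observation against the overall mean
    sst = sum((x - grand_mean) ** 2 for x in all_obs)
    # SSTR: every sample mean against the overall mean, weighted by n_j
    sstr = sum(len(s) * (mean(s) - grand_mean) ** 2 for s in samples)
    # SSE: every observation against its own sample mean
    sse = sum((x - mean(s)) ** 2 for s in samples for x in s)
    return sst, sstr, sse

samples = [[4, 6, 5], [8, 10], [1, 2, 3, 2]]    # hypothetical data
sst, sstr, sse = anova_sums_of_squares(samples)
k = len(samples)
n_total = sum(len(s) for s in samples)

# Sums of squares and degrees of freedom both partition
assert abs(sst - (sstr + sse)) < 1e-9
assert (n_total - 1) == (k - 1) + (n_total - k)
print(f"SST = {sst:.3f}, SSTR = {sstr:.3f}, SSE = {sse:.3f}")
```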

22 Test for the Equality of k Population Means
• Hypotheses
$H_0$: $\mu_1 = \mu_2 = \mu_3 = \dots = \mu_k$
$H_a$: Not all population means are equal
• Test Statistic
$F = \text{MSTR}/\text{MSE}$

23 Test for the Equality of k Population Means
• Rejection Rule
p-value approach: Reject $H_0$ if p-value $< \alpha$
Critical value approach: Reject $H_0$ if $F > F_\alpha$
where the value of $F_\alpha$ is based on an F distribution with $k - 1$ numerator d.f. and $n_T - k$ denominator d.f.

24 Test for the Equality of k Population Means: An Observational Study
• Example: Reed Manufacturing
Janet Reed would like to know if there is any significant difference in the mean number of hours worked per week for the department managers at her three manufacturing plants (in Buffalo, Pittsburgh, and Detroit). An F test will be conducted using $\alpha = .05$.

25 Test for the Equality of k Population Means: An Observational Study
• Example: Reed Manufacturing
A simple random sample of five managers from each of the three plants was taken, and the number of hours worked by each manager in the previous week is shown on the next slide.
Factor: Manufacturing plant
Treatments: Buffalo, Pittsburgh, Detroit
Experimental units: Managers
Response variable: Number of hours worked

26 Test for the Equality of k Population Means: An Observational Study

Observation       Plant 1 Buffalo   Plant 2 Pittsburgh   Plant 3 Detroit
1                 48                73                   51
2                 54                63                   63
3                 57                66                   61
4                 54                64                   54
5                 62                74                   56
Sample Mean       55                68                   57
Sample Variance   26.0              26.5                 24.5
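The sample means and sample variances in the table can be reproduced from the raw observations. The short Python sketch below uses only the standard library; the variable names `hours` and `summary` are illustrative.

```python
from statistics import mean, variance

# Hours worked in the previous week, five managers per plant (from the table)
hours = {
    "Buffalo":    [48, 54, 57, 54, 62],
    "Pittsburgh": [73, 63, 66, 64, 74],
    "Detroit":    [51, 63, 61, 54, 56],
}

# Sample mean and sample variance (divisor n - 1) for each plant
summary = {plant: (mean(obs), variance(obs)) for plant, obs in hours.items()}
for plant, (m, v) in summary.items():
    print(f"{plant:<10} mean = {m:.1f}  variance = {v:.1f}")
```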

27 Test for the Equality of k Population Means: An Observational Study
1. Develop the hypotheses.
$H_0$: $\mu_1 = \mu_2 = \mu_3$
$H_a$: Not all the means are equal
where:
$\mu_1$ = mean number of hours worked per week by the managers at Plant 1
$\mu_2$ = mean number of hours worked per week by the managers at Plant 2
$\mu_3$ = mean number of hours worked per week by the managers at Plant 3

28 Test for the Equality of k Population Means: An Observational Study
2. Specify the level of significance.
$\alpha = .05$
3. Compute the value of the test statistic.
Mean Square Due to Treatments
$\bar{\bar{x}} = (55 + 68 + 57)/3 = 60$
(The overall mean equals the average of the sample means only when all sample sizes are equal.)
$\text{SSTR} = 5(55 - 60)^2 + 5(68 - 60)^2 + 5(57 - 60)^2 = 490$
$\text{MSTR} = 490/(3 - 1) = 245$

29 Test for the Equality of k Population Means: An Observational Study
3. Compute the value of the test statistic (cont'd).
Mean Square Due to Error
$\text{SSE} = 4(26.0) + 4(26.5) + 4(24.5) = 308$
$\text{MSE} = 308/(15 - 3) = 25.667$
$F = \text{MSTR}/\text{MSE} = 245/25.667 = 9.55$
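The arithmetic of step 3 can be retraced from the summary statistics alone. The sketch below mirrors the slide calculations for the Reed Manufacturing data; the variable names are illustrative, and the shortcut for the overall mean relies on the equal sample sizes in this example.

```python
n = 5                                   # managers sampled per plant
means = [55, 68, 57]                    # sample means (Buffalo, Pittsburgh, Detroit)
variances = [26.0, 26.5, 24.5]          # sample variances
k = len(means)                          # number of treatments
n_total = n * k                         # n_T = 15

grand_mean = sum(means) / k             # valid only because all n_j are equal
sstr = sum(n * (m - grand_mean) ** 2 for m in means)
sse = sum((n - 1) * v for v in variances)
mstr = sstr / (k - 1)                   # mean square due to treatments
mse = sse / (n_total - k)               # mean square due to error
f_stat = mstr / mse
print(f"SSTR = {sstr:.0f}, MSTR = {mstr:.0f}, SSE = {sse:.0f}, MSE = {mse:.3f}, F = {f_stat:.2f}")
```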

30 Test for the Equality of k Population Means: An Observational Study
ANOVA Table

Source of Variation   Sum of Squares   Degrees of Freedom   Mean Square   F      p-Value
Treatment             490              2                    245           9.55   .0033
Error                 308              12                   25.667
Total                 798              14

31 Test for the Equality of k Population Means: An Observational Study
p-Value Approach
4. Compute the p-value.
With 2 numerator d.f. and 12 denominator d.f., the p-value is .0033 for F = 9.55.
5. Determine whether to reject $H_0$.
The p-value < .05, so we reject $H_0$. We have sufficient evidence to conclude that the mean number of hours worked per week by department managers is not the same at all 3 plants.
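The p-value can be reproduced without statistical tables in this particular case. When the numerator has exactly 2 d.f., the upper-tail area of the F distribution has the closed form $P(F > f) = (1 + 2f/d_2)^{-d_2/2}$, where $d_2$ is the denominator d.f. The sketch below uses that special case; it is not the method shown on the slides, the function name `f_upper_tail_df1_2` is hypothetical, and for other numerator d.f. a statistics library (for example `scipy.stats.f.sf`) would be needed instead.

```python
def f_upper_tail_df1_2(f_value, df2):
    """P(F > f_value) for an F distribution with 2 numerator d.f. and df2 denominator d.f.

    Closed form that holds ONLY when the numerator d.f. equals 2.
    """
    return (1 + 2 * f_value / df2) ** (-df2 / 2)

f_stat = 245 / (308 / 12)        # MSTR / MSE from the Reed Manufacturing example
alpha = 0.05
p_value = f_upper_tail_df1_2(f_stat, df2=12)
print(f"p-value = {p_value:.4f}; reject H0: {p_value < alpha}")
```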

32 Test for the Equality of k Population Means: An Observational Study
Critical Value Approach
4. Determine the critical value and rejection rule.
Based on an F distribution with 2 numerator d.f. and 12 denominator d.f., $F_{.05} = 3.89$.
Reject $H_0$ if $F > 3.89$ (critical value).
5. Determine whether to reject $H_0$.
Because $F = 9.55 > 3.89$, we reject $H_0$. We have sufficient evidence to conclude that the mean number of hours worked per week by department managers is not the same at all 3 plants.
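The critical value can be recovered in the same spirit. With exactly 2 numerator d.f., solving $(1 + 2f/d_2)^{-d_2/2} = \alpha$ for $f$ gives $F_\alpha = (d_2/2)(\alpha^{-2/d_2} - 1)$. The sketch below applies that special-case formula; the function name `f_critical_df1_2` is hypothetical, and the formula does not carry over to other numerator d.f., where an F table or a statistics library would be used.

```python
def f_critical_df1_2(alpha, df2):
    """Upper-tail critical value F_alpha for 2 numerator d.f. and df2 denominator d.f.

    Closed form that holds ONLY when the numerator d.f. equals 2.
    """
    return (df2 / 2) * (alpha ** (-2 / df2) - 1)

f_crit = f_critical_df1_2(alpha=0.05, df2=12)
f_stat = 245 / (308 / 12)        # MSTR / MSE from the Reed Manufacturing example
print(f"F_.05 = {f_crit:.2f}; reject H0: {f_stat > f_crit}")
```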

