Chapter 13 Analysis of Variance (ANOVA). ANOVA can be used to test for differences between three or more means. The hypotheses for an ANOVA are always:

1 Chapter 13 Analysis of Variance (ANOVA)

2 ANOVA can be used to test for differences between three or more means. The hypotheses for an ANOVA are always: $H_0: \mu_1 = \mu_2 = \dots = \mu_k$ (where $k$ is the number of groups); $H_a$: not all of the population means are equal.

3 Analysis of Variance: Assumptions. For each population, the response variable is normally distributed. The variance of the response variable, $\sigma^2$, is the same for all of the populations. The observations must be independent.

4 Analysis of Variance (ANOVA) The ANOVA hypothesis test is based on a comparison of the variation between groups (treatments) and within groups (treatments).

5 Between vs. Within Variation
Family  Cars  Attribute 1  Attribute 2
1       1     A            X
2       1     A            Y
3       1     A            Z
4       2     B            X
5       2     B            Y
6       2     B            Z
7       3     C            X
8       3     C            Y
9       3     C            Z

6 Between vs. Within Variation. Case 1: All variation is due to differences between groups.
Group A  Group B  Group C
1        2        3
1        2        3
1        2        3

7 Between vs. Within Variation. Case 2: All variation is due to differences within groups.
Group X  Group Y  Group Z
1        1        1
2        2        2
3        3        3
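The two cases can be checked numerically. The sketch below is an illustration, not part of the original slides: it splits the variation into a between-group and a within-group sum of squares (the quantities the later slides call SSTR and SSE). The function name `sums_of_squares` is a hypothetical helper chosen for this example.

```python
def sums_of_squares(groups):
    """Return (between-group SS, within-group SS) for a list of samples."""
    all_values = [x for g in groups for x in g]
    grand_mean = sum(all_values) / len(all_values)
    between = 0.0
    within = 0.0
    for g in groups:
        group_mean = sum(g) / len(g)
        # Between: how far each group mean sits from the grand mean.
        between += len(g) * (group_mean - grand_mean) ** 2
        # Within: how far each observation sits from its own group mean.
        within += sum((x - group_mean) ** 2 for x in g)
    return between, within


# Case 1 (slide 6): groups A, B, C each hold one repeated value.
case1 = [[1, 1, 1], [2, 2, 2], [3, 3, 3]]
# Case 2 (slide 7): groups X, Y, Z each hold the values 1, 2, 3.
case2 = [[1, 2, 3], [1, 2, 3], [1, 2, 3]]

print(sums_of_squares(case1))  # (6.0, 0.0)
print(sums_of_squares(case2))  # (0.0, 6.0)
```

Both cases have the same total variation (6), but it falls entirely between groups in Case 1 and entirely within groups in Case 2.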

8 Analysis of Variance (ANOVA) If the variation is primarily due to differences between groups, then we would conclude the means are different and reject $H_0$. If the variation is primarily due to differences within the groups, then we do not have evidence that the means differ and we fail to reject $H_0$.

9 Analysis of Variance (ANOVA) The relative sizes of between-group and within-group variation are measured by comparing two estimates of the variance. The first, $n s_{\bar{x}}^2$, is used to estimate $n\sigma_{\bar{x}}^2$ and $\sigma^2$. If the means are equal, $n s_{\bar{x}}^2$ will be an unbiased estimator of $\sigma^2$. If the means are not equal, $n s_{\bar{x}}^2$ will overestimate $\sigma^2$.

10 ANOVA: Sampling Distribution of $\bar{x}$ Given $H_0$ Is True. [Figure: sampling distribution of $\bar{x}$.] Sample means are close together because they are drawn from the same sampling distribution when $H_0$ is true.

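11 ANOVA: Sampling Distribution of $\bar{x}$ Given $H_0$ Is False. [Figure: three sampling distributions centered at $\mu_1$, $\mu_2$, and $\mu_3$, with sample means $\bar{x}_1$, $\bar{x}_2$, and $\bar{x}_3$.] Sample means come from different sampling distributions and are not as close together when $H_0$ is false.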

12 Analysis of Variance (ANOVA) The second way of estimating the population variance is to find the average of the variances of the different groups. This approach provides an unbiased estimate regardless of whether the null hypothesis is true.

13 Analysis of Variance (ANOVA) If we take the ratio of the two estimates (between-groups over within-groups), we have a measure that should be close to 1 if the null hypothesis is true, because both then estimate $\sigma^2$. It will tend to be larger than 1 if the null hypothesis is false.

14 Analysis of Variance (ANOVA) If the null hypothesis is true and the conditions for conducting the ANOVA test are met, then the sampling distribution of the ratio is an F distribution with $k - 1$ degrees of freedom in the numerator and $n_T - k$ degrees of freedom in the denominator.

15 F Distribution [Figure: F distribution curve.]

16 As before, $\alpha$ is the probability of rejecting $H_0$ when it is true (the probability of making a Type I error). $F_\alpha$ is the critical value such that an area equal to $\alpha$ lies in the upper tail. For example, with 5 degrees of freedom in the denominator and 10 degrees of freedom in the numerator, an F value of 4.74 would capture an area equal to 0.05 in the tail.

17 ANOVA Hypothesis Test The steps for conducting an ANOVA hypothesis test are the same as for conducting a hypothesis test of the mean: 1. State the hypotheses. 2. State the rejection rule. 3. Calculate the test statistic. 4. State the result of the test and its implications.

18 ANOVA Table Typically, when we do the test, we organize the calculations in a table with a specific format:
Source of variation  Sum of Squares  Degrees of freedom  Mean Square  F
Treatments           SSTR            k - 1               MSTR         F
Error                SSE             n_T - k             MSE
Total                SST             n_T - 1

19 Between-Treatments Estimate of Population Variance The estimate of $\sigma^2$ based on the variation of the sample means is called the mean square due to treatments and is denoted by MSTR: $\text{MSTR} = \frac{\text{SSTR}}{k - 1} = \frac{\sum_{j=1}^{k} n_j (\bar{x}_j - \bar{\bar{x}})^2}{k - 1}$. The numerator is called the sum of squares due to treatments (SSTR). The denominator is the degrees of freedom associated with SSTR.

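20 Between-Treatments Estimate of Population Variance Assume there are the same number of observations in each group so that $n_j = n$. Then $\text{MSTR} = \frac{n \sum_{j=1}^{k} (\bar{x}_j - \bar{\bar{x}})^2}{k - 1} = n s_{\bar{x}}^2$.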

21 Within-Treatments Estimate of Population Variance The estimate of $\sigma^2$ based on the variation of the sample observations within each sample is called the mean square error and is denoted by MSE: $\text{MSE} = \frac{\text{SSE}}{n_T - k} = \frac{\sum_{j=1}^{k} (n_j - 1) s_j^2}{n_T - k}$. The numerator is called the sum of squares due to error (SSE). The denominator is the degrees of freedom associated with SSE.

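22 Within-Treatments Estimate of Population Variance Assume there are the same number of observations in each group so that $n_j = n$ and $n_T = nk$. Then $\text{MSE} = \frac{(n - 1) \sum_{j=1}^{k} s_j^2}{nk - k} = \frac{\sum_{j=1}^{k} s_j^2}{k}$, the average of the sample variances.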
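With equal group sizes, both estimates reduce to simple summaries of the samples. The following sketch, which is an illustration rather than part of the original slides, uses the car-ownership data from the ANOVA example on slide 25 and Python's standard `statistics` module. The variable names are chosen for this example only.

```python
from statistics import mean, variance

# Car-ownership samples from the slide 25 example (five people per town).
towns = {
    "A": [0, 1, 6, 2, 1],
    "B": [2, 3, 4, 2, 0],
    "C": [12, 0, 2, 6, 4],
}
n = 5  # observations per group (equal group sizes assumed)

# Between-treatments estimate: n times the sample variance of the group means.
sample_means = [mean(values) for values in towns.values()]
mstr = n * variance(sample_means)

# Within-treatments estimate: the average of the group sample variances.
sample_variances = [variance(values) for values in towns.values()]
mse = mean(sample_variances)

print(round(mstr, 2), round(mse, 2))  # 12.2 9.63
```

These match the Mean Square column of the ANOVA table for the example on slide 29.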

23 ANOVA Table With the entire data set as one sample, the formula for computing the total sum of squares, SST, is: $\text{SST} = \sum_{j=1}^{k} \sum_{i=1}^{n_j} (x_{ij} - \bar{\bar{x}})^2$. SST divided by its degrees of freedom, $n_T - 1$, is the overall sample variance that would be obtained if we treated the entire set of observations as one data set.

24 ANOVA Table
Source of Variation  Sum of Squares  Degrees of Freedom  Mean Square  F
Treatments           SSTR            k - 1
Error                SSE             n_T - k
Total                SST             n_T - 1
SST is partitioned into SSTR and SSE. SST's degrees of freedom (d.f.) are partitioned into SSTR's d.f. and SSE's d.f.

25 ANOVA Example Assume we are interested in finding if the average number of cars owned is different for three different towns. Five people are interviewed in each town. Assume $\alpha = .05$.
A  B  C
0  2  12
1  3  0
6  4  2
2  2  6
1  0  4

26 Analysis of Variance (ANOVA) $H_0: \mu_1 = \mu_2 = \mu_3$; $H_a$: not all of the population means are equal. Reject $H_0$ if $F > F_\alpha$, that is, if $F > 3.89$, given $k - 1 = 3 - 1 = 2$ df in the numerator and $n_T - k = 15 - 3 = 12$ df in the denominator.
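The critical value 3.89 would normally be read from an F table. As a cross-check, the sketch below uses a closed-form expression that holds only in the special case of 2 numerator degrees of freedom, where the upper-tail area of the F distribution is $(1 + 2F/d_2)^{-d_2/2}$. The function name `f_crit_numerator_df_2` is a hypothetical helper, and this shortcut is an addition to the slides, not the method they use.

```python
def f_crit_numerator_df_2(alpha, df_denominator):
    """Upper-tail critical value of F(2, df_denominator).

    Valid ONLY for 2 numerator degrees of freedom, where solving
    (1 + 2F/d2) ** (-d2/2) = alpha for F gives a closed form.
    """
    d2 = df_denominator
    return (d2 / 2) * (alpha ** (-2 / d2) - 1)


# k - 1 = 2 numerator df, n_T - k = 12 denominator df, alpha = .05
print(round(f_crit_numerator_df_2(0.05, 12), 3))  # 3.885
```

For other numerator degrees of freedom, a table or a statistics library is needed.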

27 ANOVA Example
      A  B    C
      0  2    12
      1  3    0
      6  4    2
      2  2    6
      1  0    4
Mean  2  2.2  4.8
Overall mean: $\bar{\bar{x}} = (2 + 2.2 + 4.8)/3 = 9/3 = 3$

28 Analysis of Variance (ANOVA) $\text{SSTR} = 5(2-3)^2 + 5(2.2-3)^2 + 5(4.8-3)^2 = 24.4$. $\text{SSE} = (0-2)^2 + (1-2)^2 + (6-2)^2 + (2-2)^2 + (1-2)^2 + (2-2.2)^2 + (3-2.2)^2 + (4-2.2)^2 + (2-2.2)^2 + (0-2.2)^2 + (12-4.8)^2 + (0-4.8)^2 + (2-4.8)^2 + (6-4.8)^2 + (4-4.8)^2 = 115.6$.

29 ANOVA Table
Source of variation  Sum of Squares  Degrees of freedom  Mean Square  F
Treatments           24.4            2                   12.2         1.27
Error                115.6           12                  9.6
Total                140             14
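The table entries can be reproduced from the raw data. The sketch below is a minimal illustration, not taken from the slides: `one_way_anova` is a hypothetical helper that follows the SSTR, SSE, and SST definitions given earlier, and the critical value 3.89 is the one stated on slide 26.

```python
def one_way_anova(groups):
    """Compute the one-way ANOVA table quantities for a list of samples."""
    k = len(groups)
    n_t = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_t
    means = [sum(g) / len(g) for g in groups]

    sstr = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    sse = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    sst = sum((x - grand_mean) ** 2 for g in groups for x in g)

    mstr = sstr / (k - 1)   # mean square due to treatments
    mse = sse / (n_t - k)   # mean square error
    return {
        "SSTR": sstr, "SSE": sse, "SST": sst,
        "df_treatments": k - 1, "df_error": n_t - k, "df_total": n_t - 1,
        "MSTR": mstr, "MSE": mse, "F": mstr / mse,
    }


cars = [[0, 1, 6, 2, 1], [2, 3, 4, 2, 0], [12, 0, 2, 6, 4]]
table = one_way_anova(cars)
print(f"Treatments {table['SSTR']:.1f} {table['df_treatments']} "
      f"{table['MSTR']:.1f} {table['F']:.2f}")
print(f"Error {table['SSE']:.1f} {table['df_error']} {table['MSE']:.1f}")
print(f"Total {table['SST']:.1f} {table['df_total']}")

F_CRIT = 3.89  # critical value from slide 26 (alpha = .05, df = 2 and 12)
decision = "reject H0" if table["F"] > F_CRIT else "fail to reject H0"
print(decision)  # fail to reject H0
```

Since $F = 1.27$ does not exceed 3.89, the data do not show a difference in average car ownership among the three towns at the .05 level.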

30 Anova: Single Factor
SUMMARY
Groups    Count  Sum  Average  Variance
Column 1  5      10   2        5.5
Column 2  5      11   2.2      2.2
Column 3  5      24   4.8      21.2
ANOVA
Source of Variation  SS     df  MS     F      P-value  F crit
Between Groups       24.4   2   12.2   1.266  0.3169   3.885
Within Groups        115.6  12  9.633
Total                140    14
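The P-value in this output can be cross-checked by hand. The sketch below is an added illustration that relies on a special case: with exactly 2 numerator degrees of freedom, the upper-tail area of the F distribution is $(1 + 2F/d_2)^{-d_2/2}$. The function name `f_pvalue_numerator_df_2` is a hypothetical helper, and the formula does not apply to other numerator degrees of freedom.

```python
def f_pvalue_numerator_df_2(f_stat, df_denominator):
    """Upper-tail probability P(F > f_stat) for F(2, df_denominator).

    Valid ONLY when the numerator has 2 degrees of freedom.
    """
    d2 = df_denominator
    return (1 + 2 * f_stat / d2) ** (-d2 / 2)


# F = MS between / MS within, using the SS and df values from the output.
f_stat = (24.4 / 2) / (115.6 / 12)
p_value = f_pvalue_numerator_df_2(f_stat, 12)
print(round(f_stat, 3), round(p_value, 4))  # 1.266 0.3169
```

A P-value of about 0.32 is well above $\alpha = .05$, which agrees with the comparison of $F$ against the critical value: we fail to reject $H_0$.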

31 Graded Homework P. 401, #7 (just do a hypothesis test) Assume we are interested in finding if the average number of bedrooms per home is different for three different towns. Data on four houses were collected in each town. Assume $\alpha = .05$.
A  B  C
1  4  5
3  1  3
2  3  2
2  4  6

