Analysis of Variance: compares means to determine if the population distributions are not similar. Uses means and confidence intervals much like a t-test.


1 Analysis of Variance
Compares means to determine if the population distributions are not similar.
Uses means and confidence intervals much like a t-test.
The test statistic used is called an F statistic (F-test).

2 Normal Distribution
Many measured characteristics approximately follow a normal distribution.
–For example: height, length, speed, etc.
One of the assumptions of the ANOVA test is that the sample data are ‘normally distributed.’

3 Sample Distribution Approaches Normal Distribution as Sample Size Increases


6 Proc Univariate
Tests for normality.
Gives you a ‘visual’ of your sample distribution.
SAS code:
proc sort; by location;
proc univariate plot normal; by location; var length;
run;

7 Important Univariate Output
Tests for Normality
Test                  Statistic   p Value
Shapiro-Wilk          W           Pr < W
Kolmogorov-Smirnov    D           Pr > D      >
Cramer-von Mises      W-Sq        Pr > W-Sq   >
Anderson-Darling      A-Sq        Pr > A-Sq   >
Each of the four tests above is a test for normality. The Shapiro-Wilk and Kolmogorov-Smirnov tests are the two most common. Because all p values are >0.05, none of the tests indicates that our sample is significantly different from a normal distribution.

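8 Mean = $\bar{x} = \frac{\sum_{i=1}^{N} x_i}{N}$
Variance = $\frac{\sum (x - \bar{x})^2}{N-1}$
Standard Deviation = $\sqrt{\frac{\sum (x - \bar{x})^2}{N-1}}$
Standard Error = $\frac{SD}{\sqrt{N}}$
Worked example (N = 6):
Mean = 169/6 = 28.17
Range = 25 – 32
SOS (sum of squares) = 40.83
Variance = 40.83/5 = 8.17
Std. Dev. = √(40.83/5) = 2.86
Std. Err. = 2.86/√6 = 1.17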
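The arithmetic on this slide can be checked with a few lines of Python. This is only a sketch: the slide does not list the six raw values, so the `lengths` below are hypothetical numbers chosen to match its summary figures (sum 169, range 25 to 32, SOS 40.83).

```python
import math
import statistics

# Hypothetical lengths: the raw data are not on the slide, so these six
# values were picked only because they reproduce its summary numbers.
lengths = [25, 26, 26, 30, 30, 32]

n = len(lengths)
mean = sum(lengths) / n                        # 169 / 6
sos = sum((x - mean) ** 2 for x in lengths)    # sum of squared deviations
variance = sos / (n - 1)                       # sample variance
std_dev = math.sqrt(variance)                  # standard deviation
std_err = std_dev / math.sqrt(n)               # standard error of the mean

# Cross-check the hand calculation against the standard library.
assert math.isclose(variance, statistics.variance(lengths))

print(f"mean={mean:.2f} SOS={sos:.2f} variance={variance:.2f} "
      f"SD={std_dev:.2f} SE={std_err:.2f}")
```

Note that the variance divides by N-1 (5), not N, because it is a sample variance.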

9 ANOVA – Analysis of Variance
Calculate a SOS based on an overall mean (Total SOS).

10 Columns: Trtmnt, Replicate, Length, Overall Mean, SOS Total
Rows: 10 Pond replicates followed by 10 Lake replicates
This provides a measure of the overall variance (Total SOS).
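A minimal sketch of the Total SOS calculation follows. The Pond and Lake lengths are hypothetical, since the values from the slide's table were not preserved, and `total_sos` is an illustrative helper name.

```python
# Hypothetical lengths, 10 replicates per treatment (not the slide's data).
pond = [34, 36, 35, 33, 37, 35, 34, 36, 35, 35]
lake = [29, 31, 30, 28, 32, 30, 29, 31, 30, 30]


def total_sos(groups):
    """Sum of squared deviations of every observation from the overall mean."""
    values = [x for group in groups for x in group]
    overall_mean = sum(values) / len(values)
    return sum((x - overall_mean) ** 2 for x in values)


print("Total SOS:", total_sos([pond, lake]))
```

Treatment labels are ignored here: every length is compared with the single overall mean.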

11 Calculate a SOS for each treatment, based on that treatment's own mean (Treatment or Error SOS).

12 Columns: Trtmnt, Replicate, Length, Trtmnt Mean, SOS Error
Rows: 10 Pond replicates followed by 10 Lake replicates
This provides a measure of the reduction of variance by measuring each treatment separately (Treatment or Error SOS).
What happens to Error SOS when the variability within each treatment decreases?
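One way to explore the question above is to compute Error SOS for two data sets with the same treatment means but different scatter. The sketch below uses hypothetical values and an illustrative helper, `error_sos`.

```python
def error_sos(groups):
    """Sum of squared deviations of each observation from its own treatment mean."""
    total = 0.0
    for group in groups:
        group_mean = sum(group) / len(group)
        total += sum((x - group_mean) ** 2 for x in group)
    return total


# Hypothetical data: both versions have treatment means of 35 and 30.
spread_pond, spread_lake = [31, 39, 35, 33, 37], [26, 34, 30, 28, 32]
tight_pond, tight_lake = [34, 36, 35, 35, 35], [29, 31, 30, 30, 30]

spread = error_sos([spread_pond, spread_lake])
tight = error_sos([tight_pond, tight_lake])
print("Error SOS, scattered data:", spread)
print("Error SOS, tight data:    ", tight)
```

With the means held fixed, less scatter within each treatment gives a smaller Error SOS.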

13 Calculate a SOS for each predicted value vs. the overall mean (Model SOS)

14 Columns: Trtmnt, Replicate, Length, Trtmnt Mean, Overall Mean, SOS Model
Rows: 10 Pond replicates followed by 10 Lake replicates
This provides a measure of the distance between the mean values (Model SOS).
What happens to Model SOS when the two means are close together? What if the means are equal?
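The two questions above can be tried numerically. In this sketch each observation's predicted value is its treatment mean, so Model SOS is the squared distance from each treatment mean to the overall mean, counted once per observation. The data and the helper names `model_sos` and `shifted` are hypothetical.

```python
def model_sos(groups):
    """Squared distance of each treatment mean from the overall mean,
    counted once per observation in that treatment."""
    values = [x for group in groups for x in group]
    overall_mean = sum(values) / len(values)
    return sum(
        len(group) * (sum(group) / len(group) - overall_mean) ** 2
        for group in groups
    )


# Hypothetical scatter pattern reused around different treatment means.
deviations = [-1, 1, 0, -2, 2]


def shifted(center):
    """Return five hypothetical lengths whose mean is `center`."""
    return [center + d for d in deviations]


far = model_sos([shifted(35), shifted(30)])    # means 5 units apart
close = model_sos([shifted(31), shifted(30)])  # means 1 unit apart
equal = model_sos([shifted(30), shifted(30)])  # identical means
print(far, close, equal)
```

Model SOS shrinks as the means move together and is zero when they are equal.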

15 Detecting a Difference Between Treatments
Model SOS gives us an index of how far apart the two means are from each other.
–Bigger Model SOS = farther apart
Error SOS gives us an index of how scattered the data are within each treatment.
–More variability = larger Error SOS = more possible overlap between treatments

16 Magic of the F-test
The ratio of Model SOS to Error SOS, with each SOS first divided by its degrees of freedom to give a mean square (Model Mean Square divided by Error Mean Square), gives us an overall index (the F statistic) used to indicate the relative ‘distance’ and ‘overlap’ between two means.
–A large Model SOS and a small Error SOS = a large F statistic. Why does this indicate a significant difference?
–A small Model SOS and a large Error SOS = a small F statistic. Why does this indicate no significant difference?
Based on the degrees of freedom (set by the sample size and the number of treatments), each F statistic has an associated P-value, which is compared with the alpha level (usually 0.05).
–P < 0.05 (large F statistic): there is a significant difference between the means
–P ≥ 0.05 (small F statistic): there is NO significant difference
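Putting the pieces together, the F statistic can be sketched in plain Python. The Pond and Lake lengths are hypothetical, `one_way_anova` is an illustrative helper rather than a SAS or library routine, and the critical value 4.41 is the standard tabled F value for alpha = 0.05 with 1 and 18 degrees of freedom. It stands in for the exact P-value that SAS reports.

```python
def one_way_anova(groups):
    """Return sums of squares, degrees of freedom and the F statistic."""
    values = [x for group in groups for x in group]
    n_total, n_groups = len(values), len(groups)
    overall_mean = sum(values) / n_total
    means = [sum(group) / len(group) for group in groups]

    model_sos = sum(
        len(g) * (m - overall_mean) ** 2 for g, m in zip(groups, means)
    )
    error_sos = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)

    df_model = n_groups - 1          # number of treatments minus 1
    df_error = n_total - n_groups    # observations minus treatments
    f_value = (model_sos / df_model) / (error_sos / df_error)
    return {
        "model_sos": model_sos, "error_sos": error_sos,
        "df_model": df_model, "df_error": df_error, "f": f_value,
    }


# Hypothetical lengths, 10 replicates per location (not the slide's data).
pond = [34, 36, 35, 33, 37, 35, 34, 36, 35, 35]
lake = [29, 31, 30, 28, 32, 30, 29, 31, 30, 30]

result = one_way_anova([pond, lake])
F_CRIT_1_18 = 4.41  # tabled critical value, alpha = 0.05, df = (1, 18)
print(result)
print("Significant at 0.05:", result["f"] > F_CRIT_1_18)
```

For these made-up data the means are far apart relative to the scatter, so F comes out near 93.75, well above the critical value.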

17 SAS Program with ANOVA added
data chicken;
input location$ replicate length;
cards;
Data Set not shown
;
proc print; run;
proc sort; by location;
/* proc means mean n var stddev cv stderr clm; by location; var length; run; */
proc anova;               /* Tells SAS to do the analysis of variance procedure */
class location;           /* Tells SAS that location is a class variable */
model length=location;    /* Tells SAS to compare length between locations */
run;
SAS ignores everything between /* and */, so the proc means step above does not run.

18 Data Set: 20 total observations
The SAS System 10:10 Monday, June 19,
Columns: Obs, location, replicate, length
Rows: 10 Pond observations followed by 10 Lake observations
Two locations with 10 replicates each.
The length column holds the individual lengths.

19 SAS ANOVA Output 1st Page
The ANOVA Procedure
Class Level Information
Class      Levels   Values
location   2        Lake Pond
Number of Observations Read 20
Number of Observations Used 20
This tells us that SAS understands that there are two classes: Lake and Pond. We are also told that SAS can use all 20 values in this ANOVA procedure.

20 SAS ANOVA Output 2nd Page
The SAS System 10:10 Monday, June 19,
The ANOVA Procedure
Dependent Variable: length
Columns: Source, DF, Sum of Squares, Mean Square, F Value, Pr > F
Rows: Model, Error, Corrected Total
Columns: R-Square, Coeff Var, Root MSE, length Mean
Columns: Source, DF, Anova SS, Mean Square, F Value, Pr > F
Row: location
In each row, Sum of Squares ÷ DF = Mean Square. Model Mean Square ÷ Error Mean Square = F Value. Pr > F is the P-value.
What are some ways to make the F Value larger?

21 What about analysis of variance with three treatments?
Data Set: 3 treatments with 5 replicates per treatment
Columns: Treatment, Mean

22 The SAS System 13:17 Monday, October 4,
The ANOVA Procedure
Class Level Information
Columns: Class, Levels, Values
Row: treat
Number of observations 15
Callouts on the slide: Variable name, Variable labels

23 The SAS System 13:17 Monday, October 4,
The ANOVA Procedure
Dependent Variable: size
Columns: Source, DF, Sum of Squares, Mean Square, F Value, Pr > F
Rows: Model, Error, Corrected Total
Columns: R-Square, Coeff Var, Root MSE, size Mean
Columns: Source, DF, Anova SS, Mean Square, F Value, Pr > F
Row: treat
Callouts on the slide: F-value, P-value
Columns: Treatment, Mean
Which means are different/similar?

24 Delineating the Means With SAS
proc anova;                      /* Tells SAS to do the analysis of variance procedure */
class treatment;                 /* Tells SAS that treatment is a class variable */
model weight=treatment;          /* Tells SAS to compare weight among treatments */
means treatment / tukey lines;   /* Tells SAS to delineate the means with a Tukey test and use the lines method to show differences */
run;

25 The SAS System 13:17 Monday, October 4,
The ANOVA Procedure
Tukey's Studentized Range (HSD) Test for size
NOTE: This test controls the Type I experimentwise error rate, but it generally has a higher Type II error rate than REGWQ.
Alpha 0.05
Error Degrees of Freedom 12
Error Mean Square 10.5
Critical Value of Studentized Range
Minimum Significant Difference
Means with the same letter are not significantly different.
Columns: Tukey Grouping, Mean, N, treat
Grouping letters: A A A B
Treatments 1 and 3 are not different from each other; both are different from treatment 2.
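The Minimum Significant Difference in this output comes from the usual Tukey HSD formula for equal group sizes: the studentized range critical value times the square root of (Error Mean Square / replicates per treatment). The sketch below uses the Error Mean Square (10.5) and 5 replicates from the slides. The critical value 3.77 is an assumption taken from a standard studentized range table (alpha 0.05, 3 treatments, 12 error degrees of freedom), not the figure from the lost SAS output, and the treatment means in the usage lines are hypothetical.

```python
import math


def minimum_significant_difference(q_crit, error_mean_square, n_per_treatment):
    """Tukey HSD threshold for equal group sizes: q * sqrt(MSE / n)."""
    return q_crit * math.sqrt(error_mean_square / n_per_treatment)


def significantly_different(mean_a, mean_b, msd):
    """Two means differ if their absolute difference exceeds the MSD."""
    return abs(mean_a - mean_b) > msd


Q_CRIT = 3.77  # assumed tabled value: alpha 0.05, 3 treatments, 12 error df
msd = minimum_significant_difference(Q_CRIT, error_mean_square=10.5,
                                     n_per_treatment=5)
print(f"Minimum significant difference: {msd:.2f}")

# Hypothetical treatment means, only to show how the threshold is applied.
print(significantly_different(20.0, 30.0, msd))  # 10 apart
print(significantly_different(20.0, 22.0, msd))  # 2 apart
```

Any pair of treatment means farther apart than this threshold gets different Tukey letters; closer pairs share a letter.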

26 Showing Results
Tukey grouping letters shown with the treatment means: A B A

