Analysis of Variance STAT E-150 Statistical Methods.



2 In Analysis of Variance, we are testing for the equality of the means of several levels of a variable. The technique is to compare the variation between the levels and the variation within each level. (The levels of the variable are also referred to as groups, or treatments.) If the variation due to the level (variation between levels) is significantly larger than the variation within each level, then we can conclude that the means of the levels are not all equal.

3 We will test the hypothesis
H0: μ1 = μ2 = ∙∙∙ = μk vs. Ha: the means are not all equal
using the ratio
F = (variation between the levels) / (variation within the levels)
When the numerator is large compared to the denominator, we will reject the null hypothesis.

4 ∙ The numerator of F measures the variation between groups; this is called the Mean Square for Groups: MSGroups = SSGroups/df = SSGroups/(k - 1)
∙ The denominator of F measures the variation within groups; this is called the Mean Square Error: MSError = SSError/df = SSError/(n - k)
∙ The test statistic is F = MSGroups/MSError. We will reject H0 when F is large.
∙ MSGroups has k - 1 degrees of freedom, where k is the number of groups.
∙ MSError has n - k degrees of freedom, where n is the total sample size.
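To make the bookkeeping concrete, here is a minimal Python sketch that computes SSGroups, SSError, the mean squares, and F exactly as defined above. The three groups are made-up illustrative data, not the study data used later in these slides:

```python
from statistics import mean

# Made-up data: three groups of four observations each
groups = [
    [2.0, 4.0, 3.0, 5.0],   # group 1
    [6.0, 7.0, 5.0, 8.0],   # group 2
    [1.0, 2.0, 2.0, 3.0],   # group 3
]

k = len(groups)                      # number of groups
n = sum(len(g) for g in groups)      # total sample size
grand_mean = mean(x for g in groups for x in g)

# SSGroups: variation of the group means around the grand mean,
# weighted by group size
ss_groups = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)

# SSError: variation of each observation around its own group mean
ss_error = sum((x - mean(g)) ** 2 for g in groups for x in g)

ms_groups = ss_groups / (k - 1)      # k - 1 degrees of freedom
ms_error = ss_error / (n - k)        # n - k degrees of freedom
f_stat = ms_groups / ms_error
```

For these groups SSGroups = 42 and SSError = 12, so F = 21/(12/9) = 15.75; a numerator this large relative to the denominator favors rejecting H0. Note also that SSGroups + SSError equals the total sum of squares, the identity behind the ANOVA table on the next slide.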

5 The ANOVA table:

Source   df      SS         MS         F                   p
Model    k - 1   SSGroups   MSGroups   MSGroups/MSError
Error    n - k   SSError    MSError
Total    n - 1   SSTotal

6 If the null hypothesis is true, the groups have a common mean, μ. Each group mean μk may differ from the grand mean, μ, by some value. This difference is called the group effect, and we denote this value for the kth group by αk.

7 One-Way Analysis of Variance Model

The ANOVA model for a quantitative response variable and a single categorical explanatory variable with k levels is

Response = Grand Mean + Group Effect + Error Term
Y = μ + αk + ε

The Grand Mean (μ) is the part of the model that is common to all observations. The Group Effect (αk) is the variability between groups. The residual, or error term (ε), is the variability within groups. Since μk = μ + αk, we can write this model as

Y = μk + ε, where ε ~ N(0, σε) and the errors are independent.

That is, the errors are approximately normally distributed with a mean of 0 and a common standard deviation, and are independent.
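The decomposition Y = μ + αk + ε can be checked numerically. This sketch uses made-up data with just two groups for brevity, and verifies that each observation is exactly the grand mean plus its group effect plus its residual:

```python
from statistics import mean

# Made-up data: two groups, values chosen so the means are exact
groups = {"A": [10.0, 12.0, 11.0], "B": [14.0, 16.0, 15.0]}

all_values = [x for g in groups.values() for x in g]
grand = mean(all_values)                  # μ, common to all observations

for name, values in groups.items():
    effect = mean(values) - grand         # α_k: group mean minus grand mean
    for y in values:
        residual = y - mean(values)       # ε: observation minus group mean
        # The model identity: each Y is exactly μ + α_k + ε
        assert abs(y - (grand + effect + residual)) < 1e-12
```

Here the grand mean is 13, group A has effect −2 and group B has effect +2; the (size-weighted) group effects always sum to zero, which is why they are called deviations from the grand mean.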

8 The assumptions for a One-Way ANOVA are:

1. Independence Assumption
The groups must be independent of each other, and the subjects within each group must be randomly assigned. Think about how the data was collected: Were the data collected randomly or generated from a randomized experiment? Were the treatments randomly assigned to experimental groups?

9 2. Equal Variance Assumption
The variances of the treatment groups are equal. Look at side-by-side boxplots of the data to see if the spreads are similar; also check whether the spreads change systematically with the centers and whether the data in any group is skewed. If either of these patterns is present, a transformation of the data may be appropriate. Also plot the residuals against the predicted values to see if larger predicted values lead to larger residuals; this may also suggest that a reexpression should be considered.

10 3. Normal Population Assumption
The values for each treatment group are normally distributed. Again, check side-by-side boxplots of the data for indications of skewness and outliers.

11 Example: A study reported in 1994 compared different psychological therapies for teenaged girls with anorexia. Each girl's weight was measured before and after a period of treatment designed to aid weight gain. One group received a cognitive-behavioral treatment, a second group received family therapy, and the third group was a control group which received no therapy. The subjects in this study were randomly assigned to these groups. The weight change was calculated as weight at the end of the study minus weight at the beginning of the study; the weight change was positive if the subject gained weight and negative if she lost weight. What does this data indicate about the relative success of the three treatments? Note that in this analysis, the explanatory variable (type of therapy) is categorical and the response variable (weight change) is quantitative.

12 The hypotheses are:
H0: μ1 = μ2 = μ3
Ha: the means are not all equal
Note that the alternative hypothesis is not μ1 ≠ μ2 ≠ μ3; the means do not all have to differ from one another.

13 Some of the data is shown below. For SPSS analysis, the data should be entered with the group in one column and the data in a second column:

Group   WeightGain

14 First we will see if the equal variance condition is met, by comparing side-by-side boxplots of the data: The boxplots do not show a great deal of difference in the spread of the data, but are not conclusive.

15 We can compare the largest and smallest standard deviations; if their ratio is less than or equal to 2, then we can assume that the variances are similar. In this case Smax = 7.99 and Smin = 7.16. The ratio is 7.99/7.16 ≈ 1.116, which is less than 2, and so we can assume that the equal variance condition is met.
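This rule of thumb is easy to script. A sketch with made-up groups (the slide's actual values were Smax = 7.99 and Smin = 7.16):

```python
from statistics import stdev

# Made-up data: three groups with broadly similar spreads
groups = [
    [63.0, 71.0, 58.0, 66.0, 70.0],
    [61.0, 80.0, 55.0, 72.0, 64.0],
    [59.0, 65.0, 74.0, 68.0, 60.0],
]

sds = [stdev(g) for g in groups]          # sample standard deviations
ratio = max(sds) / min(sds)

# Rule of thumb from the slide: ratio <= 2 means the variances
# are similar enough for the equal variance condition
equal_variance_ok = ratio <= 2
```

For these groups the ratio is about 1.84, under the cutoff of 2, so the rough check passes. Note this is only a screening device; Levene's test on the next slide is the formal version.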

16 We can also use Levene's test. This test for homogeneity of variances tests the null hypothesis that the population variances are equal:
H0: σ1² = σ2² = σ3²
Ha: the variances are not all equal
Since the p-value is large (.731), we cannot reject this null hypothesis; the data shows no evidence that the equal variance assumption is violated.
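For reference, the statistic SPSS reports here can be computed by hand. This is a sketch of the mean-centered version of Levene's statistic on made-up data; SPSS then compares W against an F(k − 1, n − k) distribution to obtain the p-value:

```python
from statistics import mean

# Made-up data: three groups
groups = [
    [12.0, 15.0, 11.0, 14.0],
    [22.0, 20.0, 25.0, 21.0],
    [31.0, 36.0, 30.0, 35.0],
]

# Z_ij = |x_ij - group mean|: Levene's test is essentially a
# one-way ANOVA performed on these absolute deviations
z = [[abs(x - mean(g)) for x in g] for g in groups]

k = len(groups)                          # number of groups
n = sum(len(g) for g in groups)          # total sample size
z_bar = mean(v for row in z for v in row)

between = sum(len(row) * (mean(row) - z_bar) ** 2 for row in z)
within = sum((v - mean(row)) ** 2 for row in z for v in row)

# Levene's W has an approximate F(k - 1, n - k) distribution under H0
levene_w = ((n - k) / (k - 1)) * between / within
```

For these numbers W = 12/7 ≈ 1.71. (SPSS's default uses the group mean as the center, as here; a robust variant, the Brown-Forsythe test, centers on the median instead.)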

17 We can check the Normality condition with Normal Probability Plots of the three groups:

18 We can also use the table shown below to assess Normality, using a hypothesis test where the null hypothesis is that the distribution is normal. The p-values for groups 1 and 3 are larger than .05, so this null hypothesis is not rejected for these groups. For the moment, we will assume that the conditions are met.

Tests of Normality: WeightGain by Treatment
Kolmogorov-Smirnov (Statistic, df, Sig.)   Shapiro-Wilk (Statistic, df, Sig.)
a. Lilliefors Significance Correction
*. This is a lower bound of the true significance.

19 The SPSS output includes the following ANOVA table:

ANOVA: Gain
                 Sum of Squares   df   Mean Square   F   Sig.
Between Groups
Within Groups
Total

You can see the F statistic in the table, and that the p-value is .006. Since p is small, we reject the null hypothesis that the means are all equal. This data provides evidence of a difference in the mean weight gain for the three groups. But where is this difference?

20 Which group had the greatest mean weight gain? Group 3
Which group had the lowest mean weight gain? Group 1

Descriptives: Gain
Group   N   Mean   Std. Deviation   Std. Error   95% CI for Mean (Lower Bound, Upper Bound)   Minimum   Maximum
Total

21 Is either of these values significantly different from the other group means? Are all three groups different in terms of weight gain? We can answer these questions using a post-hoc test, Tukey's Honestly Significant Difference test, which compares all pairs of group means.

22 Here is one result of this test:

Multiple Comparisons: Gain, Tukey HSD
(I) Group   (J) Group   Mean Difference (I-J)   Std. Error   Sig.   95% CI (Lower Bound, Upper Bound)
*. The mean difference is significant at the 0.05 level.

The first line shows the comparison between Group 1 and Group 2. The mean difference is not significant, since p = .212.

23 The next line shows that the difference between Group 1 and Group 3 is significant; not only is p = .005, but SPSS shows an asterisk beside the mean difference to indicate that the difference is significant.

24 What conclusions can you draw about the difference between Group 2 and Group 3? The difference between Group 2 and Group 3 is not significant.

25 Since p = .161, the difference between Group 2 and Group 3 is not significant. Are the means different for all three groups?

26 Are the means different for all three groups? The three means are not all different; the only significant difference is between the means of Group 1 and Group 3.

27 Hypothesis Tests and Confidence Intervals

A pair of means can be considered significantly different at a .05 level of significance if and only if zero is not contained in a 95% confidence interval for their difference. We can use Fisher's Least Significant Difference to determine where any differences lie by identifying any confidence intervals which do not contain 0.
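The "zero in the interval" rule can be sketched directly. Every number below is made up for illustration (the MSE, the group sizes, and the difference of means), and t_crit is an assumed approximate two-sided .05 critical value, since plain Python has no t-distribution quantile function:

```python
from math import sqrt

mse = 53.0          # made-up MSError from an ANOVA table
n_i, n_j = 26, 17   # made-up group sizes
diff = -8.65        # made-up difference of the two group means
t_crit = 2.0        # assumption: approximate t critical value, alpha = .05

# CI for mu_i - mu_j: difference +/- t * sqrt(MSE * (1/n_i + 1/n_j))
half_width = t_crit * sqrt(mse * (1 / n_i + 1 / n_j))
ci = (diff - half_width, diff + half_width)

# The slide's rule: the pair differs significantly at the .05 level
# exactly when 0 lies outside the 95% CI for the difference
significant = not (ci[0] <= 0 <= ci[1])
```

With these inputs the interval lies entirely below zero, so the pair would be declared significantly different. Tukey's HSD uses the same form of interval but a larger critical value (from the studentized range), which is why it declares fewer differences significant than unadjusted pairwise intervals.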

28 Are the means different for all three groups? Are the results the same as when we used Tukey's HSD?

29 Are the means different for all three groups? Are the results the same as when we used Tukey's HSD? The only confidence interval that does not contain 0 is the CI for the difference of the means of Group 1 and Group 3. This indicates that the means for these two groups are different.

30 How else can we follow up on this analysis? Since the groups are independent, we can do our own pairwise t-tests for the difference of the means.

31 How else can we follow up on this analysis? Since the groups are independent, we can do our own pairwise t-tests for the difference of the means.

Two-sample t-tests
H0: μ1 - μ2 = 0
Ha: μ1 - μ2 ≠ 0 (or > 0 or < 0)
Assumptions: independent random samples; approximately Normal distributions for both samples
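A minimal sketch of the pooled two-sample t statistic for this hypothesis, computed on made-up data rather than the study's data (the decision would compare |t| against a t critical value on n1 + n2 − 2 degrees of freedom):

```python
from math import sqrt
from statistics import mean, variance

# Made-up data for two independent groups
g1 = [3.0, 1.0, 4.0, 2.0, 5.0]
g2 = [6.0, 4.0, 7.0, 5.0, 8.0]

n1, n2 = len(g1), len(g2)

# Pooled variance: df-weighted average of the two sample variances,
# appropriate under the equal variance assumption
sp2 = ((n1 - 1) * variance(g1) + (n2 - 1) * variance(g2)) / (n1 + n2 - 2)

# Test statistic for H0: mu1 - mu2 = 0
t = (mean(g1) - mean(g2)) / sqrt(sp2 * (1 / n1 + 1 / n2))
```

For these groups t = −3.0 on 8 degrees of freedom. One caution the slides return to: running many unadjusted pairwise t-tests inflates the familywise error rate, which is exactly the problem Tukey's HSD corrects for.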

32 Here are the results for the test of
H0: μ1 - μ2 = 0
Ha: μ1 - μ2 ≠ 0
Is there a significant difference between the means of these groups? What is your statistical conclusion? Be sure to state the p-value. What is your conclusion in context?

33 Is there a significant difference between the means of these groups? What is your statistical conclusion? p = .100. Since p > .05, the null hypothesis is not rejected. What is your conclusion in context? The data does not indicate that there is a significant difference between the mean weight gain with cognitive-behavioral treatment and family therapy.

34 What else can be concluded? If the data was gathered in a well-designed experiment in which subjects were randomly assigned to treatment groups, then we can conclude causality. In an observational study in which random samples are taken from the populations, the results can be extended to the associated populations.

35 SPSS Instructions for ANOVA

To create side-by-side boxplots of the data: Assume that your file has the groups in one column and the values of the variable in a second column. Choose > Graphs > Chart Builder. Choose Boxplot and drag the first boxplot (Simple) to the preview area. Drag the column with the groups to the x-axis, and the column with the values of the response variable to the y-axis. Click OK.

36 To create Normal Probability Plots of the data: Choose > Analyze > Descriptive Statistics > Explore In the Explore dialog box, choose the Dependent List variable and the Factor List variable. Click on Plots. Click OK.

37 To perform a One-Way Analysis of Variance Choose > Analyze > Compare Means > One-Way ANOVA Choose the Dependent List variable and the Factor List variable. Click on Options, and under Statistics, choose Descriptive and Homogeneity of Variance Test. Click on Continue and then OK.

38 To perform Tukey's Honestly Significant Difference test Choose > Analyze > Compare Means > One-Way ANOVA (The variables may still be selected, so you may not have to enter the Dependent List variable and the Factor List variable.) Click on Post-Hoc, and select Tukey. Note that you can also select LSD to choose Fisher's Least Significant Difference test.