Analysis of Variance (ANOVA)

1 Analysis of Variance (ANOVA)

2 Agenda
Lab stuff
Questions about chi-square?
Intro to Analysis of Variance (ANOVA)

3 This Thursday: Lab 4
The final lab will be distributed on Thursday.
It is very similar to lab 3, but with different data.
You will be expected to find appropriate variables for three major tests (correlation, t-test, chi-square test of independence).
You will be expected to interpret the findings from each test (one short paragraph per test).
We will use the first 15 minutes of class to return lab 3 and discuss common issues and questions.

4 Example Crosstab: Gender x Student
           Student       Not Student    Total
Males      46 (40.97)    71 (76.02)     117
Females    37 (42.03)    83 (77.97)     120
Total      83            154            237
Cells show Observed (Expected) counts.
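
A minimal sketch (not from the slides) of how the expected counts in this table can be reproduced; each expected count is (row total x column total) / grand total. It assumes Python with numpy and scipy available.

    # Reproducing the expected counts from the crosstab above.
    import numpy as np
    from scipy.stats import chi2_contingency

    observed = np.array([[46, 71],    # Males:   Student, Not Student
                         [37, 83]])   # Females: Student, Not Student

    chi2, p, dof, expected = chi2_contingency(observed, correction=False)
    print(expected)       # matches the expected counts in the table (to rounding)
    print(chi2, p, dof)   # the chi-square test of independence itself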

5 ANOVA

6 Analysis of Variance
In its simplest form, it is used to compare means across three or more categories.
Example: Income (metric) by Marital Status (several categories)
Relies on the F-distribution.
Just like the t-distribution and chi-square distribution, the F-distribution is a family of sampling distributions, one for each combination of degrees of freedom.
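
A minimal sketch of the income-by-marital-status example; the group names and numbers below are made up for illustration (they are not the class data).

    # One-way ANOVA: does mean income differ across three marital-status groups?
    from scipy.stats import f_oneway

    married  = [52, 61, 58, 47, 66, 59]
    single   = [41, 38, 50, 44, 39, 46]
    divorced = [45, 49, 43, 55, 40, 48]

    f_stat, p_value = f_oneway(married, single, divorced)
    print(f"F = {f_stat:.2f}, p = {p_value:.4f}")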

7 What is ANOVA?
If we have a categorical variable with 3+ categories and a metric/scale variable, we could just run three pairwise t-tests.
One problem is that the three tests would not be independent of each other (they reuse the same group information).
As the number of comparisons grows, some "significant" differences are expected by chance alone, and these do not necessarily indicate an overall difference (see the sketch below).
A better approach: compare the variability between groups (treatment variance + error) to the variability within the groups (error).
Think of the overall error rate as roughly the sum of the error rates of the individual t-tests (it is actually slightly less than this).
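
A small simulation of my own (not from the slides) showing why multiple t-tests inflate the error rate: with three groups drawn from the same population, running all three pairwise t-tests at alpha = .05 flags at least one "significant" difference far more often than 5% of the time.

    import numpy as np
    from itertools import combinations
    from scipy.stats import ttest_ind

    rng = np.random.default_rng(0)
    n_sims, false_alarms = 2000, 0
    for _ in range(n_sims):
        groups = [rng.normal(0, 1, 30) for _ in range(3)]   # no true differences
        pvals = [ttest_ind(a, b).pvalue for a, b in combinations(groups, 2)]
        if min(pvals) < 0.05:
            false_alarms += 1

    print(false_alarms / n_sims)   # well above 0.05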

8 The F-ratio
F = MS_bg / MS_wg, where MS = mean square, bg = between groups, wg = within groups.
The numerator and denominator have their own degrees of freedom: numerator df = # of categories - 1 (k - 1); denominator df = # of cases - # of categories (N - k).
In the simplest terms, it is the ratio of the variance between the group means to the variance within the groups.
Named after R.A. Fisher.
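
A sketch of my own computing the F-ratio by hand for the made-up income data used above, then checking it against scipy.

    import numpy as np
    from scipy.stats import f_oneway

    groups = [np.array([52, 61, 58, 47, 66, 59]),
              np.array([41, 38, 50, 44, 39, 46]),
              np.array([45, 49, 43, 55, 40, 48])]

    k = len(groups)                      # number of groups
    N = sum(len(g) for g in groups)      # total sample size
    grand_mean = np.concatenate(groups).mean()

    ss_bg = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)
    ss_wg = sum(((g - g.mean()) ** 2).sum() for g in groups)

    ms_bg = ss_bg / (k - 1)              # between-groups mean square, df = k - 1
    ms_wg = ss_wg / (N - k)              # within-groups mean square,  df = N - k
    F = ms_bg / ms_wg

    print(F, f_oneway(*groups).statistic)   # the two values should match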

9 Interpreting the F-ratio
Generally, an F-ratio is a measure of how different the means are relative to the variability within each sample.
Larger values → greater likelihood that the difference between means is not due to chance alone.
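
A minimal sketch of turning an observed F-ratio into a p-value using the upper tail of the F-distribution; the F value and degrees of freedom below are assumed for illustration.

    from scipy.stats import f

    F_obs = 4.5    # hypothetical observed F-ratio
    df_bg = 2      # k - 1, with k = 3 groups
    df_wg = 15     # N - k, with N = 18 cases

    p_value = f.sf(F_obs, df_bg, df_wg)   # survival function = P(F > F_obs)
    print(p_value)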

10 Null Hypothesis in ANOVA
If there is no difference between the means, then the between-group mean square should be about equal to the within-group mean square, so F should be close to 1.
Why do we use sums of squares? Because whenever we have three or more numerical values, the measure of their variability is equivalent to the measure of their aggregate differences.
If this were a t-test, we would have just taken the difference between the two means.
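
A sketch of my own showing why sums of squares are the natural bookkeeping device here: the total sum of squares splits exactly into a between-groups piece and a within-groups piece (same made-up data as above).

    import numpy as np

    groups = [np.array([52, 61, 58, 47, 66, 59]),
              np.array([41, 38, 50, 44, 39, 46]),
              np.array([45, 49, 43, 55, 40, 48])]

    all_values = np.concatenate(groups)
    grand_mean = all_values.mean()

    ss_total = ((all_values - grand_mean) ** 2).sum()
    ss_bg = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)
    ss_wg = sum(((g - g.mean()) ** 2).sum() for g in groups)

    print(np.isclose(ss_total, ss_bg + ss_wg))   # True: SS_total = SS_bg + SS_wg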

11 F-distribution
A right-skewed distribution.
It is the distribution of a ratio of two chi-square variables, each divided by its degrees of freedom.
Why is a ratio of chi-square variables an appropriate distribution to use? Or, what does the F-ratio have to do with chi-square?
Remember, chi-square is the sum of squared deviates of normally distributed variables; the F-distribution describes the ratio of two such sums of squared deviates (one chi-square divided by another), as the sketch below illustrates.
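
A small simulation of my own: building F-distributed values from two independent chi-square variables (each divided by its df) and comparing the result to scipy's F-distribution. The degrees of freedom are assumed for illustration.

    import numpy as np
    from scipy.stats import f

    rng = np.random.default_rng(1)
    df1, df2 = 2, 15

    chi_sq_1 = rng.chisquare(df1, size=100_000)
    chi_sq_2 = rng.chisquare(df2, size=100_000)
    simulated_F = (chi_sq_1 / df1) / (chi_sq_2 / df2)

    # Empirical 95th percentile vs. the theoretical critical value of F(2, 15)
    print(np.quantile(simulated_F, 0.95), f.ppf(0.95, df1, df2))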

12 F-distribution
The F-test for ANOVA is a one-tailed test: only large values of F (in the right tail) count as evidence against the null hypothesis.

13 Visual ANOVA and the F-ratio

14 ANOVA and t-test
How do we know where the differences exist once we know that we have an overall difference between groups?
t-tests become important after an ANOVA so that we can find out which pairs are significantly different (post-hoc tests).
Corrections can be applied to such post-hoc t-tests to account for multiple comparisons (e.g., the Bonferroni correction, which divides the significance level by the number of comparisons being made).
There are many mean-comparison tests available (Tukey, Sidak, Bonferroni, etc.). All are basically modified pairwise mean comparisons. A sketch of a Bonferroni-corrected follow-up appears below.
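
A sketch of my own of a Bonferroni-corrected follow-up: run all pairwise t-tests and compare each p-value to alpha divided by the number of comparisons. Group names and data are made up for illustration.

    from itertools import combinations
    from scipy.stats import ttest_ind

    data = {
        "married":  [52, 61, 58, 47, 66, 59],
        "single":   [41, 38, 50, 44, 39, 46],
        "divorced": [45, 49, 43, 55, 40, 48],
    }

    pairs = list(combinations(data, 2))
    alpha = 0.05
    adjusted_alpha = alpha / len(pairs)    # Bonferroni: 0.05 / 3

    for name_a, name_b in pairs:
        p = ttest_ind(data[name_a], data[name_b]).pvalue
        verdict = "significant" if p < adjusted_alpha else "not significant"
        print(f"{name_a} vs {name_b}: p = {p:.4f} ({verdict} at the adjusted alpha)")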

15 Logic of the ANOVA
Conceptual intro to ANOVA
Class example: anova.do, GSS96_small.dta

