# ANOVA Demo Part 2: Analysis Psy 320 Cal State Northridge Andrew Ainsworth PhD.

Review: Sample Variance This can be re-written into:
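The formulas on this slide were images and did not survive in the transcript. Judging from the later slides, which repeatedly describe variance as SS/df, the review presumably reads something like the following (a reconstruction, not necessarily the slide's exact notation):

```latex
s^2 \;=\; \frac{\sum_{i=1}^{N} (X_i - \bar{X})^2}{N - 1} \;=\; \frac{SS}{df}
```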

Data Set FYI: N = 9

Total Sums of Squares: Let’s calculate the Sums of Squares (SS) for this data set as it is. The mean of 9.67 has already been calculated for us, and we are going to treat it as the Grand Mean (i.e., the ungrouped mean).

Total Sums of Squares
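The data table and the worked formula for this slide are missing from the transcript, so the calculation can only be sketched. In the snippet below, only the first three scores (7, 1 and 10) are quoted in the demo; the other six are hypothetical stand-ins, chosen because they reproduce the reported group means and the reported total SS of 164.

```python
# Hypothetical scores: only group 1 (7, 1, 10) is quoted in the slides.
# The remaining six values are stand-ins chosen to match the reported
# group means (8.667 and 14.333) and the reported total SS of 164.
scores = [7, 1, 10, 7, 8, 11, 13, 14, 16]

grand_mean = sum(scores) / len(scores)  # ungrouped mean of all N = 9 scores

# Total SS: squared deviation of every score from the grand mean, summed.
ss_total = sum((x - grand_mean) ** 2 for x in scores)

print(f"Grand mean = {grand_mean:.2f}, SS total = {ss_total:.3f}")
```

With these stand-in scores the grand mean rounds to 9.67 and the total SS comes to 164, matching the values the demo reports.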

Between Groups Sums of Squares: So, the Total Sums of Squares applies to this data if all 9 data points were collected as part of a single 9-member group. However, what if the data were collected in groups of 3 instead? Let’s imagine that each group is receiving a different form of treatment (e.g., a level of an independent variable) that we think will affect the subjects’ scores and, with them, each individual group’s mean.

Between Groups Sums of Squares: Note that each group has its own mean that describes the central tendency of the participants in that group (e.g., 6 is the mean of 7, 1 and 10). Also note that with equal numbers of subjects in each group, the average of the 3 group means is the “Grand Mean” from earlier (i.e., 9.67 is the average of 6, 8.667 and 14.333).
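Both claims on this slide are easy to check with a few lines of Python. This is only a sketch using the rounded group means quoted above; the variable names are illustrative.

```python
group_means = [6, 8.667, 14.333]  # group means reported on the slide

# With equal group sizes, the unweighted average of the group means
# equals the grand mean of all the scores.
grand_mean = sum(group_means) / len(group_means)

# Check on the one group whose scores are quoted: 7, 1 and 10.
group1 = [7, 1, 10]
group1_mean = sum(group1) / len(group1)

print(round(grand_mean, 2), group1_mean)
```

Note that the shortcut of averaging the group means only works when every group has the same number of participants; with unequal groups each mean would need to be weighted by its group size.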

Between Groups Sums of Squares: So, if we want to understand the effect that the different treatments are having on the groups via the participants (i.e., how the treatments are moving the participants away from the grand mean), we can pretend for a second that every participant scored exactly at their own group mean. Note: the group means and the grand mean stay the same.

Between Groups Sums of Squares Let’s ignore the group means for a second and calculate the SS pretending that every participant scored at their group mean
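The formula on this slide is missing from the transcript. Under the pretence described, each participant's score would be replaced by their group mean and the SS taken about the grand mean, so the sum presumably looked something like this (a reconstruction from the means quoted earlier, not the slide's own layout):

```latex
\begin{aligned}
SS ={}& (6 - 9.667)^2 + (6 - 9.667)^2 + (6 - 9.667)^2 \\
     &+ (8.667 - 9.667)^2 + (8.667 - 9.667)^2 + (8.667 - 9.667)^2 \\
     &+ (14.333 - 9.667)^2 + (14.333 - 9.667)^2 + (14.333 - 9.667)^2
\end{aligned}
```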

Between Groups Sums of Squares: Let’s take a look at that last formula and see that, when calculated, there is a lot of redundancy within each group. For instance, for the first group we are subtracting and squaring the same number 3 times (i.e., once for each participant). Couldn’t we come to the same answer by simply doing it once and multiplying by the number of participants in that group (i.e., 3)?

Between Groups Sums of Squares: This is why we typically don’t substitute the mean for every person’s score but just weight the difference by the number of scores in each group (i.e., $n_g$).
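The equivalence of the long way (substituting the group mean for every score) and the short way (weighting by $n_g$) can be sketched as follows. The snippet uses the rounded group means from the demo, which is what reproduces the reported 108.655; unrounded means (26/3 and 43/3) would give about 108.667 instead.

```python
group_means = [6, 8.667, 14.333]  # rounded group means from the slides
n_per_group = [3, 3, 3]           # n_g: participants in each group

# Grand mean as the size-weighted average of the group means.
grand_mean = sum(n * m for n, m in zip(n_per_group, group_means)) / sum(n_per_group)

# Long way: every participant replaced by their own group mean.
substituted = [m for n, m in zip(n_per_group, group_means) for _ in range(n)]
ss_long = sum((x - grand_mean) ** 2 for x in substituted)

# Short way: one squared deviation per group, weighted by n_g.
ss_bg = sum(n * (m - grand_mean) ** 2 for n, m in zip(n_per_group, group_means))

print(f"{ss_long:.3f} {ss_bg:.3f}")
```

Both routes give the same number; the weighted form simply avoids repeating identical terms.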

Within Groups Sums of Squares: Looking back at the original data, we can see that in fact the subjects rarely, if ever, scored exactly at their group mean. So something else, besides our hypothesized treatment, is causing our subjects to differ within each of the groups. We haven’t hypothesized it and therefore can’t explain why it’s there, so we are going to assume that it is just random variation. But we still need to identify it…

Within Groups Sums of Squares: To identify the random variation, let’s look inside each group separately to see if we can find an average degree of variation. Remembering that variance is SS/df, let’s identify the within-group SS value for each group. For Group 1:
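The worked calculation for Group 1 is missing from the transcript, but its scores (7, 1 and 10) are quoted earlier, so it can be sketched directly:

```python
group1 = [7, 1, 10]          # group 1 scores quoted earlier in the demo
group1_mean = sum(group1) / len(group1)

# Within-group SS: deviations are taken from the group's own mean,
# not from the grand mean.
ss_wg_1 = sum((x - group1_mean) ** 2 for x in group1)

print(group1_mean, ss_wg_1)
```

That is, (7 - 6)² + (1 - 6)² + (10 - 6)² = 1 + 25 + 16 = 42.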

Within Groups Sums of Squares: For Group 2, the within-group SS is found in the same way. Note: this value coincidentally equals the group mean.

Within Groups Sums of Squares: For Group 3, the within-group SS is found in the same way.
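The worked values for Groups 2 and 3 did not survive in the transcript. One rough way to recover them, assuming the note on the Group 2 slide means that group's SS equals its mean of 8.667, is to subtract from the overall within-groups SS of 55.333 that the demo reports later:

```python
ss_wg_total = 55.333   # overall within-groups SS reported in the demo
ss_wg_1 = 42.0         # (7-6)^2 + (1-6)^2 + (10-6)^2
ss_wg_2 = 8.667        # assumption: "equals the mean" refers to this SS

# Whatever is left over must belong to group 3.
ss_wg_3 = ss_wg_total - ss_wg_1 - ss_wg_2

print(f"{ss_wg_3:.3f}")
```

Under that assumption Group 3's SS comes out at roughly 4.67; the last digit depends on how the inputs were rounded.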

Within Groups Variance: These Within Groups Sums of Squares can be used to tell us how people just “randomly” spread out within each of the groups. Remembering that variance is SS/df, let’s divide each group’s SS by its degrees of freedom (df), $n_j - 1$, where $n_j$ is the number of participants in each group (e.g., in our example $n_j = 3$ for all of the groups).

The Variance Within Each Group For group 1:

The Variance Within Each Group For group 2:

The Variance Within Each Group For group 3:

Average Within Groups Variance: Now that we have the variances within each of the groups, we can calculate an average within groups variance, which is an extension of the pooled variance from the independent samples t-test. Because the values for $n_j$ are equal, this is just a simple average. (Note: if the $n_j$ values are not equal, you can perform a weighted average or just calculate the WG variance directly, as on the next slide.) Note: there is no subscript because this is not for any particular group but an average across the groups.
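The per-group variances and their average can be sketched as below. Only Group 1's SS of 42 follows directly from quoted scores; the values 8.667 and 4.667 for Groups 2 and 3 are inferred from the demo's totals and should be read as assumptions.

```python
ss_by_group = [42.0, 8.667, 4.667]  # group 1 computed; groups 2-3 inferred (assumption)
n_by_group = [3, 3, 3]              # n_j for each group

# Variance within each group: SS / df, with df = n_j - 1.
variances = [ss / (n - 1) for ss, n in zip(ss_by_group, n_by_group)]

# Equal n: a simple average of the group variances.
ms_wg_simple = sum(variances) / len(variances)

# General case: weight each variance by its df (the pooled variance).
dfs = [n - 1 for n in n_by_group]
ms_wg_pooled = sum(df * v for df, v in zip(dfs, variances)) / sum(dfs)

print([round(v, 3) for v in variances], round(ms_wg_simple, 3))
```

With equal group sizes the simple and the df-weighted averages coincide, at about 9.22; the demo's 9.223 differs only in the last rounded digit.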

Within Groups Sums of Squares: The value for the overall WG Sums of Squares could have been calculated directly by simply combining the $SS_{WG}$ formula across groups. Note again that there is no subscript on the SS value because it is done for all groups at the same time.

Within Groups Sums of Squares: Remembering that the means for the groups are 6, 8.667 and 14.333 respectively, we can simply take every individual score and subtract the mean of the group the score belongs to. All together now…
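The direct calculation can be sketched as a loop over groups. As before, only Group 1's scores are quoted in the demo; the scores for Groups 2 and 3 below are hypothetical stand-ins that happen to reproduce the reported means and the reported within-groups SS.

```python
# Hypothetical scores: only group 1 is quoted in the slides; groups 2 and 3
# are stand-ins chosen to match the reported means and SS values.
groups = {
    "group 1": [7, 1, 10],
    "group 2": [7, 8, 11],
    "group 3": [13, 14, 16],
}

ss_wg = 0.0
for name, scores in groups.items():
    mean = sum(scores) / len(scores)
    # Subtract the mean of the group each score belongs to, square, and sum.
    ss_wg += sum((x - mean) ** 2 for x in scores)

print(f"SS WG = {ss_wg:.3f}")
```

The key point is that each score is compared with its own group's mean, never with the grand mean.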

ANOVA Summary Table: Let’s take what we know and see if we can put it together and summarize it in the summary table. We know that $SS_{Total} = 164$. We know that $SS_{BG} = 108.655$. We know that $SS_{WG} = 55.333$. We also know that the WG variance (i.e., $MS_{WG}$) = 9.223, from the average of the 3 group variances. Note: the SS for Between and Within add up to the SS total (within rounding), as they should.

ANOVA Summary Table We have the SS value for the BG source of variability but we need to convert it to a variance. Remembering that variance is SS/df, we just need to figure out what are the BG degrees of freedom.

ANOVA Summary Table We have the SS value for the BG source of variability but we need to convert it to a variance. Remembering that variance is SS/df, we now just need to divide the SS value by the df value for the Between Groups source
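The slide's worked values are missing from the transcript. Assuming the standard between-groups degrees of freedom of $g - 1$ (number of groups minus one), the conversion from SS to variance can be sketched as:

```python
ss_bg = 108.655   # between-groups SS from the demo
g = 3             # number of groups

df_bg = g - 1             # standard between-groups df (assumed: g - 1)
ms_bg = ss_bg / df_bg     # variance (mean square) = SS / df

print(df_bg, round(ms_bg, 2))
```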

ANOVA Summary Table: We have the SS value and the MS value for the WG source of variability, but these two values should be connected somehow. Remembering that variance is SS/df, we just need to figure out the Within Groups degrees of freedom and see whether dividing in the same way as with the BG source gives us the same value (i.e., 9.223).

ANOVA Summary Table: When we calculated the Within Groups variance, we did so by averaging over the three individual group variances, each of which had $n - 1$ degrees of freedom. So that’s an $n - 1$ for each group, or $g(n - 1)$. Or you can think of it this way: you need to calculate a mean for each group, so you simply take the total number of scores (i.e., $N$) and subtract 1 for every group (i.e., $g$), and that’s $N - g$. Note that if all of the $n_j$ values are equal, then $g(n - 1) = N - g$.

ANOVA Summary Table We have 9 total subjects and 3 groups so that’s…

ANOVA Summary Table If we divide the SS value by the df value for the Within Groups source of variance we in fact get the same value we calculated earlier using the pooling method
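A quick check of the within-groups degrees of freedom and mean square, using the values reported in the demo (variable names are illustrative):

```python
ss_wg = 55.333       # within-groups SS from the demo
N, g, n = 9, 3, 3    # total scores, number of groups, scores per group

df_wg = N - g                  # subtract one df for every group mean
assert df_wg == g * (n - 1)    # same thing when all groups are equal in size

ms_wg = ss_wg / df_wg          # variance (mean square) = SS / df
print(df_wg, round(ms_wg, 3))
```

Dividing 55.333 by 6 gives about 9.222, which agrees with the pooled value of 9.223 up to rounding.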

ANOVA Summary Table: In order to calculate the total SS we needed to estimate a single Grand Mean, and because of this we lose one degree of freedom. The total degrees of freedom is simply the total number of participants (i.e., $N$) minus 1. Note: the degrees of freedom for BG and WG sum to the df total, as they should.

ANOVA Summary Table: In ANOVA demo #1 we talked about how the Between Groups Variance is a measure of how far the groups are from the Grand Mean, which in turn tells us how far apart they are from each other on average. In order for us to know whether the groups are varying far away from each other (i.e., they are significantly different), we need a measure of random variability to see if our groups are differing by more than chance alone. The Within Groups Variance tells us how much individuals vary from one another on average across the groups. This is our best estimate of random variability, so we use it to see if the groups are different by creating the F-ratio.

ANOVA Summary Table The F-ratio is simply the ratio of the Between Groups variance over the Within Groups Variance
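The summary table itself was an image and is missing from the transcript. A minimal sketch that rebuilds it from the values reported in the demo might look like this; the layout and column names are illustrative, not the slide's.

```python
ss_bg, ss_wg, ss_total = 108.655, 55.333, 164.0  # values reported in the demo
N, g = 9, 3

df_bg, df_wg = g - 1, N - g
ms_bg, ms_wg = ss_bg / df_bg, ss_wg / df_wg
f_ratio = ms_bg / ms_wg  # between-groups variance over within-groups variance

rows = [
    ("Between", ss_bg, df_bg, f"{ms_bg:.3f}", f"{f_ratio:.2f}"),
    ("Within", ss_wg, df_wg, f"{ms_wg:.3f}", ""),
    ("Total", ss_total, df_bg + df_wg, "", ""),
]
print(f"{'Source':<8}{'SS':>9}{'df':>4}{'MS':>9}{'F':>6}")
for source, ss, df, ms, f in rows:
    print(f"{source:<8}{ss:>9.3f}{df:>4}{ms:>9}{f:>6}")
```

The resulting F-ratio rounds to 5.89, the value the demo uses in its conclusion.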

ANOVA Summary Table: The Between Groups variance contains both real and random variability, while the Within Groups variance contains only random variability (at least that’s what we are assuming). So in order for an F-ratio to be large, the real group differences have to be large enough for us to see them through the random differences. If no real differences exist, then you are left with random variability over random variability, an F-ratio of about 1.
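The expression that closed this slide is missing from the transcript. In symbols, the reasoning can be sketched as follows (a reconstruction of the idea, not the slide's own formula):

```latex
F = \frac{MS_{BG}}{MS_{WG}} = \frac{\text{real} + \text{random}}{\text{random}},
\qquad
\text{and with no real differences:}\quad
F \approx \frac{\text{random}}{\text{random}} \approx 1
```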

ANOVA Summary Table: The critical values found in the F-table indicate how large the F-ratio has to be before we treat the variability between the groups as “real” rather than random, controlling for the number of groups (i.e., $df_{BG}$) and the number of people in each group (i.e., $df_{WG}$). For our example:

ANOVA Summary Table: The critical value of 5.143 tells us that an F of 5.143 or larger is not likely to occur by accident (i.e., it has a probability of .05 or lower) given the number of groups and the number of subjects per group. Since our F-value is 5.89, which is larger than 5.143, we can conclude that a significant difference occurs somewhere between at least 2 of our group means.
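The table lookup can be cross-checked without an F-table. The sketch below relies on a closed form for the F distribution's upper tail that holds only when the between-groups df is exactly 2, as it is here; for other designs a statistics library such as SciPy's `scipy.stats.f` would be the usual tool. The function name `upper_tail` is illustrative.

```python
f_ratio = 5.89         # observed F from the demo
df_bg, df_wg = 2, 6
alpha = 0.05

# Closed form that holds ONLY when df_bg == 2 (an assumption of this
# sketch, not a general formula): P(F > f) = (1 + 2f/df_wg) ** (-df_wg/2).
def upper_tail(f, df_wg):
    return (1 + 2 * f / df_wg) ** (-df_wg / 2)

# Invert the same expression to get the critical value at alpha.
f_crit = (df_wg / 2) * (alpha ** (-2 / df_wg) - 1)
p_value = upper_tail(f_ratio, df_wg)

print(round(f_crit, 3), round(p_value, 3), f_ratio > f_crit)
```

The critical value agrees with the 5.143 from the F-table, and the p-value for 5.89 falls below .05, which is the same conclusion the slide reaches.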
