2 Between subjects experiments
The caffeine experiment was of between subjects design, that is, each participant was tested under only one condition. Participants were RANDOMLY ASSIGNED to the conditions, so that there was no basis on which the data could be paired. Between subjects experiments result in INDEPENDENT SAMPLES of data.
3 More than two conditions
In more complex experiments, there may be three or more conditions. For example, we could compare the performance of groups of participants who have ingested four different supposedly performance-enhancing drugs with that of a control or placebo group.
4 Factors
In the context of analysis of variance (ANOVA), a FACTOR is a set of related treatments, conditions or categories. The ANOVA term ‘factor’ is a synonym for the term ‘independent variable’.
5 One-factor experiments
In the drug experiment, there is just ONE set of (drug-related) conditions. The experiment therefore has ONE treatment factor. The conditions making up a factor are known as its LEVELS. In the drug experiment, the treatment factor has 5 levels.
7 Statistics of the results
[Table annotations: group (cell) means; group (cell) variances; group (cell) standard deviations.]
8 The null hypothesis
The null hypothesis (H0) states that, in the population, all the means have the same value. We cannot test this hypothesis with the t statistic, which compares only two means at a time.
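9 The alternative hypothesis
The alternative hypothesis (H1) is that, in the population, the means do NOT all have the same value. MANY POSSIBILITIES are implied by H1: for example, one mean may differ from all the others, or every mean may differ from every other.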
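For the five-level drug experiment, the two hypotheses can be written compactly as follows. The symbols μ1 to μ5 for the five population means are notation introduced here, not taken from the slides.

```latex
H_0:\ \mu_1 = \mu_2 = \mu_3 = \mu_4 = \mu_5
\qquad
H_1:\ \text{not all of } \mu_1, \dots, \mu_5 \text{ are equal}
```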
10 The one-way ANOVA
The ANOVA of a one-factor between groups experiment is also known as the ONE-WAY ANOVA. The one-way ANOVA must be sharply distinguished from the one-factor WITHIN SUBJECTS (or REPEATED MEASURES) ANOVA, which is appropriate when participants are tested at every level of the treatment factor. The between subjects and within subjects ANOVAs are based upon different statistical models.
11 There are some large differences among the five treatment means, suggesting that the null hypothesis is false.
12 Mean square (MS)
In ANOVA, the numerator of a variance estimate is known as a SUM OF SQUARES (SS). The denominator is known as the DEGREES OF FREEDOM (df). The variance estimate itself is known as a MEAN SQUARE (MS), so that MS = SS/df.
13 Accounting for variability
[Figure annotations: grand mean; total deviation = between groups deviation + within groups deviation.]
The building block for any variance estimate is a DEVIATION of some sort. The TOTAL DEVIATION of any score from the grand mean (GM) can be divided into 2 components: 1. a BETWEEN GROUPS component (the deviation of the score's group mean from the grand mean); 2. a WITHIN GROUPS component (the deviation of the score from its own group mean).
14 Example of the breakdown
The score, the group mean and the grand mean have been ringed in the table. This breakdown holds for each of the fifty scores in the data set.
[Table annotations: score; grand mean; group mean.]
15 Breakdown (partition) of the total sum of squares
If you sum the squares of the deviations over all 50 scores, you obtain an expression which breaks down the total variability in the scores into between groups and within groups components.
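Written out in symbols, the breakdown might look as follows. The notation is an assumption introduced here, not taken from the slides: X_ij is the score of participant i in group j, M_j is the mean of group j, n_j is the number of scores in group j, and GM is the grand mean.

```latex
\underbrace{X_{ij} - GM}_{\text{total deviation}}
  = \underbrace{M_j - GM}_{\text{between groups}}
  + \underbrace{X_{ij} - M_j}_{\text{within groups}}

\sum_j \sum_i (X_{ij} - GM)^2
  = \sum_j n_j (M_j - GM)^2 + \sum_j \sum_i (X_{ij} - M_j)^2
\quad\Longrightarrow\quad
SS_{total} = SS_{between} + SS_{within}
```

The cross-product terms vanish because, within each group, the deviations of the scores from their own group mean sum to zero.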
16 How ANOVA works
The variability BETWEEN the treatment means is compared with the average spread of scores around their means WITHIN the treatment groups. The comparison is made with a statistic called the F-RATIO.
17 The variances of the scores in each group around their group mean are averaged to obtain a WITHIN GROUPS MEAN SQUARE (MSwithin).
18 From the values of the five treatment means, a BETWEEN GROUPS MEAN SQUARE (MSbetween) is calculated.
19 The statistic F is calculated by dividing the between groups MS by the within groups MS, thus: F = MSbetween / MSwithin.
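The calculation described on the last few slides can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the scores are made up (three groups of four, not the drug data from the slides), and the function name `one_way_anova` is hypothetical.

```python
from statistics import mean

# Hypothetical scores for illustration only (NOT the data from the slides).
groups = {
    "Placebo": [8, 10, 9, 9],
    "Drug A": [11, 13, 12, 12],
    "Drug B": [14, 15, 13, 14],
}


def one_way_anova(groups):
    """Return sums of squares, degrees of freedom, mean squares and F."""
    all_scores = [x for scores in groups.values() for x in scores]
    grand_mean = mean(all_scores)

    # Between groups: deviations of the group means from the grand mean,
    # weighted by the number of scores in each group.
    ss_between = sum(
        len(s) * (mean(s) - grand_mean) ** 2 for s in groups.values()
    )
    # Within groups: deviations of the scores from their own group mean.
    ss_within = sum((x - mean(s)) ** 2 for s in groups.values() for x in s)
    # Total: deviations of the scores from the grand mean.
    ss_total = sum((x - grand_mean) ** 2 for x in all_scores)

    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    ms_between = ss_between / df_between  # MS = SS / df
    ms_within = ss_within / df_within
    return {
        "ss_between": ss_between,
        "ss_within": ss_within,
        "ss_total": ss_total,
        "df_between": df_between,
        "df_within": df_within,
        "ms_between": ms_between,
        "ms_within": ms_within,
        "F": ms_between / ms_within,
    }


result = one_way_anova(groups)
print(f"F({result['df_between']}, {result['df_within']}) = {result['F']:.2f}")
```

The returned dictionary also lets you check that the total sum of squares equals the sum of the between groups and within groups sums of squares.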
21 The value of MSbetween, since it is calculated from the MEANS, reflects random error, plus any real differences among the population means that there may be.
22 The value of MSwithin, since it is calculated only from the variances of the scores within groups and ignores the values of the group means, reflects ONLY RANDOM ERROR.
23 What F is measuring
If there are differences among the population means, the numerator will be inflated and F will increase. If there are no differences, F will be close to 1.
[Figure annotations: numerator (MSbetween) = error + real differences; denominator (MSwithin) = error only.]
24 Expectations
If the null hypothesis is true, the values of MSbetween and MSwithin will be similar, because both variance estimates merely reflect individual differences and random variation or ERROR. If so, the value of F will be around 1. If the null hypothesis is false, real differences among the population means will inflate the value of MSbetween but the value of MSwithin will be unaffected. The result will be a LARGE value of F.
25 Range of variation of F
The F statistic is the ratio of two sample variances. A variance can take only non-negative values. So the lower limit for F is zero. There is no upper limit for F.
26 Imagine…
Suppose the null hypothesis is true. Imagine the experiment were to be repeated thousands and thousands of times, with fresh samples of participants each time. There would be thousands and thousands of data sets, from each of which a value of F could be calculated.
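This thought experiment can be imitated by simulation. The sketch below makes several assumptions that are not in the slides: scores are drawn from a normal population with mean 0 and standard deviation 1, there are five groups of ten, the experiment is repeated 5000 times, and the names `f_ratio` and `simulate_null_f` are hypothetical.

```python
import random


def f_ratio(groups):
    """F = MSbetween / MSwithin for a list of groups of scores."""
    scores = [x for g in groups for x in g]
    grand_mean = sum(scores) / len(scores)
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means)
    )
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between = len(groups) - 1
    df_within = len(scores) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


def simulate_null_f(n_groups=5, n_per_group=10, n_experiments=5000, seed=1):
    """Repeat the experiment many times with the null hypothesis TRUE."""
    rng = random.Random(seed)
    values = []
    for _ in range(n_experiments):
        # Every group comes from the same population, so any differences
        # among the group means are due to random error alone.
        groups = [
            [rng.gauss(0, 1) for _ in range(n_per_group)]
            for _ in range(n_groups)
        ]
        values.append(f_ratio(groups))
    return values


f_values = simulate_null_f()
share_above = sum(f > 2.58 for f in f_values) / len(f_values)
print(f"mean F = {sum(f_values) / len(f_values):.2f}")
print(f"proportion of F values above 2.58 = {share_above:.3f}")
```

Under these assumptions, the simulated values of F should average close to 1, and roughly 5% of them should exceed 2.58.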
27 Sampling distribution
To test the null hypothesis, you must be able to locate YOUR value of F in the population or PROBABILITY DISTRIBUTION of such values. The probability distribution of a statistic is known as its SAMPLING DISTRIBUTION. To specify a sampling distribution, you must assign values to properties known as PARAMETERS.
28 Parameters of F
Recall that the t distribution has ONE parameter: the DEGREES OF FREEDOM (df). The F distribution has TWO parameters: the degrees of freedom of the between groups and within groups mean squares, which we shall denote by dfbetween and dfwithin, respectively.
29 Rule for finding the degrees of freedom
There’s a useful rule for finding the degrees of freedom of a statistic. Take the number of independent observations and subtract the number of parameters estimated. The sample variance of n scores is based upon n independent observations. But to obtain the deviations, we need an estimate of ONE parameter, namely, the mean. So the degrees of freedom of the sample variance is n – 1, not n.
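As a small illustration of the rule, the sketch below computes a sample variance as SS/df with df = n – 1 and compares it with the `variance` function in Python's standard `statistics` module, which uses the same n – 1 denominator. The five scores are made up for the example.

```python
from statistics import variance

scores = [4, 7, 6, 3, 5]  # hypothetical scores, for illustration only

n = len(scores)
sample_mean = sum(scores) / n  # ONE parameter (the mean) is estimated
ss = sum((x - sample_mean) ** 2 for x in scores)  # sum of squares
df = n - 1  # independent observations minus parameters estimated
ms = ss / df  # the variance estimate, i.e. a mean square

print(f"SS = {ss}, df = {df}, MS = {ms}")
print(f"statistics.variance gives {variance(scores)}")
```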
31 Degrees of freedom of the two mean squares
The degrees of freedom of MSbetween is the number of treatment groups minus 1. (One parameter is estimated: the grand mean.) The degrees of freedom of MSwithin is the total number of scores minus the number of treatment groups. (Five parameters are estimated: the five group means.)
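For the drug experiment, with 5 treatment groups and 50 scores in all, the rule works out as follows. The symbols k (number of groups) and N (total number of scores) are introduced here for convenience and do not appear on the slides.

```latex
df_{between} = k - 1 = 5 - 1 = 4
\qquad
df_{within} = N - k = 50 - 5 = 45
\qquad
df_{total} = N - 1 = 49 = 4 + 45
```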
32 The correct F distribution
We shall specify an F distribution with the notation F(dfbetween, dfwithin). We have seen that in our example, dfbetween = 4 and dfwithin = 45. The correct F distribution for our test of the null hypothesis is therefore F(4, 45).
33 The distribution of F(1, 45)
F distributions are POSITIVELY SKEWED, i.e., they have a long tail to the right. However, the shape of F varies quite markedly with the values of the df.
35 Distribution of F(4, 45)
The critical region is in the upper tail of this F distribution. If we set the significance level at .05, the value of F must be at least 2.58 to be significant. The value 2.58 is the 95th percentile of the distribution F(4, 45).
36 The F distribution
[Figure: the distribution F(dfbetween, dfwithin) = F(4, 45), with .95 of the area below and .05 of the area above the 95th percentile, 2.58.]
An F distribution is asymmetric, with an infinitely long tail to the right. The critical region lies above the 95th percentile which, in this F distribution, is 2.58.
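The 95th percentile can be checked numerically. The sketch below uses only the Python standard library: it integrates the F density with Simpson's rule and finds the percentile by bisection. The function names are hypothetical, and the density function assumes dfbetween is greater than 2, so that the density is zero at F = 0.

```python
import math


def f_pdf(x, df1, df2):
    """Density of F(df1, df2) at x; assumes df1 > 2 (density is 0 at x = 0)."""
    if x <= 0:
        return 0.0
    log_beta = (
        math.lgamma(df1 / 2) + math.lgamma(df2 / 2)
        - math.lgamma((df1 + df2) / 2)
    )
    log_density = (
        0.5 * (
            df1 * math.log(df1 * x)
            + df2 * math.log(df2)
            - (df1 + df2) * math.log(df1 * x + df2)
        )
        - math.log(x)
        - log_beta
    )
    return math.exp(log_density)


def f_cdf(x, df1, df2, steps=2000):
    """P(F <= x), by Simpson's rule on [0, x]; steps must be even."""
    if x <= 0:
        return 0.0
    h = x / steps
    total = f_pdf(0.0, df1, df2) + f_pdf(x, df1, df2)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f_pdf(i * h, df1, df2)
    return total * h / 3


def f_percentile(p, df1, df2, upper=50.0):
    """Value of F below which a proportion p of the distribution lies."""
    lo, hi = 0.0, upper
    for _ in range(60):  # bisection: halve the interval each time
        mid = (lo + hi) / 2
        if f_cdf(mid, df1, df2) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


critical_value = f_percentile(0.95, 4, 45)
print(f"95th percentile of F(4, 45) = {critical_value:.2f}")
```

The result should agree with the value of 2.58 quoted on the slide. In practice you would normally take the critical value from SPSS output, from tables, or from a statistics library (SciPy's `scipy.stats.f.ppf`, for instance) rather than integrating by hand.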
37 The ANOVA summary table
F is large: about nine times unity (the value expected under the null hypothesis) and well over the critical value of 2.58. The p-value (Sig.) is less than .01, so F is significant beyond the .01 level. Write this result as follows: ‘with an alpha-level of .05, F is significant: F(4, 45) = 9.09; p < .01’. Do NOT write the p-value as ‘.000’! Notice that SStotal = SSbetween groups + SSwithin groups.
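If you are assembling the report string in code, a small helper can guard against writing ‘.000’. The function `report_f` below is a hypothetical sketch, not part of SPSS. The cut-off of .01 follows the wording on the slide, and the p-value of 0.0004 is an assumed stand-in for any value that SPSS would display as ‘.000’.

```python
def report_f(df_between, df_within, f_value, p_value):
    """Format an F test result, never printing the p-value as '.000'."""
    f_text = f"F({df_between}, {df_within}) = {f_value:.2f}"
    if p_value < 0.01:
        # Very small p-values are reported as an inequality.
        p_text = "p < .01"
    else:
        # Drop the leading zero, as in the slide's own example.
        p_text = "p = " + f"{p_value:.2f}".lstrip("0")
    return f"{f_text}; {p_text}"


print(report_f(4, 45, 9.09, 0.0004))
```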
38 SPSS advice
A few general points. Give close attention to the labels you give to your variables, and to the appearance of your data. Unnecessary decimal places clutter the display. It is particularly important to assign VALUE LABELS to the code numbers you choose for any grouping variables. Specify also the LEVEL OF MEASUREMENT of each variable.
39 Start in Variable View
Work in Variable View first, amending the settings so that when you enter Data View, your variables are already labelled, the scores appear without unnecessary decimals and you will have the option of displaying the value labels of your grouping variable.
40 Graphics
The latest SPSS graphics require you to specify the level of measurement of the data on each variable. The group code numbers are at the NOMINAL level of measurement, because they are merely CATEGORY LABELS. Make the appropriate entry in the Measure column.
41 Grouping variables
To instruct SPSS to analyse data from between subjects experiments, you must construct a GROUPING VARIABLE consisting of code numbers identifying the treatment condition under which a score was achieved. So we could set 1 = Placebo, 2 = Drug A, 3 = Drug B, 4 = Drug C, and 5 = Drug D.
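The same layout, one row per participant with a group code and a score, can be mimicked outside SPSS. In the sketch below the code numbers and labels are those suggested on the slide, but the scores and the function name `scores_by_group` are made up for illustration.

```python
# Value labels: the key to the code numbers of the grouping variable.
VALUE_LABELS = {1: "Placebo", 2: "Drug A", 3: "Drug B", 4: "Drug C", 5: "Drug D"}

# Hypothetical (group code, score) rows, one per participant.
rows = [(1, 9), (1, 11), (2, 12), (3, 14), (4, 10), (5, 15)]


def scores_by_group(rows, value_labels):
    """Collect the scores under the label of the group they belong to."""
    grouped = {label: [] for label in value_labels.values()}
    for code, score in rows:
        grouped[value_labels[code]].append(score)
    return grouped


for label, scores in scores_by_group(rows, VALUE_LABELS).items():
    print(label, scores)
```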
42 Data View
This is what Data View will look like. The entry of data for an ANOVA in SPSS is similar to the procedure we followed when running an independent-samples t-test. On the right, the VALUE LABELS are displayed, instead of the values themselves. (This option appears in the View menu.)
44 Variable View completed
Note the setting of Decimals so that only whole numbers will appear in Data View. Note the informative variable LABELS, which will appear in the output. Note the VALUE LABELS giving the key to the code numbers you have chosen for your grouping variable. (The ‘values’ themselves are the code numbers you have chosen.)
47 More statistics
By clicking Options, you can order more statistics than would normally appear in the ANOVA output. Select Descriptive to order the extra statistics and then click Continue to return to the ANOVA dialog box.
48 A word of warning
Modern computing packages such as SPSS afford a bewildering variety of attractive graphs and displays to help you bring out the most important features of your results. You should certainly use them. But there are pitfalls awaiting the unwary. Suppose the drug experiment had turned out rather differently. The researcher proceeds as follows.
51 The picture is false!
The table of means shows minuscule differences among the five group means. The value of F is very small indeed. The p-value of F is very high: unity to two places of decimals. The experiment has failed to show that any of the drugs works.
52 A small scale view
Only a microscopically small section of the scale is shown on the vertical axis. This greatly magnifies even small differences among the group means.
53 Putting things right
Double-click on the image to get into the Graph Editor. Double-click on the vertical axis to access the scale specifications.
54 Putting things right …
Uncheck the minimum value box and enter zero as the desired minimum point. Click Apply.
56 The true picture …
The effect is dramatic. The profile now reflects the true situation. Always be suspicious of graphs that do not show the complete vertical scale.
57 Summary
In the one-way ANOVA, we compare two variance estimates, MSbetween and MSwithin, by means of their ratio, which is called the F statistic. If F is large enough to fall in the critical region, we conclude that there is at least one significant difference somewhere among the array of treatment means.