2 Only the starting point: In ANOVA, the rejection of the null hypothesis leaves many questions unanswered. Further analysis is needed to pinpoint the crucial patterns in the data. So, unlike the t test, the ANOVA is often just the first step in what may be quite an extensive statistical analysis.
4 Simple and complex comparisons: You might want to make SIMPLE COMPARISONS between the mean for each of the four drug conditions and the Placebo mean. Or you might want to compare the Placebo mean with the mean of the four drug means. This is a COMPLEX COMPARISON.
6 Non-independence of comparisons: The simple comparison of M5 with M1 and the complex comparison are not independent. The value of M5 feeds into the value of the average of the means for the drug groups.
7 Systems of comparisons: With a complex experiment, interest centres on SYSTEMS of comparisons. Which comparisons are independent or ORTHOGONAL? What is the probability, under the null hypothesis, that at least one comparison will show significance? How much variance can we attribute to different comparisons?
8 The crumpled paper fallacy: We owe this to Thouless. Uncrumple a piece of paper. The wrinkles are unique. Therefore, they are statistically significant. Data sets from complex experiments may, ex post facto, show all manner of interesting patterns. Inferences from such patterns are dangerous.
9 Over-analysis? You have run a complex experiment and submitted a paper to a journal. Your reviewers will need to be convinced that what you are reporting isn’t just a chance pattern thrown up by sampling error. You may well be asked to specify orthogonal comparisons and test them for significance.
10 Linear functions: Y is a linear function of X if the graph of Y upon X is a straight line. For example, temperature in degrees Fahrenheit is a linear function of temperature in degrees Celsius.
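11 F is a linear function of C: [Figure: straight-line graph of degrees Fahrenheit (vertical axis) against degrees Celsius (horizontal axis), with the origin (0, 0), two marked points P and Q, and an intercept of 32.]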
14 Linear contrasts: Any comparison can be expressed as a sum of terms, each of which is a product of a treatment mean and a coefficient such that the coefficients sum to zero. When so expressed, the comparison is a LINEAR CONTRAST, because it has the form of a linear function. It looks artificial at first, but this notation enables us to study the properties of systems of comparisons among the treatment means.
22 Helmert contrasts: Compare the first mean with the mean of the other means. Drop the first mean and compare the second mean with the mean of the remaining means. Drop the second mean. Continue until you arrive at a comparison between the last two means.
23 Helmert contrasts…
Our first contrast is 1, -¼, -¼, -¼, -¼
Our second contrast is 0, 1, -⅓, -⅓, -⅓
Our third contrast is 0, 0, 1, -½, -½
Our fourth is 0, 0, 0, 1, -1
25 Orthogonal contrasts: The first contrast in no way constrains the value of the second, because the first mean has been dropped. The first two contrasts do not affect the third, because the first two means have been dropped. This is a set of four independent or ORTHOGONAL contrasts.
26 The orthogonal property: The sum of the products of corresponding coefficients in any pair of rows is zero. This means that we have an ORTHOGONAL contrast set.
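The orthogonal property can be checked numerically. The following is a minimal Python sketch, not taken from the slides: the function names helmert_contrasts and is_orthogonal_set are hypothetical, and exact fractions are used so that the sums come out as exactly zero.

```python
from fractions import Fraction
from itertools import combinations


def helmert_contrasts(k):
    """Return the k - 1 Helmert contrasts for k means as rows of exact fractions."""
    rows = []
    for i in range(k - 1):
        remaining = k - 1 - i  # number of means that follow mean i
        row = [Fraction(0)] * i + [Fraction(1)] + [Fraction(-1, remaining)] * remaining
        rows.append(row)
    return rows


def is_orthogonal_set(rows):
    """True if every row sums to zero and every pair of rows has a zero sum of products."""
    sums_ok = all(sum(row) == 0 for row in rows)
    pairs_ok = all(
        sum(a * b for a, b in zip(r1, r2)) == 0
        for r1, r2 in combinations(rows, 2)
    )
    return sums_ok and pairs_ok


contrasts = helmert_contrasts(5)
for row in contrasts:
    print([str(c) for c in row])
print(is_orthogonal_set(contrasts))
```

With k = 5 the function reproduces the four rows listed above, and the check confirms that each row sums to zero and that every pair of rows has a zero sum of products.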
27 Size of an orthogonal set: In our example, with five treatment means, there are four orthogonal contrasts. In general, for an array of k means, you can construct a set of, at most, k-1 orthogonal contrasts. In the present ANOVA example, k = 5, so the rule tells us that there can be no more than 4 orthogonal contrasts in the set. Several different orthogonal sets, however, can often be constructed for the same set of means.
28 Accounting for variability: [Figure: the total deviation of a score from the grand mean, split into a between groups deviation and a within groups deviation.] The building block for any variance estimate is a DEVIATION of some sort. The TOTAL DEVIATION of any score from the grand mean (GM) can be divided into 2 components: 1. a BETWEEN GROUPS component; 2. a WITHIN GROUPS component.
29 Breakdown (partition) of the total sum of squares: If you sum the squares of the deviations over all 50 scores, you obtain an expression which breaks down the total variability in the scores into BETWEEN GROUPS and WITHIN GROUPS components.
30 Contrast sums of squares: We have seen that in the one-way ANOVA, the value of SSbetween reflects the sizes of the differences among the treatment means. In the same way, it is possible to measure the importance of a contrast by calculating a sum of squares which reflects the variation attributable to that contrast alone. We can use an F statistic to test each contrast for significance.
40 Testing a contrast sum of squares for significance
41 Two approaches: A contrast is a comparison between two means. You can therefore run a one-way, 2-group ANOVA. Or you can use a t-test. The tests are equivalent.
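The equivalence of the two tests can be seen in the simplest case, where there are only two groups. The sketch below uses made-up scores (not the data from the slides) and hypothetical function names; it computes the pooled-variance t statistic and the one-way ANOVA F ratio from first principles and shows that F equals the square of t.

```python
from statistics import mean


def pooled_t(group1, group2):
    """Independent-samples t statistic using a pooled variance estimate."""
    n1, n2 = len(group1), len(group2)
    m1, m2 = mean(group1), mean(group2)
    ss1 = sum((x - m1) ** 2 for x in group1)
    ss2 = sum((x - m2) ** 2 for x in group2)
    pooled_var = (ss1 + ss2) / (n1 + n2 - 2)
    return (m1 - m2) / (pooled_var * (1 / n1 + 1 / n2)) ** 0.5


def one_way_f(*groups):
    """One-way ANOVA F ratio: MSbetween divided by MSwithin."""
    scores = [x for g in groups for x in g]
    grand_mean = mean(scores)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(scores) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


# Made-up scores, for illustration only.
placebo = [8, 10, 9, 11, 7]
drug = [12, 14, 11, 15, 13]

t = pooled_t(placebo, drug)
f = one_way_f(placebo, drug)
print(round(t ** 2, 6), round(f, 6))
```

In a full k-group analysis the error term for a contrast would normally come from all k groups rather than just two, so this two-group case is only the simplest instance of the equivalence.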
42 Degrees of freedom of a contrast sum of squares: A contrast sum of squares compares two means. A contrast sum of squares, therefore, has ONE degree of freedom, because the two deviations from the grand mean sum to zero.
48 Contrasts with SPSS: Two approaches. The simpler is through the One-Way option in the Compare Means menu. The General Linear Model, however, provides many more useful statistics. I suggest you begin by exploring contrasts with the One-Way procedure, then move on to the General Linear Model menu.
53 Our Helmert contrasts: Each ringed item is a MEAN. In the top row, the Placebo mean is compared with the mean of the drug means. In the third row, the mean for Drug B is compared with the mean of the means for Drug C and Drug D.
54 Summary: A contrast is a comparison between two means. The contrasts can therefore be tested with either F or t. (F = t².) The contrast sums of squares sum to the value of SSbetween.
56 Heterogeneity of variance: The lower part of the table shows the results of tests of the same contrasts when homogeneity of variance is not assumed. Notice that the degrees of freedom have lower values.
57 Non-orthogonal contrasts: Contrasts don’t have to be independent. For example, you might wish to compare each of the four drug groups with the Placebo group. What you want are SIMPLE CONTRASTS.
58 Simple contrasts: These are linear contrasts – each row sums to zero. But they are not orthogonal – with some pairings, the sum of products of corresponding coefficients is not zero.
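Both properties of the simple contrasts can be verified directly. In this sketch the coefficient layout is an assumption (Placebo in the first column, coded -1, and each drug group coded +1 in its own row), since the slide's table is not reproduced in the text.

```python
from itertools import combinations

# Assumed layout: columns are Placebo, Drug A, Drug B, Drug C, Drug D.
simple_contrasts = [
    [-1, 1, 0, 0, 0],  # Drug A minus Placebo
    [-1, 0, 1, 0, 0],  # Drug B minus Placebo
    [-1, 0, 0, 1, 0],  # Drug C minus Placebo
    [-1, 0, 0, 0, 1],  # Drug D minus Placebo
]

# Each row should sum to zero (the defining property of a linear contrast).
row_sums = [sum(row) for row in simple_contrasts]

# Sum of products of corresponding coefficients for every pair of rows.
cross_products = [
    sum(a * b for a, b in zip(r1, r2))
    for r1, r2 in combinations(simple_contrasts, 2)
]

print(row_sums)
print(cross_products)
```

Under this coding every pair of rows shares the Placebo coefficient, so each sum of products is 1 rather than 0, which is why the set is not orthogonal.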
59 Simple contrasts with SPSS: Here are the entries for the first contrast, which is between the Placebo and Drug A groups. Below that are the entries for the final contrast between the Placebo and Drug D groups.
60 The results: In the column headed ‘Value of Contrast’ are the differences between pairs of treatment means. For example, Drug A mean minus Placebo mean = … and Drug D mean minus Placebo mean = 13.00 – 8.00 = 5.00.
61 Trend analysis: Sometimes the factor (independent variable) may be quantitative and continuous. The theory of contrasts can be extended to study trends in the relationship between the factor and the dependent variable. The following slides outline the procedure.
62 Polynomials: A POLYNOMIAL is a sum of terms, each of which is a product of a constant and a power of the same variable. The highest power n is the DEGREE of the polynomial.
63 Graphs of some polynomials: [Figure: four panels showing LINEAR, QUADRATIC, CUBIC and QUARTIC curves.]
64 Fitting points with polynomials: A first-order polynomial (line) does not change direction at all. But you can adjust the constants to fit any TWO points. A second-order polynomial (parabola) changes direction ONCE and can be fitted to any THREE points. A third-order polynomial changes direction TWICE and can be fitted to any FOUR points.
65 Fitting points with polynomials… In general, any k points can be fitted perfectly by a polynomial of order k – 1.
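The general rule can be illustrated with Lagrange interpolation, one standard way of constructing the polynomial of order k – 1 through k points. This is a sketch under two assumptions: the points have distinct X values, and the five (dose, mean) pairs are made up for illustration. The function name lagrange_polynomial is hypothetical.

```python
def lagrange_polynomial(points):
    """Return a function that evaluates the polynomial of degree <= k - 1
    passing through the k given (x, y) points (x values must be distinct)."""

    def evaluate(x):
        total = 0.0
        for i, (xi, yi) in enumerate(points):
            term = yi
            for j, (xj, _) in enumerate(points):
                if j != i:
                    term *= (x - xj) / (xi - xj)
            total += term
        return total

    return evaluate


# Made-up (dose, mean) pairs, for illustration only.
points = [(0, 8.0), (10, 8.0), (20, 12.0), (30, 14.0), (40, 13.0)]
poly = lagrange_polynomial(points)

# The fourth-order polynomial reproduces all five points.
for x, y in points:
    print(x, y, round(poly(x), 6))
```

Because five points can always be fitted perfectly in this way, a perfect fit by the full set of polynomial terms is not in itself evidence of any particular trend.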
67 Another drug experiment: In the drug experiment, the independent variable (or factor) comprised a set of five qualitatively different conditions. There was no intrinsic ordering of the categories. The order in which the variables appeared in Data View was entirely arbitrary. Now suppose that the five groups vary in the extent to which the same drug was present. The Placebo, A, B, C and D groups have dosages of 0, 10, 20, 30 and 40 units of the drug, respectively. The five groups are now ordered with respect to a CONTINUOUS INDEPENDENT VARIABLE.
68 A linear trend: There is evidence of a linear TREND in these data. The pattern, however, is imperfect – other trends (e.g. quadratic) may be present as well. On the other hand, the irregularity may reflect random error.
69 Capturing the linear trend: Consider the linear contrast with coefficients -2, -1, 0, 1, 2. If we plot these values against X (the concentration of the drug), we shall have the graph of a straight line.
70 Polynomial coefficients: The coefficients in this contrast are actually values of the polynomial y = x – 3, taking x = 1, 2, 3, 4, 5 for the five groups. The sum of squares of this contrast captures or reflects the linear trend in the data.
71 Orthogonal polynomial contrasts: Here is a set of orthogonal contrasts. The values in each row are values of one polynomial for various values of X, the continuous independent variable. The top row is a first degree (linear) polynomial, the next row is a second degree (quadratic) polynomial and so on.
72 Trend analysis: Although the entries in a row are values of the same polynomial (whether linear or not), they are still the coefficients of a linear contrast: they sum to zero; moreover, the products of the corresponding coefficients also sum to zero. We have an ORTHOGONAL SET of contrasts. Associated with each contrast is a sum of squares which captures that particular trend in the data. The contrasts are tested in the usual way.
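As an illustration, the sketch below checks the orthogonal polynomial coefficients that standard tables give for five equally spaced levels. These values are assumed to match the set shown on the slide, which is not reproduced in the text.

```python
from itertools import combinations

# Tabled orthogonal polynomial coefficients for k = 5 equally spaced levels
# (assumed to correspond to the slide's table).
polynomial_contrasts = {
    "linear": [-2, -1, 0, 1, 2],
    "quadratic": [2, -1, -2, -1, 2],
    "cubic": [-1, 2, 0, -2, 1],
    "quartic": [1, -4, 6, -4, 1],
}

# Each row is a linear contrast: its coefficients sum to zero.
row_sums = {name: sum(row) for name, row in polynomial_contrasts.items()}

# Each pair of rows is orthogonal: the sum of products is zero.
pair_products = {
    (a, b): sum(x * y for x, y in zip(polynomial_contrasts[a], polynomial_contrasts[b]))
    for a, b in combinations(polynomial_contrasts, 2)
}

print(row_sums)
print(pair_products)
```

All four row sums and all six sums of products are zero, so the rows form an orthogonal set even though the quadratic, cubic and quartic rows are values of non-linear polynomials.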
73 Ordering a linear polynomial contrast: You must check the Polynomial box and specify the order of the polynomial, here linear (1st degree). Orthogonal polynomial sets are obtainable from tables in statistics books, such as Howell (2007), which provide orthogonal sets for sets of means of various sizes.
74 Ordering a quadratic polynomial contrast: You must now specify a Quadratic (2nd degree) polynomial. The coefficients are entered in the usual way.
75 A trend analysis: The relevant results are ringed. You can see that only the linear trend is significant. This formal analysis confirms the appearance of the profile plot.
76 Partition of the between groups sum of squares: Since we have an orthogonal set of contrasts, their sums of squares sum to the ANOVA between groups sum of squares.
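The partition can be demonstrated with a short calculation. This sketch assumes equal group sizes and uses the usual equal-n formula for a contrast sum of squares, n times the squared contrast value divided by the sum of the squared coefficients. The group means are made up, n = 10 per group is taken from the 50 scores in five groups, and the coefficients are the standard tabled orthogonal polynomials for five levels.

```python
# Made-up group means, for illustration only; n = 10 scores per group.
means = [8.0, 8.0, 12.0, 14.0, 13.0]
n = 10

contrasts = {
    "linear": [-2, -1, 0, 1, 2],
    "quadratic": [2, -1, -2, -1, 2],
    "cubic": [-1, 2, 0, -2, 1],
    "quartic": [1, -4, 6, -4, 1],
}


def contrast_ss(coefficients, means, n):
    """Sum of squares for one contrast, assuming n scores in every group."""
    psi = sum(c * m for c, m in zip(coefficients, means))
    return n * psi ** 2 / sum(c ** 2 for c in coefficients)


grand_mean = sum(means) / len(means)
ss_between = n * sum((m - grand_mean) ** 2 for m in means)
ss_by_contrast = {name: contrast_ss(c, means, n) for name, c in contrasts.items()}

for name, ss in ss_by_contrast.items():
    print(name, round(ss, 3))
print("sum of contrast SS:", round(sum(ss_by_contrast.values()), 3))
print("SSbetween:", round(ss_between, 3))

# What remains of SSbetween after the linear trend has been removed.
print("deviation from linear:", round(ss_between - ss_by_contrast["linear"], 3))
```

With these made-up means the four contrast sums of squares add up to SSbetween, and subtracting the linear sum of squares leaves the part of the between groups variation that the linear trend does not account for.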
77 Deviations in the ANOVA table: The DEVIATION sum of squares is what remains of SSbetween when the last contrast sum of squares has been subtracted. Each deviation has one degree of freedom fewer than the previous deviation (if there is one).
78 The deviations: The first deviation SS (with df = 3) is obtained by subtracting the linear SS from SSbetween. The second deviation has df = 2. Both the linear and the quadratic trends have now been removed.
82 Alternative analyses: As usual, t-tests are also made without assuming homogeneity of variance (lower half). The values of df are markedly lower, suggesting that we should go by the tests in the lower part of the table.
83 A useful question: Are you making comparisons or measuring association? If you’re making comparisons, you may need statistics such as the t-test and ANOVA. If you’re investigating associations, you will need techniques such as correlation and regression.
84 Purpose of this section: Today I intend to build some bridges between the statistics of comparison and association. I hope to show that in some circumstances, the making of a comparison and the investigation of an association are equivalent.
91 Warning! This significance test presupposes that the distribution is BIVARIATE NORMAL, which implies that the scatterplot is elliptical (or circular) in shape. ALWAYS CHECK THIS OUT BY INSPECTING THE SCATTERPLOT.
92 Independence: Select a large sample at random from a population and array the values in a column. Select another sample from the same population at random and array those values alongside the values of the first sample. The two samples are independent, because the data are not paired in any meaningful sense. The correlation between the two columns of values should be approximately zero.
94 Regression: Regression is a set of techniques for exploiting the presence of statistical association among variables to make predictions of values of one variable (the DV or CRITERION) from knowledge of the values of other variables (the IVs or REGRESSORS).
95 Simple and multiple regression: In the simplest case, there is just one IV. This is known as SIMPLE regression. In MULTIPLE regression, there are two or more IVs.
96 The regression line of actual violence upon film preference
97 The regression line of Violence upon Preference: The REGRESSION LINE is the line that fits the points best from the point of view of predicting Actual Violence from Preference. (A different line would be drawn were we to try to predict Preference from Actual Violence.)
102 Residual scores: Suppose we use the regression line of Y upon X to predict the value of a person’s score Y from a particular value of X. A RESIDUAL (e) is the difference between the person’s actual score on Y and the predicted value given by the point on the regression line.
106 Summary: The regression line has the equation Y′ = B0 + B1X, where B0 is the regression constant (intercept) and B1 is the regression coefficient (slope). Y′ is the Y-coordinate of the point on the line above the value X. An increase of one unit on variable X will result in an estimated increase of B1 units on variable Y. A NEGATIVE value of B1 means that an increase of one unit on variable X will result in an estimated REDUCTION of |B1| units on Y.
107 The ‘least-squares’ criterion: The regression line is the ‘best-fitting’ line in the sense that it minimises the sum of the squares of the residuals.
112 The coefficient of determination (r²): The COEFFICIENT OF DETERMINATION (r²) is the proportion of the variance of the predicted variable accounted for by regression. The coefficient of determination can take values within the range from 0 to +1, inclusive.
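The least-squares line, its residuals and the coefficient of determination can all be computed in a few lines. The sketch below uses made-up X and Y values and a hypothetical function name, simple_regression; it applies the usual closed-form solution for the slope and intercept.

```python
from statistics import mean


def simple_regression(x, y):
    """Least-squares intercept B0 and slope B1 for predicting y from x."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    b1 = sxy / sxx
    b0 = my - b1 * mx
    return b0, b1


# Made-up scores, for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

b0, b1 = simple_regression(x, y)
predicted = [b0 + b1 * a for a in x]
residuals = [b - p for b, p in zip(y, predicted)]

ss_residual = sum(e ** 2 for e in residuals)
ss_total = sum((b - mean(y)) ** 2 for b in y)
r_squared = 1 - ss_residual / ss_total  # proportion of variance accounted for

print(round(b0, 3), round(b1, 3), round(r_squared, 3))
```

Any other choice of intercept and slope would give a larger residual sum of squares for these scores, which is the sense in which the line is ‘best-fitting’.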
118 Using more than one regressor: By analogous methods, we could try to predict a person’s actual violence from exposure to screen violence and number of years of education. This is a problem in MULTIPLE REGRESSION.
120 Geometrical interpretation: This is the equation of a plane (or hyperplane) with slopes B1, B2, …, Bp with respect to axes X1, X2, …, Xp and intercept B0. The slopes are the PARTIAL REGRESSION COEFFICIENTS and the intercept is the CONSTANT.
121 Regression coefficients: In simple regression the REGRESSION COEFFICIENT (B1) is the estimated change in units of the DV that would result from an increase of one unit in the IV. In multiple regression, a PARTIAL REGRESSION COEFFICIENT such as B1 is the estimated change in the DV resulting from an increase of one unit in the IV X1 with ALL OTHER IVs HELD CONSTANT.
122 The multiple correlation coefficient R: The MULTIPLE CORRELATION COEFFICIENT is the correlation between the estimates Y′ and the actual values of the DV (Y). The COEFFICIENT OF DETERMINATION (R²) is the proportion of the variance of Y that is accounted for by regression.
123 Range of R: The multiple correlation coefficient R can only have non-negative values: 0 ≤ R ≤ +1. This is because the regression line (or plane) cannot have a slope of opposite sign to that of the elliptical (or hyperelliptical) scatterplot.
124 Attribution of variance to regressors: If the IVs are uncorrelated, it is easy to attribute variance in Y to each of the independent variables X.
129 Dummy variables: Information about group membership is carried by a grouping variable. A DUMMY VARIABLE has only two values: 0 and 1, where 0 usually denotes the control or comparison condition – in this case the Placebo.
130 Point-biserial correlation: If we correlate the scores in the Score column with the dummy variable in the Group column, we obtain what is known as a POINT-BISERIAL CORRELATION. The meaning of ‘point-biserial’ is lost in the mists of antiquity. The point is that we are correlating a measured variable with code numbers for category membership.
132 A link: The point biserial correlation is of limited value as a descriptive statistic. However, it forms a useful conceptual bridge between the statistics of comparison (t-test) and association (correlation).
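The bridge can be made concrete. In the sketch below, which uses made-up scores and hypothetical function names, the point-biserial correlation r between a 0/1 dummy variable and the scores is converted to a t value with the standard identity t = r√df / √(1 – r²), where df = N – 2, and compared with the pooled-variance t test on the same data.

```python
from math import sqrt
from statistics import mean


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def pooled_t(treated, control):
    """Independent-samples t (treated minus control) with pooled variance."""
    n1, n0 = len(treated), len(control)
    m1, m0 = mean(treated), mean(control)
    ss1 = sum((x - m1) ** 2 for x in treated)
    ss0 = sum((x - m0) ** 2 for x in control)
    pooled_var = (ss1 + ss0) / (n1 + n0 - 2)
    return (m1 - m0) / sqrt(pooled_var * (1 / n1 + 1 / n0))


# Made-up scores, for illustration only.
placebo = [8, 10, 9, 11, 7]
caffeine = [12, 14, 11, 15, 13]

scores = placebo + caffeine
group = [0] * len(placebo) + [1] * len(caffeine)  # dummy variable: 0 = Placebo

r = pearson_r(group, scores)
df = len(scores) - 2
t_from_r = r * sqrt(df) / sqrt(1 - r ** 2)
t_direct = pooled_t(caffeine, placebo)

print(round(r, 4), round(t_from_r, 4), round(t_direct, 4))
```

The two t values agree, so testing the point-biserial correlation for significance is the same test as the independent-samples t test.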
133 Regression upon dummy variables: We shall now regress the scores that people achieved in the Caffeine experiment upon the dummy variable carrying group membership.
134 The regression line will pass through the group means
135 Why? OLS regression minimises the sum of the squares of the residuals. In either group of scores, the sum of the squared deviations about the mean is a minimum. The line that minimises the squared residuals must therefore pass through both group means.
136 The sum of squares of deviations about the mean is a minimum
137 The regression statistics: When we regress the Score variable upon the dummy variable, the intercept of the regression line is the mean score of the Placebo group. The slope of the regression line is the difference between the means of the Caffeine and Placebo groups.
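A quick check of both claims, using made-up scores rather than the Caffeine data and a hypothetical function name, simple_regression. The dummy variable is coded 0 for Placebo and 1 for Caffeine.

```python
from statistics import mean


def simple_regression(x, y):
    """Least-squares intercept B0 and slope B1 for predicting y from x."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    b1 = sxy / sxx
    return my - b1 * mx, b1


# Made-up scores, for illustration only.
placebo = [8, 10, 9, 11, 7]
caffeine = [12, 14, 11, 15, 13]

scores = placebo + caffeine
dummy = [0] * len(placebo) + [1] * len(caffeine)

b0, b1 = simple_regression(dummy, scores)

print("intercept:", b0, "Placebo mean:", mean(placebo))
print("slope:", b1, "difference of means:", mean(caffeine) - mean(placebo))
```

The predicted value at X = 0 is the intercept and the predicted value at X = 1 is the intercept plus the slope, so the fitted line passes through the two group means.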
141 Significance tests: The intercept (Constant) is 9.25, the value of the Placebo mean. The slope is 2.65, which is 11.90 – 9.25, the difference between the Caffeine and Placebo means. t(38) = 2.604 (p = …). This is exactly the result we obtained with the independent samples t test.
142 Equivalence of ANOVA and regression: When we test the slope of the regression line for significance, we are also testing the difference between the Caffeine and Placebo means for significance. Since (in the 2-group case) the F and t tests are equivalent, the regression ANOVA table is identical with the one-way ANOVA table we obtained before.
143 Dummy coding for the k-group case: Since MSbetween has only four degrees of freedom, regression will predict the treatment means perfectly if the Score variable is regressed upon four dummy variables X1, X2, X3 and X4. As with the two-group example, an interesting equivalence emerges.
146 The regression statistics: The regression F is the same as the ANOVA value of F. We see that B0 is the Placebo mean and B1, B2, B3 and B4 are the differences between the means for the 4 drug conditions and the Placebo mean.
147 In summary: When the scores in the five-group drug experiment are regressed upon 4 dummy variables, the regression constant or intercept B0 is the Placebo mean. The partial regression coefficients are the differences between the means of the drug conditions and the Placebo mean. The regression sum of squares is equal to the ANOVA between groups sum of squares.
148 In summary … The t-tests of the regression coefficients are equivalent to the tests of the four simple contrasts, each of which compares one drug mean with the Placebo mean.
149 Eta squared: Returning to the one-way ANOVA, recall that eta squared (also known as the CORRELATION RATIO) is defined as the ratio of the between groups sum of squares to the total sum of squares. Its theoretical range of variation is from zero (no differences among the means) to unity (no variance in the scores of any group, but different values in different groups). In our example, η² = .447.
150 Eta squared revisited: If the scores from a k-group experiment are regressed upon k – 1 dummy variables, the square of the multiple correlation coefficient R is the proportion of variance of the scores accounted for by differences among the treatment means. Eta squared is R², which I think is why it is also termed the ‘correlation ratio’.
151 Formula for SSψ: We can think of a contrast sum of squares as the between treatments variability that is accounted for by a particular contrast. The sums of squares for orthogonal contrasts add up to the ANOVA between groups sum of squares.
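A standard form of the formula for equal group sizes is given below. The notation is an assumption rather than a quotation from the slide: n is the number of scores per group, c_j are the contrast coefficients, M_j are the treatment means and k is the number of groups.

```latex
\hat{\psi} = \sum_{j=1}^{k} c_j M_j, \qquad \sum_{j=1}^{k} c_j = 0, \qquad
SS_{\psi} = \frac{n\,\hat{\psi}^{2}}{\sum_{j=1}^{k} c_j^{2}}, \qquad
F = \frac{SS_{\psi}}{MS_{\text{within}}} \quad \text{with } df = (1,\; df_{\text{within}})
```

Because a contrast sum of squares has one degree of freedom, it is also its own mean square, which is why it can be divided directly by the within groups mean square.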
153 Building bridges: In these two sessions, in addition to revising (and adding to) some material with which you are already familiar, I have tried to demonstrate some striking equivalences between techniques which many think of as having quite different contexts and purposes.
154 Assignment: Please complete the project and hand it in to Anne before noon on Wednesday 31st October. I shall return your answers (with comments) by Wednesday 7th November.