14T tests in SPM: Did the observed signal change occur by chance or is it stat. significant? Recall GLM. Y= X β + εβ1 is an estimate of signal change over time attributable to the condition of interestSet up contrast (cT) 1 0 for β1: 1xβ1+0xβ2+0xβn/s.dNull hypothesis: cTβ=0 No significant effect at each voxel for condition β1Contrast 1 -1 : Is the difference between 2 conditions significantly non-zero?t = cTβ/sd[cTβ] – 1 sided
20ConclusionsT tests describe how unlikely it is that experimental differences are due to chanceHigher the t score, smaller the p value, more unlikely to be due to chanceCan compare sample with population or 2 samples, paired or unpairedANOVA/F tests are similar but use variances instead of means and can be applied to more than 2 groups and other more complex scenarios
21Acknowledgements MfD slides 2004-2006 Van Belle, Biostatistics Human Brain FunctionWikipedia
23Topics Covered: Is there a relationship between x and y? What is the strength of this relationshipPearson’s rCan we describe this relationship and use it to predict y from x?RegressionIs the relationship we have described statistically significant?F- and t-testsRelevance to SPMGLMWe often want to know whether various variables are ‘linked’, i.e., correlated.This can be interesting in itself, but is also important if we want to predict one variable’s value given a value of the other.
24Relationship between x and y Correlation describes the strength and direction of a linear relationship between two variablesRegression tells you how well a certain independent variable predicts a dependent variableCORRELATION CAUSATIONIn order to infer causality: manipulate independent variable and observe effect on dependent variablemeans that correlation cannot be validly used to infer a causal relationship between the variablesnot be taken to mean that correlations cannot indicate causal relations. However, the causes underlying the correlation, if any, may be indirect and unknownA correlation between age and height in children is fairly causally transparent, but a correlation between mood and health in people is less so. Does improved mood lead to improved health? Or does good health lead to good mood? Or does some other factor underlie both? Or is it pure coincidence? In other words, a correlation can be taken as evidence for a possible causal relationship, but cannot indicate what the causal relationship, if any, might be.
26Variance vs. Covariance Do two variables change together?Variance ~DX * DXVariance is just a definition. The reason we’re squaring it is to have it get a positive value whether dx is negative or positive, so that we can sum them and positives and negatives will not cancel out.Variance is spread around a mean, covariance is the measure of how much x and y change together; very similar: multiply 2 variables rather than square 1Covariance ~DX * DY
27Covariance When X and Y : cov (x,y) = pos. When X and Y : cov (x,y) = neg.When no constant relationship: cov (x,y) = 0
28Example Covariance What does this number tell us? x ( )( ) 3 2 1 4 6 9 yi-()()321469=å7What does this number tell us?
29Example of how covariance value relies on variance High variance data Low variance dataSubjectxyx error * y errorX error * y error11011002500545392818090052436160515054140496212048747MeanSum of x error * y error :700028Covariance:4.67Problem with Covariance:The value obtained by covariance is dependent on the size of the data’s standard deviations: if large, the value will be greater than if small… even if the relationship between x and y is exactly the same in the large versus small standard deviation datasets.
30Pearson’s R Covariance does not really tell us anything Solution: standardise this measurePearson’s R: standardise by adding std to equation:Can only compare covariances between different variables to see which is greater.
31Basic assumptions Normal distributions Variances are constant and not zeroIndependent sampling – no autocorrelationsNo errors in the values of the independent variableAll causation in the model is one-way (not necessary mathematically, but essential for prediction)
32Pearson’s R: degree of linear dependence The distance of r from 0 indicates strength of correlationr = 1 or r = (-1) means that we can predict y from x and vice versa with certainty; all data points are on a straight line. i.e., y = ax + bThe correlation is 1 in the case of an increasing linear relationship, −1 in the case of a decreasing linear relationship, and some value in between in all other cases, indicating the degree of linear dependence between the variables. The closer the coefficient is to either −1 or 1, the stronger the correlation between the variables.the correlation coefficient detects only linear dependencies between two variables
33Limitations of r r is actually r = true r of whole population = estimate of r based on datar is very sensitive to extreme values:X av = 3, y av = 2If extreme value such as y = 5 is possible (graph)X – 1,2,3,4,5,Y – 1,2,3,4,0
34In the real world…r is never 1 or –1interpretations for correlations in psychological research (Cohen)Correlation Negative PositiveSmall to to 0.29Medium to to 0.49Large to to 1.00the interpretation of a correlation coefficient depends on the context and purposes. A correlation of 0.9 may be very low if one is verifying a physical law using high-quality instruments, but may be regarded as very high in the social sciences where there may be a greater contribution from complicating factors.
35RegressionCorrelation tells you if there is an association between x and y but it doesn’t describe the relationship or allow you to predict one variable from the other.To do this we need REGRESSION!
36Best-fit Line ŷ = ax + b ε = ŷ, predicted value = y i , true value Aim of linear regression is to fit a straight line, ŷ = ax + b, to data that gives best prediction of y for any value of xThis will be the line thatminimises distance betweendata and fitted line, i.e.the residualsεŷ = ax + bε = residual error= y i , true valueslopeinterceptSo to understand the relationship between two variables, we want to draw the ‘best’ line through the cloud – find the best fit.This is done using the principle of least squares= ŷ, predicted value
37Least Squares Regression To find the best line we must minimise the sum of the squares of the residuals (the vertical distances from the data points to our line)Model line: ŷ = ax + ba = slope, b = interceptResidual (ε) = y - ŷSum of squares of residuals = Σ (y – ŷ)2we must find values of a and b that minimiseΣ (y – ŷ)2
38Finding bFirst we find the value of b that gives the min sum of squaresbbεεbTrying different values of b is equivalent to shifting the line up and down the scatter plot
39Finding a Now we find the value of a that gives the min sum of squares bbbTrying out different values of a is equivalent to changing the slope of the line, while b stays constant
40Minimising sums of squares Values of a and bsums of squares (S)Gradient = 0min SNeed to minimise Σ(y–ŷ)2ŷ = ax + bso need to minimise:Σ(y - ax - b)2If we plot the sums of squares for all different values of a and b we get a parabola, because it is a squared termSo the min sum of squares is at the bottom of the curve, where the gradient is zero.
41The maths bitSo we can find a and b that give min sum of squares by taking partial derivatives of Σ(y - ax - b)2 with respect to a and b separatelyThen we solve these for 0 to give us the values of a and b that give the min sum of squares
42The solutionDoing this gives the following equations for a and b:a =r sysxr = correlation coefficient of x and ysy = standard deviation of ysx = standard deviation of xYou can see that:A low correlation coefficient gives a flatter slope (small value of a)Large spread of y, i.e. high standard deviation, results in a steeper slope (high value of a)Large spread of x, i.e. high standard deviation, results in a flatter slope (high value of a)
43The solution cont. r sy x b = y - sx Our model equation is ŷ = ax + bThis line must pass through the mean so:y = ax + bb = y – axb = y – axWe can put our equation into this giving:b = y -r sysxr = correlation coefficient of x and ysy = standard deviation of ysx = standard deviation of xxThe smaller the correlation, the closer the intercept is to the mean of y
44Back to the modelWe can calculate the regression line for any data, but the important question is:How well does this line fit the data, or how good is it at predicting y from x?
45How good is our model? Variance of predicted y values (ŷ): sy2 =∑(y – y)2n - 1SSydfy=Total variance of y:Variance of predicted y values (ŷ):This is the variance explained by our regression modelsŷ2 =∑(ŷ – y)2n - 1SSpreddfŷ=Error variance:This is the variance of the error between our predicted y values and the actual y values, and thus is the variance in y that is NOT explained by the regression modelserror2 =∑(y – ŷ)2n - 2SSerdfer=
46How good is our model cont. Total variance = predicted variance + error variancesy2 = sŷ2 + ser2Conveniently, via some complicated rearrangingsŷ2 = r2 sy2r2 = sŷ2 / sy2so r2 is the proportion of the variance in y that is explained by our regression model
47How good is our model cont. Insert r2 sy2 into sy2 = sŷ2 + ser2 and rearrange to get:ser2 = sy2 – r2sy2= sy2 (1 – r2)From this we can see that the greater the correlation the smaller the error variance, so the better our prediction
48Is the model significant? i.e. do we get a significantly better prediction of y from our regression equation than by just predicting the mean?F-statistic:complicatedrearrangingF(dfŷ,dfer)=sŷ2ser2r2 (n - 2)21 – r2=......=And it follows that:r (n - 2)So all we need toknow are r and n !t(n-2) =(because F = t2)√1 – r2
49General Linear ModelLinear regression is actually a form of the General Linear Model where the parameters are a, the slope of the line, and b, the intercept.y = ax + b +εA General Linear Model is just any model that describes the data in terms of a straight line
50Multiple regressionMultiple regression is used to determine the effect of a number of independent variables, x1, x2, x3 etc., on a single dependent variable, yThe different x variables are combined in a linear way and each has its own regression coefficient:y = a1x1+ a2x2 +…..+ anxn + b + εThe a parameters reflect the independent contribution of each independent variable, x, to the value of the dependent variable, y.i.e. the amount of variance in y that is accounted for by each x variable after all the other x variables have been accounted for
51SPMLinear regression is a GLM that models the effect of one independent variable, x, on ONE dependent variable, yMultiple Regression models the effect of several independent variables, x1, x2 etc, on ONE dependent variable, yBoth are types of General Linear ModelGLM can also allow you to analyse the effects of several independent x variables on several dependent variables, y1, y2, y3 etc, in a linear combinationThis is what SPM does and will be explained soon…