
1 t-tests, ANOVA and regression Methods for Dummies, February 1st 2006 Jon Roiser and Predrag Petrovic

2 Overview
- Simple hypothesis testing
- Z-tests & t-tests
- F-tests and ANOVA
- Correlation/regression

3 Starting point
- Central aim of statistical tests: determining the likelihood of a value in a sample, given that the null hypothesis is true: P(value|H0)
- H0: no statistically significant difference between sample & population (or between samples)
- H1: a statistically significant difference between sample & population (or between samples)
- Significance level: P(value|H0) < 0.05

4 Types of error
                 True state of the world
Decision         H0 true                                    H0 false
Accept H0        Correct acceptance (p = 1 - α)             β-error (Type II error): false negative
Reject H0        α-error (Type I error): false positive     Correct rejection (p = 1 - β: power)

5 Distribution & probability If we know something about the distribution of events in a population, we know something about the probability of these events. (Figure: normal curve with population mean μ, population standard deviation σ and tail areas of α/2.)

6 Standardised normal distribution
- The z-score represents a value on the x-axis for which we know the p-value
- 2-tailed: z = 1.96 is equivalent to p = 0.05 (rule of thumb: ~2 SD)
- 1-tailed: z = 1.65 is equivalent to p = 0.05 (the area between 1.65 and infinity in one tail = 5%)
- Standardising: one point compared to the population, z = (x − μ)/σ; a group mean compared to the population, z = (x̄ − μ)/(σ/√N) (cut-offs checked in the sketch below)
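A quick numerical check of these cut-offs, as a minimal sketch using scipy (illustrative, not part of the original slides):

```python
# Minimal sketch (not from the slides): recover the quoted z cut-offs
# from the standard normal distribution with scipy.
from scipy import stats

z_two_tailed = stats.norm.ppf(1 - 0.05 / 2)  # 2.5% in each tail
z_one_tailed = stats.norm.ppf(1 - 0.05)      # 5% in one tail

print(f"{z_two_tailed:.3f}")  # 1.960
print(f"{z_one_tailed:.3f}")  # 1.645 (the slide's 1.65)
```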

7 Assumptions of parametric tests
- Variables are normally distributed
- N > 10 (or 12, or 15…)
- Variables are on an interval or ideally ratio scale (e.g. 2 metres = 2 × 1 metre)
…but parametric tests are (fairly) robust to violations of these assumptions

8 Z- versus t-statistic?
- Z is used when we know the variance in the general population (e.g. IQ). This is not normally true!
- t is used when we do not know the variance of the underlying population for sure, and its distribution depends on N
- The t distribution is similar to Z, but flatter; for N > 30, t ≈ Z (figure: t distributions for large vs small N; see the sketch below)
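A minimal sketch of this convergence, assuming df = N − 1 as for a one-sample comparison (illustrative, not from the slides):

```python
# Minimal sketch: the two-tailed 5% t cut-off approaches the z cut-off
# as N grows (df = N - 1 assumed, as for a one-sample test).
from scipy import stats

z = stats.norm.ppf(0.975)
for n in (5, 10, 30, 100):
    t = stats.t.ppf(0.975, df=n - 1)
    print(f"N={n:3d}: t={t:.3f} vs z={z:.3f}")
# Small N gives a larger cut-off (flatter distribution, fatter tails);
# by N > 30 the two are nearly equal.
```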

9 Two-sample t-test The t statistic is the difference between the two group means divided by the pooled standard error of the mean.
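A minimal sketch of this computation with made-up data; it assumes the equal-variance (pooled) form, which is also scipy's ttest_ind default:

```python
# Minimal sketch with made-up data: t = (mean difference) / (pooled SEM),
# matching scipy's equal-variance ttest_ind.
import numpy as np
from scipy import stats

g1 = np.array([5.1, 4.8, 6.0, 5.5, 5.9])  # made-up scores, group 1
g2 = np.array([4.2, 4.9, 4.0, 4.6, 4.4])  # made-up scores, group 2

n1, n2 = len(g1), len(g2)
# Pooled variance, then pooled standard error of the mean difference.
sp2 = ((n1 - 1) * g1.var(ddof=1) + (n2 - 1) * g2.var(ddof=1)) / (n1 + n2 - 2)
se_pooled = np.sqrt(sp2 * (1 / n1 + 1 / n2))
t_manual = (g1.mean() - g2.mean()) / se_pooled

t_scipy, p = stats.ttest_ind(g1, g2)  # equal-variance (pooled) by default
print(t_manual, t_scipy)              # the two t values agree
```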

10 Different types of t-test
- One-sample: tests whether the mean of a population differs from a given value (e.g. chance performance = 50% in 2-alternative forced choice)
- Paired t-test (within subjects): tests whether a group of individuals tested under condition A differs from the same individuals tested under condition B; must have 2 values for each subject
- A paired t-test is basically the same as a one-sample t-test on the difference scores, comparing them to 0 (see the sketch below)
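A minimal sketch of the paired/one-sample equivalence, with made-up condition A/B scores:

```python
# Minimal sketch with made-up data: a paired t-test equals a one-sample
# t-test on the difference scores against 0.
import numpy as np
from scipy import stats

cond_a = np.array([12.0, 15.0, 11.0, 14.0, 13.0, 16.0])  # made-up scores
cond_b = np.array([10.0, 14.0, 10.0, 12.0, 13.0, 13.0])  # same subjects

t_paired, p_paired = stats.ttest_rel(cond_a, cond_b)
t_diff, p_diff = stats.ttest_1samp(cond_a - cond_b, 0.0)
print(t_paired, t_diff)  # identical
print(p_paired, p_diff)  # identical
```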

11 Another approach to group differences
- Instead of thinking about the group means, we can instead think about variances
- Recall the sample variance: s² = Σ(x − x̄)²/(N − 1)
- F = Variance 1 / Variance 2
- ANOVA = ANalysis Of VAriance
- Total variance = model variance + error variance

12 Partitioning the variance Total = Model (between groups) + Error (within groups). (Figure: the same two-group data shown partitioned into total, model and error variation.)

13 ANOVA At its simplest, one-way ANOVA is the same as the two-sample t-test. Recall t = difference between means / spread around means (pooled standard error of the mean). The model (difference between means) is the between-groups part; the error (spread around means) is the within-groups part.

14 A quick proof from SPSS Running an ANOVA and a t-test on the same two groups: in fact t = SQRT(F), e.g. 2.764 = SQRT(7.638) (for 1 model degree of freedom).
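The SPSS output itself is not reproduced here, but the same t = √F identity can be checked on any two groups; a minimal sketch with made-up data:

```python
# Minimal sketch with made-up groups (not the SPSS example): for two
# groups, one-way ANOVA and the two-sample t-test give t = sqrt(F).
import numpy as np
from scipy import stats

g1 = np.array([5.1, 4.8, 6.0, 5.5, 5.9])
g2 = np.array([4.2, 4.9, 4.0, 4.6, 4.4])

f_val, p_anova = stats.f_oneway(g1, g2)
t_val, p_ttest = stats.ttest_ind(g1, g2)
print(np.sqrt(f_val), abs(t_val))  # equal
print(p_anova, p_ttest)            # equal
```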

15 ANOVA is useful for more complex designs More than 1 group (e.g. Groups 1, 2 and 3); more than 1 effect, including interactions (e.g. male/female × drug/placebo)… but we need to use post-hoc tests (t-tests corrected for multiple comparisons) to interpret the results.

16 Differences between t-tests and F-tests (especially in SPM)
- t-tests can only be used to compare 2 groups/effects, while ANOVA can handle more sophisticated designs (several groups/several effects/interactions)
- In SPM t-tests are one-tailed (i.e. for contrast X−Y, significant voxels are only reported where X>Y)
- In SPM F-tests are two-tailed (i.e. for contrast X−Y, significant voxels are reported for both X>Y and Y>X)

17 Correlation and Regression
- Is there a relationship between x and y?
- What is the strength of this relationship? Pearson's r
- Can we describe this relationship and use it to predict y from x? Regression: fitting a line using the least squares solution
- Is the relationship we have described statistically significant? Significance tests
- Relevance to SPM: the GLM

18 Correlation and Regression (outline from slide 17 repeated; up next: is there a relationship between x and y?)

19 Correlation and Regression
- Correlation: predictability about the relationship between two variables
- Covariance: a measurement of this predictability
- Regression: a description of the relationship between two variables, where one is dependent and the other independent
- There is no causality in any of these models

20 Covariance
- When X increases and Y increases: cov(x,y) is positive
- When X increases and Y decreases: cov(x,y) is negative
- When there is no consistent relationship: cov(x,y) = 0
- cov(x,y) = Σ(x − x̄)(y − ȳ)/(N − 1)
- The value is dependent on the size of the data's standard deviations(!), so we need to standardise the data (Pearson's r; see the sketch below)
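A minimal sketch of this scale dependence with made-up data; rescaling x inflates the covariance but leaves the standardised measure untouched:

```python
# Minimal sketch with made-up data: rescaling x inflates the covariance
# but leaves the standardised measure (Pearson's r) unchanged.
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([1.2, 1.9, 3.2, 3.8, 5.1])

print(np.cov(x, y)[0, 1])             # positive covariance
print(np.cov(x * 100, y)[0, 1])       # same relationship, 100x the covariance
print(np.corrcoef(x, y)[0, 1])        # r is scale-free...
print(np.corrcoef(x * 100, y)[0, 1])  # ...identical after rescaling
```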

21 Correlation and Regression (outline from slide 17 repeated; up next: Pearson's r)

22 Pearson's r
- Covariance on its own does not really tell us much, because its size depends on the units of the data
- Solution: standardise this measure
- Pearson's r standardises the covariance value: it divides the covariance by the multiplied standard deviations of X and Y: r = cov(x,y)/(sx·sy) (checked in the sketch below)
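A minimal sketch of this standardisation with made-up data, checked against scipy's pearsonr:

```python
# Minimal sketch with made-up data: r = cov(x, y) / (sx * sy),
# checked against scipy's pearsonr.
import numpy as np
from scipy import stats

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([1.2, 1.9, 3.2, 3.8, 5.1])

r_manual = np.cov(x, y)[0, 1] / (x.std(ddof=1) * y.std(ddof=1))
r_scipy, p = stats.pearsonr(x, y)
print(r_manual, r_scipy)  # identical
```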

23 Correlation and Regression (outline from slide 17 repeated; up next: regression and the least squares solution)

24 Best-fit line The aim of linear regression is to fit a straight line, ŷ = ax + b (a = slope, b = intercept), that gives the best prediction of y for any value of x. This will be the line that minimises the distance between the data and the fitted line, i.e. the residuals ε = y − ŷ, where ŷ is the predicted value and y is the true value.

25 Least squares regression To find the best line we must minimise the sum of the squares of the residuals (the vertical distances from the data points to our line). Residual: ε = y − ŷ. Sum of squares (SS) of residuals: Σ(y − ŷ)². Model line: ŷ = ax + b (a = slope, b = intercept). We must find the values of a and b that minimise Σ(y − ŷ)².

26 Finding b First we find the value of b that gives the least sum of squares. Trying different values of b is equivalent to shifting the line up and down the scatter plot.

27 Finding a Now we find the value of a that gives the least sum of squares. Trying out different values of a is equivalent to changing the slope of the line, while b stays constant.

28 Minimising the sum of squares We need to minimise Σ(y − ŷ)². Since ŷ = ax + b, we need to minimise Σ(y − ax − b)². If we plot the sum of squares for all different values of a and b we get a parabola, because it is a squared term. So the minimum sum of squares is at the bottom of the curve, where the gradient is zero.

29 The solution Doing this gives the following equation for a: a = r·(sy/sx), where r = correlation coefficient of x and y, sy = standard deviation of y, sx = standard deviation of x. From this we can see that: a low correlation coefficient gives a flatter slope (low value of a); a large spread of y (high sy) gives a steeper slope (high value of a); a large spread of x (high sx) gives a flatter slope (low value of a).

30 The solution continued Our model equation is ŷ = ax + b. This line must pass through the mean point (x̄, ȳ), so ȳ = ax̄ + b, i.e. b = ȳ − ax̄. Substituting our equation for a gives: b = ȳ − r·(sy/sx)·x̄. The smaller the correlation, the closer the intercept is to the mean of y.
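A minimal sketch of these two formulas with made-up data, checked against numpy's general least-squares fit:

```python
# Minimal sketch with made-up data: slope a = r * sy / sx and intercept
# b = mean(y) - a * mean(x), checked against numpy's least-squares fit.
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([1.2, 1.9, 3.2, 3.8, 5.1])

r = np.corrcoef(x, y)[0, 1]
a = r * y.std(ddof=1) / x.std(ddof=1)
b = y.mean() - a * x.mean()

a_np, b_np = np.polyfit(x, y, deg=1)  # returns slope, then intercept
print((a, b), (a_np, b_np))           # identical
```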

31 Back to the model Substituting a and b, ŷ = ax + b rearranges to: ŷ = r·(sy/sx)·(x − x̄) + ȳ. If the correlation is zero, we will simply predict the mean of y for every value of x: the regression line is just a flat horizontal line at height ȳ. But this isn't very useful. We can calculate the regression line for any data; the important question is how well this line fits the data, or how good it is at predicting y from x.

32 Correlation and Regression (outline from slide 17 repeated; up next: significance tests)

33 How can we determine the significance of the model? We've determined the form of the relationship (ŷ = ax + b). Does a prediction based on this model do a better job than just predicting the mean?

34 We can solve this using ANOVA In general: total variance = predicted (or model) variance + error variance. In a one-way ANOVA: SS_Total = SS_Model + SS_Error, and F(df_model, df_error) = MS_Model/MS_Error, where MS = SS/df.
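A minimal sketch of this partition for a one-way ANOVA with three made-up groups, checked against scipy's f_oneway:

```python
# Minimal sketch with three made-up groups: SS_total = SS_model + SS_error,
# MS = SS / df, F = MS_model / MS_error, checked against scipy's f_oneway.
import numpy as np
from scipy import stats

groups = [np.array([5.1, 4.8, 6.0, 5.5, 5.9]),
          np.array([4.2, 4.9, 4.0, 4.6, 4.4]),
          np.array([6.1, 5.8, 6.5, 6.0, 6.3])]

all_y = np.concatenate(groups)
grand_mean = all_y.mean()

ss_model = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)
ss_error = sum(((g - g.mean()) ** 2).sum() for g in groups)
df_model = len(groups) - 1
df_error = len(all_y) - len(groups)

f_manual = (ss_model / df_model) / (ss_error / df_error)
f_scipy, p = stats.f_oneway(*groups)
print(f_manual, f_scipy)  # identical
```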

35 Partitioning the variance for linear regression (using ANOVA) Total = Model + Error, i.e. Σ(y − ȳ)² = Σ(ŷ − ȳ)² (between) + Σ(y − ŷ)² (within). So linear regression and ANOVA are doing the same thing statistically, and are the same as correlation…

36 Another quick proof from SPSS Running a regression and a correlation (Pearson's r) on the same data gives the same result.

37 Relating the F and t statistics F(df_model, df_error) = MS_Model/MS_Error = r̂²(N − 2)/(1 − r̂²). Alternatively (as F is the square of t): t(N − 2) = r̂·√(N − 2)/√(1 − r̂²). So all we need to know is N and r!
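A minimal sketch verifying these identities with made-up data; the two-tailed p computed from t matches the p reported by scipy's pearsonr:

```python
# Minimal sketch with made-up data: t = r*sqrt(N-2)/sqrt(1-r^2), F = t^2,
# and the two-tailed p from t matches the p reported by pearsonr.
import numpy as np
from scipy import stats

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([1.2, 1.9, 3.2, 3.8, 5.1, 5.4])
n = len(x)

r, p_scipy = stats.pearsonr(x, y)
t = r * np.sqrt(n - 2) / np.sqrt(1 - r ** 2)
f = t ** 2
p_manual = 2 * stats.t.sf(abs(t), df=n - 2)  # two-tailed p from t(N-2)
print(t, f)
print(p_manual, p_scipy)  # identical
```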

38 Basic assumptions
- Variables: ratio or interval, with >10 (or 12, or 15…) different pairs of values
- Variables normally distributed in the population
- A linear relationship
- Residuals (errors) should be normally distributed
- Independent sampling

39 Regression health warning! Warning 1: outliers.

40 Regression health warning! Warning 2: more than 1 different population or contrast in the data (aka the ecological fallacy). (Example: Science (1997) 277:968-71; 519 citations.)

41 Correlation and Regression (outline from slide 17 repeated; up next: relevance to SPM, the GLM)

42 Multiple regression
- Multiple regression is used to determine the effect of a number of independent variables, x1, x2, x3 etc., on a single dependent variable, y
- The different x variables are combined in a linear way, and each has its own regression coefficient: y = a1x1 + a2x2 + … + anxn + b + ε (see the sketch below)
- The a parameters reflect the independent contribution of each independent variable, x, to the value of the dependent variable, y, i.e. the amount of variance in y that is accounted for by each x variable after all the other x variables have been accounted for
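A minimal sketch of a multiple regression fit on simulated data; the coefficient names (a1, a2, b) follow the slide's notation, everything else is illustrative:

```python
# Minimal sketch on simulated data: fit y = a1*x1 + a2*x2 + b + error by
# least squares. The coefficient names follow the slide; the data do not.
import numpy as np

rng = np.random.default_rng(0)
n = 50
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
y = 2.0 * x1 - 1.0 * x2 + 0.5 + rng.normal(scale=0.1, size=n)

# Design matrix: one column per predictor plus a constant column for b.
X = np.column_stack([x1, x2, np.ones(n)])
coeffs, *_ = np.linalg.lstsq(X, y, rcond=None)
print(coeffs)  # approximately [2.0, -1.0, 0.5] = [a1, a2, b]
```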

43 SPM
- Linear regression is a GLM that models the effect of one independent variable, x, on ONE dependent variable, y
- Multiple regression models the effect of several independent variables, x1, x2 etc., on ONE dependent variable, y
- Both are types of General Linear Model
- The GLM also allows you to analyse the effects of several independent x variables on several dependent variables, y1, y2, y3 etc., in a linear combination
- This is what SPM does!

44 Acknowledgements Previous years' slides. David Howell's excellent book Statistical Methods for Psychology (2002), and David Howell's website: http://www.uvm.edu/~dhowell/StatPages/StatHomePage.html. The lecturers declare that they do not own stocks, shares or capital investments in David Howell's book, they are not employed by the Duxbury group and do not consult for them, nor are they associated with David Howell, or his friends, or his family, or his cat.

