Understanding the General Linear Model


1 Understanding the General Linear Model
Regression: Understanding the General Linear Model

2 Relations among variables
A goal of science is the prediction and explanation of phenomena. In order to do so, we must find events that are related in some way, such that knowledge about one leads to knowledge about the other. In psychology we seek to understand the relationships among variables that serve as indicators of the innumerable facets of human nature, in order to better understand ourselves and why we do the things we do.

3 Correlation While we could just use our N-of-1 personal experience to try to understand human behavior, a scientific (and better) means of understanding the relationship between variables is by assessing their correlation. Two variables take on different values, but if they are related in some fashion they will covary. They may do so in a way in which their values tend to move in the same direction, or they may tend to move in opposite directions. The underlying statistic assessing this is covariance, which is at the heart of every statistical procedure you are likely to use inferentially.

4 Covariance and Correlation
Covariance as a statistical construct is unbounded and thus difficult to interpret in its raw form. Correlation (Pearson’s r) is a measure of the direction and degree of a linear association between two variables. Correlation is the standardized covariance between two variables.
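
As a minimal sketch of that relationship (not part of the original slides, and using made-up illustration data), the correlation can be computed directly as the covariance standardized by the two standard deviations:

```python
import numpy as np

# Made-up illustration data (not from the slides)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.0, 1.0, 4.0, 3.0, 6.0])

# Covariance: unbounded and scale-dependent, hard to interpret on its own
cov_xy = np.cov(x, y, ddof=1)[0, 1]

# Correlation: the same covariance standardized by both standard deviations
r = cov_xy / (np.std(x, ddof=1) * np.std(y, ddof=1))

print(cov_xy, r, np.corrcoef(x, y)[0, 1])  # the last two values should match
```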

5 Regression Regression allows us to use the information about covariance to make predictions. Given a particular value of X, we can predict Y with some level of accuracy. The basic model is that of a straight line (the general linear model); only one possible straight line can be drawn once the slope and Y intercept are specified. The formula for a straight line is Y = bX + a, where Y is the calculated value of the variable on the vertical axis, a is the intercept, b is the slope of the line, and X is a value of the variable on the horizontal axis. Once this line is specified, we can calculate the corresponding value of Y for any value of X entered. In more general terms, Y = Xb + e, where these elements represent vectors and/or matrices (of the outcome, data, coefficients and error respectively), is the general linear model to which most of the techniques in psychological research adhere.
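
A minimal sketch of the matrix form Y = Xb + e (not from the slides; the small arrays are made-up illustration data) solves for the coefficient vector by least squares:

```python
import numpy as np

# Made-up illustration data (not from the slides)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.2, 4.1, 5.9, 8.2, 9.9])

# Design matrix X: a column of ones (for the intercept a) plus the predictor
X = np.column_stack([np.ones_like(x), x])

# Solve Y = Xb + e for the coefficient vector b by least squares
b, *_ = np.linalg.lstsq(X, y, rcond=None)
print(b)        # [intercept, slope]

e = y - X @ b   # the error (residual) vector
print(e)
```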

6 The Line of Best Fit Real data do not conform perfectly to a straight line. The best-fit straight line is the one that minimizes the amount of variation of the data points from the line. The common, though by no means the only acceptable, method derives a least-squares regression line, which minimizes the squared deviations from it. The equation for this line can be used to predict or estimate an individual’s score on Y on the basis of his or her score on X.

7 Least Squares Modeling
When the relation between variables is expressed in this manner, we call the relevant equation(s) mathematical models. The intercept and weight values are called the parameters of the model. While typical regression analysis by itself does not determine causal relations, the assumption indicated by such a model is that the variable on the left-hand side of the previous equation is being caused by the variable(s) on the right side. The arrows explicitly go from the predictors to the outcome, not vice versa.* [Path diagram: predictors Variable X, Variable Y and Variable Z, with weights A, B and C, pointing to the criterion.] *If you think more or differently directed arrows are in order, you need to do something other than MR.

8 Parameter Estimation The process of obtaining the correct parameter values (assuming we are working with the right model) is called parameter estimation. Often, theories specify the form of the relationship rather than the specific values of the parameters. The parameters themselves, assuming the basic model is correct, are typically estimated from data; we refer to this estimation process as “calibrating the model.” A method is required for choosing parameter values that will give us the best representation of the data possible. In estimating the parameters of our model, we are trying to find the set of parameters that minimizes the error variance: with least-squares estimation, we want the sum of squared errors to be as small as it possibly can be.
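
To make the idea of calibration concrete, here is a minimal sketch (not from the slides, with made-up data) that searches over candidate intercepts and slopes for the pair with the smallest sum of squared errors, then compares the result to the closed-form least-squares fit:

```python
import numpy as np

# Made-up illustration data (not the deck's example)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

def sum_sq_error(a, b):
    """Sum of squared residuals for the line y-hat = a + b*x."""
    return np.sum((y - (a + b * x)) ** 2)

# Crude "calibration": try many candidate (intercept, slope) pairs, keep the best
candidates = [(a, b) for a in np.linspace(-2, 2, 81) for b in np.linspace(0, 4, 81)]
a_best, b_best = min(candidates, key=lambda p: sum_sq_error(*p))

# Closed-form least-squares answer for comparison
b_ls, a_ls = np.polyfit(x, y, deg=1)

print(a_best, b_best)  # grid-search estimate
print(a_ls, b_ls)      # least-squares estimate (should be very close)
```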

9 Estimates of the constant (a) and coefficient (b) in the simple setting
Estimating the slope (the regression coefficient) requires first estimating the covariance: b = cov(X, Y) / s²_X. Estimating the Y intercept: a = Ȳ − bX̄, where Ȳ and X̄ are the means of the Y and X values respectively, and b is the estimated slope. These calculations ensure that the regression line passes through the point on the scatterplot defined by the two means.
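
A minimal sketch of those two formulas (made-up data, not the deck's example) shows the fitted line passing through the point defined by the two means:

```python
import numpy as np

# Made-up illustration data (not the deck's example)
x = np.array([2.0, 4.0, 6.0, 8.0, 10.0])
y = np.array([1.0, 3.0, 4.0, 7.0, 8.0])

cov_xy = np.cov(x, y, ddof=1)[0, 1]
var_x = np.var(x, ddof=1)

b = cov_xy / var_x           # slope: covariance scaled by the variance of X
a = y.mean() - b * x.mean()  # intercept: forces the line through (X-bar, Y-bar)

# The fitted line passes through the point defined by the two means
assert np.isclose(a + b * x.mean(), y.mean())
print(a, b)
```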

10 In terms of the Pearson r, the slope can also be written as b = r(s_Y / s_X).

11 What can the model explain? Variance Components
Total variability in the dependent variable (observed − mean) comes from two sources. Variability predicted by the model, i.e. the variability in the dependent variable that is due to the independent variable: how far our predicted values are from the mean of Y. Error or residual variability, i.e. variability not explained by the independent variable: the difference between the predicted values and the observed values. In symbols, s²_Y = s²_Ŷ + s²_(Y − Ŷ): total variance = predicted variance + error variance.
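
A minimal sketch of this decomposition (made-up data, not from the slides) computes the sums of squares for each component:

```python
import numpy as np

# Made-up illustration data (not from the slides)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([2.0, 3.0, 5.0, 4.0, 6.0, 7.0])

b, a = np.polyfit(x, y, deg=1)
y_hat = a + b * x

ss_total = np.sum((y - y.mean()) ** 2)      # observed - mean
ss_model = np.sum((y_hat - y.mean()) ** 2)  # predicted - mean
ss_error = np.sum((y - y_hat) ** 2)         # observed - predicted

# Total variability = predicted + error (up to floating-point rounding)
print(ss_total, ss_model + ss_error)
print("R-squared:", ss_model / ss_total)
```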

12 R-squared - the coefficient of determination
We can also show this graphically using a Venn diagram, with r² as the proportion of variability shared by the two variables (X and Y): the larger the area of overlap, the greater the strength of the association between the two variables. The square of the correlation, r², is the fraction of the variation in the values of Y that is explained by the regression of Y on X. R² = variance of the predicted values Ŷ divided by the variance of the observed values Y.

13 Predicted variance and r2
Hence predicted variance = r² × variance of Y. E.g. smoking: if r is 0.5, then r² = .25, meaning that 25% of the variance in age of death can be predicted from the number of cigarettes smoked.

14 The Accuracy of Prediction
How good a fit does our line represent? The error associated with a prediction (of a Y value from a known X value) is a function of the deviations of Y about the predicted point. The standard error of estimate provides an assessment of the accuracy of prediction: it is essentially the standard deviation of the Y scores about the values predicted from X. In terms of R², we can see that the more variance we account for, the smaller our standard error of estimate will be.
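
As a minimal sketch (made-up data; the N − 2 divisor is the usual convention for simple regression), the standard error of estimate can be computed from the residuals, and an equivalent expression in terms of r shows that more explained variance means a smaller standard error:

```python
import numpy as np

# Made-up illustration data (not from the slides)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([2.0, 3.0, 5.0, 4.0, 6.0, 7.0])

b, a = np.polyfit(x, y, deg=1)
residuals = y - (a + b * x)

n = len(y)
see = np.sqrt(np.sum(residuals ** 2) / (n - 2))  # standard error of estimate (df = N - 2)

r = np.corrcoef(x, y)[0, 1]
# Equivalent expression in terms of r: the larger r**2, the smaller the SEE
see_from_r = np.std(y, ddof=1) * np.sqrt((1 - r ** 2) * (n - 1) / (n - 2))

print(see, see_from_r)  # the two should agree
```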

15 Interpreting regression: Summary of the basics
Intercept: the value of Y when X is 0. Often not meaningful, particularly if it is practically impossible to have an X of 0 (e.g. weight). Slope: the amount of change in Y seen with a 1-unit change in X. Standardized regression coefficient: the amount of change in Y, in standard deviation units, seen with a 1-standard-deviation change in X; in simple regression it is equivalent to the r for the two variables. Standard error of estimate: gives a measure of the accuracy of prediction. R²: the proportion of variance explained by the model.

16 The General Linear Model with Categorical Predictors
Extension

17 Extension Regression can actually handle different types of predictors, and in the social sciences we are often interested in differences between groups. For now we will concern ourselves with the case of two independent groups, e.g. gender, Republican vs. Democrat, etc.

18 Dummy coding There are different ways to code categorical data for regression, and in general, to represent a categorical variable you need k-1* coded variables k = number of categories/groups Dummy coding involves using zeros and ones to identify group membership, and since we only have two groups, one group will be zero (the reference group) and the other 1 We will revisit coding with k > 2 after we’ve discussed multiple regression *We will come back to why we use k-1 rather than k later

19 Dummy coding Example The thing to note at this point is that we have a simple bivariate correlation / simple regression setting. The correlation between group and the DV is .76. This is sometimes referred to as the point-biserial correlation (r_pb) because of the categorical variable. However, don’t be fooled: it is calculated exactly the same way as before, i.e. you treat that 0/1 grouping variable like any other in calculating the correlation coefficient.
Group  DV
0      3
0      5
0      7
0      2
1      6
1      7
1      8
1      9
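
A minimal sketch using the rows visible in the table above (the deck's reported figures, such as r = .76 and group means of 4 and 7.4, suggest its full example includes additional cases, so the exact value here will differ) shows that the point-biserial correlation is simply Pearson's r with a 0/1 predictor:

```python
import numpy as np

# The rows visible in the slide's table (illustrating the mechanics only;
# the deck's own example appears to include additional cases)
group = np.array([0, 0, 0, 0, 1, 1, 1, 1])
dv    = np.array([3, 5, 7, 2, 6, 7, 8, 9], dtype=float)

# The point-biserial correlation is just Pearson's r with a 0/1 grouping variable
r_pb = np.corrcoef(group, dv)[0, 1]
print(r_pb)
```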

20 Example Graphical display The R-square is .76² = .577
The regression equation is Ŷ = 4 + 3.4(Group)

21 Example Look closely at the descriptive output compared to the coefficients. What do you see?

22 The constant Note again our regression equation
Recall the definition of the slope and the constant. First the constant: what does “when X = 0” mean in this setting? It means we are in the 0 group. What is that value? Y = 4, which is that group’s mean. The constant here is thus the reference group’s mean.

23 The coefficient Now think about the slope
What does a ‘1 unit change in X’ mean in this setting? It means we go from one group to the other. Based on that, what does the slope represent in this case (i.e. can you derive that coefficient from the descriptive stats in some way)? The coefficient is the difference between the means.

24 The regression line The regression line covers the values represented
i.e. 0 and 1, for the two groups. It passes through each of their means. Using least-squares regression, the regression line always passes through the mean of X and the mean of Y. The constant (if we are using dummy coding) is the mean for the zero (reference) group, and the coefficient is the difference between the means.
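
A minimal sketch (again using only the rows visible in the earlier table, so the particular numbers differ from the deck's) confirms both facts: the fitted intercept equals the reference group's mean and the slope equals the difference between the group means:

```python
import numpy as np

# The rows visible in the earlier table (illustrating the mechanics only)
group = np.array([0, 0, 0, 0, 1, 1, 1, 1])
dv    = np.array([3, 5, 7, 2, 6, 7, 8, 9], dtype=float)

b, a = np.polyfit(group, dv, deg=1)  # slope, intercept

mean_0 = dv[group == 0].mean()
mean_1 = dv[group == 1].mean()

print(a, mean_0)           # the intercept equals the reference group's mean
print(b, mean_1 - mean_0)  # the slope equals the difference between the means
```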

25 More to consider Analysis of variance
Recall that in regression we are trying to account for the variance in the DV. That total variance reflects the sum of the squared deviations of values from the DV mean (sums of squares). That breaks down into the variance we account for (sums of squares predicted, also called model or regression) and that which we do not account for (sums of squares ‘error’: observed − predicted).

26 Variance accounted for
What are our predicted values in this case? We only have two values of X to plug in. We already know what Y is if X is zero, and so we’d predict the group mean of 4 for all zero values. The only other value to plug in is 1 for the rest of the cases; in other words, for those in the 1 group, we’re predicting their respective mean.

27 Variance accounted for
So in order to get our model summary and F-statistic, we need: the total variance; the predicted variance (predicted value minus grand mean of the DV, just as it has always been; note again how our average predicted value is our group average for the DV); and the error variance (essentially each person’s score minus their group mean).

28 Variance accounted for
Predicted SS = 5[(4 − 5.7)² + (7.4 − 5.7)²] = 28.9. Error SS = (3 − 4)² + … + (9 − 7.4)² = 21.2. Total variance to be accounted for = (3 − 5.7)² + (5 − 5.7)² + … + (9 − 5.7)² = 50.1, or just Predicted SS + Error SS. Calculate R² from these values.
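
A minimal sketch of the same decomposition in code (using only the rows visible in the earlier table, so the sums of squares differ from the 28.9 / 21.2 / 50.1 shown above, which appear to be based on the deck's fuller data):

```python
import numpy as np

# The rows visible in the earlier table (illustrating the mechanics only)
group = np.array([0, 0, 0, 0, 1, 1, 1, 1])
dv    = np.array([3, 5, 7, 2, 6, 7, 8, 9], dtype=float)

grand_mean = dv.mean()
# With a dummy-coded predictor, each case's predicted value is its group mean
predicted = np.where(group == 0, dv[group == 0].mean(), dv[group == 1].mean())

ss_predicted = np.sum((predicted - grand_mean) ** 2)
ss_error     = np.sum((dv - predicted) ** 2)
ss_total     = np.sum((dv - grand_mean) ** 2)

print(ss_predicted + ss_error, ss_total)      # the decomposition holds
print("R-squared:", ss_predicted / ss_total)
```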

29 Regression output Here is the summary table from our regression
The mean square is derived by dividing our sums of squares by the degrees of freedom: k − 1 for the regression, N − k for the error, and N − 1 total. The ratio of the mean squares is the F-statistic.
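
A self-contained sketch (same caveat: only the rows visible in the earlier table) computes the mean squares and the F ratio from those sums of squares and degrees of freedom:

```python
import numpy as np

# The rows visible in the earlier table (illustrating the mechanics only)
group = np.array([0, 0, 0, 0, 1, 1, 1, 1])
dv    = np.array([3, 5, 7, 2, 6, 7, 8, 9], dtype=float)

k, n = 2, len(dv)
predicted = np.where(group == 0, dv[group == 0].mean(), dv[group == 1].mean())

ss_regression = np.sum((predicted - dv.mean()) ** 2)
ss_error      = np.sum((dv - predicted) ** 2)

ms_regression = ss_regression / (k - 1)  # regression df = k - 1
ms_error      = ss_error / (n - k)       # error df = N - k

F = ms_regression / ms_error             # the F-statistic is the ratio of mean squares
print(F)
```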

30 ANOVA = Regression Note the title of the summary table
It is an ANOVA summary table because you have in fact just conducted an analysis of variance, specifically for the two-group situation. ANOVA, the statistical procedure as it is so called, is a special case of regression. Below, the first table is the ANOVA output, as opposed to the regression output.

31 Eta-squared = R-squared
Note the ‘partial eta-squared’. Eta-squared has the same interpretation as R-squared and, as one can see, is the R-squared from our regression. SPSS calls it partial because there is often more than one grouping variable and we are interested in unique effects (i.e. partialling out the effects of other variables). However, it is actually eta-squared here, as there is no other variable effect to partial out.

32 The lowly t-test The t-test is a special case of ANOVA
ANOVA can handle more than two groups, while the t-test is just for two. However, F = t² in the two-group setting, and the p-value is exactly the same.
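
A minimal sketch of that equivalence (using the rows visible in the earlier table purely to illustrate the identity) runs the independent-samples t-test and the one-way ANOVA on the same two groups:

```python
import numpy as np
from scipy import stats

# The rows visible in the earlier table (used only to illustrate the identity)
group_0 = np.array([3, 5, 7, 2], dtype=float)
group_1 = np.array([6, 7, 8, 9], dtype=float)

t, p_t = stats.ttest_ind(group_0, group_1)  # pooled-variance t-test (the default)
F, p_F = stats.f_oneway(group_0, group_1)   # one-way ANOVA on the same data

print(t ** 2, F)  # F equals t squared in the two-group case
print(p_t, p_F)   # and the p-values are identical
```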

33 The lowly t-test Compare to regression
The t, standard error, CI and p-value are the same, and again the coefficient is the difference between means

34 The Statistical Language
Statistics is a language used for communicating research ideas and findings. We have various dialects with which to speak it, and of course we pick freely from the words available. Sometimes we prefer to do regression and talk about the amount of variance accounted for; sometimes we prefer to talk about mean differences and how large those are. In both cases we are interested in the effect size. Which tool we use reflects how we want to talk about our results.

35 Parameter Estimation example
Let’s assume that we believe there is a linear relationship between X and Y. Which set of parameter values will bring us closest to representing the data accurately?

36 Estimation example We begin by picking some values, plugging them into the equation, and seeing how well the implied values correspond to the observed values We can quantify what we mean by “how well” by examining the difference between the model-implied Y and the actual Y value This difference between our observed value and the one predicted, , is often called error in prediction, or the residual

37 Estimation example Let’s try a different value of b and see what happens. Now the implied values of Y are getting closer to the actual values of Y, but we’re still off by quite a bit.

38 Estimation example Things are getting better, but they could certainly still improve.

39 Estimation example Ah, much better

40 Estimation example Now that’s very nice
There is a perfect correspondence between the predicted values of Y and the actual values of Y

