
Multiple Regression

Multiple regression  Previously discussed the one predictor scenario  Multiple regression is the case of having two or more independent variables predicting some outcome variable  Basic idea is the same as simple regression, however more will need to be considered in its interpretation

The best fitting plane
- Previously we attempted to find the best-fitting line to our 2-d scatterplot of values
- With the addition of another predictor, our cloud of values becomes 3-d
- Now we are looking for what amounts to the best-fitting plane
  - With 3 or more predictors we get into hyperspace and are dealing with a regression surface
- Regression equation: \hat{Y} = b_0 + b_1 X_1 + b_2 X_2 + \dots + b_k X_k

Linear combination
- The notion of a linear combination is important for understanding MR and multivariate techniques in general
- What MR analysis does is create a linear combination (weighted sum) of the predictors
- The weights help us assess the nature of the predictor-DV relationships while taking the other variables in the model into account
- We then look to see how well the linear combination matches up with the DV
- One way to think about it: we extract the relevant information from the predictors to help us understand the DV
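As a small illustration of the idea (a sketch only; it assumes a fitted lm object named fit, such as the salary model fit later), the model's predicted values are exactly this weighted sum:

  b    <- coef(fit)                      # intercept and regression weights
  X    <- model.matrix(fit)              # design matrix: a column of 1s plus the predictors
  yhat <- as.vector(X %*% b)             # the linear combination (weighted sum)
  all.equal(yhat, unname(fitted(fit)))   # TRUE: identical to the model's fitted values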

MR Example (path diagram): Cons of Condom Use (X1), Self-Efficacy of Condom Use (X2), Stage of Condom Use (X3), and Pros of Condom Use (X4) are each weighted (β1-β4) and summed into a new linear combination X', which is related to Psychosexual Functioning.

Considerations in multiple regression
- Assumptions
- Overall fit
- Parameter estimates and variable importance
- Variable entry
- IV relationships
- Prediction

Assumptions: Normality
- The assumptions for simple regression continue to hold: normality, homoscedasticity, independence
- Multivariate normality can be at least partially checked through examination of the individual variables for normality, linearity, and heteroscedasticity
- Tests for multivariate normality seem to be easily obtained in every package except SPSS
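A minimal sketch of how such checks might look in R, assuming a fitted lm object named fit; the MVN call is commented out because it requires an extra package and its argument names here are an assumption:

  plot(fit, which = 1)           # residuals vs. fitted values: look for nonlinearity and heteroscedasticity
  plot(fit, which = 2)           # normal Q-Q plot of the residuals
  shapiro.test(residuals(fit))   # univariate normality test of the residuals
  # library(MVN); mvn(Dataset[, c("x1", "x2", "x3")], mvnTest = "mardia")   # formal multivariate test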

Assumptions: Model misspecification
- In addition, we must worry about model misspecification: omitting relevant variables, including irrelevant ones, specifying incorrect paths
- There is not much one can do about an omitted relevant variable, but the omission can produce biased and less valid results
- However, we also can't just throw in every variable we can think of
  - Overfitting
  - Violation of Ockham's razor
- Including irrelevant variables contributes to the standard error of estimate (and thus the SEs for our coefficients), which will affect the statistical tests on the individual variables

Example data
- Current salary predicted by educational level, time since hire, and previous experience (N = 474)
- As with any analysis, initial data analysis should be extensive prior to examination of the inferential results

Initial examination of data
- We can use the descriptives to get a general feel for what's going on with the variables in question
- Here we can also see that months since hire and previous experience are not too well correlated with our dependent variable of current salary. Ack!
- We'd also want to look at the scatterplots to further aid our assessment of the predictor-DV relationships
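In R, such an initial examination might look like the following sketch; the data frame name employee and the variable names salary, educ, months, and prevexp are assumptions standing in for the actual file's names:

  vars <- employee[, c("salary", "educ", "months", "prevexp")]
  summary(vars)           # means, quartiles, and ranges for each variable
  round(cor(vars), 2)     # correlations of the predictors with salary and with each other
  pairs(vars)             # scatterplot matrix for eyeballing linearity and outliers
  fit <- lm(salary ~ educ + months + prevexp, data = employee)   # the model examined in the slides that follow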

Starting point: Statistical significance of the model
- The ANOVA summary table tells us whether our model is statistically significant
  - Is R² different from zero?
  - Is the equation a better predictor than the mean?
- As with simple regression, the analysis involves the ratio of variance predicted to residual variance
- As we can see, the F test reflects the relationship of the predictors to the DV (R²), the number of predictors in the model, and the sample size
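To make that explicit, the overall F test with k predictors and N cases can be written as:

  F = \frac{R^2 / k}{(1 - R^2) / (N - k - 1)}, \qquad df = (k,\ N - k - 1)

So, holding R² constant, F grows with sample size and shrinks as predictors are added.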

Multiple correlation coefficient
- The multiple correlation coefficient R is the correlation between the DV and the linear combination of predictors that minimizes the sum of the squared residuals
- More simply, it is the correlation between the observed values and the values that would be predicted by our model
- Its squared value (R²) is the amount of variance in the dependent variable accounted for by the independent variables
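In symbols:

  R = r_{Y\hat{Y}}, \qquad R^2 = \frac{SS_{\text{regression}}}{SS_{\text{total}}} = 1 - \frac{SS_{\text{residual}}}{SS_{\text{total}}}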

R²
- Here it appears we have an OK model for predicting current salary

Variable importance: Statistical significance
- After noting that our model is viable, we can begin interpreting the predictors' relative contributions
- To begin with, we can examine the output to determine which variables contribute to the model in a statistically significant way
- The standard error of a coefficient is a measure of the variability that would be found among the different slopes estimated from other samples drawn from the same population

Variable importance: Statistical significance
- We can see from the output that only previous experience and education level are statistically significant predictors

Variable importance: Weights
- Statistical significance, as usual, is only a starting point for our assessment of the results
- What we'd really want is a measure of the unique contribution of an IV to the model
- Unfortunately the raw regression coefficient, though useful in understanding that particular variable's relationship to the DV, is not useful for comparing IVs that are on different scales

Variable importance: Standardized coefficients
- Standardized regression coefficients get around that problem
- Now we can see how much the DV changes, in standard deviation units, with a one standard deviation change in an IV (all others held constant)
- Here we can see that education level seems to have much more influence on the DV
  - Another 3 years of education corresponds to a >$11,000 bump in salary
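The standardized coefficient is just the raw coefficient rescaled by the standard deviations of the predictor and the DV; equivalently, it is what you would get by z-scoring all the variables before fitting:

  \beta^{*}_{j} = b_j \, \frac{s_{X_j}}{s_Y}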

Variable importance
- However, we still have other output to help us understand variable contribution
- The partial correlation is the contribution of an IV after the contributions of the other IVs have been taken out of both the IV and the DV
- The semi-partial correlation is the unique contribution of an IV after the contributions of the other IVs have been taken out of the predictor in question only

Variable importance: Partial correlation
- In the Venn diagram, A+B+C+D represents all the variability in the DV to be explained, and A+B+C = R²
- The squared partial correlation is the amount a variable explains relative to the amount of DV variance left to explain after the contributions of the other IVs have been removed from both the predictor and the criterion
- For IV1 it is A/(A+D); for IV2 it would be B/(B+D)

Variable importance: Semipartial correlation
- The semipartial correlation (squared) is perhaps the more useful measure of contribution
- It refers to the unique contribution of the IV to the model, i.e. the relationship between the DV and the IV after the contributions of the other IVs have been removed from the predictor
- For IV1 it is A/(A+B+C+D); for IV2 it is B/(A+B+C+D)
- Interpretation (of the squared value):
  - Out of all the variance to be accounted for, how much does this variable explain that no other IV does? Or:
  - How much would R² drop if the variable were removed?
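Both squared values can also be expressed in terms of the model R² with and without the predictor of interest (predictor j), which is often how they are computed in practice:

  sr_j^2 = R^2_{\text{full}} - R^2_{\text{without } j}, \qquad pr_j^2 = \frac{R^2_{\text{full}} - R^2_{\text{without } j}}{1 - R^2_{\text{without } j}}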

Variable importance
- Note that exactly how the partial and semi-partial correlations are figured will depend on the type of multiple regression employed
- The previous examples concerned a standard multiple regression situation
- For sequential (i.e. hierarchical) regression, with IV1 entered first, the partial correlations would be:
  - IV1 = (A+C)/(A+C+D)
  - IV2 = B/(B+D)

Variable importance
- For the semi-partial correlation:
  - IV1 = (A+C)/(A+B+C+D)
  - IV2 is the same as before
- The result for the variable added second is the same as it would be in standard MR
- Thus, if the goal is simply to see the unique contribution of a single variable after all others have been controlled for, there is no real reason to perform a sequential rather than a standard MR
- In general terms, it is the unique contribution of the variable at the point it enters the equation (sequential or stepwise)

Variable importance: Example data
- The semipartial correlation is labeled as the 'part' correlation in SPSS
- Here we can see that education level is really doing all the work in this model
  - Obviously from some alternate universe

Another example
- Mental health symptoms predicted by number of doctor visits, physical health symptoms, and number of stressful life events

- Here we see that physical health symptoms and stressful life events both significantly contribute to the model
- Physical health symptoms are more 'important'

Variable importance: Comparison
- Comparison of standardized coefficients, partial, and semi-partial correlation coefficients
- All of them are 'partial' correlations

Another approach to variable importance
- The methods just described give us a glimpse of variable importance, but interestingly none of them provides a unique-contribution statistic that is a true decomposition of R-squared, i.e. measures of importance that sum to the overall R²
- One measure that does (illustrated on the next slides) provides the average R² increase a variable contributes, depending on the order in which it enters the model
  - 3-predictor example: A B C; B A C; C A B; etc.
- One way to think about it, using what you've just learned, is as the squared semi-partial correlation for a variable when it enters first, second, third, etc.
- Note that the average is taken over all possible permutations
  - E.g. the R² contribution for B entering first includes the orderings B A C and B C A, both of which are of course the same value
- The following example comes from the survey data

As Predictor 1: R² = .629 (note there are 2 orderings in which war enters first, both giving .629)
As Predictor 2: R² change = .639 and .087
As Predictor 3: R² change = .098 (there are 2 orderings in which war enters third, both giving .098)

Interpretation
- The average of these is the average contribution to R² for a particular variable over all possible orderings
  - In this case, for war it is ~.36, i.e. on average it increases R² by 36% of variance accounted for
- Furthermore, if we add up the average R² contributions for all three predictors, the sum is the R² for the model

R program example

  library(relaimpo)
  RegModel.1 <- lm(SOCIAL ~ BUSH + MTHABLTY + WAR, data = Dataset)
  calc.relimp(RegModel.1, type = c("lmg", "last", "first", "betasq", "pratt"))

Output:
- lmg is what we were just talking about; LMG stands for Lindemann, Merenda and Gold, the authors who introduced it
- last is simply the squared semi-partial correlation
- first is just the square of the simple bivariate correlation between the predictor and the DV
- betasq is the square of the beta coefficient with 'all in'
- pratt is the product of the standardized coefficient and the simple bivariate correlation; it too will add up to the model R², but is not recommended, one reason being that it can actually be negative

(The output table lists lmg, last, first, betasq, and pratt for BUSH, MATH, and WAR; the values are not reproduced here.)

*Note: the relaimpo package is equipped to provide bootstrapped estimates
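Regarding that note, a sketch of the bootstrapping (boot.relimp and booteval.relimp are, as far as I recall, the relaimpo functions for this; the number of resamples is arbitrary):

  boot.result <- boot.relimp(RegModel.1, b = 1000, type = "lmg")   # bootstrap the lmg estimates
  booteval.relimp(boot.result)                                     # print estimates with bootstrap confidence intervals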

Different methods
- Note that one's assessment of relative importance may depend on the method
- Much of the time the methods will largely agree, but they may not, so use multiple estimates to help you decide
- One might typically go with the LMG, as it is both intuitive and a decomposition of R²

Relative importance summary
- There are multiple ways to estimate a variable's contribution to the model, and some may be better than others
- A general approach:
  - Check the simple bivariate relationships; if you don't see worthwhile correlations with the DV there, you shouldn't expect much from the model results
  - Check for outliers, and compare with robust measures as well
    - You may detect that some variables are so highly correlated that one is redundant
  - Statistical significance is not a useful means of assessing relative importance, nor is the raw coefficient
  - Standardized coefficients and partial correlations are a first step
    - Compare the standardized coefficients to the simple correlations as a check on possible suppression
  - Of the typical output, the semi-partial correlation is probably the more intuitive assessment
  - The LMG is also intuitive, and is a natural decomposition of R², unlike the others

Relative importance summary
- One thing to keep in mind: determining variable importance, while possible for a single sample, should not be overgeneralized
- Variable orderings will likely change upon repeated sampling
  - E.g. while one might think that war and bush are better predictors than math (it certainly makes theoretical sense), saying that either one is better than the other would be quite a stretch with just one sample
- What you see in your sample is specific to it, and it would be wise not to make any bold claims without validation

Regression diagnostics
- Of course, all of the previous information would be relatively useless if we are not meeting our assumptions and/or have overly influential data points
  - In fact, you really shouldn't be looking at the results until you have tested the assumptions and looked for outliers, even though this requires running the analysis to begin with
- Various tools are available for the detection of outliers
- Classical methods
  - Standardized residuals (ZRESID)
  - Studentized residuals (SRESID)
  - Studentized deleted residuals (SDRESID)
- Ways to think about outliers
  - Leverage
  - Discrepancy
  - Influence
- Thinking 'robustly'

Regression diagnostics
- Standardized residuals (ZRESID)
  - Standardized errors in prediction: mean 0, SD = standard error of estimate
  - To standardize, divide each residual by the s.e.e.
  - At best an initial indicator (e.g. the ±2 rule of thumb); but because the case itself influences the fit that produces its residual, it is of little use on its own
- Studentized residuals (SRESID)
  - Same idea, but the studentized residual recognizes that the error associated with predicting values far from the mean of X is larger than the error associated with predicting values closer to the mean of X
  - The standard error is multiplied by a value that allows the result to take this into account
- Studentized deleted residuals (SDRESID)
  - Studentized residuals in which the standard error is calculated with the case in question removed from the others

Regression diagnostics
- Mahalanobis distance
  - The distance of a case from the centroid of the remaining points (the point where the means meet in n-dimensional space)
- Cook's distance
  - Identifies an influential data point, whether in terms of the predictors or the DV
  - A measure of how much the residuals of all cases would change if a particular case were excluded from the calculation of the regression coefficients; with larger (relative) values, excluding the case would change the coefficients substantially
- DfBeta
  - The change in a regression coefficient that results from the exclusion of a particular case
  - Note that you get DfBetas for each coefficient associated with the predictors
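All of these are available from base R for a fitted lm object (a sketch assuming a model named fit; the SPSS labels in the comments are my mapping based on the definitions above):

  resid(fit) / summary(fit)$sigma   # standardized residuals (ZRESID): residual divided by the s.e.e.
  rstandard(fit)                    # internally studentized residuals (SRESID)
  rstudent(fit)                     # studentized deleted residuals (SDRESID)
  hatvalues(fit)                    # leverage (hat) values
  cooks.distance(fit)               # Cook's distance for each case
  dfbetas(fit)                      # standardized change in each coefficient when a case is dropped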

Regression diagnostics
- Leverage assesses outliers among the IVs
  - Mahalanobis distance: a relatively high value suggests an outlier on one or more predictors
- Discrepancy
  - Measures the extent to which a case is in line with the others
- Influence
  - A product of leverage and discrepancy: how much would the coefficients change if the case were deleted?
  - Cook's distance, dfBetas

Outliers: Influence plots
- With a couple of measures of 'outlierness' we can construct a scatterplot to note especially problematic cases
  - After fitting a regression model in R Commander (i.e. running the analysis), this graph is available via point and click
- Here we have what is effectively a 3-d plot, with two outlier measures on the x and y axes (studentized residuals and 'hat' values, a measure of leverage) and a third expressed as the size of the circle (Cook's distance)
- For this example, case 35 appears to be a problem
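The same plot can be produced from the console; I believe the version R Commander displays comes from the car package's influencePlot(), so a sketch like this should reproduce it (assuming car is installed and the model is named fit):

  library(car)
  influencePlot(fit)   # studentized residuals vs. hat values, circle area proportional to Cook's distance

The function also returns the cases it flags as most noteworthy, which is handy for identifying something like case 35 programmatically.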

Outliers
- It should be made clear to interested readers whatever has been done to deal with outliers
- Applications such as S-PLUS, R, and even SAS and Stata (pretty much all but SPSS) provide methods of robust regression analysis, and these would be preferred

Summary: Outliers
- No matter the analysis, some cases will be the 'most extreme'; however, none may really qualify as being overly influential
- Whatever you do, always run some diagnostic analysis and do not ignore influential cases
- It should be made clear to interested readers whatever has been done to deal with outliers
- As noted before, the best approach to dealing with outliers when they do occur is to run a robust regression with capable software
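As one concrete possibility (a sketch, not a prescription), R's MASS package provides M-estimation robust regression via rlm(), which downweights influential cases rather than discarding them; the variable names below reuse the hypothetical names from the earlier salary sketch:

  library(MASS)
  robust.fit <- rlm(salary ~ educ + months + prevexp, data = employee)   # Huber M-estimation by default
  summary(robust.fit)    # compare these coefficients to the ordinary least squares results
  robust.fit$w           # final case weights: values well below 1 flag downweighted (influential) cases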