Ch15: Multiple Regression
3 Nov 2011, BUSI275, Dr. Sean Ho
HW7 due Tues. Please download: 17-Hawlins.xls

Slide 2: Outline for today
- Multiple regression model
- Running it in Excel
- Interpreting output
- Unique contributions of predictors
- Automated predictor selection
- Moderation (interaction of predictors)
  - How to test for it
- Regression diagnostics: checking assumptions
- Transforming variables

Slide 3: Multiple regression
- 1 outcome (scale), k predictors (scale)
- Linear model: hyperplane
  ŷ = b₀ + b₁x₁ + b₂x₂ + … + bₖxₖ
- Residuals still assumed normal, homoscedastic
- [Figure: regression plane ŷ = b₀ + b₁x₁ + b₂x₂ in (x₁, x₂, y) space, with slopes b₁ and b₂ and residual εᵢ = yᵢ − ŷᵢ]
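The course runs this model in Excel, but the plane-fitting itself is just ordinary least squares. A minimal sketch in Python/NumPy, using made-up noiseless data (not the Hawlins workbook) so the known coefficients are recovered exactly:

```python
import numpy as np

def fit_multiple_regression(X, y):
    """Fit yhat = b0 + b1*x1 + ... + bk*xk by ordinary least squares.

    X: (n, k) array of predictor columns; y: (n,) outcome.
    Returns the coefficient vector [b0, b1, ..., bk].
    """
    n = len(y)
    design = np.column_stack([np.ones(n), X])  # prepend an intercept column
    coefs, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coefs

# Toy data on a known plane y = 2 + 3*x1 - 1*x2 (no noise), so the
# fitted coefficients should recover b0=2, b1=3, b2=-1.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 2))
y = 2 + 3 * X[:, 0] - 1 * X[:, 1]
b = fit_multiple_regression(X, y)
```

With noisy real data the coefficients would only approximate the true slopes; the noiseless setup here just makes the geometry of the fitted plane easy to check.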

Slide 4: Multiple regression in Excel
- Dataset: 17-Hawlins.xls
  - DV (y): Output (units/wk)
  - IV (x₁): Employees
  - IV (x₂): Age of Plant (yrs)
- Pairwise scatters are helpful; note R² for each predictor
- Data → Analysis → Regression
  - Y Range: B1:B160
  - X Range: C1:D160
  - Check “Labels” and “Standardized Residuals”

Slide 5: Interpreting the output
- R Square (R²): fraction of the variance in the DV explained
  - Adjusted R² compensates for adding more IVs
- ANOVA table: F, p, and dfs
  - “Number of employees and plant age significantly predicted output: R² = .72, F(2, 156) = 200.7, p < .001.”
- Coefficient table: for each predictor, slope bᵢ, t-score, and p
  - Both slopes are significantly nonzero
- Standardized residuals are z-scores
  - Use them to look for observations that don't fit the model (e.g., |z| > 3)
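The quantities in Excel's output table come from simple sums of squares. A sketch of how R², adjusted R², and the global F statistic are computed, on made-up noisy data (n = 159 rows, mirroring the example's df):

```python
import numpy as np

def regression_summary(X, y):
    """Return (R^2, adjusted R^2, global F) for an OLS fit of y on X."""
    n, k = X.shape
    design = np.column_stack([np.ones(n), X])
    b, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ b
    ss_res = resid @ resid                    # residual sum of squares
    ss_tot = np.sum((y - y.mean()) ** 2)      # total sum of squares
    r2 = 1 - ss_res / ss_tot
    adj_r2 = 1 - (1 - r2) * (n - 1) / (n - k - 1)
    f = (r2 / k) / ((1 - r2) / (n - k - 1))   # compare to F(k, n-k-1)
    return r2, adj_r2, f

# Made-up data with two predictors and noise
rng = np.random.default_rng(1)
X = rng.normal(size=(159, 2))
y = 1 + 2 * X[:, 0] + 0.5 * X[:, 1] + rng.normal(0, 1, 159)
r2, adj_r2, f = regression_summary(X, y)
```

Note that adjusted R² is always below R² (for k ≥ 1), which is the penalty for adding predictors that the slide mentions.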

Slide 6: Unique contributions
- From the pairwise scatters, Employees alone predicts Output pretty well (R² = 71%); Age alone, not so well (R² = 12%)
- When both are used together, why is R² only 72%?
- Most of the 12% of variability in Output explained by Age is shared variability: Age doesn't tell us much more about Output than we already knew from Employees
- Age's unique contribution is only 1%
  - Compare the regression using all predictors against the regression using all predictors except Age
- [Figure: Venn diagram of shared variance among Output, Employees, and Age]
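The "compare with and without Age" idea is just an R² difference between a full and a reduced model. A sketch with hypothetical stand-ins for the workbook's variables, constructed so that age is mostly shared variance with employees and truly adds nothing:

```python
import numpy as np

def r_squared(X, y):
    """R^2 of an OLS fit of y on the columns of X (plus an intercept)."""
    design = np.column_stack([np.ones(len(y)), X])
    b, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ b
    return 1 - resid @ resid / np.sum((y - y.mean()) ** 2)

# Made-up variables: output depends only on employees, and age is
# strongly correlated with employees, so age's unique contribution
# (the R^2 drop when age is removed) should be tiny.
rng = np.random.default_rng(2)
employees = rng.normal(50, 10, 200)
age = 0.5 * employees + rng.normal(0, 2, 200)
output = 4 * employees + rng.normal(0, 5, 200)

r2_full = r_squared(np.column_stack([employees, age]), output)
r2_reduced = r_squared(employees[:, None], output)
unique_age = r2_full - r2_reduced   # age's unique contribution
```

The full model's R² can never be lower than the reduced model's; the question is whether the gain is large enough to matter.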

Slide 7: Drawing conclusions
- Employees and Age together do significantly predict Output (global F test), and
- Each predictor does contribute significantly (t-tests on slopes), but
- The unique contribution of Age is very small, so
- Most of the predictive power is in the number of employees.
- In a formal write-up, you usually want to include details such as R², F, dfs, and p, for readers who understand the statistics.

Slide 8: Automated predictor selection
- “Best subsets” regression tries several runs with different combinations of predictors, looking for the set that predicts best while using the fewest predictors
  - Parsimony: a simpler model is easier to understand
- “Stepwise” regression adds/removes one predictor at a time toward the same goal
  - Backward: eliminate the least significant IV
  - Forward: add the next most significant IV
- Available only in the PHStat add-on, or in SPSS, Stata, R, etc.
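Excel itself has no stepwise tool, but the forward-selection idea is easy to sketch. This toy version greedily adds whichever predictor raises R² the most; real stepwise procedures (PHStat, SPSS, etc.) use significance thresholds such as F-to-enter rather than raw R², so treat this only as an illustration of the loop:

```python
import numpy as np

def forward_selection(X, y, n_keep):
    """Toy forward stepwise selection: greedily add the predictor whose
    inclusion increases R^2 the most, until n_keep are chosen."""
    n, k = X.shape

    def r2(cols):
        design = np.column_stack([np.ones(n)] + [X[:, c] for c in cols])
        b, *_ = np.linalg.lstsq(design, y, rcond=None)
        resid = y - design @ b
        return 1 - resid @ resid / np.sum((y - y.mean()) ** 2)

    chosen, remaining = [], list(range(k))
    while len(chosen) < n_keep and remaining:
        best = max(remaining, key=lambda c: r2(chosen + [c]))
        chosen.append(best)
        remaining.remove(best)
    return chosen

# Four candidate predictors (made-up data), but only column 2 drives y,
# so forward selection should pick it first.
rng = np.random.default_rng(3)
X = rng.normal(size=(100, 4))
y = 5 * X[:, 2] + rng.normal(0, 0.5, 100)
picked = forward_selection(X, y, 1)
```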

Slide 9: Moderation
- Moderator: a predictor that affects the strength of another predictor's influence on the outcome
  - It interacts with the other predictor
- E.g., natural disasters may affect your supply, but having multiple suppliers buffers the effect
- [Figure: interaction plot of Supply vs. Disasters, with separate lines for few vs. many suppliers; path diagram showing # of Suppliers moderating the Disasters → Supply link]

Slide 10: Testing for moderation
- How do we know if predictors are interacting?
- Add an interaction term to the regression:
  ŷ = b₀ + b₁x₁ + b₂x₂ + b₁₂x₁x₂
- In Excel: centre both IVs (subtract their means), then make a third column with their product
  - Include it in the regression as if it were an IV
- Check the t-test to see if the slope (b₁₂) of the interaction term is significantly nonzero
  - If so, check R² both with and without the interaction term to see the size of its effect
- There are also 3-way (x₁x₂x₃) and higher interactions!
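The centre-then-multiply recipe above can be sketched directly. Here the data are made up with a known interaction slope of −1.5 (loosely labelled after the disasters/suppliers example), so the fitted b₁₂ should land near that value:

```python
import numpy as np

rng = np.random.default_rng(4)
n = 300
x1 = rng.normal(size=n)   # e.g. disasters (made-up values)
x2 = rng.normal(size=n)   # e.g. number of suppliers (made-up values)
# True model includes an interaction term with slope -1.5
y = 1 + 2 * x1 + 0.5 * x2 - 1.5 * x1 * x2 + rng.normal(0, 0.3, n)

# Centre both IVs, then build the product column, as the slide describes
x1c = x1 - x1.mean()
x2c = x2 - x2.mean()
design = np.column_stack([np.ones(n), x1c, x2c, x1c * x2c])
b, *_ = np.linalg.lstsq(design, y, rcond=None)
b12 = b[3]   # slope of the interaction term
```

Centring changes the main-effect slopes (they become effects at the mean of the other IV) but leaves the interaction slope itself unchanged, which is why b₁₂ still estimates −1.5 here.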

Slide 11: Diagnostics: check assumptions
- Normality of residuals: check a histogram of the standardized residuals
- Homoscedasticity: residual plot (residuals vs. predicted values)
  - Look for any odd or “fan-shaped” patterns
- Linearity: curves on the residual plot
  - Try adding x₁² or x₂², etc. to the model, and/or apply transforms to variables
- Independence of residuals (time series are usually bad for this)
- Collinearity of IVs: check the correlations between IVs
- Outliers / influential points: see the residual plot
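The outlier check from slide 5 (flag standardized residuals with |z| > 3) can be sketched numerically. This uses made-up data with one deliberately planted gross outlier; note this is the simple z-score standardization, not the leverage-adjusted studentized residuals that some packages report:

```python
import numpy as np

rng = np.random.default_rng(5)
n = 100
x = rng.normal(size=n)
y = 2 + 3 * x + rng.normal(0, 1, n)
y[0] += 10   # plant one gross outlier at observation 0

design = np.column_stack([np.ones(n), x])
b, *_ = np.linalg.lstsq(design, y, rcond=None)
resid = y - design @ b
z = (resid - resid.mean()) / resid.std()   # simple standardized residuals
flagged = np.flatnonzero(np.abs(z) > 3)    # observations with |z| > 3
```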

Slide 12: Homoscedasticity & linearity
- [Figure slide: example residual plots]

Slide 13: Transforms
- Some variables (either IVs or DV) may be so heavily skewed that they break assumptions (showing up esp. as heteroscedasticity and nonlinearity)
- You can try applying a transform to make them roughly more symmetric or normal
  - Strict normality is not required
  - E.g., log(income) is usually more normal than income itself
- The family of power transforms includes √x, x², 1/x, x^(−5.2), etc., as well as log(x)
  - May need to shift (x + c) or reflect (c − x) first
- The Box-Cox procedure “automatically” selects a power transform for your variable
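The log(income) example can be demonstrated with made-up right-skewed data (a lognormal draw, for which the log transform is exactly the right power transform):

```python
import numpy as np

rng = np.random.default_rng(6)
# A heavily right-skewed, made-up 'income' variable
income = rng.lognormal(mean=10, sigma=1, size=5000)

def skewness(v):
    """Simple sample skewness: the mean cubed z-score."""
    z = (v - v.mean()) / v.std()
    return np.mean(z ** 3)

# The log transform pulls in the long right tail,
# leaving an approximately symmetric (here, normal) variable
log_income = np.log(income)
```

Before-and-after skewness makes the effect concrete: the raw variable's skewness is large and positive, while log_income's is near zero.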

Slide 14: TODO
- HW7 (ch10, 14): due Tue 8 Nov
- Projects: acquire data if you haven't already
  - If waiting for the REB: try making up toy data so you can get started on analysis
- Background research for likely predictors of your outcome variable
- Read ahead on your chosen method of analysis (regression, time-series, logistic, etc.)