Chapter 15 – Multiple Linear Regression


1 Chapter 15 – Multiple Linear Regression
Introduction to Business Statistics, 6e, Kvanli, Pavur, Keeling. Slides prepared by Jeff Heyl, Lincoln University. ©2003 South-Western/Thomson Learning™

2 Multiple Regression Model
Y = β₀ + β₁X₁ + β₂X₂ + ⋯ + βₖXₖ + e
Deterministic component: β₀ + β₁X₁ + β₂X₂ + ⋯ + βₖXₖ
The least squares estimate minimizes SSE = ∑(Y − Ŷ)²

3 Multiple Regression Model
Figure 15.1: the regression plane Y = β₀ + β₁X₁ + β₂X₂, with positive and negative errors e shown as vertical distances from the plane.

4 Multiple Regression Model
Figure 15.2

5 Assumptions of the Multiple Regression Model
The errors follow a normal distribution, centered at zero, with common variance.
The errors are independent.

6 Errors in Multiple Linear Regression
Figure 15.3: errors e for two observations (X₁ = 30, X₂ = 8 and X₁ = 50, X₂ = 2), measured vertically from the plane Y = β₀ + β₁X₁ + β₂X₂.

7 Multiple Regression Model
An estimate of σ²ₑ:
s² = SSE / [n − (k + 1)] = SSE / (n − k − 1)

8 Hypothesis Test for the Significance of the Model
H₀: β₁ = β₂ = ⋯ = βₖ = 0
Hₐ: at least one of the β's ≠ 0
F = MSR / MSE; reject H₀ if F > F_α,k,n−k−1

9 ANOVA Table Figure 15.4

10 Associated F Curve
Figure 15.4: the F curve with v₁ and v₂ degrees of freedom; reject H₀ when F falls beyond F_α,v₁,v₂, where the right-tail area equals α.

11 Test for H₀: βᵢ = 0
H₀: β₁ = 0 (X₁ does not contribute)   Hₐ: β₁ ≠ 0 (X₁ does contribute)
H₀: β₂ = 0 (X₂ does not contribute)   Hₐ: β₂ ≠ 0 (X₂ does contribute)
H₀: β₃ = 0 (X₃ does not contribute)   Hₐ: β₃ ≠ 0 (X₃ does contribute)
t = bᵢ / s_bᵢ; reject H₀ if |t| > t_α/2,n−k−1
(1 − α)·100% confidence interval: bᵢ − t_α/2,n−k−1 · s_bᵢ to bᵢ + t_α/2,n−k−1 · s_bᵢ

12 BB Investments Example
Figure 15.6

13 Coefficient of Determination
SST = total sum of squares = SS_Y = ∑(Y − Ȳ)² = ∑Y² − (∑Y)²/n
R² = 1 − SSE/SST
F = (R²/k) / [(1 − R²)/(n − k − 1)]
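The same quantities in code form, continuing the hypothetical data (a sketch, not the textbook's example):

```python
import numpy as np

X = np.array([[30.0, 8.0], [50.0, 2.0], [42.0, 5.0],
              [35.0, 7.0], [48.0, 3.0], [39.0, 6.0]])
y = np.array([220.0, 280.0, 250.0, 232.0, 271.0, 244.0])
n, k = X.shape

X1 = np.column_stack([np.ones(n), X])
b, *_ = np.linalg.lstsq(X1, y, rcond=None)

sse = np.sum((y - X1 @ b) ** 2)
sst = np.sum(y ** 2) - y.sum() ** 2 / n   # same as sum((Y - Ybar)^2)
r2 = 1 - sse / sst
F = (r2 / k) / ((1 - r2) / (n - k - 1))
print(f"R^2 = {r2:.3f}, F = {F:.2f}")
```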

14 Partial F Test
R²_c = the value of R² for the complete model
R²_r = the value of R² for the reduced model
Test statistic: F = [(R²_c − R²_r)/v₁] / [(1 − R²_c)/v₂], where v₁ is the number of predictors dropped from the complete model and v₂ = n − k − 1 for the complete model.
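A sketch of the partial F test, comparing a complete two-predictor model against a reduced model with X₁ only (hypothetical data; SciPy for the critical value):

```python
import numpy as np
from scipy import stats

def r_squared(X, y):
    """R^2 from regressing y on the columns of X (intercept added)."""
    X1 = np.column_stack([np.ones(len(y)), X])
    b, *_ = np.linalg.lstsq(X1, y, rcond=None)
    sse = np.sum((y - X1 @ b) ** 2)
    return 1 - sse / np.sum((y - y.mean()) ** 2)

X = np.array([[30.0, 8.0], [50.0, 2.0], [42.0, 5.0],
              [35.0, 7.0], [48.0, 3.0], [39.0, 6.0]])
y = np.array([220.0, 280.0, 250.0, 232.0, 271.0, 244.0])
n = len(y)

r2_c = r_squared(X, y)          # complete model: X1 and X2
r2_r = r_squared(X[:, :1], y)   # reduced model: X1 only
v1 = 1                          # predictors dropped
v2 = n - 2 - 1                  # n - k - 1 for the complete model

F = ((r2_c - r2_r) / v1) / ((1 - r2_c) / v2)
print(F, stats.f.ppf(0.95, v1, v2))
```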

15 Real Estate Example Figure 15.7

16 Real Estate Example Figure 15.8

17 Motormax Example Figure 15.9

18 Quadratic Curves
Figure 15.10: two quadratic curves, A and B, relating Y to X.

19 Motormax Example Figure 15.11

20 Error From Extrapolation
Figure 15.12: predicted versus actual relationship; outside the sampled range of X, the fitted line can diverge sharply from the actual curve.

21 Multicollinearity
Occurs when independent variables are highly correlated with each other. Often detectable through pairwise correlations readily available in statistical packages. The variance inflation factor can also be used:
VIF_j = 1 / (1 − R²_j)
where R²_j is the R² from regressing X_j on the remaining predictor variables. Conclude that severe multicollinearity exists when the maximum VIF_j > 10.
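A sketch of computing VIFs directly from the definition, on made-up data with two nearly collinear predictors:

```python
import numpy as np

def r_squared(X, y):
    """R^2 from regressing y on the columns of X (intercept added)."""
    X1 = np.column_stack([np.ones(len(y)), X])
    b, *_ = np.linalg.lstsq(X1, y, rcond=None)
    sse = np.sum((y - X1 @ b) ** 2)
    return 1 - sse / np.sum((y - y.mean()) ** 2)

# Hypothetical predictors; x2 is nearly a copy of x1
rng = np.random.default_rng(0)
x1 = rng.normal(size=40)
x2 = 0.95 * x1 + 0.1 * rng.normal(size=40)
x3 = rng.normal(size=40)
X = np.column_stack([x1, x2, x3])

for j in range(X.shape[1]):
    others = np.delete(X, j, axis=1)        # regress X_j on the rest
    vif = 1.0 / (1.0 - r_squared(others, X[:, j]))
    print(f"VIF_{j+1} = {vif:.1f}" + ("  <- severe" if vif > 10 else ""))
```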

22 Multicollinearity Example
Figure 15.13

23 Multicollinearity Example
Figure 15.14

24 Multicollinearity Example
Figure 15.15

25 Multicollinearity Example
Figure 15.16

26 Multicollinearity
The stepwise selection process can help eliminate correlated predictor variables. Other, more advanced procedures, such as ridge regression, can also be applied. Care should be taken during the model-selection phase, as multicollinearity can be difficult to detect and eliminate.

27 Dummy Variables
Dummy, or indicator, variables allow qualitative variables to be included in the model. For example:
X₁ = 1 if female, 0 if male
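A sketch of the coding step on made-up salary data; the variable names are illustrative only:

```python
import numpy as np

sex = np.array(["F", "M", "F", "M", "F", "M"])
experience = np.array([3.0, 4.0, 6.0, 2.0, 8.0, 5.0])
salary = np.array([48.0, 52.0, 57.0, 45.0, 63.0, 55.0])

x1 = (sex == "F").astype(float)   # X1 = 1 if female, 0 if male
X1 = np.column_stack([np.ones(len(salary)), x1, experience])
b, *_ = np.linalg.lstsq(X1, salary, rcond=None)
print("intercept, female effect, experience slope:", np.round(b, 2))
```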

28 Dummy Variable Example
Figure 15.17

29 Stepwise Procedures
These procedures choose or eliminate variables one at a time, in an effort to avoid including variables that either have no predictive ability or are highly correlated with other predictor variables. A minimal code sketch of forward selection follows the list.
Forward regression: add one variable at a time until the next variable's contribution is insignificant.
Backward regression: remove one variable at a time, starting with the "worst," until R² drops significantly.
Stepwise regression: forward regression with the ability to remove variables that become insignificant.
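A minimal forward-selection sketch using a partial F test at an assumed 0.05 entry level on simulated data; this illustrates the idea, not the exact procedure a statistical package implements:

```python
import numpy as np
from scipy import stats

def sse(cols, X, y):
    """SSE from regressing y on the listed predictor columns."""
    Xs = X[:, cols] if cols else np.empty((len(y), 0))
    X1 = np.column_stack([np.ones(len(y)), Xs])
    b, *_ = np.linalg.lstsq(X1, y, rcond=None)
    return np.sum((y - X1 @ b) ** 2)

rng = np.random.default_rng(0)
n, p = 40, 4
X = rng.normal(size=(n, p))
y = 2.0 + 3.0 * X[:, 0] - 1.5 * X[:, 2] + rng.normal(size=n)

selected, remaining = [], list(range(p))
while remaining:
    # Candidate whose entry reduces SSE the most
    best = min(remaining, key=lambda j: sse(selected + [j], X, y))
    sse_r, sse_c = sse(selected, X, y), sse(selected + [best], X, y)
    v2 = n - (len(selected) + 1) - 1        # error df of the larger model
    F = (sse_r - sse_c) / (sse_c / v2)      # partial F for one added term
    if F <= stats.f.ppf(0.95, 1, v2):
        break                               # contribution is insignificant
    selected.append(best)
    remaining.remove(best)

print("selected predictors:", selected)     # expect columns 0 and 2
```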

30 Stepwise Regression
Include X3
Include X6
Include X2
Include X5
Remove X2 (when X5 was inserted into the model, X2 became unnecessary)
Include X7
Remove X7 (it is insignificant)
Stop. The final model includes X3, X5, and X6.
Figure 15.18

31 Checking Model Assumptions
Checking Assumption 1 (normal distribution): construct a histogram of the residuals.
Checking Assumption 2 (constant variance): plot the residuals versus the predicted Ŷ values.
Checking Assumption 3 (independent errors): use the Durbin-Watson statistic.
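A sketch of computing the pieces for these checks on simulated data; the Durbin-Watson statistic is computed from its definition (values near 2 suggest independent errors):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 50
X = rng.normal(size=(n, 2))
y = 1.0 + 2.0 * X[:, 0] + 0.5 * X[:, 1] + rng.normal(size=n)

X1 = np.column_stack([np.ones(n), X])
b, *_ = np.linalg.lstsq(X1, y, rcond=None)
yhat = X1 @ b
resid = y - yhat

# Assumption 1: histogram of residuals should look roughly normal
counts, edges = np.histogram(resid, bins=8)

# Assumption 2: plot resid against yhat; a funnel shape would
# suggest nonconstant variance (pairs below are ready to plot)
pairs = np.column_stack([yhat, resid])

# Assumption 3: Durbin-Watson statistic
dw = np.sum(np.diff(resid) ** 2) / np.sum(resid ** 2)
print("Durbin-Watson:", round(dw, 2))
```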

32 Detecting Sample Outliers
Sample leverages
Standardized residuals
Cook's distance measure:
D_i = (standardized residual)² · [1/(k + 1)] · [h_i/(1 − h_i)]
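A sketch computing all three diagnostics from the hat matrix H = X(XᵀX)⁻¹Xᵀ on simulated data:

```python
import numpy as np

rng = np.random.default_rng(2)
n, k = 30, 2
X = rng.normal(size=(n, k))
y = 1.0 + 2.0 * X[:, 0] - X[:, 1] + rng.normal(size=n)

X1 = np.column_stack([np.ones(n), X])
H = X1 @ np.linalg.inv(X1.T @ X1) @ X1.T    # hat matrix
h = np.diag(H)                              # sample leverages h_i
resid = y - H @ y
s = np.sqrt(resid @ resid / (n - k - 1))

std_resid = resid / (s * np.sqrt(1 - h))    # standardized residuals
cooks_d = std_resid**2 * h / ((k + 1) * (1 - h))   # Cook's distance
print("max Cook's D:", round(cooks_d.max(), 3))
```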

33 Residual Analysis Figure 15.19

34 Residual Analysis
Figure 15.20: frequency histogram of the residuals; class limits run from −600 to 600 in steps of 150.

35 Residual Analysis Figure 15.21

36 Prediction Using Multiple Regression
Figure 15.22

37 Prediction Using Multiple Regression
Confidence and Prediction Intervals
(1 − α)·100% confidence interval for µ_Y|X:
Ŷ − t_α/2,n−k−1 · s_Ŷ to Ŷ + t_α/2,n−k−1 · s_Ŷ
(1 − α)·100% prediction interval for Y_X:
Ŷ − t_α/2,n−k−1 · √(s² + s_Ŷ²) to Ŷ + t_α/2,n−k−1 · √(s² + s_Ŷ²)
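A sketch of both intervals at a new point x₀, using s_Ŷ² = s²·x₀ᵀ(XᵀX)⁻¹x₀ (simulated data; SciPy for the t value):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
n, k = 30, 2
X = rng.normal(size=(n, k))
y = 5.0 + 2.0 * X[:, 0] + X[:, 1] + rng.normal(size=n)

X1 = np.column_stack([np.ones(n), X])
XtX_inv = np.linalg.inv(X1.T @ X1)
b = XtX_inv @ X1.T @ y
resid = y - X1 @ b
s2 = resid @ resid / (n - k - 1)

x0 = np.array([1.0, 0.5, -0.2])          # leading 1 for the intercept
yhat0 = x0 @ b
s_yhat2 = s2 * (x0 @ XtX_inv @ x0)       # s_Yhat^2 at x0
t = stats.t.ppf(0.975, n - k - 1)        # alpha = 0.05

ci = (yhat0 - t * np.sqrt(s_yhat2), yhat0 + t * np.sqrt(s_yhat2))
pi = (yhat0 - t * np.sqrt(s2 + s_yhat2), yhat0 + t * np.sqrt(s2 + s_yhat2))
print("CI for mean response:", ci)
print("PI for an individual Y:", pi)
```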

38 Interaction Effects
An interaction term implies that how the variables occur together has an impact on the prediction of the dependent variable:
Y = β₀ + β₁X₁ + β₂X₂ + β₃X₁X₂ + e
µ_Y = β₀ + β₁X₁ + β₂X₂ + β₃X₁X₂

39 Interaction Effects
Figure 15.23: µ_Y plotted against X₁ at X₂ = 2 and X₂ = 5 (panels A and B), including the lines µ_Y = 18 + 5X₁ and µ_Y = 30 − 10X₁.

40 Quadratic and Second-Order Models
Quadratic effects: Y = β₀ + β₁X₁ + β₂X₁² + e
Complete second-order models:
Y = β₀ + β₁X₁ + β₂X₂ + β₃X₁X₂ + β₄X₁² + β₅X₂² + e
Y = β₀ + β₁X₁ + β₂X₂ + β₃X₃ + β₄X₁X₂ + β₅X₁X₃ + β₆X₂X₃ + β₇X₁² + β₈X₂² + β₉X₃² + e

41 Financial Example Figure 15.24

42 Financial Example Figure 15.25

43 Financial Example Figure 15.26

44 Financial Example Figure 15.27

