Chapter Twelve: Multiple Regression and Model Building. McGraw-Hill/Irwin. Copyright © 2004 by The McGraw-Hill Companies, Inc. All rights reserved.


2 12-2 Chapter Twelve Multiple Regression and Model Building McGraw-Hill/Irwin Copyright © 2004 by The McGraw-Hill Companies, Inc. All rights reserved.

3 12-3 Multiple Regression
12.1 The Linear Regression Model
12.2 The Least Squares Estimates and Prediction
12.3 The Mean Squared Error and the Standard Error
12.4 Model Utility: R², Adjusted R², and the F Test
12.5 Testing the Significance of an Independent Variable
12.6 Confidence Intervals and Prediction Intervals
12.7 Dummy Variables
12.8 Model Building and the Effects of Multicollinearity
12.9 Residual Analysis in Multiple Regression

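4 The Linear Regression Model
The linear regression model relating $y$ to $x_1, x_2, \ldots, x_k$ is
$$y = \mu_{y|x_1, x_2, \ldots, x_k} + \varepsilon = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_k x_k + \varepsilon$$
where
$\mu_{y|x_1, x_2, \ldots, x_k} = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_k x_k$ is the mean value of the dependent variable $y$ when the values of the independent variables are $x_1, x_2, \ldots, x_k$.
$\beta_0, \beta_1, \beta_2, \ldots, \beta_k$ are the regression parameters relating the mean value of $y$ to $x_1, x_2, \ldots, x_k$.
$\varepsilon$ is an error term that describes the effects on $y$ of all factors other than the independent variables $x_1, x_2, \ldots, x_k$.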

5 12-5 Example: The Linear Regression Model Example 12.1: The Fuel Consumption Case

6 12-6 The Linear Regression Model Illustrated Example 12.1: The Fuel Consumption Case

7 12-7 The Regression Model Assumptions
Assumptions about the model error terms, the $\varepsilon$'s:
Mean Zero: The mean of the error terms is equal to 0.
Constant Variance: The variance of the error terms, $\sigma^2$, is the same for every combination of values of $x_1, x_2, \ldots, x_k$.
Normality: The error terms follow a normal distribution for every combination of values of $x_1, x_2, \ldots, x_k$.
Independence: The values of the error terms are statistically independent of each other.

8 Least Squares Estimates and Prediction
Estimation/Prediction Equation:
$$\hat{y} = b_0 + b_1 x_{01} + b_2 x_{02} + \cdots + b_k x_{0k}$$
$b_1, b_2, \ldots, b_k$ are the least squares point estimates of the parameters $\beta_1, \beta_2, \ldots, \beta_k$.
$x_{01}, x_{02}, \ldots, x_{0k}$ are specified values of the independent predictor variables $x_1, x_2, \ldots, x_k$.
$\hat{y}$ is the point estimate of the mean value of the dependent variable when the values of the independent variables are $x_{01}, x_{02}, \ldots, x_{0k}$. It is also the point prediction of an individual value of the dependent variable at those same values.
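As an illustration of how least squares point estimates and a point prediction can be computed, here is a minimal pure-Python sketch. It solves the normal equations by Gauss-Jordan elimination, which is one standard approach and not necessarily what Minitab does internally. The function names and the data are made up for the illustration; the data are generated exactly from y = 1 + 2·x1 + 3·x2 so the result is easy to check.

```python
def least_squares(X, y):
    """Return [b0, b1, ..., bk] minimizing the sum of squared residuals.

    X is a list of rows of predictor values (no intercept column);
    y is the list of observed responses.
    """
    rows = [[1.0] + list(r) for r in X]  # prepend the intercept column
    p = len(rows[0])
    # Augmented normal equations [X'X | X'y].
    A = [[sum(r[i] * r[j] for r in rows) for j in range(p)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))]
         for i in range(p)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(p):
        piv = max(range(c, p), key=lambda q: abs(A[q][c]))
        A[c], A[piv] = A[piv], A[c]
        for r in range(p):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    return [A[i][p] / A[i][i] for i in range(p)]


def predict(b, x0):
    """Point estimate/prediction y_hat = b0 + b1*x01 + ... + bk*x0k."""
    return b[0] + sum(bj * xj for bj, xj in zip(b[1:], x0))


# Hypothetical data (not the Fuel Consumption Case).
X = [(0, 0), (1, 0), (0, 1), (1, 1), (2, 1)]
y = [1, 3, 4, 6, 8]
b = least_squares(X, y)
y_hat = predict(b, (2, 2))
```

Because the made-up responses lie exactly on a plane, the estimates recover the generating coefficients and the residuals are all zero; real data would leave nonzero residuals.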

9 12-9 Example: Least Squares Estimation
Example 12.3: The Fuel Consumption Case
Minitab Output
FuelCons = Temp Chill
Predictor Coef StDev T P
Constant
Temp
Chill
S = R-Sq = 97.4% R-Sq(adj) = 96.3%
Analysis of Variance
Source DF SS MS F P
Regression
Residual Error
Total
Predicted Values (Temp = 40, Chill = 10)
Fit StDev Fit 95.0% CI 95.0% PI
(9.895, ) (9.293, 11.374)

10 12-10 Example: Point Predictions and Residuals Example 12.3: The Fuel Consumption Case

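11 Mean Square Error and Standard Error
Sum of Squared Errors:
$$SSE = \sum e_i^2 = \sum (y_i - \hat{y}_i)^2$$
Mean Square Error, point estimate of residual variance $\sigma^2$:
$$s^2 = MSE = \frac{SSE}{n - (k+1)}$$
Standard Error, point estimate of residual standard deviation $\sigma$:
$$s = \sqrt{MSE} = \sqrt{\frac{SSE}{n - (k+1)}}$$
Example 12.3: The Fuel Consumption Case
Analysis of Variance
Source DF SS MS F P
Regression
Residual Error
Total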
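The mean square error and standard error can be sketched in a few lines of Python. The function name and the numbers below are hypothetical (they are not the Fuel Consumption Case data, and the fitted values are simply assumed rather than obtained from a real fit); the point is only the divisor n - (k+1).

```python
import math


def mse_and_standard_error(y, y_hat, k):
    """Return (SSE, s^2, s) for observed y, fitted y_hat and k predictors."""
    n = len(y)
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))
    s2 = sse / (n - (k + 1))  # divide by the residual degrees of freedom
    return sse, s2, math.sqrt(s2)


# Hypothetical observed and fitted values for a model with k = 2 predictors.
y = [10.0, 12.0, 9.0, 11.0, 13.0]
y_hat = [10.5, 11.5, 9.5, 10.5, 13.0]
sse, s2, s = mse_and_standard_error(y, y_hat, k=2)
```

With five observations and two predictors there are 5 - 3 = 2 residual degrees of freedom, so the SSE of 1.0 gives s² = 0.5.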

12 Model Utility: Multiple Coefficient of Determination, R²
The multiple coefficient of determination $R^2$ is
$$R^2 = \frac{\text{Explained variation}}{\text{Total variation}}$$
$R^2$ is the proportion of the total variation in $y$ explained by the linear regression model.

13 Model Utility: Adjusted R²
The adjusted multiple coefficient of determination is
$$\bar{R}^2 = \left( R^2 - \frac{k}{n-1} \right) \left( \frac{n-1}{n-(k+1)} \right)$$
Fuel Consumption Case:
S = R-Sq = 97.4% R-Sq(adj) = 96.3%
Analysis of Variance
Source DF SS MS F P
Regression
Residual Error
Total
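A short Python sketch of both model utility measures follows. It assumes the usual identity R² = 1 - SSE / (total variation), which holds for a least squares fit that includes an intercept, and the adjusted form (R² - k/(n-1)) · (n-1)/(n-(k+1)). The function names and the small data set are illustrative only. For the Fuel Consumption Case, n = 8 is inferred from k = 2 predictors and the 5 denominator degrees of freedom quoted for the F test.

```python
def r_squared(y, y_hat):
    """R^2 = 1 - SSE / total variation (least squares fit with intercept)."""
    mean_y = sum(y) / len(y)
    total = sum((yi - mean_y) ** 2 for yi in y)
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))
    return 1.0 - sse / total


def adjusted_r_squared(r2, n, k):
    """Adjusted R^2 for n observations and k independent variables."""
    return (r2 - k / (n - 1)) * ((n - 1) / (n - (k + 1)))


# Hypothetical observed and fitted values.
r2_demo = r_squared([1.0, 2.0, 3.0, 4.0], [1.1, 1.9, 3.2, 3.8])

# Fuel Consumption Case: R-Sq = 97.4%, k = 2, n = 8 (inferred, see above).
adj = adjusted_r_squared(0.974, n=8, k=2)
```

Starting from the rounded R-Sq of 97.4%, the result is about 0.9636, which agrees with the reported R-Sq(adj) of 96.3% up to the rounding of R².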

14 Model Utility: F Test for Linear Regression Model
To test $H_0: \beta_1 = \beta_2 = \cdots = \beta_k = 0$ versus $H_a$: at least one of $\beta_1, \beta_2, \ldots, \beta_k$ is not equal to 0
Test Statistic:
$$F(\text{model}) = \frac{(\text{Explained variation})/k}{SSE/[n-(k+1)]}$$
Reject $H_0$ in favor of $H_a$ if: $F(\text{model}) > F_\alpha$ or p-value $< \alpha$
$F_\alpha$ is based on $k$ numerator and $n-(k+1)$ denominator degrees of freedom.
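The overall F test can be sketched as below, taking F(model) as (explained variation / k) divided by (SSE / (n - (k+1))). The sums of squares are hypothetical, and the critical value 5.79 is the standard table value of F.05 for 2 numerator and 5 denominator degrees of freedom, supplied by hand because the Python standard library has no F distribution.

```python
def f_model(explained_variation, sse, n, k):
    """Overall F statistic for a regression with k predictors."""
    return (explained_variation / k) / (sse / (n - (k + 1)))


def reject_h0(f_stat, f_critical):
    """Rejection rule: reject H0 when F(model) exceeds the critical value."""
    return f_stat > f_critical


# Hypothetical sums of squares with n = 8 observations and k = 2 predictors.
f_stat = f_model(explained_variation=90.0, sse=10.0, n=8, k=2)

# F_.05 with 2 and 5 degrees of freedom, looked up in an F table.
F_CRITICAL_05 = 5.79
decision = reject_h0(f_stat, F_CRITICAL_05)
```

Here F(model) = 45 / 2 = 22.5, which exceeds 5.79, so this made-up model would be judged significant at the 0.05 level.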

15 12-15 Example: F Test for Linear Regression
Example 12.5: The Fuel Consumption Case
Minitab Output
Analysis of Variance
Source DF SS MS F P
Regression
Residual Error
Total
F-test at the $\alpha = 0.05$ level of significance
Test Statistic: $F(\text{model})$, reported in the F column of the Minitab output
Reject $H_0$ at the $\alpha$ level of significance, since $F(\text{model}) > F_\alpha$.
$F_\alpha$ is based on 2 numerator and 5 denominator degrees of freedom.

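16 Testing Significance of the Independent Variable
If the regression assumptions hold, we can reject $H_0: \beta_j = 0$ at the $\alpha$ level of significance (probability of a Type I error equal to $\alpha$) if and only if the appropriate rejection point condition holds or, equivalently, if the corresponding p-value is less than $\alpha$.
Test Statistic:
$$t = \frac{b_j}{s_{b_j}}$$
Alternative | Reject $H_0$ if | p-Value
$H_a: \beta_j \neq 0$ | $|t| > t_{\alpha/2}$ | twice the area under the t curve to the right of $|t|$
$H_a: \beta_j > 0$ | $t > t_\alpha$ | area under the t curve to the right of $t$
$H_a: \beta_j < 0$ | $t < -t_\alpha$ | area under the t curve to the left of $t$
$t_\alpha$, $t_{\alpha/2}$ and p-values are based on $n - (k+1)$ degrees of freedom.
$100(1-\alpha)\%$ Confidence Interval for $\beta_j$:
$$[\, b_j \pm t_{\alpha/2}\, s_{b_j} \,]$$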

17 12-17 Example: Testing and Estimation for $\beta$s
Example 12.6: The Fuel Consumption Case
Minitab Output
Predictor Coef StDev T P
Constant
Temp
Chill
Test: Chill is significant at the $\alpha = 0.05$ level, but not at $\alpha = 0.01$.
Interval
$t_\alpha$, $t_{\alpha/2}$ and p-values are based on 5 degrees of freedom.
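The test and interval for a single coefficient can be sketched as follows, using t = b_j / s_bj and the interval b_j ± t(α/2) · s_bj. The coefficient and its standard error are hypothetical stand-ins, since the Minitab values are not reproduced in this transcript. The rejection points 2.571 and 4.032 are the standard t-table values of t.025 and t.005 for 5 degrees of freedom.

```python
def t_statistic(b_j, s_bj):
    """t = b_j / s_bj for testing H0: beta_j = 0."""
    return b_j / s_bj


def two_sided_reject(t, t_half_alpha):
    """Reject H0 in favor of Ha: beta_j != 0 when |t| > t_(alpha/2)."""
    return abs(t) > t_half_alpha


def confidence_interval(b_j, s_bj, t_half_alpha):
    """100(1 - alpha)% confidence interval for beta_j."""
    half_width = t_half_alpha * s_bj
    return (b_j - half_width, b_j + half_width)


# Hypothetical estimate and standard error (not the Minitab output).
b_j, s_bj = 0.09, 0.03
t = t_statistic(b_j, s_bj)

T_025_DF5 = 2.571  # t-table value, 5 degrees of freedom
T_005_DF5 = 4.032  # t-table value, 5 degrees of freedom

reject_at_05 = two_sided_reject(t, T_025_DF5)
reject_at_01 = two_sided_reject(t, T_005_DF5)
ci_95 = confidence_interval(b_j, s_bj, T_025_DF5)
```

These made-up numbers give t = 3.0, which reproduces the same pattern as the Chill result: significant at the 0.05 level but not at 0.01. The 95% interval excludes 0, consistent with the two-sided test at that level.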

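18 Confidence and Prediction Intervals
Prediction:
$$\hat{y} = b_0 + b_1 x_{01} + b_2 x_{02} + \cdots + b_k x_{0k}$$
If the regression assumptions hold,
$100(1-\alpha)\%$ confidence interval for the mean value of $y$:
$$[\, \hat{y} \pm t_{\alpha/2}\, s \sqrt{\text{Distance value}} \,]$$
$100(1-\alpha)\%$ prediction interval for an individual value of $y$:
$$[\, \hat{y} \pm t_{\alpha/2}\, s \sqrt{1 + \text{Distance value}} \,]$$
(Distance value requires matrix algebra)
$t_{\alpha/2}$ is based on $n-(k+1)$ degrees of freedom.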
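The two intervals differ only in the extra "1 +" under the square root, which the following Python sketch makes explicit. The half-widths used are t(α/2) · s · sqrt(distance value) for the mean and t(α/2) · s · sqrt(1 + distance value) for an individual value. All inputs are hypothetical; in particular the distance value is simply assumed here rather than computed by matrix algebra.

```python
import math


def mean_confidence_interval(y_hat, s, distance_value, t_half_alpha):
    """Confidence interval for the mean value of y at the given x values."""
    half = t_half_alpha * s * math.sqrt(distance_value)
    return (y_hat - half, y_hat + half)


def prediction_interval(y_hat, s, distance_value, t_half_alpha):
    """Prediction interval for an individual value of y at the given x values."""
    half = t_half_alpha * s * math.sqrt(1.0 + distance_value)
    return (y_hat - half, y_hat + half)


# Hypothetical inputs; 2.571 is the t-table value t_.025 for 5 degrees of freedom.
y_hat, s, distance_value, t_crit = 10.0, 0.4, 0.25, 2.571
ci = mean_confidence_interval(y_hat, s, distance_value, t_crit)
pi = prediction_interval(y_hat, s, distance_value, t_crit)
```

Both intervals are centered at the point prediction, and the prediction interval is always the wider of the two. When software reports the standard error of the fit (Minitab's "StDev Fit"), the distance value can be recovered as (StDev Fit / s)².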

19 12-19 Example: Confidence and Prediction Intervals
Example 12.9: The Fuel Consumption Case
Minitab Output
FuelCons = Temp Chill
Predicted Values (Temp = 40, Chill = 10)
Fit StDev Fit 95.0% CI 95.0% PI
(9.895, ) (9.293, 11.374)
95% Confidence Interval
95% Prediction Interval

20 Dummy Variables
Example: The Electronics World Case
Location Dummy Variable

21 12-21 Example: Regression with a Dummy Variable
Example 12.11: The Electronics World Case
Minitab Output
Sales = Households DM
Predictor Coef StDev T P
Constant
Househol
DM
S = R-Sq = 98.3% R-Sq(adj) = 97.8%
Analysis of Variance
Source DF SS MS F P
Regression
Residual Error
Total
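A dummy variable turns a qualitative factor such as store location into a 0/1 predictor. The sketch below assumes that DM equals 1 for a mall location and 0 otherwise; the transcript does not spell out the coding, so treat that as an assumption. The coefficients and function names are likewise hypothetical, not the Minitab estimates.

```python
def encode_dummy(locations, flagged="mall"):
    """Map each location label to 1 if it equals `flagged`, else 0."""
    return [1 if loc == flagged else 0 for loc in locations]


def predicted_sales(b0, b1, b2, households, dm):
    """Sales = b0 + b1 * Households + b2 * DM."""
    return b0 + b1 * households + b2 * dm


# Hypothetical location labels and coefficients.
dm_values = encode_dummy(["street", "mall", "mall", "street"])
b0, b1, b2 = 15.0, 0.9, 30.0

mall = predicted_sales(b0, b1, b2, households=100, dm=1)
street = predicted_sales(b0, b1, b2, households=100, dm=0)
location_effect = mall - street
```

For the same number of households, the two predictions differ by exactly b2, so the dummy coefficient is read as the shift in mean sales associated with the flagged location.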

22 Model Building and the Effects of Multicollinearity Example: The Sales Territory Performance Case

23 12-23 Correlation Matrix Example: The Sales Territory Performance Case

24 12-24 Multicollinearity
Multicollinearity refers to the condition where the independent variables (or predictors) in a model are dependent, related, or correlated with each other.
Effects
Hinders the ability to use the $b_j$ values, t statistics, and p-values to assess the relative importance of predictors.
Does not hinder the ability to predict the dependent (or response) variable.
Detection
Scatter Plot Matrix
Correlation Matrix
Variance Inflation Factors (VIF)
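As a rough illustration of the detection tools, the sketch below computes a correlation and a variance inflation factor using the standard definition VIF_j = 1 / (1 - R_j²), where R_j² comes from regressing predictor j on the other predictors. To stay short it handles only the two-predictor case, where R_j² is just the squared correlation between the two predictors. The data and function names are made up.

```python
import math


def correlation(u, v):
    """Sample correlation coefficient between two equal-length lists."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    s_uv = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    s_uu = sum((a - mean_u) ** 2 for a in u)
    s_vv = sum((b - mean_v) ** 2 for b in v)
    return s_uv / math.sqrt(s_uu * s_vv)


def vif_two_predictors(x1, x2):
    """VIF shared by both predictors when the model has exactly two."""
    r = correlation(x1, x2)
    return 1.0 / (1.0 - r ** 2)


# Hypothetical predictor values.
x1 = [1, 2, 3, 4, 5]
x2 = [2, 4, 5, 4, 5]
r12 = correlation(x1, x2)
vif = vif_two_predictors(x1, x2)
```

For these values the squared correlation is 0.6, giving a VIF of 2.5. A VIF of 1 would mean the predictors are uncorrelated, and larger values indicate stronger multicollinearity.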

25 Residual Analysis in Multiple Regression
For an observed value of $y_i$, the residual is
$$e_i = y_i - \hat{y}_i$$
If the regression assumptions hold, the residuals should look like a random sample from a normal distribution with mean 0 and variance $\sigma^2$.
Residual Plots
Residuals versus each independent variable
Residuals versus predicted y's
Residuals in time order (if the response is a time series)
Histogram of residuals
Normal plot of the residuals

26 12-26 Multiple Regression Summary:
12.1 The Linear Regression Model
12.2 The Least Squares Estimates and Prediction
12.3 The Mean Squared Error and the Standard Error
12.4 Model Utility: R², Adjusted R², and the F Test
12.5 Testing the Significance of an Independent Variable
12.6 Confidence Intervals and Prediction Intervals
12.7 Dummy Variables
12.8 Model Building and the Effects of Multicollinearity
12.9 Residual Analysis in Multiple Regression

