# To accompany Quantitative Analysis for Management, 9e by Render/Stair/Hanna 4-1 © 2006 by Prentice Hall, Inc., Upper Saddle River, NJ 07458 Chapter 4 Regression Models


Chapter 4: Regression Models. Prepared by Lee Revere and John Large.

### Learning Objectives

Students will be able to:

1. Identify variables and use them in a regression model.
2. Develop simple linear regression equations from sample data and interpret the slope and intercept.
3. Compute the coefficient of determination and the coefficient of correlation and interpret their meanings.
4. Interpret the F-test in a linear regression model.
5. List the assumptions used in regression and use residual plots to identify problems.

### Learning Objectives (continued)

Students will be able to:

6. Develop a multiple regression model and use it to predict.
7. Use dummy variables to model categorical data.
8. Determine which variables should be included in a multiple regression model.
9. Transform a nonlinear function into a linear one for use in regression.
10. Understand and avoid common mistakes made in the use of regression analysis.

### Chapter Outline

- 4.1 Introduction
- 4.2 Scatter Diagrams
- 4.3 Simple Linear Regression
- 4.4 Measuring the Fit of a Regression Model
- 4.5 Using Computer Software for Regression
- 4.6 Assumptions of the Regression Model

### Chapter Outline (continued)

- 4.7 Testing the Model for Significance
- 4.8 Multiple Regression Analysis
- 4.9 Binary or Dummy Variables
- 4.10 Model Building
- 4.11 Nonlinear Regression
- 4.12 Cautions and Pitfalls in Regression Analysis

### Introduction

Regression analysis is a very valuable tool for today’s manager. Regression is used to:

- understand the relationship between variables.
- predict the value of one variable based on another variable.

Cost estimation models are a good example.

### Introduction (continued)

A regression model comprises a dependent, or response, variable and one or more independent, or predictor, variables. The prediction relationship expresses the dependent variable in terms of the independent variable(s).

### Scatter Diagram

A scatter diagram is used to graphically investigate the relationship between the dependent and independent variables.

- Plot the dependent variable on the Y axis.
- Plot the independent variable on the X axis.

### Triple A Construction Example

Triple A Construction Company renovates old homes in Albany. The company has found that its dollar volume of renovation work depends on the Albany area payroll.

| Triple A Sales (\$100,000’s) | Local Payroll (\$100,000,000’s) |
|---|---|
| 6 | 3 |
| 8 | 4 |
| 9 | 6 |
| 5 | 4 |
| 4.5 | 2 |
| 9.5 | 5 |

### Triple A Construction Example (continued)

[Figure: scatter diagram of Triple A sales (dependent variable, Y axis) against local payroll (independent variable, X axis).]

### Simple Linear Regression

Regression models are used to test whether a relationship exists between variables; that is, whether one variable can be used to predict another. However, there is some random error that cannot be predicted.

$$Y = \beta_0 + \beta_1 X + \epsilon$$

where

- $Y$ = dependent variable (response)
- $X$ = independent variable (predictor / explanatory)
- $\beta_0$ = intercept (value of $Y$ when $X = 0$)
- $\beta_1$ = slope of the regression line
- $\epsilon$ = random error

### Simple Linear Regression (continued)

Sample data are used to estimate the true values of the intercept and slope:

$$\hat{Y} = b_0 + b_1 X$$

where $\hat{Y}$ is the predicted value of $Y$, and $b_0$ and $b_1$ are the sample estimates of the intercept and slope.

The difference between the actual value of $Y$ and the predicted value (using sample data) is known as the error:

$$e = Y - \hat{Y}$$

That is, error = (actual value) - (predicted value).

### Least Squares Regression

Least squares regression minimizes the sum of the squared errors.

### Least Squares Regression Equations

For the line $\hat{Y} = b_0 + b_1 X$, the least squares regression equations are:

$$b_1 = \frac{\sum (X - \bar{X})(Y - \bar{Y})}{\sum (X - \bar{X})^2}, \qquad b_0 = \bar{Y} - b_1 \bar{X}$$

where $\bar{X} = \sum X / n$ and $\bar{Y} = \sum Y / n$ are the sample means.

### Calculating the Regression Line: Triple A Construction

| Sales ($Y$) | Payroll ($X$) | $(X - \bar{X})^2$ | $(X - \bar{X})(Y - \bar{Y})$ |
|---|---|---|---|
| 6 | 3 | 1 | 1 |
| 8 | 4 | 0 | 0 |
| 9 | 6 | 4 | 4 |
| 5 | 4 | 0 | 0 |
| 4.5 | 2 | 4 | 5 |
| 9.5 | 5 | 1 | 2.5 |
| **Sum: 42** | **24** | **10** | **12.5** |

$\bar{Y} = 42/6 = 7$ and $\bar{X} = 24/6 = 4$.

### Calculating the Regression Line (continued)

Calculating the required parameters:

$$b_1 = \frac{\sum (X - \bar{X})(Y - \bar{Y})}{\sum (X - \bar{X})^2} = \frac{12.5}{10} = 1.25$$

$$b_0 = \bar{Y} - b_1 \bar{X} = 7 - (1.25)(4) = 2$$

So, $\hat{Y} = 2 + 1.25X$.
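The calculation above can be reproduced in a few lines. The following is a minimal sketch in plain Python, using the Triple A data from the slides; the function name `fit_line` is a hypothetical helper chosen for illustration, not something from the textbook.

```python
# Triple A Construction data from the slides.
sales = [6, 8, 9, 5, 4.5, 9.5]   # Y, in $100,000s
payroll = [3, 4, 6, 4, 2, 5]     # X, in $100,000,000s


def fit_line(x, y):
    """Return (b0, b1) for the least squares line y_hat = b0 + b1 * x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Numerator and denominator of the slope formula.
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1


b0, b1 = fit_line(payroll, sales)
print(f"Y_hat = {b0:g} + {b1:g} X")
```

The intermediate sums `sxy` and `sxx` correspond to the column totals 12.5 and 10 in the calculation table.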

### Using the Regression Line

If the payroll estimate for next year is \$600 million, what is the predicted value of Triple A’s sales?

Payroll is measured in \$100,000,000’s, so \$600 million corresponds to $X = 6$:

$$\hat{Y} = 2 + 1.25X, \qquad \text{sales} = 2 + 1.25\,(\text{payroll})$$

So, next year's sales $= 2 + 1.25(6) = 9.5$, that is, \$950,000 (sales are measured in \$100,000’s).
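The unit conversions are the easy part to get wrong here, so a small sketch may help. It assumes the fitted coefficients from the slides ($b_0 = 2$, $b_1 = 1.25$); the function name `predict_sales_dollars` is hypothetical.

```python
def predict_sales_dollars(payroll_dollars, b0=2.0, b1=1.25):
    """Predict Triple A sales in dollars from area payroll in dollars."""
    x = payroll_dollars / 100_000_000   # model input is in $100,000,000s
    y_hat = b0 + b1 * x                 # model output is in $100,000s
    return y_hat * 100_000


forecast = predict_sales_dollars(600_000_000)
print(f"${forecast:,.0f}")
```

Keeping the scaling inside one function avoids passing raw dollar amounts to a model that was fitted on scaled units.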

### Measuring the Fit of the Regression Model

To understand how well the model predicts the response variable, we evaluate the following:

- The variability in the $Y$ variable
  - SST: total variability about the mean
  - SSE: variability about the regression line
  - SSR: variability that is explained
- Coefficient of determination, $r^2$: proportion of explained variation
- Correlation coefficient, $r$: strength of the relationship between the $Y$ and $X$ variables

### Measuring the Fit of the Regression Model

- Sum of Squares Total (SST) measures the total variability in $Y$: $SST = \sum (Y - \bar{Y})^2$
- Sum of Squared Errors (SSE) measures the variability about the regression line; it is no greater than SST because the regression line accounts for part of the variability: $SSE = \sum e^2 = \sum (Y - \hat{Y})^2$
- Sum of Squares due to Regression (SSR) indicates how much of the total variability is explained by the regression model: $SSR = \sum (\hat{Y} - \bar{Y})^2$

Errors (deviations) may be positive or negative. Summing the errors would be misleading, so we square the terms before summing.

### Measuring the Fit of the Regression Model (continued)

For Triple A Construction:

- $SST = \sum (Y - \bar{Y})^2 = 22.5$
- $SSE = \sum e^2 = \sum (Y - \hat{Y})^2 = 6.875$ (unexplained variability)
- $SSR = \sum (\hat{Y} - \bar{Y})^2 = 15.625$ (explained variability)

Note: $SST = SSR + SSE$.

### Coefficient of Determination

The coefficient of determination ($r^2$) is the proportion of the variability in $Y$ that is explained by the regression equation.

$$r^2 = \frac{SSR}{SST} = 1 - \frac{SSE}{SST}$$

For Triple A Construction:

$$r^2 = \frac{15.625}{22.5} = 0.6944$$

About 69% of the variability in sales is explained by the regression based on payroll.

Note: $0 \le r^2 \le 1$.
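The three sums of squares and $r^2$ can be checked directly from the data. This is a minimal sketch that takes the fitted line $\hat{Y} = 2 + 1.25X$ from the slides as given; the variable names are illustrative.

```python
sales = [6, 8, 9, 5, 4.5, 9.5]   # Y, in $100,000s
payroll = [3, 4, 6, 4, 2, 5]     # X, in $100,000,000s
b0, b1 = 2.0, 1.25               # fitted line from the slides

y_bar = sum(sales) / len(sales)
predicted = [b0 + b1 * x for x in payroll]

sst = sum((y - y_bar) ** 2 for y in sales)                      # total
sse = sum((y - yh) ** 2 for y, yh in zip(sales, predicted))     # unexplained
ssr = sum((yh - y_bar) ** 2 for yh in predicted)                # explained

r2 = ssr / sst
print(f"SST={sst}, SSE={sse}, SSR={ssr}, r^2={r2:.4f}")
```

Computing SSR independently, rather than as SST minus SSE, makes the identity SST = SSR + SSE a check on the arithmetic instead of an assumption.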

### Correlation Coefficient

The correlation coefficient ($r$) measures the strength of the linear relationship.

$$r = \frac{n\sum XY - \sum X \sum Y}{\sqrt{\left[n\sum X^2 - \left(\sum X\right)^2\right]\left[n\sum Y^2 - \left(\sum Y\right)^2\right]}}$$

For Triple A Construction, $r = 0.8333$.

Note: $-1 \le r \le 1$.
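As a check on the formula, the sketch below evaluates it term by term for the Triple A data. The function name `correlation` is a hypothetical helper; only the standard library is used.

```python
import math

sales = [6, 8, 9, 5, 4.5, 9.5]   # Y
payroll = [3, 4, 6, 4, 2, 5]     # X


def correlation(x, y):
    """Correlation coefficient via the computational (raw sums) formula."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    sum_x2 = sum(xi ** 2 for xi in x)
    sum_y2 = sum(yi ** 2 for yi in y)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt(
        (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
    )
    return numerator / denominator


r = correlation(payroll, sales)
print(f"r = {r:.4f}, r^2 = {r ** 2:.4f}")
```

In simple linear regression, squaring this value gives the coefficient of determination, so the result should agree with $r^2 = 0.6944$.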


### Computer Software for Regression

In Excel, use Tools / Data Analysis. This is an add-in option.

### Computer Software for Regression (continued)

After selecting the regression option, a dialog appears in which you specify:

- the X and Y ranges
- whether labels are included in the ranges
- the output area
- residual (error) output
- scatter diagram output

### Computer Software for Regression (continued)

[Figure: Excel regression output.] Annotations on the output:

- Multiple R is the correlation coefficient ($r$).
- High $r^2$ (close to 1).
- Regression coefficients.
- A scatter diagram will be given.

### Assumptions of the Regression Model

We make certain assumptions about the errors in a regression model which allow for statistical testing.

Assumptions:

- Errors are independent.
- Errors are normally distributed.
- Errors have a mean of zero.
- Errors have a constant variance.

### Residual Analysis

Residual analyses (plots) will highlight glaring violations of the assumptions.

[Figure: healthy residual plot (residuals against X, centered on 0); no violations.]

### Residual Analysis: Nonlinear Violation

[Figure: nonlinear residual plot (residuals against X, centered on 0); violation.]

### Residual Analysis: Nonconstant Error

[Figure: nonconstant error residual plot (residuals against X, centered on 0); violation.]
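A residual plot starts from the residuals themselves. The sketch below lists them for the Triple A example, taking the fitted line $\hat{Y} = 2 + 1.25X$ from the slides as given; the plotting step is left out so the example needs only the standard library.

```python
sales = [6, 8, 9, 5, 4.5, 9.5]   # Y
payroll = [3, 4, 6, 4, 2, 5]     # X
b0, b1 = 2.0, 1.25               # fitted line from the slides

# Residual = actual value - predicted value, one per observation.
residuals = [y - (b0 + b1 * x) for x, y in zip(payroll, sales)]

# Listing residuals in order of X mimics reading a residual plot left to right.
for x, e in sorted(zip(payroll, residuals)):
    print(f"X = {x}: residual = {e:+.2f}")

print("sum of residuals:", sum(residuals))
```

For a least squares line with an intercept, the residuals sum to zero, so a nonzero total usually signals an arithmetic slip rather than a model problem. With only six observations, any pattern in these residuals is suggestive at best.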

### Estimating the Variance

The mean squared error (MSE) is the estimate of the error variance of the regression equation:

$$s^2 = MSE = \frac{SSE}{n - k - 1}$$

where

- $n$ = number of observations in the sample
- $k$ = number of independent variables

For Triple A Construction, $s^2 = 6.875 / 4 = 1.7188$.

### Estimating the Variance (continued)

The standard deviation of the regression, $s = \sqrt{MSE}$, is used in many statistical tests about the regression model.

For Triple A Construction, $s = \sqrt{1.7188} = 1.31$.
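The two quantities follow directly from SSE and the degrees of freedom. A minimal sketch, using the Triple A values from the slides (SSE = 6.875, six observations, one independent variable):

```python
import math

sse = 6.875   # sum of squared errors for Triple A
n = 6         # observations in the sample
k = 1         # independent variables

mse = sse / (n - k - 1)   # estimate of the error variance, s^2
s = math.sqrt(mse)        # standard deviation of the regression

print(f"s^2 = {mse:.4f}, s = {s:.2f}")
```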

### Testing the Model for Significance: F-test

An F-test is used to statistically test the null hypothesis that there is no linear relationship between the $X$ and $Y$ variables (i.e., $\beta_1 = 0$). If the significance level for the F-test is low, we reject $H_0$ and conclude there is a linear relationship.

$$F = \frac{MSR}{MSE}, \qquad \text{where } MSR = \frac{SSR}{k}$$

### Testing the Model for Significance: F-test

For Triple A Construction:

$$MSR = \frac{15.625}{1} = 15.625, \qquad F = \frac{15.625}{1.7188} = 9.0909$$

The significance level for $F = 9.0909$ is 0.0394, indicating we reject $H_0$ and conclude a linear relationship exists between sales and payroll.

### Testing the Model for Significance: $r^2$

$r^2$ is the best measure of the strength of the prediction relationship between the $X$ and $Y$ variables.

- Values closer to 1 indicate a strong prediction relationship.
- Good regression models have a significant F-test and high $r^2$ values.

### Testing the Model for Significance: Coefficient Hypotheses

Statistical tests of significance can be performed on the coefficients. The null hypothesis is that the coefficient of $X$ (i.e., the slope of the line) is 0.

- P-values are the observed significance levels and can be used to test the null hypothesis.
- For a simple linear regression, the test of the regression coefficient gives the same information as the F-test.

### ANOVA Tables

When developing a regression model, an ANOVA table is computed by most statistical software. The general form of the ANOVA table is helpful for understanding the interrelatedness of the error terms.

| | DF | SS | MS | F | Significance |
|---|---|---|---|---|---|
| Regression | $k$ | SSR | MSR | MSR/MSE | P-value |
| Residual | $n - k - 1$ | SSE | MSE | | |
| Total | $n - 1$ | SST | | | |
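To see how the entries of the table depend on one another, the sketch below fills it in for the Triple A example from SSR, SSE, $n$, and $k$. The function name `anova_table` and the dictionary layout are illustrative choices. The significance (P-value) column is left out because it needs the F distribution, which the Python standard library does not provide.

```python
def anova_table(ssr, sse, n, k):
    """Build the regression ANOVA table as a dict of rows."""
    msr = ssr / k
    mse = sse / (n - k - 1)
    return {
        "Regression": {"DF": k, "SS": ssr, "MS": msr, "F": msr / mse},
        "Residual": {"DF": n - k - 1, "SS": sse, "MS": mse},
        "Total": {"DF": n - 1, "SS": ssr + sse},
    }


# Triple A Construction values from the slides.
table = anova_table(ssr=15.625, sse=6.875, n=6, k=1)
for source, row in table.items():
    print(source, row)
```

The degrees of freedom and the sums of squares each add up down the column, which is a quick consistency check on any software output.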

### Multiple Regression

Multiple regression models are similar to simple linear regression models except they include more than one $X$ variable.

$$\hat{Y} = b_0 + b_1 X_1 + b_2 X_2 + \dots + b_k X_k$$

Here $X_1, \dots, X_k$ are the independent variables and $b_1, \dots, b_k$ are their slopes.

### Multiple Regression: Wilson Realty Example

Wilson Realty wants to develop a model to determine the suggested listing price for a house based on size and age.

| Price | Sq. Feet | Age | Condition |
|---|---|---|---|
| 35000 | 1926 | 30 | Good |
| 47000 | 2069 | 40 | Excellent |
| 49900 | 1720 | 30 | Excellent |
| 55000 | 1396 | 15 | Good |
| 58900 | 1706 | 32 | Mint |
| 60000 | 1847 | 38 | Mint |
| 67000 | 1950 | 27 | Mint |
| 70000 | 2323 | 30 | Excellent |
| 78500 | 2285 | 26 | Mint |
| 79000 | 3752 | 35 | Good |
| 87500 | 2300 | 18 | Good |
| 93000 | 2525 | 17 | Good |
| 95000 | 3800 | 40 | Excellent |
| 97000 | 1740 | 12 | Mint |

### Wilson Realty Example (continued)

[Figure: Excel regression output for Wilson Realty.] Annotations on the output:

- 67% of the variation in sales price is explained by size and age.
- $H_0$: no linear relationship, is rejected.
- $H_0$: $\beta_1 = 0$ is rejected.
- $H_0$: $\beta_2 = 0$ is rejected.

$$\hat{Y} = 60815.45 + 21.91\,(\text{size}) - 1449.34\,(\text{age})$$

### Wilson Realty Example (continued)

$$\hat{Y} = 60815.45 + 21.91\,(\text{size}) - 1449.34\,(\text{age})$$

Wilson Realty has found a linear relationship between price and both size and age. The coefficient for size indicates that each additional square foot increases the value by \$21.91, while each additional year of age decreases the value by \$1449.34.

For a 1900 square foot house that is 10 years old, the following prediction can be made:

$$60815.45 + 21.91(1900) - 1449.34(10) = 87{,}951$$

That is, a predicted price of about \$87,951.
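The prediction is easy to mistype because the intercept and the sign on age both matter. A minimal sketch that evaluates the fitted Wilson Realty equation from the slides; the function name `predict_price` is hypothetical.

```python
def predict_price(size_sqft, age_years):
    """Suggested listing price from the fitted size-and-age model."""
    return 60815.45 + 21.91 * size_sqft - 1449.34 * age_years


price = predict_price(size_sqft=1900, age_years=10)
print(f"${price:,.2f}")
```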

### Binary Variables

Binary (or dummy) variables are special variables that are created for qualitative data.

- A dummy variable is assigned a value of 1 if a particular condition is met and a value of 0 otherwise.
- The number of dummy variables must equal one less than the number of categories of the qualitative variable.

### Wilson Realty Example: Binary Variables

Return to Wilson Realty, and let’s evaluate how to use property condition in the regression model. There are three categories: Mint, Excellent, and Good.

- $X_3 = 1$ if the house is in excellent condition, $= 0$ otherwise
- $X_4 = 1$ if the house is in mint condition, $= 0$ otherwise

Note: If both $X_3$ and $X_4$ are 0, then the house is in good condition.

### Wilson Realty: Binary Variables (continued)

$$\hat{Y} = 48329.23 + 28.21\,(\text{size}) - 1981.41\,(\text{age}) + 23684.62\,(\text{if mint}) + 16581.32\,(\text{if excellent})$$

What can you say about the new model?
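One way to read the new model is to encode the condition explicitly and compare predictions for otherwise identical houses. The sketch below assumes the coefficients shown on the slide; `encode_condition` and `predict_price` are hypothetical helper names.

```python
def encode_condition(condition):
    """Return (x3, x4): x3 = 1 for excellent, x4 = 1 for mint.

    Good condition is the baseline and encodes as (0, 0), so three
    categories need only two dummy variables.
    """
    c = condition.strip().lower()
    if c not in {"good", "excellent", "mint"}:
        raise ValueError(f"unknown condition: {condition!r}")
    return (1 if c == "excellent" else 0, 1 if c == "mint" else 0)


def predict_price(size_sqft, age_years, condition):
    """Listing price from the model with size, age, and condition dummies."""
    x3, x4 = encode_condition(condition)
    return (48329.23 + 28.21 * size_sqft - 1981.41 * age_years
            + 16581.32 * x3 + 23684.62 * x4)


for cond in ("Good", "Excellent", "Mint"):
    print(cond, round(predict_price(1900, 10, cond), 2))
```

Each dummy coefficient is the price difference relative to the baseline category (good condition), holding size and age fixed.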

### Model Building

The best model is a statistically significant model with a high $r^2$ and few variables.

- As more variables are added to the model, the $r^2$ usually increases.
- The adjusted $r^2$ takes into account the number of independent variables in the model.

Note: When variables are added to the model, the value of $r^2$ can never decrease; however, the adjusted $r^2$ may decrease.
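The slides do not give a formula for the adjusted $r^2$. The sketch below uses the standard definition, $1 - (1 - r^2)\frac{n - 1}{n - k - 1}$, which is an assumption brought in from general statistics rather than something stated in this deck. It is applied to the Triple A values purely for illustration.

```python
def adjusted_r2(r2, n, k):
    """Adjusted r^2 (standard definition; not taken from the slides).

    r2: coefficient of determination
    n:  number of observations
    k:  number of independent variables
    """
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)


r2 = 15.625 / 22.5   # Triple A Construction: SSR / SST
print(f"adjusted r^2 with k=1: {adjusted_r2(r2, n=6, k=1):.4f}")

# Hypothetical comparison: a second variable that leaves r^2 unchanged
# still costs a degree of freedom, so the adjusted value falls.
print(f"adjusted r^2 with k=2: {adjusted_r2(r2, n=6, k=2):.4f}")
```

This illustrates the note above: a variable that adds nothing to $r^2$ lowers the adjusted $r^2$.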

### Model Building (continued)

Collinearity or multicollinearity exists when an independent variable is correlated with another independent variable.

- Collinearity and multicollinearity create problems in the coefficients.
- The overall model prediction is still good; however, individual interpretation of the variables is questionable.

### Nonlinear Regression

Nonlinear relationships may exist between variables, thereby requiring a transformation of one or more variables to achieve linearity.

- Transformations may be used to turn a nonlinear model into a linear model.

### Automobile Example: Nonlinear Regression

Engineers at Colonel Motors want to use regression analysis to improve fuel efficiency. They are studying the impact of weight on miles per gallon (MPG).

| MPG | Weight |
|---|---|
| 12 | 4.58 |
| 13 | 4.66 |
| 15 | 4.02 |
| 18 | 2.53 |
| 19 | 3.09 |
| 19 | 3.11 |
| 20 | 3.18 |
| 23 | 2.68 |
| 24 | 2.65 |
| 33 | 1.70 |
| 36 | 1.95 |
| 42 | 1.92 |

### Automobile Example (continued)

[Figure: MPG plotted against weight, with a linear regression line and a nonlinear regression line.]

Perhaps a nonlinear relationship exists?

### Automobile Example (continued)

- Linear regression model: $\text{MPG} = 47.8 - 8.2\,(\text{weight})$, with F significance $= 0.0003$ and $r^2 = 0.7446$.
- Nonlinear (transformed variable) regression model: $\text{MPG} = 79.8 - 30.2\,(\text{weight}) + 3.4\,(\text{weight})^2$, with F significance $= 0.0002$ and $r^2 = 0.8478$.

Which model is best? What are the difficulties with interpreting the individual coefficients?
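The transformed model is still linear in its coefficients: weight squared is simply treated as a second independent variable. The sketch below refits both models on the automobile data using only the standard library. Solving the normal equations by Gaussian elimination is a choice made here for self-containment, not the method used in the slides (which rely on Excel), and the helper names `solve`, `fit`, and `r_squared` are hypothetical.

```python
mpg = [12, 13, 15, 18, 19, 19, 20, 23, 24, 33, 36, 42]
weight = [4.58, 4.66, 4.02, 2.53, 3.09, 3.11,
          3.18, 2.68, 2.65, 1.70, 1.95, 1.92]


def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]   # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit(columns, y):
    """Least squares coefficients [b0, b1, ...] via the normal equations."""
    rows = [[1.0] + list(vals) for vals in zip(*columns)]
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]
    return solve(xtx, xty), rows


def r_squared(coeffs, rows, y):
    """Coefficient of determination, 1 - SSE / SST."""
    y_bar = sum(y) / len(y)
    fitted = [sum(b * v for b, v in zip(coeffs, r)) for r in rows]
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    sst = sum((yi - y_bar) ** 2 for yi in y)
    return 1 - sse / sst


lin_coeffs, lin_rows = fit([weight], mpg)
quad_coeffs, quad_rows = fit([weight, [w ** 2 for w in weight]], mpg)
r2_lin = r_squared(lin_coeffs, lin_rows, mpg)
r2_quad = r_squared(quad_coeffs, quad_rows, mpg)

print("linear:   ", [round(b, 2) for b in lin_coeffs], round(r2_lin, 4))
print("quadratic:", [round(b, 2) for b in quad_coeffs], round(r2_quad, 4))
```

Because weight and weight squared are strongly correlated, the individual coefficients of the quadratic model are hard to interpret on their own, which is the difficulty the closing question points at.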

### Cautions and Pitfalls

- If the assumptions are not met, the statistical tests may not be valid.
- Correlation does not mean causation.
- Multicollinearity causes problems with coefficient interpretation.

### Cautions and Pitfalls (continued)

- Prediction beyond the range of $X$ values in the sample can be misleading, including interpretation of the intercept ($X = 0$).
- A linear regression model may not be the best model, even in the presence of a significant F-test.
- A statistically significant relationship does not necessarily have practical value.
