© 2000 Prentice-Hall, Inc. Chap Forecasting Using the Simple Linear Regression Model and Correlation
What is a forecast?
- Using a statistical method on past data to predict the future.
- Using experience, judgment and surveys to predict the future.

Why forecast?
- To enhance planning.
- To force thinking about the future.
- To fit corporate strategy to future conditions.
- To coordinate departments around the same view of the future.
- To reduce corporate costs.

Kinds of Forecasts
- Causal forecasts predict a variable (Y) from changes in the other variables (X's) that cause it to change.
- Time series forecasts predict a variable (Y) from its own prior values.
- Regression can provide both kinds of forecasts.
Types of Relationships
[Figure: four scatter plots showing a positive linear relationship, a negative linear relationship, a relationship that is not linear, and no relationship.]

Relationships
If the relationship is not linear, the forecaster often has to use mathematical transformations to make the relationship linear.
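As a minimal sketch of such a transformation, the snippet below takes hypothetical data that follow an exponential curve and shows that the natural log of Y is linear in X. The data, the choice of a log transformation, and the helper name `fit_line` are illustrative assumptions, not part of the slides; other curves call for other transformations.

```python
import math

def fit_line(xs, ys):
    """Least-squares intercept and slope for y = b0 + b1 * x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b1 = sxy / sxx
    return y_bar - b1 * x_bar, b1

# Hypothetical data generated from Y = 2 * exp(0.5 * X): curved, not linear.
xs = [0, 1, 2, 3, 4]
ys = [2 * math.exp(0.5 * x) for x in xs]

# Taking logs gives ln(Y) = ln(2) + 0.5 * X, which is a straight line in X.
log_ys = [math.log(y) for y in ys]
b0, b1 = fit_line(xs, log_ys)

print(f"ln(Y) = {b0:.4f} + {b1:.4f} X")
```

The fitted slope recovers the growth rate 0.5 and the intercept recovers ln(2), so forecasts of Y are obtained by exponentiating the fitted value of ln(Y).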
Correlation Analysis
- Correlation measures the strength of the linear relationship between variables.
- It can be used to find the best predictor variables.
- It does not establish that there is a causal relationship between the variables.

The Correlation Coefficient
- Ranges between -1 and 1.
- The closer to -1, the stronger the negative linear relationship.
- The closer to 1, the stronger the positive linear relationship.
- The closer to 0, the weaker any linear relationship.
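The range described above can be checked with a short sketch that computes the sample correlation coefficient from its usual definition, r = S_xy / sqrt(S_xx * S_yy). The two small data sets and the function name `correlation` are hypothetical and chosen only for illustration.

```python
import math

def correlation(xs, ys):
    """Sample (Pearson) correlation coefficient r."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

xs = [1, 2, 3, 4, 5]

# Hypothetical data with a fairly strong positive linear relationship.
r_pos = correlation(xs, [2, 4, 5, 4, 5])

# Hypothetical data lying exactly on a downward-sloping line.
r_neg = correlation(xs, [10, 8, 6, 4, 2])

print(round(r_pos, 4), round(r_neg, 4))
```

The first value is about 0.77 and the second is -1, matching the reading that values near the ends of the range indicate a strong linear relationship.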
Graphs of Various Correlation (r) Values
[Figure: five scatter plots of Y against X for r = -1, r = -.6, r = 0, r = .6 and r = 1.]

The Scatter Diagram
- A plot of all (X_i, Y_i) pairs.
- Used to visualize the relationship and to assess its linearity.
- Can also be used to identify outliers.
Regression Analysis
Regression analysis can be used to model causality and make predictions.
Terminology:
- The variable to be predicted is called the dependent or response variable.
- The variables used in the prediction model are called independent, explanatory or predictor variables.

Simple Linear Regression Model
- The relationship between the variables is described by a linear function.
- The model treats a change in the independent variable as causing the dependent variable to change.
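Population Linear Regression
The population regression line is a straight line that describes the dependence of one variable on the other.
\[ Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i \]
where \(Y_i\) is the dependent (response) variable, \(X_i\) is the independent (explanatory) variable, \(\beta_0\) is the population Y intercept, \(\beta_1\) is the population slope coefficient, and \(\varepsilon_i\) is the random error.
Population regression line: \(\mu_{YX} = \beta_0 + \beta_1 X_i\)

How is the best line found?
[Figure: observed values of Y plotted against X around the line; \(\varepsilon_i\) = random error, the vertical distance between an observed value and the line.]

Sample Linear Regression
The sample regression line provides an estimate of the population regression line.
\[ Y_i = b_0 + b_1 X_i + e_i \]
Sample regression line: \(\hat{Y}_i = b_0 + b_1 X_i\)
where \(b_0\) is the sample Y intercept (it provides an estimate of \(\beta_0\)), \(b_1\) is the sample slope coefficient (it provides an estimate of \(\beta_1\)), and \(e_i\) is the residual.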
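The slides do not preserve how the sample intercept and slope are computed. Assuming the usual least-squares criterion (choose b0 and b1 to minimize the sum of squared residuals), a minimal sketch looks like this. The data and the name `fit_line` are hypothetical.

```python
def fit_line(xs, ys):
    """Return (b0, b1) that minimize the sum of squared residuals."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b1 = sxy / sxx            # sample slope coefficient
    b0 = y_bar - b1 * x_bar   # sample Y intercept
    return b0, b1

# Hypothetical sample of five (X, Y) pairs.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

b0, b1 = fit_line(xs, ys)

# Residuals e_i = Y_i - (b0 + b1 * X_i); for a least-squares line they sum to zero.
residuals = [y - (b0 + b1 * x) for x, y in zip(xs, ys)]

print(f"Y-hat = {b0:.1f} + {b1:.1f} X")
```

For this sample the fitted line has intercept 2.2 and slope 0.6.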
Simple Linear Regression: An Example
You wish to examine the relationship between the square footage of produce stores and their annual sales. Sample data for 7 stores were obtained. Find the equation of the straight line that fits the data best.

Store   Square Feet   Annual Sales ($1000)
1       1,726         3,…
2       …,542         3,…
3       …,816         6,…
4       …,555         9,…
5       …,292         3,…
6       …,208         5,…
7       …,313         3,760
The Scatter Diagram
[Figure: Excel scatter diagram of annual sales against square feet.]

The Equation for the Regression Line
[Excel printout of the regression coefficients.]

Graph of the Regression Line
[Figure: the fitted line \(\hat{Y}_i = b_0 + b_1 X_i\) drawn through the scatter diagram.]
Interpreting the Results
\(\hat{Y}_i = b_0 + b_1 X_i\), with slope \(b_1 = 1.487\)
The slope of 1.487 means that for each increase of one unit in X, we predict the average of Y to increase by an estimated 1.487 units.
The model estimates that for each increase of 1 square foot in the size of the store, expected annual sales increase by $1,487 (sales are measured in $1000).
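The unit conversion behind the $1,487 figure can be written out as a short worked equation. It assumes only what the slides state: sales are recorded in thousands of dollars and the predicted increase per square foot is $1,487, which corresponds to a slope of 1.487. The 100 square foot case is an added illustration.

```latex
\[
\Delta \hat{Y} = b_1 \, \Delta X = 1.487 \times 1 = 1.487 \ \text{(thousand dollars)} = \$1{,}487
\]
\[
\Delta X = 100 \ \text{square feet:} \quad
\Delta \hat{Y} = 1.487 \times 100 = 148.7 \ \text{(thousand dollars)} = \$148{,}700
\]
```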
The Coefficient of Determination
\[ r^2 = \frac{SSR}{SST} = \frac{\text{regression sum of squares}}{\text{total sum of squares}} \]
The coefficient of determination (\(r^2\)) measures the proportion of variation in Y explained by the independent variable X.
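A small sketch of this ratio, using hypothetical data and a least-squares line: SSR sums the squared distances of the fitted values from the mean of Y, and SST sums the squared distances of the observed values from the mean of Y. The function name `sums_of_squares` is an assumption made for the example.

```python
def sums_of_squares(xs, ys):
    """Return (SSR, SST) for the least-squares line through the data."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    fitted = [b0 + b1 * x for x in xs]
    ssr = sum((f - y_bar) ** 2 for f in fitted)  # variation explained by X
    sst = sum((y - y_bar) ** 2 for y in ys)      # total variation in Y
    return ssr, sst

# Hypothetical sample of five (X, Y) pairs.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

ssr, sst = sums_of_squares(xs, ys)
r2 = ssr / sst

print(f"SSR = {ssr:.1f}, SST = {sst:.1f}, r^2 = {r2:.2f}")
```

Here SSR is 3.6 and SST is 6.0, so 60% of the variation in Y is explained by X.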
Coefficients of Determination (r^2) and Correlation (r)
[Figures: four scatter plots of Y against X, each with the fitted line \(\hat{Y}_i = b_0 + b_1 X_i\).]
- r^2 = 1, r = +1
- r^2 = .81, r = +0.9
- r^2 = 0, r = 0
- r^2 = 1, r = -1
Correlation: The Symbols
- The population correlation coefficient \(\rho\) ('rho') measures the strength of the linear relationship between two variables.
- The sample correlation coefficient r estimates \(\rho\) based on a set of sample observations.

Example: Produce Stores
[Excel printout of the regression statistics.]
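Inferences About the Slope: t Test
t test for a population slope: is there a linear relationship between X and Y?
Null and alternative hypotheses:
\(H_0: \beta_1 = 0\) (no linear relationship)
\(H_1: \beta_1 \neq 0\) (linear relationship)
Test statistic:
\[ t = \frac{b_1 - \beta_1}{S_{b_1}}, \qquad \text{where } S_{b_1} = \frac{S_{YX}}{\sqrt{\sum_{i=1}^{n} (X_i - \bar{X})^2}} \quad \text{and } df = n - 2 \]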
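The test can be sketched end to end on a small hypothetical sample. The standard error of the estimate is taken as S_YX = sqrt(SSE / (n - 2)), the usual definition, and the critical value 3.1824 is the two-tailed .05 value from a t table for df = 3 (the hypothetical sample has n = 5). The data and variable names are illustrative assumptions.

```python
import math

# Hypothetical sample of five (X, Y) pairs, so df = n - 2 = 3.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
T_CRIT_DF3 = 3.1824  # two-tailed critical value for alpha = .05, df = 3

n = len(xs)
x_bar = sum(xs) / n
y_bar = sum(ys) / n
sxx = sum((x - x_bar) ** 2 for x in xs)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
b1 = sxy / sxx
b0 = y_bar - b1 * x_bar

# Standard error of the estimate, S_YX, from the residual sum of squares.
sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))
s_yx = math.sqrt(sse / (n - 2))

# Standard error of the slope and the t statistic under H0: beta_1 = 0.
s_b1 = s_yx / math.sqrt(sxx)
t_stat = (b1 - 0) / s_b1

reject_h0 = abs(t_stat) > T_CRIT_DF3
print(f"t = {t_stat:.4f}, reject H0: {reject_h0}")
```

For this sample t is about 2.12, which is inside the critical values, so H0 is not rejected: with only five observations the evidence of a linear relationship is not significant at the .05 level.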
Example: Produce Stores
Data for 7 stores: the square feet and annual sales ($000) listed in the earlier example.
Estimated regression equation: \(\hat{Y}_i = b_0 + b_1 X_i\)
The slope of this model is 1.487.
Is the square footage of the store affecting its annual sales?
Inferences About the Slope: t Test Example
\(H_0: \beta_1 = 0\)
\(H_1: \beta_1 \neq 0\)
\(\alpha = .05\), \(df = 7 - 2 = 5\)
Critical value(s): \(\pm 2.5706\) (a rejection region of .025 in each tail)
Test statistic: t, from the Excel printout
Decision: Reject \(H_0\)
Conclusion: There is evidence of a linear relationship.
Inferences About the Slope Using a Confidence Interval
Confidence interval estimate of the slope:
\[ b_1 \pm t_{n-2} \, S_{b_1} \]
Excel printout for produce stores: at the 95% level of confidence, the confidence interval for the slope is (1.062, 1.911). It does not include 0.
Conclusion: There is a significant linear relationship between annual sales and the size of the store.
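As a rough check on the interval quoted from the Excel printout, the sketch below works backwards from (1.062, 1.911). It assumes the interval has the form b1 plus or minus t times S_b1 with the t table value 2.5706 for df = 5; the implied slope and standard error are back-calculated from the rounded endpoints, not read from the printout, and the function name is hypothetical.

```python
T_CRIT_DF5 = 2.5706  # two-tailed critical value for alpha = .05, df = 5

def slope_confidence_interval(b1, s_b1, t_crit):
    """Confidence interval b1 +/- t * S_b1 for the population slope."""
    margin = t_crit * s_b1
    return b1 - margin, b1 + margin

# Interval endpoints as quoted on the slide.
lower, upper = 1.062, 1.911

# The midpoint is the sample slope; the half-width divided by t is S_b1.
b1_implied = (lower + upper) / 2
s_b1_implied = (upper - lower) / (2 * T_CRIT_DF5)

# Rebuilding the interval from the implied values reproduces the endpoints.
rebuilt = slope_confidence_interval(b1_implied, s_b1_implied, T_CRIT_DF5)
excludes_zero = not (rebuilt[0] <= 0 <= rebuilt[1])

print(f"b1 = {b1_implied:.4f}, S_b1 = {s_b1_implied:.4f}, excludes 0: {excludes_zero}")
```

The implied slope of about 1.4865 agrees with the $1,487 per square foot interpretation, and because the whole interval lies above 0 the interval leads to the same decision as the t test.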
Residual Analysis
- Is used to evaluate the validity of the assumptions.
- Uses numerical measures and plots to check the validity of the assumptions.

Linear Regression Assumptions
1. X is linearly related to Y.
2. The variance of the errors is constant for each value of X (homoscedasticity).
3. The residual error is normally distributed.
4. If the data are collected over time, then the errors must be independent.

Residual Analysis for Linearity
[Figure: plots of Y against X and of the residuals e against X for two cases, Not Linear and Linear.]
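To illustrate what a residual plot reveals, the sketch below fits a straight line to hypothetical data that actually follow Y = X squared and lists the residuals in order of X. The data and names are assumptions made for the example; in practice the residuals would be plotted against X.

```python
def fit_line(xs, ys):
    """Least-squares intercept and slope for y = b0 + b1 * x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b1 = sxy / sxx
    return y_bar - b1 * x_bar, b1

# Hypothetical data from a curved relationship, Y = X ** 2.
xs = [1, 2, 3, 4, 5]
ys = [x ** 2 for x in xs]

b0, b1 = fit_line(xs, ys)

# Residuals e_i = Y_i - Y-hat_i, listed in order of increasing X.
residuals = [y - (b0 + b1 * x) for x, y in zip(xs, ys)]

print(residuals)
```

The residuals are positive at both ends and negative in the middle. A systematic pattern like this, rather than a random scatter around zero, indicates that a straight line is not an adequate model.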
Residual Analysis for Homoscedasticity
[Figure: plots of Y against X and of the residuals e against X for two cases, Heteroscedasticity and Homoscedasticity.]

Residual Analysis for Independence: The Durbin-Watson Statistic
- It is used when data are collected over time.
- It detects autocorrelation; that is, the residuals in one time period are related to the residuals in another time period.
- It measures violation of the independence assumption.
- Calculate D and compare it to the value in Table E.8.
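The slide does not show how D is calculated. Under its standard definition, D is the sum of squared differences between successive residuals divided by the sum of squared residuals. The sketch below applies that definition to two hypothetical residual series; the function name and data are illustrative assumptions.

```python
def durbin_watson(residuals):
    """Durbin-Watson D: sum of squared successive differences over sum of squares."""
    numerator = sum(
        (residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals))
    )
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator

# Hypothetical residuals in time order: long runs of the same sign
# (positive autocorrelation) versus a sign flip every period
# (negative autocorrelation).
d_runs = durbin_watson([1, 1, 1, -1, -1, -1])
d_alternating = durbin_watson([1, -1, 1, -1, 1, -1])

print(round(d_runs, 3), round(d_alternating, 3))
```

D lies between 0 and 4. Values near 2 are consistent with independent residuals, values toward 0 point to positive autocorrelation, and values toward 4 point to negative autocorrelation; the formal decision still comes from comparing D with the tabled critical values.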
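Preparing Confidence Intervals for Forecasts

Interval Estimates for Different Values of X
[Figure: the line \(\hat{Y}_i = b_0 + b_1 X_i\) plotted against X, with a band showing the confidence interval for the mean of Y and a wider band showing the confidence interval for an individual \(Y_i\) at a given X; \(\bar{X}\) is marked on the X axis.]

Estimation of Predicted Values
Confidence interval estimate for \(\mu_{YX}\), the mean of Y given a particular \(X_i\):
\[ \hat{Y}_i \pm t_{n-2} \, S_{YX} \sqrt{\frac{1}{n} + \frac{(X_i - \bar{X})^2}{\sum_{i=1}^{n} (X_i - \bar{X})^2}} \]
where \(t_{n-2}\) is the t value from the table with \(df = n - 2\) and \(S_{YX}\) is the standard error of the estimate. The size of the interval varies according to the distance of \(X_i\) from the mean, \(\bar{X}\).

Estimation of Predicted Values
Confidence interval estimate for an individual response \(Y_i\) at a particular \(X_i\):
\[ \hat{Y}_i \pm t_{n-2} \, S_{YX} \sqrt{1 + \frac{1}{n} + \frac{(X_i - \bar{X})^2}{\sum_{i=1}^{n} (X_i - \bar{X})^2}} \]
The addition of 1 under the square root makes this interval wider than the interval for the mean of Y.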
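The two interval estimates can be compared side by side on a small hypothetical sample. The sketch assumes the standard forms of the two intervals (the term under the square root is 1/n plus the squared distance of X from its mean over S_xx, with an extra 1 for an individual response) and uses the t table value 3.1824 for df = 3. The data, the value X = 4, and all names are illustrative.

```python
import math

# Hypothetical sample of five (X, Y) pairs, so df = n - 2 = 3.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
T_CRIT_DF3 = 3.1824  # two-tailed critical value for alpha = .05, df = 3
x_new = 4            # X value at which Y is forecast

n = len(xs)
x_bar = sum(xs) / n
y_bar = sum(ys) / n
sxx = sum((x - x_bar) ** 2 for x in xs)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
b1 = sxy / sxx
b0 = y_bar - b1 * x_bar

# Standard error of the estimate, S_YX.
sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))
s_yx = math.sqrt(sse / (n - 2))

y_hat = b0 + b1 * x_new
h = 1 / n + (x_new - x_bar) ** 2 / sxx  # grows as x_new moves away from x_bar

# Half-widths: mean of Y at x_new versus one individual Y at x_new.
margin_mean = T_CRIT_DF3 * s_yx * math.sqrt(h)
margin_individual = T_CRIT_DF3 * s_yx * math.sqrt(1 + h)

print(f"mean of Y:    {y_hat:.2f} +/- {margin_mean:.2f}")
print(f"individual Y: {y_hat:.2f} +/- {margin_individual:.2f}")
```

Both intervals are centered on the same predicted value of 4.6, but the interval for an individual response is roughly twice as wide here because it also carries the scatter of single observations around the line.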
Example: Produce Stores
Data for 7 stores: the square feet and annual sales ($000) listed in the earlier example.
Regression model obtained: \(\hat{Y}_i = b_0 + b_1 X_i\)
Predict the annual sales for a store with 2,000 square feet.

Estimation of Predicted Values: Example
Find the 95% confidence interval for the average annual sales for stores of 2,000 square feet.
Predicted sales: \(\hat{Y}_i = b_0 + b_1 X_i\) = … ($000)
\(\bar{X}\) = …, \(S_{YX}\) = …, \(t_{n-2} = t_5 = 2.5706\)
Confidence interval estimate for \(\mu_{YX}\) (the mean of Y):
\[ \hat{Y}_i \pm t_{n-2} \, S_{YX} \sqrt{\frac{1}{n} + \frac{(X_i - \bar{X})^2}{\sum_{i=1}^{n} (X_i - \bar{X})^2}} \]

Estimation of Predicted Values: Example
Find the 95% confidence interval for the annual sales of one particular store of 2,000 square feet.
Predicted sales: \(\hat{Y}_i = b_0 + b_1 X_i\) = … ($000)
\(\bar{X}\) = …, \(S_{YX}\) = …, \(t_{n-2} = t_5 = 2.5706\)
Confidence interval estimate for an individual Y:
\[ \hat{Y}_i \pm t_{n-2} \, S_{YX} \sqrt{1 + \frac{1}{n} + \frac{(X_i - \bar{X})^2}{\sum_{i=1}^{n} (X_i - \bar{X})^2}} \]