© 2000 Prentice-Hall, Inc. Chap. 9 - 1 Forecasting Using the Simple Linear Regression Model and Correlation.


1 Forecasting Using the Simple Linear Regression Model and Correlation

2 What is a forecast? Using a statistical method on past data to predict the future. Using experience, judgment and surveys to predict the future.

3 Why forecast? To enhance planning. To force thinking about the future. To fit corporate strategy to future conditions. To coordinate departments to the same future. To reduce corporate costs.

4 Kinds of Forecasts: Causal forecasts apply when changes in the variable you wish to predict (Y) are caused by changes in other variables (X's). Time series forecasts predict changes in a variable (Y) from prior values of itself (Y). Regression can provide both kinds of forecasts.

5 Types of Relationships [figure panels: Positive Linear Relationship; Negative Linear Relationship]

6 Types of Relationships (continued) [figure panels: Relationship NOT Linear; No Relationship]

7 Relationships: If the relationship is not linear, the forecaster often has to use mathematical transformations to make the relationship linear.
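As an illustration of such a transformation, here is a minimal sketch assuming an exponential relationship. The data are hypothetical (generated from y = 2·exp(0.5x)) and are not from the slides; taking the natural log of Y is one common choice among several possible transformations.

```python
import math

# Hypothetical data following y = 2 * exp(0.5 * x); not taken from the slides.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0 * math.exp(0.5 * x) for x in xs]

# Taking logs turns the exponential curve into a straight line:
#   ln(y) = ln(2) + 0.5 * x
log_ys = [math.log(y) for y in ys]

# For equally spaced x, a linear relationship has constant successive differences.
diffs = [b - a for a, b in zip(log_ys, log_ys[1:])]
print(diffs)
```

After the transformation, a straight line can be fitted to (x, ln y) instead of (x, y).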

8 Correlation Analysis: Correlation measures the strength of the linear relationship between variables. It can be used to find the best predictor variables. It does not assure that there is a causal relationship between the variables.

9 The Correlation Coefficient: Ranges between -1 and 1. The closer to -1, the stronger the negative linear relationship. The closer to 1, the stronger the positive linear relationship. The closer to 0, the weaker any linear relationship.
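A minimal sketch of how the sample correlation coefficient can be computed, using the usual Pearson formula. The function name `pearson_r` and the three small data sets are hypothetical, chosen only to hit the extreme cases described above.

```python
import math

def pearson_r(xs, ys):
    """Sample (Pearson) correlation coefficient of two equal-length lists."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

xs = [1, 2, 3, 4, 5]
r_pos = pearson_r(xs, [2, 4, 6, 8, 10])   # points on a rising line
r_neg = pearson_r(xs, [10, 8, 6, 4, 2])   # points on a falling line
r_zero = pearson_r(xs, [2, 1, 0, 1, 2])   # V-shaped: related, but not linearly
print(r_pos, r_neg, r_zero)
```

The V-shaped case shows that r near 0 only rules out a linear relationship, not every relationship.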

10 Graphs of Various Correlation (r) Values [figure panels, Y against X: r = -1, r = -.6, r = 0, r = .6, r = 1]

11 The Scatter Diagram: Plot of all \((X_i, Y_i)\) pairs

12 The Scatter Diagram: Is used to visualize the relationship and to assess its linearity. The scatter diagram can also be used to identify outliers.

13 Regression Analysis: Regression analysis can be used to model causality and make predictions. Terminology: The variable to be predicted is called the dependent or response variable. The variables used in the prediction model are called independent, explanatory or predictor variables.

14 Simple Linear Regression Model: The relationship between variables is described by a linear function. A change of one variable causes the other variable to change.

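15 Population Linear Regression: The population regression line is a straight line that describes the dependence of one variable on the other. Model: \( Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i \), where \(Y_i\) is the dependent (response) variable, \(X_i\) is the independent (explanatory) variable, \(\beta_0\) is the population Y intercept, \(\beta_1\) is the population slope coefficient, and \(\varepsilon_i\) is the random error. Population regression line: \( \mu_{YX} = \beta_0 + \beta_1 X_i \).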

16 How is the best line found? [figure, Y against X: observed value; \(\varepsilon_i\) = random error]

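17 Sample Linear Regression: The sample regression line provides an estimate of the population regression line. \( Y_i = b_0 + b_1 X_i + e_i \), where \(b_0\) is the sample Y intercept (it provides an estimate of the population Y intercept \(\beta_0\)), \(b_1\) is the sample slope coefficient (it provides an estimate of the population slope \(\beta_1\)), and \(e_i\) is the residual. Sample regression line: \( \hat{Y}_i = b_0 + b_1 X_i \).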
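The transcript does not show how the sample intercept and slope are obtained. Assuming the usual least-squares criterion (an assumption here, since the slide's own formulas are not reproduced), \(b_0\) and \(b_1\) are chosen to minimize the sum of squared residuals, which leads to the closed form below.

```latex
\begin{align}
  \min_{b_0,\, b_1} \; \sum_{i=1}^{n} e_i^2
    &= \sum_{i=1}^{n} \left( Y_i - b_0 - b_1 X_i \right)^2 \\
  b_1 &= \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}
              {\sum_{i=1}^{n} (X_i - \bar{X})^2} \\
  b_0 &= \bar{Y} - b_1 \bar{X}
\end{align}
```

Here \(\bar{X}\) and \(\bar{Y}\) are the sample means of X and Y.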

18 Simple Linear Regression: An Example. You wish to examine the relationship between the square footage of produce stores and their annual sales. Sample data for 7 stores were obtained. Find the equation of the straight line that fits the data best.
Store   Square Feet   Annual Sales ($1000)
1       1,726         3,681
2       1,542         3,395
3       2,816         6,653
4       5,555         9,543
5       1,292         3,318
6       2,208         5,563
7       1,313         3,760

19 The Scatter Diagram [figure: Excel output]

20 The Equation for the Regression Line [from Excel printout]

21 Graph of the Regression Line [figure]: \( \hat{Y}_i = 1636.415 + 1.487 X_i \)
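The Excel printout is not reproduced in this transcript. As a rough cross-check, the following sketch fits the line to the seven produce stores with the standard least-squares formulas. The helper name `fit_line` is hypothetical, and the result should agree with the slide's coefficients only up to rounding.

```python
# Produce-store data from the example slide:
# square feet (X) and annual sales in $1000 (Y).
square_feet = [1726, 1542, 2816, 5555, 1292, 2208, 1313]
annual_sales = [3681, 3395, 6653, 9543, 3318, 5563, 3760]

def fit_line(xs, ys):
    """Return (b0, b1) that minimize the sum of squared residuals."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    ssx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b1 = sxy / ssx              # sample slope
    b0 = y_bar - b1 * x_bar     # sample Y intercept
    return b0, b1

b0, b1 = fit_line(square_feet, annual_sales)
print(f"Y_hat = {b0:.3f} + {b1:.3f} X")
```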

22 Interpreting the Results: \( \hat{Y}_i = 1636.415 + 1.487 X_i \). The slope of 1.487 means that for each increase of one unit in X, we predict the average of Y to increase by an estimated 1.487 units. Because sales are recorded in thousands of dollars, the model estimates that for each increase of 1 square foot in the size of the store, the expected annual sales are predicted to increase by $1,487.

23 The Coefficient of Determination: \( r^2 = \frac{SSR}{SST} = \frac{\text{regression sum of squares}}{\text{total sum of squares}} \). The coefficient of determination (\(r^2\)) measures the proportion of variation in Y explained by the independent variable X.
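A minimal sketch of this ratio for the produce-store data, assuming a least-squares fit. The variable names are illustrative, and the decomposition SST = SSR + SSE used in the final comment is the standard one rather than something stated on the slide.

```python
import math

# Produce-store data from the slides: square feet (X), annual sales in $1000 (Y).
xs = [1726, 1542, 2816, 5555, 1292, 2208, 1313]
ys = [3681, 3395, 6653, 9543, 3318, 5563, 3760]

n = len(xs)
x_bar = sum(xs) / n
y_bar = sum(ys) / n
b1 = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
      / sum((x - x_bar) ** 2 for x in xs))
b0 = y_bar - b1 * x_bar
fitted = [b0 + b1 * x for x in xs]

sst = sum((y - y_bar) ** 2 for y in ys)               # total sum of squares
ssr = sum((f - y_bar) ** 2 for f in fitted)           # regression sum of squares
sse = sum((y - f) ** 2 for y, f in zip(ys, fitted))   # error sum of squares

r2 = ssr / sst
r = math.copysign(math.sqrt(r2), b1)  # r takes the sign of the slope
print(f"r^2 = {r2:.3f}, r = {r:.3f}")
# For a least-squares fit, SST equals SSR + SSE up to floating-point error.
```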

24 Coefficients of Determination (\(r^2\)) and Correlation (\(r\)) [figure, Y against X, with line \(\hat{Y}_i = b_0 + b_1 X_i\): \(r^2 = 1\), \(r = +1\)]

25 Coefficients of Determination (\(r^2\)) and Correlation (\(r\)) (continued) [figure, Y against X, with line \(\hat{Y}_i = b_0 + b_1 X_i\): \(r^2 = .81\), \(r = +0.9\)]

26 Coefficients of Determination (\(r^2\)) and Correlation (\(r\)) (continued) [figure, Y against X, with line \(\hat{Y}_i = b_0 + b_1 X_i\): \(r^2 = 0\), \(r = 0\)]

27 Coefficients of Determination (\(r^2\)) and Correlation (\(r\)) (continued) [figure, Y against X, with line \(\hat{Y}_i = b_0 + b_1 X_i\): \(r^2 = 1\), \(r = -1\)]

28 Correlation: The Symbols. The population correlation coefficient \(\rho\) ('rho') measures the strength of the linear relationship between two variables. The sample correlation coefficient \(r\) estimates \(\rho\) based on a set of sample observations.

29 Example: Produce Stores [from Excel printout]

30 Inferences About the Slope: t Test for a Population Slope. Is there a linear relationship between X and Y? Null and alternative hypotheses: \(H_0: \beta_1 = 0\) (no linear relationship); \(H_1: \beta_1 \neq 0\) (linear relationship). Test statistic: \( t = \frac{b_1 - \beta_1}{S_{b_1}} \) with df = n - 2, where \( S_{b_1} = \frac{S_{YX}}{\sqrt{\sum_{i=1}^{n} (X_i - \bar{X})^2}} \).

31 Example: Produce Stores. Data for 7 stores:
Store   Square Feet   Annual Sales ($000)
1       1,726         3,681
2       1,542         3,395
3       2,816         6,653
4       5,555         9,543
5       1,292         3,318
6       2,208         5,563
7       1,313         3,760
Estimated regression equation: \( \hat{Y}_i = 1636.415 + 1.487 X_i \). The slope of this model is 1.487. Is the square footage of the store affecting its annual sales?

32 Inferences About the Slope: t Test Example. \(H_0: \beta_1 = 0\); \(H_1: \beta_1 \neq 0\); \(\alpha = .05\); df = 7 - 2 = 5. Critical value(s): \(\pm 2.5706\) (rejection region of .025 in each tail). Test statistic: from Excel printout. Decision: Reject \(H_0\). Conclusion: There is evidence of a linear relationship.

33 Inferences About the Slope Using a Confidence Interval. Confidence interval estimate of the slope: \( b_1 \pm t_{n-2} S_{b_1} \). From the Excel printout for produce stores: at the 95% level of confidence, the confidence interval for the slope is (1.062, 1.911), which does not include 0. Conclusion: There is a significant linear relationship between annual sales and the size of the store.
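The Excel printout behind the t test and the slope interval is not reproduced in this transcript. The sketch below recomputes both from the produce-store data using the standard formulas for the standard error of the estimate and of the slope. The critical value 2.5706 is taken from the slides, the variable names are illustrative, and the output should match the slide's interval only up to rounding.

```python
import math

# Produce-store data from the slides: square feet (X), annual sales in $1000 (Y).
xs = [1726, 1542, 2816, 5555, 1292, 2208, 1313]
ys = [3681, 3395, 6653, 9543, 3318, 5563, 3760]
T_CRIT = 2.5706  # t value for df = 5, alpha = .05 two-tailed, as quoted on the slides

n = len(xs)
x_bar = sum(xs) / n
y_bar = sum(ys) / n
ssx = sum((x - x_bar) ** 2 for x in xs)
b1 = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / ssx
b0 = y_bar - b1 * x_bar

# Standard error of the estimate: S_YX = sqrt(SSE / (n - 2)).
sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))
s_yx = math.sqrt(sse / (n - 2))

# Standard error of the slope and the t statistic for H0: beta_1 = 0.
s_b1 = s_yx / math.sqrt(ssx)
t_stat = (b1 - 0.0) / s_b1

# 95% confidence interval for the slope: b1 +/- t * S_b1.
lower = b1 - T_CRIT * s_b1
upper = b1 + T_CRIT * s_b1

print(f"t = {t_stat:.2f}, reject H0: {abs(t_stat) > T_CRIT}")
print(f"95% CI for the slope: ({lower:.3f}, {upper:.3f})")
```

Both routes lead to the same decision: the t statistic exceeds the critical value exactly when the interval excludes 0.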

34 Residual Analysis: Is used to evaluate the validity of the regression assumptions. Residual analysis uses numerical measures and plots to check whether the assumptions hold.
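As a starting point for such an analysis, this sketch computes the residuals \(e_i = Y_i - \hat{Y}_i\) for the produce-store example, assuming a least-squares fit. Plotting them against X (not done here) is the usual next step; the variable names are illustrative.

```python
# Produce-store data from the slides: square feet (X), annual sales in $1000 (Y).
xs = [1726, 1542, 2816, 5555, 1292, 2208, 1313]
ys = [3681, 3395, 6653, 9543, 3318, 5563, 3760]

n = len(xs)
x_bar = sum(xs) / n
y_bar = sum(ys) / n
b1 = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
      / sum((x - x_bar) ** 2 for x in xs))
b0 = y_bar - b1 * x_bar

# Residual = observed value minus the value predicted by the fitted line.
residuals = [y - (b0 + b1 * x) for x, y in zip(xs, ys)]

for store, (x, e) in enumerate(zip(xs, residuals), start=1):
    print(f"store {store}: X = {x:5d}, residual = {e:9.2f}")

# Least-squares residuals sum to zero (up to floating-point error).
print(f"sum of residuals = {sum(residuals):.6f}")
```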

35 Linear Regression Assumptions: 1. X is linearly related to Y. 2. The variance of the errors is constant for each value of X (homoscedasticity). 3. The residual error is normally distributed. 4. If the data are collected over time, then the errors must be independent.

36 Residual Analysis for Linearity [figure panels: Not Linear; Linear; each shows Y against X and residuals e against X]

37 Residual Analysis for Homoscedasticity [figure panels: Heteroscedasticity; Homoscedasticity; each shows Y against X and residuals e against X]

38 Residual Analysis for Independence: The Durbin-Watson Statistic. It is used when data are collected over time. It detects autocorrelation; that is, the residuals in one time period are related to residuals in another time period. It measures violation of the independence assumption. Calculate D and compare it to the value in Table E.8.
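The slide does not give the formula for D. A minimal sketch using the standard definition, the sum of squared successive differences of the residuals divided by the sum of squared residuals, is shown below. Both residual series are hypothetical, and the critical values from Table E.8 are not reproduced here.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic D for residuals listed in time order."""
    num = sum((residuals[i] - residuals[i - 1]) ** 2
              for i in range(1, len(residuals)))
    den = sum(e ** 2 for e in residuals)
    return num / den

# Hypothetical residual series, in time order (not from the slides).
trending = [1.0, 0.8, 0.6, 0.3, -0.2, -0.5, -0.9, -1.1]   # neighbours are alike
alternating = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0]           # neighbours flip sign

d_trend = durbin_watson(trending)
d_alt = durbin_watson(alternating)
print(f"trending:    D = {d_trend:.2f}")
print(f"alternating: D = {d_alt:.2f}")
```

D lies between 0 and 4. Values near 2 are consistent with independent errors, values well below 2 point to positive autocorrelation, and values well above 2 point to negative autocorrelation.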

39 Preparing Confidence Intervals for Forecasts

40 Interval Estimates for Different Values of X [figure, Y against X, around the line \(\hat{Y}_i = b_0 + b_1 X_i\): confidence interval for the mean of Y and confidence interval for an individual \(Y_i\) at a given X; \(\bar{X}\) marked on the X axis]

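41 Estimation of Predicted Values: Confidence interval estimate for \(\mu_{YX}\), the mean of Y given a particular \(X_i\): \( \hat{Y}_i \pm t_{n-2} S_{YX} \sqrt{\frac{1}{n} + \frac{(X_i - \bar{X})^2}{\sum_{i=1}^{n} (X_i - \bar{X})^2}} \), where \(t_{n-2}\) is the t value from the table with df = n - 2 and \(S_{YX}\) is the standard error of the estimate. The size of the interval varies according to the distance of \(X_i\) from the mean, \(\bar{X}\).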

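42 Estimation of Predicted Values: Confidence interval estimate for an individual response \(Y_i\) at a particular \(X_i\): \( \hat{Y}_i \pm t_{n-2} S_{YX} \sqrt{1 + \frac{1}{n} + \frac{(X_i - \bar{X})^2}{\sum_{i=1}^{n} (X_i - \bar{X})^2}} \). The addition of 1 under the square root increases the width of the interval from that for the mean of Y.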

43 Example: Produce Stores. Data for 7 stores:
Store   Square Feet   Annual Sales ($000)
1       1,726         3,681
2       1,542         3,395
3       2,816         6,653
4       5,555         9,543
5       1,292         3,318
6       2,208         5,563
7       1,313         3,760
Regression model obtained: \( \hat{Y}_i = 1636.415 + 1.487 X_i \). Predict the annual sales for a store with 2,000 square feet.

44 Estimation of Predicted Values: Example. Find the 95% confidence interval for the average annual sales for stores of 2,000 square feet. Predicted sales: \( \hat{Y}_i = 1636.415 + 1.487 X_i = 4610.45 \) ($000). With \(\bar{X} = 2350.29\), \(S_{YX} = 611.75\) and \(t_{n-2} = t_5 = 2.5706\), the confidence interval estimate for \(\mu_{YX}\) (the mean of Y) is \( \hat{Y}_i \pm t_{n-2} S_{YX} \sqrt{\frac{1}{n} + \frac{(X_i - \bar{X})^2}{\sum_{i=1}^{n} (X_i - \bar{X})^2}} = 4610.45 \pm 612.66 \).

45 Estimation of Predicted Values: Example. Find the 95% confidence interval for annual sales of one particular store of 2,000 square feet. Predicted sales: \( \hat{Y}_i = 1636.415 + 1.487 X_i = 4610.45 \) ($000). With \(\bar{X} = 2350.29\), \(S_{YX} = 611.75\) and \(t_{n-2} = t_5 = 2.5706\), the confidence interval estimate for an individual Y is \( \hat{Y}_i \pm t_{n-2} S_{YX} \sqrt{1 + \frac{1}{n} + \frac{(X_i - \bar{X})^2}{\sum_{i=1}^{n} (X_i - \bar{X})^2}} = 4610.45 \pm 1687.68 \).
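Both intervals for a 2,000 square foot store can be recomputed from the raw data, as in the sketch below. It assumes a least-squares fit and reuses the t value 2.5706 quoted on the slides; the function name `interval_half_widths` is hypothetical. Because the code carries unrounded coefficients, its point prediction differs slightly from the slide's 4610.45, while the half-widths should agree with 612.66 and 1687.68 up to rounding.

```python
import math

# Produce-store data from the slides: square feet (X), annual sales in $1000 (Y).
xs = [1726, 1542, 2816, 5555, 1292, 2208, 1313]
ys = [3681, 3395, 6653, 9543, 3318, 5563, 3760]
T_CRIT = 2.5706  # t value for df = n - 2 = 5 at 95% confidence, from the slides

n = len(xs)
x_bar = sum(xs) / n
y_bar = sum(ys) / n
ssx = sum((x - x_bar) ** 2 for x in xs)
b1 = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / ssx
b0 = y_bar - b1 * x_bar
sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))
s_yx = math.sqrt(sse / (n - 2))  # standard error of the estimate

def interval_half_widths(x_new):
    """Return (prediction, half-width for mean of Y, half-width for individual Y)."""
    y_hat = b0 + b1 * x_new
    h = 1.0 / n + (x_new - x_bar) ** 2 / ssx
    half_mean = T_CRIT * s_yx * math.sqrt(h)
    half_individual = T_CRIT * s_yx * math.sqrt(1.0 + h)  # the extra 1 widens it
    return y_hat, half_mean, half_individual

y_hat, half_mean, half_ind = interval_half_widths(2000)
print(f"predicted sales: {y_hat:.2f} ($000)")
print(f"mean of Y:       +/- {half_mean:.2f}")
print(f"individual Y:    +/- {half_ind:.2f}")
```

Calling `interval_half_widths` with X values farther from the sample mean shows both intervals widening, as the earlier slide on interval estimates describes.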

