1
© 2000 Prentice-Hall, Inc. Chap. 9 - 1 Forecasting Using the Simple Linear Regression Model and Correlation
2
What is a forecast?
Using a statistical method on past data to predict the future.
Using experience, judgment and surveys to predict the future.
3
Why forecast?
To enhance planning.
To force thinking about the future.
To fit corporate strategy to future conditions.
To coordinate departments to the same future.
To reduce corporate costs.
4
Kinds of Forecasts
Causal forecasts predict a variable (Y) whose changes are caused by changes in other variables (X's).
Time series forecasts predict a variable (Y) from prior values of itself.
Regression can provide both kinds of forecasts.
5
Types of Relationships
[Figure: two scatter plots, one showing a positive linear relationship and one showing a negative linear relationship]
6
Types of Relationships (continued)
[Figure: two scatter plots, one showing a relationship that is not linear and one showing no relationship]
7
Relationships
If the relationship is not linear, the forecaster often has to apply a mathematical transformation to one or both variables to make the relationship linear.
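A minimal sketch of one such transformation, assuming made-up data in which Y grows exponentially with X (the values and the helper name `first_differences` are illustrative, not from the slides). For equally spaced X, a straight line has constant steps in Y; the raw steps below keep growing, while the steps in ln(Y) are constant.

```python
import math

def first_differences(values):
    """Return the step between each consecutive pair of values."""
    return [b - a for a, b in zip(values, values[1:])]

# Hypothetical data: Y = 2 * exp(0.5 * X), a curved (non-linear) relationship.
x = [1, 2, 3, 4, 5]
y = [2.0 * math.exp(0.5 * xi) for xi in x]

# Steps in raw Y grow with X, so Y is not linear in X.
raw_steps = first_differences(y)

# After the log transform, ln(Y) = ln(2) + 0.5 * X, so every step equals 0.5.
log_steps = first_differences([math.log(yi) for yi in y])

print("steps in Y:    ", [round(s, 3) for s in raw_steps])
print("steps in ln(Y):", [round(s, 3) for s in log_steps])
```

A regression fitted to (X, ln Y) then predicts ln(Y); predictions have to be transformed back with the exponential to return to the original units.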
8
Correlation Analysis
Correlation measures the strength of the linear relationship between variables.
It can be used to find the best predictor variables.
It does not establish that there is a causal relationship between the variables.
9
The Correlation Coefficient
Ranges between -1 and 1.
The closer to -1, the stronger the negative linear relationship.
The closer to 1, the stronger the positive linear relationship.
The closer to 0, the weaker any linear relationship.
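The three boundary cases above can be sketched in Python. The function name `pearson_r` and the small data sets are illustrative assumptions chosen so the answers are easy to check by hand; they are not taken from the slides.

```python
import math

def pearson_r(xs, ys):
    """Sample correlation coefficient r for two equal-length lists."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

x = [-2, -1, 0, 1, 2]

# Hypothetical examples of the extremes and the middle of the range.
r_positive = pearson_r(x, [2 * xi + 1 for xi in x])  # perfect positive line
r_negative = pearson_r(x, [-3 * xi for xi in x])     # perfect negative line
r_none = pearson_r(x, [xi ** 2 for xi in x])         # symmetric curve, no linear trend

print(r_positive, r_negative, r_none)
```

The last case is a reminder that r near 0 rules out a linear relationship only; Y here is completely determined by X.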
10
Graphs of Various Correlation (r) Values
[Figure: five scatter plots of Y against X, with r = -1, r = -.6, r = 0, r = .6 and r = 1]
11
The Scatter Diagram
Plot of all \((X_i, Y_i)\) pairs
12
The Scatter Diagram
Is used to visualize the relationship and to assess its linearity.
The scatter diagram can also be used to identify outliers.
13
Regression Analysis
Regression analysis can be used to model causality and make predictions.
Terminology:
The variable to be predicted is called the dependent or response variable.
The variables used in the prediction model are called independent, explanatory or predictor variables.
14
Simple Linear Regression Model
The relationship between variables is described by a linear function.
A change in one variable causes the other variable to change.
15
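Population Linear Regression
The population regression line is a straight line that describes the dependence of one variable on the other.
\[ Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i \]
where \(Y_i\) is the dependent (response) variable, \(X_i\) is the independent (explanatory) variable, \(\beta_0\) is the population Y intercept, \(\beta_1\) is the population slope coefficient, and \(\varepsilon_i\) is the random error. The population regression line is \(\mu_{YX} = \beta_0 + \beta_1 X_i\).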
16
How is the best line found?
[Figure: observed values of Y plotted against X around a straight line; the random error \(\varepsilon_i\) is the vertical distance between an observed value and the line]
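The transcript poses this question without recording the answer. Assuming the deck relies on ordinary least squares, which is the method behind Excel's regression output, the best line is the one whose intercept \(b_0\) and slope \(b_1\) minimize the sum of squared vertical distances between the observed and fitted values:

```latex
\min_{b_0,\, b_1} \sum_{i=1}^{n} \left( Y_i - b_0 - b_1 X_i \right)^2
\quad\Longrightarrow\quad
b_1 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{\sum_{i=1}^{n} (X_i - \bar{X})^2},
\qquad
b_0 = \bar{Y} - b_1 \bar{X}
```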
17
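Sample Linear Regression
The sample regression line provides an estimate of the population regression line.
\[ Y_i = b_0 + b_1 X_i + e_i \]
where \(b_0\) is the sample Y intercept (it provides an estimate of \(\beta_0\)), \(b_1\) is the sample slope coefficient (it provides an estimate of \(\beta_1\)), and \(e_i\) is the residual. The sample regression line is \(\hat{Y}_i = b_0 + b_1 X_i\).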
18
Simple Linear Regression: An Example
You wish to examine the relationship between the square footage of produce stores and their annual sales. Sample data for 7 stores were obtained. Find the equation of the straight line that fits the data best.

Store   Square Feet   Annual Sales ($1000)
1       1,726         3,681
2       1,542         3,395
3       2,816         6,653
4       5,555         9,543
5       1,292         3,318
6       2,208         5,563
7       1,313         3,760
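The fit can be reproduced without Excel. The sketch below assumes ordinary least squares and uses the seven stores from the table; the function name `fit_line` is an illustrative choice, not something from the slides.

```python
# Produce store data from the slide: size in square feet, annual sales in $1000.
square_feet = [1726, 1542, 2816, 5555, 1292, 2208, 1313]
annual_sales = [3681, 3395, 6653, 9543, 3318, 5563, 3760]

def fit_line(xs, ys):
    """Return (b0, b1), the least-squares intercept and slope."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    ssxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    ssx = sum((x - x_bar) ** 2 for x in xs)
    b1 = ssxy / ssx           # slope
    b0 = y_bar - b1 * x_bar   # intercept: the line passes through (x_bar, y_bar)
    return b0, b1

b0, b1 = fit_line(square_feet, annual_sales)
print(f"Y-hat = {b0:.3f} + {b1:.3f} X")
```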
19
The Scatter Diagram
[Figure: Excel output, scatter diagram of the produce store data]
20
The Equation for the Regression Line
From Excel Printout: [Table: Excel output of the regression coefficients]
21
Graph of the Regression Line
\(\hat{Y}_i = 1636.415 + 1.487 X_i\)
[Figure: graph of the regression line]
22
Interpreting the Results
\(\hat{Y}_i = 1636.415 + 1.487 X_i\)
The slope of 1.487 means that for each increase of one unit in X, we predict the average of Y to increase by an estimated 1.487 units.
Because sales are recorded in $1000, the model estimates that for each increase of 1 square foot in the size of the store, expected annual sales increase by $1,487.
23
The Coefficient of Determination
\[ r^2 = \frac{SSR}{SST} = \frac{\text{regression sum of squares}}{\text{total sum of squares}} \]
The coefficient of determination (\(r^2\)) measures the proportion of variation in Y explained by the independent variable X.
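As a rough illustration of the ratio, the sketch below computes SSR, SST and \(r^2\) for the produce store data used elsewhere in the deck, assuming a least-squares fit. Variable names are illustrative.

```python
# Produce store data: size in square feet, annual sales in $1000.
square_feet = [1726, 1542, 2816, 5555, 1292, 2208, 1313]
annual_sales = [3681, 3395, 6653, 9543, 3318, 5563, 3760]

n = len(square_feet)
x_bar = sum(square_feet) / n
y_bar = sum(annual_sales) / n

# Least-squares slope and intercept.
ssx = sum((x - x_bar) ** 2 for x in square_feet)
ssxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(square_feet, annual_sales))
b1 = ssxy / ssx
b0 = y_bar - b1 * x_bar

fitted = [b0 + b1 * x for x in square_feet]

# SST: total variation of Y around its mean.
sst = sum((y - y_bar) ** 2 for y in annual_sales)
# SSR: variation of the fitted values around the mean (explained by X).
ssr = sum((f - y_bar) ** 2 for f in fitted)

r_squared = ssr / sst
print(f"SSR = {ssr:.1f}, SST = {sst:.1f}, r^2 = {r_squared:.4f}")
```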
24
Coefficients of Determination (\(r^2\)) and Correlation (\(r\))
[Figure: Y against X with the fitted line \(\hat{Y}_i = b_0 + b_1 X_i\); \(r^2 = 1\), \(r = +1\)]
25
Coefficients of Determination (\(r^2\)) and Correlation (\(r\)) (continued)
[Figure: Y against X with the fitted line \(\hat{Y}_i = b_0 + b_1 X_i\); \(r^2 = .81\), \(r = +0.9\)]
26
Coefficients of Determination (\(r^2\)) and Correlation (\(r\)) (continued)
[Figure: Y against X with the fitted line \(\hat{Y}_i = b_0 + b_1 X_i\); \(r^2 = 0\), \(r = 0\)]
27
Coefficients of Determination (\(r^2\)) and Correlation (\(r\)) (continued)
[Figure: Y against X with the fitted line \(\hat{Y}_i = b_0 + b_1 X_i\); \(r^2 = 1\), \(r = -1\)]
28
Correlation: The Symbols
The population correlation coefficient \(\rho\) ('rho') measures the strength of the linear relationship between two variables.
The sample correlation coefficient \(r\) estimates \(\rho\) based on a set of sample observations.
29
Example: Produce Stores
From Excel Printout: [Table: Excel regression output for the produce store data]
30
Inferences About the Slope: t Test for a Population Slope
Is there a linear relationship between X and Y?
Null and alternative hypotheses:
\(H_0: \beta_1 = 0\) (no linear relationship)
\(H_1: \beta_1 \neq 0\) (linear relationship)
Test statistic:
\[ t = \frac{b_1 - \beta_1}{S_{b_1}}, \qquad df = n - 2 \]
where
\[ S_{b_1} = \frac{S_{YX}}{\sqrt{\sum_{i=1}^{n} (X_i - \bar{X})^2}} \]
31
Example: Produce Stores
Data for 7 stores:

Store   Square Feet   Annual Sales ($000)
1       1,726         3,681
2       1,542         3,395
3       2,816         6,653
4       5,555         9,543
5       1,292         3,318
6       2,208         5,563
7       1,313         3,760

Estimated regression equation: \(\hat{Y}_i = 1636.415 + 1.487 X_i\)
The slope of this model is 1.487. Does the square footage of a store affect its annual sales?
32
Inferences About the Slope: t Test Example
\(H_0: \beta_1 = 0\)
\(H_1: \beta_1 \neq 0\)
\(\alpha = .05\)
\(df = 7 - 2 = 5\)
Critical values: \(t = \pm 2.5706\) (a rejection region of .025 in each tail)
Test statistic: from Excel printout
Decision: Reject \(H_0\)
Conclusion: There is evidence of a linear relationship.
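The transcript does not preserve the test statistic from the Excel printout. As a hedged sketch, it can be recomputed from the seven stores, assuming the usual formulas \(S_{YX} = \sqrt{SSE/(n-2)}\) and \(S_{b_1} = S_{YX}/\sqrt{\sum (X_i - \bar{X})^2}\). The critical value 2.5706 is the one quoted on the slide; variable names are illustrative.

```python
import math

# Produce store data: size in square feet, annual sales in $1000.
square_feet = [1726, 1542, 2816, 5555, 1292, 2208, 1313]
annual_sales = [3681, 3395, 6653, 9543, 3318, 5563, 3760]

n = len(square_feet)
x_bar = sum(square_feet) / n
y_bar = sum(annual_sales) / n
ssx = sum((x - x_bar) ** 2 for x in square_feet)
ssxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(square_feet, annual_sales))
b1 = ssxy / ssx
b0 = y_bar - b1 * x_bar

# Standard error of the estimate, from the sum of squared residuals.
sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(square_feet, annual_sales))
s_yx = math.sqrt(sse / (n - 2))

# Standard error of the slope, then the t statistic under H0: beta1 = 0.
s_b1 = s_yx / math.sqrt(ssx)
t_stat = (b1 - 0.0) / s_b1

t_critical = 2.5706  # two-tailed, alpha = .05, df = 5 (value from the slide)
reject_h0 = abs(t_stat) > t_critical
print(f"t = {t_stat:.4f}, reject H0: {reject_h0}")
```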
33
Inferences About the Slope Using a Confidence Interval
Confidence interval estimate of the slope:
\[ b_1 \pm t_{n-2} S_{b_1} \]
Excel printout for produce stores: at the 95% level of confidence, the confidence interval for the slope is (1.062, 1.911), which does not include 0.
Conclusion: There is a significant linear relationship between annual sales and the size of the store.
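A minimal sketch of how the interval can be recomputed from the raw data, assuming a least-squares fit and the slide's t value of 2.5706 for 5 degrees of freedom. Variable names are illustrative.

```python
import math

# Produce store data: size in square feet, annual sales in $1000.
square_feet = [1726, 1542, 2816, 5555, 1292, 2208, 1313]
annual_sales = [3681, 3395, 6653, 9543, 3318, 5563, 3760]

n = len(square_feet)
x_bar = sum(square_feet) / n
y_bar = sum(annual_sales) / n
ssx = sum((x - x_bar) ** 2 for x in square_feet)
ssxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(square_feet, annual_sales))
b1 = ssxy / ssx
b0 = y_bar - b1 * x_bar

# Standard error of the estimate and of the slope.
sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(square_feet, annual_sales))
s_yx = math.sqrt(sse / (n - 2))
s_b1 = s_yx / math.sqrt(ssx)

t_value = 2.5706  # t for 95% confidence with df = n - 2 = 5 (from the slide)
lower = b1 - t_value * s_b1
upper = b1 + t_value * s_b1

print(f"95% CI for the slope: ({lower:.3f}, {upper:.3f})")
print("interval excludes 0:", not (lower <= 0.0 <= upper))
```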
34
Residual Analysis
Residual analysis is used to evaluate the validity of the regression assumptions.
It uses numerical measures and plots to check that the assumptions hold.
35
Linear Regression Assumptions
1. X is linearly related to Y.
2. The variance of the errors is constant for each value of X (homoscedasticity).
3. The residual error is normally distributed.
4. If the data are collected over time, then the errors must be independent.
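Checking these assumptions starts from the residuals \(e_i = Y_i - \hat{Y}_i\). The sketch below computes them for the produce store data, assuming a least-squares fit, and prints the (X, e) pairs that a residual plot would show. Variable names are illustrative.

```python
# Produce store data: size in square feet, annual sales in $1000.
square_feet = [1726, 1542, 2816, 5555, 1292, 2208, 1313]
annual_sales = [3681, 3395, 6653, 9543, 3318, 5563, 3760]

n = len(square_feet)
x_bar = sum(square_feet) / n
y_bar = sum(annual_sales) / n
ssx = sum((x - x_bar) ** 2 for x in square_feet)
ssxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(square_feet, annual_sales))
b1 = ssxy / ssx
b0 = y_bar - b1 * x_bar

# Residual = observed Y minus fitted Y.
residuals = [y - (b0 + b1 * x) for x, y in zip(square_feet, annual_sales)]

# Sorted by X, these pairs are what a residual plot against X displays.
for x, e in sorted(zip(square_feet, residuals)):
    print(f"X = {x:5d}   e = {e:9.2f}")

# With an intercept in the model, least-squares residuals sum to zero.
print(f"sum of residuals = {sum(residuals):.6f}")
```

A curved pattern in these pairs would point to non-linearity, and a spread that widens or narrows with X would point to heteroscedasticity.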
36
Residual Analysis for Linearity
[Figure: plots of Y against X and of the residuals e against X, for a relationship that is not linear and for one that is linear]
37
Residual Analysis for Homoscedasticity
[Figure: plots of Y against X and of the residuals e against X, for a case of heteroscedasticity and for a case of homoscedasticity]
38
Residual Analysis for Independence: The Durbin-Watson Statistic
It is used when data are collected over time.
It detects autocorrelation; that is, the residuals in one time period are related to the residuals in another time period.
It measures violation of the independence assumption.
Calculate D and compare it to the value in Table E.8.
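The slide does not give the formula for D. A minimal sketch, assuming the standard definition \(D = \sum_{t=2}^{n} (e_t - e_{t-1})^2 / \sum_{t=1}^{n} e_t^2\); the function name and both residual series are made up for illustration.

```python
def durbin_watson(residuals):
    """Durbin-Watson D for residuals listed in time order."""
    numerator = sum((curr - prev) ** 2
                    for prev, curr in zip(residuals, residuals[1:]))
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator

# Hypothetical residuals that stay on one side for several periods
# (positive autocorrelation): D falls well below 2.
d_positive = durbin_watson([1, 1, 1, -1, -1, -1])

# Hypothetical residuals that flip sign every period
# (negative autocorrelation): D rises above 2.
d_negative = durbin_watson([1, -1, 1, -1])

print(f"D (positive autocorrelation) = {d_positive:.3f}")
print(f"D (negative autocorrelation) = {d_negative:.3f}")
```

D lies between 0 and 4, and values near 2 suggest no autocorrelation. The formal decision still depends on the tabulated critical values the slide refers to.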
39
Preparing Confidence Intervals for Forecasts
40
Interval Estimates for Different Values of X
[Figure: Y against X with the sample regression line \(\hat{Y}_i = b_0 + b_1 X_i\), showing at a given X the confidence interval for the mean of Y and the confidence interval for an individual \(Y_i\); \(\bar{X}\) is marked on the X axis]
41
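Estimation of Predicted Values
Confidence interval estimate for \(\mu_{YX}\), the mean of Y given a particular \(X_i\):
\[ \hat{Y}_i \pm t_{n-2} \, S_{YX} \sqrt{\frac{1}{n} + \frac{(X_i - \bar{X})^2}{\sum_{j=1}^{n} (X_j - \bar{X})^2}} \]
where \(t_{n-2}\) is the t value from the table with df = n - 2 and \(S_{YX}\) is the standard error of the estimate.
The size of the interval varies according to the distance of \(X_i\) from the mean, \(\bar{X}\).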
42
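Estimation of Predicted Values
Confidence interval estimate for an individual response \(Y_i\) at a particular \(X_i\):
\[ \hat{Y}_i \pm t_{n-2} \, S_{YX} \sqrt{1 + \frac{1}{n} + \frac{(X_i - \bar{X})^2}{\sum_{j=1}^{n} (X_j - \bar{X})^2}} \]
The addition of 1 under the square root increases the width of the interval from that for the mean of Y.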
43
Example: Produce Stores
Data for 7 stores:

Store   Square Feet   Annual Sales ($000)
1       1,726         3,681
2       1,542         3,395
3       2,816         6,653
4       5,555         9,543
5       1,292         3,318
6       2,208         5,563
7       1,313         3,760

Regression model obtained: \(\hat{Y}_i = 1636.415 + 1.487 X_i\)
Predict the annual sales for a store with 2,000 square feet.
44
Estimation of Predicted Values: Example
Confidence interval estimate for \(\mu_{YX}\): find the 95% confidence interval for the average annual sales for stores of 2,000 square feet.
Predicted sales: \(\hat{Y}_i = 1636.415 + 1.487 X_i = 4610.45\) ($000)
\(\bar{X} = 2350.29\), \(S_{YX} = 611.75\), \(t_{n-2} = t_5 = 2.5706\)
Confidence interval for mean Y:
\[ \hat{Y}_i \pm t_{n-2} \, S_{YX} \sqrt{\frac{1}{n} + \frac{(X_i - \bar{X})^2}{\sum_{j=1}^{n} (X_j - \bar{X})^2}} = 4610.45 \pm 612.66 \]
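The half-width of this interval can be checked from the raw data. The sketch assumes a least-squares fit and the slide's t value; variable names are illustrative. Because it keeps the coefficients unrounded, its point prediction comes out slightly below the slide's 4610.45, while the half-width is unaffected.

```python
import math

# Produce store data: size in square feet, annual sales in $1000.
square_feet = [1726, 1542, 2816, 5555, 1292, 2208, 1313]
annual_sales = [3681, 3395, 6653, 9543, 3318, 5563, 3760]

n = len(square_feet)
x_bar = sum(square_feet) / n
y_bar = sum(annual_sales) / n
ssx = sum((x - x_bar) ** 2 for x in square_feet)
ssxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(square_feet, annual_sales))
b1 = ssxy / ssx
b0 = y_bar - b1 * x_bar

sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(square_feet, annual_sales))
s_yx = math.sqrt(sse / (n - 2))  # standard error of the estimate

x_new = 2000       # store size of interest, in square feet
t_value = 2.5706   # t for 95% confidence with df = 5 (from the slide)

y_hat = b0 + b1 * x_new
# The interval for the mean widens as x_new moves away from x_bar.
half_width = t_value * s_yx * math.sqrt(1 / n + (x_new - x_bar) ** 2 / ssx)

print(f"mean sales at {x_new} sq ft: {y_hat:.2f} +/- {half_width:.2f} ($000)")
```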
45
Estimation of Predicted Values: Example
Confidence interval estimate for an individual Y: find the 95% confidence interval for the annual sales of one particular store of 2,000 square feet.
Predicted sales: \(\hat{Y}_i = 1636.415 + 1.487 X_i = 4610.45\) ($000)
\(\bar{X} = 2350.29\), \(S_{YX} = 611.75\), \(t_{n-2} = t_5 = 2.5706\)
Confidence interval for individual Y:
\[ \hat{Y}_i \pm t_{n-2} \, S_{YX} \sqrt{1 + \frac{1}{n} + \frac{(X_i - \bar{X})^2}{\sum_{j=1}^{n} (X_j - \bar{X})^2}} = 4610.45 \pm 1687.68 \]
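A companion sketch for the individual-store interval, under the same assumptions (least-squares fit, the slide's t value, illustrative variable names). The only change from the interval for the mean is the extra 1 under the square root.

```python
import math

# Produce store data: size in square feet, annual sales in $1000.
square_feet = [1726, 1542, 2816, 5555, 1292, 2208, 1313]
annual_sales = [3681, 3395, 6653, 9543, 3318, 5563, 3760]

n = len(square_feet)
x_bar = sum(square_feet) / n
y_bar = sum(annual_sales) / n
ssx = sum((x - x_bar) ** 2 for x in square_feet)
ssxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(square_feet, annual_sales))
b1 = ssxy / ssx
b0 = y_bar - b1 * x_bar

sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(square_feet, annual_sales))
s_yx = math.sqrt(sse / (n - 2))  # standard error of the estimate

x_new = 2000       # store size of interest, in square feet
t_value = 2.5706   # t for 95% confidence with df = 5 (from the slide)

leverage = 1 / n + (x_new - x_bar) ** 2 / ssx
mean_half_width = t_value * s_yx * math.sqrt(leverage)
# Individual response: the added 1 accounts for one store's own random error.
individual_half_width = t_value * s_yx * math.sqrt(1 + leverage)

print(f"half-width for the mean:       {mean_half_width:.2f}")
print(f"half-width for one store:      {individual_half_width:.2f}")
```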