# Chapter 14, Part B: Simple Linear Regression


Slide 1: Slides by John Loucks, St. Edward’s University. © 2014 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part.

Slide 2: Chapter 14, Part B: Simple Linear Regression. Topics: Using the Estimated Regression Equation for Estimation and Prediction; Residual Analysis: Validating Model Assumptions; Residual Analysis: Outliers and Influential Observations; Computer Solution.

Slide 3: Using the Estimated Regression Equation for Estimation and Prediction. A confidence interval is an interval estimate of the mean value of y for a given value of x. A prediction interval is used whenever we want to predict an individual value of y for a new observation corresponding to a given value of x. The margin of error is larger for a prediction interval.

Slide 4: Using the Estimated Regression Equation for Estimation and Prediction. Confidence interval estimate of $E(y^*)$: $\hat{y}^* \pm t_{\alpha/2}\, s_{\hat{y}^*}$. Prediction interval estimate of $y^*$: $\hat{y}^* \pm t_{\alpha/2}\, s_{\text{pred}}$, where the confidence coefficient is $1 - \alpha$ and $t_{\alpha/2}$ is based on a t distribution with $n - 2$ degrees of freedom.

Slide 5: Point Estimation. If 3 TV ads are run prior to a sale, we expect the mean number of cars sold to be $\hat{y} = 10 + 5(3) = 25$ cars.
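The point estimate above is the estimated regression equation evaluated at x = 3. A minimal Python sketch, using the coefficients b0 = 10 and b1 = 5 from the slides; the helper name `predict_cars` is an illustrative assumption, not something from the deck:

```python
# Coefficients of the estimated regression equation, as given on the slides.
B0 = 10.0  # intercept
B1 = 5.0   # slope: additional cars sold per TV ad


def predict_cars(ads: float) -> float:
    """Point estimate of cars sold for a given number of TV ads (hypothetical helper)."""
    return B0 + B1 * ads


y_hat = predict_cars(3)
print(y_hat)  # 25.0
```

The same number serves as the center of both the confidence interval and the prediction interval; only the margin of error differs.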

Slide 6: Confidence Interval for $E(y^*)$. Estimate of the standard deviation of $\hat{y}^*$: $s_{\hat{y}^*} = s\sqrt{\dfrac{1}{n} + \dfrac{(x^* - \bar{x})^2}{\sum (x_i - \bar{x})^2}}$.

Slide 7: Confidence Interval for $E(y^*)$. The 95% confidence interval estimate of the mean number of cars sold when 3 TV ads are run is: $25 \pm 3.1824(1.4491) = 25 \pm 4.61$, or 20.39 to 29.61 cars.
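The interval on this slide is the point estimate plus or minus the t value times the estimated standard deviation of the fitted value. The sketch below reproduces the arithmetic with the three numbers quoted on the slide; the helper `interval` is a hypothetical name introduced here for illustration:

```python
# Values quoted on the slide: point estimate, t critical value for 95%
# confidence with n - 2 = 3 degrees of freedom, and the estimated standard
# deviation of the fitted value at x = 3.
Y_HAT = 25.0
T_CRIT = 3.1824
S_YHAT = 1.4491


def interval(center: float, t_crit: float, std_err: float):
    """Return (margin, lower, upper) for center +/- t_crit * std_err (hypothetical helper)."""
    margin = t_crit * std_err
    return margin, center - margin, center + margin


margin, lower, upper = interval(Y_HAT, T_CRIT, S_YHAT)
print(f"margin = {margin:.2f}, interval = ({lower:.2f}, {upper:.2f})")
# margin = 4.61, interval = (20.39, 29.61)
```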

Slide 8: Prediction Interval for $y^*$. Estimate of the standard deviation of an individual value of $y^*$: $s_{\text{pred}} = s\sqrt{1 + \dfrac{1}{n} + \dfrac{(x^* - \bar{x})^2}{\sum (x_i - \bar{x})^2}}$.

Slide 9: Prediction Interval for $y^*$. The 95% prediction interval estimate of the number of cars sold in one particular week when 3 TV ads are run is: $25 \pm 3.1824(2.6013) = 25 \pm 8.28$, or 16.72 to 33.28 cars.
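One way to see why the prediction interval is wider: the variance of an individual value adds the error variance s² to the variance of the fitted value, so s_pred = sqrt(s² + s_ŷ*²). The sketch below checks this against the slide's 2.6013, taking s = 2.16025 from the Minitab output and s_ŷ* = 1.4491 from the confidence interval slide. The variable names are illustrative assumptions.

```python
import math

Y_HAT = 25.0      # point estimate at x = 3
T_CRIT = 3.1824   # t critical value, 95% confidence, 3 degrees of freedom
S = 2.16025       # standard error of the estimate (Minitab "S")
S_YHAT = 1.4491   # estimated standard deviation of the fitted value

# Standard deviation of an individual value: error variance plus the
# variance of the fitted value.
s_pred = math.sqrt(S ** 2 + S_YHAT ** 2)

margin = T_CRIT * s_pred
lower, upper = Y_HAT - margin, Y_HAT + margin
print(f"s_pred is about {s_pred:.3f}")
print(f"margin = {margin:.2f}, interval = ({lower:.2f}, {upper:.2f})")
# margin = 8.28, interval = (16.72, 33.28)
```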

Slide 10: Computer Solution. Performing the regression analysis computations without the help of a computer can be quite time consuming. On the next slide we show Minitab output for the Reed Auto Sales example. Recall that the independent variable was named Ads and the dependent variable was named Cars in the example.

Slide 11: Computer Solution: Minitab Output.

Estimated regression equation: Cars = 10.0 + 5.00 Ads

| Predictor | Coef | SE Coef | T | p |
|---|---|---|---|---|
| Constant | 10.000 | 2.366 | 4.23 | 0.024 |
| Ads | 5.0000 | 1.080 | 4.63 | 0.019 |

S = 2.16025, R-sq = 87.7%, R-sq(adj) = 83.6%

Analysis of Variance (ANOVA table):

| Source | DF | SS | MS | F | p |
|---|---|---|---|---|---|
| Regression | 1 | 100 | 100 | 21.43 | 0.019 |
| Residual Error | 3 | 14 | 4.667 | | |
| Total | 4 | 114 | | | |

Predicted Values for New Observations (interval estimates):

| New Obs | Fit | SE Fit | 95% C.I. | 95% P.I. |
|---|---|---|---|---|
| 1 | 25 | 1.45 | (20.39, 29.61) | (16.72, 33.28) |
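Most of the summary statistics in this output can be recomputed from the sums of squares and degrees of freedom in the ANOVA table. The following sketch does that arithmetic in plain Python, using only numbers from the output; the variable names are illustrative assumptions.

```python
import math

# Sums of squares and degrees of freedom from the ANOVA table.
SSR, SSE, SST = 100.0, 14.0, 114.0
DF_REG, DF_ERR, DF_TOT = 1, 3, 4

msr = SSR / DF_REG                                # mean square regression
mse = SSE / DF_ERR                                # mean square error
s = math.sqrt(mse)                                # standard error of the estimate
f_stat = msr / mse                                # F statistic
r_sq = SSR / SST                                  # coefficient of determination
r_sq_adj = 1 - (SSE / DF_ERR) / (SST / DF_TOT)    # adjusted R-sq

# t statistics are Coef / SE Coef from the coefficient table.
t_const = 10.000 / 2.366
t_ads = 5.0000 / 1.080

print(f"MSE = {mse:.3f}, S = {s:.5f}, F = {f_stat:.2f}")
print(f"R-sq = {100 * r_sq:.1f}%, R-sq(adj) = {100 * r_sq_adj:.1f}%")
print(f"t(Constant) = {t_const:.2f}, t(Ads) = {t_ads:.2f}")
```

With a single predictor, the squared t statistic for Ads agrees with F up to rounding of the printed coefficients.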

Slide 12: Minitab Output. Minitab prints the estimated regression equation as Cars = 10.0 + 5.00 Ads. For each of the coefficients b0 and b1, the output shows its value, standard deviation, t value, and p-value. Minitab prints the standard error of the estimate, s, as well as information about the goodness of fit. The standard ANOVA table is printed. Also provided are the 95% confidence interval estimate of the expected number of cars sold and the 95% prediction interval estimate of the number of cars sold for an individual weekend with 3 ads.

Slide 13: Residual Analysis. If the assumptions about the error term ε appear questionable, the hypothesis tests about the significance of the regression relationship and the interval estimation results may not be valid. The residuals provide the best information about ε. Residual for observation i: $y_i - \hat{y}_i$. Much of the residual analysis is based on an examination of graphical plots.
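A residual is the observed value minus the fitted value for the same observation. The sketch below computes residuals for five (ads, cars) pairs. Note that the underlying data are not shown in this transcript: the values used here are an assumption, chosen because they are consistent with the reported fit (b0 = 10, b1 = 5) and with SSE = 14 in the ANOVA table.

```python
# ASSUMPTION: illustrative (ads, cars) observations, not taken from this
# transcript. They are consistent with the reported coefficients and SSE.
ads = [1, 3, 2, 1, 3]
cars = [14, 24, 18, 17, 27]

B0, B1 = 10.0, 5.0  # coefficients reported on the slides

fitted = [B0 + B1 * x for x in ads]
residuals = [y - y_hat for y, y_hat in zip(cars, fitted)]
sse = sum(e ** 2 for e in residuals)

print(residuals)  # [-1.0, -1.0, -2.0, 2.0, 2.0]
print(sse)        # 14.0
```

Plotting these residuals against `ads` would give the kind of residual plot the following slides discuss.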

Slide 14: Residual Plot Against x. If the assumption that the variance of ε is the same for all values of x is valid, and the assumed regression model is an adequate representation of the relationship between the variables, then the residual plot should give an overall impression of a horizontal band of points.

Slide 15: Residual Plot Against x. [Figure: residual versus x, labeled "Good Pattern".]

Slide 16: Residual Plot Against x. [Figure: residual versus x, labeled "Nonconstant Variance".]

Slide 17: Residual Plot Against x. [Figure: residual versus x, labeled "Model Form Not Adequate".]

Slide 18: Residual Plot Against x. [Table: Residuals.]

Slide 19: Residual Plot Against x. [Figure: residual plot against x.]

Slide 20: Standardized Residuals. Standardized residual for observation i: $\dfrac{y_i - \hat{y}_i}{s_{y_i - \hat{y}_i}}$, where $s_{y_i - \hat{y}_i} = s\sqrt{1 - h_i}$ and $h_i = \dfrac{1}{n} + \dfrac{(x_i - \bar{x})^2}{\sum (x_i - \bar{x})^2}$.
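A standardized residual divides each residual by its own estimated standard deviation, which depends on the leverage h_i of the observation. The sketch below applies the usual simple-regression formulas (standard deviation of residual i equal to s·sqrt(1 − h_i), with h_i = 1/n + (x_i − x̄)² / Σ(x_i − x̄)²). As before, the five data points are an assumption, since the data are not shown in this transcript.

```python
import math

# ASSUMPTION: illustrative (ads, cars) observations, not taken from this
# transcript; consistent with the reported fit (b0 = 10, b1 = 5, SSE = 14).
ads = [1, 3, 2, 1, 3]
cars = [14, 24, 18, 17, 27]
B0, B1 = 10.0, 5.0  # coefficients reported on the slides

n = len(ads)
x_bar = sum(ads) / n
sxx = sum((x - x_bar) ** 2 for x in ads)

residuals = [y - (B0 + B1 * x) for x, y in zip(ads, cars)]
s = math.sqrt(sum(e ** 2 for e in residuals) / (n - 2))  # standard error of the estimate

# Leverage of each observation, then the standard deviation of each residual.
leverage = [1 / n + (x - x_bar) ** 2 / sxx for x in ads]
std_resid = [e / (s * math.sqrt(1 - h)) for e, h in zip(residuals, leverage)]

print([round(r, 2) for r in std_resid])  # [-0.62, -0.62, -1.04, 1.25, 1.25]
```

For these assumed data every standardized residual falls between -1.5 and +1.5, which matches the conclusion drawn later in the deck.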

Slide 21: Standardized Residual Plot. The standardized residual plot can provide insight about the assumption that the error term ε has a normal distribution. If this assumption is satisfied, the distribution of the standardized residuals should appear to come from a standard normal probability distribution.

Slide 22: Standardized Residual Plot. [Table: Standardized Residuals.]

Slide 23: Standardized Residual Plot. [Figure: standardized residual plot.]

Slide 24: Standardized Residual Plot. All of the standardized residuals are between -1.5 and +1.5, indicating that there is no reason to question the assumption that ε has a normal distribution.

Slide 25: Outliers and Influential Observations. Detecting Outliers: An outlier is an observation that is unusual in comparison with the other data. Minitab classifies an observation as an outlier if its standardized residual value is less than -2 or greater than +2. This standardized residual rule sometimes fails to identify an unusually large observation as being an outlier. This rule’s shortcoming can be circumvented by using studentized deleted residuals. When the i th observation is an outlier, the | i th studentized deleted residual| will be larger than the | i th standardized residual|.
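Both checks can be sketched in a few lines of Python. The slides do not give a formula for studentized deleted residuals, so the code uses a standard identity that derives them from the standardized residuals r_i: t_i = r_i · sqrt((n − p − 1) / (n − p − r_i²)), with p = 2 estimated coefficients. The data are again an assumption, not taken from this transcript.

```python
import math

# ASSUMPTION: illustrative (ads, cars) observations, not taken from this
# transcript; consistent with the reported fit (b0 = 10, b1 = 5, SSE = 14).
ads = [1, 3, 2, 1, 3]
cars = [14, 24, 18, 17, 27]
B0, B1 = 10.0, 5.0  # coefficients reported on the slides

n, p = len(ads), 2  # p = number of estimated coefficients (b0 and b1)
x_bar = sum(ads) / n
sxx = sum((x - x_bar) ** 2 for x in ads)

residuals = [y - (B0 + B1 * x) for x, y in zip(ads, cars)]
s = math.sqrt(sum(e ** 2 for e in residuals) / (n - p))
leverage = [1 / n + (x - x_bar) ** 2 / sxx for x in ads]
std_resid = [e / (s * math.sqrt(1 - h)) for e, h in zip(residuals, leverage)]

# The +/-2 rule: indices of observations whose standardized residual lies
# below -2 or above +2.
outliers = [i for i, r in enumerate(std_resid) if abs(r) > 2]

# Studentized deleted residuals from the standardized residuals
# (identity assumed here, not stated on the slides).
del_resid = [r * math.sqrt((n - p - 1) / (n - p - r ** 2)) for r in std_resid]

print(outliers)  # []
for r, t in zip(std_resid, del_resid):
    print(f"standardized = {r:6.3f}, studentized deleted = {t:6.3f}")
```

In this small example no observation is flagged. The deleted residual exceeds the standardized residual in absolute value only for observations whose standardized residual is itself beyond ±1, which is why the comparison is most useful for suspected outliers.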

Slide 26: End of Chapter 14, Part B.