Copyright © 2014, 2011 Pearson Education, Inc. 1 Chapter 22 Regression Diagnostics.

Presentation transcript:

Chapter 22: Regression Diagnostics

Changing Variation
Although regression analysis allows the use of prices of homes of different sizes to estimate the price of a home of a specific size, prices tend to be more variable for larger homes. How does this affect the SRM?
• Consider how to recognize and fix three potential problems affecting regression models: changing variation in the data, outliers, and dependence among observations.

Changing Variation: Price ($000) vs. Home Size (Sq. Ft.)
[Scatterplot of price ($000) versus home size (sq. ft.)] Both the average and the standard deviation of price increase as home size increases.

Changing Variation: SRM Results, Home Price Example
[Table of SRM results]

Changing Variation: Fixed Costs, Marginal Costs, and Variable Costs
• The estimated intercept (50.599, in thousands of dollars) can be interpreted as the fixed cost of a home.
• The 95% confidence interval for the intercept (after rounding) is -$4,000 to $105,000.
• Since it is wide and includes zero, this interval is not a precise estimate of fixed costs.

Changing Variation: Fixed Costs, Marginal Costs, and Variable Costs
• The slope (0.159, in thousands of dollars per square foot, or about $159 per square foot) estimates the marginal cost of an additional square foot of space.
• Scaled to 1,000 square feet, the 95% confidence interval for the slope (after rounding) is $135,000 to $183,500.
• This interval can be interpreted as the average difference in home price associated with a difference of 1,000 square feet.

Changing Variation: Detecting Differences in Variation
• Based on the scatterplot, the association between home price and size appears linear.
• There is little concern about lurking variables since the sample of homes is from the same neighborhood.
• The similar variances condition is not satisfied.

Changing Variation: Detecting Differences in Variation
[Residual plot] The fan-shaped appearance of the residual plot indicates changing variances.

Changing Variation: Detecting Differences in Variation
[Side-by-side boxplots of residuals by home size] The boxplots confirm that variances increase as home size increases.

Changing Variation: Detecting Differences in Variation
• Heteroscedastic: errors that have different amounts of variation.
• Homoscedastic: errors that have equal amounts of variation.
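The boxplot comparison described above can be imitated numerically by comparing the residual standard deviation across groups of homes ordered by size. The following is a minimal sketch, not the textbook's procedure: the data are simulated (they are not the Seattle sample), and the helper name `residual_sd_by_group` is hypothetical.

```python
import numpy as np

def residual_sd_by_group(x, y, n_groups=3):
    """Fit a least squares line, then return the residual SD within
    each of n_groups bins of x (bins hold roughly equal counts)."""
    slope, intercept = np.polyfit(x, y, 1)      # highest degree first
    residuals = y - (intercept + slope * x)
    order = np.argsort(x)                       # sort residuals by x
    groups = np.array_split(residuals[order], n_groups)
    return [float(np.std(g, ddof=1)) for g in groups]

# Simulated data (an assumption for illustration): the error SD grows
# in proportion to home size, so the errors are heteroscedastic.
rng = np.random.default_rng(0)
sqft = rng.uniform(800, 3500, size=300)
price = 50 + 0.16 * sqft + rng.normal(0, 0.03 * sqft)   # price in $000

sds = residual_sd_by_group(sqft, price)
print("residual SD for small, medium, large homes:",
      [round(s, 1) for s in sds])
```

If the errors were homoscedastic, the three standard deviations would be roughly equal; a steady increase from the first group to the last is the numerical counterpart of a fan-shaped residual plot.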

Changing Variation: Consequences of Different Variation
• Prediction intervals are too narrow or too wide.
• Confidence intervals for the slope and intercept are not reliable.
• Hypothesis tests regarding β₀ and β₁ are not reliable.

Changing Variation: Consequences of Different Variation
[Scatterplot with prediction intervals] The 95% prediction intervals are too wide for small homes and too narrow for large homes.

Changing Variation: Fixing the Problem: Revise the Model
• If F represents the fixed cost and M the marginal cost, the equation of the SRM becomes
\[ \text{Price} = F + M \times \text{SqFt} + \varepsilon \]

Changing Variation: Fixing the Problem: Revise the Model
• Divide both sides of the equation by the number of square feet and simplify:
\[ \frac{\text{Price}}{\text{SqFt}} = M + F \times \frac{1}{\text{SqFt}} + \frac{\varepsilon}{\text{SqFt}} \]

Changing Variation: Fixing the Problem: Revise the Model
• The response variable becomes price per square foot and the explanatory variable becomes the reciprocal of the number of square feet.
• The marginal cost M is the intercept and the slope is F, the fixed cost.
• The residuals have similar variances.
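The swap of roles between intercept and slope is easy to check on simulated data. This sketch assumes a model of the form Price = F + M × SqFt + error, with error standard deviation proportional to size; the values of `F_TRUE` and `M_TRUE` are invented for illustration and are not the textbook estimates.

```python
import numpy as np

# Assumed "true" values, in dollars and dollars per square foot.
F_TRUE, M_TRUE = 50_000.0, 160.0

rng = np.random.default_rng(1)
sqft = rng.uniform(800, 3500, size=300)
# Error SD proportional to size: heteroscedastic on the price scale.
price = F_TRUE + M_TRUE * sqft + sqft * rng.normal(0, 20, size=300)

# Revised model: divide both sides by square feet.
y = price / sqft      # response: price per square foot
x = 1.0 / sqft        # explanatory variable: reciprocal of size

# np.polyfit returns the slope first, then the intercept.
f_hat, m_hat = np.polyfit(x, y, 1)
print(f"intercept (estimates marginal cost M): {m_hat:.1f} $/sq ft")
print(f"slope (estimates fixed cost F): {f_hat:,.0f} $")
```

After the division, the error term is the original error divided by size, which in this simulation has constant standard deviation; that is why the revised residuals show similar variances.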

Changing Variation: Fixing the Problem: Revise the Model
[Side-by-side boxplots of residuals from the revised model] Boxplots confirm homoscedastic errors.

4M Example 22.1: ESTIMATING HOME PRICES
Motivation: A company is relocating several managers to the Seattle area. For budgeting purposes, it would like a breakdown of home prices into fixed and variable costs to better prepare for negotiations with realtors.

4M Example 22.1: ESTIMATING HOME PRICES
Method: The data consist of a sample of 94 homes for sale in Seattle. The explanatory variable is the reciprocal of home size and the response is price per square foot. The scatterplot shows a linear association and there are no obvious lurking variables.

4M Example 22.1: ESTIMATING HOME PRICES
Mechanics: The evidently independent, similar variances, and nearly normal conditions are met.

4M Example 22.1: ESTIMATING HOME PRICES
Mechanics: [Table of SRM results]

4M Example 22.1: ESTIMATING HOME PRICES
Mechanics: The fitted equation is Estimated $/SqFt = ,887/SqFt. The 95% confidence interval for the intercept is [ to ] and the 95% confidence interval for the slope is [18,663 to 89,111].

4M Example 22.1: ESTIMATING HOME PRICES
Message: Prices for homes in this Seattle neighborhood run about $140 to $180 per square foot, on average. Average fixed costs associated with the purchase are in the range $20,000 to $90,000, with 95% confidence.

Changing Variation: Comparing Models with Different Responses
Even though the revised model has a smaller r²:
• It provides more reliable and narrower confidence intervals for fixed and variable costs.
• It provides more sensible prediction intervals.


Outliers: Consider a Contractor’s Bid on a Project
A contractor is bidding on a project to construct an 875-square-foot addition to a home.
• If he bids too low, he loses money on the project.
• If he bids too high, he does not get the job.

Outliers: Contractor Data for n = 30 Similar Projects
[Scatterplot of the contractor's previous projects] Note that all but one of his previous projects are smaller than 875 square feet.

Outliers: Contractor Example
• His one project at 900 square feet is an outlier.
• It is also a leveraged observation, so it can pull the regression line in its direction.
• Leveraged: describes an observation in regression that has an unusually small or large value of the explanatory variable.

Outliers: Consequences of an Outlier
• To see the consequences of an outlier, fit the least squares regression line both with and without it.
• Use the standard errors obtained without including the outlier to compare estimates.
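The comparison described above can be sketched as follows. The data are simulated stand-ins for the contractor's 30 projects (29 small jobs plus one 900-square-foot job that cost less than the trend suggests); the helper `fit_line` is a hypothetical name, and its standard errors are the usual simple-regression formulas.

```python
import numpy as np

def fit_line(x, y):
    """Least squares intercept, slope, and their standard errors."""
    n = len(x)
    xbar = x.mean()
    sxx = np.sum((x - xbar) ** 2)
    b1 = np.sum((x - xbar) * (y - y.mean())) / sxx
    b0 = y.mean() - b1 * xbar
    resid = y - (b0 + b1 * x)
    s = np.sqrt(np.sum(resid ** 2) / (n - 2))    # residual SD
    se_b1 = s / np.sqrt(sxx)
    se_b0 = s * np.sqrt(1.0 / n + xbar ** 2 / sxx)
    return b0, b1, se_b0, se_b1

# Simulated projects (assumed values): cost in $000 versus square feet.
rng = np.random.default_rng(2)
size = rng.uniform(100, 500, size=29)
cost = 5 + 0.04 * size + rng.normal(0, 2, size=29)

# One large, leveraged project that falls below the trend.
size_all = np.append(size, 900.0)
cost_all = np.append(cost, 5 + 0.04 * 900 - 12)

b0_wo, b1_wo, se_b0, se_b1 = fit_line(size, cost)    # without outlier
b0_w, b1_w, _, _ = fit_line(size_all, cost_all)      # with outlier

# Express each shift in units of the standard error without the outlier.
shift_b0 = (b0_w - b0_wo) / se_b0
shift_b1 = (b1_w - b1_wo) / se_b1
print(f"intercept shifts by {shift_b0:+.2f} SE")
print(f"slope shifts by {shift_b1:+.2f} SE")
```

In this simulation the low-cost large project pulls the slope down and the intercept up, the same direction of change the contractor example reports; the magnitudes here depend entirely on the invented data.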

Outliers: Consequences for the Contractor Example
• Including the outlier shifts the estimated fixed cost up by about 1.5 standard errors.
• Including the outlier shifts the estimated marginal cost down by about 1.56 standard errors.

Outliers: Consequences for the Contractor Example
[Plot: prediction intervals when the outlier is included]

Outliers: Consequences for the Contractor Example
[Plot: prediction intervals when the outlier is not included]

Outliers: Fixing the Problem: More Information
• If the outlier describes what is expected the next time under the same conditions, then it should be included.
• In the contractor example, more information is needed to decide whether to include or exclude the outlier.

Dependent Errors and Time Series: Detecting Dependence
• With time series data, plot residuals versus time to look for a pattern indicating dependence in the errors.
• Use the Durbin-Watson statistic to test for correlation between adjacent residuals (known as autocorrelation).

Dependent Errors and Time Series: Detecting Dependence
[Scatterplot] The scatterplot suggests few problems with the fit.

Dependent Errors and Time Series: Detecting Dependence
[Timeplot of residuals] The timeplot of residuals from the regression of change in employment on utilization reveals dependence.

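Dependent Errors and Time Series: The Durbin-Watson Statistic
• Tests the null hypothesis H₀: ρ_ε = 0 (no autocorrelation in the errors).
• Is calculated from the residuals e_1, ..., e_n, taken in time order, as follows:
\[ D = \frac{\sum_{t=2}^{n} (e_t - e_{t-1})^2}{\sum_{t=1}^{n} e_t^2} \]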
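As a small illustration, the standard Durbin-Watson statistic is the sum of squared differences between adjacent residuals divided by the sum of squared residuals. The function name `durbin_watson` and the toy residual sequences below are choices made for this sketch, not part of the slides.

```python
import numpy as np

def durbin_watson(residuals):
    """Durbin-Watson statistic for residuals listed in time order."""
    e = np.asarray(residuals, dtype=float)
    return float(np.sum(np.diff(e) ** 2) / np.sum(e ** 2))

# Toy residual sequences (invented for illustration).
trending = [1.0, 2.0, 3.0, 4.0]          # neighbors alike: D near 0
alternating = [1.0, -1.0, 1.0, -1.0]     # neighbors opposite: D near 4

d_trend = durbin_watson(trending)
d_alt = durbin_watson(alternating)
print(d_trend, d_alt)
```

Values of D well below 2 point to positive autocorrelation, values well above 2 to negative autocorrelation, and values near 2 are what uncorrelated adjacent residuals tend to produce.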

Dependent Errors and Time Series: The Durbin-Watson Statistic
• Use the p-value provided by software or a table of critical values at α = 0.05 to draw a conclusion.

Dependent Errors and Time Series: Consequences of Dependence
• If there is positive autocorrelation in the errors, the estimated standard errors are too small.
• The estimated slope and intercept are less precise than suggested by the output.
• The best remedy is to incorporate the dependence into the regression model.

4M Example 22.2: CELL PHONE SUBSCRIBERS
Motivation: Predict the market for cellular telephone services.

4M Example 22.2: CELL PHONE SUBSCRIBERS
Method: Use simple regression to predict the future number of subscribers. The number of subscriber connections, in millions, is the response. The explanatory variable is the date (time). The scatterplot shows a linear association. Lurking variables may be present, such as technology and marketing.

4M Example 22.2: CELL PHONE SUBSCRIBERS
Mechanics: The least squares equation is Estimated Subscribers = - 40, Date

4M Example 22.2: CELL PHONE SUBSCRIBERS
Mechanics: The meandering residuals in the timeplot and D = 0.25 indicate that the independence condition is not satisfied.

4M Example 22.2: CELL PHONE SUBSCRIBERS
Message: There is a strong upward trend in the number of subscribers that can be summarized by Estimated Subscribers = -40, Date. However, since the conditions for the SRM are not satisfied, we cannot rely on statistical inferences to quantify the uncertainty of predictions.

Best Practices
• Make sure that your model makes sense.
• Plan to change your model if it does not match the data.
• Report the presence of any outliers and how you handled them.

Pitfalls
• Do not rely on summary statistics like r² to pick the best model.
• Don’t compare r² between regression models unless the response is the same.
• Do not check for normality until you get the right equation.

Pitfalls (Continued)
• Don’t think that your data are independent if the Durbin-Watson statistic is close to 2.
• Never forget to look at plots of the data and model.