Slide 9-1 Copyright © 2004 Pearson Education, Inc.

Slides:



Advertisements
Similar presentations
Copyright © 2010 Pearson Education, Inc. Slide A least squares regression line was fitted to the weights (in pounds) versus age (in months) of a.
Advertisements

Statistical Methods Lecture 28
Chapter 9. * No regression analysis is complete without a display of the residuals to check the linear model is reasonable. * The residuals are what is.
 Objective: To identify influential points in scatterplots and make sense of bivariate relationships.
Chapter 8 Linear regression
Chapter 8 Linear regression
Linear Regression Copyright © 2010, 2007, 2004 Pearson Education, Inc.
Copyright © 2010 Pearson Education, Inc. Chapter 8 Linear Regression.
Extrapolation: Reaching Beyond the Data
Regression Wisdom Chapter 9.
Copyright © 2012 Pearson Education. All rights reserved Copyright © 2012 Pearson Education. All rights reserved. Chapter 17 Understanding Residuals.
Copyright © 2009 Pearson Education, Inc. Chapter 8 Linear Regression.
CHAPTER 8: LINEAR REGRESSION
17.2 Extrapolation and Prediction
Chapter 17 Understanding Residuals © 2010 Pearson Education 1.
Regression Wisdom.
Chapter 9: Regression Wisdom
Getting to Know Your Scatterplot and Residuals
Chapter 9 Regression Wisdom
1-1 Copyright © 2015, 2010, 2007 Pearson Education, Inc. Chapter 8, Slide 1 Chapter 8 Regression Wisdom.
Chapter 9: Regression Alexander Swan & Rafey Alvi.
Copyright © 2010, 2007, 2004 Pearson Education, Inc. All Rights Reserved Section 10-3 Regression.
Scatterplots, Association, and Correlation Copyright © 2010, 2007, 2004 Pearson Education, Inc.
Copyright © 2010, 2007, 2004 Pearson Education, Inc. Chapter 7 Scatterplots, Association, and Correlation.
The Practice of Statistics Third Edition Chapter 4: More about Relationships between Two Variables Copyright © 2008 by W. H. Freeman & Company Daniel S.
Copyright © 2010 Pearson Education, Inc. Chapter 9 Regression Wisdom.
Chapter 9 Regression Wisdom
Regression Wisdom.  Linear regression only works for linear models. (That sounds obvious, but when you fit a regression, you can’t take it for granted.)
Copyright © 2010, 2007, 2004 Pearson Education, Inc. Chapter 9 Regression Wisdom.
Notes Bivariate Data Chapters Bivariate Data Explores relationships between two quantitative variables.
Copyright © 2007 Pearson Education, Inc. Publishing as Pearson Addison-Wesley Chapter 9 Regression Wisdom.
Slide 7-1 Copyright © 2004 Pearson Education, Inc.
1 Chapter 7 Scatterplots, Association, and Correlation.
Notes Bivariate Data Chapters Bivariate Data Explores relationships between two quantitative variables.
Copyright © 2008 Pearson Education, Inc. Publishing as Pearson Addison-Wesley Chapter 8 Linear Regression.
Chapter 14 Inference for Regression © 2011 Pearson Education, Inc. 1 Business Statistics: A First Course.
Copyright © 2010 Pearson Education, Inc. Slide A least squares regression line was fitted to the weights (in pounds) versus age (in months) of a.
Chapter 3.3 Cautions about Correlations and Regression Wisdom.
Copyright © 2007 Pearson Education, Inc. Publishing as Pearson Addison-Wesley Chapter 8 Linear Regression (3)
Copyright © 2010 Pearson Education, Inc. Chapter 9 Regression Wisdom.
Copyright © 2010 Pearson Education, Inc. Slide The lengths of individual shellfish in a population of 10,000 shellfish are approximately normally.
Chapter 8 Linear Regression HOW CAN A MODEL BE CREATED WHICH REPRESENTS THE LINEAR RELATIONSHIP BETWEEN TWO QUANTITATIVE VARIABLES?
Copyright © 2007 Pearson Education, Inc. Publishing as Pearson Addison-Wesley Slide 8- 1.
Chapter 9 Regression Wisdom math2200. Sifting residuals for groups Residuals: ‘left over’ after the model How to examine residuals? –Residual plot: residuals.
Linear Regression Chapter 8. Fat Versus Protein: An Example The following is a scatterplot of total fat versus protein for 30 items on the Burger King.
Chapter 9 Regression Wisdom
Regression Wisdom Chapter 9. Getting the “Bends” Linear regression only works for linear models. (That sounds obvious, but when you fit a regression,
Module 11 Scatterplots, Association, and Correlation.
Regression Wisdom. Getting the “Bends”  Linear regression only works for linear models. (That sounds obvious, but when you fit a regression, you can’t.
Chapter 9 Regression Wisdom. Getting the “Bends” Linear regression only works for data with a linear association. Curved relationships may not be evident.
Copyright © 2010, 2007, 2004 Pearson Education, Inc. Chapter 9 Regression Wisdom.
Regression Wisdom Copyright © 2010, 2007, 2004 Pearson Education, Inc.
Do Now Examine the two scatterplots and determine the best course of action for other countries to help increase the life expectancies in the countries.
Statistics 9 Regression Wisdom. Getting the “Bends” Linear regression only works for linear models. (That sounds obvious, but when you fit a regression,
1-1 Copyright © 2015, 2010, 2007 Pearson Education, Inc. Chapter 8, Slide 1 Chapter 8 Regression Wisdom.
Copyright © 2007 Pearson Education, Inc. Publishing as Pearson Addison-Wesley Chapter 7 Scatterplots, Association, and Correlation.
Copyright © 2007 Pearson Education, Inc. Publishing as Pearson Addison-Wesley Slide 7- 1.
Copyright © 2010, 2007, 2004 Pearson Education, Inc. Chapter 9 Regression Wisdom.
AP Statistics.  Linear regression only works for linear models. (That sounds obvious, but when you fit a regression, you can’t take it for granted.)
Unit 4 Lesson 4 (5.4) Summarizing Bivariate Data
Chapter 5 Lesson 5.3 Summarizing Bivariate Data
Chapter 9 Regression Wisdom Copyright © 2010 Pearson Education, Inc.
Regression Wisdom Chapter 9.
Chapter 8 Regression Wisdom.
Chapter 8 Part 2 Linear Regression
Week 5 Lecture 2 Chapter 8. Regression Wisdom.
Review of Chapter 3 Examining Relationships
Presentation transcript:

Slide 9-1 Copyright © 2004 Pearson Education, Inc.

Slide 9-2 Copyright © 2004 Pearson Education, Inc. Regression Wisdom Chapter 9 Created by Jackie Miller, The Ohio State University

Slide 9-3 Copyright © 2004 Pearson Education, Inc. Sifting Residuals for Groups No regression analysis is complete without a display of the residuals to check that the linear model is reasonable. Residuals often reveal subtleties that were not clear from a plot of the original data.

Slide 9-4 Copyright © 2004 Pearson Education, Inc. Sifting Residuals for Groups (cont.) Sometimes the subtleties we see are additional details that help confirm or refine our understanding. Sometimes they reveal violations of the regression conditions that require our attention.

Slide 9-5 Copyright © 2004 Pearson Education, Inc. Sifting Residuals for Groups (cont.) It is a good idea to look at both a histogram of the residuals and a scatterplot of the residuals vs. predicted values:

Slide 9-6 Copyright © 2004 Pearson Education, Inc. Sifting Residuals for Groups (cont.) An examination of residuals often leads us to discover groups of observations that are different from the rest. When we discover that there is more than one group in a regression, we may decide to analyze the groups separately, using a different model for each group.

Slide 9-7 Copyright © 2004 Pearson Education, Inc. Subsets Here’s an important unstated condition for fitting models: All the data must come from the same group. When we discover that there is more than one group in a regression, neither modeling the groups together nor modeling them apart is correct.

Slide 9-8 Copyright © 2004 Pearson Education, Inc. Subsets (cont.) Figure 9.3 from the text shows regression lines fit to calories and sugar for each of the three cereal shelves in a supermarket:

Slide 9-9 Copyright © 2004 Pearson Education, Inc. Getting the “Bends” Linear regression only works for linear models. (That sounds obvious, but when you fit a regression, you can’t take it for granted.) A curved relationship between two variables might not be apparent when looking at a scatterplot alone, but will be more obvious in a plot of the residuals. –Remember, we want to see “nothing” in a plot of the residuals.

Slide 9-10 Copyright © 2004 Pearson Education, Inc. Getting the “Bends” (cont.) The curved relationship between fuel efficiency and weight is more obvious in the plot of the residuals than in the original scatterplot:

Slide 9-11 Copyright © 2004 Pearson Education, Inc. Extrapolation: Reaching Beyond the Data We cannot assume that a linear relationship in the data exists beyond the range of the data. Once we venture into new x territory, such a prediction is called an extrapolation.

Slide 9-12 Copyright © 2004 Pearson Education, Inc. Extrapolation (cont.) Extrapolations are dubious because they require the additional—and very questionable—assumption that nothing about the relationship between x and y changes even at extreme values of x. Extrapolations can get you into deep trouble. You’re better off not making extrapolations.

Slide 9-13 Copyright © 2004 Pearson Education, Inc. Extrapolation (cont.) A regression of mean age at first marriage for men vs. year fit to the first 4 decades of the 20 th century does not hold for later years:

Slide 9-14 Copyright © 2004 Pearson Education, Inc. Predicting the Future Extrapolation is always dangerous. But, when the x-variable in the model is time, extrapolation becomes an attempt to peer into the future. Knowing that extrapolation is dangerous doesn’t stop people. The temptation to see into the future is hard to resist. Here’s some more realistic advice: If you must extrapolate into the future, at least don’t believe that the prediction will come true.

Slide 9-15 Copyright © 2004 Pearson Education, Inc. Outliers Outlying points can strongly influence a regression. Even a single point far from the body of the data can dominate the analysis. –Any point that stands away from the others can be called an outlier and deserves your special attention.

Slide 9-16 Copyright © 2004 Pearson Education, Inc. Outliers (cont.) You cannot simply delete outliers from the data. You can, however, fit a model with and without an outlier as long as you examine and discuss the outlying point(s). When we investigate an outlier, we often learn more about the situation than we could have learned from the model alone.

Slide 9-17 Copyright © 2004 Pearson Education, Inc. Outliers (cont.) The following scatterplot shows that something was awry in Palm Beach County, Florida, during the 200 presidential election…

Slide 9-18 Copyright © 2004 Pearson Education, Inc. Outliers and Influential Points Points that are extraordinary in their x- value can especially influence a regression model. We say that such points have high leverage. High leverage points that are also model outliers can have a shocking effect on the regression.

Slide 9-19 Copyright © 2004 Pearson Education, Inc. Outliers and Influential Points (cont.)

Slide 9-20 Copyright © 2004 Pearson Education, Inc. Outliers and Influential Points (cont.) Warning: Influential points can hide in plots of residuals. You’ll see influential points more easily in scatterplots of the original data or by finding a regression model with and without the points.

Slide 9-21 Copyright © 2004 Pearson Education, Inc. Lurking Variables and Causation Important thing to remember: No matter how strong the association, no matter how large the R 2 value, no matter how straight the line, there is no way to conclude from a regression alone that one variable causes the other. With observational data, as opposed to data from a designed experiment, there is no way to be sure that a lurking variable is not the cause of any apparent association.

Slide 9-22 Copyright © 2004 Pearson Education, Inc. Working With Summary Values Scatterplots of statistics summarized over groups tend to show less variability than we would see if we measured the same variable on individuals. This is because the summary statistics themselves vary less than the data on the individuals do.

Slide 9-23 Copyright © 2004 Pearson Education, Inc. Working With Summary Values (cont.) Scatterplots of summary statistics show less scatter than the baseline data on individuals, and can give a false impression of how well a line summarizes the data: There is no simple correction for this phenomenon.

Slide 9-24 Copyright © 2004 Pearson Education, Inc. What Can Go Wrong? Make sure the relationship is straight— check the straight enough condition. Beware of extrapolating. Beware especially of extrapolating into the future! Be on guard for different groups in your regression—if you find subsets that behave differently, consider fitting a different linear model to each subset.

Slide 9-25 Copyright © 2004 Pearson Education, Inc. What Can Go Wrong? (cont.) Look for outliers—outliers are special points that always deserve attention and that may well reveal more about your data than the rest of the points combined. Beware of high leverage points, and especially those that are influential—such points can alter the regression model a great deal. Consider comparing two regressions—run regressions with extraordinary points and without and then compare the results.

Slide 9-26 Copyright © 2004 Pearson Education, Inc. What Can Go Wrong? (cont.) Treat outliers honestly—don’t just remove outliers to get a model that fits better. Beware of lurking variables—and don’t assume that association is causation. Watch out when dealing with data that are summaries—summary data tend to inflate the impression of the strength of a relationship.

Slide 9-27 Copyright © 2004 Pearson Education, Inc. Key Concepts There are some diagnostic methods (e.g., histograms of residuals, scatterplots of residuals vs. predicted values) that will help us determine if linear regression is appropriate for our data. We need to be aware of some of the pitfalls with regression, particularly those listed in the “What Can Go Wrong?” section.