Chapter 14: Inference for Regression. Business Statistics: A First Course. © 2011 Pearson Education, Inc.

The Population and the Sample
- We already know that we can model the relationship between two quantitative variables by fitting a straight line to a sample of ordered pairs.
- But observations differ from sample to sample.

The Population and the Sample
- We can imagine a line that summarizes the true relationship between x and y for the entire population, where µ_y is the population mean of y at a given value of x.
- NOTE: We are assuming an idealized case in which the points (x, µ_y) are in fact exactly linear.

The Population and the Sample
- For a given value x: the value of ŷ obtained from a particular sample may not lie on the line µ_y.
- These values of ŷ will be distributed about µ_y.
- We can account for the error between ŷ and µ_y by adding an error term (ε) to the model.
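Written out (a standard reconstruction; the formula itself appears only as an image on the slide), the population model is:

```latex
\mu_y = \beta_0 + \beta_1 x
\qquad\text{and, for individual observations,}\qquad
y = \beta_0 + \beta_1 x + \varepsilon
```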

The Population and the Sample: Regression Inference
- Collect a sample and estimate the population β's by finding a regression line ŷ = b0 + b1x.
- The residuals e = y – ŷ are the sample-based versions of ε.
- Account for the uncertainties in β0 and β1 by making confidence intervals, as we've done for means and proportions.
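A minimal sketch of this estimation step in Python (hypothetical data; b0 and b1 are the ordinary least-squares estimates of β0 and β1):

```python
import numpy as np

# Hypothetical sample of (x, y) pairs
x = np.array([24.0, 30.0, 35.0, 39.0, 44.0, 50.0])
y = np.array([10.0, 15.0, 21.0, 26.0, 30.0, 38.0])

# Least-squares estimates of the population intercept and slope
b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b0 = y.mean() - b1 * x.mean()

y_hat = b0 + b1 * x                           # fitted values
e = y - y_hat                                 # residuals: sample-based versions of epsilon
s_e = np.sqrt(np.sum(e ** 2) / (len(x) - 2))  # residual standard error
print(b0, b1, s_e)
```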

Assumptions and Conditions
- Inference in regression is based on these assumptions (check them in this order):
1. Linearity Assumption
2. Independence Assumption
3. Equal Variance Assumption
4. Normal Population Assumption

Assumptions and Conditions: Testing the Assumptions
1. Make a scatterplot of the data to check for linearity (Linearity Assumption).
2. Fit a regression and find the residuals, e, and predicted values ŷ.
3. Plot the residuals against time (if appropriate) and check for evidence of patterns (Independence Assumption).
4. Make a scatterplot of the residuals against x or the predicted values; this plot should not exhibit a "fan" or "cone" shape (Equal Variance Assumption).

Assumptions and Conditions (continued)
5. Make a histogram and/or Normal probability plot of the residuals (Normal Population Assumption).
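A sketch of steps 1–5 in Python, reusing x, y, e, and y_hat from the fitting sketch above (the code only produces the plots; judging them is still up to you):

```python
import matplotlib.pyplot as plt
from scipy import stats

fig, axes = plt.subplots(2, 2, figsize=(9, 7))

axes[0, 0].scatter(x, y)                           # step 1: check for linearity
axes[0, 0].set_title("y vs x (Linearity)")

axes[0, 1].plot(e, marker="o")                     # step 3: residuals in data order
axes[0, 1].set_title("Residuals vs order (Independence)")

axes[1, 0].scatter(y_hat, e)                       # step 4: look for a fan/cone shape
axes[1, 0].axhline(0, linestyle="--")
axes[1, 0].set_title("Residuals vs predicted (Equal Variance)")

stats.probplot(e, dist="norm", plot=axes[1, 1])    # step 5: Normal probability plot
axes[1, 1].set_title("Normal plot of residuals (Normality)")

plt.tight_layout()
plt.show()
```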

Assumptions and Conditions: Graphical Summary of Assumptions and Conditions (figure).

Regression Inference: For a sample, we expect b1 to be close to the model slope β1. Across similar samples, the standard error of the slope is a measure of the variability of b1 about the true slope β1.
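The usual formula for this standard error (a standard reconstruction; the slide shows it only as an image):

```latex
SE(b_1) = \frac{s_e}{s_x\sqrt{n-1}},
\qquad\text{where } s_e = \sqrt{\frac{\sum (y-\hat{y})^2}{n-2}}
```

This is why the three hint slides that follow compare s_e, s_x, and n: a smaller s_e, a larger s_x, or a larger n each make the slope estimate more consistent.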

Regression Inference: Which of these scatterplots would give the more consistent regression slope estimate if we were to sample repeatedly from the underlying population? Hint: compare the s_e's.

Regression Inference: Which of these scatterplots would give the more consistent regression slope estimate if we were to sample repeatedly from the underlying population? Hint: compare the s_x's.

Regression Inference: Which of these scatterplots would give the more consistent regression slope estimate if we were to sample repeatedly from the underlying population? Hint: compare the n's.

Regression Inference (formula slides; the slide content was not recovered in the transcript).
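These slides presumably carried the t-model for the slope; the standard statement (restated in words on the "What Have We Learned" slides at the end of the chapter) is:

```latex
\frac{b_1 - \beta_1}{SE(b_1)} \sim t_{n-2},
\qquad\text{so a confidence interval for the slope is}\quad
b_1 \pm t^{*}_{n-2}\,SE(b_1)
```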

Standard Errors for Predicted Values: SE becomes larger the farther x gets from x̄. That is, the confidence interval broadens as you move away from x̄. (See figure at right.)

Standard Errors for Predicted Values: SE, and the confidence interval, become smaller with increasing n. SE, and the confidence interval, are larger for samples with more spread around the line (when s_e is larger).

Standard Errors for Predicted Values: Because of the extra term, the prediction interval for individual values is broader than the confidence interval for predicted mean values.
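In standard textbook notation (a reconstruction; the slide formulas appear only as images), the two standard errors at a new value x_ν are:

```latex
SE(\hat{\mu}_\nu) = \sqrt{SE^2(b_1)\,(x_\nu-\bar{x})^2 + \frac{s_e^2}{n}}
\qquad
SE(\hat{y}_\nu) = \sqrt{SE^2(b_1)\,(x_\nu-\bar{x})^2 + \frac{s_e^2}{n} + s_e^2}
```

The extra s_e² under the second radical is the "extra term" referred to above; it is why intervals for individual values are wider than intervals for mean values.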

Using Confidence and Prediction Intervals: Confidence interval for a mean. A 95% result means "We are 95% confident that the mean value of y is between 4.40 and 4.70 when x takes the given value."

Using Confidence and Prediction Intervals: Prediction interval for an individual value. A 95% result means "We are 95% confident that a single particular value of y will be between 2.95 and 5.15 when x takes the given value."
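A sketch of how both intervals can be computed in Python with statsmodels (hypothetical data; the 4.40–4.70 and 2.95–5.15 intervals above come from the textbook example, not from this code):

```python
import numpy as np
import statsmodels.api as sm

# Hypothetical data
rng = np.random.default_rng(1)
x = np.linspace(0, 10, 40)
y = 1.5 + 0.3 * x + rng.normal(scale=0.6, size=x.size)

model = sm.OLS(y, sm.add_constant(x)).fit()

# Intervals for y at a new value, x = 7
x_new = sm.add_constant(np.array([7.0]), has_constant="add")
intervals = model.get_prediction(x_new).summary_frame(alpha=0.05)

print(intervals[["mean", "mean_ci_lower", "mean_ci_upper"]])  # CI for the mean of y
print(intervals[["obs_ci_lower", "obs_ci_upper"]])            # PI for an individual y
```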

Extrapolation and Prediction: Extrapolating is predicting a y value by extending the regression model to regions outside the range of the x-values of the data.

Extrapolation and Prediction: Why is extrapolation dangerous? It introduces the questionable and untested assumption that the relationship between x and y does not change.

Extrapolation and Prediction: Cautionary Example (Oil Prices in Constant Dollars). Model prediction (extrapolation): on average, a barrel of oil will increase $7.39 per year from 1983 onward.

Extrapolation and Prediction: Cautionary Example (Oil Prices in Constant Dollars). Actual price behavior: extrapolating the model into the '80s and '90s led to grossly erroneous forecasts.

Extrapolation and Prediction: Remember, linear models ought not to be trusted beyond the span of the x-values of the data. If you extrapolate far into the future, be prepared for the actual values to be (possibly quite) different from your predictions.

14.7 Unusual and Extraordinary Observations: In regression, an outlier can stand out in two ways. It can have (1) a large residual.

14.7 Unusual and Extraordinary Observations: In regression, an outlier can stand out in two ways. It can have (2) a large distance from x̄: a "high-leverage point." A high-leverage point is influential if omitting it gives a regression model with a very different slope.

14.7 Unusual and Extraordinary Observations: Tell whether the point is a high-leverage point, whether it has a large residual, and whether it is influential.

14.7 Unusual and Extraordinary Observations: Tell whether the point is a high-leverage point, whether it has a large residual, and whether it is influential. Answer: not high-leverage; large residual; not very influential.

14.7 Unusual and Extraordinary Observations: Tell whether the point is a high-leverage point, whether it has a large residual, and whether it is influential.

14.7 Unusual and Extraordinary Observations: Tell whether the point is a high-leverage point, whether it has a large residual, and whether it is influential. Answer: high-leverage; small residual; not very influential.

14.7 Unusual and Extraordinary Observations: Tell whether the point is a high-leverage point, whether it has a large residual, and whether it is influential.

14.7 Unusual and Extraordinary Observations: Tell whether the point is a high-leverage point, whether it has a large residual, and whether it is influential. Answer: high-leverage; medium residual; very influential (omitting the red point changes the slope dramatically!).

14.7 Unusual and Extraordinary Observations: What should you do with a high-leverage point?
- Sometimes these points are important. They can indicate that the underlying relationship is in fact nonlinear.
- Other times they simply do not belong with the rest of the data and ought to be omitted. When in doubt, create and report two models: one with the outlier and one without.

What Have We Learned?
- Do not fit a linear regression to data that are not straight.
- Watch out for changing spread.
- Watch out for non-Normal errors.
- Beware of extrapolating, especially far into the future.
- Look for unusual points. Consider setting aside outliers and re-running the regression.
- Treat unusual points honestly.

What Have We Learned?
- Under certain conditions, the sampling distribution for the slope of a regression line can be modeled by a Student's t-model with n – 2 degrees of freedom.
- Check four conditions, in order, before proceeding to inference: Linearity, Independence, Equal Variance, Normality.

What Have We Learned?
- Use the appropriate t-model to test a hypothesis (H0: β1 = 0) about the slope.
- Create and interpret a confidence interval for the slope.
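A sketch of that test and interval computed directly from the formulas above (Python; the data are illustrative):

```python
import numpy as np
from scipy import stats

# Hypothetical sample
x = np.array([24.0, 30.0, 35.0, 39.0, 44.0, 50.0])
y = np.array([10.0, 15.0, 21.0, 26.0, 30.0, 38.0])
n = len(x)

# Least-squares slope and its standard error
b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b0 = y.mean() - b1 * x.mean()
e = y - (b0 + b1 * x)
s_e = np.sqrt(np.sum(e ** 2) / (n - 2))
se_b1 = s_e / (x.std(ddof=1) * np.sqrt(n - 1))

# Test H0: beta1 = 0 against a two-sided alternative
t_stat = b1 / se_b1
p_value = 2 * stats.t.sf(abs(t_stat), df=n - 2)

# 95% confidence interval for the slope
t_star = stats.t.ppf(0.975, df=n - 2)
ci = (b1 - t_star * se_b1, b1 + t_star * se_b1)
print(t_stat, p_value, ci)
```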