Copyright © 2011 Pearson Education, Inc. The Simple Regression Model Chapter 21.



21.1 The Simple Regression Model
How can we test the CAPM (Capital Asset Pricing Model) for Berkshire Hathaway stock?
- Formulate the simple regression with the percentage change in Berkshire Hathaway stock as y and the percentage change in the value of the whole stock market as x.
- Use inference related to regression: standard errors, confidence intervals, and hypothesis tests.

21.1 The Simple Regression Model
- Simple Regression Model (SRM): a model for the association in the population between an explanatory variable x and a response y.
- Consider the data to be a sample from a population.

21.1 The Simple Regression Model
Linear on Average
- The equation of the SRM describes how the conditional mean of Y depends on X.
- The SRM shows that these means lie on a line with intercept $\beta_0$ and slope $\beta_1$:
  $\mu_{y|x} = E(Y \mid X = x) = \beta_0 + \beta_1 x$

21.1 The Simple Regression Model
Deviations from the Mean
- The deviations of the responses around $\mu_{y|x}$ are called errors.
- The error is denoted by $\varepsilon = y - \mu_{y|x}$, and $E(\varepsilon) = 0$.

21.1 The Simple Regression Model
Deviations from the Mean
The SRM makes three assumptions about $\varepsilon$:
1. Independent. Errors are independent of each other.
2. Equal variance. All errors have the same variance, $\mathrm{Var}(\varepsilon) = \sigma_\varepsilon^2$.
3. Normal. The errors are normally distributed.

21.1 The Simple Regression Model
Data Generating Process
- Let Y denote the monthly sales of a company and let X denote its spending on advertising (both in thousands of dollars).
- Assume the following population model: [equation not preserved in this transcript]

21.1 The Simple Regression Model
Data Generating Process
The SRM assumes a normal distribution at each x.

21.1 The Simple Regression Model
Data Generating Process
Eventually the data are observed. [Figure: observed data; not preserved in this transcript]

21.1 The Simple Regression Model
Data Generating Process
- The true regression line is a characteristic of the population, not of the observed data.
- The SRM is a model and offers a simplified view of reality.

21.1 The Simple Regression Model
Simple Regression Model (SRM)
Observed values of the response Y are linearly related to the values of the explanatory variable X by the equation
  $y = \beta_0 + \beta_1 x + \varepsilon$, $\varepsilon \sim N(0, \sigma_\varepsilon^2)$.
The observations are independent of one another, have equal variance around the regression line, and are normally distributed around the regression line.
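The data generating process described by the SRM can be sketched in a few lines of Python. This is only an illustration: the function name `simulate_srm` and all parameter values are hypothetical, chosen for readability rather than taken from the slides.

```python
import random

def simulate_srm(beta0, beta1, sigma, xs, seed=0):
    """Draw one response per x from y = beta0 + beta1 * x + eps, eps ~ N(0, sigma^2)."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    return [beta0 + beta1 * x + rng.gauss(0.0, sigma) for x in xs]

# Hypothetical parameter values, used only for illustration.
xs = [float(x) for x in range(10, 60, 5)]
ys = simulate_srm(beta0=5.0, beta1=2.0, sigma=1.5, xs=xs)

for x, y in zip(xs, ys):
    mean_at_x = 5.0 + 2.0 * x  # conditional mean on the true line
    print(f"x = {x:5.1f}   mean = {mean_at_x:6.1f}   observed y = {y:7.2f}")
```

Each observed y scatters around the conditional mean on the true line; the line itself belongs to the population and is never observed directly.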

21.2 Conditions for the SRM
Conditions for the SRM – Checklist
- Is the association between y and x linear?
- Have lurking variables been ruled out?
- Are the errors evidently independent?
- Are the variances of the residuals similar?
- Are the residuals nearly normal?

21.2 Conditions for the SRM
Conditions for the SRM – CAPM Example
Linearity condition is satisfied; no pattern in the residuals. Data are shifted to the right because of two outliers (well-known declines in the market).

21.2 Conditions for the SRM
Conditions for the SRM – CAPM Example
No obvious lurking variable (according to CAPM theory). Similar variances condition is satisfied. Check the plot of residuals versus x for any fan-shaped pattern (none visible).

21.2 Conditions for the SRM
Conditions for the SRM – CAPM Example
Evidently independent. No dependence is apparent in the timeplot of the residuals.

21.2 Conditions for the SRM
Conditions for the SRM – CAPM Example
The residuals are not normally distributed. Check the sample size condition (satisfied) to use the CLT.

21.2 Conditions for the SRM
Modeling Process
Before looking at plots, ask two questions:
1. Does a linear relationship make sense?
2. Is the relationship free of lurking variables?
Then begin working with data.

21.2 Conditions for the SRM
Modeling Process
- Plot y versus x and verify a linear association.
- Fit the least squares line and obtain residuals.
- Plot the residuals versus x.
- If the data are a time series, construct a timeplot of the residuals.
- Inspect the histogram and quantile plot of the residuals.
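The "fit the least squares line and obtain residuals" step can be sketched as follows. The helper names `fit_line` and `residuals` and the small data set are hypothetical; plotting is left out so the sketch needs only the standard library.

```python
from statistics import mean

def fit_line(xs, ys):
    """Return the least squares intercept b0 and slope b1."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1

def residuals(xs, ys, b0, b1):
    """Observed response minus fitted value, one per observation."""
    return [y - (b0 + b1 * x) for x, y in zip(xs, ys)]

# Hypothetical data, for illustration only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

b0, b1 = fit_line(xs, ys)
res = residuals(xs, ys, b0, b1)
print(f"b0 = {b0:.2f}, b1 = {b1:.2f}")
print("residuals:", [round(e, 2) for e in res])
```

The residuals returned here are what the later steps plot against x, against time, and in a histogram or quantile plot.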

21.3 Inference in Regression
Parameters and Estimates for SRM

21.3 Inference in Regression
Standard Errors
- Describe the sample-to-sample variability of $b_0$ and $b_1$.
- The estimated standard error of $b_1$ is
  $\mathrm{se}(b_1) = \dfrac{s_e}{\sqrt{n-1}\; s_x}$

21.3 Inference in Regression
Estimated Standard Error of $b_1$
Influenced by:
- Standard deviation of the residuals ($s_e$). As it increases, the standard error increases.
- Sample size ($n$). As it increases, the standard error decreases.
- Standard deviation of x ($s_x$). As it increases, the standard error decreases.
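A short sketch of how the three quantities move the standard error of the slope, assuming the usual formula se(b1) = s_e / (sqrt(n - 1) * s_x). The function name `se_slope` and the input values are hypothetical.

```python
from math import sqrt

def se_slope(s_e, n, s_x):
    """Estimated standard error of the slope: s_e / (sqrt(n - 1) * s_x)."""
    return s_e / (sqrt(n - 1) * s_x)

# Hypothetical baseline: residual SD 2.0, 26 observations, SD of x equal to 0.5.
base = se_slope(s_e=2.0, n=26, s_x=0.5)
more_noise = se_slope(s_e=4.0, n=26, s_x=0.5)    # larger residual SD
more_data = se_slope(s_e=2.0, n=101, s_x=0.5)    # larger sample
more_spread = se_slope(s_e=2.0, n=26, s_x=1.0)   # more variation in x

print(f"baseline:     {base:.3f}")
print(f"noisier data: {more_noise:.3f}")
print(f"larger n:     {more_data:.3f}")
print(f"wider x:      {more_spread:.3f}")
```

Only the noisier data raise the standard error; a larger sample or more variation in x lowers it.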

21.3 Inference in Regression
Software Results for CAPM Example

21.3 Inference in Regression
Confidence Intervals
The 95% confidence interval for $\beta_1$ is
  $b_1 \pm t_{0.025,\,n-2}\; \mathrm{se}(b_1)$
The 95% confidence interval for $\beta_0$ is
  $b_0 \pm t_{0.025,\,n-2}\; \mathrm{se}(b_0)$
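As a minimal sketch, the interval is the estimate plus or minus a t critical value times the standard error. The Python standard library has no t distribution, so this sketch assumes the caller looks up the critical value in a t-table; the function name and the numbers are hypothetical.

```python
def confidence_interval(estimate, std_error, t_crit):
    """Return (low, high) for estimate +/- t_crit * std_error."""
    margin = t_crit * std_error
    return estimate - margin, estimate + margin

# Hypothetical slope estimate and standard error from a fit with n = 5,
# so there are n - 2 = 3 degrees of freedom and t_{0.025, 3} = 3.182.
b1, se_b1 = 1.99, 0.0597
t_crit = 3.182

low, high = confidence_interval(b1, se_b1, t_crit)
print(f"95% CI for the slope: {low:.3f} to {high:.3f}")
```

The same function applies to the intercept by passing b0 and se(b0) instead.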

21.3 Inference in Regression
Confidence Intervals – CAPM Example
The 95% confidence interval for $\beta_1$ is [interval not preserved in this transcript].
The 95% confidence interval for $\beta_0$ is [interval not preserved in this transcript].

21.3 Inference in Regression
Hypothesis Tests
To test $H_0: \beta_1 = 0$, use
  $t = b_1 / \mathrm{se}(b_1)$
To test $H_0: \beta_0 = 0$, use
  $t = b_0 / \mathrm{se}(b_0)$
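The test statistic is the estimate divided by its standard error, compared against a t critical value with n - 2 degrees of freedom. A small sketch, with hypothetical function names and inputs, and with the critical value again assumed to come from a t-table:

```python
def t_statistic(estimate, std_error, null_value=0.0):
    """Number of standard errors separating the estimate from the null value."""
    return (estimate - null_value) / std_error

def rejects_null(t_stat, t_crit):
    """Two-sided test: reject H0 when |t| exceeds the critical value."""
    return abs(t_stat) > t_crit

# Hypothetical slope estimate and standard error; 3 degrees of freedom.
t_slope = t_statistic(1.99, 0.0597)
print(f"t = {t_slope:.2f}, reject H0: {rejects_null(t_slope, t_crit=3.182)}")
```

Rejecting H0 at the 5% level is equivalent to zero falling outside the 95% confidence interval for the same parameter.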

21.3 Inference in Regression
Hypothesis Tests – CAPM Example
- The t-statistic of 9.29 (p-value < [value not preserved in this transcript]) indicates that the slope is significantly different from zero.
- The t-statistic of 4.11 (p-value < [value not preserved in this transcript]) indicates that the intercept is significantly different from zero.

4M Example 21.1: LOCATING A FRANCHISE OUTLET
Motivation
Does traffic volume affect gasoline sales? How much more gasoline can be expected to be sold at a franchise location with an average of 40,000 drive-bys compared to one with an average of 32,000 drive-bys?

4M Example 21.1: LOCATING A FRANCHISE OUTLET
Method
Use sales data from a recent month obtained from 80 franchise outlets. The 95% confidence interval for 8,000 times the estimated slope will indicate how much more gas is expected to sell at the busier location.

4M Example 21.1: LOCATING A FRANCHISE OUTLET
Method
Association is linear; no obvious lurking variable.

4M Example 21.1: LOCATING A FRANCHISE OUTLET
Mechanics

4M Example 21.1: LOCATING A FRANCHISE OUTLET
Mechanics
Residual plot confirms similar variances.

4M Example 21.1: LOCATING A FRANCHISE OUTLET
Mechanics
Residuals appear normally distributed.

4M Example 21.1: LOCATING A FRANCHISE OUTLET
Mechanics
The 95% confidence interval for $\beta_1$ is approximately 0.188 to 0.285 gallons/car (the endpoints below divided by 8,000). Hence, a difference of 8,000 cars in daily traffic volume implies a difference in average daily sales of approximately 1,507 to 2,281 more gallons per day.
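The arithmetic behind this step is a rescaling: multiplying both endpoints of a confidence interval for the slope by a positive constant gives the interval for that multiple of the slope. The sketch below works backward from the two gallon totals stated on the slide; the function name `scale_interval` is hypothetical.

```python
def scale_interval(interval, factor):
    """Scale both endpoints of (low, high) by a positive factor."""
    low, high = interval
    return low * factor, high * factor

extra_cars = 8_000              # 40,000 minus 32,000 drive-bys
gallons = (1_507, 2_281)        # interval for extra daily sales, from the slide

# Dividing by 8,000 recovers the per-car interval for the slope.
per_car = scale_interval(gallons, 1 / extra_cars)
print(f"{per_car[0]:.3f} to {per_car[1]:.3f} gallons per car")

# Multiplying back by 8,000 returns the interval in gallons per day.
back = scale_interval(per_car, extra_cars)
print(f"{back[0]:.0f} to {back[1]:.0f} gallons per day")
```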

4M Example 21.1: LOCATING A FRANCHISE OUTLET
Message
Based on a sample of 80 gas stations, we expect that a station located at a site with 40,000 drive-bys will sell on average from 1,507 to 2,281 more gallons of gas daily than a location with 32,000 drive-bys.

21.4 Prediction Intervals
Leveraging the SRM
- Prediction interval: an interval designed to hold a fraction (usually 95%) of the values of the response for a given value of x.
- A prediction interval differs from a confidence interval because it makes a statement about the location of a new observation rather than a parameter of a population.

21.4 Prediction Intervals
Leveraging the SRM
The 95% prediction interval for $y_{new}$ is
  $\hat{y}_{new} \pm t_{0.025,\,n-2}\; \mathrm{se}(\hat{y}_{new})$
where
  $\hat{y}_{new} = b_0 + b_1 x_{new}$
and
  $\mathrm{se}(\hat{y}_{new}) = s_e \sqrt{1 + \dfrac{1}{n} + \dfrac{(x_{new} - \bar{x})^2}{(n-1)\, s_x^2}}$

21.4 Prediction Intervals
Leveraging the SRM
- A simple approximation for a 95% prediction interval is $\hat{y}_{new} \pm 2 s_e$.
- Prediction intervals are reliable within the range of observed data. They are also sensitive to the assumptions of constant variance and normality.
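The following sketch computes both the prediction interval from the standard formula and the rough "plus or minus two s_e" version. The function name, the tiny data set, and the t-table value are assumptions made for illustration; with only five points the rough version is much too narrow, which is why it is normally reserved for moderately large samples.

```python
from math import sqrt
from statistics import mean

def prediction_interval(xs, ys, x_new, t_crit):
    """Return (y_hat, exact_interval, rough_interval) for a new observation at x_new."""
    n = len(xs)
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)  # equals (n - 1) * s_x^2
    b1 = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sxx
    b0 = y_bar - b1 * x_bar
    sse = sum((y - b0 - b1 * x) ** 2 for x, y in zip(xs, ys))
    s_e = sqrt(sse / (n - 2))                # SD of the residuals
    y_hat = b0 + b1 * x_new
    se_pred = s_e * sqrt(1 + 1 / n + (x_new - x_bar) ** 2 / sxx)
    exact = (y_hat - t_crit * se_pred, y_hat + t_crit * se_pred)
    rough = (y_hat - 2 * s_e, y_hat + 2 * s_e)
    return y_hat, exact, rough

# Hypothetical data; t_{0.025, 3} = 3.182 taken from a t-table.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

y_mid, exact_mid, rough_mid = prediction_interval(xs, ys, x_new=3.0, t_crit=3.182)
y_far, exact_far, rough_far = prediction_interval(xs, ys, x_new=10.0, t_crit=3.182)

print(f"x = 3:  prediction {y_mid:.2f}, interval {exact_mid[0]:.2f} to {exact_mid[1]:.2f}")
print(f"x = 10: prediction {y_far:.2f}, interval {exact_far[0]:.2f} to {exact_far[1]:.2f}")
```

The interval at x = 10, outside the observed range, is wider than the one at the mean of x, which reflects the warning about extrapolation.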

4M Example 21.2: MANAGING NATURAL RESOURCES
Motivation
In managing commercial fishing fleets, the level of effort (number of boat-days) is assumed to influence the size of the catch. What is the predicted crab catch in a season with 7,500 days of effort?

4M Example 21.2: MANAGING NATURAL RESOURCES
Method
Use regression with Y equal to the catch near Vancouver Island from 1980 to 2007, measured in thousands of pounds of Dungeness crabs, and X equal to the level of effort (total number of days by boats catching Dungeness crabs).

4M Example 21.2: MANAGING NATURAL RESOURCES
Method
Linear association is evident.

4M Example 21.2: MANAGING NATURAL RESOURCES
Mechanics

4M Example 21.2: MANAGING NATURAL RESOURCES
Mechanics
Evidently independent.

4M Example 21.2: MANAGING NATURAL RESOURCES
Mechanics
Similar variances confirmed.

4M Example 21.2: MANAGING NATURAL RESOURCES
Mechanics
Nearly normal condition could be satisfied.

4M Example 21.2: MANAGING NATURAL RESOURCES
Mechanics
The t-statistic (and p-value) indicate that the slope is significantly different from zero. The predicted catch in a year with x = 7,500 days of effort is 1,173.24 thousand pounds. The 95% prediction interval is from 908.44 to 1,438.11 thousand pounds.

4M Example 21.2: MANAGING NATURAL RESOURCES
Message
There is a statistically significant linear association between days of effort and total catch. On average, each additional day of effort (per boat) increases the harvest by about 160 pounds. In a season with 7,500 days of effort, there is an expected total harvest of 1,173,240 pounds. There is a 95% probability that the catch will be between 908,440 and 1,438,110 pounds.

Best Practices
- Verify that your model makes sense, both visually and substantively.
- Consider other possible explanatory variables.
- Check the conditions, in the listed order.

Best Practices (Continued)
- Use confidence intervals to express what you know about the slope and intercept.
- Check the assumptions of the SRM carefully before using prediction intervals.
- Be careful when extrapolating.

Pitfalls
- Don’t overreact to residual plots.
- Do not mistake varying amounts of data for unequal variances.
- Do not confuse confidence intervals with prediction intervals.
- Do not expect that $r^2$ and $s_e$ must improve with a larger sample.