Chapter 12 Simple Linear Regression
- Simple Linear Regression Model
- Least Squares Method
- Coefficient of Determination
- Model Assumptions
- Testing for Significance
- Using the Estimated Regression Equation for Estimation and Prediction
- Computer Solution
- Residual Analysis: Validating Model Assumptions

Simple Linear Regression Model
- The equation that describes how y is related to x and an error term is called the regression model.
- The simple linear regression model is: $y = \beta_0 + \beta_1 x + \epsilon$
- $\beta_0$ and $\beta_1$ are called parameters of the model.
- $\epsilon$ is a random variable called the error term.

Simple Linear Regression Equation
- The simple linear regression equation is: $E(y) = \beta_0 + \beta_1 x$
- The graph of the regression equation is a straight line.
- $\beta_0$ is the y intercept of the regression line.
- $\beta_1$ is the slope of the regression line.
- $E(y)$ is the expected value of y for a given x value.

Simple Linear Regression Equation
- Positive Linear Relationship
[Figure: regression line of $E(y)$ against x, with intercept $\beta_0$; slope $\beta_1$ is positive]

Simple Linear Regression Equation
- Negative Linear Relationship
[Figure: regression line of $E(y)$ against x, with intercept $\beta_0$; slope $\beta_1$ is negative]

Simple Linear Regression Equation
- No Relationship
[Figure: regression line of $E(y)$ against x, with intercept $\beta_0$; slope $\beta_1$ is 0]

Estimated Simple Linear Regression Equation
- The estimated simple linear regression equation is: $\hat{y} = b_0 + b_1 x$
- The graph is called the estimated regression line.
- $b_0$ is the y intercept of the line.
- $b_1$ is the slope of the line.
- $\hat{y}$ is the estimated value of y for a given x value.

Estimation Process
- Regression Model: $y = \beta_0 + \beta_1 x + \epsilon$
- Regression Equation: $E(y) = \beta_0 + \beta_1 x$
- Unknown Parameters: $\beta_0$, $\beta_1$
- Sample Data: pairs $(x_1, y_1), \ldots, (x_n, y_n)$
- Estimated Regression Equation: $\hat{y} = b_0 + b_1 x$
- Sample Statistics: $b_0$, $b_1$
- $b_0$ and $b_1$ provide estimates of $\beta_0$ and $\beta_1$.
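The estimation process can be sketched with simulated data. The parameter values, sample size, and seed below are hypothetical choices made only for illustration, and the slope is computed in deviation form, which is algebraically equivalent to the sums form of the least squares formulas.

```python
import random


def least_squares_fit(x, y):
    """Return (b0, b1) for the estimated regression equation y_hat = b0 + b1*x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1


# Hypothetical "true" parameters; in a real study these are unknown.
beta0, beta1, sigma = 10.0, 5.0, 2.0

rng = random.Random(0)  # fixed seed so the run is repeatable
x = [rng.uniform(0, 10) for _ in range(2000)]
# Regression model: y = beta0 + beta1*x + error, with a normal error term
y = [beta0 + beta1 * xi + rng.gauss(0, sigma) for xi in x]

# Sample statistics b0 and b1 estimate the unknown beta0 and beta1
b0, b1 = least_squares_fit(x, y)
print(f"b0 = {b0:.3f}, b1 = {b1:.3f}")
```

With a sample this large, the printed estimates should land close to the chosen parameter values, though never exactly on them.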

Least Squares Method
- Least Squares Criterion: $\min \sum (y_i - \hat{y}_i)^2$
where:
$y_i$ = observed value of the dependent variable for the $i$th observation
$\hat{y}_i$ = estimated value of the dependent variable for the $i$th observation

The Least Squares Method
- Slope for the Estimated Regression Equation: $b_1 = \dfrac{\sum x_i y_i - (\sum x_i \sum y_i)/n}{\sum x_i^2 - (\sum x_i)^2/n}$
- y-Intercept for the Estimated Regression Equation: $b_0 = \bar{y} - b_1 \bar{x}$
where:
$x_i$ = value of independent variable for $i$th observation
$y_i$ = value of dependent variable for $i$th observation
$\bar{x}$ = mean value for independent variable
$\bar{y}$ = mean value for dependent variable
$n$ = total number of observations
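The slope and intercept formulas translate almost line for line into code. The following is a minimal sketch; the function name and the four test points (chosen to lie exactly on y = 1 + 2x) are hypothetical.

```python
def least_squares_fit(x, y):
    """Return (b0, b1) using the sums form of the least squares formulas."""
    n = len(x)
    sum_x = sum(x)
    sum_y = sum(y)
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    sum_x2 = sum(xi * xi for xi in x)
    # Slope: numerator and denominator as in the formula for b1
    b1 = (sum_xy - sum_x * sum_y / n) / (sum_x2 - sum_x ** 2 / n)
    # Intercept: mean of y minus slope times mean of x
    b0 = sum_y / n - b1 * sum_x / n
    return b0, b1


# Hypothetical points lying exactly on the line y = 1 + 2x
b0, b1 = least_squares_fit([0, 1, 2, 3], [1, 3, 5, 7])
print(b0, b1)
```

Because the points fall exactly on a line, the fit should recover that line's intercept and slope.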

Example: Reed Auto Sales
- Simple Linear Regression
Reed Auto periodically has a special week-long sale. As part of the advertising campaign Reed runs one or more television commercials during the weekend preceding the sale. Data from a sample of 5 previous sales are shown on the next slide.

Example: Reed Auto Sales
- Simple Linear Regression
Number of TV Ads    Number of Cars Sold
1                   14
3                   24
2                   18
1                   17
3                   27

Example: Reed Auto Sales
- Slope for the Estimated Regression Equation: $b_1 = \dfrac{220 - (10)(100)/5}{24 - (10)^2/5} = \dfrac{20}{4} = 5$
- y-Intercept for the Estimated Regression Equation: $b_0 = 20 - 5(2) = 10$
- Estimated Regression Equation: $\hat{y} = 10 + 5x$

- Scatter Diagram
[Figure: scatter diagram with the estimated regression line $\hat{y}$]

The Coefficient of Determination
- Relationship Among SST, SSR, SSE: SST = SSR + SSE
$\sum (y_i - \bar{y})^2 = \sum (\hat{y}_i - \bar{y})^2 + \sum (y_i - \hat{y}_i)^2$
where:
SST = total sum of squares
SSR = sum of squares due to regression
SSE = sum of squares due to error

The Coefficient of Determination
- The coefficient of determination is: $r^2 = \text{SSR}/\text{SST}$
where:
SST = total sum of squares
SSR = sum of squares due to regression

Example: Reed Auto Sales
- Coefficient of Determination: $r^2 = \text{SSR}/\text{SST} = 100/114 = .8772$
The regression relationship is very strong because 88% of the variation in number of cars sold can be explained by the linear relationship between the number of TV ads and the number of cars sold.
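As a check on the figures 100 and 114, the three sums of squares can be recomputed from the five observations. This is a sketch that assumes the data pairs read from the example's table and the fitted line with intercept 10 and slope 5; the variable names are illustrative.

```python
# Reed Auto data: number of TV ads (x) and number of cars sold (y)
ads = [1, 3, 2, 1, 3]
cars = [14, 24, 18, 17, 27]

b0, b1 = 10, 5  # estimated regression equation from the example
y_bar = sum(cars) / len(cars)
y_hat = [b0 + b1 * x for x in ads]

sst = sum((y - y_bar) ** 2 for y in cars)               # total sum of squares
ssr = sum((yh - y_bar) ** 2 for yh in y_hat)            # due to regression
sse = sum((y - yh) ** 2 for y, yh in zip(cars, y_hat))  # due to error

r_squared = ssr / sst
print(sst, ssr, sse, round(r_squared, 4))
```

The identity SST = SSR + SSE holds for these values, which also gives SSE = 14 for the later significance tests.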

The Correlation Coefficient
- Sample Correlation Coefficient: $r_{xy} = (\text{sign of } b_1)\sqrt{r^2}$
where: $b_1$ = the slope of the estimated regression equation $\hat{y} = b_0 + b_1 x$

Example: Reed Auto Sales
- Sample Correlation Coefficient
The sign of $b_1$ in the equation $\hat{y} = 10 + 5x$ is “+”.
$r_{xy} = +\sqrt{.8772} = +.9366$
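The value +.9366 can be cross-checked in two ways: from the coefficient of determination with the sign of the slope, and directly from the deviations of x and y. The direct (Pearson) formula is not on the slides and is included here only as an assumption-labeled comparison.

```python
import math

ads = [1, 3, 2, 1, 3]
cars = [14, 24, 18, 17, 27]

# Route 1: sign of the slope times the square root of r^2
r_squared = 100 / 114
b1 = 5
r_from_r2 = math.copysign(math.sqrt(r_squared), b1)

# Route 2 (assumption: the usual Pearson formula, not shown on the slides)
n = len(ads)
x_bar = sum(ads) / n
y_bar = sum(cars) / n
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(ads, cars))
sxx = sum((x - x_bar) ** 2 for x in ads)
syy = sum((y - y_bar) ** 2 for y in cars)
r_direct = sxy / math.sqrt(sxx * syy)

print(round(r_from_r2, 4), round(r_direct, 4))
```

Both routes should agree up to floating-point rounding.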

Model Assumptions
- Assumptions About the Error Term $\epsilon$
1. The error $\epsilon$ is a random variable with mean of zero.
2. The variance of $\epsilon$, denoted by $\sigma^2$, is the same for all values of the independent variable.
3. The values of $\epsilon$ are independent.
4. The error $\epsilon$ is a normally distributed random variable.

Testing for Significance
To test for a significant regression relationship, we must conduct a hypothesis test to determine whether the value of $\beta_1$ is zero.
- Two tests are commonly used: the t Test and the F Test.
Both tests require an estimate of $\sigma^2$, the variance of $\epsilon$ in the regression model.

Testing for Significance
- An Estimate of $\sigma^2$
The mean square error (MSE) provides the estimate of $\sigma^2$, and the notation $s^2$ is also used.
$s^2 = \text{MSE} = \text{SSE}/(n-2)$
where: $\text{SSE} = \sum (y_i - \hat{y}_i)^2 = \sum (y_i - b_0 - b_1 x_i)^2$

- An Estimate of $\sigma$
To estimate $\sigma$ we take the square root of $s^2$.
$s = \sqrt{\text{MSE}} = \sqrt{\text{SSE}/(n-2)}$
The resulting $s$ is called the standard error of the estimate.
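For the Reed Auto data, both estimates follow from SSE = SST - SSR = 114 - 100 = 14 and n = 5. A short sketch of the arithmetic, with illustrative variable names:

```python
import math

sse = 14.0  # SST - SSR = 114 - 100 for the Reed Auto example
n = 5       # number of observations

mse = sse / (n - 2)  # s^2, the estimate of sigma^2
s = math.sqrt(mse)   # standard error of the estimate

print(round(mse, 3), round(s, 3))
```

The divisor n - 2 reflects the two parameters, $\beta_0$ and $\beta_1$, estimated from the sample.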

Testing for Significance: t Test
- Hypotheses: $H_0: \beta_1 = 0$, $H_a: \beta_1 \neq 0$
- Test Statistic: $t = b_1 / s_{b_1}$
where: $s_{b_1}$ = the estimated standard deviation of $b_1$

Testing for Significance: t Test
- Rejection Rule: Reject $H_0$ if $t < -t_{\alpha/2}$ or $t > t_{\alpha/2}$
where: $t_{\alpha/2}$ is based on a t distribution with n - 2 degrees of freedom

Example: Reed Auto Sales
- t Test
Hypotheses: $H_0: \beta_1 = 0$, $H_a: \beta_1 \neq 0$
Rejection Rule: For $\alpha = .05$ and d.f. = 3, $t_{.025} = 3.182$. Reject $H_0$ if $t < -3.182$ or $t > 3.182$.

Example: Reed Auto Sales
- t Test
Test Statistic: $t = b_1 / s_{b_1} = 5/1.08 = 4.63$
Conclusion: $t = 4.63 > 3.182$, so reject $H_0$.
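The value 1.08 for the standard deviation of the slope is not derived on the slides. The sketch below assumes the usual formula $s_{b_1} = s / \sqrt{\sum (x_i - \bar{x})^2}$ and takes the critical value 3.182 from the rejection rule rather than computing it.

```python
import math

ads = [1, 3, 2, 1, 3]
cars = [14, 24, 18, 17, 27]
b0, b1 = 10.0, 5.0  # estimated regression equation from the example

n = len(ads)
x_bar = sum(ads) / n
sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(ads, cars))
s = math.sqrt(sse / (n - 2))  # standard error of the estimate

# Assumption: standard formula for the estimated standard deviation of b1
sxx = sum((x - x_bar) ** 2 for x in ads)
s_b1 = s / math.sqrt(sxx)

t_stat = b1 / s_b1
t_crit = 3.182  # t_.025 with 3 degrees of freedom, as quoted on the slide
reject_h0 = abs(t_stat) > t_crit

print(round(s_b1, 2), round(t_stat, 2), reject_h0)
```

Comparing the absolute value of t with the critical value covers both tails of the two-sided test.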

Confidence Interval for $\beta_1$
- We can use a 95% confidence interval for $\beta_1$ to test the hypotheses just used in the t test.
- $H_0$ is rejected if the hypothesized value of $\beta_1$ is not included in the confidence interval for $\beta_1$.

Confidence Interval for $\beta_1$
- The form of a confidence interval for $\beta_1$ is: $b_1 \pm t_{\alpha/2} s_{b_1}$
where:
$b_1$ is the point estimate
$t_{\alpha/2} s_{b_1}$ is the margin of error
$t_{\alpha/2}$ is the t value providing an area of $\alpha/2$ in the upper tail of a t distribution with n - 2 degrees of freedom

Example: Reed Auto Sales
- Rejection Rule: Reject $H_0$ if 0 is not included in the confidence interval for $\beta_1$.
- 95% Confidence Interval for $\beta_1$: $b_1 \pm t_{\alpha/2} s_{b_1} = 5 \pm 3.182(1.08) = 5 \pm 3.44$, or 1.56 to 8.44
- Conclusion: 0 is not included in the confidence interval. Reject $H_0$.
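The interval arithmetic can be written as a small helper. This is a sketch using the rounded inputs quoted in the example; the function name is hypothetical.

```python
def confidence_interval(estimate, std_err, t_crit):
    """Return (lower, upper) for estimate +/- t_crit * std_err."""
    margin = t_crit * std_err
    return estimate - margin, estimate + margin


b1 = 5.0
s_b1 = 1.08     # rounded value used in the example
t_crit = 3.182  # t_.025 with 3 degrees of freedom, from the example

lower, upper = confidence_interval(b1, s_b1, t_crit)
zero_inside = lower <= 0 <= upper  # H0 is rejected when this is False

print(round(lower, 2), round(upper, 2), zero_inside)
```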

Testing for Significance: F Test
- Hypotheses: $H_0: \beta_1 = 0$, $H_a: \beta_1 \neq 0$
- Test Statistic: $F = \text{MSR}/\text{MSE}$

Testing for Significance: F Test
- Rejection Rule: Reject $H_0$ if $F > F_{\alpha}$
where: $F_{\alpha}$ is based on an F distribution with 1 d.f. in the numerator and n - 2 d.f. in the denominator

Example: Reed Auto Sales
- F Test
Hypotheses: $H_0: \beta_1 = 0$, $H_a: \beta_1 \neq 0$
Rejection Rule: For $\alpha = .05$ and d.f. = 1, 3: $F_{.05} = 10.13$. Reject $H_0$ if $F > 10.13$.

Example: Reed Auto Sales
- F Test
Test Statistic: $F = \text{MSR}/\text{MSE} = 100/4.667 = 21.43$
Conclusion: $F = 21.43 > 10.13$, so we reject $H_0$.
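The F statistic can be reproduced from the sums of squares. The sketch below takes MSR as SSR divided by its 1 numerator degree of freedom and uses the critical value 10.13 quoted in the example.

```python
ssr = 100.0  # sum of squares due to regression
sse = 14.0   # sum of squares due to error
n = 5        # number of observations

msr = ssr / 1        # 1 degree of freedom in the numerator
mse = sse / (n - 2)  # n - 2 degrees of freedom in the denominator
f_stat = msr / mse

f_crit = 10.13  # F_.05 with 1 and 3 degrees of freedom, from the example
reject_h0 = f_stat > f_crit

print(round(f_stat, 2), reject_h0)
```

In simple linear regression the F statistic equals the square of the t statistic for the slope, so the two tests reach the same conclusion here.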

Some Cautions about the Interpretation of Significance Tests
- Rejecting $H_0: \beta_1 = 0$ and concluding that the relationship between x and y is significant does not enable us to conclude that a cause-and-effect relationship is present between x and y.
- Just because we are able to reject $H_0: \beta_1 = 0$ and demonstrate statistical significance does not enable us to conclude that there is a linear relationship between x and y.

Using the Estimated Regression Equation for Estimation and Prediction
- Confidence Interval Estimate of $E(y_p)$: $\hat{y}_p \pm t_{\alpha/2} s_{\hat{y}_p}$
- Prediction Interval Estimate of $y_p$: $\hat{y}_p \pm t_{\alpha/2} s_{\text{ind}}$
where: the confidence coefficient is $1 - \alpha$ and $t_{\alpha/2}$ is based on a t distribution with n - 2 degrees of freedom

Example: Reed Auto Sales
- Point Estimation
If 3 TV ads are run prior to a sale, we expect the mean number of cars sold to be: $\hat{y} = 10 + 5(3) = 25$ cars

Example: Reed Auto Sales
- Confidence Interval for $E(y_p)$
The 95% confidence interval estimate of the mean number of cars sold when 3 TV ads are run is: $25 \pm 4.61$ = 20.39 to 29.61 cars

Example: Reed Auto Sales
- Prediction Interval for $y_p$
The 95% prediction interval estimate of the number of cars sold in one particular week when 3 TV ads are run is: $25 \pm 8.28$ = 16.72 to 33.28 cars
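The margins 4.61 and 8.28 are quoted without their standard deviations. The sketch below assumes the usual formulas $s_{\hat{y}_p} = s\sqrt{1/n + (x_p - \bar{x})^2 / \sum (x_i - \bar{x})^2}$ and $s_{\text{ind}} = s\sqrt{1 + 1/n + (x_p - \bar{x})^2 / \sum (x_i - \bar{x})^2}$, which do not appear in the slide text.

```python
import math

ads = [1, 3, 2, 1, 3]
cars = [14, 24, 18, 17, 27]
b0, b1 = 10.0, 5.0  # estimated regression equation from the example
x_p = 3             # number of TV ads for the estimate
t_crit = 3.182      # t_.025 with 3 degrees of freedom, from the example

n = len(ads)
x_bar = sum(ads) / n
sxx = sum((x - x_bar) ** 2 for x in ads)
sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(ads, cars))
s = math.sqrt(sse / (n - 2))

y_hat_p = b0 + b1 * x_p  # point estimate

# Assumed standard formulas for the two standard deviations
leverage = 1 / n + (x_p - x_bar) ** 2 / sxx
s_mean = s * math.sqrt(leverage)     # for the mean value E(y_p)
s_ind = s * math.sqrt(1 + leverage)  # for an individual value y_p

ci_margin = t_crit * s_mean
pi_margin = t_crit * s_ind

print(y_hat_p, round(ci_margin, 2), round(pi_margin, 2))
```

The prediction interval is wider because it accounts for the variability of a single week's sales around the mean, in addition to the uncertainty in the estimated mean.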

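Residual Analysis
- Residual for Observation i: $y_i - \hat{y}_i$
- Standardized Residual for Observation i: $\dfrac{y_i - \hat{y}_i}{s_{y_i - \hat{y}_i}}$
where: $s_{y_i - \hat{y}_i}$ = the standard deviation of residual i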

Example: Reed Auto Sales
- Residuals
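The residuals for the five observations can be recomputed from the data and the fitted line. For the standardized residuals, the sketch assumes the usual definition $s_{y_i - \hat{y}_i} = s\sqrt{1 - h_i}$ with $h_i = 1/n + (x_i - \bar{x})^2 / \sum (x_i - \bar{x})^2$, which is not spelled out in the slide text.

```python
import math

ads = [1, 3, 2, 1, 3]
cars = [14, 24, 18, 17, 27]
b0, b1 = 10.0, 5.0  # estimated regression equation from the example

n = len(ads)
x_bar = sum(ads) / n
sxx = sum((x - x_bar) ** 2 for x in ads)

# Residual for observation i: observed value minus estimated value
residuals = [y - (b0 + b1 * x) for x, y in zip(ads, cars)]

s = math.sqrt(sum(e ** 2 for e in residuals) / (n - 2))

# Assumed definition: standard deviation of residual i is s * sqrt(1 - h_i)
standardized = []
for x, e in zip(ads, residuals):
    h_i = 1 / n + (x - x_bar) ** 2 / sxx
    standardized.append(e / (s * math.sqrt(1 - h_i)))

for x, y, e, z in zip(ads, cars, residuals, standardized):
    print(x, y, e, round(z, 3))
```

The residuals sum to zero, as they must for a least squares line that includes an intercept.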

Example: Reed Auto Sales
- Residual Plot

Residual Analysis
- Residual Plot
[Figure: residual plotted against x with a reference line at 0; panel title: Good Pattern]

Residual Analysis
- Residual Plot
[Figure: residual plotted against x with a reference line at 0; panel title: Nonconstant Variance]

Residual Analysis
- Residual Plot
[Figure: residual plotted against x with a reference line at 0; panel title: Model Form Not Adequate]
