# Statistics Measures of Regression and Prediction Intervals.


Warm-up

Which value of $r$ best describes the scatterplot? [Scatterplot not included in the transcript.]

- a) 0.9
- b) -0.9
- c) 0.3
- d) -0.3
- e) 0

Warm-up

A least-squares regression equation was created from last year's student data to predict Exam 3 scores based on Exam 1 scores. The equation was: [equation not included in the transcript]. Predict the score on Exam 3 for a student who scored 80 on Exam 1.

- a) 69.70
- b) 79.89
- c) 85.53
- d) 89.33
- e) 90.53

Objectives

- Interpret the three types of variation about a regression line
- Find and interpret the coefficient of determination
- Find and interpret the standard error of estimate for a regression line

Variation About a Regression Line

Three types of variation about a regression line:

- Total variation
- Explained variation
- Unexplained variation

To find the total variation, you must first calculate:

- The total deviation
- The explained deviation
- The unexplained deviation

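Variation About a Regression Line

[Figure: x-y plot of a regression line showing an observed point $(x_i, y_i)$, the corresponding point $(x_i, \hat{y}_i)$ on the line, and the total, explained, and unexplained deviations.]

- Total deviation $= y_i - \bar{y}$
- Explained deviation $= \hat{y}_i - \bar{y}$
- Unexplained deviation $= y_i - \hat{y}_i$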

Variation About a Regression Line

Total variation: the sum of the squares of the differences between the y-value of each ordered pair and the mean of y.

$$\text{Total variation} = \sum (y_i - \bar{y})^2$$

Explained variation: the sum of the squares of the differences between each predicted y-value and the mean of y.

$$\text{Explained variation} = \sum (\hat{y}_i - \bar{y})^2$$

Variation About a Regression Line

Unexplained variation: the sum of the squares of the differences between the y-value of each ordered pair and the corresponding predicted y-value.

$$\text{Unexplained variation} = \sum (y_i - \hat{y}_i)^2$$

The sum of the explained and unexplained variation is equal to the total variation:

$$\text{Total variation} = \text{Explained variation} + \text{Unexplained variation}$$
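The decomposition can be checked numerically. The following is a minimal Python sketch, assuming the eight (x, y) pairs from the advertising and company sales example that appears later in this presentation. The variable names and the use of the standard least-squares formulas for slope and intercept are choices made for this sketch, not something the slides specify.

```python
import math

# Data assumed from the advertising/sales example table in this presentation.
xs = [2.4, 1.6, 2.0, 2.6, 1.4, 1.6, 2.0, 2.2]
ys = [225, 184, 220, 240, 180, 184, 186, 215]

n = len(xs)
x_mean = sum(xs) / n
y_mean = sum(ys) / n

# Least-squares slope and intercept (standard textbook formulas).
s_xy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
s_xx = sum((x - x_mean) ** 2 for x in xs)
slope = s_xy / s_xx
intercept = y_mean - slope * x_mean

# Predicted y-values on the fitted line.
predicted = [slope * x + intercept for x in xs]

# The three types of variation.
total = sum((y - y_mean) ** 2 for y in ys)
explained = sum((p - y_mean) ** 2 for p in predicted)
unexplained = sum((y - p) ** 2 for y, p in zip(ys, predicted))

print(f"total       = {total:.4f}")
print(f"explained   = {explained:.4f}")
print(f"unexplained = {unexplained:.4f}")
print("total == explained + unexplained:",
      math.isclose(total, explained + unexplained))
```

The identity holds up to floating-point error only when the predicted values come from the least-squares line itself. With rounded coefficients or rounded predictions, the two sides agree only approximately.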

Coefficient of Determination

Coefficient of determination: the ratio of the explained variation to the total variation, denoted by $r^2$.

$$r^2 = \frac{\text{Explained variation}}{\text{Total variation}}$$

Example: Coefficient of Determination

The correlation coefficient for the advertising expenses and company sales data is $r \approx 0.913$. Find the coefficient of determination. What does this tell you about the explained variation of the data about the regression line? About the unexplained variation?

Solution: $r^2 \approx (0.913)^2 \approx 0.834$. About 83.4% of the variation in the company sales can be explained by the variation in the advertising expenditures. About 16.6% of the variation is unexplained.
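The arithmetic in this example is short enough to verify directly. A small Python sketch, assuming only the value $r \approx 0.913$ given in the example (the variable names are illustrative); the unexplained share is taken as $1 - r^2$, which comes to roughly 16.6%:

```python
r = 0.913  # correlation coefficient given in the example

r_squared = r ** 2                 # coefficient of determination
unexplained_share = 1 - r_squared  # share of variation left unexplained

print(f"r^2         = {r_squared:.3f}")
print(f"explained   = {r_squared:.1%}")
print(f"unexplained = {unexplained_share:.1%}")
```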

The Standard Error of Estimate

Standard error of estimate: the standard deviation of the observed $y_i$-values about the predicted $\hat{y}$-value for a given $x_i$-value, denoted by $s_e$.

$$s_e = \sqrt{\frac{\sum (y_i - \hat{y}_i)^2}{n - 2}}$$

Here $n$ is the number of ordered pairs in the data set. The closer the observed y-values are to the predicted y-values, the smaller the standard error of estimate will be.

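The Standard Error of Estimate

| In Words | In Symbols |
|---|---|
| 1. Make a table that includes the column headings shown. | $x_i,\ y_i,\ \hat{y}_i,\ (y_i - \hat{y}_i)^2$ |
| 2. Use the regression equation to calculate the predicted y-values. | $\hat{y}_i = m x_i + b$ |
| 3. Calculate the sum of the squares of the differences between each observed y-value and the corresponding predicted y-value. | $\sum (y_i - \hat{y}_i)^2$ |
| 4. Find the standard error of estimate. | $s_e = \sqrt{\dfrac{\sum (y_i - \hat{y}_i)^2}{n - 2}}$ |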
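The four steps translate fairly directly into a short function. This is a sketch, not a definitive implementation: the function name `standard_error_of_estimate`, its parameters, and the three-point data set are hypothetical and chosen only for illustration.

```python
import math

def standard_error_of_estimate(xs, ys, slope, intercept):
    """Return s_e for the line y-hat = slope * x + intercept."""
    n = len(xs)
    if n != len(ys):
        raise ValueError("xs and ys must have the same length")
    if n < 3:
        raise ValueError("need at least three points, since the divisor is n - 2")

    # Step 2: predicted y-values from the regression equation.
    predicted = [slope * x + intercept for x in xs]

    # Step 3: sum of squared differences between observed and predicted values.
    sse = sum((y - p) ** 2 for y, p in zip(ys, predicted))

    # Step 4: standard error of estimate.
    return math.sqrt(sse / (n - 2))

# Hypothetical illustration: three points lying exactly on y = 2x + 1.
perfect_fit = standard_error_of_estimate([1, 2, 3], [3, 5, 7], slope=2, intercept=1)
print(perfect_fit)  # 0.0, because every observed value equals its prediction
```

The function takes the slope and intercept as inputs rather than fitting the line itself, which mirrors the procedure above: the regression equation is assumed to be known before step 2.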

Example: Standard Error of Estimate

The regression equation for the advertising expenses and company sales data is

$$\hat{y} = 50.729x + 104.061$$

Find the standard error of estimate.

Solution: Use a table to calculate the sum of the squared differences of each observed y-value and the corresponding predicted y-value.

Solution: Standard Error of Estimate

| $x_i$ | $y_i$ | $\hat{y}_i$ | $(y_i - \hat{y}_i)^2$ |
|---|---|---|---|
| 2.4 | 225 | 225.81 | $(225 - 225.81)^2 = 0.6561$ |
| 1.6 | 184 | 185.23 | $(184 - 185.23)^2 = 1.5129$ |
| 2.0 | 220 | 205.52 | $(220 - 205.52)^2 = 209.6704$ |
| 2.6 | 240 | 235.96 | $(240 - 235.96)^2 = 16.3216$ |
| 1.4 | 180 | 175.08 | $(180 - 175.08)^2 = 24.2064$ |
| 1.6 | 184 | 185.23 | $(184 - 185.23)^2 = 1.5129$ |
| 2.0 | 186 | 205.52 | $(186 - 205.52)^2 = 381.0304$ |
| 2.2 | 215 | 215.66 | $(215 - 215.66)^2 = 0.4356$ |
| | | | $\Sigma = 635.3463$ (unexplained variation) |
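The table can be reproduced with a few lines of Python. This sketch assumes the regression equation $\hat{y} = 50.729x + 104.061$ from the example and rounds each predicted value to two decimals, as the table appears to do. The helper name `predict` is illustrative.

```python
# Observed data from the example table.
xs = [2.4, 1.6, 2.0, 2.6, 1.4, 1.6, 2.0, 2.2]
ys = [225, 184, 220, 240, 180, 184, 186, 215]

def predict(x):
    # Regression equation from the example, rounded to two decimals
    # (assumption: this matches how the table values were produced).
    return round(50.729 * x + 104.061, 2)

# Build one row per data point: x, y, predicted y, squared difference.
rows = []
for x, y in zip(xs, ys):
    y_hat = predict(x)
    rows.append((x, y, y_hat, (y - y_hat) ** 2))

for x, y, y_hat, squared_diff in rows:
    print(f"{x:>4} {y:>4} {y_hat:>7.2f} {squared_diff:>9.4f}")

# Sum of the last column: the unexplained variation.
sse = sum(squared_diff for *_, squared_diff in rows)
print(f"sum = {sse:.4f}")
```

Using unrounded predictions would change the sum slightly, so the rounding step matters if the goal is to match the table digit for digit.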

Standard Error of Estimate

- It is related to the unexplained variation (the residuals).
- It measures the variation of the points about the regression line.
- If $s_e = 0$, then $r = 1$ or $r = -1$ and a perfect linear relationship exists.
- The larger $s_e$ is, the more variability exists and the weaker the relationship.

Solution: Standard Error of Estimate

With $n = 8$ and $\sum (y_i - \hat{y}_i)^2 = 635.3463$:

$$s_e = \sqrt{\frac{635.3463}{8 - 2}} \approx 10.290$$

The standard error of estimate of the company sales for a specific advertising expense is about \$10.29.
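The final step is a single square root. A short Python check, assuming the values $n = 8$ and $\sum (y_i - \hat{y}_i)^2 = 635.3463$ stated in the solution (the variable names are illustrative):

```python
import math

n = 8            # number of ordered pairs
sse = 635.3463   # sum of squared differences from the solution

# Divide by n - 2, then take the square root.
s_e = math.sqrt(sse / (n - 2))
print(f"s_e = {s_e:.3f}")
```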

Summary

- Interpreted the three types of variation about a regression line
- Found and interpreted the coefficient of determination
- Found and interpreted the standard error of estimate for a regression line

Homework

Pages 490-492, #2-16 (even)