Statistics Measures of Regression and Prediction Intervals
Warm-up Which value of r best describes the scatterplot below? a) 0.9 b) -0.9 c) 0.3 d) -0.3 e) 0
Warm-up A least-squares regression equation was created from last year's student data to predict Exam 3 scores based on Exam 1 scores. The equation was: Predict the score on Exam 3 for a student who scored an 80 on Exam 1. a) 69.70 b) 79.89 c) 85.53 d) 89.33 e) 90.53
Objectives Interpret the three types of variation about a regression line. Find and interpret the coefficient of determination. Find and interpret the standard error of estimate for a regression line.
Variation About a Regression Line There are three types of variation about a regression line: total variation, explained variation, and unexplained variation. To find the total variation, you must first calculate the total deviation, the explained deviation, and the unexplained deviation.
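Variation About a Regression Line [Figure: a data point (x_i, y_i), the corresponding point (x_i, ŷ_i) on the regression line, and the horizontal line y = ȳ, with the total, explained, and unexplained deviations marked.] Total deviation = y_i − ȳ. Explained deviation = ŷ_i − ȳ. Unexplained deviation = y_i − ŷ_i.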
Variation About a Regression Line Total variation: the sum of the squares of the differences between the y-value of each ordered pair and the mean of y. Total variation = Σ(y_i − ȳ)². Explained variation: the sum of the squares of the differences between each predicted y-value and the mean of y. Explained variation = Σ(ŷ_i − ȳ)².
Variation About a Regression Line Unexplained variation: the sum of the squares of the differences between the y-value of each ordered pair and the corresponding predicted y-value. Unexplained variation = Σ(y_i − ŷ_i)². The sum of the explained and unexplained variation is equal to the total variation: Total variation = Explained variation + Unexplained variation.
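The three sums of squares can be computed directly. The following is a minimal Python sketch, assuming a small set of hypothetical (x, y) pairs invented purely for illustration (they are not the data from the slides); the helper names `least_squares_line` and `variations` are likewise assumptions. It fits a least-squares line and checks that the explained and unexplained variation add up to the total variation.

```python
# Hypothetical (x, y) pairs, invented for illustration only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]


def least_squares_line(xs, ys):
    """Return (slope, intercept) of the least-squares regression line."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def variations(xs, ys):
    """Return (total, explained, unexplained) variation about the fitted line."""
    slope, intercept = least_squares_line(xs, ys)
    y_bar = sum(ys) / len(ys)
    y_hats = [slope * x + intercept for x in xs]
    total = sum((y - y_bar) ** 2 for y in ys)                      # sum of (y_i - y_bar)^2
    explained = sum((y_hat - y_bar) ** 2 for y_hat in y_hats)      # sum of (y_hat_i - y_bar)^2
    unexplained = sum((y - y_hat) ** 2 for y, y_hat in zip(ys, y_hats))  # sum of (y_i - y_hat_i)^2
    return total, explained, unexplained


total, explained, unexplained = variations(xs, ys)
print("total       =", total)
print("explained   =", explained)
print("unexplained =", unexplained)
print("explained + unexplained =", explained + unexplained)
```

The identity holds only when the predicted values come from the least-squares line itself; for an arbitrary line the two parts need not add up to the total.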
Coefficient of Determination Coefficient of determination: the ratio of the explained variation to the total variation, denoted by r². r² = Explained variation / Total variation.
Example: Coefficient of Determination The correlation coefficient for the advertising expenses and company sales data is r ≈ 0.913. Find the coefficient of determination. What does this tell you about the explained variation of the data about the regression line? About the unexplained variation? Solution: r² ≈ (0.913)² ≈ 0.834. About 83.4% of the variation in the company sales can be explained by the variation in the advertising expenditures. About 16.6% of the variation is unexplained.
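The arithmetic in this example can be checked in a few lines of Python. This is only a sketch of the calculation; the variable names are assumptions, and the single input is the value r ≈ 0.913 quoted in the example.

```python
r = 0.913  # correlation coefficient quoted in the example

r_squared = r ** 2                     # coefficient of determination
unexplained_fraction = 1 - r_squared   # share of variation left unexplained

print(f"r^2         = {r_squared:.3f}")             # 0.834
print(f"unexplained = {unexplained_fraction:.3f}")  # 0.166
```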
The Standard Error of Estimate Standard error of estimate: the standard deviation of the observed y_i-values about the predicted ŷ-value for a given x_i-value, denoted by s_e. s_e = √( Σ(y_i − ŷ_i)² / (n − 2) ), where n is the number of ordered pairs in the data set. The closer the observed y-values are to the predicted y-values, the smaller the standard error of estimate will be.
The Standard Error of Estimate (In Words / In Symbols)
1. Make a table that includes the column headings x_i, y_i, ŷ_i, (y_i − ŷ_i), and (y_i − ŷ_i)².
2. Use the regression equation to calculate the predicted y-values: ŷ_i = m·x_i + b.
3. Calculate the sum of the squares of the differences between each observed y-value and the corresponding predicted y-value: Σ(y_i − ŷ_i)².
4. Find the standard error of estimate: s_e = √( Σ(y_i − ŷ_i)² / (n − 2) ).
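The four steps translate naturally into a short function. The sketch below is one possible Python rendering, not the presentation's own code: the function name `standard_error_of_estimate`, the hypothetical data, and the line ŷ = 1.8x + 0.5 (the least-squares line for that hypothetical data) are all assumptions made for illustration.

```python
import math


def standard_error_of_estimate(xs, ys, slope, intercept):
    """Return (s_e, rows), where rows is the table built in step 1."""
    n = len(xs)
    if n <= 2:
        raise ValueError("need more than two ordered pairs (n - 2 must be positive)")

    rows = []
    for x, y in zip(xs, ys):
        y_hat = slope * x + intercept      # step 2: predicted y-value
        residual = y - y_hat               # observed minus predicted
        rows.append((x, y, y_hat, residual, residual ** 2))

    sum_sq = sum(row[4] for row in rows)   # step 3: sum of squared differences
    s_e = math.sqrt(sum_sq / (n - 2))      # step 4: standard error of estimate
    return s_e, rows


# Hypothetical data, invented for illustration; 1.8x + 0.5 is its least-squares line.
xs = [1, 2, 3, 4]
ys = [2.5, 3.5, 6.5, 7.5]
s_e, rows = standard_error_of_estimate(xs, ys, slope=1.8, intercept=0.5)

for row in rows:
    print(row)
print("s_e =", s_e)
```

Returning the table rows alongside s_e keeps the intermediate columns available for inspection, which mirrors the by-hand procedure.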
Example: Standard Error of Estimate The regression equation for the advertising expenses and company sales data is ŷ = 50.729x + 104.061. Find the standard error of estimate. Solution: Use a table to calculate the sum of the squared differences between each observed y-value and the corresponding predicted y-value.
Standard Error of Estimate The standard error of estimate is related to the unexplained variation (the residuals). It measures the variation of the points about the regression line. When s_e = 0, r = 1 or r = -1 and a perfect linear relation exists. The larger s_e is, the more variability exists about the line and the weaker the relationship.
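The perfect-relation case can be illustrated numerically. The sketch below uses hypothetical points that lie exactly on a line (invented for illustration) and a helper named `fit_and_measure`, which is an assumption of this example; it fits the least-squares line and reports both r and s_e.

```python
import math


def fit_and_measure(xs, ys):
    """Fit the least-squares line and return (r, s_e)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))

    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    r = sxy / math.sqrt(sxx * syy)         # correlation coefficient

    # Unexplained variation: squared residuals about the fitted line.
    unexplained = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    s_e = math.sqrt(unexplained / (n - 2))
    return r, s_e


# Hypothetical points lying exactly on y = 3x + 1 and on y = -3x + 16.
r_up, se_up = fit_and_measure([1, 2, 3, 4], [4, 7, 10, 13])
r_down, se_down = fit_and_measure([1, 2, 3, 4], [13, 10, 7, 4])

print(r_up, se_up)
print(r_down, se_down)
```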
Solution: Standard Error of Estimate n = 8 and Σ(y_i − ŷ_i)² = 635.3463, so s_e = √( 635.3463 / (8 − 2) ) ≈ 10.290. The standard error of estimate of the company sales for a specific advertising expense is about $10.29.
Summary Interpreted the three types of variation about a regression line. Found and interpreted the coefficient of determination. Found and interpreted the standard error of estimate for a regression line.
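The final step of this solution can be reproduced in Python. The sketch uses only the two numbers quoted on the slide (n = 8 and the sum of squared differences 635.3463); the variable names are assumptions.

```python
import math

n = 8                # number of ordered pairs, from the slide
sum_sq = 635.3463    # sum of (y_i - y_hat_i)^2, from the slide

s_e = math.sqrt(sum_sq / (n - 2))
print(round(s_e, 2))  # 10.29
```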