CHAPTER 3 Describing Relationships 3.2 Least-Squares Regression
Least-Squares Regression INTERPRET the slope and y intercept of a least-squares regression line. USE the least-squares regression line to predict y for a given x. CALCULATE and INTERPRET residuals and their standard deviation. EXPLAIN the concept of least squares. DETERMINE the equation of a least-squares regression line using a variety of methods. CONSTRUCT and INTERPRET residual plots to assess whether a linear model is appropriate. ASSESS how well the least-squares regression line models the relationship between two variables. DESCRIBE how the slope, y intercept, standard deviation of the residuals, and r 2 are influenced by outliers.
Regression Line Linear (straight-line) relationships between two quantitative variables are common and easy to understand. A regression line summarizes the relationship between two variables, but only in settings where one of the variables helps explain or predict the other. A regression line is a line that describes how a response variable y changes as an explanatory variable x changes. We often use a regression line to predict the value of y for a given value of x.
Interpreting a Regression Line A regression line is a model for the data, much like density curves. The equation of a regression line gives a compact mathematical description of what this model tells us about the relationship between the response variable y and the explanatory variable x. Suppose that y is a response variable (plotted on the vertical axis) and x is an explanatory variable (plotted on the horizontal axis). A regression line relating y to x has an equation of the form ŷ = a + bx In this equation, ŷ (read “y hat”) is the predicted value of the response variable y for a given value of the explanatory variable x. b is the slope, the amount by which y is predicted to change when x increases by one unit. a is the y intercept, the predicted value of y when x = 0.
Example: Interpreting slope and y intercept The equation of the regression line shown is PROBLEM: Identify the slope and y intercept of the regression line. Interpret each value in context. SOLUTION: The slope b = -0.1629 tells us that the price of a used Ford F-150 is predicted to go down by 0.1629 dollars (16.29 cents) for each additional mile that the truck has been driven. The y intercept a = 38,257 is the predicted price of a Ford F-150 that has been driven 0 miles. Try Exercise 39
Prediction We can use a regression line to predict the response ŷ for a specific value of the explanatory variable x. Use the regression line to predict price for a Ford F-150 with 100,000 miles driven.
Extrapolation We can use a regression line to predict the response ŷ for a specific value of the explanatory variable x. The accuracy of the prediction depends on how much the data scatter about the line. While we can substitute any value of x into the equation of the regression line, we must exercise caution in making predictions outside the observed values of x. Extrapolation is the use of a regression line for prediction far outside the interval of values of the explanatory variable x used to obtain the line. Such predictions are often not accurate. Don’t make predictions using values of x that are much larger or much smaller than those that actually appear in your data.