AP STATISTICS LESSON 3 – 3 LEAST – SQUARES REGRESSION.


1 AP STATISTICS LESSON 3 – 3 LEAST – SQUARES REGRESSION

2 Regression Line A regression line is a straight line that describes how a response variable y changes as an explanatory variable x changes. We often use a regression line to predict the value of y for a given value of x. Regression, unlike correlation, requires that we have an explanatory variable and a response variable. LSRL is the abbreviation for least-squares regression line. The LSRL is a mathematical model.

3 Least – squares Regression Line Error = observed – predicted. To find the line that fits the data best, we square the errors and add them up. The best line is the one with the smallest sum of squared errors.

4 Least – squares Regression Line The least – squares regression line of y on x is the line that makes the sum of the squares of the vertical distances of the data points from the line as small as possible.
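The definition above can be checked numerically. The following is a minimal sketch in Python, using a small data set invented purely for illustration (it is not from the lesson). The helper name sum_squared_errors is hypothetical. The sketch computes the sum of squared vertical distances for the least-squares line and for several nearby lines, and the nearby lines all give a larger total.

```python
# Hypothetical data, invented only to illustrate the definition.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]


def sum_squared_errors(a, b):
    """Sum of squared vertical distances from the points to the line y-hat = a + b*x."""
    return sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))


# Least-squares slope and intercept from the usual closed-form formulas.
n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n
b_ls = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
        / sum((xi - x_bar) ** 2 for xi in x))
a_ls = y_bar - b_ls * x_bar

best = sum_squared_errors(a_ls, b_ls)

# Nudge the intercept and/or the slope: every nearby line does worse.
nearby = [sum_squared_errors(a_ls + da, b_ls + db)
          for da in (-0.5, 0, 0.5)
          for db in (-0.2, 0, 0.2)
          if (da, db) != (0, 0)]

print("least-squares line:", best)
print("best nearby line:  ", min(nearby))
```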

5 Equation of the LSRL We have data on an explanatory variable x and a response variable y for n individuals. From the data, calculate the means x̄ and ȳ and the standard deviations s_x and s_y of the two variables, and their correlation r.
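The equation itself did not survive in this transcript. The standard textbook form is ŷ = a + bx, with slope b = r · s_y / s_x and intercept a = ȳ – b · x̄. The sketch below applies those formulas in Python to a small data set invented for illustration. The function name predict is hypothetical, and the standard deviations are assumed to be sample standard deviations (dividing by n – 1).

```python
from statistics import mean, stdev

# Hypothetical data, invented for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
n = len(x)

x_bar, y_bar = mean(x), mean(y)
s_x, s_y = stdev(x), stdev(y)  # sample standard deviations (divide by n - 1)

# Correlation r: sum of products of standardized values, divided by n - 1.
r = sum(((xi - x_bar) / s_x) * ((yi - y_bar) / s_y)
        for xi, yi in zip(x, y)) / (n - 1)

b = r * s_y / s_x      # slope of the LSRL
a = y_bar - b * x_bar  # intercept of the LSRL


def predict(x_value):
    """Predicted response y-hat for a given x on the fitted line."""
    return a + b * x_value


print("r =", r)
print("y-hat = a + b*x with a =", a, "and b =", b)
print("prediction at x = 6:", predict(6))
```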

6 What happened to y = mx+b? y represents the observed (actual) values for y, and ŷ represents the predicted values for y. We use ŷ (y hat) in the equation of the regression line to emphasize that the line gives predicted values for any x. When you are solving regression problems, be sure to distinguish between y and ŷ. Hot tip: (x̄, ȳ) is always a point on the regression line!

7 AP STATISTICS LESSON 3 – 3 (DAY 2) The role of r² in regression

8 Essential Question: How is r² used to determine the reliability of a linear regression line? Objectives: To calculate r². To find the SST and the SSE, and to find r² from them.

9 Definitions and Abbreviations r² = coefficient of determination (the proportion of the total sample variability that is explained by the least-squares regression of y on x). LSRL – Least-squares regression line. SST – (Total Sum of Squares) SST = ∑ (y – ȳ)². SSE – (Sum of squares of errors) SSE = ∑ (y – ŷ)².

10 Exercises Small r² and Large r² Page 158: Example 3.10 SMALL r² Page 160: Example 3.11 LARGE r²

11 r² in Regression The coefficient of determination, r², is the fraction of the variation in the values of y that is explained by the least-squares regression of y on x. r² = (SST – SSE) / SST
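As a worked illustration of the formula, here is a short Python sketch that computes SST, SSE, and the coefficient of determination from them. The data are invented for illustration and are not the textbook examples cited on the Exercises slide.

```python
# Hypothetical data, invented for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
n = len(x)

x_bar = sum(x) / n
y_bar = sum(y) / n

# Fit the least-squares line y-hat = a + b*x.
b = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
     / sum((xi - x_bar) ** 2 for xi in x))
a = y_bar - b * x_bar
y_hat = [a + b * xi for xi in x]

# SST: total variation of y about its mean.
sst = sum((yi - y_bar) ** 2 for yi in y)
# SSE: variation left over after fitting the line.
sse = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))

r_squared = (sst - sse) / sst

print("SST =", sst)
print("SSE =", sse)
print("coefficient of determination =", r_squared)
```

For these made-up numbers, about 60% of the variation in y is explained by the regression on x.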

12 Facts about Least-squares Regression Fact 1: The distinction between explanatory and response variable is essential in regression. Fact 2: There is a close connection between correlation and the slope of the least-squares line. Along the regression line, a change of one standard deviation in x corresponds to a change of r standard deviations in y.

13 Facts of Regression (continued) Fact 3: The least-squares regression line always passes through the point (x̄, ȳ). Fact 4: The square of the correlation, r², is the fraction of the variation in the values of y that is explained by the least-squares regression of y on x.
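Facts 2 through 4 can be verified numerically. The sketch below does so in Python on a small invented data set. The variable and function names are hypothetical, and sample standard deviations (dividing by n – 1) are assumed.

```python
from statistics import mean, stdev

# Hypothetical data, invented for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
n = len(x)

x_bar, y_bar = mean(x), mean(y)
s_x, s_y = stdev(x), stdev(y)
r = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
     / ((n - 1) * s_x * s_y))

b = r * s_y / s_x      # slope
a = y_bar - b * x_bar  # intercept


def predict(x_value):
    """Predicted response y-hat on the least-squares line."""
    return a + b * x_value


# Fact 2: moving one s_x along the line changes y-hat by r * s_y.
change_in_prediction = predict(x_bar + s_x) - predict(x_bar)
print(change_in_prediction, "vs", r * s_y)

# Fact 3: the line passes through the point of means.
prediction_at_mean = predict(x_bar)
print(prediction_at_mean, "vs", y_bar)

# Fact 4: the square of r equals the explained fraction (SST - SSE) / SST.
sst = sum((yi - y_bar) ** 2 for yi in y)
sse = sum((yi - predict(xi)) ** 2 for xi, yi in zip(x, y))
explained_fraction = (sst - sse) / sst
print(explained_fraction, "vs", r ** 2)
```

Each printed pair should agree up to floating-point rounding.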

14 AP STATISTICS LESSON 3 – 3 (DAY 3) RESIDUALS

15 ESSENTIAL QUESTION: What is a residual and what can a residual graph tell us about linear regression lines? Objective: To define and use residuals in the analysis of linear regression lines.

16 Residuals A residual is the difference between an observed value of the response variable and the value predicted by the regression line. That is, residual = observed y – predicted y = y – ŷ

17 Residual Facts The mean of the least-squares residuals is always zero. If the sum of a list of residuals reported by software is not exactly 0, that is because the software rounded the residuals (for example, to four decimal places). This is roundoff error. The horizontal line of the residual plot is at zero.
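The point about roundoff can be seen in a minimal Python sketch. The data below are invented for illustration, and the four-decimal rounding imitates the kind of software output the slide describes. The unrounded residuals sum to zero up to floating-point error, while the rounded ones may miss zero by a small amount.

```python
# Hypothetical data, invented for illustration only.
x = [1, 2, 3, 4, 5, 6, 7]
y = [1.3, 2.9, 3.2, 5.1, 5.8, 7.4, 7.9]
n = len(x)

x_bar = sum(x) / n
y_bar = sum(y) / n

# Fit the least-squares line y-hat = a + b*x.
b = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
     / sum((xi - x_bar) ** 2 for xi in x))
a = y_bar - b * x_bar

# residual = observed y - predicted y
residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]

# Imitate software that reports residuals to four decimal places.
rounded = [round(res, 4) for res in residuals]

print("sum of residuals:        ", sum(residuals))
print("sum of rounded residuals:", sum(rounded))
```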

18 Residual Plots A residual plot is a scatterplot of the regression residuals against the explanatory variable. Residual plots help us assess the fit of a regression line. If the regression line captures the overall relationship between x and y, the residuals should have no systematic pattern. The residual plot will then look something like a simplified pattern of uniform scatter: the points are spread evenly about the fitted line, with no unusual individual observations.
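To show what a systematic pattern looks like, the sketch below fits a straight line to data that are deliberately curved (y = x², an invented example) and lists the residuals in order of x. This is only an illustration in Python. Plotting the (x, residual) pairs with any graphing tool would give the residual plot itself.

```python
# Hypothetical curved data (y = x squared), chosen so a line fits poorly.
x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]
n = len(x)

x_bar = sum(x) / n
y_bar = sum(y) / n

# Fit the least-squares line y-hat = a + b*x.
b = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
     / sum((xi - x_bar) ** 2 for xi in x))
a = y_bar - b * x_bar

residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]
print(residuals)  # [2.0, -1.0, -2.0, -1.0, 2.0]

# A crude text version of the residual plot: one row per x value.
for xi, res in zip(x, residuals):
    side = "above zero" if res > 0 else "below zero"
    print(f"x = {xi}: residual {res:+.1f} ({side})")
```

The residuals are positive at both ends and negative in the middle. That curved pattern signals that a straight line does not capture the overall relationship.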

