 # Warm Up: #23, 24 on page 555 Answer each of the following questions for #23, 24: a) Is there a linear correlation? Use software then Table A-6 to prove.


Warm Up: #23, 24 on page 555. Answer each of the following questions for #23 and #24:
a) Is there a linear correlation? Use software, then Table A-6, to prove it. Are your answers the same?
b) Graph the points (don't forget axis labels). If there is a correlation, graph the LSRL and continue with c-h.
c) Find the vital statistics (r, r-squared, a, b, y-hat; don't forget to define x and y).
d) Tell me what r and r-squared mean in the context of the problem (r: form, direction, strength; r-squared: how much of the variation in y can be explained by the linear relationship with x).
e) Find the residuals.
f) Draw the residual plot. Is the regression line a good model for the data? Why?
g) For #23, predict the winning time when the temperature is 73 degrees Fahrenheit.
h) For #24, predict the height of a daughter when her mother is 66 inches tall.

The SAT essay: longer is better?
Words 460 422 402 365 357 278 236 201 168 156 133 114 108 100 403 Score 6 5 4 3 2 1 401 388 320 258 189 128 67 697 387 355 337 325 272 150 135 73

Section 10-4 Variation

Key Concept: In this section we consider a method for constructing a prediction interval, which is an interval estimate of a predicted value of y. Using paired data (x, y), we describe the variation in y that can be explained by the linear relationship between x and y and the variation that is unexplained.

Unexplained, Explained, and Total Deviation
Figure 10-9

Definitions

Total Deviation: The total deviation of (x, y) is the vertical distance y – y-bar, which is the distance between the point (x, y) and the horizontal line passing through the sample mean y-bar.

Explained Deviation: The explained deviation is the vertical distance y-hat – y-bar, which is the distance between the predicted y-value and the horizontal line passing through the sample mean y-bar.

Unexplained Deviation: The unexplained deviation is the vertical distance y – y-hat, which is the vertical distance between the point (x, y) and the regression line. (The distance y – y-hat is also called a residual, as defined in Section 10-3.)

Particulars: We can explain the discrepancy between y-bar = 9 and y-hat = 13 by noting that there is a linear relationship best described by the LSRL; this is the explained deviation (y-hat – y-bar). The discrepancy between y-hat = 13 and y = 19 can't be explained by the LSRL; this is the residual, or unexplained deviation (y – y-hat).
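As a quick check, the three values quoted on this slide (y-bar = 9, y-hat = 13, y = 19) can be plugged into the definitions of the three deviations; the total deviation should equal the sum of the other two:

$$
\begin{aligned}
\text{total deviation:} \quad & y - \bar{y} = 19 - 9 = 10 \\
\text{explained deviation:} \quad & \hat{y} - \bar{y} = 13 - 9 = 4 \\
\text{unexplained deviation:} \quad & y - \hat{y} = 19 - 13 = 6 \\
& 10 = 4 + 6
\end{aligned}
$$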

Relationships

(total deviation) = (explained deviation) + (unexplained deviation)

$$(y - \bar{y}) = (\hat{y} - \bar{y}) + (y - \hat{y})$$

(total variation) = (explained variation) + (unexplained variation)

$$\sum (y - \bar{y})^2 = \sum (\hat{y} - \bar{y})^2 + \sum (y - \hat{y})^2 \qquad \text{(Formula 10-4)}$$
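The variation relationship can be checked numerically. The sketch below is a minimal illustration, not the textbook's procedure: the five data points are made up for the purpose (they do not come from the slides), and the variable names are hypothetical. It fits the LSRL with the usual least-squares formulas and compares the total variation with the sum of the explained and unexplained variation.

```python
# Made-up illustrative points (an assumption; not data from the slides).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

n = len(xs)
x_bar = sum(xs) / n
y_bar = sum(ys) / n

# Least-squares slope (b1) and intercept (b0) of the LSRL.
b1 = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sum(
    (x - x_bar) ** 2 for x in xs
)
b0 = y_bar - b1 * x_bar
y_hats = [b0 + b1 * x for x in xs]

# Sums of squared deviations.
total = sum((y - y_bar) ** 2 for y in ys)
explained = sum((y_hat - y_bar) ** 2 for y_hat in y_hats)
unexplained = sum((y - y_hat) ** 2 for y, y_hat in zip(ys, y_hats))

# The two printed values should agree up to floating-point rounding.
print(total, explained + unexplained)
```

The identity holds only when y-hat comes from the least-squares line; with any other line the two sums generally do not add up to the total variation.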

Definition: Coefficient of determination

The coefficient of determination $r^2$ is the proportion of the variation in y that is explained by the regression line, that is, by the linear relationship between x and y:

$$r^2 = \frac{\text{explained variation}}{\text{total variation}}$$

An example is found at the bottom of page 559 of the text.

Warm Up: Day 2. Consider the following data set. Find the total variation, the explained variation, and the unexplained variation.
X Y 1 4 2 24 8 5 32

Try again! Consider the following data set. Find the total variation, the explained variation, and the unexplained variation.
X Y 1 2 3 5 4 7

Not Old Faithful again! In Section 10-2 we used the duration/interval after eruption times in Table 10-1 to find the linear correlation coefficient r. Find the coefficient of determination. Also, find the percentage of the total variation in y (time interval after eruption) that can be explained by the linear relationship between the duration of an eruption and the time interval after the eruption.
Duration: 240 120 178 234 235 269 255 220
Interval After: 92 65 72 94 83 101 87

Interpretation/New Def
86% of the total variation in time intervals after eruptions (y) can be explained by the duration times (x). The other 14% of the total variation in time intervals after eruptions can be explained by factors other than duration times.

Recall the regression equation for y-hat (x = duration in seconds, y-hat = predicted time interval). When x = 180, we predict a y-hat of ____? This single value is called a point estimate. It is our best predicted value. How accurate is it? We use prediction intervals to answer this question.
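The 86% figure and the point estimate can be reproduced with a short sketch. It rests on one loud assumption: this transcript lists eight durations but only seven interval values, so the code assumes the missing value is a repeated 94 in the sixth position. Under that assumption the results agree with the slides (about 86% and, for x = 180, about 76.9 min). The variable names are illustrative.

```python
import math

# Durations (sec) as listed on the slide.
durations = [240, 120, 178, 234, 235, 269, 255, 220]
# ASSUMPTION: only seven intervals survive in the transcript; a repeated 94
# is assumed as the sixth value so that the eight pairs line up.
intervals = [92, 65, 72, 94, 83, 94, 101, 87]

n = len(durations)
sum_x = sum(durations)
sum_y = sum(intervals)
sum_x2 = sum(x * x for x in durations)
sum_y2 = sum(y * y for y in intervals)
sum_xy = sum(x * y for x, y in zip(durations, intervals))

# Slope and intercept of the least-squares regression line.
b1 = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x**2)
b0 = sum_y / n - b1 * sum_x / n

# Linear correlation coefficient and coefficient of determination.
r = (n * sum_xy - sum_x * sum_y) / math.sqrt(
    (n * sum_x2 - sum_x**2) * (n * sum_y2 - sum_y**2)
)
r_squared = r**2

# Point estimate of the interval after an eruption lasting 180 sec.
y_hat_180 = b0 + b1 * 180

print(round(r_squared, 3), round(y_hat_180, 1))
```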

Definitions

Prediction Interval: an interval estimate of a predicted value of y. The development of a prediction interval requires a measure of the spread of sample points about the regression line.

Standard Error of Estimate: denoted by $s_e$, a measure of the differences (or distances) between the observed sample y-values and the predicted values y-hat that are obtained using the regression equation. That is, it is a collective measure of how the sample points deviate from their regression line.

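Standard Error of Estimate

$$s_e = \sqrt{\frac{\sum (y - \hat{y})^2}{n - 2}} \quad \text{or} \quad s_e = \sqrt{\frac{\sum y^2 - b_0 \sum y - b_1 \sum xy}{n - 2}} \qquad \text{(Formula 10-5)}$$

The second formula is found on page 560 of the text. An example is found on page 561.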

Example: Old Faithful Given the sample data in Table 10-1, find the standard error of estimate se for the duration/interval data. Duration 240 120 178 234 235 269 255 220 Interval After 92 65 72 94 83 101 87 This is on page 561.
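A minimal sketch of this calculation follows, using both the residual-based definition of the standard error of estimate and the shortcut form with b0 and b1. It assumes, as a labeled guess, that the interval value missing from this transcript is a repeated 94 in the sixth position; the variable names are illustrative.

```python
import math

durations = [240, 120, 178, 234, 235, 269, 255, 220]
# ASSUMPTION: the transcript shows only seven intervals; a repeated 94 is
# assumed as the sixth value.
intervals = [92, 65, 72, 94, 83, 94, 101, 87]

n = len(durations)
sum_x = sum(durations)
sum_y = sum(intervals)
sum_x2 = sum(x * x for x in durations)
sum_y2 = sum(y * y for y in intervals)
sum_xy = sum(x * y for x, y in zip(durations, intervals))

# Least-squares slope and intercept.
b1 = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x**2)
b0 = sum_y / n - b1 * sum_x / n

# First form: square root of the sum of squared residuals over n - 2.
sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(durations, intervals))
se_residuals = math.sqrt(sse / (n - 2))

# Second (shortcut) form, using the sums directly.
se_shortcut = math.sqrt((sum_y2 - b0 * sum_y - b1 * sum_xy) / (n - 2))

print(round(se_residuals, 2), round(se_shortcut, 2))
```

The two forms are algebraically equivalent for the least-squares line, so they should agree up to rounding; under the stated assumption both come out near 4.97.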

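Prediction Interval for an Individual y

$$\hat{y} - E < y < \hat{y} + E$$

where

$$E = t_{\alpha/2}\, s_e \sqrt{1 + \frac{1}{n} + \frac{n(x_0 - \bar{x})^2}{n\left(\sum x^2\right) - \left(\sum x\right)^2}}$$

$x_0$ represents the given value of x, and $t_{\alpha/2}$ has n – 2 degrees of freedom.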

Example: Old Faithful. For the paired duration/interval after eruption times in Table 10-1, we have found that for a duration of 180 sec, the best predicted time interval after the eruption is 76.9 min. Construct a 95% prediction interval for the time interval after the eruption, given that the duration of the eruption is 180 sec (so that x = 180).
Duration: 240 120 178 234 235 269 255 220
Interval After: 92 65 72 94 83 101 87

$$E = t_{\alpha/2}\, s_e \sqrt{1 + \frac{1}{n} + \frac{n(x_0 - \bar{x})^2}{n\left(\sum x^2\right) - \left(\sum x\right)^2}}$$

This is on page 562.

Example: Old Faithful - cont
For a duration of 180 sec, the best predicted time interval after the eruption is 76.9 min, and the margin of error for the 95% prediction interval is E = 13.4:

$$\hat{y} - E < y < \hat{y} + E$$

$$76.9 - 13.4 < y < 76.9 + 13.4$$

$$63.5 < y < 90.3$$

This is on page 562.
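The interval above can be reproduced with a short sketch. Two things are assumptions here rather than content from the slides: the interval value missing from this transcript is taken to be a repeated 94 in the sixth position, and the critical value 2.447 (two-tailed 95%, 6 degrees of freedom) is typed in from a t table. The function name `prediction_interval` is hypothetical.

```python
import math


def prediction_interval(xs, ys, x0, t_crit):
    """Return (lower, y_hat, upper) for an individual y at x = x0."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_x2 = sum(x * x for x in xs)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    denom = n * sum_x2 - sum_x**2

    # Least-squares slope and intercept.
    b1 = (n * sum_xy - sum_x * sum_y) / denom
    b0 = sum_y / n - b1 * sum_x / n

    # Standard error of estimate from the residuals.
    sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))
    se = math.sqrt(sse / (n - 2))

    # Margin of error E for an individual y.
    x_bar = sum_x / n
    margin = t_crit * se * math.sqrt(1 + 1 / n + n * (x0 - x_bar) ** 2 / denom)

    y_hat = b0 + b1 * x0
    return y_hat - margin, y_hat, y_hat + margin


durations = [240, 120, 178, 234, 235, 269, 255, 220]
# ASSUMPTION: the sixth interval (a repeated 94) is missing from the transcript.
intervals = [92, 65, 72, 94, 83, 94, 101, 87]

# ASSUMPTION: t critical value for 95% confidence with n - 2 = 6 degrees of
# freedom, read from a t table.
low, y_hat, high = prediction_interval(durations, intervals, x0=180, t_crit=2.447)
print(round(low, 1), round(y_hat, 1), round(high, 1))
```

Passing a different `x0` to the same function gives the interval for another duration; the margin grows as `x0` moves away from the mean duration because of the (x0 – x-bar) term.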

Same problem, different x
Duration: 240 120 178 234 235 269 255 220
Interval After: 92 65 72 94 83 101 87

For the paired duration/interval after eruption times, find:
a) For a duration of 150 sec, the best predicted time interval after the eruption is _____ min.
b) Construct a 95% prediction interval for the time interval after the eruption, given that the duration of the eruption is 150 sec (so that x = 150).

$$E = t_{\alpha/2}\, s_e \sqrt{1 + \frac{1}{n} + \frac{n(x_0 - \bar{x})^2}{n\left(\sum x^2\right) - \left(\sum x\right)^2}}$$

$$\hat{y} - E < y < \hat{y} + E$$
