Presentation on theme: "Warm Up: #23, 24 on page 555 Answer each of the following questions for #23, 24: a) Is there a linear correlation? Use software then Table A-6 to prove."— Presentation transcript:
1 Warm Up: #23, 24 on page 555
Answer each of the following questions for #23, 24:
a) Is there a linear correlation? Use software, then Table A-6, to prove it. Are your answers the same?
b) Graph the points (don’t forget axis labels). If there is a correlation, graph the LSRL and continue with c-h.
c) Find the vital statistics (r, r-squared, a, b, y-hat; don’t forget to define x and y).
d) Tell me what r and r-squared mean in the context of the problem (r: form, direction, strength) (r-squared: how much of the variation in y can be explained by the linear relationship with x).
e) Find the residuals.
f) Draw the residual plot. Is the regression line a good model for the data? Why?
g) For #23, predict the winning time when the temperature is 73 degrees Fahrenheit.
h) For #24, predict the height of a daughter when her mother is 66 inches tall.
2 The SAT essay: longer is better? Words460422402365357278236201168156133114108100403Score6543214013883202581891286769738735533732527215013573
4 Key Concept
In this section we consider a method for constructing a prediction interval, which is an interval estimate of a predicted value of y.
Using paired data (x, y), we describe the variation in y that can be explained by the linear relationship between x and y and the variation that is unexplained.
5 Unexplained, Explained, and Total Deviation Figure 10-9
6 Definitions
Total Deviation: The total deviation of (x, y) is the vertical distance y – y-bar, which is the distance between the point (x, y) and the horizontal line passing through the sample mean y-bar.
Explained Deviation: The explained deviation is the vertical distance y-hat – y-bar, which is the distance between the predicted y-value and the horizontal line passing through the sample mean y-bar.
Unexplained Deviation: The unexplained deviation is the vertical distance y – y-hat, which is the vertical distance between the point (x, y) and the regression line. (The distance y – y-hat is also called a residual, as defined in Section 10-3.)
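7 Particulars
We can explain the discrepancy between y-bar = 9 and y-hat = 13 by noting that there is a linear relationship best described by the LSRL. This is the explained deviation (y-hat – y-bar = 13 – 9 = 4).
The discrepancy between y-hat = 13 and y = 19 can’t be explained by the LSRL. It is the residual, or unexplained deviation (y – y-hat = 19 – 13 = 6).
Together they make up the total deviation (y – y-bar = 19 – 9 = 10 = 4 + 6).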
9 Definition
The coefficient of determination is the proportion of the variation in y that is explained by the regression line.
$$r^2 = \frac{\text{explained variation}}{\text{total variation}}$$
Example found at bottom of page 559 of text.
The value of $r^2$ is the proportion of the variation in y that is explained by the linear relationship between x and y.
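To make the ratio concrete, here is a minimal Python sketch. It assumes the usual definitions, in which each "variation" is the sum of the squared deviations of that type over all sample points. The data set and the helper names `fit_line` and `variations` are made up for illustration and are not from the text.

```python
def fit_line(xs, ys):
    """Return (b0, b1) for the least-squares line y-hat = b0 + b1*x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1


def variations(xs, ys):
    """Return (total, explained, unexplained) variation as sums of squares."""
    b0, b1 = fit_line(xs, ys)
    y_bar = sum(ys) / len(ys)
    y_hat = [b0 + b1 * x for x in xs]
    total = sum((y - y_bar) ** 2 for y in ys)                   # sum of (y - y-bar)^2
    explained = sum((yh - y_bar) ** 2 for yh in y_hat)          # sum of (y-hat - y-bar)^2
    unexplained = sum((y - yh) ** 2 for y, yh in zip(ys, y_hat))  # sum of (y - y-hat)^2
    return total, explained, unexplained


# Hypothetical illustration data (not from the textbook).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

total, explained, unexplained = variations(xs, ys)
r_squared = explained / total
print(f"total={total:.2f} explained={explained:.2f} "
      f"unexplained={unexplained:.2f} r^2={r_squared:.2f}")
```

For a least-squares line the explained and unexplained variations add up to the total variation, which is why the ratio always falls between 0 and 1.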
10 Warm Up: Day 2
Consider the following data set:
XY142248532
Find:
Total variation
Explained variation
Unexplained variation
11 Try again!
Consider the following data set:
XY123547
Find:
Total variation
Explained variation
Unexplained variation
12 Not Old Faithful again!
In Section 10-2 we used the duration/interval after eruption times in Table 10-1 to find that r = ____. Find the coefficient of determination. Also, find the percentage of the total variation in y (time interval after eruption) that can be explained by the linear relationship between the duration of time and the time interval after an eruption.
Duration: 240 120 178 234 235 269 255 220
Interval After: 92 65 72 94 83 101 87
13 Interpretation/New Def
86% of the total variation in time intervals after eruptions (y) can be explained by the duration times (x).
14% of the total variation in time intervals after eruptions can be explained by factors other than duration times.
Recall the regression equation y-hat = b0 + b1 x (x = duration in seconds, y-hat = predicted time interval). When x = 180, we predict a y-hat of ____?
This single value is called a point estimate. It is our best predicted value. How accurate is it?
We use prediction intervals to answer this question.
14 Definitions
Prediction Interval: an interval estimate of a predicted value of y. The development of a prediction interval requires a measure of the spread of sample points about the regression line.
The standard error of estimate, denoted by s_e, is a measure of the differences (or distances) between the observed sample y-values and the predicted values y-hat that are obtained using the regression equation. That is, it is a collective measure of the spread of the sample points about the regression line.
s_e = a measure of how sample points deviate from their regression line.
15 Standard Error of Estimate
$$s_e = \sqrt{\frac{\sum (y - \hat{y})^2}{n - 2}}$$
or
$$s_e = \sqrt{\frac{\sum y^2 - b_0 \sum y - b_1 \sum xy}{n - 2}}$$
Formula 10-5
Second formula is found on page 560 of text.
Example found on page 561.
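As a quick check that the two forms of the standard error agree, the following Python sketch evaluates both on a small made-up data set. The data and the function name `standard_error` are assumptions for illustration, not taken from the text.

```python
import math


def standard_error(xs, ys):
    """Return s_e computed two ways: from residuals and from the shortcut form."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)

    # Least-squares slope b1 and intercept b0.
    b1 = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    b0 = (sum_y - b1 * sum_x) / n

    # First form: square root of the sum of squared residuals over n - 2.
    sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))
    se_residual = math.sqrt(sse / (n - 2))

    # Second form: uses only the sums, b0, and b1.
    se_shortcut = math.sqrt((sum_y2 - b0 * sum_y - b1 * sum_xy) / (n - 2))
    return se_residual, se_shortcut


# Hypothetical illustration data (not from the textbook).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

se_residual, se_shortcut = standard_error(xs, ys)
print(f"s_e from residuals: {se_residual:.4f}")
print(f"s_e from shortcut:  {se_shortcut:.4f}")
```

The shortcut form avoids computing each residual, which helps when working by hand, but it is more sensitive to rounding b0 and b1 early.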
16 Example: Old Faithful
Given the sample data in Table 10-1, find the standard error of estimate s_e for the duration/interval data.
Duration: 240 120 178 234 235 269 255 220
Interval After: 92 65 72 94 83 101 87
This is on page 561.
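17 Prediction Interval for an Individual y
$$\hat{y} - E < y < \hat{y} + E$$
where
$$E = t_{\alpha/2}\, s_e \sqrt{1 + \frac{1}{n} + \frac{n(x_0 - \bar{x})^2}{n\left(\sum x^2\right) - \left(\sum x\right)^2}}$$
$x_0$ represents the given value of x.
$t_{\alpha/2}$ has n – 2 degrees of freedom.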
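The margin of error E can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the data set is made up, the function name `prediction_interval` is hypothetical, and the critical t value is passed in by hand (here 3.182, the two-tailed 95% value for 3 degrees of freedom from a standard t table) rather than computed.

```python
import math


def prediction_interval(xs, ys, x0, t_crit):
    """Return (lower, y_hat0, upper) for an individual y at x = x0.

    t_crit is the critical t value with n - 2 degrees of freedom,
    looked up from a t table by the caller.
    """
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)

    # Least-squares slope and intercept.
    b1 = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    b0 = (sum_y - b1 * sum_x) / n

    # Point estimate at x0.
    y_hat0 = b0 + b1 * x0

    # Standard error of estimate from the residuals.
    sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))
    se = math.sqrt(sse / (n - 2))

    # Margin of error E for an individual y.
    x_bar = sum_x / n
    margin = t_crit * se * math.sqrt(
        1 + 1 / n + n * (x0 - x_bar) ** 2 / (n * sum_x2 - sum_x ** 2)
    )
    return y_hat0 - margin, y_hat0, y_hat0 + margin


# Hypothetical illustration data (not from the textbook).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

# n = 5, so n - 2 = 3 degrees of freedom; 3.182 is the 95% two-tailed t value.
lower, y_hat0, upper = prediction_interval(xs, ys, x0=3, t_crit=3.182)
print(f"{lower:.2f} < y < {upper:.2f} (point estimate {y_hat0:.2f})")
```

Because of the $(x_0 - \bar{x})^2$ term, the interval is narrowest when $x_0$ equals the sample mean of x and widens as $x_0$ moves away from it.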
18 Example: Old Faithful
For the paired duration/interval after eruption times in Table 10-1, we have found that for a duration of 180 sec, the best predicted time interval after the eruption is 76.9 min. Construct a 95% prediction interval for the time interval after the eruption, given that the duration of the eruption is 180 sec (so that x = 180).
Duration: 240 120 178 234 235 269 255 220
Interval After: 92 65 72 94 83 101 87
$$E = t_{\alpha/2}\, s_e \sqrt{1 + \frac{1}{n} + \frac{n(x_0 - \bar{x})^2}{n\left(\sum x^2\right) - \left(\sum x\right)^2}}$$
This is on page 562.
19 Example: Old Faithful - cont
For the paired duration/interval after eruption times in Table 10-1, we have found that for a duration of 180 sec, the best predicted time interval after the eruption is 76.9 min. Construct a 95% prediction interval for the time interval after the eruption, given that the duration of the eruption is 180 sec (so that x = 180).
y-hat – E < y < y-hat + E
76.9 – 13.4 < y < 76.9 + 13.4
63.5 < y < 90.3
This is on page 562.
20 Same problem, different x
Duration: 240 120 178 234 235 269 255 220
Interval After: 92 65 72 94 83 101 87
For the paired duration/interval after eruption times, find:
For a duration of 150 sec, the best predicted time interval after the eruption is _____ min.
Construct a 95% prediction interval for the time interval after the eruption, given that the duration of the eruption is 150 sec (so that x = 150).
$$E = t_{\alpha/2}\, s_e \sqrt{1 + \frac{1}{n} + \frac{n(x_0 - \bar{x})^2}{n\left(\sum x^2\right) - \left(\sum x\right)^2}}$$
$$\hat{y} - E < y < \hat{y} + E$$