Objective Find the line of regression. Use the Line of Regression to Make Predictions.

1 Objective Find the line of regression. Use the Line of Regression to Make Predictions.

2 Relevance To be able to find a model to best represent quantitative data with 2 variables and use it to make predictions.

3 A better alternative to “storing” numbers! 2-Variable Statistics

4 Now that we have used one variable statistics to “store” our necessary numbers, let’s learn another way that’s even better

5 Find the mean and standard deviation of the x’s and y’s using 2-var stats. xy 216 189 303 354

6 Find the mean and standard deviation of the x’s and y’s using 2-var stats. xy 216 189 303 354 Use this when using your lists to find r.

7 Find the correlation Coefficient: xy 46 815 22 1918 2227

8 Find the correlation Coefficient: xy 3227 4082 3034 1814 151 2522

9 Find the correlation Coefficient: xy 272 860 1064 1452 2843 3240 1832

10 A student wonders if tall women tend to date taller men than do short women. She measures herself, her dormitory roommate, and the women in the adjoining rooms. Then she measures the next man each woman dates. Draw & discuss the scatterplot and calculate the correlation coefficient.
Women (x)  Men (y)
66         72
64         68
66         70
65         68
70         71
65         65

11 A student wonders if tall women tend to date taller men than do short women. She measures herself, her dormitory roommate, and the women in the adjoining rooms. Then she measures the next man each woman dates. Draw & discuss the scatterplot and calculate the correlation coefficient.
Women (x)  Men (y)  z_x      z_y      z_x * z_y
66         72        0        1.1859   0
64         68       -0.9535  -0.3953   0.3769
66         70        0        0.3953   0
65         68       -0.4767  -0.3953   0.1884
70         71        1.9069   0.7906   1.5076
65         65       -0.4767  -1.581    0.7538
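The table above can be checked with a short script. This is a minimal sketch, assuming the sample standard deviation (dividing by n − 1), which is what reproduces the z-scores on the slide. The function names `z_scores` and `correlation` are made up for illustration.

```python
from statistics import mean, stdev

# Heights in inches, from the table on the slide.
women = [66, 64, 66, 65, 70, 65]
men = [72, 68, 70, 68, 71, 65]

def z_scores(values):
    """Standardize values using the sample standard deviation."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]

def correlation(xs, ys):
    """r = sum of the z-score products divided by (n - 1)."""
    zx, zy = z_scores(xs), z_scores(ys)
    return sum(a * b for a, b in zip(zx, zy)) / (len(xs) - 1)

r = correlation(women, men)
print(round(r, 3))  # roughly 0.565, a moderate positive association
```

The products in the last column of the table sum to about 2.83; dividing by n − 1 = 5 gives the same value of r.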

12 Linear Regression

13 Guess the correlation coefficient http://istics.net/stat/Correlations/

14 Can we make a Line of Best Fit? Want: 1) The distances to the line to be the same. 2) The smallest distances.

15 Regression Line When a scatterplot shows a linear relationship, we’d like to summarize the overall pattern by drawing a line on the scatterplot. A regression line summarizes the relationship between two variables, but only in a specific setting: when one of the variables helps explain or predict the other. Regression – unlike scatter plots – REQUIRES that we have an explanatory variable and a response variable.

16 Regression Line This is a line that describes how a response variable (y) changes as an explanatory variable (x) changes. It’s used to predict the value of (y) for a given value of (x). The regression line is a model for the data.

17 Let’s try some! http://illuminations.nctm.org/ActivityDetail.aspx?ID=146

18 Regression Line When given the response variable (y) and the explanatory variable (x), the regression line relating y to x has equation of the following form:
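The equation on this slide did not come through in the transcript. A common textbook way to write it is sketched below; the symbols a and b are an assumption, not necessarily the ones the slide used.

```latex
\hat{y} = a + bx
% \hat{y} : predicted value of the response variable y for a given x
% b       : slope, the predicted change in y when x increases by one unit
% a       : y-intercept, the predicted value of y when x = 0
```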

19 The following data shows the number of miles driven and advertised price for 11 used Honda CR-Vs from the 2002-2006 model years (prices found at www.carmax.com). The scatterplot below shows a strong, negative linear association between number of miles and advertised cost. The correlation is -0.874. The line on the plot is the regression line for predicting advertised price based on number of miles.
Thousand Miles Driven  Cost (dollars)
22                     17998
29                     16450
35                     14998
39                     13998
45                     14599
49                     14988
55                     13599
56                     14599
69                     11998
70                     14450
86                     10998
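As a rough check, the regression line for these 11 cars can be recomputed from the table. This is a sketch using the usual least-squares formulas; the helper name `least_squares` is hypothetical, and it is not necessarily how the slide's line was produced.

```python
from statistics import mean

# Data from the slide: thousands of miles driven and advertised price in dollars.
miles = [22, 29, 35, 39, 45, 49, 55, 56, 69, 70, 86]
price = [17998, 16450, 14998, 13998, 14599, 14988,
         13599, 14599, 11998, 14450, 10998]

def least_squares(xs, ys):
    """Return (slope, intercept) of the least-squares line of ys on xs."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar  # forces the line through (x_bar, y_bar)
    return slope, intercept

slope, intercept = least_squares(miles, price)
print(round(slope, 2), round(intercept))
```

The result should agree with the slope of about −86.18 dollars per thousand miles and the intercept of about $18,773 quoted in the slides.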

20 Use the regression line to answer the following. Slope: The predicted price of the car decreases by $86.18 for every additional thousand miles driven. y-intercept: $18,773 is the predicted cost of a used 2002 to 2006 Honda CR-V with 0 miles.

21 Predict the price for a Honda with 50,000 miles. (Use 50 in equation!)
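A small sketch of this prediction, assuming the rounded slope (−86.18) and intercept (18773) quoted in the slides. The function name `predicted_price` is made up for illustration.

```python
def predicted_price(thousand_miles):
    """Predicted advertised price in dollars; x is in THOUSANDS of miles."""
    return 18773 - 86.18 * thousand_miles

# 50,000 miles is entered as 50, because the data were recorded in thousands.
print(round(predicted_price(50), 2))  # roughly 14464 dollars
```

Entering 50000 instead of 50 would give a hugely negative price, which is why the slide insists on using 50.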

22 Extrapolation This refers to using a regression line for prediction far outside the interval of values of the explanatory variable x used to obtain the line. Such predictions are often not accurate. Should we predict the asking price for a used 2002-2006 Honda CR-V with 250,000 miles? No! We only have data for cars with between 22,000 and 86,000 miles. We don’t know if the linear pattern will continue beyond these values. In fact, if we did predict the asking price for a car with 250 thousand miles, it would be −$2772!
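One way to guard against extrapolation is to carry the observed x-range along with the model and flag any prediction outside it. The sketch below assumes the rounded coefficients from the slides; the names `predict_price`, `X_MIN`, and `X_MAX` are hypothetical.

```python
# Rounded coefficients quoted on the slides.
INTERCEPT = 18773.0
SLOPE = -86.18
# Observed range of the explanatory variable, in thousands of miles.
X_MIN, X_MAX = 22, 86

def predict_price(thousand_miles):
    """Return (predicted price, True if the prediction is an extrapolation)."""
    prediction = INTERCEPT + SLOPE * thousand_miles
    is_extrapolation = not (X_MIN <= thousand_miles <= X_MAX)
    return prediction, is_extrapolation

price, flagged = predict_price(250)
print(round(price), flagged)  # a negative price, flagged as extrapolation
```

The flag only tests whether x lies inside the observed range. It cannot tell whether the linear pattern really holds there.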

23 Slope: Y-int: Predict weight after 16 wk Predict weight at 2 years:

24 Residual A residual is the difference between an observed value of the response variable and the value predicted by the regression line.
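As an illustration, here is the residual for the first car in the Honda CR-V data (22 thousand miles, advertised at $17,998), again assuming the rounded line from the slides. The function name `residual` is made up.

```python
def residual(observed_y, predicted_y):
    """Residual = observed value minus predicted value."""
    return observed_y - predicted_y

# Predicted price for a car with 22 thousand miles, using the rounded line.
predicted = 18773 - 86.18 * 22
print(round(residual(17998, predicted), 2))  # positive: the point lies above the line
```

A positive residual means the observed value is above the line; a negative one means it is below.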

25 Example The equation of the least-squares regression line for the sprint time and long-jump distance data is. Find and interpret the residual for the student who had a sprint time of 8.09 seconds. This student jumped 69.97 inches farther than we expected based on his sprint time.

26 Regression Let’s see how a regression line is calculated.

27 Fat vs Calories in Burgers
Fat (g)  Calories
19       410
31       580
34       590
35       570
39       640
39       680
43       660

28 Let’s standardize the variables
Fat  Cal  z - x's  z - y's
19   410  -1.959   -2
31   580  -0.42    -0.1
34   590  -0.036    0
35   570   0.09    -0.2
39   640   0.6      0.56
39   680   0.6      1
43   660   1.12     0.78
The line must contain the point (x̄, ȳ), which for the standardized variables is the origin.
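The standardized columns can be reproduced with a few lines. This is a sketch assuming the sample standard deviation; the helper name `standardize` is hypothetical.

```python
from statistics import mean, stdev

# Burger data from the slide.
fat = [19, 31, 34, 35, 39, 39, 43]
calories = [410, 580, 590, 570, 640, 680, 660]

def standardize(values):
    """Convert values to z-scores: (value - mean) / sample standard deviation."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]

z_fat = standardize(fat)
z_cal = standardize(calories)

for zf, zc in zip(z_fat, z_cal):
    print(f"{zf:7.3f} {zc:7.3f}")

# Both sets of z-scores have mean 0, so the point of means becomes the origin.
print(abs(mean(z_fat)) < 1e-9, abs(mean(z_cal)) < 1e-9)
```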

29 Let’s clarify a little. (Just watch & listen)

30 Line of Best Fit: Least-Squares Regression Line It’s the line for which the sum of the squared residuals is smallest. We want to find the mean squared residual. Focus on the vertical deviations from the line.

31 Let’s find it. (just watch & soak it in) St. Dev of z scores is 1 so variance is 1 also. This is r! Note: MSR is “Mean Squared Residual”

32 Continue… Since this is a parabola, it reaches its minimum at its vertex. Hence the slope of the best-fit line for z-scores is the correlation coefficient, r.
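The algebra on these two slides did not survive in the transcript. One standard way to fill it in is sketched below. The symbol m for the candidate slope is an assumption, and the n − 1 divisor is chosen to match the sample standard deviation.

```latex
% Candidate line through the origin in standardized coordinates:
%   \hat{z}_y = m z_x
\begin{aligned}
\mathrm{MSR}(m)
  &= \frac{\sum (z_y - m z_x)^2}{n-1} \\
  &= \frac{\sum z_y^2}{n-1} \;-\; 2m\,\frac{\sum z_x z_y}{n-1} \;+\; m^2\,\frac{\sum z_x^2}{n-1} \\
  &= 1 - 2mr + m^2
\end{aligned}
% The first and last fractions are the variances of the z-scores, which equal 1;
% the middle fraction is the correlation coefficient r.
% A parabola a m^2 + b m + c has its vertex at m = -b / (2a):
\[
m = -\frac{-2r}{2 \cdot 1} = r
\]
```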

33 Slope: rise over run. A slope of r for z-scores means that for every increase of 1 standard deviation in x there is an increase of r standard deviations in the predicted y. “Over 1 and up r.” Translate back to x & y values: “over one standard deviation in x, up r standard deviations in y.” Slope of the regression line is: b = r · s_y / s_x

34 Why is correlation “r”? Because it was calculated from the regression of y on x after standardizing the variables, just like we have just done. Thus r was used to stand for (standardized) regression.

35 Let’s Write the Equation
Fat (g)  Calories
19       410
31       580
34       590
35       570
39       640
39       680
43       660
Slope: 11.056 calories per gram of fat.
Explain the slope: Predicted calories increase by 11.056 for every additional gram of fat.
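The slope can be checked by going through the correlation coefficient, as the previous slides suggest: r times the ratio of the standard deviations. This is a sketch; the function name `correlation` is hypothetical.

```python
from statistics import mean, stdev

# Burger data from the slide.
fat = [19, 31, 34, 35, 39, 39, 43]
calories = [410, 580, 590, 570, 640, 680, 660]

def correlation(xs, ys):
    """r = average product of z-scores, using the n - 1 divisor."""
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = stdev(xs), stdev(ys)
    total = sum(((x - x_bar) / s_x) * ((y - y_bar) / s_y)
                for x, y in zip(xs, ys))
    return total / (len(xs) - 1)

r = correlation(fat, calories)
# "Over one standard deviation in x, up r standard deviations in y."
slope = r * stdev(calories) / stdev(fat)
print(round(r, 3), round(slope, 3))  # slope comes out near 11.056
```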

36 Now for the final part: the equation! y-intercept: Remember, it has to pass through the point (x̄, ȳ). Solve for the y-intercept. Find the value of the y-intercept.

37 Put the parts together to form the equation of the regression line. Now it can be used to predict. How many calories do I expect to find in a hamburger that has 25 grams of fat?
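Putting the parts together for the burger data might look like the sketch below. It uses the usual least-squares formulas and the fact that the line passes through the point of means; the names `least_squares` and `predict` are made up for illustration.

```python
from statistics import mean

# Burger data from the earlier slide.
fat = [19, 31, 34, 35, 39, 39, 43]
calories = [410, 580, 590, 570, 640, 680, 660]

def least_squares(xs, ys):
    """Return (slope, intercept) of the least-squares line of ys on xs."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    # The line passes through (x_bar, y_bar), so solve y_bar = a + b * x_bar for a.
    intercept = y_bar - slope * x_bar
    return slope, intercept

slope, intercept = least_squares(fat, calories)

def predict(fat_grams):
    """Predicted calories for a burger with the given grams of fat."""
    return intercept + slope * fat_grams

print(round(slope, 3), round(intercept, 2), round(predict(25), 1))
```

With these data the intercept comes out near 211 calories, and a 25-gram burger is predicted to have roughly 487 calories. Since 25 grams lies inside the observed range of 19 to 43, this is not an extrapolation.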

38 Try another problem
Mean call-to-shock time  Survival Rate
2                        90
6                        45
7                        30
9                        5
12                       2

39 Interpret the slope: The survival rate will decrease by 9.2956 for every additional minute of call-to-shock time. Interpret the y-intercept: The survival rate is 101.3285 when there is NO call-to-shock time. Predict the survival rate for a 10 min. call-to-shock time. Predict the survival rate for a 20 min. call-to-shock time.
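The two predictions can be sketched as follows, refitting the line from the five data points on the previous slide. The helper names are hypothetical.

```python
from statistics import mean

# Data from the slide: mean call-to-shock time (minutes) and survival rate.
minutes = [2, 6, 7, 9, 12]
survival = [90, 45, 30, 5, 2]

def least_squares(xs, ys):
    """Return (slope, intercept) of the least-squares line of ys on xs."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar

slope, intercept = least_squares(minutes, survival)

def predict(t):
    """Predicted survival rate for a call-to-shock time of t minutes."""
    return intercept + slope * t

print(round(slope, 4), round(intercept, 4))
print(round(predict(10), 2))  # inside the observed range of 2 to 12 minutes
print(round(predict(20), 2))  # outside the range: an extrapolation
```

The 20-minute prediction comes out negative, which is impossible for a survival rate. It is another reminder not to trust the line beyond the 2 to 12 minute range of the data.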

40 Try another problem
SAT Math  SAT Verbal
600       650
720       800
540       600
450       500
620

41 Interpret the slope: Verbal score will increase by 1.05 pts for every additional point in math. Interpret the y-intercept: Verbal score with no math score. Extrapolated! Predict the verbal score for a math score of 400. Predict the verbal score for a math score of 500.

42 That’s…all…..Folks! Homework: p. 191 (27-32, 35, 37,39,41, 47)

