Warm up: Use a calculator to find r, r², a, and b. Chapter 8: LSRL - Least Squares Regression Line.

1 Warm up: Use a calculator to find r, r², a, and b.

2 Chapter 8: LSRL - Least Squares Regression Line

3 AP Statistics Objectives, Ch 8: Find the least squares regression line and interpret its slope, y-intercept, and the coefficients of correlation and determination.

4 Bivariate data: the x-variable is the independent (explanatory) variable; the y-variable is the dependent (response) variable. Use x to predict y.

5 Fat Versus Protein: An Example. The following is a scatterplot of total fat versus protein for 30 items on the Burger King menu.

6 The Linear Model. The correlation in this example is 0.83. It says "There seems to be a linear association between these two variables," but it doesn't tell what that association is. We can say more about the linear relationship between two quantitative variables with a model. A model simplifies reality to help us understand underlying patterns and relationships.

7 The Linear Model (cont.). The linear model is just an equation of a straight line through the data. The points in the scatterplot don't all line up, but a straight line can summarize the general pattern with only a couple of parameters. The linear model can help us understand how the values are associated.

8 b is the slope: the amount by which y increases when x increases by 1 unit. a is the y-intercept: the height of the line when x = 0 (in some situations the y-intercept has no meaning). ŷ (y-hat) means the predicted y; be sure to put the hat on the y.
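
As a quick illustration (not part of the original slides), here is the prediction equation ŷ = a + bx written as a small Python function; the values a = 4 and b = 0.5 are the same line used in the worked example on slide 18.

    a = 4.0    # y-intercept: height of the line when x = 0
    b = 0.5    # slope: change in the predicted y for a 1-unit increase in x

    def y_hat(x):
        """Return the predicted y (y-hat) for a given x."""
        return a + b * x

    print(y_hat(3))   # 5.5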

9 Least Squares Regression Line (LSRL): the line that gives the best fit to the data set; the line that minimizes the sum of the squares of the deviations from the line.
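
A minimal Python sketch (an illustration, not from the slides) of how that minimizing line can be computed directly, using the standard closed-form formulas b = Σ(x - x̄)(y - ȳ) / Σ(x - x̄)² and a = ȳ - b·x̄:

    def lsrl(xs, ys):
        """Return (a, b) for the least squares regression line y-hat = a + b*x."""
        n = len(xs)
        xbar = sum(xs) / n
        ybar = sum(ys) / n
        sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
        sxx = sum((x - xbar) ** 2 for x in xs)
        b = sxy / sxx          # slope
        a = ybar - b * xbar    # y-intercept
        return a, b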

10 Scattergram: a plot of all (Xi, Yi) pairs; suggests how well the model will fit.

11 Thinking Challenge: How would you draw a line through the points? How do you determine which line "fits best"?


18 For the points (0,0), (3,10), and (6,2) and the line ŷ = 0.5x + 4: ŷ = 0.5(0) + 4 = 4, so the deviation is 0 - 4 = -4; ŷ = 0.5(3) + 4 = 5.5, so 10 - 5.5 = 4.5; ŷ = 0.5(6) + 4 = 7, so 2 - 7 = -5. Sum of the squares = (-4)² + (4.5)² + (-5)² = 61.25.
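
A quick check of that arithmetic in Python (an illustration, not part of the slides):

    points = [(0, 0), (3, 10), (6, 2)]
    deviations = [y - (0.5 * x + 4) for x, y in points]   # observed y minus the line's height
    print(deviations)                        # [-4.0, 4.5, -5.0]
    print(sum(d ** 2 for d in deviations))   # 61.25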

19 Residuals: The model won't be perfect, regardless of the line we draw. Some points will be above the line and some will be below. The estimate made from a model is the predicted value (denoted as ŷ).

20 Residuals (cont.): The difference between the observed value and its associated predicted value is called the residual. To find the residuals, we always subtract the predicted value from the observed one: residual = y - ŷ.

21 For the same points (0,0), (3,10), and (6,2), use a calculator to find the line of best fit, then find y - ŷ for each point: the deviations are -3, 6, and -3, and the sum of the squares = 54. What is the sum of the deviations from the line? Will it always be zero? The line that minimizes the sum of the squares of the deviations from the line is the LSRL.
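
A similar Python check (again an illustration, not from the slides) for the least squares line for these three points, ŷ = 3 + x/3, showing that its deviations sum to zero and that its sum of squares, 54, beats the 61.25 found above:

    points = [(0, 0), (3, 10), (6, 2)]
    a, b = 3.0, 1.0 / 3.0                     # LSRL: y-hat = 3 + x/3
    deviations = [y - (a + b * x) for x, y in points]
    print(deviations)                         # approximately [-3, 6, -3]
    print(sum(deviations))                    # approximately 0: deviations from the LSRL sum to zero
    print(sum(d ** 2 for d in deviations))    # approximately 54, smaller than 61.25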

22 Types of regression models: positive linear relationship, negative linear relationship, relationship not linear, no relationship.

23 Interpretations. Slope: for each unit increase in x, there is an approximate increase/decrease of b in y.

24 The ages (in months) and heights (in inches) of seven children are given.
x (months): 16, 24, 42, 60, 75, 102, 120
y (inches): 24, 30, 35, 40, 48, 56, 60
Find the LSRL. Interpret the slope and correlation coefficient in the context of the problem.

25 Interpretations. Slope: for each unit increase in x, there is an approximate increase/decrease of b in y.

26 Correlation coefficient: There is a _________________ association between the ________. Slope: For an increase in _______________, there is an approximate ______ of _______________________________.

27 Correlation coefficient: There is a strong, positive, linear association between the age and height of children. Slope: For an increase in age of one month, there is an approximate increase of 0.34 inches in the heights of children.
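
To verify the slope and correlation quoted above, here is a short Python computation using the closed-form formulas (an illustration; a calculator's LinReg gives the same values up to rounding):

    ages    = [16, 24, 42, 60, 75, 102, 120]   # months
    heights = [24, 30, 35, 40, 48, 56, 60]     # inches
    n = len(ages)
    xbar, ybar = sum(ages) / n, sum(heights) / n
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(ages, heights))
    sxx = sum((x - xbar) ** 2 for x in ages)
    syy = sum((y - ybar) ** 2 for y in heights)
    b = sxy / sxx                    # slope: about 0.34 inches per month
    a = ybar - b * xbar              # intercept: about 20.4 inches
    r = sxy / (sxx * syy) ** 0.5     # correlation: about 0.99
    print(round(a, 1), round(b, 2), round(r, 2))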

28 The ages (in months) and heights (in inches) of seven children are given.
x (months): 16, 24, 42, 60, 75, 102, 120
y (inches): 24, 30, 35, 40, 48, 56, 60
Predict the height of a child who is 4.5 years old. Predict the height of someone who is 20 years old.

29 Extrapolation: The LSRL should not be used to predict y for values of x outside the data set. It is unknown whether the pattern observed in the scatterplot continues outside this range.
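
A quick check of both predictions using the rounded model ŷ = 20.4 + 0.34x from above (a Python illustration, not part of the slides); note how unreasonable the extrapolated value is:

    a, b = 20.4, 0.34     # rounded LSRL coefficients; x is age in months
    print(a + b * 54)     # 4.5 years = 54 months: about 38.8 inches, inside the observed ages
    print(a + b * 240)    # 20 years = 240 months: about 102 inches, an extrapolation not to be trusted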

30 The ages (in months) and heights (in inches) of seven children are given.
x (months): 16, 24, 42, 60, 75, 102, 120
y (inches): 24, 30, 35, 40, 48, 56, 60
Calculate x̄ and ȳ. Plot the point (x̄, ȳ) on the LSRL. Will this point always be on the LSRL?
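
Because a = ȳ - b·x̄, plugging x̄ into the line gives a + b·x̄ = ȳ, so the point (x̄, ȳ) always lies on the LSRL. A quick numerical confirmation in Python (an illustration, not from the slides):

    ages    = [16, 24, 42, 60, 75, 102, 120]
    heights = [24, 30, 35, 40, 48, 56, 60]
    xbar = sum(ages) / len(ages)
    ybar = sum(heights) / len(heights)
    b = sum((x - xbar) * (y - ybar) for x, y in zip(ages, heights)) / \
        sum((x - xbar) ** 2 for x in ages)
    a = ybar - b * xbar
    print(abs((a + b * xbar) - ybar) < 1e-9)   # True: (xbar, ybar) is on the LSRL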

31 The correlation coefficient and the LSRL are both non-resistant measures.

32 Formulas – on chart

33 The following statistics are found for the variables posted speed limit and the average number of accidents. Find the LSRL & predict the number of accidents for a posted speed limit of 50 mph.
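
The summary statistics on this slide were shown in an image and are not reproduced in this transcript, so the numbers below are made-up placeholders; the Python sketch only illustrates the recipe b = r·(sy/sx), a = ȳ - b·x̄, then plugging in x = 50:

    # Placeholder summary statistics (NOT the values from the slide's image).
    xbar, sx = 40.0, 10.0   # hypothetical mean and SD of posted speed limit (mph)
    ybar, sy = 18.0, 4.0    # hypothetical mean and SD of average number of accidents
    r = 0.9                 # hypothetical correlation
    b = r * sy / sx         # slope
    a = ybar - b * xbar     # intercept
    print(round(a + b * 50, 1))   # predicted accidents at a 50 mph limit (21.6 with these placeholders)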

34 Chapter 8 exercise: a table of summary statistics, including r, for four data sets a) through d).


42 R² must also be interpreted when describing a regression model: "With the linear regression model, ____% of the variability in _______ (response variable) is accounted for by variation in ________ (explanatory variable)." The remaining variation is due to the residuals.
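
As a sketch (Python, not from the slides), R² can be computed as 1 - SSE/SST, i.e. one minus the fraction of the variability in y left in the residuals; for a simple LSRL this equals r².

    def r_squared(xs, ys):
        """R^2 = 1 - SSE/SST for the least squares line fit to xs, ys."""
        n = len(xs)
        xbar, ybar = sum(xs) / n, sum(ys) / n
        b = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys)) / \
            sum((x - xbar) ** 2 for x in xs)
        a = ybar - b * xbar
        sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))   # residual (unexplained) variation
        sst = sum((y - ybar) ** 2 for y in ys)                      # total variation in y
        return 1 - sse / sst

    ages    = [16, 24, 42, 60, 75, 102, 120]
    heights = [24, 30, 35, 40, 48, 56, 60]
    print(round(r_squared(ages, heights), 2))   # about 0.99 for the age/height data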

43 Examples of approximate R² values. R² = 1: a perfect linear relationship between x and y; 100% of the variation in y is explained by variation in x.

44 Examples of approximate R² values. 0 < R² < 1: a weaker linear relationship between x and y; some but not all of the variation in y is explained by variation in x.

45 Examples of approximate R² values. R² = 0: no linear relationship between x and y; the value of y does not depend on x (none of the variation in y is explained by variation in x).

46 Did you say 2? Wrong. Try again. So what?

47 Important note: the correlation is not given directly in this software output; you need to look in two places for it. Taking the square root of the "R squared" value (the coefficient of determination) is not enough. You must also look at the sign of the slope: a positive slope gives a positive r-value, and a negative slope gives a negative r-value.

48 So here you should note that the slope is positive; the correlation will be positive too. Since R² is 0.482, r will be +0.694.
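
A small Python illustration (not part of the slides) of recovering r from output that reports only R²: take the square root and attach the sign of the slope, here using the 7.86 slope quoted later for the fare/distance regression.

    import math

    r_sq = 0.482
    slope = 7.86                                # positive slope from the regression output
    r = math.copysign(math.sqrt(r_sq), slope)   # square root of R^2, with the slope's sign
    print(round(r, 3))                          # 0.694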

49 Coefficient of determination = (0.694)² = 0.4816.

50 With the linear regression model, 48.2% of the variability in airline fares is accounted for by the variation in distance of the flight.

51 There is an increase of 7.86 cents for every additional mile. There is an increase of $7.86 for every additional 100 miles.

52 The model predicts a flight of zero miles will cost $177.23. The airline may have built in an initial cost to pay for some of its expenses.


56 8. Using those estimates, draw the line on the scatterplot.

