DO NOW: Read Pages 222 – 224. Stop before “Goals of Re-expression”. Answer the following questions:

Presentation transcript:

1 DO NOW Read Pages 222 – 224. Stop before “Goals of Re-expression”. Answer the following questions: What is a purpose of re-expressing data? What can we check to see that the re-expression makes the linear model appropriate?

3 Period: Previous Average, Current Average, Total Change, Failures, Thanksgiving EC Points, # of Students w/ HW EC
3rd: 78%, 83%, 5 pts, 114511
4th: 74%, 77%, 3 pts, 3995
5th: 74%, 81%, 7 pts, 119712

4 HW CHECK - #7 7. Movie Dramas. a) The units for the slopes of these lines are millions of dollars per minute of running time. b) The slopes of the regression lines are the same. For dramas and for movies from other genres, costs increase at the same rate as running time increases. c) The regression line for dramas has a lower y-intercept. Regardless of running time, dramas cost about 20 million dollars less than movies of other genres with the same running time.

5 HW CHECK - #8 8. Movie Ratings. a) The slopes of the regression lines are approximately the same. The costs increase at about the same rate for all ratings as the movies get longer. b) Although the costs per minute are about the same, it costs about 20 million dollars less to make an R-rated movie than a movie of the other rating type with the same running time. c) Omitting King Kong would make the slope for the PG-13 movies steeper. We would conclude that the cost per minute of PG-13 movies was greater than the cost per minute of movies with other ratings.

6 HW CHECK - #12 D 1) The point has high leverage and a small residual. 2) The point is not influential. It has the potential to be influential, because its position far from the mean of the explanatory variable gives it high leverage. However, the point is not exerting much influence, because it reinforces the association. 3) If the point were removed, the correlation would become weaker. The point heavily reinforces the association. Removing it would weaken the association. 4) The slope would remain roughly the same, since the point is not influential.

7 HW CHECK - #16 Suppose that researchers find a moderately strong positive correlation between the amount of time that a child spends playing computer games and the aggressiveness they display at school. 16. What’s the effect? 1) Playing computer games may make kids more violent. 2) Violent kids may like to play computer games. 3) Playing computer games and violence may both be caused by a lurking variable such as the child’s home life or a genetic predisposition to aggressiveness.

8 Slide 10 - 8 STRAIGHT TO THE POINT (CONT.) The relationship between fuel efficiency (in miles per gallon) and weight (in pounds) for late model cars looks fairly linear at first:

9 Slide 10 - 9 STRAIGHT TO THE POINT (CONT.) A look at the residuals plot shows a problem:

10 CONVERTING UNITS 3 feet = ____ inches; 30 inches = ____ feet; 50 yards = ____ feet. Does changing the units change the meaning of the quantity?

11 Slide 10 - 11 STRAIGHT TO THE POINT (CONT.) We can re-express fuel efficiency as gallons per hundred miles (a reciprocal) and eliminate the bend in the original scatterplot:
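The reciprocal re-expression on this slide can be sketched in a few lines of Python. The function name `gallons_per_100_miles` and the sample value of 25 mpg are illustrative assumptions, not values from the slides.

```python
def gallons_per_100_miles(mpg):
    """Re-express fuel efficiency (miles per gallon) as fuel consumption."""
    if mpg <= 0:
        raise ValueError("mpg must be positive")
    return 100.0 / mpg

# Hypothetical car: 25 mpg describes the same car as 4 gallons per 100 miles.
consumption = gallons_per_100_miles(25)
print(consumption)  # 4.0

# Applying the same reciprocal again recovers the original value.
print(100.0 / consumption)  # 25.0
```

As with converting feet to inches, the quantity being described does not change; only the scale on which it is expressed does.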

12 Slide 10 - 12 STRAIGHT TO THE POINT (CONT.) A look at the residuals plot for the new model seems more reasonable:
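To see why the residuals improve, here is a minimal sketch using made-up cars (not the data behind the slides). Fuel consumption is constructed to be exactly linear in weight, so the line fitted to miles per gallon leaves a curved residual pattern while the line fitted to gallons per 100 miles leaves essentially none. The helper names `fit_line` and `residuals` are hypothetical.

```python
def fit_line(xs, ys):
    """Least-squares intercept and slope for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def residuals(xs, ys):
    """Observed minus predicted values from the least-squares line."""
    intercept, slope = fit_line(xs, ys)
    return [y - (intercept + slope * x) for x, y in zip(xs, ys)]

# Hypothetical cars: weight in thousands of pounds (an assumed unit), with
# consumption built to be exactly 1 + 1 * weight gallons per 100 miles.
weights = [2.0, 2.5, 3.0, 3.5, 4.0]
gal_per_100mi = [1.0 + w for w in weights]
mpg = [100.0 / g for g in gal_per_100mi]

mpg_resid = residuals(weights, mpg)            # positive at the ends, negative in the middle
gal_resid = residuals(weights, gal_per_100mi)  # all essentially zero

print([round(r, 2) for r in mpg_resid])
print([round(abs(r), 6) for r in gal_resid])
```

Real data will not straighten this perfectly; the point is only the change in the shape of the residuals.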

13 Slide 10 - 13 STRAIGHT TO THE POINT We cannot use a linear model unless the relationship between the two variables is linear. Often re-expression can save the day, straightening bent relationships so that we can fit and use a simple linear model. Two simple ways to re-express data are with logarithms and reciprocals. Re-expressions can be seen in everyday life; everybody does it.
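The reciprocal was used for the fuel-efficiency example; the logarithm works in a similar spirit. The sketch below uses hypothetical data that doubles at every step and shows that taking logs makes the relationship exactly linear. Correlation is used here only as a quick numeric summary; the real check is still the scatterplot and the residuals plot. The helper `correlation` is written out by hand so the example needs nothing beyond the standard library.

```python
import math

def correlation(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: y doubles each time x goes up by 1 (a bent pattern).
x = [0, 1, 2, 3, 4, 5, 6]
y = [3 * 2 ** k for k in x]

# Re-express y with a base-10 logarithm.
log_y = [math.log10(v) for v in y]

r_raw = correlation(x, y)      # noticeably below 1 because of the bend
r_log = correlation(x, log_y)  # 1 up to rounding: log(y) is linear in x

print(round(r_raw, 3), round(r_log, 3))
```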

14 Slide 10 - 14 WHAT CAN GO WRONG? Beware of multiple modes. Re-expression cannot pull separate modes together. Watch out for scatterplots that turn around. Re-expression can straighten many bent relationships, but not those that go up then down, or down then up.
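The reason a turnaround survives re-expression is that logarithms and square roots preserve the order of positive values (and a reciprocal reverses it), so the pattern still changes direction once. A small sketch with hypothetical values; `step_directions` is an illustrative helper name.

```python
import math

def step_directions(values):
    """+1 where the next value is larger, -1 where it is smaller."""
    return [1 if b > a else -1 for a, b in zip(values, values[1:])]

# Hypothetical response that goes up, then down.
y = [1.0, 4.0, 6.0, 4.0, 1.0]

log_y = [math.log10(v) for v in y]
sqrt_y = [math.sqrt(v) for v in y]
recip_y = [1.0 / v for v in y]

print(step_directions(y))        # [1, 1, -1, -1]
print(step_directions(log_y))    # [1, 1, -1, -1]
print(step_directions(sqrt_y))   # [1, 1, -1, -1]
print(step_directions(recip_y))  # [-1, -1, 1, 1]  (flipped, but still turns around)
```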

15 Slide 10 - 15 GOALS OF RE-EXPRESSION Goal 1: Make the distribution of a variable (as seen in its histogram, for example) more symmetric.
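A rough numeric illustration of Goal 1, using hypothetical right-skewed values. In a symmetric distribution the mean and median agree, so the gap between them is used here as a simple stand-in for what the histogram would show.

```python
import math
import statistics

# Hypothetical right-skewed values: each is 10 times the previous one.
values = [1, 10, 100, 1000, 10000]
logs = [math.log10(v) for v in values]

# Gap between mean and median, before and after re-expression.
gap_raw = statistics.mean(values) - statistics.median(values)
gap_log = statistics.mean(logs) - statistics.median(logs)

print(gap_raw)             # large: the long right tail pulls the mean up
print(round(gap_log, 6))   # essentially zero: the logs are evenly spaced
```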

16 Slide 10 - 16 GOALS OF RE-EXPRESSION (CONT.) Goal 2: Make the spread of several groups (as seen in side-by-side boxplots) more alike, even if their centers differ.
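Goal 2 can be sketched the same way. In the hypothetical groups below, spread grows with the center, which is the situation where a logarithm tends to help. The range is used as a simple measure of spread; side-by-side boxplots would show the same thing through the box heights.

```python
import math

# Hypothetical groups whose spread grows with their center.
groups = {"A": [1, 2, 4], "B": [10, 20, 40], "C": [100, 200, 400]}

def value_range(values):
    """Largest value minus smallest value."""
    return max(values) - min(values)

raw_ranges = {name: value_range(vals) for name, vals in groups.items()}
log_ranges = {
    name: value_range([math.log10(v) for v in vals])
    for name, vals in groups.items()
}

print(raw_ranges)  # {'A': 3, 'B': 30, 'C': 300}
print({name: round(r, 3) for name, r in log_ranges.items()})  # all about 0.602
```

The centers of the logged groups still differ; only their spreads have become alike.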

17 Slide 10 - 17 GOALS OF RE-EXPRESSION (CONT.) Goal 3: Make the form of a scatterplot more nearly linear.

18 Slide 10 - 18 GOALS OF RE-EXPRESSION (CONT.) Goal 4: Make the scatter in a scatterplot spread out evenly rather than thickening at one end. This can be seen in the two scatterplots we just saw with Goal 3:

19 TEXTBOOK #1 Suppose you have fit a linear model to some data and now take a look at the residuals. For each of the following possible residuals plots, tell whether you would try a re-expression and, if so, why.

20 TEXTBOOK #3 Here is the residual plot for a linear model describing the trend in the number of passengers departing from the Oakland (CA) airport each month since the start of 1997. 1. Can you account for the pattern shown here? 2. Would a re-expression help us deal with this pattern? Explain.

21 TEXTBOOK #7 One of the important factors determining a car’s Fuel Efficiency is its Weight. Let’s examine this relationship again, for 11 cars. Describe the association between these variables shown in the scatterplot.

22 TEXTBOOK #7 One of the important factors determining a car’s Fuel Efficiency is its Weight. Let’s examine this relationship again, for 11 cars. The linear model for this data is Fuel Efficiency = 47.96 - 7.65 Weight. What does the slope of the line say about this relationship?
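One way to read the slope is to compare predictions for two cars whose weights differ by one unit. The sketch below uses the coefficients from this slide; the slide does not state the units of Weight, so the weights 2.0 and 3.0 are arbitrary illustrative values and the function name is hypothetical.

```python
# Coefficients taken from the slide's linear model.
INTERCEPT = 47.96
SLOPE = -7.65

def predicted_fuel_efficiency(weight):
    """Predicted fuel efficiency for a given weight (units as in the model)."""
    return INTERCEPT + SLOPE * weight

# Two hypothetical cars that differ by one unit of weight.
lighter = predicted_fuel_efficiency(2.0)
heavier = predicted_fuel_efficiency(3.0)
change = heavier - lighter

print(round(change, 2))  # -7.65: the predicted drop per extra unit of weight
```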

23 TEXTBOOK #7 One of the important factors determining a car’s Fuel Efficiency is its Weight. Let’s examine this relationship again, for 11 cars. The linear model for this data is Fuel Efficiency = 47.96 - 7.65 Weight. Let’s examine the residuals plot for this linear regression. Is this model appropriate?

24 TEXTBOOK #7 Let’s re-express Fuel Efficiency as Fuel Consumption (gal/100 mi) for the 11 cars. The revised linear regression is Fuel Consumption = 1.77 + 0.62 Weight. Explain why this model appears to be better than the original linear model. Interpret the slope of this line.
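A minimal sketch of how the revised model could be used, reading its response as gallons per 100 miles (the re-expressed variable named on this slide). A prediction on that scale can be converted back to miles per gallon with the same reciprocal. The weight of 3.0 is an arbitrary illustrative value, since the slide does not give the units of Weight, and the function names are hypothetical.

```python
def predicted_gal_per_100mi(weight):
    """Prediction from the slide's revised model, 1.77 + 0.62 * Weight."""
    return 1.77 + 0.62 * weight

def to_mpg(gal_per_100mi):
    """Undo the re-expression: gallons per 100 miles back to miles per gallon."""
    return 100.0 / gal_per_100mi

# Hypothetical car with Weight = 3.0 (units as in the model).
consumption = predicted_gal_per_100mi(3.0)
mpg = to_mpg(consumption)

print(round(consumption, 2))  # 3.63
print(round(mpg, 2))

# Slope interpretation: one more unit of weight adds 0.62 gal per 100 miles.
extra = predicted_gal_per_100mi(4.0) - predicted_gal_per_100mi(3.0)
print(round(extra, 2))  # 0.62
```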

