


1 Reexpressing Data

2 Re-express data – is that cheating? Not at all. Sometimes data that may look linear at first is actually not linear at all.
Straight Enough Condition: Does the scatterplot look straight?
Randomization Condition: Are the individuals a representative sample from the population?
Does the Plot Thicken? Condition: Does a scatterplot of the residuals against the predicted values have ANY pattern? It shouldn’t. Clusters indicate that the relationship probably isn’t linear. Boring is good.

3 Huh? The picture you see is a scatterplot of fuel efficiency (mpg) vs. weight (lbs) for late-model cars. It looks OK, and r² is 0.816 (sometimes written as 81.6%), so maybe it is OK. The second graph, extrapolating the fitted line, suggests that a 6000 lb car would get 0 mpg. The Hummer H2 weighs 6400 lbs. Now, it doesn’t get good gas mileage, but it is better than 0. The third graph is the residual plot for fuel efficiency. See how it has a “bend” in it? This indicates that the original relationship is not well described by a linear model.
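The bend in the residuals and the silly extrapolation can be reproduced with a small sketch. The numbers below are hypothetical, not the cars from the slide: they assume fuel consumption rises linearly with weight, which makes mpg itself curved. The helper `fit_line` is an illustrative name, not part of any library.

```python
# Hypothetical data (NOT the cars from the slide): fuel consumption is
# assumed to rise linearly with weight, so mpg itself follows a curve.
weights = [2000, 2500, 3000, 3500, 4000, 4500]      # lbs
mpg = [100 / (0.5 + 0.001 * w) for w in weights]    # miles per gallon


def fit_line(xs, ys):
    """Least-squares slope and intercept for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


slope, intercept = fit_line(weights, mpg)
predicted = [intercept + slope * w for w in weights]
residuals = [y - p for y, p in zip(mpg, predicted)]

# Weight at which the straight line predicts 0 mpg (an extrapolation).
zero_mpg_weight = -intercept / slope

for w, r in zip(weights, residuals):
    print(f"{w} lb  residual {r:+.2f}")
print(f"line reaches 0 mpg near {zero_mpg_weight:.0f} lb")
```

With this made-up data the residuals are positive at both ends and negative in the middle, which is the same kind of bend the slide describes.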

4 So what do we do? Re-expressing fuel efficiency as fuel consumption (gal/100 miles, which is 100 divided by mpg) and plotting it against weight may solve the problem. Where else do we re-express? If I ran 9 miles per hour on a mile run, is that fast? What if I did that on a 100 m dash?
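Both re-expressions on this slide are reciprocals with a unit conversion. A minimal sketch of the arithmetic follows; the function names are made up for illustration, and the only outside fact used is that one mile is 1609.344 meters.

```python
METERS_PER_MILE = 1609.344


def gallons_per_100_miles(mpg):
    """Re-express fuel efficiency (mpg) as fuel consumption."""
    return 100 / mpg


def minutes_per_mile(mph):
    """Pace for a run at a steady speed given in miles per hour."""
    return 60 / mph


def seconds_for_distance(mph, meters):
    """Time to cover a distance in meters at a steady speed in mph."""
    meters_per_second = mph * METERS_PER_MILE / 3600
    return meters / meters_per_second


print(gallons_per_100_miles(25))
print(round(minutes_per_mile(9), 2))
print(round(seconds_for_distance(9, 100), 1))
```

At 9 mph the mile takes about 6 minutes 40 seconds, while the 100 m dash takes roughly 25 seconds. The same speed reads very differently once it is re-expressed as a time for each event.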

5 Why re-express?
1. Make the distribution of a variable (histogram) more symmetric.
2. Make the spread of several groups (seen in side-by-side boxplots) more alike.
3. Make the form of a scatterplot more nearly linear.
4. Make the scatter in a scatterplot spread out more evenly.

6 Ladder of Powers
This is a list of ways to re-express data.
Power 2: Square the y. Try if unimodal and skewed to the left.
Power 1: Do nothing.
Power ½: Square root of y. For counted data, try this.
Power 0: Log of y. Measurements that cannot be negative and values that grow by percentage increases may benefit. If the data has zeroes, add a small constant to all values before finding the log.
Power -½: Reciprocal root of y. Not common, but sometimes useful.
Power -1: Reciprocal of y. This is like our running example.
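One way to try the rungs of the ladder is a small helper like the sketch below. The name `reexpress` is hypothetical, and the choice of base-10 logs is an assumption (any base gives the same shape). Some textbooks negate the reciprocal rungs to preserve the order of the data; this sketch follows the slide and does not.

```python
import math

# The six rungs listed on the slide, from the top of the ladder down.
LADDER = (2, 1, 0.5, 0, -0.5, -1)


def reexpress(values, power):
    """Re-express each value using one rung of the ladder of powers.

    Power 0 stands for the logarithm (base 10 here, by assumption).
    """
    if power not in LADDER:
        raise ValueError(f"power must be one of {LADDER}")
    if power == 0.5 and any(v < 0 for v in values):
        raise ValueError("square roots need non-negative data")
    if power <= 0 and any(v <= 0 for v in values):
        raise ValueError("logs and reciprocals need strictly positive data")
    if power == 0:
        return [math.log10(v) for v in values]
    return [v ** power for v in values]


counts = [1, 4, 9, 16]
print(reexpress(counts, 0.5))    # square roots
print(reexpress(counts, 0))      # logs
print(reexpress(counts, -1))     # reciprocals
```

The positivity checks mirror the slide's warning about zeroes: add a small constant to the data first if a log or reciprocal is needed.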

7 Ladder of Powers Part 2
If nothing feels good, you can try one of these three ideas (as long as none of the data is negative or zero).
Exponential model: x, log y.
Logarithmic model: log x, y. A wide range of x values, or a rapid descent that then levels off, might benefit from this. Remember the discussion of CEOs’ salaries from earlier in the year?
Power model: log x, log y. Always an option.
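The three ideas can be compared by re-expressing, fitting a straight line, and looking at r² for each. The sketch below does this on made-up data that follows y = 3x² exactly, so the power model should win. The helpers `fit_line`, `r_squared`, and `compare_models` are illustrative names, and base-10 logs are an assumption. A high r² is only a first check; the residual plot still has to be boring.

```python
import math


def fit_line(xs, ys):
    """Least-squares slope and intercept for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def r_squared(xs, ys):
    """Squared correlation between xs and ys."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy ** 2 / (sxx * syy)


def compare_models(xs, ys):
    """r-squared of a straight-line fit for each re-expression."""
    log_x = [math.log10(x) for x in xs]
    log_y = [math.log10(y) for y in ys]
    candidates = {
        "exponential": (xs, log_y),    # x, log y
        "logarithmic": (log_x, ys),    # log x, y
        "power": (log_x, log_y),       # log x, log y
    }
    return {name: r_squared(u, v) for name, (u, v) in candidates.items()}


# Hypothetical data that follows y = 3 * x^2 exactly.
xs = [1, 2, 3, 4, 5, 6]
ys = [3 * x ** 2 for x in xs]

scores = compare_models(xs, ys)
best = max(scores, key=scores.get)

# For the power model, the slope is the exponent and 10**intercept is
# the multiplier in y = multiplier * x**exponent.
slope, intercept = fit_line([math.log10(x) for x in xs],
                            [math.log10(y) for y in ys])
print(best, round(slope, 3), round(10 ** intercept, 3))
```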

8 This is not a cure-all!! Some data just won’t benefit. Don’t worry. Yes, some data fits curved models, but the calculations are pretty intense.

9 Example

10 Homework Take a worksheet with you and do the circled problems.

