Bivariate Data Pick up a formula sheet, Notes for Bivariate Data – Day 1, and a calculator.


1 Bivariate Data Pick up a formula sheet, Notes for Bivariate Data – Day 1, and a calculator.

2 Archaeopteryx is an extinct beast having feathers like a bird but teeth and a long bony tail like a reptile. Only six fossil specimens are known. Because these specimens differ greatly in size, some scientists think they are different species rather than individuals from the same species. If the specimens belong to the same species and differ in size because some are younger than others, there should be a positive linear relationship between the lengths of a pair of bones from all individuals. An outlier from this relationship would suggest a different species. Here are data on the lengths in centimeters of the femur (a leg bone) and the humerus (a bone in the upper arm) for the five specimens that preserve both bones.

femur (cm):    38  56  59  64  74
humerus (cm):  41  63  70  72  84

Load data into list 1 and list 2 and make a scatterplot.

3

4 This is not enough. What do we need?

5 [Scatterplot: femur length in cm on the x-axis, humerus length in cm on the y-axis, with the traced points (38, 41) and (64, 72) labeled.] A “cheater” way to put scale on a scatterplot is to trace two points and label each axis with those two values.

6 [Scatterplot: femur length in cm on the x-axis, humerus length in cm on the y-axis, with the traced points (38, 41) and (64, 72) labeled.] Which is the explanatory variable? Which is the response variable? But does it really matter here? No. But often it does.

7 Find the correlation coefficient and explain what it means.

8 correlation coefficient Did you get it?

9 If you did not get the correlation coefficient, you must turn your diagnostics on. Push 2nd then 0. Scroll down to DiagnosticOn. Push “enter” twice and little calculator guy will say “Done”.

10 Find the correlation coefficient and explain what it means. correlation coefficient

11 r = 0.994 The correlation coefficient is ALWAYS between -1 and 1. It does not change when the units of measurement or the scale of either variable are changed. Let’s check out the formula sheet.
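For anyone checking the calculator by hand, here is a minimal sketch in Python. It assumes the formula sheet gives r as the sum of the products of z-scores divided by n - 1; the function name `correlation` and the inch conversion are illustrative choices, not part of the original slides. It also checks the claim that r does not change when the units change.

```python
from statistics import mean, stdev

# Data from slide 2 (lengths in cm).
femur = [38, 56, 59, 64, 74]
humerus = [41, 63, 70, 72, 84]

def correlation(xs, ys):
    """Sample correlation: sum of z-score products divided by n - 1."""
    n = len(xs)
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = stdev(xs), stdev(ys)  # sample standard deviations
    total = sum(((x - x_bar) / s_x) * ((y - y_bar) / s_y)
                for x, y in zip(xs, ys))
    return total / (n - 1)

r_cm = correlation(femur, humerus)

# Convert femur lengths to inches (illustrative unit change); r should not move.
femur_in = [x / 2.54 for x in femur]
r_mixed_units = correlation(femur_in, humerus)

print(round(r_cm, 3))
print(round(r_mixed_units, 3))
```

Both printed values should agree with the r reported on the slide, because z-scores have no units.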

12 r = 0.994 What does it mean? The correlation coefficient describes the strength of the linear relationship. The closer it is to 1 or -1 the more the points line up. These points line up pretty well with a negative slope. The correlation coefficient would be close to -1. (graph on the bottom of your notes)

13 r = 0.994 What does it mean? The correlation coefficient describes the strength of the linear relationship. The closer it is to 1 or -1 the more the points line up. These points line up pretty well with a positive slope. The correlation coefficient would be close to 0.8 or 0.9.

14 r = 0.994 What does it mean? The correlation coefficient describes the strength of the linear relationship. The closer it is to 1 or -1 the more the points line up. These points don’t line up at all. The correlation coefficient would be nearly 0.

15 r = 0.994 What does it mean? The correlation coefficient describes the strength of the linear relationship. The closer it is to 1 or -1 the more the points line up. These points line up sort of well with a negative slope. The correlation coefficient might be -0.6 or -0.7.

16 r = 0.994 What does it mean? The correlation coefficient describes the strength of the linear relationship. The closer it is to 1 or -1 the more the points line up. These points don’t line up at all. The correlation coefficient would be fairly close to 0.

17 r = 0.994 Here’s what you write: This suggests a strong, positive, linear relationship between femur length and humerus length.

18 So what’s the rest of this stuff?

19 [Calculator output with labels pointing to the slope, the y-intercept, and the coefficient of determination.] equation: ŷ = 1.197x – 3.660 The hat on ŷ is hugely important! It means the predicted y.

20 equation: ŷ = 1.197x – 3.660, where x = femur length and y = humerus length. slope = 1.197: For every 1 cm increase in femur length, the predicted humerus length increases by 1.197 cm, on average. y-intercept = -3.660: When the femur length is 0 cm, the predicted humerus length is about -3.660 cm. Of course, this is ridiculous and is an example of extrapolation.
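The slope and y-intercept can also be reproduced without the calculator. The sketch below uses the usual least-squares formulas (slope = Sxy / Sxx, intercept = ȳ - slope · x̄); the variable names are illustrative and not taken from the slides.

```python
from statistics import mean

# Data from slide 2 (lengths in cm).
femur = [38, 56, 59, 64, 74]      # x, explanatory
humerus = [41, 63, 70, 72, 84]    # y, response

x_bar, y_bar = mean(femur), mean(humerus)

# Sums of squared deviations and cross-products.
sxx = sum((x - x_bar) ** 2 for x in femur)
syy = sum((y - y_bar) ** 2 for y in humerus)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(femur, humerus))

slope = sxy / sxx
intercept = y_bar - slope * x_bar
r_squared = sxy ** 2 / (sxx * syy)   # coefficient of determination

print(f"y-hat = {slope:.3f}x + ({intercept:.3f})")
print(f"r^2 = {r_squared:.3f}")
```

Rounded to three decimals, the slope and intercept should match the equation on the slide.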

21 Residuals: Since our line misses many of the points, a residual is a measure of the “miss.” residual = y – ŷ (actual – predicted). A residual is the signed vertical distance from the point to the line: positive when the point is above the line, negative when it is below.

22 What is the residual for the point (56, 63)? ŷ = 1.197x – 3.660 = 1.197(56) – 3.660 = 63.372, so residual = y – ŷ = 63 – 63.372 = -0.372. Find the residual for the point (74, 84). Answer: -0.918
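The same arithmetic can be run for all five specimens at once. This is a small sketch that uses the rounded equation from the slides, ŷ = 1.197x – 3.660; the helper name `predict` is made up for illustration.

```python
# Data from slide 2 (lengths in cm).
femur = [38, 56, 59, 64, 74]
humerus = [41, 63, 70, 72, 84]

def predict(x):
    """Predicted humerus length from the rounded regression equation."""
    return 1.197 * x - 3.660

# residual = actual - predicted, one per specimen
residuals = [y - predict(x) for x, y in zip(femur, humerus)]

for x, y, e in zip(femur, humerus, residuals):
    print(f"({x}, {y}): residual = {e:.3f}")
```

Because the equation is rounded to three decimals, these residuals will not sum to exactly zero the way residuals from the unrounded least-squares line do.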

23 A residual plot is a graph of all the residuals. To get RESID, push 2nd, then STAT, then choose RESID. This only works if the little guy knows the equation of the line (that is, after you have run the linear regression).

24 Residual Plot [Residuals on the y-axis against femur length in cm on the x-axis, with the traced points (38, -0.8) and (59, 3) labeled.] This is a horrible residual plot. We’d like the points to be equally scattered above and below the line.

25 That’s it for dinosaurs today.

26 Limitations of Correlation and Regression: Correlation measures the strength of linear relationships only. One influential point or incorrectly entered data point can greatly change the regression line. Correlation is not robust (that is, it is not resistant to outliers). Correlations based on averages are usually too high when applied to individuals. Extrapolation can yield silly results: predictions for y should be made only for x-values within the range of the data. Correlation does not imply a cause-and-effect relationship.
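To see how much one incorrectly entered data point can matter, here is a hedged sketch. The typo is hypothetical and not from the slides: it supposes the last humerus length was keyed in as 48 instead of 84. The helper `fit` is an illustrative name.

```python
from statistics import mean

def fit(xs, ys):
    """Return (slope, intercept, r) for the least-squares line of ys on xs."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar, sxy / (sxx * syy) ** 0.5

femur = [38, 56, 59, 64, 74]
humerus = [41, 63, 70, 72, 84]        # data from slide 2
humerus_typo = [41, 63, 70, 72, 48]   # HYPOTHETICAL: 84 mistyped as 48

slope_ok, intercept_ok, r_ok = fit(femur, humerus)
slope_bad, intercept_bad, r_bad = fit(femur, humerus_typo)

print(f"correct data: slope = {slope_ok:.3f}, r = {r_ok:.3f}")
print(f"with typo:    slope = {slope_bad:.3f}, r = {r_bad:.3f}")
```

With only five points, a single bad entry at the edge of the x-range pulls both the slope and r down sharply, which is the sense in which neither is resistant to outliers.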

27 Examples of a perfect linear fit: A job pays $10 per hour. Consider the relationship between hours worked and pay. [Graph: hours worked on the x-axis, pay on the y-axis.]
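A quick check of the $10-per-hour example, as a sketch: the hours listed are made-up values chosen only for illustration, and pay is computed as exactly 10 times hours, so every point lies on the line and r should come out as 1.

```python
from statistics import mean

hours = [5, 10, 20, 30, 40]        # HYPOTHETICAL hours worked
pay = [10 * h for h in hours]      # $10 per hour, from the slide

x_bar, y_bar = mean(hours), mean(pay)
sxx = sum((x - x_bar) ** 2 for x in hours)
syy = sum((y - y_bar) ** 2 for y in pay)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(hours, pay))

slope = sxy / sxx                  # dollars per hour
r = sxy / (sxx * syy) ** 0.5       # correlation coefficient

print(f"slope = {slope:.2f}, r = {r:.3f}")
```

Any other set of hours would give the same slope and the same r, since pay is an exact linear function of hours.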

28 Examples of a perfect linear fit: The association between hours worked and time spent pursuing hedonistic pleasures. [Graph: hours worked on the x-axis, hedonistic pleasure time on the y-axis.] Here we could switch x and y. Sometimes we are simply curious about an association.

29 Time for more data?

