
1 Part II Exploring Relationships Between Variables

2 Chapter 7 Scatterplots, Association, and Correlation
A scatterplot is the easiest way to observe the relationship between two quantitative variables. Example: a timeplot, a scatterplot in which the x-variable is time.

3 Looking at Scatterplots
- Direction: positive or negative
- Amount of scatter

4 Scatterplot Details
Points are plotted on a Cartesian plane as ordered pairs (x, y).
Roles for variables:
- x-axis: explanatory (independent) variable
- y-axis: response (dependent) variable
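
A minimal sketch of drawing such a scatterplot in Python with matplotlib; the data and variable names are illustrative, not from the text (they anticipate the fat-vs-protein example used later):

    import matplotlib.pyplot as plt

    # Illustrative data: explanatory variable on the x-axis, response on the y-axis.
    protein = [5, 12, 17, 20, 25, 31]   # explanatory (independent) variable
    fat = [9, 18, 22, 27, 31, 38]       # response (dependent) variable

    plt.scatter(protein, fat)           # each point is an ordered pair (x, y)
    plt.xlabel("Protein (g)")
    plt.ylabel("Total Fat (g)")
    plt.show()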

5 Correlation
Correlation measures how strong an association is. For a positive association, the correlation falls between 0 and 1.

6 Standardizing x and y
z_x = (x − x̄)/s_x and z_y = (y − ȳ)/s_y.
For points in a positive association, the products z_x · z_y are mostly > 0.

7 Correlation Coefficient
Measures the strength of the linear association between two quantitative variables:
r = Σ(z_x · z_y) / (n − 1)
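
A minimal sketch of this computation with NumPy, using made-up data (the numbers are illustrative, not from the text):

    import numpy as np

    # Illustrative data: two quantitative variables.
    x = np.array([2.0, 4.0, 6.0, 8.0, 10.0])
    y = np.array([1.5, 3.1, 4.2, 6.8, 8.9])

    # Standardize each variable; ddof=1 gives the sample standard deviation,
    # matching the n - 1 divisor in the formula above.
    zx = (x - x.mean()) / x.std(ddof=1)
    zy = (y - y.mean()) / y.std(ddof=1)

    # r is the sum of the products of the z-scores, divided by n - 1.
    r = (zx * zy).sum() / (len(x) - 1)

    print(r)                        # hand-rolled correlation
    print(np.corrcoef(x, y)[0, 1])  # NumPy's built-in, for comparison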

8 Correlation Conditions
- Quantitative variables condition
- Straight enough condition
- No outliers condition

9 Correlation Properties
- The sign gives the direction of the association.
- r is always between −1 and 1: r = ±1 is a maximal (perfect) linear association; r = 0 is no linear association.
- Correlation treats x and y symmetrically.
- Correlation has no units.
- Correlation is not affected by changes in center or scale.
- Correlation measures only the strength of the linear association between two variables.
- Variables can have a strong association but still a small correlation if the association is not linear.
- Correlation is sensitive to outliers.
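
Two of these properties are easy to verify numerically; a small sketch (same illustrative data as above):

    import numpy as np

    x = np.array([2.0, 4.0, 6.0, 8.0, 10.0])
    y = np.array([1.5, 3.1, 4.2, 6.8, 8.9])
    r = np.corrcoef(x, y)[0, 1]

    # Symmetry: swapping x and y leaves r unchanged.
    print(np.isclose(np.corrcoef(y, x)[0, 1], r))            # True

    # No effect from changes in center or scale (e.g., a change of units).
    print(np.isclose(np.corrcoef(10 * x + 3, y)[0, 1], r))   # True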

10 What Can Go Wrong?
- Check the conditions.
- Don't confuse correlation with causation.
- Watch for lurking variables: a hidden variable that stands behind a relationship and determines it by simultaneously affecting both variables.

11 Exercises: pages 186–187

12 Chapter 8 Linear Regression
The linear model is just the equation of a straight line through the data.

13 Equation of the Line
y = mx + b; the line will pass through the point (x̄, ȳ).
After standardizing both variables, the equation becomes z_y = m · z_x (the intercept b drops out) and the line will pass through (0, 0).

14 Predicted Values
Each predicted y tends to be closer to its mean (in standard deviations) than its corresponding x was. This property of the linear model is called regression to the mean, and the line is called the regression line. For example, if you are 1 SD taller than average, the predicted value for your weight is only r × 1 SD heavier than average.
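
In symbols, the standardized prediction is ẑ_y = r · z_x. Since |r| ≤ 1, the predicted z_y is never further from its mean than z_x was; that is the regression-to-the-mean effect.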

15 Back to the Original Units
Slope: b_1 = r · (s_y / s_x)
y-intercept: b_0 = ȳ − b_1 · x̄
Predicted value: ŷ = b_0 + b_1 · x

16 Residuals
A residual is the difference between an observed value and its associated predicted value: Residual = Data − Model, i.e. e = y − ŷ. Residuals help us see whether the model makes sense.

17 Burger King Menu: Total Fat (g) vs. Protein (g)
Example, page 194: r = 0.83, x̄ = 17.2 g, s_x = 14 g, ȳ = 23.5 g, s_y = 16.4 g.
How much fat should we expect for an item with 31 g of protein?

18 Regression Formula (Prediction)
Slope? y-intercept? Predicted fat? (Worked out below.)
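
Working the numbers from the previous slide through the formulas of slide 15:

    Slope:       b_1 = r · (s_y / s_x) = 0.83 × (16.4 / 14) ≈ 0.97 g of fat per g of protein
    y-intercept: b_0 = ȳ − b_1 · x̄ = 23.5 − 0.97 × 17.2 ≈ 6.8 g
    Prediction:  ŷ = 6.8 + 0.97 × 31 ≈ 36.9 g of fat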

19 Correlation and Residuals
If r = 1.0, all the residuals would be zero and have no variation. If r = 0, the model would simply predict the mean of y; the residuals from that prediction would just be the observed fat values minus their mean, and so would have the same variation as the data.

20 The Squared Correlation
r² gives the fraction of the data's variance accounted for by the model, and 1 − r² is the fraction of the original variance left in the residuals. r² = 0 means none of the variance in the data is in the model; all of it is still in the residuals. How big should r² be?
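
For the Burger King example, r = 0.83, so r² ≈ 0.69: the linear model on protein accounts for about 69% of the variance in fat, leaving about 31% in the residuals.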

21 Least Squares Regression Line
The least squares regression line minimizes the sum of the squared residuals.
Assumptions and conditions:
- Linearity (straight enough condition)
- Plot the residuals to check
Regression output on the computer: page 214.
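
A minimal sketch of a least squares fit and its residuals in Python; the data are illustrative, and np.polyfit is just one convenient way to get the fit (degree 1 = a straight line):

    import numpy as np

    # Illustrative data (not the actual Burger King menu items).
    protein = np.array([5.0, 12.0, 17.0, 20.0, 25.0, 31.0])
    fat = np.array([9.0, 18.0, 22.0, 27.0, 31.0, 38.0])

    # Degree-1 least squares fit: minimizes the sum of squared residuals.
    b1, b0 = np.polyfit(protein, fat, 1)

    predicted = b0 + b1 * protein
    residuals = fat - predicted  # Residual = Data - Model

    print(f"fat_hat = {b0:.2f} + {b1:.2f} * protein")
    print("residuals:", np.round(residuals, 2))
    # Plotting residuals vs. predicted values should show no pattern
    # if the straight enough condition holds.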

