Unit 3 – Linear regression


1 Unit 3 – Linear regression

2 Linear relationships
When dealing with linear relationships we need two variables: an explanatory variable (x) and a response variable (y).

3 Strength + Direction + Form
Describing a linear relationship: there is a [strength] [direction] [form] relationship between x and y.
Strength: weak, moderate, or strong
Direction: positive or negative
Form: linear, curved, or scattered

4 Finding the line of best fit
Equation: $\hat{y} = B_0 + B_1 x$
Here $\hat{y}$ is the predicted Y, $B_0$ is the y-intercept, $B_1$ is the slope, and $x$ is the X value.
Slope: $B_1 = r \cdot \frac{S_y}{S_x}$
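As a rough illustration of the slope formula on this slide, the sketch below computes r, the two standard deviations, and the resulting line for a small made-up dataset. The data values and the names `slope`, `intercept`, and `predict` are assumptions for the example, and the intercept step uses the standard fact (not stated on the slide) that the least-squares line passes through the point of means.

```python
import math
from statistics import mean, stdev

# Hypothetical example data (not from the slides).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

x_bar, y_bar = mean(x), mean(y)

# Pearson correlation r, computed from deviations about the means.
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
s_xx = sum((xi - x_bar) ** 2 for xi in x)
s_yy = sum((yi - y_bar) ** 2 for yi in y)
r = s_xy / math.sqrt(s_xx * s_yy)

# Slope from the slide: B1 = r * (S_y / S_x), using sample standard deviations.
slope = r * stdev(y) / stdev(x)

# Assumption (standard result, not on the slide): the line passes through
# (x_bar, y_bar), so B0 = y_bar - B1 * x_bar.
intercept = y_bar - slope * x_bar


def predict(x_value):
    """Predicted Y for a given X value."""
    return intercept + slope * x_value


print(f"r = {r:.3f}, slope = {slope:.3f}, intercept = {intercept:.3f}")
print(f"predicted y at x = 3: {predict(3):.3f}")
```

Calling `predict` with an x value far outside 1 to 5 would be extrapolation for this data, so it is best kept to the observed range.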

5 CAUTION: When using a line of best fit
Extrapolation: using your equation to predict outside the range of the data you used to come up with the equation.
Lurking variables: an underlying variable that makes the relationship look different than it is in reality.
Outliers: your equation is strongly influenced by outliers.
Causation: do not claim that X causes Y. They just have a linear relationship.

6 Sentences Interpreting Linear Relationship Output
Slope: As ___X___ increases by 1, ___Y___ is expected to increase/decrease by ___slope___.
Y-intercept: When ___X___ is 0, ___Y___ is expected to be ___Y-intercept___.
R-squared: ___$R^2$ (as a percent)___ of the variability in ___Y___ can be explained by ___X___.
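To make the three sentences concrete, here is a minimal sketch that fits a line to a small made-up dataset, computes $R^2$ as one minus the ratio of unexplained to total variation, and fills in the blanks. The data and variable names are assumptions for illustration only.

```python
# Hypothetical example data (not from the slides).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Least-squares slope and intercept.
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
s_xx = sum((xi - x_bar) ** 2 for xi in x)
slope = s_xy / s_xx
intercept = y_bar - slope * x_bar

# R-squared = 1 - (unexplained variation / total variation).
predicted = [intercept + slope * xi for xi in x]
sse = sum((yi - yhat) ** 2 for yi, yhat in zip(y, predicted))
sst = sum((yi - y_bar) ** 2 for yi in y)
r_squared = 1 - sse / sst

# Fill in the interpretation sentences.
direction = "increase" if slope > 0 else "decrease"
print(f"As X increases by 1, Y is expected to {direction} by {abs(slope):.2f}.")
print(f"When X is 0, Y is expected to be {intercept:.2f}.")
print(f"{r_squared:.0%} of the variability in Y can be explained by X.")
```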

7 What is a residual?
It’s the difference between the actual value and the predicted value.
Residual = $y - \hat{y}$
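A short sketch of the residual calculation, assuming a made-up dataset and taking its least-squares line (slope 0.6, intercept 2.2) as given. It computes actual minus predicted for each point and counts how many residuals fall above and below 0.

```python
# Hypothetical example data (not from the slides).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

# Least-squares line for this data, taken as given for the example.
slope, intercept = 0.6, 2.2

# Residual = actual y minus predicted y.
predicted = [intercept + slope * xi for xi in x]
residuals = [yi - yhat for yi, yhat in zip(y, predicted)]

above = sum(1 for e in residuals if e > 0)
below = sum(1 for e in residuals if e < 0)

for xi, e in zip(x, residuals):
    print(f"x = {xi}: residual = {e:+.2f}")
print(f"{above} residuals above 0, {below} below 0")
print(f"sum of residuals = {sum(residuals):.6f}")
```

For a least-squares line the residuals sum to zero up to rounding, so the sum says nothing about fit quality; a plot of the residuals against x is the more useful check.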

8 What do you want residuals to look like?
To be confident your line is a good fit, ask:
Are the residuals scattered, or do they have a pattern?
Are they all above 0, all below 0, or half and half?
Do they have equal variance around 0, or not?

9 So... how do you know if you can do a linear regression?
The relationship between X and Y is linear (check with a scatterplot of x and y, or the residual plot).
There are no obvious lurking variables (you can more or less assume this for this course).
The data come from a simple random sample (was the data taken from an SRS?).
The residuals have constant variance (check the plot of residuals).
The residuals vary according to a normal distribution (check the normal quantile plot of residuals).

10 Confidence Intervals for the slope
$b_1 \pm t_{\alpha/2;\,n-2} \cdot se(b_1)$
$b_1$ is the estimated value of the slope.
The standard error $se(b_1)$ can be obtained from the JMP output.
Degrees of freedom = n - 2
I am ___% confident that the true slope between ___X___ and ___Y___ is between _____ and _____.
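The interval can also be worked by hand as a check on software output. The sketch below assumes a small made-up dataset, computes $se(b_1)$ with the usual formula $s / \sqrt{\sum (x - \bar{x})^2}$ where $s = \sqrt{SSE / (n-2)}$ (the slide reads this value from JMP instead), and uses a critical value of 3.182 taken from a t-table for 95% confidence with 3 degrees of freedom.

```python
import math

# Hypothetical example data (not from the slides).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Least-squares estimates b1 (slope) and b0 (intercept).
s_xx = sum((xi - x_bar) ** 2 for xi in x)
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
b1 = s_xy / s_xx
b0 = y_bar - b1 * x_bar

# Standard error of the slope: s / sqrt(Sxx), with s^2 = SSE / (n - 2).
sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
s = math.sqrt(sse / (n - 2))
se_b1 = s / math.sqrt(s_xx)

# Assumption: critical value read from a t-table (95% confidence, df = n - 2 = 3).
t_crit = 3.182

lower = b1 - t_crit * se_b1
upper = b1 + t_crit * se_b1
print(f"b1 = {b1:.3f}, se(b1) = {se_b1:.4f}, df = {n - 2}")
print(f"95% CI for the slope: ({lower:.3f}, {upper:.3f})")
```

With only five points the interval is wide and includes 0, which fits the idea that a small sample gives weak evidence about the true slope.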

11 Hypothesis testing for the slope
$H_0: B_1 = 0$
$H_A: B_1 \neq 0$
Obtain a t-stat.
Obtain a p-value.
Decide whether to reject $H_0$.
If you reject $H_0$, then there is sufficient evidence to suggest there is a linear relationship between ___explanatory___ and ___response___.
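The steps above can be sketched without statistical software. The example assumes a slope estimate and standard error as they might be read off regression output (the numbers are hypothetical), uses the standard test statistic $t = b_1 / se(b_1)$ with n - 2 degrees of freedom, and approximates the two-sided p-value by numerically integrating the t density with Simpson's rule. The function names and the 0.05 significance level are choices made for this example.

```python
import math


def t_pdf(t, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + t * t / df) ** (-(df + 1) / 2)


def two_sided_p_value(t_stat, df, steps=1000):
    """Two-sided p-value: 1 - 2 * integral of the density from 0 to |t|."""
    upper = abs(t_stat)
    h = upper / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        weight = 4 if i % 2 else 2
        total += weight * t_pdf(i * h, df)
    area = total * h / 3
    return max(0.0, 1 - 2 * area)


# Hypothetical values, as if read from regression output for n = 5 points.
b1 = 0.6
se_b1 = 0.2828
n = 5
alpha = 0.05  # assumed significance level

df = n - 2
t_stat = b1 / se_b1
p_value = two_sided_p_value(t_stat, df)
reject = p_value < alpha

print(f"t = {t_stat:.3f}, df = {df}, p-value = {p_value:.3f}")
if reject:
    print("Reject H0: sufficient evidence of a linear relationship.")
else:
    print("Fail to reject H0: not enough evidence of a linear relationship.")
```

In practice the t-stat and p-value come straight from the software output; the hand calculation is only a way to see where those numbers come from.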

