1
Lesson #32 Simple Linear Regression

2
Regression is used to model and/or predict a variable, called the dependent variable, Y, based on one or more independent variables, the X's. Here, all independent variables are numeric.
Simple Linear Regression: one independent variable, X

3
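Assume the “true” relationship is $Y = \alpha + \beta X + \varepsilon$ (or $Y = \beta_0 + \beta_1 X + \varepsilon$), where
- $\alpha$ is the intercept
- $\beta$ is the slope
- $\varepsilon$ represents random error, $\varepsilon \sim N(0, \sigma^2)$
$\alpha$ and $\beta$ are the regression coefficients.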

4
[Figure: the line $Y = \alpha + \beta X$, drawn on axes X and Y]

5
Assumptions (LINE)
- Linear relationship
- Independent observations
- Normal errors
- Equal variances

6
Estimation
We want to estimate the regression coefficients $\alpha$ and $\beta$. The estimates are
- $\hat{\alpha}$, the estimated intercept
- $\hat{\beta}$, the estimated slope
The estimated line is then $\hat{Y} = \hat{\alpha} + \hat{\beta} X$

7
We want the line that “best fits” the data.
[Figure: data points plotted on axes X and Y]

8
[Figure: plot on axes X and Y marking an observation $(X_i, Y_i)$ and its residual $e_i$]

9
The predicted value at the $i$th point is: $\hat{Y}_i = \hat{\alpha} + \hat{\beta} X_i$
The residual at the $i$th point is: $e_i = Y_i - \hat{Y}_i$
Ordinary least squares (OLS) chooses the line that minimizes the sum of the squared residuals, $\sum_{i=1}^{n} e_i^2$

10
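For OLS:
$\hat{\beta} = \dfrac{\sum (X_i - \bar{X})(Y_i - \bar{Y})}{\sum (X_i - \bar{X})^2}$
$\hat{\alpha} = \bar{Y} - \hat{\beta}\,\bar{X}$
The estimated line, or prediction equation, is $\hat{Y} = \hat{\alpha} + \hat{\beta} X$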
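A brief sketch of how the OLS estimates can be derived. The hat notation $\hat{\alpha}$, $\hat{\beta}$ and the function name $Q$ are assumptions made here for illustration, since the slide's own symbols did not survive; $\bar{X}$ and $\bar{Y}$ denote the sample means.

```latex
\begin{align*}
Q(a, b) &= \sum_{i=1}^{n} (Y_i - a - b X_i)^2
  && \text{sum of squared residuals for a candidate line} \\
\frac{\partial Q}{\partial a} &= -2 \sum_{i=1}^{n} (Y_i - a - b X_i) = 0
  && \Rightarrow\; \hat{\alpha} = \bar{Y} - \hat{\beta}\,\bar{X} \\
\frac{\partial Q}{\partial b} &= -2 \sum_{i=1}^{n} X_i (Y_i - a - b X_i) = 0
  && \Rightarrow\; \hat{\beta} =
     \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{\sum_{i=1}^{n} (X_i - \bar{X})^2}
\end{align*}
```

The second result follows by substituting $a = \bar{Y} - b\bar{X}$ into the second equation and collecting terms.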

11
Sums of Squares
- SS TOTAL: total variation in dependent variable
- SS ERROR = SS RESIDUAL: unexplained variation after fitting model
- SS REGRESSION = SS TOTAL - SS ERROR: variation “explained” by model

12
ANOVA Table

Source       df      SS         MS         F
Regression   1       SS REG     MS REG     F0
Error        n - 2   SS ERROR   MS ERROR
Total        n - 1   SS TOTAL

13
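$E(MS_{ERROR}) = \sigma^2$
$F_0 = MS_{REG} / MS_{ERROR}$ tests $H_0: \beta = 0$ against $H_1: \beta \neq 0$
Reject $H_0$ if $F_0 > F_{(1,\,n-2),\,1-\alpha}$, where $\alpha$ here denotes the significance level, not the intercept
$R^2 = SS_{REG} / SS_{TOTAL}$
$R^2 = (r)^2$, where $r$ is the sample correlation between X and Y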
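The quantities on this slide can be checked numerically. The following is a minimal sketch in plain Python: the function name `anova_simple_regression` and the small data set are made up for illustration and are not part of the lesson. It computes the sums of squares, the F statistic, and $R^2$, so that $R^2$ can be compared with the squared correlation.

```python
import math


def anova_simple_regression(x, y):
    """Return the ANOVA quantities for a simple linear regression of y on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n

    # Corrected sums of cross-products and squares
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    ss_total = sum((yi - y_bar) ** 2 for yi in y)

    # OLS estimates and the resulting error sum of squares
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    ss_error = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_reg = ss_total - ss_error

    ms_reg = ss_reg / 1            # regression has 1 degree of freedom
    ms_error = ss_error / (n - 2)  # error has n - 2 degrees of freedom
    return {
        "ss_reg": ss_reg,
        "ss_error": ss_error,
        "ss_total": ss_total,
        "f0": ms_reg / ms_error,
        "r_squared": ss_reg / ss_total,
        "r": s_xy / math.sqrt(s_xx * ss_total),  # sample correlation
    }


# Made-up illustrative data (not from the lesson)
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
result = anova_simple_regression(x, y)
print(result["r_squared"], result["r"] ** 2, result["f0"])
```

The first two printed values should agree up to floating-point rounding, which is the $R^2 = (r)^2$ identity.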

14
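X = fat cal.   Y = chol.
28             177
35             194
43             247
24             186
51             232
33             207
18             143
40             206

With $\bar{X} = 34$ and $\bar{Y} = 199$:
$\sum (X_i - \bar{X})(Y_i - \bar{Y}) = (28 - 34)(177 - 199) + \dots + (40 - 34)(206 - 199) = 2180$
$\sum (X_i - \bar{X})^2 = (28 - 34)^2 + \dots + (40 - 34)^2 = 800$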

15
[Figure: plot of Y = Cholesterol against X = % Calories from Fat]

16
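$\hat{\beta} = 2180 / 800 = 2.725$
$\hat{\alpha} = 199 - (2.725)(34) = 106.35$
$\hat{Y} = 106.35 + 2.725\,(\text{fat cal.})$
At fat cal. = 30: $\hat{Y} = 106.35 + 2.725(30) = 188.1$
At fat cal. = 28: $\hat{Y} = 106.35 + 2.725(28) = 182.65$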
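The arithmetic in this example can be reproduced with a short script. This is a sketch in plain Python using the eight data points from the lesson; the variable names and the `predict` helper are chosen here for illustration.

```python
# Data from the lesson: % calories from fat (X) and cholesterol (Y)
fat_cal = [28, 35, 43, 24, 51, 33, 18, 40]
chol = [177, 194, 247, 186, 232, 207, 143, 206]

n = len(fat_cal)
x_bar = sum(fat_cal) / n  # 34.0
y_bar = sum(chol) / n     # 199.0

# Corrected sum of cross-products and corrected sum of squares of X
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(fat_cal, chol))  # 2180.0
s_xx = sum((x - x_bar) ** 2 for x in fat_cal)                         # 800.0

slope = s_xy / s_xx                # about 2.725
intercept = y_bar - slope * x_bar  # about 106.35


def predict(x):
    """Predicted cholesterol at x % calories from fat (hypothetical helper)."""
    return intercept + slope * x


print(round(predict(30), 2), round(predict(28), 2))
```

The two printed predictions correspond to the values worked out by hand for 30 and 28 % calories from fat.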

17
X = fat cal.   Y = chol.   $\hat{Y}$     e
28             177         182.650     -5.650
35             194         201.725     -7.725
43             247         223.525     23.475
24             186         171.750     14.250
51             232         245.325    -13.325
33             207         196.275     10.725
18             143         155.400    -12.400
40             206         215.350     -9.350

SS ERROR = $(-5.650)^2 + \dots + (-9.350)^2 = 1379.5$
SS TOTAL = $(177 - 199)^2 + \dots + (206 - 199)^2 = 7320$
SS REGRESSION = 7320 - 1379.5 = 5940.5
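The residuals and sums of squares in this table can be recomputed as follows. This is a minimal sketch in plain Python that takes the lesson's estimates (intercept 106.35, slope 2.725) as given; the variable names are illustrative.

```python
# Data and OLS estimates from the lesson
fat_cal = [28, 35, 43, 24, 51, 33, 18, 40]
chol = [177, 194, 247, 186, 232, 207, 143, 206]
intercept, slope = 106.35, 2.725

# Predicted value and residual (observed minus predicted) at each point
predicted = [intercept + slope * x for x in fat_cal]
residuals = [y - y_hat for y, y_hat in zip(chol, predicted)]

y_bar = sum(chol) / len(chol)

ss_error = sum(e ** 2 for e in residuals)        # unexplained variation
ss_total = sum((y - y_bar) ** 2 for y in chol)   # total variation in Y
ss_regression = ss_total - ss_error              # variation explained by the line

for x, y, y_hat, e in zip(fat_cal, chol, predicted, residuals):
    print(f"{x:>3} {y:>4} {y_hat:>8.3f} {e:>8.3f}")
print(round(ss_error, 1), round(ss_total, 1), round(ss_regression, 1))
```

A quick sanity check on any OLS fit with an intercept is that the residuals sum to zero, up to rounding.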

18
Source       df   SS       MS       F
Regression   1    5940.5   5940.5   25.84
Error        6    1379.5   229.92
Total        7    7320

Reject $H_0$ if $F_0 > F_{(1,6),\,.95} = 5.99$
Since $F_0 = 25.84 > 5.99$, reject $H_0$.
Conclude there is a positive linear relationship between calories from fat and cholesterol.
$R^2 = 5940.5 / 7320 = .8115$
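The test decision can be retraced from the sums of squares alone. This is a sketch in plain Python; the critical value 5.99 is the one quoted in the lesson, not computed here, and the variable names are illustrative.

```python
# Sums of squares from the lesson's worked example (n = 8 observations)
n = 8
ss_regression = 5940.5
ss_error = 1379.5
ss_total = ss_regression + ss_error  # 7320.0

# Degrees of freedom and mean squares
df_regression = 1
df_error = n - 2
ms_regression = ss_regression / df_regression
ms_error = ss_error / df_error  # about 229.92

# F statistic and decision at the 5% level
f0 = ms_regression / ms_error
f_critical = 5.99  # F(1,6) 95th percentile, as quoted in the lesson
reject_h0 = f0 > f_critical

# Proportion of total variation explained by the regression
r_squared = ss_regression / ss_total

print(round(f0, 2), reject_h0, round(r_squared, 4))
```

Note that the F test itself is two-sided in the slope; the direction of the relationship comes from the sign of the estimated slope.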
