1
Lesson #32 Simple Linear Regression
2
Regression is used to model and/or predict a variable, called the dependent variable, Y, based on one or more independent variables, the X's. Here, all independent variables are numeric.
Simple Linear Regression: one independent variable, X
3
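Assume the “true” relationship is
$Y = \alpha + \beta X + \varepsilon$ (or $Y = \beta_0 + \beta_1 X + \varepsilon$)
- $\alpha$ is the intercept
- $\beta$ is the slope
- $\varepsilon$ represents random error, $\varepsilon \sim N(0, \sigma^2)$
$\alpha$ and $\beta$ are the regression coefficients.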
4
[Figure: the line $Y = \alpha + \beta X$, with X on the horizontal axis and Y on the vertical axis]
5
Assumptions (LINE)
- Linear relationship
- Independent observations
- Normal errors
- Equal variances
6
Estimation
We want to estimate the regression coefficients. The estimates are
- $\hat{\alpha}$, the estimated intercept
- $\hat{\beta}$, the estimated slope
The estimated line is then $\hat{Y} = \hat{\alpha} + \hat{\beta} X$.
7
We want the line that “best fits” the data.
[Figure: data points plotted with X on the horizontal axis and Y on the vertical axis]
8
[Figure: plot of Y against X marking a data point $(X_i, Y_i)$ and its residual $e_i$]
9
The predicted value at the $i$th point is: $\hat{Y}_i = \hat{\alpha} + \hat{\beta} X_i$
The residual at the $i$th point is: $e_i = Y_i - \hat{Y}_i$
Ordinary least squares (OLS) chooses the line that minimizes the sum of the squared residuals, $\sum_i e_i^2$.
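The least-squares criterion can be sketched in a few lines of Python. The toy data and the two candidate lines below are hypothetical and are not taken from the lesson; the sketch only shows how the sum of squared residuals ranks candidate lines, not how OLS finds the minimizing line.

```python
def sum_squared_residuals(intercept, slope, xs, ys):
    """Sum of e_i^2, where e_i = y_i - (intercept + slope * x_i)."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))


# Hypothetical toy data (assumption, not from the lesson).
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 5.0, 8.0]

# Two hypothetical candidate lines; the criterion prefers the smaller sum.
sse_a = sum_squared_residuals(0.0, 2.0, xs, ys)  # Y-hat = 0 + 2 X
sse_b = sum_squared_residuals(1.0, 1.5, xs, ys)  # Y-hat = 1 + 1.5 X

print("candidate A:", sse_a)
print("candidate B:", sse_b)
```

Of these two candidates, the one with the smaller sum fits better in the least-squares sense; OLS is the line for which no other choice of intercept and slope gives a smaller sum.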
10
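For OLS:
$\hat{\beta} = \dfrac{\sum_i (X_i - \bar{X})(Y_i - \bar{Y})}{\sum_i (X_i - \bar{X})^2}$
$\hat{\alpha} = \bar{Y} - \hat{\beta}\,\bar{X}$
The estimated line, or prediction equation, is $\hat{Y} = \hat{\alpha} + \hat{\beta} X$.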
11
Sums of Squares
- $SS_{TOTAL} = \sum_i (Y_i - \bar{Y})^2$: total variation in the dependent variable
- $SS_{ERROR} = SS_{RESIDUAL} = \sum_i e_i^2$: unexplained variation after fitting the model
- $SS_{REGRESSION} = SS_{TOTAL} - SS_{ERROR}$: variation “explained” by the model
12
ANOVA Table
Source       df     SS          MS          F
Regression   1      SS_REG      MS_REG      F_0
Error        n-2    SS_ERROR    MS_ERROR
Total        n-1    SS_TOTAL
13
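$E(MS_{ERROR}) = \sigma^2$
$F_0 = MS_{REG} / MS_{ERROR}$ tests $H_0: \beta = 0$ against $H_1: \beta \neq 0$.
Reject $H_0$ if $F_0 > F_{(1,\,n-2),\,1-\alpha}$, where $\alpha$ here is the significance level rather than the intercept.
$R^2 = SS_{REG} / SS_{TOTAL}$
$R^2 = (r)^2$, where $r$ is the sample correlation between X and Y.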
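One way to see why $R^2$ equals the squared correlation in simple linear regression is sketched below. The shorthand $S_{XX}$, $S_{XY}$, $S_{YY}$ is notation assumed here, not taken from the slides, and the second line uses the OLS results $\hat{\beta} = S_{XY}/S_{XX}$ and $\hat{Y}_i - \bar{Y} = \hat{\beta}(X_i - \bar{X})$.

```latex
\begin{align*}
S_{XX} &= \sum_i (X_i - \bar{X})^2, \quad
S_{XY} = \sum_i (X_i - \bar{X})(Y_i - \bar{Y}), \quad
S_{YY} = \sum_i (Y_i - \bar{Y})^2 = SS_{TOTAL} \\
SS_{REG} &= \sum_i (\hat{Y}_i - \bar{Y})^2
          = \hat{\beta}^2 S_{XX}
          = \frac{S_{XY}^2}{S_{XX}} \\
R^2 &= \frac{SS_{REG}}{SS_{TOTAL}}
     = \frac{S_{XY}^2}{S_{XX}\, S_{YY}}
     = \left( \frac{S_{XY}}{\sqrt{S_{XX}\, S_{YY}}} \right)^2
     = r^2
\end{align*}
```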
14
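X = fat cal.   Y = chol.
28             177
35             194
43             247
24             186
51             232
33             207
18             143
40             206

$\bar{X} = 34$, $\bar{Y} = 199$
$\sum_i (X_i - \bar{X})(Y_i - \bar{Y}) = (28 - 34)(177 - 199) + \dots + (40 - 34)(206 - 199) = 2180$
$\sum_i (X_i - \bar{X})^2 = (28 - 34)^2 + \dots + (40 - 34)^2 = 800$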
15
[Figure: plot of Y = Cholesterol against X = % Calories from Fat]
16
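$\hat{\beta} = 2180 / 800 = 2.725$
$\hat{\alpha} = 199 - (2.725)(34) = 106.35$
$\hat{Y} = 106.35 + 2.725\,(\text{fat cal.})$
At fat cal. = 30: $\hat{Y} = 106.35 + 2.725(30) = 188.1$
At fat cal. = 28: $\hat{Y} = 106.35 + 2.725(28) = 182.65$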
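The arithmetic on this slide can be checked with a short Python sketch. The data are the eight (fat cal., chol.) pairs from the lesson; the variable and function names are illustrative choices, not part of the original.

```python
# Data from the lesson: X = % calories from fat, Y = cholesterol.
fat_cal = [28, 35, 43, 24, 51, 33, 18, 40]
chol = [177, 194, 247, 186, 232, 207, 143, 206]

n = len(fat_cal)
x_bar = sum(fat_cal) / n  # sample mean of X
y_bar = sum(chol) / n     # sample mean of Y

# Sum of cross-products and sum of squares about the means.
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(fat_cal, chol))
s_xx = sum((x - x_bar) ** 2 for x in fat_cal)

slope = s_xy / s_xx                # estimated slope
intercept = y_bar - slope * x_bar  # estimated intercept


def predict(x):
    """Predicted cholesterol at a given % calories from fat."""
    return intercept + slope * x


print("means:", x_bar, y_bar)
print("S_xy, S_xx:", s_xy, s_xx)
print("slope, intercept:", round(slope, 3), round(intercept, 2))
print("predictions at 30 and 28:", round(predict(30), 2), round(predict(28), 2))
```

Up to floating-point rounding, the printed values should match the slope, intercept, and predictions given on the slide.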
17
X = fat cal.   Y = chol.   Ŷ          e
28             177         182.650    -5.650
35             194         201.725    -7.725
43             247         223.525    23.475
24             186         171.750    14.250
51             232         245.325    -13.325
33             207         196.275    10.725
18             143         155.400    -12.400
40             206         215.350    -9.350

$SS_{ERROR} = (-5.650)^2 + \dots + (-9.350)^2 = 1379.5$
$SS_{TOTAL} = (177 - 199)^2 + \dots + (206 - 199)^2 = 7320$
$SS_{REGRESSION} = 7320 - 1379.5 = 5940.5$
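As a check on the table, the fitted values, residuals, and sums of squares can be recomputed with the sketch below. It takes the fitted coefficients 106.35 and 2.725 from the lesson as given; the variable names are illustrative.

```python
# Data and fitted coefficients from the lesson.
fat_cal = [28, 35, 43, 24, 51, 33, 18, 40]
chol = [177, 194, 247, 186, 232, 207, 143, 206]
intercept, slope = 106.35, 2.725

y_bar = sum(chol) / len(chol)

# Fitted value and residual at each observation.
fitted = [intercept + slope * x for x in fat_cal]
residuals = [y - y_hat for y, y_hat in zip(chol, fitted)]

ss_error = sum(e ** 2 for e in residuals)        # unexplained variation
ss_total = sum((y - y_bar) ** 2 for y in chol)   # total variation in Y
ss_regression = ss_total - ss_error              # variation explained by the model

for x, y, y_hat, e in zip(fat_cal, chol, fitted, residuals):
    print(x, y, round(y_hat, 3), round(e, 3))
print("SS_ERROR:", round(ss_error, 1))
print("SS_TOTAL:", round(ss_total, 1))
print("SS_REGRESSION:", round(ss_regression, 1))
```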
18
Source       df    SS        MS        F
Regression   1     5940.5    5940.5    25.84
Error        6     1379.5    229.92
Total        7     7320

Reject $H_0$ if $F_0 > F_{(1,6),\,.95} = 5.99$.
Since $F_0 = 25.84 > 5.99$, reject $H_0$.
Conclude there is a positive linear relationship between calories from fat and cholesterol.
$R^2 = 5940.5 / 7320 = .8115$
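The ANOVA quantities follow mechanically from the sums of squares, as the sketch below illustrates. The function name `anova_summary` and its return layout are hypothetical. The critical value 5.99 is copied from the lesson rather than computed, so no statistics library is assumed.

```python
def anova_summary(ss_regression, ss_error, n, f_critical):
    """ANOVA quantities for simple linear regression (one predictor)."""
    df_regression = 1
    df_error = n - 2
    ms_regression = ss_regression / df_regression
    ms_error = ss_error / df_error
    f0 = ms_regression / ms_error
    ss_total = ss_regression + ss_error
    return {
        "df_regression": df_regression,
        "df_error": df_error,
        "df_total": n - 1,
        "ms_regression": ms_regression,
        "ms_error": ms_error,
        "f0": f0,
        "r_squared": ss_regression / ss_total,
        "reject_h0": f0 > f_critical,
    }


# Sums of squares and n = 8 from the lesson; 5.99 is the tabled F(1,6),.95.
summary = anova_summary(5940.5, 1379.5, n=8, f_critical=5.99)
for name, value in summary.items():
    print(name, value)
```

Rounded to the precision used on the slide, the F statistic and $R^2$ should agree with the table above.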