2018-12-06 Lecture 5 732G21/732G28/732A35

1 Lecture 5 732G21/732G28/732A35 Linköpings universitet

2 Extra sums of squares
The extra sum of squares is the difference between the SSE for a model with a certain set of predictors and the SSE for a model with the same predictors plus one or more additional predictors. Consider the model

Y_i = β0 + β1 X_i1 + β2 X_i2 + ε_i

Then the extra sum of squares from adding X2 to a model that already contains X1 is

SSR(X2 | X1) = SSE(X1) − SSE(X1, X2)

Linköpings universitet
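The definition above can be sketched numerically. This is a minimal NumPy sketch with simulated data; the variable names and all numbers below are made up for illustration and are not the lecture's salary data set.

```python
import numpy as np

# Simulated data (illustrative only, not the lecture's data).
rng = np.random.default_rng(1)
n = 30
x1 = rng.uniform(20, 60, n)                              # "Age"
x2 = rng.uniform(50, 200, n)                             # "Highschool points"
y = 5.0 + 0.4 * x1 + 0.1 * x2 + rng.normal(0, 2.0, n)   # "Salary"

def sse(y, predictors):
    """SSE from an OLS fit of y on an intercept plus the given columns."""
    X = np.column_stack([np.ones(len(y))] + list(predictors))
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return float(resid @ resid)

sse_x1 = sse(y, [x1])                  # SSE(X1): reduced model
sse_x1x2 = sse(y, [x1, x2])            # SSE(X1, X2): full model
ssr_x2_given_x1 = sse_x1 - sse_x1x2    # SSR(X2 | X1): extra sum of squares
```

Because adding a predictor can never increase the SSE of a least-squares fit, the extra sum of squares is always nonnegative.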

3 Salary example
Regression Analysis: Salary (Y) versus Age (X1)

The regression equation is
Salary (Y) = … + … Age (X1)

Predictor  Coef  SE Coef  T  P
Constant   …     …        …  …
Age (X1)   …     …        …  …

S = …   R-Sq = 80.5%   R-Sq(adj) = 77.2%

Analysis of Variance
Source          DF  SS  MS  F  P
Regression      …   …   …   …  …
Residual Error  …   …   …
Total           …   …

Linköpings universitet

4 Salary (Y)  Age (X1)  Highschool points (X2)
17 21 30 32 120 27 40 35 56 90 44 61 160 38 55 36 39 140 25 33 80
Linköpings universitet

5 Regression Analysis: Salary (Y) versus Age (X1), Highschool points (X2)

The regression equation is
Salary (Y) = … + … Age (X1) + … Highschool points (X2)

Predictor               Coef  SE Coef  T  P
Constant                …     …        …  …
Age (X1)                …     …        …  …
Highschool points (X2)  …     …        …  …

S = …   R-Sq = 96.3%   R-Sq(adj) = 94.8%

Analysis of Variance
Source          DF  SS  MS  F  P
Regression      …   …   …   …  …
Residual Error  …   …   …
Total           …   …

Source                  DF  Seq SS
Age (X1)                …   …
Highschool points (X2)  …   …

Linköpings universitet

6 Partial F-test
H0: βq = βq+1 = … = βp−1 = 0
Ha: not all of the β in H0 are 0

F* = [ (SSE(R) − SSE(F)) / (p − q) ] / [ SSE(F) / (n − p) ]

where SSE(R) and SSE(F) are the error sums of squares of the reduced and full models, q and p are their numbers of parameters, and n is the sample size.

Reject H0 if F* > F(1 − α; p − q; n − p)
Linköpings universitet
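The test statistic above can be sketched as follows; the data are simulated (not the lecture's salary data), and the hard-coded critical value is an approximation noted in the comment.

```python
import numpy as np

# Simulated data: three predictors, the third of which has no real effect.
rng = np.random.default_rng(2)
n = 40
X = rng.normal(size=(n, 3))                                  # X1, X2, X3
y = 2.0 + X @ np.array([1.0, 0.5, 0.0]) + rng.normal(0, 1.0, n)

def sse(y, cols):
    """SSE from an OLS fit of y on an intercept plus the given columns."""
    Z = np.column_stack([np.ones(len(y))] + list(cols))
    beta, *_ = np.linalg.lstsq(Z, y, rcond=None)
    r = y - Z @ beta
    return float(r @ r)

p = 4                       # parameters in the full model (intercept + 3 slopes)
q = 2                       # parameters in the reduced model (intercept + X1)
sse_full = sse(y, [X[:, 0], X[:, 1], X[:, 2]])
sse_red = sse(y, [X[:, 0]])

# F* = [(SSE(R) - SSE(F)) / (p - q)] / [SSE(F) / (n - p)]
f_star = ((sse_red - sse_full) / (p - q)) / (sse_full / (n - p))

# Approximate F(0.95; 2, 36); in practice look it up, e.g. with
# scipy.stats.f.ppf(0.95, p - q, n - p).
f_crit = 3.26
reject_h0 = f_star > f_crit
```

The numerator is the extra sum of squares from the tested predictors divided by its degrees of freedom, so F* is never negative.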

7 Salary (Y)  Age (X1)  Highschool points (X2)  Female/Male (X3)
17 21 1 30 32 120 27 40 35 56 90 44 61 160 38 55 36 39 140 25 33 80
Linköpings universitet

8 Regression Analysis: Salary (Y) versus Age (X1), Highschool points (X2), Female/Male (X3)

The regression equation is
Salary (Y) = … + … Age (X1) + … Highschool points (X2) + … Female/Male (X3)

Predictor               Coef  SE Coef  T  P
Constant                …     …        …  …
Age (X1)                …     …        …  …
Highschool points (X2)  …     …        …  …
Female/Male (X3)        …     …        …  …

S = …   R-Sq = 98.4%   R-Sq(adj) = 97.2%

Analysis of Variance
Source          DF  SS  MS  F  P
Regression      …   …   …   …  …
Residual Error  …   …   …
Total           …   …

Source                  DF  Seq SS
Age (X1)                …   …
Highschool points (X2)  …   …
Female/Male (X3)        …   …

Linköpings universitet

9 Summary of tests of regression coefficients
Test whether a single βk = 0: t-test
Test whether all βk = 0: overall F-test
Test whether a subset of the βk = 0: partial F-test
Linköpings universitet

10 Coefficient of partial determination
Measures the proportional reduction in SSE when another predictor is added to the model. Consider a model with X1 and add X2:

R²_{Y2|1} = [SSE(X1) − SSE(X1, X2)] / SSE(X1) = SSR(X2 | X1) / SSE(X1)

Linköpings universitet
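A minimal NumPy sketch of this quantity, using simulated data (not the lecture's data set):

```python
import numpy as np

# Simulated data: y depends on both x1 and x2.
rng = np.random.default_rng(3)
n = 50
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
y = 1.0 + 2.0 * x1 + 1.0 * x2 + rng.normal(0, 1.0, n)

def sse(y, cols):
    """SSE from an OLS fit of y on an intercept plus the given columns."""
    Z = np.column_stack([np.ones(len(y))] + list(cols))
    beta, *_ = np.linalg.lstsq(Z, y, rcond=None)
    r = y - Z @ beta
    return float(r @ r)

sse_x1 = sse(y, [x1])
sse_x1x2 = sse(y, [x1, x2])

# Fraction of the variation left unexplained by X1 that X2 accounts for.
r2_partial = (sse_x1 - sse_x1x2) / sse_x1
```

Since SSE can only decrease when a predictor is added, the coefficient of partial determination always lies between 0 and 1.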

11 Multicollinearity
Multicollinearity means high correlation among the predictors. Its consequences:
- Adding or deleting a predictor changes the estimated regression coefficients substantially.
- The standard errors of the regression coefficients become very large, so conclusions from the model become imprecise.
- Estimated regression coefficients may be non-significant even though the corresponding predictors are highly correlated with Y.
- When we interpret a regression coefficient, we hold the other predictors constant. With highly correlated predictors this is not meaningful: if we change one of them, the others change too (it is possible mathematically, but not logically).
Linköpings universitet
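The inflation of standard errors mentioned above can be seen directly from the OLS covariance matrix σ²(XᵀX)⁻¹. This sketch compares the standard error of a slope when the second predictor is independent versus nearly a copy of the first; the data and noise levels are made-up choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 60
x1 = rng.normal(size=n)
x2_indep = rng.normal(size=n)           # uncorrelated with x1
x2_coll = x1 + rng.normal(0, 0.01, n)   # almost an exact copy of x1

def slope_se(x_a, x_b, sigma=1.0):
    """Analytic SE of the slope of x_a in y = b0 + b1*x_a + b2*x_b + e,
    computed from sigma^2 * (X'X)^{-1}."""
    X = np.column_stack([np.ones(len(x_a)), x_a, x_b])
    cov = sigma**2 * np.linalg.inv(X.T @ X)
    return float(np.sqrt(cov[1, 1]))

se_indep = slope_se(x1, x2_indep)   # well-conditioned design
se_coll = slope_se(x1, x2_coll)     # near-collinear design: SE explodes
```

With the near-collinear design the standard error is orders of magnitude larger, which is exactly why individually non-significant t-tests can coexist with a highly significant overall F-test.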

12 Indications of the presence of multicollinearity
- Large changes in the regression coefficients when a predictor is added or deleted.
- Non-significant t-tests on the regression coefficients of variables that, judging from the scatter-plot matrix and the correlation matrix (and from subject-matter logic), seemed very important.
- Estimated regression coefficients with a sign opposite to what we expect.
Linköpings universitet

13 Formal test of the presence of multicollinearity
The Variance Inflation Factor (VIF) for predictor Xk is

VIF_k = 1 / (1 − R²_k)

where R²_k is the coefficient of determination from regressing Xk on the other X-variables in the model. Consider a model with predictors X1, X2 and X3: R²_1 comes from regressing X1 on X2 and X3, and similarly for R²_2 and R²_3.

Decision rule: if the largest VIF is greater than 10, or if the average of the VIFs is considerably larger than 1, we may have multicollinearity in the model.
Linköpings universitet
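The VIF computation described above can be sketched with plain NumPy; the three-predictor design below is simulated, with X3 built as a near-linear combination of X1 and X2 so that the diagnostic fires.

```python
import numpy as np

def vif(X):
    """VIF for each column of X: regress X_k on the remaining columns
    (plus an intercept) and return 1 / (1 - R^2_k)."""
    n, m = X.shape
    out = []
    for k in range(m):
        yk = X[:, k]
        Z = np.column_stack([np.ones(n), np.delete(X, k, axis=1)])
        beta, *_ = np.linalg.lstsq(Z, yk, rcond=None)
        resid = yk - Z @ beta
        ss_res = float(resid @ resid)
        ss_tot = float((yk - yk.mean()) @ (yk - yk.mean()))
        r2 = 1.0 - ss_res / ss_tot
        out.append(1.0 / (1.0 - r2))
    return np.array(out)

# Simulated predictors with built-in near-collinearity.
rng = np.random.default_rng(5)
n = 80
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
x3 = x1 + x2 + rng.normal(0, 0.05, n)   # nearly a linear combination
vifs = vif(np.column_stack([x1, x2, x3]))
```

Here the largest VIF far exceeds 10, matching the decision rule on the slide; with three mutually independent predictors each VIF would instead be close to 1.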
