
Presentation on theme: "REGRESSION DIAGNOSTICS"— Presentation transcript:


2 Problems
Influentials and outliers
Heteroscedasticity
Autocorrelation
Multicollinearity: relationship among the independent variables


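4 Residuals - review
Unstandardized residuals: e = y - ŷ = (I - H)y, where H = X(X´X)⁻¹X´ is the hat matrix
Predicted residuals: e(-i) = y_i - ŷ_i(-i) = e_i/(1 - h_ii), where ŷ_i(-i) is the prediction for observation i from the fit without it and h_ii is the i-th diagonal element of H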

5 Residuals - review
Standardized residuals
Jackknife residuals
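The formulas on the two review slides did not survive in the transcript, so the following is only a sketch of the four residual types under common textbook definitions. It assumes that "standardized" means the residual divided by s·√(1 - h_ii) and that "jackknife" means the same ratio with s estimated without observation i. The function name `residual_diagnostics` and the simulated data are illustrative, not taken from the slides.

```python
import numpy as np

def residual_diagnostics(X, y):
    """Return leverages and four residual types for OLS of y on X.

    X must already contain the intercept column.
    """
    n, p = X.shape
    H = X @ np.linalg.solve(X.T @ X, X.T)       # hat matrix H = X (X'X)^-1 X'
    h = np.diag(H)                              # leverages h_ii
    e = y - H @ y                               # unstandardized residuals
    s2 = e @ e / (n - p)                        # residual variance estimate
    predicted = e / (1.0 - h)                   # predicted (leave-one-out) residuals
    standardized = e / np.sqrt(s2 * (1.0 - h))  # assumed definition, see lead-in
    # residual variance estimated with observation i left out
    s2_del = (e @ e - e**2 / (1.0 - h)) / (n - p - 1)
    jackknife = e / np.sqrt(s2_del * (1.0 - h))
    return h, e, predicted, standardized, jackknife

# Simulated data, for illustration only
rng = np.random.default_rng(0)
x = rng.normal(size=30)
X = np.column_stack([np.ones(30), x])
y = 1.0 + 2.0 * x + rng.normal(size=30)
h, e, predicted, standardized, jackknife = residual_diagnostics(X, y)
```

The predicted residual is obtained here from the shortcut e_i/(1 - h_ii), so no model has to be refitted n times.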

6 I. Influentials
Observations whose omission from the computation changes the regression coefficients substantially.
Goal: to find and exclude them

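7 Influentials - diagnosis
DFBETA(-i) = b - b(-i), where b(-i) is the coefficient vector estimated without observation i
Rule of thumb: problem if |NDFBETA| > 2/√n
Note: NDFFIT signals a problem if |NDFFIT| > 2/√(n/p), i.e. 2√(p/n), with n observations and p estimated coefficients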
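As a rough illustration of the diagnosis, the sketch below refits the model once per omitted observation and applies the 2/√n rule. The slide does not define NDFBETA, so the scaling is an assumption: DFBETA divided by s(-i)·√[(X´X)⁻¹]_jj, the usual DFBETAS form. The function name `dfbeta` and the data, with one deliberately planted influential point, are hypothetical.

```python
import numpy as np

def dfbeta(X, y):
    """Return (raw, scaled) DFBETA arrays of shape (n, p)."""
    n, p = X.shape
    b = np.linalg.lstsq(X, y, rcond=None)[0]
    root_diag = np.sqrt(np.diag(np.linalg.inv(X.T @ X)))
    raw = np.empty((n, p))
    scaled = np.empty((n, p))
    for i in range(n):
        keep = np.arange(n) != i
        b_i = np.linalg.lstsq(X[keep], y[keep], rcond=None)[0]  # b(-i)
        resid = y[keep] - X[keep] @ b_i
        s_i = np.sqrt(resid @ resid / (n - 1 - p))              # s without case i
        raw[i] = b - b_i                                        # DFBETA(-i)
        scaled[i] = raw[i] / (s_i * root_diag)                  # assumed NDFBETA
    return raw, scaled

# 20 well-behaved points plus one high-leverage point far off the line
rng = np.random.default_rng(1)
x = np.append(np.linspace(0.0, 1.0, 20), 3.0)
y = 1.0 + 2.0 * x + rng.normal(scale=0.1, size=21)
y[-1] = 1.0
X = np.column_stack([np.ones(21), x])

raw, scaled = dfbeta(X, y)
flagged = np.abs(scaled) > 2.0 / np.sqrt(len(y))   # rule of thumb from the slide
```

Rows of `flagged` that contain `True` mark observations worth inspecting before any decision to exclude them.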

8 II. Heteroscedasticity
Assumption for regression: the variance of the error is the same for all values of the indep. variables
Checking: plots of residuals vs. ind. vars
Consequence: coefficient estimates remain unbiased but are inefficient; the usual standard errors of coeffs are biased, so t-tests are unreliable
Tests: Glejser, Goldfeld-Quandt
Analytical solution: weighted LS (WLS)
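A minimal sketch of the WLS remedy, assuming the error variance is known up to a constant so that each observation can be weighted by the inverse of its variance. In this made-up example the error standard deviation is proportional to x, hence the weights 1/x². The function name `wls` and the data are illustrative.

```python
import numpy as np

def wls(X, y, w):
    """Weighted least squares: b = (X'WX)^-1 X'Wy with W = diag(w)."""
    Xw = X * w[:, None]                  # each row of X multiplied by its weight
    return np.linalg.solve(X.T @ Xw, Xw.T @ y)

rng = np.random.default_rng(3)
n = 200
x = rng.uniform(1.0, 10.0, size=n)
X = np.column_stack([np.ones(n), x])
y = 1.0 + 2.0 * x + x * rng.normal(size=n)   # error s.d. grows with x
w = 1.0 / x**2                               # assumed weights: 1 / variance
b_wls = wls(X, y, w)
```

The same estimate results from dividing every row of X and y by x and running ordinary least squares on the transformed data.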

9 Glejser’s test
Model for the absolute residuals on the ind. vars, e.g. |e_i| = a0 + a1·x_i + v_i; a significant a1 signals heteroscedasticity
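The model formula on this slide is missing from the transcript. The sketch below assumes the linear form of the test, in which the absolute OLS residuals are regressed on the same independent variables and the t statistic of each slope is inspected. The helper names `ols` and `glejser_t` and the simulated heteroscedastic data are assumptions made for illustration.

```python
import numpy as np

def ols(X, y):
    """Return coefficients, their standard errors and the residuals."""
    n, p = X.shape
    xtx_inv = np.linalg.inv(X.T @ X)
    b = xtx_inv @ X.T @ y
    e = y - X @ b
    s2 = e @ e / (n - p)
    se = np.sqrt(s2 * np.diag(xtx_inv))
    return b, se, e

def glejser_t(X, y):
    """t statistics from regressing |residuals| on the same design matrix."""
    _, _, e = ols(X, y)
    a, se_a, _ = ols(X, np.abs(e))   # auxiliary regression of |e| on X
    return a / se_a

rng = np.random.default_rng(4)
n = 200
x = rng.uniform(1.0, 10.0, size=n)
X = np.column_stack([np.ones(n), x])
y_hetero = 1.0 + 2.0 * x + x * rng.normal(size=n)   # error s.d. grows with x
t_hetero = glejser_t(X, y_hetero)                    # [t for intercept, t for x]
```

A slope t statistic well beyond about 2 in absolute value suggests that the error spread depends on that variable.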

10 III. Multicollinearity
Estimate: b = (X´X)⁻¹X´y
Strong relationship between ind. variables: X´X is a singular or nearly singular matrix
Consequence: standard errors of coeffs are inflated, t-tests are statistically insignificant, estimates are not stable

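11 Multicollinearity
Diagnosis: correlation of ind. vars, corr. coeff. > 0.8
Other options:
a) Tolerance = 1 - R²j, where R²j comes from regressing ind. variable j on the other ind. vars
b) VIF = 1/(1 - R²j); the VIFs are the diagonal cells of R⁻¹, the inverse of the correlation matrix of the ind. vars
c) Condition index: square root of the ratio max lambda / min lambda; ROT*: > 30 → problem
*ROT = rule of thumb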
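The three diagnostics can be sketched in a few lines. The slide does not say which matrix the lambdas belong to, so the code assumes they are the eigenvalues of the correlation matrix R of the independent variables. The function name `collinearity_diagnostics` and the simulated data, in which the second variable nearly duplicates the first, are illustrative.

```python
import numpy as np

def collinearity_diagnostics(X):
    """X holds the independent variables only (no intercept column)."""
    R = np.corrcoef(X, rowvar=False)      # correlation matrix of the predictors
    vif = np.diag(np.linalg.inv(R))       # VIF_j = j-th diagonal cell of R^-1
    tolerance = 1.0 / vif                 # tolerance_j = 1 - R_j^2
    eigvals = np.linalg.eigvalsh(R)       # assumed source of the lambdas
    condition_index = np.sqrt(eigvals.max() / eigvals.min())
    return vif, tolerance, condition_index

rng = np.random.default_rng(2)
x1 = rng.normal(size=200)
x2 = x1 + 0.03 * rng.normal(size=200)     # nearly a copy of x1
x3 = rng.normal(size=200)                 # unrelated variable
X = np.column_stack([x1, x2, x3])
vif, tolerance, condition_index = collinearity_diagnostics(X)
```

Here the first two variables get very large VIFs while the third stays near 1, which shows that VIF points to the variables involved, whereas the condition index summarizes the whole set in one number.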

12 Multicollinearity
Solution:
Ignore it
Leave out a variable
Get more data
Use factor analysis (see later)
Ridge regression: biased estimates but smaller standard errors (a small constant is added to the diagonal elements of X´X)
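To make the ridge idea concrete, here is a small sketch that adds a constant k to the diagonal of X´X before solving. It assumes centred data so that no intercept is penalized. The value k = 1.0, the function name `ridge` and the collinear test data are arbitrary choices for illustration, not recommendations from the slides.

```python
import numpy as np

def ridge(X, y, k):
    """Ridge estimate b(k) = (X'X + kI)^-1 X'y on centred data (no intercept)."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    p = Xc.shape[1]
    return np.linalg.solve(Xc.T @ Xc + k * np.eye(p), Xc.T @ yc)

rng = np.random.default_rng(5)
x1 = rng.normal(size=100)
x2 = x1 + 0.05 * rng.normal(size=100)     # strongly collinear with x1
X = np.column_stack([x1, x2])
y = 1.0 + x1 + x2 + rng.normal(size=100)

b_ols = ridge(X, y, 0.0)     # k = 0 reproduces ordinary least squares
b_ridge = ridge(X, y, 1.0)   # k > 0 shrinks the coefficient vector
```

With k = 0 the result equals OLS; any k > 0 shortens the coefficient vector, trading some bias for lower variance.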

13 IV. Autocorrelation
Assumption for regression: errors are independent across individual observations
Checking: plots of residuals vs. time
Consequence: the usual standard errors of coeffs are biased, so t-tests are unreliable
Tests: Durbin-Watson
Solution: generalized LS (GLS)
Autocorrelation arises mainly in time series (which are usually not used in social sciences)
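The Durbin-Watson statistic is simple enough to compute by hand from the residuals taken in time order. The sketch below does so and applies it to simulated AR(1) errors; the function name `durbin_watson` and the choice rho = 0.8 are illustrative assumptions.

```python
import numpy as np

def durbin_watson(e):
    """d = sum_t (e_t - e_{t-1})^2 / sum_t e_t^2, residuals in time order."""
    e = np.asarray(e, dtype=float)
    return np.sum(np.diff(e) ** 2) / np.sum(e ** 2)

# Simulated residuals with positive first-order autocorrelation
rng = np.random.default_rng(6)
n, rho = 200, 0.8
e = np.zeros(n)
for t in range(1, n):
    e[t] = rho * e[t - 1] + rng.normal()
d = durbin_watson(e)
```

Values of d near 2 indicate no first-order autocorrelation, values toward 0 indicate positive autocorrelation and values toward 4 negative autocorrelation.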

