Multicollinearity
What does it mean? A high degree of correlation amongst the explanatory variables.
What are its consequences? It may be difficult to separate out the effects of the individual regressors. Standard errors may be overestimated and t-values depressed. Note: a symptom may be a high R2 but low t-values.
How can you detect the problem? Examine the correlation matrix of the regressors and carry out auxiliary regressions amongst them. Look at the Variance Inflation Factors.
NOTE: be careful not to apply t-tests mechanically without checking for multicollinearity; multicollinearity is a data problem, not a misspecification problem.
Multicollinearity: Sources of multicollinearity
- Problems in data collection
- An over-defined model (too many predictor variables)
- Most economic variables move together
- Use of lagged values of some explanatory variables
Multicollinearity: Consequences
1. In the case of perfect multicollinearity, the values of the estimates are indeterminate and their variances are infinite. In the case of partial collinearity it is possible to obtain the estimators, but their variances tend to be very large as the degree of correlation between the explanatory variables increases.
Multicollinearity
2. Because of the large standard errors, confidence intervals tend to be wider.
3. The coefficient of determination (R2) may also be high.
4. The OLS estimators and their variances become very sensitive to small changes in the data.
Multicollinearity: Detection of multicollinearity
Examination of the correlation matrix: if the determinant of the correlation matrix is near 1, there is no multicollinearity; if it is near zero, there is (near-)perfect multicollinearity.
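As a rough illustration of this check, here is a minimal NumPy sketch (not from the original slides; the simulated regressors x1, x2, x3 and the sample size are invented purely for illustration):

```python
# Sketch: inspect the correlation matrix of the regressors and its determinant.
import numpy as np

rng = np.random.default_rng(0)
n = 200
x1 = rng.normal(size=n)
x2 = 0.95 * x1 + 0.05 * rng.normal(size=n)   # nearly collinear with x1
x3 = rng.normal(size=n)                       # unrelated regressor

X = np.column_stack([x1, x2, x3])
R = np.corrcoef(X, rowvar=False)              # correlation matrix of the regressors
det_R = np.linalg.det(R)

print(np.round(R, 3))
print("det(R) =", round(det_R, 4))            # near 0 -> strong multicollinearity,
                                              # near 1 -> little multicollinearity
```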
Variance Inflation Factor (VIF)
Multicollinearity inflates the variance of an estimator. VIFj is the j-th diagonal element cjj of (X'X)^-1 in correlation form; equivalently VIFj = 1/(1 - Rj2), where Rj2 is the R2 from regressing Xj on the other regressors. A serious multicollinearity problem is indicated if VIFj > 5.
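A minimal sketch of one common way to compute VIFs, using the fact that the diagonal of the inverse correlation matrix gives the VIFs (the data below are simulated purely for illustration):

```python
# Sketch: VIF_j as the j-th diagonal element of the inverse correlation matrix,
# equivalently 1/(1 - R_j^2).
import numpy as np

rng = np.random.default_rng(1)
n = 200
x1 = rng.normal(size=n)
x2 = 0.9 * x1 + 0.1 * rng.normal(size=n)      # strongly related to x1
x3 = rng.normal(size=n)

X = np.column_stack([x1, x2, x3])
R = np.corrcoef(X, rowvar=False)
vif = np.diag(np.linalg.inv(R))               # diagonal of R^{-1} gives the VIFs

for j, v in enumerate(vif, start=1):
    flag = "  <-- possible problem (VIF > 5)" if v > 5 else ""
    print(f"VIF for x{j}: {v:.2f}{flag}")
```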
Multicollinearity: Farrar-Glauber test
It is a set of three tests:
- A chi-square test for detecting the existence and severity of multicollinearity.
- An F test for locating which variables are multicollinear.
- A t test for finding the pattern of multicollinearity.
Examination of correlation matrix
A very simple procedure to detect multicollinearity is the inspection of the off-diagonal elements rij of the correlation matrix. If the explanatory variables Xi and Xj are linearly independent, rij will be close to zero. This is helpful only in detecting pairwise collinearity, but it is not sufficient for detecting anything more complex than pairwise relationships. For those we use the determinant of the correlation matrix: a determinant near zero indicates significant multicollinearity, while a determinant near one indicates no multicollinearity.
Multicollinearity: Definition
Multicollinearity is the condition where the independent variables are related to each other. Causation is not implied by multicollinearity. As any two (or more) variables become more and more closely correlated, the condition worsens and 'approaches singularity'. Since the X's are supposed to be fixed, this is a sample problem. Since multicollinearity is almost always present, it is a problem of degree, not merely existence.
Multicollinearity: Implications
Consider the following cases.
A) No multicollinearity: the regression would appear to be identical to separate bivariate regressions. This produces variances which are biased upward (too large), making t-tests too small. For multiple regression this satisfies the assumption.
Multicollinearity: Implications (cont.)
B) Perfect multicollinearity: some variable Xi is a perfect linear combination of one or more other variables Xj; therefore X'X is singular and |X'X| = 0. In matrix algebra notation, this means that one variable is a perfect linear function of another (e.g. X2 = X1 + 3.2). A model cannot be estimated under such circumstances. The computer dies.
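A small illustrative sketch of this situation (with simulated data, reusing the slide's own example X2 = X1 + 3.2) showing that X'X becomes singular:

```python
# Sketch: an exact linear relation between regressors makes X'X singular,
# so the OLS normal equations cannot be solved.
import numpy as np

rng = np.random.default_rng(2)
n = 50
x1 = rng.normal(size=n)
x2 = x1 + 3.2                                  # the slide's example: X2 = X1 + 3.2
X = np.column_stack([np.ones(n), x1, x2])      # intercept, x1, x2

XtX = X.T @ X
print("det(X'X) =", np.linalg.det(XtX))        # ~0 up to rounding: singular
print("rank(X) =", np.linalg.matrix_rank(X), "of", X.shape[1])
# np.linalg.inv(XtX) would fail or give numerically meaningless results here.
```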
Multicollinearity: Implications (cont.)
C) A high degree of multicollinearity: when the independent variables are highly correlated, the variances and covariances of the Bi's are inflated (t-ratios are lower) and R2 tends to be high as well. The B's are unbiased (but perhaps useless due to their imprecise measurement as a result of their variances being too large); in fact they are still BLUE. OLS estimates tend to be sensitive to small changes in the data. Relevant variables may be discarded.
Multicollinearity: Causes
- Sampling mechanism: poorly constructed design and measurement scheme, or limited range.
- Statistical model specification: adding polynomial terms or trend indicators.
- Too many variables in the model - the model is over-determined.
- Theoretical specification is wrong: inappropriate construction of theory or even of measurement.
Multicollinearity: Tests/Indicators
|X'X| approaches 0. Since the determinant is a function of variable scale, this measure doesn't help a whole lot on its own. We could, however, use the determinant of the correlation matrix, which bounds the range from 0.0 to 1.0.
Multicollinearity: Tests/Indicators (cont.)
Tolerance: TOLj = 1 - Rj2 = 1/VIFj. If the tolerance equals 1, the variables are unrelated; if TOLj = 0, they are perfectly correlated. Variance Inflation Factors (VIFs): VIFj = 1/(1 - Rj2) = 1/TOLj.
Interpreting VIFs
No multicollinearity produces VIFs = 1.0.
If the VIF is greater than 10.0, then multicollinearity is probably severe: 90% of the variance of Xj is explained by the other X's. In small samples, a VIF of about 5.0 may already indicate problems.
Multicollinearity: Tests/Indicators (cont.)
R2 deletes: try all possible models of the X's, including/excluding variables based on the small changes in R2 that occur with the inclusion/omission of each variable (taken 1 at a time). Multicollinearity is of concern when either the F statistic is significant but no t-value is, or the adjusted R2 declines with the addition of a new variable.
Multicollinearity: Tests/Indicators (cont.)
I would avoid relying on these rules of thumb alone:
- The Betas are > 1.0 or < -1.0.
- Sign changes occur with the introduction of a new variable.
- The R2 is high, but few t-ratios are.
- Eigenvalues and Condition Index - if this topic is beyond Gujarati, it's beyond me (see the sketch below).
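A minimal sketch of the eigenvalue/condition-index check, assuming the common Belsley-style definition (the square root of the ratio of the largest to the smallest eigenvalue of the column-scaled X'X); the data are simulated for illustration only:

```python
# Sketch: condition index from the eigenvalues of the scaled X'X matrix.
import numpy as np

rng = np.random.default_rng(3)
n = 200
x1 = rng.normal(size=n)
x2 = 0.95 * x1 + 0.05 * rng.normal(size=n)
x3 = rng.normal(size=n)

X = np.column_stack([x1, x2, x3])
Xs = X / np.linalg.norm(X, axis=0)            # scale columns to unit length
eigvals = np.linalg.eigvalsh(Xs.T @ Xs)       # eigenvalues of the scaled X'X

cond_index = np.sqrt(eigvals.max() / eigvals.min())
print("eigenvalues:", np.round(eigvals, 4))
print("condition index:", round(cond_index, 1))   # > 30 is often read as severe
```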
Multicollinearity: Remedies
- Increase sample size
- Omit variables
- Scale construction/transformation
- Factor analysis
- Constrain the estimation, such as the case where you can set the value of one coefficient relative to another.
Multicollinearity: Remedies (cont.)
- Change the design (LISREL, maybe, or pooled cross-sectional time series).
- Ridge regression: this technique introduces a small amount of bias into the coefficients to reduce their variance (see the sketch after this list).
- Ignore it: report the adjusted R2 and claim it warrants retention in the model.
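A minimal sketch of ridge regression's closed form, beta = (X'X + kI)^-1 X'y, with an invented penalty k and simulated data (not tied to any dataset from the slides):

```python
# Sketch: ridge estimator alongside OLS on nearly collinear regressors.
import numpy as np

rng = np.random.default_rng(4)
n = 200
x1 = rng.normal(size=n)
x2 = 0.95 * x1 + 0.05 * rng.normal(size=n)    # nearly collinear regressors
y = 1.0 + 2.0 * x1 - 1.0 * x2 + rng.normal(size=n)

X = np.column_stack([np.ones(n), x1, x2])
k = 0.5                                        # small ridge penalty (bias/variance trade-off)

I = np.eye(X.shape[1])
I[0, 0] = 0.0                                  # commonly the intercept is not penalized
beta_ols   = np.linalg.solve(X.T @ X, X.T @ y)
beta_ridge = np.linalg.solve(X.T @ X + k * I, X.T @ y)

print("OLS:  ", np.round(beta_ols, 3))
print("Ridge:", np.round(beta_ridge, 3))       # shrunk, lower-variance coefficients
```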