OUTLIER, HETEROSKEDASTICITY, AND NORMALITY


1 OUTLIER, HETEROSKEDASTICITY, AND NORMALITY
Robust Regression
HAC Estimates of Standard Errors
Quantile Regression


8 Robust regression analysis
An alternative to a least squares regression model when fundamental assumptions are unfulfilled by the nature of the data
Resistant to the influence of outliers
Deals with residual problems
Implemented in Stata and EViews
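The slide names Stata and EViews; as a language-neutral illustration of the idea, here is a minimal numpy sketch of one common robust-regression approach, iteratively reweighted least squares with Huber weights (the data and all numbers are hypothetical, not from the presentation):

```python
import numpy as np

def huber_irls(X, y, c=1.345, iters=50):
    """Robust regression: iteratively reweighted least squares with
    Huber weights, which downweight observations with large residuals."""
    beta = np.linalg.lstsq(X, y, rcond=None)[0]  # start from OLS
    for _ in range(iters):
        r = y - X @ beta
        # robust scale estimate from the median absolute deviation (MAD)
        s = np.median(np.abs(r - np.median(r))) / 0.6745 + 1e-12
        u = r / s
        w = np.where(np.abs(u) <= c, 1.0, c / np.abs(u))  # Huber weights
        sw = np.sqrt(w)
        beta = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)[0]
    return beta

# Demo: a clean linear relationship contaminated by a few gross outliers.
rng = np.random.default_rng(0)
x = rng.uniform(0, 10, 100)
X = np.column_stack([np.ones_like(x), x])
y = 1.0 + 2.0 * x + rng.normal(0, 0.5, 100)
y[x > 9] += 40.0                      # gross outliers at the right edge
b_ols = np.linalg.lstsq(X, y, rcond=None)[0]
b_rob = huber_irls(X, y)
print("OLS slope:", b_ols[1], "robust slope:", b_rob[1])
```

The outliers pull the OLS slope well away from the true value of 2, while the reweighted fit stays close to it.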

9 Alternatives to OLS
A. White’s standard errors; OLS with an HAC estimate of the standard error
B. Weighted least squares; robust regression
C. Quantile regression; median regression; bootstrapping


13 If the residuals are normally distributed, the analyst can determine where the rejection region for a given significance level begins. Even if the sample size is large, the influence of an outlier can increase the local, and possibly even the global, error variance. This inflation of the error variance decreases the efficiency of estimation.


15 OLS and Heteroskedasticity
What are the implications of heteroskedasticity for OLS? Under the Gauss–Markov assumptions (including homoskedasticity), OLS is the best linear unbiased estimator (BLUE). Under heteroskedasticity, is OLS still unbiased? Is OLS still best?
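A quick Monte Carlo sketch (hypothetical data, not from the slides) answers both questions: under heteroskedastic errors the OLS slope is still unbiased, but it is no longer best — weighted least squares with the correct weights has a smaller sampling variance:

```python
import numpy as np

rng = np.random.default_rng(1)
n, reps = 200, 2000
x = rng.uniform(0, 5, n)
X = np.column_stack([np.ones(n), x])
sigma = 0.2 + 0.5 * x            # error sd grows with x: heteroskedastic
w = 1.0 / sigma                  # ideal WLS weights (known here by construction)

ols_slopes, wls_slopes = [], []
for _ in range(reps):
    y = 1.0 + 2.0 * x + sigma * rng.standard_normal(n)
    ols_slopes.append(np.linalg.lstsq(X, y, rcond=None)[0][1])
    # row-scaling by 1/sigma turns OLS into GLS/WLS
    wls_slopes.append(np.linalg.lstsq(X * w[:, None], y * w, rcond=None)[0][1])
ols_slopes, wls_slopes = np.array(ols_slopes), np.array(wls_slopes)

print("mean OLS slope:", ols_slopes.mean())  # close to the true value 2
print("var OLS:", ols_slopes.var(), "var WLS:", wls_slopes.var())
```

The average OLS slope stays near 2 (unbiased), while its variance exceeds that of the correctly weighted estimator (not best).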

16 A. Heteroskedasticity- and Autocorrelation-Consistent Variance Estimation
The robust White variance estimator renders regression resistant to the heteroskedasticity problem. Halbert White (1980) showed that, in large samples, a variance estimator built from the squared OLS residuals is consistent under heteroskedasticity, yielding heteroskedasticity-consistent estimates of the standard errors.
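White’s estimator can be computed directly from the OLS residuals via the "sandwich" formula (X'X)⁻¹ X' diag(eᵢ²) X (X'X)⁻¹. A minimal numpy sketch on synthetic heteroskedastic data (all numbers hypothetical):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 500
x = rng.uniform(0, 10, n)
X = np.column_stack([np.ones(n), x])
y = 1.0 + 2.0 * x + rng.normal(0, 0.5 + 0.3 * x)  # heteroskedastic errors

beta = np.linalg.lstsq(X, y, rcond=None)[0]
e = y - X @ beta
XtX_inv = np.linalg.inv(X.T @ X)

# conventional OLS standard errors: s^2 (X'X)^{-1}
s2 = e @ e / (n - X.shape[1])
se_conv = np.sqrt(np.diag(s2 * XtX_inv))

# White (HC0) sandwich: (X'X)^{-1} X' diag(e_i^2) X (X'X)^{-1}
meat = X.T @ (X * e[:, None] ** 2)
se_white = np.sqrt(np.diag(XtX_inv @ meat @ XtX_inv))

print("conventional SEs:", se_conv)
print("White (HC0) SEs:", se_white)
```

The full HAC (Newey–West) version additionally adds downweighted cross-lag terms to the middle matrix to handle autocorrelation; the heteroskedasticity-only case above is the simplest member of that family.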


24 The Quantile Regression Problem
The distribution of Y, the dependent variable, conditional on the covariate X, may have thick tails.
The conditional distribution of Y may be asymmetric.
The conditional distribution of Y may not be unimodal.
Neither mean regression nor ANOVA will give robust results: outliers are problematic, the mean is pulled toward the skewed tail, and multiple modes are not revealed.

25 Reasons to use quantiles rather than means
Analysis of the distribution rather than the average
Robustness to outliers and skewed data
Interest in a representative value
Interest in the tails of the distribution
Unequal variation across samples
E.g., the income distribution is highly skewed, so the median relates more closely to the typical person than the mean.
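The income example above can be seen in a few lines (a hypothetical right-skewed lognormal sample, not real income data):

```python
import numpy as np

rng = np.random.default_rng(3)
# hypothetical right-skewed "incomes": a lognormal sample
income = rng.lognormal(mean=10.0, sigma=1.0, size=10_000)
print("mean:  ", np.mean(income))    # pulled far up by the long right tail
print("median:", np.median(income))  # closer to the typical person
```

For a lognormal sample like this, the mean exceeds the median by a factor of roughly e^{σ²/2} ≈ 1.65, so the median is the more representative summary.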

26 Quantiles: the Cumulative Distribution Function and the Quantile Function
The empirical quantile function is a discrete step function.
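A minimal sketch of the two objects named on the slide — the empirical CDF and its inverse, the quantile function, which for a finite sample is a discrete step function (function names here are illustrative):

```python
import numpy as np

def ecdf(sample, x):
    """Empirical CDF: F(x) = fraction of observations <= x."""
    sample = np.sort(sample)
    return np.searchsorted(sample, x, side="right") / len(sample)

def quantile_fn(sample, tau):
    """Empirical quantile function: the smallest x with F(x) >= tau.
    As a function of tau this is a discrete step function."""
    sample = np.sort(sample)
    idx = int(np.ceil(tau * len(sample))) - 1
    return sample[max(idx, 0)]

s = [1, 2, 3, 4]
print(ecdf(s, 2))            # F(2) = 0.5
print(quantile_fn(s, 0.75))  # smallest x with F(x) >= 0.75 is 3
```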

27 Regression Line

28 The Perspective of Quantile Regression (QR)


31 Optimality Criteria
Quadratic loss: the mean optimizes (minimizes) E[(Y − c)²].
Linear absolute loss: the median optimizes E|Y − c|.
The τ-th quantile optimizes the expected check loss E[ρ_τ(Y − c)], where ρ_τ(u) = u(τ − I(u < 0)) and I ∈ {0, 1} is the indicator function.
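One can verify numerically that minimizing the check loss ρ_τ(u) = u(τ − I(u < 0)) over a constant c yields the τ-th sample quantile (a sketch with synthetic data):

```python
import numpy as np

def check_loss(u, tau):
    """Check (pinball) loss: rho_tau(u) = u * (tau - I(u < 0))."""
    return u * (tau - (u < 0))

rng = np.random.default_rng(4)
y = rng.normal(size=2001)
tau = 0.7
# minimize the total check loss over candidate constants c;
# an optimum is always attained at one of the data points
cands = np.sort(y)
losses = [check_loss(y - c, tau).sum() for c in cands]
c_star = cands[int(np.argmin(losses))]
print(c_star, np.quantile(y, tau))  # the two nearly coincide
```

The minimizer matches the empirical 70th percentile (up to the interpolation convention `np.quantile` uses between order statistics).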

32 Quantile Regression: Absolute Loss vs. Quadratic Loss
Quadratic loss penalizes large errors very heavily. When τ = .5, the best predictor is the median, which does not give as much weight to outliers. When τ = .7, the loss is asymmetric: large positive errors are penalized more heavily than negative errors.


34 Simple Linear Regression
Food expenditure vs. income: Engel’s survey of 235 Belgian households
Fit a range of quantiles: does the slope change at different quantiles?
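The Engel data themselves are not reproduced here; as a sketch, hypothetical income/food-expenditure data with spread growing in income, fitted with a tiny exact quantile-regression routine (it exploits the linear-programming fact that some optimal line passes through two data points; O(n³), for teaching only):

```python
import numpy as np

def pinball(u, tau):
    """Total check (pinball) loss over a residual vector."""
    return float(np.sum(u * (tau - (u < 0))))

def qreg_line(x, y, tau):
    """Simple linear quantile regression by brute force over all
    two-point lines. Exact for small n; not a production solver."""
    best_loss, best = np.inf, (0.0, 0.0)
    n = len(x)
    for i in range(n):
        for j in range(i + 1, n):
            if x[i] == x[j]:
                continue
            b = (y[j] - y[i]) / (x[j] - x[i])
            a = y[i] - b * x[i]
            loss = pinball(y - (a + b * x), tau)
            if loss < best_loss:
                best_loss, best = loss, (a, b)
    return best  # (intercept, slope)

# Hypothetical Engel-style data: expenditure spread grows with income.
rng = np.random.default_rng(5)
income = rng.uniform(500, 3000, 60)
food = 100 + 0.4 * income + rng.normal(0, 0.08 * income)
for tau in (0.25, 0.5, 0.75):
    a, b = qreg_line(income, food, tau)
    print(f"tau={tau}: slope={b:.3f}")
```

Because the conditional spread grows with income, the fitted slope tends to increase with τ — the same "change of slope at different quantiles" pattern the Engel data are famous for.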

35 Bootstrapping
When the normality and homoskedasticity assumptions are violated, many researchers resort to nonparametric bootstrap methods.
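A sketch of the case-resampling bootstrap for a regression slope, including the percentile confidence limits mentioned on the final slide (synthetic heavy-tailed data; all numbers hypothetical):

```python
import numpy as np

rng = np.random.default_rng(6)
n = 150
x = rng.uniform(0, 10, n)
y = 1.0 + 2.0 * x + rng.standard_t(df=3, size=n)  # heavy-tailed, non-normal errors
X = np.column_stack([np.ones(n), x])

def slope(Xm, ym):
    return np.linalg.lstsq(Xm, ym, rcond=None)[0][1]

B = 2000
boot = np.empty(B)
for b in range(B):
    idx = rng.integers(0, n, n)       # resample (x, y) cases with replacement
    boot[b] = slope(X[idx], y[idx])

se_boot = boot.std(ddof=1)
lo, hi = np.percentile(boot, [2.5, 97.5])  # percentile 95% CI for the slope
print(f"bootstrap SE: {se_boot:.4f}, 95% CI: ({lo:.3f}, {hi:.3f})")
```

No normality assumption enters anywhere: the standard error and confidence limits come entirely from the empirical distribution of the resampled slopes.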


39 Bootstrap Confidence Limits

