OUTLIER, HETEROSKEDASTICITY, AND NORMALITY

OUTLIER, HETEROSKEDASTICITY, AND NORMALITY: Robust Regression, HAC Estimate of Standard Error, Quantile Regression

Robust regression analysis: An alternative to a least squares regression model when fundamental assumptions are unfulfilled by the nature of the data. It is resistant to the influence of outliers and deals with residual problems. Software: Stata and EViews.

Alternatives to OLS
A. White's Standard Errors; OLS with HAC Estimate of Standard Error
B. Weighted Least Squares; Robust Regression
C. Quantile Regression; Median Regression; Bootstrapping

If the residuals are normally distributed, the analyst can determine where the rejection regions for a given level of significance begin. Even if the sample size is large, the influence of an outlier can increase the local and possibly even the global error variance. This inflation of the error variance decreases the efficiency of estimation.
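The inflation of error variance by outliers, and the resistance of a robust fit, can be illustrated with a small simulation. This is a minimal sketch, not the method used in the slides: it assumes Huber M-estimation fitted by iteratively reweighted least squares as the robust estimator (the slides do not say which one is meant), and the data, the tuning constant `c=1.345`, and the function name `huber_irls` are illustrative choices.

```python
import numpy as np

def huber_irls(X, y, c=1.345, n_iter=100, tol=1e-8):
    """Huber M-estimate of regression coefficients via iteratively
    reweighted least squares (an assumed, illustrative robust estimator)."""
    beta = np.linalg.lstsq(X, y, rcond=None)[0]  # start from the OLS fit
    for _ in range(n_iter):
        resid = y - X @ beta
        # Robust scale: median absolute deviation, rescaled for normal errors
        scale = np.median(np.abs(resid - np.median(resid))) / 0.6745
        scale = max(scale, 1e-12)
        u = np.abs(resid) / scale
        # Huber weights: 1 for small residuals, c/|u| for large ones
        w = np.where(u <= c, 1.0, c / np.maximum(u, 1e-12))
        sw = np.sqrt(w)
        beta_new = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)[0]
        converged = np.max(np.abs(beta_new - beta)) < tol
        beta = beta_new
        if converged:
            break
    return beta

# Simulated data (hypothetical): true line y = 1 + 2x with unit-variance noise
rng = np.random.default_rng(0)
n = 200
x = rng.uniform(0, 10, n)
y = 1.0 + 2.0 * x + rng.normal(0, 1, n)
y[:10] += 60.0  # contaminate ten observations with gross outliers
X = np.column_stack([np.ones(n), x])

beta_ols = np.linalg.lstsq(X, y, rcond=None)[0]
resid_ols = y - X @ beta_ols
s2_ols = resid_ols @ resid_ols / (n - 2)  # OLS error-variance estimate
beta_huber = huber_irls(X, y)

print("OLS coefficients:  ", beta_ols)
print("OLS error variance:", s2_ols, "(true value is 1)")
print("Huber coefficients:", beta_huber)
```

In this setup the ten contaminated points push the OLS error-variance estimate far above the true value of 1, while the Huber fit bounds each outlier's influence and should stay close to the true coefficients.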

OLS and Heteroskedasticity: What are the implications of heteroskedasticity for OLS? Under the Gauss–Markov assumptions (including homoskedasticity), OLS is the Best Linear Unbiased Estimator (BLUE). Under heteroskedasticity, is OLS still unbiased? Is OLS still best?
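The standard answers are that OLS remains unbiased under heteroskedasticity but is no longer best: a weighted estimator with the right weights has smaller variance. The Monte Carlo sketch below illustrates this under assumed conditions (error standard deviation proportional to x, weights known exactly); the design and all parameter values are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(42)
n, reps = 100, 2000
x = np.linspace(1.0, 10.0, n)
X = np.column_stack([np.ones(n), x])
sigma = 0.5 * x                  # assumed: error s.d. grows with x
sw = 1.0 / sigma                 # square roots of the ideal WLS weights 1/sigma^2
beta_true = np.array([1.0, 2.0])

ols_slopes = np.empty(reps)
wls_slopes = np.empty(reps)
for r in range(reps):
    y = X @ beta_true + rng.normal(0.0, sigma)
    # OLS: unweighted least squares
    ols_slopes[r] = np.linalg.lstsq(X, y, rcond=None)[0][1]
    # WLS: least squares on rows rescaled by 1/sigma_i
    wls_slopes[r] = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)[0][1]

print("mean OLS slope:", ols_slopes.mean(), "(true slope is 2)")
print("mean WLS slope:", wls_slopes.mean())
print("variance of OLS slope:", ols_slopes.var())
print("variance of WLS slope:", wls_slopes.var())
```

Both averages should sit near the true slope, which is the unbiasedness part. The WLS slopes should vary noticeably less across replications, which is the loss of efficiency. In practice the weights are unknown and have to be estimated, so the gain shown here is an upper bound for this design.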

A. Heteroskedasticity and Autocorrelation Consistent Variance Estimation: The robust White variance estimator makes regression inference resistant to the heteroskedasticity problem. Halbert White showed in 1980 that, asymptotically (in large samples), the squared sample residuals can stand in for the unknown error variances under heteroskedasticity, yielding a heteroskedasticity-consistent estimate of the standard errors.
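White's estimator can be sketched in a few lines. The code below computes the basic "sandwich" form, usually labelled HC0, next to the conventional OLS standard errors. It is a minimal sketch on simulated data: the function name `ols_with_white_se` and the error structure are assumptions for illustration, and it covers heteroskedasticity only, not the autocorrelation part of a full HAC estimator.

```python
import numpy as np

def ols_with_white_se(X, y):
    """OLS coefficients with conventional and White (HC0) standard errors."""
    n, k = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    resid = y - X @ beta

    # Conventional covariance: s^2 (X'X)^{-1}, valid under homoskedasticity
    s2 = resid @ resid / (n - k)
    se_classic = np.sqrt(np.diag(s2 * XtX_inv))

    # White sandwich: (X'X)^{-1} [sum_i e_i^2 x_i x_i'] (X'X)^{-1}
    meat = X.T @ (X * resid[:, None] ** 2)
    cov_white = XtX_inv @ meat @ XtX_inv
    se_white = np.sqrt(np.diag(cov_white))
    return beta, se_classic, se_white

# Simulated data (hypothetical): error s.d. increases sharply with x
rng = np.random.default_rng(3)
n = 2000
x = np.linspace(1.0, 10.0, n)
y = 1.0 + 2.0 * x + rng.normal(0.0, 0.1 * x**2)
X = np.column_stack([np.ones(n), x])

beta, se_classic, se_white = ols_with_white_se(X, y)
print("coefficients:     ", beta)
print("conventional s.e.:", se_classic)
print("White (HC0) s.e.: ", se_white)
```

The coefficient estimates are the same in both cases; only the standard errors change. In this design the conventional formula understates the uncertainty of the slope, so the White standard error should come out larger.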

Quantile Regression Problem: The distribution of Y, the "dependent" variable, conditional on the covariate X, may have thick tails. The conditional distribution of Y may be asymmetric. The conditional distribution of Y may not be unimodal. In these cases neither mean regression nor ANOVA will give robust results: outliers are problematic, the mean is pulled toward the skewed tail, and multiple modes will not be revealed.

Reasons to use quantiles rather than means: analysis of the distribution rather than the average; robustness; skewed data; interest in a representative value; interest in the tails of the distribution; unequal variation of samples. E.g. the income distribution is highly skewed, so the median relates more to the typical person than the mean does.

Quantiles: Cumulative Distribution Function; Quantile Function; for discrete data, a step function.
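The formulas on this slide did not survive extraction. For reference, the standard textbook definitions are given below; the notation ($F$, $Q$, $\tau$, $F_n$) is assumed here and may differ from what the slide used.

```latex
% Cumulative distribution function of Y
F(y) = P(Y \le y)

% Quantile function: the smallest y whose cumulative probability reaches tau
Q(\tau) = F^{-1}(\tau) = \inf\{\, y : F(y) \ge \tau \,\}, \qquad 0 < \tau < 1

% Empirical CDF of a sample y_1, ..., y_n: a step function with jumps of 1/n
F_n(y) = \frac{1}{n} \sum_{i=1}^{n} I(y_i \le y)
```

Because $F_n$ is a step function, the sample quantile function obtained by inverting it is a step function as well.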

Regression Line

The Perspective of Quantile Regression (QR)

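Optimality Criteria: The mean optimizes quadratic loss, $\hat\mu = \arg\min_{\mu} \sum_i (y_i - \mu)^2$. Quantile $\tau$ optimizes linear (asymmetric) absolute loss, $\hat q_\tau = \arg\min_{q} \sum_i \rho_\tau(y_i - q)$ with $\rho_\tau(u) = u\,(\tau - I(u < 0))$, where $I(\cdot)$ is the 0/1 indicator function.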

Quantile Regression, Absolute Loss vs. Quadratic Loss: Quadratic loss penalizes large errors very heavily. When p = .5 (p being the quantile level) our best predictor is the median; it does not give as much weight to outliers. When p = .7 the loss is asymmetric; large positive errors are more heavily penalized than negative errors.
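The link between the loss function and the resulting predictor can be checked numerically. The sketch below searches for the constant that minimizes each total loss on a skewed simulated sample and compares it with the sample mean, median, and 70th percentile. The sample, the grid search, and the names `check_loss` and `best_constant` are illustrative assumptions.

```python
import numpy as np

def check_loss(u, p):
    """Asymmetric absolute loss: p*u for u >= 0 and (p - 1)*u for u < 0."""
    return u * (p - (u < 0))

# Skewed simulated sample (hypothetical data)
rng = np.random.default_rng(1)
y = rng.exponential(scale=2.0, size=1001)

def best_constant(loss):
    """Constant c on a fine grid that minimizes the total loss of y - c."""
    grid = np.linspace(y.min(), y.max(), 5001)
    totals = np.array([loss(y - c).sum() for c in grid])
    return grid[np.argmin(totals)]

c_quadratic = best_constant(lambda u: u**2)
c_median = best_constant(lambda u: check_loss(u, 0.5))
c_q70 = best_constant(lambda u: check_loss(u, 0.7))

print("quadratic loss minimizer:", c_quadratic, " sample mean:  ", y.mean())
print("p = .5 loss minimizer:   ", c_median, " sample median:", np.median(y))
print("p = .7 loss minimizer:   ", c_q70, " 70th pctile:  ", np.quantile(y, 0.7))
```

Each minimizer should agree with the corresponding sample statistic up to the grid spacing. Because the sample is right-skewed, the quadratic-loss minimizer (the mean) lies above the median.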

Simple Linear Regression, Food Expenditure vs Income: Engel's 1857 survey of 235 Belgian households, fitted over a range of quantiles. Does the slope change at different quantiles?
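Whether the slope changes across quantiles can be examined by fitting a separate quantile regression at each quantile level. The sketch below does this by writing the asymmetric absolute loss as a linear program and solving it with SciPy's `linprog`. It uses simulated data with spread that grows with income, not Engel's actual survey, and the function name `quantile_regression` and all parameter values are assumptions for illustration.

```python
import numpy as np
from scipy.optimize import linprog

def quantile_regression(X, y, tau):
    """Linear quantile regression as a linear program.

    Minimizes tau * sum(u_plus) + (1 - tau) * sum(u_minus)
    subject to X @ beta + u_plus - u_minus = y, with u_plus, u_minus >= 0.
    """
    n, k = X.shape
    c = np.concatenate([np.zeros(k), tau * np.ones(n), (1 - tau) * np.ones(n)])
    A_eq = np.hstack([X, np.eye(n), -np.eye(n)])
    # Coefficients are unrestricted; residual parts are non-negative
    bounds = [(None, None)] * k + [(0, None)] * (2 * n)
    res = linprog(c, A_eq=A_eq, b_eq=y, bounds=bounds, method="highs")
    return res.x[:k]

# Simulated stand-in for income/expenditure data (NOT Engel's survey):
# the spread of expenditure grows with income
rng = np.random.default_rng(5)
n = 300
income = rng.uniform(1.0, 10.0, n)
expenditure = 1.0 + 0.5 * income + 0.2 * income * rng.normal(size=n)
X = np.column_stack([np.ones(n), income])

slopes = {tau: quantile_regression(X, expenditure, tau)[1]
          for tau in (0.1, 0.5, 0.9)}
for tau, b in slopes.items():
    print(f"tau = {tau}: slope = {b:.3f}")
```

With heteroskedastic data of this kind the fitted slope should increase with the quantile level, which is the pattern the slide's question points to. Under homoskedastic errors the quantile lines would be roughly parallel.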

Bootstrapping: When distributional normality and homoskedasticity assumptions are violated, many researchers resort to nonparametric bootstrapping methods.

Bootstrap Confidence Limits
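One common nonparametric scheme is the pairs bootstrap with percentile confidence limits: resample (x, y) pairs with replacement, refit the regression each time, and read the limits off the empirical distribution of the refitted slope. The sketch below assumes that scheme; the slides do not specify which bootstrap variant is meant, and the simulated data, the number of resamples `B`, and the helper name `slope` are hypothetical.

```python
import numpy as np

def slope(x, y):
    """OLS slope of y on x (with an intercept)."""
    X = np.column_stack([np.ones_like(x), x])
    return np.linalg.lstsq(X, y, rcond=None)[0][1]

# Simulated data (hypothetical): heavy-tailed, heteroskedastic errors
rng = np.random.default_rng(7)
n = 150
x = rng.uniform(1.0, 10.0, n)
y = 1.0 + 2.0 * x + 0.3 * x * rng.standard_t(df=4, size=n)

slope_hat = slope(x, y)

# Pairs bootstrap: resample whole observations with replacement
B = 2000
boot = np.empty(B)
for b in range(B):
    idx = rng.integers(0, n, size=n)
    boot[b] = slope(x[idx], y[idx])

# Percentile confidence limits at the 95% level
lower, upper = np.percentile(boot, [2.5, 97.5])
print(f"slope estimate: {slope_hat:.3f}")
print(f"95% bootstrap percentile limits: [{lower:.3f}, {upper:.3f}]")
```

Resampling pairs rather than residuals avoids assuming a common error variance, which is why it suits the heteroskedastic case. Other variants, such as the wild bootstrap or bias-corrected intervals, may be preferable in some settings.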