# Lecture 20 Preview: Omitted Variables and the Instrumental Variables Estimation Procedure The Ordinary Least Squares Estimation Procedure, Omitted Explanatory.

## Presentation on theme: "Lecture 20 Preview: Omitted Variables and the Instrumental Variables Estimation Procedure The Ordinary Least Squares Estimation Procedure, Omitted Explanatory."— Presentation transcript:

Lecture 20 Preview: Omitted Variables and the Instrumental Variables Estimation Procedure The Ordinary Least Squares Estimation Procedure, Omitted Explanatory Variable Bias, and Consistency Revisit Omitted Explanatory Variable Bias Review of Our Previous Explanation of Omitted Explanatory Variable Bias Omitted Explanatory Variable Bias and the Explanatory Variable/Error Term Independence Premise Instrumental Variable (IV) Estimation Procedure: A Two Regression Procedure Omitted Explanatory Variables Example: 2008 Presidential Election Justifying the Instrumental variable (IV) Estimation Procedure The Mechanics The Two “Good” Instrument Conditions Instrument Variable (IV) Application: 2008 Presidential Election The Mechanics The Two “Good” Instrument Conditions Revisited

x1 t Up  y t Up  Direct Effect  x1 > 0  x2 > 0  x1 and x2 positively correlated  Typically, x2 t Up  y t Up  Proxy Effect The direct effect is what we want the coefficient estimate to capture, the influence that x1 t has on y t. Review: Omitting an explanatory variable from a regression will bias the OLS estimation procedure whenever two conditions are met. Bias results if the omitted explanatory variable: influences the dependent variable; is correlated with an included explanatory variable. When these two conditions are met, the coefficient estimate of the included variable is a composite of two effects; the estimate of the included variable reflects the influence that the Direct Effect: the effect that the included explanatory variable actually has on the dependent variable. This is the effect we want regression analysis to isolate. Proxy Effect: the effect that the omitted explanatory variable has on the dependent variable because the included variable is acting as a proxy for the omitted explanatory variable. Revisit Omitted Variables Phenomenon x1 t and x2 t are positively correlated. To illustrate the effect of omitted variables, assume that  x1 > 0 and  x2 > 0 Model: y t =  Const +  x1 x1 t +  x2 x2 t + e t Question: What happens when we omit the explanatory variable x2 from the regression? Proxy effect causes upward bias. We will now see how we can approach the omitted variable phenomenon in a different way. Effect of omitting the explanatory variable x2: Intuitive Approach

Review: The Ordinary Least Squares (OLS) Estimation Procedure, Correlation between an Explanatory Variable and the Error Term, and Bias Explanatory Variable and Error Term Are Positively Correlated  Estimated Equation More Steeply Sloped Than Actual Equation  OLS Estimation Procedure for Coefficient Value Is Biased Up Explanatory Variable and Error Term Are Negatively Correlated  Estimated Equation Less Steeply Sloped Than Actual Equation  OLS Estimation Procedure for Coefficient Value Is Biased Down Esty = b Const + b x x

Effect of omitting the explanatory variable x2: Error term approach x1 t Up  t =  x2 x2 t + e t and  x2 > 0  x1 and x2 positively correlated  Typically, x2 t Up   t Up x1 t and x2 t are positively correlated.  x1 > 0 and  x2 > 0 y t =  Const +  x1 x1 t +  x2 x2 t + e t =  Const +  x1 x1 t + (  x2 x2 t + e t ) =  Const +  x1 x1 t +  t when x2 t is omitted, the error term becomes:  t =  x2 x2 t + e t  OLS estimation procedure for the value of x1’s coefficient is biased upward x1 t is positively correlated with  t Question: Several lectures ago, we showed that the ordinary least squares (OLS) estimation procedure for the included coefficient’s value is biased, but might the ordinary least squares (OLS) estimation procedure be consistent?

Review: Unbiased and Consistent Estimation Procedures Unbiased: Small Sample Property. The estimation procedure does not systematically underestimate or overestimate the actual value. Consistent: Large Sample Property. Both the mean and variance of the estimate’s probability distribution are important for consistency: Formally, the mean of the estimate’s probability distribution equals the actual value. Mean[Est] = Actual Value When the estimate’s probability distribution is symmetric, the chances that the estimate is greater than the actual value equal the chances that it is less. Mean of the estimate’s probability distribution: The estimation procedure is unbiased: Mean[Est] = Actual Value The estimation procedure is biased, but the magnitude of the bias diminishes as the sample size becomes larger. or Variance of the estimate’s probability distribution: The variance diminishes as the sample size becomes larger Formally, as the sample size approaches infinity the variance approaches 0: As Sample Size   : Variance[Est]  0 Formally, as the sample size approaches infinity the mean approaches the actual value: As Sample Size   : Mean[Est]  Actual Value Unbiasedness is called a small sample property because it does not depend on the sample size. Unbiasedness depends only of the mean of the estimate’s probability distribution. Either

 Lab 20.1 Question: Might the ordinary least squares (OLS) estimation procedure be consistent? EstimationCorrActualSampleActualMean ofMagnitudeVariance of ProcedureX1&X2Coef2SizeCoefCoef1 Estsof BiasCoef1 Ests OLS.604.0502.0 OLS.604.01002.0 OLS.604.01502.0  4.4  2.4  6.6  4.4  2.4  3.2  4.4  2.4  2.1 Question: Is the OLS estimation procedure unbiased?No. Question: Is the OLS estimation procedure consistent?No. Summary: When an explanatory variable is omitted that influences the dependent variable; is correlated with an included explanatory variable. the ordinary least squares (OLS) estimation procedure for the value of the included variable’s coefficient is: biased. not consistent.

The Instrumental Variable (IV) Estimation Procedure y t =  Const +  x x t +  t When x t and  t are correlated  x t is a “problem” explanatory variable “Problem” Explanatory Variable: x t is the “problem” explanatory variable: The explanatory variable, x t, is correlated with the error term,  t. Consequently, the explanatory variable/error term independence premise is violated. The ordinary least squares (OLS) estimation procedure for the coefficient value is biased. Question: Why is there a problem?

Addressing the “Problem” Explanatory Variable Using Instrumental Variables Choose an Instrument: A “good” instrument, z t, must meet two conditions. The instrument, z t, must be: Correlated with the “problem” explanatory variable, x t. Instrumental Variables (IV) Regression 1: Use the instrument, z t, to provide an “estimate” of the problem explanatory variable, x t. Dependent Variable: “Problem” explanatory variable, x t. Explanatory Variable: Instrument, z t. Instrumental Variables (IV) Regression 2:In the original model, replace the “problem” explanatory variable, x t, with its surrogate, Estx t, the estimate of the “problem” explanatory variable provided by the instrument, z t, from IV Regression 1. Dependent Variable: Original dependent variable, y t. Explanatory Variable: Estimate of the “problem” explanatory variable based on the results from IV Regression 1, Estx t. Uncorrelated with the error term,  t. Estimate of the problem explanatory variable: Estx t = a Const + a z z t where a Const and a z are the estimates of the constant and coefficient in this regression, IV Regression 1.

Justifying the Instrument Conditions Instrument/”Problem” Explanatory Variable Correlation: The instrument, z t, must be correlated with the “problem” explanatory variable, x t. Estx t = a Const + a z z t We are using the instrument to create a surrogate for the “problem” explanatory variable: Use the instrument, z t, to provide an estimate of the “problem” explanatory variable, x t. The estimate, Estx t, will be a “good” surrogate only if the instrument is correlated with the problem explanatory variable. This will occur only if the instrument, z t, is correlated with the “problem” explanatory variable, x t. Dependent Variable: “Problem” Explanatory Variable, x t. Explanatory Variable: Instrument, z t Focus on Instrumental Variables (IV) Regression 1:

Original Model: Replace the “problem” explanatory variable with its surrogate To avoid violating the explanatory variable/error term independence premise z t and  t must be independent. Estx t = a Const + a z z t y t =  Const +  x x t +  t y t =  Const +  x Estx t +  t  Estx t and  t must be independent Instrument/Error Term Independence: The instrument, z t, must be independent of the error term,  t. Focus on Instrumental Variables (IV) Regression 2 In the original model, replace the “problem” explanatory variable, x t, with its surrogate, Estx t, the estimate of the “problem” explanatory variable provided by the instrument, z t, from IV Regression 1.

Instrumental Variable Example: 2008 Presidential Election AdvDeg t Percent adults who have advanced degrees in state t Coll t Percent adults who graduated from college in state t HS t Percent adults who graduated from high school in state t PopDen t Population density of state t (persons per square mile) VoteDemPartyTwo t Percent of the vote received in 2008 received by the Democratic party in state t based on the two major parties (percent)  EViews Population Density Theory: States with high population densities have large urban areas which are more likely to vote for the Democratic candidate, Obama. “Liberalness” Theory: Since the Democratic party is more liberal than the Republican party, a high “liberalness” value would increase the vote of the Democratic candidate, Obama. Model: VoteDemPartyTwo t =  Const +  PopDen PopDen t +  Lib Liberal + e t  PopDen > 0  Lib > 0 But we have no data for a state’s “liberalness”; hence, we have no choice – we must omit it. Ordinary Least Squares (OLS) Dependent Variable: VoteDemPartyTwo Explanatory Variable(s):EstimateSEt-StatisticProb PopDen 0.0049150.0009575.1373380.0000 Const 50.356211.33414137.744310.0000 Number of Observations51

Model: VoteDemPartyTwo t =  Const +  PopDen PopDen t +  Lib Liberal + e t Question: Might there be a serious econometric problem when Liberal is omitted? =  Const +  PopDen PopDen t + (  Lib Liberal + e t ) =  Const +  PopDen PopDen t +  t where  t =  Lib Liberal + e t PopDen t Up  t =  Lib Liberal + e t and  Lib > 0 PopDen t and Liberal t positively correlated Typically, Liberal t Up  t Up  OLS estimation procedure for PopDen’s coefficient is biased PopDen t is positively correlated with  t PopDen t is the “problem” explanatory variable. Summary: When Liberal is omitted PopDen is a “problem” explanatory variable. It is correlated with the error term,  t. The explanatory variable/error term independence premise is violated.

Applying the Instrumental Variable (IV) Approach Choose an Instrument: We shall choose Coll t as our instrument. Instrumental Variables (IV) Regression 1  EViews uncorrelated with the error term,  t =  Lib Liberal t + e t. Consequently, we believe that the instrument, Coll t, is uncorrelated with the omitted variable, Liberal t. By doing so, we believe that it satisfies the two “good” instrument conditions; that is, we believe that the percent of college graduates, Coll t, is: positively correlated with the “problem” explanatory variable, PopDen t Dependent variable: “Problem” explanatory variable, PopDen Explanatory variable: Instrument, Coll. Estimated Equation: EstPopDen =  3,676.3 + 149.3Coll Ordinary Least Squares (OLS) Dependent Variable: PopDen Explanatory Variable(s):EstimateSEt-StatisticProb Coll 149.337528.215135.2928160.0000 Const  3676.281 781.0829  4.706647 0.0000 Number of Observations51

Ordinary Least Squares (OLS) Dependent Variable: VoteDemPartyTwo Explanatory Variable(s):EstimateSEt-StatisticProb EstPopDen 0.0099550.0013607.3171520.0000 Const 48.462471.21464339.898520.0000 Number of Observations51 Instrumental Variables (IV) Regression 2 Dependent Variable: Original dependent variable, VoteDemPartyTwo. Explanatory Variable: Estimate of the “problem” explanatory variable based on the results from Regression 1, EstPopDen. EstPopDen =  3,676.3 + 149.3Coll  EViews Estimated Equation: VoteDemPartyTwo = 48.5 +.01EstPopDen

Ordinary Least Squares (OLS) Dependent Variable: PopDen Explanatory Variable(s):EstimateSEt-StatisticProb Coll 149.337528.215135.2928160.0000 Const  3676.281 781.0829  4.706647 0.0000 Number of Observations51 Correlation Matrix CollPopDen Coll 1.0000000.603118 PopDen0.603118 1.000000 Good Instrument Conditions Revisited Instrument/”Problem” Explanatory Variable Correlation: The instrument, Coll t, must be correlated with the “problem” explanatory variable, PopDen t. Their correlation coefficient equals.60 The coefficient estimate is significant at the 1 percent level. Consequently, it is not unreasonable to judge that the instrument meets the first condition. Instrumental Variables (IV) Regression 1

Instrument/Error Term Independence: The instrument, Coll t, and the error term,  t, must be independent. VoteDemPartyTwo t =  Const +  PopDen EstPopDen t +  t Question: Are EstPopDen t and  t independent? EstPopDen =  3,676.3 + 149.3Coll  t =  Lib Liberal + e t Answer: Only if Coll t and Liberal t are independent. The explanatory variable/error term independence premise will be satisfied only if the instrument, Coll t, and the omitted variable, Liberal t, are independent. Unfortunately, there is no way to confirm this empirically with our data. This can be the “Achilles heel” of the instrumental variable (IV) estimation procedure.

Justifying the Instrumental Variable (IV) Approach: A Simulation Model: y t =  Const +  x1 x1 t +  x2 x2 t + e t Defaults The Only X1 button is selected. Hence, second explanatory variable, x2, is omitted. y t =  Const +  x1 x1 t + (  x2 x2 t + e t ) =  Const +  x1 x1 t +  t where  t =  x2 x2 t + e t The actual coefficient for the omitted explanatory variable, x2 t, equals 4. In the CorrX1&Z list.50 is selected. Hence, the included explanatory variable and the instrument are correlated. In the CorrX2&Z list.00 is selected. Hence, the instrument and the omitted variable are uncorrelated; consequently, the instrument and the error term are independent. In the CorrX1&X2 list.60 is selected. Hence, the included and omitted explanatory variables are correlated. Corr X1&Z.50.75 Corr X2&Z.00.10 Corr X1&X2.00.30.60

IV.50.10.60502.0 IV.50.10.601002.0 IV.50.10.601502.0 IV.50.00.60502.0 IV.50.00.601002.0 IV.50.00.601502.0  1.82 .18  32.8  1.89 .11  13.7  1.95 .05  9.2 Question: Is the IV estimation procedure: Unbiased?Consistent?No.Yes.  1.97 .03  3.9 Question: Is the instrument better?Yes. Both the magnitude of the bias and the variance of the estimates is less.  2.67 .67  31.1  2.70 .70  13.8  2.73 .73  8.7 Question: Is the IV estimation procedure: Unbiased?No.Consistent?No. EstimationCorrelation CoefficientsSampleActualMean ofMagnitudeVar of ProcedureX1&ZX2&ZX1&X2SizeCoef1Coef1 Estsof BiasCoef1 Ests IV.75.00.601502.0 EstimationCorrelation CoefficientsSampleActualMean ofMagnitudeVar of ProcedureX1&ZX2&ZX1&X2SizeCoef1Coef1 Estsof BiasCoef1 Ests Instrument/Error Term Independence: Suppose that the instrument is correlated with the error term? Instrument/”Problem” Explanatory Variable Correlation: Suppose that the instrument is more highly correlated with the “problem” explanatory variable:  Lab 20.2  Lab 20.3  Lab 20.2

Download ppt "Lecture 20 Preview: Omitted Variables and the Instrumental Variables Estimation Procedure The Ordinary Least Squares Estimation Procedure, Omitted Explanatory."

Similar presentations