1 Research Method Lecture 11-3 (Ch15): Instrumental Variables Estimation and Two Stage Least Squares


2 IV solution to errors-in-variables problems: Example 1
• Consider the following model:
  $Y = \beta_0 + \beta_1 x_1^* + \beta_2 x_2 + u$
  where $x_1^*$ is the correctly measured variable. Suppose, however, that you only observe the error-ridden variable $x_1 = x_1^* + e_1$. Substituting $x_1^* = x_1 - e_1$, the model actually estimated becomes
  $Y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + (u - \beta_1 e_1)$
  Because $x_1$ contains $e_1$, it is correlated with the composite error, so the OLS estimate of $\beta_1$ is biased. This is the errors-in-variables bias.

3
• The errors-in-variables bias cannot be corrected with panel data methods, but the IV method can solve the problem.
• Suppose that you have another measure of $x_1^*$. Call it $z_1$. For example, let $x_1^*$ be the husband's annual salary and $x_1$ the annual salary reported by the husband, which is reported with error. Sometimes the survey also asks the wife to report her husband's annual salary. Then $z_1$ is the husband's annual salary as reported by the wife.

4
• In this case, $z_1 = x_1^* + a_1$, where $a_1$ is the measurement error.
• Although $z_1$ is measured with error, it can serve as an instrument for $x_1$. Why? First, $x_1$ and $z_1$ should be correlated, since both measure $x_1^*$. Second, since $e_1$ and $a_1$ are just measurement errors, they are unlikely to be correlated with each other, which means that $z_1$ is uncorrelated with the error term $(u - \beta_1 e_1)$ (assuming $a_1$ is also unrelated to $u$) in
  $Y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + (u - \beta_1 e_1)$
• So 2SLS with $z_1$ as an instrument can eliminate this bias.
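The argument above can be checked with a small simulation. This is only a sketch under stated assumptions: the data are simulated (not the salary data from the example), $x_2$ is dropped to keep the formulas short, all errors are independent standard normals, and the names `simulate`, `cov`, `ols_slope` and `iv_slope` are made up for illustration. With a single regressor, the IV estimator reduces to a ratio of covariances.

```python
import random
import statistics


def cov(a, b):
    """Sample covariance of two equal-length lists."""
    ma, mb = statistics.fmean(a), statistics.fmean(b)
    return sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b)) / (len(a) - 1)


def simulate(n=20000, beta1=2.0, seed=0):
    """Return (OLS slope, IV slope) when x1 is a noisy measure of x1_star."""
    rng = random.Random(seed)
    x_star = [rng.gauss(0, 1) for _ in range(n)]          # true regressor (unobserved)
    x1 = [xs + rng.gauss(0, 1) for xs in x_star]          # first noisy measure: x1* + e1
    z1 = [xs + rng.gauss(0, 1) for xs in x_star]          # second noisy measure: x1* + a1
    y = [1.0 + beta1 * xs + rng.gauss(0, 1) for xs in x_star]
    ols = cov(x1, y) / cov(x1, x1)   # OLS of y on the mismeasured x1
    iv = cov(z1, y) / cov(z1, x1)    # IV estimator using z1 as the instrument for x1
    return ols, iv


ols_slope, iv_slope = simulate()
print(f"true beta1 = 2.0, OLS = {ols_slope:.3f}, IV = {iv_slope:.3f}")
```

In this setup the measurement error has the same variance as $x_1^*$, so OLS should be pulled roughly halfway towards zero, while the IV estimate should stay close to the true value.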

5 IV solution to errors-in-variables problems: Example 2
• This is a more complicated example. Consider the following model [equation not preserved in the transcript], where we have the unobserved ability problem.
• Suppose that you have two test scores that are indicators of ability:
  $test_1 = \gamma\, abil + e_1$
  $test_2 = \delta\, abil + e_2$

6
• If you use $test_1$ as the proxy variable for ability, you have the following model [equation not preserved in the transcript]. Thus, $test_1$ is correlated with the error term: it has the errors-in-variables problem. In this case, a simple plug-in solution does not work.
• However, since you have $test_2$, another measure of $abil$, you can use $test_2$ as an instrument for $test_1$ in the 2SLS procedure to eliminate the bias.
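As a sketch of why the plug-in solution fails, assume for illustration that the wage equation has the simple form shown in the first line below. This specific form is an assumption, since the equation on the slide is not preserved here. Solving $test_1 = \gamma\, abil + e_1$ for $abil$ and substituting gives:

$$
\begin{aligned}
\log(wage) &= \beta_0 + \beta_1\, educ + abil + u \\
abil &= \frac{test_1 - e_1}{\gamma} \\
\log(wage) &= \beta_0 + \beta_1\, educ + \frac{1}{\gamma}\, test_1 + \left(u - \frac{1}{\gamma}\, e_1\right)
\end{aligned}
$$

Since $test_1$ contains $e_1$, it is correlated with the composite error $(u - e_1/\gamma)$. The second score $test_2 = \delta\, abil + e_2$ is correlated with $test_1$ through $abil$, and it is uncorrelated with the composite error as long as $e_2$ is uncorrelated with both $e_1$ and $u$.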

7 Exercise
• Using WAGE2.dta, consider a log-wage regression with explanatory variables educ, exper, tenure, married, south, urban, and black. Using IQ and KWW (knowledge of the world of work) as two measures of unobserved ability, estimate a model that corrects for the bias in educ.

8 OLS

9 Simple plug-in solution
Plug-in + IV using KWW as the instrument for IQ

10 2SLS with heteroskedasticity
• When heteroskedasticity is present, we have to modify the standard error formula.
• The derivation of the formula is beyond the scope of this class. However, Stata computes it automatically: just use the robust option.

11 Testing overidentifying restrictions
• Usually, instrument exogeneity cannot be tested.
• However, when you have extra instruments (more instruments than you need), you can test it to some extent. This is the test of overidentifying restrictions.

12 The basic idea behind the test of overidentifying restrictions
• Before presenting the procedure, I will give you the basic idea of the test.
• Consider the following model:
  $y_1 = \beta_0 + \beta_1 y_2 + \beta_2 z_1 + \beta_3 z_2 + u_1$
• Suppose you have two instruments for $y_2$: $z_3$ and $z_4$. If both are valid instruments, using either $z_3$ or $z_4$ as the instrument will produce consistent estimates.
• Let $\hat{\beta}_1^{(3)}$ be the IV estimator of $\beta_1$ when $z_3$ is used as the instrument, and let $\hat{\beta}_1^{(4)}$ be the IV estimator when $z_4$ is used as the instrument.

13
• The idea is to check whether $\hat{\beta}_1^{(3)}$ and $\hat{\beta}_1^{(4)}$ are similar. That is, you test the null $H_0$ that the two estimators converge to the same value, so that their difference is zero.
• If you reject this null, it means that either $z_3$ or $z_4$, or both of them, are not exogenous. We do not know which one is not exogenous. So rejection of the null typically means that your choice of instruments is invalid.

14
• On the other hand, if you fail to reject the null hypothesis, you can have some confidence in the overall set of instruments used.
• However, caution is necessary. Even if you fail to reject the null, this does not always mean that the set of instruments is valid.
• For example, consider a wage regression with education as the endogenous variable, and suppose you have mother's and father's education as instruments.

15
• Even if mother's and father's education do not satisfy instrument exogeneity, the difference between the two IV estimates may be very close to zero, since the biases go in the same direction. In this case, even though they are invalid instruments, we may fail to reject the null (i.e., erroneously judge that they satisfy instrument exogeneity).

16 The procedure of the test of overidentifying restrictions
The procedure:
(i) Estimate the structural equation by 2SLS and obtain the 2SLS residuals, $\hat{u}_1$.
(ii) Regress $\hat{u}_1$ on all exogenous variables. Obtain the R-squared, say $R_1^2$.
(iii) Under the null that all IVs are uncorrelated with the structural error $u_1$, $nR_1^2 \overset{a}{\sim} \chi^2_q$, where $q$ is the number of extra instruments.
If you fail to reject the null (i.e., if $nR_1^2$ is small), then you have some confidence in instrument exogeneity. If you reject it, at least some of the instruments are not exogenous.
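The three steps can be sketched in plain Python on simulated data. Everything here is an illustrative assumption rather than the lecture's own code: the helper names (`ols`, `fitted`, `sargan_stat`, `simulate`) are made up, the structural equation contains only a constant and $y_2$ (no $z_1$, $z_2$), and `gamma` controls whether $z_4$ enters the structural error directly (`gamma = 0` means both instruments are valid).

```python
import random


def ols(X, y):
    """OLS coefficients from the normal equations, solved by Gauss-Jordan elimination."""
    k = len(X[0])
    A = [[sum(r[i] * r[j] for r in X) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(X, y))] for i in range(k)]
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(A[r][c]))  # partial pivoting
        A[c], A[p] = A[p], A[c]
        for r in range(k):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    return [A[i][k] / A[i][i] for i in range(k)]


def fitted(X, b):
    """Fitted values X b."""
    return [sum(bi * xi for bi, xi in zip(b, row)) for row in X]


def sargan_stat(y1, y2, z3, z4):
    """n * R-squared from regressing the 2SLS residuals on all exogenous variables."""
    n = len(y1)
    Z = [[1.0, a, b] for a, b in zip(z3, z4)]            # constant plus both instruments
    y2_hat = fitted(Z, ols(Z, y2))                       # first stage
    b = ols([[1.0, h] for h in y2_hat], y1)              # second stage
    u_hat = [yi - b[0] - b[1] * xi for yi, xi in zip(y1, y2)]  # (i) residuals use actual y2
    e = [u - f for u, f in zip(u_hat, fitted(Z, ols(Z, u_hat)))]  # (ii) auxiliary regression
    mean_u = sum(u_hat) / n
    r2 = 1.0 - sum(v * v for v in e) / sum((u - mean_u) ** 2 for u in u_hat)
    return n * r2                                        # (iii) compare with chi-square(q)


def simulate(gamma, n=5000, seed=1):
    rng = random.Random(seed)
    z3 = [rng.gauss(0, 1) for _ in range(n)]
    z4 = [rng.gauss(0, 1) for _ in range(n)]
    v2 = [rng.gauss(0, 1) for _ in range(n)]
    y2 = [a + b + v for a, b, v in zip(z3, z4, v2)]
    u1 = [0.5 * v + gamma * b + rng.gauss(0, 1) for v, b in zip(v2, z4)]  # y2 is endogenous
    y1 = [1.0 + 2.0 * x + u for x, u in zip(y2, u1)]
    return sargan_stat(y1, y2, z3, z4)


stat_valid = simulate(gamma=0.0)     # both instruments exogenous
stat_invalid = simulate(gamma=0.5)   # z4 is correlated with the structural error
print(f"valid instruments: {stat_valid:.2f}, invalid z4: {stat_invalid:.2f}")
```

Here there is one extra instrument, so each statistic would be compared with the $\chi^2(1)$ cutoff. With an invalid $z_4$ and this sample size, the statistic should be far above it.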

17
• The $nR_1^2$ statistic is valid when the homoskedasticity assumption holds. The $nR_1^2$ statistic is also called Sargan's statistic.
• When heteroskedasticity is present, we have to use another statistic, called Hansen's J statistic.
• Both tests can be done automatically in Stata.

18 Exercise
• Consider the following model:
  $\log(wage) = \beta_0 + \beta_1 educ + \beta_2 exper + \beta_3 exper^2 + u$
1. Using Mroz.dta, estimate the above equation using motheduc and fatheduc as instruments for educ.
2. Test the overidentifying restrictions.

19 Answer 1: OLS, 2SLS

20 Answer 2
• First, conduct the test manually.
1. First, estimate 2SLS.
2. Second, generate the 2SLS residual. Call this uhat.

21
3. Third, regress uhat on all the exogenous variables. Don't forget to include the exogenous variables in the structural equation: exper and expersq.
4. Fourth, get the R-squared from this regression. You can use the reported value, but it is rounded. To compute it more precisely, type this [command not preserved in the transcript].
5. Finally, compute $nR^2$. This is the $nR^2$ statistic, also called Sargan's statistic.

22
• The $nR^2$ statistic follows $\chi^2(1)$. The degrees of freedom equal the number of extra instruments, which is 1 in our case. (In our model there is only one endogenous variable, so you need only one instrument. But we have two instruments, so the number of extra instruments is 1.)
• Since the 5% cutoff point for $\chi^2(1)$ is 3.84, we fail to reject the null hypothesis that the exogenous variables are not correlated with the structural error.
• Thus, we have some confidence in the choice of instruments. In other words, our instruments have passed the test of overidentifying restrictions.
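The 3.84 cutoff can be reproduced with the Python standard library alone. This relies on the fact that a chi-square variable with one degree of freedom is the square of a standard normal variable, so the shortcut below applies only when there is exactly one extra instrument. The variable names are illustrative.

```python
from statistics import NormalDist

# For one degree of freedom, P(chi2 > c) = P(|Z| > sqrt(c)) for a standard normal Z,
# so the 5% chi-square(1) cutoff is the square of the two-sided 5% normal cutoff.
z_cutoff = NormalDist().inv_cdf(0.975)
chi2_cutoff = z_cutoff ** 2
print(round(z_cutoff, 2), round(chi2_cutoff, 2))  # prints: 1.96 3.84
```

You reject the null at the 5% level only if the $nR^2$ statistic exceeds this cutoff.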

23
• Now, let us conduct the test of overidentifying restrictions automatically.
This is the $nR^2$ statistic. It is also called Sargan's statistic.

24
• The heteroskedasticity-robust version can also be done automatically.
Use the robust option when estimating 2SLS. Then type the same command. The heteroskedasticity-robust version is called Hansen's J statistic.

25 Note
• Even if you fail to reject the null hypothesis in the test, there is a possibility that your instruments are still invalid.
• Thus, even if your instruments pass the test, in general you should try to provide a plausible story for why your instruments satisfy instrument exogeneity. (Quarter of birth is a good example.)

26 Testing the endogeneity
• Consider again the following model:
  $y_1 = \beta_0 + \beta_1 y_2 + \beta_2 z_1 + \beta_3 z_2 + u_1$
  where $y_2$ is the suspected endogenous variable and you have instruments $z_3$ and $z_4$.
• If $y_2$ is actually exogenous, OLS is better, since it is more efficient than 2SLS. If you have valid instruments, you can test whether $y_2$ is exogenous or not.

27
• Before laying out the procedure, let us understand the basic idea behind the test.
  Structural eq.: $y_1 = \beta_0 + \beta_1 y_2 + \beta_2 z_1 + \beta_3 z_2 + u_1$
  Reduced form eq.: $y_2 = \pi_0 + \pi_1 z_1 + \pi_2 z_2 + \pi_3 z_3 + \pi_4 z_4 + v_2$
• You can check that $y_2$ is correlated with $u_1$ only if $v_2$ is correlated with $u_1$, since the $z$'s are exogenous.
• Further, let $u_1 = \delta v_2 + e_1$. Then $u_1$ and $v_2$ are correlated only if $\delta \neq 0$. Thus, consider
  $y_1 = \beta_0 + \beta_1 y_2 + \beta_2 z_1 + \beta_3 z_2 + \delta v_2 + e_1$
  and test whether $\delta$ is zero or not.

28 The test of endogeneity: procedure
(i) Estimate the reduced form equation using OLS:
  $y_2 = \pi_0 + \pi_1 z_1 + \pi_2 z_2 + \pi_3 z_3 + \pi_4 z_4 + v_2$
  Then obtain the residual, $\hat{v}_2$.
(ii) Add $\hat{v}_2$ to the structural equation and estimate using OLS:
  $y_1 = \beta_0 + \beta_1 y_2 + \beta_2 z_1 + \beta_3 z_2 + \alpha \hat{v}_2 + e_1$
  Then test $H_0: \alpha = 0$. If we reject $H_0$, we conclude that $y_2$ is endogenous because $u_1$ and $v_2$ are correlated.
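The two steps can be sketched as follows on simulated data. This is an illustration under assumptions, not the lecture's code: the helper names (`ols_with_se`, `endogeneity_t`, `simulate`) are made up, the structural equation contains only a constant and $y_2$, and `rho` sets how strongly the structural error depends on $v_2$ (`rho = 0` means $y_2$ is exogenous). The test statistic is the usual OLS t statistic on the first-stage residual.

```python
import random


def ols_with_se(X, y):
    """Return (coefficients, standard errors, residuals) for OLS of y on the rows of X."""
    n, k = len(X), len(X[0])
    # Invert X'X by Gauss-Jordan elimination on the augmented matrix [X'X | I].
    A = [[sum(r[i] * r[j] for r in X) for j in range(k)]
         + [1.0 if i == j else 0.0 for j in range(k)] for i in range(k)]
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(A[r][c]))  # partial pivoting
        A[c], A[p] = A[p], A[c]
        piv = A[c][c]
        A[c] = [a / piv for a in A[c]]
        for r in range(k):
            if r != c:
                f = A[r][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    inv = [row[k:] for row in A]
    xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]
    beta = [sum(inv[i][j] * xty[j] for j in range(k)) for i in range(k)]
    resid = [yi - sum(b * x for b, x in zip(beta, r)) for r, yi in zip(X, y)]
    s2 = sum(e * e for e in resid) / (n - k)              # homoskedastic error variance
    se = [(s2 * inv[i][i]) ** 0.5 for i in range(k)]
    return beta, se, resid


def endogeneity_t(y1, y2, z3, z4):
    """t statistic on the first-stage residual added to the structural equation."""
    Z = [[1.0, a, b] for a, b in zip(z3, z4)]
    _, _, v_hat = ols_with_se(Z, y2)                      # (i) reduced form residuals
    X = [[1.0, x, v] for x, v in zip(y2, v_hat)]
    beta, se, _ = ols_with_se(X, y1)                      # (ii) augmented structural equation
    return beta[2] / se[2]


def simulate(rho, n=5000, seed=3):
    rng = random.Random(seed)
    y1, y2, z3, z4 = [], [], [], []
    for _ in range(n):
        a, b, v = rng.gauss(0, 1), rng.gauss(0, 1), rng.gauss(0, 1)
        u = rho * v + rng.gauss(0, 1)                     # structural error
        x = a + b + v                                     # reduced form for y2
        z3.append(a); z4.append(b); y2.append(x); y1.append(1.0 + 2.0 * x + u)
    return endogeneity_t(y1, y2, z3, z4)


t_endog = simulate(rho=0.5)   # y2 endogenous
t_exog = simulate(rho=0.0)    # y2 exogenous
print(f"t (endogenous y2) = {t_endog:.2f}, t (exogenous y2) = {t_exog:.2f}")
```

When $y_2$ is endogenous, the t statistic should be large in absolute value. When it is exogenous, the statistic should behave like a draw from a standard normal distribution.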

29 Exercise
• Consider the following model:
  $\log(wage) = \beta_0 + \beta_1 educ + \beta_2 exper + \beta_3 exper^2 + u$
  Suppose that father's and mother's education satisfy instrument exogeneity. Conduct the Hausman test of endogeneity to check whether educ is exogenous or not.

30 Answer
• First, conduct the test manually.
To use the same observations as 2SLS, run 2SLS once and generate this variable.

31
Now run the reduced form regression, then get the residual. Then check whether the coefficient on the residual is different from zero.

32
• The coefficient on uhat is significant at the 10% level. Thus, you reject the null hypothesis that educ is exogenous (not correlated with the structural error) at the 10% level.
• This is moderate evidence that educ is endogenous, and thus 2SLS should be reported (along with OLS).

33
• Stata can conduct the test of endogeneity automatically. Stata uses a different version of the test.

34
• Note that the test of endogeneity is valid only if the instruments satisfy instrument exogeneity.
• Thus, test the overidentifying restrictions first to check whether the instruments satisfy instrument exogeneity. If the instruments pass the overidentification test, then conduct the test of endogeneity.

35 Applying 2SLS to pooled cross sections
• When you apply 2SLS to pooled cross-section data, there is no new difficulty. You can just apply 2SLS.

36 Combining panel data method and IV method
• Suppose you have two-period panel data. The periods are 1987 and 1988. Consider the following model:
  $\log(scrap_{it}) = \beta_0 + \delta_0 d88_t + \beta_1 hrsemp_{it} + a_i + u_{it}$
  where $scrap$ is the scrap rate and $hrsemp$ is the hours of employee training. You have data

37
• Correlation between $a_i$ and $hrsemp_{it}$ causes a bias in $\beta_1$. In the first-differenced model, we difference to remove $a_i$; that is, we estimate
  $\Delta\log(scrap_{it}) = \delta_0 + \beta_1 \Delta hrsemp_{it} + \Delta u_{it}$ …(1)
  In some cases, $\Delta hrsemp_{it}$ and $\Delta u_{it}$ can still be correlated. For example, when a firm hires more skilled workers, it may reduce job training.

38
• In this case, the quality of the workers is time varying, so it is not contained in $a_i$, but it is contained in $u_{it}$. Worker quality and $\Delta hrsemp_{it}$ are then negatively correlated and, because higher worker quality also lowers the scrap rate, $\Delta hrsemp_{it}$ and $\Delta u_{it}$ may be positively correlated. This would cause the OLS estimate of $\beta_1$ to be biased upward (a bias towards not finding the productivity-enhancing effect of training).
• To eliminate the bias, we can apply the IV method to equation (1).

39
• One possible instrument for $\Delta hrsemp_{it}$ is $\Delta grant_{it}$, where $grant_{it}$ is a variable indicating whether the company received a job training grant. Since the grant designation is given at the beginning of 1988, $\Delta grant_{it}$ may be uncorrelated with $\Delta u_{it}$. At the same time, it would be correlated with $\Delta hrsemp_{it}$. Thus, we can use $\Delta grant_{it}$ as an IV for $\Delta hrsemp_{it}$.
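A simulated two-period panel can illustrate how first differencing plus IV works. This is a sketch under stated assumptions and does not use JTRAIN.dta: the data-generating process, the coefficient values, and the names `simulate_fd`, `fd_ols` and `fd_iv` are all made up. In the simulation, time-varying worker quality lowers both training hours and the scrap rate, no firm has a grant in the first period, and grants in the second period are assigned at random.

```python
import random
import statistics


def cov(a, b):
    """Sample covariance of two equal-length lists."""
    ma, mb = statistics.fmean(a), statistics.fmean(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / (len(a) - 1)


def simulate_fd(n=5000, beta1=-0.3, seed=2):
    """Return (first-difference OLS slope, first-difference IV slope)."""
    rng = random.Random(seed)
    d_scrap, d_hrsemp, d_grant = [], [], []
    for _ in range(n):
        a_i = rng.gauss(0, 1)                               # firm fixed effect
        grant = [0.0, 1.0 if rng.random() < 0.5 else 0.0]   # grants only in period 2
        levels = []
        for t in range(2):
            quality = rng.gauss(0, 1)                       # time-varying worker quality
            hrsemp = 10 + a_i + 5 * grant[t] - 2 * quality + rng.gauss(0, 1)
            lscrap = 1 - 0.1 * t + beta1 * hrsemp + a_i - quality + rng.gauss(0, 0.5)
            levels.append((lscrap, hrsemp))
        d_scrap.append(levels[1][0] - levels[0][0])         # differencing removes a_i
        d_hrsemp.append(levels[1][1] - levels[0][1])
        d_grant.append(grant[1] - grant[0])
    fd_ols = cov(d_hrsemp, d_scrap) / cov(d_hrsemp, d_hrsemp)
    fd_iv = cov(d_grant, d_scrap) / cov(d_grant, d_hrsemp)  # change in grant as instrument
    return fd_ols, fd_iv


fd_ols, fd_iv = simulate_fd()
print(f"true beta1 = -0.3, FD OLS = {fd_ols:.3f}, FD IV = {fd_iv:.3f}")
```

In this setup the first-difference OLS estimate should be much closer to zero than the true effect, while the IV estimate should be close to it.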

40 Exercise
• Using JTRAIN.dta, estimate the following model:
  $\Delta\log(scrap_{it}) = \delta_0 + \beta_1 \Delta hrsemp_{it} + \Delta u_{it}$ …(1)
  Use $\Delta grant_{it}$ as an instrument for $\Delta hrsemp_{it}$. Use the data for 1987 and 1988 only.

41 Answer
• First, estimate it manually.
This is the simple first-differenced model.

42
This is the first-differenced model + IV method.

43
• Now, estimate the model automatically.
First-differenced model + IV method.

