Christopher Dougherty EC220 - Introduction to econometrics (chapter 10) Slideshow: binary choice logit models Original citation: Dougherty, C. (2012) EC220.

Presentation transcript:

Christopher Dougherty EC220 - Introduction to econometrics (chapter 10) Slideshow: binary choice logit models Original citation: Dougherty, C. (2012) EC220 - Introduction to econometrics (chapter 10). [Teaching Resource] © 2012 The Author This version available at: Available in LSE Learning Resources Online: May 2012 This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 License. This license allows the user to remix, tweak, and build upon the work even for commercial purposes, as long as the user credits the author and licenses their new creations under the identical terms

BINARY CHOICE MODELS: LOGIT ANALYSIS

The linear probability model may make the nonsensical prediction that an event will occur with probability greater than 1 or less than 0.

[Figure: linear probability model, with Y, p on the vertical axis and X on the horizontal axis; the line β1 + β2Xi and the points A and B at Xi are marked.]

The usual way of avoiding this problem is to hypothesize that the probability is a sigmoid (S-shaped) function of Z, F(Z), where Z is a function of the explanatory variables.

Several mathematical functions are sigmoid in character. One is the logistic function, $p = F(Z) = \frac{1}{1 + e^{-Z}}$. As Z goes to infinity, $e^{-Z}$ goes to 0 and p goes to 1 (but cannot exceed 1). As Z goes to minus infinity, $e^{-Z}$ goes to infinity and p goes to 0 (but cannot be below 0).
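The limiting behaviour described above can be checked numerically. The following is a minimal sketch, not part of the original slides; the function name `logistic` and the chosen values of Z are assumptions made for illustration.

```python
import math

def logistic(z: float) -> float:
    """Logistic function F(Z) = 1 / (1 + exp(-Z))."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    # For negative Z use the equivalent form exp(Z) / (1 + exp(Z)),
    # which avoids overflow of exp(-Z) when Z is very negative.
    ez = math.exp(z)
    return ez / (1.0 + ez)

# p approaches 0 for large negative Z and 1 for large positive Z.
for z in (-10, -2, 0, 2, 10):
    print(f"Z = {z:>3}: p = {logistic(z):.4f}")
```

The two branches are algebraically identical; the split is only a numerical precaution.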

The model implies that, for values of Z less than -2, the probability of the event occurring is low and insensitive to variations in Z. Likewise, for values greater than 2, the probability is high and insensitive to variations in Z.

To obtain an expression for the sensitivity, we differentiate F(Z) with respect to Z, using the general rule for differentiating a quotient.
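The differentiation can be written out as follows. This is a sketch using the standard quotient rule, taking $F(Z) = 1/(1 + e^{-Z})$ with numerator $u = 1$ and denominator $v = 1 + e^{-Z}$; the symbols $u$ and $v$ are introduced here for illustration only.

```latex
\frac{d}{dZ}\left(\frac{u}{v}\right) = \frac{v\,u' - u\,v'}{v^{2}},
\qquad u = 1,\; u' = 0,\; v = 1 + e^{-Z},\; v' = -e^{-Z}

f(Z) = \frac{dp}{dZ}
     = \frac{(1 + e^{-Z}) \cdot 0 - 1 \cdot (-e^{-Z})}{(1 + e^{-Z})^{2}}
     = \frac{e^{-Z}}{(1 + e^{-Z})^{2}}
```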

The sensitivity, as measured by the slope, is greatest when Z is 0. The marginal function, f(Z), reaches a maximum at this point.
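A quick numerical check of this claim, assuming the marginal function of the logistic model is $f(Z) = e^{-Z}/(1 + e^{-Z})^2$. The function name `logistic_density` and the search grid are illustrative choices, not taken from the slides.

```python
import math

def logistic_density(z: float) -> float:
    """Marginal function f(Z) = exp(-Z) / (1 + exp(-Z))**2."""
    # f is symmetric about Z = 0, so evaluating at |Z| gives the same
    # value and keeps exp() from overflowing for very negative Z.
    e = math.exp(-abs(z))
    return e / (1.0 + e) ** 2

# Search a grid of Z values from -6.0 to 6.0 in steps of 0.1.
grid = [i / 10 for i in range(-60, 61)]
z_star = max(grid, key=logistic_density)

print("Z with the largest slope:", z_star)
print("f at that point:", logistic_density(z_star))
print("f(2), for comparison:", logistic_density(2.0))
```

On this grid the slope peaks at Z = 0, where f(Z) is 0.25, and has fallen to roughly 0.1 by Z = 2 or Z = -2.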

For a nonlinear model of this kind, maximum likelihood estimation is much superior to the use of the least squares principle for estimating the parameters. More details concerning its application are given at the end of this sequence.

We will apply this model to the graduating from high school example described in the linear probability model sequence. We will begin by assuming that ASVABC is the only relevant explanatory variable, so Z is a simple function of it.

The Stata command is logit, followed by the outcome variable and the explanatory variable(s). Maximum likelihood estimation is an iterative process, so the first part of the output will be like that shown.

. logit GRAD ASVABC
[Stata output: iteration log for iterations 0 to 5, followed by the logit estimates (Number of obs = 570) with coefficient rows asvabc and _cons]

. logit GRAD ASVABC
[Stata output: iteration log for iterations 0 to 4; logit estimates with Number of obs = 540, LR chi2(1), Prob > chi2, log likelihood and pseudo R2; coefficient table with rows ASVABC and _cons]

In this case the coefficients of the Z function are as shown.

Since there is only one explanatory variable, we can draw the probability function and marginal effect function as functions of ASVABC.

We see that ASVABC has its greatest effect on graduating when it is below 40, that is, in the lower ability range. Any individual with a score above the average (50) is almost certain to graduate.

The t statistic indicates that the effect of variations in ASVABC on the probability of graduating from high school is highly significant.

Strictly speaking, the t statistic is valid only for large samples, so the normal distribution is the reference distribution. For this reason the statistic is denoted z in the Stata output. This z has nothing to do with our Z function.

The coefficients of the Z function do not have any direct intuitive interpretation.

However, we can use them to quantify the marginal effect of a change in ASVABC on the probability of graduating. We will do this theoretically for the general case where Z is a function of several explanatory variables.

Since p is a function of Z, and Z is a function of the X variables, the marginal effect of $X_i$ on p can be written as the product of the marginal effect of Z on p and the marginal effect of $X_i$ on Z.

We have already derived an expression for dp/dZ. The marginal effect of $X_i$ on Z is given by its β coefficient.

Hence we obtain an expression for the marginal effect of $X_i$ on p.
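Written out, the chain-rule argument takes the following form. This is a sketch that assumes Z is linear in the explanatory variables, $Z = \beta_1 + \beta_2 X_2 + \dots + \beta_k X_k$, and uses the logistic marginal function $f(Z) = e^{-Z}/(1 + e^{-Z})^2$.

```latex
\frac{\partial p}{\partial X_i}
  = \frac{dp}{dZ} \cdot \frac{\partial Z}{\partial X_i}
  = f(Z)\,\beta_i
  = \frac{e^{-Z}}{(1 + e^{-Z})^{2}}\,\beta_i
```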

The marginal effect is not constant because it depends on the value of Z, which in turn depends on the values of the explanatory variables. A common procedure is to evaluate it for the sample means of the explanatory variables.

The sample mean of ASVABC in this sample is [...].

. sum GRAD ASVABC
[Stata output: summary statistics (Obs, Mean, Std. Dev., Min, Max) for GRAD and ASVABC]

When evaluated at the mean, Z is equal to [...].

$e^{-Z}$ is [...]. Hence $f(Z)$ is [...].

The marginal effect, evaluated at the mean, is therefore about 0.004. This implies that a one point increase in ASVABC would increase the probability of graduating from high school by 0.4 percentage points.

In this example, the marginal effect at the mean of ASVABC is very low. The reason is that anyone with an average score is almost certain to graduate anyway. So an increase in the score has little effect.

To show that the marginal effect varies, we will also calculate it for ASVABC equal to 30. A one point increase in ASVABC then increases the probability by 2.9 percentage points.

An individual with a score of 30 has only a 67 percent probability of graduating, and an increase in the score has a relatively large impact.
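The calculation can be sketched in a few lines. The coefficients `B1` and `B2` below are hypothetical values assumed for illustration; they are not read from the Stata output, whose numbers are not reproduced in this transcript. The score of 50 is the rounded average mentioned earlier, not the exact sample mean.

```python
import math

# Hypothetical coefficients of Z = B1 + B2 * ASVABC (assumed for
# illustration only; not taken from the regression output).
B1, B2 = -3.24, 0.131

def prob(asvabc: float) -> float:
    """Probability of graduating, p = F(Z) = 1 / (1 + exp(-Z))."""
    z = B1 + B2 * asvabc
    return 1.0 / (1.0 + math.exp(-z))

def marginal_effect(asvabc: float) -> float:
    """Marginal effect f(Z) * B2, using f(Z) = F(Z) * (1 - F(Z))."""
    p = prob(asvabc)
    return p * (1.0 - p) * B2

for score in (50, 30):
    print(f"ASVABC = {score}: p = {prob(score):.3f}, "
          f"marginal effect = {marginal_effect(score):.4f}")
```

The identity $f(Z) = F(Z)[1 - F(Z)]$ holds for the logistic function and saves recomputing $e^{-Z}$. With these assumed coefficients the marginal effect is several times larger at a score of 30 than at 50, which is the pattern described in the text.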

Here is the output for a model with a somewhat better specification.

. logit GRAD ASVABC SM SF MALE
[Stata output: iteration log for iterations 0 to 5; logit estimates with Number of obs = 540 and LR chi2(4); coefficient table with rows ASVABC, SM, SF, MALE and _cons]

We will estimate the marginal effects, putting all the explanatory variables equal to their sample means.

. sum GRAD ASVABC SM SF MALE
[Stata output: summary statistics (Obs, Mean, Std. Dev., Min, Max) for GRAD, ASVABC, SM, SF and MALE]

The first step is to calculate Z, when the X variables are equal to their sample means.

Logit: Marginal Effects

            mean       b         product    f(Z)    f(Z)b
ASVABC
SM          11.58     -0.023                         -0.001
SF
MALE
Constant     1.00     -3.252     -3.252
Total                             3.514

We then calculate f(Z).

The estimated marginal effects are f(Z) multiplied by the respective coefficients. We see that the effect of ASVABC is about the same as before. Mother's schooling has negligible effect and father's schooling has no discernible effect at all.

Males have a 0.4 percentage point higher probability of graduating than females. These effects would all have been larger if they had been evaluated at a lower ASVABC score.
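The steps of the marginal effects table can be reproduced for the entries that survive in it. The sketch below uses only the total Z = 3.514 and the SM coefficient of -0.023 from the table; the variable names are illustrative.

```python
import math

Z_AT_MEANS = 3.514   # "Total" of the product column: Z at the sample means
B_SM = -0.023        # coefficient of SM (mother's schooling) from the table

# f(Z) = exp(-Z) / (1 + exp(-Z))**2, evaluated at the sample means.
e = math.exp(-Z_AT_MEANS)
f_z = e / (1.0 + e) ** 2

# Marginal effect of SM: f(Z) multiplied by its coefficient.
me_sm = f_z * B_SM

print("f(Z)   =", round(f_z, 3))
print("f(Z)*b =", round(me_sm, 3))   # the table reports -0.001 for SM
```

Because the same f(Z) multiplies every coefficient, the marginal effects keep the signs and relative sizes of the coefficients themselves.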

This sequence will conclude with an outline explanation of how the model is fitted using maximum likelihood estimation.

In the case of an individual who graduated, the probability of that outcome is $F(Z)$. We will give subscripts 1, ..., s to the individuals who graduated.

In the case of an individual who did not graduate, the probability of that outcome is $1 - F(Z)$. We will give subscripts s+1, ..., n to these individuals.

We choose $b_1$ and $b_2$ so as to maximize the joint probability of the outcomes, that is,

$F(Z_1) \times \cdots \times F(Z_s) \times [1 - F(Z_{s+1})] \times \cdots \times [1 - F(Z_n)]$,

where the first s factors relate to the individuals who graduated and the remaining factors to those who did not. There are no mathematical formulae for $b_1$ and $b_2$. They have to be determined iteratively by a trial-and-error process.
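One way such an iterative search can be organized is sketched below. It applies Newton-Raphson steps with step halving to the log of the joint probability, which has the same maximum as the product itself. This is an illustrative choice of algorithm, not a description of what Stata does internally, and the ten observations are made-up data rather than the sample used in the slides.

```python
import math

def F(z):
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def log_likelihood(b1, b2, xs, ys):
    """Log of F(Z_1) x ... x F(Z_s) x [1 - F(Z_s+1)] x ... x [1 - F(Z_n)]."""
    total = 0.0
    for x, y in zip(xs, ys):
        p = min(max(F(b1 + b2 * x), 1e-15), 1.0 - 1e-15)  # keep log() finite
        total += math.log(p) if y == 1 else math.log(1.0 - p)
    return total

def fit_logit(xs, ys, tol=1e-10, max_iter=50):
    """Newton-Raphson with step halving for Z = b1 + b2 * X."""
    b1 = b2 = 0.0
    ll = log_likelihood(b1, b2, xs, ys)
    for _ in range(max_iter):
        # Gradient (g) and negative Hessian (h) of the log-likelihood.
        g1 = g2 = h11 = h12 = h22 = 0.0
        for x, y in zip(xs, ys):
            p = F(b1 + b2 * x)
            w = p * (1.0 - p)
            g1 += y - p
            g2 += (y - p) * x
            h11 += w
            h12 += w * x
            h22 += w * x * x
        # Solve the 2x2 system h * d = g for the Newton direction d.
        det = h11 * h22 - h12 * h12
        d1 = (h22 * g1 - h12 * g2) / det
        d2 = (h11 * g2 - h12 * g1) / det
        # Halve the step until the log-likelihood does not decrease.
        step = 1.0
        new_ll = log_likelihood(b1 + d1, b2 + d2, xs, ys)
        while new_ll < ll and step > 1e-8:
            step /= 2.0
            new_ll = log_likelihood(b1 + step * d1, b2 + step * d2, xs, ys)
        improvement = new_ll - ll
        b1, b2, ll = b1 + step * d1, b2 + step * d2, new_ll
        if improvement < tol:
            break
    return b1, b2, ll

# Made-up data for illustration: a test score and a 0/1 graduation outcome.
scores = [20, 25, 30, 35, 40, 45, 50, 55, 60, 65]
graduated = [0, 0, 1, 0, 1, 0, 1, 1, 1, 1]

b1, b2, ll = fit_logit(scores, graduated)
print("b1 =", b1, "b2 =", b2, "log likelihood =", ll)
```

Each pass plays the role of one line in an iteration log: the log-likelihood rises from its starting value until further steps no longer improve it.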

Copyright Christopher Dougherty. These slideshows may be downloaded by anyone, anywhere for personal use. Subject to respect for copyright and, where appropriate, attribution, they may be used as a resource for teaching an econometrics course. There is no need to refer to the author.

The content of this slideshow comes from Section 10.2 of C. Dougherty, Introduction to Econometrics, fourth edition 2011, Oxford University Press. Additional (free) resources for both students and instructors may be downloaded from the OUP Online Resource Centre.

Individuals studying econometrics on their own and who feel that they might benefit from participation in a formal course should consider the London School of Economics summer school course EC212 Introduction to Econometrics or the University of London International Programmes distance learning course 20 Elements of Econometrics.