Introduction: Sometimes the dependent variable is qualitative in nature. It takes on two (or more) mutually exclusive values. Examples: whether or not a person is in the labor force; union membership.
Linear Probability Model: Examine the choice of whether a family owns a house. $Y_i = b_1 + b_2 X_i + u_i$, where $Y_i = 1$ if the family owns a house, $Y_i = 0$ if the family does not own a house, and $X_i$ = family income.
Linear Probability Model: We can estimate such a model by OLS. However, OLS runs into several problems here. This is called a linear probability model because $E(Y_i \mid X_i)$ is the conditional probability that the event (owning a house) will occur given $X_i$ (family income).
Derivation: Taking the expected value of the model above: $E(Y_i \mid X_i) = b_1 + b_2 X_i$, since $E(u_i) = 0$. Let $P_i$ = probability that $Y_i = 1$ (the event occurs). Then $1 - P_i$ is the probability that $Y_i = 0$. Then by the definition of a mathematical expectation: $E(Y_i \mid X_i) = 0(1 - P_i) + 1(P_i) = P_i$.
Derivation: So $E(Y_i \mid X_i) = b_1 + b_2 X_i = P_i$. The conditional expectation of $Y_i$ can therefore be interpreted as the conditional probability that $Y_i = 1$.
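As an illustration of the result above, here is a minimal sketch that fits a linear probability model by OLS on simulated data. The data, the sample size, and the helper name `ols_simple` are assumptions made for this example only; they are not from the slides. The fitted values play the role of the estimated $P_i$.

```python
import random

def ols_simple(x, y):
    """Return (b1, b2) for the least-squares line y = b1 + b2*x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b2 = sxy / sxx
    b1 = y_bar - b2 * x_bar
    return b1, b2

# Hypothetical data: income in thousands; the ownership probability is
# assumed (for illustration only) to rise linearly with income.
rng = random.Random(0)
income = [rng.uniform(10, 100) for _ in range(500)]
true_p = [min(max(-0.1 + 0.011 * x, 0.0), 1.0) for x in income]
owns = [1 if rng.random() < p else 0 for p in true_p]  # Y_i is 0 or 1

b1, b2 = ols_simple(income, owns)
fitted = [b1 + b2 * x for x in income]  # estimated P(Y_i = 1 | X_i)

print(f"b1 = {b1:.4f}, b2 = {b2:.4f}")
print(f"mean of Y = {sum(owns) / len(owns):.4f}")
print(f"mean fitted probability = {sum(fitted) / len(fitted):.4f}")
```

Because the regression includes an intercept, the average fitted probability equals the sample share of families with $Y_i = 1$.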
Problems with LPM: The error term is not normally distributed but follows a binomial (Bernoulli) probability distribution. OLS estimation itself does not require that the error term be normally distributed, but we do assume normality for the purposes of hypothesis testing.
Problems with LPM: However, we can't assume normality for the error term here. $u_i$ takes on only two values: when $Y_i = 1$, $u_i = 1 - b_1 - b_2 X_i$; when $Y_i = 0$, $u_i = -b_1 - b_2 X_i$. So $u_i$ is not normally distributed, but follows a binomial distribution. Note that the OLS point estimates still remain unbiased. As $n$ rises, the estimators will tend to be normally distributed.
Problems with LPM: The error term is heteroskedastic. Though $E(u_i) = 0$, the errors are not homoskedastic: $\operatorname{var}(u_i) = E(Y_i \mid X_i)[1 - E(Y_i \mid X_i)] = P_i(1 - P_i)$. This is heteroskedastic because the conditional expectation of $Y$ depends on the value taken by $X$.
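The variance formula can be checked directly from the two values the error takes. A short derivation, using $P_i = b_1 + b_2 X_i$, so that $u_i = 1 - P_i$ with probability $P_i$ and $u_i = -P_i$ with probability $1 - P_i$:

```latex
\begin{aligned}
E(u_i) &= (1 - P_i)\,P_i + (-P_i)(1 - P_i) = 0, \\
\operatorname{var}(u_i) &= (1 - P_i)^2 P_i + (-P_i)^2 (1 - P_i) \\
                        &= P_i (1 - P_i)\,\bigl[(1 - P_i) + P_i\bigr] \\
                        &= P_i (1 - P_i).
\end{aligned}
```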
Problems with LPM: What does this imply? With heteroskedasticity, OLS estimators are unbiased but not efficient: they do not have minimum variance. We correct the heteroskedasticity by transforming the data: divide each variable by $\sqrt{w_i}$, where $w_i = P_i(1 - P_i)$. This eliminates the heteroskedasticity.
Problems with LPM: In practice we don't know the true probability, so we estimate it: a. Run OLS on the original model. b. Get the predicted $\hat{Y}_i$ and construct $w_i = \hat{Y}_i(1 - \hat{Y}_i)$. c. Run OLS on the transformed data (each variable divided by $\sqrt{w_i}$).
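Steps a to c can be sketched as follows. This is only an illustration under stated assumptions: the data are simulated, the helper `wls_simple` is a hypothetical name, and fitted values are clipped to [0.01, 0.99] so that every $w_i$ is positive (the slides do not say how to handle fitted values outside 0 and 1; dropping those observations is another common choice). Running OLS on data divided by $\sqrt{w_i}$ is the same as weighted least squares with weights $1/w_i$, which is what the helper computes.

```python
import random

def wls_simple(x, y, omega):
    """Weighted least squares for y = b1 + b2*x with observation weights omega."""
    total = sum(omega)
    x_bar = sum(o * xi for o, xi in zip(omega, x)) / total
    y_bar = sum(o * yi for o, yi in zip(omega, y)) / total
    sxy = sum(o * (xi - x_bar) * (yi - y_bar) for o, xi, yi in zip(omega, x, y))
    sxx = sum(o * (xi - x_bar) ** 2 for o, xi in zip(omega, x))
    b2 = sxy / sxx
    return y_bar - b2 * x_bar, b2

# Hypothetical data: P(Y=1 | X) = 0.1 + 0.008*X, with X between 10 and 100.
rng = random.Random(1)
x = [rng.uniform(10, 100) for _ in range(500)]
y = [1 if rng.random() < 0.1 + 0.008 * xi else 0 for xi in x]

# Step a: OLS on the original model (equal weights).
b1_ols, b2_ols = wls_simple(x, y, [1.0] * len(x))

# Step b: predicted Y and w_i = Yhat_i * (1 - Yhat_i).
# Clipping is an assumption of this sketch, to keep every w_i positive.
y_hat = [min(max(b1_ols + b2_ols * xi, 0.01), 0.99) for xi in x]
w = [p * (1.0 - p) for p in y_hat]

# Step c: OLS on data divided by sqrt(w_i), i.e. WLS with weights 1/w_i.
b1_wls, b2_wls = wls_simple(x, y, [1.0 / wi for wi in w])

print(f"OLS: b1 = {b1_ols:.4f}, b2 = {b2_ols:.4f}")
print(f"WLS: b1 = {b1_wls:.4f}, b2 = {b2_wls:.4f}")
```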
Problems with LPM: Predicted probabilities falling outside the 0 to 1 range are the main problem with the LPM. Although in theory $P(Y_i = 1 \mid X_i)$ lies between 0 and 1, there is no guarantee that the predicted probabilities from the linear model will. We can estimate by OLS, check whether any estimated probabilities lie outside these bounds, and if so set them to 0 or 1.
Problems with LPM: Or use a probit or logit model, which guarantees that the estimated probabilities will fall between these limits. [Graph]
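The truncation approach (estimate by OLS, then set out-of-range predictions to 0 or 1) can be sketched as below. The coefficients and income values are hypothetical numbers chosen so that some predictions fall outside the bounds; they are not estimates from any data.

```python
def lpm_predict(b1, b2, x):
    """Predicted probability from a linear probability model."""
    return b1 + b2 * x

def truncate_probability(p):
    """Set predictions below 0 to 0 and predictions above 1 to 1."""
    return min(max(p, 0.0), 1.0)

b1, b2 = -0.9, 0.1  # hypothetical LPM estimates (income in thousands)
incomes = [5, 8, 12, 16, 20, 25]

raw = [lpm_predict(b1, b2, x) for x in incomes]
truncated = [truncate_probability(p) for p in raw]

for x, p_raw, p_trunc in zip(incomes, raw, truncated):
    print(f"income={x:>3}  raw={p_raw:+.2f}  truncated={p_trunc:.2f}")
```

The lowest incomes give negative raw predictions and the highest give values above 1, which is exactly the problem the logit and probit models avoid.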
Problems with LPM: The LPM assumes that probabilities increase linearly with the explanatory variables. Each unit increase in an X has the same effect on the probability of Y occurring, regardless of the level of the X. It is more realistic to assume a smaller effect at high probability levels. Probit and logit make this assumption.
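To make the contrast concrete, here is a small sketch comparing the constant effect in the LPM with the effect implied by a logit model, where the change in probability per unit of X is $b_2 P_i (1 - P_i)$. The coefficients are hypothetical and were chosen only so that the index is zero at X = 40.

```python
import math

def logistic_cdf(z):
    """Logistic function: maps any real index z into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def logit_marginal_effect(b1, b2, x):
    """dP/dX in a logit model: b2 * P * (1 - P), evaluated at x."""
    p = logistic_cdf(b1 + b2 * x)
    return b2 * p * (1.0 - p)

b1, b2 = -4.0, 0.1  # hypothetical logit coefficients
effects = {x: logit_marginal_effect(b1, b2, x) for x in (0, 40, 80)}

for x, effect in effects.items():
    p = logistic_cdf(b1 + b2 * x)
    print(f"X={x:>2}  P={p:.3f}  dP/dX={effect:.5f}")
```

The effect is largest where the probability is 0.5 and shrinks toward zero at both very low and very high probabilities, whereas in the LPM it is $b_2$ everywhere.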
Introduction: Probit and logit have an S-shaped probability function. As X increases, the probability of Y increases, but never steps outside the 0-1 interval. The relationship between the probability of Y and X is nonlinear. It approaches zero at slower and slower rates as X gets small.
Introduction: It approaches one at slower and slower rates as X gets large. The S-shaped curve can be modeled by a cumulative distribution function (CDF). The CDF of a random variable $X$: $F(x) = P(X \le x)$. The CDF measures the probability that $X$ takes a value less than or equal to a given $x$.
Introduction: [Graph of F(X) vs X] The CDFs most commonly chosen are: the logistic function (logit) and the cumulative normal (probit). Logit and probit are different models, and their coefficients are interpreted differently. The logistic distribution has flatter tails: it approaches the axes more slowly.
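The two CDFs can be compared numerically. This is a minimal sketch using the standard logistic and standard normal forms; note that these two standard forms have different variances, so the comparison illustrates the shape of the tails rather than matching any fitted model.

```python
import math
from statistics import NormalDist

def logistic_cdf(z):
    """Standard logistic CDF, the function behind the logit model."""
    return 1.0 / (1.0 + math.exp(-z))

def normal_cdf(z):
    """Standard normal CDF, the function behind the probit model."""
    return NormalDist().cdf(z)

# Both curves are S-shaped, stay inside (0, 1), and pass through 0.5 at z = 0.
grid = [-4, -3, -2, -1, 0, 1, 2, 3, 4]
table = [(z, logistic_cdf(z), normal_cdf(z)) for z in grid]

for z, logit_p, probit_p in table:
    print(f"z={z:+d}  logistic={logit_p:.4f}  normal={probit_p:.4f}")
```

At the extremes the logistic values sit further from 0 and 1 than the normal values, which is what "approaches the axes more slowly" means.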
Introduction: Suppose the decision to join a union depends on some unobserved index $Z_i$, "the propensity to join", for each individual. We don't observe the "propensity to join"; we just observe union or not. So we only observe a dummy variable $D$,
Introduction: defined as: $D = 0$ if a worker is nonunion; $D = 1$ if a worker is a union member. Behind this "observed" dummy variable is the "unobserved" index. Assume $Z$ depends on explanatory variables such as the wage. So $Z_i = b_1 + b_2 X_i$, where $X_i$ is the wage of the $i$-th individual.
Introduction: Each individual's $Z$ index can be expressed as a function of an intercept term and the wage with its attached coefficient. In reality there are many X's, not just the wage. Suppose there is a critical or threshold level of $Z$, call it $Z_i^*$. If $Z_i > Z_i^*$ the individual will join; otherwise the individual will not.
Introduction: Assume $Z_i^*$ is distributed normally with the same mean and variance as $Z_i$. What's the probability that $Z_i > Z_i^*$? In other words, what's the probability that this individual will join?
Introduction: $P_i$, the probability of joining, is measured by the area under the standard normal curve from $-\infty$ to $Z_i$. Individuals are at different points along this function. They have different critical values pushing them into joining, depending on their characteristics.
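The threshold story can be checked with a small simulation. This sketch assumes the threshold $Z_i^*$ has already been standardized to mean 0 and variance 1, and uses a hypothetical index value; the function name `share_joining` is also just for illustration. It draws many thresholds and counts how often the index exceeds the threshold, then compares that share with the area under the standard normal curve up to $Z_i$.

```python
import random
from statistics import NormalDist

def share_joining(z_index, n_draws=20000, seed=0):
    """Fraction of simulated thresholds Z* that lie below the index Z."""
    rng = random.Random(seed)
    joins = sum(1 for _ in range(n_draws) if rng.gauss(0.0, 1.0) < z_index)
    return joins / n_draws

z_index = 0.5  # hypothetical value of the index for one individual

simulated = share_joining(z_index)
exact = NormalDist().cdf(z_index)  # area under the standard normal curve up to Z

print(f"simulated share joining = {simulated:.4f}")
print(f"normal CDF at Z         = {exact:.4f}")
```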
Introduction: How do we estimate $Z_i$? Use the inverse of the cumulative normal function: $Z_i = F^{-1}(P_i) = b_1 + b_2 X_i$.
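The mapping between the index and the probability can be sketched in both directions with the standard library's `NormalDist`. The coefficients and wages below are hypothetical; the example only illustrates $P_i = F(Z_i)$ and $Z_i = F^{-1}(P_i)$ and does not estimate $b_1$ and $b_2$ from data.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1

def probit_probability(b1, b2, x):
    """P_i = F(Z_i), with Z_i = b1 + b2*x and F the standard normal CDF."""
    z = b1 + b2 * x
    return std_normal.cdf(z)

def probit_index(p):
    """Z_i = F^{-1}(P_i): recover the index from a probability."""
    return std_normal.inv_cdf(p)

b1, b2 = -2.0, 0.1  # hypothetical coefficients
wages = [10, 20, 30]

probs = [probit_probability(b1, b2, w) for w in wages]
recovered = [probit_index(p) for p in probs]

for w, p, z in zip(wages, probs, recovered):
    print(f"wage={w}  P={p:.4f}  recovered Z={z:+.4f}")
```

Applying the inverse CDF to each probability returns the linear index $b_1 + b_2 X_i$, which is the relationship stated on the slide.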