Econometrics Chengyuan Yin School of Mathematics.

23. Discrete Choice Modeling

A Microeconomics Platform
• Consumers Maximize Utility (!!!)
• Fundamental Choice Problem: Maximize U(x1,x2,…) subject to prices and budget constraints
• A Crucial Result for the Classical Problem:
  • Indirect Utility Function: V = V(p,I)
  • Demand System of Continuous Choices
• The Integrability Problem: Utility is not revealed by demands

Theory for Discrete Choice
• Theory is silent about discrete choices
• Translation to discrete choice
  • Existence of well defined utility indexes: Completeness of rankings
  • Rationality: Utility maximization
  • Axioms of revealed preferences
• Choice sets and consideration sets – consumers simplify choice situations
• Implication for choice among a set of discrete alternatives
• Commonalities and uniqueness
  • Does this allow us to build “models?”
  • What common elements can be assumed?
  • How can we account for heterogeneity?
• Revealed choices do not reveal utility, only rankings, which are scale invariant

Choosing Between Two Alternatives
Modeling the Binary Choice
  Ui,suv = αsuv + βPsuv + γsuvIncome + εi,suv
  Ui,sed = αsed + βPsed + γsedIncome + εi,sed
Chooses SUV: Ui,suv > Ui,sed, that is, Ui,suv − Ui,sed > 0
  (αsuv − αsed) + β(Psuv − Psed) + (γsuv − γsed)Income + εi,suv − εi,sed > 0
  εi > −[α + β(Psuv − Psed) + γIncome]
where α = αsuv − αsed, γ = γsuv − γsed and εi = εi,suv − εi,sed.

What Can Be Learned from the Data? (A Sample of Consumers, i = 1,…,N)
• Are the attributes “relevant?”
• Predicting behavior
  • Individual
  • Aggregate
• Analyze changes in behavior when attributes change

Application: 210 Commuters Between Sydney and Melbourne
• Available modes = Air, Train, Bus, Car
• Observed:
  • Choice
  • Attributes: Cost, terminal time, other
  • Characteristics: Household income
• First application: Fly or Other

Binary Choice Data
Choose Air   Gen.Cost   Term Time   Income

An Econometric Model
• Choose to fly iff Ufly > 0
  Ufly = α + β1Cost + β2Time + γIncome + ε
  Ufly > 0 ⇒ ε > −(α + β1Cost + β2Time + γIncome)
• Probability model: For any person observed by the analyst,
  Prob(fly) = Prob[ε > −(α + β1Cost + β2Time + γIncome)]
• Note the relationship between the unobserved ε and the outcome

α + β1Cost + β2Time + γIncome

Modeling Approaches
• Nonparametric – “relationship”
  • Minimal Assumptions
  • Minimal Conclusions
• Semiparametric – “index function”
  • Stronger assumptions
  • Robust to model misspecification (heteroscedasticity)
  • Still weak conclusions
• Parametric – “Probability function and index”
  • Strongest assumptions – complete specification
  • Strongest conclusions
  • Possibly less robust. (Not necessarily)

Nonparametric P(Air)=f(Income)

Semiparametric
• MSCORE: Find b so that the sum over observations of sign(b’x) * sign(y) is maximized, with y coded −1/+1 (the count of correctly signed predictions).
• Klein and Spady: Find b to maximize a semiparametric likelihood of G(b’x)
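The maximum score idea can be sketched in a few lines. This is a minimal illustration, not the MSCORE program itself: it assumes y is coded 0/1 (mapped to −1/+1 inside), normalizes b to unit length, treats an index of exactly zero as negative, and searches a grid of angles for a two-column x. The data are synthetic and the function names are hypothetical.

```python
import math

def score(b, X, y):
    """Maximum score objective: sum over observations of (2y - 1) * sign(b'x)."""
    total = 0
    for xi, yi in zip(X, y):
        index = sum(bk * xk for bk, xk in zip(b, xi))
        sign = 1 if index > 0 else -1      # assumption: zero index counts as negative
        total += (2 * yi - 1) * sign
    return total

def mscore_grid(X, y, steps=360):
    """Grid search over unit-length b = (cos t, sin t); works for two columns only."""
    best_b, best_val = None, -math.inf
    for s in range(steps):
        t = 2.0 * math.pi * s / steps
        b = (math.cos(t), math.sin(t))
        val = score(b, X, y)
        if val > best_val:
            best_b, best_val = b, val
    return best_b, best_val

# Synthetic illustration: columns are (constant, x); y = 1 when x > 0.5.
X = [(1.0, k / 10) for k in range(-10, 11)]
y = [1 if xi[1] > 0.5 else 0 for xi in X]
b_hat, s_hat = mscore_grid(X, y)
print(b_hat, s_hat)
```

The objective is a step function of b, so many values of b tie for the maximum. This is one reason the coefficients are hard to interpret and why a scale normalization is needed.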

MSCORE

Klein and Spady Semiparametric
Note necessary normalizations. Coefficients are not very meaningful.

Parametric: Logit Model

Logit vs. MScore
• Logit fits worse
• MScore fits better, coefficients are meaningless

Parametric Model Estimation
• How to estimate α, β1, β2, γ?
  • It’s not regression
  • The technique of maximum likelihood
  Prob[y=1] = Prob[ε > −(α + β1Cost + β2Time + γIncome)]
  Prob[y=0] = 1 − Prob[y=1]
• Requires a model for the probability
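A minimal sketch of maximum likelihood for a binary choice model, assuming a logistic distribution for ε and a single regressor plus a constant. The data are synthetic (not the Sydney to Melbourne sample), the function names are hypothetical, and Newton's method is one of several possible optimizers.

```python
import math

def logistic(t):
    return 1.0 / (1.0 + math.exp(-t))

def log_likelihood(a, b, x, y):
    """Sum of log Prob[y=1] for ones and log Prob[y=0] = log(1 - Prob[y=1]) for zeros."""
    ll = 0.0
    for xi, yi in zip(x, y):
        p = logistic(a + b * xi)
        ll += math.log(p) if yi == 1 else math.log(1.0 - p)
    return ll

def fit_logit(x, y, iterations=25):
    """Newton's method for Prob[y=1] = logistic(a + b*x)."""
    a, b = 0.0, 0.0
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            p = logistic(a + b * xi)
            g0 += yi - p                 # score with respect to a
            g1 += (yi - p) * xi          # score with respect to b
            w = p * (1.0 - p)
            h00 += w                     # elements of the negative Hessian
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        a += (h11 * g0 - h01 * g1) / det  # Newton step: (negative Hessian)^-1 * score
        b += (h00 * g1 - h01 * g0) / det
    return a, b

# Synthetic data, chosen so that the outcomes are not perfectly separated by x.
x = [-2.0, -1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5, 2.0, 2.5]
y = [0, 0, 1, 0, 0, 1, 0, 1, 1, 1]
a_hat, b_hat = fit_logit(x, y)
print(a_hat, b_hat, log_likelihood(a_hat, b_hat, x, y))
```

Swapping `logistic` for another distribution function changes the model (probit, extreme value), but the gradient and Hessian lines above are specific to the logit case.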

Completing the Model: F(ε)
• The distribution
  • Normal: PROBIT, natural for behavior
  • Logistic: LOGIT, allows “thicker tails”
  • Gompertz: EXTREME VALUE, asymmetric, underlies the basic logit model for multiple choice
• Does it matter?
  • Yes, large difference in estimates
  • Not much, quantities of interest are more stable.
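The three candidate distributions can be compared directly. This is a sketch under stated assumptions: the extreme value form F(t) = exp(−exp(−t)) is one common parameterization and may differ from the one used for the estimates in these slides. The normal CDF uses the error function from the standard library.

```python
import math

def probit_cdf(t):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))

def logit_cdf(t):
    """Standard logistic CDF."""
    return 1.0 / (1.0 + math.exp(-t))

def extreme_value_cdf(t):
    """Assumed parameterization: F(t) = exp(-exp(-t)); asymmetric around zero."""
    return math.exp(-math.exp(-t))

for t in (-3.0, -1.0, 0.0, 1.0, 3.0):
    print(t, probit_cdf(t), logit_cdf(t), extreme_value_cdf(t))
```

The normal and logistic functions both pass through 0.5 at t = 0 and satisfy F(t) + F(−t) = 1; the extreme value function does neither. The unscaled comparison overstates the tail difference between logit and probit, since the standard logistic has a larger variance than the standard normal.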

Underlying Probability Distributions for Binary Choice

Estimated Binary Choice Models
            LOGIT               PROBIT              EXTREME VALUE
Variable    Estimate  t-ratio   Estimate  t-ratio   Estimate  t-ratio
Constant
GC
TTME
HINC
Log-L
Log-L(0)

Effect on Predicted Probability of an Increase in Income
α + β1Cost + β2Time + γ(Income+1) (γ is positive)

Marginal Effects in Probability Models
Prob[Outcome] = some F(α + β1Cost…)
“Partial effect” = ∂F(α + β1Cost…) / ∂“x” (derivative)
• Partial effects are derivatives
• Result varies with model
  • Logit: ∂F(α + β1Cost…) / ∂x = Prob * (1 − Prob) * β
  • Probit: ∂F(α + β1Cost…) / ∂x = Normal density * β
• Scaling usually erases model differences
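The two partial effect formulas can be checked numerically. The sketch below uses hypothetical coefficients (1.6 for logit and 1.0 for probit, reflecting the rough rule of thumb that logit coefficients are larger than probit coefficients by a factor of about 1.6) to illustrate how the scaling can bring the two models' partial effects close together.

```python
import math

def logit_prob(index):
    return 1.0 / (1.0 + math.exp(-index))

def probit_prob(index):
    return 0.5 * (1.0 + math.erf(index / math.sqrt(2.0)))

def logit_partial_effect(index, beta_k):
    """Prob * (1 - Prob) * beta_k."""
    p = logit_prob(index)
    return p * (1.0 - p) * beta_k

def probit_partial_effect(index, beta_k):
    """Normal density at the index * beta_k."""
    density = math.exp(-0.5 * index * index) / math.sqrt(2.0 * math.pi)
    return density * beta_k

# Hypothetical coefficients on the same regressor in the two models.
beta_logit, beta_probit = 1.6, 1.0
pe_logit = logit_partial_effect(0.0, beta_logit)     # 0.25 * 1.6
pe_probit = probit_partial_effect(0.0, beta_probit)  # normal density at 0 * 1.0
print(pe_logit, pe_probit)
```

At other values of the index the two effects drift apart somewhat, which is why the slide says "usually" rather than "always".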

The Delta Method

Marginal Effects for Binary Choice
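Logit: ∂Prob[y=1|x]/∂x = Λ(β’x)[1 − Λ(β’x)]β, where Λ(t) = exp(t)/[1 + exp(t)]
Probit: ∂Prob[y=1|x]/∂x = φ(β’x)β, where φ is the standard normal density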

Estimated Marginal Effects
            Logit     Probit    Extreme Value
GC          3.267     3.466     3.354
TTME       -5.042    -5.754    -4.871
HINC        2.193     2.532     2.064
(Slide columns: Estimate, t-ratio)

Marginal Effect for a Dummy Variable
Prob[yi = 1|xi,di] = F(β’xi + γdi) = conditional mean
Marginal effect of d: Prob[yi = 1|xi,di=1] − Prob[yi = 1|xi,di=0]
Logit: Λ(β’xi + γ) − Λ(β’xi), where Λ(t) = exp(t)/[1 + exp(t)]

Computing Effects
• Compute at the data means?
  • Simple
  • Inference is well defined
• Average the individual effects
  • More appropriate?
  • Asymptotic standard errors. (Not done correctly in the literature – terms are correlated!)
• Is testing about marginal effects meaningful?
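The two ways of computing effects can give noticeably different numbers. A minimal sketch for a logit model, with hypothetical coefficients and a synthetic regressor (nothing here comes from the travel mode data):

```python
import math

def logistic(t):
    return 1.0 / (1.0 + math.exp(-t))

def effect_at_means(rows, coefs, k):
    """Logit partial effect of regressor k, evaluated at the sample means of the data."""
    n = len(rows)
    means = [sum(r[j] for r in rows) / n for j in range(len(coefs))]
    p = logistic(sum(c * m for c, m in zip(coefs, means)))
    return p * (1.0 - p) * coefs[k]

def average_partial_effect(rows, coefs, k):
    """Logit partial effect of regressor k, computed per observation and then averaged."""
    total = 0.0
    for r in rows:
        p = logistic(sum(c * v for c, v in zip(coefs, r)))
        total += p * (1.0 - p) * coefs[k]
    return total / len(rows)

# Hypothetical coefficients (constant, slope) and synthetic rows (1, x).
coefs = (-0.5, 1.2)
rows = [(1.0, float(x)) for x in range(-3, 4)]
pea = effect_at_means(rows, coefs, 1)
ape = average_partial_effect(rows, coefs, 1)
print(pea, ape)
```

In this example the mean of x puts the index near the steep part of the logistic curve, so the effect at the means exceeds the average of the individual effects, many of which are computed in the flat tails.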

Average Partial Effects

Elasticities
• Elasticity = ∂ log Prob / ∂ log x = (x / Prob) * ∂Prob/∂x
• How to compute standard errors?
  • Delta method
  • Bootstrap
    • Bootstrap the individual elasticities? (Will neglect variation in parameter estimates.)
    • Bootstrap model estimation?

Odds Ratio – Logit Model Only
Effect Measure? “Effect of a unit change in the odds ratio.”
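In the logit model the odds are Prob/(1 − Prob) = exp(index), so a one unit increase in a regressor multiplies the odds by exp(βk) regardless of the starting point. A short numerical check, with a hypothetical coefficient and base index:

```python
import math

def odds(index):
    """Odds Prob / (1 - Prob) for a logit probability."""
    p = 1.0 / (1.0 + math.exp(-index))
    return p / (1.0 - p)

beta_k = 0.4        # hypothetical coefficient
base_index = -0.3   # hypothetical value of the index before the change
ratio = odds(base_index + beta_k) / odds(base_index)
print(ratio, math.exp(beta_k))
```

The same calculation with a probit or extreme value probability gives a ratio that depends on `base_index`, which is why the measure applies to the logit model only.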

Ordered Outcomes
• E.g.: Taste test, credit rating, course grade
• Underlying random preferences: Mapping to observed choices
• Strength of preferences
• Censoring and discrete measurement
• The nature of ordered data

Modeling Ordered Choices
• Random Utility
  Uit = α + β’xit + γi’zit + εit = ait + εit
• Observe outcome j if utility is in region j
• Probability of outcome = probability of cell
  Pr[Yit=j] = F(μj – ait) − F(μj−1 – ait)
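The cell probabilities can be computed directly from the thresholds. The sketch below assumes a normal F (ordered probit), five outcomes with hypothetical thresholds 0, 1, 2, 3, and the usual convention that the lowest cell starts at −∞ and the highest ends at +∞.

```python
import math

def normal_cdf(t):
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))

def ordered_probabilities(a, cutoffs, cdf=normal_cdf):
    """Pr[Y=j] = F(mu_j - a) - F(mu_{j-1} - a), with mu_{-1} = -inf and mu_J = +inf."""
    bounds = [-math.inf] + list(cutoffs) + [math.inf]
    values = []
    for b in bounds:
        if b == -math.inf:
            values.append(0.0)
        elif b == math.inf:
            values.append(1.0)
        else:
            values.append(cdf(b - a))
    return [values[j + 1] - values[j] for j in range(len(bounds) - 1)]

cutoffs = [0.0, 1.0, 2.0, 3.0]   # hypothetical thresholds for 5 ordered outcomes
probs = ordered_probabilities(1.8, cutoffs)
print(probs, sum(probs))
```

Because each cell is a difference of adjacent CDF values, the probabilities are positive whenever the thresholds are increasing, and they sum to one by construction.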

Health Care Satisfaction (HSAT)
• Self administered survey: Health Care Satisfaction? (0 – 10)
• Continuous Preference Scale

Ordered Probability Model

Ordered Probabilities

Five Ordered Probabilities

Coefficients

Effects in the Ordered Probability Model
• Assume the βk is positive.
• Assume that xk increases.
• β’x increases. μj − β’x shifts to the left for all 5 cells.
  • Prob[y=0] decreases
  • Prob[y=1] decreases – the mass shifted out is larger than the mass shifted in.
  • Prob[y=2] decreases – same reason.
  • Prob[y=3] increases.
  • Prob[y=4] increases
• When βk > 0, increase in xk decreases Prob[y=0] and increases Prob[y=J]. Intermediate cells are ambiguous, but there is only one sign change in the marginal effects from 0 to 1 to … to J
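The sign pattern can be verified numerically. For an ordered probit, the effect of xk on cell j is [φ(μj−1 − β’x) − φ(μj − β’x)]βk. The thresholds, index value and coefficient below are hypothetical and were picked so that the pattern matches the five-cell case described above.

```python
import math

def normal_pdf(t):
    return math.exp(-0.5 * t * t) / math.sqrt(2.0 * math.pi)

def ordered_marginal_effects(index, cutoffs, beta_k):
    """Effect of x_k on each cell probability: [f(mu_{j-1} - index) - f(mu_j - index)] * beta_k."""
    bounds = [-math.inf] + list(cutoffs) + [math.inf]
    dens = [0.0 if math.isinf(b) else normal_pdf(b - index) for b in bounds]
    return [(dens[j] - dens[j + 1]) * beta_k for j in range(len(bounds) - 1)]

# Hypothetical values: thresholds 0, 1, 2, 3; beta'x = 1.8; beta_k = 0.5 > 0.
effects = ordered_marginal_effects(1.8, [0.0, 1.0, 2.0, 3.0], 0.5)
signs = [1 if e > 0 else -1 for e in effects]
sign_changes = sum(1 for s, t in zip(signs, signs[1:]) if s != t)
print(effects, sign_changes)
```

The effects sum to zero because the probabilities always sum to one, so mass leaving the low cells must arrive in the high cells.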

Ordered Probit Marginal Effects

Multinomial Choice Among J Alternatives
• Random Utility Basis
  Uitj = αij + βi’xitj + γi’zit + εijt
  i = 1,…,N; j = 1,…,J(i); t = 1,…,T(i)
• Maximum Utility Assumption
  Individual i will choose alternative j in choice setting t iff Uitj > Uitk for all k ≠ j.
• Underlying assumptions
  • Smoothness of utilities
  • Axioms: Transitive, Complete, Monotonic

Utility Functions
• The linearity assumption and curvature
• The choice set
• Deterministic and random components: The “model”
  • Generic vs. alternative specific components
  • Attributes and characteristics
  • Coefficients
    • Part worths
    • Alternative specific constants
  • Scaling

The Multinomial Logit (MNL) Model
• Independent extreme value (Gumbel): F(εitj) = Exp(−Exp(−εitj)) (random part of each utility)
• Independence across utility functions
• Identical variances (means absorbed in constants)
• Same parameters for all individuals (temporary)
• Implied probabilities for observed outcomes

Specifying Probabilities
• Choice specific attributes (X) vary by choices, multiply by generic coefficients. E.g., TTME, GC
• Generic characteristics (Income, constants) must be interacted with choice specific constants. (Else they fall out of the probability)
• Estimation by maximum likelihood; dij = 1 if person i chooses j
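A minimal sketch of the multinomial logit probabilities and log likelihood, assuming the standard form Prob[j] = exp(Vj) / Σm exp(Vm) with Vj the deterministic part of utility. All coefficient and attribute values are hypothetical. The example also shows why a characteristic such as income drops out unless it is interacted with alternative specific constants.

```python
import math

def mnl_probabilities(utilities):
    """exp(V_j) / sum_m exp(V_m), computed after subtracting the maximum for stability."""
    top = max(utilities)
    weights = [math.exp(u - top) for u in utilities]
    total = sum(weights)
    return [w / total for w in weights]

def log_likelihood(utilities_by_person, chosen):
    """Sum over persons of log P(chosen alternative); chosen[i] = j is the index where d_ij = 1."""
    return sum(math.log(mnl_probabilities(u)[j])
               for u, j in zip(utilities_by_person, chosen))

# Hypothetical values for one traveler; order is Air, Train, Bus, Car.
gc = [70.0, 45.0, 40.0, 25.0]       # generalized cost
ttme = [60.0, 35.0, 40.0, 0.0]      # terminal time
asc = [5.0, 3.5, 3.0, 0.0]          # alternative specific constants
beta_gc, beta_ttme, income = -0.015, -0.09, 35.0

base = [asc[j] + beta_gc * gc[j] + beta_ttme * ttme[j] for j in range(4)]
same_income_term = [v + 0.02 * income for v in base]      # same coefficient in every utility
air_income_term = [base[0] + 0.02 * income] + base[1:]    # income interacted with the Air constant

p_base = mnl_probabilities(base)
p_same = mnl_probabilities(same_income_term)
p_air = mnl_probabilities(air_income_term)
print(p_base, p_same, p_air)
```

Adding the same income term to every utility leaves all probabilities unchanged, while interacting income with the Air constant raises the Air probability.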

Observed Data
• Types of Data
  • Individual choice
  • Market shares
  • Frequencies
  • Ranks
• Attributes and Characteristics
• Choice Settings
  • Cross section
  • Repeated measurement (panel data)

Data on Discrete Choices
Line   MODE    TRAVEL   INVC   INVT   TTME   GC   HINC
1      AIR
2      TRAIN
3      BUS
4      CAR
5      AIR
6      TRAIN
7      BUS
8      CAR
…
321    AIR
322    TRAIN
323    BUS
324    CAR
325    AIR
326    TRAIN
327    BUS
328    CAR

Model Fit Based on Log Likelihood
• Three sets of predicted probabilities
  • No model: Pij = 1/J (.25)
  • Constants only: Pij = (1/N)Σi dij [(58,63,30,59)/210 = .276, .300, .143, .281]
  • Estimated model: Logit probabilities
• Compute log likelihood
• Measure improvement in log likelihood with R-squared = 1 – LogL/LogL0 (“Adjusted” for number of parameters in the model.)
• NOT A MEASURE OF “FIT!”
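The first two log likelihoods need no estimation, only the choice counts given above. A short sketch (the helper name `pseudo_r_squared` is hypothetical; the log likelihood of an estimated model would be supplied by whatever program fits it):

```python
import math

# Choice counts from the slide: Air, Train, Bus, Car.
counts = {"Air": 58, "Train": 63, "Bus": 30, "Car": 59}
n = sum(counts.values())
shares = {mode: c / n for mode, c in counts.items()}

# No model: every alternative has probability 1/J.
logl_no_model = n * math.log(1.0 / len(counts))

# Constants only: every person gets the sample shares as probabilities.
logl_constants = sum(c * math.log(shares[mode]) for mode, c in counts.items())

def pseudo_r_squared(logl, logl0):
    """1 - LogL/LogL0; measures improvement in the log likelihood, not 'fit'."""
    return 1.0 - logl / logl0

print(shares)
print(logl_no_model, logl_constants)
print(pseudo_r_squared(logl_constants, logl_no_model))
```

Because the sample shares are not far from 1/4 each, the constants-only model improves on the no-model log likelihood only slightly.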

Effects of Changes in Attributes on Probabilities
• Partial Effects: Effect of a change in attribute “k” of alternative “m” on the probability that choice “j” will be made is
  ∂Pj/∂xm,k = Pj[1(j = m) − Pm]βk
• Proportional changes: Elasticities
  ∂ln Pj/∂ln xm,k = xm,k[1(j = m) − Pm]βk
• Note the cross elasticity (j ≠ m) is the same for all choices “j.” (IIA)
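The equal cross elasticities implied by the multinomial logit form are easy to see numerically. The sketch assumes the standard MNL elasticity xm,k[1(j = m) − Pm]βk; the costs and the coefficient are hypothetical.

```python
import math

def mnl_probabilities(utilities):
    top = max(utilities)
    weights = [math.exp(u - top) for u in utilities]
    total = sum(weights)
    return [w / total for w in weights]

def mnl_elasticity(probs, x_mk, beta_k, j, m):
    """Elasticity of P_j with respect to attribute k of alternative m."""
    own = 1.0 if j == m else 0.0
    return (own - probs[m]) * beta_k * x_mk

# Hypothetical generalized costs for Air, Train, Bus, Car and a generic cost coefficient.
cost = [100.0, 60.0, 50.0, 80.0]
beta_cost = -0.02
probs = mnl_probabilities([beta_cost * c for c in cost])

m = 0  # change the cost of Air
elasticities = [mnl_elasticity(probs, cost[m], beta_cost, j, m) for j in range(4)]
print(elasticities)
```

The own elasticity (first entry) is negative, and the three cross elasticities are positive and identical: a higher air fare shifts travelers to train, bus and car in the same proportion, whatever their characteristics.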

Choice Based Sampling
• Over/Underrepresenting alternatives in the data set
• Biases in parameter estimates? (Constants only?)
• Biases in estimated variances
• Weighted log likelihood, weight = πj / Fj for all i.
• Fixup of covariance matrix
• ; Choices = list of names / list of true proportions $

Choice   Air    Train   Bus    Car
True     0.14   0.13    0.09   0.64
Sample   0.28   0.30
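The weights can be sketched as follows, using the true proportions listed above and, as an assumption, the sample counts (58, 63, 30, 59) reported for the same 210 travelers on the model fit slide. The function name is hypothetical, and the correction of the covariance matrix is not shown.

```python
import math

# True population proportions from the slide.
true_shares = {"Air": 0.14, "Train": 0.13, "Bus": 0.09, "Car": 0.64}
# Assumption: sample counts taken from the model fit slide for the same data.
sample_counts = {"Air": 58, "Train": 63, "Bus": 30, "Car": 59}

n = sum(sample_counts.values())
sample_shares = {mode: c / n for mode, c in sample_counts.items()}

# weight_j = (true proportion) / (sample proportion), applied to everyone who chose j.
weights = {mode: true_shares[mode] / sample_shares[mode] for mode in true_shares}

def weighted_log_likelihood(probabilities, chosen):
    """Sum over persons of weight[chosen mode] * log P(chosen mode)."""
    return sum(weights[mode] * math.log(p[mode])
               for p, mode in zip(probabilities, chosen))

print(weights)
```

Car is heavily underrepresented in the sample, so car choosers receive a weight above 2, while air and train choosers are weighted down.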

The I.I.D. Assumption
Uitj = αij + βi’xitj + γi’zit + εijt
F(εitj) = Exp(−Exp(−εitj)) (random part of each utility)
• Independence across utility functions
• Identical variances (means absorbed in constants)
• Restriction on scaling
• Correlation across alternatives?
• Implication for cross elasticities (we saw earlier)
• Behavioral assumption, independence from irrelevant alternatives (IIA)
