GY460 Techniques of Spatial Analysis Steve Gibbons Lecture 6: Probabilistic choice models.


Introduction
It is sometimes useful to model the choices of individuals, firms, or other agents over discrete alternatives:
– Choice of transport mode
– Choice of firm location amongst regions
– Choice of cities or countries to migrate to
Theoretical framework:
– Random utility model
Empirical methods:
– Micro: probit, logit, multinomial logit
– Aggregate: Poisson, OLS, gravity

The Random Utility choice model

Random Utility Model
The RUM underlies the economic interpretation of discrete choice models.
It was developed by Daniel McFadden for econometric applications:
– See JoEL January 2001 for the Nobel lecture; also Manski (2001), Daniel McFadden and the Econometric Analysis of Discrete Choice, Scandinavian Journal of Economics, 103(2), 217-229
Preferences are functions of biological taste templates, experiences, and other personal characteristics:
– Some of these are observed, others unobserved
– Allows for taste heterogeneity
The discussion below is in terms of individual utility (e.g. migration, transport mode choice), but similar reasoning applies to firm choices.

Random Utility Model
Individual i's utility from choice j can be decomposed into two components: $U_{ij} = V_{ij} + \varepsilon_{ij}$
$V_{ij}$ is deterministic:
– Common to everyone, given the same characteristics and constraints
– Represents the tastes of the population, e.g. effects of time and cost on travel mode choice
$\varepsilon_{ij}$ is random:
– Reflects the idiosyncratic tastes of i and unobserved attributes of choice j

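Random Utility Model
$V_{ij}$ is a function of attributes of alternative j (e.g. price $p_{ij}$ and time $t_{ij}$) and of observed consumer and choice characteristics $z_{ij}$, e.g. in linear form: $V_{ij} = \beta p_{ij} + \gamma t_{ij} + \delta z_{ij}$
We are interested in finding $\beta$, $\gamma$, $\delta$.
Let's forget about z now for simplicity.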

RUM and binary choices
Consider two choices, e.g. bus or car.
We observe whether an individual uses one or the other.
Define $y_i = 1$ if individual i travels by bus and $y_i = 0$ if by car.
What is the probability that we observe an individual choosing to travel by bus?
Assume utility maximisation: the individual chooses bus (y=1) rather than car (y=0) if the utility of commuting by bus exceeds the utility of commuting by car.

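RUM and binary choices
So choose bus if $V_{i,\text{bus}} + \varepsilon_{i,\text{bus}} > V_{i,\text{car}} + \varepsilon_{i,\text{car}}$, i.e. if $\varepsilon_{i,\text{car}} - \varepsilon_{i,\text{bus}} < V_{i,\text{bus}} - V_{i,\text{car}}$
So the probability that we observe an individual choosing bus travel is $\Pr(y_i = 1) = \Pr(\varepsilon_{i,\text{car}} - \varepsilon_{i,\text{bus}} < V_{i,\text{bus}} - V_{i,\text{car}}) = F(V_{i,\text{bus}} - V_{i,\text{car}})$, where V is the deterministic part of utility, $\varepsilon$ the random part, and F the cumulative distribution function of $\varepsilon_{i,\text{car}} - \varepsilon_{i,\text{bus}}$.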
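The choice rule on this slide can be checked by simulation. The sketch below assumes the random utility components are independent standard extreme value (Gumbel) draws, in which case their difference is logistic; the deterministic utilities `v_bus` and `v_car` are hypothetical values chosen only for illustration.

```python
import math
import random


def gumbel_draw(rng):
    """Draw from a standard type-I extreme value (Gumbel) distribution."""
    # Keep u strictly inside (0, 1) so both logarithms are defined.
    u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
    return -math.log(-math.log(u))


def simulate_bus_share(v_bus, v_car, n_draws=100_000, seed=0):
    """Share of simulated individuals whose bus utility exceeds car utility."""
    rng = random.Random(seed)
    chose_bus = 0
    for _ in range(n_draws):
        u_bus = v_bus + gumbel_draw(rng)
        u_car = v_car + gumbel_draw(rng)
        if u_bus > u_car:
            chose_bus += 1
    return chose_bus / n_draws


def logistic(index):
    """Logistic CDF, the distribution of the difference of two Gumbel draws."""
    return 1.0 / (1.0 + math.exp(-index))


v_bus, v_car = 0.5, 1.0  # hypothetical deterministic utilities
simulated = simulate_bus_share(v_bus, v_car)
analytic = logistic(v_bus - v_car)
print(f"simulated share: {simulated:.3f}, F(V_bus - V_car): {analytic:.3f}")
```

The simulated share of bus users should be close to the logistic function evaluated at the utility difference, which is the binary logit probability.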

The linear probability model
Assume the probability depends linearly on observed characteristics (price $p_i$ and time $t_i$): $\Pr(y_i = 1) = \alpha + \beta p_i + \gamma t_i$
Then you can estimate by linear regression: $y_i = \alpha + \beta p_i + \gamma t_i + u_i$
where $y_i$ is the dummy variable for mode choice (1 if bus, 0 if car).
Other consumer and choice characteristics can be included (the z's in the first slide in this section).

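The linear probability model
Unfortunately this has some undesirable properties, e.g. the fitted line is not bounded between 0 and 1.
[Figure: linear regression line plotted against a probability axis running from 0 to 1]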

Non-linear probability model
Better for the probability function to have a shape something like:
[Figure: probability curve bounded between 0 and 1]

Probits and logits
Common assumptions:
– Cumulative normal distribution function: probit
– Logistic function: logit
Estimation by maximum likelihood.
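As a rough illustration of the two functional forms, the sketch below evaluates the logistic function and the cumulative normal at a few index values, and writes out the Bernoulli log-likelihood that maximum likelihood estimation would maximise. The function names are illustrative, not taken from the lecture.

```python
import math


def logit_cdf(index):
    """Logistic function: the logit choice probability."""
    return 1.0 / (1.0 + math.exp(-index))


def probit_cdf(index):
    """Standard normal CDF (via the error function): the probit choice probability."""
    return 0.5 * (1.0 + math.erf(index / math.sqrt(2.0)))


def log_likelihood(ys, indexes, cdf):
    """Bernoulli log-likelihood: sum of y*ln(P) + (1 - y)*ln(1 - P)."""
    total = 0.0
    for y, index in zip(ys, indexes):
        p = cdf(index)
        total += y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total


# Both curves stay inside (0, 1), unlike a linear regression line.
for index in (-3.0, -1.0, 0.0, 1.0, 3.0):
    print(f"{index:+.1f}  logit={logit_cdf(index):.3f}  probit={probit_cdf(index):.3f}")
```

In practice the index would be a linear function of price, time and other characteristics, and the parameters would be chosen to maximise `log_likelihood`.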

Example
McFadden, D. (1974), The Measurement of Urban Travel Demand, Journal of Public Economics, 3
Methods of commuting in the San Francisco Bay area

Example 1
McFadden (1974): car versus bus commute modes in the SF Bay area

Characteristic                                          Coefficient   t
Family income, $                                         0.000095     (0.774)
Car-bus cost, cents per round trip                      -0.01022*     (3.726)
Car-bus vehicle time costs (one way minutes x wage)     -0.01479      (2.460)
Bus total access time costs (one way minutes x wage)    -0.00314      (0.818)
Constant                                                 0.3832       (0.428)

Multiple choices and the multinomial logit

Multiple choices
We often want to think about many more than two choices:
– Choice of regional location
– Choice of transport mode with many alternatives
– Choice amongst a sample of schools
How can we extend the binary choice logit model?
The random utility model extends to many choices: choose choice k if its utility is higher than for all other choices, i.e. if $U_{ik} > U_{ij}$ for all $j \neq k$.

Multinomial logit (1)
Again we need to assume some distribution for the unobserved factor.
One type of distribution (extreme value) gives a simple solution for the probability that choice k is made: $\Pr(y_i = k) = \exp(V_{ik}) / \sum_j \exp(V_{ij})$
This is a generalisation of the logit model to many alternatives: the multinomial logit or conditional logit.
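The multinomial logit probability is each alternative's exponentiated deterministic utility divided by the sum over all alternatives. A minimal sketch, with hypothetical utilities for three travel modes:

```python
import math


def mnl_probabilities(utilities):
    """Multinomial logit choice probabilities from deterministic utilities V_j."""
    # Subtracting the maximum leaves the probabilities unchanged
    # but avoids overflow in exp() for large utilities.
    v_max = max(utilities.values())
    exps = {choice: math.exp(v - v_max) for choice, v in utilities.items()}
    total = sum(exps.values())
    return {choice: e / total for choice, e in exps.items()}


v = {"bus": 0.2, "train": 0.9, "car": 1.3}  # hypothetical deterministic utilities
probs = mnl_probabilities(v)
for choice, p in probs.items():
    print(f"{choice}: {p:.3f}")
```

With only two alternatives the same function reduces to the binary logit.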

Multinomial logit (2)
Recall: $V_{ij}$ is a linear function of observed characteristics of the individuals and their choices, e.g. for travel mode choice with price $p_j$ and individual income $z_i$: $V_{ij} = \beta p_j + \delta_j z_i$
Parameters estimated:
– For an individual characteristic that is common across choices (e.g. income, gender): one parameter per choice. For at least one choice this is zero (base case).
– For a characteristic which varies only across choices (e.g. price of transport): one parameter common across choices.

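Example: Value of time
MNL models are used to estimate the value of travel time from observed commuter behaviour.
Three transport choices: bus (0), train (1), car (2)
Choosing bus as the base case: $\ln(P_{i1}/P_{i0}) = V_{i1} - V_{i0}$ and $\ln(P_{i2}/P_{i0}) = V_{i2} - V_{i0}$, where $P_{ij}$ is the probability that individual i chooses mode j.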
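One common way to read a value of time off such a model is as the ratio of the time coefficient to the cost coefficient. The sketch below assumes the deterministic utility is linear in cost $c_{ij}$ and time $t_{ij}$; the symbols $\beta$ and $\gamma$ are chosen here for illustration and are not taken from the original slide.

```latex
\begin{align*}
V_{ij} &= \beta\, c_{ij} + \gamma\, t_{ij} \\
\text{Value of time} &= \frac{\partial V_{ij} / \partial t_{ij}}{\partial V_{ij} / \partial c_{ij}} = \frac{\gamma}{\beta}
\end{align*}
```

The ratio is measured in cost units per unit of time: it is the extra cost a traveller would accept for a one-unit reduction in travel time while staying equally well off.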

Example 1: Value of time For example, from Truong and Hensher, Economic Journal, 95 (1985) p. 15 for bus/train/car choices in Sydney 1982

Example 2: immigration
Scott, Coomes and Izyumov (2005), The Location Choice of Employment-Based Immigrants among U.S. Metro Areas, Journal of Regional Science, 45(1), 113-145
Estimate the impact of metropolitan area characteristics on destination choice for US migrants in 1995
298 destination MSAs

Example 2: immigration Source: Scott, Coomes et al (note: they also report models which include individual Xs)

The independence of irrelevant alternatives problem (IIA) and the nested logit model

Multinomial logit and IIA
Many applications in economic and geographical journals (and other research areas).
The multinomial logit model is the workhorse of multiple choice modelling in all disciplines.
Easy to compute.
But it has a drawback.

Independence of Irrelevant Alternatives
Consider market shares:
– Red bus 20%
– Blue bus 20%
– Train 60%
IIA assumes that if the red bus company shuts down, the market shares become:
– Blue bus 20% + 5% = 25%
– Train 60% + 15% = 75%
because the ratio of blue bus trips to train trips must stay at 1:3.

Independence of Irrelevant Alternatives
The model assumes that unobserved attributes of all alternatives are perceived as equally similar.
But will people unable to travel by red bus really switch to travelling by train?
The most likely outcome is (assuming the supply of bus seats is elastic):
– Blue bus: 40%
– Train: 60%
This failure of multinomial/conditional logit models follows from the Independence of Irrelevant Alternatives (IIA) assumption.

Independence of Irrelevant Alternatives
It is easy to see why this is: the ratio of the probabilities of choosing k (e.g. red bus) and another choice l (e.g. train) is just $\Pr(y_i = k) / \Pr(y_i = l) = \exp(V_{ik}) / \exp(V_{il}) = \exp(V_{ik} - V_{il})$
All other choices drop out of this odds ratio.
There are models that overcome this, e.g…
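The red bus / blue bus shares can be reproduced with a few lines of code. The sketch below assumes deterministic utilities equal to the log of the initial market shares (a convenient normalisation, not something stated in the lecture) and then removes the red bus from the choice set.

```python
import math


def mnl_shares(utilities):
    """Multinomial logit shares: exp(V_k) divided by the sum over available choices."""
    exps = {choice: math.exp(v) for choice, v in utilities.items()}
    total = sum(exps.values())
    return {choice: e / total for choice, e in exps.items()}


# Assumed utilities: V_k = ln(initial share) reproduces the 20/20/60 split.
v = {"red bus": math.log(0.2), "blue bus": math.log(0.2), "train": math.log(0.6)}

before = mnl_shares(v)
after = mnl_shares({choice: u for choice, u in v.items() if choice != "red bus"})

print("before:", {c: round(s, 2) for c, s in before.items()})
print("after: ", {c: round(s, 2) for c, s in after.items()})
print("blue bus / train ratio before:", round(before["blue bus"] / before["train"], 4))
print("blue bus / train ratio after: ", round(after["blue bus"] / after["train"], 4))
```

The blue bus to train ratio is the same before and after, so the model predicts shares of 25% and 75% rather than the more plausible 40% and 60%.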

Nested Logit Model
The multinomial logit model can be generalised to relax the IIA assumption:
– Nested Logit (Nested Multinomial Logit)
[Choice tree: Car (1) versus Public transport (2); within Public transport, Bus (3) versus Train (4)]
Characteristics of Bus and Train affect the decision of whether to use Car or Public Transport.
Estimate by sequential logits…

Nested Logit Model
The value placed on the choices available in the second stage (3, 4) enters into the calculation of choice probabilities in the first stage (2)…
Logit for bus versus train to estimate $V_3$ and $V_4$.
Define the Inclusive Value of public transport as $I_2 = \ln(\exp(V_3) + \exp(V_4))$
Estimate a logit model for Car (1) versus Public (2) using $I_2$ as an explanatory variable for the public transport option.
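The two-stage logic can be sketched as follows. This is one simple parameterisation, assumed for illustration: the inclusive value of public transport enters the first-stage logit with a hypothetical coefficient `lam`, and all utilities are made-up numbers. With `lam = 1` the model collapses back to the multinomial logit.

```python
import math


def inclusive_value(v_bus, v_train):
    """Inclusive value of the public transport nest: ln(exp(V_3) + exp(V_4))."""
    return math.log(math.exp(v_bus) + math.exp(v_train))


def nested_logit_probabilities(v_car, v_bus, v_train, lam=1.0):
    """Choice probabilities from a two-level nested logit (illustrative form)."""
    # Second stage: bus versus train, conditional on using public transport.
    p_bus_given_public = math.exp(v_bus) / (math.exp(v_bus) + math.exp(v_train))
    # First stage: car versus public transport, using the inclusive value.
    public_index = lam * inclusive_value(v_bus, v_train)
    p_public = math.exp(public_index) / (math.exp(v_car) + math.exp(public_index))
    return {
        "car": 1.0 - p_public,
        "bus": p_public * p_bus_given_public,
        "train": p_public * (1.0 - p_bus_given_public),
    }


# Hypothetical utilities and inclusive value coefficient.
probs = nested_logit_probabilities(v_car=1.0, v_bus=0.2, v_train=0.6, lam=0.5)
mnl_like = nested_logit_probabilities(v_car=1.0, v_bus=0.2, v_train=0.6, lam=1.0)
print({mode: round(p, 3) for mode, p in probs.items()})
```

Because bus and train share a nest, a change in the attractiveness of one of them shifts choices mainly within public transport rather than proportionally across all modes.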

Example: Transport mode choice
Asensio, J. (2002), Transport Mode Choice by Commuters to Barcelona's CBD, Urban Studies, 39(10)
Travel mode for suburban commuters
Sample of 1381 commuters from a travel survey
Records mode of transport and other individual characteristics
[Choice tree: Private car versus Public transport; within Public transport, Train versus Bus]

Example: Transport mode choice
Asensio, J. (2002), Transport Mode Choice by Commuters to Barcelona's CBD, Urban Studies, 39(10)
– Some selected coefficients

Variable                            Parameter
Cost                                -0.002
Travel time by car                  -0.054
Travel time by public transport     -0.018
Sex (car)                            0.889
Sex (bus)                           -1.001

We don't know the units of measurement, but how much more valuable is time saved by car than time saved by public transport?
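One way to approach the question is to compare the two time coefficients, and to divide each by the cost coefficient to express time in cost units. The sketch below uses only the coefficients listed on the slide; the units remain unknown, so only the ratios are meaningful.

```python
# Selected coefficients from the slide (units of measurement not reported).
coef = {"cost": -0.002, "time_car": -0.054, "time_public": -0.018}

# Relative disutility of a unit of travel time: car versus public transport.
relative_value = coef["time_car"] / coef["time_public"]

# Implied value of time in cost units per unit of time.
value_of_time_car = coef["time_car"] / coef["cost"]
value_of_time_public = coef["time_public"] / coef["cost"]

print(f"car time relative to public transport time: {relative_value:.1f}")
print(f"value of time, car: {value_of_time_car:.1f} cost units per time unit")
print(f"value of time, public transport: {value_of_time_public:.1f} cost units per time unit")
```

On these coefficients, a unit of time saved by car is worth about three times a unit saved by public transport.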

Other discrete choice applications
Firm location choices, e.g. Head, K. and T. Mayer (2004), Market Potential and the Location of Japanese Investment in the European Union, Review of Economics and Statistics, 86(4), 959-972 (seminar reading)
School choice, e.g. Barrow, L. (2002), School choice through relocation: evidence from the Washington, D.C. area, Journal of Public Economics, 86, 155-189
Migration destinations
Residential choice

Aggregate choice models

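Micro and aggregated choice models
Micro-level logit choice models often have aggregated equivalents, i.e. if you only have choice characteristics $x_k$, you could use a choice-level regression of the (log) proportion of individuals making each choice on the choice characteristics: $\ln(n_k / N) = \alpha + \beta x_k + u_k$, where $n_k$ is the number of individuals making choice k and N is the total number of individuals.
Obviously $\ln(n_k)$ would work too (why?)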

Micro and aggregated choice models
In fact, a Poisson model on aggregated data, $E[n_k] = \exp(\alpha + \beta x_k)$, gives exactly the same coefficient estimates as the conditional logit model, which is based on ML estimation of $\Pr(y_i = k) = \exp(\beta x_k) / \sum_j \exp(\beta x_j)$
See Guimaraes et al, Review of Economics and Statistics (2003)
– though this equivalence was known before that paper
Here's an example…

Data (295 i's, 3 j's)

id  choice    d  x
1   American  0  18.97627
1   Japan     0  7.542373
1   Europe    1  3.461017
2   American  1  18.97627
2   Japan     0  7.542373
2   Europe    0  3.461017
3   American  1  18.97627
3   Japan     0  7.542373
3   Europe    0  3.461017
4   American  0  18.97627
4   Japan     1  7.542373
4   Europe    0  3.461017
5   American  1  18.97627
5   Japan     0  7.542373
5   Europe    0  3.461017

Conditional logit

Conditional (fixed-effects) logistic regression   Number of obs   =        885
                                                  LR chi2(1)      =     129.65
                                                  Prob > chi2     =     0.0000
Log likelihood = -259.26785                       Pseudo R2       =     0.2000

------------------------------------------------------------------------------
      choice |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
           x |   .0999331   .0091997    10.86   0.000      .081902    .1179642
------------------------------------------------------------------------------

Simpler data

choice    n    x         p
American  192  18.97627  0.650847
Japan     64   7.542373  0.216949
Europe    39   3.461017  0.132203

Poisson

Poisson regression                                Number of obs   =          3
                                                  LR chi2(1)      =     129.65
                                                  Prob > chi2     =     0.0000
Log likelihood = -9.3973119                       Pseudo R2       =     0.8734

------------------------------------------------------------------------------
           n |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
           x |   .0999331   .0091997    10.86   0.000      .081902    .1179642
       _cons |   3.364614   .1450806    23.19   0.000     3.080262    3.648967
------------------------------------------------------------------------------
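The matching coefficients in the conditional logit and Poisson outputs can be checked by hand. The sketch below fits both models by Newton-Raphson on the three aggregated rows (n and x from the data slides); the function names and the choice of Newton-Raphson are illustrative assumptions, not a description of how Stata computes its estimates.

```python
import math

n = [192, 64, 39]                     # number of individuals making each choice
x = [18.97627, 7.542373, 3.461017]    # choice characteristic


def fit_conditional_logit(n, x, iterations=50):
    """Newton-Raphson for beta in P_k = exp(beta*x_k) / sum_j exp(beta*x_j)."""
    beta = 0.0
    total = sum(n)
    for _ in range(iterations):
        weights = [math.exp(beta * xk) for xk in x]
        denom = sum(weights)
        p = [w / denom for w in weights]
        mean_x = sum(pk * xk for pk, xk in zip(p, x))
        var_x = sum(pk * (xk - mean_x) ** 2 for pk, xk in zip(p, x))
        score = sum(nk * xk for nk, xk in zip(n, x)) - total * mean_x
        beta += score / (total * var_x)
    return beta


def fit_poisson(n, x, iterations=50):
    """Newton-Raphson for (alpha, beta) in E[n_k] = exp(alpha + beta*x_k)."""
    alpha = math.log(sum(n) / len(n))
    beta = 0.0
    for _ in range(iterations):
        mu = [math.exp(alpha + beta * xk) for xk in x]
        g_a = sum(nk - mk for nk, mk in zip(n, mu))
        g_b = sum((nk - mk) * xk for nk, mk, xk in zip(n, mu, x))
        h_aa = sum(mu)
        h_ab = sum(mk * xk for mk, xk in zip(mu, x))
        h_bb = sum(mk * xk * xk for mk, xk in zip(mu, x))
        det = h_aa * h_bb - h_ab * h_ab
        # Solve the 2x2 Newton step explicitly.
        alpha += (h_bb * g_a - h_ab * g_b) / det
        beta += (h_aa * g_b - h_ab * g_a) / det
    return alpha, beta


beta_clogit = fit_conditional_logit(n, x)
alpha_poisson, beta_poisson = fit_poisson(n, x)
print(f"conditional logit beta: {beta_clogit:.7f}")
print(f"Poisson beta:           {beta_poisson:.7f}, constant: {alpha_poisson:.6f}")
```

The two slope estimates coincide because the Poisson constant absorbs the total number of individuals, leaving the same first-order condition for the slope as in the conditional logit.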

OLS

. reg lnp x

      Source |       SS       df       MS              Number of obs =       3
-------------+------------------------------           F(  1,     1) =  370.23
       Model |  1.32738687     1  1.32738687           Prob > F      =  0.0331
    Residual |  .003585331     1  .003585331           R-squared     =  0.9973
-------------+------------------------------           Adj R-squared =  0.9946
       Total |   1.3309722     2  .665486102           Root MSE      =  .05988

------------------------------------------------------------------------------
         lnp |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
           x |    .101293   .0052644    19.24   0.033      .034403     .168183
       _cons |  -2.339238     .06295   -37.16   0.017    -3.139094   -1.539383
------------------------------------------------------------------------------
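The OLS slope can be reproduced from the three aggregated rows, and the same calculation shows why regressing the log count rather than the log proportion changes only the constant. The helper function below is an illustrative sketch, not part of the original lecture.

```python
import math

n = [192, 64, 39]                     # number of individuals making each choice
x = [18.97627, 7.542373, 3.461017]    # choice characteristic
N = sum(n)


def ols_slope_intercept(xs, ys):
    """Bivariate OLS: slope = Sxy / Sxx, intercept = mean(y) - slope * mean(x)."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    sxx = sum((a - mean_x) ** 2 for a in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


ln_p = [math.log(nk / N) for nk in n]   # log proportion making each choice
ln_n = [math.log(nk) for nk in n]       # log count making each choice

slope_p, const_p = ols_slope_intercept(x, ln_p)
slope_n, const_n = ols_slope_intercept(x, ln_n)
print(f"ln(p) on x: slope {slope_p:.6f}, constant {const_p:.6f}")
print(f"ln(n) on x: slope {slope_n:.6f}, constant {const_n:.6f}")
```

Since ln(n_k) = ln(p_k) + ln(N), the two regressions share the same slope and their constants differ by ln(N).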

Aggregate v micro choice models
Hence, there's little point in using conditional logit if you only have choice characteristics.
Conditional/multinomial logit is good if you have individual and group-level characteristics.
The aggregated OLS version gives rise to spatial interaction models of flows between origins and destinations, i.e. gravity models.
Widely applied (generally a-theoretically) in migration, trade and commuting applications:
– e.g. see Head (2003), Gravity for Beginners

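Gravity/spatial interaction/migration/trade models
Flow from place j to place k modelled as $\ln n_{jk} = \alpha + \beta x_j + \gamma x_k + \delta \ln d_{jk} + u_{jk}$, where $x_j$ and $x_k$ are characteristics of the source and destination and $d_{jk}$ is the distance between them.
Typically the characteristics of destination and source include some measure of attraction, e.g.:
– Population mass (or market potential in trade models)
– Wages (endogenous)
and a measure of the cost of moving between place j and place k (e.g. log distance).
Hence gravity, after Newton.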

Gravity/spatial interaction/migration/trade models
Strong distance decay effects:
– Typical elasticities -0.5 to -2.0
Even for internet site visits! See Blum and Goldfarb (2006), Journal of International Economics.
Trade literature has many examples: Disdier and Head (2003), The Puzzling Persistence of the Distance Effect on Bilateral Trade, Review of Economics and Statistics
– Finds mean distance elasticity of -0.9 from about 1500 studies
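To get a feel for what a distance elasticity of -0.9 means, the sketch below evaluates a simple gravity prediction at two distances. The functional form with unit mass elasticities, the function name, and all input values are assumptions made for illustration; only the -0.9 elasticity comes from the slide.

```python
def predicted_flow(mass_origin, mass_dest, distance,
                   scale=1.0, distance_elasticity=-0.9):
    """Gravity prediction: scale * M_j * M_k * distance ** elasticity (illustrative form)."""
    return scale * mass_origin * mass_dest * distance ** distance_elasticity


# Hypothetical origin and destination masses and distances.
near = predicted_flow(1000.0, 500.0, distance=100.0)
far = predicted_flow(1000.0, 500.0, distance=200.0)
ratio = far / near

print(f"flow at double the distance relative to the original: {ratio:.3f}")
```

With this elasticity, doubling the distance cuts the predicted flow to 2 ** -0.9 of its original level, a little over half.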

Conclusion
Generally possible to model choices as discrete, or as flows.
Discrete choice models offer the advantage of:
– Including micro-level (individual/firm) characteristics
– An underlying structural model (RUM)
Aggregate flow models:
– Simpler to compute
– No need for the distributional assumptions necessary for maximum likelihood (nonlinear) methods
– Can't separate individual from aggregate factors
