GY460 Techniques of Spatial Analysis Steve Gibbons Lecture 6: Probabilistic choice models.


Introduction
It is sometimes useful to model the choices of individuals, firms or other agents over discrete alternatives
– Choice of transport mode
– Choice of firm location amongst regions
– Choice of cities or country to migrate to
Theoretical framework
– Random utility model
Empirical methods
– Micro: probit, logit, multinomial logit
– Aggregate: Poisson, OLS, gravity

The Random Utility choice model

Random Utility Model
RUM underlies the economic interpretation of discrete choice models.
Developed by Daniel McFadden for econometric applications
– see JoEL January 2001 for Nobel lecture; also Manski (2001) Daniel McFadden and the Econometric Analysis of Discrete Choice, Scandinavian Journal of Economics, 103(2)
Preferences are functions of biological taste templates, experiences and other personal characteristics
– Some of these are observed, others unobserved
– Allows for taste heterogeneity
The discussion below is in terms of individual utility (e.g. migration, transport mode choice), but similar reasoning applies to firm choices.

Random Utility Model
Individual i's utility from a choice j can be decomposed into two components: $U_{ij} = V_{ij} + \varepsilon_{ij}$
$V_{ij}$ is deterministic: common to everyone, given the same characteristics and constraints
– representative tastes of the population, e.g. effects of time and cost on travel mode choice
$\varepsilon_{ij}$ is random
– reflects idiosyncratic tastes of i and unobserved attributes of choice j

Random Utility Model
$V_{ij}$ is a function of attributes of alternative j (e.g. price and time) and of observed consumer and choice characteristics z.
We are interested in finding the parameters of this function.
Let's forget about z for now, for simplicity.

RUM and binary choices
Consider two choices, e.g. bus or car.
We observe whether an individual uses one or the other.
Define $y_i = 1$ if individual i travels by bus and $y_i = 0$ if individual i travels by car.
What is the probability that we observe an individual choosing to travel by bus?
Assume utility maximisation: the individual chooses bus (y=1) rather than car (y=0) if the utility of commuting by bus exceeds the utility of commuting by car.

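RUM and binary choices
So choose bus if $V_{i,\text{bus}} + \varepsilon_{i,\text{bus}} > V_{i,\text{car}} + \varepsilon_{i,\text{car}}$
So the probability that we observe an individual choosing bus travel is $\Pr(y_i = 1) = \Pr(\varepsilon_{i,\text{car}} - \varepsilon_{i,\text{bus}} < V_{i,\text{bus}} - V_{i,\text{car}})$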
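
The choice rule above can be sketched numerically. The sketch assumes, purely for illustration, that the difference in random terms (car minus bus) follows a standard logistic distribution, so that the probability of choosing bus is the logistic CDF evaluated at the difference in deterministic utilities. The function names and utility values are hypothetical.

```python
import math

def logistic_cdf(d):
    """Standard logistic CDF: the assumed distribution of eps_car - eps_bus."""
    return 1.0 / (1.0 + math.exp(-d))

def prob_bus(v_bus, v_car):
    """P(bus) = P(eps_car - eps_bus < V_bus - V_car) under the logistic assumption."""
    return logistic_cdf(v_bus - v_car)

# Hypothetical deterministic utilities for one commuter
print(prob_bus(0.0, 0.0))    # equal deterministic utilities: the choice is a coin flip
print(prob_bus(-1.0, -0.5))  # bus is less attractive, so P(bus) falls below one half
```

Only the difference in deterministic utilities matters: adding the same constant to both modes leaves the probability unchanged.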

The linear probability model
Assume the probability of choosing bus depends linearly on observed characteristics (price and time).
Then the parameters can be estimated by a linear regression of $y_i$ on those characteristics, where $y_i$ is the dummy variable for mode choice (1 if bus, 0 if car).
Other consumer and choice characteristics can be included (the z's in the first slide in this section).

The linear probability model
Unfortunately this has some undesirable properties.
[Figure: linear regression line plotted against a probability axis running from 0 to 1]
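
One well-known undesirable property is that the fitted line is not bounded, so predicted probabilities can fall below 0 or above 1. The sketch below fits a one-variable linear probability model by closed-form OLS on made-up data; the variable `x` is a hypothetical characteristic (say, the cost advantage of bus) and is not taken from the lecture.

```python
def ols_fit(x, y):
    """Closed-form simple OLS: returns (intercept, slope)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

# Hypothetical data: y = 1 if bus, 0 if car
x = [0, 1, 2, 3, 4, 5, 6, 7]
y = [0, 0, 0, 0, 1, 1, 1, 1]

intercept, slope = ols_fit(x, y)
fitted = [intercept + slope * xi for xi in x]

# The fitted "probabilities" at the extremes lie outside the unit interval
print(min(fitted), max(fitted))
```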

Non-linear probability model
Better for the probability function to have a shape something like:
[Figure: probability curve bounded between 0 and 1]

Probits and logits
Common assumptions:
– Cumulative normal distribution function: probit
– Logistic function: logit
Estimation by maximum likelihood
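
As a rough illustration of the two functional forms and of what maximum likelihood maximises, the sketch below codes the probit and logit probabilities and the binary log-likelihood for a single explanatory variable. The data, the parameter names `b0` and `b1`, and the two parameter values compared are all hypothetical; a real application would use an optimiser rather than comparing two points.

```python
import math

def probit_prob(index):
    """Cumulative standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(index / math.sqrt(2.0)))

def logit_prob(index):
    """Logistic function."""
    return 1.0 / (1.0 + math.exp(-index))

def log_likelihood(prob_fn, b0, b1, xs, ys):
    """Binary log-likelihood: sum of y*ln(p) + (1-y)*ln(1-p)."""
    total = 0.0
    for x, y in zip(xs, ys):
        p = prob_fn(b0 + b1 * x)
        total += y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total

# Hypothetical data: y tends to be 1 when x is large
xs = [-2.0, -1.0, 0.0, 1.0, 2.0]
ys = [0, 0, 1, 0, 1]

for name, fn in (("logit", logit_prob), ("probit", probit_prob)):
    flat = log_likelihood(fn, 0.0, 0.0, xs, ys)    # x has no effect
    sloped = log_likelihood(fn, 0.0, 1.0, xs, ys)  # positive effect of x
    print(name, flat, sloped)
```

In both models the positive slope gives a higher log-likelihood than the flat model on these data, which is the direction a maximum likelihood routine would move in.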

Example
McFadden, D. (1974) The Measurement of Urban Travel Demand, Journal of Public Economics, 3
Methods of commuting in San Francisco Bay area

Example 1
McFadden (1974): car versus bus commute modes in SF Bay area
Characteristics | Coefficient | t
Family income $ | | (0.774)
Car-bus cost, cents per round trip | * | (3.726)
Car-bus vehicle time costs (one way minutes x wage) | | (2.460)
Bus total access time costs (one way minutes x wage) | | (0.818)
Constant | 0.3832 | (0.428)

Multiple choices and the multinomial logit

Multiple choices
We often want to think about many more than two choices
– Choice of regional location
– Choice of transport mode with many alternatives
– Choice amongst a sample of schools
How can we extend the binary choice logit model?
The random utility model extends to many choices.
Choose choice k if its utility is higher than that of all other choices: $V_{ik} + \varepsilon_{ik} > V_{ij} + \varepsilon_{ij}$ for all $j \neq k$

Multinomial logit (1)
Again we need to assume some distribution for the unobserved factor.
One type of distribution (extreme value) gives a simple solution for the probability that choice k is made: $\Pr(y_i = k) = \dfrac{\exp(V_{ik})}{\sum_j \exp(V_{ij})}$
This is a generalisation of the logit model to many alternatives: the multinomial logit or conditional logit.
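
The multinomial logit probability is straightforward to compute once the deterministic utilities are known. The sketch below is a minimal illustration with hypothetical utility values; subtracting the maximum before exponentiating is a numerical-stability device and does not change the result.

```python
import math

def mnl_probabilities(utilities):
    """Multinomial logit: P_k = exp(V_k) / sum_j exp(V_j)."""
    shift = max(utilities)  # guards against overflow; cancels in the ratio
    exp_v = [math.exp(v - shift) for v in utilities]
    total = sum(exp_v)
    return [e / total for e in exp_v]

# Hypothetical deterministic utilities for bus, train and car
probs = mnl_probabilities([0.0, 0.5, 1.0])
print(probs, sum(probs))

# With two alternatives the formula collapses to the binary logit
print(mnl_probabilities([1.0, 0.0])[0], 1.0 / (1.0 + math.exp(-1.0)))
```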

Multinomial logit (2)
Recall: $V_{ij}$ is a linear function of observed characteristics of the individuals and their choices, e.g. for travel mode choice $V_{ij} = \beta\,\text{price}_j + \gamma_j\,\text{income}_i$
Parameters estimated:
For an individual characteristic that is common across choices (e.g. income, gender): one parameter per choice
– For at least one choice this is zero (base case)
For a characteristic which varies only across choices (e.g. price of transport): one parameter common across choices

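Example: Value of time
MNL models are used to estimate the value of travel time from observed commuter behaviour.
Three transport choices: bus (0), train (1), car (2)
Choosing bus as the base case, the log odds of train and car relative to bus are $\ln(P_{i1}/P_{i0}) = V_{i1} - V_{i0}$ and $\ln(P_{i2}/P_{i0}) = V_{i2} - V_{i0}$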
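
In models of this kind the value of travel time is usually read off as the ratio of the time coefficient to the cost coefficient, i.e. the amount of money that leaves utility unchanged when travel time falls by one unit. The sketch below shows the arithmetic; the coefficient values and units (utility per minute, utility per cent) are hypothetical and are not taken from any of the studies cited.

```python
def value_of_time(beta_time, beta_cost):
    """Money a traveller would give up per unit of time saved (ratio of marginal utilities)."""
    return beta_time / beta_cost

# Hypothetical MNL coefficients
beta_time = -0.04  # utility per minute of travel time
beta_cost = -0.01  # utility per cent of travel cost

vot_cents_per_minute = value_of_time(beta_time, beta_cost)
vot_dollars_per_hour = vot_cents_per_minute * 60 / 100
print(vot_cents_per_minute, vot_dollars_per_hour)
```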

Example 1: Value of time
For example, from Truong and Hensher, Economic Journal, 95 (1985) p. 15, for bus/train/car choices in Sydney 1982

Example 2: immigration
Scott, Coomes and Izyumov (2005) The Location Choice of Employment-Based Immigrants among U.S. Metro Areas, Journal of Regional Science, 45(1)
Estimate the impact of metropolitan area characteristics on destination choice for US migrants in destination MSAs

Example 2: immigration
Source: Scott, Coomes et al. (note: they also report models which include individual Xs)

The independence of irrelevant alternatives problem (IIA) and the nested logit model

Multinomial logit and IIA
Many applications in economic and geographical journals (and other research areas)
The multinomial logit model is the workhorse of multiple choice modelling in all disciplines.
Easy to compute
But it has a drawback

Independence of Irrelevant Alternatives
Consider market shares
– Red bus 20%
– Blue bus 20%
– Train 60%
IIA assumes that if the red bus company shuts down, the market shares become
– Blue bus 20% + 5% = 25%
– Train 60% + 15% = 75%
Because the ratio of blue bus trips to train trips must stay at 1:3

Independence of Irrelevant Alternatives
Model assumes that unobserved attributes of all alternatives are perceived as equally similar.
But will people unable to travel by red bus really switch to travelling by train?
Most likely outcome is (assuming supply of bus seats is elastic)
– Blue bus: 40%
– Train: 60%
This failure of multinomial/conditional logit models is called the Independence of Irrelevant Alternatives assumption (IIA).

Independence of Irrelevant Alternatives
It is easy to see why this is:
The ratio of the probabilities of choosing k (e.g. red bus) and another choice l (e.g. train) is just $\dfrac{\Pr(y_i = k)}{\Pr(y_i = l)} = \dfrac{\exp(V_{ik})}{\exp(V_{il})}$
All other choices drop out of this odds ratio.
There are models that overcome this, e.g. …
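
The red bus / blue bus arithmetic can be reproduced with a few lines of code. In this sketch the deterministic utilities are set, as an assumption, to the logs of the original market shares so that the multinomial logit formula reproduces 20%, 20% and 60%; removing the red bus then rescales the remaining shares while leaving their ratio untouched.

```python
import math

def mnl_shares(utilities):
    """Multinomial logit shares from a dict of deterministic utilities."""
    exp_v = {name: math.exp(v) for name, v in utilities.items()}
    total = sum(exp_v.values())
    return {name: e / total for name, e in exp_v.items()}

# Utilities chosen (hypothetically) so that initial shares are 20% / 20% / 60%
utilities = {
    "red bus": math.log(0.2),
    "blue bus": math.log(0.2),
    "train": math.log(0.6),
}

before = mnl_shares(utilities)
after = mnl_shares({k: v for k, v in utilities.items() if k != "red bus"})

ratio_before = before["blue bus"] / before["train"]
ratio_after = after["blue bus"] / after["train"]

print(after)                      # blue bus and train shares once red bus is removed
print(ratio_before, ratio_after)  # the blue bus : train ratio is unchanged
```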

Nested Logit Model
The multinomial logit model can be generalised to relax the IIA assumption
– Nested Logit (Nested Multinomial Logit)
[Diagram: first-stage choice between Car (1) and Public transport (2); within Public transport, second-stage choice between Bus (3) and Train (4)]
Characteristics of Bus and Train affect the decision of whether to use Car or Public Transport.
Estimate by sequential logits…

Nested Logit Model
Value placed on choices available in the second stage (3,4) enters into the calculation of choice probabilities in the first stage (2)…
Logit for bus versus train to estimate $V_3$ and $V_4$
Define the Inclusive Value of public transport as $I_2 = \ln(\exp(V_3) + \exp(V_4))$
Estimate a logit model for Car (1) versus Public (2) using the inclusive value as an explanatory variable for the public transport option.
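
The two-stage calculation can be sketched as follows. Here `v_bus` and `v_train` stand for the indices from the second-stage (bus versus train) logit, `v_car` for the car index in the first stage, and `lam` for the coefficient on the inclusive value in the first-stage logit; all of these names and values are hypothetical. When `lam` equals 1 the nested model collapses to the ordinary multinomial logit, which gives a simple check.

```python
import math

def inclusive_value(v_bus, v_train):
    """Log-sum of the second-stage utilities: the inclusive value of public transport."""
    return math.log(math.exp(v_bus) + math.exp(v_train))

def nested_probabilities(v_car, v_bus, v_train, lam=1.0):
    """Choice probabilities from two sequential logits (car vs public, then bus vs train)."""
    iv = inclusive_value(v_bus, v_train)
    # First stage: car versus public transport, with lam * IV as the value of public transport
    p_public = math.exp(lam * iv) / (math.exp(v_car) + math.exp(lam * iv))
    # Second stage: bus versus train, conditional on choosing public transport
    p_bus_given_public = math.exp(v_bus) / (math.exp(v_bus) + math.exp(v_train))
    return {
        "car": 1.0 - p_public,
        "bus": p_public * p_bus_given_public,
        "train": p_public * (1.0 - p_bus_given_public),
    }

# Hypothetical utilities
print(nested_probabilities(0.5, 0.0, 0.3, lam=0.6))
print(nested_probabilities(0.5, 0.0, 0.3, lam=1.0))  # same as plain multinomial logit
```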

Example: Transport mode choice
Asensio, J., Transport Mode Choice by Commuters to Barcelona's CBD, Urban Studies, 39(10), 2002
Travel mode for suburban commuters
Sample of 1381 commuters from a travel survey
Records mode of transport and other individual characteristics
[Diagram: choice between Private car and Public transport; within Public transport, choice between Train and Bus]

Example: Transport mode choice
Asensio, J., Transport Mode Choice by Commuters to Barcelona's CBD, Urban Studies, 39(10), 2002
– Some selected coefficients
Variable | Parameter
Cost |
Travel time by car |
Travel time by public transport |
Sex (car) | 0.889
Sex (bus) |
We don't know the units of measurement, but how much more valuable is time saved by car than time saved by public transport?

Other discrete choice applications
Firm location choices, e.g. Head, K. and T. Mayer (2004), Market Potential and the Location of Japanese Investment in the European Union, Review of Economics and Statistics, 86(4) (seminar reading)
School choice, e.g. Barro, L. (2002) School choice through relocation: evidence from the Washington, D.C. area, Journal of Public Economics, 86
Migration destinations
Residential choice

Aggregate choice models

Micro and aggregated choice models
Micro level logit choice models often have aggregated equivalents, i.e. if you only have choice characteristics, you could use a choice-level regression of the log proportion of individuals making each choice on the choice characteristics: $\ln(p_k) = \alpha + \beta x_k + u_k$
Obviously log(n_k) would work too (why?)
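
One way to see why log(n_k) works too: the count is the proportion times the total number of individuals N, so ln(n_k) = ln(N) + ln(p_k) and only the intercept changes. The sketch below checks this with a closed-form OLS fit; the three counts and the values of x are made up for illustration.

```python
import math

def ols_fit(x, y):
    """Closed-form simple OLS: returns (intercept, slope)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

# Hypothetical aggregated data: one row per choice
x = [1.0, 2.0, 4.0]       # choice characteristic
counts = [150, 95, 50]    # number of individuals making each choice
N = sum(counts)

ln_p = [math.log(c / N) for c in counts]
ln_n = [math.log(c) for c in counts]

a_p, b_p = ols_fit(x, ln_p)
a_n, b_n = ols_fit(x, ln_n)

print(b_p, b_n)               # identical slopes
print(a_n - a_p, math.log(N)) # intercepts differ by ln(N)
```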

Micro and aggregated choice models
In fact, a Poisson model on aggregated data, $E[n_k] = \exp(\alpha + \beta x_k)$, gives exactly the same coefficient estimates as the conditional logit model, which is based on ML estimation of $\Pr(y_i = k) = \dfrac{\exp(\beta x_k)}{\sum_j \exp(\beta x_j)}$
See Guimaraes et al Restats (2003)
– though this equivalence was known before this discovery
Here's an example…
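
The equivalence can be checked numerically on a toy data set. The sketch below fits both models by Newton-Raphson, assuming a single choice characteristic x and made-up counts (these are not the lecture's data); the function names are hypothetical. The slope estimates coincide, and the Poisson constant absorbs the conditional logit denominator.

```python
import math

x = [1.0, 2.0, 4.0]   # hypothetical choice characteristic
n = [150, 95, 50]     # hypothetical number of individuals making each choice

def fit_conditional_logit(x, n, iterations=50):
    """Newton-Raphson for beta in P_k = exp(beta*x_k) / sum_j exp(beta*x_j)."""
    beta = 0.0
    total = sum(n)
    for _ in range(iterations):
        w = [math.exp(beta * xk) for xk in x]
        sum_w = sum(w)
        p = [wk / sum_w for wk in w]
        mean_x = sum(pk * xk for pk, xk in zip(p, x))
        var_x = sum(pk * (xk - mean_x) ** 2 for pk, xk in zip(p, x))
        score = sum(nk * (xk - mean_x) for nk, xk in zip(n, x))
        beta += score / (total * var_x)  # minus score over Hessian
    return beta

def fit_poisson(x, n, iterations=50):
    """Newton-Raphson for (alpha, beta) in E[n_k] = exp(alpha + beta*x_k)."""
    alpha, beta = math.log(sum(n) / len(n)), 0.0
    for _ in range(iterations):
        mu = [math.exp(alpha + beta * xk) for xk in x]
        g0 = sum(nk - mk for nk, mk in zip(n, mu))
        g1 = sum(xk * (nk - mk) for xk, nk, mk in zip(x, n, mu))
        a00 = sum(mu)
        a01 = sum(xk * mk for xk, mk in zip(x, mu))
        a11 = sum(xk * xk * mk for xk, mk in zip(x, mu))
        det = a00 * a11 - a01 * a01
        alpha += (a11 * g0 - a01 * g1) / det
        beta += (a00 * g1 - a01 * g0) / det
    return alpha, beta

beta_clogit = fit_conditional_logit(x, n)
alpha_poisson, beta_poisson = fit_poisson(x, n)
print(beta_clogit, beta_poisson)
```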

Data (295 i's, 3 j's)
[Table: one row per individual and alternative (American, Japan, Europe), with columns id, choice, d, x]

Conditional logit
[Stata output: conditional (fixed-effects) logistic regression of choice on x; Number of obs = 885; reports LR chi2(1), Prob > chi2, log likelihood, pseudo R2, and the coefficient, standard error, z, P>|z| and 95% confidence interval for x]

Simpler data
[Table: one row per choice (American, Japan, Europe), with columns choice, n, x, p]

Poisson
[Stata output: Poisson regression of n on x; Number of obs = 3; reports LR chi2(1), Prob > chi2, log likelihood, pseudo R2, and the coefficient, standard error, z, P>|z| and 95% confidence interval for x and _cons]

OLS
. reg lnp x
[Stata output: OLS regression of lnp on x; reports the ANOVA table (Model, Residual, Total), number of obs, F(1, 1), Prob > F, R-squared, adjusted R-squared, root MSE, and the coefficient, standard error, t, P>|t| and 95% confidence interval for x and _cons]

Aggregate v micro choice models
Hence, there's little point in using conditional logit if you only have choice characteristics.
Conditional/multinomial logit is good if you have individual and group-level characteristics.
The aggregated OLS version gives rise to spatial interaction models of flows between origins and destinations (gravity models).
Widely applied (generally a-theoretically) in migration, trade and commuting applications
– e.g. see Head (2003) Gravity for beginners

Gravity/spatial interaction/migration/trade models
Flow from place j to place k is modelled as a function of the characteristics of the source j, the characteristics of the destination k and the cost of moving between them.
Typically the characteristics of destination and source include some measure of attraction, e.g.
– population mass (or market potential in trade models)
– wages (endogenous)
And a measure of the cost of moving between place j and place k (e.g. log distance)
Hence "gravity", after Newton

Gravity/spatial interaction/migration/trade models
Strong distance decay effects
– Typical elasticities -0.5 to -2.0
– Even for internet site visits! See Blum and Goldfarb (2006) Journal of International Economics
Trade literature has many examples
– Disdier and Head (2003) The Puzzling Persistence of the Distance Effect on Bilateral Trade, Review of Economics and Statistics
– Finds mean distance elasticity of -0.9 from about 1500 estimates
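
To give a feel for what these elasticities mean, the sketch below uses a generic constant-elasticity gravity equation, assumed here for illustration: flow = scale × mass_j × mass_k × distance^elasticity. The function name, the unit mass exponents and the input values are hypothetical. With a distance elasticity of -0.9, doubling the distance cuts the predicted flow by a little under half.

```python
def gravity_flow(mass_j, mass_k, distance, dist_elasticity, scale=1.0):
    """Generic log-linear gravity prediction with unit mass elasticities (an assumption)."""
    return scale * mass_j * mass_k * distance ** dist_elasticity

# Effect of doubling distance for the range of elasticities quoted above
for elasticity in (-0.5, -0.9, -2.0):
    near = gravity_flow(100.0, 200.0, 50.0, elasticity)
    far = gravity_flow(100.0, 200.0, 100.0, elasticity)
    print(elasticity, far / near)  # equals 2 ** elasticity
```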

Conclusion
It is generally possible to model choices as discrete, or as flows.
Discrete choice models offer the advantage of
– Including micro-level (individual/firm) characteristics
– An underlying structural model (RUM)
Aggregate flow models
– Simpler to compute
– No need for the distributional assumptions necessary for maximum likelihood (nonlinear) methods
– Can't separate individual from aggregate factors