Error Component Models. Ric Scarpa. Prepared for the Choice Modelling Workshop, 1st and 2nd of May, Brisbane Powerhouse, New Farm, Brisbane.



Presentation structure The basic MNL model Types of heteroskedasticity in logit models Structure of error components Estimation Applications in env. economics –Flexible substitution patterns –Choice modeling Future perspectives (debate)

ML – RUM Specification The utility from individual i choosing alternative j is given by: U_ij = V_ij + ε_ij, with V_ij = x_ij′β. Assume the error is Gumbel (type I extreme value), i.e., F(ε_ij) = exp(−exp(−ε_ij)).

ML Choice Probabilities Given the distributional assumptions and the representative agent specification, then defining V_ij = x_ij′β we have that: P_ij = Pr(U_ij > U_ik for all k ≠ j) = Pr(ε_ik − ε_ij < V_ij − V_ik for all k ≠ j).

ML Choice Probabilities (cont’d) Thus, we have the conditional choice probability: P(i chooses j | ε_ij) = ∏_{k≠j} exp(−exp(−(ε_ij + V_ij − V_ik))). Taking the expectation of this with respect to ε_ij yields the unconditional choice probability: P_ij = ∫ ∏_{k≠j} exp(−exp(−(ε_ij + V_ij − V_ik))) · exp(−ε_ij) exp(−exp(−ε_ij)) dε_ij.

ML Choice Probabilities (cont’d) Consider a change of variables t = exp(−ε_ij), so that dt = −exp(−ε_ij) dε_ij; as ε_ij runs from −∞ to ∞, t runs from ∞ down to 0.

ML Choice Probabilities (cont’d) Evaluating the resulting integral in t gives the closed-form logit probability: P_ij = exp(V_ij) / Σ_k exp(V_ik).
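The closed-form probability above is easy to check numerically. A minimal sketch in plain Python (function name and utility values are illustrative, not from the slides):

```python
import math

def logit_probs(v):
    """Multinomial logit probabilities P_ij = exp(V_ij) / sum_k exp(V_ik)."""
    m = max(v)                         # subtract the max for numerical stability
    e = [math.exp(x - m) for x in v]
    s = sum(e)
    return [x / s for x in e]

# Three alternatives with illustrative representative utilities
p = logit_probs([1.0, 0.5, -0.2])
```

By construction the probabilities are strictly positive and sum to one, matching the merits listed on the next slide.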

Merits of ML Specification The log-likelihood function is globally concave in its parameters (McFadden, 1973) Choice probabilities lie strictly within the unit interval and sum to one The log-likelihood function has a relatively simple form

Utility Variance in ML Specifications The ML specification assumes that the unobserved sources of heterogeneity are independently and identically distributed across individuals and alternatives; i.e., ε_ij ~ i.i.d. Gumbel, where E(ε_ij) equals Euler’s constant and Var(ε_ij) = π²/6. The variance depends on the scale of utility, which is normalized, so the model is basically homoskedastic in most applications. This is a problem, as it leads to biased estimates if the variance of utilities actually varies in real life, which is a likely phenomenon. Because the scale effect is multiplicative, the bias is likely to be big.

Scale heteroskedasticity …or Gumbel error heteroskedasticity. SP/RP joint response analysis allowed for minimal heteroskedasticity (variance switch from SP to RP): λ_i = exp(γ × 1_i(RP)). Choice complexity work introduced λ_i = exp(γ′z_i), where z_i is a measure of the complexity of choice context i. Respondent cognitive effort: λ_n = exp(γ′s_n), where s_n is a measure of cognitive ability of respondent n.
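As a sketch of how a scale term of the form λ = exp(γ′z) operates (hypothetical function and a single covariate, purely for illustration), the snippet below multiplies the representative utilities by λ before the logit transform; a larger λ means less error variance and sharper choice probabilities:

```python
import math

def scaled_logit_probs(v, gamma, z):
    """Scale-heteroskedastic logit: the Gumbel scale lambda = exp(gamma'z)
    multiplies the representative utilities before the logit transform."""
    lam = math.exp(sum(g * zi for g, zi in zip(gamma, z)))
    m = max(lam * vi for vi in v)      # stabilize the exponentials
    e = [math.exp(lam * vi - m) for vi in v]
    s = sum(e)
    return [x / s for x in e]

# Higher z (e.g. cognitive ability) -> larger lambda -> sharper choices
p_low  = scaled_logit_probs([1.0, 0.0], gamma=[0.5], z=[0.0])   # lambda = 1
p_high = scaled_logit_probs([1.0, 0.0], gamma=[0.5], z=[2.0])   # lambda = e
```

With a larger λ the probability of the higher-utility alternative rises, which is the "less noisy respondent" effect the slide describes.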

Scale Het. limitations While scale heteroskedasticity allows the treatment of heteroskedasticity in the choice/respondent context, it does not allow heteroskedasticity across utilities within the same choice context. People may inherently associate more utility variance with less familiar alternatives (e.g. unknown destinations, hypothetical alternatives) than with better known ones (e.g. frequently attended sites, the status quo option).

Mixed logit The mixed logit model is defined as any model whose choice probabilities can be expressed as P_ij = ∫ L_ij(β) f(β|θ) dβ, where L_ij(β) is a logit choice probability; i.e., L_ij(β) = exp(V_ij(β)) / Σ_k exp(V_ik(β)), and f(β|θ) is the density function for β, with underlying parameters θ. V_ij(β) denotes the representative utility function.

Special Cases Case #1: MNL results if the density function is degenerate; i.e., f(β|θ) places all of its mass on a single point b, so that P_ij = L_ij(b).

Special Cases Case #2: The finite mixture (latent class) logit model results if the density function is discrete; i.e., β takes the value b_m with probability s_m, so that P_ij = Σ_m s_m L_ij(b_m).

Notes on Mixed Logit (MXL) Train emphasizes two interpretations of the MXL model –Random parameters (variation of taste intensities) –Error components (heteroskedastic utilities) Mixed logit probabilities are simply weighted averages of logit probabilities, with weights given by the mixing density f(β|θ) The goal of the research is to estimate the underlying parameter vector θ

Simulation Estimation Simulation methods are typically used to estimate mixed logit models. Recall that the choice probabilities are given by P_ij = ∫ L_ij(β) f(β|θ) dβ, where L_ij(β) = exp(V_ij(β)) / Σ_k exp(V_ik(β)).

Simulation Estimation (cont’d) For any given value of θ, one can generate draws β^r (r = 1, …, R) from f(β|θ), which can then be used to compute the simulated probability: P̃_ij = (1/R) Σ_r L_ij(β^r).

Simulation Estimation (cont’d) The simulated log-likelihood for the panel of t choices becomes: SLL(θ) = Σ_n ln[ (1/R) Σ_r Π_t L_nt(β^r) ], where L_nt(β^r) is the logit probability of the alternative chosen by n in choice occasion t.
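A minimal sketch of the simulation estimator just described, assuming a single normally distributed coefficient (all names, draw counts, and data are illustrative): logit probabilities are averaged over R draws, and the panel simulated log-likelihood multiplies logit probabilities across a respondent's choices inside the average.

```python
import math
import random

def logit_prob(v, j):
    """Logit probability of alternative j given utilities v."""
    m = max(v)
    e = [math.exp(x - m) for x in v]
    return e[j] / sum(e)

def simulated_prob(x, j, mean, sd, R=2000, seed=0):
    """Average logit probabilities over R draws of beta ~ N(mean, sd)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(R):
        beta = rng.gauss(mean, sd)
        total += logit_prob([beta * xi for xi in x], j)
    return total / R

def simulated_loglik(people, mean, sd, R=2000, seed=0):
    """Panel simulated log-likelihood: for each person, average over draws
    the product of logit probabilities across that person's choices."""
    rng = random.Random(seed)
    draws = [rng.gauss(mean, sd) for _ in range(R)]
    ll = 0.0
    for choices in people:             # choices: list of (x, chosen_j) pairs
        mean_p = 0.0
        for beta in draws:
            prod = 1.0
            for x, j in choices:
                prod *= logit_prob([beta * xi for xi in x], j)
            mean_p += prod
        ll += math.log(mean_p / R)
    return ll

p0 = simulated_prob([1.0, 0.5, 0.0], 0, mean=0.8, sd=0.5)
ll = simulated_loglik([[([1.0, 0.0], 0), ([1.0, 0.0], 0)]], mean=0.8, sd=0.5)
```

Note that the product over choice occasions sits inside the average over draws, which is what makes this a panel (rather than cross-sectional) simulated likelihood.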

Error Components Interpretation The mixed logit model is generated in the RUM model by assuming that U_ij = x_ij′β + z_ij′μ_i + ε_ij, where x_ij and z_ij are both observed, μ_i is a random vector with mean zero, and ε_ij is i.i.d. Gumbel.

Error Components Interpretation (cont’d) The error components perspective views the additional random terms as tools for inducing specific patterns of correlation across alternatives: Cov(U_ij, U_ik) = z_ij′ Σ z_ik, where Σ is the covariance matrix of μ_i.

Example – Mimicking NL Consider a nesting structure in which individuals either stay at home (j=0) or take a trip, with the trip alternatives divided between Nest A and Nest B.

Example (cont’d) The corresponding correlation structure among error components (and utilities) is block-structured: utilities are correlated within each nest and uncorrelated across nests, where the strength of the within-nest correlation is governed by the nest-specific variance.

Example (cont’d) We can build up this covariance structure using error components: U_ij = V_ij + σ_A d_jA η_iA + σ_B d_jB η_iB + ε_ij, with η_iA, η_iB ~ N(0,1) and d_jA, d_jB indicators for membership of alternative j in nest A or B.

Example (cont’d) The resulting covariance structure becomes Cov(U_ij, U_ik) = σ_A² if j and k both lie in nest A, σ_B² if both lie in nest B, and 0 otherwise, with Var(U_ij) = σ_A² d_jA + σ_B² d_jB + π²/6.
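The nest-mimicking covariance structure can be written out programmatically. The sketch below (hypothetical function, assuming unit-variance error components scaled by nest-specific σ's) adds σ_m² to every pair of alternatives sharing nest m and the Gumbel variance π²/6 to the diagonal:

```python
import math

def ec_covariance(nest, sigma):
    """Utility covariance implied by nest-specific error components:
    sigma_m^2 is shared by every pair of alternatives in nest m, and the
    i.i.d. Gumbel variance pi^2/6 sits on the diagonal."""
    J = len(nest)
    cov = [[0.0] * J for _ in range(J)]
    for j in range(J):
        for k in range(J):
            if nest[j] is not None and nest[j] == nest[k]:
                cov[j][k] += sigma[nest[j]] ** 2
            if j == k:
                cov[j][k] += math.pi ** 2 / 6
    return cov

# Stay-home (no nest), two trip alternatives in nest A, two in nest B
C = ec_covariance([None, "A", "A", "B", "B"], {"A": 1.0, "B": 0.5})
```

Overlapping nests, which NL cannot handle, would simply add a second component (another column of indicators) to the same construction.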

Example (cont’d) One limitation of the NL model is that one has to fix the nesting structure MXL can be used to create overlapping nests

Herriges and Phaneuf (2002) Covariance Pattern

Implications for Elasticity Patterns In general, elasticities are given by an average of conditional logit elasticities over the mixing distribution: E_ij = ∫ e_ij(β) [L_ij(β)/P_ij] f(β|θ) dβ.

Implications for Elasticity Patterns (cont’d) where e_ij(β) denotes the standard logit response elasticity (i.e., without nesting) conditional on a specific draw of the vector β, and L_ij(β)/P_ij denotes the relative odds that alternative j is selected (i.e., conditional versus unconditional odds).

Illustration – Choice Probabilities

Choice modeling An error component enters the hypothetical alternatives, yet is absent from the SQ (or no-choice) alternative. The induced variance structure across utilities is: Var(U_SQ) = π²/6, Var(U_hyp) = σ² + π²/6, and Cov = σ² between any two hypothetical alternatives.
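For this status-quo design the implied variance structure can be stated in a few lines (σ is an illustrative value): the shared error component raises the variance of every hypothetical alternative and induces a common covariance among them, while the SQ alternative keeps the bare Gumbel variance.

```python
import math

# sigma: illustrative standard deviation of the error component shared by
# the hypothetical alternatives; the SQ alternative has no such component.
sigma = 1.2
gumbel_var = math.pi ** 2 / 6

var_sq  = gumbel_var                # Var(U_SQ): Gumbel error only
var_hyp = sigma ** 2 + gumbel_var   # Var(U_hyp): error component + Gumbel
cov_hyp = sigma ** 2                # Cov between two hypothetical alternatives
```

The gap between var_hyp and var_sq equals cov_hyp, which is exactly the single extra parameter (the error-component standard deviation) the next slide refers to.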

Effect A fairly general result is that the error component improves fit while requiring few additional parameters (only the standard deviation of the error component). It can be decomposed by socio-economic covariates (e.g. the spread of the error varies across segments of respondents).

Adoption and state of practice Error component estimators have now been incorporated into commercial software (e.g. Nlogit 4). Given their properties and the flexibility they afford, they are likely to be increasingly used in practice.