Binary Response Lecture 22

Today's plan
- Three models: the linear probability model, the probit model, and the logit model
- L22.xls provides an example of a linear probability model and a logit model

Discrete choice variable
Defining variables:
- $Y_i = 1$ if the individual takes BART, buys a car, or joins a union
- $Y_i = 0$ if the individual does not take BART, does not buy a car, or does not join a union
The discrete choice variable $Y_i$ is a function of individual characteristics: $Y_i = a + bX_i + e_i$

Graphical representation
X = years of labor market experience
Y = 1 if the person joins a union, 0 if the person does not
[Figure: observed (X, Y) data with the fitted OLS regression line; the Y values sit only at 0 and 1]

Linear probability model
The OLS regression line in the previous slide is called the linear probability model: it predicts the probability that an individual will join a union given his or her years of labor market experience.
Using the linear probability model, we estimate the equation $\hat{Y}_i = \hat{a} + \hat{b}X_i$; using $\hat{Y}_i$ we can predict the probability that $Y_i = 1$.
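As a minimal sketch of this step (Python, with made-up data standing in for L22.xls; none of these numbers come from the lecture), the linear probability model is just OLS on the 0/1 outcome:

```python
import numpy as np

# Hypothetical stand-in data: X = years of experience, Y = 1 if union member
X = np.array([1, 3, 5, 8, 12, 15, 20, 25, 30, 35], dtype=float)
Y = np.array([0, 0, 0, 1, 0, 1, 1, 1, 1, 1], dtype=float)

# OLS on the binary outcome: Y = a + b*X + e (the linear probability model)
A = np.column_stack([np.ones_like(X), X])              # design matrix [1, X]
(a_hat, b_hat), *_ = np.linalg.lstsq(A, Y, rcond=None)

# Fitted values are read as predicted probabilities P(Y = 1 | X)
p_hat = a_hat + b_hat * X
print(a_hat, b_hat)
print(p_hat)
```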

Linear probability model (2)
Problems with the linear probability model:
1) Predicted probabilities don't necessarily lie within the 0 to 1 range.
2) We get a very specific form of heteroskedasticity: for a given $X_i$, the error takes only two values, $e_i = 1 - (a + bX_i)$ when $Y_i = 1$ and $e_i = -(a + bX_i)$ when $Y_i = 0$. Note: the $\hat{Y}_i$ values lie along the continuous OLS line, but the $Y_i$ values jump between 0 and 1; this creates large variation in the errors.
3) Errors are non-normal.
We can still use the linear probability model as a first guess: its estimates can serve as start values in a maximum likelihood problem. Problems 1) and 2) are easy to see numerically, as in the sketch below.
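A sketch of problems 1) and 2) with illustrative stand-in estimates (the values of a_hat and b_hat below are invented for the demonstration):

```python
import numpy as np

a_hat, b_hat = -0.10, 0.04      # illustrative LPM estimates, not from the workbook

# Problem 1: predictions leave the [0, 1] range at extreme X values
print(a_hat + b_hat * np.array([-5.0, 60.0]))   # -0.3 and 2.3

# Problem 2: for a given X the error takes only two values, so its
# variance P*(1 - P) changes with X (heteroskedasticity)
X = np.array([5.0, 12.0, 20.0])
P = a_hat + b_hat * X                            # 0.10, 0.38, 0.70
print(P * (1 - P))                               # 0.09, 0.2356, 0.21: not constant
```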

McFadden's contribution
Suggestion: use a curve that runs strictly between 0 and 1 and tails off at the boundaries.
[Figure: an S-shaped curve for P(Y = 1) rising from 0 toward 1 as X increases]

McFadden's contribution (2)
Recall the probability density function (PDF) and cumulative distribution function (CDF) for a standard normal.
[Figure: the bell-shaped standard normal PDF next to its S-shaped CDF, which rises from 0 to 1]

Probit model
For the standard normal, we have the probit model. The density function for the normal is $\phi(Z) = \frac{1}{\sqrt{2\pi}} e^{-Z^2/2}$, where $Z = a + bX$.
For the probit model, we want to find $P(Y = 1 \mid X) = \Phi(Z) = \int_{-\infty}^{Z} \phi(t)\,dt$, the standard normal CDF evaluated at Z.
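A small sketch of these two pieces using SciPy; the parameter values a and b are arbitrary illustrations, not estimates from the lecture:

```python
from scipy.stats import norm

a, b = -1.0, 0.05          # arbitrary illustrative parameters
Z = a + b * 20.0           # Z = a + b*X evaluated at X = 20

print(norm.pdf(Z))         # phi(Z) = (1/sqrt(2*pi)) * exp(-Z**2 / 2)
print(norm.cdf(Z))         # Phi(Z) = P(Y = 1 | X): the integral of phi up to Z
```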

Probit model (2)
The probit model imposes the distributional form of the CDF in order to estimate a and b.
The values of a and b have to be estimated as part of the maximum likelihood procedure.

Logit model
The logit model uses the logistic distribution.
Density: $g(Z) = \frac{e^Z}{(1 + e^Z)^2}$
Cumulative: $G(Z) = \frac{e^Z}{1 + e^Z}$
[Figure: the standard normal CDF F(Z) and the logistic CDF G(Z), both S-shaped curves running from 0 to 1]
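A sketch comparing the two CDFs; both run strictly between 0 and 1, which is exactly the property the linear probability model lacks:

```python
import numpy as np
from scipy.stats import norm

def logistic_cdf(z):
    """Cumulative logistic: G(z) = e^z / (1 + e^z) = 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + np.exp(-z))

def logistic_pdf(z):
    """Logistic density: g(z) = e^z / (1 + e^z)^2 = G(z) * (1 - G(z))."""
    G = logistic_cdf(z)
    return G * (1.0 - G)

z = np.linspace(-4.0, 4.0, 9)
print(logistic_cdf(z))     # logit CDF G(Z)
print(norm.cdf(z))         # probit CDF F(Z): same shape, thinner tails
```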

Maximum likelihood
Maximum likelihood is an alternative estimation method that assumes you know the form of the population distribution.
Using maximum likelihood, we specify the model as part of the distribution.

Maximum likelihood (2)
For example, take the Bernoulli distribution with parameter p, where $P(Y_i = 1) = p$ and $P(Y_i = 0) = 1 - p$.
We pick a sample of $Y_1, \ldots, Y_n$ and observe an outcome such as: 1 1 1 0 0 0 0 1 0 0
The probability expression for one observation is $P(Y_i = y_i) = p^{y_i}(1 - p)^{1 - y_i}$.

Maximum likelihood (3)
The probability of getting the observed $Y_i$ is based on the form we've assumed: $p^{Y_i}(1 - p)^{1 - Y_i}$.
If we multiply across the observed sample, the likelihood is $L(p) = \prod_{i=1}^{n} p^{Y_i}(1 - p)^{1 - Y_i}$.
Given that an outcome of one occurs r times, this simplifies to $L(p) = p^r (1 - p)^{n - r}$.

Maximum likelihood (4)
If we take logs, we get $\ln L(p) = r \ln p + (n - r)\ln(1 - p)$.
This is the log-likelihood. We can differentiate it with respect to p and obtain the solution $\hat{p} = r/n$, checked numerically in the sketch below.
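A quick numerical check of $\hat{p} = r/n$ on the 10-observation sample shown two slides back:

```python
import numpy as np

Y = np.array([1, 1, 1, 0, 0, 0, 0, 1, 0, 0])   # sample from the slide: n = 10, r = 4
n, r = len(Y), int(Y.sum())

def log_lik(p):
    """Bernoulli log-likelihood: r*log(p) + (n - r)*log(1 - p)."""
    return r * np.log(p) + (n - r) * np.log(1.0 - p)

grid = np.linspace(0.01, 0.99, 99)
print(grid[np.argmax(log_lik(grid))])   # grid maximizer: 0.4
print(r / n)                            # analytic solution p_hat = r/n = 0.4
```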

Maximum likelihood (5)
In a more complex example, the logit model gives $P(Y_i = 1 \mid X_i) = G(Z_i)$ with $Z_i = a + bX_i$.
Instead of looking for an estimate of p, we are looking for estimates of a and b.
Think of $G(Z_i)$ as the probability $p_i$ for observation i: we get the log-likelihood
$L(a, b) = \sum_i \left[ Y_i \log G_i + (1 - Y_i)\log(1 - G_i) \right]$
and solve for a and b.
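The log-likelihood on this slide, written directly as a Python function (a sketch; the clipping guard against log(0) is my addition):

```python
import numpy as np

def logit_log_lik(params, X, Y):
    """L(a, b) = sum_i [ Y_i*log(G_i) + (1 - Y_i)*log(1 - G_i) ],
    where G_i = G(a + b*X_i) is the logistic CDF."""
    a, b = params
    G = 1.0 / (1.0 + np.exp(-(a + b * X)))
    G = np.clip(G, 1e-12, 1.0 - 1e-12)     # guard against log(0)
    return np.sum(Y * np.log(G) + (1.0 - Y) * np.log(1.0 - G))

# At a = b = 0 every G_i is 0.5, so each observation contributes log(0.5)
print(logit_log_lik([0.0, 0.0], np.array([1.0, 2.0]), np.array([0.0, 1.0])))  # -1.386...
```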

Example
Data on union membership and years of labor market experience (L22.xls).
To build the maximum likelihood form, we can think of an intercept a and a coefficient on experience b.
There are three columns:
- Predicted value Z
- Estimated probability
- Estimated likelihood as given by the model
The Solver from the Tools menu calculates estimates of a and b.
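The three columns can be sketched as arrays (hypothetical data and start values; the workbook's own numbers are not reproduced here, and the third column is written as a log-likelihood contribution since those are what get summed):

```python
import numpy as np

a, b = 0.0, 0.0                        # Solver start values, as on the next slide
X = np.array([1, 3, 5, 8, 12, 15, 20, 25, 30, 35], dtype=float)
Y = np.array([0, 0, 0, 1, 0, 1, 1, 1, 1, 1], dtype=float)

Z = a + b * X                                      # column 1: predicted value Z
G = 1.0 / (1.0 + np.exp(-Z))                       # column 2: estimated probability
ll = Y * np.log(G) + (1 - Y) * np.log(1 - G)       # column 3: likelihood contribution (in logs)
print(ll.sum())                                    # -6.93 at the start values
```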

Example (2)
How the Solver works:
- Define a and b using start values: choose start values of a and b equal to zero
- Define our model: $Z = a + bX$
- Define the predicted probabilities: $G(Z) = e^Z / (1 + e^Z)$
- Define the log-likelihood and sum it over observations
- Use Solver to change the values of a and b until the summed log-likelihood is maximized (see the sketch after this list)
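The same recipe, sketched with scipy.optimize standing in for Excel's Solver (hypothetical data again; minimizing the negative log-likelihood is equivalent to maximizing the log-likelihood):

```python
import numpy as np
from scipy.optimize import minimize

X = np.array([1, 3, 5, 8, 12, 15, 20, 25, 30, 35], dtype=float)
Y = np.array([0, 0, 0, 1, 0, 1, 1, 1, 1, 1], dtype=float)

def neg_log_lik(params):
    """Negative of the summed logit log-likelihood (Solver maximizes; we minimize)."""
    a, b = params
    G = 1.0 / (1.0 + np.exp(-(a + b * X)))
    G = np.clip(G, 1e-12, 1.0 - 1e-12)
    return -np.sum(Y * np.log(G) + (1.0 - Y) * np.log(1.0 - G))

# Start values a = b = 0, as on the slide; the optimizer plays Solver's role
res = minimize(neg_log_lik, x0=[0.0, 0.0], method="BFGS")
print(res.x, res.success)
```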

Comparing parameters
How do we compare parameters across these models?
The linear probability form is $Y = a + bX$, where the slope b is the marginal effect: $\partial P(Y = 1 \mid X) / \partial X = b$, the same at every X.
Recall the graphs associated with each model: the probit and logit curves are S-shaped rather than straight, so their slopes change with X. Consequently, the marginal effect is $\partial P(Y = 1 \mid X) / \partial X = g(Z)\,b$, where g is the density evaluated at $Z = a + bX$. This expression takes the same form for the probit and logit models, with the appropriate density plugged in.
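A sketch of the three marginal effects side by side (the parameter values passed in below are placeholders, not lecture estimates):

```python
import numpy as np
from scipy.stats import norm

def lpm_slope(b):
    """LPM: dP/dX = b, the same at every X."""
    return b

def logit_slope(a, b, x):
    """Logit: dP/dX = g(Z)*b with g(Z) = G(Z)*(1 - G(Z))."""
    G = 1.0 / (1.0 + np.exp(-(a + b * x)))
    return G * (1.0 - G) * b

def probit_slope(a, b, x):
    """Probit: dP/dX = phi(Z)*b with phi the standard normal density."""
    return norm.pdf(a + b * x) * b

print(lpm_slope(0.06), logit_slope(-1.0, 0.06, 20.0), probit_slope(-1.0, 0.06, 20.0))
```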

L22.xls example
Predicting with the linear probability model: if we wanted to predict the probability given 20 years of experience, we would plug X = 20 into the estimated equation $\hat{Y} = \hat{a} + \hat{b}X$ from the workbook.
For the logit form, use the logistic distribution: plug $\hat{Z} = \hat{a} + \hat{b} \cdot 20$ from the logit estimated equation into $G(\hat{Z})$.

L22.xls example (2)
At 20 years of experience, the estimated logistic density is $g(\hat{Z}) = 0.234$ and the estimated coefficient is $\hat{b} = 0.06$.
Thus the slope at 20 years of experience is $g(\hat{Z}) \times \hat{b} = 0.234 \times 0.06 \approx 0.014$.
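A trivial check of the slide's arithmetic:

```python
# slope = g(Z_hat) * b_hat at 20 years of experience
print(0.234 * 0.06)   # 0.01404, i.e. ~0.014 as on the slide
```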