Forecasting Choices

Types of Variable
- Quantitative
  - Continuous
  - Discrete (counting)
- Qualitative
  - Ordinal
  - Nominal

Nominal or Ordinal Dependent Variable
Indicating "choices" of a decision maker, say a consumer.
Response categories:
- Mutually exclusive
- Collectively exhaustive
- Finite in number
Desired regression outputs:
- Probability that the decision maker chooses each category
- Coefficient of each independent variable

Generalized Linear Models (GLM)
Regression model for a continuous Y:
  Y = β0 + β1 X1 + β2 X2 + e,  with e following N(0, σ)
GLM formulation:
1. Model for Y: Y is N(μ, σ)
2. Link function (model for the predictors): μ = β0 + β1 X1 + β2 X2

Estimation of Parameters of GLM
Maximum Likelihood Estimation
- For normal Y, MLE is the LS estimation
Maximize:
- The sum of the log-likelihoods, ln Li, over the observations: ln L = Σi ln Li

MLE for Regression Model
Y is N(μ, σ)
MLE: Maximize
  ln L = Σi [ -(1/2) ln(2π σ²) - (Yi - μi)² / (2σ²) ],  where μi = β0 + β1 X1i + β2 X2i
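The claim above (for normal Y, MLE coincides with least squares) can be checked numerically. The sketch below is an illustration, not part of the slides: it maximizes the normal log-likelihood with `scipy.optimize` on made-up data and compares the result to the least-squares solution.

```python
import numpy as np
from scipy.optimize import minimize

# Hypothetical data: Y = 1 + 2*X + normal noise
rng = np.random.default_rng(0)
X = rng.normal(size=50)
Y = 1.0 + 2.0 * X + rng.normal(scale=0.5, size=50)

def neg_log_lik(params):
    b0, b1, log_sigma = params
    sigma = np.exp(log_sigma)          # parameterize by log(sigma) to keep sigma > 0
    mu = b0 + b1 * X
    # Normal log-likelihood, summed over observations
    ll = -0.5 * np.log(2 * np.pi * sigma**2) - (Y - mu)**2 / (2 * sigma**2)
    return -ll.sum()

res = minimize(neg_log_lik, x0=[0.0, 0.0, 0.0], method="BFGS")
b0_mle, b1_mle = res.x[0], res.x[1]

# Least-squares solution for comparison
A = np.column_stack([np.ones_like(X), X])
b_ls, *_ = np.linalg.lstsq(A, Y, rcond=None)
```

At the optimum, the MLE coefficients agree with the LS coefficients up to numerical tolerance.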

GLM for Binary Dependent Variable, Y
Model for response: Y is B(n, π)
Model for predictors (link function):
  logit(π) = ln[ π / (1 - π) ] = β0 + β1 X1 + β2 X2 + … + βK XK = g
Probability: π = exp(g) / (1 + exp(g))
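The logit link and its inverse can be written in a couple of lines; this small Python sketch (added for illustration) shows the round trip from link value g to probability π and back.

```python
import math

def logistic_prob(g):
    """Inverse logit: pi = exp(g) / (1 + exp(g))."""
    return math.exp(g) / (1.0 + math.exp(g))

# g = 0 corresponds to pi = 0.5
p = logistic_prob(0.0)

# Applying the logit to pi recovers g
g_back = math.log(p / (1.0 - p))
```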

X: Covariates
Independent variables are often referred to as "covariates."
Examples:
- SPSS binary logistic regression routine
- SPSS multinomial logistic regression routine

A. Logistic Regression for Ungrouped Data (ni = 1)
Model of observation for the i-th observation:
- Yi = 1: choose category 1, with probability πi
- Yi = 0: choose category 2, with probability 1 - πi
Log-likelihood function for the i-th observation:
  ln Li = Yi ln πi + (1 - Yi) ln(1 - πi)

MLE
Maximize:
  ln L = Σi [ Yi ln πi + (1 - Yi) ln(1 - πi) ]

Setting Up a Worksheet for MLE
Define an array for storing the parameters of the link function, and enter an initial estimate for each parameter. Then compute, for each observation:
- the link function, gi
- the likelihood
- ln(likelihood), Li
Sum the log-likelihoods and invoke the Solver to maximize the sum by changing the parameters. Multiply the maximized value by -2 for the test of significance of the regression.
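The worksheet procedure above can also be sketched in Python, with `scipy.optimize` standing in for the spreadsheet Solver. This is an illustration on made-up data, not part of the deck: it builds the link values gi, the per-observation log-likelihoods, and maximizes their sum.

```python
import numpy as np
from scipy.optimize import minimize

# Hypothetical ungrouped data: one predictor, binary response
X = np.array([0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0])
Y = np.array([0,   0,   0,   1,   0,   1,   1,   1])

def neg_log_lik(beta):
    g = beta[0] + beta[1] * X                  # link function g_i for each row
    pi = 1.0 / (1.0 + np.exp(-g))              # probability pi_i
    # sum of ln(likelihood) L_i, as in the worksheet column
    return -np.sum(Y * np.log(pi) + (1 - Y) * np.log(1 - pi))

# "Solver" step: maximize the summed log-likelihood (minimize its negative)
res = minimize(neg_log_lik, x0=np.zeros(2), method="BFGS")
beta_hat = res.x
minus_2_log_l = 2.0 * res.fun   # -2 * (maximized log-likelihood)
```

With responses that shift from 0 to 1 as X grows, the fitted slope comes out positive.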

Test of Significance Hypotheses: H 0 :  1 =  2 ….   = 0 H 1 : At least one  j = 0 Test statistic: The Distribution Under H 0 :   (DF = K)

Standard Errors of Logistic Regression Coefficients (optional)
Estimate of the information matrix, I (K = 2):
  I = Σi πi (1 - πi) xi xi',  where xi = (1, X1i, X2i)'
The standard errors are the square roots of the diagonal elements of I⁻¹.
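The information-matrix calculation can be done directly with NumPy. In this sketch the design matrix and fitted probabilities are made-up values standing in for the results of an actual fit.

```python
import numpy as np

# Assumed fitted quantities: design matrix (with intercept column)
# and fitted probabilities pi_hat from a logistic regression
X = np.array([[1.0, 0.5],
              [1.0, 1.5],
              [1.0, 2.5],
              [1.0, 3.5]])
pi_hat = np.array([0.2, 0.4, 0.6, 0.8])

# Information matrix: I = sum_i pi_i (1 - pi_i) x_i x_i'
W = np.diag(pi_hat * (1.0 - pi_hat))
info = X.T @ W @ X

# Standard errors: sqrt of the diagonal of I^{-1}
se = np.sqrt(np.diag(np.linalg.inv(info)))
```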

Deviance Residuals and Deviance for Logistic Regression (optional)
Deviance residual:
  di = sign(Yi - π̂i) · sqrt( -2 [ Yi ln π̂i + (1 - Yi) ln(1 - π̂i) ] )
Deviance (corresponds to SSE):
  D = Σi di²
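A short NumPy sketch (illustrative data, not from the slides) computes the deviance residuals and confirms that the deviance equals -2 times the maximized log-likelihood.

```python
import numpy as np

# Assumed observed 0/1 responses and fitted probabilities
Y = np.array([1, 0, 1, 1, 0])
pi_hat = np.array([0.8, 0.3, 0.6, 0.9, 0.2])

# Per-observation log-likelihood
log_lik_i = Y * np.log(pi_hat) + (1 - Y) * np.log(1 - pi_hat)

# Deviance residual: sign(Y - pi) * sqrt(-2 * log-likelihood)
d = np.sign(Y - pi_hat) * np.sqrt(-2.0 * log_lik_i)

deviance = np.sum(d**2)   # plays the role of SSE in linear regression
```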

B. Logistic Regression for Grouped Data Using WLS
The observation for the i-th group: Yi successes out of ni trials
  → observed proportion pi = Yi / ni

WLS for Logistic Regression
Regress the empirical logits
  zi = ln[ pi / (1 - pi) ]
on X1i, …, XKi with weights
  wi = ni pi (1 - pi)

WLS for Unequal Variance Data
[Scatterplot of Y against X: observation 2 lies in a region of greater spread than observation 1.]
Observation 2 is subject to a larger variance than observation 1, so it makes sense to give it a lower weight. In WLS, the weight is proportional to 1/variance.
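The grouped-data WLS recipe can be sketched in a few lines of NumPy. The data are made up for illustration: each row gives ni trials and Yi successes at one covariate value, and the empirical logits are regressed on X with weights ni pi (1 - pi).

```python
import numpy as np

# Hypothetical grouped data: n_i trials and Y_i successes per group
X = np.array([1.0, 2.0, 3.0, 4.0])
n = np.array([40, 50, 60, 50])
Y = np.array([8, 20, 35, 40])

p = Y / n                          # observed proportions
z = np.log(p / (1 - p))            # empirical logits
w = n * p * (1 - p)                # weights, proportional to 1/Var(z_i)

# Weighted least squares: minimize sum_i w_i (z_i - b0 - b1 X_i)^2,
# done here by rescaling rows by sqrt(w_i) and using ordinary LS
A = np.column_stack([np.ones_like(X), X])
Aw = A * np.sqrt(w)[:, None]
zw = z * np.sqrt(w)
beta, *_ = np.linalg.lstsq(Aw, zw, rcond=None)
```

Since the observed proportions rise with X, the fitted slope is positive.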

Modeling of Forecasting Choices - GLM
1. Model for observation of the dependent variable: a probability distribution
2. Link function (model for the independent variables): a mathematical function

Forecasting Choices
Number of choices:
- 2 → Binomial distribution
- > 2 → Multinomial distribution (unordered or ordered)

Multinomial Logit Regression
Multinomial choice (m = 3), ungrouped data:
- Y1 = 1: choose category 1, with probability π1
- Y1 = 0: choose category 2 or 3, with probability 1 - π1
- Y2 = 1: choose category 2, with probability π2
- Y2 = 0: choose category 1 or 3, with probability 1 - π2
- Y3 = 1: choose category 3, with probability π3
- Y3 = 0: choose category 1 or 2, with probability 1 - π3

Log-Likelihood Function
Log-likelihood function of the i-th ungrouped observation:
  ln Li = Y1i ln π1i + Y2i ln π2i + Y3i ln π3i
MLE: Maximize  ln L = Σi ln Li

Y3 and π3 Can Be Omitted
Multinomial choice (m = 3), ungrouped data:
- Y1 = 1: choose category 1, with probability π1
- Y1 = 0: choose category 2 or 3, with probability 1 - π1
- Y2 = 1: choose category 2, with probability π2
- Y2 = 0: choose category 1 or 3, with probability 1 - π2
Since Y3 = 1 - Y1 - Y2 and π3 = 1 - π1 - π2, the third category carries no extra information.

Log-Likelihood Function
Log-likelihood function of the i-th (ungrouped) observation:
  ln Li = Y1i ln π1i + Y2i ln π2i + (1 - Y1i - Y2i) ln(1 - π1i - π2i)
MLE: Maximize  ln L = Σi ln Li

1. Formulating Link Functions: Unordered Choice Categories
With category 3 as the baseline category:
  g1i = ln(π1i / π3i) = β01 + β11 X1i + … + βK1 XKi
  g2i = ln(π2i / π3i) = β02 + β12 X1i + … + βK2 XKi

From Link Functions to Probabilities
  π1i = exp(g1i) / [ 1 + exp(g1i) + exp(g2i) ]
  π2i = exp(g2i) / [ 1 + exp(g1i) + exp(g2i) ]
  π3i = 1 / [ 1 + exp(g1i) + exp(g2i) ]
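The mapping from the two link values to the three category probabilities is a one-liner; this illustrative Python sketch shows it, with category 3 as the baseline.

```python
import math

def multinomial_probs(g1, g2):
    """Category probabilities with category 3 as baseline:
    pi_j = exp(g_j) / (1 + exp(g1) + exp(g2)),  pi_3 = 1 / (same denominator)."""
    denom = 1.0 + math.exp(g1) + math.exp(g2)
    return math.exp(g1) / denom, math.exp(g2) / denom, 1.0 / denom

# Equal link values give equal probabilities of 1/3 each
p1, p2, p3 = multinomial_probs(0.0, 0.0)
```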

Test of Significance Hypotheses: H 0 :  11 =  21 = …  K1 =  12 =  22 = …  K2 = 0 H 1 : At least one  ij = 0 Test statistic The Distribution Under H 0 :   (DF = 2 K)

Interpreting Coefficients
Not easy, as a change in the probability of one category affects the probabilities of the other (two) categories.

11 22 2: Formulating Link Functions: Ordered Choice Categories Underlying Variable Defining Categories Category 1Category 2Category 3

Choices for the Probability Distribution of U
a. Ordered probit model for the i-th DM: Ui follows N(μi, σ = 1)
b. Ordered logit model for the i-th DM: Ui follows the logistic distribution with mean μi
where μi = β1 X1i + β2 X2i (no constant)

a. Ordered Probit Model
  π1i = Φ(τ1 - μi)
  π2i = Φ(τ2 - μi) - Φ(τ1 - μi)
  π3i = 1 - Φ(τ2 - μi)
where Φ is the standard normal CDF.

b. Ordered Logit Model
  π1i = Λ(τ1 - μi)
  π2i = Λ(τ2 - μi) - Λ(τ1 - μi)
  π3i = 1 - Λ(τ2 - μi)
where Λ(z) = 1 / (1 + exp(-z)) is the logistic CDF.
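Both ordered models share the same cutpoint structure and differ only in the CDF used. This illustrative sketch computes the three category probabilities for either model, using `scipy.stats.norm` for the probit case; the values of μ, τ1, τ2 are made up.

```python
import math
from scipy.stats import norm

def ordered_probs(mu, tau1, tau2, model="probit"):
    """Category probabilities from latent U with cutpoints tau1 < tau2."""
    if model == "probit":
        cdf = norm.cdf                                   # U ~ N(mu, 1)
    else:
        cdf = lambda z: 1.0 / (1.0 + math.exp(-z))       # logistic CDF
    p1 = cdf(tau1 - mu)
    p2 = cdf(tau2 - mu) - cdf(tau1 - mu)
    p3 = 1.0 - cdf(tau2 - mu)
    return p1, p2, p3

probit_p = ordered_probs(0.0, -1.0, 1.0, "probit")
logit_p = ordered_probs(0.0, -1.0, 1.0, "logit")
```

With μ = 0 and symmetric cutpoints, the outer categories get equal probability under either model.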

Types of Variable
- Quantitative
  - Continuous
  - Discrete (counting)
- Qualitative
  - Ordinal
  - Nominal

Poisson Regression for Counting
Model of observations for Y: Yi follows Poisson(μi)
Link function: ln(μi) = β0 + β1 X1i + … + βK XKi
Log-likelihood function:
  ln Li = -μi + Yi ln μi - ln(Yi!)
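The Poisson model fits into the same MLE recipe as the earlier logistic worksheet: build μi from the log link, sum the per-observation log-likelihoods, and maximize. This sketch uses made-up count data and `scipy.optimize`; `gammaln(Y + 1)` supplies ln(Y!).

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import gammaln

# Hypothetical count data with one predictor
X = np.array([0.0, 0.5, 1.0, 1.5, 2.0, 2.5])
Y = np.array([1,   2,   2,   4,   6,   9])

def neg_log_lik(beta):
    mu = np.exp(beta[0] + beta[1] * X)      # log link: ln(mu_i) = b0 + b1 X_i
    # Poisson log-likelihood: -mu + Y ln(mu) - ln(Y!)
    return -np.sum(-mu + Y * np.log(mu) - gammaln(Y + 1))

res = minimize(neg_log_lik, x0=np.zeros(2), method="BFGS")
beta_hat = res.x
```

With counts that grow with X, the fitted slope on the log scale is positive.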