Section Count Data Models. Introduction Many outcomes of interest are integer counts –Doctor visits –Low work days –Cigarettes smoked per day –Missed.

Section Count Data Models

Introduction
Many outcomes of interest are integer counts
–Doctor visits
–Low work days
–Cigarettes smoked per day
–Missed school days
OLS can easily handle some integer-valued outcomes

Example
–SAT scores are essentially integer values
–Few observations at the tails
–Distribution is fairly continuous
–OLS models them well
In contrast, suppose the outcome has
–A high fraction of zeros
–Small positive values

OLS models will
–Predict negative values
–Do a poor job of predicting the mass of observations at zero
Example
–Doctor visits in the past year, Medicare patients (65+)
–1987 National Medical Expenditure Survey
–Top code (for now) at 10
–17% have no visits

     visits |      Freq.     Percent        Cum.
------------+-----------------------------------
          0 |        915       17.18       17.18
          1 |        601       11.28       28.46
          2 |        533       10.01       38.46
          3 |        503        9.44       47.91
          4 |        450        8.45       56.35
          5 |        391        7.34       63.69
          6 |        319        5.99       69.68
          7 |        258        4.84       74.53
          8 |        216        4.05       78.58
          9 |        192        3.60       82.19
         10 |        949       17.81      100.00
------------+-----------------------------------
      Total |      5,327      100.00
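As a quick check on this table, the sketch below computes the sample mean, variance, and share of zeros directly from the listed frequencies, and compares the observed share of zeros with what a Poisson distribution with the same mean would imply. It assumes only the frequencies shown; because visits are top-coded at 10, the mean and variance here understate the values in the untruncated data. The variable names are illustrative.

```python
import math

# Frequencies from the tabulation above: visits 0..10 (10 is the top code).
freq = {0: 915, 1: 601, 2: 533, 3: 503, 4: 450, 5: 391,
        6: 319, 7: 258, 8: 216, 9: 192, 10: 949}

n = sum(freq.values())
mean = sum(v * f for v, f in freq.items()) / n
# Sample variance with the usual n - 1 denominator.
var = sum(f * (v - mean) ** 2 for v, f in freq.items()) / (n - 1)

share_zero = freq[0] / n
# P(y = 0) implied by a Poisson distribution with the same mean.
poisson_zero = math.exp(-mean)

print(f"n = {n}, mean = {mean:.3f}, variance = {var:.3f}")
print(f"observed share of zeros  = {share_zero:.4f}")
print(f"Poisson-implied P(y = 0) = {poisson_zero:.4f}")
```

Even with the top code in place, the variance exceeds the mean and the observed share of zeros is far above the Poisson-implied share.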

Poisson Model
$y_i$ is drawn from a Poisson distribution
The Poisson parameter varies across observations
$f(y_i; \lambda_i) = e^{-\lambda_i} \lambda_i^{y_i} / y_i!$ for $\lambda_i > 0$
$E[y_i] = \mathrm{Var}[y_i] = \lambda_i = f(x_i, \beta)$
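The density above can be evaluated numerically to confirm that the mean and the variance both equal $\lambda_i$. This is a minimal sketch using only the Python standard library; the value 5.6 is chosen for illustration, and the function name `poisson_pmf` is hypothetical.

```python
import math

def poisson_pmf(y, lam):
    """Poisson probability f(y; lam) = exp(-lam) * lam**y / y!, for lam > 0."""
    if lam <= 0:
        raise ValueError("lam must be positive")
    # Work in logs to avoid overflow in lam**y and y! for large y.
    return math.exp(-lam + y * math.log(lam) - math.lgamma(y + 1))

lam = 5.6  # illustrative value
support = range(200)  # the tail beyond 200 is negligible for this lam
probs = [poisson_pmf(y, lam) for y in support]

total = sum(probs)
mean = sum(y * p for y, p in zip(support, probs))
var = sum((y - mean) ** 2 * p for y, p in zip(support, probs))

print(f"sum of probabilities = {total:.6f}")
print(f"mean = {mean:.6f}, variance = {var:.6f}")
```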

$\lambda_i$ must be positive at all times
Therefore, we CANNOT let $\lambda_i = x_i\beta$
Let $\lambda_i = \exp(x_i\beta)$
$\ln(\lambda_i) = x_i\beta$

$d\ln(\lambda_i)/dx_i = \beta$
Remember that $d\ln(\lambda_i) = d\lambda_i/\lambda_i$
Interpret $\beta$ as the proportional change in the mean outcome for a one-unit change in $x$ (approximately $100\beta$ percent)
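The derivative gives the change for a small change in $x$. For a discrete one-unit change in a single regressor, a short worked derivation (a sketch assuming scalar $x_i$ and holding everything else fixed) shows when reading $\beta$ as a percentage is a good approximation:

```latex
\begin{aligned}
\lambda_i(x_i) &= \exp(x_i\beta) \\
\frac{\lambda_i(x_i + 1)}{\lambda_i(x_i)} &= \frac{\exp\big((x_i + 1)\beta\big)}{\exp(x_i\beta)} = \exp(\beta) \\
\%\Delta\lambda_i &= 100\,[\exp(\beta) - 1] \approx 100\,\beta \quad \text{for small } |\beta|
\end{aligned}
```

The approximation is close when $|\beta|$ is small and deteriorates as $|\beta|$ grows. When $x$ enters in logs, $\beta$ is an elasticity.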

Problems with Poisson
Variance grows with the mean
–$E[y_i] = \mathrm{Var}[y_i] = \lambda_i = f(x_i, \beta)$
Most data sets have overdispersion, where the variance grows faster than the mean
In the doctor visits sample, the mean is 5.6 and s = 6.7, so the sample variance (6.7² ≈ 44.9) is about eight times the mean
Imposing mean = variance is a severe restriction, and it tends to understate the standard errors

Negative Binomial Model
With $\gamma_i = \exp(x_i\beta)$ and $\delta \ge 0$:
$E[y_i] = \delta\gamma_i = \delta\exp(x_i\beta)$
$\mathrm{Var}[y_i] = \delta(1+\delta)\gamma_i$
$\mathrm{Var}[y_i]/E[y_i] = 1+\delta$
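One density that produces exactly these moments is the gamma mixture of Poissons, $f(y_i) = \frac{\Gamma(y_i+\gamma_i)}{\Gamma(\gamma_i)\,y_i!}\left(\frac{1}{1+\delta}\right)^{\gamma_i}\left(\frac{\delta}{1+\delta}\right)^{y_i}$. That this is the form intended here is an assumption. The sketch below checks numerically that this density has mean $\delta\gamma_i$ and variance $\delta(1+\delta)\gamma_i$; the parameter values and the function name `nb_pmf` are illustrative.

```python
import math

def nb_pmf(y, gamma_i, delta):
    """Negative binomial probability with shape gamma_i and scale delta > 0
    (assumed parameterization: a gamma mixture of Poissons)."""
    log_p = (math.lgamma(y + gamma_i) - math.lgamma(gamma_i) - math.lgamma(y + 1)
             - gamma_i * math.log1p(delta)
             + y * (math.log(delta) - math.log1p(delta)))
    return math.exp(log_p)

gamma_i, delta = 2.0, 1.5  # illustrative values
support = range(2000)      # the tail beyond this is numerically zero here
probs = [nb_pmf(y, gamma_i, delta) for y in support]

total = sum(probs)
mean = sum(y * p for y, p in zip(support, probs))
var = sum((y - mean) ** 2 * p for y, p in zip(support, probs))

print(f"sum of probabilities = {total:.6f}")
print(f"mean = {mean:.6f}  (delta * gamma = {delta * gamma_i})")
print(f"var  = {var:.6f}  (delta * (1 + delta) * gamma = {delta * (1 + delta) * gamma_i})")
print(f"var / mean = {var / mean:.6f}  (1 + delta = {1 + delta})")
```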

$\delta$ must always be ≥ 0
In this case, the variance grows faster than the mean
If $\delta = 0$, the model collapses into the Poisson
Always estimate the negative binomial
If you cannot reject the null that $\delta = 0$, report the Poisson estimates

Notice that $\ln(E[y_i]) = \ln(\delta) + \ln(\gamma_i)$, so $d\ln(E[y_i])/dx_i = \beta$
Parameters have the same interpretation as in the Poisson model

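In Stata
poisson estimates a Poisson model by maximum likelihood
–Syntax: poisson y independent variables
nbreg estimates a negative binomial model by maximum likelihood
–Syntax: nbreg y independent variables
–The constant-dispersion form above (Var/Mean = 1+δ) corresponds to nbreg's dispersion(constant) option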
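To make "maximum likelihood" concrete, here is a minimal sketch of Newton-Raphson estimation for a Poisson model with an intercept and one regressor, $\ln(\lambda_i) = b_0 + b_1 x_i$. It is not a description of Stata's internal algorithm; the function name `fit_poisson` and the toy data are hypothetical, chosen so the answer can be checked by hand (with a binary regressor, the fitted means equal the group means).

```python
import math

def fit_poisson(x, y, max_iter=50, tol=1e-12):
    """Maximize the Poisson log-likelihood for ln(lambda_i) = b0 + b1 * x_i."""
    b0 = math.log(sum(y) / len(y))  # start at the intercept-only MLE
    b1 = 0.0
    for _ in range(max_iter):
        lam = [math.exp(b0 + b1 * xi) for xi in x]
        # Score (gradient of the log-likelihood): sum of (y_i - lambda_i) * x_i.
        g0 = sum(yi - li for yi, li in zip(y, lam))
        g1 = sum((yi - li) * xi for xi, yi, li in zip(x, y, lam))
        # Information matrix (negative Hessian): sum of lambda_i * x_i x_i'.
        h00 = sum(lam)
        h01 = sum(li * xi for xi, li in zip(x, lam))
        h11 = sum(li * xi * xi for xi, li in zip(x, lam))
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system H * d = g.
        d0 = (h11 * g0 - h01 * g1) / det
        d1 = (h00 * g1 - h01 * g0) / det
        b0 += d0
        b1 += d1
        if abs(d0) + abs(d1) < tol:
            break
    return b0, b1

# Hypothetical toy data: group x = 0 has mean 1, group x = 1 has mean 4.
x = [0, 0, 0, 0, 1, 1, 1, 1]
y = [0, 1, 2, 1, 3, 5, 2, 6]
b0, b1 = fit_poisson(x, y)
print(f"b0 = {b0:.6f}, b1 = {b1:.6f}")
print(f"fitted means: {math.exp(b0):.4f} (x = 0), {math.exp(b0 + b1):.4f} (x = 1)")
```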

Interpret results for Poisson
Reading each coefficient as an approximate percentage change in mean visits:
–Those with a CHRONIC condition have about 50% more MD visits
–Those in EXCELlent health have about 78% fewer MD visits
–BLACKS have about 33% fewer visits than whites
–Income elasticity is 0.021, so a 10% increase in income generates a 0.21% increase in visits
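Because the Chronic and Excel coefficients are large, the coefficient-as-percentage reading is only a rough guide. The sketch below, using the Poisson coefficients reported in the results table (0.500, -0.784, and 0.021), computes the exact proportional change $\exp(\beta) - 1$ for the two dummy variables and the implied response of visits to a 10% income increase. The variable and function names are illustrative.

```python
import math

# Poisson coefficients from the results table.
poisson_coefs = {"Chronic": 0.500, "Excel": -0.784}
income_elasticity = 0.021  # coefficient on Ln(Inc)

def pct_change(beta):
    """Exact percentage change in the mean for a one-unit change in x."""
    return 100.0 * (math.exp(beta) - 1.0)

exact = {name: pct_change(b) for name, b in poisson_coefs.items()}
for name, b in poisson_coefs.items():
    print(f"{name}: approx {100 * b:+.1f}%, exact {exact[name]:+.1f}%")

# A 10% increase in income: elasticity times the percentage change in income.
approx_income_effect = income_elasticity * 10.0
# Exact version: income is multiplied by 1.10, so the mean is multiplied by 1.10**beta.
exact_income_effect = 100.0 * (1.10 ** income_elasticity - 1.0)
print(f"10% more income: approx {approx_income_effect:.2f}%, "
      f"exact {exact_income_effect:.2f}% more visits")
```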

Negative Binomial
Interpret results the same way as Poisson
Look at the coefficient/standard error on delta
–H0: delta = 0 (Poisson model is correct)
–In this case, delta = 5.21 and its standard error is 0.15 (ratio of about 34.7), so we easily reject the null
–Var/Mean = 1 + delta = 6.21
Poisson is mis-specified, so we should see standard errors that are too small in the wrong (Poisson) model

Selected Results, Count Models
Parameter (Standard Error)

Variable     Poisson            Negative Binomial
Age65         0.214 (0.026)      0.103 (0.055)
Age70         0.787 (0.026)      0.204 (0.054)
Chronic       0.500 (0.014)      0.509 (0.029)
Excel        -0.784 (0.031)     -0.527 (0.059)
Ln(Inc)       0.021 (0.007)      0.038 (0.016)
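The numbers in this table can be used to see how much the Poisson standard errors are understated. The sketch below computes the z-statistic for delta (5.21 with standard error 0.15), the z-statistic of each coefficient under both models, and the ratio of negative binomial to Poisson standard errors. Comparing those ratios with $\sqrt{1+\delta}$ is a rough rule of thumb assumed here for illustration, not a result stated in the slides.

```python
import math

# (coefficient, standard error) pairs from the table: Poisson, negative binomial.
results = {
    "Age65":   ((0.214, 0.026), (0.103, 0.055)),
    "Age70":   ((0.787, 0.026), (0.204, 0.054)),
    "Chronic": ((0.500, 0.014), (0.509, 0.029)),
    "Excel":   ((-0.784, 0.031), (-0.527, 0.059)),
    "Ln(Inc)": ((0.021, 0.007), (0.038, 0.016)),
}
delta, se_delta = 5.21, 0.15

# Test of H0: delta = 0 (Poisson is correct).
z_delta = delta / se_delta
print(f"z for delta = {z_delta:.1f}")

z_pois, z_nb, se_ratio = {}, {}, {}
for name, ((b_p, se_p), (b_nb, se_nb)) in results.items():
    z_pois[name] = b_p / se_p
    z_nb[name] = b_nb / se_nb
    se_ratio[name] = se_nb / se_p
    print(f"{name:8s} z(Poisson) = {z_pois[name]:6.2f}  "
          f"z(NB) = {z_nb[name]:6.2f}  SE ratio = {se_ratio[name]:.2f}")

# Rough benchmark (an assumption): inflation of about sqrt(Var/Mean).
print(f"sqrt(1 + delta) = {math.sqrt(1 + delta):.2f}")
```

In this table every negative binomial standard error is roughly twice its Poisson counterpart, and Age65 moves from a large z-statistic under Poisson to one below 1.96 under the negative binomial.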
