
1 R Programming / Binomial Models Shinichiro Suna

2 Binomial Models In a binomial model we have one binary outcome and a set of explanatory variables.

3 Contents
1 Logit model
1.1 Fake data simulations
1.2 Maximum likelihood estimation
1.3 Bayesian estimation
2 Probit model
2.1 Fake data simulations
2.2 Maximum likelihood estimation
2.3 Bayesian estimation

4 1. Logit model (Logistic Regression Analysis) The logit model (logistic regression analysis) uses the logistic function. With several explanatory variables, the probability that the outcome equals 1 is F(x) = 1 / (1 + exp(-(B0 + B1*X1 + B2*X2 + ...))).
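
A quick way to see this function at work (a sketch, not from the original slides): R's built-in plogis() implements exactly this logistic function.
curve(plogis(x), from = -6, to = 6,
      xlab = "linear predictor", ylab = "P(y = 1)")  # S-shaped logistic curve
plogis(0)                                # 0.5: a linear predictor of 0 gives even odds
all.equal(plogis(2), 1 / (1 + exp(-2)))  # TRUE: matches the formula above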

5 1.1. Fake data simulations
x <- 1 + rnorm(1000, 1)                       # explanatory variable
xbeta <- -1 + x * 1                           # linear predictor
proba <- exp(xbeta) / (1 + exp(xbeta))        # logistic transformation
y <- ifelse(runif(1000, 0, 1) < proba, 1, 0)  # binary outcome
table(y)
df <- data.frame(y, x)
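
An equivalent, slightly more idiomatic simulation (a sketch; the original slide uses the runif() comparison above) draws the outcome directly with rbinom() and uses plogis() for the logistic transformation:
x <- 1 + rnorm(1000, 1)
proba <- plogis(-1 + x)                    # same as exp(xbeta) / (1 + exp(xbeta))
y <- rbinom(1000, size = 1, prob = proba)  # Bernoulli draws
table(y)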

6 1.2. Maximum likelihood estimation The standard way to estimate a logit model is the glm() function (Fitting Generalized Linear Models) with family = binomial(link = logit).

7 1.2. Maximum likelihood estimation
# Fitting Generalized Linear Models
res <- glm(y ~ x, family = binomial(link = logit), data = df)
names(res)
summary(res)           # results
confint(res)           # confidence intervals (profile likelihood)
exp(res$coefficients)  # odds ratios
exp(confint(res))      # confidence intervals for the odds ratios
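
Once the model is fitted, predicted probabilities come from predict() with type = "response"; a short sketch, assuming the df data frame built in the simulation step:
p.hat <- predict(res, type = "response")         # fitted P(y = 1) per observation
table(observed = df$y, predicted = p.hat > 0.5)  # confusion table at a 0.5 cutoff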

8 1.2. Maximum likelihood estimation
> summary(res)  # results

Call:
glm(formula = y ~ x, family = binomial(link = logit))

Deviance Residuals:
    Min       1Q   Median       3Q      Max
-2.1406  -1.0044   0.5417   0.8104   1.7770

Coefficients:
            Estimate Std. Error z value Pr(>|z|)
(Intercept)  -1.3561     0.5691  -2.383 0.017179 *
x             1.1287     0.2938   3.841 0.000122 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 125.37  on 99  degrees of freedom
Residual deviance: 106.71  on 98  degrees of freedom
AIC: 110.71

Number of Fisher Scoring iterations: 4
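
Reading the output above: the slope 1.1287 is on the log-odds scale; a sketch of two common interpretations using the fitted object res:
exp(coef(res)["x"])                # about 3.09: each unit of x multiplies the odds of y = 1 by ~3
plogis(coef(res)[1] + coef(res)[2] * 2)  # about 0.71: predicted P(y = 1) at x = 2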

9 1.3. Bayesian estimation
# Data generating process
x <- 1 + rnorm(1000, 1)
xbeta <- -1 + x * 1
proba <- exp(xbeta) / (1 + exp(xbeta))
y <- ifelse(runif(1000, 0, 1) < proba, 1, 0)
table(y)
# Markov chain Monte Carlo for logistic regression
library(MCMCpack)
res <- MCMClogit(y ~ x)
summary(res)
plot(res)  # trace and density plots
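
MCMClogit() returns a coda mcmc object, so the standard convergence diagnostics apply directly; a hedged sketch using functions from the coda package (loaded with MCMCpack):
library(coda)
effectiveSize(res)  # effective sample size per parameter
geweke.diag(res)    # z-scores comparing early vs. late parts of the chain
HPDinterval(res)    # 95% highest posterior density intervals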

10 1.3. Bayesian estimation
> summary(res)

Iterations = 1001:11000
Thinning interval = 1
Number of chains = 1
Sample size per chain = 10000

1. Empirical mean and standard deviation for each variable,
   plus standard error of the mean:

              Mean     SD Naive SE Time-series SE
(Intercept) -2.104 0.7199 0.007199        0.02239
x            1.491 0.3652 0.003652        0.01139

2. Quantiles for each variable:

               2.5%    25%    50%    75%   97.5%
(Intercept) -3.6302 -2.570 -2.065 -1.618 -0.7416
x            0.8233  1.236  1.472  1.726  2.2805

11 [Figure: trace and density plots produced by plot(res).]

12 2. Probit model The probit model is a type of regression where the dependent variable can take only two values. The name comes from probability + unit.

13 2. Probit model The probit model uses the cumulative distribution function (CDF) of the standard normal distribution: P(y = 1) = Φ(B0 + B1*X1 + B2*X2 + ...).
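
The two links are easy to compare visually; a sketch (not from the original slides) overlaying the logistic CDF (logit) and the standard normal CDF pnorm() (probit):
curve(plogis(x), from = -4, to = 4, xlab = "linear predictor", ylab = "P(y = 1)")
curve(pnorm(x), add = TRUE, lty = 2)  # probit link: a very similar S-shape
legend("topleft", legend = c("logit", "probit"), lty = 1:2)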

14 2.1. Probit model: fake data simulation
# Generating fake data
x1 <- 1 + rnorm(1000)
x2 <- -1 + x1 + rnorm(1000)
xbeta <- -1 + x1 + x2                         # linear predictor
proba <- pnorm(xbeta)                         # standard normal CDF
y <- ifelse(runif(1000, 0, 1) < proba, 1, 0)  # binary outcome
mydat <- data.frame(y, x1, x2)
table(y)
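
A quick sanity check on the simulated data (not on the original slide): the observed share of ones should be close to the average of the true probabilities:
mean(y)      # empirical proportion of y = 1
mean(proba)  # average true P(y = 1); the two should roughly agree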

15 2.2. Maximum likelihood estimation
# Fitting Generalized Linear Models
res <- glm(y ~ x1 + x2, family = binomial(link = probit), data = mydat)
names(res)
summary(res)
confint(res)  # confidence intervals (profile likelihood)
# Note: exp(coefficients) gives odds ratios only under the logit link;
# exponentiated probit coefficients have no odds-ratio interpretation.
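
Because probit coefficients lack an odds-ratio reading, a common summary instead is the average marginal effect, dnorm(xb) * Bj averaged over the sample; a hedged sketch using the fitted object res:
xb <- predict(res, type = "link")  # fitted linear predictor
mean(dnorm(xb)) * coef(res)[-1]    # average marginal effects of x1 and x2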

16 2.2. Maximum likelihood estimation
> summary(res)

Call:
glm(formula = y ~ x1 + x2, family = binomial(link = probit), data = mydat)

Deviance Residuals:
     Min        1Q    Median        3Q       Max
-2.06740  -0.17208  -0.00053   0.10700   1.96541

Coefficients:
            Estimate Std. Error z value Pr(>|z|)
(Intercept)  -1.2324     0.4029  -3.059  0.00222 **
x1            1.1163     0.3495   3.194  0.00140 **
x2            1.5917     0.3751   4.244  2.2e-05 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 135.372  on 99  degrees of freedom
Residual deviance:  38.705  on 97  degrees of freedom
AIC: 44.705

Number of Fisher Scoring iterations: 9

17 2.2. Maximum likelihood estimation
# Probit estimation with the sampleSelection package
library("sampleSelection")
res <- probit(y ~ x1 + x2, data = mydat)
summary(res)

18 2.2. Maximum likelihood estimation
> summary(res)
--------------------------------------------
Probit binary choice model/Maximum Likelihood estimation
Newton-Raphson maximisation, 7 iterations
Return code 1: gradient close to zero
Log-Likelihood: -19.35239
Model: Y == '1' in contrary to '0'
100 observations (59 'negative' and 41 'positive') and 3 free parameters (df = 97)
Estimates:
             Estimate Std. error t value   Pr(> t)
(Intercept) -1.23237    0.41293 -2.9845  0.002841 **
x1           1.11631    0.36221  3.0819  0.002057 **
x2           1.59170    0.38771  4.1054 4.037e-05 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Significance test: chi2(2) = 96.66693 (p=1.021042e-21)
--------------------------------------------
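
As a cross-check (a sketch, not on the original slides): probit() from sampleSelection maximises the same likelihood as glm() with a probit link, so the coefficients should agree to several decimals, as the two outputs above confirm.
coef(res)  # from sampleSelection::probit
coef(glm(y ~ x1 + x2, family = binomial(link = probit), data = mydat))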

19 2.3. Bayesian estimation
# Markov chain Monte Carlo for probit regression
library("MCMCpack")
post <- MCMCprobit(y ~ x1 + x2, data = mydat)
summary(post)
plot(post)  # trace and density plots
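
MCMCprobit() accepts the usual MCMCpack control arguments; a hedged sketch of setting the burn-in, chain length, and a vague normal prior (b0 is the prior mean, B0 the prior precision):
post <- MCMCprobit(y ~ x1 + x2, data = mydat,
                   burnin = 1000, mcmc = 10000, thin = 1,
                   b0 = 0, B0 = 0.01, seed = 123)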

20 2.3. Bayesian estimation
> summary(post)

Iterations = 1001:11000
Thinning interval = 1
Number of chains = 1
Sample size per chain = 10000

1. Empirical mean and standard deviation for each variable,
   plus standard error of the mean:

              Mean     SD Naive SE Time-series SE
(Intercept) -1.387 0.4304 0.004304        0.03018
x1           1.244 0.3825 0.003825        0.02649
x2           1.771 0.4310 0.004310        0.04322

2. Quantiles for each variable:

               2.5%     25%    50%    75%   97.5%
(Intercept) -2.3091 -1.6606 -1.359 -1.092 -0.5912
x1           0.5472  0.9813  1.219  1.491  2.0493
x2           1.0402  1.4645  1.728  2.027  2.7537
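
Point estimates and interval summaries can also be pulled straight from the chain; a short sketch using coda:
colMeans(as.matrix(post))       # posterior means
HPDinterval(post, prob = 0.95)  # 95% highest posterior density intervals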

